Compare commits

..

160 Commits

Author SHA1 Message Date
vowelparrot
6dc02a63fd Add testing 2023-04-30 15:36:08 -07:00
Sami Liedes
98444779c5 Pandas agent: Pass callback manager to Python tool. (#3646)
The callback manager was not passed to the Python tool, making it hard
to observe some intermediate steps.

NOTE: The mypy hints seem to suggest that the tool does not accept None
as an input—but I didn't check the code—so I use get_callback_manager()
to get the real default callback manager in case None is passed to the
Pandas tool.

Co-authored-by: Sami Liedes <sami.liedes@rocket-science.ch>
2023-04-30 14:13:03 -07:00
Ankush Gola
d3ec00b566 Callbacks Refactor [base] (#3256)
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com>
Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-30 11:14:09 -07:00
Zander Chase
18ec22fe56 Remove multi-input tool section (#3810)
Moving to new notebook. Will re-intro w/ new agent
2023-04-29 15:29:08 -07:00
mbchang
adcad98bee fix: fix filepath error in agent simulations docs (#3795) 2023-04-29 11:21:27 -07:00
Harrison Chase
20aad0bed1 stripe docs 2023-04-29 08:16:37 -07:00
Harrison Chase
378f0889eb bump version to 153 (#3774) 2023-04-29 07:31:35 -07:00
Sheldon
399065e858 update zilliz example (#3578)
1. Now the Zilliz example can't connect to Zilliz Cloud, fixed

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-28 22:10:13 -07:00
Harrison Chase
bd7e0a534c Harrison/csv loader (#3771)
Co-authored-by: mrT23 <tal.r@codium.ai>
2023-04-28 21:54:24 -07:00
Harrison Chase
c494ca3ad2 Harrison/doc2txt (#3772)
Co-authored-by: rishni ratnam <rishniratnam@gmail.com>
2023-04-28 21:54:16 -07:00
Mike Wang
ce4fea983b [simple] added test case and improve self class return type annotation (#3773)
a simple follow up of https://github.com/hwchase17/langchain/pull/3748
- added test case
- improve annotation when function return type is class itself.
2023-04-28 21:54:07 -07:00
Harrison Chase
0c0f14407c Harrison/tair (#3770)
Co-authored-by: Seth Huang <848849+seth-hg@users.noreply.github.com>
2023-04-28 21:25:33 -07:00
Aurélien SCHILTZ
502ba6a0be Fix type annotation for SQLDatabaseToolkit.llm (#3581)
Currently `langchain.agents.agent_toolkits.SQLDatabaseToolkit` has a
field `llm` with type `BaseLLM`. This breaks initialization for some
LLMs. For example, trying to use it with GPT4:
```

from langchain.sql_database import SQLDatabase
from langchain.chat_models import ChatOpenAI
from langchain.agents.agent_toolkits import SQLDatabaseToolkit


db = SQLDatabase.from_uri("some_db_uri")
llm = ChatOpenAI(model_name="gpt-4")
toolkit = SQLDatabaseToolkit(db=db, llm=llm)

# pydantic.error_wrappers.ValidationError: 1 validation error for SQLDatabaseToolkit
# llm
#  Can't instantiate abstract class BaseLLM with abstract methods _agenerate, _generate, _llm_type (type=type_error)
```
Seems like much of the rest of the codebase has switched from BaseLLM to
BaseLanguageModel. This PR makes the change for SQLDatabaseToolkit as
well
2023-04-28 21:19:01 -07:00
uyhcire
0a7a2b99b5 Fix Chroma integration failing when there are less than 4 items in the collection (#3674)
The code was failing to decrement the `n_results` kwarg passed to
`query(...)`
2023-04-28 21:18:19 -07:00
Rafal Wojdyla
57e028549a Expose kwargs in LLMChainExtractor.from_llm (#3748)
Re: https://github.com/hwchase17/langchain/issues/3747
2023-04-28 21:18:05 -07:00
Mike Wang
512c24fc9c [annotation improvement] Make AgentType->Class Conversion More Scalable (#3749)
In the current solution, AgentType and AGENT_TO_CLASS are placed in two
separate files and both manually maintained. This might cause
inconsistency when we update either of them.

— latest —
based on the discussion with hwchase17, we don’t know how to further use
the newly introduced AgentTypeConfig type, so it doesn’t make sense yet
to add it. Instead, it’s better to move the dictionary to another file
to keep the loading.py file clear. The consistency is a good point.
Instead of asserting the consistency during linting, we added a unittest
for consistency check. I think it works as auto unittest is triggered
every time with clear failure notice. (well, force push is possible, but
we all know what we are doing, so let’s show trust. :>)

~~This PR includes~~
- ~~Introduced AgentTypeConfig as the source of truth of all AgentType
related meta data.~~
- ~~Each AgentTypeConfig is a annotated class type which can be used for
annotation in other places.~~
- ~~Each AgentTypeConfig can be easily extended when we have more meta
data needs.~~
- ~~Strong assertion to ensure AgentType and AGENT_TO_CLASS are always
consistent.~~
- ~~Made AGENT_TO_CLASS automatically generated.~~

~~Test Plan:~~
- ~~since this change is focusing on annotation, lint is the major test
focus.~~
- ~~lint, format and test passed on local.~~
2023-04-28 21:17:28 -07:00
Harrison Chase
b7ae9f715d Langchain with reddit (#3661) (#3768)
I have added a reddit document loader which fetches the text from the
Posts of Subreddits or Reddit users, using the `praw` Python package. I
have also added an example notebook reddit.ipynb in order to guide users
to use this dataloader.
This code was made in format similar to twiiter document loader. I have
run code formating, linting and also checked the code myself for
different scenarios.

This is my first contribution to an open source project and I am really
excited about this. If you want to suggest some improvements in my code,
I will be happy to do it. :)

Co-authored-by: Taaha Bajwa <taaha.s.bajwa@gmail.com>
2023-04-28 20:59:56 -07:00
Kohei Kumazaki
fa4c35e9e5 Fix encoding issue in WebBaseLoader (#3602)
The character code mismatches occurred when character information was
not included in the response header (In my case, a Japanese web page).
I solved this issue by changing the encoding setting to
apparent_encoding.
2023-04-28 20:56:33 -07:00
Harrison Chase
be7a8e0824 Harrison/redis cache (#3766)
Co-authored-by: Tyler Hutcherson <tyler.hutcherson@redis.com>
2023-04-28 20:47:18 -07:00
Mike Wang
b588446bf9 [simple][test] Added test case for schema.py (#3692)
- added unittest for schema.py covering utility functions and token
counting.
- fixed a nit. based on huggingface doc, the tokenizer model is gpt-2.
[link](https://huggingface.co/transformers/v4.8.2/_modules/transformers/models/gpt2/tokenization_gpt2_fast.html)
- make lint && make format, passed on local
- screenshot of new test running result

<img width="1283" alt="Screenshot 2023-04-27 at 9 51 55 PM"
src="https://user-images.githubusercontent.com/62768671/235057441-c0ac3406-9541-453f-ba14-3ebb08656114.png">
2023-04-28 20:42:24 -07:00
Harrison Chase
15b92d361d Harrison/confluence stuff (#3765)
Co-authored-by: Jelmer Borst <japborst@gmail.com>
2023-04-28 20:19:44 -07:00
SimFG
5998b53596 Use the GPTCache api interface (#3693)
Use the GPTCache api interface to reduce the possibility of
compatibility issues
2023-04-28 20:18:51 -07:00
engkheng
f37a932b24 Improve chat prompt template docs (#3719)
Add a few more explanations and examples.
2023-04-28 20:16:22 -07:00
Robert Perrotta
22770f5202 Make StuffDocumentsChain doc separator configurable (#3718)
This PR makes the `"\n\n"` string with which `StuffDocumentsChain` joins
formatted documents a property so it can be configured. The new
`document_separator` property defaults to `"\n\n"` so the change is
backwards compatible.
2023-04-28 20:14:07 -07:00
Akhil Vempali
64ba24292d fix: 🐛 SQLAlchemy import error (#3716)
During the import of langchain, SQLAlchemy was throeing an errror
`ImportError: cannot import name 'Mapped' from 'sqlalchemy.orm'`. This
is becaue the Mapped name was introduced in v1.4
2023-04-28 20:13:32 -07:00
Jon Saginaw
f8d69e4e52 Enhancement: Blockchain Document Loader with better Metadata support (#3710)
This PR includes some minor alignment updates, including:

- metadata object extended to support contractAddress, blockchainType,
and tokenId
- notebook doc better aligned to standard langchain format
- startToken changed from int to str to support multiple hex value types
on the Alchemy API

The updated metadata will look like the below. It's possible for a
single contractAddress to exist across multiple blockchains (e.g.
Ethereum, Polygon, etc.) so it's important to include the
blockchainType.

```
 metadata = {"source": self.contract_address, 
                      "blockchain": self.blockchainType,
                      "tokenId": tokenId}
```
2023-04-28 20:13:05 -07:00
Davis Chase
220a7076ac Add Mathpix pdf loader (#3727)
Inspo
https://twitter.com/danielgross/status/1651695062307274754?s=46&t=1zHLap5WG4I_kQPPjfW9fA

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-28 20:11:22 -07:00
Rafal Wojdyla
37ed6f2177 Handle length safe embedding only if needed (#3723)
Re: https://github.com/hwchase17/langchain/issues/3722

Copy pasting context from the issue:


1bf1c37c0c/langchain/embeddings/openai.py (L210-L211)

Means that the length safe embedding method is "always" used, initial
implementation https://github.com/hwchase17/langchain/pull/991 has the
`embedding_ctx_length` set to -1 (meaning you had to opt-in for the
length safe method), https://github.com/hwchase17/langchain/pull/2330
changed that to max length of OpenAI embeddings v2, meaning the length
safe method is used at all times.

How about changing that if branch to use length safe method only when
needed, meaning when the text is longer than the max context length?
2023-04-28 20:10:04 -07:00
Harrison Chase
40f6e60e68 Harrison/stripe (#3762)
Co-authored-by: Ismail Pelaseyed <homanp@gmail.com>
2023-04-28 20:03:21 -07:00
Jelmer Borst
8cf2ff0be0 Confluence: Add page status filter for spaces (#3732)
At the moment all content in Confluence is retrieved by default,
including archived content.

Often, this is undesired as the content is not relevant anymore.

**Notes**
Fetching pages by label does not support excluding archived content.
This may lead to unexpected results.
2023-04-28 19:56:53 -07:00
Harrison Chase
7a129ac043 Harrison/pypdf loader (#3764)
Co-authored-by: Felipe Meres <felipe@felipemeres.com>
2023-04-28 19:56:21 -07:00
mbchang
4eefea0fe8 new example: single agent, simulated environment (openai gym) (#3758)
For many applications of LLM agents, the environment is real (internet,
database, REPL, etc). However, we can also define agents to interact in
simulated environments like text-based games. This is an example of how
to create a simple agent-environment interaction loop with
[Gymnasium](https://github.com/Farama-Foundation/Gymnasium) (formerly
[OpenAI Gym](https://github.com/openai/gym)).
2023-04-28 19:52:05 -07:00
0xDTE
6ce34bb4fe Fixing broken document links (#3756)
simple document url fixes. nothing fancy.
2023-04-28 19:51:23 -07:00
Rafal Wojdyla
160bfae93f Add DocstoreFn - lookup doc via arbitrary function (#3760)
This **partially** addresses
https://github.com/hwchase17/langchain/issues/1524, but it's also useful
for some of our use cases.

This `DocstoreFn` allows to lookup a document given a function that
accepts the `search` string without the need to implement a custom
`Docstore`.

This could be useful when:
* you don't want to implement a `Docstore` just to provide a custom
`search`
 * it's expensive to construct an `InMemoryDocstore`/dict
 * you retrieve documents from remote sources
 * you just want to reuse existing objects
2023-04-28 19:50:32 -07:00
Harrison Chase
c55ba43093 Harrison/vespa (#3761)
Co-authored-by: Lester Solbakken <lesters@users.noreply.github.com>
2023-04-28 19:48:43 -07:00
mbchang
ee20b3e0d0 bug fix: initialize the arxivAPIWrapper object (#3733) 2023-04-28 19:35:01 -07:00
leo-gan
e510732ad2 docs: improved vectorstore notebooks (#3724)
- Added links to the vectorstore providers
- Added installation code (it is not clear that we have to go to the
`LangChan Ecosystem` page to get installation instructions.)
2023-04-28 19:26:50 -07:00
BioErrorLog
ad4eae7ef0 Fix linting on the Quickstart Guide sample codes (#3701)
When copying and pasting the sample code from the Quickstart Guide, lint
errors ("missing whitespace around operator") occur."
2023-04-28 17:29:05 -07:00
Zander Chase
a46f1d830e Synchronous Browser (#3745)
Split out sync methods in playwright
2023-04-28 17:09:00 -07:00
Zander Chase
6c2b16e465 Add SceneXplain Tool (#3752) 2023-04-28 17:01:54 -07:00
erwanlc
72c5c15f7f Fix: Updated links for in depth explanation of chain types in the Question Answering notebooks (#3714)
In the notebook question_answering.ipynb
([link](https://github.com/hwchase17/langchain/blob/master/docs/modules/chains/index_examples/question_answering.ipynb)),
and the notebook qa_with_sources.ipynb
([link](https://github.com/hwchase17/langchain/blob/master/docs/modules/chains/index_examples/qa_with_sources.ipynb)),
the first paragraph contains a dead link:

> This notebook walks through how to use LangChain for question
answering over a list of documents. It covers four different types of
chains: stuff, map_reduce, refine, map_rerank. For a more in depth
explanation of what these chain types are, see
[here](32793f94fd/docs/modules/chains/combine_docs.md).

The file combine_docs.md doesn't exist anymore and thus provide 404 -
Page not found.

I updated the links so it redirect to
https://docs.langchain.com/docs/components/chains/index_related_chains
as in the summarize notebook
([link](https://github.com/hwchase17/langchain/blob/master/docs/modules/chains/index_examples/summarize.ipynb))
present in the same folder.
2023-04-28 15:06:46 -07:00
Alan Cha
e3b7a20454 Fix typo (#3728) 2023-04-28 13:01:09 -07:00
Zander Chase
5042bd40d3 Add Shell Tool (#3335)
Create an official bash shell tool to replace the dynamically generated one
2023-04-28 11:10:43 -07:00
Zander Chase
334c162f16 Add Other File Utilities (#3209)
Add other File Utilities, include
- List Directory
- Search for file
- Move
- Copy
- Remove file

Bundle as toolkit
Add a notebook that connects to the Chat Agent, which somewhat supports
multi-arg input tools
Update original read/write files to return the original dir paths and
better handle unsupported file paths.
Add unit tests
2023-04-28 10:53:37 -07:00
Zander Chase
491c27f861 PlayWright Web Browser Toolkit (#3262)
Adds a PlayWright web browser toolkit with the following tools:

- NavigateTool (navigate_browser) - navigate to a URL
- NavigateBackTool (previous_page) - wait for an element to appear
- ClickTool (click_element) - click on an element (specified by
selector)
- ExtractTextTool (extract_text) - use beautiful soup to extract text
from the current web page
- ExtractHyperlinksTool (extract_hyperlinks) - use beautiful soup to
extract hyperlinks from the current web page
- GetElementsTool (get_elements) - select elements by CSS selector
- CurrentPageTool (current_page) - get the current page URL
2023-04-28 10:42:44 -07:00
Zander Chase
da7b51455c Dynamic tool -> single purpose (#3697)
I think the logic of
https://github.com/hwchase17/langchain/pull/3684#pullrequestreview-1405358565
is too confusing.

I prefer this alternative because:
- All `Tool()` implementations by default will be treated the same as
before. No breaking changes.
- Less reliance on pydantic magic
- The decorator (which only is typed as returning a callable) can infer
schema and generate a structured tool
- Either way, the recommended way to create a custom tool is through
inheriting from the base tool
2023-04-28 09:38:41 -07:00
Zach Schillaci
1bf1c37c0c Update VectorDBQA to RetrievalQA in tools (#3698)
Because `VectorDBQA` and `VectorDBQAWithSourcesChain` are deprecated
2023-04-28 07:39:59 -07:00
Harrison Chase
32793f94fd bump version to 152 (#3695) 2023-04-28 00:21:53 -07:00
mbchang
1da3ee1386 Multiagent authoritarian (#3686)
This notebook showcases how to implement a multi-agent simulation where
a privileged agent decides who to speak.
This follows the polar opposite selection scheme as [multi-agent
decentralized speaker
selection](https://python.langchain.com/en/latest/use_cases/agent_simulations/multiagent_bidding.html).

We show an example of this approach in the context of a fictitious
simulation of a news network. This example will showcase how we can
implement agents that
- think before speaking
- terminate the conversation
2023-04-27 23:33:29 -07:00
Zander Chase
4654c58f72 Add validation on agent instantiation for multi-input tools (#3681)
Tradeoffs here:
- No lint-time checking for compatibility
- Differs from JS package
- The signature inference, etc. in the base tool isn't simple
- The `args_schema` is optional 

Pros:
- Forwards compatibility retained
- Doesn't break backwards compatibility
- User doesn't have to think about which class to subclass (single base
tool or dynamic `Tool` interface regardless of input)
-  No need to change the load_tools, etc. interfaces

Co-authored-by: Hasan Patel <mangafield@gmail.com>
2023-04-27 15:36:11 -07:00
Davis Chase
212aadd4af Nit: list to sequence (#3678) 2023-04-27 14:41:59 -07:00
Davis Chase
b807a114e4 Add query parsing unit tests (#3672) 2023-04-27 13:42:12 -07:00
Hasan Patel
03c05b15f6 Fixed some typos on deployment.md (#3652)
Fixed typos and added better formatting for easier readability
2023-04-27 13:01:24 -07:00
Zander Chase
1b5721c999 Remove Pexpect Dependency (#3667)
Resolves #3664

Next PR will be to clean up CI to catch this earlier. Triaging this, it
looks like it wasn't caught because pexpect is a `poetry` dependency.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-04-27 11:39:01 -07:00
Eugene Yurtsev
708787dddb Blob: Add validator and use future annotations (#3650)
Minor changes to the Blob schema.

---------

Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>
2023-04-27 14:33:59 -04:00
Eugene Yurtsev
c5a4b4fea1 Suppress duckdb warning in unit tests explicitly (#3653)
This catches the warning raised when using duckdb, asserts that it's as expected.

The goal is to resolve all existing warnings to make unit-testing much stricter.
2023-04-27 14:29:41 -04:00
Eugene Yurtsev
2052e70664 Add lazy iteration interface to document loaders (#3659)
Adding a lazy iteration for document loaders.

Following the plan here:
https://github.com/hwchase17/langchain/pull/2833

Keeping the `load` method as is for backwards compatibility. The `load`
returns a materialized list of documents and downstream users may rely on that
fact.

A new method that returns an iterable is introduced for handling lazy
loading.

---------

Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>
2023-04-27 14:29:01 -04:00
Piotr Mardziel
8a54217e7b update example of ConstitutionalChain.from_llm (#3630)
Example code was missing an argument and import. Fixed.
2023-04-27 11:17:31 -07:00
Eugene Yurtsev
e6c8cce050 Add unit-test to catch changes to required deps (#3662)
This adds a unit test that can catch changes to required dependencies
2023-04-27 13:04:17 -04:00
Eugene Yurtsev
055f58960a Fix pytest collection warning (#3651)
Fixes a pytest collection warning because the test class starts with the
prefix "Test"
2023-04-27 09:51:43 -07:00
Harrison Chase
0cf890eed4 bump version to 151 (#3658) 2023-04-27 09:02:39 -07:00
Davis Chase
3b609642ae Self-query with generic query constructor (#3607)
Alternate implementation of #3452 that relies on a generic query
constructor chain and language and then has vector store-specific
translation layer. Still refactoring and updating examples but general
structure is there and seems to work s well as #3452 on exampels

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-27 08:36:00 -07:00
plutopulp
6d6fd1b9e1 Add PipelineAI LLM integration (#3644)
Add PipelineAI LLM integration
2023-04-27 08:22:26 -07:00
Harrison Chase
a35bbbfa9e Harrison/lancedb (#3634)
Co-authored-by: Minh Le <minhle@canva.com>
2023-04-27 08:14:36 -07:00
Nuno Campos
52b5290810 Update README.md (#3643) 2023-04-27 08:14:09 -07:00
Eugene Yurtsev
5d02010763 Introduce Blob and Blob Loader interface (#3603)
This PR introduces a Blob data type and a Blob loader interface.

This is the first of a sequence of PRs that follows this proposal: 

https://github.com/hwchase17/langchain/pull/2833

The primary goals of these abstraction are:

* Decouple content loading from content parsing code.
* Help duplicated content loading code from document loaders.
* Make lazy loading a default for langchain.
2023-04-27 09:45:25 -04:00
Matt Robinson
8e10ac422e enhancement: add elements mode to UnstructuredURLLoader (#3456)
### Summary

Updates the `UnstructuredURLLoader` to include a "elements" mode that
retains additional metadata from `unstructured`. This makes
`UnstructuredURLLoader` consistent with other unstructured loaders,
which also support "elements" mode. Patched mode into the existing
`UnstructuredURLLoader` class instead of inheriting from
`UnstructuredBaseLoader` because it significantly simplified the
implementation.

### Testing

This should still work and show the url in the source for the metadata

```python
from langchain.document_loaders import UnstructuredURLLoader

urls = ["https://www.understandingwar.org/sites/default/files/Russian%20Offensive%20Campaign%20Assessment%2C%20April%2011%2C%202023.pdf"]

loader = UnstructuredURLLoader(urls=urls, headers={"Accept": "application/json"}, strategy="fast")
docs = loader.load()
print(docs[0].page_content[:1000])
docs[0].metadata
``` 

This should now work and show additional metadata from `unstructured`.

This should still work and show the url in the source for the metadata

```python
from langchain.document_loaders import UnstructuredURLLoader

urls = ["https://www.understandingwar.org/sites/default/files/Russian%20Offensive%20Campaign%20Assessment%2C%20April%2011%2C%202023.pdf"]

loader = UnstructuredURLLoader(urls=urls, headers={"Accept": "application/json"}, strategy="fast", mode="elements")
docs = loader.load()
print(docs[0].page_content[:1000])
docs[0].metadata
```
2023-04-26 22:09:45 -07:00
Eduard van Valkenburg
a3e3f26090 Some more PowerBI pydantic and import fixes (#3461) 2023-04-26 22:09:12 -07:00
Harrison Chase
ab749fa1bb Harrison/opensearch logic (#3631)
Co-authored-by: engineer-matsuo <95115586+engineer-matsuo@users.noreply.github.com>
2023-04-26 22:08:03 -07:00
ccw630
cf384dcb7f Supports async in SequentialChain/SimpleSequentialChain (#3503) 2023-04-26 22:07:20 -07:00
Ehsan M. Kermani
4a246e2fd6 Allow clearing cache and fix gptcache (#3493)
This PR

* Adds `clear` method for `BaseCache` and implements it for various
caches
* Adds the default `init_func=None` and fixes gptcache integtest
* Since right now integtest is not running in CI, I've verified the
changes by running `docs/modules/models/llms/examples/llm_caching.ipynb`
(until proper e2e integtest is done in CI)
2023-04-26 22:03:50 -07:00
Howard Su
83e871f1ff Fix Invalid Request using AzureOpenAI (#3522)
This fixes the error when calling AzureOpenAI of gpt-35-turbo model.

The error is:
InvalidRequestError: logprobs, best_of and echo parameters are not
available on gpt-35-turbo model. Please remove the parameter and try
again. For more details, see
https://go.microsoft.com/fwlink/?linkid=2227346.
2023-04-26 22:00:09 -07:00
Luoyger
f5aa767ef1 add --no-sandbox for chrome in url_selenium (#3589)
without --no-sandbox param, load documents from url by selenium in
chrome occured error below:

```Traceback (most recent call last):
  File "/data//playgroud/try_langchain.py", line 343, in <module>
    langchain_doc_loader()
  File "/data//playgroud/try_langchain.py", line 67, in langchain_doc_loader
    documents = loader.load()
  File "/install/anaconda3-env/envs/python3.10/lib/python3.10/site-packages/langchain/document_loaders/url_selenium.py", line 102, in load
    driver = self._get_driver()
  File "/install/anaconda3-env/envs/python3.10/lib/python3.10/site-packages/langchain/document_loaders/url_selenium.py", line 76, in _get_driver
    return Chrome(options=chrome_options)
  File "/install/anaconda3-env/envs/python3.10/lib/python3.10/site-packages/selenium/webdriver/chrome/webdriver.py", line 80, in __init__
    super().__init__(
  File "/install/anaconda3-env/envs/python3.10/lib/python3.10/site-packages/selenium/webdriver/chromium/webdriver.py", line 104, in __init__
    super().__init__(
  File "/install/anaconda3-env/envs/python3.10/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 286, in __init__
    self.start_session(capabilities, browser_profile)
  File "/install/anaconda3-env/envs/python3.10/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 378, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/install/anaconda3-env/envs/python3.10/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 440, in execute
    self.error_handler.check_response(response)
  File "/install/anaconda3-env/envs/python3.10/lib/python3.10/site-packages/selenium/webdriver/remote/errorhandler.py", line 245, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally.
  (unknown error: DevToolsActivePort file doesn't exist)
  (The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Stacktrace:
#0 0x55cf8da1bfe3 <unknown>
#1 0x55cf8d75ad36 <unknown>
#2 0x55cf8d783b20 <unknown>
#3 0x55cf8d77fa9b <unknown>
#4 0x55cf8d7c1af7 <unknown>
#5 0x55cf8d7c111f <unknown>
#6 0x55cf8d7b8693 <unknown>
#7 0x55cf8d78b03a <unknown>
#8 0x55cf8d78c17e <unknown>
#9 0x55cf8d9dddbd <unknown>
#10 0x55cf8d9e1c6c <unknown>
#11 0x55cf8d9eb4b0 <unknown>
#12 0x55cf8d9e2d63 <unknown>
#13 0x55cf8d9b5c35 <unknown>
#14 0x55cf8da06138 <unknown>
#15 0x55cf8da062c7 <unknown>
#16 0x55cf8da14093 <unknown>
#17 0x7f3da31a72de start_thread
```

add option `chrome_options.add_argument("--no-sandbox")` for chrome.
2023-04-26 21:48:43 -07:00
Shukri
fac4f36a87 Update models used for embeddings in the weaviate example (#3594)
Use text-embedding-ada-002 because it [outperforms all other
models](https://openai.com/blog/new-and-improved-embedding-model).
2023-04-26 21:48:08 -07:00
cs0lar
440c98e24b Fix/issue 2695 (#3608)
## Background
fixes #2695  

## Changes
The `add_text` method uses the internal embedding function if one was
passes to the `Weaviate` constructor.
NOTE: the latest merge on the `Weaviate` class made the specification of
a `weaviate_api_key` mandatory which might not be desirable for all
users and connection methods (for example weaviate also support Embedded
Weaviate which I am happy to add support to here if people think it's
desirable). I wrapped the fetching of the api key into a try catch in
order to allow the `weaviate_api_key` to be unspecified. Do let me know
if this is unsatisfactory.

## Test Plan
added test for `add_texts` method.
2023-04-26 21:45:03 -07:00
brian-tecton-ai
615812581e Add Tecton example to the "Connecting to a Feature Store" example notebook (#3626)
This PR adds a similar example to the Feast example, using the [Tecton
Feature Platform](https://www.tecton.ai/) and features from the [Tecton
Fundamentals
Tutorial](https://docs.tecton.ai/docs/tutorials/tecton-fundamentals).
2023-04-26 21:38:50 -07:00
mbchang
3b7d27d39e new example: multiagent dialogue with decentralized speaker selection (#3629)
This notebook showcases how to implement a multi-agent simulation
without a fixed schedule for who speaks when. Instead the agents decide
for themselves who speaks. We can implement this by having each agent
bid to speak. Whichever agent's bid is the highest gets to speak.

We will show how to do this in the example below that showcases a
fictitious presidential debate.
2023-04-26 21:37:36 -07:00
leo-gan
36c59e0c25 Arxiv document loader (#3627)
It makes sense to use `arxiv` as another source of the documents for
downloading.
- Added the `arxiv` document_loader, based on the
`utilities/arxiv.py:ArxivAPIWrapper`
- added tests
- added an example notebook
- sorted `__all__` in `__init__.py` (otherwise it is hard to find a
class in the very long list)
2023-04-26 21:04:56 -07:00
Tim Asp
539142f8d5 Add way to get serpapi results async (#3604)
Sometimes it's nice to get the raw results from serpapi, and we're
missing the async version of this function.
2023-04-26 16:37:03 -07:00
Zander Chase
443a893ffd Align names of search tools (#3620)
Tools for Bing, DDG and Google weren't consistent even though the
underlying implementations were.
All three services now have the same tools and implementations to easily
switch and experiment when building chains.
2023-04-26 16:21:34 -07:00
Maciej Bryński
aa345a4bb7 Add get_text_separator parameter to BSHTMLLoader (#3551)
By default get_text doesn't separate content of different HTML tag.
Adding option for specifying separator helps with document splitting.
2023-04-26 16:10:16 -07:00
Bhupendra Aole
568c4f0d81 Close dataframe column names are being treated as one by the LLM (#3611)
We are sending sample dataframe to LLM with df.head().
If the column names are close by, LLM treats two columns names as one,
returning incorrect results.


![image](https://user-images.githubusercontent.com/4707543/234678692-97851fa0-9e12-44db-92ec-9ad9f3545ae2.png)

In the above case the LLM uses **Org Week** as the column name instead
of **Week** if asked about a specific week.

Returning head() as a markdown separates out the columns names and thus
using correct column name.


![image](https://user-images.githubusercontent.com/4707543/234678945-c6d7b218-143e-4e70-9e17-77dc64841a49.png)
2023-04-26 16:05:53 -07:00
James O'Dwyer
860fa59cd3 add metal to ecosystem (#3613) 2023-04-26 15:57:48 -07:00
Zander Chase
ee670c448e Persistent Bash Shell (#3580)
Clean up linting and make more idiomatic by using an output parser

---------

Co-authored-by: FergusFettes <fergusfettes@gmail.com>
2023-04-26 15:20:28 -07:00
Ilyes Bouchada
c5451f4298 Update docker-compose.yaml (#3582)
The following error gets returned when trying to launch
langchain-server:

ERROR: The Compose file
'/opt/homebrew/lib/python3.11/site-packages/langchain/docker-compose.yaml'
is invalid because:
services.langchain-db.expose is invalid: should be of the format
'PORT[/PROTOCOL]'

Solution:
Change line 28 from - 5432:5432 to - 5432
2023-04-26 15:11:59 -07:00
Kátia Nakamura
e1a4fc55e6 Add docs for Fly.io deployment (#3584)
A minimal example of how to deploy LangChain to Fly.io using Flask.
2023-04-26 14:41:08 -07:00
Chirag Bhatia
08478deec5 Fixed typo for HuggingFaceHub (#3612)
The current text has a typo. This PR contains the corrected spelling for
HuggingFaceHub
2023-04-26 14:33:31 -07:00
Charlie Holtz
246710def9 Fix Replicate llm response to handle iterator / multiple outputs (#3614)
One of our users noticed a bug when calling streaming models. This is
because those models return an iterator. So, I've updated the Replicate
`_call` code to join together the output. The other advantage of this
fix is that if you requested multiple outputs you would get them all –
previously I was just returning output[0].

I also adjusted the demo docs to use dolly, because we're featuring that
model right now and it's always hot, so people won't have to wait for
the model to boot up.

The error that this fixes:
```
> llm = Replicate(model=“replicate/flan-t5-xl:eec2f71c986dfa3b7a5d842d22e1130550f015720966bec48beaae059b19ef4c”)
>  llm(“hello”)
> Traceback (most recent call last):
  File "/Users/charlieholtz/workspace/dev/python/main.py", line 15, in <module>
    print(llm(prompt))
  File "/opt/homebrew/lib/python3.10/site-packages/langchain/llms/base.py", line 246, in __call__
    return self.generate([prompt], stop=stop).generations[0][0].text
  File "/opt/homebrew/lib/python3.10/site-packages/langchain/llms/base.py", line 140, in generate
    raise e
  File "/opt/homebrew/lib/python3.10/site-packages/langchain/llms/base.py", line 137, in generate
    output = self._generate(prompts, stop=stop)
  File "/opt/homebrew/lib/python3.10/site-packages/langchain/llms/base.py", line 324, in _generate
    text = self._call(prompt, stop=stop)
  File "/opt/homebrew/lib/python3.10/site-packages/langchain/llms/replicate.py", line 108, in _call
    return outputs[0]
TypeError: 'generator' object is not subscriptable
```
2023-04-26 14:26:33 -07:00
Harrison Chase
7536912125 bump ver 150 (#3599) 2023-04-26 08:29:09 -07:00
Chirag Bhatia
f174aa7712 Fix broken Cerebrium link in documentation (#3554)
The current hyperlink has a typo. This PR contains the corrected
hyperlink to Cerebrium docs
2023-04-26 08:11:58 -07:00
Harrison Chase
d880775e5d Harrison/plugnplai (#3573)
Co-authored-by: Eduardo Reis <edu.pontes@gmail.com>
2023-04-26 08:09:34 -07:00
Zander Chase
85dae78548 Confluence beautifulsoup (#3576)
Co-authored-by: Theau Heral <theau.heral@ln.email.gs.com>
2023-04-25 23:40:06 -07:00
Mike Wang
64501329ab [simple] updated annotation in load_tools.py (#3544)
- added a few missing annotation for complex local variables.
- auto formatted.
- I also went through all other files in agent directory. no seeing any
other missing piece. (there are several prompt strings not annotated,
but I think it’s trivial. Also adding annotation will make it harder to
read in terms of indents.) Anyway, I think this is the last PR in
agent/annotation.
2023-04-25 23:30:49 -07:00
Zander Chase
d6d697a41b Sentence Transformers Aliasing (#3541)
The sentence transformers was a dup of the HF one. 

This is a breaking change (model_name vs. model) for anyone using
`SentenceTransformerEmbeddings(model="some/nondefault/model")`, but
since it was landed only this week it seems better to do this now rather
than doing a wrapper.
2023-04-25 23:29:20 -07:00
Eric Peter
603ea75bcd Fix docs error for google drive loader (#3574) 2023-04-25 22:52:59 -07:00
CG80499
cfd34e268e Add ReAct eval chain (#3161)
- Adds GPT-4 eval chain for arbitrary agents using any set of tools
- Adds notebook

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-25 21:22:25 -07:00
mbchang
4bc209c6f7 example: multi player dnd (#3560)
This notebook shows how the DialogueAgent and DialogueSimulator class
make it easy to extend the [Two-Player Dungeons & Dragons
example](https://python.langchain.com/en/latest/use_cases/agent_simulations/two_player_dnd.html)
to multiple players.

The main difference between simulating two players and multiple players
is in revising the schedule for when each agent speaks

To this end, we augment DialogueSimulator to take in a custom function
that determines the schedule of which agent speaks. In the example
below, each character speaks in round-robin fashion, with the
storyteller interleaved between each player.
2023-04-25 21:20:39 -07:00
James Brotchie
5fdaa95e06 Strip surrounding quotes from requests tool URLs. (#3563)
Often an LLM will output a requests tool input argument surrounded by
single quotes. This triggers an exception in the requests library. Here,
we add a simple clean url function that strips any leading and trailing
single and double quotes before passing the URL to the underlying
requests library.

Co-authored-by: James Brotchie <brotchie@google.com>
2023-04-25 21:20:26 -07:00
Harrison Chase
f4829025fe add feast nb (#3565) 2023-04-25 17:46:06 -07:00
Harrison Chase
47da5f0e58 Harrison/streamlit handler (#3564)
Co-authored-by: kurupapi <37198601+kurupapi@users.noreply.github.com>
2023-04-25 17:26:30 -07:00
Filip Michalsky
49593a3e41 Notebook example: Context-Aware AI Sales Agent (#3547)
I would like to contribute with a jupyter notebook example
implementation of an AI Sales Agent using `langchain`.

The bot understands the conversation stage (you can define your own
stages fitting your needs)
using two chains:

1. StageAnalyzerChain - takes context and LLM decides what part of sales
conversation is one in
2. SalesConversationChain - generate next message

Schema:

https://images-genai.s3.us-east-1.amazonaws.com/architecture2.png

my original repo: https://github.com/filip-michalsky/SalesGPT

This example creates a sales person named Ted Lasso who is trying to
sell you mattresses.

Happy to update based on your feedback.

Thanks, Filip
https://twitter.com/FilipMichalsky
2023-04-25 16:14:33 -07:00
Harrison Chase
52d95ec47d anthropic docs: deprecated LLM, add chat model (#3549) 2023-04-25 16:11:14 -07:00
mbchang
628e93a9a0 docs: simplification of two agent d&d simulation (#3550)
Simplifies the [Two Agent
D&D](https://python.langchain.com/en/latest/use_cases/agent_simulations/two_player_dnd.html)
example with a cleaner, simpler interface that is extensible for
multiple agents.

`DialogueAgent`:
- `send()`: applies the chatmodel to the message history and returns the
message string
- `receive(name, message)`: adds the `message` spoken by `name` to
message history

The `DialogueSimulator` class takes a list of agents. At each step, it
performs the following:
1. Select the next speaker
2. Calls the next speaker to send a message 
3. Broadcasts the message to all other agents
4. Update the step counter.
The selection of the next speaker can be implemented as any function,
but in this case we simply loop through the agents.
2023-04-25 16:10:32 -07:00
apurvsibal
af7906f100 Update Alchemy Key URL (#3559)
Update Alchemy Key URL in Blockchain Document Loader. I want to say
thank you for the incredible work the LangChain library creators have
done.

I am amazed at how seamlessly the Loader integrates with Ethereum
Mainnet, Ethereum Testnet, Polygon Mainnet, and Polygon Testnet, and I
am excited to see how this technology can be extended in the future.

@hwchase17 - Please let me know if I can improve or if I have missed any
community guidelines in making the edit? Thank you again for your hard
work and dedication to the open source community.
2023-04-25 16:08:42 -07:00
Tiago De Gaspari
4d53cefbe9 Fix agents' notebooks outputs (#3517)
Fix agents' notebooks to make the answer reflect what is being asked by
the user.
2023-04-25 16:06:47 -07:00
engkheng
5680fb6894 Fix typo in Prompts Templates Getting Started page (#3514)
`from_templates` -> `from_template`
2023-04-25 16:05:13 -07:00
Vincent
9e36d7b82c adding add_documents and aadd_documents to class RedisVectorStoreRetriever (#3419)
Ran into this issue In vectorstores/redis.py when trying to use the
AutoGPT agent with redis vector store. The error I received was

`
langchain/experimental/autonomous_agents/autogpt/agent.py", line 134, in
run
    self.memory.add_documents([Document(page_content=memory_to_add)])
AttributeError: 'RedisVectorStoreRetriever' object has no attribute
'add_documents'
`

Added the needed function to the class RedisVectorStoreRetriever which
did not have the functionality like the base VectorStoreRetriever in
vectorstores/base.py that, for example, vectorstores/faiss.py has
2023-04-25 13:53:20 -07:00
Davis Chase
d18b0caf0e Add Anthropic default request timeout (#3540)
thanks @hitflame!

---------

Co-authored-by: Wenqiang Zhao <hitzhaowenqiang@sina.com>
Co-authored-by: delta@com <delta@com>
2023-04-25 11:40:41 -07:00
Zander Chase
b49ee372f1 Change Chain Docs (#3537)
Co-authored-by: engkheng <60956360+outday29@users.noreply.github.com>
2023-04-25 10:51:09 -07:00
Ikko Eltociear Ashimine
cf71b5d396 fix typo in comet_tracking.ipynb (#3505)
intializing -> initializing
2023-04-25 10:50:58 -07:00
Zander Chase
64bbbf2cc2 Add DDG to load_tools (#3535)
Fix linting

---------

Co-authored-by: Mike Wang <62768671+skcoirz@users.noreply.github.com>
2023-04-25 10:40:37 -07:00
Roma
2b4e9a3efa Add unit test for _merge_splits function (#3513)
This commit adds a new unit test for the _merge_splits function in the
text splitter. The new test verifies that the function merges text into
chunks of the correct size and overlap, using a specified separator. The
test passes on the current implementation of the function.
2023-04-25 10:02:59 -07:00
Sami Liedes
61da2bb742 Pandas agent: Pass forward callback manager (#3518)
The Pandas agent fails to pass callback_manager forward, making it
impossible to use custom callbacks with it. Fix that.

Co-authored-by: Sami Liedes <sami.liedes@rocket-science.ch>
2023-04-25 09:58:56 -07:00
mbchang
a08e9a3109 Docs: fix naming typo (#3532) 2023-04-25 09:58:25 -07:00
Harrison Chase
dc2188b36d bump version to 149 (#3530) 2023-04-25 08:43:59 -07:00
mbchang
831ca61481 docs: two_player_dnd docs (#3528) 2023-04-25 08:24:53 -07:00
yakigac
f338d6251c Add a test for cosmos db memory (#3525)
Test for #3434 @eavanvalkenburg 
Initially, I was unaware and had submitted a pull request #3450 for the
same purpose, but I have now repurposed the one I used for that. And it
worked.
2023-04-25 08:10:02 -07:00
leo-gan
6b28cbe058 improved arxiv (#3495)
Improved `arxiv/tool.py` by adding more specific information to the
`description`. It would help with selecting `arxiv` tool between other
tools.
Improved `arxiv.ipynb` with more useful descriptions.
2023-04-25 08:09:17 -07:00
mbchang
29f321046e doc: add two player D&D game (#3476)
In this notebook, we show how we can use concepts from
[CAMEL](https://www.camel-ai.org/) to simulate a role-playing game with
a protagonist and a dungeon master. To simulate this game, we create a
`TwoAgentSimulator` class that coordinates the dialogue between the two
agents.
2023-04-25 08:07:18 -07:00
Harrison Chase
0fc0aa62f2 Harrison/blockchain docloader (#3491)
Co-authored-by: Jon Saginaw <saginawj@users.noreply.github.com>
2023-04-25 08:07:06 -07:00
Harrison Chase
bee59b4689 Updated missing refactor in docs "return_map_steps" (#2956) (#3469)
Minor rename in the documentation that was overlooked when refactoring.

---------

Co-authored-by: Ehmad Zubair <ehmad@cogentlabs.co>
2023-04-24 22:28:47 -07:00
Harrison Chase
707741de58 Harrison/prediction guard (#3490)
Co-authored-by: Daniel Whitenack <whitenack.daniel@gmail.com>
2023-04-24 22:27:22 -07:00
Harrison Chase
7257f9e015 Harrison/tfidf parameters (#3481)
Co-authored-by: pao <go5kuramubon@gmail.com>
Co-authored-by: KyoHattori <kyo.hattori@abejainc.com>
2023-04-24 22:19:58 -07:00
Harrison Chase
eda69b13f3 openai embeddings (#3488) 2023-04-24 22:19:47 -07:00
Harrison Chase
d3ce47414d Harrison/chroma update (#3489)
Co-authored-by: vyeevani <30946190+vyeevani@users.noreply.github.com>
Co-authored-by: Vineeth Yeevani <vineeth.yeevani@gmail.com>
2023-04-24 22:19:36 -07:00
Sami Liedes
c8b70e1c6a langchain-server: Do not expose postgresql port to host (#3431)
Apart from being unnecessary, postgresql is run on its default port,
which means that the langchain-server will fail to start if there is
already a postgresql server running on the host. This is obviously less
than ideal.

(Yeah, I don't understand why "expose" is the syntax that does not
expose the ports to the host...)

Tested by running langchain-server and trying out debugging on a host
that already has postgresql bound to the port 5432.

Co-authored-by: Sami Liedes <sami.liedes@rocket-science.ch>
2023-04-24 22:19:23 -07:00
Harrison Chase
7084d69ea7 Harrison/verbose conv ret (#3492)
Co-authored-by: makretch <max.kretchmer@gmail.com>
2023-04-24 22:16:07 -07:00
Harrison Chase
36a039d017 Harrison/prompt prefix (#3496)
Co-authored-by: Ian <ArGregoryIan@gmail.com>
2023-04-24 22:15:44 -07:00
Harrison Chase
408a0183cd Harrison/weaviate (#3494)
Co-authored-by: Nick Rubell <nick@rubell.com>
2023-04-24 22:15:32 -07:00
Eduard van Valkenburg
ba7a5ac9d7 Azure CosmosDB memory (#3434)
Still needs docs, otherwise works.
2023-04-24 22:15:12 -07:00
Lucas Vieira
e6c1c32aff Support GCS Objects with / in GCS Loaders (#3356)
So, this is basically fixing the same things as #1517 but for GCS.

### Problem
When loading GCS Objects with `/` in the object key (eg.
folder/some-document.txt) using `GCSFileLoader`, the objects are
downloaded into a temporary directory and saved as a file.

This errors out when the parent directory does not exist within the
temporary directory.

### What this pr does
Creates parent directories based on object key.

This also works with deeply nested keys:
folder/subfolder/some-document.txt
2023-04-24 22:05:44 -07:00
Mindaugas Sharskus
a4d85f7fd5 [Fix #3365]: Changed regex to cover new line before action serious (#3367)
Fix for: [Changed regex to cover new line before action
serious.](https://github.com/hwchase17/langchain/issues/3365)
---

This PR fixes the issue where `ValueError: Could not parse LLM output:`
was thrown on seems to be valid input.

Changed regex to cover new lines before action serious (after the
keywords "Action:" and "Action Input:").

regex101: https://regex101.com/r/CXl1kB/1

---------

Co-authored-by: msarskus <msarskus@cisco.com>
2023-04-24 22:05:31 -07:00
Maxwell Mullin
696f840426 GuessedAtParserWarning from RTD document loader documentation example (#3397)
Addresses #3396 by adding 

`features='html.parser'` in example
2023-04-24 21:54:39 -07:00
engkheng
06f6c49e61 Improve llm_chain.ipynb and getting_started.ipynb for chains docs (#3380)
My attempt at improving the `Chain`'s `Getting Started` docs and
`LLMChain` docs. Might need some proof-reading as English is not my
first language.

In LLM examples, I replaced the example use case when a simpler one
(shorter LLM output) to reduce cognitive load.
2023-04-24 21:49:55 -07:00
Zander Chase
b89c258bc5 Add retry logic for ChromaDB (#3372)
Rewrite of #3368

Mainly an issue for when people are just getting started, but still nice
to not throw an error if the number of docs is < k.

Add a little decorator utility to block mutually exclusive keyword
arguments
2023-04-24 21:48:29 -07:00
tkarper
6b49be9951 Add Databutton to list of Deployment options (#3364) 2023-04-24 21:45:38 -07:00
jrhe
980cc41709 Adds progress bar using tqdm to directory_loader (#3349)
Approach copied from `WebBaseLoader`. Assumes the user doesn't have
`tqdm` installed.
2023-04-24 21:42:42 -07:00
killpanda
344e3508b1 bug_fixes: use md5 instead of uuid id generation (#3442)
At present, the method of generating `point` in qdrant is to use random
`uuid`. The problem with this approach is that even documents with the
same content will be inserted repeatedly instead of updated. Using `md5`
as the `ID` of `point` to insert text can achieve true `update or
insert`.

Co-authored-by: mayue <mayue05@qiyi.com>
2023-04-24 21:39:51 -07:00
Jon Luo
b765805964 Support SQLAlchemy 2.0 (#3310)
With https://github.com/executablebooks/jupyter-cache/pull/93 merged and
`MyST-NB` updated, we can now support SQLAlchemy 2. Closes #1766
2023-04-24 21:10:56 -07:00
engkheng
7c2c73af5f Update Getting Started page of Prompt Templates (#3298)
Updated `Getting Started` page of `Prompt Templates` to showcase more
features provided by the class. Might need some proof reading because
apparently English is not my first language.
2023-04-24 21:10:22 -07:00
Hasan Patel
a14d1c02f8 Updated Readme.md (#3477)
Corrected some minor grammar issues, changed infra to infrastructure for
more clarity. Improved readability
2023-04-24 20:11:29 -07:00
Davis Chase
b2564a6391 fix #3884 (#3475)
fixes mar bug #3384
2023-04-24 19:54:15 -07:00
Prakhar Agarwal
53b14de636 pass list of strings to embed method in tf_hub (#3284)
This fixes the below mentioned issue. Instead of simply passing the text
to `tensorflow_hub`, we convert it to a list and then pass it.
https://github.com/hwchase17/langchain/issues/3282

Co-authored-by: Prakhar Agarwal <i.prakhar-agarwal@devrev.ai>
2023-04-24 19:51:53 -07:00
Beau Horenberger
2b9f1cea4e add LoRA loading for the LlamaCpp LLM (#3363)
First PR, let me know if this needs anything like unit tests,
reformatting, etc. Seemed pretty straightforward to implement. Only
hitch was that mmap needs to be disabled when loading LoRAs or else you
segfault.
2023-04-24 18:31:14 -07:00
Ehsan M. Kermani
5d0674fb46 Use a consistent poetry version everywhere (#3250)
Fixes the discrepancy of poetry version in Dockerfile and the GAs
2023-04-24 18:19:51 -07:00
Felipe Lopes
8c56e92566 feat: add private weaviate api_key support on from_texts (#3139)
This PR adds support for providing a Weaviate API Key to the VectorStore
methods `from_documents` and `from_texts`. With this addition, users can
authenticate to Weaviate and make requests to private Weaviate servers
when using these methods.

## Motivation
Currently, LangChain's VectorStore methods do not provide a way to
authenticate to Weaviate. This limits the functionality of the library
and makes it more difficult for users to take advantage of Weaviate's
features.

This PR addresses this issue by adding support for providing a Weaviate
API Key as extra parameter used in the `from_texts` method.

## Contributing Guidelines
I have read the [contributing
guidelines](72b7d76d79/.github/CONTRIBUTING.md)
and the PR code passes the following tests:

- [x] make format
- [x] make lint
- [x] make coverage
- [x] make test
2023-04-24 17:55:34 -07:00
Zzz233
239dc10852 ES similarity_search_with_score() and metadata filter (#3046)
Add similarity_search_with_score() to ElasticVectorSearch, add metadata
filter to both similarity_search() and similarity_search_with_score()
2023-04-24 17:20:08 -07:00
Zander Chase
416f3bdf11 Vwp/alpaca streaming (#3468)
Co-authored-by: Luke Stanley <306671+lukestanley@users.noreply.github.com>
2023-04-24 16:27:51 -07:00
Cao Hoang
26035dfa59 remove default usage of openai model in SQLDatabaseToolkit (#2884)
#2866

This toolkit used openai LLM as the default, which could incurr unwanted
cost.
2023-04-24 16:27:38 -07:00
Harrison Chase
675d86aa11 show how to use memory in convo chain (#3463) 2023-04-24 13:29:51 -07:00
leo-gan
d5086d4760 added integration links to the ecosystem.rst (#3453)
Now it is hard to search for the integration points between
data_loaders, retrievers, tools, etc.
I've placed links to all groups of providers and integrations on the
`ecosystem` page.
So, it is easy to navigate between all integrations from a single
location.
2023-04-24 12:17:44 -07:00
Davis Chase
2cbd41145c Bugfix: Not all combine docs chains takes kwargs prompt (#3462)
Generalize ConversationalRetrievalChain.from_llm kwargs

---------

Co-authored-by: shubham.suneja <shubham.suneja>
2023-04-24 12:13:06 -07:00
cs0lar
3033c6b964 fixes #1214 (#3003)
### Background

Continuing to implement all the interface methods defined by the
`VectorStore` class. This PR pertains to implementation of the
`max_marginal_relevance_search_by_vector` method.

### Changes

- a `max_marginal_relevance_search_by_vector` method implementation has
been added in `weaviate.py`
- tests have been added to the the new method
- vcr cassettes have been added for the weaviate tests

### Test Plan

Added tests for the `max_marginal_relevance_search_by_vector`
implementation

### Change Safety

- [x] I have added tests to cover my changes
2023-04-24 11:50:55 -07:00
Harrison Chase
434d8c4c0e Merge branch 'master' of github.com:hwchase17/langchain 2023-04-24 11:30:14 -07:00
Harrison Chase
bdb5f2f9fb update notebook 2023-04-24 11:30:06 -07:00
Zander Chase
d06d47bc92 LM Requests Wrapper (#3457)
Co-authored-by: jnmarti <88381891+jnmarti@users.noreply.github.com>
2023-04-24 11:12:47 -07:00
Harrison Chase
b64c86a25f bump version to 148 (#3458) 2023-04-24 11:08:32 -07:00
mbchang
82845e3821 add meta-prompt to autonomous agents use cases (#3254)
An implementation of
[meta-prompt](https://noahgoodman.substack.com/p/meta-prompt-a-simple-self-improving),
where the agent modifies its own instructions across episodes with a
user.

![figure](https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F468217b9-96d9-47c0-a08b-dbf6b21b9f49_492x384.png)
2023-04-24 10:48:38 -07:00
yunfeilu92
77235bbe43 propogate kwargs to cls in OpenSearchVectorSearch (#3416)
kwargs shoud be passed into cls so that opensearch client can be
properly initlized in __init__(). Otherwise logic like below will not
work. as auth will not be passed into __init__

```python
docsearch = OpenSearchVectorSearch.from_documents(docs, embeddings, opensearch_url="http://localhost:9200")

query = "What did the president say about Ketanji Brown Jackson"
docs = docsearch.similarity_search(query)
```

Co-authored-by: EC2 Default User <ec2-user@ip-172-31-28-97.ec2.internal>
2023-04-24 10:43:41 -07:00
Eduard van Valkenburg
46c9636012 small constructor change and updated notebook (#3426)
small change in the pydantic definitions, same api. 

updated notebook with right constructure and added few shot example
2023-04-24 10:42:38 -07:00
445 changed files with 27978 additions and 6232 deletions

View File

@@ -6,7 +6,7 @@ on:
pull_request:
env:
POETRY_VERSION: "1.3.1"
POETRY_VERSION: "1.4.2"
jobs:
build:

View File

@@ -6,7 +6,7 @@ on:
pull_request:
env:
POETRY_VERSION: "1.3.1"
POETRY_VERSION: "1.4.2"
jobs:
build:

View File

@@ -10,7 +10,7 @@ on:
- 'pyproject.toml'
env:
POETRY_VERSION: "1.3.1"
POETRY_VERSION: "1.4.2"
jobs:
if_release:
@@ -45,5 +45,5 @@ jobs:
- name: Publish to PyPI
env:
POETRY_PYPI_TOKEN_PYPI: ${{ secrets.PYPI_API_TOKEN }}
run: |
run: |
poetry publish

View File

@@ -6,7 +6,7 @@ on:
pull_request:
env:
POETRY_VERSION: "1.3.1"
POETRY_VERSION: "1.4.2"
jobs:
build:

6
.gitignore vendored
View File

@@ -144,4 +144,8 @@ wandb/
/.ruff_cache/
*.pkl
*.bin
*.bin
# integration test artifacts
data_map*
\[('_type', 'fake'), ('stop', None)]

View File

@@ -4,6 +4,8 @@
[![lint](https://github.com/hwchase17/langchain/actions/workflows/lint.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/lint.yml) [![test](https://github.com/hwchase17/langchain/actions/workflows/test.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/test.yml) [![linkcheck](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml) [![Downloads](https://static.pepy.tech/badge/langchain/month)](https://pepy.tech/project/langchain) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai) [![](https://dcbadge.vercel.app/api/server/6adMQxSpJS?compact=true&style=flat)](https://discord.gg/6adMQxSpJS)
Looking for the JS/TS version? Check out [LangChain.js](https://github.com/hwchase17/langchainjs).
**Production Support:** As you move your LangChains into production, we'd love to offer more comprehensive support.
Please fill out [this form](https://forms.gle/57d8AmXBYp8PP8tZA) and we'll set up a dedicated support Slack channel.
@@ -15,12 +17,9 @@ or
## 🤔 What is this?
Large language models (LLMs) are emerging as a transformative technology, enabling
developers to build applications that they previously could not.
But using these LLMs in isolation is often not enough to
create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge.
Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. However, using these LLMs in isolation is often insufficient for creating a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge.
This library is aimed at assisting in the development of those types of applications. Common examples of these types of applications include:
This library aims to assist in the development of those types of applications. Common examples of these applications include:
**❓ Question Answering over specific documents**
@@ -53,23 +52,23 @@ These are, in increasing order of complexity:
**📃 LLMs and Prompts:**
This includes prompt management, prompt optimization, generic interface for all LLMs, and common utilities for working with LLMs.
This includes prompt management, prompt optimization, a generic interface for all LLMs, and common utilities for working with LLMs.
**🔗 Chains:**
Chains go beyond just a single LLM call, and are sequences of calls (whether to an LLM or a different utility). LangChain provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications.
Chains go beyond a single LLM call and involve sequences of calls (whether to an LLM or a different utility). LangChain provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications.
**📚 Data Augmented Generation:**
Data Augmented Generation involves specific types of chains that first interact with an external datasource to fetch data to use in the generation step. Examples of this include summarization of long pieces of text and question/answering over specific data sources.
Data Augmented Generation involves specific types of chains that first interact with an external data source to fetch data for use in the generation step. Examples include summarization of long pieces of text and question/answering over specific data sources.
**🤖 Agents:**
Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end to end agents.
Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end-to-end agents.
**🧠 Memory:**
Memory is the concept of persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.
Memory refers to persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.
**🧐 Evaluation:**
@@ -79,6 +78,6 @@ For more information on these concepts, please see our [full documentation](http
## 💁 Contributing
As an open source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infra, or better documentation.
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.
For detailed information on how to contribute, see [here](.github/CONTRIBUTING.md).

BIN
docs/_static/MetalDash.png vendored Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 3.5 MiB

View File

@@ -1,14 +1,10 @@
# Deployments
So you've made a really cool chain - now what? How do you deploy it and make it easily sharable with the world?
So, you've created a really cool chain - now what? How do you deploy it and make it easily shareable with the world?
This section covers several options for that.
Note that these are meant as quick deployment options for prototypes and demos, and not for production systems.
If you are looking for help with deployment of a production system, please contact us directly.
This section covers several options for that. Note that these options are meant for quick deployment of prototypes and demos, not for production systems. If you need help with the deployment of a production system, please contact us directly.
What follows is a list of template GitHub repositories aimed that are intended to be
very easy to fork and modify to use your chain.
This is far from an exhaustive list of options, and we are EXTREMELY open to contributions here.
What follows is a list of template GitHub repositories designed to be easily forked and modified to use your chain. This list is far from exhaustive, and we are EXTREMELY open to contributions here.
## [Streamlit](https://github.com/hwchase17/langchain-streamlit-template)
@@ -33,6 +29,10 @@ It implements a Question Answering app and contains instructions for deploying t
A minimal example on how to run LangChain on Vercel using Flask.
## [Fly.io](https://github.com/fly-apps/hello-fly-langchain)
A minimal example of how to deploy LangChain to [Fly.io](https://fly.io/) using Flask.
## [Digitalocean App Platform](https://github.com/homanp/digitalocean-langchain)
A minimal example on how to deploy LangChain to DigitalOcean App Platform.
@@ -43,13 +43,16 @@ A minimal example on how to deploy LangChain to Google Cloud Run.
## [SteamShip](https://github.com/steamship-core/steamship-langchain/)
This repository contains LangChain adapters for Steamship, enabling LangChain developers to rapidly deploy their apps on Steamship.
This includes: production ready endpoints, horizontal scaling across dependencies, persistant storage of app state, multi-tenancy support, etc.
This repository contains LangChain adapters for Steamship, enabling LangChain developers to rapidly deploy their apps on Steamship. This includes: production-ready endpoints, horizontal scaling across dependencies, persistent storage of app state, multi-tenancy support, etc.
## [Langchain-serve](https://github.com/jina-ai/langchain-serve)
This repository allows users to serve local chains and agents as RESTful, gRPC, or Websocket APIs thanks to [Jina](https://docs.jina.ai/). Deploy your chains & agents with ease and enjoy independent scaling, serverless and autoscaling APIs, as well as a Streamlit playground on Jina AI Cloud.
This repository allows users to serve local chains and agents as RESTful, gRPC, or WebSocket APIs, thanks to [Jina](https://docs.jina.ai/). Deploy your chains & agents with ease and enjoy independent scaling, serverless and autoscaling APIs, as well as a Streamlit playground on Jina AI Cloud.
## [BentoML](https://github.com/ssheng/BentoChain)
This repository provides an example of how to deploy a LangChain application with [BentoML](https://github.com/bentoml/BentoML). BentoML is a framework that enables the containerization of machine learning applications as standard OCI images. BentoML also allows for the automatic generation of OpenAPI and gRPC endpoints. With BentoML, you can integrate models from all popular ML frameworks and deploy them as microservices running on the most optimal hardware and scaling independently.
## [Databutton](https://databutton.com/home?new-data-app=true)
These templates serve as examples of how to build, deploy, and share LangChain applications using Databutton. You can create user interfaces with Streamlit, automate tasks by scheduling Python code, and store files and data in the built-in store. Examples include a Chatbot interface with conversational memory, a Personal search engine, and a starter template for LangChain apps. Deploying and sharing is just one click away.

View File

@@ -3,6 +3,25 @@ LangChain Ecosystem
Guides for how other companies/products can be used with LangChain
Groups
----------
LangChain provides integration with many LLMs and systems:
- `LLM Providers <./modules/models/llms/integrations.html>`_
- `Chat Model Providers <./modules/models/chat/integrations.html>`_
- `Text Embedding Model Providers <./modules/models/text_embedding.html>`_
- `Document Loader Integrations <./modules/indexes/document_loaders.html>`_
- `Text Splitter Integrations <./modules/indexes/text_splitters.html>`_
- `Vectorstore Providers <./modules/indexes/vectorstores.html>`_
- `Retriever Providers <./modules/indexes/retrievers.html>`_
- `Tool Providers <./modules/agents/tools.html>`_
- `Toolkit Integrations <./modules/agents/toolkits.html>`_
Companies / Products
----------
.. toctree::
:maxdepth: 1
:glob:

View File

@@ -61,7 +61,6 @@
"from datetime import datetime\n",
"\n",
"from langchain.llms import OpenAI\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.callbacks import AimCallbackHandler, StdOutCallbackHandler"
]
},
@@ -109,8 +108,8 @@
" experiment_name=\"scenario 1: OpenAI LLM\",\n",
")\n",
"\n",
"manager = CallbackManager([StdOutCallbackHandler(), aim_callback])\n",
"llm = OpenAI(temperature=0, callback_manager=manager, verbose=True)"
"callbacks = [StdOutCallbackHandler(), aim_callback]\n",
"llm = OpenAI(temperature=0, callbacks=callbacks)"
]
},
{
@@ -177,7 +176,7 @@
"Title: {title}\n",
"Playwright: This is a synopsis for the above play:\"\"\"\n",
"prompt_template = PromptTemplate(input_variables=[\"title\"], template=template)\n",
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template, callback_manager=manager)\n",
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template, callbacks=callbacks)\n",
"\n",
"test_prompts = [\n",
" {\"title\": \"documentary about good video games that push the boundary of game design\"},\n",
@@ -249,13 +248,12 @@
],
"source": [
"# scenario 3 - Agent with Tools\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm, callback_manager=manager)\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm, callbacks=callbacks)\n",
"agent = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,\n",
" callback_manager=manager,\n",
" verbose=True,\n",
" callbacks=callbacks,\n",
")\n",
"agent.run(\n",
" \"Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?\"\n",

View File

@@ -79,7 +79,6 @@
"source": [
"from datetime import datetime\n",
"from langchain.callbacks import ClearMLCallbackHandler, StdOutCallbackHandler\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.llms import OpenAI\n",
"\n",
"# Setup and use the ClearML Callback\n",
@@ -93,9 +92,9 @@
" complexity_metrics=True,\n",
" stream_logs=True\n",
")\n",
"manager = CallbackManager([StdOutCallbackHandler(), clearml_callback])\n",
"callbacks = [StdOutCallbackHandler(), clearml_callback]\n",
"# Get the OpenAI model ready to go\n",
"llm = OpenAI(temperature=0, callback_manager=manager, verbose=True)"
"llm = OpenAI(temperature=0, callbacks=callbacks)"
]
},
{
@@ -523,13 +522,12 @@
"from langchain.agents import AgentType\n",
"\n",
"# SCENARIO 2 - Agent with Tools\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm, callback_manager=manager)\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm, callbacks=callbacks)\n",
"agent = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,\n",
" callback_manager=manager,\n",
" verbose=True,\n",
" callbacks=callbacks,\n",
")\n",
"agent.run(\n",
" \"Who is the wife of the person who sang summer of 69?\"\n",

View File

@@ -64,7 +64,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"You can grab your [Comet API Key here](https://www.comet.com/signup?utm_source=langchain&utm_medium=referral&utm_campaign=comet_notebook) or click the link after intializing Comet"
"You can grab your [Comet API Key here](https://www.comet.com/signup?utm_source=langchain&utm_medium=referral&utm_campaign=comet_notebook) or click the link after initializing Comet"
]
},
{
@@ -121,7 +121,6 @@
"from datetime import datetime\n",
"\n",
"from langchain.callbacks import CometCallbackHandler, StdOutCallbackHandler\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.llms import OpenAI\n",
"\n",
"comet_callback = CometCallbackHandler(\n",
@@ -131,8 +130,8 @@
" tags=[\"llm\"],\n",
" visualizations=[\"dep\"],\n",
")\n",
"manager = CallbackManager([StdOutCallbackHandler(), comet_callback])\n",
"llm = OpenAI(temperature=0.9, callback_manager=manager, verbose=True)\n",
"callbacks = [StdOutCallbackHandler(), comet_callback]\n",
"llm = OpenAI(temperature=0.9, callbacks=callbacks, verbose=True)\n",
"\n",
"llm_result = llm.generate([\"Tell me a joke\", \"Tell me a poem\", \"Tell me a fact\"] * 3)\n",
"print(\"LLM result\", llm_result)\n",
@@ -153,7 +152,6 @@
"outputs": [],
"source": [
"from langchain.callbacks import CometCallbackHandler, StdOutCallbackHandler\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.chains import LLMChain\n",
"from langchain.llms import OpenAI\n",
"from langchain.prompts import PromptTemplate\n",
@@ -164,15 +162,14 @@
" stream_logs=True,\n",
" tags=[\"synopsis-chain\"],\n",
")\n",
"manager = CallbackManager([StdOutCallbackHandler(), comet_callback])\n",
"\n",
"llm = OpenAI(temperature=0.9, callback_manager=manager, verbose=True)\n",
"callbacks = [StdOutCallbackHandler(), comet_callback]\n",
"llm = OpenAI(temperature=0.9, callbacks=callbacks)\n",
"\n",
"template = \"\"\"You are a playwright. Given the title of play, it is your job to write a synopsis for that title.\n",
"Title: {title}\n",
"Playwright: This is a synopsis for the above play:\"\"\"\n",
"prompt_template = PromptTemplate(input_variables=[\"title\"], template=template)\n",
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template, callback_manager=manager)\n",
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template, callbacks=callbacks)\n",
"\n",
"test_prompts = [{\"title\": \"Documentary about Bigfoot in Paris\"}]\n",
"print(synopsis_chain.apply(test_prompts))\n",
@@ -194,7 +191,6 @@
"source": [
"from langchain.agents import initialize_agent, load_tools\n",
"from langchain.callbacks import CometCallbackHandler, StdOutCallbackHandler\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.llms import OpenAI\n",
"\n",
"comet_callback = CometCallbackHandler(\n",
@@ -203,15 +199,15 @@
" stream_logs=True,\n",
" tags=[\"agent\"],\n",
")\n",
"manager = CallbackManager([StdOutCallbackHandler(), comet_callback])\n",
"llm = OpenAI(temperature=0.9, callback_manager=manager, verbose=True)\n",
"callbacks = [StdOutCallbackHandler(), comet_callback]\n",
"llm = OpenAI(temperature=0.9, callbacks=callbacks)\n",
"\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm, callback_manager=manager)\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm, callbacks=callbacks)\n",
"agent = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=\"zero-shot-react-description\",\n",
" callback_manager=manager,\n",
" callbacks=callbacks,\n",
" verbose=True,\n",
")\n",
"agent.run(\n",
@@ -255,7 +251,6 @@
"from rouge_score import rouge_scorer\n",
"\n",
"from langchain.callbacks import CometCallbackHandler, StdOutCallbackHandler\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.chains import LLMChain\n",
"from langchain.llms import OpenAI\n",
"from langchain.prompts import PromptTemplate\n",
@@ -298,10 +293,10 @@
" tags=[\"custom_metrics\"],\n",
" custom_metrics=rouge_score.compute_metric,\n",
")\n",
"manager = CallbackManager([StdOutCallbackHandler(), comet_callback])\n",
"llm = OpenAI(temperature=0.9, callback_manager=manager, verbose=True)\n",
"callbacks = [StdOutCallbackHandler(), comet_callback]\n",
"llm = OpenAI(temperature=0.9)\n",
"\n",
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template, callback_manager=manager)\n",
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template)\n",
"\n",
"test_prompts = [\n",
" {\n",
@@ -323,7 +318,7 @@
" \"\"\"\n",
" }\n",
"]\n",
"print(synopsis_chain.apply(test_prompts))\n",
"print(synopsis_chain.apply(test_prompts, callbacks=callbacks))\n",
"comet_callback.flush_tracker(synopsis_chain, finish=True)"
]
}

View File

@@ -3,6 +3,7 @@
This page covers how to use the `GPT4All` wrapper within LangChain. The tutorial is divided into two parts: installation and setup, followed by usage with an example.
## Installation and Setup
- Install the Python package with `pip install pyllamacpp`
- Download a [GPT4All model](https://github.com/nomic-ai/pyllamacpp#supported-model) and place it in your desired directory
@@ -28,16 +29,16 @@ To stream the model's predictions, add in a CallbackManager.
```python
from langchain.llms import GPT4All
from langchain.callbacks.base import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
# There are many CallbackHandlers supported, such as
# from langchain.callbacks.streamlit import StreamlitCallbackHandler
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
model = GPT4All(model="./models/gpt4all-model.bin", n_ctx=512, n_threads=8, callback_handler=callback_handler, verbose=True)
callbacks = [StreamingStdOutCallbackHandler()]
model = GPT4All(model="./models/gpt4all-model.bin", n_ctx=512, n_threads=8)
# Generate text. Tokens are streamed through the callback manager.
model("Once upon a time, ")
model("Once upon a time, ", callbacks=callbacks)
```
## Model File

23
docs/ecosystem/lancedb.md Normal file
View File

@@ -0,0 +1,23 @@
# LanceDB
This page covers how to use [LanceDB](https://github.com/lancedb/lancedb) within LangChain.
It is broken into two parts: installation and setup, and then references to specific LanceDB wrappers.
## Installation and Setup
- Install the Python SDK with `pip install lancedb`
## Wrappers
### VectorStore
There exists a wrapper around LanceDB databases, allowing you to use it as a vectorstore,
whether for semantic search or example selection.
To import this vectorstore:
```python
from langchain.vectorstores import LanceDB
```
For a more detailed walkthrough of the LanceDB wrapper, see [this notebook](../modules/indexes/vectorstores/examples/lancedb.ipynb)

26
docs/ecosystem/metal.md Normal file
View File

@@ -0,0 +1,26 @@
# Metal
This page covers how to use [Metal](https://getmetal.io) within LangChain.
## What is Metal?
Metal is a managed retrieval & memory platform built for production. Easily index your data into `Metal` and run semantic search and retrieval on it.
![Metal](../_static/MetalDash.png)
## Quick start
Get started by [creating a Metal account](https://app.getmetal.io/signup).
Then, you can easily take advantage of the `MetalRetriever` class to start retrieving your data for semantic search, prompting context, etc. This class takes a `Metal` instance and a dictionary of parameters to pass to the Metal API.
```python
from langchain.retrievers import MetalRetriever
from metal_sdk.metal import Metal
metal = Metal("API_KEY", "CLIENT_ID", "INDEX_ID");
retriever = MetalRetriever(metal, params={"limit": 2})
docs = retriever.get_relevant_documents("search term")
```

View File

@@ -0,0 +1,19 @@
# PipelineAI
This page covers how to use the PipelineAI ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific PipelineAI wrappers.
## Installation and Setup
- Install with `pip install pipeline-ai`
- Get a Pipeline Cloud api key and set it as an environment variable (`PIPELINE_API_KEY`)
## Wrappers
### LLM
There exists a PipelineAI LLM wrapper, which you can access with
```python
from langchain.llms import PipelineAI
```

View File

@@ -0,0 +1,56 @@
# Prediction Guard
This page covers how to use the Prediction Guard ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Prediction Guard wrappers.
## Installation and Setup
- Install the Python SDK with `pip install predictionguard`
- Get an Prediction Guard access token (as described [here](https://docs.predictionguard.com/)) and set it as an environment variable (`PREDICTIONGUARD_TOKEN`)
## LLM Wrapper
There exists a Prediction Guard LLM wrapper, which you can access with
```python
from langchain.llms import PredictionGuard
```
You can provide the name of your Prediction Guard "proxy" as an argument when initializing the LLM:
```python
pgllm = PredictionGuard(name="your-text-gen-proxy")
```
Alternatively, you can use Prediction Guard's default proxy for SOTA LLMs:
```python
pgllm = PredictionGuard(name="default-text-gen")
```
You can also provide your access token directly as an argument:
```python
pgllm = PredictionGuard(name="default-text-gen", token="<your access token>")
```
## Example usage
Basic usage of the LLM wrapper:
```python
from langchain.llms import PredictionGuard
pgllm = PredictionGuard(name="default-text-gen")
pgllm("Tell me a joke")
```
Basic LLM Chaining with the Prediction Guard wrapper:
```python
from langchain import PromptTemplate, LLMChain
from langchain.llms import PredictionGuard
template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=PredictionGuard(name="default-text-gen"), verbose=True)
question = "What NFL team won the Super Bowl in the year Justin Beiber was born?"
llm_chain.predict(question=question)
```

79
docs/ecosystem/redis.md Normal file
View File

@@ -0,0 +1,79 @@
# Redis
This page covers how to use the [Redis](https://redis.com) ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Redis wrappers.
## Installation and Setup
- Install the Redis Python SDK with `pip install redis`
## Wrappers
### Cache
The Cache wrapper allows for [Redis](https://redis.io) to be used as a remote, low-latency, in-memory cache for LLM prompts and responses.
#### Standard Cache
The standard cache is the Redis bread & butter of use case in production for both [open source](https://redis.io) and [enterprise](https://redis.com) users globally.
To import this cache:
```python
from langchain.cache import RedisCache
```
To use this cache with your LLMs:
```python
import langchain
import redis
redis_client = redis.Redis.from_url(...)
langchain.llm_cache = RedisCache(redis_client)
```
#### Semantic Cache
Semantic caching allows users to retrieve cached prompts based on semantic similarity between the user input and previously cached results. Under the hood it blends Redis as both a cache and a vectorstore.
To import this cache:
```python
from langchain.cache import RedisSemanticCache
```
To use this cache with your LLMs:
```python
import langchain
import redis
# use any embedding provider...
from tests.integration_tests.vectorstores.fake_embeddings import FakeEmbeddings
redis_url = "redis://localhost:6379"
langchain.llm_cache = RedisSemanticCache(
embedding=FakeEmbeddings(),
redis_url=redis_url
)
```
### VectorStore
The vectorstore wrapper turns Redis into a low-latency [vector database](https://redis.com/solutions/use-cases/vector-database/) for semantic search or LLM content retrieval.
To import this vectorstore:
```python
from langchain.vectorstores import Redis
```
For a more detailed walkthrough of the Redis vectorstore wrapper, see [this notebook](../modules/indexes/vectorstores/examples/redis.ipynb).
### Retriever
The Redis vector store retriever wrapper generalizes the vectorstore class to perform low-latency document retrieval. To create the retriever, simply call `.as_retriever()` on the base vectorstore class.
### Memory
Redis can be used to persist LLM conversations.
#### Vector Store Retriever Memory
For a more detailed walkthrough of the `VectorStoreRetrieverMemory` wrapper, see [this notebook](../modules/memory/types/vectorstore_retriever_memory.ipynb).
#### Chat Message History Memory
For a detailed example of Redis to cache conversation message history, see [this notebook](../modules/memory/examples/redis_chat_message_history.ipynb).

View File

@@ -9,7 +9,7 @@ This page covers how to run models on Replicate within LangChain.
Find a model on the [Replicate explore page](https://replicate.com/explore), and then paste in the model name and version in this format: `owner-name/model-name:version`
For example, for this [flan-t5 model](https://replicate.com/daanelson/flan-t5), click on the API tab. The model name/version would be: `daanelson/flan-t5:04e422a9b85baed86a4f24981d7f9953e20c5fd82f6103b74ebc431588e1cec8`
For example, for this [dolly model](https://replicate.com/replicate/dolly-v2-12b), click on the API tab. The model name/version would be: `"replicate/dolly-v2-12b:ef0e1aefc61f8e096ebe4db6b2bacc297daf2ef6899f0f7e001ec445893500e5"`
Only the `model` param is required, but any other model parameters can also be passed in with the format `input={model_param: value, ...}`
@@ -24,7 +24,7 @@ Replicate(model="stability-ai/stable-diffusion:db21e45d3f7023abc2a46ee38a23973f6
From here, we can initialize our model:
```python
llm = Replicate(model="daanelson/flan-t5:04e422a9b85baed86a4f24981d7f9953e20c5fd82f6103b74ebc431588e1cec8")
llm = Replicate(model="replicate/dolly-v2-12b:ef0e1aefc61f8e096ebe4db6b2bacc297daf2ef6899f0f7e001ec445893500e5")
```
And run it:
@@ -40,8 +40,7 @@ llm(prompt)
We can call any Replicate model (not just LLMs) using this syntax. For example, we can call [Stable Diffusion](https://replicate.com/stability-ai/stable-diffusion):
```python
text2image = Replicate(model="stability-ai/stable-diffusion:db21e45d3f7023abc2a46ee38a23973f6dce16bb082a930b0c49861f96d1e5bf",
input={'image_dimensions'='512x512'}
text2image = Replicate(model="stability-ai/stable-diffusion:db21e45d3f7023abc2a46ee38a23973f6dce16bb082a930b0c49861f96d1e5bf", input={'image_dimensions':'512x512'})
image_output = text2image("A cat riding a motorcycle by Picasso")
```

22
docs/ecosystem/tair.md Normal file
View File

@@ -0,0 +1,22 @@
# Tair
This page covers how to use the Tair ecosystem within LangChain.
## Installation and Setup
Install Tair Python SDK with `pip install tair`.
## Wrappers
### VectorStore
There exists a wrapper around TairVector, allowing you to use it as a vectorstore,
whether for semantic search or example selection.
To import this vectorstore:
```python
from langchain.vectorstores import Tair
```
For a more detailed walkthrough of the Tair wrapper, see [this notebook](../modules/indexes/vectorstores/examples/tair.ipynb)

View File

@@ -50,7 +50,6 @@
"source": [
"from datetime import datetime\n",
"from langchain.callbacks import WandbCallbackHandler, StdOutCallbackHandler\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.llms import OpenAI"
]
},
@@ -196,8 +195,8 @@
" name=\"llm\",\n",
" tags=[\"test\"],\n",
")\n",
"manager = CallbackManager([StdOutCallbackHandler(), wandb_callback])\n",
"llm = OpenAI(temperature=0, callback_manager=manager, verbose=True)"
"callbacks = [StdOutCallbackHandler(), wandb_callback]\n",
"llm = OpenAI(temperature=0, callbacks=callbacks)"
]
},
{
@@ -484,7 +483,7 @@
"Title: {title}\n",
"Playwright: This is a synopsis for the above play:\"\"\"\n",
"prompt_template = PromptTemplate(input_variables=[\"title\"], template=template)\n",
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template, callback_manager=manager)\n",
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template, callbacks=callbacks)\n",
"\n",
"test_prompts = [\n",
" {\n",
@@ -577,16 +576,15 @@
],
"source": [
"# SCENARIO 3 - Agent with Tools\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm, callback_manager=manager)\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm)\n",
"agent = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,\n",
" callback_manager=manager,\n",
" verbose=True,\n",
")\n",
"agent.run(\n",
" \"Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?\"\n",
" \"Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?\",\n",
" callbacks=callbacks,\n",
")\n",
"wandb_callback.flush_tracker(agent, reset=False, finish=True)"
]

View File

@@ -172,9 +172,9 @@ In order to load agents, you should understand the following concepts:
- LLM: The language model powering the agent.
- Agent: The agent to use. This should be a string that references a support agent class. Because this notebook focuses on the simplest, highest level API, this only covers using the standard supported agents. If you want to implement a custom agent, see the documentation for custom agents (coming soon).
**Agents**: For a list of supported agents and their specifications, see [here](../modules/agents/agents.md).
**Agents**: For a list of supported agents and their specifications, see [here](../modules/agents/getting_started.ipynb).
**Tools**: For a list of predefined tools and their specifications, see [here](../modules/agents/tools.md).
**Tools**: For a list of predefined tools and their specifications, see [here](../modules/agents/tools/getting_started.md).
For this example, you will also need to install the SerpAPI Python package.
@@ -361,9 +361,9 @@ from langchain.prompts.chat import (
chat = ChatOpenAI(temperature=0)
template="You are a helpful assistant that translates {input_language} to {output_language}."
template = "You are a helpful assistant that translates {input_language} to {output_language}."
system_message_prompt = SystemMessagePromptTemplate.from_template(template)
human_template="{text}"
human_template = "{text}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])
@@ -387,9 +387,9 @@ from langchain.prompts.chat import (
chat = ChatOpenAI(temperature=0)
template="You are a helpful assistant that translates {input_language} to {output_language}."
template = "You are a helpful assistant that translates {input_language} to {output_language}."
system_message_prompt = SystemMessagePromptTemplate.from_template(template)
human_template="{text}"
human_template = "{text}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])

View File

@@ -5,7 +5,6 @@ LangChain is a framework for developing applications powered by language models.
- *Be data-aware*: connect a language model to other sources of data
- *Be agentic*: allow a language model to interact with its environment
- *Be stateful*: store and retrieve application state in a manner that enables a language model to make increasingly complex decisions
The LangChain framework is designed with the above principles in mind.
@@ -45,6 +44,8 @@ These modules are, in increasing order of complexity:
- `Agents <./modules/agents.html>`_: Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end to end agents.
- `Callbacks <./modules/callbacks/getting_started.html>`_: It can be difficult to track all that occurs inside a chain or agent - callbacks help add a level of observability and introspection.
.. toctree::
:maxdepth: 1
@@ -58,6 +59,7 @@ These modules are, in increasing order of complexity:
./modules/memory.md
./modules/chains.md
./modules/agents.md
./modules/callbacks/getting_started.ipynb
Use Cases
----------

View File

@@ -28,7 +28,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 7,
"id": "da5df06c-af6f-4572-b9f5-0ab971c16487",
"metadata": {
"tags": []
@@ -42,7 +42,6 @@
"from langchain.agents import AgentType\n",
"from langchain.llms import OpenAI\n",
"from langchain.callbacks.stdout import StdOutCallbackHandler\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.callbacks.tracers import LangChainTracer\n",
"from aiohttp import ClientSession\n",
"\n",
@@ -57,7 +56,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 8,
"id": "fd4c294e-b1d6-44b8-b32e-2765c017e503",
"metadata": {
"tags": []
@@ -73,16 +72,15 @@
"\u001b[32;1m\u001b[1;3m I need to find out who won the US Open men's final in 2019 and then calculate his age raised to the 0.334 power.\n",
"Action: Search\n",
"Action Input: \"US Open men's final 2019 winner\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mRafael Nadal\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Rafael Nadal's age\n",
"Observation: \u001b[33;1m\u001b[1;3mRafael Nadal defeated Daniil Medvedev in the final, 75, 63, 57, 46, 64 to win the men's singles tennis title at the 2019 US Open. It was his fourth US ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out the age of the winner\n",
"Action: Search\n",
"Action Input: \"Rafael Nadal age\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m36 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 36 raised to the 0.334 power\n",
"Thought:\u001b[32;1m\u001b[1;3m I now need to calculate his age raised to the 0.334 power\n",
"Action: Calculator\n",
"Action Input: 36^0.334\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 3.3098250249682484\n",
"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 3.3098250249682484\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Rafael Nadal, aged 36, won the US Open men's final in 2019 and his age raised to the 0.334 power is 3.3098250249682484.\u001b[0m\n",
"\n",
@@ -93,18 +91,17 @@
"\u001b[32;1m\u001b[1;3m I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.\n",
"Action: Search\n",
"Action Input: \"Olivia Wilde boyfriend\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mJason Sudeikis\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Jason Sudeikis' age\n",
"Observation: \u001b[33;1m\u001b[1;3mOlivia Wilde started dating Harry Styles after ending her years-long engagement to Jason Sudeikis — see their relationship timeline.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Harry Styles' age.\n",
"Action: Search\n",
"Action Input: \"Jason Sudeikis age\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m47 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 47 raised to the 0.23 power\n",
"Action Input: \"Harry Styles age\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m29 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 29 raised to the 0.23 power.\n",
"Action: Calculator\n",
"Action Input: 47^0.23\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.4242784855673896\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Jason Sudeikis, Olivia Wilde's boyfriend, is 47 years old and his age raised to the 0.23 power is 2.4242784855673896.\u001b[0m\n",
"Action Input: 29^0.23\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.169459462491557\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: Harry Styles, Olivia Wilde's boyfriend, is 29 years old and his age raised to the 0.23 power is 2.169459462491557.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
@@ -113,17 +110,17 @@
"\u001b[32;1m\u001b[1;3m I need to find out who won the grand prix and then calculate their age raised to the 0.23 power.\n",
"Action: Search\n",
"Action Input: \"Formula 1 Grand Prix Winner\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mMax Verstappen\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Max Verstappen's age\n",
"Observation: \u001b[33;1m\u001b[1;3mMichael Schumacher (top left) and Lewis Hamilton (top right) have each won the championship a record seven times during their careers, while Sebastian Vettel ( ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out the age of the winner\n",
"Action: Search\n",
"Action Input: \"Max Verstappen Age\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m25 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 25 raised to the 0.23 power\n",
"Action Input: \"Michael Schumacher age\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m54 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate the age raised to the 0.23 power\n",
"Action: Calculator\n",
"Action Input: 25^0.23\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 1.84599359907945\u001b[0m\n",
"Action Input: 54^0.23\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.502940725307012\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Max Verstappen, 25 years old, raised to the 0.23 power is 1.84599359907945.\u001b[0m\n",
"Final Answer: Michael Schumacher, aged 54, raised to the 0.23 power is 2.502940725307012.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
@@ -132,18 +129,17 @@
"\u001b[32;1m\u001b[1;3m I need to find out who won the US Open women's final in 2019 and then calculate her age raised to the 0.34 power.\n",
"Action: Search\n",
"Action Input: \"US Open women's final 2019 winner\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mBianca Andreescu defeated Serena Williams in the final, 63, 75 to win the women's singles tennis title at the 2019 US Open. It was her first major title, and she became the first Canadian, as well as the first player born in the 2000s, to win a major singles title.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Bianca Andreescu's age.\n",
"Observation: \u001b[33;1m\u001b[1;3mBianca Andreescu\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out her age\n",
"Action: Search\n",
"Action Input: \"Bianca Andreescu age\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m22 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the age of Bianca Andreescu and can calculate her age raised to the 0.34 power.\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate her age raised to the 0.34 power\n",
"Action: Calculator\n",
"Action Input: 22^0.34\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.8603798598506933\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: Bianca Andreescu won the US Open women's final in 2019 and her age raised to the 0.34 power is 2.8603798598506933.\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.8603798598506933\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Bianca Andreescu, aged 22, won the US Open women's final in 2019 and her age raised to the 0.34 power is 2.86.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
@@ -160,35 +156,32 @@
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 53 raised to the 0.19 power\n",
"Action: Calculator\n",
"Action Input: 53^0.19\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.12624064206896\n",
"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.12624064206896\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Jay-Z is Beyonce's husband and his age raised to the 0.19 power is 2.12624064206896.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"Serial executed in 65.11 seconds.\n"
"Serial executed in 52.47 seconds.\n"
]
}
],
"source": [
"def generate_serially():\n",
" for q in questions:\n",
" llm = OpenAI(temperature=0)\n",
" tools = load_tools([\"llm-math\", \"serpapi\"], llm=llm)\n",
" agent = initialize_agent(\n",
" tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True\n",
" )\n",
" agent.run(q)\n",
"llm = OpenAI(temperature=0)\n",
"tools = load_tools([\"llm-math\", \"serpapi\"], llm=llm)\n",
"agent = initialize_agent(\n",
" tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True\n",
")\n",
"\n",
"s = time.perf_counter()\n",
"generate_serially()\n",
"for q in questions:\n",
" agent.run(q)\n",
"elapsed = time.perf_counter() - s\n",
"print(f\"Serial executed in {elapsed:0.2f} seconds.\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 9,
"id": "076d7b85-45ec-465d-8b31-c2ad119c3438",
"metadata": {
"tags": []
@@ -202,6 +195,9 @@
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\n",
@@ -210,182 +206,94 @@
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.\n",
"Action: Search\n",
"Action Input: \"Olivia Wilde boyfriend\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out who Beyonce's husband is and then calculate his age raised to the 0.19 power.\n",
"Action: Search\n",
"Action Input: \"Who is Beyonce's husband?\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mJay-Z\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out who won the grand prix and then calculate their age raised to the 0.23 power.\n",
"Action: Search\n",
"Action Input: \"Formula 1 Grand Prix Winner\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out who won the US Open women's final in 2019 and then calculate her age raised to the 0.34 power.\n",
"\u001b[32;1m\u001b[1;3m I need to find out who won the US Open women's final in 2019 and then calculate her age raised to the 0.34 power.\n",
"Action: Search\n",
"Action Input: \"US Open women's final 2019 winner\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mJason Sudeikis\u001b[0m\n",
"Thought:\n",
"Observation: \u001b[33;1m\u001b[1;3mMax Verstappen\u001b[0m\n",
"Thought:\n",
"Observation: \u001b[33;1m\u001b[1;3mBianca Andreescu defeated Serena Williams in the final, 63, 75 to win the women's singles tennis title at the 2019 US Open. It was her first major title, and she became the first Canadian, as well as the first player born in the 2000s, to win a major singles title.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Jason Sudeikis' age\n",
"Action: Search\n",
"Action Input: \"Jason Sudeikis age\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out Jay-Z's age\n",
"Action: Search\n",
"Action Input: \"How old is Jay-Z?\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m53 years\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mBianca Andreescu\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out who won the US Open men's final in 2019 and then calculate his age raised to the 0.334 power.\n",
"Action: Search\n",
"Action Input: \"US Open men's final 2019 winner\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mRafael Nadal defeated Daniil Medvedev in the final, 75, 63, 57, 46, 64 to win the men's singles tennis title at the 2019 US Open. It was his fourth US ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.\n",
"Action: Search\n",
"Action Input: \"Olivia Wilde boyfriend\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out who won the grand prix and then calculate their age raised to the 0.23 power.\n",
"Action: Search\n",
"Action Input: \"Formula 1 Grand Prix Winner\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out who Beyonce's husband is and then calculate his age raised to the 0.19 power.\n",
"Action: Search\n",
"Action Input: \"Who is Beyonce's husband?\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mOlivia Wilde started dating Harry Styles after ending her years-long engagement to Jason Sudeikis — see their relationship timeline.\u001b[0m\n",
"Thought:\n",
"Observation: \u001b[33;1m\u001b[1;3m47 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Max Verstappen's age\n",
"Observation: \u001b[33;1m\u001b[1;3mJay-Z\u001b[0m\n",
"Thought:\n",
"Observation: \u001b[33;1m\u001b[1;3mMichael Schumacher (top left) and Lewis Hamilton (top right) have each won the championship a record seven times during their careers, while Sebastian Vettel ( ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out her age\n",
"Action: Search\n",
"Action Input: \"Max Verstappen Age\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m25 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Bianca Andreescu's age.\n",
"Action Input: \"Bianca Andreescu age\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out Jay-Z's age\n",
"Action: Search\n",
"Action Input: \"Bianca Andreescu age\"\u001b[0m\n",
"Action Input: \"How old is Jay-Z?\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m53 years\u001b[0m\n",
"Thought:\n",
"Observation: \u001b[33;1m\u001b[1;3m22 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 53 raised to the 0.19 power\n",
"Action: Calculator\n",
"Action Input: 53^0.19\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out the age of the winner\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Harry Styles' age.\n",
"Action: Search\n",
"Action Input: \"Rafael Nadal age\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to calculate 47 raised to the 0.23 power\n",
"Action Input: \"Harry Styles age\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m29 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate her age raised to the 0.34 power\n",
"Action: Calculator\n",
"Action Input: 47^0.23\u001b[0m\n",
"Action Input: 22^0.34\u001b[0m\u001b[32;1m\u001b[1;3m I need to calculate 53 raised to the 0.19 power\n",
"Action: Calculator\n",
"Action Input: 53^0.19\u001b[0m\u001b[32;1m\u001b[1;3m I need to calculate 29 raised to the 0.23 power.\n",
"Action: Calculator\n",
"Action Input: 29^0.23\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out the age of the winner\n",
"Action: Search\n",
"Action Input: \"Rafael Nadal age\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out the age of the winner\n",
"Action: Search\n",
"Action Input: \"Michael Schumacher age\"\u001b[0m\n",
"Observation: \n",
"Observation: \u001b[33;1m\u001b[1;3m36 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 25 raised to the 0.23 power\n",
"Action: Calculator\n",
"Action Input: 25^0.23\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.12624064206896\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the age of Bianca Andreescu and can calculate her age raised to the 0.34 power.\n",
"Action: Calculator\n",
"Action Input: 22^0.34\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 1.84599359907945\u001b[0m\n",
"Thought:\u001b[33;1m\u001b[1;3m54 years\u001b[0m\n",
"Thought:\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.4242784855673896\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now need to calculate his age raised to the 0.334 power\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.8603798598506933\u001b[0m\n",
"Thought:\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.169459462491557\u001b[0m\n",
"Thought:\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.12624064206896\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate the age raised to the 0.334 power\n",
"Action: Calculator\n",
"Action Input: 36^0.334\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.8603798598506933\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Jay-Z is Beyonce's husband and his age raised to the 0.19 power is 2.12624064206896.\u001b[0m\n",
"\n",
"Action Input: 36^0.334\u001b[0m\u001b[32;1m\u001b[1;3m I now need to calculate the age raised to the 0.23 power\n",
"Action: Calculator\n",
"Action Input: 54^0.23\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Max Verstappen, 25 years old, raised to the 0.23 power is 1.84599359907945.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 3.3098250249682484\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Jason Sudeikis, Olivia Wilde's boyfriend, is 47 years old and his age raised to the 0.23 power is 2.4242784855673896.\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 3.3098250249682484\u001b[0m\n",
"Thought:\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.502940725307012\u001b[0m\n",
"Thought:\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: Bianca Andreescu won the US Open women's final in 2019 and her age raised to the 0.34 power is 2.8603798598506933.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Rafael Nadal, aged 36, won the US Open men's final in 2019 and his age raised to the 0.334 power is 3.3098250249682484.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"Concurrent executed in 12.38 seconds.\n"
"Concurrent executed in 14.49 seconds.\n"
]
}
],
"source": [
"async def generate_concurrently():\n",
" agents = []\n",
" # To make async requests in Tools more efficient, you can pass in your own aiohttp.ClientSession, \n",
" # but you must manually close the client session at the end of your program/event loop\n",
" aiosession = ClientSession()\n",
" for _ in questions:\n",
" manager = CallbackManager([StdOutCallbackHandler()])\n",
" llm = OpenAI(temperature=0, callback_manager=manager)\n",
" async_tools = load_tools([\"llm-math\", \"serpapi\"], llm=llm, aiosession=aiosession, callback_manager=manager)\n",
" agents.append(\n",
" initialize_agent(async_tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True, callback_manager=manager)\n",
" )\n",
" tasks = [async_agent.arun(q) for async_agent, q in zip(agents, questions)]\n",
" await asyncio.gather(*tasks)\n",
" await aiosession.close()\n",
"llm = OpenAI(temperature=0)\n",
"tools = load_tools([\"llm-math\", \"serpapi\"], llm=llm)\n",
"agent = initialize_agent(\n",
" tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True\n",
")\n",
"\n",
"s = time.perf_counter()\n",
"# If running this outside of Jupyter, use asyncio.run(generate_concurrently())\n",
"await generate_concurrently()\n",
"# If running this outside of Jupyter, use asyncio.run or loop.run_until_complete\n",
"tasks = [agent.arun(q) for q in questions]\n",
"await asyncio.gather(*tasks)\n",
"elapsed = time.perf_counter() - s\n",
"print(f\"Concurrent executed in {elapsed:0.2f} seconds.\")"
]
},
{
"cell_type": "markdown",
"id": "97ef285c-4a43-4a4e-9698-cd52a1bc56c9",
"metadata": {},
"source": [
"## Using Tracing with Asynchronous Agents\n",
"\n",
"To use tracing with async agents, you must pass in a custom `CallbackManager` with `LangChainTracer` to each agent running asynchronously. This way, you avoid collisions while the trace is being collected."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "44bda05a-d33e-4e91-9a71-a0f3f96aae95",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who won the US Open men's final in 2019 and then calculate his age raised to the 0.334 power.\n",
"Action: Search\n",
"Action Input: \"US Open men's final 2019 winner\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mRafael Nadal\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Rafael Nadal's age\n",
"Action: Search\n",
"Action Input: \"Rafael Nadal age\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m36 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 36 raised to the 0.334 power\n",
"Action: Calculator\n",
"Action Input: 36^0.334\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 3.3098250249682484\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Rafael Nadal, aged 36, won the US Open men's final in 2019 and his age raised to the 0.334 power is 3.3098250249682484.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
}
],
"source": [
"# To make async requests in Tools more efficient, you can pass in your own aiohttp.ClientSession, \n",
"# but you must manually close the client session at the end of your program/event loop\n",
"aiosession = ClientSession()\n",
"tracer = LangChainTracer()\n",
"tracer.load_default_session()\n",
"manager = CallbackManager([StdOutCallbackHandler(), tracer])\n",
"\n",
"# Pass the manager into the llm if you want llm calls traced.\n",
"llm = OpenAI(temperature=0, callback_manager=manager)\n",
"\n",
"async_tools = load_tools([\"llm-math\", \"serpapi\"], llm=llm, aiosession=aiosession)\n",
"async_agent = initialize_agent(async_tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True, callback_manager=manager)\n",
"await async_agent.arun(questions[0])\n",
"await aiosession.close()"
]
}
],
"metadata": {
@@ -404,7 +312,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.9"
}
},
"nbformat": 4,

View File

@@ -49,7 +49,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "a33e2f7e",
"metadata": {},
"outputs": [],
@@ -97,7 +97,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 4,
"id": "655d72f6",
"metadata": {},
"outputs": [],
@@ -107,7 +107,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 5,
"id": "490604e9",
"metadata": {},
"outputs": [],
@@ -117,7 +117,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 6,
"id": "653b1617",
"metadata": {},
"outputs": [
@@ -128,7 +128,7 @@
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\u001b[0m\u001b[36;1m\u001b[1;3mFoo Fighters is an American rock band formed in Seattle in 1994. Foo Fighters was initially formed as a one-man project by former Nirvana drummer Dave Grohl. Following the success of the 1995 eponymous debut album, Grohl recruited a band consisting of Nate Mendel, William Goldsmith, and Pat Smear.\u001b[0m\u001b[32;1m\u001b[1;3m\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\u001b[0m\u001b[36;1m\u001b[1;3mThe current population of Canada is 38,669,152 as of Monday, April 24, 2023, based on Worldometer elaboration of the latest United Nations data.\u001b[0m\u001b[32;1m\u001b[1;3m\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -136,10 +136,10 @@
{
"data": {
"text/plain": [
"'Foo Fighters is an American rock band formed in Seattle in 1994. Foo Fighters was initially formed as a one-man project by former Nirvana drummer Dave Grohl. Following the success of the 1995 eponymous debut album, Grohl recruited a band consisting of Nate Mendel, William Goldsmith, and Pat Smear.'"
"'The current population of Canada is 38,669,152 as of Monday, April 24, 2023, based on Worldometer elaboration of the latest United Nations data.'"
]
},
"execution_count": 7,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}

View File

@@ -373,6 +373,7 @@
"metadata": {},
"outputs": [],
"source": [
"tools = get_tools(\"whats the weather?\")\n",
"tool_names = [tool.name for tool in tools]\n",
"agent = LLMSingleActionAgent(\n",
" llm_chain=llm_chain, \n",

View File

@@ -31,7 +31,7 @@
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 2,
"id": "d7c4ebdc",
"metadata": {},
"outputs": [],
@@ -43,7 +43,7 @@
},
{
"cell_type": "code",
"execution_count": 22,
"execution_count": 3,
"id": "becda2a1",
"metadata": {},
"outputs": [],
@@ -66,7 +66,7 @@
},
{
"cell_type": "code",
"execution_count": 23,
"execution_count": 4,
"id": "a33e2f7e",
"metadata": {},
"outputs": [],
@@ -96,8 +96,8 @@
" \"\"\"\n",
" if len(intermediate_steps) == 0:\n",
" return [\n",
" AgentAction(tool=\"Search\", tool_input=\"foo\", log=\"\"),\n",
" AgentAction(tool=\"RandomWord\", tool_input=\"foo\", log=\"\"),\n",
" AgentAction(tool=\"Search\", tool_input=kwargs[\"input\"], log=\"\"),\n",
" AgentAction(tool=\"RandomWord\", tool_input=kwargs[\"input\"], log=\"\"),\n",
" ]\n",
" else:\n",
" return AgentFinish(return_values={\"output\": \"bar\"}, log=\"\")\n",
@@ -117,8 +117,8 @@
" \"\"\"\n",
" if len(intermediate_steps) == 0:\n",
" return [\n",
" AgentAction(tool=\"Search\", tool_input=\"foo\", log=\"\"),\n",
" AgentAction(tool=\"RandomWord\", tool_input=\"foo\", log=\"\"),\n",
" AgentAction(tool=\"Search\", tool_input=kwargs[\"input\"], log=\"\"),\n",
" AgentAction(tool=\"RandomWord\", tool_input=kwargs[\"input\"], log=\"\"),\n",
" ]\n",
" else:\n",
" return AgentFinish(return_values={\"output\": \"bar\"}, log=\"\")"
@@ -126,7 +126,7 @@
},
{
"cell_type": "code",
"execution_count": 24,
"execution_count": 5,
"id": "655d72f6",
"metadata": {},
"outputs": [],
@@ -136,7 +136,7 @@
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": 6,
"id": "490604e9",
"metadata": {},
"outputs": [],
@@ -146,7 +146,7 @@
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": 7,
"id": "653b1617",
"metadata": {},
"outputs": [
@@ -157,7 +157,7 @@
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\u001b[0m\u001b[36;1m\u001b[1;3mFoo Fighters is an American rock band formed in Seattle in 1994. Foo Fighters was initially formed as a one-man project by former Nirvana drummer Dave Grohl. Following the success of the 1995 eponymous debut album, Grohl recruited a band consisting of Nate Mendel, William Goldsmith, and Pat Smear.\u001b[0m\u001b[32;1m\u001b[1;3m\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\u001b[0m\u001b[36;1m\u001b[1;3mThe current population of Canada is 38,669,152 as of Monday, April 24, 2023, based on Worldometer elaboration of the latest United Nations data.\u001b[0m\u001b[32;1m\u001b[1;3m\u001b[0m\n",
"Now I'm doing this!\n",
"\u001b[33;1m\u001b[1;3mfoo\u001b[0m\u001b[32;1m\u001b[1;3m\u001b[0m\n",
"\n",
@@ -170,7 +170,7 @@
"'bar'"
]
},
"execution_count": 26,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}

View File

@@ -14,11 +14,26 @@
},
{
"cell_type": "code",
"execution_count": 7,
"id": "0cdd9bf5",
"metadata": {},
"execution_count": null,
"id": "65a8ebbe-cd83-401a-8ca7-cb6541121d0c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install \"pandas\" > /dev/null"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "0cdd9bf5",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.callbacks import tracing_enabled\n",
"from langchain.agents import create_pandas_dataframe_agent"
]
},
@@ -26,7 +41,9 @@
"cell_type": "code",
"execution_count": 2,
"id": "051ebe84",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.llms import OpenAI\n",
@@ -37,17 +54,19 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 7,
"id": "4185ff46",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"agent = create_pandas_dataframe_agent(OpenAI(temperature=0), df, verbose=True)"
"agent = create_pandas_dataframe_agent(OpenAI(temperature=0), df, verbose=False)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 9,
"id": "a9207a2e",
"metadata": {},
"outputs": [
@@ -55,32 +74,17 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to count the number of rows\n",
"Action: python_repl_ast\n",
"Action Input: len(df)\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m891\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: There are 891 rows in the dataframe.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
"Error in on_llm_start callback: Parent run with UUID 1b224cee-9715-4f0b-ba33-527468c2cebb not found.\n",
"Error in on_llm_end callback: No LLMRun found to be traced\n",
"Error in on_llm_start callback: Parent run with UUID 51658270-136c-4763-9c8a-49db12e2efc9 not found.\n",
"Error in on_llm_end callback: No LLMRun found to be traced\n",
"There are 891 rows.\n"
]
},
{
"data": {
"text/plain": [
"'There are 891 rows in the dataframe.'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"how many rows are there?\")"
"with tracing_enabled():\n",
" print(agent.run(\"how many rows are there?\"))"
]
},
{
@@ -96,11 +100,91 @@
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3m\n",
"You are working with a pandas dataframe in Python. The name of the dataframe is `df`.\n",
"You should use the tools below to answer the question posed of you:\n",
"\n",
"python_repl_ast: A Python shell. Use this to execute python commands. Input should be a valid python command. When using this tool, sometimes output is abbreviated - make sure it does not look abbreviated before using it in your answer.\n",
"\n",
"Use the following format:\n",
"\n",
"Question: the input question you must answer\n",
"Thought: you should always think about what to do\n",
"Action: the action to take, should be one of [python_repl_ast]\n",
"Action Input: the input to the action\n",
"Observation: the result of the action\n",
"... (this Thought/Action/Action Input/Observation can repeat N times)\n",
"Thought: I now know the final answer\n",
"Final Answer: the final answer to the original input question\n",
"\n",
"\n",
"This is the result of `print(df.head())`:\n",
"| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |\n",
"|---:|--------------:|-----------:|---------:|:----------------------------------------------------|:-------|------:|--------:|--------:|:-----------------|--------:|:--------|:-----------|\n",
"| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22 | 1 | 0 | A/5 21171 | 7.25 | nan | S |\n",
"| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |\n",
"| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26 | 0 | 0 | STON/O2. 3101282 | 7.925 | nan | S |\n",
"| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35 | 1 | 0 | 113803 | 53.1 | C123 | S |\n",
"| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35 | 0 | 0 | 373450 | 8.05 | nan | S |\n",
"\n",
"Begin!\n",
"Question: how many people have more than 3 sibligngs\n",
"\u001b[0m\n",
"Error in on_llm_start callback: Parent run with UUID 6b86a7f0-bdf6-43e8-b94d-937c743b60f5 not found.\n",
"Error in on_llm_end callback: No LLMRun found to be traced\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to count the number of people with more than 3 siblings\n",
"Action: python_repl_ast\n",
"Action Input: df[df['SibSp'] > 3].shape[0]\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m30\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Thought:\n",
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3m\n",
"You are working with a pandas dataframe in Python. The name of the dataframe is `df`.\n",
"You should use the tools below to answer the question posed of you:\n",
"\n",
"python_repl_ast: A Python shell. Use this to execute python commands. Input should be a valid python command. When using this tool, sometimes output is abbreviated - make sure it does not look abbreviated before using it in your answer.\n",
"\n",
"Use the following format:\n",
"\n",
"Question: the input question you must answer\n",
"Thought: you should always think about what to do\n",
"Action: the action to take, should be one of [python_repl_ast]\n",
"Action Input: the input to the action\n",
"Observation: the result of the action\n",
"... (this Thought/Action/Action Input/Observation can repeat N times)\n",
"Thought: I now know the final answer\n",
"Final Answer: the final answer to the original input question\n",
"\n",
"\n",
"This is the result of `print(df.head())`:\n",
"| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |\n",
"|---:|--------------:|-----------:|---------:|:----------------------------------------------------|:-------|------:|--------:|--------:|:-----------------|--------:|:--------|:-----------|\n",
"| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22 | 1 | 0 | A/5 21171 | 7.25 | nan | S |\n",
"| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |\n",
"| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26 | 0 | 0 | STON/O2. 3101282 | 7.925 | nan | S |\n",
"| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35 | 1 | 0 | 113803 | 53.1 | C123 | S |\n",
"| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35 | 0 | 0 | 373450 | 8.05 | nan | S |\n",
"\n",
"Begin!\n",
"Question: how many people have more than 3 sibligngs\n",
"Thought: I need to count the number of people with more than 3 siblings\n",
"Action: python_repl_ast\n",
"Action Input: df[df['SibSp'] > 3].shape[0]\n",
"Observation: 30\n",
"Thought:\u001b[0m\n",
"Error in on_llm_start callback: Parent run with UUID 73a825e3-1270-4735-b40b-208a0915e8be not found.\n",
"Error in on_llm_end callback: No LLMRun found to be traced\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: 30 people have more than 3 siblings.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
@@ -134,23 +218,253 @@
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3m\n",
"You are working with a pandas dataframe in Python. The name of the dataframe is `df`.\n",
"You should use the tools below to answer the question posed of you:\n",
"\n",
"python_repl_ast: A Python shell. Use this to execute python commands. Input should be a valid python command. When using this tool, sometimes output is abbreviated - make sure it does not look abbreviated before using it in your answer.\n",
"\n",
"Use the following format:\n",
"\n",
"Question: the input question you must answer\n",
"Thought: you should always think about what to do\n",
"Action: the action to take, should be one of [python_repl_ast]\n",
"Action Input: the input to the action\n",
"Observation: the result of the action\n",
"... (this Thought/Action/Action Input/Observation can repeat N times)\n",
"Thought: I now know the final answer\n",
"Final Answer: the final answer to the original input question\n",
"\n",
"\n",
"This is the result of `print(df.head())`:\n",
"| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |\n",
"|---:|--------------:|-----------:|---------:|:----------------------------------------------------|:-------|------:|--------:|--------:|:-----------------|--------:|:--------|:-----------|\n",
"| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22 | 1 | 0 | A/5 21171 | 7.25 | nan | S |\n",
"| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |\n",
"| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26 | 0 | 0 | STON/O2. 3101282 | 7.925 | nan | S |\n",
"| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35 | 1 | 0 | 113803 | 53.1 | C123 | S |\n",
"| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35 | 0 | 0 | 373450 | 8.05 | nan | S |\n",
"\n",
"Begin!\n",
"Question: whats the square root of the average age?\n",
"\u001b[0m\n",
"Error in on_llm_start callback: Parent run with UUID 8d771bc6-208a-4580-b43d-ec92ca04e39f not found.\n",
"Error in on_llm_end callback: No LLMRun found to be traced\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to calculate the average age first\n",
"Action: python_repl_ast\n",
"Action Input: df['Age'].mean()\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m29.69911764705882\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I can now calculate the square root\n",
"Thought:\n",
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3m\n",
"You are working with a pandas dataframe in Python. The name of the dataframe is `df`.\n",
"You should use the tools below to answer the question posed of you:\n",
"\n",
"python_repl_ast: A Python shell. Use this to execute python commands. Input should be a valid python command. When using this tool, sometimes output is abbreviated - make sure it does not look abbreviated before using it in your answer.\n",
"\n",
"Use the following format:\n",
"\n",
"Question: the input question you must answer\n",
"Thought: you should always think about what to do\n",
"Action: the action to take, should be one of [python_repl_ast]\n",
"Action Input: the input to the action\n",
"Observation: the result of the action\n",
"... (this Thought/Action/Action Input/Observation can repeat N times)\n",
"Thought: I now know the final answer\n",
"Final Answer: the final answer to the original input question\n",
"\n",
"\n",
"This is the result of `print(df.head())`:\n",
"| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |\n",
"|---:|--------------:|-----------:|---------:|:----------------------------------------------------|:-------|------:|--------:|--------:|:-----------------|--------:|:--------|:-----------|\n",
"| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22 | 1 | 0 | A/5 21171 | 7.25 | nan | S |\n",
"| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |\n",
"| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26 | 0 | 0 | STON/O2. 3101282 | 7.925 | nan | S |\n",
"| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35 | 1 | 0 | 113803 | 53.1 | C123 | S |\n",
"| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35 | 0 | 0 | 373450 | 8.05 | nan | S |\n",
"\n",
"Begin!\n",
"Question: whats the square root of the average age?\n",
"Thought: I need to calculate the average age first\n",
"Action: python_repl_ast\n",
"Action Input: df['Age'].mean()\n",
"Observation: 29.69911764705882\n",
"Thought:\u001b[0m\n",
"Error in on_llm_start callback: Parent run with UUID b30b9eac-eee3-406c-b7c3-a3db3b549948 not found.\n",
"Error in on_llm_end callback: No LLMRun found to be traced\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I now need to calculate the square root of the average age\n",
"Action: python_repl_ast\n",
"Action Input: math.sqrt(df['Age'].mean())\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mname 'math' is not defined\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to import the math library\n",
"Thought:\n",
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3m\n",
"You are working with a pandas dataframe in Python. The name of the dataframe is `df`.\n",
"You should use the tools below to answer the question posed of you:\n",
"\n",
"python_repl_ast: A Python shell. Use this to execute python commands. Input should be a valid python command. When using this tool, sometimes output is abbreviated - make sure it does not look abbreviated before using it in your answer.\n",
"\n",
"Use the following format:\n",
"\n",
"Question: the input question you must answer\n",
"Thought: you should always think about what to do\n",
"Action: the action to take, should be one of [python_repl_ast]\n",
"Action Input: the input to the action\n",
"Observation: the result of the action\n",
"... (this Thought/Action/Action Input/Observation can repeat N times)\n",
"Thought: I now know the final answer\n",
"Final Answer: the final answer to the original input question\n",
"\n",
"\n",
"This is the result of `print(df.head())`:\n",
"| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |\n",
"|---:|--------------:|-----------:|---------:|:----------------------------------------------------|:-------|------:|--------:|--------:|:-----------------|--------:|:--------|:-----------|\n",
"| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22 | 1 | 0 | A/5 21171 | 7.25 | nan | S |\n",
"| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |\n",
"| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26 | 0 | 0 | STON/O2. 3101282 | 7.925 | nan | S |\n",
"| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35 | 1 | 0 | 113803 | 53.1 | C123 | S |\n",
"| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35 | 0 | 0 | 373450 | 8.05 | nan | S |\n",
"\n",
"Begin!\n",
"Question: whats the square root of the average age?\n",
"Thought: I need to calculate the average age first\n",
"Action: python_repl_ast\n",
"Action Input: df['Age'].mean()\n",
"Observation: 29.69911764705882\n",
"Thought: I now need to calculate the square root of the average age\n",
"Action: python_repl_ast\n",
"Action Input: math.sqrt(df['Age'].mean())\n",
"Observation: name 'math' is not defined\n",
"Thought:\u001b[0m\n",
"Error in on_llm_start callback: Parent run with UUID 5c463668-dfa6-45e5-856d-2da5c91516d8 not found.\n",
"Error in on_llm_end callback: No LLMRun found to be traced\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to import the math library\n",
"Action: python_repl_ast\n",
"Action Input: import math\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mNone\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I can now calculate the square root\n",
"Observation: \u001b[36;1m\u001b[1;3m\u001b[0m\n",
"Thought:\n",
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3m\n",
"You are working with a pandas dataframe in Python. The name of the dataframe is `df`.\n",
"You should use the tools below to answer the question posed of you:\n",
"\n",
"python_repl_ast: A Python shell. Use this to execute python commands. Input should be a valid python command. When using this tool, sometimes output is abbreviated - make sure it does not look abbreviated before using it in your answer.\n",
"\n",
"Use the following format:\n",
"\n",
"Question: the input question you must answer\n",
"Thought: you should always think about what to do\n",
"Action: the action to take, should be one of [python_repl_ast]\n",
"Action Input: the input to the action\n",
"Observation: the result of the action\n",
"... (this Thought/Action/Action Input/Observation can repeat N times)\n",
"Thought: I now know the final answer\n",
"Final Answer: the final answer to the original input question\n",
"\n",
"\n",
"This is the result of `print(df.head())`:\n",
"| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |\n",
"|---:|--------------:|-----------:|---------:|:----------------------------------------------------|:-------|------:|--------:|--------:|:-----------------|--------:|:--------|:-----------|\n",
"| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22 | 1 | 0 | A/5 21171 | 7.25 | nan | S |\n",
"| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |\n",
"| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26 | 0 | 0 | STON/O2. 3101282 | 7.925 | nan | S |\n",
"| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35 | 1 | 0 | 113803 | 53.1 | C123 | S |\n",
"| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35 | 0 | 0 | 373450 | 8.05 | nan | S |\n",
"\n",
"Begin!\n",
"Question: whats the square root of the average age?\n",
"Thought: I need to calculate the average age first\n",
"Action: python_repl_ast\n",
"Action Input: df['Age'].mean()\n",
"Observation: 29.69911764705882\n",
"Thought: I now need to calculate the square root of the average age\n",
"Action: python_repl_ast\n",
"Action Input: math.sqrt(df['Age'].mean())\n",
"Observation: name 'math' is not defined\n",
"Thought: I need to import the math library\n",
"Action: python_repl_ast\n",
"Action Input: import math\n",
"Observation: \n",
"Thought:\u001b[0m\n",
"Error in on_llm_start callback: Parent run with UUID 3e09fe75-cb84-4b8d-83a1-7037d8e12aab not found.\n",
"Error in on_llm_end callback: No LLMRun found to be traced\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I now need to calculate the square root of the average age\n",
"Action: python_repl_ast\n",
"Action Input: math.sqrt(df['Age'].mean())\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m5.449689683556195\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Thought:\n",
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3m\n",
"You are working with a pandas dataframe in Python. The name of the dataframe is `df`.\n",
"You should use the tools below to answer the question posed of you:\n",
"\n",
"python_repl_ast: A Python shell. Use this to execute python commands. Input should be a valid python command. When using this tool, sometimes output is abbreviated - make sure it does not look abbreviated before using it in your answer.\n",
"\n",
"Use the following format:\n",
"\n",
"Question: the input question you must answer\n",
"Thought: you should always think about what to do\n",
"Action: the action to take, should be one of [python_repl_ast]\n",
"Action Input: the input to the action\n",
"Observation: the result of the action\n",
"... (this Thought/Action/Action Input/Observation can repeat N times)\n",
"Thought: I now know the final answer\n",
"Final Answer: the final answer to the original input question\n",
"\n",
"\n",
"This is the result of `print(df.head())`:\n",
"| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |\n",
"|---:|--------------:|-----------:|---------:|:----------------------------------------------------|:-------|------:|--------:|--------:|:-----------------|--------:|:--------|:-----------|\n",
"| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22 | 1 | 0 | A/5 21171 | 7.25 | nan | S |\n",
"| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |\n",
"| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26 | 0 | 0 | STON/O2. 3101282 | 7.925 | nan | S |\n",
"| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35 | 1 | 0 | 113803 | 53.1 | C123 | S |\n",
"| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35 | 0 | 0 | 373450 | 8.05 | nan | S |\n",
"\n",
"Begin!\n",
"Question: whats the square root of the average age?\n",
"Thought: I need to calculate the average age first\n",
"Action: python_repl_ast\n",
"Action Input: df['Age'].mean()\n",
"Observation: 29.69911764705882\n",
"Thought: I now need to calculate the square root of the average age\n",
"Action: python_repl_ast\n",
"Action Input: math.sqrt(df['Age'].mean())\n",
"Observation: name 'math' is not defined\n",
"Thought: I need to import the math library\n",
"Action: python_repl_ast\n",
"Action Input: import math\n",
"Observation: \n",
"Thought: I now need to calculate the square root of the average age\n",
"Action: python_repl_ast\n",
"Action Input: math.sqrt(df['Age'].mean())\n",
"Observation: 5.449689683556195\n",
"Thought:\u001b[0m\n",
"Error in on_llm_start callback: Parent run with UUID 32e38f5f-a71a-4040-89c5-cca7752ba72d not found.\n",
"Error in on_llm_end callback: No LLMRun found to be traced\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: 5.449689683556195\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
@@ -196,7 +510,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.11.2"
}
},
"nbformat": 4,

File diff suppressed because one or more lines are too long

View File

@@ -55,14 +55,16 @@
},
"outputs": [],
"source": [
"llm = AzureOpenAI(temperature=0, deployment_name=\"text-davinci-003\", verbose=True)\n",
"fast_llm = AzureOpenAI(temperature=0.5, max_tokens=1000, deployment_name=\"gpt-35-turbo\", verbose=True)\n",
"smart_llm = AzureOpenAI(temperature=0, max_tokens=100, deployment_name=\"gpt-4\", verbose=True)\n",
"\n",
"toolkit = PowerBIToolkit(\n",
" powerbi=PowerBIDataset(None, \"<dataset_id>\", ['table1', 'table2'], DefaultAzureCredential()), \n",
" llm=llm\n",
" powerbi=PowerBIDataset(dataset_id=\"<dataset_id>\", table_names=['table1', 'table2'], credential=DefaultAzureCredential()), \n",
" llm=smart_llm\n",
")\n",
"\n",
"agent_executor = create_pbi_agent(\n",
" llm=llm,\n",
" llm=fast_llm,\n",
" toolkit=toolkit,\n",
" verbose=True,\n",
")"
@@ -141,6 +143,56 @@
"source": [
"agent_executor.run(\"What unique values are there for dimensions2 in table2\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6fd950e4",
"metadata": {},
"source": [
"## Example: add your own few-shot prompts"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "87d677f9",
"metadata": {},
"outputs": [],
"source": [
"#fictional example\n",
"few_shots = \"\"\"\n",
"Question: How many rows are in the table revenue?\n",
"DAX: EVALUATE ROW(\"Number of rows\", COUNTROWS(revenue_details))\n",
"----\n",
"Question: How many rows are in the table revenue where year is not empty?\n",
"DAX: EVALUATE ROW(\"Number of rows\", COUNTROWS(FILTER(revenue_details, revenue_details[year] <> \"\")))\n",
"----\n",
"Question: What was the average of value in revenue in dollars?\n",
"DAX: EVALUATE ROW(\"Average\", AVERAGE(revenue_details[dollar_value]))\n",
"----\n",
"\"\"\"\n",
"toolkit = PowerBIToolkit(\n",
" powerbi=PowerBIDataset(dataset_id=\"<dataset_id>\", table_names=['table1', 'table2'], credential=DefaultAzureCredential()), \n",
" llm=smart_llm,\n",
" examples=few_shots,\n",
")\n",
"agent_executor = create_pbi_agent(\n",
" llm=fast_llm,\n",
" toolkit=toolkit,\n",
" verbose=True,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "33f4bb43",
"metadata": {},
"outputs": [],
"source": [
"agent_executor.run(\"What was the maximum of value in revenue in dollars in 2022?\")"
]
}
],
"metadata": {

View File

@@ -674,153 +674,6 @@
"source": [
"agent.run(\"whats 2**.12\")"
]
},
{
"cell_type": "markdown",
"id": "8aa3c353-bd89-467c-9c27-b83a90cd4daa",
"metadata": {},
"source": [
"## Multi-argument tools\n",
"\n",
"Many functions expect structured inputs. These can also be supported using the Tool decorator or by directly subclassing `BaseTool`! We have to modify the LLM's OutputParser to map its string output to a dictionary to pass to the action, however."
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "537bc628",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from typing import Optional, Union\n",
"\n",
"@tool\n",
"def custom_search(k: int, query: str, other_arg: Optional[str] = None):\n",
" \"\"\"The custom search function.\"\"\"\n",
" return f\"Here are the results for the custom search: k={k}, query={query}, other_arg={other_arg}\""
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "d5c992cf-776a-40cd-a6c4-e7cf65ea709e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import re\n",
"from langchain.schema import (\n",
" AgentAction,\n",
" AgentFinish,\n",
")\n",
"from langchain.agents import AgentOutputParser\n",
"\n",
"# We will add a custom parser to map the arguments to a dictionary\n",
"class CustomOutputParser(AgentOutputParser):\n",
" \n",
" def parse_tool_input(self, action_input: str) -> dict:\n",
" # Regex pattern to match arguments and their values\n",
" pattern = r\"(\\w+)\\s*=\\s*(None|\\\"[^\\\"]*\\\"|\\d+)\"\n",
" matches = re.findall(pattern, action_input)\n",
" \n",
" if not matches:\n",
" raise ValueError(f\"Could not parse action input: `{action_input}`\")\n",
"\n",
" # Create a dictionary with the parsed arguments and their values\n",
" parsed_input = {}\n",
" for arg, value in matches:\n",
" if value == \"None\":\n",
" parsed_value = None\n",
" elif value.isdigit():\n",
" parsed_value = int(value)\n",
" else:\n",
" parsed_value = value.strip('\"')\n",
" parsed_input[arg] = parsed_value\n",
"\n",
" return parsed_input\n",
" \n",
" def parse(self, llm_output: str) -> Union[AgentAction, AgentFinish]:\n",
" # Check if agent should finish\n",
" if \"Final Answer:\" in llm_output:\n",
" return AgentFinish(\n",
" # Return values is generally always a dictionary with a single `output` key\n",
" # It is not recommended to try anything else at the moment :)\n",
" return_values={\"output\": llm_output.split(\"Final Answer:\")[-1].strip()},\n",
" log=llm_output,\n",
" )\n",
" # Parse out the action and action input\n",
" regex = r\"Action\\s*\\d*\\s*:(.*?)\\nAction\\s*\\d*\\s*Input\\s*\\d*\\s*:[\\s]*(.*)\"\n",
" match = re.search(regex, llm_output, re.DOTALL)\n",
" if not match:\n",
" raise ValueError(f\"Could not parse LLM output: `{llm_output}`\")\n",
" action = match.group(1).strip()\n",
" action_input = match.group(2)\n",
" tool_input = self.parse_tool_input(action_input)\n",
" # Return the action and action \n",
" return AgentAction(tool=action, tool_input=tool_input, log=llm_output)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "68269547-1482-4138-a6ea-58f00b4a9548",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"agent = initialize_agent([custom_search], llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True, agent_kwargs={\"output_parser\": CustomOutputParser()})"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "0947835a-691c-4f51-b8f4-6744e0e48ab1",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to use a search function to find the answer\n",
"Action: custom_search\n",
"Action Input: k=1, query=\"me\"\u001b[0m\u001b[36;1m\u001b[1;3mHere are the results for the custom search: k=1, query=me, other_arg=None\u001b[0m\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: The results of the custom search for k=1, query=me, other_arg=None.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The results of the custom search for k=1, query=me, other_arg=None.'"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"Search for me and tell me whatever it says\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "caf39c66-102b-42c1-baf2-777a49886ce4",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -839,7 +692,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.11.2"
},
"vscode": {
"interpreter": {

View File

@@ -40,15 +40,19 @@
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "2a50dd27",
"metadata": {
"tags": []
},
"outputs": [],
"cell_type": "markdown",
"id": "c89c110c-96ac-4fe1-ba3e-6056543d1a59",
"metadata": {},
"source": [
"arxiv = ArxivAPIWrapper()"
"Run a query to get information about some `scientific article`/articles. The query text is limited to 300 characters.\n",
"\n",
"It returns these article fields:\n",
"- Publishing date\n",
"- Title\n",
"- Authors\n",
"- Summary\n",
"\n",
"Next query returns information about one article with arxiv Id equal \"1605.08386\". "
]
},
{
@@ -71,10 +75,22 @@
}
],
"source": [
"\n",
"arxiv = ArxivAPIWrapper()\n",
"docs = arxiv.run(\"1605.08386\")\n",
"docs"
]
},
{
"cell_type": "markdown",
"id": "840f70c9-8f80-4680-bb38-46198e931bcf",
"metadata": {},
"source": [
"Now, we want to get information about one author, `Caprice Stanley`.\n",
"\n",
"This query returns information about three articles. By default, query returns information only about three top articles."
]
},
{
"cell_type": "code",
"execution_count": 5,
@@ -99,6 +115,14 @@
"docs"
]
},
{
"cell_type": "markdown",
"id": "2d9b6292-a47d-4f99-9827-8e9f244bf887",
"metadata": {},
"source": [
"Now, we are trying to find information about non-existing article. In this case, the response is \"No good Arxiv Result was found\""
]
},
{
"cell_type": "code",
"execution_count": 6,
@@ -122,14 +146,6 @@
"docs = arxiv.run(\"1605.08386WWW\")\n",
"docs"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4f4e9602",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -148,7 +164,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -5,57 +5,158 @@
"id": "8f210ec3",
"metadata": {},
"source": [
"# Bash\n",
"It can often be useful to have an LLM generate bash commands, and then run them. A common use case for this is letting the LLM interact with your local file system. We provide an easy util to execute bash commands."
"# Shell Tool\n",
"\n",
"Giving agents access to the shell is powerful (though risky outside a sandboxed environment).\n",
"\n",
"The LLM can use it to execute any shell commands. A common use case for this is letting the LLM interact with your local file system."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "f7b3767b",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.utilities import BashProcess"
"from langchain.tools import ShellTool\n",
"\n",
"shell_tool = ShellTool()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "cf1c92f0",
"metadata": {},
"outputs": [],
"source": [
"bash = BashProcess()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "2fa952fc",
"metadata": {},
"id": "c92ac832-556b-4f66-baa4-b78f965dfba0",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"bash.ipynb\n",
"google_search.ipynb\n",
"python.ipynb\n",
"requests.ipynb\n",
"serpapi.ipynb\n",
"Hello World!\n",
"\n",
"real\t0m0.000s\n",
"user\t0m0.000s\n",
"sys\t0m0.000s\n",
"\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/wfh/code/lc/lckg/langchain/tools/shell/tool.py:34: UserWarning: The shell tool has no safeguards by default. Use at your own risk.\n",
" warnings.warn(\n"
]
}
],
"source": [
"print(bash.run(\"ls\"))"
"print(shell_tool.run({\"commands\": [\"echo 'Hello World!'\", \"time\"]}))"
]
},
{
"cell_type": "markdown",
"id": "2fa952fc",
"metadata": {},
"source": [
"### Use with Agents\n",
"\n",
"As with all tools, these can be given to an agent to accomplish more complex tasks. Let's have the agent fetch some links from a web page."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "851fee9f",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mQuestion: What is the task?\n",
"Thought: We need to download the langchain.com webpage and extract all the URLs from it. Then we need to sort the URLs and return them.\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"shell\",\n",
" \"action_input\": {\n",
" \"commands\": [\n",
" \"curl -s https://langchain.com | grep -o 'http[s]*://[^\\\" ]*' | sort\"\n",
" ]\n",
" }\n",
"}\n",
"```\n",
"\u001b[0m"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/wfh/code/lc/lckg/langchain/tools/shell/tool.py:34: UserWarning: The shell tool has no safeguards by default. Use at your own risk.\n",
" warnings.warn(\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Observation: \u001b[36;1m\u001b[1;3mhttps://blog.langchain.dev/\n",
"https://discord.gg/6adMQxSpJS\n",
"https://docs.langchain.com/docs/\n",
"https://github.com/hwchase17/chat-langchain\n",
"https://github.com/hwchase17/langchain\n",
"https://github.com/hwchase17/langchainjs\n",
"https://github.com/sullivan-sean/chat-langchainjs\n",
"https://js.langchain.com/docs/\n",
"https://python.langchain.com/en/latest/\n",
"https://twitter.com/langchainai\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe URLs have been successfully extracted and sorted. We can return the list of URLs as the final answer.\n",
"Final Answer: [\"https://blog.langchain.dev/\", \"https://discord.gg/6adMQxSpJS\", \"https://docs.langchain.com/docs/\", \"https://github.com/hwchase17/chat-langchain\", \"https://github.com/hwchase17/langchain\", \"https://github.com/hwchase17/langchainjs\", \"https://github.com/sullivan-sean/chat-langchainjs\", \"https://js.langchain.com/docs/\", \"https://python.langchain.com/en/latest/\", \"https://twitter.com/langchainai\"]\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'[\"https://blog.langchain.dev/\", \"https://discord.gg/6adMQxSpJS\", \"https://docs.langchain.com/docs/\", \"https://github.com/hwchase17/chat-langchain\", \"https://github.com/hwchase17/langchain\", \"https://github.com/hwchase17/langchainjs\", \"https://github.com/sullivan-sean/chat-langchainjs\", \"https://js.langchain.com/docs/\", \"https://python.langchain.com/en/latest/\", \"https://twitter.com/langchainai\"]'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.agents import initialize_agent\n",
"from langchain.agents import AgentType\n",
"\n",
"llm = ChatOpenAI(temperature=0)\n",
"\n",
"shell_tool.description = shell_tool.description + f\"args {shell_tool.args}\".replace(\"{\", \"{{\").replace(\"}\", \"}}\")\n",
"self_ask_with_search = initialize_agent([shell_tool], llm, agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION, verbose=True)\n",
"self_ask_with_search.run(\"Download the langchain.com webpage and grep for all urls. Return only a sorted list of them. Be sure to use double quotes.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "851fee9f",
"id": "8d0ea3ac-0890-4e39-9cec-74bd80b4b8b8",
"metadata": {},
"outputs": [],
"source": []
@@ -77,7 +178,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.8.16"
}
},
"nbformat": 4,

View File

@@ -27,7 +27,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.tools import DuckDuckGoSearchTool"
"from langchain.tools import DuckDuckGoSearchRun"
]
},
{
@@ -37,7 +37,7 @@
"metadata": {},
"outputs": [],
"source": [
"search = DuckDuckGoSearchTool()"
"search = DuckDuckGoSearchRun()"
]
},
{

View File

@@ -0,0 +1,190 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# File System Tools\n",
"\n",
"LangChain provides tools for interacting with a local file system out of the box. This notebook walks through some of them.\n",
"\n",
"Note: these tools are not recommended for use outside a sandboxed environment! "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, we'll import the tools."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.tools.file_management import (\n",
" ReadFileTool,\n",
" CopyFileTool,\n",
" DeleteFileTool,\n",
" MoveFileTool,\n",
" WriteFileTool,\n",
" ListDirectoryTool,\n",
")\n",
"from langchain.agents.agent_toolkits import FileManagementToolkit\n",
"from tempfile import TemporaryDirectory\n",
"\n",
"# We'll make a temporary directory to avoid clutter\n",
"working_directory = TemporaryDirectory()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The FileManagementToolkit\n",
"\n",
"If you want to provide all the file tooling to your agent, it's easy to do so with the toolkit. We'll pass the temporary directory in as a root directory as a workspace for the LLM.\n",
"\n",
"It's recommended to always pass in a root directory, since without one, it's easy for the LLM to pollute the working directory, and without one, there isn't any validation against\n",
"straightforward prompt injection."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"[CopyFileTool(name='copy_file', description='Create a copy of a file in a specified location', args_schema=<class 'langchain.tools.file_management.copy.FileCopyInput'>, return_direct=False, verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x1156f4350>, root_dir='/var/folders/gf/6rnp_mbx5914kx7qmmh7xzmw0000gn/T/tmpxb8c3aug'),\n",
" DeleteFileTool(name='file_delete', description='Delete a file', args_schema=<class 'langchain.tools.file_management.delete.FileDeleteInput'>, return_direct=False, verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x1156f4350>, root_dir='/var/folders/gf/6rnp_mbx5914kx7qmmh7xzmw0000gn/T/tmpxb8c3aug'),\n",
" FileSearchTool(name='file_search', description='Recursively search for files in a subdirectory that match the regex pattern', args_schema=<class 'langchain.tools.file_management.file_search.FileSearchInput'>, return_direct=False, verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x1156f4350>, root_dir='/var/folders/gf/6rnp_mbx5914kx7qmmh7xzmw0000gn/T/tmpxb8c3aug'),\n",
" MoveFileTool(name='move_file', description='Move or rename a file from one location to another', args_schema=<class 'langchain.tools.file_management.move.FileMoveInput'>, return_direct=False, verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x1156f4350>, root_dir='/var/folders/gf/6rnp_mbx5914kx7qmmh7xzmw0000gn/T/tmpxb8c3aug'),\n",
" ReadFileTool(name='read_file', description='Read file from disk', args_schema=<class 'langchain.tools.file_management.read.ReadFileInput'>, return_direct=False, verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x1156f4350>, root_dir='/var/folders/gf/6rnp_mbx5914kx7qmmh7xzmw0000gn/T/tmpxb8c3aug'),\n",
" WriteFileTool(name='write_file', description='Write file to disk', args_schema=<class 'langchain.tools.file_management.write.WriteFileInput'>, return_direct=False, verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x1156f4350>, root_dir='/var/folders/gf/6rnp_mbx5914kx7qmmh7xzmw0000gn/T/tmpxb8c3aug'),\n",
" ListDirectoryTool(name='list_directory', description='List files and directories in a specified folder', args_schema=<class 'langchain.tools.file_management.list_dir.DirectoryListingInput'>, return_direct=False, verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x1156f4350>, root_dir='/var/folders/gf/6rnp_mbx5914kx7qmmh7xzmw0000gn/T/tmpxb8c3aug')]"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"toolkit = FileManagementToolkit(root_dir=str(working_directory.name)) # If you don't provide a root_dir, operations will default to the current working directory\n",
"toolkit.get_tools()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Selecting File System Tools\n",
"\n",
"If you only want to select certain tools, you can pass them in as arguments when initializing the toolkit, or you can individually initialize the desired tools."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"[ReadFileTool(name='read_file', description='Read file from disk', args_schema=<class 'langchain.tools.file_management.read.ReadFileInput'>, return_direct=False, verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x1156f4350>, root_dir='/var/folders/gf/6rnp_mbx5914kx7qmmh7xzmw0000gn/T/tmpxb8c3aug'),\n",
" WriteFileTool(name='write_file', description='Write file to disk', args_schema=<class 'langchain.tools.file_management.write.WriteFileInput'>, return_direct=False, verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x1156f4350>, root_dir='/var/folders/gf/6rnp_mbx5914kx7qmmh7xzmw0000gn/T/tmpxb8c3aug'),\n",
" ListDirectoryTool(name='list_directory', description='List files and directories in a specified folder', args_schema=<class 'langchain.tools.file_management.list_dir.DirectoryListingInput'>, return_direct=False, verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x1156f4350>, root_dir='/var/folders/gf/6rnp_mbx5914kx7qmmh7xzmw0000gn/T/tmpxb8c3aug')]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tools = FileManagementToolkit(root_dir=str(working_directory.name), selected_tools=[\"read_file\", \"write_file\", \"list_directory\"]).get_tools()\n",
"tools"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'File written successfully to example.txt.'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"read_tool, write_tool, list_tool = tools\n",
"write_tool.run({\"file_path\": \"example.txt\", \"text\": \"Hello World!\"})"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'example.txt'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# List files in the working directory\n",
"list_tool.run({})"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -69,7 +69,8 @@
}
],
"source": [
"StableDiffusionTool().langchain.run(\"Please create a photo of a dog riding a skateboard\")"
"local_file_path = StableDiffusionTool().langchain.run(\"Please create a photo of a dog riding a skateboard\")\n",
"local_file_path"
]
},
{
@@ -89,7 +90,7 @@
"metadata": {},
"outputs": [],
"source": [
"im = Image.open(\"/Users/harrisonchase/workplace/langchain/docs/modules/agents/tools/examples/b61c1dd9-47e2-46f1-a47c-20d27640993d/tmp4ap48vnm.jpg\")"
"im = Image.open(local_file_path)"
]
},
{

View File

@@ -19,6 +19,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import Tool\n",
"from langchain.utilities import PythonREPL"
]
},
@@ -59,7 +60,14 @@
"id": "54fc1f03",
"metadata": {},
"outputs": [],
"source": []
"source": [
"# You can create the tool to pass to an agent\n",
"repl_tool = Tool(\n",
" name=\"python_repl\",\n",
" description=\"A Python shell. Use this to execute python commands. Input should be a valid python command. If you want to see the output of a value, you should print it out with `print(...)`.\",\n",
" func=python_repl\n",
")"
]
}
],
"metadata": {

View File

@@ -0,0 +1,116 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# SceneXplain\n",
"\n",
"\n",
"[SceneXplain](https://scenex.jina.ai/) is an ImageCaptioning service accessible through the SceneXplain Tool.\n",
"\n",
"To use this tool, you'll need to make an account and fetch your API Token [from the website](https://scenex.jina.ai/api). Then you can instantiate the tool."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from langchain.tools import SceneXplainTool\n",
"\n",
"\n",
"os.environ[\"SCENEX_API_KEY\"] = \"<YOUR_API_KEY>\"\n",
"tool = SceneXplainTool()\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Usage in an Agent\n",
"\n",
"The tool can be used in any LangChain agent as follows:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\n",
"Thought: Do I need to use a tool? Yes\n",
"Action: Image Explainer\n",
"Action Input: https://storage.googleapis.com/causal-diffusion.appspot.com/imagePrompts%2F0rw369i5h9t%2Foriginal.png\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mIn a charmingly whimsical scene, a young girl is seen braving the rain alongside her furry companion, the lovable Totoro. The two are depicted standing on a bustling street corner, where they are sheltered from the rain by a bright yellow umbrella. The girl, dressed in a cheerful yellow frock, holds onto the umbrella with both hands while gazing up at Totoro with an expression of wonder and delight.\n",
"\n",
"Totoro, meanwhile, stands tall and proud beside his young friend, holding his own umbrella aloft to protect them both from the downpour. His furry body is rendered in rich shades of grey and white, while his large ears and wide eyes lend him an endearing charm.\n",
"\n",
"In the background of the scene, a street sign can be seen jutting out from the pavement amidst a flurry of raindrops. A sign with Chinese characters adorns its surface, adding to the sense of cultural diversity and intrigue. Despite the dreary weather, there is an undeniable sense of joy and camaraderie in this heartwarming image.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m Do I need to use a tool? No\n",
"AI: This image appears to be a still from the 1988 Japanese animated fantasy film My Neighbor Totoro. The film follows two young girls, Satsuki and Mei, as they explore the countryside and befriend the magical forest spirits, including the titular character Totoro.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"This image appears to be a still from the 1988 Japanese animated fantasy film My Neighbor Totoro. The film follows two young girls, Satsuki and Mei, as they explore the countryside and befriend the magical forest spirits, including the titular character Totoro.\n"
]
}
],
"source": [
"from langchain.llms import OpenAI\n",
"from langchain.agents import initialize_agent\n",
"from langchain.memory import ConversationBufferMemory\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"memory = ConversationBufferMemory(memory_key=\"chat_history\")\n",
"tools = [\n",
" tool\n",
"]\n",
"\n",
"agent = initialize_agent(\n",
" tools, llm, memory=memory, agent=\"conversational-react-description\", verbose=True\n",
")\n",
"output = agent.run(\n",
" input=(\n",
" \"What is in this image https://storage.googleapis.com/causal-diffusion.appspot.com/imagePrompts%2F0rw369i5h9t%2Foriginal.png. \"\n",
" \"Is it movie or a game? If it is a movie, what is the name of the movie?\"\n",
" )\n",
")\n",
"\n",
"print(output)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -102,7 +102,15 @@
"id": "e0a1dc1c",
"metadata": {},
"outputs": [],
"source": []
"source": [
"from langchain.agents import Tool\n",
"# You can create the tool to pass to an agent\n",
"repl_tool = Tool(\n",
" name=\"python_repl\",\n",
" description=\"A Python shell. Use this to execute python commands. Input should be a valid python command. If you want to see the output of a value, you should print it out with `print(...)`.\",\n",
" func=search.run,\n",
")"
]
}
],
"metadata": {

File diff suppressed because it is too large Load Diff

View File

@@ -10,7 +10,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 9,
"metadata": {},
"outputs": [
{
@@ -24,8 +24,8 @@
"\n",
"```bash\n",
"echo \"Hello World\"\n",
"```\u001b[0m['```bash', 'echo \"Hello World\"', '```']\n",
"\n",
"```\u001b[0m\n",
"Code: \u001b[33;1m\u001b[1;3m['echo \"Hello World\"']\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3mHello World\n",
"\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
@@ -37,7 +37,7 @@
"'Hello World\\n'"
]
},
"execution_count": 1,
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
@@ -50,7 +50,7 @@
"\n",
"text = \"Please write a bash script that prints 'Hello World' to the console.\"\n",
"\n",
"bash_chain = LLMBashChain(llm=llm, verbose=True)\n",
"bash_chain = LLMBashChain.from_llm(llm, verbose=True)\n",
"\n",
"bash_chain.run(text)"
]
@@ -65,11 +65,12 @@
},
{
"cell_type": "code",
"execution_count": 28,
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts.prompt import PromptTemplate\n",
"from langchain.chains.llm_bash.prompt import BashOutputParser\n",
"\n",
"_PROMPT_TEMPLATE = \"\"\"If someone asks you to perform a task, your job is to come up with a series of bash commands that will perform the task. There is no need to put \"#!/bin/bash\" in your answer. Make sure to reason step by step, using this format:\n",
"Question: \"copy the files in the directory named 'target' into a new directory at the same level as target called 'myNewDirectory'\"\n",
@@ -88,12 +89,12 @@
"That is the format. Begin!\n",
"Question: {question}\"\"\"\n",
"\n",
"PROMPT = PromptTemplate(input_variables=[\"question\"], template=_PROMPT_TEMPLATE)"
"PROMPT = PromptTemplate(input_variables=[\"question\"], template=_PROMPT_TEMPLATE, output_parser=BashOutputParser())"
]
},
{
"cell_type": "code",
"execution_count": 29,
"execution_count": 11,
"metadata": {},
"outputs": [
{
@@ -107,8 +108,8 @@
"\n",
"```bash\n",
"printf \"Hello World\\n\"\n",
"```\u001b[0m['```bash', 'printf \"Hello World\\\\n\"', '```']\n",
"\n",
"```\u001b[0m\n",
"Code: \u001b[33;1m\u001b[1;3m['printf \"Hello World\\\\n\"']\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3mHello World\n",
"\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
@@ -120,18 +121,125 @@
"'Hello World\\n'"
]
},
"execution_count": 29,
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bash_chain = LLMBashChain(llm=llm, prompt=PROMPT, verbose=True)\n",
"bash_chain = LLMBashChain.from_llm(llm, prompt=PROMPT, verbose=True)\n",
"\n",
"text = \"Please write a bash script that prints 'Hello World' to the console.\"\n",
"\n",
"bash_chain.run(text)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Persistent Terminal\n",
"\n",
"By default, the chain will run in a separate subprocess each time it is called. This behavior can be changed by instantiating with a persistent bash process."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new LLMBashChain chain...\u001b[0m\n",
"List the current directory then move up a level.\u001b[32;1m\u001b[1;3m\n",
"\n",
"```bash\n",
"ls\n",
"cd ..\n",
"```\u001b[0m\n",
"Code: \u001b[33;1m\u001b[1;3m['ls', 'cd ..']\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3mapi.ipynb\t\t\tllm_summarization_checker.ipynb\n",
"constitutional_chain.ipynb\tmoderation.ipynb\n",
"llm_bash.ipynb\t\t\topenai_openapi.yaml\n",
"llm_checker.ipynb\t\topenapi.ipynb\n",
"llm_math.ipynb\t\t\tpal.ipynb\n",
"llm_requests.ipynb\t\tsqlite.ipynb\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'api.ipynb\\t\\t\\tllm_summarization_checker.ipynb\\r\\nconstitutional_chain.ipynb\\tmoderation.ipynb\\r\\nllm_bash.ipynb\\t\\t\\topenai_openapi.yaml\\r\\nllm_checker.ipynb\\t\\topenapi.ipynb\\r\\nllm_math.ipynb\\t\\t\\tpal.ipynb\\r\\nllm_requests.ipynb\\t\\tsqlite.ipynb'"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.utilities.bash import BashProcess\n",
"\n",
"\n",
"persistent_process = BashProcess(persistent=True)\n",
"bash_chain = LLMBashChain.from_llm(llm, bash_process=persistent_process, verbose=True)\n",
"\n",
"text = \"List the current directory then move up a level.\"\n",
"\n",
"bash_chain.run(text)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new LLMBashChain chain...\u001b[0m\n",
"List the current directory then move up a level.\u001b[32;1m\u001b[1;3m\n",
"\n",
"```bash\n",
"ls\n",
"cd ..\n",
"```\u001b[0m\n",
"Code: \u001b[33;1m\u001b[1;3m['ls', 'cd ..']\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3mexamples\t\tgetting_started.ipynb\tindex_examples\n",
"generic\t\t\thow_to_guides.rst\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'examples\\t\\tgetting_started.ipynb\\tindex_examples\\r\\ngeneric\\t\\t\\thow_to_guides.rst'"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Run the same command again and see that the state is maintained between calls\n",
"bash_chain.run(text)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -150,7 +258,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.9.1"
}
},
"nbformat": 4,

View File

@@ -23,28 +23,16 @@
"\n",
"\n",
"\u001b[1m> Entering new SequentialChain chain...\u001b[0m\n",
"\u001b[1mChain 0\u001b[0m:\n",
"{'statement': '\\nNone. Mammals do not lay eggs.'}\n",
"\n",
"\u001b[1mChain 1\u001b[0m:\n",
"{'assertions': '\\n• Mammals reproduce using live birth\\n• Mammals do not lay eggs\\n• Animals that lay eggs are not mammals'}\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\u001b[1mChain 2\u001b[0m:\n",
"{'checked_assertions': '\\n1. True\\n\\n2. True\\n\\n3. False - Mammals are a class of animals that includes animals that lay eggs, such as monotremes (platypus and echidna).'}\n",
"\n",
"\u001b[1mChain 3\u001b[0m:\n",
"{'revised_statement': ' Monotremes, such as the platypus and echidna, lay the biggest eggs of any mammal.'}\n",
"\n",
"\n",
"\u001b[1m> Finished SequentialChain chain.\u001b[0m\n",
"\n",
"\u001b[1m> Finished LLMCheckerChain chain.\u001b[0m\n"
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"' Monotremes, such as the platypus and echidna, lay the biggest eggs of any mammal.'"
"' No mammal lays the biggest eggs. The Elephant Bird, which was a species of giant bird, laid the largest eggs of any bird.'"
]
},
"execution_count": 1,
@@ -60,7 +48,7 @@
"\n",
"text = \"What type of mammal lays the biggest eggs?\"\n",
"\n",
"checker_chain = LLMCheckerChain(llm=llm, verbose=True)\n",
"checker_chain = LLMCheckerChain.from_llm(llm, verbose=True)\n",
"\n",
"checker_chain.run(text)"
]
@@ -89,7 +77,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.9.1"
}
},
"nbformat": 4,

View File

@@ -12,7 +12,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 4,
"id": "44e9ba31",
"metadata": {},
"outputs": [
@@ -24,23 +24,22 @@
"\n",
"\u001b[1m> Entering new LLMMathChain chain...\u001b[0m\n",
"What is 13 raised to the .3432 power?\u001b[32;1m\u001b[1;3m\n",
"```python\n",
"import math\n",
"print(math.pow(13, .3432))\n",
"```text\n",
"13 ** .3432\n",
"```\n",
"...numexpr.evaluate(\"13 ** .3432\")...\n",
"\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m2.4116004626599237\n",
"\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m2.4116004626599237\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'Answer: 2.4116004626599237\\n'"
"'Answer: 2.4116004626599237'"
]
},
"execution_count": 1,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
@@ -49,102 +48,7 @@
"from langchain import OpenAI, LLMMathChain\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"llm_math = LLMMathChain(llm=llm, verbose=True)\n",
"\n",
"llm_math.run(\"What is 13 raised to the .3432 power?\")"
]
},
{
"cell_type": "markdown",
"id": "2bdd5fc6",
"metadata": {},
"source": [
"## Customize Prompt\n",
"You can also customize the prompt that is used. Here is an example prompting it to use numpy"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "76be17b0",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts.prompt import PromptTemplate\n",
"\n",
"_PROMPT_TEMPLATE = \"\"\"You are GPT-3, and you can't do math.\n",
"\n",
"You can do basic math, and your memorization abilities are impressive, but you can't do any complex calculations that a human could not do in their head. You also have an annoying tendency to just make up highly specific, but wrong, answers.\n",
"\n",
"So we hooked you up to a Python 3 kernel, and now you can execute code. If you execute code, you must print out the final answer using the print function. You MUST use the python package numpy to answer your question. You must import numpy as np.\n",
"\n",
"\n",
"Question: ${{Question with hard calculation.}}\n",
"```python\n",
"${{Code that prints what you need to know}}\n",
"print(${{code}})\n",
"```\n",
"```output\n",
"${{Output of your code}}\n",
"```\n",
"Answer: ${{Answer}}\n",
"\n",
"Begin.\n",
"\n",
"Question: What is 37593 * 67?\n",
"\n",
"```python\n",
"import numpy as np\n",
"print(np.multiply(37593, 67))\n",
"```\n",
"```output\n",
"2518731\n",
"```\n",
"Answer: 2518731\n",
"\n",
"Question: {question}\"\"\"\n",
"\n",
"PROMPT = PromptTemplate(input_variables=[\"question\"], template=_PROMPT_TEMPLATE)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "0c42faa0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new LLMMathChain chain...\u001b[0m\n",
"What is 13 raised to the .3432 power?\u001b[32;1m\u001b[1;3m\n",
"\n",
"```python\n",
"import numpy as np\n",
"print(np.power(13, .3432))\n",
"```\n",
"\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m2.4116004626599237\n",
"\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'Answer: 2.4116004626599237\\n'"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm_math = LLMMathChain(llm=llm, prompt=PROMPT, verbose=True)\n",
"llm_math = LLMMathChain.from_llm(llm, verbose=True)\n",
"\n",
"llm_math.run(\"What is 13 raised to the .3432 power?\")"
]
@@ -152,7 +56,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "0c62951b",
"id": "e978bb8e",
"metadata": {},
"outputs": [],
"source": []
@@ -174,7 +78,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.9.1"
}
},
"nbformat": 4,

View File

@@ -221,11 +221,11 @@
"\n",
"• The light from these galaxies has been traveling for over 13 billion years to reach us. - True \n",
"\n",
"• JWST has provided us with the first images of exoplanets, which are planets outside of our own solar system. - False. The first exoplanet was discovered in 1992, but the first images of exoplanets were taken by the Hubble Space Telescope in 1995. \n",
"• JWST has provided us with the first images of exoplanets, which are planets outside of our own solar system. - False. The first exoplanet was discovered in 1992, but the first images of exoplanets were taken by the Hubble Space Telescope in 2004. \n",
"\n",
"• Exoplanets were first discovered in 1992. - True \n",
"\n",
"• The JWST has allowed us to see exoplanets in greater detail. - Undetermined. It is too early to tell as the JWST has not been launched yet.\n",
"• The JWST has allowed us to see exoplanets in greater detail. - Undetermined. The JWST has not yet been launched, so it is not yet known how much detail it will be able to provide.\n",
"\"\"\"\n",
"\n",
"Original Summary:\n",
@@ -296,11 +296,11 @@
"\n",
"• The light from these galaxies has been traveling for over 13 billion years to reach us. - True \n",
"\n",
"• JWST has provided us with the first images of exoplanets, which are planets outside of our own solar system. - False. The first exoplanet was discovered in 1992, but the first images of exoplanets were taken by the Hubble Space Telescope in 1995. \n",
"• JWST has provided us with the first images of exoplanets, which are planets outside of our own solar system. - False. The first exoplanet was discovered in 1992, but the first images of exoplanets were taken by the Hubble Space Telescope in 2004. \n",
"\n",
"• Exoplanets were first discovered in 1992. - True \n",
"\n",
"• The JWST has allowed us to see exoplanets in greater detail. - Undetermined. It is too early to tell as the JWST has not been launched yet.\n",
"• The JWST has allowed us to see exoplanets in greater detail. - Undetermined. The JWST has not yet been launched, so it is not yet known how much detail it will be able to provide.\n",
"\"\"\"\n",
"Result:\u001b[0m\n",
"\n",
@@ -312,7 +312,7 @@
"Your 9-year old might like these recent discoveries made by The James Webb Space Telescope (JWST):\n",
"• In 2023, The JWST will spot a number of galaxies nicknamed \"green peas.\" They were given this name because they are small, round, and green, like peas.\n",
"• The telescope will capture images of galaxies that are over 13 billion years old. This means that the light from these galaxies has been traveling for over 13 billion years to reach us.\n",
"• Exoplanets, which are planets outside of our own solar system, were first discovered in 1992. The JWST will allow us to see them in greater detail than ever before.\n",
"• Exoplanets, which are planets outside of our own solar system, were first discovered in 1992. The JWST will allow us to see them in greater detail when it is launched in 2023.\n",
"These discoveries can spark a child's imagination about the infinite wonders of the universe.\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
@@ -321,7 +321,7 @@
{
"data": {
"text/plain": [
"'Your 9-year old might like these recent discoveries made by The James Webb Space Telescope (JWST):\\n• In 2023, The JWST will spot a number of galaxies nicknamed \"green peas.\" They were given this name because they are small, round, and green, like peas.\\n• The telescope will capture images of galaxies that are over 13 billion years old. This means that the light from these galaxies has been traveling for over 13 billion years to reach us.\\n• Exoplanets, which are planets outside of our own solar system, were first discovered in 1992. The JWST will allow us to see them in greater detail than ever before.\\nThese discoveries can spark a child\\'s imagination about the infinite wonders of the universe.'"
"'Your 9-year old might like these recent discoveries made by The James Webb Space Telescope (JWST):\\n• In 2023, The JWST will spot a number of galaxies nicknamed \"green peas.\" They were given this name because they are small, round, and green, like peas.\\n• The telescope will capture images of galaxies that are over 13 billion years old. This means that the light from these galaxies has been traveling for over 13 billion years to reach us.\\n• Exoplanets, which are planets outside of our own solar system, were first discovered in 1992. The JWST will allow us to see them in greater detail when it is launched in 2023.\\nThese discoveries can spark a child\\'s imagination about the infinite wonders of the universe.'"
]
},
"execution_count": 1,
@@ -334,7 +334,7 @@
"from langchain.llms import OpenAI\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"checker_chain = LLMSummarizationCheckerChain(llm=llm, verbose=True, max_checks=2)\n",
"checker_chain = LLMSummarizationCheckerChain.from_llm(llm, verbose=True, max_checks=2)\n",
"text = \"\"\"\n",
"Your 9-year old might like these recent discoveries made by The James Webb Space Telescope (JWST):\n",
"• In 2023, The JWST spotted a number of galaxies nicknamed \"green peas.\" They were given this name because they are small, round, and green, like peas.\n",
@@ -407,7 +407,8 @@
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mBelow are some assertions that have been fact checked and are labeled as true of false. If the answer is false, a suggestion is given for a correction.\n",
"\n",
"Checked Assertions:\"\"\"\n",
"Checked Assertions:\n",
"\"\"\"\n",
"\n",
"- The Greenland Sea is an outlying portion of the Arctic Ocean located between Iceland, Norway, the Svalbard archipelago and Greenland. True\n",
"\n",
@@ -428,7 +429,8 @@
"- It is considered the northern branch of the Norwegian Sea. True\n",
"\"\"\"\n",
"\n",
"Original Summary:\"\"\"\n",
"Original Summary:\n",
"\"\"\"\n",
"The Greenland Sea is an outlying portion of the Arctic Ocean located between Iceland, Norway, the Svalbard archipelago and Greenland. It has an area of 465,000 square miles and is one of five oceans in the world, alongside the Pacific Ocean, Atlantic Ocean, Indian Ocean, and the Southern Ocean. It is the smallest of the five oceans and is covered almost entirely by water, some of which is frozen in the form of glaciers and icebergs. The sea is named after the island of Greenland, and is the Arctic Ocean's main outlet to the Atlantic. It is often frozen over so navigation is limited, and is considered the northern branch of the Norwegian Sea.\n",
"\"\"\"\n",
"\n",
@@ -443,7 +445,7 @@
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mBelow are some assertions that have been fact checked and are labeled as true of false.\n",
"\u001b[32;1m\u001b[1;3mBelow are some assertions that have been fact checked and are labeled as true or false.\n",
"\n",
"If all of the assertions are true, return \"True\". If any of the assertions are false, return \"False\".\n",
"\n",
@@ -555,7 +557,8 @@
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mBelow are some assertions that have been fact checked and are labeled as true of false. If the answer is false, a suggestion is given for a correction.\n",
"\n",
"Checked Assertions:\"\"\"\n",
"Checked Assertions:\n",
"\"\"\"\n",
"\n",
"- The Greenland Sea is an outlying portion of the Arctic Ocean located between Iceland, Norway, the Svalbard archipelago and Greenland. True\n",
"\n",
@@ -574,7 +577,8 @@
"- It is considered the northern branch of the Norwegian Sea. False - It is considered the northern branch of the Atlantic Ocean.\n",
"\"\"\"\n",
"\n",
"Original Summary:\"\"\"\n",
"Original Summary:\n",
"\"\"\"\n",
"\n",
"The Greenland Sea is an outlying portion of the Arctic Ocean located between Iceland, Norway, the Svalbard archipelago and Greenland. It has an area of 465,000 square miles and is an arm of the Arctic Ocean. It is covered almost entirely by water, some of which is frozen in the form of glaciers and icebergs. The sea is named after the island of Greenland, and is the Arctic Ocean's main outlet to the Atlantic. It is often frozen over so navigation is limited, and is considered the northern branch of the Norwegian Sea.\n",
"\"\"\"\n",
@@ -583,14 +587,20 @@
"\n",
"The output should have the same structure and formatting as the original summary.\n",
"\n",
"Summary:\u001b[0m\n",
"Summary:\u001b[0m\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mBelow are some assertions that have been fact checked and are labeled as true of false.\n",
"\u001b[32;1m\u001b[1;3mBelow are some assertions that have been fact checked and are labeled as true or false.\n",
"\n",
"If all of the assertions are true, return \"True\". If any of the assertions are false, return \"False\".\n",
"\n",
@@ -701,7 +711,8 @@
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mBelow are some assertions that have been fact checked and are labeled as true of false. If the answer is false, a suggestion is given for a correction.\n",
"\n",
"Checked Assertions:\"\"\"\n",
"Checked Assertions:\n",
"\"\"\"\n",
"\n",
"- The Greenland Sea is an outlying portion of the Arctic Ocean located between Iceland, Norway, the Svalbard archipelago and Greenland. True\n",
"\n",
@@ -718,7 +729,8 @@
"- It is considered the northern branch of the Atlantic Ocean. False - The Greenland Sea is considered part of the Arctic Ocean, not the Atlantic Ocean.\n",
"\"\"\"\n",
"\n",
"Original Summary:\"\"\"\n",
"Original Summary:\n",
"\"\"\"\n",
"\n",
"\n",
"The Greenland Sea is an outlying portion of the Arctic Ocean located between Iceland, Norway, the Svalbard archipelago and Greenland. It has an area of 465,000 square miles and is an arm of the Arctic Ocean. It is covered almost entirely by water, some of which is frozen in the form of glaciers and icebergs. The sea is named after the country of Greenland, and is the Arctic Ocean's main outlet to the Atlantic. It is often frozen over so navigation is limited, and is considered the northern branch of the Atlantic Ocean.\n",
@@ -735,7 +747,7 @@
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mBelow are some assertions that have been fact checked and are labeled as true of false.\n",
"\u001b[32;1m\u001b[1;3mBelow are some assertions that have been fact checked and are labeled as true or false.\n",
"\n",
"If all of the assertions are true, return \"True\". If any of the assertions are false, return \"False\".\n",
"\n",
@@ -813,14 +825,14 @@
"from langchain.llms import OpenAI\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"checker_chain = LLMSummarizationCheckerChain(llm=llm, verbose=True, max_checks=3)\n",
"checker_chain = LLMSummarizationCheckerChain.from_llm(llm, verbose=True, max_checks=3)\n",
"text = \"The Greenland Sea is an outlying portion of the Arctic Ocean located between Iceland, Norway, the Svalbard archipelago and Greenland. It has an area of 465,000 square miles and is one of five oceans in the world, alongside the Pacific Ocean, Atlantic Ocean, Indian Ocean, and the Southern Ocean. It is the smallest of the five oceans and is covered almost entirely by water, some of which is frozen in the form of glaciers and icebergs. The sea is named after the island of Greenland, and is the Arctic Ocean's main outlet to the Atlantic. It is often frozen over so navigation is limited, and is considered the northern branch of the Norwegian Sea.\"\n",
"checker_chain.run(text)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 3,
"metadata": {},
"outputs": [
{
@@ -1077,7 +1089,7 @@
"'Birds are not mammals, but they are a class of their own. They lay eggs, unlike mammals which give birth to live young.'"
]
},
"execution_count": 2,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
@@ -1087,17 +1099,10 @@
"from langchain.llms import OpenAI\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"checker_chain = LLMSummarizationCheckerChain(llm=llm, max_checks=3, verbose=True)\n",
"checker_chain = LLMSummarizationCheckerChain.from_llm(llm, max_checks=3, verbose=True)\n",
"text = \"Mammals can lay eggs, birds can lay eggs, therefore birds are mammals.\"\n",
"checker_chain.run(text)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {

View File

@@ -28,7 +28,7 @@
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(model_name='code-davinci-002', temperature=0, max_tokens=512)"
"llm = OpenAI(temperature=0, max_tokens=512)"
]
},
{
@@ -63,7 +63,9 @@
"cell_type": "code",
"execution_count": 4,
"id": "3ef64b27",
"metadata": {},
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
@@ -71,17 +73,17 @@
"text": [
"\n",
"\n",
"\u001B[1m> Entering new PALChain chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3mdef solution():\n",
"\u001b[1m> Entering new PALChain chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mdef solution():\n",
" \"\"\"Jan has three times the number of pets as Marcia. Marcia has two more pets than Cindy. If Cindy has four pets, how many total pets do the three have?\"\"\"\n",
" cindy_pets = 4\n",
" marcia_pets = cindy_pets + 2\n",
" jan_pets = marcia_pets * 3\n",
" total_pets = cindy_pets + marcia_pets + jan_pets\n",
" result = total_pets\n",
" return result\u001B[0m\n",
" return result\u001b[0m\n",
"\n",
"\u001B[1m> Finished chain.\u001B[0m\n"
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
@@ -139,8 +141,8 @@
"text": [
"\n",
"\n",
"\u001B[1m> Entering new PALChain chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3m# Put objects into a list to record ordering\n",
"\u001b[1m> Entering new PALChain chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m# Put objects into a list to record ordering\n",
"objects = []\n",
"objects += [('booklet', 'blue')] * 2\n",
"objects += [('booklet', 'purple')] * 2\n",
@@ -151,9 +153,9 @@
"\n",
"# Count number of purple objects\n",
"num_purple = len([object for object in objects if object[1] == 'purple'])\n",
"answer = num_purple\u001B[0m\n",
"answer = num_purple\u001b[0m\n",
"\n",
"\u001B[1m> Finished PALChain chain.\u001B[0m\n"
"\u001b[1m> Finished PALChain chain.\u001b[0m\n"
]
},
{
@@ -212,8 +214,8 @@
"text": [
"\n",
"\n",
"\u001B[1m> Entering new PALChain chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3m# Put objects into a list to record ordering\n",
"\u001b[1m> Entering new PALChain chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m# Put objects into a list to record ordering\n",
"objects = []\n",
"objects += [('booklet', 'blue')] * 2\n",
"objects += [('booklet', 'purple')] * 2\n",
@@ -224,9 +226,9 @@
"\n",
"# Count number of purple objects\n",
"num_purple = len([object for object in objects if object[1] == 'purple'])\n",
"answer = num_purple\u001B[0m\n",
"answer = num_purple\u001b[0m\n",
"\n",
"\u001B[1m> Finished chain.\u001B[0m\n"
"\u001b[1m> Finished chain.\u001b[0m\n"
]
}
],
@@ -280,7 +282,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.9.1"
}
},
"nbformat": 4,

View File

@@ -73,7 +73,7 @@
"metadata": {},
"outputs": [],
"source": [
"db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)"
"db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)"
]
},
{
@@ -175,7 +175,7 @@
"metadata": {},
"outputs": [],
"source": [
"db_chain = SQLDatabaseChain(llm=llm, database=db, prompt=PROMPT, verbose=True)"
"db_chain = SQLDatabaseChain.from_llm(llm, db, prompt=PROMPT, verbose=True)"
]
},
{
@@ -230,7 +230,7 @@
"metadata": {},
"outputs": [],
"source": [
"db_chain = SQLDatabaseChain(llm=llm, database=db, prompt=PROMPT, verbose=True, return_intermediate_steps=True)"
"db_chain = SQLDatabaseChain.from_llm(llm, db, prompt=PROMPT, verbose=True, return_intermediate_steps=True)"
]
},
{
@@ -285,7 +285,7 @@
"metadata": {},
"outputs": [],
"source": [
"db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True, top_k=3)"
"db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True, top_k=3)"
]
},
{
@@ -407,7 +407,7 @@
"metadata": {},
"outputs": [],
"source": [
"db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)"
"db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)"
]
},
{
@@ -569,7 +569,7 @@
}
],
"source": [
"db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)\n",
"db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)\n",
"db_chain.run(\"What are some example tracks by Bach?\")"
]
},
@@ -681,7 +681,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.10"
"version": "3.9.1"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,199 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "593f7553-7038-498e-96d4-8255e5ce34f0",
"metadata": {},
"source": [
"# Creating a custom Chain\n",
"\n",
"To implement your own custom chain you can subclass `Chain` and implement the following methods:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "c19c736e-ca74-4726-bb77-0a849bcc2960",
"metadata": {
"tags": [],
"vscode": {
"languageId": "python"
}
},
"outputs": [],
"source": [
"from __future__ import annotations\n",
"\n",
"from typing import Any, Dict, List, Optional\n",
"\n",
"from pydantic import Extra\n",
"\n",
"from langchain.base_language import BaseLanguageModel\n",
"from langchain.callbacks.manager import (\n",
" AsyncCallbackManagerForChainRun,\n",
" CallbackManagerForChainRun,\n",
")\n",
"from langchain.chains.base import Chain\n",
"from langchain.prompts.base import BasePromptTemplate\n",
"\n",
"\n",
"class MyCustomChain(Chain):\n",
" \"\"\"\n",
" An example of a custom chain.\n",
" \"\"\"\n",
"\n",
" prompt: BasePromptTemplate\n",
" \"\"\"Prompt object to use.\"\"\"\n",
" llm: BaseLanguageModel\n",
" output_key: str = \"text\" #: :meta private:\n",
"\n",
" class Config:\n",
" \"\"\"Configuration for this pydantic object.\"\"\"\n",
"\n",
" extra = Extra.forbid\n",
" arbitrary_types_allowed = True\n",
"\n",
" @property\n",
" def input_keys(self) -> List[str]:\n",
" \"\"\"Will be whatever keys the prompt expects.\n",
"\n",
" :meta private:\n",
" \"\"\"\n",
" return self.prompt.input_variables\n",
"\n",
" @property\n",
" def output_keys(self) -> List[str]:\n",
" \"\"\"Will always return text key.\n",
"\n",
" :meta private:\n",
" \"\"\"\n",
" return [self.output_key]\n",
"\n",
" def _call(\n",
" self,\n",
" inputs: Dict[str, Any],\n",
" run_manager: Optional[CallbackManagerForChainRun] = None,\n",
" ) -> Dict[str, str]:\n",
" # Your custom chain logic goes here\n",
" # This is just an example that mimics LLMChain\n",
" prompt_value = self.prompt.format_prompt(**inputs)\n",
" \n",
" # Whenever you call a language model, or another chain, you should pass\n",
" # a callback manager to it. This allows the inner run to be tracked by\n",
" # any callbacks that are registered on the outer run.\n",
" # You can always obtain a callback manager for this by calling\n",
" # `run_manager.get_child()` as shown below.\n",
" response = self.llm.generate_prompt(\n",
" [prompt_value],\n",
" callbacks=run_manager.get_child() if run_manager else None\n",
" )\n",
"\n",
" # If you want to log something about this run, you can do so by calling\n",
" # methods on the `run_manager`, as shown below. This will trigger any\n",
" # callbacks that are registered for that event.\n",
" if run_manager:\n",
" run_manager.on_text(\"Log something about this run\")\n",
" \n",
" return {self.output_key: response.generations[0][0].text}\n",
"\n",
" async def _acall(\n",
" self,\n",
" inputs: Dict[str, Any],\n",
" run_manager: Optional[AsyncCallbackManagerForChainRun] = None,\n",
" ) -> Dict[str, str]:\n",
" # Your custom chain logic goes here\n",
" # This is just an example that mimics LLMChain\n",
" prompt_value = self.prompt.format_prompt(**inputs)\n",
" \n",
" # Whenever you call a language model, or another chain, you should pass\n",
" # a callback manager to it. This allows the inner run to be tracked by\n",
" # any callbacks that are registered on the outer run.\n",
" # You can always obtain a callback manager for this by calling\n",
" # `run_manager.get_child()` as shown below.\n",
" response = await self.llm.agenerate_prompt(\n",
" [prompt_value],\n",
" callbacks=run_manager.get_child() if run_manager else None\n",
" )\n",
"\n",
" # If you want to log something about this run, you can do so by calling\n",
" # methods on the `run_manager`, as shown below. This will trigger any\n",
" # callbacks that are registered for that event.\n",
" if run_manager:\n",
" await run_manager.on_text(\"Log something about this run\")\n",
" \n",
" return {self.output_key: response.generations[0][0].text}\n",
"\n",
" @property\n",
" def _chain_type(self) -> str:\n",
" return \"my_custom_chain\"\n"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "18361f89",
"metadata": {
"vscode": {
"languageId": "python"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new MyCustomChain chain...\u001b[0m\n",
"Log something about this run\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'Why did the callback function feel lonely? Because it was always waiting for someone to call it back!'"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.callbacks.stdout import StdOutCallbackHandler\n",
"from langchain.chat_models.openai import ChatOpenAI\n",
"from langchain.prompts.prompt import PromptTemplate\n",
"\n",
"\n",
"chain = MyCustomChain(\n",
" prompt=PromptTemplate.from_template('tell us a joke about {topic}'),\n",
" llm=ChatOpenAI()\n",
")\n",
"\n",
"chain.run({'topic': 'callbacks'}, callbacks=[StdOutCallbackHandler()])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -2,59 +2,90 @@
"cells": [
{
"cell_type": "markdown",
"id": "d8a5c5d4",
"id": "da7d0df7-f07c-462f-bd46-d0426f11f311",
"metadata": {},
"source": [
"# LLM Chain\n",
"\n",
"This notebook showcases a simple LLM chain."
"## LLM Chain"
]
},
{
"cell_type": "markdown",
"id": "3a55e9a1-becf-4357-889e-f365d23362ff",
"metadata": {},
"source": [
"`LLMChain` is perhaps one of the most popular ways of querying an LLM object. It formats the prompt template using the input key values provided (and also memory key values, if available), passes the formatted string to LLM and returns the LLM output. Below we show additional functionalities of `LLMChain` class."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "835e6978",
"metadata": {},
"outputs": [],
"id": "0e720e34-a0f0-4f1a-9732-43bc1460053a",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"{'product': 'colorful socks', 'text': '\\n\\nSocktastic!'}"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain import PromptTemplate, OpenAI, LLMChain"
"from langchain import PromptTemplate, OpenAI, LLMChain\n",
"\n",
"prompt_template = \"What is a good name for a company that makes {product}?\"\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"llm_chain = LLMChain(\n",
" llm=llm,\n",
" prompt=PromptTemplate.from_template(prompt_template)\n",
")\n",
"llm_chain(\"colorful socks\")"
]
},
{
"cell_type": "markdown",
"id": "06bcb078",
"id": "94304332-6398-4280-a61e-005ba29b5e1e",
"metadata": {},
"source": [
"## Single Input\n",
"\n",
"First, lets go over an example using a single input"
"## Additional ways of running LLM Chain"
]
},
{
"cell_type": "markdown",
"id": "4e51981f-cde9-4c05-99e1-446c27994e99",
"metadata": {},
"source": [
"Aside from `__call__` and `run` methods shared by all `Chain` object (see [Getting Started](../getting_started.ipynb) to learn more), `LLMChain` offers a few more ways of calling the chain logic:"
]
},
{
"cell_type": "markdown",
"id": "c08d2356-412d-4327-b8a0-233dcc443e30",
"metadata": {},
"source": [
"- `apply` allows you run the chain against a list of inputs:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "51a54c4d",
"metadata": {},
"id": "cf519eb6-2358-4db7-a28a-27433435181e",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001B[1m> Entering new LLMChain chain...\u001B[0m\n",
"Prompt after formatting:\n",
"\u001B[32;1m\u001B[1;3mQuestion: What NFL team won the Super Bowl in the year Justin Beiber was born?\n",
"\n",
"Answer: Let's think step by step.\u001B[0m\n",
"\n",
"\u001B[1m> Finished LLMChain chain.\u001B[0m\n"
]
},
{
"data": {
"text/plain": [
"' Justin Bieber was born in 1994, so the NFL team that won the Super Bowl in 1994 was the Dallas Cowboys.'"
"[{'text': '\\n\\nSocktastic!'},\n",
" {'text': '\\n\\nTechCore Solutions.'},\n",
" {'text': '\\n\\nFootwear Factory.'}]"
]
},
"execution_count": 2,
@@ -63,49 +94,37 @@
}
],
"source": [
"template = \"\"\"Question: {question}\n",
"input_list = [\n",
" {\"product\": \"socks\"},\n",
" {\"product\": \"computer\"},\n",
" {\"product\": \"shoes\"}\n",
"]\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n",
"llm_chain = LLMChain(prompt=prompt, llm=OpenAI(temperature=0), verbose=True)\n",
"\n",
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"llm_chain.predict(question=question)"
"llm_chain.apply(input_list)"
]
},
{
"cell_type": "markdown",
"id": "79c3ec4d",
"metadata": {},
"id": "add442fb-baf6-40d9-ae8e-4ac1d8251ad0",
"metadata": {
"tags": []
},
"source": [
"## Multiple Inputs\n",
"Now lets go over an example using multiple inputs."
"- `generate` is similar to `apply`, except it return an `LLMResult` instead of string. `LLMResult` often contains useful generation such as token usages and finish reason."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "03dd6918",
"metadata": {},
"id": "85cbff83-a5cc-40b7-823c-47274ae4117d",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001B[1m> Entering new LLMChain chain...\u001B[0m\n",
"Prompt after formatting:\n",
"\u001B[32;1m\u001B[1;3mWrite a sad poem about ducks.\u001B[0m\n",
"\n",
"\u001B[1m> Finished LLMChain chain.\u001B[0m\n"
]
},
{
"data": {
"text/plain": [
"\"\\n\\nThe ducks swim in the pond,\\nTheir feathers so soft and warm,\\nBut they can't help but feel so forlorn.\\n\\nTheir quacks echo in the air,\\nBut no one is there to hear,\\nFor they have no one to share.\\n\\nThe ducks paddle around in circles,\\nTheir heads hung low in despair,\\nFor they have no one to care.\\n\\nThe ducks look up to the sky,\\nBut no one is there to see,\\nFor they have no one to be.\\n\\nThe ducks drift away in the night,\\nTheir hearts filled with sorrow and pain,\\nFor they have no one to gain.\""
"LLMResult(generations=[[Generation(text='\\n\\nSocktastic!', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\\n\\nTechCore Solutions.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\\n\\nFootwear Factory.', generation_info={'finish_reason': 'stop', 'logprobs': None})]], llm_output={'token_usage': {'prompt_tokens': 36, 'total_tokens': 55, 'completion_tokens': 19}, 'model_name': 'text-davinci-003'})"
]
},
"execution_count": 3,
@@ -114,46 +133,201 @@
}
],
"source": [
"template = \"\"\"Write a {adjective} poem about {subject}.\"\"\"\n",
"llm_chain.generate(input_list)"
]
},
{
"cell_type": "markdown",
"id": "a178173b-b183-432a-a517-250fe3191173",
"metadata": {},
"source": [
"- `predict` is similar to `run` method except in 2 ways:\n",
" - Input key is specified as keyword argument instead of a Python dict\n",
" - It supports multiple input keys."
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "787d9f55-b080-4123-bed2-0598a9cb0466",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'\\n\\nSocktastic!'"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Single input example\n",
"llm_chain.predict(product=\"colorful socks\")"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "092a769f-9661-42a0-9da1-19d09ccbc4a7",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'\\n\\nQ: What did the duck say when his friend died?\\nA: Quack, quack, goodbye.'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Multiple inputs example\n",
"\n",
"template = \"\"\"Tell me a {adjective} joke about {subject}.\"\"\"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"adjective\", \"subject\"])\n",
"llm_chain = LLMChain(prompt=prompt, llm=OpenAI(temperature=0), verbose=True)\n",
"llm_chain = LLMChain(prompt=prompt, llm=OpenAI(temperature=0))\n",
"\n",
"llm_chain.predict(adjective=\"sad\", subject=\"ducks\")"
]
},
{
"cell_type": "markdown",
"id": "672f59d4",
"id": "4b72ad22-0a5d-4ca7-9e3f-8c46dc17f722",
"metadata": {},
"source": [
"## Parsing the outputs"
]
},
{
"cell_type": "markdown",
"id": "85a77662-d028-4048-be4b-aa496e2dde22",
"metadata": {},
"source": [
"By default, `LLMChain` does not parse the output even if the underlying `prompt` object has an output parser. If you would like to apply that output parser on the LLM output, use `predict_and_parse` instead of `predict` and `apply_and_parse` instead of `apply`. "
]
},
{
"cell_type": "markdown",
"id": "b83977f1-847c-45de-b840-f1aff6725f83",
"metadata": {},
"source": [
"With `predict`:"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "5feb5177-c20b-4909-890b-a64d7e551f55",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'\\n\\nRed, orange, yellow, green, blue, indigo, violet'"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.output_parsers import CommaSeparatedListOutputParser\n",
"\n",
"output_parser = CommaSeparatedListOutputParser()\n",
"template = \"\"\"List all the colors in a rainbow\"\"\"\n",
"prompt = PromptTemplate(template=template, input_variables=[], output_parser=output_parser)\n",
"llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
"\n",
"llm_chain.predict()"
]
},
{
"cell_type": "markdown",
"id": "7b931615-804b-4f34-8086-7bbc2f96b3b2",
"metadata": {},
"source": [
"With `predict_and_parser`:"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "43a374cd-a179-43e5-9aa0-62f3cbdf510d",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"['Red', 'orange', 'yellow', 'green', 'blue', 'indigo', 'violet']"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm_chain.predict_and_parse()"
]
},
{
"cell_type": "markdown",
"id": "8176f619-4e5c-4a02-91ba-e96ebe2aabda",
"metadata": {},
"source": [
"## Initialize from string"
]
},
{
"cell_type": "markdown",
"id": "9813ac87-e118-413b-b448-2fefdf2319b8",
"metadata": {},
"source": [
"## From string\n",
"You can also construct an LLMChain from a string template directly."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "f8bc262e",
"metadata": {},
"execution_count": 16,
"id": "ca88ccb1-974e-41c1-81ce-753e3f1234fa",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"template = \"\"\"Write a {adjective} poem about {subject}.\"\"\"\n",
"llm_chain = LLMChain.from_string(llm=OpenAI(temperature=0), template=template)\n"
"template = \"\"\"Tell me a {adjective} joke about {subject}.\"\"\"\n",
"llm_chain = LLMChain.from_string(llm=llm, template=template)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "cb164a76",
"metadata": {},
"execution_count": 18,
"id": "4703d1bc-f4fc-44bc-9ea1-b4498835833d",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"\"\\n\\nThe ducks swim in the pond,\\nTheir feathers so soft and warm,\\nBut they can't help but feel so forlorn.\\n\\nTheir quacks echo in the air,\\nBut no one is there to hear,\\nFor they have no one to share.\\n\\nThe ducks paddle around in circles,\\nTheir heads hung low in despair,\\nFor they have no one to care.\\n\\nThe ducks look up to the sky,\\nBut no one is there to see,\\nFor they have no one to be.\\n\\nThe ducks drift away in the night,\\nTheir hearts filled with sorrow and pain,\\nFor they have no one to gain.\""
"'\\n\\nQ: What did the duck say when his friend died?\\nA: Quack, quack, goodbye.'"
]
},
"execution_count": 4,
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
@@ -161,14 +335,6 @@
"source": [
"llm_chain.predict(adjective=\"sad\", subject=\"ducks\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9f0adbc7",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -187,7 +353,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.10"
}
},
"nbformat": 4,

View File

@@ -22,10 +22,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Query an LLM with the `LLMChain`\n",
"## Quick start: Using `LLMChain`\n",
"\n",
"The `LLMChain` is a simple chain that takes in a prompt template, formats it with the user input and returns the response from an LLM.\n",
"\n",
"\n",
"To use the `LLMChain`, first create a prompt template."
]
},
@@ -67,7 +68,7 @@
"text": [
"\n",
"\n",
"Rainbow Socks Co.\n"
"SockSplash!\n"
]
}
],
@@ -88,7 +89,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 3,
"metadata": {
"tags": []
},
@@ -97,9 +98,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"Rainbow Threads\n"
"Rainbow Sox Co.\n"
]
}
],
@@ -125,7 +124,252 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"This is one of the simpler types of chains, but understanding how it works will set you up well for working with more complex chains."
"## Different ways of calling chains\n",
"\n",
"All classes inherited from `Chain` offer a few ways of running chain logic. The most direct one is by using `__call__`:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'adjective': 'corny',\n",
" 'text': 'Why did the tomato turn red? Because it saw the salad dressing!'}"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat = ChatOpenAI(temperature=0)\n",
"prompt_template = \"Tell me a {adjective} joke\"\n",
"llm_chain = LLMChain(\n",
" llm=chat,\n",
" prompt=PromptTemplate.from_template(prompt_template)\n",
")\n",
"\n",
"llm_chain(inputs={\"adjective\":\"corny\"})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"By default, `__call__` returns both the input and output key values. You can configure it to only return output key values by setting `return_only_outputs` to `True`."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'text': 'Why did the tomato turn red? Because it saw the salad dressing!'}"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm_chain(\"corny\", return_only_outputs=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If the `Chain` only outputs one output key (i.e. only has one element in its `output_keys`), you can use `run` method. Note that `run` outputs a string instead of a dictionary."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['text']"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# llm_chain only has one output key, so we can use run\n",
"llm_chain.output_keys"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Why did the tomato turn red? Because it saw the salad dressing!'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm_chain.run({\"adjective\":\"corny\"})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the case of one input key, you can input the string directly without specifying the input mapping."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'adjective': 'corny',\n",
" 'text': 'Why did the tomato turn red? Because it saw the salad dressing!'}"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# These two are equivalent\n",
"llm_chain.run({\"adjective\":\"corny\"})\n",
"llm_chain.run(\"corny\")\n",
"\n",
"# These two are also equivalent\n",
"llm_chain(\"corny\")\n",
"llm_chain({\"adjective\":\"corny\"})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Tips: You can easily integrate a `Chain` object as a `Tool` in your `Agent` via its `run` method. See an example [here](../agents/tools/custom_tools.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Add memory to chains\n",
"\n",
"`Chain` supports taking a `BaseMemory` object as its `memory` argument, allowing `Chain` object to persist data across multiple calls. In other words, it makes `Chain` a stateful object."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'The next four colors of a rainbow are green, blue, indigo, and violet.'"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.chains import ConversationChain\n",
"from langchain.memory import ConversationBufferMemory\n",
"\n",
"conversation = ConversationChain(\n",
" llm=chat,\n",
" memory=ConversationBufferMemory()\n",
")\n",
"\n",
"conversation.run(\"Answer briefly. What are the first 3 colors of a rainbow?\")\n",
"# -> The first three colors of a rainbow are red, orange, and yellow.\n",
"conversation.run(\"And the next 4?\")\n",
"# -> The next four colors of a rainbow are green, blue, indigo, and violet."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Essentially, `BaseMemory` defines an interface of how `langchain` stores memory. It allows reading of stored data through `load_memory_variables` method and storing new data through `save_context` method. You can learn more about it in [Memory](../memory/getting_started.ipynb) section."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Debug Chain\n",
"\n",
"It can be hard to debug `Chain` object solely from its output as most `Chain` objects involve a fair amount of input prompt preprocessing and LLM output post-processing. Setting `verbose` to `True` will print out some internal states of the `Chain` object while it is being ran."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new ConversationChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mThe following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.\n",
"\n",
"Current conversation:\n",
"\n",
"Human: What is ChatGPT?\n",
"AI:\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'ChatGPT is an AI language model developed by OpenAI. It is based on the GPT-3 architecture and is capable of generating human-like responses to text prompts. ChatGPT has been trained on a massive amount of text data and can understand and respond to a wide range of topics. It is often used for chatbots, virtual assistants, and other conversational AI applications.'"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"conversation = ConversationChain(\n",
" llm=chat,\n",
" memory=ConversationBufferMemory(),\n",
" verbose=True\n",
")\n",
"conversation.run(\"What is ChatGPT?\")"
]
},
{
@@ -143,7 +387,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
@@ -163,7 +407,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 12,
"metadata": {},
"outputs": [
{
@@ -173,17 +417,15 @@
"\n",
"\n",
"\u001b[1m> Entering new SimpleSequentialChain chain...\u001b[0m\n",
"\u001b[36;1m\u001b[1;3m\n",
"\n",
"Cheerful Toes.\u001b[0m\n",
"\u001b[36;1m\u001b[1;3mRainbow Socks Co.\u001b[0m\n",
"\u001b[33;1m\u001b[1;3m\n",
"\n",
"\"Spread smiles from your toes!\"\u001b[0m\n",
"\"Step into Color with Rainbow Socks!\"\u001b[0m\n",
"\n",
"\u001b[1m> Finished SimpleSequentialChain chain.\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"\"Spread smiles from your toes!\"\n"
"\"Step into Color with Rainbow Socks!\"\n"
]
}
],
@@ -214,7 +456,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
@@ -248,12 +490,13 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we can try running the chain that we called."
"Now, we can try running the chain that we called.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 14,
"metadata": {},
"outputs": [
{
@@ -263,9 +506,9 @@
"Concatenated output:\n",
"\n",
"\n",
"Rainbow Socks Co.\n",
"Socktastic Colors.\n",
"\n",
"\"Step Into Colorful Comfort!\"\n"
"\"Put Some Color in Your Step!\"\n"
]
}
],
@@ -311,7 +554,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.6"
},
"vscode": {
"interpreter": {

View File

@@ -12,7 +12,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 1,
"id": "70c4e529",
"metadata": {
"tags": []
@@ -36,7 +36,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 2,
"id": "01c46e92",
"metadata": {
"tags": []
@@ -58,7 +58,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 3,
"id": "433363a5",
"metadata": {
"tags": []
@@ -81,18 +81,17 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 4,
"id": "a8930cf7",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"name": "stderr",
"output_type": "stream",
"text": [
"Running Chroma using direct local API.\n",
"Using DuckDB in-memory for database. Data will be transient.\n"
"Using embedded DuckDB without persistence: data will be transient\n"
]
}
],
@@ -104,6 +103,25 @@
"vectorstore = Chroma.from_documents(documents, embeddings)"
]
},
{
"cell_type": "markdown",
"id": "898b574b",
"metadata": {},
"source": [
"We can now create a memory object, which is neccessary to track the inputs/outputs and hold a conversation."
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "af803fee",
"metadata": {},
"outputs": [],
"source": [
"from langchain.memory import ConversationBufferMemory\n",
"memory = ConversationBufferMemory(memory_key=\"chat_history\", return_messages=True)"
]
},
{
"cell_type": "markdown",
"id": "3c96b118",
@@ -114,12 +132,96 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 21,
"id": "7b4110f3",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), memory=memory)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "e8ce4fe9",
"metadata": {},
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"result = qa({\"question\": query})"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "4c79862b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\" The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He also said that she is a consensus builder and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans.\""
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result[\"answer\"]"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "c697d9d1",
"metadata": {},
"outputs": [],
"source": [
"query = \"Did he mention who she suceeded\"\n",
"result = qa({\"question\": query})"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "ba0678f3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' Ketanji Brown Jackson succeeded Justice Stephen Breyer on the United States Supreme Court.'"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result['answer']"
]
},
{
"cell_type": "markdown",
"id": "84426220",
"metadata": {},
"source": [
"## Pass in chat history\n",
"\n",
"In the above example, we used a Memory object to track chat history. We can also just pass it in explicitly. In order to do this, we need to initialize a chain without any memory object."
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "676b8a36",
"metadata": {},
"outputs": [],
"source": [
"qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever())"
]
@@ -134,7 +236,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 6,
"id": "7fe3e730",
"metadata": {
"tags": []
@@ -148,7 +250,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 7,
"id": "bfff9cc8",
"metadata": {
"tags": []
@@ -160,7 +262,7 @@
"\" The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He also said that she is a consensus builder and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans.\""
]
},
"execution_count": 9,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
@@ -179,7 +281,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 8,
"id": "00b4cf00",
"metadata": {
"tags": []
@@ -193,7 +295,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 9,
"id": "f01828d1",
"metadata": {
"tags": []
@@ -205,7 +307,7 @@
"' Ketanji Brown Jackson succeeded Justice Stephen Breyer on the United States Supreme Court.'"
]
},
"execution_count": 11,
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
@@ -487,7 +589,6 @@
"outputs": [],
"source": [
"from langchain.chains.llm import LLMChain\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
"from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT, QA_PROMPT\n",
"from langchain.chains.question_answering import load_qa_chain\n",
@@ -495,7 +596,7 @@
"# Construct a ConversationalRetrievalChain with a streaming llm for combine docs\n",
"# and a separate, non-streaming llm for question generation\n",
"llm = OpenAI(temperature=0)\n",
"streaming_llm = OpenAI(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)\n",
"streaming_llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)\n",
"\n",
"question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)\n",
"doc_chain = load_qa_chain(streaming_llm, chain_type=\"stuff\", prompt=QA_PROMPT)\n",
@@ -636,7 +737,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.9.1"
}
},
"nbformat": 4,

View File

@@ -7,7 +7,7 @@
"source": [
"# Question Answering with Sources\n",
"\n",
"This notebook walks through how to use LangChain for question answering with sources over a list of documents. It covers four different chain types: `stuff`, `map_reduce`, `refine`,`map-rerank`. For a more in depth explanation of what these chain types are, see [here](../combine_docs.md)."
"This notebook walks through how to use LangChain for question answering with sources over a list of documents. It covers four different chain types: `stuff`, `map_reduce`, `refine`,`map-rerank`. For a more in depth explanation of what these chain types are, see [here](https://docs.langchain.com/docs/components/chains/index_related_chains)."
]
},
{
@@ -267,7 +267,7 @@
"source": [
"**Intermediate Steps**\n",
"\n",
"We can also return the intermediate steps for `map_reduce` chains, should we want to inspect them. This is done with the `return_map_steps` variable."
"We can also return the intermediate steps for `map_reduce` chains, should we want to inspect them. This is done with the `return_intermediate_steps` variable."
]
},
{

View File

@@ -7,7 +7,7 @@
"source": [
"# Question Answering\n",
"\n",
"This notebook walks through how to use LangChain for question answering over a list of documents. It covers four different types of chains: `stuff`, `map_reduce`, `refine`, `map_rerank`. For a more in depth explanation of what these chain types are, see [here](../combine_docs.md)."
"This notebook walks through how to use LangChain for question answering over a list of documents. It covers four different types of chains: `stuff`, `map_reduce`, `refine`, `map_rerank`. For a more in depth explanation of what these chain types are, see [here](https://docs.langchain.com/docs/components/chains/index_related_chains)."
]
},
{

View File

@@ -0,0 +1,177 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "bda1f3f5",
"metadata": {},
"source": [
"# Arxiv\n",
"\n",
"[arXiv](https://arxiv.org/) is an open-access archive for 2 million scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics.\n",
"\n",
"This notebook shows how to load scientific articles from `Arxiv.org` into a document format that we can use downstream."
]
},
{
"cell_type": "markdown",
"id": "1b7a1eef-7bf7-4e7d-8bfc-c4e27c9488cb",
"metadata": {},
"source": [
"## Installation"
]
},
{
"cell_type": "markdown",
"id": "2abd5578-aa3d-46b9-99af-8b262f0b3df8",
"metadata": {},
"source": [
"First, you need to install `arxiv` python package."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b674aaea-ed3a-4541-8414-260a8f67f623",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install arxiv"
]
},
{
"cell_type": "markdown",
"id": "094b5f13-7e54-4354-9d83-26d6926ecaa0",
"metadata": {
"tags": []
},
"source": [
"Second, you need to install `PyMuPDF` python package which transform PDF files from the `arxiv.org` site into the text fromat."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7cd91121-2e96-43ba-af50-319853695f86",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install pymupdf"
]
},
{
"cell_type": "markdown",
"id": "95f05e1c-195e-4e2b-ae8e-8d6637f15be6",
"metadata": {},
"source": [
"## Examples"
]
},
{
"cell_type": "markdown",
"id": "e29b954c-1407-4797-ae21-6ba8937156be",
"metadata": {},
"source": [
"`ArxivLoader` has these arguments:\n",
"- `query`: free text which used to find documents in the Arxiv\n",
"- optional `load_max_docs`: default=100. Use it to limit number of downloaded documents. It takes time to download all 100 documents, so use a small number for experiments.\n",
"- optional `load_all_available_meta`: default=False. By defaul only the most important fields downloaded: `Published` (date when document was published/last updated), `Title`, `Authors`, `Summary`. If True, other fields also downloaded."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9bfd5e46",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders.base import Document\n",
"from langchain.document_loaders import ArxivLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "700e4ef2",
"metadata": {},
"outputs": [],
"source": [
"docs = ArxivLoader(query=\"1605.08386\", load_max_docs=2).load()\n",
"len(docs)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "8977bac0-0042-4f23-9754-247dbd32439b",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"{'Published': '2016-05-26',\n",
" 'Title': 'Heat-bath random walks with Markov bases',\n",
" 'Authors': 'Caprice Stanley, Tobias Windisch',\n",
" 'Summary': 'Graphs on lattice points are studied whose edges come from a finite set of\\nallowed moves of arbitrary length. We show that the diameter of these graphs on\\nfibers of a fixed integer matrix can be bounded from above by a constant. We\\nthen study the mixing behaviour of heat-bath random walks on these graphs. We\\nalso state explicit conditions on the set of moves so that the heat-bath random\\nwalk, a generalization of the Glauber dynamics, is an expander in fixed\\ndimension.'}"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"doc[0].metadata # meta-information of the Document"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "46969806-45a9-4c4d-a61b-cfb9658fc9de",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'arXiv:1605.08386v1 [math.CO] 26 May 2016\\nHEAT-BATH RANDOM WALKS WITH MARKOV BASES\\nCAPRICE STANLEY AND TOBIAS WINDISCH\\nAbstract. Graphs on lattice points are studied whose edges come from a finite set of\\nallowed moves of arbitrary length. We show that the diameter of these graphs on fibers of a\\nfixed integer matrix can be bounded from above by a constant. We then study the mixing\\nbehaviour of heat-b'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"doc[0].page_content[:400] # all pages of the Document content\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,151 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "vm8vn9t8DvC_"
},
"source": [
"# Blockchain"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "5WjXERXzFEhg"
},
"source": [
"## Overview"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "juAmbgoWD17u"
},
"source": [
"The intention of this notebook is to provide a means of testing functionality in the Langchain Document Loader for Blockchain.\n",
"\n",
"Initially this Loader supports:\n",
"\n",
"* Loading NFTs as Documents from NFT Smart Contracts (ERC721 and ERC1155)\n",
"* Ethereum Maninnet, Ethereum Testnet, Polgyon Mainnet, Polygon Testnet (default is eth-mainnet)\n",
"* Alchemy's getNFTsForCollection API\n",
"\n",
"It can be extended if the community finds value in this loader. Specifically:\n",
"\n",
"* Additional APIs can be added (e.g. Tranction-related APIs)\n",
"\n",
"This Document Loader Requires:\n",
"\n",
"* A free [Alchemy API Key](https://www.alchemy.com/)\n",
"\n",
"The output takes the following format:\n",
"\n",
"- pageContent= Individual NFT\n",
"- metadata={'source': '0x1a92f7381b9f03921564a437210bb9396471050c', 'blockchain': 'eth-mainnet', 'tokenId': '0x15'})"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load NFTs into Document Loader"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"alchemyApiKey = \"get from https://www.alchemy.com/ and set in environment variable ALCHEMY_API_KEY\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Option 1: Ethereum Mainnet (default BlockchainType)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "J3LWHARC-Kn0"
},
"outputs": [],
"source": [
"contractAddress = \"0xbc4ca0eda7647a8ab7c2061c2e118a18a936f13d\" # Bored Ape Yacht Club contract address\n",
"\n",
"blockchainType = BlockchainType.ETH_MAINNET #default value, optional parameter\n",
"\n",
"blockchainLoader = BlockchainDocumentLoader(contract_address=contractAddress,\n",
" api_key=alchemyApiKey)\n",
"\n",
"nfts = blockchainLoader.load()\n",
"\n",
"nfts[:2]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Option 2: Polygon Mainnet"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"contractAddress = \"0x448676ffCd0aDf2D85C1f0565e8dde6924A9A7D9\" # Polygon Mainnet contract address\n",
"\n",
"blockchainType = BlockchainType.POLYGON_MAINNET \n",
"\n",
"blockchainLoader = BlockchainDocumentLoader(contract_address=contractAddress, \n",
" blockchainType=blockchainType, \n",
" api_key=alchemyApiKey)\n",
"\n",
"nfts = blockchainLoader.load()\n",
"\n",
"nfts[:2]"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [
"5WjXERXzFEhg"
],
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
}
},
"nbformat": 4,
"nbformat_minor": 0
}

View File

@@ -68,6 +68,51 @@
"len(docs)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "e633d62f",
"metadata": {},
"source": [
"## Show a progress bar"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "43911860",
"metadata": {},
"source": [
"By default a progress bar will not be shown. To show a progress bar, install the `tqdm` library (e.g. `pip install tqdm`), and set the `show_progress` parameter to `True`."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "bb93daac",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: tqdm in /Users/jon/.pyenv/versions/3.9.16/envs/microbiome-app/lib/python3.9/site-packages (4.65.0)\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"0it [00:00, ?it/s]\n"
]
}
],
"source": [
"%pip install tqdm\n",
"loader = DirectoryLoader('../', glob=\"**/*.md\", show_progress=True)\n",
"docs = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "c5652850",

View File

@@ -16,7 +16,7 @@
"1. `pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib`\n",
"\n",
"## 🧑 Instructions for ingesting your Google Docs data\n",
"By default, the `GoogleDriveLoader` expects the `credentials.json` file to be `~/.credentials/credentials.json`, but this is configurable using the `credentials_file` keyword argument. Same thing with `token.json`. Note that `token.json` will be created automatically the first time you use the loader.\n",
"By default, the `GoogleDriveLoader` expects the `credentials.json` file to be `~/.credentials/credentials.json`, but this is configurable using the `credentials_path` keyword argument. Same thing with `token.json` - `token_path`. Note that `token.json` will be created automatically the first time you use the loader.\n",
"\n",
"`GoogleDriveLoader` can load from a list of Google Docs document ids or a folder id. You can obtain your folder and document id from the URL:\n",
"* Folder: https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5 -> folder id is `\"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5\"`\n",

View File

@@ -155,6 +155,46 @@
" print(str(doc.metadata[\"page\"]) + \":\", doc.page_content)"
]
},
{
"cell_type": "markdown",
"id": "6d5c9879",
"metadata": {},
"source": [
"## Using MathPix\n",
"\n",
"Inspired by Daniel Gross's [https://gist.github.com/danielgross/3ab4104e14faccc12b49200843adab21](https://gist.github.com/danielgross/3ab4104e14faccc12b49200843adab21)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "950eb58f",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import MathpixPDFLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cb6fd473",
"metadata": {},
"outputs": [],
"source": [
"loader = MathpixPDFLoader(\"example_data/layout-parser-paper.pdf\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a1d41e1a",
"metadata": {},
"outputs": [],
"source": [
"data = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "09d64998",
@@ -566,10 +606,50 @@
"Additionally, you can pass along any of the options from the [PyMuPDF documentation](https://pymupdf.readthedocs.io/en/latest/app1.html#plain-text/) as keyword arguments in the `load` call, and it will be pass along to the `get_text()` call."
]
},
{
"cell_type": "markdown",
"id": "15b57eab",
"metadata": {},
"source": [
"## PyPDF Directory\n",
"\n",
"Load PDFs from directory"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "b9e521d9",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import PyPDFDirectoryLoader"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "4b20590f",
"metadata": {},
"outputs": [],
"source": [
"loader = PyPDFDirectoryLoader(\"example_data/\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "e5ead943",
"metadata": {},
"outputs": [],
"source": [
"docs = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1bf73c97",
"id": "ea25b03c",
"metadata": {},
"outputs": [],
"source": []
@@ -577,9 +657,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "langchain_dev",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "langchain_dev"
"name": "python3"
},
"language_info": {
"codemirror_mode": {
@@ -591,7 +671,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
"version": "3.9.1"
}
},
"nbformat": 4,

View File

@@ -40,7 +40,7 @@
"metadata": {},
"outputs": [],
"source": [
"loader = ReadTheDocsLoader(\"rtdocs\")"
"loader = ReadTheDocsLoader(\"rtdocs\", features='html.parser')"
]
},
{

View File

@@ -0,0 +1,112 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Reddit\n",
"\n",
"This loader fetches the text from the Posts of Subreddits or Reddit users, using the `praw` Python package.\n",
"\n",
"Make a Reddit Application from https://www.reddit.com/prefs/apps/ and initialize the loader with with your Reddit API credentials."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import RedditPostsLoader"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# !pip install praw"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# load using 'subreddit' mode\n",
"loader = RedditPostsLoader(\n",
" client_id=\"YOUR CLIENT ID\",\n",
" client_secret=\"YOUR CLIENT SECRET\",\n",
" user_agent=\"extractor by u/Master_Ocelot8179\",\n",
" categories=['new', 'hot'], # List of categories to load posts from\n",
" mode = 'subreddit',\n",
" search_queries=['investing', 'wallstreetbets'], # List of subreddits to load posts from\n",
" number_posts=20 # Default value is 10\n",
" )\n",
"\n",
"# # or load using 'username' mode\n",
"# loader = RedditPostsLoader(\n",
"# client_id=\"YOUR CLIENT ID\",\n",
"# client_secret=\"YOUR CLIENT SECRET\",\n",
"# user_agent=\"extractor by u/Master_Ocelot8179\",\n",
"# categories=['new', 'hot'], \n",
"# mode = 'username',\n",
"# search_queries=['ga3far', 'Master_Ocelot8179'], # List of usernames to load posts from\n",
"# number_posts=20\n",
"# )\n",
"\n",
"# Note: Categories can be only of following value - \"controversial\" \"hot\" \"new\" \"rising\" \"top\""
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Hello, I am not looking for investment advice. I will apply my own due diligence. However, I am interested if anyone knows as a UK resident how fees and exchange rate differences would impact performance?\\n\\nI am planning to create a pie of index funds (perhaps UK, US, europe) or find a fund with a good track record of long term growth at low rates. \\n\\nDoes anyone have any ideas?', metadata={'post_subreddit': 'r/investing', 'post_category': 'new', 'post_title': 'Long term retirement funds fees/exchange rate query', 'post_score': 1, 'post_id': '130pa6m', 'post_url': 'https://www.reddit.com/r/investing/comments/130pa6m/long_term_retirement_funds_feesexchange_rate_query/', 'post_author': Redditor(name='Badmanshiz')}),\n",
" Document(page_content='I much prefer the Roth IRA and would rather rollover my 401k to that every year instead of keeping it in the limited 401k options. But if I rollover, will I be able to continue contributing to my 401k? Or will that close my account? I realize that there are tax implications of doing this but I still think it is the better option.', metadata={'post_subreddit': 'r/investing', 'post_category': 'new', 'post_title': 'Is it possible to rollover my 401k every year?', 'post_score': 3, 'post_id': '130ja0h', 'post_url': 'https://www.reddit.com/r/investing/comments/130ja0h/is_it_possible_to_rollover_my_401k_every_year/', 'post_author': Redditor(name='AnCap_Catholic')}),\n",
" Document(page_content='Have a general question? Want to offer some commentary on markets? Maybe you would just like to throw out a neat fact that doesn\\'t warrant a self post? Feel free to post here! \\n\\nIf your question is \"I have $10,000, what do I do?\" or other \"advice for my personal situation\" questions, you should include relevant information, such as the following:\\n\\n* How old are you? What country do you live in? \\n* Are you employed/making income? How much? \\n* What are your objectives with this money? (Buy a house? Retirement savings?) \\n* What is your time horizon? Do you need this money next month? Next 20yrs? \\n* What is your risk tolerance? (Do you mind risking it at blackjack or do you need to know its 100% safe?) \\n* What are you current holdings? (Do you already have exposure to specific funds and sectors? Any other assets?) \\n* Any big debts (include interest rate) or expenses? \\n* And any other relevant financial information will be useful to give you a proper answer. \\n\\nPlease consider consulting our FAQ first - https://www.reddit.com/r/investing/wiki/faq\\nAnd our [side bar](https://www.reddit.com/r/investing/about/sidebar) also has useful resources. \\n\\nIf you are new to investing - please refer to Wiki - [Getting Started](https://www.reddit.com/r/investing/wiki/index/gettingstarted/)\\n\\nThe reading list in the wiki has a list of books ranging from light reading to advanced topics depending on your knowledge level. Link here - [Reading List](https://www.reddit.com/r/investing/wiki/readinglist)\\n\\nCheck the resources in the sidebar.\\n\\nBe aware that these answers are just opinions of Redditors and should be used as a starting point for your research. You should strongly consider seeing a registered investment adviser if you need professional support before making any financial decisions!', metadata={'post_subreddit': 'r/investing', 'post_category': 'new', 'post_title': 'Daily General Discussion and Advice Thread - April 27, 2023', 'post_score': 5, 'post_id': '130eszz', 'post_url': 'https://www.reddit.com/r/investing/comments/130eszz/daily_general_discussion_and_advice_thread_april/', 'post_author': Redditor(name='AutoModerator')}),\n",
" Document(page_content=\"Based on recent news about salt battery advancements and the overall issues of lithium, I was wondering what would be feasible ways to invest into non-lithium based battery technologies? CATL is of course a choice, but the selection of brokers I currently have in my disposal don't provide HK stocks at all.\", metadata={'post_subreddit': 'r/investing', 'post_category': 'new', 'post_title': 'Investing in non-lithium battery technologies?', 'post_score': 2, 'post_id': '130d6qp', 'post_url': 'https://www.reddit.com/r/investing/comments/130d6qp/investing_in_nonlithium_battery_technologies/', 'post_author': Redditor(name='-manabreak')}),\n",
" Document(page_content='Hello everyone,\\n\\nI would really like to invest in an ETF that follows spy or another big index, as I think this form of investment suits me best. \\n\\nThe problem is, that I live in Denmark where ETFs and funds are taxed annually on unrealised gains at quite a steep rate. This means that an ETF growing say 10% per year will only grow about 6%, which really ruins the long term effects of compounding interest.\\n\\nHowever stocks are only taxed on realised gains which is why they look more interesting to hold long term.\\n\\nI do not like the lack of diversification this brings, as I am looking to spend tonnes of time picking the right long term stocks.\\n\\nIt would be ideal to find a few stocks that over the long term somewhat follows the indexes. Does anyone have suggestions?\\n\\nI have looked at Nasdaq Inc. which quite closely follows Nasdaq 100. \\n\\nI really appreciate any help.', metadata={'post_subreddit': 'r/investing', 'post_category': 'new', 'post_title': 'Stocks that track an index', 'post_score': 7, 'post_id': '130auvj', 'post_url': 'https://www.reddit.com/r/investing/comments/130auvj/stocks_that_track_an_index/', 'post_author': Redditor(name='LeAlbertP')})]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"documents = loader.load()\n",
"documents[:5]"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "env1",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -0,0 +1,92 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Stripe\n",
"\n",
"This notebook covers how to load data from the Stripe REST API into a format that can be ingested into LangChain, along with example usage for vectorization."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"\n",
"from langchain.document_loaders import StripeLoader\n",
"from langchain.indexes import VectorstoreIndexCreator"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Stripe API requires an access token, which can be found inside of the Stripe dashboard.\n",
"\n",
"This document loader also requires a `resource` option which defines what data you want to load.\n",
"\n",
"Following resources are available:\n",
"\n",
"`balance_transations` [Documentation](https://stripe.com/docs/api/balance_transactions/list)\n",
"\n",
"`charges` [Documentation](https://stripe.com/docs/api/charges/list)\n",
"\n",
"`customers` [Documentation](https://stripe.com/docs/api/customers/list)\n",
"\n",
"`events` [Documentation](https://stripe.com/docs/api/events/list)\n",
"\n",
"`refunds` [Documentation](https://stripe.com/docs/api/refunds/list)\n",
"\n",
"`disputes` [Documentation](https://stripe.com/docs/api/disputes/list)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"stripe_loader = StripeLoader(os.environ[\"STRIPE_ACCESS_TOKEN\"], \"charges\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create a vectorstore retriver from the loader\n",
"# see https://python.langchain.com/en/latest/modules/indexes/getting_started.html for more details\n",
"\n",
"index = VectorstoreIndexCreator().from_loaders([stripe_loader])\n",
"stripe_doc_retriever = index.vectorstore.as_retriever()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -10,9 +10,78 @@
"This covers how to load Word documents into a document format that we can use downstream."
]
},
{
"cell_type": "markdown",
"id": "9438686b",
"metadata": {},
"source": [
"## Using Docx2txt\n",
"\n",
"Load .docx using `Docx2txt` into a document."
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 3,
"id": "7b80ea89",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import Docx2txtLoader"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "99a12031",
"metadata": {},
"outputs": [],
"source": [
"loader = Docx2txtLoader(\"example_data/fake.docx\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "b92f68b0",
"metadata": {},
"outputs": [],
"source": [
"data = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "d83dd755",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Lorem ipsum dolor sit amet.', metadata={'source': 'example_data/fake.docx'})]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data"
]
},
{
"cell_type": "markdown",
"id": "8d40727d",
"metadata": {},
"source": [
"## Using Unstructured"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "721c48aa",
"metadata": {},
"outputs": [],
@@ -129,7 +198,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
"version": "3.9.1"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,330 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "13afcae7",
"metadata": {},
"source": [
"# Self-querying retriever\n",
"In the notebook we'll demo the `SelfQueryRetriever`, which, as the name suggests, has the ability to query itself. Specifically, given any natural language query, the retriever uses a query-constructing LLM chain to write a structured query and then applies that structured query to it's underlying VectorStore. This allows the retriever to not only use the user-input query for semantic similarity comparison with the contents of stored documented, but to also extract filters from the user query on the metadata of stored documents and to execute those filter."
]
},
{
"cell_type": "markdown",
"id": "68e75fb9",
"metadata": {},
"source": [
"## Creating a Pinecone index\n",
"First we'll want to create a Pinecone VectorStore and seed it with some data. We've created a small demo set of documents that contain summaries of movies.\n",
"\n",
"NOTE: The self-query retriever currently only has built-in support for Pinecone VectorStore.\n",
"\n",
"NOTE: The self-query retriever requires you to have `lark` installed (`pip install lark`)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "63a8af5b",
"metadata": {},
"outputs": [],
"source": [
"# !pip install lark"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "3eb9c9a4",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/harrisonchase/.pyenv/versions/3.9.1/envs/langchain/lib/python3.9/site-packages/pinecone/index.py:4: TqdmExperimentalWarning: Using `tqdm.autonotebook.tqdm` in notebook mode. Use `tqdm.tqdm` instead to force console mode (e.g. in jupyter console)\n",
" from tqdm.autonotebook import tqdm\n"
]
}
],
"source": [
"import os\n",
"\n",
"import pinecone\n",
"\n",
"\n",
"pinecone.init(api_key=os.environ[\"PINECONE_API_KEY\"], environment=os.environ[\"PINECONE_ENV\"])"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "cb4a5787",
"metadata": {},
"outputs": [],
"source": [
"from langchain.schema import Document\n",
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.vectorstores import Pinecone\n",
"\n",
"embeddings = OpenAIEmbeddings()\n",
"# create new index\n",
"pinecone.create_index(\"langchain-self-retriever-demo\", dimension=1536)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "bcbe04d9",
"metadata": {},
"outputs": [],
"source": [
"docs = [\n",
" Document(page_content=\"A bunch of scientists bring back dinosaurs and mayhem breaks loose\", metadata={\"year\": 1993, \"rating\": 7.7, \"genre\": [\"action\", \"science fiction\"]}),\n",
" Document(page_content=\"Leo DiCaprio gets lost in a dream within a dream within a dream within a ...\", metadata={\"year\": 2010, \"director\": \"Christopher Nolan\", \"rating\": 8.2}),\n",
" Document(page_content=\"A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea\", metadata={\"year\": 2006, \"director\": \"Satoshi Kon\", \"rating\": 8.6}),\n",
" Document(page_content=\"A bunch of normal-sized women are supremely wholesome and some men pine after them\", metadata={\"year\": 2019, \"director\": \"Greta Gerwig\", \"rating\": 8.3}),\n",
" Document(page_content=\"Toys come alive and have a blast doing so\", metadata={\"year\": 1995, \"genre\": \"animated\"}),\n",
" Document(page_content=\"Three men walk into the Zone, three men walk out of the Zone\", metadata={\"year\": 1979, \"rating\": 9.9, \"director\": \"Andrei Tarkovsky\", \"genre\": [\"science fiction\", \"thriller\"], \"rating\": 9.9})\n",
"]\n",
"vectorstore = Pinecone.from_documents(\n",
" docs, embeddings, index_name=\"langchain-self-retriever-demo\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "5ecaab6d",
"metadata": {},
"source": [
"# Creating our self-querying retriever\n",
"Now we can instantiate our retriever. To do this we'll need to provide some information upfront about the metadata fields that our documents support and a short description of the document contents."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "86e34dbf",
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import OpenAI\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"\n",
"metadata_field_info=[\n",
" AttributeInfo(\n",
" name=\"genre\",\n",
" description=\"The genre of the movie\", \n",
" type=\"string or list[string]\", \n",
" ),\n",
" AttributeInfo(\n",
" name=\"year\",\n",
" description=\"The year the movie was released\", \n",
" type=\"integer\", \n",
" ),\n",
" AttributeInfo(\n",
" name=\"director\",\n",
" description=\"The name of the movie director\", \n",
" type=\"string\", \n",
" ),\n",
" AttributeInfo(\n",
" name=\"rating\",\n",
" description=\"A 1-10 rating for the movie\",\n",
" type=\"float\"\n",
" ),\n",
"]\n",
"document_content_description = \"Brief summary of a movie\"\n",
"llm = OpenAI(temperature=0)\n",
"retriever = SelfQueryRetriever.from_llm(llm, vectorstore, document_content_description, metadata_field_info, verbose=True)"
]
},
{
"cell_type": "markdown",
"id": "ea9df8d4",
"metadata": {},
"source": [
"# Testing it out\n",
"And now we can try actually using our retriever!"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "38a126e9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"query='dinosaur' filter=None\n"
]
},
{
"data": {
"text/plain": [
"[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'genre': ['action', 'science fiction'], 'rating': 7.7, 'year': 1993.0}),\n",
" Document(page_content='Toys come alive and have a blast doing so', metadata={'genre': 'animated', 'year': 1995.0}),\n",
" Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'director': 'Satoshi Kon', 'rating': 8.6, 'year': 2006.0}),\n",
" Document(page_content='Leo DiCaprio gets lost in a dream within a dream within a dream within a ...', metadata={'director': 'Christopher Nolan', 'rating': 8.2, 'year': 2010.0})]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# This example only specifies a relevant query\n",
"retriever.get_relevant_documents(\"What are some movies about dinosaurs\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "fc3f1e6e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"query=' ' filter=Comparison(comparator=<Comparator.GT: 'gt'>, attribute='rating', value=8.5)\n"
]
},
{
"data": {
"text/plain": [
"[Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'director': 'Satoshi Kon', 'rating': 8.6, 'year': 2006.0}),\n",
" Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'director': 'Andrei Tarkovsky', 'genre': ['science fiction', 'thriller'], 'rating': 9.9, 'year': 1979.0})]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# This example only specifies a filter\n",
"retriever.get_relevant_documents(\"I want to watch a movie rated higher than 8.5\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "b19d4da0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"query='women' filter=Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='director', value='Greta Gerwig')\n"
]
},
{
"data": {
"text/plain": [
"[Document(page_content='A bunch of normal-sized women are supremely wholesome and some men pine after them', metadata={'director': 'Greta Gerwig', 'rating': 8.3, 'year': 2019.0})]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# This example specifies a query and a filter\n",
"retriever.get_relevant_documents(\"Has Greta Gerwig directed any movies about women\")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "f900e40e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"query=' ' filter=Operation(operator=<Operator.AND: 'and'>, arguments=[Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='genre', value='science fiction'), Comparison(comparator=<Comparator.GT: 'gt'>, attribute='rating', value=8.5)])\n"
]
},
{
"data": {
"text/plain": [
"[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'director': 'Andrei Tarkovsky', 'genre': ['science fiction', 'thriller'], 'rating': 9.9, 'year': 1979.0})]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# This example specifies a composite filter\n",
"retriever.get_relevant_documents(\"What's a highly rated (above 8.5) science fiction film?\")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "12a51522",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"query='toys' filter=Operation(operator=<Operator.AND: 'and'>, arguments=[Comparison(comparator=<Comparator.GT: 'gt'>, attribute='year', value=1990.0), Comparison(comparator=<Comparator.LT: 'lt'>, attribute='year', value=2005.0), Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='genre', value='animated')])\n"
]
},
{
"data": {
"text/plain": [
"[Document(page_content='Toys come alive and have a blast doing so', metadata={'genre': 'animated', 'year': 1995.0})]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# This example specifies a query and composite filter\n",
"retriever.get_relevant_documents(\"What's a movie after 1990 but before 2005 that's all about toys, and preferably is animated\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "69bbd809",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,124 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "ce0f17b9",
"metadata": {},
"source": [
"# Vespa retriever\n",
"\n",
"This notebook shows how to use Vespa.ai as a LangChain retriever.\n",
"Vespa.ai is a platform for highly efficient structured text and vector search.\n",
"Please refer to [Vespa.ai](https://vespa.ai) for more information.\n",
"\n",
"In order to create a retriever, we use [pyvespa](https://pyvespa.readthedocs.io/en/latest/index.html) to\n",
"create a connection a Vespa service."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "c10dd962",
"metadata": {},
"outputs": [],
"source": [
"from vespa.application import Vespa\n",
"\n",
"vespa_app = Vespa(url=\"https://doc-search.vespa.oath.cloud\")"
]
},
{
"cell_type": "markdown",
"id": "3df4ce53",
"metadata": {},
"source": [
"This creates a connection to a Vespa service, here the Vespa documentation search service.\n",
"Using pyvespa, you can also connect to a\n",
"[Vespa Cloud instance](https://pyvespa.readthedocs.io/en/latest/deploy-vespa-cloud.html)\n",
"or a local\n",
"[Docker instance](https://pyvespa.readthedocs.io/en/latest/deploy-docker.html).\n",
"\n",
"\n",
"After connecting to the service, you can set up the retriever:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7ccca1f4",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"from langchain.retrievers.vespa_retriever import VespaRetriever\n",
"\n",
"vespa_query_body = {\n",
" \"yql\": \"select content from paragraph where userQuery()\",\n",
" \"hits\": 5,\n",
" \"ranking\": \"documentation\",\n",
" \"locale\": \"en-us\"\n",
"}\n",
"vespa_content_field = \"content\"\n",
"retriever = VespaRetriever(vespa_app, vespa_query_body, vespa_content_field)"
]
},
{
"cell_type": "markdown",
"id": "1e7e34e1",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"This sets up a LangChain retriever that fetches documents from the Vespa application.\n",
"Here, up to 5 results are retrieved from the `content` field in the `paragraph` document type,\n",
"using `doumentation` as the ranking method. The `userQuery()` is replaced with the actual query\n",
"passed from LangChain.\n",
"\n",
"Please refer to the [pyvespa documentation](https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa.html#Query)\n",
"for more information.\n",
"\n",
"Now you can return the results and continue using the results in LangChain."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "f47a2bfe",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"retriever.get_relevant_documents(\"what is vespa?\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -6,7 +6,7 @@
"metadata": {},
"source": [
"# tiktoken (OpenAI) Length Function\n",
"You can also use tiktoken, a open source tokenizer package from OpenAI to estimate tokens used. Will probably be more accurate for their models.\n",
"You can also use tiktoken, an open source tokenizer package from OpenAI to estimate tokens used. Will probably be more accurate for their models.\n",
"\n",
"1. How the text is split: by character passed in\n",
"2. How the chunk size is measured: by `tiktoken` tokenizer"

View File

@@ -6,15 +6,21 @@
"source": [
"# AnalyticDB\n",
"\n",
"This notebook shows how to use functionality related to the AnalyticDB vector database.\n",
">[AnalyticDB for PostgreSQL](https://www.alibabacloud.com/help/en/analyticdb-for-postgresql/latest/product-introduction-overview) is a massively parallel processing (MPP) data warehousing service that is designed to analyze large volumes of data online.\n",
"\n",
">`AnalyticDB for PostgreSQL` is developed based on the open source `Greenplum Database` project and is enhanced with in-depth extensions by `Alibaba Cloud`. AnalyticDB for PostgreSQL is compatible with the ANSI SQL 2003 syntax and the PostgreSQL and Oracle database ecosystems. AnalyticDB for PostgreSQL also supports row store and column store. AnalyticDB for PostgreSQL processes petabytes of data offline at a high performance level and supports highly concurrent online queries.\n",
"\n",
"This notebook shows how to use functionality related to the `AnalyticDB` vector database.\n",
"To run, you should have an [AnalyticDB](https://www.alibabacloud.com/help/en/analyticdb-for-postgresql/latest/product-introduction-overview) instance up and running:\n",
"- Using [AnalyticDB Cloud Vector Database](https://www.alibabacloud.com/product/hybriddb-postgresql). Click here to fast deploy it."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@@ -24,12 +30,10 @@
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Split documents and get embeddings by call OpenAI API"
],
"metadata": {
"collapsed": false
}
]
},
{
"cell_type": "code",
@@ -48,6 +52,7 @@
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Connect to AnalyticDB by setting related ENVIRONMENTS.\n",
"```\n",
@@ -59,10 +64,7 @@
"```\n",
"\n",
"Then store your embeddings and documents into AnalyticDB"
],
"metadata": {
"collapsed": false
}
]
},
{
"cell_type": "code",
@@ -90,12 +92,10 @@
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Query and retrieve data"
],
"metadata": {
"collapsed": false
}
]
},
{
"cell_type": "code",
@@ -129,13 +129,6 @@
"source": [
"print(docs[0].page_content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -154,9 +147,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 1
"nbformat_minor": 4
}

View File

@@ -1,34 +1,31 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "683953b3",
"metadata": {},
"source": [
"# Annoy\n",
"\n",
"This notebook shows how to use functionality related to the Annoy vector database.\n",
"\n",
"> \"Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mmapped into memory so that many processes may share the same data.\"\n",
"\n",
"This notebook shows how to use functionality related to the `Annoy` vector database.\n",
"\n",
"via [Annoy](https://github.com/spotify/annoy) \n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "3b450bdc",
"metadata": {},
"source": [
"```{note}\n",
"Annoy is read-only - once the index is built you cannot add any more emebddings!\n",
"If you want to progressively add to your VectorStore then better choose an alternative!\n",
"NOTE: Annoy is read-only - once the index is built you cannot add any more emebddings!\n",
"If you want to progressively add new entries to your VectorStore then better choose an alternative!\n",
"```"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6613d222",
"metadata": {},
@@ -123,7 +120,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "4583b231",
"metadata": {},
@@ -265,7 +261,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "341390c2",
"metadata": {},
@@ -409,7 +404,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6f570f69",
"metadata": {},
@@ -472,7 +466,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "df4beb83",
"metadata": {},
@@ -564,7 +557,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -6,7 +6,20 @@
"source": [
"# AtlasDB\n",
"\n",
"This notebook shows you how to use functionality related to the AtlasDB"
"This notebook shows you how to use functionality related to the `AtlasDB`.\n",
"\n",
">[MongoDBs](https://www.mongodb.com/) [Atlas](https://www.mongodb.com/cloud/atlas) is an on-demand fully managed service. `MongoDB Atlas` runs on `AWS`, `Microsoft Azure`, and `Google Cloud Platform`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install spacy"
]
},
{
@@ -15,7 +28,34 @@
"metadata": {
"pycharm": {
"is_executing": true
}
},
"scrolled": true,
"tags": []
},
"outputs": [],
"source": [
"!python3 -m spacy download en_core_web_sm"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install nomic"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"pycharm": {
"is_executing": true
},
"tags": []
},
"outputs": [],
"source": [
@@ -28,31 +68,21 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 7,
"metadata": {
"pycharm": {
"is_executing": true
},
"scrolled": true
"tags": []
},
"outputs": [],
"source": [
"!python -m spacy download en_core_web_sm"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"ATLAS_TEST_API_KEY = '7xDPkYXSYDc1_ErdTPIcoAR9RNd8YDlkS3nVNXcVoIMZ6'"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"execution_count": 8,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"loader = TextLoader('../../../state_of_the_union.txt')\n",
@@ -71,7 +101,8 @@
"metadata": {
"pycharm": {
"is_executing": true
}
},
"tags": []
},
"outputs": [],
"source": [
@@ -165,13 +196,6 @@
"source": [
"db.project"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -190,9 +214,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 1
"nbformat_minor": 4
}

View File

@@ -7,14 +7,68 @@
"source": [
"# Chroma\n",
"\n",
"This notebook shows how to use functionality related to the Chroma vector database."
">[Chroma](https://docs.trychroma.com/getting-started) is a database for building AI applications with embeddings.\n",
"\n",
"This notebook shows how to use functionality related to the `Chroma` vector database."
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "0825fa4a-d950-4e78-8bba-20cfcc347765",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install chromadb"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "42080f37-8fd1-4cec-acd9-15d2b03b2f4d",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"# get a token: https://platform.openai.com/account/api-keys\n",
"\n",
"from getpass import getpass\n",
"\n",
"OPENAI_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "c7a94d6c-b4d4-4498-9bdd-eb50c92b85c5",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "aac9563e",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@@ -25,9 +79,11 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 8,
"id": "a3c3999a",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
@@ -41,9 +97,11 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 9,
"id": "5eabdb75",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
@@ -94,9 +152,11 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 10,
"id": "72aaa9c8",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"docs = db.similarity_search_with_score(query)"
@@ -104,18 +164,20 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 11,
"id": "d88e958e",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"(Document(page_content='In state after state, new laws have been passed, not only to suppress the vote, but to subvert entire elections. \\n\\nWe cannot let this happen. \\n\\nTonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', lookup_str='', metadata={'source': '../../state_of_the_union.txt'}, lookup_index=0),\n",
" 0.3913410007953644)"
"(Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'}),\n",
" 0.3949805498123169)"
]
},
"execution_count": 6,
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
@@ -170,7 +232,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "f568a322",
"metadata": {},
@@ -300,7 +361,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -6,24 +6,30 @@
"source": [
"# Deep Lake\n",
"\n",
"This notebook showcases basic functionality related to Deep Lake. While Deep Lake can store embeddings, it is capable of storing any type of data. It is a fully fledged serverless data lake with version control, query engine and streaming dataloader to deep learning frameworks. \n",
">[Deep Lake](https://docs.activeloop.ai/) as a Multi-Modal Vector Store that stores embeddings and their metadata including text, jsons, images, audio, video, and more. It saves the data locally, in your cloud, or on Activeloop storage. It performs hybrid search including embeddings and their attributes.\n",
"\n",
"For more information, please see the Deep Lake [documentation](docs.activeloop.ai) or [api reference](docs.deeplake.ai)"
"This notebook showcases basic functionality related to `Deep Lake`. While `Deep Lake` can store embeddings, it is capable of storing any type of data. It is a fully fledged serverless data lake with version control, query engine and streaming dataloader to deep learning frameworks. \n",
"\n",
"For more information, please see the Deep Lake [documentation](https://docs.activeloop.ai) or [api reference](https://docs.deeplake.ai)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!python3 -m pip install openai deeplake tiktoken"
"!pip install openai deeplake tiktoken"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@@ -33,9 +39,19 @@
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [],
"execution_count": 3,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"OpenAI API Key: ········\n"
]
}
],
"source": [
"import os\n",
"import getpass\n",
@@ -46,8 +62,10 @@
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"execution_count": 4,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
@@ -61,7 +79,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -70,9 +87,19 @@
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"execution_count": 6,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/leo/.local/lib/python3.10/site-packages/deeplake/util/check_latest_version.py:32: UserWarning: A newer version of deeplake (3.3.2) is available. It's recommended that you update to the latest version using `pip install -U deeplake`.\n",
" warnings.warn(\n"
]
},
{
"name": "stdout",
"output_type": "stream",
@@ -80,11 +107,26 @@
"./my_deeplake/ loaded successfully.\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": []
},
{
"name": "stderr",
"output_type": "stream",
"text": []
},
{
"name": "stderr",
"output_type": "stream",
"text": []
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Evaluating ingest: 100%|██████████| 1/1 [00:04<00:00\n"
"Evaluating ingest: 100%|██████████████████████████████████████| 1/1 [00:07<00:00\n"
]
},
{
@@ -93,17 +135,17 @@
"text": [
"Dataset(path='./my_deeplake/', tensors=['embedding', 'ids', 'metadata', 'text'])\n",
"\n",
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding generic (4, 1536) float32 None \n",
" ids text (4, 1) str None \n",
" metadata json (4, 1) str None \n",
" text text (4, 1) str None \n"
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding generic (42, 1536) float32 None \n",
" ids text (42, 1) str None \n",
" metadata json (42, 1) str None \n",
" text text (42, 1) str None \n"
]
}
],
"source": [
"db = DeepLake(dataset_path=\"./my_deeplake/\", embedding_function=embeddings, overwrite=True)\n",
"db = DeepLake(dataset_path=\"./my_deeplake/\", embedding_function=embeddings)\n",
"db.add_documents(docs)\n",
"# or shorter\n",
"# db = DeepLake.from_documents(docs, dataset_path=\"./my_deeplake/\", embedding=embeddings, overwrite=True)\n",
@@ -113,8 +155,10 @@
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"execution_count": 7,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
@@ -135,7 +179,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -144,8 +187,10 @@
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"execution_count": 8,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
@@ -155,6 +200,11 @@
"\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": []
},
{
"name": "stderr",
"output_type": "stream",
@@ -168,12 +218,12 @@
"text": [
"Dataset(path='./my_deeplake/', read_only=True, tensors=['embedding', 'ids', 'metadata', 'text'])\n",
"\n",
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding generic (4, 1536) float32 None \n",
" ids text (4, 1) str None \n",
" metadata json (4, 1) str None \n",
" text text (4, 1) str None \n"
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding generic (42, 1536) float32 None \n",
" ids text (42, 1) str None \n",
" metadata json (42, 1) str None \n",
" text text (42, 1) str None \n"
]
}
],
@@ -183,7 +233,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -191,7 +240,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -200,14 +248,16 @@
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"execution_count": 9,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/media/sdb/davit/Git/experiments/langchain/langchain/llms/openai.py:672: UserWarning: You are trying to use a chat model. This way of initializing it is no longer supported. Instead, please use: `from langchain.chat_models import ChatOpenAI`\n",
"/home/leo/.local/lib/python3.10/site-packages/langchain/llms/openai.py:624: UserWarning: You are trying to use a chat model. This way of initializing it is no longer supported. Instead, please use: `from langchain.chat_models import ChatOpenAI`\n",
" warnings.warn(\n"
]
}
@@ -221,16 +271,18 @@
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"execution_count": 10,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"\"The president nominated Ketanji Brown Jackson to serve on the United States Supreme Court, describing her as one of the nation's top legal minds and a consensus builder with a background in private practice and public defense, and noting that she has received broad support from both Democrats and Republicans.\""
"'The president nominated Ketanji Brown Jackson to serve on the United States Supreme Court. He described her as a former top litigator in private practice, a former federal public defender, a consensus builder, and from a family of public school educators and police officers. He also mentioned that she has received broad support from various groups since being nominated.'"
]
},
"execution_count": 53,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
@@ -241,7 +293,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -325,7 +376,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -357,7 +407,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -389,7 +438,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -406,7 +454,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -429,7 +476,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -533,7 +579,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -594,7 +639,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -638,7 +682,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -815,7 +858,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
"version": "3.10.6"
},
"vscode": {
"interpreter": {
@@ -824,5 +867,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -7,14 +7,135 @@
"source": [
"# ElasticSearch\n",
"\n",
"This notebook shows how to use functionality related to the ElasticSearch database."
"[Elasticsearch](https://www.elastic.co/elasticsearch/) is a distributed, RESTful search and analytics engine. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.\n",
"\n",
"This notebook shows how to use functionality related to the `Elasticsearch` database."
]
},
{
"cell_type": "markdown",
"id": "b66c12b2-2a07-4136-ac77-ce1c9fa7a409",
"metadata": {
"tags": []
},
"source": [
"## Installation"
]
},
{
"cell_type": "markdown",
"id": "81f43794-f002-477c-9b68-4975df30e718",
"metadata": {},
"source": [
"Check out [Elasticsearch installation instructions](https://www.elastic.co/guide/en/elasticsearch/reference/current/install-elasticsearch.html).\n",
"\n",
"To connect to an Elasticsearch instance that does not require\n",
"login credentials, pass the Elasticsearch URL and index name along with the\n",
"embedding object to the constructor.\n",
"\n",
"Example:\n",
"```python\n",
" from langchain import ElasticVectorSearch\n",
" from langchain.embeddings import OpenAIEmbeddings\n",
"\n",
" embedding = OpenAIEmbeddings()\n",
" elastic_vector_search = ElasticVectorSearch(\n",
" elasticsearch_url=\"http://localhost:9200\",\n",
" index_name=\"test_index\",\n",
" embedding=embedding\n",
" )\n",
"```\n",
"\n",
"To connect to an Elasticsearch instance that requires login credentials,\n",
"including Elastic Cloud, use the Elasticsearch URL format\n",
"https://username:password@es_host:9243. For example, to connect to Elastic\n",
"Cloud, create the Elasticsearch URL with the required authentication details and\n",
"pass it to the ElasticVectorSearch constructor as the named parameter\n",
"elasticsearch_url.\n",
"\n",
"You can obtain your Elastic Cloud URL and login credentials by logging in to the\n",
"Elastic Cloud console at https://cloud.elastic.co, selecting your deployment, and\n",
"navigating to the \"Deployments\" page.\n",
"\n",
"To obtain your Elastic Cloud password for the default \"elastic\" user:\n",
"1. Log in to the Elastic Cloud console at https://cloud.elastic.co\n",
"2. Go to \"Security\" > \"Users\"\n",
"3. Locate the \"elastic\" user and click \"Edit\"\n",
"4. Click \"Reset password\"\n",
"5. Follow the prompts to reset the password\n",
"\n",
"Format for Elastic Cloud URLs is\n",
"https://username:password@cluster_id.region_id.gcp.cloud.es.io:9243.\n",
"\n",
"Example:\n",
"```python\n",
" from langchain import ElasticVectorSearch\n",
" from langchain.embeddings import OpenAIEmbeddings\n",
"\n",
" embedding = OpenAIEmbeddings()\n",
"\n",
" elastic_host = \"cluster_id.region_id.gcp.cloud.es.io\"\n",
" elasticsearch_url = f\"https://username:password@{elastic_host}:9243\"\n",
" elastic_vector_search = ElasticVectorSearch(\n",
" elasticsearch_url=elasticsearch_url,\n",
" index_name=\"test_index\",\n",
" embedding=embedding\n",
" )\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "d6197931-cbe5-460c-a5e6-b5eedb83887c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install elasticsearch"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "67ab8afa-f7c6-4fbf-b596-cb512da949da",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"OpenAI API Key: ········\n"
]
}
],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "markdown",
"id": "f6030187-0bd7-4798-8372-a265036af5e0",
"metadata": {
"tags": []
},
"source": [
"## Example"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "aac9563e",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@@ -25,9 +146,11 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 5,
"id": "a3c3999a",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
@@ -43,7 +166,9 @@
"cell_type": "code",
"execution_count": null,
"id": "12eb86d8",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"db = ElasticVectorSearch.from_documents(docs, embeddings, elasticsearch_url=\"http://localhost:9200\")\n",
@@ -105,7 +230,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -7,14 +7,65 @@
"source": [
"# FAISS\n",
"\n",
"This notebook shows how to use functionality related to the FAISS vector database."
">[Facebook AI Similarity Search (Faiss)](https://engineering.fb.com/2017/03/29/data-infrastructure/faiss-a-library-for-efficient-similarity-search/) is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning.\n",
"\n",
"[Faiss documentation](https://faiss.ai/).\n",
"\n",
"This notebook shows how to use functionality related to the `FAISS` vector database."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "aac9563e",
"execution_count": null,
"id": "497fcd89-e832-46a7-a74a-c71199666206",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"#!pip install faiss\n",
"# OR\n",
"!pip install faiss-cpu"
]
},
{
"cell_type": "markdown",
"id": "38237514-b3fa-44a4-9cff-30cd6bf50073",
"metadata": {},
"source": [
"We want to use OpenAIEmbeddings so we have to get the OpenAI API Key. "
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "47f9b495-88f1-4286-8d5d-1416103931a7",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"OpenAI API Key: ········\n"
]
}
],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "aac9563e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@@ -25,9 +76,11 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 7,
"id": "a3c3999a",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
@@ -41,9 +94,11 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 8,
"id": "5eabdb75",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"db = FAISS.from_documents(docs, embeddings)\n",
@@ -54,9 +109,11 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 9,
"id": "4b172de8",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
@@ -315,7 +372,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,214 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "683953b3",
"metadata": {},
"source": [
"# LanceDB\n",
"\n",
">[LanceDB](https://lancedb.com/) is an open-source database for vector-search built with persistent storage, which greatly simplifies retrevial, filtering and management of embeddings. Fully open source.\n",
"\n",
"This notebook shows how to use functionality related to the `LanceDB` vector database based on the Lance data format."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bfcf346a",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install lancedb"
]
},
{
"cell_type": "markdown",
"id": "99134dd1-b91e-486f-8d90-534248e43b9d",
"metadata": {},
"source": [
"We want to use OpenAIEmbeddings so we have to get the OpenAI API Key. "
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a0361f5c-e6f4-45f4-b829-11680cf03cec",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"OpenAI API Key: ········\n"
]
}
],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aac9563e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.vectorstores import LanceDB"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "a3c3999a",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"loader = TextLoader('../../../state_of_the_union.txt')\n",
"documents = loader.load()\n",
"\n",
"documents = CharacterTextSplitter().split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "6e104aee",
"metadata": {},
"outputs": [],
"source": [
"import lancedb\n",
"\n",
"db = lancedb.connect('/tmp/lancedb')\n",
"table = db.create_table(\"my_table\", data=[\n",
" {\"vector\": embeddings.embed_query(\"Hello World\"), \"text\": \"Hello World\", \"id\": \"1\"}\n",
"], mode=\"overwrite\")\n",
"\n",
"docsearch = LanceDB.from_documents(documents, embeddings, connection=table)\n",
"\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = docsearch.similarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "9c608226",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"They were responding to a 9-1-1 call when a man shot and killed them with a stolen gun. \n",
"\n",
"Officer Mora was 27 years old. \n",
"\n",
"Officer Rivera was 22. \n",
"\n",
"Both Dominican Americans whod grown up on the same streets they later chose to patrol as police officers. \n",
"\n",
"I spoke with their families and told them that we are forever in debt for their sacrifice, and we will carry on their mission to restore the trust and safety every community deserves. \n",
"\n",
"Ive worked on these issues a long time. \n",
"\n",
"I know what works: Investing in crime preventionand community police officers wholl walk the beat, wholl know the neighborhood, and who can restore trust and safety. \n",
"\n",
"So lets not abandon our streets. Or choose between safety and equal justice. \n",
"\n",
"Lets come together to protect our communities, restore trust, and hold law enforcement accountable. \n",
"\n",
"Thats why the Justice Department required body cameras, banned chokeholds, and restricted no-knock warrants for its officers. \n",
"\n",
"Thats why the American Rescue Plan provided $350 Billion that cities, states, and counties can use to hire more police and invest in proven strategies like community violence interruption—trusted messengers breaking the cycle of violence and trauma and giving young people hope. \n",
"\n",
"We should all agree: The answer is not to Defund the police. The answer is to FUND the police with the resources and training they need to protect our communities. \n",
"\n",
"I ask Democrats and Republicans alike: Pass my budget and keep our neighborhoods safe. \n",
"\n",
"And I will keep doing everything in my power to crack down on gun trafficking and ghost guns you can buy online and make at home—they have no serial numbers and cant be traced. \n",
"\n",
"And I ask Congress to pass proven measures to reduce gun violence. Pass universal background checks. Why should anyone on a terrorist list be able to purchase a weapon? \n",
"\n",
"Ban assault weapons and high-capacity magazines. \n",
"\n",
"Repeal the liability shield that makes gun manufacturers the only industry in America that cant be sued. \n",
"\n",
"These laws dont infringe on the Second Amendment. They save lives. \n",
"\n",
"The most fundamental right in America is the right to vote and to have it counted. And its under assault. \n",
"\n",
"In state after state, new laws have been passed, not only to suppress the vote, but to subvert entire elections. \n",
"\n",
"We cannot let this happen. \n",
"\n",
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence. \n",
"\n",
"A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n",
"\n",
"And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \n",
"\n",
"We can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \n",
"\n",
"Weve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \n",
"\n",
"Were putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster.\n"
]
}
],
"source": [
"print(docs[0].page_content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a359ed74",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -7,16 +7,63 @@
"source": [
"# Milvus\n",
"\n",
">[Milvus](https://milvus.io/docs/overview.md) is a database that stores, indexes, and manages massive embedding vectors generated by deep neural networks and other machine learning (ML) models.\n",
"\n",
"This notebook shows how to use functionality related to the Milvus vector database.\n",
"\n",
"To run, you should have a Milvus instance up and running: https://milvus.io/docs/install_standalone-docker.md"
"To run, you should have a [Milvus instance up and running](https://milvus.io/docs/install_standalone-docker.md)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a62cff8a-bcf7-4e33-bbbc-76999c2e3e20",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install pymilvus"
]
},
{
"cell_type": "markdown",
"id": "7a0f9e02-8eb0-4aef-b11f-8861360472ee",
"metadata": {},
"source": [
"We want to use OpenAIEmbeddings so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "8b6ed9cd-81b9-46e5-9c20-5aafca2844d0",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"OpenAI API Key: ········\n"
]
}
],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "aac9563e",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@@ -27,9 +74,11 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 4,
"id": "a3c3999a",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
@@ -45,7 +94,9 @@
"cell_type": "code",
"execution_count": null,
"id": "dcf88bdf",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"vector_db = Milvus.from_documents(\n",
@@ -74,14 +125,6 @@
"source": [
"docs[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a359ed74",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -100,7 +143,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -1,37 +1,63 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "683953b3",
"metadata": {},
"source": [
"# MyScale\n",
"\n",
"This notebook shows how to use functionality related to the MyScale vector database."
">[MyScale](https://docs.myscale.com/en/overview/) is a cloud-based database optimized for AI applications and solutions, built on the open-source [ClickHouse](https://github.com/ClickHouse/ClickHouse). \n",
"\n",
"This notebook shows how to use functionality related to the `MyScale` vector database."
]
},
{
"cell_type": "markdown",
"id": "43ead5d5-2c1f-4dce-a69a-cb00e4f9d6f0",
"metadata": {},
"source": [
"## Setting up envrionments"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "aac9563e",
"metadata": {},
"execution_count": null,
"id": "7dccc580-8270-4714-ad61-f79783dd6eea",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import MyScale\n",
"from langchain.document_loaders import TextLoader"
"!pip install clickhouse-connect"
]
},
{
"cell_type": "markdown",
"id": "15a1d477-9cdb-4d82-b019-96951ecb2b72",
"metadata": {},
"source": [
"We want to use OpenAIEmbeddings so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "91003ea5-0c8c-436c-a5de-aaeaeef2f458",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "a9d16fa3",
"metadata": {},
"source": [
"## Setting up envrionments\n",
"\n",
"There are two ways to set up parameters for myscale index.\n",
"\n",
"1. Environment Variables\n",
@@ -56,9 +82,26 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": null,
"id": "aac9563e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import MyScale\n",
"from langchain.document_loaders import TextLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a3c3999a",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
@@ -126,7 +169,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "e3a8b105",
"metadata": {},
@@ -145,7 +187,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "f59360c0",
"metadata": {},
@@ -216,7 +257,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "a359ed74",
"metadata": {},
@@ -259,7 +299,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -7,7 +7,10 @@
"source": [
"# OpenSearch\n",
"\n",
"This notebook shows how to use functionality related to the OpenSearch database.\n",
"> [OpenSearch](https://opensearch.org/) is a scalable, flexible, and extensible open-source software suite for search, analytics, and observability applications licensed under Apache 2.0. `OpenSearch` is a distributed search and analytics engine based on `Apache Lucene`.\n",
"\n",
"\n",
"This notebook shows how to use functionality related to the `OpenSearch` database.\n",
"\n",
"To run, you should have the opensearch instance up and running: [here](https://opensearch.org/docs/latest/install-and-configure/install-opensearch/index/)\n",
"`similarity_search` by default performs the Approximate k-NN Search which uses one of the several algorithms like lucene, nmslib, faiss recommended for\n",
@@ -15,6 +18,39 @@
"Check [this](https://opensearch.org/docs/latest/search-plugins/knn/index/) for more details."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6e606066-9386-4427-8a87-1b93f435c57e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install opensearch-py"
]
},
{
"cell_type": "markdown",
"id": "b1fa637e-4fbf-4d5a-9188-2cad826a193e",
"metadata": {},
"source": [
"We want to use OpenAIEmbeddings so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "28e5455e-322d-4010-9e3b-491d522ef5db",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -233,9 +269,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
}

View File

@@ -6,7 +6,38 @@
"source": [
"# PGVector\n",
"\n",
"This notebook shows how to use functionality related to the Postgres vector database (PGVector)."
">[PGVector](https://github.com/pgvector/pgvector) is an open-source vector similarity search for `Postgres`\n",
"\n",
"It supports:\n",
"- exact and approximate nearest neighbor search\n",
"- L2 distance, inner product, and cosine distance\n",
"\n",
"This notebook shows how to use the Postgres vector database (`PGVector`)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"See the [installation instruction](https://github.com/pgvector/pgvector)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install pgvector"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
]
},
{
@@ -14,6 +45,31 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"## Loading Environment Variables\n",
"from typing import List, Tuple\n",
@@ -23,8 +79,10 @@
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"execution_count": 4,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@@ -182,9 +240,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -7,15 +7,75 @@
"source": [
"# Pinecone\n",
"\n",
"This notebook shows how to use functionality related to the Pinecone vector database."
"[Pinecone](https://docs.pinecone.io/docs/overview) is a vector database with broad functionality.\n",
"\n",
"This notebook shows how to use functionality related to the `Pinecone` vector database.\n",
"\n",
"To use Pinecone, you must have an API key. \n",
"Here are the [installation instructions](https://docs.pinecone.io/docs/quickstart)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "aac9563e",
"execution_count": null,
"id": "b4c41cad-08ef-4f72-a545-2151e4598efe",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install pinecone-client"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c1e38361-c1fe-4ac6-86e9-c90ebaf7ae87",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"PINECONE_API_KEY = getpass.getpass('Pinecone API Key:')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "02a536e0-d603-4d79-b18b-1ed562977b40",
"metadata": {},
"outputs": [],
"source": [
"PINECONE_ENV = getpass.getpass('Pinecone Environment:')"
]
},
{
"cell_type": "markdown",
"id": "320af802-9271-46ee-948f-d2453933d44b",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ffea66e4-bc23-46a9-9580-b348dfe7b7a7",
"metadata": {},
"outputs": [],
"source": [
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "aac9563e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
@@ -50,8 +110,8 @@
"\n",
"# initialize pinecone\n",
"pinecone.init(\n",
" api_key=\"YOUR_API_KEY\", # find at app.pinecone.io\n",
" environment=\"YOUR_ENV\" # next to api key in console\n",
" api_key=PINECONE_API_KEY, # find at app.pinecone.io\n",
" environment=PINECONE_ENV # next to api key in console\n",
")\n",
"\n",
"index_name = \"langchain-demo\"\n",
@@ -100,7 +160,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -7,22 +7,72 @@
"source": [
"# Qdrant\n",
"\n",
"This notebook shows how to use functionality related to the Qdrant vector database. There are various modes of how to run Qdrant, and depending on the chosen one, there will be some subtle differences. The options include:\n",
">[Qdrant](https://qdrant.tech/documentation/) (read: quadrant ) is a vector similarity search engine. It provides a production-ready service with a convenient API to store, search, and manage points - vectors with an additional payload. `Qdrant` is tailored to extended filtering support. It makes it useful for all sorts of neural network or semantic-based matching, faceted search, and other applications.\n",
"\n",
"\n",
"This notebook shows how to use functionality related to the `Qdrant` vector database. \n",
"\n",
"There are various modes of how to run `Qdrant`, and depending on the chosen one, there will be some subtle differences. The options include:\n",
"- Local mode, no server required\n",
"- On-premise server deployment\n",
"- Qdrant Cloud"
"- Qdrant Cloud\n",
"\n",
"See the [installation instructions](https://qdrant.tech/documentation/install/)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "e03e8460-8f32-4d1f-bb93-4f7636a476fa",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install qdrant-client"
]
},
{
"cell_type": "markdown",
"id": "7b2f111b-357a-4f42-9730-ef0603bdc1b5",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "082e7e8b-ac52-430c-98d6-8f0924457642",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"OpenAI API Key: ········\n"
]
}
],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "aac9563e",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:22.282884Z",
"start_time": "2023-04-04T10:51:21.408077Z"
}
},
"tags": []
},
"outputs": [],
"source": [
@@ -34,13 +84,14 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 4,
"id": "a3c3999a",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:22.520144Z",
"start_time": "2023-04-04T10:51:22.285826Z"
}
},
"tags": []
},
"outputs": [],
"source": [
@@ -70,13 +121,14 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 5,
"id": "8429667e",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:22.525091Z",
"start_time": "2023-04-04T10:51:22.522015Z"
}
},
"tags": []
},
"outputs": [],
"source": [
@@ -99,13 +151,14 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 6,
"id": "24b370e2",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:24.827567Z",
"start_time": "2023-04-04T10:51:22.529080Z"
}
},
"tags": []
},
"outputs": [],
"source": [
@@ -242,13 +295,14 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 7,
"id": "a8c513ab",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:25.204469Z",
"start_time": "2023-04-04T10:51:24.855618Z"
}
},
"tags": []
},
"outputs": [],
"source": [
@@ -258,13 +312,14 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 8,
"id": "fc516993",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:25.220984Z",
"start_time": "2023-04-04T10:51:25.213943Z"
}
},
"tags": []
},
"outputs": [
{

View File

@@ -1,19 +1,52 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Redis\n",
"\n",
">[Redis (Remote Dictionary Server)](https://en.wikipedia.org/wiki/Redis) is an in-memory data structure store, used as a distributed, in-memory keyvalue database, cache and message broker, with optional durability.\n",
"\n",
"This notebook shows how to use functionality related to the [Redis vector database](https://redis.com/solutions/use-cases/vector-database/)."
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install redis"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings import OpenAIEmbeddings\n",
@@ -227,9 +260,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 1
"nbformat_minor": 4
}

View File

@@ -1,17 +1,23 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "683953b3",
"metadata": {},
"source": [
"# SupabaseVectorStore\n",
"# SupabaseVectorStore"
]
},
{
"cell_type": "markdown",
"id": "cc80fa84-1f2f-48b4-bd39-3e6412f012f1",
"metadata": {},
"source": [
">[Supabase](https://supabase.com/docs) is an open source Firebase alternative.\n",
"\n",
"This notebook shows how to use Supabase and `pgvector` as your VectorStore.\n",
"This notebook shows how to use `Supabase` and `pgvector` as your VectorStore.\n",
"\n",
"To run this notebook, please ensure:\n",
"\n",
"- the `pgvector` extension is enabled\n",
"- you have installed the `supabase-py` package\n",
"- that you have created a `match_documents` function in your database\n",
@@ -57,23 +63,66 @@
" LIMIT match_count;\n",
" END;\n",
" $$;\n",
"```\n"
"```"
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "6bd4498b",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# with pip\n",
"# !pip install supabase\n",
"!pip install supabase\n",
"\n",
"# with conda\n",
"# !conda install -c conda-forge supabase"
]
},
{
"cell_type": "markdown",
"id": "69bff365-3039-4ff8-a641-aa190166179d",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "19846a7b-99bc-47a7-8e1c-f13c2497f1ae",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c71c3901-d44b-4d09-92c5-3018628c28fa",
"metadata": {},
"outputs": [],
"source": [
"os.environ['SUPABASE_URL'] = getpass.getpass('Supabase URL:')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8b91ecfa-f61b-489a-a337-dff1f12f6ab2",
"metadata": {},
"outputs": [],
"source": [
"os.environ['SUPABASE_SERVICE_KEY'] = getpass.getpass('Supabase Service Key:')"
]
},
{
"cell_type": "code",
"execution_count": 2,
@@ -391,7 +440,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,129 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Tair\n",
"\n",
"This notebook shows how to use functionality related to the Tair vector database.\n",
"To run, you should have an [Tair](https://www.alibabacloud.com/help/en/tair/latest/what-is-tair) instance up and running."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.fake import FakeEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import Tair"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
"loader = TextLoader('../../../state_of_the_union.txt')\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = FakeEmbeddings(size=128)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Connect to Tair using the `TAIR_URL` environment variable \n",
"```\n",
"export TAIR_URL=\"redis://{username}:{password}@{tair_address}:{tair_port}\"\n",
"```\n",
"\n",
"or the keyword argument `tair_url`.\n",
"\n",
"Then store documents and embeddings into Tair."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"tair_url = \"redis://localhost:6379\"\n",
"\n",
"# drop first if index already exists\n",
"Tair.drop_index(tair_url=tair_url)\n",
"\n",
"vector_store = Tair.from_documents(\n",
" docs,\n",
" embeddings,\n",
" tair_url=tair_url\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Query similar documents."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content='Were going after the criminals who stole billions in relief money meant for small businesses and millions of Americans. \\n\\nAnd tonight, Im announcing that the Justice Department will name a chief prosecutor for pandemic fraud. \\n\\nBy the end of this year, the deficit will be down to less than half what it was before I took office. \\n\\nThe only president ever to cut the deficit by more than one trillion dollars in a single year. \\n\\nLowering your costs also means demanding more competition. \\n\\nIm a capitalist, but capitalism without competition isnt capitalism. \\n\\nIts exploitation—and it drives up prices. \\n\\nWhen corporations dont have to compete, their profits go up, your prices go up, and small businesses and family farmers and ranchers go under. \\n\\nWe see it happening with ocean carriers moving goods in and out of America. \\n\\nDuring the pandemic, these foreign-owned companies raised prices by as much as 1,000% and made record profits.', metadata={'source': '../../../state_of_the_union.txt'})"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = vector_store.similarity_search(query)\n",
"docs[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 1
}

View File

@@ -7,14 +7,73 @@
"source": [
"# Weaviate\n",
"\n",
"This notebook shows how to use functionality related to the Weaviate vector database."
">[Weaviate](https://weaviate.io/) is an open-source vector database. It allows you to store data objects and vector embeddings from your favorite ML-models, and scale seamlessly into billions of data objects.\n",
"\n",
"This notebook shows how to use functionality related to the `Weaviate`vector database.\n",
"\n",
"See the `Weaviate` [installation instructions](https://weaviate.io/developers/weaviate/installation)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e9ab167c-fffc-4d30-b1c1-37cc1b641698",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install weaviate-client"
]
},
{
"cell_type": "markdown",
"id": "6b34828d-e627-4d85-aabd-eeb15d9f4b00",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "37697b9f-fbb2-430e-b95d-28d6eb83486d",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fea2dbae-a609-4458-a05f-f1c8e1f37c6f",
"metadata": {},
"outputs": [],
"source": [
"WEAVIATE_URL = getpass.getpass('WEAVIATE_URL:')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "53b7ce2d-3c09-4d1c-b66b-5769ce6746ae",
"metadata": {},
"outputs": [],
"source": [
"os.environ['WEAVIATE_API_KEY'] = getpass.getpass('WEAVIATE_API_KEY:')"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "aac9563e",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@@ -75,7 +134,8 @@
" \"vectorizer\": \"text2vec-openai\",\n",
" \"moduleConfig\": {\n",
" \"text2vec-openai\": {\n",
" \"model\": \"babbage\",\n",
" \"model\": \"ada\",\n",
" \"modelVersion\": \"002\",\n",
" \"type\": \"text\"\n",
" }\n",
" },\n",
@@ -155,7 +215,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -7,9 +7,57 @@
"source": [
"# Zilliz\n",
"\n",
">[Zilliz Cloud](https://zilliz.com/doc/quick_start) is a fully managed service on cloud for `LF AI Milvus®`,\n",
"\n",
"This notebook shows how to use functionality related to the Zilliz Cloud managed vector database.\n",
"\n",
"To run, you should have a Zilliz Cloud instance up and running: https://zilliz.com/cloud"
"To run, you should have a `Zilliz Cloud` instance up and running. Here are the [installation instructions](https://zilliz.com/cloud)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c0c50102-e6ac-4475-a930-49c94ed0bd99",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install pymilvus"
]
},
{
"cell_type": "markdown",
"id": "4b25e246-ffe7-4822-a6bf-85d1a120df00",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d6691489-1ebc-40fa-bc09-b0916903a24d",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "19a71422",
"metadata": {},
"outputs": [],
"source": [
"# replace \n",
"ZILLIZ_CLOUD_URI = \"\" # example: \"https://in01-17f69c292d4a5sa.aws-us-west-2.vectordb.zillizcloud.com:19536\"\n",
"ZILLIZ_CLOUD_USERNAME = \"\" # example: \"username\"\n",
"ZILLIZ_CLOUD_PASSWORD = \"\" # example: \"*********\""
]
},
{
@@ -25,18 +73,6 @@
"from langchain.document_loaders import TextLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "19a71422",
"metadata": {},
"outputs": [],
"source": [
"# replace \n",
"ZILLIZ_CLOUD_HOSTNAME = \"\" # example: \"in01-17f69c292d4a50a.aws-us-west-2.vectordb.zillizcloud.com\"\n",
"ZILLIZ_CLOUD_PORT = \"\" #example: \"19532\""
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -63,7 +99,12 @@
"vector_db = Milvus.from_documents(\n",
" docs,\n",
" embeddings,\n",
" connection_args={\"host\": ZILLIZ_CLOUD_HOSTNAME, \"port\": ZILLIZ_CLOUD_PORT},\n",
" connection_args={\n",
" \"uri\": ZILLIZ_CLOUD_URI,\n",
" \"username\": ZILLIZ_CLOUD_USERNAME,\n",
" \"password\": ZILLIZ_CLOUD_PASSWORD,\n",
" \"secure\": True\n",
" }\n",
")"
]
},
@@ -104,7 +145,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.9"
"version": "3.9.6"
}
},
"nbformat": 4,

View File

@@ -11,9 +11,9 @@
"source": [
"# Getting Started\n",
"\n",
"This notebook showcases basic functionality related to VectorStores. A key part of working with vectorstores is creating the vector to put in them, which is usually created via embeddings. Therefore, it is recommended that you familiarize yourself with the [embedding notebook](embeddings.ipynb) before diving into this.\n",
"This notebook showcases basic functionality related to VectorStores. A key part of working with vectorstores is creating the vector to put in them, which is usually created via embeddings. Therefore, it is recommended that you familiarize yourself with the [embedding notebook](../../models/text_embedding.htpl) before diving into this.\n",
"\n",
"This covers generic high level functionality related to all vector stores. For guides on specific vectorstores, please see the how-to guides [here](../how_to_guides.rst)"
"This covers generic high level functionality related to all vector stores."
]
},
{
@@ -265,7 +265,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -80,9 +80,8 @@
}
],
"source": [
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
"chat = ChatOpenAI(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)\n",
"chat = ChatOpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)\n",
"resp = chat([HumanMessage(content=\"Write me a song about sparkling water.\")])"
]
},

View File

@@ -373,9 +373,8 @@
}
],
"source": [
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
"chat = ChatOpenAI(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)\n",
"chat = ChatOpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)\n",
"resp = chat([HumanMessage(content=\"Write me a song about sparkling water.\")])\n"
]
},

View File

@@ -0,0 +1,179 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "bf733a38-db84-4363-89e2-de6735c37230",
"metadata": {},
"source": [
"# Anthropic\n",
"\n",
"This notebook covers how to get started with Anthropic chat models."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "d4a7c55d-b235-4ca4-a579-c90cc9570da9",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.chat_models import ChatAnthropic\n",
"from langchain.prompts.chat import (\n",
" ChatPromptTemplate,\n",
" SystemMessagePromptTemplate,\n",
" AIMessagePromptTemplate,\n",
" HumanMessagePromptTemplate,\n",
")\n",
"from langchain.schema import (\n",
" AIMessage,\n",
" HumanMessage,\n",
" SystemMessage\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "70cf04e8-423a-4ff6-8b09-f11fb711c817",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"chat = ChatAnthropic()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "8199ef8f-eb8b-4253-9ea0-6c24a013ca4c",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\" J'aime programmer. \", additional_kwargs={})"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"messages = [\n",
" HumanMessage(content=\"Translate this sentence from English to French. I love programming.\")\n",
"]\n",
"chat(messages)"
]
},
{
"cell_type": "markdown",
"id": "c361ab1e-8c0c-4206-9e3c-9d1424a12b9c",
"metadata": {},
"source": [
"## `ChatAnthropic` also supports async and streaming functionality:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "93a21c5c-6ef9-4688-be60-b2e1f94842fb",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "c5fac0e9-05a4-4fc1-a3b3-e5bbb24b971b",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"LLMResult(generations=[[ChatGeneration(text=\" J'aime la programmation.\", generation_info=None, message=AIMessage(content=\" J'aime la programmation.\", additional_kwargs={}))]], llm_output={})"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"await chat.agenerate([messages])"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "025be980-e50d-4a68-93dc-c9c7b500ce34",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" J'adore programmer."
]
},
{
"data": {
"text/plain": [
"AIMessage(content=\" J'adore programmer.\", additional_kwargs={})"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat = ChatAnthropic(streaming=True, verbose=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]))\n",
"chat(messages)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "df45f59f",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -22,18 +22,20 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 6,
"id": "a65696a0",
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms.base import LLM\n",
"from typing import Optional, List, Mapping, Any"
"from typing import Any, List, Mapping, Optional\n",
"\n",
"from langchain.callbacks.manager import CallbackManagerForLLMRun\n",
"from langchain.llms.base import LLM"
]
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 7,
"id": "d5ceff02",
"metadata": {},
"outputs": [],
@@ -46,7 +48,12 @@
" def _llm_type(self) -> str:\n",
" return \"custom\"\n",
" \n",
" def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:\n",
" def _call(\n",
" self,\n",
" prompt: str,\n",
" stop: Optional[List[str]] = None,\n",
" run_manager: Optional[CallbackManagerForLLMRun] = None,\n",
" ) -> str:\n",
" if stop is not None:\n",
" raise ValueError(\"stop kwargs are not permitted.\")\n",
" return prompt[:self.n]\n",
@@ -67,7 +74,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 8,
"id": "10e5ece6",
"metadata": {},
"outputs": [],
@@ -77,7 +84,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 9,
"id": "8cd49199",
"metadata": {},
"outputs": [
@@ -87,7 +94,7 @@
"'This is a '"
]
},
"execution_count": 4,
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
@@ -106,7 +113,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 10,
"id": "9c33fa19",
"metadata": {},
"outputs": [

View File

@@ -41,7 +41,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 5,
"id": "f69f6283",
"metadata": {},
"outputs": [],
@@ -52,7 +52,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 6,
"id": "64005d1f",
"metadata": {},
"outputs": [
@@ -60,8 +60,8 @@
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 14.2 ms, sys: 4.9 ms, total: 19.1 ms\n",
"Wall time: 1.1 s\n"
"CPU times: user 26.1 ms, sys: 21.5 ms, total: 47.6 ms\n",
"Wall time: 1.68 s\n"
]
},
{
@@ -70,7 +70,7 @@
"'\\n\\nWhy did the chicken cross the road?\\n\\nTo get to the other side.'"
]
},
"execution_count": 4,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
@@ -83,7 +83,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 7,
"id": "c8a1cb2b",
"metadata": {},
"outputs": [
@@ -91,8 +91,8 @@
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 162 µs, sys: 7 µs, total: 169 µs\n",
"Wall time: 175 µs\n"
"CPU times: user 238 µs, sys: 143 µs, total: 381 µs\n",
"Wall time: 1.76 ms\n"
]
},
{
@@ -101,7 +101,7 @@
"'\\n\\nWhy did the chicken cross the road?\\n\\nTo get to the other side.'"
]
},
"execution_count": 5,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
@@ -214,9 +214,18 @@
"## Redis Cache"
]
},
{
"cell_type": "markdown",
"id": "c5c9a4d5",
"metadata": {},
"source": [
"### Standard Cache\n",
"Use [Redis](../../../../ecosystem/redis.md) to cache prompts and responses."
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 8,
"id": "39f6eb0b",
"metadata": {},
"outputs": [],
@@ -225,15 +234,35 @@
"# (make sure your local Redis instance is running first before running this example)\n",
"from redis import Redis\n",
"from langchain.cache import RedisCache\n",
"\n",
"langchain.llm_cache = RedisCache(redis_=Redis())"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 9,
"id": "28920749",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 6.88 ms, sys: 8.75 ms, total: 15.6 ms\n",
"Wall time: 1.04 s\n"
]
},
{
"data": {
"text/plain": [
"'\\n\\nWhy did the chicken cross the road?\\n\\nTo get to the other side!'"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"# The first time, it is not yet in cache, so it should take longer\n",
@@ -242,16 +271,124 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 14,
"id": "94bf9415",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 1.59 ms, sys: 610 µs, total: 2.2 ms\n",
"Wall time: 5.58 ms\n"
]
},
{
"data": {
"text/plain": [
"'\\n\\nWhy did the chicken cross the road?\\n\\nTo get to the other side!'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"# The second time it is, so it goes faster\n",
"llm(\"Tell me a joke\")"
]
},
{
"cell_type": "markdown",
"id": "82be23f6",
"metadata": {},
"source": [
"### Semantic Cache\n",
"Use [Redis](../../../../ecosystem/redis.md) to cache prompts and responses and evaluate hits based on semantic similarity."
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "64df3099",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.cache import RedisSemanticCache\n",
"\n",
"\n",
"langchain.llm_cache = RedisSemanticCache(\n",
" redis_url=\"redis://localhost:6379\",\n",
" embedding=OpenAIEmbeddings()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "8e91d3ac",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 351 ms, sys: 156 ms, total: 507 ms\n",
"Wall time: 3.37 s\n"
]
},
{
"data": {
"text/plain": [
"\"\\n\\nWhy don't scientists trust atoms?\\nBecause they make up everything.\""
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"# The first time, it is not yet in cache, so it should take longer\n",
"llm(\"Tell me a joke\")"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "df856948",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 6.25 ms, sys: 2.72 ms, total: 8.97 ms\n",
"Wall time: 262 ms\n"
]
},
{
"data": {
"text/plain": [
"\"\\n\\nWhy don't scientists trust atoms?\\nBecause they make up everything.\""
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"# The second time, while not a direct hit, the question is semantically similar to the original question,\n",
"# so it uses the cached result!\n",
"llm(\"Tell me one joke\")"
]
},
{
"cell_type": "markdown",
"id": "684eab55",
@@ -785,7 +922,9 @@
"id": "9df0dab8",
"metadata": {},
"outputs": [],
"source": []
"source": [
"!rm .langchain.db sqlite.db"
]
}
],
"metadata": {

View File

@@ -12,7 +12,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "4ac0ff54-540a-4f2b-8d9a-b590fec7fe07",
"metadata": {
"tags": []
@@ -21,14 +21,13 @@
"source": [
"from langchain.llms import OpenAI, Anthropic\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
"from langchain.schema import HumanMessage"
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": null,
"id": "77f60a4b-f786-41f2-972e-e5bb8a48dcd5",
"metadata": {
"tags": []
@@ -79,7 +78,7 @@
}
],
"source": [
"llm = OpenAI(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)\n",
"llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)\n",
"resp = llm(\"Write me a song about sparkling water.\")"
]
},
@@ -95,7 +94,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": null,
"id": "a35373f1-9ee6-4753-a343-5aee749b8527",
"metadata": {
"tags": []
@@ -136,7 +135,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": null,
"id": "22665f16-e05b-473c-a4bd-ad75744ea024",
"metadata": {
"tags": []
@@ -191,7 +190,7 @@
}
],
"source": [
"chat = ChatOpenAI(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)\n",
"chat = ChatOpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)\n",
"resp = chat([HumanMessage(content=\"Write me a song about sparkling water.\")])"
]
},
@@ -205,7 +204,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": null,
"id": "eadae4ba-9f21-4ec8-845d-dd43b0edc2dc",
"metadata": {
"tags": []
@@ -245,7 +244,7 @@
}
],
"source": [
"llm = Anthropic(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)\n",
"llm = Anthropic(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)\n",
"llm(\"Write me a song about sparkling water.\")"
]
}

View File

@@ -1,146 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "9597802c",
"metadata": {},
"source": [
"# Anthropic\n",
"\n",
"[Anthropic](https://console.anthropic.com/docs) is creator of the `Claude` LLM.\n",
"\n",
"This example goes over how to use LangChain to interact with Anthropic models."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e55c0f2e-63e1-4e83-ac44-ffcc1dfeacc8",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Install the package\n",
"!pip install anthropic"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cec62d45-afa2-422a-95ef-57f8ab41a6f9",
"metadata": {},
"outputs": [],
"source": [
"# get a new token: https://www.anthropic.com/earlyaccess\n",
"\n",
"from getpass import getpass\n",
"\n",
"ANTHROPIC_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "6fb585dd",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.llms import Anthropic\n",
"from langchain import PromptTemplate, LLMChain"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "035dea0f",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3f3458d9",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = Anthropic(anthropic_api_key=ANTHROPIC_API_KEY)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "a641dbd9",
"metadata": {},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "9f844993",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\" Step 1: Justin Beiber was born on March 1, 1994\\nStep 2: The NFL season ends with the Super Bowl in January/February\\nStep 3: Therefore, the Super Bowl that occurred closest to Justin Beiber's birth would be Super Bowl XXIX in 1995\\nStep 4: The San Francisco 49ers won Super Bowl XXIX in 1995\\n\\nTherefore, the answer is the San Francisco 49ers won the Super Bowl in the year Justin Beiber was born.\""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"llm_chain.run(question)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4797d719",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -6,7 +6,7 @@
"source": [
"# CerebriumAI\n",
"\n",
"`Cerebrium` is an AWS Sagemaker alternative. It also provides API access to [several LLM models](https://docs.cerebrium.ai/cerebrium/prebuilt-models/deploymen).\n",
"`Cerebrium` is an AWS Sagemaker alternative. It also provides API access to [several LLM models](https://docs.cerebrium.ai/cerebrium/prebuilt-models/deployment).\n",
"\n",
"This notebook goes over how to use Langchain with [CerebriumAI](https://docs.cerebrium.ai/introduction)."
]

View File

@@ -40,7 +40,6 @@
"source": [
"from langchain import PromptTemplate, LLMChain\n",
"from langchain.llms import GPT4All\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler"
]
},
@@ -124,9 +123,9 @@
"outputs": [],
"source": [
"# Callbacks support token-wise streaming\n",
"callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])\n",
"callbacks = [StreamingStdOutCallbackHandler()]\n",
"# Verbose is required to pass to the callback manager\n",
"llm = GPT4All(model=local_path, callback_manager=callback_manager, verbose=True)"
"llm = GPT4All(model=local_path, callbacks=callbacks, verbose=True)"
]
},
{

View File

@@ -11,7 +11,7 @@
"\n",
"The [Hugging Face Model Hub](https://huggingface.co/models) hosts over 120k models, 20k datasets, and 50k demo apps (Spaces), all open source and publicly available, in an online platform where people can easily collaborate and build ML together.\n",
"\n",
"These can be called from LangChain either through this local pipeline wrapper or by calling their hosted inference endpoints through the HuggingFaceHub class. For more information on the hosted pipelines, see the [HugigngFaceHub](huggingface_hub.ipynb) notebook."
"These can be called from LangChain either through this local pipeline wrapper or by calling their hosted inference endpoints through the HuggingFaceHub class. For more information on the hosted pipelines, see the [HuggingFaceHub](huggingface_hub.ipynb) notebook."
]
},
{

View File

@@ -41,7 +41,9 @@
"outputs": [],
"source": [
"from langchain.llms import LlamaCpp\n",
"from langchain import PromptTemplate, LLMChain"
"from langchain import PromptTemplate, LLMChain\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler"
]
},
{
@@ -67,7 +69,14 @@
},
"outputs": [],
"source": [
"llm = LlamaCpp(model_path=\"./ggml-model-q4_0.bin\")"
"# Callbacks support token-wise streaming\n",
"callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])\n",
"# Verbose is required to pass to the callback manager\n",
"\n",
"# Make sure the model path is correct for your system!\n",
"llm = LlamaCpp(\n",
" model_path=\"./ggml-model-q4_0.bin\", callback_manager=callback_manager, verbose=True\n",
")"
]
},
{
@@ -84,10 +93,17 @@
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" First we need to identify what year Justin Beiber was born in. A quick google search reveals that he was born on March 1st, 1994. Now we know when the Super Bowl was played in, so we can look up which NFL team won it. The NFL Superbowl of the year 1994 was won by the San Francisco 49ers against the San Diego Chargers."
]
},
{
"data": {
"text/plain": [
"'\\n\\nWe know that Justin Bieber is currently 25 years old and that he was born on March 1st, 1994 and that he is a singer and he has an album called Purpose, so we know that he was born when Super Bowl XXXVIII was played between Dallas and Seattle and that it took place February 1st, 2004 and that the Seattle Seahawks won 24-21, so Seattle is our answer!'"
"' First we need to identify what year Justin Beiber was born in. A quick google search reveals that he was born on March 1st, 1994. Now we know when the Super Bowl was played in, so we can look up which NFL team won it. The NFL Superbowl of the year 1994 was won by the San Francisco 49ers against the San Diego Chargers.'"
]
},
"execution_count": 6,

View File

@@ -0,0 +1,171 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# PipelineAI\n",
"\n",
"PipelineAI allows you to run your ML models at scale in the cloud. It also provides API access to [several LLM models](https://pipeline.ai).\n",
"\n",
"This notebook goes over how to use Langchain with [PipelineAI](https://docs.pipeline.ai/docs)."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Install pipeline-ai\n",
"The `pipeline-ai` library is required to use the `PipelineAI` API, AKA `Pipeline Cloud`. Install `pipeline-ai` using `pip install pipeline-ai`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install the package\n",
"!pip install pipeline-ai"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Imports"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from langchain.llms import PipelineAI\n",
"from langchain import PromptTemplate, LLMChain"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set the Environment API Key\n",
"Make sure to get your API key from PipelineAI. Check out the [cloud quickstart guide](https://docs.pipeline.ai/docs/cloud-quickstart). You'll be given a 30 day free trial with 10 hours of serverless GPU compute to test different models."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"PIPELINE_API_KEY\"] = \"YOUR_API_KEY_HERE\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create the PipelineAI instance\n",
"When instantiating PipelineAI, you need to specify the id or tag of the pipeline you want to use, e.g. `pipeline_key = \"public/gpt-j:base\"`. You then have the option of passing additional pipeline-specific keyword arguments:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm = PipelineAI(pipeline_key=\"YOUR_PIPELINE_KEY\", pipeline_kwargs={...})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create a Prompt Template\n",
"We will create a prompt template for Question and Answer."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initiate the LLMChain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Run the LLMChain\n",
"Provide a question and run the LLMChain."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"llm_chain.run(question)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -0,0 +1,155 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# PredictionGuard\n",
"\n",
"How to use PredictionGuard wrapper"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "3RqWPav7AtKL"
},
"outputs": [],
"source": [
"! pip install predictionguard langchain"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "2xe8JEUwA7_y"
},
"outputs": [],
"source": [
"import predictionguard as pg\n",
"from langchain.llms import PredictionGuard"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mesCTyhnJkNS"
},
"source": [
"## Basic LLM usage\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Ua7Mw1N4HcER"
},
"outputs": [],
"source": [
"pgllm = PredictionGuard(name=\"default-text-gen\", token=\"<your access token>\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Qo2p5flLHxrB"
},
"outputs": [],
"source": [
"pgllm(\"Tell me a joke\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "v3MzIUItJ8kV"
},
"source": [
"## Chaining"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "pPegEZExILrT"
},
"outputs": [],
"source": [
"from langchain import PromptTemplate, LLMChain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "suxw62y-J-bg"
},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n",
"llm_chain = LLMChain(prompt=prompt, llm=pgllm, verbose=True)\n",
"\n",
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"llm_chain.predict(question=question)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "l2bc26KHKr7n"
},
"outputs": [],
"source": [
"template = \"\"\"Write a {adjective} poem about {subject}.\"\"\"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"adjective\", \"subject\"])\n",
"llm_chain = LLMChain(prompt=prompt, llm=pgllm, verbose=True)\n",
"\n",
"llm_chain.predict(adjective=\"sad\", subject=\"ducks\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "I--eSa2PLGqq"
},
"outputs": [],
"source": []
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 1
}

View File

@@ -44,7 +44,7 @@
},
"outputs": [
{
"name": "stdin",
"name": "stdout",
"output_type": "stream",
"text": [
" ········\n"
@@ -85,6 +85,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -92,7 +93,7 @@
"\n",
"Find a model on the [replicate explore page](https://replicate.com/explore), and then paste in the model name and version in this format: model_name/version\n",
"\n",
"For example, for this [flan-t5 model]( https://replicate.com/daanelson/flan-t5), click on the API tab. The model name/version would be: `daanelson/flan-t5:04e422a9b85baed86a4f24981d7f9953e20c5fd82f6103b74ebc431588e1cec8`\n",
"For example, for this [dolly model](https://replicate.com/replicate/dolly-v2-12b), click on the API tab. The model name/version would be: `replicate/dolly-v2-12b:ef0e1aefc61f8e096ebe4db6b2bacc297daf2ef6899f0f7e001ec445893500e5`\n",
"\n",
"Only the `model` param is required, but we can add other model params when initializing.\n",
"\n",
@@ -113,7 +114,7 @@
},
"outputs": [],
"source": [
"llm = Replicate(model=\"daanelson/flan-t5:04e422a9b85baed86a4f24981d7f9953e20c5fd82f6103b74ebc431588e1cec8\")"
"llm = Replicate(model=\"replicate/dolly-v2-12b:ef0e1aefc61f8e096ebe4db6b2bacc297daf2ef6899f0f7e001ec445893500e5\")"
]
},
{
@@ -243,7 +244,7 @@
"metadata": {},
"outputs": [],
"source": [
"llm = Replicate(model=\"daanelson/flan-t5:04e422a9b85baed86a4f24981d7f9953e20c5fd82f6103b74ebc431588e1cec8\")\n",
"dolly_llm = Replicate(model=\"replicate/dolly-v2-12b:ef0e1aefc61f8e096ebe4db6b2bacc297daf2ef6899f0f7e001ec445893500e5\")\n",
"text2image = Replicate(model=\"stability-ai/stable-diffusion:db21e45d3f7023abc2a46ee38a23973f6dce16bb082a930b0c49861f96d1e5bf\")"
]
},
@@ -265,7 +266,7 @@
" template=\"What is a good name for a company that makes {product}?\",\n",
")\n",
"\n",
"chain = LLMChain(llm=llm, prompt=prompt)"
"chain = LLMChain(llm=dolly_llm, prompt=prompt)"
]
},
{
@@ -285,7 +286,7 @@
" input_variables=[\"company_name\"],\n",
" template=\"Write a description of a logo for this company: {company_name}\",\n",
")\n",
"chain_two = LLMChain(llm=llm, prompt=second_prompt)"
"chain_two = LLMChain(llm=dolly_llm, prompt=second_prompt)"
]
},
{

View File

@@ -8,12 +8,14 @@
"source": [
"# Sentence Transformers Embeddings\n",
"\n",
"Let's generate embeddings using the [SentenceTransformers](https://www.sbert.net/) integration. SentenceTransformers is a python package that can generate text and image embeddings, originating from [Sentence-BERT](https://arxiv.org/abs/1908.10084)"
"[SentenceTransformers](https://www.sbert.net/) embeddings are called using the `HuggingFaceEmbeddings` integration. We have also added an alias for `SentenceTransformerEmbeddings` for users who are more familiar with directly using that package.\n",
"\n",
"SentenceTransformers is a python package that can generate text and image embeddings, originating from [Sentence-BERT](https://arxiv.org/abs/1908.10084)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 1,
"id": "06c9f47d",
"metadata": {},
"outputs": [
@@ -21,10 +23,9 @@
"name": "stdout",
"output_type": "stream",
"text": [
"huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
"To disable this warning, you can either:\n",
"\t- Avoid using `tokenizers` before the fork if possible\n",
"\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n"
"\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.0.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m23.1.1\u001b[0m\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n"
]
}
],
@@ -34,27 +35,28 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 2,
"id": "861521a9",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings import SentenceTransformerEmbeddings "
"from langchain.embeddings import HuggingFaceEmbeddings, SentenceTransformerEmbeddings "
]
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": null,
"id": "ff9be586",
"metadata": {},
"outputs": [],
"source": [
"embeddings = SentenceTransformerEmbeddings(model=\"all-MiniLM-L6-v2\")"
"embeddings = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n",
"# Equivalent to SentenceTransformerEmbeddings(model_name=\"all-MiniLM-L6-v2\")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 4,
"id": "d0a98ae9",
"metadata": {},
"outputs": [],
@@ -64,7 +66,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 5,
"id": "5d6c682b",
"metadata": {},
"outputs": [],
@@ -74,7 +76,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 6,
"id": "bb5e74c0",
"metadata": {},
"outputs": [],
@@ -107,7 +109,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.8.16"
},
"vscode": {
"interpreter": {

View File

@@ -1,22 +1,28 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "6488fdaf",
"metadata": {},
"source": [
"# Chat Prompt Template\n",
"\n",
"Chat Models takes a list of chat messages as input - this list commonly referred to as a prompt.\n",
"Typically this is not simply a hardcoded list of messages but rather a combination of a template, some examples, and user input.\n",
"LangChain provides several classes and functions to make constructing and working with prompts easy.\n"
"[Chat Models](../models/chat.rst) takes a list of chat messages as input - this list commonly referred to as a prompt.\n",
"These chat messages differ from raw string (which you would pass into a [LLM](../models/llms.rst) model) in that every message is associated with a role.\n",
"\n",
"For example, in OpenAI [Chat Completion API](https://platform.openai.com/docs/guides/chat/introduction), a chat message can be associated with the AI, human or system role. The model is supposed to follow instruction from system chat message more closely.\n",
"\n",
"Therefore, LangChain provides several related prompt templates to make constructing and working with prompts easily. You are encouraged to use these chat related prompt templates instead of `PromptTemplate` when querying chat models to fully exploit the potential of underlying chat model.\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 35,
"id": "7647a621",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.prompts import (\n",
@@ -38,16 +44,18 @@
"id": "acb4a2f6",
"metadata": {},
"source": [
"You can make use of templating by using a `MessagePromptTemplate`. You can build a `ChatPromptTemplate` from one or more `MessagePromptTemplates`. You can use `ChatPromptTemplate`'s `format_prompt` -- this returns a `PromptValue`, which you can convert to a string or Message object, depending on whether you want to use the formatted value as input to an llm or chat model.\n",
"To create a message template associated with a role, you use `MessagePromptTemplate`. \n",
"\n",
"For convenience, there is a `from_template` method exposed on the template. If you were to use this template, this is what it would look like:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 36,
"id": "3124f5e9",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"template=\"You are a helpful assistant that translates {input_language} to {output_language}.\"\n",
@@ -57,10 +65,46 @@
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "9c7e2e6f",
"cell_type": "markdown",
"id": "c8b08cda-7c57-4c15-a1e5-80627cfa9cbd",
"metadata": {},
"source": [
"If you wanted to construct the `MessagePromptTemplate` more directly, you could create a PromptTemplate outside and then pass it in, eg:"
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "5a8d249e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"prompt=PromptTemplate(\n",
" template=\"You are a helpful assistant that translates {input_language} to {output_language}.\",\n",
" input_variables=[\"input_language\", \"output_language\"],\n",
")\n",
"system_message_prompt_2 = SystemMessagePromptTemplate(prompt=prompt)\n",
"\n",
"assert system_message_prompt == system_message_prompt_2"
]
},
{
"cell_type": "markdown",
"id": "96836c5c-41f8-4073-95ac-ea1daab2e00e",
"metadata": {},
"source": [
"After that, you can build a `ChatPromptTemplate` from one or more `MessagePromptTemplates`. You can use `ChatPromptTemplate`'s `format_prompt` -- this returns a `PromptValue`, which you can convert to a string or Message object, depending on whether you want to use the formatted value as input to an llm or chat model."
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "9c7e2e6f",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
@@ -69,7 +113,7 @@
" HumanMessage(content='I love programming.', additional_kwargs={})]"
]
},
"execution_count": 3,
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
@@ -82,25 +126,225 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "0dbdf94f",
"id": "0899f681-012e-4687-a754-199a9a396738",
"metadata": {
"tags": []
},
"source": [
"## Format output\n",
"\n",
"The output of the format method is available as string, list of messages and `ChatPromptValue`"
]
},
{
"cell_type": "markdown",
"id": "584166de-0c31-4bc9-bf7a-5b359e7173d8",
"metadata": {},
"source": [
"If you wanted to construct the MessagePromptTemplate more directly, you could create a PromptTemplate outside and then pass it in, eg:"
"As string:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "5a8d249e",
"metadata": {},
"execution_count": 39,
"id": "2f6c7ad1-def5-41dc-a4fe-3731dc8917f9",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'System: You are a helpful assistant that translates English to French.\\nHuman: I love programming.'"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"output = chat_prompt.format(input_language=\"English\", output_language=\"French\", text=\"I love programming.\")\n",
"output"
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "144b3368-43f3-49fa-885f-3f3470e9ab7e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"prompt=PromptTemplate(\n",
" template=\"You are a helpful assistant that translates {input_language} to {output_language}.\",\n",
" input_variables=[\"input_language\", \"output_language\"],\n",
")\n",
"system_message_prompt = SystemMessagePromptTemplate(prompt=prompt)"
"# or alternatively \n",
"output_2 = chat_prompt.format_prompt(input_language=\"English\", output_language=\"French\", text=\"I love programming.\").to_string()\n",
"\n",
"assert output == output_2"
]
},
{
"cell_type": "markdown",
"id": "51970399-c2e1-4c9b-8003-6f5b1236fda8",
"metadata": {},
"source": [
"As `ChatPromptValue`"
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "681ce4a7-d972-4cdf-ac77-ec35182fd352",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"ChatPromptValue(messages=[SystemMessage(content='You are a helpful assistant that translates English to French.', additional_kwargs={}), HumanMessage(content='I love programming.', additional_kwargs={})])"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat_prompt.format_prompt(input_language=\"English\", output_language=\"French\", text=\"I love programming.\")"
]
},
{
"cell_type": "markdown",
"id": "61041810-4418-4406-9c8a-91c9034c9752",
"metadata": {},
"source": [
"As list of Message objects"
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "4ec2f166-1ef9-4071-8c37-fcfbd3f4bc29",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"[SystemMessage(content='You are a helpful assistant that translates English to French.', additional_kwargs={}),\n",
" HumanMessage(content='I love programming.', additional_kwargs={})]"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat_prompt.format_prompt(input_language=\"English\", output_language=\"French\", text=\"I love programming.\").to_messages()"
]
},
{
"cell_type": "markdown",
"id": "73dcd3a2-ad6d-4b7b-ab21-1d9e417f959e",
"metadata": {},
"source": [
"## Different types of `MessagePromptTemplate`\n",
"\n",
"LangChain provides different types of `MessagePromptTemplate`. The most commonly used are `AIMessagePromptTemplate`, `SystemMessagePromptTemplate` and `HumanMessagePromptTemplate`, which create an AI message, system message and human message respectively.\n",
"\n",
"However, in cases where the chat model supports taking chat message with arbitrary role, you can use `ChatMessagePromptTemplate`, which allows user to specify the role name."
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "55a21b1f-9cdc-4072-a41d-695b00dd11e6",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"ChatMessage(content='May the force be with you', additional_kwargs={}, role='Jedi')"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.prompts import ChatMessagePromptTemplate\n",
"\n",
"prompt = \"May the {subject} be with you\"\n",
"\n",
"chat_message_prompt = ChatMessagePromptTemplate.from_template(role=\"Jedi\", template=prompt)\n",
"chat_message_prompt.format(subject=\"force\")"
]
},
{
"cell_type": "markdown",
"id": "cd6bf82b-4709-4683-b215-6b7c468f3347",
"metadata": {},
"source": [
"LangChain also provides `MessagesPlaceholder`, which gives you full control of what messages to be rendered during formatting. This can be useful when you are uncertain of what role you should be using for your message prompt templates or when you wish to insert a list of messages during formatting."
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "9f034650-5e50-4bc3-80f8-5e428ca6444d",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.prompts import MessagesPlaceholder\n",
"\n",
"human_prompt = \"Summarize our conversation so far in {word_count} words.\"\n",
"human_message_template = HumanMessagePromptTemplate.from_template(human_prompt)\n",
"\n",
"chat_prompt = ChatPromptTemplate.from_messages([MessagesPlaceholder(variable_name=\"conversation\"), human_message_template])"
]
},
{
"cell_type": "code",
"execution_count": 47,
"id": "9789e8e7-b3f9-4391-a85c-373e576107b3",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"[HumanMessage(content='What is the best way to learn programming?', additional_kwargs={}),\n",
" AIMessage(content='1. Choose a programming language: Decide on a programming language that you want to learn. \\n\\n2. Start with the basics: Familiarize yourself with the basic programming concepts such as variables, data types and control structures.\\n\\n3. Practice, practice, practice: The best way to learn programming is through hands-on experience', additional_kwargs={}),\n",
" HumanMessage(content='Summarize our conversation so far in 10 words.', additional_kwargs={})]"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"human_message = HumanMessage(content=\"What is the best way to learn programming?\")\n",
"ai_message = AIMessage(content=\"\"\"\\\n",
"1. Choose a programming language: Decide on a programming language that you want to learn. \n",
"\n",
"2. Start with the basics: Familiarize yourself with the basic programming concepts such as variables, data types and control structures.\n",
"\n",
"3. Practice, practice, practice: The best way to learn programming is through hands-on experience\\\n",
"\"\"\")\n",
"\n",
"chat_prompt.format_prompt(conversation=[human_message, ai_message], word_count=\"10\").to_messages()"
]
}
],
@@ -120,7 +364,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.10"
}
},
"nbformat": 4,

Some files were not shown because too many files have changed in this diff Show More