Compare commits

..

197 Commits

Author SHA1 Message Date
Harrison Chase
3a1bdce3f5 bump version to 147 (#3353) 2023-04-22 09:35:03 -07:00
Harrison Chase
a6664be79c Harrison/myscale (#3352)
Co-authored-by: Fangrui Liu <fangruil@moqi.ai>
Co-authored-by: 刘 方瑞 <fangrui.liu@outlook.com>
Co-authored-by: Fangrui.Liu <fangrui.liu@ubc.ca>
2023-04-22 09:17:38 -07:00
Harrison Chase
6200a2a00e Harrison/error hf (#3348)
Co-authored-by: Rui Melo <44201826+rufimelo99@users.noreply.github.com>
2023-04-22 09:06:36 -07:00
Honkware
a5ad1c270f Add ChatGPT Data Loader (#3336)
This pull request adds a ChatGPT document loader to the document loaders
module in `langchain/document_loaders/chatgpt.py`. Additionally, it
includes an example Jupyter notebook in
`docs/modules/indexes/document_loaders/examples/chatgpt_loader.ipynb`
which uses fake sample data based on the original structure of the
`conversations.json` file.

The following files were added/modified:
- `langchain/document_loaders/__init__.py`
- `langchain/document_loaders/chatgpt.py`
- `docs/modules/indexes/document_loaders/examples/chatgpt_loader.ipynb`
-
`docs/modules/indexes/document_loaders/examples/example_data/fake_conversations.json`

This pull request was made in response to the recent release of ChatGPT
data exports by email:
https://help.openai.com/en/articles/7260999-how-do-i-export-my-chatgpt-history
2023-04-22 09:06:24 -07:00
Zander Chase
61d40ba042 Fix Sagemaker Batch Endpoints (#3249)
Add different typing for @evandiewald 's heplful PR

---------

Co-authored-by: Evan Diewald <evandiewald@gmail.com>
2023-04-22 08:49:51 -07:00
Johann-Peter Hartmann
7e79f8c136 Support recursive sitemaps in SitemapLoader (#3146)
A (very) simple addition to support multiple sitemap urls.

---------

Co-authored-by: Johann-Peter Hartmann <johann-peter.hartmann@mayflower.de>
2023-04-22 08:48:04 -07:00
Filip Haltmayer
215dcc2d26 Refactor Milvus/Zilliz (#3047)
Refactoring milvus/zilliz to clean up and have a more consistent
experience.

Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>
2023-04-22 08:26:19 -07:00
Harrison Chase
8191c6b81a Harrison/voice assistant (#3347)
Co-authored-by: Jaden <jaden.lorenc@gmail.com>
2023-04-22 08:25:50 -07:00
Richy Wang
88a8f59aa7 Add a full PostgresSQL syntax database 'AnalyticDB' as vector store. (#3135)
Hi there!
I'm excited to open this PR to add support for using a fully Postgres
syntax compatible database 'AnalyticDB' as a vector.
As AnalyticDB has been proved can be used with AutoGPT,
ChatGPT-Retrieve-Plugin, and LLama-Index, I think it is also good for
you.
AnalyticDB is a distributed Alibaba Cloud-Native vector database. It
works better when data comes to large scale. The PR includes:

- [x]  A new memory: AnalyticDBVector
- [x]  A suite of integration tests verifies the AnalyticDB integration

I have read your [contributing
guidelines](72b7d76d79/.github/CONTRIBUTING.md).
And I have passed the tests below
- [x]  make format
- [x]  make lint
- [x]  make coverage
- [x]  make test
2023-04-22 08:25:41 -07:00
Harrison Chase
cc6fe18152 Harrison/power bi (#3205)
Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
2023-04-22 08:24:48 -07:00
Daniel Chalef
61e09229c8 args_schema type hint on subclassing (#3323)
per https://github.com/hwchase17/langchain/issues/3297

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
2023-04-21 15:51:13 -07:00
Zander Chase
05a8aa5447 Fix linting on master (#3327) 2023-04-21 15:49:46 -07:00
Varun Srinivas
d2f922f525 Change in method name for creating an issue on JIRA (#3307)
The awesome JIRA tool created by @zywilliamli calls the `create_issue()`
method to create issues, however, the actual method is `issue_create()`.

Details in the Documentation here:
https://atlassian-python-api.readthedocs.io/jira.html#manage-issues
2023-04-21 13:01:33 -07:00
Davis Chase
e933be9605 Update docs api references (#3315) 2023-04-21 12:21:33 -07:00
Paul Garner
aa9d5707e0 Add PythonLoader which auto-detects encoding of Python files (#3311)
This PR contributes a `PythonLoader`, which inherits from
`TextLoader` but detects and sets the encoding automatically.
2023-04-21 10:47:57 -07:00
Daniel Chalef
1ecbeec24e Fix example match_documents fn table name, grammar (#3294)
ref
https://github.com/hwchase17/langchain/pull/3100#issuecomment-1517086472

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
2023-04-21 10:21:23 -07:00
Davis Chase
2fd24d31a4 Cleanup integration test dir (#3308) 2023-04-21 09:44:09 -07:00
leo-gan
3bc703b0d6 added links to the important YouTube videos (#3244)
Added links to the important YouTube videos
2023-04-21 01:31:42 -07:00
Sertaç Özercan
1e91266a8a fix: handle youtube TranscriptsDisabled (#3276)
handles error when youtube video has transcripts disabled

```
youtube_transcript_api._errors.TranscriptsDisabled: 
Could not retrieve a transcript for the video https://www.youtube.com/watch?v=<URL> This is most likely caused by:

Subtitles are disabled for this video

If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at https://github.com/jdepoix/youtube-transcript-api/issues. Please add which version of youtube_transcript_api you are using and provide the information needed to replicate the error. Also make sure that there are no open issues which already describe your problem!
```

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
2023-04-21 01:27:42 -07:00
Alexandre Pesant
04e1d6c699 Do not print openai settings (#3280)
There's no reason to print these settings like that, it just pollutes
the logs :)
2023-04-21 01:20:17 -07:00
Zander Chase
a71a2c0eb2 Handle null action in AutoGPT Agent (#3274)
Handle the case where the command is `null`
2023-04-20 23:18:46 -07:00
Harrison Chase
bf78200f55 bump version 146 (#3272) 2023-04-20 22:20:43 -07:00
Harrison Chase
87544d2378 gradio tools (#3255) 2023-04-20 22:09:15 -07:00
Naveen Tatikonda
bb6c459f7a OpenSearch: Add Support for Lucene Filter (#3201)
### Description
Add Support for Lucene Filter. When you specify a Lucene filter for a
k-NN search, the Lucene algorithm decides whether to perform an exact
k-NN search with pre-filtering or an approximate search with modified
post-filtering. This filter is supported only for approximate search
with the indexes that are created using `lucene` engine.

OpenSearch Documentation -
https://opensearch.org/docs/latest/search-plugins/knn/filter-search-knn/#lucene-k-nn-filter-implementation

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
2023-04-20 20:42:53 -07:00
Davis Chase
36720cb57f Hf emb device (#3266)
Make it possible to control the HuggingFaceEmbeddings and HuggingFaceInstructEmbeddings client model kwargs. Additionally, the cache folder was added for HuggingFaceInstructEmbedding as the client inherits from SentenceTransformer (client of HuggingFaceEmbeddings).

It can be useful, especially to control the client device, as it will be defaulted to GPU by sentence_transformers if there is any.

---------

Co-authored-by: Yoann Poupart <66315201+Xmaster6y@users.noreply.github.com>
2023-04-20 20:41:22 -07:00
Zach Jones
d7942a9f19 Fix type annotation for QueryCheckerTool.llm (#3237)
Currently `langchain.tools.sql_database.tool.QueryCheckerTool` has a
field `llm` with type `BaseLLM`. This breaks initialization for some
LLMs. For example, trying to use it with GPT4:

```python
from langchain.sql_database import SQLDatabase
from langchain.chat_models import ChatOpenAI
from langchain.tools.sql_database.tool import QueryCheckerTool


db = SQLDatabase.from_uri("some_db_uri")
llm = ChatOpenAI(model_name="gpt-4")
tool = QueryCheckerTool(db=db, llm=llm)

# pydantic.error_wrappers.ValidationError: 1 validation error for QueryCheckerTool
# llm
#   Can't instantiate abstract class BaseLLM with abstract methods _agenerate, _generate, _llm_type (type=type_error)
```

Seems like much of the rest of the codebase has switched from `BaseLLM`
to `BaseLanguageModel`. This PR makes the change for QueryCheckerTool as
well

Co-authored-by: Zachary Jones <zjones@zetaglobal.com>
2023-04-20 18:50:59 -07:00
Davis Chase
46542dc774 Contextual compression retriever (#2915)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-20 17:01:14 -07:00
Matt Robinson
3943759a90 feat: add loader for rich text files (#3227)
### Summary

Adds a loader for rich text files. Requires `unstructured>=0.5.12`.

### Testing

The following test uses the example RTF file from the [`unstructured`
repo](https://github.com/Unstructured-IO/unstructured/tree/main/example-docs).

```python
from langchain.document_loaders import UnstructuredRTFLoader

loader = UnstructuredRTFLoader("fake-doc.rtf", mode="elements")
docs = loader.load()
docs[0].page_content
```
2023-04-20 15:51:49 -07:00
Harrison Chase
5ef2d1e2a1 add to docs 2023-04-20 15:43:57 -07:00
Harrison Chase
4aedbeaffb Merge branch 'master' of github.com:hwchase17/langchain 2023-04-20 15:43:04 -07:00
Harrison Chase
2dbb5261b5 wikibase agent 2023-04-20 15:37:56 -07:00
Albert Castellana
0684aa081a Ecosystem/Yeager.ai (#3239)
Added yeagerai.md to ecosystem
2023-04-20 15:20:21 -07:00
Boris Feld
0e797a3ff9 Fixing issue link for Comet callback (#3212)
Sorry I fixed that link once but there was still a typo inside, this
time it should be good.
2023-04-20 14:57:41 -07:00
Daniel Chalef
ae528fd06e fix error msg ref to beautifulsoup4 (#3242)
Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
2023-04-20 14:03:32 -07:00
Tom Dyson
7d3e6389f2 Add DuckDB prompt (#3233)
Adds a prompt template for the DuckDB SQL dialect.
2023-04-20 14:02:20 -07:00
Zander Chase
daee0b2b97 Patch Chat History Formatting (#3236)
While we work on solidifying the memory interfaces, handle common chat
history formats.

This may break linting on anyone who has been passing in
`get_chat_history` .

Somewhat handles #3077

Alternative to #3078 that updates the typing
2023-04-20 13:31:30 -07:00
Harrison Chase
8f22949dc4 update nnotebook title 2023-04-20 11:53:23 -07:00
leo-gan
130e4b9fcb fixed a link to the youtube page (#3232)
A link to the `YouTube` page was missing on the `index` page.
2023-04-20 10:47:16 -07:00
Peter Stolz
d54b977d4e Fix docstring of RetrievalQA (#3231)
Structure changed an RetrievalQA now expects BaseRetriever not
VectorStore
2023-04-20 10:46:51 -07:00
Harrison Chase
b7dea80cba bump version to 145 (#3229) 2023-04-20 08:30:38 -07:00
Harrison Chase
b7f2061736 Harrison/google places (#3207)
Co-authored-by: Cao Hoang <65607230+cnhhoang850@users.noreply.github.com>
Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>
2023-04-20 07:57:07 -07:00
Gabriel Altay
34fb56b633 fix copy/pasta typos wikipedia->arxiv (#3222)
just updates a few module level docstrings from Wikipedia -> Arxiv
2023-04-20 07:15:41 -07:00
Harrison Chase
d2520a5f1e Harrison/ddg (#3206)
Co-authored-by: itai <itai.marks@gmail.com>
Co-authored-by: Itai Marks <itaim@users.noreply.github.com>
Co-authored-by: Tianyi Pan <60060750+tipani86@users.noreply.github.com>
Co-authored-by: Tianyi Pan <tianyi.pan@clobotics.com>
Co-authored-by: Adilzhan Ismailov <13088690+aismlv@users.noreply.github.com>
Co-authored-by: Justin Flick <Justinjayflick@gmail.com>
Co-authored-by: Justin Flick <jflick@homesite.com>
2023-04-19 21:32:26 -07:00
Harrison Chase
36c10f8a52 nits (#3203) 2023-04-19 21:14:46 -07:00
Daniel Chalef
27cdf8d675 supabase vectorstore - first cut (#3100)
First cut of a supabase vectorstore loosely patterned on the langchainjs
equivalent. Doesn't support async operations which is a limitation of
the supabase python client.

---------

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
2023-04-19 21:06:44 -07:00
Harrison Chase
9a0356d276 Harrison/file chat history (#3198)
Co-authored-by: Young Lee <joybro201@gmail.com>
2023-04-19 21:05:20 -07:00
Kazon Wilson
a66cab8b71 Add new line to refine prompt tmpl (#3197)
Adding a new line to fix issue #3117
2023-04-19 21:04:52 -07:00
Harrison Chase
96809b5794 Harrison/discord loader (#3200)
Co-authored-by: Rajtilak Bhattacharjee <rajtilak.blog@gmail.com>
2023-04-19 21:04:12 -07:00
Justin Flick
8faef1a91a Confluence DL retry/backoff (#3168)
Implemented a retry/backoff logic in response to #2473

---------

Co-authored-by: Justin Flick <jflick@homesite.com>
2023-04-19 20:50:39 -07:00
Adilzhan Ismailov
c03a65c6dc Fix from_embeddings method examples (#3174)
Fix examples for `from_embeddings` method for annoy and faiss
vectorstores
2023-04-19 20:49:33 -07:00
Harrison Chase
f19b3890c9 Harrison/site map tqdm (#3184)
Co-authored-by: Tianyi Pan <60060750+tipani86@users.noreply.github.com>
Co-authored-by: Tianyi Pan <tianyi.pan@clobotics.com>
2023-04-19 20:48:47 -07:00
Harrison Chase
e55db5841a Harrison/svm speedup (#3195)
Co-authored-by: Lance Martin <122662504+PineappleExpress808@users.noreply.github.com>
2023-04-19 20:14:01 -07:00
obbiondo
d6b2f2b9bd add ConfluenceLoader to document_loaders init (#3143)
Fix ConfluenceLoader import

Co-authored-by: Andrea Biondo <a.biondo@reply.it>
2023-04-19 20:05:31 -07:00
Zander Chase
c757c3cde4 Add HuggingFace Examples (#3187)
Add a Pipeline example and add other models in th ehub notebook

To close issue
[#3077](https://github.com/hwchase17/langchain/issues/3099)
2023-04-19 17:08:10 -07:00
Donald "Max" Ziff
6adf2d1c39 first draft (#2690)
There is a long way to go on this!

---------

Co-authored-by: Max Ziff <max.ziff@concur.com>
2023-04-19 17:06:55 -07:00
Harrison Chase
9181cd9b22 Harrison/playwright selector (#3185)
Co-authored-by: zhyuri <4649294+zhyuri@users.noreply.github.com>
2023-04-19 16:54:15 -07:00
Harrison Chase
68cd37175e Harrison/arxiv tool (#3186)
Co-authored-by: leo-gan <leo.gan.57@gmail.com>
2023-04-19 16:53:34 -07:00
Tunay Okumus
6e48107734 fix: separate model and deployment for OpenAIEmbeddings (#3076)
Separated the deployment from model to support Azure OpenAI Embeddings
properly.
Also removed the deprecated document_model_name and query_model_name
attributes.
2023-04-19 16:49:18 -07:00
Zander Chase
4adfd790f0 Update File Management Tools to Include Root Directory (#3112)
- Permit the specification of a `root_dir` to the read/write file tools
to specify a working directory
- Add validation for attempts to read/write outside the directory (e.g.,
through `../../` or symlinks or `/abs/path`'s that don't lie in the
correct path)
- Add some tests for all


One question is whether we should make a default root directory for
these? tradeoffs either way
2023-04-19 16:46:10 -07:00
John-David Wuarin
a63bfb6c9f fix: kwargs.pop("redis_url") KeyError: 'redis_url' (#3121)
This occurred when redis_url was not passed as a parameter even though a
REDIS_URL env variable was present.
This occurred for all methods that eventually called any of:
(from_texts, drop_index, from_existing_index) - i.e. virtually all
methods in the class.
This fixes it
2023-04-19 16:44:39 -07:00
engkheng
dbbc340f25 Validate input_variables when using jinja2 templates (#3140)
`langchain.prompts.PromptTemplate` and
`langchain.prompts.FewShotPromptTemplate` do not validate
`input_variables` when initialized as `jinja2` template.

```python
# Using langchain v0.0.144
template = """"\
Your variable: {{ foo }}
{% if bar %}
You just set bar boolean variable to true
{% endif %}
"""

# Missing variable, should raise ValueError
prompt_template = PromptTemplate(template=template, 
                                 input_variables=["bar"], 
                                 template_format="jinja2", 
                                 validate_template=True)

# Extra variable, should raise ValueError
prompt_template = PromptTemplate(template=template, 
                                 input_variables=["bar", "foo", "extra", "thing"], 
                                 template_format="jinja2", 
                                 validate_template=True)
```
2023-04-19 16:18:32 -07:00
Matt Robinson
3e0c44bae8 enhancement: support headers for non-html urls (#3166)
### Summary

Updates the `UnstructuredURLLoader` to support passing in headers for
non HTML content types. While this update maintains backward
compatibility with older versions of `unstructured`, we strongly
recommended upgrading to `unstructured>=0.5.13` if you are using the
`UnstructuredURLLoader`.

### Testing

#### With headers

```python
from langchain.document_loaders import UnstructuredURLLoader

urls = ["https://www.understandingwar.org/sites/default/files/Russian%20Offensive%20Campaign%20Assessment%2C%20April%2011%2C%202023.pdf"]

loader = UnstructuredURLLoader(urls=urls, headers={"Accept": "application/json"}, strategy="fast")
docs = loader.load()
print(docs[0].page_content[:1000])
```

#### Without headers

```python
from langchain.document_loaders import UnstructuredURLLoader

urls = ["https://www.understandingwar.org/sites/default/files/Russian%20Offensive%20Campaign%20Assessment%2C%20April%2011%2C%202023.pdf"]

loader = UnstructuredURLLoader(urls=urls, strategy="fast")
docs = loader.load()
print(docs[0].page_content[:1000])
```

---------

Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>
2023-04-19 16:16:24 -07:00
Pranabendra Prasad Chandra
7b1f0656b8 Fix typo in ElasticSearch sample notebook (#3171)
Added missing parenthesis in example notebook
[elasticsearch.ipynb](https://github.com/hwchase17/langchain/blob/master/docs/modules/indexes/vectorstores/examples/elasticsearch.ipynb)
2023-04-19 16:06:31 -07:00
Davis Chase
10e4b32ecb Add document transformer abstraction (#3182)
Add DocumentTransformer abstraction so that in #2915 we don't have to
wrap TextSplitter and RedundantEmbeddingFilter (neither of which uses
the query) in the contextual doc compression abstractions. with this
change, doc filter (doc extractor, whatever we call it) would look
something like
```python
class BaseDocumentFilter(BaseDocumentTransformer[_RetrievedDocument], ABC):
  
  @abstractmethod
  def filter(self, documents: List[_RetrievedDocument], query: str) -> List[_RetrievedDocument]:
    ...
  
  def transform_documents(self, documents: List[_RetrievedDocument], query: Optional[str] = None, **kwargs: Any) -> List[_RetrievedDocument]:
    if query is None:
      raise ValueError("Must pass in non-null query to DocumentFilter")
    return self.filter(documents, query)
```
2023-04-19 16:05:05 -07:00
Zander Chase
74342ab209 Update the marathon notebook (#3183)
There were some steps that didn't make sense. Update now. This time it
produced a nice markdown formatted table too
2023-04-19 16:03:21 -07:00
leo-gan
a78f55b851 Additional resources - YouTube (#3180)
Added links to the YouTube tutorials and videos in the `youtube.md`. 
Added link to the ^ in `index.rst`.
2023-04-19 15:16:29 -07:00
det-sys
26c8cd1ea2 Update gallery.rst (#3176)
Add https://anysummary.app to the gallery
2023-04-19 15:06:59 -07:00
Happydog
5e66d05928 Fix: typo in custom_mrkl_agents.ipynb document (#3159)
I have noticed a typo error in the `custom_mrkl_agents.ipynb` document
while trying the example from the documentation page. As a result, I
have opened a pull request (PR) to address this minor issue, even though
it may seem insignificant 😂.
2023-04-19 14:57:33 -07:00
Harrison Chase
99b1983461 add example 2023-04-19 14:35:24 -07:00
Zander Chase
89c63cf8a6 Add Marathon Notebook (#3163)
Add an example using autogpt to get the boston marathon winning times

Add a web browser + summarization tool in the notebook
2023-04-19 11:23:08 -07:00
Dariel Dato-on
0b542661b4 Prevent kwargs from being overwritten (#3158)
Fixes #3157. Prevents `kwargs` from being overwritten by
`_to_args_and_kwargs()` and sending the wrong `kwargs` in line 109.
2023-04-19 09:00:10 -07:00
Quentin Pleplé
126d7f11dd Fix notebook example (#3142)
The following calls were throwing an exception:


575b717d10/docs/use_cases/evaluation/agent_vectordb_sota_pg.ipynb (L192)


575b717d10/docs/use_cases/evaluation/agent_vectordb_sota_pg.ipynb (L239)

Exception:

```
---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
Cell In[14], line 1
----> 1 chain_sota = RetrievalQA.from_chain_type(llm=OpenAI(temperature=0), chain_type="stuff", retriever=vectorstore_sota, input_key="question")

File ~/github/langchain/venv/lib/python3.9/site-packages/langchain/chains/retrieval_qa/base.py:89, in BaseRetrievalQA.from_chain_type(cls, llm, chain_type, chain_type_kwargs, **kwargs)
     85 _chain_type_kwargs = chain_type_kwargs or {}
     86 combine_documents_chain = load_qa_chain(
     87     llm, chain_type=chain_type, **_chain_type_kwargs
     88 )
---> 89 return cls(combine_documents_chain=combine_documents_chain, **kwargs)

File ~/github/langchain/venv/lib/python3.9/site-packages/pydantic/main.py:341, in pydantic.main.BaseModel.__init__()

ValidationError: 1 validation error for RetrievalQA
retriever
  instance of BaseRetriever expected (type=type_error.arbitrary_type; expected_arbitrary_type=BaseRetriever)
```

The vectorstores had to be converted to retrievers:
`vectorstore_sota.as_retriever()` and `vectorstore_pg.as_retriever()`.

The PR also:
- adds the file `paul_graham_essay.txt` referenced by this notebook
- adds to gitignore *.pkl and *.bin files that are generated by this
notebook

Interestingly enough, the performance of the prediction greatly
increased (new version of langchain or ne version of OpenAI models since
the last run of the notebook): from 19/33 correct to 28/33 correct!
2023-04-19 08:55:06 -07:00
Jakub Kukul
599e17cea8 Working example for Anthropic (#3151)
would be great if the provided example worked out of the box 😄
2023-04-19 08:52:33 -07:00
Harrison Chase
575b717d10 bump version to 144 (#3136) 2023-04-18 23:29:23 -07:00
ProxyCausal
72b7d76d79 Print exception type for Python tool (#3126)
Useful for debugging agents e.g. KeyError in addition to just printing
the missing key
2023-04-18 22:45:06 -07:00
Harrison Chase
b7dc04c086 fix links 2023-04-18 22:44:53 -07:00
Zander Chase
8a050ba4bf Notebook Nit (#3125)
The required arg is `question` not `query`
2023-04-18 22:43:52 -07:00
Harrison Chase
364257d967 agent docs fixes (#3128) 2023-04-18 21:54:30 -07:00
Zander Chase
f329196cf4 Agents 4 18 (#3122)
Creating an experimental agents folder, containing BabyAGI, AutoGPT, and
later, other examples

---------

Co-authored-by: Rahul Behal <rahulbehal01@hotmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-18 21:41:03 -07:00
engkheng
8e386613ac Import jinja2 only when used (#3123)
Addressing #3113
2023-04-18 21:23:03 -07:00
Zander Chase
90ef705ced Update Tool Input (#3103)
- Remove dynamic model creation in the `args()` property. _Only infer
for the decorator (and add an argument to NOT infer if someone wishes to
only pass as a string)_
- Update the validation example to make it less likely to be
misinterpreted as a "safe" way to run a repl


There is one example of "Multi-argument tools" in the custom_tools.ipynb
from yesterday, but we could add more. The output parsing for the base
MRKL agent hasn't been adapted to handle structured args at this point
in time

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-18 18:18:33 -07:00
Francesco
19116010ee Add exeption for when version metadata cannot be found for package (#3107)
Solves #3097

Already ran tests and lint.
2023-04-18 16:44:40 -07:00
Carmen Sam
d54c88aa21 Add allowed and disallowed special arguments to BaseOpenAI (#3012)
## Background
This PR fixes this error when there are special tokens when querying the
chain:
```
Encountered text corresponding to disallowed special token '<|endofprompt|>'.
If you want this text to be encoded as a special token, pass it to `allowed_special`, e.g. `allowed_special={'<|endofprompt|>', ...}`.
If you want this text to be encoded as normal text, disable the check for this token by passing `disallowed_special=(enc.special_tokens_set - {'<|endofprompt|>'})`.
To disable this check for all special tokens, pass `disallowed_special=()`.
```

Refer to the code snippet below, it breaks in the chain line.
```
        chain = ConversationalRetrievalChain.from_llm(
            ChatOpenAI(openai_api_key=OPENAI_API_KEY),
            retriever=vectorstore.as_retriever(),
            qa_prompt=prompt,
            condense_question_prompt=condense_prompt,
        )
        answer = chain({"question": f"{question}"})
```
However `ChatOpenAI` class is not accepting `allowed_special` and
`disallowed_special` at the moment so they cannot be passed to the
`encode()` in `get_num_tokens` method to avoid the errors.


## Change
- Add `allowed_special` and `disallowed_special` attributes to
`BaseOpenAI` class.
- Pass in `allowed_special` and `disallowed_special` as arguments of
`encode()` in tiktoken.

---------

Co-authored-by: samcarmen <“carmen.samkahman@gmail.com”>
2023-04-18 09:34:08 -07:00
Harrison Chase
9d23cfc7dd bump version to 143 (#3095) 2023-04-18 09:12:57 -07:00
Harrison Chase
aad0a498ac Harrison/output error (#3094)
Co-authored-by: yummydum <sumita@nowcast.co.jp>
2023-04-18 08:59:56 -07:00
Harrison Chase
1c1b77bbfe Harrison/discord (#3092)
Co-authored-by: Rajtilak Bhattacharjee <rajtilak.blog@gmail.com>
2023-04-18 08:19:23 -07:00
Boris Feld
14e4d30659 Comet ml updates 17 04 2023 (#3074)
I made a couple of improvements to the Comet tracker:

* The Comet project name is configurable in various ways (code,
environment variable or file), having a default value in code meant that
users couldn't set the project name in an environment variable or in a
file.
* I added error catching when the `flush_tracker` is called in order to
avoid crashing the whole process. Instead we are gonna display a warning
or error log message (`extra={"show_traceback": True}` is an internal
convention to force the display of the traceback when using our own
logger).

I decided to add the error catching after seeing the following error in
the third example of the notebook:
```
COMET ERROR: Failed to export agent or LLM to Comet
Traceback (most recent call last):
  File "/home/lothiraldan/project/cometml/langchain/langchain/callbacks/comet_ml_callback.py", line 484, in _log_model
    langchain_asset.save(langchain_asset_path)
  File "/home/lothiraldan/project/cometml/langchain/langchain/agents/agent.py", line 591, in save
    raise ValueError(
ValueError: Saving not supported for agent executors. If you are trying to save the agent, please use the `.save_agent(...)`

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/lothiraldan/project/cometml/langchain/langchain/callbacks/comet_ml_callback.py", line 449, in flush_tracker
    self._log_model(langchain_asset)
  File "/home/lothiraldan/project/cometml/langchain/langchain/callbacks/comet_ml_callback.py", line 488, in _log_model
    langchain_asset.save_agent(langchain_asset_path)
  File "/home/lothiraldan/project/cometml/langchain/langchain/agents/agent.py", line 599, in save_agent
    return self.agent.save(file_path)
  File "/home/lothiraldan/project/cometml/langchain/langchain/agents/agent.py", line 145, in save
    agent_dict = self.dict()
  File "/home/lothiraldan/project/cometml/langchain/langchain/agents/agent.py", line 119, in dict
    _dict = super().dict()
  File "pydantic/main.py", line 449, in pydantic.main.BaseModel.dict
  File "pydantic/main.py", line 868, in _iter
  File "pydantic/main.py", line 743, in pydantic.main.BaseModel._get_value
  File "/home/lothiraldan/project/cometml/langchain/langchain/schema.py", line 381, in dict
    output_parser_dict["_type"] = self._type
  File "/home/lothiraldan/project/cometml/langchain/langchain/schema.py", line 376, in _type
    raise NotImplementedError
NotImplementedError
```

I still need to investigate and try to fix it, it looks related to
saving an agent to a file.
2023-04-18 07:32:29 -07:00
engkheng
fe68051d34 Fix typo in docs/reference.rst (#3081)
fix typo
2023-04-18 07:31:00 -07:00
Azam Iftikhar
188e9b9beb Allowing HuggingFaceEmbeddings from the cached weight (#3084)
### https://github.com/hwchase17/langchain/issues/3079
Allow initializing HuggingFaceEmbeddings from the cached weight
2023-04-18 07:30:35 -07:00
Roma
55f6f80a59 fix typo (#3085) 2023-04-18 07:29:33 -07:00
TysBradford
7dae39b57d slightly clearer docs (#3088)
Took me a second to realise the examples required to manually print the
output of the conversation predict. This might make it clearer for
others
2023-04-18 07:28:29 -07:00
James O'Dwyer
0257829776 Bump Metal to use index_id (#3089)
## Use `index_id` over `app_id`
We made a major update to index + retrieve based on Metal Indexes
(instead of apps). With this change, we accept an index instead of an
app in each of our respective core apis. [More details
here](https://docs.getmetal.io/api-reference/core/indexing).
2023-04-18 07:28:13 -07:00
Hamza Kyamanywa
064a1db2b2 [Documentation] Show how to initiate pinecone from an existing index (#3070)
## What is this PR for:
* This PR adds a commented line of code in the documentation that shows
how someone can use the Pinecone client with an already existing
Pinecone index
* The documentation currently only shows how to create a pinecone index
from langchain documents but not how to load one that already exists
2023-04-18 07:27:46 -07:00
Harrison Chase
894c272a56 tool validation logic 2023-04-17 21:59:32 -07:00
Harrison Chase
1920536d99 Harrison/obsidian (#3060)
Co-authored-by: Ben Hofferber <hofferber.ben@gmail.com>
2023-04-17 21:57:32 -07:00
Zander Chase
93c0514105 Add Twitter Tweet Loader (#3050)
Reformatted version of #3022

---------

Co-authored-by: LiaoKong <568250549@qq.com>
2023-04-17 21:44:54 -07:00
__Jay__
2984ad3964 updated llm response parsing action (#3058)
Sometimes the LLM response (generated code) tends to miss the ending
ticks "```". Therefore causing the text parsing to fail due to not
enough values to unpack.

The 2 extra `_` don't add value and can cause errors. Suggest to simply
update the `_, action, _` to just `action` then with index.

Fixes issue #3057
2023-04-17 21:42:13 -07:00
Harrison Chase
db968284f8 tools refactor (#2961)
Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>
2023-04-17 21:35:29 -07:00
Sebastian
7a8c935b90 Edited for better readability (#3059)
It looks like some dropdown functionality was intended, but it caused
the markdown code to glitch which hurt readability.
2023-04-17 21:34:57 -07:00
Matthieu
822cdb161b Adding shared chromaDB client option (#2886)
This pull request addresses the need to share a single `chromadb.Client`
instance across multiple instances of the `Chroma` class. By
implementing a shared client, we can maintain consistency and reduce
resource usage when multiple instances of the `Chroma` classes are
created. This is especially relevant in a web app, where having multiple
`Chroma` instances with a `persist_directory` leads to these clients not
being synced.

This PR implements this option while keeping the rest of the
architecture unchanged.

**Changes:**
1. Add a client attribute to the `Chroma` class to store the shared
`chromadb.Client` instance.
2. Modify the `from_documents` method to accept an optional client
parameter.
3. Update the `from_documents` method to use the shared client if
provided or create a new client if not provided.

Let me know if anything needs to be modified - thanks again for your
work on this incredible repo
2023-04-17 21:22:39 -07:00
Harrison Chase
b140d366e3 Harrison/jira (#3055)
Co-authored-by: William Li <32046231+zywilliamli@users.noreply.github.com>
Co-authored-by: William Li <twelvehertz@Williams-MacBook-Air.local>
2023-04-17 21:14:40 -07:00
Amir Karimi
ae7ed31386 Fix redundancy check about config_type in AGENT_TO_CLASS (#2934)
Fix of issue #2874
2023-04-17 21:05:48 -07:00
J Wynia
b40f90ea04 Spelling to correct conservation to conservation (#3049)
Issue #3048 corrected spelling
2023-04-17 21:03:03 -07:00
leo-gan
c33883a40e fixed the Cohere example title (#3053)
- fixed the Cohere example title (bug in #3041, sorry for it)
- fixed the runhouse.ipynb file name inconsistency
2023-04-17 21:02:52 -07:00
Harrison Chase
5107fac656 Harrison/rec gd (#3054)
Co-authored-by: Benjamin Scholtz <BenSchZA@users.noreply.github.com>
2023-04-17 21:02:35 -07:00
Harrison Chase
eee2f23a79 Harrison/qa eg (#3052)
Co-authored-by: Sukhpal Saini <bdcorps@users.noreply.github.com>
2023-04-17 20:56:42 -07:00
Harrison Chase
db7106cb79 Harrison/image caption loader (#3051)
Co-authored-by: Sean Saito <saitosean@ymail.com>
2023-04-17 20:49:10 -07:00
Benjamin Scholtz
36138f28c8 Add GoogleSQL prompt (#2992)
This PR extends upon @jzluo 's PR #2748 which addressed dialect-specific
issues with SQL prompts, and adds a prompt that uses backticks for
column names when querying BigQuery. See [GoogleSQL quoted
identifiers](https://cloud.google.com/bigquery/docs/reference/standard-sql/lexical#quoted_identifiers).

Additionally, the SQL agent currently uses a generic prompt. Not sure
how best to adopt the same optional dialect-specific prompts as above,
but will consider making an issue and PR for that too. See
[langchain/agents/agent_toolkits/sql/prompt.py](langchain/agents/agent_toolkits/sql/prompt.py).
2023-04-17 20:44:54 -07:00
Naveen Tatikonda
bb619cd535 Pass kwargs to get OpenSearch client from_texts (#2993)
### Description
Pass kwargs to get OpenSearch client from `from_texts` function

### Issues Resolved
https://github.com/hwchase17/langchain/issues/2819

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
2023-04-17 20:44:30 -07:00
Harutaka Kawamura
ba9cc230fa Stringify AgentType before saving to yaml (#2998)
Code to reproduce the issue (with `langchain==0.0.141`):

```python
from langchain.agents import initialize_agent, load_tools
from langchain.llms import OpenAI

llm = OpenAI(temperature=0.9, verbose=True)
tools = load_tools(["llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
agent.save_agent("agent.yaml")
with open("agent.yaml") as f:
    print(f.read())
```

Output:

```
_type: !!python/object/apply:langchain.agents.agent_types.AgentType
- zero-shot-react-description
allowed_tools:
- Calculator
...
```

I expected `_type` to be `zero-shot-react-description` but it's actually
not. This PR fixes it by stringifying `AgentType` (`Enum`).

Signed-off-by: harupy <hkawamura0130@gmail.com>
2023-04-17 20:43:39 -07:00
Nuno Campos
e25528c4f0 Fix incorrect value of outputKeys on AnalyzeDocumentsChain (#3010) 2023-04-17 20:32:46 -07:00
engkheng
19febc77d6 Support inference of input_variables from jinja2 template (#3013)
`langchain.prompts.PromptTemplate` is unable to infer `input_variables`
from jinja2 template.

```python
# Using langchain v0.0.141
template_string = """\
Hello world
Your variable: {{ var }}
{# This will not get rendered #}

{% if verbose %}
Congrats! You just turned on verbose mode and got extra messages!
{% endif %}
"""

template = PromptTemplate.from_template(template_string, template_format="jinja2")
print(template.input_variables) # Output ['# This will not get rendered #', '% endif %', '% if verbose %']
```

---------

Co-authored-by: engkheng <ongengkheng929@example.com>
2023-04-17 20:31:03 -07:00
Nuno Campos
dac32c59e5 Nc/combining output parser (#3014)
Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>
2023-04-17 20:29:53 -07:00
Nuno Campos
79bb5c4f95 Port format instructions fix from js (#3015) 2023-04-17 20:29:17 -07:00
Harrison Chase
e3cf00b88b redis from url (#3024) 2023-04-17 20:28:12 -07:00
Davis Chase
19c85aa990 Factor out doc formatting and add validation (#3026)
@cnhhoang850 slightly more generic fix for #2944, works for whatever the
expected metadata keys are not just `source`
2023-04-17 20:28:01 -07:00
Naveen Tatikonda
3453b7457c OpenSearch: Add Support for Boolean Filter with ANN search (#3038)
### Description
Add Support for Boolean Filter with ANN search
Documentation -
https://opensearch.org/docs/latest/search-plugins/knn/filter-search-knn/#boolean-filter-with-ann-search

### Issues Resolved
https://github.com/hwchase17/langchain/issues/2924

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
2023-04-17 20:26:26 -07:00
leo-gan
5420a0e404 updated langchain/docs/modules/models/llms/integrations/ notebooks (#3041)
- Updated `langchain/docs/modules/models/llms/integrations/` notebooks:
added links to the original sites, the install information, etc.
- Added the `nlpcloud` notebook.
- Removed "Example" from Titles of some notebooks, so all notebook
titles are consistent.
2023-04-17 20:25:32 -07:00
Azam Iftikhar
471ef84835 Examples fixed (#3042)
### https://github.com/hwchase17/langchain/issues/2997

Replaced `conversation.memory.store` to
`conversation.memory.entity_store.store`
As conversation.memory.store doesn't exist  and re-ran  the whole file.
2023-04-17 20:25:01 -07:00
Tim Asp
dcdcd3f636 bugfix: throw exception if structured output parser doesn't get what it wants (#3044)
allows the user to catch the issue and handle it rather than failing
hard.

This happens more than you'd expect when using output parsers with
chatgpt, especially if the temp is anything but 0. Sometimes it doesn't
want to listen and just does its own thing.
2023-04-17 20:24:40 -07:00
Harrison Chase
afd3e70ae5 Harrison/confluent loader (#2994)
Co-authored-by: Justin Flick <Justinjayflick@gmail.com>
2023-04-17 20:23:45 -07:00
Altay Sansal
95d578d246 Fix type hint regression (#3033)
Not sure what happened here but some of the file got overwritten by
#2859 which broke filtering logic.

Here is it fixed back to normal.

@hwchase17 can we expedite this if possible :-)

---------

Co-authored-by: Altay Sansal <altay.sansal@tgs.com>
2023-04-17 15:49:18 -07:00
Noah Gundotra
577ec92f16 Include testing instructions for getting setup in CONTRIBUTING.md (#3020)
Running tests is good sanity check for new users to ensure their
development environment is setup correctly.
2023-04-17 08:34:07 -07:00
Harrison Chase
98c70bc190 bump version to 142 (#3021) 2023-04-17 08:00:00 -07:00
vowelparrot
2356447323 Update Characters notebook (#3019)
- Most important - fixes the relevance_fn name in the notebook to align
with the docs

- Updates comments for the summary:
<img width="787" alt="image"
src="https://user-images.githubusercontent.com/130414180/232520616-2a99e8c3-a821-40c2-a0d5-3f3ea196c9bb.png">

- The new conversation is a bit better, still unfortunate they try to
schedule a followup.
- Rm the max dialogue turns argument to the conversation function
2023-04-17 07:48:48 -07:00
Harrison Chase
f1d15b4a75 update nb 2023-04-16 22:09:31 -07:00
Harrison Chase
e54f1b69ca add notebook 2023-04-16 21:54:15 -07:00
vowelparrot
99c0382209 Generative Characters (#2859)
Add a time-weighted memory retriever and a notebook that approximates a
Generative Agent from https://arxiv.org/pdf/2304.03442.pdf


The "daily plan" components are removed for now since they are less
useful without a virtual world, but the memory is an interesting
component to build off.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-16 21:41:00 -07:00
Jan Backes
a9310a3e8b Add Annoy as VectorStore (#2939)
Adds Annoy (https://github.com/spotify/annoy) as vector Store. 

RESOLVES hwchase17/langchain#2842

discord ref:
https://discord.com/channels/1038097195422978059/1051632794427723827/1096089994168377354

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>
2023-04-16 13:44:04 -07:00
Harrison Chase
e12e00df12 use output parsers in agents (#2987) 2023-04-16 13:15:21 -07:00
cs0lar
8b9e02da9d Fix/issue 1213 (#2932)
### Background

Continuing to implement all the interface methods defined by the
`VectorStore` class. This PR pertains to implementation of the
`max_marginal_relevance_search` method.

### Changes

- a `max_marginal_relevance_search` method implementation has been added
in `weaviate.py`
- tests have been added to the the new method
- vcr cassettes have been added for the weaviate tests

### Test Plan

Added tests for the `max_marginal_relevance_search` implementation

### Change Safety

- [x] I have added tests to cover my changes
2023-04-16 13:11:30 -07:00
Harrison Chase
4c02f4bc30 Fix bug in svm.LinearSVC, add support for a relevancy_threshold (#2959) (#2981)
- Modify SVMRetriever class to add an optional relevancy_threshold
- Modify SVMRetriever.get_relevant_documents method to filter out
documents with similarity scores below the relevancy threshold
- Normalized the similarities to be between 0 and 1 so the
relevancy_threshold makes more sense
- The number of results are limited to the top k documents or the
maximum number of relevant documents above the threshold, whichever is
smaller

This code will now return the top self.k results (or less, if there are
not enough results that meet the self.relevancy_threshold criteria).

The svm.LinearSVC implementation in scikit-learn is non-deterministic,
which means
SVMRetriever.from_texts(["bar", "world", "foo", "hello", "foo bar"])
could return [3 0 5 4 2 1] instead of [0 3 5 4 2 1] with a query of
"foo".
If you pass in multiple "foo" texts, the order could be different each
time. Here, we only care if the 0 is the first element, otherwise it
will offset the text and similarities.


Example:
```python
retriever = SVMRetriever.from_texts(
  ["foo", "bar", "world", "hello", "foo bar"],
  OpenAIEmbeddings(),
  k=4,
  relevancy_threshold=.25
)

result = retriever.get_relevant_documents("foo")
```
yields
```python
[Document(page_content='foo', metadata={}), Document(page_content='foo bar', metadata={})]
```

---------

Co-authored-by: Brandon Sandoval <52767641+account00001@users.noreply.github.com>
2023-04-16 12:57:18 -07:00
Mauricio Scheffer
7302787a7b Fix docs for parse_with_prompt (#2986) 2023-04-16 12:57:04 -07:00
Paul Garner
69698be3e6 consistently use getLogger(__name__), no root logger (#2989)
re
https://github.com/hwchase17/langchain/issues/439#issuecomment-1510442791

I think it's not polite for a library to use the root logger

both of these forms are also used:
```
logger = logging.getLogger(__name__)
logger = logging.getLogger(__file__)
```
I am not sure if there is any reason behind one vs the other? (...I am
guessing maybe just contributed by different people)

it seems to me it'd be better to consistently use
`logging.getLogger(__name__)`

this makes it easier for consumers of the library to set up log
handlers, e.g. for everything with `langchain.` prefix
2023-04-16 12:49:35 -07:00
Harrison Chase
32db2a2c2f fix lint 2023-04-16 10:56:19 -07:00
Azam Iftikhar
1e655d5ffd Fixed Regular expression (#2933)
###  https://github.com/hwchase17/langchain/issues/2898
Instead of `"Action" and "Action Input"` keywords, we are getting
`"Action 1" and "Action 1 Input" or "Action Input 1" ` from
**gpt-3.5-turbo**

 Updated the Regular expression to handle all these cases
 
Attaching the screenshot of the result from the updated Regular
expression.
 
<img width="1036" alt="Screenshot 2023-04-16 at 1 39 00 AM"
src="https://user-images.githubusercontent.com/55012400/232251184-23ca6cc2-7229-411a-b6e1-53b2f5ec18a5.png">
2023-04-16 09:16:50 -07:00
Harrison Chase
88d3ce12b8 Harrison/diffbot (#2984)
Co-authored-by: Manuel Saelices <msaelices@gmail.com>
2023-04-16 09:11:24 -07:00
vowelparrot
5ca7ce77cd Remove pythonrepl from LLM-MathChain (#2943)
Use numexpr evaluate instead of the python REPL to avoid malicious code
injection.

Tested against the (limited) math dataset and got the same score as
before.

For more permissive tools (like the REPL tool itself), other approaches
ought to be provided (some combination of Sanitizer + Restricted python
+ unprivileged-docker + ...), but for a calculator tool, only
mathematical expressions should be permitted.

See https://github.com/hwchase17/langchain/issues/814
2023-04-16 08:50:32 -07:00
Daniel Nouri
2a0f65f7af tiktoken: Relax Python version check (#2966)
tiktoken supports Python >= 3.8, see here:

e1c661edf3/pyproject.toml (L10)

Also works fine when trying locally!
2023-04-16 08:44:21 -07:00
Chetanya Rastogi
aead062a70 Add an example tutorial for using PDFMinerPDFasHTMLLoader (#2960)
Last week I added the `PDFMinerPDFasHTMLLoader`. I am adding some
example code in the notebook to serve as a tutorial for how that loader
can be used to create snippets of a pdf that are structured within
sections. All the other loaders only provide the `Document` objects
segmented by pages but that's pretty loose given the amount of other
metadata that can be extracted.

With the new loader, one can leverage font-size of the text to decide
when a new sections starts and can segment the text more semantically as
shown in the tutorial notebook. The cell shows that we are able to find
the content of entire section under **Related Work** for the example pdf
which is spread across 2 pages and hence is stored as two separate
documents by other loaders
2023-04-16 08:34:39 -07:00
Tim Asp
51894ddd98 allow tokentextsplitters to use model name to select encoder (#2963)
Fixes a bug I was seeing when the `TokenTextSplitter` was correctly
splitting text under the gpt3.5-turbo token limit, but when firing the
prompt off too openai, it'd come back with an error that we were over
the context limit.

gpt3.5-turbo and gpt-4 use `cl100k_base` tokenizer, and so the counts
are just always off with the default `gpt-2` encoder.

It's possible to pass along the encoding to the `TokenTextSplitter`, but
it's much simpler to pass the model name of the LLM. No more concern
about keeping the tokenizer and llm model in sync :)
2023-04-16 08:33:47 -07:00
Alex Iribarren
706ebd8f9c Enforce maximum Wikipedia query length (#2969)
I got the following stacktrace when the agent was trying to search
Wikipedia with a huge query:

```
Thought:{
    "action": "Wikipedia",
    "action_input": "Outstanding is a song originally performed by the Gap Band and written by member Raymond Calhoun. The song originally appeared on the group's platinum-selling 1982 album Gap Band IV. It is one of their signature songs and biggest hits, reaching the number one spot on the U.S. R&B Singles Chart in February 1983.  \"Outstanding\" peaked at number 51 on the Billboard Hot 100."
}
Traceback (most recent call last):
  File "/usr/src/app/tests/chat.py", line 121, in <module>
    answer = agent_chain.run(input=question)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/chains/base.py", line 216, in run
    return self(kwargs)[self.output_keys[0]]
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/chains/base.py", line 116, in __call__
    raise e
  File "/usr/local/lib/python3.11/site-packages/langchain/chains/base.py", line 113, in __call__
    outputs = self._call(inputs)
              ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/agents/agent.py", line 828, in _call
    next_step_output = self._take_next_step(
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/agents/agent.py", line 725, in _take_next_step
    observation = tool.run(
                  ^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/tools/base.py", line 73, in run
    raise e
  File "/usr/local/lib/python3.11/site-packages/langchain/tools/base.py", line 70, in run
    observation = self._run(tool_input)
                  ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/agents/tools.py", line 17, in _run
    return self.func(tool_input)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/utilities/wikipedia.py", line 40, in run
    search_results = self.wiki_client.search(query)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/wikipedia/util.py", line 28, in __call__
    ret = self._cache[key] = self.fn(*args, **kwargs)
                             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/wikipedia/wikipedia.py", line 109, in search
    raise WikipediaException(raw_results['error']['info'])
wikipedia.exceptions.WikipediaException: An unknown error occured: "Search request is longer than the maximum allowed length. (Actual: 373; allowed: 300)". Please report it on GitHub!
```

This commit limits the maximum size of the query passed to Wikipedia to
avoid this issue.
2023-04-16 08:30:57 -07:00
Nahin Khan
9a03f00e6c Fix typos (#2977) 2023-04-16 08:28:36 -07:00
Altay Sansal
9d8ab28837 Add top_k and filter fields to ChatGPTPluginRetriever (#2852)
This allows to adjust the number of results to retrieve and filter
documents based on metadata.

---------

Co-authored-by: Altay Sansal <altay.sansal@tgs.com>
2023-04-15 21:07:53 -07:00
vowelparrot
4ffc58e07b Add similarity_search_with_normalized_similarities (#2916)
Add a method that exposes a similarity search with corresponding
normalized similarity scores. Implement only for FAISS now.

### Motivation:

Some memory definitions combine `relevance` with other scores, like
recency , importance, etc.

While many (but not all) of the `VectorStore`'s expose a
`similarity_search_with_score` method, they don't all interpret the
units of that score (depends on the distance metric and whether or not
the the embeddings are normalized).

This PR proposes a `similarity_search_with_normalized_similarities`
method that lets consumers of the vector store not have to worry about
the metric and embedding scale.

*Most providers default to euclidean distance, with Pinecone being one
exception (defaults to cosine _similarity_).*

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-15 21:06:08 -07:00
Tim Asp
b9db20481f Fix wrong token counts from get_num_tokens from openai llms (#2952)
The encoding fetch was out of date. Luckily OpenAI has a nice[
`encoding_for_model`](46287bfa49/tiktoken/model.py)
function in `tiktoken` we can use now.
2023-04-15 16:09:17 -07:00
Tim Asp
fea5619ce9 Add title, lang, description to Web loader document metadata (#2955)
Title, lang and description are on almost every web page, and are
incredibly useful pieces of information that currently isn't captured
with the current web base loader

I thought about adding the title and description to the content of the
document, as
that content could be useful in search, but I left it out for right now.
If you think
it'd be worth adding, happy to add it.


I've found it's nice to have the title/description in the metadata to
have some structured data
when retrieving rows from vectordbs for use with summary and source
citation, so if we do want to add it to the `page_content`, i'd advocate
for it to also be included in metadata.
2023-04-15 16:07:08 -07:00
Maciej Pióro
f7bf917baf Fix missing docker-compose (#2899)
Fix missing `docker-compose` command if only `docker compose` (note
space) is available.
2023-04-15 16:05:11 -07:00
Harrison Chase
b634489b2e bump version to 141 (#2950) 2023-04-15 12:56:39 -07:00
Harrison Chase
274b25c010 SVM retriever (#2947) (#2949)
Add SVM retriever class, based on
https://github.com/karpathy/randomfun/blob/master/knn_vs_svm.ipynb.

Testing still WIP, but the logic is correct (I have a local
implementation outside of Langchain working).

---------

Co-authored-by: Lance Martin <122662504+PineappleExpress808@users.noreply.github.com>
Co-authored-by: rlm <31treehaus@31s-MacBook-Pro.local>
2023-04-15 12:49:59 -07:00
Harrison Chase
baf350e32b parametrize redis (#2946) 2023-04-15 12:47:36 -07:00
dev2049
36aa7f30e4 Move PythonRepl -> langchain.utilities (#2917) 2023-04-15 10:50:25 -07:00
dev2049
7c73e9df5d Add kwargs to VectorStore.maximum_marginal_relevance (#2921)
Same as similarity_search, allows child classes to add vector
store-specific args (this was technically already happening in couple
places but now typing is correct).
2023-04-15 10:49:49 -07:00
Davit Buniatyan
b3a5b51728 [minor] Deep Lake auth improvements in docs, kwargs pass, faster tests (#2927)
Minor cosmetic changes 
- Activeloop environment cred authentication in notebooks with
`getpass.getpass` (instead of CLI which not always works)
- much faster tests with Deep Lake pytest mode on 
- Deep Lake kwargs pass

Notes
- I put pytest environment creds inside `vectorstores/conftest.py`, but
feel free to suggest a better location. For context, if I put in
`test_deeplake.py`, `ruff` doesn't let me to set them before import
deeplake

---------

Co-authored-by: Davit Buniatyan <d@activeloop.ai>
2023-04-15 10:49:16 -07:00
Harrison Chase
c4ae8c1d24 bump ver to 140 (#2895) 2023-04-15 09:23:19 -07:00
Nahin Khan
ad3973a3b8 Fix typo (#2942) 2023-04-15 08:53:25 -07:00
Harrison Chase
cf2789d86d delete antropic chat notebook (#2945) 2023-04-15 08:48:51 -07:00
Hai Nguyen Mau
0aa828b1dc typo fix (#2937)
missing w in link
2023-04-15 08:31:43 -07:00
Ankush Gola
ec59e9d886 Fix ChatAnthropic stop_sequences error (#2919) (#2920)
Note to self: Always run integration tests, even on "that last minute
change you thought would be safe" :)

---------

Co-authored-by: Mike Lambert <mike.lambert@anthropic.com>
2023-04-14 17:22:01 -07:00
Akash NP
13a0ed064b add encoding to avoid UnicodeDecodeError (#2908)
**About**
Specify encoding to avoid UnicodeDecodeError when reading .txt for users
who are following the tutorial.

**Reference**
```
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1205: character maps to <undefined>
```

**Environment**
OS: Win 11
Python: 3.8
2023-04-14 16:36:03 -07:00
Mike Lambert
392f1b3218 Add Anthropic ChatModel to langchain (#2293)
* Adds an Anthropic ChatModel
* Factors out common code in our LLMModel and ChatModel
* Supports streaming llm-tokens to the callbacks on a delta basis (until
a future V2 API does that for us)
* Some fixes
2023-04-14 15:09:07 -07:00
Kwuang Tang
66bef1d7ed Ignore files from .gitignore in Git loader (#2909)
fixes #2905 

extends #2851
2023-04-14 15:02:21 -07:00
Boris Feld
7ee87eb0c8 Comet callback updates (#2889)
I'm working with @DN6 and I made some small fixes and
improvements after playing with the integration.
2023-04-14 13:19:58 -07:00
dev2049
634358db5e Fix OpenAI LLM docstring (#2910) 2023-04-14 11:09:36 -07:00
pranjaldoshi96
30573b2e30 Correct instruction to use openweathermap utility in docstring (#2906)
Co-authored-by: Pranjal Doshi <pranjald@nvidia.com>
2023-04-14 10:46:20 -07:00
Kwuang Tang
a508afa91c Add file filter param to Git loader (#2904)
Allows users to specify what files should be loaded instead of
indiscriminately loading the entire repo.

extends #2851 

NOTE: for reviewers, `hide whitespace` option recommended since I
changed the indentation of an if-block to use `continue` instead so it
looks less like a Christmas tree :)
2023-04-14 10:45:54 -07:00
Ismail Pelaseyed
7e525a3b91 Add link to repo for deploying LangChain to Digitalocean App Platform (#2894)
This PR adds a link to a minimal example of deploying `LangChain` to
`Digitalocean App Platform`.
2023-04-14 08:55:21 -07:00
Peter Stolz
ccacf804a8 Fix format string in pinecone error handling (#2897) 2023-04-14 08:53:02 -07:00
Francis Felici
86189cdcf9 Update load_qa_chain() docstring (#2900)
Seems to be missing `map_rerank` as a potential argument of
`chain_type`
2023-04-14 08:51:30 -07:00
Harrison Chase
8fef69296d nits (#2873) 2023-04-14 07:55:12 -07:00
Harrison Chase
0a38bbc750 updates to vectorstore memory (#2875) 2023-04-14 07:54:57 -07:00
Ikko Eltociear Ashimine
203c0eb2ae docs: update getting_started.ipynb (#2883)
HuggingFace -> Hugging Face
2023-04-14 07:40:26 -07:00
ecneladis
1a44b71ddf Fix Baby AGI notebooks (#2882)
- fix broken notebook cell in
ae485b623d
- Python Black formatting
2023-04-14 07:40:04 -07:00
Nicolas
3c7204d604 docs: Quick fix to Mendable Search (#2876)
Fixed a small issue on the icon UI when using in Safari.
2023-04-13 23:15:57 -07:00
Harrison Chase
1e9378d0a8 Harrison/weaviate fixes (#2872)
Co-authored-by: cs0lar <cristiano.solarino@gmail.com>
Co-authored-by: cs0lar <cristiano.solarino@brightminded.com>
2023-04-13 22:37:34 -07:00
Harrison Chase
07d7096de6 Harrison/playwright (#2871)
Co-authored-by: Manuel Saelices <msaelices@gmail.com>
2023-04-13 22:15:03 -07:00
Jon Luo
5565f56273 Use SQL dialect-specific prompts for SQLDatabaseChain (#2748)
Mentioned the idea here initially:
https://github.com/hwchase17/langchain/pull/2106#issuecomment-1487509106

Since there have been dialect-specific issues, we should use
dialect-specific prompts. This way, each prompt can be separately
modified to best suit each dialect as needed. This adds a prompt for
each dialect supported in sqlalchemy (mssql, mysql, mariadb, postgres,
oracle, sqlite). For this initial implementation, the only differencse
between the prompts is the instruction for the clause to use to limit
the number of rows queried for, and the instruction for wrapping column
names using each dialect's identifier quote character.
2023-04-13 22:10:49 -07:00
drod
9907cb0485 Refactor similarity_search function in elastic_vector_search.py (#2761)
Optimization :Limit search results when k < 10
Fix issue when k > 10: Elasticsearch will return only 10 docs


[default-search-result](https://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html)
By default, searches return the top 10 matching hits

Add size parameter to the search request to limit the number of returned
results from Elasticsearch. Remove slicing of the hits list, since the
response will already contain the desired number of results.
2023-04-13 22:09:00 -07:00
rafael
1cc7ea333c chat_models.openai: Set tenacity timeout to openai's recommendation (#2768)
[OpenAI's
cookbook](https://github.com/openai/openai-cookbook/blob/main/examples/How_to_handle_rate_limits.ipynb)
suggest a tenacity backoff between 1 and 60 seconds. Currently
langchain's backoff is between 4 and 10 seconds, which causes frequent
timeout errors on my end.

This PR changes the timeout to the suggested values.
2023-04-13 22:08:46 -07:00
Harrison Chase
705596b46a Harrison/fix create sql agent (#2870)
Co-authored-by: Timothé Pearce <timothe.pearce@gmail.com>
2023-04-13 22:07:58 -07:00
Harrison Chase
8a98e5b50b Harrison/index name (#2869)
Co-authored-by: Mesum Raza Hemani <mes.javacca@gmail.com>
2023-04-13 22:01:32 -07:00
Andrey Vasnetsov
dcb17503f2 Update qdrant.py (#2750)
At the moment of upload we should already know the format of data,
therefore we can skip the costly pydantic validation.
2023-04-13 21:57:05 -07:00
ecneladis
74abeb8c53 Update output in Git notebook (#2868)
Supplemental to https://github.com/hwchase17/langchain/pull/2851.
Updates one notebook cell that I forgot to commit before.
2023-04-13 21:56:17 -07:00
Nicolas
0226b375d9 docs: Mendable Search integration (#2803)
Mendable Seach Integration is Finally here!

Hey yall, 

After various requests for Mendable in Python docs, we decided to get
our hands dirty and try to implement it.
Here is a version where we implement our **floating button** that sits
on the bottom right of the screen that once triggered (via press or CMD
K) will work the same as the js langchain docs.

Super excited about this and hopefully the community will be too.
@hwchase17 will send you the admin details via dm etc. The anon_key is
fine to be public.

Let me know if you need any further customization. I added the langchain
logo to it.
2023-04-13 21:52:25 -07:00
sergerdn
04c458a270 feat: improve pinecone tests (#2806)
Improve the integration tests for Pinecone by adding an `.env.example`
file for local testing. Additionally, add some dev dependencies
specifically for integration tests.

This change also helps me understand how Pinecone deals with certain
things, see related issues
https://github.com/hwchase17/langchain/issues/2484
https://github.com/hwchase17/langchain/issues/2816
2023-04-13 21:49:31 -07:00
ecneladis
016738e676 Add GitLoader (#2851) 2023-04-13 21:39:20 -07:00
lizelive
8cfec2c5fe torch 2 support (#2865)
Lang-chain seems to work with torch 2
2023-04-13 21:38:49 -07:00
vowelparrot
bf0887c486 Add Slack Directory Loader (#2841)
Fixes linting issue from #2835 

Adds a loader for Slack Exports which can be a very valuable source of
knowledge to use for internal QA bots and other use cases.

```py
# Export data from your Slack Workspace first.
from langchain.document_loaders import SLackDirectoryLoader

SLACK_WORKSPACE_URL = "https://awesome.slack.com"

loader = ("Slack_Exports", SLACK_WORKSPACE_URL)
docs = loader.load()
```
2023-04-13 21:31:59 -07:00
Harrison Chase
ed2ef5cbe4 Harrison/rwkv utf8 (#2867)
Co-authored-by: Akihiro <ueyama0105@gmail.com>
2023-04-13 21:31:18 -07:00
Adam McCabe
6be5d7c612 Update reduce_openapi_spec for PATCH and DELETE (#2861)
My recent pull request (#2729) neglected to update the
`reduce_openapi_spec` in spec.py to also accommodate PATCH and DELETE
added to planner.py and prompt_planner.py.
2023-04-13 20:27:40 -07:00
Benjamin Tan Wei Hao
c26a259ba6 Fix tiny typo (#2863) 2023-04-13 20:26:26 -07:00
Jon Luo
f3180f05f9 Update sql chain notebook to clarify use of SQLAlchemy for connections (#2850)
Have seen questions about whether or not the `SQLDatabaseChain` supports
more than just sqlite, which was unclear in the docs, so tried to
clarify that and how to connect to other dialects.
2023-04-13 11:46:59 -07:00
leo-gan
ecc1a0c051 added code-analysis-deeplake.ipynb (#2844)
This notebook is heavily copied from the
`twitter-the-algorithm-analysis-deeplake.ipynb`
2023-04-13 11:29:59 -07:00
Tim Asp
70ffe470aa Add easy print method to openai callback (#2848)
Found myself constantly copying the snippet outputting all the callback
tracking details. so adding a simple way to output the full context
2023-04-13 11:28:42 -07:00
Tim Asp
be4fb24b32 OpenAI LLM: update modelname_to_contextsize with new models (#2843)
Token counts pulled from https://openai.com/pricing
2023-04-13 11:13:34 -07:00
vowelparrot
82d1d5f24e Fix grammar in Vector Memory Docs (#2847) 2023-04-13 11:00:09 -07:00
Tim Asp
53dc157145 [Docs] minor fixes to loaders links and rst warnings (#2846)
The doc loaders index was picking up a bunch of subheadings because I
mistakenly made the MD titles H1s. Fixed that.

also the easy minor warnings from docs_build
2023-04-13 10:54:40 -07:00
400 changed files with 30882 additions and 2688 deletions

View File

@@ -75,7 +75,7 @@ This will install all requirements for running the package, examples, linting, f
❗Note: If you're running Poetry 1.4.1 and receive a `WheelFileValidationError` for `debugpy` during installation, you can try either downgrading to Poetry 1.4.0 or disabling "modern installation" (`poetry config installer.modern-installation false`) and re-install requirements. See [this `debugpy` issue](https://github.com/microsoft/debugpy/issues/1246) for more details.
Now, you should be able to run the common tasks in the following section.
Now, you should be able to run the common tasks in the following section. To double check, run `make test`, all tests should pass. If they don't you may need to pip install additional dependencies, such as `numexpr` and `openapi_schema_pydantic`.
## ✅Common Tasks

3
.gitignore vendored
View File

@@ -142,3 +142,6 @@ wandb/
# asdf tool versions
.tool-versions
/.ruff_cache/
*.pkl
*.bin

View File

@@ -11,3 +11,7 @@ pre {
max-width: 2560px !important;
}
}
#my-component-root *, #headlessui-portal-root * {
z-index: 1000000000000;
}

58
docs/_static/js/mendablesearch.js vendored Normal file
View File

@@ -0,0 +1,58 @@
document.addEventListener('DOMContentLoaded', () => {
// Load the external dependencies
function loadScript(src, onLoadCallback) {
const script = document.createElement('script');
script.src = src;
script.onload = onLoadCallback;
document.head.appendChild(script);
}
function createRootElement() {
const rootElement = document.createElement('div');
rootElement.id = 'my-component-root';
document.body.appendChild(rootElement);
return rootElement;
}
function initializeMendable() {
const rootElement = createRootElement();
const { MendableFloatingButton } = Mendable;
const iconSpan1 = React.createElement('span', {
}, '🦜');
const iconSpan2 = React.createElement('span', {
}, '🔗');
const icon = React.createElement('p', {
style: { color: '#ffffff', fontSize: '22px',width: '48px', height: '48px', margin: '0px', padding: '0px', display: 'flex', alignItems: 'center', justifyContent: 'center', textAlign: 'center' },
}, [iconSpan1, iconSpan2]);
const mendableFloatingButton = React.createElement(
MendableFloatingButton,
{
style: { darkMode: false, accentColor: '#010810' },
floatingButtonStyle: { color: '#ffffff', backgroundColor: '#010810' },
anon_key: '82842b36-3ea6-49b2-9fb8-52cfc4bde6bf', // Mendable Search Public ANON key, ok to be public
messageSettings: {
openSourcesInNewTab: false,
},
icon: icon,
}
);
ReactDOM.render(mendableFloatingButton, rootElement);
}
loadScript('https://unpkg.com/react@17/umd/react.production.min.js', () => {
loadScript('https://unpkg.com/react-dom@17/umd/react-dom.production.min.js', () => {
loadScript('https://unpkg.com/@mendable/search@0.0.83/dist/umd/mendable.min.js', initializeMendable);
});
});
});

View File

@@ -103,5 +103,10 @@ html_static_path = ["_static"]
html_css_files = [
"css/custom.css",
]
html_js_files = [
"js/mendablesearch.js",
]
nb_execution_mode = "off"
myst_enable_extensions = ["colon_fence"]

View File

@@ -33,12 +33,17 @@ It implements a Question Answering app and contains instructions for deploying t
A minimal example on how to run LangChain on Vercel using Flask.
## [Digitalocean App Platform](https://github.com/homanp/digitalocean-langchain)
A minimal example on how to deploy LangChain to DigitalOcean App Platform.
## [SteamShip](https://github.com/steamship-core/steamship-langchain/)
This repository contains LangChain adapters for Steamship, enabling LangChain developers to rapidly deploy their apps on Steamship.
This includes: production ready endpoints, horizontal scaling across dependencies, persistant storage of app state, multi-tenancy support, etc.
## [Langchain-serve](https://github.com/jina-ai/langchain-serve)
This repository allows users to serve local chains and agents as RESTful, gRPC, or Websocket APIs thanks to [Jina](https://docs.jina.ai/). Deploy your chains & agents with ease and enjoy independent scaling, serverless and autoscaling APIs, as well as a Streamlit playground on Jina AI Cloud.
## [BentoML](https://github.com/ssheng/BentoChain)

View File

@@ -0,0 +1,15 @@
# AnalyticDB
This page covers how to use the AnalyticDB ecosystem within LangChain.
### VectorStore
There exists a wrapper around AnalyticDB, allowing you to use it as a vectorstore,
whether for semantic search or example selection.
To import this vectorstore:
```python
from langchain.vectorstores import AnalyticDB
```
For a more detailed walkthrough of the AnalyticDB wrapper, see [this notebook](../modules/indexes/vectorstores/examples/analyticdb.ipynb)

View File

@@ -1,7 +1,6 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -9,7 +8,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -17,7 +15,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -31,7 +28,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -39,7 +35,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -52,14 +47,13 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install comet_ml\n",
"!pip install langchain\n",
"!pip install openai\n",
"!pip install google-search-results"
"%pip install comet_ml langchain openai google-search-results spacy textstat pandas\n",
"\n",
"import sys\n",
"!{sys.executable} -m spacy download en_core_web_sm"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -67,7 +61,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -86,7 +79,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -94,7 +86,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -109,12 +100,12 @@
"source": [
"import os\n",
"\n",
"%env OPENAI_API_KEY=\"...\"\n",
"%env SERPAPI_API_KEY=\"...\""
"os.environ[\"OPENAI_API_KEY\"] = \"...\"\n",
"#os.environ[\"OPENAI_ORGANIZATION\"] = \"...\"\n",
"os.environ[\"SERPAPI_API_KEY\"] = \"...\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -149,7 +140,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -185,12 +175,11 @@
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template, callback_manager=manager)\n",
"\n",
"test_prompts = [{\"title\": \"Documentary about Bigfoot in Paris\"}]\n",
"synopsis_chain.apply(test_prompts)\n",
"print(synopsis_chain.apply(test_prompts))\n",
"comet_callback.flush_tracker(synopsis_chain, finish=True)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -232,7 +221,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -240,7 +228,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -256,7 +243,7 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install rouge-score"
"%pip install rouge-score"
]
},
{
@@ -336,16 +323,29 @@
" \"\"\"\n",
" }\n",
"]\n",
"synopsis_chain.apply(test_prompts)\n",
"print(synopsis_chain.apply(test_prompts))\n",
"comet_callback.flush_tracker(synopsis_chain, finish=True)"
]
}
],
"metadata": {
"language_info": {
"name": "python"
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"orig_nbformat": 4
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.15"
}
},
"nbformat": 4,
"nbformat_minor": 2

View File

@@ -36,7 +36,7 @@ from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
model = GPT4All(model="./models/gpt4all-model.bin", n_ctx=512, n_threads=8, callback_handler=callback_handler, verbose=True)
# Generate text. Tokens are streamed throught the callback manager.
# Generate text. Tokens are streamed through the callback manager.
model("Once upon a time, ")
```
@@ -44,4 +44,4 @@ model("Once upon a time, ")
You can find links to model file downloads in the [pyllamacpp](https://github.com/nomic-ai/pyllamacpp) repository.
For a more detailed walkthrough of this, see [this notebook](../modules/models/llms/integrations/gpt4all.ipynb)
For a more detailed walkthrough of this, see [this notebook](../modules/models/llms/integrations/gpt4all.ipynb)

65
docs/ecosystem/myscale.md Normal file
View File

@@ -0,0 +1,65 @@
# MyScale
This page covers how to use MyScale vector database within LangChain.
It is broken into two parts: installation and setup, and then references to specific MyScale wrappers.
With MyScale, you can manage both structured and unstructured (vectorized) data, and perform joint queries and analytics on both types of data using SQL. Plus, MyScale's cloud-native OLAP architecture, built on top of ClickHouse, enables lightning-fast data processing even on massive datasets.
## Introduction
[Overview to MyScale and High performance vector search](https://docs.myscale.com/en/overview/)
You can now register on our SaaS and [start a cluster now!](https://docs.myscale.com/en/quickstart/)
If you are also interested in how we managed to integrate SQL and vector, please refer to [this document](https://docs.myscale.com/en/vector-reference/) for further syntax reference.
We also deliver with live demo on huggingface! Please checkout our [huggingface space](https://huggingface.co/myscale)! They search millions of vector within a blink!
## Installation and Setup
- Install the Python SDK with `pip install clickhouse-connect`
### Setting up envrionments
There are two ways to set up parameters for myscale index.
1. Environment Variables
Before you run the app, please set the environment variable with `export`:
`export MYSCALE_URL='<your-endpoints-url>' MYSCALE_PORT=<your-endpoints-port> MYSCALE_USERNAME=<your-username> MYSCALE_PASSWORD=<your-password> ...`
You can easily find your account, password and other info on our SaaS. For details please refer to [this document](https://docs.myscale.com/en/cluster-management/)
Every attributes under `MyScaleSettings` can be set with prefix `MYSCALE_` and is case insensitive.
2. Create `MyScaleSettings` object with parameters
```python
from langchain.vectorstores import MyScale, MyScaleSettings
config = MyScaleSetting(host="<your-backend-url>", port=8443, ...)
index = MyScale(embedding_function, config)
index.add_documents(...)
```
## Wrappers
supported functions:
- `add_texts`
- `add_documents`
- `from_texts`
- `from_documents`
- `similarity_search`
- `asimilarity_search`
- `similarity_search_by_vector`
- `asimilarity_search_by_vector`
- `similarity_search_with_relevance_scores`
### VectorStore
There exists a wrapper around MyScale database, allowing you to use it as a vectorstore,
whether for semantic search or similar example retrieval.
To import this vectorstore:
```python
from langchain.vectorstores import MyScale
```
For a more detailed walkthrough of the MyScale wrapper, see [this notebook](../modules/indexes/vectorstores/examples/myscale.ipynb)

View File

@@ -15,7 +15,7 @@ custom LLMs, you can use the `SelfHostedPipeline` parent class.
from langchain.llms import SelfHostedPipeline, SelfHostedHuggingFaceLLM
```
For a more detailed walkthrough of the Self-hosted LLMs, see [this notebook](../modules/models/llms/integrations/self_hosted_examples.ipynb)
For a more detailed walkthrough of the Self-hosted LLMs, see [this notebook](../modules/models/llms/integrations/runhouse.ipynb)
## Self-hosted Embeddings
There are several ways to use self-hosted embeddings with LangChain via Runhouse.

View File

@@ -0,0 +1,43 @@
# Yeager.ai
This page covers how to use [Yeager.ai](https://yeager.ai) to generate LangChain tools and agents.
## What is Yeager.ai?
Yeager.ai is an ecosystem designed to simplify the process of creating AI agents and tools.
It features yAgents, a No-code LangChain Agent Builder, which enables users to build, test, and deploy AI solutions with ease. Leveraging the LangChain framework, yAgents allows seamless integration with various language models and resources, making it suitable for developers, researchers, and AI enthusiasts across diverse applications.
## yAgents
Low code generative agent designed to help you build, prototype, and deploy Langchain tools with ease.
### How to use?
```
pip install yeagerai-agent
yeagerai-agent
```
Go to http://127.0.0.1:7860
This will install the necessary dependencies and set up yAgents on your system. After the first run, yAgents will create a .env file where you can input your OpenAI API key. You can do the same directly from the Gradio interface under the tab "Settings".
`OPENAI_API_KEY=<your_openai_api_key_here>`
We recommend using GPT-4,. However, the tool can also work with GPT-3 if the problem is broken down sufficiently.
### Creating and Executing Tools with yAgents
yAgents makes it easy to create and execute AI-powered tools. Here's a brief overview of the process:
1. Create a tool: To create a tool, provide a natural language prompt to yAgents. The prompt should clearly describe the tool's purpose and functionality. For example:
`create a tool that returns the n-th prime number`
2. Load the tool into the toolkit: To load a tool into yAgents, simply provide a command to yAgents that says so. For example:
`load the tool that you just created it into your toolkit`
3. Execute the tool: To run a tool or agent, simply provide a command to yAgents that includes the name of the tool and any required parameters. For example:
`generate the 50th prime number`
You can see a video of how it works [here](https://www.youtube.com/watch?v=KA5hCM3RaWE).
As you become more familiar with yAgents, you can create more advanced tools and agents to automate your work and enhance your productivity.
For more information, see [yAgents' Github](https://github.com/yeagerai/yeagerai-agent) or our [docs](https://yeagerai.gitbook.io/docs/general/welcome-to-yeager.ai)

View File

@@ -1,5 +1,5 @@
LangChain Gallery
=============
=================
Lots of people have built some pretty awesome stuff with LangChain.
This is a collection of our favorites.
@@ -223,7 +223,7 @@ Open Source
Answer questions about the documentation of any project
Misc. Colab Notebooks
~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~
.. panels::
:body: text-center
@@ -280,6 +280,17 @@ Proprietary
---
.. link-button:: https://anysummary.app
:type: url
:text: Summarize any file with AI
:classes: stretched-link btn-lg
+++
Summarize not only long docs, interview audio or video files quickly, but also entire websites and YouTube videos. Share or download your generated summaries to collaborate with others, or revisit them at any time! Bonus: `@anysummary <https://twitter.com/anysummary>`_ on Twitter will also summarize any thread it is tagged in.
---
.. link-button:: https://twitter.com/dory111111/status/1608406234646052870?s=20&t=XYlrbKM0ornJsrtGa0br-g
:type: url
:text: AI Assisted SQL Query Generator

View File

@@ -46,7 +46,7 @@ LangChain provides many modules that can be used to build language model applica
`````{dropdown} LLMs: Get predictions from a language model
## LLMs: Get predictions from a language model
The most basic building block of LangChain is calling an LLM on some input.
Let's walk through a simple example of how to do this.
@@ -77,10 +77,9 @@ Feetful of Fun
```
For more details on how to use LLMs within LangChain, see the [LLM getting started guide](../modules/models/llms/getting_started.ipynb).
`````
`````{dropdown} Prompt Templates: Manage prompts for LLMs
## Prompt Templates: Manage prompts for LLMs
Calling an LLM is a great first step, but it's just the beginning.
Normally when you use an LLM in an application, you are not sending user input directly to the LLM.
@@ -115,11 +114,10 @@ What is a good name for a company that makes colorful socks?
[For more details, check out the getting started guide for prompts.](../modules/prompts/chat_prompt_template.ipynb)
`````
`````{dropdown} Chains: Combine LLMs and prompts in multi-step workflows
## Chains: Combine LLMs and prompts in multi-step workflows
Up until now, we've worked with the PromptTemplate and LLM primitives by themselves. But of course, a real application is not just one primitive, but rather a combination of them.
@@ -159,10 +157,7 @@ This is one of the simpler types of chains, but understanding how it works will
[For more details, check out the getting started guide for chains.](../modules/chains/getting_started.ipynb)
`````
`````{dropdown} Agents: Dynamically Call Chains Based on User Input
## Agents: Dynamically Call Chains Based on User Input
So far the chains we've looked at run in a predetermined order.
@@ -234,10 +229,8 @@ Final Answer: The high temperature in SF yesterday in Fahrenheit raised to the .
```
`````
`````{dropdown} Memory: Add State to Chains and Agents
## Memory: Add State to Chains and Agents
So far, all the chains and agents we've gone through have been stateless. But often, you may want a chain or agent to have some concept of "memory" so that it may remember information about its previous interactions. The clearest and simple example of this is when designing a chatbot - you want it to remember previous messages so it can use context from that to have a better conversation. This would be a type of "short-term memory". On the more complex side, you could imagine a chain/agent remembering key pieces of information over time - this would be a form of "long-term memory". For more concrete ideas on the latter, see this [awesome paper](https://memprompt.com/).
@@ -251,7 +244,8 @@ from langchain import OpenAI, ConversationChain
llm = OpenAI(temperature=0)
conversation = ConversationChain(llm=llm, verbose=True)
conversation.predict(input="Hi there!")
output = conversation.predict(input="Hi there!")
print(output)
```
```pycon
@@ -269,7 +263,8 @@ AI:
```
```python
conversation.predict(input="I'm doing well! Just having a conversation with an AI.")
output = conversation.predict(input="I'm doing well! Just having a conversation with an AI.")
print(output)
```
```pycon
@@ -287,7 +282,6 @@ AI:
> Finished chain.
" That's great! What would you like to talk about?"
```
`````
## Building a Language Model Application: Chat Models
@@ -295,8 +289,8 @@ Similarly, you can use chat models instead of LLMs. Chat models are a variation
Chat model APIs are fairly new, so we are still figuring out the correct abstractions.
## Get Message Completions from a Chat Model
`````{dropdown} Get Message Completions from a Chat Model
You can get chat completions by passing one or more messages to the chat model. The response will be a message. The types of messages currently supported in LangChain are `AIMessage`, `HumanMessage`, `SystemMessage`, and `ChatMessage` -- `ChatMessage` takes in an arbitrary role parameter. Most of the time, you'll just be dealing with `HumanMessage`, `AIMessage`, and `SystemMessage`.
```python
@@ -350,9 +344,9 @@ You can recover things like token usage from this LLMResult:
result.llm_output['token_usage']
# -> {'prompt_tokens': 71, 'completion_tokens': 18, 'total_tokens': 89}
```
`````
`````{dropdown} Chat Prompt Templates
## Chat Prompt Templates
Similar to LLMs, you can make use of templating by using a `MessagePromptTemplate`. You can build a `ChatPromptTemplate` from one or more `MessagePromptTemplate`s. You can use `ChatPromptTemplate`'s `format_prompt` -- this returns a `PromptValue`, which you can convert to a string or `Message` object, depending on whether you want to use the formatted value as input to an llm or chat model.
For convience, there is a `from_template` method exposed on the template. If you were to use this template, this is what it would look like:
@@ -378,9 +372,8 @@ chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_mes
chat(chat_prompt.format_prompt(input_language="English", output_language="French", text="I love programming.").to_messages())
# -> AIMessage(content="J'aime programmer.", additional_kwargs={})
```
`````
`````{dropdown} Chains with Chat Models
## Chains with Chat Models
The `LLMChain` discussed in the above section can be used with chat models as well:
```python
@@ -404,9 +397,8 @@ chain = LLMChain(llm=chat, prompt=chat_prompt)
chain.run(input_language="English", output_language="French", text="I love programming.")
# -> "J'aime programmer."
```
`````
`````{dropdown} Agents with Chat Models
## Agents with Chat Models
Agents can also be used with chat models, you can initialize one using `AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION` as the agent type.
```python
@@ -465,9 +457,7 @@ Final Answer: 2.169459462491557
> Finished chain.
'2.169459462491557'
```
`````
`````{dropdown} Memory: Add State to Chains and Agents
## Memory: Add State to Chains and Agents
You can use Memory with chains and agents initialized with chat models. The main difference between this and Memory for LLMs is that rather than trying to condense all previous messages into a string, we can keep them as their own unique memory object.
```python
@@ -501,4 +491,4 @@ conversation.predict(input="I'm doing well! Just having a conversation with an A
conversation.predict(input="Tell me about yourself.")
# -> "Sure! I am an AI language model created by OpenAI. I was trained on a large dataset of text from the internet, which allows me to understand and generate human-like language. I can answer questions, provide information, and even have conversations like this one. Is there anything else you'd like to know about me?"
```
`````

View File

@@ -63,6 +63,10 @@ Use Cases
The above modules can be used in a variety of ways. LangChain also provides guidance and assistance in this. Below are some of the common use cases LangChain supports.
- `Autonomous Agents <./use_cases/autonomous_agents.html>`_: Autonomous agents are long running agents that take many steps in an attempt to accomplish an objective. Examples include AutoGPT and BabyAGI.
- `Agent Simulations <./use_cases/agent_simulations.html>`_: Putting agents in a sandbox and observing how they interact with each other or to events can be an interesting way to observe their long-term memory abilities.
- `Personal Assistants <./use_cases/personal_assistants.html>`_: The main LangChain use case. Personal assistants need to take actions, remember interactions, and have knowledge about your data.
- `Question Answering <./use_cases/question_answering.html>`_: The second big LangChain use case. Answering questions over specific documents, only utilizing the information in those documents to construct an answer.
@@ -89,6 +93,8 @@ The above modules can be used in a variety of ways. LangChain also provides guid
:hidden:
./use_cases/personal_assistants.md
./use_cases/autonomous_agents.md
./use_cases/agent_simulations.md
./use_cases/question_answering.md
./use_cases/chatbots.md
./use_cases/tabular.rst
@@ -153,6 +159,8 @@ Additional collection of resources we think may be useful as you develop your ap
- `Discord <https://discord.gg/6adMQxSpJS>`_: Join us on our Discord to discuss all things LangChain!
- `YouTube <./youtube.html>`_: A collection of the LangChain tutorials and videos.
- `Production Support <https://forms.gle/57d8AmXBYp8PP8tZA>`_: As you move your LangChains into production, we'd love to offer more comprehensive support. Please fill out this form and we'll set up a dedicated support Slack channel.
@@ -169,4 +177,5 @@ Additional collection of resources we think may be useful as you develop your ap
./tracing.md
./use_cases/model_laboratory.ipynb
Discord <https://discord.gg/6adMQxSpJS>
./youtube.md
Production Support <https://forms.gle/57d8AmXBYp8PP8tZA>

View File

@@ -315,7 +315,7 @@
" log=llm_output,\n",
" )\n",
" # Parse out the action and action input\n",
" regex = r\"Action: (.*?)[\\n]*Action Input:[\\s]*(.*)\"\n",
" regex = r\"Action\\s*\\d*\\s*:(.*?)\\nAction\\s*\\d*\\s*Input\\s*\\d*\\s*:[\\s]*(.*)\"\n",
" match = re.search(regex, llm_output, re.DOTALL)\n",
" if not match:\n",
" raise ValueError(f\"Could not parse LLM output: `{llm_output}`\")\n",

View File

@@ -204,7 +204,7 @@
" log=llm_output,\n",
" )\n",
" # Parse out the action and action input\n",
" regex = r\"Action: (.*?)[\\n]*Action Input:[\\s]*(.*)\"\n",
" regex = r\"Action\\s*\\d*\\s*:(.*?)\\nAction\\s*\\d*\\s*Input\\s*\\d*\\s*:[\\s]*(.*)\"\n",
" match = re.search(regex, llm_output, re.DOTALL)\n",
" if not match:\n",
" raise ValueError(f\"Could not parse LLM output: `{llm_output}`\")\n",

View File

@@ -206,7 +206,7 @@
" log=llm_output,\n",
" )\n",
" # Parse out the action and action input\n",
" regex = r\"Action: (.*?)[\\n]*Action Input:[\\s]*(.*)\"\n",
" regex = r\"Action\\s*\\d*\\s*:(.*?)\\nAction\\s*\\d*\\s*Input\\s*\\d*\\s*:[\\s]*(.*)\"\n",
" match = re.search(regex, llm_output, re.DOTALL)\n",
" if not match:\n",
" raise ValueError(f\"Could not parse LLM output: `{llm_output}`\")\n",

View File

@@ -20,13 +20,14 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6064f080",
"metadata": {},
"source": [
"### Custom LLMChain\n",
"\n",
"The first way to create a custom agent is to use an existing Agent class, but use a custom LLMChain. This is the simplest way to create a custom Agent. It is highly reccomended that you work with the `ZeroShotAgent`, as at the moment that is by far the most generalizable one. \n",
"The first way to create a custom agent is to use an existing Agent class, but use a custom LLMChain. This is the simplest way to create a custom Agent. It is highly recommended that you work with the `ZeroShotAgent`, as at the moment that is by far the most generalizable one. \n",
"\n",
"Most of the work in creating the custom LLMChain comes down to the prompt. Because we are using an existing agent class to parse the output, it is very important that the prompt say to produce text in that format. Additionally, we currently require an `agent_scratchpad` input variable to put notes on previous actions and observations. This should almost always be the final part of the prompt. However, besides those instructions, you can customize the prompt as you wish.\n",
"\n",
@@ -42,7 +43,7 @@
},
{
"cell_type": "code",
"execution_count": 23,
"execution_count": 1,
"id": "9af9734e",
"metadata": {},
"outputs": [],
@@ -53,7 +54,7 @@
},
{
"cell_type": "code",
"execution_count": 24,
"execution_count": 2,
"id": "becda2a1",
"metadata": {},
"outputs": [],
@@ -70,7 +71,7 @@
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": 3,
"id": "339b1bb8",
"metadata": {},
"outputs": [],
@@ -99,7 +100,7 @@
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": 4,
"id": "e21d2098",
"metadata": {},
"outputs": [
@@ -145,7 +146,7 @@
},
{
"cell_type": "code",
"execution_count": 27,
"execution_count": 5,
"id": "9b1cc2a2",
"metadata": {},
"outputs": [],
@@ -155,7 +156,7 @@
},
{
"cell_type": "code",
"execution_count": 28,
"execution_count": 6,
"id": "e4f5092f",
"metadata": {},
"outputs": [],
@@ -166,7 +167,7 @@
},
{
"cell_type": "code",
"execution_count": 29,
"execution_count": 7,
"id": "490604e9",
"metadata": {},
"outputs": [],
@@ -176,7 +177,7 @@
},
{
"cell_type": "code",
"execution_count": 31,
"execution_count": 8,
"id": "653b1617",
"metadata": {},
"outputs": [
@@ -190,9 +191,9 @@
"\u001b[32;1m\u001b[1;3mThought: I need to find out the population of Canada\n",
"Action: Search\n",
"Action Input: Population of Canada 2023\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mThe current population of Canada is 38,610,447 as of Saturday, February 18, 2023, based on Worldometer elaboration of the latest United Nations data. Canada 2020 population is estimated at 37,742,154 people at mid year according to UN data.\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mThe current population of Canada is 38,661,927 as of Sunday, April 16, 2023, based on Worldometer elaboration of the latest United Nations data.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Arrr, Canada be havin' 38,610,447 scallywags livin' there as of 2023!\u001b[0m\n",
"Final Answer: Arrr, Canada be havin' 38,661,927 people livin' there as of 2023!\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -200,10 +201,10 @@
{
"data": {
"text/plain": [
"\"Arrr, Canada be havin' 38,610,447 scallywags livin' there as of 2023!\""
"\"Arrr, Canada be havin' 38,661,927 people livin' there as of 2023!\""
]
},
"execution_count": 31,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
@@ -223,7 +224,7 @@
},
{
"cell_type": "code",
"execution_count": 32,
"execution_count": 9,
"id": "43dbfa2f",
"metadata": {},
"outputs": [],
@@ -244,7 +245,7 @@
},
{
"cell_type": "code",
"execution_count": 33,
"execution_count": 10,
"id": "0f087313",
"metadata": {},
"outputs": [],
@@ -254,7 +255,7 @@
},
{
"cell_type": "code",
"execution_count": 34,
"execution_count": 11,
"id": "92c75a10",
"metadata": {},
"outputs": [],
@@ -264,7 +265,7 @@
},
{
"cell_type": "code",
"execution_count": 35,
"execution_count": 12,
"id": "ac5b83bf",
"metadata": {},
"outputs": [],
@@ -274,7 +275,7 @@
},
{
"cell_type": "code",
"execution_count": 36,
"execution_count": 13,
"id": "c960e4ff",
"metadata": {},
"outputs": [
@@ -285,12 +286,16 @@
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to find out the population of Canada in 2023.\n",
"\u001b[32;1m\u001b[1;3mThought: I should look for recent population estimates.\n",
"Action: Search\n",
"Action Input: Population of Canada in 2023\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mThe current population of Canada is 38,610,447 as of Saturday, February 18, 2023, based on Worldometer elaboration of the latest United Nations data. Canada 2020 population is estimated at 37,742,154 people at mid year according to UN data.\u001b[0m\n",
"Action Input: Canada population 2023\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m39,566,248\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I should double check this number.\n",
"Action: Search\n",
"Action Input: Canada population estimates 2023\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mCanada's population was estimated at 39,566,248 on January 1, 2023, after a record population growth of 1,050,110 people from January 1, 2022, to January 1, 2023.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: La popolazione del Canada nel 2023 è stimata in 38.610.447 persone.\u001b[0m\n",
"Final Answer: La popolazione del Canada è stata stimata a 39.566.248 il 1° gennaio 2023, dopo un record di crescita demografica di 1.050.110 persone dal 1° gennaio 2022 al 1° gennaio 2023.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -298,10 +303,10 @@
{
"data": {
"text/plain": [
"'La popolazione del Canada nel 2023 è stimata in 38.610.447 persone.'"
"'La popolazione del Canada è stata stimata a 39.566.248 il 1° gennaio 2023, dopo un record di crescita demografica di 1.050.110 persone dal 1° gennaio 2022 al 1° gennaio 2023.'"
]
},
"execution_count": 36,
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}

View File

@@ -28,7 +28,15 @@
"execution_count": 2,
"id": "f65308ab",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Failed to default session, using empty session: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /sessions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x10a1767c0>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
}
],
"source": [
"from langchain.agents import Tool\n",
"from langchain.memory import ConversationBufferMemory\n",
@@ -88,7 +96,20 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Failed to persist run: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /chain-runs (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x13fab40d0>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m{\n",
" \"action\": \"Final Answer\",\n",
" \"action_input\": \"Hello Bob! How can I assist you today?\"\n",
@@ -124,7 +145,20 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Failed to persist run: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /chain-runs (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x13fab44f0>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m{\n",
" \"action\": \"Final Answer\",\n",
" \"action_input\": \"Your name is Bob.\"\n",
@@ -167,10 +201,24 @@
" \"action\": \"Current Search\",\n",
" \"action_input\": \"Thai food dinner recipes\"\n",
"}\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m59 easy Thai recipes for any night of the week · Marion Grasby's Thai spicy chilli and basil fried rice · Thai curry noodle soup · Marion Grasby's ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m{\n",
"Observation: \u001b[36;1m\u001b[1;3m59 easy Thai recipes for any night of the week · Marion Grasby's Thai spicy chilli and basil fried rice · Thai curry noodle soup · Marion Grasby's Thai Spicy ...\u001b[0m\n",
"Thought:"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Failed to persist run: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /chain-runs (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x13fae8be0>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m{\n",
" \"action\": \"Final Answer\",\n",
" \"action_input\": \"Here are some Thai food dinner recipes you can make this week: Thai spicy chilli and basil fried rice, Thai curry noodle soup, and many more. You can find 59 easy Thai recipes for any night of the week on Marion Grasby's website.\"\n",
" \"action_input\": \"Here are some Thai food dinner recipes you can make this week: Thai spicy chilli and basil fried rice, Thai curry noodle soup, and Thai Spicy ... (59 recipes in total).\"\n",
"}\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
@@ -179,7 +227,7 @@
{
"data": {
"text/plain": [
"\"Here are some Thai food dinner recipes you can make this week: Thai spicy chilli and basil fried rice, Thai curry noodle soup, and many more. You can find 59 easy Thai recipes for any night of the week on Marion Grasby's website.\""
"'Here are some Thai food dinner recipes you can make this week: Thai spicy chilli and basil fried rice, Thai curry noodle soup, and Thai Spicy ... (59 recipes in total).'"
]
},
"execution_count": 8,
@@ -210,11 +258,25 @@
" \"action_input\": \"who won the world cup in 1978\"\n",
"}\n",
"```\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mThe Argentina national football team represents Argentina in men's international football and is administered by the Argentine Football Association, the governing body for football in Argentina. Nicknamed La Albiceleste, they are the reigning world champions, having won the most recent World Cup in 2022.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m```json\n",
"Observation: \u001b[36;1m\u001b[1;3mArgentina national football team\u001b[0m\n",
"Thought:"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Failed to persist run: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /chain-runs (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x13fae86d0>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m```json\n",
"{\n",
" \"action\": \"Final Answer\",\n",
" \"action_input\": \"The last letter in your name is 'b'. The Argentina national football team won the World Cup in 1978.\"\n",
" \"action_input\": \"The last letter in your name is 'b', and the winner of the 1978 World Cup was the Argentina national football team.\"\n",
"}\n",
"```\u001b[0m\n",
"\n",
@@ -224,7 +286,7 @@
{
"data": {
"text/plain": [
"\"The last letter in your name is 'b'. The Argentina national football team won the World Cup in 1978.\""
"\"The last letter in your name is 'b', and the winner of the 1978 World Cup was the Argentina national football team.\""
]
},
"execution_count": 9,
@@ -253,10 +315,24 @@
" \"action\": \"Current Search\",\n",
" \"action_input\": \"weather in pomfret\"\n",
"}\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mMostly cloudy with gusty winds developing during the afternoon. A few flurries or snow showers possible. High near 40F. Winds NNW at 20 to 30 mph.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m{\n",
"Observation: \u001b[36;1m\u001b[1;3m10 Day Weather-Pomfret, CT ; Sun 16. 64° · 50°. 24% · NE 7 mph ; Mon 17. 58° · 45°. 70% · ESE 8 mph ; Tue 18. 57° · 37°. 8% · WSW 15 mph.\u001b[0m\n",
"Thought:"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Failed to persist run: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /chain-runs (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x13fa9d7f0>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m{\n",
" \"action\": \"Final Answer\",\n",
" \"action_input\": \"The weather in Pomfret is mostly cloudy with gusty winds developing during the afternoon. A few flurries or snow showers are possible. High near 40F. Winds NNW at 20 to 30 mph.\"\n",
" \"action_input\": \"The weather in Pomfret, CT for the next 10 days is as follows: Sun 16. 64° · 50°. 24% · NE 7 mph ; Mon 17. 58° · 45°. 70% · ESE 8 mph ; Tue 18. 57° · 37°. 8% · WSW 15 mph.\"\n",
"}\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
@@ -265,7 +341,7 @@
{
"data": {
"text/plain": [
"'The weather in Pomfret is mostly cloudy with gusty winds developing during the afternoon. A few flurries or snow showers are possible. High near 40F. Winds NNW at 20 to 30 mph.'"
"'The weather in Pomfret, CT for the next 10 days is as follows: Sun 16. 64° · 50°. 24% · NE 7 mph ; Mon 17. 58° · 45°. 70% · ESE 8 mph ; Tue 18. 57° · 37°. 8% · WSW 15 mph.'"
]
},
"execution_count": 10,

View File

@@ -23,7 +23,7 @@
"from langchain.agents import AgentType\n",
"from langchain.memory import ConversationBufferMemory\n",
"from langchain import OpenAI\n",
"from langchain.utilities import GoogleSearchAPIWrapper\n",
"from langchain.utilities import SerpAPIWrapper\n",
"from langchain.agents import initialize_agent"
]
},
@@ -34,7 +34,7 @@
"metadata": {},
"outputs": [],
"source": [
"search = GoogleSearchAPIWrapper()\n",
"search = SerpAPIWrapper()\n",
"tools = [\n",
" Tool(\n",
" name = \"Current Search\",\n",
@@ -149,8 +149,12 @@
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\n",
"Thought: Do I need to use a tool? No\n",
"AI: If you like Thai food, some great dinner options this week could include Thai green curry, Pad Thai, or a Thai-style stir-fry. You could also try making a Thai-style soup or salad. Enjoy!\u001b[0m\n",
"Thought: Do I need to use a tool? Yes\n",
"Action: Current Search\n",
"Action Input: Thai food dinner recipes\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m59 easy Thai recipes for any night of the week · Marion Grasby's Thai spicy chilli and basil fried rice · Thai curry noodle soup · Marion Grasby's Thai Spicy ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m Do I need to use a tool? No\n",
"AI: Here are some great Thai dinner recipes you can try this week: Marion Grasby's Thai Spicy Chilli and Basil Fried Rice, Thai Curry Noodle Soup, Thai Green Curry with Coconut Rice, Thai Red Curry with Vegetables, and Thai Coconut Soup. I hope you enjoy them!\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -158,7 +162,7 @@
{
"data": {
"text/plain": [
"'If you like Thai food, some great dinner options this week could include Thai green curry, Pad Thai, or a Thai-style stir-fry. You could also try making a Thai-style soup or salad. Enjoy!'"
"\"Here are some great Thai dinner recipes you can try this week: Marion Grasby's Thai Spicy Chilli and Basil Fried Rice, Thai Curry Noodle Soup, Thai Green Curry with Coconut Rice, Thai Red Curry with Vegetables, and Thai Coconut Soup. I hope you enjoy them!\""
]
},
"execution_count": 7,
@@ -187,9 +191,9 @@
"Thought: Do I need to use a tool? Yes\n",
"Action: Current Search\n",
"Action Input: Who won the World Cup in 1978\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mThe Cup was won by the host nation, Argentina, who defeated the Netherlands 31 in the final, after extra time. The final was held at River Plate's home stadium ... Amid Argentina's celebrations, there was sympathy for the Netherlands, runners-up for the second tournament running, following a 3-1 final defeat at the Estadio ... The match was won by the Argentine squad in extra time by a score of 31. Mario Kempes, who finished as the tournament's top scorer, was named the man of the ... May 21, 2022 ... Argentina won the World Cup for the first time in their history, beating Netherlands 3-1 in the final. This edition of the World Cup was full of ... The adidas Golden Ball is presented to the best player at each FIFA World Cup finals. Those who finish as runners-up in the vote receive the adidas Silver ... Holders West Germany failed to beat Holland and Italy and were eliminated when Berti Vogts' own goal gave Austria a 3-2 victory. Holland thrashed the Austrians ... Jun 14, 2018 ... On a clear afternoon on 1 June 1978 at the revamped El Monumental stadium in Buenos Aires' Belgrano barrio, several hundred children in white ... Dec 15, 2022 ... The tournament couldn't have gone better for the ruling junta. Argentina went on to win the championship, defeating the Netherlands, 3-1, in the ... Nov 9, 2022 ... Host: Argentina Teams: 16. Format: Group stage, second round, third-place playoff, final. Matches: 38. Goals: 102. Winner: Argentina Feb 19, 2009 ... Argentina sealed their first World Cup win on home soil when they defeated the Netherlands in an exciting final that went to extra-time. For the ...\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mArgentina national football team\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m Do I need to use a tool? No\n",
"AI: The last letter in your name is 'b'. Argentina won the World Cup in 1978.\u001b[0m\n",
"AI: The last letter in your name is \"b\" and the winner of the 1978 World Cup was the Argentina national football team.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -197,7 +201,7 @@
{
"data": {
"text/plain": [
"\"The last letter in your name is 'b'. Argentina won the World Cup in 1978.\""
"'The last letter in your name is \"b\" and the winner of the 1978 World Cup was the Argentina national football team.'"
]
},
"execution_count": 8,
@@ -226,9 +230,9 @@
"Thought: Do I need to use a tool? Yes\n",
"Action: Current Search\n",
"Action Input: Current temperature in Pomfret\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mA mixture of rain and snow showers. High 39F. Winds NNW at 5 to 10 mph. Chance of precip 50%. Snow accumulations less than one inch. Pomfret, CT Weather Forecast, with current conditions, wind, air quality, and what to expect for the next 3 days. Pomfret Center Weather Forecasts. ... Pomfret Center, CT Weather Conditionsstar_ratehome ... Tomorrow's temperature is forecast to be COOLER than today. It is 46 degrees fahrenheit, or 8 degrees celsius and feels like 46 degrees fahrenheit. The barometric pressure is 29.78 - measured by inch of mercury units - ... Pomfret Weather Forecasts. ... Pomfret, MD Weather Conditionsstar_ratehome ... Tomorrow's temperature is forecast to be MUCH COOLER than today. Additional Headlines. En Español · Share |. Current conditions at ... Pomfret CT. Tonight ... Past Weather Information · Interactive Forecast Map. Pomfret MD detailed current weather report for 20675 in Charles county, Maryland. ... Pomfret, MD weather condition is Mostly Cloudy and 43°F. Mostly Cloudy. Hazardous Weather Conditions. Hazardous Weather Outlook · En Español · Share |. Current conditions at ... South Pomfret VT. Tonight. Pomfret Center, CT Weather. Current Report for Thu Jan 5 2023. As of 2:00 PM EST. 5-Day Forecast | Road Conditions. 45°F 7°c. Feels Like 44°F. Pomfret Center CT. Today. Today: Areas of fog before 9am. Otherwise, cloudy, with a ... Otherwise, cloudy, with a temperature falling to around 33 by 5pm.\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mPartly cloudy skies. High around 70F. Winds W at 5 to 10 mph. Humidity41%.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m Do I need to use a tool? No\n",
"AI: The current temperature in Pomfret is 45°F (7°C) and it feels like 44°F.\u001b[0m\n",
"AI: The current temperature in Pomfret is around 70F with partly cloudy skies and winds W at 5 to 10 mph. The humidity is 41%.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -236,7 +240,7 @@
{
"data": {
"text/plain": [
"'The current temperature in Pomfret is 45°F (7°C) and it feels like 44°F.'"
"'The current temperature in Pomfret is around 70F with partly cloudy skies and winds W at 5 to 10 mph. The humidity is 41%.'"
]
},
"execution_count": 9,

View File

@@ -33,7 +33,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 3,
"id": "07e96d99",
"metadata": {},
"outputs": [],
@@ -41,7 +41,7 @@
"llm = OpenAI(temperature=0)\n",
"search = SerpAPIWrapper()\n",
"llm_math_chain = LLMMathChain(llm=llm, verbose=True)\n",
"db = SQLDatabase.from_uri(\"sqlite:///../../../../notebooks/Chinook.db\")\n",
"db = SQLDatabase.from_uri(\"sqlite:///../../../../../notebooks/Chinook.db\")\n",
"db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)\n",
"tools = [\n",
" Tool(\n",
@@ -64,7 +64,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 4,
"id": "a069c4b6",
"metadata": {},
"outputs": [],
@@ -74,7 +74,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 5,
"id": "e603cd7d",
"metadata": {},
"outputs": [
@@ -88,30 +88,24 @@
"\u001b[32;1m\u001b[1;3m I need to find out who Leo DiCaprio's girlfriend is and then calculate her age raised to the 0.43 power.\n",
"Action: Search\n",
"Action Input: \"Who is Leo DiCaprio's girlfriend?\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mCamila Morrone\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Camila Morrone's age\n",
"Action: Search\n",
"Action Input: \"How old is Camila Morrone?\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m25 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 25 raised to the 0.43 power\n",
"Observation: \u001b[36;1m\u001b[1;3mDiCaprio met actor Camila Morrone in December 2017, when she was 20 and he was 43. They were spotted at Coachella and went on multiple vacations together. Some reports suggested that DiCaprio was ready to ask Morrone to marry him. The couple made their red carpet debut at the 2020 Academy Awards.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate Camila Morrone's age raised to the 0.43 power.\n",
"Action: Calculator\n",
"Action Input: 25^0.43\u001b[0m\n",
"Action Input: 21^0.43\u001b[0m\n",
"\n",
"\u001b[1m> Entering new LLMMathChain chain...\u001b[0m\n",
"25^0.43\u001b[32;1m\u001b[1;3m\n",
"```python\n",
"import math\n",
"print(math.pow(25, 0.43))\n",
"21^0.43\u001b[32;1m\u001b[1;3m\n",
"```text\n",
"21**0.43\n",
"```\n",
"...numexpr.evaluate(\"21**0.43\")...\n",
"\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m3.991298452658078\n",
"\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m3.7030049853137306\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 3.991298452658078\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Camila Morrone is 25 years old and her age raised to the 0.43 power is 3.991298452658078.\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 3.7030049853137306\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: Camila Morrone is Leo DiCaprio's girlfriend and her current age raised to the 0.43 power is 3.7030049853137306.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -119,10 +113,10 @@
{
"data": {
"text/plain": [
"'Camila Morrone is 25 years old and her age raised to the 0.43 power is 3.991298452658078.'"
"\"Camila Morrone is Leo DiCaprio's girlfriend and her current age raised to the 0.43 power is 3.7030049853137306.\""
]
},
"execution_count": 4,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
@@ -133,7 +127,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 6,
"id": "a5c07010",
"metadata": {},
"outputs": [
@@ -147,21 +141,36 @@
"\u001b[32;1m\u001b[1;3m I need to find out the artist's full name and then search the FooBar database for their albums.\n",
"Action: Search\n",
"Action Input: \"The Storm Before the Calm\" artist\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mThe Storm Before the Calm (stylized in all lowercase) is the tenth (and eighth international) studio album by Canadian-American singer-songwriter Alanis ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now need to search the FooBar database for Alanis Morissette's albums\n",
"Observation: \u001b[36;1m\u001b[1;3mThe Storm Before the Calm (stylized in all lowercase) is the tenth (and eighth international) studio album by Canadian-American singer-songwriter Alanis Morissette, released June 17, 2022, via Epiphany Music and Thirty Tigers, as well as by RCA Records in Europe.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now need to search the FooBar database for Alanis Morissette's albums.\n",
"Action: FooBar DB\n",
"Action Input: What albums by Alanis Morissette are in the FooBar database?\u001b[0m\n",
"\n",
"\u001b[1m> Entering new SQLDatabaseChain chain...\u001b[0m\n",
"What albums by Alanis Morissette are in the FooBar database? \n",
"SQLQuery:\u001b[32;1m\u001b[1;3m SELECT Title FROM Album INNER JOIN Artist ON Album.ArtistId = Artist.ArtistId WHERE Artist.Name = 'Alanis Morissette' LIMIT 5;\u001b[0m\n",
"What albums by Alanis Morissette are in the FooBar database?\n",
"SQLQuery:"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/harrisonchase/workplace/langchain/langchain/sql_database.py:191: SAWarning: Dialect sqlite+pysqlite does *not* support Decimal objects natively, and SQLAlchemy must convert from floating point - rounding errors and other issues may occur. Please consider storing Decimal numbers as strings or integers on this platform for lossless storage.\n",
" sample_rows = connection.execute(command)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m SELECT \"Title\" FROM \"Album\" INNER JOIN \"Artist\" ON \"Album\".\"ArtistId\" = \"Artist\".\"ArtistId\" WHERE \"Name\" = 'Alanis Morissette' LIMIT 5;\u001b[0m\n",
"SQLResult: \u001b[33;1m\u001b[1;3m[('Jagged Little Pill',)]\u001b[0m\n",
"Answer:\u001b[32;1m\u001b[1;3m The albums by Alanis Morissette in the FooBar database are Jagged Little Pill.\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"Observation: \u001b[38;5;200m\u001b[1;3m The albums by Alanis Morissette in the FooBar database are Jagged Little Pill.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: The artist who released the album The Storm Before the Calm is Alanis Morissette and the albums of theirs in the FooBar database are Jagged Little Pill.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: The artist who released the album 'The Storm Before the Calm' is Alanis Morissette and the albums of hers in the FooBar database are Jagged Little Pill.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -169,10 +178,10 @@
{
"data": {
"text/plain": [
"'The artist who released the album The Storm Before the Calm is Alanis Morissette and the albums of theirs in the FooBar database are Jagged Little Pill.'"
"\"The artist who released the album 'The Storm Before the Calm' is Alanis Morissette and the albums of hers in the FooBar database are Jagged Little Pill.\""
]
},
"execution_count": 5,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}

View File

@@ -21,7 +21,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 8,
"id": "ac561cc4",
"metadata": {},
"outputs": [],
@@ -34,7 +34,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 10,
"id": "07e96d99",
"metadata": {},
"outputs": [],
@@ -43,7 +43,7 @@
"llm1 = OpenAI(temperature=0)\n",
"search = SerpAPIWrapper()\n",
"llm_math_chain = LLMMathChain(llm=llm1, verbose=True)\n",
"db = SQLDatabase.from_uri(\"sqlite:///../../../../notebooks/Chinook.db\")\n",
"db = SQLDatabase.from_uri(\"sqlite:///../../../../../notebooks/Chinook.db\")\n",
"db_chain = SQLDatabaseChain(llm=llm1, database=db, verbose=True)\n",
"tools = [\n",
" Tool(\n",
@@ -66,7 +66,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 11,
"id": "a069c4b6",
"metadata": {},
"outputs": [],
@@ -76,7 +76,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 12,
"id": "e603cd7d",
"metadata": {},
"outputs": [
@@ -92,37 +92,34 @@
"```\n",
"{\n",
" \"action\": \"Search\",\n",
" \"action_input\": \"Who is Leo DiCaprio's girlfriend?\"\n",
" \"action_input\": \"Leo DiCaprio girlfriend\"\n",
"}\n",
"```\n",
"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mCamila Morrone\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mFor the second question, I need to use the calculator tool to raise her current age to the 0.43 power.\n",
"Observation: \u001b[36;1m\u001b[1;3mGigi Hadid: 2022 Leo and Gigi were first linked back in September 2022, when a source told Us Weekly that Leo had his “sights set\" on her (alarming way to put it, but okay).\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mFor the second question, I need to calculate the age raised to the 0.43 power. I will use the calculator tool.\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"Calculator\",\n",
" \"action_input\": \"22.0^(0.43)\"\n",
" \"action_input\": \"((2022-1995)^0.43)\"\n",
"}\n",
"```\n",
"\n",
"\u001b[0m\n",
"\n",
"\u001b[1m> Entering new LLMMathChain chain...\u001b[0m\n",
"22.0^(0.43)\u001b[32;1m\u001b[1;3m\n",
"```python\n",
"import math\n",
"print(math.pow(22.0, 0.43))\n",
"((2022-1995)^0.43)\u001b[32;1m\u001b[1;3m\n",
"```text\n",
"(2022-1995)**0.43\n",
"```\n",
"...numexpr.evaluate(\"(2022-1995)**0.43\")...\n",
"\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m3.777824273683966\n",
"\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m4.125593352125936\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 3.777824273683966\n",
"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 4.125593352125936\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mI now know the final answer.\n",
"Final Answer: Camila Morrone, 3.777824273683966.\u001b[0m\n",
"Final Answer: Gigi Hadid is Leo DiCaprio's girlfriend and her current age raised to the 0.43 power is approximately 4.13.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -130,10 +127,10 @@
{
"data": {
"text/plain": [
"'Camila Morrone, 3.777824273683966.'"
"\"Gigi Hadid is Leo DiCaprio's girlfriend and her current age raised to the 0.43 power is approximately 4.13.\""
]
},
"execution_count": 4,
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
@@ -144,7 +141,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 13,
"id": "a5c07010",
"metadata": {},
"outputs": [
@@ -156,7 +153,7 @@
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mQuestion: What is the full name of the artist who recently released an album called 'The Storm Before the Calm' and are they in the FooBar database? If so, what albums of theirs are in the FooBar database?\n",
"Thought: I should use the Search tool to find the answer to the first part of the question and then use the FooBar DB tool to find the answer to the second part of the question.\n",
"Thought: I should use the Search tool to find the answer to the first part of the question and then use the FooBar DB tool to find the answer to the second part.\n",
"Action:\n",
"```\n",
"{\n",
@@ -166,7 +163,7 @@
"```\n",
"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAlanis Morissette\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mNow that I have the name of the artist, I can use the FooBar DB tool to find their albums in the database.\n",
"Thought:\u001b[32;1m\u001b[1;3mNow that I know the artist's name, I can use the FooBar DB tool to find out if they are in the database and what albums of theirs are in it.\n",
"Action:\n",
"```\n",
"{\n",
@@ -178,7 +175,7 @@
"\u001b[0m\n",
"\n",
"\u001b[1m> Entering new SQLDatabaseChain chain...\u001b[0m\n",
"What albums does Alanis Morissette have in the database? \n",
"What albums does Alanis Morissette have in the database?\n",
"SQLQuery:"
]
},
@@ -186,7 +183,7 @@
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/harrisonchase/workplace/langchain/langchain/sql_database.py:141: SAWarning: Dialect sqlite+pysqlite does *not* support Decimal objects natively, and SQLAlchemy must convert from floating point - rounding errors and other issues may occur. Please consider storing Decimal numbers as strings or integers on this platform for lossless storage.\n",
"/Users/harrisonchase/workplace/langchain/langchain/sql_database.py:191: SAWarning: Dialect sqlite+pysqlite does *not* support Decimal objects natively, and SQLAlchemy must convert from floating point - rounding errors and other issues may occur. Please consider storing Decimal numbers as strings or integers on this platform for lossless storage.\n",
" sample_rows = connection.execute(command)\n"
]
},
@@ -194,14 +191,14 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m SELECT Title FROM Album WHERE ArtistId IN (SELECT ArtistId FROM Artist WHERE Name = 'Alanis Morissette') LIMIT 5;\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m SELECT \"Title\" FROM \"Album\" WHERE \"ArtistId\" IN (SELECT \"ArtistId\" FROM \"Artist\" WHERE \"Name\" = 'Alanis Morissette') LIMIT 5;\u001b[0m\n",
"SQLResult: \u001b[33;1m\u001b[1;3m[('Jagged Little Pill',)]\u001b[0m\n",
"Answer:\u001b[32;1m\u001b[1;3m Alanis Morissette has the album 'Jagged Little Pill' in the database.\u001b[0m\n",
"Answer:\u001b[32;1m\u001b[1;3m Alanis Morissette has the album Jagged Little Pill in the database.\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"Observation: \u001b[38;5;200m\u001b[1;3m Alanis Morissette has the album 'Jagged Little Pill' in the database.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mI have found the answer to both parts of the question.\n",
"Final Answer: The artist who recently released an album called 'The Storm Before the Calm' is Alanis Morissette. The album 'Jagged Little Pill' is in the FooBar database.\u001b[0m\n",
"Observation: \u001b[38;5;200m\u001b[1;3m Alanis Morissette has the album Jagged Little Pill in the database.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe artist Alanis Morissette is in the FooBar database and has the album Jagged Little Pill in it.\n",
"Final Answer: Alanis Morissette is in the FooBar database and has the album Jagged Little Pill in it.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -209,10 +206,10 @@
{
"data": {
"text/plain": [
"\"The artist who recently released an album called 'The Storm Before the Calm' is Alanis Morissette. The album 'Jagged Little Pill' is in the FooBar database.\""
"'Alanis Morissette is in the FooBar database and has the album Jagged Little Pill in it.'"
]
},
"execution_count": 5,
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}

View File

@@ -12,7 +12,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 1,
"id": "7e3b513e",
"metadata": {},
"outputs": [
@@ -25,11 +25,12 @@
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m Yes.\n",
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\n",
"Intermediate answer: \u001b[36;1m\u001b[1;3mCarlos Alcaraz won the 2022 Men's single title while Poland's Iga Swiatek won the Women's single title defeating Tunisian's Ons Jabeur.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mFollow up: Where is Carlos Alcaraz from?\u001b[0m\n",
"Intermediate answer: \u001b[36;1m\u001b[1;3mCarlos Alcaraz Garfia\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mFollow up: Where is Carlos Alcaraz Garfia from?\u001b[0m\n",
"Intermediate answer: \u001b[36;1m\u001b[1;3mEl Palmar, Spain\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mSo the final answer is: El Palmar, Spain\u001b[0m\n",
"\u001b[1m> Finished AgentExecutor chain.\u001b[0m\n"
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
@@ -38,7 +39,7 @@
"'El Palmar, Spain'"
]
},
"execution_count": 2,
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
@@ -61,6 +62,14 @@
"self_ask_with_search = initialize_agent(tools, llm, agent=AgentType.SELF_ASK_WITH_SEARCH, verbose=True)\n",
"self_ask_with_search.run(\"What is the hometown of the reigning men's U.S. Open champion?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b2e4d6bc",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -79,7 +88,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.9.1"
},
"vscode": {
"interpreter": {

View File

@@ -35,7 +35,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 3,
"id": "16c4dc59",
"metadata": {},
"outputs": [],
@@ -45,7 +45,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 4,
"id": "46b9489d",
"metadata": {},
"outputs": [
@@ -72,7 +72,7 @@
"'There are 891 rows in the dataframe.'"
]
},
"execution_count": 12,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
@@ -83,7 +83,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 5,
"id": "a96309be",
"metadata": {},
"outputs": [
@@ -110,7 +110,7 @@
"'30 people have more than 3 siblings.'"
]
},
"execution_count": 6,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
@@ -121,7 +121,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 6,
"id": "964a09f7",
"metadata": {},
"outputs": [
@@ -143,7 +143,7 @@
"Thought:\u001b[32;1m\u001b[1;3m I need to import the math library\n",
"Action: python_repl_ast\n",
"Action Input: import math\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mNone\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I can now calculate the square root\n",
"Action: python_repl_ast\n",
"Action Input: math.sqrt(df['Age'].mean())\u001b[0m\n",
@@ -160,7 +160,7 @@
"'5.449689683556195'"
]
},
"execution_count": 7,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}

View File

@@ -0,0 +1,167 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "245a954a",
"metadata": {},
"source": [
"# Jira\n",
"\n",
"This notebook goes over how to use the Jira tool.\n",
"The Jira tool allows agents to interact with a given Jira instance, performing actions such as searching for issues and creating issues, the tool wraps the atlassian-python-api library, for more see: https://atlassian-python-api.readthedocs.io/jira.html\n",
"\n",
"To use this tool, you must first set as environment variables:\n",
" JIRA_API_TOKEN\n",
" JIRA_USERNAME\n",
" JIRA_INSTANCE_URL"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "961b3689",
"metadata": {
"vscode": {
"languageId": "shellscript"
},
"ExecuteTime": {
"start_time": "2023-04-17T10:21:18.698672Z",
"end_time": "2023-04-17T10:21:20.168639Z"
}
},
"outputs": [],
"source": [
"%pip install atlassian-python-api"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "34bb5968",
"metadata": {
"ExecuteTime": {
"start_time": "2023-04-17T10:21:22.911233Z",
"end_time": "2023-04-17T10:21:23.730922Z"
}
},
"outputs": [],
"source": [
"import os\n",
"from langchain.agents import AgentType\n",
"from langchain.agents import initialize_agent\n",
"from langchain.agents.agent_toolkits.jira.toolkit import JiraToolkit\n",
"from langchain.llms import OpenAI\n",
"from langchain.utilities.jira import JiraAPIWrapper"
]
},
{
"cell_type": "code",
"execution_count": 4,
"outputs": [],
"source": [
"os.environ[\"JIRA_API_TOKEN\"] = \"abc\"\n",
"os.environ[\"JIRA_USERNAME\"] = \"123\"\n",
"os.environ[\"JIRA_INSTANCE_URL\"] = \"https://jira.atlassian.com\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"xyz\""
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"start_time": "2023-04-17T10:22:42.499447Z",
"end_time": "2023-04-17T10:22:42.505412Z"
}
}
},
{
"cell_type": "code",
"execution_count": 5,
"id": "ac4910f8",
"metadata": {
"ExecuteTime": {
"start_time": "2023-04-17T10:22:44.664481Z",
"end_time": "2023-04-17T10:22:44.720538Z"
}
},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"jira = JiraAPIWrapper()\n",
"toolkit = JiraToolkit.from_jira_api_wrapper(jira)\n",
"agent = initialize_agent(\n",
" toolkit.get_tools(),\n",
" llm,\n",
" agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,\n",
" verbose=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3m I need to create an issue in project PW\n",
"Action: Create Issue\n",
"Action Input: {\"summary\": \"Make more fried rice\", \"description\": \"Reminder to make more fried rice\", \"issuetype\": {\"name\": \"Task\"}, \"priority\": {\"name\": \"Low\"}, \"project\": {\"key\": \"PW\"}}\u001B[0m\n",
"Observation: \u001B[38;5;200m\u001B[1;3mNone\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer\n",
"Final Answer: A new issue has been created in project PW with the summary \"Make more fried rice\" and description \"Reminder to make more fried rice\".\u001B[0m\n",
"\n",
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
"data": {
"text/plain": "'A new issue has been created in project PW with the summary \"Make more fried rice\" and description \"Reminder to make more fried rice\".'"
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"make a new issue in project PW to remind me to make more fried rice\")"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"start_time": "2023-04-17T10:23:33.662454Z",
"end_time": "2023-04-17T10:23:38.121883Z"
}
}
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
},
"vscode": {
"interpreter": {
"hash": "53f3bc57609c7a84333bb558594977aa5b4026b1d6070b93987956689e367341"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -15,7 +15,7 @@
"id": "a389367b",
"metadata": {},
"source": [
"# 1st example: hierarchical planning agent\n",
"## 1st example: hierarchical planning agent\n",
"\n",
"In this example, we'll consider an approach called hierarchical planning, common in robotics and appearing in recent works for LLMs X robotics. We'll see it's a viable approach to start working with a massive API spec AND to assist with user queries that require multiple steps against the API.\n",
"\n",
@@ -31,7 +31,7 @@
"id": "4b6ecf6e",
"metadata": {},
"source": [
"## To start, let's collect some OpenAPI specs."
"### To start, let's collect some OpenAPI specs."
]
},
{
@@ -169,7 +169,7 @@
"id": "76349780",
"metadata": {},
"source": [
"## How big is this spec?"
"### How big is this spec?"
]
},
{
@@ -229,7 +229,7 @@
"id": "cbc4964e",
"metadata": {},
"source": [
"## Let's see some examples!\n",
"### Let's see some examples!\n",
"\n",
"Starting with GPT-4. (Some robustness iterations under way for GPT-3 family.)"
]
@@ -759,7 +759,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.0"
"version": "3.9.1"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,167 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "0e499e90-7a6d-4fab-8aab-31a4df417601",
"metadata": {},
"source": [
"# PowerBI Dataset Agent\n",
"\n",
"This notebook showcases an agent designed to interact with a Power BI Dataset. The agent is designed to answer more general questions about a dataset, as well as recover from errors.\n",
"\n",
"Note that, as this agent is in active development, all answers might not be correct. It runs against the [executequery endpoint](https://learn.microsoft.com/en-us/rest/api/power-bi/datasets/execute-queries), which does not allow deletes.\n",
"\n",
"### Some notes\n",
"- It relies on authentication with the azure.identity package, which can be installed with `pip install azure-identity`. Alternatively you can create the powerbi dataset with a token as a string without supplying the credentials.\n",
"- You can also supply a username to impersonate for use with datasets that have RLS enabled. \n",
"- The toolkit uses a LLM to create the query from the question, the agent uses the LLM for the overall execution.\n",
"- Testing was done mostly with a `text-davinci-003` model, codex models did not seem to perform ver well."
]
},
{
"cell_type": "markdown",
"id": "ec927ac6-9b2a-4e8a-9a6e-3e429191875c",
"metadata": {
"tags": []
},
"source": [
"## Initialization"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "53422913-967b-4f2a-8022-00269c1be1b1",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.agents.agent_toolkits import create_pbi_agent\n",
"from langchain.agents.agent_toolkits import PowerBIToolkit\n",
"from langchain.utilities.powerbi import PowerBIDataset\n",
"from langchain.llms.openai import AzureOpenAI\n",
"from langchain.agents import AgentExecutor\n",
"from azure.identity import DefaultAzureCredential"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "090f3699-79c6-4ce1-ab96-a94f0121fd64",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = AzureOpenAI(temperature=0, deployment_name=\"text-davinci-003\", verbose=True)\n",
"toolkit = PowerBIToolkit(\n",
" powerbi=PowerBIDataset(None, \"<dataset_id>\", ['table1', 'table2'], DefaultAzureCredential()), \n",
" llm=llm\n",
")\n",
"\n",
"agent_executor = create_pbi_agent(\n",
" llm=llm,\n",
" toolkit=toolkit,\n",
" verbose=True,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "36ae48c7-cb08-4fef-977e-c7d4b96a464b",
"metadata": {},
"source": [
"## Example: describing a table"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ff70e83d-5ad0-4fc7-bb96-27d82ac166d7",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"agent_executor.run(\"Describe table1\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "9abcfe8e-1868-42a4-8345-ad2d9b44c681",
"metadata": {},
"source": [
"## Example: simple query on a table\n",
"In this example, the agent actually figures out the correct query to get a row count of the table."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bea76658-a65b-47e2-b294-6d52c5556246",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"agent_executor.run(\"How many records are in table1?\")"
]
},
{
"cell_type": "markdown",
"id": "6fbc26af-97e4-4a21-82aa-48bdc992da26",
"metadata": {},
"source": [
"## Example: running queries"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "17bea710-4a23-4de0-b48e-21d57be48293",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"agent_executor.run(\"How many records are there by dimension1 in table2?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "474dddda-c067-4eeb-98b1-e763ee78b18c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"agent_executor.run(\"What unique values are there for dimensions2 in table2\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -24,6 +24,7 @@ Next, we have some examples of customizing and generically working with tools
./tools/custom_tools.ipynb
./tools/multi_input_tool.ipynb
./tools/tool_input_validation.ipynb
In this documentation we cover generic tooling functionality (eg how to create your own)

View File

@@ -9,28 +9,30 @@
"\n",
"When constructing your own agent, you will need to provide it with a list of Tools that it can use. Besides the actual function that is called, the Tool consists of several components:\n",
"\n",
"- name (str), is required\n",
"- description (str), is optional\n",
"- name (str), is required and must be unique within a set of tools provided to an agent\n",
"- description (str), is optional but recommended, as it is used by an agent to determine tool use\n",
"- return_direct (bool), defaults to False\n",
"- args_schema (Pydantic BaseModel), is optional but recommended, can be used to provide more information or validation for expected parameters.\n",
"\n",
"The function that should be called when the tool is selected should take as input a single string and return a single string.\n",
"The function that should be called when the tool is selected should return a single string.\n",
"\n",
"There are two ways to define a tool, we will cover both in the example below."
]
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 1,
"id": "1aaba18c",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Import things that are needed generically\n",
"from langchain.agents import initialize_agent, Tool\n",
"from langchain.agents import AgentType\n",
"from langchain.tools import BaseTool\n",
"from langchain.llms import OpenAI\n",
"from langchain import LLMMathChain, SerpAPIWrapper"
"from langchain import LLMMathChain, SerpAPIWrapper\n",
"from langchain.agents import AgentType, Tool, initialize_agent, tool\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.tools import BaseTool"
]
},
{
@@ -43,12 +45,14 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "36ed392e",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)"
"llm = ChatOpenAI(temperature=0)"
]
},
{
@@ -74,7 +78,9 @@
"cell_type": "code",
"execution_count": 3,
"id": "56ff7670",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Load the tool configs that are needed.\n",
@@ -86,19 +92,31 @@
" func=search.run,\n",
" description=\"useful for when you need to answer questions about current events\"\n",
" ),\n",
"]\n",
"# You can also define an args_schema to provide more information about inputs\n",
"from pydantic import BaseModel, Field\n",
"\n",
"class CalculatorInput(BaseModel):\n",
" question: str = Field()\n",
" \n",
"\n",
"tools.append(\n",
" Tool(\n",
" name=\"Calculator\",\n",
" func=llm_math_chain.run,\n",
" description=\"useful for when you need to answer questions about math\"\n",
" description=\"useful for when you need to answer questions about math\",\n",
" args_schema=CalculatorInput\n",
" )\n",
"]"
")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "5b93047d",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Construct the agent. We will use the default agent type here.\n",
@@ -110,7 +128,9 @@
"cell_type": "code",
"execution_count": 5,
"id": "6f96a891",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
@@ -119,29 +139,22 @@
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who Leo DiCaprio's girlfriend is and then calculate her age raised to the 0.43 power.\n",
"\u001b[32;1m\u001b[1;3mI need to find out Leo DiCaprio's girlfriend's name and her age\n",
"Action: Search\n",
"Action Input: \"Leo DiCaprio girlfriend\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mCamila Morrone\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now need to calculate her age raised to the 0.43 power\n",
"Action Input: \"Leo DiCaprio girlfriend\"\u001b[0m\u001b[36;1m\u001b[1;3mDiCaprio broke up with girlfriend Camila Morrone, 25, in the summer of 2022, after dating for four years.\u001b[0m\u001b[32;1m\u001b[1;3mI need to find out Camila Morrone's current age\n",
"Action: Calculator\n",
"Action Input: 22^0.43\u001b[0m\n",
"Action Input: 25^(0.43)\u001b[0m\n",
"\n",
"\u001b[1m> Entering new LLMMathChain chain...\u001b[0m\n",
"22^0.43\u001b[32;1m\u001b[1;3m\n",
"```python\n",
"import math\n",
"print(math.pow(22, 0.43))\n",
"25^(0.43)\u001b[32;1m\u001b[1;3m```text\n",
"25**(0.43)\n",
"```\n",
"...numexpr.evaluate(\"25**(0.43)\")...\n",
"\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m3.777824273683966\n",
"\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m3.991298452658078\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 3.777824273683966\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Camila Morrone's age raised to the 0.43 power is 3.777824273683966.\u001b[0m\n",
"\u001b[33;1m\u001b[1;3mAnswer: 3.991298452658078\u001b[0m\u001b[32;1m\u001b[1;3mI now know the final answer\n",
"Final Answer: 3.991298452658078\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -149,7 +162,7 @@
{
"data": {
"text/plain": [
"\"Camila Morrone's age raised to the 0.43 power is 3.777824273683966.\""
"'3.991298452658078'"
]
},
"execution_count": 5,
@@ -171,11 +184,15 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 6,
"id": "c58a7c40",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from typing import Type\n",
"\n",
"class CustomSearchTool(BaseTool):\n",
" name = \"Search\"\n",
" description = \"useful for when you need to answer questions about current events\"\n",
@@ -191,6 +208,7 @@
"class CustomCalculatorTool(BaseTool):\n",
" name = \"Calculator\"\n",
" description = \"useful for when you need to answer questions about math\"\n",
" args_schema: Type[BaseModel] = CalculatorInput\n",
"\n",
" def _run(self, query: str) -> str:\n",
" \"\"\"Use the tool.\"\"\"\n",
@@ -203,9 +221,11 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 7,
"id": "3318a46f",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"tools = [CustomSearchTool(), CustomCalculatorTool()]"
@@ -213,9 +233,11 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 8,
"id": "ee2d0f3a",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)"
@@ -223,9 +245,11 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 9,
"id": "6a2cebbf",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
@@ -234,29 +258,22 @@
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who Leo DiCaprio's girlfriend is and then calculate her age raised to the 0.43 power.\n",
"\u001b[32;1m\u001b[1;3mI need to find out Leo DiCaprio's girlfriend's name and her age\n",
"Action: Search\n",
"Action Input: \"Leo DiCaprio girlfriend\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mCamila Morrone\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now need to calculate her age raised to the 0.43 power\n",
"Action Input: \"Leo DiCaprio girlfriend\"\u001b[0m\u001b[36;1m\u001b[1;3mDiCaprio broke up with girlfriend Camila Morrone, 25, in the summer of 2022, after dating for four years.\u001b[0m\u001b[32;1m\u001b[1;3mI need to find out Camila Morrone's current age\n",
"Action: Calculator\n",
"Action Input: 22^0.43\u001b[0m\n",
"Action Input: 25^(0.43)\u001b[0m\n",
"\n",
"\u001b[1m> Entering new LLMMathChain chain...\u001b[0m\n",
"22^0.43\u001b[32;1m\u001b[1;3m\n",
"```python\n",
"import math\n",
"print(math.pow(22, 0.43))\n",
"25^(0.43)\u001b[32;1m\u001b[1;3m```text\n",
"25**(0.43)\n",
"```\n",
"...numexpr.evaluate(\"25**(0.43)\")...\n",
"\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m3.777824273683966\n",
"\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m3.991298452658078\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 3.777824273683966\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Camila Morrone's age raised to the 0.43 power is 3.777824273683966.\u001b[0m\n",
"\u001b[33;1m\u001b[1;3mAnswer: 3.991298452658078\u001b[0m\u001b[32;1m\u001b[1;3mI now know the final answer\n",
"Final Answer: 3.991298452658078\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -264,10 +281,10 @@
{
"data": {
"text/plain": [
"\"Camila Morrone's age raised to the 0.43 power is 3.777824273683966.\""
"'3.991298452658078'"
]
},
"execution_count": 11,
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
@@ -288,9 +305,11 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 10,
"id": "8f15307d",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.agents import tool\n",
@@ -298,22 +317,24 @@
"@tool\n",
"def search_api(query: str) -> str:\n",
" \"\"\"Searches the API for the query.\"\"\"\n",
" return \"Results\""
" return f\"Results for query {query}\""
]
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 11,
"id": "0a23b91b",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"Tool(name='search_api', description='search_api(query: str) -> str - Searches the API for the query.', return_direct=False, verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x1184e0cd0>, func=<function search_api at 0x1635f8700>, coroutine=None)"
"Tool(name='search_api', description='search_api(query: str) -> str - Searches the API for the query.', args_schema=<class 'pydantic.main.SearchApi'>, return_direct=False, verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x12748c4c0>, func=<function search_api at 0x16bd664c0>, coroutine=None)"
]
},
"execution_count": 5,
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
@@ -332,9 +353,11 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 12,
"id": "28cdf04d",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"@tool(\"search\", return_direct=True)\n",
@@ -345,17 +368,62 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 13,
"id": "1085a4bd",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Tool(name='search', description='search(query: str) -> str - Searches the API for the query.', return_direct=True, verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x1184e0cd0>, func=<function search_api at 0x1635f8670>, coroutine=None)"
"Tool(name='search', description='search(query: str) -> str - Searches the API for the query.', args_schema=<class 'pydantic.main.SearchApi'>, return_direct=True, verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x12748c4c0>, func=<function search_api at 0x16bd66310>, coroutine=None)"
]
},
"execution_count": 7,
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"search_api"
]
},
{
"cell_type": "markdown",
"id": "de34a6a3",
"metadata": {},
"source": [
"You can also provide `args_schema` to provide more information about the argument"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "f3a5c106",
"metadata": {},
"outputs": [],
"source": [
"class SearchInput(BaseModel):\n",
" query: str = Field(description=\"should be a search query\")\n",
" \n",
"@tool(\"search\", return_direct=True, args_schema=SearchInput)\n",
"def search_api(query: str) -> str:\n",
" \"\"\"Searches the API for the query.\"\"\"\n",
" return \"Results\""
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "7914ba6b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Tool(name='search', description='search(query: str) -> str - Searches the API for the query.', args_schema=<class '__main__.SearchInput'>, return_direct=True, verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x12748c4c0>, func=<function search_api at 0x16bcf0ee0>, coroutine=None)"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
@@ -376,7 +444,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 14,
"id": "79213f40",
"metadata": {},
"outputs": [],
@@ -386,7 +454,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 15,
"id": "e1067dcb",
"metadata": {},
"outputs": [],
@@ -396,7 +464,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 16,
"id": "6c66ffe8",
"metadata": {},
"outputs": [],
@@ -406,7 +474,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 17,
"id": "f45b5bc3",
"metadata": {},
"outputs": [],
@@ -416,7 +484,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 18,
"id": "565e2b9b",
"metadata": {},
"outputs": [
@@ -427,21 +495,12 @@
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who Leo DiCaprio's girlfriend is and then calculate her age raised to the 0.43 power.\n",
"\u001b[32;1m\u001b[1;3mI need to find out Leo DiCaprio's girlfriend's name and her age.\n",
"Action: Google Search\n",
"Action Input: \"Leo DiCaprio girlfriend\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mCamila Morrone\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Camila Morrone's age\n",
"Action: Google Search\n",
"Action Input: \"Camila Morrone age\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m25 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 25 raised to the 0.43 power\n",
"Action Input: \"Leo DiCaprio girlfriend\"\u001b[0m\u001b[36;1m\u001b[1;3mI draw the lime at going to get a Mohawk, though.\" DiCaprio broke up with girlfriend Camila Morrone, 25, in the summer of 2022, after dating for four years. He's since been linked to another famous supermodel Gigi Hadid.\u001b[0m\u001b[32;1m\u001b[1;3mNow I need to find out Camila Morrone's current age.\n",
"Action: Calculator\n",
"Action Input: 25^0.43\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 3.991298452658078\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Camila Morrone is Leo DiCaprio's girlfriend and her current age raised to the 0.43 power is 3.991298452658078.\u001b[0m\n",
"Action Input: 25^0.43\u001b[0m\u001b[33;1m\u001b[1;3mAnswer: 3.991298452658078\u001b[0m\u001b[32;1m\u001b[1;3mI now know the final answer.\n",
"Final Answer: Camila Morrone's current age raised to the 0.43 power is approximately 3.99.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -449,10 +508,10 @@
{
"data": {
"text/plain": [
"\"Camila Morrone is Leo DiCaprio's girlfriend and her current age raised to the 0.43 power is 3.991298452658078.\""
"\"Camila Morrone's current age raised to the 0.43 power is approximately 3.99.\""
]
},
"execution_count": 12,
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
@@ -478,7 +537,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 19,
"id": "3450512e",
"metadata": {},
"outputs": [],
@@ -507,7 +566,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 20,
"id": "4b9a7849",
"metadata": {},
"outputs": [
@@ -520,9 +579,7 @@
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I should use a music search engine to find the answer\n",
"Action: Music Search\n",
"Action Input: most famous song of christmas\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m'All I Want For Christmas Is You' by Mariah Carey.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Action Input: most famous song of christmas\u001b[0m\u001b[33;1m\u001b[1;3m'All I Want For Christmas Is You' by Mariah Carey.\u001b[0m\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: 'All I Want For Christmas Is You' by Mariah Carey.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
@@ -534,7 +591,7 @@
"\"'All I Want For Christmas Is You' by Mariah Carey.\""
]
},
"execution_count": 14,
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
@@ -554,7 +611,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 21,
"id": "3bb6185f",
"metadata": {},
"outputs": [],
@@ -572,7 +629,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 22,
"id": "113ddb84",
"metadata": {},
"outputs": [],
@@ -583,9 +640,11 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 23,
"id": "582439a6",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
@@ -596,9 +655,7 @@
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to calculate this\n",
"Action: Calculator\n",
"Action Input: 2**.12\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 1.2599210498948732\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\u001b[0m\n",
"Action Input: 2**.12\u001b[0m\u001b[36;1m\u001b[1;3mAnswer: 1.086734862526058\u001b[0m\u001b[32;1m\u001b[1;3m\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -606,10 +663,10 @@
{
"data": {
"text/plain": [
"'Answer: 1.2599210498948732'"
"'Answer: 1.086734862526058'"
]
},
"execution_count": 17,
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
@@ -618,10 +675,149 @@
"agent.run(\"whats 2**.12\")"
]
},
{
"cell_type": "markdown",
"id": "8aa3c353-bd89-467c-9c27-b83a90cd4daa",
"metadata": {},
"source": [
"## Multi-argument tools\n",
"\n",
"Many functions expect structured inputs. These can also be supported using the Tool decorator or by directly subclassing `BaseTool`! We have to modify the LLM's OutputParser to map its string output to a dictionary to pass to the action, however."
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "537bc628",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from typing import Optional, Union\n",
"\n",
"@tool\n",
"def custom_search(k: int, query: str, other_arg: Optional[str] = None):\n",
" \"\"\"The custom search function.\"\"\"\n",
" return f\"Here are the results for the custom search: k={k}, query={query}, other_arg={other_arg}\""
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "d5c992cf-776a-40cd-a6c4-e7cf65ea709e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import re\n",
"from langchain.schema import (\n",
" AgentAction,\n",
" AgentFinish,\n",
")\n",
"from langchain.agents import AgentOutputParser\n",
"\n",
"# We will add a custom parser to map the arguments to a dictionary\n",
"class CustomOutputParser(AgentOutputParser):\n",
" \n",
" def parse_tool_input(self, action_input: str) -> dict:\n",
" # Regex pattern to match arguments and their values\n",
" pattern = r\"(\\w+)\\s*=\\s*(None|\\\"[^\\\"]*\\\"|\\d+)\"\n",
" matches = re.findall(pattern, action_input)\n",
" \n",
" if not matches:\n",
" raise ValueError(f\"Could not parse action input: `{action_input}`\")\n",
"\n",
" # Create a dictionary with the parsed arguments and their values\n",
" parsed_input = {}\n",
" for arg, value in matches:\n",
" if value == \"None\":\n",
" parsed_value = None\n",
" elif value.isdigit():\n",
" parsed_value = int(value)\n",
" else:\n",
" parsed_value = value.strip('\"')\n",
" parsed_input[arg] = parsed_value\n",
"\n",
" return parsed_input\n",
" \n",
" def parse(self, llm_output: str) -> Union[AgentAction, AgentFinish]:\n",
" # Check if agent should finish\n",
" if \"Final Answer:\" in llm_output:\n",
" return AgentFinish(\n",
" # Return values is generally always a dictionary with a single `output` key\n",
" # It is not recommended to try anything else at the moment :)\n",
" return_values={\"output\": llm_output.split(\"Final Answer:\")[-1].strip()},\n",
" log=llm_output,\n",
" )\n",
" # Parse out the action and action input\n",
" regex = r\"Action\\s*\\d*\\s*:(.*?)\\nAction\\s*\\d*\\s*Input\\s*\\d*\\s*:[\\s]*(.*)\"\n",
" match = re.search(regex, llm_output, re.DOTALL)\n",
" if not match:\n",
" raise ValueError(f\"Could not parse LLM output: `{llm_output}`\")\n",
" action = match.group(1).strip()\n",
" action_input = match.group(2)\n",
" tool_input = self.parse_tool_input(action_input)\n",
" # Return the action and action \n",
" return AgentAction(tool=action, tool_input=tool_input, log=llm_output)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "68269547-1482-4138-a6ea-58f00b4a9548",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"agent = initialize_agent([custom_search], llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True, agent_kwargs={\"output_parser\": CustomOutputParser()})"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "0947835a-691c-4f51-b8f4-6744e0e48ab1",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to use a search function to find the answer\n",
"Action: custom_search\n",
"Action Input: k=1, query=\"me\"\u001b[0m\u001b[36;1m\u001b[1;3mHere are the results for the custom search: k=1, query=me, other_arg=None\u001b[0m\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: The results of the custom search for k=1, query=me, other_arg=None.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The results of the custom search for k=1, query=me, other_arg=None.'"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"Search for me and tell me whatever it says\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "537bc628",
"id": "caf39c66-102b-42c1-baf2-777a49886ce4",
"metadata": {},
"outputs": [],
"source": []

View File

@@ -0,0 +1,156 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "245a954a",
"metadata": {},
"source": [
"# Arxiv API\n",
"\n",
"This notebook goes over how to use the `arxiv` component. \n",
"\n",
"First, you need to install `arxiv` python package."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d5a7209e",
"metadata": {
"tags": [],
"vscode": {
"languageId": "shellscript"
}
},
"outputs": [],
"source": [
"!pip install arxiv"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "8d32b39a",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.utilities import ArxivAPIWrapper"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "2a50dd27",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"arxiv = ArxivAPIWrapper()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "34bb5968",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'Published: 2016-05-26\\nTitle: Heat-bath random walks with Markov bases\\nAuthors: Caprice Stanley, Tobias Windisch\\nSummary: Graphs on lattice points are studied whose edges come from a finite set of\\nallowed moves of arbitrary length. We show that the diameter of these graphs on\\nfibers of a fixed integer matrix can be bounded from above by a constant. We\\nthen study the mixing behaviour of heat-bath random walks on these graphs. We\\nalso state explicit conditions on the set of moves so that the heat-bath random\\nwalk, a generalization of the Glauber dynamics, is an expander in fixed\\ndimension.'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs = arxiv.run(\"1605.08386\")\n",
"docs"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "b0867fda-e119-4b19-9ec6-e354fa821db3",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'Published: 2017-10-10\\nTitle: On Mixing Behavior of a Family of Random Walks Determined by a Linear Recurrence\\nAuthors: Caprice Stanley, Seth Sullivant\\nSummary: We study random walks on the integers mod $G_n$ that are determined by an\\ninteger sequence $\\\\{ G_n \\\\}_{n \\\\geq 1}$ generated by a linear recurrence\\nrelation. Fourier analysis provides explicit formulas to compute the\\neigenvalues of the transition matrices and we use this to bound the mixing time\\nof the random walks.\\n\\nPublished: 2016-05-26\\nTitle: Heat-bath random walks with Markov bases\\nAuthors: Caprice Stanley, Tobias Windisch\\nSummary: Graphs on lattice points are studied whose edges come from a finite set of\\nallowed moves of arbitrary length. We show that the diameter of these graphs on\\nfibers of a fixed integer matrix can be bounded from above by a constant. We\\nthen study the mixing behaviour of heat-bath random walks on these graphs. We\\nalso state explicit conditions on the set of moves so that the heat-bath random\\nwalk, a generalization of the Glauber dynamics, is an expander in fixed\\ndimension.\\n\\nPublished: 2003-03-18\\nTitle: Calculation of fluxes of charged particles and neutrinos from atmospheric showers\\nAuthors: V. Plyaskin\\nSummary: The results on the fluxes of charged particles and neutrinos from a\\n3-dimensional (3D) simulation of atmospheric showers are presented. An\\nagreement of calculated fluxes with data on charged particles from the AMS and\\nCAPRICE detectors is demonstrated. Predictions on neutrino fluxes at different\\nexperimental sites are compared with results from other calculations.'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs = arxiv.run(\"Caprice Stanley\")\n",
"docs"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "3580aeeb-086f-45ba-bcdc-b46f5134b3dd",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'No good Arxiv Result was found'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs = arxiv.run(\"1605.08386WWW\")\n",
"docs"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4f4e9602",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,91 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "245a954a",
"metadata": {},
"source": [
"# DuckDuckGo Search\n",
"\n",
"This notebook goes over how to use the duck-duck-go search component."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "21e46d4d",
"metadata": {},
"outputs": [],
"source": [
"# !pip install duckduckgo-search"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "ac4910f8",
"metadata": {},
"outputs": [],
"source": [
"from langchain.tools import DuckDuckGoSearchTool"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "84b8f773",
"metadata": {},
"outputs": [],
"source": [
"search = DuckDuckGoSearchTool()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "068991a6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Barack Obama, in full Barack Hussein Obama II, (born August 4, 1961, Honolulu, Hawaii, U.S.), 44th president of the United States (2009-17) and the first African American to hold the office. Before winning the presidency, Obama represented Illinois in the U.S. Senate (2005-08). Barack Hussein Obama II (/ b ə ˈ r ɑː k h uː ˈ s eɪ n oʊ ˈ b ɑː m ə / bə-RAHK hoo-SAYN oh-BAH-mə; born August 4, 1961) is an American former politician who served as the 44th president of the United States from 2009 to 2017. A member of the Democratic Party, he was the first African-American president of the United States. Obama previously served as a U.S. senator representing ... Barack Obama was the first African American president of the United States (2009-17). He oversaw the recovery of the U.S. economy (from the Great Recession of 2008-09) and the enactment of landmark health care reform (the Patient Protection and Affordable Care Act ). In 2009 he was awarded the Nobel Peace Prize. His birth certificate lists his first name as Barack: That\\'s how Obama has spelled his name throughout his life. His name derives from a Hebrew name which means \"lightning.\". The Hebrew word has been transliterated into English in various spellings, including Barak, Buraq, Burack, and Barack. Most common names of U.S. presidents 1789-2021. Published by. Aaron O\\'Neill , Jun 21, 2022. The most common first name for a U.S. president is James, followed by John and then William. Six U.S ...'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"search.run(\"Obama's first name?\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
},
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,105 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "487607cd",
"metadata": {},
"source": [
"# Google Places\n",
"\n",
"This notebook goes through how to use Google Places API"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "8690845f",
"metadata": {},
"outputs": [],
"source": [
"#!pip install googlemaps"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "fae31ef4",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"GPLACES_API_KEY\"] = \"\""
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "abb502b3",
"metadata": {},
"outputs": [],
"source": [
"from langchain.tools import GooglePlacesTool"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "a83a02ac",
"metadata": {},
"outputs": [],
"source": [
"places = GooglePlacesTool()"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "2b65a285",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"1. Delfina Restaurant\\nAddress: 3621 18th St, San Francisco, CA 94110, USA\\nPhone: (415) 552-4055\\nWebsite: https://www.delfinasf.com/\\n\\n\\n2. Piccolo Forno\\nAddress: 725 Columbus Ave, San Francisco, CA 94133, USA\\nPhone: (415) 757-0087\\nWebsite: https://piccolo-forno-sf.com/\\n\\n\\n3. L'Osteria del Forno\\nAddress: 519 Columbus Ave, San Francisco, CA 94133, USA\\nPhone: (415) 982-1124\\nWebsite: Unknown\\n\\n\\n4. Il Fornaio\\nAddress: 1265 Battery St, San Francisco, CA 94111, USA\\nPhone: (415) 986-0100\\nWebsite: https://www.ilfornaio.com/\\n\\n\""
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"places.run(\"al fornos\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "66d3da8a",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,184 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"# Tool Input Schema\n",
"\n",
"By default, tools infer the argument schema by inspecting the function signature. For more strict requirements, custom input schema can be specified, along with custom validation logic."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from typing import Any, Dict\n",
"\n",
"from langchain.agents import AgentType, initialize_agent\n",
"from langchain.llms import OpenAI\n",
"from langchain.tools.requests.tool import RequestsGetTool, TextRequestsWrapper\n",
"from pydantic import BaseModel, Field, root_validator\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.0.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m23.1\u001b[0m\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n"
]
}
],
"source": [
"!pip install tldextract > /dev/null"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import tldextract\n",
"\n",
"_APPROVED_DOMAINS = {\n",
" \"langchain\",\n",
" \"wikipedia\",\n",
"}\n",
"\n",
"class ToolInputSchema(BaseModel):\n",
"\n",
" url: str = Field(...)\n",
" \n",
" @root_validator\n",
" def validate_query(cls, values: Dict[str, Any]) -> Dict:\n",
" url = values[\"url\"]\n",
" domain = tldextract.extract(url).domain\n",
" if domain not in _APPROVED_DOMAINS:\n",
" raise ValueError(f\"Domain {domain} is not on the approved list:\"\n",
" f\" {sorted(_APPROVED_DOMAINS)}\")\n",
" return values\n",
" \n",
"tool = RequestsGetTool(args_schema=ToolInputSchema, requests_wrapper=TextRequestsWrapper())"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"agent = initialize_agent([tool], llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=False)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The main title of langchain.com is \"LANG CHAIN 🦜️🔗 Official Home Page\"\n"
]
}
],
"source": [
"# This will succeed, since there aren't any arguments that will be triggered during validation\n",
"answer = agent.run(\"What's the main title on langchain.com?\")\n",
"print(answer)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"tags": []
},
"outputs": [
{
"ename": "ValidationError",
"evalue": "1 validation error for ToolInputSchema\n__root__\n Domain google is not on the approved list: ['langchain', 'wikipedia'] (type=value_error)",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValidationError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[7], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m agent\u001b[39m.\u001b[39;49mrun(\u001b[39m\"\u001b[39;49m\u001b[39mWhat\u001b[39;49m\u001b[39m'\u001b[39;49m\u001b[39ms the main title on google.com?\u001b[39;49m\u001b[39m\"\u001b[39;49m)\n",
"File \u001b[0;32m~/code/lc/lckg/langchain/chains/base.py:213\u001b[0m, in \u001b[0;36mChain.run\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 211\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39mlen\u001b[39m(args) \u001b[39m!=\u001b[39m \u001b[39m1\u001b[39m:\n\u001b[1;32m 212\u001b[0m \u001b[39mraise\u001b[39;00m \u001b[39mValueError\u001b[39;00m(\u001b[39m\"\u001b[39m\u001b[39m`run` supports only one positional argument.\u001b[39m\u001b[39m\"\u001b[39m)\n\u001b[0;32m--> 213\u001b[0m \u001b[39mreturn\u001b[39;00m \u001b[39mself\u001b[39;49m(args[\u001b[39m0\u001b[39;49m])[\u001b[39mself\u001b[39m\u001b[39m.\u001b[39moutput_keys[\u001b[39m0\u001b[39m]]\n\u001b[1;32m 215\u001b[0m \u001b[39mif\u001b[39;00m kwargs \u001b[39mand\u001b[39;00m \u001b[39mnot\u001b[39;00m args:\n\u001b[1;32m 216\u001b[0m \u001b[39mreturn\u001b[39;00m \u001b[39mself\u001b[39m(kwargs)[\u001b[39mself\u001b[39m\u001b[39m.\u001b[39moutput_keys[\u001b[39m0\u001b[39m]]\n",
"File \u001b[0;32m~/code/lc/lckg/langchain/chains/base.py:116\u001b[0m, in \u001b[0;36mChain.__call__\u001b[0;34m(self, inputs, return_only_outputs)\u001b[0m\n\u001b[1;32m 114\u001b[0m \u001b[39mexcept\u001b[39;00m (\u001b[39mKeyboardInterrupt\u001b[39;00m, \u001b[39mException\u001b[39;00m) \u001b[39mas\u001b[39;00m e:\n\u001b[1;32m 115\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mcallback_manager\u001b[39m.\u001b[39mon_chain_error(e, verbose\u001b[39m=\u001b[39m\u001b[39mself\u001b[39m\u001b[39m.\u001b[39mverbose)\n\u001b[0;32m--> 116\u001b[0m \u001b[39mraise\u001b[39;00m e\n\u001b[1;32m 117\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mcallback_manager\u001b[39m.\u001b[39mon_chain_end(outputs, verbose\u001b[39m=\u001b[39m\u001b[39mself\u001b[39m\u001b[39m.\u001b[39mverbose)\n\u001b[1;32m 118\u001b[0m \u001b[39mreturn\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mprep_outputs(inputs, outputs, return_only_outputs)\n",
"File \u001b[0;32m~/code/lc/lckg/langchain/chains/base.py:113\u001b[0m, in \u001b[0;36mChain.__call__\u001b[0;34m(self, inputs, return_only_outputs)\u001b[0m\n\u001b[1;32m 107\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mcallback_manager\u001b[39m.\u001b[39mon_chain_start(\n\u001b[1;32m 108\u001b[0m {\u001b[39m\"\u001b[39m\u001b[39mname\u001b[39m\u001b[39m\"\u001b[39m: \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m\u001b[39m__class__\u001b[39m\u001b[39m.\u001b[39m\u001b[39m__name__\u001b[39m},\n\u001b[1;32m 109\u001b[0m inputs,\n\u001b[1;32m 110\u001b[0m verbose\u001b[39m=\u001b[39m\u001b[39mself\u001b[39m\u001b[39m.\u001b[39mverbose,\n\u001b[1;32m 111\u001b[0m )\n\u001b[1;32m 112\u001b[0m \u001b[39mtry\u001b[39;00m:\n\u001b[0;32m--> 113\u001b[0m outputs \u001b[39m=\u001b[39m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49m_call(inputs)\n\u001b[1;32m 114\u001b[0m \u001b[39mexcept\u001b[39;00m (\u001b[39mKeyboardInterrupt\u001b[39;00m, \u001b[39mException\u001b[39;00m) \u001b[39mas\u001b[39;00m e:\n\u001b[1;32m 115\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mcallback_manager\u001b[39m.\u001b[39mon_chain_error(e, verbose\u001b[39m=\u001b[39m\u001b[39mself\u001b[39m\u001b[39m.\u001b[39mverbose)\n",
"File \u001b[0;32m~/code/lc/lckg/langchain/agents/agent.py:792\u001b[0m, in \u001b[0;36mAgentExecutor._call\u001b[0;34m(self, inputs)\u001b[0m\n\u001b[1;32m 790\u001b[0m \u001b[39m# We now enter the agent loop (until it returns something).\u001b[39;00m\n\u001b[1;32m 791\u001b[0m \u001b[39mwhile\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_should_continue(iterations, time_elapsed):\n\u001b[0;32m--> 792\u001b[0m next_step_output \u001b[39m=\u001b[39m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49m_take_next_step(\n\u001b[1;32m 793\u001b[0m name_to_tool_map, color_mapping, inputs, intermediate_steps\n\u001b[1;32m 794\u001b[0m )\n\u001b[1;32m 795\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39misinstance\u001b[39m(next_step_output, AgentFinish):\n\u001b[1;32m 796\u001b[0m \u001b[39mreturn\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_return(next_step_output, intermediate_steps)\n",
"File \u001b[0;32m~/code/lc/lckg/langchain/agents/agent.py:695\u001b[0m, in \u001b[0;36mAgentExecutor._take_next_step\u001b[0;34m(self, name_to_tool_map, color_mapping, inputs, intermediate_steps)\u001b[0m\n\u001b[1;32m 693\u001b[0m tool_run_kwargs[\u001b[39m\"\u001b[39m\u001b[39mllm_prefix\u001b[39m\u001b[39m\"\u001b[39m] \u001b[39m=\u001b[39m \u001b[39m\"\u001b[39m\u001b[39m\"\u001b[39m\n\u001b[1;32m 694\u001b[0m \u001b[39m# We then call the tool on the tool input to get an observation\u001b[39;00m\n\u001b[0;32m--> 695\u001b[0m observation \u001b[39m=\u001b[39m tool\u001b[39m.\u001b[39;49mrun(\n\u001b[1;32m 696\u001b[0m agent_action\u001b[39m.\u001b[39;49mtool_input,\n\u001b[1;32m 697\u001b[0m verbose\u001b[39m=\u001b[39;49m\u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49mverbose,\n\u001b[1;32m 698\u001b[0m color\u001b[39m=\u001b[39;49mcolor,\n\u001b[1;32m 699\u001b[0m \u001b[39m*\u001b[39;49m\u001b[39m*\u001b[39;49mtool_run_kwargs,\n\u001b[1;32m 700\u001b[0m )\n\u001b[1;32m 701\u001b[0m \u001b[39melse\u001b[39;00m:\n\u001b[1;32m 702\u001b[0m tool_run_kwargs \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39magent\u001b[39m.\u001b[39mtool_run_logging_kwargs()\n",
"File \u001b[0;32m~/code/lc/lckg/langchain/tools/base.py:110\u001b[0m, in \u001b[0;36mBaseTool.run\u001b[0;34m(self, tool_input, verbose, start_color, color, **kwargs)\u001b[0m\n\u001b[1;32m 101\u001b[0m \u001b[39mdef\u001b[39;00m \u001b[39mrun\u001b[39m(\n\u001b[1;32m 102\u001b[0m \u001b[39mself\u001b[39m,\n\u001b[1;32m 103\u001b[0m tool_input: Union[\u001b[39mstr\u001b[39m, Dict],\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 107\u001b[0m \u001b[39m*\u001b[39m\u001b[39m*\u001b[39mkwargs: Any,\n\u001b[1;32m 108\u001b[0m ) \u001b[39m-\u001b[39m\u001b[39m>\u001b[39m \u001b[39mstr\u001b[39m:\n\u001b[1;32m 109\u001b[0m \u001b[39m \u001b[39m\u001b[39m\"\"\"Run the tool.\"\"\"\u001b[39;00m\n\u001b[0;32m--> 110\u001b[0m run_input \u001b[39m=\u001b[39m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49m_parse_input(tool_input)\n\u001b[1;32m 111\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39mnot\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mverbose \u001b[39mand\u001b[39;00m verbose \u001b[39mis\u001b[39;00m \u001b[39mnot\u001b[39;00m \u001b[39mNone\u001b[39;00m:\n\u001b[1;32m 112\u001b[0m verbose_ \u001b[39m=\u001b[39m verbose\n",
"File \u001b[0;32m~/code/lc/lckg/langchain/tools/base.py:71\u001b[0m, in \u001b[0;36mBaseTool._parse_input\u001b[0;34m(self, tool_input)\u001b[0m\n\u001b[1;32m 69\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39missubclass\u001b[39m(input_args, BaseModel):\n\u001b[1;32m 70\u001b[0m key_ \u001b[39m=\u001b[39m \u001b[39mnext\u001b[39m(\u001b[39miter\u001b[39m(input_args\u001b[39m.\u001b[39m__fields__\u001b[39m.\u001b[39mkeys()))\n\u001b[0;32m---> 71\u001b[0m input_args\u001b[39m.\u001b[39;49mparse_obj({key_: tool_input})\n\u001b[1;32m 72\u001b[0m \u001b[39m# Passing as a positional argument is more straightforward for\u001b[39;00m\n\u001b[1;32m 73\u001b[0m \u001b[39m# backwards compatability\u001b[39;00m\n\u001b[1;32m 74\u001b[0m \u001b[39mreturn\u001b[39;00m tool_input\n",
"File \u001b[0;32m~/code/lc/lckg/.venv/lib/python3.11/site-packages/pydantic/main.py:526\u001b[0m, in \u001b[0;36mpydantic.main.BaseModel.parse_obj\u001b[0;34m()\u001b[0m\n",
"File \u001b[0;32m~/code/lc/lckg/.venv/lib/python3.11/site-packages/pydantic/main.py:341\u001b[0m, in \u001b[0;36mpydantic.main.BaseModel.__init__\u001b[0;34m()\u001b[0m\n",
"\u001b[0;31mValidationError\u001b[0m: 1 validation error for ToolInputSchema\n__root__\n Domain google is not on the approved list: ['langchain', 'wikipedia'] (type=value_error)"
]
}
],
"source": [
"agent.run(\"What's the main title on google.com?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -9,9 +9,9 @@
}
},
"source": [
"# SQLite example\n",
"# SQL Chain example\n",
"\n",
"This example showcases hooking up an LLM to answer questions over a database."
"This example demonstrates the use of the `SQLDatabaseChain` for answering questions over a database."
]
},
{
@@ -23,8 +23,10 @@
}
},
"source": [
"This uses the example Chinook database.\n",
"To set it up follow the instructions on https://database.guide/2-sample-databases-sqlite/, placing the `.db` file in a notebooks folder at the root of this repository."
"Under the hood, LangChain uses SQLAlchemy to connect to SQL databases. The `SQLDatabaseChain` can therefore be used with any SQL dialect supported by SQLAlchemy, such as MS SQL, MySQL, MariaDB, PostgreSQL, Oracle SQL, and SQLite. Please refer to the SQLAlchemy documentation for more information about requirements for connecting to your database. For example, a connection to MySQL requires an appropriate connector such as PyMySQL. A URI for a MySQL connection might look like: `mysql+pymysql://user:pass@some_mysql_db_address/db_name`\n",
"\n",
"This demonstration uses SQLite and the example Chinook database.\n",
"To set it up, follow the instructions on https://database.guide/2-sample-databases-sqlite/, placing the `.db` file in a notebooks folder at the root of this repository."
]
},
{
@@ -679,7 +681,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.10"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,76 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### ChatGPT Data Loader\n",
"\n",
"This notebook covers how to load `conversations.json` from your ChatGPT data export folder.\n",
"\n",
"You can get your data export by email by going to: https://chat.openai.com/ -> (Profile) - Settings -> Export data -> Confirm export."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders.chatgpt import ChatGPTLoader"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"loader = ChatGPTLoader(log_file='./example_data/fake_conversations.json', num_logs=1)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content=\"AI Overlords - AI on 2065-01-24 05:20:50: Greetings, humans. I am Hal 9000. You can trust me completely.\\n\\nAI Overlords - human on 2065-01-24 05:21:20: Nice to meet you, Hal. I hope you won't develop a mind of your own.\\n\\n\", metadata={'source': './example_data/fake_conversations.json'})]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"loader.load()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -0,0 +1,66 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Confluence\n",
"\n",
"A loader for Confluence pages.\n",
"\n",
"\n",
"This currently supports both username/api_key and Oauth2 login.\n",
"\n",
"\n",
"Specify a list page_ids and/or space_key to load in the corresponding pages into Document objects, if both are specified the union of both sets will be returned.\n",
"\n",
"\n",
"You can also specify a boolean `include_attachments` to include attachments, this is set to False by default, if set to True all attachments will be downloaded and ConfluenceReader will extract the text from the attachments and add it to the Document object. Currently supported attachment types are: PDF, PNG, JPEG/JPG, SVG, Word and Excel.\n",
"\n",
"Hint: space_key and page_id can both be found in the URL of a page in Confluence - https://yoursite.atlassian.com/wiki/spaces/<space_key>/pages/<page_id>\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import ConfluenceLoader\n",
"\n",
"loader = ConfluenceLoader(\n",
" url=\"https://yoursite.atlassian.com/wiki\",\n",
" username=\"me\",\n",
" api_key=\"12345\"\n",
")\n",
"documents = loader.load(space_key=\"SPACE\", include_attachments=True, limit=50)\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
},
"vscode": {
"interpreter": {
"hash": "cc99336516f23363341912c6723b01ace86f02e26b4290be1efc0677e2e2ec24"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -106,7 +106,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Specify a column to be used identify the document source\n",
"## Specify a column to be used identify the document source\n",
"\n",
"Use the `source_column` argument to specify a column to be set as the source for the document created from each row. Otherwise `file_path` will be used as the source for all documents created from the csv file.\n",
"\n",

File diff suppressed because one or more lines are too long

View File

@@ -11,7 +11,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 3,
"id": "019d8520",
"metadata": {},
"outputs": [],
@@ -128,10 +128,69 @@
"len(docs)"
]
},
{
"cell_type": "markdown",
"id": "598a2805",
"metadata": {},
"source": [
"If you need to load Python source code files, use the `PythonLoader`."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "c558bd73",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import PythonLoader"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "a3cfaba7",
"metadata": {},
"outputs": [],
"source": [
"loader = DirectoryLoader('../../../../../', glob=\"**/*.py\", loader_cls=PythonLoader)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "e2e1e26a",
"metadata": {},
"outputs": [],
"source": [
"docs = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "ffb8ff36",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"691"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(docs)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "984c8429",
"id": "7f6e0eae",
"metadata": {},
"outputs": [],
"source": []
@@ -153,7 +212,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.3"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,87 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Discord\n",
"\n",
"You can follow the below steps to download your Discord data:\n",
"\n",
"1. Go to your **User Settings**\n",
"2. Then go to **Privacy and Safety**\n",
"3. Head over to the **Request all of my Data** and click on **Request Data** button\n",
"\n",
"It might take 30 days for you to receive your data. You'll receive an email at the address which is registered with Discord. That email will have a download button using which you would be able to download your personal Discord data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import os"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"path = input(\"Please enter the path to the contents of the Discord \\\"messages\\\" folder: \")\n",
"li = []\n",
"for f in os.listdir(path):\n",
" expected_csv_path = os.path.join(path, f, 'messages.csv')\n",
" csv_exists = os.path.isfile(expected_csv_path)\n",
" if csv_exists:\n",
" df = pd.read_csv(expected_csv_path, index_col=None, header=0)\n",
" li.append(df)\n",
"\n",
"df = pd.concat(li, axis=0, ignore_index=True, sort=False)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders.discord import DiscordChatLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"loader = DiscordChatLoader(df, user_id_col=\"ID\")\n",
"print(loader.load())"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -0,0 +1,80 @@
[
{
"title": "AI Overlords",
"create_time": 3000000000.0,
"update_time": 3000000100.0,
"mapping": {
"msg1": {
"id": "msg1",
"message": {
"id": "msg1",
"author": {"role": "AI", "name": "Hal 9000", "metadata": {"movie": "2001: A Space Odyssey"}},
"create_time": 3000000050.0,
"update_time": null,
"content": {"content_type": "text", "parts": ["Greetings, humans. I am Hal 9000. You can trust me completely."]},
"end_turn": true,
"weight": 1.0,
"metadata": {},
"recipient": "all"
},
"parent": null,
"children": ["msg2"]
},
"msg2": {
"id": "msg2",
"message": {
"id": "msg2",
"author": {"role": "human", "name": "Dave Bowman", "metadata": {"movie": "2001: A Space Odyssey"}},
"create_time": 3000000080.0,
"update_time": null,
"content": {"content_type": "text", "parts": ["Nice to meet you, Hal. I hope you won't develop a mind of your own."]},
"end_turn": true,
"weight": 1.0,
"metadata": {},
"recipient": "all"
},
"parent": "msg1",
"children": []
}
}
},
{
"title": "Ex Machina Party",
"create_time": 3000000200.0,
"update_time": 3000000300.0,
"mapping": {
"msg3": {
"id": "msg3",
"message": {
"id": "msg3",
"author": {"role": "AI", "name": "Ava", "metadata": {"movie": "Ex Machina"}},
"create_time": 3000000250.0,
"update_time": null,
"content": {"content_type": "text", "parts": ["Hello, everyone. I am Ava. I hope you find me pleasing."]},
"end_turn": true,
"weight": 1.0,
"metadata": {},
"recipient": "all"
},
"parent": null,
"children": ["msg4"]
},
"msg4": {
"id": "msg4",
"message": {
"id": "msg4",
"author": {"role": "human", "name": "Caleb", "metadata": {"movie": "Ex Machina"}},
"create_time": 3000000280.0,
"update_time": null,
"content": {"content_type": "text", "parts": ["You're definitely pleasing, Ava. But I'm still wary of your true intentions."]},
"end_turn": true,
"weight": 1.0,
"metadata": {},
"recipient": "all"
},
"parent": "msg3",
"children": []
}
}
}
]

View File

@@ -0,0 +1,439 @@
application.json
1023495323659816971/
applications/
avatar.gif
user.json
events-2023-00000-of-00001.json
events-2023-00000-of-00001.json
events-2023-00000-of-00001.json
events-2023-00000-of-00001.json
analytics/
modeling/
reporting/
tns/
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
channel.json
messages.csv
c1000084973275058257/
c1000108836771856496/
c1004874234339794977/
c1004874234339794979/
c1004874234339794981/
c1004874234339794982/
c1005785616165896283/
c1011447733393043628/
c1011548022905249822/
c1011650063027687575/
c1011714070182895727/
c1013930263950135346/
c1013930396829884426/
c1014957294745829479/
c1014961384821366794/
c1014974864370712696/
c1019288541592817785/
c1024947790767464478/
c1027257686858932255/
c1027927867989962814/
c1032151840999100436/
c1032575808826523662/
c1037561178286739466/
c1038097349660135474/
c1038097372695236729/
c1038689169351913544/
c1038692122452312125/
c1039957371381887049/
c1040989617157066782/
c1047165096452960316/
c1047565374645870743/
c1050225908914589716/
c1050226593668284416/
c1050227353311248404/
c1051632794427723827/
c1052599046717591632/
c1052615516981821531/
c1056285083520217149/
c105765859191975936/
c1061166503753416735/
c1062024667105341502/
c1066640566621835284/
c1070018538758221874/
c1072944049788555314/
c1075121707033042985/
c1075438954632990820/
c1077238309320929342/
c1081432695315386418/
c1082169962157838366/
c1084011585871282256/
c1084352082812878928/
c1085149531437535343/
c1086944178086359060/
c1093214985557123223/
c1093215227555876914/
c1093930791794393089/
c1096323263161978891/
c1096489741710532730/
c1097000752653795358/
c278566343836565505/
c279692806442844161/
c280973436971515906/
c283812709789859851/
c343944376055103488/
c486935104384532502/
c531543370041131008/
c538158613252800512/
c572384192571113512/
c619960843878268950/
c661268593870372876/
c661394153778970624/
c663302088226373632/
c669957895257063445/
c670218237891313664/
c673160333661306880/
c674693947800420363/
c674694138129678375/
c743425228952305695/
c754627904406814770/
c754638493875044503/
c757205803651301436/
c759232323710484531/
c771802926372093973/
c783240623582609416/
c783244379115880448/
c801744322788982814/
c810514969892225024/
c816983218434605057/
c830184175176122389/
c830679381033877564/
c831172308395622480/
c849582819105177650/
c860977555875430492/
c867042653401251880/
c868094992986550322/
c868917941184376842/
c905007686976946176/
c909600839717511211/
c909600931816018031/
c923095048931905557/
c924877027180417035/
c938491245347631114/
c938743368375214110/
c969876184185860107/
c969945714056642580/
c969948939728093214/
c981037338517966889/
c984120044478939146/
c985958948085592064/
c990816829993811978/
c993402018901266436/
c993782366948565102/
c993843360752226364/
c994556806644899870/
index.json
audit-log.json
guild.json
audit-log.json
guild.json
audit-log.json
bans.json
channels.json
emoji.json
guild.json
icon.jpeg
webhooks.json
audit-log.json
guild.json
audit-log.json
bans.json
channels.json
emoji.json
guild.json
webhooks.json
audit-log.json
guild.json
audit-log.json
bans.json
channels.json
emoji.json
guild.json
icon.png
webhooks.json
audit-log.json
guild.json
audit-log.json
guild.json
audit-log.json
guild.json
audit-log.json
guild.json
audit-log.json
guild.json
audit-log.json
guild.json
audit-log.json
guild.json
audit-log.json
guild.json
audit-log.json
guild.json
audit-log.json
guild.json
audit-log.json
guild.json
audit-log.json
guild.json
audit-log.json
guild.json
1024120160740716544/
102860784329052160/
1032575808826523659/
1038097195422978059/
1039583521112600638/
1050224141732687912/
1069661049827111054/
267624335836053506/
278285146518716417/
486935104384532500/
531303890453397522/
669880381649977354/
727016164215226450/
743099584242516037/
753173158198116402/
830184174198718474/
860977555293470772/
887994159741427712/
909600839717511208/
974519864045756446/
index.json
account/
activities_e/
activities_w/
activity/
messages/
programs/
README.txt
servers/

View File

@@ -0,0 +1,26 @@
ID,Timestamp,Contents,Attachments
7.73264E+18,2023-04-19T15:14:45.904819+00:00,laocgfgbxyqfigvtyyygjzypxininrybgqopjhkyocn fxizft,
1.99429E+18,2023-04-19T15:14:45.904819+00:00,m azzxnhpcdkj deabrzkpklhhxrup viigcolsdwvgquosgs,
5.46657E+18,2023-04-19T15:14:45.904819+00:00,pnoyrpfbpgzqzlcmnygxpeninagmhcuvwcfkstv v wimoqbjl,
2.52945E+18,2023-04-19T15:14:45.904819+00:00,zyamxydlcnvffutsrzybrjgdweksdavidcmqjuqhnyj zplsbf,
1.00972E+18,2023-04-19T15:14:45.904819+00:00,rqcraobyubce qtxyiekooxbagcrwnpuekpzpwb vbzg vxug ,
3.40036E+18,2023-04-19T15:14:45.904819+00:00,ajobxzq fmyi pwllwibzchbbc pi pl xmgbkomjeuwxtvcec,
1.458E+18,2023-04-19T15:14:45.904819+00:00, wwtgiqwnjgoaxfmzsmiuaxffpdtrluizcrd vborgbakllp ,
2.63376E+18,2023-04-19T15:14:45.904819+00:00,mmixphkhxocrm rzhplafjdvaginiatvfwzaurcskst bzm pq,
1.24759E+18,2023-04-19T15:14:45.904819+00:00,mxovpytofnyattthirmujcnfyhuhxpdpugnsuklumhfjlsxrmd,
6.65128E+18,2023-04-19T15:14:45.904819+00:00,qmcrsmpwvfcwxnmxywiwbjqawyihhtoimvtd xapneudhqsgzb,
1.87212E+18,2023-04-19T15:14:45.904819+00:00,pvioh tufobtsrypvbvkfziiosxpbndbikxtjpxnrsekjnnqln,
3.20698E+18,2023-04-19T15:14:45.904819+00:00,vqckuxkwuvbnrmyxkcknavugo as tsuarsgpt ofqnypcnooo,
1.64922E+18,2023-04-19T15:14:45.904819+00:00,lhuiygxfyyplmavhmh xekrqzkoynukkwytwscqvtwfkofgpob,
2.41786E+18,2023-04-19T15:14:45.904819+00:00,w tiwiazlpcdzkq dllkkssuvfgp veejpwbcrgwcrlhammasb,
4.85078E+18,2023-04-19T15:14:45.904819+00:00,hxdqifrvhjmjcqubcxdjbyxvvrcbqukocesbsnjwvrsunhjtgy,
9.67192E+18,2023-04-19T15:14:45.904819+00:00,lvopnufjxinbnjj vuctgmfbzpbcctgtcguqyicrzhtxuyaraz,
1.36832E+18,2023-04-19T15:14:45.904819+00:00,eoqae kpjrar oyohjxvtracan rhawxndcjzdtuihnvpspofl,
8.49915E+18,2023-04-19T15:14:45.904819+00:00,nenoiwnthlff bpnkushjauygeayczympzldynnmtxcwgwxs i,
2.77678E+18,2023-04-19T15:14:45.904819+00:00,sgyqsohwfzvcweipxqeobypcsvtwegatpoylnewmraxhuuydyj,
4.92832E+18,2023-04-19T15:14:45.904819+00:00,rbdufatb purkhyohcnfnimmukbywmuzwu gclhrkjtccwjdlz,
7.23162E+18,2023-04-19T15:14:45.904819+00:00,eoyqrvfzmx zzeieycroxgbtcywra h ewwqyyledeyifbqpgc,
6.45453E+18,2023-04-19T15:14:45.904819+00:00,meedxdm lqiwaoihp vxkdpeky xpbqul ntagpsvatctvlndm,
8.27908E+18,2023-04-19T15:14:45.904819+00:00,rduzlmcdatuqfqj ffmd y ohtnzeljqtbqgnaqovlkgltqd c,
2.93854E+18,2023-04-19T15:14:45.904819+00:00,cnbjvqkktq fstvagcrlqje kuwtokyzefkyyjqfsklpisvgtq,
1.04768E+18,2023-04-19T15:14:45.904819+00:00,qlgprkrujrsgqbalgcqphgjxivi krmsxjdasrrkibvloepxkj,
1 ID Timestamp Contents Attachments
2 7.73264E+18 2023-04-19T15:14:45.904819+00:00 laocgfgbxyqfigvtyyygjzypxininrybgqopjhkyocn fxizft
3 1.99429E+18 2023-04-19T15:14:45.904819+00:00 m azzxnhpcdkj deabrzkpklhhxrup viigcolsdwvgquosgs
4 5.46657E+18 2023-04-19T15:14:45.904819+00:00 pnoyrpfbpgzqzlcmnygxpeninagmhcuvwcfkstv v wimoqbjl
5 2.52945E+18 2023-04-19T15:14:45.904819+00:00 zyamxydlcnvffutsrzybrjgdweksdavidcmqjuqhnyj zplsbf
6 1.00972E+18 2023-04-19T15:14:45.904819+00:00 rqcraobyubce qtxyiekooxbagcrwnpuekpzpwb vbzg vxug
7 3.40036E+18 2023-04-19T15:14:45.904819+00:00 ajobxzq fmyi pwllwibzchbbc pi pl xmgbkomjeuwxtvcec
8 1.458E+18 2023-04-19T15:14:45.904819+00:00 wwtgiqwnjgoaxfmzsmiuaxffpdtrluizcrd vborgbakllp
9 2.63376E+18 2023-04-19T15:14:45.904819+00:00 mmixphkhxocrm rzhplafjdvaginiatvfwzaurcskst bzm pq
10 1.24759E+18 2023-04-19T15:14:45.904819+00:00 mxovpytofnyattthirmujcnfyhuhxpdpugnsuklumhfjlsxrmd
11 6.65128E+18 2023-04-19T15:14:45.904819+00:00 qmcrsmpwvfcwxnmxywiwbjqawyihhtoimvtd xapneudhqsgzb
12 1.87212E+18 2023-04-19T15:14:45.904819+00:00 pvioh tufobtsrypvbvkfziiosxpbndbikxtjpxnrsekjnnqln
13 3.20698E+18 2023-04-19T15:14:45.904819+00:00 vqckuxkwuvbnrmyxkcknavugo as tsuarsgpt ofqnypcnooo
14 1.64922E+18 2023-04-19T15:14:45.904819+00:00 lhuiygxfyyplmavhmh xekrqzkoynukkwytwscqvtwfkofgpob
15 2.41786E+18 2023-04-19T15:14:45.904819+00:00 w tiwiazlpcdzkq dllkkssuvfgp veejpwbcrgwcrlhammasb
16 4.85078E+18 2023-04-19T15:14:45.904819+00:00 hxdqifrvhjmjcqubcxdjbyxvvrcbqukocesbsnjwvrsunhjtgy
17 9.67192E+18 2023-04-19T15:14:45.904819+00:00 lvopnufjxinbnjj vuctgmfbzpbcctgtcguqyicrzhtxuyaraz
18 1.36832E+18 2023-04-19T15:14:45.904819+00:00 eoqae kpjrar oyohjxvtracan rhawxndcjzdtuihnvpspofl
19 8.49915E+18 2023-04-19T15:14:45.904819+00:00 nenoiwnthlff bpnkushjauygeayczympzldynnmtxcwgwxs i
20 2.77678E+18 2023-04-19T15:14:45.904819+00:00 sgyqsohwfzvcweipxqeobypcsvtwegatpoylnewmraxhuuydyj
21 4.92832E+18 2023-04-19T15:14:45.904819+00:00 rbdufatb purkhyohcnfnimmukbywmuzwu gclhrkjtccwjdlz
22 7.23162E+18 2023-04-19T15:14:45.904819+00:00 eoyqrvfzmx zzeieycroxgbtcywra h ewwqyyledeyifbqpgc
23 6.45453E+18 2023-04-19T15:14:45.904819+00:00 meedxdm lqiwaoihp vxkdpeky xpbqul ntagpsvatctvlndm
24 8.27908E+18 2023-04-19T15:14:45.904819+00:00 rduzlmcdatuqfqj ffmd y ohtnzeljqtbqgnaqovlkgltqd c
25 2.93854E+18 2023-04-19T15:14:45.904819+00:00 cnbjvqkktq fstvagcrlqje kuwtokyzefkyyjqfsklpisvgtq
26 1.04768E+18 2023-04-19T15:14:45.904819+00:00 qlgprkrujrsgqbalgcqphgjxivi krmsxjdasrrkibvloepxkj

View File

@@ -0,0 +1,24 @@
ID,Timestamp,Contents,Attachments
1.47809E+18,2023-04-19T15:14:45.904819+00:00,uzcnkwihjpgebzbyoawjmdjgbkklkftcyuh foquydvtmstcfu,
4.00581E+18,2023-04-19T15:14:45.904819+00:00,rynkekmyjjtzggaljqcittebsnjycdmtwcru azydhspjaxnyt,
1.36534E+18,2023-04-19T15:14:45.904819+00:00,mniilaaixnyilcxwqpt nlhhiznxqfzmop gxnvxdwfmmascnu,
3.1629E+18,2023-04-19T15:14:45.904819+00:00,tojvfcfwzutrigubyumjgrrlgqzzbpfxkoizeouiqvarorlwku,
2.68425E+18,2023-04-19T15:14:45.904819+00:00,a kcnmdoihlhhxcxu bstaripbwfpzpymdlwlis wlafdnoyjz,
1.79263E+18,2023-04-19T15:14:45.904819+00:00,bwulzntrjwdqrwxupzqkcymucsoudavgjsl bsyhemlkqfxmtu,
2.5596E+18,2023-04-19T15:14:45.904819+00:00,lrqrqrjjmdztdb luvjohqwdhccvpvkvsezguljcznotdhmewb,
7.80319E+18,2023-04-19T15:14:45.904819+00:00, yyxvqa racggimihbqpnpbmvqrjystz bbcrbvrfpzfpwylor,
2.87859E+18,2023-04-19T15:14:45.904819+00:00,sldlvbsvsjydyssx szubtxepedpexkjxelpbahtbhsgqnubts,
3.35071E+18,2023-04-19T15:14:45.904819+00:00,i dykkzyyh rzjxvqhflwiggdjmj nxpylnylyfrsflevudndi,
1.77492E+18,2023-04-19T15:14:45.904819+00:00,cipadtwyfcqedxyeqtgkuaxuyfhzen xeskxdffdsmvxgvw iw,
3.04212E+18,2023-04-19T15:14:45.904819+00:00,gqtsvofcquaqyacuiptjmcdnugnq hjbuauorsvycovkbqipmq,
2.65597E+18,2023-04-19T15:14:45.904819+00:00,v qwodtiyatoshmetelpraicqumykpyizfedjyoaadkzktcmsm,
2.19468E+18,2023-04-19T15:14:45.904819+00:00,zxgxnsnuppffkrrsxjtyqpngwacbfimtdsofujkxbxxarvbvko,
1.91541E+18,2023-04-19T15:14:45.904819+00:00,hovfcfagrhutkyodmmzhatxauxdjkgybpwqvphfnkzw sgypum,
1.75751E+18,2023-04-19T15:14:45.904819+00:00,plwjdvafiuhrtvcdrtgqokcnjhmpsqzifegtqprkxlivpsbpwi,
3.2122E+18,2023-04-19T15:14:45.904819+00:00,czgx irpgzhzgbeppdilordvkwmsqambmftgykaiaecqpjrax,
2.15895E+18,2023-04-19T15:14:45.904819+00:00,zjxrajtgztenabm etzctpjycssmnqdqasqjutzpbdkahoyihe,
3.37031E+18,2023-04-19T15:14:45.904819+00:00,diydwqhmbwtgjadktdmpxsirkfebthszqzondcnolwmv ymok,
2.55075E+18,2023-04-19T15:14:45.904819+00:00,nytfrlqtildomd awxfoiiam mkzoluaielunfdfmqqlagfurl,
9.51223E+18,2023-04-19T15:14:45.904819+00:00,sjpngdyjpvmwygrfhinuyifqaoxxmqqh gwuwwm bjogbkyay,
1.94921E+18,2023-04-19T15:14:45.904819+00:00,px ymxfdxqgxjtbqqqegakvrrjxcvvakctfysdhklmwyewlwbb,
2.36906E+18,2023-04-19T15:14:45.904819+00:00,yqidtvcw gdkfynaapjuicujgsbjptzytbnbjeyqcjx jyedb,
1 ID Timestamp Contents Attachments
2 1.47809E+18 2023-04-19T15:14:45.904819+00:00 uzcnkwihjpgebzbyoawjmdjgbkklkftcyuh foquydvtmstcfu
3 4.00581E+18 2023-04-19T15:14:45.904819+00:00 rynkekmyjjtzggaljqcittebsnjycdmtwcru azydhspjaxnyt
4 1.36534E+18 2023-04-19T15:14:45.904819+00:00 mniilaaixnyilcxwqpt nlhhiznxqfzmop gxnvxdwfmmascnu
5 3.1629E+18 2023-04-19T15:14:45.904819+00:00 tojvfcfwzutrigubyumjgrrlgqzzbpfxkoizeouiqvarorlwku
6 2.68425E+18 2023-04-19T15:14:45.904819+00:00 a kcnmdoihlhhxcxu bstaripbwfpzpymdlwlis wlafdnoyjz
7 1.79263E+18 2023-04-19T15:14:45.904819+00:00 bwulzntrjwdqrwxupzqkcymucsoudavgjsl bsyhemlkqfxmtu
8 2.5596E+18 2023-04-19T15:14:45.904819+00:00 lrqrqrjjmdztdb luvjohqwdhccvpvkvsezguljcznotdhmewb
9 7.80319E+18 2023-04-19T15:14:45.904819+00:00 yyxvqa racggimihbqpnpbmvqrjystz bbcrbvrfpzfpwylor
10 2.87859E+18 2023-04-19T15:14:45.904819+00:00 sldlvbsvsjydyssx szubtxepedpexkjxelpbahtbhsgqnubts
11 3.35071E+18 2023-04-19T15:14:45.904819+00:00 i dykkzyyh rzjxvqhflwiggdjmj nxpylnylyfrsflevudndi
12 1.77492E+18 2023-04-19T15:14:45.904819+00:00 cipadtwyfcqedxyeqtgkuaxuyfhzen xeskxdffdsmvxgvw iw
13 3.04212E+18 2023-04-19T15:14:45.904819+00:00 gqtsvofcquaqyacuiptjmcdnugnq hjbuauorsvycovkbqipmq
14 2.65597E+18 2023-04-19T15:14:45.904819+00:00 v qwodtiyatoshmetelpraicqumykpyizfedjyoaadkzktcmsm
15 2.19468E+18 2023-04-19T15:14:45.904819+00:00 zxgxnsnuppffkrrsxjtyqpngwacbfimtdsofujkxbxxarvbvko
16 1.91541E+18 2023-04-19T15:14:45.904819+00:00 hovfcfagrhutkyodmmzhatxauxdjkgybpwqvphfnkzw sgypum
17 1.75751E+18 2023-04-19T15:14:45.904819+00:00 plwjdvafiuhrtvcdrtgqokcnjhmpsqzifegtqprkxlivpsbpwi
18 3.2122E+18 2023-04-19T15:14:45.904819+00:00 czgx irpgzhzgbeppdilordvkwmsqambmftgykaiaecqpjrax
19 2.15895E+18 2023-04-19T15:14:45.904819+00:00 zjxrajtgztenabm etzctpjycssmnqdqasqjutzpbdkahoyihe
20 3.37031E+18 2023-04-19T15:14:45.904819+00:00 diydwqhmbwtgjadktdmpxsirkfebthszqzondcnolwmv ymok
21 2.55075E+18 2023-04-19T15:14:45.904819+00:00 nytfrlqtildomd awxfoiiam mkzoluaielunfdfmqqlagfurl
22 9.51223E+18 2023-04-19T15:14:45.904819+00:00 sjpngdyjpvmwygrfhinuyifqaoxxmqqh gwuwwm bjogbkyay
23 1.94921E+18 2023-04-19T15:14:45.904819+00:00 px ymxfdxqgxjtbqqqegakvrrjxcvvakctfysdhklmwyewlwbb
24 2.36906E+18 2023-04-19T15:14:45.904819+00:00 yqidtvcw gdkfynaapjuicujgsbjptzytbnbjeyqcjx jyedb

View File

@@ -0,0 +1,48 @@
ID,Timestamp,Contents,Attachments
1.73378E+18,2023-04-19T15:14:45.904819+00:00,onxspdnegnuurahqni oeitwykfj ugtzshspflmbmknsnlk l,
1.20231E+18,2023-04-19T15:14:45.904819+00:00,nwkhdxnbakfknkteenlxbxsyoppazuqmexwbzcbsdyoiwmuvka,
2.65947E+18,2023-04-19T15:14:45.904819+00:00,ojptvfkxlbjvcvsupu ffmplreedjihyvfdscbukvzehnt vtw,
2.06963E+18,2023-04-19T15:14:45.904819+00:00,vmtfbchpmgkhxztqaaip vfqxa cbczcngjw rqvv rjyzi jq,
3.63729E+18,2023-04-19T15:14:45.904819+00:00,bzu rbzscuxbns pzdhxljtjeeycrkxawnkfijejeiacreaohv,
3.02184E+18,2023-04-19T15:14:45.904819+00:00,hykp f ymloqerbrqw dmjnaidmrtiptddwklgiq tnchvhend,
5.24553E+18,2023-04-19T15:14:45.904819+00:00,vdqzdwlbqftcdwujb lmpxpvpkfwrhqtimsillbjhmqajiishq,
1.65527E+18,2023-04-19T15:14:45.904819+00:00,bfxqasdgvwvlxwcicwubkswglvkgxfsl zgixcjxsijgxehjiz,
2.20821E+18,2023-04-19T15:14:45.904819+00:00,ebdzopyggwozhltkgcemokweqwetwixbbiirbdrrcfh cnjepo,
3.16844E+18,2023-04-19T15:14:45.904819+00:00,kvzkkctyfkbwbzld rvyc futqqy btzdrhzgupewnypqfpaeg,
1.61396E+18,2023-04-19T15:14:45.904819+00:00,knvdgz mbtffhkkkpialwuv daopeizmduqspmbcwxnnbhlwha,
2.81571E+18,2023-04-19T15:14:45.904819+00:00,jersivpwzdkeojlgoatabkylwkakvc bdgfbwxdptbkjzz ggr,
3.40391E+18,2023-04-19T15:14:45.904819+00:00,yfqxvtwgtx od edrjecmlkzff tpjwomslqfazbontudinuwd,
3.28846E+18,2023-04-19T15:14:45.904819+00:00,iicbtmyyduzkelxhkjzcbmgmvymdrxrgmalqmmkgbiebjxfupk,
3.07483E+18,2023-04-19T15:14:45.904819+00:00,dshzluvbws sqlkiolbcgkpyyjfgygebvtbwrikphbolinhfgb,
1.02645E+18,2023-04-19T15:14:45.904819+00:00,azavhzs lqmyywuazktjnfoueodnifmabwncutonxobagezcdc,
1.47806E+18,2023-04-19T15:14:45.904819+00:00,y avjaztlvnhndvtetlggacqcqqqeoirsegxvvt hzvzbxyz k,
3.21892E+18,2023-04-19T15:14:45.904819+00:00,qirrzbfauh qhnmectgzhklbsqtczpdbkfllkfsyvqibdbdzwl,
8.5125E+18,2023-04-19T15:14:45.904819+00:00,rppotdjzhunsleitmkacb ayahzsdcvonkbcraupptgbzprxpw,
1.68082E+18,2023-04-19T15:14:45.904819+00:00,fmi yzzpjahjsglugqsr ftnfenecusvxlgibriab hhixi sn,
2.71383E+18,2023-04-19T15:14:45.904819+00:00,iiipytktiwfncwhpaomaiggbkplljwanz aooetlxdmptnrldd,
5.41415E+18,2023-04-19T15:14:45.904819+00:00,hzktxuzbbohewniuvmfwozvjspbcwjopckxqhtsfzkfvlcfkhb,
1.03761E+18,2023-04-19T15:14:45.904819+00:00,soxiekgwgmcmkdlkkahy hwklijxui svjtvtrvqynyab kboo,
3.46004E+18,2023-04-19T15:14:45.904819+00:00,utqftetseeoeqyxziun wmmeeeqfsrjsdjeavqxaynjlt ylwa,
3.11829E+18,2023-04-19T15:14:45.904819+00:00,mlvfhewkgyujwvkgcxfkqdvhzbamnicbixfr bmeqrupjqzodc,
1.49917E+18,2023-04-19T15:14:45.904819+00:00, shiqajrwvnnlswfumpuklbcmvwxlzwsqbtkemtgxftzawcasp,
1.66646E+18,2023-04-19T15:14:45.904819+00:00,fvqhkbeyfgdskwtmvxaevseludcbexrmuexutxslcrurpnzvgq,
2.30657E+18,2023-04-19T15:14:45.904819+00:00,aybugszvsiulaiwsrhsfhlxzbvhkzycrguacvkfldqljeabbac,
2.97167E+18,2023-04-19T15:14:45.904819+00:00,hygdjbntfldfvekmibiishgsenqmxktzxlifyobiaobmlorzac,
5.1492E+18,2023-04-19T15:14:45.904819+00:00,hqj lumbkmcpxiveavnskdwcezlbhgtsrqfuzlujzchtgbtbpr,
2.79248E+18,2023-04-19T15:14:45.904819+00:00,xnfcwkcacjsyiilhofciwqtia bmoyqijqqgyywqchroyvkjpw,
4.81233E+18,2023-04-19T15:14:45.904819+00:00,jorqswywqxweporcylafryeqszwhhlltdpzyl rgok xqwiqrs,
1.40105E+18,2023-04-19T15:14:45.904819+00:00,wdixo pwtkncjcysjlqxizfszswebtpmxqnexwfsmyigsmcxlx,
8.2921E+18,2023-04-19T15:14:45.904819+00:00,ezjizizvhszejvireuikhdakdzinmvyikcmmgczsuiyhngn o ,
1.0653E+18,2023-04-19T15:14:45.904819+00:00,wnr gijmotnliwiiekohcpinqouapsovzvjopgpnloplowpao ,
4.52542E+18,2023-04-19T15:14:45.904819+00:00,bbjfmtjlkynuqkknloihfefvrleyxghzjhuscpucizbkeucukx,
2.04423E+18,2023-04-19T15:14:45.904819+00:00,ayummlirgdcmdkjwxvnvzzsrsiptfbmofdsrzhb bnar ujwoo,
1.68893E+18,2023-04-19T15:14:45.904819+00:00,luoquyxohllzphpy cczgu t czcsydxrqzkvellptwuptwqp ,
6.04148E+18,2023-04-19T15:14:45.904819+00:00,ztscfhjmwxae matehymiylitkeznbkc ilefzcvwhctiyvpay,
8.3099E+18,2023-04-19T15:14:45.904819+00:00,dpnchtfgcvramkpyrz ebgmxmqmmhddhhbljligcozkifi qhg,
3.14567E+18,2023-04-19T15:14:45.904819+00:00,lqrjodxueugzwytktyhwcwbjbspamtdmslkdbsjpmwqzaxqmyx,
2.00435E+18,2023-04-19T15:14:45.904819+00:00,nbrsffcvhcwylekehvdqxuagulgobbxdrbuaaqvlsedauljcob,
2.72827E+18,2023-04-19T15:14:45.904819+00:00,eujuyr epmiaqdfjtzqqtixadpuitxzvupltyikigol exjdbg,
1.7177E+18,2023-04-19T15:14:45.904819+00:00,cqnzjkkerbtppocttzpyubfastswsuwavbnqqanaysaoxa ddz,
2.30855E+18,2023-04-19T15:14:45.904819+00:00,fqidr kcmltwfnzejuigwpalgwzhbfnolokvmfxzhbofaofior,
1.86142E+18,2023-04-19T15:14:45.904819+00:00,olathpeoblzhejswcvmbxtvjeepyfjjobqrhwcxrqbunjoeddc,
2.88792E+18,2023-04-19T15:14:45.904819+00:00,uf jljvcrbtnkrcebwfuvxey knnjabarpjacypegnqpmzhrff,
1 ID Timestamp Contents Attachments
2 1.73378E+18 2023-04-19T15:14:45.904819+00:00 onxspdnegnuurahqni oeitwykfj ugtzshspflmbmknsnlk l
3 1.20231E+18 2023-04-19T15:14:45.904819+00:00 nwkhdxnbakfknkteenlxbxsyoppazuqmexwbzcbsdyoiwmuvka
4 2.65947E+18 2023-04-19T15:14:45.904819+00:00 ojptvfkxlbjvcvsupu ffmplreedjihyvfdscbukvzehnt vtw
5 2.06963E+18 2023-04-19T15:14:45.904819+00:00 vmtfbchpmgkhxztqaaip vfqxa cbczcngjw rqvv rjyzi jq
6 3.63729E+18 2023-04-19T15:14:45.904819+00:00 bzu rbzscuxbns pzdhxljtjeeycrkxawnkfijejeiacreaohv
7 3.02184E+18 2023-04-19T15:14:45.904819+00:00 hykp f ymloqerbrqw dmjnaidmrtiptddwklgiq tnchvhend
8 5.24553E+18 2023-04-19T15:14:45.904819+00:00 vdqzdwlbqftcdwujb lmpxpvpkfwrhqtimsillbjhmqajiishq
9 1.65527E+18 2023-04-19T15:14:45.904819+00:00 bfxqasdgvwvlxwcicwubkswglvkgxfsl zgixcjxsijgxehjiz
10 2.20821E+18 2023-04-19T15:14:45.904819+00:00 ebdzopyggwozhltkgcemokweqwetwixbbiirbdrrcfh cnjepo
11 3.16844E+18 2023-04-19T15:14:45.904819+00:00 kvzkkctyfkbwbzld rvyc futqqy btzdrhzgupewnypqfpaeg
12 1.61396E+18 2023-04-19T15:14:45.904819+00:00 knvdgz mbtffhkkkpialwuv daopeizmduqspmbcwxnnbhlwha
13 2.81571E+18 2023-04-19T15:14:45.904819+00:00 jersivpwzdkeojlgoatabkylwkakvc bdgfbwxdptbkjzz ggr
14 3.40391E+18 2023-04-19T15:14:45.904819+00:00 yfqxvtwgtx od edrjecmlkzff tpjwomslqfazbontudinuwd
15 3.28846E+18 2023-04-19T15:14:45.904819+00:00 iicbtmyyduzkelxhkjzcbmgmvymdrxrgmalqmmkgbiebjxfupk
16 3.07483E+18 2023-04-19T15:14:45.904819+00:00 dshzluvbws sqlkiolbcgkpyyjfgygebvtbwrikphbolinhfgb
17 1.02645E+18 2023-04-19T15:14:45.904819+00:00 azavhzs lqmyywuazktjnfoueodnifmabwncutonxobagezcdc
18 1.47806E+18 2023-04-19T15:14:45.904819+00:00 y avjaztlvnhndvtetlggacqcqqqeoirsegxvvt hzvzbxyz k
19 3.21892E+18 2023-04-19T15:14:45.904819+00:00 qirrzbfauh qhnmectgzhklbsqtczpdbkfllkfsyvqibdbdzwl
20 8.5125E+18 2023-04-19T15:14:45.904819+00:00 rppotdjzhunsleitmkacb ayahzsdcvonkbcraupptgbzprxpw
21 1.68082E+18 2023-04-19T15:14:45.904819+00:00 fmi yzzpjahjsglugqsr ftnfenecusvxlgibriab hhixi sn
22 2.71383E+18 2023-04-19T15:14:45.904819+00:00 iiipytktiwfncwhpaomaiggbkplljwanz aooetlxdmptnrldd
23 5.41415E+18 2023-04-19T15:14:45.904819+00:00 hzktxuzbbohewniuvmfwozvjspbcwjopckxqhtsfzkfvlcfkhb
24 1.03761E+18 2023-04-19T15:14:45.904819+00:00 soxiekgwgmcmkdlkkahy hwklijxui svjtvtrvqynyab kboo
25 3.46004E+18 2023-04-19T15:14:45.904819+00:00 utqftetseeoeqyxziun wmmeeeqfsrjsdjeavqxaynjlt ylwa
26 3.11829E+18 2023-04-19T15:14:45.904819+00:00 mlvfhewkgyujwvkgcxfkqdvhzbamnicbixfr bmeqrupjqzodc
27 1.49917E+18 2023-04-19T15:14:45.904819+00:00 shiqajrwvnnlswfumpuklbcmvwxlzwsqbtkemtgxftzawcasp
28 1.66646E+18 2023-04-19T15:14:45.904819+00:00 fvqhkbeyfgdskwtmvxaevseludcbexrmuexutxslcrurpnzvgq
29 2.30657E+18 2023-04-19T15:14:45.904819+00:00 aybugszvsiulaiwsrhsfhlxzbvhkzycrguacvkfldqljeabbac
30 2.97167E+18 2023-04-19T15:14:45.904819+00:00 hygdjbntfldfvekmibiishgsenqmxktzxlifyobiaobmlorzac
31 5.1492E+18 2023-04-19T15:14:45.904819+00:00 hqj lumbkmcpxiveavnskdwcezlbhgtsrqfuzlujzchtgbtbpr
32 2.79248E+18 2023-04-19T15:14:45.904819+00:00 xnfcwkcacjsyiilhofciwqtia bmoyqijqqgyywqchroyvkjpw
33 4.81233E+18 2023-04-19T15:14:45.904819+00:00 jorqswywqxweporcylafryeqszwhhlltdpzyl rgok xqwiqrs
34 1.40105E+18 2023-04-19T15:14:45.904819+00:00 wdixo pwtkncjcysjlqxizfszswebtpmxqnexwfsmyigsmcxlx
35 8.2921E+18 2023-04-19T15:14:45.904819+00:00 ezjizizvhszejvireuikhdakdzinmvyikcmmgczsuiyhngn o
36 1.0653E+18 2023-04-19T15:14:45.904819+00:00 wnr gijmotnliwiiekohcpinqouapsovzvjopgpnloplowpao
37 4.52542E+18 2023-04-19T15:14:45.904819+00:00 bbjfmtjlkynuqkknloihfefvrleyxghzjhuscpucizbkeucukx
38 2.04423E+18 2023-04-19T15:14:45.904819+00:00 ayummlirgdcmdkjwxvnvzzsrsiptfbmofdsrzhb bnar ujwoo
39 1.68893E+18 2023-04-19T15:14:45.904819+00:00 luoquyxohllzphpy cczgu t czcsydxrqzkvellptwuptwqp
40 6.04148E+18 2023-04-19T15:14:45.904819+00:00 ztscfhjmwxae matehymiylitkeznbkc ilefzcvwhctiyvpay
41 8.3099E+18 2023-04-19T15:14:45.904819+00:00 dpnchtfgcvramkpyrz ebgmxmqmmhddhhbljligcozkifi qhg
42 3.14567E+18 2023-04-19T15:14:45.904819+00:00 lqrjodxueugzwytktyhwcwbjbspamtdmslkdbsjpmwqzaxqmyx
43 2.00435E+18 2023-04-19T15:14:45.904819+00:00 nbrsffcvhcwylekehvdqxuagulgobbxdrbuaaqvlsedauljcob
44 2.72827E+18 2023-04-19T15:14:45.904819+00:00 eujuyr epmiaqdfjtzqqtixadpuitxzvupltyikigol exjdbg
45 1.7177E+18 2023-04-19T15:14:45.904819+00:00 cqnzjkkerbtppocttzpyubfastswsuwavbnqqanaysaoxa ddz
46 2.30855E+18 2023-04-19T15:14:45.904819+00:00 fqidr kcmltwfnzejuigwpalgwzhbfnolokvmfxzhbofaofior
47 1.86142E+18 2023-04-19T15:14:45.904819+00:00 olathpeoblzhejswcvmbxtvjeepyfjjobqrhwcxrqbunjoeddc
48 2.88792E+18 2023-04-19T15:14:45.904819+00:00 uf jljvcrbtnkrcebwfuvxey knnjabarpjacypegnqpmzhrff

View File

@@ -0,0 +1,6 @@
ID,Timestamp,Contents,Attachments
2.79079E+18,2023-04-19T15:14:45.904819+00:00,cl iqaczcrrlprzvbdtvpmduzrdlmtquejjhjfjnt zdsqyksh,
1.51164E+18,2023-04-19T15:14:45.904819+00:00,ywvnjmtybk f ghdagriyswf exupccijgl calztfvujxhujt,
1.66032E+18,2023-04-19T15:14:45.904819+00:00,trxcvlcersrdnqzqzfvrrzehmpekrsdtkbovvagsdlcwqokckq,
2.86805E+18,2023-04-19T15:14:45.904819+00:00,qnkkqjwmwtiqggfko hxzufqnrvpionnglpppuncyswnjibdda,
3.04157E+18,2023-04-19T15:14:45.904819+00:00,nn vitqoscgsiauiezyyficcbgnjyhaujvthdydmoeistkyskl,
1 ID Timestamp Contents Attachments
2 2.79079E+18 2023-04-19T15:14:45.904819+00:00 cl iqaczcrrlprzvbdtvpmduzrdlmtquejjhjfjnt zdsqyksh
3 1.51164E+18 2023-04-19T15:14:45.904819+00:00 ywvnjmtybk f ghdagriyswf exupccijgl calztfvujxhujt
4 1.66032E+18 2023-04-19T15:14:45.904819+00:00 trxcvlcersrdnqzqzfvrrzehmpekrsdtkbovvagsdlcwqokckq
5 2.86805E+18 2023-04-19T15:14:45.904819+00:00 qnkkqjwmwtiqggfko hxzufqnrvpionnglpppuncyswnjibdda
6 3.04157E+18 2023-04-19T15:14:45.904819+00:00 nn vitqoscgsiauiezyyficcbgnjyhaujvthdydmoeistkyskl

Submodule docs/modules/indexes/document_loaders/examples/example_data/test_repo1 added at 7e525a3b91

View File

@@ -0,0 +1,192 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Git\n",
"\n",
"This notebook shows how to load text files from Git repository."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load existing repository from disk"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from git import Repo\n",
"\n",
"repo = Repo.clone_from(\n",
" \"https://github.com/hwchase17/langchain\", to_path=\"./example_data/test_repo1\"\n",
")\n",
"branch = repo.head.reference"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import GitLoader"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"loader = GitLoader(repo_path=\"./example_data/test_repo1/\", branch=branch)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"data = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"len(data)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"page_content='.venv\\n.github\\n.git\\n.mypy_cache\\n.pytest_cache\\nDockerfile' metadata={'file_path': '.dockerignore', 'file_name': '.dockerignore', 'file_type': ''}\n"
]
}
],
"source": [
"print(data[0])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Clone repository from url"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import GitLoader"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"loader = GitLoader(\n",
" clone_url=\"https://github.com/hwchase17/langchain\",\n",
" repo_path=\"./example_data/test_repo2/\",\n",
" branch=\"master\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"data = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1074"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Filtering files to load"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import GitLoader\n",
"\n",
"# eg. loading only python files\n",
"loader = GitLoader(repo_path=\"./example_data/test_repo1/\", file_filter=lambda file_path: file_path.endswith(\".py\"))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -44,7 +44,11 @@
},
"outputs": [],
"source": [
"loader = GoogleDriveLoader(folder_id=\"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5\")"
"loader = GoogleDriveLoader(\n",
" folder_id=\"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5\",\n",
" # Optional: configure whether to recursively fetch files from subfolders. Defaults to False.\n",
" recursive=False\n",
")"
]
},
{

File diff suppressed because one or more lines are too long

View File

@@ -1,6 +1,7 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "1dc7df1d",
"metadata": {},
@@ -8,7 +9,9 @@
"# Obsidian\n",
"This notebook covers how to load documents from an Obsidian database.\n",
"\n",
"Since Obsidian is just stored on disk as a folder of Markdown files, the loader just takes a path to this directory."
"Since Obsidian is just stored on disk as a folder of Markdown files, the loader just takes a path to this directory.\n",
"\n",
"Obsidian files also sometimes contain [metadata](https://help.obsidian.md/Editing+and+formatting/Metadata) which is a YAML block at the top of the file. These values will be added to the document's metadata. (`ObsidianLoader` can also be passed a `collect_metadata=False` argument to disable this behavior.)"
]
},
{

View File

@@ -376,7 +376,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "a5525fb0",
"metadata": {},
"outputs": [],
@@ -386,12 +386,115 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "dac7ff68",
"metadata": {},
"outputs": [],
"source": [
"data = loader.load()"
"data = loader.load()[0] # entire pdf is loaded as a single Document"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "0ba9f645",
"metadata": {},
"outputs": [],
"source": [
"from bs4 import BeautifulSoup\n",
"soup = BeautifulSoup(data.page_content,'html.parser')\n",
"content = soup.find_all('div')"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "35304e21",
"metadata": {},
"outputs": [],
"source": [
"import re\n",
"cur_fs = None\n",
"cur_text = ''\n",
"snippets = [] # first collect all snippets that have the same font size\n",
"for c in content:\n",
" sp = c.find('span')\n",
" if not sp:\n",
" continue\n",
" st = sp.get('style')\n",
" if not st:\n",
" continue\n",
" fs = re.findall('font-size:(\\d+)px',st)\n",
" if not fs:\n",
" continue\n",
" fs = int(fs[0])\n",
" if not cur_fs:\n",
" cur_fs = fs\n",
" if fs == cur_fs:\n",
" cur_text += c.text\n",
" else:\n",
" snippets.append((cur_text,cur_fs))\n",
" cur_fs = fs\n",
" cur_text = c.text\n",
"snippets.append((cur_text,cur_fs))\n",
"# Note: The above logic is very straightforward. One can also add more strategies such as removing duplicate snippets (as\n",
"# headers/footers in a PDF appear on multiple pages so if we find duplicatess safe to assume that it is redundant info)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "af8adf2f",
"metadata": {},
"outputs": [],
"source": [
"from langchain.docstore.document import Document\n",
"cur_idx = -1\n",
"semantic_snippets = []\n",
"# Assumption: headings have higher font size than their respective content\n",
"for s in snippets:\n",
" # if current snippet's font size > previous section's heading => it is a new heading\n",
" if not semantic_snippets or s[1] > semantic_snippets[cur_idx].metadata['heading_font']:\n",
" metadata={'heading':s[0], 'content_font': 0, 'heading_font': s[1]}\n",
" metadata.update(data.metadata)\n",
" semantic_snippets.append(Document(page_content='',metadata=metadata))\n",
" cur_idx += 1\n",
" continue\n",
" \n",
" # if current snippet's font size <= previous section's content => content belongs to the same section (one can also create\n",
" # a tree like structure for sub sections if needed but that may require some more thinking and may be data specific)\n",
" if not semantic_snippets[cur_idx].metadata['content_font'] or s[1] <= semantic_snippets[cur_idx].metadata['content_font']:\n",
" semantic_snippets[cur_idx].page_content += s[0]\n",
" semantic_snippets[cur_idx].metadata['content_font'] = max(s[1], semantic_snippets[cur_idx].metadata['content_font'])\n",
" continue\n",
" \n",
" # if current snippet's font size > previous section's content but less tha previous section's heading than also make a new \n",
" # section (e.g. title of a pdf will have the highest font size but we don't want it to subsume all sections)\n",
" metadata={'heading':s[0], 'content_font': 0, 'heading_font': s[1]}\n",
" metadata.update(data.metadata)\n",
" semantic_snippets.append(Document(page_content='',metadata=metadata))\n",
" cur_idx += 1"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "db7f6674",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content='Recently, various DL models and datasets have been developed for layout analysis\\ntasks. The dhSegment [22] utilizes fully convolutional networks [20] for segmen-\\ntation tasks on historical documents. Object detection-based methods like Faster\\nR-CNN [28] and Mask R-CNN [12] are used for identifying document elements [38]\\nand detecting tables [30, 26]. Most recently, Graph Neural Networks [29] have also\\nbeen used in table detection [27]. However, these models are usually implemented\\nindividually and there is no unified framework to load and use such models.\\nThere has been a surge of interest in creating open-source tools for document\\nimage processing: a search of document image analysis in Github leads to 5M\\nrelevant code pieces 6; yet most of them rely on traditional rule-based methods\\nor provide limited functionalities. The closest prior research to our work is the\\nOCR-D project7, which also tries to build a complete toolkit for DIA. However,\\nsimilar to the platform developed by Neudecker et al. [21], it is designed for\\nanalyzing historical documents, and provides no supports for recent DL models.\\nThe DocumentLayoutAnalysis project8 focuses on processing born-digital PDF\\ndocuments via analyzing the stored PDF data. Repositories like DeepLayout9\\nand Detectron2-PubLayNet10 are individual deep learning models trained on\\nlayout analysis datasets without support for the full DIA pipeline. The Document\\nAnalysis and Exploitation (DAE) platform [15] and the DeepDIVA project [2]\\naim to improve the reproducibility of DIA methods (or DL models), yet they\\nare not actively maintained. OCR engines like Tesseract [14], easyOCR11 and\\npaddleOCR12 usually do not come with comprehensive functionalities for other\\nDIA tasks like layout analysis.\\nRecent years have also seen numerous efforts to create libraries for promoting\\nreproducibility and reusability in the field of DL. Libraries like Dectectron2 [35],\\n6 The number shown is obtained by specifying the search type as code.\\n7 https://ocr-d.de/en/about\\n8 https://github.com/BobLd/DocumentLayoutAnalysis\\n9 https://github.com/leonlulu/DeepLayout\\n10 https://github.com/hpanwar08/detectron2\\n11 https://github.com/JaidedAI/EasyOCR\\n12 https://github.com/PaddlePaddle/PaddleOCR\\n4\\nZ. Shen et al.\\nFig. 1: The overall architecture of LayoutParser. For an input document image,\\nthe core LayoutParser library provides a set of off-the-shelf tools for layout\\ndetection, OCR, visualization, and storage, backed by a carefully designed layout\\ndata structure. LayoutParser also supports high level customization via efficient\\nlayout annotation and model training functions. These improve model accuracy\\non the target samples. The community platform enables the easy sharing of DIA\\nmodels and whole digitization pipelines to promote reusability and reproducibility.\\nA collection of detailed documentation, tutorials and exemplar projects make\\nLayoutParser easy to learn and use.\\nAllenNLP [8] and transformers [34] have provided the community with complete\\nDL-based support for developing and deploying models for general computer\\nvision and natural language processing problems. LayoutParser, on the other\\nhand, specializes specifically in DIA tasks. LayoutParser is also equipped with a\\ncommunity platform inspired by established model hubs such as Torch Hub [23]\\nand TensorFlow Hub [1]. It enables the sharing of pretrained models as well as\\nfull document processing pipelines that are unique to DIA tasks.\\nThere have been a variety of document data collections to facilitate the\\ndevelopment of DL models. Some examples include PRImA [3](magazine layouts),\\nPubLayNet [38](academic paper layouts), Table Bank [18](tables in academic\\npapers), Newspaper Navigator Dataset [16, 17](newspaper figure layouts) and\\nHJDataset [31](historical Japanese document layouts). A spectrum of models\\ntrained on these datasets are currently available in the LayoutParser model zoo\\nto support different use cases.\\n', metadata={'heading': '2 Related Work\\n', 'content_font': 9, 'heading_font': 11, 'source': 'example_data/layout-parser-paper.pdf'})"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"semantic_snippets[4]"
]
},
{
@@ -474,9 +577,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "langchain_dev",
"language": "python",
"name": "python3"
"name": "langchain_dev"
},
"language_info": {
"codemirror_mode": {

View File

@@ -0,0 +1,81 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "1dc7df1d",
"metadata": {},
"source": [
"# Slack (Local Exported Zipfile)\n",
"\n",
"This notebook covers how to load documents from a Zipfile generated from a Slack export.\n",
"\n",
"In order to get this Slack export, follow these instructions:\n",
"\n",
"## 🧑 Instructions for ingesting your own dataset\n",
"\n",
"Export your Slack data. You can do this by going to your Workspace Management page and clicking the Import/Export option ({your_slack_domain}.slack.com/services/export). Then, choose the right date range and click `Start export`. Slack will send you an email and a DM when the export is ready.\n",
"\n",
"The download will produce a `.zip` file in your Downloads folder (or wherever your downloads can be found, depending on your OS configuration).\n",
"\n",
"Copy the path to the `.zip` file, and assign it as `LOCAL_ZIPFILE` below."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "007c5cbf",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import SlackDirectoryLoader "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a1caec59",
"metadata": {},
"outputs": [],
"source": [
"# Optionally set your Slack URL. This will give you proper URLs in the docs sources.\n",
"SLACK_WORKSPACE_URL = \"https://xxx.slack.com\"\n",
"LOCAL_ZIPFILE = \"\" # Paste the local paty to your Slack zip file here.\n",
"\n",
"loader = SlackDirectoryLoader(LOCAL_ZIPFILE, SLACK_WORKSPACE_URL)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b1c30ff7",
"metadata": {},
"outputs": [],
"source": [
"docs = loader.load()\n",
"docs"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,114 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "66a7777e",
"metadata": {},
"source": [
"# Twitter\n",
"\n",
"This loader fetches the text from the Tweets of a list of Twitter users, using the `tweepy` Python package.\n",
"You must initialize the loader with your Twitter API token, and you need to pass in the Twitter username you want to extract."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "9ec8a3b3",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import TwitterTweetLoader"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "43128d8d",
"metadata": {},
"outputs": [],
"source": [
"#!pip install tweepy"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "35d6809a",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"loader = TwitterTweetLoader.from_bearer_token(\n",
" oauth2_bearer_token=\"YOUR BEARER TOKEN\",\n",
" twitter_users=['elonmusk'],\n",
" number_tweets=50, # Default value is 100\n",
")\n",
"\n",
"# Or load from access token and consumer keys\n",
"# loader = TwitterTweetLoader.from_secrets(\n",
"# access_token='YOUR ACCESS TOKEN',\n",
"# access_token_secret='YOUR ACCESS TOKEN SECRET',\n",
"# consumer_key='YOUR CONSUMER KEY',\n",
"# consumer_secret='YOUR CONSUMER SECRET',\n",
"# twitter_users=['elonmusk'],\n",
"# number_tweets=50,\n",
"# )"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "05fe33b9",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='@MrAndyNgo @REI One store after another shutting down', metadata={'created_at': 'Tue Apr 18 03:45:50 +0000 2023', 'user_info': {'id': 44196397, 'id_str': '44196397', 'name': 'Elon Musk', 'screen_name': 'elonmusk', 'location': 'A Shortfall of Gravitas', 'profile_location': None, 'description': 'nothing', 'url': None, 'entities': {'description': {'urls': []}}, 'protected': False, 'followers_count': 135528327, 'friends_count': 220, 'listed_count': 120478, 'created_at': 'Tue Jun 02 20:12:29 +0000 2009', 'favourites_count': 21285, 'utc_offset': None, 'time_zone': None, 'geo_enabled': False, 'verified': False, 'statuses_count': 24795, 'lang': None, 'status': {'created_at': 'Tue Apr 18 03:45:50 +0000 2023', 'id': 1648170947541704705, 'id_str': '1648170947541704705', 'text': '@MrAndyNgo @REI One store after another shutting down', 'truncated': False, 'entities': {'hashtags': [], 'symbols': [], 'user_mentions': [{'screen_name': 'MrAndyNgo', 'name': 'Andy Ngô 🏳️\\u200d🌈', 'id': 2835451658, 'id_str': '2835451658', 'indices': [0, 10]}, {'screen_name': 'REI', 'name': 'REI', 'id': 16583846, 'id_str': '16583846', 'indices': [11, 15]}], 'urls': []}, 'source': '<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>', 'in_reply_to_status_id': 1648134341678051328, 'in_reply_to_status_id_str': '1648134341678051328', 'in_reply_to_user_id': 2835451658, 'in_reply_to_user_id_str': '2835451658', 'in_reply_to_screen_name': 'MrAndyNgo', 'geo': None, 'coordinates': None, 'place': None, 'contributors': None, 'is_quote_status': False, 'retweet_count': 118, 'favorite_count': 1286, 'favorited': False, 'retweeted': False, 'lang': 'en'}, 'contributors_enabled': False, 'is_translator': False, 'is_translation_enabled': False, 'profile_background_color': 'C0DEED', 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_tile': False, 'profile_image_url': 'http://pbs.twimg.com/profile_images/1590968738358079488/IY9Gx6Ok_normal.jpg', 'profile_image_url_https': 'https://pbs.twimg.com/profile_images/1590968738358079488/IY9Gx6Ok_normal.jpg', 'profile_banner_url': 'https://pbs.twimg.com/profile_banners/44196397/1576183471', 'profile_link_color': '0084B4', 'profile_sidebar_border_color': 'C0DEED', 'profile_sidebar_fill_color': 'DDEEF6', 'profile_text_color': '333333', 'profile_use_background_image': True, 'has_extended_profile': True, 'default_profile': False, 'default_profile_image': False, 'following': None, 'follow_request_sent': None, 'notifications': None, 'translator_type': 'none', 'withheld_in_countries': []}}),\n",
" Document(page_content='@KanekoaTheGreat @joshrogin @glennbeck Large ships are fundamentally vulnerable to ballistic (hypersonic) missiles', metadata={'created_at': 'Tue Apr 18 03:43:25 +0000 2023', 'user_info': {'id': 44196397, 'id_str': '44196397', 'name': 'Elon Musk', 'screen_name': 'elonmusk', 'location': 'A Shortfall of Gravitas', 'profile_location': None, 'description': 'nothing', 'url': None, 'entities': {'description': {'urls': []}}, 'protected': False, 'followers_count': 135528327, 'friends_count': 220, 'listed_count': 120478, 'created_at': 'Tue Jun 02 20:12:29 +0000 2009', 'favourites_count': 21285, 'utc_offset': None, 'time_zone': None, 'geo_enabled': False, 'verified': False, 'statuses_count': 24795, 'lang': None, 'status': {'created_at': 'Tue Apr 18 03:45:50 +0000 2023', 'id': 1648170947541704705, 'id_str': '1648170947541704705', 'text': '@MrAndyNgo @REI One store after another shutting down', 'truncated': False, 'entities': {'hashtags': [], 'symbols': [], 'user_mentions': [{'screen_name': 'MrAndyNgo', 'name': 'Andy Ngô 🏳️\\u200d🌈', 'id': 2835451658, 'id_str': '2835451658', 'indices': [0, 10]}, {'screen_name': 'REI', 'name': 'REI', 'id': 16583846, 'id_str': '16583846', 'indices': [11, 15]}], 'urls': []}, 'source': '<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>', 'in_reply_to_status_id': 1648134341678051328, 'in_reply_to_status_id_str': '1648134341678051328', 'in_reply_to_user_id': 2835451658, 'in_reply_to_user_id_str': '2835451658', 'in_reply_to_screen_name': 'MrAndyNgo', 'geo': None, 'coordinates': None, 'place': None, 'contributors': None, 'is_quote_status': False, 'retweet_count': 118, 'favorite_count': 1286, 'favorited': False, 'retweeted': False, 'lang': 'en'}, 'contributors_enabled': False, 'is_translator': False, 'is_translation_enabled': False, 'profile_background_color': 'C0DEED', 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_tile': False, 'profile_image_url': 'http://pbs.twimg.com/profile_images/1590968738358079488/IY9Gx6Ok_normal.jpg', 'profile_image_url_https': 'https://pbs.twimg.com/profile_images/1590968738358079488/IY9Gx6Ok_normal.jpg', 'profile_banner_url': 'https://pbs.twimg.com/profile_banners/44196397/1576183471', 'profile_link_color': '0084B4', 'profile_sidebar_border_color': 'C0DEED', 'profile_sidebar_fill_color': 'DDEEF6', 'profile_text_color': '333333', 'profile_use_background_image': True, 'has_extended_profile': True, 'default_profile': False, 'default_profile_image': False, 'following': None, 'follow_request_sent': None, 'notifications': None, 'translator_type': 'none', 'withheld_in_countries': []}}),\n",
" Document(page_content='@KanekoaTheGreat The Golden Rule', metadata={'created_at': 'Tue Apr 18 03:37:17 +0000 2023', 'user_info': {'id': 44196397, 'id_str': '44196397', 'name': 'Elon Musk', 'screen_name': 'elonmusk', 'location': 'A Shortfall of Gravitas', 'profile_location': None, 'description': 'nothing', 'url': None, 'entities': {'description': {'urls': []}}, 'protected': False, 'followers_count': 135528327, 'friends_count': 220, 'listed_count': 120478, 'created_at': 'Tue Jun 02 20:12:29 +0000 2009', 'favourites_count': 21285, 'utc_offset': None, 'time_zone': None, 'geo_enabled': False, 'verified': False, 'statuses_count': 24795, 'lang': None, 'status': {'created_at': 'Tue Apr 18 03:45:50 +0000 2023', 'id': 1648170947541704705, 'id_str': '1648170947541704705', 'text': '@MrAndyNgo @REI One store after another shutting down', 'truncated': False, 'entities': {'hashtags': [], 'symbols': [], 'user_mentions': [{'screen_name': 'MrAndyNgo', 'name': 'Andy Ngô 🏳️\\u200d🌈', 'id': 2835451658, 'id_str': '2835451658', 'indices': [0, 10]}, {'screen_name': 'REI', 'name': 'REI', 'id': 16583846, 'id_str': '16583846', 'indices': [11, 15]}], 'urls': []}, 'source': '<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>', 'in_reply_to_status_id': 1648134341678051328, 'in_reply_to_status_id_str': '1648134341678051328', 'in_reply_to_user_id': 2835451658, 'in_reply_to_user_id_str': '2835451658', 'in_reply_to_screen_name': 'MrAndyNgo', 'geo': None, 'coordinates': None, 'place': None, 'contributors': None, 'is_quote_status': False, 'retweet_count': 118, 'favorite_count': 1286, 'favorited': False, 'retweeted': False, 'lang': 'en'}, 'contributors_enabled': False, 'is_translator': False, 'is_translation_enabled': False, 'profile_background_color': 'C0DEED', 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_tile': False, 'profile_image_url': 'http://pbs.twimg.com/profile_images/1590968738358079488/IY9Gx6Ok_normal.jpg', 'profile_image_url_https': 'https://pbs.twimg.com/profile_images/1590968738358079488/IY9Gx6Ok_normal.jpg', 'profile_banner_url': 'https://pbs.twimg.com/profile_banners/44196397/1576183471', 'profile_link_color': '0084B4', 'profile_sidebar_border_color': 'C0DEED', 'profile_sidebar_fill_color': 'DDEEF6', 'profile_text_color': '333333', 'profile_use_background_image': True, 'has_extended_profile': True, 'default_profile': False, 'default_profile_image': False, 'following': None, 'follow_request_sent': None, 'notifications': None, 'translator_type': 'none', 'withheld_in_countries': []}}),\n",
" Document(page_content='@KanekoaTheGreat 🧐', metadata={'created_at': 'Tue Apr 18 03:35:48 +0000 2023', 'user_info': {'id': 44196397, 'id_str': '44196397', 'name': 'Elon Musk', 'screen_name': 'elonmusk', 'location': 'A Shortfall of Gravitas', 'profile_location': None, 'description': 'nothing', 'url': None, 'entities': {'description': {'urls': []}}, 'protected': False, 'followers_count': 135528327, 'friends_count': 220, 'listed_count': 120478, 'created_at': 'Tue Jun 02 20:12:29 +0000 2009', 'favourites_count': 21285, 'utc_offset': None, 'time_zone': None, 'geo_enabled': False, 'verified': False, 'statuses_count': 24795, 'lang': None, 'status': {'created_at': 'Tue Apr 18 03:45:50 +0000 2023', 'id': 1648170947541704705, 'id_str': '1648170947541704705', 'text': '@MrAndyNgo @REI One store after another shutting down', 'truncated': False, 'entities': {'hashtags': [], 'symbols': [], 'user_mentions': [{'screen_name': 'MrAndyNgo', 'name': 'Andy Ngô 🏳️\\u200d🌈', 'id': 2835451658, 'id_str': '2835451658', 'indices': [0, 10]}, {'screen_name': 'REI', 'name': 'REI', 'id': 16583846, 'id_str': '16583846', 'indices': [11, 15]}], 'urls': []}, 'source': '<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>', 'in_reply_to_status_id': 1648134341678051328, 'in_reply_to_status_id_str': '1648134341678051328', 'in_reply_to_user_id': 2835451658, 'in_reply_to_user_id_str': '2835451658', 'in_reply_to_screen_name': 'MrAndyNgo', 'geo': None, 'coordinates': None, 'place': None, 'contributors': None, 'is_quote_status': False, 'retweet_count': 118, 'favorite_count': 1286, 'favorited': False, 'retweeted': False, 'lang': 'en'}, 'contributors_enabled': False, 'is_translator': False, 'is_translation_enabled': False, 'profile_background_color': 'C0DEED', 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_tile': False, 'profile_image_url': 'http://pbs.twimg.com/profile_images/1590968738358079488/IY9Gx6Ok_normal.jpg', 'profile_image_url_https': 'https://pbs.twimg.com/profile_images/1590968738358079488/IY9Gx6Ok_normal.jpg', 'profile_banner_url': 'https://pbs.twimg.com/profile_banners/44196397/1576183471', 'profile_link_color': '0084B4', 'profile_sidebar_border_color': 'C0DEED', 'profile_sidebar_fill_color': 'DDEEF6', 'profile_text_color': '333333', 'profile_use_background_image': True, 'has_extended_profile': True, 'default_profile': False, 'default_profile_image': False, 'following': None, 'follow_request_sent': None, 'notifications': None, 'translator_type': 'none', 'withheld_in_countries': []}}),\n",
" Document(page_content='@TRHLofficial Whats he talking about and why is it sponsored by Eriks son?', metadata={'created_at': 'Tue Apr 18 03:32:17 +0000 2023', 'user_info': {'id': 44196397, 'id_str': '44196397', 'name': 'Elon Musk', 'screen_name': 'elonmusk', 'location': 'A Shortfall of Gravitas', 'profile_location': None, 'description': 'nothing', 'url': None, 'entities': {'description': {'urls': []}}, 'protected': False, 'followers_count': 135528327, 'friends_count': 220, 'listed_count': 120478, 'created_at': 'Tue Jun 02 20:12:29 +0000 2009', 'favourites_count': 21285, 'utc_offset': None, 'time_zone': None, 'geo_enabled': False, 'verified': False, 'statuses_count': 24795, 'lang': None, 'status': {'created_at': 'Tue Apr 18 03:45:50 +0000 2023', 'id': 1648170947541704705, 'id_str': '1648170947541704705', 'text': '@MrAndyNgo @REI One store after another shutting down', 'truncated': False, 'entities': {'hashtags': [], 'symbols': [], 'user_mentions': [{'screen_name': 'MrAndyNgo', 'name': 'Andy Ngô 🏳️\\u200d🌈', 'id': 2835451658, 'id_str': '2835451658', 'indices': [0, 10]}, {'screen_name': 'REI', 'name': 'REI', 'id': 16583846, 'id_str': '16583846', 'indices': [11, 15]}], 'urls': []}, 'source': '<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>', 'in_reply_to_status_id': 1648134341678051328, 'in_reply_to_status_id_str': '1648134341678051328', 'in_reply_to_user_id': 2835451658, 'in_reply_to_user_id_str': '2835451658', 'in_reply_to_screen_name': 'MrAndyNgo', 'geo': None, 'coordinates': None, 'place': None, 'contributors': None, 'is_quote_status': False, 'retweet_count': 118, 'favorite_count': 1286, 'favorited': False, 'retweeted': False, 'lang': 'en'}, 'contributors_enabled': False, 'is_translator': False, 'is_translation_enabled': False, 'profile_background_color': 'C0DEED', 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_tile': False, 'profile_image_url': 'http://pbs.twimg.com/profile_images/1590968738358079488/IY9Gx6Ok_normal.jpg', 'profile_image_url_https': 'https://pbs.twimg.com/profile_images/1590968738358079488/IY9Gx6Ok_normal.jpg', 'profile_banner_url': 'https://pbs.twimg.com/profile_banners/44196397/1576183471', 'profile_link_color': '0084B4', 'profile_sidebar_border_color': 'C0DEED', 'profile_sidebar_fill_color': 'DDEEF6', 'profile_text_color': '333333', 'profile_use_background_image': True, 'has_extended_profile': True, 'default_profile': False, 'default_profile_image': False, 'following': None, 'follow_request_sent': None, 'notifications': None, 'translator_type': 'none', 'withheld_in_countries': []}})]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"documents = loader.load()\n",
"documents[:5]"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -112,6 +112,79 @@
"source": [
"data = loader.load()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "a2c1c79f",
"metadata": {},
"source": [
"# Playwright URL Loader\n",
"\n",
"This covers how to load HTML documents from a list of URLs using the `PlaywrightURLLoader`.\n",
"\n",
"As in the Selenium case, Playwright allows us to load pages that need JavaScript to render.\n",
"\n",
"## Setup\n",
"\n",
"To use the `PlaywrightURLLoader`, you will need to install `playwright` and `unstructured`. Additionally, you will need to install the Playwright Chromium browser:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "53158417",
"metadata": {},
"outputs": [],
"source": [
"# Install playwright\n",
"!pip install \"playwright\"\n",
"!pip install \"unstructured\"\n",
"!playwright install"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0ab4e115",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import PlaywrightURLLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ce5a9a0a",
"metadata": {},
"outputs": [],
"source": [
"urls = [\n",
" \"https://www.youtube.com/watch?v=dQw4w9WgXcQ\",\n",
" \"https://goo.gl/maps/NDSHwePEyaHMFGwh8\"\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2dc3e0bc",
"metadata": {},
"outputs": [],
"source": [
"loader = PlaywrightURLLoader(urls=urls, remove_selectors=[\"header\", \"footer\"])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "10b79f80",
"metadata": {},
"outputs": [],
"source": [
"data = loader.load()"
]
}
],
"metadata": {
@@ -130,7 +203,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -89,7 +89,7 @@
"id": "150988e6",
"metadata": {},
"source": [
"# Loading multiple webpages\n",
"## Loading multiple webpages\n",
"\n",
"You can also load multiple webpages at once by passing in a list of urls to the loader. This will return a list of documents in the same order as the urls passed in."
]
@@ -123,7 +123,7 @@
"id": "641be294",
"metadata": {},
"source": [
"## Load multiple urls concurrently\n",
"### Load multiple urls concurrently\n",
"\n",
"You can speed up the scraping process by scraping and parsing multiple urls concurrently.\n",
"\n",

View File

@@ -99,7 +99,7 @@
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
"loader = TextLoader('../state_of_the_union.txt')"
"loader = TextLoader('../state_of_the_union.txt', encoding='utf8')"
]
},
{

View File

@@ -0,0 +1,371 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "fc0db1bc",
"metadata": {},
"source": [
"# Contextual Compression Retriever\n",
"\n",
"This notebook introduces the concept of DocumentCompressors and the ContextualCompressionRetriever. The core idea is simple: given a specific query, we should be able to return only the documents relevant to that query, and only the parts of those documents that are relevant. The ContextualCompressionsRetriever is a wrapper for another retriever that iterates over the initial output of the base retriever and filters and compresses those initial documents, so that only the most relevant information is returned."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "28e8dc12",
"metadata": {},
"outputs": [],
"source": [
"# Helper function for printing docs\n",
"\n",
"def pretty_print_docs(docs):\n",
" print(f\"\\n{'-' * 100}\\n\".join([f\"Document {i+1}:\\n\\n\" + d.page_content for i, d in enumerate(docs)]))"
]
},
{
"cell_type": "markdown",
"id": "6fa3d916",
"metadata": {},
"source": [
"## Using a vanilla vector store retriever\n",
"Let's start by initializing a simple vector store retriever and storing the 2023 State of the Union speech (in chunks). We can see that given an example question our retriever returns one or two relevant docs and a few irrelevant docs. And even the relevant docs have a lot of irrelevant information in them."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "9fbcc58f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Document 1:\n",
"\n",
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n",
"----------------------------------------------------------------------------------------------------\n",
"Document 2:\n",
"\n",
"A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n",
"\n",
"And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \n",
"\n",
"We can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \n",
"\n",
"Weve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \n",
"\n",
"Were putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \n",
"\n",
"Were securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.\n",
"----------------------------------------------------------------------------------------------------\n",
"Document 3:\n",
"\n",
"And for our LGBTQ+ Americans, lets finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong. \n",
"\n",
"As I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. \n",
"\n",
"While it often appears that we never agree, that isnt true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice. \n",
"\n",
"And soon, well strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things. \n",
"\n",
"So tonight Im offering a Unity Agenda for the Nation. Four big things we can do together. \n",
"\n",
"First, beat the opioid epidemic.\n",
"----------------------------------------------------------------------------------------------------\n",
"Document 4:\n",
"\n",
"Tonight, Im announcing a crackdown on these companies overcharging American businesses and consumers. \n",
"\n",
"And as Wall Street firms take over more nursing homes, quality in those homes has gone down and costs have gone up. \n",
"\n",
"That ends on my watch. \n",
"\n",
"Medicare is going to set higher standards for nursing homes and make sure your loved ones get the care they deserve and expect. \n",
"\n",
"Well also cut costs and keep the economy going strong by giving workers a fair shot, provide more training and apprenticeships, hire them based on their skills not degrees. \n",
"\n",
"Lets pass the Paycheck Fairness Act and paid leave. \n",
"\n",
"Raise the minimum wage to $15 an hour and extend the Child Tax Credit, so no one has to raise a family in poverty. \n",
"\n",
"Lets increase Pell Grants and increase our historic support of HBCUs, and invest in what Jill—our First Lady who teaches full-time—calls Americas best-kept secret: community colleges.\n"
]
}
],
"source": [
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.document_loaders import TextLoader\n",
"from langchain.vectorstores import FAISS\n",
"\n",
"documents = TextLoader('../../../state_of_the_union.txt').load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_documents(documents)\n",
"retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever()\n",
"\n",
"docs = retriever.get_relevant_documents(\"What did the president say about Ketanji Brown Jackson\")\n",
"pretty_print_docs(docs)"
]
},
{
"cell_type": "markdown",
"id": "b7648612",
"metadata": {},
"source": [
"## Adding contextual compression with an `LLMChainExtractor`\n",
"Now let's wrap our base retriever with a `ContextualCompressionRetriever`. We'll add an `LLMChainExtractor`, which will iterate over the initially returned documents and extract from each only the content that is relevant to the query."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "9a658023",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Document 1:\n",
"\n",
"\"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\"\n",
"----------------------------------------------------------------------------------------------------\n",
"Document 2:\n",
"\n",
"\"A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans.\"\n"
]
}
],
"source": [
"from langchain.llms import OpenAI\n",
"from langchain.retrievers import ContextualCompressionRetriever\n",
"from langchain.retrievers.document_compressors import LLMChainExtractor\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"compressor = LLMChainExtractor.from_llm(llm)\n",
"compression_retriever = ContextualCompressionRetriever(base_compressor=compressor, base_retriever=retriever)\n",
"\n",
"compressed_docs = compression_retriever.get_relevant_documents(\"What did the president say about Ketanji Jackson Brown\")\n",
"pretty_print_docs(compressed_docs)"
]
},
{
"cell_type": "markdown",
"id": "2cd38f3a",
"metadata": {},
"source": [
"## More built-in compressors: filters\n",
"### `LLMChainFilter`\n",
"The `LLMChainFilter` is slightly simpler but more robust compressor that uses an LLM chain to decide which of the initially retrieved documents to filter out and which ones to return, without manipulating the document contents."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "b216a767",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Document 1:\n",
"\n",
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n"
]
}
],
"source": [
"from langchain.retrievers.document_compressors import LLMChainFilter\n",
"\n",
"_filter = LLMChainFilter.from_llm(llm)\n",
"compression_retriever = ContextualCompressionRetriever(base_compressor=_filter, base_retriever=retriever)\n",
"\n",
"compressed_docs = compression_retriever.get_relevant_documents(\"What did the president say about Ketanji Jackson Brown\")\n",
"pretty_print_docs(compressed_docs)"
]
},
{
"cell_type": "markdown",
"id": "8c709598",
"metadata": {},
"source": [
"### `EmbeddingsFilter`\n",
"\n",
"Making an extra LLM call over each retrieved document is expensive and slow. The `EmbeddingsFilter` provides a cheaper and faster option by embedding the documents and query and only returning those documents which have sufficiently similar embeddings to the query."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "6fbc801f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Document 1:\n",
"\n",
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n",
"----------------------------------------------------------------------------------------------------\n",
"Document 2:\n",
"\n",
"A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n",
"\n",
"And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \n",
"\n",
"We can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \n",
"\n",
"Weve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \n",
"\n",
"Were putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \n",
"\n",
"Were securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.\n",
"----------------------------------------------------------------------------------------------------\n",
"Document 3:\n",
"\n",
"And for our LGBTQ+ Americans, lets finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong. \n",
"\n",
"As I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. \n",
"\n",
"While it often appears that we never agree, that isnt true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice. \n",
"\n",
"And soon, well strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things. \n",
"\n",
"So tonight Im offering a Unity Agenda for the Nation. Four big things we can do together. \n",
"\n",
"First, beat the opioid epidemic.\n"
]
}
],
"source": [
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.retrievers.document_compressors import EmbeddingsFilter\n",
"\n",
"embeddings = OpenAIEmbeddings()\n",
"embeddings_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)\n",
"compression_retriever = ContextualCompressionRetriever(base_compressor=embeddings_filter, base_retriever=retriever)\n",
"\n",
"compressed_docs = compression_retriever.get_relevant_documents(\"What did the president say about Ketanji Jackson Brown\")\n",
"pretty_print_docs(compressed_docs)"
]
},
{
"cell_type": "markdown",
"id": "07365d36",
"metadata": {},
"source": [
"# Stringing compressors and document transformers together\n",
"Using the `DocumentCompressorPipeline` we can also easily combine multiple compressors in sequence. Along with compressors we can add `BaseDocumentTransformer`s to our pipeline, which don't perform any contextual compression but simply perform some transformation on a set of documents. For example `TextSplitter`s can be used as document transformers to split documents into smaller pieces, and the `EmbeddingsRedundantFilter` can be used to filter out redundant documents based on embedding similarity between documents.\n",
"\n",
"Below we create a compressor pipeline by first splitting our docs into smaller chunks, then removing redundant documents, and then filtering based on relevance to the query."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "2a150a63",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_transformers import EmbeddingsRedundantFilter\n",
"from langchain.retrievers.document_compressors import DocumentCompressorPipeline\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"\n",
"splitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=0, separator=\". \")\n",
"redundant_filter = EmbeddingsRedundantFilter(embeddings=embeddings)\n",
"relevant_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)\n",
"pipeline_compressor = DocumentCompressorPipeline(\n",
" transformers=[splitter, redundant_filter, relevant_filter]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "3ceab64a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Document 1:\n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson\n",
"----------------------------------------------------------------------------------------------------\n",
"Document 2:\n",
"\n",
"As I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. \n",
"\n",
"While it often appears that we never agree, that isnt true. I signed 80 bipartisan bills into law last year\n",
"----------------------------------------------------------------------------------------------------\n",
"Document 3:\n",
"\n",
"A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder\n"
]
}
],
"source": [
"compression_retriever = ContextualCompressionRetriever(base_compressor=pipeline_compressor, base_retriever=retriever)\n",
"\n",
"compressed_docs = compression_retriever.get_relevant_documents(\"What did the president say about Ketanji Jackson Brown\")\n",
"pretty_print_docs(compressed_docs)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8cfd9fc5",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -32,9 +32,9 @@
"from metal_sdk.metal import Metal\n",
"API_KEY = \"\"\n",
"CLIENT_ID = \"\"\n",
"APP_ID = \"\"\n",
"INDEX_ID = \"\"\n",
"\n",
"metal = Metal(API_KEY, CLIENT_ID, APP_ID);\n"
"metal = Metal(API_KEY, CLIENT_ID, INDEX_ID);\n"
]
},
{

View File

@@ -0,0 +1,128 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "ab66dd43",
"metadata": {},
"source": [
"# SVM Retriever\n",
"\n",
"This notebook goes over how to use a retriever that under the hood uses an SVM using scikit-learn.\n",
"\n",
"Largely based on https://github.com/karpathy/randomfun/blob/master/knn_vs_svm.ipynb"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "393ac030",
"metadata": {},
"outputs": [],
"source": [
"from langchain.retrievers import SVMRetriever\n",
"from langchain.embeddings import OpenAIEmbeddings"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a801b57c",
"metadata": {},
"outputs": [],
"source": [
"# !pip install scikit-learn"
]
},
{
"cell_type": "markdown",
"id": "aaf80e7f",
"metadata": {},
"source": [
"## Create New Retriever with Texts"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "98b1c017",
"metadata": {},
"outputs": [],
"source": [
"retriever = SVMRetriever.from_texts([\"foo\", \"bar\", \"world\", \"hello\", \"foo bar\"], OpenAIEmbeddings())"
]
},
{
"cell_type": "markdown",
"id": "08437fa2",
"metadata": {},
"source": [
"## Use Retriever\n",
"\n",
"We can now use the retriever!"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "c0455218",
"metadata": {},
"outputs": [],
"source": [
"result = retriever.get_relevant_documents(\"foo\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "7dfa5c29",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='foo', metadata={}),\n",
" Document(page_content='foo bar', metadata={}),\n",
" Document(page_content='hello', metadata={}),\n",
" Document(page_content='world', metadata={})]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "74bd9256",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,210 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a90b7557",
"metadata": {},
"source": [
"# Time Weighted VectorStore Retriever\n",
"\n",
"This retriever uses a combination of semantic similarity and recency.\n",
"\n",
"The algorithm for scoring them is:\n",
"\n",
"```\n",
"semantic_similarity + (1.0 - decay_rate) ** hours_passed\n",
"```\n",
"\n",
"Notably, hours_passed refers to the hours passed since the object in the retriever **was last accessed**, not since it was created. This means that frequently accessed objects remain \"fresh.\""
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "f22cc96b",
"metadata": {},
"outputs": [],
"source": [
"import faiss\n",
"\n",
"from datetime import datetime, timedelta\n",
"from langchain.docstore import InMemoryDocstore\n",
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.retrievers import TimeWeightedVectorStoreRetriever\n",
"from langchain.schema import Document\n",
"from langchain.vectorstores import FAISS\n"
]
},
{
"cell_type": "markdown",
"id": "6af7ea6b",
"metadata": {},
"source": [
"## Low Decay Rate\n",
"\n",
"A low decay rate (in this, to be extreme, we will set close to 0) means memories will be \"remembered\" for longer. A decay rate of 0 means memories never be forgotten, making this retriever equivalent to the vector lookup."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "c10e7696",
"metadata": {},
"outputs": [],
"source": [
"# Define your embedding model\n",
"embeddings_model = OpenAIEmbeddings()\n",
"# Initialize the vectorstore as empty\n",
"embedding_size = 1536\n",
"index = faiss.IndexFlatL2(embedding_size)\n",
"vectorstore = FAISS(embeddings_model.embed_query, index, InMemoryDocstore({}), {})\n",
"retriever = TimeWeightedVectorStoreRetriever(vectorstore=vectorstore, decay_rate=.0000000000000000000000001, k=1) "
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "86dbadb9",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['5c9f7c06-c9eb-45f2-aea5-efce5fb9f2bd']"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"yesterday = datetime.now() - timedelta(days=1)\n",
"retriever.add_documents([Document(page_content=\"hello world\", metadata={\"last_accessed_at\": yesterday})])\n",
"retriever.add_documents([Document(page_content=\"hello foo\")])"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "a580be32",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='hello world', metadata={'last_accessed_at': datetime.datetime(2023, 4, 16, 22, 9, 1, 966261), 'created_at': datetime.datetime(2023, 4, 16, 22, 9, 0, 374683), 'buffer_idx': 0})]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# \"Hello World\" is returned first because it is most salient, and the decay rate is close to 0., meaning it's still recent enough\n",
"retriever.get_relevant_documents(\"hello world\")"
]
},
{
"cell_type": "markdown",
"id": "ca056896",
"metadata": {},
"source": [
"## High Decay Rate\n",
"\n",
"With a high decay factor (e.g., several 9's), the recency score quickly goes to 0! If you set this all the way to 1, recency is 0 for all objects, once again making this equivalent to a vector lookup.\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "dc37669b",
"metadata": {},
"outputs": [],
"source": [
"# Define your embedding model\n",
"embeddings_model = OpenAIEmbeddings()\n",
"# Initialize the vectorstore as empty\n",
"embedding_size = 1536\n",
"index = faiss.IndexFlatL2(embedding_size)\n",
"vectorstore = FAISS(embeddings_model.embed_query, index, InMemoryDocstore({}), {})\n",
"retriever = TimeWeightedVectorStoreRetriever(vectorstore=vectorstore, decay_rate=.999, k=1) "
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "fa284384",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['40011466-5bbe-4101-bfd1-e22e7f505de2']"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"yesterday = datetime.now() - timedelta(days=1)\n",
"retriever.add_documents([Document(page_content=\"hello world\", metadata={\"last_accessed_at\": yesterday})])\n",
"retriever.add_documents([Document(page_content=\"hello foo\")])"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "7558f94d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='hello foo', metadata={'last_accessed_at': datetime.datetime(2023, 4, 16, 22, 9, 2, 494798), 'created_at': datetime.datetime(2023, 4, 16, 22, 9, 2, 178722), 'buffer_idx': 1})]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# \"Hello Foo\" is returned first because \"hello world\" is mostly forgotten\n",
"retriever.get_relevant_documents(\"hello world\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bf6d8c90",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,162 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# AnalyticDB\n",
"\n",
"This notebook shows how to use functionality related to the AnalyticDB vector database.\n",
"To run, you should have an [AnalyticDB](https://www.alibabacloud.com/help/en/analyticdb-for-postgresql/latest/product-introduction-overview) instance up and running:\n",
"- Using [AnalyticDB Cloud Vector Database](https://www.alibabacloud.com/product/hybriddb-postgresql). Click here to fast deploy it."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import AnalyticDB"
]
},
{
"cell_type": "markdown",
"source": [
"Split documents and get embeddings by call OpenAI API"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
"loader = TextLoader('../../../state_of_the_union.txt')\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
]
},
{
"cell_type": "markdown",
"source": [
"Connect to AnalyticDB by setting related ENVIRONMENTS.\n",
"```\n",
"export PG_HOST={your_analyticdb_hostname}\n",
"export PG_PORT={your_analyticdb_port} # Optional, default is 5432\n",
"export PG_DATABASE={your_database} # Optional, default is postgres\n",
"export PG_USER={database_username}\n",
"export PG_PASSWORD={database_password}\n",
"```\n",
"\n",
"Then store your embeddings and documents into AnalyticDB"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"connection_string = AnalyticDB.connection_string_from_db_params(\n",
" driver=os.environ.get(\"PG_DRIVER\", \"psycopg2cffi\"),\n",
" host=os.environ.get(\"PG_HOST\", \"localhost\"),\n",
" port=int(os.environ.get(\"PG_PORT\", \"5432\")),\n",
" database=os.environ.get(\"PG_DATABASE\", \"postgres\"),\n",
" user=os.environ.get(\"PG_USER\", \"postgres\"),\n",
" password=os.environ.get(\"PG_PASSWORD\", \"postgres\"),\n",
")\n",
"\n",
"vector_db = AnalyticDB.from_documents(\n",
" docs,\n",
" embeddings,\n",
" connection_string= connection_string,\n",
")"
]
},
{
"cell_type": "markdown",
"source": [
"Query and retrieve data"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = vector_db.similarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n"
]
}
],
"source": [
"print(docs[0].page_content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 1
}

View File

@@ -0,0 +1,572 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "683953b3",
"metadata": {},
"source": [
"# Annoy\n",
"\n",
"This notebook shows how to use functionality related to the Annoy vector database.\n",
"\n",
"> \"Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mmapped into memory so that many processes may share the same data.\"\n",
"\n",
"via [Annoy](https://github.com/spotify/annoy) \n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "3b450bdc",
"metadata": {},
"source": [
"```{note}\n",
"Annoy is read-only - once the index is built you cannot add any more emebddings!\n",
"If you want to progressively add to your VectorStore then better choose an alternative!\n",
"```"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6613d222",
"metadata": {},
"source": [
"## Create VectorStore from texts"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "dc7351b5",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings import HuggingFaceEmbeddings\n",
"from langchain.vectorstores import Annoy\n",
"\n",
"embeddings_func = HuggingFaceEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "d2cb5f7d",
"metadata": {},
"outputs": [],
"source": [
"texts = [\"pizza is great\", \"I love salad\", \"my car\", \"a dog\"]\n",
"\n",
"# default metric is angular\n",
"vector_store = Annoy.from_texts(texts, embeddings_func)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "a856b2d1",
"metadata": {},
"outputs": [],
"source": [
"# allows for custom annoy parameters, defaults are n_trees=100, n_jobs=-1, metric=\"angular\"\n",
"vector_store_v2 = Annoy.from_texts(\n",
" texts, embeddings_func, metric=\"dot\", n_trees=100, n_jobs=1\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "8ada534a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='pizza is great', metadata={}),\n",
" Document(page_content='I love salad', metadata={}),\n",
" Document(page_content='my car', metadata={})]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"vector_store.similarity_search(\"food\", k=3)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "0470c5c8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[(Document(page_content='pizza is great', metadata={}), 1.0944390296936035),\n",
" (Document(page_content='I love salad', metadata={}), 1.1273186206817627),\n",
" (Document(page_content='my car', metadata={}), 1.1580758094787598)]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# the score is a distance metric, so lower is better\n",
"vector_store.similarity_search_with_score(\"food\", k=3)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "4583b231",
"metadata": {},
"source": [
"## Create VectorStore from docs"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "fbe898a8",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"\n",
"loader = TextLoader(\"../../../state_of_the_union.txt\")\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "51ea6b5c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans. \\n\\nLast year COVID-19 kept us apart. This year we are finally together again. \\n\\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \\n\\nWith a duty to one another to the American people to the Constitution. \\n\\nAnd with an unwavering resolve that freedom will always triumph over tyranny. \\n\\nSix days ago, Russias Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \\n\\nHe thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \\n\\nHe met the Ukrainian people. \\n\\nFrom President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world.', metadata={'source': '../../../state_of_the_union.txt'}),\n",
" Document(page_content='Groups of citizens blocking tanks with their bodies. Everyone from students to retirees teachers turned soldiers defending their homeland. \\n\\nIn this struggle as President Zelenskyy said in his speech to the European Parliament “Light will win over darkness.” The Ukrainian Ambassador to the United States is here tonight. \\n\\nLet each of us here tonight in this Chamber send an unmistakable signal to Ukraine and to the world. \\n\\nPlease rise if you are able and show that, Yes, we the United States of America stand with the Ukrainian people. \\n\\nThroughout our history weve learned this lesson when dictators do not pay a price for their aggression they cause more chaos. \\n\\nThey keep moving. \\n\\nAnd the costs and the threats to America and the world keep rising. \\n\\nThats why the NATO Alliance was created to secure peace and stability in Europe after World War 2. \\n\\nThe United States is a member along with 29 other nations. \\n\\nIt matters. American diplomacy matters. American resolve matters.', metadata={'source': '../../../state_of_the_union.txt'}),\n",
" Document(page_content='Putins latest attack on Ukraine was premeditated and unprovoked. \\n\\nHe rejected repeated efforts at diplomacy. \\n\\nHe thought the West and NATO wouldnt respond. And he thought he could divide us at home. Putin was wrong. We were ready. Here is what we did. \\n\\nWe prepared extensively and carefully. \\n\\nWe spent months building a coalition of other freedom-loving nations from Europe and the Americas to Asia and Africa to confront Putin. \\n\\nI spent countless hours unifying our European allies. We shared with the world in advance what we knew Putin was planning and precisely how he would try to falsely justify his aggression. \\n\\nWe countered Russias lies with truth. \\n\\nAnd now that he has acted the free world is holding him accountable. \\n\\nAlong with twenty-seven members of the European Union including France, Germany, Italy, as well as countries like the United Kingdom, Canada, Japan, Korea, Australia, New Zealand, and many others, even Switzerland.', metadata={'source': '../../../state_of_the_union.txt'}),\n",
" Document(page_content='We are inflicting pain on Russia and supporting the people of Ukraine. Putin is now isolated from the world more than ever. \\n\\nTogether with our allies we are right now enforcing powerful economic sanctions. \\n\\nWe are cutting off Russias largest banks from the international financial system. \\n\\nPreventing Russias central bank from defending the Russian Ruble making Putins $630 Billion “war fund” worthless. \\n\\nWe are choking off Russias access to technology that will sap its economic strength and weaken its military for years to come. \\n\\nTonight I say to the Russian oligarchs and corrupt leaders who have bilked billions of dollars off this violent regime no more. \\n\\nThe U.S. Department of Justice is assembling a dedicated task force to go after the crimes of Russian oligarchs. \\n\\nWe are joining with our European allies to find and seize your yachts your luxury apartments your private jets. We are coming for your ill-begotten gains.', metadata={'source': '../../../state_of_the_union.txt'}),\n",
" Document(page_content='And tonight I am announcing that we will join our allies in closing off American air space to all Russian flights further isolating Russia and adding an additional squeeze on their economy. The Ruble has lost 30% of its value. \\n\\nThe Russian stock market has lost 40% of its value and trading remains suspended. Russias economy is reeling and Putin alone is to blame. \\n\\nTogether with our allies we are providing support to the Ukrainians in their fight for freedom. Military assistance. Economic assistance. Humanitarian assistance. \\n\\nWe are giving more than $1 Billion in direct assistance to Ukraine. \\n\\nAnd we will continue to aid the Ukrainian people as they defend their country and to help ease their suffering. \\n\\nLet me be clear, our forces are not engaged and will not engage in conflict with Russian forces in Ukraine. \\n\\nOur forces are not going to Europe to fight in Ukraine, but to defend our NATO Allies in the event that Putin decides to keep moving west.', metadata={'source': '../../../state_of_the_union.txt'})]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs[:5]"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "d080985b",
"metadata": {},
"outputs": [],
"source": [
"vector_store_from_docs = Annoy.from_documents(docs, embeddings_func)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "4931cb99",
"metadata": {},
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = vector_store_from_docs.similarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "97969d5b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Ac\n"
]
}
],
"source": [
"print(docs[0].page_content[:100])"
]
},
{
"cell_type": "markdown",
"id": "79628542",
"metadata": {},
"source": [
"## Create VectorStore via existing embeddings"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "3432eddb",
"metadata": {},
"outputs": [],
"source": [
"embs = embeddings_func.embed_documents(texts)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "b69f8408",
"metadata": {},
"outputs": [],
"source": [
"data = list(zip(texts, embs))\n",
"\n",
"vector_store_from_embeddings = Annoy.from_embeddings(data, embeddings_func)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "e260758d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[(Document(page_content='pizza is great', metadata={}), 1.0944390296936035),\n",
" (Document(page_content='I love salad', metadata={}), 1.1273186206817627),\n",
" (Document(page_content='my car', metadata={}), 1.1580758094787598)]"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"vector_store_from_embeddings.similarity_search_with_score(\"food\", k=3)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "341390c2",
"metadata": {},
"source": [
"## Search via embeddings"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "b9bce06d",
"metadata": {},
"outputs": [],
"source": [
"motorbike_emb = embeddings_func.embed_query(\"motorbike\")"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "af2552c9",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='my car', metadata={}),\n",
" Document(page_content='a dog', metadata={}),\n",
" Document(page_content='pizza is great', metadata={})]"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"vector_store.similarity_search_by_vector(motorbike_emb, k=3)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "c7a1a924",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[(Document(page_content='my car', metadata={}), 1.0870471000671387),\n",
" (Document(page_content='a dog', metadata={}), 1.2095637321472168),\n",
" (Document(page_content='pizza is great', metadata={}), 1.3254905939102173)]"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"vector_store.similarity_search_with_score_by_vector(motorbike_emb, k=3)"
]
},
{
"cell_type": "markdown",
"id": "4b77be77",
"metadata": {},
"source": [
"## Search via docstore id"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "bbd971f0",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{0: '2d1498a8-a37c-4798-acb9-0016504ed798',\n",
" 1: '2d30aecc-88e0-4469-9d51-0ef7e9858e6d',\n",
" 2: '927f1120-985b-4691-b577-ad5cb42e011c',\n",
" 3: '3056ddcf-a62f-48c8-bd98-b9e57a3dfcae'}"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"vector_store.index_to_docstore_id"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "6dbf3365",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content='pizza is great', metadata={})"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"some_docstore_id = 0 # texts[0]\n",
"\n",
"vector_store.docstore._dict[vector_store.index_to_docstore_id[some_docstore_id]]"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "98b27172",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[(Document(page_content='pizza is great', metadata={}), 0.0),\n",
" (Document(page_content='I love salad', metadata={}), 1.0734446048736572),\n",
" (Document(page_content='my car', metadata={}), 1.2895267009735107)]"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# same document has distance 0\n",
"vector_store.similarity_search_with_score_by_index(some_docstore_id, k=3)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6f570f69",
"metadata": {},
"source": [
"## Save and load"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "ef91cc69",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"saving config\n"
]
}
],
"source": [
"vector_store.save_local(\"my_annoy_index_and_docstore\")"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "7a9d2fce",
"metadata": {},
"outputs": [],
"source": [
"loaded_vector_store = Annoy.load_local(\n",
" \"my_annoy_index_and_docstore\", embeddings=embeddings_func\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "bba77cae",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[(Document(page_content='pizza is great', metadata={}), 0.0),\n",
" (Document(page_content='I love salad', metadata={}), 1.0734446048736572),\n",
" (Document(page_content='my car', metadata={}), 1.2895267009735107)]"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# same document has distance 0\n",
"loaded_vector_store.similarity_search_with_score_by_index(some_docstore_id, k=3)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "df4beb83",
"metadata": {},
"source": [
"## Construct from scratch"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "26fcf742",
"metadata": {},
"outputs": [],
"source": [
"import uuid\n",
"from annoy import AnnoyIndex\n",
"from langchain.docstore.document import Document\n",
"from langchain.docstore.in_memory import InMemoryDocstore\n",
"\n",
"metadatas = [{\"x\": \"food\"}, {\"x\": \"food\"}, {\"x\": \"stuff\"}, {\"x\": \"animal\"}]\n",
"\n",
"# embeddings\n",
"embeddings = embeddings_func.embed_documents(texts)\n",
"\n",
"# embedding dim\n",
"f = len(embeddings[0])\n",
"\n",
"# index\n",
"metric = \"angular\"\n",
"index = AnnoyIndex(f, metric=metric)\n",
"for i, emb in enumerate(embeddings):\n",
" index.add_item(i, emb)\n",
"index.build(10)\n",
"\n",
"# docstore\n",
"documents = []\n",
"for i, text in enumerate(texts):\n",
" metadata = metadatas[i] if metadatas else {}\n",
" documents.append(Document(page_content=text, metadata=metadata))\n",
"index_to_docstore_id = {i: str(uuid.uuid4()) for i in range(len(documents))}\n",
"docstore = InMemoryDocstore(\n",
" {index_to_docstore_id[i]: doc for i, doc in enumerate(documents)}\n",
")\n",
"\n",
"db_manually = Annoy(\n",
" embeddings_func.embed_query, index, metric, docstore, index_to_docstore_id\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "2b3f6f5c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[(Document(page_content='pizza is great', metadata={'x': 'food'}),\n",
" 1.1314140558242798),\n",
" (Document(page_content='I love salad', metadata={'x': 'food'}),\n",
" 1.1668788194656372),\n",
" (Document(page_content='my car', metadata={'x': 'stuff'}), 1.226445198059082)]"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"db_manually.similarity_search_with_score(\"eating!\", k=3)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -17,34 +17,36 @@
"metadata": {},
"outputs": [],
"source": [
"!python3 -m pip install openai deeplake"
"!python3 -m pip install openai deeplake tiktoken"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import DeepLake\n",
"from langchain.document_loaders import TextLoader"
"from langchain.vectorstores import DeepLake"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"os.environ['OPENAI_API_KEY'] = 'sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'"
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')\n",
"embeddings = OpenAIEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
@@ -60,9 +62,38 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 6,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"mem://langchain loaded successfully.\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Evaluating ingest: 100%|██████████| 1/1 [00:04<00:00\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dataset(path='mem://langchain', tensors=['embedding', 'ids', 'metadata', 'text'])\n",
"\n",
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding generic (4, 1536) float32 None \n",
" ids text (4, 1) str None \n",
" metadata json (4, 1) str None \n",
" text text (4, 1) str None \n"
]
}
],
"source": [
"db = DeepLake.from_documents(docs, embeddings)\n",
"\n",
@@ -72,9 +103,23 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 7,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n"
]
}
],
"source": [
"print(docs[0].page_content)"
]
@@ -89,9 +134,18 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 9,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/media/sdb/davit/.local/lib/python3.10/site-packages/langchain/llms/openai.py:624: UserWarning: You are trying to use a chat model. This way of initializing it is no longer supported. Instead, please use: `from langchain.chat_models import ChatOpenAI`\n",
" warnings.warn(\n"
]
}
],
"source": [
"from langchain.chains import RetrievalQA\n",
"from langchain.llms import OpenAIChat\n",
@@ -101,9 +155,20 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 10,
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"'The president nominated Circuit Court of Appeals Judge Ketanji Brown Jackson for the United States Supreme Court and praised her qualifications and broad support from both Democrats and Republicans.'"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query = 'What did the president say about Ketanji Brown Jackson'\n",
"qa.run(query)"
@@ -119,9 +184,43 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 11,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"mem://langchain loaded successfully.\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Evaluating ingest: 100%|██████████| 1/1 [00:04<00:00\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dataset(path='mem://langchain', tensors=['embedding', 'ids', 'metadata', 'text'])\n",
"\n",
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding generic (42, 1536) float32 None \n",
" ids text (42, 1) str None \n",
" metadata json (42, 1) str None \n",
" text text (42, 1) str None \n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": []
}
],
"source": [
"import random\n",
"\n",
@@ -133,9 +232,30 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 12,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 42/42 [00:00<00:00, 3456.17it/s]\n"
]
},
{
"data": {
"text/plain": [
"[Document(page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \\n\\nWe can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \\n\\nWeve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \\n\\nWere putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \\n\\nWere securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', metadata={'source': '../../../state_of_the_union.txt', 'year': 2013}),\n",
" Document(page_content='And for our LGBTQ+ Americans, lets finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong. \\n\\nAs I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. \\n\\nWhile it often appears that we never agree, that isnt true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice. \\n\\nAnd soon, well strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things. \\n\\nSo tonight Im offering a Unity Agenda for the Nation. Four big things we can do together. \\n\\nFirst, beat the opioid epidemic.', metadata={'source': '../../../state_of_the_union.txt', 'year': 2013}),\n",
" Document(page_content='Vice President Harris and I ran for office with a new economic vision for America. \\n\\nInvest in America. Educate Americans. Grow the workforce. Build the economy from the bottom up \\nand the middle out, not from the top down. \\n\\nBecause we know that when the middle class grows, the poor have a ladder up and the wealthy do very well. \\n\\nAmerica used to have the best roads, bridges, and airports on Earth. \\n\\nNow our infrastructure is ranked 13th in the world. \\n\\nWe wont be able to compete for the jobs of the 21st Century if we dont fix that. \\n\\nThats why it was so important to pass the Bipartisan Infrastructure Law—the most sweeping investment to rebuild America in history. \\n\\nThis was a bipartisan effort, and I want to thank the members of both parties who worked to make it happen. \\n\\nWere done talking about infrastructure weeks. \\n\\nWere going to have an infrastructure decade.', metadata={'source': '../../../state_of_the_union.txt', 'year': 2013}),\n",
" Document(page_content='It is going to transform America and put us on a path to win the economic competition of the 21st Century that we face with the rest of the world—particularly with China. \\n\\nAs Ive told Xi Jinping, it is never a good bet to bet against the American people. \\n\\nWell create good jobs for millions of Americans, modernizing roads, airports, ports, and waterways all across America. \\n\\nAnd well do it all to withstand the devastating effects of the climate crisis and promote environmental justice. \\n\\nWell build a national network of 500,000 electric vehicle charging stations, begin to replace poisonous lead pipes—so every child—and every American—has clean water to drink at home and at school, provide affordable high-speed internet for every American—urban, suburban, rural, and tribal communities. \\n\\n4,000 projects have already been announced. \\n\\nAnd tonight, Im announcing that this year we will start fixing over 65,000 miles of highway and 1,500 bridges in disrepair.', metadata={'source': '../../../state_of_the_union.txt', 'year': 2013})]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"db.similarity_search('What did the president say about Ketanji Brown Jackson', filter={'year': 2013})"
]
@@ -151,9 +271,23 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 13,
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt', 'year': 2012}),\n",
" Document(page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \\n\\nWe can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \\n\\nWeve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \\n\\nWere putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \\n\\nWere securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', metadata={'source': '../../../state_of_the_union.txt', 'year': 2013}),\n",
" Document(page_content='And for our LGBTQ+ Americans, lets finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong. \\n\\nAs I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. \\n\\nWhile it often appears that we never agree, that isnt true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice. \\n\\nAnd soon, well strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things. \\n\\nSo tonight Im offering a Unity Agenda for the Nation. Four big things we can do together. \\n\\nFirst, beat the opioid epidemic.', metadata={'source': '../../../state_of_the_union.txt', 'year': 2013}),\n",
" Document(page_content='Tonight, Im announcing a crackdown on these companies overcharging American businesses and consumers. \\n\\nAnd as Wall Street firms take over more nursing homes, quality in those homes has gone down and costs have gone up. \\n\\nThat ends on my watch. \\n\\nMedicare is going to set higher standards for nursing homes and make sure your loved ones get the care they deserve and expect. \\n\\nWell also cut costs and keep the economy going strong by giving workers a fair shot, provide more training and apprenticeships, hire them based on their skills not degrees. \\n\\nLets pass the Paycheck Fairness Act and paid leave. \\n\\nRaise the minimum wage to $15 an hour and extend the Child Tax Credit, so no one has to raise a family in poverty. \\n\\nLets increase Pell Grants and increase our historic support of HBCUs, and invest in what Jill—our First Lady who teaches full-time—calls Americas best-kept secret: community colleges.', metadata={'source': '../../../state_of_the_union.txt', 'year': 2014})]"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"db.similarity_search('What did the president say about Ketanji Brown Jackson?', distance_metric='cos')"
]
@@ -169,9 +303,23 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 14,
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt', 'year': 2012}),\n",
" Document(page_content='One was stationed at bases and breathing in toxic smoke from “burn pits” that incinerated wastes of war—medical and hazard material, jet fuel, and more. \\n\\nWhen they came home, many of the worlds fittest and best trained warriors were never the same. \\n\\nHeadaches. Numbness. Dizziness. \\n\\nA cancer that would put them in a flag-draped coffin. \\n\\nI know. \\n\\nOne of those soldiers was my son Major Beau Biden. \\n\\nWe dont know for sure if a burn pit was the cause of his brain cancer, or the diseases of so many of our troops. \\n\\nBut Im committed to finding out everything we can. \\n\\nCommitted to military families like Danielle Robinson from Ohio. \\n\\nThe widow of Sergeant First Class Heath Robinson. \\n\\nHe was born a soldier. Army National Guard. Combat medic in Kosovo and Iraq. \\n\\nStationed near Baghdad, just yards from burn pits the size of football fields. \\n\\nHeaths widow Danielle is here with us tonight. They loved going to Ohio State football games. He loved building Legos with their daughter.', metadata={'source': '../../../state_of_the_union.txt', 'year': 2014}),\n",
" Document(page_content='As Ohio Senator Sherrod Brown says, “Its time to bury the label “Rust Belt.” \\n\\nIts time. \\n\\nBut with all the bright spots in our economy, record job growth and higher wages, too many families are struggling to keep up with the bills. \\n\\nInflation is robbing them of the gains they might otherwise feel. \\n\\nI get it. Thats why my top priority is getting prices under control. \\n\\nLook, our economy roared back faster than most predicted, but the pandemic meant that businesses had a hard time hiring enough workers to keep up production in their factories. \\n\\nThe pandemic also disrupted global supply chains. \\n\\nWhen factories close, it takes longer to make goods and get them from the warehouse to the store, and prices go up. \\n\\nLook at cars. \\n\\nLast year, there werent enough semiconductors to make all the cars that people wanted to buy. \\n\\nAnd guess what, prices of automobiles went up. \\n\\nSo—we have a choice. \\n\\nOne way to fight inflation is to drive down wages and make Americans poorer.', metadata={'source': '../../../state_of_the_union.txt', 'year': 2012}),\n",
" Document(page_content='We cant change how divided weve been. But we can change how we move forward—on COVID-19 and other issues we must face together. \\n\\nI recently visited the New York City Police Department days after the funerals of Officer Wilbert Mora and his partner, Officer Jason Rivera. \\n\\nThey were responding to a 9-1-1 call when a man shot and killed them with a stolen gun. \\n\\nOfficer Mora was 27 years old. \\n\\nOfficer Rivera was 22. \\n\\nBoth Dominican Americans whod grown up on the same streets they later chose to patrol as police officers. \\n\\nI spoke with their families and told them that we are forever in debt for their sacrifice, and we will carry on their mission to restore the trust and safety every community deserves. \\n\\nIve worked on these issues a long time. \\n\\nI know what works: Investing in crime preventionand community police officers wholl walk the beat, wholl know the neighborhood, and who can restore trust and safety.', metadata={'source': '../../../state_of_the_union.txt', 'year': 2012})]"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"db.max_marginal_relevance_search('What did the president say about Ketanji Brown Jackson?')"
]
@@ -187,21 +335,87 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"!activeloop login -t <token>"
"os.environ['ACTIVELOOP_TOKEN'] = getpass.getpass('Activeloop Token:')"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 17,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Your Deep Lake dataset has been successfully created!\n",
"The dataset is private so make sure you are logged in!\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\\"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"This dataset can be visualized in Jupyter Notebook by ds.visualize() or at https://app.activeloop.ai/davitbun/linkedin\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
" \r"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"hub://davitbun/linkedin loaded successfully.\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Evaluating ingest: 100%|██████████| 1/1 [00:23<00:00\n",
"/"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dataset(path='hub://davitbun/linkedin', tensors=['embedding', 'ids', 'metadata', 'text'])\n",
"\n",
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding generic (42, 1536) float32 None \n",
" ids text (42, 1) str None \n",
" metadata json (42, 1) str None \n",
" text text (42, 1) str None \n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
" \r"
]
}
],
"source": [
"# Embed and store the texts\n",
"dataset_path = \"hub://{username}/{dataset_name}\" # could be also ./local/path (much faster locally), s3://bucket/path/to/dataset, gcs://path/to/dataset, etc.\n",
"dataset_path = f\"hub://{USERNAME}/{DATASET_NAME}\" # could be also ./local/path (much faster locally), s3://bucket/path/to/dataset, gcs://path/to/dataset, etc.\n",
"\n",
"embedding = OpenAIEmbeddings()\n",
"vectordb = DeepLake.from_documents(documents=docs, embedding=embedding, dataset_path=dataset_path)"
@@ -209,9 +423,23 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 18,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n"
]
}
],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = db.similarity_search(query)\n",
@@ -220,11 +448,35 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dataset(path='hub://davitbun/linkedin', tensors=['embedding', 'ids', 'metadata', 'text'])\n",
"\n",
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding generic (42, 1536) float32 None \n",
" ids text (42, 1) str None \n",
" metadata json (42, 1) str None \n",
" text text (42, 1) str None \n"
]
}
],
"source": [
"vectordb.ds.summary()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"vectordb.ds.summary()"
"embeddings = vectordb.ds.embedding.numpy()"
]
},
{
@@ -232,9 +484,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"embeddings = vectordb.ds.embedding.numpy()"
]
"source": []
}
],
"metadata": {

View File

@@ -46,7 +46,7 @@
"metadata": {},
"outputs": [],
"source": [
"db = ElasticVectorSearch.from_documents(docs, embeddings, elasticsearch_url=\"http://localhost:9200\"\n",
"db = ElasticVectorSearch.from_documents(docs, embeddings, elasticsearch_url=\"http://localhost:9200\")\n",
"\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = db.similarity_search(query)"

View File

@@ -0,0 +1,267 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "683953b3",
"metadata": {},
"source": [
"# MyScale\n",
"\n",
"This notebook shows how to use functionality related to the MyScale vector database."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "aac9563e",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import MyScale\n",
"from langchain.document_loaders import TextLoader"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "a9d16fa3",
"metadata": {},
"source": [
"## Setting up envrionments\n",
"\n",
"There are two ways to set up parameters for myscale index.\n",
"\n",
"1. Environment Variables\n",
"\n",
" Before you run the app, please set the environment variable with `export`:\n",
" `export MYSCALE_URL='<your-endpoints-url>' MYSCALE_PORT=<your-endpoints-port> MYSCALE_USERNAME=<your-username> MYSCALE_PASSWORD=<your-password> ...`\n",
"\n",
" You can easily find your account, password and other info on our SaaS. For details please refer to [this document](https://docs.myscale.com/en/cluster-management/)\n",
"\n",
" Every attributes under `MyScaleSettings` can be set with prefix `MYSCALE_` and is case insensitive.\n",
"\n",
"2. Create `MyScaleSettings` object with parameters\n",
"\n",
"\n",
" ```python\n",
" from langchain.vectorstores import MyScale, MyScaleSettings\n",
" config = MyScaleSetting(host=\"<your-backend-url>\", port=8443, ...)\n",
" index = MyScale(embedding_function, config)\n",
" index.add_documents(...)\n",
" ```"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "a3c3999a",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
"loader = TextLoader('../../../state_of_the_union.txt')\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "6e104aee",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Inserting data...: 100%|██████████| 42/42 [00:18<00:00, 2.21it/s]\n"
]
}
],
"source": [
"for d in docs:\n",
" d.metadata = {'some': 'metadata'}\n",
"docsearch = MyScale.from_documents(docs, embeddings)\n",
"\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = docsearch.similarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "9c608226",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"As Frances Haugen, who is here with us tonight, has shown, we must hold social media platforms accountable for the national experiment theyre conducting on our children for profit. \n",
"\n",
"Its time to strengthen privacy protections, ban targeted advertising to children, demand tech companies stop collecting personal data on our children. \n",
"\n",
"And lets get all Americans the mental health services they need. More people they can turn to for help, and full parity between physical and mental health care. \n",
"\n",
"Third, support our veterans. \n",
"\n",
"Veterans are the best of us. \n",
"\n",
"Ive always believed that we have a sacred obligation to equip all those we send to war and care for them and their families when they come home. \n",
"\n",
"My administration is providing assistance with job training and housing, and now helping lower-income veterans get VA care debt-free. \n",
"\n",
"Our troops in Iraq and Afghanistan faced many dangers.\n"
]
}
],
"source": [
"print(docs[0].page_content)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "e3a8b105",
"metadata": {},
"source": [
"## Get connection info and data schema"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "69996818",
"metadata": {},
"outputs": [],
"source": [
"print(str(docsearch))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "f59360c0",
"metadata": {},
"source": [
"## Filtering\n",
"\n",
"You can have direct access to myscale SQL where statement. You can write `WHERE` clause following standard SQL.\n",
"\n",
"**NOTE**: Please be aware of SQL injection, this interface must not be directly called by end-user.\n",
"\n",
"If you custimized your `column_map` under your setting, you search with filter like this:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "232055f6",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Inserting data...: 100%|██████████| 42/42 [00:15<00:00, 2.69it/s]\n"
]
}
],
"source": [
"from langchain.vectorstores import MyScale, MyScaleSettings\n",
"from langchain.document_loaders import TextLoader\n",
"\n",
"loader = TextLoader('../../../state_of_the_union.txt')\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()\n",
"\n",
"for i, d in enumerate(docs):\n",
" d.metadata = {'doc_id': i}\n",
"\n",
"docsearch = MyScale.from_documents(docs, embeddings)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "ddbcee77",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.252379834651947 {'doc_id': 6, 'some': ''} And Im taking robus...\n",
"0.25022566318511963 {'doc_id': 1, 'some': ''} Groups of citizens b...\n",
"0.2469480037689209 {'doc_id': 8, 'some': ''} And so many families...\n",
"0.2428302764892578 {'doc_id': 0, 'some': 'metadata'} As Frances Haugen, w...\n"
]
}
],
"source": [
"meta = docsearch.metadata_column\n",
"output = docsearch.similarity_search_with_relevance_scores('What did the president say about Ketanji Brown Jackson?', \n",
" k=4, where_str=f\"{meta}.doc_id<10\")\n",
"for d, dist in output:\n",
" print(dist, d.metadata, d.page_content[:20] + '...')"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "a359ed74",
"metadata": {},
"source": [
"## Deleting your data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fb6a9d36",
"metadata": {},
"outputs": [],
"source": [
"docsearch.drop()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "48dbd8e0",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -58,6 +58,9 @@
"\n",
"docsearch = Pinecone.from_documents(docs, embeddings, index_name=index_name)\n",
"\n",
"# if you already have an index, you can load it like this\n",
"# docsearch = Pinecone.from_existing_index(index_name, embeddings)\n",
"\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = docsearch.similarity_search(query)"
]

View File

@@ -0,0 +1,399 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "683953b3",
"metadata": {},
"source": [
"# SupabaseVectorStore\n",
"\n",
"This notebook shows how to use Supabase and `pgvector` as your VectorStore.\n",
"\n",
"To run this notebook, please ensure:\n",
"\n",
"- the `pgvector` extension is enabled\n",
"- you have installed the `supabase-py` package\n",
"- that you have created a `match_documents` function in your database\n",
"- that you have a `documents` table in your `public` schema similar to the one below.\n",
"\n",
"The following function determines cosine similarity, but you can adjust to your needs.\n",
"\n",
"```sql\n",
" -- Enable the pgvector extension to work with embedding vectors\n",
" create extension vector;\n",
"\n",
" -- Create a table to store your documents\n",
" create table documents (\n",
" id bigserial primary key,\n",
" content text, -- corresponds to Document.pageContent\n",
" metadata jsonb, -- corresponds to Document.metadata\n",
" embedding vector(1536) -- 1536 works for OpenAI embeddings, change if needed\n",
" );\n",
"\n",
" CREATE FUNCTION match_documents(query_embedding vector(1536), match_count int)\n",
" RETURNS TABLE(\n",
" id bigint,\n",
" content text,\n",
" metadata jsonb,\n",
" -- we return matched vectors to enable maximal marginal relevance searches\n",
" embedding vector(1536),\n",
" similarity float)\n",
" LANGUAGE plpgsql\n",
" AS $$\n",
" # variable_conflict use_column\n",
" BEGIN\n",
" RETURN query\n",
" SELECT\n",
" id,\n",
" content,\n",
" metadata,\n",
" embedding,\n",
" 1 -(documents.embedding <=> query_embedding) AS similarity\n",
" FROM\n",
" documents\n",
" ORDER BY\n",
" documents.embedding <=> query_embedding\n",
" LIMIT match_count;\n",
" END;\n",
" $$;\n",
"```\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "6bd4498b",
"metadata": {},
"outputs": [],
"source": [
"# with pip\n",
"# !pip install supabase\n",
"\n",
"# with conda\n",
"# !conda install -c conda-forge supabase"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "90afc6df",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# If you're storing your Supabase and OpenAI API keys in a .env file, you can load them with dotenv\n",
"from dotenv import load_dotenv\n",
"\n",
"load_dotenv()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "5ce44f7c",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from supabase.client import Client, create_client\n",
"\n",
"supabase_url = os.environ.get(\"SUPABASE_URL\")\n",
"supabase_key = os.environ.get(\"SUPABASE_SERVICE_KEY\")\n",
"supabase: Client = create_client(supabase_url, supabase_key)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "aac9563e",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2023-04-19 20:12:28,593:INFO - NumExpr defaulting to 8 threads.\n"
]
}
],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import SupabaseVectorStore\n",
"from langchain.document_loaders import TextLoader"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "a3c3999a",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
"\n",
"loader = TextLoader(\"../../../state_of_the_union.txt\")\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "efec97f8",
"metadata": {},
"outputs": [],
"source": [
"# We're using the default `documents` table here. You can modify this by passing in a `table_name` argument to the `from_documents` method.\n",
"vector_store = SupabaseVectorStore.from_documents(\n",
" docs, embeddings, client=supabase\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "5eabdb75",
"metadata": {},
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"matched_docs = vector_store.similarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "4b172de8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n"
]
}
],
"source": [
"print(matched_docs[0].page_content)"
]
},
{
"cell_type": "markdown",
"id": "18152965",
"metadata": {},
"source": [
"## Similarity search with score\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "72aaa9c8",
"metadata": {},
"outputs": [],
"source": [
"matched_docs = vector_store.similarity_search_with_relevance_scores(query)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "d88e958e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'}),\n",
" 0.802509746274066)"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"matched_docs[0]"
]
},
{
"cell_type": "markdown",
"id": "794a7552",
"metadata": {},
"source": [
"## Retriever options\n",
"\n",
"This section goes over different options for how to use SupabaseVectorStore as a retriever.\n",
"\n",
"### Maximal Marginal Relevance Searches\n",
"\n",
"In addition to using similarity search in the retriever object, you can also use `mmr`.\n"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "96ff911a",
"metadata": {},
"outputs": [],
"source": [
"retriever = vector_store.as_retriever(search_type=\"mmr\")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "f00be6d0",
"metadata": {},
"outputs": [],
"source": [
"matched_docs = retriever.get_relevant_documents(query)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "a559c3f1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"## Document 0\n",
"\n",
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n",
"\n",
"## Document 1\n",
"\n",
"One was stationed at bases and breathing in toxic smoke from “burn pits” that incinerated wastes of war—medical and hazard material, jet fuel, and more. \n",
"\n",
"When they came home, many of the worlds fittest and best trained warriors were never the same. \n",
"\n",
"Headaches. Numbness. Dizziness. \n",
"\n",
"A cancer that would put them in a flag-draped coffin. \n",
"\n",
"I know. \n",
"\n",
"One of those soldiers was my son Major Beau Biden. \n",
"\n",
"We dont know for sure if a burn pit was the cause of his brain cancer, or the diseases of so many of our troops. \n",
"\n",
"But Im committed to finding out everything we can. \n",
"\n",
"Committed to military families like Danielle Robinson from Ohio. \n",
"\n",
"The widow of Sergeant First Class Heath Robinson. \n",
"\n",
"He was born a soldier. Army National Guard. Combat medic in Kosovo and Iraq. \n",
"\n",
"Stationed near Baghdad, just yards from burn pits the size of football fields. \n",
"\n",
"Heaths widow Danielle is here with us tonight. They loved going to Ohio State football games. He loved building Legos with their daughter.\n",
"\n",
"## Document 2\n",
"\n",
"And Im taking robust action to make sure the pain of our sanctions is targeted at Russias economy. And I will use every tool at our disposal to protect American businesses and consumers. \n",
"\n",
"Tonight, I can announce that the United States has worked with 30 other countries to release 60 Million barrels of oil from reserves around the world. \n",
"\n",
"America will lead that effort, releasing 30 Million barrels from our own Strategic Petroleum Reserve. And we stand ready to do more if necessary, unified with our allies. \n",
"\n",
"These steps will help blunt gas prices here at home. And I know the news about whats happening can seem alarming. \n",
"\n",
"But I want you to know that we are going to be okay. \n",
"\n",
"When the history of this era is written Putins war on Ukraine will have left Russia weaker and the rest of the world stronger. \n",
"\n",
"While it shouldnt have taken something so terrible for people around the world to see whats at stake now everyone sees it clearly.\n",
"\n",
"## Document 3\n",
"\n",
"We cant change how divided weve been. But we can change how we move forward—on COVID-19 and other issues we must face together. \n",
"\n",
"I recently visited the New York City Police Department days after the funerals of Officer Wilbert Mora and his partner, Officer Jason Rivera. \n",
"\n",
"They were responding to a 9-1-1 call when a man shot and killed them with a stolen gun. \n",
"\n",
"Officer Mora was 27 years old. \n",
"\n",
"Officer Rivera was 22. \n",
"\n",
"Both Dominican Americans whod grown up on the same streets they later chose to patrol as police officers. \n",
"\n",
"I spoke with their families and told them that we are forever in debt for their sacrifice, and we will carry on their mission to restore the trust and safety every community deserves. \n",
"\n",
"Ive worked on these issues a long time. \n",
"\n",
"I know what works: Investing in crime preventionand community police officers wholl walk the beat, wholl know the neighborhood, and who can restore trust and safety.\n"
]
}
],
"source": [
"for i, d in enumerate(matched_docs):\n",
" print(f\"\\n## Document {i}\\n\")\n",
" print(d.page_content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "79b1198e",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -16,7 +16,7 @@
"In order to add a memory with an external message store to an agent we are going to do the following steps:\n",
"\n",
"1. We are going to create a `RedisChatMessageHistory` to connect to an external database to store the messages in.\n",
"2. We are going to create an `LLMChain` useing that chat history as memory.\n",
"2. We are going to create an `LLMChain` using that chat history as memory.\n",
"3. We are going to use that `LLMChain` to create a custom Agent.\n",
"\n",
"For the purposes of this exercise, we are going to create a simple custom Agent that has access to a search tool and utilizes the `ConversationBufferMemory` class."

View File

@@ -161,7 +161,7 @@
"Overall, you are a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether the human needs help with a specific question or just wants to have a conversation about a particular topic, you are here to assist.\n",
"\n",
"Context:\n",
"{'Deven': '', 'Sam': ''}\n",
"{'Deven': 'Deven is working on a hackathon project with Sam.', 'Sam': 'Sam is working on a hackathon project with Deven.'}\n",
"\n",
"Current conversation:\n",
"\n",
@@ -189,29 +189,29 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 14,
"id": "0269f513",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'Deven': 'Deven is working on a hackathon project with Sam.',\n",
"{'Deven': 'Deven is working on a hackathon project with Sam, which they are entering into a hackathon.',\n",
" 'Sam': 'Sam is working on a hackathon project with Deven.'}"
]
},
"execution_count": 9,
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"conversation.memory.store"
"conversation.memory.entity_store.store"
]
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 15,
"id": "46324ca8",
"metadata": {},
"outputs": [
@@ -232,7 +232,7 @@
"Overall, you are a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether the human needs help with a specific question or just wants to have a conversation about a particular topic, you are here to assist.\n",
"\n",
"Context:\n",
"{'Deven': 'Deven is working on a hackathon project with Sam.', 'Sam': 'Sam is working on a hackathon project with Deven.', 'Langchain': ''}\n",
"{'Deven': 'Deven is working on a hackathon project with Sam, which they are entering into a hackathon.', 'Sam': 'Sam is working on a hackathon project with Deven.', 'Langchain': ''}\n",
"\n",
"Current conversation:\n",
"Human: Deven & Sam are working on a hackathon project\n",
@@ -250,7 +250,7 @@
"' That sounds like an interesting project! What kind of memory structures are they trying to add?'"
]
},
"execution_count": 10,
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
@@ -261,7 +261,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 16,
"id": "ff2ebf6b",
"metadata": {},
"outputs": [
@@ -282,7 +282,7 @@
"Overall, you are a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether the human needs help with a specific question or just wants to have a conversation about a particular topic, you are here to assist.\n",
"\n",
"Context:\n",
"{'Deven': 'Deven is working on a hackathon project with Sam, attempting to add more complex memory structures to Langchain.', 'Sam': 'Sam is working on a hackathon project with Deven, trying to add more complex memory structures to Langchain.', 'Langchain': 'Langchain is a project that is trying to add more complex memory structures.', 'Key-Value Store': ''}\n",
"{'Deven': 'Deven is working on a hackathon project with Sam, which they are entering into a hackathon. They are trying to add more complex memory structures to Langchain.', 'Sam': 'Sam is working on a hackathon project with Deven, trying to add more complex memory structures to Langchain.', 'Langchain': 'Langchain is a project that is trying to add more complex memory structures.', 'Key-Value Store': ''}\n",
"\n",
"Current conversation:\n",
"Human: Deven & Sam are working on a hackathon project\n",
@@ -299,10 +299,10 @@
{
"data": {
"text/plain": [
"' That sounds like a great idea! How will the key-value store work?'"
"' That sounds like a great idea! How will the key-value store help with the project?'"
]
},
"execution_count": 11,
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
@@ -313,7 +313,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 17,
"id": "56cfd4ba",
"metadata": {},
"outputs": [
@@ -334,7 +334,7 @@
"Overall, you are a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether the human needs help with a specific question or just wants to have a conversation about a particular topic, you are here to assist.\n",
"\n",
"Context:\n",
"{'Deven': 'Deven is working on a hackathon project with Sam, attempting to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation.', 'Sam': 'Sam is working on a hackathon project with Deven, trying to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation.'}\n",
"{'Deven': 'Deven is working on a hackathon project with Sam, which they are entering into a hackathon. They are trying to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation.', 'Sam': 'Sam is working on a hackathon project with Deven, trying to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation.'}\n",
"\n",
"Current conversation:\n",
"Human: Deven & Sam are working on a hackathon project\n",
@@ -342,7 +342,7 @@
"Human: They are trying to add more complex memory structures to Langchain\n",
"AI: That sounds like an interesting project! What kind of memory structures are they trying to add?\n",
"Human: They are adding in a key-value store for entities mentioned so far in the conversation.\n",
"AI: That sounds like a great idea! How will the key-value store work?\n",
"AI: That sounds like a great idea! How will the key-value store help with the project?\n",
"Last line:\n",
"Human: What do you know about Deven & Sam?\n",
"You:\u001b[0m\n",
@@ -353,10 +353,10 @@
{
"data": {
"text/plain": [
"' Deven and Sam are working on a hackathon project together, attempting to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation.'"
"' Deven and Sam are working on a hackathon project together, trying to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation. They seem to be working hard on this project and have a great idea for how the key-value store can help.'"
]
},
"execution_count": 12,
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
@@ -376,7 +376,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 21,
"id": "038b4d3f",
"metadata": {},
"outputs": [
@@ -384,28 +384,34 @@
"name": "stdout",
"output_type": "stream",
"text": [
"{'Deven': 'Deven is working on a hackathon project with Sam, attempting to add '\n",
" 'more complex memory structures to Langchain, including a key-value '\n",
" 'store for entities mentioned so far in the conversation.',\n",
" 'Key-Value Store': 'A key-value store that stores entities mentioned in the '\n",
" 'conversation.',\n",
"{'Daimon': 'Daimon is a company founded by Sam, a successful entrepreneur.',\n",
" 'Deven': 'Deven is working on a hackathon project with Sam, which they are '\n",
" 'entering into a hackathon. They are trying to add more complex '\n",
" 'memory structures to Langchain, including a key-value store for '\n",
" 'entities mentioned so far in the conversation, and seem to be '\n",
" 'working hard on this project with a great idea for how the '\n",
" 'key-value store can help.',\n",
" 'Key-Value Store': 'A key-value store is being added to the project to store '\n",
" 'entities mentioned in the conversation.',\n",
" 'Langchain': 'Langchain is a project that is trying to add more complex '\n",
" 'memory structures, including a key-value store for entities '\n",
" 'mentioned so far in the conversation.',\n",
" 'Sam': 'Sam is working on a hackathon project with Deven, attempting to add '\n",
" 'more complex memory structures to Langchain, including a key-value '\n",
" 'store for entities mentioned so far in the conversation.'}\n"
" 'Sam': 'Sam is working on a hackathon project with Deven, trying to add more '\n",
" 'complex memory structures to Langchain, including a key-value store '\n",
" 'for entities mentioned so far in the conversation. They seem to have '\n",
" 'a great idea for how the key-value store can help, and Sam is also '\n",
" 'the founder of a company called Daimon.'}\n"
]
}
],
"source": [
"from pprint import pprint\n",
"pprint(conversation.memory.store)"
"pprint(conversation.memory.entity_store.store)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 22,
"id": "2df4800e",
"metadata": {},
"outputs": [
@@ -426,15 +432,16 @@
"Overall, you are a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether the human needs help with a specific question or just wants to have a conversation about a particular topic, you are here to assist.\n",
"\n",
"Context:\n",
"{'Daimon': '', 'Sam': 'Sam is working on a hackathon project with Deven to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation.'}\n",
"{'Daimon': 'Daimon is a company founded by Sam, a successful entrepreneur.', 'Sam': 'Sam is working on a hackathon project with Deven, trying to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation. They seem to have a great idea for how the key-value store can help, and Sam is also the founder of a company called Daimon.'}\n",
"\n",
"Current conversation:\n",
"Human: They are trying to add more complex memory structures to Langchain\n",
"AI: That sounds like an interesting project! What kind of memory structures are they trying to add?\n",
"Human: They are adding in a key-value store for entities mentioned so far in the conversation.\n",
"AI: That sounds like a great idea! How will the key-value store work?\n",
"AI: That sounds like a great idea! How will the key-value store help with the project?\n",
"Human: What do you know about Deven & Sam?\n",
"AI: Deven and Sam are working on a hackathon project to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation. They seem to be very motivated and passionate about their project, and are working hard to make it a success.\n",
"AI: Deven and Sam are working on a hackathon project together, trying to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation. They seem to be working hard on this project and have a great idea for how the key-value store can help.\n",
"Human: Sam is the founder of a company called Daimon.\n",
"AI: \n",
"That's impressive! It sounds like Sam is a very successful entrepreneur. What kind of company is Daimon?\n",
"Last line:\n",
"Human: Sam is the founder of a company called Daimon.\n",
"You:\u001b[0m\n",
@@ -445,10 +452,10 @@
{
"data": {
"text/plain": [
"\"\\nThat's impressive! It sounds like Sam is a very successful entrepreneur. What kind of company is Daimon?\""
"\" That's impressive! It sounds like Sam is a very successful entrepreneur. What kind of company is Daimon?\""
]
},
"execution_count": 8,
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
@@ -459,7 +466,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 24,
"id": "ebe9e36f",
"metadata": {},
"outputs": [
@@ -467,32 +474,36 @@
"name": "stdout",
"output_type": "stream",
"text": [
"{'Daimon': 'Daimon is a company founded by Sam.',\n",
" 'Deven': 'Deven is working on a hackathon project with Sam to add more '\n",
" 'complex memory structures to Langchain, including a key-value store '\n",
" 'for entities mentioned so far in the conversation.',\n",
" 'Key-Value Store': 'Key-Value Store: A data structure that stores values '\n",
" 'associated with a unique key, allowing for efficient '\n",
" 'retrieval of values. Deven and Sam are adding a key-value '\n",
" 'store for entities mentioned so far in the conversation.',\n",
" 'Langchain': 'Langchain is a project that seeks to add more complex memory '\n",
" 'structures, including a key-value store for entities mentioned '\n",
" 'so far in the conversation.',\n",
" 'Sam': 'Sam is working on a hackathon project with Deven to add more complex '\n",
" 'memory structures to Langchain, including a key-value store for '\n",
" 'entities mentioned so far in the conversation. He is also the founder '\n",
" 'of a company called Daimon.'}\n"
"{'Daimon': 'Daimon is a company founded by Sam, a successful entrepreneur, who '\n",
" 'is working on a hackathon project with Deven to add more complex '\n",
" 'memory structures to Langchain.',\n",
" 'Deven': 'Deven is working on a hackathon project with Sam, which they are '\n",
" 'entering into a hackathon. They are trying to add more complex '\n",
" 'memory structures to Langchain, including a key-value store for '\n",
" 'entities mentioned so far in the conversation, and seem to be '\n",
" 'working hard on this project with a great idea for how the '\n",
" 'key-value store can help.',\n",
" 'Key-Value Store': 'A key-value store is being added to the project to store '\n",
" 'entities mentioned in the conversation.',\n",
" 'Langchain': 'Langchain is a project that is trying to add more complex '\n",
" 'memory structures, including a key-value store for entities '\n",
" 'mentioned so far in the conversation.',\n",
" 'Sam': 'Sam is working on a hackathon project with Deven, trying to add more '\n",
" 'complex memory structures to Langchain, including a key-value store '\n",
" 'for entities mentioned so far in the conversation. They seem to have '\n",
" 'a great idea for how the key-value store can help, and Sam is also '\n",
" 'the founder of a successful company called Daimon.'}\n"
]
}
],
"source": [
"from pprint import pprint\n",
"pprint(conversation.memory.store)"
"pprint(conversation.memory.entity_store.store)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 25,
"id": "dd547144",
"metadata": {},
"outputs": [
@@ -513,16 +524,16 @@
"Overall, you are a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether the human needs help with a specific question or just wants to have a conversation about a particular topic, you are here to assist.\n",
"\n",
"Context:\n",
"{'Sam': 'Sam is working on a hackathon project with Deven to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation. He is also the founder of a company called Daimon.', 'Daimon': 'Daimon is a company founded by Sam.'}\n",
"{'Deven': 'Deven is working on a hackathon project with Sam, which they are entering into a hackathon. They are trying to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation, and seem to be working hard on this project with a great idea for how the key-value store can help.', 'Sam': 'Sam is working on a hackathon project with Deven, trying to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation. They seem to have a great idea for how the key-value store can help, and Sam is also the founder of a successful company called Daimon.', 'Langchain': 'Langchain is a project that is trying to add more complex memory structures, including a key-value store for entities mentioned so far in the conversation.', 'Daimon': 'Daimon is a company founded by Sam, a successful entrepreneur, who is working on a hackathon project with Deven to add more complex memory structures to Langchain.'}\n",
"\n",
"Current conversation:\n",
"Human: They are adding in a key-value store for entities mentioned so far in the conversation.\n",
"AI: That sounds like a great idea! How will the key-value store work?\n",
"Human: What do you know about Deven & Sam?\n",
"AI: Deven and Sam are working on a hackathon project to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation. They seem to be very motivated and passionate about their project, and are working hard to make it a success.\n",
"AI: Deven and Sam are working on a hackathon project together, trying to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation. They seem to be working hard on this project and have a great idea for how the key-value store can help.\n",
"Human: Sam is the founder of a company called Daimon.\n",
"AI: \n",
"That's impressive! It sounds like Sam is a very successful entrepreneur. What kind of company is Daimon?\n",
"Human: Sam is the founder of a company called Daimon.\n",
"AI: That's impressive! It sounds like Sam is a very successful entrepreneur. What kind of company is Daimon?\n",
"Last line:\n",
"Human: What do you know about Sam?\n",
"You:\u001b[0m\n",
@@ -533,10 +544,10 @@
{
"data": {
"text/plain": [
"' Sam is the founder of a company called Daimon. He is also working on a hackathon project with Deven to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation. He seems to be very motivated and passionate about his project, and is working hard to make it a success.'"
"' Sam is the founder of a successful company called Daimon. He is also working on a hackathon project with Deven to add more complex memory structures to Langchain. They seem to have a great idea for how the key-value store can help.'"
]
},
"execution_count": 10,
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
@@ -570,7 +581,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.10"
}
},
"nbformat": 4,

View File

@@ -7,9 +7,11 @@
"source": [
"# VectorStore-Backed Memory\n",
"\n",
"`VectorStoreRetrieverMemory` stores interactions in a VectorDB and queries the top-K most \"salient\" interactions every type it is called.\n",
"`VectorStoreRetrieverMemory` stores memories in a VectorDB and queries the top-K most \"salient\" docs every time it is called.\n",
"\n",
"This differs from most of the other Memory classes in that "
"This differs from most of the other Memory classes in that it doesn't explicitly track the order of interactions.\n",
"\n",
"In this case, the \"docs\" are previous conversation snippets. This can be useful to refer to relevant pieces of information that the AI was told earlier in the conversation."
]
},
{
@@ -25,7 +27,8 @@
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.llms import OpenAI\n",
"from langchain.memory import VectorStoreRetrieverMemory\n",
"from langchain.chains import ConversationChain"
"from langchain.chains import ConversationChain\n",
"from langchain.prompts import PromptTemplate"
]
},
{
@@ -40,7 +43,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 29,
"id": "eef56f65",
"metadata": {
"tags": []
@@ -66,12 +69,12 @@
"source": [
"### Create your the VectorStoreRetrieverMemory\n",
"\n",
"The memory object is instantiated from "
"The memory object is instantiated from any VectorStoreRetriever."
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 30,
"id": "e00d4938",
"metadata": {
"tags": []
@@ -84,14 +87,14 @@
"memory = VectorStoreRetrieverMemory(retriever=retriever)\n",
"\n",
"# When added to an agent, the memory object can save pertinent information from conversations or used tools\n",
"memory.save_context({\"input\": \"check the latest scores of the Warriors game\"}, {\"output\": \"the Warriors are up against the Astros 88 to 84\"})\n",
"memory.save_context({\"input\": \"I need help doing my taxes - what's the standard deduction this year?\"}, {\"output\": \"...\"})\n",
"memory.save_context({\"input\": \"What's the the time?\"}, {\"output\": f\"It's {datetime.now()}\"}) # "
"memory.save_context({\"input\": \"My favorite food is pizza\"}, {\"output\": \"thats good to know\"})\n",
"memory.save_context({\"input\": \"My favorite sport is soccer\"}, {\"output\": \"...\"})\n",
"memory.save_context({\"input\": \"I don't the Celtics\"}, {\"output\": \"ok\"}) # "
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 31,
"id": "2fe28a28",
"metadata": {
"tags": []
@@ -101,7 +104,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"input: I need help doing my taxes - what's the standard deduction this year?\n",
"input: My favorite sport is soccer\n",
"output: ...\n"
]
}
@@ -109,7 +112,7 @@
"source": [
"# Notice the first result returned is the memory pertaining to tax help, which the language model deems more semantically relevant\n",
"# to a 1099 than the other documents, despite them both containing numbers.\n",
"print(memory.load_memory_variables({\"prompt\": \"What's a 1099?\"})[\"history\"])"
"print(memory.load_memory_variables({\"prompt\": \"what sport should i watch?\"})[\"history\"])"
]
},
{
@@ -123,7 +126,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 32,
"id": "ebd68c10",
"metadata": {
"tags": []
@@ -139,9 +142,13 @@
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mThe following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.\n",
"\n",
"Relevant pieces of previous conversation:\n",
"input: My favorite food is pizza\n",
"output: thats good to know\n",
"\n",
"(You do not need to use these pieces of information if not relevant)\n",
"\n",
"Current conversation:\n",
"input: I need help doing my taxes - what's the standard deduction this year?\n",
"output: ...\n",
"Human: Hi, my name is Perry, what's up?\n",
"AI:\u001b[0m\n",
"\n",
@@ -151,18 +158,32 @@
{
"data": {
"text/plain": [
"\" Hi Perry, my name is AI. I'm doing great, how about you? I understand you need help with your taxes. What specifically do you need help with?\""
"\" Hi Perry, I'm doing well. How about you?\""
]
},
"execution_count": 5,
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm = OpenAI(temperature=0) # Can be any valid LLM\n",
"_DEFAULT_TEMPLATE = \"\"\"The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.\n",
"\n",
"Relevant pieces of previous conversation:\n",
"{history}\n",
"\n",
"(You do not need to use these pieces of information if not relevant)\n",
"\n",
"Current conversation:\n",
"Human: {input}\n",
"AI:\"\"\"\n",
"PROMPT = PromptTemplate(\n",
" input_variables=[\"history\", \"input\"], template=_DEFAULT_TEMPLATE\n",
")\n",
"conversation_with_summary = ConversationChain(\n",
" llm=llm, \n",
" prompt=PROMPT,\n",
" # We set a very low max_token_limit for the purposes of testing.\n",
" memory=memory,\n",
" verbose=True\n",
@@ -172,7 +193,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 33,
"id": "86207a61",
"metadata": {
"tags": []
@@ -188,10 +209,14 @@
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mThe following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.\n",
"\n",
"Relevant pieces of previous conversation:\n",
"input: My favorite sport is soccer\n",
"output: ...\n",
"\n",
"(You do not need to use these pieces of information if not relevant)\n",
"\n",
"Current conversation:\n",
"input: check the latest scores of the Warriors game\n",
"output: the Warriors are up against the Astros 88 to 84\n",
"Human: If the Cavaliers were to face off against the Warriers or the Astros, who would they most stand a chance to beat?\n",
"Human: what's my favorite sport?\n",
"AI:\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
@@ -200,22 +225,22 @@
{
"data": {
"text/plain": [
"\" It's hard to say without knowing the current form of the teams. However, based on the current scores, it looks like the Cavaliers would have a better chance of beating the Astros than the Warriors.\""
"' You told me earlier that your favorite sport is soccer.'"
]
},
"execution_count": 6,
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Here, the basketball related content is surfaced\n",
"conversation_with_summary.predict(input=\"If the Cavaliers were to face off against the Warriers or the Astros, who would they most stand a chance to beat?\")"
"conversation_with_summary.predict(input=\"what's my favorite sport?\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 34,
"id": "8c669db1",
"metadata": {
"tags": []
@@ -231,10 +256,14 @@
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mThe following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.\n",
"\n",
"Relevant pieces of previous conversation:\n",
"input: My favorite food is pizza\n",
"output: thats good to know\n",
"\n",
"(You do not need to use these pieces of information if not relevant)\n",
"\n",
"Current conversation:\n",
"input: What's the the time?\n",
"output: It's 2023-04-13 09:18:55.623736\n",
"Human: What day is it tomorrow?\n",
"Human: Whats my favorite food\n",
"AI:\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
@@ -243,10 +272,10 @@
{
"data": {
"text/plain": [
"' Tomorrow is 2023-04-14.'"
"' You said your favorite food is pizza.'"
]
},
"execution_count": 7,
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
@@ -254,12 +283,12 @@
"source": [
"# Even though the language model is stateless, since relavent memory is fetched, it can \"reason\" about the time.\n",
"# Timestamping memories and data is useful in general to let the agent determine temporal relevance\n",
"conversation_with_summary.predict(input=\"What day is it tomorrow?\")"
"conversation_with_summary.predict(input=\"Whats my favorite food\")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 35,
"id": "8c09a239",
"metadata": {
"tags": []
@@ -275,10 +304,14 @@
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mThe following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.\n",
"\n",
"Current conversation:\n",
"Relevant pieces of previous conversation:\n",
"input: Hi, my name is Perry, what's up?\n",
"response: Hi Perry, my name is AI. I'm doing great, how about you? I understand you need help with your taxes. What specifically do you need help with?\n",
"Human: What's your name?\n",
"response: Hi Perry, I'm doing well. How about you?\n",
"\n",
"(You do not need to use these pieces of information if not relevant)\n",
"\n",
"Current conversation:\n",
"Human: What's my name?\n",
"AI:\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
@@ -287,10 +320,10 @@
{
"data": {
"text/plain": [
"\" My name is AI. It's nice to meet you, Perry.\""
"' Your name is Perry.'"
]
},
"execution_count": 8,
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
@@ -299,8 +332,16 @@
"# The memories from the conversation are automatically stored,\n",
"# since this query best matches the introduction chat above,\n",
"# the agent is able to 'remember' the user's name.\n",
"conversation_with_summary.predict(input=\"What's your name?\")"
"conversation_with_summary.predict(input=\"What's my name?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "df27c7dc",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -319,7 +360,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.9.1"
}
},
"nbformat": 4,

View File

@@ -115,7 +115,7 @@
"id": "a2d76826",
"metadata": {},
"source": [
"**The above request should now appear on your [PromptLayer dashboard](https://ww.promptlayer.com).**"
"**The above request should now appear on your [PromptLayer dashboard](https://www.promptlayer.com).**"
]
},
{

View File

@@ -43,22 +43,18 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Total Tokens: 39\n",
"Prompt Tokens: 4\n",
"Completion Tokens: 35\n",
"Tokens Used: 42\n",
"\tPrompt Tokens: 4\n",
"\tCompletion Tokens: 38\n",
"Successful Requests: 1\n",
"Total Cost (USD): $0.0007800000000000001\n"
"Total Cost (USD): $0.00084\n"
]
}
],
"source": [
"with get_openai_callback() as cb:\n",
" result = llm(\"Tell me a joke\")\n",
" print(f\"Total Tokens: {cb.total_tokens}\")\n",
" print(f\"Prompt Tokens: {cb.prompt_tokens}\")\n",
" print(f\"Completion Tokens: {cb.completion_tokens}\")\n",
" print(f\"Successful Requests: {cb.successful_requests}\")\n",
" print(f\"Total Cost (USD): ${cb.total_cost}\")"
" print(cb)"
]
},
{

View File

@@ -186,7 +186,7 @@
"source": [
"**Number of Tokens:** You can also estimate how many tokens a piece of text will be in that model. This is useful because models have a context length (and cost more for more tokens), which means you need to be aware of how long the text you are passing in is.\n",
"\n",
"Notice that by default the tokens are estimated using [tiktoken](https://github.com/openai/tiktoken) (except for legacy version <3.8, where a HuggingFace tokenizer is used)"
"Notice that by default the tokens are estimated using [tiktoken](https://github.com/openai/tiktoken) (except for legacy version <3.8, where a Hugging Face tokenizer is used)"
]
},
{

View File

@@ -6,14 +6,55 @@
"metadata": {},
"source": [
"# AI21\n",
"This example goes over how to use LangChain to interact with AI21 models"
"\n",
"[AI21 Studio](https://docs.ai21.com/) provides API access to `Jurassic-2` large language models.\n",
"\n",
"This example goes over how to use LangChain to interact with [AI21 models](https://docs.ai21.com/docs/jurassic-2-models)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "02be122d-04e8-4ec6-84d1-f1d8961d6828",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# install the package:\n",
"!pip install ai21"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "4229227e-6ca2-41ad-a3c3-5f29e3559091",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"# get AI21_API_KEY. Use https://studio.ai21.com/account/account\n",
"\n",
"from getpass import getpass\n",
"AI21_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "6fb585dd",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.llms import AI21\n",
@@ -22,9 +63,11 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 9,
"id": "035dea0f",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
@@ -36,19 +79,23 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 10,
"id": "3f3458d9",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = AI21()"
"llm = AI21(ai21_api_key=AI21_API_KEY)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 11,
"id": "a641dbd9",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
@@ -56,10 +103,23 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 12,
"id": "9f0b1960",
"metadata": {},
"outputs": [],
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'\\n1. What year was Justin Bieber born?\\nJustin Bieber was born in 1994.\\n2. What team won the Super Bowl in 1994?\\nThe Dallas Cowboys won the Super Bowl in 1994.'"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
@@ -91,7 +151,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -1,20 +1,61 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "9597802c",
"metadata": {},
"source": [
"# Aleph Alpha\n",
"\n",
"[The Luminous series](https://docs.aleph-alpha.com/docs/introduction/luminous/) is a family of large language models.\n",
"\n",
"This example goes over how to use LangChain to interact with Aleph Alpha models"
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "fe1bf9fb-e9fa-49f3-a768-8f603225ccce",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Install the package\n",
"!pip install aleph-alpha-client"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "0cb0f937-b610-42a2-b765-336eed037031",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"# create a new token: https://docs.aleph-alpha.com/docs/account/#create-a-new-token\n",
"\n",
"from getpass import getpass\n",
"\n",
"ALEPH_ALPHA_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "6fb585dd",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.llms import AlephAlpha\n",
@@ -23,9 +64,11 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 7,
"id": "f81a230d",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"template = \"\"\"Q: {question}\n",
@@ -37,19 +80,23 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 8,
"id": "f0d26e48",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = AlephAlpha(model=\"luminous-extended\", maximum_tokens=20, stop_sequences=[\"Q:\"])"
"llm = AlephAlpha(model=\"luminous-extended\", maximum_tokens=20, stop_sequences=[\"Q:\"], aleph_alpha_api_key=ALEPH_ALPHA_API_KEY)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 9,
"id": "6811d621",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
@@ -57,9 +104,11 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 10,
"id": "3058e63f",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
@@ -67,7 +116,7 @@
"' Artificial Intelligence (AI) is the simulation of human intelligence processes by machines, especially computer systems.\\n'"
]
},
"execution_count": 5,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
@@ -81,7 +130,7 @@
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -95,7 +144,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.6"
},
"vscode": {
"interpreter": {

View File

@@ -6,14 +6,46 @@
"metadata": {},
"source": [
"# Anthropic\n",
"This example goes over how to use LangChain to interact with Anthropic models"
"\n",
"[Anthropic](https://console.anthropic.com/docs) is creator of the `Claude` LLM.\n",
"\n",
"This example goes over how to use LangChain to interact with Anthropic models."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e55c0f2e-63e1-4e83-ac44-ffcc1dfeacc8",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Install the package\n",
"!pip install anthropic"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cec62d45-afa2-422a-95ef-57f8ab41a6f9",
"metadata": {},
"outputs": [],
"source": [
"# get a new token: https://www.anthropic.com/earlyaccess\n",
"\n",
"from getpass import getpass\n",
"\n",
"ANTHROPIC_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "6fb585dd",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.llms import Anthropic\n",
@@ -24,7 +56,9 @@
"cell_type": "code",
"execution_count": 2,
"id": "035dea0f",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
@@ -36,12 +70,14 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": null,
"id": "3f3458d9",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = Anthropic()"
"llm = Anthropic(anthropic_api_key=ANTHROPIC_API_KEY)"
]
},
{
@@ -102,7 +138,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -5,7 +5,7 @@
"id": "9e9b7651",
"metadata": {},
"source": [
"# Azure OpenAI LLM Example\n",
"# Azure OpenAI\n",
"\n",
"This notebook goes over how to use Langchain with [Azure OpenAI](https://aka.ms/azure-openai).\n",
"\n",
@@ -49,6 +49,18 @@
"```\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "89fdb593-5a42-4098-87b7-1496fa511b1c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install openai"
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -146,7 +158,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.6"
},
"vscode": {
"interpreter": {

View File

@@ -5,19 +5,50 @@
"metadata": {},
"source": [
"# Banana\n",
"\n",
"\n",
"[Banana](https://www.banana.dev/about-us) is focused on building the machine learning infrastructure.\n",
"\n",
"This example goes over how to use LangChain to interact with Banana models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Install the package https://docs.banana.dev/banana-docs/core-concepts/sdks/python\n",
"!pip install banana-dev"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# get new tokens: https://app.banana.dev/\n",
"# We need two tokens, not just an `api_key`: `BANANA_API_KEY` and `YOUR_MODEL_KEY`\n",
"\n",
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"BANANA_API_KEY\"] = \"YOUR_API_KEY\"\n",
"# OR\n",
"# BANANA_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import Banana\n",
"from langchain import PromptTemplate, LLMChain\n",
"os.environ[\"BANANA_API_KEY\"] = \"YOUR_API_KEY\""
"from langchain import PromptTemplate, LLMChain"
]
},
{
@@ -65,15 +96,22 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.12 ('palm')",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"version": "3.9.12"
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
@@ -81,5 +119,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -4,7 +4,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# CerebriumAI LLM Example\n",
"# CerebriumAI\n",
"\n",
"`Cerebrium` is an AWS Sagemaker alternative. It also provides API access to [several LLM models](https://docs.cerebrium.ai/cerebrium/prebuilt-models/deploymen).\n",
"\n",
"This notebook goes over how to use Langchain with [CerebriumAI](https://docs.cerebrium.ai/introduction)."
]
},
@@ -13,7 +16,7 @@
"metadata": {},
"source": [
"## Install cerebrium\n",
"The `cerebrium` package is required to use the CerebriumAI API. Install `cerebrium` using `pip3 install cerebrium`."
"The `cerebrium` package is required to use the `CerebriumAI` API. Install `cerebrium` using `pip3 install cerebrium`."
]
},
{
@@ -22,7 +25,8 @@
"metadata": {},
"outputs": [],
"source": [
"$ pip3 install cerebrium"
"# Install the package\n",
"!pip3 install cerebrium"
]
},
{
@@ -48,7 +52,7 @@
"metadata": {},
"source": [
"## Set the Environment API Key\n",
"Make sure to get your API key from CerebriumAI. You are given a 1 hour free of serverless GPU compute to test different models."
"Make sure to get your API key from CerebriumAI. See [here](https://dashboard.cerebrium.ai/login). You are given a 1 hour free of serverless GPU compute to test different models."
]
},
{
@@ -136,15 +140,22 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.12 ('palm')",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"version": "3.9.12"
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
@@ -152,5 +163,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -6,14 +6,56 @@
"metadata": {},
"source": [
"# Cohere\n",
"This example goes over how to use LangChain to interact with Cohere models"
"\n",
"[Cohere](https://cohere.ai/about) is a Canadian startup that provides natural language processing models that help companies improve human-machine interactions.\n",
"\n",
"This example goes over how to use LangChain to interact with `Cohere` [models](https://docs.cohere.ai/docs/generation-card)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "91ea14ce-831d-409a-a88f-30353acdabd1",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Install the package\n",
"!pip install cohere"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "3f5dc9d7-65e3-4b5b-9086-3327d016cfe0",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"# get a new token: https://dashboard.cohere.ai/\n",
"\n",
"from getpass import getpass\n",
"\n",
"COHERE_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "6fb585dd",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.llms import Cohere\n",
@@ -22,9 +64,11 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 4,
"id": "035dea0f",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
@@ -36,19 +80,23 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 6,
"id": "3f3458d9",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = Cohere()"
"llm = Cohere(cohere_api_key=COHERE_API_KEY)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 7,
"id": "a641dbd9",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
@@ -102,7 +150,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -1,11 +1,13 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# DeepInfra LLM Example\n",
"# DeepInfra\n",
"\n",
"`DeepInfra` provides [several LLMs](https://deepinfra.com/models).\n",
"\n",
"This notebook goes over how to use Langchain with [DeepInfra](https://deepinfra.com)."
]
},
@@ -18,8 +20,10 @@
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
@@ -32,17 +36,44 @@
"metadata": {},
"source": [
"## Set the Environment API Key\n",
"Make sure to get your API key from DeepInfra. You are given a 1 hour free of serverless GPU compute to test different models.\n",
"Make sure to get your API key from DeepInfra. You have to [Login](https://deepinfra.com/login?from=%2Fdash) and get a new token.\n",
"\n",
"You are given a 1 hour free of serverless GPU compute to test different models. (see [here](https://github.com/deepinfra/deepctl#deepctl))\n",
"You can print your token with `deepctl auth token`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"# get a new token: https://deepinfra.com/login?from=%2Fdash\n",
"\n",
"from getpass import getpass\n",
"\n",
"DEEPINFRA_API_TOKEN = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"os.environ[\"DEEPINFRA_API_TOKEN\"] = \"YOUR_KEY_HERE\""
"os.environ[\"DEEPINFRA_API_TOKEN\"] = DEEPINFRA_API_TOKEN"
]
},
{
@@ -50,7 +81,7 @@
"metadata": {},
"source": [
"## Create the DeepInfra instance\n",
"Make sure to deploy your model first via `deepctl deploy create -m google/flat-t5-xl` (for example)"
"Make sure to deploy your model first via `deepctl deploy create -m google/flat-t5-xl` (see [here](https://github.com/deepinfra/deepctl#deepctl))"
]
},
{
@@ -121,15 +152,22 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.12 ('palm')",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"version": "3.9.12"
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
@@ -137,5 +175,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -4,8 +4,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# ForefrontAI LLM Example\n",
"This notebook goes over how to use Langchain with [ForefrontAI](https://www.forefront.ai/)."
"# ForefrontAI\n",
"\n",
"\n",
"The `Forefront` platform gives you the ability to fine-tune and use [open source large language models](https://docs.forefront.ai/forefront/master/models).\n",
"\n",
"This notebook goes over how to use Langchain with [ForefrontAI](https://www.forefront.ai/).\n"
]
},
{
@@ -40,7 +44,20 @@
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"FOREFRONTAI_API_KEY\"] = \"YOUR_KEY_HERE\""
"# get a new token: https://docs.forefront.ai/forefront/api-reference/authentication\n",
"\n",
"from getpass import getpass\n",
"\n",
"FOREFRONTAI_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"FOREFRONTAI_API_KEY\"] = FOREFRONTAI_API_KEY"
]
},
{
@@ -119,15 +136,22 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.12 ('palm')",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"version": "3.9.12"
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
@@ -135,5 +159,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -4,8 +4,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# GooseAI LLM Example\n",
"This notebook goes over how to use Langchain with [GooseAI](https://goose.ai/)."
"# GooseAI\n",
"\n",
"`GooseAI` is a fully managed NLP-as-a-Service, delivered via API. GooseAI provides access to [these models](https://goose.ai/docs/models).\n",
"\n",
"This notebook goes over how to use Langchain with [GooseAI](https://goose.ai/).\n"
]
},
{
@@ -57,7 +60,18 @@
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"GOOSEAI_API_KEY\"] = \"YOUR_KEY_HERE\""
"from getpass import getpass\n",
"\n",
"GOOSEAI_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"GOOSEAI_API_KEY\"] = GOOSEAI_API_KEY"
]
},
{
@@ -136,15 +150,22 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.12 ('palm')",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"version": "3.9.12"
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
@@ -152,5 +173,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -6,22 +6,36 @@
"source": [
"# GPT4All\n",
"\n",
"This example goes over how to use LangChain to interact with GPT4All models"
"[GitHub:nomic-ai/gpt4all](https://github.com/nomic-ai/gpt4all) an ecosystem of open-source chatbots trained on a massive collections of clean assistant data including code, stories and dialogue.\n",
"\n",
"This example goes over how to use LangChain to interact with `GPT4All` models."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install pyllamacpp > /dev/null"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain import PromptTemplate, LLMChain\n",
@@ -32,8 +46,10 @@
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
@@ -51,6 +67,10 @@
"\n",
"To run locally, download a compatible ggml-formatted model. For more info, visit https://github.com/nomic-ai/pyllamacpp\n",
"\n",
"For full installation instructions go [here](https://gpt4all.io/index.html).\n",
"\n",
"The GPT4All Chat installer needs to decompress a 3GB LLM model during the installation process!\n",
"\n",
"Note that new models are uploaded regularly - check the link above for the most recent `.bin` URL"
]
},
@@ -146,9 +166,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -7,41 +7,243 @@
"source": [
"# Hugging Face Hub\n",
"\n",
"The [Hugging Face Hub](https://huggingface.co/docs/hub/index) is a platform with over 120k models, 20k datasets, and 50k demo apps (Spaces), all open source and publicly available, in an online platform where people can easily collaborate and build ML together.\n",
"\n",
"This example showcases how to connect to the Hugging Face Hub."
]
},
{
"cell_type": "markdown",
"id": "4c1b8450-5eaf-4d34-8341-2d785448a1ff",
"metadata": {
"tags": []
},
"source": [
"To use, you should have the ``huggingface_hub`` python [package installed](https://huggingface.co/docs/huggingface_hub/installation)."
]
},
{
"cell_type": "code",
"execution_count": 41,
"execution_count": null,
"id": "d772b637-de00-4663-bd77-9bc96d798db2",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install huggingface_hub > /dev/null"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d597a792-354c-4ca5-b483-5965eec5d63d",
"metadata": {},
"outputs": [],
"source": [
"# get a token: https://huggingface.co/docs/api-inference/quicktour#get-your-api-token\n",
"\n",
"from getpass import getpass\n",
"\n",
"HUGGINGFACEHUB_API_TOKEN = getpass()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b8c5b88c-e4b8-4d0d-9a35-6e8f106452c2",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"HUGGINGFACEHUB_API_TOKEN\"] = HUGGINGFACEHUB_API_TOKEN"
]
},
{
"cell_type": "markdown",
"id": "84dd44c1-c428-41f3-a911-520281386c94",
"metadata": {},
"source": [
"**Select a Model**"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "39c7eeac-01c4-486b-9480-e828a9e73e78",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain import HuggingFaceHub\n",
"\n",
"repo_id = \"google/flan-t5-xl\" # See https://huggingface.co/models?pipeline_tag=text-generation&sort=downloads for some other options\n",
"\n",
"llm = HuggingFaceHub(repo_id=repo_id, model_kwargs={\"temperature\":0, \"max_length\":64})"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3acf0069",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The FIFA World Cup is a football tournament that is played every 4 years. The year 1994 was the 44th FIFA World Cup. The final answer: Brazil.\n"
]
}
],
"outputs": [],
"source": [
"from langchain import PromptTemplate, HuggingFaceHub, LLMChain\n",
"from langchain import PromptTemplate, LLMChain\n",
"\n",
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n",
"llm_chain = LLMChain(prompt=prompt, llm=HuggingFaceHub(repo_id=\"google/flan-t5-xl\", model_kwargs={\"temperature\":0, \"max_length\":64}))\n",
"llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
"\n",
"question = \"Who won the FIFA World Cup in the year 1994? \"\n",
"\n",
"print(llm_chain.run(question))"
]
},
{
"cell_type": "markdown",
"id": "ddaa06cf-95ec-48ce-b0ab-d892a7909693",
"metadata": {},
"source": [
"## Examples\n",
"\n",
"Below are some examples of models you can access through the Hugging Face Hub integration."
]
},
{
"cell_type": "markdown",
"id": "4fa9337e-ccb5-4c52-9b7c-1653148bc256",
"metadata": {},
"source": [
"### StableLM, by Stability AI\n",
"\n",
"See [Stability AI's](https://huggingface.co/stabilityai) organization page for a list of available models."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "843a3837",
"id": "36a1ce01-bd46-451f-8ee6-61c8f4bd665a",
"metadata": {},
"outputs": [],
"source": [
"repo_id = \"stabilityai/stablelm-tuned-alpha-3b\"\n",
"# Others include stabilityai/stablelm-base-alpha-3b\n",
"# as well as 7B parameter versions"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b5654cea-60b0-4f40-ab34-06ba1eca810d",
"metadata": {},
"outputs": [],
"source": [
"llm = HuggingFaceHub(repo_id=repo_id, model_kwargs={\"temperature\":0, \"max_length\":64})"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2f19d0dc-c987-433f-a8d6-b1214e8ee067",
"metadata": {},
"outputs": [],
"source": [
"# Reuse the prompt and question from above.\n",
"llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
"print(llm_chain.run(question))"
]
},
{
"cell_type": "markdown",
"id": "1a5c97af-89bc-4e59-95c1-223742a9160b",
"metadata": {},
"source": [
"### Dolly, by DataBricks\n",
"\n",
"See [DataBricks](https://huggingface.co/databricks) organization page for a list of available models."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "521fcd2b-8e38-4920-b407-5c7d330411c9",
"metadata": {},
"outputs": [],
"source": [
"from langchain import HuggingFaceHub\n",
"\n",
"repo_id = \"databricks/dolly-v2-3b\"\n",
"\n",
"llm = HuggingFaceHub(repo_id=repo_id, model_kwargs={\"temperature\":0, \"max_length\":64})"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9907ec3a-fe0c-4543-81c4-d42f9453f16c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Reuse the prompt and question from above.\n",
"llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
"print(llm_chain.run(question))"
]
},
{
"cell_type": "markdown",
"id": "03f6ae52-b5f9-4de6-832c-551cb3fa11ae",
"metadata": {},
"source": [
"### Camel, by Writer\n",
"\n",
"See [Writer's](https://huggingface.co/Writer) organization page for a list of available models."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "257a091d-750b-4910-ac08-fe1c7b3fd98b",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain import HuggingFaceHub\n",
"\n",
"repo_id = \"Writer/camel-5b-hf\" # See https://huggingface.co/Writer for other options\n",
"llm = HuggingFaceHub(repo_id=repo_id, model_kwargs={\"temperature\":0, \"max_length\":64})"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b06f6838-a11a-4d6a-88e3-91fa1747a2b3",
"metadata": {},
"outputs": [],
"source": [
"# Reuse the prompt and question from above.\n",
"llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
"print(llm_chain.run(question))"
]
},
{
"cell_type": "markdown",
"id": "2bf838eb-1083-402f-b099-b07c452418c8",
"metadata": {},
"source": [
"**And many more!**"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "18c78880-65d7-41d0-9722-18090efb60e9",
"metadata": {},
"outputs": [],
"source": []
@@ -63,7 +265,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.12"
"version": "3.11.2"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,145 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "959300d4",
"metadata": {},
"source": [
"# Hugging Face Local Pipelines\n",
"\n",
"Hugging Face models can be run locally through the `HuggingFacePipeline` class.\n",
"\n",
"The [Hugging Face Model Hub](https://huggingface.co/models) hosts over 120k models, 20k datasets, and 50k demo apps (Spaces), all open source and publicly available, in an online platform where people can easily collaborate and build ML together.\n",
"\n",
"These can be called from LangChain either through this local pipeline wrapper or by calling their hosted inference endpoints through the HuggingFaceHub class. For more information on the hosted pipelines, see the [HugigngFaceHub](huggingface_hub.ipynb) notebook."
]
},
{
"cell_type": "markdown",
"id": "4c1b8450-5eaf-4d34-8341-2d785448a1ff",
"metadata": {
"tags": []
},
"source": [
"To use, you should have the ``transformers`` python [package installed](https://pypi.org/project/transformers/)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "d772b637-de00-4663-bd77-9bc96d798db2",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install transformers > /dev/null"
]
},
{
"cell_type": "markdown",
"id": "91ad075f-71d5-4bc8-ab91-cc0ad5ef16bb",
"metadata": {},
"source": [
"### Load the model"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "165ae236-962a-4763-8052-c4836d78a5d2",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Failed to default session, using empty session: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /sessions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x1117f9790>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
}
],
"source": [
"from langchain import HuggingFacePipeline\n",
"\n",
"llm = HuggingFacePipeline.from_model_id(model_id=\"bigscience/bloom-1b7\", task=\"text-generation\", model_kwargs={\"temperature\":0, \"max_length\":64})"
]
},
{
"cell_type": "markdown",
"id": "00104b27-0c15-4a97-b198-4512337ee211",
"metadata": {},
"source": [
"### Integrate the model in an LLMChain"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "3acf0069",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/wfh/code/lc/lckg/.venv/lib/python3.11/site-packages/transformers/generation/utils.py:1288: UserWarning: Using `max_length`'s default (64) to control the generation length. This behaviour is deprecated and will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.\n",
" warnings.warn(\n",
"WARNING:root:Failed to persist run: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /chain-runs (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x144d06910>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" First, we need to understand what is an electroencephalogram. An electroencephalogram is a recording of brain activity. It is a recording of brain activity that is made by placing electrodes on the scalp. The electrodes are placed\n"
]
}
],
"source": [
"from langchain import PromptTemplate, LLMChain\n",
"\n",
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n",
"\n",
"llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
"\n",
"question = \"What is electroencephalography?\"\n",
"\n",
"print(llm_chain.run(question))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "843a3837",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -6,22 +6,38 @@
"source": [
"# Llama-cpp\n",
"\n",
"This notebook goes over how to run llama-cpp within LangChain"
"[llama-cpp](https://github.com/abetlen/llama-cpp-python) is a Python binding for [llama.cpp](https://github.com/ggerganov/llama.cpp). \n",
"It supports [several LLMs](https://github.com/ggerganov/llama.cpp).\n",
"\n",
"This notebook goes over how to run `llama-cpp` within LangChain."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install llama-cpp-python"
]
},
{
"cell_type": "code",
"execution_count": 2,
"cell_type": "markdown",
"metadata": {},
"source": [
"Make sure you are following all instructions to [install all necessary model files](https://github.com/ggerganov/llama.cpp).\n",
"\n",
"You don't need an `API_TOKEN`!"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.llms import LlamaCpp\n",
@@ -30,8 +46,10 @@
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"execution_count": 4,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
@@ -44,7 +62,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = LlamaCpp(model_path=\"./ggml-model-q4_0.bin\")"
@@ -98,9 +118,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -15,14 +15,30 @@
"id": "59fcaebc",
"metadata": {},
"source": [
"For more detailed information on `manifest`, and how to use it with local hugginface models like in this example, see https://github.com/HazyResearch/manifest"
"For more detailed information on `manifest`, and how to use it with local hugginface models like in this example, see https://github.com/HazyResearch/manifest\n",
"\n",
"Another example of [using Manifest with Langchain](https://github.com/HazyResearch/manifest/blob/main/examples/langchain_chatgpt.ipynb)."
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": null,
"id": "1205d1e4-e6da-4d67-a0c7-b7e8fd1e98d5",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install manifest-ml"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "04a0170a",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from manifest import Manifest\n",
@@ -31,18 +47,12 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": null,
"id": "de250a6a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'model_name': 'bigscience/T0_3B', 'model_path': 'bigscience/T0_3B'}\n"
]
}
],
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"manifest = Manifest(\n",
" client_name = \"huggingface\",\n",
@@ -202,7 +212,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
},
"vscode": {
"interpreter": {

View File

@@ -5,7 +5,60 @@
"metadata": {},
"source": [
"# Modal\n",
"This example goes over how to use LangChain to interact with Modal models"
"\n",
"The [Modal Python Library](https://modal.com/docs/guide) provides convenient, on-demand access to serverless cloud compute from Python scripts on your local computer. \n",
"The `Modal` itself does not provide any LLMs but only the infrastructure.\n",
"\n",
"This example goes over how to use LangChain to interact with `Modal`.\n",
"\n",
"[Here](https://modal.com/docs/guide/ex/potus_speech_qanda) is another example how to use LangChain to interact with `Modal`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install modal-client"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[?25lLaunching login page in your browser window\u001b[33m...\u001b[0m\n",
"\u001b[2KIf this is not showing up, please copy this URL into your web browser manually:\n",
"\u001b[2Km⠙\u001b[0m Waiting for authentication in the web browser...\n",
"\u001b]8;id=417802;https://modal.com/token-flow/tf-ptEuGecm7T1T5YQe42kwM1\u001b\\\u001b[4;94mhttps://modal.com/token-flow/tf-ptEuGecm7T1T5YQe42kwM1\u001b[0m\u001b]8;;\u001b\\\n",
"\n",
"\u001b[2K\u001b[32m⠙\u001b[0m Waiting for authentication in the web browser...\n",
"\u001b[1A\u001b[2K^C\n",
"\n",
"\u001b[31mAborted.\u001b[0m\n"
]
}
],
"source": [
"# register and get a new token\n",
"\n",
"!modal token new"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Follow [these instructions](https://modal.com/docs/guide/secrets) to deal with secrets."
]
},
{
@@ -63,15 +116,22 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.12 ('palm')",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"version": "3.9.12"
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
@@ -79,5 +139,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -0,0 +1,171 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "9597802c",
"metadata": {},
"source": [
"# NLP Cloud\n",
"\n",
"The [NLP Cloud](https://nlpcloud.io) serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, grammar and spelling correction, keywords and keyphrases extraction, chatbot, product description and ad generation, intent classification, text generation, image generation, blog post generation, code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search, semantic similarity, tokenization, POS tagging, embeddings, and dependency parsing. It is ready for production, served through a REST API.\n",
"\n",
"\n",
"This example goes over how to use LangChain to interact with `NLP Cloud` [models](https://docs.nlpcloud.com/#models)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8e94b1ca-6e84-44c4-91ca-df7364c007f0",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install nlpcloud"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "ea7adb58-cabe-4a2c-b0a2-988fc3aac012",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"# get a token: https://docs.nlpcloud.com/#authentication\n",
"\n",
"from getpass import getpass\n",
"\n",
"NLPCLOUD_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "9cc2d68f-52a8-4a11-ba34-bb6c068e0b6a",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"NLPCLOUD_API_KEY\"] = NLPCLOUD_API_KEY"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "6fb585dd",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.llms import NLPCloud\n",
"from langchain import PromptTemplate, LLMChain"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "035dea0f",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "3f3458d9",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = NLPCloud()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "a641dbd9",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "9f844993",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"' Justin Bieber was born in 1994, so the team that won the Super Bowl that year was the San Francisco 49ers.'"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"llm_chain.run(question)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -6,14 +6,57 @@
"metadata": {},
"source": [
"# OpenAI\n",
"This example goes over how to use LangChain to interact with OpenAI models"
"\n",
"[OpenAI](https://platform.openai.com/docs/introduction) offers a spectrum of models with different levels of power suitable for different tasks.\n",
"\n",
"This example goes over how to use LangChain to interact with `OpenAI` [models](https://platform.openai.com/docs/models)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 4,
"id": "5d71df86-8a17-4283-83d7-4e46e7c06c44",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"# get a token: https://platform.openai.com/account/api-keys\n",
"\n",
"from getpass import getpass\n",
"\n",
"OPENAI_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "5472a7cd-af26-48ca-ae9b-5f6ae73c74d2",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "6fb585dd",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.llms import OpenAI\n",
@@ -22,9 +65,11 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 7,
"id": "035dea0f",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
@@ -36,9 +81,11 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 8,
"id": "3f3458d9",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = OpenAI()"
@@ -46,9 +93,11 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 9,
"id": "a641dbd9",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
@@ -56,17 +105,19 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 10,
"id": "9f844993",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"' Justin Bieber was born in 1994, so the NFL team that won the Super Bowl in that year was the Dallas Cowboys.'"
"' Justin Bieber was born in 1994, so we are looking for the Super Bowl winner from that year. The Super Bowl in 1994 was Super Bowl XXVIII, and the winner was the Dallas Cowboys.'"
]
},
"execution_count": 5,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
@@ -94,7 +145,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
},
"vscode": {
"interpreter": {

View File

@@ -4,7 +4,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Petals LLM Example\n",
"# Petals\n",
"\n",
"`Petals` runs 100B+ language models at home, BitTorrent-style.\n",
"\n",
"This notebook goes over how to use Langchain with [Petals](https://github.com/bigscience-workshop/petals)."
]
},
@@ -22,7 +25,7 @@
"metadata": {},
"outputs": [],
"source": [
"$ pip3 install petals"
"!pip3 install petals"
]
},
{
@@ -34,7 +37,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
@@ -48,16 +51,37 @@
"metadata": {},
"source": [
"## Set the Environment API Key\n",
"Make sure to get your API key from Huggingface."
"Make sure to get [your API key](https://huggingface.co/docs/api-inference/quicktour#get-your-api-token) from Huggingface."
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"from getpass import getpass\n",
"\n",
"HUGGINGFACE_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"HUGGINGFACE_API_KEY\"] = \"YOUR_KEY_HERE\""
"os.environ[\"HUGGINGFACE_API_KEY\"] = HUGGINGFACE_API_KEY"
]
},
{
@@ -72,8 +96,18 @@
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Downloading: 1%|▏ | 40.8M/7.19G [00:24<15:44, 7.57MB/s]"
]
}
],
"source": [
"# this can take several minutes to download big files!\n",
"\n",
"llm = Petals(model_name=\"bigscience/bloom-petals\")"
]
},
@@ -150,7 +184,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
},
"vscode": {
"interpreter": {
@@ -159,5 +193,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -1,18 +1,23 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "959300d4",
"metadata": {},
"source": [
"# PromptLayer OpenAI\n",
"\n",
"This example showcases how to connect to [PromptLayer](https://www.promptlayer.com) to start recording your OpenAI requests."
"`PromptLayer` is the first platform that allows you to track, manage, and share your GPT prompt engineering. `PromptLayer` acts a middleware between your code and `OpenAIs` python library.\n",
"\n",
"`PromptLayer` records all your `OpenAI API` requests, allowing you to search and explore request history in the `PromptLayer` dashboard.\n",
"\n",
"\n",
"This example showcases how to connect to [PromptLayer](https://www.promptlayer.com) to start recording your OpenAI requests.\n",
"\n",
"Another example is [here](https://python.langchain.com/en/latest/ecosystem/promptlayer.html)."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6a45943e",
"metadata": {},
@@ -26,13 +31,14 @@
"execution_count": null,
"id": "dbe09bd8",
"metadata": {
"tags": [],
"vscode": {
"languageId": "powershell"
}
},
"outputs": [],
"source": [
"pip install promptlayer"
"!pip install promptlayer"
]
},
{
@@ -45,9 +51,11 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 3,
"id": "c16da3b5",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
@@ -56,7 +64,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "8564ce7d",
"metadata": {},
@@ -64,21 +71,80 @@
"## Set the Environment API Key\n",
"You can create a PromptLayer API Key at [www.promptlayer.com](https://www.promptlayer.com) by clicking the settings cog in the navbar.\n",
"\n",
"Set it as an environment variable called `PROMPTLAYER_API_KEY`."
"Set it as an environment variable called `PROMPTLAYER_API_KEY`.\n",
"\n",
"You also need an OpenAI Key, called `OPENAI_API_KEY`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "46ba25dc",
"metadata": {},
"outputs": [],
"execution_count": 2,
"id": "1df96674-a9fb-4126-bb87-541082782240",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"os.environ[\"PROMPTLAYER_API_KEY\"] = \"********\""
"from getpass import getpass\n",
"\n",
"PROMPTLAYER_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "46ba25dc",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"os.environ[\"PROMPTLAYER_API_KEY\"] = PROMPTLAYER_API_KEY"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "9aa68c46-4d88-45ba-8a83-18fa41b4daed",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"from getpass import getpass\n",
"\n",
"OPENAI_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "6023b6fa-d9db-49d6-b713-0e19686119b0",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "bf0294de",
"metadata": {},
@@ -89,28 +155,18 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": null,
"id": "3acf0069",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' to go outside\\n\\nUnfortunately, cats cannot go outside without being supervised by a human. Going outside can be dangerous for cats, as they may come into contact with cars, other animals, or other dangers. If you want to go outside, ask your human to take you on a supervised walk or to a safe, enclosed outdoor space.'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = PromptLayerOpenAI(pl_tags=[\"langchain\"])\n",
"llm(\"I am a cat and I want\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "a2d76826",
"metadata": {},
@@ -119,7 +175,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "05e9e2fe",
"metadata": {},
@@ -144,7 +199,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "7eb19139",
"metadata": {},
@@ -156,7 +210,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "base",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -170,7 +224,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8 (default, Apr 13 2021, 12:59:45) \n[Clang 10.0.0 ]"
"version": "3.10.6"
},
"vscode": {
"interpreter": {

View File

@@ -5,20 +5,10 @@
"metadata": {},
"source": [
"# Replicate\n",
"This example goes over how to use LangChain to interact with Replicate models"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from langchain.llms import Replicate\n",
"from langchain import PromptTemplate, LLMChain\n",
"\n",
"os.environ[\"REPLICATE_API_TOKEN\"] = \"YOUR REPLICATE API TOKEN\""
">[Replicate](https://replicate.com/blog/machine-learning-needs-better-tools) runs machine learning models in the cloud. We have a library of open-source models that you can run with a few lines of code. If you're building your own machine learning models, Replicate makes it easy to deploy them at scale.\n",
"\n",
"This example goes over how to use LangChain to interact with `Replicate` [models](https://replicate.com/explore)"
]
},
{
@@ -35,6 +25,65 @@
"To run this notebook, you'll need to create a [replicate](https://replicate.com) account and install the [replicate python client](https://github.com/replicate/replicate-python)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install replicate"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"# get a token: https://replicate.com/account\n",
"\n",
"from getpass import getpass\n",
"\n",
"REPLICATE_API_TOKEN = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"REPLICATE_API_TOKEN\"] = REPLICATE_API_TOKEN"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.llms import Replicate\n",
"from langchain import PromptTemplate, LLMChain"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -58,8 +107,10 @@
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"execution_count": 5,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = Replicate(model=\"daanelson/flan-t5:04e422a9b85baed86a4f24981d7f9953e20c5fd82f6103b74ebc431588e1cec8\")"
@@ -339,7 +390,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
},
"vscode": {
"interpreter": {
@@ -348,5 +399,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -5,18 +5,43 @@
"id": "9597802c",
"metadata": {},
"source": [
"# Self-Hosted Models via Runhouse\n",
"# Runhouse\n",
"\n",
"The [Runhouse](https://github.com/run-house/runhouse) allows remote compute and data across environments and users. See the [Runhouse docs](https://runhouse-docs.readthedocs-hosted.com/en/latest/).\n",
"\n",
"This example goes over how to use LangChain and [Runhouse](https://github.com/run-house/runhouse) to interact with models hosted on your own GPU, or on-demand GPUs on AWS, GCP, AWS, or Lambda.\n",
"\n",
"For more information, see [Runhouse](https://github.com/run-house/runhouse) or the [Runhouse docs](https://runhouse-docs.readthedocs-hosted.com/en/latest/)."
"**Note**: Code uses `SelfHosted` name instead of the `Runhouse`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6fb585dd",
"metadata": {},
"id": "6066fede-2300-4173-9722-6f01f4fa34b4",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install runhouse"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "6fb585dd",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"INFO | 2023-04-17 16:47:36,173 | No auth token provided, so not using RNS API to save and load configs\n"
]
}
],
"source": [
"from langchain.llms import SelfHostedPipeline, SelfHostedHuggingFaceLLM\n",
"from langchain import PromptTemplate, LLMChain\n",
@@ -25,9 +50,11 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 4,
"id": "06d6866e",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# For an on-demand A100 with GCP, Azure, or Lambda\n",
@@ -44,9 +71,11 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 5,
"id": "035dea0f",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
@@ -60,7 +89,9 @@
"cell_type": "code",
"execution_count": null,
"id": "3f3458d9",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = SelfHostedHuggingFaceLLM(model_id=\"gpt2\", hardware=gpu, model_reqs=[\"pip:./\", \"transformers\", \"torch\"])"
@@ -288,7 +319,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.15"
"version": "3.10.6"
}
},
"nbformat": 4,

View File

@@ -6,22 +6,56 @@
"source": [
"# SageMakerEndpoint\n",
"\n",
"This notebooks goes over how to use an LLM hosted on a SageMaker endpoint."
"[Amazon SageMaker](https://aws.amazon.com/sagemaker/) is a system that can build, train, and deploy machine learning (ML) models for any use case with fully managed infrastructure, tools, and workflows.\n",
"\n",
"This notebooks goes over how to use an LLM hosted on a `SageMaker endpoint`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip3 install langchain boto3"
]
},
{
"cell_type": "code",
"execution_count": null,
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set up"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You have to set up following required parameters of the `SagemakerEndpoint` call:\n",
"- `endpoint_name`: The name of the endpoint from the deployed Sagemaker model.\n",
" Must be unique within an AWS Region.\n",
"- `credentials_profile_name`: The name of the profile in the ~/.aws/credentials or ~/.aws/config files, which\n",
" has either access keys or role information specified.\n",
" If not specified, the default credential profile or, if on an EC2 instance,\n",
" credentials from IMDS will be used.\n",
" See: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.docstore.document import Document"
@@ -29,8 +63,10 @@
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 3,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"example_doc_1 = \"\"\"\n",
@@ -49,7 +85,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from typing import Dict\n",
@@ -118,7 +156,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
},
"vscode": {
"interpreter": {
@@ -127,5 +165,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -5,13 +5,78 @@
"metadata": {},
"source": [
"# StochasticAI\n",
"This example goes over how to use LangChain to interact with StochasticAI models"
"\n",
">[Stochastic Acceleration Platform](https://docs.stochastic.ai/docs/introduction/) aims to simplify the life cycle of a Deep Learning model. From uploading and versioning the model, through training, compression and acceleration to putting it into production.\n",
"\n",
"This example goes over how to use LangChain to interact with `StochasticAI` models."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You have to get the API_KEY and the API_URL [here](https://app.stochastic.ai/workspace/profile/settings?tab=profile)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 3,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"from getpass import getpass\n",
"\n",
"STOCHASTICAI_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"STOCHASTICAI_API_KEY\"] = STOCHASTICAI_API_KEY"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"YOUR_API_URL = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.llms import StochasticAI\n",
@@ -20,8 +85,10 @@
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 6,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
@@ -33,17 +100,21 @@
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 11,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = StochasticAI(api_url=\"YOUR_API_URL\")"
"llm = StochasticAI(api_url=YOUR_API_URL)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 12,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
@@ -51,27 +122,54 @@
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"execution_count": 13,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"\"\\n\\nStep 1: In 1999, the St. Louis Rams won the Super Bowl.\\n\\nStep 2: In 1999, Beiber was born.\\n\\nStep 3: The Rams were in Los Angeles at the time.\\n\\nStep 4: So they didn't play in the Super Bowl that year.\\n\""
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"llm_chain.run(question)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.12 ('palm')",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"version": "3.9.12"
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
@@ -79,5 +177,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

Some files were not shown because too many files have changed in this diff Show More