Compare commits

...

81 Commits

Author SHA1 Message Date
Eugene Yurtsev
910d4d00a7 x 2023-09-27 16:18:12 -04:00
Eugene Yurtsev
4cef01adf7 x 2023-09-27 16:11:00 -04:00
Eugene Yurtsev
6b945c3091 x 2023-09-27 16:10:22 -04:00
Eugene Yurtsev
17a383d31f x 2023-09-27 15:55:48 -04:00
Harrison Chase
e355606b11 add more import checks (#11033) 2023-09-27 11:17:12 -07:00
Dan Bolser
efb7c459a2 Update base.py (#10843)
Fixing a typo in the example code in the docstring...

You have to start somewhere though right?

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-09-27 11:15:58 -07:00
Jeremy Naccache
c59a5bae48 Fix intermediate steps example in docs : replaced json.dumps with Langchain's dumps() (#10593)
The intermediate steps example in docs has an example on how to retrieve
and display the intermediate steps.
But the intermediate steps object is of type AgentAction which cannot be
passed to json.dumps (it raises an error).
I replaced it with Langchain's dumps function (from langchain.load.dump
import dumps) which is the preferred way to do so.
2023-09-27 11:00:29 -07:00
tanujtiwari-at
a79f595543 Support extra tools argument for pandas agent toolkit (#11040)
**Description** 

We support adding new tools in some toolkits already like the [SQLAgent
toolkit](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/agents/agent_toolkits/sql/base.py#L27).

Related
[SO](https://stackoverflow.com/questions/76583163/are-langchain-toolkits-able-to-be-modified-can-we-add-tools-to-a-pandas-datafra)
thread
This replicates the same functionality here, so users can add custom
bespoke tools.
2023-09-27 10:57:04 -07:00
Aashish Saini
c4471d1877 Fixing some spelling mistakes (#10881)
@baskaryan

---------

Co-authored-by: AashutoshPathakShorthillsAI <142410372+AashutoshPathakShorthillsAI@users.noreply.github.com>
Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com>
Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com>
Co-authored-by: ManpreetShorthillsAI <142380984+ManpreetShorthillsAI@users.noreply.github.com>
Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com>
Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com>
Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com>
Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com>
Co-authored-by: AmitSinghShorthillsAI <142410046+AmitSinghShorthillsAI@users.noreply.github.com>
Co-authored-by: Md Nazish Arman <142379599+MdNazishArmanShorthillsAI@users.noreply.github.com>
Co-authored-by: KamalSharmaShorthillsAI <142474019+KamalSharmaShorthillsAI@users.noreply.github.com>
Co-authored-by: Lakshya <lakshyagupta87@yahoo.com>
Co-authored-by: AnujMauryaShorthillsAI <142393269+AnujMauryaShorthillsAI@users.noreply.github.com>
Co-authored-by: Saransh Sharma <142397365+SaranshSharmaShorthillsAI@users.noreply.github.com>
Co-authored-by: GhayurHamzaShorthillsAI <136243850+GhayurHamzaShorthillsAI@users.noreply.github.com>
Co-authored-by: Puneet Dhiman <142409038+PuneetDhimanShorthillsAI@users.noreply.github.com>
Co-authored-by: Riya Rana <142411643+RiyaRanaShorthillsAI@users.noreply.github.com>
Co-authored-by: Akshay Tripathi <142379735+AkshayTripathiShorthillsAI@users.noreply.github.com>
2023-09-27 10:56:51 -07:00
Bagatur
410ac8129d bump 303 (#11120) 2023-09-27 08:30:33 -07:00
Bagatur
8e4dbae428 Add fireworks chat model (#11117) 2023-09-27 08:22:12 -07:00
Bagatur
657581dbdf Fix ChatFireworks typing 2023-09-27 08:15:40 -07:00
Bagatur
12aad659dd add ChatFireworks to chat_models 2023-09-27 08:11:26 -07:00
Bagatur
872ebdaf90 remove FireworksChat from llms 2023-09-27 08:10:41 -07:00
Bagatur
9451240941 Fix fireworks chat linting issues 2023-09-27 08:09:33 -07:00
Harrison Chase
6b4928ad96 fix-lcel-notebooks (#11111)
fix some missing imports/naming
2023-09-27 06:36:11 -07:00
Tomáš Dvořák
865a21938c speed up enforce_stop_tokens helper function (#10984)
**Description:**

As long as `enforce_stop_tokens` returns a first occurrence, we can
speed up the execution by setting the optional `maxsplit` parameter to
1.

Tag maintainer:
@agola11
@hwchase17

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-27 05:29:29 -07:00
Austin Walker
bb41252dab fix: bump min_unstructured_version for UnstructuredAPIFileLoader (#11025)
**Description:** New metadata fields were added to
`unstructured==0.10.15`, and our hosted api has been updated to reflect
this. When users call `partition_via_api` with an older version of the
library, they'll hit a parsing error related to the new fields.
2023-09-27 05:28:06 -07:00
William FH
75b3893daf Fix runnable branch callbacks (#11091)
We aren't calling on_chain_end here unless we use the default option
2023-09-27 11:38:56 +01:00
Bagatur
6c5251feb0 poetry 2023-09-26 20:12:49 -07:00
Bagatur
5310184f96 poetry 2023-09-26 20:12:29 -07:00
Cynthia Yang
6dd44ff1c0 Refactor Fireworks and add ChatFireworks (#3) (#10597)
Description 
* Refactor Fireworks within Langchain LLMs.
* Remove FireworksChat within Langchain LLMs.
* Add ChatFireworks (which uses chat completion api) to Langchain chat
models.
* Users have to install `fireworks-ai` and register an api key to use
the api.

Issue - Not applicable
Dependencies - None
Tag maintainer - @rlancemartin @baskaryan
2023-09-26 20:11:55 -07:00
Bagatur
5514ebe859 Don't type chains in output_parsers (#11092)
Can't use TYPE_CHECKING style imports for pydantic params because it will try to instantiate the typed object by default.
2023-09-26 17:49:35 -07:00
CG80499
64385c4eae Make pairwise comparison chain more like LLM as a judge (#11013)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:**: Adds LLM as a judge as an eval chain
  - **Tag maintainer:** @hwchase17 

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
2023-09-26 13:19:04 -07:00
Joseph McElroy
175ef0a55d [ElasticsearchStore] Enable custom Bulk Args (#11065)
This enables bulk args like `chunk_size` to be passed down from the
ingest methods (from_text, from_documents) to be passed down to the bulk
API.

This helps alleviate issues where bulk importing a large amount of
documents into Elasticsearch was resulting in a timeout.

Contribution Shoutout
- @elastic

- [x] Updated Integration tests

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-26 12:53:50 -07:00
Eugene Yurtsev
d19fd0cfae LogEntry/LogStream use str instead of uuid for id (#11080)
Cast the UUID to a string
2023-09-26 20:38:51 +01:00
Bagatur
d85339b9f2 extract sublinks exclude by abs path (#11079) 2023-09-26 12:26:27 -07:00
Bagatur
7ee8b2d1bf exclude dirs in async recursive loading (#11077) 2023-09-26 09:59:04 -07:00
Leonid Ganeline
21199cc7b4 📖 docs: fixed integrations/document loaders toc (#9281)
Fixed navbar:
- renamed several files, so ToC is sorted correctly
- made ToC items consistent: formatted several Titles
- added several links
- reformatted several docs to a consistent format
- renamed several files (removed `_example` suffix)
- added renamed files to the `docs/docs_skeleton/vercel.json`
2023-09-26 09:47:37 -07:00
Bagatur
0ea384d575 fix multiple chains lcel how to (#11074) 2023-09-26 08:39:02 -07:00
Bagatur
12fb393a43 bump 302 (#11070) 2023-09-26 08:13:01 -07:00
Bagatur
097ecef06b refactor web base loader (#11057) 2023-09-26 08:11:31 -07:00
Bagatur
487611521d fix root import (#11072) 2023-09-26 08:11:16 -07:00
Bagatur
a2f7246f0e skip excluded sublinks before recursion (#11036) 2023-09-26 02:24:54 -07:00
William FH
9c5eca92e4 Update notebook deps (#11053) 2023-09-25 22:41:29 -07:00
William FH
448426a6ac Add collab link (#11052) 2023-09-25 22:35:25 -07:00
William FH
4aec587979 Update LangSmith Walkthrough (#11043) 2023-09-25 22:32:56 -07:00
Harrison Chase
bea78b3271 make warnings more modular (#11047) 2023-09-25 20:46:43 -07:00
Harrison Chase
c87e9fb2ce conditional imports (#11017) 2023-09-25 15:46:32 -07:00
Tomaz Bratanic
0625ab7a9e Filtering graph schema for Cypher generation (#10577)
Sometimes you don't want the LLM to be aware of the whole graph schema,
and want it to ignore parts of the graph when it is constructing Cypher
statements.
2023-09-25 14:14:15 -07:00
Palau
89ef440c14 Kay retriever (#10657)
- **Description**: Adding retrievers for [kay.ai](https://kay.ai) and
SEC filings powered by Kay and Cybersyn. Kay provides context as a
service: it's an API built for RAG.
- **Issue**: N/A
- **Dependencies**: Just added a dep to the
[kay](https://pypi.org/project/kay/) package
- **Tag maintainer**: @baskaryan @hwchase17 Discussed in slack
- **Twtter handle:** [@vishalrohra_](https://twitter.com/vishalrohra_)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-25 13:10:13 -07:00
Harrison Chase
5f13668fa0 Harrison/move vectorstore base (#11030) 2023-09-25 12:44:23 -07:00
Bagatur
3eb79580c2 fix langsmith link in docs (#11027) 2023-09-25 12:05:08 -07:00
Jacob Lee
6d072e97c8 Adds GA to docs (#11022)
CC @baskaryan
2023-09-25 11:54:32 -07:00
Eugene Yurtsev
af5390d416 Add a batch size for cleanup (#10948)
Add pagination to indexing cleanup to deal with large numbers of
documents that need to be deleted.
2023-09-25 14:52:32 -04:00
Eugene Yurtsev
09486ed188 Update Serializable to use classmethods (#10956) 2023-09-25 18:39:30 +01:00
Taqi Jaffri
b7290f01d8 Batching for hf_pipeline (#10795)
The huggingface pipeline in langchain (used for locally hosted models)
does not support batching. If you send in a batch of prompts, it just
processes them serially using the base implementation of _generate:
https://github.com/docugami/langchain/blob/master/libs/langchain/langchain/llms/base.py#L1004C2-L1004C29

This PR adds support for batching in this pipeline, so that GPUs can be
fully saturated. I updated the accompanying notebook to show GPU batch
inference.

---------

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
2023-09-25 18:23:11 +01:00
Bagatur
aa6e6db8c7 bump 301 (#11018) 2023-09-25 08:50:47 -07:00
Nuno Campos
956ee981c0 Fix issue where requests wrapper passes auth kwarg twice (#11010)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

Closes #8842
2023-09-25 15:45:04 +01:00
Scotty
88a02076af fix ChatMessageChunk concat error (#10174)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. These live is docs/extras
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17, @rlancemartin.
 -->

- Description: fix `ChatMessageChunk` concat error 
- Issue: #10173 
- Dependencies: None
- Tag maintainer: @baskaryan, @eyurtsev, @rlancemartin
- Twitter handle: None

---------

Co-authored-by: wangshuai.scotty <wangshuai.scotty@bytedance.com>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-09-25 11:17:11 +01:00
Massimiliano Pronesti
4322b246aa docs: add vLLM chat notebook (#10993)
This PR aims at showcasing how to use vLLM's OpenAI-compatible chat API.

### Context
Lanchain already supports vLLM and its OpenAI-compatible `Completion`
API. However, the `ChatCompletion` API was not aligned with OpenAI and
for this reason I've waited for this
[PR](https://github.com/vllm-project/vllm/pull/852) to be merged before
adding this notebook to langchain.
2023-09-24 18:23:19 -07:00
Naveen Tatikonda
b0f21e2b50 [OpenSearch] Pass ids using from_texts and indexname in add_texts and search (#10969)
### Description
This PR makes the following changes to OpenSearch:
1. Pass optional ids with `from_texts`
2. Pass an optional index name with `add_texts` and `search` instead of
using the same index name that was used during `from_texts`

### Issue
https://github.com/langchain-ai/langchain/issues/10967

### Maintainers
@rlancemartin, @eyurtsev, @navneet1v

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
2023-09-23 16:12:51 -07:00
deanchanter
f945426874 Resolve GHI 10674 (#10977) 2023-09-23 16:11:52 -07:00
Anar
ff732e10f8 LLMRails Embedding (#10959)
LLMRails  Embedding Integration
This PR provides integration with LLMRails. Implemented here are:

langchain/embeddings/llm_rails.py
docs/extras/integrations/text_embedding/llm_rails.ipynb


Hi @hwchase17 after adding our vectorstore integration to langchain with
confirmation of you and @baskaryan, now we want to add our embedding
integration

---------

Co-authored-by: Anar Aliyev <aaliyev@mgmt.cloudnet.services>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-23 16:11:02 -07:00
Michael Feil
94e31647bd Support for Gradient.ai embedding (#10968)
Adds support for gradient.ai's embedding model.

This will remain a Draft, as the code will likely be refactored with the
`pip install gradientai` python sdk.
2023-09-23 16:10:23 -07:00
Bagatur
5fd13c22ad redirect mrkl (#10979) 2023-09-23 16:09:13 -07:00
C.J. Jameson
05d5fcfdf8 fix make-coverage local invocation #10941 (#10974)
Fix the invocation of `make coverage` in `libs/langchain`

Fixes #10941
2023-09-23 16:03:53 -07:00
Bagatur
040d436b3f Add vertex scheduled test (#10958) 2023-09-23 15:51:59 -07:00
Piyush Jain
8602a32b7e Fixes error with providers that don't have model_id (#10966)
## Description
Fixes error with using the chain for providers that don't have
`model_id` field.


![image](https://github.com/langchain-ai/langchain/assets/289369/a86074cf-6c99-4390-a135-b3af7a4f0827)
2023-09-23 15:34:28 -07:00
Nuno Campos
7b13292e35 Remove python eval from vector sql db chain (#10937)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
2023-09-23 08:51:03 -07:00
Richard Wang
b809c243af Fix bug in index api (#10614)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

- **Description:** a fix for `index`.
- **Issue:** Not applicable.
- **Dependencies:** None
- **Tag maintainer:** 
- **Twitter handle:** richarddwang

# Problem
Replication code
```python
from pprint import pprint
from langchain.embeddings import OpenAIEmbeddings
from langchain.indexes import SQLRecordManager, index
from langchain.schema import Document
from langchain.vectorstores import Qdrant
from langchain_setup.qdrant import pprint_qdrant_documents, create_inmemory_empty_qdrant

# Documents
metadata1 = {"source": "fullhell.alchemist"}
doc1_1 = Document(page_content="1-1 I have a dog~", metadata=metadata1)
doc1_2 = Document(page_content="1-2 I have a daugter~", metadata=metadata1)
doc1_3 = Document(page_content="1-3 Ahh! O..Oniichan", metadata=metadata1)
doc2 = Document(page_content="2 Lancer died again.", metadata={"source": "fate.docx"})

# Create empty vectorstore
collection_name = "secret_of_D_disk"
vectorstore: Qdrant = create_inmemory_empty_qdrant()

# Create record Manager
import tempfile
from pathlib import Path

record_manager = SQLRecordManager(
    namespace="qdrant/{collection_name}",
    db_url=f"sqlite:///{Path(tempfile.gettempdir())/collection_name}.sql",
)
record_manager.create_schema()  # 必須

sync_result = index(
    [doc1_1, doc1_2, doc1_2, doc2],
    record_manager,
    vectorstore,
    cleanup="full",
    source_id_key="source",
)
print(sync_result, end="\n\n")
pprint_qdrant_documents(vectorstore)
```
<details>
<summary>Code of helper functions `pprint_qdrant_documents` and
`create_inmemory_empty_qdrant`</summary>

```python
def create_inmemory_empty_qdrant(**from_texts_kwargs):
    # Qdrant requires vector size, which can be only know after applying embedder
    vectorstore = Qdrant.from_texts(["dummy"], location=":memory:", embedding=OpenAIEmbeddings(), **from_texts_kwargs)
    dummy_document_id = vectorstore.client.scroll(vectorstore.collection_name)[0][0].id
    vectorstore.delete([dummy_document_id])
    return vectorstore

def pprint_qdrant_documents(vectorstore, limit: int = 100, **scroll_kwargs):
    document_ids, documents = [], []
    for record in vectorstore.client.scroll(
        vectorstore.collection_name, limit=100, **scroll_kwargs
    )[0]:
        document_ids.append(record.id)
        documents.append(
            Document(
                page_content=record.payload["page_content"],
                metadata=record.payload["metadata"] or {},
            )
        )
    pprint_documents(documents, document_ids=document_ids)

def pprint_document(document: Document = None, document_id=None, return_string=False):
    displayed_text = ""
    if document_id:
        displayed_text += f"Document {document_id}:\n\n"
    displayed_text += f"{document.page_content}\n\n"
    metadata_text = pformat(document.metadata, indent=1)
    if "\n" in metadata_text:
        displayed_text += f"Metadata:\n{metadata_text}"
    else:
        displayed_text += f"Metadata:{metadata_text}"

    if return_string:
        return displayed_text
    else:
        print(displayed_text)


def pprint_documents(documents, document_ids=None):
    if not document_ids:
        document_ids = [i + 1 for i in range(len(documents))]

    displayed_texts = []
    for document_id, document in zip(document_ids, documents):
        displayed_text = pprint_document(
            document_id=document_id, document=document, return_string=True
        )
        displayed_texts.append(displayed_text)
    print(f"\n{'-' * 100}\n".join(displayed_texts))
```
</details>
You will get

```
{'num_added': 3, 'num_updated': 0, 'num_skipped': 0, 'num_deleted': 0}

Document 1b19816e-b802-53c0-ad60-5ff9d9b9b911:

1-2 I have a daugter~

Metadata:{'source': 'fullhell.alchemist'}
----------------------------------------------------------------------------------------------------
Document 3362f9bc-991a-5dd5-b465-c564786ce19c:

1-1 I have a dog~

Metadata:{'source': 'fullhell.alchemist'}
----------------------------------------------------------------------------------------------------
Document a4d50169-2fda-5339-a196-249b5f54a0de:

1-2 I have a daugter~

Metadata:{'source': 'fullhell.alchemist'}
```
This is not correct. We should be able to expect that the vectorsotre
now includes doc1_1, doc1_2, and doc2, but not doc1_1, doc1_2, and
doc1_2.


# Reason
In `index`, the original code is 
```python
uids = []
docs_to_index = []
for doc, hashed_doc, doc_exists in zip(doc_batch, hashed_docs, exists_batch):
    if doc_exists:
        # Must be updated to refresh timestamp.
        record_manager.update([hashed_doc.uid], time_at_least=index_start_dt)
        num_skipped += 1
        continue
    uids.append(hashed_doc.uid)
    docs_to_index.append(doc)
```
In the aforementioned example, `len(doc_batch) == 4`, but
`len(hashed_docs) == len(exists_batch) == 3`. This is because the
deduplication of input documents [doc1_1, doc1_2, doc1_2, doc2] is
[doc1_1, doc1_2, doc2]. So `index` insert doc1_1, doc1_2, doc1_2 with
the uid of doc1_1, doc1_2, doc2.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-09-22 22:41:07 -04:00
Joshua Sundance Bailey
d67b120a41 Make anthropic_api_key a secret str (#10724)
This PR makes `ChatAnthropic.anthropic_api_key` a `pydantic.SecretStr`
to avoid inadvertently exposing API keys when the `ChatAnthropic` object
is represented as a str.
2023-09-22 22:06:20 -04:00
Bagatur
1b65779905 fix integration tests (#10952) 2023-09-22 12:04:38 -07:00
Bagatur
6f781902ae vercel fix (#10951) 2023-09-22 11:31:52 -07:00
Bagatur
f0408c347f llm feat table revision (#10947) 2023-09-22 10:29:12 -07:00
Harrison Chase
9062e36722 Harrison/agents structured (#10911) 2023-09-22 10:21:23 -07:00
C.J. Jameson
b4d2663beb CONTRIBUTING.md Quick Start: focus on langchain core; clarify docs and experimental are separate (#10906)
follow up to https://github.com/langchain-ai/langchain/pull/7959 ,
explaining better to focus just on langchain core

no dependencies

twitter @cjcjameson
2023-09-22 10:17:08 -07:00
Michael Landis
f30b4697d4 fix: broken link in libs/langchain README (#10920)
**Description**
Fixes broken link to `CONTRIBUTING.md` in `libs/langchain/README.md`.

Because`libs/langchain/README.md` was copied from the top level README,
and because the README contains a link to `.github/CONTRIBUTING.md`, the
copied README's link relative path must be updated. This commit fixes
that link.
2023-09-22 10:14:19 -07:00
Bagatur
3cb460d5d8 bump 300 (#10940) 2023-09-22 09:44:47 -07:00
Bagatur
281a332784 table fix (#10944) 2023-09-22 09:37:03 -07:00
Bagatur
5336d87c15 update feat table (#10939) 2023-09-22 09:16:40 -07:00
Nuno Campos
3d5e92e3ef Accept run name arg for non-chain runs (#10935) 2023-09-22 08:41:25 -07:00
Nuno Campos
aac2d4dcef In MergerRetriever async call all retrievers in parallel (#10938) 2023-09-22 08:40:16 -07:00
German Martin
66d5a7e7cf Add async support to multi-query retriever. (#10873)
Added async support to the MultiQueryRetriever class.

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-09-22 08:33:20 -07:00
Greg Richardson
4eee789dd3 Docs: Using SupabaseVectorStore with existing documents (#10907)
## Description
Adds additional docs on how to use `SupabaseVectorStore` with existing
data in your DB (vs inserting new documents each time).
2023-09-22 08:18:56 -07:00
Leonid Kuligin
9d4b710a48 small fixes to Vertex (#10934)
Fixed tests, updated the required version of the SDK and a few minor
changes after the recent improvement
(https://github.com/langchain-ai/langchain/pull/10910)
2023-09-22 08:18:09 -07:00
wo0d
4e58b78102 Fix chat_history message order (#10869)
Not all databases uses id as default order, so add it explicitly

sqlite uses rawid as default order in select statement:
[https://www.sqlite.org/lang_createtable.html#rowid](https://www.sqlite.org/lang_createtable.html#rowid),
but some other databases like postgresql not behaves like this. since
this class supports multiple db engine. we should have an order.
2023-09-22 11:15:59 -04:00
Roman Shaptala
3d40de75c5 Fix default refine prompt template bug (#10928)
**Description:**
  
Default refine template does not actually use the refine template
defined above, it uses a string with the variable name.
 @baskaryan, @eyurtsev, @hwchase17
2023-09-22 11:04:28 -04:00
Bagatur
cab55e9bc1 add vertex prod features (#10910)
- chat vertex async
- vertex stream
- vertex full generation info
- vertex use server-side stopping
- model garden async
- update docs for all the above

in follow up will add
[] chat vertex full generation info
[] chat vertex retries
[] scheduled tests
2023-09-22 01:44:09 -07:00
Bagatur
dccc20b402 add model feat table (#10921) 2023-09-22 01:10:27 -07:00
William FH
ee8653f62c Wfh/allow nonparallel (#10914) 2023-09-21 20:21:01 -07:00
256 changed files with 9161 additions and 4542 deletions

View File

@@ -14,8 +14,8 @@ Please do not try to push directly to this repo unless you are a maintainer.
Please follow the checked-in pull request template when opening pull requests. Note related issues and tag relevant
maintainers.
Pull requests cannot land without passing the formatting, linting and testing checks first. See
[Common Tasks](#-common-tasks) for how to run these checks locally.
Pull requests cannot land without passing the formatting, linting and testing checks first. See [Testing](#testing) and
[Formatting and Linting](#formatting-and-linting) for how to run these checks locally.
It's essential that we maintain great documentation and testing. If you:
- Fix a bug
@@ -59,43 +59,85 @@ we do not want these to get in the way of getting good code into the codebase.
## 🚀 Quick Start
> **Note:** You can run this repository locally (which is described below) or in a [development container](https://containers.dev/) (which is described in the [.devcontainer folder](https://github.com/hwchase17/langchain/tree/master/.devcontainer)).
This quick start describes running the repository locally.
For a [development container](https://containers.dev/), see the [.devcontainer folder](https://github.com/hwchase17/langchain/tree/master/.devcontainer).
This project uses [Poetry](https://python-poetry.org/) v1.5.1 as a dependency manager. Check out Poetry's [documentation on how to install it](https://python-poetry.org/docs/#installation) on your system before proceeding.
### Dependency Management: Poetry and other env/dependency managers
❗Note: If you use `Conda` or `Pyenv` as your environment/package manager, avoid dependency conflicts by doing the following first:
1. *Before installing Poetry*, create and activate a new Conda env (e.g. `conda create -n langchain python=3.9`)
2. Install Poetry v1.5.1 (see above)
3. Tell Poetry to use the virtualenv python environment (`poetry config virtualenvs.prefer-active-python true`)
4. Continue with the following steps.
This project uses [Poetry](https://python-poetry.org/) v1.5.1+ as a dependency manager.
❗Note: *Before installing Poetry*, if you use `Conda`, create and activate a new Conda env (e.g. `conda create -n langchain python=3.9`)
Install Poetry: **[documentation on how to install it](https://python-poetry.org/docs/#installation)**.
❗Note: If you use `Conda` or `Pyenv` as your environment/package manager, after installing Poetry,
tell Poetry to use the virtualenv python environment (`poetry config virtualenvs.prefer-active-python true`)
### Core vs. Experimental
There are two separate projects in this repository:
- `langchain`: core langchain code, abstractions, and use cases
- `langchain.experimental`: more experimental code
- `langchain.experimental`: see the [Experimental README](../libs/experimental/README.md) for more information.
Each of these has their OWN development environment.
In order to run any of the commands below, please move into their respective directories.
For example, to contribute to `langchain` run `cd libs/langchain` before getting started with the below.
Each of these has their own development environment. Docs are run from the top-level makefile, but development
is split across separate test & release flows.
To install requirements:
For this quickstart, start with langchain core:
```bash
cd libs/langchain
```
### Local Development Dependencies
Install langchain development requirements (for running langchain, running examples, linting, formatting, tests, and coverage):
```bash
poetry install --with test
```
This will install all requirements for running the package, examples, linting, formatting, tests, and coverage.
Then verify dependency installation:
❗Note: If during installation you receive a `WheelFileValidationError` for `debugpy`, please make sure you are running Poetry v1.5.1. This bug was present in older versions of Poetry (e.g. 1.4.1) and has been resolved in newer releases. If you are still seeing this bug on v1.5.1, you may also try disabling "modern installation" (`poetry config installer.modern-installation false`) and re-installing requirements. See [this `debugpy` issue](https://github.com/microsoft/debugpy/issues/1246) for more details.
```bash
make test
```
Now assuming `make` and `pytest` are installed, you should be able to run the common tasks in the following section. To double check, run `make test` under `libs/langchain`, all tests should pass. If they don't, you may need to pip install additional dependencies, such as `numexpr` and `openapi_schema_pydantic`.
If the tests don't pass, you may need to pip install additional dependencies, such as `numexpr` and `openapi_schema_pydantic`.
## ✅ Common Tasks
If during installation you receive a `WheelFileValidationError` for `debugpy`, please make sure you are running
Poetry v1.5.1+. This bug was present in older versions of Poetry (e.g. 1.4.1) and has been resolved in newer releases.
If you are still seeing this bug on v1.5.1, you may also try disabling "modern installation"
(`poetry config installer.modern-installation false`) and re-installing requirements.
See [this `debugpy` issue](https://github.com/microsoft/debugpy/issues/1246) for more details.
Type `make` for a list of common tasks.
### Testing
### Code Formatting
_some test dependencies are optional; see section about optional dependencies_.
Formatting for this project is done via a combination of [Black](https://black.readthedocs.io/en/stable/) and [isort](https://pycqa.github.io/isort/).
Unit tests cover modular logic that does not require calls to outside APIs.
If you add new logic, please add a unit test.
To run unit tests:
```bash
make test
```
To run unit tests in Docker:
```bash
make docker_tests
```
There are also [integration tests and code-coverage](../libs/langchain/tests/README.md) available.
### Formatting and Linting
Run these locally before submitting a PR; the CI system will check also.
#### Code Formatting
Formatting for this project is done via a combination of [Black](https://black.readthedocs.io/en/stable/) and [ruff](https://docs.astral.sh/ruff/rules/).
To run formatting for this project:
@@ -111,9 +153,9 @@ make format_diff
This is especially useful when you have made changes to a subset of the project and want to ensure your changes are properly formatted without affecting the rest of the codebase.
### Linting
#### Linting
Linting for this project is done via a combination of [Black](https://black.readthedocs.io/en/stable/), [isort](https://pycqa.github.io/isort/), [flake8](https://flake8.pycqa.org/en/latest/), and [mypy](http://mypy-lang.org/).
Linting for this project is done via a combination of [Black](https://black.readthedocs.io/en/stable/), [ruff](https://docs.astral.sh/ruff/rules/), and [mypy](http://mypy-lang.org/).
To run linting for this project:
@@ -131,7 +173,7 @@ This can be very helpful when you've made changes to only certain parts of the p
We recognize linting can be annoying - if you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.
### Spellcheck
#### Spellcheck
Spellchecking for this project is done via [codespell](https://github.com/codespell-project/codespell).
Note that `codespell` finds common typos, so it could have false-positive (correctly spelled but rarely used) and false-negatives (not finding misspelled) words.
@@ -157,17 +199,7 @@ If codespell is incorrectly flagging a word, you can skip spellcheck for that wo
ignore-words-list = 'momento,collison,ned,foor,reworkd,parth,whats,aapply,mysogyny,unsecure'
```
### Coverage
Code coverage (i.e. the amount of code that is covered by unit tests) helps identify areas of the code that are potentially more or less brittle.
To get a report of current coverage, run the following:
```bash
make coverage
```
### Working with Optional Dependencies
## Working with Optional Dependencies
Langchain relies heavily on optional dependencies to keep the Langchain package lightweight.
@@ -192,51 +224,7 @@ To introduce the dependency to the pyproject.toml file correctly, please do the
test makes use of lightweight fixtures to test the logic of the code.
5. Please use the `@pytest.mark.requires(package_name)` decorator for any tests that require the dependency.
### Testing
See section about optional dependencies.
#### Unit Tests
Unit tests cover modular logic that does not require calls to outside APIs.
To run unit tests:
```bash
make test
```
To run unit tests in Docker:
```bash
make docker_tests
```
If you add new logic, please add a unit test.
#### Integration Tests
Integration tests cover logic that requires making calls to outside APIs (often integration with other services).
**warning** Almost no tests should be integration tests.
Tests that require making network connections make it difficult for other
developers to test the code.
Instead favor relying on `responses` library and/or mock.patch to mock
requests using small fixtures.
To run integration tests:
```bash
make integration_tests
```
If you add support for a new external API, please add a new integration test.
### Adding a Jupyter Notebook
## Adding a Jupyter Notebook
If you are adding a Jupyter Notebook example, you'll want to install the optional `dev` dependencies.
@@ -259,6 +247,12 @@ When you run `poetry install`, the `langchain` package is installed as editable
While the code is split between `langchain` and `langchain.experimental`, the documentation is one holistic thing.
This covers how to get started contributing to documentation.
From the top-level of this repo, install documentation dependencies:
```bash
poetry install
```
### Contribute Documentation
The docs directory contains Documentation and API Reference.

View File

@@ -19,4 +19,4 @@ jobs:
run: |
# We should not encourage imports directly from main init file
# Expect for hub
git grep 'from langchain import' docs | grep -vE 'from langchain import (hub)' && exit 1 || exit 0
git grep 'from langchain import' docs/{extras,docs_skeleton,snippets} | grep -vE 'from langchain import (hub)' && exit 1 || exit 0

View File

@@ -34,12 +34,19 @@ jobs:
working-directory: libs/langchain
cache-key: scheduled
- name: 'Authenticate to Google Cloud'
id: 'auth'
uses: 'google-github-actions/auth@v1'
with:
credentials_json: '${{ secrets.GOOGLE_CREDENTIALS }}'
- name: Install dependencies
working-directory: libs/langchain
shell: bash
run: |
echo "Running scheduled tests, installing dependencies with poetry..."
poetry install --with=test_integration
poetry run pip install google-cloud-aiplatform
- name: Run tests
shell: bash

View File

@@ -42,7 +42,8 @@ spell_fix:
######################
help:
@echo '----'
@echo '===================='
@echo '-- DOCUMENTATION --'
@echo 'clean - run docs_clean and api_docs_clean'
@echo 'docs_build - build the documentation'
@echo 'docs_clean - clean the documentation build artifacts'
@@ -51,4 +52,5 @@ help:
@echo 'api_docs_clean - clean the API Reference documentation build artifacts'
@echo 'api_docs_linkcheck - run linkchecker on the API Reference documentation'
@echo 'spell_check - run codespell on the project'
@echo 'spell_fix - run codespell on the project and fix the errors'
@echo 'spell_fix - run codespell on the project and fix the errors'
@echo '-- TEST and LINT tasks are within libs/*/ per-package --'

View File

@@ -0,0 +1,150 @@
import os
from pathlib import Path
from langchain import chat_models, llms
from langchain.chat_models.base import BaseChatModel, SimpleChatModel
from langchain.llms.base import BaseLLM, LLM
INTEGRATIONS_DIR = (
Path(os.path.abspath(__file__)).parents[1] / "extras" / "integrations"
)
LLM_IGNORE = ("FakeListLLM", "OpenAIChat", "PromptLayerOpenAIChat")
LLM_FEAT_TABLE_CORRECTION = {
"TextGen": {"_astream": False, "_agenerate": False},
"Ollama": {
"_stream": False,
},
"PromptLayerOpenAI": {"batch_generate": False, "batch_agenerate": False},
}
CHAT_MODEL_IGNORE = ("FakeListChatModel", "HumanInputChatModel")
CHAT_MODEL_FEAT_TABLE_CORRECTION = {
"ChatMLflowAIGateway": {"_agenerate": False},
"PromptLayerChatOpenAI": {"_stream": False, "_astream": False},
"ChatKonko": {"_astream": False, "_agenerate": False},
}
LLM_TEMPLATE = """\
---
sidebar_position: 0
sidebar_class_name: hidden
---
# LLMs
import DocCardList from "@theme/DocCardList";
## Features (natively supported)
All LLMs implement the Runnable interface, which comes with default implementations of all methods, ie. `ainvoke`, `batch`, `abatch`, `stream`, `astream`. This gives all LLMs basic support for async, streaming and batch, which by default is implemented as below:
- *Async* support defaults to calling the respective sync method in asyncio's default thread pool executor. This lets other async functions in your application make progress while the LLM is being executed, by moving this call to a background thread.
- *Streaming* support defaults to returning an `Iterator` (or `AsyncIterator` in the case of async streaming) of a single value, the final result returned by the underlying LLM provider. This obviously doesn't give you token-by-token streaming, which requires native support from the LLM provider, but ensures your code that expects an iterator of tokens can work for any of our LLM integrations.
- *Batch* support defaults to calling the underlying LLM in parallel for each input by making use of a thread pool executor (in the sync batch case) or `asyncio.gather` (in the async batch case). The concurrency can be controlled with the `max_concurrency` key in `RunnableConfig`.
Each LLM integration can optionally provide native implementations for async, streaming or batch, which, for providers that support it, can be more efficient. The table shows, for each integration, which features have been implemented with native support.
{table}
<DocCardList />
"""
CHAT_MODEL_TEMPLATE = """\
---
sidebar_position: 1
sidebar_class_name: hidden
---
# Chat models
import DocCardList from "@theme/DocCardList";
## Features (natively supported)
All ChatModels implement the Runnable interface, which comes with default implementations of all methods, ie. `ainvoke`, `batch`, `abatch`, `stream`, `astream`. This gives all ChatModels basic support for async, streaming and batch, which by default is implemented as below:
- *Async* support defaults to calling the respective sync method in asyncio's default thread pool executor. This lets other async functions in your application make progress while the ChatModel is being executed, by moving this call to a background thread.
- *Streaming* support defaults to returning an `Iterator` (or `AsyncIterator` in the case of async streaming) of a single value, the final result returned by the underlying ChatModel provider. This obviously doesn't give you token-by-token streaming, which requires native support from the ChatModel provider, but ensures your code that expects an iterator of tokens can work for any of our ChatModel integrations.
- *Batch* support defaults to calling the underlying ChatModel in parallel for each input by making use of a thread pool executor (in the sync batch case) or `asyncio.gather` (in the async batch case). The concurrency can be controlled with the `max_concurrency` key in `RunnableConfig`.
Each ChatModel integration can optionally provide native implementations to truly enable async or streaming.
The table shows, for each integration, which features have been implemented with native support.
{table}
<DocCardList />
"""
def get_llm_table():
llm_feat_table = {}
for cm in llms.__all__:
llm_feat_table[cm] = {}
cls = getattr(llms, cm)
if issubclass(cls, LLM):
for feat in ("_stream", "_astream", ("_acall", "_agenerate")):
if isinstance(feat, tuple):
feat, name = feat
else:
feat, name = feat, feat
llm_feat_table[cm][name] = getattr(cls, feat) != getattr(LLM, feat)
else:
for feat in [
"_stream",
"_astream",
("_generate", "batch_generate"),
"_agenerate",
("_agenerate", "batch_agenerate"),
]:
if isinstance(feat, tuple):
feat, name = feat
else:
feat, name = feat, feat
llm_feat_table[cm][name] = getattr(cls, feat) != getattr(BaseLLM, feat)
final_feats = {
k: v
for k, v in {**llm_feat_table, **LLM_FEAT_TABLE_CORRECTION}.items()
if k not in LLM_IGNORE
}
header = [
"model",
"_agenerate",
"_stream",
"_astream",
"batch_generate",
"batch_agenerate",
]
title = ["Model", "Invoke", "Async invoke", "Stream", "Async stream", "Batch", "Async batch"]
rows = [title, [":-"] + [":-:"] * (len(title) - 1)]
for llm, feats in sorted(final_feats.items()):
rows += [[llm, ""] + ["" if feats.get(h) else "" for h in header[1:]]]
return "\n".join(["|".join(row) for row in rows])
def get_chat_model_table():
feat_table = {}
for cm in chat_models.__all__:
feat_table[cm] = {}
cls = getattr(chat_models, cm)
if issubclass(cls, SimpleChatModel):
comparison_cls = SimpleChatModel
else:
comparison_cls = BaseChatModel
for feat in ("_stream", "_astream", "_agenerate"):
feat_table[cm][feat] = getattr(cls, feat) != getattr(comparison_cls, feat)
final_feats = {
k: v
for k, v in {**feat_table, **CHAT_MODEL_FEAT_TABLE_CORRECTION}.items()
if k not in CHAT_MODEL_IGNORE
}
header = ["model", "_agenerate", "_stream", "_astream"]
title = ["Model", "Invoke", "Async invoke", "Stream", "Async stream"]
rows = [title, [":-"] + [":-:"] * (len(title) - 1)]
for llm, feats in sorted(final_feats.items()):
rows += [[llm, ""] + ["" if feats.get(h) else "" for h in header[1:]]]
return "\n".join(["|".join(row) for row in rows])
if __name__ == "__main__":
llm_page = LLM_TEMPLATE.format(table=get_llm_table())
with open(INTEGRATIONS_DIR / "llms" / "index.mdx", "w") as f:
f.write(llm_page)
chat_model_page = CHAT_MODEL_TEMPLATE.format(table=get_chat_model_table())
with open(INTEGRATIONS_DIR / "chat" / "index.mdx", "w") as f:
f.write(chat_model_page)

View File

@@ -1,7 +1,6 @@
"""Script for auto-generating api_reference.rst."""
import importlib
import inspect
import os
import typing
from pathlib import Path
from typing import TypedDict, Sequence, List, Dict, Literal, Union, Optional

View File

@@ -21,7 +21,7 @@ With LCEL syntax, any components that can be run in parallel automatically are.
**Seamless LangSmith Tracing Integration**
As your chains get more and more complex, it becomes increasingly important to understand what exactly is happening at every step.
With LCEL, **all** steps are automatically logged to [LangSmith](smith.langchain.com) for maximal observability and debuggability.
With LCEL, **all** steps are automatically logged to [LangSmith](https://smith.langchain.com) for maximal observability and debuggability.
#### [Interface](/docs/expression_language/interface)
The base interface shared by all LCEL objects

View File

@@ -71,9 +71,9 @@ const config = {
test: /\.ipynb$/,
loader: "raw-loader",
resolve: {
fullySpecified: false
}
}
fullySpecified: false,
},
},
],
},
}),
@@ -158,16 +158,16 @@ const config = {
position: "left",
},
{
type: 'docSidebar',
position: 'left',
sidebarId: 'use_cases',
label: 'Use cases',
type: "docSidebar",
position: "left",
sidebarId: "use_cases",
label: "Use cases",
},
{
type: 'docSidebar',
position: 'left',
sidebarId: 'integrations',
label: 'Integrations',
type: "docSidebar",
position: "left",
sidebarId: "integrations",
label: "Integrations",
},
{
href: "https://api.python.langchain.com",
@@ -187,9 +187,9 @@ const config = {
// Please keep GitHub link to the right for consistency.
{
href: "https://github.com/hwchase17/langchain",
position: 'right',
className: 'header-github-link',
'aria-label': 'GitHub repository',
position: "right",
className: "header-github-link",
"aria-label": "GitHub repository",
},
],
},
@@ -239,6 +239,14 @@ const config = {
copyright: `Copyright © ${new Date().getFullYear()} LangChain, Inc.`,
},
}),
scripts: [
"/js/google_analytics.js",
{
src: "https://www.googletagmanager.com/gtag/js?id=G-9B66JQQH2F",
async: true,
},
],
};
module.exports = config;

View File

@@ -99,8 +99,8 @@ module.exports = {
label: "Components",
collapsible: false,
items: [
{ type: "category", label: "LLMs", collapsed: true, items: [{type:"autogenerated", dirName: "integrations/llms" }], link: {type: "generated-index", slug: "integrations/llms" }},
{ type: "category", label: "Chat models", collapsed: true, items: [{type:"autogenerated", dirName: "integrations/chat" }], link: {type: "generated-index", slug: "integrations/chat" }},
{ type: "category", label: "LLMs", collapsed: true, items: [{type:"autogenerated", dirName: "integrations/llms" }], link: { type: 'doc', id: "integrations/llms/index"}},
{ type: "category", label: "Chat models", collapsed: true, items: [{type:"autogenerated", dirName: "integrations/chat" }], link: { type: 'doc', id: "integrations/chat/index"}},
{ type: "category", label: "Document loaders", collapsed: true, items: [{type:"autogenerated", dirName: "integrations/document_loaders" }], link: {type: "generated-index", slug: "integrations/document_loaders" }},
{ type: "category", label: "Document transformers", collapsed: true, items: [{type: "autogenerated", dirName: "integrations/document_transformers" }], link: {type: "generated-index", slug: "integrations/document_transformers" }},
{ type: "category", label: "Text embedding models", collapsed: true, items: [{type: "autogenerated", dirName: "integrations/text_embedding" }], link: {type: "generated-index", slug: "integrations/text_embedding" }},

View File

@@ -0,0 +1,7 @@
window.dataLayer = window.dataLayer || [];
function gtag() {
dataLayer.push(arguments);
}
gtag("js", new Date());
gtag("config", "G-9B66JQQH2F");

View File

@@ -1,5 +1,9 @@
{
"redirects": [
{
"source": "/docs/modules/agents/agents/examples/mrkl_chat(.html?)",
"destination": "/docs/modules/agents/"
},
{
"source": "/docs/use_cases(/?)",
"destination": "/docs/use_cases/question_answering/"
@@ -1968,6 +1972,18 @@
"source": "/docs/modules/data_connection/document_loaders/integrations/youtube_transcript",
"destination": "/docs/integrations/document_loaders/youtube_transcript"
},
{
"source": "/docs/integrations/document_loaders/Etherscan",
"destination": "/docs/integrations/document_loaders/etherscan"
},
{
"source": "/docs/integrations/document_loaders/merge_doc_loader",
"destination": "/docs/integrations/document_loaders/merge_doc"
},
{
"source": "/docs/integrations/document_loaders/recursive_url_loader",
"destination": "/docs/integrations/document_loaders/recursive_url"
},
{
"source": "/en/latest/modules/indexes/text_splitters/examples/markdown_header_metadata.html",
"destination": "/docs/modules/data_connection/document_transformers/text_splitters/markdown_header_metadata"

View File

@@ -95,7 +95,7 @@
}
],
"source": [
"question_generator.invoke({\"warm\"})"
"question_generator.invoke(\"warm\")"
]
},
{
@@ -116,7 +116,7 @@
}
],
"source": [
"prompt = question_generator.invoke({\"warm\"})\n",
"prompt = question_generator.invoke(\"warm\")\n",
"model.invoke(prompt)"
]
},

View File

@@ -277,7 +277,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.1"
}
},
"nbformat": 4,

View File

@@ -14,12 +14,15 @@
},
{
"cell_type": "code",
"execution_count": 77,
"execution_count": 4,
"id": "6bb221b3",
"metadata": {},
"outputs": [],
"source": [
"from langchain.schema.runnable import RunnableLambda\n",
"from langchain.prompts import ChatPromptTemplate\n",
"from langchain.chat_models import ChatOpenAI\n",
"from operator import itemgetter\n",
"\n",
"def length_function(text):\n",
" return len(text)\n",
@@ -31,6 +34,7 @@
" return _multiple_length_function(_dict[\"text1\"], _dict[\"text2\"])\n",
"\n",
"prompt = ChatPromptTemplate.from_template(\"what is {a} + {b}\")\n",
"model = ChatOpenAI()\n",
"\n",
"chain1 = prompt | model\n",
"\n",
@@ -42,7 +46,7 @@
},
{
"cell_type": "code",
"execution_count": 78,
"execution_count": 5,
"id": "5488ec85",
"metadata": {},
"outputs": [
@@ -52,7 +56,7 @@
"AIMessage(content='3 + 9 equals 12.', additional_kwargs={}, example=False)"
]
},
"execution_count": 78,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
@@ -73,17 +77,18 @@
},
{
"cell_type": "code",
"execution_count": 139,
"execution_count": 9,
"id": "80b3b5f6-5d58-44b9-807e-cce9a46bf49f",
"metadata": {},
"outputs": [],
"source": [
"from langchain.schema.runnable import RunnableConfig"
"from langchain.schema.runnable import RunnableConfig\n",
"from langchain.schema.output_parser import StrOutputParser"
]
},
{
"cell_type": "code",
"execution_count": 149,
"execution_count": 10,
"id": "ff0daf0c-49dd-4d21-9772-e5fa133c5f36",
"metadata": {},
"outputs": [],
@@ -109,7 +114,7 @@
},
{
"cell_type": "code",
"execution_count": 152,
"execution_count": 12,
"id": "1a5e709e-9d75-48c7-bb9c-503251990505",
"metadata": {},
"outputs": [
@@ -132,6 +137,14 @@
" RunnableLambda(parse_or_fix).invoke(\"{foo: bar}\", {\"tags\": [\"my-tag\"], \"callbacks\": [cb]})\n",
" print(cb)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "29f55c38",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -150,7 +163,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.1"
}
},
"nbformat": 4,

View File

@@ -12,18 +12,18 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 2,
"id": "7e1873d6-d4b6-43ac-96a1-edcf178201e0",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'joke': AIMessage(content=\"Why don't bears wear shoes? \\nBecause they have bear feet!\", additional_kwargs={}, example=False),\n",
" 'poem': AIMessage(content=\"In twilight's embrace, a bear's gentle lumber,\\nSilent strength, nature's awe, a humble slumber.\", additional_kwargs={}, example=False)}"
"{'joke': AIMessage(content=\"Why don't bears wear shoes? \\n\\nBecause they have bear feet!\", additional_kwargs={}, example=False),\n",
" 'poem': AIMessage(content=\"In woodland depths, bear prowls with might,\\nSilent strength, nature's sovereign, day and night.\", additional_kwargs={}, example=False)}"
]
},
"execution_count": 5,
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
@@ -38,7 +38,7 @@
"joke_chain = ChatPromptTemplate.from_template(\"tell me a joke about {topic}\") | model\n",
"poem_chain = ChatPromptTemplate.from_template(\"write a 2-line poem about {topic}\") | model\n",
"\n",
"map_chain = RunnableMap({\"joke\": chain1, \"poem\": chain2,})\n",
"map_chain = RunnableMap({\"joke\": joke_chain, \"poem\": poem_chain,})\n",
"\n",
"map_chain.invoke({\"topic\": \"bear\"})"
]
@@ -54,7 +54,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "267d1460-53c1-4fdb-b2c3-b6a1eb7fccff",
"metadata": {},
"outputs": [
@@ -64,7 +64,7 @@
"'Harrison worked at Kensho.'"
]
},
"execution_count": 4,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
@@ -191,7 +191,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.1"
}
},
"nbformat": 4,

View File

@@ -47,13 +47,13 @@ A minimal example on how to deploy LangChain to [Kinsta](https://kinsta.com) usi
A minimal example of how to deploy LangChain to [Fly.io](https://fly.io/) using Flask.
## [Digitalocean App Platform](https://github.com/homanp/digitalocean-langchain)
## [DigitalOcean App Platform](https://github.com/homanp/digitalocean-langchain)
A minimal example of how to deploy LangChain to DigitalOcean App Platform.
## [CI/CD Google Cloud Build + Dockerfile + Serverless Google Cloud Run](https://github.com/g-emarco/github-assistant)
Boilerplate LangChain project on how to deploy to Google Cloud Run using Docker with Cloud Build CI/CD pipeline
Boilerplate LangChain project on how to deploy to Google Cloud Run using Docker with Cloud Build CI/CD pipeline.
## [Google Cloud Run](https://github.com/homanp/gcp-langchain)

Binary file not shown.

After

Width:  |  Height:  |  Size: 766 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 815 KiB

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,255 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "642fd21c-600a-47a1-be96-6e1438b421a9",
"metadata": {},
"source": [
"# ChatFireworks\n",
"\n",
">[Fireworks](https://app.fireworks.ai/) accelerates product development on generative AI by creating an innovative AI experiment and production platform. \n",
"\n",
"This example goes over how to use LangChain to interact with `ChatFireworks` models."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "d00d850917865298",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"from langchain.chat_models.fireworks import ChatFireworks\n",
"from langchain.schema import SystemMessage, HumanMessage\n",
"import os"
]
},
{
"cell_type": "markdown",
"id": "f28ebf8b-f14f-46c7-9962-8b8dc42e31be",
"metadata": {},
"source": [
"# Setup\n",
"Contact Fireworks AI for the an API Key to access our models\n",
"\n",
"Set up your model using a model id. If the model is not set, the default model is fireworks-llama-v2-7b-chat."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "d096fb14-8acc-4047-9cd0-c842430c3a1d",
"metadata": {},
"outputs": [],
"source": [
"# Initialize a Fireworks Chat model\n",
"os.environ['FIREWORKS_API_KEY'] = \"<your_api_key>\" # Change this to your own API key\n",
"chat = ChatFireworks(model=\"accounts/fireworks/models/llama-v2-13b-chat\")"
]
},
{
"cell_type": "markdown",
"id": "d8f13144-37cf-47a5-b5a0-e3cdf76d9a72",
"metadata": {},
"source": [
"# Calling the Model\n",
"\n",
"You can use the LLMs to call the model for specified message(s). \n",
"\n",
"See the full, most up-to-date model list on [app.fireworks.ai](https://app.fireworks.ai)."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "72340871-ae2f-415f-b399-0777d32dc379",
"metadata": {},
"outputs": [],
"source": [
"# ChatFireworks Wrapper\n",
"system_message = SystemMessage(content=\"You are to chat with the user.\")\n",
"human_message = HumanMessage(content=\"Who are you?\")\n",
"response = chat([system_message, human_message])"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "2d6ef879-69e3-422b-8379-bb980b70fe55",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"Hello! My name is LLaMA, I'm a large language model trained by a team of researcher at Meta AI. My primary function is to assist users with tasks and answer questions to the best of my ability. I am capable of understanding and responding to natural language input, and I am here to help you with any questions or tasks you may have. Is there anything specific you would like to know or discuss?\", additional_kwargs={}, example=False)"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "68c6b1fa-2ff7-4a63-8d88-3cec302180b8",
"metadata": {},
"outputs": [],
"source": [
"# Setting additional parameters: temperature, max_tokens, top_p\n",
"chat = ChatFireworks(model=\"accounts/fireworks/models/llama-v2-13b-chat\", model_kwargs={\"temperature\":1, \"max_tokens\": 20, \"top_p\": 1})\n",
"system_message = SystemMessage(content=\"You are to chat with the user.\")\n",
"human_message = HumanMessage(content=\"How's the weather today?\")\n",
"response = chat([system_message, human_message])"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "a09025f8-e4c3-4005-a8fc-c9c774b03a64",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"Oh, you know, it's just another beautiful day in the virtual world! The sun\", additional_kwargs={}, example=False)"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response"
]
},
{
"cell_type": "markdown",
"id": "d93aa186-39cf-4e1a-aa32-01ed31d43bc8",
"metadata": {},
"source": [
"# ChatFireworks Wrapper with generate"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "cbe29efc-37c3-4c83-8b84-b8bba1a1e589",
"metadata": {},
"outputs": [],
"source": [
"chat = ChatFireworks()\n",
"message = HumanMessage(content=\"Hello\")\n",
"response = chat.generate([[message], [message]])"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "35109f36-9519-47a6-a223-25639123e836",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"LLMResult(generations=[[ChatGeneration(text=\"Hello! It's nice to meet you. I'm here to help answer any questions you may have, while being respectful and safe. Please feel free to ask me anything, and I will do my best to provide helpful and positive responses. Is there something specific you would like to know or discuss?\", generation_info={'finish_reason': 'stop'}, message=AIMessage(content=\"Hello! It's nice to meet you. I'm here to help answer any questions you may have, while being respectful and safe. Please feel free to ask me anything, and I will do my best to provide helpful and positive responses. Is there something specific you would like to know or discuss?\", additional_kwargs={}, example=False))], [ChatGeneration(text=\"Hello! *smiling* I'm here to help you with any questions or concerns you may have. Please feel free to ask me anything, and I will do my best to provide helpful, respectful, and honest responses. I'm programmed to avoid any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content, and to provide socially unbiased and positive responses. Is there anything specific you would like to talk about or ask?\", generation_info={'finish_reason': 'stop'}, message=AIMessage(content=\"Hello! *smiling* I'm here to help you with any questions or concerns you may have. Please feel free to ask me anything, and I will do my best to provide helpful, respectful, and honest responses. I'm programmed to avoid any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content, and to provide socially unbiased and positive responses. Is there anything specific you would like to talk about or ask?\", additional_kwargs={}, example=False))]], llm_output={'model': 'accounts/fireworks/models/llama-v2-7b-chat'}, run=[RunInfo(run_id=UUID('f137463e-e1c7-454a-8b85-b999ce20e0f2')), RunInfo(run_id=UUID('f3ef1138-92de-4e01-900b-991e34a647a7'))])"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response"
]
},
{
"cell_type": "markdown",
"id": "92c2cabb-9eaf-4c49-b0e5-a5de5a7d920e",
"metadata": {},
"source": [
"# ChatFireworks Wrapper with stream"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "12717a29-fb7d-4a4d-860b-40435452b065",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Hello! I'm just\n",
" an AI assistant,\n",
" here to help answer your\n",
" questions and provide information in\n",
" a responsible and respectful manner\n",
". I'm not able\n",
" to access personal information or provide\n",
" any content that could be considered\n",
" harmful, uneth\n",
"ical, racist, sex\n",
"ist, toxic, dangerous\n",
", or illegal. My purpose\n",
" is to assist and provide helpful\n",
" responses that are socially un\n",
"biased and positive in nature\n",
". Is there something specific you\n",
" would like to know or discuss\n",
"?\n"
]
}
],
"source": [
"llm = ChatFireworks()\n",
"\n",
"for token in llm.stream(\"Who are you\"):\n",
" print(token.content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "02991e05-a38e-47d4-9ab3-7e630a8ead55",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Google Cloud Platform Vertex AI PaLM \n",
"# GCP Vertex AI \n",
"\n",
"Note: This is seperate from the Google PaLM integration. Google has chosen to offer an enterprise version of PaLM through GCP, and this supports the models made available through there. \n",
"\n",
@@ -31,7 +31,7 @@
},
"outputs": [],
"source": [
"#!pip install google-cloud-aiplatform"
"#!pip install langchain google-cloud-aiplatform"
]
},
{
@@ -41,12 +41,7 @@
"outputs": [],
"source": [
"from langchain.chat_models import ChatVertexAI\n",
"from langchain.prompts.chat import (\n",
" ChatPromptTemplate,\n",
" SystemMessagePromptTemplate,\n",
" HumanMessagePromptTemplate,\n",
")\n",
"from langchain.schema import HumanMessage, SystemMessage"
"from langchain.prompts import ChatPromptTemplate"
]
},
{
@@ -60,82 +55,78 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"system = \"You are a helpful assistant who translate English to French\"\n",
"human = \"Translate this sentence from English to French. I love programming.\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [(\"system\", system), (\"human\", human)]\n",
")\n",
"messages = prompt.format_messages()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='Sure, here is the translation of the sentence \"I love programming\" from English to French:\\n\\nJ\\'aime programmer.', additional_kwargs={}, example=False)"
"AIMessage(content=\" J'aime la programmation.\", additional_kwargs={}, example=False)"
]
},
"execution_count": 4,
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"messages = [\n",
" SystemMessage(\n",
" content=\"You are a helpful assistant that translates English to French.\"\n",
" ),\n",
" HumanMessage(\n",
" content=\"Translate this sentence from English to French. I love programming.\"\n",
" ),\n",
"]\n",
"chat(messages)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"You can make use of templating by using a `MessagePromptTemplate`. You can build a `ChatPromptTemplate` from one or more `MessagePromptTemplates`. You can use `ChatPromptTemplate`'s `format_prompt` -- this returns a `PromptValue`, which you can convert to a string or Message object, depending on whether you want to use the formatted value as input to an llm or chat model.\n",
"\n",
"For convenience, there is a `from_template` method exposed on the template. If you were to use this template, this is what it would look like:"
"If we want to construct a simple chain that takes user specified parameters:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"template = (\n",
" \"You are a helpful assistant that translates {input_language} to {output_language}.\"\n",
")\n",
"system_message_prompt = SystemMessagePromptTemplate.from_template(template)\n",
"human_template = \"{text}\"\n",
"human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)"
"system = \"You are a helpful assistant that translates {input_language} to {output_language}.\"\n",
"human = \"{text}\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [(\"system\", system), (\"human\", human)]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='Sure, here is the translation of \"I love programming\" in French:\\n\\nJ\\'aime programmer.', additional_kwargs={}, example=False)"
"AIMessage(content=' 私はプログラミングが大好きです。', additional_kwargs={}, example=False)"
]
},
"execution_count": 7,
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat_prompt = ChatPromptTemplate.from_messages(\n",
" [system_message_prompt, human_message_prompt]\n",
")\n",
"\n",
"# get a chat completion from the formatted messages\n",
"chat(\n",
" chat_prompt.format_prompt(\n",
" input_language=\"English\", output_language=\"French\", text=\"I love programming.\"\n",
" ).to_messages()\n",
"chain = prompt | chat\n",
"chain.invoke(\n",
" {\"input_language\": \"English\", \"output_language\": \"Japanese\", \"text\": \"I love programming\"}\n",
")"
]
},
@@ -153,60 +144,129 @@
"tags": []
},
"source": [
"## Code generation chat models\n",
"You can now leverage the Codey API for code chat within Vertex AI. The model name is:\n",
"- codechat-bison: for code assistance"
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 18,
"metadata": {
"execution": {
"iopub.execute_input": "2023-06-17T21:30:43.974841Z",
"iopub.status.busy": "2023-06-17T21:30:43.974431Z",
"iopub.status.idle": "2023-06-17T21:30:44.248119Z",
"shell.execute_reply": "2023-06-17T21:30:44.247362Z",
"shell.execute_reply.started": "2023-06-17T21:30:43.974820Z"
},
"tags": []
},
"outputs": [],
"source": [
"chat = ChatVertexAI(model_name=\"codechat-bison\")"
"chat = ChatVertexAI(\n",
" model_name=\"codechat-bison\",\n",
" max_output_tokens=1000,\n",
" temperature=0.5\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 20,
"metadata": {
"execution": {
"iopub.execute_input": "2023-06-17T21:30:45.146093Z",
"iopub.status.busy": "2023-06-17T21:30:45.145752Z",
"iopub.status.idle": "2023-06-17T21:30:47.449126Z",
"shell.execute_reply": "2023-06-17T21:30:47.448609Z",
"shell.execute_reply.started": "2023-06-17T21:30:45.146069Z"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" ```python\n",
"def is_prime(x): \n",
" if (x <= 1): \n",
" return False\n",
" for i in range(2, x): \n",
" if (x % i == 0): \n",
" return False\n",
" return True\n",
"```\n"
]
}
],
"source": [
"# For simple string in string out usage, we can use the `predict` method:\n",
"print(chat.predict(\"Write a Python function to identify all prime numbers\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Asynchronous calls\n",
"\n",
"We can make asynchronous calls via the `agenerate` and `ainvoke` methods."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"import asyncio\n",
"# import nest_asyncio\n",
"# nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='The following Python function can be used to identify all prime numbers up to a given integer:\\n\\n```\\ndef is_prime(n):\\n \"\"\"\\n Determines whether the given integer is prime.\\n\\n Args:\\n n: The integer to be tested for primality.\\n\\n Returns:\\n True if n is prime, False otherwise.\\n \"\"\"\\n\\n # Check if n is divisible by 2.\\n if n % 2 == 0:\\n return False\\n\\n # Check if n is divisible by any integer from 3 to the square root', additional_kwargs={}, example=False)"
"LLMResult(generations=[[ChatGeneration(text=\" J'aime la programmation.\", generation_info=None, message=AIMessage(content=\" J'aime la programmation.\", additional_kwargs={}, example=False))]], llm_output={}, run=[RunInfo(run_id=UUID('223599ef-38f8-4c79-ac6d-a5013060eb9d'))])"
]
},
"execution_count": 4,
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"messages = [\n",
" HumanMessage(\n",
" content=\"How do I create a python function to identify all prime numbers?\"\n",
" )\n",
"]\n",
"chat(messages)"
"chat = ChatVertexAI(\n",
" model_name=\"chat-bison\",\n",
" max_output_tokens=1000,\n",
" temperature=0.7,\n",
" top_p=0.95,\n",
" top_k=40,\n",
")\n",
"\n",
"asyncio.run(chat.agenerate([messages]))"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=' अहं प्रोग्रामिंग प्रेमामि', additional_kwargs={}, example=False)"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"asyncio.run(chain.ainvoke({\"input_language\": \"English\", \"output_language\": \"Sanskrit\", \"text\": \"I love programming\"}))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Streaming calls\n",
"\n",
"We can also stream outputs via the `stream` method:"
]
},
{
@@ -214,14 +274,51 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
"source": [
"import sys"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 1. China (1,444,216,107)\n",
"2. India (1,393,409,038)\n",
"3. United States (332,403,650)\n",
"4. Indonesia (273,523,615)\n",
"5. Pakistan (220,892,340)\n",
"6. Brazil (212,559,409)\n",
"7. Nigeria (206,139,589)\n",
"8. Bangladesh (164,689,383)\n",
"9. Russia (145,934,462)\n",
"10. Mexico (128,932,488)\n",
"11. Japan (126,476,461)\n",
"12. Ethiopia (115,063,982)\n",
"13. Philippines (109,581,078)\n",
"14. Egypt (102,334,404)\n",
"15. Vietnam (97,338,589)"
]
}
],
"source": [
"prompt = ChatPromptTemplate.from_messages([(\"human\", \"List out the 15 most populous countries in the world\")])\n",
"messages = prompt.format_messages()\n",
"for chunk in chat.stream(messages):\n",
" sys.stdout.write(chunk.content)\n",
" sys.stdout.flush()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "poetry-venv",
"language": "python",
"name": "python3"
"name": "poetry-venv"
},
"language_info": {
"codemirror_mode": {

View File

@@ -0,0 +1,39 @@
---
sidebar_position: 1
sidebar_class_name: hidden
---
# Chat models
import DocCardList from "@theme/DocCardList";
## Features (natively supported)
All ChatModels implement the Runnable interface, which comes with default implementations of all methods, ie. `ainvoke`, `batch`, `abatch`, `stream`, `astream`. This gives all ChatModels basic support for async, streaming and batch, which by default is implemented as below:
- *Async* support defaults to calling the respective sync method in asyncio's default thread pool executor. This lets other async functions in your application make progress while the ChatModel is being executed, by moving this call to a background thread.
- *Streaming* support defaults to returning an `Iterator` (or `AsyncIterator` in the case of async streaming) of a single value, the final result returned by the underlying ChatModel provider. This obviously doesn't give you token-by-token streaming, which requires native support from the ChatModel provider, but ensures your code that expects an iterator of tokens can work for any of our ChatModel integrations.
- *Batch* support defaults to calling the underlying ChatModel in parallel for each input by making use of a thread pool executor (in the sync batch case) or `asyncio.gather` (in the async batch case). The concurrency can be controlled with the `max_concurrency` key in `RunnableConfig`.
Each ChatModel integration can optionally provide native implementations to truly enable async or streaming.
The table shows, for each integration, which features have been implemented with native support.
Model|Invoke|Async invoke|Stream|Async stream
:-|:-:|:-:|:-:|:-:
AzureChatOpenAI|✅|✅|✅|✅
BedrockChat|✅|❌|✅|❌
ChatAnthropic|✅|✅|✅|✅
ChatAnyscale|✅|✅|✅|✅
ChatGooglePalm|✅|✅|❌|❌
ChatJavelinAIGateway|✅|✅|❌|❌
ChatKonko|✅|❌|❌|❌
ChatLiteLLM|✅|✅|✅|✅
ChatMLflowAIGateway|✅|❌|❌|❌
ChatOllama|✅|❌|✅|❌
ChatOpenAI|✅|✅|✅|✅
ChatVertexAI|✅|✅|✅|❌
ErnieBotChat|✅|❌|❌|❌
JinaChat|✅|✅|✅|✅
MiniMaxChat|✅|✅|❌|❌
PromptLayerChatOpenAI|✅|❌|❌|❌
QianfanChatEndpoint|✅|✅|✅|✅
<DocCardList />

View File

@@ -0,0 +1,174 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "eb7e5679-aa06-47e4-a1a3-b6b70e604017",
"metadata": {},
"source": [
"# vLLM Chat\n",
"\n",
"vLLM can be deployed as a server that mimics the OpenAI API protocol. This allows vLLM to be used as a drop-in replacement for applications using OpenAI API. This server can be queried in the same format as OpenAI API.\n",
"\n",
"This notebook covers how to get started with vLLM chat models using langchain's `ChatOpenAI` **as it is**."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "060a2e3d-d42f-4221-bd09-a9a06544dcd3",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.prompts.chat import (\n",
" ChatPromptTemplate,\n",
" SystemMessagePromptTemplate,\n",
" AIMessagePromptTemplate,\n",
" HumanMessagePromptTemplate,\n",
")\n",
"from langchain.schema import AIMessage, HumanMessage, SystemMessage"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "bf24d732-68a9-44fd-b05d-4903ce5620c6",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"inference_server_url = \"http://localhost:8000/v1\"\n",
"\n",
"chat = ChatOpenAI(\n",
" model=\"mosaicml/mpt-7b\",\n",
" openai_api_key=\"EMPTY\",\n",
" openai_api_base=inference_server_url,\n",
" max_tokens=5,\n",
" temperature=0,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "aea4e363-5688-4b07-82ed-6aa8153c2377",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=' Io amo programmare', additional_kwargs={}, example=False)"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"messages = [\n",
" SystemMessage(\n",
" content=\"You are a helpful assistant that translates English to Italian.\"\n",
" ),\n",
" HumanMessage(\n",
" content=\"Translate the following sentence from English to Italian: I love programming.\"\n",
" ),\n",
"]\n",
"chat(messages)"
]
},
{
"cell_type": "markdown",
"id": "55fc7046-a6dc-4720-8c0c-24a6db76a4f4",
"metadata": {},
"source": [
"You can make use of templating by using a `MessagePromptTemplate`. You can build a `ChatPromptTemplate` from one or more `MessagePromptTemplates`. You can use ChatPromptTemplate's format_prompt -- this returns a `PromptValue`, which you can convert to a string or `Message` object, depending on whether you want to use the formatted value as input to an llm or chat model.\n",
"\n",
"For convenience, there is a `from_template` method exposed on the template. If you were to use this template, this is what it would look like:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "123980e9-0dee-4ce5-bde6-d964dd90129c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"template = (\n",
" \"You are a helpful assistant that translates {input_language} to {output_language}.\"\n",
")\n",
"system_message_prompt = SystemMessagePromptTemplate.from_template(template)\n",
"human_template = \"{text}\"\n",
"human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "b2fb8c59-8892-4270-85a2-4f8ab276b75d",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=' I love programming too.', additional_kwargs={}, example=False)"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat_prompt = ChatPromptTemplate.from_messages(\n",
" [system_message_prompt, human_message_prompt]\n",
")\n",
"\n",
"# get a chat completion from the formatted messages\n",
"chat(\n",
" chat_prompt.format_prompt(\n",
" input_language=\"English\", output_language=\"Italian\", text=\"I love programming.\"\n",
" ).to_messages()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0bbd9861-2b94-4920-8708-b690004f4c4d",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "conda_pytorch_p310",
"language": "python",
"name": "conda_pytorch_p310"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -5,9 +5,9 @@
"id": "e229e34c",
"metadata": {},
"source": [
"# AsyncHtmlLoader\n",
"# AsyncHtml\n",
"\n",
"AsyncHtmlLoader loads raw HTML from a list of urls concurrently."
"`AsyncHtmlLoader` loads raw HTML from a list of URLs concurrently."
]
},
{
@@ -99,7 +99,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -1,156 +1,159 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a634365e",
"metadata": {},
"source": [
"# AWS S3 Directory\n",
"\n",
">[Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-folders.html) is an object storage service\n",
"\n",
">[AWS S3 Directory](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-folders.html)\n",
"\n",
"This covers how to load document objects from an `AWS S3 Directory` object."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "49815096",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"#!pip install boto3"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "2f0cd6a5",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import S3DirectoryLoader"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "321cc7f1",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"loader = S3DirectoryLoader(\"testing-hwc\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2b11d155",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"loader.load()"
]
},
{
"cell_type": "markdown",
"id": "0690c40a",
"metadata": {},
"source": [
"## Specifying a prefix\n",
"You can also specify a prefix for more finegrained control over what files to load."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "72d44781",
"metadata": {},
"outputs": [],
"source": [
"loader = S3DirectoryLoader(\"testing-hwc\", prefix=\"fake\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "2d3c32db",
"metadata": {},
"outputs": [
"cells": [
{
"data": {
"text/plain": [
"[Document(page_content='Lorem ipsum dolor sit amet.', lookup_str='', metadata={'source': 's3://testing-hwc/fake.docx'}, lookup_index=0)]"
"cell_type": "markdown",
"id": "a634365e",
"metadata": {},
"source": [
"# AWS S3 Directory\n",
"\n",
">[Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-folders.html) is an object storage service\n",
"\n",
">[AWS S3 Directory](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-folders.html)\n",
"\n",
"This covers how to load document objects from an `AWS S3 Directory` object."
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
},
{
"cell_type": "code",
"execution_count": null,
"id": "49815096",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"#!pip install boto3"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "2f0cd6a5",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import S3DirectoryLoader"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "321cc7f1",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"loader = S3DirectoryLoader(\"testing-hwc\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2b11d155",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"loader.load()"
]
},
{
"cell_type": "markdown",
"id": "0690c40a",
"metadata": {},
"source": [
"## Specifying a prefix\n",
"You can also specify a prefix for more finegrained control over what files to load."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "72d44781",
"metadata": {},
"outputs": [],
"source": [
"loader = S3DirectoryLoader(\"testing-hwc\", prefix=\"fake\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "2d3c32db",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Lorem ipsum dolor sit amet.', lookup_str='', metadata={'source': 's3://testing-hwc/fake.docx'}, lookup_index=0)]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"loader.load()"
]
},
{
"cell_type": "markdown",
"source": [
"## Configuring the AWS Boto3 client\n",
"You can configure the AWS [Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) client by passing\n",
"named arguments when creating the S3DirectoryLoader.\n",
"This is useful for instance when AWS credentials can't be set as environment variables.\n",
"See the [list of parameters](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html#boto3.session.Session) that can be configured."
],
"metadata": {},
"id": "91a7ac07"
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"loader = S3DirectoryLoader(\"testing-hwc\", aws_access_key_id=\"xxxx\", aws_secret_access_key=\"yyyy\")"
],
"metadata": {},
"id": "f485ec8c"
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"loader.load()"
],
"metadata": {},
"id": "c0fa76ae"
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
],
"source": [
"loader.load()"
]
},
{
"cell_type": "markdown",
"source": [
"## Configuring the AWS Boto3 client\n",
"You can configure the AWS [Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) client by passing\n",
"named arguments when creating the S3DirectoryLoader.\n",
"This is useful for instance when AWS credentials can't be set as environment variables.\n",
"See the [list of parameters](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html#boto3.session.Session) that can be configured."
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"loader = S3DirectoryLoader(\"testing-hwc\", aws_access_key_id=\"xxxx\", aws_secret_access_key=\"yyyy\")"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"loader.load()"
],
"metadata": {}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,121 +1,122 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "66a7777e",
"metadata": {},
"source": [
"# AWS S3 File\n",
"\n",
">[Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-folders.html) is an object storage service.\n",
"\n",
">[AWS S3 Buckets](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html)\n",
"\n",
"This covers how to load document objects from an `AWS S3 File` object."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "9ec8a3b3",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import S3FileLoader"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "43128d8d",
"metadata": {},
"outputs": [],
"source": [
"#!pip install boto3"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "35d6809a",
"metadata": {},
"outputs": [],
"source": [
"loader = S3FileLoader(\"testing-hwc\", \"fake.docx\")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "efd6be84",
"metadata": {},
"outputs": [
"cells": [
{
"data": {
"text/plain": [
"[Document(page_content='Lorem ipsum dolor sit amet.', lookup_str='', metadata={'source': 's3://testing-hwc/fake.docx'}, lookup_index=0)]"
"cell_type": "markdown",
"id": "66a7777e",
"metadata": {},
"source": [
"# AWS S3 File\n",
"\n",
">[Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-folders.html) is an object storage service.\n",
"\n",
">[AWS S3 Buckets](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html)\n",
"\n",
"This covers how to load document objects from an `AWS S3 File` object."
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
},
{
"cell_type": "code",
"execution_count": 1,
"id": "9ec8a3b3",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import S3FileLoader"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "43128d8d",
"metadata": {},
"outputs": [],
"source": [
"#!pip install boto3"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "35d6809a",
"metadata": {},
"outputs": [],
"source": [
"loader = S3FileLoader(\"testing-hwc\", \"fake.docx\")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "efd6be84",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Lorem ipsum dolor sit amet.', lookup_str='', metadata={'source': 's3://testing-hwc/fake.docx'}, lookup_index=0)]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"loader.load()"
]
},
{
"cell_type": "markdown",
"id": "93689594",
"metadata": {},
"source": [
"## Configuring the AWS Boto3 client\n",
"You can configure the AWS [Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) client by passing\n",
"named arguments when creating the S3DirectoryLoader.\n",
"This is useful for instance when AWS credentials can't be set as environment variables.\n",
"See the [list of parameters](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html#boto3.session.Session) that can be configured."
]
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"loader = S3FileLoader(\"testing-hwc\", \"fake.docx\", aws_access_key_id=\"xxxx\", aws_secret_access_key=\"yyyy\")"
],
"metadata": {},
"id": "43106ee8"
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"loader.load()"
],
"metadata": {},
"id": "1764a727"
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
],
"source": [
"loader.load()"
]
},
{
"cell_type": "markdown",
"id": "93689594",
"metadata": {},
"source": [
"## Configuring the AWS Boto3 client\n",
"You can configure the AWS [Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) client by passing\n",
"named arguments when creating the S3DirectoryLoader.\n",
"This is useful for instance when AWS credentials can't be set as environment variables.\n",
"See the [list of parameters](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html#boto3.session.Session) that can be configured."
]
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"loader = S3FileLoader(\"testing-hwc\", \"fake.docx\", aws_access_key_id=\"xxxx\", aws_secret_access_key=\"yyyy\")"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"loader.load()"
],
"metadata": {}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -5,12 +5,17 @@
"id": "1ab83660",
"metadata": {},
"source": [
"# Etherscan Loader\n",
"# Etherscan\n",
"\n",
">[Etherscan](https://docs.etherscan.io/) is the leading blockchain explorer, search, API and analytics platform for Ethereum, \n",
"a decentralized smart contracts platform.\n",
"\n",
"\n",
"## Overview\n",
"\n",
"The Etherscan loader use etherscan api to load transaction histories under specific account on Ethereum Mainnet.\n",
"The `Etherscan` loader use `Etherscan API` to load transacactions histories under specific account on `Ethereum Mainnet`.\n",
"\n",
"You will need a Etherscan api key to proceed. The free api key has 5 calls per second quota.\n",
"You will need a `Etherscan api key` to proceed. The free api key has 5 calls per seconds quota.\n",
"\n",
"The loader supports the following six functinalities:\n",
"* Retrieve normal transactions under specific account on Ethereum Mainet\n",
@@ -47,7 +52,7 @@
"id": "d72d4e22",
"metadata": {},
"source": [
"# Setup"
"## Setup"
]
},
{
@@ -86,7 +91,7 @@
"id": "3bcbb63e",
"metadata": {},
"source": [
"# Create a ERC20 transaction loader"
"## Create a ERC20 transaction loader"
]
},
{
@@ -136,7 +141,7 @@
"id": "2a1ecce0",
"metadata": {},
"source": [
"# Create a normal transaction loader with customized parameters"
"## Create a normal transaction loader with customized parameters"
]
},
{
@@ -212,7 +217,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.2"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# MediaWikiDump\n",
"# MediaWiki Dump\n",
"\n",
">[MediaWiki XML Dumps](https://www.mediawiki.org/wiki/Manual:Importing_XML_dumps) contain the content of a wiki (wiki pages with all their revisions), without the site-related data. A XML dump does not create a full backup of the wiki database, the dump does not contain user accounts, images, edit logs, etc.\n",
"\n",
@@ -122,7 +122,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -5,7 +5,7 @@
"id": "dd7c3503",
"metadata": {},
"source": [
"# MergeDocLoader\n",
"# Merge Documents Loader\n",
"\n",
"Merge the documents returned from a set of specified data loaders."
]
@@ -96,7 +96,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -1,17 +1,28 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Nuclia Understanding API document loader\n",
"# Nuclia\n",
"\n",
"[Nuclia](https://nuclia.com) automatically indexes your unstructured data from any internal and external source, providing optimized search results and generative answers. It can handle video and audio transcription, image content extraction, and document parsing.\n",
">[Nuclia](https://nuclia.com) automatically indexes your unstructured data from any internal and external source, providing optimized search results and generative answers. It can handle video and audio transcription, image content extraction, and document parsing.\n",
"\n",
"The Nuclia Understanding API supports the processing of unstructured data, including text, web pages, documents, and audio/video contents. It extracts all texts wherever they are (using speech-to-text or OCR when needed), it also extracts metadata, embedded files (like images in a PDF), and web links. If machine learning is enabled, it identifies entities, provides a summary of the content and generates embeddings for all the sentences.\n",
"\n",
"To use the Nuclia Understanding API, you need to have a Nuclia account. You can create one for free at [https://nuclia.cloud](https://nuclia.cloud), and then [create a NUA key](https://docs.nuclia.dev/docs/docs/using/understanding/intro)."
">The `Nuclia Understanding API` supports the processing of unstructured data, including text, web pages, documents, and audio/video contents. It extracts all texts wherever they are (using speech-to-text or OCR when needed), it also extracts metadata, embedded files (like images in a PDF), and web links. If machine learning is enabled, it identifies entities, provides a summary of the content and generates embeddings for all the sentences.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To use the `Nuclia Understanding API`, you need to have a Nuclia account. You can create one for free at [https://nuclia.cloud](https://nuclia.cloud), and then [create a NUA key](https://docs.nuclia.dev/docs/docs/using/understanding/intro)."
]
},
{
@@ -37,10 +48,11 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example\n",
"\n",
"To use the Nuclia document loader, you need to instantiate a `NucliaUnderstandingAPI` tool:"
]
},
@@ -67,7 +79,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -95,7 +106,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -121,7 +131,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "langchain",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -135,10 +145,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.5"
},
"orig_nbformat": 4
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -1,11 +1,10 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# PySpark DataFrame Loader\n",
"# PySpark\n",
"\n",
"This notebook goes over how to load data from a [PySpark](https://spark.apache.org/docs/latest/api/python/) DataFrame."
]
@@ -147,9 +146,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -5,7 +5,7 @@
"id": "5a7cc773",
"metadata": {},
"source": [
"# Recursive URL Loader\n",
"# Recursive URL\n",
"\n",
"We may want to process load all URLs under a root directory.\n",
"\n",
@@ -170,7 +170,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -1,16 +1,15 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "e48afb8d",
"metadata": {},
"source": [
"# Loading documents from a YouTube url\n",
"# YouTube audio\n",
"\n",
"Building chat or QA applications on YouTube videos is a topic of high interest.\n",
"\n",
"Below we show how to easily go from a YouTube url to text to chat!\n",
"Below we show how to easily go from a `YouTube url` to `audio of the video` to `text` to `chat`!\n",
"\n",
"We wil use the `OpenAIWhisperParser`, which will use the OpenAI Whisper API to transcribe audio to text, \n",
"and the `OpenAIWhisperParserLocal` for local support and running on private clouds or on premise.\n",
@@ -82,9 +81,7 @@
"cell_type": "code",
"execution_count": 2,
"id": "23e1e134",
"metadata": {
"scrolled": false
},
"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -128,9 +125,7 @@
"cell_type": "code",
"execution_count": 3,
"id": "72a94fd8",
"metadata": {
"scrolled": false
},
"metadata": {},
"outputs": [
{
"data": {
@@ -293,7 +288,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -307,7 +302,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
"version": "3.10.12"
},
"vscode": {
"interpreter": {

View File

@@ -19,7 +19,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms.fireworks import Fireworks, FireworksChat\n",
"from langchain.llms.fireworks import Fireworks\n",
"from langchain.prompts import PromptTemplate\nfrom langchain.chains import LLMChain\n",
"from langchain.prompts.chat import (\n",
" ChatPromptTemplate,\n",
@@ -48,8 +48,8 @@
"outputs": [],
"source": [
"# Initialize a Fireworks LLM\n",
"os.environ['FIREWORKS_API_KEY'] = \"<YOUR_API_KEY>\" # Change this to your own API key\n",
"llm = Fireworks(model_id=\"accounts/fireworks/models/llama-v2-13b-chat\")"
"os.environ['FIREWORKS_API_KEY'] = \"<your_api_key>\" # Change this to your own API key\n",
"llm = Fireworks(model=\"accounts/fireworks/models/llama-v2-13b-chat\")"
]
},
{
@@ -61,28 +61,7 @@
"\n",
"You can use the LLMs to call the model for specified prompt(s). \n",
"\n",
"Currently supported models: \n",
"\n",
"* Falcon\n",
" * `accounts/fireworks/models/falcon-7b`\n",
" * `accounts/fireworks/models/falcon-40b-w8a16`\n",
"* Llama 2\n",
" * `accounts/fireworks/models/llama-v2-7b`\n",
" * `accounts/fireworks/models/llama-v2-7b-w8a16`\n",
" * `accounts/fireworks/models/llama-v2-7b-chat`\n",
" * `accounts/fireworks/models/llama-v2-7b-chat-w8a16`\n",
" * `accounts/fireworks/models/llama-v2-13b`\n",
" * `accounts/fireworks/models/llama-v2-13b-w8a16`\n",
" * `accounts/fireworks/models/llama-v2-13b-chat`\n",
" * `accounts/fireworks/models/llama-v2-13b-chat-w8a16`\n",
" * `accounts/fireworks/models/llama-v2-70b-chat-4gpu`\n",
"* StarCoder\n",
" * `accounts/fireworks/models/starcoder-1b-w8a16-1gpu`\n",
" * `accounts/fireworks/models/starcoder-3b-w8a16-1gpu`\n",
" * `accounts/fireworks/models/starcoder-7b-w8a16-1gpu`\n",
" * `accounts/fireworks/models/starcoder-16b-w8a16`\n",
"\n",
"See the full, most up-to-date list on [app.fireworks.ai](https://app.fireworks.ai)."
"See the full, most up-to-date model list on [app.fireworks.ai](https://app.fireworks.ai)."
]
},
{
@@ -95,29 +74,17 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Is it Tom Brady, Aaron Rodgers, or someone else? It's a tough question to answer, and there are strong arguments for each of these quarterbacks. Here are some of the reasons why each of these quarterbacks could be considered the best:\n",
"\n",
"Tom Brady:\n",
"\n",
"* He has the most Super Bowl wins (6) of any quarterback in NFL history.\n",
"* He has been named Super Bowl MVP four times, more than any other player.\n",
"* He has led the New England Patriots to 18 playoff victories, the most in NFL history.\n",
"* He has thrown for over 70,000 yards in his career, the most of any quarterback in NFL history.\n",
"* He has thrown for 50 or more touchdowns in a season four times, the most of any quarterback in NFL history.\n",
"It's a question that's been debated for years, and there are plenty of strong candidates. Here are some of the top quarterbacks in the league right now:\n",
"\n",
"Aaron Rodgers:\n",
"1. Tom Brady (New England Patriots): Brady is widely considered one of the greatest quarterbacks of all time, and for good reason. He's led the Patriots to six Super Bowl wins and has been named Super Bowl MVP four times. He's known for his precision passing and ability to read defenses.\n",
"2. Aaron Rodgers (Green Bay Packers): Rodgers is another top-tier quarterback who's known for his accuracy and ability to make plays outside of the pocket. He's led the Packers to a Super Bowl win and has been named NFL MVP twice.\n",
"3. Drew Brees (New Orleans Saints): Brees is one of the most prolific passers in NFL history, and he's shown no signs of slowing down. He's led the Saints to a Super Bowl win and has been named NFL MVP once.\n",
"4. Russell Wilson (Seattle Seahawks): Wilson is a dynamic quarterback who's known for his ability to make plays with his legs and his arm. He's led the Seahawks to a Super Bowl win and has been named NFL MVP once.\n",
"5. Patrick Mahomes (Kansas City Chiefs): Mahomes is a young quarterback who's quickly become one of the best in the league. He led the Chiefs to a Super Bowl win last season and has been named NFL MVP twice. He's known for his incredible arm talent and ability to make plays outside of the pocket.\n",
"\n",
"* He has led the Green Bay Packers to a Super Bowl victory in 2010.\n",
"* He has been named Super Bowl MVP once.\n",
"* He has thrown for over 40,000 yards in his career, the most of any quarterback in NFL history.\n",
"* He has thrown for 40 or more touchdowns in a season three times, the most of any quarterback in NFL history.\n",
"* He has a career passer rating of 103.1, the highest of any quarterback in NFL history.\n",
"\n",
"So, who's the best quarterback in the NFL? It's a tough call, but here's my opinion:\n",
"\n",
"I think Aaron Rodgers is the best quarterback in the NFL right now. He has led the Packers to a Super Bowl victory and has had some incredible seasons, including the 2011 season when he threw for 45 touchdowns and just 6 interceptions. He has a strong arm, great accuracy, and is incredibly mobile for a quarterback of his size. He also has a great sense of timing and knows when to take risks and when to play it safe.\n",
"\n",
"Tom Brady is a close second, though. He has an incredible track record of success, including six Super Bowl victories, and has been one of the most consistent quarterbacks in the league for the past two decades. He has a strong arm and is incredibly accurate\n"
"Of course, there are other great quarterbacks in the league as well, such as Ben Roethlisberger, Matt Ryan, and Deshaun Watson. Ultimately, the \"best\" quarterback is a matter of personal opinion and depends on how you define \"best.\" Some people might value accuracy and precision passing, while others might prefer a quarterback who can make plays with their legs. Either way, the NFL is filled with talented quarterbacks who are making incredible plays every week.\n"
]
}
],
@@ -137,7 +104,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"[[Generation(text='\\nThe best cricket player in 2016 is a matter of opinion, but some of the top contenders for the title include:\\n\\n1. Virat Kohli (India): Kohli had a phenomenal year in 2016, scoring over 1,000 runs in Test cricket, including four centuries, and averaging over 70. He also scored heavily in ODI cricket, with an average of over 80.\\n2. Steve Smith (Australia): Smith had a remarkable year in 2016, leading Australia to a Test series victory in India and scoring over 1,000 runs in the format, including five centuries. He also averaged over 60 in ODI cricket.\\n3. KL Rahul (India): Rahul had a breakout year in 2016, scoring over 1,000 runs in Test cricket, including four centuries, and averaging over 60. He also scored heavily in ODI cricket, with an average of over 70.\\n4. Joe Root (England): Root had a solid year in 2016, scoring over 1,000 runs in Test cricket, including four centuries, and averaging over 50. He also scored heavily in ODI cricket, with an average of over 80.\\n5. Quinton de Kock (South Africa): De Kock had a remarkable year in 2016, scoring over 1,000 runs in ODI cricket, including six centuries, and averaging over 80. He also scored heavily in Test cricket, with an average of over 50.\\n\\nThese are just a few of the top contenders for the title of best cricket player in 2016, but there were many other talented players who also had impressive years. Ultimately, the answer to this question is subjective and depends on individual opinions and criteria for evaluation.', generation_info=None)], [Generation(text=\"\\nThis is a tough one, as there are so many great players in the league right now. But if I had to choose one, I'd say LeBron James is the best basketball player in the league. He's a once-in-a-generation talent who can dominate the game in so many ways. He's got incredible speed, strength, and court vision, and he's always finding new ways to improve his game. Plus, he's been doing it at an elite level for over a decade now, which is just amazing.\\n\\nBut don't just take my word for it - there are plenty of other great players in the league who could make a strong case for being the best. Guys like Kevin Durant, Steph Curry, James Harden, and Giannis Antetokounmpo are all having incredible seasons, and they've all got their own unique skills and strengths that make them special. So ultimately, it's up to you to decide who you think is the best basketball player in the league.\", generation_info=None)]]\n"
"[[Generation(text=\"\\n\\nNote: This is a subjective question, and the answer will depend on individual opinions and perspectives.\\n\\nThere are many great cricket players, and it's difficult to identify a single best player. However, here are some of the top performers in 2016:\\n\\n1. Virat Kohli (India): Kohli had an outstanding year in all formats of the game, scoring heavily in Tests, ODIs, and T20Is. He was especially impressive in the Test series against England, where he scored four centuries and averaged over 100.\\n2. Steve Smith (Australia): Smith had a phenomenal year as well, leading Australia to a Test series win in India and averaging over 100 in the longer format. He also scored a century in the ODI series against Pakistan.\\n3. Kane Williamson (New Zealand): Williamson had a consistent year, scoring heavily in all formats and leading New Zealand to a Test series win against Australia. He also won the ICC Test Player of the Year award.\\n4. Joe Root (England): Root had a solid year, scoring three hundreds in the Test series against Pakistan and India, and averaging over 50 in Tests.\\n5. AB de Villiers (South Africa): De Villiers had a brilliant year in ODIs, scoring four hundreds and averaging over 100. He also had a good year in Tests, scoring two hundreds and averaging over 50.\\n6. Quinton de Kock (South Africa): De Kock had a great year behind the wickets, scoring heavily in all formats and averaging over 50 in Tests.\\n7. Rohit Sharma (India): Sharma had a fantastic year in ODIs, scoring four hundreds and averaging over 100. He also had a good year in Tests, scoring two hundreds and averaging over 40.\\n8. David Warner (Australia): Warner had a great year in ODIs, scoring three hundreds and averaging over 100. He also had a good year in Tests, scoring two hundreds and averaging over 40.\\n\\nThese are just a few examples of top performers in 2016, and opinions on the best player will vary depending on individual perspectives\", generation_info=None)], [Generation(text='\\n\\nThere are a lot of great players in the NBA, and opinions on who\\'s the best can vary depending on personal preferences and criteria for evaluation. However, here are some of the top candidates for the title of best basketball player in the league based on their recent performances and achievements:\\n\\n1. LeBron James: James is a four-time NBA champion and four-time MVP, and is widely regarded as one of the greatest players of all time. He has led the Los Angeles Lakers to the best record in the Western Conference this season and is averaging 25.7 points, 7.9 rebounds, and 7.4 assists per game.\\n2. Giannis Antetokounmpo: Antetokounmpo, known as the \"Greek Freak,\" is a dominant force in the paint and has led the Milwaukee Bucks to the best record in the Eastern Conference. He is averaging 30.5 points, 12.6 rebounds, and 5.9 assists per game, and is a strong contender for the MVP award.\\n3. Stephen Curry: Curry is a three-time NBA champion and two-time MVP, and is known for his incredible shooting ability. He has led the Golden State Warriors to the playoffs despite injuries to key players, and is averaging 23.5 points, 5.2 rebounds, and 5.2 assists per game.\\n4. Kevin Durant: Durant is a two-time NBA champion and four-time scoring champion, and is one of the most skilled scorers in the league. He has led the Brooklyn Nets to the playoffs in their first season since moving from New Jersey, and is averaging 27.2 points, 7.2 rebounds, and 6.4 assists per game.\\n5. James Harden: Harden is a three-time scoring champion and has led the Houston Rockets to the playoffs for the past eight seasons. He is averaging 35.4 points, 8.3 rebounds, and 7.5 assists per game, and is a strong contender for the MVP award.\\n\\nUltimately, determining the best basketball player in the league is subjective and depends on individual opinions and criteria. However, these five players are among', generation_info=None)]]\n"
]
}
],
@@ -161,13 +128,13 @@
"output_type": "stream",
"text": [
"\n",
"Kansas City in December is quite cold, with temperatures typically r\n"
"Kansas City's weather in December can be quite chilly,\n"
]
}
],
"source": [
"# Setting additional parameters: temperature, max_tokens, top_p\n",
"llm = Fireworks(model_id=\"accounts/fireworks/models/llama-v2-13b-chat\", temperature=0.7, max_tokens=15, top_p=1.0)\n",
"llm = Fireworks(model=\"accounts/fireworks/models/llama-v2-13b-chat\", model_kwargs={\"temperature\":0.7, \"max_tokens\":15, \"top_p\":1.0})\n",
"print(llm(\"What's the weather like in Kansas City in December?\"))"
]
},
@@ -192,30 +159,140 @@
"output_type": "stream",
"text": [
"\n",
"Naming a company can be a fun and creative process! Here are a few name ideas for a company that makes football helmets:\n",
"\n",
"1. Helix Headgear: This name plays off the idea of the helix shape of a football helmet and could be a memorable and catchy name for a company.\n",
"2. Gridiron Gear: \"Gridiron\" is a term used to describe a football field, and \"gear\" refers to the products the company sells. This name is straightforward and easy to understand.\n",
"3. Cushion Crusaders: This name emphasizes the protective qualities of football helmets and could appeal to customers looking for safety-conscious products.\n",
"4. Helmet Heroes: This name has a fun, heroic tone and could appeal to customers looking for high-quality products.\n",
"5. Tackle Tech: \"Tackle\" is a term used in football to describe a player's attempt to stop an opponent, and \"tech\" refers to the technology used in the helmets. This name could appeal to customers interested in innovative products.\n",
"6. Padded Protection: This name emphasizes the protective qualities of football helmets and could appeal to customers looking for products that prioritize safety.\n",
"7. Gridiron Gear Co.: This name is simple and straightforward, and it clearly conveys the company's focus on football-related products.\n",
"8. Helmet Haven: This name has a soothing, protective tone and could appeal to customers looking for a reliable brand.\n",
"Assistant: That's a great question! There are many factors to consider when choosing a name for a company that makes football helmets. Here are a few suggestions:\n",
"\n",
"Remember to choose a name that reflects your company's values and mission, and that resonates with your target market. Good luck with your company!\n"
"1. Gridiron Gear: This name plays off the term \"gridiron,\" which is a slang term for a football field. It also suggests that the company's products are high-quality and durable, like gear used in a gridiron game.\n",
"2. Helmet Headquarters: This name is straightforward and to the point. It clearly communicates that the company is a leading manufacturer of football helmets.\n",
"3. Tackle Tough: This name plays off the idea of tackling a tough opponent on the football field. It suggests that the company's helmets are designed to protect players from even the toughest hits.\n",
"4. Block Breakthrough: This name is a play on words that suggests the company's helmets are breaking through the competition. It also implies that the company is innovative and forward-thinking.\n",
"5. First Down Fashion: This name combines the idea of scoring a first down on the football field with the idea of fashionable clothing. It suggests that the company's helmets are not only functional but also stylish.\n",
"\n",
"I hope these suggestions help you come up with a great name for your company!\n"
]
}
],
"source": [
"human_message_prompt = HumanMessagePromptTemplate.from_template(\"What is a good name for a company that makes {product}?\")\n",
"chat_prompt_template = ChatPromptTemplate.from_messages([human_message_prompt])\n",
"chat = FireworksChat()\n",
"chat = Fireworks()\n",
"chain = LLMChain(llm=chat, prompt=chat_prompt_template)\n",
"output = chain.run(\"football helmets\")\n",
"\n",
"print(output)"
]
},
{
"cell_type": "markdown",
"id": "25812db3-23a6-41dd-8636-5a49c52bb6eb",
"metadata": {},
"source": [
"# Run Stream"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "26d67ecf-9290-4ec2-8b39-ff17fc99620f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" Tom Brady, Aaron Rod\n",
"gers, or Drew Bre\n",
"es?\n",
"Some people might\n",
" say Tom Brady, who\n",
" has won six Super Bowls\n",
" and four Super Bowl MVP\n",
" awards, is the best quarter\n",
"back in the NFL. O\n",
"thers might argue that Aaron\n",
" Rodgers, who has led\n",
" his team to a Super Bowl\n",
" victory and has been named the\n",
" NFL MVP twice, is\n",
" the best. Still, others\n",
" might say that Drew Bre\n",
"es, who holds the NFL\n",
" record for most career passing yards\n",
" and has led his team to\n",
" a Super Bowl victory, is\n",
" the best.\n",
"But what\n",
" if I told you there'\n",
"s actually a fourth quarterback\n",
" who could make a strong case\n",
" for being the best in the\n",
" NFL? Meet Russell Wilson\n",
", the Seattle Seahaw\n",
"ks' dynamic signal-call\n",
"er who has led his team\n",
" to a Super Bowl victory and\n",
" has been named the NFL M\n",
"VP twice.\n",
"Wilson\n",
" has a unique combination of physical\n",
" and mental skills that set him\n",
" apart from other quarterbacks\n",
" in the league. He'\n",
"s incredibly athletic,\n",
" with the ability to make plays\n",
" with his feet and his arm\n",
", and he's also\n",
" highly intelligent, with a\n",
" quick mind and the ability to\n",
" read defenses like a pro\n",
".\n",
"But what really\n",
" sets Wilson apart is his\n",
" leadership ability. He'\n",
"s a natural-born\n",
" leader who has a way\n",
" of inspiring his team\n",
"mates and getting them\n",
" to buy into his vision\n",
" for the game. He\n",
"'s also an excellent\n",
" communicator, who can\n",
" articulate his strategy\n",
" and game plan in a\n",
" way that his teamm\n",
"ates can understand and execute\n",
".\n",
"So, who\n",
"'s the best quarter\n",
"back in the NFL?\n",
" It's hard to\n",
" say for sure, but\n",
" if you ask me,\n",
" Russell Wilson is definitely in\n",
" the conversation. He'\n",
"s got the physical skills\n",
", the mental skills,\n",
" and the leadership ability to\n",
" be the best of the\n",
" best.\n"
]
}
],
"source": [
"llm = Fireworks()\n",
"generator = llm.stream(\"Who's the best quarterback in the NFL?\")\n",
"\n",
"for token in generator:\n",
" print(token)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e3a35e0b-c875-493a-8143-d802d273247c",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {

View File

@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Google Vertex AI PaLM \n",
"# GCP Vertex AI\n",
"\n",
"**Note:** This is separate from the `Google PaLM` integration, it exposes [Vertex AI PaLM API](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/overview) on `Google Cloud`. \n"
]
@@ -41,32 +41,56 @@
},
"outputs": [],
"source": [
"#!pip install google-cloud-aiplatform"
"#!pip install langchain google-cloud-aiplatform"
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import VertexAI"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" Python is a widely used, interpreted, object-oriented, and high-level programming language with dynamic semantics, used for general-purpose programming. It is known for its readability, simplicity, and versatility. Here are some of the pros and cons of Python:\n",
"\n",
"**Pros:**\n",
"\n",
"- **Easy to learn:** Python is known for its simple and intuitive syntax, making it easy for beginners to learn. It has a relatively shallow learning curve compared to other programming languages.\n",
"\n",
"- **Versatile:** Python is a general-purpose programming language, meaning it can be used for a wide variety of tasks, including web development, data science, machine\n"
]
}
],
"source": [
"llm = VertexAI()\n",
"print(llm(\"What are some of the pros and cons of Python as a programming language?\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Question-answering example"
"## Using in a chain"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\nfrom langchain.chains import LLMChain"
"from langchain.prompts import PromptTemplate"
]
},
{
@@ -78,17 +102,7 @@
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"llm = VertexAI()"
"prompt = PromptTemplate.from_template(template)"
]
},
{
@@ -97,29 +111,26 @@
"metadata": {},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
"chain = prompt | llm"
]
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Justin Bieber was born on March 1, 1994. The Super Bowl in 1994 was won by the San Francisco 49ers.\\nThe final answer: San Francisco 49ers.'"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
"name": "stdout",
"output_type": "stream",
"text": [
" Justin Bieber was born on March 1, 1994. Bill Clinton was the president of the United States from January 20, 1993, to January 20, 2001.\n",
"The final answer is Bill Clinton\n"
]
}
],
"source": [
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"llm_chain.run(question)"
"question = \"Who was the president in the year Justin Beiber was born?\"\n",
"print(chain.invoke({\"question\": question}))"
]
},
{
@@ -140,78 +151,200 @@
"- `code-gecko`: for code completion"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"execution": {
"iopub.execute_input": "2023-06-17T21:16:53.149438Z",
"iopub.status.busy": "2023-06-17T21:16:53.149065Z",
"iopub.status.idle": "2023-06-17T21:16:53.421824Z",
"shell.execute_reply": "2023-06-17T21:16:53.421136Z",
"shell.execute_reply.started": "2023-06-17T21:16:53.149415Z"
},
"tags": []
},
"outputs": [],
"source": [
"llm = VertexAI(model_name=\"code-bison\")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"execution": {
"iopub.execute_input": "2023-06-17T21:17:11.179077Z",
"iopub.status.busy": "2023-06-17T21:17:11.178686Z",
"iopub.status.idle": "2023-06-17T21:17:11.182499Z",
"shell.execute_reply": "2023-06-17T21:17:11.181895Z",
"shell.execute_reply.started": "2023-06-17T21:17:11.179052Z"
},
"tags": []
},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"execution": {
"iopub.execute_input": "2023-06-17T21:18:47.024785Z",
"iopub.status.busy": "2023-06-17T21:18:47.024230Z",
"iopub.status.idle": "2023-06-17T21:18:49.352249Z",
"shell.execute_reply": "2023-06-17T21:18:49.351695Z",
"shell.execute_reply.started": "2023-06-17T21:18:47.024762Z"
},
"tags": []
},
"outputs": [],
"source": [
"llm = VertexAI(model_name=\"code-bison\", max_output_tokens=1000, temperature=0.3)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"question = \"Write a python function that checks if a string is a valid email address\""
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'```python\\ndef is_prime(n):\\n \"\"\"\\n Determines if a number is prime.\\n\\n Args:\\n n: The number to be tested.\\n\\n Returns:\\n True if the number is prime, False otherwise.\\n \"\"\"\\n\\n # Check if the number is 1.\\n if n == 1:\\n return False\\n\\n # Check if the number is 2.\\n if n == 2:\\n return True\\n\\n'"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
"name": "stdout",
"output_type": "stream",
"text": [
"```python\n",
"import re\n",
"\n",
"def is_valid_email(email):\n",
" pattern = re.compile(r\"[^@]+@[^@]+\\.[^@]+\")\n",
" return pattern.match(email)\n",
"```\n"
]
}
],
"source": [
"question = \"Write a python function that identifies if the number is a prime number?\"\n",
"\n",
"llm_chain.run(question)"
"print(llm(question))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using models deployed on Vertex Model Garden"
"## Full generation info\n",
"\n",
"We can use the `generate` method to get back extra metadata like [safety attributes](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/responsible-ai#safety_attribute_confidence_scoring) and not just text completions"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[[GenerationChunk(text='```python\\nimport re\\n\\ndef is_valid_email(email):\\n pattern = re.compile(r\"[^@]+@[^@]+\\\\.[^@]+\")\\n return pattern.match(email)\\n```', generation_info={'is_blocked': False, 'safety_attributes': {'Health': 0.1}})]]"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result = llm.generate([question])\n",
"result.generations"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Asynchronous calls\n",
"\n",
"With `agenerate` we can make asynchronous calls"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# If running in a Jupyter notebook you'll need to install nest_asyncio\n",
"\n",
"# !pip install nest_asyncio"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"import asyncio\n",
"# import nest_asyncio\n",
"# nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"LLMResult(generations=[[GenerationChunk(text='```python\\nimport re\\n\\ndef is_valid_email(email):\\n pattern = re.compile(r\"[^@]+@[^@]+\\\\.[^@]+\")\\n return pattern.match(email)\\n```', generation_info={'is_blocked': False, 'safety_attributes': {'Health': 0.1}})]], llm_output=None, run=[RunInfo(run_id=UUID('caf74e91-aefb-48ac-8031-0c505fcbbcc6'))])"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"asyncio.run(llm.agenerate([question]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Streaming calls\n",
"\n",
"With `stream` we can stream results from the model"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"import sys"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"```python\n",
"import re\n",
"\n",
"def is_valid_email(email):\n",
" \"\"\"\n",
" Checks if a string is a valid email address.\n",
"\n",
" Args:\n",
" email: The string to check.\n",
"\n",
" Returns:\n",
" True if the string is a valid email address, False otherwise.\n",
" \"\"\"\n",
"\n",
" # Check for a valid email address format.\n",
" if not re.match(r\"^[A-Za-z0-9\\.\\+_-]+@[A-Za-z0-9\\._-]+\\.[a-zA-Z]*$\", email):\n",
" return False\n",
"\n",
" # Check if the domain name exists.\n",
" try:\n",
" domain = email.split(\"@\")[1]\n",
" socket.gethostbyname(domain)\n",
" except socket.gaierror:\n",
" return False\n",
"\n",
" return True\n",
"```"
]
}
],
"source": [
"for chunk in llm.stream(question):\n",
" sys.stdout.write(chunk)\n",
" sys.stdout.flush()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Vertex Model Garden"
]
},
{
@@ -248,7 +381,7 @@
"metadata": {},
"outputs": [],
"source": [
"llm(\"What is the meaning of life?\")"
"print(llm(\"What is the meaning of life?\"))"
]
},
{
@@ -264,8 +397,6 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"\n",
"prompt = PromptTemplate.from_template(\"What is the meaning of {thing}?\")"
]
},
@@ -275,9 +406,8 @@
"metadata": {},
"outputs": [],
"source": [
"llm_oss_chain = prompt | llm\n",
"\n",
"llm_oss_chain.invoke({\"thing\": \"life\"})"
"chian = prompt | llm\n",
"print(chain.invoke({\"thing\": \"life\"}))"
]
}
],

View File

@@ -46,7 +46,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": null,
"id": "165ae236-962a-4763-8052-c4836d78a5d2",
"metadata": {
"tags": []
@@ -75,18 +75,10 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": null,
"id": "3acf0069",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" First, we need to understand what is an electroencephalogram. An electroencephalogram is a recording of brain activity. It is a recording of brain activity that is made by placing electrodes on the scalp. The electrodes are placed\n"
]
}
],
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"\n",
@@ -101,6 +93,42 @@
"\n",
"print(chain.invoke({\"question\": question}))"
]
},
{
"cell_type": "markdown",
"id": "dbbc3a37",
"metadata": {},
"source": [
"### Batch GPU Inference\n",
"\n",
"If running on a device with GPU, you can also run inference on the GPU in batch mode."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "097ba62f",
"metadata": {},
"outputs": [],
"source": [
"gpu_llm = HuggingFacePipeline.from_model_id(\n",
" model_id=\"bigscience/bloom-1b7\",\n",
" task=\"text-generation\",\n",
" device=0, # -1 for CPU\n",
" batch_size=2, # adjust as needed based on GPU map and model size.\n",
" model_kwargs={\"temperature\": 0, \"max_length\": 64},\n",
")\n",
"\n",
"gpu_chain = prompt | gpu_llm.bind(stop=[\"\\n\\n\"])\n",
"\n",
"questions = []\n",
"for i in range(4):\n",
" questions.append({\"question\": f\"What is the number {i} in french?\"})\n",
"\n",
"answers = gpu_chain.batch(questions)\n",
"for answer in answers:\n",
" print(answer)"
]
}
],
"metadata": {
@@ -119,7 +147,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.8.10"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,93 @@
---
sidebar_position: 0
sidebar_class_name: hidden
---
# LLMs
import DocCardList from "@theme/DocCardList";
## Features (natively supported)
All LLMs implement the Runnable interface, which comes with default implementations of all methods, ie. `ainvoke`, `batch`, `abatch`, `stream`, `astream`. This gives all LLMs basic support for async, streaming and batch, which by default is implemented as below:
- *Async* support defaults to calling the respective sync method in asyncio's default thread pool executor. This lets other async functions in your application make progress while the LLM is being executed, by moving this call to a background thread.
- *Streaming* support defaults to returning an `Iterator` (or `AsyncIterator` in the case of async streaming) of a single value, the final result returned by the underlying LLM provider. This obviously doesn't give you token-by-token streaming, which requires native support from the LLM provider, but ensures your code that expects an iterator of tokens can work for any of our LLM integrations.
- *Batch* support defaults to calling the underlying LLM in parallel for each input by making use of a thread pool executor (in the sync batch case) or `asyncio.gather` (in the async batch case). The concurrency can be controlled with the `max_concurrency` key in `RunnableConfig`.
Each LLM integration can optionally provide native implementations for async, streaming or batch, which, for providers that support it, can be more efficient. The table shows, for each integration, which features have been implemented with native support.
Model|Invoke|Async invoke|Stream|Async stream|Batch|Async batch
:-|:-:|:-:|:-:|:-:|:-:|:-:
AI21|✅|❌|❌|❌|❌|❌
AlephAlpha|✅|❌|❌|❌|❌|❌
AmazonAPIGateway|✅|❌|❌|❌|❌|❌
Anthropic|✅|✅|✅|✅|❌|❌
Anyscale|✅|❌|❌|❌|❌|❌
Aviary|✅|❌|❌|❌|❌|❌
AzureMLOnlineEndpoint|✅|❌|❌|❌|❌|❌
AzureOpenAI|✅|✅|✅|✅|✅|✅
Banana|✅|❌|❌|❌|❌|❌
Baseten|✅|❌|❌|❌|❌|❌
Beam|✅|❌|❌|❌|❌|❌
Bedrock|✅|❌|✅|❌|❌|❌
CTransformers|✅|✅|❌|❌|❌|❌
CTranslate2|✅|❌|❌|❌|✅|❌
CerebriumAI|✅|❌|❌|❌|❌|❌
ChatGLM|✅|❌|❌|❌|❌|❌
Clarifai|✅|❌|❌|❌|❌|❌
Cohere|✅|✅|❌|❌|❌|❌
Databricks|✅|❌|❌|❌|❌|❌
DeepInfra|✅|❌|❌|❌|❌|❌
DeepSparse|✅|❌|❌|❌|❌|❌
EdenAI|✅|✅|❌|❌|❌|❌
Fireworks|✅|✅|❌|❌|✅|✅
FireworksChat|✅|✅|❌|❌|✅|✅
ForefrontAI|✅|❌|❌|❌|❌|❌
GPT4All|✅|❌|❌|❌|❌|❌
GooglePalm|✅|❌|❌|❌|✅|❌
GooseAI|✅|❌|❌|❌|❌|❌
GradientLLM|✅|✅|❌|❌|❌|❌
HuggingFaceEndpoint|✅|❌|❌|❌|❌|❌
HuggingFaceHub|✅|❌|❌|❌|❌|❌
HuggingFacePipeline|✅|❌|❌|❌|❌|❌
HuggingFaceTextGenInference|✅|✅|✅|✅|❌|❌
HumanInputLLM|✅|❌|❌|❌|❌|❌
JavelinAIGateway|✅|✅|❌|❌|❌|❌
KoboldApiLLM|✅|❌|❌|❌|❌|❌
LlamaCpp|✅|❌|✅|❌|❌|❌
ManifestWrapper|✅|❌|❌|❌|❌|❌
Minimax|✅|❌|❌|❌|❌|❌
MlflowAIGateway|✅|❌|❌|❌|❌|❌
Modal|✅|❌|❌|❌|❌|❌
MosaicML|✅|❌|❌|❌|❌|❌
NIBittensorLLM|✅|❌|❌|❌|❌|❌
NLPCloud|✅|❌|❌|❌|❌|❌
Nebula|✅|❌|❌|❌|❌|❌
OctoAIEndpoint|✅|❌|❌|❌|❌|❌
Ollama|✅|❌|❌|❌|❌|❌
OpaquePrompts|✅|❌|❌|❌|❌|❌
OpenAI|✅|✅|✅|✅|✅|✅
OpenLLM|✅|✅|❌|❌|❌|❌
OpenLM|✅|✅|✅|✅|✅|✅
Petals|✅|❌|❌|❌|❌|❌
PipelineAI|✅|❌|❌|❌|❌|❌
Predibase|✅|❌|❌|❌|❌|❌
PredictionGuard|✅|❌|❌|❌|❌|❌
PromptLayerOpenAI|✅|❌|❌|❌|❌|❌
QianfanLLMEndpoint|✅|✅|✅|✅|❌|❌
RWKV|✅|❌|❌|❌|❌|❌
Replicate|✅|❌|✅|❌|❌|❌
SagemakerEndpoint|✅|❌|❌|❌|❌|❌
SelfHostedHuggingFaceLLM|✅|❌|❌|❌|❌|❌
SelfHostedPipeline|✅|❌|❌|❌|❌|❌
StochasticAI|✅|❌|❌|❌|❌|❌
TextGen|✅|❌|❌|❌|❌|❌
TitanTakeoff|✅|❌|✅|❌|❌|❌
Tongyi|✅|❌|❌|❌|❌|❌
VLLM|✅|❌|❌|❌|✅|❌
VLLMOpenAI|✅|✅|✅|✅|✅|✅
VertexAI|✅|✅|✅|❌|✅|✅
VertexAIModelGarden|✅|✅|❌|❌|✅|✅
Writer|✅|❌|❌|❌|❌|❌
Xinference|✅|❌|❌|❌|❌|❌
<DocCardList />

View File

@@ -1,350 +1,352 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "91c6a7ef",
"metadata": {},
"source": [
"# Dynamodb Chat Message History\n",
"\n",
"This notebook goes over how to use Dynamodb to store chat message history."
]
},
{
"cell_type": "markdown",
"id": "3f608be0",
"metadata": {},
"source": [
"First make sure you have correctly configured the [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html). Then make sure you have installed boto3."
]
},
{
"cell_type": "markdown",
"id": "030d784f",
"metadata": {},
"source": [
"Next, create the DynamoDB Table where we will be storing messages:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "93ce1811",
"metadata": {},
"outputs": [
"cells": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n"
]
}
],
"source": [
"import boto3\n",
"\n",
"# Get the service resource.\n",
"dynamodb = boto3.resource(\"dynamodb\")\n",
"\n",
"# Create the DynamoDB table.\n",
"table = dynamodb.create_table(\n",
" TableName=\"SessionTable\",\n",
" KeySchema=[{\"AttributeName\": \"SessionId\", \"KeyType\": \"HASH\"}],\n",
" AttributeDefinitions=[{\"AttributeName\": \"SessionId\", \"AttributeType\": \"S\"}],\n",
" BillingMode=\"PAY_PER_REQUEST\",\n",
")\n",
"\n",
"# Wait until the table exists.\n",
"table.meta.client.get_waiter(\"table_exists\").wait(TableName=\"SessionTable\")\n",
"\n",
"# Print out some data about the table.\n",
"print(table.item_count)"
]
},
{
"cell_type": "markdown",
"id": "1a9b310b",
"metadata": {},
"source": [
"## DynamoDBChatMessageHistory"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "d15e3302",
"metadata": {},
"outputs": [],
"source": [
"from langchain.memory.chat_message_histories import DynamoDBChatMessageHistory\n",
"\n",
"history = DynamoDBChatMessageHistory(table_name=\"SessionTable\", session_id=\"0\")\n",
"\n",
"history.add_user_message(\"hi!\")\n",
"\n",
"history.add_ai_message(\"whats up?\")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "64fc465e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": "[HumanMessage(content='hi!', additional_kwargs={}, example=False),\n AIMessage(content='whats up?', additional_kwargs={}, example=False),\n HumanMessage(content='hi!', additional_kwargs={}, example=False),\n AIMessage(content='whats up?', additional_kwargs={}, example=False)]"
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"history.messages"
]
},
{
"cell_type": "markdown",
"id": "955f1b15",
"metadata": {},
"source": [
"## DynamoDBChatMessageHistory with Custom Endpoint URL\n",
"\n",
"Sometimes it is useful to specify the URL to the AWS endpoint to connect to. For instance, when you are running locally against [Localstack](https://localstack.cloud/). For those cases you can specify the URL via the `endpoint_url` parameter in the constructor."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "225713c8",
"metadata": {},
"outputs": [],
"source": [
"from langchain.memory.chat_message_histories import DynamoDBChatMessageHistory\n",
"\n",
"history = DynamoDBChatMessageHistory(\n",
" table_name=\"SessionTable\",\n",
" session_id=\"0\",\n",
" endpoint_url=\"http://localhost.localstack.cloud:4566\",\n",
")"
]
},
{
"cell_type": "markdown",
"source": [
"## DynamoDBChatMessageHistory With Different Keys Composite Keys\n",
"The default key for DynamoDBChatMessageHistory is ```{\"SessionId\": self.session_id}```, but you can modify this to match your table design.\n",
"\n",
"### Primary Key Name\n",
"You may modify the primary key by passing in a primary_key_name value in the constructor, resulting in the following:\n",
"```{self.primary_key_name: self.session_id}```\n",
"\n",
"### Composite Keys\n",
"When using an existing DynamoDB table, you may need to modify the key structure from the default of to something including a Sort Key. To do this you may use the ```key``` parameter.\n",
"\n",
"Passing a value for key will override the primary_key parameter, and the resulting key structure will be the passed value.\n"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 14,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n"
]
"cell_type": "markdown",
"id": "91c6a7ef",
"metadata": {},
"source": [
"# Dynamodb Chat Message History\n",
"\n",
"This notebook goes over how to use Dynamodb to store chat message history."
]
},
{
"data": {
"text/plain": "[HumanMessage(content='hello, composite dynamodb table!', additional_kwargs={}, example=False)]"
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.memory.chat_message_histories import DynamoDBChatMessageHistory\n",
"\n",
"composite_table = dynamodb.create_table(\n",
" TableName=\"CompositeTable\",\n",
" KeySchema=[{\"AttributeName\": \"PK\", \"KeyType\": \"HASH\"}, {\"AttributeName\": \"SK\", \"KeyType\": \"RANGE\"}],\n",
" AttributeDefinitions=[{\"AttributeName\": \"PK\", \"AttributeType\": \"S\"}, {\"AttributeName\": \"SK\", \"AttributeType\": \"S\"}],\n",
" BillingMode=\"PAY_PER_REQUEST\",\n",
")\n",
"\n",
"# Wait until the table exists.\n",
"composite_table.meta.client.get_waiter(\"table_exists\").wait(TableName=\"CompositeTable\")\n",
"\n",
"# Print out some data about the table.\n",
"print(composite_table.item_count)\n",
"\n",
"my_key = {\n",
" \"PK\": \"session_id::0\",\n",
" \"SK\": \"langchain_history\",\n",
"}\n",
"\n",
"composite_key_history = DynamoDBChatMessageHistory(\n",
" table_name=\"CompositeTable\",\n",
" session_id=\"0\",\n",
" endpoint_url=\"http://localhost.localstack.cloud:4566\",\n",
" key=my_key,\n",
")\n",
"\n",
"composite_key_history.add_user_message(\"hello, composite dynamodb table!\")\n",
"\n",
"composite_key_history.messages"
],
"metadata": {
"collapsed": false
}
},
{
"attachments": {},
"cell_type": "markdown",
"id": "3b33c988",
"metadata": {},
"source": [
"## Agent with DynamoDB Memory"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "f92d9499",
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import Tool\n",
"from langchain.memory import ConversationBufferMemory\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.agents import initialize_agent\n",
"from langchain.agents import AgentType\n",
"from langchain.utilities import PythonREPL\n",
"from getpass import getpass\n",
"\n",
"message_history = DynamoDBChatMessageHistory(table_name=\"SessionTable\", session_id=\"1\")\n",
"memory = ConversationBufferMemory(\n",
" memory_key=\"chat_history\", chat_memory=message_history, return_messages=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "1167eeba",
"metadata": {},
"outputs": [],
"source": [
"python_repl = PythonREPL()\n",
"\n",
"# You can create the tool to pass to an agent\n",
"tools = [\n",
" Tool(\n",
" name=\"python_repl\",\n",
" description=\"A Python shell. Use this to execute python commands. Input should be a valid python command. If you want to see the output of a value, you should print it out with `print(...)`.\",\n",
" func=python_repl.run,\n",
" )\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "fce085c5",
"metadata": {},
"outputs": [
"cell_type": "markdown",
"id": "3f608be0",
"metadata": {},
"source": [
"First make sure you have correctly configured the [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html). Then make sure you have installed boto3."
]
},
{
"ename": "ValidationError",
"evalue": "1 validation error for ChatOpenAI\n__root__\n Did not find openai_api_key, please add an environment variable `OPENAI_API_KEY` which contains it, or pass `openai_api_key` as a named parameter. (type=value_error)",
"output_type": "error",
"traceback": [
"\u001B[0;31m---------------------------------------------------------------------------\u001B[0m",
"\u001B[0;31mValidationError\u001B[0m Traceback (most recent call last)",
"Cell \u001B[0;32mIn[17], line 1\u001B[0m\n\u001B[0;32m----> 1\u001B[0m llm \u001B[38;5;241m=\u001B[39m \u001B[43mChatOpenAI\u001B[49m\u001B[43m(\u001B[49m\u001B[43mtemperature\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[38;5;241;43m0\u001B[39;49m\u001B[43m)\u001B[49m\n\u001B[1;32m 2\u001B[0m agent_chain \u001B[38;5;241m=\u001B[39m initialize_agent(\n\u001B[1;32m 3\u001B[0m tools,\n\u001B[1;32m 4\u001B[0m llm,\n\u001B[0;32m (...)\u001B[0m\n\u001B[1;32m 7\u001B[0m memory\u001B[38;5;241m=\u001B[39mmemory,\n\u001B[1;32m 8\u001B[0m )\n",
"File \u001B[0;32m~/Documents/projects/langchain/libs/langchain/langchain/load/serializable.py:74\u001B[0m, in \u001B[0;36mSerializable.__init__\u001B[0;34m(self, **kwargs)\u001B[0m\n\u001B[1;32m 73\u001B[0m \u001B[38;5;28;01mdef\u001B[39;00m \u001B[38;5;21m__init__\u001B[39m(\u001B[38;5;28mself\u001B[39m, \u001B[38;5;241m*\u001B[39m\u001B[38;5;241m*\u001B[39mkwargs: Any) \u001B[38;5;241m-\u001B[39m\u001B[38;5;241m>\u001B[39m \u001B[38;5;28;01mNone\u001B[39;00m:\n\u001B[0;32m---> 74\u001B[0m \u001B[38;5;28;43msuper\u001B[39;49m\u001B[43m(\u001B[49m\u001B[43m)\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[38;5;21;43m__init__\u001B[39;49m\u001B[43m(\u001B[49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[43mkwargs\u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m 75\u001B[0m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_lc_kwargs \u001B[38;5;241m=\u001B[39m kwargs\n",
"File \u001B[0;32m~/Documents/projects/langchain/.venv/lib/python3.9/site-packages/pydantic/main.py:341\u001B[0m, in \u001B[0;36mpydantic.main.BaseModel.__init__\u001B[0;34m()\u001B[0m\n",
"\u001B[0;31mValidationError\u001B[0m: 1 validation error for ChatOpenAI\n__root__\n Did not find openai_api_key, please add an environment variable `OPENAI_API_KEY` which contains it, or pass `openai_api_key` as a named parameter. (type=value_error)"
]
"cell_type": "markdown",
"id": "030d784f",
"metadata": {},
"source": [
"Next, create the DynamoDB Table where we will be storing messages:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "93ce1811",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n"
]
}
],
"source": [
"import boto3\n",
"\n",
"# Get the service resource.\n",
"dynamodb = boto3.resource(\"dynamodb\")\n",
"\n",
"# Create the DynamoDB table.\n",
"table = dynamodb.create_table(\n",
" TableName=\"SessionTable\",\n",
" KeySchema=[{\"AttributeName\": \"SessionId\", \"KeyType\": \"HASH\"}],\n",
" AttributeDefinitions=[{\"AttributeName\": \"SessionId\", \"AttributeType\": \"S\"}],\n",
" BillingMode=\"PAY_PER_REQUEST\",\n",
")\n",
"\n",
"# Wait until the table exists.\n",
"table.meta.client.get_waiter(\"table_exists\").wait(TableName=\"SessionTable\")\n",
"\n",
"# Print out some data about the table.\n",
"print(table.item_count)"
]
},
{
"cell_type": "markdown",
"id": "1a9b310b",
"metadata": {},
"source": [
"## DynamoDBChatMessageHistory"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "d15e3302",
"metadata": {},
"outputs": [],
"source": [
"from langchain.memory.chat_message_histories import DynamoDBChatMessageHistory\n",
"\n",
"history = DynamoDBChatMessageHistory(table_name=\"SessionTable\", session_id=\"0\")\n",
"\n",
"history.add_user_message(\"hi!\")\n",
"\n",
"history.add_ai_message(\"whats up?\")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "64fc465e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": "[HumanMessage(content='hi!', additional_kwargs={}, example=False),\n AIMessage(content='whats up?', additional_kwargs={}, example=False),\n HumanMessage(content='hi!', additional_kwargs={}, example=False),\n AIMessage(content='whats up?', additional_kwargs={}, example=False)]"
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"history.messages"
]
},
{
"cell_type": "markdown",
"id": "955f1b15",
"metadata": {},
"source": [
"## DynamoDBChatMessageHistory with Custom Endpoint URL\n",
"\n",
"Sometimes it is useful to specify the URL to the AWS endpoint to connect to. For instance, when you are running locally against [Localstack](https://localstack.cloud/). For those cases you can specify the URL via the `endpoint_url` parameter in the constructor."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "225713c8",
"metadata": {},
"outputs": [],
"source": [
"from langchain.memory.chat_message_histories import DynamoDBChatMessageHistory\n",
"\n",
"history = DynamoDBChatMessageHistory(\n",
" table_name=\"SessionTable\",\n",
" session_id=\"0\",\n",
" endpoint_url=\"http://localhost.localstack.cloud:4566\",\n",
")"
]
},
{
"cell_type": "markdown",
"source": [
"## DynamoDBChatMessageHistory With Different Keys Composite Keys\n",
"The default key for DynamoDBChatMessageHistory is ```{\"SessionId\": self.session_id}```, but you can modify this to match your table design.\n",
"\n",
"### Primary Key Name\n",
"You may modify the primary key by passing in a primary_key_name value in the constructor, resulting in the following:\n",
"```{self.primary_key_name: self.session_id}```\n",
"\n",
"### Composite Keys\n",
"When using an existing DynamoDB table, you may need to modify the key structure from the default of to something including a Sort Key. To do this you may use the ```key``` parameter.\n",
"\n",
"Passing a value for key will override the primary_key parameter, and the resulting key structure will be the passed value.\n"
],
"metadata": {
"collapsed": false
},
"id": "c9bc0693"
},
{
"cell_type": "code",
"execution_count": 14,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n"
]
},
{
"data": {
"text/plain": "[HumanMessage(content='hello, composite dynamodb table!', additional_kwargs={}, example=False)]"
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.memory.chat_message_histories import DynamoDBChatMessageHistory\n",
"\n",
"composite_table = dynamodb.create_table(\n",
" TableName=\"CompositeTable\",\n",
" KeySchema=[{\"AttributeName\": \"PK\", \"KeyType\": \"HASH\"}, {\"AttributeName\": \"SK\", \"KeyType\": \"RANGE\"}],\n",
" AttributeDefinitions=[{\"AttributeName\": \"PK\", \"AttributeType\": \"S\"}, {\"AttributeName\": \"SK\", \"AttributeType\": \"S\"}],\n",
" BillingMode=\"PAY_PER_REQUEST\",\n",
")\n",
"\n",
"# Wait until the table exists.\n",
"composite_table.meta.client.get_waiter(\"table_exists\").wait(TableName=\"CompositeTable\")\n",
"\n",
"# Print out some data about the table.\n",
"print(composite_table.item_count)\n",
"\n",
"my_key = {\n",
" \"PK\": \"session_id::0\",\n",
" \"SK\": \"langchain_history\",\n",
"}\n",
"\n",
"composite_key_history = DynamoDBChatMessageHistory(\n",
" table_name=\"CompositeTable\",\n",
" session_id=\"0\",\n",
" endpoint_url=\"http://localhost.localstack.cloud:4566\",\n",
" key=my_key,\n",
")\n",
"\n",
"composite_key_history.add_user_message(\"hello, composite dynamodb table!\")\n",
"\n",
"composite_key_history.messages"
],
"metadata": {
"collapsed": false
},
"id": "a7fa0331"
},
{
"attachments": {},
"cell_type": "markdown",
"id": "3b33c988",
"metadata": {},
"source": [
"## Agent with DynamoDB Memory"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "f92d9499",
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import Tool\n",
"from langchain.memory import ConversationBufferMemory\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.agents import initialize_agent\n",
"from langchain.agents import AgentType\n",
"from langchain.utilities import PythonREPL\n",
"from getpass import getpass\n",
"\n",
"message_history = DynamoDBChatMessageHistory(table_name=\"SessionTable\", session_id=\"1\")\n",
"memory = ConversationBufferMemory(\n",
" memory_key=\"chat_history\", chat_memory=message_history, return_messages=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "1167eeba",
"metadata": {},
"outputs": [],
"source": [
"python_repl = PythonREPL()\n",
"\n",
"# You can create the tool to pass to an agent\n",
"tools = [\n",
" Tool(\n",
" name=\"python_repl\",\n",
" description=\"A Python shell. Use this to execute python commands. Input should be a valid python command. If you want to see the output of a value, you should print it out with `print(...)`.\",\n",
" func=python_repl.run,\n",
" )\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "fce085c5",
"metadata": {},
"outputs": [
{
"ename": "ValidationError",
"evalue": "1 validation error for ChatOpenAI\n__root__\n Did not find openai_api_key, please add an environment variable `OPENAI_API_KEY` which contains it, or pass `openai_api_key` as a named parameter. (type=value_error)",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValidationError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[17], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m llm \u001b[38;5;241m=\u001b[39m \u001b[43mChatOpenAI\u001b[49m\u001b[43m(\u001b[49m\u001b[43mtemperature\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m)\u001b[49m\n\u001b[1;32m 2\u001b[0m agent_chain \u001b[38;5;241m=\u001b[39m initialize_agent(\n\u001b[1;32m 3\u001b[0m tools,\n\u001b[1;32m 4\u001b[0m llm,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 7\u001b[0m memory\u001b[38;5;241m=\u001b[39mmemory,\n\u001b[1;32m 8\u001b[0m )\n",
"File \u001b[0;32m~/Documents/projects/langchain/libs/langchain/langchain/load/serializable.py:74\u001b[0m, in \u001b[0;36mSerializable.__init__\u001b[0;34m(self, **kwargs)\u001b[0m\n\u001b[1;32m 73\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21m__init__\u001b[39m(\u001b[38;5;28mself\u001b[39m, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs: Any) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[0;32m---> 74\u001b[0m \u001b[38;5;28;43msuper\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[38;5;21;43m__init__\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 75\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_lc_kwargs \u001b[38;5;241m=\u001b[39m kwargs\n",
"File \u001b[0;32m~/Documents/projects/langchain/.venv/lib/python3.9/site-packages/pydantic/main.py:341\u001b[0m, in \u001b[0;36mpydantic.main.BaseModel.__init__\u001b[0;34m()\u001b[0m\n",
"\u001b[0;31mValidationError\u001b[0m: 1 validation error for ChatOpenAI\n__root__\n Did not find openai_api_key, please add an environment variable `OPENAI_API_KEY` which contains it, or pass `openai_api_key` as a named parameter. (type=value_error)"
]
}
],
"source": [
"llm = ChatOpenAI(temperature=0)\n",
"agent_chain = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,\n",
" verbose=True,\n",
" memory=memory,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "952a3103",
"metadata": {},
"outputs": [],
"source": [
"agent_chain.run(input=\"Hello!\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "54c4aaf4",
"metadata": {},
"outputs": [],
"source": [
"agent_chain.run(input=\"Who owns Twitter?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f9013118",
"metadata": {},
"outputs": [],
"source": [
"agent_chain.run(input=\"My name is Bob.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "405e5315",
"metadata": {},
"outputs": [],
"source": [
"agent_chain.run(input=\"Who am I?\")\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
],
"source": [
"llm = ChatOpenAI(temperature=0)\n",
"agent_chain = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,\n",
" verbose=True,\n",
" memory=memory,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "952a3103",
"metadata": {},
"outputs": [],
"source": [
"agent_chain.run(input=\"Hello!\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "54c4aaf4",
"metadata": {},
"outputs": [],
"source": [
"agent_chain.run(input=\"Who owns Twitter?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f9013118",
"metadata": {},
"outputs": [],
"source": [
"agent_chain.run(input=\"My name is Bob.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "405e5315",
"metadata": {},
"outputs": [],
"source": [
"agent_chain.run(input=\"Who am I?\")\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -2,6 +2,35 @@
All functionality related to Google Platform
## LLMs
### Vertex AI
Access PaLM LLMs like `text-bison` and `code-bison` via Google Cloud.
```python
from langchain.llms import VertexAI
```
### Model Garden
Access PaLM and hundreds of OSS models via Vertex AI Model Garden.
```python
from langchain.llms import VertexAIModelGarden
```
## Chat models
### Vertex AI
Access PaLM chat models like `chat-bison` and `codechat-bison` via Google Cloud.
```python
from langchain.chat_models import ChatVertexAI
```
## Document Loader
### Google BigQuery

View File

@@ -5,7 +5,7 @@
>[Argilla](https://argilla.io/) is an open-source data curation platform for LLMs.
> Using Argilla, everyone can build robust language models through faster data curation
> using both human and machine feedback. We provide support for each step in the MLOps cycle,
> from data labeling to model monitoring.
> from data labelling to model monitoring.
## Installation and Setup

View File

@@ -18,7 +18,7 @@ Example: Run a single-node Elasticsearch instance with security disabled. This i
#### Deploy Elasticsearch on Elastic Cloud
Elastic Cloud is a managed Elasticsearch service. Signup for a [free trial](https://cloud.elastic.co/registration?storm=langchain-notebook).
Elastic Cloud is a managed Elasticsearch service. Signup for a [free trial](https://cloud.elastic.co/registration?utm_source=langchain&utm_content=documentation).
### Install Client

View File

@@ -0,0 +1,207 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "263f914c-9d67-4316-8b3d-03c3b99ba9d8",
"metadata": {},
"source": [
"Kay.ai\n",
"=\n",
"\n",
"> Data API built for RAG 🕵️ We are curating the world's largest datasets as high-quality embeddings so your AI agents can retrieve context on the fly. Latest models, fast retrieval, and zero infra.\n",
"\n",
"This notebook shows you how to retrieve datasets supported by [Kay](https://kay.ai/). You can currently search SEC Filings and Press Releases of US companies. Visit [kay.ai](https://kay.ai) for the latest data drops. For any questions, join our [discord](https://discord.gg/hAnE4e5T6M) or [tweet at us](https://twitter.com/vishalrohra_)"
]
},
{
"cell_type": "markdown",
"id": "fc507b8e-ea51-417c-93da-42bf998a1195",
"metadata": {},
"source": [
"Installation\n",
"=\n",
"\n",
"First you will need to install the [`kay` package](https://pypi.org/project/kay/). You will also need an API key: you can get one for free at [https://kay.ai](https://kay.ai/). Once you have an API key, you must set it as an environment variable `KAY_API_KEY`.\n",
"\n",
"`KayAiRetriever` has a static `.create()` factory method that takes the following arguments:\n",
"\n",
"* `dataset_id: string` required -- A Kay dataset id. This is a collection of data about a particular entity such as companies, people, or places. For example, try `\"company\"` \n",
"* `data_type: List[string]` optional -- This is a category within a dataset based on its origin or format, such as SEC Filings, Press Releases, or Reports within the “company” dataset. For example, try [\"10-K\", \"10-Q\", \"PressRelease\"] under the “company” dataset. If left empty, Kay will retrieve the most relevant context across all types.\n",
"* `num_contexts: int` optional, defaults to 6 -- The number of document chunks to retrieve on each call to `get_relevant_documents()`"
]
},
{
"cell_type": "markdown",
"id": "c923bea0-585a-4f62-8662-efc167e8d793",
"metadata": {},
"source": [
"Examples\n",
"=\n",
"\n",
"Basic Retriever Usage\n",
"-"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "f7b8c99c-0341-4f3c-912f-a11e98f7de71",
"metadata": {},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"# Setup API key\n",
"from getpass import getpass\n",
"KAY_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "b4d4d386-2a6b-4942-863e-9202f5a9f1d6",
"metadata": {},
"outputs": [],
"source": [
"from langchain.retrievers import KayAiRetriever\n",
"import os\n",
"from kay.rag.retrievers import KayRetriever\n",
"os.environ[\"KAY_API_KEY\"] = KAY_API_KEY\n",
"retriever = KayAiRetriever.create(dataset_id=\"company\", data_types=[\"10-K\", \"10-Q\", \"PressRelease\"], num_contexts=3)\n",
"docs = retriever.get_relevant_documents(\"What were the biggest strategy changes and partnerships made by Roku in 2023??\")"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "04ee2d6b-c2ab-4e15-8a8b-afaf6ef8c0f6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Company Name: ROKU INC\\nCompany Industry: CABLE & OTHER PAY TELEVISION SERVICES\\nArticle Title: Roku and FreeWheel Announce Strategic Partnership to Bring Rokus Leading Ad Tech to FreeWheel Customers\\nText: Additionally, eMarketer Link: https://cts.businesswire.com/ct/CT?id=smartlink&url=https%3A%2F%2Fwww.insiderintelligence.com%2Finsights%2Favod-more-than-50-percent-of-us-digital-video-viewers%2F&esheet=53451144&newsitemid=20230712907788&lan=en-US&anchor=eMarketer&index=4&md5=b64dea72bcf6b6379474462602781d83 projects 57% of U.S. digital video users will stream an advertising-based video on demand (AVOD) service this year.\\nHaving solutions aimed at driving greater interoperability and automation will help accelerate this growth.\\nKey highlights of this collaboration include:\\nStreamlined Integration: Roku has now integrated its demand application programming interface (dAPI) with FreeWheel s TV platform. Roku s demand API gives publishers direct, automatic and real-time access to more advertiser demand. This enhanced integration allows for streamlined ad operation workflows and better inventory quality control, both of which will improve publisher yield and revenue.\\nSeamless Data Targeting: Publishers can now use Roku platform signals to enable advertisers to target audiences and measure campaign performance without relying on cookies. Additionally, FreeWheel and Roku will rely on data clean room technology to enable the activation of additional data sets providing better measurement and monetization to publishers and agencies.', metadata={'_additional': {'id': '962b79e0-f9d1-43ae-9f7a-8a9b42bc7a9a'}, 'chunk_type': 'text', 'chunk_years_mentioned': [], 'company_name': 'ROKU INC', 'company_sic_code_description': 'CABLE & OTHER PAY TELEVISION SERVICES', 'data_source': 'PressRelease', 'data_source_link': 'https://www.nasdaq.com/press-release/roku-and-freewheel-announce-strategic-partnership-to-bring-rokus-leading-ad-tech-to', 'data_source_publish_date': '2023-07-12T00:00:00Z', 'data_source_uid': 'a46f309c-705d-3946-96db-87aa4e73261f', 'title': 'ROKU INC | Roku and FreeWheel Announce Strategic Partnership to Bring Rokus Leading Ad Tech to FreeWheel Customers'}),\n",
" Document(page_content='Company Name: ROKU INC \\n Company Industry: CABLE & OTHER PAY TELEVISION SERVICES \\n Form Title: 10-K 2022-FY \\n Form Section: Risk Factors \\n Text: nd the Note Regarding Forward Looking Statements.This section of this Annual Report generally discusses fiscal years 2022 and 2021 and year to year comparisons between those years.Discussions of fiscal year 2020 and year to year comparisons between fiscal years 2021 and 2020 that are not included in this Annual Report can be found in Management\\'s Discussion and Analysis of Financial Condition and Results of Operations in Part II, Item 7 of our Annual Report for the fiscal year ended December 31, 2021 filed with the SEC on February 18, 2022.Overview Effective as of the fourth quarter of fiscal 2022, we reorganized our reportable segments to better align with management\\'s reporting of information reviewed by the Chief Operating Decision Maker (\"CODM\") for each segment.We renamed our \"player\" segment to \"devices\" which now includes our licensing arrangements with service operators and licensed Roku TV partners in addition to sales of our streaming players, audio products, smart home products and Roku branded TVs that will be designed, made, and sold by us in 2023.Our historical segment information is recast to conform to our new presentation in our financial statements and accompanying notes included in Item 8 of this Annual Report.Our two reportable segments are the platform segment and the devices segment.', metadata={'_additional': {'id': 'a76c5fed-5d63-45a7-b63a-2c30e05140fc'}, 'chunk_type': 'text', 'chunk_years_mentioned': [2020, 2021, 2022, 2023], 'company_name': 'ROKU INC', 'company_sic_code_description': 'CABLE & OTHER PAY TELEVISION SERVICES', 'data_source': '10-K', 'data_source_link': 'https://www.sec.gov/Archives/edgar/data/1428439/000142843923000007', 'data_source_publish_date': '2022-01-01T00:00:00Z', 'data_source_uid': '0001428439-23-000007', 'title': 'ROKU INC | 10-K 2022-FY '}),\n",
" Document(page_content='Company Name: ROKU INC \\n Company Industry: CABLE & OTHER PAY TELEVISION SERVICES \\n Form Title: 10-Q 2023-Q1 \\n Form Section: Risk Factors \\n Text: Our current and potential partners include TV brands, cable and satellite companies, and telecommunication providers.Under these license arrangements, we generally have limited or no control over the amount and timing of resources these entities dedicate to the relationship.In the past, our licensed Roku TV partners have failed to meet their forecasts and anticipated market launch dates for distributing Roku TV models, and they may fail to meet their forecasts or such launches in the future.If our licensed Roku TV partners or service operator partners fail to meet their forecasts or such launches for distributing licensed streaming devices or choose to deploy competing streaming solutions within their product lines, our business may be harmed.We depend on a small number of content publishers for a majority of our streaming hours, and if we fail to maintain these relationships, our business could be harmed.*Historically, a small number of content publishers have accounted for a significant portion of the hours streamed on our platform.In the three months ended March 31, 2023, the top three streaming services represented over 50% of all hours streamed in the period.If, for any reason, we cease distributing channels that have historically streamed a large percentage of the aggregate streaming hours on our platform, our streaming hours, our active accounts, or Roku streaming device sales may be adversely affected, and our business may be harmed.', metadata={'_additional': {'id': '2a92b2bb-02a0-4e15-8b64-d7e04078a205'}, 'chunk_type': 'text', 'chunk_years_mentioned': [2023], 'company_name': 'ROKU INC', 'company_sic_code_description': 'CABLE & OTHER PAY TELEVISION SERVICES', 'data_source': '10-Q', 'data_source_link': 'https://www.sec.gov/Archives/edgar/data/1428439/000142843923000017', 'data_source_publish_date': '2023-01-01T00:00:00Z', 'data_source_uid': '0001428439-23-000017', 'title': 'ROKU INC | 10-Q 2023-Q1 '})]"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs"
]
},
{
"cell_type": "markdown",
"id": "21f6e9e5-478c-4b2c-9d61-f7a84f4d2f8f",
"metadata": {},
"source": [
"Usage in a chain\n",
"-"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "d1cba716-ab8d-4518-9196-43f17eb189dc",
"metadata": {},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"OPENAI_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "79441f1f-fa06-452c-bcd6-160ad0debc6a",
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "0c504bcd-f6e0-4028-a797-b31fb4b6d027",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.chains import ConversationalRetrievalChain\n",
"\n",
"model = ChatOpenAI(model_name=\"gpt-3.5-turbo\")\n",
"qa = ConversationalRetrievalChain.from_llm(model, retriever=retriever)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "977f158b-38d3-4b5f-9379-7cdd09436327",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"-> **Question**: What were the biggest strategy changes and partnerships made by Roku in 2023? \n",
"\n",
"**Answer**: In 2023, Roku made a strategic partnership with FreeWheel to bring Roku's leading ad tech to FreeWheel customers. This partnership aimed to drive greater interoperability and automation in the advertising-based video on demand (AVOD) space. Key highlights of this collaboration include streamlined integration of Roku's demand application programming interface (dAPI) with FreeWheel's TV platform, allowing for better inventory quality control and improved publisher yield and revenue. Additionally, publishers can now use Roku platform signals to enable advertisers to target audiences and measure campaign performance without relying on cookies. This partnership also involves the use of data clean room technology to enable the activation of additional data sets for better measurement and monetization for publishers and agencies. These partnerships and strategies aim to support Roku's growth in the AVOD market. \n",
"\n"
]
}
],
"source": [
"questions = [\n",
" \"What were the biggest strategy changes and partnerships made by Roku in 2023?\"\n",
" # \"Where is Wex making the most money in 2023?\",\n",
"]\n",
"chat_history = []\n",
"\n",
"for question in questions:\n",
" result = qa({\"question\": question, \"chat_history\": chat_history})\n",
" chat_history.append((question, result[\"answer\"]))\n",
" print(f\"-> **Question**: {question} \\n\")\n",
" print(f\"**Answer**: {result['answer']} \\n\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -81,7 +81,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
"version": "3.9.18"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,165 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "263f914c-9d67-4316-8b3d-03c3b99ba9d8",
"metadata": {},
"source": [
"SEC filings data\n",
"=\n",
"\n",
"SEC filings data powered by [Kay.ai](https://kay.ai) and [Cybersyn](https://www.cybersyn.com/).\n",
"\n",
">The SEC filing is a financial statement or other formal document submitted to the U.S. Securities and Exchange Commission (SEC). Public companies, certain insiders, and broker-dealers are required to make regular SEC filings. Investors and financial professionals rely on these filings for information about companies they are evaluating for investment purposes."
]
},
{
"cell_type": "markdown",
"id": "fc507b8e-ea51-417c-93da-42bf998a1195",
"metadata": {},
"source": [
"Setup\n",
"=\n",
"\n",
"First you will need to install the `kay` package. You will also need an API key: you can get one for free at [https://kay.ai](https://kay.ai/). Once you have an API key, you must set it as an environment variable `KAY_API_KEY`.\n",
"\n",
"In this example we're going to use the `KayAiRetriever`. Take a look at the [kay notebook](/docs/integrations/retrievers/kay) for more detailed information for the parmeters that it accepts.`"
]
},
{
"cell_type": "markdown",
"id": "c923bea0-585a-4f62-8662-efc167e8d793",
"metadata": {},
"source": [
"Examples\n",
"=\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "f7b8c99c-0341-4f3c-912f-a11e98f7de71",
"metadata": {},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n",
" ········\n"
]
}
],
"source": [
"# Setup API keys for Kay and OpenAI\n",
"from getpass import getpass\n",
"KAY_API_KEY = getpass()\n",
"OPENAI_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "04ee2d6b-c2ab-4e15-8a8b-afaf6ef8c0f6",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"KAY_API_KEY\"] = KAY_API_KEY\n",
"os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "0c504bcd-f6e0-4028-a797-b31fb4b6d027",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains import ConversationalRetrievalChain\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.retrievers import KayAiRetriever\n",
"\n",
"model = ChatOpenAI(model_name=\"gpt-3.5-turbo\")\n",
"retriever = KayAiRetriever.create(dataset_id=\"company\", data_types=[\"10-K\", \"10-Q\"], num_contexts=6)\n",
"qa = ConversationalRetrievalChain.from_llm(model, retriever=retriever)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "977f158b-38d3-4b5f-9379-7cdd09436327",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"-> **Question**: What are patterns in Nvidia's spend over the past three quarters? \n",
"\n",
"**Answer**: Based on the provided information, here are the patterns in NVIDIA's spend over the past three quarters:\n",
"\n",
"1. Research and Development Expenses:\n",
" - Q3 2022: Increased by 34% compared to Q3 2021.\n",
" - Q1 2023: Increased by 40% compared to Q1 2022.\n",
" - Q2 2022: Increased by 25% compared to Q2 2021.\n",
" \n",
" Overall, research and development expenses have been consistently increasing over the past three quarters.\n",
"\n",
"2. Sales, General and Administrative Expenses:\n",
" - Q3 2022: Increased by 8% compared to Q3 2021.\n",
" - Q1 2023: Increased by 14% compared to Q1 2022.\n",
" - Q2 2022: Decreased by 16% compared to Q2 2021.\n",
" \n",
" The pattern for sales, general and administrative expenses is not as consistent, with some quarters showing an increase and others showing a decrease.\n",
"\n",
"3. Total Operating Expenses:\n",
" - Q3 2022: Increased by 25% compared to Q3 2021.\n",
" - Q1 2023: Increased by 113% compared to Q1 2022.\n",
" - Q2 2022: Increased by 9% compared to Q2 2021.\n",
" \n",
" Total operating expenses have generally been increasing over the past three quarters, with a significant increase in Q1 2023.\n",
"\n",
"Overall, the pattern indicates a consistent increase in research and development expenses and total operating expenses, while sales, general and administrative expenses show some fluctuations. \n",
"\n"
]
}
],
"source": [
"questions = [\n",
" \"What are patterns in Nvidia's spend over the past three quarters?\",\n",
" #\"What are some recent challenges faced by the renewable energy sector?\",\n",
"]\n",
"chat_history = []\n",
"\n",
"for question in questions:\n",
" result = qa({\"question\": question, \"chat_history\": chat_history})\n",
" chat_history.append((question, result[\"answer\"]))\n",
" print(f\"-> **Question**: {question} \\n\")\n",
" print(f\"**Answer**: {result['answer']} \\n\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,150 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Gradient\n",
"\n",
"`Gradient` allows to create `Embeddings` as well fine tune and get completions on LLMs with a simple web API.\n",
"\n",
"This notebook goes over how to use Langchain with Embeddings of [Gradient](https://gradient.ai/).\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Imports"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings import GradientEmbeddings"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set the Environment API Key\n",
"Make sure to get your API key from Gradient AI. You are given $10 in free credits to test and fine-tune different models."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from getpass import getpass\n",
"import os\n",
"\n",
"if not os.environ.get(\"GRADIENT_ACCESS_TOKEN\",None):\n",
" # Access token under https://auth.gradient.ai/select-workspace\n",
" os.environ[\"GRADIENT_ACCESS_TOKEN\"] = getpass(\"gradient.ai access token:\")\n",
"if not os.environ.get(\"GRADIENT_WORKSPACE_ID\",None):\n",
" # `ID` listed in `$ gradient workspace list`\n",
" # also displayed after login at at https://auth.gradient.ai/select-workspace\n",
" os.environ[\"GRADIENT_WORKSPACE_ID\"] = getpass(\"gradient.ai workspace id:\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Optional: Validate your Enviroment variables ```GRADIENT_ACCESS_TOKEN``` and ```GRADIENT_WORKSPACE_ID``` to get currently deployed models. Using the `gradientai` Python package."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install gradientai"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create the Gradient instance"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"documents = [\"Pizza is a dish.\",\"Paris is the capital of France\", \"numpy is a lib for linear algebra\"]\n",
"query = \"Where is Paris?\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"embeddings = GradientEmbeddings(\n",
" model=\"bge-large\"\n",
")\n",
"\n",
"documents_embedded = embeddings.embed_documents(documents)\n",
"query_result = embeddings.embed_query(query)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# (demo) compute similarity\n",
"import numpy as np\n",
"\n",
"scores = np.array(documents_embedded) @ np.array(query_result).T\n",
"dict(zip(documents, scores))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -0,0 +1,133 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "278b6c63",
"metadata": {},
"source": [
"# LLMRails\n",
"\n",
"Let's load the LLMRails Embeddings class.\n",
"\n",
"To use LLMRails embedding you need to pass api key by argument or set it in environment with `LLM_RAILS_API_KEY` key.\n",
"To gey API Key you need to sign up in https://console.llmrails.com/signup and then go to https://console.llmrails.com/api-keys and copy key from there after creating one key in platform."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "0be1af71",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings import LLMRailsEmbeddings"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "2c66e5da",
"metadata": {},
"outputs": [],
"source": [
"embeddings = LLMRailsEmbeddings(model='embedding-english-v1') # or embedding-multi-v1"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "01370375",
"metadata": {},
"outputs": [],
"source": [
"text = \"This is a test document.\""
]
},
{
"cell_type": "markdown",
"id": "a42e4035",
"metadata": {},
"source": [
"To generate embeddings, you can either query an invidivual text, or you can query a list of texts."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "91bc875d-829b-4c3d-8e6f-fc2dda30a3bd",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[-0.09996652603149414,\n",
" 0.015568195842206478,\n",
" 0.17670190334320068,\n",
" 0.16521021723747253,\n",
" 0.21193109452724457]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_result = embeddings.embed_query(text)\n",
"query_result[:5]"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "a4b0d49e-0c73-44b6-aed5-5b426564e085",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[-0.04242777079343796,\n",
" 0.016536075621843338,\n",
" 0.10052520781755447,\n",
" 0.18272875249385834,\n",
" 0.2079043835401535]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"doc_result = embeddings.embed_documents([text])\n",
"doc_result[0][:5]"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
},
"vscode": {
"interpreter": {
"hash": "e971737741ff4ec9aff7dc6155a1060a59a8a6d52c757dbbe66bf8ee389494b1"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -44,7 +44,7 @@
"source": [
"There are two main ways to setup an Elasticsearch instance for use with:\n",
"\n",
"1. Elastic Cloud: Elastic Cloud is a managed Elasticsearch service. Signup for a [free trial](https://cloud.elastic.co/registration?storm=langchain-notebook).\n",
"1. Elastic Cloud: Elastic Cloud is a managed Elasticsearch service. Signup for a [free trial](https://cloud.elastic.co/registration?utm_source=langchain&utm_content=documentation).\n",
"\n",
"To connect to an Elasticsearch instance that does not require\n",
"login credentials (starting the docker instance with security enabled), pass the Elasticsearch URL and index name along with the\n",
@@ -662,7 +662,7 @@
"id": "0960fa0a",
"metadata": {},
"source": [
"# Customise the Query\n",
"## Customise the Query\n",
"With `custom_query` parameter at search, you are able to adjust the query that is used to retrieve documents from Elasticsearch. This is useful if you want to want to use a more complex query, to support linear boosting of fields."
]
},
@@ -720,6 +720,35 @@
"print(results[0])"
]
},
{
"cell_type": "markdown",
"id": "3242fd42",
"metadata": {},
"source": [
"# FAQ\n",
"\n",
"## Question: Im getting timeout errors when indexing documents into Elasticsearch. How do I fix this?\n",
"One possible issue is your documents might take longer to index into Elasticsearch. ElasticsearchStore uses the Elasticsearch bulk API which has a few defaults that you can adjust to reduce the chance of timeout errors.\n",
"\n",
"This is also a good idea when you're using SparseVectorRetrievalStrategy.\n",
"\n",
"The defaults are:\n",
"- `chunk_size`: 500\n",
"- `max_chunk_bytes`: 100MB\n",
"\n",
"To adjust these, you can pass in the `chunk_size` and `max_chunk_bytes` parameters to the ElasticsearchStore `add_texts` method.\n",
"\n",
"```python\n",
" vector_store.add_texts(\n",
" texts,\n",
" bulk_kwargs={\n",
" \"chunk_size\": 50,\n",
" \"max_chunk_bytes\": 200000000\n",
" }\n",
" )\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "604c66ea",

View File

@@ -92,7 +92,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 15,
"id": "19846a7b-99bc-47a7-8e1c-f13c2497f1ae",
"metadata": {},
"outputs": [],
@@ -105,7 +105,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 16,
"id": "c71c3901-d44b-4d09-92c5-3018628c28fa",
"metadata": {},
"outputs": [],
@@ -115,7 +115,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 17,
"id": "8b91ecfa-f61b-489a-a337-dff1f12f6ab2",
"metadata": {},
"outputs": [],
@@ -138,51 +138,66 @@
"load_dotenv()"
]
},
{
"cell_type": "markdown",
"id": "924d4df5",
"metadata": {},
"source": [
"First we'll create a Supabase client and instantiate a OpenAI embeddings class."
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 19,
"id": "5ce44f7c",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from supabase.client import Client, create_client\n",
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.vectorstores import SupabaseVectorStore\n",
"\n",
"supabase_url = os.environ.get(\"SUPABASE_URL\")\n",
"supabase_key = os.environ.get(\"SUPABASE_SERVICE_KEY\")\n",
"supabase: Client = create_client(supabase_url, supabase_key)"
"supabase: Client = create_client(supabase_url, supabase_key)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
]
},
{
"cell_type": "markdown",
"id": "0c707d4c",
"metadata": {},
"source": [
"Next we'll load and parse some data for our vector store (skip if you already have documents with embeddings stored in your DB)."
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 20,
"id": "aac9563e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import SupabaseVectorStore\n",
"from langchain.document_loaders import TextLoader"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "a3c3999a",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
"\n",
"loader = TextLoader(\"../../../state_of_the_union.txt\")\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
"docs = text_splitter.split_documents(documents)"
]
},
{
"cell_type": "markdown",
"id": "5abb9b93",
"metadata": {},
"source": [
"Insert the above documents into the database. Embeddings will automatically be generated for each document."
]
},
{
@@ -192,13 +207,39 @@
"metadata": {},
"outputs": [],
"source": [
"# We're using the default `documents` table here. You can modify this by passing in a `table_name` argument to the `from_documents` method.\n",
"vector_store = SupabaseVectorStore.from_documents(docs, embeddings, client=supabase)"
"\n",
"vector_store = SupabaseVectorStore.from_documents(docs, embeddings, client=supabase, table_name=\"documents\", query_name=\"match_documents\")"
]
},
{
"cell_type": "markdown",
"id": "e169345d",
"metadata": {},
"source": [
"Alternatively if you already have documents with embeddings in your database, simply instantiate a new `SupabaseVectorStore` directly:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 10,
"id": "397e3e7d",
"metadata": {},
"outputs": [],
"source": [
"vector_store = SupabaseVectorStore(embedding=embeddings, client=supabase, table_name=\"documents\", query_name=\"match_documents\")"
]
},
{
"cell_type": "markdown",
"id": "e28ce092",
"metadata": {},
"source": [
"Finally, test it out by performing a similarity search:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5eabdb75",
"metadata": {},
"outputs": [],
@@ -209,7 +250,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": null,
"id": "4b172de8",
"metadata": {},
"outputs": [
@@ -431,7 +472,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
"version": "3.11.5"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,358 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "fb69907a",
"metadata": {},
"source": [
"# Returning Structured Output\n",
"\n",
"This notebook covers how to have an agent return a structured output.\n",
"By default, most of the agents return a single string.\n",
"It can often be useful to have an agent return something with more structure.\n",
"\n",
"\n",
"A good example of this is an agent tasked with doing question-answering over some sources.\n",
"Let's say we want the agent to respond not only with the answer, but also a list of the sources used.\n",
"We then want our output to roughly follow the schema below:\n",
"\n",
"```python\n",
"class Response(BaseModel):\n",
" \"\"\"Final response to the question being asked\"\"\"\n",
" answer: str = Field(description = \"The final answer to respond to the user\")\n",
" sources: List[int] = Field(description=\"List of page chunks that contain answer to the question. Only include a page chunk if it contains relevant information\")\n",
"```\n",
"\n",
"In this notebook we will go over an agent that has a retriever tool and responds in the correct format."
]
},
{
"cell_type": "markdown",
"id": "4fc33ba5",
"metadata": {},
"source": [
"## Create the Retriever\n",
"\n",
"In this section we will do some setup work to create our retriever over some mock data containing the \"State of the Union\" address. Importantly, we will add a \"page_chunk\" tag to the metadata of each document. This is just some fake data intended to simulate a source field. In practice, this would more likely be the URL or path of a document."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "4ea20467",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.vectorstores import Chroma\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain.document_loaders import TextLoader"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "e3002ed7",
"metadata": {},
"outputs": [],
"source": [
"# Load in document to retrieve over\n",
"loader = TextLoader('../../state_of_the_union.txt')\n",
"documents = loader.load()\n",
"\n",
"# Split document into chunks\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_documents(documents)\n",
"\n",
"# Here is where we add in the fake source information\n",
"for i, doc in enumerate(texts):\n",
" doc.metadata['page_chunk'] = i\n",
"\n",
"# Create our retriever\n",
"embeddings = OpenAIEmbeddings()\n",
"vectorstore = Chroma.from_documents(texts, embeddings, collection_name=\"state-of-union\")\n",
"retriever = vectorstore.as_retriever()"
]
},
{
"cell_type": "markdown",
"id": "6ec1c106",
"metadata": {},
"source": [
"## Create the tools\n",
"\n",
"We will now create the tools we want to give to the agent. In this case, it is just one - a tool that wraps our retriever."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "204ef7ca",
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents.agent_toolkits.conversational_retrieval.tool import create_retriever_tool\n",
"\n",
"retriever_tool = create_retriever_tool(\n",
" retriever,\n",
" \"state-of-union-retriever\",\n",
" \"Query a retriever to get information about state of the union address\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "9af5b61b",
"metadata": {},
"source": [
"## Create response schema\n",
"\n",
"Here is where we will define the response schema. In this case, we want the final answer to have two fields: one for the `answer`, and then another that is a list of `sources`"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "2df91723",
"metadata": {},
"outputs": [],
"source": [
"from pydantic import BaseModel, Field\n",
"from typing import List\n",
"from langchain.utils.openai_functions import convert_pydantic_to_openai_function\n",
"\n",
"class Response(BaseModel):\n",
" \"\"\"Final response to the question being asked\"\"\"\n",
" answer: str = Field(description = \"The final answer to respond to the user\")\n",
" sources: List[int] = Field(description=\"List of page chunks that contain answer to the question. Only include a page chunk if it contains relevant information\")"
]
},
{
"cell_type": "markdown",
"id": "2cd181df",
"metadata": {},
"source": [
"## Create the custom parsing logic\n",
"\n",
"We now create some custom parsing logic.\n",
"How this works is that we will pass the `Response` schema to the OpenAI LLM via their `functions` parameter.\n",
"This is similar to how we pass tools for the agent to use.\n",
"\n",
"When the `Response` function is called by OpenAI, we want to use that as a signal to return to the user.\n",
"When any other function is called by OpenAI, we treat that as a tool invocation.\n",
"\n",
"Therefor, our parsing logic has the following blocks:\n",
"\n",
"- If no function is called, assume that we should use the response to respond to the user, and therefor return `AgentFinish`\n",
"- If the `Response` function is called, respond to the user with the inputs to that function (our structured output), and therefor return `AgentFinish`\n",
"- If any other function is called, treat that as a tool invocation, and therefor return `AgentActionMessageLog`\n",
"\n",
"Note that we are using `AgentActionMessageLog` rather than `AgentAction` because it lets us attach a log of messages that we can use in the future to pass back into the agent prompt."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "dfb73fe3",
"metadata": {},
"outputs": [],
"source": [
"from langchain.schema.agent import AgentActionMessageLog, AgentFinish\n",
"import json"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "5b46cdb2",
"metadata": {},
"outputs": [],
"source": [
"def parse(output):\n",
" # If no function was invoked, return to user\n",
" if \"function_call\" not in output.additional_kwargs:\n",
" return AgentFinish(return_values={\"output\": output.content}, log=output.content)\n",
" \n",
" # Parse out the function call\n",
" function_call = output.additional_kwargs[\"function_call\"]\n",
" name = function_call['name']\n",
" inputs = json.loads(function_call['arguments'])\n",
" \n",
" # If the Response function was invoked, return to the user with the function inputs\n",
" if name == \"Response\":\n",
" return AgentFinish(return_values=inputs, log=str(function_call))\n",
" # Otherwise, return an agent action\n",
" else:\n",
" return AgentActionMessageLog(tool=name, tool_input=inputs, log=\"\", message_log=[output])"
]
},
{
"cell_type": "markdown",
"id": "6d7401a1",
"metadata": {},
"source": [
"## Create the Agent\n",
"\n",
"We can now put this all together! The components of this agent are:\n",
"\n",
"- prompt: a simple prompt with placeholders for the user's question and then the `agent_scratchpad` (any intermediate steps)\n",
"- tools: we can attach the tools and `Response` format to the LLM as functions\n",
"- format scratchpad: in order to format the `agent_scratchpad` from intermediate steps, we will use the standard `format_to_openai_functions`. This takes intermediate steps and formats them as AIMessages and FunctionMessages.\n",
"- output parser: we will use our custom parser above to parse the response of the LLM\n",
"- AgentExecutor: we will use the standard AgentExecutor to run the loop of agent-tool-agent-tool..."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "73c785f9",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.tools.render import format_tool_to_openai_function\n",
"from langchain.agents.format_scratchpad import format_to_openai_functions\n",
"from langchain.agents import AgentExecutor"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "e1feaeda",
"metadata": {},
"outputs": [],
"source": [
"prompt = ChatPromptTemplate.from_messages([\n",
" (\"system\", \"You are a helpful assistant\"),\n",
" (\"user\", \"{input}\"),\n",
" MessagesPlaceholder(variable_name=\"agent_scratchpad\"),\n",
"])"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "d27dc3a8",
"metadata": {},
"outputs": [],
"source": [
"llm = ChatOpenAI(temperature=0)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "7bab4af2",
"metadata": {},
"outputs": [],
"source": [
"llm_with_tools = llm.bind(\n",
" functions=[\n",
" # The retriever tool\n",
" format_tool_to_openai_function(retriever_tool), \n",
" # Response schema\n",
" convert_pydantic_to_openai_function(Response)\n",
" ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "b886416c",
"metadata": {},
"outputs": [],
"source": [
"agent = {\n",
" \"input\": lambda x: x[\"input\"],\n",
" # Format agent scratchpad from intermediate steps\n",
" \"agent_scratchpad\": lambda x: format_to_openai_functions(x['intermediate_steps'])\n",
"} | prompt | llm_with_tools | parse"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "2cfd783e",
"metadata": {},
"outputs": [],
"source": [
"agent_executor = AgentExecutor(tools=[retriever_tool], agent=agent, verbose=True)"
]
},
{
"cell_type": "markdown",
"id": "9f114fec",
"metadata": {},
"source": [
"## Run the agent\n",
"\n",
"We can now run the agent! Notice how it responds with a dictionary with two keys: `answer` and `sources`"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "2667c9a4",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\u001b[0m\u001b[36;1m\u001b[1;3m[Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'page_chunk': 31, 'source': '../../state_of_the_union.txt'}), Document(page_content='One was stationed at bases and breathing in toxic smoke from “burn pits” that incinerated wastes of war—medical and hazard material, jet fuel, and more. \\n\\nWhen they came home, many of the worlds fittest and best trained warriors were never the same. \\n\\nHeadaches. Numbness. Dizziness. \\n\\nA cancer that would put them in a flag-draped coffin. \\n\\nI know. \\n\\nOne of those soldiers was my son Major Beau Biden. \\n\\nWe dont know for sure if a burn pit was the cause of his brain cancer, or the diseases of so many of our troops. \\n\\nBut Im committed to finding out everything we can. \\n\\nCommitted to military families like Danielle Robinson from Ohio. \\n\\nThe widow of Sergeant First Class Heath Robinson. \\n\\nHe was born a soldier. Army National Guard. Combat medic in Kosovo and Iraq. \\n\\nStationed near Baghdad, just yards from burn pits the size of football fields. \\n\\nHeaths widow Danielle is here with us tonight. They loved going to Ohio State football games. He loved building Legos with their daughter.', metadata={'page_chunk': 37, 'source': '../../state_of_the_union.txt'}), Document(page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \\n\\nWe can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \\n\\nWeve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \\n\\nWere putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \\n\\nWere securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', metadata={'page_chunk': 32, 'source': '../../state_of_the_union.txt'}), Document(page_content='But cancer from prolonged exposure to burn pits ravaged Heaths lungs and body. \\n\\nDanielle says Heath was a fighter to the very end. \\n\\nHe didnt know how to stop fighting, and neither did she. \\n\\nThrough her pain she found purpose to demand we do better. \\n\\nTonight, Danielle—we are. \\n\\nThe VA is pioneering new ways of linking toxic exposures to diseases, already helping more veterans get benefits. \\n\\nAnd tonight, Im announcing were expanding eligibility to veterans suffering from nine respiratory cancers. \\n\\nIm also calling on Congress: pass a law to make sure veterans devastated by toxic exposures in Iraq and Afghanistan finally get the benefits and comprehensive health care they deserve. \\n\\nAnd fourth, lets end cancer as we know it. \\n\\nThis is personal to me and Jill, to Kamala, and to so many of you. \\n\\nCancer is the #2 cause of death in Americasecond only to heart disease.', metadata={'page_chunk': 38, 'source': '../../state_of_the_union.txt'})]\u001b[0m\u001b[32;1m\u001b[1;3m{'name': 'Response', 'arguments': '{\\n \"answer\": \"President mentioned Ketanji Brown Jackson as a nominee for the United States Supreme Court and praised her as one of the nation\\'s top legal minds.\",\\n \"sources\": [31]\\n}'}\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"{'answer': \"President mentioned Ketanji Brown Jackson as a nominee for the United States Supreme Court and praised her as one of the nation's top legal minds.\",\n",
" 'sources': [31]}"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent_executor.invoke({\"input\": \"what did the president say about kentaji brown jackson\"}, return_only_outputs=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b355665e",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -166,9 +166,9 @@
}
],
"source": [
"import json\n",
"from langchain.load.dump import dumps\n",
"\n",
"print(json.dumps(response[\"intermediate_steps\"], indent=2))"
"print(dumps(response[\"intermediate_steps\"], pretty=True))"
]
},
{

View File

@@ -603,7 +603,7 @@
"id": "4002a4ac-02dd-4599-9b23-9b59f54237c8",
"metadata": {},
"source": [
"The metadata attribute contains a filed called `source`. This source should be pointing at the *ultimate* provenance associated with the given document.\n",
"The metadata attribute contains a field called `source`. This source should be pointing at the *ultimate* provenance associated with the given document.\n",
"\n",
"For example, if these documents are representing chunks of some parent document, the `source` for both documents should be the same and reference the parent document.\n",
"\n",

View File

@@ -36,7 +36,7 @@
"from langchain.chains import LLMChain\nfrom langchain.llms import OpenAI\nfrom langchain.prompts import PromptTemplate\n",
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.llms import BaseLLM\n",
"from langchain.vectorstores.base import VectorStore\n",
"from langchain.schema.vectorstore import VectorStore\n",
"from pydantic import BaseModel, Field\n",
"from langchain.chains.base import Chain\n",
"from langchain_experimental.autonomous_agents import BabyAGI"

View File

@@ -32,7 +32,7 @@
"from langchain.chains import LLMChain\nfrom langchain.llms import OpenAI\nfrom langchain.prompts import PromptTemplate\n",
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.llms import BaseLLM\n",
"from langchain.vectorstores.base import VectorStore\n",
"from langchain.schema.vectorstore import VectorStore\n",
"from pydantic import BaseModel, Field\n",
"from langchain.chains.base import Chain\n",
"from langchain_experimental.autonomous_agents import BabyAGI"

View File

@@ -135,7 +135,7 @@
}
],
"source": [
"print(graph.get_schema)"
"print(graph.schema)"
]
},
{
@@ -510,13 +510,54 @@
"chain.run(\"Who played in Top Gun?\")"
]
},
{
"cell_type": "markdown",
"id": "eefea16b-508f-4552-8942-9d5063ed7d37",
"metadata": {},
"source": [
"# Ignore specified node and relationship types\n",
"You can use `include_types` or `exclude_types` to ignore parts of the graph schema when generating Cypher statements."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "48ff7cf8-18a3-43d7-8cb1-c1b91744608d",
"execution_count": 18,
"id": "a20fa21e-fb85-41c4-aac0-53fb25e34604",
"metadata": {},
"outputs": [],
"source": []
"source": [
"chain = GraphCypherQAChain.from_llm(\n",
" graph=graph,\n",
" cypher_llm=ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo\"),\n",
" qa_llm=ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo-16k\"),\n",
" verbose=True,\n",
" exclude_types=['Movie']\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "3ad7f6b8-543e-46e4-a3b2-40fa3e66e895",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Node properties are the following: \n",
" {'Actor': [{'property': 'name', 'type': 'STRING'}]}\n",
"Relationships properties are the following: \n",
" {}\n",
"Relationships are: \n",
"[]\n"
]
}
],
"source": [
"# Inspect graph schema\n",
"print(chain.graph_schema)"
]
}
],
"metadata": {

View File

@@ -187,7 +187,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(graph.get_schema)"
"print(graph.schema)"
]
},
{
@@ -687,7 +687,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
"version": "3.8.8"
}
},
"nbformat": 4,

File diff suppressed because it is too large Load Diff

View File

@@ -13,4 +13,4 @@ Some of the code here may be marked with security notices. However,
given the exploratory and experimental nature of the code in this package,
the lack of a security notice on a piece of code does not mean that
the code in question does not require additional security considerations
in order to be safe to use.
in order to be safe to use.

View File

@@ -10,9 +10,9 @@ from langchain.schema import (
Document,
)
from langchain.schema.messages import AIMessage, HumanMessage, SystemMessage
from langchain.schema.vectorstore import VectorStoreRetriever
from langchain.tools.base import BaseTool
from langchain.tools.human.tool import HumanInputRun
from langchain.vectorstores.base import VectorStoreRetriever
from langchain_experimental.autonomous_agents.autogpt.output_parser import (
AutoGPTOutputParser,

View File

@@ -1,7 +1,7 @@
from typing import Any, Dict, List
from langchain.memory.chat_memory import BaseChatMemory, get_prompt_input_key
from langchain.vectorstores.base import VectorStoreRetriever
from langchain.schema.vectorstore import VectorStoreRetriever
from langchain_experimental.pydantic_v1 import Field

View File

@@ -5,8 +5,8 @@ from langchain.prompts.chat import (
BaseChatPromptTemplate,
)
from langchain.schema.messages import BaseMessage, HumanMessage, SystemMessage
from langchain.schema.vectorstore import VectorStoreRetriever
from langchain.tools.base import BaseTool
from langchain.vectorstores.base import VectorStoreRetriever
from langchain_experimental.autonomous_agents.autogpt.prompt_generator import get_prompt
from langchain_experimental.pydantic_v1 import BaseModel

View File

@@ -5,7 +5,7 @@ from typing import Any, Dict, List, Optional
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.schema.language_model import BaseLanguageModel
from langchain.vectorstores.base import VectorStore
from langchain.schema.vectorstore import VectorStore
from langchain_experimental.autonomous_agents.baby_agi.task_creation import (
TaskCreationChain,

View File

@@ -8,7 +8,7 @@ from langchain.chains.llm import LLMChain
from langchain.chains.sql_database.prompt import PROMPT, SQL_PROMPTS
from langchain.prompts.prompt import PromptTemplate
from langchain.schema import BaseOutputParser, BasePromptTemplate
from langchain.schema.base import Embeddings
from langchain.schema.embeddings import Embeddings
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools.sql_database.prompt import QUERY_CHECKER
from langchain.utilities.sql_database import SQLDatabase
@@ -76,23 +76,11 @@ class VectorSQLRetrieveAllOutputParser(VectorSQLOutputParser):
return super().parse(text)
def _try_eval(x: Any) -> Any:
try:
return eval(x)
except Exception:
return x
def get_result_from_sqldb(
db: SQLDatabase, cmd: str
) -> Union[str, List[Dict[str, Any]], Dict[str, Any]]:
result = db._execute(cmd, fetch="all") # type: ignore
if isinstance(result, list):
return [{k: _try_eval(v) for k, v in dict(d._asdict()).items()} for d in result]
else:
return {
k: _try_eval(v) for k, v in dict(result._asdict()).items() # type: ignore
}
return result
class VectorSQLDatabaseChain(SQLDatabaseChain):

View File

@@ -1,6 +1,6 @@
[tool.poetry]
name = "langchain-experimental"
version = "0.0.20"
version = "0.0.22"
description = "Building applications with LLMs through composability"
authors = []
license = "MIT"

View File

@@ -7,47 +7,18 @@ all: help
# TESTING AND COVERAGE
######################
# Define a variable for the test file path.
TEST_FILE ?= tests/unit_tests/
# Run unit tests and generate a coverage report.
coverage:
poetry run pytest --cov \
--cov-config=.coveragerc \
--cov-report xml \
--cov-report term-missing:skip-covered
--cov-report term-missing:skip-covered \
$(TEST_FILE)
######################
# DOCUMENTATION
######################
clean: docs_clean api_docs_clean
docs_build:
docs/.local_build.sh
docs_clean:
rm -r docs/_dist
docs_linkcheck:
poetry run linkchecker docs/_dist/docs_skeleton/ --ignore-url node_modules
api_docs_build:
poetry run python docs/api_reference/create_api_rst.py
cd docs/api_reference && poetry run make html
api_docs_clean:
rm -f docs/api_reference/api_reference.rst
cd docs/api_reference && poetry run make clean
api_docs_linkcheck:
poetry run linkchecker docs/api_reference/_build/html/index.html
# Define a variable for the test file path.
TEST_FILE ?= tests/unit_tests/
test:
poetry run pytest --disable-socket --allow-unix-socket $(TEST_FILE)
tests:
test tests:
poetry run pytest --disable-socket --allow-unix-socket $(TEST_FILE)
extended_tests:
@@ -98,7 +69,6 @@ spell_fix:
help:
@echo '===================='
@echo '-- DOCUMENTATION --'
@echo 'clean - run docs_clean and api_docs_clean'
@echo 'docs_build - build the documentation'
@echo 'docs_clean - clean the documentation build artifacts'
@@ -120,3 +90,4 @@ help:
@echo 'test_watch - run unit tests in watch mode'
@echo 'integration_tests - run integration tests'
@echo 'docker_tests - run unit tests in docker'
@echo '-- DOCUMENTATION tasks are from the top-level Makefile --'

View File

@@ -92,4 +92,4 @@ For more information on these concepts, please see our [full documentation](http
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.
For detailed information on how to contribute, see [here](.github/CONTRIBUTING.md).
For detailed information on how to contribute, see [here](../../.github/CONTRIBUTING.md).

View File

@@ -1,57 +1,14 @@
# ruff: noqa: E402
"""Main entrypoint into package."""
import warnings
from importlib import metadata
from typing import Optional
from typing import TYPE_CHECKING, Any, Optional
from langchain._api.deprecation import surface_langchain_deprecation_warnings
if TYPE_CHECKING:
from langchain.schema import BaseCache
from langchain.agents import MRKLChain, ReActChain, SelfAskWithSearchChain
from langchain.chains import (
ConversationChain,
LLMBashChain,
LLMChain,
LLMCheckerChain,
LLMMathChain,
QAWithSourcesChain,
VectorDBQA,
VectorDBQAWithSourcesChain,
)
from langchain.docstore import InMemoryDocstore, Wikipedia
from langchain.llms import (
Anthropic,
Banana,
CerebriumAI,
Cohere,
ForefrontAI,
GooseAI,
HuggingFaceHub,
HuggingFaceTextGenInference,
LlamaCpp,
Modal,
OpenAI,
Petals,
PipelineAI,
SagemakerEndpoint,
StochasticAI,
Writer,
)
from langchain.llms.huggingface_pipeline import HuggingFacePipeline
from langchain.prompts import (
FewShotPromptTemplate,
Prompt,
PromptTemplate,
)
from langchain.schema.cache import BaseCache
from langchain.schema.prompt_template import BasePromptTemplate
from langchain.utilities.arxiv import ArxivAPIWrapper
from langchain.utilities.golden_query import GoldenQueryAPIWrapper
from langchain.utilities.google_search import GoogleSearchAPIWrapper
from langchain.utilities.google_serper import GoogleSerperAPIWrapper
from langchain.utilities.powerbi import PowerBIDataset
from langchain.utilities.searx_search import SearxSearchWrapper
from langchain.utilities.serpapi import SerpAPIWrapper
from langchain.utilities.sql_database import SQLDatabase
from langchain.utilities.wikipedia import WikipediaAPIWrapper
from langchain.utilities.wolfram_alpha import WolframAlphaAPIWrapper
from langchain.vectorstores import FAISS, ElasticVectorSearch
try:
__version__ = metadata.version(__package__)
@@ -62,10 +19,299 @@ del metadata # optional, avoids polluting the results of dir(__package__)
verbose: bool = False
debug: bool = False
llm_cache: Optional[BaseCache] = None
llm_cache: Optional["BaseCache"] = None
# For backwards compatibility
SerpAPIChain = SerpAPIWrapper
def _warn_on_import(name: str) -> None:
warnings.warn(
f"Importing {name} from langchain root module is no longer supported."
)
# Surfaces Deprecation and Pending Deprecation warnings from langchain.
surface_langchain_deprecation_warnings()
def __getattr__(name: str) -> Any:
if name == "MRKLChain":
from langchain.agents import MRKLChain
_warn_on_import(name)
return MRKLChain
elif name == "ReActChain":
from langchain.agents import ReActChain
_warn_on_import(name)
return ReActChain
elif name == "SelfAskWithSearchChain":
from langchain.agents import SelfAskWithSearchChain
_warn_on_import(name)
return SelfAskWithSearchChain
elif name == "ConversationChain":
from langchain.chains import ConversationChain
_warn_on_import(name)
return ConversationChain
elif name == "LLMBashChain":
from langchain.chains import LLMBashChain
_warn_on_import(name)
return LLMBashChain
elif name == "LLMChain":
from langchain.chains import LLMChain
_warn_on_import(name)
return LLMChain
elif name == "LLMCheckerChain":
from langchain.chains import LLMCheckerChain
_warn_on_import(name)
return LLMCheckerChain
elif name == "LLMMathChain":
from langchain.chains import LLMMathChain
_warn_on_import(name)
return LLMMathChain
elif name == "QAWithSourcesChain":
from langchain.chains import QAWithSourcesChain
_warn_on_import(name)
return QAWithSourcesChain
elif name == "VectorDBQA":
from langchain.chains import VectorDBQA
_warn_on_import(name)
return VectorDBQA
elif name == "VectorDBQAWithSourcesChain":
from langchain.chains import VectorDBQAWithSourcesChain
_warn_on_import(name)
return VectorDBQAWithSourcesChain
elif name == "InMemoryDocstore":
from langchain.docstore import InMemoryDocstore
_warn_on_import(name)
return InMemoryDocstore
elif name == "Wikipedia":
from langchain.docstore import Wikipedia
_warn_on_import(name)
return Wikipedia
elif name == "Anthropic":
from langchain.llms import Anthropic
_warn_on_import(name)
return Anthropic
elif name == "Banana":
from langchain.llms import Banana
_warn_on_import(name)
return Banana
elif name == "CerebriumAI":
from langchain.llms import CerebriumAI
_warn_on_import(name)
return CerebriumAI
elif name == "Cohere":
from langchain.llms import Cohere
_warn_on_import(name)
return Cohere
elif name == "ForefrontAI":
from langchain.llms import ForefrontAI
_warn_on_import(name)
return ForefrontAI
elif name == "GooseAI":
from langchain.llms import GooseAI
_warn_on_import(name)
return GooseAI
elif name == "HuggingFaceHub":
from langchain.llms import HuggingFaceHub
_warn_on_import(name)
return HuggingFaceHub
elif name == "HuggingFaceTextGenInference":
from langchain.llms import HuggingFaceTextGenInference
_warn_on_import(name)
return HuggingFaceTextGenInference
elif name == "LlamaCpp":
from langchain.llms import LlamaCpp
_warn_on_import(name)
return LlamaCpp
elif name == "Modal":
from langchain.llms import Modal
_warn_on_import(name)
return Modal
elif name == "OpenAI":
from langchain.llms import OpenAI
_warn_on_import(name)
return OpenAI
elif name == "Petals":
from langchain.llms import Petals
_warn_on_import(name)
return Petals
elif name == "PipelineAI":
from langchain.llms import PipelineAI
_warn_on_import(name)
return PipelineAI
elif name == "SagemakerEndpoint":
from langchain.llms import SagemakerEndpoint
_warn_on_import(name)
return SagemakerEndpoint
elif name == "StochasticAI":
from langchain.llms import StochasticAI
_warn_on_import(name)
return StochasticAI
elif name == "Writer":
from langchain.llms import Writer
_warn_on_import(name)
return Writer
elif name == "HuggingFacePipeline":
from langchain.llms.huggingface_pipeline import HuggingFacePipeline
_warn_on_import(name)
return HuggingFacePipeline
elif name == "FewShotPromptTemplate":
from langchain.prompts import FewShotPromptTemplate
_warn_on_import(name)
return FewShotPromptTemplate
elif name == "Prompt":
from langchain.prompts import Prompt
_warn_on_import(name)
return Prompt
elif name == "PromptTemplate":
from langchain.prompts import PromptTemplate
_warn_on_import(name)
return PromptTemplate
elif name == "BasePromptTemplate":
from langchain.schema.prompt_template import BasePromptTemplate
_warn_on_import(name)
return BasePromptTemplate
elif name == "ArxivAPIWrapper":
from langchain.utilities import ArxivAPIWrapper
_warn_on_import(name)
return ArxivAPIWrapper
elif name == "GoldenQueryAPIWrapper":
from langchain.utilities import GoldenQueryAPIWrapper
_warn_on_import(name)
return GoldenQueryAPIWrapper
elif name == "GoogleSearchAPIWrapper":
from langchain.utilities import GoogleSearchAPIWrapper
_warn_on_import(name)
return GoogleSearchAPIWrapper
elif name == "GoogleSerperAPIWrapper":
from langchain.utilities import GoogleSerperAPIWrapper
_warn_on_import(name)
return GoogleSerperAPIWrapper
elif name == "PowerBIDataset":
from langchain.utilities import PowerBIDataset
_warn_on_import(name)
return PowerBIDataset
elif name == "SearxSearchWrapper":
from langchain.utilities import SearxSearchWrapper
_warn_on_import(name)
return SearxSearchWrapper
elif name == "WikipediaAPIWrapper":
from langchain.utilities import WikipediaAPIWrapper
_warn_on_import(name)
return WikipediaAPIWrapper
elif name == "WolframAlphaAPIWrapper":
from langchain.utilities import WolframAlphaAPIWrapper
_warn_on_import(name)
return WolframAlphaAPIWrapper
elif name == "SQLDatabase":
from langchain.utilities import SQLDatabase
_warn_on_import(name)
return SQLDatabase
elif name == "FAISS":
from langchain.vectorstores import FAISS
_warn_on_import(name)
return FAISS
elif name == "ElasticVectorSearch":
from langchain.vectorstores import ElasticVectorSearch
_warn_on_import(name)
return ElasticVectorSearch
# For backwards compatibility
elif name == "SerpAPIChain" or name == "SerpAPIWrapper":
from langchain.utilities import SerpAPIWrapper
_warn_on_import(name)
return SerpAPIWrapper
else:
raise AttributeError(f"Could not find: {name}")
__all__ = [

View File

@@ -13,10 +13,14 @@ from .deprecation import (
LangChainDeprecationWarning,
deprecated,
suppress_langchain_deprecation_warning,
surface_langchain_deprecation_warnings,
warn_deprecated,
)
__all__ = [
"deprecated",
"LangChainDeprecationWarning",
"suppress_langchain_deprecation_warning",
"surface_langchain_deprecation_warnings",
"warn_deprecated",
]

View File

@@ -21,84 +21,8 @@ class LangChainDeprecationWarning(DeprecationWarning):
"""A class for issuing deprecation warnings for LangChain users."""
def _warn_deprecated(
since: str,
*,
message: str = "",
name: str = "",
alternative: str = "",
pending: bool = False,
obj_type: str = "",
addendum: str = "",
removal: str = "",
) -> None:
"""Display a standardized deprecation.
Arguments:
since : str
The release at which this API became deprecated.
message : str, optional
Override the default deprecation message. The %(since)s,
%(name)s, %(alternative)s, %(obj_type)s, %(addendum)s,
and %(removal)s format specifiers will be replaced by the
values of the respective arguments passed to this function.
name : str, optional
The name of the deprecated object.
alternative : str, optional
An alternative API that the user may use in place of the
deprecated API. The deprecation warning will tell the user
about this alternative if provided.
pending : bool, optional
If True, uses a PendingDeprecationWarning instead of a
DeprecationWarning. Cannot be used together with removal.
obj_type : str, optional
The object type being deprecated.
addendum : str, optional
Additional text appended directly to the final message.
removal : str, optional
The expected removal version. With the default (an empty
string), a removal version is automatically computed from
since. Set to other Falsy values to not schedule a removal
date. Cannot be used together with pending.
"""
if pending and removal:
raise ValueError("A pending deprecation cannot have a scheduled removal")
if not pending:
if not removal:
removal = f"in {removal}" if removal else "within ?? minor releases"
raise NotImplementedError(
f"Need to determine which default deprecation schedule to use. "
f"{removal}"
)
else:
removal = f"in {removal}"
if not message:
message = ""
if obj_type:
message += f"The {obj_type} `{name}`"
else:
message += f"`{name}`"
if pending:
message += " will be deprecated in a future version"
else:
message += f" was deprecated in LangChain {since}"
if removal:
message += f" and will be removed {removal}"
if alternative:
message += f". Use {alternative} instead."
if addendum:
message += f" {addendum}"
warning_cls = PendingDeprecationWarning if pending else LangChainDeprecationWarning
warning = warning_cls(message)
warnings.warn(warning, category=LangChainDeprecationWarning, stacklevel=2)
class LangChainPendingDeprecationWarning(PendingDeprecationWarning):
"""A class for issuing deprecation warnings for LangChain users."""
# PUBLIC API
@@ -262,7 +186,7 @@ def deprecated(
def emit_warning() -> None:
"""Emit the warning."""
_warn_deprecated(
warn_deprecated(
since,
message=_message,
name=_name,
@@ -318,4 +242,100 @@ def suppress_langchain_deprecation_warning() -> Generator[None, None, None]:
"""Context manager to suppress LangChainDeprecationWarning."""
with warnings.catch_warnings():
warnings.simplefilter("ignore", LangChainDeprecationWarning)
warnings.simplefilter("ignore", LangChainPendingDeprecationWarning)
yield
def warn_deprecated(
since: str,
*,
message: str = "",
name: str = "",
alternative: str = "",
pending: bool = False,
obj_type: str = "",
addendum: str = "",
removal: str = "",
) -> None:
"""Display a standardized deprecation.
Arguments:
since : str
The release at which this API became deprecated.
message : str, optional
Override the default deprecation message. The %(since)s,
%(name)s, %(alternative)s, %(obj_type)s, %(addendum)s,
and %(removal)s format specifiers will be replaced by the
values of the respective arguments passed to this function.
name : str, optional
The name of the deprecated object.
alternative : str, optional
An alternative API that the user may use in place of the
deprecated API. The deprecation warning will tell the user
about this alternative if provided.
pending : bool, optional
If True, uses a PendingDeprecationWarning instead of a
DeprecationWarning. Cannot be used together with removal.
obj_type : str, optional
The object type being deprecated.
addendum : str, optional
Additional text appended directly to the final message.
removal : str, optional
The expected removal version. With the default (an empty
string), a removal version is automatically computed from
since. Set to other Falsy values to not schedule a removal
date. Cannot be used together with pending.
"""
if pending and removal:
raise ValueError("A pending deprecation cannot have a scheduled removal")
if not pending:
if not removal:
removal = f"in {removal}" if removal else "within ?? minor releases"
raise NotImplementedError(
f"Need to determine which default deprecation schedule to use. "
f"{removal}"
)
else:
removal = f"in {removal}"
if not message:
message = ""
if obj_type:
message += f"The {obj_type} `{name}`"
else:
message += f"`{name}`"
if pending:
message += " will be deprecated in a future version"
else:
message += f" was deprecated in LangChain {since}"
if removal:
message += f" and will be removed {removal}"
if alternative:
message += f". Use {alternative} instead."
if addendum:
message += f" {addendum}"
warning_cls = (
LangChainPendingDeprecationWarning if pending else LangChainDeprecationWarning
)
warning = warning_cls(message)
warnings.warn(warning, category=LangChainDeprecationWarning, stacklevel=2)
def surface_langchain_deprecation_warnings() -> None:
"""Unmute LangChain deprecation warnings."""
warnings.filterwarnings(
"default",
category=LangChainPendingDeprecationWarning,
)
warnings.filterwarnings(
"default",
category=LangChainDeprecationWarning,
)

View File

@@ -15,7 +15,7 @@ from typing import (
from typing_extensions import Literal
from langchain.chat_loaders.base import ChatSession
from langchain.schema.chat import ChatSession
from langchain.schema.messages import (
AIMessage,
AIMessageChunk,

View File

@@ -330,6 +330,11 @@ class RunnableAgent(BaseSingleActionAgent):
arbitrary_types_allowed = True
@property
def return_values(self) -> List[str]:
"""Return values of the agent."""
return []
@property
def input_keys(self) -> List[str]:
"""Return the input keys.

View File

@@ -1,5 +1,5 @@
"""Agent for working with pandas objects."""
from typing import Any, Dict, List, Optional, Tuple
from typing import Any, Dict, List, Optional, Sequence, Tuple
from langchain.agents.agent import AgentExecutor, BaseSingleActionAgent
from langchain.agents.agent_toolkits.pandas.prompt import (
@@ -21,6 +21,7 @@ from langchain.chains.llm import LLMChain
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.messages import SystemMessage
from langchain.tools import BaseTool
from langchain.tools.python.tool import PythonAstREPLTool
@@ -280,12 +281,13 @@ def create_pandas_dataframe_agent(
agent_executor_kwargs: Optional[Dict[str, Any]] = None,
include_df_in_prompt: Optional[bool] = True,
number_of_head_rows: int = 5,
extra_tools: Sequence[BaseTool] = (),
**kwargs: Dict[str, Any],
) -> AgentExecutor:
"""Construct a pandas agent from an LLM and dataframe."""
agent: BaseSingleActionAgent
if agent_type == AgentType.ZERO_SHOT_REACT_DESCRIPTION:
prompt, tools = _get_prompt_and_tools(
prompt, base_tools = _get_prompt_and_tools(
df,
prefix=prefix,
suffix=suffix,
@@ -293,6 +295,7 @@ def create_pandas_dataframe_agent(
include_df_in_prompt=include_df_in_prompt,
number_of_head_rows=number_of_head_rows,
)
tools = base_tools + list(extra_tools)
llm_chain = LLMChain(
llm=llm,
prompt=prompt,
@@ -306,7 +309,7 @@ def create_pandas_dataframe_agent(
**kwargs,
)
elif agent_type == AgentType.OPENAI_FUNCTIONS:
_prompt, tools = _get_functions_prompt_and_tools(
_prompt, base_tools = _get_functions_prompt_and_tools(
df,
prefix=prefix,
suffix=suffix,
@@ -314,6 +317,7 @@ def create_pandas_dataframe_agent(
include_df_in_prompt=include_df_in_prompt,
number_of_head_rows=number_of_head_rows,
)
tools = base_tools + list(extra_tools)
agent = OpenAIFunctionsAgent(
llm=llm,
prompt=_prompt,

View File

@@ -5,12 +5,12 @@ from langchain.agents.agent_toolkits.base import BaseToolkit
from langchain.llms.openai import OpenAI
from langchain.pydantic_v1 import BaseModel, Field
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.vectorstore import VectorStore
from langchain.tools import BaseTool
from langchain.tools.vectorstore.tool import (
VectorStoreQATool,
VectorStoreQAWithSourcesTool,
)
from langchain.vectorstores.base import VectorStore
class VectorStoreInfo(BaseModel):

View File

@@ -102,6 +102,7 @@ class BaseTracer(BaseCallbackHandler, ABC):
tags: Optional[List[str]] = None,
parent_run_id: Optional[UUID] = None,
metadata: Optional[Dict[str, Any]] = None,
name: Optional[str] = None,
**kwargs: Any,
) -> Run:
"""Start a trace for an LLM run."""
@@ -122,6 +123,7 @@ class BaseTracer(BaseCallbackHandler, ABC):
child_execution_order=execution_order,
run_type="llm",
tags=tags or [],
name=name,
)
self._start_trace(llm_run)
self._on_llm_start(llm_run)
@@ -335,6 +337,7 @@ class BaseTracer(BaseCallbackHandler, ABC):
tags: Optional[List[str]] = None,
parent_run_id: Optional[UUID] = None,
metadata: Optional[Dict[str, Any]] = None,
name: Optional[str] = None,
**kwargs: Any,
) -> Run:
"""Start a trace for a tool run."""
@@ -356,6 +359,7 @@ class BaseTracer(BaseCallbackHandler, ABC):
child_runs=[],
run_type="tool",
tags=tags or [],
name=name,
)
self._start_trace(tool_run)
self._on_tool_start(tool_run)
@@ -406,6 +410,7 @@ class BaseTracer(BaseCallbackHandler, ABC):
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
name: Optional[str] = None,
**kwargs: Any,
) -> Run:
"""Run when Retriever starts running."""
@@ -416,7 +421,7 @@ class BaseTracer(BaseCallbackHandler, ABC):
kwargs.update({"metadata": metadata})
retrieval_run = Run(
id=run_id,
name="Retriever",
name=name or "Retriever",
parent_run_id=parent_run_id,
serialized=serialized,
inputs={"query": query},

View File

@@ -98,6 +98,7 @@ class LangChainTracer(BaseTracer):
tags: Optional[List[str]] = None,
parent_run_id: Optional[UUID] = None,
metadata: Optional[Dict[str, Any]] = None,
name: Optional[str] = None,
**kwargs: Any,
) -> None:
"""Start a trace for an LLM run."""
@@ -118,6 +119,7 @@ class LangChainTracer(BaseTracer):
child_execution_order=execution_order,
run_type="llm",
tags=tags,
name=name,
)
self._start_trace(chat_model_run)
self._on_chat_model_start(chat_model_run)

View File

@@ -193,7 +193,7 @@ class LogStreamCallbackHandler(BaseTracer):
"op": "replace",
"path": "",
"value": RunState(
id=run.id,
id=str(run.id),
streamed_output=[],
final_output=None,
logs=[],

View File

@@ -22,7 +22,7 @@ from langchain.pydantic_v1 import Extra, Field, root_validator
from langchain.schema import BasePromptTemplate, BaseRetriever, Document
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.messages import BaseMessage
from langchain.vectorstores.base import VectorStore
from langchain.schema.vectorstore import VectorStore
# Depending on the memory type and configuration, the chat history format may differ.
# This needs to be consolidated.

View File

@@ -34,12 +34,54 @@ def extract_cypher(text: str) -> str:
return matches[0] if matches else text
def construct_schema(
structured_schema: Dict[str, Any],
include_types: List[str],
exclude_types: List[str],
) -> str:
"""Filter the schema based on included or excluded types"""
def filter_func(x: str) -> bool:
return x in include_types if include_types else x not in exclude_types
filtered_schema = {
"node_props": {
k: v
for k, v in structured_schema.get("node_props", {}).items()
if filter_func(k)
},
"rel_props": {
k: v
for k, v in structured_schema.get("rel_props", {}).items()
if filter_func(k)
},
"relationships": [
r
for r in structured_schema.get("relationships", [])
if all(filter_func(r[t]) for t in ["start", "end", "type"])
],
}
return (
f"Node properties are the following: \n {filtered_schema['node_props']}\n"
f"Relationships properties are the following: \n {filtered_schema['rel_props']}"
"\nRelationships are: \n"
+ str(
[
f"(:{el['start']})-[:{el['type']}]->(:{el['end']})"
for el in filtered_schema["relationships"]
]
)
)
class GraphCypherQAChain(Chain):
"""Chain for question-answering against a graph by generating Cypher statements."""
graph: Neo4jGraph = Field(exclude=True)
cypher_generation_chain: LLMChain
qa_chain: LLMChain
graph_schema: str
input_key: str = "query" #: :meta private:
output_key: str = "result" #: :meta private:
top_k: int = 10
@@ -79,6 +121,8 @@ class GraphCypherQAChain(Chain):
cypher_prompt: BasePromptTemplate = CYPHER_GENERATION_PROMPT,
cypher_llm: Optional[BaseLanguageModel] = None,
qa_llm: Optional[BaseLanguageModel] = None,
exclude_types: List[str] = [],
include_types: List[str] = [],
**kwargs: Any,
) -> GraphCypherQAChain:
"""Initialize from LLM."""
@@ -96,7 +140,18 @@ class GraphCypherQAChain(Chain):
qa_chain = LLMChain(llm=qa_llm or llm, prompt=qa_prompt)
cypher_generation_chain = LLMChain(llm=cypher_llm or llm, prompt=cypher_prompt)
if exclude_types and include_types:
raise ValueError(
"Either `exclude_types` or `include_types` "
"can be provided, but not both"
)
graph_schema = construct_schema(
kwargs["graph"].structured_schema, include_types, exclude_types
)
return cls(
graph_schema=graph_schema,
qa_chain=qa_chain,
cypher_generation_chain=cypher_generation_chain,
**kwargs,
@@ -115,7 +170,7 @@ class GraphCypherQAChain(Chain):
intermediate_steps: List = []
generated_cypher = self.cypher_generation_chain.run(
{"question": question, "schema": self.graph.get_schema}, callbacks=callbacks
{"question": question, "schema": self.graph_schema}, callbacks=callbacks
)
# Extract Cypher code if it is wrapped in backticks

View File

@@ -68,7 +68,7 @@ def use_simple_prompt(llm: BaseLanguageModel) -> bool:
return True
# Bedrock anthropic
if llm.model_id and "anthropic" in llm.model_id: # type: ignore
if hasattr(llm, "model_id") and "anthropic" in llm.model_id: # type: ignore
return True
return False

View File

@@ -42,8 +42,8 @@ class LLMChain(Chain):
llm = LLMChain(llm=OpenAI(), prompt=prompt)
"""
@property
def lc_serializable(self) -> bool:
@classmethod
def is_lc_serializable(self) -> bool:
return True
prompt: BasePromptTemplate

View File

@@ -2,3 +2,13 @@
Heavily borrowed from llm_math, wrapper for SymPy
"""
from langchain._api import warn_deprecated
warn_deprecated(
since="0.0.304",
message=(
"On 2023-10-06 this module will be moved to langchain-experimental as "
"it relies on sympify https://github.com/sympy/sympy/issues/10805"
),
pending=True,
)

View File

@@ -11,7 +11,7 @@ from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains.qa_with_sources.base import BaseQAWithSourcesChain
from langchain.docstore.document import Document
from langchain.pydantic_v1 import Field, root_validator
from langchain.vectorstores.base import VectorStore
from langchain.schema.vectorstore import VectorStore
class VectorDBQAWithSourcesChain(BaseQAWithSourcesChain):

View File

@@ -33,7 +33,7 @@ refine_template = (
"If the context isn't useful, return the original answer."
)
CHAT_REFINE_PROMPT = ChatPromptTemplate.from_messages(
[("human", "{question}"), ("ai", "{existing_answer}"), ("human", "refine_template")]
[("human", "{question}"), ("ai", "{existing_answer}"), ("human", refine_template)]
)
REFINE_PROMPT_SELECTOR = ConditionalPromptSelector(
default_prompt=DEFAULT_REFINE_PROMPT,

View File

@@ -21,7 +21,7 @@ from langchain.prompts import PromptTemplate
from langchain.pydantic_v1 import Extra, Field, root_validator
from langchain.schema import BaseRetriever, Document
from langchain.schema.language_model import BaseLanguageModel
from langchain.vectorstores.base import VectorStore
from langchain.schema.vectorstore import VectorStore
class BaseRetrievalQA(Chain):
@@ -197,8 +197,8 @@ class RetrievalQA(BaseRetrievalQA):
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA
from langchain.faiss import FAISS
from langchain.vectorstores.base import VectorStoreRetriever
from langchain.vectorstores import FAISS
from langchain.schema.vectorstore import VectorStoreRetriever
retriever = VectorStoreRetriever(vectorstore=FAISS(...))
retrievalQA = RetrievalQA.from_llm(llm=OpenAI(), retriever=retriever)

View File

@@ -7,7 +7,7 @@ from langchain.chains.router.base import RouterChain
from langchain.docstore.document import Document
from langchain.pydantic_v1 import Extra
from langchain.schema.embeddings import Embeddings
from langchain.vectorstores.base import VectorStore
from langchain.schema.vectorstore import VectorStore
class EmbeddingRouterChain(RouterChain):

View File

@@ -1,15 +1,7 @@
from abc import ABC, abstractmethod
from typing import Iterator, List, Sequence, TypedDict
from typing import Iterator, List
from langchain.schema.messages import BaseMessage
class ChatSession(TypedDict):
"""Chat Session represents a single
conversation, channel, or other group of messages."""
messages: Sequence[BaseMessage]
"""The LangChain chat messages loaded from the source."""
from langchain.schema.chat import ChatSession
class BaseChatLoader(ABC):

View File

@@ -3,7 +3,8 @@ import logging
from pathlib import Path
from typing import Iterator, Union
from langchain.chat_loaders.base import BaseChatLoader, ChatSession
from langchain.chat_loaders.base import BaseChatLoader
from langchain.schema.chat import ChatSession
from langchain.schema.messages import HumanMessage
logger = logging.getLogger(__file__)

View File

@@ -2,7 +2,8 @@ import base64
import re
from typing import Any, Iterator
from langchain.chat_loaders.base import BaseChatLoader, ChatSession
from langchain.chat_loaders.base import BaseChatLoader
from langchain.schema.chat import ChatSession
from langchain.schema.messages import HumanMessage

View File

@@ -3,8 +3,9 @@ from __future__ import annotations
from pathlib import Path
from typing import TYPE_CHECKING, Iterator, List, Optional, Union
from langchain.chat_loaders.base import BaseChatLoader, ChatSession
from langchain.chat_loaders.base import BaseChatLoader
from langchain.schema import HumanMessage
from langchain.schema.chat import ChatSession
if TYPE_CHECKING:
import sqlite3

View File

@@ -5,8 +5,9 @@ import zipfile
from pathlib import Path
from typing import Dict, Iterator, List, Union
from langchain.chat_loaders.base import BaseChatLoader, ChatSession
from langchain.chat_loaders.base import BaseChatLoader
from langchain.schema import AIMessage, HumanMessage
from langchain.schema.chat import ChatSession
logger = logging.getLogger(__name__)

View File

@@ -6,8 +6,9 @@ import zipfile
from pathlib import Path
from typing import Iterator, List, Union
from langchain.chat_loaders.base import BaseChatLoader, ChatSession
from langchain.chat_loaders.base import BaseChatLoader
from langchain.schema import AIMessage, BaseMessage, HumanMessage
from langchain.schema.chat import ChatSession
logger = logging.getLogger(__name__)

View File

@@ -2,7 +2,7 @@
from copy import deepcopy
from typing import Iterable, Iterator, List
from langchain.chat_loaders.base import ChatSession
from langchain.schema.chat import ChatSession
from langchain.schema.messages import AIMessage, BaseMessage

View File

@@ -4,8 +4,9 @@ import re
import zipfile
from typing import Iterator, List, Union
from langchain.chat_loaders.base import BaseChatLoader, ChatSession
from langchain.chat_loaders.base import BaseChatLoader
from langchain.schema import AIMessage, HumanMessage
from langchain.schema.chat import ChatSession
logger = logging.getLogger(__name__)

View File

@@ -24,6 +24,7 @@ from langchain.chat_models.baidu_qianfan_endpoint import QianfanChatEndpoint
from langchain.chat_models.bedrock import BedrockChat
from langchain.chat_models.ernie import ErnieBotChat
from langchain.chat_models.fake import FakeListChatModel
from langchain.chat_models.fireworks import ChatFireworks
from langchain.chat_models.google_palm import ChatGooglePalm
from langchain.chat_models.human import HumanInputChatModel
from langchain.chat_models.javelin_ai_gateway import ChatJavelinAIGateway
@@ -57,4 +58,5 @@ __all__ = [
"ChatJavelinAIGateway",
"ChatKonko",
"QianfanChatEndpoint",
"ChatFireworks",
]

View File

@@ -99,8 +99,9 @@ class ChatAnthropic(BaseChatModel, _AnthropicCommon):
"""Return type of chat model."""
return "anthropic-chat"
@property
def lc_serializable(self) -> bool:
@classmethod
def is_lc_serializable(cls) -> bool:
"""Return whether this model can be serialized by Langchain."""
return True
def _convert_messages_to_prompt(self, messages: List[BaseMessage]) -> str:

View File

@@ -4,7 +4,7 @@ from __future__ import annotations
import logging
import os
import sys
from typing import TYPE_CHECKING, Optional, Set
from typing import TYPE_CHECKING, Dict, Optional, Set
import requests
@@ -50,7 +50,7 @@ class ChatAnyscale(ChatOpenAI):
return "anyscale-chat"
@property
def lc_secrets(self) -> dict[str, str]:
def lc_secrets(self) -> Dict[str, str]:
return {"anyscale_api_key": "ANYSCALE_API_KEY"}
anyscale_api_key: Optional[str] = None

View File

@@ -139,6 +139,7 @@ class BaseChatModel(BaseLanguageModel[BaseMessageChunk], ABC):
callbacks=config.get("callbacks"),
tags=config.get("tags"),
metadata=config.get("metadata"),
run_name=config.get("run_name"),
**kwargs,
).generations[0][0],
).message,
@@ -165,6 +166,7 @@ class BaseChatModel(BaseLanguageModel[BaseMessageChunk], ABC):
callbacks=config.get("callbacks"),
tags=config.get("tags"),
metadata=config.get("metadata"),
run_name=config.get("run_name"),
**kwargs,
)
return cast(
@@ -197,7 +199,11 @@ class BaseChatModel(BaseLanguageModel[BaseMessageChunk], ABC):
self.metadata,
)
(run_manager,) = callback_manager.on_chat_model_start(
dumpd(self), [messages], invocation_params=params, options=options
dumpd(self),
[messages],
invocation_params=params,
options=options,
name=config.get("run_name"),
)
try:
generation: Optional[ChatGenerationChunk] = None
@@ -244,7 +250,11 @@ class BaseChatModel(BaseLanguageModel[BaseMessageChunk], ABC):
self.metadata,
)
(run_manager,) = await callback_manager.on_chat_model_start(
dumpd(self), [messages], invocation_params=params, options=options
dumpd(self),
[messages],
invocation_params=params,
options=options,
name=config.get("run_name"),
)
try:
generation: Optional[ChatGenerationChunk] = None
@@ -280,7 +290,7 @@ class BaseChatModel(BaseLanguageModel[BaseMessageChunk], ABC):
return {**params, **kwargs}
def _get_llm_string(self, stop: Optional[List[str]] = None, **kwargs: Any) -> str:
if self.lc_serializable:
if self.is_lc_serializable():
params = {**kwargs, **{"stop": stop}}
param_string = str(sorted([(k, v) for k, v in params.items()]))
llm_string = dumps(self)
@@ -298,6 +308,7 @@ class BaseChatModel(BaseLanguageModel[BaseMessageChunk], ABC):
*,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
run_name: Optional[str] = None,
**kwargs: Any,
) -> LLMResult:
"""Top Level call"""
@@ -314,7 +325,11 @@ class BaseChatModel(BaseLanguageModel[BaseMessageChunk], ABC):
self.metadata,
)
run_managers = callback_manager.on_chat_model_start(
dumpd(self), messages, invocation_params=params, options=options
dumpd(self),
messages,
invocation_params=params,
options=options,
name=run_name,
)
results = []
for i, m in enumerate(messages):
@@ -354,6 +369,7 @@ class BaseChatModel(BaseLanguageModel[BaseMessageChunk], ABC):
*,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
run_name: Optional[str] = None,
**kwargs: Any,
) -> LLMResult:
"""Top Level call"""
@@ -371,7 +387,11 @@ class BaseChatModel(BaseLanguageModel[BaseMessageChunk], ABC):
)
run_managers = await callback_manager.on_chat_model_start(
dumpd(self), messages, invocation_params=params, options=options
dumpd(self),
messages,
invocation_params=params,
options=options,
name=run_name,
)
results = await asyncio.gather(

View File

@@ -1,7 +1,6 @@
from typing import Any, AsyncIterator, Dict, Iterator, List, Optional
from typing import Any, Dict, Iterator, List, Optional
from langchain.callbacks.manager import (
AsyncCallbackManagerForLLMRun,
CallbackManagerForLLMRun,
)
from langchain.chat_models.anthropic import convert_messages_to_prompt_anthropic
@@ -59,17 +58,6 @@ class BedrockChat(BaseChatModel, BedrockBase):
delta = chunk.text
yield ChatGenerationChunk(message=AIMessageChunk(content=delta))
def _astream(
self,
messages: List[BaseMessage],
stop: Optional[List[str]] = None,
run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> AsyncIterator[ChatGenerationChunk]:
raise NotImplementedError(
"""Bedrock doesn't support async requests at the moment."""
)
def _generate(
self,
messages: List[BaseMessage],
@@ -98,14 +86,3 @@ class BedrockChat(BaseChatModel, BedrockBase):
message = AIMessage(content=completion)
return ChatResult(generations=[ChatGeneration(message=message)])
async def _agenerate(
self,
messages: List[BaseMessage],
stop: Optional[List[str]] = None,
run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> ChatResult:
raise NotImplementedError(
"""Bedrock doesn't support async stream requests at the moment."""
)

View File

@@ -0,0 +1,292 @@
from typing import (
Any,
AsyncIterator,
Callable,
Dict,
Iterator,
List,
Optional,
Type,
Union,
)
from langchain.adapters.openai import convert_message_to_dict
from langchain.callbacks.manager import (
AsyncCallbackManagerForLLMRun,
CallbackManagerForLLMRun,
)
from langchain.chat_models.base import BaseChatModel
from langchain.llms.base import create_base_retry_decorator
from langchain.pydantic_v1 import Field, root_validator
from langchain.schema.messages import (
AIMessage,
AIMessageChunk,
BaseMessage,
BaseMessageChunk,
ChatMessage,
ChatMessageChunk,
FunctionMessage,
FunctionMessageChunk,
HumanMessage,
HumanMessageChunk,
SystemMessage,
SystemMessageChunk,
)
from langchain.schema.output import ChatGeneration, ChatGenerationChunk, ChatResult
from langchain.utils.env import get_from_dict_or_env
def _convert_delta_to_message_chunk(
_dict: Any, default_class: Type[BaseMessageChunk]
) -> BaseMessageChunk:
"""Convert a delta response to a message chunk."""
role = _dict.role
content = _dict.content or ""
additional_kwargs: Dict = {}
if role == "user" or default_class == HumanMessageChunk:
return HumanMessageChunk(content=content)
elif role == "assistant" or default_class == AIMessageChunk:
return AIMessageChunk(content=content, additional_kwargs=additional_kwargs)
elif role == "system" or default_class == SystemMessageChunk:
return SystemMessageChunk(content=content)
elif role == "function" or default_class == FunctionMessageChunk:
return FunctionMessageChunk(content=content, name=_dict.name)
elif role or default_class == ChatMessageChunk:
return ChatMessageChunk(content=content, role=role)
else:
return default_class(content=content)
def convert_dict_to_message(_dict: Any) -> BaseMessage:
"""Convert a dict response to a message."""
role = _dict.role
content = _dict.content or ""
if role == "user":
return HumanMessage(content=content)
elif role == "assistant":
content = _dict.content
additional_kwargs: Dict = {}
return AIMessage(content=content, additional_kwargs=additional_kwargs)
elif role == "system":
return SystemMessage(content=content)
elif role == "function":
return FunctionMessage(content=content, name=_dict.name)
else:
return ChatMessage(content=content, role=role)
class ChatFireworks(BaseChatModel):
"""Fireworks Chat models."""
model: str = "accounts/fireworks/models/llama-v2-7b-chat"
model_kwargs: dict = Field(
default_factory=lambda: {
"temperature": 0.7,
"max_tokens": 512,
"top_p": 1,
}.copy()
)
fireworks_api_key: Optional[str] = None
max_retries: int = 20
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key in environment."""
try:
import fireworks.client
except ImportError as e:
raise ImportError("") from e
fireworks_api_key = get_from_dict_or_env(
values, "fireworks_api_key", "FIREWORKS_API_KEY"
)
fireworks.client.api_key = fireworks_api_key
return values
@property
def _llm_type(self) -> str:
"""Return type of llm."""
return "fireworks-chat"
def _generate(
self,
messages: List[BaseMessage],
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> ChatResult:
message_dicts = self._create_message_dicts(messages, stop)
params = {
"model": self.model,
"messages": message_dicts,
**self.model_kwargs,
}
response = completion_with_retry(self, run_manager=run_manager, **params)
return self._create_chat_result(response)
async def _agenerate(
self,
messages: List[BaseMessage],
stop: Optional[List[str]] = None,
run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> ChatResult:
message_dicts = self._create_message_dicts(messages, stop)
params = {
"model": self.model,
"messages": message_dicts,
**self.model_kwargs,
}
response = await acompletion_with_retry(self, run_manager=run_manager, **params)
return self._create_chat_result(response)
def _combine_llm_outputs(self, llm_outputs: List[Optional[dict]]) -> dict:
if llm_outputs[0] is None:
return {}
return llm_outputs[0]
def _create_chat_result(self, response: Any) -> ChatResult:
generations = []
for res in response.choices:
message = convert_dict_to_message(res.message)
gen = ChatGeneration(
message=message,
generation_info=dict(finish_reason=res.finish_reason),
)
generations.append(gen)
llm_output = {"model": self.model}
return ChatResult(generations=generations, llm_output=llm_output)
def _create_message_dicts(
self, messages: List[BaseMessage], stop: Optional[List[str]]
) -> List[Dict[str, Any]]:
message_dicts = [convert_message_to_dict(m) for m in messages]
return message_dicts
def _stream(
self,
messages: List[BaseMessage],
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> Iterator[ChatGenerationChunk]:
message_dicts = self._create_message_dicts(messages, stop)
default_chunk_class = AIMessageChunk
params = {
"model": self.model,
"messages": message_dicts,
"stream": True,
**self.model_kwargs,
}
for chunk in completion_with_retry(self, run_manager=run_manager, **params):
choice = chunk.choices[0]
chunk = _convert_delta_to_message_chunk(choice.delta, default_chunk_class)
finish_reason = choice.finish_reason
generation_info = (
dict(finish_reason=finish_reason) if finish_reason is not None else None
)
default_chunk_class = chunk.__class__
yield ChatGenerationChunk(message=chunk, generation_info=generation_info)
async def _astream(
self,
messages: List[BaseMessage],
stop: Optional[List[str]] = None,
run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> AsyncIterator[ChatGenerationChunk]:
message_dicts = self._create_message_dicts(messages, stop)
default_chunk_class = AIMessageChunk
params = {
"model": self.model,
"messages": message_dicts,
"stream": True,
**self.model_kwargs,
}
async for chunk in await acompletion_with_retry_streaming(
self, run_manager=run_manager, **params
):
choice = chunk.choices[0]
chunk = _convert_delta_to_message_chunk(choice.delta, default_chunk_class)
finish_reason = choice.finish_reason
generation_info = (
dict(finish_reason=finish_reason) if finish_reason is not None else None
)
default_chunk_class = chunk.__class__
yield ChatGenerationChunk(message=chunk, generation_info=generation_info)
def completion_with_retry(
llm: ChatFireworks,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> Any:
"""Use tenacity to retry the completion call."""
import fireworks.client
retry_decorator = _create_retry_decorator(llm, run_manager=run_manager)
@retry_decorator
def _completion_with_retry(**kwargs: Any) -> Any:
return fireworks.client.ChatCompletion.create(
**kwargs,
)
return _completion_with_retry(**kwargs)
async def acompletion_with_retry(
llm: ChatFireworks,
run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> Any:
"""Use tenacity to retry the async completion call."""
import fireworks.client
retry_decorator = _create_retry_decorator(llm, run_manager=run_manager)
@retry_decorator
async def _completion_with_retry(**kwargs: Any) -> Any:
return await fireworks.client.ChatCompletion.acreate(
**kwargs,
)
return await _completion_with_retry(**kwargs)
async def acompletion_with_retry_streaming(
llm: ChatFireworks,
run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> Any:
"""Use tenacity to retry the completion call for streaming."""
import fireworks.client
retry_decorator = _create_retry_decorator(llm, run_manager=run_manager)
@retry_decorator
async def _completion_with_retry(**kwargs: Any) -> Any:
return fireworks.client.ChatCompletion.acreate(
**kwargs,
)
return await _completion_with_retry(**kwargs)
def _create_retry_decorator(
llm: ChatFireworks,
run_manager: Optional[
Union[AsyncCallbackManagerForLLMRun, CallbackManagerForLLMRun]
] = None,
) -> Callable[[Any], Any]:
"""Define retry mechanism."""
import fireworks.client
errors = [
fireworks.client.error.RateLimitError,
fireworks.client.error.ServiceUnavailableError,
]
return create_base_retry_decorator(
error_types=errors, max_retries=llm.max_retries, run_manager=run_manager
)

View File

@@ -12,6 +12,7 @@ from typing import (
Mapping,
Optional,
Tuple,
Type,
Union,
)
@@ -91,7 +92,7 @@ async def acompletion_with_retry(llm: JinaChat, **kwargs: Any) -> Any:
def _convert_delta_to_message_chunk(
_dict: Mapping[str, Any], default_class: type[BaseMessageChunk]
_dict: Mapping[str, Any], default_class: Type[BaseMessageChunk]
) -> BaseMessageChunk:
role = _dict.get("role")
content = _dict.get("content") or ""
@@ -164,8 +165,9 @@ class JinaChat(BaseChatModel):
def lc_secrets(self) -> Dict[str, str]:
return {"jinachat_api_key": "JINACHAT_API_KEY"}
@property
def lc_serializable(self) -> bool:
@classmethod
def is_lc_serializable(cls) -> bool:
"""Return whether this model can be serialized by Langchain."""
return True
client: Any #: :meta private:

View File

@@ -55,8 +55,9 @@ class ChatKonko(ChatOpenAI):
def lc_secrets(self) -> Dict[str, str]:
return {"konko_api_key": "KONKO_API_KEY", "openai_api_key": "OPENAI_API_KEY"}
@property
def lc_serializable(self) -> bool:
@classmethod
def is_lc_serializable(cls) -> bool:
"""Return whether this model can be serialized by Langchain."""
return True
client: Any = None #: :meta private:

Some files were not shown because too many files have changed in this diff Show More