Compare commits

56 Commits

Author SHA1 Message Date
William Fu-Hinthorn
4ecbb3aeac Delete deprecated run evaluator loaders 2023-07-10 10:52:19 -07:00
William Fu-Hinthorn
4d50092103 Switch to langsmith 2023-07-10 10:49:37 -07:00
Leonid Ganeline
5eec74d9a5 docstrings document_loaders 3 (#6937)
- Updated docstrings for `document_loaders`
- Mass update `"""Loader that loads` to `"""Loads`

@baskaryan  - please, review
2023-07-10 08:56:53 -07:00
Stanko Kuveljic
9d13dcd17c Pinecone: Add V4 support (#7473) 2023-07-10 08:39:47 -07:00
Adilkhan Sarsen
5debd5043e Added deeplake use case examples of the new features (#6528)
1. Added use cases for the new features
2. Refactored some code

---------

Co-authored-by: Ivo Stranic <istranic@gmail.com>
2023-07-10 07:04:29 -07:00
Bagatur
9b615022e2 bump 229 (#7467) 2023-07-10 04:38:55 -04:00
Kazuki Maeda
92b4418c8c Datadog logs loader (#7356)
### Description
Created a Loader to get a list of specific logs from Datadog Logs.

### Dependencies
`datadog_api_client` is required.

### Twitter handle
[kzk_maeda](https://twitter.com/kzk_maeda)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-10 04:27:55 -04:00
Yifei Song
7d29bb2c02 Add Xorbits Dataframe as a Document Loader (#7319)
- [Xorbits](https://doc.xorbits.io/en/latest/) is an open-source
computing framework that makes it easy to scale data science and machine
learning workloads in parallel. Xorbits can leverage multi cores or GPUs
to accelerate computation on a single machine, or scale out up to
thousands of machines to support processing terabytes of data.

- This PR added support for the Xorbits document loader, which allows
langchain to leverage Xorbits to parallelize and distribute the loading
of data.
- Dependencies: This change requires the Xorbits library to be installed
in order to be used.
`pip install xorbits`
- Request for review: @rlancemartin, @eyurtsev
- Twitter handle: https://twitter.com/Xorbitsio

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-10 04:24:47 -04:00
Sergio Moreno
21a353e9c2 feat: ctransformers support async chain (#6859)
- Description: Adding async method for CTransformers 
- Issue: Without this code, I found it impossible to run WebSockets
inside a FastAPI microservice with a CTransformers model.
  - Tag maintainer: Not necessary yet, I don't like to mention directly 
  - Twitter handle: @_semoal
2023-07-10 04:23:41 -04:00
Paul-Emile Brotons
d2cf0d16b3 adding max_marginal_relevance_search method to MongoDBAtlasVectorSearch (#7310)
Adding a maximal_marginal_relevance method to the
MongoDBAtlasVectorSearch vectorstore enhances the user experience by
providing more diverse search results

Issue: #7304
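
To illustrate why maximal marginal relevance gives more diverse results, here is a minimal, dependency-free sketch of the selection loop. It is not the `MongoDBAtlasVectorSearch` implementation; the function names and the `lambda_mult` trade-off parameter are assumptions made for this example.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr_select(
    query: Sequence[float],
    candidates: List[Sequence[float]],
    k: int = 2,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Return indices of up to k candidates, trading relevance for diversity."""
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_idx, best_score = remaining[0], -math.inf
        for idx in remaining:
            relevance = cosine_similarity(query, candidates[idx])
            # Redundancy is the highest similarity to anything already picked.
            redundancy = max(
                (cosine_similarity(candidates[idx], candidates[s]) for s in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = idx, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


query = [1.0, 0.3]
docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
picked = mmr_select(query, docs, k=2, lambda_mult=0.5)
```

With `lambda_mult=1.0` the loop reduces to plain similarity ranking and returns the two near-duplicate vectors; lower values penalize candidates that resemble documents already selected.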
2023-07-10 04:04:19 -04:00
Bagatur
04cddfba0d Add lark import error (#7465) 2023-07-10 03:21:23 -04:00
Matt Robinson
bcab894f4e feat: Add UnstructuredTSVLoader (#7367)
### Summary

Adds an `UnstructuredTSVLoader` for TSV files. Also updates the doc
strings for `UnstructuredCSV` and `UnstructuredExcel` loaders.

### Testing

```python
from langchain.document_loaders.tsv import UnstructuredTSVLoader

loader = UnstructuredTSVLoader(
    file_path="example_data/mlb_teams_2012.csv", mode="elements"
)
docs = loader.load()
```
2023-07-10 03:07:10 -04:00
Ronald Li
490f4a9ff0 Fixes KeyError in AmazonKendraRetriever initializer (#7464)
### Description
The `client` argument was marked as required in commit
81e5b1ad36, which breaks the default initialization that provides only
`index_id`. This commit avoids the KeyError exception when the retriever
is initialized without a `client` argument.
### Dependencies
no dependency required
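
The failure mode described above can be sketched without any AWS dependency. The helper below is hypothetical (it is not the retriever's actual validator, and `client_factory` is an illustrative stand-in); it only shows why looking the key up with `.get` avoids the `KeyError`.

```python
from typing import Any, Callable, Dict


def init_retriever_values(
    values: Dict[str, Any],
    client_factory: Callable[[], Any],
) -> Dict[str, Any]:
    """Fill in a default client when the caller supplied only `index_id`."""
    # `values["client"]` would raise KeyError when the key is absent;
    # `.get` returns None instead, so a default client can be created.
    if values.get("client") is None:
        values["client"] = client_factory()
    return values


# Hypothetical stand-in for a real service client.
initialized = init_retriever_values(
    {"index_id": "my-index"},
    client_factory=lambda: "default-client",
)
```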
2023-07-10 03:02:36 -04:00
Jona Sassenhagen
7ffc431b3a Add spacy sentencizer (#7442)
`SpacyTextSplitter` currently uses spacy's statistics-based
`en_core_web_sm` model for sentence splitting. This is a good splitter,
but it's also pretty slow, and in this case it's doing a lot of work
that's not needed given that the spacy parse is then just thrown away.
However, there is also a simple rules-based spacy sentencizer. Using
this is at least an order of magnitude faster than using
`en_core_web_sm` according to my local tests.
Also, spacy sentence tokenization based on `en_core_web_sm` can be sped
up in this case by not doing the NER stage. This shaves some cycles too,
both when loading the model and when parsing the text.

Consequently, this PR adds the option to use the basic spacy
sentencizer, and it disables the NER stage for the current approach,
*which is kept as the default*.

Lastly, when extracting the tokenized sentences, the `text` attribute is
called directly instead of doing the string conversion, which is IMO a
bit more idiomatic.
2023-07-10 02:52:05 -04:00
charosen
50a9fcccb0 feat(module): add param ids to ElasticVectorSearch.from_texts method (#7425)
# add param ids to ElasticVectorSearch.from_texts method.

- Description: add param ids to ElasticVectorSearch.from_texts method.
- Issue: NA. `add_texts` already supports passing in document
ids, but the `ids` param is omitted from the `from_texts` classmethod.
- Dependencies: None,
- Tag maintainer: @rlancemartin, @eyurtsev please have a look, thanks

```
    # ElasticVectorSearch add_texts
    def add_texts(
        self,
        texts: Iterable[str],
        metadatas: Optional[List[dict]] = None,
        refresh_indices: bool = True,
        ids: Optional[List[str]] = None,
        **kwargs: Any,
    ) -> List[str]:
        ...

```

```
    # ElasticVectorSearch from_texts
    @classmethod
    def from_texts(
        cls,
        texts: List[str],
        embedding: Embeddings,
        metadatas: Optional[List[dict]] = None,
        elasticsearch_url: Optional[str] = None,
        index_name: Optional[str] = None,
        refresh_indices: bool = True,
        **kwargs: Any,
    ) -> ElasticVectorSearch:

```
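
As a rough sketch of the change, the toy class below forwards `ids` from `from_texts` to `add_texts`. It is a hypothetical in-memory store written for illustration, not the `ElasticVectorSearch` code, and it omits embeddings and the Elasticsearch connection entirely.

```python
from typing import Dict, Iterable, List, Optional
from uuid import uuid4


class ToyVectorStore:
    """Hypothetical in-memory store; not the ElasticVectorSearch class."""

    def __init__(self) -> None:
        self.docs: Dict[str, str] = {}

    def add_texts(
        self,
        texts: Iterable[str],
        ids: Optional[List[str]] = None,
    ) -> List[str]:
        texts = list(texts)
        # Generate ids only when the caller did not supply any.
        ids = ids or [str(uuid4()) for _ in texts]
        for doc_id, text in zip(ids, texts):
            self.docs[doc_id] = text
        return ids

    @classmethod
    def from_texts(
        cls,
        texts: List[str],
        ids: Optional[List[str]] = None,
    ) -> "ToyVectorStore":
        store = cls()
        # Forward `ids` so the constructor path behaves like `add_texts`.
        store.add_texts(texts, ids=ids)
        return store


store = ToyVectorStore.from_texts(["alpha", "beta"], ids=["doc-1", "doc-2"])
```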


Co-authored-by: charosen <charosen@bupt.cn>
2023-07-10 02:25:35 -04:00
James Yin
a5fd8873b1 fix: type hint of get_chat_history in BaseConversationalRetrievalChain (#7461)
The type hint of `get_chat_history` property in
`BaseConversationalRetrievalChain` is incorrect. @baskaryan
2023-07-10 02:14:00 -04:00
nikkie
dfc3f83b0f docs(vectorstores/integrations/chroma): Fix loading and saving (#7437)
- Description: Fix loading and saving code about Chroma
- Issue: the issue #7436 
- Dependencies: -
- Twitter handle: https://twitter.com/ftnext
2023-07-10 02:05:15 -04:00
Daniel Chalef
c7f7788d0b Add ZepMemory; improve ZepChatMessageHistory handling of metadata; Fix bugs (#7444)
Hey @hwchase17 - 

This PR adds a `ZepMemory` class, improves handling of Zep's message
metadata, and makes it easier for folks building custom chains to
persist metadata alongside their chat history.

We've had plenty of confused users unfamiliar with ChatMessageHistory
classes and how to wrap the `ZepChatMessageHistory` in a
`ConversationBufferMemory`. So we've created the `ZepMemory` class as a
light wrapper for `ZepChatMessageHistory`.

Details:
- add ZepMemory, modify notebook to demo use of ZepMemory
- Modify summary to be SystemMessage
- add metadata argument to add_message; add Zep metadata to
Message.additional_kwargs
- support passing in metadata
2023-07-10 01:53:49 -04:00
Saurabh Chaturvedi
8f8e8d701e Fix info about YouTube (#7447)
(Unintentionally mean 😅) nit: YouTube wasn't created by Google, this PR
fixes the mention in docs.
2023-07-10 01:52:55 -04:00
Leonid Ganeline
560c4dfc98 docstrings: docstore and client (#6783)
updated docstrings in `docstore/` and `client/`

@baskaryan
2023-07-09 01:34:28 -04:00
Jeroen Van Goey
f5bd88757e Fix typo (#7416)
`quesitons` -> `questions`.
2023-07-09 00:54:48 -04:00
Alejandro Garrido Mota
ea9c3cc9c9 Fix syntax erros in documentation (#7409)
- Description: Tiny documentation fix. In Python, when defining function
parameters or providing arguments to a function or class constructor, we
do not use the `:` character.
- Issue: N/A
- Dependencies: N/A,
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @mogaal
2023-07-08 19:52:01 -04:00
Nolan
5da9f9abcb docs(agents/toolkits): Fix error in document_comparison_toolkit.ipynb (#7417)
- Description: Removes unneeded output warning in documentation at
https://python.langchain.com/docs/modules/agents/toolkits/document_comparison_toolkit
  - Issue: -
  - Dependencies: -
  - Tag maintainer: @baskaryan
  - Twitter handle: @finnless
2023-07-08 19:51:08 -04:00
nikkie
2eb4a2ceea docs(retrievers/get-started): Fix broken state_of_the_union.txt link (#7399)
Thank you for this awesome library.

- Description: Fix broken link in documentation 
- Issue:
-
https://python.langchain.com/docs/modules/data_connection/retrievers/#get-started
- the URL:
https://github.com/hwchase17/langchain/blob/master/docs/modules/state_of_the_union.txt
- I think the right one is
https://github.com/hwchase17/langchain/blob/master/docs/extras/modules/state_of_the_union.txt
- Dependencies: -
- Tag maintainer: @baskaryan
- Twitter handle: -
2023-07-08 11:11:05 -04:00
Delgermurun
e7420789e4 improve description of JinaChat (#7397)
very small doc string change in the `JinaChat` class.
2023-07-08 10:57:11 -04:00
Bagatur
26c86a197c bump 228 (#7393) 2023-07-08 03:05:20 -04:00
SvMax
1d649b127e Added param to return only a structured json from the get_format_instructions method (#5848)
I added a parameter to the `get_format_instructions` method that
returns the JSON instructions directly, without the leading instruction
sentence. I'm planning to use it to define the structure of a JSON
object passed as input.
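
A minimal sketch of the idea, assuming a flag named `only_json` and a made-up preamble sentence; the real method's wording, schema format, and parameter name may differ.

```python
import json
from typing import Dict

# Hypothetical preamble; the library's actual wording may differ.
PREAMBLE = "The output should be a JSON object matching this schema:\n"


def get_format_instructions(schema: Dict[str, str], only_json: bool = False) -> str:
    """Return the JSON schema text, optionally without the leading sentence."""
    body = json.dumps(schema, indent=2)
    return body if only_json else PREAMBLE + body


schema = {"answer": "string", "source": "string"}
full = get_format_instructions(schema)
bare = get_format_instructions(schema, only_json=True)
```

The bare form can be embedded in another prompt, for example to describe the shape of a JSON object supplied as input.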

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-08 02:57:26 -04:00
Bagatur
362bc301df fix jina (#7392) 2023-07-08 02:41:54 -04:00
Delgermurun
a1603fccfb integrate JinaChat (#6927)
Integration with https://chat.jina.ai/api. It is an OpenAI-compatible API.

- Twitter handle:
[https://twitter.com/JinaAI_](https://twitter.com/JinaAI_)

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-08 02:17:04 -04:00
William FH
4ba7396f96 Add single run eval loader (#7390)
Plus 
- add evaluation name to make string and embedding validators work with
the run evaluator loader.
- Rm unused root validator
2023-07-07 23:06:49 -07:00
Roger Yu
633b673b85 Update pinecone.ipynb (#7382)
Fix typo
2023-07-08 01:48:03 -04:00
Oleg Zabluda
4d697d3f24 Allow passing custom prompts to GraphIndexCreator (#7381)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-08 01:47:53 -04:00
William FH
612a74eb7e Make Ref Example Threadsafe (#7383)
Have noticed transient ref example misalignment. I believe this is
caused by the logic of assigning an example within the thread executor
rather than before.
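
The general pattern behind this kind of fix can be sketched as follows. The function names are hypothetical and the real runner code is more involved; the point is only that each run is paired with its example before the task is submitted, so no worker thread looks up shared state later.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def evaluate(run_input: str, example: str) -> Tuple[str, str]:
    """Stand-in for work that needs a run paired with its reference example."""
    return run_input, example


def run_all(inputs: List[str], examples: List[str]) -> List[Tuple[str, str]]:
    with ThreadPoolExecutor(max_workers=4) as executor:
        # Pair each input with its example *before* submitting and pass both
        # as arguments, so no worker reads state that may have changed.
        futures = [
            executor.submit(evaluate, run_input, example)
            for run_input, example in zip(inputs, examples)
        ]
        return [future.result() for future in futures]


pairs = run_all(["q1", "q2", "q3"], ["a1", "a2", "a3"])
```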
2023-07-07 21:50:42 -07:00
William FH
4789c99bc2 Add String Distance and Embedding Evaluators (#7123)
Add a string evaluator and pairwise string evaluator implementation for:
- Embedding distance
- String distance

Update docs
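
For readers unfamiliar with the two metrics, here is a small standard-library sketch of one string distance (Levenshtein) and one embedding distance (cosine). The evaluators added in this commit may use other metrics and libraries; these functions are illustrative only.

```python
import math
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


string_score = levenshtein("kitten", "sitting")
embedding_score = cosine_distance([1.0, 0.0], [0.0, 1.0])
```

In both cases a lower score means the prediction is closer to the reference.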
2023-07-07 21:44:31 -07:00
ljeagle
fb6e63dc36 Upgrade the AwaDB from 0.3.5 to 0.3.6 (#7363) 2023-07-07 20:41:17 -07:00
William FH
c5edbea34a Load Run Evaluator (#7101)
Current problems:
1. Evaluating LLMs or Chat models isn't smooth. Even specifying
'generations' as the output inserts a redundant list into the eval
template
2. Configuring input / prediction / reference keys in the
`get_qa_evaluator` function is confusing. Unless you are using a chain
with the default keys, you have to specify all the variables and need to
reason about whether the key corresponds to the traced run's inputs,
outputs or the examples inputs or outputs.


Proposal:
- Configure the run evaluator according to a model. Use the model type
and input/output keys to assert compatibility where possible. Only need
to specify a reference_key for certain evaluators (which is less
confusing than specifying input keys)


When does this work:
- If you have your langchain model available (assumed always for
run_on_dataset flow)
- If you are evaluating an LLM, Chat model, or chain
- If the LLM or chat models are traced by langchain (wouldn't work if
you add an incompatible schema via the REST API)

When would this fail:
- Currently if you directly create an example from an LLM run, the
outputs are generations with all the extra metadata present. A simple
`example_key` and dumping all to the template could make the evaluations
unreliable
- Doesn't help if you're not using the low level API
- If you want to instantiate the evaluator without instantiating your
chain or LLM (maybe common for monitoring, for instance) -> could also
load from run or run type though

What's ugly:
- Personally think it's better to load evaluators one by one since
passing a config down is pretty confusing.
- Lots of testing needs to be added
- Inconsistent in that it makes a separate run and example input mapper
instead of the original `RunEvaluatorInputMapper`, which maps a run and
example to a single input.

Example usage, running the evaluators for an LLM, Chat Model, and Agent:

```
# Test running for the string evaluators
evaluator_names = ["qa", "criteria"]

model = ChatOpenAI()
configured_evaluators = load_run_evaluators_for_model(evaluator_names, model=model, reference_key="answer")
run_on_dataset(ds_name, model, run_evaluators=configured_evaluators)
```


<details>
  <summary>Full code with dataset upload</summary>
```
## Create dataset
from langchain.evaluation.run_evaluators.loading import load_run_evaluators_for_model
from langchain.evaluation import load_dataset
import pandas as pd

lcds = load_dataset("llm-math")
df = pd.DataFrame(lcds)

from uuid import uuid4
from langsmith import Client
client = Client()
ds_name = "llm-math - " + str(uuid4())[0:8]
ds = client.upload_dataframe(df, name=ds_name, input_keys=["question"], output_keys=["answer"])



## Define the models we'll test over
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.agents import initialize_agent, AgentType

from langchain.tools import tool

llm = OpenAI(temperature=0)
chat_model = ChatOpenAI(temperature=0)

@tool
def sum(a: float, b: float) -> float:
    """Add two numbers"""
    return a + b

def construct_agent():
    return initialize_agent(
        llm=chat_model,
        tools=[sum],
        agent=AgentType.OPENAI_MULTI_FUNCTIONS,
    )

agent = construct_agent()

# Test running for the string evaluators
evaluator_names = ["qa", "criteria"]

models = [llm, chat_model, agent]
run_evaluators = []
for model in models:
    run_evaluators.append(load_run_evaluators_for_model(evaluator_names, model=model, reference_key="answer"))
    

# Run on LLM, Chat Model, and Agent
from langchain.client.runner_utils import run_on_dataset

to_test = [llm, chat_model, construct_agent]

for model, configured_evaluators in zip(to_test, run_evaluators):
    run_on_dataset(ds_name, model, run_evaluators=configured_evaluators, verbose=True)
```
</details>

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-07-07 19:57:59 -07:00
Bagatur
1ac347b4e3 update databerry-chaindesk redirect (#7378) 2023-07-07 19:11:46 -04:00
Joshua Carroll
705d2f5b92 Update the API Reference link in Streamlit integration docs (#7377)
This page:


https://python.langchain.com/docs/modules/callbacks/integrations/streamlit

Has a bad API Reference link currently. This PR fixes it to the correct
link.

Also updates the embedded app link to
https://langchain-mrkl.streamlit.app/ (better name) which is hosted in
langchain-ai/streamlit-agent repo
2023-07-07 17:35:57 -04:00
Georges Petrov
ec033ae277 Rename Databerry to Chaindesk (#7022)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-07 17:28:04 -04:00
Philip Meier
da5b0723d2 update MosaicML inputs and outputs (#7348)
As of today (July 7, 2023), the [MosaicML
API](https://docs.mosaicml.com/en/latest/inference.html#text-completion-requests)
uses `"inputs"` for the prompt

This PR adds support for this new format.
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-07 17:23:11 -04:00
Bearnardd
184ede4e48 Fix buggy output from GraphQAChain (#7372)
fixes https://github.com/hwchase17/langchain/issues/7289
A simple fix of the buggy output of `graph_qa`. If we have several
entities with triplets then the last entry of `triplets` for a given
entity merges with the first entry of the `triplets` of the next entity.
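
The merging behaviour is easy to reproduce with plain strings. The data and helper names below are hypothetical, and the actual fix in `graph_qa` may be written differently; this sketch only shows how per-entity joins lose the separator between entities.

```python
from typing import Dict, List

# Hypothetical knowledge triplets keyed by entity.
knowledge: Dict[str, List[str]] = {
    "Alice": ["Alice knows Bob", "Alice works at Acme"],
    "Bob": ["Bob lives in Paris"],
}


def build_context_buggy(entities: List[str]) -> str:
    context = ""
    for entity in entities:
        # Each join has no trailing newline, so the last triplet of one
        # entity runs straight into the first triplet of the next.
        context += "\n".join(knowledge[entity])
    return context


def build_context_fixed(entities: List[str]) -> str:
    all_triplets: List[str] = []
    for entity in entities:
        all_triplets.extend(knowledge[entity])
    # Join once, so every triplet ends up on its own line.
    return "\n".join(all_triplets)


buggy = build_context_buggy(["Alice", "Bob"])
fixed = build_context_fixed(["Alice", "Bob"])
```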
2023-07-07 17:19:53 -04:00
Harrison Chase
7cdf97ba9b Harrison/add to imports (#7370)
pgvector cleanup
2023-07-07 16:27:44 -04:00
Bagatur
4d427b2397 Base language model docstrings (#7104) 2023-07-07 16:09:10 -04:00
ॐ shivam mamgain
2179d4eef8 Fix for KeyError in MlflowCallbackHandler (#7051)
- Description: `MlflowCallbackHandler` fails with `KeyError: "['name']
not in index"`. See https://github.com/hwchase17/langchain/issues/5770
for more details. Root cause is that LangChain does not pass "name" as a
part of `serialized` argument to `on_llm_start()` callback method. The
commit where this change was made is probably this:
18af149e91.
My bug fix derives "name" from "id" field.
  - Issue: https://github.com/hwchase17/langchain/issues/5770
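
A minimal sketch of deriving a name from the "id" field, assuming "id" is a list of path components whose last element is the class name. The helper name and the "unknown" fallback are illustrative and not taken from the handler.

```python
from typing import Any, Dict


def derive_name(serialized: Dict[str, Any]) -> str:
    """Prefer an explicit "name"; otherwise use the last component of "id"."""
    if "name" in serialized:
        return serialized["name"]
    identifier = serialized.get("id") or []
    # Fall back to a placeholder when neither field is usable.
    return identifier[-1] if identifier else "unknown"


serialized = {"id": ["langchain", "llms", "openai", "OpenAI"]}
name = derive_name(serialized)
```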
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-07 16:08:06 -04:00
Alex Gamble
df746ad821 Add a callback handler for Context (https://getcontext.ai) (#7151)
### Description

Adding a callback handler for Context. Context is a product analytics
platform for AI chat experiences to help you understand how users are
interacting with your product.

I've added the callback library + an example notebook showing its use.

### Dependencies

Requires the user to install the `context-python` library. The library
is lazily-loaded when the callback is instantiated.

### Announcing the feature

We spoke with Harrison a few weeks ago about also doing a blog post
announcing our integration, so will coordinate this with him. Our
Twitter handle for the company is @getcontextai, and the founders are
@_agamble and @HenrySG.

Thanks in advance!
2023-07-07 15:33:29 -04:00
Austin
c9a0f24646 Add verbose parameter for llamacpp (#7253)
**Title:** Add verbose parameter for llamacpp

**Description:**
This pull request adds a 'verbose' parameter to the llamacpp module. The
'verbose' parameter, when set to True, will enable the output of
detailed logs during the execution of the Llama model. This added
parameter can aid in debugging and understanding the internal processes
of the module.

The verbose parameter is a boolean that prints verbose output to stderr
when set to True. By default, the verbose parameter is set to True but
can be toggled off if less output is desired. This new parameter has
been added to the `validate_environment` method of the `LlamaCpp` class
which initializes the `llama_cpp.Llama` API:

```python
class LlamaCpp(LLM):
    ...
    @root_validator()
    def validate_environment(cls, values: Dict) -> Dict:
        ...
        model_param_names = [
            ...
            "verbose",  # New verbose parameter added
        ]
        ...
        values["client"] = Llama(model_path, **model_params)
        ...
```
---------

Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
2023-07-07 15:08:25 -04:00
Kenny
34a2755a54 Allow passing api key into OpenAIWhisperParser (#7281)
This just allows the user to pass in an api_key directly into
OpenAIWhisperParser. Very simple addition.
2023-07-07 15:07:45 -04:00
mrkhalil6
4e7d0c115b Add support for filters and namespaces in similarity search in Pinecone similarity_score_threshold (#7301)
At the moment, pinecone vectorStore does not support filters and
namespaces when using similarity_score_threshold search type.
In this PR, I've implemented that. It passes all the kwargs except
"score_threshold" as that is not a supported argument for method
"similarity_search_with_score".
---------
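
The kwargs handling described in this commit message can be sketched like this. `fake_search` is a hypothetical stand-in for the underlying search call, not the Pinecone client; the wrapper strips `score_threshold` before forwarding the remaining options (such as a filter or namespace) and applies the threshold itself.

```python
from typing import Any, List, Tuple


def fake_search(query: str, **kwargs: Any) -> List[Tuple[str, float]]:
    """Hypothetical backend that rejects an unknown `score_threshold`."""
    if "score_threshold" in kwargs:
        raise TypeError("unexpected keyword argument 'score_threshold'")
    namespace = kwargs.get("namespace", "default")
    return [(f"{namespace}:doc-a", 0.9), (f"{namespace}:doc-b", 0.4)]


def search_with_threshold(query: str, **kwargs: Any) -> List[Tuple[str, float]]:
    # Remove the one argument the backend does not understand.
    score_threshold = kwargs.pop("score_threshold", None)
    results = fake_search(query, **kwargs)
    if score_threshold is None:
        return results
    return [(doc, score) for doc, score in results if score >= score_threshold]


hits = search_with_threshold("q", namespace="prod", score_threshold=0.5)
```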

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-07 15:03:59 -04:00
Manuel Saelices
01dca1e438 Add context to an output parsing error on Pydantic schema to improve exception handling (#7344)
## Changes

- [X] Fill the `llm_output` param when there is an output parsing error
in a Pydantic schema so that we can get the original text that failed to
parse when handling the exception

## Background

With this change, we could do something like this:

```
output_parser = PydanticOutputParser(pydantic_object=pydantic_obj)
chain = ConversationChain(..., output_parser=output_parser)
try:
    response: PydanticSchema = chain.predict(input=input)
except OutputParserException as exc:
    logger.error(
        'OutputParserException while parsing chatbot response: %s', exc.llm_output,
    )
```
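
A self-contained sketch of the same pattern, using a simplified stand-in for the exception class and a JSON parser in place of the Pydantic one. The class and function here are written for illustration and are not the library's definitions.

```python
import json
from typing import Any, Dict, Optional


class OutputParserException(ValueError):
    """Simplified stand-in that keeps the raw model output for inspection."""

    def __init__(self, message: str, llm_output: Optional[str] = None) -> None:
        super().__init__(message)
        self.llm_output = llm_output


def parse_json_output(text: str) -> Dict[str, Any]:
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # Attach the original text so callers can log it or retry with it.
        raise OutputParserException(
            f"Failed to parse output: {exc}", llm_output=text
        ) from exc


raw = "Sure! Here is the answer: 42"
try:
    parse_json_output(raw)
    captured = None
except OutputParserException as exc:
    captured = exc.llm_output
```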
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-07 14:49:37 -04:00
Raouf Chebri
1ac6deda89 update extension name (#7359)
hi @rlancemartin ,

We had a new deployment and the `pg_extension` creation command was
updated from `CREATE EXTENSION pg_embedding` to `CREATE EXTENSION
embedding`.

https://github.com/neondatabase/neon/pull/4646

The extension has not been made public yet, so no users will be affected
by this. It will be public next week.

Please let me know if you have any questions.

Thank you in advance 🙏
2023-07-07 11:35:51 -07:00
William FH
4e180dc54e Unset Cache in Tests (#7362)
This is impacting other unit tests that use callbacks since the cache is
still set (just empty)
2023-07-07 11:05:09 -07:00
German Martin
3ce4e46c8c The Fellowship of the Vectors: New Embeddings Filter using clustering. (#7015)
Continuing the Tolkien-inspired series of langchain tools, I bring to
you:
**The Fellowship of the Vectors**, AKA EmbeddingsClusteringFilter.
This document filter uses embeddings to group vectors together into
clusters, then lets you pick an arbitrary number of documents
based on their proximity to the cluster centers. The result is a
representative sample of each cluster.

The original idea is from [Greg Kamradt](https://github.com/gkamradt)
from this video (Level4):
https://www.youtube.com/watch?v=qaPMdcCqtWk&t=365s

I added a few tricks to make it a bit more versatile, so you can
parametrize what to do with duplicate documents in case of cluster
overlap: replace the duplicates with the next closest document or remove
them. This allows you to use it as a special kind of redundancy filter too.
Additionally, you can choose between two orderings: grouped by cluster or
respecting the original retriever scores.
In my use case I was using the docs grouped by cluster to run refine
chains per cluster to generate summaries over a large corpus of
documents.
Let me know if you want to change anything!
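
The selection step can be sketched in plain Python, assuming the cluster centers have already been computed by a clustering step such as k-means (not shown). The function and its `remove_duplicates` flag are hypothetical illustrations of the two duplicate-handling modes described above, not the filter's actual code.

```python
import math
from typing import List, Sequence


def closest_per_cluster(
    vectors: List[Sequence[float]],
    centers: List[Sequence[float]],
    num_closest: int = 1,
    remove_duplicates: bool = False,
) -> List[int]:
    """Return document indices grouped by cluster, nearest to each center first."""
    chosen: List[int] = []
    for center in centers:
        ranked = sorted(
            range(len(vectors)),
            key=lambda i: math.dist(vectors[i], center),
        )
        if remove_duplicates:
            # Take the nearest documents and drop any already chosen.
            picks = [i for i in ranked[:num_closest] if i not in chosen]
        else:
            # Skip chosen documents and fall through to the next closest.
            picks = [i for i in ranked if i not in chosen][:num_closest]
        chosen.extend(picks)
    return chosen


vectors = [[0.1, 0.0], [1.0, 0.0], [5.0, 5.0]]
centers = [[0.0, 0.0], [0.2, 0.0]]  # overlapping clusters
replaced = closest_per_cluster(vectors, centers)
removed = closest_per_cluster(vectors, centers, remove_duplicates=True)
```

Both centers are nearest to the first document, so the default mode substitutes the next closest document for the second cluster, while the removal mode returns a shorter list.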

@rlancemartin, @eyurtsev, @hwchase17,

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
2023-07-07 10:28:17 -07:00
Leonid Ganeline
b489466488 docs: dependents update 4 (#7360)
Updated links and counters of the `dependents` page.
2023-07-07 13:22:30 -04:00
William FH
38ca5c84cb Explicitly list requires_reference in function (#7357) 2023-07-07 10:04:03 -07:00
Harrison Chase
49b2b0e3c0 change embedding to None (#7355) 2023-07-07 12:33:03 -04:00
imaprogrammer
a2830e3056 Update chroma.py: Persist directory from client_settings if provided there (#7087)
Change details:
- Description: When calling db.persist(), a check prevents it from
proceeding because the constructor only sets the member `_persist_directory`
from its parameters. But the ChromaDB client settings also have this
parameter, and if the client_settings parameter is used without passing
persist_directory (which is optional), the `persist` method raises
`ValueError` because `_persist_directory` is not set. This change fixes it
by setting the `_persist_directory` member from client_settings
if it is set there, and otherwise from the constructor parameter.
- Issue: I didn't find any GitHub issue for this, but I discovered it
after calling the persist method
  - Dependencies: None
- Tag maintainer: vectorstore related change - @rlancemartin, @eyurtsev
  - Twitter handle: Don't have one :(

*Additional discussion*: We may need to discuss the way I implemented
the fallback using `or`.
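
For reference, the `or` fallback under discussion looks roughly like the sketch below. `ClientSettings` and `resolve_persist_directory` are hypothetical stand-ins, not the Chroma classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientSettings:
    """Hypothetical stand-in for the client settings object."""

    persist_directory: Optional[str] = None


def resolve_persist_directory(
    client_settings: Optional[ClientSettings],
    persist_directory: Optional[str],
) -> Optional[str]:
    if client_settings is not None:
        # `or` falls back whenever the settings value is falsy (None or "").
        return client_settings.persist_directory or persist_directory
    return persist_directory


from_settings = resolve_persist_directory(ClientSettings("/data/chroma"), None)
from_argument = resolve_persist_directory(ClientSettings(), "/tmp/db")
```

One consequence of using `or` is that an empty string in the settings is treated the same as an unset value, which may or may not be the desired behaviour.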

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
2023-07-07 09:20:27 -07:00
274 changed files with 8568 additions and 3312 deletions

@@ -165,28 +165,35 @@ Classes
callbacks.aim_callback.AimCallbackHandler
callbacks.argilla_callback.ArgillaCallbackHandler
callbacks.arize_callback.ArizeCallbackHandler
callbacks.arthur_callback.ArthurCallbackHandler
callbacks.base.AsyncCallbackHandler
callbacks.base.BaseCallbackHandler
callbacks.base.BaseCallbackManager
callbacks.clearml_callback.ClearMLCallbackHandler
callbacks.comet_ml_callback.CometCallbackHandler
callbacks.file.FileCallbackHandler
callbacks.flyte_callback.FlyteCallbackHandler
callbacks.human.HumanApprovalCallbackHandler
callbacks.human.HumanRejectedException
callbacks.infino_callback.InfinoCallbackHandler
callbacks.manager.AsyncCallbackManager
callbacks.manager.AsyncCallbackManagerForChainRun
callbacks.manager.AsyncCallbackManagerForLLMRun
callbacks.manager.AsyncCallbackManagerForRetrieverRun
callbacks.manager.AsyncCallbackManagerForToolRun
callbacks.manager.AsyncParentRunManager
callbacks.manager.AsyncRunManager
callbacks.manager.BaseRunManager
callbacks.manager.CallbackManager
callbacks.manager.CallbackManagerForChainRun
callbacks.manager.CallbackManagerForLLMRun
callbacks.manager.CallbackManagerForRetrieverRun
callbacks.manager.CallbackManagerForToolRun
callbacks.manager.ParentRunManager
callbacks.manager.RunManager
callbacks.mlflow_callback.MlflowCallbackHandler
callbacks.openai_info.OpenAICallbackHandler
callbacks.promptlayer_callback.PromptLayerCallbackHandler
callbacks.stdout.StdOutCallbackHandler
callbacks.streaming_aiter.AsyncIteratorCallbackHandler
callbacks.streaming_aiter_final_only.AsyncFinalIteratorCallbackHandler
@@ -229,6 +236,8 @@ Functions
callbacks.aim_callback.import_aim
callbacks.clearml_callback.import_clearml
callbacks.comet_ml_callback.import_comet_ml
callbacks.flyte_callback.analyze_text
callbacks.flyte_callback.import_flytekit
callbacks.infino_callback.import_infino
callbacks.manager.env_var_is_set
callbacks.manager.get_openai_callback
@@ -283,9 +292,11 @@ Classes
chains.base.Chain
chains.combine_documents.base.AnalyzeDocumentChain
chains.combine_documents.base.BaseCombineDocumentsChain
chains.combine_documents.map_reduce.CombineDocsProtocol
chains.combine_documents.map_reduce.MapReduceDocumentsChain
chains.combine_documents.map_rerank.MapRerankDocumentsChain
chains.combine_documents.reduce.AsyncCombineDocsProtocol
chains.combine_documents.reduce.CombineDocsProtocol
chains.combine_documents.reduce.ReduceDocumentsChain
chains.combine_documents.refine.RefineDocumentsChain
chains.combine_documents.stuff.StuffDocumentsChain
chains.constitutional_ai.base.ConstitutionalChain
@@ -299,8 +310,10 @@ Classes
chains.flare.prompts.FinishedOutputParser
chains.graph_qa.base.GraphQAChain
chains.graph_qa.cypher.GraphCypherQAChain
chains.graph_qa.hugegraph.HugeGraphQAChain
chains.graph_qa.kuzu.KuzuQAChain
chains.graph_qa.nebulagraph.NebulaGraphQAChain
chains.graph_qa.sparql.GraphSparqlQAChain
chains.hyde.base.HypotheticalDocumentEmbedder
chains.llm.LLMChain
chains.llm_bash.base.LLMBashChain
@@ -363,7 +376,6 @@ Functions
.. autosummary::
:toctree: chains
chains.combine_documents.base.format_document
chains.graph_qa.cypher.extract_cypher
chains.loading.load_chain
chains.loading.load_chain_from_config
@@ -415,6 +427,7 @@ Classes
chat_models.fake.FakeListChatModel
chat_models.google_palm.ChatGooglePalm
chat_models.google_palm.ChatGooglePalmError
chat_models.human.HumanInputChatModel
chat_models.openai.ChatOpenAI
chat_models.promptlayer_openai.PromptLayerChatOpenAI
chat_models.vertexai.ChatVertexAI
@@ -513,6 +526,7 @@ Classes
document_loaders.blob_loaders.youtube_audio.YoutubeAudioLoader
document_loaders.blockchain.BlockchainDocumentLoader
document_loaders.blockchain.BlockchainType
document_loaders.brave_search.BraveSearchLoader
document_loaders.chatgpt.ChatGPTLoader
document_loaders.college_confidential.CollegeConfidentialLoader
document_loaders.confluence.ConfluenceLoader
@@ -520,6 +534,7 @@ Classes
document_loaders.conllu.CoNLLULoader
document_loaders.csv_loader.CSVLoader
document_loaders.csv_loader.UnstructuredCSVLoader
document_loaders.cube_semantic.CubeSemanticLoader
document_loaders.dataframe.DataFrameLoader
document_loaders.diffbot.DiffbotLoader
document_loaders.directory.DirectoryLoader
@@ -645,6 +660,7 @@ Classes
document_loaders.word_document.Docx2txtLoader
document_loaders.word_document.UnstructuredWordDocumentLoader
document_loaders.xml.UnstructuredXMLLoader
document_loaders.xorbits.XorbitsLoader
document_loaders.youtube.GoogleApiYoutubeLoader
document_loaders.youtube.YoutubeLoader
@@ -736,6 +752,7 @@ Classes
embeddings.self_hosted.SelfHostedEmbeddings
embeddings.self_hosted_hugging_face.SelfHostedHuggingFaceEmbeddings
embeddings.self_hosted_hugging_face.SelfHostedHuggingFaceInstructEmbeddings
embeddings.spacy_embeddings.SpacyEmbeddings
embeddings.tensorflow_hub.TensorflowHubEmbeddings
embeddings.vertexai.VertexAIEmbeddings
@@ -790,6 +807,9 @@ Classes
evaluation.comparison.eval_chain.PairwiseStringResultOutputParser
evaluation.criteria.eval_chain.CriteriaEvalChain
evaluation.criteria.eval_chain.CriteriaResultOutputParser
evaluation.embedding_distance.base.EmbeddingDistance
evaluation.embedding_distance.base.EmbeddingDistanceEvalChain
evaluation.embedding_distance.base.PairwiseEmbeddingDistanceEvalChain
evaluation.qa.eval_chain.ContextQAEvalChain
evaluation.qa.eval_chain.CotQAEvalChain
evaluation.qa.eval_chain.QAEvalChain
@@ -799,10 +819,16 @@ Classes
evaluation.run_evaluators.implementations.ChoicesOutputParser
evaluation.run_evaluators.implementations.CriteriaOutputParser
evaluation.run_evaluators.implementations.StringRunEvaluatorInputMapper
evaluation.run_evaluators.implementations.TrajectoryEvalOutputParser
evaluation.run_evaluators.implementations.TrajectoryInputMapper
evaluation.run_evaluators.implementations.TrajectoryRunEvalOutputParser
evaluation.schema.AgentTrajectoryEvaluator
evaluation.schema.EvaluatorType
evaluation.schema.LLMEvalChain
evaluation.schema.PairwiseStringEvaluator
evaluation.schema.StringEvaluator
evaluation.string_distance.base.PairwiseStringDistanceEvalChain
evaluation.string_distance.base.StringDistance
evaluation.string_distance.base.StringDistanceEvalChain
Functions
--------------
@@ -812,6 +838,8 @@ Functions
:toctree: evaluation
evaluation.loading.load_dataset
evaluation.loading.load_evaluator
evaluation.loading.load_evaluators
evaluation.run_evaluators.implementations.get_criteria_evaluator
evaluation.run_evaluators.implementations.get_qa_evaluator
evaluation.run_evaluators.implementations.get_trajectory_evaluator
@@ -1057,6 +1085,7 @@ Functions
llms.aviary.get_completions
llms.aviary.get_models
llms.base.create_base_retry_decorator
llms.base.get_prompts
llms.base.update_cache
llms.cohere.completion_with_retry
@@ -1069,6 +1098,7 @@ Functions
llms.openai.completion_with_retry
llms.openai.update_token_usage
llms.utils.enforce_stop_tokens
llms.vertexai.completion_with_retry
llms.vertexai.is_codey_model
:mod:`langchain.load`: Load
@@ -1241,7 +1271,6 @@ Classes
:toctree: prompts
:template: class.rst
prompts.base.BasePromptTemplate
prompts.base.StringPromptTemplate
prompts.base.StringPromptValue
prompts.chat.AIMessagePromptTemplate
@@ -1316,7 +1345,7 @@ Classes
retrievers.azure_cognitive_search.AzureCognitiveSearchRetriever
retrievers.chatgpt_plugin_retriever.ChatGPTPluginRetriever
retrievers.contextual_compression.ContextualCompressionRetriever
retrievers.databerry.DataberryRetriever
retrievers.chaindesk.ChaindeskRetriever
retrievers.docarray.DocArrayRetriever
retrievers.docarray.SearchType
retrievers.document_compressors.base.BaseDocumentCompressor
@@ -1348,7 +1377,7 @@ Classes
retrievers.multi_query.LineListOutputParser
retrievers.multi_query.MultiQueryRetriever
retrievers.pinecone_hybrid_search.PineconeHybridSearchRetriever
retrievers.pupmed.PubMedRetriever
retrievers.pubmed.PubMedRetriever
retrievers.remote_retriever.RemoteLangChainRetriever
retrievers.self_query.base.SelfQueryRetriever
retrievers.self_query.chroma.ChromaTranslator
@@ -1400,28 +1429,29 @@ Classes
:toctree: schema
:template: class.rst
schema.AIMessage
schema.AgentFinish
schema.BaseChatMessageHistory
schema.BaseDocumentTransformer
schema.BaseLLMOutputParser
schema.BaseMemory
schema.BaseMessage
schema.BaseOutputParser
schema.BaseRetriever
schema.ChatGeneration
schema.ChatMessage
schema.ChatResult
schema.Document
schema.FunctionMessage
schema.Generation
schema.HumanMessage
schema.LLMResult
schema.NoOpOutputParser
schema.OutputParserException
schema.PromptValue
schema.RunInfo
schema.SystemMessage
schema.agent.AgentFinish
schema.document.BaseDocumentTransformer
schema.document.Document
schema.memory.BaseChatMessageHistory
schema.memory.BaseMemory
schema.messages.AIMessage
schema.messages.BaseMessage
schema.messages.ChatMessage
schema.messages.FunctionMessage
schema.messages.HumanMessage
schema.messages.SystemMessage
schema.output.ChatGeneration
schema.output.ChatResult
schema.output.Generation
schema.output.LLMResult
schema.output.RunInfo
schema.output_parser.BaseLLMOutputParser
schema.output_parser.BaseOutputParser
schema.output_parser.NoOpOutputParser
schema.output_parser.OutputParserException
schema.prompt.PromptValue
schema.prompt_template.BasePromptTemplate
schema.retriever.BaseRetriever
Functions
--------------
@@ -1430,9 +1460,10 @@ Functions
.. autosummary::
:toctree: schema
schema.get_buffer_string
schema.messages_from_dict
schema.messages_to_dict
schema.messages.get_buffer_string
schema.messages.messages_from_dict
schema.messages.messages_to_dict
schema.prompt_template.format_document
:mod:`langchain.server`: Server
================================
@@ -1535,6 +1566,8 @@ Classes
tools.bing_search.tool.BingSearchRun
tools.brave_search.tool.BraveSearch
tools.convert_to_openai.FunctionDescription
tools.dataforseo_api_search.tool.DataForSeoAPISearchResults
tools.dataforseo_api_search.tool.DataForSeoAPISearchRun
tools.ddg_search.tool.DuckDuckGoSearchResults
tools.ddg_search.tool.DuckDuckGoSearchRun
tools.file_management.copy.CopyFileTool
@@ -1708,6 +1741,7 @@ Classes
utilities.bibtex.BibtexparserWrapper
utilities.bing_search.BingSearchAPIWrapper
utilities.brave_search.BraveSearchWrapper
utilities.dataforseo_api_search.DataForSeoAPIWrapper
utilities.duckduckgo_search.DuckDuckGoSearchAPIWrapper
utilities.google_places_api.GooglePlacesAPIWrapper
utilities.google_search.GoogleSearchAPIWrapper
@@ -1805,12 +1839,17 @@ Classes
vectorstores.faiss.FAISS
vectorstores.hologres.Hologres
vectorstores.lancedb.LanceDB
vectorstores.marqo.Marqo
vectorstores.matching_engine.MatchingEngine
vectorstores.milvus.Milvus
vectorstores.mongodb_atlas.MongoDBAtlasVectorSearch
vectorstores.myscale.MyScale
vectorstores.myscale.MyScaleSettings
vectorstores.opensearch_vector_search.OpenSearchVectorSearch
vectorstores.pgembedding.BaseModel
vectorstores.pgembedding.CollectionStore
vectorstores.pgembedding.EmbeddingStore
vectorstores.pgembedding.PGEmbedding
vectorstores.pgvector.BaseModel
vectorstores.pgvector.CollectionStore
vectorstores.pgvector.DistanceStrategy


@@ -138,7 +138,11 @@
},
{
"source": "/en/latest/integrations/databerry.html",
"destination": "/docs/ecosystem/integrations/databerry"
"destination": "/docs/ecosystem/integrations/chaindesk"
},
{
"source": "/docs/ecosystem/integrations/databerry",
"destination": "/docs/ecosystem/integrations/chaindesk"
},
{
"source": "/en/latest/integrations/databricks/databricks.html",
@@ -1330,7 +1334,11 @@
},
{
"source": "/en/latest/modules/indexes/retrievers/examples/databerry.html",
"destination": "/docs/modules/data_connection/retrievers/integrations/databerry"
"destination": "/docs/modules/data_connection/retrievers/integrations/chaindesk"
},
{
"source": "/docs/modules/data_connection/retrievers/integrations/databerry",
"destination": "/docs/modules/data_connection/retrievers/integrations/chaindesk"
},
{
"source": "/en/latest/modules/indexes/retrievers/examples/elastic_search_bm25.html",
@@ -2125,4 +2133,4 @@
"destination": "/docs/:path*"
}
]
}
}


@@ -2,188 +2,261 @@
Dependents stats for `hwchase17/langchain`
[![](https://img.shields.io/static/v1?label=Used%20by&message=5152&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[![](https://img.shields.io/static/v1?label=Used%20by%20(public)&message=172&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[![](https://img.shields.io/static/v1?label=Used%20by%20(private)&message=4980&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[![](https://img.shields.io/static/v1?label=Used%20by%20(stars)&message=17239&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[![](https://img.shields.io/static/v1?label=Used%20by&message=9941&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[![](https://img.shields.io/static/v1?label=Used%20by%20(public)&message=244&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[![](https://img.shields.io/static/v1?label=Used%20by%20(private)&message=9697&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[![](https://img.shields.io/static/v1?label=Used%20by%20(stars)&message=19827&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[update: 2023-05-17; only dependent repositories with Stars > 100]
[update: 2023-07-07; only dependent repositories with Stars > 100]
| Repository | Stars |
| :-------- | -----: |
|[openai/openai-cookbook](https://github.com/openai/openai-cookbook) | 35401 |
|[LAION-AI/Open-Assistant](https://github.com/LAION-AI/Open-Assistant) | 32861 |
|[microsoft/TaskMatrix](https://github.com/microsoft/TaskMatrix) | 32766 |
|[hpcaitech/ColossalAI](https://github.com/hpcaitech/ColossalAI) | 29560 |
|[reworkd/AgentGPT](https://github.com/reworkd/AgentGPT) | 22315 |
|[imartinez/privateGPT](https://github.com/imartinez/privateGPT) | 17474 |
|[openai/chatgpt-retrieval-plugin](https://github.com/openai/chatgpt-retrieval-plugin) | 16923 |
|[mindsdb/mindsdb](https://github.com/mindsdb/mindsdb) | 16112 |
|[jerryjliu/llama_index](https://github.com/jerryjliu/llama_index) | 15407 |
|[mlflow/mlflow](https://github.com/mlflow/mlflow) | 14345 |
|[GaiZhenbiao/ChuanhuChatGPT](https://github.com/GaiZhenbiao/ChuanhuChatGPT) | 10372 |
|[databrickslabs/dolly](https://github.com/databrickslabs/dolly) | 9919 |
|[AIGC-Audio/AudioGPT](https://github.com/AIGC-Audio/AudioGPT) | 8177 |
|[logspace-ai/langflow](https://github.com/logspace-ai/langflow) | 6807 |
|[imClumsyPanda/langchain-ChatGLM](https://github.com/imClumsyPanda/langchain-ChatGLM) | 6087 |
|[arc53/DocsGPT](https://github.com/arc53/DocsGPT) | 5292 |
|[e2b-dev/e2b](https://github.com/e2b-dev/e2b) | 4622 |
|[nsarrazin/serge](https://github.com/nsarrazin/serge) | 4076 |
|[madawei2699/myGPTReader](https://github.com/madawei2699/myGPTReader) | 3952 |
|[zauberzeug/nicegui](https://github.com/zauberzeug/nicegui) | 3952 |
|[go-skynet/LocalAI](https://github.com/go-skynet/LocalAI) | 3762 |
|[GreyDGL/PentestGPT](https://github.com/GreyDGL/PentestGPT) | 3388 |
|[mmabrouk/chatgpt-wrapper](https://github.com/mmabrouk/chatgpt-wrapper) | 3243 |
|[zilliztech/GPTCache](https://github.com/zilliztech/GPTCache) | 3189 |
|[wenda-LLM/wenda](https://github.com/wenda-LLM/wenda) | 3050 |
|[marqo-ai/marqo](https://github.com/marqo-ai/marqo) | 2930 |
|[gkamradt/langchain-tutorials](https://github.com/gkamradt/langchain-tutorials) | 2710 |
|[PrefectHQ/marvin](https://github.com/PrefectHQ/marvin) | 2545 |
|[project-baize/baize-chatbot](https://github.com/project-baize/baize-chatbot) | 2479 |
|[whitead/paper-qa](https://github.com/whitead/paper-qa) | 2399 |
|[langgenius/dify](https://github.com/langgenius/dify) | 2344 |
|[GerevAI/gerev](https://github.com/GerevAI/gerev) | 2283 |
|[hwchase17/chat-langchain](https://github.com/hwchase17/chat-langchain) | 2266 |
|[guangzhengli/ChatFiles](https://github.com/guangzhengli/ChatFiles) | 1903 |
|[Azure-Samples/azure-search-openai-demo](https://github.com/Azure-Samples/azure-search-openai-demo) | 1884 |
|[OpenBMB/BMTools](https://github.com/OpenBMB/BMTools) | 1860 |
|[Farama-Foundation/PettingZoo](https://github.com/Farama-Foundation/PettingZoo) | 1813 |
|[OpenGVLab/Ask-Anything](https://github.com/OpenGVLab/Ask-Anything) | 1571 |
|[IntelligenzaArtificiale/Free-Auto-GPT](https://github.com/IntelligenzaArtificiale/Free-Auto-GPT) | 1480 |
|[hwchase17/notion-qa](https://github.com/hwchase17/notion-qa) | 1464 |
|[NVIDIA/NeMo-Guardrails](https://github.com/NVIDIA/NeMo-Guardrails) | 1419 |
|[Unstructured-IO/unstructured](https://github.com/Unstructured-IO/unstructured) | 1410 |
|[Kav-K/GPTDiscord](https://github.com/Kav-K/GPTDiscord) | 1363 |
|[paulpierre/RasaGPT](https://github.com/paulpierre/RasaGPT) | 1344 |
|[StanGirard/quivr](https://github.com/StanGirard/quivr) | 1330 |
|[lunasec-io/lunasec](https://github.com/lunasec-io/lunasec) | 1318 |
|[vocodedev/vocode-python](https://github.com/vocodedev/vocode-python) | 1286 |
|[agiresearch/OpenAGI](https://github.com/agiresearch/OpenAGI) | 1156 |
|[h2oai/h2ogpt](https://github.com/h2oai/h2ogpt) | 1141 |
|[jina-ai/thinkgpt](https://github.com/jina-ai/thinkgpt) | 1106 |
|[yanqiangmiffy/Chinese-LangChain](https://github.com/yanqiangmiffy/Chinese-LangChain) | 1072 |
|[ttengwang/Caption-Anything](https://github.com/ttengwang/Caption-Anything) | 1064 |
|[jina-ai/dev-gpt](https://github.com/jina-ai/dev-gpt) | 1057 |
|[juncongmoo/chatllama](https://github.com/juncongmoo/chatllama) | 1003 |
|[greshake/llm-security](https://github.com/greshake/llm-security) | 1002 |
|[visual-openllm/visual-openllm](https://github.com/visual-openllm/visual-openllm) | 957 |
|[richardyc/Chrome-GPT](https://github.com/richardyc/Chrome-GPT) | 918 |
|[irgolic/AutoPR](https://github.com/irgolic/AutoPR) | 886 |
|[mmz-001/knowledge_gpt](https://github.com/mmz-001/knowledge_gpt) | 867 |
|[thomas-yanxin/LangChain-ChatGLM-Webui](https://github.com/thomas-yanxin/LangChain-ChatGLM-Webui) | 850 |
|[microsoft/X-Decoder](https://github.com/microsoft/X-Decoder) | 837 |
|[peterw/Chat-with-Github-Repo](https://github.com/peterw/Chat-with-Github-Repo) | 826 |
|[cirediatpl/FigmaChain](https://github.com/cirediatpl/FigmaChain) | 782 |
|[hashintel/hash](https://github.com/hashintel/hash) | 778 |
|[seanpixel/Teenage-AGI](https://github.com/seanpixel/Teenage-AGI) | 773 |
|[jina-ai/langchain-serve](https://github.com/jina-ai/langchain-serve) | 738 |
|[corca-ai/EVAL](https://github.com/corca-ai/EVAL) | 737 |
|[ai-sidekick/sidekick](https://github.com/ai-sidekick/sidekick) | 717 |
|[rlancemartin/auto-evaluator](https://github.com/rlancemartin/auto-evaluator) | 703 |
|[poe-platform/api-bot-tutorial](https://github.com/poe-platform/api-bot-tutorial) | 689 |
|[SamurAIGPT/Camel-AutoGPT](https://github.com/SamurAIGPT/Camel-AutoGPT) | 666 |
|[eyurtsev/kor](https://github.com/eyurtsev/kor) | 608 |
|[run-llama/llama-lab](https://github.com/run-llama/llama-lab) | 559 |
|[namuan/dr-doc-search](https://github.com/namuan/dr-doc-search) | 544 |
|[pieroit/cheshire-cat](https://github.com/pieroit/cheshire-cat) | 520 |
|[griptape-ai/griptape](https://github.com/griptape-ai/griptape) | 514 |
|[getmetal/motorhead](https://github.com/getmetal/motorhead) | 481 |
|[hwchase17/chat-your-data](https://github.com/hwchase17/chat-your-data) | 462 |
|[langchain-ai/langchain-aiplugin](https://github.com/langchain-ai/langchain-aiplugin) | 452 |
|[jina-ai/agentchain](https://github.com/jina-ai/agentchain) | 439 |
|[SamurAIGPT/ChatGPT-Developer-Plugins](https://github.com/SamurAIGPT/ChatGPT-Developer-Plugins) | 437 |
|[alexanderatallah/window.ai](https://github.com/alexanderatallah/window.ai) | 433 |
|[michaelthwan/searchGPT](https://github.com/michaelthwan/searchGPT) | 427 |
|[mpaepper/content-chatbot](https://github.com/mpaepper/content-chatbot) | 425 |
|[mckaywrigley/repo-chat](https://github.com/mckaywrigley/repo-chat) | 422 |
|[whyiyhw/chatgpt-wechat](https://github.com/whyiyhw/chatgpt-wechat) | 421 |
|[freddyaboulton/gradio-tools](https://github.com/freddyaboulton/gradio-tools) | 407 |
|[jonra1993/fastapi-alembic-sqlmodel-async](https://github.com/jonra1993/fastapi-alembic-sqlmodel-async) | 395 |
|[yeagerai/yeagerai-agent](https://github.com/yeagerai/yeagerai-agent) | 383 |
|[akshata29/chatpdf](https://github.com/akshata29/chatpdf) | 374 |
|[OpenGVLab/InternGPT](https://github.com/OpenGVLab/InternGPT) | 368 |
|[ruoccofabrizio/azure-open-ai-embeddings-qna](https://github.com/ruoccofabrizio/azure-open-ai-embeddings-qna) | 358 |
|[101dotxyz/GPTeam](https://github.com/101dotxyz/GPTeam) | 357 |
|[mtenenholtz/chat-twitter](https://github.com/mtenenholtz/chat-twitter) | 354 |
|[amosjyng/langchain-visualizer](https://github.com/amosjyng/langchain-visualizer) | 343 |
|[msoedov/langcorn](https://github.com/msoedov/langcorn) | 334 |
|[showlab/VLog](https://github.com/showlab/VLog) | 330 |
|[continuum-llms/chatgpt-memory](https://github.com/continuum-llms/chatgpt-memory) | 324 |
|[steamship-core/steamship-langchain](https://github.com/steamship-core/steamship-langchain) | 323 |
|[daodao97/chatdoc](https://github.com/daodao97/chatdoc) | 320 |
|[xuwenhao/geektime-ai-course](https://github.com/xuwenhao/geektime-ai-course) | 308 |
|[StevenGrove/GPT4Tools](https://github.com/StevenGrove/GPT4Tools) | 301 |
|[logan-markewich/llama_index_starter_pack](https://github.com/logan-markewich/llama_index_starter_pack) | 300 |
|[andylokandy/gpt-4-search](https://github.com/andylokandy/gpt-4-search) | 299 |
|[Anil-matcha/ChatPDF](https://github.com/Anil-matcha/ChatPDF) | 287 |
|[itamargol/openai](https://github.com/itamargol/openai) | 273 |
|[BlackHC/llm-strategy](https://github.com/BlackHC/llm-strategy) | 267 |
|[momegas/megabots](https://github.com/momegas/megabots) | 259 |
|[bborn/howdoi.ai](https://github.com/bborn/howdoi.ai) | 238 |
|[Cheems-Seminar/grounded-segment-any-parts](https://github.com/Cheems-Seminar/grounded-segment-any-parts) | 232 |
|[ur-whitelab/exmol](https://github.com/ur-whitelab/exmol) | 227 |
|[sullivan-sean/chat-langchainjs](https://github.com/sullivan-sean/chat-langchainjs) | 227 |
|[explosion/spacy-llm](https://github.com/explosion/spacy-llm) | 226 |
|[recalign/RecAlign](https://github.com/recalign/RecAlign) | 218 |
|[jupyterlab/jupyter-ai](https://github.com/jupyterlab/jupyter-ai) | 218 |
|[alvarosevilla95/autolang](https://github.com/alvarosevilla95/autolang) | 215 |
|[conceptofmind/toolformer](https://github.com/conceptofmind/toolformer) | 213 |
|[MagnivOrg/prompt-layer-library](https://github.com/MagnivOrg/prompt-layer-library) | 209 |
|[JohnSnowLabs/nlptest](https://github.com/JohnSnowLabs/nlptest) | 208 |
|[airobotlab/KoChatGPT](https://github.com/airobotlab/KoChatGPT) | 197 |
|[langchain-ai/auto-evaluator](https://github.com/langchain-ai/auto-evaluator) | 195 |
|[yvann-hub/Robby-chatbot](https://github.com/yvann-hub/Robby-chatbot) | 195 |
|[alejandro-ao/langchain-ask-pdf](https://github.com/alejandro-ao/langchain-ask-pdf) | 192 |
|[daveebbelaar/langchain-experiments](https://github.com/daveebbelaar/langchain-experiments) | 189 |
|[NimbleBoxAI/ChainFury](https://github.com/NimbleBoxAI/ChainFury) | 187 |
|[kaleido-lab/dolphin](https://github.com/kaleido-lab/dolphin) | 184 |
|[Anil-matcha/Website-to-Chatbot](https://github.com/Anil-matcha/Website-to-Chatbot) | 183 |
|[plchld/InsightFlow](https://github.com/plchld/InsightFlow) | 180 |
|[OpenBMB/AgentVerse](https://github.com/OpenBMB/AgentVerse) | 166 |
|[benthecoder/ClassGPT](https://github.com/benthecoder/ClassGPT) | 166 |
|[jbrukh/gpt-jargon](https://github.com/jbrukh/gpt-jargon) | 161 |
|[hardbyte/qabot](https://github.com/hardbyte/qabot) | 160 |
|[shaman-ai/agent-actors](https://github.com/shaman-ai/agent-actors) | 153 |
|[radi-cho/datasetGPT](https://github.com/radi-cho/datasetGPT) | 153 |
|[poe-platform/poe-protocol](https://github.com/poe-platform/poe-protocol) | 152 |
|[paolorechia/learn-langchain](https://github.com/paolorechia/learn-langchain) | 149 |
|[ajndkr/lanarky](https://github.com/ajndkr/lanarky) | 149 |
|[fengyuli-dev/multimedia-gpt](https://github.com/fengyuli-dev/multimedia-gpt) | 147 |
|[yasyf/compress-gpt](https://github.com/yasyf/compress-gpt) | 144 |
|[homanp/superagent](https://github.com/homanp/superagent) | 143 |
|[realminchoi/babyagi-ui](https://github.com/realminchoi/babyagi-ui) | 141 |
|[ethanyanjiali/minChatGPT](https://github.com/ethanyanjiali/minChatGPT) | 141 |
|[ccurme/yolopandas](https://github.com/ccurme/yolopandas) | 139 |
|[hwchase17/langchain-streamlit-template](https://github.com/hwchase17/langchain-streamlit-template) | 138 |
|[Jaseci-Labs/jaseci](https://github.com/Jaseci-Labs/jaseci) | 136 |
|[hirokidaichi/wanna](https://github.com/hirokidaichi/wanna) | 135 |
|[Haste171/langchain-chatbot](https://github.com/Haste171/langchain-chatbot) | 134 |
|[jmpaz/promptlib](https://github.com/jmpaz/promptlib) | 130 |
|[Klingefjord/chatgpt-telegram](https://github.com/Klingefjord/chatgpt-telegram) | 130 |
|[filip-michalsky/SalesGPT](https://github.com/filip-michalsky/SalesGPT) | 128 |
|[handrew/browserpilot](https://github.com/handrew/browserpilot) | 128 |
|[shauryr/S2QA](https://github.com/shauryr/S2QA) | 127 |
|[steamship-core/vercel-examples](https://github.com/steamship-core/vercel-examples) | 127 |
|[yasyf/summ](https://github.com/yasyf/summ) | 127 |
|[gia-guar/JARVIS-ChatGPT](https://github.com/gia-guar/JARVIS-ChatGPT) | 126 |
|[jerlendds/osintbuddy](https://github.com/jerlendds/osintbuddy) | 125 |
|[ibiscp/LLM-IMDB](https://github.com/ibiscp/LLM-IMDB) | 124 |
|[Teahouse-Studios/akari-bot](https://github.com/Teahouse-Studios/akari-bot) | 124 |
|[hwchase17/chroma-langchain](https://github.com/hwchase17/chroma-langchain) | 124 |
|[menloparklab/langchain-cohere-qdrant-doc-retrieval](https://github.com/menloparklab/langchain-cohere-qdrant-doc-retrieval) | 123 |
|[peterw/StoryStorm](https://github.com/peterw/StoryStorm) | 123 |
|[chakkaradeep/pyCodeAGI](https://github.com/chakkaradeep/pyCodeAGI) | 123 |
|[petehunt/langchain-github-bot](https://github.com/petehunt/langchain-github-bot) | 115 |
|[su77ungr/CASALIOY](https://github.com/su77ungr/CASALIOY) | 113 |
|[eunomia-bpf/GPTtrace](https://github.com/eunomia-bpf/GPTtrace) | 113 |
|[zenml-io/zenml-projects](https://github.com/zenml-io/zenml-projects) | 112 |
|[pablomarin/GPT-Azure-Search-Engine](https://github.com/pablomarin/GPT-Azure-Search-Engine) | 111 |
|[shamspias/customizable-gpt-chatbot](https://github.com/shamspias/customizable-gpt-chatbot) | 109 |
|[WongSaang/chatgpt-ui-server](https://github.com/WongSaang/chatgpt-ui-server) | 108 |
|[davila7/file-gpt](https://github.com/davila7/file-gpt) | 104 |
|[enhancedocs/enhancedocs](https://github.com/enhancedocs/enhancedocs) | 102 |
|[aurelio-labs/arxiv-bot](https://github.com/aurelio-labs/arxiv-bot) | 101 |
|[openai/openai-cookbook](https://github.com/openai/openai-cookbook) | 41047 |
|[LAION-AI/Open-Assistant](https://github.com/LAION-AI/Open-Assistant) | 33983 |
|[microsoft/TaskMatrix](https://github.com/microsoft/TaskMatrix) | 33375 |
|[imartinez/privateGPT](https://github.com/imartinez/privateGPT) | 31114 |
|[hpcaitech/ColossalAI](https://github.com/hpcaitech/ColossalAI) | 30369 |
|[reworkd/AgentGPT](https://github.com/reworkd/AgentGPT) | 24116 |
|[OpenBB-finance/OpenBBTerminal](https://github.com/OpenBB-finance/OpenBBTerminal) | 22565 |
|[openai/chatgpt-retrieval-plugin](https://github.com/openai/chatgpt-retrieval-plugin) | 18375 |
|[jerryjliu/llama_index](https://github.com/jerryjliu/llama_index) | 17723 |
|[mindsdb/mindsdb](https://github.com/mindsdb/mindsdb) | 16958 |
|[mlflow/mlflow](https://github.com/mlflow/mlflow) | 14632 |
|[GaiZhenbiao/ChuanhuChatGPT](https://github.com/GaiZhenbiao/ChuanhuChatGPT) | 11273 |
|[openai/evals](https://github.com/openai/evals) | 10745 |
|[databrickslabs/dolly](https://github.com/databrickslabs/dolly) | 10298 |
|[imClumsyPanda/langchain-ChatGLM](https://github.com/imClumsyPanda/langchain-ChatGLM) | 9838 |
|[logspace-ai/langflow](https://github.com/logspace-ai/langflow) | 9247 |
|[AIGC-Audio/AudioGPT](https://github.com/AIGC-Audio/AudioGPT) | 8768 |
|[PromtEngineer/localGPT](https://github.com/PromtEngineer/localGPT) | 8651 |
|[StanGirard/quivr](https://github.com/StanGirard/quivr) | 8119 |
|[go-skynet/LocalAI](https://github.com/go-skynet/LocalAI) | 7418 |
|[gventuri/pandas-ai](https://github.com/gventuri/pandas-ai) | 7301 |
|[PipedreamHQ/pipedream](https://github.com/PipedreamHQ/pipedream) | 6636 |
|[arc53/DocsGPT](https://github.com/arc53/DocsGPT) | 5849 |
|[e2b-dev/e2b](https://github.com/e2b-dev/e2b) | 5129 |
|[langgenius/dify](https://github.com/langgenius/dify) | 4804 |
|[serge-chat/serge](https://github.com/serge-chat/serge) | 4448 |
|[csunny/DB-GPT](https://github.com/csunny/DB-GPT) | 4350 |
|[wenda-LLM/wenda](https://github.com/wenda-LLM/wenda) | 4268 |
|[zauberzeug/nicegui](https://github.com/zauberzeug/nicegui) | 4244 |
|[intitni/CopilotForXcode](https://github.com/intitni/CopilotForXcode) | 4232 |
|[GreyDGL/PentestGPT](https://github.com/GreyDGL/PentestGPT) | 4154 |
|[madawei2699/myGPTReader](https://github.com/madawei2699/myGPTReader) | 4080 |
|[zilliztech/GPTCache](https://github.com/zilliztech/GPTCache) | 3949 |
|[gkamradt/langchain-tutorials](https://github.com/gkamradt/langchain-tutorials) | 3920 |
|[bentoml/OpenLLM](https://github.com/bentoml/OpenLLM) | 3481 |
|[MineDojo/Voyager](https://github.com/MineDojo/Voyager) | 3453 |
|[mmabrouk/chatgpt-wrapper](https://github.com/mmabrouk/chatgpt-wrapper) | 3355 |
|[postgresml/postgresml](https://github.com/postgresml/postgresml) | 3328 |
|[marqo-ai/marqo](https://github.com/marqo-ai/marqo) | 3100 |
|[kyegomez/tree-of-thoughts](https://github.com/kyegomez/tree-of-thoughts) | 3049 |
|[PrefectHQ/marvin](https://github.com/PrefectHQ/marvin) | 2844 |
|[project-baize/baize-chatbot](https://github.com/project-baize/baize-chatbot) | 2833 |
|[h2oai/h2ogpt](https://github.com/h2oai/h2ogpt) | 2809 |
|[hwchase17/chat-langchain](https://github.com/hwchase17/chat-langchain) | 2809 |
|[whitead/paper-qa](https://github.com/whitead/paper-qa) | 2664 |
|[Azure-Samples/azure-search-openai-demo](https://github.com/Azure-Samples/azure-search-openai-demo) | 2650 |
|[OpenGVLab/InternGPT](https://github.com/OpenGVLab/InternGPT) | 2525 |
|[GerevAI/gerev](https://github.com/GerevAI/gerev) | 2372 |
|[ParisNeo/lollms-webui](https://github.com/ParisNeo/lollms-webui) | 2287 |
|[OpenBMB/BMTools](https://github.com/OpenBMB/BMTools) | 2265 |
|[SamurAIGPT/privateGPT](https://github.com/SamurAIGPT/privateGPT) | 2084 |
|[Chainlit/chainlit](https://github.com/Chainlit/chainlit) | 1912 |
|[Farama-Foundation/PettingZoo](https://github.com/Farama-Foundation/PettingZoo) | 1869 |
|[OpenGVLab/Ask-Anything](https://github.com/OpenGVLab/Ask-Anything) | 1864 |
|[IntelligenzaArtificiale/Free-Auto-GPT](https://github.com/IntelligenzaArtificiale/Free-Auto-GPT) | 1849 |
|[Unstructured-IO/unstructured](https://github.com/Unstructured-IO/unstructured) | 1766 |
|[yanqiangmiffy/Chinese-LangChain](https://github.com/yanqiangmiffy/Chinese-LangChain) | 1745 |
|[NVIDIA/NeMo-Guardrails](https://github.com/NVIDIA/NeMo-Guardrails) | 1732 |
|[hwchase17/notion-qa](https://github.com/hwchase17/notion-qa) | 1716 |
|[paulpierre/RasaGPT](https://github.com/paulpierre/RasaGPT) | 1619 |
|[pinterest/querybook](https://github.com/pinterest/querybook) | 1468 |
|[vocodedev/vocode-python](https://github.com/vocodedev/vocode-python) | 1446 |
|[thomas-yanxin/LangChain-ChatGLM-Webui](https://github.com/thomas-yanxin/LangChain-ChatGLM-Webui) | 1430 |
|[Mintplex-Labs/anything-llm](https://github.com/Mintplex-Labs/anything-llm) | 1419 |
|[Kav-K/GPTDiscord](https://github.com/Kav-K/GPTDiscord) | 1416 |
|[lunasec-io/lunasec](https://github.com/lunasec-io/lunasec) | 1327 |
|[psychic-api/psychic](https://github.com/psychic-api/psychic) | 1307 |
|[jina-ai/thinkgpt](https://github.com/jina-ai/thinkgpt) | 1242 |
|[agiresearch/OpenAGI](https://github.com/agiresearch/OpenAGI) | 1239 |
|[ttengwang/Caption-Anything](https://github.com/ttengwang/Caption-Anything) | 1203 |
|[jina-ai/dev-gpt](https://github.com/jina-ai/dev-gpt) | 1179 |
|[keephq/keep](https://github.com/keephq/keep) | 1169 |
|[greshake/llm-security](https://github.com/greshake/llm-security) | 1156 |
|[richardyc/Chrome-GPT](https://github.com/richardyc/Chrome-GPT) | 1090 |
|[jina-ai/langchain-serve](https://github.com/jina-ai/langchain-serve) | 1088 |
|[mmz-001/knowledge_gpt](https://github.com/mmz-001/knowledge_gpt) | 1074 |
|[juncongmoo/chatllama](https://github.com/juncongmoo/chatllama) | 1057 |
|[noahshinn024/reflexion](https://github.com/noahshinn024/reflexion) | 1045 |
|[visual-openllm/visual-openllm](https://github.com/visual-openllm/visual-openllm) | 1036 |
|[101dotxyz/GPTeam](https://github.com/101dotxyz/GPTeam) | 999 |
|[poe-platform/api-bot-tutorial](https://github.com/poe-platform/api-bot-tutorial) | 989 |
|[irgolic/AutoPR](https://github.com/irgolic/AutoPR) | 974 |
|[homanp/superagent](https://github.com/homanp/superagent) | 970 |
|[microsoft/X-Decoder](https://github.com/microsoft/X-Decoder) | 941 |
|[peterw/Chat-with-Github-Repo](https://github.com/peterw/Chat-with-Github-Repo) | 896 |
|[SamurAIGPT/Camel-AutoGPT](https://github.com/SamurAIGPT/Camel-AutoGPT) | 856 |
|[cirediatpl/FigmaChain](https://github.com/cirediatpl/FigmaChain) | 840 |
|[chatarena/chatarena](https://github.com/chatarena/chatarena) | 829 |
|[rlancemartin/auto-evaluator](https://github.com/rlancemartin/auto-evaluator) | 816 |
|[seanpixel/Teenage-AGI](https://github.com/seanpixel/Teenage-AGI) | 816 |
|[hashintel/hash](https://github.com/hashintel/hash) | 806 |
|[corca-ai/EVAL](https://github.com/corca-ai/EVAL) | 790 |
|[eyurtsev/kor](https://github.com/eyurtsev/kor) | 752 |
|[cheshire-cat-ai/core](https://github.com/cheshire-cat-ai/core) | 713 |
|[e-johnstonn/BriefGPT](https://github.com/e-johnstonn/BriefGPT) | 686 |
|[run-llama/llama-lab](https://github.com/run-llama/llama-lab) | 685 |
|[refuel-ai/autolabel](https://github.com/refuel-ai/autolabel) | 673 |
|[griptape-ai/griptape](https://github.com/griptape-ai/griptape) | 617 |
|[billxbf/ReWOO](https://github.com/billxbf/ReWOO) | 616 |
|[Anil-matcha/ChatPDF](https://github.com/Anil-matcha/ChatPDF) | 609 |
|[NimbleBoxAI/ChainFury](https://github.com/NimbleBoxAI/ChainFury) | 592 |
|[getmetal/motorhead](https://github.com/getmetal/motorhead) | 581 |
|[ajndkr/lanarky](https://github.com/ajndkr/lanarky) | 574 |
|[namuan/dr-doc-search](https://github.com/namuan/dr-doc-search) | 572 |
|[kreneskyp/ix](https://github.com/kreneskyp/ix) | 564 |
|[akshata29/chatpdf](https://github.com/akshata29/chatpdf) | 540 |
|[hwchase17/chat-your-data](https://github.com/hwchase17/chat-your-data) | 540 |
|[whyiyhw/chatgpt-wechat](https://github.com/whyiyhw/chatgpt-wechat) | 537 |
|[khoj-ai/khoj](https://github.com/khoj-ai/khoj) | 531 |
|[SamurAIGPT/ChatGPT-Developer-Plugins](https://github.com/SamurAIGPT/ChatGPT-Developer-Plugins) | 528 |
|[microsoft/PodcastCopilot](https://github.com/microsoft/PodcastCopilot) | 526 |
|[ruoccofabrizio/azure-open-ai-embeddings-qna](https://github.com/ruoccofabrizio/azure-open-ai-embeddings-qna) | 515 |
|[alexanderatallah/window.ai](https://github.com/alexanderatallah/window.ai) | 494 |
|[StevenGrove/GPT4Tools](https://github.com/StevenGrove/GPT4Tools) | 483 |
|[jina-ai/agentchain](https://github.com/jina-ai/agentchain) | 472 |
|[mckaywrigley/repo-chat](https://github.com/mckaywrigley/repo-chat) | 465 |
|[yeagerai/yeagerai-agent](https://github.com/yeagerai/yeagerai-agent) | 464 |
|[langchain-ai/langchain-aiplugin](https://github.com/langchain-ai/langchain-aiplugin) | 464 |
|[mpaepper/content-chatbot](https://github.com/mpaepper/content-chatbot) | 455 |
|[michaelthwan/searchGPT](https://github.com/michaelthwan/searchGPT) | 455 |
|[freddyaboulton/gradio-tools](https://github.com/freddyaboulton/gradio-tools) | 450 |
|[amosjyng/langchain-visualizer](https://github.com/amosjyng/langchain-visualizer) | 446 |
|[msoedov/langcorn](https://github.com/msoedov/langcorn) | 445 |
|[plastic-labs/tutor-gpt](https://github.com/plastic-labs/tutor-gpt) | 426 |
|[poe-platform/poe-protocol](https://github.com/poe-platform/poe-protocol) | 426 |
|[jonra1993/fastapi-alembic-sqlmodel-async](https://github.com/jonra1993/fastapi-alembic-sqlmodel-async) | 418 |
|[langchain-ai/auto-evaluator](https://github.com/langchain-ai/auto-evaluator) | 416 |
|[steamship-core/steamship-langchain](https://github.com/steamship-core/steamship-langchain) | 401 |
|[xuwenhao/geektime-ai-course](https://github.com/xuwenhao/geektime-ai-course) | 400 |
|[continuum-llms/chatgpt-memory](https://github.com/continuum-llms/chatgpt-memory) | 386 |
|[mtenenholtz/chat-twitter](https://github.com/mtenenholtz/chat-twitter) | 382 |
|[explosion/spacy-llm](https://github.com/explosion/spacy-llm) | 368 |
|[showlab/VLog](https://github.com/showlab/VLog) | 363 |
|[yvann-hub/Robby-chatbot](https://github.com/yvann-hub/Robby-chatbot) | 363 |
|[daodao97/chatdoc](https://github.com/daodao97/chatdoc) | 361 |
|[opentensor/bittensor](https://github.com/opentensor/bittensor) | 360 |
|[alejandro-ao/langchain-ask-pdf](https://github.com/alejandro-ao/langchain-ask-pdf) | 355 |
|[logan-markewich/llama_index_starter_pack](https://github.com/logan-markewich/llama_index_starter_pack) | 351 |
|[jupyterlab/jupyter-ai](https://github.com/jupyterlab/jupyter-ai) | 348 |
|[alejandro-ao/ask-multiple-pdfs](https://github.com/alejandro-ao/ask-multiple-pdfs) | 321 |
|[andylokandy/gpt-4-search](https://github.com/andylokandy/gpt-4-search) | 314 |
|[mosaicml/examples](https://github.com/mosaicml/examples) | 313 |
|[personoids/personoids-lite](https://github.com/personoids/personoids-lite) | 306 |
|[itamargol/openai](https://github.com/itamargol/openai) | 304 |
|[Anil-matcha/Website-to-Chatbot](https://github.com/Anil-matcha/Website-to-Chatbot) | 299 |
|[momegas/megabots](https://github.com/momegas/megabots) | 299 |
|[BlackHC/llm-strategy](https://github.com/BlackHC/llm-strategy) | 289 |
|[daveebbelaar/langchain-experiments](https://github.com/daveebbelaar/langchain-experiments) | 283 |
|[wandb/weave](https://github.com/wandb/weave) | 279 |
|[Cheems-Seminar/grounded-segment-any-parts](https://github.com/Cheems-Seminar/grounded-segment-any-parts) | 273 |
|[jerlendds/osintbuddy](https://github.com/jerlendds/osintbuddy) | 271 |
|[OpenBMB/AgentVerse](https://github.com/OpenBMB/AgentVerse) | 270 |
|[MagnivOrg/prompt-layer-library](https://github.com/MagnivOrg/prompt-layer-library) | 269 |
|[sullivan-sean/chat-langchainjs](https://github.com/sullivan-sean/chat-langchainjs) | 259 |
|[Azure-Samples/openai](https://github.com/Azure-Samples/openai) | 252 |
|[bborn/howdoi.ai](https://github.com/bborn/howdoi.ai) | 248 |
|[hnawaz007/pythondataanalysis](https://github.com/hnawaz007/pythondataanalysis) | 247 |
|[conceptofmind/toolformer](https://github.com/conceptofmind/toolformer) | 243 |
|[truera/trulens](https://github.com/truera/trulens) | 239 |
|[ur-whitelab/exmol](https://github.com/ur-whitelab/exmol) | 238 |
|[intel/intel-extension-for-transformers](https://github.com/intel/intel-extension-for-transformers) | 237 |
|[monarch-initiative/ontogpt](https://github.com/monarch-initiative/ontogpt) | 236 |
|[wandb/edu](https://github.com/wandb/edu) | 231 |
|[recalign/RecAlign](https://github.com/recalign/RecAlign) | 229 |
|[alvarosevilla95/autolang](https://github.com/alvarosevilla95/autolang) | 223 |
|[kaleido-lab/dolphin](https://github.com/kaleido-lab/dolphin) | 221 |
|[JohnSnowLabs/nlptest](https://github.com/JohnSnowLabs/nlptest) | 220 |
|[paolorechia/learn-langchain](https://github.com/paolorechia/learn-langchain) | 219 |
|[Safiullah-Rahu/CSV-AI](https://github.com/Safiullah-Rahu/CSV-AI) | 215 |
|[Haste171/langchain-chatbot](https://github.com/Haste171/langchain-chatbot) | 215 |
|[steamship-packages/langchain-agent-production-starter](https://github.com/steamship-packages/langchain-agent-production-starter) | 214 |
|[airobotlab/KoChatGPT](https://github.com/airobotlab/KoChatGPT) | 213 |
|[filip-michalsky/SalesGPT](https://github.com/filip-michalsky/SalesGPT) | 211 |
|[marella/chatdocs](https://github.com/marella/chatdocs) | 207 |
|[su77ungr/CASALIOY](https://github.com/su77ungr/CASALIOY) | 200 |
|[shaman-ai/agent-actors](https://github.com/shaman-ai/agent-actors) | 195 |
|[plchld/InsightFlow](https://github.com/plchld/InsightFlow) | 189 |
|[jbrukh/gpt-jargon](https://github.com/jbrukh/gpt-jargon) | 186 |
|[hwchase17/langchain-streamlit-template](https://github.com/hwchase17/langchain-streamlit-template) | 185 |
|[huchenxucs/ChatDB](https://github.com/huchenxucs/ChatDB) | 179 |
|[benthecoder/ClassGPT](https://github.com/benthecoder/ClassGPT) | 178 |
|[hwchase17/chroma-langchain](https://github.com/hwchase17/chroma-langchain) | 178 |
|[radi-cho/datasetGPT](https://github.com/radi-cho/datasetGPT) | 177 |
|[jiran214/GPT-vup](https://github.com/jiran214/GPT-vup) | 176 |
|[rsaryev/talk-codebase](https://github.com/rsaryev/talk-codebase) | 174 |
|[edreisMD/plugnplai](https://github.com/edreisMD/plugnplai) | 174 |
|[gia-guar/JARVIS-ChatGPT](https://github.com/gia-guar/JARVIS-ChatGPT) | 172 |
|[hardbyte/qabot](https://github.com/hardbyte/qabot) | 171 |
|[shamspias/customizable-gpt-chatbot](https://github.com/shamspias/customizable-gpt-chatbot) | 165 |
|[gustavz/DataChad](https://github.com/gustavz/DataChad) | 164 |
|[yasyf/compress-gpt](https://github.com/yasyf/compress-gpt) | 163 |
|[SamPink/dev-gpt](https://github.com/SamPink/dev-gpt) | 161 |
|[yuanjie-ai/ChatLLM](https://github.com/yuanjie-ai/ChatLLM) | 161 |
|[pablomarin/GPT-Azure-Search-Engine](https://github.com/pablomarin/GPT-Azure-Search-Engine) | 160 |
|[jondurbin/airoboros](https://github.com/jondurbin/airoboros) | 157 |
|[fengyuli-dev/multimedia-gpt](https://github.com/fengyuli-dev/multimedia-gpt) | 157 |
|[PradipNichite/Youtube-Tutorials](https://github.com/PradipNichite/Youtube-Tutorials) | 156 |
|[nicknochnack/LangchainDocuments](https://github.com/nicknochnack/LangchainDocuments) | 155 |
|[ethanyanjiali/minChatGPT](https://github.com/ethanyanjiali/minChatGPT) | 155 |
|[ccurme/yolopandas](https://github.com/ccurme/yolopandas) | 154 |
|[chakkaradeep/pyCodeAGI](https://github.com/chakkaradeep/pyCodeAGI) | 153 |
|[preset-io/promptimize](https://github.com/preset-io/promptimize) | 150 |
|[onlyphantom/llm-python](https://github.com/onlyphantom/llm-python) | 148 |
|[Azure-Samples/azure-search-power-skills](https://github.com/Azure-Samples/azure-search-power-skills) | 146 |
|[realminchoi/babyagi-ui](https://github.com/realminchoi/babyagi-ui) | 144 |
|[microsoft/azure-openai-in-a-day-workshop](https://github.com/microsoft/azure-openai-in-a-day-workshop) | 144 |
|[jmpaz/promptlib](https://github.com/jmpaz/promptlib) | 143 |
|[shauryr/S2QA](https://github.com/shauryr/S2QA) | 142 |
|[handrew/browserpilot](https://github.com/handrew/browserpilot) | 141 |
|[Jaseci-Labs/jaseci](https://github.com/Jaseci-Labs/jaseci) | 140 |
|[Klingefjord/chatgpt-telegram](https://github.com/Klingefjord/chatgpt-telegram) | 140 |
|[WongSaang/chatgpt-ui-server](https://github.com/WongSaang/chatgpt-ui-server) | 139 |
|[ibiscp/LLM-IMDB](https://github.com/ibiscp/LLM-IMDB) | 139 |
|[menloparklab/langchain-cohere-qdrant-doc-retrieval](https://github.com/menloparklab/langchain-cohere-qdrant-doc-retrieval) | 138 |
|[hirokidaichi/wanna](https://github.com/hirokidaichi/wanna) | 137 |
|[steamship-core/vercel-examples](https://github.com/steamship-core/vercel-examples) | 137 |
|[deeppavlov/dream](https://github.com/deeppavlov/dream) | 136 |
|[miaoshouai/miaoshouai-assistant](https://github.com/miaoshouai/miaoshouai-assistant) | 135 |
|[sugarforever/LangChain-Tutorials](https://github.com/sugarforever/LangChain-Tutorials) | 135 |
|[yasyf/summ](https://github.com/yasyf/summ) | 135 |
|[peterw/StoryStorm](https://github.com/peterw/StoryStorm) | 134 |
|[vaibkumr/prompt-optimizer](https://github.com/vaibkumr/prompt-optimizer) | 132 |
|[ju-bezdek/langchain-decorators](https://github.com/ju-bezdek/langchain-decorators) | 130 |
|[homanp/vercel-langchain](https://github.com/homanp/vercel-langchain) | 128 |
|[Teahouse-Studios/akari-bot](https://github.com/Teahouse-Studios/akari-bot) | 127 |
|[petehunt/langchain-github-bot](https://github.com/petehunt/langchain-github-bot) | 125 |
|[eunomia-bpf/GPTtrace](https://github.com/eunomia-bpf/GPTtrace) | 122 |
|[fixie-ai/fixie-examples](https://github.com/fixie-ai/fixie-examples) | 122 |
|[Aggregate-Intellect/practical-llms](https://github.com/Aggregate-Intellect/practical-llms) | 120 |
|[davila7/file-gpt](https://github.com/davila7/file-gpt) | 120 |
|[Azure-Samples/azure-search-openai-demo-csharp](https://github.com/Azure-Samples/azure-search-openai-demo-csharp) | 119 |
|[prof-frink-lab/slangchain](https://github.com/prof-frink-lab/slangchain) | 117 |
|[aurelio-labs/arxiv-bot](https://github.com/aurelio-labs/arxiv-bot) | 117 |
|[zenml-io/zenml-projects](https://github.com/zenml-io/zenml-projects) | 116 |
|[flurb18/AgentOoba](https://github.com/flurb18/AgentOoba) | 114 |
|[kaarthik108/snowChat](https://github.com/kaarthik108/snowChat) | 112 |
|[RedisVentures/redis-openai-qna](https://github.com/RedisVentures/redis-openai-qna) | 111 |
|[solana-labs/chatgpt-plugin](https://github.com/solana-labs/chatgpt-plugin) | 111 |
|[kulltc/chatgpt-sql](https://github.com/kulltc/chatgpt-sql) | 109 |
|[summarizepaper/summarizepaper](https://github.com/summarizepaper/summarizepaper) | 109 |
|[Azure-Samples/miyagi](https://github.com/Azure-Samples/miyagi) | 106 |
|[ssheng/BentoChain](https://github.com/ssheng/BentoChain) | 106 |
|[voxel51/voxelgpt](https://github.com/voxel51/voxelgpt) | 105 |
|[mallahyari/drqa](https://github.com/mallahyari/drqa) | 103 |

View File

@@ -1,17 +1,17 @@
# Databerry
# Chaindesk
>[Databerry](https://databerry.ai) is an [open source](https://github.com/gmpetrov/databerry) document retrieval platform that helps to connect your personal data with Large Language Models.
>[Chaindesk](https://chaindesk.ai) is an [open source](https://github.com/gmpetrov/databerry) document retrieval platform that helps to connect your personal data with Large Language Models.
## Installation and Setup
We need to sign up for Databerry, create a datastore, add some data and get your datastore api endpoint url.
We need the [API Key](https://docs.databerry.ai/api-reference/authentication).
We need to sign up for Chaindesk, create a datastore, add some data and get your datastore api endpoint url.
We need the [API Key](https://docs.chaindesk.ai/api-reference/authentication).
## Retriever
See a [usage example](/docs/modules/data_connection/retrievers/integrations/databerry.html).
See a [usage example](/docs/modules/data_connection/retrievers/integrations/chaindesk.html).
```python
from langchain.retrievers import DataberryRetriever
from langchain.retrievers import ChaindeskRetriever
```

View File

@@ -0,0 +1,19 @@
# Datadog Logs
>[Datadog](https://www.datadoghq.com/) is a monitoring and analytics platform for cloud-scale applications.
## Installation and Setup
```bash
pip install datadog_api_client
```
We must initialize the loader with the Datadog API key and APP key, and we need to set up the query to extract the desired logs.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/datadog_logs.html).
```python
from langchain.document_loaders import DatadogLogsLoader
```
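
The usage example linked above passes `query`, `api_key`, `app_key`, `from_time`, `to_time`, and `limit` to the loader, with both time bounds given as Unix timestamps in milliseconds. Below is a minimal sketch of preparing those arguments. It assumes the keys are stored in environment variables named `DD_API_KEY` and `DD_APP_KEY` (names chosen here for illustration), and `last_hours_window_ms` is a hypothetical helper, not part of LangChain or the Datadog client.

```python
import os
from datetime import datetime, timedelta, timezone


def last_hours_window_ms(hours, now=None):
    """Return (from_time, to_time) as Unix timestamps in milliseconds.

    Hypothetical helper: the window ends at `now` (default: current UTC time)
    and starts `hours` hours earlier.
    """
    end = now or datetime.now(timezone.utc)
    to_ms = int(end.timestamp() * 1000)
    # Subtract in integer milliseconds so the window length is exact.
    from_ms = to_ms - int(timedelta(hours=hours).total_seconds() * 1000)
    return from_ms, to_ms


from_time, to_time = last_hours_window_ms(1)

loader_kwargs = {
    "query": "service:agent status:error",
    # Assumed environment variable names; use whatever your deployment defines.
    "api_key": os.environ.get("DD_API_KEY", ""),
    "app_key": os.environ.get("DD_APP_KEY", ""),
    "from_time": from_time,
    "to_time": to_time,
    "limit": 100,
}

# With langchain and datadog_api_client installed, the arguments can be passed on:
# from langchain.document_loaders import DatadogLogsLoader
# documents = DatadogLogsLoader(**loader_kwargs).load()
```

Keeping the keys out of the source file avoids committing credentials. The loader call is left commented out so the sketch runs without network access.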

View File

@@ -1,6 +1,6 @@
# YouTube
>[YouTube](https://www.youtube.com/) is an online video sharing and social media platform created by Google.
>[YouTube](https://www.youtube.com/) is an online video sharing and social media platform by Google.
> We download the `YouTube` transcripts and video information.
## Installation and Setup

View File

@@ -17,16 +17,7 @@
"execution_count": 1,
"id": "8632a37c",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/harrisonchase/.pyenv/versions/3.9.1/envs/langchain/lib/python3.9/site-packages/deeplake/util/check_latest_version.py:32: UserWarning: A newer version of deeplake (3.6.5) is available. It's recommended that you update to the latest version using `pip install -U deeplake`.\n",
" warnings.warn(\n"
]
}
],
"outputs": [],
"source": [
"from pydantic import BaseModel, Field\n",
"\n",

View File

@@ -0,0 +1,220 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Context\n",
"\n",
"![Context - Product Analytics for AI Chatbots](https://go.getcontext.ai/langchain.png)\n",
"\n",
"[Context](https://getcontext.ai/) provides product analytics for AI chatbots.\n",
"\n",
"Context helps you understand how users are interacting with your AI chat products.\n",
"Gain critical insights, optimise poor experiences, and minimise brand risks.\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"In this guide we will show you how to integrate with Context."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"## Installation and Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "shellscript"
}
},
"outputs": [],
"source": [
"!pip install context-python --upgrade"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Getting API Credentials\n",
"\n",
"To get your Context API token:\n",
"\n",
"1. Go to the settings page within your Context account (https://go.getcontext.ai/settings).\n",
"2. Generate a new API Token.\n",
"3. Store this token somewhere secure."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setup Context\n",
"\n",
"To use the `ContextCallbackHandler`, import the handler from LangChain and instantiate it with your Context API token.\n",
"\n",
"Ensure you have installed the `context-python` package before using the handler."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"from langchain.callbacks import ContextCallbackHandler\n",
"\n",
"token = os.environ[\"CONTEXT_API_TOKEN\"]\n",
"\n",
"context_callback = ContextCallbackHandler(token)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Usage\n",
"### Using the Context callback within a Chat Model\n",
"\n",
"The Context callback handler can be used to directly record transcripts between users and AI assistants.\n",
"\n",
"#### Example"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.schema import (\n",
" SystemMessage,\n",
" HumanMessage,\n",
")\n",
"from langchain.callbacks import ContextCallbackHandler\n",
"\n",
"token = os.environ[\"CONTEXT_API_TOKEN\"]\n",
"\n",
"chat = ChatOpenAI(\n",
" headers={\"user_id\": \"123\"}, temperature=0, callbacks=[ContextCallbackHandler(token)]\n",
")\n",
"\n",
"messages = [\n",
" SystemMessage(\n",
" content=\"You are a helpful assistant that translates English to French.\"\n",
" ),\n",
" HumanMessage(content=\"I love programming.\"),\n",
"]\n",
"\n",
"print(chat(messages))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using the Context callback within Chains\n",
"\n",
"The Context callback handler can also be used to record the inputs and outputs of chains. Note that intermediate steps of the chain are not recorded - only the starting inputs and final outputs.\n",
"\n",
"__Note:__ Ensure that you pass the same `ContextCallbackHandler` instance to both the chat model and the chain.\n",
"\n",
"Wrong:\n",
"> ```python\n",
"> chat = ChatOpenAI(temperature=0.9, callbacks=[ContextCallbackHandler(token)])\n",
"> chain = LLMChain(llm=chat, prompt=chat_prompt_template, callbacks=[ContextCallbackHandler(token)])\n",
"> ```\n",
"\n",
"Correct:\n",
">```python\n",
">callback = ContextCallbackHandler(token)\n",
">chat = ChatOpenAI(temperature=0.9, callbacks=[callback])\n",
">chain = LLMChain(llm=chat, prompt=chat_prompt_template, callbacks=[callback])\n",
">```\n",
"\n",
"#### Example"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain import LLMChain\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain.prompts.chat import (\n",
" ChatPromptTemplate,\n",
" HumanMessagePromptTemplate,\n",
")\n",
"from langchain.callbacks import ContextCallbackHandler\n",
"\n",
"token = os.environ[\"CONTEXT_API_TOKEN\"]\n",
"\n",
"human_message_prompt = HumanMessagePromptTemplate(\n",
" prompt=PromptTemplate(\n",
" template=\"What is a good name for a company that makes {product}?\",\n",
" input_variables=[\"product\"],\n",
" )\n",
")\n",
"chat_prompt_template = ChatPromptTemplate.from_messages([human_message_prompt])\n",
"callback = ContextCallbackHandler(token)\n",
"chat = ChatOpenAI(temperature=0.9, callbacks=[callback])\n",
"chain = LLMChain(llm=chat, prompt=chat_prompt_template, callbacks=[callback])\n",
"print(chain.run(\"colorful socks\"))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
},
"vscode": {
"interpreter": {
"hash": "a53ebf4a859167383b364e7e7521d0add3c2dbbdecce4edf676e8c4634ff3fbb"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -9,7 +9,7 @@
In this guide we will demonstrate how to use `StreamlitCallbackHandler` to display the thoughts and actions of an agent in an
interactive Streamlit app. Try it out with the running app below using the [MRKL agent](/docs/modules/agents/how_to/mrkl/):
<iframe loading="lazy" src="https://mrkl-minimal.streamlit.app/?embed=true&embed_options=light_theme"
<iframe loading="lazy" src="https://langchain-mrkl.streamlit.app/?embed=true&embed_options=light_theme"
style={{ width: 100 + '%', border: 'none', marginBottom: 1 + 'rem', height: 600 }}
allow="camera;clipboard-read;clipboard-write;"
></iframe>
@@ -35,7 +35,7 @@ st_callback = StreamlitCallbackHandler(st.container())
```
Additional keyword arguments to customize the display behavior are described in the
[API reference](https://api.python.langchain.com/en/latest/modules/callbacks.html#langchain.callbacks.StreamlitCallbackHandler).
[API reference](https://api.python.langchain.com/en/latest/callbacks/langchain.callbacks.streamlit.streamlit_callback_handler.StreamlitCallbackHandler.html).
### Scenario 1: Using an Agent with Tools

View File

@@ -28,7 +28,7 @@
"\n",
"from pydantic import Extra\n",
"\n",
"from langchain.base_language import BaseLanguageModel\n",
"from langchain.schema import BaseLanguageModel\n",
"from langchain.callbacks.manager import (\n",
" AsyncCallbackManagerForChainRun,\n",
" CallbackManagerForChainRun,\n",

View File

@@ -0,0 +1,96 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Datadog Logs\n",
"\n",
">[Datadog](https://www.datadoghq.com/) is a monitoring and analytics platform for cloud-scale applications.\n",
"\n",
"This loader fetches the logs from your applications in Datadog using the `datadog_api_client` Python package. You must initialize the loader with your `Datadog API key` and `APP key`, and you need to pass in the query to extract the desired logs."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import DatadogLogsLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#!pip install datadog-api-client"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query = \"service:agent status:error\"\n",
"\n",
"loader = DatadogLogsLoader(\n",
" query=query,\n",
" api_key=DD_API_KEY,\n",
" app_key=DD_APP_KEY,\n",
" from_time=1688732708951, # Optional, timestamp in milliseconds\n",
" to_time=1688736308951, # Optional, timestamp in milliseconds\n",
" limit=100, # Optional, default is 100\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='message: grep: /etc/datadog-agent/system-probe.yaml: No such file or directory', metadata={'id': 'AgAAAYkwpLImvkjRpQAAAAAAAAAYAAAAAEFZa3dwTUFsQUFEWmZfLU5QdElnM3dBWQAAACQAAAAAMDE4OTMwYTQtYzk3OS00MmJjLTlhNDAtOTY4N2EwY2I5ZDdk', 'status': 'error', 'service': 'agent', 'tags': ['accessible-from-goog-gke-node', 'allow-external-ingress-high-ports', 'allow-external-ingress-http', 'allow-external-ingress-https', 'container_id:c7d8ecd27b5b3cfdf3b0df04b8965af6f233f56b7c3c2ffabfab5e3b6ccbd6a5', 'container_name:lab_datadog_1', 'datadog.pipelines:false', 'datadog.submission_auth:private_api_key', 'docker_image:datadog/agent:7.41.1', 'env:dd101-dev', 'hostname:lab-host', 'image_name:datadog/agent', 'image_tag:7.41.1', 'instance-id:7497601202021312403', 'instance-type:custom-1-4096', 'instruqt_aws_accounts:', 'instruqt_azure_subscriptions:', 'instruqt_gcp_projects:', 'internal-hostname:lab-host.d4rjybavkary.svc.cluster.local', 'numeric_project_id:3390740675', 'p-d4rjybavkary', 'project:instruqt-prod', 'service:agent', 'short_image:agent', 'source:agent', 'zone:europe-west1-b'], 'timestamp': datetime.datetime(2023, 7, 7, 13, 57, 27, 206000, tzinfo=tzutc())}),\n",
" Document(page_content='message: grep: /etc/datadog-agent/system-probe.yaml: No such file or directory', metadata={'id': 'AgAAAYkwpLImvkjRpgAAAAAAAAAYAAAAAEFZa3dwTUFsQUFEWmZfLU5QdElnM3dBWgAAACQAAAAAMDE4OTMwYTQtYzk3OS00MmJjLTlhNDAtOTY4N2EwY2I5ZDdk', 'status': 'error', 'service': 'agent', 'tags': ['accessible-from-goog-gke-node', 'allow-external-ingress-high-ports', 'allow-external-ingress-http', 'allow-external-ingress-https', 'container_id:c7d8ecd27b5b3cfdf3b0df04b8965af6f233f56b7c3c2ffabfab5e3b6ccbd6a5', 'container_name:lab_datadog_1', 'datadog.pipelines:false', 'datadog.submission_auth:private_api_key', 'docker_image:datadog/agent:7.41.1', 'env:dd101-dev', 'hostname:lab-host', 'image_name:datadog/agent', 'image_tag:7.41.1', 'instance-id:7497601202021312403', 'instance-type:custom-1-4096', 'instruqt_aws_accounts:', 'instruqt_azure_subscriptions:', 'instruqt_gcp_projects:', 'internal-hostname:lab-host.d4rjybavkary.svc.cluster.local', 'numeric_project_id:3390740675', 'p-d4rjybavkary', 'project:instruqt-prod', 'service:agent', 'short_image:agent', 'source:agent', 'zone:europe-west1-b'], 'timestamp': datetime.datetime(2023, 7, 7, 13, 57, 27, 206000, tzinfo=tzutc())})]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"documents = loader.load()\n",
"documents"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.11"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -0,0 +1,5 @@
Stanley Cups
Team Location Stanley Cups
Blues STL 1
Flyers PHI 2
Maple Leafs TOR 13
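
A TSV file like the one listed above can be read with Python's standard `csv` module by setting the delimiter to a tab. The sketch below embeds the sample rows as a string instead of reading from disk. It assumes the columns are tab-separated and that the first line is a title rather than a record, as the listing suggests.

```python
import csv
import io

# Sample rows, assumed to be tab-separated as in a .tsv file.
TSV_TEXT = (
    "Stanley Cups\n"
    "Team\tLocation\tStanley Cups\n"
    "Blues\tSTL\t1\n"
    "Flyers\tPHI\t2\n"
    "Maple Leafs\tTOR\t13\n"
)

lines = io.StringIO(TSV_TEXT)
title = next(lines).strip()  # the first line is a title, not a record

# The next line supplies the column names; the remaining lines are records.
rows = list(csv.DictReader(lines, delimiter="\t"))

for row in rows:
    print(row["Team"], row["Location"], row["Stanley Cups"])
```

Values come back as strings, so numeric columns such as the cup count need an explicit `int(...)` conversion before arithmetic.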

View File

@@ -0,0 +1,181 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# TSV\n",
"\n",
">A [tab-separated values (TSV)](https://en.wikipedia.org/wiki/Tab-separated_values) file is a simple, text-based file format for storing tabular data. Records are separated by newlines, and values within a record are separated by tab characters."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## `UnstructuredTSVLoader`\n",
"\n",
"You can also load the table using the `UnstructuredTSVLoader`. One advantage of using `UnstructuredTSVLoader` is that if you use it in `\"elements\"` mode, an HTML representation of the table will be available in the metadata."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders.tsv import UnstructuredTSVLoader"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"loader = UnstructuredTSVLoader(\n",
" file_path=\"example_data/mlb_teams_2012.csv\", mode=\"elements\"\n",
")\n",
"docs = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<table border=\"1\" class=\"dataframe\">\n",
" <tbody>\n",
" <tr>\n",
" <td>Nationals, 81.34, 98</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Reds, 82.20, 97</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Yankees, 197.96, 95</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Giants, 117.62, 94</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Braves, 83.31, 94</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Athletics, 55.37, 94</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Rangers, 120.51, 93</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Orioles, 81.43, 93</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Rays, 64.17, 90</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Angels, 154.49, 89</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Tigers, 132.30, 88</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Cardinals, 110.30, 88</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Dodgers, 95.14, 86</td>\n",
" </tr>\n",
" <tr>\n",
" <td>White Sox, 96.92, 85</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Brewers, 97.65, 83</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Phillies, 174.54, 81</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Diamondbacks, 74.28, 81</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Pirates, 63.43, 79</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Padres, 55.24, 76</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Mariners, 81.97, 75</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Mets, 93.35, 74</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Blue Jays, 75.48, 73</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Royals, 60.91, 72</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Marlins, 118.07, 69</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Red Sox, 173.18, 69</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Indians, 78.43, 68</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Twins, 94.08, 66</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Rockies, 78.06, 64</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Cubs, 88.19, 61</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Astros, 60.65, 55</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n"
]
}
],
"source": [
"print(docs[0].metadata[\"text_as_html\"])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -0,0 +1,304 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Xorbits Pandas DataFrame\n",
"\n",
"This notebook goes over how to load data from a [xorbits.pandas](https://doc.xorbits.io/en/latest/reference/pandas/frame.html) DataFrame."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"#!pip install xorbits"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import xorbits.pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"df = pd.read_csv(\"example_data/mlb_teams_2012.csv\")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "b0d1d84e23c04f1296f63b3ea3dd1e5b",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0.00/100 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Team</th>\n",
" <th>\"Payroll (millions)\"</th>\n",
" <th>\"Wins\"</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Nationals</td>\n",
" <td>81.34</td>\n",
" <td>98</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Reds</td>\n",
" <td>82.20</td>\n",
" <td>97</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Yankees</td>\n",
" <td>197.96</td>\n",
" <td>95</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Giants</td>\n",
" <td>117.62</td>\n",
" <td>94</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Braves</td>\n",
" <td>83.31</td>\n",
" <td>94</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Team \"Payroll (millions)\" \"Wins\"\n",
"0 Nationals 81.34 98\n",
"1 Reds 82.20 97\n",
"2 Yankees 197.96 95\n",
"3 Giants 117.62 94\n",
"4 Braves 83.31 94"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import XorbitsLoader"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"loader = XorbitsLoader(df, page_content_column=\"Team\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "c8c8b67f1aae4a3c9de7734bb6cf738e",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0.00/100 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"[Document(page_content='Nationals', metadata={' \"Payroll (millions)\"': 81.34, ' \"Wins\"': 98}),\n",
" Document(page_content='Reds', metadata={' \"Payroll (millions)\"': 82.2, ' \"Wins\"': 97}),\n",
" Document(page_content='Yankees', metadata={' \"Payroll (millions)\"': 197.96, ' \"Wins\"': 95}),\n",
" Document(page_content='Giants', metadata={' \"Payroll (millions)\"': 117.62, ' \"Wins\"': 94}),\n",
" Document(page_content='Braves', metadata={' \"Payroll (millions)\"': 83.31, ' \"Wins\"': 94}),\n",
" Document(page_content='Athletics', metadata={' \"Payroll (millions)\"': 55.37, ' \"Wins\"': 94}),\n",
" Document(page_content='Rangers', metadata={' \"Payroll (millions)\"': 120.51, ' \"Wins\"': 93}),\n",
" Document(page_content='Orioles', metadata={' \"Payroll (millions)\"': 81.43, ' \"Wins\"': 93}),\n",
" Document(page_content='Rays', metadata={' \"Payroll (millions)\"': 64.17, ' \"Wins\"': 90}),\n",
" Document(page_content='Angels', metadata={' \"Payroll (millions)\"': 154.49, ' \"Wins\"': 89}),\n",
" Document(page_content='Tigers', metadata={' \"Payroll (millions)\"': 132.3, ' \"Wins\"': 88}),\n",
" Document(page_content='Cardinals', metadata={' \"Payroll (millions)\"': 110.3, ' \"Wins\"': 88}),\n",
" Document(page_content='Dodgers', metadata={' \"Payroll (millions)\"': 95.14, ' \"Wins\"': 86}),\n",
" Document(page_content='White Sox', metadata={' \"Payroll (millions)\"': 96.92, ' \"Wins\"': 85}),\n",
" Document(page_content='Brewers', metadata={' \"Payroll (millions)\"': 97.65, ' \"Wins\"': 83}),\n",
" Document(page_content='Phillies', metadata={' \"Payroll (millions)\"': 174.54, ' \"Wins\"': 81}),\n",
" Document(page_content='Diamondbacks', metadata={' \"Payroll (millions)\"': 74.28, ' \"Wins\"': 81}),\n",
" Document(page_content='Pirates', metadata={' \"Payroll (millions)\"': 63.43, ' \"Wins\"': 79}),\n",
" Document(page_content='Padres', metadata={' \"Payroll (millions)\"': 55.24, ' \"Wins\"': 76}),\n",
" Document(page_content='Mariners', metadata={' \"Payroll (millions)\"': 81.97, ' \"Wins\"': 75}),\n",
" Document(page_content='Mets', metadata={' \"Payroll (millions)\"': 93.35, ' \"Wins\"': 74}),\n",
" Document(page_content='Blue Jays', metadata={' \"Payroll (millions)\"': 75.48, ' \"Wins\"': 73}),\n",
" Document(page_content='Royals', metadata={' \"Payroll (millions)\"': 60.91, ' \"Wins\"': 72}),\n",
" Document(page_content='Marlins', metadata={' \"Payroll (millions)\"': 118.07, ' \"Wins\"': 69}),\n",
" Document(page_content='Red Sox', metadata={' \"Payroll (millions)\"': 173.18, ' \"Wins\"': 69}),\n",
" Document(page_content='Indians', metadata={' \"Payroll (millions)\"': 78.43, ' \"Wins\"': 68}),\n",
" Document(page_content='Twins', metadata={' \"Payroll (millions)\"': 94.08, ' \"Wins\"': 66}),\n",
" Document(page_content='Rockies', metadata={' \"Payroll (millions)\"': 78.06, ' \"Wins\"': 64}),\n",
" Document(page_content='Cubs', metadata={' \"Payroll (millions)\"': 88.19, ' \"Wins\"': 61}),\n",
" Document(page_content='Astros', metadata={' \"Payroll (millions)\"': 60.65, ' \"Wins\"': 55})]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "fc85c9f59b3644689d05853159fbd358",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0.00/100 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"page_content='Nationals' metadata={' \"Payroll (millions)\"': 81.34, ' \"Wins\"': 98}\n",
"page_content='Reds' metadata={' \"Payroll (millions)\"': 82.2, ' \"Wins\"': 97}\n",
"page_content='Yankees' metadata={' \"Payroll (millions)\"': 197.96, ' \"Wins\"': 95}\n",
"page_content='Giants' metadata={' \"Payroll (millions)\"': 117.62, ' \"Wins\"': 94}\n",
"page_content='Braves' metadata={' \"Payroll (millions)\"': 83.31, ' \"Wins\"': 94}\n",
"page_content='Athletics' metadata={' \"Payroll (millions)\"': 55.37, ' \"Wins\"': 94}\n",
"page_content='Rangers' metadata={' \"Payroll (millions)\"': 120.51, ' \"Wins\"': 93}\n",
"page_content='Orioles' metadata={' \"Payroll (millions)\"': 81.43, ' \"Wins\"': 93}\n",
"page_content='Rays' metadata={' \"Payroll (millions)\"': 64.17, ' \"Wins\"': 90}\n",
"page_content='Angels' metadata={' \"Payroll (millions)\"': 154.49, ' \"Wins\"': 89}\n",
"page_content='Tigers' metadata={' \"Payroll (millions)\"': 132.3, ' \"Wins\"': 88}\n",
"page_content='Cardinals' metadata={' \"Payroll (millions)\"': 110.3, ' \"Wins\"': 88}\n",
"page_content='Dodgers' metadata={' \"Payroll (millions)\"': 95.14, ' \"Wins\"': 86}\n",
"page_content='White Sox' metadata={' \"Payroll (millions)\"': 96.92, ' \"Wins\"': 85}\n",
"page_content='Brewers' metadata={' \"Payroll (millions)\"': 97.65, ' \"Wins\"': 83}\n",
"page_content='Phillies' metadata={' \"Payroll (millions)\"': 174.54, ' \"Wins\"': 81}\n",
"page_content='Diamondbacks' metadata={' \"Payroll (millions)\"': 74.28, ' \"Wins\"': 81}\n",
"page_content='Pirates' metadata={' \"Payroll (millions)\"': 63.43, ' \"Wins\"': 79}\n",
"page_content='Padres' metadata={' \"Payroll (millions)\"': 55.24, ' \"Wins\"': 76}\n",
"page_content='Mariners' metadata={' \"Payroll (millions)\"': 81.97, ' \"Wins\"': 75}\n",
"page_content='Mets' metadata={' \"Payroll (millions)\"': 93.35, ' \"Wins\"': 74}\n",
"page_content='Blue Jays' metadata={' \"Payroll (millions)\"': 75.48, ' \"Wins\"': 73}\n",
"page_content='Royals' metadata={' \"Payroll (millions)\"': 60.91, ' \"Wins\"': 72}\n",
"page_content='Marlins' metadata={' \"Payroll (millions)\"': 118.07, ' \"Wins\"': 69}\n",
"page_content='Red Sox' metadata={' \"Payroll (millions)\"': 173.18, ' \"Wins\"': 69}\n",
"page_content='Indians' metadata={' \"Payroll (millions)\"': 78.43, ' \"Wins\"': 68}\n",
"page_content='Twins' metadata={' \"Payroll (millions)\"': 94.08, ' \"Wins\"': 66}\n",
"page_content='Rockies' metadata={' \"Payroll (millions)\"': 78.06, ' \"Wins\"': 64}\n",
"page_content='Cubs' metadata={' \"Payroll (millions)\"': 88.19, ' \"Wins\"': 61}\n",
"page_content='Astros' metadata={' \"Payroll (millions)\"': 60.65, ' \"Wins\"': 55}\n"
]
}
],
"source": [
"# Use lazy load for larger table, which won't read the full table into memory\n",
"for i in loader.lazy_load():\n",
" print(i)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
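
Note on the last cell above: it iterates `loader.lazy_load()` instead of calling `loader.load()`, so rows are produced one at a time rather than collected into a list first. A minimal sketch of that difference follows. `ToyLoader`, its `rows_read` counter, and the three sample rows are hypothetical and only illustrate the generator pattern; this is not the loader's actual implementation.

```python
from typing import Dict, Iterator, List, Tuple

# A row is (page_content, metadata), loosely mirroring the Document output above.
Row = Tuple[str, Dict[str, float]]


class ToyLoader:
    """Hypothetical stand-in for a document loader; not part of LangChain."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows
        self.rows_read = 0  # counts how many rows have actually been produced

    def lazy_load(self) -> Iterator[Row]:
        # Generator: a row is only read when the caller asks for the next item.
        for row in self._rows:
            self.rows_read += 1
            yield row

    def load(self) -> List[Row]:
        # Eager variant: drains the generator and holds every row in memory.
        return list(self.lazy_load())


rows: List[Row] = [
    ("Nationals", {"Payroll (millions)": 81.34, "Wins": 98}),
    ("Reds", {"Payroll (millions)": 82.2, "Wins": 97}),
    ("Yankees", {"Payroll (millions)": 197.96, "Wins": 95}),
]

lazy_loader = ToyLoader(rows)
first = next(lazy_loader.lazy_load())  # pulls a single row
rows_read_lazily = lazy_loader.rows_read

eager_loader = ToyLoader(rows)
all_rows = eager_loader.load()  # pulls every row
rows_read_eagerly = eager_loader.rows_read
```

With the lazy variant only the requested row is read, which is why the notebook comment recommends it for larger tables.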

View File

@@ -18,7 +18,7 @@
"## Creating a Pinecone index\n",
"First we'll want to create a `Pinecone` VectorStore and seed it with some data. We've created a small demo set of documents that contain summaries of movies.\n",
"\n",
"To use Pinecone, you to have `pinecone` package installed and you must have an API key and an Environment. Here are the [installation instructions](https://docs.pinecone.io/docs/quickstart).\n",
"To use Pinecone, you have to have `pinecone` package installed and you must have an API key and an Environment. Here are the [installation instructions](https://docs.pinecone.io/docs/quickstart).\n",
"\n",
"NOTE: The self-query retriever requires you to have `lark` package installed."
]

View File

@@ -1,21 +1,31 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "9fc6205b",
"metadata": {},
"source": [
"# Databerry\n",
"# Chaindesk\n",
"\n",
[Databerry platform]">
">[Databerry platform](https://docs.databerry.ai/introduction) brings data from anywhere (Datasources: Text, PDF, Word, PowerPoint, Excel, Notion, Airtable, Google Sheets, etc.) into Datastores (containers of multiple Datasources).\n",
"Then your Datastores can be connected to ChatGPT via Plugins or any other Large Language Model (LLM) via the `Databerry API`.\n",
">[Chaindesk platform](https://docs.chaindesk.ai/introduction) brings data from anywhere (Datasources: Text, PDF, Word, PowerPoint, Excel, Notion, Airtable, Google Sheets, etc.) into Datastores (containers of multiple Datasources).\n",
"Then your Datastores can be connected to ChatGPT via Plugins or any other Large Language Model (LLM) via the `Chaindesk API`.\n",
"\n",
"This notebook shows how to use [Databerry's](https://www.databerry.ai/) retriever.\n",
"This notebook shows how to use [Chaindesk's](https://www.chaindesk.ai/) retriever.\n",
"\n",
"First, you will need to sign up for Databerry, create a datastore, add some data and get your datastore api endpoint url. You need the [API Key](https://docs.databerry.ai/api-reference/authentication)."
"First, you will need to sign up for Chaindesk, create a datastore, add some data and get your datastore api endpoint url. You need the [API Key](https://docs.chaindesk.ai/api-reference/authentication)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3697b9fd",
"metadata": {},
"outputs": [],
"source": []
},
{
"attachments": {},
"cell_type": "markdown",
"id": "944e172b",
"metadata": {},
@@ -34,7 +44,7 @@
},
"outputs": [],
"source": [
"from langchain.retrievers import DataberryRetriever"
"from langchain.retrievers import ChaindeskRetriever"
]
},
{
@@ -46,9 +56,9 @@
},
"outputs": [],
"source": [
"retriever = DataberryRetriever(\n",
" datastore_url=\"https://clg1xg2h80000l708dymr0fxc.databerry.ai/query\",\n",
" # api_key=\"DATABERRY_API_KEY\", # optional if datastore is public\n",
"retriever = ChaindeskRetriever(\n",
" datastore_url=\"https://clg1xg2h80000l708dymr0fxc.chaindesk.ai/query\",\n",
" # api_key=\"CHAINDESK_API_KEY\", # optional if datastore is public\n",
" # top_k=10 # optional\n",
")"
]

View File

@@ -1,6 +1,7 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "fc0db1bc",
"metadata": {},
@@ -25,7 +26,7 @@
"from langchain.vectorstores import Chroma\n",
"from langchain.embeddings import HuggingFaceEmbeddings\n",
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.document_transformers import EmbeddingsRedundantFilter\n",
"from langchain.document_transformers import EmbeddingsRedundantFilter,EmbeddingsClusteringFilter\n",
"from langchain.retrievers.document_compressors import DocumentCompressorPipeline\n",
"from langchain.retrievers import ContextualCompressionRetriever\n",
"\n",
@@ -70,6 +71,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "c152339d",
"metadata": {},
@@ -92,6 +94,46 @@
" base_compressor=pipeline, base_retriever=lotr\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "c10022fa",
"metadata": {},
"source": [
"## Pick a representative sample of documents from the merged retrievers."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b3885482",
"metadata": {},
"outputs": [],
"source": [
"# This filter will divide the documents vectors into clusters or \"centers\" of meaning.\n",
"# Then it will pick the closest document to that center for the final results.\n",
"# By default the result document will be ordered/grouped by clusters.\n",
"filter_ordered_cluster = EmbeddingsClusteringFilter(\n",
" embeddings=filter_embeddings,\n",
" num_clusters=10,\n",
" num_closest=1,\n",
" )\n",
"\n",
"# If you want the final document to be ordered by the original retriever scores\n",
"# you need to add the \"sorted\" parameter.\n",
"filter_ordered_by_retriever = EmbeddingsClusteringFilter(\n",
" embeddings=filter_embeddings,\n",
" num_clusters=10,\n",
" num_closest=1,\n",
" sorted = True,\n",
" )\n",
"\n",
"pipeline = DocumentCompressorPipeline(transformers=[filter_ordered_by_retriever])\n",
"compression_retriever = ContextualCompressionRetriever(\n",
" base_compressor=pipeline, base_retriever=lotr\n",
")\n"
]
}
],
"metadata": {

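Note on the clustering cell above: its comments describe picking the document closest to each cluster center, then optionally restoring the original retriever order. A rough sketch of that selection step in plain Python follows. The function `closest_to_centers`, its `keep_original_order` flag, and the sample vectors are hypothetical; the centers are passed in directly rather than learned, and this is not `EmbeddingsClusteringFilter`'s actual implementation.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def closest_to_centers(
    vectors: List[Vector],
    centers: List[Vector],
    keep_original_order: bool = False,
) -> List[int]:
    """Return the index of the not-yet-picked vector nearest to each center."""
    picked: List[int] = []
    for center in centers:
        candidates = [i for i in range(len(vectors)) if i not in picked]
        if not candidates:
            break  # fewer vectors than centers
        best = min(candidates, key=lambda i: math.dist(vectors[i], center))
        picked.append(best)
    # Grouped by cluster by default; sorted restores the input (retriever) order.
    return sorted(picked) if keep_original_order else picked


# Input order stands in for the retriever's ranking.
vectors = [(0.0, 0.0), (0.1, 0.1), (5.0, 5.0), (5.2, 4.9), (9.0, 0.0)]
# Assumed cluster centers; a real filter would compute these from the embeddings.
centers = [(5.0, 5.1), (0.0, 0.05), (9.0, 0.5)]

grouped_by_cluster = closest_to_centers(vectors, centers)
ordered_by_retriever = closest_to_centers(vectors, centers, keep_original_order=True)
```

The two calls select the same documents and differ only in ordering, which is the distinction the `sorted` parameter in the cell above is described as controlling.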
View File

@@ -146,11 +146,11 @@
"# save to disk\n",
"db2 = Chroma.from_documents(docs, embedding_function, persist_directory=\"./chroma_db\")\n",
"db2.persist()\n",
"docs = db.similarity_search(query)\n",
"docs = db2.similarity_search(query)\n",
"\n",
"# load from disk\n",
"db3 = Chroma(persist_directory=\"./chroma_db\")\n",
"docs = db.similarity_search(query)\n",
"db3 = Chroma(persist_directory=\"./chroma_db\", embedding_function=embedding_function)\n",
"docs = db3.similarity_search(query)\n",
"print(docs[0].page_content)"
]
},

View File

@@ -1,14 +1,15 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Deep Lake\n",
"# Activeloop's Deep Lake\n",
"\n",
[Deep Lake](https://docs.activeloop.ai/)">
">[Deep Lake](https://docs.activeloop.ai/) is a Multi-Modal Vector Store that stores embeddings and their metadata, including text, JSON, images, audio, video, and more. It saves the data locally, in your cloud, or on Activeloop storage. It performs hybrid search including embeddings and their attributes.\n",
">[Activeloop's Deep Lake](https://docs.activeloop.ai/) is a Multi-Modal Vector Store that stores embeddings and their metadata, including text, JSON, images, audio, video, and more. It saves the data locally, in your cloud, or on Activeloop storage. It performs hybrid search including embeddings and their attributes.\n",
"\n",
"This notebook showcases basic functionality related to `Deep Lake`. While `Deep Lake` can store embeddings, it is capable of storing any type of data. It is a fully fledged serverless data lake with version control, query engine and streaming dataloader to deep learning frameworks. \n",
"This notebook showcases basic functionality related to `Activeloop's Deep Lake`. While `Deep Lake` can store embeddings, it is capable of storing any type of data. It is a serverless data lake with version control, query engine and streaming dataloaders to deep learning frameworks. \n",
"\n",
"For more information, please see the Deep Lake [documentation](https://docs.activeloop.ai) or [api reference](https://docs.deeplake.ai)"
]
@@ -16,12 +17,10 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"metadata": {},
"outputs": [],
"source": [
"!pip install openai deeplake tiktoken"
"!pip install openai 'deeplake[enterprise]' tiktoken"
]
},
{
@@ -61,7 +60,7 @@
"source": [
"from langchain.document_loaders import TextLoader\n",
"\n",
"loader = TextLoader(\"docs/modules/state_of_the_union.txt\")\n",
"loader = TextLoader(\"../../../state_of_the_union.txt\")\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)\n",
@@ -70,6 +69,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -78,31 +78,9 @@
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": []
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dataset(path='./my_deeplake/', tensors=['embedding', 'id', 'metadata', 'text'])\n",
"\n",
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding embedding (42, 1536) float32 None \n",
" id text (42, 1) str None \n",
" metadata json (42, 1) str None \n",
" text text (42, 1) str None \n"
]
}
],
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"db = DeepLake(\n",
" dataset_path=\"./my_deeplake/\", embedding_function=embeddings, overwrite=True\n",
@@ -116,30 +94,15 @@
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n"
]
}
],
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(docs[0].page_content)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -148,19 +111,9 @@
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Deep Lake Dataset in ./my_deeplake/ already exists, loading from the storage\n"
]
}
],
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"db = DeepLake(\n",
" dataset_path=\"./my_deeplake/\", embedding_function=embeddings, read_only=True\n",
@@ -169,6 +122,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -176,6 +130,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -184,20 +139,9 @@
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/adilkhansarsen/Documents/work/LangChain/langchain/langchain/llms/openai.py:751: UserWarning: You are trying to use a chat model. This way of initializing it is no longer supported. Instead, please use: `from langchain.chat_models import ChatOpenAI`\n",
" warnings.warn(\n"
]
}
],
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains import RetrievalQA\n",
"from langchain.llms import OpenAIChat\n",
@@ -211,28 +155,16 @@
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'The President nominated Ketanji Brown Jackson to serve on the United States Supreme Court and spoke highly of her legal expertise and reputation as a consensus builder.'"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"qa.run(query)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -240,35 +172,18 @@
]
},
{
"cell_type": "code",
"execution_count": 9,
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": []
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dataset(path='./my_deeplake/', tensors=['embedding', 'id', 'metadata', 'text'])\n",
"\n",
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding embedding (4, 1536) float32 None \n",
" id text (4, 1) str None \n",
" metadata json (4, 1) str None \n",
" text text (4, 1) str None \n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": []
}
],
"source": [
"Let's create another vector store containing metadata with the year the documents were created."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import random\n",
"\n",
@@ -282,29 +197,9 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 4/4 [00:00<00:00, 3300.00it/s]\n"
]
},
{
"data": {
"text/plain": [
"[Document(lc_kwargs={'page_content': 'Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', 'metadata': {'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}}, page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}),\n",
" Document(lc_kwargs={'page_content': 'A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \\n\\nWe can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \\n\\nWeve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \\n\\nWere putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \\n\\nWere securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', 'metadata': {'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}}, page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \\n\\nWe can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \\n\\nWeve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \\n\\nWere putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. 
\\n\\nWere securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', metadata={'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}),\n",
" Document(lc_kwargs={'page_content': 'Tonight, Im announcing a crackdown on these companies overcharging American businesses and consumers. \\n\\nAnd as Wall Street firms take over more nursing homes, quality in those homes has gone down and costs have gone up. \\n\\nThat ends on my watch. \\n\\nMedicare is going to set higher standards for nursing homes and make sure your loved ones get the care they deserve and expect. \\n\\nWell also cut costs and keep the economy going strong by giving workers a fair shot, provide more training and apprenticeships, hire them based on their skills not degrees. \\n\\nLets pass the Paycheck Fairness Act and paid leave. \\n\\nRaise the minimum wage to $15 an hour and extend the Child Tax Credit, so no one has to raise a family in poverty. \\n\\nLets increase Pell Grants and increase our historic support of HBCUs, and invest in what Jill—our First Lady who teaches full-time—calls Americas best-kept secret: community colleges.', 'metadata': {'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}}, page_content='Tonight, Im announcing a crackdown on these companies overcharging American businesses and consumers. \\n\\nAnd as Wall Street firms take over more nursing homes, quality in those homes has gone down and costs have gone up. \\n\\nThat ends on my watch. \\n\\nMedicare is going to set higher standards for nursing homes and make sure your loved ones get the care they deserve and expect. \\n\\nWell also cut costs and keep the economy going strong by giving workers a fair shot, provide more training and apprenticeships, hire them based on their skills not degrees. \\n\\nLets pass the Paycheck Fairness Act and paid leave. \\n\\nRaise the minimum wage to $15 an hour and extend the Child Tax Credit, so no one has to raise a family in poverty. 
\\n\\nLets increase Pell Grants and increase our historic support of HBCUs, and invest in what Jill—our First Lady who teaches full-time—calls Americas best-kept secret: community colleges.', metadata={'source': 'docs/modules/state_of_the_union.txt', 'year': 2013})]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"db.similarity_search(\n",
" \"What did the president say about Ketanji Brown Jackson\",\n",
@@ -313,6 +208,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -322,23 +218,9 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(lc_kwargs={'page_content': 'Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', 'metadata': {'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}}, page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}),\n",
" Document(lc_kwargs={'page_content': 'A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \\n\\nWe can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \\n\\nWeve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \\n\\nWere putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \\n\\nWere securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', 'metadata': {'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}}, page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \\n\\nWe can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \\n\\nWeve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \\n\\nWere putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. 
\\n\\nWere securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', metadata={'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}),\n",
" Document(lc_kwargs={'page_content': 'Tonight, Im announcing a crackdown on these companies overcharging American businesses and consumers. \\n\\nAnd as Wall Street firms take over more nursing homes, quality in those homes has gone down and costs have gone up. \\n\\nThat ends on my watch. \\n\\nMedicare is going to set higher standards for nursing homes and make sure your loved ones get the care they deserve and expect. \\n\\nWell also cut costs and keep the economy going strong by giving workers a fair shot, provide more training and apprenticeships, hire them based on their skills not degrees. \\n\\nLets pass the Paycheck Fairness Act and paid leave. \\n\\nRaise the minimum wage to $15 an hour and extend the Child Tax Credit, so no one has to raise a family in poverty. \\n\\nLets increase Pell Grants and increase our historic support of HBCUs, and invest in what Jill—our First Lady who teaches full-time—calls Americas best-kept secret: community colleges.', 'metadata': {'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}}, page_content='Tonight, Im announcing a crackdown on these companies overcharging American businesses and consumers. \\n\\nAnd as Wall Street firms take over more nursing homes, quality in those homes has gone down and costs have gone up. \\n\\nThat ends on my watch. \\n\\nMedicare is going to set higher standards for nursing homes and make sure your loved ones get the care they deserve and expect. \\n\\nWell also cut costs and keep the economy going strong by giving workers a fair shot, provide more training and apprenticeships, hire them based on their skills not degrees. \\n\\nLets pass the Paycheck Fairness Act and paid leave. \\n\\nRaise the minimum wage to $15 an hour and extend the Child Tax Credit, so no one has to raise a family in poverty. 
\\n\\nLets increase Pell Grants and increase our historic support of HBCUs, and invest in what Jill—our First Lady who teaches full-time—calls Americas best-kept secret: community colleges.', metadata={'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}),\n",
" Document(lc_kwargs={'page_content': 'And for our LGBTQ+ Americans, lets finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong. \\n\\nAs I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. \\n\\nWhile it often appears that we never agree, that isnt true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice. \\n\\nAnd soon, well strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things. \\n\\nSo tonight Im offering a Unity Agenda for the Nation. Four big things we can do together. \\n\\nFirst, beat the opioid epidemic.', 'metadata': {'source': 'docs/modules/state_of_the_union.txt', 'year': 2012}}, page_content='And for our LGBTQ+ Americans, lets finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong. \\n\\nAs I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. \\n\\nWhile it often appears that we never agree, that isnt true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice. \\n\\nAnd soon, well strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things. \\n\\nSo tonight Im offering a Unity Agenda for the Nation. Four big things we can do together. 
\\n\\nFirst, beat the opioid epidemic.', metadata={'source': 'docs/modules/state_of_the_union.txt', 'year': 2012})]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"db.similarity_search(\n",
" \"What did the president say about Ketanji Brown Jackson?\", distance_metric=\"cos\"\n",
@@ -346,6 +228,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -355,23 +238,9 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(lc_kwargs={'page_content': 'Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', 'metadata': {'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}}, page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}),\n",
" Document(lc_kwargs={'page_content': 'Tonight, Im announcing a crackdown on these companies overcharging American businesses and consumers. \\n\\nAnd as Wall Street firms take over more nursing homes, quality in those homes has gone down and costs have gone up. \\n\\nThat ends on my watch. \\n\\nMedicare is going to set higher standards for nursing homes and make sure your loved ones get the care they deserve and expect. \\n\\nWell also cut costs and keep the economy going strong by giving workers a fair shot, provide more training and apprenticeships, hire them based on their skills not degrees. \\n\\nLets pass the Paycheck Fairness Act and paid leave. \\n\\nRaise the minimum wage to $15 an hour and extend the Child Tax Credit, so no one has to raise a family in poverty. \\n\\nLets increase Pell Grants and increase our historic support of HBCUs, and invest in what Jill—our First Lady who teaches full-time—calls Americas best-kept secret: community colleges.', 'metadata': {'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}}, page_content='Tonight, Im announcing a crackdown on these companies overcharging American businesses and consumers. \\n\\nAnd as Wall Street firms take over more nursing homes, quality in those homes has gone down and costs have gone up. \\n\\nThat ends on my watch. \\n\\nMedicare is going to set higher standards for nursing homes and make sure your loved ones get the care they deserve and expect. \\n\\nWell also cut costs and keep the economy going strong by giving workers a fair shot, provide more training and apprenticeships, hire them based on their skills not degrees. \\n\\nLets pass the Paycheck Fairness Act and paid leave. \\n\\nRaise the minimum wage to $15 an hour and extend the Child Tax Credit, so no one has to raise a family in poverty. 
\\n\\nLets increase Pell Grants and increase our historic support of HBCUs, and invest in what Jill—our First Lady who teaches full-time—calls Americas best-kept secret: community colleges.', metadata={'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}),\n",
" Document(lc_kwargs={'page_content': 'A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \\n\\nWe can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \\n\\nWeve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \\n\\nWere putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \\n\\nWere securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', 'metadata': {'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}}, page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \\n\\nWe can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \\n\\nWeve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \\n\\nWere putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. 
\\n\\nWere securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', metadata={'source': 'docs/modules/state_of_the_union.txt', 'year': 2013}),\n",
" Document(lc_kwargs={'page_content': 'And for our LGBTQ+ Americans, lets finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong. \\n\\nAs I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. \\n\\nWhile it often appears that we never agree, that isnt true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice. \\n\\nAnd soon, well strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things. \\n\\nSo tonight Im offering a Unity Agenda for the Nation. Four big things we can do together. \\n\\nFirst, beat the opioid epidemic.', 'metadata': {'source': 'docs/modules/state_of_the_union.txt', 'year': 2012}}, page_content='And for our LGBTQ+ Americans, lets finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong. \\n\\nAs I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. \\n\\nWhile it often appears that we never agree, that isnt true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice. \\n\\nAnd soon, well strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things. \\n\\nSo tonight Im offering a Unity Agenda for the Nation. Four big things we can do together. 
\\n\\nFirst, beat the opioid epidemic.', metadata={'source': 'docs/modules/state_of_the_union.txt', 'year': 2012})]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"db.max_marginal_relevance_search(\n",
" \"What did the president say about Ketanji Brown Jackson?\"\n",
@@ -379,6 +248,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -401,6 +271,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -423,11 +294,12 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deep Lake datasets on cloud (Activeloop, AWS, GCS, etc.) or in memory\n",
"By default deep lake datasets are stored locally, in case you want to store them in memory, in the Deep Lake Managed DB, or in any object storage, you can provide the [corresponding path to the dataset](https://docs.activeloop.ai/storage-and-credentials/storage-options). You can retrieve your user token from [app.activeloop.ai](https://app.activeloop.ai/)"
"By default, Deep Lake datasets are stored locally. To store them in memory, in the Deep Lake Managed DB, or in any object storage, you can provide the [corresponding path and credentials when creating the vector store](https://docs.activeloop.ai/storage-and-credentials/storage-options). Some paths require registration with Activeloop and creation of an API token that can be [retrieved here](https://app.activeloop.ai/)"
]
},
{
@@ -439,106 +311,11 @@
"os.environ[\"ACTIVELOOP_TOKEN\"] = activeloop_token"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Deep Lake now supports running inference in 3 modes: `python`, a naive way of searching inside the data; `tensor_db`, a managed database that runs TQL on a remote optimized engine and sends the results back; and `compute_engine`, a C++ implementation of search that runs locally."
]
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Your Deep Lake dataset has been successfully created!\n",
"The dataset is private so make sure you are logged in!\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"-"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dataset(path='hub://adilkhan/langchain_testing_python', tensors=['embedding', 'id', 'metadata', 'text'])\n",
"\n",
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding embedding (42, 1536) float32 None \n",
" id text (42, 1) str None \n",
" metadata json (42, 1) str None \n",
" text text (42, 1) str None \n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
" \r"
]
},
{
"data": {
"text/plain": [
"['d604b1ac-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b238-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b260-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b27e-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b29c-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b2ba-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b2d8-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b2f6-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b314-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b332-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b350-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b36e-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b38c-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b3a0-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b3be-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b3dc-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b3fa-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b418-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b436-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b454-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b472-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b490-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b4a4-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b4c2-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b4e0-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b4fe-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b51c-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b53a-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b558-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b576-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b594-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b5b2-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b5c6-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b5e4-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b602-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b620-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b63e-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b65c-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b67a-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b698-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b6b6-093c-11ee-bdba-76d8a30504e0',\n",
" 'd604b6d4-093c-11ee-bdba-76d8a30504e0']"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"# Embed and store the texts\n",
"username = \"<username>\" # your username on app.activeloop.ai\n",
@@ -553,23 +330,9 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n"
]
}
],
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = db.similarity_search(query)\n",
@@ -577,102 +340,30 @@
]
},
{
"cell_type": "code",
"execution_count": 20,
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Your Deep Lake dataset has been successfully created!\n",
"The dataset is private so make sure you are logged in!\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"|"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dataset(path='hub://adilkhan/langchain_testing', tensors=['embedding', 'id', 'metadata', 'text'])\n",
"\n",
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding embedding (42, 1536) float32 None \n",
" id text (42, 1) str None \n",
" metadata json (42, 1) str None \n",
" text text (42, 1) str None \n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
" \r"
]
},
{
"data": {
"text/plain": [
"['6584c33a-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c3ee-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c420-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c43e-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c466-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c484-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c4a2-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c4c0-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c4de-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c4fc-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c51a-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c538-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c556-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c574-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c592-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c5b0-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c5ce-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c5f6-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c614-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c632-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c646-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c66e-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c682-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c6a0-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c6be-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c6e6-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c704-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c722-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c740-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c75e-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c77c-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c79a-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c7ae-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c7cc-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c7ea-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c808-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c826-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c844-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c862-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c876-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c894-093d-11ee-bdba-76d8a30504e0',\n",
" '6584c8bc-093d-11ee-bdba-76d8a30504e0']"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#### `tensor_db` execution option "
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"To use Deep Lake's Managed Tensor Database, specify the runtime parameter as `{'tensor_db': True}` when creating the vector store. Queries then run on the Managed Tensor Database rather than on the client side. This option does not apply to datasets stored locally or in memory. If a vector store was already created outside the Managed Tensor Database, it can be transferred to the Managed Tensor Database by following the prescribed steps."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Embed and store the texts\n",
"username = \"adilkhan\" # your username on app.activeloop.ai\n",
"dataset_path = f\"hub://{username}/langchain_testing\" # could be also ./local/path (much faster locally), s3://bucket/path/to/dataset, gcs://path/to/dataset, etc.\n",
"dataset_path = f\"hub://{username}/langchain_testing\"\n",
"\n",
"docs = text_splitter.split_documents(documents)\n",
"\n",
@@ -681,44 +372,13 @@
" dataset_path=dataset_path,\n",
" embedding_function=embeddings,\n",
" overwrite=True,\n",
" exec_option=\"tensor_db\",\n",
" runtime={\"tensor_db\": True},\n",
")\n",
"db.add_documents(docs)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n"
]
}
],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = db.similarity_search(query, exec_option=\"tensor_db\")\n",
"print(docs[0].page_content)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### The difference will be apparent on bigger datasets (~10,000 rows)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -726,15 +386,16 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"now we can use tql search with DeepLake"
"The `similarity_search` method also supports query execution, where the query is specified in Deep Lake's Tensor Query Language (TQL)."
]
},
{
"cell_type": "code",
"execution_count": 23,
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
@@ -743,42 +404,31 @@
},
{
"cell_type": "code",
"execution_count": 24,
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"docs = db.similarity_search(\n",
" query=None,\n",
" tql_query=f\"SELECT * WHERE id == '{search_id[0]}'\",\n",
" exec_option=\"tensor_db\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(lc_kwargs={'page_content': 'Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans. \\n\\nLast year COVID-19 kept us apart. This year we are finally together again. \\n\\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \\n\\nWith a duty to one another to the American people to the Constitution. \\n\\nAnd with an unwavering resolve that freedom will always triumph over tyranny. \\n\\nSix days ago, Russias Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \\n\\nHe thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \\n\\nHe met the Ukrainian people. \\n\\nFrom President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world.', 'metadata': {'source': 'docs/modules/state_of_the_union.txt'}}, page_content='Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans. \\n\\nLast year COVID-19 kept us apart. This year we are finally together again. \\n\\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \\n\\nWith a duty to one another to the American people to the Constitution. \\n\\nAnd with an unwavering resolve that freedom will always triumph over tyranny. \\n\\nSix days ago, Russias Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \\n\\nHe thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \\n\\nHe met the Ukrainian people. 
\\n\\nFrom President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world.', metadata={'source': 'docs/modules/state_of_the_union.txt'})]"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"docs"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating dataset on AWS S3"
"### Creating vector stores on AWS S3"
]
},
{
@@ -841,11 +491,12 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deep Lake API\n",
"You can access the Deep Lake dataset at `db.ds`."
"You can access the Deep Lake dataset at `db.vectorstore`."
]
},
{
@@ -884,6 +535,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [


@@ -1,226 +1,243 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "683953b3",
"metadata": {},
"source": [
"# MongoDB Atlas\n",
"\n",
">[MongoDB Atlas](https://www.mongodb.com/docs/atlas/) is a fully managed cloud database available in AWS, Azure, and GCP. It now has support for native Vector Search on your MongoDB document data.\n",
"\n",
"This notebook shows how to use `MongoDB Atlas Vector Search` to store your embeddings in MongoDB documents, create a vector search index, and perform KNN search with an approximate nearest neighbor algorithm.\n",
"\n",
"It uses the [knnBeta Operator](https://www.mongodb.com/docs/atlas/atlas-search/knn-beta) available in MongoDB Atlas Search. This feature is in Public Preview and available for evaluation purposes, to validate functionality, and to gather feedback from public preview users. It is not recommended for production deployments as we may introduce breaking changes.\n",
"\n",
"To use MongoDB Atlas, you must first deploy a cluster. We have a Forever-Free tier of clusters available. \n",
"To get started head over to Atlas here: [quick start](https://www.mongodb.com/docs/atlas/getting-started/)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b4c41cad-08ef-4f72-a545-2151e4598efe",
"metadata": {
"tags": []
"cells":[
{
"attachments":{
},
"cell_type":"markdown",
"id":"683953b3",
"metadata":{
},
"source":[
"# MongoDB Atlas\n",
"\n",
">[MongoDB Atlas](https://www.mongodb.com/docs/atlas/) is a fully managed cloud database available in AWS, Azure, and GCP. It now has support for native Vector Search on your MongoDB document data.\n",
"\n",
"This notebook shows how to use `MongoDB Atlas Vector Search` to store your embeddings in MongoDB documents, create a vector search index, and perform KNN search with an approximate nearest neighbor algorithm.\n",
"\n",
"It uses the [knnBeta Operator](https://www.mongodb.com/docs/atlas/atlas-search/knn-beta) available in MongoDB Atlas Search. This feature is in Public Preview and available for evaluation purposes, to validate functionality, and to gather feedback from public preview users. It is not recommended for production deployments as we may introduce breaking changes.\n",
"\n",
"To use MongoDB Atlas, you must first deploy a cluster. We have a Forever-Free tier of clusters available. \n",
"To get started head over to Atlas here: [quick start](https://www.mongodb.com/docs/atlas/getting-started/)."
]
},
{
"cell_type":"code",
"execution_count":null,
"id":"b4c41cad-08ef-4f72-a545-2151e4598efe",
"metadata":{
"tags":[
]
},
"outputs":[
],
"source":[
"!pip install pymongo"
]
},
{
"cell_type":"code",
"execution_count":null,
"id":"c1e38361-c1fe-4ac6-86e9-c90ebaf7ae87",
"metadata":{
},
"outputs":[
],
"source":[
"import os\n",
"import getpass\n",
"\n",
"MONGODB_ATLAS_CLUSTER_URI = getpass.getpass(\"MongoDB Atlas Cluster URI:\")\n"
]
},
{
"attachments":{
},
"cell_type":"markdown",
"id":"457ace44-1d95-4001-9dd5-78811ab208ad",
"metadata":{
},
"source":[
"We want to use `OpenAIEmbeddings` so we need to set up our OpenAI API Key. "
]
},
{
"cell_type":"code",
"execution_count":null,
"id":"2d8f240d",
"metadata":{
},
"outputs":[
],
"source":[
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")\n"
]
},
{
"attachments":{
},
"cell_type":"markdown",
"id":"1f3ecc42",
"metadata":{
},
"source":[
"Now, let's create a vector search index on your cluster. In the below example, `embedding` is the name of the field that contains the embedding vector. Please refer to the [documentation](https://www.mongodb.com/docs/atlas/atlas-search/define-field-mappings-for-vector-search) to get more details on how to define an Atlas Vector Search index.\n",
"You can name the index `langchain_demo` and create the index on the namespace `langchain_db.langchain_col`. Finally, write the following definition in the JSON editor on MongoDB Atlas:\n",
"\n",
"```json\n",
"{\n",
" \"mappings\": {\n",
" \"dynamic\": true,\n",
" \"fields\": {\n",
" \"embedding\": {\n",
" \"dimensions\": 1536,\n",
" \"similarity\": \"cosine\",\n",
" \"type\": \"knnVector\"\n",
" }\n",
" }\n",
" }\n",
"}\n",
"```"
]
},
{
"cell_type":"code",
"execution_count":2,
"id":"aac9563e",
"metadata":{
"tags":[
]
},
"outputs":[
],
"source":[
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import MongoDBAtlasVectorSearch\n",
"from langchain.document_loaders import TextLoader\n",
"\n",
"loader = TextLoader(\"../../../state_of_the_union.txt\")\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
]
},
{
"cell_type":"code",
"execution_count":null,
"id":"6e104aee",
"metadata":{
},
"outputs":[
],
"source":[
"from pymongo import MongoClient\n",
"\n",
"# initialize MongoDB python client\n",
"client = MongoClient(MONGODB_ATLAS_CLUSTER_URI)\n",
"\n",
"db_name = \"langchain_db\"\n",
"collection_name = \"langchain_col\"\n",
"collection = client[db_name][collection_name]\n",
"index_name = \"langchain_demo\"\n",
"\n",
"# insert the documents in MongoDB Atlas with their embedding\n",
"docsearch = MongoDBAtlasVectorSearch.from_documents(\n",
" docs, embeddings, collection=collection, index_name=index_name\n",
")\n",
"\n",
"# perform a similarity search between the embedding of the query and the embeddings of the documents\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = docsearch.similarity_search(query)"
]
},
{
"cell_type":"code",
"execution_count":null,
"id":"9c608226",
"metadata":{
},
"outputs":[
],
"source":[
"print(docs[0].page_content)"
]
},
{
"attachments":{
},
"cell_type":"markdown",
"id":"851a2ec9-9390-49a4-8412-3e132c9f789d",
"metadata":{
},
"source":[
"You can also instantiate the vector store directly and execute a query as follows:"
]
},
{
"cell_type":"code",
"execution_count":null,
"id":"6336fe79-3e73-48be-b20a-0ff1bb6a4399",
"metadata":{
},
"outputs":[
],
"source":[
"# initialize vector store\n",
"vectorstore = MongoDBAtlasVectorSearch(\n",
" collection, OpenAIEmbeddings(), index_name=index_name\n",
")\n",
"\n",
"# perform a similarity search between a query and the ingested documents\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = vectorstore.similarity_search(query)\n",
"\n",
"print(docs[0].page_content)"
]
}
],
"metadata":{
"kernelspec":{
"display_name":"Python 3 (ipykernel)",
"language":"python",
"name":"python3"
},
"language_info":{
"codemirror_mode":{
"name":"ipython",
"version":3
},
"file_extension":".py",
"mimetype":"text/x-python",
"name":"python",
"nbconvert_exporter":"python",
"pygments_lexer":"ipython3",
"version":"3.10.6"
}
},
"outputs": [],
"source": [
"!pip install pymongo"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c1e38361-c1fe-4ac6-86e9-c90ebaf7ae87",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"MONGODB_ATLAS_CLUSTER_URI = getpass.getpass(\"MongoDB Atlas Cluster URI:\")\n",
"MONGODB_ATLAS_CLUSTER_URI = os.environ[\"MONGODB_ATLAS_CLUSTER_URI\"]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "457ace44-1d95-4001-9dd5-78811ab208ad",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we need to set up our OpenAI API Key. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2d8f240d",
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")\n",
"OPENAI_API_KEY = os.environ[\"OPENAI_API_KEY\"]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "1f3ecc42",
"metadata": {},
"source": [
"Now, let's create a vector search index on your cluster. In the below example, `embedding` is the name of the field that contains the embedding vector. Please refer to the [documentation](https://www.mongodb.com/docs/atlas/atlas-search/define-field-mappings-for-vector-search) to get more details on how to define an Atlas Vector Search index.\n",
"You can name the index `langchain_demo` and create the index on the namespace `langchain_db.langchain_col`. Finally, write the following definition in the JSON editor on MongoDB Atlas:\n",
"\n",
"```json\n",
"{\n",
" \"mappings\": {\n",
" \"dynamic\": true,\n",
" \"fields\": {\n",
" \"embedding\": {\n",
" \"dimensions\": 1536,\n",
" \"similarity\": \"cosine\",\n",
" \"type\": \"knnVector\"\n",
" }\n",
" }\n",
" }\n",
"}\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "aac9563e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import MongoDBAtlasVectorSearch\n",
"from langchain.document_loaders import TextLoader"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a3c3999a",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
"\n",
"loader = TextLoader(\"../../../state_of_the_union.txt\")\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6e104aee",
"metadata": {},
"outputs": [],
"source": [
"from pymongo import MongoClient\n",
"\n",
"# initialize MongoDB python client\n",
"client = MongoClient(MONGODB_ATLAS_CLUSTER_URI)\n",
"\n",
"db_name = \"langchain_db\"\n",
"collection_name = \"langchain_col\"\n",
"collection = client[db_name][collection_name]\n",
"index_name = \"langchain_demo\"\n",
"\n",
"# insert the documents in MongoDB Atlas with their embedding\n",
"docsearch = MongoDBAtlasVectorSearch.from_documents(\n",
" docs, embeddings, collection=collection, index_name=index_name\n",
")\n",
"\n",
"# perform a similarity search between the embedding of the query and the embeddings of the documents\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = docsearch.similarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9c608226",
"metadata": {},
"outputs": [],
"source": [
"print(docs[0].page_content)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "851a2ec9-9390-49a4-8412-3e132c9f789d",
"metadata": {},
"source": [
"You can reuse the vector search index you created: make sure the `OPENAI_API_KEY` environment variable is set, then execute another query."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6336fe79-3e73-48be-b20a-0ff1bb6a4399",
"metadata": {},
"outputs": [],
"source": [
"from pymongo import MongoClient\n",
"from langchain.vectorstores import MongoDBAtlasVectorSearch\n",
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"import os\n",
"\n",
"MONGODB_ATLAS_URI = os.environ[\"MONGODB_ATLAS_URI\"]\n",
"\n",
"# initialize MongoDB python client\n",
"client = MongoClient(MONGODB_ATLAS_URI)\n",
"\n",
"db_name = \"langchain_db\"\n",
"collection_name = \"langchain_col\"\n",
"collection = client[db_name][collection_name]\n",
"index_name = \"langchain_demo\"\n",
"\n",
"# initialize vector store\n",
"vectorStore = MongoDBAtlasVectorSearch(\n",
" collection, OpenAIEmbeddings(), index_name=index_name\n",
")\n",
"\n",
"# perform a similarity search between a query and the ingested documents\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = vectorStore.similarity_search(query)\n",
"\n",
"print(docs[0].page_content)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
"nbformat":4,
"nbformat_minor":5
}


@@ -123,7 +123,7 @@
},
{
"cell_type": "code",
"execution_count": 62,
"execution_count": 1,
"metadata": {
"tags": []
},
@@ -138,7 +138,7 @@
},
{
"cell_type": "code",
"execution_count": 63,
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
@@ -152,49 +152,25 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"## PGVector needs the connection string to the database.\n",
"## We will load it from the environment variables.\n",
"import os\n",
"# PGVector needs the connection string to the database.\n",
"CONNECTION_STRING = \"postgresql+psycopg2://harrisonchase@localhost:5432/test3\"\n",
"\n",
"CONNECTION_STRING = PGVector.connection_string_from_db_params(\n",
" driver=os.environ.get(\"PGVECTOR_DRIVER\", \"psycopg2\"),\n",
" host=os.environ.get(\"PGVECTOR_HOST\", \"localhost\"),\n",
" port=int(os.environ.get(\"PGVECTOR_PORT\", \"5432\")),\n",
" database=os.environ.get(\"PGVECTOR_DATABASE\", \"postgres\"),\n",
" user=os.environ.get(\"PGVECTOR_USER\", \"postgres\"),\n",
" password=os.environ.get(\"PGVECTOR_PASSWORD\", \"postgres\"),\n",
")\n",
"\n",
"\n",
"## Example\n",
"# postgresql+psycopg2://username:password@localhost:5432/database_name"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [],
"source": [
"# ## PGVector needs the connection string to the database.\n",
"# ## We will load it from the environment variables.\n",
"# # Alternatively, you can create it from enviornment variables.\n",
"# import os\n",
"\n",
"# CONNECTION_STRING = PGVector.connection_string_from_db_params(\n",
"# driver=os.environ.get(\"PGVECTOR_DRIVER\", \"psycopg2\"),\n",
"# host=os.environ.get(\"PGVECTOR_HOST\", \"localhost\"),\n",
"# port=int(os.environ.get(\"PGVECTOR_PORT\", \"5432\")),\n",
"# database=os.environ.get(\"PGVECTOR_DATABASE\", \"rd-embeddings\"),\n",
"# user=os.environ.get(\"PGVECTOR_USER\", \"admin\"),\n",
"# password=os.environ.get(\"PGVECTOR_PASSWORD\", \"password\"),\n",
"# database=os.environ.get(\"PGVECTOR_DATABASE\", \"postgres\"),\n",
"# user=os.environ.get(\"PGVECTOR_USER\", \"postgres\"),\n",
"# password=os.environ.get(\"PGVECTOR_PASSWORD\", \"postgres\"),\n",
"# )\n",
"\n",
"\n",
"# ## Example\n",
"# # postgresql+psycopg2://username:password@localhost:5432/database_name"
"\n"
]
},
{
@@ -206,27 +182,36 @@
},
{
"cell_type": "code",
"execution_count": 69,
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"# The PGVector Module will try to create a table with the name of the collection. So, make sure that the collection name is unique and the user has the\n",
"# permission to create a table.\n",
"# The PGVector Module will try to create a table with the name of the collection. \n",
"# So, make sure that the collection name is unique and the user has the permission to create a table.\n",
"\n",
"COLLECTION_NAME = \"state_of_the_union_test\"\n",
"\n",
"db = PGVector.from_documents(\n",
" embedding=embeddings,\n",
" documents=docs,\n",
" collection_name=\"state_of_the_union\",\n",
" collection_name=COLLECTION_NAME,\n",
" connection_string=CONNECTION_STRING,\n",
")\n",
"\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs_with_score: List[Tuple[Document, float]] = db.similarity_search_with_score(query)"
")"
]
},
{
"cell_type": "code",
"execution_count": 70,
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs_with_score = db.similarity_search_with_score(query)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
@@ -234,7 +219,7 @@
"output_type": "stream",
"text": [
"--------------------------------------------------------------------------------\n",
"Score: 0.6076804864602984\n",
"Score: 0.18460171628856903\n",
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
@@ -244,7 +229,7 @@
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n",
"--------------------------------------------------------------------------------\n",
"--------------------------------------------------------------------------------\n",
"Score: 0.6076804864602984\n",
"Score: 0.18460171628856903\n",
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
@@ -254,21 +239,17 @@
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n",
"--------------------------------------------------------------------------------\n",
"--------------------------------------------------------------------------------\n",
"Score: 0.659062774389974\n",
"A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n",
"Score: 0.18470284560586236\n",
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"We can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"Weve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \n",
"\n",
"Were putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \n",
"\n",
"Were securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n",
"--------------------------------------------------------------------------------\n",
"--------------------------------------------------------------------------------\n",
"Score: 0.659062774389974\n",
"Score: 0.21730864082247825\n",
"A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n",
"\n",
"And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \n",
@@ -296,183 +277,189 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Working with vectorstore"
"## Working with vectorstore\n",
"\n",
"Above, we created a vectorstore from scratch. However, often times we want to work with an existing vectorstore.\n",
"In order to do that, we can initialize it directly."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"store = PGVector(\n",
" collection_name=COLLECTION_NAME,\n",
" connection_string=CONNECTION_STRING,\n",
" embedding_function=embeddings,\n",
")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Uploading a vectorstore"
"### Add documents\n",
"We can add documents to the existing vectorstore."
]
},
{
"cell_type": "code",
"execution_count": 55,
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['048c2e14-1cf3-11ee-8777-e65801318980']"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"store.add_documents([Document(page_content=\"foo\")])"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"docs_with_score = db.similarity_search_with_score(\"foo\")"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(Document(page_content='foo', metadata={}), 3.3203430005457335e-09)"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs_with_score[0]"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(Document(page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \\n\\nWe can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \\n\\nWeve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \\n\\nWere putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \\n\\nWere securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', metadata={'source': '../../../state_of_the_union.txt'}),\n",
" 0.2404395365581814)"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs_with_score[1]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Overriding a vectorstore\n",
"\n",
"If you have an existing collection, you override it by doing `from_documents` and setting `pre_delete_collection` = True"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"data = docs\n",
"api_key = os.environ[\"OPENAI_API_KEY\"]\n",
"db = PGVector.from_documents(\n",
" documents=docs,\n",
" embedding=embeddings,\n",
" collection_name=collection_name,\n",
" connection_string=connection_string,\n",
" distance_strategy=DistanceStrategy.COSINE,\n",
" openai_api_key=api_key,\n",
" pre_delete_collection=False,\n",
" collection_name=COLLECTION_NAME,\n",
" connection_string=CONNECTION_STRING,\n",
" pre_delete_collection=True,\n",
")"
]
},
{
"cell_type": "markdown",
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"### Retrieving a vectorstore"
"docs_with_score = db.similarity_search_with_score(\"foo\")"
]
},
{
"cell_type": "code",
"execution_count": 56,
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(Document(page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \\n\\nWe can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \\n\\nWeve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \\n\\nWere putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \\n\\nWere securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', metadata={'source': '../../../state_of_the_union.txt'}),\n",
" 0.2404115088144465)"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs_with_score[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using a VectorStore as a Retriever"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"connection_string = CONNECTION_STRING\n",
"embedding = embeddings\n",
"collection_name = \"state_of_the_union\"\n",
"from langchain.vectorstores.pgvector import DistanceStrategy\n",
"\n",
"store = PGVector(\n",
" connection_string=connection_string,\n",
" embedding_function=embedding,\n",
" collection_name=collection_name,\n",
" distance_strategy=DistanceStrategy.COSINE,\n",
")\n",
"\n",
"retriever = store.as_retriever()"
]
},
{
"cell_type": "code",
"execution_count": 57,
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"vectorstore=<langchain.vectorstores.pgvector.PGVector object at 0x7fe9a1b1c670> search_type='similarity' search_kwargs={}\n"
"tags=None metadata=None vectorstore=<langchain.vectorstores.pgvector.PGVector object at 0x29f94f880> search_type='similarity' search_kwargs={}\n"
]
}
],
"source": [
"print(retriever)"
]
},
{
"cell_type": "code",
"execution_count": 83,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[(Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'}), 0.6075870262188066), (Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'}), 0.6075870262188066), (Document(page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. 
Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \\n\\nWe can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \\n\\nWeve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \\n\\nWere putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \\n\\nWere securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', metadata={'source': '../../../state_of_the_union.txt'}), 0.6589478388546668), (Document(page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \\n\\nWe can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \\n\\nWeve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \\n\\nWere putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \\n\\nWere securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', metadata={'source': '../../../state_of_the_union.txt'}), 0.6589478388546668)]\n"
]
}
],
"source": [
"# When we have an existing PG VEctor\n",
"DEFAULT_DISTANCE_STRATEGY = DistanceStrategy.EUCLIDEAN\n",
"db1 = PGVector.from_existing_index(\n",
" embedding=embeddings,\n",
" collection_name=\"state_of_the_union\",\n",
" distance_strategy=DEFAULT_DISTANCE_STRATEGY,\n",
" pre_delete_collection=False,\n",
" connection_string=CONNECTION_STRING,\n",
")\n",
"\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs_with_score: List[Tuple[Document, float]] = db1.similarity_search_with_score(query)\n",
"print(docs_with_score)"
]
},
{
"cell_type": "code",
"execution_count": 81,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--------------------------------------------------------------------------------\n",
"Score: 0.6075870262188066\n",
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n",
"--------------------------------------------------------------------------------\n",
"--------------------------------------------------------------------------------\n",
"Score: 0.6075870262188066\n",
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n",
"--------------------------------------------------------------------------------\n",
"--------------------------------------------------------------------------------\n",
"Score: 0.6589478388546668\n",
"A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n",
"\n",
"And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \n",
"\n",
"We can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \n",
"\n",
"Weve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \n",
"\n",
"Were putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \n",
"\n",
"Were securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.\n",
"--------------------------------------------------------------------------------\n",
"--------------------------------------------------------------------------------\n",
"Score: 0.6589478388546668\n",
"A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n",
"\n",
"And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \n",
"\n",
"We can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \n",
"\n",
"Weve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \n",
"\n",
"Were putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \n",
"\n",
"Were securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.\n",
"--------------------------------------------------------------------------------\n"
]
}
],
"source": [
"for doc, score in docs_with_score:\n",
" print(\"-\" * 80)\n",
" print(\"Score: \", score)\n",
" print(doc.page_content)\n",
" print(\"-\" * 80)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -491,7 +478,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.9.1"
}
},
"nbformat": 4,


@@ -40,12 +40,15 @@
"cell_type": "code",
"execution_count": 1,
"metadata": {
"is_executing": true
"ExecuteTime": {
"end_time": "2023-07-09T19:20:49.003167Z",
"start_time": "2023-07-09T19:20:47.446370Z"
}
},
"outputs": [],
"source": [
"from langchain.memory.chat_message_histories import ZepChatMessageHistory\n",
"from langchain.memory import ConversationBufferMemory\n",
"from langchain.memory import ZepMemory\n",
"from langchain.retrievers import ZepRetriever\n",
"from langchain import OpenAI\n",
"from langchain.schema import HumanMessage, AIMessage\n",
"from langchain.utilities import WikipediaAPIWrapper\n",
@@ -64,19 +67,11 @@
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-25T15:09:41.762056Z",
"start_time": "2023-05-25T15:09:41.755238Z"
"end_time": "2023-07-09T19:23:14.378234Z",
"start_time": "2023-07-09T19:20:49.005041Z"
}
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"outputs": [],
"source": [
"# Provide your OpenAI key\n",
"import getpass\n",
@@ -87,16 +82,13 @@
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
"metadata": {
"ExecuteTime": {
"end_time": "2023-07-09T19:23:16.329934Z",
"start_time": "2023-07-09T19:23:14.345580Z"
}
],
},
"outputs": [],
"source": [
"# Provide your Zep API key. Note that this is optional. See https://docs.getzep.com/deployment/auth\n",
"\n",
@@ -116,8 +108,8 @@
"execution_count": 4,
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-25T15:09:41.840440Z",
"start_time": "2023-05-25T15:09:41.762277Z"
"end_time": "2023-07-09T19:23:16.528212Z",
"start_time": "2023-07-09T19:23:16.279045Z"
}
},
"outputs": [],
@@ -132,15 +124,11 @@
"]\n",
"\n",
"# Set up Zep Chat History\n",
"zep_chat_history = ZepChatMessageHistory(\n",
"memory = ZepMemory(\n",
" session_id=session_id,\n",
" url=ZEP_API_URL,\n",
" api_key=zep_api_key\n",
")\n",
"\n",
"# Use a standard ConversationBufferMemory to encapsulate the Zep chat history\n",
"memory = ConversationBufferMemory(\n",
" memory_key=\"chat_history\", chat_memory=zep_chat_history\n",
" api_key=zep_api_key,\n",
" memory_key=\"chat_history\",\n",
")\n",
"\n",
"# Initialize the agent\n",
@@ -167,8 +155,8 @@
"execution_count": 5,
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-25T15:09:41.960661Z",
"start_time": "2023-05-25T15:09:41.842656Z"
"end_time": "2023-07-09T19:23:16.659484Z",
"start_time": "2023-07-09T19:23:16.532090Z"
}
},
"outputs": [],
@@ -230,14 +218,16 @@
" \" living in a dystopian future where society has collapsed due to\"\n",
" \" environmental disasters, poverty, and violence.\"\n",
" ),\n",
" \"metadata\": {\"foo\": \"bar\"},\n",
" },\n",
"]\n",
"\n",
"for msg in test_history:\n",
" zep_chat_history.add_message(\n",
" memory.chat_memory.add_message(\n",
" HumanMessage(content=msg[\"content\"])\n",
" if msg[\"role\"] == \"human\"\n",
" else AIMessage(content=msg[\"content\"])\n",
" else AIMessage(content=msg[\"content\"]),\n",
" metadata=msg.get(\"metadata\", {}),\n",
" )"
]
},
@@ -256,8 +246,8 @@
"execution_count": 6,
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-25T15:09:50.485377Z",
"start_time": "2023-05-25T15:09:41.962287Z"
"end_time": "2023-07-09T19:23:19.348822Z",
"start_time": "2023-07-09T19:23:16.660130Z"
}
},
"outputs": [
@@ -269,16 +259,14 @@
"\n",
"\u001B[1m> Entering new chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3mThought: Do I need to use a tool? No\n",
"AI: Parable of the Sower is a prescient novel that speaks to the challenges facing contemporary society, such as climate change, economic inequality, and the rise of authoritarianism. It is a cautionary tale that warns of the dangers of ignoring these issues and the importance of taking action to address them.\u001B[0m\n",
"AI: Parable of the Sower is a prescient novel that speaks to the challenges facing contemporary society, such as climate change, inequality, and violence. It is a cautionary tale that warns of the dangers of unchecked greed and the need for individuals to take responsibility for their own lives and the lives of those around them.\u001B[0m\n",
"\n",
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
"data": {
"text/plain": [
"'Parable of the Sower is a prescient novel that speaks to the challenges facing contemporary society, such as climate change, economic inequality, and the rise of authoritarianism. It is a cautionary tale that warns of the dangers of ignoring these issues and the importance of taking action to address them.'"
]
"text/plain": "'Parable of the Sower is a prescient novel that speaks to the challenges facing contemporary society, such as climate change, inequality, and violence. It is a cautionary tale that warns of the dangers of unchecked greed and the need for individuals to take responsibility for their own lives and the lives of those around them.'"
},
"execution_count": 6,
"metadata": {},
@@ -287,7 +275,7 @@
],
"source": [
"agent_chain.run(\n",
" input=\"WWhat is the book's relevance to the challenges facing contemporary society?\"\n",
" input=\"What is the book's relevance to the challenges facing contemporary society?\",\n",
")"
]
},
@@ -305,11 +293,11 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 9,
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-25T15:09:50.493438Z",
"start_time": "2023-05-25T15:09:50.479230Z"
"end_time": "2023-07-09T19:23:41.042254Z",
"start_time": "2023-07-09T19:23:41.016815Z"
}
},
"outputs": [
@@ -317,29 +305,39 @@
"name": "stdout",
"output_type": "stream",
"text": [
"The human asks about Octavia Butler and the AI identifies her as an American science fiction author. They continue to discuss her works and the fact that the FX series Kindred is based on one of her novels. The AI also lists Ursula K. Le Guin, Samuel R. Delany, and Joanna Russ as Butler's contemporaries.\n",
"The human inquires about Octavia Butler. The AI identifies her as an American science fiction author. The human then asks which books of hers were made into movies. The AI responds by mentioning the FX series Kindred, based on her novel of the same name. The human then asks about her contemporaries, and the AI lists Ursula K. Le Guin, Samuel R. Delany, and Joanna Russ.\n",
"\n",
"\n",
"{'role': 'human', 'content': 'What awards did she win?', 'uuid': 'a4bdc592-71a5-47d0-9c64-230b882aab48', 'created_at': '2023-06-26T23:37:56.383953Z', 'token_count': 8, 'metadata': {'system': {'entities': [], 'intent': 'The subject is asking about the awards someone won, likely referring to a specific individual.'}}}\n",
"{'role': 'ai', 'content': 'Octavia Butler won the Hugo Award, the Nebula Award, and the MacArthur Fellowship.', 'uuid': '60cc6e6b-7cd4-4a81-aebc-72ef997286b4', 'created_at': '2023-06-26T23:37:56.389935Z', 'token_count': 21, 'metadata': {'system': {'entities': [{'Label': 'PERSON', 'Matches': [{'End': 14, 'Start': 0, 'Text': 'Octavia Butler'}], 'Name': 'Octavia Butler'}, {'Label': 'WORK_OF_ART', 'Matches': [{'End': 33, 'Start': 19, 'Text': 'the Hugo Award'}], 'Name': 'the Hugo Award'}, {'Label': 'EVENT', 'Matches': [{'End': 81, 'Start': 57, 'Text': 'the MacArthur Fellowship'}], 'Name': 'the MacArthur Fellowship'}], 'intent': 'The subject is stating the accomplishments and awards received by Octavia Butler.'}}}\n",
"{'role': 'human', 'content': 'Which other women sci-fi writers might I want to read?', 'uuid': 'b189fc60-1510-4a4b-a503-899481d652de', 'created_at': '2023-06-26T23:37:56.395722Z', 'token_count': 14, 'metadata': {'system': {'entities': [], 'intent': 'The subject is looking for recommendations on women science fiction writers to read.'}}}\n",
"{'role': 'ai', 'content': 'You might want to read Ursula K. Le Guin or Joanna Russ.', 'uuid': '4be1ccbb-a915-45d6-9f18-7a0c1cbd9907', 'created_at': '2023-06-26T23:37:56.403596Z', 'token_count': 18, 'metadata': {'system': {'entities': [{'Label': 'ORG', 'Matches': [{'End': 40, 'Start': 23, 'Text': 'Ursula K. Le Guin'}], 'Name': 'Ursula K. Le Guin'}, {'Label': 'PERSON', 'Matches': [{'End': 55, 'Start': 44, 'Text': 'Joanna Russ'}], 'Name': 'Joanna Russ'}], 'intent': 'The subject is suggesting reading material and making a literary recommendation.'}}}\n",
"{'role': 'human', 'content': \"Write a short synopsis of Butler's book, Parable of the Sower. What is it about?\", 'uuid': 'ac3c5e3e-26a7-4f3b-aeb0-bba084e22753', 'created_at': '2023-06-26T23:37:56.410662Z', 'token_count': 23, 'metadata': {'system': {'entities': [{'Label': 'ORG', 'Matches': [{'End': 32, 'Start': 26, 'Text': 'Butler'}], 'Name': 'Butler'}, {'Label': 'WORK_OF_ART', 'Matches': [{'End': 61, 'Start': 41, 'Text': 'Parable of the Sower'}], 'Name': 'Parable of the Sower'}], 'intent': 'The subject is asking for a brief overview or summary of the book \"Parable of the Sower\" written by Butler.'}}}\n",
"{'role': 'ai', 'content': 'Parable of the Sower is a science fiction novel by Octavia Butler, published in 1993. It follows the story of Lauren Olamina, a young woman living in a dystopian future where society has collapsed due to environmental disasters, poverty, and violence.', 'uuid': '4a463b4c-bcab-473c-bed1-fc56a7a20ae2', 'created_at': '2023-06-26T23:37:56.41764Z', 'token_count': 56, 'metadata': {'system': {'entities': [{'Label': 'GPE', 'Matches': [{'End': 20, 'Start': 15, 'Text': 'Sower'}], 'Name': 'Sower'}, {'Label': 'PERSON', 'Matches': [{'End': 65, 'Start': 51, 'Text': 'Octavia Butler'}], 'Name': 'Octavia Butler'}, {'Label': 'DATE', 'Matches': [{'End': 84, 'Start': 80, 'Text': '1993'}], 'Name': '1993'}, {'Label': 'PERSON', 'Matches': [{'End': 124, 'Start': 110, 'Text': 'Lauren Olamina'}], 'Name': 'Lauren Olamina'}]}}}\n",
"{'role': 'human', 'content': \"WWhat is the book's relevance to the challenges facing contemporary society?\", 'uuid': '41bab0c7-5e20-40a4-9303-f82069977c91', 'created_at': '2023-06-26T23:38:03.559642Z', 'token_count': 16, 'metadata': {'system': {'entities': [{'Label': 'ORG', 'Matches': [{'End': 5, 'Start': 0, 'Text': 'WWhat'}], 'Name': 'WWhat'}]}}}\n",
"{'role': 'ai', 'content': 'Parable of the Sower is a prescient novel that speaks to the challenges facing contemporary society, such as climate change, economic inequality, and the rise of authoritarianism. It is a cautionary tale that warns of the dangers of ignoring these issues and the importance of taking action to address them.', 'uuid': 'bfd8146a-4632-4c8c-98b6-9468bb624339', 'created_at': '2023-06-26T23:38:03.589312Z', 'token_count': 62, 'metadata': {'system': {'entities': [{'Label': 'GPE', 'Matches': [{'End': 20, 'Start': 15, 'Text': 'Sower'}], 'Name': 'Sower'}]}}}\n"
"system :\n",
" {'content': 'The human inquires about Octavia Butler. The AI identifies her as an American science fiction author. The human then asks which books of hers were made into movies. The AI responds by mentioning the FX series Kindred, based on her novel of the same name. The human then asks about her contemporaries, and the AI lists Ursula K. Le Guin, Samuel R. Delany, and Joanna Russ.', 'additional_kwargs': {}}\n",
"human :\n",
" {'content': 'What awards did she win?', 'additional_kwargs': {'uuid': '6b733f0b-6778-49ae-b3ec-4e077c039f31', 'created_at': '2023-07-09T19:23:16.611232Z', 'token_count': 8, 'metadata': {'system': {'entities': [], 'intent': 'The subject is inquiring about the awards that someone, whose identity is not specified, has won.'}}}, 'example': False}\n",
"ai :\n",
" {'content': 'Octavia Butler won the Hugo Award, the Nebula Award, and the MacArthur Fellowship.', 'additional_kwargs': {'uuid': '2f6d80c6-3c08-4fd4-8d4e-7bbee341ac90', 'created_at': '2023-07-09T19:23:16.618947Z', 'token_count': 21, 'metadata': {'system': {'entities': [{'Label': 'PERSON', 'Matches': [{'End': 14, 'Start': 0, 'Text': 'Octavia Butler'}], 'Name': 'Octavia Butler'}, {'Label': 'WORK_OF_ART', 'Matches': [{'End': 33, 'Start': 19, 'Text': 'the Hugo Award'}], 'Name': 'the Hugo Award'}, {'Label': 'EVENT', 'Matches': [{'End': 81, 'Start': 57, 'Text': 'the MacArthur Fellowship'}], 'Name': 'the MacArthur Fellowship'}], 'intent': 'The subject is stating that Octavia Butler received the Hugo Award, the Nebula Award, and the MacArthur Fellowship.'}}}, 'example': False}\n",
"human :\n",
" {'content': 'Which other women sci-fi writers might I want to read?', 'additional_kwargs': {'uuid': 'ccdcc901-ea39-4981-862f-6fe22ab9289b', 'created_at': '2023-07-09T19:23:16.62678Z', 'token_count': 14, 'metadata': {'system': {'entities': [], 'intent': 'The subject is seeking recommendations for additional women science fiction writers to explore.'}}}, 'example': False}\n",
"ai :\n",
" {'content': 'You might want to read Ursula K. Le Guin or Joanna Russ.', 'additional_kwargs': {'uuid': '7977099a-0c62-4c98-bfff-465bbab6c9c3', 'created_at': '2023-07-09T19:23:16.631721Z', 'token_count': 18, 'metadata': {'system': {'entities': [{'Label': 'ORG', 'Matches': [{'End': 40, 'Start': 23, 'Text': 'Ursula K. Le Guin'}], 'Name': 'Ursula K. Le Guin'}, {'Label': 'PERSON', 'Matches': [{'End': 55, 'Start': 44, 'Text': 'Joanna Russ'}], 'Name': 'Joanna Russ'}], 'intent': 'The subject is suggesting that the person should consider reading the works of Ursula K. Le Guin or Joanna Russ.'}}}, 'example': False}\n",
"human :\n",
" {'content': \"Write a short synopsis of Butler's book, Parable of the Sower. What is it about?\", 'additional_kwargs': {'uuid': 'e439b7e6-286a-4278-a8cb-dc260fa2e089', 'created_at': '2023-07-09T19:23:16.63623Z', 'token_count': 23, 'metadata': {'system': {'entities': [{'Label': 'ORG', 'Matches': [{'End': 32, 'Start': 26, 'Text': 'Butler'}], 'Name': 'Butler'}, {'Label': 'WORK_OF_ART', 'Matches': [{'End': 61, 'Start': 41, 'Text': 'Parable of the Sower'}], 'Name': 'Parable of the Sower'}], 'intent': 'The subject is requesting a brief summary or explanation of the book \"Parable of the Sower\" by Butler.'}}}, 'example': False}\n",
"ai :\n",
" {'content': 'Parable of the Sower is a science fiction novel by Octavia Butler, published in 1993. It follows the story of Lauren Olamina, a young woman living in a dystopian future where society has collapsed due to environmental disasters, poverty, and violence.', 'additional_kwargs': {'uuid': '6760489b-19c9-41aa-8b45-fae6cb1d7ee6', 'created_at': '2023-07-09T19:23:16.647524Z', 'token_count': 56, 'metadata': {'foo': 'bar', 'system': {'entities': [{'Label': 'GPE', 'Matches': [{'End': 20, 'Start': 15, 'Text': 'Sower'}], 'Name': 'Sower'}, {'Label': 'PERSON', 'Matches': [{'End': 65, 'Start': 51, 'Text': 'Octavia Butler'}], 'Name': 'Octavia Butler'}, {'Label': 'DATE', 'Matches': [{'End': 84, 'Start': 80, 'Text': '1993'}], 'Name': '1993'}, {'Label': 'PERSON', 'Matches': [{'End': 124, 'Start': 110, 'Text': 'Lauren Olamina'}], 'Name': 'Lauren Olamina'}], 'intent': 'The subject is providing information about the novel \"Parable of the Sower\" by Octavia Butler, including its genre, publication date, and a brief summary of the plot.'}}}, 'example': False}\n",
"human :\n",
" {'content': \"What is the book's relevance to the challenges facing contemporary society?\", 'additional_kwargs': {'uuid': '7dbbbb93-492b-4739-800f-cad2b6e0e764', 'created_at': '2023-07-09T19:23:19.315182Z', 'token_count': 15, 'metadata': {'system': {'entities': [], 'intent': 'The subject is asking about the relevance of a book to the challenges currently faced by society.'}}}, 'example': False}\n",
"ai :\n",
" {'content': 'Parable of the Sower is a prescient novel that speaks to the challenges facing contemporary society, such as climate change, inequality, and violence. It is a cautionary tale that warns of the dangers of unchecked greed and the need for individuals to take responsibility for their own lives and the lives of those around them.', 'additional_kwargs': {'uuid': '3e14ac8f-b7c1-4360-958b-9f3eae1f784f', 'created_at': '2023-07-09T19:23:19.332517Z', 'token_count': 66, 'metadata': {'system': {'entities': [{'Label': 'GPE', 'Matches': [{'End': 20, 'Start': 15, 'Text': 'Sower'}], 'Name': 'Sower'}], 'intent': 'The subject is providing an analysis and evaluation of the novel \"Parable of the Sower\" and highlighting its relevance to contemporary societal challenges.'}}}, 'example': False}\n"
]
}
],
"source": [
"def print_messages(messages):\n",
" for m in messages:\n",
" print(m.to_dict())\n",
" print(m.type, \":\\n\", m.dict())\n",
"\n",
"\n",
"print(zep_chat_history.zep_summary)\n",
"print(memory.chat_memory.zep_summary)\n",
"print(\"\\n\")\n",
"print_messages(zep_chat_history.zep_messages)"
"print_messages(memory.chat_memory.messages)"
]
},
{
@@ -349,16 +347,18 @@
"source": [
"### Vector search over the Zep memory\n",
"\n",
"Zep provides native vector search over historical conversation memory. Embedding happens automatically.\n"
"Zep provides native vector search over historical conversation memory via the `ZepRetriever`.\n",
"\n",
"You can use the `ZepRetriever` with chains that support passing in a Langchain `Retriever` object.\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 11,
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-25T15:09:50.751203Z",
"start_time": "2023-05-25T15:09:50.495050Z"
"end_time": "2023-07-09T19:24:30.781893Z",
"start_time": "2023-07-09T19:24:30.595650Z"
}
},
"outputs": [
@@ -366,38 +366,36 @@
"name": "stdout",
"output_type": "stream",
"text": [
"{'uuid': 'b189fc60-1510-4a4b-a503-899481d652de', 'created_at': '2023-06-26T23:37:56.395722Z', 'role': 'human', 'content': 'Which other women sci-fi writers might I want to read?', 'metadata': {'system': {'entities': [], 'intent': 'The subject is looking for recommendations on women science fiction writers to read.'}}, 'token_count': 14} 0.9119619869747062\n",
"{'uuid': '4be1ccbb-a915-45d6-9f18-7a0c1cbd9907', 'created_at': '2023-06-26T23:37:56.403596Z', 'role': 'ai', 'content': 'You might want to read Ursula K. Le Guin or Joanna Russ.', 'metadata': {'system': {'entities': [{'Label': 'ORG', 'Matches': [{'End': 40, 'Start': 23, 'Text': 'Ursula K. Le Guin'}], 'Name': 'Ursula K. Le Guin'}, {'Label': 'PERSON', 'Matches': [{'End': 55, 'Start': 44, 'Text': 'Joanna Russ'}], 'Name': 'Joanna Russ'}], 'intent': 'The subject is suggesting reading material and making a literary recommendation.'}}, 'token_count': 18} 0.8534346954749745\n",
"{'uuid': '76ec2a3d-b908-4c23-a55d-71ff92865a7a', 'created_at': '2023-06-26T23:37:56.378345Z', 'role': 'ai', 'content': \"Octavia Butler's contemporaries included Ursula K. Le Guin, Samuel R. Delany, and Joanna Russ.\", 'metadata': {'system': {'entities': [{'Label': 'PERSON', 'Matches': [{'End': 16, 'Start': 0, 'Text': \"Octavia Butler's\"}], 'Name': \"Octavia Butler's\"}, {'Label': 'ORG', 'Matches': [{'End': 58, 'Start': 41, 'Text': 'Ursula K. Le Guin'}], 'Name': 'Ursula K. Le Guin'}, {'Label': 'PERSON', 'Matches': [{'End': 76, 'Start': 60, 'Text': 'Samuel R. Delany'}], 'Name': 'Samuel R. Delany'}, {'Label': 'PERSON', 'Matches': [{'End': 93, 'Start': 82, 'Text': 'Joanna Russ'}], 'Name': 'Joanna Russ'}], 'intent': 'The subject is stating the contemporaries of Octavia Butler, who are also science fiction writers.'}}, 'token_count': 27} 0.8523930955780226\n",
"{'uuid': '1feb02c7-63c9-4616-854d-0d97fb590ea5', 'created_at': '2023-06-26T23:37:56.313009Z', 'role': 'human', 'content': 'Who was Octavia Butler?', 'metadata': {'system': {'entities': [{'Label': 'PERSON', 'Matches': [{'End': 22, 'Start': 8, 'Text': 'Octavia Butler'}], 'Name': 'Octavia Butler'}], 'intent': 'The subject is asking about the identity of Octavia Butler, likely seeking information about her background or accomplishments.'}}, 'token_count': 8} 0.8236355436055457\n",
"{'uuid': 'ebe4696d-b5fa-4ca0-88c9-da794d9611ab', 'created_at': '2023-06-26T23:37:56.332247Z', 'role': 'ai', 'content': 'Octavia Estelle Butler (June 22, 1947 – February 24, 2006) was an American science fiction author.', 'metadata': {'system': {'entities': [{'Label': 'PERSON', 'Matches': [{'End': 22, 'Start': 0, 'Text': 'Octavia Estelle Butler'}], 'Name': 'Octavia Estelle Butler'}, {'Label': 'DATE', 'Matches': [{'End': 37, 'Start': 24, 'Text': 'June 22, 1947'}], 'Name': 'June 22, 1947'}, {'Label': 'DATE', 'Matches': [{'End': 57, 'Start': 40, 'Text': 'February 24, 2006'}], 'Name': 'February 24, 2006'}, {'Label': 'NORP', 'Matches': [{'End': 74, 'Start': 66, 'Text': 'American'}], 'Name': 'American'}], 'intent': 'The subject is making a statement about the background and profession of Octavia Estelle Butler, an American author.'}}, 'token_count': 31} 0.8206687242257686\n",
"{'uuid': '60cc6e6b-7cd4-4a81-aebc-72ef997286b4', 'created_at': '2023-06-26T23:37:56.389935Z', 'role': 'ai', 'content': 'Octavia Butler won the Hugo Award, the Nebula Award, and the MacArthur Fellowship.', 'metadata': {'system': {'entities': [{'Label': 'PERSON', 'Matches': [{'End': 14, 'Start': 0, 'Text': 'Octavia Butler'}], 'Name': 'Octavia Butler'}, {'Label': 'WORK_OF_ART', 'Matches': [{'End': 33, 'Start': 19, 'Text': 'the Hugo Award'}], 'Name': 'the Hugo Award'}, {'Label': 'EVENT', 'Matches': [{'End': 81, 'Start': 57, 'Text': 'the MacArthur Fellowship'}], 'Name': 'the MacArthur Fellowship'}], 'intent': 'The subject is stating the accomplishments and awards received by Octavia Butler.'}}, 'token_count': 21} 0.8194249796585193\n",
"{'uuid': '0fa4f336-909d-4880-b01a-8e80e91fa8f2', 'created_at': '2023-06-26T23:37:56.344552Z', 'role': 'human', 'content': 'Which books of hers were made into movies?', 'metadata': {'system': {'entities': [], 'intent': 'The subject is inquiring about which books written by an unknown female author were adapted into movies.'}}, 'token_count': 11} 0.7955105671310818\n",
"{'uuid': 'f91de7f2-4b84-4c5a-8a33-a71f38f3a59c', 'created_at': '2023-06-26T23:37:56.368146Z', 'role': 'human', 'content': 'Who were her contemporaries?', 'metadata': {'system': {'entities': [], 'intent': 'The subject is asking about the people who lived during the same time period as a specific individual.'}}, 'token_count': 8} 0.7942358617914813\n",
"{'uuid': '4a463b4c-bcab-473c-bed1-fc56a7a20ae2', 'created_at': '2023-06-26T23:37:56.41764Z', 'role': 'ai', 'content': 'Parable of the Sower is a science fiction novel by Octavia Butler, published in 1993. It follows the story of Lauren Olamina, a young woman living in a dystopian future where society has collapsed due to environmental disasters, poverty, and violence.', 'metadata': {'system': {'entities': [{'Label': 'GPE', 'Matches': [{'End': 20, 'Start': 15, 'Text': 'Sower'}], 'Name': 'Sower'}, {'Label': 'PERSON', 'Matches': [{'End': 65, 'Start': 51, 'Text': 'Octavia Butler'}], 'Name': 'Octavia Butler'}, {'Label': 'DATE', 'Matches': [{'End': 84, 'Start': 80, 'Text': '1993'}], 'Name': '1993'}, {'Label': 'PERSON', 'Matches': [{'End': 124, 'Start': 110, 'Text': 'Lauren Olamina'}], 'Name': 'Lauren Olamina'}]}}, 'token_count': 56} 0.7816448549236643\n",
"{'uuid': '6161d934-a629-4ba2-8bba-0b0996c93964', 'created_at': '2023-06-26T23:37:56.358632Z', 'role': 'ai', 'content': \"The most well-known adaptation of Octavia Butler's work is the FX series Kindred, based on her novel of the same name.\", 'metadata': {'system': {'entities': [{'Label': 'PERSON', 'Matches': [{'End': 50, 'Start': 34, 'Text': \"Octavia Butler's\"}], 'Name': \"Octavia Butler's\"}, {'Label': 'ORG', 'Matches': [{'End': 65, 'Start': 63, 'Text': 'FX'}], 'Name': 'FX'}, {'Label': 'GPE', 'Matches': [{'End': 80, 'Start': 73, 'Text': 'Kindred'}], 'Name': 'Kindred'}], 'intent': \"The subject is discussing Octavia Butler's work being adapted into a TV series called Kindred.\"}}, 'token_count': 29} 0.7815841371388998\n"
"{'uuid': 'ccdcc901-ea39-4981-862f-6fe22ab9289b', 'created_at': '2023-07-09T19:23:16.62678Z', 'role': 'human', 'content': 'Which other women sci-fi writers might I want to read?', 'metadata': {'system': {'entities': [], 'intent': 'The subject is seeking recommendations for additional women science fiction writers to explore.'}}, 'token_count': 14} 0.9119619869747062\n",
"{'uuid': '7977099a-0c62-4c98-bfff-465bbab6c9c3', 'created_at': '2023-07-09T19:23:16.631721Z', 'role': 'ai', 'content': 'You might want to read Ursula K. Le Guin or Joanna Russ.', 'metadata': {'system': {'entities': [{'Label': 'ORG', 'Matches': [{'End': 40, 'Start': 23, 'Text': 'Ursula K. Le Guin'}], 'Name': 'Ursula K. Le Guin'}, {'Label': 'PERSON', 'Matches': [{'End': 55, 'Start': 44, 'Text': 'Joanna Russ'}], 'Name': 'Joanna Russ'}], 'intent': 'The subject is suggesting that the person should consider reading the works of Ursula K. Le Guin or Joanna Russ.'}}, 'token_count': 18} 0.8534346954749745\n",
"{'uuid': 'b05e2eb5-c103-4973-9458-928726f08655', 'created_at': '2023-07-09T19:23:16.603098Z', 'role': 'ai', 'content': \"Octavia Butler's contemporaries included Ursula K. Le Guin, Samuel R. Delany, and Joanna Russ.\", 'metadata': {'system': {'entities': [{'Label': 'PERSON', 'Matches': [{'End': 16, 'Start': 0, 'Text': \"Octavia Butler's\"}], 'Name': \"Octavia Butler's\"}, {'Label': 'ORG', 'Matches': [{'End': 58, 'Start': 41, 'Text': 'Ursula K. Le Guin'}], 'Name': 'Ursula K. Le Guin'}, {'Label': 'PERSON', 'Matches': [{'End': 76, 'Start': 60, 'Text': 'Samuel R. Delany'}], 'Name': 'Samuel R. Delany'}, {'Label': 'PERSON', 'Matches': [{'End': 93, 'Start': 82, 'Text': 'Joanna Russ'}], 'Name': 'Joanna Russ'}], 'intent': \"The subject is stating that Octavia Butler's contemporaries included Ursula K. Le Guin, Samuel R. Delany, and Joanna Russ.\"}}, 'token_count': 27} 0.8523831524040919\n",
"{'uuid': 'e346f02b-f854-435d-b6ba-fb394a416b9b', 'created_at': '2023-07-09T19:23:16.556587Z', 'role': 'human', 'content': 'Who was Octavia Butler?', 'metadata': {'system': {'entities': [{'Label': 'PERSON', 'Matches': [{'End': 22, 'Start': 8, 'Text': 'Octavia Butler'}], 'Name': 'Octavia Butler'}], 'intent': 'The subject is asking for information about the identity or background of Octavia Butler.'}}, 'token_count': 8} 0.8236355436055457\n",
"{'uuid': '42ff41d2-c63a-4d5b-b19b-d9a87105cfc3', 'created_at': '2023-07-09T19:23:16.578022Z', 'role': 'ai', 'content': 'Octavia Estelle Butler (June 22, 1947 – February 24, 2006) was an American science fiction author.', 'metadata': {'system': {'entities': [{'Label': 'PERSON', 'Matches': [{'End': 22, 'Start': 0, 'Text': 'Octavia Estelle Butler'}], 'Name': 'Octavia Estelle Butler'}, {'Label': 'DATE', 'Matches': [{'End': 37, 'Start': 24, 'Text': 'June 22, 1947'}], 'Name': 'June 22, 1947'}, {'Label': 'DATE', 'Matches': [{'End': 57, 'Start': 40, 'Text': 'February 24, 2006'}], 'Name': 'February 24, 2006'}, {'Label': 'NORP', 'Matches': [{'End': 74, 'Start': 66, 'Text': 'American'}], 'Name': 'American'}], 'intent': 'The subject is providing information about Octavia Estelle Butler, who was an American science fiction author.'}}, 'token_count': 31} 0.8206687242257686\n",
"{'uuid': '2f6d80c6-3c08-4fd4-8d4e-7bbee341ac90', 'created_at': '2023-07-09T19:23:16.618947Z', 'role': 'ai', 'content': 'Octavia Butler won the Hugo Award, the Nebula Award, and the MacArthur Fellowship.', 'metadata': {'system': {'entities': [{'Label': 'PERSON', 'Matches': [{'End': 14, 'Start': 0, 'Text': 'Octavia Butler'}], 'Name': 'Octavia Butler'}, {'Label': 'WORK_OF_ART', 'Matches': [{'End': 33, 'Start': 19, 'Text': 'the Hugo Award'}], 'Name': 'the Hugo Award'}, {'Label': 'EVENT', 'Matches': [{'End': 81, 'Start': 57, 'Text': 'the MacArthur Fellowship'}], 'Name': 'the MacArthur Fellowship'}], 'intent': 'The subject is stating that Octavia Butler received the Hugo Award, the Nebula Award, and the MacArthur Fellowship.'}}, 'token_count': 21} 0.8199012397683285\n"
]
}
],
"source": [
"search_results = zep_chat_history.search(\"who are some famous women sci-fi authors?\")\n",
"retriever = ZepRetriever(\n",
" session_id=session_id,\n",
" url=ZEP_API_URL,\n",
" api_key=zep_api_key,\n",
")\n",
"\n",
"search_results = memory.chat_memory.search(\"who are some famous women sci-fi authors?\")\n",
"for r in search_results:\n",
" print(r.message, r.dist)"
" if r.dist > 0.8: # Only print results with similarity of 0.8 or higher\n",
" print(r.message, r.dist)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
"source": [],
"metadata": {
"collapsed": false
}
}
],
"metadata": {

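Review note on the search cell in the hunk above: the updated loop prints only results whose `dist` exceeds 0.8. The following is a minimal, self-contained sketch of that filtering step, not the Zep client itself. `SearchResult` and `filter_by_similarity` are hypothetical names introduced here for illustration, and the scores are rounded from the recorded output above.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SearchResult:
    """Hypothetical stand-in for a search hit: message payload plus similarity score."""

    message: dict
    dist: float


def filter_by_similarity(
    results: Iterable[SearchResult], threshold: float = 0.8
) -> List[SearchResult]:
    """Keep hits scoring strictly above the threshold, mirroring `r.dist > 0.8`."""
    return [r for r in results if r.dist > threshold]


# Sample hits; contents and (rounded) scores taken from the recorded output.
results = [
    SearchResult({"role": "human", "content": "Which other women sci-fi writers might I want to read?"}, 0.912),
    SearchResult({"role": "human", "content": "Which books of hers were made into movies?"}, 0.796),
    SearchResult({"role": "ai", "content": "You might want to read Ursula K. Le Guin or Joanna Russ."}, 0.853),
]

for r in filter_by_similarity(results):
    print(r.message["content"], r.dist)
```

The comparison is strict, so a score of exactly 0.8 is dropped, although the cell's comment reads "0.8 or higher". Switching to `>=` would include the boundary value if that is the intent.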
@@ -0,0 +1,162 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e49f1e0d",
"metadata": {},
"source": [
"# JinaChat\n",
"\n",
"This notebook covers how to get started with JinaChat chat models."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "522686de",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.chat_models import JinaChat\n",
"from langchain.prompts.chat import (\n",
" ChatPromptTemplate,\n",
" SystemMessagePromptTemplate,\n",
" AIMessagePromptTemplate,\n",
" HumanMessagePromptTemplate,\n",
")\n",
"from langchain.schema import AIMessage, HumanMessage, SystemMessage"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "62e0dbc3",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"chat = JinaChat(temperature=0)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "ce16ad78-8e6f-48cd-954e-98be75eb5836",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"J'aime programmer.\", additional_kwargs={}, example=False)"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"messages = [\n",
" SystemMessage(\n",
" content=\"You are a helpful assistant that translates English to French.\"\n",
" ),\n",
" HumanMessage(\n",
" content=\"Translate this sentence from English to French. I love programming.\"\n",
" ),\n",
"]\n",
"chat(messages)"
]
},
{
"cell_type": "markdown",
"id": "778f912a-66ea-4a5d-b3de-6c7db4baba26",
"metadata": {},
"source": [
"You can make use of templating by using a `MessagePromptTemplate`. You can build a `ChatPromptTemplate` from one or more `MessagePromptTemplates`. You can use `ChatPromptTemplate`'s `format_prompt` -- this returns a `PromptValue`, which you can convert to a string or Message object, depending on whether you want to use the formatted value as input to an llm or chat model.\n",
"\n",
"For convenience, there is a `from_template` method exposed on the template. If you were to use this template, this is what it would look like:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "180c5cc8",
"metadata": {},
"outputs": [],
"source": [
"template = (\n",
" \"You are a helpful assistant that translates {input_language} to {output_language}.\"\n",
")\n",
"system_message_prompt = SystemMessagePromptTemplate.from_template(template)\n",
"human_template = \"{text}\"\n",
"human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "fbb043e6",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"J'aime programmer.\", additional_kwargs={}, example=False)"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat_prompt = ChatPromptTemplate.from_messages(\n",
" [system_message_prompt, human_message_prompt]\n",
")\n",
"\n",
"# get a chat completion from the formatted messages\n",
"chat(\n",
" chat_prompt.format_prompt(\n",
" input_language=\"English\", output_language=\"French\", text=\"I love programming.\"\n",
" ).to_messages()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c095285d",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
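
Review note on the templating cells in the notebook above: they fill `{input_language}`, `{output_language}`, and `{text}` before the messages reach the chat model. As a rough sketch of that substitution step only, the snippet below uses plain `str.format` and `(role, content)` tuples. `format_chat_prompt` is a hypothetical helper introduced for illustration; this is an assumption-laden approximation, not how LangChain implements `ChatPromptTemplate`.

```python
from typing import List, Tuple

# Template strings copied from the notebook cell.
system_template = "You are a helpful assistant that translates {input_language} to {output_language}."
human_template = "{text}"


def format_chat_prompt(
    templates: List[Tuple[str, str]], **values: str
) -> List[Tuple[str, str]]:
    """Hypothetical helper: fill each (role, template) pair with the given values."""
    # str.format ignores keyword arguments a template does not reference,
    # so every template can be given the full set of values.
    return [(role, template.format(**values)) for role, template in templates]


messages = format_chat_prompt(
    [("system", system_template), ("human", human_template)],
    input_language="English",
    output_language="French",
    text="I love programming.",
)

for role, content in messages:
    print(f"{role}: {content}")
```

In the notebook itself the equivalent step is `chat_prompt.format_prompt(...).to_messages()`, which yields message objects rather than tuples.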

@@ -5,8 +5,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Use LangChain, GPT and Deep Lake to work with code base\n",
"In this tutorial, we are going to use Langchain + Deep Lake with GPT to analyze the code base of the LangChain itself. "
"# Use LangChain, GPT and Activeloop's Deep Lake to work with code base\n",
"In this tutorial, we are going to use Langchain + Activeloop's Deep Lake with GPT to analyze the code base of the LangChain itself. "
]
},
{
@@ -60,7 +60,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": null,
"metadata": {
"tags": []
},
@@ -81,19 +81,11 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"outputs": [],
"source": [
"import os\n",
"from getpass import getpass\n",
@@ -112,21 +104,14 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 3,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"outputs": [],
"source": [
"os.environ[\"ACTIVELOOP_TOKEN\"] = getpass.getpass(\"Activeloop Token:\")"
"activeloop_token = getpass(\"Activeloop Token:\")\n",
"os.environ[\"ACTIVELOOP_TOKEN\"] = activeloop_token"
]
},
{
@@ -149,19 +134,20 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!ls \"../../../..\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1147\n"
]
}
],
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
"\n",
@@ -189,180 +175,11 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Created a chunk of size 1620, which is longer than the specified 1000\n",
"Created a chunk of size 1213, which is longer than the specified 1000\n",
"Created a chunk of size 1263, which is longer than the specified 1000\n",
"Created a chunk of size 1448, which is longer than the specified 1000\n",
"Created a chunk of size 1120, which is longer than the specified 1000\n",
"Created a chunk of size 1148, which is longer than the specified 1000\n",
"Created a chunk of size 1826, which is longer than the specified 1000\n",
"Created a chunk of size 1260, which is longer than the specified 1000\n",
"Created a chunk of size 1195, which is longer than the specified 1000\n",
"Created a chunk of size 2147, which is longer than the specified 1000\n",
"Created a chunk of size 1410, which is longer than the specified 1000\n",
"Created a chunk of size 1269, which is longer than the specified 1000\n",
"Created a chunk of size 1030, which is longer than the specified 1000\n",
"Created a chunk of size 1046, which is longer than the specified 1000\n",
"Created a chunk of size 1024, which is longer than the specified 1000\n",
"Created a chunk of size 1026, which is longer than the specified 1000\n",
"Created a chunk of size 1285, which is longer than the specified 1000\n",
"Created a chunk of size 1370, which is longer than the specified 1000\n",
"Created a chunk of size 1031, which is longer than the specified 1000\n",
"Created a chunk of size 1999, which is longer than the specified 1000\n",
"Created a chunk of size 1029, which is longer than the specified 1000\n",
"Created a chunk of size 1120, which is longer than the specified 1000\n",
"Created a chunk of size 1033, which is longer than the specified 1000\n",
"Created a chunk of size 1143, which is longer than the specified 1000\n",
"Created a chunk of size 1416, which is longer than the specified 1000\n",
"Created a chunk of size 2482, which is longer than the specified 1000\n",
"Created a chunk of size 1890, which is longer than the specified 1000\n",
"Created a chunk of size 1418, which is longer than the specified 1000\n",
"Created a chunk of size 1848, which is longer than the specified 1000\n",
"Created a chunk of size 1069, which is longer than the specified 1000\n",
"Created a chunk of size 2369, which is longer than the specified 1000\n",
"Created a chunk of size 1045, which is longer than the specified 1000\n",
"Created a chunk of size 1501, which is longer than the specified 1000\n",
"Created a chunk of size 1208, which is longer than the specified 1000\n",
"Created a chunk of size 1950, which is longer than the specified 1000\n",
"Created a chunk of size 1283, which is longer than the specified 1000\n",
"Created a chunk of size 1414, which is longer than the specified 1000\n",
"Created a chunk of size 1304, which is longer than the specified 1000\n",
"Created a chunk of size 1224, which is longer than the specified 1000\n",
"Created a chunk of size 1060, which is longer than the specified 1000\n",
"Created a chunk of size 2461, which is longer than the specified 1000\n",
"Created a chunk of size 1099, which is longer than the specified 1000\n",
"Created a chunk of size 1178, which is longer than the specified 1000\n",
"Created a chunk of size 1449, which is longer than the specified 1000\n",
"Created a chunk of size 1345, which is longer than the specified 1000\n",
"Created a chunk of size 3359, which is longer than the specified 1000\n",
"Created a chunk of size 2248, which is longer than the specified 1000\n",
"Created a chunk of size 1589, which is longer than the specified 1000\n",
"Created a chunk of size 2104, which is longer than the specified 1000\n",
"Created a chunk of size 1505, which is longer than the specified 1000\n",
"Created a chunk of size 1387, which is longer than the specified 1000\n",
"Created a chunk of size 1215, which is longer than the specified 1000\n",
"Created a chunk of size 1240, which is longer than the specified 1000\n",
"Created a chunk of size 1635, which is longer than the specified 1000\n",
"Created a chunk of size 1075, which is longer than the specified 1000\n",
"Created a chunk of size 2180, which is longer than the specified 1000\n",
"Created a chunk of size 1791, which is longer than the specified 1000\n",
"Created a chunk of size 1555, which is longer than the specified 1000\n",
"Created a chunk of size 1082, which is longer than the specified 1000\n",
"Created a chunk of size 1225, which is longer than the specified 1000\n",
"Created a chunk of size 1287, which is longer than the specified 1000\n",
"Created a chunk of size 1085, which is longer than the specified 1000\n",
"Created a chunk of size 1117, which is longer than the specified 1000\n",
"Created a chunk of size 1966, which is longer than the specified 1000\n",
"Created a chunk of size 1150, which is longer than the specified 1000\n",
"Created a chunk of size 1285, which is longer than the specified 1000\n",
"Created a chunk of size 1150, which is longer than the specified 1000\n",
"Created a chunk of size 1585, which is longer than the specified 1000\n",
"Created a chunk of size 1208, which is longer than the specified 1000\n",
"Created a chunk of size 1267, which is longer than the specified 1000\n",
"Created a chunk of size 1542, which is longer than the specified 1000\n",
"Created a chunk of size 1183, which is longer than the specified 1000\n",
"Created a chunk of size 2424, which is longer than the specified 1000\n",
"Created a chunk of size 1017, which is longer than the specified 1000\n",
"Created a chunk of size 1304, which is longer than the specified 1000\n",
"Created a chunk of size 1379, which is longer than the specified 1000\n",
"Created a chunk of size 1324, which is longer than the specified 1000\n",
"Created a chunk of size 1205, which is longer than the specified 1000\n",
"Created a chunk of size 1056, which is longer than the specified 1000\n",
"Created a chunk of size 1195, which is longer than the specified 1000\n",
"Created a chunk of size 3608, which is longer than the specified 1000\n",
"Created a chunk of size 1058, which is longer than the specified 1000\n",
"Created a chunk of size 1075, which is longer than the specified 1000\n",
"Created a chunk of size 1217, which is longer than the specified 1000\n",
"Created a chunk of size 1109, which is longer than the specified 1000\n",
"Created a chunk of size 1440, which is longer than the specified 1000\n",
"Created a chunk of size 1046, which is longer than the specified 1000\n",
"Created a chunk of size 1220, which is longer than the specified 1000\n",
"Created a chunk of size 1403, which is longer than the specified 1000\n",
"Created a chunk of size 1241, which is longer than the specified 1000\n",
"Created a chunk of size 1427, which is longer than the specified 1000\n",
"Created a chunk of size 1049, which is longer than the specified 1000\n",
"Created a chunk of size 1580, which is longer than the specified 1000\n",
"Created a chunk of size 1565, which is longer than the specified 1000\n",
"Created a chunk of size 1131, which is longer than the specified 1000\n",
"Created a chunk of size 1425, which is longer than the specified 1000\n",
"Created a chunk of size 1054, which is longer than the specified 1000\n",
"Created a chunk of size 1027, which is longer than the specified 1000\n",
"Created a chunk of size 2559, which is longer than the specified 1000\n",
"Created a chunk of size 1028, which is longer than the specified 1000\n",
"Created a chunk of size 1382, which is longer than the specified 1000\n",
"Created a chunk of size 1888, which is longer than the specified 1000\n",
"Created a chunk of size 1475, which is longer than the specified 1000\n",
"Created a chunk of size 1652, which is longer than the specified 1000\n",
"Created a chunk of size 1891, which is longer than the specified 1000\n",
"Created a chunk of size 1899, which is longer than the specified 1000\n",
"Created a chunk of size 1021, which is longer than the specified 1000\n",
"Created a chunk of size 1085, which is longer than the specified 1000\n",
"Created a chunk of size 1854, which is longer than the specified 1000\n",
"Created a chunk of size 1672, which is longer than the specified 1000\n",
"Created a chunk of size 2537, which is longer than the specified 1000\n",
"Created a chunk of size 1251, which is longer than the specified 1000\n",
"Created a chunk of size 1734, which is longer than the specified 1000\n",
"Created a chunk of size 1642, which is longer than the specified 1000\n",
"Created a chunk of size 1376, which is longer than the specified 1000\n",
"Created a chunk of size 1253, which is longer than the specified 1000\n",
"Created a chunk of size 1642, which is longer than the specified 1000\n",
"Created a chunk of size 1419, which is longer than the specified 1000\n",
"Created a chunk of size 1438, which is longer than the specified 1000\n",
"Created a chunk of size 1427, which is longer than the specified 1000\n",
"Created a chunk of size 1684, which is longer than the specified 1000\n",
"Created a chunk of size 1760, which is longer than the specified 1000\n",
"Created a chunk of size 1157, which is longer than the specified 1000\n",
"Created a chunk of size 2504, which is longer than the specified 1000\n",
"Created a chunk of size 1082, which is longer than the specified 1000\n",
"Created a chunk of size 2268, which is longer than the specified 1000\n",
"Created a chunk of size 1784, which is longer than the specified 1000\n",
"Created a chunk of size 1311, which is longer than the specified 1000\n",
"Created a chunk of size 2972, which is longer than the specified 1000\n",
"Created a chunk of size 1144, which is longer than the specified 1000\n",
"Created a chunk of size 1825, which is longer than the specified 1000\n",
"Created a chunk of size 1508, which is longer than the specified 1000\n",
"Created a chunk of size 2901, which is longer than the specified 1000\n",
"Created a chunk of size 1715, which is longer than the specified 1000\n",
"Created a chunk of size 1062, which is longer than the specified 1000\n",
"Created a chunk of size 1206, which is longer than the specified 1000\n",
"Created a chunk of size 1102, which is longer than the specified 1000\n",
"Created a chunk of size 1184, which is longer than the specified 1000\n",
"Created a chunk of size 1002, which is longer than the specified 1000\n",
"Created a chunk of size 1065, which is longer than the specified 1000\n",
"Created a chunk of size 1871, which is longer than the specified 1000\n",
"Created a chunk of size 1754, which is longer than the specified 1000\n",
"Created a chunk of size 2413, which is longer than the specified 1000\n",
"Created a chunk of size 1771, which is longer than the specified 1000\n",
"Created a chunk of size 2054, which is longer than the specified 1000\n",
"Created a chunk of size 2000, which is longer than the specified 1000\n",
"Created a chunk of size 2061, which is longer than the specified 1000\n",
"Created a chunk of size 1066, which is longer than the specified 1000\n",
"Created a chunk of size 1419, which is longer than the specified 1000\n",
"Created a chunk of size 1368, which is longer than the specified 1000\n",
"Created a chunk of size 1008, which is longer than the specified 1000\n",
"Created a chunk of size 1227, which is longer than the specified 1000\n",
"Created a chunk of size 1745, which is longer than the specified 1000\n",
"Created a chunk of size 2296, which is longer than the specified 1000\n",
"Created a chunk of size 1083, which is longer than the specified 1000\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"3477\n"
]
}
],
"outputs": [],
"source": [
"from langchain.text_splitter import CharacterTextSplitter\n",
"\n",
@@ -383,22 +200,11 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"OpenAIEmbeddings(client=<class 'openai.api_resources.embedding.Embedding'>, model='text-embedding-ada-002', document_model_name='text-embedding-ada-002', query_model_name='text-embedding-ada-002', embedding_ctx_length=8191, openai_api_key=None, openai_organization=None, allowed_special=set(), disallowed_special='all', chunk_size=1000, max_retries=6)"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"\n",
@@ -417,11 +223,33 @@
"from langchain.vectorstores import DeepLake\n",
"\n",
"db = DeepLake.from_documents(\n",
" texts, embeddings, dataset_path=f\"hub://{DEEPLAKE_ACCOUNT_NAME}/langchain-code\"\n",
" texts, embeddings, dataset_path=\"hub://<org_id>/langchain-code\"\n",
")\n",
"db"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"`Optional`: You can also host the vector store in Deep Lake's Managed Tensor Database and run queries there. To do so, specify the runtime parameter as {'tensor_db': True} when creating the vector store. Queries then execute on the Managed Tensor Database rather than on the client side. This option does not apply to datasets stored locally or in memory. A vector store that was already created outside the Managed Tensor Database can be transferred to it by following the prescribed steps."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# from langchain.vectorstores import DeepLake\n",
"\n",
"# db = DeepLake.from_documents(\n",
"# texts, embeddings, dataset_path=\"hub://<org_id>/langchain-code\", runtime={\"tensor_db\": True}\n",
"# )\n",
"# db"
]
},
{
"attachments": {},
"cell_type": "markdown",
@@ -433,66 +261,14 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"-"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"This dataset can be visualized in Jupyter Notebook by ds.visualize() or at https://app.activeloop.ai/user_name/langchain-code\n",
"\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"hub://user_name/langchain-code loaded successfully.\n",
"\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Deep Lake Dataset in hub://user_name/langchain-code already exists, loading from the storage\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dataset(path='hub://user_name/langchain-code', read_only=True, tensors=['embedding', 'ids', 'metadata', 'text'])\n",
"\n",
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding generic (3477, 1536) float32 None \n",
" ids text (3477, 1) str None \n",
" metadata json (3477, 1) str None \n",
" text text (3477, 1) str None \n"
]
}
],
"outputs": [],
"source": [
"db = DeepLake(\n",
" dataset_path=f\"hub://{DEEPLAKE_ACCOUNT_NAME}/langchain-code\",\n",
" dataset_path=\"hub://<org_id>/langchain-code\",\n",
" read_only=True,\n",
" embedding_function=embeddings,\n",
")"
@@ -500,7 +276,7 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": null,
"metadata": {
"tags": []
},
@@ -523,7 +299,7 @@
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": null,
"metadata": {
"tags": []
},
@@ -545,7 +321,7 @@
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": null,
"metadata": {
"tags": []
},
@@ -658,7 +434,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.9.6"
}
},
"nbformat": 4,

View File

@@ -5,8 +5,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Analysis of Twitter the-algorithm source code with LangChain, GPT4 and Deep Lake\n",
"In this tutorial, we are going to use Langchain + Deep Lake with GPT4 to analyze the code base of the twitter algorithm. "
"# Analysis of Twitter the-algorithm source code with LangChain, GPT4 and Activeloop's Deep Lake\n",
"In this tutorial, we are going to use LangChain + Activeloop's Deep Lake with GPT4 to analyze the code base of the Twitter algorithm. "
]
},
{
@@ -15,7 +15,7 @@
"metadata": {},
"outputs": [],
"source": [
"!python3 -m pip install --upgrade langchain deeplake openai tiktoken"
"!python3 -m pip install --upgrade langchain 'deeplake[enterprise]' openai tiktoken"
]
},
{
@@ -41,7 +41,8 @@
"from langchain.vectorstores import DeepLake\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")\n",
"os.environ[\"ACTIVELOOP_TOKEN\"] = getpass.getpass(\"Activeloop Token:\")"
"activeloop_token = getpass.getpass(\"Activeloop Token:\")\n",
"os.environ[\"ACTIVELOOP_TOKEN\"] = activeloop_token"
]
},
{
@@ -149,6 +150,29 @@
"db.add_documents(texts)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"`Optional`: You can also host the vector store in Deep Lake's Managed Tensor Database and run queries there. To do so, specify the runtime parameter as {'tensor_db': True} when creating the vector store. Queries then execute on the Managed Tensor Database rather than on the client side. This option does not apply to datasets stored locally or in memory. A vector store that was already created outside the Managed Tensor Database can be transferred to it by following the prescribed steps."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# username = \"davitbun\" # replace with your username from app.activeloop.ai\n",
"# db = DeepLake(\n",
"# dataset_path=f\"hub://{username}/twitter-algorithm\",\n",
"# embedding_function=embeddings,\n",
"# runtime={\"tensor_db\": True}\n",
"# )\n",
"# db.add_documents(texts)"
]
},
{
"attachments": {},
"cell_type": "markdown",
@@ -176,6 +200,7 @@
" dataset_path=\"hub://davitbun/twitter-algorithm\",\n",
" read_only=True,\n",
" embedding_function=embeddings,\n",
" \n",
")"
]
},

View File

@@ -1,16 +1,18 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Question answering over a group chat messages\n",
"In this tutorial, we are going to use Langchain + Deep Lake with GPT4 to semantically search and ask questions over a group chat.\n",
"# Question answering over group chat messages using Activeloop's Deep Lake\n",
"In this tutorial, we are going to use LangChain + Activeloop's Deep Lake with GPT4 to semantically search and ask questions over a group chat.\n",
"\n",
"View a working demo [here](https://twitter.com/thisissukh_/status/1647223328363679745)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -23,10 +25,11 @@
"metadata": {},
"outputs": [],
"source": [
"!python3 -m pip install --upgrade langchain deeplake openai tiktoken"
"!python3 -m pip install --upgrade langchain 'deeplake[enterprise]' openai tiktoken"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -34,6 +37,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": []
@@ -58,16 +62,18 @@
"from langchain.llms import OpenAI\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")\n",
"os.environ[\"ACTIVELOOP_TOKEN\"] = getpass.getpass(\"Activeloop Token:\")\n",
"activeloop_token = getpass.getpass(\"Activeloop Token:\")\n",
"os.environ[\"ACTIVELOOP_TOKEN\"] = activeloop_token\n",
"os.environ[\"ACTIVELOOP_ORG\"] = getpass.getpass(\"Activeloop Org:\")\n",
"\n",
"org = os.environ[\"ACTIVELOOP_ORG\"]\n",
"org_id = os.environ[\"ACTIVELOOP_ORG\"]\n",
"embeddings = OpenAIEmbeddings()\n",
"\n",
"dataset_path = \"hub://\" + org + \"/data\""
"dataset_path = \"hub://\" + org_id + \"/data\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -77,6 +83,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -117,6 +124,38 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"`Optional`: You can also host the vector store in Deep Lake's Managed Tensor Database and run queries there. To do so, specify the runtime parameter as {'tensor_db': True} when creating the vector store. Queries then execute on the Managed Tensor Database rather than on the client side. This option does not apply to datasets stored locally or in memory. A vector store that was already created outside the Managed Tensor Database can be transferred to it by following the prescribed steps."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# with open(\"messages.txt\") as f:\n",
"# state_of_the_union = f.read()\n",
"# text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"# pages = text_splitter.split_text(state_of_the_union)\n",
"\n",
"# text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)\n",
"# texts = text_splitter.create_documents(pages)\n",
"\n",
"# print(texts)\n",
"\n",
"# dataset_path = \"hub://\" + org_id + \"/data\"\n",
"# embeddings = OpenAIEmbeddings()\n",
"# db = DeepLake.from_documents(\n",
"# texts, embeddings, dataset_path=dataset_path, overwrite=True, runtime={\"tensor_db\": True}\n",
"# )"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [

View File

@@ -35,7 +35,7 @@ retriever_infos = [
},
{
"name": "pg essay",
"description": "Good for answer quesitons about Paul Graham's essay on his career",
"description": "Good for answering questions about Paul Graham's essay on his career",
"retriever": pg_retriever
},
{

View File

@@ -66,7 +66,7 @@ from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
```
Next in the generic setup, let's specify the document loader we want to use. You can download the `state_of_the_union.txt` file [here](https://github.com/hwchase17/langchain/blob/master/docs/modules/state_of_the_union.txt)
Next in the generic setup, let's specify the document loader we want to use. You can download the `state_of_the_union.txt` file [here](https://github.com/hwchase17/langchain/blob/master/docs/extras/modules/state_of_the_union.txt)
```python

View File

@@ -14,7 +14,6 @@ from pydantic import BaseModel, root_validator
from langchain.agents.agent_types import AgentType
from langchain.agents.tools import InvalidTool
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.callbacks.manager import (
AsyncCallbackManagerForChainRun,
@@ -35,6 +34,7 @@ from langchain.schema import (
BasePromptTemplate,
OutputParserException,
)
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.messages import BaseMessage
from langchain.tools.base import BaseTool
from langchain.utilities.asyncio import asyncio_timeout

View File

@@ -3,7 +3,7 @@ from typing import Any, List, Optional, Union
from langchain.agents.agent import AgentExecutor
from langchain.agents.agent_toolkits.pandas.base import create_pandas_dataframe_agent
from langchain.base_language import BaseLanguageModel
from langchain.schema.language_model import BaseLanguageModel
def create_csv_agent(

View File

@@ -6,9 +6,9 @@ from langchain.agents.agent_toolkits.json.prompt import JSON_PREFIX, JSON_SUFFIX
from langchain.agents.agent_toolkits.json.toolkit import JsonToolkit
from langchain.agents.mrkl.base import ZeroShotAgent
from langchain.agents.mrkl.prompt import FORMAT_INSTRUCTIONS
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains.llm import LLMChain
from langchain.schema.language_model import BaseLanguageModel
def create_json_agent(

View File

@@ -4,9 +4,9 @@
from typing import Any, Optional
from langchain.agents.tools import Tool
from langchain.base_language import BaseLanguageModel
from langchain.chains.api.openapi.chain import OpenAPIEndpointChain
from langchain.requests import Requests
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools.openapi.utils.api_models import APIOperation
from langchain.tools.openapi.utils.openapi_utils import OpenAPISpec

View File

@@ -7,8 +7,8 @@ from pydantic import Field
from langchain.agents.agent_toolkits.base import BaseToolkit
from langchain.agents.agent_toolkits.nla.tool import NLATool
from langchain.base_language import BaseLanguageModel
from langchain.requests import Requests
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools.base import BaseTool
from langchain.tools.openapi.utils.openapi_utils import OpenAPISpec
from langchain.tools.plugin import AIPlugin

View File

@@ -9,9 +9,9 @@ from langchain.agents.agent_toolkits.openapi.prompt import (
from langchain.agents.agent_toolkits.openapi.toolkit import OpenAPIToolkit
from langchain.agents.mrkl.base import ZeroShotAgent
from langchain.agents.mrkl.prompt import FORMAT_INSTRUCTIONS
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains.llm import LLMChain
from langchain.schema.language_model import BaseLanguageModel
def create_openapi_agent(

View File

@@ -28,7 +28,6 @@ from langchain.agents.agent_toolkits.openapi.planner_prompt import (
from langchain.agents.agent_toolkits.openapi.spec import ReducedOpenAPISpec
from langchain.agents.mrkl.base import ZeroShotAgent
from langchain.agents.tools import Tool
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains.llm import LLMChain
from langchain.llms.openai import OpenAI
@@ -36,6 +35,7 @@ from langchain.memory import ReadOnlySharedMemory
from langchain.prompts import PromptTemplate
from langchain.requests import RequestsWrapper
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools.base import BaseTool
from langchain.tools.requests.tool import BaseRequestsTool

View File

@@ -9,8 +9,8 @@ from langchain.agents.agent_toolkits.json.base import create_json_agent
from langchain.agents.agent_toolkits.json.toolkit import JsonToolkit
from langchain.agents.agent_toolkits.openapi.prompt import DESCRIPTION
from langchain.agents.tools import Tool
from langchain.base_language import BaseLanguageModel
from langchain.requests import TextRequestsWrapper
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools import BaseTool
from langchain.tools.json.tool import JsonSpec
from langchain.tools.requests.tool import (

View File

@@ -16,10 +16,10 @@ from langchain.agents.agent_toolkits.pandas.prompt import (
from langchain.agents.mrkl.base import ZeroShotAgent
from langchain.agents.openai_functions_agent.base import OpenAIFunctionsAgent
from langchain.agents.types import AgentType
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains.llm import LLMChain
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.messages import SystemMessage
from langchain.tools.python.tool import PythonAstREPLTool

View File

@@ -9,9 +9,9 @@ from langchain.agents.agent_toolkits.powerbi.prompt import (
from langchain.agents.agent_toolkits.powerbi.toolkit import PowerBIToolkit
from langchain.agents.mrkl.base import ZeroShotAgent
from langchain.agents.mrkl.prompt import FORMAT_INSTRUCTIONS
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains.llm import LLMChain
from langchain.schema.language_model import BaseLanguageModel
from langchain.utilities.powerbi import PowerBIDataset

View File

@@ -4,7 +4,6 @@ from typing import List, Optional, Union
from pydantic import Field
from langchain.agents.agent_toolkits.base import BaseToolkit
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains.llm import LLMChain
from langchain.chat_models.base import BaseChatModel
@@ -14,6 +13,7 @@ from langchain.prompts.chat import (
HumanMessagePromptTemplate,
SystemMessagePromptTemplate,
)
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools import BaseTool
from langchain.tools.powerbi.prompt import (
QUESTION_TO_QUERY_BASE,

View File

@@ -7,9 +7,9 @@ from langchain.agents.agent_toolkits.python.prompt import PREFIX
from langchain.agents.mrkl.base import ZeroShotAgent
from langchain.agents.openai_functions_agent.base import OpenAIFunctionsAgent
from langchain.agents.types import AgentType
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains.llm import LLMChain
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.messages import SystemMessage
from langchain.tools.python.tool import PythonREPLTool

View File

@@ -6,9 +6,9 @@ from langchain.agents.agent_toolkits.spark_sql.prompt import SQL_PREFIX, SQL_SUF
from langchain.agents.agent_toolkits.spark_sql.toolkit import SparkSQLToolkit
from langchain.agents.mrkl.base import ZeroShotAgent
from langchain.agents.mrkl.prompt import FORMAT_INSTRUCTIONS
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains.llm import LLMChain
from langchain.schema.language_model import BaseLanguageModel
def create_spark_sql_agent(

View File

@@ -4,7 +4,7 @@ from typing import List
from pydantic import Field
from langchain.agents.agent_toolkits.base import BaseToolkit
from langchain.base_language import BaseLanguageModel
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools import BaseTool
from langchain.tools.spark_sql.tool import (
InfoSparkSQLTool,

View File

@@ -12,7 +12,6 @@ from langchain.agents.agent_types import AgentType
from langchain.agents.mrkl.base import ZeroShotAgent
from langchain.agents.mrkl.prompt import FORMAT_INSTRUCTIONS
from langchain.agents.openai_functions_agent.base import OpenAIFunctionsAgent
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains.llm import LLMChain
from langchain.prompts.chat import (
@@ -20,6 +19,7 @@ from langchain.prompts.chat import (
HumanMessagePromptTemplate,
MessagesPlaceholder,
)
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.messages import AIMessage, SystemMessage

View File

@@ -4,7 +4,7 @@ from typing import List
from pydantic import Field
from langchain.agents.agent_toolkits.base import BaseToolkit
from langchain.base_language import BaseLanguageModel
from langchain.schema.language_model import BaseLanguageModel
from langchain.sql_database import SQLDatabase
from langchain.tools import BaseTool
from langchain.tools.sql_database.tool import (

View File

@@ -8,9 +8,9 @@ from langchain.agents.agent_toolkits.vectorstore.toolkit import (
VectorStoreToolkit,
)
from langchain.agents.mrkl.base import ZeroShotAgent
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains.llm import LLMChain
from langchain.schema.language_model import BaseLanguageModel
def create_vectorstore_agent(

View File

@@ -4,8 +4,8 @@ from typing import List
from pydantic import BaseModel, Field
from langchain.agents.agent_toolkits.base import BaseToolkit
from langchain.base_language import BaseLanguageModel
from langchain.llms.openai import OpenAI
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools import BaseTool
from langchain.tools.vectorstore.tool import (
VectorStoreQATool,

View File

@@ -11,7 +11,6 @@ from langchain.agents.chat.prompt import (
SYSTEM_MESSAGE_SUFFIX,
)
from langchain.agents.utils import validate_tools_single_input
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains.llm import LLMChain
from langchain.prompts.chat import (
@@ -20,6 +19,7 @@ from langchain.prompts.chat import (
SystemMessagePromptTemplate,
)
from langchain.schema import AgentAction, BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools.base import BaseTool

View File

@@ -10,10 +10,10 @@ from langchain.agents.agent_types import AgentType
from langchain.agents.conversational.output_parser import ConvoOutputParser
from langchain.agents.conversational.prompt import FORMAT_INSTRUCTIONS, PREFIX, SUFFIX
from langchain.agents.utils import validate_tools_single_input
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools.base import BaseTool

View File

@@ -13,7 +13,6 @@ from langchain.agents.conversational_chat.prompt import (
TEMPLATE_TOOL_RESPONSE,
)
from langchain.agents.utils import validate_tools_single_input
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains import LLMChain
from langchain.prompts.chat import (
@@ -23,6 +22,7 @@ from langchain.prompts.chat import (
SystemMessagePromptTemplate,
)
from langchain.schema import AgentAction, BaseOutputParser, BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.messages import AIMessage, BaseMessage, HumanMessage
from langchain.tools.base import BaseTool

View File

@@ -4,8 +4,8 @@ from typing import Any, Optional, Sequence
from langchain.agents.agent import AgentExecutor
from langchain.agents.agent_types import AgentType
from langchain.agents.loading import AGENT_TO_CLASS, load_agent
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools.base import BaseTool

View File

@@ -5,7 +5,7 @@ from typing import Any, Dict, List, Optional, Callable, Tuple
from mypy_extensions import Arg, KwArg
from langchain.agents.tools import Tool
from langchain.base_language import BaseLanguageModel
from langchain.schema.language_model import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.callbacks.manager import Callbacks
from langchain.chains.api import news_docs, open_meteo_docs, podcast_docs, tmdb_docs

View File

@@ -9,8 +9,8 @@ import yaml
from langchain.agents.agent import BaseMultiActionAgent, BaseSingleActionAgent
from langchain.agents.tools import Tool
from langchain.agents.types import AGENT_TO_CLASS
from langchain.base_language import BaseLanguageModel
from langchain.chains.loading import load_chain, load_chain_from_config
from langchain.schema.language_model import BaseLanguageModel
from langchain.utilities.loading import try_load_from_hub
logger = logging.getLogger(__file__)

View File

@@ -11,10 +11,10 @@ from langchain.agents.mrkl.output_parser import MRKLOutputParser
from langchain.agents.mrkl.prompt import FORMAT_INSTRUCTIONS, PREFIX, SUFFIX
from langchain.agents.tools import Tool
from langchain.agents.utils import validate_tools_single_input
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools.base import BaseTool

View File

@@ -7,7 +7,6 @@ from typing import Any, List, Optional, Sequence, Tuple, Union
from pydantic import root_validator
from langchain.agents import BaseSingleActionAgent
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.callbacks.manager import Callbacks
from langchain.chat_models.openai import ChatOpenAI
@@ -23,6 +22,7 @@ from langchain.schema import (
BasePromptTemplate,
OutputParserException,
)
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.messages import (
AIMessage,
BaseMessage,

View File

@@ -7,7 +7,6 @@ from typing import Any, List, Optional, Sequence, Tuple, Union
from pydantic import root_validator
from langchain.agents import BaseMultiActionAgent
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.callbacks.manager import Callbacks
from langchain.chat_models.openai import ChatOpenAI
@@ -23,6 +22,7 @@ from langchain.schema import (
BasePromptTemplate,
OutputParserException,
)
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.messages import (
AIMessage,
BaseMessage,

View File

@@ -10,10 +10,10 @@ from langchain.agents.react.textworld_prompt import TEXTWORLD_PROMPT
from langchain.agents.react.wiki_prompt import WIKI_PROMPT
from langchain.agents.tools import Tool
from langchain.agents.utils import validate_tools_single_input
from langchain.base_language import BaseLanguageModel
from langchain.docstore.base import Docstore
from langchain.docstore.document import Document
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools.base import BaseTool

View File

@@ -9,8 +9,8 @@ from langchain.agents.self_ask_with_search.output_parser import SelfAskOutputPar
from langchain.agents.self_ask_with_search.prompt import PROMPT
from langchain.agents.tools import Tool
from langchain.agents.utils import validate_tools_single_input
from langchain.base_language import BaseLanguageModel
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools.base import BaseTool
from langchain.utilities.google_serper import GoogleSerperAPIWrapper
from langchain.utilities.serpapi import SerpAPIWrapper

View File

@@ -8,7 +8,6 @@ from langchain.agents.structured_chat.output_parser import (
StructuredChatOutputParserWithRetries,
)
from langchain.agents.structured_chat.prompt import FORMAT_INSTRUCTIONS, PREFIX, SUFFIX
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains.llm import LLMChain
from langchain.prompts.chat import (
@@ -17,6 +16,7 @@ from langchain.prompts.chat import (
SystemMessagePromptTemplate,
)
from langchain.schema import AgentAction, BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools import BaseTool
HUMAN_MESSAGE_TEMPLATE = "{input}\n\n{agent_scratchpad}"

View File

@@ -9,9 +9,9 @@ from pydantic import Field
from langchain.agents.agent import AgentOutputParser
from langchain.agents.structured_chat.prompt import FORMAT_INSTRUCTIONS
from langchain.base_language import BaseLanguageModel
from langchain.output_parsers import OutputFixingParser
from langchain.schema import AgentAction, AgentFinish, OutputParserException
from langchain.schema.language_model import BaseLanguageModel
logger = logging.getLogger(__name__)

View File

@@ -1,105 +1,6 @@
"""Deprecated module for BaseLanguageModel class, kept for backwards compatibility."""
from __future__ import annotations
from abc import ABC, abstractmethod
from typing import Any, List, Optional, Sequence, Set
from langchain.schema.language_model import BaseLanguageModel
from langchain.callbacks.manager import Callbacks
from langchain.load.serializable import Serializable
from langchain.schema import LLMResult, PromptValue
from langchain.schema.messages import BaseMessage, get_buffer_string
def _get_token_ids_default_method(text: str) -> List[int]:
"""Encode the text into token IDs."""
# TODO: this method may not be exact.
# TODO: this method may differ based on model (eg codex).
try:
from transformers import GPT2TokenizerFast
except ImportError:
raise ValueError(
"Could not import transformers python package. "
"This is needed in order to calculate get_token_ids. "
"Please install it with `pip install transformers`."
)
# create a GPT-2 tokenizer instance
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
# tokenize the text using the GPT-2 tokenizer
return tokenizer.encode(text)
class BaseLanguageModel(Serializable, ABC):
"""Base class for all language models."""
@abstractmethod
def generate_prompt(
self,
prompts: List[PromptValue],
stop: Optional[List[str]] = None,
callbacks: Callbacks = None,
**kwargs: Any,
) -> LLMResult:
"""Take in a list of prompt values and return an LLMResult."""
@abstractmethod
async def agenerate_prompt(
self,
prompts: List[PromptValue],
stop: Optional[List[str]] = None,
callbacks: Callbacks = None,
**kwargs: Any,
) -> LLMResult:
"""Take in a list of prompt values and return an LLMResult."""
@abstractmethod
def predict(
self, text: str, *, stop: Optional[Sequence[str]] = None, **kwargs: Any
) -> str:
"""Predict text from text."""
@abstractmethod
def predict_messages(
self,
messages: List[BaseMessage],
*,
stop: Optional[Sequence[str]] = None,
**kwargs: Any,
) -> BaseMessage:
"""Predict message from messages."""
@abstractmethod
async def apredict(
self, text: str, *, stop: Optional[Sequence[str]] = None, **kwargs: Any
) -> str:
"""Predict text from text."""
@abstractmethod
async def apredict_messages(
self,
messages: List[BaseMessage],
*,
stop: Optional[Sequence[str]] = None,
**kwargs: Any,
) -> BaseMessage:
"""Predict message from messages."""
def get_token_ids(self, text: str) -> List[int]:
"""Get the token present in the text."""
return _get_token_ids_default_method(text)
def get_num_tokens(self, text: str) -> int:
"""Get the number of tokens present in the text."""
return len(self.get_token_ids(text))
def get_num_tokens_from_messages(self, messages: List[BaseMessage]) -> int:
"""Get the number of tokens in the message."""
return sum([self.get_num_tokens(get_buffer_string([m])) for m in messages])
@classmethod
def all_required_field_names(cls) -> Set:
all_required_field_names = set()
for field in cls.__fields__.values():
all_required_field_names.add(field.name)
if field.has_alias:
all_required_field_names.add(field.alias)
return all_required_field_names
__all__ = ["BaseLanguageModel"]
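Review note: after this change the old module appears to keep only the docstring, the re-export from `langchain.schema.language_model`, and `__all__`, so the old import path should resolve to the same class object as the new one. A minimal sketch of that shim pattern follows. The module names `demo_schema_language_model` and `demo_base_language` are hypothetical stand-ins built in memory, not LangChain modules.

```python
import sys
import types

# Hypothetical stand-in for the new canonical module.
new_mod = types.ModuleType("demo_schema_language_model")


class BaseLanguageModel:
    """Placeholder for the real abstract base class."""


new_mod.BaseLanguageModel = BaseLanguageModel
sys.modules[new_mod.__name__] = new_mod

# Hypothetical stand-in for the deprecated module: it defines nothing
# itself and only re-exports the class from the canonical location.
old_mod = types.ModuleType("demo_base_language")
exec(
    "from demo_schema_language_model import BaseLanguageModel\n"
    "__all__ = ['BaseLanguageModel']\n",
    old_mod.__dict__,
)
sys.modules[old_mod.__name__] = old_mod

# Both import paths now yield the very same class object.
from demo_base_language import BaseLanguageModel as OldPath
from demo_schema_language_model import BaseLanguageModel as NewPath

same_class = OldPath is NewPath
```

Because both names refer to one object, `isinstance` and `issubclass` checks written against the old path keep working for code that has not migrated yet.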

View File

@@ -6,6 +6,7 @@ from langchain.callbacks.arize_callback import ArizeCallbackHandler
from langchain.callbacks.arthur_callback import ArthurCallbackHandler
from langchain.callbacks.clearml_callback import ClearMLCallbackHandler
from langchain.callbacks.comet_ml_callback import CometCallbackHandler
from langchain.callbacks.context_callback import ContextCallbackHandler
from langchain.callbacks.file import FileCallbackHandler
from langchain.callbacks.flyte_callback import FlyteCallbackHandler
from langchain.callbacks.human import HumanApprovalCallbackHandler
@@ -36,6 +37,7 @@ __all__ = [
"ArthurCallbackHandler",
"ClearMLCallbackHandler",
"CometCallbackHandler",
"ContextCallbackHandler",
"FileCallbackHandler",
"HumanApprovalCallbackHandler",
"InfinoCallbackHandler",

View File

@@ -0,0 +1,193 @@
"""Callback handler for Context AI"""
import os
from typing import Any, Dict, List
from uuid import UUID
from langchain.callbacks.base import BaseCallbackHandler
from langchain.schema import (
BaseMessage,
LLMResult,
)
def import_context() -> Any:
try:
import getcontext # noqa: F401
from getcontext.generated.models import (
Conversation,
Message,
MessageRole,
Rating,
)
from getcontext.token import Credential # noqa: F401
except ImportError:
raise ImportError(
"To use the context callback manager you need to have the "
"`getcontext` python package installed (version >=0.3.0). "
"Please install it with `pip install --upgrade context-python`"
)
return getcontext, Credential, Conversation, Message, MessageRole, Rating
class ContextCallbackHandler(BaseCallbackHandler):
"""Callback Handler that records transcripts to Context (https://getcontext.ai).
Keyword Args:
token (optional): The token with which to authenticate requests to Context.
Visit https://go.getcontext.ai/settings to generate a token.
If not provided, the value of the `CONTEXT_TOKEN` environment
variable will be used.
Raises:
ImportError: if the `context-python` package is not installed.
Chat Example:
>>> from langchain.chat_models import ChatOpenAI
>>> from langchain.schema.messages import HumanMessage, SystemMessage
>>> from langchain.callbacks import ContextCallbackHandler
>>> context_callback = ContextCallbackHandler(
... token="<CONTEXT_TOKEN_HERE>",
... )
>>> chat = ChatOpenAI(
... temperature=0,
... headers={"user_id": "123"},
... callbacks=[context_callback],
... openai_api_key="API_KEY_HERE",
... )
>>> messages = [
... SystemMessage(content="You translate English to French."),
... HumanMessage(content="I love programming with LangChain."),
... ]
>>> chat(messages)
Chain Example:
>>> from langchain import LLMChain
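>>> from langchain.chat_models import ChatOpenAI
>>> from langchain.prompts import PromptTemplate
>>> from langchain.prompts.chat import (
...     ChatPromptTemplate,
...     HumanMessagePromptTemplate,
... )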
>>> from langchain.callbacks import ContextCallbackHandler
>>> context_callback = ContextCallbackHandler(
... token="<CONTEXT_TOKEN_HERE>",
... )
>>> human_message_prompt = HumanMessagePromptTemplate(
... prompt=PromptTemplate(
... template="What is a good name for a company that makes {product}?",
... input_variables=["product"],
... ),
... )
>>> chat_prompt_template = ChatPromptTemplate.from_messages(
... [human_message_prompt]
... )
>>> callback = ContextCallbackHandler(token="<CONTEXT_TOKEN_HERE>")
>>> # Note: the same callback object must be shared between the
>>> # LLM and the chain.
>>> chat = ChatOpenAI(temperature=0.9, callbacks=[callback])
>>> chain = LLMChain(
... llm=chat,
... prompt=chat_prompt_template,
... callbacks=[callback]
... )
>>> chain.run("colorful socks")
"""
def __init__(self, token: str = "", verbose: bool = False, **kwargs: Any) -> None:
(
self.context,
self.credential,
self.conversation_model,
self.message_model,
self.message_role_model,
self.rating_model,
) = import_context()
token = token or os.environ.get("CONTEXT_TOKEN") or ""
self.client = self.context.ContextAPI(credential=self.credential(token))
self.chain_run_id = None
self.llm_model = None
self.messages: List[Any] = []
self.metadata: Dict[str, str] = {}
def on_chat_model_start(
self,
serialized: Dict[str, Any],
messages: List[List[BaseMessage]],
*,
run_id: UUID,
**kwargs: Any,
) -> Any:
"""Run when the chat model is started."""
llm_model = kwargs.get("invocation_params", {}).get("model", None)
if llm_model is not None:
self.metadata["llm_model"] = llm_model
if len(messages) == 0:
return
for message in messages[0]:
role = self.message_role_model.SYSTEM
if message.type == "human":
role = self.message_role_model.USER
elif message.type == "system":
role = self.message_role_model.SYSTEM
elif message.type == "ai":
role = self.message_role_model.ASSISTANT
self.messages.append(
self.message_model(
message=message.content,
role=role,
)
)
def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
"""Run when LLM ends."""
if len(response.generations) == 0 or len(response.generations[0]) == 0:
return
if not self.chain_run_id:
generation = response.generations[0][0]
self.messages.append(
self.message_model(
message=generation.text,
role=self.message_role_model.ASSISTANT,
)
)
self._log_conversation()
def on_chain_start(
self, serialized: Dict[str, Any], inputs: Dict[str, Any], **kwargs: Any
) -> None:
"""Run when chain starts."""
self.chain_run_id = kwargs.get("run_id", None)
def on_chain_end(self, outputs: Dict[str, Any], **kwargs: Any) -> None:
"""Run when chain ends."""
self.messages.append(
self.message_model(
message=outputs["text"],
role=self.message_role_model.ASSISTANT,
)
)
self._log_conversation()
self.chain_run_id = None
def _log_conversation(self) -> None:
"""Log the conversation to the context API."""
if len(self.messages) == 0:
return
self.client.log.conversation_upsert(
body={
"conversation": self.conversation_model(
messages=self.messages,
metadata=self.metadata,
)
}
)
self.messages = []
self.metadata = {}

View File

@@ -551,8 +551,18 @@ class MlflowCallbackHandler(BaseMetadataCallbackHandler, BaseCallbackHandler):
on_llm_start_records_df = pd.DataFrame(self.records["on_llm_start_records"])
on_llm_end_records_df = pd.DataFrame(self.records["on_llm_end_records"])
llm_input_columns = ["step", "prompt"]
if "name" in on_llm_start_records_df.columns:
llm_input_columns.append("name")
elif "id" in on_llm_start_records_df.columns:
# id is llm class's full import path. For example:
# ["langchain", "llms", "openai", "AzureOpenAI"]
on_llm_start_records_df["name"] = on_llm_start_records_df["id"].apply(
lambda id_: id_[-1]
)
llm_input_columns.append("name")
llm_input_prompts_df = (
on_llm_start_records_df[["step", "prompt", "name"]]
on_llm_start_records_df[llm_input_columns]
.dropna(axis=1)
.rename({"step": "prompt_step"}, axis=1)
)
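Review note: the hunk above seems to make the `name` column optional and, when only `id` is recorded, derives the name from the last element of the class import path. The sketch below illustrates that fallback with plain dictionaries instead of a pandas DataFrame, so it is an analogy to the handler's logic rather than its code. The function name `input_columns` is hypothetical.

```python
from typing import Any, Dict, List


def input_columns(records: List[Dict[str, Any]]) -> List[str]:
    """Pick the columns to keep, deriving `name` from `id` when needed."""
    columns = ["step", "prompt"]
    keys = set()
    for record in records:
        keys.update(record.keys())

    if "name" in keys:
        columns.append("name")
    elif "id" in keys:
        # `id` is the class's full import path, for example
        # ["langchain", "llms", "openai", "AzureOpenAI"]; keep the last part.
        for record in records:
            if "id" in record:
                record["name"] = record["id"][-1]
        columns.append("name")
    return columns


records = [
    {
        "step": 1,
        "prompt": "Hello",
        "id": ["langchain", "llms", "openai", "AzureOpenAI"],
    }
]
columns = input_columns(records)
```

When neither key is present, only `step` and `prompt` are selected, which avoids the lookup failure a fixed `["step", "prompt", "name"]` selection would cause.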

View File

@@ -4,7 +4,7 @@ from concurrent.futures import Future, ThreadPoolExecutor, wait
from typing import Any, Optional, Sequence, Set, Union
from uuid import UUID
from langchainplus_sdk import LangChainPlusClient, RunEvaluator
from langsmith import Client, RunEvaluator
from langchain.callbacks.manager import tracing_v2_enabled
from langchain.callbacks.tracers.base import BaseTracer
@@ -23,8 +23,8 @@ class EvaluatorCallbackHandler(BaseTracer):
max_workers : int, optional
The maximum number of worker threads to use for running the evaluators.
If not specified, it will default to the number of evaluators.
client : LangChainPlusClient, optional
The LangChainPlusClient instance to use for evaluating the runs.
client : LangSmith Client, optional
The LangSmith client instance to use for evaluating the runs.
If not specified, a new instance will be created.
example_id : Union[UUID, str], optional
The example ID to be associated with the runs.
@@ -35,8 +35,8 @@ class EvaluatorCallbackHandler(BaseTracer):
----------
example_id : Union[UUID, None]
The example ID associated with the runs.
client : LangChainPlusClient
The LangChainPlusClient instance used for evaluating the runs.
client : Client
The LangSmith client instance used for evaluating the runs.
evaluators : Sequence[RunEvaluator]
The sequence of run evaluators to be executed.
executor : ThreadPoolExecutor
@@ -56,7 +56,7 @@ class EvaluatorCallbackHandler(BaseTracer):
self,
evaluators: Sequence[RunEvaluator],
max_workers: Optional[int] = None,
client: Optional[LangChainPlusClient] = None,
client: Optional[Client] = None,
example_id: Optional[Union[UUID, str]] = None,
skip_unfinished: bool = True,
project_name: Optional[str] = None,
@@ -66,7 +66,7 @@ class EvaluatorCallbackHandler(BaseTracer):
self.example_id = (
UUID(example_id) if isinstance(example_id, str) else example_id
)
self.client = client or LangChainPlusClient()
self.client = client or Client()
self.evaluators = evaluators
self.executor = ThreadPoolExecutor(
max_workers=max(max_workers or len(evaluators), 1)

View File

@@ -8,7 +8,7 @@ from datetime import datetime
from typing import Any, Dict, List, Optional, Set, Union
from uuid import UUID
from langchainplus_sdk import LangChainPlusClient
from langsmith import Client
from langchain.callbacks.tracers.base import BaseTracer
from langchain.callbacks.tracers.schemas import Run, RunTypeEnum, TracerSession
@@ -44,7 +44,7 @@ class LangChainTracer(BaseTracer):
self,
example_id: Optional[Union[UUID, str]] = None,
project_name: Optional[str] = None,
client: Optional[LangChainPlusClient] = None,
client: Optional[Client] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
@@ -59,7 +59,7 @@ class LangChainTracer(BaseTracer):
)
# set max_workers to 1 to process tasks in order
self.executor = ThreadPoolExecutor(max_workers=1)
self.client = client or LangChainPlusClient()
self.client = client or Client()
self._futures: Set[Future] = set()
self.tags = tags or []
global _TRACERS
@@ -109,8 +109,6 @@ class LangChainTracer(BaseTracer):
def _persist_run_single(self, run: Run) -> None:
"""Persist a run."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
run_dict = run.dict(exclude={"child_runs"})
run_dict["tags"] = self._get_tags(run)
extra = run_dict.get("extra", {})
@@ -136,12 +134,16 @@ class LangChainTracer(BaseTracer):
def _on_llm_start(self, run: Run) -> None:
"""Persist an LLM run."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
self._futures.add(
self.executor.submit(self._persist_run_single, run.copy(deep=True))
)
def _on_chat_model_start(self, run: Run) -> None:
"""Persist an LLM run."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
self._futures.add(
self.executor.submit(self._persist_run_single, run.copy(deep=True))
)
@@ -160,6 +162,8 @@ class LangChainTracer(BaseTracer):
def _on_chain_start(self, run: Run) -> None:
"""Process the Chain Run upon start."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
self._futures.add(
self.executor.submit(self._persist_run_single, run.copy(deep=True))
)
@@ -178,6 +182,8 @@ class LangChainTracer(BaseTracer):
def _on_tool_start(self, run: Run) -> None:
"""Process the Tool Run upon start."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
self._futures.add(
self.executor.submit(self._persist_run_single, run.copy(deep=True))
)
@@ -196,6 +202,8 @@ class LangChainTracer(BaseTracer):
def _on_retriever_start(self, run: Run) -> None:
"""Process the Retriever Run upon start."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
self._futures.add(
self.executor.submit(self._persist_run_single, run.copy(deep=True))
)
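Review note: reading the hunks above, the `reference_example_id` assignment moves out of `_persist_run_single` (which receives a deep copy) and into each `_on_*_start` handler, before the copy is taken. One visible consequence is sketched below with a simplified dataclass. The `Run` class and the `start_old` and `start_new` helpers are hypothetical and only model the ordering of the assignment and the copy.

```python
import copy
from dataclasses import dataclass
from typing import Optional
from uuid import UUID, uuid4


@dataclass
class Run:
    """Simplified stand-in for the tracer's Run schema."""

    id: UUID
    parent_run_id: Optional[UUID] = None
    reference_example_id: Optional[UUID] = None


def start_old(run: Run, example_id: UUID) -> Run:
    # Old ordering: copy first, then tag the copy inside the persist step.
    snapshot = copy.deepcopy(run)
    if snapshot.parent_run_id is None:
        snapshot.reference_example_id = example_id
    return snapshot


def start_new(run: Run, example_id: UUID) -> Run:
    # New ordering: tag the live run first, then take the snapshot.
    if run.parent_run_id is None:
        run.reference_example_id = example_id
    return copy.deepcopy(run)


example_id = uuid4()

old_run = Run(id=uuid4())
old_snapshot = start_old(old_run, example_id)

new_run = Run(id=uuid4())
new_snapshot = start_new(new_run, example_id)
```

In the old ordering only the snapshot carries the example ID. In the new ordering the in-memory root run carries it too, so any later snapshot of that run also includes it. Child runs (those with a `parent_run_id`) are left untouched in both versions.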

View File

@@ -5,8 +5,8 @@ import datetime
from typing import Any, Dict, List, Optional
from uuid import UUID
from langchainplus_sdk.schemas import RunBase as BaseRunV2
from langchainplus_sdk.schemas import RunTypeEnum
from langsmith.schemas import RunBase as BaseRunV2
from langsmith.schemas import RunTypeEnum
from pydantic import BaseModel, Field, root_validator
from langchain.schema import LLMResult

View File

@@ -5,7 +5,6 @@ from typing import Any, Dict, List, Optional
from pydantic import Field, root_validator
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import (
AsyncCallbackManagerForChainRun,
CallbackManagerForChainRun,
@@ -15,6 +14,7 @@ from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.requests import TextRequestsWrapper
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
class APIChain(Chain):

View File

@@ -7,13 +7,13 @@ from typing import Any, Dict, List, NamedTuple, Optional, cast
from pydantic import BaseModel, Field
from requests import Response
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun, Callbacks
from langchain.chains.api.openapi.requests_chain import APIRequesterChain
from langchain.chains.api.openapi.response_chain import APIResponderChain
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.requests import Requests
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools.openapi.utils.api_models import APIOperation

View File

@@ -4,11 +4,11 @@ import json
import re
from typing import Any
from langchain.base_language import BaseLanguageModel
from langchain.chains.api.openapi.prompts import REQUEST_TEMPLATE
from langchain.chains.llm import LLMChain
from langchain.prompts.prompt import PromptTemplate
from langchain.schema import BaseOutputParser
from langchain.schema.language_model import BaseLanguageModel
class APIRequesterOutputParser(BaseOutputParser):

View File

@@ -4,11 +4,11 @@ import json
import re
from typing import Any
from langchain.base_language import BaseLanguageModel
from langchain.chains.api.openapi.prompts import RESPONSE_TEMPLATE
from langchain.chains.llm import LLMChain
from langchain.prompts.prompt import PromptTemplate
from langchain.schema import BaseOutputParser
from langchain.schema.language_model import BaseLanguageModel
class APIResponderOutputParser(BaseOutputParser):

View File

@@ -1,7 +1,6 @@
"""Chain for applying constitutional principles to the outputs of another chain."""
from typing import Any, Dict, List, Optional
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.constitutional_ai.models import ConstitutionalPrinciple
@@ -9,6 +8,7 @@ from langchain.chains.constitutional_ai.principles import PRINCIPLES
from langchain.chains.constitutional_ai.prompts import CRITIQUE_PROMPT, REVISION_PROMPT
from langchain.chains.llm import LLMChain
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
class ConstitutionalChain(Chain):

View File

@@ -9,7 +9,6 @@ from typing import Any, Callable, Dict, List, Optional, Tuple, Union
from pydantic import Extra, Field, root_validator
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import (
AsyncCallbackManagerForChainRun,
CallbackManagerForChainRun,
@@ -22,6 +21,7 @@ from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_
from langchain.chains.llm import LLMChain
from langchain.chains.question_answering import load_qa_chain
from langchain.schema import BasePromptTemplate, BaseRetriever, Document
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.messages import BaseMessage
from langchain.vectorstores.base import VectorStore
@@ -72,7 +72,7 @@ class BaseConversationalRetrievalChain(Chain):
"""Return the retrieved source documents as part of the final result."""
return_generated_question: bool = False
"""Return the generated question as part of the final result."""
get_chat_history: Optional[Callable[[CHAT_TURN_TYPE], str]] = None
get_chat_history: Optional[Callable[[List[CHAT_TURN_TYPE]], str]] = None
"""An optional function to get a string of the chat history.
If None is provided, will use a default."""
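Review note: the corrected annotation says the callable receives the whole history as a list of turns, not a single turn. A minimal sketch of a function with that shape follows. It assumes each turn is a `(human, ai)` tuple of strings, which is a simplification of `CHAT_TURN_TYPE`, and the name `format_history` is hypothetical.

```python
from typing import List, Tuple

# Assumed simplification: one turn is a (human, ai) pair of strings.
ChatTurn = Tuple[str, str]


def format_history(chat_history: List[ChatTurn]) -> str:
    """Render the full list of turns as one prompt-ready string."""
    lines = []
    for human, ai in chat_history:
        lines.append(f"Human: {human}")
        lines.append(f"Assistant: {ai}")
    return "\n".join(lines)


history = [
    ("Hi", "Hello!"),
    ("What is LangChain?", "A framework for LLM apps."),
]
formatted = format_history(history)
```

A function like this matches `Callable[[List[CHAT_TURN_TYPE]], str]` for tuple-based histories and could be passed as `get_chat_history`.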

View File

@@ -7,7 +7,6 @@ from typing import Any, Dict, List, Optional, Sequence, Tuple
import numpy as np
from pydantic import Field
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import (
CallbackManagerForChainRun,
)
@@ -20,6 +19,7 @@ from langchain.chains.flare.prompts import (
from langchain.chains.llm import LLMChain
from langchain.llms import OpenAI
from langchain.schema import BasePromptTemplate, BaseRetriever, Generation
from langchain.schema.language_model import BaseLanguageModel
class _ResponseChain(LLMChain):

View File

@@ -5,13 +5,13 @@ from typing import Any, Dict, List, Optional
from pydantic import Field
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.graph_qa.prompts import ENTITY_EXTRACTION_PROMPT, GRAPH_QA_PROMPT
from langchain.chains.llm import LLMChain
from langchain.graphs.networkx_graph import NetworkxEntityGraph, get_entities
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
class GraphQAChain(Chain):
@@ -75,9 +75,10 @@ class GraphQAChain(Chain):
)
entities = get_entities(entity_string)
context = ""
all_triplets = []
for entity in entities:
triplets = self.graph.get_entity_knowledge(entity)
context += "\n".join(triplets)
all_triplets.extend(self.graph.get_entity_knowledge(entity))
context = "\n".join(all_triplets)
_run_manager.on_text("Full Context:", end="\n", verbose=self.verbose)
_run_manager.on_text(context, color="green", end="\n", verbose=self.verbose)
result = self.qa_chain(
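Review note: the change to how `context` is built appears to fix a missing separator between entities. Appending `"\n".join(triplets)` per entity puts newlines inside each entity's group but none between groups, so the last triplet of one entity runs into the first triplet of the next. The sketch below reproduces both behaviours with a hypothetical in-memory `knowledge` dictionary in place of the graph lookup.

```python
from typing import Dict, List

# Hypothetical stand-in for graph.get_entity_knowledge(entity).
knowledge: Dict[str, List[str]] = {
    "Alice": ["Alice knows Bob", "Alice works at Acme"],
    "Bob": ["Bob lives in Paris"],
}


def context_concat(entities: List[str]) -> str:
    """Old behaviour: join per entity, then concatenate the pieces."""
    context = ""
    for entity in entities:
        context += "\n".join(knowledge.get(entity, []))
    return context


def context_collect(entities: List[str]) -> str:
    """New behaviour: gather every triplet first, then join once."""
    all_triplets: List[str] = []
    for entity in entities:
        all_triplets.extend(knowledge.get(entity, []))
    return "\n".join(all_triplets)


old_context = context_concat(["Alice", "Bob"])
new_context = context_collect(["Alice", "Bob"])
```

With the old approach the second line reads `Alice works at AcmeBob lives in Paris`. Joining once over the collected list yields one triplet per line.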

View File

@@ -6,13 +6,13 @@ from typing import Any, Dict, List, Optional
from pydantic import Field
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.graph_qa.prompts import CYPHER_GENERATION_PROMPT, CYPHER_QA_PROMPT
from langchain.chains.llm import LLMChain
from langchain.graphs.neo4j_graph import Neo4jGraph
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
INTERMEDIATE_STEPS_KEY = "intermediate_steps"

View File

@@ -5,7 +5,6 @@ from typing import Any, Dict, List, Optional
from pydantic import Field
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.graph_qa.prompts import (
@@ -15,6 +14,7 @@ from langchain.chains.graph_qa.prompts import (
from langchain.chains.llm import LLMChain
from langchain.graphs.hugegraph import HugeGraph
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
class HugeGraphQAChain(Chain):

View File

@@ -5,13 +5,13 @@ from typing import Any, Dict, List, Optional
from pydantic import Field
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.graph_qa.prompts import CYPHER_QA_PROMPT, KUZU_GENERATION_PROMPT
from langchain.chains.llm import LLMChain
from langchain.graphs.kuzu_graph import KuzuGraph
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
class KuzuQAChain(Chain):

View File

@@ -5,13 +5,13 @@ from typing import Any, Dict, List, Optional
from pydantic import Field
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.graph_qa.prompts import CYPHER_QA_PROMPT, NGQL_GENERATION_PROMPT
from langchain.chains.llm import LLMChain
from langchain.graphs.nebula_graph import NebulaGraph
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
class NebulaGraphQAChain(Chain):

View File

@@ -7,7 +7,6 @@ from typing import Any, Dict, List, Optional
from pydantic import Field
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.graph_qa.prompts import (
@@ -19,6 +18,7 @@ from langchain.chains.graph_qa.prompts import (
from langchain.chains.llm import LLMChain
from langchain.graphs.rdf_graph import RdfGraph
from langchain.prompts.base import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
class GraphSparqlQAChain(Chain):

View File

@@ -9,12 +9,12 @@ from typing import Any, Dict, List, Optional
import numpy as np
from pydantic import Extra
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.hyde.prompts import PROMPT_MAP
from langchain.chains.llm import LLMChain
from langchain.embeddings.base import Embeddings
from langchain.schema.language_model import BaseLanguageModel
class HypotheticalDocumentEmbedder(Chain, Embeddings):

View File

@@ -6,7 +6,6 @@ from typing import Any, Dict, List, Optional, Sequence, Tuple, Union
from pydantic import Extra, Field
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import (
AsyncCallbackManager,
AsyncCallbackManagerForChainRun,
@@ -25,6 +24,7 @@ from langchain.schema import (
NoOpOutputParser,
PromptValue,
)
from langchain.schema.language_model import BaseLanguageModel
class LLMChain(Chain):

View File

@@ -7,12 +7,12 @@ from typing import Any, Dict, List, Optional
from pydantic import Extra, Field, root_validator
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chains.llm_bash.prompt import PROMPT
from langchain.schema import BasePromptTemplate, OutputParserException
from langchain.schema.language_model import BaseLanguageModel
from langchain.utilities.bash import BashProcess
logger = logging.getLogger(__name__)

View File

@@ -6,7 +6,6 @@ from typing import Any, Dict, List, Optional
from pydantic import Extra, root_validator
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
@@ -18,6 +17,7 @@ from langchain.chains.llm_checker.prompt import (
)
from langchain.chains.sequential import SequentialChain
from langchain.prompts import PromptTemplate
from langchain.schema.language_model import BaseLanguageModel
def _load_question_to_checked_assertions_chain(

View File

@@ -9,7 +9,6 @@ from typing import Any, Dict, List, Optional
import numexpr
from pydantic import Extra, root_validator
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import (
AsyncCallbackManagerForChainRun,
CallbackManagerForChainRun,
@@ -18,6 +17,7 @@ from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chains.llm_math.prompt import PROMPT
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
class LLMMathChain(Chain):

View File

@@ -8,12 +8,12 @@ from typing import Any, Dict, List, Optional
from pydantic import Extra, root_validator
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chains.sequential import SequentialChain
from langchain.prompts.prompt import PromptTemplate
from langchain.schema.language_model import BaseLanguageModel
PROMPTS_DIR = Path(__file__).parent / "prompts"

View File

@@ -9,7 +9,6 @@ from typing import Any, Dict, List, Mapping, Optional
from pydantic import Extra
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun, Callbacks
from langchain.chains import ReduceDocumentsChain
from langchain.chains.base import Chain
@@ -19,6 +18,7 @@ from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains.llm import LLMChain
from langchain.docstore.document import Document
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.text_splitter import TextSplitter

View File

@@ -6,12 +6,12 @@ from typing import Any, Dict, List, Optional
from pydantic import Extra, root_validator
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chains.natbot.prompt import PROMPT
from langchain.llms.openai import OpenAI
from langchain.schema.language_model import BaseLanguageModel
class NatBotChain(Chain):

View File

@@ -2,13 +2,13 @@ from typing import Iterator, List
from pydantic import BaseModel, Field
from langchain.base_language import BaseLanguageModel
from langchain.chains.llm import LLMChain
from langchain.chains.openai_functions.utils import get_llm_kwargs
from langchain.output_parsers.openai_functions import (
PydanticOutputFunctionsParser,
)
from langchain.prompts.chat import ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.messages import HumanMessage, SystemMessage

View File

@@ -2,7 +2,6 @@ from typing import Any, List
from pydantic import BaseModel
from langchain.base_language import BaseLanguageModel
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chains.openai_functions.utils import (
@@ -15,6 +14,7 @@ from langchain.output_parsers.openai_functions import (
PydanticAttrOutputFunctionsParser,
)
from langchain.prompts import ChatPromptTemplate
from langchain.schema.language_model import BaseLanguageModel
def _get_extraction_function(entity_schema: dict) -> dict:

View File

@@ -8,7 +8,6 @@ from openapi_schema_pydantic import Parameter
from requests import Response
from langchain import LLMChain
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.sequential import SequentialChain
@@ -17,6 +16,7 @@ from langchain.input import get_colored_text
from langchain.output_parsers.openai_functions import JsonOutputFunctionsParser
from langchain.prompts import ChatPromptTemplate
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools import APIOperation
from langchain.utilities.openapi import OpenAPISpec

@@ -2,7 +2,6 @@ from typing import Any, List, Optional, Type, Union
from pydantic import BaseModel, Field
from langchain.base_language import BaseLanguageModel
from langchain.chains.llm import LLMChain
from langchain.chains.openai_functions.utils import get_llm_kwargs
from langchain.output_parsers.openai_functions import (
@@ -12,6 +11,7 @@ from langchain.output_parsers.openai_functions import (
from langchain.prompts import PromptTemplate
from langchain.prompts.chat import ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.schema import BaseLLMOutputParser
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.messages import HumanMessage, SystemMessage

@@ -1,6 +1,5 @@
from typing import Any
from langchain.base_language import BaseLanguageModel
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chains.openai_functions.utils import _convert_schema, get_llm_kwargs
@@ -9,6 +8,7 @@ from langchain.output_parsers.openai_functions import (
    PydanticOutputFunctionsParser,
)
from langchain.prompts import ChatPromptTemplate
from langchain.schema.language_model import BaseLanguageModel
def _get_tagging_function(schema: dict) -> dict:

@@ -9,13 +9,13 @@ from typing import Any, Dict, List, Optional
from pydantic import Extra, root_validator
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chains.pal.colored_object_prompt import COLORED_OBJECT_PROMPT
from langchain.chains.pal.math_prompt import MATH_PROMPT
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.utilities import PythonREPL

@@ -3,10 +3,10 @@ from typing import Callable, List, Tuple
from pydantic import BaseModel, Field
from langchain.base_language import BaseLanguageModel
from langchain.chat_models.base import BaseChatModel
from langchain.llms.base import BaseLLM
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
class BasePromptSelector(BaseModel, ABC):

@@ -5,12 +5,12 @@ from typing import Any, Dict, List, Optional
from pydantic import Field
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chains.qa_generation.prompt import PROMPT_SELECTOR
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.text_splitter import RecursiveCharacterTextSplitter, TextSplitter

@@ -9,7 +9,6 @@ from typing import Any, Dict, List, Optional
from pydantic import Extra, root_validator
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import (
    AsyncCallbackManagerForChainRun,
    CallbackManagerForChainRun,
@@ -28,6 +27,7 @@ from langchain.chains.qa_with_sources.map_reduce_prompt import (
)
from langchain.docstore.document import Document
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
class BaseQAWithSourcesChain(Chain, ABC):

@@ -3,11 +3,10 @@ from __future__ import annotations
from typing import Any, Mapping, Optional, Protocol
from langchain.base_language import BaseLanguageModel
from langchain.chains import ReduceDocumentsChain
from langchain.chains.combine_documents.base import BaseCombineDocumentsChain
from langchain.chains.combine_documents.map_reduce import MapReduceDocumentsChain
from langchain.chains.combine_documents.map_rerank import MapRerankDocumentsChain
from langchain.chains.combine_documents.reduce import ReduceDocumentsChain
from langchain.chains.combine_documents.refine import RefineDocumentsChain
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains.llm import LLMChain
@@ -19,6 +18,7 @@ from langchain.chains.qa_with_sources import (
from langchain.chains.question_answering.map_rerank_prompt import (
    PROMPT as MAP_RERANK_PROMPT,
)
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.prompt_template import BasePromptTemplate

@@ -5,7 +5,6 @@ import json
from typing import Any, Callable, List, Optional, Sequence
from langchain import FewShotPromptTemplate, LLMChain
from langchain.base_language import BaseLanguageModel
from langchain.chains.query_constructor.ir import (
    Comparator,
    Operator,
@@ -24,6 +23,7 @@ from langchain.chains.query_constructor.prompt import (
from langchain.chains.query_constructor.schema import AttributeInfo
from langchain.output_parsers.json import parse_and_check_json_markdown
from langchain.schema import BaseOutputParser, BasePromptTemplate, OutputParserException
from langchain.schema.language_model import BaseLanguageModel
class StructuredQueryOutputParser(BaseOutputParser[StructuredQuery]):

@@ -145,6 +145,11 @@ def get_parser(
    Returns:
        Lark parser for the query language.
    """
    # QueryTransformer is None when Lark cannot be imported.
    if QueryTransformer is None:
        raise ImportError(
            "Cannot import lark, please install it with 'pip install lark'."
        )
    transformer = QueryTransformer(
        allowed_comparators=allowed_comparators, allowed_operators=allowed_operators
    )

Some files were not shown because too many files have changed in this diff.