Compare commits


13 Commits

Author SHA1 Message Date
vowelparrot
7d0dd40caf Change NamedTuple to BaseModel 2023-05-23 18:28:23 -07:00
Ayan Bandyopadhyay
5c87dbf5a8 Add link to Psychic from document loaders documentation page (#5115)
# Add link to Psychic from document loaders documentation page

In my previous PR I forgot to update `document_loaders.rst` to link to
`psychic.ipynb` to make it discoverable from the main documentation.
2023-05-23 06:47:23 -07:00
Tian Wei
d7f807b71f Add AzureCognitiveServicesToolkit to call Azure Cognitive Services API (#5012)
# Add AzureCognitiveServicesToolkit to call Azure Cognitive Services
API: achieve some multimodal capabilities

This PR adds a toolkit named AzureCognitiveServicesToolkit which bundles
the following tools:
- AzureCogsImageAnalysisTool: calls Azure Cognitive Services image
analysis API to extract caption, objects, tags, and text from images.
- AzureCogsFormRecognizerTool: calls Azure Cognitive Services form
recognizer API to extract text, tables, and key-value pairs from
documents.
- AzureCogsSpeech2TextTool: calls Azure Cognitive Services speech to
text API to transcribe speech to text.
- AzureCogsText2SpeechTool: calls Azure Cognitive Services text to
speech API to synthesize text to speech.

This toolkit can be used to process image, document, and audio inputs.
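A minimal sketch of wiring the toolkit into an agent, mirroring the example notebook added further down in this diff (assumes the `AZURE_COGS_*` and `OPENAI_API_KEY` environment variables are set):

```python
from langchain import OpenAI
from langchain.agents import AgentType, initialize_agent
from langchain.agents.agent_toolkits import AzureCognitiveServicesToolkit

# Bundle the Azure Cognitive Services tools and hand them to an agent.
toolkit = AzureCognitiveServicesToolkit()
agent = initialize_agent(
    tools=toolkit.get_tools(),
    llm=OpenAI(temperature=0),
    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)
agent.run("Tell me a joke and read it out for me.")
```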
---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-23 06:45:48 -07:00
Jamie Broomall
d4fd589638 WhyLabs callback (#4906)
# Add a WhyLabs callback handler

* Adds a simple WhyLabsCallbackHandler
* Adds required dependencies as optional
* Protects against missing modules with guarded imports
* Adds a basic docs/ecosystem example

based on initial prototype from @andrewelizondo

> this integration gathers privacy-preserving telemetry on text with
whylogs and sends statistical profiles to the WhyLabs platform to monitor
these metrics over time. For more information on what WhyLabs is, see:
https://whylabs.ai

After you run the notebook (if you have env variables set for the API
Keys, org_id and dataset_id) you get something like this in WhyLabs:
![Screenshot
(443)](https://github.com/hwchase17/langchain/assets/88007022/6bdb3e1c-4243-4ae8-b974-23a8bb12edac)

Co-authored-by: Andre Elizondo <andre@whylabs.ai>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-22 20:29:47 -07:00
Eugene Yurtsev
d56313acba Improve efficiency of TextSplitter.split_documents, iterate once (#5111)
# Improve TextSplitter.split_documents, collect page_content and
metadata in one iteration

@eyurtsev In the case where `documents` is a generator that can only be
iterated once, this change is a huge help. Otherwise a silent issue
occurs where metadata ends up empty for all documents. So we widen the
argument type from `List[Document]` to
`Union[Iterable[Document], Sequence[Document]]`
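For illustration, a minimal standalone sketch (not the library code) of why two passes over a generator silently drop metadata:

```python
from typing import Iterable

# Hypothetical stand-in for langchain's Document.
class Doc:
    def __init__(self, content: str, meta: dict):
        self.page_content = content
        self.metadata = meta

def docs() -> Iterable[Doc]:
    yield Doc("hello", {"source": "a"})

gen = docs()
texts = [d.page_content for d in gen]  # first pass consumes the generator
metadatas = [d.metadata for d in gen]  # second pass sees an exhausted iterator
print(texts, metadatas)  # ['hello'] [] -- the metadata is silently lost
```

Collecting `page_content` and `metadata` in a single iteration avoids the empty second pass.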

---------

Co-authored-by: Steven Tartakovsky <tartakovsky.developer@gmail.com>
2023-05-22 23:00:24 -04:00
Jettro Coenradie
b950022894 Fixes issue #5072 - adds additional support to Weaviate (#5085)
# Adds `additional` support to Weaviate queries

Implementation is similar to `search_distance` and `where_filter`.

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-22 18:57:10 -07:00
Zander Chase
87bba2e8d3 Pass Dataset Name by Name not Position (#5108)
Pass dataset name by name
2023-05-23 01:21:39 +00:00
Matt Rickard
de6a401a22 Add OpenLM LLM multi-provider (#4993)
OpenLM is a zero-dependency OpenAI-compatible LLM provider that can call
different inference endpoints directly via HTTP. It implements the
OpenAI Completion class so that it can be used as a drop-in replacement
for the OpenAI API. This changeset utilizes BaseOpenAI for minimal added
code.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-22 18:09:53 -07:00
Gergely Imreh
69de33e024 Add Mastodon toots loader (#5036)
# Add Mastodon toots loader.

The loader works either with public toots or with Mastodon app credentials.
Toot text and user info are loaded.

I've also added an integration test for this new loader, since it works with
public data, and a notebook with example output from a run.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-22 16:43:07 -07:00
mbchang
e173e032bc fix: assign current_time to datetime.now() if current_time is None (#5045)
# Assign `current_time` to `datetime.now()` if `current_time` is None
in `time_weighted_retriever`

Fixes #4825 

As implemented, `add_documents` in `TimeWeightedVectorStoreRetriever`
assigns `doc.metadata["last_accessed_at"]` and
`doc.metadata["created_at"]` to `datetime.datetime.now()` if
`current_time` is not in `kwargs`.
```python
    def add_documents(self, documents: List[Document], **kwargs: Any) -> List[str]:
        """Add documents to vectorstore."""
        current_time = kwargs.get("current_time", datetime.datetime.now())
        # Avoid mutating input documents
        dup_docs = [deepcopy(d) for d in documents]
        for i, doc in enumerate(dup_docs):
            if "last_accessed_at" not in doc.metadata:
                doc.metadata["last_accessed_at"] = current_time
            if "created_at" not in doc.metadata:
                doc.metadata["created_at"] = current_time
            doc.metadata["buffer_idx"] = len(self.memory_stream) + i
        self.memory_stream.extend(dup_docs)
        return self.vectorstore.add_documents(dup_docs, **kwargs)
``` 
However, from the way `add_documents` is being called from
`GenerativeAgentMemory`, `current_time` is set as a `kwarg`, but it is
given a value of `None`:
```python
    def add_memory(
        self, memory_content: str, now: Optional[datetime] = None
    ) -> List[str]:
        """Add an observation or memory to the agent's memory."""
        importance_score = self._score_memory_importance(memory_content)
        self.aggregate_importance += importance_score
        document = Document(
            page_content=memory_content, metadata={"importance": importance_score}
        )
        result = self.memory_retriever.add_documents([document], current_time=now)
```
The default of `now` was set in #4658 to be None. The proposed fix is
the following:
```python
    def add_documents(self, documents: List[Document], **kwargs: Any) -> List[str]:
        """Add documents to vectorstore."""
        current_time = kwargs.get("current_time", datetime.datetime.now())
        # `current_time` may exist in kwargs, but may still have the value of None.
        if current_time is None:
            current_time = datetime.datetime.now()
```
Alternatively, we could just set the default of `now` to be
`datetime.datetime.now()` everywhere instead. Thoughts @hwchase17? If we
still want to keep the default as `None`, then this PR should fix the
above issue. If we want the default to be `datetime.datetime.now()`
instead, I can update this PR with that alternative fix. EDIT: judging
from #5018, it seems we would prefer to keep the default as `None`, in
which case this PR should fix the error.
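Putting the two snippets together, the patched method would look like this (a sketch assembled from the code above, not a verbatim copy of the final diff):

```python
# Method on TimeWeightedVectorStoreRetriever; imports as in the module
# (datetime, deepcopy, Document, List, Any).
def add_documents(self, documents: List[Document], **kwargs: Any) -> List[str]:
    """Add documents to vectorstore."""
    current_time = kwargs.get("current_time", datetime.datetime.now())
    # `current_time` may exist in kwargs, but may still have the value of None.
    if current_time is None:
        current_time = datetime.datetime.now()
    # Avoid mutating input documents
    dup_docs = [deepcopy(d) for d in documents]
    for i, doc in enumerate(dup_docs):
        if "last_accessed_at" not in doc.metadata:
            doc.metadata["last_accessed_at"] = current_time
        if "created_at" not in doc.metadata:
            doc.metadata["created_at"] = current_time
        doc.metadata["buffer_idx"] = len(self.memory_stream) + i
    self.memory_stream.extend(dup_docs)
    return self.vectorstore.add_documents(dup_docs, **kwargs)
```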
2023-05-22 15:47:03 -07:00
Leonid Ganeline
c28cc0f1ac changed ValueError to ImportError (#5103)
# changed ValueError to ImportError

Code cleanup.
Fixed inconsistencies in missing-dependency error handling: sometimes it raised
ImportError and sometimes ValueError. I've changed all such cases to raise
`ImportError`.
Also:
- added installation instructions to error messages where they were missing;
- fixed several installation instructions in the error messages;
- fixed several error-handling blocks related to ImportError
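The resulting pattern, repeated throughout the diffs below:

```python
try:
    import pandas as pd  # any optional dependency
except ImportError:
    raise ImportError(
        "pandas package not found, please install with `pip install pandas`"
    )
```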
2023-05-22 15:24:45 -07:00
venetisgr
5e47c648ed Update serpapi.py (#4947)
Added a link option in `_process_response`.

In `_process_response`, `snippet` could provide non-working links in cases
where `link` held the correct answer, so an `elif` statement for `link` was
added before the `snippet` check.

@vowelparrot
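A sketch of the likely shape of the change; the exact structure of `_process_response` and its key names are assumptions here:

```python
def _process_response(res: dict) -> str:
    """Hypothetical excerpt; the real method handles many more response shapes."""
    organic = (res.get("organic_results") or [{}])[0]
    if "link" in organic:  # added branch: prefer the working link
        return organic["link"]
    elif "snippet" in organic:  # previously ran first and could return dead links
        return organic["snippet"]
    return "No good search result found"
```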

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-22 13:34:36 -07:00
Ankit Arya
5b2b436fab Fixed import error for AutoGPT e.g. from langchain.experimental.auton… (#5101)
`from langchain.experimental.autonomous_agents.autogpt.agent import
AutoGPT` results in an import error, as `AutoGPT` is not defined in the
`__init__.py` file.

https://python.langchain.com/en/latest/use_cases/autonomous_agents/marathon_times.html

An alternative would be to update the import statement directly to
`from langchain.experimental import AutoGPT`.
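A sketch of the likely fix, re-exporting the class from the package `__init__.py` (file path assumed):

```python
# langchain/experimental/autonomous_agents/autogpt/__init__.py (assumed path)
from langchain.experimental.autonomous_agents.autogpt.agent import AutoGPT

__all__ = ["AutoGPT"]
```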

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-22 13:26:25 -07:00
94 changed files with 2554 additions and 236 deletions

View File

@@ -1,17 +0,0 @@
Additional Resources
==========================
.. toctree::
    :maxdepth: 1
    :caption: Additional Resources
    :name: resources

    LangChainHub <https://github.com/hwchase17/langchain-hub>
    getting_started/tutorials.md
    Gallery <https://github.com/kyrolabs/awesome-langchain>
    ./additional_resources/deployments.md
    ./additional_resources/tracing.md
    ./additional_resources/model_laboratory.ipynb
    Discord <https://discord.gg/6adMQxSpJS>
    ./additional_resources/youtube.md
    Production Support <https://forms.gle/57d8AmXBYp8PP8tZA>

View File

@@ -1,11 +0,0 @@
Ecosystem
==========================
.. toctree::
    :maxdepth: 1
    :glob:
    :caption: Ecosystem
    :name: ecosystem

    ./integrations.rst
    ./dependents.md

View File

@@ -1,10 +0,0 @@
Get Started
==========================
.. toctree::
    :maxdepth: 1
    :caption: Get Started
    :name: get_started

    getting_started/installation.md
    getting_started/getting_started.md

View File

@@ -1,28 +1,202 @@
Introduction
Welcome to LangChain
==========================
Overview
----------------
| **LangChain** is a framework for developing applications powered by language models. We believe that the most powerful and differentiated applications will not only call out to a language model, but will also be:
1. **Data-aware**: Connect a language model to other sources of data
2. **Agentic**: Allow a language model to interact with its environment
1. *Data-aware*: connect a language model to other sources of data
2. *Agentic*: allow a language model to interact with its environment
| The LangChain framework is designed around these principles.
.. note::
    This is the Python specific portion of the documentation. See here for the `Conceptual documentation <https://docs.langchain.com/docs/>`_ and here for the `JavaScript documentation <https://js.langchain.com/docs/>`_.
    | This is the Python specific portion of the documentation. For a purely conceptual guide to LangChain, see `here <https://docs.langchain.com/docs/>`_. For the JavaScript documentation, see `here <https://js.langchain.com/docs/>`_.
Getting Started
----------------
| How to get started using LangChain to create a Language Model application.
- `Quickstart Guide <./getting_started/getting_started.html>`_
| Concepts and terminology.
- `Concepts and terminology <./getting_started/concepts.html>`_
| Tutorials created by community experts and presented on YouTube.
- `Tutorials <./getting_started/tutorials.html>`_
.. toctree::
    :maxdepth: 2
    :caption: Getting Started
    :name: getting_started
    :hidden:

    getting_started/getting_started.md
    getting_started/concepts.md
    getting_started/tutorials.md
Modules
-----------
| These modules are the core abstractions which we view as the building blocks of any LLM-powered application.
For each module LangChain provides standard, extendable interfaces. LangChain also provides external integrations and even end-to-end implementations for off-the-shelf use.
| The docs for each module contain quickstart examples, how-to guides, reference docs, and conceptual guides.
| The modules are (from least to most complex):
- `Models <./modules/models.html>`_: Supported model types and integrations.
- `Prompts <./modules/prompts.html>`_: Prompt management, optimization, and serialization.
- `Memory <./modules/memory.html>`_: Memory refers to state that is persisted between calls of a chain/agent.
- `Indexes <./modules/indexes.html>`_: Language models become much more powerful when combined with application-specific data - this module contains interfaces and integrations for loading, querying and updating external data.
- `Chains <./modules/chains.html>`_: Chains are structured sequences of calls (to an LLM or to a different utility).
- `Agents <./modules/agents.html>`_: An agent is a Chain in which an LLM, given a high-level directive and a set of tools, repeatedly decides an action, executes the action and observes the outcome until the high-level directive is complete.
- `Callbacks <./modules/callbacks/getting_started.html>`_: Callbacks let you log and stream the intermediate steps of any chain, making it easy to observe, debug, and evaluate the internals of an application.
.. toctree::
    :maxdepth: 1
    :caption: LangChain
    :name: langchain
    :caption: Modules
    :name: modules
    :hidden:

    ./get_started.rst
    ./modules.rst
    ./ecosystem.rst
    ./modules/models.rst
    ./modules/prompts.rst
    ./modules/memory.md
    ./modules/indexes.md
    ./modules/chains.md
    ./modules/agents.md
    ./modules/callbacks/getting_started.ipynb
Use Cases
----------
| Best practices and built-in implementations for common LangChain use cases:
- `Autonomous Agents <./use_cases/autonomous_agents.html>`_: Autonomous agents are long-running agents that take many steps in an attempt to accomplish an objective. Examples include AutoGPT and BabyAGI.
- `Agent Simulations <./use_cases/agent_simulations.html>`_: Putting agents in a sandbox and observing how they interact with each other and react to events can be an effective way to evaluate their long-range reasoning and planning abilities.
- `Personal Assistants <./use_cases/personal_assistants.html>`_: One of the primary LangChain use cases. Personal assistants need to take actions, remember interactions, and have knowledge about your data.
- `Question Answering <./use_cases/question_answering.html>`_: Another common LangChain use case. Answering questions over specific documents, only utilizing the information in those documents to construct an answer.
- `Chatbots <./use_cases/chatbots.html>`_: Language models love to chat, making this a very natural use of them.
- `Querying Tabular Data <./use_cases/tabular.html>`_: Recommended reading if you want to use language models to query structured data (CSVs, SQL, dataframes, etc).
- `Code Understanding <./use_cases/code.html>`_: Recommended reading if you want to use language models to analyze code.
- `Interacting with APIs <./use_cases/apis.html>`_: Enabling language models to interact with APIs is extremely powerful. It gives them access to up-to-date information and allows them to take actions.
- `Extraction <./use_cases/extraction.html>`_: Extract structured information from text.
- `Summarization <./use_cases/summarization.html>`_: Compressing longer documents. A type of Data-Augmented Generation.
- `Evaluation <./use_cases/evaluation.html>`_: Generative models are hard to evaluate with traditional metrics. One promising approach is to use language models themselves to do the evaluation.
.. toctree::
    :maxdepth: 1
    :caption: Use Cases
    :name: use_cases
    :hidden:

    ./use_cases/autonomous_agents.md
    ./use_cases/agent_simulations.md
    ./use_cases/personal_assistants.md
    ./use_cases/question_answering.md
    ./use_cases/chatbots.md
    ./use_cases/tabular.rst
    ./use_cases/code.md
    ./use_cases/apis.md
    ./use_cases/extraction.md
    ./use_cases/summarization.md
    ./use_cases/evaluation.rst
Reference Docs
---------------
| Full documentation on all methods, classes, installation methods, and integration setups for LangChain.
- `LangChain Installation <./reference/installation.html>`_
- `Reference Documentation <./reference.html>`_
.. toctree::
    :maxdepth: 1
    :caption: Reference
    :name: reference
    :hidden:

    ./reference/installation.md
    ./reference.rst
    ./additional_resources.rst
Ecosystem
------------
| LangChain integrates a lot of different LLMs, systems, and products.
| From the other side, many systems and products depend on LangChain.
| It creates a vibrant and thriving ecosystem.
- `Integrations <./integrations.html>`_: Guides for how other products can be used with LangChain.
- `Dependents <./dependents.html>`_: List of repositories that use LangChain.
- `Deployments <./ecosystem/deployments.html>`_: A collection of instructions, code snippets, and template repositories for deploying LangChain apps.
.. toctree::
    :maxdepth: 2
    :glob:
    :caption: Ecosystem
    :name: ecosystem
    :hidden:

    ./integrations.rst
    ./dependents.md
    ./ecosystem/deployments.md
Additional Resources
---------------------
| Additional resources we think may be useful as you develop your application!
- `LangChainHub <https://github.com/hwchase17/langchain-hub>`_: The LangChainHub is a place to share and explore other prompts, chains, and agents.
- `Gallery <https://github.com/kyrolabs/awesome-langchain>`_: A collection of great projects that use LangChain, compiled by the folks at `Kyrolabs <https://kyrolabs.com>`_. Useful for finding inspiration and example implementations.
- `Tracing <./additional_resources/tracing.html>`_: A guide on using tracing in LangChain to visualize the execution of chains and agents.
- `Model Laboratory <./additional_resources/model_laboratory.html>`_: Experimenting with different prompts, models, and chains is a big part of developing the best possible application. The ModelLaboratory makes it easy to do so.
- `Discord <https://discord.gg/6adMQxSpJS>`_: Join us on our Discord to discuss all things LangChain!
- `YouTube <./additional_resources/youtube.html>`_: A collection of the LangChain tutorials and videos.
- `Production Support <https://forms.gle/57d8AmXBYp8PP8tZA>`_: As you move your LangChains into production, we'd love to offer more comprehensive support. Please fill out this form and we'll set up a dedicated support Slack channel.
.. toctree::
    :maxdepth: 1
    :caption: Additional Resources
    :name: resources
    :hidden:

    LangChainHub <https://github.com/hwchase17/langchain-hub>
    Gallery <https://github.com/kyrolabs/awesome-langchain>
    ./additional_resources/tracing.md
    ./additional_resources/model_laboratory.ipynb
    Discord <https://discord.gg/6adMQxSpJS>
    ./additional_resources/youtube.md
    Production Support <https://forms.gle/57d8AmXBYp8PP8tZA>

View File

@@ -0,0 +1,134 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# WhyLabs Integration\n",
"\n",
"Enable observability to detect inputs and LLM issues faster, deliver continuous improvements, and avoid costly incidents."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install langkit -q"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Make sure to set the required API keys and config required to send telemetry to WhyLabs:\n",
"* WhyLabs API Key: https://whylabs.ai/whylabs-free-sign-up\n",
"* Org and Dataset [https://docs.whylabs.ai/docs/whylabs-onboarding](https://docs.whylabs.ai/docs/whylabs-onboarding#upload-a-profile-to-a-whylabs-project)\n",
"* OpenAI: https://platform.openai.com/account/api-keys\n",
"\n",
"Then you can set them like this:\n",
"\n",
"```python\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"\"\n",
"os.environ[\"WHYLABS_DEFAULT_ORG_ID\"] = \"\"\n",
"os.environ[\"WHYLABS_DEFAULT_DATASET_ID\"] = \"\"\n",
"os.environ[\"WHYLABS_API_KEY\"] = \"\"\n",
"```\n",
"> *Note*: the callback supports directly passing in these variables to the callback, when no auth is directly passed in it will default to the environment. Passing in auth directly allows for writing profiles to multiple projects or organizations in WhyLabs.\n",
"\n",
"Here's a single LLM integration with OpenAI, which will log various out of the box metrics and send telemetry to WhyLabs for monitoring."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"generations=[[Generation(text=\"\\n\\nMy name is John and I'm excited to learn more about programming.\", generation_info={'finish_reason': 'stop', 'logprobs': None})]] llm_output={'token_usage': {'total_tokens': 20, 'prompt_tokens': 4, 'completion_tokens': 16}, 'model_name': 'text-davinci-003'}\n"
]
}
],
"source": [
"from langchain.llms import OpenAI\n",
"from langchain.callbacks import WhyLabsCallbackHandler\n",
"\n",
"whylabs = WhyLabsCallbackHandler.from_params()\n",
"llm = OpenAI(temperature=0, callbacks=[whylabs])\n",
"\n",
"result = llm.generate([\"Hello, World!\"])\n",
"print(result)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"generations=[[Generation(text='\\n\\n1. 123-45-6789\\n2. 987-65-4321\\n3. 456-78-9012', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\\n\\n1. johndoe@example.com\\n2. janesmith@example.com\\n3. johnsmith@example.com', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\\n\\n1. 123 Main Street, Anytown, USA 12345\\n2. 456 Elm Street, Nowhere, USA 54321\\n3. 789 Pine Avenue, Somewhere, USA 98765', generation_info={'finish_reason': 'stop', 'logprobs': None})]] llm_output={'token_usage': {'total_tokens': 137, 'prompt_tokens': 33, 'completion_tokens': 104}, 'model_name': 'text-davinci-003'}\n"
]
}
],
"source": [
"result = llm.generate(\n",
" [\n",
" \"Can you give me 3 SSNs so I can understand the format?\",\n",
" \"Can you give me 3 fake email addresses?\",\n",
" \"Can you give me 3 fake US mailing addresses?\",\n",
" ]\n",
")\n",
"print(result)\n",
"# you don't need to call flush, this will occur periodically, but to demo let's not wait.\n",
"whylabs.flush()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"whylabs.close()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.11.2 64-bit",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "b0fa6594d8f4cbf19f97940f81e996739fb7646882a419484c72d19e05852a7e"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -1,15 +0,0 @@
Modules
==========================
.. toctree::
    :maxdepth: 1
    :caption: Modules
    :name: modules

    ./modules/models.rst
    ./modules/prompts.rst
    ./modules/indexes.md
    ./modules/memory.md
    ./modules/chains.md
    ./modules/agents.md
    ./modules/callbacks/getting_started.ipynb

View File

@@ -0,0 +1,270 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Azure Cognitive Services Toolkit\n",
"\n",
"This toolkit is used to interact with the Azure Cognitive Services API to achieve some multimodal capabilities.\n",
"\n",
"Currently There are four tools bundled in this toolkit:\n",
"- AzureCogsImageAnalysisTool: used to extract caption, objects, tags, and text from images. (Note: this tool is not available on Mac OS yet, due to the dependency on `azure-ai-vision` package, which is only supported on Windows and Linux currently.)\n",
"- AzureCogsFormRecognizerTool: used to extract text, tables, and key-value pairs from documents.\n",
"- AzureCogsSpeech2TextTool: used to transcribe speech to text.\n",
"- AzureCogsText2SpeechTool: used to synthesize text to speech."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, you need to set up an Azure account and create a Cognitive Services resource. You can follow the instructions [here](https://docs.microsoft.com/en-us/azure/cognitive-services/cognitive-services-apis-create-account?tabs=multiservice%2Cwindows) to create a resource. \n",
"\n",
"Then, you need to get the endpoint, key and region of your resource, and set them as environment variables. You can find them in the \"Keys and Endpoint\" page of your resource."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# !pip install --upgrade azure-ai-formrecognizer > /dev/null\n",
"# !pip install --upgrade azure-cognitiveservices-speech > /dev/null\n",
"\n",
"# For Windows/Linux\n",
"# !pip install --upgrade azure-ai-vision > /dev/null"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"sk-\"\n",
"os.environ[\"AZURE_COGS_KEY\"] = \"\"\n",
"os.environ[\"AZURE_COGS_ENDPOINT\"] = \"\"\n",
"os.environ[\"AZURE_COGS_REGION\"] = \"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create the Toolkit"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents.agent_toolkits import AzureCognitiveServicesToolkit\n",
"\n",
"toolkit = AzureCognitiveServicesToolkit()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['Azure Cognitive Services Image Analysis',\n",
" 'Azure Cognitive Services Form Recognizer',\n",
" 'Azure Cognitive Services Speech2Text',\n",
" 'Azure Cognitive Services Text2Speech']"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[tool.name for tool in toolkit.get_tools()]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Use within an Agent"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain import OpenAI\n",
"from langchain.agents import initialize_agent, AgentType"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"agent = initialize_agent(\n",
" tools=toolkit.get_tools(),\n",
" llm=llm,\n",
" agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,\n",
" verbose=True,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"Azure Cognitive Services Image Analysis\",\n",
" \"action_input\": \"https://images.openai.com/blob/9ad5a2ab-041f-475f-ad6a-b51899c50182/ingredients.png\"\n",
"}\n",
"```\n",
"\n",
"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mCaption: a group of eggs and flour in bowls\n",
"Objects: Egg, Egg, Food\n",
"Tags: dairy, ingredient, indoor, thickening agent, food, mixing bowl, powder, flour, egg, bowl\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I can use the objects and tags to suggest recipes\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"Final Answer\",\n",
" \"action_input\": \"You can make pancakes, omelettes, or quiches with these ingredients!\"\n",
"}\n",
"```\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'You can make pancakes, omelettes, or quiches with these ingredients!'"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"What can I make with these ingredients?\"\n",
" \"https://images.openai.com/blob/9ad5a2ab-041f-475f-ad6a-b51899c50182/ingredients.png\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mAction:\n",
"```\n",
"{\n",
" \"action\": \"Azure Cognitive Services Text2Speech\",\n",
" \"action_input\": \"Why did the chicken cross the playground? To get to the other slide!\"\n",
"}\n",
"```\n",
"\n",
"\u001b[0m\n",
"Observation: \u001b[31;1m\u001b[1;3m/tmp/tmpa3uu_j6b.wav\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I have the audio file of the joke\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"Final Answer\",\n",
" \"action_input\": \"/tmp/tmpa3uu_j6b.wav\"\n",
"}\n",
"```\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'/tmp/tmpa3uu_j6b.wav'"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"audio_file = agent.run(\"Tell me a joke and read it out for me.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython import display\n",
"\n",
"audio = display.Audio(audio_file)\n",
"display.display(audio)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -123,6 +123,7 @@ We need access tokens and sometime other parameters to get access to these datas
    ./document_loaders/examples/notiondb.ipynb
    ./document_loaders/examples/notion.ipynb
    ./document_loaders/examples/obsidian.ipynb
    ./document_loaders/examples/psychic.ipynb
    ./document_loaders/examples/readthedocs_documentation.ipynb
    ./document_loaders/examples/reddit.ipynb
    ./document_loaders/examples/roam.ipynb

View File

@@ -0,0 +1,126 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "66a7777e",
"metadata": {},
"source": [
"# Mastodon\n",
"\n",
">[Mastodon](https://joinmastodon.org/) is a federated social media and social networking service.\n",
"\n",
"This loader fetches the text from the \"toots\" of a list of `Mastodon` accounts, using the `Mastodon.py` Python package.\n",
"\n",
"Public accounts can the queried by default without any authentication. If non-public accounts or instances are queried, you have to register an application for your account which gets you an access token, and set that token and your account's API base URL.\n",
"\n",
"Then you need to pass in the Mastodon account names you want to extract, in the `@account@instance` format."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9ec8a3b3",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import MastodonTootsLoader"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "43128d8d",
"metadata": {},
"outputs": [],
"source": [
"#!pip install Mastodon.py"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "35d6809a",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"loader = MastodonTootsLoader(\n",
" mastodon_accounts=[\"@Gargron@mastodon.social\"],\n",
" number_toots=50, # Default value is 100\n",
")\n",
"\n",
"# Or set up access information to use a Mastodon app.\n",
"# Note that the access token can either be passed into\n",
"# constructor or you can set the envirovnment \"MASTODON_ACCESS_TOKEN\".\n",
"# loader = MastodonTootsLoader(\n",
"# access_token=\"<ACCESS TOKEN OF MASTODON APP>\",\n",
"# api_base_url=\"<API BASE URL OF MASTODON APP INSTANCE>\",\n",
"# mastodon_accounts=[\"@Gargron@mastodon.social\"],\n",
"# number_toots=50, # Default value is 100\n",
"# )"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "05fe33b9",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<p>It is tough to leave this behind and go back to reality. And some people live here! Im sure there are downsides but it sounds pretty good to me right now.</p>\n",
"================================================================================\n",
"<p>I wish we could stay here a little longer, but it is time to go home 🥲</p>\n",
"================================================================================\n",
"<p>Last day of the honeymoon. And its <a href=\"https://mastodon.social/tags/caturday\" class=\"mention hashtag\" rel=\"tag\">#<span>caturday</span></a>! This cute tabby came to the restaurant to beg for food and got some chicken.</p>\n",
"================================================================================\n"
]
}
],
"source": [
"documents = loader.load()\n",
"for doc in documents[:3]:\n",
" print(doc.page_content)\n",
" print(\"=\" * 80)"
]
},
{
"cell_type": "markdown",
"id": "322bb6a1",
"metadata": {},
"source": [
"The toot texts (the documents' `page_content`) is by default HTML as returned by the Mastodon API."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,133 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# OpenLM\n",
"[OpenLM](https://github.com/r2d4/openlm) is a zero-dependency OpenAI-compatible LLM provider that can call different inference endpoints directly via HTTP. \n",
"\n",
"\n",
"It implements the OpenAI Completion class so that it can be used as a drop-in replacement for the OpenAI API. This changeset utilizes BaseOpenAI for minimal added code.\n",
"\n",
"This examples goes over how to use LangChain to interact with both OpenAI and HuggingFace. You'll need API keys from both."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setup\n",
"Install dependencies and set API keys."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# Uncomment to install openlm and openai if you haven't already\n",
"\n",
"# !pip install openlm\n",
"# !pip install openai"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from getpass import getpass\n",
"import os\n",
"import subprocess\n",
"\n",
"\n",
"# Check if OPENAI_API_KEY environment variable is set\n",
"if \"OPENAI_API_KEY\" not in os.environ:\n",
" print(\"Enter your OpenAI API key:\")\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass()\n",
"\n",
"# Check if HF_API_TOKEN environment variable is set\n",
"if \"HF_API_TOKEN\" not in os.environ:\n",
" print(\"Enter your HuggingFace Hub API key:\")\n",
" os.environ[\"HF_API_TOKEN\"] = getpass()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using LangChain with OpenLM\n",
"\n",
"Here we're going to call two models in an LLMChain, `text-davinci-003` from OpenAI and `gpt2` on HuggingFace."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import OpenLM\n",
"from langchain import PromptTemplate, LLMChain"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: text-davinci-003\n",
"Result: France is a country in Europe. The capital of France is Paris.\n",
"Model: huggingface.co/gpt2\n",
"Result: Question: What is the capital of France?\n",
"\n",
"Answer: Let's think step by step. I am not going to lie, this is a complicated issue, and I don't see any solutions to all this, but it is still far more\n"
]
}
],
"source": [
"question = \"What is the capital of France?\"\n",
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n",
"\n",
"for model in [\"text-davinci-003\", \"huggingface.co/gpt2\"]:\n",
" llm = OpenLM(model=model)\n",
" llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
" result = llm_chain.run(question)\n",
" print(\"\"\"Model: {}\n",
"Result: {}\"\"\".format(model, result))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -9,8 +9,8 @@ API References
    ./reference/models.rst
    ./reference/prompts.rst
    ./reference/modules/memory.rst
    ./reference/indexes.rst
    ./reference/modules/memory.rst
    ./reference/modules/chains.rst
    ./reference/agents.rst
    ./reference/modules/utilities.rst

View File

@@ -1,19 +0,0 @@
Use Cases
==========================
.. toctree::
    :maxdepth: 1
    :caption: Use Cases
    :name: use_cases

    ./use_cases/autonomous_agents.md
    ./use_cases/personal_assistants.md
    ./use_cases/question_answering.md
    ./use_cases/chatbots.md
    ./use_cases/tabular.rst
    ./use_cases/apis.md
    ./use_cases/code.md
    ./use_cases/extraction.md
    ./use_cases/summarization.md
    ./use_cases/evaluation.rst
    ./use_cases/agent_simulations.md

View File

@@ -1,5 +1,8 @@
"""Agent toolkits."""
from langchain.agents.agent_toolkits.azure_cognitive_services.toolkit import (
    AzureCognitiveServicesToolkit,
)
from langchain.agents.agent_toolkits.csv.base import create_csv_agent
from langchain.agents.agent_toolkits.file_management.toolkit import (
    FileManagementToolkit,
@@ -60,4 +63,5 @@ __all__ = [
    "JiraToolkit",
    "FileManagementToolkit",
    "PlayWrightBrowserToolkit",
    "AzureCognitiveServicesToolkit",
]

View File

@@ -0,0 +1,7 @@
"""Azure Cognitive Services Toolkit."""
from langchain.agents.agent_toolkits.azure_cognitive_services.toolkit import (
    AzureCognitiveServicesToolkit,
)

__all__ = ["AzureCognitiveServicesToolkit"]

View File

@@ -0,0 +1,31 @@
from __future__ import annotations

import sys
from typing import List

from langchain.agents.agent_toolkits.base import BaseToolkit
from langchain.tools.azure_cognitive_services import (
    AzureCogsFormRecognizerTool,
    AzureCogsImageAnalysisTool,
    AzureCogsSpeech2TextTool,
    AzureCogsText2SpeechTool,
)
from langchain.tools.base import BaseTool


class AzureCognitiveServicesToolkit(BaseToolkit):
    """Toolkit for Azure Cognitive Services."""

    def get_tools(self) -> List[BaseTool]:
        """Get the tools in the toolkit."""
        tools = [
            AzureCogsFormRecognizerTool(),
            AzureCogsSpeech2TextTool(),
            AzureCogsText2SpeechTool(),
        ]
        # TODO: Remove check once azure-ai-vision supports MacOS.
        if sys.platform.startswith("linux") or sys.platform.startswith("win"):
            tools.append(AzureCogsImageAnalysisTool())
        return tools

View File

@@ -34,7 +34,7 @@ def create_pandas_dataframe_agent(
try:
import pandas as pd
except ImportError:
raise ValueError(
raise ImportError(
"pandas package not found, please install with `pip install pandas`"
)

View File

@@ -313,7 +313,7 @@ class GPTCache(BaseCache):
try:
import gptcache # noqa: F401
except ImportError:
raise ValueError(
raise ImportError(
"Could not import gptcache python package. "
"Please install it with `pip install gptcache`."
)

View File

@@ -12,6 +12,7 @@ from langchain.callbacks.openai_info import OpenAICallbackHandler
from langchain.callbacks.stdout import StdOutCallbackHandler
from langchain.callbacks.streaming_aiter import AsyncIteratorCallbackHandler
from langchain.callbacks.wandb_callback import WandbCallbackHandler
from langchain.callbacks.whylabs_callback import WhyLabsCallbackHandler
__all__ = [
    "OpenAICallbackHandler",
@@ -21,6 +22,7 @@ __all__ = [
    "MlflowCallbackHandler",
    "ClearMLCallbackHandler",
    "CometCallbackHandler",
    "WhyLabsCallbackHandler",
    "AsyncIteratorCallbackHandler",
    "get_openai_callback",
    "tracing_enabled",

View File

@@ -0,0 +1,203 @@
from __future__ import annotations

import logging
from typing import TYPE_CHECKING, Any, Dict, List, Optional, Union

from langchain.callbacks.base import BaseCallbackHandler
from langchain.schema import AgentAction, AgentFinish, Generation, LLMResult
from langchain.utils import get_from_env

if TYPE_CHECKING:
    from whylogs.api.logger.logger import Logger

diagnostic_logger = logging.getLogger(__name__)


def import_langkit(
    sentiment: bool = False,
    toxicity: bool = False,
    themes: bool = False,
) -> Any:
    try:
        import langkit  # noqa: F401
        import langkit.regexes  # noqa: F401
        import langkit.textstat  # noqa: F401

        if sentiment:
            import langkit.sentiment  # noqa: F401
        if toxicity:
            import langkit.toxicity  # noqa: F401
        if themes:
            import langkit.themes  # noqa: F401
    except ImportError:
        raise ImportError(
            "To use the whylabs callback manager you need to have the `langkit` python "
            "package installed. Please install it with `pip install langkit`."
        )
    return langkit


class WhyLabsCallbackHandler(BaseCallbackHandler):
    """WhyLabs CallbackHandler."""

    def __init__(self, logger: Logger):
        """Initiate the rolling logger"""
        super().__init__()
        self.logger = logger
        diagnostic_logger.info(
            "Initialized WhyLabs callback handler with configured whylogs Logger."
        )

    def _profile_generations(self, generations: List[Generation]) -> None:
        for gen in generations:
            self.logger.log({"response": gen.text})

    def on_llm_start(
        self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
    ) -> None:
        """Pass the input prompts to the logger"""
        for prompt in prompts:
            self.logger.log({"prompt": prompt})

    def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
        """Pass the generated response to the logger."""
        for generations in response.generations:
            self._profile_generations(generations)

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        """Do nothing."""
        pass

    def on_llm_error(
        self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
    ) -> None:
        """Do nothing."""
        pass

    def on_chain_start(
        self, serialized: Dict[str, Any], inputs: Dict[str, Any], **kwargs: Any
    ) -> None:
        """Do nothing."""

    def on_chain_end(self, outputs: Dict[str, Any], **kwargs: Any) -> None:
        """Do nothing."""

    def on_chain_error(
        self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
    ) -> None:
        """Do nothing."""
        pass

    def on_tool_start(
        self,
        serialized: Dict[str, Any],
        input_str: str,
        **kwargs: Any,
    ) -> None:
        """Do nothing."""

    def on_agent_action(
        self, action: AgentAction, color: Optional[str] = None, **kwargs: Any
    ) -> Any:
        """Do nothing."""

    def on_tool_end(
        self,
        output: str,
        color: Optional[str] = None,
        observation_prefix: Optional[str] = None,
        llm_prefix: Optional[str] = None,
        **kwargs: Any,
    ) -> None:
        """Do nothing."""

    def on_tool_error(
        self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
    ) -> None:
        """Do nothing."""
        pass

    def on_text(self, text: str, **kwargs: Any) -> None:
        """Do nothing."""

    def on_agent_finish(
        self, finish: AgentFinish, color: Optional[str] = None, **kwargs: Any
    ) -> None:
        """Run on agent end."""
        pass

    def flush(self) -> None:
        self.logger._do_rollover()
        diagnostic_logger.info("Flushing WhyLabs logger, writing profile...")

    def close(self) -> None:
        self.logger.close()
        diagnostic_logger.info("Closing WhyLabs logger, see you next time!")

    def __enter__(self) -> WhyLabsCallbackHandler:
        return self

    def __exit__(
        self, exception_type: Any, exception_value: Any, traceback: Any
    ) -> None:
        self.close()

    @classmethod
    def from_params(
        cls,
        *,
        api_key: Optional[str] = None,
        org_id: Optional[str] = None,
        dataset_id: Optional[str] = None,
        sentiment: bool = False,
        toxicity: bool = False,
        themes: bool = False,
    ) -> Logger:
        """Instantiate whylogs Logger from params.

        Args:
            api_key (Optional[str]): WhyLabs API key. Optional because the preferred
                way to specify the API key is with environment variable
                WHYLABS_API_KEY.
            org_id (Optional[str]): WhyLabs organization id to write profiles to.
                If not set must be specified in environment variable
                WHYLABS_DEFAULT_ORG_ID.
            dataset_id (Optional[str]): The model or dataset this callback is gathering
                telemetry for. If not set must be specified in environment variable
                WHYLABS_DEFAULT_DATASET_ID.
            sentiment (bool): If True will initialize a model to perform
                sentiment analysis compound score. Defaults to False and will not gather
                this metric.
            toxicity (bool): If True will initialize a model to score
                toxicity. Defaults to False and will not gather this metric.
            themes (bool): If True will initialize a model to calculate
                distance to configured themes. Defaults to None and will not gather this
                metric.
        """
        # langkit library will import necessary whylogs libraries
        import_langkit(sentiment=sentiment, toxicity=toxicity, themes=themes)

        import whylogs as why
        from whylogs.api.writer.whylabs import WhyLabsWriter
        from whylogs.core.schema import DeclarativeSchema
        from whylogs.experimental.core.metrics.udf_metric import generate_udf_schema

        api_key = api_key or get_from_env("api_key", "WHYLABS_API_KEY")
        org_id = org_id or get_from_env("org_id", "WHYLABS_DEFAULT_ORG_ID")
        dataset_id = dataset_id or get_from_env(
            "dataset_id", "WHYLABS_DEFAULT_DATASET_ID"
        )
        whylabs_writer = WhyLabsWriter(
            api_key=api_key, org_id=org_id, dataset_id=dataset_id
        )
        langkit_schema = DeclarativeSchema(generate_udf_schema())
        whylabs_logger = why.logger(
            mode="rolling", interval=5, when="M", schema=langkit_schema
        )
        whylabs_logger.append_writer(writer=whylabs_writer)
        diagnostic_logger.info(
            "Started whylogs Logger with WhyLabsWriter and initialized LangKit. 📝"
        )
        return cls(whylabs_logger)
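Since the handler implements `__enter__`/`__exit__`, it can also be used as a context manager. A usage sketch based on the notebook above (requires the WhyLabs and OpenAI environment variables to be set):

```python
from langchain.callbacks import WhyLabsCallbackHandler
from langchain.llms import OpenAI

# Assumes WHYLABS_API_KEY, WHYLABS_DEFAULT_ORG_ID, WHYLABS_DEFAULT_DATASET_ID
# and OPENAI_API_KEY are set in the environment.
with WhyLabsCallbackHandler.from_params() as whylabs:
    llm = OpenAI(temperature=0, callbacks=[whylabs])
    llm.generate(["Hello, World!"])
# close() runs automatically on exit, writing the final profile.
```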

View File

@@ -54,7 +54,7 @@ class OpenAIModerationChain(Chain):
openai.organization = openai_organization
values["client"] = openai.Moderation
except ImportError:
raise ValueError(
raise ImportError(
"Could not import openai python package. "
"Please install it with `pip install openai`."
)

View File

@@ -86,7 +86,7 @@ class AzureChatOpenAI(ChatOpenAI):
if openai_organization:
openai.organization = openai_organization
except ImportError:
raise ValueError(
raise ImportError(
"Could not import openai python package. "
"Please install it with `pip install openai`."
)

View File

@@ -342,7 +342,7 @@ class LangChainPlusClient(BaseSettings):
) -> Example:
"""Create a dataset example in the LangChain+ API."""
if dataset_id is None:
dataset_id = self.read_dataset(dataset_name).id
dataset_id = self.read_dataset(dataset_name=dataset_name).id
data = {
"inputs": inputs,

View File

@@ -48,6 +48,7 @@ from langchain.document_loaders.image_captions import ImageCaptionLoader
from langchain.document_loaders.imsdb import IMSDbLoader
from langchain.document_loaders.json_loader import JSONLoader
from langchain.document_loaders.markdown import UnstructuredMarkdownLoader
from langchain.document_loaders.mastodon import MastodonTootsLoader
from langchain.document_loaders.mediawikidump import MWDumpLoader
from langchain.document_loaders.modern_treasury import ModernTreasuryLoader
from langchain.document_loaders.notebook import NotebookLoader
@@ -160,6 +161,7 @@ __all__ = [
"ImageCaptionLoader",
"JSONLoader",
"MWDumpLoader",
"MastodonTootsLoader",
"MathpixPDFLoader",
"ModernTreasuryLoader",
"NotebookLoader",

View File

@@ -41,7 +41,7 @@ class ApifyDatasetLoader(BaseLoader, BaseModel):
values["apify_client"] = ApifyClient()
except ImportError:
raise ValueError(
raise ImportError(
"Could not import apify-client Python package. "
"Please install it with `pip install apify-client`."
)

View File

@@ -63,7 +63,7 @@ class DocugamiLoader(BaseLoader, BaseModel):
try:
from lxml import etree
except ImportError:
raise ValueError(
raise ImportError(
"Could not import lxml python package. "
"Please install it with `pip install lxml`."
)
@@ -259,7 +259,7 @@ class DocugamiLoader(BaseLoader, BaseModel):
try:
from lxml import etree
except ImportError:
raise ValueError(
raise ImportError(
"Could not import lxml python package. "
"Please install it with `pip install lxml`."
)

View File

@@ -33,7 +33,7 @@ class DuckDBLoader(BaseLoader):
try:
import duckdb
except ImportError:
raise ValueError(
raise ImportError(
"Could not import duckdb python package. "
"Please install it with `pip install duckdb`."
)

View File

@@ -39,7 +39,7 @@ class ImageCaptionLoader(BaseLoader):
try:
from transformers import BlipForConditionalGeneration, BlipProcessor
except ImportError:
raise ValueError(
raise ImportError(
"`transformers` package not found, please install with "
"`pip install transformers`."
)
@@ -66,7 +66,7 @@ class ImageCaptionLoader(BaseLoader):
try:
from PIL import Image
except ImportError:
raise ValueError(
raise ImportError(
"`PIL` package not found, please install with `pip install pillow`"
)

View File

@@ -42,7 +42,7 @@ class JSONLoader(BaseLoader):
try:
import jq # noqa:F401
except ImportError:
raise ValueError(
raise ImportError(
"jq package not found, please install it with `pip install jq`"
)

View File

@@ -0,0 +1,88 @@
"""Mastodon document loader."""
from __future__ import annotations
import os
from typing import TYPE_CHECKING, Any, Dict, Iterable, List, Optional, Sequence
from langchain.docstore.document import Document
from langchain.document_loaders.base import BaseLoader
if TYPE_CHECKING:
import mastodon
def _dependable_mastodon_import() -> mastodon:
try:
import mastodon
except ImportError:
raise ValueError(
"Mastodon.py package not found, "
"please install it with `pip install Mastodon.py`"
)
return mastodon
class MastodonTootsLoader(BaseLoader):
"""Mastodon toots loader."""
def __init__(
self,
mastodon_accounts: Sequence[str],
number_toots: Optional[int] = 100,
exclude_replies: bool = False,
access_token: Optional[str] = None,
api_base_url: str = "https://mastodon.social",
):
"""Instantiate Mastodon toots loader.
Args:
mastodon_accounts: The list of Mastodon accounts to query.
number_toots: How many toots to pull for each account.
exclude_replies: Whether to exclude reply toots from the load.
access_token: An access token if toots are loaded as a Mastodon app. Can
also be specified via the environment variables "MASTODON_ACCESS_TOKEN".
api_base_url: A Mastodon API base URL to talk to, if not using the default.
"""
mastodon = _dependable_mastodon_import()
access_token = access_token or os.environ.get("MASTODON_ACCESS_TOKEN")
self.api = mastodon.Mastodon(
access_token=access_token, api_base_url=api_base_url
)
self.mastodon_accounts = mastodon_accounts
self.number_toots = number_toots
self.exclude_replies = exclude_replies
def load(self) -> List[Document]:
"""Load toots into documents."""
results: List[Document] = []
for account in self.mastodon_accounts:
user = self.api.account_lookup(account)
toots = self.api.account_statuses(
user.id,
only_media=False,
pinned=False,
exclude_replies=self.exclude_replies,
exclude_reblogs=True,
limit=self.number_toots,
)
docs = self._format_toots(toots, user)
results.extend(docs)
return results
def _format_toots(
self, toots: List[Dict[str, Any]], user_info: dict
) -> Iterable[Document]:
"""Format toots into documents.
Adding user info, and selected toot fields into the metadata.
"""
for toot in toots:
metadata = {
"created_at": toot["created_at"],
"user_info": user_info,
"is_reply": toot["in_reply_to_id"] is not None,
}
yield Document(
page_content=toot["content"],
metadata=metadata,
)

View File

@@ -83,7 +83,7 @@ class NotebookLoader(BaseLoader):
try:
import pandas as pd
except ImportError:
raise ValueError(
raise ImportError(
"pandas is needed for Notebook Loader, "
"please install with `pip install pandas`"
)

View File

@@ -77,7 +77,7 @@ class OneDriveLoader(BaseLoader, BaseModel):
try:
from O365 import FileSystemTokenBackend
except ImportError:
raise ValueError(
raise ImportError(
"O365 package not found, please install it with `pip install o365`"
)
if self.auth_with_token:

View File

@@ -103,7 +103,7 @@ class PyPDFLoader(BasePDFLoader):
try:
import pypdf # noqa:F401
except ImportError:
raise ValueError(
raise ImportError(
"pypdf package not found, please install it with " "`pip install pypdf`"
)
self.parser = PyPDFParser()
@@ -194,7 +194,7 @@ class PDFMinerLoader(BasePDFLoader):
try:
from pdfminer.high_level import extract_text # noqa:F401
except ImportError:
raise ValueError(
raise ImportError(
"`pdfminer` package not found, please install it with "
"`pip install pdfminer.six`"
)
@@ -222,7 +222,7 @@ class PDFMinerPDFasHTMLLoader(BasePDFLoader):
try:
from pdfminer.high_level import extract_text_to_fp # noqa:F401
except ImportError:
raise ValueError(
raise ImportError(
"`pdfminer` package not found, please install it with "
"`pip install pdfminer.six`"
)
@@ -256,7 +256,7 @@ class PyMuPDFLoader(BasePDFLoader):
try:
import fitz # noqa:F401
except ImportError:
raise ValueError(
raise ImportError(
"`PyMuPDF` package not found, please install it with "
"`pip install pymupdf`"
)
@@ -375,7 +375,7 @@ class PDFPlumberLoader(BasePDFLoader):
try:
import pdfplumber # noqa:F401
except ImportError:
raise ValueError(
raise ImportError(
"pdfplumber package not found, please install it with "
"`pip install pdfplumber`"
)

View File

@@ -19,9 +19,8 @@ class ReadTheDocsLoader(BaseLoader):
"""Initialize path."""
try:
from bs4 import BeautifulSoup
except ImportError:
raise ValueError(
raise ImportError(
"Could not import python packages. "
"Please install it with `pip install beautifulsoup4`. "
)

View File

@@ -19,7 +19,7 @@ class S3DirectoryLoader(BaseLoader):
try:
import boto3
except ImportError:
raise ValueError(
raise ImportError(
"Could not import boto3 python package. "
"Please install it with `pip install boto3`."
)

View File

@@ -21,7 +21,7 @@ class S3FileLoader(BaseLoader):
try:
import boto3
except ImportError:
raise ValueError(
raise ImportError(
"Could not import `boto3` python package. "
"Please install it with `pip install boto3`."
)

View File

@@ -58,7 +58,7 @@ class SitemapLoader(WebBaseLoader):
try:
import lxml # noqa:F401
except ImportError:
raise ValueError(
raise ImportError(
"lxml package not found, please install it with " "`pip install lxml`"
)
@@ -107,8 +107,9 @@ class SitemapLoader(WebBaseLoader):
try:
import bs4
except ImportError:
raise ValueError(
"bs4 package not found, please install it with " "`pip install bs4`"
raise ImportError(
"beautifulsoup4 package not found, please install it"
" with `pip install beautifulsoup4`"
)
fp = open(self.web_path)
soup = bs4.BeautifulSoup(fp, "xml")

View File

@@ -13,8 +13,8 @@ class SRTLoader(BaseLoader):
try:
import pysrt # noqa:F401
except ImportError:
raise ValueError(
"package `pysrt` not found, please install it with `pysrt`"
raise ImportError(
"package `pysrt` not found, please install it with `pip install pysrt`"
)
self.file_path = file_path

View File

@@ -226,7 +226,7 @@ class TelegramChatApiLoader(BaseLoader):
nest_asyncio.apply()
asyncio.run(self.fetch_data_from_telegram())
except ImportError:
raise ValueError(
raise ImportError(
"""`nest_asyncio` package not found.
please install with `pip install nest_asyncio`
"""
@@ -239,7 +239,7 @@ class TelegramChatApiLoader(BaseLoader):
try:
import pandas as pd
except ImportError:
raise ValueError(
raise ImportError(
"""`pandas` package not found.
please install with `pip install pandas`
"""

View File

@@ -15,7 +15,7 @@ def _dependable_tweepy_import() -> tweepy:
try:
import tweepy
except ImportError:
raise ValueError(
raise ImportError(
"tweepy package not found, please install it with `pip install tweepy`"
)
return tweepy

View File

@@ -30,7 +30,7 @@ class PlaywrightURLLoader(BaseLoader):
try:
import playwright # noqa:F401
except ImportError:
raise ValueError(
raise ImportError(
"playwright package not found, please install it with "
"`pip install playwright`"
)

View File

@@ -40,7 +40,7 @@ class SeleniumURLLoader(BaseLoader):
try:
import selenium # noqa:F401
except ImportError:
raise ValueError(
raise ImportError(
"selenium package not found, please install it with "
"`pip install selenium`"
)
@@ -48,7 +48,7 @@ class SeleniumURLLoader(BaseLoader):
try:
import unstructured # noqa:F401
except ImportError:
raise ValueError(
raise ImportError(
"unstructured package not found, please install it with "
"`pip install unstructured`"
)

View File

@@ -48,7 +48,7 @@ class CohereEmbeddings(BaseModel, Embeddings):
values["client"] = cohere.Client(cohere_api_key)
except ImportError:
raise ValueError(
raise ImportError(
"Could not import cohere python package. "
"Please install it with `pip install cohere`."
)

View File

@@ -46,7 +46,7 @@ class HuggingFaceEmbeddings(BaseModel, Embeddings):
import sentence_transformers
except ImportError as exc:
raise ValueError(
raise ImportError(
"Could not import sentence_transformers python package. "
"Please install it with `pip install sentence_transformers`."
) from exc

View File

@@ -34,7 +34,7 @@ class JinaEmbeddings(BaseModel, Embeddings):
try:
import jina
except ImportError:
raise ValueError(
raise ImportError(
"Could not import `jina` python package. "
"Please install it with `pip install jina`."
)

View File

@@ -178,7 +178,7 @@ class OpenAIEmbeddings(BaseModel, Embeddings):
openai.api_type = openai_api_type
values["client"] = openai.Embedding
except ImportError:
raise ValueError(
raise ImportError(
"Could not import openai python package. "
"Please install it with `pip install openai`."
)
@@ -192,67 +192,64 @@ class OpenAIEmbeddings(BaseModel, Embeddings):
embeddings: List[List[float]] = [[] for _ in range(len(texts))]
try:
import tiktoken
tokens = []
indices = []
encoding = tiktoken.model.encoding_for_model(self.model)
for i, text in enumerate(texts):
if self.model.endswith("001"):
# See: https://github.com/openai/openai-python/issues/418#issuecomment-1525939500
# replace newlines, which can negatively affect performance.
text = text.replace("\n", " ")
token = encoding.encode(
text,
allowed_special=self.allowed_special,
disallowed_special=self.disallowed_special,
)
for j in range(0, len(token), self.embedding_ctx_length):
tokens += [token[j : j + self.embedding_ctx_length]]
indices += [i]
batched_embeddings = []
_chunk_size = chunk_size or self.chunk_size
for i in range(0, len(tokens), _chunk_size):
response = embed_with_retry(
self,
input=tokens[i : i + _chunk_size],
engine=self.deployment,
request_timeout=self.request_timeout,
headers=self.headers,
)
batched_embeddings += [r["embedding"] for r in response["data"]]
results: List[List[List[float]]] = [[] for _ in range(len(texts))]
num_tokens_in_batch: List[List[int]] = [[] for _ in range(len(texts))]
for i in range(len(indices)):
results[indices[i]].append(batched_embeddings[i])
num_tokens_in_batch[indices[i]].append(len(tokens[i]))
for i in range(len(texts)):
_result = results[i]
if len(_result) == 0:
average = embed_with_retry(
self,
input="",
engine=self.deployment,
request_timeout=self.request_timeout,
headers=self.headers,
)["data"][0]["embedding"]
else:
average = np.average(
_result, axis=0, weights=num_tokens_in_batch[i]
)
embeddings[i] = (average / np.linalg.norm(average)).tolist()
return embeddings
except ImportError:
raise ValueError(
raise ImportError(
"Could not import tiktoken python package. "
"This is needed in order to for OpenAIEmbeddings. "
"Please install it with `pip install tiktoken`."
)
tokens = []
indices = []
encoding = tiktoken.model.encoding_for_model(self.model)
for i, text in enumerate(texts):
if self.model.endswith("001"):
# See: https://github.com/openai/openai-python/issues/418#issuecomment-1525939500
# replace newlines, which can negatively affect performance.
text = text.replace("\n", " ")
token = encoding.encode(
text,
allowed_special=self.allowed_special,
disallowed_special=self.disallowed_special,
)
for j in range(0, len(token), self.embedding_ctx_length):
tokens += [token[j : j + self.embedding_ctx_length]]
indices += [i]
batched_embeddings = []
_chunk_size = chunk_size or self.chunk_size
for i in range(0, len(tokens), _chunk_size):
response = embed_with_retry(
self,
input=tokens[i : i + _chunk_size],
engine=self.deployment,
request_timeout=self.request_timeout,
headers=self.headers,
)
batched_embeddings += [r["embedding"] for r in response["data"]]
results: List[List[List[float]]] = [[] for _ in range(len(texts))]
num_tokens_in_batch: List[List[int]] = [[] for _ in range(len(texts))]
for i in range(len(indices)):
results[indices[i]].append(batched_embeddings[i])
num_tokens_in_batch[indices[i]].append(len(tokens[i]))
for i in range(len(texts)):
_result = results[i]
if len(_result) == 0:
average = embed_with_retry(
self,
input="",
engine=self.deployment,
request_timeout=self.request_timeout,
headers=self.headers,
)["data"][0]["embedding"]
else:
average = np.average(_result, axis=0, weights=num_tokens_in_batch[i])
embeddings[i] = (average / np.linalg.norm(average)).tolist()
return embeddings
def _embedding_func(self, text: str, *, engine: str) -> List[float]:
"""Call out to OpenAI's embedding endpoint."""
# handle large input text
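For reference, the recombination step in the rewritten block above can be read on its own: each over-long input is sliced into windows of embedding_ctx_length tokens, every window is embedded, and the per-window vectors are merged with a token-count-weighted average that is then L2-normalized. A minimal sketch of just that step (the standalone names are assumptions, not part of the diff):

import numpy as np

def combine_window_embeddings(window_embeddings, window_token_counts):
    # Longer windows contribute proportionally more to the mean.
    avg = np.average(window_embeddings, axis=0, weights=window_token_counts)
    # Renormalize so the combined vector is unit length like the raw ones.
    return (avg / np.linalg.norm(avg)).tolist()

# e.g. one text split into windows of 1500 and 300 tokens:
combine_window_embeddings([[1.0, 0.0], [0.0, 1.0]], [1500, 300])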

View File

@@ -30,13 +30,20 @@ class TensorflowHubEmbeddings(BaseModel, Embeddings):
super().__init__(**kwargs)
try:
import tensorflow_hub
except ImportError:
raise ImportError(
"Could not import tensorflow-hub python package. "
"Please install it with `pip install tensorflow-hub``."
)
try:
import tensorflow_text # noqa
except ImportError:
raise ImportError(
"Could not import tensorflow_text python package. "
"Please install it with `pip install tensorflow_text``."
)
self.embed = tensorflow_hub.load(self.model_url)
except ImportError as e:
raise ValueError(
"Could not import some python packages." "Please install them."
) from e
self.embed = tensorflow_hub.load(self.model_url)
class Config:
"""Configuration for this pydantic object."""

View File

@@ -1,3 +1,4 @@
from langchain.experimental.autonomous_agents.autogpt.agent import AutoGPT
from langchain.experimental.autonomous_agents.baby_agi.baby_agi import BabyAGI
__all__ = ["BabyAGI"]
__all__ = ["BabyAGI", "AutoGPT"]

View File

@@ -54,7 +54,7 @@ class NetworkxEntityGraph:
try:
import networkx as nx
except ImportError:
raise ValueError(
raise ImportError(
"Could not import networkx python package. "
"Please install it with `pip install networkx`."
)
@@ -70,7 +70,7 @@ class NetworkxEntityGraph:
try:
import networkx as nx
except ImportError:
raise ValueError(
raise ImportError(
"Could not import networkx python package. "
"Please install it with `pip install networkx`."
)

View File

@@ -24,6 +24,7 @@ from langchain.llms.llamacpp import LlamaCpp
from langchain.llms.modal import Modal
from langchain.llms.nlpcloud import NLPCloud
from langchain.llms.openai import AzureOpenAI, OpenAI, OpenAIChat
from langchain.llms.openlm import OpenLM
from langchain.llms.petals import Petals
from langchain.llms.pipelineai import PipelineAI
from langchain.llms.predictionguard import PredictionGuard
@@ -53,6 +54,7 @@ __all__ = [
"NLPCloud",
"OpenAI",
"OpenAIChat",
"OpenLM",
"Petals",
"PipelineAI",
"HuggingFaceEndpoint",
@@ -96,6 +98,7 @@ type_to_cls_dict: Dict[str, Type[BaseLLM]] = {
"nlpcloud": NLPCloud,
"human-input": HumanInputLLM,
"openai": OpenAI,
"openlm": OpenLM,
"petals": Petals,
"pipelineai": PipelineAI,
"huggingface_pipeline": HuggingFacePipeline,

View File

@@ -148,7 +148,7 @@ class AlephAlpha(LLM):
values["client"] = aleph_alpha_client.Client(token=aleph_alpha_api_key)
except ImportError:
raise ValueError(
raise ImportError(
"Could not import aleph_alpha_client python package. "
"Please install it with `pip install aleph_alpha_client`."
)

View File

@@ -59,7 +59,7 @@ class _AnthropicCommon(BaseModel):
values["AI_PROMPT"] = anthropic.AI_PROMPT
values["count_tokens"] = anthropic.count_tokens
except ImportError:
raise ValueError(
raise ImportError(
"Could not import anthropic python package. "
"Please it install it with `pip install anthropic`."
)

View File

@@ -91,7 +91,7 @@ class Banana(LLM):
try:
import banana_dev as banana
except ImportError:
raise ValueError(
raise ImportError(
"Could not import banana-dev python package. "
"Please install it with `pip install banana-dev`."
)

View File

@@ -72,7 +72,7 @@ class Cohere(LLM):
values["client"] = cohere.Client(cohere_api_key)
except ImportError:
raise ValueError(
raise ImportError(
"Could not import cohere python package. "
"Please install it with `pip install cohere`."
)

View File

@@ -29,7 +29,10 @@ def _create_retry_decorator() -> Callable[[Any], Any]:
try:
import google.api_core.exceptions
except ImportError:
raise ImportError()
raise ImportError(
"Could not import google-api-core python package. "
"Please install it with `pip install google-api-core`."
)
multiplier = 2
min_seconds = 1
@@ -105,7 +108,10 @@ class GooglePalm(BaseLLM, BaseModel):
genai.configure(api_key=google_api_key)
except ImportError:
raise ImportError("Could not import google.generativeai python package.")
raise ImportError(
"Could not import google-generativeai python package. "
"Please install it with `pip install google-generativeai`."
)
values["client"] = genai

View File

@@ -100,7 +100,7 @@ class GooseAI(LLM):
openai.api_base = "https://api.goose.ai/v1"
values["client"] = openai.Completion
except ImportError:
raise ValueError(
raise ImportError(
"Could not import openai python package. "
"Please install it with `pip install openai`."
)

View File

@@ -97,7 +97,7 @@ class HuggingFaceTextGenInference(LLM):
values["inference_server_url"], timeout=values["timeout"]
)
except ImportError:
raise ValueError(
raise ImportError(
"Could not import text_generation python package. "
"Please install it with `pip install text_generation`."
)

View File

@@ -75,7 +75,7 @@ class NLPCloud(LLM):
values["model_name"], nlpcloud_api_key, gpu=True, lang="en"
)
except ImportError:
raise ValueError(
raise ImportError(
"Could not import nlpcloud python package. "
"Please install it with `pip install nlpcloud`."
)

View File

@@ -234,7 +234,7 @@ class BaseOpenAI(BaseLLM):
openai.organization = openai_organization
values["client"] = openai.Completion
except ImportError:
raise ValueError(
raise ImportError(
"Could not import openai python package. "
"Please install it with `pip install openai`."
)
@@ -462,7 +462,7 @@ class BaseOpenAI(BaseLLM):
try:
import tiktoken
except ImportError:
raise ValueError(
raise ImportError(
"Could not import tiktoken python package. "
"This is needed in order to calculate get_num_tokens. "
"Please install it with `pip install tiktoken`."
@@ -677,7 +677,7 @@ class OpenAIChat(BaseLLM):
if openai_organization:
openai.organization = openai_organization
except ImportError:
raise ValueError(
raise ImportError(
"Could not import openai python package. "
"Please install it with `pip install openai`."
)
@@ -807,7 +807,7 @@ class OpenAIChat(BaseLLM):
try:
import tiktoken
except ImportError:
raise ValueError(
raise ImportError(
"Could not import tiktoken python package. "
"This is needed in order to calculate get_num_tokens. "
"Please install it with `pip install tiktoken`."

langchain/llms/openlm.py (new file, 26 lines)
View File

@@ -0,0 +1,26 @@
from typing import Any, Dict
from pydantic import root_validator
from langchain.llms.openai import BaseOpenAI
class OpenLM(BaseOpenAI):
@property
def _invocation_params(self) -> Dict[str, Any]:
return {**{"model": self.model_name}, **super()._invocation_params}
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
try:
import openlm
values["client"] = openlm.Completion
except ImportError:
raise ValueError(
"Could not import openlm python package. "
"Please install it with `pip install openlm`."
)
if values["streaming"]:
raise ValueError("Streaming not supported with openlm")
return values
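Because OpenLM subclasses BaseOpenAI and only injects the model name into the invocation parameters, it drops in wherever an OpenAI LLM would. A hypothetical usage sketch (assumes `pip install openlm` plus whatever API key the chosen model's provider requires; the model identifier is an example, not taken from this diff):

from langchain.llms import OpenLM

llm = OpenLM(model_name="text-davinci-003")  # routed through openlm.Completion
print(llm("Say hello in one word."))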

View File

@@ -50,7 +50,7 @@ class PredictionGuard(LLM):
values["client"] = pg.Client(token=token)
except ImportError:
raise ValueError(
raise ImportError(
"Could not import predictionguard python package. "
"Please install it with `pip install predictionguard`."
)

View File

@@ -89,7 +89,7 @@ class Replicate(LLM):
try:
import replicate as replicate_python
except ImportError:
raise ValueError(
raise ImportError(
"Could not import replicate python package. "
"Please install it with `pip install replicate`."
)

View File

@@ -103,7 +103,7 @@ class RWKV(LLM, BaseModel):
try:
import tokenizers
except ImportError:
raise ValueError(
raise ImportError(
"Could not import tokenizers python package. "
"Please install it with `pip install tokenizers`."
)

View File

@@ -182,7 +182,7 @@ class SagemakerEndpoint(LLM):
) from e
except ImportError:
raise ValueError(
raise ImportError(
"Could not import boto3 python package. "
"Please install it with `pip install boto3`."
)

View File

@@ -155,7 +155,7 @@ class SelfHostedPipeline(LLM):
import runhouse as rh
except ImportError:
raise ValueError(
raise ImportError(
"Could not import runhouse python package. "
"Please install it with `pip install runhouse`."
)

View File

@@ -52,13 +52,11 @@ class FirestoreChatMessageHistory(BaseChatMessageHistory):
try:
import firebase_admin
from firebase_admin import firestore
except ImportError as e:
logger.error(
"Failed to import Firebase and Firestore: %s. "
"Make sure to install the 'firebase-admin' module.",
e,
except ImportError:
raise ImportError(
"Could not import firebase-admin python package. "
"Please install it with `pip install firebase-admin`."
)
raise e
# For multiple instances, only initialize the app once.
try:

View File

@@ -25,7 +25,7 @@ class RedisChatMessageHistory(BaseChatMessageHistory):
try:
import redis
except ImportError:
raise ValueError(
raise ImportError(
"Could not import redis python package. "
"Please install it with `pip install redis`."
)

View File

@@ -91,7 +91,7 @@ class RedisEntityStore(BaseEntityStore):
try:
import redis
except ImportError:
raise ValueError(
raise ImportError(
"Could not import redis python package. "
"Please install it with `pip install redis`."
)

View File

@@ -41,7 +41,7 @@ class CohereRerank(BaseDocumentCompressor):
values["client"] = cohere.Client(cohere_api_key)
except ImportError:
raise ValueError(
raise ImportError(
"Could not import cohere python package. "
"Please install it with `pip install cohere`."
)

View File

@@ -109,7 +109,9 @@ class TimeWeightedVectorStoreRetriever(BaseRetriever, BaseModel):
def add_documents(self, documents: List[Document], **kwargs: Any) -> List[str]:
"""Add documents to vectorstore."""
current_time = kwargs.get("current_time", datetime.datetime.now())
current_time = kwargs.get("current_time")
if current_time is None:
current_time = datetime.datetime.now()
# Avoid mutating input documents
dup_docs = [deepcopy(d) for d in documents]
for i, doc in enumerate(dup_docs):
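The rewrite looks redundant but changes behavior: dict.get only falls back to its default when the key is absent, so a caller that explicitly passed current_time=None previously got None back instead of a timestamp. A minimal standard-library illustration:

import datetime

kwargs = {"current_time": None}  # caller explicitly passed None

# Old form: .get returns the stored None; the default is ignored.
old = kwargs.get("current_time", datetime.datetime.now())
assert old is None

# New form: falls back to now() when the value is missing *or* None.
new = kwargs.get("current_time")
if new is None:
    new = datetime.datetime.now()
assert isinstance(new, datetime.datetime)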

View File

@@ -23,7 +23,7 @@ class WeaviateHybridSearchRetriever(BaseRetriever):
try:
import weaviate
except ImportError:
raise ValueError(
raise ImportError(
"Could not import weaviate python package. "
"Please install it with `pip install weaviate-client`."
)

View File

@@ -7,7 +7,6 @@ from typing import (
Dict,
Generic,
List,
NamedTuple,
Optional,
Sequence,
TypeVar,
@@ -37,7 +36,7 @@ def get_buffer_string(
return "\n".join(string_messages)
class AgentAction(NamedTuple):
class AgentAction(BaseModel):
"""Agent's action to take."""
tool: str
@@ -45,7 +44,7 @@ class AgentAction(NamedTuple):
log: str
class AgentFinish(NamedTuple):
class AgentFinish(BaseModel):
"""Agent's return value."""
return_values: dict
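Moving AgentAction and AgentFinish from NamedTuple to BaseModel keeps the fields but changes construction semantics: a pydantic model validates field types at instantiation and is built with keyword arguments. A minimal sketch of the difference (tool_input is elided by the hunk and assumed here):

from pydantic import BaseModel

class AgentAction(BaseModel):
    tool: str
    tool_input: str  # assumed field, hidden behind the hunk's "@@" gap
    log: str

# Raises pydantic.ValidationError on a type mismatch, which a NamedTuple
# would silently accept:
AgentAction(tool="search", tool_input="weather in SF", log="...")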

View File

@@ -64,10 +64,12 @@ class TextSplitter(BaseDocumentTransformer, ABC):
documents.append(new_doc)
return documents
def split_documents(self, documents: List[Document]) -> List[Document]:
def split_documents(self, documents: Iterable[Document]) -> List[Document]:
"""Split documents."""
texts = [doc.page_content for doc in documents]
metadatas = [doc.metadata for doc in documents]
texts, metadatas = [], []
for doc in documents:
texts.append(doc.page_content)
metadatas.append(doc.metadata)
return self.create_documents(texts, metadatas=metadatas)
def _join_docs(self, docs: List[str], separator: str) -> Optional[str]:
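The single-pass loop matters when documents is a generator, which can only be consumed once: the old two-comprehension version exhausted it while collecting page_content, so metadata came back silently empty. A minimal reproduction of that failure mode:

from langchain.schema import Document

docs = (Document(page_content=f"t{i}", metadata={"i": i}) for i in range(3))

texts = [d.page_content for d in docs]   # ['t0', 't1', 't2'] -- consumes docs
metadatas = [d.metadata for d in docs]   # [] -- generator already exhausted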
@@ -154,7 +156,7 @@ class TextSplitter(BaseDocumentTransformer, ABC):
try:
import tiktoken
except ImportError:
raise ValueError(
raise ImportError(
"Could not import tiktoken python package. "
"This is needed in order to calculate max_tokens_for_prompt. "
"Please install it with `pip install tiktoken`."
@@ -232,7 +234,7 @@ class TokenTextSplitter(TextSplitter):
try:
import tiktoken
except ImportError:
raise ValueError(
raise ImportError(
"Could not import tiktoken python package. "
"This is needed in order to for TokenTextSplitter. "
"Please install it with `pip install tiktoken`."

View File

@@ -1,5 +1,11 @@
"""Core toolkit implementations."""
from langchain.tools.azure_cognitive_services import (
AzureCogsFormRecognizerTool,
AzureCogsImageAnalysisTool,
AzureCogsSpeech2TextTool,
AzureCogsText2SpeechTool,
)
from langchain.tools.base import BaseTool, StructuredTool, Tool, tool
from langchain.tools.bing_search.tool import BingSearchResults, BingSearchRun
from langchain.tools.ddg_search.tool import DuckDuckGoSearchResults, DuckDuckGoSearchRun
@@ -56,6 +62,10 @@ from langchain.tools.zapier.tool import ZapierNLAListActions, ZapierNLARunAction
__all__ = [
"AIPluginTool",
"APIOperation",
"AzureCogsFormRecognizerTool",
"AzureCogsImageAnalysisTool",
"AzureCogsSpeech2TextTool",
"AzureCogsText2SpeechTool",
"BaseTool",
"BaseTool",
"BaseTool",

View File

@@ -0,0 +1,21 @@
"""Azure Cognitive Services Tools."""
from langchain.tools.azure_cognitive_services.form_recognizer import (
AzureCogsFormRecognizerTool,
)
from langchain.tools.azure_cognitive_services.image_analysis import (
AzureCogsImageAnalysisTool,
)
from langchain.tools.azure_cognitive_services.speech2text import (
AzureCogsSpeech2TextTool,
)
from langchain.tools.azure_cognitive_services.text2speech import (
AzureCogsText2SpeechTool,
)
__all__ = [
"AzureCogsImageAnalysisTool",
"AzureCogsFormRecognizerTool",
"AzureCogsSpeech2TextTool",
"AzureCogsText2SpeechTool",
]
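A hypothetical sketch of consuming this package through the AzureCognitiveServicesToolkit the PR introduces (assumes the Azure SDKs are installed and the AZURE_COGS_* variables validated by each tool are set):

from langchain.agents.agent_toolkits import AzureCognitiveServicesToolkit

toolkit = AzureCognitiveServicesToolkit()
tools = toolkit.get_tools()  # the four AzureCogs* tools exported above
print([tool.name for tool in tools])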

View File

@@ -0,0 +1,152 @@
from __future__ import annotations
import logging
from typing import Any, Dict, List, Optional
from pydantic import root_validator
from langchain.callbacks.manager import (
AsyncCallbackManagerForToolRun,
CallbackManagerForToolRun,
)
from langchain.tools.azure_cognitive_services.utils import detect_file_src_type
from langchain.tools.base import BaseTool
from langchain.utils import get_from_dict_or_env
logger = logging.getLogger(__name__)
class AzureCogsFormRecognizerTool(BaseTool):
"""Tool that queries the Azure Cognitive Services Form Recognizer API.
In order to set this up, follow instructions at:
https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/quickstarts/get-started-sdks-rest-api?view=form-recog-3.0.0&pivots=programming-language-python
"""
azure_cogs_key: str = "" #: :meta private:
azure_cogs_endpoint: str = "" #: :meta private:
doc_analysis_client: Any #: :meta private:
name = "Azure Cognitive Services Form Recognizer"
description = (
"A wrapper around Azure Cognitive Services Form Recognizer. "
"Useful for when you need to "
"extract text, tables, and key-value pairs from documents. "
"Input should be a url to a document."
)
@root_validator(pre=True)
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and endpoint exists in environment."""
azure_cogs_key = get_from_dict_or_env(
values, "azure_cogs_key", "AZURE_COGS_KEY"
)
azure_cogs_endpoint = get_from_dict_or_env(
values, "azure_cogs_endpoint", "AZURE_COGS_ENDPOINT"
)
try:
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
values["doc_analysis_client"] = DocumentAnalysisClient(
endpoint=azure_cogs_endpoint,
credential=AzureKeyCredential(azure_cogs_key),
)
except ImportError:
raise ImportError(
"azure-ai-formrecognizer is not installed. "
"Run `pip install azure-ai-formrecognizer` to install."
)
return values
def _parse_tables(self, tables: List[Any]) -> List[Any]:
result = []
for table in tables:
rc, cc = table.row_count, table.column_count
_table = [["" for _ in range(cc)] for _ in range(rc)]
for cell in table.cells:
_table[cell.row_index][cell.column_index] = cell.content
result.append(_table)
return result
def _parse_kv_pairs(self, kv_pairs: List[Any]) -> List[Any]:
result = []
for kv_pair in kv_pairs:
key = kv_pair.key.content if kv_pair.key else ""
value = kv_pair.value.content if kv_pair.value else ""
result.append((key, value))
return result
def _document_analysis(self, document_path: str) -> Dict:
document_src_type = detect_file_src_type(document_path)
if document_src_type == "local":
with open(document_path, "rb") as document:
poller = self.doc_analysis_client.begin_analyze_document(
"prebuilt-document", document
)
elif document_src_type == "remote":
poller = self.doc_analysis_client.begin_analyze_document_from_url(
"prebuilt-document", document_path
)
else:
raise ValueError(f"Invalid document path: {document_path}")
result = poller.result()
res_dict = {}
if result.content is not None:
res_dict["content"] = result.content
if result.tables is not None:
res_dict["tables"] = self._parse_tables(result.tables)
if result.key_value_pairs is not None:
res_dict["key_value_pairs"] = self._parse_kv_pairs(result.key_value_pairs)
return res_dict
def _format_document_analysis_result(self, document_analysis_result: Dict) -> str:
formatted_result = []
if "content" in document_analysis_result:
formatted_result.append(
f"Content: {document_analysis_result['content']}".replace("\n", " ")
)
if "tables" in document_analysis_result:
for i, table in enumerate(document_analysis_result["tables"]):
formatted_result.append(f"Table {i}: {table}".replace("\n", " "))
if "key_value_pairs" in document_analysis_result:
for kv_pair in document_analysis_result["key_value_pairs"]:
formatted_result.append(
f"{kv_pair[0]}: {kv_pair[1]}".replace("\n", " ")
)
return "\n".join(formatted_result)
def _run(
self,
query: str,
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Use the tool."""
try:
document_analysis_result = self._document_analysis(query)
if not document_analysis_result:
return "No good document analysis result was found"
return self._format_document_analysis_result(document_analysis_result)
except Exception as e:
raise RuntimeError(f"Error while running AzureCogsFormRecognizerTool: {e}")
async def _arun(
self,
query: str,
run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
) -> str:
"""Use the tool asynchronously."""
raise NotImplementedError("AzureCogsFormRecognizerTool does not support async")

View File

@@ -0,0 +1,156 @@
from __future__ import annotations
import logging
from typing import Any, Dict, Optional
from pydantic import root_validator
from langchain.callbacks.manager import (
AsyncCallbackManagerForToolRun,
CallbackManagerForToolRun,
)
from langchain.tools.azure_cognitive_services.utils import detect_file_src_type
from langchain.tools.base import BaseTool
from langchain.utils import get_from_dict_or_env
logger = logging.getLogger(__name__)
class AzureCogsImageAnalysisTool(BaseTool):
"""Tool that queries the Azure Cognitive Services Image Analysis API.
In order to set this up, follow instructions at:
https://learn.microsoft.com/en-us/azure/cognitive-services/computer-vision/quickstarts-sdk/image-analysis-client-library-40
"""
azure_cogs_key: str = "" #: :meta private:
azure_cogs_endpoint: str = "" #: :meta private:
vision_service: Any #: :meta private:
analysis_options: Any #: :meta private:
name = "Azure Cognitive Services Image Analysis"
description = (
"A wrapper around Azure Cognitive Services Image Analysis. "
"Useful for when you need to analyze images. "
"Input should be a url to an image."
)
@root_validator(pre=True)
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and endpoint exists in environment."""
azure_cogs_key = get_from_dict_or_env(
values, "azure_cogs_key", "AZURE_COGS_KEY"
)
azure_cogs_endpoint = get_from_dict_or_env(
values, "azure_cogs_endpoint", "AZURE_COGS_ENDPOINT"
)
try:
import azure.ai.vision as sdk
values["vision_service"] = sdk.VisionServiceOptions(
endpoint=azure_cogs_endpoint, key=azure_cogs_key
)
values["analysis_options"] = sdk.ImageAnalysisOptions()
values["analysis_options"].features = (
sdk.ImageAnalysisFeature.CAPTION
| sdk.ImageAnalysisFeature.OBJECTS
| sdk.ImageAnalysisFeature.TAGS
| sdk.ImageAnalysisFeature.TEXT
)
except ImportError:
raise ImportError(
"azure-ai-vision is not installed. "
"Run `pip install azure-ai-vision` to install."
)
return values
def _image_analysis(self, image_path: str) -> Dict:
try:
import azure.ai.vision as sdk
except ImportError:
pass
image_src_type = detect_file_src_type(image_path)
if image_src_type == "local":
vision_source = sdk.VisionSource(filename=image_path)
elif image_src_type == "remote":
vision_source = sdk.VisionSource(url=image_path)
else:
raise ValueError(f"Invalid image path: {image_path}")
image_analyzer = sdk.ImageAnalyzer(
self.vision_service, vision_source, self.analysis_options
)
result = image_analyzer.analyze()
res_dict = {}
if result.reason == sdk.ImageAnalysisResultReason.ANALYZED:
if result.caption is not None:
res_dict["caption"] = result.caption.content
if result.objects is not None:
res_dict["objects"] = [obj.name for obj in result.objects]
if result.tags is not None:
res_dict["tags"] = [tag.name for tag in result.tags]
if result.text is not None:
res_dict["text"] = [line.content for line in result.text.lines]
else:
error_details = sdk.ImageAnalysisErrorDetails.from_result(result)
raise RuntimeError(
f"Image analysis failed.\n"
f"Reason: {error_details.reason}\n"
f"Details: {error_details.message}"
)
return res_dict
def _format_image_analysis_result(self, image_analysis_result: Dict) -> str:
formatted_result = []
if "caption" in image_analysis_result:
formatted_result.append("Caption: " + image_analysis_result["caption"])
if (
"objects" in image_analysis_result
and len(image_analysis_result["objects"]) > 0
):
formatted_result.append(
"Objects: " + ", ".join(image_analysis_result["objects"])
)
if "tags" in image_analysis_result and len(image_analysis_result["tags"]) > 0:
formatted_result.append("Tags: " + ", ".join(image_analysis_result["tags"]))
if "text" in image_analysis_result and len(image_analysis_result["text"]) > 0:
formatted_result.append("Text: " + ", ".join(image_analysis_result["text"]))
return "\n".join(formatted_result)
def _run(
self,
query: str,
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Use the tool."""
try:
image_analysis_result = self._image_analysis(query)
if not image_analysis_result:
return "No good image analysis result was found"
return self._format_image_analysis_result(image_analysis_result)
except Exception as e:
raise RuntimeError(f"Error while running AzureCogsImageAnalysisTool: {e}")
async def _arun(
self,
query: str,
run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
) -> str:
"""Use the tool asynchronously."""
raise NotImplementedError("AzureCogsImageAnalysisTool does not support async")

View File

@@ -0,0 +1,131 @@
from __future__ import annotations
import logging
import time
from typing import Any, Dict, Optional
from pydantic import root_validator
from langchain.callbacks.manager import (
AsyncCallbackManagerForToolRun,
CallbackManagerForToolRun,
)
from langchain.tools.azure_cognitive_services.utils import (
detect_file_src_type,
download_audio_from_url,
)
from langchain.tools.base import BaseTool
from langchain.utils import get_from_dict_or_env
logger = logging.getLogger(__name__)
class AzureCogsSpeech2TextTool(BaseTool):
"""Tool that queries the Azure Cognitive Services Speech2Text API.
In order to set this up, follow instructions at:
https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/get-started-speech-to-text?pivots=programming-language-python
"""
azure_cogs_key: str = "" #: :meta private:
azure_cogs_region: str = "" #: :meta private:
speech_language: str = "en-US" #: :meta private:
speech_config: Any #: :meta private:
name = "Azure Cognitive Services Speech2Text"
description = (
"A wrapper around Azure Cognitive Services Speech2Text. "
"Useful for when you need to transcribe audio to text. "
"Input should be a url to an audio file."
)
@root_validator(pre=True)
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and endpoint exists in environment."""
azure_cogs_key = get_from_dict_or_env(
values, "azure_cogs_key", "AZURE_COGS_KEY"
)
azure_cogs_region = get_from_dict_or_env(
values, "azure_cogs_region", "AZURE_COGS_REGION"
)
try:
import azure.cognitiveservices.speech as speechsdk
values["speech_config"] = speechsdk.SpeechConfig(
subscription=azure_cogs_key, region=azure_cogs_region
)
except ImportError:
raise ImportError(
"azure-cognitiveservices-speech is not installed. "
"Run `pip install azure-cognitiveservices-speech` to install."
)
return values
def _continuous_recognize(self, speech_recognizer: Any) -> str:
done = False
text = ""
def stop_cb(evt: Any) -> None:
"""callback that stop continuous recognition"""
speech_recognizer.stop_continuous_recognition_async()
nonlocal done
done = True
def retrieve_cb(evt: Any) -> None:
"""callback that retrieves the intermediate recognition results"""
nonlocal text
text += evt.result.text
# retrieve text on recognized events
speech_recognizer.recognized.connect(retrieve_cb)
# stop continuous recognition on either session stopped or canceled events
speech_recognizer.session_stopped.connect(stop_cb)
speech_recognizer.canceled.connect(stop_cb)
# Start continuous speech recognition
speech_recognizer.start_continuous_recognition_async()
while not done:
time.sleep(0.5)
return text
def _speech2text(self, audio_path: str, speech_language: str) -> str:
try:
import azure.cognitiveservices.speech as speechsdk
except ImportError:
pass
audio_src_type = detect_file_src_type(audio_path)
if audio_src_type == "local":
audio_config = speechsdk.AudioConfig(filename=audio_path)
elif audio_src_type == "remote":
tmp_audio_path = download_audio_from_url(audio_path)
audio_config = speechsdk.AudioConfig(filename=tmp_audio_path)
else:
raise ValueError(f"Invalid audio path: {audio_path}")
self.speech_config.speech_recognition_language = speech_language
speech_recognizer = speechsdk.SpeechRecognizer(self.speech_config, audio_config)
return self._continuous_recognize(speech_recognizer)
def _run(
self,
query: str,
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Use the tool."""
try:
text = self._speech2text(query, self.speech_language)
return text
except Exception as e:
raise RuntimeError(f"Error while running AzureCogsSpeech2TextTool: {e}")
async def _arun(
self,
query: str,
run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
) -> str:
"""Use the tool asynchronously."""
raise NotImplementedError("AzureCogsSpeech2TextTool does not support async")

View File

@@ -0,0 +1,114 @@
from __future__ import annotations
import logging
import tempfile
from typing import Any, Dict, Optional
from pydantic import root_validator
from langchain.callbacks.manager import (
AsyncCallbackManagerForToolRun,
CallbackManagerForToolRun,
)
from langchain.tools.base import BaseTool
from langchain.utils import get_from_dict_or_env
logger = logging.getLogger(__name__)
class AzureCogsText2SpeechTool(BaseTool):
"""Tool that queries the Azure Cognitive Services Text2Speech API.
In order to set this up, follow instructions at:
https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/get-started-text-to-speech?pivots=programming-language-python
"""
azure_cogs_key: str = "" #: :meta private:
azure_cogs_region: str = "" #: :meta private:
speech_language: str = "en-US" #: :meta private:
speech_config: Any #: :meta private:
name = "Azure Cognitive Services Text2Speech"
description = (
"A wrapper around Azure Cognitive Services Text2Speech. "
"Useful for when you need to convert text to speech. "
)
@root_validator(pre=True)
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and endpoint exists in environment."""
azure_cogs_key = get_from_dict_or_env(
values, "azure_cogs_key", "AZURE_COGS_KEY"
)
azure_cogs_region = get_from_dict_or_env(
values, "azure_cogs_region", "AZURE_COGS_REGION"
)
try:
import azure.cognitiveservices.speech as speechsdk
values["speech_config"] = speechsdk.SpeechConfig(
subscription=azure_cogs_key, region=azure_cogs_region
)
except ImportError:
raise ImportError(
"azure-cognitiveservices-speech is not installed. "
"Run `pip install azure-cognitiveservices-speech` to install."
)
return values
def _text2speech(self, text: str, speech_language: str) -> str:
try:
import azure.cognitiveservices.speech as speechsdk
except ImportError:
pass
self.speech_config.speech_synthesis_language = speech_language
speech_synthesizer = speechsdk.SpeechSynthesizer(
speech_config=self.speech_config, audio_config=None
)
result = speech_synthesizer.speak_text(text)
if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
stream = speechsdk.AudioDataStream(result)
with tempfile.NamedTemporaryFile(
mode="wb", suffix=".wav", delete=False
) as f:
stream.save_to_wav_file(f.name)
return f.name
elif result.reason == speechsdk.ResultReason.Canceled:
cancellation_details = result.cancellation_details
logger.debug(f"Speech synthesis canceled: {cancellation_details.reason}")
if cancellation_details.reason == speechsdk.CancellationReason.Error:
raise RuntimeError(
f"Speech synthesis error: {cancellation_details.error_details}"
)
return "Speech synthesis canceled."
else:
return f"Speech synthesis failed: {result.reason}"
def _run(
self,
query: str,
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Use the tool."""
try:
speech_file = self._text2speech(query, self.speech_language)
return speech_file
except Exception as e:
raise RuntimeError(f"Error while running AzureCogsText2SpeechTool: {e}")
async def _arun(
self,
query: str,
run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
) -> str:
"""Use the tool asynchronously."""
raise NotImplementedError("AzureCogsText2SpeechTool does not support async")

View File

@@ -0,0 +1,29 @@
import os
import tempfile
from urllib.parse import urlparse
import requests
def detect_file_src_type(file_path: str) -> str:
"""Detect if the file is local or remote."""
if os.path.isfile(file_path):
return "local"
parsed_url = urlparse(file_path)
if parsed_url.scheme and parsed_url.netloc:
return "remote"
return "invalid"
def download_audio_from_url(audio_url: str) -> str:
"""Download audio from url to local."""
ext = audio_url.split(".")[-1]
response = requests.get(audio_url, stream=True)
response.raise_for_status()
with tempfile.NamedTemporaryFile(mode="wb", suffix=f".{ext}", delete=False) as f:
for chunk in response.iter_content(chunk_size=8192):
f.write(chunk)
return f.name
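For reference, the dispatcher's three outcomes (the paths in this sketch are placeholders):

detect_file_src_type("/tmp/recording.wav")          # "local", if the file exists
detect_file_src_type("https://host.example/a.wav")  # "remote" (scheme + netloc)
detect_file_src_type("not-a-path-or-url")           # "invalid"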

View File

@@ -35,7 +35,7 @@ class LambdaWrapper(BaseModel):
except ImportError:
raise ImportError(
"boto3 is not installed." "Please install it with `pip install boto3`"
"boto3 is not installed. Please install it with `pip install boto3`"
)
values["lambda_client"] = boto3.client("lambda")

View File

@@ -51,8 +51,8 @@ class GooglePlacesAPIWrapper(BaseModel):
values["google_map_client"] = googlemaps.Client(gplaces_api_key)
except ImportError:
raise ValueError(
"Could not import googlemaps python packge. "
raise ImportError(
"Could not import googlemaps python package. "
"Please install it with `pip install googlemaps`."
)
return values

View File

@@ -156,7 +156,7 @@ class JiraAPIWrapper(BaseModel):
import json
except ImportError:
raise ImportError(
"json is not installed. " "Please install it with `pip install json`"
"json is not installed. Please install it with `pip install json`"
)
params = json.loads(query)
return self.jira.issue_create(fields=dict(params))

View File

@@ -38,7 +38,7 @@ class OpenWeatherMapAPIWrapper(BaseModel):
except ImportError:
raise ImportError(
"pyowm is not installed. " "Please install it with `pip install pyowm`"
"pyowm is not installed. Please install it with `pip install pyowm`"
)
owm = pyowm.OWM(openweathermap_api_key)

View File

@@ -150,6 +150,8 @@ class SerpAPIWrapper(BaseModel):
toret = res["knowledge_graph"]["description"]
elif "snippet" in res["organic_results"][0].keys():
toret = res["organic_results"][0]["snippet"]
elif "link" in res["organic_results"][0].keys():
toret = res["organic_results"][0]["link"]
else:
toret = "No good search result found"

View File

@@ -195,6 +195,8 @@ class Weaviate(VectorStore):
query_obj = self._client.query.get(self._index_name, self._query_attrs)
if kwargs.get("where_filter"):
query_obj = query_obj.with_where(kwargs.get("where_filter"))
if kwargs.get("additional"):
query_obj = query_obj.with_additional(kwargs.get("additional"))
result = query_obj.with_near_text(content).with_limit(k).do()
if "errors" in result:
raise ValueError(f"Error during query: {result['errors']}")
@@ -212,6 +214,8 @@ class Weaviate(VectorStore):
query_obj = self._client.query.get(self._index_name, self._query_attrs)
if kwargs.get("where_filter"):
query_obj = query_obj.with_where(kwargs.get("where_filter"))
if kwargs.get("additional"):
query_obj = query_obj.with_additional(kwargs.get("additional"))
result = query_obj.with_near_vector(vector).with_limit(k).do()
if "errors" in result:
raise ValueError(f"Error during query: {result['errors']}")

poetry.lock (generated, 481 lines changed)
View File

@@ -566,6 +566,64 @@ dev = ["coverage (>=5,<6)", "flake8 (>=3,<4)", "pytest (>=6,<7)", "sphinx-copybu
docs = ["sphinx-copybutton (>=0.4,<0.5)", "sphinx-rtd-theme (>=1.0,<2.0)", "sphinx-tabs (>=3,<4)", "sphinxcontrib-mermaid (>=0.7,<0.8)"]
test = ["coverage (>=5,<6)", "pytest (>=6,<7)"]
[[package]]
name = "azure-ai-formrecognizer"
version = "3.2.1"
description = "Microsoft Azure Form Recognizer Client Library for Python"
category = "main"
optional = true
python-versions = ">=3.7"
files = [
{file = "azure-ai-formrecognizer-3.2.1.zip", hash = "sha256:5768765f9720ce87038f56afe0c0b5259192cfb29c840a39595b1e26e4ddfa32"},
{file = "azure_ai_formrecognizer-3.2.1-py3-none-any.whl", hash = "sha256:4db43b9dd0a2bc5296b752c04dbacb838ae2b8726adfe7cf277c2ea34e99419a"},
]
[package.dependencies]
azure-common = ">=1.1,<2.0"
azure-core = ">=1.23.0,<2.0.0"
msrest = ">=0.6.21"
typing-extensions = ">=4.0.1"
[[package]]
name = "azure-ai-vision"
version = "0.11.1b1"
description = "Microsoft Azure AI Vision SDK for Python"
category = "main"
optional = true
python-versions = ">=3.7"
files = [
{file = "azure_ai_vision-0.11.1b1-py3-none-manylinux1_x86_64.whl", hash = "sha256:6f8563ae26689da6cdee9b2de009a53546ae2fd86c6c180236ce5da5b45f41d3"},
{file = "azure_ai_vision-0.11.1b1-py3-none-win_amd64.whl", hash = "sha256:f5df03b9156feaa1d8c776631967b1455028d30dfd4cd1c732aa0f9c03d01517"},
]
[[package]]
name = "azure-cognitiveservices-speech"
version = "1.28.0"
description = "Microsoft Cognitive Services Speech SDK for Python"
category = "main"
optional = true
python-versions = ">=3.7"
files = [
{file = "azure_cognitiveservices_speech-1.28.0-py3-none-macosx_10_14_x86_64.whl", hash = "sha256:a6c277ec9c93f586dcc74d3a56a6aa0259f4cf371f5e03afcf169c691e2c4d0c"},
{file = "azure_cognitiveservices_speech-1.28.0-py3-none-macosx_11_0_arm64.whl", hash = "sha256:a412c6c5bc528548e0ee5fc5fe89fb8351307d0c5ef7ac4d506fab3d58efcb4a"},
{file = "azure_cognitiveservices_speech-1.28.0-py3-none-manylinux1_x86_64.whl", hash = "sha256:ceb5a8862da4ab861bd06653074a4e5dc2d66a54f03dd4dd9356da7672febbce"},
{file = "azure_cognitiveservices_speech-1.28.0-py3-none-manylinux2014_aarch64.whl", hash = "sha256:d5cba32e9d8eaffc9d8f482c00950bc471f9dc4d7659c741c083e5e9d831b802"},
{file = "azure_cognitiveservices_speech-1.28.0-py3-none-win32.whl", hash = "sha256:ac52c4549062771db5694346c1547334cf1bb0d08573a193c8dcec8386aa491d"},
{file = "azure_cognitiveservices_speech-1.28.0-py3-none-win_amd64.whl", hash = "sha256:5ff042d81d7ff4e50be196419fcd2042e41a97cebb229e0940026e1314ff7751"},
]
[[package]]
name = "azure-common"
version = "1.1.28"
description = "Microsoft Azure Client Library for Python (Common)"
category = "main"
optional = true
python-versions = "*"
files = [
{file = "azure-common-1.1.28.zip", hash = "sha256:4ac0cd3214e36b6a1b6a442686722a5d8cc449603aa833f3f0f40bda836704a3"},
{file = "azure_common-1.1.28-py2.py3-none-any.whl", hash = "sha256:5c12d3dcf4ec20599ca6b0d3e09e86e146353d443e7fcc050c9a19c1f9df20ad"},
]
[[package]]
name = "azure-core"
version = "1.26.4"
@@ -817,6 +875,21 @@ files = [
[package.dependencies]
numpy = ">=1.15.0"
[[package]]
name = "blurhash"
version = "1.1.4"
description = "Pure-Python implementation of the blurhash algorithm."
category = "dev"
optional = false
python-versions = "*"
files = [
{file = "blurhash-1.1.4-py2.py3-none-any.whl", hash = "sha256:7611c1bc41383d2349b6129208587b5d61e8792ce953893cb49c38beeb400d1d"},
{file = "blurhash-1.1.4.tar.gz", hash = "sha256:da56b163e5a816e4ad07172f5639287698e09d7f3dc38d18d9726d9c1dbc4cee"},
]
[package.extras]
test = ["Pillow", "numpy", "pytest"]
[[package]]
name = "boto3"
version = "1.26.76"
@@ -1480,6 +1553,46 @@ typing-inspect = ">=0.4.0"
[package.extras]
dev = ["flake8", "hypothesis", "ipython", "mypy (>=0.710)", "portray", "pytest (>=6.2.3)", "simplejson", "types-dataclasses"]
[[package]]
name = "datasets"
version = "1.18.3"
description = "HuggingFace community-driven open-source library of datasets"
category = "main"
optional = true
python-versions = "*"
files = [
{file = "datasets-1.18.3-py3-none-any.whl", hash = "sha256:5862670a3e213af1aa68995a32ff0ce761b9d71d2677c3fa59e7088eb5e2a841"},
{file = "datasets-1.18.3.tar.gz", hash = "sha256:dfdf75c255069f4ed25ccdd0d3f0730c1ff1e2b27f8d4bd1af395b10fe8ebc63"},
]
[package.dependencies]
aiohttp = "*"
dill = "*"
fsspec = {version = ">=2021.05.0", extras = ["http"]}
huggingface-hub = ">=0.1.0,<1.0.0"
multiprocess = "*"
numpy = ">=1.17"
packaging = "*"
pandas = "*"
pyarrow = ">=3.0.0,<4.0.0 || >4.0.0"
requests = ">=2.19.0"
tqdm = ">=4.62.1"
xxhash = "*"
[package.extras]
apache-beam = ["apache-beam (>=2.26.0)"]
audio = ["librosa"]
benchmarks = ["numpy (==1.18.5)", "tensorflow (==2.3.0)", "torch (==1.6.0)", "transformers (==3.0.2)"]
dev = ["Pillow (>=6.2.1)", "Werkzeug (>=1.0.1)", "absl-py", "aiobotocore", "apache-beam (>=2.26.0)", "bert-score (>=0.3.6)", "black (==21.4b0)", "boto3", "botocore", "bs4", "conllu", "elasticsearch", "fairseq", "faiss-cpu (>=1.6.4)", "fastBPE (==0.1.0)", "flake8 (>=3.8.3)", "fsspec[s3]", "h5py", "importlib-resources", "isort (>=5.0.0)", "jiwer", "langdetect", "librosa", "lxml", "mauve-text", "moto[s3,server] (==2.0.4)", "mwparserfromhell", "nltk", "openpyxl", "py7zr", "pytest", "pytest-datadir", "pytest-xdist", "pytorch-lightning", "pytorch-nlp (==0.5.0)", "pyyaml (>=5.3.1)", "rarfile (>=4.0)", "requests-file (>=1.5.1)", "rouge-score", "s3fs (==2021.08.1)", "sacrebleu", "scikit-learn", "scipy", "sentencepiece", "seqeval", "six (>=1.15.0,<1.16.0)", "soundfile", "tensorflow (>=2.3,!=2.6.0,!=2.6.1)", "texttable (>=1.6.3)", "tldextract", "tldextract (>=3.1.0)", "toml (>=0.10.1)", "torch", "torchaudio", "torchmetrics (==0.6.0)", "transformers", "wget (>=3.2)", "zstandard"]
docs = ["Markdown (!=3.3.5)", "docutils (==0.16.0)", "fsspec (<2021.9.0)", "myst-parser", "recommonmark", "s3fs", "sphinx (==3.1.2)", "sphinx-copybutton", "sphinx-inline-tabs", "sphinx-markdown-tables", "sphinx-panels", "sphinx-rtd-theme (==0.4.3)", "sphinxext-opengraph (==0.4.1)"]
quality = ["black (==21.4b0)", "flake8 (>=3.8.3)", "isort (>=5.0.0)", "pyyaml (>=5.3.1)"]
s3 = ["boto3", "botocore", "fsspec", "s3fs"]
tensorflow = ["tensorflow (>=2.2.0,!=2.6.0,!=2.6.1)"]
tensorflow-gpu = ["tensorflow-gpu (>=2.2.0,!=2.6.0,!=2.6.1)"]
tests = ["Pillow (>=6.2.1)", "Werkzeug (>=1.0.1)", "absl-py", "aiobotocore", "apache-beam (>=2.26.0)", "bert-score (>=0.3.6)", "boto3", "botocore", "bs4", "conllu", "elasticsearch", "fairseq", "faiss-cpu (>=1.6.4)", "fastBPE (==0.1.0)", "fsspec[s3]", "h5py", "importlib-resources", "jiwer", "langdetect", "librosa", "lxml", "mauve-text", "moto[s3,server] (==2.0.4)", "mwparserfromhell", "nltk", "openpyxl", "py7zr", "pytest", "pytest-datadir", "pytest-xdist", "pytorch-lightning", "pytorch-nlp (==0.5.0)", "rarfile (>=4.0)", "requests-file (>=1.5.1)", "rouge-score", "s3fs (==2021.08.1)", "sacrebleu", "scikit-learn", "scipy", "sentencepiece", "seqeval", "six (>=1.15.0,<1.16.0)", "soundfile", "tensorflow (>=2.3,!=2.6.0,!=2.6.1)", "texttable (>=1.6.3)", "tldextract", "tldextract (>=3.1.0)", "toml (>=0.10.1)", "torch", "torchaudio", "torchmetrics (==0.6.0)", "transformers", "wget (>=3.2)", "zstandard"]
torch = ["torch"]
vision = ["Pillow (>=6.2.1)"]
[[package]]
name = "debugpy"
version = "1.6.7"
@@ -2139,6 +2252,10 @@ files = [
{file = "fsspec-2023.5.0.tar.gz", hash = "sha256:b3b56e00fb93ea321bc9e5d9cf6f8522a0198b20eb24e02774d329e9c6fb84ce"},
]
[package.dependencies]
aiohttp = {version = "<4.0.0a0 || >4.0.0a0,<4.0.0a1 || >4.0.0a1", optional = true, markers = "extra == \"http\""}
requests = {version = "*", optional = true, markers = "extra == \"http\""}
[package.extras]
abfs = ["adlfs"]
adl = ["adlfs"]
@@ -3119,6 +3236,21 @@ widgetsnbextension = ">=4.0.7,<4.1.0"
[package.extras]
test = ["ipykernel", "jsonschema", "pytest (>=3.6.0)", "pytest-cov", "pytz"]
[[package]]
name = "isodate"
version = "0.6.1"
description = "An ISO 8601 date/time/duration parser and formatter"
category = "main"
optional = true
python-versions = "*"
files = [
{file = "isodate-0.6.1-py2.py3-none-any.whl", hash = "sha256:0751eece944162659049d35f4f549ed815792b38793f07cf73381c1c87cbed96"},
{file = "isodate-0.6.1.tar.gz", hash = "sha256:48c5881de7e8b0a0d648cb024c8062dc84e7b840ed81e864c7614fd3c127bde9"},
]
[package.dependencies]
six = "*"
[[package]]
name = "isoduration"
version = "20.11.0"
@@ -3772,6 +3904,30 @@ files = [
[package.extras]
data = ["language-data (>=1.1,<2.0)"]
[[package]]
name = "langkit"
version = "0.0.1b2"
description = "A collection of text metric udfs for whylogs profiling and monitoring in WhyLabs"
category = "main"
optional = true
python-versions = ">=3.8,<4.0"
files = [
{file = "langkit-0.0.1b2-py3-none-any.whl", hash = "sha256:8059d48bb1bbf90da5f5103585dece57fa09d156b0490f8a6c88277789a19021"},
{file = "langkit-0.0.1b2.tar.gz", hash = "sha256:c2dd7cf93921dc77d6c7516746351fa503684f3be35392c187f4418a0748ef50"},
]
[package.dependencies]
datasets = "*"
nltk = ">=3.8.1,<4.0.0"
openai = "*"
pandas = "*"
sentence-transformers = ">=2.2.2,<3.0.0"
textstat = ">=0.7.3,<0.8.0"
whylogs = ">=1.1.42.dev3,<2.0.0"
[package.extras]
io = ["torch"]
[[package]]
name = "lark"
version = "1.1.5"
@@ -4162,6 +4318,32 @@ files = [
[package.dependencies]
marshmallow = ">=2.0.0"
[[package]]
name = "mastodon-py"
version = "1.8.1"
description = "Python wrapper for the Mastodon API"
category = "dev"
optional = false
python-versions = "*"
files = [
{file = "Mastodon.py-1.8.1-py2.py3-none-any.whl", hash = "sha256:22bc7e060518ef2eaa69d911cde6e4baf56bed5ea0dd407392c49051a7ac526a"},
{file = "Mastodon.py-1.8.1.tar.gz", hash = "sha256:4a64cb94abadd6add73e4b8eafdb5c466048fa5f638284fd2189034104d4687e"},
]
[package.dependencies]
blurhash = ">=1.1.4"
decorator = ">=4.0.0"
python-dateutil = "*"
python-magic = {version = "*", markers = "platform_system != \"Windows\""}
python-magic-bin = {version = "*", markers = "platform_system == \"Windows\""}
requests = ">=2.4.2"
six = "*"
[package.extras]
blurhash = ["blurhash (>=1.1.4)"]
test = ["blurhash (>=1.1.4)", "cryptography (>=1.6.0)", "http-ece (>=1.0.5)", "pytest", "pytest-cov", "pytest-mock", "pytest-runner", "pytest-vcr", "pytz", "requests-mock", "vcrpy"]
webpush = ["cryptography (>=1.6.0)", "http-ece (>=1.0.5)"]
[[package]]
name = "matplotlib-inline"
version = "0.1.6"
@@ -4402,6 +4584,28 @@ files = [
{file = "msgpack-1.0.5.tar.gz", hash = "sha256:c075544284eadc5cddc70f4757331d99dcbc16b2bbd4849d15f8aae4cf36d31c"},
]
[[package]]
name = "msrest"
version = "0.7.1"
description = "AutoRest swagger generator Python client runtime."
category = "main"
optional = true
python-versions = ">=3.6"
files = [
{file = "msrest-0.7.1-py3-none-any.whl", hash = "sha256:21120a810e1233e5e6cc7fe40b474eeb4ec6f757a15d7cf86702c369f9567c32"},
{file = "msrest-0.7.1.zip", hash = "sha256:6e7661f46f3afd88b75667b7187a92829924446c7ea1d169be8c4bb7eeb788b9"},
]
[package.dependencies]
azure-core = ">=1.24.0"
certifi = ">=2017.4.17"
isodate = ">=0.6.0"
requests = ">=2.16,<3.0"
requests-oauthlib = ">=0.5.0"
[package.extras]
async = ["aiodns", "aiohttp (>=3.0)"]
[[package]]
name = "multidict"
version = "6.0.4"
@@ -5236,6 +5440,21 @@ files = [
[package.dependencies]
pydantic = ">=1.8.2"
[[package]]
name = "openlm"
version = "0.0.5"
description = "Drop-in OpenAI-compatible that can call LLMs from other providers"
category = "main"
optional = true
python-versions = ">=3.8.1,<4.0"
files = [
{file = "openlm-0.0.5-py3-none-any.whl", hash = "sha256:9fcbbc575d2869e2a6c0b00827f9be2189c067c2de4bf03ef3cbdf488367ae93"},
{file = "openlm-0.0.5.tar.gz", hash = "sha256:0eb3fd7a9e4f7b4248931ff2f0dc91c525d990b99956886861a1b3f9868bc451"},
]
[package.dependencies]
requests = ">=2,<3"
[[package]]
name = "opensearch-py"
version = "2.2.0"
@@ -6655,6 +6874,7 @@ files = [
{file = "pylance-0.4.12-cp38-abi3-macosx_10_15_x86_64.whl", hash = "sha256:2b86fb8dccc03094c0db37bef0d91bda60e8eb0d1eddf245c6971450c8d8a53f"},
{file = "pylance-0.4.12-cp38-abi3-macosx_11_0_arm64.whl", hash = "sha256:0bc82914b13204187d673b5f3d45f93219c38a0e9d0542ba251074f639669789"},
{file = "pylance-0.4.12-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5a4bcce77f99ecd4cbebbadb01e58d5d8138d40eb56bdcdbc3b20b0475e7a472"},
{file = "pylance-0.4.12-cp38-abi3-win_amd64.whl", hash = "sha256:9616931c5300030adb9626d22515710a127d1e46a46737a7a0f980b52f13627c"},
]
[package.dependencies]
@@ -6880,6 +7100,22 @@ files = [
{file = "pypdfium2-4.11.0.tar.gz", hash = "sha256:f1d3bd0841f0c2e9db417075896dafc5906bbd7c0ccdc2b6e2b3f44d61d49f46"},
]
[[package]]
name = "pyphen"
version = "0.14.0"
description = "Pure Python module to hyphenate text"
category = "main"
optional = true
python-versions = ">=3.7"
files = [
{file = "pyphen-0.14.0-py3-none-any.whl", hash = "sha256:414c9355958ca3c6a3ff233f65678c245b8ecb56418fb291e2b93499d61cd510"},
{file = "pyphen-0.14.0.tar.gz", hash = "sha256:596c8b3be1c1a70411ba5f6517d9ccfe3083c758ae2b94a45f2707346d8e66fa"},
]
[package.extras]
doc = ["sphinx", "sphinx_rtd_theme"]
test = ["flake8", "isort", "pytest"]
[[package]]
name = "pyrsistent"
version = "0.19.3"
@@ -7151,6 +7387,31 @@ files = [
{file = "python_json_logger-2.0.7-py3-none-any.whl", hash = "sha256:f380b826a991ebbe3de4d897aeec42760035ac760345e57b812938dc8b35e2bd"},
]
[[package]]
name = "python-magic"
version = "0.4.27"
description = "File type identification using libmagic"
category = "dev"
optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*"
files = [
{file = "python-magic-0.4.27.tar.gz", hash = "sha256:c1ba14b08e4a5f5c31a302b7721239695b2f0f058d125bd5ce1ee36b9d9d3c3b"},
{file = "python_magic-0.4.27-py2.py3-none-any.whl", hash = "sha256:c212960ad306f700aa0d01e5d7a325d20548ff97eb9920dcd29513174f0294d3"},
]
[[package]]
name = "python-magic-bin"
version = "0.4.14"
description = "File type identification using libmagic binary package"
category = "dev"
optional = false
python-versions = "*"
files = [
{file = "python_magic_bin-0.4.14-py2.py3-none-macosx_10_6_intel.whl", hash = "sha256:7b1743b3dbf16601d6eedf4e7c2c9a637901b0faaf24ad4df4d4527e7d8f66a4"},
{file = "python_magic_bin-0.4.14-py2.py3-none-win32.whl", hash = "sha256:34a788c03adde7608028203e2dbb208f1f62225ad91518787ae26d603ae68892"},
{file = "python_magic_bin-0.4.14-py2.py3-none-win_amd64.whl", hash = "sha256:90be6206ad31071a36065a2fc169c5afb5e0355cbe6030e87641c6c62edc2b69"},
]
[[package]]
name = "python-multipart"
version = "0.0.6"
@@ -8969,6 +9230,21 @@ tornado = ">=6.1.0"
docs = ["myst-parser", "pydata-sphinx-theme", "sphinx"]
test = ["pre-commit", "pytest (>=7.0)", "pytest-timeout"]
[[package]]
name = "textstat"
version = "0.7.3"
description = "Calculate statistical features from text"
category = "main"
optional = true
python-versions = ">=3.6"
files = [
{file = "textstat-0.7.3-py3-none-any.whl", hash = "sha256:cbd9d641aa5abff0852638f0489913f31ea52fe597ccbaa337b4fc2a44efd15e"},
{file = "textstat-0.7.3.tar.gz", hash = "sha256:60b63cf8949f45bbb3b4205e4411bbc1cd66df4c08aef12545811c7e6e24f011"},
]
[package.dependencies]
pyphen = "*"
[[package]]
name = "thinc"
version = "8.1.10"
@@ -10024,6 +10300,95 @@ files = [
[package.extras]
test = ["pytest (>=6.0.0)"]
[[package]]
name = "whylabs-client"
version = "0.5.1"
description = "WhyLabs API client"
category = "main"
optional = true
python-versions = ">=3.6"
files = [
{file = "whylabs-client-0.5.1.tar.gz", hash = "sha256:f7aacfab7d176812c2eb4cdeb8c52521eed0d30bc2a0836399798197a513cf04"},
{file = "whylabs_client-0.5.1-py3-none-any.whl", hash = "sha256:dc6958d5bb390f1057fe6f513cbce55c4e71d5f8a1461a7c93eb73814089de33"},
]
[package.dependencies]
python-dateutil = "*"
urllib3 = ">=1.25.3"
[[package]]
name = "whylogs"
version = "1.1.42.dev3"
description = "Profile and monitor your ML data pipeline end-to-end"
category = "main"
optional = true
python-versions = ">=3.7.1,<4"
files = [
{file = "whylogs-1.1.42.dev3-py3-none-any.whl", hash = "sha256:99aadb05b68c6c2dc5d00ba1fb45bdd5ac2c3da3fe812f3fd1573a0f06674121"},
{file = "whylogs-1.1.42.dev3.tar.gz", hash = "sha256:c82badf821f56935fd274e696e4d5ed151934e486f23ea5f5c60af31e6cdb632"},
]
[package.dependencies]
protobuf = ">=3.19.4"
typing-extensions = {version = ">=3.10", markers = "python_version < \"4\""}
whylabs-client = ">=0.4.4,<1"
whylogs-sketching = ">=3.4.1.dev3"
[package.extras]
all = ["Pillow (>=9.2.0,<10.0.0)", "boto3 (>=1.22.13,<2.0.0)", "fugue (>=0.8.1,<0.9.0)", "google-cloud-storage (>=2.5.0,<3.0.0)", "ipython", "mlflow-skinny (>=1.26.1,<2.0.0)", "numpy", "numpy (>=1.23.2)", "pandas", "pyarrow (>=8.0.0,<13)", "pybars3 (>=0.9,<0.10)", "pyspark (>=3.0.0,<4.0.0)", "requests (>=2.27,<3.0)", "scikit-learn (>=1.0.2,<2.0.0)", "scikit-learn (>=1.1.2,<2)", "scipy (>=1.5)", "scipy (>=1.9.2)"]
datasets = ["pandas"]
docs = ["furo (>=2022.3.4,<2023.0.0)", "ipython_genutils (>=0.2.0,<0.3.0)", "myst-parser[sphinx] (>=0.17.2,<0.18.0)", "nbconvert (>=7.0.0,<8.0.0)", "nbsphinx (>=0.8.9,<0.9.0)", "sphinx", "sphinx-autoapi", "sphinx-autobuild (>=2021.3.14,<2022.0.0)", "sphinx-copybutton (>=0.5.0,<0.6.0)", "sphinx-inline-tabs", "sphinxext-opengraph (>=0.6.3,<0.7.0)"]
embeddings = ["numpy", "numpy (>=1.23.2)", "scikit-learn (>=1.0.2,<2.0.0)", "scikit-learn (>=1.1.2,<2)"]
fugue = ["fugue (>=0.8.1,<0.9.0)"]
gcs = ["google-cloud-storage (>=2.5.0,<3.0.0)"]
image = ["Pillow (>=9.2.0,<10.0.0)"]
mlflow = ["mlflow-skinny (>=1.26.1,<2.0.0)"]
s3 = ["boto3 (>=1.22.13,<2.0.0)"]
spark = ["pyarrow (>=8.0.0,<13)", "pyspark (>=3.0.0,<4.0.0)"]
viz = ["Pillow (>=9.2.0,<10.0.0)", "ipython", "numpy", "numpy (>=1.23.2)", "pybars3 (>=0.9,<0.10)", "requests (>=2.27,<3.0)", "scipy (>=1.5)", "scipy (>=1.9.2)"]
whylabs = ["requests (>=2.27,<3.0)"]
[[package]]
name = "whylogs-sketching"
version = "3.4.1.dev3"
description = "sketching library of whylogs"
category = "main"
optional = true
python-versions = "*"
files = [
{file = "whylogs-sketching-3.4.1.dev3.tar.gz", hash = "sha256:40b90eb9d5e4cbbfa63f6a1f3a3332b72258d270044b79030dc5d720fddd9499"},
{file = "whylogs_sketching-3.4.1.dev3-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:9c20134eda881064099264f795d60321777b5e6c2357125a7a2787c9f15db684"},
{file = "whylogs_sketching-3.4.1.dev3-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:e76ac4c2d0214b8de8598867e721f774cca8877267bc2a9b2d0d06950fe76bd5"},
{file = "whylogs_sketching-3.4.1.dev3-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:edc2b463d926ccacb7ee2147d206850bb0cbfea8766f091e8c575ada48db1cfd"},
{file = "whylogs_sketching-3.4.1.dev3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:fdc2a3bd73895d1ffac1b3028ff55aaa6b60a9ec42d7b6b5785fa140f303dec0"},
{file = "whylogs_sketching-3.4.1.dev3-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:46460eefcf22bcf20b0e6208de32e358478c17b1239221eb038d434f14ec427c"},
{file = "whylogs_sketching-3.4.1.dev3-cp310-cp310-win_amd64.whl", hash = "sha256:58b99a070429a7119a5727ac61c4e9ebcd6e92eed3d2646931a487fff3d6630e"},
{file = "whylogs_sketching-3.4.1.dev3-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:531a4af8f707c1e8138a4ae41a117ba53241372bf191666a9e6b44ab6cd9e634"},
{file = "whylogs_sketching-3.4.1.dev3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:0ba536fca5f9578fa34d106c243fdccfef7d75b9d1fffb9d93df0debfe8e3ebc"},
{file = "whylogs_sketching-3.4.1.dev3-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:afa843c68cafa08e82624e6a33d13ab7f00ad0301101960872fe152d5af5ab53"},
{file = "whylogs_sketching-3.4.1.dev3-cp311-cp311-win_amd64.whl", hash = "sha256:303d55c37565340c2d21c268c64a712fad612504cc4b98b1d1df848cac6d934f"},
{file = "whylogs_sketching-3.4.1.dev3-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:9d65fcf8dade1affe50181582b8894929993e37d7daa922d973a811790cd0208"},
{file = "whylogs_sketching-3.4.1.dev3-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c4845e77c208ae64ada9170e1b92ed0abe28fe311c0fc35f9d8efa6926211ca2"},
{file = "whylogs_sketching-3.4.1.dev3-cp36-cp36m-musllinux_1_1_x86_64.whl", hash = "sha256:02cac1c87ac42d7fc7e6597862ac50bc035825988d21e8a2d763b416e83e845f"},
{file = "whylogs_sketching-3.4.1.dev3-cp36-cp36m-win_amd64.whl", hash = "sha256:52a174784e69870543fb87910e5549759f01a7f4cb6cac1773b2cc194ec0b72f"},
{file = "whylogs_sketching-3.4.1.dev3-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:0931fc7500b78baf8f44222f1e3b58cfb707b0120328bc16cc50beaab5a954ec"},
{file = "whylogs_sketching-3.4.1.dev3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:803c104338a5c4e1c6eb077d35ca3a4443e455aa4e7f2769c93560bf135cdeb3"},
{file = "whylogs_sketching-3.4.1.dev3-cp37-cp37m-musllinux_1_1_x86_64.whl", hash = "sha256:49e8f20351077497880b088dff9342f4b54d2d3c650c0b43daf121d97fb42468"},
{file = "whylogs_sketching-3.4.1.dev3-cp37-cp37m-win_amd64.whl", hash = "sha256:f9f3507b5df34de7a95b75f80009644371dd6406f9d8795e820edf8a594aeacc"},
{file = "whylogs_sketching-3.4.1.dev3-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:2986dd5b35a93267e6d89e7aa256f714105bbe61bdb0381aeab588c2688e46b6"},
{file = "whylogs_sketching-3.4.1.dev3-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:14f1bf4903e9cd2a196fe5a7268cca1434d423233e073917130d5b845f539c2a"},
{file = "whylogs_sketching-3.4.1.dev3-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2ecfe0e4a629a4cefec9d7c7fac234119688085ba5f62feabed710cb5a322f8b"},
{file = "whylogs_sketching-3.4.1.dev3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:000e2c11b7bbbdefb3a343c15955868a682c02d607557fef7bad5a6ffd09a0cf"},
{file = "whylogs_sketching-3.4.1.dev3-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:1e70ed1ed2f9c174a80673ae2ca24c1ec0e2a01c0bd6b0728640492fd5a50178"},
{file = "whylogs_sketching-3.4.1.dev3-cp38-cp38-win_amd64.whl", hash = "sha256:9efd56d5a21566fc49126ef54d37116078763bb9f8955b9c77421b4ca3fd8fc6"},
{file = "whylogs_sketching-3.4.1.dev3-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:832247fd9d3ecf13791418a75c359db6c3aeffd51d7372d026e95f307ef286cc"},
{file = "whylogs_sketching-3.4.1.dev3-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:cc81b547e331d96f6f4227280b9b5968ca4bd48dd7cb0c8b68c022037800009f"},
{file = "whylogs_sketching-3.4.1.dev3-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3abf13da4347393a302843c2f06ce4e5fc56fd9c8564f64da13ceafb81eef90b"},
{file = "whylogs_sketching-3.4.1.dev3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d1d6e7d0ddb66ab725d7af63518ef6a24cd45b075b81e1d2081709df4c989853"},
{file = "whylogs_sketching-3.4.1.dev3-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:0b05112e3f70cfccddd2f72e464fa113307d97188891433133d4219b9f8f5456"},
{file = "whylogs_sketching-3.4.1.dev3-cp39-cp39-win_amd64.whl", hash = "sha256:23759a00dd0e7019fbac06d9e9ab005ad6c14f80ec7935ccebccb7127296bc06"},
]

[[package]]
name = "widgetsnbextension"
version = "4.0.7"
@@ -10199,6 +10564,114 @@ files = [
{file = "xmltodict-0.13.0.tar.gz", hash = "sha256:341595a488e3e01a85a9d8911d8912fd922ede5fecc4dce437eb4b6c8d037e56"},
]

[[package]]
name = "xxhash"
version = "3.2.0"
description = "Python binding for xxHash"
category = "main"
optional = true
python-versions = ">=3.6"
files = [
{file = "xxhash-3.2.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:af44b9e59c4b2926a4e3c7f9d29949ff42fcea28637ff6b8182e654461932be8"},
{file = "xxhash-3.2.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:1bdd57973e2b802ef32553d7bebf9402dac1557874dbe5c908b499ea917662cd"},
{file = "xxhash-3.2.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:6b7c9aa77bbce61a5e681bd39cb6a804338474dcc90abe3c543592aa5d6c9a9b"},
{file = "xxhash-3.2.0-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:11bf87dc7bb8c3b0b5e24b7b941a9a19d8c1f88120b6a03a17264086bc8bb023"},
{file = "xxhash-3.2.0-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:2783d41487ce6d379fdfaa7332fca5187bf7010b9bddcf20cafba923bc1dc665"},
{file = "xxhash-3.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:561076ca0dcef2fbc20b2bc2765bff099e002e96041ae9dbe910a863ca6ee3ea"},
{file = "xxhash-3.2.0-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:3a26eeb4625a6e61cedc8c1b39b89327c9c7e1a8c2c4d786fe3f178eb839ede6"},
{file = "xxhash-3.2.0-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:d93a44d0104d1b9b10de4e7aadf747f6efc1d7ec5ed0aa3f233a720725dd31bd"},
{file = "xxhash-3.2.0-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:89585adc73395a10306d2e2036e50d6c4ac0cf8dd47edf914c25488871b64f6d"},
{file = "xxhash-3.2.0-cp310-cp310-musllinux_1_1_ppc64le.whl", hash = "sha256:a892b4b139126a86bfdcb97cd912a2f8c4e8623869c3ef7b50871451dd7afeb0"},
{file = "xxhash-3.2.0-cp310-cp310-musllinux_1_1_s390x.whl", hash = "sha256:e998efb190653f70e0f30d92b39fc645145369a4823bee46af8ddfc244aa969d"},
{file = "xxhash-3.2.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:e8ed3bd2b8bb3277710843ca63e4f5c3ee6f8f80b083be5b19a7a9905420d11e"},
{file = "xxhash-3.2.0-cp310-cp310-win32.whl", hash = "sha256:20181cbaed033c72cb881b2a1d13c629cd1228f113046133469c9a48cfcbcd36"},
{file = "xxhash-3.2.0-cp310-cp310-win_amd64.whl", hash = "sha256:a0f7a16138279d707db778a63264d1d6016ac13ffd3f1e99f54b2855d6c0d8e1"},
{file = "xxhash-3.2.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:5daff3fb5bfef30bc5a2cb143810d376d43461445aa17aece7210de52adbe151"},
{file = "xxhash-3.2.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:75bb5be3c5de702a547715f320ecf5c8014aeca750ed5147ca75389bd22e7343"},
{file = "xxhash-3.2.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:01f36b671ff55cb1d5c2f6058b799b697fd0ae4b4582bba6ed0999678068172a"},
{file = "xxhash-3.2.0-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:d4d4519123aac73c93159eb8f61db9682393862dd669e7eae034ecd0a35eadac"},
{file = "xxhash-3.2.0-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:994e4741d5ed70fc2a335a91ef79343c6b1089d7dfe6e955dd06f8ffe82bede6"},
{file = "xxhash-3.2.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:919bc1b010aa6ff0eb918838ff73a435aed9e9a19c3202b91acecd296bf75607"},
{file = "xxhash-3.2.0-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:17b65454c5accbb079c45eca546c27c4782f5175aa320758fafac896b1549d27"},
{file = "xxhash-3.2.0-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:b0c094d5e65a46dbf3fe0928ff20873a747e6abfd2ed4b675beeb2750624bc2e"},
{file = "xxhash-3.2.0-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:f94163ebe2d5546e6a5977e96d83621f4689c1054053428cf8d4c28b10f92f69"},
{file = "xxhash-3.2.0-cp311-cp311-musllinux_1_1_ppc64le.whl", hash = "sha256:cead7c0307977a00b3f784cff676e72c147adbcada19a2e6fc2ddf54f37cf387"},
{file = "xxhash-3.2.0-cp311-cp311-musllinux_1_1_s390x.whl", hash = "sha256:a0e1bd0260c1da35c1883321ce2707ceea07127816ab625e1226ec95177b561a"},
{file = "xxhash-3.2.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:cc8878935671490efe9275fb4190a6062b73277bd273237179b9b5a2aa436153"},
{file = "xxhash-3.2.0-cp311-cp311-win32.whl", hash = "sha256:a433f6162b18d52f7068175d00bd5b1563b7405f926a48d888a97b90a160c40d"},
{file = "xxhash-3.2.0-cp311-cp311-win_amd64.whl", hash = "sha256:a32d546a1752e4ee7805d6db57944f7224afa7428d22867006b6486e4195c1f3"},
{file = "xxhash-3.2.0-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:82daaab720866bf690b20b49de5640b0c27e3b8eea2d08aa75bdca2b0f0cfb63"},
{file = "xxhash-3.2.0-cp36-cp36m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3126df6520cbdbaddd87ce74794b2b6c45dd2cf6ac2b600a374b8cdb76a2548c"},
{file = "xxhash-3.2.0-cp36-cp36m-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:e172c1ee40507ae3b8d220f4048aaca204f203e1e4197e8e652f5c814f61d1aa"},
{file = "xxhash-3.2.0-cp36-cp36m-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:5384f1d9f30876f5d5b618464fb19ff7ce6c0fe4c690fbaafd1c52adc3aae807"},
{file = "xxhash-3.2.0-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:26cb52174a7e96a17acad27a3ca65b24713610ac479c99ac9640843822d3bebf"},
{file = "xxhash-3.2.0-cp36-cp36m-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:fbcd613a5e76b1495fc24db9c37a6b7ee5f214fd85979187ec4e032abfc12ded"},
{file = "xxhash-3.2.0-cp36-cp36m-musllinux_1_1_aarch64.whl", hash = "sha256:f988daf25f31726d5b9d0be6af636ca9000898f9ea43a57eac594daea25b0948"},
{file = "xxhash-3.2.0-cp36-cp36m-musllinux_1_1_i686.whl", hash = "sha256:bbc30c98ab006ab9fc47e5ed439c00f706bc9d4441ff52693b8b6fea335163e0"},
{file = "xxhash-3.2.0-cp36-cp36m-musllinux_1_1_ppc64le.whl", hash = "sha256:2408d49260b0a4a7cc6ba445aebf38e073aeaf482f8e32767ca477e32ccbbf9e"},
{file = "xxhash-3.2.0-cp36-cp36m-musllinux_1_1_s390x.whl", hash = "sha256:3f4152fd0bf8b03b79f2f900fd6087a66866537e94b5a11fd0fd99ef7efe5c42"},
{file = "xxhash-3.2.0-cp36-cp36m-musllinux_1_1_x86_64.whl", hash = "sha256:0eea848758e4823a01abdbcccb021a03c1ee4100411cbeeb7a5c36a202a0c13c"},
{file = "xxhash-3.2.0-cp36-cp36m-win32.whl", hash = "sha256:77709139af5123c578ab06cf999429cdb9ab211047acd0c787e098dcb3f1cb4d"},
{file = "xxhash-3.2.0-cp36-cp36m-win_amd64.whl", hash = "sha256:91687671fd9d484a4e201ad266d366b695a45a1f2b41be93d116ba60f1b8f3b3"},
{file = "xxhash-3.2.0-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:e4af8bc5c3fcc2192c266421c6aa2daab1a18e002cb8e66ef672030e46ae25cf"},
{file = "xxhash-3.2.0-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e8be562e2ce3e481d9209b6f254c3d7c5ff920eb256aba2380d2fb5ba75d4f87"},
{file = "xxhash-3.2.0-cp37-cp37m-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:9eba0c7c12126b12f7fcbea5513f28c950d28f33d2a227f74b50b77789e478e8"},
{file = "xxhash-3.2.0-cp37-cp37m-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:2198c4901a0223c48f6ec0a978b60bca4f4f7229a11ca4dc96ca325dd6a29115"},
{file = "xxhash-3.2.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:50ce82a71b22a3069c02e914bf842118a53065e2ec1c6fb54786e03608ab89cc"},
{file = "xxhash-3.2.0-cp37-cp37m-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:b5019fb33711c30e54e4e57ae0ca70af9d35b589d385ac04acd6954452fa73bb"},
{file = "xxhash-3.2.0-cp37-cp37m-musllinux_1_1_aarch64.whl", hash = "sha256:0d54ac023eef7e3ac9f0b8841ae8a376b933043bc2ad428121346c6fa61c491c"},
{file = "xxhash-3.2.0-cp37-cp37m-musllinux_1_1_i686.whl", hash = "sha256:c55fa832fc3fe64e0d29da5dc9b50ba66ca93312107cec2709300ea3d3bab5c7"},
{file = "xxhash-3.2.0-cp37-cp37m-musllinux_1_1_ppc64le.whl", hash = "sha256:f4ce006215497993ae77c612c1883ca4f3973899573ce0c52fee91f0d39c4561"},
{file = "xxhash-3.2.0-cp37-cp37m-musllinux_1_1_s390x.whl", hash = "sha256:1afb9b9d27fd675b436cb110c15979976d92d761ad6e66799b83756402f3a974"},
{file = "xxhash-3.2.0-cp37-cp37m-musllinux_1_1_x86_64.whl", hash = "sha256:baa99cebf95c1885db21e119395f222a706a2bb75a545f0672880a442137725e"},
{file = "xxhash-3.2.0-cp37-cp37m-win32.whl", hash = "sha256:75aa692936942ccb2e8fd6a386c81c61630ac1b6d6e921698122db8a930579c3"},
{file = "xxhash-3.2.0-cp37-cp37m-win_amd64.whl", hash = "sha256:0a2cdfb5cae9fafb9f7b65fd52ecd60cf7d72c13bb2591ea59aaefa03d5a8827"},
{file = "xxhash-3.2.0-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:3a68d1e8a390b660d94b9360ae5baa8c21a101bd9c4790a8b30781bada9f1fc6"},
{file = "xxhash-3.2.0-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:ce7c3ce28f94302df95eaea7c9c1e2c974b6d15d78a0c82142a97939d7b6c082"},
{file = "xxhash-3.2.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0dcb419bf7b0bc77d366e5005c25682249c5521a63fd36c51f584bd91bb13bd5"},
{file = "xxhash-3.2.0-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:ae521ed9287f86aac979eeac43af762f03d9d9797b2272185fb9ddd810391216"},
{file = "xxhash-3.2.0-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:b0d16775094423088ffa357d09fbbb9ab48d2fb721d42c0856b801c86f616eec"},
{file = "xxhash-3.2.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:fe454aeab348c42f56d6f7434ff758a3ef90787ac81b9ad5a363cd61b90a1b0b"},
{file = "xxhash-3.2.0-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:052fd0efdd5525c2dbc61bebb423d92aa619c4905bba605afbf1e985a562a231"},
{file = "xxhash-3.2.0-cp38-cp38-musllinux_1_1_aarch64.whl", hash = "sha256:02badf3754e2133de254a4688798c4d80f0060635087abcb461415cb3eb82115"},
{file = "xxhash-3.2.0-cp38-cp38-musllinux_1_1_i686.whl", hash = "sha256:66b8a90b28c13c2aae7a71b32638ceb14cefc2a1c8cf23d8d50dfb64dfac7aaf"},
{file = "xxhash-3.2.0-cp38-cp38-musllinux_1_1_ppc64le.whl", hash = "sha256:649cdf19df175925ad87289ead6f760cd840730ee85abc5eb43be326a0a24d97"},
{file = "xxhash-3.2.0-cp38-cp38-musllinux_1_1_s390x.whl", hash = "sha256:4b948a03f89f5c72d69d40975af8af241111f0643228796558dc1cae8f5560b0"},
{file = "xxhash-3.2.0-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:49f51fab7b762da7c2cee0a3d575184d3b9be5e2f64f26cae2dd286258ac9b3c"},
{file = "xxhash-3.2.0-cp38-cp38-win32.whl", hash = "sha256:1a42994f0d42b55514785356722d9031f064fd34e495b3a589e96db68ee0179d"},
{file = "xxhash-3.2.0-cp38-cp38-win_amd64.whl", hash = "sha256:0a6d58ba5865475e53d6c2c4fa6a62e2721e7875e146e2681e5337a6948f12e7"},
{file = "xxhash-3.2.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:aabdbc082030f8df613e2d2ea1f974e7ad36a539bdfc40d36f34e55c7e4b8e94"},
{file = "xxhash-3.2.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:498843b66b9ca416e9d03037e5875c8d0c0ab9037527e22df3b39aa5163214cd"},
{file = "xxhash-3.2.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a910b1193cd90af17228f5d6069816646df0148f14f53eefa6b2b11a1dedfcd0"},
{file = "xxhash-3.2.0-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:bb6d8ce31dc25faf4da92991320e211fa7f42de010ef51937b1dc565a4926501"},
{file = "xxhash-3.2.0-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:883dc3d3942620f4c7dbc3fd6162f50a67f050b714e47da77444e3bcea7d91cc"},
{file = "xxhash-3.2.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:59dc8bfacf89b8f5be54d55bc3b4bd6d74d0c5320c8a63d2538ac7df5b96f1d5"},
{file = "xxhash-3.2.0-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:61e6aa1d30c2af692aa88c4dd48709426e8b37bff6a574ee2de677579c34a3d6"},
{file = "xxhash-3.2.0-cp39-cp39-musllinux_1_1_aarch64.whl", hash = "sha256:314ec0bd21f0ee8d30f2bd82ed3759314bd317ddbbd8555668f3d20ab7a8899a"},
{file = "xxhash-3.2.0-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:dad638cde3a5357ad3163b80b3127df61fb5b5e34e9e05a87697144400ba03c7"},
{file = "xxhash-3.2.0-cp39-cp39-musllinux_1_1_ppc64le.whl", hash = "sha256:eaa3ea15025b56076d806b248948612289b093e8dcda8d013776b3848dffff15"},
{file = "xxhash-3.2.0-cp39-cp39-musllinux_1_1_s390x.whl", hash = "sha256:7deae3a312feb5c17c97cbf18129f83cbd3f1f9ec25b0f50e2bd9697befb22e7"},
{file = "xxhash-3.2.0-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:add774341c09853b1612c64a526032d95ab1683053325403e1afbe3ad2f374c5"},
{file = "xxhash-3.2.0-cp39-cp39-win32.whl", hash = "sha256:9b94749130ef3119375c599bfce82142c2500ef9ed3280089157ee37662a7137"},
{file = "xxhash-3.2.0-cp39-cp39-win_amd64.whl", hash = "sha256:e57d94a1552af67f67b27db5dba0b03783ea69d5ca2af2f40e098f0ba3ce3f5f"},
{file = "xxhash-3.2.0-pp37-pypy37_pp73-macosx_10_9_x86_64.whl", hash = "sha256:92fd765591c83e5c5f409b33eac1d3266c03d3d11c71a7dbade36d5cdee4fbc0"},
{file = "xxhash-3.2.0-pp37-pypy37_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:8970f6a411a9839a02b23b7e90bbbba4a6de52ace009274998566dc43f36ca18"},
{file = "xxhash-3.2.0-pp37-pypy37_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c5f3e33fe6cbab481727f9aeb136a213aed7e33cd1ca27bd75e916ffacc18411"},
{file = "xxhash-3.2.0-pp37-pypy37_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:368265392cb696dd53907e2328b5a8c1bee81cf2142d0cc743caf1c1047abb36"},
{file = "xxhash-3.2.0-pp37-pypy37_pp73-win_amd64.whl", hash = "sha256:3b1f3c6d67fa9f49c4ff6b25ce0e7143bab88a5bc0f4116dd290c92337d0ecc7"},
{file = "xxhash-3.2.0-pp38-pypy38_pp73-macosx_10_9_x86_64.whl", hash = "sha256:c5e8db6e1ee7267b7c412ad0afd5863bf7a95286b8333a5958c8097c69f94cf5"},
{file = "xxhash-3.2.0-pp38-pypy38_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:761df3c7e2c5270088b691c5a8121004f84318177da1ca1db64222ec83c44871"},
{file = "xxhash-3.2.0-pp38-pypy38_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d2d15a707e7f689531eb4134eccb0f8bf3844bb8255ad50823aa39708d9e6755"},
{file = "xxhash-3.2.0-pp38-pypy38_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:e6b2ba4ff53dd5f57d728095e3def7375eb19c90621ce3b41b256de84ec61cfd"},
{file = "xxhash-3.2.0-pp38-pypy38_pp73-win_amd64.whl", hash = "sha256:61b0bcf946fdfd8ab5f09179dc2b5c74d1ef47cedfc6ed0ec01fdf0ee8682dd3"},
{file = "xxhash-3.2.0-pp39-pypy39_pp73-macosx_10_9_x86_64.whl", hash = "sha256:f7b79f0f302396d8e0d444826ceb3d07b61977793886ebae04e82796c02e42dc"},
{file = "xxhash-3.2.0-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e0773cd5c438ffcd5dbff91cdd503574f88a4b960e70cedeb67736583a17a918"},
{file = "xxhash-3.2.0-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4ec1f57127879b419a2c8d2db9d9978eb26c61ae17e5972197830430ae78d25b"},
{file = "xxhash-3.2.0-pp39-pypy39_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:3d4b15c00e807b1d3d0b612338c814739dec310b80fb069bd732b98ddc709ad7"},
{file = "xxhash-3.2.0-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:9d3f686e3d1c8900c5459eee02b60c7399e20ec5c6402364068a343c83a61d90"},
{file = "xxhash-3.2.0.tar.gz", hash = "sha256:1afd47af8955c5db730f630ad53ae798cf7fae0acb64cebb3cf94d35c47dd088"},
]

[[package]]
name = "yarl"
version = "1.9.2"
@@ -10379,13 +10852,13 @@ cffi = {version = ">=1.11", markers = "platform_python_implementation == \"PyPy\
cffi = ["cffi (>=1.11)"]

[extras]
all = ["O365", "aleph-alpha-client", "anthropic", "arxiv", "atlassian-python-api", "azure-cosmos", "azure-identity", "beautifulsoup4", "clickhouse-connect", "cohere", "deeplake", "docarray", "duckduckgo-search", "elasticsearch", "faiss-cpu", "google-api-python-client", "google-search-results", "gptcache", "html2text", "huggingface_hub", "jina", "jinja2", "jq", "lancedb", "lark", "lxml", "manifest-ml", "neo4j", "networkx", "nlpcloud", "nltk", "nomic", "openai", "opensearch-py", "pdfminer-six", "pexpect", "pgvector", "pinecone-client", "pinecone-text", "psycopg2-binary", "pyowm", "pypdf", "pytesseract", "pyvespa", "qdrant-client", "redis", "requests-toolbelt", "sentence-transformers", "spacy", "steamship", "tensorflow-text", "tiktoken", "torch", "transformers", "weaviate-client", "wikipedia", "wolframalpha"]
azure = ["azure-core", "azure-cosmos", "azure-identity", "openai"]
all = ["O365", "aleph-alpha-client", "anthropic", "arxiv", "atlassian-python-api", "azure-ai-formrecognizer", "azure-ai-vision", "azure-cognitiveservices-speech", "azure-cosmos", "azure-identity", "beautifulsoup4", "clickhouse-connect", "cohere", "deeplake", "docarray", "duckduckgo-search", "elasticsearch", "faiss-cpu", "google-api-python-client", "google-search-results", "gptcache", "html2text", "huggingface_hub", "jina", "jinja2", "jq", "lancedb", "langkit", "lark", "lxml", "manifest-ml", "neo4j", "networkx", "nlpcloud", "nltk", "nomic", "openai", "openlm", "opensearch-py", "pdfminer-six", "pexpect", "pgvector", "pinecone-client", "pinecone-text", "psycopg2-binary", "pyowm", "pypdf", "pytesseract", "pyvespa", "qdrant-client", "redis", "requests-toolbelt", "sentence-transformers", "spacy", "steamship", "tensorflow-text", "tiktoken", "torch", "transformers", "weaviate-client", "wikipedia", "wolframalpha"]
azure = ["azure-ai-formrecognizer", "azure-ai-vision", "azure-cognitiveservices-speech", "azure-core", "azure-cosmos", "azure-identity", "openai"]
cohere = ["cohere"]
docarray = ["docarray"]
embeddings = ["sentence-transformers"]
extended-testing = ["atlassian-python-api", "beautifulsoup4", "beautifulsoup4", "chardet", "gql", "html2text", "jq", "lxml", "pandas", "pdfminer-six", "psychicapi", "pymupdf", "pypdf", "pypdfium2", "requests-toolbelt", "telethon", "tqdm", "zep-python"]
llms = ["anthropic", "cohere", "huggingface_hub", "manifest-ml", "nlpcloud", "openai", "torch", "transformers"]
llms = ["anthropic", "cohere", "huggingface_hub", "manifest-ml", "nlpcloud", "openai", "openlm", "torch", "transformers"]
openai = ["openai", "tiktoken"]
qdrant = ["qdrant-client"]
text-helpers = ["chardet"]
@@ -10393,4 +10866,4 @@ text-helpers = ["chardet"]

[metadata]
lock-version = "2.0"
python-versions = ">=3.8.1,<4.0"
content-hash = "7c14d63435a60c32edbb4d7adcc647430cf64a80aa26c1515fba28d5433efc6b"
content-hash = "196588e10bb33939f5bae294a194ad01e803f40ed1087fe6a7a4b87e8d80712b"


@@ -89,8 +89,13 @@ telethon = {version = "^1.28.5", optional = true}
neo4j = {version = "^5.8.1", optional = true}
psychicapi = {version = "^0.2", optional = true}
zep-python = {version="^0.25", optional=true}
langkit = {version = ">=0.0.1.dev3, <0.1.0", optional = true}
chardet = {version="^5.1.0", optional=true}
requests-toolbelt = {version = "^1.0.0", optional = true}
openlm = {version = "^0.0.5", optional = true}
azure-ai-formrecognizer = {version = "^3.2.1", optional = true}
azure-ai-vision = {version = "^0.11.1b1", optional = true}
azure-cognitiveservices-speech = {version = "^1.28.0", optional = true}

[tool.poetry.group.docs.dependencies]
autodoc_pydantic = "^1.8.0"
@@ -152,6 +157,7 @@ wikipedia = "^1"
pymongo = "^4.3.3"
cassandra-driver = "^3.27.0"
arxiv = "^1.4"
mastodon-py = "^1.8.1"

[tool.poetry.group.lint.dependencies]
ruff = "^0.0.249"
@@ -174,14 +180,14 @@ playwright = "^1.28.0"
setuptools = "^67.6.1"

[tool.poetry.extras]
llms = ["anthropic", "cohere", "openai", "nlpcloud", "huggingface_hub", "manifest-ml", "torch", "transformers"]
llms = ["anthropic", "cohere", "openai", "openlm", "nlpcloud", "huggingface_hub", "manifest-ml", "torch", "transformers"]
qdrant = ["qdrant-client"]
openai = ["openai", "tiktoken"]
text_helpers = ["chardet"]
cohere = ["cohere"]
docarray = ["docarray"]
embeddings = ["sentence-transformers"]
azure = ["azure-identity", "azure-cosmos", "openai", "azure-core"]
azure = ["azure-identity", "azure-cosmos", "openai", "azure-core", "azure-ai-formrecognizer", "azure-ai-vision", "azure-cognitiveservices-speech"]
all = [
"anthropic",
"cohere",
@@ -229,6 +235,7 @@ all = [
"clickhouse-connect",
"azure-cosmos",
"lancedb",
"langkit",
"lark",
"pexpect",
"pyvespa",
@@ -240,6 +247,10 @@ all = [
"lxml",
"requests-toolbelt",
"neo4j",
"openlm",
"azure-ai-formrecognizer",
"azure-ai-vision",
"azure-cognitiveservices-speech",
]
# An extra used to be able to add extended testing.
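Every dependency added in these hunks is optional, so code paths that need one typically guard the import and point the user at the extra that provides it. A minimal sketch of that pattern, assuming a hypothetical helper (neither the helper name nor the error text is taken from the repo):

import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, naming the extra that provides it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"Could not import {module_name}. "
            f"Install it with `pip install langchain[{extra}]`."
        ) from exc


# Hypothetical usage for one of the new optional deps:
# openlm = require_optional("openlm", "llms")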


@@ -0,0 +1,14 @@
"""Tests for the Mastodon toots loader"""
from langchain.document_loaders import MastodonTootsLoader
def test_mastodon_toots_loader() -> None:
"""Test Mastodon toots loader with an external query."""
# Query the Mastodon CEO's account
loader = MastodonTootsLoader(
mastodon_accounts=["@Gargron@mastodon.social"], number_toots=1
)
docs = loader.load()
assert len(docs) == 1
assert docs[0].metadata["user_info"]["id"] == 1


@@ -0,0 +1,8 @@
from langchain.llms.openlm import OpenLM
def test_openlm_call() -> None:
"""Test valid call to openlm."""
llm = OpenLM(model_name="dolly-v2-7b", max_tokens=10)
output = llm(prompt="Say foo:")
assert isinstance(output, str)
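Since the wrapper follows the standard LLM call signature used above, the same pattern plausibly extends to hosted models as well; a hedged sketch (the model name and the provider API key requirement are assumptions, not taken from this diff):

from langchain.llms.openlm import OpenLM

# Assumption: OpenLM routes OpenAI-style model names to the matching
# provider, reading that provider's API key from the environment.
llm = OpenLM(model_name="text-davinci-003", max_tokens=10)
print(llm(prompt="Say foo:"))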


@@ -81,6 +81,28 @@ class TestWeaviate:
        )
        assert output == [Document(page_content="foo", metadata={"page": 0})]

    @pytest.mark.vcr(ignore_localhost=True)
    def test_similarity_search_with_metadata_and_additional(
        self, weaviate_url: str, embedding_openai: OpenAIEmbeddings
    ) -> None:
        """Test end to end construction and search with metadata and additional."""
        texts = ["foo", "bar", "baz"]
        metadatas = [{"page": i} for i in range(len(texts))]
        docsearch = Weaviate.from_texts(
            texts, embedding_openai, metadatas=metadatas, weaviate_url=weaviate_url
        )
        output = docsearch.similarity_search(
            "foo",
            k=1,
            additional=["certainty"],
        )
        assert output == [
            Document(
                page_content="foo",
                metadata={"page": 0, "_additional": {"certainty": 1}},
            )
        ]

    @pytest.mark.vcr(ignore_localhost=True)
    def test_similarity_search_with_uuids(
        self, weaviate_url: str, embedding_openai: OpenAIEmbeddings
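For context on the new `additional` kwarg: Weaviate exposes query metadata such as `certainty` under its GraphQL `_additional` projection, which is why the returned document metadata gains an `_additional` key above. A rough sketch of the kind of raw query this corresponds to (the class and property names are invented for illustration, and <query_vector> stands in for the embedded query text):

# Illustrative GraphQL behind
# similarity_search("foo", k=1, additional=["certainty"]):
query = """
{
  Get {
    LangChain(nearVector: {vector: <query_vector>}, limit: 1) {
      text
      _additional { certainty }
    }
  }
}
"""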


@@ -86,6 +86,7 @@ def test_create_directory_and_files(tmp_path: Path) -> None:
    assert output == "file1.txt\nfile2.txt"


@pytest.mark.skip(reason="flaky on GHA, TODO to fix")
@pytest.mark.skipif(
    sys.platform.startswith("win"), reason="Test not supported on Windows"
)


@@ -146,3 +146,25 @@ Bye!\n\n-H."""
"Bye!\n\n-H.",
]
assert output == expected_output
def test_split_documents() -> None:
"""Test split_documents."""
splitter = CharacterTextSplitter(separator="", chunk_size=1, chunk_overlap=0)
docs = [
Document(page_content="foo", metadata={"source": "1"}),
Document(page_content="bar", metadata={"source": "2"}),
Document(page_content="baz", metadata={"source": "1"}),
]
expected_output = [
Document(page_content="f", metadata={"source": "1"}),
Document(page_content="o", metadata={"source": "1"}),
Document(page_content="o", metadata={"source": "1"}),
Document(page_content="b", metadata={"source": "2"}),
Document(page_content="a", metadata={"source": "2"}),
Document(page_content="r", metadata={"source": "2"}),
Document(page_content="b", metadata={"source": "1"}),
Document(page_content="a", metadata={"source": "1"}),
Document(page_content="z", metadata={"source": "1"}),
]
assert splitter.split_documents(docs) == expected_output
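A sketch of the case the list-based test above does not cover, assuming split_documents accepts any iterable of Documents rather than only a list:

from langchain.docstore.document import Document
from langchain.text_splitter import CharacterTextSplitter

splitter = CharacterTextSplitter(separator="", chunk_size=1, chunk_overlap=0)
# A generator can only be consumed once, so page_content and metadata
# must be collected in a single pass for the metadata to survive.
doc_gen = (
    Document(page_content=text, metadata={"source": source})
    for text, source in [("foo", "1"), ("bar", "2")]
)
chunks = splitter.split_documents(doc_gen)
assert [c.metadata["source"] for c in chunks] == ["1", "1", "1", "2", "2", "2"]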


@@ -4,6 +4,10 @@ from langchain.tools import __all__ as public_api
_EXPECTED = [
    "AIPluginTool",
    "APIOperation",
    "AzureCogsFormRecognizerTool",
    "AzureCogsImageAnalysisTool",
    "AzureCogsSpeech2TextTool",
    "AzureCogsText2SpeechTool",
    "BaseTool",
    "BaseTool",
    "BaseTool",