mirror of
https://github.com/hwchase17/langchain.git
synced 2025-09-09 15:03:21 +00:00
Bagatur/clarifai update (#7324)
This PR improves the Clarifai LangChain integration with better docs, errors, and args; adds support for Clarifai's embedding models in LangChain; and adds an overview of the various ways you can integrate with Clarifai to the docs. --------- Co-authored-by: Matthew Zeiler <zeiler@clarifai.com>
52
docs/extras/ecosystem/integrations/clarifai.mdx
Normal file
@@ -0,0 +1,52 @@
# Clarifai

>[Clarifai](https://clarifai.com) is one of the first deep learning platforms, founded in 2013. Clarifai provides an AI platform covering the full AI lifecycle: data exploration, data labeling, model training, evaluation, and inference for image, video, text, and audio data. In the LangChain ecosystem, as far as we're aware, Clarifai is the only provider that supports LLMs, embeddings, and a vector store in one production-scale platform, making it an excellent choice for operationalizing your LangChain implementations.

## Installation and Setup

- Install the Python SDK:

```bash
pip install clarifai
```

[Sign up](https://clarifai.com/signup) for a Clarifai account, then get a personal access token to access the Clarifai API from your [security settings](https://clarifai.com/settings/security) and set it as an environment variable (`CLARIFAI_PAT`).
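For example, in a shell (the token value here is a placeholder, not a real credential):

```bash
# Make the PAT available to the Clarifai integrations via the CLARIFAI_PAT env var.
export CLARIFAI_PAT="your-personal-access-token"
```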

## Models

Clarifai provides thousands of AI models for many different use cases. You can [explore them here](https://clarifai.com/explore) to find the one best suited for your use case. These include models created by other providers such as OpenAI, Anthropic, Cohere, and AI21, as well as state-of-the-art open-source models such as Falcon and InstructorXL, so that you can build the best of AI into your products. You'll find these organized by the creator's user_id and grouped into projects we call applications, denoted by their app_id. You'll need those IDs in addition to the model_id and, optionally, the version_id, so make note of all of these IDs once you've found the best model for your use case!
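As a minimal sketch of how those four identifiers fit together (the `model_ref` helper is hypothetical, for illustration only; the example IDs mirror those used later in this PR's notebooks):

```python
from typing import Optional


def model_ref(
    user_id: str, app_id: str, model_id: str, version_id: Optional[str] = None
) -> dict:
    """Collect the identifiers that, together, pin down one Clarifai model."""
    ref = {"user_id": user_id, "app_id": app_id, "model_id": model_id}
    if version_id is not None:
        # version_id is optional; omitting it typically means "latest version".
        ref["version_id"] = version_id
    return ref


# Example: an OpenAI-provided model hosted in the "chat-completion" application.
ref = model_ref("openai", "chat-completion", "GPT-3_5-turbo")
```

These are the same keyword arguments (`user_id`, `app_id`, `model_id`, optional `model_version_id`) that the LangChain wrappers below expect.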

Also note that, given the many models for image, video, text, and audio understanding, you can build some interesting AI agents that use this variety of models as experts for those data types.

### LLMs

To find the selection of LLMs in the Clarifai platform, select the text-to-text model type [here](https://clarifai.com/explore/models?filterData=%5B%7B%22field%22%3A%22model_type_id%22%2C%22value%22%3A%5B%22text-to-text%22%5D%7D%5D&page=1&perPage=24).

```python
from langchain.llms import Clarifai
llm = Clarifai(pat=CLARIFAI_PAT, user_id=USER_ID, app_id=APP_ID, model_id=MODEL_ID)
```

For more details, the docs on the Clarifai LLM wrapper provide a [detailed walkthrough](/docs/modules/model_io/models/llms/integrations/clarifai.html).

### Text Embedding Models

To find the selection of text embedding models in the Clarifai platform, select the text-to-embedding model type [here](https://clarifai.com/explore/models?page=1&perPage=24&filterData=%5B%7B%22field%22%3A%22model_type_id%22%2C%22value%22%3A%5B%22text-embedder%22%5D%7D%5D).

There is a Clarifai Embedding model in LangChain, which you can access with:

```python
from langchain.embeddings import ClarifaiEmbeddings
embeddings = ClarifaiEmbeddings(pat=CLARIFAI_PAT, user_id=USER_ID, app_id=APP_ID, model_id=MODEL_ID)
```

For more details, the docs on the Clarifai Embeddings wrapper provide a [detailed walkthrough](/docs/modules/data_connection/text_embedding/integrations/clarifai.html).

## Vectorstore

Clarifai's vector DB was launched in 2016 and has been optimized to support live search queries. With workflows in the Clarifai platform, your data is automatically indexed by an embedding model, and optionally other models as well, to make that information searchable in the DB. You can query the DB not only via the vectors but also filter by metadata matches, other AI-predicted concepts, and even do geo-coordinate search. Simply create an application, select the appropriate base workflow for your type of data, and upload it (through the API as [documented here](https://docs.clarifai.com/api-guide/data/create-get-update-delete) or via the UIs at clarifai.com).

You can also add data directly from LangChain, and the auto-indexing will take place for you. You'll notice this is a little different from other vectorstores, where you need to provide an embedding model in their constructor and have LangChain coordinate getting the embeddings from text and writing those to the index. Not only is it more convenient, but it's much more scalable to use Clarifai's distributed cloud to do all the indexing in the background.

```python
from langchain.vectorstores import Clarifai
clarifai_vector_db = Clarifai.from_texts(user_id=USER_ID, app_id=APP_ID, texts=texts, pat=CLARIFAI_PAT, number_of_docs=NUMBER_OF_DOCS, metadatas = metadatas)
```

For more details, the docs on the Clarifai vector store provide a [detailed walkthrough](/docs/modules/data_connection/text_embedding/integrations/clarifai.html).
@@ -0,0 +1,208 @@
{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "9597802c",
   "metadata": {},
   "source": [
    "# Clarifai\n",
    "\n",
    ">[Clarifai](https://www.clarifai.com/) is an AI Platform that provides the full AI lifecycle ranging from data exploration, data labeling, model training, evaluation, and inference.\n",
    "\n",
    "This example goes over how to use LangChain to interact with `Clarifai` [models](https://clarifai.com/explore/models). Text embedding models in particular can be found [here](https://clarifai.com/explore/models?page=1&perPage=24&filterData=%5B%7B%22field%22%3A%22model_type_id%22%2C%22value%22%3A%5B%22text-embedder%22%5D%7D%5D).\n",
    "\n",
    "To use Clarifai, you must have an account and a Personal Access Token (PAT) key. \n",
    "[Check here](https://clarifai.com/settings/security) to get or create a PAT."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "2a773d8d",
   "metadata": {},
   "source": [
    "# Dependencies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "91ea14ce-831d-409a-a88f-30353acdabd1",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Install required dependencies\n",
    "!pip install clarifai"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "426f1156",
   "metadata": {},
   "source": [
    "# Imports\n",
    "Here we will be setting the personal access token. You can find your PAT under [settings/security](https://clarifai.com/settings/security) in your Clarifai account."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "3f5dc9d7-65e3-4b5b-9086-3327d016cfe0",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdin",
     "output_type": "stream",
     "text": [
      " ········\n"
     ]
    }
   ],
   "source": [
    "# Please login and get your API key from https://clarifai.com/settings/security \n",
    "from getpass import getpass\n",
    "\n",
    "CLARIFAI_PAT = getpass()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "6fb585dd",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Import the required modules\n",
    "from langchain.embeddings import ClarifaiEmbeddings\n",
    "from langchain import PromptTemplate, LLMChain"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "16521ed2",
   "metadata": {},
   "source": [
    "# Input\n",
    "Create a prompt template to be used with the LLM Chain:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "035dea0f",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "template = \"\"\"Question: {question}\n",
    "\n",
    "Answer: Let's think step by step.\"\"\"\n",
    "\n",
    "prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "c8905eac",
   "metadata": {},
   "source": [
    "# Setup\n",
    "Set the user id and app id to the application in which the model resides. You can find a list of public models on https://clarifai.com/explore/models\n",
    "\n",
    "You will have to also initialize the model id and if needed, the model version id. Some models have many versions, you can choose the one appropriate for your task."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "1fe9bf15",
   "metadata": {},
   "outputs": [],
   "source": [
    "USER_ID = 'openai'\n",
    "APP_ID = 'embed'\n",
    "MODEL_ID = 'text-embedding-ada'\n",
    "\n",
    "# You can provide a specific model version as the model_version_id arg.\n",
    "# MODEL_VERSION_ID = \"MODEL_VERSION_ID\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "3f3458d9",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Initialize a Clarifai embedding model\n",
    "embeddings = ClarifaiEmbeddings(pat=CLARIFAI_PAT, user_id=USER_ID, app_id=APP_ID, model_id=MODEL_ID)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "a641dbd9",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "text = \"This is a test document.\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "32b4d5f4-2b8e-4681-856f-19a3dd141ae4",
   "metadata": {},
   "outputs": [],
   "source": [
    "query_result = embeddings.embed_query(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "47076457-1880-48ac-970f-872ead6f0d94",
   "metadata": {},
   "outputs": [],
   "source": [
    "doc_result = embeddings.embed_documents([text])"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
@@ -8,12 +8,12 @@
   "source": [
    "# Clarifai\n",
    "\n",
    ">[Clarifai](https://www.clarifai.com/) is a AI Platform that provides the full AI lifecycle ranging from data exploration, data labeling, model building and inference. A Clarifai application can be used as a vector database after uploading inputs. \n",
    ">[Clarifai](https://www.clarifai.com/) is an AI Platform that provides the full AI lifecycle ranging from data exploration, data labeling, model training, evaluation, and inference. A Clarifai application can be used as a vector database after uploading inputs. \n",
    "\n",
    "This notebook shows how to use functionality related to the `Clarifai` vector database.\n",
    "\n",
    "To use Clarifai, you must have an account and a Personal Access Token key. \n",
    "Here are the [installation instructions](https://clarifai.com/settings/security )."
    "To use Clarifai, you must have an account and a Personal Access Token (PAT) key. \n",
    "[Check here](https://clarifai.com/settings/security) to get or create a PAT."
   ]
  },
  {
@@ -58,7 +58,7 @@
    "# Please login and get your API key from https://clarifai.com/settings/security \n",
    "from getpass import getpass\n",
    "\n",
    "CLARIFAI_PAT_KEY = getpass()"
    "CLARIFAI_PAT = getpass()"
   ]
  },
  {
@@ -92,7 +92,7 @@
   "metadata": {},
   "source": [
    "# Setup\n",
    "Setup the user id and app id where the text data will be uploaded. \n",
    "Setup the user id and app id where the text data will be uploaded. Note: when creating that application please select an appropriate base workflow for indexing your text documents such as the Language-Understanding workflow.\n",
    "\n",
    "You will have to first create an account on [Clarifai](https://clarifai.com/login) and then create an application."
   ]
@@ -139,7 +139,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
    "clarifai_vector_db = Clarifai.from_texts(user_id=USER_ID, app_id=APP_ID, texts=texts, pat=CLARIFAI_PAT_KEY, number_of_docs=NUMBER_OF_DOCS, metadatas = metadatas)"
    "clarifai_vector_db = Clarifai.from_texts(user_id=USER_ID, app_id=APP_ID, texts=texts, pat=CLARIFAI_PAT, number_of_docs=NUMBER_OF_DOCS, metadatas = metadatas)"
   ]
  },
  {
@@ -8,9 +8,12 @@
   "source": [
    "# Clarifai\n",
    "\n",
    ">[Clarifai](https://www.clarifai.com/) is a AI Platform that provides the full AI lifecycle ranging from data exploration, data labeling, model building and inference.\n",
    ">[Clarifai](https://www.clarifai.com/) is an AI Platform that provides the full AI lifecycle ranging from data exploration, data labeling, model training, evaluation, and inference.\n",
    "\n",
    "This example goes over how to use LangChain to interact with `Clarifai` [models](https://clarifai.com/explore/models)."
    "This example goes over how to use LangChain to interact with `Clarifai` [models](https://clarifai.com/explore/models). \n",
    "\n",
    "To use Clarifai, you must have an account and a Personal Access Token (PAT) key. \n",
    "[Check here](https://clarifai.com/settings/security) to get or create a PAT."
   ]
  },
  {
@@ -42,7 +45,7 @@
   "metadata": {},
   "source": [
    "# Imports\n",
    "Here we will be setting the personal access token. You can find your PAT under settings/security on the platform."
    "Here we will be setting the personal access token. You can find your PAT under [settings/security](https://clarifai.com/settings/security) in your Clarifai account."
   ]
  },
  {
@@ -52,17 +55,25 @@
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "outputs": [
    {
     "name": "stdin",
     "output_type": "stream",
     "text": [
      " ········\n"
     ]
    }
   ],
   "source": [
    "# Please login and get your API key from https://clarifai.com/settings/security \n",
    "from getpass import getpass\n",
    "\n",
    "CLARIFAI_PAT_KEY = getpass()"
    "CLARIFAI_PAT = getpass()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "execution_count": null,
   "id": "6fb585dd",
   "metadata": {
    "tags": []
@@ -81,12 +92,12 @@
   "metadata": {},
   "source": [
    "# Input\n",
    "Create a prompt template to be used with the LLM Chain"
    "Create a prompt template to be used with the LLM Chain:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "execution_count": null,
   "id": "035dea0f",
   "metadata": {
    "tags": []
@@ -121,10 +132,10 @@
   "source": [
    "USER_ID = 'openai'\n",
    "APP_ID = 'chat-completion'\n",
    "MODEL_ID = 'chatgpt-3_5-turbo'\n",
    "MODEL_ID = 'GPT-3_5-turbo'\n",
    "\n",
    "# You can provide a specific model version\n",
    "# model_version_id = \"MODEL_VERSION_ID\""
    "# You can provide a specific model version as the model_version_id arg.\n",
    "# MODEL_VERSION_ID = \"MODEL_VERSION_ID\""
   ]
  },
  {
@@ -137,7 +148,7 @@
   "outputs": [],
   "source": [
    "# Initialize a Clarifai LLM\n",
    "clarifai_llm = Clarifai(clarifai_pat_key=CLARIFAI_PAT_KEY, user_id=USER_ID, app_id=APP_ID, model_id=MODEL_ID)"
    "clarifai_llm = Clarifai(pat=CLARIFAI_PAT, user_id=USER_ID, app_id=APP_ID, model_id=MODEL_ID)"
   ]
  },
  {
@@ -171,7 +182,7 @@
    {
     "data": {
      "text/plain": [
       "'Justin Bieber was born on March 1, 1994. So, we need to look at the Super Bowl that was played in the year 1994. \\n\\nThe Super Bowl in 1994 was Super Bowl XXVIII (28). It was played on January 30, 1994, between the Dallas Cowboys and the Buffalo Bills. \\n\\nThe Dallas Cowboys won the Super Bowl in 1994, defeating the Buffalo Bills by a score of 30-13. \\n\\nTherefore, the Dallas Cowboys are the NFL team that won the Super Bowl in the year Justin Bieber was born.'"
       "'Justin Bieber was born on March 1, 1994. So, we need to figure out the Super Bowl winner for the 1994 season. The NFL season spans two calendar years, so the Super Bowl for the 1994 season would have taken place in early 1995. \\n\\nThe Super Bowl in question is Super Bowl XXIX, which was played on January 29, 1995. The game was won by the San Francisco 49ers, who defeated the San Diego Chargers by a score of 49-26. Therefore, the San Francisco 49ers won the Super Bowl in the year Justin Bieber was born.'"
      ]
     },
     "execution_count": 7,
@@ -7,6 +7,7 @@ from langchain.embeddings.aleph_alpha import (
    AlephAlphaSymmetricSemanticEmbedding,
)
from langchain.embeddings.bedrock import BedrockEmbeddings
from langchain.embeddings.clarifai import ClarifaiEmbeddings
from langchain.embeddings.cohere import CohereEmbeddings
from langchain.embeddings.dashscope import DashScopeEmbeddings
from langchain.embeddings.deepinfra import DeepInfraEmbeddings
@@ -43,6 +44,7 @@ __all__ = [
    "OpenAIEmbeddings",
    "HuggingFaceEmbeddings",
    "CohereEmbeddings",
    "ClarifaiEmbeddings",
    "ElasticsearchEmbeddings",
    "JinaEmbeddings",
    "LlamaCppEmbeddings",
197
langchain/embeddings/clarifai.py
Normal file
@@ -0,0 +1,197 @@
"""Wrapper around Clarifai embedding models."""
import logging
from typing import Any, Dict, List, Optional

from pydantic import BaseModel, Extra, root_validator

from langchain.embeddings.base import Embeddings
from langchain.utils import get_from_dict_or_env

logger = logging.getLogger(__name__)


class ClarifaiEmbeddings(BaseModel, Embeddings):
    """Wrapper around Clarifai embedding models.

    To use, you should have the ``clarifai`` python package installed, and the
    environment variable ``CLARIFAI_PAT`` set with your personal access token or pass it
    as a named parameter to the constructor.

    Example:
        .. code-block:: python

            from langchain.embeddings import ClarifaiEmbeddings
            clarifai = ClarifaiEmbeddings(
                model="embed-english-light-v2.0", clarifai_api_key="my-api-key"
            )
    """

    stub: Any  #: :meta private:
    userDataObject: Any

    model_id: Optional[str] = None
    """Model id to use."""

    model_version_id: Optional[str] = None
    """Model version id to use."""

    app_id: Optional[str] = None
    """Clarifai application id to use."""

    user_id: Optional[str] = None
    """Clarifai user id to use."""

    pat: Optional[str] = None

    api_base: str = "https://api.clarifai.com"

    class Config:
        """Configuration for this pydantic object."""

        extra = Extra.forbid

    @root_validator()
    def validate_environment(cls, values: Dict) -> Dict:
        """Validate that api key and python package exists in environment."""
        values["pat"] = get_from_dict_or_env(values, "pat", "CLARIFAI_PAT")
        user_id = values.get("user_id")
        app_id = values.get("app_id")
        model_id = values.get("model_id")

        if values["pat"] is None:
            raise ValueError("Please provide a pat.")
        if user_id is None:
            raise ValueError("Please provide a user_id.")
        if app_id is None:
            raise ValueError("Please provide a app_id.")
        if model_id is None:
            raise ValueError("Please provide a model_id.")

        try:
            from clarifai.auth.helper import ClarifaiAuthHelper
            from clarifai.client import create_stub
        except ImportError:
            raise ImportError(
                "Could not import clarifai python package. "
                "Please install it with `pip install clarifai`."
            )
        auth = ClarifaiAuthHelper(
            user_id=user_id,
            app_id=app_id,
            pat=values["pat"],
            base=values["api_base"],
        )
        values["userDataObject"] = auth.get_user_app_id_proto()
        values["stub"] = create_stub(auth)

        return values

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Call out to Clarifai's embedding models.

        Args:
            texts: The list of texts to embed.

        Returns:
            List of embeddings, one for each text.
        """

        try:
            from clarifai_grpc.grpc.api import (
                resources_pb2,
                service_pb2,
            )
            from clarifai_grpc.grpc.api.status import status_code_pb2
        except ImportError:
            raise ImportError(
                "Could not import clarifai python package. "
                "Please install it with `pip install clarifai`."
            )

        post_model_outputs_request = service_pb2.PostModelOutputsRequest(
            user_app_id=self.userDataObject,
            model_id=self.model_id,
            version_id=self.model_version_id,
            inputs=[
                resources_pb2.Input(
                    data=resources_pb2.Data(text=resources_pb2.Text(raw=t))
                )
                for t in texts
            ],
        )
        post_model_outputs_response = self.stub.PostModelOutputs(
            post_model_outputs_request
        )

        if post_model_outputs_response.status.code != status_code_pb2.SUCCESS:
            logger.error(post_model_outputs_response.status)
            first_output_failure = (
                post_model_outputs_response.outputs[0].status
                if len(post_model_outputs_response.outputs[0])
                else None
            )
            raise Exception(
                f"Post model outputs failed, status: "
                f"{post_model_outputs_response.status}, first output failure: "
                f"{first_output_failure}"
            )
        embeddings = [
            list(o.data.embeddings[0].vector)
            for o in post_model_outputs_response.outputs
        ]
        return embeddings

    def embed_query(self, text: str) -> List[float]:
        """Call out to Clarifai's embedding models.

        Args:
            text: The text to embed.

        Returns:
            Embeddings for the text.
        """

        try:
            from clarifai_grpc.grpc.api import (
                resources_pb2,
                service_pb2,
            )
            from clarifai_grpc.grpc.api.status import status_code_pb2
        except ImportError:
            raise ImportError(
                "Could not import clarifai python package. "
                "Please install it with `pip install clarifai`."
            )

        post_model_outputs_request = service_pb2.PostModelOutputsRequest(
            user_app_id=self.userDataObject,
            model_id=self.model_id,
            version_id=self.model_version_id,
            inputs=[
                resources_pb2.Input(
                    data=resources_pb2.Data(text=resources_pb2.Text(raw=text))
                )
            ],
        )
        post_model_outputs_response = self.stub.PostModelOutputs(
            post_model_outputs_request
        )

        if post_model_outputs_response.status.code != status_code_pb2.SUCCESS:
            logger.error(post_model_outputs_response.status)
            first_output_failure = (
                post_model_outputs_response.outputs[0].status
                if len(post_model_outputs_response.outputs[0])
                else None
            )
            raise Exception(
                f"Post model outputs failed, status: "
                f"{post_model_outputs_response.status}, first output failure: "
                f"{first_output_failure}"
            )

        embeddings = [
            list(o.data.embeddings[0].vector)
            for o in post_model_outputs_response.outputs
        ]
        return embeddings[0]
@@ -15,21 +15,20 @@ logger = logging.getLogger(__name__)
class Clarifai(LLM):
    """Wrapper around Clarifai's large language models.

    To use, you should have an account on the Clarifai platform,
    To use, you should have an account on the Clarifai platform,
    the ``clarifai`` python package installed, and the
    environment variable ``CLARIFAI_PAT_KEY`` set with your PAT key,
    environment variable ``CLARIFAI_PAT`` set with your PAT key,
    or pass it as a named parameter to the constructor.

    Example:
        .. code-block:: python

            from langchain.llms import Clarifai
            clarifai_llm = Clarifai(clarifai_pat_key=CLARIFAI_PAT_KEY, \
            clarifai_llm = Clarifai(pat=CLARIFAI_PAT, \
                          user_id=USER_ID, app_id=APP_ID, model_id=MODEL_ID)
    """

    stub: Any  #: :meta private:
    metadata: Any
    userDataObject: Any

    model_id: Optional[str] = None

@@ -44,12 +43,10 @@ class Clarifai(LLM):
    user_id: Optional[str] = None
    """Clarifai user id to use."""

    clarifai_pat_key: Optional[str] = None
    pat: Optional[str] = None

    api_base: str = "https://api.clarifai.com"

    stop: Optional[List[str]] = None

    class Config:
        """Configuration for this pydantic object."""

@@ -59,32 +56,54 @@ class Clarifai(LLM):
    def validate_environment(cls, values: Dict) -> Dict:
        """Validate that we have all required info to access Clarifai
        platform and python package exists in environment."""
        values["clarifai_pat_key"] = get_from_dict_or_env(
            values, "clarifai_pat_key", "CLARIFAI_PAT_KEY"
        )
        values["pat"] = get_from_dict_or_env(values, "pat", "CLARIFAI_PAT")
        user_id = values.get("user_id")
        app_id = values.get("app_id")
        model_id = values.get("model_id")

        if values["clarifai_pat_key"] is None:
            raise ValueError("Please provide a clarifai_pat_key.")
        if values["pat"] is None:
            raise ValueError("Please provide a pat.")
        if user_id is None:
            raise ValueError("Please provide a user_id.")
        if app_id is None:
            raise ValueError("Please provide a app_id.")
        if model_id is None:
            raise ValueError("Please provide a model_id.")

        try:
            from clarifai.auth.helper import ClarifaiAuthHelper
            from clarifai.client import create_stub
        except ImportError:
            raise ImportError(
                "Could not import clarifai python package. "
                "Please install it with `pip install clarifai`."
            )
        auth = ClarifaiAuthHelper(
            user_id=user_id,
            app_id=app_id,
            pat=values["pat"],
            base=values["api_base"],
        )
        values["userDataObject"] = auth.get_user_app_id_proto()
        values["stub"] = create_stub(auth)

        return values

    @property
    def _default_params(self) -> Dict[str, Any]:
        """Get the default parameters for calling Cohere API."""
        """Get the default parameters for calling Clarifai API."""
        return {}

    @property
    def _identifying_params(self) -> Dict[str, Any]:
        """Get the identifying parameters."""
        return {**{"model_id": self.model_id}}
        return {
            **{
                "user_id": self.user_id,
                "app_id": self.app_id,
                "model_id": self.model_id,
            }
        }

    @property
    def _llm_type(self) -> str:
@@ -96,7 +115,7 @@ class Clarifai(LLM):
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any
        **kwargs: Any,
    ) -> str:
        """Call out to Clarfai's PostModelOutputs endpoint.

@@ -114,8 +133,6 @@ class Clarifai(LLM):
        """

        try:
            from clarifai.auth.helper import ClarifaiAuthHelper
            from clarifai.client import create_stub
            from clarifai_grpc.grpc.api import (
                resources_pb2,
                service_pb2,
@@ -127,23 +144,6 @@ class Clarifai(LLM):
                "Please install it with `pip install clarifai`."
            )

        auth = ClarifaiAuthHelper(
            user_id=self.user_id,
            app_id=self.app_id,
            pat=self.clarifai_pat_key,
            base=self.api_base,
        )
        self.userDataObject = auth.get_user_app_id_proto()
        self.stub = create_stub(auth)

        params = self._default_params
        if self.stop is not None and stop is not None:
            raise ValueError("`stop` found in both the input and default params.")
        elif self.stop is not None:
            params["stop_sequences"] = self.stop
        else:
            params["stop_sequences"] = stop

        # The userDataObject is created in the overview and
        # is required when using a PAT
        # If version_id None, Defaults to the latest model version
@@ -163,14 +163,20 @@ class Clarifai(LLM):

        if post_model_outputs_response.status.code != status_code_pb2.SUCCESS:
            logger.error(post_model_outputs_response.status)
            first_model_failure = (
                post_model_outputs_response.outputs[0].status
                if len(post_model_outputs_response.outputs[0])
                else None
            )
            raise Exception(
                "Post model outputs failed, status: "
                + post_model_outputs_response.status.description
                f"Post model outputs failed, status: "
                f"{post_model_outputs_response.status}, first output failure: "
                f"{first_model_failure}"
            )

        text = post_model_outputs_response.outputs[0].data.text.raw

        # In order to make this consistent with other endpoints, we strip them.
        if stop is not None or self.stop is not None:
            text = enforce_stop_tokens(text, params["stop_sequences"])
        if stop is not None:
            text = enforce_stop_tokens(text, stop)
        return text
@@ -64,7 +64,7 @@ class Clarifai(VectorStore):
        self._user_id = user_id or os.environ.get("CLARIFAI_USER_ID")
        self._app_id = app_id or os.environ.get("CLARIFAI_APP_ID")
        self._pat = pat or os.environ.get("CLARIFAI_PAT_KEY")
        self._pat = pat or os.environ.get("CLARIFAI_PAT")
        if self._user_id is None or self._app_id is None or self._pat is None:
            raise ValueError(
                "Could not find CLARIFAI_USER_ID, CLARIFAI_APP_ID or\
14
poetry.lock
generated
@@ -1,4 +1,4 @@
-# This file is automatically @generated by Poetry and should not be changed by hand.
+# This file is automatically @generated by Poetry 1.4.2 and should not be changed by hand.

 [[package]]
 name = "absl-py"
@@ -11147,7 +11147,7 @@ files = [
 ]

 [package.dependencies]
-accelerate = {version = ">=0.20.2", optional = true, markers = "extra == \"accelerate\""}
+accelerate = {version = ">=0.20.2", optional = true, markers = "extra == \"accelerate\" or extra == \"torch\""}
 filelock = "*"
 huggingface-hub = ">=0.14.1,<1.0"
 numpy = ">=1.17"
@@ -12404,15 +12404,15 @@ cffi = {version = ">=1.11", markers = "platform_python_implementation == \"PyPy\
 cffi = ["cffi (>=1.11)"]

 [extras]
-all = ["anthropic", "clarifai", "cohere", "openai", "nlpcloud", "huggingface_hub", "jina", "manifest-ml", "elasticsearch", "opensearch-py", "google-search-results", "faiss-cpu", "sentence-transformers", "transformers", "spacy", "nltk", "wikipedia", "beautifulsoup4", "tiktoken", "torch", "jinja2", "pinecone-client", "pinecone-text", "marqo", "pymongo", "weaviate-client", "redis", "google-api-python-client", "google-auth", "wolframalpha", "qdrant-client", "tensorflow-text", "pypdf", "networkx", "nomic", "aleph-alpha-client", "deeplake", "pgvector", "psycopg2-binary", "pyowm", "pytesseract", "html2text", "atlassian-python-api", "gptcache", "duckduckgo-search", "arxiv", "azure-identity", "clickhouse-connect", "azure-cosmos", "lancedb", "langkit", "lark", "pexpect", "pyvespa", "O365", "jq", "docarray", "steamship", "pdfminer-six", "lxml", "requests-toolbelt", "neo4j", "openlm", "azure-ai-formrecognizer", "azure-ai-vision", "azure-cognitiveservices-speech", "momento", "singlestoredb", "tigrisdb", "nebula3-python", "awadb", "esprima", "octoai-sdk", "rdflib"]
-azure = ["azure-identity", "azure-cosmos", "openai", "azure-core", "azure-ai-formrecognizer", "azure-ai-vision", "azure-cognitiveservices-speech", "azure-search-documents"]
+all = ["O365", "aleph-alpha-client", "anthropic", "arxiv", "atlassian-python-api", "awadb", "azure-ai-formrecognizer", "azure-ai-vision", "azure-cognitiveservices-speech", "azure-cosmos", "azure-identity", "beautifulsoup4", "clarifai", "clickhouse-connect", "cohere", "deeplake", "docarray", "duckduckgo-search", "elasticsearch", "esprima", "faiss-cpu", "google-api-python-client", "google-auth", "google-search-results", "gptcache", "html2text", "huggingface_hub", "jina", "jinja2", "jq", "lancedb", "langkit", "lark", "lxml", "manifest-ml", "marqo", "momento", "nebula3-python", "neo4j", "networkx", "nlpcloud", "nltk", "nomic", "octoai-sdk", "openai", "openlm", "opensearch-py", "pdfminer-six", "pexpect", "pgvector", "pinecone-client", "pinecone-text", "psycopg2-binary", "pymongo", "pyowm", "pypdf", "pytesseract", "pyvespa", "qdrant-client", "rdflib", "redis", "requests-toolbelt", "sentence-transformers", "singlestoredb", "spacy", "steamship", "tensorflow-text", "tigrisdb", "tiktoken", "torch", "transformers", "weaviate-client", "wikipedia", "wolframalpha"]
+azure = ["azure-ai-formrecognizer", "azure-ai-vision", "azure-cognitiveservices-speech", "azure-core", "azure-cosmos", "azure-identity", "azure-search-documents", "openai"]
 clarifai = ["clarifai"]
 cohere = ["cohere"]
 docarray = ["docarray"]
 embeddings = ["sentence-transformers"]
-extended-testing = ["beautifulsoup4", "bibtexparser", "cassio", "chardet", "esprima", "jq", "pdfminer-six", "pgvector", "pypdf", "pymupdf", "pypdfium2", "tqdm", "lxml", "atlassian-python-api", "beautifulsoup4", "pandas", "telethon", "psychicapi", "zep-python", "gql", "requests-toolbelt", "html2text", "py-trello", "scikit-learn", "streamlit", "pyspark", "openai"]
+extended-testing = ["atlassian-python-api", "beautifulsoup4", "beautifulsoup4", "bibtexparser", "cassio", "chardet", "esprima", "gql", "html2text", "jq", "lxml", "openai", "pandas", "pdfminer-six", "pgvector", "psychicapi", "py-trello", "pymupdf", "pypdf", "pypdfium2", "pyspark", "requests-toolbelt", "scikit-learn", "streamlit", "telethon", "tqdm", "zep-python"]
 javascript = ["esprima"]
-llms = ["anthropic", "clarifai", "cohere", "openai", "openllm", "openlm", "nlpcloud", "huggingface_hub", "manifest-ml", "torch", "transformers"]
+llms = ["anthropic", "clarifai", "cohere", "huggingface_hub", "manifest-ml", "nlpcloud", "openai", "openllm", "openlm", "torch", "transformers"]
 openai = ["openai", "tiktoken"]
 qdrant = ["qdrant-client"]
 text-helpers = ["chardet"]
@@ -12420,4 +12420,4 @@ text-helpers = ["chardet"]
 [metadata]
 lock-version = "2.0"
 python-versions = ">=3.8.1,<4.0"
-content-hash = "e1da6f8f88f3410d6ac4b1babb2a8aa9e01263deb52da423419f5362c5ddfc1f"
+content-hash = "06160f44910ddc8178bf481652d23f00572fc0db7cb0b65dc624c743d7151e10"
pyproject.toml
@@ -104,7 +104,7 @@ momento = {version = "^1.5.0", optional = true}
 bibtexparser = {version = "^1.4.0", optional = true}
 singlestoredb = {version = "^0.7.1", optional = true}
 pyspark = {version = "^3.4.0", optional = true}
-clarifai = {version = "9.1.0", optional = true}
+clarifai = {version = ">=9.1.0", optional = true}
 tigrisdb = {version = "^1.0.0b6", optional = true}
 nebula3-python = {version = "^3.4.0", optional = true}
 langchainplus-sdk = "^0.0.20"
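Loosening the pin from `9.1.0` (an exact version in Poetry) to `>=9.1.0` lets dependency resolvers pick any newer Clarifai SDK release instead of conflicting with projects that need a later one. A toy sketch of what such a floor constraint accepts (helper names here are illustrative; real resolvers implement full PEP 440 semantics):

```python
def parse_version(v: str) -> tuple:
    """Turn a dotted version string like "9.1.0" into an integer tuple."""
    return tuple(int(part) for part in v.split("."))

def satisfies_floor(version: str, floor: str = "9.1.0") -> bool:
    """True when `version` meets a ">=floor" style constraint."""
    return parse_version(version) >= parse_version(floor)

print(satisfies_floor("9.2.1"))  # a newer SDK release now resolves
print(satisfies_floor("8.9.9"))  # older releases still do not
```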