mirror of https://github.com/hwchase17/langchain.git (synced 2025-09-15 06:26:12 +00:00)
ibm: Add support for Embedding Models (#20647)
---------

Co-authored-by: Erick Friis <erick@langchain.dev>
@@ -37,3 +37,13 @@ See a [usage example](/docs/integrations/llms/ibm_watsonx).
```python
from langchain_ibm import WatsonxLLM
```

## Embedding Models

### WatsonxEmbeddings

See a [usage example](/docs/integrations/text_embedding/ibm_watsonx).

```python
from langchain_ibm import WatsonxEmbeddings
```
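
As a rough sketch of how the imported class might be put to work, the snippet below gathers the constructor arguments that the notebook added in this commit passes to `WatsonxEmbeddings` (`model_id`, `url`, `project_id`). The helper names `watsonx_embedding_kwargs` and `embed_example` are hypothetical and exist only for illustration, and the fallback to the Dallas URL is an assumption taken from the notebook's example. Calling `embed_example` requires `langchain-ibm` to be installed and `WATSONX_APIKEY` to be set.

```python
import os


def watsonx_embedding_kwargs(project_id: str) -> dict:
    """Collect constructor arguments (hypothetical helper, not part of langchain_ibm)."""
    return {
        # Model and default URL mirror the values used in the notebook.
        "model_id": "ibm/slate-125m-english-rtrvr",
        "url": os.environ.get("WATSONX_URL", "https://us-south.ml.cloud.ibm.com"),
        "project_id": project_id,
    }


def embed_example(project_id: str) -> list:
    """Embed one sentence; needs langchain-ibm and valid credentials."""
    # Imported lazily so the helper above works without the package installed.
    from langchain_ibm import WatsonxEmbeddings

    embeddings = WatsonxEmbeddings(**watsonx_embedding_kwargs(project_id))
    return embeddings.embed_query("This is a test document.")
```

Keeping the argument collection separate from the network call makes it possible to check the configuration without contacting the service.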

docs/docs/integrations/text_embedding/ibm_watsonx.ipynb (243 lines)
@@ -0,0 +1,243 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# IBM watsonx.ai\n",
    "\n",
    ">WatsonxEmbeddings is a wrapper for IBM [watsonx.ai](https://www.ibm.com/products/watsonx-ai) foundation models.\n",
    "\n",
    "This example shows how to communicate with `watsonx.ai` models using `LangChain`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting up\n",
    "\n",
    "Install the package `langchain-ibm`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install -qU langchain-ibm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This cell defines the Watson Machine Learning (WML) credentials required to work with watsonx Embeddings.\n",
    "\n",
    "**Action:** Provide the IBM Cloud user API key. For details, see the\n",
    "[documentation](https://cloud.ibm.com/docs/account?topic=account-userapikey&interface=ui)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from getpass import getpass\n",
    "\n",
    "watsonx_api_key = getpass()\n",
    "os.environ[\"WATSONX_APIKEY\"] = watsonx_api_key"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Additionally, you can pass other secrets as environment variables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"WATSONX_URL\"] = \"your service instance url\"\n",
    "os.environ[\"WATSONX_TOKEN\"] = \"your token for accessing the CPD cluster\"\n",
    "os.environ[\"WATSONX_PASSWORD\"] = \"your password for accessing the CPD cluster\"\n",
    "os.environ[\"WATSONX_USERNAME\"] = \"your username for accessing the CPD cluster\"\n",
    "os.environ[\"WATSONX_INSTANCE_ID\"] = \"your instance_id for accessing the CPD cluster\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load the model\n",
    "\n",
    "You might need to adjust the model `parameters` for different models."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from ibm_watsonx_ai.metanames import EmbedTextParamsMetaNames\n",
    "\n",
    "embed_params = {\n",
    "    EmbedTextParamsMetaNames.TRUNCATE_INPUT_TOKENS: 3,\n",
    "    EmbedTextParamsMetaNames.RETURN_OPTIONS: {\"input_text\": True},\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Initialize the `WatsonxEmbeddings` class with the previously set parameters.\n",
    "\n",
    "\n",
    "**Note**: \n",
    "\n",
    "- To provide context for the API call, you must add `project_id` or `space_id`. For more information, see the [documentation](https://www.ibm.com/docs/en/watsonx-as-a-service?topic=projects).\n",
    "- Depending on the region of your provisioned service instance, use one of the URLs described [here](https://ibm.github.io/watsonx-ai-python-sdk/setup_cloud.html#authentication).\n",
    "\n",
    "In this example, we’ll use the `project_id` and the Dallas URL.\n",
    "\n",
    "\n",
    "You need to specify the `model_id` that will be used for inferencing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_ibm import WatsonxEmbeddings\n",
    "\n",
    "watsonx_embedding = WatsonxEmbeddings(\n",
    "    model_id=\"ibm/slate-125m-english-rtrvr\",\n",
    "    url=\"https://us-south.ml.cloud.ibm.com\",\n",
    "    project_id=\"PASTE YOUR PROJECT_ID HERE\",\n",
    "    params=embed_params,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Alternatively, you can use Cloud Pak for Data credentials. For details, see the [documentation](https://ibm.github.io/watsonx-ai-python-sdk/setup_cpd.html)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "watsonx_embedding = WatsonxEmbeddings(\n",
    "    model_id=\"ibm/slate-125m-english-rtrvr\",\n",
    "    url=\"PASTE YOUR URL HERE\",\n",
    "    username=\"PASTE YOUR USERNAME HERE\",\n",
    "    password=\"PASTE YOUR PASSWORD HERE\",\n",
    "    instance_id=\"openshift\",\n",
    "    version=\"5.0\",\n",
    "    project_id=\"PASTE YOUR PROJECT_ID HERE\",\n",
    "    params=embed_params,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Usage\n",
    "\n",
    "### Embed query"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0.0094472, -0.024981909, -0.026013248, -0.040483925, -0.057804465]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "text = \"This is a test document.\"\n",
    "\n",
    "query_result = watsonx_embedding.embed_query(text)\n",
    "query_result[:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Embed documents"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0.009447193, -0.024981918, -0.026013244, -0.040483937, -0.057804447]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "texts = [\"This is a content of the document\", \"This is another document\"]\n",
    "\n",
    "doc_result = watsonx_embedding.embed_documents(texts)\n",
    "doc_result[0][:5]"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "langchain",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
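
The two outputs recorded in the notebook (`query_result[:5]` and `doc_result[0][:5]`) are almost identical even though the input sentences differ. One plausible explanation, offered here as a guess rather than something the commit states, is that `TRUNCATE_INPUT_TOKENS` is set to 3, so both inputs are cut down to nearly the same opening tokens. The sketch below shows one common way to quantify how close two embedding vectors are, using cosine similarity on the five recorded components. The helper `cosine_similarity` is illustrative and not part of `langchain_ibm`; a real comparison would use the full vectors rather than these prefixes.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# First five components copied from the two recorded notebook outputs.
query_prefix = [0.0094472, -0.024981909, -0.026013248, -0.040483925, -0.057804465]
doc_prefix = [0.009447193, -0.024981918, -0.026013244, -0.040483937, -0.057804447]

similarity = cosine_similarity(query_prefix, doc_prefix)
print(f"cosine similarity of the two prefixes: {similarity:.6f}")
```

A value very close to 1 means the two prefixes point in almost the same direction, which matches what the printed numbers suggest by eye.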