diff --git a/docs/docs/integrations/text_embedding/cohere.ipynb b/docs/docs/integrations/text_embedding/cohere.ipynb index 7d72399976e..9ef4f5876e4 100644 --- a/docs/docs/integrations/text_embedding/cohere.ipynb +++ b/docs/docs/integrations/text_embedding/cohere.ipynb @@ -1,265 +1,267 @@ { - "cells": [ - { - "cell_type": "raw", - "id": "afaf8039", - "metadata": {}, - "source": [ - "---\n", - "sidebar_label: Cohere\n", - "---" - ] - }, - { - "cell_type": "markdown", - "id": "9a3d6f34", - "metadata": {}, - "source": [ - "# CohereEmbeddings\n", - "\n", - "This will help you get started with Cohere embedding models using LangChain. For detailed documentation on `CohereEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/cohere/embeddings/langchain_cohere.embeddings.CohereEmbeddings.html).\n", - "\n", - "## Overview\n", - "### Integration details\n", - "\n", - "import { ItemTable } from \"@theme/FeatureTables\";\n", - "\n", - "\n", - "\n", - "## Setup\n", - "\n", - "To access Cohere embedding models you'll need to create a/an Cohere account, get an API key, and install the `langchain-cohere` integration package.\n", - "\n", - "### Credentials\n", - "\n", - "\n", - "Head to [cohere.com](https://cohere.com) to sign up to Cohere and generate an API key. Once you’ve done this set the COHERE_API_KEY environment variable:" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "id": "36521c2a", - "metadata": {}, - "outputs": [], - "source": [ - "import getpass\n", - "import os\n", - "\n", - "if not os.getenv(\"COHERE_API_KEY\"):\n", - " os.environ[\"COHERE_API_KEY\"] = getpass.getpass(\"Enter your Cohere API key: \")" - ] - }, - { - "cell_type": "markdown", - "id": "c84fb993", - "metadata": {}, - "source": "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" - }, - { - "cell_type": "code", - "execution_count": 9, - "id": "39a4953b", - "metadata": {}, - "outputs": [], - "source": [ - "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n", - "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" - ] - }, - { - "cell_type": "markdown", - "id": "d9664366", - "metadata": {}, - "source": [ - "### Installation\n", - "\n", - "The LangChain Cohere integration lives in the `langchain-cohere` package:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "64853226", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -qU langchain-cohere" - ] - }, - { - "cell_type": "markdown", - "id": "45dd1724", - "metadata": {}, - "source": [ - "## Instantiation\n", - "\n", - "Now we can instantiate our model object and generate chat completions:" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "id": "9ea7a09b", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_cohere import CohereEmbeddings\n", - "\n", - "embeddings = CohereEmbeddings(\n", - " model=\"embed-english-v3.0\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "77d271b6", - "metadata": {}, - "source": [ - "## Indexing and Retrieval\n", - "\n", - "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n", - "\n", - "Below, see how to index and retrieve data using the `embeddings` object we initialized above. 
In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "id": "d817716b", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "'LangChain is the framework for building context-aware reasoning applications'" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Create a vector store with a sample text\n", - "from langchain_core.vectorstores import InMemoryVectorStore\n", - "\n", - "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", - "\n", - "vectorstore = InMemoryVectorStore.from_texts(\n", - " [text],\n", - " embedding=embeddings,\n", - ")\n", - "\n", - "# Use the vectorstore as a retriever\n", - "retriever = vectorstore.as_retriever()\n", - "\n", - "# Retrieve the most similar text\n", - "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", - "\n", - "# show the retrieved document's content\n", - "retrieved_documents[0].page_content" - ] - }, - { - "cell_type": "markdown", - "id": "e02b9855", - "metadata": {}, - "source": [ - "## Direct Usage\n", - "\n", - "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", - "\n", - "You can directly call these methods to get embeddings for your own use cases.\n", - "\n", - "### Embed single texts\n", - "\n", - "You can embed single texts or documents with `embed_query`:" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "id": "0d2befcd", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[-0.022979736, -0.030212402, -0.08886719, -0.08569336, 0.007030487, -0.0010671616, -0.033813477, 0.0\n" - ] - } - ], - "source": [ - "single_vector = embeddings.embed_query(text)\n", - "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" - ] - }, - { - "cell_type": "markdown", - "id": "1b5a7d03", - "metadata": {}, - "source": [ - "### Embed multiple texts\n", - "\n", - "You can embed multiple texts with `embed_documents`:" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "id": "2f4d6e97", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[-0.028869629, -0.030410767, -0.099121094, -0.07116699, -0.012748718, -0.0059432983, -0.04360962, 0.\n", - "[-0.047332764, -0.049957275, -0.07458496, -0.034332275, -0.057922363, -0.0112838745, -0.06994629, 0.\n" - ] - } - ], - "source": [ - "text2 = (\n", - " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", - ")\n", - "two_vectors = embeddings.embed_documents([text, text2])\n", - "for vector in two_vectors:\n", - " print(str(vector)[:100]) # Show the first 100 characters of the vector" - ] - }, - { - "cell_type": "markdown", - "id": "98785c12", - "metadata": {}, - "source": [ - "## API Reference\n", - "\n", - "For detailed documentation on `CohereEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/cohere/embeddings/langchain_cohere.embeddings.CohereEmbeddings.html).\n" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - 
"name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.4" - } + "cells": [ + { + "cell_type": "raw", + "id": "afaf8039", + "metadata": {}, + "source": [ + "---\n", + "sidebar_label: Cohere\n", + "---" + ] }, - "nbformat": 4, - "nbformat_minor": 5 + { + "cell_type": "markdown", + "id": "9a3d6f34", + "metadata": {}, + "source": [ + "# CohereEmbeddings\n", + "\n", + "This will help you get started with Cohere embedding models using LangChain. For detailed documentation on `CohereEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/cohere/embeddings/langchain_cohere.embeddings.CohereEmbeddings.html).\n", + "\n", + "## Overview\n", + "### Integration details\n", + "\n", + "import { ItemTable } from \"@theme/FeatureTables\";\n", + "\n", + "\n", + "\n", + "## Setup\n", + "\n", + "To access Cohere embedding models you'll need to create a/an Cohere account, get an API key, and install the `langchain-cohere` integration package.\n", + "\n", + "### Credentials\n", + "\n", + "\n", + "Head to [cohere.com](https://cohere.com) to sign up to Cohere and generate an API key. Once you’ve done this set the COHERE_API_KEY environment variable:" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "36521c2a", + "metadata": {}, + "outputs": [], + "source": [ + "import getpass\n", + "import os\n", + "\n", + "if not os.getenv(\"COHERE_API_KEY\"):\n", + " os.environ[\"COHERE_API_KEY\"] = getpass.getpass(\"Enter your Cohere API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "c84fb993", + "metadata": {}, + "source": [ + "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "39a4953b", + "metadata": {}, + "outputs": [], + "source": [ + "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n", + "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "d9664366", + "metadata": {}, + "source": [ + "### Installation\n", + "\n", + "The LangChain Cohere integration lives in the `langchain-cohere` package:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "64853226", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -qU langchain-cohere" + ] + }, + { + "cell_type": "markdown", + "id": "45dd1724", + "metadata": {}, + "source": [ + "## Instantiation\n", + "\n", + "Now we can instantiate our model object and generate chat completions:" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "9ea7a09b", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_cohere import CohereEmbeddings\n", + "\n", + "embeddings = CohereEmbeddings(\n", + " model=\"embed-english-v3.0\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "77d271b6", + "metadata": {}, + "source": [ + "## Indexing and Retrieval\n", + "\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n", + "\n", + "Below, see how to index and retrieve data using the `embeddings` object we initialized above. 
In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "d817716b", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'LangChain is the framework for building context-aware reasoning applications'" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create a vector store with a sample text\n", + "from langchain_core.vectorstores import InMemoryVectorStore\n", + "\n", + "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", + "\n", + "vectorstore = InMemoryVectorStore.from_texts(\n", + " [text],\n", + " embedding=embeddings,\n", + ")\n", + "\n", + "# Use the vectorstore as a retriever\n", + "retriever = vectorstore.as_retriever()\n", + "\n", + "# Retrieve the most similar text\n", + "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", + "\n", + "# show the retrieved document's content\n", + "retrieved_documents[0].page_content" + ] + }, + { + "cell_type": "markdown", + "id": "e02b9855", + "metadata": {}, + "source": [ + "## Direct Usage\n", + "\n", + "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", + "\n", + "You can directly call these methods to get embeddings for your own use cases.\n", + "\n", + "### Embed single texts\n", + "\n", + "You can embed single texts or documents with `embed_query`:" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "0d2befcd", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[-0.022979736, -0.030212402, -0.08886719, -0.08569336, 0.007030487, -0.0010671616, -0.033813477, 0.0\n" + ] + } + ], + "source": [ + "single_vector = embeddings.embed_query(text)\n", + "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": "1b5a7d03", + "metadata": {}, + "source": [ + "### Embed multiple texts\n", + "\n", + "You can embed multiple texts with `embed_documents`:" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "2f4d6e97", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[-0.028869629, -0.030410767, -0.099121094, -0.07116699, -0.012748718, -0.0059432983, -0.04360962, 0.\n", + "[-0.047332764, -0.049957275, -0.07458496, -0.034332275, -0.057922363, -0.0112838745, -0.06994629, 0.\n" + ] + } + ], + "source": [ + "text2 = (\n", + " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", + ")\n", + "two_vectors = embeddings.embed_documents([text, text2])\n", + "for vector in two_vectors:\n", + " print(str(vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": "98785c12", + "metadata": {}, + "source": [ + "## API Reference\n", + "\n", + "For detailed documentation on `CohereEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/cohere/embeddings/langchain_cohere.embeddings.CohereEmbeddings.html).\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + 
"name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.6" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/docs/docs/integrations/text_embedding/databricks.ipynb b/docs/docs/integrations/text_embedding/databricks.ipynb index 31505992ca7..f231e513175 100644 --- a/docs/docs/integrations/text_embedding/databricks.ipynb +++ b/docs/docs/integrations/text_embedding/databricks.ipynb @@ -125,7 +125,7 @@ "source": [ "## Indexing and Retrieval\n", "\n", - "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n", "\n", "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." ] @@ -264,7 +264,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.5" + "version": "3.9.6" } }, "nbformat": 4, diff --git a/docs/docs/integrations/text_embedding/fireworks.ipynb b/docs/docs/integrations/text_embedding/fireworks.ipynb index 815bc9fa57d..f516050fa4c 100644 --- a/docs/docs/integrations/text_embedding/fireworks.ipynb +++ b/docs/docs/integrations/text_embedding/fireworks.ipynb @@ -1,265 +1,267 @@ { - "cells": [ - { - "cell_type": "raw", - "id": "afaf8039", - "metadata": {}, - "source": [ - "---\n", - "sidebar_label: Fireworks\n", - "---" - ] - }, - { - "cell_type": "markdown", - "id": "9a3d6f34", - "metadata": {}, - "source": [ - "# FireworksEmbeddings\n", - "\n", - "This will help you get started with Fireworks embedding models using LangChain. For detailed documentation on `FireworksEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/fireworks/embeddings/langchain_fireworks.embeddings.FireworksEmbeddings.html).\n", - "\n", - "## Overview\n", - "\n", - "### Integration details\n", - "\n", - "import { ItemTable } from \"@theme/FeatureTables\";\n", - "\n", - "\n", - "\n", - "## Setup\n", - "\n", - "To access Fireworks embedding models you'll need to create a Fireworks account, get an API key, and install the `langchain-fireworks` integration package.\n", - "\n", - "### Credentials\n", - "\n", - "Head to [fireworks.ai](https://fireworks.ai/) to sign up to Fireworks and generate an API key. 
Once you’ve done this set the FIREWORKS_API_KEY environment variable:" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "36521c2a", - "metadata": {}, - "outputs": [], - "source": [ - "import getpass\n", - "import os\n", - "\n", - "if not os.getenv(\"FIREWORKS_API_KEY\"):\n", - " os.environ[\"FIREWORKS_API_KEY\"] = getpass.getpass(\"Enter your Fireworks API key: \")" - ] - }, - { - "cell_type": "markdown", - "id": "c84fb993", - "metadata": {}, - "source": "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "39a4953b", - "metadata": {}, - "outputs": [], - "source": [ - "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n", - "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" - ] - }, - { - "cell_type": "markdown", - "id": "d9664366", - "metadata": {}, - "source": [ - "### Installation\n", - "\n", - "The LangChain Fireworks integration lives in the `langchain-fireworks` package:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "64853226", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -qU langchain-fireworks" - ] - }, - { - "cell_type": "markdown", - "id": "45dd1724", - "metadata": {}, - "source": [ - "## Instantiation\n", - "\n", - "Now we can instantiate our model object and generate chat completions:" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "id": "9ea7a09b", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_fireworks import FireworksEmbeddings\n", - "\n", - "embeddings = FireworksEmbeddings(\n", - " model=\"nomic-ai/nomic-embed-text-v1.5\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "77d271b6", - "metadata": {}, - "source": [ - "## Indexing and Retrieval\n", - "\n", - "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n", - "\n", - "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." 
- ] - }, - { - "cell_type": "code", - "execution_count": 5, - "id": "d817716b", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "'LangChain is the framework for building context-aware reasoning applications'" - ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Create a vector store with a sample text\n", - "from langchain_core.vectorstores import InMemoryVectorStore\n", - "\n", - "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", - "\n", - "vectorstore = InMemoryVectorStore.from_texts(\n", - " [text],\n", - " embedding=embeddings,\n", - ")\n", - "\n", - "# Use the vectorstore as a retriever\n", - "retriever = vectorstore.as_retriever()\n", - "\n", - "# Retrieve the most similar text\n", - "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", - "\n", - "# show the retrieved document's content\n", - "retrieved_documents[0].page_content" - ] - }, - { - "cell_type": "markdown", - "id": "e02b9855", - "metadata": {}, - "source": [ - "## Direct Usage\n", - "\n", - "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", - "\n", - "You can directly call these methods to get embeddings for your own use cases.\n", - "\n", - "### Embed single texts\n", - "\n", - "You can embed single texts or documents with `embed_query`:" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "id": "0d2befcd", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[0.01666259765625, 0.011688232421875, -0.1181640625, -0.10205078125, 0.05438232421875, -0.0890502929\n" - ] - } - ], - "source": [ - "single_vector = embeddings.embed_query(text)\n", - "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" - ] - }, - { - "cell_type": "markdown", - "id": "1b5a7d03", - "metadata": {}, - "source": [ - "### Embed multiple texts\n", - "\n", - "You can embed multiple texts with `embed_documents`:" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "id": "2f4d6e97", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[0.016632080078125, 0.01165008544921875, -0.1181640625, -0.10186767578125, 0.05438232421875, -0.0890\n", - "[-0.02667236328125, 0.036651611328125, -0.1630859375, -0.0904541015625, -0.022430419921875, -0.09545\n" - ] - } - ], - "source": [ - "text2 = (\n", - " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", - ")\n", - "two_vectors = embeddings.embed_documents([text, text2])\n", - "for vector in two_vectors:\n", - " print(str(vector)[:100]) # Show the first 100 characters of the vector" - ] - }, - { - "cell_type": "markdown", - "id": "3fba556a-b53d-431c-b0c6-ffb1e2fa5a6e", - "metadata": {}, - "source": [ - "## API Reference\n", - "\n", - "For detailed documentation of all `FireworksEmbeddings` features and configurations head to the [API reference](https://python.langchain.com/api_reference/fireworks/embeddings/langchain_fireworks.embeddings.FireworksEmbeddings.html)." 
- ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.4" - } + "cells": [ + { + "cell_type": "raw", + "id": "afaf8039", + "metadata": {}, + "source": [ + "---\n", + "sidebar_label: Fireworks\n", + "---" + ] }, - "nbformat": 4, - "nbformat_minor": 5 + { + "cell_type": "markdown", + "id": "9a3d6f34", + "metadata": {}, + "source": [ + "# FireworksEmbeddings\n", + "\n", + "This will help you get started with Fireworks embedding models using LangChain. For detailed documentation on `FireworksEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/fireworks/embeddings/langchain_fireworks.embeddings.FireworksEmbeddings.html).\n", + "\n", + "## Overview\n", + "\n", + "### Integration details\n", + "\n", + "import { ItemTable } from \"@theme/FeatureTables\";\n", + "\n", + "\n", + "\n", + "## Setup\n", + "\n", + "To access Fireworks embedding models you'll need to create a Fireworks account, get an API key, and install the `langchain-fireworks` integration package.\n", + "\n", + "### Credentials\n", + "\n", + "Head to [fireworks.ai](https://fireworks.ai/) to sign up to Fireworks and generate an API key. Once you’ve done this, set the `FIREWORKS_API_KEY` environment variable:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "36521c2a", + "metadata": {}, + "outputs": [], + "source": [ + "import getpass\n", + "import os\n", + "\n", + "if not os.getenv(\"FIREWORKS_API_KEY\"):\n", + " os.environ[\"FIREWORKS_API_KEY\"] = getpass.getpass(\"Enter your Fireworks API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "c84fb993", + "metadata": {}, + "source": [ + "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "39a4953b", + "metadata": {}, + "outputs": [], + "source": [ + "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n", + "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "d9664366", + "metadata": {}, + "source": [ + "### Installation\n", + "\n", + "The LangChain Fireworks integration lives in the `langchain-fireworks` package:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "64853226", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -qU langchain-fireworks" + ] + }, + { + "cell_type": "markdown", + "id": "45dd1724", + "metadata": {}, + "source": [ + "## Instantiation\n", + "\n", + "Now we can instantiate our model object and generate embeddings:" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "9ea7a09b", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_fireworks import FireworksEmbeddings\n", + "\n", + "embeddings = FireworksEmbeddings(\n", + " model=\"nomic-ai/nomic-embed-text-v1.5\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "77d271b6", + "metadata": {}, + "source": [ + "## Indexing and Retrieval\n", + "\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. 
For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n", + "\n", + "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "d817716b", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'LangChain is the framework for building context-aware reasoning applications'" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create a vector store with a sample text\n", + "from langchain_core.vectorstores import InMemoryVectorStore\n", + "\n", + "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", + "\n", + "vectorstore = InMemoryVectorStore.from_texts(\n", + " [text],\n", + " embedding=embeddings,\n", + ")\n", + "\n", + "# Use the vectorstore as a retriever\n", + "retriever = vectorstore.as_retriever()\n", + "\n", + "# Retrieve the most similar text\n", + "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", + "\n", + "# show the retrieved document's content\n", + "retrieved_documents[0].page_content" + ] + }, + { + "cell_type": "markdown", + "id": "e02b9855", + "metadata": {}, + "source": [ + "## Direct Usage\n", + "\n", + "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", + "\n", + "You can directly call these methods to get embeddings for your own use cases.\n", + "\n", + "### Embed single texts\n", + "\n", + "You can embed single texts or documents with `embed_query`:" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "0d2befcd", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0.01666259765625, 0.011688232421875, -0.1181640625, -0.10205078125, 0.05438232421875, -0.0890502929\n" + ] + } + ], + "source": [ + "single_vector = embeddings.embed_query(text)\n", + "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": "1b5a7d03", + "metadata": {}, + "source": [ + "### Embed multiple texts\n", + "\n", + "You can embed multiple texts with `embed_documents`:" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "2f4d6e97", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0.016632080078125, 0.01165008544921875, -0.1181640625, -0.10186767578125, 0.05438232421875, -0.0890\n", + "[-0.02667236328125, 0.036651611328125, -0.1630859375, -0.0904541015625, -0.022430419921875, -0.09545\n" + ] + } + ], + "source": [ + "text2 = (\n", + " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", + ")\n", + "two_vectors = embeddings.embed_documents([text, text2])\n", + "for vector in two_vectors:\n", + " print(str(vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": "3fba556a-b53d-431c-b0c6-ffb1e2fa5a6e", + "metadata": {}, + "source": [ + "## API Reference\n", + "\n", + "For detailed documentation of all `FireworksEmbeddings` features and configurations head to the [API 
reference](https://python.langchain.com/api_reference/fireworks/embeddings/langchain_fireworks.embeddings.FireworksEmbeddings.html)." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.6" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/docs/docs/integrations/text_embedding/ibm_watsonx.ipynb b/docs/docs/integrations/text_embedding/ibm_watsonx.ipynb index 76db43cbe6f..5d2e23fcefb 100644 --- a/docs/docs/integrations/text_embedding/ibm_watsonx.ipynb +++ b/docs/docs/integrations/text_embedding/ibm_watsonx.ipynb @@ -203,7 +203,7 @@ "source": [ "## Indexing and Retrieval\n", "\n", - "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n", "\n", "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." ] @@ -327,7 +327,7 @@ ], "metadata": { "kernelspec": { - "display_name": "langchain_ibm", + "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, @@ -341,9 +341,9 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.11.12" + "version": "3.9.6" } }, "nbformat": 4, - "nbformat_minor": 2 + "nbformat_minor": 4 } diff --git a/docs/docs/integrations/text_embedding/lindorm.ipynb b/docs/docs/integrations/text_embedding/lindorm.ipynb index 78f7db24335..8d60317fe94 100644 --- a/docs/docs/integrations/text_embedding/lindorm.ipynb +++ b/docs/docs/integrations/text_embedding/lindorm.ipynb @@ -132,7 +132,7 @@ "source": [ "## Indexing and Retrieval\n", "\n", - "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n", "\n", "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." 
] @@ -286,7 +286,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.5" + "version": "3.9.6" } }, "nbformat": 4, diff --git a/docs/docs/integrations/text_embedding/mistralai.ipynb b/docs/docs/integrations/text_embedding/mistralai.ipynb index 56efd00bc19..4aee9ff8cd8 100644 --- a/docs/docs/integrations/text_embedding/mistralai.ipynb +++ b/docs/docs/integrations/text_embedding/mistralai.ipynb @@ -1,264 +1,266 @@ { - "cells": [ - { - "cell_type": "raw", - "id": "afaf8039", - "metadata": {}, - "source": [ - "---\n", - "sidebar_label: MistralAI\n", - "---" - ] - }, - { - "cell_type": "markdown", - "id": "9a3d6f34", - "metadata": {}, - "source": [ - "# MistralAIEmbeddings\n", - "\n", - "This will help you get started with MistralAI embedding models using LangChain. For detailed documentation on `MistralAIEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/mistralai/embeddings/langchain_mistralai.embeddings.MistralAIEmbeddings.html).\n", - "\n", - "## Overview\n", - "### Integration details\n", - "\n", - "import { ItemTable } from \"@theme/FeatureTables\";\n", - "\n", - "\n", - "\n", - "## Setup\n", - "\n", - "To access MistralAI embedding models you'll need to create a/an MistralAI account, get an API key, and install the `langchain-mistralai` integration package.\n", - "\n", - "### Credentials\n", - "\n", - "Head to [https://console.mistral.ai/](https://console.mistral.ai/) to sign up to MistralAI and generate an API key. Once you've done this set the MISTRALAI_API_KEY environment variable:" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "36521c2a", - "metadata": {}, - "outputs": [], - "source": [ - "import getpass\n", - "import os\n", - "\n", - "if not os.getenv(\"MISTRALAI_API_KEY\"):\n", - " os.environ[\"MISTRALAI_API_KEY\"] = getpass.getpass(\"Enter your MistralAI API key: \")" - ] - }, - { - "cell_type": "markdown", - "id": "c84fb993", - "metadata": {}, - "source": "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "39a4953b", - "metadata": {}, - "outputs": [], - "source": [ - "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n", - "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" - ] - }, - { - "cell_type": "markdown", - "id": "d9664366", - "metadata": {}, - "source": [ - "### Installation\n", - "\n", - "The LangChain MistralAI integration lives in the `langchain-mistralai` package:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "64853226", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -qU langchain-mistralai" - ] - }, - { - "cell_type": "markdown", - "id": "45dd1724", - "metadata": {}, - "source": [ - "## Instantiation\n", - "\n", - "Now we can instantiate our model object and generate chat completions:" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "id": "9ea7a09b", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_mistralai import MistralAIEmbeddings\n", - "\n", - "embeddings = MistralAIEmbeddings(\n", - " model=\"mistral-embed\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "77d271b6", - "metadata": {}, - "source": [ - "## Indexing and Retrieval\n", - "\n", - "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later 
retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n", - "\n", - "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "id": "d817716b", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "'LangChain is the framework for building context-aware reasoning applications'" - ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Create a vector store with a sample text\n", - "from langchain_core.vectorstores import InMemoryVectorStore\n", - "\n", - "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", - "\n", - "vectorstore = InMemoryVectorStore.from_texts(\n", - " [text],\n", - " embedding=embeddings,\n", - ")\n", - "\n", - "# Use the vectorstore as a retriever\n", - "retriever = vectorstore.as_retriever()\n", - "\n", - "# Retrieve the most similar text\n", - "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", - "\n", - "# show the retrieved document's content\n", - "retrieved_documents[0].page_content" - ] - }, - { - "cell_type": "markdown", - "id": "e02b9855", - "metadata": {}, - "source": [ - "## Direct Usage\n", - "\n", - "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", - "\n", - "You can directly call these methods to get embeddings for your own use cases.\n", - "\n", - "### Embed single texts\n", - "\n", - "You can embed single texts or documents with `embed_query`:" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "id": "0d2befcd", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[-0.04443359375, 0.01885986328125, 0.018035888671875, -0.00864410400390625, 0.049652099609375, -0.00\n" - ] - } - ], - "source": [ - "single_vector = embeddings.embed_query(text)\n", - "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" - ] - }, - { - "cell_type": "markdown", - "id": "1b5a7d03", - "metadata": {}, - "source": [ - "### Embed multiple texts\n", - "\n", - "You can embed multiple texts with `embed_documents`:" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "id": "2f4d6e97", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[-0.04443359375, 0.01885986328125, 0.0180511474609375, -0.0086517333984375, 0.049652099609375, -0.00\n", - "[-0.02032470703125, 0.02606201171875, 0.051605224609375, -0.0281982421875, 0.055755615234375, 0.0019\n" - ] - } - ], - "source": [ - "text2 = (\n", - " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", - ")\n", - "two_vectors = embeddings.embed_documents([text, text2])\n", - "for vector in two_vectors:\n", - " print(str(vector)[:100]) # Show the first 100 characters of the vector" - ] - }, - { - "cell_type": "markdown", - "id": "98785c12", - "metadata": {}, - "source": [ - "## API Reference\n", - "\n", - "For detailed documentation on `MistralAIEmbeddings` features and configuration options, please refer to the [API 
reference](https://python.langchain.com/api_reference/mistralai/embeddings/langchain_mistralai.embeddings.MistralAIEmbeddings.html).\n" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.4" - } + "cells": [ + { + "cell_type": "raw", + "id": "afaf8039", + "metadata": {}, + "source": [ + "---\n", + "sidebar_label: MistralAI\n", + "---" + ] }, - "nbformat": 4, - "nbformat_minor": 5 + { + "cell_type": "markdown", + "id": "9a3d6f34", + "metadata": {}, + "source": [ + "# MistralAIEmbeddings\n", + "\n", + "This will help you get started with MistralAI embedding models using LangChain. For detailed documentation on `MistralAIEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/mistralai/embeddings/langchain_mistralai.embeddings.MistralAIEmbeddings.html).\n", + "\n", + "## Overview\n", + "### Integration details\n", + "\n", + "import { ItemTable } from \"@theme/FeatureTables\";\n", + "\n", + "\n", + "\n", + "## Setup\n", + "\n", + "To access MistralAI embedding models you'll need to create a MistralAI account, get an API key, and install the `langchain-mistralai` integration package.\n", + "\n", + "### Credentials\n", + "\n", + "Head to [https://console.mistral.ai/](https://console.mistral.ai/) to sign up to MistralAI and generate an API key. Once you've done this, set the `MISTRALAI_API_KEY` environment variable:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "36521c2a", + "metadata": {}, + "outputs": [], + "source": [ + "import getpass\n", + "import os\n", + "\n", + "if not os.getenv(\"MISTRALAI_API_KEY\"):\n", + " os.environ[\"MISTRALAI_API_KEY\"] = getpass.getpass(\"Enter your MistralAI API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "c84fb993", + "metadata": {}, + "source": [ + "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "39a4953b", + "metadata": {}, + "outputs": [], + "source": [ + "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n", + "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "d9664366", + "metadata": {}, + "source": [ + "### Installation\n", + "\n", + "The LangChain MistralAI integration lives in the `langchain-mistralai` package:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "64853226", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -qU langchain-mistralai" + ] + }, + { + "cell_type": "markdown", + "id": "45dd1724", + "metadata": {}, + "source": [ + "## Instantiation\n", + "\n", + "Now we can instantiate our model object and generate embeddings:" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "9ea7a09b", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_mistralai import MistralAIEmbeddings\n", + "\n", + "embeddings = MistralAIEmbeddings(\n", + " model=\"mistral-embed\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "77d271b6", + "metadata": {}, + "source": [ + "## Indexing and Retrieval\n", + "\n", + "Embedding models are 
often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n", + "\n", + "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "d817716b", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'LangChain is the framework for building context-aware reasoning applications'" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create a vector store with a sample text\n", + "from langchain_core.vectorstores import InMemoryVectorStore\n", + "\n", + "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", + "\n", + "vectorstore = InMemoryVectorStore.from_texts(\n", + " [text],\n", + " embedding=embeddings,\n", + ")\n", + "\n", + "# Use the vectorstore as a retriever\n", + "retriever = vectorstore.as_retriever()\n", + "\n", + "# Retrieve the most similar text\n", + "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", + "\n", + "# show the retrieved document's content\n", + "retrieved_documents[0].page_content" + ] + }, + { + "cell_type": "markdown", + "id": "e02b9855", + "metadata": {}, + "source": [ + "## Direct Usage\n", + "\n", + "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", + "\n", + "You can directly call these methods to get embeddings for your own use cases.\n", + "\n", + "### Embed single texts\n", + "\n", + "You can embed single texts or documents with `embed_query`:" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "0d2befcd", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[-0.04443359375, 0.01885986328125, 0.018035888671875, -0.00864410400390625, 0.049652099609375, -0.00\n" + ] + } + ], + "source": [ + "single_vector = embeddings.embed_query(text)\n", + "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": "1b5a7d03", + "metadata": {}, + "source": [ + "### Embed multiple texts\n", + "\n", + "You can embed multiple texts with `embed_documents`:" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "2f4d6e97", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[-0.04443359375, 0.01885986328125, 0.0180511474609375, -0.0086517333984375, 0.049652099609375, -0.00\n", + "[-0.02032470703125, 0.02606201171875, 0.051605224609375, -0.0281982421875, 0.055755615234375, 0.0019\n" + ] + } + ], + "source": [ + "text2 = (\n", + " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", + ")\n", + "two_vectors = embeddings.embed_documents([text, text2])\n", + "for vector in two_vectors:\n", + " print(str(vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": "98785c12", + "metadata": {}, + "source": [ + "## API Reference\n", + "\n", + "For detailed documentation on `MistralAIEmbeddings` features and configuration options, please refer to the 
[API reference](https://python.langchain.com/api_reference/mistralai/embeddings/langchain_mistralai.embeddings.MistralAIEmbeddings.html).\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.6" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/docs/docs/integrations/text_embedding/modelscope_embedding.ipynb b/docs/docs/integrations/text_embedding/modelscope_embedding.ipynb index b5db8dbab9a..f756a722155 100644 --- a/docs/docs/integrations/text_embedding/modelscope_embedding.ipynb +++ b/docs/docs/integrations/text_embedding/modelscope_embedding.ipynb @@ -128,7 +128,7 @@ "source": [ "## Indexing and Retrieval\n", "\n", - "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n", "\n", "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." ] @@ -277,7 +277,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.16" + "version": "3.9.6" } }, "nbformat": 4, diff --git a/docs/docs/integrations/text_embedding/naver.ipynb b/docs/docs/integrations/text_embedding/naver.ipynb index 3f52992f94a..81fb46bd131 100644 --- a/docs/docs/integrations/text_embedding/naver.ipynb +++ b/docs/docs/integrations/text_embedding/naver.ipynb @@ -112,7 +112,7 @@ "source": [ "## Indexing and Retrieval\n", "\n", - "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n", "\n", "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." 
] @@ -249,7 +249,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.12.3" + "version": "3.9.6" } }, "nbformat": 4, diff --git a/docs/docs/integrations/text_embedding/netmind.ipynb b/docs/docs/integrations/text_embedding/netmind.ipynb index ad59fc28590..937703b9e49 100644 --- a/docs/docs/integrations/text_embedding/netmind.ipynb +++ b/docs/docs/integrations/text_embedding/netmind.ipynb @@ -37,6 +37,7 @@ }, { "cell_type": "code", + "execution_count": 1, "id": "36521c2a", "metadata": { "ExecuteTime": { @@ -44,15 +45,14 @@ "start_time": "2025-03-20T01:53:27.764291Z" } }, + "outputs": [], "source": [ "import getpass\n", "import os\n", "\n", "if not os.getenv(\"NETMIND_API_KEY\"):\n", " os.environ[\"NETMIND_API_KEY\"] = getpass.getpass(\"Enter your Netmind API key: \")" - ], - "outputs": [], - "execution_count": 1 + ] }, { "cell_type": "markdown", @@ -64,6 +64,7 @@ }, { "cell_type": "code", + "execution_count": 2, "id": "39a4953b", "metadata": { "ExecuteTime": { @@ -71,12 +72,11 @@ "start_time": "2025-03-20T01:53:32.141858Z" } }, + "outputs": [], "source": [ "# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n", "# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" - ], - "outputs": [], - "execution_count": 2 + ] }, { "cell_type": "markdown", @@ -90,6 +90,7 @@ }, { "cell_type": "code", + "execution_count": 3, "id": "64853226", "metadata": { "ExecuteTime": { @@ -97,22 +98,21 @@ "start_time": "2025-03-20T01:53:36.171640Z" } }, - "source": [ - "%pip install -qU langchain-netmind" - ], "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\r\n", - "\u001B[1m[\u001B[0m\u001B[34;49mnotice\u001B[0m\u001B[1;39;49m]\u001B[0m\u001B[39;49m A new release of pip is available: \u001B[0m\u001B[31;49m24.0\u001B[0m\u001B[39;49m -> \u001B[0m\u001B[32;49m25.0.1\u001B[0m\r\n", - "\u001B[1m[\u001B[0m\u001B[34;49mnotice\u001B[0m\u001B[1;39;49m]\u001B[0m\u001B[39;49m To update, run: \u001B[0m\u001B[32;49mpip install --upgrade pip\u001B[0m\r\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.0.1\u001b[0m\r\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\r\n", "Note: you may need to restart the kernel to use updated packages.\n" ] } ], - "execution_count": 3 + "source": [ + "%pip install -qU langchain-netmind" + ] }, { "cell_type": "markdown", @@ -126,6 +126,7 @@ }, { "cell_type": "code", + "execution_count": 4, "id": "9ea7a09b", "metadata": { "ExecuteTime": { @@ -133,15 +134,14 @@ "start_time": "2025-03-20T01:54:30.146876Z" } }, + "outputs": [], "source": [ "from langchain_netmind import NetmindEmbeddings\n", "\n", "embeddings = NetmindEmbeddings(\n", " model=\"nvidia/NV-Embed-v2\",\n", ")" - ], - "outputs": [], - "execution_count": 4 + ] }, { "cell_type": "markdown", @@ -150,13 +150,14 @@ "source": [ "## Indexing and Retrieval\n", "\n", - "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. 
For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n", "\n", "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." ] }, { "cell_type": "code", + "execution_count": 5, "id": "d817716b", "metadata": { "ExecuteTime": { @@ -164,6 +165,18 @@ "start_time": "2025-03-20T01:54:34.500805Z" } }, + "outputs": [ + { + "data": { + "text/plain": [ + "'LangChain is the framework for building context-aware reasoning applications'" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "# Create a vector store with a sample text\n", "from langchain_core.vectorstores import InMemoryVectorStore\n", @@ -183,20 +196,7 @@ "\n", "# show the retrieved document's content\n", "retrieved_documents[0].page_content" - ], - "outputs": [ - { - "data": { - "text/plain": [ - "'LangChain is the framework for building context-aware reasoning applications'" - ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" - } - ], - "execution_count": 5 + ] }, { "cell_type": "markdown", @@ -216,6 +216,7 @@ }, { "cell_type": "code", + "execution_count": 6, "id": "0d2befcd", "metadata": { "ExecuteTime": { @@ -223,10 +224,6 @@ "start_time": "2025-03-20T01:54:45.196528Z" } }, - "source": [ - "single_vector = embeddings.embed_query(text)\n", - "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" - ], "outputs": [ { "name": "stdout", @@ -236,7 +233,10 @@ ] } ], - "execution_count": 6 + "source": [ + "single_vector = embeddings.embed_query(text)\n", + "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" + ] }, { "cell_type": "markdown", @@ -250,6 +250,7 @@ }, { "cell_type": "code", + "execution_count": 7, "id": "2f4d6e97", "metadata": { "ExecuteTime": { @@ -257,14 +258,6 @@ "start_time": "2025-03-20T01:54:52.468719Z" } }, - "source": [ - "text2 = (\n", - " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", - ")\n", - "two_vectors = embeddings.embed_documents([text, text2])\n", - "for vector in two_vectors:\n", - " print(str(vector)[:100]) # Show the first 100 characters of the vector" - ], "outputs": [ { "name": "stdout", @@ -275,7 +268,14 @@ ] } ], - "execution_count": 7 + "source": [ + "text2 = (\n", + " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", + ")\n", + "two_vectors = embeddings.embed_documents([text, text2])\n", + "for vector in two_vectors:\n", + " print(str(vector)[:100]) # Show the first 100 characters of the vector" + ] }, { "cell_type": "markdown", @@ -291,12 +291,12 @@ ] }, { - "metadata": {}, "cell_type": "code", - "outputs": [], "execution_count": null, - "source": "", - "id": "adb9e45c34733299" + "id": "adb9e45c34733299", + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { @@ -315,7 +315,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.5" + "version": "3.9.6" } }, "nbformat": 4, diff --git a/docs/docs/integrations/text_embedding/nomic.ipynb b/docs/docs/integrations/text_embedding/nomic.ipynb index 60142aa0a76..8f6a9a77049 100644 --- a/docs/docs/integrations/text_embedding/nomic.ipynb +++ b/docs/docs/integrations/text_embedding/nomic.ipynb @@ -1,285 +1,287 @@ { - "cells": [ - { - "cell_type": "raw", - "id": "afaf8039", - "metadata": {}, - "source": [ - "---\n", - "sidebar_label: 
Nomic\n", - "---" - ] - }, - { - "cell_type": "markdown", - "id": "9a3d6f34", - "metadata": {}, - "source": [ - "# NomicEmbeddings\n", - "\n", - "This will help you get started with Nomic embedding models using LangChain. For detailed documentation on `NomicEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/nomic/embeddings/langchain_nomic.embeddings.NomicEmbeddings.html).\n", - "\n", - "## Overview\n", - "### Integration details\n", - "\n", - "import { ItemTable } from \"@theme/FeatureTables\";\n", - "\n", - "\n", - "\n", - "## Setup\n", - "\n", - "To access Nomic embedding models you'll need to create a/an Nomic account, get an API key, and install the `langchain-nomic` integration package.\n", - "\n", - "### Credentials\n", - "\n", - "Head to [https://atlas.nomic.ai/](https://atlas.nomic.ai/) to sign up to Nomic and generate an API key. Once you've done this set the `NOMIC_API_KEY` environment variable:" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "36521c2a", - "metadata": {}, - "outputs": [], - "source": [ - "import getpass\n", - "import os\n", - "\n", - "if not os.getenv(\"NOMIC_API_KEY\"):\n", - " os.environ[\"NOMIC_API_KEY\"] = getpass.getpass(\"Enter your Nomic API key: \")" - ] - }, - { - "cell_type": "markdown", - "id": "c84fb993", - "metadata": {}, - "source": "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" - }, - { - "cell_type": "code", - "execution_count": 3, - "id": "39a4953b", - "metadata": {}, - "outputs": [], - "source": [ - "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n", - "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" - ] - }, - { - "cell_type": "markdown", - "id": "d9664366", - "metadata": {}, - "source": [ - "### Installation\n", - "\n", - "The LangChain Nomic integration lives in the `langchain-nomic` package:" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "64853226", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Note: you may need to restart the kernel to use updated packages.\n" - ] - } - ], - "source": [ - "%pip install -qU langchain-nomic" - ] - }, - { - "cell_type": "markdown", - "id": "45dd1724", - "metadata": {}, - "source": [ - "## Instantiation\n", - "\n", - "Now we can instantiate our model object and generate chat completions:" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "id": "9ea7a09b", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_nomic import NomicEmbeddings\n", - "\n", - "embeddings = NomicEmbeddings(\n", - " model=\"nomic-embed-text-v1.5\",\n", - " # dimensionality=256,\n", - " # Nomic's `nomic-embed-text-v1.5` model was [trained with Matryoshka learning](https://blog.nomic.ai/posts/nomic-embed-matryoshka)\n", - " # to enable variable-length embeddings with a single model.\n", - " # This means that you can specify the dimensionality of the embeddings at inference time.\n", - " # The model supports dimensionality from 64 to 768.\n", - " # inference_mode=\"remote\",\n", - " # One of `remote`, `local` (Embed4All), or `dynamic` (automatic). Defaults to `remote`.\n", - " # api_key=... , # if using remote inference,\n", - " # device=\"cpu\",\n", - " # The device to use for local embeddings. Choices include\n", - " # `cpu`, `gpu`, `nvidia`, `amd`, or a specific device name. 
See\n", - " # the docstring for `GPT4All.__init__` for more info. Typically\n", - " # defaults to CPU. Do not use on macOS.\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "77d271b6", - "metadata": {}, - "source": [ - "## Indexing and Retrieval\n", - "\n", - "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n", - "\n", - "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "id": "d817716b", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "'LangChain is the framework for building context-aware reasoning applications'" - ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Create a vector store with a sample text\n", - "from langchain_core.vectorstores import InMemoryVectorStore\n", - "\n", - "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", - "\n", - "vectorstore = InMemoryVectorStore.from_texts(\n", - " [text],\n", - " embedding=embeddings,\n", - ")\n", - "\n", - "# Use the vectorstore as a retriever\n", - "retriever = vectorstore.as_retriever()\n", - "\n", - "# Retrieve the most similar text\n", - "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", - "\n", - "# show the retrieved document's content\n", - "retrieved_documents[0].page_content" - ] - }, - { - "cell_type": "markdown", - "id": "e02b9855", - "metadata": {}, - "source": [ - "## Direct Usage\n", - "\n", - "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", - "\n", - "You can directly call these methods to get embeddings for your own use cases.\n", - "\n", - "### Embed single texts\n", - "\n", - "You can embed single texts or documents with `embed_query`:" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "id": "0d2befcd", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[0.024642944, 0.029083252, -0.14013672, -0.09082031, 0.058898926, -0.07489014, -0.0138168335, 0.0037\n" - ] - } - ], - "source": [ - "single_vector = embeddings.embed_query(text)\n", - "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" - ] - }, - { - "cell_type": "markdown", - "id": "1b5a7d03", - "metadata": {}, - "source": [ - "### Embed multiple texts\n", - "\n", - "You can embed multiple texts with `embed_documents`:" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "id": "2f4d6e97", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[0.012771606, 0.023727417, -0.12365723, -0.083740234, 0.06530762, -0.07110596, -0.021896362, -0.0068\n", - "[-0.019058228, 0.04058838, -0.15222168, -0.06842041, -0.012130737, -0.07128906, -0.04534912, 0.00522\n" - ] - } - ], - "source": [ - "text2 = (\n", - " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", - ")\n", - "two_vectors = embeddings.embed_documents([text, text2])\n", - "for vector in two_vectors:\n", - " 
print(str(vector)[:100]) # Show the first 100 characters of the vector" - ] - }, - { - "cell_type": "markdown", - "id": "98785c12", - "metadata": {}, - "source": [ - "## API Reference\n", - "\n", - "For detailed documentation on `NomicEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/nomic/embeddings/langchain_nomic.embeddings.NomicEmbeddings.html).\n" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.6" - } + "cells": [ + { + "cell_type": "raw", + "id": "afaf8039", + "metadata": {}, + "source": [ + "---\n", + "sidebar_label: Nomic\n", + "---" + ] }, - "nbformat": 4, - "nbformat_minor": 5 + { + "cell_type": "markdown", + "id": "9a3d6f34", + "metadata": {}, + "source": [ + "# NomicEmbeddings\n", + "\n", + "This will help you get started with Nomic embedding models using LangChain. For detailed documentation on `NomicEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/nomic/embeddings/langchain_nomic.embeddings.NomicEmbeddings.html).\n", + "\n", + "## Overview\n", + "### Integration details\n", + "\n", + "import { ItemTable } from \"@theme/FeatureTables\";\n", + "\n", + "\n", + "\n", + "## Setup\n", + "\n", + "To access Nomic embedding models you'll need to create a Nomic account, get an API key, and install the `langchain-nomic` integration package.\n", + "\n", + "### Credentials\n", + "\n", + "Head to [https://atlas.nomic.ai/](https://atlas.nomic.ai/) to sign up to Nomic and generate an API key. 
Once you've done this, set the `NOMIC_API_KEY` environment variable:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "36521c2a", + "metadata": {}, + "outputs": [], + "source": [ + "import getpass\n", + "import os\n", + "\n", + "if not os.getenv(\"NOMIC_API_KEY\"):\n", + " os.environ[\"NOMIC_API_KEY\"] = getpass.getpass(\"Enter your Nomic API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "c84fb993", + "metadata": {}, + "source": [ + "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "39a4953b", + "metadata": {}, + "outputs": [], + "source": [ + "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n", + "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "d9664366", + "metadata": {}, + "source": [ + "### Installation\n", + "\n", + "The LangChain Nomic integration lives in the `langchain-nomic` package:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "64853226", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Note: you may need to restart the kernel to use updated packages.\n" + ] + } + ], + "source": [ + "%pip install -qU langchain-nomic" + ] + }, + { + "cell_type": "markdown", + "id": "45dd1724", + "metadata": {}, + "source": [ + "## Instantiation\n", + "\n", + "Now we can instantiate our model object and generate embeddings:" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "9ea7a09b", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_nomic import NomicEmbeddings\n", + "\n", + "embeddings = NomicEmbeddings(\n", + " model=\"nomic-embed-text-v1.5\",\n", + " # dimensionality=256,\n", + " # Nomic's `nomic-embed-text-v1.5` model was [trained with Matryoshka learning](https://blog.nomic.ai/posts/nomic-embed-matryoshka)\n", + " # to enable variable-length embeddings with a single model.\n", + " # This means that you can specify the dimensionality of the embeddings at inference time.\n", + " # The model supports dimensionality from 64 to 768.\n", + " # inference_mode=\"remote\",\n", + " # One of `remote`, `local` (Embed4All), or `dynamic` (automatic). Defaults to `remote`.\n", + " # api_key=... , # if using remote inference,\n", + " # device=\"cpu\",\n", + " # The device to use for local embeddings. Choices include\n", + " # `cpu`, `gpu`, `nvidia`, `amd`, or a specific device name. See\n", + " # the docstring for `GPT4All.__init__` for more info. Typically\n", + " # defaults to CPU. Do not use on macOS.\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "77d271b6", + "metadata": {}, + "source": [ + "## Indexing and Retrieval\n", + "\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n", + "\n", + "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`.\n",
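+ "\n", + "Retrieval in the example below boils down to a vector-similarity comparison. As a minimal sketch of that primitive (a hypothetical snippet that assumes `numpy` is available in your environment), you can compare two embedded texts directly:\n", + "\n", + "```python\n", + "import numpy as np\n", + "\n", + "# Embed a query and a candidate document, then score them with cosine similarity.\n", + "query_vec = np.array(embeddings.embed_query(\"What is LangChain?\"))\n", + "doc_vec = np.array(embeddings.embed_query(\"LangChain is a framework for building LLM applications\"))\n", + "print(query_vec @ doc_vec / (np.linalg.norm(query_vec) * np.linalg.norm(doc_vec)))  # closer to 1.0 means more similar\n", + "```"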
+ ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "d817716b", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'LangChain is the framework for building context-aware reasoning applications'" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create a vector store with a sample text\n", + "from langchain_core.vectorstores import InMemoryVectorStore\n", + "\n", + "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", + "\n", + "vectorstore = InMemoryVectorStore.from_texts(\n", + " [text],\n", + " embedding=embeddings,\n", + ")\n", + "\n", + "# Use the vectorstore as a retriever\n", + "retriever = vectorstore.as_retriever()\n", + "\n", + "# Retrieve the most similar text\n", + "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", + "\n", + "# show the retrieved document's content\n", + "retrieved_documents[0].page_content" + ] + }, + { + "cell_type": "markdown", + "id": "e02b9855", + "metadata": {}, + "source": [ + "## Direct Usage\n", + "\n", + "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", + "\n", + "You can directly call these methods to get embeddings for your own use cases.\n", + "\n", + "### Embed single texts\n", + "\n", + "You can embed single texts or documents with `embed_query`:" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "0d2befcd", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0.024642944, 0.029083252, -0.14013672, -0.09082031, 0.058898926, -0.07489014, -0.0138168335, 0.0037\n" + ] + } + ], + "source": [ + "single_vector = embeddings.embed_query(text)\n", + "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": "1b5a7d03", + "metadata": {}, + "source": [ + "### Embed multiple texts\n", + "\n", + "You can embed multiple texts with `embed_documents`:" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "2f4d6e97", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0.012771606, 0.023727417, -0.12365723, -0.083740234, 0.06530762, -0.07110596, -0.021896362, -0.0068\n", + "[-0.019058228, 0.04058838, -0.15222168, -0.06842041, -0.012130737, -0.07128906, -0.04534912, 0.00522\n" + ] + } + ], + "source": [ + "text2 = (\n", + " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", + ")\n", + "two_vectors = embeddings.embed_documents([text, text2])\n", + "for vector in two_vectors:\n", + " print(str(vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": "98785c12", + "metadata": {}, + "source": [ + "## API Reference\n", + "\n", + "For detailed documentation on `NomicEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/nomic/embeddings/langchain_nomic.embeddings.NomicEmbeddings.html).\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + 
"name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.6" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/docs/docs/integrations/text_embedding/openai.ipynb b/docs/docs/integrations/text_embedding/openai.ipynb index 61b964ce308..31bdcee0932 100644 --- a/docs/docs/integrations/text_embedding/openai.ipynb +++ b/docs/docs/integrations/text_embedding/openai.ipynb @@ -1,270 +1,272 @@ { - "cells": [ - { - "cell_type": "raw", - "id": "afaf8039", - "metadata": {}, - "source": [ - "---\n", - "sidebar_label: OpenAI\n", - "keywords: [openaiembeddings]\n", - "---" - ] - }, - { - "cell_type": "markdown", - "id": "9a3d6f34", - "metadata": {}, - "source": [ - "# OpenAIEmbeddings\n", - "\n", - "This will help you get started with OpenAI embedding models using LangChain. For detailed documentation on `OpenAIEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/openai/embeddings/langchain_openai.embeddings.base.OpenAIEmbeddings.html).\n", - "\n", - "\n", - "## Overview\n", - "### Integration details\n", - "\n", - "import { ItemTable } from \"@theme/FeatureTables\";\n", - "\n", - "\n", - "\n", - "## Setup\n", - "\n", - "To access OpenAI embedding models you'll need to create a/an OpenAI account, get an API key, and install the `langchain-openai` integration package.\n", - "\n", - "### Credentials\n", - "\n", - "Head to [platform.openai.com](https://platform.openai.com) to sign up to OpenAI and generate an API key. Once you’ve done this set the OPENAI_API_KEY environment variable:" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "id": "36521c2a", - "metadata": {}, - "outputs": [], - "source": [ - "import getpass\n", - "import os\n", - "\n", - "if not os.getenv(\"OPENAI_API_KEY\"):\n", - " os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"Enter your OpenAI API key: \")" - ] - }, - { - "cell_type": "markdown", - "id": "c84fb993", - "metadata": {}, - "source": "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" - }, - { - "cell_type": "code", - "execution_count": 7, - "id": "39a4953b", - "metadata": {}, - "outputs": [], - "source": [ - "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n", - "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" - ] - }, - { - "cell_type": "markdown", - "id": "d9664366", - "metadata": {}, - "source": [ - "### Installation\n", - "\n", - "The LangChain OpenAI integration lives in the `langchain-openai` package:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "64853226", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -qU langchain-openai" - ] - }, - { - "cell_type": "markdown", - "id": "45dd1724", - "metadata": {}, - "source": [ - "## Instantiation\n", - "\n", - "Now we can instantiate our model object and generate chat completions:" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "id": "9ea7a09b", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_openai import OpenAIEmbeddings\n", - "\n", - "embeddings = OpenAIEmbeddings(\n", - " model=\"text-embedding-3-large\",\n", - " # With the `text-embedding-3` class\n", - " # of models, you can specify the size\n", - " # of the embeddings you want returned.\n", - " # dimensions=1024\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "77d271b6", - "metadata": {}, - "source": [ - "## Indexing and Retrieval\n", - "\n", - 
"Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n", - "\n", - "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "id": "d817716b", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "'LangChain is the framework for building context-aware reasoning applications'" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Create a vector store with a sample text\n", - "from langchain_core.vectorstores import InMemoryVectorStore\n", - "\n", - "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", - "\n", - "vectorstore = InMemoryVectorStore.from_texts(\n", - " [text],\n", - " embedding=embeddings,\n", - ")\n", - "\n", - "# Use the vectorstore as a retriever\n", - "retriever = vectorstore.as_retriever()\n", - "\n", - "# Retrieve the most similar text\n", - "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", - "\n", - "# show the retrieved document's content\n", - "retrieved_documents[0].page_content" - ] - }, - { - "cell_type": "markdown", - "id": "e02b9855", - "metadata": {}, - "source": [ - "## Direct Usage\n", - "\n", - "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", - "\n", - "You can directly call these methods to get embeddings for your own use cases.\n", - "\n", - "### Embed single texts\n", - "\n", - "You can embed single texts or documents with `embed_query`:" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "id": "0d2befcd", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[-0.019276829436421394, 0.0037708976306021214, -0.03294256329536438, 0.0037671267054975033, 0.008175\n" - ] - } - ], - "source": [ - "single_vector = embeddings.embed_query(text)\n", - "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" - ] - }, - { - "cell_type": "markdown", - "id": "1b5a7d03", - "metadata": {}, - "source": [ - "### Embed multiple texts\n", - "\n", - "You can embed multiple texts with `embed_documents`:" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "id": "2f4d6e97", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[-0.019260549917817116, 0.0037612367887049913, -0.03291035071015358, 0.003757466096431017, 0.0082049\n", - "[-0.010181212797760963, 0.023419594392180443, -0.04215526953339577, -0.001532090245746076, -0.023573\n" - ] - } - ], - "source": [ - "text2 = (\n", - " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", - ")\n", - "two_vectors = embeddings.embed_documents([text, text2])\n", - "for vector in two_vectors:\n", - " print(str(vector)[:100]) # Show the first 100 characters of the vector" - ] - }, - { - "cell_type": "markdown", - "id": "98785c12", - "metadata": {}, - "source": [ - "## API Reference\n", - "\n", - "For detailed documentation on `OpenAIEmbeddings` features and configuration options, 
please refer to the [API reference](https://python.langchain.com/api_reference/openai/embeddings/langchain_openai.embeddings.base.OpenAIEmbeddings.html).\n" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.4" - } + "cells": [ + { + "cell_type": "raw", + "id": "afaf8039", + "metadata": {}, + "source": [ + "---\n", + "sidebar_label: OpenAI\n", + "keywords: [openaiembeddings]\n", + "---" + ] }, - "nbformat": 4, - "nbformat_minor": 5 + { + "cell_type": "markdown", + "id": "9a3d6f34", + "metadata": {}, + "source": [ + "# OpenAIEmbeddings\n", + "\n", + "This will help you get started with OpenAI embedding models using LangChain. For detailed documentation on `OpenAIEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/openai/embeddings/langchain_openai.embeddings.base.OpenAIEmbeddings.html).\n", + "\n", + "\n", + "## Overview\n", + "### Integration details\n", + "\n", + "import { ItemTable } from \"@theme/FeatureTables\";\n", + "\n", + "\n", + "\n", + "## Setup\n", + "\n", + "To access OpenAI embedding models you'll need to create an OpenAI account, get an API key, and install the `langchain-openai` integration package.\n", + "\n", + "### Credentials\n", + "\n", + "Head to [platform.openai.com](https://platform.openai.com) to sign up to OpenAI and generate an API key. Once you’ve done this, set the OPENAI_API_KEY environment variable:" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "36521c2a", + "metadata": {}, + "outputs": [], + "source": [ + "import getpass\n", + "import os\n", + "\n", + "if not os.getenv(\"OPENAI_API_KEY\"):\n", + " os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"Enter your OpenAI API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "c84fb993", + "metadata": {}, + "source": [ + "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "39a4953b", + "metadata": {}, + "outputs": [], + "source": [ + "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n", + "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "d9664366", + "metadata": {}, + "source": [ + "### Installation\n", + "\n", + "The LangChain OpenAI integration lives in the `langchain-openai` package:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "64853226", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -qU langchain-openai" + ] + }, + { + "cell_type": "markdown", + "id": "45dd1724", + "metadata": {}, + "source": [ + "## Instantiation\n", + "\n", + "Now we can instantiate our model object and generate embeddings:" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "9ea7a09b", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_openai import OpenAIEmbeddings\n", + "\n", + "embeddings = OpenAIEmbeddings(\n", + " model=\"text-embedding-3-large\",\n", + " # With the `text-embedding-3` class\n", + " # of models, you can specify the size\n", + " # of the embeddings you want returned.\n", + " # 
dimensions=1024\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "77d271b6", + "metadata": {}, + "source": [ + "## Indexing and Retrieval\n", + "\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n", + "\n", + "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "d817716b", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'LangChain is the framework for building context-aware reasoning applications'" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create a vector store with a sample text\n", + "from langchain_core.vectorstores import InMemoryVectorStore\n", + "\n", + "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", + "\n", + "vectorstore = InMemoryVectorStore.from_texts(\n", + " [text],\n", + " embedding=embeddings,\n", + ")\n", + "\n", + "# Use the vectorstore as a retriever\n", + "retriever = vectorstore.as_retriever()\n", + "\n", + "# Retrieve the most similar text\n", + "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", + "\n", + "# show the retrieved document's content\n", + "retrieved_documents[0].page_content" + ] + }, + { + "cell_type": "markdown", + "id": "e02b9855", + "metadata": {}, + "source": [ + "## Direct Usage\n", + "\n", + "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", + "\n", + "You can directly call these methods to get embeddings for your own use cases.\n", + "\n", + "### Embed single texts\n", + "\n", + "You can embed single texts or documents with `embed_query`:" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "0d2befcd", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[-0.019276829436421394, 0.0037708976306021214, -0.03294256329536438, 0.0037671267054975033, 0.008175\n" + ] + } + ], + "source": [ + "single_vector = embeddings.embed_query(text)\n", + "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": "1b5a7d03", + "metadata": {}, + "source": [ + "### Embed multiple texts\n", + "\n", + "You can embed multiple texts with `embed_documents`:" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "2f4d6e97", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[-0.019260549917817116, 0.0037612367887049913, -0.03291035071015358, 0.003757466096431017, 0.0082049\n", + "[-0.010181212797760963, 0.023419594392180443, -0.04215526953339577, -0.001532090245746076, -0.023573\n" + ] + } + ], + "source": [ + "text2 = (\n", + " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", + ")\n", + "two_vectors = embeddings.embed_documents([text, text2])\n", + "for vector in two_vectors:\n", + " print(str(vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": 
"98785c12", + "metadata": {}, + "source": [ + "## API Reference\n", + "\n", + "For detailed documentation on `OpenAIEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/openai/embeddings/langchain_openai.embeddings.base.OpenAIEmbeddings.html).\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.6" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/docs/docs/integrations/text_embedding/sambanova.ipynb b/docs/docs/integrations/text_embedding/sambanova.ipynb index 43e0d03e553..2d4a2414b58 100644 --- a/docs/docs/integrations/text_embedding/sambanova.ipynb +++ b/docs/docs/integrations/text_embedding/sambanova.ipynb @@ -133,7 +133,7 @@ "source": [ "## Indexing and Retrieval\n", "\n", - "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n", "\n", "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." ] @@ -244,7 +244,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.5" + "version": "3.9.6" } }, "nbformat": 4, diff --git a/docs/docs/integrations/text_embedding/sambastudio.ipynb b/docs/docs/integrations/text_embedding/sambastudio.ipynb index e8f36dfaa98..4d843ed8ba2 100644 --- a/docs/docs/integrations/text_embedding/sambastudio.ipynb +++ b/docs/docs/integrations/text_embedding/sambastudio.ipynb @@ -141,7 +141,7 @@ "source": [ "## Indexing and Retrieval\n", "\n", - "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n", "\n", "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." 
] @@ -252,7 +252,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.5" + "version": "3.9.6" } }, "nbformat": 4, diff --git a/docs/docs/integrations/text_embedding/together.ipynb b/docs/docs/integrations/text_embedding/together.ipynb index 5d2867c69cd..d3ec044917a 100644 --- a/docs/docs/integrations/text_embedding/together.ipynb +++ b/docs/docs/integrations/text_embedding/together.ipynb @@ -1,275 +1,277 @@ { - "cells": [ - { - "cell_type": "raw", - "id": "afaf8039", - "metadata": {}, - "source": [ - "---\n", - "sidebar_label: Together AI\n", - "---" - ] - }, - { - "cell_type": "markdown", - "id": "9a3d6f34", - "metadata": {}, - "source": [ - "# TogetherEmbeddings\n", - "\n", - "This will help you get started with Together embedding models using LangChain. For detailed documentation on `TogetherEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/together/embeddings/langchain_together.embeddings.TogetherEmbeddings.html).\n", - "\n", - "## Overview\n", - "### Integration details\n", - "\n", - "import { ItemTable } from \"@theme/FeatureTables\";\n", - "\n", - "\n", - "\n", - "## Setup\n", - "\n", - "To access Together embedding models you'll need to create a/an Together account, get an API key, and install the `langchain-together` integration package.\n", - "\n", - "### Credentials\n", - "\n", - "Head to [https://api.together.xyz/](https://api.together.xyz/) to sign up to Together and generate an API key. Once you've done this set the TOGETHER_API_KEY environment variable:" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "36521c2a", - "metadata": {}, - "outputs": [], - "source": [ - "import getpass\n", - "import os\n", - "\n", - "if not os.getenv(\"TOGETHER_API_KEY\"):\n", - " os.environ[\"TOGETHER_API_KEY\"] = getpass.getpass(\"Enter your Together API key: \")" - ] - }, - { - "cell_type": "markdown", - "id": "c84fb993", - "metadata": {}, - "source": "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "39a4953b", - "metadata": {}, - "outputs": [], - "source": [ - "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n", - "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" - ] - }, - { - "cell_type": "markdown", - "id": "d9664366", - "metadata": {}, - "source": [ - "### Installation\n", - "\n", - "The LangChain Together integration lives in the `langchain-together` package:" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "id": "64853226", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.2\u001b[0m\n", - "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpython -m pip install --upgrade pip\u001b[0m\n", - "Note: you may need to restart the kernel to use updated packages.\n" - ] - } - ], - "source": [ - "%pip install -qU langchain-together" - ] - }, - { - "cell_type": "markdown", - "id": "45dd1724", - "metadata": {}, - "source": [ - "## Instantiation\n", - "\n", - "Now we can instantiate our model object and generate chat completions:" - ] - }, - { - 
"cell_type": "code", - "execution_count": 5, - "id": "9ea7a09b", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_together import TogetherEmbeddings\n", - "\n", - "embeddings = TogetherEmbeddings(\n", - " model=\"togethercomputer/m2-bert-80M-8k-retrieval\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "77d271b6", - "metadata": {}, - "source": [ - "## Indexing and Retrieval\n", - "\n", - "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n", - "\n", - "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "id": "d817716b", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "'LangChain is the framework for building context-aware reasoning applications'" - ] - }, - "execution_count": 6, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Create a vector store with a sample text\n", - "from langchain_core.vectorstores import InMemoryVectorStore\n", - "\n", - "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", - "\n", - "vectorstore = InMemoryVectorStore.from_texts(\n", - " [text],\n", - " embedding=embeddings,\n", - ")\n", - "\n", - "# Use the vectorstore as a retriever\n", - "retriever = vectorstore.as_retriever()\n", - "\n", - "# Retrieve the most similar text\n", - "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", - "\n", - "# show the retrieved document's content\n", - "retrieved_documents[0].page_content" - ] - }, - { - "cell_type": "markdown", - "id": "e02b9855", - "metadata": {}, - "source": [ - "## Direct Usage\n", - "\n", - "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", - "\n", - "You can directly call these methods to get embeddings for your own use cases.\n", - "\n", - "### Embed single texts\n", - "\n", - "You can embed single texts or documents with `embed_query`:" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "id": "0d2befcd", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[0.3812227, -0.052848946, -0.10564975, 0.03480297, 0.2878488, 0.0084609175, 0.11605915, 0.05303011, \n" - ] - } - ], - "source": [ - "single_vector = embeddings.embed_query(text)\n", - "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" - ] - }, - { - "cell_type": "markdown", - "id": "1b5a7d03", - "metadata": {}, - "source": [ - "### Embed multiple texts\n", - "\n", - "You can embed multiple texts with `embed_documents`:" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "id": "2f4d6e97", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[0.3812227, -0.052848946, -0.10564975, 0.03480297, 0.2878488, 0.0084609175, 0.11605915, 0.05303011, \n", - "[0.066308185, -0.032866564, 0.115751594, 0.19082588, 0.14017, -0.26976448, -0.056340694, -0.26923394\n" - ] - } - ], - "source": [ - "text2 = (\n", - " \"LangGraph is a library for building stateful, multi-actor 
applications with LLMs\"\n", - ")\n", - "two_vectors = embeddings.embed_documents([text, text2])\n", - "for vector in two_vectors:\n", - " print(str(vector)[:100]) # Show the first 100 characters of the vector" - ] - }, - { - "cell_type": "markdown", - "id": "98785c12", - "metadata": {}, - "source": [ - "## API Reference\n", - "\n", - "For detailed documentation on `TogetherEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/together/embeddings/langchain_together.embeddings.TogetherEmbeddings.html).\n" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.4" - } + "cells": [ + { + "cell_type": "raw", + "id": "afaf8039", + "metadata": {}, + "source": [ + "---\n", + "sidebar_label: Together AI\n", + "---" + ] }, - "nbformat": 4, - "nbformat_minor": 5 + { + "cell_type": "markdown", + "id": "9a3d6f34", + "metadata": {}, + "source": [ + "# TogetherEmbeddings\n", + "\n", + "This will help you get started with Together embedding models using LangChain. For detailed documentation on `TogetherEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/together/embeddings/langchain_together.embeddings.TogetherEmbeddings.html).\n", + "\n", + "## Overview\n", + "### Integration details\n", + "\n", + "import { ItemTable } from \"@theme/FeatureTables\";\n", + "\n", + "\n", + "\n", + "## Setup\n", + "\n", + "To access Together embedding models you'll need to create a/an Together account, get an API key, and install the `langchain-together` integration package.\n", + "\n", + "### Credentials\n", + "\n", + "Head to [https://api.together.xyz/](https://api.together.xyz/) to sign up to Together and generate an API key. 
Once you've done this, set the TOGETHER_API_KEY environment variable:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "36521c2a", + "metadata": {}, + "outputs": [], + "source": [ + "import getpass\n", + "import os\n", + "\n", + "if not os.getenv(\"TOGETHER_API_KEY\"):\n", + " os.environ[\"TOGETHER_API_KEY\"] = getpass.getpass(\"Enter your Together API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "c84fb993", + "metadata": {}, + "source": [ + "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "39a4953b", + "metadata": {}, + "outputs": [], + "source": [ + "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n", + "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "d9664366", + "metadata": {}, + "source": [ + "### Installation\n", + "\n", + "The LangChain Together integration lives in the `langchain-together` package:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "64853226", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.2\u001b[0m\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpython -m pip install --upgrade pip\u001b[0m\n", + "Note: you may need to restart the kernel to use updated packages.\n" + ] + } + ], + "source": [ + "%pip install -qU langchain-together" + ] + }, + { + "cell_type": "markdown", + "id": "45dd1724", + "metadata": {}, + "source": [ + "## Instantiation\n", + "\n", + "Now we can instantiate our model object and generate embeddings:" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "9ea7a09b", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_together import TogetherEmbeddings\n", + "\n", + "embeddings = TogetherEmbeddings(\n", + " model=\"togethercomputer/m2-bert-80M-8k-retrieval\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "77d271b6", + "metadata": {}, + "source": [ + "## Indexing and Retrieval\n", + "\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n", + "\n", + "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`.\n",
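+ "\n", + "If you also want the similarity scores that retrieval is based on, `InMemoryVectorStore` exposes `similarity_search_with_score` (a minimal sketch, run against the `vectorstore` built below):\n", + "\n", + "```python\n", + "# Returns (Document, score) pairs; higher scores indicate closer matches.\n", + "for doc, score in vectorstore.similarity_search_with_score(\"What is LangChain?\", k=1):\n", + "    print(score, doc.page_content)\n", + "```"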
+ ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "d817716b", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'LangChain is the framework for building context-aware reasoning applications'" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create a vector store with a sample text\n", + "from langchain_core.vectorstores import InMemoryVectorStore\n", + "\n", + "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", + "\n", + "vectorstore = InMemoryVectorStore.from_texts(\n", + " [text],\n", + " embedding=embeddings,\n", + ")\n", + "\n", + "# Use the vectorstore as a retriever\n", + "retriever = vectorstore.as_retriever()\n", + "\n", + "# Retrieve the most similar text\n", + "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", + "\n", + "# show the retrieved document's content\n", + "retrieved_documents[0].page_content" + ] + }, + { + "cell_type": "markdown", + "id": "e02b9855", + "metadata": {}, + "source": [ + "## Direct Usage\n", + "\n", + "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", + "\n", + "You can directly call these methods to get embeddings for your own use cases.\n", + "\n", + "### Embed single texts\n", + "\n", + "You can embed single texts or documents with `embed_query`:" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "0d2befcd", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0.3812227, -0.052848946, -0.10564975, 0.03480297, 0.2878488, 0.0084609175, 0.11605915, 0.05303011, \n" + ] + } + ], + "source": [ + "single_vector = embeddings.embed_query(text)\n", + "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": "1b5a7d03", + "metadata": {}, + "source": [ + "### Embed multiple texts\n", + "\n", + "You can embed multiple texts with `embed_documents`:" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "2f4d6e97", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0.3812227, -0.052848946, -0.10564975, 0.03480297, 0.2878488, 0.0084609175, 0.11605915, 0.05303011, \n", + "[0.066308185, -0.032866564, 0.115751594, 0.19082588, 0.14017, -0.26976448, -0.056340694, -0.26923394\n" + ] + } + ], + "source": [ + "text2 = (\n", + " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", + ")\n", + "two_vectors = embeddings.embed_documents([text, text2])\n", + "for vector in two_vectors:\n", + " print(str(vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": "98785c12", + "metadata": {}, + "source": [ + "## API Reference\n", + "\n", + "For detailed documentation on `TogetherEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/together/embeddings/langchain_together.embeddings.TogetherEmbeddings.html).\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": 
"text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.6" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/docs/docs/integrations/text_embedding/zhipuai.ipynb b/docs/docs/integrations/text_embedding/zhipuai.ipynb index 39235d4ae8f..e76454fac73 100644 --- a/docs/docs/integrations/text_embedding/zhipuai.ipynb +++ b/docs/docs/integrations/text_embedding/zhipuai.ipynb @@ -1,277 +1,279 @@ { - "cells": [ - { - "cell_type": "raw", - "id": "afaf8039", - "metadata": {}, - "source": [ - "---\n", - "sidebar_label: ZhipuAI\n", - "keywords: [zhipuaiembeddings]\n", - "---" - ] - }, - { - "cell_type": "markdown", - "id": "9a3d6f34", - "metadata": {}, - "source": [ - "# ZhipuAIEmbeddings\n", - "\n", - "This will help you get started with ZhipuAI embedding models using LangChain. For detailed documentation on `ZhipuAIEmbeddings` features and configuration options, please refer to the [API reference](https://bigmodel.cn/dev/api#vector).\n", - "\n", - "## Overview\n", - "### Integration details\n", - "\n", - "| Provider | Package |\n", - "|:--------:|:-------:|\n", - "| [ZhipuAI](/docs/integrations/providers/zhipuai/) | [langchain-community](https://python.langchain.com/api_reference/community/embeddings/langchain_community.embeddings.zhipuai.ZhipuAIEmbeddings.html) |\n", - "\n", - "## Setup\n", - "\n", - "To access ZhipuAI embedding models you'll need to create a/an ZhipuAI account, get an API key, and install the `zhipuai` integration package.\n", - "\n", - "### Credentials\n", - "\n", - "Head to [https://bigmodel.cn/](https://bigmodel.cn/usercenter/apikeys) to sign up to ZhipuAI and generate an API key. Once you've done this set the ZHIPUAI_API_KEY environment variable:" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "36521c2a", - "metadata": {}, - "outputs": [], - "source": [ - "import getpass\n", - "import os\n", - "\n", - "if not os.getenv(\"ZHIPUAI_API_KEY\"):\n", - " os.environ[\"ZHIPUAI_API_KEY\"] = getpass.getpass(\"Enter your ZhipuAI API key: \")" - ] - }, - { - "cell_type": "markdown", - "id": "c84fb993", - "metadata": {}, - "source": "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "39a4953b", - "metadata": {}, - "outputs": [], - "source": [ - "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n", - "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" - ] - }, - { - "cell_type": "markdown", - "id": "d9664366", - "metadata": {}, - "source": [ - "### Installation\n", - "\n", - "The LangChain ZhipuAI integration lives in the `zhipuai` package:" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "id": "64853226", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Note: you may need to restart the kernel to use updated packages.\n" - ] - } - ], - "source": [ - "%pip install -qU zhipuai" - ] - }, - { - "cell_type": "markdown", - "id": "45dd1724", - "metadata": {}, - "source": [ - "## Instantiation\n", - "\n", - "Now we can instantiate our model object and generate chat completions:" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "id": "9ea7a09b", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_community.embeddings import ZhipuAIEmbeddings\n", - "\n", - "embeddings = ZhipuAIEmbeddings(\n", - " model=\"embedding-3\",\n", - " # With the `embedding-3` 
class\n", - " # of models, you can specify the size\n", - " # of the embeddings you want returned.\n", - " # dimensions=1024\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "77d271b6", - "metadata": {}, - "source": [ - "## Indexing and Retrieval\n", - "\n", - "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n", - "\n", - "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "id": "d817716b", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "'LangChain is the framework for building context-aware reasoning applications'" - ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Create a vector store with a sample text\n", - "from langchain_core.vectorstores import InMemoryVectorStore\n", - "\n", - "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", - "\n", - "vectorstore = InMemoryVectorStore.from_texts(\n", - " [text],\n", - " embedding=embeddings,\n", - ")\n", - "\n", - "# Use the vectorstore as a retriever\n", - "retriever = vectorstore.as_retriever()\n", - "\n", - "# Retrieve the most similar text\n", - "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", - "\n", - "# show the retrieved document's content\n", - "retrieved_documents[0].page_content" - ] - }, - { - "cell_type": "markdown", - "id": "e02b9855", - "metadata": {}, - "source": [ - "## Direct Usage\n", - "\n", - "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", - "\n", - "You can directly call these methods to get embeddings for your own use cases.\n", - "\n", - "### Embed single texts\n", - "\n", - "You can embed single texts or documents with `embed_query`:" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "id": "0d2befcd", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[-0.022979736, 0.007785797, 0.04598999, 0.012741089, -0.01689148, 0.008277893, 0.016464233, 0.009246\n" - ] - } - ], - "source": [ - "single_vector = embeddings.embed_query(text)\n", - "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" - ] - }, - { - "cell_type": "markdown", - "id": "1b5a7d03", - "metadata": {}, - "source": [ - "### Embed multiple texts\n", - "\n", - "You can embed multiple texts with `embed_documents`:" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "id": "2f4d6e97", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[-0.022979736, 0.007785797, 0.04598999, 0.012741089, -0.01689148, 0.008277893, 0.016464233, 0.009246\n", - "[-0.02330017, -0.013916016, 0.00022411346, 0.017196655, -0.034240723, 0.011131287, 0.011497498, -0.0\n" - ] - } - ], - "source": [ - "text2 = (\n", - " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", - ")\n", - "two_vectors = embeddings.embed_documents([text, text2])\n", - "for vector in two_vectors:\n", - " 
print(str(vector)[:100]) # Show the first 100 characters of the vector" - ] - }, - { - "cell_type": "markdown", - "id": "98785c12", - "metadata": {}, - "source": [ - "## API Reference\n", - "\n", - "For detailed documentation on `ZhipuAIEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/community/embeddings/langchain_community.embeddings.zhipuai.ZhipuAIEmbeddings.html).\n" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.12.3" - } + "cells": [ + { + "cell_type": "raw", + "id": "afaf8039", + "metadata": {}, + "source": [ + "---\n", + "sidebar_label: ZhipuAI\n", + "keywords: [zhipuaiembeddings]\n", + "---" + ] }, - "nbformat": 4, - "nbformat_minor": 5 + { + "cell_type": "markdown", + "id": "9a3d6f34", + "metadata": {}, + "source": [ + "# ZhipuAIEmbeddings\n", + "\n", + "This will help you get started with ZhipuAI embedding models using LangChain. For detailed documentation on `ZhipuAIEmbeddings` features and configuration options, please refer to the [API reference](https://bigmodel.cn/dev/api#vector).\n", + "\n", + "## Overview\n", + "### Integration details\n", + "\n", + "| Provider | Package |\n", + "|:--------:|:-------:|\n", + "| [ZhipuAI](/docs/integrations/providers/zhipuai/) | [langchain-community](https://python.langchain.com/api_reference/community/embeddings/langchain_community.embeddings.zhipuai.ZhipuAIEmbeddings.html) |\n", + "\n", + "## Setup\n", + "\n", + "To access ZhipuAI embedding models you'll need to create a ZhipuAI account, get an API key, and install the `zhipuai` integration package.\n", + "\n", + "### Credentials\n", + "\n", + "Head to [https://bigmodel.cn/](https://bigmodel.cn/usercenter/apikeys) to sign up to ZhipuAI and generate an API key. 
Once you've done this, set the ZHIPUAI_API_KEY environment variable:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "36521c2a", + "metadata": {}, + "outputs": [], + "source": [ + "import getpass\n", + "import os\n", + "\n", + "if not os.getenv(\"ZHIPUAI_API_KEY\"):\n", + " os.environ[\"ZHIPUAI_API_KEY\"] = getpass.getpass(\"Enter your ZhipuAI API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "c84fb993", + "metadata": {}, + "source": [ + "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "39a4953b", + "metadata": {}, + "outputs": [], + "source": [ + "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n", + "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "d9664366", + "metadata": {}, + "source": [ + "### Installation\n", + "\n", + "The `ZhipuAIEmbeddings` class lives in the `langchain-community` package and depends on the `zhipuai` SDK:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "64853226", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Note: you may need to restart the kernel to use updated packages.\n" + ] + } + ], + "source": [ + "%pip install -qU zhipuai" + ] + }, + { + "cell_type": "markdown", + "id": "45dd1724", + "metadata": {}, + "source": [ + "## Instantiation\n", + "\n", + "Now we can instantiate our model object and generate embeddings:" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "9ea7a09b", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_community.embeddings import ZhipuAIEmbeddings\n", + "\n", + "embeddings = ZhipuAIEmbeddings(\n", + " model=\"embedding-3\",\n", + " # With the `embedding-3` class\n", + " # of models, you can specify the size\n", + " # of the embeddings you want returned.\n", + " # dimensions=1024\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "77d271b6", + "metadata": {}, + "source": [ + "## Indexing and Retrieval\n", + "\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n", + "\n", + "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`.\n",
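+ "\n", + "The retriever in the next cell returns its default number of matches; to cap how many documents come back, pass `search_kwargs` when creating it (a minimal sketch against the `vectorstore` built below):\n", + "\n", + "```python\n", + "# Only return the single closest document for each query.\n", + "retriever = vectorstore.as_retriever(search_kwargs={\"k\": 1})\n", + "print(retriever.invoke(\"What is LangChain?\")[0].page_content)\n", + "```"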
+ ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "d817716b", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'LangChain is the framework for building context-aware reasoning applications'" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create a vector store with a sample text\n", + "from langchain_core.vectorstores import InMemoryVectorStore\n", + "\n", + "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", + "\n", + "vectorstore = InMemoryVectorStore.from_texts(\n", + " [text],\n", + " embedding=embeddings,\n", + ")\n", + "\n", + "# Use the vectorstore as a retriever\n", + "retriever = vectorstore.as_retriever()\n", + "\n", + "# Retrieve the most similar text\n", + "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", + "\n", + "# show the retrieved document's content\n", + "retrieved_documents[0].page_content" + ] + }, + { + "cell_type": "markdown", + "id": "e02b9855", + "metadata": {}, + "source": [ + "## Direct Usage\n", + "\n", + "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", + "\n", + "You can directly call these methods to get embeddings for your own use cases.\n", + "\n", + "### Embed single texts\n", + "\n", + "You can embed single texts or documents with `embed_query`:" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "0d2befcd", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[-0.022979736, 0.007785797, 0.04598999, 0.012741089, -0.01689148, 0.008277893, 0.016464233, 0.009246\n" + ] + } + ], + "source": [ + "single_vector = embeddings.embed_query(text)\n", + "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": "1b5a7d03", + "metadata": {}, + "source": [ + "### Embed multiple texts\n", + "\n", + "You can embed multiple texts with `embed_documents`:" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "2f4d6e97", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[-0.022979736, 0.007785797, 0.04598999, 0.012741089, -0.01689148, 0.008277893, 0.016464233, 0.009246\n", + "[-0.02330017, -0.013916016, 0.00022411346, 0.017196655, -0.034240723, 0.011131287, 0.011497498, -0.0\n" + ] + } + ], + "source": [ + "text2 = (\n", + " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", + ")\n", + "two_vectors = embeddings.embed_documents([text, text2])\n", + "for vector in two_vectors:\n", + " print(str(vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": "98785c12", + "metadata": {}, + "source": [ + "## API Reference\n", + "\n", + "For detailed documentation on `ZhipuAIEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/community/embeddings/langchain_community.embeddings.zhipuai.ZhipuAIEmbeddings.html).\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": 
"text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.6" + } + }, + "nbformat": 4, + "nbformat_minor": 5 }