From b1aed44540e41a75d77b0970deb5169a6ab4f05d Mon Sep 17 00:00:00 2001 From: Eugene Yurtsev Date: Tue, 13 Aug 2024 20:04:18 -0400 Subject: [PATCH] docs: Updating integration docs for Fireworks Embeddings (#25247) Providers: * fireworks See related issue: * https://github.com/langchain-ai/langchain/issues/24856 Features: ```json [ { "provider": "fireworks", "js": true, "local": false, "serializable": false, } ] ``` --------- Co-authored-by: isaac hershenson Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> --- .../text_embedding/fireworks.ipynb | 263 ++++++++++++++---- 1 file changed, 206 insertions(+), 57 deletions(-) diff --git a/docs/docs/integrations/text_embedding/fireworks.ipynb b/docs/docs/integrations/text_embedding/fireworks.ipynb index d5ac356ec06..c87249874c2 100644 --- a/docs/docs/integrations/text_embedding/fireworks.ipynb +++ b/docs/docs/integrations/text_embedding/fireworks.ipynb @@ -1,19 +1,88 @@ { "cells": [ + { + "cell_type": "raw", + "id": "afaf8039", + "metadata": {}, + "source": [ + "---\n", + "sidebar_label: Fireworks\n", + "---" + ] + }, { "cell_type": "markdown", - "id": "b14a24db", + "id": "9a3d6f34", "metadata": {}, "source": [ "# FireworksEmbeddings\n", "\n", - "This notebook explains how to use Fireworks Embeddings, which is included in the langchain_fireworks package, to embed texts in langchain. We use the default nomic-ai v1.5 model in this example." + "This will help you get started with Fireworks embedding models using LangChain. 
For detailed documentation on `FireworksEmbeddings` features and configuration options, please refer to the [API reference](https://api.python.langchain.com/en/latest/embeddings/langchain_fireworks.embeddings.FireworksEmbeddings.html).\n", + "\n", + "## Overview\n", + "\n", + "### Integration details\n", + "\n", + "import { ItemTable } from \"@theme/FeatureTables\";\n", + "\n", + "\n", + "\n", + "## Setup\n", + "\n", + "To access Fireworks embedding models you'll need to create a Fireworks account, get an API key, and install the `langchain-fireworks` integration package.\n", + "\n", + "### Credentials\n", + "\n", + "Head to [fireworks.ai](https://fireworks.ai/) to sign up to Fireworks and generate an API key. Once you’ve done this set the FIREWORKS_API_KEY environment variable:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "36521c2a", + "metadata": {}, + "outputs": [], + "source": [ + "import getpass\n", + "import os\n", + "\n", + "if not os.getenv(\"FIREWORKS_API_KEY\"):\n", + " os.environ[\"FIREWORKS_API_KEY\"] = getpass.getpass(\"Enter your Fireworks API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "c84fb993", + "metadata": {}, + "source": [ + "If you want to get automated tracing of your model calls you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "39a4953b", + "metadata": {}, + "outputs": [], + "source": [ + "# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n", + "# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "d9664366", + "metadata": {}, + "source": [ + "### Installation\n", + "\n", + "The LangChain Fireworks integration lives in the `langchain-fireworks` package:" ] }, { "cell_type": "code", "execution_count": null, - "id": "0ab948fc", + "id": "64853226", "metadata": {}, "outputs": [], "source": [ @@ -22,83 
+91,163 @@ }, { "cell_type": "markdown", - "id": "67c637ca", + "id": "45dd1724", "metadata": {}, "source": [ - "## Setup" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "5709b030", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_fireworks import FireworksEmbeddings" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "3d81e58c", - "metadata": {}, - "outputs": [], - "source": [ - "import getpass\n", - "import os\n", "## Instantiation\n", "\n", - "if \"FIREWORKS_API_KEY\" not in os.environ:\n", - " os.environ[\"FIREWORKS_API_KEY\"] = getpass.getpass(\"Fireworks API Key:\")" - ] - }, - { - "cell_type": "markdown", - "id": "4a2a098d", - "metadata": {}, - "source": [ - "# Using the Embedding Model\n", - "With `FireworksEmbeddings`, you can directly use the default model 'nomic-ai/nomic-embed-text-v1.5', or set a different one if available." - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "id": "584b9af5", - "metadata": {}, - "outputs": [], - "source": [ - "embedding = FireworksEmbeddings(model=\"nomic-ai/nomic-embed-text-v1.5\")" + "Now we can instantiate our model object and generate embeddings:" ] }, { "cell_type": "code", "execution_count": 4, - "id": "be18b873", + "id": "9ea7a09b", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_fireworks import FireworksEmbeddings\n", + "\n", + "embeddings = FireworksEmbeddings(\n", + " model=\"nomic-ai/nomic-embed-text-v1.5\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "77d271b6", + "metadata": {}, + "source": [ + "## Indexing and Retrieval\n", + "\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both when indexing data and when later retrieving it. 
For more detailed instructions, please see our RAG tutorials under the [working with external knowledge tutorials](/docs/tutorials/#working-with-external-knowledge).\n", + "\n", + "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "d817716b", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'LangChain is the framework for building context-aware reasoning applications'" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create a vector store with a sample text\n", + "from langchain_core.vectorstores import InMemoryVectorStore\n", + "\n", + "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", + "\n", + "vectorstore = InMemoryVectorStore.from_texts(\n", + " [text],\n", + " embedding=embeddings,\n", + ")\n", + "\n", + "# Use the vectorstore as a retriever\n", + "retriever = vectorstore.as_retriever()\n", + "\n", + "# Retrieve the most similar text\n", + "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", + "\n", + "# show the retrieved document's content\n", + "retrieved_documents[0].page_content" + ] + }, + { + "cell_type": "markdown", + "id": "e02b9855", + "metadata": {}, + "source": [ + "## Direct Usage\n", + "\n", + "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", + "\n", + "You can directly call these methods to get embeddings for your own use cases.\n", + "\n", + "### Embed single texts\n", + "\n", + "You can embed single texts or documents with `embed_query`:" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + 
"id": "0d2befcd", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "[0.01367950439453125, 0.0103607177734375, -0.157958984375, -0.003070831298828125, 0.05926513671875]\n", - "[0.0369873046875, 0.00545501708984375, -0.179931640625, -0.018707275390625, 0.0552978515625]\n" + "[0.01666259765625, 0.011688232421875, -0.1181640625, -0.10205078125, 0.05438232421875, -0.0890502929\n" ] } ], "source": [ - "res_query = embedding.embed_query(\"The test information\")\n", - "res_document = embedding.embed_documents([\"test1\", \"another test\"])\n", - "print(res_query[:5])\n", - "print(res_document[1][:5])" + "single_vector = embeddings.embed_query(text)\n", + "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": "1b5a7d03", + "metadata": {}, + "source": [ + "### Embed multiple texts\n", + "\n", + "You can embed multiple texts with `embed_documents`:" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "2f4d6e97", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0.016632080078125, 0.01165008544921875, -0.1181640625, -0.10186767578125, 0.05438232421875, -0.0890\n", + "[-0.02667236328125, 0.036651611328125, -0.1630859375, -0.0904541015625, -0.022430419921875, -0.09545\n" + ] + } + ], + "source": [ + "text2 = (\n", + " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", + ")\n", + "two_vectors = embeddings.embed_documents([text, text2])\n", + "for vector in two_vectors:\n", + " print(str(vector)[:100]) # Show the first 100 characters of the vector" + ] + }, + { + "cell_type": "markdown", + "id": "3fba556a-b53d-431c-b0c6-ffb1e2fa5a6e", + "metadata": {}, + "source": [ + "## API Reference\n", + "\n", + "For detailed documentation of all `FireworksEmbeddings` features and configurations head to the [API 
reference](https://api.python.langchain.com/en/latest/embeddings/langchain_fireworks.embeddings.FireworksEmbeddings.html)." ] } ], "metadata": { "kernelspec": { - "display_name": "poetry-venv-2", + "display_name": "Python 3 (ipykernel)", "language": "python", - "name": "poetry-venv-2" + "name": "python3" }, "language_info": { "codemirror_mode": { @@ -110,7 +259,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.1" + "version": "3.11.4" } }, "nbformat": 4,