langchain/docs/versioned_docs/version-0.2.x/integrations/text_embedding/premai.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# PremAI\n",
    "\n",
    ">[PremAI](https://app.premai.io) is an unified platform that let's you build powerful production-ready GenAI powered applications with least effort, so that you can focus more on user experience and overall growth. In this section we are going to dicuss how we can get access to different embedding model using `PremAIEmbeddings`\n",
    "\n",
    "## Installation and Setup\n",
    "\n",
    "We start by installing langchain and premai-sdk. You can type the following command to install:\n",
    "\n",
    "```bash\n",
    "pip install premai langchain\n",
    "```\n",
    "\n",
    "Before proceeding further, please make sure that you have made an account on Prem and already started a project. If not, then here's how you can start for free:\n",
    "\n",
    "1. Sign in to [PremAI](https://app.premai.io/accounts/login/), if you are coming for the first time and create your API key [here](https://app.premai.io/api_keys/).\n",
    "\n",
    "2. Go to [app.premai.io](https://app.premai.io) and this will take you to the project's dashboard. \n",
    "\n",
    "3. Create a project and this will generate a project-id (written as ID). This ID will help you to interact with your deployed application. \n",
    "\n",
    "Congratulations on creating your first deployed application on Prem 🎉 Now we can use langchain to interact with our application. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Let's start by doing some imports and define our embedding object\n",
    "\n",
    "from langchain_community.embeddings import PremAIEmbeddings"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once we imported our required modules, let's setup our client. For now let's assume that our `project_id` is 8. But make sure you use your project-id, otherwise it will throw error.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import getpass\n",
    "import os\n",
    "\n",
    "if os.environ.get(\"PREMAI_API_KEY\") is None:\n",
    "    os.environ[\"PREMAI_API_KEY\"] = getpass.getpass(\"PremAI API Key:\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = \"text-embedding-3-large\"\n",
    "embedder = PremAIEmbeddings(project_id=8, model=model)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have defined our embedding model. We support a lot of embedding models. Here is a table that shows the number of embedding models we support. \n",
    "\n",
    "\n",
    "| Provider    | Slug                                     | Context Tokens |\n",
    "|-------------|------------------------------------------|----------------|\n",
    "| cohere      | embed-english-v3.0                       | N/A            |\n",
    "| openai      | text-embedding-3-small                   | 8191           |\n",
    "| openai      | text-embedding-3-large                   | 8191           |\n",
    "| openai      | text-embedding-ada-002                   | 8191           |\n",
    "| replicate   | replicate/all-mpnet-base-v2              | N/A            |\n",
    "| together    | togethercomputer/Llama-2-7B-32K-Instruct | N/A            |\n",
    "| mistralai   | mistral-embed                            | 4096           |\n",
    "\n",
    "To change the model, you simply need to copy the `slug` and access your embedding model. Now let's start using our embedding model with a single query followed by multiple queries (which is also called as a document)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[-0.02129288576543331, 0.0008162345038726926, -0.004556538071483374, 0.02918623760342598, -0.02547479420900345]\n"
     ]
    }
   ],
   "source": [
    "query = \"Hello, this is a test query\"\n",
    "query_result = embedder.embed_query(query)\n",
    "\n",
    "# Let's print the first five elements of the query embedding vector\n",
    "\n",
    "print(query_result[:5])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally let's embed a document"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[-0.0030691148713231087, -0.045334383845329285, -0.0161729846149683, 0.04348714277148247, -0.0036920777056366205]\n"
     ]
    }
   ],
   "source": [
    "documents = [\"This is document1\", \"This is document2\", \"This is document3\"]\n",
    "\n",
    "doc_result = embedder.embed_documents(documents)\n",
    "\n",
    "# Similar to previous result, let's print the first five element\n",
    "# of the first document vector\n",
    "\n",
    "print(doc_result[0][:5])"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}