mirror of https://github.com/hwchase17/langchain.git
synced 2026-01-24 05:50:18 +00:00
Update cookbook
This commit is contained in:
cookbook/nomic_embeddings.ipynb | 295 | Normal file
@@ -0,0 +1,295 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "d8da6094-30c7-43f3-a608-c91717b673db",
"metadata": {},
"source": [
"# Nomic Embeddings\n",
"\n",
"Nomic has released a new embedding model with strong performance for long context retrieval (8k context window).\n",
"\n",
"## Signup\n",
"\n",
"Get your API token, then run:\n",
"```\n",
"! nomic login\n",
"```\n",
"\n",
"Then run with your generated API token:\n",
"```\n",
"! nomic login <token>\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f737ec15-e9ab-4629-b54c-24be69e8b60b",
"metadata": {},
"outputs": [],
"source": [
"! nomic login"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "8ab7434a-2930-42b5-9164-dc2c03abe232",
"metadata": {},
"outputs": [],
"source": [
"! nomic login token"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a3501e2a-4686-4b95-8a1c-f19e035ea354",
"metadata": {},
"outputs": [],
"source": [
"! pip install -U langchain-nomic"
]
},
{
"cell_type": "markdown",
"id": "134475f2-f256-4c13-9712-c55783e6a4e2",
"metadata": {},
"source": [
"## Document Loading\n",
"\n",
"Let's test 3 interesting blog posts."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "01c4d270-171e-45c2-a1b6-e350faa74117",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.document_loaders import WebBaseLoader\n",
"\n",
"urls = [\"https://lilianweng.github.io/posts/2023-06-23-agent/\",\n",
"        \"https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/\",\n",
"        \"https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/\"]\n",
"\n",
"# Load each URL, then flatten the per-URL lists into one list of documents\n",
"docs = [WebBaseLoader(url).load() for url in urls]\n",
"docs_list = [item for sublist in docs for item in sublist]"
]
},
{
"cell_type": "markdown",
"id": "75ab7f74-873c-4d84-af5a-5cf19c61239d",
"metadata": {},
"source": [
"## Splitting\n",
"\n",
"Split into larger chunks to take advantage of the 8k context window for long context retrieval."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "f512e128-629e-4304-926f-94fe5c999527",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import CharacterTextSplitter\n",
"\n",
"text_splitter = CharacterTextSplitter.from_tiktoken_encoder(chunk_size=7500,\n",
"                                                            chunk_overlap=100)\n",
"doc_splits = text_splitter.split_documents(docs_list)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "d2a69cf0-e3ab-4c92-a1d0-10da45c08b3b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The document is 6562 tokens\n",
"The document is 3037 tokens\n",
"The document is 6092 tokens\n",
"The document is 1050 tokens\n",
"The document is 6933 tokens\n",
"The document is 5560 tokens\n"
]
}
],
"source": [
"import tiktoken\n",
"\n",
"# Count tokens per chunk with the tokenizer used by gpt-3.5-turbo (cl100k_base)\n",
"encoding = tiktoken.encoding_for_model(\"gpt-3.5-turbo\")\n",
"for d in doc_splits:\n",
"    print(\"The document is %s tokens\" % len(encoding.encode(d.page_content)))"
]
},
{
"cell_type": "markdown",
"id": "c58d1e9b-e98e-4bd9-b52f-4dfc2a4e69f4",
"metadata": {},
"source": [
"## Index\n",
"\n",
"Nomic embeddings are documented [here](https://docs.nomic.ai/reference/endpoints/nomic-embed-text)."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "76447866-bf8b-412b-93bc-d6ea8ec35952",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.runnables import RunnableLambda, RunnablePassthrough\n",
"from langchain_nomic.embeddings import NomicEmbeddings"
]
},
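{
"cell_type": "markdown",
"id": "b3f1c2d4-0a5e-4f6b-8c7d-9e0f1a2b3c4d",
"metadata": {},
"source": [
"As a quick sanity check before indexing (a minimal sketch added for illustration, not part of the original run), we can embed a single query and inspect the vector size; `embed_query` is the standard LangChain embeddings method."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c4d2e3f5-1b6f-4a7c-9d8e-0f1a2b3c4d5e",
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sanity check: embed one query and inspect the vector length\n",
"embd = NomicEmbeddings(model=\"nomic-embed-text-v1\")\n",
"query_vector = embd.embed_query(\"What is agent memory?\")\n",
"len(query_vector)"
]
},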
{
"cell_type": "code",
"execution_count": 5,
"id": "15b3eab2-2689-49d4-8cb0-67ef2adcbc49",
"metadata": {},
"outputs": [],
"source": [
"# Add to vectorDB\n",
"vectorstore = Chroma.from_documents(\n",
"    documents=doc_splits,\n",
"    collection_name=\"rag-chroma\",\n",
"    embedding=NomicEmbeddings(model=\"nomic-embed-text-v1\"),\n",
")\n",
"retriever = vectorstore.as_retriever()"
]
},
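{
"cell_type": "markdown",
"id": "d5e3f4a6-2c7d-4b8e-9f0a-1b2c3d4e5f6a",
"metadata": {},
"source": [
"Optionally (an illustrative check added here, not part of the original run), query the retriever directly to confirm that relevant chunks come back before wiring up the full chain."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e6f4a5b7-3d8e-4c9f-0a1b-2c3d4e5f6a7b",
"metadata": {},
"outputs": [],
"source": [
"# Illustrative check: see which chunks the retriever returns for a test question\n",
"retrieved_docs = retriever.get_relevant_documents(\"What are the types of agent memory?\")\n",
"for doc in retrieved_docs:\n",
"    print(doc.metadata.get(\"source\"), \"-\", len(doc.page_content), \"chars\")"
]
},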
{
"cell_type": "markdown",
"id": "41131122-3591-4566-aac1-ed19d496820a",
"metadata": {},
"source": [
"## RAG Chain\n",
"\n",
"We can use Mistral `v0.2`, which is [fine-tuned for 32k context](https://x.com/dchaplot/status/1734198245067243629?s=20).\n",
"\n",
"We can [use Ollama](https://ollama.ai/library/mistral):\n",
"```\n",
"ollama pull mistral:instruct\n",
"```\n",
"\n",
"We can also run [GPT-4 128k](https://openai.com/blog/new-models-and-developer-products-announced-at-devday). "
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "1397de64-5b4a-4001-adc5-570ff8d31ff6",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.chat_models import ChatOllama\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"# Prompt\n",
"template = \"\"\"Answer the question based only on the following context:\n",
"{context}\n",
"\n",
"Question: {question}\n",
"\"\"\"\n",
"prompt = ChatPromptTemplate.from_template(template)\n",
"\n",
"# LLM API (unused below; swap into the chain in place of model_local to use GPT-4 128k)\n",
"model = ChatOpenAI(temperature=0, model=\"gpt-4-1106-preview\")\n",
"\n",
"# Local LLM\n",
"ollama_llm = \"mistral:instruct\"\n",
"model_local = ChatOllama(model=ollama_llm)\n",
"\n",
"# Chain\n",
"chain = (\n",
"    {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
"    | prompt\n",
"    | model_local\n",
"    | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "1548e00c-1ff6-4e88-aa13-69badf2088fb",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' Agents, especially those used in artificial intelligence and natural language processing, can have different types of memory. Here are some common types:\\n\\n1. **Short-term memory** or working memory: This is a small capacity, high-turnover memory that holds information temporarily while the agent processes it. Short-term memory is essential for tasks requiring attention and quick response, such as parsing sentences or following instructions.\\n\\n2. **Long-term memory**: This is a large capacity, low-turnover memory where agents store information for extended periods. Long-term memory enables learning from experiences, accessing past knowledge, and improving performance over time.\\n\\n3. **Explicit memory** or declarative memory: Agents use explicit memory to store and recall facts, concepts, and rules that can be expressed in natural language. This type of memory is crucial for problem solving and reasoning.\\n\\n4. **Implicit memory** or procedural memory: Implicit memory refers to the acquisition and retention of skills and habits. The agent learns through repeated experiences without necessarily being aware of it.\\n\\n5. **Connectionist memory**: Connectionist memory, also known as neural networks, is inspired by the structure and function of biological brains. Connectionist models learn and store information in interconnected nodes or artificial neurons. This type of memory enables the model to recognize patterns and generalize knowledge.\\n\\n6. **Hybrid memory systems**: Many advanced agents employ a combination of different memory types to maximize their learning potential and performance. These hybrid systems can integrate short-term, long-term, explicit, implicit, and connectionist memories.'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Question\n",
"chain.invoke(\"What are the types of agent memory?\")"
]
},
{
"cell_type": "markdown",
"id": "5ec5b4c3-757d-44df-92ea-dd5f08017dd6",
"metadata": {},
"source": [
"**Mistral**\n",
"\n",
"Trace: 24k prompt tokens.\n",
"\n",
"* https://smith.langchain.com/public/3e04d475-ea08-4ee3-ae66-6416a93d8b08/r\n",
"\n",
"--- \n",
"\n",
"Some considerations are noted in the [needle in a haystack analysis](https://twitter.com/GregKamradt/status/1722386725635580292?lang=en):\n",
"\n",
"* LLMs may struggle with retrieval from large context depending on where the information is placed."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6ffb6b63-17ee-42d8-b1fb-d6a866e98458",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -1,352 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "d8da6094-30c7-43f3-a608-c91717b673db",
"metadata": {},
"source": [
"## Init\n",
"\n",
"Get your API token, then run:\n",
"```\n",
"! nomic login\n",
"```\n",
"\n",
"Then run with your generated API token:\n",
"```\n",
"! nomic login <token>\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8ab7434a-2930-42b5-9164-dc2c03abe232",
"metadata": {},
"outputs": [],
"source": [
"! nomic login\n",
"! nomic login token"
]
},
{
"cell_type": "markdown",
"id": "134475f2-f256-4c13-9712-c55783e6a4e2",
"metadata": {},
"source": [
"## Document Loading\n",
"\n",
"Let's test 3 interesting blog posts."
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "01c4d270-171e-45c2-a1b6-e350faa74117",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.document_loaders import WebBaseLoader\n",
"\n",
"urls = [\"https://lilianweng.github.io/posts/2023-06-23-agent/\",\n",
"        \"https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/\",\n",
"        \"https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/\"]\n",
"\n",
"docs = [WebBaseLoader(url).load() for url in urls]\n",
"docs_list = [item for sublist in docs for item in sublist]"
]
},
{
"cell_type": "markdown",
"id": "75ab7f74-873c-4d84-af5a-5cf19c61239d",
"metadata": {},
"source": [
"## Splitting\n",
"\n",
"### Larger Context Models\n",
"\n",
"There are a lot of interesting considerations around [text splitting](https://www.youtube.com/watch?v=8OJC21T2SL4). \n",
"\n",
"Many approaches to date have focused on very granular splitting by semantic groups or sub-sections, which is a challenge.\n",
"\n",
"The intuition was: retrieve just the minimal context needed to address the question, driven by:\n",
"\n",
"(1) Embedding models with smaller context size\n",
"\n",
"(2) LLMs with smaller context size\n",
"\n",
"This means we need high `precision` in retrieval: \n",
"\n",
"> We reject as many irrelevant chunks (false positives) as possible.\n",
"\n",
"Thus, all chunks we send to the model are relevant, but:\n",
"\n",
"(1) We can suffer lower `recall` (leave out important details) \n",
"\n",
"(2) We incur higher splitting complexity\n",
"\n",
"--- \n",
"\n",
"Embedding models are starting to support larger context, as discussed [here](https://hazyresearch.stanford.edu/blog/2024-01-11-m2-bert-retrieval).\n",
"\n",
"Nomic's release supports a > 8k token limit locally (GPU today, CPU soon) and via API (soon).\n",
"\n",
"And LLMs are seeing context window expansion, as seen with [GPT-4 128k](https://openai.com/blog/new-models-and-developer-products-announced-at-devday) or Yarn LLaMA2 [here](https://x.com/mattshumer_/status/1720115354884514042?s=20) and [here](https://ollama.ai/library/yarn-mistral). \n",
"\n",
"Here, we can try a workflow that is less concerned with `precision`:\n",
"\n",
"(1) We use larger context chunks and embeddings to promote `recall` \n",
"\n",
"(2) We use larger context LLMs that can \"sift\" through less relevant information to get our answer\n",
"\n",
"Let's pick a few interesting blog posts and see how long each document is using [tiktoken](https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb)."
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "f512e128-629e-4304-926f-94fe5c999527",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import CharacterTextSplitter\n",
"\n",
"text_splitter = CharacterTextSplitter.from_tiktoken_encoder(chunk_size=10000,\n",
"                                                            chunk_overlap=100)\n",
"doc_splits = text_splitter.split_documents(docs_list)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "d2a69cf0-e3ab-4c92-a1d0-10da45c08b3b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The document is 8759 tokens\n",
"The document is 811 tokens\n",
"The document is 7083 tokens\n",
"The document is 9029 tokens\n",
"The document is 3488 tokens\n"
]
}
],
"source": [
"import tiktoken\n",
"\n",
"# Count tokens per chunk with the tokenizer used by gpt-3.5-turbo (cl100k_base)\n",
"encoding = tiktoken.encoding_for_model(\"gpt-3.5-turbo\")\n",
"for d in doc_splits:\n",
"    print(\"The document is %s tokens\" % len(encoding.encode(d.page_content)))"
]
},
{
"cell_type": "markdown",
"id": "c58d1e9b-e98e-4bd9-b52f-4dfc2a4e69f4",
"metadata": {},
"source": [
"## Index\n",
"\n",
"Nomic embeddings are documented [here](https://docs.nomic.ai/reference/endpoints/nomic-embed-text)."
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "76447866-bf8b-412b-93bc-d6ea8ec35952",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.runnables import RunnableLambda, RunnablePassthrough\n",
"from langchain_nomic.embeddings import NomicEmbeddings"
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "15b3eab2-2689-49d4-8cb0-67ef2adcbc49",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Authenticate with the Nomic API </span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1m \u001b[0m\u001b[1mAuthenticate with the Nomic API\u001b[0m\u001b[1m \u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> </span><span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://atlas.nomic.ai/cli-login</span><span style=\"font-weight: bold\"> </span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1m \u001b[0m\u001b[4;94mhttps://atlas.nomic.ai/cli-login\u001b[0m\u001b[1m \u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Click the above link to retrieve your access token and then run `nomic login [token]` </span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1m \u001b[0m\u001b[1mClick the above link to retrieve your access token and then run `nomic login \u001b[0m\u001b[1m[\u001b[0m\u001b[1mtoken\u001b[0m\u001b[1m]\u001b[0m\u001b[1m`\u001b[0m\u001b[1m \u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"ename": "NameError",
"evalue": "name 'exit' is not defined",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[42], line 7\u001b[0m\n\u001b[1;32m 2\u001b[0m api_key \u001b[38;5;241m=\u001b[39m os\u001b[38;5;241m.\u001b[39mgetenv(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mNOMIC_API_KEY\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 3\u001b[0m \u001b[38;5;66;03m# api_key = \"eTiGYQ2ep1EMxFZlyTopWHRpk7JqjSU89FdTuLvbD132c\"\u001b[39;00m\n\u001b[1;32m 4\u001b[0m vectorstore \u001b[38;5;241m=\u001b[39m Chroma\u001b[38;5;241m.\u001b[39mfrom_documents(\n\u001b[1;32m 5\u001b[0m documents\u001b[38;5;241m=\u001b[39mtexts,\n\u001b[1;32m 6\u001b[0m collection_name\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mrag-chroma\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[0;32m----> 7\u001b[0m embedding\u001b[38;5;241m=\u001b[39m\u001b[43mNomicEmbeddings\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmodel\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[38;5;124;43mnomic-embed-text-v1\u001b[39;49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 8\u001b[0m \u001b[43m \u001b[49m\u001b[43mnomic_api_key\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mapi_key\u001b[49m\u001b[43m)\u001b[49m,\n\u001b[1;32m 9\u001b[0m )\n\u001b[1;32m 10\u001b[0m retriever \u001b[38;5;241m=\u001b[39m vectorstore\u001b[38;5;241m.\u001b[39mas_retriever()\n",
"File \u001b[0;32m~/Desktop/Code/langchain-main/langchain/libs/partners/nomic/langchain_nomic/embeddings.py:27\u001b[0m, in \u001b[0;36mNomicEmbeddings.__init__\u001b[0;34m(self, model, nomic_api_key)\u001b[0m\n\u001b[1;32m 21\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"Initialize NomicEmbeddings model.\u001b[39;00m\n\u001b[1;32m 22\u001b[0m \n\u001b[1;32m 23\u001b[0m \u001b[38;5;124;03mArgs:\u001b[39;00m\n\u001b[1;32m 24\u001b[0m \u001b[38;5;124;03m model: model name\u001b[39;00m\n\u001b[1;32m 25\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 26\u001b[0m _api_key \u001b[38;5;241m=\u001b[39m nomic_api_key \u001b[38;5;129;01mor\u001b[39;00m os\u001b[38;5;241m.\u001b[39menviron\u001b[38;5;241m.\u001b[39mget(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mNOMIC_API_KEY\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m---> 27\u001b[0m \u001b[43mnomic\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mlogin\u001b[49m\u001b[43m(\u001b[49m\u001b[43m_api_key\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 28\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mmodel \u001b[38;5;241m=\u001b[39m model\n",
"File \u001b[0;32m~/miniforge3/envs/llama2/lib/python3.9/site-packages/nomic/cli.py:60\u001b[0m, in \u001b[0;36mlogin\u001b[0;34m(token, tenant, domain)\u001b[0m\n\u001b[1;32m 54\u001b[0m console\u001b[38;5;241m.\u001b[39mprint(auth0_auth_endpoint, style\u001b[38;5;241m=\u001b[39mstyle, justify\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mcenter\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 55\u001b[0m console\u001b[38;5;241m.\u001b[39mprint(\n\u001b[1;32m 56\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mClick the above link to retrieve your access token and then run `nomic login \u001b[39m\u001b[38;5;124m\\\u001b[39m\u001b[38;5;124m[token]`\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[1;32m 57\u001b[0m style\u001b[38;5;241m=\u001b[39mstyle,\n\u001b[1;32m 58\u001b[0m justify\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mcenter\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[1;32m 59\u001b[0m )\n\u001b[0;32m---> 60\u001b[0m \u001b[43mexit\u001b[49m()\n\u001b[1;32m 62\u001b[0m \u001b[38;5;66;03m# save credential\u001b[39;00m\n\u001b[1;32m 63\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m nomic_base_path\u001b[38;5;241m.\u001b[39mexists():\n",
"\u001b[0;31mNameError\u001b[0m: name 'exit' is not defined"
]
}
],
"source": [
"# Add to vectorDB\n",
"api_key = os.getenv(\"NOMIC_API_KEY\")\n",
"# api_key = \"xxx2\"\n",
"vectorstore = Chroma.from_documents(\n",
"    documents=texts,\n",
"    collection_name=\"rag-chroma\",\n",
"    embedding=NomicEmbeddings(model='nomic-embed-text-v1',\n",
"                              nomic_api_key=api_key),  # TO FIX\n",
")\n",
"retriever = vectorstore.as_retriever()"
]
},
{
"cell_type": "markdown",
"id": "41131122-3591-4566-aac1-ed19d496820a",
"metadata": {},
"source": [
"## RAG Chain\n",
"\n",
"To test locally, we can use Ollama ([here](https://x.com/mattshumer_/status/1720115354884514042?s=20) and [here](https://ollama.ai/library/yarn-mistral)):\n",
"```\n",
"ollama pull yarn-mistral\n",
"```\n",
"\n",
"Of course, we can also run [GPT-4 128k](https://openai.com/blog/new-models-and-developer-products-announced-at-devday). "
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "1397de64-5b4a-4001-adc5-570ff8d31ff6",
"metadata": {},
"outputs": [],
"source": [
"from langchain_openai import ChatOpenAI\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_community.chat_models import ChatOllama\n",
"\n",
"# Prompt\n",
"template = \"\"\"Answer the question based only on the following context:\n",
"{context}\n",
"\n",
"Question: {question}\n",
"\"\"\"\n",
"prompt = ChatPromptTemplate.from_template(template)\n",
"\n",
"# Local LLM\n",
"ollama_llm = \"yarn-mistral\"\n",
"model = ChatOllama(model=ollama_llm)\n",
"\n",
"# LLM API (overrides the local model above; comment out to use Ollama)\n",
"model = ChatOpenAI(temperature=0, model=\"gpt-4-1106-preview\")\n",
"\n",
"# Chain\n",
"chain = (\n",
"    {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
"    | prompt\n",
"    | model\n",
"    | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 34,
"id": "1548e00c-1ff6-4e88-aa13-69badf2088fb",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'In the context provided, the types of agent memory mentioned are:\\n\\n1. Sensory Memory: This is the earliest stage of memory, providing the ability to retain impressions of sensory information after the original stimuli have ended. It typically only lasts for a few seconds.\\n\\n2. Short-Term Memory (STM) or Working Memory: It stores information that is currently being used to carry out complex cognitive tasks such as learning and reasoning. Short-term memory has a limited capacity and duration.\\n\\n3. Long-Term Memory (LTM): This type of memory can store information for a remarkably long time, ranging from a few days to decades, with an essentially unlimited storage capacity. It is divided into two subtypes:\\n - Explicit / Declarative Memory: Memory of facts and events that can be consciously recalled, including episodic memory (events and experiences) and semantic memory (facts and concepts).\\n - Implicit / Procedural Memory: Unconscious memory involving skills and routines performed automatically, like riding a bike or typing on a keyboard.'"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Question\n",
"chain.invoke(\"What are the types of agent memory?\")"
]
},
{
"cell_type": "markdown",
"id": "81616653-be22-445a-9d90-34e0c2e788bc",
"metadata": {},
"source": [
"Some considerations are noted in the [needle in a haystack analysis](https://twitter.com/GregKamradt/status/1722386725635580292?lang=en):\n",
"\n",
"* LLMs may struggle with retrieval from large context depending on where the information is placed."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d067e4ea-fd84-40db-8ecb-1aac94f55417",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
}
},
"nbformat": 4,
"nbformat_minor": 5
}