Add prompt hub support for Mistral w/ Ollama (#11315)

Add Mistral example with prompt support
2025-08-14 15:16:21 +00:00 · 2023-10-03 08:17:46 -07:00 · 2023-10-03 08:17:46 -07:00 · b3c83fdd33
commit b3c83fdd33
parent 2343302fc6
1 changed files with 134 additions and 80 deletions
--- a/docs/extras/integrations/llms/ollama.ipynb
+++ b/docs/extras/integrations/llms/ollama.ipynb
@ -51,15 +51,14 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 38,
+   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.llms import Ollama\n",
    "from langchain.callbacks.manager import CallbackManager\n",
    "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler                                  \n",
-    "llm = Ollama(base_url=\"http://localhost:11434\", \n",
-    "             model=\"llama2\", \n",
+    "llm = Ollama(model=\"llama2\", \n",
    "             callback_manager = CallbackManager([StreamingStdOutCallbackHandler()]))"
   ]
  },
@ -72,36 +71,9 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 40,
+   "execution_count": null,
   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "\n",
-      "Great! The history of Artificial Intelligence (AI) is a fascinating and complex topic that spans several decades. Here's a brief overview:\n",
-      "\n",
-      "1. Early Years (1950s-1960s): The term \"Artificial Intelligence\" was coined in 1956 by computer scientist John McCarthy. However, the concept of AI dates back to ancient Greece, where mythical creatures like Talos and Hephaestus were created to perform tasks without any human intervention. In the 1950s and 1960s, researchers began exploring ways to replicate human intelligence using computers, leading to the development of simple AI programs like ELIZA (1966) and PARRY (1972).\n",
-      "2. Rule-Based Systems (1970s-1980s): As computing power increased, researchers developed rule-based systems, such as Mycin (1976), which could diagnose medical conditions based on a set of rules. This period also saw the rise of expert systems, like EDICT (1985), which mimicked human experts in specific domains.\n",
-      "3. Machine Learning (1990s-2000s): With the advent of big data and machine learning algorithms, AI evolved to include neural networks, decision trees, and other techniques for training models on large datasets. This led to the development of applications like speech recognition (e.g., Siri, Alexa), image recognition (e.g., Google Image Search), and natural language processing (e.g., chatbots).\n",
-      "4. Deep Learning (2010s-present): The rise of deep learning techniques, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), has enabled AI to perform complex tasks like image and speech recognition, natural language processing, and even autonomous driving. Companies like Google, Facebook, and Baidu have invested heavily in deep learning research, leading to breakthroughs in areas like facial recognition, object detection, and machine translation.\n",
-      "5. Current Trends (present-future): AI is currently being applied to various industries, including healthcare, finance, education, and entertainment. With the growth of cloud computing, edge AI, and autonomous systems, we can expect to see more sophisticated AI applications in the near future. However, there are also concerns about the ethical implications of AI, such as data privacy, algorithmic bias, and job displacement.\n",
-      "\n",
-      "Remember, AI has a long history, and its development is an ongoing process. As technology advances, we can expect to see even more innovative applications of AI in various fields."
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "'\\nGreat! The history of Artificial Intelligence (AI) is a fascinating and complex topic that spans several decades. Here\\'s a brief overview:\\n\\n1. Early Years (1950s-1960s): The term \"Artificial Intelligence\" was coined in 1956 by computer scientist John McCarthy. However, the concept of AI dates back to ancient Greece, where mythical creatures like Talos and Hephaestus were created to perform tasks without any human intervention. In the 1950s and 1960s, researchers began exploring ways to replicate human intelligence using computers, leading to the development of simple AI programs like ELIZA (1966) and PARRY (1972).\\n2. Rule-Based Systems (1970s-1980s): As computing power increased, researchers developed rule-based systems, such as Mycin (1976), which could diagnose medical conditions based on a set of rules. This period also saw the rise of expert systems, like EDICT (1985), which mimicked human experts in specific domains.\\n3. Machine Learning (1990s-2000s): With the advent of big data and machine learning algorithms, AI evolved to include neural networks, decision trees, and other techniques for training models on large datasets. This led to the development of applications like speech recognition (e.g., Siri, Alexa), image recognition (e.g., Google Image Search), and natural language processing (e.g., chatbots).\\n4. Deep Learning (2010s-present): The rise of deep learning techniques, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), has enabled AI to perform complex tasks like image and speech recognition, natural language processing, and even autonomous driving. Companies like Google, Facebook, and Baidu have invested heavily in deep learning research, leading to breakthroughs in areas like facial recognition, object detection, and machine translation.\\n5. Current Trends (present-future): AI is currently being applied to various industries, including healthcare, finance, education, and entertainment. With the growth of cloud computing, edge AI, and autonomous systems, we can expect to see more sophisticated AI applications in the near future. However, there are also concerns about the ethical implications of AI, such as data privacy, algorithmic bias, and job displacement.\\n\\nRemember, AI has a long history, and its development is an ongoing process. As technology advances, we can expect to see even more innovative applications of AI in various fields.'"
-      ]
-     },
-     "execution_count": 40,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
+   "outputs": [],
   "source": [
    "llm(\"Tell me about the history of AI\")"
   ]
@ -121,7 +93,6 @@
   "source": [
    "from langchain.embeddings import OllamaEmbeddings\n",
    "oembed = OllamaEmbeddings(base_url=\"http://localhost:11434\", model=\"llama2\")\n",
-    "\n",
    "oembed.embed_query(\"Llamas are social animals and live with others as a herd.\")"
   ]
  },
@ -153,34 +124,60 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 60,
+   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
+    "# Load web page\n",
    "from langchain.document_loaders import WebBaseLoader\n",
    "loader = WebBaseLoader(\"https://lilianweng.github.io/posts/2023-06-23-agent/\")\n",
-    "data = loader.load()\n",
-    "\n",
+    "data = loader.load()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Split into chunks \n",
    "from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
-    "text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)\n",
+    "text_splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=100)\n",
    "all_splits = text_splitter.split_documents(data)"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 6,
   "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Found model file at  /Users/rlm/.cache/gpt4all/ggml-all-MiniLM-L6-v2-f16.bin\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "objc[77472]: Class GGMLMetalClass is implemented in both /Users/rlm/miniforge3/envs/llama2/lib/python3.9/site-packages/gpt4all/llmodel_DO_NOT_MODIFY/build/libreplit-mainline-metal.dylib (0x17f754208) and /Users/rlm/miniforge3/envs/llama2/lib/python3.9/site-packages/gpt4all/llmodel_DO_NOT_MODIFY/build/libllamamodel-mainline-metal.dylib (0x17fb80208). One of the two will be used. Which one is undefined.\n"
+     ]
+    }
+   ],
   "source": [
+    "# Embed and store\n",
    "from langchain.vectorstores import Chroma\n",
-    "from langchain.embeddings import OllamaEmbeddings\n",
-    "\n",
-    "vectorstore = Chroma.from_documents(documents=all_splits, embedding=OllamaEmbeddings())"
+    "from langchain.embeddings import GPT4AllEmbeddings\n",
+    "from langchain.embeddings import OllamaEmbeddings # We can also try Ollama embeddings\n",
+    "vectorstore = Chroma.from_documents(documents=all_splits,\n",
+    "                                    embedding=GPT4AllEmbeddings())"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 62,
+   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
@ -189,41 +186,32 @@
       "4"
      ]
     },
-     "execution_count": 62,
+     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "question = \"What are the approaches to Task Decomposition?\"\n",
+    "# Retrieve\n",
+    "question = \"How can Task Decomposition be done?\"\n",
    "docs = vectorstore.similarity_search(question)\n",
    "len(docs)"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain.prompts import PromptTemplate\n",
-    "\n",
-    "# Prompt\n",
-    "template = \"\"\"Use the following pieces of context to answer the question at the end. \n",
-    "If you don't know the answer, just say that you don't know, don't try to make up an answer. \n",
-    "Use three sentences maximum and keep the answer as concise as possible. \n",
-    "{context}\n",
-    "Question: {question}\n",
-    "Helpful Answer:\"\"\"\n",
-    "QA_CHAIN_PROMPT = PromptTemplate(\n",
-    "    input_variables=[\"context\", \"question\"],\n",
-    "    template=template,\n",
-    ")\n"
+    "# RAG prompt\n",
+    "from langchain import hub\n",
+    "QA_CHAIN_PROMPT = hub.pull(\"rlm/rag-prompt-llama\")"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 69,
+   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
@ -231,15 +219,14 @@
    "from langchain.llms import Ollama\n",
    "from langchain.callbacks.manager import CallbackManager\n",
    "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
-    "llm = Ollama(base_url=\"http://localhost:11434\",\n",
-    "             model=\"llama2\",\n",
+    "llm = Ollama(model=\"llama2\",\n",
    "             verbose=True,\n",
    "             callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]))"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 66,
+   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
@ -254,18 +241,21 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 70,
+   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      "Task decomposition can be approached in different ways for AI agents, including:\n",
+      " There are several approaches to task decomposition for AI agents, including:\n",
      "\n",
-      "1. Using simple prompts like \"Steps for XYZ.\" or \"What are the subgoals for achieving XYZ?\" to guide the LLM.\n",
-      "2. Providing task-specific instructions, such as \"Write a story outline\" for writing a novel.\n",
-      "3. Utilizing human inputs to help the AI agent understand the task and break it down into smaller steps."
+      "1. Chain of thought (CoT): This involves instructing the model to \"think step by step\" and use more test-time computation to decompose hard tasks into smaller and simpler steps.\n",
+      "2. Tree of thoughts (ToT): This extends CoT by exploring multiple reasoning possibilities at each step, creating a tree structure. The search process can be BFS or DFS with each state evaluated by a classifier or majority vote.\n",
+      "3. Using task-specific instructions: For example, \"Write a story outline.\" for writing a novel.\n",
+      "4. Human inputs: The agent can receive input from a human operator to perform tasks that require creativity and domain expertise.\n",
+      "\n",
+      "These approaches allow the agent to break down complex tasks into manageable subgoals, enabling efficient handling of tasks and improving the quality of final results through self-reflection and refinement."
     ]
    }
   ],
@ -283,17 +273,9 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 56,
+   "execution_count": null,
   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Task decomposition can be approached in three ways: (1) using simple prompting like \"Steps for XYZ.\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions, or (3) with human inputs.{'model': 'llama2', 'created_at': '2023-08-08T04:01:09.005367Z', 'done': True, 'context': [1, 29871, 1, 13, 9314, 14816, 29903, 6778, 13, 13, 3492, 526, 263, 8444, 29892, 3390, 1319, 322, 15993, 20255, 29889, 29849, 1234, 408, 1371, 3730, 408, 1950, 29892, 1550, 1641, 9109, 29889, 3575, 6089, 881, 451, 3160, 738, 10311, 1319, 29892, 443, 621, 936, 29892, 11021, 391, 29892, 7916, 391, 29892, 304, 27375, 29892, 18215, 29892, 470, 27302, 2793, 29889, 3529, 9801, 393, 596, 20890, 526, 5374, 635, 443, 5365, 1463, 322, 6374, 297, 5469, 29889, 13, 13, 3644, 263, 1139, 947, 451, 1207, 738, 4060, 29892, 470, 338, 451, 2114, 1474, 16165, 261, 296, 29892, 5649, 2020, 2012, 310, 22862, 1554, 451, 1959, 29889, 960, 366, 1016, 29915, 29873, 1073, 278, 1234, 304, 263, 1139, 29892, 3113, 1016, 29915, 29873, 6232, 2089, 2472, 29889, 13, 13, 29966, 829, 14816, 29903, 6778, 13, 13, 29961, 25580, 29962, 4803, 278, 1494, 12785, 310, 3030, 304, 1234, 278, 1139, 472, 278, 1095, 29889, 29871, 13, 3644, 366, 1016, 29915, 29873, 1073, 278, 1234, 29892, 925, 1827, 393, 366, 1016, 29915, 29873, 1073, 29892, 1016, 29915, 29873, 1018, 304, 1207, 701, 385, 1234, 29889, 29871, 13, 11403, 2211, 25260, 7472, 322, 3013, 278, 1234, 408, 3022, 895, 408, 1950, 29889, 29871, 13, 5398, 26227, 508, 367, 2309, 313, 29896, 29897, 491, 365, 26369, 411, 2560, 9508, 292, 763, 376, 7789, 567, 363, 1060, 29979, 29999, 7790, 29876, 29896, 19602, 376, 5618, 526, 278, 1014, 1484, 1338, 363, 3657, 15387, 1060, 29979, 29999, 29973, 613, 313, 29906, 29897, 491, 773, 3414, 29899, 14940, 11994, 29936, 321, 29889, 29887, 29889, 376, 6113, 263, 5828, 27887, 1213, 363, 5007, 263, 9554, 29892, 470, 313, 29941, 29897, 411, 5199, 10970, 29889, 13, 13, 5398, 26227, 508, 367, 2309, 313, 29896, 29897, 491, 365, 26369, 411, 2560, 9508, 292, 763, 376, 7789, 567, 363, 1060, 29979, 29999, 7790, 29876, 29896, 19602, 376, 5618, 526, 278, 1014, 1484, 1338, 363, 3657, 15387, 1060, 29979, 29999, 29973, 613, 313, 29906, 29897, 491, 773, 3414, 29899, 14940, 11994, 29936, 321, 29889, 29887, 29889, 376, 6113, 263, 5828, 27887, 1213, 363, 5007, 263, 9554, 29892, 470, 313, 29941, 29897, 411, 5199, 10970, 29889, 13, 13, 5398, 26227, 508, 367, 2309, 313, 29896, 29897, 491, 365, 26369, 411, 2560, 9508, 292, 763, 376, 7789, 567, 363, 1060, 29979, 29999, 7790, 29876, 29896, 19602, 376, 5618, 526, 278, 1014, 1484, 1338, 363, 3657, 15387, 1060, 29979, 29999, 29973, 613, 313, 29906, 29897, 491, 773, 3414, 29899, 14940, 11994, 29936, 321, 29889, 29887, 29889, 376, 6113, 263, 5828, 27887, 1213, 363, 5007, 263, 9554, 29892, 470, 313, 29941, 29897, 411, 5199, 10970, 29889, 13, 13, 1451, 16047, 267, 297, 1472, 29899, 8489, 18987, 322, 3414, 26227, 29901, 1858, 9450, 975, 263, 3309, 29891, 4955, 322, 17583, 3902, 8253, 278, 1650, 2913, 3933, 18066, 292, 29889, 365, 26369, 29879, 21117, 304, 10365, 13900, 746, 20050, 411, 15668, 4436, 29892, 3907, 963, 3109, 16424, 9401, 304, 25618, 1058, 5110, 515, 14260, 322, 1059, 29889, 13, 16492, 29901, 1724, 526, 278, 13501, 304, 9330, 897, 510, 3283, 29973, 13, 29648, 1319, 673, 29901, 518, 29914, 25580, 29962, 13, 5398, 26227, 508, 367, 26733, 297, 2211, 5837, 29901, 313, 29896, 29897, 773, 2560, 9508, 292, 763, 376, 7789, 567, 363, 1060, 29979, 29999, 7790, 29876, 29896, 19602, 376, 5618, 526, 278, 1014, 1484, 1338, 363, 3657, 15387, 1060, 29979, 29999, 29973, 613, 313, 29906, 29897, 491, 773, 3414, 29899, 14940, 11994, 29892, 470, 313, 29941, 29897, 411, 5199, 10970, 29889, 2], 'total_duration': 1364428708, 'load_duration': 1246375, 'sample_count': 62, 'sample_duration': 44859000, 'prompt_eval_count': 1, 'eval_count': 62, 'eval_duration': 1313002000}\n"
-     ]
-    }
-   ],
+   "outputs": [],
   "source": [
    "from langchain.schema import LLMResult\n",
    "from langchain.callbacks.base import BaseCallbackHandler\n",
@ -345,6 +327,78 @@
   "source": [
    "62 / (1313002000/1000/1000/1000)"
   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Using the Hub for prompt management\n",
+    " \n",
+    "Open source models often benefit from specific prompts. \n",
+    "\n",
+    "For example, [Mistral 7b](https://mistral.ai/news/announcing-mistral-7b/) was fine-tuned for chat using the prompt format shown [here](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1).\n",
+    "\n",
+    "Get the model: `ollama pull mistral:7b-instruct`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# LLM\n",
+    "from langchain.llms import Ollama\n",
+    "from langchain.callbacks.manager import CallbackManager\n",
+    "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
+    "llm = Ollama(model=\"mistral:7b-instruct\",\n",
+    "             verbose=True,\n",
+    "             callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain import hub\n",
+    "QA_CHAIN_PROMPT = hub.pull(\"rlm/rag-prompt-mistral\")\n",
+    "\n",
+    "# QA chain\n",
+    "from langchain.chains import RetrievalQA\n",
+    "qa_chain = RetrievalQA.from_chain_type(\n",
+    "    llm,\n",
+    "    retriever=vectorstore.as_retriever(),\n",
+    "    chain_type_kwargs={\"prompt\": QA_CHAIN_PROMPT},\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "There are different approaches to Task Decomposition for AI Agents such as Chain of thought (CoT) and Tree of Thoughts (ToT). CoT breaks down big tasks into multiple manageable tasks and generates multiple thoughts per step, while ToT explores multiple reasoning possibilities at each step. Task decomposition can be done by LLM with simple prompting or using task-specific instructions or human inputs."
+     ]
+    }
+   ],
+   "source": [
+    "question = \"What are the various approaches to Task Decomposition for AI Agents?\"\n",
+    "result = qa_chain({\"query\": question})"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
  }
 ],
 "metadata": {
@ -363,9 +417,9 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.11.5"
+   "version": "3.9.16"
  }
 },
 "nbformat": 4,
- "nbformat_minor": 2
+ "nbformat_minor": 4
 }