From 1ee8cf7b203655215c1c6c57942a282a8c6441de Mon Sep 17 00:00:00 2001 From: Eugene Yurtsev Date: Thu, 4 Apr 2024 22:36:03 -0400 Subject: [PATCH] Docs: Update custom chat model (#19967) * Clean up in the existing tutorial * Add model_name to identifying params * Add table to summarize messages --- .../model_io/chat/custom_chat_model.ipynb | 320 +++++++----------- 1 file changed, 121 insertions(+), 199 deletions(-) diff --git a/docs/docs/modules/model_io/chat/custom_chat_model.ipynb b/docs/docs/modules/model_io/chat/custom_chat_model.ipynb index b91ca4cfd43..b410f837293 100644 --- a/docs/docs/modules/model_io/chat/custom_chat_model.ipynb +++ b/docs/docs/modules/model_io/chat/custom_chat_model.ipynb @@ -1,7 +1,6 @@ { "cells": [ { - "attachments": {}, "cell_type": "markdown", "id": "e3da9a3f-f583-4ba6-994e-0e8c1158f5eb", "metadata": {}, @@ -10,13 +9,13 @@ "\n", "In this guide, we'll learn how to create a custom chat model using LangChain abstractions.\n", "\n", - "Wrapping your LLM with the standard `ChatModel` interface allow you to use your LLM in existing LangChain programs with minimal code modifications!\n", + "Wrapping your LLM with the standard `BaseChatModel` interface allows you to use your LLM in existing LangChain programs with minimal code modifications!\n", "\n", "As a bonus, your LLM will automatically become a LangChain `Runnable` and will benefit from some optimizations out of the box (e.g., batch via a threadpool), async support, the `astream_events` API, etc.\n", "\n", "## Inputs and outputs\n", "\n", - "First, we need to talk about messages which are the inputs and outputs of chat models.\n", + "First, we need to talk about **messages**, which are the inputs and outputs of chat models.\n", "\n", "### Messages\n", "\n", @@ -24,13 +23,17 @@ "\n", "LangChain has a few built-in message types:\n", "\n", - "- `SystemMessage`: Used for priming AI behavior, usually passed in as the first of a sequence of input messages.\n", - "- `HumanMessage`: Represents a message from a person interacting with the chat model.\n", - "- `AIMessage`: Represents a message from the chat model. This can be either text or a request to invoke a tool.\n", - "- `FunctionMessage` / `ToolMessage`: Message for passing the results of tool invocation back to the model.\n", + "| Message Type | Description |\n", + "|-----------------------|-------------------------------------------------------------------------------------------------|\n", + "| `SystemMessage` | Used for priming AI behavior, usually passed in as the first of a sequence of input messages. |\n", + "| `HumanMessage` | Represents a message from a person interacting with the chat model. |\n", + "| `AIMessage` | Represents a message from the chat model. This can be either text or a request to invoke a tool.|\n", + "| `FunctionMessage` / `ToolMessage` | Message for passing the results of tool invocation back to the model. |\n", + "| `AIMessageChunk` / `HumanMessageChunk` / ... | Chunk variant of each type of message. 
|\n", + "\n", "\n", "::: {.callout-note}\n", - "`ToolMessage` and `FunctionMessage` closely follow OpenAIs `function` and `tool` arguments.\n", + "`ToolMessage` and `FunctionMessage` closely follow OpenAI's `function` and `tool` roles.\n", "\n", "This is a rapidly developing field and as more models add function calling capabilities, expect that there will be additions to this schema.\n", ":::" ] }, { "cell_type": "code", "execution_count": 1, "id": "c5046e6a-8b09-4a99-b6e6-7a605aac5738", - "metadata": {}, + "metadata": { + "tags": [] + }, "outputs": [], "source": [ "from langchain_core.messages import (\n", @@ -67,7 +72,9 @@ "cell_type": "code", "execution_count": 2, "id": "d4656e9d-bfa1-4703-8f79-762fe6421294", - "metadata": {}, + "metadata": { + "tags": [] + }, "outputs": [], "source": [ "from langchain_core.messages import (\n", @@ -91,7 +98,9 @@ "cell_type": "code", "execution_count": 3, "id": "9c15c299-6f8a-49cf-a072-09924fd44396", - "metadata": {}, + "metadata": { + "tags": [] + }, "outputs": [ { "data": { @@ -108,32 +117,6 @@ "AIMessageChunk(content=\"Hello\") + AIMessageChunk(content=\" World!\")" ] }, - { - "cell_type": "markdown", - "id": "8e952d64-6d38-4a2b-b996-8812c204a12c", - "metadata": {}, - "source": [ - "## Simple Chat Model\n", - "\n", - "Inherting from `SimpleChatModel` is great for prototyping!\n", - "\n", - "It won't allow you to implement all features that you might want out of a chat model, but it's quick to implement, and if you need more you can transition to `BaseChatModel` shown below.\n", - "\n", - "Let's implement a chat model that echoes back the last `n` characters of the prompt!\n", - "\n", - "You need to implement the following:\n", - "\n", - "* The method `_call` - Use to generate a chat result from a prompt.\n", - "\n", - "In addition, you have the option to specify the following:\n", - "\n", - "* The property `_identifying_params` - Represent model parameterization for logging purposes.\n", - "\n", - "Optional:\n", - "\n", - "* `_stream` - Use to implement streaming.\n" - ] - }, { "cell_type": "markdown", "id": "bbfebea1", "metadata": {}, "source": [ "## Base Chat Model\n", "\n", "Let's implement a chat model that echoes back the first `n` characters of the last message in the prompt!\n", "\n", - "To do so, we will inherit from `BaseChatModel` and we'll need to implement the following methods/properties:\n", + "To do so, we will inherit from `BaseChatModel` and we'll need to implement the following:\n", "\n", - "In addition, you have the option to specify the following:\n", - "\n", - "To do so inherit from `BaseChatModel` which is a lower level class and implement the methods:\n", - "\n", - "* `_generate` - Use to generate a chat result from a prompt\n", - "* The property `_llm_type` - Used to uniquely identify the type of the model. Used for logging.\n", - "\n", - "Optional:\n", - "\n", - "* `_stream` - Use to implement streaming.\n", - "* `_agenerate` - Use to implement a native async method.\n", - "* `_astream` - Use to implement async version of `_stream`.\n", - "* The property `_identifying_params` - Represent model parameterization for logging purposes.\n", + "| Method/Property | Description | Required/Optional |\n", + "|------------------------------------|-------------------------------------------------------------------|--------------------|\n", + "| `_generate` | Used to generate a chat result from a prompt | Required |\n", + "| `_llm_type` (property) | Used to uniquely identify the type of the model. 
Used for logging.| Required |\n", + "| `_identifying_params` (property) | Represents model parameterization for tracing purposes. | Optional |\n", + "| `_stream` | Used to implement streaming. | Optional |\n", + "| `_agenerate` | Used to implement a native async method. | Optional |\n", + "| `_astream` | Used to implement the async version of `_stream`. | Optional |\n", "\n", "\n", - ":::{.callout-caution}\n", + ":::{.callout-tip}\n", + "The `_astream` implementation uses `run_in_executor` to launch the sync `_stream` in a separate thread if `_stream` is implemented, otherwise it falls back to using `_agenerate`.\n", "\n", - "Currently, to get async streaming to work (via `astream`), you must provide an implementation of `_astream`.\n", - "\n", - "By default if `_astream` is not provided, then async streaming falls back on `_agenerate` which does not support\n", - "token by token streaming.\n", + "You can use this trick if you want to reuse the `_stream` implementation, but if you're able to implement code that's natively async, that's a better solution since that code will run with less overhead.\n", ":::" ] }, @@ -181,7 +157,9 @@ "cell_type": "code", "execution_count": 4, "id": "25ba32e5-5a6d-49f4-bb68-911827b84d61", - "metadata": {}, + "metadata": { + "tags": [] + }, "outputs": [], "source": [ "from typing import Any, AsyncIterator, Dict, Iterator, List, Optional\n", @@ -214,6 +192,8 @@ " [HumanMessage(content=\"world\")]])\n", " \"\"\"\n", "\n", + " model_name: str\n", + " \"\"\"The name of the model\"\"\"\n", " n: int\n", " \"\"\"The number of characters from the last message of the prompt to be echoed.\"\"\"\n", "\n", @@ -239,9 +219,19 @@ " downstream and understand why generation stopped.\n", " run_manager: A run manager with callbacks for the LLM.\n", " \"\"\"\n", + " # Replace this with actual logic to generate a response from a list\n", + " # of messages.\n", " last_message = messages[-1]\n", " tokens = last_message.content[: self.n]\n", - " message = AIMessage(content=tokens)\n", + " message = AIMessage(\n", + " content=tokens,\n", + " additional_kwargs={}, # Used to add additional payload (e.g., function calling request)\n", + " response_metadata={ # Use for response metadata\n", + " \"time_in_seconds\": 3,\n", + " },\n", + " )\n", + " ##\n", + "\n", " generation = ChatGeneration(message=message)\n", " return ChatResult(generations=[generation])\n", "\n", @@ -276,36 +266,21 @@ " chunk = ChatGenerationChunk(message=AIMessageChunk(content=token))\n", "\n", " if run_manager:\n", + " # This is optional in newer versions of LangChain\n", + " # The on_llm_new_token will be called automatically\n", " run_manager.on_llm_new_token(token, chunk=chunk)\n", "\n", " yield chunk\n", "\n", - " async def _astream(\n", - " self,\n", - " messages: List[BaseMessage],\n", - " stop: Optional[List[str]] = None,\n", - " run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,\n", - " **kwargs: Any,\n", - " ) -> AsyncIterator[ChatGenerationChunk]:\n", - " \"\"\"An async variant of astream.\n", - "\n", - " If not provided, the default behavior is to delegate to the _generate method.\n", - "\n", - " The implementation below instead will delegate to `_stream` and will\n", - " kick it off in a separate thread.\n", - "\n", - " If you're able to natively support async, then by all means do so!\n", - " \"\"\"\n", - " result = await run_in_executor(\n", - " None,\n", - " self._stream,\n", - " messages,\n", - " stop=stop,\n", - " run_manager=run_manager.get_sync() if run_manager else None,\n", - " **kwargs,\n", + " # 
Let's add some other information (e.g., response metadata)\n", + " chunk = ChatGenerationChunk(\n", + " message=AIMessageChunk(content=\"\", response_metadata={\"time_in_sec\": 3})\n", " )\n", - " for chunk in result:\n", - " yield chunk\n", + " if run_manager:\n", + " # This is optional in newer versions of LangChain\n", + " # The on_llm_new_token will be called automatically\n", + " run_manager.on_llm_new_token(token, chunk=chunk)\n", + " yield chunk\n", "\n", " @property\n", " def _llm_type(self) -> str:\n", @@ -314,21 +289,18 @@ "\n", " @property\n", " def _identifying_params(self) -> Dict[str, Any]:\n", - " \"\"\"Return a dictionary of identifying parameters.\"\"\"\n", - " return {\"n\": self.n}" ] }, - { - "cell_type": "markdown", - "id": "b3c3d030-8d8b-4891-962d-a2d39b331883", - "metadata": {}, - "source": [ - ":::{.callout-tip}\n", - "The `_astream` implementation uses `run_in_executor` to launch the sync `_stream` in a separate thread.\n", - "\n", - "You can use this trick if you want to reuse the `_stream` implementation, but if you're able to implement code\n", - "that's natively async that's a better solution since that code will run with less overhead.\n", - ":::" ] }, + " \"\"\"Return a dictionary of identifying parameters.\n", "\n", + " This information is used by the LangChain callback system, which\n", + " is used for tracing purposes and makes it possible to monitor LLMs.\n", + " \"\"\"\n", + " return {\n", + " # The model name allows users to specify custom token counting\n", + " # rules in LLM monitoring applications (e.g., in LangSmith users\n", + " # can provide per token pricing for their model and monitor\n", + " # costs for the given LLM.)\n", + " \"model_name\": self.model_name,\n", + " }" ] }, @@ -345,22 +317,26 @@ "cell_type": "code", "execution_count": 5, "id": "34bf2d48-556a-48be-aee7-496fb02332f3", - "metadata": {}, + "metadata": { + "tags": [] + }, "outputs": [], "source": [ - "model = CustomChatModelAdvanced(n=3)" + "model = CustomChatModelAdvanced(n=3, model_name=\"my_custom_model\")" ] }, { "cell_type": "code", "execution_count": 6, "id": "27689f30-dcd2-466b-ba9d-f60b7d434110", - "metadata": {}, + "metadata": { + "tags": [] + }, "outputs": [ { "data": { "text/plain": [ - "AIMessage(content='Meo')" + "AIMessage(content='Meo', response_metadata={'time_in_seconds': 3}, id='run-ddb42bd6-4fdd-4bd2-8be5-e11b67d3ac29-0')" ] }, "execution_count": 6, @@ -382,12 +358,14 @@ "cell_type": "code", "execution_count": 7, "id": "406436df-31bf-466b-9c3d-39db9d6b6407", - "metadata": {}, + "metadata": { + "tags": [] + }, "outputs": [ { "data": { "text/plain": [ - "AIMessage(content='hel')" + "AIMessage(content='hel', response_metadata={'time_in_seconds': 3}, id='run-4d3cc912-44aa-454b-977b-ca02be06c12e-0')" ] }, "execution_count": 7, @@ -403,12 +381,15 @@ "cell_type": "code", "execution_count": 8, "id": "a72ffa46-6004-41ef-bbe4-56fa17a029e2", - "metadata": {}, + "metadata": { + "tags": [] + }, "outputs": [ { "data": { "text/plain": [ - "[AIMessage(content='hel'), AIMessage(content='goo')]" + "[AIMessage(content='hel', response_metadata={'time_in_seconds': 3}, id='run-9620e228-1912-4582-8aa1-176813afec49-0'),\n", + " AIMessage(content='goo', response_metadata={'time_in_seconds': 3}, id='run-1ce8cdf8-6f75-448e-82f7-1bb4a121df93-0')]" ] }, "execution_count": 8, @@ -424,13 +405,15 @@ "cell_type": "code", "execution_count": 9, "id": "3633be2c-2ea0-42f9-a72f-3b5240690b55", - "metadata": {}, + "metadata": { + "tags": [] + }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - 
"c|a|t|" + "c|a|t||" ] } ], @@ -451,13 +434,15 @@ "cell_type": "code", "execution_count": 10, "id": "b7d73995-eeab-48c6-a7d8-32c98ba29fc2", - "metadata": {}, + "metadata": { + "tags": [] + }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "c|a|t|" + "c|a|t||" ] } ], @@ -478,24 +463,27 @@ "cell_type": "code", "execution_count": 11, "id": "17840eba-8ff4-4e73-8e4f-85f16eb1c9d0", - "metadata": {}, + "metadata": { + "tags": [] + }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "{'event': 'on_chat_model_start', 'run_id': 'e03c0b21-521f-4cb4-a837-02fed65cf1cf', 'name': 'CustomChatModelAdvanced', 'tags': [], 'metadata': {}, 'data': {'input': 'cat'}}\n", - "{'event': 'on_chat_model_stream', 'run_id': 'e03c0b21-521f-4cb4-a837-02fed65cf1cf', 'tags': [], 'metadata': {}, 'name': 'CustomChatModelAdvanced', 'data': {'chunk': AIMessageChunk(content='c')}}\n", - "{'event': 'on_chat_model_stream', 'run_id': 'e03c0b21-521f-4cb4-a837-02fed65cf1cf', 'tags': [], 'metadata': {}, 'name': 'CustomChatModelAdvanced', 'data': {'chunk': AIMessageChunk(content='a')}}\n", - "{'event': 'on_chat_model_stream', 'run_id': 'e03c0b21-521f-4cb4-a837-02fed65cf1cf', 'tags': [], 'metadata': {}, 'name': 'CustomChatModelAdvanced', 'data': {'chunk': AIMessageChunk(content='t')}}\n", - "{'event': 'on_chat_model_end', 'name': 'CustomChatModelAdvanced', 'run_id': 'e03c0b21-521f-4cb4-a837-02fed65cf1cf', 'tags': [], 'metadata': {}, 'data': {'output': AIMessageChunk(content='cat')}}\n" + "{'event': 'on_chat_model_start', 'run_id': '125a2a16-b9cd-40de-aa08-8aa9180b07d0', 'name': 'CustomChatModelAdvanced', 'tags': [], 'metadata': {}, 'data': {'input': 'cat'}}\n", + "{'event': 'on_chat_model_stream', 'run_id': '125a2a16-b9cd-40de-aa08-8aa9180b07d0', 'tags': [], 'metadata': {}, 'name': 'CustomChatModelAdvanced', 'data': {'chunk': AIMessageChunk(content='c', id='run-125a2a16-b9cd-40de-aa08-8aa9180b07d0')}}\n", + "{'event': 'on_chat_model_stream', 'run_id': '125a2a16-b9cd-40de-aa08-8aa9180b07d0', 'tags': [], 'metadata': {}, 'name': 'CustomChatModelAdvanced', 'data': {'chunk': AIMessageChunk(content='a', id='run-125a2a16-b9cd-40de-aa08-8aa9180b07d0')}}\n", + "{'event': 'on_chat_model_stream', 'run_id': '125a2a16-b9cd-40de-aa08-8aa9180b07d0', 'tags': [], 'metadata': {}, 'name': 'CustomChatModelAdvanced', 'data': {'chunk': AIMessageChunk(content='t', id='run-125a2a16-b9cd-40de-aa08-8aa9180b07d0')}}\n", + "{'event': 'on_chat_model_stream', 'run_id': '125a2a16-b9cd-40de-aa08-8aa9180b07d0', 'tags': [], 'metadata': {}, 'name': 'CustomChatModelAdvanced', 'data': {'chunk': AIMessageChunk(content='', response_metadata={'time_in_sec': 3}, id='run-125a2a16-b9cd-40de-aa08-8aa9180b07d0')}}\n", + "{'event': 'on_chat_model_end', 'name': 'CustomChatModelAdvanced', 'run_id': '125a2a16-b9cd-40de-aa08-8aa9180b07d0', 'tags': [], 'metadata': {}, 'data': {'output': AIMessageChunk(content='cat', response_metadata={'time_in_sec': 3}, id='run-125a2a16-b9cd-40de-aa08-8aa9180b07d0')}}\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "/home/eugene/src/langchain/libs/core/langchain_core/_api/beta_decorator.py:86: LangChainBetaWarning: This API is in beta and may change in the future.\n", + "/home/eugene/src/langchain/libs/core/langchain_core/_api/beta_decorator.py:87: LangChainBetaWarning: This API is in beta and may change in the future.\n", " warn_beta(\n" ] } @@ -505,84 +493,6 @@ " print(event)" ] }, - { - "cell_type": "markdown", - "id": "42f9553f-7d8c-4277-aeb4-d80d77839d90", - "metadata": {}, - "source": [ 
- "## Identifying Params\n", - "\n", - "LangChain has a callback system which allows implementing loggers to monitor the behavior of LLM applications.\n", - "\n", - "Remember the `_identifying_params` property from earlier? \n", - "\n", - "It's passed to the callback system and is accessible for user specified loggers.\n", - "\n", - "Below we'll implement a handler with just a single `on_chat_model_start` event to see where `_identifying_params` appears." - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "id": "cc7e6b5f-711b-48aa-9ebe-92a13e230c37", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "---\n", - "On chat model start.\n", - "{'invocation_params': {'n': 3, '_type': 'echoing-chat-model-advanced', 'stop': ['woof']}, 'options': {'stop': ['woof']}, 'name': None, 'batch_size': 1}\n" - ] - }, - { - "data": { - "text/plain": [ - "AIMessage(content='meo')" - ] - }, - "execution_count": 12, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from typing import Union\n", - "from uuid import UUID\n", - "\n", - "from langchain_core.callbacks import AsyncCallbackHandler\n", - "from langchain_core.outputs import (\n", - " ChatGenerationChunk,\n", - " ChatResult,\n", - " GenerationChunk,\n", - " LLMResult,\n", - ")\n", - "\n", - "\n", - "class SampleCallbackHandler(AsyncCallbackHandler):\n", - " \"\"\"Async callback handler that handles callbacks from LangChain.\"\"\"\n", - "\n", - " async def on_chat_model_start(\n", - " self,\n", - " serialized: Dict[str, Any],\n", - " messages: List[List[BaseMessage]],\n", - " *,\n", - " run_id: UUID,\n", - " parent_run_id: Optional[UUID] = None,\n", - " tags: Optional[List[str]] = None,\n", - " metadata: Optional[Dict[str, Any]] = None,\n", - " **kwargs: Any,\n", - " ) -> Any:\n", - " \"\"\"Run when a chat model starts running.\"\"\"\n", - " print(\"---\")\n", - " print(\"On chat model start.\")\n", - " print(kwargs)\n", - "\n", - "\n", - "model.invoke(\"meow\", stop=[\"woof\"], config={\"callbacks\": [SampleCallbackHandler()]})" - ] - }, { "cell_type": "markdown", "id": "44ee559b-b1da-4851-8c97-420ab394aff9", @@ -603,11 +513,10 @@ "\n", "* [ ] Add unit or integration tests to the overridden methods. Verify that `invoke`, `ainvoke`, `batch`, `stream` work if you've over-ridden the corresponding code.\n", "\n", + "\n", "Streaming (if you're implementing it):\n", "\n", - "* [ ] Provided an async implementation via `_astream`\n", - "* [ ] Make sure to invoke the `on_llm_new_token` callback\n", - "* [ ] `on_llm_new_token` is invoked BEFORE yielding the chunk\n", + "* [ ] Implement the _stream method to get streaming working\n", "\n", "Stop Token Behavior:\n", "\n", @@ -616,7 +525,20 @@ "\n", "Secret API Keys:\n", "\n", - "* [ ] If your model connects to an API it will likely accept API keys as part of its initialization. Use Pydantic's `SecretStr` type for secrets, so they don't get accidentally printed out when folks print the model." + "* [ ] If your model connects to an API it will likely accept API keys as part of its initialization. 
Use Pydantic's `SecretStr` type for secrets, so they don't get accidentally printed out when folks print the model.\n", + "\n", + "\n", + "Identifying Params:\n", + "\n", + "* [ ] Include a `model_name` in identifying params\n", + "\n", + "\n", + "Optimizations:\n", + "\n", + "Consider providing native async support to reduce the overhead from the model!\n", + " \n", + "* [ ] Provide a native async implementation of `_agenerate` (used by `ainvoke`)\n", + "* [ ] Provide a native async implementation of `_astream` (used by `astream`)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.11.2" + "version": "3.11.4" } }, "nbformat": 4,
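A short, illustrative sketch to go with the new "Optimizations" checklist items: the patch removes the example `_astream` override, so a natively async variant is left to the reader. The code below is not part of the patch; it is a minimal sketch that assumes the `CustomChatModelAdvanced` class and the `AsyncCallbackManagerForLLMRun` import already shown in the notebook.

```python
from typing import Any, AsyncIterator, List, Optional

from langchain_core.callbacks import AsyncCallbackManagerForLLMRun
from langchain_core.messages import AIMessageChunk, BaseMessage
from langchain_core.outputs import ChatGenerationChunk


class CustomChatModelAsync(CustomChatModelAdvanced):
    """Illustrative subclass that adds a natively async streaming path."""

    async def _astream(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> AsyncIterator[ChatGenerationChunk]:
        # Echo the first `n` characters of the last message, one token at a time,
        # without bouncing through a thread pool the way the run_in_executor
        # fallback would.
        last_message = messages[-1]
        for token in last_message.content[: self.n]:
            chunk = ChatGenerationChunk(message=AIMessageChunk(content=token))
            if run_manager:
                # The async callback manager's hook is a coroutine, so await it.
                await run_manager.on_llm_new_token(token, chunk=chunk)
            yield chunk
```

For this toy echo model the gain is negligible, but for a model that awaits an HTTP client, a native `_astream`/`_agenerate` avoids tying up executor threads.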
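Likewise, for the "Secret API Keys" checklist item, a minimal sketch of the `SecretStr` pattern. The `api_key` field and the `_call_remote_api` helper are hypothetical, and the `langchain_core.pydantic_v1` import path reflects the pydantic-v1-based models of this LangChain version; on newer releases `from pydantic import SecretStr` may apply instead.

```python
from langchain_core.pydantic_v1 import SecretStr


class MyApiChatModel(CustomChatModelAdvanced):
    """Illustrative only: store the key as a secret so printing the model redacts it."""

    api_key: SecretStr  # hypothetical field; repr() shows '**********' instead of the key

    def _call_remote_api(self, prompt: str) -> str:
        # Hypothetical helper: unwrap the secret only at the point of use.
        key = self.api_key.get_secret_value()
        raise NotImplementedError(f"wire up an HTTP client here (key length: {len(key)})")
```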