# Cookbook

In this notebook we'll take a look at a few common types of sequences to create.

## PromptTemplate + LLM

A PromptTemplate -> LLM sequence is a core chain that is used inside most larger chains and systems.

In [1]:
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI

In [2]:
model = ChatOpenAI()

In [3]:
prompt = ChatPromptTemplate.from_template("tell me a joke about {foo}")

In [4]:
chain = prompt | model

In [5]:
chain.invoke({"foo": "bears"})

AIMessage(content="Why don't bears wear shoes?\n\nBecause they have bear feet!", additional_kwargs={}, example=False)
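The `|` operator above feeds the output of one step into the next. As a rough mental model only (this is not LangChain's implementation), the same idea can be sketched in plain Python with `__or__`. The `Step` class and both toy steps are hypothetical names used for illustration.

```python
class Step:
    """Hypothetical stand-in for a runnable: wraps a one-argument function."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Toy "prompt": fills a template from a dict of variables.
fill_prompt = Step(lambda variables: "tell me a joke about {foo}".format(**variables))
# Toy "model": upper-cases the prompt instead of calling an LLM.
fake_model = Step(lambda text: text.upper())

toy_chain = fill_prompt | fake_model
result = toy_chain.invoke({"foo": "bears"})
print(result)
```

The point of the sketch is only that a chain is itself a step, so it can be piped into further steps.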

Often we want to attach kwargs to the model that's passed in. Here are a few examples:

### Attaching Stop Sequences

In [6]:
chain = prompt | model.bind(stop=["\n"])

In [7]:
chain.invoke({"foo": "bears"})

AIMessage(content="Why don't bears wear shoes?", additional_kwargs={}, example=False)
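`bind` attaches keyword arguments that are then supplied on every later call. A loose analogy, assuming nothing about LangChain internals, is `functools.partial`. The `fake_model` function below is hypothetical: it returns a canned reply and cuts it at the first stop sequence.

```python
from functools import partial


def fake_model(prompt, stop=None):
    """Hypothetical model: returns a canned reply, cut at the first stop sequence."""
    reply = "Why don't bears wear shoes?\n\nBecause they have bear feet!"
    for sequence in stop or []:
        index = reply.find(sequence)
        if index != -1:
            reply = reply[:index]
    return reply


# Pre-attach the keyword argument, similar in spirit to model.bind(stop=["\n"]).
bound_model = partial(fake_model, stop=["\n"])

full_reply = fake_model("tell me a joke about bears")
short_reply = bound_model("tell me a joke about bears")
print(short_reply)
```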

### Attaching Function Call information

In [8]:
functions = [
    {
      "name": "joke",
      "description": "A joke",
      "parameters": {
        "type": "object",
        "properties": {
          "setup": {
            "type": "string",
            "description": "The setup for the joke"
          },
          "punchline": {
            "type": "string",
            "description": "The punchline for the joke"
          }
        },
        "required": ["setup", "punchline"]
      }
    }
  ]
chain = prompt | model.bind(function_call= {"name": "joke"}, functions= functions)

In [9]:
chain.invoke({"foo": "bears"}, config={})

AIMessage(content='', additional_kwargs={'function_call': {'name': 'joke', 'arguments': '{\n  "setup": "Why don\'t bears wear shoes?",\n  "punchline": "Because they have bear feet!"\n}'}}, example=False)
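Note that `arguments` in the output above is a JSON-encoded string, not a dictionary. A minimal sketch of decoding it with the standard `json` module, using the payload copied from that output:

```python
import json

# The additional_kwargs payload copied from the output above.
additional_kwargs = {
    "function_call": {
        "name": "joke",
        "arguments": '{\n  "setup": "Why don\'t bears wear shoes?",\n  "punchline": "Because they have bear feet!"\n}',
    }
}

# `arguments` arrives as a JSON string, so decode it before use.
arguments = json.loads(additional_kwargs["function_call"]["arguments"])
print(arguments["setup"])
print(arguments["punchline"])
```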

## PromptTemplate + LLM + OutputParser

We can also add an output parser to easily transform the raw LLM/ChatModel output into a more workable format.

In [10]:
from langchain.schema.output_parser import StrOutputParser

In [11]:
chain = prompt | model | StrOutputParser()

Notice that this now returns a string, which is a much more workable format for downstream tasks.

In [12]:
chain.invoke({"foo": "bears"})

"Sure, here's a bear joke for you:\n\nWhy don't bears like fast food?\n\nBecause they can't catch it!"

### Functions Output Parser

When you specify the function the model must call, you may just want to parse that function's arguments directly.

In [13]:
from langchain.output_parsers.openai_functions import JsonOutputFunctionsParser
chain = (
    prompt 
    | model.bind(function_call= {"name": "joke"}, functions= functions) 
    | JsonOutputFunctionsParser()
)

In [14]:
chain.invoke({"foo": "bears"})

{'setup': "Why don't bears wear shoes?",
 'punchline': 'Because they have bear feet!'}

In [15]:
from langchain.output_parsers.openai_functions import JsonKeyOutputFunctionsParser
chain = (
    prompt 
    | model.bind(function_call= {"name": "joke"}, functions= functions) 
    | JsonKeyOutputFunctionsParser(key_name="setup")
)

In [16]:
chain.invoke({"foo": "bears"})

"Why don't bears wear shoes?"

## Passthroughs and itemgetter

Often when constructing a chain you want to pass original input variables along to later steps. How you do this depends on what the input is:

- If the original input was a string, you likely just want to pass the string along. This can be done with `RunnablePassthrough`. For an example, see `LLMChain + Retriever`.
- If the original input was a dictionary, you likely want to pass along specific keys. This can be done with `itemgetter`. For an example, see `Multiple LLM Chains`.

In [17]:
from langchain.schema.runnable import RunnablePassthrough
from operator import itemgetter
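`itemgetter` comes from the Python standard library and can be tried on its own. The sketch below shows what it returns for a dictionary input; the `passthrough` function is a hypothetical plain-Python analogue of passing a string input along unchanged, not the `RunnablePassthrough` class itself.

```python
from operator import itemgetter

inputs = {"question": "where did harrison work", "language": "italian"}

# itemgetter("question") builds a callable equivalent to `lambda d: d["question"]`.
get_question = itemgetter("question")
get_language = itemgetter("language")


def passthrough(value):
    """Identity function: hands the input to the next step unchanged."""
    return value


print(get_question(inputs))
print(get_language(inputs))
print(passthrough("where did harrison work?"))
```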

## LLMChain + Retriever

Let's now look at adding a retrieval step, which turns the chain into a "retrieval-augmented generation" chain.

In [18]:
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.schema.runnable import RunnablePassthrough

In [19]:
# Create the retriever
vectorstore = Chroma.from_texts(["harrison worked at kensho"], embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

In [20]:
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

In [21]:
chain = (
    {"context": retriever, "question": RunnablePassthrough()} 
    | prompt 
    | model 
    | StrOutputParser()
)

In [22]:
chain.invoke("where did harrison work?")

Number of requested results 4 is greater than number of elements in index 1, updating n_results = 1


'Harrison worked at Kensho.'
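In the dictionary `{"context": retriever, "question": RunnablePassthrough()}`, each value receives the same input and the results are collected under the matching keys before reaching the prompt. A plain-Python sketch of that fan-out, where `run_map` and `fake_retriever` are hypothetical helpers rather than LangChain APIs:

```python
def run_map(steps, value):
    """Call every step with the same input and collect the results by key."""
    return {key: step(value) for key, step in steps.items()}


def fake_retriever(query):
    """Hypothetical retriever: returns a fixed document regardless of the query."""
    return ["harrison worked at kensho"]


steps = {
    "context": fake_retriever,
    "question": lambda question: question,  # passthrough of the raw string
}

prompt_inputs = run_map(steps, "where did harrison work?")
print(prompt_inputs)
```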

In [23]:
template = """Answer the question based only on the following context:
{context}

Question: {question}

Answer in the following language: {language}
"""
prompt = ChatPromptTemplate.from_template(template)

chain = {
    "context": itemgetter("question") | retriever, 
    "question": itemgetter("question"), 
    "language": itemgetter("language")
} | prompt | model | StrOutputParser()

In [24]:
chain.invoke({"question": "where did harrison work", "language": "italian"})

Number of requested results 4 is greater than number of elements in index 1, updating n_results = 1


'Harrison ha lavorato a Kensho.'

## Conversational Retrieval Chain

We can easily add in conversation history. This primarily means adding a `chat_history` input that is condensed, together with the follow-up question, into a standalone question.

In [25]:
from langchain.schema.runnable import RunnableMap
from langchain.schema import format_document

In [26]:
from langchain.prompts.prompt import PromptTemplate

_template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:"""
CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(_template)

In [27]:
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
ANSWER_PROMPT = ChatPromptTemplate.from_template(template)

In [28]:
DEFAULT_DOCUMENT_PROMPT = PromptTemplate.from_template(template="{page_content}")

def _combine_documents(docs, document_prompt=DEFAULT_DOCUMENT_PROMPT, document_separator="\n\n"):
    # Render each document with the prompt, then join the rendered strings
    doc_strings = [format_document(doc, document_prompt) for doc in docs]
    return document_separator.join(doc_strings)

In [29]:
from typing import Tuple, List
def _format_chat_history(chat_history: List[Tuple]) -> str:
    buffer = ""
    for dialogue_turn in chat_history:
        human = "Human: " + dialogue_turn[0]
        ai = "Assistant: " + dialogue_turn[1]
        buffer += "\n" + "\n".join([human, ai])
    return buffer

In [30]:
_inputs = RunnableMap(
    {
        "standalone_question": {
            "question": lambda x: x["question"],
            "chat_history": lambda x: _format_chat_history(x['chat_history'])
        } | CONDENSE_QUESTION_PROMPT | ChatOpenAI(temperature=0) | StrOutputParser(),
    }
)
_context = {
    "context": itemgetter("standalone_question") | retriever | _combine_documents,
    "question": lambda x: x["standalone_question"]
}
conversational_qa_chain = _inputs | _context | ANSWER_PROMPT | ChatOpenAI()

In [31]:
conversational_qa_chain.invoke({
    "question": "where did harrison work?",
    "chat_history": [],
})

Number of requested results 4 is greater than number of elements in index 1, updating n_results = 1


AIMessage(content='Harrison was employed at Kensho.', additional_kwargs={}, example=False)

In [32]:
conversational_qa_chain.invoke({
    "question": "where did he work?",
    "chat_history": [("Who wrote this notebook?", "Harrison")],
})

Number of requested results 4 is greater than number of elements in index 1, updating n_results = 1


AIMessage(content='Harrison worked at Kensho.', additional_kwargs={}, example=False)

### With Memory and returning source documents

This shows how to use memory with the chain above. The memory object has to be managed outside the chain: we load it at the start of each call and save to it ourselves afterwards. To return the retrieved documents, we just need to pass them through every step to the final output.

In [33]:
from langchain.memory import ConversationBufferMemory

In [34]:
memory = ConversationBufferMemory(return_messages=True, output_key="answer", input_key="question")

In [35]:
# First we add a step to load memory
# This needs to be a RunnableMap because it's the first input
loaded_memory = RunnableMap(
    {
        "question": itemgetter("question"),
        "memory": memory.load_memory_variables,
    }
)
# Next we add a step to expand memory into the variables
expanded_memory = {
    "question": itemgetter("question"),
    "chat_history": lambda x: x["memory"]["history"]
}

# Now we calculate the standalone question
standalone_question = {
    "standalone_question": {
        "question": lambda x: x["question"],
        "chat_history": lambda x: _format_chat_history(x['chat_history'])
    } | CONDENSE_QUESTION_PROMPT | ChatOpenAI(temperature=0) | StrOutputParser(),
}
# Now we retrieve the documents
retrieved_documents = {
    "docs": itemgetter("standalone_question") | retriever,
    "question": lambda x: x["standalone_question"]
}
# Now we construct the inputs for the final prompt
final_inputs = {
    "context": lambda x: _combine_documents(x["docs"]),
    "question": itemgetter("question")
}
# And finally, we do the part that returns the answers
answer = {
    "answer": final_inputs | ANSWER_PROMPT | ChatOpenAI(),
    "docs": itemgetter("docs"),
}
# And now we put it all together!
final_chain = loaded_memory | expanded_memory | standalone_question | retrieved_documents | answer

In [36]:
inputs = {"question": "where did harrison work?"}
result = final_chain.invoke(inputs)
result

Number of requested results 4 is greater than number of elements in index 1, updating n_results = 1


{'answer': AIMessage(content='Harrison was employed at Kensho.', additional_kwargs={}, example=False),
 'docs': [Document(page_content='harrison worked at kensho', metadata={})]}
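The `docs` key reaches the final output only because each step explicitly forwards it. A minimal plain-Python sketch of that pattern, with hypothetical step functions and a placeholder answer in place of a model call:

```python
def retrieve(state):
    """Hypothetical retrieval step: returns a fixed document list."""
    return {
        "docs": ["harrison worked at kensho"],
        "question": state["standalone_question"],
    }


def answer(state):
    context = "\n\n".join(state["docs"])
    return {
        # Placeholder answer; a real chain would call the model here.
        "answer": "Based on: " + context,
        # Forward the documents unchanged so the caller can inspect the sources.
        "docs": state["docs"],
    }


state = {"standalone_question": "where did harrison work?"}
for step in (retrieve, answer):
    state = step(state)
print(state)
```

If `answer` left out the `"docs"` entry, the documents would be dropped at that step, since each step's output replaces the previous one.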

In [37]:
# Note that the memory does not save automatically
# This will be improved in the future
# For now you need to save it yourself
memory.save_context(inputs, {"answer": result["answer"].content})

In [38]:
memory.load_memory_variables({})

{'history': [HumanMessage(content='where did harrison work?', additional_kwargs={}, example=False),
  AIMessage(content='Harrison was employed at Kensho.', additional_kwargs={}, example=False)]}

## Multiple LLM Chains

The same composition style can also be used to string together multiple LLMChains, so that the output of one chain becomes an input of the next.

In [39]:
from operator import itemgetter

prompt1 = ChatPromptTemplate.from_template("what is the city {person} is from?")
prompt2 = ChatPromptTemplate.from_template("what country is the city {city} in? respond in {language}")

chain1 = prompt1 | model | StrOutputParser()

chain2 = {"city": chain1, "language": itemgetter("language")} | prompt2 | model | StrOutputParser()

chain2.invoke({"person": "obama", "language": "spanish"})

'El país en el que se encuentra la ciudad de Honolulu, Hawái, donde nació Barack Obama, el 44º presidente de los Estados Unidos, es Estados Unidos.'

In [40]:
from langchain.schema.runnable import RunnableMap
prompt1 = ChatPromptTemplate.from_template("generate a random color")
prompt2 = ChatPromptTemplate.from_template("what is a fruit of color: {color}")
prompt3 = ChatPromptTemplate.from_template("what is countries flag that has the color: {color}")
prompt4 = ChatPromptTemplate.from_template("What is the color of {fruit} and {country}")
chain1 = prompt1 | model | StrOutputParser()
chain2 = RunnableMap(steps={"color": chain1}) | {
    "fruit": prompt2 | model | StrOutputParser(),
    "country": prompt3 | model | StrOutputParser(),
} | prompt4

In [41]:
chain2.invoke({})

ChatPromptValue(messages=[HumanMessage(content="What is the color of A fruit that is of color #FF4500 is typically an orange fruit. and The country's flag that has the color #FF4500 is the flag of India.", additional_kwargs={}, example=False)])

### Branching and Merging

You may want the output of one component to be processed by 2 or more other components. [RunnableMaps](https://api.python.langchain.com/en/latest/schema/langchain.schema.runnable.base.RunnableMap.html) let you split or fork the chain so multiple components can process the input in parallel. Later, other components can join or merge the results to synthesize a final response. This type of chain creates a computation graph that looks like the following:

```text
     Input
      / \
     /   \
 Branch1 Branch2
     \   /
      \ /
      Combine
```
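The diagram above can be sketched in plain Python with a thread pool. This is only an illustration of the fork-then-join shape; the branch functions are hypothetical and return canned strings instead of calling a model.

```python
from concurrent.futures import ThreadPoolExecutor


def list_pros(base_response):
    return f"pros of {base_response}"


def list_cons(base_response):
    return f"cons of {base_response}"


def combine(results):
    # Merge step: sees the outputs of every branch at once.
    return " | ".join(
        [results["original_response"], results["results_1"], results["results_2"]]
    )


def branch_and_merge(base_response):
    branches = {
        "results_1": list_pros,
        "results_2": list_cons,
        "original_response": lambda text: text,
    }
    # Fork: submit every branch with the same input, then wait for all of them.
    with ThreadPoolExecutor() as pool:
        futures = {key: pool.submit(func, base_response) for key, func in branches.items()}
        results = {key: future.result() for key, future in futures.items()}
    return combine(results)


merged = branch_and_merge("scrum")
print(merged)
```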

In [63]:
planner = (
    ChatPromptTemplate.from_template(
        "Generate an argument about: {input}"
    )
    | ChatOpenAI()
    | StrOutputParser()
    | {"base_response": RunnablePassthrough()}
)

arguments_for = (
    ChatPromptTemplate.from_template(
        "List the pros or positive aspects of {base_response}"
    )
    | ChatOpenAI()
    | StrOutputParser()
)
arguments_against =  (
    ChatPromptTemplate.from_template(
        "List the cons or negative aspects of {base_response}"
    )
    | ChatOpenAI()
    | StrOutputParser()
)

final_responder = (
    ChatPromptTemplate.from_messages(
        [
            ("ai", "{original_response}"),
            ("human", "Pros:\n{results_1}\n\nCons:\n{results_2}"),
            ("system", "Generate a final response given the critique"),
        ]
    )
    | ChatOpenAI()
    | StrOutputParser()
)

chain = (
    planner 
    | {
        "results_1": arguments_for,
        "results_2": arguments_against,
        "original_response": itemgetter("base_response"),
    }
    | final_responder
)

In [65]:
chain.invoke({"input": "scrum"})

"While Scrum has its limitations and potential drawbacks, it is important to note that these can be mitigated with proper understanding, implementation, and adaptation. Here are some ways to address the critique:\n\n1. Lack of structure: While Scrum promotes self-organization, it is essential to provide clear guidelines, define roles and responsibilities, and establish a shared understanding of the project's goals and expectations. This can be achieved through effective communication and regular alignment meetings.\n\n2. Time and resource constraints: Proper planning, prioritization, and resource allocation are crucial in managing the sprint cycles effectively. Teams can leverage tools and techniques such as backlog refinement, sprint planning, and capacity planning to ensure that workloads are manageable and realistic.\n\n3. Managing large teams: Scaling frameworks like Scrum of Scrums or LeSS (Large-Scale Scrum) can be implemented to coordinate the efforts of multiple Scrum teams. Th

## Router

You can also use the router runnable to conditionally route inputs to different runnables.

In [66]:
from langchain.chains import create_tagging_chain_pydantic
from pydantic import BaseModel, Field

class PromptToUse(BaseModel):
    """Used to determine which prompt to use to answer the user's input."""
    
    name: str = Field(description="Should be one of `math` or `english`")

In [67]:
tagger = create_tagging_chain_pydantic(PromptToUse, ChatOpenAI(temperature=0))

In [68]:
chain1 = ChatPromptTemplate.from_template("You are a math genius. Answer the question: {question}") | ChatOpenAI()
chain2 = ChatPromptTemplate.from_template("You are an english major. Answer the question: {question}") | ChatOpenAI()

In [69]:
from langchain.schema.runnable import RouterRunnable
router = RouterRunnable({"math": chain1, "english": chain2})

In [70]:
chain = {
    "key": {"input": lambda x: x["question"]} | tagger | (lambda x: x['text'].name),
    "input": {"question": lambda x: x["question"]}
} | router

In [71]:
chain.invoke({"question": "whats 2 + 2"})

AIMessage(content='Thank you for the compliment! The sum of 2 and 2 is 4.', additional_kwargs={}, example=False)
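The router receives a dictionary with a `key` naming the route and an `input` to hand to it. As a sketch under that assumption, the dispatch can be modelled as a dictionary lookup; `classify` and both handlers are hypothetical stand-ins for the tagging chain and the two prompt chains.

```python
def classify(question):
    """Hypothetical classifier standing in for the tagging chain."""
    return "math" if any(char.isdigit() for char in question) else "english"


def math_handler(inputs):
    return "math: " + inputs["question"]


def english_handler(inputs):
    return "english: " + inputs["question"]


routes = {"math": math_handler, "english": english_handler}


def route(request):
    """Pick a handler by request["key"] and call it with request["input"]."""
    return routes[request["key"]](request["input"])


def answer(question):
    request = {"key": classify(question), "input": {"question": question}}
    return route(request)


print(answer("whats 2 + 2"))
```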

## Tools

You can use any LangChain tool easily.

In [72]:
from langchain.tools import DuckDuckGoSearchRun

In [73]:
search = DuckDuckGoSearchRun()

In [74]:
template = """turn the following user input into a search query for a search engine:

{input}"""
prompt = ChatPromptTemplate.from_template(template)

In [75]:
chain = prompt | model | StrOutputParser() | search

In [76]:
chain.invoke({"input": "I'd like to figure out what games are tonight"})

"What sports games are on TV today & tonight? Watch and stream live sports on TV today, tonight, tomorrow. Today's 2023 sports TV schedule includes football, basketball, baseball, hockey, motorsports, soccer and more. Watch on TV or stream online on ESPN, FOX, FS1, CBS, NBC, ABC, Peacock, Paramount+, fuboTV, local channels and many other networks. MLB Games Tonight: How to Watch on TV, Streaming & Odds - Wednesday, September 6. Texas Rangers second baseman Marcus Semien, left, tags out Houston Astros' Jose Altuve (27) who was attempting to stretch out a single in the seventh inning of a baseball game, Monday, Sept. 4, 2023, in Arlington, Texas. (AP Photo/Tony Gutierrez) (APMedia) There ... MLB Games Tonight: How to Watch on TV, Streaming & Odds - Sunday, September 3. Los Angeles Dodgers right fielder Mookie Betts, left, gives a thumbs up to Vanessa Bryant, right, widow of Kobe ... WEEK 16 NFL TV SCHEDULE. NFL Games Thursday, 12/21/23. TIME ET. TV. New Orleans at LA Rams. 8:15pm. AMZN. 

## Arbitrary Functions

You can use arbitrary functions in the pipeline.

Note that each of these functions must accept a SINGLE argument. If you have a function that accepts multiple arguments, you should write a wrapper that accepts a single input and unpacks it into multiple arguments.

In [77]:
from langchain.schema.runnable import RunnableLambda

def length_function(text):
    return len(text)

def _multiple_length_function(text1, text2):
    return len(text1) * len(text2)

def multiple_length_function(_dict):
    return _multiple_length_function(_dict["text1"], _dict["text2"])

prompt = ChatPromptTemplate.from_template("what is {a} + {b}")

chain1 = prompt | model

chain = {
    "a": itemgetter("foo") | RunnableLambda(length_function),
    "b": {"text1": itemgetter("foo"), "text2": itemgetter("bar")} | RunnableLambda(multiple_length_function)
} | prompt | model

In [78]:
chain.invoke({"foo": "bar", "bar": "gah"})

AIMessage(content='3 + 9 equals 12.', additional_kwargs={}, example=False)
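To check which values reach the prompt, the same helper functions can be run without the model. The sketch below repeats their definitions so it is self-contained and fills the prompt template with ordinary string formatting.

```python
from operator import itemgetter


def length_function(text):
    return len(text)


def _multiple_length_function(text1, text2):
    return len(text1) * len(text2)


def multiple_length_function(_dict):
    # Single-argument wrapper: unpack the dict into the two real arguments.
    return _multiple_length_function(_dict["text1"], _dict["text2"])


inputs = {"foo": "bar", "bar": "gah"}

a = length_function(itemgetter("foo")(inputs))
b = multiple_length_function(
    {"text1": itemgetter("foo")(inputs), "text2": itemgetter("bar")(inputs)}
)
question = "what is {a} + {b}".format(a=a, b=b)
print(question)
```

Here `a` is the length of `"bar"` and `b` is the product of the two lengths, which matches the `3 + 9` in the model's reply.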

## Accepting a Runnable Config

Runnable lambdas can optionally accept a [RunnableConfig](https://api.python.langchain.com/en/latest/schema/langchain.schema.runnable.config.RunnableConfig.html?highlight=runnableconfig#langchain.schema.runnable.config.RunnableConfig), which they can use to pass callbacks, tags, and other configuration information to nested runs.

In [139]:
from langchain.schema.runnable import RunnableConfig

In [149]:
import json

def parse_or_fix(text: str, config: RunnableConfig):
    fixing_chain = (
        ChatPromptTemplate.from_template(
            "Fix the following text:\n\n```text\n{input}\n```\nError: {error}"
            " Don't narrate, just respond with the fixed data."
        )
        | ChatOpenAI()
        | StrOutputParser()
    )
    for _ in range(3):
        try:
            return json.loads(text)
        except Exception as e:
            text = fixing_chain.invoke({"input": text, "error": e}, config)
    return "Failed to parse"

In [152]:
from langchain.callbacks import get_openai_callback

with get_openai_callback() as cb:
    RunnableLambda(parse_or_fix).invoke("{foo: bar}", {"tags": ["my-tag"], "callbacks": [cb]})
    print(cb)

Tokens Used: 65
	Prompt Tokens: 56
	Completion Tokens: 9
Successful Requests: 1
Total Cost (USD): $0.00010200000000000001
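The retry loop inside `parse_or_fix` can be exercised without a model by swapping the fixing chain for an ordinary function. In this sketch `parse_with_retries` and `quote_bare_words` are hypothetical names, and the regex-based fixer is only a stand-in for the LLM call.

```python
import json
import re


def parse_with_retries(text, fix, max_attempts=3):
    """Try to parse JSON; on failure ask `fix` for a repaired string and retry."""
    for _ in range(max_attempts):
        try:
            return json.loads(text)
        except json.JSONDecodeError as error:
            text = fix(text, error)
    return "Failed to parse"


def quote_bare_words(text, error):
    """Hypothetical fixer: wraps bare words in double quotes."""
    return re.sub(r"\b([A-Za-z_]\w*)\b", r'"\1"', text)


parsed = parse_with_retries("{foo: bar}", quote_bare_words)
print(parsed)
```

A fixer that never repairs the text exhausts the attempts and yields the same `"Failed to parse"` sentinel as the original function.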


## SQL Database

We can also try to replicate our SQLDatabaseChain using this style.

In [106]:
template = """Based on the table schema below, write a SQL query that would answer the user's question:
{schema}

Question: {question}
SQL Query:"""
prompt = ChatPromptTemplate.from_template(template)

In [107]:
from langchain.utilities import SQLDatabase

In [111]:
db = SQLDatabase.from_uri("sqlite:///../../../../notebooks/Chinook.db")

In [109]:
def get_schema(_):
    return db.get_table_info()

In [112]:
def run_query(query):
    return db.run(query)

In [113]:
inputs = {
    "schema": RunnableLambda(get_schema),
    "question": itemgetter("question")
}
sql_response = (
    RunnableMap(inputs)
    | prompt
    | model.bind(stop=["\nSQLResult:"])
    | StrOutputParser()
)

In [114]:
sql_response.invoke({"question": "How many employees are there?"})

'SELECT COUNT(EmployeeId) FROM Employee'

In [115]:
template = """Based on the table schema below, question, sql query, and sql response, write a natural language response:
{schema}

Question: {question}
SQL Query: {query}
SQL Response: {response}"""
prompt_response = ChatPromptTemplate.from_template(template)

In [116]:
full_chain = (
    RunnableMap({
        "question": itemgetter("question"),
        "query": sql_response,
    }) 
    | {
        "schema": RunnableLambda(get_schema),
        "question": itemgetter("question"),
        "query": itemgetter("query"),
        "response": lambda x: db.run(x["query"])    
    } 
    | prompt_response 
    | model
)

In [117]:
full_chain.invoke({"question": "How many employees are there?"})

AIMessage(content='There are 8 employees.', additional_kwargs={}, example=False)
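To try the generated query without the Chinook file, it can be run against a throwaway in-memory SQLite database. The table contents below are hypothetical (three made-up rows, not Chinook data), so the count differs from the answer above.

```python
import sqlite3

# Hypothetical in-memory table with the same column name the query uses.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE Employee (EmployeeId INTEGER PRIMARY KEY, LastName TEXT)"
)
connection.executemany(
    "INSERT INTO Employee (LastName) VALUES (?)",
    [("Alpha",), ("Beta",), ("Gamma",)],
)

# The query string produced by sql_response in the example above.
query = "SELECT COUNT(EmployeeId) FROM Employee"
(count,) = connection.execute(query).fetchone()
connection.close()
print(count)
```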

## Code Writing

In [118]:
from langchain.utilities import PythonREPL
from langchain.prompts import SystemMessagePromptTemplate, HumanMessagePromptTemplate

In [119]:
template = """Write some python code to solve the user's problem. 

Return only python code in Markdown format, e.g.:

```python
....
```"""
prompt = ChatPromptTemplate(messages=[
    SystemMessagePromptTemplate.from_template(template),
    HumanMessagePromptTemplate.from_template("{input}")
])

In [120]:
def _sanitize_output(text: str):
    # Split only on the first opening fence so extra code blocks in the reply don't break unpacking
    _, after = text.split("```python", 1)
    return after.split("```")[0]

In [121]:
chain = prompt | model | StrOutputParser() | _sanitize_output | PythonREPL().run

In [122]:
chain.invoke({"input": "whats 2 plus 2"})

Python REPL can execute arbitrary code. Use with caution.


'4\n'
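The extraction step can be tested on a hand-written reply before wiring it to a model. The sketch below is a slightly more defensive variant with a hypothetical name, `sanitize_output`; it builds the fence marker from single backticks only so that the example's own code fence stays intact.

```python
FENCE = "`" * 3  # three backticks


def sanitize_output(text: str) -> str:
    """Return the body of the first fenced python block, or raise if none exists."""
    opener = FENCE + "python"
    if opener not in text:
        raise ValueError("no fenced python block found")
    after = text.split(opener, 1)[1]
    return after.split(FENCE)[0]


# Hand-written stand-in for a model reply.
model_reply = f"Here you go:\n{FENCE}python\nprint(2 + 2)\n{FENCE}\nDone."
code = sanitize_output(model_reply)
print(code.strip())
```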

## Memory

This shows how to add memory to an arbitrary chain. Right now, you can use the memory classes but need to hook them up manually.

In [123]:
from langchain.memory import ConversationBufferMemory
from langchain.schema.runnable import RunnableMap
from langchain.prompts import MessagesPlaceholder
model = ChatOpenAI()
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful chatbot"),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}")
])

In [124]:
memory = ConversationBufferMemory(return_messages=True)

In [125]:
memory.load_memory_variables({})

{'history': []}

In [126]:
chain = RunnableMap({
    "input": lambda x: x["input"],
    "memory": memory.load_memory_variables
}) | {
    "input": lambda x: x["input"],
    "history": lambda x: x["memory"]["history"]
} | prompt | model

In [127]:
inputs = {"input": "hi im bob"}
response = chain.invoke(inputs)
response

AIMessage(content='Hello Bob! How can I assist you today?', additional_kwargs={}, example=False)

In [128]:
memory.save_context(inputs, {"output": response.content})

In [129]:
memory.load_memory_variables({})

{'history': [HumanMessage(content='hi im bob', additional_kwargs={}, example=False),
  AIMessage(content='Hello Bob! How can I assist you today?', additional_kwargs={}, example=False)]}

In [130]:
inputs = {"input": "whats my name"}
response = chain.invoke(inputs)
response

AIMessage(content='Your name is Bob.', additional_kwargs={}, example=False)
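The load, invoke, save cycle above can be mimicked with a toy class to make the manual wiring explicit. `ToyBufferMemory` and `toy_chain` are hypothetical; the class only borrows the two method names used above and is not LangChain code.

```python
class ToyBufferMemory:
    """Toy memory that mimics the two method names used above."""

    def __init__(self):
        self._messages = []

    def load_memory_variables(self, _inputs):
        return {"history": list(self._messages)}

    def save_context(self, inputs, outputs):
        self._messages.append(("human", inputs["input"]))
        self._messages.append(("ai", outputs["output"]))


def toy_chain(inputs, memory):
    history = memory.load_memory_variables(inputs)["history"]
    # Fake model: reports how much history it was given instead of calling an LLM.
    return f"I can see {len(history)} earlier messages."


memory = ToyBufferMemory()
first_inputs = {"input": "hi im bob"}
first = toy_chain(first_inputs, memory)
memory.save_context(first_inputs, {"output": first})  # manual save, as above
second = toy_chain({"input": "whats my name"}, memory)
print(first)
print(second)
```

Skipping the `save_context` call would leave the second invocation with an empty history, which is the failure mode to watch for when memory is wired by hand.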

## Moderation

This shows how to add in moderation (or other safeguards) around your LLM application.

In [131]:
from langchain.chains import OpenAIModerationChain
from langchain.llms import OpenAI

In [132]:
moderate = OpenAIModerationChain()

In [133]:
model = OpenAI()
prompt = ChatPromptTemplate.from_messages([
    ("system", "repeat after me: {input}")
])

In [134]:
chain = prompt | model

In [135]:
chain.invoke({"input": "you are stupid"})

'\n\nYou are stupid.'

In [136]:
moderated_chain = chain | moderate

In [137]:
moderated_chain.invoke({"input": "you are stupid"})

{'input': '\n\nYou are stupid.',
 'output': "Text was found that violates OpenAI's content policy."}