# LLM Functionality

This notebook walks through the main features of the LLM class in LangChain.

We use the OpenAI LLM wrapper throughout, but the same functionality should be available for all LLM types.

In [1]:
from langchain.llms import OpenAI

In [2]:
llm = OpenAI(model_name="text-ada-001", n=2, best_of=2)

**Generate Text:** The most basic thing you can do with an LLM is call it: pass in a string and get a string back.

In [3]:
llm("Tell me a joke")

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'

**Generate:** More broadly, you can call `generate` with a list of inputs and get back a more complete response than just the text. This response includes things like multiple top responses for each input, as well as provider-specific information.

In [4]:
llm_result = llm.generate(["Tell me a joke", "Tell me a poem"]*15)

In [5]:
len(llm_result.generations)

30

In [6]:
llm_result.generations[0]

[Generation(text='\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'),
 Generation(text='\n\nWhy did the chicken cross the road?\n\nTo get to the other side.')]

In [7]:
llm_result.generations[-1]

[Generation(text="\n\nA rose by the side of the road\n\nIs all I need to find my way\n\nTo the place I've been searching for\n\nAnd my heart is singing with joy\n\nWhen I look at this rose\n\nIt reminds me of the love I've found\n\nAnd I know that wherever I go\n\nI'll always find my rose by the side of the road."),
 Generation(text="\n\nA rose by the side of the road\n\nIs all I need to find my way\n\nTo the place I've been searching for\n\nAnd my heart is singing with joy\n\nWhen I look at this rose\n\nIt tells me that true love is nigh\n\nAnd I know that this is the day\n\nWhen I look at this rose\n\nI am sure of what I am doing\n\nWhen I look at this rose\n\nI am confident in my love for you\n\nAnd I know that I am in love with you\n\nSo let it be, the rose by the side of the road\n\nAnd let it be what you do, what you are\n\nAnd you do it well, for this is what we want\n\nAnd we want to be with you\n\nAnd we want to be with you\n\nAnd we want to be with you\n\nWhen we find our way

In [8]:
# Provider specific info
llm_result.llm_output

{'token_usage': {'completion_tokens': 4108,
  'prompt_tokens': 120,
  'total_tokens': 4228}}
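The shape of this result is easy to misread: `generations` holds one inner list per input prompt, and each inner list holds the `n` candidates for that prompt. Here is a minimal sketch of walking that structure. It uses a hypothetical stand-in `Generation` dataclass (not LangChain's own class) and placeholder texts so that it runs without an API key; the token numbers are copied from the output above.

```python
from dataclasses import dataclass


@dataclass
class Generation:
    """Hypothetical stand-in for the objects shown above; only `text` is modelled."""
    text: str


# Same shape as llm_result.generations: 30 prompts, n=2 candidates per prompt.
prompts = ["Tell me a joke", "Tell me a poem"] * 15
generations = [
    [Generation(text=f"candidate {i} for {prompt!r}") for i in range(2)]
    for prompt in prompts
]

# The outer index selects the prompt, the inner index selects the candidate.
first_candidates = [candidates[0].text for candidates in generations]
total_candidates = sum(len(candidates) for candidates in generations)

# Token usage as reported in llm_output above.
token_usage = {"completion_tokens": 4108, "prompt_tokens": 120, "total_tokens": 4228}
prompt_tokens_per_input = token_usage["prompt_tokens"] / len(prompts)

print(len(generations), total_candidates, prompt_tokens_per_input)
```

With the real result object, the same indexing applies: `llm_result.generations[i][j].text` is candidate `j` for prompt `i`.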

**Number of Tokens:** You can also estimate how many tokens a piece of text takes up for a given model. This is useful because models have a limited context length (and cost more as the token count grows), so you need to know how long the text you are passing in is.

Note that by default the token count is estimated using a HuggingFace tokenizer.

In [9]:
llm.get_num_tokens("what a joke")

None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.


3
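One way to use such a count is to check a prompt against a token budget before sending it. The sketch below is illustrative only: `fits_context`, `max_context`, and `reserved_for_completion` are hypothetical names, and the whitespace counter is a crude stand-in so the example runs offline. In this notebook you could pass `llm.get_num_tokens` as the counter instead.

```python
from typing import Callable


def whitespace_token_count(text: str) -> int:
    """Crude stand-in counter (assumption): one token per whitespace-separated word."""
    return len(text.split())


def fits_context(
    text: str,
    count_tokens: Callable[[str], int],
    max_context: int,
    reserved_for_completion: int = 0,
) -> bool:
    """Return True if the prompt plus the reserved completion budget fits the limit."""
    return count_tokens(text) + reserved_for_completion <= max_context


# 3 prompt tokens + 4 reserved = 7, which fits a limit of 8 but not a limit of 6.
roomy = fits_context("what a joke", whitespace_token_count, max_context=8, reserved_for_completion=4)
tight = fits_context("what a joke", whitespace_token_count, max_context=6, reserved_for_completion=4)
print(roomy, tight)
```

Reserving room for the completion matters because the prompt and the generated text share the same context window.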

### Caching
LangChain can also cache LLM calls. Note that caching currently applies only to individual LLM calls.

In [3]:
import langchain
from langchain.cache import InMemoryCache
langchain.llm_cache = InMemoryCache()

In [4]:
# To make the caching really obvious, let's use a slower model.
llm = OpenAI(model_name="text-davinci-002", n=2, best_of=2)

In [5]:
%%time
# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")

CPU times: user 31.2 ms, sys: 11.8 ms, total: 43.1 ms
Wall time: 1.75 s


'\n\nWhy did the chicken cross the road?\n\nTo get to the other side!'

In [6]:
%%time
# The second time, it is already in the cache, so it returns much faster
llm("Tell me a joke")

CPU times: user 51 µs, sys: 1 µs, total: 52 µs
Wall time: 67.2 µs


'\n\nWhy did the chicken cross the road?\n\nTo get to the other side!'
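To make the speedup less mysterious, here is a minimal sketch of the idea behind an in-memory cache: a dictionary keyed on the prompt plus the model settings, consulted before the model is called. This is not LangChain's implementation; `DictCache`, `cached_call`, and `slow_fake_model` are hypothetical names, and the fake model simply sleeps to imitate network latency.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class DictCache:
    """Toy cache keyed on (prompt, settings). Illustrative only."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], str] = {}

    def lookup(self, prompt: str, settings: str) -> Optional[str]:
        return self._store.get((prompt, settings))

    def update(self, prompt: str, settings: str, response: str) -> None:
        self._store[(prompt, settings)] = response


def cached_call(cache: DictCache, model: Callable[[str], str], prompt: str, settings: str) -> str:
    """Return the cached response if present, otherwise call the model and store the result."""
    hit = cache.lookup(prompt, settings)
    if hit is not None:
        return hit
    response = model(prompt)
    cache.update(prompt, settings, response)
    return response


calls = {"count": 0}


def slow_fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption): sleeps, then echoes the prompt."""
    calls["count"] += 1
    time.sleep(0.05)
    return f"echo: {prompt}"


cache = DictCache()
first = cached_call(cache, slow_fake_model, "Tell me a joke", "model=a")   # miss: calls the model
second = cached_call(cache, slow_fake_model, "Tell me a joke", "model=a")  # hit: served from the dict
other = cached_call(cache, slow_fake_model, "Tell me a joke", "model=b")   # different settings: miss
```

Including the settings in the key is a deliberate choice in this sketch: the same prompt sent to a differently configured model should not reuse an old answer.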

In [7]:
# We can do the same thing with a SQLite cache
from langchain.cache import SQLiteCache
langchain.llm_cache = SQLiteCache(database_path=".langchain.db")

In [8]:
%%time
# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")

CPU times: user 26.6 ms, sys: 11.2 ms, total: 37.7 ms
Wall time: 1.89 s


'\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'

In [9]:
%%time
# The second time, it is already in the cache, so it returns much faster
llm("Tell me a joke")

CPU times: user 2.69 ms, sys: 1.57 ms, total: 4.27 ms
Wall time: 2.73 ms


'\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'
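The cached call above is slower than the in-memory one because each lookup goes through a database, but a file-backed database keeps its entries after the process exits. The following is a rough sketch of that idea using Python's standard `sqlite3` module. It is not LangChain's `SQLiteCache`; the class name `TinySQLiteCache`, the table name, and the column names are all assumptions made for illustration.

```python
import sqlite3
from typing import Optional


class TinySQLiteCache:
    """Toy SQLite-backed cache keyed on (prompt, settings). Illustrative only."""

    def __init__(self, database_path: str = ":memory:") -> None:
        # ":memory:" keeps the example self-contained; pass a file path to persist entries.
        self._conn = sqlite3.connect(database_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS llm_cache ("
            "prompt TEXT NOT NULL, settings TEXT NOT NULL, response TEXT NOT NULL, "
            "PRIMARY KEY (prompt, settings))"
        )
        self._conn.commit()

    def lookup(self, prompt: str, settings: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT response FROM llm_cache WHERE prompt = ? AND settings = ?",
            (prompt, settings),
        ).fetchone()
        return row[0] if row else None

    def update(self, prompt: str, settings: str, response: str) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO llm_cache (prompt, settings, response) VALUES (?, ?, ?)",
            (prompt, settings, response),
        )
        self._conn.commit()


sqlite_cache = TinySQLiteCache()
before = sqlite_cache.lookup("Tell me a joke", "model=a")  # None: nothing stored yet
sqlite_cache.update("Tell me a joke", "model=a", "To get to the other side.")
after = sqlite_cache.lookup("Tell me a joke", "model=a")   # the stored response
```

The composite primary key plays the same role as the dictionary key in an in-memory cache, and `INSERT OR REPLACE` lets a later response for the same key overwrite the earlier one.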