add few shot example (#148)

docs/explanation/core_concepts.md (new file, 27 lines)
@@ -0,0 +1,27 @@
# Core Concepts

This section goes over the core concepts of LangChain.
Understanding these will go a long way in helping you understand the codebase and how to construct chains.

## PromptTemplates

PromptTemplates generically have a `format` method that takes in variables and returns a formatted string.
The simplest implementation of this is to have a template string with some variables in it, and then format it with the incoming variables.
More complex implementations dynamically construct the template string from few shot examples, etc.
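
As an illustrative sketch (not the actual LangChain class), the core of the idea is just string formatting:

```python
# A minimal sketch of the idea: a template string plus incoming variables.
template = "Tell me a joke about {thing}"


def format_prompt(**kwargs: str) -> str:
    """Substitute the incoming variables into the template string."""
    return template.format(**kwargs)


print(format_prompt(thing="chickens"))  # -> Tell me a joke about chickens
```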

For a more detailed explanation of how LangChain approaches prompts and prompt templates, see [here](prompts.md).

## LLMs

Wrappers around Large Language Models (in particular, the `generate` ability of large language models) are core to the functionality of LangChain.
These wrappers are classes that are callable: they take in an input string, and return the generated output string.
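
A hedged sketch of that callable shape (the class here is illustrative, not a real LangChain wrapper):

```python
class FakeLLM:
    """Illustrative callable wrapper; a real one calls a model's generate endpoint."""

    def __call__(self, prompt: str) -> str:
        # Stand-in for an actual model call.
        return f"Generated completion for: {prompt}"


llm = FakeLLM()
print(llm("Tell me a joke about chickens"))
```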

## Embeddings

These classes are very similar to the LLM classes in that they are wrappers around models,
but rather than return a string they return an embedding (list of floats). These are particularly useful when
implementing semantic search functionality. They expose separate methods for embedding queries versus embedding documents.
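
A sketch of that interface (the method names and toy vectors are assumptions for illustration):

```python
from typing import List


class FakeEmbeddings:
    """Illustrative embeddings wrapper; a real one calls an embedding model."""

    def embed_query(self, text: str) -> List[float]:
        # Stand-in for a learned embedding: a real model returns a dense vector.
        return [float((hash(text) >> shift) % 100) / 100.0 for shift in (0, 8, 16)]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Kept separate because embedding documents may differ from embedding queries.
        return [self.embed_query(text) for text in texts]
```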

## Vectorstores

These are datastores that store documents. They expose a method for passing in a string and finding similar documents.
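
A minimal sketch of that method, assuming cosine similarity over embeddings (all names here are illustrative):

```python
import math
from typing import Callable, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


class FakeVectorstore:
    """Illustrative datastore holding documents alongside their embeddings."""

    def __init__(self, docs: List[str], embed: Callable[[str], List[float]]):
        self._embed = embed
        self._docs: List[Tuple[str, List[float]]] = [(doc, embed(doc)) for doc in docs]

    def similarity_search(self, query: str, k: int = 4) -> List[str]:
        """Return the k stored documents most similar to the query string."""
        query_vec = self._embed(query)
        ranked = sorted(self._docs, key=lambda pair: cosine_similarity(pair[1], query_vec), reverse=True)
        return [doc for doc, _ in ranked[:k]]
```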

## Chains

These are pipelines that combine multiple of the above ideas.
They vary greatly in complexity, and are a combination of generic, highly configurable pipelines and more narrow (but usually more complex) pipelines.
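
The simplest such pipeline combines a prompt template with an LLM. A sketch, reusing the toy wrapper idea from above:

```python
class FakeLLM:
    """Toy LLM wrapper, as in the LLMs sketch above."""

    def __call__(self, prompt: str) -> str:
        return f"Generated completion for: {prompt}"


class FakeLLMChain:
    """Toy chain: format the prompt template, then call the LLM with the result."""

    def __init__(self, template: str, llm: FakeLLM):
        self.template = template
        self.llm = llm

    def run(self, **kwargs: str) -> str:
        prompt = self.template.format(**kwargs)  # fill in the input variables
        return self.llm(prompt)


chain = FakeLLMChain("Tell me a joke about {thing}", FakeLLM())
print(chain.run(thing="chickens"))
```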

docs/explanation/glossary.md (new file, 74 lines)
@@ -0,0 +1,74 @@
# Glossary

This is a collection of terminology commonly used when developing LLM applications.
It contains references to external papers or sources where each concept was first introduced,
as well as to places in LangChain where the concept is used.

### Chain of Thought Prompting

A prompting technique used to encourage the model to generate a series of intermediate reasoning steps.
A less formal way to induce this behavior is to include “Let’s think step-by-step” in the prompt.

Resources:
- [Chain-of-Thought Paper](https://arxiv.org/pdf/2201.11903.pdf)
- [Step-by-Step Paper](https://arxiv.org/abs/2112.00114)

### Action Plan Generation

A prompting technique that uses a language model to generate actions to take.
The results of these actions can then be fed back into the language model to generate a subsequent action.

Resources:
- [WebGPT Paper](https://arxiv.org/pdf/2112.09332.pdf)
- [SayCan Paper](https://say-can.github.io/assets/palm_saycan.pdf)

### ReAct Prompting

A prompting technique that combines Chain-of-Thought prompting with action plan generation.
This induces the model to think about what action to take, then take it.

Resources:
- [Paper](https://arxiv.org/pdf/2210.03629.pdf)
- [LangChain Example](https://github.com/hwchase17/langchain/blob/master/examples/react.ipynb)

### Self-ask

A prompting method that builds on top of chain-of-thought prompting.
In this method, the model explicitly asks itself follow-up questions, which are then answered by an external search engine.

Resources:
- [Paper](https://ofir.io/self-ask.pdf)
- [LangChain Example](https://github.com/hwchase17/langchain/blob/master/examples/self_ask_with_search.ipynb)

### Prompt Chaining

Combining multiple LLM calls together, with the output of one step being the input to the next.

Resources:
- [PromptChainer Paper](https://arxiv.org/pdf/2203.06566.pdf)
- [Language Model Cascades](https://arxiv.org/abs/2207.10342)
- [ICE Primer Book](https://primer.ought.org/)
- [Socratic Models](https://socraticmodels.github.io/)

### Memetic Proxy

Encouraging the LLM to respond in a certain way by framing the discussion in a context that the model knows of and that will result in that type of response. For example, as a conversation between a student and a teacher.

Resources:
- [Paper](https://arxiv.org/pdf/2102.07350.pdf)

### Self Consistency

A decoding strategy that samples a diverse set of reasoning paths and then selects the most consistent answer.
It is most effective when combined with Chain-of-Thought prompting.

Resources:
- [Paper](https://arxiv.org/pdf/2203.11171.pdf)

### Inception

Also called “First Person Instruction”.
Encouraging the model to think a certain way by including the start of the model’s response in the prompt.

Resources:
- [Example](https://twitter.com/goodside/status/1583262455207460865?s=20&t=8Hz7XBnK1OF8siQrxxCIGQ)

docs/explanation/prompts.md (new file, 138 lines)
@@ -0,0 +1,138 @@
# Prompts

Prompts and all the tooling around them are integral to working with language models, and therefore
really important to get right, from both an interface and a naming perspective. This is a "design doc"
of sorts explaining how we think about prompts and the related concepts, and why the interfaces
for working with them are the way they are in LangChain.

For a more code-based walkthrough of all these concepts, check out our example [here](/examples/prompts/prompt_management).

## Prompt

### Concept
A prompt is the final string that gets fed into the language model.

### LangChain Implementation
In LangChain a prompt is represented as just a string.

## Input Variables

### Concept
Input variables are parts of a prompt that are not known until runtime, e.g. they could be user provided.

### LangChain Implementation
In LangChain input variables are just represented as a dictionary of key-value pairs, with the key
being the variable name and the value being the variable value.
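
For example, for the joke prompt used later in this doc:

```python
# The key is the variable name; the value is the runtime-provided value.
input_variables = {"thing": "chickens"}
```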

## Examples

### Concept
Examples are datapoints that can be used to teach the model what to do. These can be included
in prompts to better instruct the model on what to do.

### LangChain Implementation
In LangChain examples are represented as a dictionary of key-value pairs, with the key being the feature
(or label) name, and the value being the feature (or label) value.
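
For instance, a single example for the joke application used later in this doc might be:

```python
# Keys are the feature/label names; values are the corresponding text.
example = {"concept": "chicken", "joke": "Why did the chicken cross the road?"}
```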

## Example Selector

### Concept
If you have a large number of examples, you may need to select which ones to include in the prompt. The
Example Selector is the class responsible for doing so.

### LangChain Implementation

#### BaseExampleSelector
In LangChain there is a BaseExampleSelector that exposes the following interface

```python
from typing import List


class BaseExampleSelector:
    """Interface for selecting which examples to include in a prompt."""

    def select_examples(self, input_variables: dict) -> List[dict]:
        """Select which examples (each a dictionary) to use, given the input variables."""
```

Notice that it does not take in examples at runtime when it's selecting them - those are assumed to have been provided ahead of time.

#### LengthExampleSelector
The LengthExampleSelector selects examples based on the length of the input variables.
This is useful when you are worried about constructing a prompt that will go over the length
of the context window. For longer inputs, it will select fewer examples to include, while for
shorter inputs it will select more.
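
A sketch of the idea (a simplified stand-in, not the actual LangChain class; the word-count budget is an assumption):

```python
from typing import List


class SketchLengthExampleSelector:
    """Pick examples until a rough word budget for the prompt is used up."""

    def __init__(self, examples: List[dict], max_length: int = 100):
        self.examples = examples
        self.max_length = max_length

    def select_examples(self, input_variables: dict) -> List[dict]:
        # Longer inputs leave less of the budget, so fewer examples get selected.
        remaining = self.max_length - len(" ".join(input_variables.values()).split())
        selected = []
        for example in self.examples:
            example_length = len(" ".join(str(v) for v in example.values()).split())
            if example_length > remaining:
                break
            selected.append(example)
            remaining -= example_length
        return selected
```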

#### SemanticSimilarityExampleSelector
The SemanticSimilarityExampleSelector selects examples based on which examples are most similar
to the inputs. It does this by finding the examples with the embeddings that have the greatest
cosine similarity with the inputs.
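
A sketch of that selection logic (the embedding function is left abstract; the class and parameter names are illustrative):

```python
import math
from typing import Callable, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


class SketchSemanticSimilarityExampleSelector:
    """Pick the k examples whose embeddings are closest to the input's embedding."""

    def __init__(self, examples: List[dict], embed: Callable[[str], List[float]], k: int = 4):
        self.embed = embed
        self.k = k
        # Embed each example once, up front.
        self.examples = [(ex, embed(" ".join(str(v) for v in ex.values()))) for ex in examples]

    def select_examples(self, input_variables: dict) -> List[dict]:
        query_vec = self.embed(" ".join(input_variables.values()))
        ranked = sorted(self.examples, key=lambda pair: cosine_similarity(pair[1], query_vec), reverse=True)
        return [ex for ex, _ in ranked[: self.k]]
```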

## Prompt Template

### Concept
The prompts that get fed into the language model are nearly always not hardcoded, but rather a combination
of parts, including Examples and Input Variables. A prompt template is responsible
for taking those parts and constructing a prompt.

### LangChain Implementation

#### BasePromptTemplate
In LangChain there is a BasePromptTemplate that exposes the following interface

```python
from typing import List


class BasePromptTemplate:

    @property
    def input_variables(self) -> List[str]:
        """The variable names this prompt template expects."""

    def format(self, **kwargs) -> str:
        """Take in input variables and return the formatted prompt."""
```

The input variables property provides introspection into the PromptTemplate, exposing which inputs it expects. The format method takes in input variables and returns the formatted prompt.

#### PromptTemplate
The PromptTemplate implementation is the simplest form of a prompt template. It consists of three parts:
- input variables: which variables this prompt template expects
- template: the template into which these variables will be formatted
- template format: the format of the template (eg mustache, python f-strings, etc)

For example, if I was making an application that took a user-inputted concept and asked a language model
to make a joke about that concept, I might use this specification for the PromptTemplate:
- input variables = `["thing"]`
- template = `"Tell me a joke about {thing}"`
- template format = `"f-string"`
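
In code, that specification might be constructed and used like this (a sketch mirroring the PromptTemplate constructor call shown later in this doc; the f-string format is assumed to be the default):

```python
prompt = PromptTemplate(
    input_variables=["thing"],
    template="Tell me a joke about {thing}",
)

print(prompt.format(thing="chickens"))  # -> Tell me a joke about chickens
```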

#### FewShotPromptTemplate
A FewShotPromptTemplate is a Prompt Template that includes some examples. It consists of:
- examples OR example selector: a list of examples to use, or an Example Selector to select which examples to use
- example prompt template: a Prompt Template responsible for taking an individual example (a dictionary) and turning it into a string to be used in the prompt
- prefix: the template put in the prompt before listing any examples
- suffix: the template put in the prompt after listing any examples
- example separator: a string separator which is used to join the prefix, the examples, and the suffix together

For example, if I wanted to turn the above example into a few shot prompt, this is what it would
look like:

First I would collect some examples, like:
```python
examples = [
    {"concept": "chicken", "joke": "Why did the chicken cross the road?"},
    ...
]
```

I would then make sure to define a prompt template for how each example should be formatted
when inserted into the prompt:
```python
prompt_template = PromptTemplate(
    input_variables=["concept", "joke"],
    template="Tell me a joke about {concept}\n{joke}"
)
```

Then, I would define the components as:
- examples: The above examples
- example_prompt: The above example prompt
- prefix = `"You are a comedian telling jokes on demand."`
- suffix = `"Tell me a joke about {concept}"`
- input variables = `["concept"]`
- template format = `"f-string"`
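
Putting those components together, a sketch of the assembled template (assuming a FewShotPromptTemplate constructor that accepts these components under these names):

```python
few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=prompt_template,
    prefix="You are a comedian telling jokes on demand.",
    suffix="Tell me a joke about {concept}",
    input_variables=["concept"],
    example_separator="\n\n",
)

print(few_shot_prompt.format(concept="programming"))
```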