mirror of
https://github.com/hwchase17/langchain.git
synced 2025-09-04 04:28:58 +00:00
big docs refactor (#1978)
Co-authored-by: Ankush Gola <ankush.gola@gmail.com>
@@ -1,12 +0,0 @@
|
||||
# Agents
|
||||
|
||||
Agents are systems that use a language model to interact with other tools.
|
||||
These can be used to do more grounded question/answering, interact with APIs, or even take actions.
|
||||
These agents can be used to power the next generation of personal assistants -
|
||||
systems that intelligently understand what you mean, and then can take actions to help you accomplish your goal.
|
||||
|
||||
Agents are a core use of LangChain - so much so that there is a whole module dedicated to them.
|
||||
Therefore, we recommend that you check out that documentation for detailed instruction on how to work
|
||||
with them.
|
||||
|
||||
- [Agent Documentation](../modules/agents.rst)
|
24
docs/use_cases/apis.md
Normal file
@@ -0,0 +1,24 @@
# Interacting with APIs

.. note::
    `Conceptual Guide <https://docs.langchain.com/docs/use-cases/apis>`_


Lots of data and information is stored behind APIs.
This page covers all resources available in LangChain for working with APIs.

## Chains

If you are just getting started and you have relatively simple APIs, you should get started with chains.
Chains are a sequence of predetermined steps, so they are a good starting point: they give you more control and let you
better understand what is happening.

- [API Chain](../modules/chains/examples/api.ipynb)

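As a minimal sketch of the chain-based approach (assuming an OpenAI API key is configured; the question and the bundled Open-Meteo docs are just for illustration), usage looks roughly like:

```python
from langchain.llms import OpenAI
from langchain.chains import APIChain
from langchain.chains.api import open_meteo_docs

llm = OpenAI(temperature=0)
# The chain reads the API docs, constructs a request URL, calls the API,
# and uses the LLM to turn the raw response into an answer.
chain = APIChain.from_llm_and_api_docs(llm, open_meteo_docs.OPEN_METEO_DOCS, verbose=True)
chain.run("What is the weather like right now in Munich, Germany, in degrees Fahrenheit?")
```
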
## Agents

Agents are more complex, and involve multiple queries to the LLM to understand what to do.
The downside of agents is that you have less control. The upside is that they are more powerful,
which allows you to use them on larger and more complex schemas.

- [OpenAPI Agent](../modules/agents/toolkits/examples/openapi.ipynb)
@@ -1,15 +1,19 @@
# Chatbots

.. note::
    `Conceptual Guide <https://docs.langchain.com/docs/use-cases/chatbots>`_


Since language models are good at producing text, that makes them ideal for creating chatbots.
Aside from the base prompts/LLMs, an important concept to know for chatbots is `memory`.
Most chat-based applications rely on remembering what happened in previous interactions, which is what `memory` is designed to help with.

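As a minimal sketch of memory in action (assuming an OpenAI API key is configured), a `ConversationChain` keeps a buffer of prior turns and replays it into each new prompt:

```python
from langchain import ConversationChain, OpenAI

conversation = ConversationChain(llm=OpenAI(temperature=0), verbose=True)
conversation.predict(input="Hi there! My name is Harrison.")
conversation.predict(input="What is my name?")  # answerable only because the buffer is replayed
```
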
The following resources exist:
- [ChatGPT Clone](../modules/memory/examples/chatgpt_clone.ipynb): A notebook walking through how to recreate a ChatGPT-like experience with LangChain.
- [ChatGPT Clone](../modules/agents/agent_executors/examples/chatgpt_clone.ipynb): A notebook walking through how to recreate a ChatGPT-like experience with LangChain.
- [Conversation Memory](../modules/memory/getting_started.ipynb): A notebook walking through how to use different types of conversational memory.
- [Conversation Agent](../modules/memory/examples/conversational_agent.ipynb): A notebook walking through how to create an agent optimized for conversation.
- [Conversation Agent](../modules/agents/agents/examples/conversational_agent.ipynb): A notebook walking through how to create an agent optimized for conversation.


Additional related resources include:
- [Memory Key Concepts](../modules/memory/key_concepts.md): Explanation of key concepts related to memory.
- [Memory Key Concepts](../modules/memory.rst): Explanation of key concepts related to memory.
- [Memory Examples](../modules/memory/how_to_guides.rst): A collection of how-to examples for working with memory.
@@ -1,96 +0,0 @@
|
||||
# Data Augmented Generation
|
||||
|
||||
## Overview
|
||||
|
||||
Language models are trained on large amounts of unstructured data, which makes them fantastic at general purpose text generation. However, there are many instances where you may want the language model to generate text based not on generic data but rather on specific data. Some common examples of this include:
|
||||
|
||||
- Summarization of a specific piece of text (a website, a private document, etc.)
|
||||
- Question answering over a specific piece of text (a website, a private document, etc.)
|
||||
- Question answering over multiple pieces of text (multiple websites, multiple private documents, etc.)
|
||||
- Using the results of some external call to an API (results from a SQL query, etc.)
|
||||
|
||||
All of these examples are instances when you do not want the LLM to generate text based solely on the data it was trained over, but rather you want it to incorporate other external data in some way. At a high level, this process can be broken down into two steps:
|
||||
|
||||
1. Fetching: Fetching the relevant data to include.
|
||||
2. Augmenting: Passing the data in as context to the LLM.
|
||||
|
||||
This guide is intended to provide an overview of how to do this. This includes an overview of the literature, as well as common tools, abstractions and chains for doing this.
|
||||
|
||||
## Related Literature
|
||||
There are a lot of related papers in this area. Most of them are focused on end-to-end methods that optimize the fetching of the relevant data as well as passing it in as context. These are a few of the papers that are particularly relevant:
|
||||
|
||||
**[RAG](https://arxiv.org/abs/2005.11401):** Retrieval Augmented Generation.
|
||||
This paper introduces RAG models where the parametric memory is a pre-trained seq2seq model and the non-parametric memory is a dense vector index of Wikipedia, accessed with a pre-trained neural retriever.
|
||||
|
||||
**[REALM](https://arxiv.org/abs/2002.08909):** Retrieval-Augmented Language Model Pre-Training.
|
||||
To capture knowledge in a more modular and interpretable way, this paper augments language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia, used during pre-training, fine-tuning and inference.
|
||||
|
||||
**[HayStack](https://haystack.deepset.ai/):** This is not a paper, but rather an open source library aimed at semantic search, question answering, summarization, and document ranking for a wide range of NLP applications. The underpinnings of this library are focused on the same `fetching` and `augmenting` concepts discussed here, and incorporate some methods in the above papers.
|
||||
|
||||
These papers/open-source projects are centered around retrieval of documents, which is important for question-answering tasks over a large corpus of documents (which is how they are evaluated). However, we use the terminology of `Data Augmented Generation` to highlight that retrieval from some document store is only one possible way of fetching relevant data to include. Other methods to fetch relevant data could involve hitting an API, querying a database, or just working with user provided data (eg a specific document that they want to summarize).
|
||||
|
||||
Let's now deep dive on the two steps involved: fetching and augmenting.
|
||||
|
||||
## Fetching
|
||||
There are many ways to fetch relevant data to pass in as context to a LM, and these methods largely depend
|
||||
on the use case.
|
||||
|
||||
**User provided:** In some cases, the user may provide the relevant data, and no algorithm for fetching is needed.
|
||||
An example of this is for summarization of specific documents: the user will provide the document to be summarized,
|
||||
and task the language model with summarizing it.
|
||||
|
||||
**Document Retrieval:** One of the more common use cases involves fetching relevant documents or pieces of text from
|
||||
a large corpus of data. A common example of this is question answering over a private collection of documents.
|
||||
|
||||
**API Querying:** Another common way to fetch data is from an API query. One example of this is WebGPT like system,
|
||||
where you first query Google (or another search API) for relevant information, and then those results are used in
|
||||
the generation step. Another example could be querying a structured database (like SQL) and then using a language model
|
||||
to synthesize those results.
|
||||
|
||||
There are two big issues to deal with in fetching:
|
||||
|
||||
1. Fetching small enough pieces of information
|
||||
2. Not fetching too many pieces of information (e.g. fetching only the most relevant pieces)
|
||||
|
||||
### Text Splitting
|
||||
One big issue with all of these methods is how to make sure you are working with pieces of text that are not too large.
|
||||
This is important because most language models have a context length, and so you cannot (yet) just pass a
|
||||
large document in as context. Therefore, it is important to not only fetch relevant data but also make sure it is in
|
||||
small enough chunks.
|
||||
|
||||
LangChain provides some utilities to help with splitting up larger pieces of data. This comes in the form of the TextSplitter class.
|
||||
The class takes in a document and splits it up into chunks, with several parameters that control the
|
||||
size of the chunks as well as the overlap in the chunks (important for maintaining context).
|
||||
See [this walkthrough](../modules/indexes/examples/textsplitter.ipynb) for more information.
|
||||
|
||||
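As a minimal sketch (the parameter values are illustrative), splitting a long string into overlapping chunks looks like:

```python
from langchain.text_splitter import CharacterTextSplitter

# Split on blank lines, targeting ~1000-character chunks with 200 characters
# of overlap so that neighboring chunks share context.
text_splitter = CharacterTextSplitter(separator="\n\n", chunk_size=1000, chunk_overlap=200)
texts = text_splitter.split_text(long_document_text)  # long_document_text: any large string
```
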
### Relevant Documents
A second large issue related to fetching data is to make sure you are not fetching too many documents, and are only fetching
the documents that are relevant to the query/question at hand. There are a few ways to deal with this.

One concrete example of this is vector stores for document retrieval, often used for semantic search or question answering.
With this method, larger documents are split up into
smaller chunks and then each chunk of text is passed to an embedding function which creates an embedding for that piece of text.
Those embeddings are then stored in a database. When a new search query or question comes in, an embedding is
created for that query/question and then documents with embeddings most similar to that embedding are fetched.
Examples of vector database companies include [Pinecone](https://www.pinecone.io/) and [Weaviate](https://weaviate.io/).

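A minimal sketch of this flow, assuming OpenAI embeddings and a local FAISS index (any embedding function and vector store could be substituted, and `texts` is the list of chunks from the splitting step):

```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Embed each chunk and index it; a query is embedded with the same function
# and the chunks with the most similar embeddings are returned.
db = FAISS.from_texts(texts, OpenAIEmbeddings())
relevant_docs = db.similarity_search("What does the document say about pricing?", k=4)
```
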
Although this is perhaps the most common way of doing document retrieval, people are starting to think about alternative
data structures and indexing techniques specifically for working with language models. For a leading example of this,
check out [LlamaIndex](https://github.com/jerryjliu/llama_index) - a collection of data structures created by and optimized
for language models.

## Augmenting
So you've fetched your relevant data - now what? How do you pass it to the language model in a format it can understand?
For a detailed overview of the different ways of doing so, and the tradeoffs between them, please see
[this documentation](../modules/indexes/combine_docs.md).

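As a minimal sketch of the simplest augmentation strategy (the `stuff` chain, which stuffs all fetched documents into a single prompt; `map_reduce` and `refine` are alternatives when the documents do not fit in one context window):

```python
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain

chain = load_qa_chain(OpenAI(temperature=0), chain_type="stuff")
# relevant_docs: the documents fetched in the previous step
answer = chain.run(input_documents=relevant_docs, question="What does the document say about pricing?")
```
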
## Use Cases
LangChain supports all three of the fetching methods described above for augmenting LLMs with external data.
These methods can be used to underpin several common use cases, which are discussed below;
for each of these use cases, all three fetching methods are supported.
It is important to note that a large part of these implementations is the prompts
that are used. We provide default prompts for all three use cases, but these can be configured,
in case you discover a prompt that works better for your specific application.

- [Question-Answering](question_answering.md)
- [Summarization](summarization.md)
@@ -1,6 +1,10 @@
Evaluation
==============

.. note::
    `Conceptual Guide <https://docs.langchain.com/docs/use-cases/evaluation>`_


This section of documentation covers how we approach and think about evaluation in LangChain.
It covers both the evaluation of LangChain's internal chains/agents and how we recommend that people building on top of LangChain approach evaluation.

@@ -1,5 +1,9 @@
# Extraction

.. note::
    `Conceptual Guide <https://docs.langchain.com/docs/use-cases/extraction>`_


Most APIs and databases still deal with structured information.
Therefore, in order to work with them more effectively, it can be useful to extract structured information from text.
Examples of this include:
@@ -8,7 +12,7 @@ Examples of this include:
- Extracting multiple rows to insert into a database from a long document
- Extracting the correct API parameters from a user query

This work is closely related to [output parsing](../modules/prompts/examples/output_parsers.ipynb).
This work is closely related to [output parsing](../modules/prompts/output_parsers.rst).
Output parsers are responsible for instructing the LLM to respond in a specific format.
In this case, the output parsers specify the format of the data you would like to extract from the document.
Then, in addition to the output format instructions, the prompt should also contain the data you would like to extract information from.
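
To make this concrete, here is a minimal sketch; the prompt wording is illustrative and simply stands in for an output parser's format instructions, with the document text appended as the data to extract from:

```python
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate

# Hypothetical extraction prompt: format instructions first, then the data.
template = """Extract the name and email address of every person mentioned in the text.
Return a JSON list of objects with "name" and "email" keys.

Text:
{text}"""

prompt = PromptTemplate(template=template, input_variables=["text"])
llm = OpenAI(temperature=0)
result = llm(prompt.format(text="You can reach Jane Doe at jane.doe@example.com."))
```
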
@@ -1,157 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "f5d249ee",
|
||||
"metadata": {
|
||||
"pycharm": {
|
||||
"name": "#%% md\n"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"# Generate Examples\n",
|
||||
"\n",
|
||||
"This notebook shows how to use LangChain to generate more examples similar to the ones you already have."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "1685fa2f",
|
||||
"metadata": {
|
||||
"pycharm": {
|
||||
"name": "#%%\n"
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.llms.openai import OpenAI\n",
|
||||
"from langchain.example_generator import generate_example\n",
|
||||
"from langchain.prompts import PromptTemplate"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "334ef4f7",
|
||||
"metadata": {
|
||||
"pycharm": {
|
||||
"name": "#%%\n"
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Use examples from ReAct\n",
|
||||
"examples = [\n",
|
||||
" {\n",
|
||||
" \"question\": \"What is the elevation range for the area that the eastern sector of the Colorado orogeny extends into?\",\n",
|
||||
" \"answer\": \"Thought 1: I need to search Colorado orogeny, find the area that the eastern sector of the Colorado orogeny extends into, then find the elevation range of that area.\\nAction 1: Search[Colorado orogeny]\\nObservation 1: The Colorado orogeny was an episode of mountain building (an orogeny) in Colorado and surrounding areas.\\nThought 2: It does not mention the eastern sector. So I need to look up eastern sector.\\nAction 2: Lookup[eastern sector]\\nObservation 2: (Result 1 / 1) The eastern sector extends into the High Plains and is called the Central Plains orogeny.\\nThought 3: The eastern sector of Colorado orogeny extends into the High Plains. So I need to search High Plains and find its elevation range.\\nAction 3: Search[High Plains]\\nObservation 3: High Plains refers to one of two distinct land regions\\nThought 4: I need to instead search High Plains (United States).\\nAction 4: Search[High Plains (United States)]\\nObservation 4: The High Plains are a subregion of the Great Plains. From east to west, the High Plains rise in elevation from around 1,800 to 7,000 ft (550 to 2,130 m).[3]\\nThought 5: High Plains rise in elevation from around 1,800 to 7,000 ft, so the answer is 1,800 to 7,000 ft.\\nAction 5: Finish[1,800 to 7,000 ft]\"\n",
|
||||
" },\n",
|
||||
" {\n",
|
||||
" \"question\": \"Musician and satirist Allie Goertz wrote a song about the \\\"The Simpsons\\\" character Milhouse, who Matt Groening named after who?\",\n",
|
||||
" \"answer\": \"Thought 1: The question simplifies to \\\"The Simpsons\\\" character Milhouse is named after who. I only need to search Milhouse and find who it is named after.\\nAction 1: Search[Milhouse]\\nObservation 1: Milhouse Mussolini Van Houten is a recurring character in the Fox animated television series The Simpsons voiced by Pamela Hayden and created by Matt Groening.\\nThought 2: The paragraph does not tell who Milhouse is named after, maybe I can look up \\\"named after\\\".\\nAction 2: Lookup[named after]\\nObservation 2: (Result 1 / 1) Milhouse was named after U.S. president Richard Nixon, whose middle name was Milhous.\\nThought 3: Milhouse was named after U.S. president Richard Nixon, so the answer is Richard Nixon.\\nAction 3: Finish[Richard Nixon]\"\n",
|
||||
" },\n",
|
||||
" {\n",
|
||||
" \"question\": \"Which documentary is about Finnish rock groups, Adam Clayton Powell or The Saimaa Gesture?\",\n",
|
||||
" \"answer\": \"Thought 1: I need to search Adam Clayton Powell and The Saimaa Gesture, and find which documentary is about Finnish rock groups.\\nAction 1: Search[Adam Clayton Powell]\\nObservation 1 Could not find [Adam Clayton Powell]. Similar: [’Adam Clayton Powell III’, ’Seventh Avenue (Manhattan)’, ’Adam Clayton Powell Jr. State Office Building’, ’Isabel Washington Powell’, ’Adam Powell’, ’Adam Clayton Powell (film)’, ’Giancarlo Esposito’].\\nThought 2: To find the documentary, I can search Adam Clayton Powell (film).\\nAction 2: Search[Adam Clayton Powell (film)]\\nObservation 2: Adam Clayton Powell is a 1989 American documentary film directed by Richard Kilberg. The film is about the rise and fall of influential African-American politician Adam Clayton Powell Jr.[3][4] It was later aired as part of the PBS series The American Experience.\\nThought 3: Adam Clayton Powell (film) is a documentary about an African-American politician, not Finnish rock groups. So the documentary about Finnish rock groups must instead be The Saimaa Gesture.\\nAction 3: Finish[The Saimaa Gesture]\"\n",
|
||||
" },\n",
|
||||
" {\n",
|
||||
" \"question\": \"What profession does Nicholas Ray and Elia Kazan have in common?\",\n",
|
||||
" \"answer\": \"Thought 1: I need to search Nicholas Ray and Elia Kazan, find their professions, then find the profession they have in common.\\nAction 1: Search[Nicholas Ray]\\nObservation 1: Nicholas Ray (born Raymond Nicholas Kienzle Jr., August 7, 1911 - June 16, 1979) was an American film director, screenwriter, and actor best known for the 1955 film Rebel Without a Cause.\\nThought 2: Professions of Nicholas Ray are director, screenwriter, and actor. I need to search Elia Kazan next and find his professions.\\nAction 2: Search[Elia Kazan]\\nObservation 2: Elia Kazan was an American film and theatre director, producer, screenwriter and actor.\\nThought 3: Professions of Elia Kazan are director, producer, screenwriter, and actor. So profession Nicholas Ray and Elia Kazan have in common is director, screenwriter, and actor.\\nAction 3: Finish[director, screenwriter, actor]\"\n",
|
||||
" },\n",
|
||||
" {\n",
|
||||
" \"question\": \"Which magazine was started first Arthur’s Magazine or First for Women?\",\n",
|
||||
" \"answer\": \"Thought 1: I need to search Arthur’s Magazine and First for Women, and find which was started first.\\nAction 1: Search[Arthur’s Magazine]\\nObservation 1: Arthur’s Magazine (1844-1846) was an American literary periodical published in Philadelphia in the 19th century.\\nThought 2: Arthur’s Magazine was started in 1844. I need to search First for Women next.\\nAction 2: Search[First for Women]\\nObservation 2: First for Women is a woman’s magazine published by Bauer Media Group in the USA.[1] The magazine was started in 1989.\\nThought 3: First for Women was started in 1989. 1844 (Arthur’s Magazine) < 1989 (First for Women), so Arthur’s Magazine was started first.\\nAction 3: Finish[Arthur’s Magazine]\"\n",
|
||||
" },\n",
|
||||
" {\n",
|
||||
" \"question\": \"Were Pavel Urysohn and Leonid Levin known for the same type of work?\",\n",
|
||||
" \"answer\": \"Thought 1: I need to search Pavel Urysohn and Leonid Levin, find their types of work, then find if they are the same.\\nAction 1: Search[Pavel Urysohn]\\nObservation 1: Pavel Samuilovich Urysohn (February 3, 1898 - August 17, 1924) was a Soviet mathematician who is best known for his contributions in dimension theory.\\nThought 2: Pavel Urysohn is a mathematician. I need to search Leonid Levin next and find its type of work.\\nAction 2: Search[Leonid Levin]\\nObservation 2: Leonid Anatolievich Levin is a Soviet-American mathematician and computer scientist.\\nThought 3: Leonid Levin is a mathematician and computer scientist. So Pavel Urysohn and Leonid Levin have the same type of work.\\nAction 3: Finish[yes]\"\n",
|
||||
" }\n",
|
||||
"]\n",
|
||||
"example_template = PromptTemplate(template=\"Question: {question}\\n{answer}\", input_variables=[\"question\", \"answer\"])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "a7bd36bc",
|
||||
"metadata": {
|
||||
"pycharm": {
|
||||
"name": "#%%\n"
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"new_example = generate_example(examples, OpenAI(), example_template)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "e1efb008",
|
||||
"metadata": {
|
||||
"pycharm": {
|
||||
"name": "#%%\n"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"['',\n",
|
||||
" '',\n",
|
||||
" 'Question: What is the difference between the Illinois and Missouri orogeny?',\n",
|
||||
" 'Thought 1: I need to search Illinois and Missouri orogeny, and find the difference between them.',\n",
|
||||
" 'Action 1: Search[Illinois orogeny]',\n",
|
||||
" 'Observation 1: The Illinois orogeny is a hypothesized orogenic event that occurred in the Late Paleozoic either in the Pennsylvanian or Permian period.',\n",
|
||||
" 'Thought 2: The Illinois orogeny is a hypothesized orogenic event. I need to search Missouri orogeny next and find its details.',\n",
|
||||
" 'Action 2: Search[Missouri orogeny]',\n",
|
||||
" 'Observation 2: The Missouri orogeny was a major tectonic event that occurred in the late Pennsylvanian and early Permian period (about 300 million years ago).',\n",
|
||||
" 'Thought 3: The Illinois orogeny is hypothesized and occurred in the Late Paleozoic and the Missouri orogeny was a major tectonic event that occurred in the late Pennsylvanian and early Permian period. So the difference between the Illinois and Missouri orogeny is that the Illinois orogeny is hypothesized and occurred in the Late Paleozoic while the Missouri orogeny was a major']"
|
||||
]
|
||||
},
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"new_example.split('\\n')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "1ed01ba2",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.10.9"
|
||||
},
|
||||
"vscode": {
|
||||
"interpreter": {
|
||||
"hash": "b1677b440931f40d89ef8be7bf03acb108ce003de0ac9b18e8d43753ea2e7103"
|
||||
}
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
@@ -1,256 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "920a3c1a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Model Comparison\n",
|
||||
"\n",
|
||||
"Constructing your language model application will likely involved choosing between many different options of prompts, models, and even chains to use. When doing so, you will want to compare these different options on different inputs in an easy, flexible, and intuitive way. \n",
|
||||
"\n",
|
||||
"LangChain provides the concept of a ModelLaboratory to test out and try different models."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "ab9e95ad",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain import LLMChain, OpenAI, Cohere, HuggingFaceHub, PromptTemplate\n",
|
||||
"from langchain.model_laboratory import ModelLaboratory"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "32cb94e6",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"llms = [\n",
|
||||
" OpenAI(temperature=0), \n",
|
||||
" Cohere(model=\"command-xlarge-20221108\", max_tokens=20, temperature=0), \n",
|
||||
" HuggingFaceHub(repo_id=\"google/flan-t5-xl\", model_kwargs={\"temperature\":1})\n",
|
||||
"]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "14cde09d",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"model_lab = ModelLaboratory.from_llms(llms)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "f186c741",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"\u001b[1mInput:\u001b[0m\n",
|
||||
"What color is a flamingo?\n",
|
||||
"\n",
|
||||
"\u001b[1mOpenAI\u001b[0m\n",
|
||||
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
|
||||
"\u001b[36;1m\u001b[1;3m\n",
|
||||
"\n",
|
||||
"Flamingos are pink.\u001b[0m\n",
|
||||
"\n",
|
||||
"\u001b[1mCohere\u001b[0m\n",
|
||||
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 20, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
|
||||
"\u001b[33;1m\u001b[1;3m\n",
|
||||
"\n",
|
||||
"Pink\u001b[0m\n",
|
||||
"\n",
|
||||
"\u001b[1mHuggingFaceHub\u001b[0m\n",
|
||||
"Params: {'repo_id': 'google/flan-t5-xl', 'temperature': 1}\n",
|
||||
"\u001b[38;5;200m\u001b[1;3mpink\u001b[0m\n",
|
||||
"\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"model_lab.compare(\"What color is a flamingo?\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "248b652a",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"prompt = PromptTemplate(template=\"What is the capital of {state}?\", input_variables=[\"state\"])\n",
|
||||
"model_lab_with_prompt = ModelLaboratory.from_llms(llms, prompt=prompt)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "f64377ac",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"\u001b[1mInput:\u001b[0m\n",
|
||||
"New York\n",
|
||||
"\n",
|
||||
"\u001b[1mOpenAI\u001b[0m\n",
|
||||
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
|
||||
"\u001b[36;1m\u001b[1;3m\n",
|
||||
"\n",
|
||||
"The capital of New York is Albany.\u001b[0m\n",
|
||||
"\n",
|
||||
"\u001b[1mCohere\u001b[0m\n",
|
||||
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 20, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
|
||||
"\u001b[33;1m\u001b[1;3m\n",
|
||||
"\n",
|
||||
"The capital of New York is Albany.\u001b[0m\n",
|
||||
"\n",
|
||||
"\u001b[1mHuggingFaceHub\u001b[0m\n",
|
||||
"Params: {'repo_id': 'google/flan-t5-xl', 'temperature': 1}\n",
|
||||
"\u001b[38;5;200m\u001b[1;3mst john s\u001b[0m\n",
|
||||
"\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"model_lab_with_prompt.compare(\"New York\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "54336dbf",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain import SelfAskWithSearchChain, SerpAPIWrapper\n",
|
||||
"\n",
|
||||
"open_ai_llm = OpenAI(temperature=0)\n",
|
||||
"search = SerpAPIWrapper()\n",
|
||||
"self_ask_with_search_openai = SelfAskWithSearchChain(llm=open_ai_llm, search_chain=search, verbose=True)\n",
|
||||
"\n",
|
||||
"cohere_llm = Cohere(temperature=0, model=\"command-xlarge-20221108\")\n",
|
||||
"search = SerpAPIWrapper()\n",
|
||||
"self_ask_with_search_cohere = SelfAskWithSearchChain(llm=cohere_llm, search_chain=search, verbose=True)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "6a50a9f1",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"chains = [self_ask_with_search_openai, self_ask_with_search_cohere]\n",
|
||||
"names = [str(open_ai_llm), str(cohere_llm)]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"id": "d3549e99",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"model_lab = ModelLaboratory(chains, names=names)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"id": "362f7f57",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"\u001b[1mInput:\u001b[0m\n",
|
||||
"What is the hometown of the reigning men's U.S. Open champion?\n",
|
||||
"\n",
|
||||
"\u001b[1mOpenAI\u001b[0m\n",
|
||||
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"\u001b[1m> Entering new chain...\u001b[0m\n",
|
||||
"What is the hometown of the reigning men's U.S. Open champion?\n",
|
||||
"Are follow up questions needed here:\u001b[32;1m\u001b[1;3m Yes.\n",
|
||||
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\n",
|
||||
"Intermediate answer: \u001b[33;1m\u001b[1;3mCarlos Alcaraz.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
|
||||
"Follow up: Where is Carlos Alcaraz from?\u001b[0m\n",
|
||||
"Intermediate answer: \u001b[33;1m\u001b[1;3mEl Palmar, Spain.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
|
||||
"So the final answer is: El Palmar, Spain\u001b[0m\n",
|
||||
"\u001b[1m> Finished chain.\u001b[0m\n",
|
||||
"\u001b[36;1m\u001b[1;3m\n",
|
||||
"So the final answer is: El Palmar, Spain\u001b[0m\n",
|
||||
"\n",
|
||||
"\u001b[1mCohere\u001b[0m\n",
|
||||
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 256, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"\u001b[1m> Entering new chain...\u001b[0m\n",
|
||||
"What is the hometown of the reigning men's U.S. Open champion?\n",
|
||||
"Are follow up questions needed here:\u001b[32;1m\u001b[1;3m Yes.\n",
|
||||
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\n",
|
||||
"Intermediate answer: \u001b[33;1m\u001b[1;3mCarlos Alcaraz.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
|
||||
"So the final answer is:\n",
|
||||
"\n",
|
||||
"Carlos Alcaraz\u001b[0m\n",
|
||||
"\u001b[1m> Finished chain.\u001b[0m\n",
|
||||
"\u001b[33;1m\u001b[1;3m\n",
|
||||
"So the final answer is:\n",
|
||||
"\n",
|
||||
"Carlos Alcaraz\u001b[0m\n",
|
||||
"\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"model_lab.compare(\"What is the hometown of the reigning men's U.S. Open champion?\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "94159131",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.10.9"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
19
docs/use_cases/personal_assistants.md
Normal file
@@ -0,0 +1,19 @@
# Personal Assistants

.. note::
    `Conceptual Guide <https://docs.langchain.com/docs/use-cases/personal-assistants>`_


We use "personal assistant" here in a very broad sense.
Personal assistants have a few characteristics:

- They can interact with the outside world
- They have knowledge of your data
- They remember your interactions

Virtually all of the functionality in LangChain is relevant for building a personal assistant.
To highlight specific parts:

- [Agent Documentation](../modules/agents.rst) (for interacting with the outside world)
- [Index Documentation](../modules/indexes.rst) (for giving them knowledge of your data)
- [Memory](../modules/memory.rst) (for helping them remember interactions)
@@ -1,7 +1,11 @@
# Question Answering
# Question Answering over Docs

.. note::
    `Conceptual Guide <https://docs.langchain.com/docs/use-cases/qa-docs>`_


Question answering in this context refers to question answering over your document data.
For question answering over other types of data, like [SQL databases](../modules/chains/examples/sqlite.html) or [APIs](../modules/chains/examples/api.html), please see [here](../modules/chains/utility_how_to.html).
For question answering over other types of data, please see the relevant documentation, such as [SQL database Question Answering](./tabular.md) or [Interacting with APIs](./apis.md).

For question answering over many documents, you almost always want to create an index over the data.
This can be used to smartly access the most relevant documents for a given question, allowing you to avoid having to pass all the documents to the LLM (saving you time and money).
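
As a minimal sketch of index-backed question answering (assuming `db` is a vector store already built over your documents, as in the indexes documentation):

```python
from langchain.llms import OpenAI
from langchain.chains import VectorDBQA

# The chain embeds the question, pulls the most similar documents from the
# vector store, and passes only those documents to the LLM.
qa = VectorDBQA.from_chain_type(llm=OpenAI(temperature=0), chain_type="stuff", vectorstore=db)
qa.run("What did the president say about the economy?")
```
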
@@ -72,4 +76,3 @@ The following resources exist:
Additional related resources include:
- [Utilities for working with Documents](/modules/utils/how_to_guides.rst): Guides on how to use several of the utilities which will prove helpful for this task, including Text Splitters (for splitting up long documents) and Embeddings & Vectorstores (useful for the above Vector DB example).
- [CombineDocuments Chains](/modules/indexes/combine_docs.md): A conceptual overview of specific types of chains by which you can accomplish this task.
- [Data Augmented Generation](combine_docs.md): An overview of data augmented generation, which is the general concept of combining external data with LLMs (of which this is a subset).
@@ -1,5 +1,9 @@
# Summarization

.. note::
    `Conceptual Guide <https://docs.langchain.com/docs/use-cases/summarization>`_


Summarization involves creating a shorter summary of one or more longer documents.
This can be useful for distilling long documents into their core pieces of information.

@@ -12,9 +16,7 @@ chain.run(docs)
```

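A fuller sketch of the summarization chain shown above (assuming `docs` is a list of LangChain `Document` objects and an OpenAI API key is configured) might look like:

```python
from langchain.llms import OpenAI
from langchain.chains.summarize import load_summarize_chain

# "map_reduce" summarizes each document separately, then combines the partial summaries.
chain = load_summarize_chain(OpenAI(temperature=0), chain_type="map_reduce")
summary = chain.run(docs)
```
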
The following resources exist:
- [Summarization Notebook](../modules/indexes/chain_examples/summarize.ipynb): A notebook walking through how to accomplish this task.
- [Summarization Notebook](../modules/chains/index_examples/summarize.ipynb): A notebook walking through how to accomplish this task.

Additional related resources include:
- [Utilities for working with Documents](../reference/utils.rst): Guides on how to use several of the utilities which will prove helpful for this task, including Text Splitters (for splitting up long documents).
- [CombineDocuments Chains](../modules/indexes/combine_docs.md): A conceptual overview of specific types of chains by which you can accomplish this task.
- [Data Augmented Generation](./combine_docs.md): An overview of data augmented generation, which is the general concept of combining external data with LLMs (of which this is a subset).
@@ -1,12 +1,16 @@
# Querying Tabular Data

.. note::
    `Conceptual Guide <https://docs.langchain.com/docs/use-cases/qa-tabular>`_


Lots of data and information is stored in tabular form, whether it be CSVs, Excel sheets, or SQL tables.
This page covers all resources available in LangChain for working with data in this format.

## Document Loading
If you have text data stored in a tabular format, you may want to load the data into a Document and then index it as you would
other text/unstructured data. For this, you should use a document loader like the [CSVLoader](../modules/document_loaders/examples/csv.ipynb)
and then you should [create an index](../modules/indexes.rst) over that data, and [query it that way](../modules/indexes/chain_examples/vector_db_qa.ipynb).
other text/unstructured data. For this, you should use a document loader like the [CSVLoader](../modules/indexes/document_loaders/examples/csv.ipynb)
and then you should [create an index](../modules/indexes.rst) over that data, and [query it that way](../modules/chains/index_examples/vector_db_qa.ipynb).

## Querying
If you have more numeric tabular data, or have a large amount of data and don't want to index it, you should get started
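
As a minimal sketch of the chain-based approach to querying (the SQLite URI points to a hypothetical local database file):

```python
from langchain import OpenAI, SQLDatabase, SQLDatabaseChain

db = SQLDatabase.from_uri("sqlite:///Chinook.db")  # hypothetical local database file
# The chain writes a SQL query for the question, executes it, and phrases the result.
db_chain = SQLDatabaseChain(llm=OpenAI(temperature=0), database=db, verbose=True)
db_chain.run("How many employees are there?")
```
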
@@ -26,6 +30,6 @@ Agents are more complex, and involve multiple queries to the LLM to understand w
The downside of agents is that you have less control. The upside is that they are more powerful,
which allows you to use them on larger databases and more complex schemas.

- [SQL Agent](../modules/agents/agent_toolkits/sql_database.ipynb)
- [Pandas Agent](../modules/agents/agent_toolkits/pandas.ipynb)
- [CSV Agent](../modules/agents/agent_toolkits/csv.ipynb)
- [SQL Agent](../modules/agents/toolkits/examples/sql_database.ipynb)
- [Pandas Agent](../modules/agents/toolkits/examples/pandas.ipynb)
- [CSV Agent](../modules/agents/toolkits/examples/csv.ipynb)