Compare commits

..

1 Commit

Author: Harrison Chase | SHA1: 3561f18e66 | Message: ape example | Date: 2022-11-13 20:03:32 -08:00
74 changed files with 878 additions and 1585 deletions

View File

@@ -1,6 +1,6 @@
name: lint
on: [push, pull_request]
on: [push, pull_request_target]
jobs:
build:

View File

@@ -1,6 +1,6 @@
name: test
on: [push, pull_request]
on: [push, pull_request_target]
jobs:
build:

View File

@@ -23,13 +23,39 @@ It aims to create:
2. a flexible interface for combining pieces into a single comprehensive "chain"
3. a schema for easily saving and sharing those chains
## 📖 Documentation
## 🔧 Setting up your environment
Please see [here](https://langchain.readthedocs.io/en/latest/?) for full documentation on:
- Getting started (installation, setting up environment, simple examples)
- How-To examples (demos, integrations, helper functions)
- Reference (full API docs)
- Resources (high level explanation of core concepts)
Besides the installation of this python package, you will also need to install packages and set environment variables depending on which chains you want to use.
Note: these packages are not included in the default dependencies because, as this package scales, we do not want to force users to install dependencies they do not need.
The following use cases require specific installs and api keys:
- _OpenAI_:
- Install requirements with `pip install openai`
- Get an OpenAI api key and either set it as an environment variable (`OPENAI_API_KEY`) or pass it to the LLM constructor as `openai_api_key`.
- _Cohere_:
- Install requirements with `pip install cohere`
- Get a Cohere api key and either set it as an environment variable (`COHERE_API_KEY`) or pass it to the LLM constructor as `cohere_api_key`.
- _HuggingFace Hub_:
- Install requirements with `pip install huggingface_hub`
- Get a HuggingFace Hub api token and either set it as an environment variable (`HUGGINGFACEHUB_API_TOKEN`) or pass it to the LLM constructor as `huggingfacehub_api_token`.
- _SerpAPI_:
- Install requirements with `pip install google-search-results`
- Get a SerpAPI api key and either set it as an environment variable (`SERPAPI_API_KEY`) or pass it to the LLM constructor as `serpapi_api_key`.
- _NatBot_:
- Install requirements with `pip install playwright`
- _Wikipedia_:
- Install requirements with `pip install wikipedia`
- _Elasticsearch_:
- Install requirements with `pip install elasticsearch`
- Set up an Elasticsearch backend. If you want to run it locally, [this](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/getting-started.html) is a good guide.
- _FAISS_:
- Install requirements with `pip install faiss` for Python 3.7 and `pip install faiss-cpu` for Python 3.10+.
- _Manifest_:
- Install requirements with `pip install manifest-ml` (Note: this is only available in Python 3.8+ currently).
If you are using the `NLTKTextSplitter` or the `SpacyTextSplitter`, you will also need to install the appropriate models. For example, if you want to use the `SpacyTextSplitter`, you will need to install the `en_core_web_sm` model with `python -m spacy download en_core_web_sm`. Similarly, if you want to use the `NLTKTextSplitter`, you will need to install the `punkt` model with `python -m nltk.downloader punkt`.
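As a rough illustration of the two credential options above (environment variable vs. constructor argument), here is a minimal sketch for the OpenAI case; the other providers follow the same pattern with their respective variable and argument names.
```python
import os

from langchain.llms import OpenAI

# Option 1: set the environment variable before constructing the LLM.
os.environ["OPENAI_API_KEY"] = "..."  # replace with your actual key
llm = OpenAI(temperature=0.9)

# Option 2: pass the key directly to the constructor instead.
llm = OpenAI(temperature=0.9, openai_api_key="...")
```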
## 🚀 What can I do with this
@@ -37,7 +63,7 @@ This project was largely inspired by a few projects seen on Twitter for which we
**[Self-ask-with-search](https://ofir.io/self-ask.pdf)**
To recreate this paper, use the following code snippet or checkout the [example notebook](https://github.com/hwchase17/langchain/blob/master/docs/examples/demos/self_ask_with_search.ipynb).
To recreate this paper, use the following code snippet or checkout the [example notebook](https://github.com/hwchase17/langchain/blob/master/examples/self_ask_with_search.ipynb).
```python
from langchain import SelfAskWithSearchChain, OpenAI, SerpAPIChain
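
# A hedged sketch of how these imports are typically wired together, based on
# the SelfAskWithSearchChain(llm=..., search_chain=...) construction shown in
# the model_laboratory notebook elsewhere in this compare; illustrative only,
# not necessarily the exact snippet in the linked notebook.
llm = OpenAI(temperature=0)
search = SerpAPIChain()
self_ask_with_search = SelfAskWithSearchChain(llm=llm, search_chain=search, verbose=True)
self_ask_with_search.run("What is the hometown of the reigning men's U.S. Open champion?")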
@@ -52,7 +78,7 @@ self_ask_with_search.run("What is the hometown of the reigning men's U.S. Open c
**[LLM Math](https://twitter.com/amasad/status/1568824744367259648?s=20&t=-7wxpXBJinPgDuyHLouP1w)**
To recreate this example, use the following code snippet or check out the [example notebook](https://github.com/hwchase17/langchain/blob/master/docs/examples/demos/llm_math.ipynb).
To recreate this example, use the following code snippet or check out the [example notebook](https://github.com/hwchase17/langchain/blob/master/examples/llm_math.ipynb).
```python
from langchain import OpenAI, LLMMathChain
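
# A hedged sketch of the usual wiring, based on the LLMMathChain(llm=llm,
# verbose=True) construction that appears in the MRKL notebook elsewhere in
# this compare; illustrative only.
llm = OpenAI(temperature=0)
llm_math = LLMMathChain(llm=llm, verbose=True)
llm_math.run("How many of the integers between 0 and 99 inclusive are divisible by 8?")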
@@ -65,7 +91,7 @@ llm_math.run("How many of the integers between 0 and 99 inclusive are divisible
**Generic Prompting**
You can also use this for simple prompting pipelines, as in the below example and this [example notebook](https://github.com/hwchase17/langchain/blob/master/docs/examples/demos/simple_prompts.ipynb).
You can also use this for simple prompting pipelines, as in the below example and this [example notebook](https://github.com/hwchase17/langchain/blob/master/examples/simple_prompts.ipynb).
```python
from langchain import Prompt, OpenAI, LLMChain
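
# A hedged sketch assuming a simple question-answering template; the template
# text is a hypothetical placeholder, while the Prompt/LLMChain wiring mirrors
# the Getting Started docs elsewhere in this compare.
template = "Question: {question}\n\nAnswer:"
prompt = Prompt(template=template, input_variables=["question"])
llm_chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)
question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"
llm_chain.predict(question=question)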
@@ -84,7 +110,7 @@ llm_chain.predict(question=question)
**Embed & Search Documents**
We support two vector databases to store and search embeddings -- FAISS and Elasticsearch. Here's a code snippet showing how to use FAISS to store embeddings and search for text similar to a query. Both database backends are featured in this [example notebook](https://github.com/hwchase17/langchain/blob/master/docs/examples/integrations/embeddings.ipynb).
We support two vector databases to store and search embeddings -- FAISS and Elasticsearch. Here's a code snippet showing how to use FAISS to store embeddings and search for text similar to a query. Both database backends are featured in this [example notebook](https://github.com/hwchase17/langchain/blob/master/examples/embeddings.ipynb).
```python
from langchain.embeddings.openai import OpenAIEmbeddings
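from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores.faiss import FAISS

# A hedged sketch of the middle of this snippet, following the embeddings
# notebook elsewhere in this compare: split a document, embed the chunks, and
# index them in FAISS (OpenAIEmbeddings() with no arguments is an assumption).
with open('state_of_the_union.txt') as f:
    state_of_the_union = f.read()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_text(state_of_the_union)

embeddings = OpenAIEmbeddings()
docsearch = FAISS.from_texts(texts, embeddings)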
@@ -104,6 +130,11 @@ query = "What did the president say about Ketanji Brown Jackson"
docs = docsearch.similarity_search(query)
```
## 📖 Documentation
The above examples are probably the most user-friendly documentation that exists,
but full API docs can be found [here](https://langchain.readthedocs.io/en/latest/?).
## 🤖 Developer Guide
To begin developing on this project, first clone the repo locally.

View File

@@ -37,14 +37,10 @@ extensions = [
"sphinx.ext.autodoc.typehints",
"sphinx.ext.autosummary",
"sphinx.ext.napoleon",
"sphinx.ext.viewcode",
"sphinxcontrib.autodoc_pydantic",
"myst_parser",
"nbsphinx",
"sphinx_panels",
]
autodoc_pydantic_model_show_json = False
autodoc_pydantic_field_list_validators = False
autodoc_pydantic_config_members = False

View File

@@ -1,25 +0,0 @@
# Core Concepts
This section goes over the core concepts of LangChain.
Understanding these will go a long way in helping you understand the codebase and how to construct chains.
## Prompts
Prompts generically have a `format` method that takes in variables and returns a formatted string.
The simplest implementation of this is a template string with some variables in it, which is then formatted with the incoming variables.
More complex iterations dynamically construct the template string from few shot examples, etc.
## LLMs
Wrappers around Large Language Models (in particular, the `generate` ability of large language models) are some of the core functionality of LangChain.
These wrappers are classes that are callable: they take in an input string, and return the generated output string.
## Embeddings
These classes are very similar to the LLM classes in that they are wrappers around models,
but rather than returning a string they return an embedding (a list of floats). These are particularly useful when
implementing semantic search functionality. They expose separate methods for embedding queries versus embedding documents.
## Vectorstores
These are datastores that store documents. They expose a method for passing in a string and finding similar documents.
## Chains
These are pipelines that combine multiple of the above ideas.
They vary greatly in complexity and are a combination of generic, highly configurable pipelines and more narrow (but usually more complex) pipelines.
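To make the prompt and LLM concepts above concrete, here is a minimal sketch (assuming an OpenAI key is configured; the constructor arguments follow the usage shown elsewhere in these docs):
```python
from langchain import OpenAI, Prompt

# A prompt is a template plus named input variables; `format` fills them in.
prompt = Prompt(
    template="What is the capital of {state}?",
    input_variables=["state"],
)
text = prompt.format(state="New York")

# An LLM wrapper is callable: string in, generated string out.
llm = OpenAI(temperature=0)
print(llm(text))
```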

View File

@@ -1,10 +0,0 @@
Demos
=====
The examples here are all end-to-end chains of specific applications.
.. toctree::
:maxdepth: 1
:glob:
demos/*

View File

@@ -1,98 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "82140df0",
"metadata": {},
"source": [
"# ReAct\n",
"\n",
"This notebook showcases the implementation of the ReAct chain logic."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "4e272b47",
"metadata": {},
"outputs": [],
"source": [
"from langchain import OpenAI, ReActChain, Wikipedia\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"react = ReActChain(llm=llm, docstore=Wikipedia(), verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "8078c8f1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"Author David Chanoff has collaborated with a U.S. Navy admiral who served as the ambassador to the United Kingdom under which President?\n",
"Thought 1:\u001b[102m I need to search David Chanoff and find the U.S. Navy admiral he\n",
"collaborated with.\n",
"Action 1: Search[David Chanoff]\u001b[0m\n",
"Observation 1: \u001b[103mDavid Chanoff is a noted author of non-fiction work. His work has typically involved collaborations with the principal protagonist of the work concerned. His collaborators have included; Augustus A. White, Joycelyn Elders, Đoàn Văn Toại, William J. Crowe, Ariel Sharon, Kenneth Good and Felix Zandman. He has also written about a wide range of subjects including literary history, education and foreign for The Washington Post, The New Republic and The New York Times Magazine. He has published more than twelve books.\u001b[0m\n",
"Thought 2:\u001b[102m The U.S. Navy admiral David Chanoff collaborated with is William J. Crowe.\n",
"Action 2: Search[William J. Crowe]\u001b[0m\n",
"Observation 2: \u001b[103mWilliam James Crowe Jr. (January 2, 1925 October 18, 2007) was a United States Navy admiral and diplomat who served as the 11th chairman of the Joint Chiefs of Staff under Presidents Ronald Reagan and George H. W. Bush, and as the ambassador to the United Kingdom and Chair of the Intelligence Oversight Board under President Bill Clinton.\u001b[0m\n",
"Thought 3:\u001b[102m William J. Crowe served as the ambassador to the United Kingdom under President Bill Clinton. So the answer is Bill Clinton.\n",
"Action 3: Finish[Bill Clinton]\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'Bill Clinton'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question = \"Author David Chanoff has collaborated with a U.S. Navy admiral who served as the ambassador to the United Kingdom under which President?\"\n",
"react.run(question)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0a6bd3b4",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,10 +0,0 @@
Integrations
============
The examples here all highlight a specific type of integration.
.. toctree::
:maxdepth: 1
:glob:
integrations/*

View File

@@ -1,180 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "b118c9dc",
"metadata": {},
"source": [
"# HuggingFace Tokenizers\n",
"\n",
"This notebook show cases how to use HuggingFace tokenizers to split text."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e82c4685",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import CharacterTextSplitter"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a8ce51d5",
"metadata": {},
"outputs": [],
"source": [
"from transformers import GPT2TokenizerFast\n",
"\n",
"tokenizer = GPT2TokenizerFast.from_pretrained(\"gpt2\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "ca5e72c0",
"metadata": {},
"outputs": [],
"source": [
"with open('../state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"text_splitter = CharacterTextSplitter.from_huggingface_tokenizer(tokenizer, chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_text(state_of_the_union)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "37cdfbeb",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans. \n",
"\n",
"Last year COVID-19 kept us apart. This year we are finally together again. \n",
"\n",
"Tonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \n",
"\n",
"With a duty to one another to the American people to the Constitution. \n",
"\n",
"And with an unwavering resolve that freedom will always triumph over tyranny. \n",
"\n",
"Six days ago, Russias Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \n",
"\n",
"He thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \n",
"\n",
"He met the Ukrainian people. \n",
"\n",
"From President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world. \n",
"\n",
"Groups of citizens blocking tanks with their bodies. Everyone from students to retirees teachers turned soldiers defending their homeland. \n",
"\n",
"In this struggle as President Zelenskyy said in his speech to the European Parliament “Light will win over darkness.” The Ukrainian Ambassador to the United States is here tonight. \n",
"\n",
"Let each of us here tonight in this Chamber send an unmistakable signal to Ukraine and to the world. \n",
"\n",
"Please rise if you are able and show that, Yes, we the United States of America stand with the Ukrainian people. \n",
"\n",
"Throughout our history weve learned this lesson when dictators do not pay a price for their aggression they cause more chaos. \n",
"\n",
"They keep moving. \n",
"\n",
"And the costs and the threats to America and the world keep rising. \n",
"\n",
"Thats why the NATO Alliance was created to secure peace and stability in Europe after World War 2. \n",
"\n",
"The United States is a member along with 29 other nations. \n",
"\n",
"It matters. American diplomacy matters. American resolve matters. \n",
"\n",
"Putins latest attack on Ukraine was premeditated and unprovoked. \n",
"\n",
"He rejected repeated efforts at diplomacy. \n",
"\n",
"He thought the West and NATO wouldnt respond. And he thought he could divide us at home. Putin was wrong. We were ready. Here is what we did. \n",
"\n",
"We prepared extensively and carefully. \n",
"\n",
"We spent months building a coalition of other freedom-loving nations from Europe and the Americas to Asia and Africa to confront Putin. \n",
"\n",
"I spent countless hours unifying our European allies. We shared with the world in advance what we knew Putin was planning and precisely how he would try to falsely justify his aggression. \n",
"\n",
"We countered Russias lies with truth. \n",
"\n",
"And now that he has acted the free world is holding him accountable. \n",
"\n",
"Along with twenty-seven members of the European Union including France, Germany, Italy, as well as countries like the United Kingdom, Canada, Japan, Korea, Australia, New Zealand, and many others, even Switzerland. \n",
"\n",
"We are inflicting pain on Russia and supporting the people of Ukraine. Putin is now isolated from the world more than ever. \n",
"\n",
"Together with our allies we are right now enforcing powerful economic sanctions. \n",
"\n",
"We are cutting off Russias largest banks from the international financial system. \n",
"\n",
"Preventing Russias central bank from defending the Russian Ruble making Putins $630 Billion “war fund” worthless. \n",
"\n",
"We are choking off Russias access to technology that will sap its economic strength and weaken its military for years to come. \n",
"\n",
"Tonight I say to the Russian oligarchs and corrupt leaders who have bilked billions of dollars off this violent regime no more. \n",
"\n",
"The U.S. Department of Justice is assembling a dedicated task force to go after the crimes of Russian oligarchs. \n",
"\n",
"We are joining with our European allies to find and seize your yachts your luxury apartments your private jets. We are coming for your ill-begotten gains. \n",
"\n",
"And tonight I am announcing that we will join our allies in closing off American air space to all Russian flights further isolating Russia and adding an additional squeeze on their economy. The Ruble has lost 30% of its value. \n",
"\n",
"The Russian stock market has lost 40% of its value and trading remains suspended. Russias economy is reeling and Putin alone is to blame. \n",
"\n",
"Together with our allies we are providing support to the Ukrainians in their fight for freedom. Military assistance. Economic assistance. Humanitarian assistance. \n",
"\n",
"We are giving more than $1 Billion in direct assistance to Ukraine. \n",
"\n",
"And we will continue to aid the Ukrainian people as they defend their country and to help ease their suffering. \n",
"\n",
"Let me be clear, our forces are not engaged and will not engage in conflict with Russian forces in Ukraine. \n",
"\n",
"Our forces are not going to Europe to fight in Ukraine, but to defend our NATO Allies in the event that Putin decides to keep moving west. \n"
]
}
],
"source": [
"print(texts[0])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d214aec2",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,254 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "920a3c1a",
"metadata": {},
"source": [
"# Model Laboratory\n",
"\n",
"This example goes over basic functionality of how to use the ModelLaboratory to test out and try different models."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "ab9e95ad",
"metadata": {},
"outputs": [],
"source": [
"from langchain import LLMChain, OpenAI, Cohere, HuggingFaceHub, Prompt\n",
"from langchain.model_laboratory import ModelLaboratory"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "32cb94e6",
"metadata": {},
"outputs": [],
"source": [
"llms = [\n",
" OpenAI(temperature=0), \n",
" Cohere(model=\"command-xlarge-20221108\", max_tokens=20, temperature=0), \n",
" HuggingFaceHub(repo_id=\"google/flan-t5-xl\", model_kwargs={\"temperature\":1})\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "14cde09d",
"metadata": {},
"outputs": [],
"source": [
"model_lab = ModelLaboratory.from_llms(llms)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "f186c741",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"What color is a flamingo?\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\u001b[36;1m\u001b[1;3m\n",
"\n",
"Flamingos are pink.\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 20, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\u001b[33;1m\u001b[1;3m\n",
"\n",
"Pink\u001b[0m\n",
"\n",
"\u001b[1mHuggingFaceHub\u001b[0m\n",
"Params: {'repo_id': 'google/flan-t5-xl', 'temperature': 1}\n",
"\u001b[38;5;200m\u001b[1;3mpink\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab.compare(\"What color is a flamingo?\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "248b652a",
"metadata": {},
"outputs": [],
"source": [
"prompt = Prompt(template=\"What is the capital of {state}?\", input_variables=[\"state\"])\n",
"model_lab_with_prompt = ModelLaboratory.from_llms(llms, prompt=prompt)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "f64377ac",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"New York\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\u001b[36;1m\u001b[1;3m\n",
"\n",
"The capital of New York is Albany.\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 20, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\u001b[33;1m\u001b[1;3m\n",
"\n",
"The capital of New York is Albany.\u001b[0m\n",
"\n",
"\u001b[1mHuggingFaceHub\u001b[0m\n",
"Params: {'repo_id': 'google/flan-t5-xl', 'temperature': 1}\n",
"\u001b[38;5;200m\u001b[1;3mst john s\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab_with_prompt.compare(\"New York\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "54336dbf",
"metadata": {},
"outputs": [],
"source": [
"from langchain import SelfAskWithSearchChain, SerpAPIChain\n",
"\n",
"open_ai_llm = OpenAI(temperature=0)\n",
"search = SerpAPIChain()\n",
"self_ask_with_search_openai = SelfAskWithSearchChain(llm=open_ai_llm, search_chain=search, verbose=True)\n",
"\n",
"cohere_llm = Cohere(temperature=0, model=\"command-xlarge-20221108\")\n",
"search = SerpAPIChain()\n",
"self_ask_with_search_cohere = SelfAskWithSearchChain(llm=cohere_llm, search_chain=search, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "6a50a9f1",
"metadata": {},
"outputs": [],
"source": [
"chains = [self_ask_with_search_openai, self_ask_with_search_cohere]\n",
"names = [str(open_ai_llm), str(cohere_llm)]"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "d3549e99",
"metadata": {},
"outputs": [],
"source": [
"model_lab = ModelLaboratory(chains, names=names)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "362f7f57",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"What is the hometown of the reigning men's U.S. Open champion?\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"What is the hometown of the reigning men's U.S. Open champion?\n",
"Are follow up questions needed here:\u001b[32;1m\u001b[1;3m Yes.\n",
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\n",
"Intermediate answer: \u001b[33;1m\u001b[1;3mCarlos Alcaraz.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"Follow up: Where is Carlos Alcaraz from?\u001b[0m\n",
"Intermediate answer: \u001b[33;1m\u001b[1;3mEl Palmar, Spain.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"So the final answer is: El Palmar, Spain\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[36;1m\u001b[1;3m\n",
"So the final answer is: El Palmar, Spain\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 256, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"What is the hometown of the reigning men's U.S. Open champion?\n",
"Are follow up questions needed here:\u001b[32;1m\u001b[1;3m Yes.\n",
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\n",
"Intermediate answer: \u001b[33;1m\u001b[1;3mCarlos Alcaraz.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"So the final answer is:\n",
"\n",
"Carlos Alcaraz\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[33;1m\u001b[1;3m\n",
"So the final answer is:\n",
"\n",
"Carlos Alcaraz\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab.compare(\"What is the hometown of the reigning men's U.S. Open champion?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "94159131",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,10 +0,0 @@
Prompts
=======
The examples here all highlight how to work with prompts.
.. toctree::
:maxdepth: 1
:glob:
prompts/*

View File

@@ -1,38 +0,0 @@
# Using Chains
Calling an LLM is a great first step, but it's just the beginning.
Normally when you use an LLM in an application, you are not sending user input directly to the LLM.
Instead, you are probably taking user input and constructing a prompt, and then sending that to the LLM.
For example, in the previous example, the text we passed in was hardcoded to ask for a name for a company that made colorful socks.
In this imaginary service, what we would want to do is take only the user input describing what the company does, and then format the prompt with that information.
This is easy to do with LangChain!
First let's define the prompt:
```python
from langchain.prompts import Prompt
prompt = Prompt(
input_variables=["product"],
template="What is a good name for a company that makes {product}?",
)
```
We can now create a very simple chain that will take user input, format the prompt with it, and then send it to the LLM:
```python
from langchain.chains import LLMChain
chain = LLMChain(llm=llm, prompt=prompt)
```
Now we can run that chain, specifying only the product!
```python
chain.run("colorful socks")
```
There we go! There's the first chain.
That is it for the Getting Started example.
As a next step, we would suggest checking out the more complex chains in the [Demos section](/examples/demos.rst).

View File

@@ -1,37 +0,0 @@
# Setting up your environment
Using LangChain will usually require integrations with one or more model providers, data stores, APIs, etc.
There are two components to setting this up: installing the correct python packages and setting the right environment variables.
## Python packages
The python package needed varies based on the integration. See the list of integrations for details.
There should also be helpful error messages raised if you try to run an integration and are missing any required python packages.
## Environment Variables
The environment variable needed varies based on the integration. See the list of integrations for details.
There should also be helpful error messages raised if you try to run an integration and are missing any required environment variables.
You can set the environment variable in a few ways.
If you are trying to set the environment variable `FOO` to value `bar`, here are the ways you could do so:
- From the command line:
```
export FOO=bar
```
- From the python notebook/script:
```python
import os
os.environ["FOO"] = "bar"
```
For the Getting Started example, we will be using OpenAI's APIs, so we will first need to install their SDK:
```
pip install openai
```
We will then need to set the environment variable. Let's do this from inside the Jupyter notebook (or Python script).
```python
import os
os.environ["OPENAI_API_KEY"] = "..."
```

View File

@@ -1,11 +0,0 @@
# Installation
LangChain is available on PyPI, so it is easily installable with:
```
pip install langchain
```
For more involved installation options, see the [Installation Reference](/installation.md) section.
That's it! LangChain is now installed. You can now use LangChain from a python script or Jupyter notebook.

View File

@@ -1,25 +0,0 @@
# Calling an LLM
The most basic building block of LangChain is calling an LLM on some input.
Let's walk through a simple example of how to do this.
For this purpose, let's pretend we are building a service that generates a company name based on what the company makes.
In order to do this, we first need to import the LLM wrapper.
```python
from langchain.llms import OpenAI
```
We can then initialize the wrapper with any arguments.
In this example, we probably want the outputs to be MORE random, so we'll initialize it with a HIGH temperature.
```python
llm = OpenAI(temperature=0.9)
```
We can now call it on some input!
```python
text = "What would be a good company name a company that makes colorful socks?"
llm(text)
```

View File

@@ -1,82 +1,18 @@
Welcome to LangChain
==========================
Large language models (LLMs) are emerging as a transformative technology, enabling
developers to build applications that they previously could not.
But using these LLMs in isolation is often not enough to
create a truly powerful app - the real power comes when you are able to
combine them with other sources of computation or knowledge.
This library is aimed at assisting in the development of those types of applications.
It aims to create:
1. a comprehensive collection of pieces you would ever want to combine
2. a flexible interface for combining pieces into a single comprehensive "chain"
3. a schema for easily saving and sharing those chains
The documentation is structured into the following sections:
.. toctree::
:maxdepth: 1
:caption: Getting Started
:name: getting_started
:maxdepth: 2
:caption: User API
getting_started/installation.md
getting_started/environment.md
getting_started/llm.md
getting_started/chains.md
Goes over a simple walkthrough and tutorial for getting started: setting up a simple chain that generates a company name based on what the company makes.
Covers installation, environment setup, calling LLMs, and using prompts.
Start here if you haven't used LangChain before.
.. toctree::
:maxdepth: 1
:caption: How-To Examples
:name: examples
examples/demos.rst
examples/integrations.rst
examples/prompts.rst
examples/model_laboratory.ipynb
More elaborate examples and walk-throughs of particular
integrations and use cases. This is the place to look if you have questions
about how to integrate certain pieces, or if you want to find examples of
common tasks or cool demos.
.. toctree::
:maxdepth: 1
:caption: Reference
:name: reference
installation.md
integrations.md
modules/prompt
modules/llms
modules/embeddings
modules/text_splitter
modules/vectorstore
modules/chains
Full API documentation. This is the place to look if you want to
see detailed information about the various classes, methods, and APIs.
.. toctree::
:maxdepth: 1
:caption: Resources
:name: resources
core_concepts.md
glossary.md
Discord <https://discord.gg/6adMQxSpJS>
Higher level, conceptual explanations of the LangChain components.
This is the place to go if you want to increase your high level understanding
of the problems LangChain is solving, and how we decided to go about doing so.

View File

@@ -1,24 +0,0 @@
# Installation Options
LangChain is available on PyPI, so it is easily installable with:
```
pip install langchain
```
That will install the bare minimum requirements of LangChain.
A lot of the value of LangChain comes when integrating it with various model providers, datastores, etc.
By default, the dependencies needed to do that are NOT installed.
However, there are two other ways to install LangChain that do bring in those dependencies.
To install modules needed for the common LLM providers, run:
```
pip install langchain[llms]
```
To install all modules needed for all integrations, run:
```
pip install langchain[all]
```

View File

@@ -1,33 +0,0 @@
# Integration Reference
Besides the installation of this python package, you will also need to install packages and set environment variables depending on which chains you want to use.
Note: these packages are not included in the default dependencies because, as this package scales, we do not want to force users to install dependencies they do not need.
The following use cases require specific installs and api keys:
- _OpenAI_:
- Install requirements with `pip install openai`
- Get an OpenAI api key and either set it as an environment variable (`OPENAI_API_KEY`) or pass it to the LLM constructor as `openai_api_key`.
- _Cohere_:
- Install requirements with `pip install cohere`
- Get a Cohere api key and either set it as an environment variable (`COHERE_API_KEY`) or pass it to the LLM constructor as `cohere_api_key`.
- _HuggingFace Hub_:
- Install requirements with `pip install huggingface_hub`
- Get a HuggingFace Hub api token and either set it as an environment variable (`HUGGINGFACEHUB_API_TOKEN`) or pass it to the LLM constructor as `huggingfacehub_api_token`.
- _SerpAPI_:
- Install requirements with `pip install google-search-results`
- Get a SerpAPI api key and either set it as an environment variable (`SERPAPI_API_KEY`) or pass it to the LLM constructor as `serpapi_api_key`.
- _NatBot_:
- Install requirements with `pip install playwright`
- _Wikipedia_:
- Install requirements with `pip install wikipedia`
- _Elasticsearch_:
- Install requirements with `pip install elasticsearch`
- Set up an Elasticsearch backend. If you want to run it locally, [this](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/getting-started.html) is a good guide.
- _FAISS_:
- Install requirements with `pip install faiss` for Python 3.7 and `pip install faiss-cpu` for Python 3.10+.
- _Manifest_:
- Install requirements with `pip install manifest-ml` (Note: this is only available in Python 3.8+ currently).
If you are using the `NLTKTextSplitter` or the `SpacyTextSplitter`, you will also need to install the appropriate models. For example, if you want to use the `SpacyTextSplitter`, you will need to install the `en_core_web_sm` model with `python -m spacy download en_core_web_sm`. Similarly, if you want to use the `NLTKTextSplitter`, you will need to install the `punkt` model with `python -m nltk.downloader punkt`.

View File

@@ -1,6 +0,0 @@
:mod:`langchain.text_splitter`
==============================
.. automodule:: langchain.text_splitter
:members:
:undoc-members:

View File

@@ -1,6 +0,0 @@
:mod:`langchain.vectorstores`
=============================
.. automodule:: langchain.vectorstores
:members:
:undoc-members:

View File

@@ -1,8 +1,6 @@
autodoc_pydantic==1.8.0
myst_parser
nbsphinx==0.8.9
sphinx==4.5.0
sphinx-autobuild==2021.3.14
sphinx_rtd_theme==1.0.0
sphinx-typlog-theme==0.8.0
sphinx-panels
autodoc_pydantic==1.8.0
myst_parser

examples/ape.ipynb (Normal file, 111 lines)
View File

@@ -0,0 +1,111 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 5,
"id": "081420ea",
"metadata": {},
"outputs": [],
"source": [
"words = [\"sane\", \"direct\", \"informally\", \"unpopular\", \"subtractive\", \"nonresidential\",\n",
" \"inexact\", \"uptown\", \"incomparable\", \"powerful\", \"gaseous\", \"evenly\", \"formality\",\n",
" \"deliberately\", \"off\"]\n",
"antonyms = [\"insane\", \"indirect\", \"formally\", \"popular\", \"additive\", \"residential\",\n",
" \"exact\", \"downtown\", \"comparable\", \"powerless\", \"solid\", \"unevenly\", \"informality\",\n",
" \"accidentally\", \"on\"]\n",
"data = (words, antonyms)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "6c7c00b7",
"metadata": {},
"outputs": [],
"source": [
"examples = [f\"INPUT: {i}\\nOUTPUT: {j}\" for i, j in zip(words, antonyms)]"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "70d4ea31",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains.ape.base import APEChain\n",
"from langchain.llms import OpenAI"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "f64829c4",
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "0663e721",
"metadata": {},
"outputs": [],
"source": [
"chain = APEChain(llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "4d65f1ee",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' reverse the input.'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.ape(examples)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3bd0bd06",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,28 +1,10 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "7ef4d402-6662-4a26-b612-35b542066487",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# Embeddings & VectorStores\n",
"\n",
"This notebook show cases how to use embeddings to create a VectorStore"
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 2,
"id": "965eecee",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@@ -33,16 +15,12 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 3,
"id": "68481687",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [],
"source": [
"with open('../state_of_the_union.txt') as f:\n",
"with open('state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_text(state_of_the_union)\n",
@@ -52,13 +30,9 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 4,
"id": "015f4ff5",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [],
"source": [
"docsearch = FAISS.from_texts(texts, embeddings)\n",
@@ -69,13 +43,9 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 5,
"id": "67baf32e",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -97,23 +67,11 @@
"print(docs[0].page_content)"
]
},
{
"cell_type": "markdown",
"id": "eea6e627",
"metadata": {},
"source": [
"## Requires having ElasticSearch setup"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "4906b8a3",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [],
"source": [
"docsearch = ElasticVectorSearch.from_texts(texts, embeddings, elasticsearch_url=\"http://localhost:9200\")\n",
@@ -126,11 +84,7 @@
"cell_type": "code",
"execution_count": 7,
"id": "95f9eee9",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -151,6 +105,14 @@
"source": [
"print(docs[0].page_content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "70a253c4",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -169,7 +131,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
"version": "3.10.4"
}
},
"nbformat": 4,

View File

@@ -1,28 +1,10 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "f5d249ee",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# Generate Examples\n",
"\n",
"This notebook shows how to use LangChain to generate more examples similar to the ones you already have."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "1685fa2f",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains.react.prompt import EXAMPLES\n",
@@ -32,13 +14,9 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 3,
"id": "334ef4f7",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [
{
"data": {
@@ -46,7 +24,7 @@
"'Question: What is the elevation range for the area that the eastern sector of the\\nColorado orogeny extends into?\\nThought 1: I need to search Colorado orogeny, find the area that the eastern sector\\nof the Colorado orogeny extends into, then find the elevation range of the\\narea.\\nAction 1: Search[Colorado orogeny]\\nObservation 1: The Colorado orogeny was an episode of mountain building (an orogeny) in\\nColorado and surrounding areas.\\nThought 2: It does not mention the eastern sector. So I need to look up eastern\\nsector.\\nAction 2: Lookup[eastern sector]\\nObservation 2: (Result 1 / 1) The eastern sector extends into the High Plains and is called\\nthe Central Plains orogeny.\\nThought 3: The eastern sector of Colorado orogeny extends into the High Plains. So I\\nneed to search High Plains and find its elevation range.\\nAction 3: Search[High Plains]\\nObservation 3: High Plains refers to one of two distinct land regions\\nThought 4: I need to instead search High Plains (United States).\\nAction 4: Search[High Plains (United States)]\\nObservation 4: The High Plains are a subregion of the Great Plains. From east to west, the\\nHigh Plains rise in elevation from around 1,800 to 7,000 ft (550 to 2,130\\nm).[3]\\nThought 5: High Plains rise in elevation from around 1,800 to 7,000 ft, so the answer\\nis 1,800 to 7,000 ft.\\nAction 5: Finish[1,800 to 7,000 ft]'"
]
},
"execution_count": 2,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
@@ -58,13 +36,9 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 6,
"id": "a7bd36bc",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [],
"source": [
"new_example = generate_example(EXAMPLES, OpenAI())"
@@ -72,35 +46,40 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 7,
"id": "e1efb008",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['',\n",
" '',\n",
" 'Question: Which ocean is the worlds smallest?',\n",
" 'Question: Is the Mount Everest taller than the Mount Kilimanjaro?',\n",
" '',\n",
" 'Thought 1: I need to search for oceans and find which one is the worlds smallest.',\n",
" 'Thought 1: I need to search Mount Everest and Mount Kilimanjaro, find their',\n",
" 'heights, then compare them.',\n",
" '',\n",
" 'Action 1: Search[oceans]',\n",
" 'Action 1: Search[Mount Everest]',\n",
" '',\n",
" 'Observation 1: There are five oceans: the Pacific, Atlantic, Indian, Southern, and Arctic.',\n",
" \"Observation 1: Mount Everest, at 8,848 metres (29,029 ft), is the world's highest mountain\",\n",
" 'and a particularly popular goal for mountaineers.',\n",
" '',\n",
" 'Thought 2: I need to compare the sizes of the oceans and find which one is the smallest.',\n",
" 'Thought 2: Mount Everest is 8,848 metres tall. I need to search Mount Kilimanjaro',\n",
" 'next.',\n",
" '',\n",
" 'Action 2: Compare[Pacific, Atlantic, Indian, Southern, Arctic]',\n",
" 'Action 2: Search[Mount Kilimanjaro]',\n",
" '',\n",
" 'Observation 2: The Arctic is the smallest ocean.']"
" 'Observation 2: Mount Kilimanjaro, with its three volcanic cones, Kibo, Mawenzi, and',\n",
" 'Shira, is a freestanding mountain in Tanzania. It is the highest mountain in',\n",
" 'Africa, and rises approximately 4,900 metres (16,100 ft) from its base to 5,895',\n",
" 'metres (19,341 ft) above sea level.',\n",
" '',\n",
" 'Thought 3: Mount Kilimanjaro is 5,895 metres tall. 8,848 metres (Mount Everest) >',\n",
" '5,895 metres (Mount Kil']"
]
},
"execution_count": 4,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
@@ -112,7 +91,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "1ed01ba2",
"id": "d8843d7b",
"metadata": {},
"outputs": [],
"source": []
@@ -134,7 +113,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
"version": "3.10.4"
}
},
"nbformat": 4,

View File

@@ -1,15 +1,5 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "959300d4",
"metadata": {},
"source": [
"# HuggingFace Hub\n",
"\n",
"This example showcases how to connect to the HuggingFace Hub."
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -35,7 +25,7 @@
"\n",
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"print(llm_chain.run(question))"
"print(llm_chain.predict(question=question))"
]
},
{
@@ -63,7 +53,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
"version": "3.8.7"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,104 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "e82c4685",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import HuggingFaceTokenizerSplitter"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a8ce51d5",
"metadata": {},
"outputs": [],
"source": [
"from transformers import GPT2TokenizerFast\n",
"\n",
"tokenizer = GPT2TokenizerFast.from_pretrained(\"gpt2\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "ca5e72c0",
"metadata": {},
"outputs": [],
"source": [
"with open('state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"text_splitter = HuggingFaceTokenizerSplitter(tokenizer, chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_text(state_of_the_union)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "37cdfbeb",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans. \n",
"\n",
"Last year COVID-19 kept us apart. This year we are finally together again. \n",
"\n",
"Tonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \n",
"\n",
"With a duty to one another to the American people to the Constitution. \n",
"\n",
"And with an unwavering resolve that freedom will always triumph over tyranny. \n",
"\n",
"Six days ago, Russias Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \n",
"\n",
"He thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \n",
"\n",
"He met the Ukrainian people. \n",
"\n",
"From President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world. \n",
"\n",
"Groups of citizens blocking tanks with their bodies. Everyone from students to retirees teachers turned soldiers defending their homeland. \n"
]
}
],
"source": [
"print(texts[0])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d214aec2",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,15 +1,5 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e71e720f",
"metadata": {},
"source": [
"# LLM Math\n",
"\n",
"This notebook showcases using LLMs and Python REPLs to do complex word math problems."
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -20,9 +10,6 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"How many of the integers between 0 and 99 inclusive are divisible by 8?\u001b[102m\n",
"\n",
"```python\n",
@@ -34,8 +21,7 @@
"```\n",
"\u001b[0m\n",
"Answer: \u001b[103m13\n",
"\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
"\u001b[0m"
]
},
{

View File

@@ -1,15 +1,5 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "b4462a94",
"metadata": {},
"source": [
"# Manifest\n",
"\n",
"This notebook goes over how to use Manifest and LangChain."
]
},
{
"cell_type": "markdown",
"id": "59fcaebc",
@@ -106,7 +96,7 @@
}
],
"source": [
"with open('../state_of_the_union.txt') as f:\n",
"with open('state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"mp_chain.run(state_of_the_union)"
]

View File

@@ -1,15 +1,5 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "d9a0131f",
"metadata": {},
"source": [
"# Map Reduce\n",
"\n",
"This notebok showcases an example of map-reduce chains: recursive summarization."
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -39,7 +29,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "99bbe19b",
"metadata": {},
"outputs": [
@@ -49,13 +39,13 @@
"\"\\n\\nThe President discusses the recent aggression by Russia, and the response by the United States and its allies. He announces new sanctions against Russia, and says that the free world is united in holding Putin accountable. The President also discusses the American Rescue Plan, the Bipartisan Infrastructure Law, and the Bipartisan Innovation Act. Finally, the President addresses the need for women's rights and equality for LGBTQ+ Americans.\""
]
},
"execution_count": 3,
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with open('../state_of_the_union.txt') as f:\n",
"with open('state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"mp_chain.run(state_of_the_union)"
]

View File

@@ -0,0 +1,147 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "ab9e95ad",
"metadata": {},
"outputs": [],
"source": [
"from langchain import LLMChain, OpenAI, Cohere, HuggingFaceHub, Prompt\n",
"from langchain.model_laboratory import ModelLaboratory"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "32cb94e6",
"metadata": {},
"outputs": [],
"source": [
"llms = [OpenAI(temperature=0), Cohere(model=\"command-xlarge-20221108\", max_tokens=20, temperature=0), HuggingFaceHub(repo_id=\"google/flan-t5-xl\", model_kwargs={\"temperature\":1})]"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "14cde09d",
"metadata": {},
"outputs": [],
"source": [
"model_lab = ModelLaboratory(llms)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "f186c741",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"What color is a flamingo?\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\u001b[104m\n",
"\n",
"Flamingos are pink.\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 20, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\u001b[103m\n",
"\n",
"Pink\u001b[0m\n",
"\n",
"\u001b[1mHuggingFaceHub\u001b[0m\n",
"Params: {'repo_id': 'google/flan-t5-xl', 'temperature': 1}\n",
"\u001b[101mpink\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab.compare(\"What color is a flamingo?\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "248b652a",
"metadata": {},
"outputs": [],
"source": [
"prompt = Prompt(template=\"What is the capital of {state}?\", input_variables=[\"state\"])\n",
"model_lab_with_prompt = ModelLaboratory(llms, prompt=prompt)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "f64377ac",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"New York\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\u001b[104m\n",
"\n",
"The capital of New York is Albany.\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 20, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\u001b[103m\n",
"\n",
"The capital of New York is Albany.\u001b[0m\n",
"\n",
"\u001b[1mHuggingFaceHub\u001b[0m\n",
"Params: {'repo_id': 'google/flan-t5-xl', 'temperature': 1}\n",
"\u001b[101mst john s\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab_with_prompt.compare(\"New York\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "54336dbf",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,15 +1,5 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "f1390152",
"metadata": {},
"source": [
"# MRKL\n",
"\n",
"This notebook showcases using the MRKL chain to route between tasks"
]
},
{
"cell_type": "markdown",
"id": "39ea3638",
@@ -32,7 +22,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 2,
"id": "07e96d99",
"metadata": {},
"outputs": [],
@@ -40,12 +30,12 @@
"llm = OpenAI(temperature=0)\n",
"search = SerpAPIChain()\n",
"llm_math_chain = LLMMathChain(llm=llm, verbose=True)\n",
"db = SQLDatabase.from_uri(\"sqlite:///../../../notebooks/Chinook.db\")\n",
"db = SQLDatabase.from_uri(\"sqlite:///../notebooks/Chinook.db\")\n",
"db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)\n",
"chains = [\n",
" ChainConfig(\n",
" action_name = \"Search\",\n",
" action=search.run,\n",
" action=search.search,\n",
" action_description=\"useful for when you need to answer questions about current events\"\n",
" ),\n",
" ChainConfig(\n",
@@ -56,7 +46,7 @@
" \n",
" ChainConfig(\n",
" action_name=\"FooBar DB\",\n",
" action=db_chain.run,\n",
" action=db_chain.query,\n",
" action_description=\"useful for when you need to answer questions about FooBar. Input should be in the form of a question\"\n",
" )\n",
"]"
@@ -64,7 +54,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 3,
"id": "a069c4b6",
"metadata": {},
"outputs": [],
@@ -74,7 +64,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 4,
"id": "e603cd7d",
"metadata": {},
"outputs": [
@@ -122,7 +112,7 @@
"'2.1520202182226886'"
]
},
"execution_count": 6,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
@@ -133,7 +123,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 5,
"id": "a5c07010",
"metadata": {},
"outputs": [
@@ -169,22 +159,22 @@
"What albums by Alanis Morissette are in the FooBar database?\n",
"SQLQuery:\u001b[102m SELECT Title FROM Album WHERE ArtistId = (SELECT ArtistId FROM Artist WHERE Name = 'Alanis Morissette')\u001b[0m\n",
"SQLResult: \u001b[103m[('Jagged Little Pill',)]\u001b[0m\n",
"Answer:\u001b[102m Jagged Little Pill\u001b[0m\n",
"Answer:\u001b[102m The album \"Jagged Little Pill\" by Alanis Morissette is in the FooBar database.\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"Observation: \u001b[101m Jagged Little Pill\u001b[0m\n",
"Observation: \u001b[101m The album \"Jagged Little Pill\" by Alanis Morissette is in the FooBar database.\u001b[0m\n",
"Thought:\u001b[102m I now know the final answer\n",
"Final Answer: The album is by Alanis Morissette and the albums in the FooBar database by her are Jagged Little Pill\u001b[0m\n",
"Final Answer: The album \"Jagged Little Pill\" by Alanis Morissette is the only album by Alanis Morissette in the FooBar database.\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The album is by Alanis Morissette and the albums in the FooBar database by her are Jagged Little Pill'"
"'The album \"Jagged Little Pill\" by Alanis Morissette is the only album by Alanis Morissette in the FooBar database.'"
]
},
"execution_count": 10,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}

View File

@@ -1,15 +1,5 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "d7467b67",
"metadata": {},
"source": [
"# Optimized Prompts\n",
"\n",
"This example showcases how using the OptimizedPrompt class enables selection of the most relevant examples to include as few-shot examples in the prompt."
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -23,7 +13,7 @@
"from langchain.llms.openai import OpenAI\n",
"from langchain.prompts.optimized import OptimizedPrompt\n",
"from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch\n",
"from langchain.vectorstores.faiss_search import FAISS"
"from langchain.vectorstores.faiss import FAISS"
]
},
{
@@ -111,18 +101,10 @@
"print(prompt.format(k=1, input=\"What is the highest mountain peak in Asia?\"))"
]
},
{
"cell_type": "markdown",
"id": "a5dc3525",
"metadata": {},
"source": [
"## Requires having ElasticSearch setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bbd92d08",
"execution_count": 7,
"id": "f7f06820",
"metadata": {},
"outputs": [],
"source": [
@@ -138,10 +120,48 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 8,
"id": "bd91f408",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"Question: What is the elevation range for the area that the eastern sector of the\n",
"Colorado orogeny extends into?\n",
"Thought 1: I need to search Colorado orogeny, find the area that the eastern sector\n",
"of the Colorado orogeny extends into, then find the elevation range of the\n",
"area.\n",
"Action 1: Search[Colorado orogeny]\n",
"Observation 1: The Colorado orogeny was an episode of mountain building (an orogeny) in\n",
"Colorado and surrounding areas.\n",
"Thought 2: It does not mention the eastern sector. So I need to look up eastern\n",
"sector.\n",
"Action 2: Lookup[eastern sector]\n",
"Observation 2: (Result 1 / 1) The eastern sector extends into the High Plains and is called\n",
"the Central Plains orogeny.\n",
"Thought 3: The eastern sector of Colorado orogeny extends into the High Plains. So I\n",
"need to search High Plains and find its elevation range.\n",
"Action 3: Search[High Plains]\n",
"Observation 3: High Plains refers to one of two distinct land regions\n",
"Thought 4: I need to instead search High Plains (United States).\n",
"Action 4: Search[High Plains (United States)]\n",
"Observation 4: The High Plains are a subregion of the Great Plains. From east to west, the\n",
"High Plains rise in elevation from around 1,800 to 7,000 ft (550 to 2,130\n",
"m).[3]\n",
"Thought 5: High Plains rise in elevation from around 1,800 to 7,000 ft, so the answer\n",
"is 1,800 to 7,000 ft.\n",
"Action 5: Finish[1,800 to 7,000 ft]\n",
"\n",
"\n",
"\n",
"Question: What is the highest mountain peak in Asia?\n"
]
}
],
"source": [
"print(prompt.format(k=1, input=\"What is the highest mountain peak in Asia?\"))"
]
@@ -171,7 +191,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
"version": "3.10.4"
}
},
"nbformat": 4,

83
examples/react.ipynb Normal file
View File

@@ -0,0 +1,83 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 3,
"id": "4e272b47",
"metadata": {},
"outputs": [],
"source": [
"from langchain import OpenAI, ReActChain, Wikipedia\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"react = ReActChain(llm=llm, docstore=Wikipedia(), verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "8078c8f1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[102m I need to search David Chanoff and find the U.S. Navy admiral he\n",
"collaborated with.\n",
"Action 1: Search[David Chanoff]\u001b[0m\u001b[49m\n",
"Observation 1: \u001b[0m\u001b[103mDavid Chanoff is a noted author of non-fiction work. His work has typically involved collaborations with the principal protagonist of the work concerned. His collaborators have included; Augustus A. White, Joycelyn Elders, Đoàn Văn Toại, William J. Crowe, Ariel Sharon, Kenneth Good and Felix Zandman. He has also written about a wide range of subjects including literary history, education and foreign for The Washington Post, The New Republic and The New York Times Magazine. He has published more than twelve books.\u001b[0m\u001b[49m\n",
"Thought 2:\u001b[0m\u001b[102m The U.S. Navy admiral David Chanoff collaborated with is William J. Crowe.\n",
"Action 2: Search[William J. Crowe]\u001b[0m\u001b[49m\n",
"Observation 2: \u001b[0m\u001b[103mWilliam James Crowe Jr. (January 2, 1925 October 18, 2007) was a United States Navy admiral and diplomat who served as the 11th chairman of the Joint Chiefs of Staff under Presidents Ronald Reagan and George H. W. Bush, and as the ambassador to the United Kingdom and Chair of the Intelligence Oversight Board under President Bill Clinton.\u001b[0m\u001b[49m\n",
"Thought 3:\u001b[0m\u001b[102m William J. Crowe served as the ambassador to the United Kingdom under President Bill Clinton. So the answer is Bill Clinton.\n",
"Action 3: Finish[Bill Clinton]\u001b[0m"
]
},
{
"data": {
"text/plain": [
"'Bill Clinton'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question = \"Author David Chanoff has collaborated with a U.S. Navy admiral who served as the ambassador to the United Kingdom under which President?\"\n",
"react.run(question)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0a6bd3b4",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,15 +1,5 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "0c3f1df8",
"metadata": {},
"source": [
"# Self Ask With Search\n",
"\n",
"This notebook showcases the Self Ask With Search chain."
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -20,17 +10,13 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"What is the hometown of the reigning men's U.S. Open champion?\n",
"Are follow up questions needed here:\u001b[102m Yes.\n",
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\n",
"Intermediate answer: \u001b[103mCarlos Alcaraz won the 2022 Men's single title while Poland's Iga Swiatek won the Women's single title defeating Tunisian's Ons Jabeur..\u001b[0m\u001b[102m\n",
"Follow up: Where is Carlos Alcaraz from?\u001b[0m\n",
"Intermediate answer: \u001b[103mEl Palmar, Murcia, Spain.\u001b[0m\u001b[102m\n",
"So the final answer is: El Palmar, Murcia, Spain\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
"\u001b[49m\n",
"Are follow up questions needed here:\u001b[0m\u001b[102m Yes.\n",
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\u001b[49m\n",
"Intermediate answer: \u001b[0m\u001b[103mCarlos Alcaraz.\u001b[0m\u001b[102m\n",
"Follow up: Where is Carlos Alcaraz from?\u001b[0m\u001b[49m\n",
"Intermediate answer: \u001b[0m\u001b[103mEl Palmar, Murcia, Spain.\u001b[0m\u001b[102m\n",
"So the final answer is: El Palmar, Murcia, Spain\u001b[0m"
]
},
{
@@ -58,7 +44,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "683d69e7",
"id": "6195fc82",
"metadata": {},
"outputs": [],
"source": []

View File

@@ -1,18 +1,8 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "d8a5c5d4",
"metadata": {},
"source": [
"# Simple Example\n",
"\n",
"This notebook showcases a simple chain."
]
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 1,
"id": "51a54c4d",
"metadata": {},
"outputs": [
@@ -22,7 +12,7 @@
"' The year Justin Beiber was born was 1994. In 1994, the Dallas Cowboys won the Super Bowl.'"
]
},
"execution_count": 2,
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
@@ -38,7 +28,7 @@
"\n",
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"llm_chain.run(question)"
"llm_chain.predict(question=question)"
]
},
{

View File

@@ -1,27 +1,9 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "0ed6aab1",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# SQLite example\n",
"\n",
"This example showcases hooking up an LLM to answer questions over a database."
]
},
{
"cell_type": "markdown",
"id": "b2f66479",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"metadata": {},
"source": [
"This uses the example Chinook database.\n",
"To set it up follow the instructions on https://database.guide/2-sample-databases-sqlite/, placing the `.db` file in a notebooks folder at the root of this repository."
@@ -31,11 +13,7 @@
"cell_type": "code",
"execution_count": 1,
"id": "d0e27d88",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [],
"source": [
"from langchain import OpenAI, SQLDatabase, SQLDatabaseChain"
@@ -45,14 +23,10 @@
"cell_type": "code",
"execution_count": 2,
"id": "72ede462",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [],
"source": [
"db = SQLDatabase.from_uri(\"sqlite:///../../../notebooks/Chinook.db\")\n",
"db = SQLDatabase.from_uri(\"sqlite:///../notebooks/Chinook.db\")\n",
"llm = OpenAI(temperature=0)\n",
"db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)"
]
@@ -61,30 +35,21 @@
"cell_type": "code",
"execution_count": 3,
"id": "15ff81df",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"How many employees are there?\n",
"SQLQuery:\u001b[102m SELECT COUNT(*) FROM Employee\u001b[0m\n",
"SQLResult: \u001b[103m[(8,)]\u001b[0m\n",
"Answer:\u001b[102m 8\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
"\u001b[102m SELECT COUNT(*) FROM Employee\u001b[0m\u001b[49m\n",
"SQLResult: \u001b[0m\u001b[103m[(8,)]\u001b[0m\u001b[49m\n",
"Answer:\u001b[0m\u001b[102m There are 8 employees.\u001b[0m"
]
},
{
"data": {
"text/plain": [
"' 8'"
"' There are 8 employees.'"
]
},
"execution_count": 3,
@@ -99,7 +64,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "61d91b85",
"id": "146fa162",
"metadata": {},
"outputs": [],
"source": []

View File

@@ -1,15 +1,5 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "07c1e3b9",
"metadata": {},
"source": [
"# Vector DB Question/Answering\n",
"\n",
"This example showcases question answering over a vector database."
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -25,12 +15,12 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "5c7049db",
"metadata": {},
"outputs": [],
"source": [
"with open('../state_of_the_union.txt') as f:\n",
"with open('state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_text(state_of_the_union)\n",
@@ -41,7 +31,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "3018f865",
"metadata": {},
"outputs": [],
@@ -51,17 +41,17 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 4,
"id": "032a47f8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' The President said that Ketanji Brown Jackson is a consensus builder and has received a broad range of support since she was nominated.'"
"\" The president said that Ketanji Brown Jackson is one of our nation's top legal minds, who will continue Justice Breyers legacy of excellence.\""
]
},
"execution_count": 5,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}

View File

@@ -1 +1 @@
0.0.16
0.0.12

View File

@@ -0,0 +1,47 @@
from typing import Dict, List
from langchain.chains.base import Chain
from pydantic import BaseModel, Extra
from langchain.llms.base import LLM
from langchain.chains.llm import LLMChain
from langchain.chains.ape.prompt import PROMPT
class APEChain(Chain, BaseModel):
llm: LLM
input_key: str = "code" #: :meta private:
output_key: str = "output" #: :meta private:
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
arbitrary_types_allowed = True
@property
def input_keys(self) -> List[str]:
"""Expect input key.
:meta private:
"""
return [self.input_key]
@property
def output_keys(self) -> List[str]:
"""Return output key.
:meta private:
"""
return [self.output_key]
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
chain = LLMChain(llm=self.llm, prompt=PROMPT)
output = chain.run(inputs[self.input_key])
return {self.output_key: output}
def ape(self, examples: List[str]) -> str:
combined_examples = "\n\n".join(examples)
return self.run(combined_examples)
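For orientation, here is a minimal usage sketch of the APEChain added above. The module path and the example strings are assumptions for illustration; only the fields and methods visible in the class body are taken as given.

from langchain.chains.ape.ape import APEChain  # assumed module path for the file above
from langchain.llms.openai import OpenAI

# Hypothetical input-output pairs from which to recover the instruction.
examples = [
    "Input: prove  Output: disprove",
    "Input: lock  Output: unlock",
]

chain = APEChain(llm=OpenAI(temperature=0))
# ape() joins the examples with blank lines and runs them through PROMPT,
# so the completion should finish the sentence "The instruction was to ...".
instruction = chain.ape(examples)
print(instruction)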

View File

@@ -0,0 +1,9 @@
# flake8: noqa
from langchain.prompts.prompt import Prompt
_TEMPLATE = """I gave a friend an instruction. Based on the instruction they produced the following input-output pairs:
{examples}
The instruction was to"""
PROMPT = Prompt(input_variables=["examples"], template=_TEMPLATE)
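As a quick illustration (not part of the commit), formatting this prompt with a hypothetical pair of examples yields the meta-prompt that APEChain sends to the LLM:

from langchain.chains.ape.prompt import PROMPT

examples = "Input: 2 + 2  Output: 4\n\nInput: 3 + 3  Output: 6"
# Substitutes {examples} into _TEMPLATE; the model is expected to complete
# the trailing sentence "The instruction was to ...".
print(PROMPT.format(examples=examples))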

View File

@@ -9,7 +9,7 @@ class Chain(BaseModel, ABC):
"""Base interface that all chains should implement."""
verbose: bool = False
"""Whether to print out response text."""
"""Whether to print out the code that was executed."""
@property
@abstractmethod
@@ -49,10 +49,6 @@ class Chain(BaseModel, ABC):
self._validate_outputs(outputs)
return {**inputs, **outputs}
def apply(self, input_list: List[Dict[str, Any]]) -> List[Dict[str, str]]:
"""Call the chain on all inputs in the list."""
return [self(inputs) for inputs in input_list]
def run(self, text: str) -> str:
"""Run text in, text out (if applicable)."""
if len(self.input_keys) != 1:
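To make the single-input contract behind run() concrete, here is a throwaway chain modeled on the FakeChain used in the pipeline test further down in this diff; the class itself is illustrative only:

from typing import Dict, List
from pydantic import BaseModel
from langchain.chains.base import Chain

class EchoChain(Chain, BaseModel):
    """Toy chain with exactly one input key and one output key, so run() applies."""

    @property
    def input_keys(self) -> List[str]:
        return ["text"]

    @property
    def output_keys(self) -> List[str]:
        return ["echo"]

    def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
        return {"echo": inputs["text"].upper()}

EchoChain().run("hello")  # -> "HELLO"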

View File

@@ -1,74 +0,0 @@
"""Chain that generates a list and then maps each output to another chain."""
from typing import Dict, List
from pydantic import BaseModel, Extra, root_validator
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
class MapChain(Chain, BaseModel):
"""Chain that generates a list and then maps each output to another chain."""
llm_chain: LLMChain
map_chain: Chain
n: int
output_key_prefix: str = "output" #: :meta private:
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
arbitrary_types_allowed = True
@property
def input_keys(self) -> List[str]:
"""Will be whatever keys the prompt expects.
:meta private:
"""
vars = self.llm_chain.prompt.input_variables
return [v for v in vars if v != "n"]
@property
def output_keys(self) -> List[str]:
"""Return output key.
:meta private:
"""
return [f"{self.output_key_prefix}_{i}" for i in range(self.n)]
@root_validator()
def validate_llm_chain(cls, values: Dict) -> Dict:
"""Check that llm chain takes as input `n`."""
input_vars = values["llm_chain"].prompt.input_variables
if "n" not in input_vars:
raise ValueError(
"For MapChains, `n` should be one of the input variables to "
f"llm_chain, only got {input_vars}"
)
return values
@root_validator()
def validate_map_chain(cls, values: Dict) -> Dict:
"""Check that map chain takes a single input."""
map_chain_inputs = values["map_chain"].input_keys
if len(map_chain_inputs) != 1:
raise ValueError(
"For MapChains, the map_chain should take a single input,"
f" got {map_chain_inputs}."
)
return values
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
_inputs = {key: inputs[key] for key in self.input_keys}
_inputs["n"] = self.n
output = self.llm_chain.predict(**_inputs)
new_inputs = output.split("\n")
if len(new_inputs) != self.n:
raise ValueError(
f"Got {len(new_inputs)} items, but expected to get {self.n}"
)
outputs = {self.map_chain.run(text) for text in new_inputs}
return outputs

View File

@@ -59,13 +59,16 @@ class MapReduceChain(Chain, BaseModel):
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
# Split the larger text into smaller chunks.
docs = self.text_splitter.split_text(inputs[self.input_key])
docs = self.text_splitter.split_text(
inputs[self.input_key],
)
# Now that we have the chunks, we send them to the LLM and track results.
# This is the "map" part.
input_list = [{self.map_llm.prompt.input_variables[0]: d} for d in docs]
summary_results = self.map_llm.apply(input_list)
summaries = [res[self.map_llm.output_key] for res in summary_results]
summaries = []
for d in docs:
inputs = {self.map_llm.prompt.input_variables[0]: d}
res = self.map_llm.predict(**inputs)
summaries.append(res)
# We then need to combine these individual parts into one.
# This is the reduce part.

View File

@@ -28,7 +28,14 @@ class Crawler:
"Could not import playwright python package. "
"Please it install it with `pip install playwright`."
)
self.browser = sync_playwright().start().chromium.launch(headless=False)
self.browser = (
sync_playwright()
.start()
.chromium.launch(
headless=False,
)
)
self.page = self.browser.new_page()
self.page.set_viewport_size({"width": 1280, "height": 1080})

View File

@@ -1,71 +0,0 @@
"""Chain pipeline where the outputs of one step feed directly into next."""
from typing import Dict, List
from pydantic import BaseModel, Extra, root_validator
from langchain.chains.base import Chain
class Pipeline(Chain, BaseModel):
"""Chain pipeline where the outputs of one step feed directly into next."""
chains: List[Chain]
input_variables: List[str]
output_variables: List[str] #: :meta private:
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
arbitrary_types_allowed = True
@property
def input_keys(self) -> List[str]:
"""Expect input key.
:meta private:
"""
return self.input_variables
@property
def output_keys(self) -> List[str]:
"""Return output key.
:meta private:
"""
return self.output_variables
@root_validator(pre=True)
def validate_chains(cls, values: Dict) -> Dict:
"""Validate that the correct inputs exist for all chains."""
chains = values["chains"]
input_variables = values["input_variables"]
known_variables = set(input_variables)
for chain in chains:
missing_vars = set(chain.input_keys).difference(known_variables)
if missing_vars:
raise ValueError(f"Missing required input keys: {missing_vars}")
overlapping_keys = known_variables.intersection(chain.output_keys)
if overlapping_keys:
raise ValueError(
f"Chain returned keys that already exist: {overlapping_keys}"
)
known_variables |= set(chain.output_keys)
if "output_variables" not in values:
values["output_variables"] = known_variables.difference(input_variables)
else:
missing_vars = known_variables.difference(values["output_variables"])
if missing_vars:
raise ValueError(
f"Expected output variables that were not found: {missing_vars}."
)
return values
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
known_values = inputs.copy()
for chain in self.chains:
outputs = chain(known_values)
known_values.update(outputs)
return {k: known_values[k] for k in self.output_variables}

View File

@@ -109,4 +109,8 @@ Action 3: Finish[yes]""",
]
SUFFIX = """\n\nQuestion: {input}"""
PROMPT = Prompt.from_examples(EXAMPLES, SUFFIX, ["input"])
PROMPT = Prompt.from_examples(
EXAMPLES,
SUFFIX,
["input"],
)

View File

@@ -38,4 +38,7 @@ Intermediate Answer: New Zealand.
So the final answer is: No
Question: {input}"""
PROMPT = Prompt(input_variables=["input"], template=_DEFAULT_TEMPLATE)
PROMPT = Prompt(
input_variables=["input"],
template=_DEFAULT_TEMPLATE,
)

View File

@@ -9,7 +9,6 @@ from typing import Any, Dict, List, Optional
from pydantic import BaseModel, Extra, root_validator
from langchain.chains.base import Chain
from langchain.utils import get_from_dict_or_env
class HiddenPrints:
@@ -44,7 +43,7 @@ class SerpAPIChain(Chain, BaseModel):
input_key: str = "search_query" #: :meta private:
output_key: str = "search_result" #: :meta private:
serpapi_api_key: Optional[str] = None
serpapi_api_key: Optional[str] = os.environ.get("SERPAPI_API_KEY")
class Config:
"""Configuration for this pydantic object."""
@@ -70,10 +69,14 @@ class SerpAPIChain(Chain, BaseModel):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
serpapi_api_key = get_from_dict_or_env(
values, "serpapi_api_key", "SERPAPI_API_KEY"
)
values["serpapi_api_key"] = serpapi_api_key
serpapi_api_key = values.get("serpapi_api_key")
if serpapi_api_key is None or serpapi_api_key == "":
raise ValueError(
"Did not find SerpAPI API key, please add an environment variable"
" `SERPAPI_API_KEY` which contains it, or pass `serpapi_api_key` "
"as a named parameter to the constructor."
)
try:
from serpapi import GoogleSearch

View File

@@ -1,61 +0,0 @@
"""Simple chain pipeline where the outputs of one step feed directly into next."""
from typing import Dict, List
from pydantic import BaseModel, Extra, root_validator
from langchain.chains.base import Chain
class SimplePipeline(Chain, BaseModel):
"""Simple chain pipeline where the outputs of one step feed directly into next."""
chains: List[Chain]
input_key: str = "input" #: :meta private:
output_key: str = "output" #: :meta private:
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
arbitrary_types_allowed = True
@property
def input_keys(self) -> List[str]:
"""Expect input key.
:meta private:
"""
return [self.input_key]
@property
def output_keys(self) -> List[str]:
"""Return output key.
:meta private:
"""
return [self.output_key]
@root_validator()
def validate_chains(cls, values: Dict) -> Dict:
"""Validate that chains are all single input/output."""
for chain in values["chains"]:
if len(chain.input_keys) != 1:
raise ValueError(
"Chains used in SimplePipeline should all have one input, got "
f"{chain} with {len(chain.input_keys)} inputs."
)
if len(chain.output_keys) != 1:
raise ValueError(
"Chains used in SimplePipeline should all have one output, got "
f"{chain} with {len(chain.output_keys)} outputs."
)
return values
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
_input = inputs[self.input_key]
for chain in self.chains:
_input = chain.run(_input)
# Clean the input
_input = _input.strip()
return {self.output_key: _input}

View File

@@ -15,5 +15,6 @@ Only use the following tables:
Question: {input}"""
PROMPT = Prompt(
input_variables=["input", "table_info", "dialect"], template=_DEFAULT_TEMPLATE
input_variables=["input", "table_info", "dialect"],
template=_DEFAULT_TEMPLATE,
)

View File

@@ -27,8 +27,6 @@ class VectorDBQA(Chain, BaseModel):
"""LLM wrapper to use."""
vectorstore: VectorStore
"""Vector Database to connect to."""
k: int = 4
"""Number of documents to query for."""
input_key: str = "query" #: :meta private:
output_key: str = "result" #: :meta private:
@@ -57,10 +55,26 @@ class VectorDBQA(Chain, BaseModel):
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
question = inputs[self.input_key]
llm_chain = LLMChain(llm=self.llm, prompt=prompt)
docs = self.vectorstore.similarity_search(question, k=self.k)
docs = self.vectorstore.similarity_search(question)
contexts = []
for j, doc in enumerate(docs):
contexts.append(f"Context {j}:\n{doc.page_content}")
# TODO: handle cases where this context is too long.
answer = llm_chain.predict(question=question, context="\n\n".join(contexts))
return {self.output_key: answer}
def run(self, question: str) -> str:
"""Run Question-Answering on a vector database.
Args:
question: Question to get the answer for.
Returns:
The final answer
Example:
.. code-block:: python
answer = vectordbqa.run("What is the capital of Idaho?")
"""
return self({self.input_key: question})[self.output_key]

View File

@@ -1,7 +1,7 @@
"""Interface for interacting with a document."""
from typing import List
from pydantic import BaseModel, Field
from pydantic import BaseModel
class Document(BaseModel):
@@ -10,7 +10,6 @@ class Document(BaseModel):
page_content: str
lookup_str: str = ""
lookup_index = 0
metadata: dict = Field(default_factory=dict)
@property
def paragraphs(self) -> List[str]:

View File

@@ -1,10 +1,10 @@
"""Wrapper around Cohere embedding models."""
import os
from typing import Any, Dict, List, Optional
from pydantic import BaseModel, Extra, root_validator
from langchain.embeddings.base import Embeddings
from langchain.utils import get_from_dict_or_env
class CohereEmbeddings(BaseModel, Embeddings):
@@ -25,7 +25,7 @@ class CohereEmbeddings(BaseModel, Embeddings):
model: str = "medium"
"""Model name to use."""
cohere_api_key: Optional[str] = None
cohere_api_key: Optional[str] = os.environ.get("COHERE_API_KEY")
class Config:
"""Configuration for this pydantic object."""
@@ -35,9 +35,14 @@ class CohereEmbeddings(BaseModel, Embeddings):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
cohere_api_key = get_from_dict_or_env(
values, "cohere_api_key", "COHERE_API_KEY"
)
cohere_api_key = values.get("cohere_api_key")
if cohere_api_key is None or cohere_api_key == "":
raise ValueError(
"Did not find Cohere API key, please add an environment variable"
" `COHERE_API_KEY` which contains it, or pass `cohere_api_key` as a"
" named parameter."
)
try:
import cohere

View File

@@ -1,10 +1,10 @@
"""Wrapper around OpenAI embedding models."""
import os
from typing import Any, Dict, List, Optional
from pydantic import BaseModel, Extra, root_validator
from langchain.embeddings.base import Embeddings
from langchain.utils import get_from_dict_or_env
class OpenAIEmbeddings(BaseModel, Embeddings):
@@ -25,7 +25,7 @@ class OpenAIEmbeddings(BaseModel, Embeddings):
model_name: str = "babbage"
"""Model name to use."""
openai_api_key: Optional[str] = None
openai_api_key: Optional[str] = os.environ.get("OPENAI_API_KEY")
class Config:
"""Configuration for this pydantic object."""
@@ -35,9 +35,14 @@ class OpenAIEmbeddings(BaseModel, Embeddings):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
openai_api_key = get_from_dict_or_env(
values, "openai_api_key", "OPENAI_API_KEY"
)
openai_api_key = values.get("openai_api_key")
if openai_api_key is None or openai_api_key == "":
raise ValueError(
"Did not find OpenAI API key, please add an environment variable"
" `OPENAI_API_KEY` which contains it, or pass `openai_api_key` as a"
" named parameter."
)
try:
import openai

View File

@@ -1,19 +1,14 @@
"""Handle chained inputs."""
from typing import Dict, List, Optional
_TEXT_COLOR_MAPPING = {
"blue": "36;1",
"yellow": "33;1",
"pink": "38;5;200",
"green": "32;1",
}
_COLOR_MAPPING = {"blue": 104, "yellow": 103, "red": 101, "green": 102}
def get_color_mapping(
items: List[str], excluded_colors: Optional[List] = None
) -> Dict[str, str]:
"""Get mapping for items to a support color."""
colors = list(_TEXT_COLOR_MAPPING.keys())
colors = list(_COLOR_MAPPING.keys())
if excluded_colors is not None:
colors = [c for c in colors if c not in excluded_colors]
color_mapping = {item: colors[i % len(colors)] for i, item in enumerate(items)}
@@ -25,8 +20,8 @@ def print_text(text: str, color: Optional[str] = None, end: str = "") -> None:
if color is None:
print(text, end=end)
else:
color_str = _TEXT_COLOR_MAPPING[color]
print(f"\u001b[{color_str}m\033[1;3m{text}\u001b[0m", end=end)
color_str = _COLOR_MAPPING[color]
print(f"\x1b[{color_str}m{text}\x1b[0m", end=end)
class ChainedInput:
@@ -34,14 +29,14 @@ class ChainedInput:
def __init__(self, text: str, verbose: bool = False):
"""Initialize with verbose flag and initial text."""
self._verbose = verbose
if self._verbose:
self.verbose = verbose
if self.verbose:
print_text(text, None)
self._input = text
def add(self, text: str, color: Optional[str] = None) -> None:
"""Add text to input, print if in verbose mode."""
if self._verbose:
if self.verbose:
print_text(text, color)
self._input += text
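Mirroring the unit tests further down, the two helpers touched here can be exercised as in this sketch (using the _COLOR_MAPPING variant shown in this hunk; the item names are hypothetical):

from langchain.input import ChainedInput, get_color_mapping

# Colors are handed out round-robin from the mapping, skipping excluded ones:
# with blue excluded this yields {"search": "yellow", "calculator": "red"}.
mapping = get_color_mapping(["search", "calculator"], excluded_colors=["blue"])

chained = ChainedInput("Question: what is 2 + 2?", verbose=True)
chained.add("\nThought: use the calculator", color=mapping["calculator"])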

View File

@@ -1,11 +1,11 @@
"""Wrapper around AI21 APIs."""
import os
from typing import Any, Dict, List, Mapping, Optional
import requests
from pydantic import BaseModel, Extra, root_validator
from langchain.llms.base import LLM
from langchain.utils import get_from_dict_or_env
class AI21PenaltyData(BaseModel):
@@ -62,7 +62,7 @@ class AI21(BaseModel, LLM):
logitBias: Optional[Dict[str, float]] = None
"""Adjust the probability of specific tokens being generated."""
ai21_api_key: Optional[str] = None
ai21_api_key: Optional[str] = os.environ.get("AI21_API_KEY")
class Config:
"""Configuration for this pydantic object."""
@@ -72,8 +72,14 @@ class AI21(BaseModel, LLM):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key exists in environment."""
ai21_api_key = get_from_dict_or_env(values, "ai21_api_key", "AI21_API_KEY")
values["ai21_api_key"] = ai21_api_key
ai21_api_key = values.get("ai21_api_key")
if ai21_api_key is None or ai21_api_key == "":
raise ValueError(
"Did not find AI21 API key, please add an environment variable"
" `AI21_API_KEY` which contains it, or pass `ai21_api_key`"
" as a named parameter."
)
return values
@property
@@ -116,7 +122,11 @@ class AI21(BaseModel, LLM):
response = requests.post(
url=f"https://api.ai21.com/studio/v1/{self.model}/complete",
headers={"Authorization": f"Bearer {self.ai21_api_key}"},
json={"prompt": prompt, "stopSequences": stop, **self._default_params},
json={
"prompt": prompt,
"stopSequences": stop,
**self._default_params,
},
)
if response.status_code != 200:
optional_detail = response.json().get("error")

View File

@@ -1,11 +1,11 @@
"""Wrapper around Cohere APIs."""
import os
from typing import Any, Dict, List, Mapping, Optional
from pydantic import BaseModel, Extra, root_validator
from langchain.llms.base import LLM
from langchain.llms.utils import enforce_stop_tokens
from langchain.utils import get_from_dict_or_env
class Cohere(LLM, BaseModel):
@@ -44,7 +44,7 @@ class Cohere(LLM, BaseModel):
presence_penalty: int = 0
"""Penalizes repeated tokens."""
cohere_api_key: Optional[str] = None
cohere_api_key: Optional[str] = os.environ.get("COHERE_API_KEY")
class Config:
"""Configuration for this pydantic object."""
@@ -54,9 +54,14 @@ class Cohere(LLM, BaseModel):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
cohere_api_key = get_from_dict_or_env(
values, "cohere_api_key", "COHERE_API_KEY"
)
cohere_api_key = values.get("cohere_api_key")
if cohere_api_key is None or cohere_api_key == "":
raise ValueError(
"Did not find Cohere API key, please add an environment variable"
" `COHERE_API_KEY` which contains it, or pass `cohere_api_key`"
" as a named parameter."
)
try:
import cohere

View File

@@ -1,11 +1,11 @@
"""Wrapper around HuggingFace APIs."""
import os
from typing import Any, Dict, List, Mapping, Optional
from pydantic import BaseModel, Extra, root_validator
from langchain.llms.base import LLM
from langchain.llms.utils import enforce_stop_tokens
from langchain.utils import get_from_dict_or_env
DEFAULT_REPO_ID = "gpt2"
VALID_TASKS = ("text2text-generation", "text-generation")
@@ -18,7 +18,7 @@ class HuggingFaceHub(LLM, BaseModel):
environment variable ``HUGGINGFACEHUB_API_TOKEN`` set with your API token, or pass
it as a named parameter to the constructor.
Only supports `text-generation` and `text2text-generation` for now.
Only supports task `text-generation` for now.
Example:
.. code-block:: python
@@ -35,7 +35,7 @@ class HuggingFaceHub(LLM, BaseModel):
model_kwargs: Optional[dict] = None
"""Key word arguments to pass to the model."""
huggingfacehub_api_token: Optional[str] = None
huggingfacehub_api_token: Optional[str] = os.environ.get("HUGGINGFACEHUB_API_TOKEN")
class Config:
"""Configuration for this pydantic object."""
@@ -45,9 +45,13 @@ class HuggingFaceHub(LLM, BaseModel):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
huggingfacehub_api_token = get_from_dict_or_env(
values, "huggingfacehub_api_token", "HUGGINGFACEHUB_API_TOKEN"
)
huggingfacehub_api_token = values.get("huggingfacehub_api_token")
if huggingfacehub_api_token is None or huggingfacehub_api_token == "":
raise ValueError(
"Did not find HuggingFace API token, please add an environment variable"
" `HUGGINGFACEHUB_API_TOKEN` which contains it, or pass"
" `huggingfacehub_api_token` as a named parameter."
)
try:
from huggingface_hub.inference_api import InferenceApi

View File

@@ -1,10 +1,10 @@
"""Wrapper around NLPCloud APIs."""
import os
from typing import Any, Dict, List, Mapping, Optional
from pydantic import BaseModel, Extra, root_validator
from langchain.llms.base import LLM
from langchain.utils import get_from_dict_or_env
class NLPCloud(LLM, BaseModel):
@@ -54,7 +54,7 @@ class NLPCloud(LLM, BaseModel):
num_return_sequences: int = 1
"""How many completions to generate for each prompt."""
nlpcloud_api_key: Optional[str] = None
nlpcloud_api_key: Optional[str] = os.environ.get("NLPCLOUD_API_KEY")
class Config:
"""Configuration for this pydantic object."""
@@ -64,9 +64,14 @@ class NLPCloud(LLM, BaseModel):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
nlpcloud_api_key = get_from_dict_or_env(
values, "nlpcloud_api_key", "NLPCLOUD_API_KEY"
)
nlpcloud_api_key = values.get("nlpcloud_api_key")
if nlpcloud_api_key is None or nlpcloud_api_key == "":
raise ValueError(
"Did not find NLPCloud API key, please add an environment variable"
" `NLPCLOUD_API_KEY` which contains it, or pass `nlpcloud_api_key`"
" as a named parameter."
)
try:
import nlpcloud

View File

@@ -1,10 +1,10 @@
"""Wrapper around OpenAI APIs."""
import os
from typing import Any, Dict, List, Mapping, Optional
from pydantic import BaseModel, Extra, root_validator
from langchain.llms.base import LLM
from langchain.utils import get_from_dict_or_env
class OpenAI(LLM, BaseModel):
@@ -38,7 +38,7 @@ class OpenAI(LLM, BaseModel):
best_of: int = 1
"""Generates best_of completions server-side and returns the "best"."""
openai_api_key: Optional[str] = None
openai_api_key: Optional[str] = os.environ.get("OPENAI_API_KEY")
class Config:
"""Configuration for this pydantic object."""
@@ -48,9 +48,14 @@ class OpenAI(LLM, BaseModel):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
openai_api_key = get_from_dict_or_env(
values, "openai_api_key", "OPENAI_API_KEY"
)
openai_api_key = values.get("openai_api_key")
if openai_api_key is None or openai_api_key == "":
raise ValueError(
"Did not find OpenAI API key, please add an environment variable"
" `OPENAI_API_KEY` which contains it, or pass `openai_api_key`"
" as a named parameter."
)
try:
import openai

View File

@@ -1,7 +1,6 @@
"""Experiment with different models."""
from typing import List, Optional, Sequence
from typing import List, Optional
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.input import get_color_mapping, print_text
from langchain.llms.base import LLM
@@ -11,41 +10,7 @@ from langchain.prompts.prompt import Prompt
class ModelLaboratory:
"""Experiment with different models."""
def __init__(self, chains: Sequence[Chain], names: Optional[List[str]] = None):
"""Initialize with chains to experiment with.
Args:
chains: list of chains to experiment with.
"""
if not isinstance(chains[0], Chain):
raise ValueError(
"ModelLaboratory should now be initialized with Chains. "
"If you want to initialize with LLMs, use the `from_llms` method "
"instead (`ModelLaboratory.from_llms(...)`)"
)
for chain in chains:
if len(chain.input_keys) != 1:
raise ValueError(
"Currently only support chains with one input variable, "
f"got {chain.input_keys}"
)
if len(chain.output_keys) != 1:
raise ValueError(
"Currently only support chains with one output variable, "
f"got {chain.output_keys}"
)
if names is not None:
if len(names) != len(chains):
raise ValueError("Length of chains does not match length of names.")
self.chains = chains
chain_range = [str(i) for i in range(len(self.chains))]
self.chain_colors = get_color_mapping(chain_range)
self.names = names
@classmethod
def from_llms(
cls, llms: List[LLM], prompt: Optional[Prompt] = None
) -> "ModelLaboratory":
def __init__(self, llms: List[LLM], prompt: Optional[Prompt] = None):
"""Initialize with LLMs to experiment with and optional prompt.
Args:
@@ -53,11 +18,18 @@ class ModelLaboratory:
prompt: Optional prompt to use to prompt the LLMs. Defaults to None.
If a prompt was provided, it should only have one input variable.
"""
self.llms = llms
llm_range = [str(i) for i in range(len(self.llms))]
self.llm_colors = get_color_mapping(llm_range)
if prompt is None:
prompt = Prompt(input_variables=["_input"], template="{_input}")
chains = [LLMChain(llm=llm, prompt=prompt) for llm in llms]
names = [str(llm) for llm in llms]
return cls(chains, names=names)
self.prompt = Prompt(input_variables=["_input"], template="{_input}")
else:
if len(prompt.input_variables) != 1:
raise ValueError(
"Currently only support prompts with one input variable, "
f"got {prompt}"
)
self.prompt = prompt
def compare(self, text: str) -> None:
"""Compare model outputs on an input text.
@@ -70,11 +42,9 @@ class ModelLaboratory:
text: input text to run all models on.
"""
print(f"\033[1mInput:\033[0m\n{text}\n")
for i, chain in enumerate(self.chains):
if self.names is not None:
name = self.names[i]
else:
name = str(chain)
print_text(name, end="\n")
output = chain.run(text)
print_text(output, color=self.chain_colors[str(i)], end="\n\n")
for i, llm in enumerate(self.llms):
print_text(str(llm), end="\n")
chain = LLMChain(llm=llm, prompt=self.prompt)
llm_inputs = {self.prompt.input_variables[0]: text}
output = chain.predict(**llm_inputs)
print_text(output, color=self.llm_colors[str(i)], end="\n\n")
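A minimal driving sketch for this version of the class, assuming the module path langchain.model_laboratory and that the relevant API keys are available:

from langchain.llms.cohere import Cohere
from langchain.llms.openai import OpenAI
from langchain.model_laboratory import ModelLaboratory  # assumed module path

# With no prompt supplied, the default "{_input}" prompt passes the text straight through.
lab = ModelLaboratory([OpenAI(temperature=0), Cohere(temperature=0)])
lab.compare("What color is a flamingo?")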

View File

@@ -94,7 +94,8 @@ class Prompt(BaseModel, BasePrompt):
Returns:
The final prompt generated.
"""
template = example_separator.join([prefix, *examples, suffix])
example_str = example_separator.join(examples)
template = prefix + example_str + suffix
return cls(input_variables=input_variables, template=template)
@classmethod
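For reference, the two assembly strategies in this hunk differ only in whether the separator is also placed around the prefix and suffix, which is why the prompt test later in this diff moves the "\n\n" padding into the prefix and suffix themselves. A standalone sketch with hypothetical strings:

prefix = "Test Prompt:"
suffix = "Question: {question}\nAnswer:"
examples = ["Question: who are you?\nAnswer: foo"]
sep = "\n\n"

# Separator wraps everything, including prefix and suffix.
joined_all = sep.join([prefix, *examples, suffix])
# Separator only between examples; prefix/suffix must carry their own padding.
joined_examples_only = prefix + sep.join(examples) + suffix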

View File

@@ -1,6 +1,4 @@
"""SQLAlchemy wrapper around a database."""
from typing import Any, Iterable, List, Optional
from sqlalchemy import create_engine, inspect
from sqlalchemy.engine import Engine
@@ -8,57 +6,29 @@ from sqlalchemy.engine import Engine
class SQLDatabase:
"""SQLAlchemy wrapper around a database."""
def __init__(
self,
engine: Engine,
ignore_tables: Optional[List[str]] = None,
include_tables: Optional[List[str]] = None,
):
def __init__(self, engine: Engine):
"""Create engine from database URI."""
self._engine = engine
if include_tables and ignore_tables:
raise ValueError("Cannot specify both include_tables and ignore_tables")
self._inspector = inspect(self._engine)
self._all_tables = self._inspector.get_table_names()
self._include_tables = include_tables or []
if self._include_tables:
missing_tables = set(self._include_tables).difference(self._all_tables)
if missing_tables:
raise ValueError(
f"include_tables {missing_tables} not found in database"
)
self._ignore_tables = ignore_tables or []
if self._ignore_tables:
missing_tables = set(self._ignore_tables).difference(self._all_tables)
if missing_tables:
raise ValueError(
f"ignore_tables {missing_tables} not found in database"
)
@classmethod
def from_uri(cls, database_uri: str, **kwargs: Any) -> "SQLDatabase":
def from_uri(cls, database_uri: str) -> "SQLDatabase":
"""Construct a SQLAlchemy engine from URI."""
return cls(create_engine(database_uri), **kwargs)
return cls(create_engine(database_uri))
@property
def dialect(self) -> str:
"""Return string representation of dialect to use."""
return self._engine.dialect.name
def _get_table_names(self) -> Iterable[str]:
if self._include_tables:
return self._include_tables
return set(self._all_tables) - set(self._ignore_tables)
@property
def table_info(self) -> str:
"""Information about all tables in the database."""
template = "Table '{table_name}' has columns: {columns}."
template = "The '{table_name}' table has columns: {columns}."
tables = []
for table_name in self._get_table_names():
inspector = inspect(self._engine)
for table_name in inspector.get_table_names():
columns = []
for column in self._inspector.get_columns(table_name):
for column in inspector.get_columns(table_name):
columns.append(f"{column['name']} ({str(column['type'])})")
column_str = ", ".join(columns)
table_str = template.format(table_name=table_name, columns=column_str)

View File

@@ -1,17 +0,0 @@
"""Generic utility functions."""
import os
from typing import Any, Dict
def get_from_dict_or_env(data: Dict[str, Any], key: str, env_key: str) -> str:
"""Get a value from a dictionary or an environment variable."""
if key in data and data[key]:
return data[key]
elif env_key in os.environ and os.environ[env_key]:
return os.environ[env_key]
else:
raise ValueError(
f"Did not find {key}, please add an environment variable"
f" `{env_key}` which contains it, or pass"
f" `{key}` as a named parameter."
)

View File

@@ -1,10 +1,10 @@
"""Wrapper around Elasticsearch vector database."""
import os
import uuid
from typing import Any, Callable, Dict, List
from langchain.docstore.document import Document
from langchain.embeddings.base import Embeddings
from langchain.utils import get_from_dict_or_env
from langchain.vectorstores.base import VectorStore
@@ -45,7 +45,10 @@ class ElasticVectorSearch(VectorStore):
"""
def __init__(
self, elasticsearch_url: str, index_name: str, embedding_function: Callable
self,
elasticsearch_url: str,
index_name: str,
embedding_function: Callable,
):
"""Initialize with necessary components."""
try:
@@ -107,9 +110,16 @@ class ElasticVectorSearch(VectorStore):
elasticsearch_url="http://localhost:9200"
)
"""
elasticsearch_url = get_from_dict_or_env(
kwargs, "elasticsearch_url", "ELASTICSEARCH_URL"
)
elasticsearch_url = kwargs.get("elasticsearch_url")
if not elasticsearch_url:
elasticsearch_url = os.environ.get("ELASTICSEARCH_URL")
if elasticsearch_url is None or elasticsearch_url == "":
raise ValueError(
"Did not find Elasticsearch URL, please add an environment variable"
" `ELASTICSEARCH_URL` which contains it, or pass"
" `elasticsearch_url` as a named parameter."
)
try:
import elasticsearch
from elasticsearch.helpers import bulk

View File

@@ -6,7 +6,9 @@ def test_manifest_wrapper() -> None:
"""Test manifest wrapper."""
from manifest import Manifest
manifest = Manifest(client_name="openai")
manifest = Manifest(
client_name="openai",
)
llm = ManifestWrapper(client=manifest, llm_kwargs={"temperature": 0})
output = llm("The capital of New York is:")
assert output == "Albany"

View File

@@ -1,34 +0,0 @@
from typing import Dict, List
from pydantic import BaseModel
from langchain.chains.base import Chain
from langchain.chains.pipeline import Pipeline
class FakeChain(Chain, BaseModel):
input_variables: List[str]
output_variables: List[str]
@property
def input_keys(self) -> List[str]:
return self.input_variables
@property
def output_keys(self) -> List[str]:
return self.output_variables
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
outputs = {}
for var in self.output_variables:
outputs[var] = " ".join(self.input_variables) + "foo"
return outputs
def test_pipeline_usage() -> None:
chain_1 = FakeChain(input_variables=["foo"], output_variables=["bar"])
chain_2 = FakeChain(input_variables=["bar"], output_variables=["baz"])
pipeline = Pipeline(chains=[chain_1, chain_2], input_variables=["foo"])
output = pipeline({"foo": "123"})
breakpoint()

View File

@@ -48,7 +48,7 @@ def test_chained_input_verbose() -> None:
chained_input.add("baz", color="blue")
sys.stdout = old_stdout
output = mystdout.getvalue()
assert output == "\x1b[36;1m\x1b[1;3mbaz\x1b[0m"
assert output == "\x1b[104mbaz\x1b[0m"
assert chained_input.input == "foobarbaz"
@@ -70,5 +70,5 @@ def test_get_color_mapping_excluded_colors() -> None:
"""Test getting of color mapping with excluded colors."""
items = ["foo", "bar"]
output = get_color_mapping(items, excluded_colors=["blue"])
expected_output = {"foo": "yellow", "bar": "pink"}
expected_output = {"foo": "yellow", "bar": "red"}
assert output == expected_output

View File

@@ -51,8 +51,8 @@ Question: {question}
Answer:"""
input_variables = ["question"]
example_separator = "\n\n"
prefix = """Test Prompt:"""
suffix = """Question: {question}\nAnswer:"""
prefix = """Test Prompt:\n\n"""
suffix = """\n\nQuestion: {question}\nAnswer:"""
examples = [
"""Question: who are you?\nAnswer: foo""",
"""Question: what are you?\nAnswer: bar""",

View File

@@ -28,11 +28,11 @@ def test_table_info() -> None:
db = SQLDatabase(engine)
output = db.table_info
expected_output = (
"Table 'company' has columns: company_id (INTEGER), "
"company_location (VARCHAR).",
"Table 'user' has columns: user_id (INTEGER), user_name (VARCHAR(16)).",
"The 'company' table has columns: company_id (INTEGER), "
"company_location (VARCHAR).\n"
"The 'user' table has columns: user_id (INTEGER), user_name (VARCHAR(16))."
)
assert sorted(output.split("\n")) == sorted(expected_output)
assert output == expected_output
def test_sql_database_run() -> None: