Compare commits


32 Commits

Author SHA1 Message Date
Harrison Chase
6766dff804 custom chain and example 2022-11-19 10:42:42 -08:00
Harrison Chase
b5325c212b chain pipelines 2022-11-19 09:55:44 -08:00
Harrison Chase
8869b0ab0e bump version to 0.0.16 (#157) 2022-11-18 06:09:03 -08:00
Harrison Chase
b15c84e19d Harrison/chain lab (#156) 2022-11-18 05:50:02 -08:00
Harrison Chase
0ac08bbca6 bump version to 0.0.15 (#154) 2022-11-16 23:22:05 -08:00
Nicholas Larus-Stone
0c3ae78ec1 chore: update ascii colors to work with dark mode (#152) 2022-11-16 22:05:28 -08:00
Nicholas Larus-Stone
ca4b10bb74 feat: add option to ignore or restrict to SQL tables (#151)
`SQLDatabase` now accepts two `init` arguments (see the sketch after this entry):
1. `ignore_tables` to pass in a list of tables not to search over
2. `include_tables` to restrict the search to a list of tables to consider
2022-11-16 22:04:50 -08:00
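A minimal sketch of the two options from #151, passing the new arguments directly to the `SQLDatabase` constructor. Only the `ignore_tables`/`include_tables` argument names come from the commit message above; the SQLAlchemy engine setup and the Chinook table names are illustrative assumptions.

```python
from sqlalchemy import create_engine

from langchain import SQLDatabase

# Assumes the constructor takes a SQLAlchemy engine, as the from_uri helper
# used in the notebooks suggests.
engine = create_engine("sqlite:///../notebooks/Chinook.db")

# Option 1: only consider the listed tables (table names are illustrative).
db = SQLDatabase(engine, include_tables=["Employee"])

# Option 2: consider every table except the listed ones.
db = SQLDatabase(engine, ignore_tables=["Album", "Artist"])
```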
Harrison Chase
d2f9288be6 add metadata to documents (#153)
add concept of metadata to document
2022-11-16 21:58:05 -08:00
Harrison Chase
d775ddd749 add apply functionality (#150) 2022-11-16 21:39:02 -08:00
thesved
47e35d7d0e Fix notebook links (#149)
Example notebook links were broken.
2022-11-16 15:13:12 -08:00
Harrison Chase
4f1bf159f4 bump version to 0.0.14 (#145) 2022-11-14 22:07:54 -08:00
Harrison Chase
b504cd739f Harrison/cleanup env check (#144) 2022-11-14 22:05:41 -08:00
Harrison Chase
a4b502d92f fix env var loader (#143) 2022-11-14 21:42:43 -08:00
Harrison Chase
1835e8a681 prompt nit (#141)
doing some cleanup, and I think this just simplifies things...
2022-11-14 21:30:33 -08:00
Harrison Chase
bbb405a492 update colors (#140) 2022-11-14 20:27:36 -08:00
Predrag Gruevski
1a95252f00 Use pull_request not pull_request_target in GitHub Actions. (#139)
`pull_request` runs on the merge commit between the opened PR and the
target branch where the PR is to be merged — `master` in this case. This
is desirable because that way the new changes get linted and tested.

The existing `pull_request_target` specifier causes lint and test to run
_on the target branch itself_ (i.e. `master` in this case). That way the
new code in the PR doesn't get linted and tested at all. This can also
lead to security vulnerabilities, as described in the GitHub docs:

![image](https://user-images.githubusercontent.com/2348618/201735153-c5dd0c03-2490-45e9-b7f9-f0d47eb0109f.png)

Screenshot from here:
https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#pull_request_target
Link from the screenshot:
https://securitylab.github.com/research/github-actions-preventing-pwn-requests/
2022-11-14 11:34:08 -08:00
Harrison Chase
9f223e6ccc Harrison/fix lint (#138) 2022-11-14 08:55:59 -08:00
Delip Rao
76cecf8165 A fix for Jupyter environment variable issue (#135)
- fixes the Jupyter environment variable issues mentioned in issue #134 
- fixes format/lint issues in some unrelated files (from `make format/lint`)


![image](https://user-images.githubusercontent.com/347398/201599322-090af858-362d-4d69-bf59-208aea65419a.png)
2022-11-14 08:34:01 -08:00
Harrison Chase
ced29b816b remove extra run from merge conflict (#133) 2022-11-13 21:07:20 -08:00
Harrison Chase
11d37d556e bump version 0.0.13 (#132) 2022-11-13 21:06:50 -08:00
Harrison Chase
b1b6b27c5f Harrison/redo docs (#130)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2022-11-13 20:13:23 -08:00
Harrison Chase
f23b3ceb49 consolidate run functions (#126)
consolidating logic for when a chain is able to run with a single input text and a single output text

open to feedback on naming, logic, usefulness
2022-11-13 18:14:35 -08:00
Harrison Chase
1fe3a4f724 extra requires (#129)
add extra requires
2022-11-13 17:34:58 -08:00
Eugene Yurtsev
2910f50a3c Fix a few typos and wrapped f-strings (#128)
Fix a few typos and wrapped f-strings
2022-11-13 13:16:19 -08:00
Edmar Ferreira
8a5ec894e7 Prompt from file proof of concept using plain text (#127)
This is a simple proof of concept of using external files as templates. 
I'm still feeling my way around the codebase.
As a user, I want to use files as prompts, so it will be easier to
manage and test prompts.
The future direction is to use a template engine, most likely Mako.
2022-11-13 13:15:30 -08:00
Harrison Chase
d87e73ddb1 huggingface tokenizer (#75) 2022-11-13 09:37:44 -08:00
Eugene Yurtsev
b542941234 Bumping python version for read the docs (#122)
Haven't checked whether things work with the new Python version; hoping any errors will be caught by CI.
2022-11-12 13:43:39 -08:00
Eugene Yurtsev
6df08eec52 Readme: Fix link to embeddings example and use python markup for code examples (#123)
* Fix URL to embeddings notebook
* Specify python is used for the code block
2022-11-12 11:26:08 -08:00
Eugene Yurtsev
f5a588a165 Add py.typed marker to package (#121)
- Update
- update
2022-11-12 11:22:32 -08:00
Harrison Chase
47af2bcee4 vector db qa (#71) 2022-11-12 07:24:49 -08:00
Harrison Chase
4c0b684f79 new manifest notebook (#118) 2022-11-11 06:49:06 -08:00
Harrison Chase
7467243a42 bump version 0.0.12 (#116) 2022-11-11 06:41:07 -08:00
106 changed files with 2626 additions and 1073 deletions

View File

@@ -1,6 +1,6 @@
name: lint
on: [push, pull_request_target]
on: [push, pull_request]
jobs:
build:

View File

@@ -1,6 +1,6 @@
name: test
on: [push, pull_request_target]
on: [push, pull_request]
jobs:
build:

View File

@@ -1,2 +1,3 @@
include langchain/py.typed
include langchain/VERSION
include LICENSE

View File

@@ -23,39 +23,13 @@ It aims to create:
2. a flexible interface for combining pieces into a single comprehensive "chain"
3. a schema for easily saving and sharing those chains
## 🔧 Setting up your environment
## 📖 Documentation
Besides installing this Python package, you will also need to install additional packages and set environment variables depending on which chains you want to use.
Note: these packages are not included in the default dependencies because, as this package grows, we do not want to force dependencies that are not needed.
The following use cases require specific installs and api keys:
- _OpenAI_:
- Install requirements with `pip install openai`
- Get an OpenAI api key and either set it as an environment variable (`OPENAI_API_KEY`) or pass it to the LLM constructor as `openai_api_key` (see the sketch after this list).
- _Cohere_:
- Install requirements with `pip install cohere`
- Get a Cohere api key and either set it as an environment variable (`COHERE_API_KEY`) or pass it to the LLM constructor as `cohere_api_key`.
- _HuggingFace Hub_
- Install requirements with `pip install huggingface_hub`
- Get a HuggingFace Hub api token and either set it as an environment variable (`HUGGINGFACEHUB_API_TOKEN`) or pass it to the LLM constructor as `huggingfacehub_api_token`.
- _SerpAPI_:
- Install requirements with `pip install google-search-results`
- Get a SerpAPI api key and either set it as an environment variable (`SERPAPI_API_KEY`) or pass it to the LLM constructor as `serpapi_api_key`.
- _NatBot_:
- Install requirements with `pip install playwright`
- _Wikipedia_:
- Install requirements with `pip install wikipedia`
- _Elasticsearch_:
- Install requirements with `pip install elasticsearch`
- Set up the Elasticsearch backend. If you want to run it locally, [this](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/getting-started.html) is a good guide.
- _FAISS_:
- Install requirements with `pip install faiss` for Python 3.7 and `pip install faiss-cpu` for Python 3.10+.
- _Manifest_:
- Install requirements with `pip install manifest-ml` (Note: this is only available in Python 3.8+ currently).
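For the _OpenAI_ entry above, for example, the key can be supplied either way. A minimal sketch, assuming only the `OPENAI_API_KEY` variable name and `openai_api_key` constructor argument listed above:

```python
import os

from langchain import OpenAI

# Option 1: set the environment variable before constructing the LLM.
os.environ["OPENAI_API_KEY"] = "sk-..."
llm = OpenAI(temperature=0)

# Option 2: pass the key directly to the constructor.
llm = OpenAI(temperature=0, openai_api_key="sk-...")
```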
If you are using the `NLTKTextSplitter` or the `SpacyTextSplitter`, you will also need to install the appropriate models. For example, if you want to use the `SpacyTextSplitter`, you will need to install the `en_core_web_sm` model with `python -m spacy download en_core_web_sm`. Similarly, if you want to use the `NLTKTextSplitter`, you will need to install the `punkt` model with `python -m nltk.downloader punkt`.
Please see [here](https://langchain.readthedocs.io/en/latest/?) for full documentation on:
- Getting started (installation, setting up environment, simple examples)
- How-To examples (demos, integrations, helper functions)
- Reference (full API docs)
- Resources (high level explanation of core concepts)
## 🚀 What can I do with this
@@ -63,9 +37,9 @@ This project was largely inspired by a few projects seen on Twitter for which we
**[Self-ask-with-search](https://ofir.io/self-ask.pdf)**
To recreate this paper, use the following code snippet or check out the [example notebook](https://github.com/hwchase17/langchain/blob/master/examples/self_ask_with_search.ipynb).
To recreate this paper, use the following code snippet or check out the [example notebook](https://github.com/hwchase17/langchain/blob/master/docs/examples/demos/self_ask_with_search.ipynb).
```
```python
from langchain import SelfAskWithSearchChain, OpenAI, SerpAPIChain
llm = OpenAI(temperature=0)
@@ -78,9 +52,9 @@ self_ask_with_search.run("What is the hometown of the reigning men's U.S. Open c
**[LLM Math](https://twitter.com/amasad/status/1568824744367259648?s=20&t=-7wxpXBJinPgDuyHLouP1w)**
To recreate this example, use the following code snippet or check out the [example notebook](https://github.com/hwchase17/langchain/blob/master/examples/llm_math.ipynb).
To recreate this example, use the following code snippet or check out the [example notebook](https://github.com/hwchase17/langchain/blob/master/docs/examples/demos/llm_math.ipynb).
```
```python
from langchain import OpenAI, LLMMathChain
llm = OpenAI(temperature=0)
@@ -91,9 +65,9 @@ llm_math.run("How many of the integers between 0 and 99 inclusive are divisible
**Generic Prompting**
You can also use this for simple prompting pipelines, as in the below example and this [example notebook](https://github.com/hwchase17/langchain/blob/master/examples/simple_prompts.ipynb).
You can also use this for simple prompting pipelines, as in the below example and this [example notebook](https://github.com/hwchase17/langchain/blob/master/docs/examples/demos/simple_prompts.ipynb).
```
```python
from langchain import Prompt, OpenAI, LLMChain
template = """Question: {question}
@@ -110,9 +84,9 @@ llm_chain.predict(question=question)
**Embed & Search Documents**
We support two vector databases to store and search embeddings -- FAISS and Elasticsearch. Here's a code snippet showing how to use FAISS to store embeddings and search for text similar to a query. Both database backends are featured in this [example notebook](https://github.com/hwchase17/langchain/blob/master/notebooks/examples/embeddings.ipynb).
We support two vector databases to store and search embeddings -- FAISS and Elasticsearch. Here's a code snippet showing how to use FAISS to store embeddings and search for text similar to a query. Both database backends are featured in this [example notebook](https://github.com/hwchase17/langchain/blob/master/docs/examples/integrations/embeddings.ipynb).
```
```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.faiss import FAISS
from langchain.text_splitter import CharacterTextSplitter
@@ -130,11 +104,6 @@ query = "What did the president say about Ketanji Brown Jackson"
docs = docsearch.similarity_search(query)
```
## 📖 Documentation
The above examples are probably the most user-friendly documentation that exists,
but full API docs can be found [here](https://langchain.readthedocs.io/en/latest/?).
## 🤖 Developer Guide
To begin developing on this project, first clone to the repo locally.

View File

@@ -37,10 +37,14 @@ extensions = [
"sphinx.ext.autodoc.typehints",
"sphinx.ext.autosummary",
"sphinx.ext.napoleon",
"sphinx.ext.viewcode",
"sphinxcontrib.autodoc_pydantic",
"myst_parser",
"nbsphinx",
"sphinx_panels",
]
autodoc_pydantic_model_show_json = False
autodoc_pydantic_field_list_validators = False
autodoc_pydantic_config_members = False

docs/core_concepts.md Normal file
View File

@@ -0,0 +1,25 @@
# Core Concepts
This section goes over the core concepts of LangChain.
Understanding these will go a long way toward helping you navigate the codebase and construct chains.
## Prompts
Prompts generically have a `format` method that takes in variables and returns a formatted string.
The simplest implementation is a template string with some variables in it, which is then formatted with the incoming variables.
More complex implementations dynamically construct the template string, e.g. from few-shot examples.
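A minimal sketch of the `format` method described above, using the `Prompt` class from the examples elsewhere in this changeset; the template text is illustrative.

```python
from langchain import Prompt

prompt = Prompt(
    input_variables=["question"],
    template="Question: {question}\n\nAnswer:",
)

# `format` fills the template with the incoming variables.
print(prompt.format(question="What is LangChain?"))
```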
## LLMs
Wrappers around Large Language Models (in particular, the `generate` ability of large language models) are a core part of LangChain's functionality.
These wrappers are callable classes: they take in an input string and return the generated output string.
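As a sketch of that callable interface (assuming an OpenAI API key is configured as described in the README):

```python
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)

# The wrapper is callable: input string in, generated string out.
print(llm("Suggest a good name for a bookstore."))
```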
## Embeddings
These classes are very similar to the LLM classes in that they are wrappers around models,
but rather than returning a string they return an embedding (a list of floats). These are particularly useful when
implementing semantic search functionality. They expose separate methods for embedding queries versus embedding documents.
## Vectorstores
These are datastores that store documents. They expose a method for passing in a string and finding similar documents.
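A minimal sketch tying the two ideas above together, using the `OpenAIEmbeddings` and FAISS classes that appear in the notebooks in this changeset. The `embed_query`/`embed_documents` method names are assumptions about the wrapper interface; the texts are illustrative.

```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores.faiss import FAISS

texts = [
    "LangChain combines LLMs with other sources of computation and knowledge.",
    "FAISS stores embeddings and supports similarity search over them.",
]

embeddings = OpenAIEmbeddings()

# Embedding wrappers return lists of floats; queries and documents use separate methods.
query_vector = embeddings.embed_query("What does LangChain do?")
doc_vectors = embeddings.embed_documents(texts)

# A vectorstore stores the documents and finds the ones most similar to a string.
docsearch = FAISS.from_texts(texts, embeddings)
docs = docsearch.similarity_search("What does LangChain do?")
print(docs[0].page_content)
```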
## Chains
These are pipelines that combine multiple of the above ideas.
They vary greatly in complexity and are a combination of generic, highly configurable pipelines and more narrow (but usually more complex) pipelines.
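A minimal sketch of the simplest kind of chain, combining the prompt and LLM wrappers above the way the `LLMChain` examples elsewhere in this changeset do (again assuming an OpenAI API key is configured):

```python
from langchain import LLMChain, OpenAI, Prompt

prompt = Prompt(
    input_variables=["question"],
    template="Question: {question}\n\nAnswer:",
)
llm_chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)

# Single-input, single-output chains expose a convenience `run` method.
print(llm_chain.run("What is the capital of France?"))
```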

docs/examples/demos.rst Normal file
View File

@@ -0,0 +1,10 @@
Demos
=====
The examples here are all end-to-end chains for specific applications.
.. toctree::
:maxdepth: 1
:glob:
demos/*

View File

@@ -0,0 +1,243 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "dd2aa1bb",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains.pipeline import Pipeline\n",
"from langchain.chains.custom import SimpleCustomChain\n",
"from langchain.llms import OpenAI\n",
"from langchain.chains import LLMChain\n",
"from langchain import Prompt"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "05390d95",
"metadata": {},
"outputs": [],
"source": [
"prompt_template = \"\"\"You are given the below API Documentation:\n",
"\n",
"API Documentation\n",
"The API endpoint /v1/forecast accepts a geographical coordinate, a list of weather variables and responds with a JSON hourly weather forecast for 7 days. Time always starts at 0:00 today and contains 168 hours. All URL parameters are listed below:\n",
"\n",
"Parameter\tFormat\tRequired\tDefault\tDescription\n",
"latitude, longitude\tFloating point\tYes\t\tGeographical WGS84 coordinate of the location\n",
"hourly\tString array\tNo\t\tA list of weather variables which should be returned. Values can be comma separated, or multiple &hourly= parameter in the URL can be used.\n",
"daily\tString array\tNo\t\tA list of daily weather variable aggregations which should be returned. Values can be comma separated, or multiple &daily= parameter in the URL can be used. If daily weather variables are specified, parameter timezone is required.\n",
"current_weather\tBool\tNo\tfalse\tInclude current weather conditions in the JSON output.\n",
"temperature_unit\tString\tNo\tcelsius\tIf fahrenheit is set, all temperature values are converted to Fahrenheit.\n",
"windspeed_unit\tString\tNo\tkmh\tOther wind speed speed units: ms, mph and kn\n",
"precipitation_unit\tString\tNo\tmm\tOther precipitation amount units: inch\n",
"timeformat\tString\tNo\tiso8601\tIf format unixtime is selected, all time values are returned in UNIX epoch time in seconds. Please note that all timestamp are in GMT+0! For daily values with unix timestamps, please apply utc_offset_seconds again to get the correct date.\n",
"timezone\tString\tNo\tGMT\tIf timezone is set, all timestamps are returned as local-time and data is returned starting at 00:00 local-time. Any time zone name from the time zone database is supported. If auto is set as a time zone, the coordinates will be automatically resolved to the local time zone.\n",
"past_days\tInteger (0-2)\tNo\t0\tIf past_days is set, yesterday or the day before yesterday data are also returned.\n",
"start_date\n",
"end_date\tString (yyyy-mm-dd)\tNo\t\tThe time interval to get weather data. A day must be specified as an ISO8601 date (e.g. 2022-06-30).\n",
"models\tString array\tNo\tauto\tManually select one or more weather models. Per default, the best suitable weather models will be combined.\n",
"\n",
"Hourly Parameter Definition\n",
"The parameter &hourly= accepts the following values. Most weather variables are given as an instantaneous value for the indicated hour. Some variables like precipitation are calculated from the preceding hour as an average or sum.\n",
"\n",
"Variable\tValid time\tUnit\tDescription\n",
"temperature_2m\tInstant\t°C (°F)\tAir temperature at 2 meters above ground\n",
"relativehumidity_2m\tInstant\t%\tRelative humidity at 2 meters above ground\n",
"dewpoint_2m\tInstant\t°C (°F)\tDew point temperature at 2 meters above ground\n",
"apparent_temperature\tInstant\t°C (°F)\tApparent temperature is the perceived feels-like temperature combining wind chill factor, relative humidity and solar radiation\n",
"pressure_msl\n",
"surface_pressure\tInstant\thPa\tAtmospheric air pressure reduced to mean sea level (msl) or pressure at surface. Typically pressure on mean sea level is used in meteorology. Surface pressure gets lower with increasing elevation.\n",
"cloudcover\tInstant\t%\tTotal cloud cover as an area fraction\n",
"cloudcover_low\tInstant\t%\tLow level clouds and fog up to 3 km altitude\n",
"cloudcover_mid\tInstant\t%\tMid level clouds from 3 to 8 km altitude\n",
"cloudcover_high\tInstant\t%\tHigh level clouds from 8 km altitude\n",
"windspeed_10m\n",
"windspeed_80m\n",
"windspeed_120m\n",
"windspeed_180m\tInstant\tkm/h (mph, m/s, knots)\tWind speed at 10, 80, 120 or 180 meters above ground. Wind speed on 10 meters is the standard level.\n",
"winddirection_10m\n",
"winddirection_80m\n",
"winddirection_120m\n",
"winddirection_180m\tInstant\t°\tWind direction at 10, 80, 120 or 180 meters above ground\n",
"windgusts_10m\tPreceding hour max\tkm/h (mph, m/s, knots)\tGusts at 10 meters above ground as a maximum of the preceding hour\n",
"shortwave_radiation\tPreceding hour mean\tW/m²\tShortwave solar radiation as average of the preceding hour. This is equal to the total global horizontal irradiation\n",
"direct_radiation\n",
"direct_normal_irradiance\tPreceding hour mean\tW/m²\tDirect solar radiation as average of the preceding hour on the horizontal plane and the normal plane (perpendicular to the sun)\n",
"diffuse_radiation\tPreceding hour mean\tW/m²\tDiffuse solar radiation as average of the preceding hour\n",
"vapor_pressure_deficit\tInstant\tkPa\tVapor Pressure Deificit (VPD) in kilopascal (kPa). For high VPD (>1.6), water transpiration of plants increases. For low VPD (<0.4), transpiration decreases\n",
"evapotranspiration\tPreceding hour sum\tmm (inch)\tEvapotranspration from land surface and plants that weather models assumes for this location. Available soil water is considered. 1 mm evapotranspiration per hour equals 1 liter of water per spare meter.\n",
"et0_fao_evapotranspiration\tPreceding hour sum\tmm (inch)\tET₀ Reference Evapotranspiration of a well watered grass field. Based on FAO-56 Penman-Monteith equations ET₀ is calculated from temperature, wind speed, humidity and solar radiation. Unlimited soil water is assumed. ET₀ is commonly used to estimate the required irrigation for plants.\n",
"precipitation\tPreceding hour sum\tmm (inch)\tTotal precipitation (rain, showers, snow) sum of the preceding hour\n",
"snowfall\tPreceding hour sum\tcm (inch)\tSnowfall amount of the preceding hour in centimeters. For the water equivalent in millimeter, divide by 7. E.g. 7 cm snow = 10 mm precipitation water equivalent\n",
"rain\tPreceding hour sum\tmm (inch)\tRain from large scale weather systems of the preceding hour in millimeter\n",
"showers\tPreceding hour sum\tmm (inch)\tShowers from convective precipitation in millimeters from the preceding hour\n",
"weathercode\tInstant\tWMO code\tWeather condition as a numeric code. Follow WMO weather interpretation codes. See table below for details.\n",
"snow_depth\tInstant\tmeters\tSnow depth on the ground\n",
"freezinglevel_height\tInstant\tmeters\tAltitude above sea level of the 0°C level\n",
"visibility\tInstant\tmeters\tViewing distance in meters. Influenced by low clouds, humidity and aerosols. Maximum visibility is approximately 24 km.\n",
"soil_temperature_0cm\n",
"soil_temperature_6cm\n",
"soil_temperature_18cm\n",
"soil_temperature_54cm\tInstant\t°C (°F)\tTemperature in the soil at 0, 6, 18 and 54 cm depths. 0 cm is the surface temperature on land or water surface temperature on water.\n",
"soil_moisture_0_1cm\n",
"soil_moisture_1_3cm\n",
"soil_moisture_3_9cm\n",
"soil_moisture_9_27cm\n",
"soil_moisture_27_81cm\tInstant\tm³/m³\tAverage soil water content as volumetric mixing ratio at 0-1, 1-3, 3-9, 9-27 and 27-81 cm depths.\n",
"\n",
"Using that documentation, write a query that you could send to the open meteo api to answer the following question.\n",
"\n",
"Question: {question}\n",
"GET Request: /v1/forecast?\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "b95e1af1",
"metadata": {},
"outputs": [],
"source": [
"prompt = Prompt(input_variables=[\"question\"], template=prompt_template)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "f0f3db6e",
"metadata": {},
"outputs": [],
"source": [
"question_chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt, output_key=\"meteo_query\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "939c1039",
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"def make_open_meteo_request(req):\n",
" return str(requests.get(f\"https://api.open-meteo.com/v1/forecast?{req}\").json())"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "88d73fd3",
"metadata": {},
"outputs": [],
"source": [
"custom_chain = SimpleCustomChain(func=make_open_meteo_request, input_key=\"meteo_query\", output_key=\"meteo_answer\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "c63dc349",
"metadata": {},
"outputs": [],
"source": [
"answer_prompt_template = \"\"\"You are given the below weather information:\n",
"\n",
"{meteo_answer}\n",
"\n",
"Now answer the following question:\n",
"\n",
"Question: {question}\n",
"Answer:\"\"\"\n",
"answer_prompt = Prompt(input_variables=[\"question\", \"meteo_answer\"], template=answer_prompt_template)\n",
"\n",
"answer_chain = LLMChain(llm=OpenAI(temperature=0), prompt=answer_prompt)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "1842b72d",
"metadata": {},
"outputs": [],
"source": [
"pipeline = Pipeline(chains=[question_chain, custom_chain, answer_chain], input_variables=[\"question\"], verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "e381155b",
"metadata": {},
"outputs": [],
"source": [
"question = \"is it snowing in boston?\""
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "d9110ed5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[1mChain 0\u001b[0m:\n",
"{'meteo_query': 'latitude=42.3601&longitude=-71.0589&hourly=snowfall'}\n",
"\n",
"\u001b[1mChain 1\u001b[0m:\n",
"{'meteo_answer': \"{'latitude': 42.36515, 'longitude': -71.0618, 'generationtime_ms': 0.4010200500488281, 'utc_offset_seconds': 0, 'timezone': 'GMT', 'timezone_abbreviation': 'GMT', 'elevation': 9.0, 'hourly_units': {'time': 'iso8601', 'snowfall': 'cm'}, 'hourly': {'time': ['2022-11-19T00:00', '2022-11-19T01:00', '2022-11-19T02:00', '2022-11-19T03:00', '2022-11-19T04:00', '2022-11-19T05:00', '2022-11-19T06:00', '2022-11-19T07:00', '2022-11-19T08:00', '2022-11-19T09:00', '2022-11-19T10:00', '2022-11-19T11:00', '2022-11-19T12:00', '2022-11-19T13:00', '2022-11-19T14:00', '2022-11-19T15:00', '2022-11-19T16:00', '2022-11-19T17:00', '2022-11-19T18:00', '2022-11-19T19:00', '2022-11-19T20:00', '2022-11-19T21:00', '2022-11-19T22:00', '2022-11-19T23:00', '2022-11-20T00:00', '2022-11-20T01:00', '2022-11-20T02:00', '2022-11-20T03:00', '2022-11-20T04:00', '2022-11-20T05:00', '2022-11-20T06:00', '2022-11-20T07:00', '2022-11-20T08:00', '2022-11-20T09:00', '2022-11-20T10:00', '2022-11-20T11:00', '2022-11-20T12:00', '2022-11-20T13:00', '2022-11-20T14:00', '2022-11-20T15:00', '2022-11-20T16:00', '2022-11-20T17:00', '2022-11-20T18:00', '2022-11-20T19:00', '2022-11-20T20:00', '2022-11-20T21:00', '2022-11-20T22:00', '2022-11-20T23:00', '2022-11-21T00:00', '2022-11-21T01:00', '2022-11-21T02:00', '2022-11-21T03:00', '2022-11-21T04:00', '2022-11-21T05:00', '2022-11-21T06:00', '2022-11-21T07:00', '2022-11-21T08:00', '2022-11-21T09:00', '2022-11-21T10:00', '2022-11-21T11:00', '2022-11-21T12:00', '2022-11-21T13:00', '2022-11-21T14:00', '2022-11-21T15:00', '2022-11-21T16:00', '2022-11-21T17:00', '2022-11-21T18:00', '2022-11-21T19:00', '2022-11-21T20:00', '2022-11-21T21:00', '2022-11-21T22:00', '2022-11-21T23:00', '2022-11-22T00:00', '2022-11-22T01:00', '2022-11-22T02:00', '2022-11-22T03:00', '2022-11-22T04:00', '2022-11-22T05:00', '2022-11-22T06:00', '2022-11-22T07:00', '2022-11-22T08:00', '2022-11-22T09:00', '2022-11-22T10:00', '2022-11-22T11:00', '2022-11-22T12:00', '2022-11-22T13:00', '2022-11-22T14:00', '2022-11-22T15:00', '2022-11-22T16:00', '2022-11-22T17:00', '2022-11-22T18:00', '2022-11-22T19:00', '2022-11-22T20:00', '2022-11-22T21:00', '2022-11-22T22:00', '2022-11-22T23:00', '2022-11-23T00:00', '2022-11-23T01:00', '2022-11-23T02:00', '2022-11-23T03:00', '2022-11-23T04:00', '2022-11-23T05:00', '2022-11-23T06:00', '2022-11-23T07:00', '2022-11-23T08:00', '2022-11-23T09:00', '2022-11-23T10:00', '2022-11-23T11:00', '2022-11-23T12:00', '2022-11-23T13:00', '2022-11-23T14:00', '2022-11-23T15:00', '2022-11-23T16:00', '2022-11-23T17:00', '2022-11-23T18:00', '2022-11-23T19:00', '2022-11-23T20:00', '2022-11-23T21:00', '2022-11-23T22:00', '2022-11-23T23:00', '2022-11-24T00:00', '2022-11-24T01:00', '2022-11-24T02:00', '2022-11-24T03:00', '2022-11-24T04:00', '2022-11-24T05:00', '2022-11-24T06:00', '2022-11-24T07:00', '2022-11-24T08:00', '2022-11-24T09:00', '2022-11-24T10:00', '2022-11-24T11:00', '2022-11-24T12:00', '2022-11-24T13:00', '2022-11-24T14:00', '2022-11-24T15:00', '2022-11-24T16:00', '2022-11-24T17:00', '2022-11-24T18:00', '2022-11-24T19:00', '2022-11-24T20:00', '2022-11-24T21:00', '2022-11-24T22:00', '2022-11-24T23:00', '2022-11-25T00:00', '2022-11-25T01:00', '2022-11-25T02:00', '2022-11-25T03:00', '2022-11-25T04:00', '2022-11-25T05:00', '2022-11-25T06:00', '2022-11-25T07:00', '2022-11-25T08:00', '2022-11-25T09:00', '2022-11-25T10:00', '2022-11-25T11:00', '2022-11-25T12:00', '2022-11-25T13:00', '2022-11-25T14:00', '2022-11-25T15:00', '2022-11-25T16:00', '2022-11-25T17:00', '2022-11-25T18:00', 
'2022-11-25T19:00', '2022-11-25T20:00', '2022-11-25T21:00', '2022-11-25T22:00', '2022-11-25T23:00'], 'snowfall': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.07, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]}}\"}\n",
"\n",
"\u001b[1mChain 2\u001b[0m:\n",
"{'text': ' No'}\n",
"\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"' No'"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pipeline.run(question)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,5 +1,15 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e71e720f",
"metadata": {},
"source": [
"# LLM Math\n",
"\n",
"This notebook showcases using LLMs and Python REPLs to do complex word math problems."
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -10,6 +20,9 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"How many of the integers between 0 and 99 inclusive are divisible by 8?\u001b[102m\n",
"\n",
"```python\n",
@@ -21,7 +34,8 @@
"```\n",
"\u001b[0m\n",
"Answer: \u001b[103m13\n",
"\u001b[0m"
"\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{

View File

@@ -1,5 +1,15 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "d9a0131f",
"metadata": {},
"source": [
"# Map Reduce\n",
"\n",
"This notebok showcases an example of map-reduce chains: recursive summarization."
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -29,7 +39,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 3,
"id": "99bbe19b",
"metadata": {},
"outputs": [
@@ -39,13 +49,13 @@
"\"\\n\\nThe President discusses the recent aggression by Russia, and the response by the United States and its allies. He announces new sanctions against Russia, and says that the free world is united in holding Putin accountable. The President also discusses the American Rescue Plan, the Bipartisan Infrastructure Law, and the Bipartisan Innovation Act. Finally, the President addresses the need for women's rights and equality for LGBTQ+ Americans.\""
]
},
"execution_count": 2,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with open('state_of_the_union.txt') as f:\n",
"with open('../state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"mp_chain.run(state_of_the_union)"
]

View File

@@ -1,5 +1,15 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "f1390152",
"metadata": {},
"source": [
"# MRKL\n",
"\n",
"This notebook showcases using the MRKL chain to route between tasks"
]
},
{
"cell_type": "markdown",
"id": "39ea3638",
@@ -22,7 +32,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 8,
"id": "07e96d99",
"metadata": {},
"outputs": [],
@@ -30,12 +40,12 @@
"llm = OpenAI(temperature=0)\n",
"search = SerpAPIChain()\n",
"llm_math_chain = LLMMathChain(llm=llm, verbose=True)\n",
"db = SQLDatabase.from_uri(\"sqlite:///../notebooks/Chinook.db\")\n",
"db = SQLDatabase.from_uri(\"sqlite:///../../../notebooks/Chinook.db\")\n",
"db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)\n",
"chains = [\n",
" ChainConfig(\n",
" action_name = \"Search\",\n",
" action=search.search,\n",
" action=search.run,\n",
" action_description=\"useful for when you need to answer questions about current events\"\n",
" ),\n",
" ChainConfig(\n",
@@ -46,7 +56,7 @@
" \n",
" ChainConfig(\n",
" action_name=\"FooBar DB\",\n",
" action=db_chain.query,\n",
" action=db_chain.run,\n",
" action_description=\"useful for when you need to answer questions about FooBar. Input should be in the form of a question\"\n",
" )\n",
"]"
@@ -54,7 +64,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 9,
"id": "a069c4b6",
"metadata": {},
"outputs": [],
@@ -64,7 +74,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 6,
"id": "e603cd7d",
"metadata": {},
"outputs": [
@@ -112,7 +122,7 @@
"'2.1520202182226886'"
]
},
"execution_count": 4,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
@@ -123,7 +133,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 10,
"id": "a5c07010",
"metadata": {},
"outputs": [
@@ -159,22 +169,22 @@
"What albums by Alanis Morissette are in the FooBar database?\n",
"SQLQuery:\u001b[102m SELECT Title FROM Album WHERE ArtistId = (SELECT ArtistId FROM Artist WHERE Name = 'Alanis Morissette')\u001b[0m\n",
"SQLResult: \u001b[103m[('Jagged Little Pill',)]\u001b[0m\n",
"Answer:\u001b[102m The album \"Jagged Little Pill\" by Alanis Morissette is in the FooBar database.\u001b[0m\n",
"Answer:\u001b[102m Jagged Little Pill\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"Observation: \u001b[101m The album \"Jagged Little Pill\" by Alanis Morissette is in the FooBar database.\u001b[0m\n",
"Observation: \u001b[101m Jagged Little Pill\u001b[0m\n",
"Thought:\u001b[102m I now know the final answer\n",
"Final Answer: The album \"Jagged Little Pill\" by Alanis Morissette is the only album by Alanis Morissette in the FooBar database.\u001b[0m\n",
"Final Answer: The album is by Alanis Morissette and the albums in the FooBar database by her are Jagged Little Pill\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The album \"Jagged Little Pill\" by Alanis Morissette is the only album by Alanis Morissette in the FooBar database.'"
"'The album is by Alanis Morissette and the albums in the FooBar database by her are Jagged Little Pill'"
]
},
"execution_count": 5,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}

View File

@@ -46,7 +46,7 @@ if __name__ == "__main__":
try:
while True:
browser_content = "\n".join(_crawler.crawl())
llm_command = nat_bot_chain.run(_crawler.page.url, browser_content)
llm_command = nat_bot_chain.execute(_crawler.page.url, browser_content)
if not quiet:
print("URL: " + _crawler.page.url)
print("Objective: " + objective)

View File

@@ -0,0 +1,98 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "82140df0",
"metadata": {},
"source": [
"# ReAct\n",
"\n",
"This notebook showcases the implementation of the ReAct chain logic."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "4e272b47",
"metadata": {},
"outputs": [],
"source": [
"from langchain import OpenAI, ReActChain, Wikipedia\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"react = ReActChain(llm=llm, docstore=Wikipedia(), verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "8078c8f1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"Author David Chanoff has collaborated with a U.S. Navy admiral who served as the ambassador to the United Kingdom under which President?\n",
"Thought 1:\u001b[102m I need to search David Chanoff and find the U.S. Navy admiral he\n",
"collaborated with.\n",
"Action 1: Search[David Chanoff]\u001b[0m\n",
"Observation 1: \u001b[103mDavid Chanoff is a noted author of non-fiction work. His work has typically involved collaborations with the principal protagonist of the work concerned. His collaborators have included; Augustus A. White, Joycelyn Elders, Đoàn Văn Toại, William J. Crowe, Ariel Sharon, Kenneth Good and Felix Zandman. He has also written about a wide range of subjects including literary history, education and foreign for The Washington Post, The New Republic and The New York Times Magazine. He has published more than twelve books.\u001b[0m\n",
"Thought 2:\u001b[102m The U.S. Navy admiral David Chanoff collaborated with is William J. Crowe.\n",
"Action 2: Search[William J. Crowe]\u001b[0m\n",
"Observation 2: \u001b[103mWilliam James Crowe Jr. (January 2, 1925 October 18, 2007) was a United States Navy admiral and diplomat who served as the 11th chairman of the Joint Chiefs of Staff under Presidents Ronald Reagan and George H. W. Bush, and as the ambassador to the United Kingdom and Chair of the Intelligence Oversight Board under President Bill Clinton.\u001b[0m\n",
"Thought 3:\u001b[102m William J. Crowe served as the ambassador to the United Kingdom under President Bill Clinton. So the answer is Bill Clinton.\n",
"Action 3: Finish[Bill Clinton]\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'Bill Clinton'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question = \"Author David Chanoff has collaborated with a U.S. Navy admiral who served as the ambassador to the United Kingdom under which President?\"\n",
"react.run(question)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0a6bd3b4",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,5 +1,15 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "0c3f1df8",
"metadata": {},
"source": [
"# Self Ask With Search\n",
"\n",
"This notebook showcases the Self Ask With Search chain."
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -10,13 +20,17 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[49m\n",
"Are follow up questions needed here:\u001b[0m\u001b[102m Yes.\n",
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\u001b[49m\n",
"Intermediate answer: \u001b[0m\u001b[103mCarlos Alcaraz.\u001b[0m\u001b[102m\n",
"Follow up: Where is Carlos Alcaraz from?\u001b[0m\u001b[49m\n",
"Intermediate answer: \u001b[0m\u001b[103mEl Palmar, Murcia, Spain.\u001b[0m\u001b[102m\n",
"So the final answer is: El Palmar, Murcia, Spain\u001b[0m"
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"What is the hometown of the reigning men's U.S. Open champion?\n",
"Are follow up questions needed here:\u001b[102m Yes.\n",
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\n",
"Intermediate answer: \u001b[103mCarlos Alcaraz won the 2022 Men's single title while Poland's Iga Swiatek won the Women's single title defeating Tunisian's Ons Jabeur..\u001b[0m\u001b[102m\n",
"Follow up: Where is Carlos Alcaraz from?\u001b[0m\n",
"Intermediate answer: \u001b[103mEl Palmar, Murcia, Spain.\u001b[0m\u001b[102m\n",
"So the final answer is: El Palmar, Murcia, Spain\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
@@ -44,7 +58,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "6195fc82",
"id": "683d69e7",
"metadata": {},
"outputs": [],
"source": []

View File

@@ -1,8 +1,18 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "d8a5c5d4",
"metadata": {},
"source": [
"# Simple Example\n",
"\n",
"This notebook showcases a simple chain."
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 2,
"id": "51a54c4d",
"metadata": {},
"outputs": [
@@ -12,7 +22,7 @@
"' The year Justin Beiber was born was 1994. In 1994, the Dallas Cowboys won the Super Bowl.'"
]
},
"execution_count": 1,
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
@@ -28,7 +38,7 @@
"\n",
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"llm_chain.predict(question=question)"
"llm_chain.run(question)"
]
},
{

View File

@@ -1,9 +1,27 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "0ed6aab1",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# SQLite example\n",
"\n",
"This example showcases hooking up an LLM to answer questions over a database."
]
},
{
"cell_type": "markdown",
"id": "b2f66479",
"metadata": {},
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"This uses the example Chinook database.\n",
"To set it up follow the instructions on https://database.guide/2-sample-databases-sqlite/, placing the `.db` file in a notebooks folder at the root of this repository."
@@ -13,7 +31,11 @@
"cell_type": "code",
"execution_count": 1,
"id": "d0e27d88",
"metadata": {},
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"from langchain import OpenAI, SQLDatabase, SQLDatabaseChain"
@@ -23,10 +45,14 @@
"cell_type": "code",
"execution_count": 2,
"id": "72ede462",
"metadata": {},
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"db = SQLDatabase.from_uri(\"sqlite:///../notebooks/Chinook.db\")\n",
"db = SQLDatabase.from_uri(\"sqlite:///../../../notebooks/Chinook.db\")\n",
"llm = OpenAI(temperature=0)\n",
"db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)"
]
@@ -35,21 +61,30 @@
"cell_type": "code",
"execution_count": 3,
"id": "15ff81df",
"metadata": {},
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[102m SELECT COUNT(*) FROM Employee\u001b[0m\u001b[49m\n",
"SQLResult: \u001b[0m\u001b[103m[(8,)]\u001b[0m\u001b[49m\n",
"Answer:\u001b[0m\u001b[102m There are 8 employees.\u001b[0m"
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"How many employees are there?\n",
"SQLQuery:\u001b[102m SELECT COUNT(*) FROM Employee\u001b[0m\n",
"SQLResult: \u001b[103m[(8,)]\u001b[0m\n",
"Answer:\u001b[102m 8\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"' There are 8 employees.'"
"' 8'"
]
},
"execution_count": 3,
@@ -58,13 +93,13 @@
}
],
"source": [
"db_chain.query(\"How many employees are there?\")"
"db_chain.run(\"How many employees are there?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "146fa162",
"id": "61d91b85",
"metadata": {},
"outputs": [],
"source": []

View File

@@ -0,0 +1,104 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "07c1e3b9",
"metadata": {},
"source": [
"# Vector DB Question/Answering\n",
"\n",
"This example showcases question answering over a vector database."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "82525493",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.vectorstores.faiss import FAISS\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain import OpenAI, VectorDBQA"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "5c7049db",
"metadata": {},
"outputs": [],
"source": [
"with open('../state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_text(state_of_the_union)\n",
"\n",
"embeddings = OpenAIEmbeddings()\n",
"docsearch = FAISS.from_texts(texts, embeddings)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "3018f865",
"metadata": {},
"outputs": [],
"source": [
"qa = VectorDBQA(llm=OpenAI(), vectorstore=docsearch)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "032a47f8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' The President said that Ketanji Brown Jackson is a consensus builder and has received a broad range of support since she was nominated.'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"qa.run(query)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0f20b92",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,10 @@
Integrations
============
The examples here all highlight a specific type of integration.
.. toctree::
:maxdepth: 1
:glob:
integrations/*

View File

@@ -1,10 +1,28 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "7ef4d402-6662-4a26-b612-35b542066487",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# Embeddings & VectorStores\n",
"\n",
"This notebook show cases how to use embeddings to create a VectorStore"
]
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 1,
"id": "965eecee",
"metadata": {},
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@@ -15,12 +33,16 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "68481687",
"metadata": {},
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"with open('state_of_the_union.txt') as f:\n",
"with open('../state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_text(state_of_the_union)\n",
@@ -30,9 +52,13 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "015f4ff5",
"metadata": {},
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"docsearch = FAISS.from_texts(texts, embeddings)\n",
@@ -43,9 +69,13 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 4,
"id": "67baf32e",
"metadata": {},
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"name": "stdout",
@@ -67,11 +97,23 @@
"print(docs[0].page_content)"
]
},
{
"cell_type": "markdown",
"id": "eea6e627",
"metadata": {},
"source": [
"## Requires having ElasticSearch setup"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "4906b8a3",
"metadata": {},
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"docsearch = ElasticVectorSearch.from_texts(texts, embeddings, elasticsearch_url=\"http://localhost:9200\")\n",
@@ -84,7 +126,11 @@
"cell_type": "code",
"execution_count": 7,
"id": "95f9eee9",
"metadata": {},
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"name": "stdout",
@@ -105,14 +151,6 @@
"source": [
"print(docs[0].page_content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "70a253c4",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -131,7 +169,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.7.6"
}
},
"nbformat": 4,

View File

@@ -1,5 +1,15 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "959300d4",
"metadata": {},
"source": [
"# HuggingFace Hub\n",
"\n",
"This example showcases how to connect to the HuggingFace Hub."
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -25,7 +35,7 @@
"\n",
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"print(llm_chain.predict(question=question))"
"print(llm_chain.run(question))"
]
},
{
@@ -53,7 +63,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.7"
"version": "3.7.6"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,180 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "b118c9dc",
"metadata": {},
"source": [
"# HuggingFace Tokenizers\n",
"\n",
"This notebook show cases how to use HuggingFace tokenizers to split text."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e82c4685",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import CharacterTextSplitter"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a8ce51d5",
"metadata": {},
"outputs": [],
"source": [
"from transformers import GPT2TokenizerFast\n",
"\n",
"tokenizer = GPT2TokenizerFast.from_pretrained(\"gpt2\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "ca5e72c0",
"metadata": {},
"outputs": [],
"source": [
"with open('../state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"text_splitter = CharacterTextSplitter.from_huggingface_tokenizer(tokenizer, chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_text(state_of_the_union)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "37cdfbeb",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans. \n",
"\n",
"Last year COVID-19 kept us apart. This year we are finally together again. \n",
"\n",
"Tonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \n",
"\n",
"With a duty to one another to the American people to the Constitution. \n",
"\n",
"And with an unwavering resolve that freedom will always triumph over tyranny. \n",
"\n",
"Six days ago, Russias Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \n",
"\n",
"He thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \n",
"\n",
"He met the Ukrainian people. \n",
"\n",
"From President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world. \n",
"\n",
"Groups of citizens blocking tanks with their bodies. Everyone from students to retirees teachers turned soldiers defending their homeland. \n",
"\n",
"In this struggle as President Zelenskyy said in his speech to the European Parliament “Light will win over darkness.” The Ukrainian Ambassador to the United States is here tonight. \n",
"\n",
"Let each of us here tonight in this Chamber send an unmistakable signal to Ukraine and to the world. \n",
"\n",
"Please rise if you are able and show that, Yes, we the United States of America stand with the Ukrainian people. \n",
"\n",
"Throughout our history weve learned this lesson when dictators do not pay a price for their aggression they cause more chaos. \n",
"\n",
"They keep moving. \n",
"\n",
"And the costs and the threats to America and the world keep rising. \n",
"\n",
"Thats why the NATO Alliance was created to secure peace and stability in Europe after World War 2. \n",
"\n",
"The United States is a member along with 29 other nations. \n",
"\n",
"It matters. American diplomacy matters. American resolve matters. \n",
"\n",
"Putins latest attack on Ukraine was premeditated and unprovoked. \n",
"\n",
"He rejected repeated efforts at diplomacy. \n",
"\n",
"He thought the West and NATO wouldnt respond. And he thought he could divide us at home. Putin was wrong. We were ready. Here is what we did. \n",
"\n",
"We prepared extensively and carefully. \n",
"\n",
"We spent months building a coalition of other freedom-loving nations from Europe and the Americas to Asia and Africa to confront Putin. \n",
"\n",
"I spent countless hours unifying our European allies. We shared with the world in advance what we knew Putin was planning and precisely how he would try to falsely justify his aggression. \n",
"\n",
"We countered Russias lies with truth. \n",
"\n",
"And now that he has acted the free world is holding him accountable. \n",
"\n",
"Along with twenty-seven members of the European Union including France, Germany, Italy, as well as countries like the United Kingdom, Canada, Japan, Korea, Australia, New Zealand, and many others, even Switzerland. \n",
"\n",
"We are inflicting pain on Russia and supporting the people of Ukraine. Putin is now isolated from the world more than ever. \n",
"\n",
"Together with our allies we are right now enforcing powerful economic sanctions. \n",
"\n",
"We are cutting off Russias largest banks from the international financial system. \n",
"\n",
"Preventing Russias central bank from defending the Russian Ruble making Putins $630 Billion “war fund” worthless. \n",
"\n",
"We are choking off Russias access to technology that will sap its economic strength and weaken its military for years to come. \n",
"\n",
"Tonight I say to the Russian oligarchs and corrupt leaders who have bilked billions of dollars off this violent regime no more. \n",
"\n",
"The U.S. Department of Justice is assembling a dedicated task force to go after the crimes of Russian oligarchs. \n",
"\n",
"We are joining with our European allies to find and seize your yachts your luxury apartments your private jets. We are coming for your ill-begotten gains. \n",
"\n",
"And tonight I am announcing that we will join our allies in closing off American air space to all Russian flights further isolating Russia and adding an additional squeeze on their economy. The Ruble has lost 30% of its value. \n",
"\n",
"The Russian stock market has lost 40% of its value and trading remains suspended. Russias economy is reeling and Putin alone is to blame. \n",
"\n",
"Together with our allies we are providing support to the Ukrainians in their fight for freedom. Military assistance. Economic assistance. Humanitarian assistance. \n",
"\n",
"We are giving more than $1 Billion in direct assistance to Ukraine. \n",
"\n",
"And we will continue to aid the Ukrainian people as they defend their country and to help ease their suffering. \n",
"\n",
"Let me be clear, our forces are not engaged and will not engage in conflict with Russian forces in Ukraine. \n",
"\n",
"Our forces are not going to Europe to fight in Ukraine, but to defend our NATO Allies in the event that Putin decides to keep moving west. \n"
]
}
],
"source": [
"print(texts[0])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d214aec2",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,215 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "b4462a94",
"metadata": {},
"source": [
"# Manifest\n",
"\n",
"This notebook goes over how to use Manifest and LangChain."
]
},
{
"cell_type": "markdown",
"id": "59fcaebc",
"metadata": {},
"source": [
"For more detailed information on `manifest`, and how to use it with local hugginface models like in this example, see https://github.com/HazyResearch/manifest"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "04a0170a",
"metadata": {},
"outputs": [],
"source": [
"from manifest import Manifest\n",
"from langchain.llms.manifest import ManifestWrapper"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "de250a6a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'model_name': 'bigscience/T0_3B', 'model_path': 'bigscience/T0_3B'}\n"
]
}
],
"source": [
"manifest = Manifest(\n",
" client_name = \"huggingface\",\n",
" client_connection = \"http://127.0.0.1:5000\"\n",
")\n",
"print(manifest.client.get_model_params())"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "67b719d6",
"metadata": {},
"outputs": [],
"source": [
"llm = ManifestWrapper(client=manifest, llm_kwargs={\"temperature\": 0.001, \"max_tokens\": 256})"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "5af505a8",
"metadata": {},
"outputs": [],
"source": [
"# Map reduce example\n",
"from langchain import Prompt\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.chains.mapreduce import MapReduceChain\n",
"\n",
"\n",
"_prompt = \"\"\"Write a concise summary of the following:\n",
"\n",
"\n",
"{text}\n",
"\n",
"\n",
"CONCISE SUMMARY:\"\"\"\n",
"prompt = Prompt(template=_prompt, input_variables=[\"text\"])\n",
"\n",
"text_splitter = CharacterTextSplitter()\n",
"\n",
"mp_chain = MapReduceChain.from_params(llm, prompt, text_splitter)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "485b3ec3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'President Obama delivered his annual State of the Union address on Tuesday night, laying out his priorities for the coming year. Obama said the government will provide free flu vaccines to all Americans, ending the government shutdown and allowing businesses to reopen. The president also said that the government will continue to send vaccines to 112 countries, more than any other nation. \"We have lost so much to COVID-19,\" Trump said. \"Time with one another. And worst of all, so much loss of life.\" He said the CDC is working on a vaccine for kids under 5, and that the government will be ready with plenty of vaccines when they are available. Obama says the new guidelines are a \"great step forward\" and that the virus is no longer a threat. He says the government is launching a \"Test to Treat\" initiative that will allow people to get tested at a pharmacy and get antiviral pills on the spot at no cost. Obama says the new guidelines are a \"great step forward\" and that the virus is no longer a threat. He says the government will continue to send vaccines to 112 countries, more than any other nation. \"We are coming for your'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with open('../state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"mp_chain.run(state_of_the_union)"
]
},
{
"cell_type": "markdown",
"id": "6e9d45a8",
"metadata": {},
"source": [
"## Compare HF Models"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "33407ab3",
"metadata": {},
"outputs": [],
"source": [
"from langchain.model_laboratory import ModelLaboratory\n",
"\n",
"manifest1 = ManifestWrapper(\n",
" client=Manifest(\n",
" client_name=\"huggingface\",\n",
" client_connection=\"http://127.0.0.1:5000\"\n",
" ),\n",
" llm_kwargs={\"temperature\": 0.01}\n",
")\n",
"manifest2 = ManifestWrapper(\n",
" client=Manifest(\n",
" client_name=\"huggingface\",\n",
" client_connection=\"http://127.0.0.1:5001\"\n",
" ),\n",
" llm_kwargs={\"temperature\": 0.01}\n",
")\n",
"manifest3 = ManifestWrapper(\n",
" client=Manifest(\n",
" client_name=\"huggingface\",\n",
" client_connection=\"http://127.0.0.1:5002\"\n",
" ),\n",
" llm_kwargs={\"temperature\": 0.01}\n",
")\n",
"llms = [manifest1, manifest2, manifest3]\n",
"model_lab = ModelLaboratory(llms)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "448935c3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"What color is a flamingo?\n",
"\n",
"\u001b[1mManifestWrapper\u001b[0m\n",
"Params: {'model_name': 'bigscience/T0_3B', 'model_path': 'bigscience/T0_3B', 'temperature': 0.01}\n",
"\u001b[104mpink\u001b[0m\n",
"\n",
"\u001b[1mManifestWrapper\u001b[0m\n",
"Params: {'model_name': 'EleutherAI/gpt-neo-125M', 'model_path': 'EleutherAI/gpt-neo-125M', 'temperature': 0.01}\n",
"\u001b[103mA flamingo is a small, round\u001b[0m\n",
"\n",
"\u001b[1mManifestWrapper\u001b[0m\n",
"Params: {'model_name': 'google/flan-t5-xl', 'model_path': 'google/flan-t5-xl', 'temperature': 0.01}\n",
"\u001b[101mpink\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab.compare(\"What color is a flamingo?\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
},
"vscode": {
"interpreter": {
"hash": "51b9b5b89a4976ad21c8b4273a6c78d700e2954ce7d7452948b7774eb33bbce4"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,254 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "920a3c1a",
"metadata": {},
"source": [
"# Model Laboratory\n",
"\n",
"This example goes over basic functionality of how to use the ModelLaboratory to test out and try different models."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "ab9e95ad",
"metadata": {},
"outputs": [],
"source": [
"from langchain import LLMChain, OpenAI, Cohere, HuggingFaceHub, Prompt\n",
"from langchain.model_laboratory import ModelLaboratory"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "32cb94e6",
"metadata": {},
"outputs": [],
"source": [
"llms = [\n",
" OpenAI(temperature=0), \n",
" Cohere(model=\"command-xlarge-20221108\", max_tokens=20, temperature=0), \n",
" HuggingFaceHub(repo_id=\"google/flan-t5-xl\", model_kwargs={\"temperature\":1})\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "14cde09d",
"metadata": {},
"outputs": [],
"source": [
"model_lab = ModelLaboratory.from_llms(llms)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "f186c741",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"What color is a flamingo?\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\u001b[36;1m\u001b[1;3m\n",
"\n",
"Flamingos are pink.\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 20, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\u001b[33;1m\u001b[1;3m\n",
"\n",
"Pink\u001b[0m\n",
"\n",
"\u001b[1mHuggingFaceHub\u001b[0m\n",
"Params: {'repo_id': 'google/flan-t5-xl', 'temperature': 1}\n",
"\u001b[38;5;200m\u001b[1;3mpink\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab.compare(\"What color is a flamingo?\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "248b652a",
"metadata": {},
"outputs": [],
"source": [
"prompt = Prompt(template=\"What is the capital of {state}?\", input_variables=[\"state\"])\n",
"model_lab_with_prompt = ModelLaboratory.from_llms(llms, prompt=prompt)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "f64377ac",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"New York\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\u001b[36;1m\u001b[1;3m\n",
"\n",
"The capital of New York is Albany.\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 20, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\u001b[33;1m\u001b[1;3m\n",
"\n",
"The capital of New York is Albany.\u001b[0m\n",
"\n",
"\u001b[1mHuggingFaceHub\u001b[0m\n",
"Params: {'repo_id': 'google/flan-t5-xl', 'temperature': 1}\n",
"\u001b[38;5;200m\u001b[1;3mst john s\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab_with_prompt.compare(\"New York\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "54336dbf",
"metadata": {},
"outputs": [],
"source": [
"from langchain import SelfAskWithSearchChain, SerpAPIChain\n",
"\n",
"open_ai_llm = OpenAI(temperature=0)\n",
"search = SerpAPIChain()\n",
"self_ask_with_search_openai = SelfAskWithSearchChain(llm=open_ai_llm, search_chain=search, verbose=True)\n",
"\n",
"cohere_llm = Cohere(temperature=0, model=\"command-xlarge-20221108\")\n",
"search = SerpAPIChain()\n",
"self_ask_with_search_cohere = SelfAskWithSearchChain(llm=cohere_llm, search_chain=search, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "6a50a9f1",
"metadata": {},
"outputs": [],
"source": [
"chains = [self_ask_with_search_openai, self_ask_with_search_cohere]\n",
"names = [str(open_ai_llm), str(cohere_llm)]"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "d3549e99",
"metadata": {},
"outputs": [],
"source": [
"model_lab = ModelLaboratory(chains, names=names)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "362f7f57",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"What is the hometown of the reigning men's U.S. Open champion?\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"What is the hometown of the reigning men's U.S. Open champion?\n",
"Are follow up questions needed here:\u001b[32;1m\u001b[1;3m Yes.\n",
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\n",
"Intermediate answer: \u001b[33;1m\u001b[1;3mCarlos Alcaraz.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"Follow up: Where is Carlos Alcaraz from?\u001b[0m\n",
"Intermediate answer: \u001b[33;1m\u001b[1;3mEl Palmar, Spain.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"So the final answer is: El Palmar, Spain\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[36;1m\u001b[1;3m\n",
"So the final answer is: El Palmar, Spain\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 256, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"What is the hometown of the reigning men's U.S. Open champion?\n",
"Are follow up questions needed here:\u001b[32;1m\u001b[1;3m Yes.\n",
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\n",
"Intermediate answer: \u001b[33;1m\u001b[1;3mCarlos Alcaraz.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"So the final answer is:\n",
"\n",
"Carlos Alcaraz\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[33;1m\u001b[1;3m\n",
"So the final answer is:\n",
"\n",
"Carlos Alcaraz\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab.compare(\"What is the hometown of the reigning men's U.S. Open champion?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "94159131",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

10
docs/examples/prompts.rst Normal file
View File

@@ -0,0 +1,10 @@
Prompts
=======
The examples here all highlight how to work with prompts.
.. toctree::
:maxdepth: 1
:glob:
prompts/*

View File

@@ -1,10 +1,28 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "f5d249ee",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# Generate Examples\n",
"\n",
"This notebook shows how to use LangChain to generate more examples similar to the ones you already have."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "1685fa2f",
"metadata": {},
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"from langchain.chains.react.prompt import EXAMPLES\n",
@@ -14,9 +32,13 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "334ef4f7",
"metadata": {},
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"data": {
@@ -24,7 +46,7 @@
"'Question: What is the elevation range for the area that the eastern sector of the\\nColorado orogeny extends into?\\nThought 1: I need to search Colorado orogeny, find the area that the eastern sector\\nof the Colorado orogeny extends into, then find the elevation range of the\\narea.\\nAction 1: Search[Colorado orogeny]\\nObservation 1: The Colorado orogeny was an episode of mountain building (an orogeny) in\\nColorado and surrounding areas.\\nThought 2: It does not mention the eastern sector. So I need to look up eastern\\nsector.\\nAction 2: Lookup[eastern sector]\\nObservation 2: (Result 1 / 1) The eastern sector extends into the High Plains and is called\\nthe Central Plains orogeny.\\nThought 3: The eastern sector of Colorado orogeny extends into the High Plains. So I\\nneed to search High Plains and find its elevation range.\\nAction 3: Search[High Plains]\\nObservation 3: High Plains refers to one of two distinct land regions\\nThought 4: I need to instead search High Plains (United States).\\nAction 4: Search[High Plains (United States)]\\nObservation 4: The High Plains are a subregion of the Great Plains. From east to west, the\\nHigh Plains rise in elevation from around 1,800 to 7,000 ft (550 to 2,130\\nm).[3]\\nThought 5: High Plains rise in elevation from around 1,800 to 7,000 ft, so the answer\\nis 1,800 to 7,000 ft.\\nAction 5: Finish[1,800 to 7,000 ft]'"
]
},
"execution_count": 3,
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
@@ -36,9 +58,13 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 3,
"id": "a7bd36bc",
"metadata": {},
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"new_example = generate_example(EXAMPLES, OpenAI())"
@@ -46,40 +72,35 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 4,
"id": "e1efb008",
"metadata": {},
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['',\n",
" '',\n",
" 'Question: Is the Mount Everest taller than the Mount Kilimanjaro?',\n",
" 'Question: Which ocean is the worlds smallest?',\n",
" '',\n",
" 'Thought 1: I need to search Mount Everest and Mount Kilimanjaro, find their',\n",
" 'heights, then compare them.',\n",
" 'Thought 1: I need to search for oceans and find which one is the worlds smallest.',\n",
" '',\n",
" 'Action 1: Search[Mount Everest]',\n",
" 'Action 1: Search[oceans]',\n",
" '',\n",
" \"Observation 1: Mount Everest, at 8,848 metres (29,029 ft), is the world's highest mountain\",\n",
" 'and a particularly popular goal for mountaineers.',\n",
" 'Observation 1: There are five oceans: the Pacific, Atlantic, Indian, Southern, and Arctic.',\n",
" '',\n",
" 'Thought 2: Mount Everest is 8,848 metres tall. I need to search Mount Kilimanjaro',\n",
" 'next.',\n",
" 'Thought 2: I need to compare the sizes of the oceans and find which one is the smallest.',\n",
" '',\n",
" 'Action 2: Search[Mount Kilimanjaro]',\n",
" 'Action 2: Compare[Pacific, Atlantic, Indian, Southern, Arctic]',\n",
" '',\n",
" 'Observation 2: Mount Kilimanjaro, with its three volcanic cones, Kibo, Mawenzi, and',\n",
" 'Shira, is a freestanding mountain in Tanzania. It is the highest mountain in',\n",
" 'Africa, and rises approximately 4,900 metres (16,100 ft) from its base to 5,895',\n",
" 'metres (19,341 ft) above sea level.',\n",
" '',\n",
" 'Thought 3: Mount Kilimanjaro is 5,895 metres tall. 8,848 metres (Mount Everest) >',\n",
" '5,895 metres (Mount Kil']"
" 'Observation 2: The Arctic is the smallest ocean.']"
]
},
"execution_count": 7,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
@@ -91,7 +112,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "d8843d7b",
"id": "1ed01ba2",
"metadata": {},
"outputs": [],
"source": []
@@ -113,7 +134,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.7.6"
}
},
"nbformat": 4,

View File

@@ -1,5 +1,15 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "d7467b67",
"metadata": {},
"source": [
"# Optimized Prompts\n",
"\n",
"This example showcases how using the OptimizedPrompt class enables selection of the most relevant examples to include as few-shot examples in the prompt."
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -13,7 +23,7 @@
"from langchain.llms.openai import OpenAI\n",
"from langchain.prompts.optimized import OptimizedPrompt\n",
"from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch\n",
"from langchain.vectorstores.faiss import FAISS"
"from langchain.vectorstores.faiss_search import FAISS"
]
},
{
@@ -101,10 +111,18 @@
"print(prompt.format(k=1, input=\"What is the highest mountain peak in Asia?\"))"
]
},
{
"cell_type": "markdown",
"id": "a5dc3525",
"metadata": {},
"source": [
"## Requires having ElasticSearch setup"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "f7f06820",
"execution_count": null,
"id": "bbd92d08",
"metadata": {},
"outputs": [],
"source": [
@@ -120,48 +138,10 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": null,
"id": "bd91f408",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"Question: What is the elevation range for the area that the eastern sector of the\n",
"Colorado orogeny extends into?\n",
"Thought 1: I need to search Colorado orogeny, find the area that the eastern sector\n",
"of the Colorado orogeny extends into, then find the elevation range of the\n",
"area.\n",
"Action 1: Search[Colorado orogeny]\n",
"Observation 1: The Colorado orogeny was an episode of mountain building (an orogeny) in\n",
"Colorado and surrounding areas.\n",
"Thought 2: It does not mention the eastern sector. So I need to look up eastern\n",
"sector.\n",
"Action 2: Lookup[eastern sector]\n",
"Observation 2: (Result 1 / 1) The eastern sector extends into the High Plains and is called\n",
"the Central Plains orogeny.\n",
"Thought 3: The eastern sector of Colorado orogeny extends into the High Plains. So I\n",
"need to search High Plains and find its elevation range.\n",
"Action 3: Search[High Plains]\n",
"Observation 3: High Plains refers to one of two distinct land regions\n",
"Thought 4: I need to instead search High Plains (United States).\n",
"Action 4: Search[High Plains (United States)]\n",
"Observation 4: The High Plains are a subregion of the Great Plains. From east to west, the\n",
"High Plains rise in elevation from around 1,800 to 7,000 ft (550 to 2,130\n",
"m).[3]\n",
"Thought 5: High Plains rise in elevation from around 1,800 to 7,000 ft, so the answer\n",
"is 1,800 to 7,000 ft.\n",
"Action 5: Finish[1,800 to 7,000 ft]\n",
"\n",
"\n",
"\n",
"Question: What is the highest mountain peak in Asia?\n"
]
}
],
"outputs": [],
"source": [
"print(prompt.format(k=1, input=\"What is the highest mountain peak in Asia?\"))"
]
@@ -191,7 +171,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.7.6"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,38 @@
# Using Chains
Calling an LLM is a great first step, but it's just the beginning.
Normally when you use an LLM in an application, you are not sending user input directly to the LLM.
Instead, you are probably taking user input and constructing a prompt, and then sending that to the LLM.
In the previous example, the text we passed in was hardcoded to ask for a name for a company that made colorful socks.
In this imaginary service, what we would want to do is take only the user input describing what the company does, and then format the prompt with that information.
This is easy to do with LangChain!
First, let's define the prompt:
```python
from langchain.prompts import Prompt
prompt = Prompt(
input_variables=["product"],
template="What is a good name for a company that makes {product}?",
)
```
We can now create a very simple chain that will take user input, format the prompt with it, and then send it to the LLM:
```python
from langchain.chains import LLMChain
chain = LLMChain(llm=llm, prompt=prompt)
```
Now we can run that chain, specifying only the product!
```python
chain.run("colorful socks")
```
There we go! There's the first chain.
That is it for the Getting Started example.
As a next step, we would suggest checking out the more complex chains in the [Demos section](/examples/demos.rst).

View File

@@ -0,0 +1,37 @@
# Setting up your environment
Using LangChain will usually require integrations with one or more model providers, data stores, APIs, etc.
There are two components to setting this up: installing the correct python packages and setting the right environment variables.
## Python packages
The python package needed varies based on the integration. See the list of integrations for details.
There should also be helpful error messages raised if you try to run an integration and are missing any required python packages.
## Environment Variables
The environment variable needed varies based on the integration. See the list of integrations for details.
There should also be helpful error messages raised if you try to run an integration and are missing any required environment variables.
You can set the environment variable in a few ways.
If you are trying to set the environment variable `FOO` to value `bar`, here are the ways you could do so:
- From the command line:
```
export FOO=bar
```
- From the python notebook/script:
```python
import os
os.environ["FOO"] = "bar"
```
For the Getting Started example, we will be using OpenAI's APIs, so we will first need to install their SDK:
```
pip install openai
```
We will then need to set the environment variable. Let's do this from inside the Jupyter notebook (or Python script).
```python
import os
os.environ["OPENAI_API_KEY"] = "..."
```

View File

@@ -0,0 +1,11 @@
# Installation
LangChain is available on PyPI, so it is easily installable with:
```
pip install langchain
```
For more involved installation options, see the [Installation Reference](/installation.md) section.
That's it! LangChain is now installed. You can now use LangChain from a python script or Jupyter notebook.

View File

@@ -0,0 +1,25 @@
# Calling an LLM
The most basic building block of LangChain is calling an LLM on some input.
Let's walk through a simple example of how to do this.
For this purpose, let's pretend we are building a service that generates a company name based on what the company makes.
In order to do this, we first need to import the LLM wrapper.
```python
from langchain.llms import OpenAI
```
We can then initialize the wrapper with any arguments.
In this example, we probably want the outputs to be MORE random, so we'll initialize it with a HIGH temperature.
```python
llm = OpenAI(temperature=0.9)
```
We can now call it on some input!
```python
text = "What would be a good company name a company that makes colorful socks?"
llm(text)
```

View File

@@ -1,18 +1,82 @@
Welcome to LangChain
==========================
.. toctree::
:maxdepth: 2
:caption: User API
Large language models (LLMs) are emerging as a transformative technology, enabling
developers to build applications that they previously could not.
But using these LLMs in isolation is often not enough to
create a truly powerful app - the real power comes when you are able to
combine them with other sources of computation or knowledge.
This library is aimed at assisting in the development of those types of applications.
It aims to create:
1. a comprehensive collection of pieces you would ever want to combine
2. a flexible interface for combining pieces into a single comprehensive "chain"
3. a schema for easily saving and sharing those chains
The documentation is structured into the following sections:
.. toctree::
:maxdepth: 1
:caption: Getting Started
:name: getting_started
getting_started/installation.md
getting_started/environment.md
getting_started/llm.md
getting_started/chains.md
Goes over a simple walkthrough and tutorial for getting started: setting up a simple chain that generates a company name based on what the company makes.
Covers installation, environment setup, calling LLMs, and using prompts.
Start here if you haven't used LangChain before.
.. toctree::
:maxdepth: 1
:caption: How-To Examples
:name: examples
examples/demos.rst
examples/integrations.rst
examples/prompts.rst
examples/model_laboratory.ipynb
More elaborate examples and walk-throughs of particular
integrations and use cases. This is the place to look if you have questions
about how to integrate certain pieces, or if you want to find examples of
common tasks or cool demos.
.. toctree::
:maxdepth: 1
:caption: Reference
:name: reference
installation.md
integrations.md
modules/prompt
modules/llms
modules/embeddings
modules/text_splitter
modules/vectorstore
modules/chains
Full API documentation. This is the place to look if you want to
see detailed information about the various classes, methods, and APIs.
.. toctree::
:maxdepth: 1
:caption: Resources
:name: resources
core_concepts.md
glossary.md
Discord <https://discord.gg/6adMQxSpJS>
Higher level, conceptual explanations of the LangChain components.
This is the place to go if you want to increase your high level understanding
of the problems LangChain is solving, and how we decided to go about doing so.

24
docs/installation.md Normal file
View File

@@ -0,0 +1,24 @@
# Installation Options
LangChain is available on PyPI, so it is easily installable with:
```
pip install langchain
```
That will install the bare minimum requirements of LangChain.
A lot of the value of LangChain comes when integrating it with various model providers, datastores, etc.
By default, the dependencies needed to do that are NOT installed.
However, there are two other ways to install LangChain that do bring in those dependencies.
To install modules needed for the common LLM providers, run:
```
pip install langchain[llms]
```
To install all modules needed for all integrations, run:
```
pip install langchain[all]
```

33
docs/integrations.md Normal file
View File

@@ -0,0 +1,33 @@
# Integration Reference
Besides the installation of this python package, you will also need to install packages and set environment variables depending on which chains you want to use.
Note: the reason these packages are not included in the dependencies by default is that as we imagine scaling this package, we do not want to force dependencies that are not needed.
The following use cases require specific installs and api keys:
- _OpenAI_:
- Install requirements with `pip install openai`
- Get an OpenAI api key and either set it as an environment variable (`OPENAI_API_KEY`) or pass it to the LLM constructor as `openai_api_key`.
- _Cohere_:
- Install requirements with `pip install cohere`
- Get a Cohere api key and either set it as an environment variable (`COHERE_API_KEY`) or pass it to the LLM constructor as `cohere_api_key`.
- _HuggingFace Hub_
- Install requirements with `pip install huggingface_hub`
- Get a HuggingFace Hub api token and either set it as an environment variable (`HUGGINGFACEHUB_API_TOKEN`) or pass it to the LLM constructor as `huggingfacehub_api_token`.
- _SerpAPI_:
- Install requirements with `pip install google-search-results`
- Get a SerpAPI api key and either set it as an environment variable (`SERPAPI_API_KEY`) or pass it to the LLM constructor as `serpapi_api_key`.
- _NatBot_:
- Install requirements with `pip install playwright`
- _Wikipedia_:
- Install requirements with `pip install wikipedia`
- _Elasticsearch_:
- Install requirements with `pip install elasticsearch`
- Set up Elasticsearch backend. If you want to run it locally, [this](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/getting-started.html) is a good guide.
- _FAISS_:
- Install requirements with `pip install faiss` for Python 3.7 and `pip install faiss-cpu` for Python 3.10+.
- _Manifest_:
- Install requirements with `pip install manifest-ml` (Note: this is only available in Python 3.8+ currently).
If you are using the `NLTKTextSplitter` or the `SpacyTextSplitter`, you will also need to install the appropriate models. For example, if you want to use the `SpacyTextSplitter`, you will need to install the `en_core_web_sm` model with `python -m spacy download en_core_web_sm`. Similarly, if you want to use the `NLTKTextSplitter`, you will need to install the `punkt` model with `python -m nltk.downloader punkt`.
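For instance, here is a minimal sketch of using the NLTK-based splitter once the `punkt` model is downloaded; the default constructor and the file path below are assumptions for illustration:
```python
# Minimal sketch: split a long document with NLTKTextSplitter.
# Assumes `python -m nltk.downloader punkt` has already been run; the default
# constructor and the file path are illustrative assumptions.
from langchain.text_splitter import NLTKTextSplitter

text_splitter = NLTKTextSplitter()
with open("state_of_the_union.txt") as f:
    chunks = text_splitter.split_text(f.read())
print(f"Split into {len(chunks)} chunks")
```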

View File

@@ -0,0 +1,6 @@
:mod:`langchain.text_splitter`
==============================
.. automodule:: langchain.text_splitter
:members:
:undoc-members:

View File

@@ -0,0 +1,6 @@
:mod:`langchain.vectorstores`
=============================
.. automodule:: langchain.vectorstores
:members:
:undoc-members:

View File

@@ -1,6 +1,8 @@
autodoc_pydantic==1.8.0
myst_parser
nbsphinx==0.8.9
sphinx==4.5.0
sphinx-autobuild==2021.3.14
sphinx_rtd_theme==1.0.0
sphinx-typlog-theme==0.8.0
autodoc_pydantic==1.8.0
myst_parser
sphinx-panels

View File

@@ -1,125 +0,0 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "04a0170a",
"metadata": {},
"outputs": [],
"source": [
"from manifest import Manifest"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "de250a6a",
"metadata": {},
"outputs": [],
"source": [
"manifest = Manifest(\n",
" client_name = \"openai\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "0148f7bb",
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms.manifest import ManifestWrapper"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "67b719d6",
"metadata": {},
"outputs": [],
"source": [
"llm = ManifestWrapper(client=manifest, llm_kwargs={\"temperature\": 0, \"max_tokens\": 256})"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "5af505a8",
"metadata": {},
"outputs": [],
"source": [
"# Map reduce example\n",
"from langchain import Prompt\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.chains.mapreduce import MapReduceChain\n",
"\n",
"\n",
"_prompt = \"\"\"Write a concise summary of the following:\n",
"\n",
"\n",
"{text}\n",
"\n",
"\n",
"CONCISE SUMMARY:\"\"\"\n",
"prompt = Prompt(template=_prompt, input_variables=[\"text\"])\n",
"\n",
"text_splitter = CharacterTextSplitter()\n",
"\n",
"mp_chain = MapReduceChain.from_params(llm, prompt, text_splitter)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "485b3ec3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"The President discusses the recent aggression by Russia, and the response by the United States and its allies. He announces new sanctions against Russia, and says that the free world is united in holding Putin accountable. The President also discusses the American Rescue Plan, the Bipartisan Infrastructure Law, and the Bipartisan Innovation Act. Finally, the President addresses the need for women's rights and equality for LGBTQ+ Americans.\""
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with open('state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"mp_chain.run(state_of_the_union)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "32da6e41",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,147 +0,0 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "ab9e95ad",
"metadata": {},
"outputs": [],
"source": [
"from langchain import LLMChain, OpenAI, Cohere, HuggingFaceHub, Prompt\n",
"from langchain.model_laboratory import ModelLaboratory"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "32cb94e6",
"metadata": {},
"outputs": [],
"source": [
"llms = [OpenAI(temperature=0), Cohere(model=\"command-xlarge-20221108\", max_tokens=20, temperature=0), HuggingFaceHub(repo_id=\"google/flan-t5-xl\", model_kwargs={\"temperature\":1})]"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "14cde09d",
"metadata": {},
"outputs": [],
"source": [
"model_lab = ModelLaboratory(llms)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "f186c741",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"What color is a flamingo?\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\u001b[104m\n",
"\n",
"Flamingos are pink.\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 20, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\u001b[103m\n",
"\n",
"Pink\u001b[0m\n",
"\n",
"\u001b[1mHuggingFaceHub\u001b[0m\n",
"Params: {'repo_id': 'google/flan-t5-xl', 'temperature': 1}\n",
"\u001b[101mpink\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab.compare(\"What color is a flamingo?\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "248b652a",
"metadata": {},
"outputs": [],
"source": [
"prompt = Prompt(template=\"What is the capital of {state}?\", input_variables=[\"state\"])\n",
"model_lab_with_prompt = ModelLaboratory(llms, prompt=prompt)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "f64377ac",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"New York\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\u001b[104m\n",
"\n",
"The capital of New York is Albany.\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 20, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\u001b[103m\n",
"\n",
"The capital of New York is Albany.\u001b[0m\n",
"\n",
"\u001b[1mHuggingFaceHub\u001b[0m\n",
"Params: {'repo_id': 'google/flan-t5-xl', 'temperature': 1}\n",
"\u001b[101mst john s\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab_with_prompt.compare(\"New York\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "54336dbf",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,83 +0,0 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 3,
"id": "4e272b47",
"metadata": {},
"outputs": [],
"source": [
"from langchain import OpenAI, ReActChain, Wikipedia\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"react = ReActChain(llm=llm, docstore=Wikipedia(), verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "8078c8f1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[102m I need to search David Chanoff and find the U.S. Navy admiral he\n",
"collaborated with.\n",
"Action 1: Search[David Chanoff]\u001b[0m\u001b[49m\n",
"Observation 1: \u001b[0m\u001b[103mDavid Chanoff is a noted author of non-fiction work. His work has typically involved collaborations with the principal protagonist of the work concerned. His collaborators have included; Augustus A. White, Joycelyn Elders, Đoàn Văn Toại, William J. Crowe, Ariel Sharon, Kenneth Good and Felix Zandman. He has also written about a wide range of subjects including literary history, education and foreign for The Washington Post, The New Republic and The New York Times Magazine. He has published more than twelve books.\u001b[0m\u001b[49m\n",
"Thought 2:\u001b[0m\u001b[102m The U.S. Navy admiral David Chanoff collaborated with is William J. Crowe.\n",
"Action 2: Search[William J. Crowe]\u001b[0m\u001b[49m\n",
"Observation 2: \u001b[0m\u001b[103mWilliam James Crowe Jr. (January 2, 1925 October 18, 2007) was a United States Navy admiral and diplomat who served as the 11th chairman of the Joint Chiefs of Staff under Presidents Ronald Reagan and George H. W. Bush, and as the ambassador to the United Kingdom and Chair of the Intelligence Oversight Board under President Bill Clinton.\u001b[0m\u001b[49m\n",
"Thought 3:\u001b[0m\u001b[102m William J. Crowe served as the ambassador to the United Kingdom under President Bill Clinton. So the answer is Bill Clinton.\n",
"Action 3: Finish[Bill Clinton]\u001b[0m"
]
},
{
"data": {
"text/plain": [
"'Bill Clinton'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question = \"Author David Chanoff has collaborated with a U.S. Navy admiral who served as the ambassador to the United Kingdom under which President?\"\n",
"react.run(question)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0a6bd3b4",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1 +1 @@
0.0.11
0.0.16

View File

@@ -14,6 +14,7 @@ from langchain.chains import (
SelfAskWithSearchChain,
SerpAPIChain,
SQLDatabaseChain,
VectorDBQA,
)
from langchain.docstore import Wikipedia
from langchain.llms import Cohere, HuggingFaceHub, OpenAI
@@ -39,5 +40,6 @@ __all__ = [
"SQLDatabaseChain",
"FAISS",
"MRKLChain",
"VectorDBQA",
"ElasticVectorSearch",
]

View File

@@ -7,6 +7,7 @@ from langchain.chains.react.base import ReActChain
from langchain.chains.self_ask_with_search.base import SelfAskWithSearchChain
from langchain.chains.serpapi import SerpAPIChain
from langchain.chains.sql_database.base import SQLDatabaseChain
from langchain.chains.vector_db_qa.base import VectorDBQA
__all__ = [
"LLMChain",
@@ -17,4 +18,5 @@ __all__ = [
"ReActChain",
"SQLDatabaseChain",
"MRKLChain",
"VectorDBQA",
]

View File

@@ -9,7 +9,7 @@ class Chain(BaseModel, ABC):
"""Base interface that all chains should implement."""
verbose: bool = False
"""Whether to print out the code that was executed."""
"""Whether to print out response text."""
@property
@abstractmethod
@@ -35,16 +35,39 @@ class Chain(BaseModel, ABC):
)
@abstractmethod
def _run(self, inputs: Dict[str, str]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
"""Run the logic of this chain and return the output."""
def __call__(self, inputs: Dict[str, Any]) -> Dict[str, str]:
def __call__(
self, inputs: Dict[str, Any], return_only_outputs: bool = False
) -> Dict[str, str]:
"""Run the logic of this chain and add to output."""
self._validate_inputs(inputs)
if self.verbose:
print("\n\n\033[1m> Entering new chain...\033[0m")
outputs = self._run(inputs)
outputs = self._call(inputs)
if self.verbose:
print("\n\033[1m> Finished chain.\033[0m")
self._validate_outputs(outputs)
return {**inputs, **outputs}
if return_only_outputs:
return outputs
else:
return {**inputs, **outputs}
def apply(self, input_list: List[Dict[str, Any]]) -> List[Dict[str, str]]:
"""Call the chain on all inputs in the list."""
return [self(inputs) for inputs in input_list]
def run(self, text: str) -> str:
"""Run text in, text out (if applicable)."""
if len(self.input_keys) != 1:
raise ValueError(
f"`run` not supported when there is not exactly "
f"one input key, got {self.input_keys}."
)
if len(self.output_keys) != 1:
raise ValueError(
f"`run` not supported when there is not exactly "
f"one output key, got {self.output_keys}."
)
return self({self.input_keys[0]: text})[self.output_keys[0]]
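A rough usage sketch of the reworked interface; the chain and its keys are illustrative, assuming an LLMChain whose prompt takes a single `product` variable:
```python
# Illustrative only: exercising the new run / apply / __call__ helpers on a
# single-input, single-output chain. `chain` is assumed to be an LLMChain
# built around a prompt with one input variable, "product".
answer = chain.run("colorful socks")                 # text in, text out
rows = chain.apply([{"product": "colorful socks"},   # list of input dicts in,
                    {"product": "sturdy boots"}])    # list of result dicts out
outputs_only = chain({"product": "colorful socks"}, return_only_outputs=True)
```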

View File

@@ -0,0 +1,41 @@
"""Custom chain class."""
from typing import Callable, Dict, List
from pydantic import BaseModel, Extra
from langchain.chains.base import Chain
class SimpleCustomChain(Chain, BaseModel):
"""Custom chain with single string input/output."""
func: Callable[[str], str]
"""Custom callable function."""
input_key: str = "query" #: :meta private:
output_key: str = "result" #: :meta private:
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
@property
def input_keys(self) -> List[str]:
"""Return the singular input key.
:meta private:
"""
return [self.input_key]
@property
def output_keys(self) -> List[str]:
"""Return the singular output key.
:meta private:
"""
return [self.output_key]
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
_input = inputs[self.input_key]
output = self.func(_input)
return {self.output_key: output}
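A quick sketch of how this custom chain might be used; the wrapped function is just an example:
```python
# Illustrative usage of SimpleCustomChain: wrap an arbitrary
# string-to-string function.
def shout(text: str) -> str:
    return text.upper()

chain = SimpleCustomChain(func=shout)
print(chain.run("hello world"))          # single input/output key, so run() works
print(chain({"query": "hello world"}))   # {'query': 'hello world', 'result': 'HELLO WORLD'}
```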

View File

@@ -48,7 +48,7 @@ class LLMChain(Chain, BaseModel):
"""
return [self.output_key]
def _run(self, inputs: Dict[str, Any]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:
selected_inputs = {k: inputs[k] for k in self.prompt.input_variables}
prompt = self.prompt.format(**selected_inputs)

View File

@@ -48,7 +48,7 @@ class LLMMathChain(Chain, BaseModel):
"""
return [self.output_key]
def _run(self, inputs: Dict[str, str]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
llm_executor = LLMChain(prompt=PROMPT, llm=self.llm)
python_executor = PythonChain()
chained_input = ChainedInput(inputs[self.input_key], verbose=self.verbose)
@@ -66,19 +66,3 @@ class LLMMathChain(Chain, BaseModel):
else:
raise ValueError(f"unknown format from LLM: {t}")
return {self.output_key: answer}
def run(self, question: str) -> str:
"""Understand user question and execute math in Python if necessary.
Args:
question: User question that contains a math question to parse and answer.
Returns:
The answer to the question.
Example:
.. code-block:: python
answer = llm_math.run("What is one plus one?")
"""
return self({self.input_key: question})[self.output_key]

View File

@@ -1,6 +0,0 @@
[
{
"question": "What is 37593 * 67?",
"answer": "```python\nprint(37593 * 67)\n```\n```output\n2518731\n```\nAnswer: 2518731"
}
]

View File

@@ -1,7 +1,7 @@
# flake8: noqa
from langchain.prompts.prompt import Prompt
_PREFIX = """You are GPT-3, and you can't do math.
_PROMPT_TEMPLATE = """You are GPT-3, and you can't do math.
You can do basic math, and your memorization abilities are impressive, but you can't do any complex calculations that a human could not do in their head. You also have an annoying tendency to just make up highly specific, but wrong, answers.
@@ -21,29 +21,18 @@ Otherwise, use this simpler format:
Question: ${{Question without hard calculation}}
Answer: ${{Answer}}
Begin."""
from pathlib import Path
from langchain.prompts.data import BaseExample
example_path = Path(__file__).parent / "examples.json"
import json
class LLMMathExample(BaseExample):
question: str
answer: str
@property
def formatted(self) -> str:
return f"Question: {self.question}\n\n{self.answer}"
with open(example_path) as f:
raw_examples = json.load(f)
examples = [LLMMathExample(**example) for example in raw_examples]
PROMPT = Prompt.from_examples(
examples, "Question: {question}", ["question"], prefix=_PREFIX
)
Begin.
Question: What is 37593 * 67?
```python
print(37593 * 67)
```
```output
2518731
```
Answer: 2518731
Question: {question}"""
PROMPT = Prompt(input_variables=["question"], template=_PROMPT_TEMPLATE)

View File

@@ -57,18 +57,15 @@ class MapReduceChain(Chain, BaseModel):
"""
return [self.output_key]
def _run(self, inputs: Dict[str, str]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
# Split the larger text into smaller chunks.
docs = self.text_splitter.split_text(
inputs[self.input_key],
)
docs = self.text_splitter.split_text(inputs[self.input_key])
# Now that we have the chunks, we send them to the LLM and track results.
# This is the "map" part.
summaries = []
for d in docs:
inputs = {self.map_llm.prompt.input_variables[0]: d}
res = self.map_llm.predict(**inputs)
summaries.append(res)
input_list = [{self.map_llm.prompt.input_variables[0]: d} for d in docs]
summary_results = self.map_llm.apply(input_list)
summaries = [res[self.map_llm.output_key] for res in summary_results]
# We then need to combine these individual parts into one.
# This is the reduce part.
@@ -76,7 +73,3 @@ class MapReduceChain(Chain, BaseModel):
inputs = {self.reduce_llm.prompt.input_variables[0]: summary_str}
output = self.reduce_llm.predict(**inputs)
return {self.output_key: output}
def run(self, text: str) -> str:
"""Run the map-reduce logic on the input text."""
return self({self.input_key: text})[self.output_key]

View File

@@ -147,7 +147,7 @@ class MRKLChain(Chain, BaseModel):
"""
return [self.output_key]
def _run(self, inputs: Dict[str, str]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
llm_chain = LLMChain(llm=self.llm, prompt=self.prompt)
chained_input = ChainedInput(
f"{inputs[self.input_key]}\nThought:", verbose=self.verbose
@@ -168,7 +168,3 @@ class MRKLChain(Chain, BaseModel):
chained_input.add("\nObservation: ")
chained_input.add(ca, color=color_mapping[action])
chained_input.add("\nThought:")
def run(self, _input: str) -> str:
"""Run input through the MRKL system."""
return self({self.input_key: _input})[self.output_key]

View File

@@ -57,7 +57,7 @@ class NatBotChain(Chain, BaseModel):
"""
return [self.output_key]
def _run(self, inputs: Dict[str, str]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
llm_executor = LLMChain(prompt=PROMPT, llm=self.llm)
url = inputs[self.input_url_key]
browser_content = inputs[self.input_browser_content_key]
@@ -71,7 +71,7 @@ class NatBotChain(Chain, BaseModel):
self.previous_command = llm_cmd
return {self.output_key: llm_cmd}
def run(self, url: str, browser_content: str) -> str:
def execute(self, url: str, browser_content: str) -> str:
"""Figure out next browser command to run.
Args:

View File

@@ -28,14 +28,7 @@ class Crawler:
"Could not import playwright python package. "
"Please it install it with `pip install playwright`."
)
self.browser = (
sync_playwright()
.start()
.chromium.launch(
headless=False,
)
)
self.browser = sync_playwright().start().chromium.launch(headless=False)
self.page = self.browser.new_page()
self.page.set_viewport_size({"width": 1280, "height": 1080})

View File

@@ -0,0 +1,78 @@
"""Chain pipeline where the outputs of one step feed directly into next."""
from typing import Dict, List
from pydantic import BaseModel, Extra, root_validator
from langchain.chains.base import Chain
class Pipeline(Chain, BaseModel):
"""Chain pipeline where the outputs of one step feed directly into next."""
chains: List[Chain]
input_variables: List[str]
output_variables: List[str] #: :meta private:
return_all: bool = False
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
arbitrary_types_allowed = True
@property
def input_keys(self) -> List[str]:
"""Expect input key.
:meta private:
"""
return self.input_variables
@property
def output_keys(self) -> List[str]:
"""Return output key.
:meta private:
"""
return self.output_variables
@root_validator(pre=True)
def validate_chains(cls, values: Dict) -> Dict:
"""Validate that the correct inputs exist for all chains."""
chains = values["chains"]
input_variables = values["input_variables"]
known_variables = set(input_variables)
for chain in chains:
missing_vars = set(chain.input_keys).difference(known_variables)
if missing_vars:
raise ValueError(f"Missing required input keys: {missing_vars}")
overlapping_keys = known_variables.intersection(chain.output_keys)
if overlapping_keys:
raise ValueError(
f"Chain returned keys that already exist: {overlapping_keys}"
)
known_variables |= set(chain.output_keys)
if "output_variables" not in values:
if values.get("return_all", False):
output_keys = known_variables.difference(input_variables)
else:
output_keys = chains[-1].output_keys
values["output_variables"] = output_keys
else:
missing_vars = known_variables.difference(values["output_variables"])
if missing_vars:
raise ValueError(
f"Expected output variables that were not found: {missing_vars}."
)
return values
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
known_values = inputs.copy()
for i, chain in enumerate(self.chains):
outputs = chain(known_values, return_only_outputs=True)
if self.verbose:
print(f"\033[1mChain {i}\033[0m:\n{outputs}\n")
known_values.update(outputs)
return {k: known_values[k] for k in self.output_variables}
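A hedged sketch of composing two LLMChains with this pipeline; the prompts, keys, and the import path for `Pipeline` are illustrative assumptions (including that LLMChain's `output_key` field can be overridden):
```python
# Illustrative sketch: the first chain names a company, the second writes a
# slogan for it. Prompt texts and output keys are made up for the example,
# and the Pipeline import path is assumed.
from langchain import LLMChain, OpenAI, Prompt
from langchain.chains.pipeline import Pipeline  # assumed module path for the new file above

llm = OpenAI(temperature=0.7)
name_prompt = Prompt(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
)
slogan_prompt = Prompt(
    input_variables=["company_name"],
    template="Write a catchy slogan for a company called {company_name}.",
)
name_chain = LLMChain(llm=llm, prompt=name_prompt, output_key="company_name")
slogan_chain = LLMChain(llm=llm, prompt=slogan_prompt, output_key="slogan")

pipeline = Pipeline(chains=[name_chain, slogan_chain], input_variables=["product"])
print(pipeline.run("colorful socks"))  # single input/output key, so run() works
```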

View File

@@ -41,7 +41,7 @@ class PythonChain(Chain, BaseModel):
"""
return [self.output_key]
def _run(self, inputs: Dict[str, str]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
python_repl = PythonREPL()
old_stdout = sys.stdout
sys.stdout = mystdout = StringIO()
@@ -49,20 +49,3 @@ class PythonChain(Chain, BaseModel):
sys.stdout = old_stdout
output = mystdout.getvalue()
return {self.output_key: output}
def run(self, code: str) -> str:
"""Run code in python interpreter.
Args:
code: Code snippet to execute, should print out the answer.
Returns:
Answer from running the code and printing out the answer.
Example:
.. code-block:: python
answer = python_chain.run("print(1+1)")
"""
return self({self.input_key: code})[self.output_key]

View File

@@ -72,9 +72,9 @@ class ReActChain(Chain, BaseModel):
:meta private:
"""
return ["full_logic", self.output_key]
return [self.output_key]
def _run(self, inputs: Dict[str, Any]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:
question = inputs[self.input_key]
llm_chain = LLMChain(llm=self.llm, prompt=PROMPT)
chained_input = ChainedInput(f"{question}\nThought 1:", verbose=self.verbose)
@@ -98,27 +98,10 @@ class ReActChain(Chain, BaseModel):
raise ValueError("Cannot lookup without a successful search first")
observation = document.lookup(directive)
elif action == "Finish":
return {"full_logic": chained_input.input, self.output_key: directive}
return {self.output_key: directive}
else:
raise ValueError(f"Got unknown action directive: {action}")
chained_input.add(f"\nObservation {i}: ")
chained_input.add(observation, color="yellow")
chained_input.add(f"\nThought {i + 1}:")
i += 1
def run(self, question: str) -> str:
"""Run ReAct framework.
Args:
question: Question to be answered.
Returns:
Final answer from thinking through the ReAct framework.
Example:
.. code-block:: python
question = "Were Scott Derrickson and Ed Wood of the same nationality?"
answer = react.run(question)
"""
return self({self.input_key: question})[self.output_key]
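Since the base class now provides `run`, the usage shown in the removed demo notebook earlier in this changeset keeps working; a quick sketch:
```python
# Sketch reusing the construction from the removed ReAct demo notebook.
from langchain import OpenAI, ReActChain, Wikipedia

llm = OpenAI(temperature=0)
react = ReActChain(llm=llm, docstore=Wikipedia(), verbose=True)
answer = react.run("Were Scott Derrickson and Ed Wood of the same nationality?")
```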

View File

@@ -1,26 +0,0 @@
[
{
"question": "What is the elevation range for the area that the eastern sector of the Colorado orogeny extends into?",
"answer": "Thought 1: I need to search Colorado orogeny, find the area that the eastern sector of the Colorado orogeny extends into, then find the elevation range of that area.\nAction 1: Search[Colorado orogeny]\nObservation 1: The Colorado orogeny was an episode of mountain building (an orogeny) in Colorado and surrounding areas.\nThought 2: It does not mention the eastern sector. So I need to look up eastern sector.\nAction 2: Lookup[eastern sector]\nObservation 2: (Result 1 / 1) The eastern sector extends into the High Plains and is called the Central Plains orogeny.\nThought 3: The eastern sector of Colorado orogeny extends into the High Plains. So I need to search High Plains and find its elevation range.\nAction 3: Search[High Plains]\nObservation 3: High Plains refers to one of two distinct land regions\nThought 4: I need to instead search High Plains (United States).\nAction 4: Search[High Plains (United States)]\nObservation 4: The High Plains are a subregion of the Great Plains. From east to west, the High Plains rise in elevation from around 1,800 to 7,000 ft (550 to 2,130 m).[3]\nThought 5: High Plains rise in elevation from around 1,800 to 7,000 ft, so the answer is 1,800 to 7,000 ft.\nAction 5: Finish[1,800 to 7,000 ft]"
},
{
"question": "Musician and satirist Allie Goertz wrote a song about the \"The Simpsons\" character Milhouse, who Matt Groening named after who?",
"answer": "Thought 1: The question simplifies to \"The Simpsons\" character Milhouse is named after who. I only need to search Milhouse and find who it is named after.\nAction 1: Search[Milhouse]\nObservation 1: Milhouse Mussolini Van Houten is a recurring character in the Fox animated television series The Simpsons voiced by Pamela Hayden and created by Matt Groening.\nThought 2: The paragraph does not tell who Milhouse is named after, maybe I can look up \"named after\".\nAction 2: Lookup[named after]\nObservation 2: (Result 1 / 1) Milhouse was named after U.S. president Richard Nixon, whose middle name was Milhous.\nThought 3: Milhouse was named after U.S. president Richard Nixon, so the answer is Richard Nixon.\nAction 3: Finish[Richard Nixon]"
},
{
"question": "Which documentary is about Finnish rock groups, Adam Clayton Powell or The Saimaa Gesture?",
"answer": "Thought 1: I need to search Adam Clayton Powell and The Saimaa Gesture, and find which documentary is about Finnish rock groups.\nAction 1: Search[Adam Clayton Powell]\nObservation 1 Could not find [Adam Clayton Powell]. Similar: [Adam Clayton Powell III, Seventh Avenue (Manhattan), Adam Clayton Powell Jr. State Office Building, Isabel Washington Powell, Adam Powell, Adam Clayton Powell (film), Giancarlo Esposito].\nThought 2: To find the documentary, I can search Adam Clayton Powell (film).\nAction 2: Search[Adam Clayton Powell (film)]\nObservation 2: Adam Clayton Powell is a 1989 American documentary film directed by Richard Kilberg. The film is about the rise and fall of influential African-American politician Adam Clayton Powell Jr.[3][4] It was later aired as part of the PBS series The American Experience.\nThought 3: Adam Clayton Powell (film) is a documentary about an African-American politician, not Finnish rock groups. So the documentary about Finnish rock groups must instead be The Saimaa Gesture.\nAction 3: Finish[The Saimaa Gesture]"
},
{
"question": "What profession does Nicholas Ray and Elia Kazan have in common?",
"answer": "Thought 1: I need to search Nicholas Ray and Elia Kazan, find their professions, then find the profession they have in common.\nAction 1: Search[Nicholas Ray]\nObservation 1: Nicholas Ray (born Raymond Nicholas Kienzle Jr., August 7, 1911 - June 16, 1979) was an American film director, screenwriter, and actor best known for the 1955 film Rebel Without a Cause.\nThought 2: Professions of Nicholas Ray are director, screenwriter, and actor. I need to search Elia Kazan next and find his professions.\nAction 2: Search[Elia Kazan]\nObservation 2: Elia Kazan was an American film and theatre director, producer, screenwriter and actor.\nThought 3: Professions of Elia Kazan are director, producer, screenwriter, and actor. So profession Nicholas Ray and Elia Kazan have in common is director, screenwriter, and actor.\nAction 3: Finish[director, screenwriter, actor]"
},
{
"question": "Which magazine was started first Arthurs Magazine or First for Women?",
"answer": "Thought 1: I need to search Arthurs Magazine and First for Women, and find which was started first.\nAction 1: Search[Arthurs Magazine]\nObservation 1: Arthurs Magazine (1844-1846) was an American literary periodical published in Philadelphia in the 19th century.\nThought 2: Arthurs Magazine was started in 1844. I need to search First for Women next.\nAction 2: Search[First for Women]\nObservation 2: First for Women is a womans magazine published by Bauer Media Group in the USA.[1] The magazine was started in 1989.\nThought 3: First for Women was started in 1989. 1844 (Arthurs Magazine) < 1989 (First for Women), so Arthurs Magazine was started first.\nAction 3: Finish[Arthurs Magazine]"
},
{
"question": "Were Pavel Urysohn and Leonid Levin known for the same type of work?",
"answer": "Thought 1: I need to search Pavel Urysohn and Leonid Levin, find their types of work, then find if they are the same.\nAction 1: Search[Pavel Urysohn]\nObservation 1: Pavel Samuilovich Urysohn (February 3, 1898 - August 17, 1924) was a Soviet mathematician who is best known for his contributions in dimension theory.\nThought 2: Pavel Urysohn is a mathematician. I need to search Leonid Levin next and find its type of work.\nAction 2: Search[Leonid Levin]\nObservation 2: Leonid Anatolievich Levin is a Soviet-American mathematician and computer scientist.\nThought 3: Leonid Levin is a mathematician and computer scientist. So Pavel Urysohn and Leonid Levin have the same type of work.\nAction 3: Finish[yes]"
}
]

View File

@@ -1,30 +1,112 @@
# flake8: noqa
from pathlib import Path
from langchain.prompts.data import BaseExample
from langchain.prompts.prompt import Prompt
example_path = Path(__file__).parent / "examples.json"
import json
EXAMPLES = [
"""Question: What is the elevation range for the area that the eastern sector of the
Colorado orogeny extends into?
Thought 1: I need to search Colorado orogeny, find the area that the eastern sector
of the Colorado orogeny extends into, then find the elevation range of the
area.
Action 1: Search[Colorado orogeny]
Observation 1: The Colorado orogeny was an episode of mountain building (an orogeny) in
Colorado and surrounding areas.
Thought 2: It does not mention the eastern sector. So I need to look up eastern
sector.
Action 2: Lookup[eastern sector]
Observation 2: (Result 1 / 1) The eastern sector extends into the High Plains and is called
the Central Plains orogeny.
Thought 3: The eastern sector of Colorado orogeny extends into the High Plains. So I
need to search High Plains and find its elevation range.
Action 3: Search[High Plains]
Observation 3: High Plains refers to one of two distinct land regions
Thought 4: I need to instead search High Plains (United States).
Action 4: Search[High Plains (United States)]
Observation 4: The High Plains are a subregion of the Great Plains. From east to west, the
High Plains rise in elevation from around 1,800 to 7,000 ft (550 to 2,130
m).[3]
Thought 5: High Plains rise in elevation from around 1,800 to 7,000 ft, so the answer
is 1,800 to 7,000 ft.
Action 5: Finish[1,800 to 7,000 ft]""",
"""Question: Musician and satirist Allie Goertz wrote a song about the "The Simpsons"
character Milhouse, who Matt Groening named after who?
Thought 1: The question simplifies to "The Simpsons" character Milhouse is named after
who. I only need to search Milhouse and find who it is named after.
Action 1: Search[Milhouse]
Observation 1: Milhouse Mussolini Van Houten is a recurring character in the Fox animated
television series The Simpsons voiced by Pamela Hayden and created by Matt
Groening.
Thought 2: The paragraph does not tell who Milhouse is named after, maybe I can look up
"named after".
Action 2: Lookup[named after]
Observation 2: (Result 1 / 1) Milhouse was named after U.S. president Richard Nixon, whose
middle name was Milhous.
Thought 3: Milhouse was named after U.S. president Richard Nixon, so the answer is
Richard Nixon.
Action 3: Finish[Richard Nixon]""",
"""Question: Which documentary is about Finnish rock groups, Adam Clayton Powell or The
Saimaa Gesture?
Thought 1: I need to search Adam Clayton Powell and The Saimaa Gesture, and find which
documentary is about Finnish rock groups.
Action 1: Search[Adam Clayton Powell]
Observation 1 Could not find [Adam Clayton Powell]. Similar: [Adam Clayton Powell
III, Seventh Avenue (Manhattan), Adam Clayton Powell Jr. State Office
Building, Isabel Washington Powell, Adam Powell, Adam Clayton Powell
(film), Giancarlo Esposito].
Thought 2: To find the documentary, I can search Adam Clayton Powell (film).
Action 2: Search[Adam Clayton Powell (film)]
Observation 2: Adam Clayton Powell is a 1989 American documentary film directed by
Richard Kilberg. The film is about the rise and fall of influential
African-American politician Adam Clayton Powell Jr.[3][4] It was later aired
as part of the PBS series The American Experience.
Thought 3: Adam Clayton Powell (film) is a documentary about an African-American
politician, not Finnish rock groups. So the documentary about Finnish rock
groups must instead be The Saimaa Gesture.
Action 3: Finish[The Saimaa Gesture]""",
"""Question: What profession does Nicholas Ray and Elia Kazan have in common?
Thought 1: I need to search Nicholas Ray and Elia Kazan, find their professions, then
find the profession they have in common.
Action 1: Search[Nicholas Ray]
Observation 1: Nicholas Ray (born Raymond Nicholas Kienzle Jr., August 7, 1911 - June 16,
1979) was an American film director, screenwriter, and actor best known for
the 1955 film Rebel Without a Cause.
Thought 2: Professions of Nicholas Ray are director, screenwriter, and actor. I need
to search Elia Kazan next and find his professions.
Action 2: Search[Elia Kazan]
Observation 2: Elia Kazan was an American film and theatre director, producer, screenwriter
and actor.
Thought 3: Professions of Elia Kazan are director, producer, screenwriter, and actor.
So profession Nicholas Ray and Elia Kazan have in common is director,
screenwriter, and actor.
Action 3: Finish[director, screenwriter, actor]""",
"""Question: Which magazine was started first Arthurs Magazine or First for Women?
Thought 1: I need to search Arthurs Magazine and First for Women, and find which was
started first.
Action 1: Search[Arthurs Magazine]
Observation 1: Arthurs Magazine (1844-1846) was an American literary periodical published
in Philadelphia in the 19th century.
Thought 2: Arthurs Magazine was started in 1844. I need to search First for Women
next.
Action 2: Search[First for Women]
Observation 2: First for Women is a womans magazine published by Bauer Media Group in the
USA.[1] The magazine was started in 1989.
Thought 3: First for Women was started in 1989. 1844 (Arthurs Magazine) < 1989 (First
for Women), so Arthurs Magazine was started first.
Action 3: Finish[Arthurs Magazine]""",
"""Question: Were Pavel Urysohn and Leonid Levin known for the same type of work?
Thought 1: I need to search Pavel Urysohn and Leonid Levin, find their types of work,
then find if they are the same.
Action 1: Search[Pavel Urysohn]
Observation 1: Pavel Samuilovich Urysohn (February 3, 1898 - August 17, 1924) was a Soviet
mathematician who is best known for his contributions in dimension theory.
Thought 2: Pavel Urysohn is a mathematician. I need to search Leonid Levin next and
find its type of work.
Action 2: Search[Leonid Levin]
Observation 2: Leonid Anatolievich Levin is a Soviet-American mathematician and computer
scientist.
Thought 3: Leonid Levin is a mathematician and computer scientist. So Pavel Urysohn
and Leonid Levin have the same type of work.
Action 3: Finish[yes]""",
]
SUFFIX = """\n\nQuestion: {input}"""
class ReActExample(BaseExample):
question: str
answer: str
@property
def formatted(self) -> str:
return f"Question: {self.question}\n{self.answer}"
with open(example_path) as f:
raw_examples = json.load(f)
examples = [ReActExample(**example) for example in raw_examples]
SUFFIX = """Question: {input}"""
PROMPT = Prompt.from_examples(
examples,
SUFFIX,
["input"],
)
PROMPT = Prompt.from_examples(EXAMPLES, SUFFIX, ["input"])

View File

@@ -114,7 +114,7 @@ class SelfAskWithSearchChain(Chain, BaseModel):
"""
return [self.output_key]
def _run(self, inputs: Dict[str, Any]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:
chained_input = ChainedInput(inputs[self.input_key], verbose=self.verbose)
chained_input.add("\nAre follow up questions needed here:")
llm_chain = LLMChain(llm=self.llm, prompt=PROMPT)
@@ -125,7 +125,7 @@ class SelfAskWithSearchChain(Chain, BaseModel):
chained_input.add(ret_text, color="green")
while followup in get_last_line(ret_text):
question = extract_question(ret_text, followup)
external_answer = self.search_chain.search(question)
external_answer = self.search_chain.run(question)
if external_answer is not None:
chained_input.add(intermediate + " ")
chained_input.add(external_answer + ".", color="yellow")
@@ -147,19 +147,3 @@ class SelfAskWithSearchChain(Chain, BaseModel):
chained_input.add(ret_text, color="green")
return {self.output_key: ret_text}
def run(self, question: str) -> str:
"""Run self ask with search chain.
Args:
question: Question to run self-ask-with-search with.
Returns:
The final answer
Example:
.. code-block:: python
answer = selfask.run("What is the capital of Idaho?")
"""
return self({self.input_key: question})[self.output_key]
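With the chain-specific `run` helper removed here, callers would presumably go through the consolidated single-input `run` on the base `Chain` (see the "consolidate run functions" commit). A minimal sketch, assuming the constructor takes `llm` and `search_chain` keyword arguments as the attribute accesses above suggest, and assuming the module path:

from langchain import OpenAI
from langchain.chains.serpapi import SerpAPIChain
from langchain.chains.self_ask_with_search.base import SelfAskWithSearchChain  # module path assumed

self_ask = SelfAskWithSearchChain(llm=OpenAI(temperature=0), search_chain=SerpAPIChain())
# Single-input/single-output chains expose the consolidated run() helper.
answer = self_ask.run("What is the capital of Idaho?")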

View File

@@ -1,18 +0,0 @@
[
{
"question": "Who lived longer, Muhammad Ali or Alan Turing?",
"answer": "Are follow up questions needed here: Yes.\nFollow up: How old was Muhammad Ali when he died?\nIntermediate answer: Muhammad Ali was 74 years old when he died.\nFollow up: How old was Alan Turing when he died?\nIntermediate answer: Alan Turing was 41 years old when he died.\nSo the final answer is: Muhammad Ali"
},
{
"question": "When was the founder of craigslist born?",
"answer": "Are follow up questions needed here: Yes.\nFollow up: Who was the founder of craigslist?\nIntermediate answer: Craigslist was founded by Craig Newmark.\nFollow up: When was Craig Newmark born?\nIntermediate answer: Craig Newmark was born on December 6, 1952.\nSo the final answer is: December 6, 1952"
},
{
"question": "Who was the maternal grandfather of George Washington?",
"answer": "Are follow up questions needed here: Yes.\nFollow up: Who was the mother of George Washington?\nIntermediate answer: The mother of George Washington was Mary Ball Washington.\nFollow up: Who was the father of Mary Ball Washington?\nIntermediate answer: The father of Mary Ball Washington was Joseph Ball.\nSo the final answer is: Joseph Ball"
},
{
"question": "Are both the directors of Jaws and Casino Royale from the same country?",
"answer": "Are follow up questions needed here: Yes.\nFollow up: Who is the director of Jaws?\nIntermediate Answer: The director of Jaws is Steven Spielberg.\nFollow up: Where is Steven Spielberg from?\nIntermediate Answer: The United States.\nFollow up: Who is the director of Casino Royale?\nIntermediate Answer: The director of Casino Royale is Martin Campbell.\nFollow up: Where is Martin Campbell from?\nIntermediate Answer: New Zealand.\nSo the final answer is: No"
}
]

View File

@@ -1,28 +1,41 @@
# flake8: noqa
from pathlib import Path
from langchain.prompts.data import BaseExample
from langchain.prompts.prompt import Prompt
example_path = Path(__file__).parent / "examples.json"
import json
_DEFAULT_TEMPLATE = """Question: Who lived longer, Muhammad Ali or Alan Turing?
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
Question: When was the founder of craigslist born?
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952
class SelfAskWithSearchExample(BaseExample):
question: str
answer: str
Question: Who was the maternal grandfather of George Washington?
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
@property
def formatted(self) -> str:
return f"Question: {self.question}\n{self.answer}"
Question: Are both the directors of Jaws and Casino Royale from the same country?
Are follow up questions needed here: Yes.
Follow up: Who is the director of Jaws?
Intermediate Answer: The director of Jaws is Steven Spielberg.
Follow up: Where is Steven Spielberg from?
Intermediate Answer: The United States.
Follow up: Who is the director of Casino Royale?
Intermediate Answer: The director of Casino Royale is Martin Campbell.
Follow up: Where is Martin Campbell from?
Intermediate Answer: New Zealand.
So the final answer is: No
with open(example_path) as f:
raw_examples = json.load(f)
examples = [SelfAskWithSearchExample(**example) for example in raw_examples]
PROMPT = Prompt.from_examples(
examples,
"Question: {input}",
["input"],
)
Question: {input}"""
PROMPT = Prompt(input_variables=["input"], template=_DEFAULT_TEMPLATE)

View File

@@ -9,6 +9,7 @@ from typing import Any, Dict, List, Optional
from pydantic import BaseModel, Extra, root_validator
from langchain.chains.base import Chain
from langchain.utils import get_from_dict_or_env
class HiddenPrints:
@@ -43,7 +44,7 @@ class SerpAPIChain(Chain, BaseModel):
input_key: str = "search_query" #: :meta private:
output_key: str = "search_result" #: :meta private:
serpapi_api_key: Optional[str] = os.environ.get("SERPAPI_API_KEY")
serpapi_api_key: Optional[str] = None
class Config:
"""Configuration for this pydantic object."""
@@ -69,14 +70,10 @@ class SerpAPIChain(Chain, BaseModel):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
serpapi_api_key = values.get("serpapi_api_key")
if serpapi_api_key is None or serpapi_api_key == "":
raise ValueError(
"Did not find SerpAPI API key, please add an environment variable"
" `SERPAPI_API_KEY` which contains it, or pass `serpapi_api_key` "
"as a named parameter to the constructor."
)
serpapi_api_key = get_from_dict_or_env(
values, "serpapi_api_key", "SERPAPI_API_KEY"
)
values["serpapi_api_key"] = serpapi_api_key
try:
from serpapi import GoogleSearch
@@ -88,7 +85,7 @@ class SerpAPIChain(Chain, BaseModel):
)
return values
def _run(self, inputs: Dict[str, Any]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:
params = {
"api_key": self.serpapi_api_key,
"engine": "google",
@@ -116,19 +113,3 @@ class SerpAPIChain(Chain, BaseModel):
else:
toret = None
return {self.output_key: toret}
def search(self, search_question: str) -> str:
"""Run search query against SerpAPI.
Args:
search_question: Question to run against the SerpAPI.
Returns:
Answer from the search engine.
Example:
.. code-block:: python
answer = serpapi.search("What is the capital of Idaho?")
"""
return self({self.input_key: search_question})[self.output_key]

View File

@@ -0,0 +1,59 @@
"""Simple chain pipeline where the outputs of one step feed directly into next."""
from typing import Dict, List
from pydantic import BaseModel, Extra, root_validator
from langchain.chains.base import Chain
class SimplePipeline(Chain, BaseModel):
"""Simple chain pipeline where the outputs of one step feed directly into next."""
chains: List[Chain]
input_key: str = "input" #: :meta private:
output_key: str = "output" #: :meta private:
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
arbitrary_types_allowed = True
@property
def input_keys(self) -> List[str]:
"""Expect input key.
:meta private:
"""
return [self.input_key]
@property
def output_keys(self) -> List[str]:
"""Return output key.
:meta private:
"""
return [self.output_key]
@root_validator()
def validate_chains(cls, values: Dict) -> Dict:
"""Validate that chains are all single input/output."""
for chain in values["chains"]:
if len(chain.input_keys) != 1:
raise ValueError(
"Chains used in SimplePipeline should all have one input, got "
f"{chain} with {len(chain.input_keys)} inputs."
)
if len(chain.output_keys) != 1:
raise ValueError(
"Chains used in SimplePipeline should all have one output, got "
f"{chain} with {len(chain.output_keys)} outputs."
)
return values
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
_input = inputs[self.input_key]
for chain in self.chains:
_input = chain.run(_input)
return {self.output_key: _input}
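A minimal usage sketch of the new SimplePipeline, wiring two single-input LLM chains together. The SimplePipeline import path, the prompt texts, and the temperature are assumptions, not part of this diff:

from langchain import OpenAI
from langchain.chains.llm import LLMChain
from langchain.chains.simple_pipeline import SimplePipeline  # module path assumed
from langchain.prompts.prompt import Prompt

synopsis_prompt = Prompt(
    input_variables=["title"], template="Write a one-sentence synopsis of a play titled {title}."
)
review_prompt = Prompt(
    input_variables=["synopsis"], template="Write a one-sentence review of this synopsis: {synopsis}"
)
llm = OpenAI(temperature=0.7)
pipeline = SimplePipeline(
    chains=[LLMChain(llm=llm, prompt=synopsis_prompt), LLMChain(llm=llm, prompt=review_prompt)]
)
# The pipeline itself is a single-input/single-output chain, so run() applies.
review = pipeline.run("Tragedy at Sunset on the Beach")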

View File

@@ -51,7 +51,7 @@ class SQLDatabaseChain(Chain, BaseModel):
"""
return [self.output_key]
def _run(self, inputs: Dict[str, str]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
llm_chain = LLMChain(llm=self.llm, prompt=PROMPT)
chained_input = ChainedInput(
inputs[self.input_key] + "\nSQLQuery:", verbose=self.verbose
@@ -72,19 +72,3 @@ class SQLDatabaseChain(Chain, BaseModel):
final_result = llm_chain.predict(**llm_inputs)
chained_input.add(final_result, color="green")
return {self.output_key: final_result}
def query(self, query: str) -> str:
"""Run natural language query against a SQL database.
Args:
query: natural language query to run against the SQL database
Returns:
The final answer as derived from the SQL database.
Example:
.. code-block:: python
answer = db_chain.query("How many customers are there?")
"""
return self({self.input_key: query})[self.output_key]

View File

@@ -15,6 +15,5 @@ Only use the following tables:
Question: {input}"""
PROMPT = Prompt(
input_variables=["input", "table_info", "dialect"],
template=_DEFAULT_TEMPLATE,
input_variables=["input", "table_info", "dialect"], template=_DEFAULT_TEMPLATE
)

View File

@@ -0,0 +1 @@
"""Chain for question-answering against a vector database."""

View File

@@ -0,0 +1,66 @@
"""Chain for question-answering against a vector database."""
from typing import Dict, List
from pydantic import BaseModel, Extra
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chains.vector_db_qa.prompt import prompt
from langchain.llms.base import LLM
from langchain.vectorstores.base import VectorStore
class VectorDBQA(Chain, BaseModel):
"""Chain for question-answering against a vector database.
Example:
.. code-block:: python
from langchain import OpenAI, VectorDBQA
from langchain.faiss import FAISS
vectordb = FAISS(...)
vectordbQA = VectorDBQA(llm=OpenAI(), vectorstore=vectordb)
"""
llm: LLM
"""LLM wrapper to use."""
vectorstore: VectorStore
"""Vector Database to connect to."""
k: int = 4
"""Number of documents to query for."""
input_key: str = "query" #: :meta private:
output_key: str = "result" #: :meta private:
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
arbitrary_types_allowed = True
@property
def input_keys(self) -> List[str]:
"""Return the singular input key.
:meta private:
"""
return [self.input_key]
@property
def output_keys(self) -> List[str]:
"""Return the singular output key.
:meta private:
"""
return [self.output_key]
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
question = inputs[self.input_key]
llm_chain = LLMChain(llm=self.llm, prompt=prompt)
docs = self.vectorstore.similarity_search(question, k=self.k)
contexts = []
for j, doc in enumerate(docs):
contexts.append(f"Context {j}:\n{doc.page_content}")
# TODO: handle cases where this context is too long.
answer = llm_chain.predict(question=question, context="\n\n".join(contexts))
return {self.output_key: answer}
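A usage sketch of the new chain. The field name `vectorstore` comes from the class definition above; `my_vectorstore` is a placeholder for any VectorStore built elsewhere (e.g. a FAISS index over your documents), and the question text is illustrative:

from langchain import OpenAI, VectorDBQA

qa = VectorDBQA(llm=OpenAI(temperature=0), vectorstore=my_vectorstore, k=4)
# Single input key ("query") and single output key ("result"), so run() works.
result = qa.run("What did the author work on?")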

View File

@@ -0,0 +1,10 @@
# flake8: noqa
from langchain.prompts import Prompt
prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.
{context}
Question: {question}
Helpful Answer:"""
prompt = Prompt(template=prompt_template, input_variables=["context", "question"])

View File

@@ -1,7 +1,7 @@
"""Interface for interacting with a document."""
from typing import List
from pydantic import BaseModel
from pydantic import BaseModel, Field
class Document(BaseModel):
@@ -10,6 +10,7 @@ class Document(BaseModel):
page_content: str
lookup_str: str = ""
lookup_index = 0
metadata: dict = Field(default_factory=dict)
@property
def paragraphs(self) -> List[str]:
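A quick sketch of the new metadata field; the keys below are illustrative, and metadata defaults to an empty dict when not supplied:

from langchain.docstore.document import Document

doc = Document(page_content="Harrison works at Foo.", metadata={"source": "employees.txt", "row": 1})
print(doc.metadata.get("source"))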

View File

@@ -32,11 +32,7 @@ class Wikipedia(Docstore):
page_content = wikipedia.page(search).content
result: Union[str, Document] = Document(page_content=page_content)
except wikipedia.PageError:
result = (
f"Could not find [{search}]. " f"Similar: {wikipedia.search(search)}"
)
result = f"Could not find [{search}]. Similar: {wikipedia.search(search)}"
except wikipedia.DisambiguationError:
result = (
f"Could not find [{search}]. " f"Similar: {wikipedia.search(search)}"
)
result = f"Could not find [{search}]. Similar: {wikipedia.search(search)}"
return result

View File

@@ -1,10 +1,10 @@
"""Wrapper around Cohere embedding models."""
import os
from typing import Any, Dict, List, Optional
from pydantic import BaseModel, Extra, root_validator
from langchain.embeddings.base import Embeddings
from langchain.utils import get_from_dict_or_env
class CohereEmbeddings(BaseModel, Embeddings):
@@ -25,7 +25,7 @@ class CohereEmbeddings(BaseModel, Embeddings):
model: str = "medium"
"""Model name to use."""
cohere_api_key: Optional[str] = os.environ.get("COHERE_API_KEY")
cohere_api_key: Optional[str] = None
class Config:
"""Configuration for this pydantic object."""
@@ -35,14 +35,9 @@ class CohereEmbeddings(BaseModel, Embeddings):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
cohere_api_key = values.get("cohere_api_key")
if cohere_api_key is None or cohere_api_key == "":
raise ValueError(
"Did not find Cohere API key, please add an environment variable"
" `COHERE_API_KEY` which contains it, or pass `cohere_api_key` as a"
" named parameter."
)
cohere_api_key = get_from_dict_or_env(
values, "cohere_api_key", "COHERE_API_KEY"
)
try:
import cohere

View File

@@ -1,10 +1,10 @@
"""Wrapper around OpenAI embedding models."""
import os
from typing import Any, Dict, List, Optional
from pydantic import BaseModel, Extra, root_validator
from langchain.embeddings.base import Embeddings
from langchain.utils import get_from_dict_or_env
class OpenAIEmbeddings(BaseModel, Embeddings):
@@ -25,7 +25,7 @@ class OpenAIEmbeddings(BaseModel, Embeddings):
model_name: str = "babbage"
"""Model name to use."""
openai_api_key: Optional[str] = os.environ.get("OPENAI_API_KEY")
openai_api_key: Optional[str] = None
class Config:
"""Configuration for this pydantic object."""
@@ -35,14 +35,9 @@ class OpenAIEmbeddings(BaseModel, Embeddings):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
openai_api_key = values.get("openai_api_key")
if openai_api_key is None or openai_api_key == "":
raise ValueError(
"Did not find OpenAI API key, please add an environment variable"
" `OPENAI_API_KEY` which contains it, or pass `openai_api_key` as a"
" named parameter."
)
openai_api_key = get_from_dict_or_env(
values, "openai_api_key", "OPENAI_API_KEY"
)
try:
import openai

View File

@@ -1,18 +1,16 @@
"""Utility functions for working with prompts."""
from typing import Sequence, Union
from typing import List
from langchain.chains.llm import LLMChain
from langchain.llms.base import LLM
from langchain.prompts.data import BaseExample, convert_to_examples
from langchain.prompts.dynamic import DynamicPrompt
TEST_GEN_TEMPLATE_SUFFIX = "Add another example."
def generate_example(examples: Sequence[Union[str, BaseExample]], llm: LLM) -> str:
def generate_example(examples: List[str], llm: LLM) -> str:
"""Return another example given a list of examples for a prompt."""
full_examples = convert_to_examples(examples)
prompt = DynamicPrompt(examples=full_examples, suffix=TEST_GEN_TEMPLATE_SUFFIX)
prompt = DynamicPrompt(examples=examples, suffix=TEST_GEN_TEMPLATE_SUFFIX)
chain = LLMChain(llm=llm, prompt=prompt)
return chain.predict()

View File

@@ -1,14 +1,19 @@
"""Handle chained inputs."""
from typing import Dict, List, Optional
_COLOR_MAPPING = {"blue": 104, "yellow": 103, "red": 101, "green": 102}
_TEXT_COLOR_MAPPING = {
"blue": "36;1",
"yellow": "33;1",
"pink": "38;5;200",
"green": "32;1",
}
def get_color_mapping(
items: List[str], excluded_colors: Optional[List] = None
) -> Dict[str, str]:
"""Get mapping for items to a support color."""
colors = list(_COLOR_MAPPING.keys())
colors = list(_TEXT_COLOR_MAPPING.keys())
if excluded_colors is not None:
colors = [c for c in colors if c not in excluded_colors]
color_mapping = {item: colors[i % len(colors)] for i, item in enumerate(items)}
@@ -20,8 +25,8 @@ def print_text(text: str, color: Optional[str] = None, end: str = "") -> None:
if color is None:
print(text, end=end)
else:
color_str = _COLOR_MAPPING[color]
print(f"\x1b[{color_str}m{text}\x1b[0m", end=end)
color_str = _TEXT_COLOR_MAPPING[color]
print(f"\u001b[{color_str}m\033[1;3m{text}\u001b[0m", end=end)
class ChainedInput:
@@ -29,14 +34,14 @@ class ChainedInput:
def __init__(self, text: str, verbose: bool = False):
"""Initialize with verbose flag and initial text."""
self.verbose = verbose
if self.verbose:
self._verbose = verbose
if self._verbose:
print_text(text, None)
self._input = text
def add(self, text: str, color: Optional[str] = None) -> None:
"""Add text to input, print if in verbose mode."""
if self.verbose:
if self._verbose:
print_text(text, color)
self._input += text
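A small sketch of how chains can use the updated color helpers; the item names below are placeholders:

from langchain.input import get_color_mapping, print_text

mapping = get_color_mapping(["chain_0", "chain_1"], excluded_colors=["green"])
print_text("intermediate output", color=mapping["chain_0"], end="\n")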

View File

@@ -1,11 +1,11 @@
"""Wrapper around AI21 APIs."""
import os
from typing import Any, Dict, List, Mapping, Optional
import requests
from pydantic import BaseModel, Extra, root_validator
from langchain.llms.base import LLM
from langchain.utils import get_from_dict_or_env
class AI21PenaltyData(BaseModel):
@@ -62,7 +62,7 @@ class AI21(BaseModel, LLM):
logitBias: Optional[Dict[str, float]] = None
"""Adjust the probability of specific tokens being generated."""
ai21_api_key: Optional[str] = os.environ.get("AI21_API_KEY")
ai21_api_key: Optional[str] = None
class Config:
"""Configuration for this pydantic object."""
@@ -72,14 +72,8 @@ class AI21(BaseModel, LLM):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key exists in environment."""
ai21_api_key = values.get("ai21_api_key")
if ai21_api_key is None or ai21_api_key == "":
raise ValueError(
"Did not find AI21 API key, please add an environment variable"
" `AI21_API_KEY` which contains it, or pass `ai21_api_key`"
" as a named parameter."
)
ai21_api_key = get_from_dict_or_env(values, "ai21_api_key", "AI21_API_KEY")
values["ai21_api_key"] = ai21_api_key
return values
@property
@@ -122,11 +116,7 @@ class AI21(BaseModel, LLM):
response = requests.post(
url=f"https://api.ai21.com/studio/v1/{self.model}/complete",
headers={"Authorization": f"Bearer {self.ai21_api_key}"},
json={
"prompt": prompt,
"stopSequences": stop,
**self._default_params,
},
json={"prompt": prompt, "stopSequences": stop, **self._default_params},
)
if response.status_code != 200:
optional_detail = response.json().get("error")

View File

@@ -1,11 +1,11 @@
"""Wrapper around Cohere APIs."""
import os
from typing import Any, Dict, List, Mapping, Optional
from pydantic import BaseModel, Extra, root_validator
from langchain.llms.base import LLM
from langchain.llms.utils import enforce_stop_tokens
from langchain.utils import get_from_dict_or_env
class Cohere(LLM, BaseModel):
@@ -44,7 +44,7 @@ class Cohere(LLM, BaseModel):
presence_penalty: int = 0
"""Penalizes repeated tokens."""
cohere_api_key: Optional[str] = os.environ.get("COHERE_API_KEY")
cohere_api_key: Optional[str] = None
class Config:
"""Configuration for this pydantic object."""
@@ -54,14 +54,9 @@ class Cohere(LLM, BaseModel):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
cohere_api_key = values.get("cohere_api_key")
if cohere_api_key is None or cohere_api_key == "":
raise ValueError(
"Did not find Cohere API key, please add an environment variable"
" `COHERE_API_KEY` which contains it, or pass `cohere_api_key`"
" as a named parameter."
)
cohere_api_key = get_from_dict_or_env(
values, "cohere_api_key", "COHERE_API_KEY"
)
try:
import cohere

View File

@@ -1,11 +1,11 @@
"""Wrapper around HuggingFace APIs."""
import os
from typing import Any, Dict, List, Mapping, Optional
from pydantic import BaseModel, Extra, root_validator
from langchain.llms.base import LLM
from langchain.llms.utils import enforce_stop_tokens
from langchain.utils import get_from_dict_or_env
DEFAULT_REPO_ID = "gpt2"
VALID_TASKS = ("text2text-generation", "text-generation")
@@ -18,7 +18,7 @@ class HuggingFaceHub(LLM, BaseModel):
environment variable ``HUGGINGFACEHUB_API_TOKEN`` set with your API token, or pass
it as a named parameter to the constructor.
Only supports task `text-generation` for now.
Only supports `text-generation` and `text2text-generation` for now.
Example:
.. code-block:: python
@@ -35,7 +35,7 @@ class HuggingFaceHub(LLM, BaseModel):
model_kwargs: Optional[dict] = None
"""Key word arguments to pass to the model."""
huggingfacehub_api_token: Optional[str] = os.environ.get("HUGGINGFACEHUB_API_TOKEN")
huggingfacehub_api_token: Optional[str] = None
class Config:
"""Configuration for this pydantic object."""
@@ -45,13 +45,9 @@ class HuggingFaceHub(LLM, BaseModel):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
huggingfacehub_api_token = values.get("huggingfacehub_api_token")
if huggingfacehub_api_token is None or huggingfacehub_api_token == "":
raise ValueError(
"Did not find HuggingFace API token, please add an environment variable"
" `HUGGINGFACEHUB_API_TOKEN` which contains it, or pass"
" `huggingfacehub_api_token` as a named parameter."
)
huggingfacehub_api_token = get_from_dict_or_env(
values, "huggingfacehub_api_token", "HUGGINGFACEHUB_API_TOKEN"
)
try:
from huggingface_hub.inference_api import InferenceApi

View File

@@ -1,10 +1,10 @@
"""Wrapper around NLPCloud APIs."""
import os
from typing import Any, Dict, List, Mapping, Optional
from pydantic import BaseModel, Extra, root_validator
from langchain.llms.base import LLM
from langchain.utils import get_from_dict_or_env
class NLPCloud(LLM, BaseModel):
@@ -54,7 +54,7 @@ class NLPCloud(LLM, BaseModel):
num_return_sequences: int = 1
"""How many completions to generate for each prompt."""
nlpcloud_api_key: Optional[str] = os.environ.get("NLPCLOUD_API_KEY")
nlpcloud_api_key: Optional[str] = None
class Config:
"""Configuration for this pydantic object."""
@@ -64,14 +64,9 @@ class NLPCloud(LLM, BaseModel):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
nlpcloud_api_key = values.get("nlpcloud_api_key")
if nlpcloud_api_key is None or nlpcloud_api_key == "":
raise ValueError(
"Did not find NLPCloud API key, please add an environment variable"
" `NLPCLOUD_API_KEY` which contains it, or pass `nlpcloud_api_key`"
" as a named parameter."
)
nlpcloud_api_key = get_from_dict_or_env(
values, "nlpcloud_api_key", "NLPCLOUD_API_KEY"
)
try:
import nlpcloud

View File

@@ -1,10 +1,10 @@
"""Wrapper around OpenAI APIs."""
import os
from typing import Any, Dict, List, Mapping, Optional
from pydantic import BaseModel, Extra, root_validator
from langchain.llms.base import LLM
from langchain.utils import get_from_dict_or_env
class OpenAI(LLM, BaseModel):
@@ -38,7 +38,7 @@ class OpenAI(LLM, BaseModel):
best_of: int = 1
"""Generates best_of completions server-side and returns the "best"."""
openai_api_key: Optional[str] = os.environ.get("OPENAI_API_KEY")
openai_api_key: Optional[str] = None
class Config:
"""Configuration for this pydantic object."""
@@ -48,14 +48,9 @@ class OpenAI(LLM, BaseModel):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
openai_api_key = values.get("openai_api_key")
if openai_api_key is None or openai_api_key == "":
raise ValueError(
"Did not find OpenAI API key, please add an environment variable"
" `OPENAI_API_KEY` which contains it, or pass `openai_api_key`"
" as a named parameter."
)
openai_api_key = get_from_dict_or_env(
values, "openai_api_key", "OPENAI_API_KEY"
)
try:
import openai

View File

@@ -1,6 +1,7 @@
"""Experiment with different models."""
from typing import List, Optional
from typing import List, Optional, Sequence
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.input import get_color_mapping, print_text
from langchain.llms.base import LLM
@@ -10,7 +11,41 @@ from langchain.prompts.prompt import Prompt
class ModelLaboratory:
"""Experiment with different models."""
def __init__(self, llms: List[LLM], prompt: Optional[Prompt] = None):
def __init__(self, chains: Sequence[Chain], names: Optional[List[str]] = None):
"""Initialize with chains to experiment with.
Args:
chains: list of chains to experiment with.
"""
if not isinstance(chains[0], Chain):
raise ValueError(
"ModelLaboratory should now be initialized with Chains. "
"If you want to initialize with LLMs, use the `from_llms` method "
"instead (`ModelLaboratory.from_llms(...)`)"
)
for chain in chains:
if len(chain.input_keys) != 1:
raise ValueError(
"Currently only support chains with one input variable, "
f"got {chain.input_keys}"
)
if len(chain.output_keys) != 1:
raise ValueError(
"Currently only support chains with one output variable, "
f"got {chain.output_keys}"
)
if names is not None:
if len(names) != len(chains):
raise ValueError("Length of chains does not match length of names.")
self.chains = chains
chain_range = [str(i) for i in range(len(self.chains))]
self.chain_colors = get_color_mapping(chain_range)
self.names = names
@classmethod
def from_llms(
cls, llms: List[LLM], prompt: Optional[Prompt] = None
) -> "ModelLaboratory":
"""Initialize with LLMs to experiment with and optional prompt.
Args:
@@ -18,18 +53,11 @@ class ModelLaboratory:
prompt: Optional prompt to use to prompt the LLMs. Defaults to None.
If a prompt was provided, it should only have one input variable.
"""
self.llms = llms
llm_range = [str(i) for i in range(len(self.llms))]
self.llm_colors = get_color_mapping(llm_range)
if prompt is None:
self.prompt = Prompt(input_variables=["_input"], template="{_input}")
else:
if len(prompt.input_variables) != 1:
raise ValueError(
"Currently only support prompts with one input variable, "
f"got {prompt}"
)
self.prompt = prompt
prompt = Prompt(input_variables=["_input"], template="{_input}")
chains = [LLMChain(llm=llm, prompt=prompt) for llm in llms]
names = [str(llm) for llm in llms]
return cls(chains, names=names)
def compare(self, text: str) -> None:
"""Compare model outputs on an input text.
@@ -42,9 +70,11 @@ class ModelLaboratory:
text: input text to run all models on.
"""
print(f"\033[1mInput:\033[0m\n{text}\n")
for i, llm in enumerate(self.llms):
print_text(str(llm), end="\n")
chain = LLMChain(llm=llm, prompt=self.prompt)
llm_inputs = {self.prompt.input_variables[0]: text}
output = chain.predict(**llm_inputs)
print_text(output, color=self.llm_colors[str(i)], end="\n\n")
for i, chain in enumerate(self.chains):
if self.names is not None:
name = self.names[i]
else:
name = str(chain)
print_text(name, end="\n")
output = chain.run(text)
print_text(output, color=self.chain_colors[str(i)], end="\n\n")
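A sketch of the new interface via the `from_llms` classmethod; the module path and the two model configurations are assumptions:

from langchain import OpenAI
from langchain.model_laboratory import ModelLaboratory  # module path assumed

lab = ModelLaboratory.from_llms([OpenAI(temperature=0), OpenAI(temperature=1)])
lab.compare("What color is a flamingo?")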

View File

@@ -3,7 +3,6 @@ from abc import ABC, abstractmethod
from typing import Any, List
from langchain.formatting import formatter
from langchain.prompts.data import BaseExample
DEFAULT_FORMATTER_MAPPING = {
"f-string": formatter.format,

View File

@@ -1,36 +0,0 @@
from abc import ABC, abstractmethod
from pydantic import BaseModel
class BaseExample(BaseModel, ABC):
"""Base class for examples."""
@property
@abstractmethod
def formatted(self) -> str:
"""Returns a formatted example as a string."""
class SimpleExample(BaseExample):
text: str
@property
def formatted(self) -> str:
return self.text
from typing import Sequence, Union
def convert_to_examples(
examples: Sequence[Union[str, BaseExample]]
) -> Sequence[BaseExample]:
new_examples = [
example
if isinstance(example, BaseExample)
else SimpleExample(text=str(example))
for example in examples
]
return new_examples

View File

@@ -5,7 +5,6 @@ from typing import Any, Callable, Dict, List
from pydantic import BaseModel, Extra, root_validator
from langchain.prompts.base import DEFAULT_FORMATTER_MAPPING, BasePrompt
from langchain.prompts.data import BaseExample, convert_to_examples
class DynamicPrompt(BaseModel, BasePrompt):
@@ -26,7 +25,7 @@ class DynamicPrompt(BaseModel, BasePrompt):
)
"""
examples: List[BaseExample]
examples: List[str]
"""A list of the examples that the prompt template expects."""
example_separator: str = "\n\n"
@@ -77,7 +76,7 @@ class DynamicPrompt(BaseModel, BasePrompt):
prompt.format(variable1="foo")
"""
curr_examples = [example.formatted for example in self.examples]
curr_examples = self.examples
template = self.template(curr_examples, **kwargs)
while self.get_text_length(template) > self.max_length and curr_examples:
curr_examples = curr_examples[:-1]
@@ -97,16 +96,6 @@ class DynamicPrompt(BaseModel, BasePrompt):
f"Invalid template format. Got `{template_format}`;"
f" should be one of {valid_formats}"
)
dummy_inputs = {input_variable: "foo" for input_variable in input_variables}
try:
formatter_func = DEFAULT_FORMATTER_MAPPING[template_format]
formatter_func(prefix + suffix, **dummy_inputs)
except KeyError:
raise ValueError("Invalid prompt schema.")
return values
@root_validator()
def get_text_length_is_valid(cls, values: Dict) -> Dict:
try:
result = values["get_text_length"]("foo")
assert isinstance(result, int)
@@ -114,10 +103,10 @@ class DynamicPrompt(BaseModel, BasePrompt):
raise ValueError(
"Invalid text length callable, must take string & return int;"
)
return values
# Needs to be pre=True to convert to the right type.
@root_validator(pre=True)
def convert_examples(cls, values: Dict) -> Dict:
values["examples"] = convert_to_examples(values["examples"])
dummy_inputs = {input_variable: "foo" for input_variable in input_variables}
try:
formatter_func = DEFAULT_FORMATTER_MAPPING[template_format]
formatter_func(prefix + suffix, **dummy_inputs)
except KeyError:
raise ValueError("Invalid prompt schema.")
return values

View File

@@ -1,10 +1,9 @@
"""Prompt schema definition."""
from typing import Any, Dict, List, Sequence, Union
from typing import Any, Dict, List
from pydantic import BaseModel, Extra, root_validator
from langchain.prompts.base import DEFAULT_FORMATTER_MAPPING, BasePrompt
from langchain.prompts.data import BaseExample, convert_to_examples
class Prompt(BaseModel, BasePrompt):
@@ -71,7 +70,7 @@ class Prompt(BaseModel, BasePrompt):
@classmethod
def from_examples(
cls,
examples: Sequence[Union[BaseExample, str]],
examples: List[str],
suffix: str,
input_variables: List[str],
example_separator: str = "\n\n",
@@ -95,7 +94,20 @@ class Prompt(BaseModel, BasePrompt):
Returns:
The final prompt generated.
"""
full_examples = convert_to_examples(examples)
data = [prefix] + [example.formatted for example in full_examples] + [suffix]
template = example_separator.join(data)
template = example_separator.join([prefix, *examples, suffix])
return cls(input_variables=input_variables, template=template)
@classmethod
def from_file(cls, template_file: str, input_variables: List[str]) -> "Prompt":
"""Load a prompt from a file.
Args:
template_file: The path to the file containing the prompt template.
input_variables: A list of variable names the final prompt template
will expect.
Returns:
The prompt loaded from the file.
"""
with open(template_file, "r") as f:
template = f.read()
return cls(input_variables=input_variables, template=template)
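A minimal sketch of the simplified string-based interface, mirroring the ReAct module earlier in this diff; the example strings and the variable name `word` are placeholders:

from langchain.prompts.prompt import Prompt

examples = ["Input: happy\nOutput: sad", "Input: tall\nOutput: short"]
prompt = Prompt.from_examples(examples, "Input: {word}\nOutput:", ["word"])
print(prompt.format(word="big"))
# Prompt.from_file("my_template.txt", ["word"]) loads a template of the same shape from disk.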

langchain/py.typed Normal file (0 lines changed)
View File

View File

@@ -1,4 +1,6 @@
"""SQLAlchemy wrapper around a database."""
from typing import Any, Iterable, List, Optional
from sqlalchemy import create_engine, inspect
from sqlalchemy.engine import Engine
@@ -6,29 +8,57 @@ from sqlalchemy.engine import Engine
class SQLDatabase:
"""SQLAlchemy wrapper around a database."""
def __init__(self, engine: Engine):
def __init__(
self,
engine: Engine,
ignore_tables: Optional[List[str]] = None,
include_tables: Optional[List[str]] = None,
):
"""Create engine from database URI."""
self._engine = engine
if include_tables and ignore_tables:
raise ValueError("Cannot specify both include_tables and ignore_tables")
self._inspector = inspect(self._engine)
self._all_tables = self._inspector.get_table_names()
self._include_tables = include_tables or []
if self._include_tables:
missing_tables = set(self._include_tables).difference(self._all_tables)
if missing_tables:
raise ValueError(
f"include_tables {missing_tables} not found in database"
)
self._ignore_tables = ignore_tables or []
if self._ignore_tables:
missing_tables = set(self._ignore_tables).difference(self._all_tables)
if missing_tables:
raise ValueError(
f"ignore_tables {missing_tables} not found in database"
)
@classmethod
def from_uri(cls, database_uri: str) -> "SQLDatabase":
def from_uri(cls, database_uri: str, **kwargs: Any) -> "SQLDatabase":
"""Construct a SQLAlchemy engine from URI."""
return cls(create_engine(database_uri))
return cls(create_engine(database_uri), **kwargs)
@property
def dialect(self) -> str:
"""Return string representation of dialect to use."""
return self._engine.dialect.name
def _get_table_names(self) -> Iterable[str]:
if self._include_tables:
return self._include_tables
return set(self._all_tables) - set(self._ignore_tables)
@property
def table_info(self) -> str:
"""Information about all tables in the database."""
template = "The '{table_name}' table has columns: {columns}."
template = "Table '{table_name}' has columns: {columns}."
tables = []
inspector = inspect(self._engine)
for table_name in inspector.get_table_names():
for table_name in self._get_table_names():
columns = []
for column in inspector.get_columns(table_name):
for column in self._inspector.get_columns(table_name):
columns.append(f"{column['name']} ({str(column['type'])})")
column_str = ", ".join(columns)
table_str = template.format(table_name=table_name, columns=column_str)
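A sketch of the new `include_tables` / `ignore_tables` arguments; the module path, the sqlite URI, and the table name are placeholders:

from langchain.sql_database import SQLDatabase  # module path assumed

db = SQLDatabase.from_uri("sqlite:///example.db", include_tables=["users"])
print(db.dialect)
print(db.table_info)  # only the 'users' table is described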

View File

@@ -1,12 +1,18 @@
"""Functionality for splitting text."""
from abc import abstractmethod
from typing import Iterable, List
from abc import ABC, abstractmethod
from typing import Any, Callable, Iterable, List
class TextSplitter:
class TextSplitter(ABC):
"""Interface for splitting text into chunks."""
def __init__(self, separator: str, chunk_size: int, chunk_overlap: int):
def __init__(
self,
separator: str = "\n\n",
chunk_size: int = 4000,
chunk_overlap: int = 200,
length_function: Callable[[str], int] = len,
):
"""Create a new TextSplitter."""
if chunk_overlap > chunk_size:
raise ValueError(
@@ -16,6 +22,7 @@ class TextSplitter:
self._separator = separator
self._chunk_size = chunk_size
self._chunk_overlap = chunk_overlap
self._length_function = length_function
@abstractmethod
def split_text(self, text: str) -> List[str]:
@@ -28,29 +35,43 @@ class TextSplitter:
current_doc: List[str] = []
total = 0
for d in splits:
if total > self._chunk_size:
if total >= self._chunk_size:
docs.append(self._separator.join(current_doc))
while total > self._chunk_overlap:
total -= len(current_doc[0])
total -= self._length_function(current_doc[0])
current_doc = current_doc[1:]
current_doc.append(d)
total += len(d)
total += self._length_function(d)
docs.append(self._separator.join(current_doc))
return docs
@classmethod
def from_huggingface_tokenizer(
cls, tokenizer: Any, **kwargs: Any
) -> "TextSplitter":
"""Text splitter than uses HuggingFace tokenizer to count length."""
try:
from transformers import PreTrainedTokenizerBase
if not isinstance(tokenizer, PreTrainedTokenizerBase):
raise ValueError(
"Tokenizer received was not an instance of PreTrainedTokenizerBase"
)
def _huggingface_tokenizer_length(text: str) -> int:
return len(tokenizer.encode(text))
except ImportError:
raise ValueError(
"Could not import transformers python package. "
"Please it install it with `pip install transformers`."
)
return cls(length_function=_huggingface_tokenizer_length, **kwargs)
class CharacterTextSplitter(TextSplitter):
"""Implementation of splitting text that looks at characters."""
def __init__(
self, separator: str = "\n\n", chunk_size: int = 4000, chunk_overlap: int = 200
):
"""Create a new CharacterTextSplitter."""
super(CharacterTextSplitter, self).__init__(
separator, chunk_size, chunk_overlap
)
self._separator = separator
def split_text(self, text: str) -> List[str]:
"""Split incoming text and return chunks."""
# First we naively split the large input into a bunch of smaller ones.
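The new `length_function` hook lets chunk size be measured in something other than characters. A minimal sketch counting whitespace-separated words (the numbers and sample text are illustrative):

from langchain.text_splitter import CharacterTextSplitter

splitter = CharacterTextSplitter(
    separator=" ",
    chunk_size=5,
    chunk_overlap=0,
    length_function=lambda text: len(text.split()),  # measure chunks in words, not characters
)
chunks = splitter.split_text("one two three four five six seven eight")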

langchain/utils.py Normal file (17 lines changed)
View File

@@ -0,0 +1,17 @@
"""Generic utility functions."""
import os
from typing import Any, Dict
def get_from_dict_or_env(data: Dict[str, Any], key: str, env_key: str) -> str:
"""Get a value from a dictionary or an environment variable."""
if key in data and data[key]:
return data[key]
elif env_key in os.environ and os.environ[env_key]:
return os.environ[env_key]
else:
raise ValueError(
f"Did not find {key}, please add an environment variable"
f" `{env_key}` which contains it, or pass"
f" `{key}` as a named parameter."
)
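A sketch of how the rewritten validators above use this helper; the key value and dict contents are placeholders:

import os

from langchain.utils import get_from_dict_or_env

os.environ["OPENAI_API_KEY"] = "sk-placeholder"  # normally set in the shell, not in code
values = {"openai_api_key": None}                # constructor argument not supplied
key = get_from_dict_or_env(values, "openai_api_key", "OPENAI_API_KEY")
# Returns the environment value because the dict entry is empty; raises ValueError if both are missing.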

View File

@@ -1,10 +1,10 @@
"""Wrapper around Elasticsearch vector database."""
import os
import uuid
from typing import Any, Callable, Dict, List
from langchain.docstore.document import Document
from langchain.embeddings.base import Embeddings
from langchain.utils import get_from_dict_or_env
from langchain.vectorstores.base import VectorStore
@@ -45,17 +45,14 @@ class ElasticVectorSearch(VectorStore):
"""
def __init__(
self,
elasticsearch_url: str,
index_name: str,
embedding_function: Callable,
self, elasticsearch_url: str, index_name: str, embedding_function: Callable
):
"""Initialize with necessary components."""
try:
import elasticsearch
except ImportError:
raise ValueError(
"Could not import elasticsearch python packge. "
"Could not import elasticsearch python package. "
"Please install it with `pip install elasticearch`."
)
self.embedding_function = embedding_function
@@ -64,7 +61,7 @@ class ElasticVectorSearch(VectorStore):
es_client = elasticsearch.Elasticsearch(elasticsearch_url) # noqa
except ValueError as e:
raise ValueError(
"Your elasticsearch client string is misformatted. " f"Got error: {e} "
f"Your elasticsearch client string is misformatted. Got error: {e} "
)
self.client = es_client
@@ -91,7 +88,7 @@ class ElasticVectorSearch(VectorStore):
) -> "ElasticVectorSearch":
"""Construct ElasticVectorSearch wrapper from raw documents.
This is a user friendly interface that:
This is a user-friendly interface that:
1. Embeds documents.
2. Creates a new index for the embeddings in the Elasticsearch instance.
3. Adds the documents to the newly created Elasticsearch index.
@@ -110,22 +107,15 @@ class ElasticVectorSearch(VectorStore):
elasticsearch_url="http://localhost:9200"
)
"""
elasticsearch_url = kwargs.get("elasticsearch_url")
if not elasticsearch_url:
elasticsearch_url = os.environ.get("ELASTICSEARCH_URL")
if elasticsearch_url is None or elasticsearch_url == "":
raise ValueError(
"Did not find Elasticsearch URL, please add an environment variable"
" `ELASTICSEARCH_URL` which contains it, or pass"
" `elasticsearch_url` as a named parameter."
)
elasticsearch_url = get_from_dict_or_env(
kwargs, "elasticsearch_url", "ELASTICSEARCH_URL"
)
try:
import elasticsearch
from elasticsearch.helpers import bulk
except ImportError:
raise ValueError(
"Could not import elasticsearch python packge. "
"Could not import elasticsearch python package. "
"Please install it with `pip install elasticearch`."
)
try:

View File

@@ -3,7 +3,7 @@ sphinx:
configuration: docs/conf.py
formats: all
python:
version: 3.6
version: 3.8
install:
- requirements: docs/requirements.txt
- method: pip

View File

@@ -1,17 +1,7 @@
-r test_requirements.txt
# For integrations
cohere
elasticsearch
openai
google-search-results
nlpcloud
-e '.[all]'
# For trickier integrations
playwright
wikipedia
huggingface_hub
faiss-cpu
sentence_transformers
manifest-ml
spacy
nltk
# For development
jupyter

View File

@@ -9,6 +9,19 @@ with open(Path(__file__).absolute().parents[0] / "langchain" / "VERSION") as _f:
with open("README.md", "r") as f:
long_description = f.read()
LLM_DEPENDENCIES = ["cohere", "openai", "nlpcloud", "huggingface_hub"]
OTHER_DEPENDENCIES = [
"elasticsearch",
"google-search-results",
"wikipedia",
"faiss-cpu",
"sentence_transformers",
"transformers",
"spacy",
"nltk",
]
setup(
name="langchain",
version=__version__,
@@ -20,4 +33,8 @@ setup(
url="https://github.com/hwchase17/langchain",
include_package_data=True,
long_description_content_type="text/markdown",
extras_require={
"llms": LLM_DEPENDENCIES,
"all": LLM_DEPENDENCIES + OTHER_DEPENDENCIES,
},
)
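With these extras defined, integration dependencies can be installed selectively, e.g. `pip install 'langchain[llms]'` or `pip install 'langchain[all]'`; the `-e '.[all]'` line in the requirements file above relies on the same mechanism.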

View File

@@ -5,5 +5,5 @@ from langchain.chains.serpapi import SerpAPIChain
def test_call() -> None:
"""Test that call gives the correct answer."""
chain = SerpAPIChain()
output = chain.search("What was Obama's first name?")
output = chain.run("What was Obama's first name?")
assert output == "Barack Hussein Obama II"

View File

@@ -25,6 +25,6 @@ def test_sql_database_run() -> None:
conn.execute(stmt)
db = SQLDatabase(engine)
db_chain = SQLDatabaseChain(llm=OpenAI(temperature=0), database=db)
output = db_chain.query("What company does Harrison work at?")
output = db_chain.run("What company does Harrison work at?")
expected_output = " Harrison works at Foo."
assert output == expected_output

View File

@@ -6,9 +6,7 @@ def test_manifest_wrapper() -> None:
"""Test manifest wrapper."""
from manifest import Manifest
manifest = Manifest(
client_name="openai",
)
manifest = Manifest(client_name="openai")
llm = ManifestWrapper(client=manifest, llm_kwargs={"temperature": 0})
output = llm("The capital of New York is:")
assert output == "Albany"

View File

@@ -0,0 +1,23 @@
"""Test text splitters that require an integration."""
import pytest
from langchain.text_splitter import CharacterTextSplitter
def test_huggingface_type_check() -> None:
"""Test that type checks are done properly on input."""
with pytest.raises(ValueError):
CharacterTextSplitter.from_huggingface_tokenizer("foo")
def test_huggingface_tokenizer() -> None:
"""Test text splitter that uses a HuggingFace tokenizer."""
from transformers import GPT2TokenizerFast
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
text_splitter = CharacterTextSplitter.from_huggingface_tokenizer(
tokenizer, separator=" ", chunk_size=1, chunk_overlap=0
)
output = text_splitter.split_text("foo bar")
assert output == ["foo", "bar"]

View File

@@ -22,7 +22,7 @@ class FakeChain(Chain, BaseModel):
"""Output key of bar."""
return ["bar"]
def _run(self, inputs: Dict[str, str]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
if self.be_correct:
return {"bar": "baz"}
else:

View File

@@ -3,18 +3,18 @@
import pytest
from langchain.chains.llm_math.base import LLMMathChain
from langchain.chains.llm_math.prompt import PROMPT
from langchain.chains.llm_math.prompt import _PROMPT_TEMPLATE
from tests.unit_tests.llms.fake_llm import FakeLLM
@pytest.fixture
def fake_llm_math_chain() -> LLMMathChain:
"""Fake LLM Math chain for testing."""
complex_question = PROMPT.format(question="What is the square root of 2?")
complex_question = _PROMPT_TEMPLATE.format(question="What is the square root of 2?")
queries = {
PROMPT.format(question="What is 1 plus 1?"): "Answer: 2",
_PROMPT_TEMPLATE.format(question="What is 1 plus 1?"): "Answer: 2",
complex_question: "```python\nprint(2**.5)\n```",
PROMPT.format(question="foo"): "foo",
_PROMPT_TEMPLATE.format(question="foo"): "foo",
}
fake_llm = FakeLLM(queries=queries)
return LLMMathChain(llm=fake_llm, input_key="q", output_key="a")

View File

@@ -26,7 +26,7 @@ def test_proper_inputs() -> None:
nat_bot_chain = NatBotChain(llm=FakeLLM(), objective="testing")
url = "foo" * 10000
browser_content = "foo" * 10000
output = nat_bot_chain.run(url, browser_content)
output = nat_bot_chain.execute(url, browser_content)
assert output == "bar"
@@ -39,5 +39,5 @@ def test_variable_key_naming() -> None:
input_browser_content_key="b",
output_key="c",
)
output = nat_bot_chain.run("foo", "foo")
output = nat_bot_chain.execute("foo", "foo")
assert output == "bar"

View File

@@ -0,0 +1,103 @@
"""Test pipeline functionality."""
from typing import Dict, List
import pytest
from pydantic import BaseModel
from langchain.chains.base import Chain
from langchain.chains.pipeline import Pipeline
class FakeChain(Chain, BaseModel):
"""Fake Chain for testing purposes."""
input_variables: List[str]
output_variables: List[str]
@property
def input_keys(self) -> List[str]:
"""Input keys this chain returns."""
return self.input_variables
@property
def output_keys(self) -> List[str]:
"""Input keys this chain returns."""
return self.output_variables
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
outputs = {}
for var in self.output_variables:
variables = [inputs[k] for k in self.input_variables]
outputs[var] = " ".join(variables) + "foo"
return outputs
def test_pipeline_usage_single_inputs() -> None:
"""Test pipeline on single input chains."""
chain_1 = FakeChain(input_variables=["foo"], output_variables=["bar"])
chain_2 = FakeChain(input_variables=["bar"], output_variables=["baz"])
pipeline = Pipeline(chains=[chain_1, chain_2], input_variables=["foo"])
output = pipeline({"foo": "123"})
expected_output = {"bar": "123foo", "baz": "123foofoo", "foo": "123"}
assert output == expected_output
def test_pipeline_usage_multiple_inputs() -> None:
"""Test pipeline on multiple input chains."""
chain_1 = FakeChain(input_variables=["foo", "test"], output_variables=["bar"])
chain_2 = FakeChain(input_variables=["bar", "foo"], output_variables=["baz"])
pipeline = Pipeline(chains=[chain_1, chain_2], input_variables=["foo", "test"])
output = pipeline({"foo": "123", "test": "456"})
expected_output = {
"bar": "123 456foo",
"baz": "123 456foo 123foo",
"foo": "123",
"test": "456",
}
assert output == expected_output
def test_pipeline_usage_multiple_outputs() -> None:
"""Test pipeline usage on multiple output chains."""
chain_1 = FakeChain(input_variables=["foo"], output_variables=["bar", "test"])
chain_2 = FakeChain(input_variables=["bar", "foo"], output_variables=["baz"])
pipeline = Pipeline(chains=[chain_1, chain_2], input_variables=["foo"])
output = pipeline({"foo": "123"})
expected_output = {
"bar": "123foo",
"baz": "123foo 123foo",
"foo": "123",
"test": "123foo",
}
assert output == expected_output
def test_pipeline_missing_inputs() -> None:
"""Test error is raised when input variables are missing."""
chain_1 = FakeChain(input_variables=["foo"], output_variables=["bar"])
chain_2 = FakeChain(input_variables=["bar", "test"], output_variables=["baz"])
with pytest.raises(ValueError):
# Also needs "test" as an input
Pipeline(chains=[chain_1, chain_2], input_variables=["foo"])
def test_pipeline_bad_outputs() -> None:
"""Test error is raised when bad outputs are specified."""
chain_1 = FakeChain(input_variables=["foo"], output_variables=["bar"])
chain_2 = FakeChain(input_variables=["bar"], output_variables=["baz"])
with pytest.raises(ValueError):
# "test" is not present as an output variable.
Pipeline(
chains=[chain_1, chain_2],
input_variables=["foo"],
output_variables=["test"],
)
def test_pipeline_overlapping_inputs() -> None:
"""Test error is raised when input variables are overlapping."""
chain_1 = FakeChain(input_variables=["foo"], output_variables=["bar", "test"])
chain_2 = FakeChain(input_variables=["bar"], output_variables=["baz"])
with pytest.raises(ValueError):
# "test" is specified as an input, but also is an output of one step
Pipeline(chains=[chain_1, chain_2], input_variables=["foo", "test"])

Some files were not shown because too many files have changed in this diff.