
# 🦜🔗 LangChain

Building applications with LLMs through composability


## Quick Install

```bash
pip install langchain
```
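The examples below call the OpenAI API, and the self-ask-with-search example also calls SerpAPI. Both wrappers read their keys from environment variables; a minimal setup sketch, assuming the standard `OPENAI_API_KEY` and `SERPAPI_API_KEY` variable names:

```python
import os

# The OpenAI and SerpAPI wrappers read their credentials from the
# environment, so set these before running the examples below.
os.environ["OPENAI_API_KEY"] = "sk-..."   # your OpenAI API key
os.environ["SERPAPI_API_KEY"] = "..."     # only needed for self-ask-with-search
```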

## 🤔 What is this?

Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using LLMs in isolation is often not enough to create a truly powerful app; the real power comes when you can combine them with other sources of computation or knowledge.

This library assists in the development of those kinds of applications. It aims to provide:

1. a comprehensive collection of the pieces you would ever want to combine
2. a flexible interface for combining pieces into a single comprehensive "chain"
3. a schema for easily saving and sharing those chains

## 📖 Documentation

Please see here for full documentation on:

- Getting started (installation, setting up the environment, simple examples)
- How-To examples (demos, integrations, helper functions)
- Reference (full API docs)
- Resources (high-level explanation of core concepts)

## 🚀 What can I do with this?

This project was largely inspired by a few projects seen on Twitter, for which we thought it would make sense to have more explicit tooling. Much of the initial functionality was built in an attempt to recreate them:

### Self-ask-with-search

To recreate this paper, use the following code snippet or check out the example notebook.

```python
from langchain import SelfAskWithSearchChain, OpenAI, SerpAPIChain

# temperature=0 keeps the model's output focused and reproducible.
llm = OpenAI(temperature=0)
search = SerpAPIChain()

# Chain that decomposes the question into sub-questions and answers
# them with web searches before composing a final answer.
self_ask_with_search = SelfAskWithSearchChain(llm=llm, search_chain=search)

self_ask_with_search.run("What is the hometown of the reigning men's U.S. Open champion?")
```

### LLM Math

To recreate this example, use the following code snippet or check out the example notebook.

```python
from langchain import OpenAI, LLMMathChain

llm = OpenAI(temperature=0)
# Chain that uses the LLM to translate a math question into code,
# executes it, and returns the numeric answer.
llm_math = LLMMathChain(llm=llm)

llm_math.run("How many of the integers between 0 and 99 inclusive are divisible by 8?")
```

### Generic Prompting

You can also use this for simple prompting pipelines, as in the example below and this example notebook.

```python
from langchain import Prompt, OpenAI, LLMChain

# A prompt template with a single input variable, `question`.
template = """Question: {question}

Answer: Let's think step by step."""
prompt = Prompt(template=template, input_variables=["question"])

llm = OpenAI(temperature=0)
llm_chain = LLMChain(prompt=prompt, llm=llm)

question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"

llm_chain.predict(question=question)
```

### Embed & Search Documents

We support two vector databases for storing and searching embeddings: FAISS and Elasticsearch. Here's a code snippet showing how to use FAISS to store embeddings and search for text similar to a query. Both database backends are featured in this example notebook.

```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.faiss import FAISS
from langchain.text_splitter import CharacterTextSplitter

# Split the source document into ~1000-character chunks.
with open('state_of_the_union.txt') as f:
    state_of_the_union = f.read()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_text(state_of_the_union)

# Embed each chunk with OpenAI embeddings and index them in FAISS.
embeddings = OpenAIEmbeddings()
docsearch = FAISS.from_texts(texts, embeddings)

# Retrieve the chunks most similar to the query.
query = "What did the president say about Ketanji Brown Jackson"
docs = docsearch.similarity_search(query)
```

## 🤖 Developer Guide

To begin developing on this project, first clone the repo locally. To install requirements, run `pip install -r requirements.txt`. This will install all requirements for running the package, examples, linting, formatting, and tests.
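For example, a typical setup might look like this (the clone URL assumes the main GitHub repository; substitute your fork as appropriate):

```bash
git clone https://github.com/hwchase17/langchain.git
cd langchain
pip install -r requirements.txt
```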

Formatting for this project is a combination of Black and isort. To format the codebase, run `make format`.

Linting for this project is a combination of Black, isort, flake8, and mypy. To run linting, run `make lint`. We recognize linting can be annoying; if you do not want to do it, please contact a project maintainer and they can help you with it. We do not want this to be a blocker for good code getting contributed.

Unit tests cover modular logic that does not require calls to outside APIs. To run unit tests, run `make tests`. If you add new logic, please add a unit test.

Integration tests cover logic that requires making calls to outside APIs (often integrations with other services). To run integration tests, run `make integration_tests`. If you add support for a new external API, please add a new integration test.
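In short, the development loop uses the Makefile targets described above:

```bash
make format             # Black + isort
make lint               # Black, isort, flake8, mypy
make tests              # unit tests (no external API calls)
make integration_tests  # integration tests (require external API credentials)
```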

If you are adding a Jupyter notebook example, you can run `pip install -e .` to build the `langchain` package from your local changes, so your new logic can be imported into the notebook.

Docs are largely auto-generated by Sphinx from the code. For that reason, we ask that you add good documentation to all classes and methods. Similar to linting, we recognize documentation can be annoying; if you do not want to do it, please contact a project maintainer and they can help you with it. We do not want this to be a blocker for good code getting contributed.
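As a minimal sketch of what that looks like (the class and docstring here are illustrative, not from the codebase; match the conventions of the surrounding code):

```python
class ExampleChain:
    """One-line summary of what the chain does.

    A longer description of the chain's behavior and its inputs and
    outputs, which Sphinx picks up when generating the API docs.
    """

    def run(self, text: str) -> str:
        """Run the chain on `text` and return the model's output."""
        ...
```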