# Description

This pull request addresses two problems: the `OutputType` of several output parsers is ambiguous or error-prone, and some parsers have no unit tests. These problems can lead to runtime errors or unexpected behavior caused by type mismatches, which is confusing for developers and users. Clarifying the output types improves stability and reliability.

Therefore, this pull request

- fixes the `OutputType` of OutputParsers to be the expected type;
  - e.g. the `OutputType` property of `EnumOutputParser` raises `TypeError`. This PR introduces logic to extract `OutputType` from its attribute.
- and replaces the legacy API in OutputParsers, such as `LLMChain.run`, with the modern API, such as `LLMChain.invoke`;
  - Note: For `OutputFixingParser`, `RetryOutputParser` and `RetryWithErrorOutputParser`, this PR introduces a `legacy` attribute with `False` as the default value in order to keep backward compatibility.
- and adds tests for `OutputFixingParser` and `RetryOutputParser`.

The following table shows my expected output and the actual output of the `OutputType` of OutputParsers. I have used this table to fix the `OutputType` of OutputParsers.

| Class Name of OutputParser | My Expected `OutputType` (after this PR) | Actual `OutputType` [evidence](#evidence) (before this PR) | Fix Required |
|---------|--------------|---------|--------|
| BooleanOutputParser | `<class 'bool'>` | `<class 'bool'>` | NO |
| CombiningOutputParser | `typing.Dict[str, Any]` | `TypeError` is raised | YES |
| DatetimeOutputParser | `<class 'datetime.datetime'>` | `<class 'datetime.datetime'>` | NO |
| EnumOutputParser(enum=MyEnum) | `MyEnum` | `TypeError` is raised | YES |
| OutputFixingParser | The same type as `self.parser.OutputType` | `~T` | YES |
| CommaSeparatedListOutputParser | `typing.List[str]` | `typing.List[str]` | NO |
| MarkdownListOutputParser | `typing.List[str]` | `typing.List[str]` | NO |
| NumberedListOutputParser | `typing.List[str]` | `typing.List[str]` | NO |
| JsonOutputKeyToolsParser | `typing.Any` | `typing.Any` | NO |
| JsonOutputToolsParser | `typing.Any` | `typing.Any` | NO |
| PydanticToolsParser | `typing.Any` | `typing.Any` | NO |
| PandasDataFrameOutputParser | `typing.Dict[str, Any]` | `TypeError` is raised | YES |
| PydanticOutputParser(pydantic_object=MyModel) | `<class '__main__.MyModel'>` | `<class '__main__.MyModel'>` | NO |
| RegexParser | `typing.Dict[str, str]` | `TypeError` is raised | YES |
| RegexDictParser | `typing.Dict[str, str]` | `TypeError` is raised | YES |
| RetryOutputParser | The same type as `self.parser.OutputType` | `~T` | YES |
| RetryWithErrorOutputParser | The same type as `self.parser.OutputType` | `~T` | YES |
| StructuredOutputParser | `typing.Dict[str, Any]` | `TypeError` is raised | YES |
| YamlOutputParser(pydantic_object=MyModel) | `MyModel` | `~T` | YES |

NOTE: In "Fix Required", "YES" means that it is required to fix in this PR while "NO" means that it is not required.

# Issue

No issues for this PR.

# Twitter handle

- [hmdev3](https://twitter.com/hmdev3)

# Questions:

1. Is it required to create tests for legacy APIs `LLMChain.run` in the following scripts?
   - libs/langchain/tests/unit_tests/output_parsers/test_fix.py;
   - libs/langchain/tests/unit_tests/output_parsers/test_retry.py.
2. Is there a more appropriate expected output type than I expect in the above table?
   - e.g. the `OutputType` of `CombiningOutputParser` should be SOMETHING...

# Actual outputs (before this PR)

<div id='evidence'></div>

<details><summary>Actual outputs</summary>

## Requirements

- Python==3.9.13
- langchain==0.1.13

```python
Python 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import langchain
>>> langchain.__version__
'0.1.13'
>>> from langchain import output_parsers
```

### `BooleanOutputParser`

```python
>>> output_parsers.BooleanOutputParser().OutputType
<class 'bool'>
```

### `CombiningOutputParser`

```python
>>> output_parsers.CombiningOutputParser(parsers=[output_parsers.DatetimeOutputParser(), output_parsers.CommaSeparatedListOutputParser()]).OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable CombiningOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `DatetimeOutputParser`

```python
>>> output_parsers.DatetimeOutputParser().OutputType
<class 'datetime.datetime'>
```

### `EnumOutputParser`

```python
>>> from enum import Enum
>>> class MyEnum(Enum):
...     a = 'a'
...     b = 'b'
...
>>> output_parsers.EnumOutputParser(enum=MyEnum).OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable EnumOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `OutputFixingParser`

```python
>>> output_parsers.OutputFixingParser(parser=output_parsers.DatetimeOutputParser()).OutputType
~T
```

### `CommaSeparatedListOutputParser`

```python
>>> output_parsers.CommaSeparatedListOutputParser().OutputType
typing.List[str]
```

### `MarkdownListOutputParser`

```python
>>> output_parsers.MarkdownListOutputParser().OutputType
typing.List[str]
```

### `NumberedListOutputParser`

```python
>>> output_parsers.NumberedListOutputParser().OutputType
typing.List[str]
```

### `JsonOutputKeyToolsParser`

```python
>>> output_parsers.JsonOutputKeyToolsParser(key_name='tool').OutputType
typing.Any
```

### `JsonOutputToolsParser`

```python
>>> output_parsers.JsonOutputToolsParser().OutputType
typing.Any
```

### `PydanticToolsParser`

```python
>>> from langchain.pydantic_v1 import BaseModel
>>> class MyModel(BaseModel):
...     a: int
...
>>> output_parsers.PydanticToolsParser(tools=[MyModel, MyModel]).OutputType
typing.Any
```

### `PandasDataFrameOutputParser`

```python
>>> output_parsers.PandasDataFrameOutputParser().OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable PandasDataFrameOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `PydanticOutputParser`

```python
>>> output_parsers.PydanticOutputParser(pydantic_object=MyModel).OutputType
<class '__main__.MyModel'>
```

### `RegexParser`

```python
>>> output_parsers.RegexParser(regex='$', output_keys=['a']).OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable RegexParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `RegexDictParser`

```python
>>> output_parsers.RegexDictParser(output_key_to_format={'a':'a'}).OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable RegexDictParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `RetryOutputParser`

```python
>>> output_parsers.RetryOutputParser(parser=output_parsers.DatetimeOutputParser()).OutputType
~T
```

### `RetryWithErrorOutputParser`

```python
>>> output_parsers.RetryWithErrorOutputParser(parser=output_parsers.DatetimeOutputParser()).OutputType
~T
```

### `StructuredOutputParser`

```python
>>> from langchain.output_parsers.structured import ResponseSchema
>>> response_schemas = [ResponseSchema(name="foo",description="a list of strings",type="List[string]"),ResponseSchema(name="bar",description="a string",type="string"), ]
>>> output_parsers.StructuredOutputParser.from_response_schemas(response_schemas).OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable StructuredOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `YamlOutputParser`

```python
>>> output_parsers.YamlOutputParser(pydantic_object=MyModel).OutputType
~T
```

<div>

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
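As a rough illustration of the `OutputType` fix described in this PR, the sketch below shows how a parser could report its type from an attribute, and how a wrapping parser could report the same type as `self.parser.OutputType`. It uses only the standard library. `BaseParser`, `EnumParser`, `FixingParser` and `Color` are hypothetical stand-ins, not LangChain classes, and the real parsers do more (for example, retrying with an LLM).

```python
from enum import Enum
from typing import Any, Generic, Type, TypeVar

T = TypeVar("T")


class BaseParser(Generic[T]):
    """Hypothetical stand-in for a generic output parser base class."""

    @property
    def OutputType(self) -> Any:
        # Without an override, only the unbound type variable is known.
        return T

    def parse(self, text: str) -> Any:
        raise NotImplementedError


class EnumParser(BaseParser[Enum]):
    """Hypothetical parser whose output type lives in an attribute."""

    def __init__(self, enum: Type[Enum]) -> None:
        self.enum = enum

    @property
    def OutputType(self) -> Any:
        # Report the concrete enum class stored on the instance.
        return self.enum

    def parse(self, text: str) -> Any:
        return self.enum(text.strip())


class FixingParser(BaseParser[T]):
    """Hypothetical wrapper that only delegates to the inner parser."""

    def __init__(self, parser: BaseParser[T]) -> None:
        self.parser = parser

    @property
    def OutputType(self) -> Any:
        # Same type as the wrapped parser, instead of the bare type variable.
        return self.parser.OutputType

    def parse(self, text: str) -> Any:
        return self.parser.parse(text)


class Color(Enum):
    RED = "red"
    BLUE = "blue"


enum_parser = EnumParser(enum=Color)
wrapped = FixingParser(parser=enum_parser)
print(enum_parser.OutputType, wrapped.OutputType, wrapped.parse(" red "))
```

In this sketch the base class mirrors the `~T` result shown in the evidence above, while the two subclasses show the overriding approach the table asks for.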
🦜️🔗 LangChain
⚡ Build context-aware reasoning applications ⚡
Looking for the JS/TS library? Check out LangChain.js.
To help you ship LangChain apps to production faster, check out LangSmith. LangSmith is a unified developer platform for building, testing, and monitoring LLM applications. Fill out this form to speak with our sales team.
Quick Install
With pip:
pip install langchain
With conda:
conda install langchain -c conda-forge
🤔 What is LangChain?
LangChain is a framework for developing applications powered by large language models (LLMs).
For these applications, LangChain simplifies the entire application lifecycle:
- Open-source libraries: Build your applications using LangChain's modular building blocks and components. Integrate with hundreds of third-party providers.
- Productionization: Inspect, monitor, and evaluate your apps with LangSmith so that you can constantly optimize and deploy with confidence.
- Deployment: Turn any chain into a REST API with LangServe.
Open-source libraries
- `langchain-core`: Base abstractions and LangChain Expression Language.
- `langchain-community`: Third party integrations.
  - Some integrations have been further split into partner packages that only rely on `langchain-core`. Examples include `langchain_openai` and `langchain_anthropic`.
- `langchain`: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.
- `LangGraph`: A library for building robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph.
Productionization:
- LangSmith: A developer platform that lets you debug, test, evaluate, and monitor chains built on any LLM framework and seamlessly integrates with LangChain.
Deployment:
- LangServe: A library for deploying LangChain chains as REST APIs.
🧱 What can you build with LangChain?
❓ Question answering with RAG
- Documentation
- End-to-end Example: Chat LangChain and repo
🧱 Extracting structured output
- Documentation
- End-to-end Example: SQL Llama2 Template
🤖 Chatbots
- Documentation
- End-to-end Example: Web LangChain (web researcher chatbot) and repo
And much more! Head to the Tutorials section of the docs for more.
🚀 How does LangChain help?
The main value props of the LangChain libraries are:
- Components: composable building blocks, tools and integrations for working with language models. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not.
- Off-the-shelf chains: built-in assemblages of components for accomplishing higher-level tasks
Off-the-shelf chains make it easy to get started. Components make it easy to customize existing chains and build new ones.
LangChain Expression Language (LCEL)
LCEL is the foundation of many of LangChain's components, and is a declarative way to compose chains. LCEL was designed from day 1 to support putting prototypes in production, with no code changes, from the simplest “prompt + LLM” chain to the most complex chains.
- Overview: LCEL and its benefits
- Interface: The standard Runnable interface for LCEL objects
- Primitives: More on the primitives LCEL includes
- Cheatsheet: Quick overview of the most common usage patterns
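To make the idea of declarative composition concrete, here is a minimal standard-library sketch of a "prompt + LLM + parser" chain built with the `|` operator. `Step`, `prompt`, `fake_llm` and `parser` are hypothetical names for illustration only. This is not LangChain's Runnable implementation, and the "LLM" is a plain function that upper-cases its input.

```python
from typing import Any, Callable


class Step:
    """Hypothetical composable unit; a toy, not LangChain's Runnable."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Three toy stages standing in for a prompt template, a model, and a parser.
prompt = Step(lambda topic: f"Tell me a fact about {topic}.")
fake_llm = Step(lambda text: text.upper())
parser = Step(lambda text: text.rstrip("."))

# The chain is declared once and then invoked with an input.
chain = prompt | fake_llm | parser
result = chain.invoke("parrots")
print(result)
```

The point of the sketch is the shape of the code: the pipeline is described as a single expression, and each stage can be swapped without touching the others.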
Components
Components fall into the following modules:
📃 Model I/O
This includes prompt management, prompt optimization, a generic interface for chat models and LLMs, and common utilities for working with model outputs.
📚 Retrieval
Retrieval Augmented Generation involves loading data from a variety of sources, preparing it, then searching over (a.k.a. retrieving from) it for use in the generation step.
🤖 Agents
Agents allow an LLM autonomy over how a task is accomplished. Agents make decisions about which Actions to take, then take that Action, observe the result, and repeat until the task is complete. LangChain provides a standard interface for agents along with the LangGraph extension for building custom agents.
📖 Documentation
Please see here for full documentation, which includes:
- Introduction: Overview of the framework and the structure of the docs.
- Tutorials: If you're looking to build something specific or are more of a hands-on learner, check out our tutorials. This is the best place to get started.
- How-to guides: Answers to “How do I….?” type questions. These guides are goal-oriented and concrete; they're meant to help you complete a specific task.
- Conceptual guide: Conceptual explanations of the key parts of the framework.
- API Reference: Thorough documentation of every class and method.
🌐 Ecosystem
- 🦜🛠️ LangSmith: Tracing and evaluating your language model applications and intelligent agents to help you move from prototype to production.
- 🦜🕸️ LangGraph: Creating stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain primitives.
- 🦜🏓 LangServe: Deploying LangChain runnables and chains as REST APIs.
- LangChain Templates: Example applications hosted with LangServe.
💁 Contributing
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.
For detailed information on how to contribute, see here.