docs[patch]: Update structured output docs to have more discussion (#23786)

CC @agola11 @ccurme

parent ebb404527f
commit 27aa4d38bf
@@ -776,14 +776,54 @@ a few ways to get structured output from models in LangChain.
 
 #### `.with_structured_output()`
 
-For convenience, some LangChain chat models support a `.with_structured_output()` method.
-This method only requires a schema as input, and returns a dict or Pydantic object.
+For convenience, some LangChain chat models support a [`.with_structured_output()`](/docs/how_to/structured_output/#the-with_structured_output-method)
+method. This method only requires a schema as input, and returns a dict or Pydantic object.
 Generally, this method is only present on models that support one of the more advanced methods described below,
 and will use one of them under the hood. It takes care of importing a suitable output parser and
 formatting the schema in the right format for the model.
 
+Here's an example:
+
+```python
+from typing import Optional
+
+from langchain_core.pydantic_v1 import BaseModel, Field
+
+
+class Joke(BaseModel):
+    """Joke to tell user."""
+
+    setup: str = Field(description="The setup of the joke")
+    punchline: str = Field(description="The punchline to the joke")
+    rating: Optional[int] = Field(description="How funny the joke is, from 1 to 10")
+
+
+structured_llm = llm.with_structured_output(Joke)
+
+structured_llm.invoke("Tell me a joke about cats")
+```
+
+```
+Joke(setup='Why was the cat sitting on the computer?', punchline='To keep an eye on the mouse!', rating=None)
+```
+
+We recommend this method as a starting point when working with structured output:
+
+- It uses other model-specific features under the hood, without the need to import an output parser.
+- For models that use tool calling, no special prompting is needed.
+- If multiple underlying techniques are supported, you can supply a `method` parameter to
+[toggle which one is used](/docs/how_to/structured_output/#advanced-specifying-the-method-for-structuring-outputs).
+
+You may want or need to use other techniques if:
+
+- The chat model you are using does not support tool calling.
+- You are working with very complex schemas and the model is having trouble generating outputs that conform.
+
 For more information, check out this [how-to guide](/docs/how_to/structured_output/#the-with_structured_output-method).
 
+You can also check out [this table](/docs/integrations/chat/#advanced-features) for a list of models that support
+`with_structured_output()`.
+
 #### Raw prompting
 
 The most intuitive way to get a model to structure output is to ask nicely.
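To make the new `method` parameter concrete, here is a minimal sketch, not part of the commit. It assumes `langchain-openai` is installed, `OPENAI_API_KEY` is set, and that the provider accepts `method="json_mode"`; the model name is illustrative.

```python
from typing import Optional

from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import ChatOpenAI


class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(description="How funny the joke is, from 1 to 10")


llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # illustrative model choice

# Default: for OpenAI models this routes through tool/function calling.
structured_llm = llm.with_structured_output(Joke)
structured_llm.invoke("Tell me a joke about cats")

# Toggle to JSON mode instead. With this method the desired keys must also
# be described in the prompt, since the schema is not sent as a tool.
json_mode_llm = llm.with_structured_output(Joke, method="json_mode")
json_mode_llm.invoke(
    "Tell me a joke about cats. Respond in JSON with `setup`, `punchline`, and `rating` keys."
)
```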
@@ -806,9 +846,8 @@ for smooth parsing can be surprisingly difficult and model-specific.
 Some may be better at interpreting [JSON schema](https://json-schema.org/), others may be best with TypeScript definitions,
 and still others may prefer XML.
 
-While we'll next go over some ways that you can take advantage of features offered by
-model providers to increase reliability, prompting techniques remain important for tuning your
-results no matter what method you choose.
+While features offered by model providers may increase reliability, prompting techniques remain important for tuning your
+results no matter which method you choose.
 
 #### JSON mode
 <span data-heading-keywords="json mode"></span>
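The raw-prompting approach described above pairs naturally with an output parser. Below is a hedged sketch, not from the commit, using `PydanticOutputParser` to inject format instructions into the prompt and parse the reply; the model choice is an assumption.

```python
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import ChatOpenAI  # any chat model would do here


class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")


llm = ChatOpenAI(model="gpt-4o-mini")  # illustrative model choice
parser = PydanticOutputParser(pydantic_object=Joke)

# The format instructions spell out the JSON schema in plain text, so this
# works even on models with no tool-calling or JSON-mode support.
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer the user query. Wrap the output in `json` tags\n{format_instructions}",
        ),
        ("human", "{query}"),
    ]
).partial(format_instructions=parser.get_format_instructions())

chain = prompt | llm | parser
chain.invoke({"query": "Tell me a joke about cats."})
```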
@@ -818,10 +857,11 @@ Some models, such as [Mistral](/docs/integrations/chat/mistralai/), [OpenAI](/do
 support a feature called **JSON mode**, usually enabled via config.
 
 When enabled, JSON mode will constrain the model's output to always be some sort of valid JSON.
-Often models require some custom prompting, but it's usually much less burdensome and along the lines of,
-`"you must always return JSON"`, and the [output is easier to parse](/docs/how_to/output_parser_json/).
+Often models require some custom prompting, but it's usually much less burdensome than completely raw prompting and
+more along the lines of, `"you must always return JSON"`. The [output is also generally easier to parse](/docs/how_to/output_parser_json/).
 
-It's also generally simpler and more commonly available than tool calling.
+It's also generally simpler to use directly and more commonly available than tool calling, and can give
+more flexibility around prompting and shaping results.
 
 Here's an example:
 
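The example this hunk refers to falls outside the diff. For context, here is a minimal JSON mode sketch, assuming an OpenAI model; other providers enable JSON mode through their own configuration options.

```python
from langchain_core.output_parsers import JsonOutputParser
from langchain_openai import ChatOpenAI

# OpenAI exposes JSON mode via response_format; other providers differ.
json_llm = ChatOpenAI(model="gpt-4o-mini").bind(
    response_format={"type": "json_object"}
)

# JSON mode still needs the word "JSON" and the desired keys in the prompt.
chain = json_llm | JsonOutputParser()
chain.invoke("Return a joke about cats as JSON with `setup` and `punchline` keys.")
```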
@@ -58,7 +58,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 2,
    "id": "6d55008f",
    "metadata": {},
    "outputs": [],
@@ -81,17 +81,17 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 38,
+   "execution_count": 3,
    "id": "070bf702",
    "metadata": {},
    "outputs": [
     {
      "data": {
       "text/plain": [
-       "Joke(setup='Why was the cat sitting on the computer?', punchline='To keep an eye on the mouse!', rating=None)"
+       "Joke(setup='Why was the cat sitting on the computer?', punchline='Because it wanted to keep an eye on the mouse!', rating=8)"
       ]
      },
-     "execution_count": 38,
+     "execution_count": 3,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -514,12 +514,49 @@
     ")"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "91e95aa2",
+   "metadata": {},
+   "source": [
+    "### (Advanced) Raw outputs\n",
+    "\n",
+    "LLMs aren't perfect at generating structured output, especially as schemas become complex. You can avoid raising exceptions and handle the raw output yourself by passing `include_raw=True`. This changes the output format to contain the raw message output, the `parsed` value (if successful), and any resulting errors:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "10ed2842",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'raw': AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_ASK4EmZeZ69Fi3p554Mb4rWy', 'function': {'arguments': '{\"setup\":\"Why was the cat sitting on the computer?\",\"punchline\":\"Because it wanted to keep an eye on the mouse!\"}', 'name': 'Joke'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 36, 'prompt_tokens': 107, 'total_tokens': 143}, 'model_name': 'gpt-4-0125-preview', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-6491d35b-9164-4656-b75c-d7882cfb76cb-0', tool_calls=[{'name': 'Joke', 'args': {'setup': 'Why was the cat sitting on the computer?', 'punchline': 'Because it wanted to keep an eye on the mouse!'}, 'id': 'call_ASK4EmZeZ69Fi3p554Mb4rWy'}], usage_metadata={'input_tokens': 107, 'output_tokens': 36, 'total_tokens': 143}),\n",
+       " 'parsed': Joke(setup='Why was the cat sitting on the computer?', punchline='Because it wanted to keep an eye on the mouse!', rating=None),\n",
+       " 'parsing_error': None}"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "structured_llm = llm.with_structured_output(Joke, include_raw=True)\n",
+    "\n",
+    "structured_llm.invoke(\n",
+    "    \"Tell me a joke about cats, respond in JSON with `setup` and `punchline` keys\"\n",
+    ")"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "5e92a98a",
    "metadata": {},
    "source": [
-    "## Prompting and parsing model directly\n",
+    "## Prompting and parsing model outputs directly\n",
     "\n",
     "Not all models support `.with_structured_output()`, since not all models have tool calling or JSON mode support. For such models you'll need to directly prompt the model to use a specific format, and use an output parser to extract the structured response from the raw model output.\n",
     "\n",
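The new cell's `include_raw=True` output shape (a dict with `raw`, `parsed`, and `parsing_error` keys) lends itself to explicit error handling. A short sketch, reusing the `Joke` schema and an `llm` as in the sketches above:

```python
# Reuses `llm` and `Joke` from the earlier sketches (assumptions, not from
# the commit). With include_raw=True, parse failures populate
# `parsing_error` instead of raising, so we can branch on it.
structured_llm = llm.with_structured_output(Joke, include_raw=True)

result = structured_llm.invoke("Tell me a joke about cats")

if result["parsing_error"] is None:
    joke = result["parsed"]  # a Joke instance on success
else:
    # Fall back to the raw AIMessage, e.g. to retry or log the failure.
    print("Parse failed:", result["parsing_error"])
    print("Raw message:", result["raw"])
```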
@@ -787,9 +824,9 @@
   ],
  "metadata": {
   "kernelspec": {
-   "display_name": "poetry-venv-2",
+   "display_name": "Python 3",
    "language": "python",
-   "name": "poetry-venv-2"
+   "name": "python3"
   },
   "language_info": {
    "codemirror_mode": {
@@ -801,7 +838,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.9.1"
+   "version": "3.10.5"
   }
  },
 "nbformat": 4,