docs: revamp ChatOpenAI (#22253)

Can build API ref docs by running
```bash
make api_docs_clean; make api_docs_quick_preview API_PKG=openai
```
only builds openai ref, takes ~20 sec
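
For reference, the updated `api_docs_quick_preview` target opens an HTML file whose name is derived from `API_PKG` by replacing hyphens with underscores (the `sed 's/-/_/g'` step in the Makefile hunk). A minimal Python sketch of that mapping, assuming the build directory shown in the diff; the helper name `preview_html_path` is hypothetical and is not part of the repository:

```python
from pathlib import PurePosixPath


def preview_html_path(api_pkg: str = "text-splitters") -> str:
    """Mirror the Makefile's `sed 's/-/_/g'` step: hyphens become underscores.

    The default value matches the Makefile's `API_PKG ?= text-splitters`.
    """
    module_name = api_pkg.replace("-", "_")
    # Build directory taken from the `open ...` line in the Makefile hunk.
    build_dir = PurePosixPath("docs/api_reference/_build/html")
    return str(build_dir / f"{module_name}_api_reference.html")


if __name__ == "__main__":
    print(preview_html_path("openai"))
    print(preview_html_path())
```

So `API_PKG=openai` should open the openai reference page, while omitting `API_PKG` falls back to the text-splitters page, as before this change.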
Bagatur 2024-05-29 10:20:14 -07:00 committed by GitHub
parent 00c70d98c2
commit 6dd0f095c3
4 changed files with 436 additions and 41 deletions

@ -32,10 +32,13 @@ api_docs_build:
poetry run python docs/api_reference/create_api_rst.py
cd docs/api_reference && poetry run make html
API_PKG ?= text-splitters
api_docs_quick_preview:
poetry run python docs/api_reference/create_api_rst.py text-splitters
poetry run pip install "pydantic<2"
poetry run python docs/api_reference/create_api_rst.py $(API_PKG)
cd docs/api_reference && poetry run make html
open docs/api_reference/_build/html/text_splitters_api_reference.html
open docs/api_reference/_build/html/$(shell echo $(API_PKG) | sed 's/-/_/g')_api_reference.html
## api_docs_clean: Clean the API Reference documentation build artifacts.
api_docs_clean:

@ -220,7 +220,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.9.1"
}
},
"nbformat": 4,

@ -12,38 +12,130 @@
},
{
"cell_type": "markdown",
"id": "e49f1e0d",
"id": "cb4dd00a-8893-4a45-96f7-9a9fc341cd61",
"metadata": {},
"source": [
"# ChatOpenAI\n",
"\n",
"This notebook covers how to get started with OpenAI chat models."
"This notebook provides a quick overview for getting started with OpenAI [chat models](/docs/concepts/#chat-models). For detailed documentation of all ChatOpenAI features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html).\n",
"\n",
"OpenAI has several chat models. You can find information about their latest models and their costs, context windows, and supported input types in the [OpenAI docs](https://platform.openai.com/docs/models).\n",
"\n",
":::info Azure OpenAI\n",
"\n",
"Note that certain OpenAI models can also be accessed via the [Microsoft Azure platform](https://azure.microsoft.com/en-us/products/ai-services/openai-service). To use the Azure OpenAI service use the [AzureChatOpenAI integration](/docs/integrations/chat/azure_chat_openai/).\n",
"\n",
":::"
]
},
{
"cell_type": "markdown",
"id": "e49f1e0d",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"### Integration details\n",
"| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/v0.2/docs/integrations/chat/openai) | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
"| [ChatOpenAI](https://api.python.langchain.com/en/latest/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html) | [langchain-openai](https://api.python.langchain.com/en/latest/openai_api_reference.html) | ❌ | beta | ✅ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-openai?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-openai?style=flat-square&label=%20) |\n",
"\n",
"### Model features\n",
"| [Tool calling](/docs/how_to/tool_calling/) | [Structured output](/docs/how_to/structured_output/) | JSON mode | Image input | Audio input | Video input | [Native streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
"| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n",
"| ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | \n",
"\n",
"## Setup\n",
"\n",
"To access OpenAI models you'll need to create an OpenAI account, get an API key, and install the `langchain-openai` integration package.\n",
"\n",
"### Credentials\n",
"\n",
"Head to https://platform.openai.com to sign up to OpenAI and generate an API key. Once you've done this set the OPENAI_API_KEY environment variable:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "e817fe2e-4f1d-4533-b19e-2400b1cf6ce8",
"metadata": {},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"Enter your OpenAI API key: ········\n"
]
}
],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"Enter your OpenAI API key: \")"
]
},
{
"cell_type": "markdown",
"id": "c2a3ce99-a44a-4ea6-8d23-8a88e332f0f9",
"metadata": {},
"source": [
"If you want to get automated tracing of your model calls you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "85255d53-ac8a-44e1-aa26-8e567bb77ae7",
"metadata": {},
"outputs": [],
"source": [
"# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")\n",
"# os.environ[\"LANGSMITH_TRACING\"] = \"true\""
]
},
{
"cell_type": "markdown",
"id": "c59722a9-6dbb-45f7-ae59-5be50ca5733d",
"metadata": {},
"source": [
"### Installation\n",
"\n",
"The LangChain OpenAI integration lives in the `langchain-openai` package:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2113471c-75d7-45df-b784-d78da4ef7aba",
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain-openai"
]
},
{
"cell_type": "markdown",
"id": "1098bc9d-ce83-462b-8c19-f85bf3a159dc",
"metadata": {},
"source": [
"## Instantiation\n",
"\n",
"Now we can instantiate our model object and generate chat completions:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "522686de",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain_core.messages import HumanMessage, SystemMessage\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_openai import ChatOpenAI"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "62e0dbc3",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)"
"from langchain_openai import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-4o\", temperature=0)"
]
},
{
@ -51,17 +143,30 @@
"id": "4e5fe97e",
"metadata": {},
"source": [
"The above cell assumes that your OpenAI API key is set in your environment variables. If you would rather manually specify your API key and/or organization ID, use the following code:\n",
"The above cell assumes that your OpenAI API key is set in your environment variables. If you would prefer, you can specify credentials like API key, organization ID, base URL, etc. as init params:\n",
"\n",
"```python\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0, api_key=\"YOUR_API_KEY\", openai_organization=\"YOUR_ORGANIZATION_ID\")\n",
"```\n",
"Remove the openai_organization parameter should it not apply to you."
"llm = ChatOpenAI(\n",
" model=\"gpt-4o\", \n",
" temperature=0, \n",
" api_key=\"YOUR_API_KEY\", \n",
" organization=\"YOUR_ORGANIZATION_ID\", \n",
" base_url=\"YOUR_BASE_URL\"\n",
")\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "6511982a-734a-4193-a47d-254f8dcaff5e",
"metadata": {},
"source": [
"## Invocation"
]
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 6,
"id": "ce16ad78-8e6f-48cd-954e-98be75eb5836",
"metadata": {
"tags": []
@ -70,20 +175,42 @@
{
"data": {
"text/plain": [
"AIMessage(content=\"J'adore programmer.\", response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 34, 'total_tokens': 40}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': 'fp_b28b39ffa8', 'finish_reason': 'stop', 'logprobs': None}, id='run-8591eae1-b42b-402b-a23a-dfdb0cd151bd-0')"
"AIMessage(content=\"J'adore la programmation.\", response_metadata={'token_usage': {'completion_tokens': 5, 'prompt_tokens': 31, 'total_tokens': 36}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_43dfabdef1', 'finish_reason': 'stop', 'logprobs': None}, id='run-012cffe2-5d3d-424d-83b5-51c6d4a593d1-0', usage_metadata={'input_tokens': 31, 'output_tokens': 5, 'total_tokens': 36})"
]
},
"execution_count": 5,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"messages = [\n",
" (\"system\", \"You are a helpful assistant that translates English to French.\"),\n",
" (\"human\", \"Translate this sentence from English to French. I love programming.\"),\n",
" (\n",
" \"system\",\n",
" \"You are a helpful assistant that translates English to French. Translate the user sentence.\",\n",
" ),\n",
" (\"human\", \"I love programming.\"),\n",
"]\n",
"llm.invoke(messages)"
"ai_msg = llm.invoke(messages)\n",
"ai_msg"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "2cd224b8-4499-41fb-a604-d53a7ff17b2e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"J'adore la programmation.\n"
]
}
],
"source": [
"print(ai_msg.content)"
]
},
{
@ -116,6 +243,8 @@
}
],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\n",
@ -277,13 +406,23 @@
"\n",
"fine_tuned_model(messages)"
]
},
{
"cell_type": "markdown",
"id": "a796d728-971b-408b-88d5-440015bbb941",
"metadata": {},
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all ChatOpenAI features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "poetry-venv-2",
"language": "python",
"name": "python3"
"name": "poetry-venv-2"
},
"language_info": {
"codemirror_mode": {

@ -1117,21 +1117,274 @@ class BaseChatOpenAI(BaseChatModel):
class ChatOpenAI(BaseChatOpenAI):
"""`OpenAI` Chat large language models API.
"""OpenAI chat model integration.
To use, you should have the environment variable ``OPENAI_API_KEY``
set with your API key, or pass it as a named parameter to the constructor.
Setup:
Install ``langchain-openai`` and set environment variable ``OPENAI_API_KEY``.
Any parameters that are valid to be passed to the openai.create call can be passed
in, even if not explicitly saved on this class.
.. code-block:: bash
Example:
pip install -U langchain-openai
export OPENAI_API_KEY="your-api-key"
Key init args completion params:
model: str
Name of OpenAI model to use.
temperature: float
Sampling temperature.
max_tokens: Optional[int]
Max number of tokens to generate.
logprobs: Optional[bool]
Whether to return logprobs.
stream_options: Dict
Configure streaming outputs, like whether to return token usage when
streaming (``{"include_usage": True}``).
Key init args client params:
timeout:
Timeout for requests.
max_retries:
Max number of retries.
api_key:
OpenAI API key. If not passed in will be read from env var OPENAI_API_KEY.
base_url:
Base URL for API requests. Only specify if using a proxy or service
emulator.
organization:
OpenAI organization ID. If not passed in will be read from env
var OPENAI_ORG_ID.
See full list of supported init args and their descriptions in the params section.
Instantiate:
.. code-block:: python
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-3.5-turbo")
"""
llm = ChatOpenAI(
model="gpt-4o",
temperature=0,
max_tokens=None,
timeout=None,
max_retries=2,
# api_key="...",
# base_url="...",
# organization="...",
# other params...
)
**NOTE**: Any param which is not explicitly supported will be passed directly to the
``openai.OpenAI.chat.completions.create(...)`` API every time the model is
invoked. For example:
.. code-block:: python
from langchain_openai import ChatOpenAI
import openai
ChatOpenAI(..., frequency_penalty=0.2).invoke(...)
# results in underlying API call of:
openai.OpenAI(..).chat.completions.create(..., frequency_penalty=0.2)
# which is also equivalent to:
ChatOpenAI(...).invoke(..., frequency_penalty=0.2)
Invoke:
.. code-block:: python
messages = [
("system", "You are a helpful translator. Translate the user sentence to French."),
("human", "I love programming."),
]
llm.invoke(messages)
.. code-block:: python
AIMessage(content="J'adore la programmation.", response_metadata={'token_usage': {'completion_tokens': 5, 'prompt_tokens': 31, 'total_tokens': 36}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_43dfabdef1', 'finish_reason': 'stop', 'logprobs': None}, id='run-012cffe2-5d3d-424d-83b5-51c6d4a593d1-0', usage_metadata={'input_tokens': 31, 'output_tokens': 5, 'total_tokens': 36})
Stream:
.. code-block:: python
for chunk in llm.stream(messages):
print(chunk)
.. code-block:: python
AIMessageChunk(content='', id='run-9e1517e3-12bf-48f2-bb1b-2e824f7cd7b0')
AIMessageChunk(content='J', id='run-9e1517e3-12bf-48f2-bb1b-2e824f7cd7b0')
AIMessageChunk(content="'adore", id='run-9e1517e3-12bf-48f2-bb1b-2e824f7cd7b0')
AIMessageChunk(content=' la', id='run-9e1517e3-12bf-48f2-bb1b-2e824f7cd7b0')
AIMessageChunk(content=' programmation', id='run-9e1517e3-12bf-48f2-bb1b-2e824f7cd7b0')
AIMessageChunk(content='.', id='run-9e1517e3-12bf-48f2-bb1b-2e824f7cd7b0')
AIMessageChunk(content='', response_metadata={'finish_reason': 'stop'}, id='run-9e1517e3-12bf-48f2-bb1b-2e824f7cd7b0')
AIMessageChunk(content='', id='run-9e1517e3-12bf-48f2-bb1b-2e824f7cd7b0', usage_metadata={'input_tokens': 31, 'output_tokens': 5, 'total_tokens': 36})
.. code-block:: python
stream = llm.stream(messages)
full = next(stream)
for chunk in stream:
full += chunk
full
.. code-block:: python
AIMessageChunk(content="J'adore la programmation.", response_metadata={'finish_reason': 'stop'}, id='run-bf917526-7f58-4683-84f7-36a6b671d140', usage_metadata={'input_tokens': 31, 'output_tokens': 5, 'total_tokens': 36})
Async:
.. code-block:: python
await llm.ainvoke(messages)
# stream:
# async for chunk in llm.astream(messages):
# batch:
# await llm.abatch([messages])
.. code-block:: python
AIMessage(content="J'adore la programmation.", response_metadata={'token_usage': {'completion_tokens': 5, 'prompt_tokens': 31, 'total_tokens': 36}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_43dfabdef1', 'finish_reason': 'stop', 'logprobs': None}, id='run-012cffe2-5d3d-424d-83b5-51c6d4a593d1-0', usage_metadata={'input_tokens': 31, 'output_tokens': 5, 'total_tokens': 36})
Tool calling:
.. code-block:: python
from langchain_core.pydantic_v1 import BaseModel, Field
class GetWeather(BaseModel):
'''Get the current weather in a given location'''
location: str = Field(..., description="The city and state, e.g. San Francisco, CA")
class GetPopulation(BaseModel):
'''Get the current population in a given location'''
location: str = Field(..., description="The city and state, e.g. San Francisco, CA")
llm_with_tools = llm.bind_tools([GetWeather, GetPopulation])
ai_msg = llm_with_tools.invoke("Which city is hotter today and which is bigger: LA or NY?")
ai_msg.tool_calls
.. code-block:: python
[{'name': 'GetWeather',
'args': {'location': 'Los Angeles, CA'},
'id': 'call_6XswGD5Pqk8Tt5atYr7tfenU'},
{'name': 'GetWeather',
'args': {'location': 'New York, NY'},
'id': 'call_ZVL15vA8Y7kXqOy3dtmQgeCi'},
{'name': 'GetPopulation',
'args': {'location': 'Los Angeles, CA'},
'id': 'call_49CFW8zqC9W7mh7hbMLSIrXw'},
{'name': 'GetPopulation',
'args': {'location': 'New York, NY'},
'id': 'call_6ghfKxV264jEfe1mRIkS3PE7'}]
See ``ChatOpenAI.bind_tools()`` method for more.
Structured output:
.. code-block:: python
from typing import Optional
from langchain_core.pydantic_v1 import BaseModel, Field
class Joke(BaseModel):
'''Joke to tell user.'''
setup: str = Field(description="The setup of the joke")
punchline: str = Field(description="The punchline to the joke")
rating: Optional[int] = Field(description="How funny the joke is, from 1 to 10")
structured_llm = llm.with_structured_output(Joke)
structured_llm.invoke("Tell me a joke about cats")
.. code-block:: python
Joke(setup='Why was the cat sitting on the computer?', punchline='To keep an eye on the mouse!', rating=None)
See ``ChatOpenAI.with_structured_output()`` for more.
JSON mode:
.. code-block:: python
json_llm = llm.bind(response_format={"type": "json_object"})
ai_msg = json_llm.invoke("Return a JSON object with key 'random_ints' and a value of 10 random ints in [0-99]")
ai_msg.content
.. code-block:: python
'\\n{\\n "random_ints": [23, 87, 45, 12, 78, 34, 56, 90, 11, 67]\\n}'
Image input:
.. code-block:: python
import base64
import httpx
from langchain_core.messages import HumanMessage
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")
message = HumanMessage(
content=[
{"type": "text", "text": "describe the weather in this image"},
{
"type": "image_url",
"image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
},
],
)
ai_msg = llm.invoke([message])
ai_msg.content
.. code-block:: python
"The weather in the image appears to be clear and pleasant. The sky is mostly blue with scattered, light clouds, suggesting a sunny day with minimal cloud cover. There is no indication of rain or strong winds, and the overall scene looks bright and calm. The lush green grass and clear visibility further indicate good weather conditions."
Token usage:
.. code-block:: python
ai_msg = llm.invoke(messages)
ai_msg.usage_metadata
.. code-block:: python
{'input_tokens': 28, 'output_tokens': 5, 'total_tokens': 33}
Logprobs:
.. code-block:: python
logprobs_llm = llm.bind(logprobs=True)
ai_msg = logprobs_llm.invoke(messages)
ai_msg.response_metadata["logprobs"]
.. code-block:: python
{'content': [{'token': 'J', 'bytes': [74], 'logprob': -4.9617593e-06, 'top_logprobs': []},
{'token': "'adore", 'bytes': [39, 97, 100, 111, 114, 101], 'logprob': -0.25202933, 'top_logprobs': []},
{'token': ' la', 'bytes': [32, 108, 97], 'logprob': -0.20141791, 'top_logprobs': []},
{'token': ' programmation', 'bytes': [32, 112, 114, 111, 103, 114, 97, 109, 109, 97, 116, 105, 111, 110], 'logprob': -1.9361265e-07, 'top_logprobs': []},
{'token': '.', 'bytes': [46], 'logprob': -1.2233183e-05, 'top_logprobs': []}]}
Response metadata:
.. code-block:: python
ai_msg = llm.invoke(messages)
ai_msg.response_metadata
.. code-block:: python
{'token_usage': {'completion_tokens': 5,
'prompt_tokens': 28,
'total_tokens': 33},
'model_name': 'gpt-4o',
'system_fingerprint': 'fp_319be4768e',
'finish_reason': 'stop',
'logprobs': None}
""" # noqa: E501
@property
def lc_secrets(self) -> Dict[str, str]: