Mirror of https://github.com/hwchase17/langchain.git, synced 2025-06-20 05:43:55 +00:00
community: Outlines integration (#27449)
In collaboration with @rlouf I built an [Outlines](https://dottxt-ai.github.io/outlines/latest/) integration for LangChain! I think this is really useful for doing any type of structured output locally. [Dottxt](https://dottxt.co) has put a lot of work into optimising this process at a lower level ([outlines-core](https://pypi.org/project/outlines-core/0.1.14/), written in Rust), so I think this is a better alternative to all current approaches in LangChain for structured output. It also implements the `.with_structured_output` method, so it should be a drop-in replacement for a lot of applications.

The integration includes:
- **Outlines LLM class**
- **ChatOutlines class**
- **Tutorial Cookbooks**
- **Documentation Page**
- **Validation and error messages**
- **Exposes Outlines Structured output features**
- **Support for multiple backends**
- **Integration and Unit Tests**

Dependencies: `outlines` + additional dependencies (depending on the backend used)

I am not sure whether the unit tests comply with all requirements; if not, I suggest simply removing them, since I don't see a useful way to do it differently.

### Quick overview:

Chat Models:
<img width="698" alt="image" src="https://github.com/user-attachments/assets/05a499b9-858c-4397-a9ff-165c2b3e7acc">

Structured Output:
<img width="955" alt="image" src="https://github.com/user-attachments/assets/b9fcac11-d3e5-4698-b1ae-8c4cb3d54c45">

---------

Co-authored-by: Vadym Barda <vadym@langchain.dev>
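For a quick sense of the API, here is a minimal usage sketch (the model name, backend, and prompts are only illustrative examples; the notebooks added in this PR contain the full walkthroughs):

```python
from pydantic import BaseModel

from langchain_community.chat_models import ChatOutlines


class AnswerWithJustification(BaseModel):
    answer: str
    justification: str


# Any supported backend works here: transformers (default), llamacpp, vllm, mlxlm
chat = ChatOutlines(model="microsoft/phi-2", backend="transformers")

# Plain chat completion
print(chat.invoke("What will the capital of Mars be called?").content)

# Structured output via the standard LangChain API
structured_chat = chat.with_structured_output(AnswerWithJustification)
print(structured_chat.invoke("What weighs more, a pound of bricks or a pound of feathers?"))
```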
This commit is contained in:
parent 2901fa20cc
commit dee72c46c1
348
docs/docs/integrations/chat/outlines.ipynb
Normal file
@ -0,0 +1,348 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "raw",
|
||||||
|
"id": "afaf8039",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"---\n",
|
||||||
|
"sidebar_label: Outlines\n",
|
||||||
|
"---"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "e49f1e0d",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# ChatOutlines\n",
|
||||||
|
"\n",
|
||||||
|
"This will help you getting started with Outlines [chat models](/docs/concepts/chat_models/). For detailed documentation of all ChatOutlines features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/outlines.chat_models.ChatOutlines.html).\n",
|
||||||
|
"\n",
|
||||||
|
"[Outlines](https://github.com/outlines-dev/outlines) is a library for constrained language generation. It allows you to use large language models (LLMs) with various backends while applying constraints to the generated output.\n",
|
||||||
|
"\n",
|
||||||
|
"## Overview\n",
|
||||||
|
"### Integration details\n",
|
||||||
|
"\n",
|
||||||
|
"| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |\n",
|
||||||
|
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
|
||||||
|
"| [ChatOutlines](https://api.python.langchain.com/en/latest/chat_models/outlines.chat_models.ChatOutlines.html) | [langchain-community](https://api.python.langchain.com/en/latest/community_api_reference.html) | ✅ | ❌ | ❌ |  |  |\n",
|
||||||
|
"\n",
|
||||||
|
"### Model features\n",
|
||||||
|
"| [Tool calling](/docs/how_to/tool_calling) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
|
||||||
|
"| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n",
|
||||||
|
"| ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | \n",
|
||||||
|
"\n",
|
||||||
|
"## Setup\n",
|
||||||
|
"\n",
|
||||||
|
"To access Outlines models you'll need to have an internet connection to download the model weights from huggingface. Depending on the backend you need to install the required dependencies (see [Outlines docs](https://dottxt-ai.github.io/outlines/latest/installation/))\n",
|
||||||
|
"\n",
|
||||||
|
"### Credentials\n",
|
||||||
|
"\n",
|
||||||
|
"There is no built-in auth mechanism for Outlines.\n",
|
||||||
|
"\n",
|
||||||
|
"### Installation\n",
|
||||||
|
"\n",
|
||||||
|
"The LangChain Outlines integration lives in the `langchain-community` package and requires the `outlines` library:"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"id": "652d6238-1f87-422a-b135-f5abbb8652fc",
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%pip install -qU langchain-community outlines"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "a38cde65-254d-4219-a441-068766c0d4b5",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Instantiation\n",
|
||||||
|
"\n",
|
||||||
|
"Now we can instantiate our model object and generate chat completions:"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae",
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from langchain_community.chat_models.outlines import ChatOutlines\n",
|
||||||
|
"\n",
|
||||||
|
"# For llamacpp backend\n",
|
||||||
|
"model = ChatOutlines(model=\"TheBloke/phi-2-GGUF/phi-2.Q4_K_M.gguf\", backend=\"llamacpp\")\n",
|
||||||
|
"\n",
|
||||||
|
"# For vllm backend (not available on Mac)\n",
|
||||||
|
"model = ChatOutlines(model=\"meta-llama/Llama-3.2-1B\", backend=\"vllm\")\n",
|
||||||
|
"\n",
|
||||||
|
"# For mlxlm backend (only available on Mac)\n",
|
||||||
|
"model = ChatOutlines(model=\"mistralai/Ministral-8B-Instruct-2410\", backend=\"mlxlm\")\n",
|
||||||
|
"\n",
|
||||||
|
"# For huggingface transformers backend\n",
|
||||||
|
"model = ChatOutlines(model=\"microsoft/phi-2\") # defaults to transformers backend"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "2b4f3e15",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Invocation"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"id": "62e0dbc3",
|
||||||
|
"metadata": {
|
||||||
|
"tags": []
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from langchain_core.messages import HumanMessage\n",
|
||||||
|
"\n",
|
||||||
|
"messages = [HumanMessage(content=\"What will the capital of mars be called?\")]\n",
|
||||||
|
"response = model.invoke(messages)\n",
|
||||||
|
"\n",
|
||||||
|
"response.content"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "18e2bfc0-7e78-4528-a73f-499ac150dca8",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Streaming\n",
|
||||||
|
"\n",
|
||||||
|
"ChatOutlines supports streaming of tokens:"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"id": "e197d1d7-a070-4c96-9f8a-a0e86d046e0b",
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"messages = [HumanMessage(content=\"Count to 10 in French:\")]\n",
|
||||||
|
"\n",
|
||||||
|
"for chunk in model.stream(messages):\n",
|
||||||
|
" print(chunk.content, end=\"\", flush=True)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "ccc3e2f6",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Chaining"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"id": "5a032003",
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from langchain_core.prompts import ChatPromptTemplate\n",
|
||||||
|
"\n",
|
||||||
|
"prompt = ChatPromptTemplate.from_messages(\n",
|
||||||
|
" [\n",
|
||||||
|
" (\n",
|
||||||
|
" \"system\",\n",
|
||||||
|
" \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n",
|
||||||
|
" ),\n",
|
||||||
|
" (\"human\", \"{input}\"),\n",
|
||||||
|
" ]\n",
|
||||||
|
")\n",
|
||||||
|
"\n",
|
||||||
|
"chain = prompt | model\n",
|
||||||
|
"chain.invoke(\n",
|
||||||
|
" {\n",
|
||||||
|
" \"input_language\": \"English\",\n",
|
||||||
|
" \"output_language\": \"German\",\n",
|
||||||
|
" \"input\": \"I love programming.\",\n",
|
||||||
|
" }\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "d1ee55bc-ffc8-4cfa-801c-993953a08cfd",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Constrained Generation\n",
|
||||||
|
"\n",
|
||||||
|
"ChatOutlines allows you to apply various constraints to the generated output:\n",
|
||||||
|
"\n",
|
||||||
|
"### Regex Constraint"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"model.regex = r\"((25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\\.){3}(25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\"\n",
|
||||||
|
"\n",
|
||||||
|
"response = model.invoke(\"What is the IP address of Google's DNS server?\")\n",
|
||||||
|
"\n",
|
||||||
|
"response.content"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "4a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Type Constraints"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"id": "5a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"model.type_constraints = int\n",
|
||||||
|
"response = model.invoke(\"What is the answer to life, the universe, and everything?\")\n",
|
||||||
|
"\n",
|
||||||
|
"response.content"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "6a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Pydantic and JSON Schemas"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"id": "7a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from pydantic import BaseModel\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"class Person(BaseModel):\n",
|
||||||
|
" name: str\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"model.json_schema = Person\n",
|
||||||
|
"response = model.invoke(\"Who are the main contributors to LangChain?\")\n",
|
||||||
|
"person = Person.model_validate_json(response.content)\n",
|
||||||
|
"\n",
|
||||||
|
"person"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "8a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Context Free Grammars"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"id": "9a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"model.grammar = \"\"\"\n",
|
||||||
|
"?start: expression\n",
|
||||||
|
"?expression: term ((\"+\" | \"-\") term)*\n",
|
||||||
|
"?term: factor ((\"*\" | \"/\") factor)*\n",
|
||||||
|
"?factor: NUMBER | \"-\" factor | \"(\" expression \")\"\n",
|
||||||
|
"%import common.NUMBER\n",
|
||||||
|
"%import common.WS\n",
|
||||||
|
"%ignore WS\n",
|
||||||
|
"\"\"\"\n",
|
||||||
|
"response = model.invoke(\"Give me a complex arithmetic expression:\")\n",
|
||||||
|
"\n",
|
||||||
|
"response.content"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "aa5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## LangChain's Structured Output\n",
|
||||||
|
"\n",
|
||||||
|
"You can also use LangChain's Structured Output with ChatOutlines:"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"id": "ba5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from pydantic import BaseModel\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"class AnswerWithJustification(BaseModel):\n",
|
||||||
|
" answer: str\n",
|
||||||
|
" justification: str\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"_model = model.with_structured_output(AnswerWithJustification)\n",
|
||||||
|
"result = _model.invoke(\"What weighs more, a pound of bricks or a pound of feathers?\")\n",
|
||||||
|
"\n",
|
||||||
|
"result"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "ca5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## API reference\n",
|
||||||
|
"\n",
|
||||||
|
"For detailed documentation of all ChatOutlines features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/outlines.chat_models.ChatOutlines.html\n",
|
||||||
|
"\n",
|
||||||
|
"## Full Outlines Documentation: \n",
|
||||||
|
"\n",
|
||||||
|
"https://dottxt-ai.github.io/outlines/latest/"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3 (ipykernel)",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.9.9"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 5
|
||||||
|
}
|
268
docs/docs/integrations/llms/outlines.ipynb
Normal file
@ -0,0 +1,268 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Outlines\n",
|
||||||
|
"\n",
|
||||||
|
"This will help you getting started with Outlines LLM. For detailed documentation of all Outlines features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/llms/outlines.llms.Outlines.html).\n",
|
||||||
|
"\n",
|
||||||
|
"[Outlines](https://github.com/outlines-dev/outlines) is a library for constrained language generation. It allows you to use large language models (LLMs) with various backends while applying constraints to the generated output.\n",
|
||||||
|
"\n",
|
||||||
|
"## Overview\n",
|
||||||
|
"\n",
|
||||||
|
"### Integration details\n",
|
||||||
|
"| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |\n",
|
||||||
|
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
|
||||||
|
"| [Outlines](https://python.langchain.com/api_reference/community/llms/langchain_community.llms.outlines.Outlines.html) | [langchain-community](https://python.langchain.com/api_reference/community/index.html) | ✅ | beta | ❌ |  |  |\n",
|
||||||
|
"\n",
|
||||||
|
"## Setup\n",
|
||||||
|
"\n",
|
||||||
|
"To access Outlines models you'll need to have an internet connection to download the model weights from huggingface. Depending on the backend you need to install the required dependencies (see [Outlines docs](https://dottxt-ai.github.io/outlines/latest/installation/))\n",
|
||||||
|
"\n",
|
||||||
|
"### Credentials\n",
|
||||||
|
"\n",
|
||||||
|
"There is no built-in auth mechanism for Outlines.\n",
|
||||||
|
"\n",
|
||||||
|
"## Installation\n",
|
||||||
|
"\n",
|
||||||
|
"The LangChain Outlines integration lives in the `langchain-community` package and requires the `outlines` library:"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {
|
||||||
|
"vscode": {
|
||||||
|
"languageId": "shellscript"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%pip install -qU langchain-community outlines"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Instantiation\n",
|
||||||
|
"\n",
|
||||||
|
"Now we can instantiate our model object and generate chat completions:"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from langchain_community.llms import Outlines\n",
|
||||||
|
"\n",
|
||||||
|
"# For use with llamacpp backend\n",
|
||||||
|
"model = Outlines(model=\"microsoft/Phi-3-mini-4k-instruct\", backend=\"llamacpp\")\n",
|
||||||
|
"\n",
|
||||||
|
"# For use with vllm backend (not available on Mac)\n",
|
||||||
|
"model = Outlines(model=\"microsoft/Phi-3-mini-4k-instruct\", backend=\"vllm\")\n",
|
||||||
|
"\n",
|
||||||
|
"# For use with mlxlm backend (only available on Mac)\n",
|
||||||
|
"model = Outlines(model=\"microsoft/Phi-3-mini-4k-instruct\", backend=\"mlxlm\")\n",
|
||||||
|
"\n",
|
||||||
|
"# For use with huggingface transformers backend\n",
|
||||||
|
"model = Outlines(\n",
|
||||||
|
" model=\"microsoft/Phi-3-mini-4k-instruct\"\n",
|
||||||
|
") # defaults to backend=\"transformers\""
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Invocation"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"model.invoke(\"Hello how are you?\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Chaining"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from langchain_core.prompts import PromptTemplate\n",
|
||||||
|
"\n",
|
||||||
|
"prompt = PromptTemplate.from_template(\"How to say {input} in {output_language}:\\n\")\n",
|
||||||
|
"\n",
|
||||||
|
"chain = prompt | model\n",
|
||||||
|
"chain.invoke(\n",
|
||||||
|
" {\n",
|
||||||
|
" \"output_language\": \"German\",\n",
|
||||||
|
" \"input\": \"I love programming.\",\n",
|
||||||
|
" }\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Streaming\n",
|
||||||
|
"\n",
|
||||||
|
"Outlines supports streaming of tokens:"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"for chunk in model.stream(\"Count to 10 in French:\"):\n",
|
||||||
|
" print(chunk, end=\"\", flush=True)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Constrained Generation\n",
|
||||||
|
"\n",
|
||||||
|
"Outlines allows you to apply various constraints to the generated output:\n",
|
||||||
|
"\n",
|
||||||
|
"#### Regex Constraint"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"model.regex = r\"((25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\\.){3}(25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\"\n",
|
||||||
|
"response = model.invoke(\"What is the IP address of Google's DNS server?\")\n",
|
||||||
|
"\n",
|
||||||
|
"response"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Type Constraints"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"model.type_constraints = int\n",
|
||||||
|
"response = model.invoke(\"What is the answer to life, the universe, and everything?\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"#### JSON Schema"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from pydantic import BaseModel\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"class Person(BaseModel):\n",
|
||||||
|
" name: str\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"model.json_schema = Person\n",
|
||||||
|
"response = model.invoke(\"Who is the author of LangChain?\")\n",
|
||||||
|
"person = Person.model_validate_json(response)\n",
|
||||||
|
"\n",
|
||||||
|
"person"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"#### Grammar Constraint"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"model.grammar = \"\"\"\n",
|
||||||
|
"?start: expression\n",
|
||||||
|
"?expression: term ((\"+\" | \"-\") term)\n",
|
||||||
|
"?term: factor ((\"\" | \"/\") factor)\n",
|
||||||
|
"?factor: NUMBER | \"-\" factor | \"(\" expression \")\"\n",
|
||||||
|
"%import common.NUMBER\n",
|
||||||
|
"%import common.WS\n",
|
||||||
|
"%ignore WS\n",
|
||||||
|
"\"\"\"\n",
|
||||||
|
"response = model.invoke(\"Give me a complex arithmetic expression:\")\n",
|
||||||
|
"\n",
|
||||||
|
"response"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## API reference\n",
|
||||||
|
"\n",
|
||||||
|
"For detailed documentation of all ChatOutlines features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/outlines.chat_models.ChatOutlines.html\n",
|
||||||
|
"\n",
|
||||||
|
"## Outlines Documentation: \n",
|
||||||
|
"\n",
|
||||||
|
"https://dottxt-ai.github.io/outlines/latest/"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": ".venv",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.9.9"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 2
|
||||||
|
}
|
201
docs/docs/integrations/providers/outlines.mdx
Normal file
@ -0,0 +1,201 @@
|
|||||||
|
# Outlines
|
||||||
|
|
||||||
|
>[Outlines](https://github.com/dottxt-ai/outlines) is a Python library for constrained language generation. It provides a unified interface to various language models and allows for structured generation using techniques like regex matching, type constraints, JSON schemas, and context-free grammars.
|
||||||
|
|
||||||
|
Outlines supports multiple backends, including:
|
||||||
|
- Hugging Face Transformers
|
||||||
|
- llama.cpp
|
||||||
|
- vLLM
|
||||||
|
- MLX
|
||||||
|
|
||||||
|
This integration allows you to use Outlines models with LangChain, providing both LLM and chat model interfaces.
|
||||||
|
|
||||||
|
## Installation and Setup
|
||||||
|
|
||||||
|
To use Outlines with LangChain, you'll need to install the Outlines library:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
pip install outlines
|
||||||
|
```
|
||||||
|
|
||||||
|
Depending on the backend you choose, you may need to install additional dependencies:
|
||||||
|
|
||||||
|
- For Transformers: `pip install transformers torch datasets`
|
||||||
|
- For llama.cpp: `pip install llama-cpp-python`
|
||||||
|
- For vLLM: `pip install vllm`
|
||||||
|
- For MLX: `pip install mlx`
|
||||||
|
|
||||||
|
## LLM
|
||||||
|
|
||||||
|
To use Outlines as an LLM in LangChain, you can use the `Outlines` class:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from langchain_community.llms import Outlines
|
||||||
|
```
|
||||||
|
|
||||||
|
## Chat Models
|
||||||
|
|
||||||
|
To use Outlines as a chat model in LangChain, you can use the `ChatOutlines` class:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from langchain_community.chat_models import ChatOutlines
|
||||||
|
```
|
||||||
|
|
||||||
|
## Model Configuration
|
||||||
|
|
||||||
|
Both `Outlines` and `ChatOutlines` classes share similar configuration options:
|
||||||
|
|
||||||
|
```python
|
||||||
|
model = Outlines(
|
||||||
|
model="meta-llama/Llama-2-7b-chat-hf", # Model identifier
|
||||||
|
backend="transformers", # Backend to use (transformers, llamacpp, vllm, or mlxlm)
|
||||||
|
max_tokens=256, # Maximum number of tokens to generate
|
||||||
|
stop=["\n"], # Optional list of stop strings
|
||||||
|
streaming=True, # Whether to stream the output
|
||||||
|
# Additional parameters for structured generation:
|
||||||
|
regex=None,
|
||||||
|
type_constraints=None,
|
||||||
|
json_schema=None,
|
||||||
|
grammar=None,
|
||||||
|
# Additional model parameters:
|
||||||
|
model_kwargs={"temperature": 0.7}
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Model Identifier
|
||||||
|
|
||||||
|
The `model` parameter can be:
|
||||||
|
- A Hugging Face model name (e.g., "meta-llama/Llama-2-7b-chat-hf")
|
||||||
|
- A local path to a model
|
||||||
|
- For GGUF models, the format is "repo_id/file_name" (e.g., "TheBloke/Llama-2-7B-Chat-GGUF/llama-2-7b-chat.Q4_K_M.gguf")
|
||||||
|
|
||||||
|
### Backend Options
|
||||||
|
|
||||||
|
The `backend` parameter specifies which backend to use:
|
||||||
|
- `"transformers"`: For Hugging Face Transformers models (default)
|
||||||
|
- `"llamacpp"`: For GGUF models using llama.cpp
|
||||||
|
- `"transformers_vision"`: For vision-language models (e.g., LLaVA)
|
||||||
|
- `"vllm"`: For models using the vLLM library
|
||||||
|
- `"mlxlm"`: For models using the MLX framework
|
||||||
|
|
||||||
|
### Structured Generation
|
||||||
|
|
||||||
|
Outlines provides several methods for structured generation:
|
||||||
|
|
||||||
|
1. **Regex Matching**:
|
||||||
|
```python
|
||||||
|
model = Outlines(
|
||||||
|
model="meta-llama/Llama-2-7b-chat-hf",
|
||||||
|
regex=r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"
|
||||||
|
)
|
||||||
|
```
|
||||||
|
This will ensure the generated text matches the specified regex pattern (in this case, a valid IP address).
|
||||||
|
|
||||||
|
2. **Type Constraints**:
|
||||||
|
```python
|
||||||
|
model = Outlines(
|
||||||
|
model="meta-llama/Llama-2-7b-chat-hf",
|
||||||
|
type_constraints=int
|
||||||
|
)
|
||||||
|
```
|
||||||
|
This restricts the output to valid Python types (int, float, bool, datetime.date, datetime.time, datetime.datetime).
|
||||||
|
|
||||||
|
3. **JSON Schema**:
|
||||||
|
```python
|
||||||
|
from pydantic import BaseModel
|
||||||
|
|
||||||
|
class Person(BaseModel):
|
||||||
|
name: str
|
||||||
|
age: int
|
||||||
|
|
||||||
|
model = Outlines(
|
||||||
|
model="meta-llama/Llama-2-7b-chat-hf",
|
||||||
|
json_schema=Person
|
||||||
|
)
|
||||||
|
```
|
||||||
|
This ensures the generated output adheres to the specified JSON schema or Pydantic model.
|
||||||
|
|
||||||
|
4. **Context-Free Grammar**:
|
||||||
|
```python
|
||||||
|
model = Outlines(
|
||||||
|
model="meta-llama/Llama-2-7b-chat-hf",
|
||||||
|
grammar="""
|
||||||
|
?start: expression
|
||||||
|
?expression: term (("+" | "-") term)*
|
||||||
|
?term: factor (("*" | "/") factor)*
|
||||||
|
?factor: NUMBER | "-" factor | "(" expression ")"
|
||||||
|
%import common.NUMBER
|
||||||
|
"""
|
||||||
|
)
|
||||||
|
```
|
||||||
|
This generates text that adheres to the specified context-free grammar in EBNF format.
|
||||||
|
|
||||||
|
## Usage Examples
|
||||||
|
|
||||||
|
### LLM Example
|
||||||
|
|
||||||
|
```python
|
||||||
|
from langchain_community.llms import Outlines
|
||||||
|
|
||||||
|
llm = Outlines(model="meta-llama/Llama-2-7b-chat-hf", max_tokens=100)
|
||||||
|
result = llm.invoke("Tell me a short story about a robot.")
|
||||||
|
print(result)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Chat Model Example
|
||||||
|
|
||||||
|
```python
|
||||||
|
from langchain_community.chat_models import ChatOutlines
|
||||||
|
from langchain_core.messages import HumanMessage, SystemMessage
|
||||||
|
|
||||||
|
chat = ChatOutlines(model="meta-llama/Llama-2-7b-chat-hf", max_tokens=100)
|
||||||
|
messages = [
|
||||||
|
SystemMessage(content="You are a helpful AI assistant."),
|
||||||
|
HumanMessage(content="What's the capital of France?")
|
||||||
|
]
|
||||||
|
result = chat.invoke(messages)
|
||||||
|
print(result.content)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Streaming Example
|
||||||
|
|
||||||
|
```python
|
||||||
|
from langchain_community.chat_models import ChatOutlines
|
||||||
|
from langchain_core.messages import HumanMessage
|
||||||
|
|
||||||
|
chat = ChatOutlines(model="meta-llama/Llama-2-7b-chat-hf", streaming=True)
|
||||||
|
for chunk in chat.stream("Tell me a joke about programming."):
|
||||||
|
print(chunk.content, end="", flush=True)
|
||||||
|
print()
|
||||||
|
```
|
||||||
|
|
||||||
|
### Structured Output Example
|
||||||
|
|
||||||
|
```python
|
||||||
|
from langchain_community.llms import Outlines
|
||||||
|
from pydantic import BaseModel
|
||||||
|
|
||||||
|
class MovieReview(BaseModel):
|
||||||
|
title: str
|
||||||
|
rating: int
|
||||||
|
summary: str
|
||||||
|
|
||||||
|
llm = Outlines(
|
||||||
|
model="meta-llama/Llama-2-7b-chat-hf",
|
||||||
|
json_schema=MovieReview
|
||||||
|
)
|
||||||
|
result = llm.invoke("Write a short review for the movie 'Inception'.")
|
||||||
|
print(result)
|
||||||
|
```
|
||||||
|
|
||||||
|
## Additional Features
|
||||||
|
|
||||||
|
### Tokenizer Access
|
||||||
|
|
||||||
|
You can access the underlying tokenizer for the model:
|
||||||
|
|
||||||
|
```python
|
||||||
|
tokenizer = llm.tokenizer
|
||||||
|
encoded = tokenizer.encode("Hello, world!")
|
||||||
|
decoded = tokenizer.decode(encoded)
|
||||||
|
```
|
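For context, a short sketch of how the tokenizer access described above might be used together with the wrapper (the model identifier is only an example, and the transformers backend is assumed):

```python
from langchain_community.llms import Outlines

# Defaults to the transformers backend
llm = Outlines(model="microsoft/Phi-3-mini-4k-instruct", max_tokens=64)

# The wrapper exposes the underlying tokenizer
tokenizer = llm.tokenizer
encoded = tokenizer.encode("Hello, world!")
print(tokenizer.decode(encoded))
```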
@ -55,6 +55,7 @@ openai<2
openapi-pydantic>=0.3.2,<0.4
oracle-ads>=2.9.1,<3
oracledb>=2.2.0,<3
outlines[test]>=0.1.0,<0.2
pandas>=2.0.1,<3
pdfminer-six>=20221105,<20240706
pgvector>=0.1.6,<0.2
@ -143,6 +143,7 @@ if TYPE_CHECKING:
    from langchain_community.chat_models.openai import (
        ChatOpenAI,
    )
    from langchain_community.chat_models.outlines import ChatOutlines
    from langchain_community.chat_models.pai_eas_endpoint import (
        PaiEasChatEndpoint,
    )
@ -228,6 +229,7 @@ __all__ = [
    "ChatOCIModelDeploymentTGI",
    "ChatOllama",
    "ChatOpenAI",
    "ChatOutlines",
    "ChatPerplexity",
    "ChatReka",
    "ChatPremAI",
@ -294,6 +296,7 @@ _module_lookup = {
    "ChatOCIModelDeploymentTGI": "langchain_community.chat_models.oci_data_science",
    "ChatOllama": "langchain_community.chat_models.ollama",
    "ChatOpenAI": "langchain_community.chat_models.openai",
    "ChatOutlines": "langchain_community.chat_models.outlines",
    "ChatReka": "langchain_community.chat_models.reka",
    "ChatPerplexity": "langchain_community.chat_models.perplexity",
    "ChatSambaNovaCloud": "langchain_community.chat_models.sambanova",
532
libs/community/langchain_community/chat_models/outlines.py
Normal file
@ -0,0 +1,532 @@
|
|||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import importlib.util
|
||||||
|
import platform
|
||||||
|
from collections.abc import AsyncIterator
|
||||||
|
from typing import (
|
||||||
|
Any,
|
||||||
|
Callable,
|
||||||
|
Dict,
|
||||||
|
Iterator,
|
||||||
|
List,
|
||||||
|
Optional,
|
||||||
|
Sequence,
|
||||||
|
Tuple,
|
||||||
|
Type,
|
||||||
|
TypedDict,
|
||||||
|
TypeVar,
|
||||||
|
Union,
|
||||||
|
get_origin,
|
||||||
|
)
|
||||||
|
|
||||||
|
from langchain_core.callbacks import CallbackManagerForLLMRun
|
||||||
|
from langchain_core.callbacks.manager import AsyncCallbackManagerForLLMRun
|
||||||
|
from langchain_core.language_models import LanguageModelInput
|
||||||
|
from langchain_core.language_models.chat_models import BaseChatModel
|
||||||
|
from langchain_core.messages import AIMessage, AIMessageChunk, BaseMessage
|
||||||
|
from langchain_core.output_parsers import JsonOutputParser, PydanticOutputParser
|
||||||
|
from langchain_core.outputs import ChatGeneration, ChatGenerationChunk, ChatResult
|
||||||
|
from langchain_core.runnables import Runnable
|
||||||
|
from langchain_core.tools import BaseTool
|
||||||
|
from langchain_core.utils.function_calling import convert_to_openai_tool
|
||||||
|
from pydantic import BaseModel, Field, model_validator
|
||||||
|
from typing_extensions import Literal
|
||||||
|
|
||||||
|
from langchain_community.adapters.openai import convert_message_to_dict
|
||||||
|
|
||||||
|
_BM = TypeVar("_BM", bound=BaseModel)
|
||||||
|
_DictOrPydanticClass = Union[Dict[str, Any], Type[_BM], Type]
|
||||||
|
|
||||||
|
|
||||||
|
class ChatOutlines(BaseChatModel):
|
||||||
|
"""Outlines chat model integration.
|
||||||
|
|
||||||
|
Setup:
|
||||||
|
pip install outlines
|
||||||
|
|
||||||
|
Key init args — client params:
|
||||||
|
backend: Literal["llamacpp", "transformers", "transformers_vision", "vllm", "mlxlm"] = "transformers"
|
||||||
|
Specifies the backend to use for the model.
|
||||||
|
|
||||||
|
Key init args — completion params:
|
||||||
|
model: str
|
||||||
|
Identifier for the model to use with Outlines.
|
||||||
|
max_tokens: int = 256
|
||||||
|
The maximum number of tokens to generate.
|
||||||
|
stop: Optional[List[str]] = None
|
||||||
|
A list of strings to stop generation when encountered.
|
||||||
|
streaming: bool = True
|
||||||
|
Whether to stream the results, token by token.
|
||||||
|
|
||||||
|
See full list of supported init args and their descriptions in the params section.
|
||||||
|
|
||||||
|
Instantiate:
|
||||||
|
from langchain_community.chat_models import ChatOutlines
|
||||||
|
chat = ChatOutlines(model="meta-llama/Llama-2-7b-chat-hf")
|
||||||
|
|
||||||
|
Invoke:
|
||||||
|
chat.invoke([HumanMessage(content="Say foo:")])
|
||||||
|
|
||||||
|
Stream:
|
||||||
|
for chunk in chat.stream([HumanMessage(content="Count to 10:")]):
|
||||||
|
print(chunk.content, end="", flush=True)
|
||||||
|
|
||||||
|
""" # noqa: E501
|
||||||
|
|
||||||
|
client: Any = None # :meta private:
|
||||||
|
|
||||||
|
model: str
|
||||||
|
"""Identifier for the model to use with Outlines.
|
||||||
|
|
||||||
|
The model identifier should be a string specifying:
|
||||||
|
- A Hugging Face model name (e.g., "meta-llama/Llama-2-7b-chat-hf")
|
||||||
|
- A local path to a model
|
||||||
|
- For GGUF models, the format is "repo_id/file_name"
|
||||||
|
(e.g., "TheBloke/Llama-2-7B-Chat-GGUF/llama-2-7b-chat.Q4_K_M.gguf")
|
||||||
|
|
||||||
|
Examples:
|
||||||
|
- "TheBloke/Llama-2-7B-Chat-GGUF/llama-2-7b-chat.Q4_K_M.gguf"
|
||||||
|
- "meta-llama/Llama-2-7b-chat-hf"
|
||||||
|
"""
|
||||||
|
|
||||||
|
backend: Literal[
|
||||||
|
"llamacpp", "transformers", "transformers_vision", "vllm", "mlxlm"
|
||||||
|
] = "transformers"
|
||||||
|
"""Specifies the backend to use for the model.
|
||||||
|
|
||||||
|
Supported backends are:
|
||||||
|
- "llamacpp": For GGUF models using llama.cpp
|
||||||
|
- "transformers": For Hugging Face Transformers models (default)
|
||||||
|
- "transformers_vision": For vision-language models (e.g., LLaVA)
|
||||||
|
- "vllm": For models using the vLLM library
|
||||||
|
- "mlxlm": For models using the MLX framework
|
||||||
|
|
||||||
|
Note: Ensure you have the necessary dependencies installed for the chosen backend.
|
||||||
|
The system will attempt to import required packages and may raise an ImportError
|
||||||
|
if they are not available.
|
||||||
|
"""
|
||||||
|
|
||||||
|
max_tokens: int = 256
|
||||||
|
"""The maximum number of tokens to generate."""
|
||||||
|
|
||||||
|
stop: Optional[List[str]] = None
|
||||||
|
"""A list of strings to stop generation when encountered."""
|
||||||
|
|
||||||
|
streaming: bool = True
|
||||||
|
"""Whether to stream the results, token by token."""
|
||||||
|
|
||||||
|
regex: Optional[str] = None
|
||||||
|
"""Regular expression for structured generation.
|
||||||
|
|
||||||
|
If provided, Outlines will guarantee that the generated text matches this regex.
|
||||||
|
This can be useful for generating structured outputs like IP addresses, dates, etc.
|
||||||
|
|
||||||
|
Example: (valid IP address)
|
||||||
|
regex = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"
|
||||||
|
|
||||||
|
Note: Computing the regex index can take some time, so it's recommended to reuse
|
||||||
|
the same regex for multiple generations if possible.
|
||||||
|
|
||||||
|
For more details, see: https://dottxt-ai.github.io/outlines/reference/generation/regex/
|
||||||
|
"""
|
||||||
|
|
||||||
|
type_constraints: Optional[Union[type, str]] = None
|
||||||
|
"""Type constraints for structured generation.
|
||||||
|
|
||||||
|
Restricts the output to valid Python types. Supported types include:
|
||||||
|
int, float, bool, datetime.date, datetime.time, datetime.datetime.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
type_constraints = int
|
||||||
|
|
||||||
|
For more details, see: https://dottxt-ai.github.io/outlines/reference/generation/format/
|
||||||
|
"""
|
||||||
|
|
||||||
|
json_schema: Optional[Union[Any, Dict, Callable]] = None
|
||||||
|
"""Pydantic model, JSON Schema, or callable (function signature)
|
||||||
|
for structured JSON generation.
|
||||||
|
|
||||||
|
Outlines can generate JSON output that follows a specified structure,
|
||||||
|
which is useful for:
|
||||||
|
1. Parsing the answer (e.g., with Pydantic), storing it, or returning it to a user.
|
||||||
|
2. Calling a function with the result.
|
||||||
|
|
||||||
|
You can provide:
|
||||||
|
- A Pydantic model
|
||||||
|
- A JSON Schema (as a Dict)
|
||||||
|
- A callable (function signature)
|
||||||
|
|
||||||
|
The generated JSON will adhere to the specified structure.
|
||||||
|
|
||||||
|
For more details, see: https://dottxt-ai.github.io/outlines/reference/generation/json/
|
||||||
|
"""
|
||||||
|
|
||||||
|
grammar: Optional[str] = None
|
||||||
|
"""Context-free grammar for structured generation.
|
||||||
|
|
||||||
|
If provided, Outlines will generate text that adheres to the specified grammar.
|
||||||
|
The grammar should be defined in EBNF format.
|
||||||
|
|
||||||
|
This can be useful for generating structured outputs like mathematical expressions,
|
||||||
|
programming languages, or custom domain-specific languages.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
grammar = '''
|
||||||
|
?start: expression
|
||||||
|
?expression: term (("+" | "-") term)*
|
||||||
|
?term: factor (("*" | "/") factor)*
|
||||||
|
?factor: NUMBER | "-" factor | "(" expression ")"
|
||||||
|
%import common.NUMBER
|
||||||
|
'''
|
||||||
|
|
||||||
|
Note: Grammar-based generation is currently experimental and may have performance
|
||||||
|
limitations. It uses greedy generation to mitigate these issues.
|
||||||
|
|
||||||
|
For more details and examples, see:
|
||||||
|
https://dottxt-ai.github.io/outlines/reference/generation/cfg/
|
||||||
|
"""
|
||||||
|
|
||||||
|
custom_generator: Optional[Any] = None
|
||||||
|
"""Set your own outlines generator object to override the default behavior."""
|
||||||
|
|
||||||
|
model_kwargs: Dict[str, Any] = Field(default_factory=dict)
|
||||||
|
"""Additional parameters to pass to the underlying model.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
model_kwargs = {"temperature": 0.8, "seed": 42}
|
||||||
|
"""
|
||||||
|
|
||||||
|
@model_validator(mode="after")
|
||||||
|
def validate_environment(self) -> "ChatOutlines":
|
||||||
|
"""Validate that outlines is installed and create a model instance."""
|
||||||
|
num_constraints = sum(
|
||||||
|
[
|
||||||
|
bool(self.regex),
|
||||||
|
bool(self.type_constraints),
|
||||||
|
bool(self.json_schema),
|
||||||
|
bool(self.grammar),
|
||||||
|
]
|
||||||
|
)
|
||||||
|
if num_constraints > 1:
|
||||||
|
raise ValueError(
|
||||||
|
"Either none or exactly one of regex, type_constraints, "
|
||||||
|
"json_schema, or grammar can be provided."
|
||||||
|
)
|
||||||
|
return self.build_client()
|
||||||
|
|
||||||
|
def build_client(self) -> "ChatOutlines":
|
||||||
|
try:
|
||||||
|
import outlines.models as models
|
||||||
|
except ImportError:
|
||||||
|
raise ImportError(
|
||||||
|
"Could not import the Outlines library. "
|
||||||
|
"Please install it with `pip install outlines`."
|
||||||
|
)
|
||||||
|
|
||||||
|
def check_packages_installed(
|
||||||
|
packages: List[Union[str, Tuple[str, str]]],
|
||||||
|
) -> None:
|
||||||
|
missing_packages = [
|
||||||
|
pkg if isinstance(pkg, str) else pkg[0]
|
||||||
|
for pkg in packages
|
||||||
|
if importlib.util.find_spec(pkg[1] if isinstance(pkg, tuple) else pkg)
|
||||||
|
is None
|
||||||
|
]
|
||||||
|
if missing_packages:
|
||||||
|
raise ImportError(
|
||||||
|
f"Missing packages: {', '.join(missing_packages)}. "
|
||||||
|
"You can install them with:\n\n"
|
||||||
|
f" pip install {' '.join(missing_packages)}"
|
||||||
|
)
|
||||||
|
|
||||||
|
if self.backend == "llamacpp":
|
||||||
|
check_packages_installed([("llama-cpp-python", "llama_cpp")])
|
||||||
|
if ".gguf" in self.model:
|
||||||
|
creator, repo_name, file_name = self.model.split("/", 2)
|
||||||
|
repo_id = f"{creator}/{repo_name}"
|
||||||
|
else:
|
||||||
|
raise ValueError("GGUF file_name must be provided for llama.cpp.")
|
||||||
|
self.client = models.llamacpp(repo_id, file_name, **self.model_kwargs)
|
||||||
|
elif self.backend == "transformers":
|
||||||
|
check_packages_installed(["transformers", "torch", "datasets"])
|
||||||
|
self.client = models.transformers(
|
||||||
|
model_name=self.model, **self.model_kwargs
|
||||||
|
)
|
||||||
|
elif self.backend == "transformers_vision":
|
||||||
|
if hasattr(models, "transformers_vision"):
|
||||||
|
from transformers import LlavaNextForConditionalGeneration
|
||||||
|
|
||||||
|
self.client = models.transformers_vision(
|
||||||
|
self.model,
|
||||||
|
model_class=LlavaNextForConditionalGeneration,
|
||||||
|
**self.model_kwargs,
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
raise ValueError("transformers_vision backend is not supported")
|
||||||
|
elif self.backend == "vllm":
|
||||||
|
if platform.system() == "Darwin":
|
||||||
|
raise ValueError("vLLM backend is not supported on macOS.")
|
||||||
|
check_packages_installed(["vllm"])
|
||||||
|
self.client = models.vllm(self.model, **self.model_kwargs)
|
||||||
|
elif self.backend == "mlxlm":
|
||||||
|
check_packages_installed(["mlx"])
|
||||||
|
self.client = models.mlxlm(self.model, **self.model_kwargs)
|
||||||
|
else:
|
||||||
|
raise ValueError(f"Unsupported backend: {self.backend}")
|
||||||
|
return self
|
||||||
|
|
||||||
|
@property
|
||||||
|
def _llm_type(self) -> str:
|
||||||
|
return "outlines-chat"
|
||||||
|
|
||||||
|
@property
|
||||||
|
def _default_params(self) -> Dict[str, Any]:
|
||||||
|
return {
|
||||||
|
"max_tokens": self.max_tokens,
|
||||||
|
"stop_at": self.stop,
|
||||||
|
**self.model_kwargs,
|
||||||
|
}
|
||||||
|
|
||||||
|
@property
|
||||||
|
def _identifying_params(self) -> Dict[str, Any]:
|
||||||
|
return {
|
||||||
|
"model": self.model,
|
||||||
|
"backend": self.backend,
|
||||||
|
"regex": self.regex,
|
||||||
|
"type_constraints": self.type_constraints,
|
||||||
|
"json_schema": self.json_schema,
|
||||||
|
"grammar": self.grammar,
|
||||||
|
**self._default_params,
|
||||||
|
}
|
||||||
|
|
||||||
|
@property
|
||||||
|
def _generator(self) -> Any:
|
||||||
|
from outlines import generate
|
||||||
|
|
||||||
|
if self.custom_generator:
|
||||||
|
return self.custom_generator
|
||||||
|
constraints = [
|
||||||
|
self.regex,
|
||||||
|
self.type_constraints,
|
||||||
|
self.json_schema,
|
||||||
|
self.grammar,
|
||||||
|
]
|
||||||
|
|
||||||
|
num_constraints = sum(constraint is not None for constraint in constraints)
|
||||||
|
if num_constraints != 1 and num_constraints != 0:
|
||||||
|
raise ValueError(
|
||||||
|
"Either none or exactly one of regex, type_constraints, "
|
||||||
|
"json_schema, or grammar can be provided."
|
||||||
|
)
|
||||||
|
if self.regex:
|
||||||
|
return generate.regex(self.client, regex_str=self.regex)
|
||||||
|
if self.type_constraints:
|
||||||
|
return generate.format(self.client, python_type=self.type_constraints)
|
||||||
|
if self.json_schema:
|
||||||
|
return generate.json(self.client, schema_object=self.json_schema)
|
||||||
|
if self.grammar:
|
||||||
|
return generate.cfg(self.client, cfg_str=self.grammar)
|
||||||
|
return generate.text(self.client)
|
||||||
|
|
||||||
|
def _convert_messages_to_openai_format(
|
||||||
|
self, messages: list[BaseMessage]
|
||||||
|
) -> list[dict]:
|
||||||
|
return [convert_message_to_dict(message) for message in messages]
|
||||||
|
|
||||||
|
def _convert_messages_to_prompt(self, messages: list[BaseMessage]) -> str:
|
||||||
|
"""Convert a list of messages to a single prompt."""
|
||||||
|
if self.backend == "llamacpp": # get base_model_name from gguf repo_id
|
||||||
|
from huggingface_hub import ModelCard
|
||||||
|
|
||||||
|
repo_creator, gguf_repo_name, file_name = self.model.split("/")
|
||||||
|
model_card = ModelCard.load(f"{repo_creator}/{gguf_repo_name}")
|
||||||
|
if hasattr(model_card.data, "base_model"):
|
||||||
|
model_name = model_card.data.base_model
|
||||||
|
else:
|
||||||
|
raise ValueError(f"Base model name not found for {self.model}")
|
||||||
|
else:
|
||||||
|
model_name = self.model
|
||||||
|
|
||||||
|
from transformers import AutoTokenizer
|
||||||
|
|
||||||
|
return AutoTokenizer.from_pretrained(model_name).apply_chat_template(
|
||||||
|
self._convert_messages_to_openai_format(messages),
|
||||||
|
tokenize=False,
|
||||||
|
add_generation_prompt=True,
|
||||||
|
)
|
||||||
|
|
||||||
|
def bind_tools(
|
||||||
|
self,
|
||||||
|
tools: Sequence[Dict[str, Any] | type | Callable[..., Any] | BaseTool],
|
||||||
|
*,
|
||||||
|
tool_choice: Optional[Union[Dict, bool, str]] = None,
|
||||||
|
**kwargs: Any,
|
||||||
|
) -> Runnable[LanguageModelInput, BaseMessage]:
|
||||||
|
"""Bind tool-like objects to this chat model
|
||||||
|
|
||||||
|
tool_choice: does not currently support "any" or "auto" choices as in the
OpenAI tool-calling API. To force a specific tool, pass a dict of the form
{"type": "function", "function": {"name": <<tool_name>>}}.
|
||||||
|
"""
|
||||||
|
formatted_tools = [convert_to_openai_tool(tool) for tool in tools]
|
||||||
|
tool_names = [ft["function"]["name"] for ft in formatted_tools]
|
||||||
|
if tool_choice:
|
||||||
|
if isinstance(tool_choice, dict):
|
||||||
|
if not any(
|
||||||
|
tool_choice["function"]["name"] == name for name in tool_names
|
||||||
|
):
|
||||||
|
raise ValueError(
|
||||||
|
f"Tool choice {tool_choice=} was specified, but the only "
|
||||||
|
f"provided tools were {tool_names}."
|
||||||
|
)
|
||||||
|
elif isinstance(tool_choice, str):
|
||||||
|
chosen = [
|
||||||
|
f for f in formatted_tools if f["function"]["name"] == tool_choice
|
||||||
|
]
|
||||||
|
if not chosen:
|
||||||
|
raise ValueError(
|
||||||
|
f"Tool choice {tool_choice=} was specified, but the only "
|
||||||
|
f"provided tools were {tool_names}."
|
||||||
|
)
|
||||||
|
elif isinstance(tool_choice, bool):
|
||||||
|
if len(formatted_tools) > 1:
|
||||||
|
raise ValueError(
|
||||||
|
"tool_choice=True can only be specified when a single tool is "
|
||||||
|
f"passed in. Received {len(tools)} tools."
|
||||||
|
)
|
||||||
|
tool_choice = formatted_tools[0]
|
||||||
|
|
||||||
|
kwargs["tool_choice"] = tool_choice
|
||||||
|
formatted_tools = [convert_to_openai_tool(tool) for tool in tools]
|
||||||
|
return super().bind_tools(tools=formatted_tools, **kwargs)
|
||||||
|
|
||||||
|
def with_structured_output(
|
||||||
|
self,
|
||||||
|
schema: Optional[_DictOrPydanticClass],
|
||||||
|
*,
|
||||||
|
include_raw: bool = False,
|
||||||
|
**kwargs: Any,
|
||||||
|
) -> Runnable[LanguageModelInput, Union[dict, BaseModel]]:
|
||||||
|
if get_origin(schema) is TypedDict:
|
||||||
|
raise NotImplementedError("TypedDict is not supported yet by Outlines")
|
||||||
|
|
||||||
|
self.json_schema = schema
|
||||||
|
|
||||||
|
if isinstance(schema, type) and issubclass(schema, BaseModel):
|
||||||
|
parser: Union[PydanticOutputParser, JsonOutputParser] = (
|
||||||
|
PydanticOutputParser(pydantic_object=schema)
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
parser = JsonOutputParser()
|
||||||
|
|
||||||
|
if include_raw: # TODO
|
||||||
|
raise NotImplementedError("include_raw is not yet supported")
|
||||||
|
|
||||||
|
return self | parser
|
||||||
|
|
||||||
|
def _generate(
|
||||||
|
self,
|
||||||
|
messages: List[BaseMessage],
|
||||||
|
stop: Optional[List[str]] = None,
|
||||||
|
run_manager: Optional[CallbackManagerForLLMRun] = None,
|
||||||
|
**kwargs: Any,
|
||||||
|
) -> ChatResult:
|
||||||
|
params = {**self._default_params, **kwargs}
|
||||||
|
if stop:
|
||||||
|
params["stop_at"] = stop
|
||||||
|
|
||||||
|
prompt = self._convert_messages_to_prompt(messages)
|
||||||
|
|
||||||
|
response = ""
|
||||||
|
if self.streaming:
|
||||||
|
for chunk in self._stream(
|
||||||
|
messages=messages,
|
||||||
|
stop=stop,
|
||||||
|
run_manager=run_manager,
|
||||||
|
**kwargs,
|
||||||
|
):
|
||||||
|
if isinstance(chunk.message.content, str):
|
||||||
|
response += chunk.message.content
|
||||||
|
else:
|
||||||
|
raise ValueError(
|
||||||
|
"Invalid content type, only str is supported, "
|
||||||
|
f"got {type(chunk.message.content)}"
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
response = self._generator(prompt, **params)
|
||||||
|
|
||||||
|
message = AIMessage(content=response)
|
||||||
|
generation = ChatGeneration(message=message)
|
||||||
|
return ChatResult(generations=[generation])
|
||||||
|
|
||||||
|
def _stream(
|
||||||
|
self,
|
||||||
|
messages: List[BaseMessage],
|
||||||
|
stop: Optional[List[str]] = None,
|
||||||
|
run_manager: Optional[CallbackManagerForLLMRun] = None,
|
||||||
|
**kwargs: Any,
|
||||||
|
) -> Iterator[ChatGenerationChunk]:
|
||||||
|
params = {**self._default_params, **kwargs}
|
||||||
|
if stop:
|
||||||
|
params["stop_at"] = stop
|
||||||
|
|
||||||
|
prompt = self._convert_messages_to_prompt(messages)
|
||||||
|
|
||||||
|
for token in self._generator.stream(prompt, **params):
|
||||||
|
if run_manager:
|
||||||
|
run_manager.on_llm_new_token(token)
|
||||||
|
message_chunk = AIMessageChunk(content=token)
|
||||||
|
chunk = ChatGenerationChunk(message=message_chunk)
|
||||||
|
yield chunk
|
||||||
|
|
||||||
|
async def _agenerate(
|
||||||
|
self,
|
||||||
|
messages: List[BaseMessage],
|
||||||
|
stop: List[str] | None = None,
|
||||||
|
run_manager: AsyncCallbackManagerForLLMRun | None = None,
|
||||||
|
**kwargs: Any,
|
||||||
|
) -> ChatResult:
|
||||||
|
if hasattr(self._generator, "agenerate"):
|
||||||
|
params = {**self._default_params, **kwargs}
|
||||||
|
if stop:
|
||||||
|
params["stop_at"] = stop
|
||||||
|
|
||||||
|
prompt = self._convert_messages_to_prompt(messages)
|
||||||
|
response = await self._generator.agenerate(prompt, **params)
|
||||||
|
|
||||||
|
message = AIMessage(content=response)
|
||||||
|
generation = ChatGeneration(message=message)
|
||||||
|
return ChatResult(generations=[generation])
|
||||||
|
elif self.streaming:
|
||||||
|
response = ""
|
||||||
|
async for chunk in self._astream(messages, stop, run_manager, **kwargs):
|
||||||
|
response += chunk.message.content or ""
|
||||||
|
message = AIMessage(content=response)
|
||||||
|
generation = ChatGeneration(message=message)
|
||||||
|
return ChatResult(generations=[generation])
|
||||||
|
else:
|
||||||
|
return await super()._agenerate(messages, stop, run_manager, **kwargs)
|
||||||
|
|
||||||
|
async def _astream(
|
||||||
|
self,
|
||||||
|
messages: List[BaseMessage],
|
||||||
|
stop: List[str] | None = None,
|
||||||
|
run_manager: AsyncCallbackManagerForLLMRun | None = None,
|
||||||
|
**kwargs: Any,
|
||||||
|
) -> AsyncIterator[ChatGenerationChunk]:
|
||||||
|
if hasattr(self._generator, "astream"):
|
||||||
|
params = {**self._default_params, **kwargs}
|
||||||
|
if stop:
|
||||||
|
params["stop_at"] = stop
|
||||||
|
|
||||||
|
prompt = self._convert_messages_to_prompt(messages)
|
||||||
|
|
||||||
|
async for token in self._generator.astream(prompt, **params):
|
||||||
|
if run_manager:
|
||||||
|
await run_manager.on_llm_new_token(token)
|
||||||
|
message_chunk = AIMessageChunk(content=token)
|
||||||
|
chunk = ChatGenerationChunk(message=message_chunk)
|
||||||
|
yield chunk
|
||||||
|
else:
|
||||||
|
async for chunk in super()._astream(messages, stop, run_manager, **kwargs):
|
||||||
|
yield chunk
|
@ -458,6 +458,12 @@ def _import_openlm() -> Type[BaseLLM]:
    return OpenLM


def _import_outlines() -> Type[BaseLLM]:
    from langchain_community.llms.outlines import Outlines

    return Outlines


def _import_pai_eas_endpoint() -> Type[BaseLLM]:
    from langchain_community.llms.pai_eas_endpoint import PaiEasEndpoint
@ -807,6 +813,8 @@ def __getattr__(name: str) -> Any:
        return _import_openllm()
    elif name == "OpenLM":
        return _import_openlm()
    elif name == "Outlines":
        return _import_outlines()
    elif name == "PaiEasEndpoint":
        return _import_pai_eas_endpoint()
    elif name == "Petals":
@ -954,6 +962,7 @@ __all__ = [
    "OpenAIChat",
    "OpenLLM",
    "OpenLM",
    "Outlines",
    "PaiEasEndpoint",
    "Petals",
    "PipelineAI",
@ -1076,6 +1085,7 @@ def get_type_to_cls_dict() -> Dict[str, Callable[[], Type[BaseLLM]]]:
|
|||||||
"vertexai_model_garden": _import_vertex_model_garden,
|
"vertexai_model_garden": _import_vertex_model_garden,
|
||||||
"openllm": _import_openllm,
|
"openllm": _import_openllm,
|
||||||
"openllm_client": _import_openllm,
|
"openllm_client": _import_openllm,
|
||||||
|
"outlines": _import_outlines,
|
||||||
"vllm": _import_vllm,
|
"vllm": _import_vllm,
|
||||||
"vllm_openai": _import_vllm_openai,
|
"vllm_openai": _import_vllm_openai,
|
||||||
"watsonxllm": _import_watsonxllm,
|
"watsonxllm": _import_watsonxllm,
|
||||||
|
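These hunks register the wrapper in the lazy-import machinery of langchain_community.llms. A small sketch of what they enable (resolving the class needs no outlines install, since the heavy imports happen lazily at model-build time):

from langchain_community.llms import Outlines, get_type_to_cls_dict

# __getattr__ resolves "Outlines" and the registry maps "outlines" to the
# same class object.
assert get_type_to_cls_dict()["outlines"]() is Outlines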
314
libs/community/langchain_community/llms/outlines.py
Normal file
@@ -0,0 +1,314 @@
from __future__ import annotations

import importlib.util
import logging
import platform
from typing import Any, Callable, Dict, Iterator, List, Literal, Optional, Tuple, Union

from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models.llms import LLM
from langchain_core.outputs import GenerationChunk
from pydantic import BaseModel, Field, model_validator

logger = logging.getLogger(__name__)


class Outlines(LLM):
    """LLM wrapper for the Outlines library."""

    client: Any = None  # :meta private:

    model: str
    """Identifier for the model to use with Outlines.

    The model identifier should be a string specifying:
    - A Hugging Face model name (e.g., "meta-llama/Llama-2-7b-chat-hf")
    - A local path to a model
    - For GGUF models, the format is "repo_id/file_name"
      (e.g., "TheBloke/Llama-2-7B-Chat-GGUF/llama-2-7b-chat.Q4_K_M.gguf")

    Examples:
    - "TheBloke/Llama-2-7B-Chat-GGUF/llama-2-7b-chat.Q4_K_M.gguf"
    - "meta-llama/Llama-2-7b-chat-hf"
    """

    backend: Literal[
        "llamacpp", "transformers", "transformers_vision", "vllm", "mlxlm"
    ] = "transformers"
    """Specifies the backend to use for the model.

    Supported backends are:
    - "llamacpp": For GGUF models using llama.cpp
    - "transformers": For Hugging Face Transformers models (default)
    - "transformers_vision": For vision-language models (e.g., LLaVA)
    - "vllm": For models using the vLLM library
    - "mlxlm": For models using the MLX framework

    Note: Ensure you have the necessary dependencies installed for the chosen backend.
    The system will attempt to import required packages and may raise an ImportError
    if they are not available.
    """

    max_tokens: int = 256
    """The maximum number of tokens to generate."""

    stop: Optional[List[str]] = None
    """A list of strings to stop generation when encountered."""

    streaming: bool = True
    """Whether to stream the results, token by token."""

    regex: Optional[str] = None
    """Regular expression for structured generation.

    If provided, Outlines will guarantee that the generated text matches this regex.
    This can be useful for generating structured outputs like IP addresses, dates, etc.

    Example (valid IP address):
    regex = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"

    Note: Computing the regex index can take some time, so it's recommended to reuse
    the same regex for multiple generations if possible.

    For more details, see: https://dottxt-ai.github.io/outlines/reference/generation/regex/
    """

    type_constraints: Optional[Union[type, str]] = None
    """Type constraints for structured generation.

    Restricts the output to valid Python types. Supported types include:
    int, float, bool, datetime.date, datetime.time, datetime.datetime.

    Example:
    type_constraints = int

    For more details, see: https://dottxt-ai.github.io/outlines/reference/generation/format/
    """

    json_schema: Optional[Union[BaseModel, Dict, Callable]] = None
    """Pydantic model, JSON Schema, or callable (function signature)
    for structured JSON generation.

    Outlines can generate JSON output that follows a specified structure,
    which is useful for:
    1. Parsing the answer (e.g., with Pydantic), storing it, or returning it to a user.
    2. Calling a function with the result.

    You can provide:
    - A Pydantic model
    - A JSON Schema (as a Dict)
    - A callable (function signature)

    The generated JSON will adhere to the specified structure.

    For more details, see: https://dottxt-ai.github.io/outlines/reference/generation/json/
    """

    grammar: Optional[str] = None
    """Context-free grammar for structured generation.

    If provided, Outlines will generate text that adheres to the specified grammar.
    The grammar should be defined in EBNF format.

    This can be useful for generating structured outputs like mathematical expressions,
    programming languages, or custom domain-specific languages.

    Example:
    grammar = '''
        ?start: expression
        ?expression: term (("+" | "-") term)*
        ?term: factor (("*" | "/") factor)*
        ?factor: NUMBER | "-" factor | "(" expression ")"
        %import common.NUMBER
    '''

    Note: Grammar-based generation is currently experimental and may have performance
    limitations. It uses greedy generation to mitigate these issues.

    For more details and examples, see:
    https://dottxt-ai.github.io/outlines/reference/generation/cfg/
    """

    custom_generator: Optional[Any] = None
    """Set your own outlines generator object to override the default behavior."""

    model_kwargs: Dict[str, Any] = Field(default_factory=dict)
    """Additional parameters to pass to the underlying model.

    Example:
    model_kwargs = {"temperature": 0.8, "seed": 42}
    """

    @model_validator(mode="after")
    def validate_environment(self) -> "Outlines":
        """Validate that outlines is installed and create a model instance."""
        num_constraints = sum(
            [
                bool(self.regex),
                bool(self.type_constraints),
                bool(self.json_schema),
                bool(self.grammar),
            ]
        )
        if num_constraints > 1:
            raise ValueError(
                "Either none or exactly one of regex, type_constraints, "
                "json_schema, or grammar can be provided."
            )
        return self.build_client()

    def build_client(self) -> "Outlines":
        try:
            import outlines.models as models
        except ImportError:
            raise ImportError(
                "Could not import the Outlines library. "
                "Please install it with `pip install outlines`."
            )

        def check_packages_installed(
            packages: List[Union[str, Tuple[str, str]]],
        ) -> None:
            missing_packages = [
                pkg if isinstance(pkg, str) else pkg[0]
                for pkg in packages
                if importlib.util.find_spec(pkg[1] if isinstance(pkg, tuple) else pkg)
                is None
            ]
            if missing_packages:
                raise ImportError(  # todo this is displaying wrong
                    f"Missing packages: {', '.join(missing_packages)}. "
                    "You can install them with:\n\n"
                    f"    pip install {' '.join(missing_packages)}"
                )

        if self.backend == "llamacpp":
            if ".gguf" in self.model:
                creator, repo_name, file_name = self.model.split("/", 2)
                repo_id = f"{creator}/{repo_name}"
            else:  # todo add auto-file-selection if no file is given
                raise ValueError("GGUF file_name must be provided for llama.cpp.")
            check_packages_installed([("llama-cpp-python", "llama_cpp")])
            self.client = models.llamacpp(repo_id, file_name, **self.model_kwargs)
        elif self.backend == "transformers":
            check_packages_installed(["transformers", "torch", "datasets"])
            self.client = models.transformers(self.model, **self.model_kwargs)
        elif self.backend == "transformers_vision":
            check_packages_installed(
                ["transformers", "datasets", "torchvision", "PIL", "flash_attn"]
            )
            from transformers import LlavaNextForConditionalGeneration

            if not hasattr(models, "transformers_vision"):
                raise ValueError(
                    "transformers_vision backend is not supported, "
                    "please install the correct outlines version."
                )
            self.client = models.transformers_vision(
                self.model,
                model_class=LlavaNextForConditionalGeneration,
                **self.model_kwargs,
            )
        elif self.backend == "vllm":
            if platform.system() == "Darwin":
                raise ValueError("vLLM backend is not supported on macOS.")
            check_packages_installed(["vllm"])
            self.client = models.vllm(self.model, **self.model_kwargs)
        elif self.backend == "mlxlm":
            check_packages_installed(["mlx"])
            self.client = models.mlxlm(self.model, **self.model_kwargs)
        else:
            raise ValueError(f"Unsupported backend: {self.backend}")

        return self

    @property
    def _llm_type(self) -> str:
        return "outlines"

    @property
    def _default_params(self) -> Dict[str, Any]:
        return {
            "max_tokens": self.max_tokens,
            "stop_at": self.stop,
            **self.model_kwargs,
        }

    @property
    def _identifying_params(self) -> Dict[str, Any]:
        return {
            "model": self.model,
            "backend": self.backend,
            "regex": self.regex,
            "type_constraints": self.type_constraints,
            "json_schema": self.json_schema,
            "grammar": self.grammar,
            **self._default_params,
        }

    @property
    def _generator(self) -> Any:
        from outlines import generate

        if self.custom_generator:
            return self.custom_generator
        if self.regex:
            return generate.regex(self.client, regex_str=self.regex)
        if self.type_constraints:
            return generate.format(self.client, python_type=self.type_constraints)
        if self.json_schema:
            return generate.json(self.client, schema_object=self.json_schema)
        if self.grammar:
            return generate.cfg(self.client, cfg_str=self.grammar)
        return generate.text(self.client)

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> str:
        params = {**self._default_params, **kwargs}
        if stop:
            params["stop_at"] = stop

        response = ""
        if self.streaming:
            for chunk in self._stream(
                prompt=prompt,
                stop=params["stop_at"],
                run_manager=run_manager,
                **params,
            ):
                response += chunk.text
        else:
            response = self._generator(prompt, **params)
        return response

    def _stream(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> Iterator[GenerationChunk]:
        params = {**self._default_params, **kwargs}
        if stop:
            params["stop_at"] = stop

        for token in self._generator.stream(prompt, **params):
            if run_manager:
                run_manager.on_llm_new_token(token)
            yield GenerationChunk(text=token)

    @property
    def tokenizer(self) -> Any:
        """Access the tokenizer for the underlying model.

        .encode() to tokenize text.
        .decode() to convert tokens back to text.
        """
        if hasattr(self.client, "tokenizer"):
            return self.client.tokenizer
        raise ValueError("Tokenizer not found")
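For reference, a minimal sketch of using the wrapper above with a regex constraint. The model name, prompt, and regex are taken from the integration tests below and are illustrative; the transformers backend dependencies are assumed to be installed.

from langchain_community.llms.outlines import Outlines

llm = Outlines(
    model="microsoft/Phi-3-mini-4k-instruct",
    backend="transformers",
    regex=r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)",
    max_tokens=30,
)
# The validator builds the client eagerly; _generator then compiles the regex
# index once, and every invoke() is guaranteed to match it.
print(llm.invoke("Q: What is the IP address of Google's DNS server?\n\nA: "))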
@@ -0,0 +1,177 @@
# flake8: noqa
"""Test ChatOutlines wrapper."""

from typing import Generator
import re
import platform
import pytest

from langchain_community.chat_models.outlines import ChatOutlines
from langchain_core.messages import AIMessage, HumanMessage, BaseMessage
from langchain_core.messages import BaseMessageChunk
from pydantic import BaseModel

from tests.unit_tests.callbacks.fake_callback_handler import FakeCallbackHandler


MODEL = "microsoft/Phi-3-mini-4k-instruct"
LLAMACPP_MODEL = "bartowski/qwen2.5-7b-ins-v3-GGUF/qwen2.5-7b-ins-v3-Q4_K_M.gguf"

BACKENDS = ["transformers", "llamacpp"]
if platform.system() != "Darwin":
    BACKENDS.append("vllm")
if platform.system() == "Darwin":
    BACKENDS.append("mlxlm")


@pytest.fixture(params=BACKENDS)
def chat_model(request: pytest.FixtureRequest) -> ChatOutlines:
    if request.param == "llamacpp":
        return ChatOutlines(model=LLAMACPP_MODEL, backend=request.param)
    else:
        return ChatOutlines(model=MODEL, backend=request.param)


def test_chat_outlines_inference(chat_model: ChatOutlines) -> None:
    """Test valid ChatOutlines inference."""
    messages = [HumanMessage(content="Say foo:")]
    output = chat_model.invoke(messages)
    assert isinstance(output, AIMessage)
    assert len(output.content) > 1


def test_chat_outlines_streaming(chat_model: ChatOutlines) -> None:
    """Test streaming tokens from ChatOutlines."""
    messages = [HumanMessage(content="How do you say 'hello' in Spanish?")]
    generator = chat_model.stream(messages)
    stream_results_string = ""
    assert isinstance(generator, Generator)

    for chunk in generator:
        assert isinstance(chunk, BaseMessageChunk)
        if isinstance(chunk.content, str):
            stream_results_string += chunk.content
        else:
            raise ValueError(
                f"Invalid content type, only str is supported, "
                f"got {type(chunk.content)}"
            )
    assert len(stream_results_string.strip()) > 1


def test_chat_outlines_streaming_callback(chat_model: ChatOutlines) -> None:
    """Test that streaming correctly invokes on_llm_new_token callback."""
    MIN_CHUNKS = 5
    callback_handler = FakeCallbackHandler()
    chat_model.callbacks = [callback_handler]
    chat_model.verbose = True
    messages = [HumanMessage(content="Can you count to 10?")]
    chat_model.invoke(messages)
    assert callback_handler.llm_streams >= MIN_CHUNKS


def test_chat_outlines_regex(chat_model: ChatOutlines) -> None:
    """Test regex for generating a valid IP address"""
    ip_regex = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"
    chat_model.regex = ip_regex
    assert chat_model.regex == ip_regex

    messages = [HumanMessage(content="What is the IP address of Google's DNS server?")]
    output = chat_model.invoke(messages)

    assert isinstance(output, AIMessage)
    assert re.match(
        ip_regex, str(output.content)
    ), f"Generated output '{output.content}' is not a valid IP address"


def test_chat_outlines_type_constraints(chat_model: ChatOutlines) -> None:
    """Test type constraints for generating an integer"""
    chat_model.type_constraints = int
    messages = [
        HumanMessage(
            content="What is the answer to life, the universe, and everything?"
        )
    ]
    output = chat_model.invoke(messages)
    assert isinstance(int(str(output.content)), int)


def test_chat_outlines_json(chat_model: ChatOutlines) -> None:
    """Test json for generating a valid JSON object"""

    class Person(BaseModel):
        name: str

    chat_model.json_schema = Person
    messages = [HumanMessage(content="Who are the main contributors to LangChain?")]
    output = chat_model.invoke(messages)
    person = Person.model_validate_json(str(output.content))
    assert isinstance(person, Person)


def test_chat_outlines_grammar(chat_model: ChatOutlines) -> None:
    """Test grammar for generating a valid arithmetic expression"""
    if chat_model.backend == "mlxlm":
        pytest.skip("MLX grammars not yet supported.")

    chat_model.grammar = """
        ?start: expression
        ?expression: term (("+" | "-") term)*
        ?term: factor (("*" | "/") factor)*
        ?factor: NUMBER | "-" factor | "(" expression ")"
        %import common.NUMBER
        %import common.WS
        %ignore WS
    """

    messages = [HumanMessage(content="Give me a complex arithmetic expression:")]
    output = chat_model.invoke(messages)

    # Validate the output is a non-empty string
    assert (
        isinstance(output.content, str) and output.content.strip()
    ), "Output should be a non-empty string"

    # Use a simple regex to check if the output contains basic arithmetic operations and numbers
    assert re.search(
        r"[\d\+\-\*/\(\)]+", output.content
    ), f"Generated output '{output.content}' does not appear to be a valid arithmetic expression"


def test_chat_outlines_with_structured_output(chat_model: ChatOutlines) -> None:
    """Test that ChatOutlines can generate structured outputs"""

    class AnswerWithJustification(BaseModel):
        """An answer to the user question along with justification for the answer."""

        answer: str
        justification: str

    structured_chat_model = chat_model.with_structured_output(AnswerWithJustification)

    result = structured_chat_model.invoke(
        "What weighs more, a pound of bricks or a pound of feathers?"
    )

    assert isinstance(result, AnswerWithJustification)
    assert isinstance(result.answer, str)
    assert isinstance(result.justification, str)
    assert len(result.answer) > 0
    assert len(result.justification) > 0

    structured_chat_model_with_raw = chat_model.with_structured_output(
        AnswerWithJustification, include_raw=True
    )

    result_with_raw = structured_chat_model_with_raw.invoke(
        "What weighs more, a pound of bricks or a pound of feathers?"
    )

    assert isinstance(result_with_raw, dict)
    assert "raw" in result_with_raw
    assert "parsed" in result_with_raw
    assert "parsing_error" in result_with_raw
    assert isinstance(result_with_raw["raw"], BaseMessage)
    assert isinstance(result_with_raw["parsed"], AnswerWithJustification)
    assert result_with_raw["parsing_error"] is None
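A condensed end-user sketch of the structured-output flow exercised by the test above. The schema and question mirror the test; the model name is illustrative and any Pydantic model would work.

from pydantic import BaseModel

from langchain_community.chat_models.outlines import ChatOutlines


class AnswerWithJustification(BaseModel):
    answer: str
    justification: str


chat = ChatOutlines(model="microsoft/Phi-3-mini-4k-instruct")
# with_structured_output constrains generation to the schema and parses the
# result back into the Pydantic model.
structured_chat = chat.with_structured_output(AnswerWithJustification)
result = structured_chat.invoke(
    "What weighs more, a pound of bricks or a pound of feathers?"
)
print(result.answer, "-", result.justification)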
123
libs/community/tests/integration_tests/llms/test_outlines.py
Normal file
@@ -0,0 +1,123 @@
# flake8: noqa
"""Test Outlines wrapper."""

from typing import Generator
import re
import platform
import pytest

from langchain_community.llms.outlines import Outlines
from pydantic import BaseModel

from tests.unit_tests.callbacks.fake_callback_handler import FakeCallbackHandler


MODEL = "microsoft/Phi-3-mini-4k-instruct"
LLAMACPP_MODEL = "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf"

BACKENDS = ["transformers", "llamacpp"]
if platform.system() != "Darwin":
    BACKENDS.append("vllm")
if platform.system() == "Darwin":
    BACKENDS.append("mlxlm")


@pytest.fixture(params=BACKENDS)
def llm(request: pytest.FixtureRequest) -> Outlines:
    if request.param == "llamacpp":
        return Outlines(model=LLAMACPP_MODEL, backend=request.param, max_tokens=100)
    else:
        return Outlines(model=MODEL, backend=request.param, max_tokens=100)


def test_outlines_inference(llm: Outlines) -> None:
    """Test valid outlines inference."""
    output = llm.invoke("Say foo:")
    assert isinstance(output, str)
    assert len(output) > 1


def test_outlines_streaming(llm: Outlines) -> None:
    """Test streaming tokens from Outlines."""
    generator = llm.stream("Q: How do you say 'hello' in Spanish?\n\nA:")
    stream_results_string = ""
    assert isinstance(generator, Generator)

    for chunk in generator:
        print(chunk)
        assert isinstance(chunk, str)
        stream_results_string += chunk
    print(stream_results_string)
    assert len(stream_results_string.strip()) > 1


def test_outlines_streaming_callback(llm: Outlines) -> None:
    """Test that streaming correctly invokes on_llm_new_token callback."""
    MIN_CHUNKS = 5

    callback_handler = FakeCallbackHandler()
    llm.callbacks = [callback_handler]
    llm.verbose = True
    llm.invoke("Q: Can you count to 10? A:'1, ")
    assert callback_handler.llm_streams >= MIN_CHUNKS


def test_outlines_regex(llm: Outlines) -> None:
    """Test regex for generating a valid IP address"""
    ip_regex = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"
    llm.regex = ip_regex
    assert llm.regex == ip_regex

    output = llm.invoke("Q: What is the IP address of Google's DNS server?\n\nA: ")

    assert isinstance(output, str)

    assert re.match(
        ip_regex, output
    ), f"Generated output '{output}' is not a valid IP address"


def test_outlines_type_constraints(llm: Outlines) -> None:
    """Test type constraints for generating an integer"""
    llm.type_constraints = int
    output = llm.invoke(
        "Q: What is the answer to life, the universe, and everything?\n\nA: "
    )
    assert int(output)


def test_outlines_json(llm: Outlines) -> None:
    """Test json for generating a valid JSON object"""

    class Person(BaseModel):
        name: str

    llm.json_schema = Person
    output = llm.invoke("Q: Who is the author of LangChain?\n\nA: ")
    person = Person.model_validate_json(output)
    assert isinstance(person, Person)


def test_outlines_grammar(llm: Outlines) -> None:
    """Test grammar for generating a valid arithmetic expression"""
    llm.grammar = """
        ?start: expression
        ?expression: term (("+" | "-") term)*
        ?term: factor (("*" | "/") factor)*
        ?factor: NUMBER | "-" factor | "(" expression ")"
        %import common.NUMBER
        %import common.WS
        %ignore WS
    """

    output = llm.invoke("Here is a complex arithmetic expression: ")

    # Validate the output is a non-empty string
    assert (
        isinstance(output, str) and output.strip()
    ), "Output should be a non-empty string"

    # Use a simple regex to check if the output contains basic arithmetic operations and numbers
    assert re.search(
        r"[\d\+\-\*/\(\)]+", output
    ), f"Generated output '{output}' does not appear to be a valid arithmetic expression"
@@ -36,6 +36,7 @@ EXPECTED_ALL = [
    "ChatOCIModelDeploymentTGI",
    "ChatOllama",
    "ChatOpenAI",
    "ChatOutlines",
    "ChatPerplexity",
    "ChatPremAI",
    "ChatSambaNovaCloud",
91
libs/community/tests/unit_tests/chat_models/test_outlines.py
Normal file
@@ -0,0 +1,91 @@
import pytest
from _pytest.monkeypatch import MonkeyPatch
from pydantic import BaseModel, Field

from langchain_community.chat_models.outlines import ChatOutlines


def test_chat_outlines_initialization(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)

    chat = ChatOutlines(
        model="microsoft/Phi-3-mini-4k-instruct",
        max_tokens=42,
        stop=["\n"],
    )
    assert chat.model == "microsoft/Phi-3-mini-4k-instruct"
    assert chat.max_tokens == 42
    assert chat.backend == "transformers"
    assert chat.stop == ["\n"]


def test_chat_outlines_backend_llamacpp(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)
    chat = ChatOutlines(
        model="TheBloke/Llama-2-7B-Chat-GGUF/llama-2-7b-chat.Q4_K_M.gguf",
        backend="llamacpp",
    )
    assert chat.backend == "llamacpp"


def test_chat_outlines_backend_vllm(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)
    chat = ChatOutlines(model="microsoft/Phi-3-mini-4k-instruct", backend="vllm")
    assert chat.backend == "vllm"


def test_chat_outlines_backend_mlxlm(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)
    chat = ChatOutlines(model="microsoft/Phi-3-mini-4k-instruct", backend="mlxlm")
    assert chat.backend == "mlxlm"


def test_chat_outlines_with_regex(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)
    regex = r"\d{3}-\d{3}-\d{4}"
    chat = ChatOutlines(model="microsoft/Phi-3-mini-4k-instruct", regex=regex)
    assert chat.regex == regex


def test_chat_outlines_with_type_constraints(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)
    chat = ChatOutlines(model="microsoft/Phi-3-mini-4k-instruct", type_constraints=int)
    assert chat.type_constraints == int  # noqa


def test_chat_outlines_with_json_schema(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)

    class TestSchema(BaseModel):
        name: str = Field(description="A person's name")
        age: int = Field(description="A person's age")

    chat = ChatOutlines(
        model="microsoft/Phi-3-mini-4k-instruct", json_schema=TestSchema
    )
    assert chat.json_schema == TestSchema


def test_chat_outlines_with_grammar(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)

    grammar = """
    ?start: expression
    ?expression: term (("+" | "-") term)*
    ?term: factor (("*" | "/") factor)*
    ?factor: NUMBER | "-" factor | "(" expression ")"
    %import common.NUMBER
    """
    chat = ChatOutlines(model="microsoft/Phi-3-mini-4k-instruct", grammar=grammar)
    assert chat.grammar == grammar


def test_raise_for_multiple_output_constraints(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)

    with pytest.raises(ValueError):
        ChatOutlines(
            model="microsoft/Phi-3-mini-4k-instruct",
            type_constraints=int,
            regex=r"\d{3}-\d{3}-\d{4}",
        )
@ -67,6 +67,7 @@ EXPECT_ALL = [
|
|||||||
"OpenAIChat",
|
"OpenAIChat",
|
||||||
"OpenLLM",
|
"OpenLLM",
|
||||||
"OpenLM",
|
"OpenLM",
|
||||||
|
"Outlines",
|
||||||
"PaiEasEndpoint",
|
"PaiEasEndpoint",
|
||||||
"Petals",
|
"Petals",
|
||||||
"PipelineAI",
|
"PipelineAI",
|
||||||
|
92
libs/community/tests/unit_tests/llms/test_outlines.py
Normal file
@@ -0,0 +1,92 @@
import pytest
from _pytest.monkeypatch import MonkeyPatch

from langchain_community.llms.outlines import Outlines


def test_outlines_initialization(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(Outlines, "build_client", lambda self: self)

    llm = Outlines(
        model="microsoft/Phi-3-mini-4k-instruct",
        max_tokens=42,
        stop=["\n"],
    )
    assert llm.model == "microsoft/Phi-3-mini-4k-instruct"
    assert llm.max_tokens == 42
    assert llm.backend == "transformers"
    assert llm.stop == ["\n"]


def test_outlines_backend_llamacpp(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(Outlines, "build_client", lambda self: self)
    llm = Outlines(
        model="TheBloke/Llama-2-7B-Chat-GGUF/llama-2-7b-chat.Q4_K_M.gguf",
        backend="llamacpp",
    )
    assert llm.backend == "llamacpp"


def test_outlines_backend_vllm(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(Outlines, "build_client", lambda self: self)
    llm = Outlines(model="microsoft/Phi-3-mini-4k-instruct", backend="vllm")
    assert llm.backend == "vllm"


def test_outlines_backend_mlxlm(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(Outlines, "build_client", lambda self: self)
    llm = Outlines(model="microsoft/Phi-3-mini-4k-instruct", backend="mlxlm")
    assert llm.backend == "mlxlm"


def test_outlines_with_regex(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(Outlines, "build_client", lambda self: self)
    regex = r"\d{3}-\d{3}-\d{4}"
    llm = Outlines(model="microsoft/Phi-3-mini-4k-instruct", regex=regex)
    assert llm.regex == regex


def test_outlines_with_type_constraints(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(Outlines, "build_client", lambda self: self)
    llm = Outlines(model="microsoft/Phi-3-mini-4k-instruct", type_constraints=int)
    assert llm.type_constraints == int  # noqa


def test_outlines_with_json_schema(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(Outlines, "build_client", lambda self: self)
    from pydantic import BaseModel, Field

    class TestSchema(BaseModel):
        name: str = Field(description="A person's name")
        age: int = Field(description="A person's age")

    llm = Outlines(model="microsoft/Phi-3-mini-4k-instruct", json_schema=TestSchema)
    assert llm.json_schema == TestSchema


def test_outlines_with_grammar(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(Outlines, "build_client", lambda self: self)
    grammar = """
    ?start: expression
    ?expression: term (("+" | "-") term)*
    ?term: factor (("*" | "/") factor)*
    ?factor: NUMBER | "-" factor | "(" expression ")"
    %import common.NUMBER
    """
    llm = Outlines(model="microsoft/Phi-3-mini-4k-instruct", grammar=grammar)
    assert llm.grammar == grammar


def test_raise_for_multiple_output_constraints(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(Outlines, "build_client", lambda self: self)
    with pytest.raises(ValueError):
        Outlines(
            model="microsoft/Phi-3-mini-4k-instruct",
            type_constraints=int,
            regex=r"\d{3}-\d{3}-\d{4}",
        )

        Outlines(
            model="microsoft/Phi-3-mini-4k-instruct",
            type_constraints=int,
            regex=r"\d{3}-\d{3}-\d{4}",
        )