community: Outlines integration (#27449)

In collaboration with @rlouf I built an
[Outlines](https://dottxt-ai.github.io/outlines/latest/) integration for
LangChain!

I think this is really useful for doing any kind of structured output
locally.
[Dottxt](https://dottxt.co) has put a lot of work into optimising this process
at a lower level
([outlines-core](https://pypi.org/project/outlines-core/0.1.14/), written
in Rust), so I think this is a better alternative to the current
approaches for structured output in LangChain.
It also implements the `.with_structured_output` method, so it should be
a drop-in replacement for a lot of applications.
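
A minimal sketch of how it slots in (the model, backend, and `Joke` schema here are just for illustration):

```python
from pydantic import BaseModel

from langchain_community.chat_models import ChatOutlines


class Joke(BaseModel):
    setup: str
    punchline: str


# Any supported backend works the same way; model/backend here are illustrative.
model = ChatOutlines(model="microsoft/Phi-3-mini-4k-instruct", backend="transformers")
structured_model = model.with_structured_output(Joke)
joke = structured_model.invoke("Tell me a joke about cats.")  # -> Joke instance
```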

The integration includes:
- **Outlines LLM class**
- **ChatOutlines class**
- **Tutorial Cookbooks**
- **Documentation Page**
- **Validation and error messages**
- **Exposes Outlines structured output features**
- **Support for multiple backends**
- **Integration and Unit Tests**

Dependencies: `outlines`, plus additional packages depending on the backend used

I am not sure whether the unit tests comply with all requirements; if not, I
suggest simply removing them, since I don't see a more useful way to write
them.

### Quick overview:

Chat Models:
<img width="698" alt="image"
src="https://github.com/user-attachments/assets/05a499b9-858c-4397-a9ff-165c2b3e7acc">

Structured Output:
<img width="955" alt="image"
src="https://github.com/user-attachments/assets/b9fcac11-d3e5-4698-b1ae-8c4cb3d54c45">

---------

Co-authored-by: Vadym Barda <vadym@langchain.dev>
shroominic 2024-11-21 05:31:31 +08:00 committed by GitHub
parent 2901fa20cc
commit dee72c46c1
14 changed files with 2162 additions and 0 deletions


@ -0,0 +1,348 @@
{
"cells": [
{
"cell_type": "raw",
"id": "afaf8039",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Outlines\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "e49f1e0d",
"metadata": {},
"source": [
"# ChatOutlines\n",
"\n",
"This will help you getting started with Outlines [chat models](/docs/concepts/chat_models/). For detailed documentation of all ChatOutlines features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/outlines.chat_models.ChatOutlines.html).\n",
"\n",
"[Outlines](https://github.com/outlines-dev/outlines) is a library for constrained language generation. It allows you to use large language models (LLMs) with various backends while applying constraints to the generated output.\n",
"\n",
"## Overview\n",
"### Integration details\n",
"\n",
"| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
"| [ChatOutlines](https://api.python.langchain.com/en/latest/chat_models/outlines.chat_models.ChatOutlines.html) | [langchain-community](https://api.python.langchain.com/en/latest/community_api_reference.html) | ✅ | ❌ | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-community?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-community?style=flat-square&label=%20) |\n",
"\n",
"### Model features\n",
"| [Tool calling](/docs/how_to/tool_calling) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
"| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n",
"| ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | \n",
"\n",
"## Setup\n",
"\n",
"To access Outlines models you'll need to have an internet connection to download the model weights from huggingface. Depending on the backend you need to install the required dependencies (see [Outlines docs](https://dottxt-ai.github.io/outlines/latest/installation/))\n",
"\n",
"### Credentials\n",
"\n",
"There is no built-in auth mechanism for Outlines.\n",
"\n",
"### Installation\n",
"\n",
"The LangChain Outlines integration lives in the `langchain-community` package and requires the `outlines` library:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "652d6238-1f87-422a-b135-f5abbb8652fc",
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain-community outlines"
]
},
{
"cell_type": "markdown",
"id": "a38cde65-254d-4219-a441-068766c0d4b5",
"metadata": {},
"source": [
"## Instantiation\n",
"\n",
"Now we can instantiate our model object and generate chat completions:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.chat_models.outlines import ChatOutlines\n",
"\n",
"# For llamacpp backend\n",
"model = ChatOutlines(model=\"TheBloke/phi-2-GGUF/phi-2.Q4_K_M.gguf\", backend=\"llamacpp\")\n",
"\n",
"# For vllm backend (not available on Mac)\n",
"model = ChatOutlines(model=\"meta-llama/Llama-3.2-1B\", backend=\"vllm\")\n",
"\n",
"# For mlxlm backend (only available on Mac)\n",
"model = ChatOutlines(model=\"mistralai/Ministral-8B-Instruct-2410\", backend=\"mlxlm\")\n",
"\n",
"# For huggingface transformers backend\n",
"model = ChatOutlines(model=\"microsoft/phi-2\") # defaults to transformers backend"
]
},
{
"cell_type": "markdown",
"id": "2b4f3e15",
"metadata": {},
"source": [
"## Invocation"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "62e0dbc3",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain_core.messages import HumanMessage\n",
"\n",
"messages = [HumanMessage(content=\"What will the capital of mars be called?\")]\n",
"response = model.invoke(messages)\n",
"\n",
"response.content"
]
},
{
"cell_type": "markdown",
"id": "18e2bfc0-7e78-4528-a73f-499ac150dca8",
"metadata": {},
"source": [
"## Streaming\n",
"\n",
"ChatOutlines supports streaming of tokens:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e197d1d7-a070-4c96-9f8a-a0e86d046e0b",
"metadata": {},
"outputs": [],
"source": [
"messages = [HumanMessage(content=\"Count to 10 in French:\")]\n",
"\n",
"for chunk in model.stream(messages):\n",
" print(chunk.content, end=\"\", flush=True)"
]
},
{
"cell_type": "markdown",
"id": "ccc3e2f6",
"metadata": {},
"source": [
"## Chaining"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5a032003",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\n",
" \"system\",\n",
" \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n",
" ),\n",
" (\"human\", \"{input}\"),\n",
" ]\n",
")\n",
"\n",
"chain = prompt | model\n",
"chain.invoke(\n",
" {\n",
" \"input_language\": \"English\",\n",
" \"output_language\": \"German\",\n",
" \"input\": \"I love programming.\",\n",
" }\n",
")"
]
},
{
"cell_type": "markdown",
"id": "d1ee55bc-ffc8-4cfa-801c-993953a08cfd",
"metadata": {},
"source": [
"## Constrained Generation\n",
"\n",
"ChatOutlines allows you to apply various constraints to the generated output:\n",
"\n",
"### Regex Constraint"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
"metadata": {},
"outputs": [],
"source": [
"model.regex = r\"((25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\\.){3}(25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\"\n",
"\n",
"response = model.invoke(\"What is the IP address of Google's DNS server?\")\n",
"\n",
"response.content"
]
},
{
"cell_type": "markdown",
"id": "4a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
"metadata": {},
"source": [
"### Type Constraints"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
"metadata": {},
"outputs": [],
"source": [
"model.type_constraints = int\n",
"response = model.invoke(\"What is the answer to life, the universe, and everything?\")\n",
"\n",
"response.content"
]
},
{
"cell_type": "markdown",
"id": "6a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
"metadata": {},
"source": [
"### Pydantic and JSON Schemas"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
"metadata": {},
"outputs": [],
"source": [
"from pydantic import BaseModel\n",
"\n",
"\n",
"class Person(BaseModel):\n",
" name: str\n",
"\n",
"\n",
"model.json_schema = Person\n",
"response = model.invoke(\"Who are the main contributors to LangChain?\")\n",
"person = Person.model_validate_json(response.content)\n",
"\n",
"person"
]
},
{
"cell_type": "markdown",
"id": "8a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
"metadata": {},
"source": [
"### Context Free Grammars"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
"metadata": {},
"outputs": [],
"source": [
"model.grammar = \"\"\"\n",
"?start: expression\n",
"?expression: term ((\"+\" | \"-\") term)*\n",
"?term: factor ((\"*\" | \"/\") factor)*\n",
"?factor: NUMBER | \"-\" factor | \"(\" expression \")\"\n",
"%import common.NUMBER\n",
"%import common.WS\n",
"%ignore WS\n",
"\"\"\"\n",
"response = model.invoke(\"Give me a complex arithmetic expression:\")\n",
"\n",
"response.content"
]
},
{
"cell_type": "markdown",
"id": "aa5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
"metadata": {},
"source": [
"## LangChain's Structured Output\n",
"\n",
"You can also use LangChain's Structured Output with ChatOutlines:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ba5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
"metadata": {},
"outputs": [],
"source": [
"from pydantic import BaseModel\n",
"\n",
"\n",
"class AnswerWithJustification(BaseModel):\n",
" answer: str\n",
" justification: str\n",
"\n",
"\n",
"_model = model.with_structured_output(AnswerWithJustification)\n",
"result = _model.invoke(\"What weighs more, a pound of bricks or a pound of feathers?\")\n",
"\n",
"result"
]
},
{
"cell_type": "markdown",
"id": "ca5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
"metadata": {},
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all ChatOutlines features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/outlines.chat_models.ChatOutlines.html\n",
"\n",
"## Full Outlines Documentation: \n",
"\n",
"https://dottxt-ai.github.io/outlines/latest/"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}


@ -0,0 +1,268 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Outlines\n",
"\n",
"This will help you getting started with Outlines LLM. For detailed documentation of all Outlines features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/llms/outlines.llms.Outlines.html).\n",
"\n",
"[Outlines](https://github.com/outlines-dev/outlines) is a library for constrained language generation. It allows you to use large language models (LLMs) with various backends while applying constraints to the generated output.\n",
"\n",
"## Overview\n",
"\n",
"### Integration details\n",
"| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
"| [Outlines](https://python.langchain.com/api_reference/community/llms/langchain_community.llms.outlines.Outlines.html) | [langchain-community](https://python.langchain.com/api_reference/community/index.html) | ✅ | beta | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-community?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-community?style=flat-square&label=%20) |\n",
"\n",
"## Setup\n",
"\n",
"To access Outlines models you'll need to have an internet connection to download the model weights from huggingface. Depending on the backend you need to install the required dependencies (see [Outlines docs](https://dottxt-ai.github.io/outlines/latest/installation/))\n",
"\n",
"### Credentials\n",
"\n",
"There is no built-in auth mechanism for Outlines.\n",
"\n",
"## Installation\n",
"\n",
"The LangChain Outlines integration lives in the `langchain-community` package and requires the `outlines` library:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "shellscript"
}
},
"outputs": [],
"source": [
"%pip install -qU langchain-community outlines"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Instantiation\n",
"\n",
"Now we can instantiate our model object and generate chat completions:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.llms import Outlines\n",
"\n",
"# For use with llamacpp backend\n",
"model = Outlines(model=\"microsoft/Phi-3-mini-4k-instruct\", backend=\"llamacpp\")\n",
"\n",
"# For use with vllm backend (not available on Mac)\n",
"model = Outlines(model=\"microsoft/Phi-3-mini-4k-instruct\", backend=\"vllm\")\n",
"\n",
"# For use with mlxlm backend (only available on Mac)\n",
"model = Outlines(model=\"microsoft/Phi-3-mini-4k-instruct\", backend=\"mlxlm\")\n",
"\n",
"# For use with huggingface transformers backend\n",
"model = Outlines(\n",
" model=\"microsoft/Phi-3-mini-4k-instruct\"\n",
") # defaults to backend=\"transformers\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Invocation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model.invoke(\"Hello how are you?\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Chaining"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"prompt = PromptTemplate.from_template(\"How to say {input} in {output_language}:\\n\")\n",
"\n",
"chain = prompt | model\n",
"chain.invoke(\n",
" {\n",
" \"output_language\": \"German\",\n",
" \"input\": \"I love programming.\",\n",
" }\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Streaming\n",
"\n",
"Outlines supports streaming of tokens:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for chunk in model.stream(\"Count to 10 in French:\"):\n",
" print(chunk, end=\"\", flush=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Constrained Generation\n",
"\n",
"Outlines allows you to apply various constraints to the generated output:\n",
"\n",
"#### Regex Constraint"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model.regex = r\"((25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\\.){3}(25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\"\n",
"response = model.invoke(\"What is the IP address of Google's DNS server?\")\n",
"\n",
"response"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Type Constraints"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model.type_constraints = int\n",
"response = model.invoke(\"What is the answer to life, the universe, and everything?\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### JSON Schema"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from pydantic import BaseModel\n",
"\n",
"\n",
"class Person(BaseModel):\n",
" name: str\n",
"\n",
"\n",
"model.json_schema = Person\n",
"response = model.invoke(\"Who is the author of LangChain?\")\n",
"person = Person.model_validate_json(response)\n",
"\n",
"person"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Grammar Constraint"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model.grammar = \"\"\"\n",
"?start: expression\n",
"?expression: term ((\"+\" | \"-\") term)\n",
"?term: factor ((\"\" | \"/\") factor)\n",
"?factor: NUMBER | \"-\" factor | \"(\" expression \")\"\n",
"%import common.NUMBER\n",
"%import common.WS\n",
"%ignore WS\n",
"\"\"\"\n",
"response = model.invoke(\"Give me a complex arithmetic expression:\")\n",
"\n",
"response"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all ChatOutlines features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/outlines.chat_models.ChatOutlines.html\n",
"\n",
"## Outlines Documentation: \n",
"\n",
"https://dottxt-ai.github.io/outlines/latest/"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.9"
}
},
"nbformat": 4,
"nbformat_minor": 2
}


@ -0,0 +1,201 @@
# Outlines
>[Outlines](https://github.com/dottxt-ai/outlines) is a Python library for constrained language generation. It provides a unified interface to various language models and allows for structured generation using techniques like regex matching, type constraints, JSON schemas, and context-free grammars.

Outlines supports multiple backends, including:
- Hugging Face Transformers
- llama.cpp
- vLLM
- MLX

This integration allows you to use Outlines models with LangChain, providing both LLM and chat model interfaces.
## Installation and Setup
To use Outlines with LangChain, you'll need to install the Outlines library:
```bash
pip install outlines
```
Depending on the backend you choose, you may need to install additional dependencies:
- For Transformers: `pip install transformers torch datasets`
- For llama.cpp: `pip install llama-cpp-python`
- For vLLM: `pip install vllm`
- For MLX: `pip install mlx`
## LLM
To use Outlines as an LLM in LangChain, you can use the `Outlines` class:
```python
from langchain_community.llms import Outlines
```
## Chat Models
To use Outlines as a chat model in LangChain, you can use the `ChatOutlines` class:
```python
from langchain_community.chat_models import ChatOutlines
```
## Model Configuration
Both `Outlines` and `ChatOutlines` classes share similar configuration options:
```python
model = Outlines(
model="meta-llama/Llama-2-7b-chat-hf", # Model identifier
backend="transformers", # Backend to use (transformers, llamacpp, vllm, or mlxlm)
max_tokens=256, # Maximum number of tokens to generate
stop=["\n"], # Optional list of stop strings
streaming=True, # Whether to stream the output
# Additional parameters for structured generation:
regex=None,
type_constraints=None,
json_schema=None,
grammar=None,
# Additional model parameters:
model_kwargs={"temperature": 0.7}
)
```
### Model Identifier
The `model` parameter can be:
- A Hugging Face model name (e.g., "meta-llama/Llama-2-7b-chat-hf")
- A local path to a model
- For GGUF models, the format is "repo_id/file_name" (e.g., "TheBloke/Llama-2-7B-Chat-GGUF/llama-2-7b-chat.Q4_K_M.gguf")
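For example, a GGUF model loaded through the llama.cpp backend is referenced with the full `repo_id/file_name` path (a sketch; the repo and file shown are the example above):

```python
from langchain_community.llms import Outlines

# GGUF model in "creator/repo_name/file_name" form, served via llama.cpp
llm = Outlines(
    model="TheBloke/Llama-2-7B-Chat-GGUF/llama-2-7b-chat.Q4_K_M.gguf",
    backend="llamacpp",
)
```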
### Backend Options
The `backend` parameter specifies which backend to use:
- `"transformers"`: For Hugging Face Transformers models (default)
- `"llamacpp"`: For GGUF models using llama.cpp
- `"transformers_vision"`: For vision-language models (e.g., LLaVA)
- `"vllm"`: For models using the vLLM library
- `"mlxlm"`: For models using the MLX framework
### Structured Generation
Outlines provides several methods for structured generation:
1. **Regex Matching**:
```python
model = Outlines(
model="meta-llama/Llama-2-7b-chat-hf",
regex=r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"
)
```
This will ensure the generated text matches the specified regex pattern (in this case, a valid IP address).
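Invoking the model then returns a string matching the pattern, roughly like this:

```python
# The output is constrained to the IP-address regex above,
# e.g. "8.8.8.8" (the exact value depends on the model).
ip_address = model.invoke("What is the IP address of Google's DNS server?")
```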
2. **Type Constraints**:
```python
model = Outlines(
model="meta-llama/Llama-2-7b-chat-hf",
type_constraints=int
)
```
This restricts the output to valid Python types (int, float, bool, datetime.date, datetime.time, datetime.datetime).
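Because the output is guaranteed to be a valid integer literal, it can be parsed directly:

```python
# type_constraints=int means the raw output parses cleanly as an integer.
answer = int(model.invoke("What is the answer to life, the universe, and everything?"))
```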
3. **JSON Schema**:
```python
from pydantic import BaseModel
class Person(BaseModel):
name: str
age: int
model = Outlines(
model="meta-llama/Llama-2-7b-chat-hf",
json_schema=Person
)
```
This ensures the generated output adheres to the specified JSON schema or Pydantic model.
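The generated output can then be validated straight back into the Pydantic model:

```python
# The output is valid JSON for the Person schema defined above.
person = Person.model_validate_json(model.invoke("Who is the author of LangChain?"))
```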
4. **Context-Free Grammar**:
```python
model = Outlines(
model="meta-llama/Llama-2-7b-chat-hf",
grammar="""
?start: expression
?expression: term (("+" | "-") term)*
?term: factor (("*" | "/") factor)*
?factor: NUMBER | "-" factor | "(" expression ")"
%import common.NUMBER
"""
)
```
This generates text that adheres to the specified context-free grammar in EBNF format.
## Usage Examples
### LLM Example
```python
from langchain_community.llms import Outlines
llm = Outlines(model="meta-llama/Llama-2-7b-chat-hf", max_tokens=100)
result = llm.invoke("Tell me a short story about a robot.")
print(result)
```
### Chat Model Example
```python
from langchain_community.chat_models import ChatOutlines
from langchain_core.messages import HumanMessage, SystemMessage
chat = ChatOutlines(model="meta-llama/Llama-2-7b-chat-hf", max_tokens=100)
messages = [
SystemMessage(content="You are a helpful AI assistant."),
HumanMessage(content="What's the capital of France?")
]
result = chat.invoke(messages)
print(result.content)
```
### Streaming Example
```python
from langchain_community.chat_models import ChatOutlines
from langchain_core.messages import HumanMessage
chat = ChatOutlines(model="meta-llama/Llama-2-7b-chat-hf", streaming=True)
for chunk in chat.stream("Tell me a joke about programming."):
print(chunk.content, end="", flush=True)
print()
```
### Structured Output Example
```python
from langchain_community.llms import Outlines
from pydantic import BaseModel
class MovieReview(BaseModel):
title: str
rating: int
summary: str
llm = Outlines(
model="meta-llama/Llama-2-7b-chat-hf",
json_schema=MovieReview
)
result = llm.invoke("Write a short review for the movie 'Inception'.")
print(result)
```
## Additional Features
### Tokenizer Access
You can access the underlying tokenizer for the model:
```python
tokenizer = llm.tokenizer
encoded = tokenizer.encode("Hello, world!")
decoded = tokenizer.decode(encoded)
```


@ -55,6 +55,7 @@ openai<2
openapi-pydantic>=0.3.2,<0.4
oracle-ads>=2.9.1,<3
oracledb>=2.2.0,<3
outlines[test]>=0.1.0,<0.2
pandas>=2.0.1,<3
pdfminer-six>=20221105,<20240706
pgvector>=0.1.6,<0.2


@ -143,6 +143,7 @@ if TYPE_CHECKING:
from langchain_community.chat_models.openai import (
ChatOpenAI,
)
from langchain_community.chat_models.outlines import ChatOutlines
from langchain_community.chat_models.pai_eas_endpoint import (
PaiEasChatEndpoint,
)
@ -228,6 +229,7 @@ __all__ = [
"ChatOCIModelDeploymentTGI",
"ChatOllama",
"ChatOpenAI",
"ChatOutlines",
"ChatPerplexity",
"ChatReka",
"ChatPremAI",
@ -294,6 +296,7 @@ _module_lookup = {
"ChatOCIModelDeploymentTGI": "langchain_community.chat_models.oci_data_science",
"ChatOllama": "langchain_community.chat_models.ollama",
"ChatOpenAI": "langchain_community.chat_models.openai",
"ChatOutlines": "langchain_community.chat_models.outlines",
"ChatReka": "langchain_community.chat_models.reka",
"ChatPerplexity": "langchain_community.chat_models.perplexity",
"ChatSambaNovaCloud": "langchain_community.chat_models.sambanova",


@ -0,0 +1,532 @@
from __future__ import annotations
import importlib.util
import platform
from collections.abc import AsyncIterator
from typing import (
Any,
Callable,
Dict,
Iterator,
List,
Optional,
Sequence,
Tuple,
Type,
TypedDict,
TypeVar,
Union,
get_origin,
)
from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.callbacks.manager import AsyncCallbackManagerForLLMRun
from langchain_core.language_models import LanguageModelInput
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.messages import AIMessage, AIMessageChunk, BaseMessage
from langchain_core.output_parsers import JsonOutputParser, PydanticOutputParser
from langchain_core.outputs import ChatGeneration, ChatGenerationChunk, ChatResult
from langchain_core.runnables import Runnable
from langchain_core.tools import BaseTool
from langchain_core.utils.function_calling import convert_to_openai_tool
from pydantic import BaseModel, Field, model_validator
from typing_extensions import Literal
from langchain_community.adapters.openai import convert_message_to_dict
_BM = TypeVar("_BM", bound=BaseModel)
_DictOrPydanticClass = Union[Dict[str, Any], Type[_BM], Type]
class ChatOutlines(BaseChatModel):
"""Outlines chat model integration.
Setup:
pip install outlines
Key init args client params:
backend: Literal["llamacpp", "transformers", "transformers_vision", "vllm", "mlxlm"] = "transformers"
Specifies the backend to use for the model.
Key init args completion params:
model: str
Identifier for the model to use with Outlines.
max_tokens: int = 256
The maximum number of tokens to generate.
stop: Optional[List[str]] = None
A list of strings to stop generation when encountered.
streaming: bool = True
Whether to stream the results, token by token.
See full list of supported init args and their descriptions in the params section.
Instantiate:
from langchain_community.chat_models import ChatOutlines
chat = ChatOutlines(model="meta-llama/Llama-2-7b-chat-hf")
Invoke:
chat.invoke([HumanMessage(content="Say foo:")])
Stream:
for chunk in chat.stream([HumanMessage(content="Count to 10:")]):
print(chunk.content, end="", flush=True)
""" # noqa: E501
client: Any = None # :meta private:
model: str
"""Identifier for the model to use with Outlines.
The model identifier should be a string specifying:
- A Hugging Face model name (e.g., "meta-llama/Llama-2-7b-chat-hf")
- A local path to a model
- For GGUF models, the format is "repo_id/file_name"
(e.g., "TheBloke/Llama-2-7B-Chat-GGUF/llama-2-7b-chat.Q4_K_M.gguf")
Examples:
- "TheBloke/Llama-2-7B-Chat-GGUF/llama-2-7b-chat.Q4_K_M.gguf"
- "meta-llama/Llama-2-7b-chat-hf"
"""
backend: Literal[
"llamacpp", "transformers", "transformers_vision", "vllm", "mlxlm"
] = "transformers"
"""Specifies the backend to use for the model.
Supported backends are:
- "llamacpp": For GGUF models using llama.cpp
- "transformers": For Hugging Face Transformers models (default)
- "transformers_vision": For vision-language models (e.g., LLaVA)
- "vllm": For models using the vLLM library
- "mlxlm": For models using the MLX framework
Note: Ensure you have the necessary dependencies installed for the chosen backend.
The system will attempt to import required packages and may raise an ImportError
if they are not available.
"""
max_tokens: int = 256
"""The maximum number of tokens to generate."""
stop: Optional[List[str]] = None
"""A list of strings to stop generation when encountered."""
streaming: bool = True
"""Whether to stream the results, token by token."""
regex: Optional[str] = None
"""Regular expression for structured generation.
If provided, Outlines will guarantee that the generated text matches this regex.
This can be useful for generating structured outputs like IP addresses, dates, etc.
Example: (valid IP address)
regex = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"
Note: Computing the regex index can take some time, so it's recommended to reuse
the same regex for multiple generations if possible.
For more details, see: https://dottxt-ai.github.io/outlines/reference/generation/regex/
"""
type_constraints: Optional[Union[type, str]] = None
"""Type constraints for structured generation.
Restricts the output to valid Python types. Supported types include:
int, float, bool, datetime.date, datetime.time, datetime.datetime.
Example:
type_constraints = int
For more details, see: https://dottxt-ai.github.io/outlines/reference/generation/format/
"""
json_schema: Optional[Union[Any, Dict, Callable]] = None
"""Pydantic model, JSON Schema, or callable (function signature)
for structured JSON generation.
Outlines can generate JSON output that follows a specified structure,
which is useful for:
1. Parsing the answer (e.g., with Pydantic), storing it, or returning it to a user.
2. Calling a function with the result.
You can provide:
- A Pydantic model
- A JSON Schema (as a Dict)
- A callable (function signature)
The generated JSON will adhere to the specified structure.
For more details, see: https://dottxt-ai.github.io/outlines/reference/generation/json/
"""
grammar: Optional[str] = None
"""Context-free grammar for structured generation.
If provided, Outlines will generate text that adheres to the specified grammar.
The grammar should be defined in EBNF format.
This can be useful for generating structured outputs like mathematical expressions,
programming languages, or custom domain-specific languages.
Example:
grammar = '''
?start: expression
?expression: term (("+" | "-") term)*
?term: factor (("*" | "/") factor)*
?factor: NUMBER | "-" factor | "(" expression ")"
%import common.NUMBER
'''
Note: Grammar-based generation is currently experimental and may have performance
limitations. It uses greedy generation to mitigate these issues.
For more details and examples, see:
https://dottxt-ai.github.io/outlines/reference/generation/cfg/
"""
custom_generator: Optional[Any] = None
"""Set your own outlines generator object to override the default behavior."""
model_kwargs: Dict[str, Any] = Field(default_factory=dict)
"""Additional parameters to pass to the underlying model.
Example:
model_kwargs = {"temperature": 0.8, "seed": 42}
"""
@model_validator(mode="after")
def validate_environment(self) -> "ChatOutlines":
"""Validate that outlines is installed and create a model instance."""
num_constraints = sum(
[
bool(self.regex),
bool(self.type_constraints),
bool(self.json_schema),
bool(self.grammar),
]
)
if num_constraints > 1:
raise ValueError(
"Either none or exactly one of regex, type_constraints, "
"json_schema, or grammar can be provided."
)
return self.build_client()
def build_client(self) -> "ChatOutlines":
try:
import outlines.models as models
except ImportError:
raise ImportError(
"Could not import the Outlines library. "
"Please install it with `pip install outlines`."
)
def check_packages_installed(
packages: List[Union[str, Tuple[str, str]]],
) -> None:
missing_packages = [
pkg if isinstance(pkg, str) else pkg[0]
for pkg in packages
if importlib.util.find_spec(pkg[1] if isinstance(pkg, tuple) else pkg)
is None
]
if missing_packages:
raise ImportError(
f"Missing packages: {', '.join(missing_packages)}. "
"You can install them with:\n\n"
f" pip install {' '.join(missing_packages)}"
)
if self.backend == "llamacpp":
check_packages_installed([("llama-cpp-python", "llama_cpp")])
if ".gguf" in self.model:
creator, repo_name, file_name = self.model.split("/", 2)
repo_id = f"{creator}/{repo_name}"
else:
raise ValueError("GGUF file_name must be provided for llama.cpp.")
self.client = models.llamacpp(repo_id, file_name, **self.model_kwargs)
elif self.backend == "transformers":
check_packages_installed(["transformers", "torch", "datasets"])
self.client = models.transformers(
model_name=self.model, **self.model_kwargs
)
elif self.backend == "transformers_vision":
if hasattr(models, "transformers_vision"):
from transformers import LlavaNextForConditionalGeneration
self.client = models.transformers_vision(
self.model,
model_class=LlavaNextForConditionalGeneration,
**self.model_kwargs,
)
else:
raise ValueError("transformers_vision backend is not supported")
elif self.backend == "vllm":
if platform.system() == "Darwin":
raise ValueError("vLLM backend is not supported on macOS.")
check_packages_installed(["vllm"])
self.client = models.vllm(self.model, **self.model_kwargs)
elif self.backend == "mlxlm":
check_packages_installed(["mlx"])
self.client = models.mlxlm(self.model, **self.model_kwargs)
else:
raise ValueError(f"Unsupported backend: {self.backend}")
return self
@property
def _llm_type(self) -> str:
return "outlines-chat"
@property
def _default_params(self) -> Dict[str, Any]:
return {
"max_tokens": self.max_tokens,
"stop_at": self.stop,
**self.model_kwargs,
}
@property
def _identifying_params(self) -> Dict[str, Any]:
return {
"model": self.model,
"backend": self.backend,
"regex": self.regex,
"type_constraints": self.type_constraints,
"json_schema": self.json_schema,
"grammar": self.grammar,
**self._default_params,
}
@property
def _generator(self) -> Any:
from outlines import generate
if self.custom_generator:
return self.custom_generator
constraints = [
self.regex,
self.type_constraints,
self.json_schema,
self.grammar,
]
num_constraints = sum(constraint is not None for constraint in constraints)
if num_constraints != 1 and num_constraints != 0:
raise ValueError(
"Either none or exactly one of regex, type_constraints, "
"json_schema, or grammar can be provided."
)
if self.regex:
return generate.regex(self.client, regex_str=self.regex)
if self.type_constraints:
return generate.format(self.client, python_type=self.type_constraints)
if self.json_schema:
return generate.json(self.client, schema_object=self.json_schema)
if self.grammar:
return generate.cfg(self.client, cfg_str=self.grammar)
return generate.text(self.client)
def _convert_messages_to_openai_format(
self, messages: list[BaseMessage]
) -> list[dict]:
return [convert_message_to_dict(message) for message in messages]
def _convert_messages_to_prompt(self, messages: list[BaseMessage]) -> str:
"""Convert a list of messages to a single prompt."""
if self.backend == "llamacpp": # get base_model_name from gguf repo_id
from huggingface_hub import ModelCard
repo_creator, gguf_repo_name, file_name = self.model.split("/")
model_card = ModelCard.load(f"{repo_creator}/{gguf_repo_name}")
if hasattr(model_card.data, "base_model"):
model_name = model_card.data.base_model
else:
raise ValueError(f"Base model name not found for {self.model}")
else:
model_name = self.model
from transformers import AutoTokenizer
return AutoTokenizer.from_pretrained(model_name).apply_chat_template(
self._convert_messages_to_openai_format(messages),
tokenize=False,
add_generation_prompt=True,
)
def bind_tools(
self,
tools: Sequence[Dict[str, Any] | type | Callable[..., Any] | BaseTool],
*,
tool_choice: Optional[Union[Dict, bool, str]] = None,
**kwargs: Any,
) -> Runnable[LanguageModelInput, BaseMessage]:
"""Bind tool-like objects to this chat model
tool_choice: does not currently support "any", "auto" choices like OpenAI
tool-calling API. should be a dict of the form to force this tool
{"type": "function", "function": {"name": <<tool_name>>}}.
"""
formatted_tools = [convert_to_openai_tool(tool) for tool in tools]
tool_names = [ft["function"]["name"] for ft in formatted_tools]
if tool_choice:
if isinstance(tool_choice, dict):
if not any(
tool_choice["function"]["name"] == name for name in tool_names
):
raise ValueError(
f"Tool choice {tool_choice=} was specified, but the only "
f"provided tools were {tool_names}."
)
elif isinstance(tool_choice, str):
chosen = [
f for f in formatted_tools if f["function"]["name"] == tool_choice
]
if not chosen:
raise ValueError(
f"Tool choice {tool_choice=} was specified, but the only "
f"provided tools were {tool_names}."
)
elif isinstance(tool_choice, bool):
if len(formatted_tools) > 1:
raise ValueError(
"tool_choice=True can only be specified when a single tool is "
f"passed in. Received {len(tools)} tools."
)
tool_choice = formatted_tools[0]
kwargs["tool_choice"] = tool_choice
formatted_tools = [convert_to_openai_tool(tool) for tool in tools]
return super().bind_tools(tools=formatted_tools, **kwargs)
def with_structured_output(
self,
schema: Optional[_DictOrPydanticClass],
*,
include_raw: bool = False,
**kwargs: Any,
) -> Runnable[LanguageModelInput, Union[dict, BaseModel]]:
if get_origin(schema) is TypedDict:
raise NotImplementedError("TypedDict is not supported yet by Outlines")
self.json_schema = schema
if isinstance(schema, type) and issubclass(schema, BaseModel):
parser: Union[PydanticOutputParser, JsonOutputParser] = (
PydanticOutputParser(pydantic_object=schema)
)
else:
parser = JsonOutputParser()
if include_raw: # TODO
raise NotImplementedError("include_raw is not yet supported")
return self | parser
def _generate(
self,
messages: List[BaseMessage],
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> ChatResult:
params = {**self._default_params, **kwargs}
if stop:
params["stop_at"] = stop
prompt = self._convert_messages_to_prompt(messages)
response = ""
if self.streaming:
for chunk in self._stream(
messages=messages,
stop=stop,
run_manager=run_manager,
**kwargs,
):
if isinstance(chunk.message.content, str):
response += chunk.message.content
else:
raise ValueError(
"Invalid content type, only str is supported, "
f"got {type(chunk.message.content)}"
)
else:
response = self._generator(prompt, **params)
message = AIMessage(content=response)
generation = ChatGeneration(message=message)
return ChatResult(generations=[generation])
def _stream(
self,
messages: List[BaseMessage],
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> Iterator[ChatGenerationChunk]:
params = {**self._default_params, **kwargs}
if stop:
params["stop_at"] = stop
prompt = self._convert_messages_to_prompt(messages)
for token in self._generator.stream(prompt, **params):
if run_manager:
run_manager.on_llm_new_token(token)
message_chunk = AIMessageChunk(content=token)
chunk = ChatGenerationChunk(message=message_chunk)
yield chunk
async def _agenerate(
self,
messages: List[BaseMessage],
stop: List[str] | None = None,
run_manager: AsyncCallbackManagerForLLMRun | None = None,
**kwargs: Any,
) -> ChatResult:
if hasattr(self._generator, "agenerate"):
params = {**self._default_params, **kwargs}
if stop:
params["stop_at"] = stop
prompt = self._convert_messages_to_prompt(messages)
response = await self._generator.agenerate(prompt, **params)
message = AIMessage(content=response)
generation = ChatGeneration(message=message)
return ChatResult(generations=[generation])
elif self.streaming:
response = ""
async for chunk in self._astream(messages, stop, run_manager, **kwargs):
response += chunk.message.content or ""
message = AIMessage(content=response)
generation = ChatGeneration(message=message)
return ChatResult(generations=[generation])
else:
return await super()._agenerate(messages, stop, run_manager, **kwargs)
async def _astream(
self,
messages: List[BaseMessage],
stop: List[str] | None = None,
run_manager: AsyncCallbackManagerForLLMRun | None = None,
**kwargs: Any,
) -> AsyncIterator[ChatGenerationChunk]:
if hasattr(self._generator, "astream"):
params = {**self._default_params, **kwargs}
if stop:
params["stop_at"] = stop
prompt = self._convert_messages_to_prompt(messages)
async for token in self._generator.astream(prompt, **params):
if run_manager:
await run_manager.on_llm_new_token(token)
message_chunk = AIMessageChunk(content=token)
chunk = ChatGenerationChunk(message=message_chunk)
yield chunk
else:
async for chunk in super()._astream(messages, stop, run_manager, **kwargs):
yield chunk


@ -458,6 +458,12 @@ def _import_openlm() -> Type[BaseLLM]:
return OpenLM
def _import_outlines() -> Type[BaseLLM]:
from langchain_community.llms.outlines import Outlines
return Outlines
def _import_pai_eas_endpoint() -> Type[BaseLLM]:
from langchain_community.llms.pai_eas_endpoint import PaiEasEndpoint
@ -807,6 +813,8 @@ def __getattr__(name: str) -> Any:
return _import_openllm()
elif name == "OpenLM":
return _import_openlm()
elif name == "Outlines":
return _import_outlines()
elif name == "PaiEasEndpoint":
return _import_pai_eas_endpoint()
elif name == "Petals":
@ -954,6 +962,7 @@ __all__ = [
"OpenAIChat",
"OpenLLM",
"OpenLM",
"Outlines",
"PaiEasEndpoint",
"Petals",
"PipelineAI",
@ -1076,6 +1085,7 @@ def get_type_to_cls_dict() -> Dict[str, Callable[[], Type[BaseLLM]]]:
"vertexai_model_garden": _import_vertex_model_garden,
"openllm": _import_openllm,
"openllm_client": _import_openllm,
"outlines": _import_outlines,
"vllm": _import_vllm,
"vllm_openai": _import_vllm_openai,
"watsonxllm": _import_watsonxllm,


@ -0,0 +1,314 @@
from __future__ import annotations
import importlib.util
import logging
import platform
from typing import Any, Callable, Dict, Iterator, List, Literal, Optional, Tuple, Union
from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models.llms import LLM
from langchain_core.outputs import GenerationChunk
from pydantic import BaseModel, Field, model_validator
logger = logging.getLogger(__name__)
class Outlines(LLM):
"""LLM wrapper for the Outlines library."""
client: Any = None # :meta private:
model: str
"""Identifier for the model to use with Outlines.
The model identifier should be a string specifying:
- A Hugging Face model name (e.g., "meta-llama/Llama-2-7b-chat-hf")
- A local path to a model
- For GGUF models, the format is "repo_id/file_name"
(e.g., "TheBloke/Llama-2-7B-Chat-GGUF/llama-2-7b-chat.Q4_K_M.gguf")
Examples:
- "TheBloke/Llama-2-7B-Chat-GGUF/llama-2-7b-chat.Q4_K_M.gguf"
- "meta-llama/Llama-2-7b-chat-hf"
"""
backend: Literal[
"llamacpp", "transformers", "transformers_vision", "vllm", "mlxlm"
] = "transformers"
"""Specifies the backend to use for the model.
Supported backends are:
- "llamacpp": For GGUF models using llama.cpp
- "transformers": For Hugging Face Transformers models (default)
- "transformers_vision": For vision-language models (e.g., LLaVA)
- "vllm": For models using the vLLM library
- "mlxlm": For models using the MLX framework
Note: Ensure you have the necessary dependencies installed for the chosen backend.
The system will attempt to import required packages and may raise an ImportError
if they are not available.
"""
max_tokens: int = 256
"""The maximum number of tokens to generate."""
stop: Optional[List[str]] = None
"""A list of strings to stop generation when encountered."""
streaming: bool = True
"""Whether to stream the results, token by token."""
regex: Optional[str] = None
"""Regular expression for structured generation.
If provided, Outlines will guarantee that the generated text matches this regex.
This can be useful for generating structured outputs like IP addresses, dates, etc.
Example: (valid IP address)
regex = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"
Note: Computing the regex index can take some time, so it's recommended to reuse
the same regex for multiple generations if possible.
For more details, see: https://dottxt-ai.github.io/outlines/reference/generation/regex/
"""
type_constraints: Optional[Union[type, str]] = None
"""Type constraints for structured generation.
Restricts the output to valid Python types. Supported types include:
int, float, bool, datetime.date, datetime.time, datetime.datetime.
Example:
type_constraints = int
For more details, see: https://dottxt-ai.github.io/outlines/reference/generation/format/
"""
json_schema: Optional[Union[BaseModel, Dict, Callable]] = None
"""Pydantic model, JSON Schema, or callable (function signature)
for structured JSON generation.
Outlines can generate JSON output that follows a specified structure,
which is useful for:
1. Parsing the answer (e.g., with Pydantic), storing it, or returning it to a user.
2. Calling a function with the result.
You can provide:
- A Pydantic model
- A JSON Schema (as a Dict)
- A callable (function signature)
The generated JSON will adhere to the specified structure.
For more details, see: https://dottxt-ai.github.io/outlines/reference/generation/json/
"""
grammar: Optional[str] = None
"""Context-free grammar for structured generation.
If provided, Outlines will generate text that adheres to the specified grammar.
The grammar should be defined in EBNF format.
This can be useful for generating structured outputs like mathematical expressions,
programming languages, or custom domain-specific languages.
Example:
grammar = '''
?start: expression
?expression: term (("+" | "-") term)*
?term: factor (("*" | "/") factor)*
?factor: NUMBER | "-" factor | "(" expression ")"
%import common.NUMBER
'''
Note: Grammar-based generation is currently experimental and may have performance
limitations. It uses greedy generation to mitigate these issues.
For more details and examples, see:
https://dottxt-ai.github.io/outlines/reference/generation/cfg/
"""
custom_generator: Optional[Any] = None
"""Set your own outlines generator object to override the default behavior."""
model_kwargs: Dict[str, Any] = Field(default_factory=dict)
"""Additional parameters to pass to the underlying model.
Example:
model_kwargs = {"temperature": 0.8, "seed": 42}
"""
@model_validator(mode="after")
def validate_environment(self) -> "Outlines":
"""Validate that outlines is installed and create a model instance."""
num_constraints = sum(
[
bool(self.regex),
bool(self.type_constraints),
bool(self.json_schema),
bool(self.grammar),
]
)
if num_constraints > 1:
raise ValueError(
"Either none or exactly one of regex, type_constraints, "
"json_schema, or grammar can be provided."
)
return self.build_client()
def build_client(self) -> "Outlines":
try:
import outlines.models as models
except ImportError:
raise ImportError(
"Could not import the Outlines library. "
"Please install it with `pip install outlines`."
)
def check_packages_installed(
packages: List[Union[str, Tuple[str, str]]],
) -> None:
missing_packages = [
pkg if isinstance(pkg, str) else pkg[0]
for pkg in packages
if importlib.util.find_spec(pkg[1] if isinstance(pkg, tuple) else pkg)
is None
]
if missing_packages:
raise ImportError( # todo this is displaying wrong
f"Missing packages: {', '.join(missing_packages)}. "
"You can install them with:\n\n"
f" pip install {' '.join(missing_packages)}"
)
if self.backend == "llamacpp":
if ".gguf" in self.model:
creator, repo_name, file_name = self.model.split("/", 2)
repo_id = f"{creator}/{repo_name}"
else: # todo add auto-file-selection if no file is given
raise ValueError("GGUF file_name must be provided for llama.cpp.")
check_packages_installed([("llama-cpp-python", "llama_cpp")])
self.client = models.llamacpp(repo_id, file_name, **self.model_kwargs)
elif self.backend == "transformers":
check_packages_installed(["transformers", "torch", "datasets"])
self.client = models.transformers(self.model, **self.model_kwargs)
elif self.backend == "transformers_vision":
check_packages_installed(
["transformers", "datasets", "torchvision", "PIL", "flash_attn"]
)
from transformers import LlavaNextForConditionalGeneration
if not hasattr(models, "transformers_vision"):
raise ValueError(
"transformers_vision backend is not supported, "
"please install the correct outlines version."
)
self.client = models.transformers_vision(
self.model,
model_class=LlavaNextForConditionalGeneration,
**self.model_kwargs,
)
elif self.backend == "vllm":
if platform.system() == "Darwin":
raise ValueError("vLLM backend is not supported on macOS.")
check_packages_installed(["vllm"])
self.client = models.vllm(self.model, **self.model_kwargs)
elif self.backend == "mlxlm":
check_packages_installed(["mlx"])
self.client = models.mlxlm(self.model, **self.model_kwargs)
else:
raise ValueError(f"Unsupported backend: {self.backend}")
return self
@property
def _llm_type(self) -> str:
return "outlines"
@property
def _default_params(self) -> Dict[str, Any]:
return {
"max_tokens": self.max_tokens,
"stop_at": self.stop,
**self.model_kwargs,
}
@property
def _identifying_params(self) -> Dict[str, Any]:
return {
"model": self.model,
"backend": self.backend,
"regex": self.regex,
"type_constraints": self.type_constraints,
"json_schema": self.json_schema,
"grammar": self.grammar,
**self._default_params,
}
@property
def _generator(self) -> Any:
from outlines import generate
if self.custom_generator:
return self.custom_generator
if self.regex:
return generate.regex(self.client, regex_str=self.regex)
if self.type_constraints:
return generate.format(self.client, python_type=self.type_constraints)
if self.json_schema:
return generate.json(self.client, schema_object=self.json_schema)
if self.grammar:
return generate.cfg(self.client, cfg_str=self.grammar)
return generate.text(self.client)
def _call(
self,
prompt: str,
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> str:
params = {**self._default_params, **kwargs}
if stop:
params["stop_at"] = stop
response = ""
if self.streaming:
for chunk in self._stream(
prompt=prompt,
stop=params["stop_at"],
run_manager=run_manager,
**params,
):
response += chunk.text
else:
response = self._generator(prompt, **params)
return response
def _stream(
self,
prompt: str,
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> Iterator[GenerationChunk]:
params = {**self._default_params, **kwargs}
if stop:
params["stop_at"] = stop
for token in self._generator.stream(prompt, **params):
if run_manager:
run_manager.on_llm_new_token(token)
yield GenerationChunk(text=token)
@property
def tokenizer(self) -> Any:
"""Access the tokenizer for the underlying model.
.encode() to tokenize text.
.decode() to convert tokens back to text.
"""
if hasattr(self.client, "tokenizer"):
return self.client.tokenizer
raise ValueError("Tokenizer not found")


@ -0,0 +1,177 @@
# flake8: noqa
"""Test ChatOutlines wrapper."""
from typing import Generator
import re
import platform
import pytest
from langchain_community.chat_models.outlines import ChatOutlines
from langchain_core.messages import AIMessage, HumanMessage, BaseMessage
from langchain_core.messages import BaseMessageChunk
from pydantic import BaseModel
from tests.unit_tests.callbacks.fake_callback_handler import FakeCallbackHandler
MODEL = "microsoft/Phi-3-mini-4k-instruct"
LLAMACPP_MODEL = "bartowski/qwen2.5-7b-ins-v3-GGUF/qwen2.5-7b-ins-v3-Q4_K_M.gguf"
BACKENDS = ["transformers", "llamacpp"]
if platform.system() != "Darwin":
BACKENDS.append("vllm")
if platform.system() == "Darwin":
BACKENDS.append("mlxlm")
@pytest.fixture(params=BACKENDS)
def chat_model(request: pytest.FixtureRequest) -> ChatOutlines:
if request.param == "llamacpp":
return ChatOutlines(model=LLAMACPP_MODEL, backend=request.param)
else:
return ChatOutlines(model=MODEL, backend=request.param)
def test_chat_outlines_inference(chat_model: ChatOutlines) -> None:
"""Test valid ChatOutlines inference."""
messages = [HumanMessage(content="Say foo:")]
output = chat_model.invoke(messages)
assert isinstance(output, AIMessage)
assert len(output.content) > 1
def test_chat_outlines_streaming(chat_model: ChatOutlines) -> None:
"""Test streaming tokens from ChatOutlines."""
messages = [HumanMessage(content="How do you say 'hello' in Spanish?")]
generator = chat_model.stream(messages)
stream_results_string = ""
assert isinstance(generator, Generator)
for chunk in generator:
assert isinstance(chunk, BaseMessageChunk)
if isinstance(chunk.content, str):
stream_results_string += chunk.content
else:
raise ValueError(
f"Invalid content type, only str is supported, "
f"got {type(chunk.content)}"
)
assert len(stream_results_string.strip()) > 1
def test_chat_outlines_streaming_callback(chat_model: ChatOutlines) -> None:
"""Test that streaming correctly invokes on_llm_new_token callback."""
MIN_CHUNKS = 5
callback_handler = FakeCallbackHandler()
chat_model.callbacks = [callback_handler]
chat_model.verbose = True
messages = [HumanMessage(content="Can you count to 10?")]
chat_model.invoke(messages)
assert callback_handler.llm_streams >= MIN_CHUNKS
def test_chat_outlines_regex(chat_model: ChatOutlines) -> None:
"""Test regex for generating a valid IP address"""
ip_regex = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"
chat_model.regex = ip_regex
assert chat_model.regex == ip_regex
messages = [HumanMessage(content="What is the IP address of Google's DNS server?")]
output = chat_model.invoke(messages)
assert isinstance(output, AIMessage)
assert re.match(
ip_regex, str(output.content)
), f"Generated output '{output.content}' is not a valid IP address"
def test_chat_outlines_type_constraints(chat_model: ChatOutlines) -> None:
"""Test type constraints for generating an integer"""
chat_model.type_constraints = int
messages = [
HumanMessage(
content="What is the answer to life, the universe, and everything?"
)
]
output = chat_model.invoke(messages)
assert isinstance(int(str(output.content)), int)
def test_chat_outlines_json(chat_model: ChatOutlines) -> None:
"""Test json for generating a valid JSON object"""
class Person(BaseModel):
name: str
chat_model.json_schema = Person
messages = [HumanMessage(content="Who are the main contributors to LangChain?")]
output = chat_model.invoke(messages)
person = Person.model_validate_json(str(output.content))
assert isinstance(person, Person)
def test_chat_outlines_grammar(chat_model: ChatOutlines) -> None:
"""Test grammar for generating a valid arithmetic expression"""
if chat_model.backend == "mlxlm":
pytest.skip("MLX grammars not yet supported.")
chat_model.grammar = """
?start: expression
?expression: term (("+" | "-") term)*
?term: factor (("*" | "/") factor)*
?factor: NUMBER | "-" factor | "(" expression ")"
%import common.NUMBER
%import common.WS
%ignore WS
"""
messages = [HumanMessage(content="Give me a complex arithmetic expression:")]
output = chat_model.invoke(messages)
# Validate the output is a non-empty string
assert (
isinstance(output.content, str) and output.content.strip()
), "Output should be a non-empty string"
# Use a simple regex to check if the output contains basic arithmetic operations and numbers
assert re.search(
r"[\d\+\-\*/\(\)]+", output.content
), f"Generated output '{output.content}' does not appear to be a valid arithmetic expression"
def test_chat_outlines_with_structured_output(chat_model: ChatOutlines) -> None:
"""Test that ChatOutlines can generate structured outputs"""
class AnswerWithJustification(BaseModel):
"""An answer to the user question along with justification for the answer."""
answer: str
justification: str
structured_chat_model = chat_model.with_structured_output(AnswerWithJustification)
result = structured_chat_model.invoke(
"What weighs more, a pound of bricks or a pound of feathers?"
)
assert isinstance(result, AnswerWithJustification)
assert isinstance(result.answer, str)
assert isinstance(result.justification, str)
assert len(result.answer) > 0
assert len(result.justification) > 0
structured_chat_model_with_raw = chat_model.with_structured_output(
AnswerWithJustification, include_raw=True
)
result_with_raw = structured_chat_model_with_raw.invoke(
"What weighs more, a pound of bricks or a pound of feathers?"
)
assert isinstance(result_with_raw, dict)
assert "raw" in result_with_raw
assert "parsed" in result_with_raw
assert "parsing_error" in result_with_raw
assert isinstance(result_with_raw["raw"], BaseMessage)
assert isinstance(result_with_raw["parsed"], AnswerWithJustification)
assert result_with_raw["parsing_error"] is None


@ -0,0 +1,123 @@
# flake8: noqa
"""Test Outlines wrapper."""
from typing import Generator
import re
import platform
import pytest
from langchain_community.llms.outlines import Outlines
from pydantic import BaseModel
from tests.unit_tests.callbacks.fake_callback_handler import FakeCallbackHandler
MODEL = "microsoft/Phi-3-mini-4k-instruct"
LLAMACPP_MODEL = "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf"
BACKENDS = ["transformers", "llamacpp"]
if platform.system() != "Darwin":
BACKENDS.append("vllm")
if platform.system() == "Darwin":
BACKENDS.append("mlxlm")
@pytest.fixture(params=BACKENDS)
def llm(request: pytest.FixtureRequest) -> Outlines:
if request.param == "llamacpp":
return Outlines(model=LLAMACPP_MODEL, backend=request.param, max_tokens=100)
else:
return Outlines(model=MODEL, backend=request.param, max_tokens=100)
def test_outlines_inference(llm: Outlines) -> None:
"""Test valid outlines inference."""
output = llm.invoke("Say foo:")
assert isinstance(output, str)
assert len(output) > 1
def test_outlines_streaming(llm: Outlines) -> None:
"""Test streaming tokens from Outlines."""
generator = llm.stream("Q: How do you say 'hello' in Spanish?\n\nA:")
stream_results_string = ""
assert isinstance(generator, Generator)
for chunk in generator:
print(chunk)
assert isinstance(chunk, str)
stream_results_string += chunk
print(stream_results_string)
assert len(stream_results_string.strip()) > 1
def test_outlines_streaming_callback(llm: Outlines) -> None:
"""Test that streaming correctly invokes on_llm_new_token callback."""
MIN_CHUNKS = 5
callback_handler = FakeCallbackHandler()
llm.callbacks = [callback_handler]
llm.verbose = True
llm.invoke("Q: Can you count to 10? A:'1, ")
assert callback_handler.llm_streams >= MIN_CHUNKS
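
# Setting `.regex` constrains decoding itself, so the raw completion should
# match the IP-address pattern rather than merely containing one.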
def test_outlines_regex(llm: Outlines) -> None:
"""Test regex for generating a valid IP address"""
ip_regex = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"
llm.regex = ip_regex
assert llm.regex == ip_regex
output = llm.invoke("Q: What is the IP address of googles dns server?\n\nA: ")
assert isinstance(output, str)
assert re.match(
ip_regex, output
), f"Generated output '{output}' is not a valid IP address"
def test_outlines_type_constraints(llm: Outlines) -> None:
"""Test type constraints for generating an integer"""
llm.type_constraints = int
output = llm.invoke(
"Q: What is the answer to life, the universe, and everything?\n\nA: "
)
assert int(output)
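
# Assigning a Pydantic model to `json_schema` constrains the output to JSON
# that validates against that schema.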
def test_outlines_json(llm: Outlines) -> None:
"""Test json for generating a valid JSON object"""
class Person(BaseModel):
name: str
llm.json_schema = Person
output = llm.invoke("Q: Who is the author of LangChain?\n\nA: ")
person = Person.model_validate_json(output)
assert isinstance(person, Person)
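
# `grammar` accepts a Lark grammar; generation is restricted to strings
# derivable from it, here simple arithmetic expressions.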
def test_outlines_grammar(llm: Outlines) -> None:
"""Test grammar for generating a valid arithmetic expression"""
llm.grammar = """
?start: expression
?expression: term (("+" | "-") term)*
?term: factor (("*" | "/") factor)*
?factor: NUMBER | "-" factor | "(" expression ")"
%import common.NUMBER
%import common.WS
%ignore WS
"""
output = llm.invoke("Here is a complex arithmetic expression: ")
# Validate the output is a non-empty string
assert (
isinstance(output, str) and output.strip()
), "Output should be a non-empty string"
# Use a simple regex to check if the output contains basic arithmetic operations and numbers
assert re.search(
r"[\d\+\-\*/\(\)]+", output
), f"Generated output '{output}' does not appear to be a valid arithmetic expression"


@ -36,6 +36,7 @@ EXPECTED_ALL = [
"ChatOCIModelDeploymentTGI",
"ChatOllama",
"ChatOpenAI",
"ChatOutlines",
"ChatPerplexity",
"ChatPremAI",
"ChatSambaNovaCloud",


@ -0,0 +1,91 @@
import pytest
from _pytest.monkeypatch import MonkeyPatch
from pydantic import BaseModel, Field
from langchain_community.chat_models.outlines import ChatOutlines
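
# `build_client` is patched out in every test below so ChatOutlines can be
# constructed and validated without downloading or loading model weights.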
def test_chat_outlines_initialization(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)
chat = ChatOutlines(
model="microsoft/Phi-3-mini-4k-instruct",
max_tokens=42,
stop=["\n"],
)
assert chat.model == "microsoft/Phi-3-mini-4k-instruct"
assert chat.max_tokens == 42
assert chat.backend == "transformers"
assert chat.stop == ["\n"]

def test_chat_outlines_backend_llamacpp(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)
chat = ChatOutlines(
model="TheBloke/Llama-2-7B-Chat-GGUF/llama-2-7b-chat.Q4_K_M.gguf",
backend="llamacpp",
)
assert chat.backend == "llamacpp"

def test_chat_outlines_backend_vllm(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)
chat = ChatOutlines(model="microsoft/Phi-3-mini-4k-instruct", backend="vllm")
assert chat.backend == "vllm"

def test_chat_outlines_backend_mlxlm(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)
chat = ChatOutlines(model="microsoft/Phi-3-mini-4k-instruct", backend="mlxlm")
assert chat.backend == "mlxlm"

def test_chat_outlines_with_regex(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)
regex = r"\d{3}-\d{3}-\d{4}"
chat = ChatOutlines(model="microsoft/Phi-3-mini-4k-instruct", regex=regex)
assert chat.regex == regex

def test_chat_outlines_with_type_constraints(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)
chat = ChatOutlines(model="microsoft/Phi-3-mini-4k-instruct", type_constraints=int)
assert chat.type_constraints == int # noqa

def test_chat_outlines_with_json_schema(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)
class TestSchema(BaseModel):
name: str = Field(description="A person's name")
age: int = Field(description="A person's age")
chat = ChatOutlines(
model="microsoft/Phi-3-mini-4k-instruct", json_schema=TestSchema
)
assert chat.json_schema == TestSchema

def test_chat_outlines_with_grammar(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)
grammar = """
?start: expression
?expression: term (("+" | "-") term)*
?term: factor (("*" | "/") factor)*
?factor: NUMBER | "-" factor | "(" expression ")"
%import common.NUMBER
"""
chat = ChatOutlines(model="microsoft/Phi-3-mini-4k-instruct", grammar=grammar)
assert chat.grammar == grammar
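
# Output constraints are meant to be mutually exclusive, so combining
# type_constraints with a regex should raise a ValueError.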
def test_raise_for_multiple_output_constraints(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(ChatOutlines, "build_client", lambda self: self)
with pytest.raises(ValueError):
ChatOutlines(
model="microsoft/Phi-3-mini-4k-instruct",
type_constraints=int,
regex=r"\d{3}-\d{3}-\d{4}",
)


@ -67,6 +67,7 @@ EXPECT_ALL = [
"OpenAIChat",
"OpenLLM",
"OpenLM",
"Outlines",
"PaiEasEndpoint",
"Petals",
"PipelineAI",


@ -0,0 +1,92 @@
import pytest
from _pytest.monkeypatch import MonkeyPatch
from langchain_community.llms.outlines import Outlines
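
# As with the ChatOutlines unit tests, `build_client` is patched out so the
# wrapper can be constructed without loading model weights.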
def test_outlines_initialization(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(Outlines, "build_client", lambda self: self)
llm = Outlines(
model="microsoft/Phi-3-mini-4k-instruct",
max_tokens=42,
stop=["\n"],
)
assert llm.model == "microsoft/Phi-3-mini-4k-instruct"
assert llm.max_tokens == 42
assert llm.backend == "transformers"
assert llm.stop == ["\n"]

def test_outlines_backend_llamacpp(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(Outlines, "build_client", lambda self: self)
llm = Outlines(
model="TheBloke/Llama-2-7B-Chat-GGUF/llama-2-7b-chat.Q4_K_M.gguf",
backend="llamacpp",
)
assert llm.backend == "llamacpp"

def test_outlines_backend_vllm(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(Outlines, "build_client", lambda self: self)
llm = Outlines(model="microsoft/Phi-3-mini-4k-instruct", backend="vllm")
assert llm.backend == "vllm"

def test_outlines_backend_mlxlm(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(Outlines, "build_client", lambda self: self)
llm = Outlines(model="microsoft/Phi-3-mini-4k-instruct", backend="mlxlm")
assert llm.backend == "mlxlm"

def test_outlines_with_regex(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(Outlines, "build_client", lambda self: self)
regex = r"\d{3}-\d{3}-\d{4}"
llm = Outlines(model="microsoft/Phi-3-mini-4k-instruct", regex=regex)
assert llm.regex == regex

def test_outlines_with_type_constraints(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(Outlines, "build_client", lambda self: self)
llm = Outlines(model="microsoft/Phi-3-mini-4k-instruct", type_constraints=int)
assert llm.type_constraints == int # noqa

def test_outlines_with_json_schema(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(Outlines, "build_client", lambda self: self)
from pydantic import BaseModel, Field
class TestSchema(BaseModel):
name: str = Field(description="A person's name")
age: int = Field(description="A person's age")
llm = Outlines(model="microsoft/Phi-3-mini-4k-instruct", json_schema=TestSchema)
assert llm.json_schema == TestSchema

def test_outlines_with_grammar(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(Outlines, "build_client", lambda self: self)
grammar = """
?start: expression
?expression: term (("+" | "-") term)*
?term: factor (("*" | "/") factor)*
?factor: NUMBER | "-" factor | "(" expression ")"
%import common.NUMBER
"""
llm = Outlines(model="microsoft/Phi-3-mini-4k-instruct", grammar=grammar)
assert llm.grammar == grammar
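
# Mirrors the ChatOutlines check: combining type_constraints with a regex
# should be rejected with a ValueError.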
def test_raise_for_multiple_output_constraints(monkeypatch: MonkeyPatch) -> None:
monkeypatch.setattr(Outlines, "build_client", lambda self: self)
with pytest.raises(ValueError):
Outlines(
model="microsoft/Phi-3-mini-4k-instruct",
type_constraints=int,
regex=r"\d{3}-\d{3}-\d{4}",
)