Add C Transformers for GGML Models (#5218)

# Add C Transformers for GGML Models
I created Python bindings for the GGML models:
https://github.com/marella/ctransformers

Currently, it supports GPT-2, GPT-J, GPT-NeoX, LLaMA, MPT, and more; see [Supported Models](https://github.com/marella/ctransformers#supported-models) for the full list.

It provides a unified interface for all models:

```python
from langchain.llms import CTransformers

llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2')

print(llm('AI is going to'))
```
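
The same interface works across the supported model families by switching `model_type`; for instance (a sketch, assuming a local LLaMA GGML file at a hypothetical path):

```python
# A sketch: the path below is hypothetical; model_type selects the model family.
llm = CTransformers(model='/path/to/ggml-llama.bin', model_type='llama')
```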

It can be used with models hosted on the Hugging Face Hub:

```python
llm = CTransformers(model='marella/gpt-2-ggml')
```

It supports streaming:

```python
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = CTransformers(model='marella/gpt-2-ggml', callbacks=[StreamingStdOutCallbackHandler()])
```
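
With the callback attached, tokens are printed to stdout as they are generated:

```python
response = llm('AI is going to')  # tokens stream to stdout as they arrive
```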

Please see the [README](https://github.com/marella/ctransformers#readme) for more details.
---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>

@@ -0,0 +1,57 @@
# C Transformers

This page covers how to use the [C Transformers](https://github.com/marella/ctransformers) library within LangChain.
It is broken into two parts: installation and setup, and then references to specific C Transformers wrappers.

## Installation and Setup

- Install the Python package with `pip install ctransformers`
- Download a supported [GGML model](https://huggingface.co/TheBloke) (see [Supported Models](https://github.com/marella/ctransformers#supported-models)); a sketch of downloading one is shown below
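
One way to fetch a model file locally is with `huggingface_hub` (a minimal sketch; the repo and filename below are the ones used in the examples on this page, and `huggingface_hub` must be installed separately):

```python
# Minimal sketch: download a GGML model file from the Hugging Face Hub.
# Assumes `pip install huggingface_hub`; repo_id/filename match the examples below.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(repo_id='marella/gpt-2-ggml', filename='ggml-model.bin')
# The returned local path can be passed as `model=` to CTransformers.
```

Alternatively, `CTransformers` can load a model directly from the Hub by repo name, as shown under the LLM wrapper below.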

## Wrappers

### LLM

There exists a CTransformers LLM wrapper, which you can access with:

```python
from langchain.llms import CTransformers
```

It provides a unified interface for all models:

```python
llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2')

print(llm('AI is going to'))
```

If you are getting an `illegal instruction` error, try using `lib='avx'` or `lib='basic'`:

```python
llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2', lib='avx')
```

It can be used with models hosted on the Hugging Face Hub:

```python
llm = CTransformers(model='marella/gpt-2-ggml')
```

If a model repo has multiple model files (`.bin` files), specify a model file using:

```python
llm = CTransformers(model='marella/gpt-2-ggml', model_file='ggml-model.bin')
```

Additional parameters can be passed using the `config` parameter:

```python
config = {'max_new_tokens': 256, 'repetition_penalty': 1.1}
llm = CTransformers(model='marella/gpt-2-ggml', config=config)
```

See the [Documentation](https://github.com/marella/ctransformers#config) for a list of available parameters.

For a more detailed walkthrough, see [this notebook](../modules/models/llms/integrations/ctransformers.ipynb).


@@ -0,0 +1,125 @@
{
  "cells": [
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# C Transformers\n",
        "\n",
        "The [C Transformers](https://github.com/marella/ctransformers) library provides Python bindings for GGML models.\n",
        "\n",
        "This example goes over how to use LangChain to interact with `C Transformers` [models](https://github.com/marella/ctransformers#supported-models)."
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "**Install**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "%pip install ctransformers"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "**Load Model**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from langchain.llms import CTransformers\n",
        "\n",
        "llm = CTransformers(model='marella/gpt-2-ggml')"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "**Generate Text**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "print(llm('AI is going to'))"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "**Streaming**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
        "\n",
        "llm = CTransformers(model='marella/gpt-2-ggml', callbacks=[StreamingStdOutCallbackHandler()])\n",
        "\n",
        "response = llm('AI is going to')"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "**LLMChain**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from langchain import PromptTemplate, LLMChain\n",
        "\n",
        "template = \"\"\"Question: {question}\n",
        "\n",
        "Answer:\"\"\"\n",
        "\n",
        "prompt = PromptTemplate(template=template, input_variables=['question'])\n",
        "\n",
        "llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
        "\n",
        "response = llm_chain.run('What is AI?')"
      ]
    }
  ],
  "metadata": {
    "language_info": {
      "name": "python"
    },
    "orig_nbformat": 4
  },
  "nbformat": 4,
  "nbformat_minor": 2
}