feat(model): Support MLX inference (#2781)

Fangyin Cheng
2025-06-19 09:30:58 +08:00
committed by GitHub
parent 9084c6c19c
commit d9d4d4b6bc
12 changed files with 5047 additions and 4662 deletions


@@ -0,0 +1,43 @@
# MLX Inference
DB-GPT supports [MLX](https://github.com/ml-explore/mlx-lm) inference through `mlx-lm`, a fast and easy-to-use library for running and serving LLMs on Apple silicon.
## Install dependencies
`MLX` is an optional dependency in DB-GPT. You can install it by passing `--extra "mlx"` when installing dependencies.
```bash
# Use uv to install dependencies needed for mlx
# Install core dependencies and select desired extensions
uv sync --all-packages \
--extra "base" \
--extra "hf" \
--extra "mlx" \
--extra "rag" \
--extra "storage_chromadb" \
--extra "quant_bnb" \
--extra "dbgpts"
```
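Before touching DB-GPT's configuration, you can optionally verify that `mlx-lm` itself runs on your machine. The sketch below uses the `mlx_lm.generate` CLI that ships with the `mlx-lm` package (not part of DB-GPT); the model name matches the configuration used later, and the prompt and token limit are arbitrary.
```bash
# Optional sanity check: run the model directly with mlx-lm's CLI.
# The model is downloaded from Hugging Face on first use.
uv run python -m mlx_lm.generate \
  --model Qwen/Qwen3-0.6B-MLX-4bit \
  --prompt "Hello, MLX!" \
  --max-tokens 32
```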
## Modify configuration file
After installing the dependencies, you can modify your configuration file to use the `mlx` provider.
```toml
# Model Configurations
[models]
[[models.llms]]
name = "Qwen/Qwen3-0.6B-MLX-4bit"
provider = "mlx"
# If `path` is not provided, the model will be downloaded from the
# Hugging Face model hub: https://huggingface.co/Qwen/Qwen3-0.6B-MLX-4bit
# Uncomment the following line to load the model from the local file system instead.
# path = "the-model-path-in-the-local-file-system"
```
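If you prefer to set `path` and keep the model on disk, one way to fetch it ahead of time is the `huggingface-cli` tool from `huggingface_hub` (pulled in by the `hf` extra). The local directory below is only an illustration; point `path` at wherever you download the model.
```bash
# Optional: pre-download the model, then set `path` in the config above
# to the local directory (illustrative location, adjust as needed).
uv run huggingface-cli download Qwen/Qwen3-0.6B-MLX-4bit \
  --local-dir ./models/Qwen3-0.6B-MLX-4bit
```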
## Run the model
You can run the model using the following command:
```bash
uv run dbgpt start webserver --config {your_config_file}
```
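Once the webserver is up, you can use the web UI or the HTTP API. As a rough smoke test, the sketch below assumes the default port `5670` and DB-GPT's OpenAI-compatible v2 chat endpoint; adjust host, port, and API key to your deployment.
```bash
# Hypothetical smoke test against the OpenAI-compatible chat API.
# If you configured api_keys, add: -H "Authorization: Bearer $DBGPT_API_KEY"
curl -s http://localhost:5670/api/v2/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen/Qwen3-0.6B-MLX-4bit",
        "messages": [{"role": "user", "content": "Say hello from MLX"}]
      }'
```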


@@ -170,6 +170,10 @@ const sidebars = {
type: 'doc',
id: 'installation/advanced_usage/vLLM_inference',
},
{
type: 'doc',
id: 'installation/advanced_usage/mlx_inference',
},
{
type: 'doc',
id: 'installation/advanced_usage/Llamacpp_server',

File diff suppressed because it is too large.