feat(model): Support MLX inference (#2781)

Fangyin Cheng
2025-06-19 09:30:58 +08:00
committed by GitHub
parent 9084c6c19c
commit d9d4d4b6bc
12 changed files with 5047 additions and 4662 deletions


@@ -0,0 +1,43 @@
# MLX Inference
DB-GPT supports [MLX](https://github.com/ml-explore/mlx-lm) inference through `mlx-lm`, a fast and easy-to-use library for running and serving LLMs on Apple silicon.
## Install dependencies
`MLX` is an optional dependency in DB-GPT. You can install it by passing `--extra "mlx"` when installing dependencies.
```bash
# Use uv to install dependencies needed for mlx
# Install core dependencies and select desired extensions
uv sync --all-packages \
--extra "base" \
--extra "hf" \
--extra "mlx" \
--extra "rag" \
--extra "storage_chromadb" \
--extra "quant_bnb" \
--extra "dbgpts"
```
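Before touching DB-GPT's configuration, you can optionally verify that `mlx-lm` itself runs on your machine. The sketch below uses the `mlx_lm.generate` CLI that ships with the `mlx-lm` package (not part of DB-GPT); the model name matches the configuration used later, and the prompt and token limit are arbitrary.
```bash
# Optional sanity check: run the model directly with mlx-lm's CLI.
# The model is downloaded from Hugging Face on first use.
uv run python -m mlx_lm.generate \
  --model Qwen/Qwen3-0.6B-MLX-4bit \
  --prompt "Hello, MLX!" \
  --max-tokens 32
```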
## Modify configuration file
After installing the dependencies, you can modify your configuration file to use the `mlx` provider.
```toml
# Model Configurations
[models]
[[models.llms]]
name = "Qwen/Qwen3-0.6B-MLX-4bit"
provider = "mlx"
# If `path` is not provided, the model will be downloaded from the
# Hugging Face model hub: https://huggingface.co/Qwen/Qwen3-0.6B-MLX-4bit
# Uncomment the following line to load the model from the local file system instead.
# path = "the-model-path-in-the-local-file-system"
```
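If you prefer to set `path` and keep the model on disk, one way to fetch it ahead of time is the `huggingface-cli` tool from `huggingface_hub` (pulled in by the `hf` extra). The local directory below is only an illustration; point `path` at wherever you download the model.
```bash
# Optional: pre-download the model, then set `path` in the config above
# to the local directory (illustrative location, adjust as needed).
uv run huggingface-cli download Qwen/Qwen3-0.6B-MLX-4bit \
  --local-dir ./models/Qwen3-0.6B-MLX-4bit
```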
## Run the model
You can run the model using the following command:
```bash
uv run dbgpt start webserver --config {your_config_file}
```
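Once the webserver is up, you can use the web UI or the HTTP API. As a rough smoke test, the sketch below assumes the default port `5670` and DB-GPT's OpenAI-compatible v2 chat endpoint; adjust host, port, and API key to your deployment.
```bash
# Hypothetical smoke test against the OpenAI-compatible chat API.
# If you configured api_keys, add: -H "Authorization: Bearer $DBGPT_API_KEY"
curl -s http://localhost:5670/api/v2/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen/Qwen3-0.6B-MLX-4bit",
        "messages": [{"role": "user", "content": "Say hello from MLX"}]
      }'
```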


@@ -170,6 +170,10 @@ const sidebars = {
type: 'doc',
id: 'installation/advanced_usage/vLLM_inference',
},
{
type: 'doc',
id: 'installation/advanced_usage/mlx_inference',
},
{
type: 'doc',
id: 'installation/advanced_usage/Llamacpp_server',

File diff suppressed because it is too large.