[Inference] Add readme (roadmap) and fulfill request handler (#5147)

* request handler

* add readme

---------

Co-authored-by: CjhHa1 <cjh18671720497outlook.com>
Jianghai
2023-12-01 17:31:31 +08:00
committed by FrankLeeeee
parent 4cf4682e70
commit 56e75eeb06
3 changed files with 67 additions and 3 deletions


@@ -0,0 +1,7 @@
"""
Our config consists of three parts:
1. model_config: The configuration for the model, including `model name`, `model path`, and self-defined layers.
2. parallel_config: The configuration for parallelizing the model, including `tp_size`, `pp_size`, `world size`, `local rank`, `master port`, and `master ip`.
3. cache_config: The configuration for initializing and managing the KV cache, including `block size` and `block num`.
For the convenience of users, we provide a unified config API that wraps all of these configs. One can easily construct a colossal_config by setting the needed configs.
"""
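
The three-part layout described in the docstring could be sketched with plain dataclasses, as below. This is only an illustration, not the code added by this commit: the class names (`ModelConfig`, `ParallelConfig`, `CacheConfig`, `ColossalConfig`), the default values, and the rule that world size equals `tp_size * pp_size` are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelConfig:
    """Hypothetical model section: model name and model path."""
    model_name: str
    model_path: Optional[str] = None


@dataclass
class ParallelConfig:
    """Hypothetical parallel section; defaults are placeholders."""
    tp_size: int = 1
    pp_size: int = 1
    local_rank: int = 0
    master_ip: str = "127.0.0.1"
    master_port: int = 29500

    @property
    def world_size(self) -> int:
        # Assumption for this sketch: one process per (tp, pp) slot.
        return self.tp_size * self.pp_size


@dataclass
class CacheConfig:
    """Hypothetical KV cache section; defaults are placeholders."""
    block_size: int = 16
    block_num: int = 1024


@dataclass
class ColossalConfig:
    """Hypothetical unified wrapper that bundles the three parts."""
    model: ModelConfig
    parallel: ParallelConfig = field(default_factory=ParallelConfig)
    cache: CacheConfig = field(default_factory=CacheConfig)


# Only the parts that differ from the defaults need to be set.
config = ColossalConfig(
    model=ModelConfig(model_name="demo-model"),
    parallel=ParallelConfig(tp_size=2, pp_size=2),
)
```

Grouping the settings this way keeps each concern (model, parallelism, cache) independently constructible while still giving callers a single object to pass around.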