[Inference]Add BatchInferState, Sequence and InferConfig (#5149)

* add infer_struct and infer_config

* update codes

* change InferConfig

* Add hf_model_config to the engine

* rm _get_hf_model_config

* update codes

* made adjustments according to the feedback from the reviewer.

* update codes

* add ci test for config and struct
yuehuayingxueluo
2023-12-07 14:34:01 +08:00
committed by FrankLeeeee
parent 2bb92243d4
commit fab9b931d9
5 changed files with 279 additions and 34 deletions


@@ -1,7 +0,0 @@
"""
Our config consists of three parts:
1. model_config: The configuration for the model, including `model name`, `model path`, and self-defined layers.
2. parallel_config: The configuration for parallelizing the model, including `tp_size`, `pp_size`, `world size`, `local rank`, `master port`, and `master ip`.
3. cache_config: The configuration for initializing and managing the KV cache, including `block size` and `block num`.
For the convenience of users, we provide a unified config API that wraps all of these configs. One can easily construct a colossal_config by setting the needed configs.
"""
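
The three-part layout described in the removed docstring can be sketched roughly as below. This is only an illustration, not the code from this commit: the class names (`ModelConfig`, `ParallelConfig`, `CacheConfig`, `UnifiedConfig`), the default values, and the rule `world_size = tp_size * pp_size` are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ModelConfig:
    # Hypothetical fields mirroring `model name` and `model path`.
    model_name: str
    model_path: str


@dataclass
class ParallelConfig:
    # Hypothetical fields mirroring `tp_size`, `pp_size`, `local rank`,
    # `master ip`, and `master port`; the defaults are placeholders.
    tp_size: int = 1
    pp_size: int = 1
    local_rank: int = 0
    master_ip: str = "127.0.0.1"
    master_port: int = 29500

    @property
    def world_size(self) -> int:
        # Assumption: only tensor and pipeline parallelism are in use.
        return self.tp_size * self.pp_size


@dataclass
class CacheConfig:
    # Hypothetical fields mirroring `block size` and `block num`.
    block_size: int = 16
    block_num: int = 1024


@dataclass
class UnifiedConfig:
    # One wrapper object that bundles the three sub-configs.
    model: ModelConfig
    parallel: ParallelConfig = field(default_factory=ParallelConfig)
    cache: CacheConfig = field(default_factory=CacheConfig)
```

With this shape, a caller sets only the parts that differ from the defaults, for example `UnifiedConfig(model=ModelConfig("my-model", "/path/to/model"), parallel=ParallelConfig(tp_size=2))`, and the cache settings fall back to their default values.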