[Inference]Add BatchInferState, Sequence and InferConfig (#5149)

* add infer_struct and infer_config
* update codes
* change InferConfig
* Add hf_model_config to the engine
* rm _get_hf_model_config
* update codes
* made adjustments according to the feedback from the reviewer.
* update codes
* add ci test for config and struct
2025-09-07 20:10:17 +00:00 · 2023-12-07 14:34:01 +08:00
parent 2bb92243d4
commit fab9b931d9
5 changed files with 279 additions and 34 deletions
--- a/colossalai/inference/config.py
+++ b/colossalai/inference/config.py
@@ -1,7 +0,0 @@
-"""
-Our config consists of three parts:
-    1. model_config: The configuration for the model, including `model name`, 'model path' and self-defined layer.
-    2. parallel_config: The configuration for parallelize model, including `tp_size`,'pp_size', `world size`, `local rank`, `master port`, `master ip`.
-    3. cache_config: Configuration for initialize and manage kv cache, including `block size`, `block num`
-For the convenience of users, we provide a unified config api for that wrapped all the configs. One can easily construct a colossal_config by setting the needed configs.
-"""