[hotfix] Suport extra_kwargs in ShardConfig (#5031)

* [refactor]: replace inference args with extra_kwargs in ShardConfig

* modify shardconfig

* polish code

* fix policy bug in llama

* fix bug in auto policy

* remove setattr in ShardConfig
This commit is contained in:
Zhongkai Zhao
2023-11-10 10:49:50 +08:00
committed by GitHub
parent 576a2f7b10
commit 70885d707d
23 changed files with 98 additions and 77 deletions

View File

@@ -19,7 +19,7 @@ def build_model(
enable_tensor_parallelism=enable_tensor_parallelism,
enable_flash_attention=enable_flash_attention,
enable_jit_fused=enable_jit_fused,
inference_only=True,
extra_kwargs={"inference_only": True},
)
model_copy = copy.deepcopy(org_model)
shard_former = ShardFormer(shard_config=shard_config)