[inference] refactor examples and fix schedule (#5077)

* [setup] refactor infer setup

* [hotfix] fix inference behavior on 1 1 gpu

* [example] refactor inference examples
Hongxin Liu
2023-11-21 10:46:03 +08:00
committed by GitHub
parent 4e3959d316
commit 1cd7efc520
9 changed files with 209 additions and 274 deletions

@@ -1,6 +1,4 @@
transformers==4.34.0
packaging
ninja
auto-gptq==0.5.0
git+https://github.com/ModelTC/lightllm.git@ece7b43f8a6dfa74027adc77c2c176cff28c76c8
git+https://github.com/Dao-AILab/flash-attention.git@017716451d446e464dde9aca3a3c1ed2209caaa9