[Kernels] added Triton flash-decoding (#5063)

* added Triton flash-decoding based on the lightllm kernel

* add requirements

* clean

* clean

* delete build.sh

---------

Co-authored-by: cuiqing.li <lixx336@gmail.com>
Cuiqing Li (李崔卿)
2023-11-20 13:58:29 +08:00
committed by GitHub
parent fd6482ad8c
commit bce919708f
6 changed files with 82 additions and 43 deletions


@@ -2,6 +2,6 @@ transformers==4.34.0
 packaging
 ninja
 auto-gptq==0.5.0
-git+https://github.com/ModelTC/lightllm.git@28c1267cfca536b7b4f28e921e03de735b003039
+git+https://github.com/ModelTC/lightllm.git@ece7b43f8a6dfa74027adc77c2c176cff28c76c8
 git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
 git+https://github.com/Dao-AILab/flash-attention.git@017716451d446e464dde9aca3a3c1ed2209caaa9