* fix for async io
* test for upgrading transformers
* add ci machine
* fix
* fix
* fix
* fix
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update test_fp16_torch.py
* Update build_on_pr.yml
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* fix
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* fiux
* fix
* fix
* fix
* upgrade llama
* fix
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* fix
* upgrade_bert
* upgrade_bloom
* [upgrade] upgrade gpt2 (#6291)
* fix
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* fix
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* upgrade command
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* add explanation
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* fix
* fix
* fix
* [upgrade]Upgrade qwen2 (#6302)
* upgrade qwen2
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* update_bloom
* fix
* add explantion
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* upgrade_sam
* add the explanation
* upgrade_t
* fix
* fix
* fix
* upgrade_gptj
* fix
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [upgrade]upgrade opt (#6307)
* upgrade opt
* fix
* [upgrade]Upgrade mixtral (#6317)
* upgrade mixtral
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* upgrade infer
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* upgrade drafter
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* upgrade lazy
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* upgrade mixtral
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [upgrade]Upgrade vit (#6308)
* fix
* fix
* fix rotate embedding test
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [upgrade]upgrade mistral (#6296)
* upgrade mistral
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* fix falcon
* fix
* Update test_shard_deepseek.py
* Update build_on_pr.yml
* Update requirements.txt
* fix (#6327)
* fix (#6328)
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update bert.py
* fix (#6329)
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Hanks <hangxu0304@gmail.com>
Co-authored-by: wangbluo <2538539015@qq.com>
Co-authored-by: Wang Binluo <32676639+wangbluo@users.noreply.github.com>
* [feature] support ep for deepseek v3
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix test
* [shardformer] fix deepseek v3 init
* [lazy] fit lora for lazy init
* [example] support npu for deepseek v3
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* support npu
* support pretrain
support pretrain
fix
* support lora
fix
fix
* support chatglm
fix
fxi
fix
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
fix
fix
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
fix
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
fix
fix
fix
* Update train.py
* Update train.py
* fix
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* fix
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [lazy] support _like methods and clamp
* [lazy] pass transformers models
* [lazy] fix device move and requires grad
* [lazy] fix requires grad and refactor api
* [lazy] fix requires grad
* [shardformer] support lazy init
* [shardformer] linear support lazy init
* [shardformer] embedding support lazy init
* [shardformer] norm support lazy init
* [shardformer] fused linear support lazy init
* [test] update shardformer test layer
* [test] shardformer with lazy init fit ddp
* [lazy] hotfix deepcopy of param
* [shardformer] fix bert policy and update test
* [shardformer] fix bloom policy and update test
* [shardformer] fix opt policy and update test
* [shardformer] fix t5 policy and update test
* [shardformer] fix gpt2 policy and update test
* [shardformer] fix llama policy and update test