1
0
mirror of https://github.com/hpcaitech/ColossalAI.git synced 2025-05-05 15:08:18 +00:00
Commit Graph

8 Commits

Author SHA1 Message Date
Hongxin Liu
19e1a5cf16
[shardformer] update colo attention to support custom mask ()
* [feature] refactor colo attention ()

* [extension] update api

* [feature] add colo attention

* [feature] update sdpa

* [feature] update npu attention

* [feature] update flash-attn

* [test] add flash attn test

* [test] update flash attn test

* [shardformer] update modeling to fit colo attention ()

* [misc] refactor folder structure

* [shardformer] update llama flash-attn

* [shardformer] fix llama policy

* [devops] update tensornvme install

* [test] update llama test

* [shardformer] update colo attn kernel dispatch

* [shardformer] update blip2

* [shardformer] update chatglm

* [shardformer] update gpt2

* [shardformer] update gptj

* [shardformer] update opt

* [shardformer] update vit

* [shardformer] update colo attention mask prep

* [shardformer] update whisper

* [test] fix shardformer tests ()

* [test] fix shardformer tests

* [test] fix shardformer tests
2024-03-27 11:19:32 +08:00
littsk
1a3315e336
[hotfix] Add layer norm gradients all-reduce for sequence parallel ()
* [hotfix] Add layer norm gradients all-reduce for sequence parallel. ()

* Add layer norm gradients all-reduce for sequence parallel.

* skip pipeline inference test

* [hotfix] fixing polices of sequence parallel ()

* Add layer norm gradients all-reduce for sequence parallel.

* fix parameter passing when calling get_autopolicy

---------

Co-authored-by: littsk <1214689160@qq.com>

* Hotfix/add grad all reduce for sequence parallel ()

* Add layer norm gradients all-reduce for sequence parallel.


* fix parameter passing when calling get_autopolicy

* fix bug using wrong variables

---------

Co-authored-by: littsk <1214689160@qq.com>

* fix policy initialization

* fix bloom and chatglm policices

* polish code of handling layernorm

* fix moe module

* polish code of class initializing

---------

Co-authored-by: Zhongkai Zhao <kanezz620@gmail.com>
2023-11-03 13:32:43 +08:00
Hongxin Liu
079bf3cb26
[misc] update pre-commit and run all files ()
* [misc] update pre-commit

* [misc] run pre-commit

* [misc] remove useless configuration files

* [misc] ignore cuda for clang-format
2023-09-19 14:20:26 +08:00
Baizhou Zhang
2c787d7f47
[shardformer] fix submodule replacement bug when enabling pp () 2023-08-31 09:57:18 +08:00
flybird11111
ec18fc7340
[shardformer] support pp+tp+zero1 tests ()
* [shardformer] fix opt test hanging

* fix

* test

* test

* test

* fix test

* fix test

* remove print

* add fix

* [shardformer] pp+tp+zero1

[shardformer] pp+tp+zero1

[shardformer] pp+tp+zero1

[shardformer] pp+tp+zero1

[shardformer] pp+tp+zero1

[shardformer] pp+tp+zero1

* [shardformer] pp+tp+zero1

* [shardformer] pp+tp+zero1

* [shardformer] pp+tp+zero1

* [shardformer] pp+tp+zero1
2023-08-30 21:29:18 +08:00
flybird11111
d367b88785
[shardformer] fix opt test hanging ()
* [shardformer] fix opt test hanging

* fix

* test

* test

* test

* fix test

* fix test

* remove print

* add fix
2023-08-30 14:50:34 +08:00
Jianghai
e04436a82a
[shardformer] tests for 3d parallel () 2023-08-23 15:05:24 +08:00
Jianghai
5545114fd8
rename chatglm to chatglm2 () 2023-08-22 14:13:31 +08:00