24 Commits

Author SHA1 Message Date
Frank Lee
40d376c566
[setup] support pre-build and jit-build of cuda kernels (#2374)
* [setup] support pre-build and jit-build of cuda kernels

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code
2023-01-06 20:50:26 +08:00
Jiarui Fang
16cc8e6aa7
[builder] MOE builder (#2277) 2023-01-03 20:29:39 +08:00
ver217
f8a7148dec
[kernel] move all symlinks of kernel to colossalai._C (#1971) 2022-11-17 13:42:33 +08:00
HELSON
a088022efc
[moe] fix moe bugs (#1633) 2022-09-23 15:33:57 +08:00
HELSON
f7f2248771
[moe] fix MoE bugs (#1628)
* remove forced FP32 modules

* correct no_shard-contexts' positions
2022-09-22 13:56:30 +08:00
HELSON
e5ea3fdeef
[gemini] add GeminiMemoryManger (#832)
* refactor StatefulTensor, tensor utilities

* add unitest for GeminiMemoryManager
2022-04-24 13:08:48 +08:00
HELSON
a9b8300d54
[zero] improve adaptability for not-shard parameters (#708)
* adapt post grad hooks for not-shard parameters
* adapt optimizer for not-shard parameters
* offload gradients for not-replicated parameters
2022-04-11 13:38:51 +08:00
ver217
8432dc7080
polish moe docsrting (#618) 2022-04-01 16:15:36 +08:00
HELSON
e6d50ec107
[zero] adapt zero for unsharded parameters (#561)
* support existing sharded and unsharded parameters in zero

* add unitest for moe-zero model init

* polish moe gradient handler
2022-03-31 18:34:11 +08:00
HELSON
8c90d4df54
[zero] add zero context manager to change config during initialization (#546) 2022-03-29 17:57:59 +08:00
Liang Bowen
ec5086c49c Refactored docstring to google style 2022-03-29 17:17:47 +08:00
Jiarui Fang
a445e118cf
[polish] polish singleton and global context (#500) 2022-03-23 18:03:39 +08:00
HELSON
c9023d4078
[MOE] support PR-MOE (#488) 2022-03-22 16:48:22 +08:00
HELSON
d7ea63992b
[MOE] add FP32LinearGate for MOE in NaiveAMP context (#480) 2022-03-22 10:50:20 +08:00
Jiarui Fang
65c0f380c2
[format] polish name format for MOE (#481) 2022-03-21 23:19:47 +08:00
HELSON
aff9d354f7
[MOE] polish moe_env (#467) 2022-03-19 15:36:25 +08:00
HELSON
bccbc15861
[MOE] changed parallelmode to dist process group (#460) 2022-03-19 13:46:29 +08:00
HELSON
dbdc9a7783
added Multiply Jitter and capacity factor eval for MOE (#434) 2022-03-16 16:47:44 +08:00
HELSON
3f70a2b12f
removed noisy function during evaluation of MoE router (#419) 2022-03-15 12:06:09 +08:00
1SAA
82023779bb Added TPExpert for special situation 2022-03-11 15:50:28 +08:00
HELSON
36b8477228 Fixed parameter initialization in FFNExpert (#251) 2022-03-11 15:50:28 +08:00
1SAA
219df6e685 Optimized MoE layer and fixed some bugs;
Decreased moe tests;

Added FFNExperts and ViTMoE model
2022-03-11 15:50:28 +08:00
HELSON
0f8c7f9804
Fixed docstring in colossalai (#171) 2022-01-21 10:44:30 +08:00
HELSON
dceae85195
Added MoE parallel (#127) 2022-01-07 15:08:36 +08:00