Commit Graph

63 Commits

Author SHA1 Message Date
HELSON
b31daed4cf fix bugs in CPU adam (#633)
* add cpu adam counter for all cpu adam

* fixed updating error in adam kernel
2022-04-02 17:04:05 +08:00
HELSON
055fbf5be6 [zero] adapt zero for unsharded paramters (Optimizer part) (#601) 2022-04-01 20:10:47 +08:00
HELSON
e6d50ec107 [zero] adapt zero for unsharded parameters (#561)
* support existing sharded and unsharded parameters in zero

* add unitest for moe-zero model init

* polish moe gradient handler
2022-03-31 18:34:11 +08:00
Jiarui Fang
7675366fce [polish] rename col_attr -> colo_attr (#558) 2022-03-31 12:25:45 +08:00
HELSON
8c90d4df54 [zero] add zero context manager to change config during initialization (#546) 2022-03-29 17:57:59 +08:00
Liang Bowen
ec5086c49c Refactored docstring to google style 2022-03-29 17:17:47 +08:00
Frank Lee
3601b2bad0 [test] fixed rerun_on_exception and adapted test cases (#487) 2022-03-25 17:25:12 +08:00
Jiarui Fang
a445e118cf [polish] polish singleton and global context (#500) 2022-03-23 18:03:39 +08:00
Jiarui Fang
65c0f380c2 [format] polish name format for MOE (#481) 2022-03-21 23:19:47 +08:00
HELSON
7544347145 [MOE] add unitest for MOE experts layout, gradient handler and kernel (#469) 2022-03-21 13:35:04 +08:00
HELSON
84fd7c1d4d add moe context, moe utilities and refactor gradient handler (#455) 2022-03-18 16:38:32 +08:00
1SAA
82023779bb Added TPExpert for special situation 2022-03-11 15:50:28 +08:00
1SAA
219df6e685 Optimized MoE layer and fixed some bugs;
Decreased moe tests;

Added FFNExperts and ViTMoE model
2022-03-11 15:50:28 +08:00