3 Commits

| Author | SHA1 | Message | Date |
|--------|------|---------|------|
| HELSON | 84fd7c1d4d | add moe context, moe utilities and refactor gradient handler (#455) | 2022-03-18 16:38:32 +08:00 |
| 1SAA | 82023779bb | Added TPExpert for special situation | 2022-03-11 15:50:28 +08:00 |
| 1SAA | 219df6e685 | Optimized MoE layer and fixed some bugs; Decreased moe tests; Added FFNExperts and ViTMoE model | 2022-03-11 15:50:28 +08:00 |