Commit Graph

3 Commits

Author  SHA1        Message                                                               Date
hxwang  05a78d2f41  [chore] solve moe ckpt test failure and some other arg pass failure   2024-07-22 03:53:02 +00:00
hxwang  8d3d7f3cbd  [moe] test deepseek                                                   2024-07-19 07:32:00 +00:00
botbw   1b15cc97f5  [moe] add mixtral dp grad scaling when not all experts are activated  2024-07-19 07:30:14 +00:00