botbw
|
4fa6b9509c
|
[moe] add parallel strategy for shared_expert && fix test for deepseek (#6063)
|
2024-09-18 10:09:01 +08:00 |
|
botbw
|
c54c4fcd15
|
[hotfix] moe hybrid parallelism benchmark & follow-up fix (#6048)
* [example] pass use_fp8_comm flag to all plugins
* [example] add mixtral benchmark
* [moe] refine assertion and check
* [moe] fix mixtral & add more tests
* [moe] consider checking dp * sp group and moe_dp_group
* [mixtral] remove gate tp & add more tests
* [deepseek] fix tp & sp for deepseek
* [mixtral] minor fix
* [deepseek] add deepseek benchmark
|
2024-09-10 17:30:53 +08:00 |
|
botbw
|
62cdac6b7b
|
[chore] remove redundant test case, print string & reduce test tokens
|
2024-08-01 10:06:59 +08:00 |
|
haze188
|
70793ce9ed
|
[misc] fix ci failure: change default value to false in moe plugin
|
2024-08-01 10:06:59 +08:00 |
|
hxwang
|
cb01c0d5ce
|
[moe] refactor mesh assignment
|
2024-08-01 10:06:59 +08:00 |
|
hxwang
|
6c39f0b144
|
[test] add check
|
2024-08-01 10:06:59 +08:00 |
|
haze188
|
b2952a5982
|
[moe] deepseek moe sp support
|
2024-08-01 10:06:59 +08:00 |
|
hxwang
|
067e18f7e9
|
[test] fix test: test_zero1_2
|
2024-08-01 10:06:59 +08:00 |
|
hxwang
|
70c9924d0d
|
[chore] solve moe ckpt test failure and some other arg pass failure
|
2024-08-01 10:06:59 +08:00 |
|
hxwang
|
46037c2ccd
|
[chore] minor fix after rebase
|
2024-08-01 10:06:59 +08:00 |
|
hxwang
|
803878b2fd
|
[moe] full test for deepseek and mixtral (pp + sp to fix)
|
2024-08-01 10:06:59 +08:00 |
|