haze188
12d043ca00
[misc] remove incompatible test config
2024-08-01 10:06:59 +08:00
hxwang
cb01c0d5ce
[moe] refactor mesh assignment
2024-08-01 10:06:59 +08:00
hxwang
067e18f7e9
[test] fix test: test_zero1_2
2024-08-01 10:06:59 +08:00
hxwang
70c9924d0d
[chore] solve moe ckpt test failure and some other arg pass failure
2024-08-01 10:06:59 +08:00
hxwang
803878b2fd
[moe] full test for deepseek and mixtral (pp + sp to fix)
2024-08-01 10:06:59 +08:00
haze188
2cddeac717
moe sp + ep bug fix
2024-08-01 10:06:59 +08:00
hxwang
877d94bb8c
[moe] init moe plugin comm setting with sp
2024-08-01 10:06:59 +08:00
hxwang
09d6280d3e
[chore] minor fix
2024-08-01 10:06:59 +08:00
Haze188
404b16faf3
[Feature] MoE Ulysses Support ( #5918 )
* moe sp support
* moe sp bug solve
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-08-01 10:06:59 +08:00
botbw
e28e05345b
[moe] implement submesh initialization
2024-08-01 10:06:59 +08:00
haze188
5ed5e8cfba
solve hang when parallel mode = pp + dp
2024-08-01 10:06:59 +08:00
botbw
13b48ac0aa
[zero] solve hang
2024-08-01 10:06:59 +08:00
botbw
b5bfeb2efd
[moe] implement transit between non moe tp and ep
2024-08-01 10:06:59 +08:00
botbw
37443cc7e4
[test] pass mixtral shardformer test
2024-08-01 10:06:59 +08:00
hxwang
46c069b0db
[zero] solve hang
2024-08-01 10:06:59 +08:00
hxwang
a249e71946
[test] mixtral pp shard test
2024-08-01 10:06:59 +08:00
hxwang
0b76b57cd6
[test] add mixtral transformer test
2024-08-01 10:06:59 +08:00