[Feature] MoE Ulysses Support (#5918)

* MoE SP support

* MoE SP bug fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Author: Haze188
Date: 2024-07-18 11:37:56 +08:00
Committed by: Hongxin Liu
Parent: 3e2b6132b7
Commit: 404b16faf3
6 changed files with 571 additions and 72 deletions


@@ -48,11 +48,13 @@ loss_fn = lambda x: x.loss
 loss_fn_for_seq_classification = lambda output: output.logits.mean()
 config = MixtralConfig(
-    hidden_size=256,
-    intermediate_size=256,
-    num_attention_heads=64,
+    hidden_size=32,
+    intermediate_size=32,
+    num_attention_heads=8,
     num_hidden_layers=2,
     vocab_size=1000,
+    attn_implementation="flash_attention_2",
+    torch_dtype="float16",
     output_router_logits=True,
 )
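
For reference, the hunk above assembles into the test config below. This is a minimal sketch, assuming a recent transformers release that ships MixtralConfig; note that attn_implementation="flash_attention_2" additionally requires the flash-attn package and a CUDA device once the model is actually built. The two loss lambdas are copied from the surrounding context of the hunk.

from transformers import MixtralConfig

# Loss helpers used by the test (from the hunk's context lines).
loss_fn = lambda x: x.loss
loss_fn_for_seq_classification = lambda output: output.logits.mean()

config = MixtralConfig(
    hidden_size=32,                           # shrunk from 256 to keep the test lightweight
    intermediate_size=32,                     # shrunk from 256
    num_attention_heads=8,                    # shrunk from 64
    num_hidden_layers=2,
    vocab_size=1000,
    attn_implementation="flash_attention_2",  # newly added in this commit
    torch_dtype="float16",                    # newly added; flash attention needs fp16/bf16
    output_router_logits=True,                # keep MoE router logits for the auxiliary loss
)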