flybird11111
0688d92e2d
[shardformer]Fix lm parallel. (#5480)
* fix
* padding vocab_size when using pipeline parallelism
padding vocab_size when using pipeline parallelism
fix
fix
* fix
* fix
fix
fix
* fix gather output
* fix
* fix
* fix
fix resize embedding
fix resize embedding
* fix resize embedding
fix
* revert
* revert
* revert
* fix lm forward distribution
* fix
* test ci
* fix
2024-03-25 17:21:51 +08:00