Commit Graph

12 Commits

Author | SHA1 | Message | Date
HELSON | f92c100ddd | [checkpoint] use gather_tensor in checkpoint and update its unit test (#1339) | 2022-07-19 14:15:28 +08:00
Jiarui Fang | 9e4c6449b0 | [checkpoint] add ColoOptimizer checkpointing (#1316) | 2022-07-15 09:52:55 +08:00
Jiarui Fang | 85f933b58b | [Optimizer] Remove useless ColoOptimizer (#1312) | 2022-07-14 16:57:48 +08:00
Jiarui Fang | 9f10524313 | [Optimizer] polish the init method of ColoOptimizer (#1310) | 2022-07-14 16:37:33 +08:00
Jiarui Fang | 3ef3791a3b | [checkpoint] add test for bert and hotfix save bugs (#1297) | 2022-07-14 15:38:18 +08:00
Jiarui Fang | c92f84fcdb | [tensor] distributed checkpointing for parameters (#1240) | 2022-07-12 15:51:06 +08:00
Jiarui Fang | 9bcd2fd4af | [tensor] a shorter shard and replicate spec (#1245) | 2022-07-11 15:51:48 +08:00
Jiarui Fang | 20da6e48c8 | [checkpoint] save sharded optimizer states (#1237) | 2022-07-08 16:33:13 +08:00
Jiarui Fang | 3b500984b1 | [tensor] fix some unittests (#1234) | 2022-07-08 14:18:30 +08:00
Yi Zhao | 04537bf83e | [checkpoint]support generalized scheduler (#1222) | 2022-07-07 18:16:38 +08:00
Jiarui Fang | 52736205d9 | [checkpoint] make unitest faster (#1217) | 2022-07-06 17:39:46 +08:00
Jiarui Fang | f38006ea83 | [checkpoint] checkpoint for ColoTensor Model (#1196) | 2022-07-06 17:22:03 +08:00