13 Commits

Author SHA1 Message Date
Jiarui Fang
595bedf767 revert zero tensors back (#829) 2022-04-22 12:12:35 +08:00
Jiarui Fang
294a6060d0 [tensor] ZeRO use ColoTensor as the base class. (#828)
* [refactor] moving InsertPostInitMethodToModuleSubClasses to utils.

* [tensor] ZeRO use ColoTensor as the base class.

* polish
2022-04-22 12:00:48 +08:00
HELSON
22c4b88d56 [zero] refactor ShardedParamV2 for convenience (#742) 2022-04-13 14:54:26 +08:00
Jiarui Fang
f552b11294 [zero] label state for param fp16 and grad (#551) 2022-03-30 15:57:46 +08:00
Jiarui Fang
214da761d4 [zero] add stateful tensor (#549) 2022-03-30 13:51:37 +08:00
Jiarui Fang
8d8c5407c0 [zero] refactor model data tracing (#522) 2022-03-25 18:03:32 +08:00
Jiarui Fang
b5f43acee3 [zero] find miss code (#378) 2022-03-11 15:50:28 +08:00
jiaruifang
d9217e1960 Revert "[zero] bucketized tensor cpu gpu copy (#368)"
This reverts commit bef05489b6.
2022-03-11 15:50:28 +08:00
Jiarui Fang
00670c870e [zero] bucketized tensor cpu gpu copy (#368) 2022-03-11 15:50:28 +08:00
Jiarui Fang
44e4891f57 [zero] able to place params on cpu after zero init context (#365)
* place params on cpu after zero init context

* polish code
2022-03-11 15:50:28 +08:00
Jiarui Fang
ea2872073f [zero] global model data memory tracer (#360) 2022-03-11 15:50:28 +08:00
Jiarui Fang
c9e7d9582d [zero] polish shard strategy (#310)
* init shard param from shape tuple

* add more unitest for shard param

* add set_payload method for ShardedParam

* [zero] add shareded tensor class

* polish code

* add shard stratgy

* move shard and gather logic to shard strategy from shard tensor.

* polish code
2022-03-11 15:50:28 +08:00
Jiarui Fang
80364c7686 [zero] sharded tensor (#305)
* init shard param from shape tuple

* add more unitest for shard param

* add set_payload method for ShardedParam

* [zero] add shareded tensor class

* polish code
2022-03-11 15:50:28 +08:00