Commit Graph

12 Commits

Author SHA1 Message Date
ver217
828b9e5e0d [hotfix] fix zero optim save/load state dict (#1381) 2022-07-28 17:19:39 +08:00
ver217
6b43c789fd fix zero optim backward_by_grad and save/load (#1353) 2022-07-21 16:43:58 +08:00
ver217
d068af81a3 [doc] update rst and docstring (#1351)
* update rst

* add zero docstr

* fix docstr

* remove fx.tracer.meta_patch

* fix docstr

* fix docstr

* update fx rst

* fix fx docstr

* remove useless rst
2022-07-21 15:54:53 +08:00
ver217
561e90493f [zero] zero optim supports loading local state dict (#1171)
* zero optim supports loading local state dict

* polish code

* add unit test
2022-06-24 17:25:57 +08:00
ver217
8106d7b8c7 [ddp] refactor ColoDDP and ZeroDDP (#1146)
* ColoDDP supports overwriting default process group

* rename ColoDDPV2 to ZeroDDP

* add docstr for ZeroDDP

* polish docstr
2022-06-21 16:35:23 +08:00
Frank Lee
14e5b11d7f [zero] fixed api consistency (#1098) 2022-06-10 16:59:59 +08:00
Frank Lee
cb18922c47 [doc] added documentation to chunk and chunk manager (#1094)
* [doc] added documentation to chunk and chunk manager

* polish code

* polish code

* polish code
2022-06-10 15:33:06 +08:00
ver217
1f894e033f [gemini] zero supports gemini (#1093)
* add placement policy

* add gemini mgr

* update mem stats collector

* update zero

* update zero optim

* fix bugs

* zero optim monitor os

* polish unit test

* polish unit test

* add assert
2022-06-10 14:48:28 +08:00
ver217
be01db37c8 [tensor] refactor chunk mgr and impl MemStatsCollectorV2 (#1077)
* polish chunk manager

* polish unit test

* impl add_extern_static_tensor for chunk mgr

* add mem stats collector v2

* polish code

* polish unit test

* polish code

* polish get chunks
2022-06-09 20:56:34 +08:00
ver217
c5cd3b0f35 [zero] zero optim copy chunk rather than copy tensor (#1070) 2022-06-07 10:30:46 +08:00
Jiarui Fang
49832b2344 [refactory] add nn.parallel module (#1068) 2022-06-06 15:34:41 +08:00
ver217
51b9a49655 [zero] add zero optimizer for ColoTensor (#1046)
* add zero optimizer

* torch ok

* unit test ok

* polish code

* fix bugs

* polish unit test

* polish zero optim

* polish colo ddp v2

* refactor folder structure

* add comment

* polish unit test

* polish zero optim

* polish unit test
2022-06-02 12:13:15 +08:00