21 Commits

Author SHA1 Message Date
ver217
dba7e0cfb4 make AutoPlacementPolicy configurable (#1191) 2022-06-30 15:18:30 +08:00
Jiarui Fang
372f791444 [refactor] move chunk and chunkmgr to directory gemini (#1182) 2022-06-29 13:31:02 +08:00
ver217
54aabb8da4 [gemini] refactor gemini mgr (#1151)
* refactor gemini mgr

* update __init__
2022-06-22 11:54:36 +08:00
ver217
7d14b473f0 [gemini] gemini mgr supports "cpu" placement policy (#1118)
* update gemini mgr

* update chunk

* add docstr

* polish placement policy

* update test chunk

* update test zero

* polish unit test

* remove useless unit test
2022-06-15 15:05:19 +08:00
Frank Lee
14e5b11d7f [zero] fixed api consistency (#1098) 2022-06-10 16:59:59 +08:00
ver217
1f894e033f [gemini] zero supports gemini (#1093)
* add placement policy

* add gemini mgr

* update mem stats collector

* update zero

* update zero optim

* fix bugs

* zero optim monitor os

* polish unit test

* polish unit test

* add assert
2022-06-10 14:48:28 +08:00
ver217
be01db37c8 [tensor] refactor chunk mgr and impl MemStatsCollectorV2 (#1077)
* polish chunk manager

* polish unit test

* impl add_extern_static_tensor for chunk mgr

* add mem stats collector v2

* polish code

* polish unit test

* polish code

* polish get chunks
2022-06-09 20:56:34 +08:00
ver217
c4d903e64a [gemini] accelerate adjust_layout() (#878)
* add lru cache

* polish code

* update unit test

* fix sharded optim
2022-04-26 18:08:31 +08:00
HELSON
425b4a96b8 [gemini] polish stateful_tensor_mgr (#876) 2022-04-26 15:05:03 +08:00
HELSON
3107817172 [gemini] add stateful tensor container (#867) 2022-04-25 14:58:16 +08:00
HELSON
f0e654558f [gemini] polish code (#855) 2022-04-25 10:40:14 +08:00
ver217
d7e0303d1e [zero] use GeminiMemoryManager when sampling model data (#850) 2022-04-24 17:17:22 +08:00
ver217
0dea140760 [hotfix] add deconstructor for stateful tensor (#848)
* add deconstructor for stateful tensor

* fix colo init context
2022-04-24 15:03:04 +08:00
HELSON
e5ea3fdeef [gemini] add GeminiMemoryManager (#832)
* refactor StatefulTensor, tensor utilities

* add unit test for GeminiMemoryManager
2022-04-24 13:08:48 +08:00
Jiarui Fang
0ce8924ceb [tensor] reorganize files (#820) 2022-04-21 14:15:48 +08:00
Jiarui Fang
ab962b9735 [gemini] a new tensor structure (#818)
* Revert "[zero] add ZeroTensorShardStrategy (#793)"

This reverts commit 88759e289e.

* [gemini] set cpu memory capacity

* [log] local throughput collecting

* polish

* polish

* polish

* polish code

* polish

* polish code

* add a new tensor structure and override linear for it

* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish
2022-04-21 11:42:37 +08:00
Jiarui Fang
3ddbd1bce1 [gemini] collect cpu-gpu moving volume in each iteration (#813) 2022-04-20 11:29:48 +08:00
Jiarui Fang
681addb512 [refactor] moving grad acc logic to engine (#804) 2022-04-19 14:03:21 +08:00
Jiarui Fang
4d9332b4c5 [refactor] moving memtracer to gemini (#801) 2022-04-19 10:13:08 +08:00
ver217
846406a07a [gemini] fix auto tensor placement policy (#775) 2022-04-16 21:29:31 +08:00
Jiarui Fang
10ef8afdd2 [gemini] init gemini individual directory (#754) 2022-04-14 16:40:26 +08:00