Commit Graph

2230 Commits

Author SHA1 Message Date
Frank Lee
8518263b80 [test] fixed the triton version for testing (#2608) 2023-02-07 13:49:38 +08:00
HELSON
552183bb74 [polish] polish ColoTensor and its submodules (#2537) 2023-02-03 11:44:10 +08:00
Frank Lee
dd14783f75 [kernel] fixed repeated loading of kernels (#2549)
* [kernel] fixed repeated loading of kernels

* polish code

* polish code
2023-02-03 09:47:13 +08:00
ver217
5b1854309a [hotfix] fix zero ddp warmup check (#2545) 2023-02-02 16:42:38 +08:00
oahzxl
fa3d66feb9 support unet metainfo prop (#2544) 2023-02-02 16:19:26 +08:00
oahzxl
05671fcb42 [autochunk] support multi outputs chunk search (#2538)
Support multi-output chunk search. Previously we only supported single-output chunk search. The new strategy is more flexible and improves performance by a large margin; for transformers, it reduces memory by 40% compared with the previous search strategy (see the chunked-execution sketch after this entry).

1. rewrite the search strategy to support multi-output chunk search
2. fix many, many bugs
3. update tests
2023-02-01 13:18:51 +08:00
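
A minimal sketch, for background only, of what multi-output chunked execution means here: a region's input is processed slice by slice instead of all at once, and each of the region's outputs is gathered separately. The helper name chunked_forward_multi and its signature are illustrative assumptions, not the ColossalAI autochunk API.

    import torch

    def chunked_forward_multi(fn, x, chunk_size, dim=0):
        # Run the region slice by slice instead of materializing the full
        # intermediate tensors at once; fn returns a tuple of tensors.
        pieces = [fn(p) for p in torch.split(x, chunk_size, dim=dim)]
        # Gather each output stream separately and concatenate along dim.
        return tuple(torch.cat(outs, dim=dim) for outs in zip(*pieces))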
oahzxl
63199c6687 [autochunk] support transformer (#2526) 2023-01-31 16:00:06 +08:00
HELSON
a4ed9125ac [hotfix] fix lightning error (#2529) 2023-01-31 10:40:39 +08:00
HELSON
66dfcf5281 [gemini] update the gpt example (#2527) 2023-01-30 17:58:05 +08:00
HELSON
b528eea0f0 [zero] add zero wrappers (#2523)
* [zero] add zero wrappers

* change names

* add wrapper functions to init
2023-01-29 17:52:58 +08:00
Super Daniel
c198c7c0b0 [hotfix] meta tensor default device. (#2510) 2023-01-29 16:28:10 +08:00
HELSON
077a5cdde4 [zero] fix gradient clipping in hybrid parallelism (#2521)
* [zero] fix gradient clipping in hybrid parallelism

* [testing] change model name to avoid pytest warning

* [hotfix] fix unit testing
2023-01-29 15:09:57 +08:00
YuliangLiu0306
aa0f6686f9 [autoparallel] accelerate gpt2 training (#2495) 2023-01-29 11:13:15 +08:00
HELSON
707b11d4a0 [gemini] update ddp strict mode (#2518)
* [zero] add strict ddp mode for chunk init

* [gemini] update gpt example
2023-01-28 14:35:25 +08:00
HELSON
2d1a7dfe5f [zero] add strict ddp mode (#2508)
* [zero] add strict ddp mode

* [polish] add comments for strict ddp mode

* [zero] fix test error
2023-01-20 14:04:38 +08:00
oahzxl
c04f183237 [autochunk] support parsing blocks (#2506) 2023-01-20 11:18:17 +08:00
Super Daniel
35c0c0006e [utils] lazy init. (#2148)
* [utils] lazy init.

* [utils] remove description.

* [utils] complete.

* [utils] finalize.

* [utils] fix names.
2023-01-20 10:49:00 +08:00
oahzxl
72341e65f4 [auto-chunk] support extramsa (#3) (#2504) 2023-01-20 10:13:03 +08:00
Ziyue Jiang
0f02b8c6e6 add avg partition (#2483)
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-19 13:54:50 +08:00
アマデウス
99d9713b02 Revert "Update parallel_context.py (#2408)"
This reverts commit 7d5640b9db.
2023-01-19 12:27:48 +08:00
oahzxl
ecccc91f21 [autochunk] support autochunk on evoformer (#2497) 2023-01-19 11:41:00 +08:00
oahzxl
5db3a5bf42 [fx] allow control of ckpt_codegen init (#2498)
* [fx] allow control of ckpt_codegen init

Currently in ColoGraphModule, ActivationCheckpointCodeGen is set automatically in __init__, which prevents any other codegen from being used. This adds an argument to control whether ActivationCheckpointCodeGen is set in __init__ (a minimal sketch of the pattern follows this entry).

* code style
2023-01-18 17:02:46 +08:00
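
A minimal sketch, for illustration only, of the constructor-flag pattern described in #2498; the class and argument names below (ColoGraphModuleSketch, use_ckpt_codegen, ActivationCheckpointCodeGenStub) are stand-ins rather than the actual ColossalAI API.

    class ActivationCheckpointCodeGenStub:
        """Stand-in for the codegen that emits activation-checkpointed code."""

    class ColoGraphModuleSketch:
        def __init__(self, graph, use_ckpt_codegen=True):
            self.graph = graph
            # Install the checkpoint codegen only when asked to, so callers
            # can attach a different codegen after construction instead.
            self.codegen = ActivationCheckpointCodeGenStub() if use_ckpt_codegen else None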
HELSON
d565a24849 [zero] add unit testings for hybrid parallelism (#2486) 2023-01-18 10:36:10 +08:00
oahzxl
4953b4ace1 [autochunk] support evoformer tracer (#2485)
Support the full Evoformer tracer, which is a main module of AlphaFold; previously we only supported a simplified version of it.
1. support some Evoformer ops in fx
2. support Evoformer tests
3. add repos for test code
2023-01-16 19:25:05 +08:00
YuliangLiu0306
67e1912b59 [autoparallel] support origin activation ckpt on autoparallel system (#2468) 2023-01-16 16:25:13 +08:00
Ziyue Jiang
fef5c949c3 polish pp middleware (#2476)
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-13 16:56:01 +08:00
HELSON
a5dc4253c6 [zero] polish low level optimizer (#2473) 2023-01-13 14:56:17 +08:00
Frank Lee
8b7495dd54 [example] integrate seq-parallel tutorial with CI (#2463) 2023-01-13 14:40:05 +08:00
Jiarui Fang
867c8c2d3a [zero] low level optim supports ProcessGroup (#2464) 2023-01-13 10:05:58 +08:00
Frank Lee
14d9299360 [cli] fixed hostname mismatch error (#2465) 2023-01-12 14:52:09 +08:00
Haofan Wang
9358262992 Fix False warning in initialize.py (#2456)
* Update initialize.py

* pre-commit run check
2023-01-12 13:49:01 +08:00
YuliangLiu0306
8221fd7485 [autoparallel] update binary elementwise handler (#2451)
* [autoparallel] update binary elementwise handler

* polish
2023-01-12 09:35:10 +08:00
HELSON
2bfeb24308 [zero] add warning for ignored parameters (#2446) 2023-01-11 15:30:09 +08:00
Frank Lee
39163417a1 [example] updated the hybrid parallel tutorial (#2444)
* [example] updated the hybrid parallel tutorial

* polish code
2023-01-11 15:17:17 +08:00
HELSON
5521af7877 [zero] fix state_dict and load_state_dict for ddp ignored parameters (#2443)
* [ddp] add is_ddp_ignored

[ddp] rename to is_ddp_ignored

* [zero] fix state_dict and load_state_dict

* fix bugs

* [zero] update unit test for ZeroDDP
2023-01-11 14:55:41 +08:00
YuliangLiu0306
2731531bc2 [autoparallel] integrate device mesh initialization into autoparallelize (#2393)
* [autoparallel] integrate device mesh initialization into autoparallelize

* add megatron solution

* update gpt autoparallel examples with latest api

* adapt beta value to fit the current computation cost
2023-01-11 14:03:49 +08:00
Frank Lee
c72c827e95 [cli] provided more details if colossalai run fail (#2442) 2023-01-11 13:56:42 +08:00
Super Daniel
c41e59e5ad [fx] allow native ckpt trace and codegen. (#2438) 2023-01-11 13:49:59 +08:00
YuliangLiu0306
41429b9b28 [autoparallel] add shard option (#2423) 2023-01-11 13:40:33 +08:00
HELSON
7829aa094e [ddp] add is_ddp_ignored (#2434)
[ddp] rename to is_ddp_ignored
2023-01-11 12:22:45 +08:00
HELSON
bb4e9a311a [zero] add inference mode and its unit test (#2418) 2023-01-11 10:07:37 +08:00
Jiarui Fang
93f62dd152 [autochunk] add autochunk feature 2023-01-10 16:04:42 +08:00
HELSON
dddacd2d2c [hotfix] add norm clearing for the overflow step (#2416) 2023-01-10 15:43:06 +08:00
oahzxl
7ab2db206f adapt new fx 2023-01-10 11:56:00 +08:00
oahzxl
e532679c95 Merge branch 'main' of https://github.com/oahzxl/ColossalAI into chunk 2023-01-10 11:29:01 +08:00
Haofan Wang
7d5640b9db Update parallel_context.py (#2408) 2023-01-10 11:27:23 +08:00
oahzxl
fd818cf144 change imports 2023-01-10 11:10:45 +08:00
oahzxl
a591d45b29 add available 2023-01-10 10:56:39 +08:00
oahzxl
615e7e68d9 update doc 2023-01-10 10:44:07 +08:00
oahzxl
7d4abaa525 add doc 2023-01-10 09:59:47 +08:00