Commit Graph

1395 Commits

Liu Ziming
6427c406cf [NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/strategy_generator.py code style (#2695)
Co-authored-by: shenggan <csg19971016@gmail.com>
2023-02-14 21:30:25 +08:00
アマデウス
534f68c83c [NFC] polish pipeline process group code style (#2694) 2023-02-14 18:12:01 +08:00
LuGY
56ff1921e9 [NFC] polish colossalai/context/moe_context.py code style (#2693) 2023-02-14 18:02:45 +08:00
Shawn-Kong
1712da2800 [NFC] polish colossalai/gemini/gemini_context.py code style (#2690) 2023-02-14 11:55:23 +08:00
HELSON
df4f020ee3 [zero1&2] only append parameters with gradients (#2681) 2023-02-13 18:00:16 +08:00
ver217
f0aa191f51 [gemini] fix colo_init_context (#2683) 2023-02-13 17:53:15 +08:00
Boyuan Yao
40c916b192 [autoparallel] Patch meta information of torch.nn.functional.softmax and torch.nn.Softmax (#2674)
* [autoparallel] softmax metainfo

* [autoparallel] softmax metainfo
2023-02-13 16:09:22 +08:00
HELSON
8213f89fd2 [gemini] add fake_release_chunk for keep-gathered chunk in the inference mode (#2671) 2023-02-13 14:35:32 +08:00
binmakeswell
9ab14b20b5 [doc] add CVPR tutorial (#2666) 2023-02-10 20:43:34 +08:00
Boyuan Yao
0385b26ebf [autoparallel] Patch meta information of torch.nn.LayerNorm (#2647)
* [autoparallel] layernorm metainfo patch

* [autoparallel] polish test
2023-02-10 14:29:24 +08:00
YuliangLiu0306
37df666f38 [autoparallel] refactor handlers which reshape input tensors (#2615)
* [autoparallel] refactor handlers which reshape input tensors

* polish
2023-02-08 15:02:49 +08:00
YuliangLiu0306
28398f1c70 add overlap option (#2613) 2023-02-08 15:02:31 +08:00
YuliangLiu0306
cb3d1bef62 [autoparallel] adapt autoparallel tests with latest api (#2626) 2023-02-08 15:02:12 +08:00
Boyuan Yao
90a9fdd91d [autoparallel] Patch meta information of torch.matmul (#2584)
* [autoparallel] matmul metainfo

* [auto_parallel] remove unused print

* [tests] skip test_matmul_handler when torch version is lower than 1.12.0
2023-02-08 11:05:31 +08:00
oahzxl
6ba8364881 [autochunk] support diffusion for autochunk (#2621)
* add alphafold benchmark

* rename alphafold test

* rename tests

* rename diffuser

* rename

* rename

* update transformer

* update benchmark

* update benchmark

* update bench memory

* update transformer benchmark

* rename

* support diffuser

* support unet metainfo prop

* fix bug and simplify code

* update linear and support some op

* optimize max region search, support conv

* update unet test

* support some op

* support groupnorm and interpolate

* update flow search

* add fix dim in node flow

* fix utils

* rename

* support diffusion

* update diffuser

* update chunk search

* optimize imports

* import

* finish autochunk
2023-02-07 16:32:45 +08:00
Frank Lee
8518263b80 [test] fixed the triton version for testing (#2608) 2023-02-07 13:49:38 +08:00
HELSON
552183bb74 [polish] polish ColoTensor and its submodules (#2537) 2023-02-03 11:44:10 +08:00
Frank Lee
dd14783f75 [kernel] fixed repeated loading of kernels (#2549)
* [kernel] fixed repeated loading of kernels

* polish code

* polish code
2023-02-03 09:47:13 +08:00
ver217
5b1854309a [hotfix] fix zero ddp warmup check (#2545) 2023-02-02 16:42:38 +08:00
oahzxl
fa3d66feb9 support unet metainfo prop (#2544) 2023-02-02 16:19:26 +08:00
oahzxl
05671fcb42 [autochunk] support multi outputs chunk search (#2538)
Support multi-output chunk search. Previously we only supported single-output chunk search. The new strategy is more flexible and improves performance by a large margin; for the transformer, it reduces memory by 40% compared with the previous search strategy.

1. rewrite search strategy to support multi outputs chunk search
2. fix many, many bugs
3. update tests
2023-02-01 13:18:51 +08:00
oahzxl
63199c6687 [autochunk] support transformer (#2526) 2023-01-31 16:00:06 +08:00
HELSON
a4ed9125ac [hotfix] fix lightning error (#2529) 2023-01-31 10:40:39 +08:00
HELSON
66dfcf5281 [gemini] update the gpt example (#2527) 2023-01-30 17:58:05 +08:00
HELSON
b528eea0f0 [zero] add zero wrappers (#2523)
* [zero] add zero wrappers

* change names

* add wrapper functions to init
2023-01-29 17:52:58 +08:00
Super Daniel
c198c7c0b0 [hotfix] meta tensor default device. (#2510) 2023-01-29 16:28:10 +08:00
HELSON
077a5cdde4 [zero] fix gradient clipping in hybrid parallelism (#2521)
* [zero] fix gradient clipping in hybrid parallelism

* [testing] change model name to avoid pytest warning

* [hotfix] fix unit testing
2023-01-29 15:09:57 +08:00
YuliangLiu0306
aa0f6686f9 [autoparallel] accelerate gpt2 training (#2495) 2023-01-29 11:13:15 +08:00
HELSON
707b11d4a0 [gemini] update ddp strict mode (#2518)
* [zero] add strict ddp mode for chunk init

* [gemini] update gpt example
2023-01-28 14:35:25 +08:00
HELSON
2d1a7dfe5f [zero] add strict ddp mode (#2508)
* [zero] add strict ddp mode

* [polish] add comments for strict ddp mode

* [zero] fix test error
2023-01-20 14:04:38 +08:00
oahzxl
c04f183237 [autochunk] support parsing blocks (#2506) 2023-01-20 11:18:17 +08:00
Super Daniel
35c0c0006e [utils] lazy init. (#2148)
* [utils] lazy init.

* [utils] remove description.

* [utils] complete.

* [utils] finalize.

* [utils] fix names.
2023-01-20 10:49:00 +08:00
oahzxl
72341e65f4 [auto-chunk] support extramsa (#3) (#2504) 2023-01-20 10:13:03 +08:00
Ziyue Jiang
0f02b8c6e6 add avg partition (#2483)
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-19 13:54:50 +08:00
アマデウス
99d9713b02 Revert "Update parallel_context.py (#2408)"
This reverts commit 7d5640b9db.
2023-01-19 12:27:48 +08:00
oahzxl
ecccc91f21 [autochunk] support autochunk on evoformer (#2497) 2023-01-19 11:41:00 +08:00
oahzxl
5db3a5bf42 [fx] allow control of ckpt_codegen init (#2498)
* [fx] allow control of ckpt_codegen init

Currently in ColoGraphModule, ActivationCheckpointCodeGen is set automatically in __init__, so no other codegen can be used.
So I add an arg to control whether ActivationCheckpointCodeGen is set in __init__ (a usage sketch follows this entry).

* code style
2023-01-18 17:02:46 +08:00
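For context on the #2498 entry above, a minimal usage sketch: it assumes the option is exposed as a `ckpt_codegen` keyword argument on `ColoGraphModule` and that `ColoTracer`/`ColoGraphModule` are importable from `colossalai.fx`; the flag name and import paths are assumptions, not taken from the commit log.

```python
# Hedged sketch, not from the repo: the `ckpt_codegen` keyword and the exact
# import locations are assumptions about how PR #2498 exposes the option.
import torch
from colossalai.fx import ColoGraphModule, ColoTracer


class TinyNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(8, 8)

    def forward(self, x):
        return self.proj(x)


model = TinyNet()
graph = ColoTracer().trace(model)

# Default: ColoGraphModule installs ActivationCheckpointCodeGen during __init__.
gm_ckpt = ColoGraphModule(model, graph)

# Hypothetical flag from #2498: skip the automatic codegen so a different
# codegen can be attached to gm_plain.graph afterwards.
gm_plain = ColoGraphModule(model, graph, ckpt_codegen=False)
```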
HELSON
d565a24849 [zero] add unit testings for hybrid parallelism (#2486) 2023-01-18 10:36:10 +08:00
oahzxl
4953b4ace1 [autochunk] support evoformer tracer (#2485)
Support the full evoformer tracer, which is a main module of alphafold; previously we only supported a simplified version of it.
1. support some of evoformer's ops in fx
2. support the evoformer test
3. add repos for test code
2023-01-16 19:25:05 +08:00
YuliangLiu0306
67e1912b59 [autoparallel] support origin activation ckpt on autoparallel system (#2468) 2023-01-16 16:25:13 +08:00
Ziyue Jiang
fef5c949c3 polish pp middleware (#2476)
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-13 16:56:01 +08:00
HELSON
a5dc4253c6 [zero] polish low level optimizer (#2473) 2023-01-13 14:56:17 +08:00
Frank Lee
8b7495dd54 [example] integrate seq-parallel tutorial with CI (#2463) 2023-01-13 14:40:05 +08:00
Jiarui Fang
867c8c2d3a [zero] low level optim supports ProcessGroup (#2464) 2023-01-13 10:05:58 +08:00
Frank Lee
14d9299360 [cli] fixed hostname mismatch error (#2465) 2023-01-12 14:52:09 +08:00
Haofan Wang
9358262992 Fix False warning in initialize.py (#2456)
* Update initialize.py

* pre-commit run check
2023-01-12 13:49:01 +08:00
YuliangLiu0306
8221fd7485 [autoparallel] update binary elementwise handler (#2451)
* [autoparallel] update binary elementwise handler

* polish
2023-01-12 09:35:10 +08:00
HELSON
2bfeb24308 [zero] add warning for ignored parameters (#2446) 2023-01-11 15:30:09 +08:00
Frank Lee
39163417a1 [example] updated the hybrid parallel tutorial (#2444)
* [example] updated the hybrid parallel tutorial

* polish code
2023-01-11 15:17:17 +08:00
HELSON
5521af7877 [zero] fix state_dict and load_state_dict for ddp ignored parameters (#2443)
* [ddp] add is_ddp_ignored

[ddp] rename to is_ddp_ignored

* [zero] fix state_dict and load_state_dict

* fix bugs

* [zero] update unit test for ZeroDDP
2023-01-11 14:55:41 +08:00