Commit Graph

1395 Commits

Liu Ziming
6427c406cf [NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/strategy_generator.py code style (#2695)
Co-authored-by: shenggan <csg19971016@gmail.com>
2023-02-14 21:30:25 +08:00
アマデウス
534f68c83c [NFC] polish pipeline process group code style (#2694) 2023-02-14 18:12:01 +08:00
LuGY
56ff1921e9 [NFC] polish colossalai/context/moe_context.py code style (#2693) 2023-02-14 18:02:45 +08:00
Shawn-Kong
1712da2800 [NFC] polish colossalai/gemini/gemini_context.py code style (#2690) 2023-02-14 11:55:23 +08:00
HELSON
df4f020ee3 [zero1&2] only append parameters with gradients (#2681) 2023-02-13 18:00:16 +08:00
ver217
f0aa191f51 [gemini] fix colo_init_context (#2683) 2023-02-13 17:53:15 +08:00
Boyuan Yao
40c916b192 [autoparallel] Patch meta information of torch.nn.functional.softmax and torch.nn.Softmax (#2674)
* [autoparallel] softmax metainfo

* [autoparallel] softmax metainfo
2023-02-13 16:09:22 +08:00
HELSON
8213f89fd2 [gemini] add fake_release_chunk for keep-gathered chunk in the inference mode (#2671) 2023-02-13 14:35:32 +08:00
binmakeswell
9ab14b20b5 [doc] add CVPR tutorial (#2666) 2023-02-10 20:43:34 +08:00
Boyuan Yao
0385b26ebf [autoparallel] Patch meta information of torch.nn.LayerNorm (#2647)
* [autoparallel] layernorm metainfo patch

* [autoparallel] polish test
2023-02-10 14:29:24 +08:00
YuliangLiu0306
37df666f38 [autoparallel] refactor handlers which reshape input tensors (#2615)
* [autoparallel] refactor handlers which reshape input tensors

* polish
2023-02-08 15:02:49 +08:00
YuliangLiu0306
28398f1c70 add overlap option (#2613) 2023-02-08 15:02:31 +08:00
YuliangLiu0306
cb3d1bef62 [autoparallel] adapt autoparallel tests with latest api (#2626) 2023-02-08 15:02:12 +08:00
Boyuan Yao
90a9fdd91d [autoparallel] Patch meta information of torch.matmul (#2584)
* [autoparallel] matmul metainfo

* [auto_parallel] remove unused print

* [tests] skip test_matmul_handler when torch version is lower than 1.12.0
2023-02-08 11:05:31 +08:00
oahzxl
6ba8364881 [autochunk] support diffusion for autochunk (#2621)
* add alphafold benchmark

* rename alphafold test

* rename tests

* rename diffuser

* rename

* rename

* update transformer

* update benchmark

* update benchmark

* update bench memory

* update transformer benchmark

* rename

* support diffuser

* support unet metainfo prop

* fix bug and simplify code

* update linear and support some op

* optimize max region search, support conv

* update unet test

* support some op

* support groupnorm and interpolate

* update flow search

* add fix dim in node flow

* fix utils

* rename

* support diffusion

* update diffuser

* update chunk search

* optimize imports

* import

* finish autochunk
2023-02-07 16:32:45 +08:00
Frank Lee
8518263b80 [test] fixed the triton version for testing (#2608) 2023-02-07 13:49:38 +08:00
HELSON
552183bb74 [polish] polish ColoTensor and its submodules (#2537) 2023-02-03 11:44:10 +08:00
Frank Lee
dd14783f75 [kernel] fixed repeated loading of kernels (#2549)
* [kernel] fixed repeated loading of kernels

* polish code

* polish code
2023-02-03 09:47:13 +08:00
ver217
5b1854309a [hotfix] fix zero ddp warmup check (#2545) 2023-02-02 16:42:38 +08:00
oahzxl
fa3d66feb9 support unet metainfo prop (#2544) 2023-02-02 16:19:26 +08:00
oahzxl
05671fcb42 [autochunk] support multi outputs chunk search (#2538)
Support multi-output chunk search. Previously we only supported single-output chunk search. The new strategy is more flexible and improves performance by a large margin; for the transformer, it reduces memory by 40% compared with the previous search strategy.

1. rewrite search strategy to support multi outputs chunk search
2. fix many, many bugs
3. update tests
2023-02-01 13:18:51 +08:00
oahzxl
63199c6687 [autochunk] support transformer (#2526) 2023-01-31 16:00:06 +08:00
HELSON
a4ed9125ac [hotfix] fix lightning error (#2529) 2023-01-31 10:40:39 +08:00
HELSON
66dfcf5281 [gemini] update the gpt example (#2527) 2023-01-30 17:58:05 +08:00
HELSON
b528eea0f0 [zero] add zero wrappers (#2523)
* [zero] add zero wrappers

* change names

* add wrapper functions to init
2023-01-29 17:52:58 +08:00
Super Daniel
c198c7c0b0 [hotfix] meta tensor default device. (#2510) 2023-01-29 16:28:10 +08:00
HELSON
077a5cdde4 [zero] fix gradient clipping in hybrid parallelism (#2521)
* [zero] fix gradient clipping in hybrid parallelism

* [testing] change model name to avoid pytest warning

* [hotfix] fix unit testing
2023-01-29 15:09:57 +08:00
YuliangLiu0306
aa0f6686f9 [autoparallel] accelerate gpt2 training (#2495) 2023-01-29 11:13:15 +08:00
HELSON
707b11d4a0 [gemini] update ddp strict mode (#2518)
* [zero] add strict ddp mode for chunk init

* [gemini] update gpt example
2023-01-28 14:35:25 +08:00
HELSON
2d1a7dfe5f [zero] add strict ddp mode (#2508)
* [zero] add strict ddp mode

* [polish] add comments for strict ddp mode

* [zero] fix test error
2023-01-20 14:04:38 +08:00
oahzxl
c04f183237 [autochunk] support parsing blocks (#2506) 2023-01-20 11:18:17 +08:00
Super Daniel
35c0c0006e [utils] lazy init. (#2148)
* [utils] lazy init.

* [utils] remove description.

* [utils] complete.

* [utils] finalize.

* [utils] fix names.
2023-01-20 10:49:00 +08:00
oahzxl
72341e65f4 [auto-chunk] support extramsa (#3) (#2504) 2023-01-20 10:13:03 +08:00
Ziyue Jiang
0f02b8c6e6 add avg partition (#2483)
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-19 13:54:50 +08:00
アマデウス
99d9713b02 Revert "Update parallel_context.py (#2408)"
This reverts commit 7d5640b9db.
2023-01-19 12:27:48 +08:00
oahzxl
ecccc91f21 [autochunk] support autochunk on evoformer (#2497) 2023-01-19 11:41:00 +08:00
oahzxl
5db3a5bf42 [fx] allow control of ckpt_codegen init (#2498)
* [fx] allow control of ckpt_codegen init

Currently in ColoGraphModule, ActivationCheckpointCodeGen is set automatically in __init__, so no other codegen can be used.
So I add an arg to control whether ActivationCheckpointCodeGen is set in __init__ (a usage sketch follows this entry).

* code style
2023-01-18 17:02:46 +08:00
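For context on the #2498 entry above, a minimal usage sketch: it assumes the option is exposed as a `ckpt_codegen` keyword argument on `ColoGraphModule` and that `ColoTracer`/`ColoGraphModule` are importable from `colossalai.fx`; the flag name and import paths are assumptions, not taken from the commit log.

```python
# Hedged sketch, not from the repo: the `ckpt_codegen` keyword and the exact
# import locations are assumptions about how PR #2498 exposes the option.
import torch
from colossalai.fx import ColoGraphModule, ColoTracer


class TinyNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(8, 8)

    def forward(self, x):
        return self.proj(x)


model = TinyNet()
graph = ColoTracer().trace(model)

# Default: ColoGraphModule installs ActivationCheckpointCodeGen during __init__.
gm_ckpt = ColoGraphModule(model, graph)

# Hypothetical flag from #2498: skip the automatic codegen so a different
# codegen can be attached to gm_plain.graph afterwards.
gm_plain = ColoGraphModule(model, graph, ckpt_codegen=False)
```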
HELSON
d565a24849 [zero] add unit testings for hybrid parallelism (#2486) 2023-01-18 10:36:10 +08:00
oahzxl
4953b4ace1 [autochunk] support evoformer tracer (#2485)
Support the full evoformer tracer, which is a main module of alphafold; previously we only supported a simplified version of it.
1. support some of evoformer's ops in fx
2. support the evoformer test
3. add repos for test code
2023-01-16 19:25:05 +08:00
YuliangLiu0306
67e1912b59 [autoparallel] support origin activation ckpt on autoparallel system (#2468) 2023-01-16 16:25:13 +08:00
Ziyue Jiang
fef5c949c3 polish pp middleware (#2476)
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-13 16:56:01 +08:00
HELSON
a5dc4253c6 [zero] polish low level optimizer (#2473) 2023-01-13 14:56:17 +08:00
Frank Lee
8b7495dd54 [example] integrate seq-parallel tutorial with CI (#2463) 2023-01-13 14:40:05 +08:00
Jiarui Fang
867c8c2d3a [zero] low level optim supports ProcessGroup (#2464) 2023-01-13 10:05:58 +08:00
Frank Lee
14d9299360 [cli] fixed hostname mismatch error (#2465) 2023-01-12 14:52:09 +08:00
Haofan Wang
9358262992 Fix False warning in initialize.py (#2456)
* Update initialize.py

* pre-commit run check
2023-01-12 13:49:01 +08:00
YuliangLiu0306
8221fd7485 [autoparallel] update binary elementwise handler (#2451)
* [autoparallel] update binary elementwise handler

* polish
2023-01-12 09:35:10 +08:00
HELSON
2bfeb24308 [zero] add warning for ignored parameters (#2446) 2023-01-11 15:30:09 +08:00
Frank Lee
39163417a1 [example] updated the hybrid parallel tutorial (#2444)
* [example] updated the hybrid parallel tutorial

* polish code
2023-01-11 15:17:17 +08:00
HELSON
5521af7877 [zero] fix state_dict and load_state_dict for ddp ignored parameters (#2443)
* [ddp] add is_ddp_ignored

[ddp] rename to is_ddp_ignored

* [zero] fix state_dict and load_state_dict

* fix bugs

* [zero] update unit test for ZeroDDP
2023-01-11 14:55:41 +08:00