Commit Graph

1867 Commits

Author SHA1 Message Date
HELSON
707b11d4a0 [gemini] update ddp strict mode (#2518)
* [zero] add strict ddp mode for chunk init

* [gemini] update gpt example
2023-01-28 14:35:25 +08:00
HELSON
2d1a7dfe5f [zero] add strict ddp mode (#2508)
* [zero] add strict ddp mode

* [polish] add comments for strict ddp mode

* [zero] fix test error
2023-01-20 14:04:38 +08:00
oahzxl
c04f183237 [autochunk] support parsing blocks (#2506) 2023-01-20 11:18:17 +08:00
Super Daniel
35c0c0006e [utils] lazy init. (#2148)
* [utils] lazy init.

* [utils] remove description.

* [utils] complete.

* [utils] finalize.

* [utils] fix names.
2023-01-20 10:49:00 +08:00
oahzxl
72341e65f4 [auto-chunk] support extramsa (#3) (#2504) 2023-01-20 10:13:03 +08:00
Ziyue Jiang
0f02b8c6e6 add avg partition (#2483)
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-19 13:54:50 +08:00
アマデウス
99d9713b02 Revert "Update parallel_context.py (#2408)"
This reverts commit 7d5640b9db.
2023-01-19 12:27:48 +08:00
oahzxl
ecccc91f21 [autochunk] support autochunk on evoformer (#2497) 2023-01-19 11:41:00 +08:00
oahzxl
5db3a5bf42 [fx] allow control of ckpt_codegen init (#2498)
* [fx] allow control of ckpt_codegen init

Currently in ColoGraphModule, ActivationCheckpointCodeGen is set automatically in __init__, which prevents any other codegen from being used.
So I add an arg to control whether ActivationCheckpointCodeGen is set in __init__.

* code style
2023-01-18 17:02:46 +08:00
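The idea described in the commit above can be sketched as follows. This is a hypothetical illustration of the pattern, not ColossalAI's actual API: the class and attribute names (`GraphModuleSketch`, `use_ckpt_codegen`, `codegen`) are invented for the example, and only the design (an `__init__` flag deciding whether the checkpoint codegen is installed, leaving room for a custom one) comes from the commit message.

```python
# Hypothetical sketch, not the real ColoGraphModule: an __init__ flag
# controls whether the activation-checkpoint codegen is installed.

class DefaultCodeGen:
    name = "default"

class ActivationCheckpointCodeGen:
    name = "activation_checkpoint"

class GraphModuleSketch:
    def __init__(self, use_ckpt_codegen: bool = True):
        # Install the checkpoint codegen only when requested, so a
        # caller who passes False can assign a custom codegen later.
        if use_ckpt_codegen:
            self.codegen = ActivationCheckpointCodeGen()
        else:
            self.codegen = DefaultCodeGen()

gm = GraphModuleSketch(use_ckpt_codegen=False)
print(gm.codegen.name)  # -> default
```

With the flag defaulting to True, existing callers keep the old automatic behavior while new callers can opt out.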
HELSON
d565a24849 [zero] add unit testings for hybrid parallelism (#2486) 2023-01-18 10:36:10 +08:00
oahzxl
4953b4ace1 [autochunk] support evoformer tracer (#2485)
support the full evoformer tracer, a main module of AlphaFold; previously we only supported a simplified version of it.
1. support some evoformer's op in fx
2. support evoformer test
3. add repos for test code
2023-01-16 19:25:05 +08:00
YuliangLiu0306
67e1912b59 [autoparallel] support origin activation ckpt on autoparallel system (#2468) 2023-01-16 16:25:13 +08:00
Ziyue Jiang
fef5c949c3 polish pp middleware (#2476)
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-13 16:56:01 +08:00
HELSON
a5dc4253c6 [zero] polish low level optimizer (#2473) 2023-01-13 14:56:17 +08:00
Frank Lee
8b7495dd54 [example] integrate seq-parallel tutorial with CI (#2463) 2023-01-13 14:40:05 +08:00
Jiarui Fang
867c8c2d3a [zero] low level optim supports ProcessGroup (#2464) 2023-01-13 10:05:58 +08:00
Frank Lee
14d9299360 [cli] fixed hostname mismatch error (#2465) 2023-01-12 14:52:09 +08:00
Haofan Wang
9358262992 Fix False warning in initialize.py (#2456)
* Update initialize.py

* pre-commit run check
2023-01-12 13:49:01 +08:00
YuliangLiu0306
8221fd7485 [autoparallel] update binary elementwise handler (#2451)
* [autoparallel] update binary elementwise handler

* polish
2023-01-12 09:35:10 +08:00
HELSON
2bfeb24308 [zero] add warning for ignored parameters (#2446) 2023-01-11 15:30:09 +08:00
Frank Lee
39163417a1 [example] updated the hybrid parallel tutorial (#2444)
* [example] updated the hybrid parallel tutorial

* polish code
2023-01-11 15:17:17 +08:00
HELSON
5521af7877 [zero] fix state_dict and load_state_dict for ddp ignored parameters (#2443)
* [ddp] add is_ddp_ignored

[ddp] rename to is_ddp_ignored

* [zero] fix state_dict and load_state_dict

* fix bugs

* [zero] update unit test for ZeroDDP
2023-01-11 14:55:41 +08:00
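The fix above can be illustrated with a minimal sketch. This is not ColossalAI's code; the `Param` class and the `ddp_ignored` attribute are invented stand-ins for the real `is_ddp_ignored` check, and only the idea (parameters flagged as DDP-ignored are excluded when building the state dict) comes from the commit.

```python
# Hypothetical sketch of the fix: skip DDP-ignored parameters when
# building a state dict, so ZeroDDP only saves what it manages.

class Param:
    def __init__(self, value, ddp_ignored=False):
        self.value = value
        self.ddp_ignored = ddp_ignored

def is_ddp_ignored(param):
    # Stand-in for the real is_ddp_ignored helper added in #2434.
    return param.ddp_ignored

def state_dict(params):
    # Keep only parameters that DDP actually manages.
    return {name: p.value for name, p in params.items()
            if not is_ddp_ignored(p)}

params = {"weight": Param([1.0]), "aux": Param([2.0], ddp_ignored=True)}
print(state_dict(params))  # -> {'weight': [1.0]}
```

The matching `load_state_dict` would apply the same filter, so a checkpoint saved without the ignored parameters loads cleanly.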
YuliangLiu0306
2731531bc2 [autoparallel] integrate device mesh initialization into autoparallelize (#2393)
* [autoparallel] integrate device mesh initialization into autoparallelize

* add megatron solution

* update gpt autoparallel examples with latest api

* adapt beta value to fit the current computation cost
2023-01-11 14:03:49 +08:00
Frank Lee
c72c827e95 [cli] provided more details if colossalai run fail (#2442) 2023-01-11 13:56:42 +08:00
Super Daniel
c41e59e5ad [fx] allow native ckpt trace and codegen. (#2438) 2023-01-11 13:49:59 +08:00
YuliangLiu0306
41429b9b28 [autoparallel] add shard option (#2423) 2023-01-11 13:40:33 +08:00
HELSON
7829aa094e [ddp] add is_ddp_ignored (#2434)
[ddp] rename to is_ddp_ignored
2023-01-11 12:22:45 +08:00
HELSON
bb4e9a311a [zero] add inference mode and its unit test (#2418) 2023-01-11 10:07:37 +08:00
Jiarui Fang
93f62dd152 [autochunk] add autochunk feature 2023-01-10 16:04:42 +08:00
HELSON
dddacd2d2c [hotfix] add norm clearing for the overflow step (#2416) 2023-01-10 15:43:06 +08:00
oahzxl
7ab2db206f adapt new fx 2023-01-10 11:56:00 +08:00
oahzxl
e532679c95 Merge branch 'main' of https://github.com/oahzxl/ColossalAI into chunk 2023-01-10 11:29:01 +08:00
Haofan Wang
7d5640b9db Update parallel_context.py (#2408) 2023-01-10 11:27:23 +08:00
oahzxl
fd818cf144 change imports 2023-01-10 11:10:45 +08:00
oahzxl
a591d45b29 add available 2023-01-10 10:56:39 +08:00
oahzxl
615e7e68d9 update doc 2023-01-10 10:44:07 +08:00
oahzxl
7d4abaa525 add doc 2023-01-10 09:59:47 +08:00
oahzxl
1be0ac3cbf add doc for trace indice 2023-01-09 17:59:52 +08:00
oahzxl
0b6af554df remove useless function 2023-01-09 17:46:43 +08:00
oahzxl
d914a21d64 rename 2023-01-09 17:45:36 +08:00
oahzxl
865f2e0196 rename 2023-01-09 17:42:25 +08:00
HELSON
ea13a201bb [polish] polish code for get_static_torch_model (#2405)
* [gemini] polish code

* [testing] remove code

* [gemini] make more robust
2023-01-09 17:41:38 +08:00
oahzxl
a4ed5b0d0d rename in doc 2023-01-09 17:41:26 +08:00
oahzxl
1bb1f2ad89 rename 2023-01-09 17:38:16 +08:00
oahzxl
cb9817f75d rename function from index to indice 2023-01-09 17:34:30 +08:00
oahzxl
0ea903b94e rename trace_index to trace_indice 2023-01-09 17:25:13 +08:00
Frank Lee
551cafec14 [doc] updated kernel-related optimisers' docstring (#2385)
* [doc] updated kernel-related optimisers' docstring

* polish doc
2023-01-09 17:13:53 +08:00
oahzxl
065f0b4c27 add doc for search 2023-01-09 17:11:51 +08:00
oahzxl
a68d240ed5 add doc for search chunk 2023-01-09 16:54:08 +08:00
oahzxl
1951f7fa87 code style 2023-01-09 16:30:16 +08:00