Commit Graph

2276 Commits

hxwang
803878b2fd [moe] full test for deepseek and mixtral (pp + sp to fix) 2024-08-01 10:06:59 +08:00
hxwang
7077d38d5a [moe] finalize test (no pp) 2024-08-01 10:06:59 +08:00
haze188
2cddeac717 moe sp + ep bug fix 2024-08-01 10:06:59 +08:00
hxwang
877d94bb8c [moe] init moe plugin comm setting with sp 2024-08-01 10:06:59 +08:00
hxwang
09d6280d3e [chore] minor fix 2024-08-01 10:06:59 +08:00
Haze188
404b16faf3 [Feature] MoE Ulysses Support (#5918)
* moe sp support

* moe sp bug fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-08-01 10:06:59 +08:00
hxwang
3e2b6132b7 [moe] clean legacy code 2024-08-01 10:06:59 +08:00
hxwang
74eccac0db [moe] test deepseek 2024-08-01 10:06:59 +08:00
botbw
dc583aa576 [moe] implement tp 2024-08-01 10:06:59 +08:00
hxwang
102b784a10 [chore] arg pass & remove drop token 2024-08-01 10:06:59 +08:00
botbw
8dbb86899d [chore] trivial fix 2024-08-01 10:06:59 +08:00
botbw
014faf6c5a [chore] manually revert unintended commit 2024-08-01 10:06:59 +08:00
botbw
9b9b76bdcd [moe] add mixtral dp grad scaling when not all experts are activated 2024-08-01 10:06:59 +08:00
botbw
e28e05345b [moe] implement submesh initialization 2024-08-01 10:06:59 +08:00
haze188
5ed5e8cfba solve hang when parallel mode = pp + dp 2024-08-01 10:06:59 +08:00
botbw
13b48ac0aa [zero] solve hang 2024-08-01 10:06:59 +08:00
botbw
b5bfeb2efd [moe] implement transit between non moe tp and ep 2024-08-01 10:06:59 +08:00
botbw
37443cc7e4 [test] pass mixtral shardformer test 2024-08-01 10:06:59 +08:00
hxwang
46c069b0db [zero] solve hang 2024-08-01 10:06:59 +08:00
hxwang
0fad23c691 [chore] handle non member group 2024-08-01 10:06:59 +08:00
hxwang
a249e71946 [test] mixtral pp shard test 2024-08-01 10:06:59 +08:00
hxwang
8ae8525bdf [moe] fix plugin 2024-08-01 10:06:59 +08:00
hxwang
0b76b57cd6 [test] add mixtral transformer test 2024-08-01 10:06:59 +08:00
hxwang
f9b6fcf81f [test] add mixtral for sequence classification 2024-08-01 10:06:59 +08:00
Hongxin Liu
060892162a [zero] hotfix update master params (#5951) 2024-07-30 13:36:00 +08:00
Runyu Lu
bcf0181ecd [Feat] Distrifusion Acceleration Support for Diffusion Inference (#5895)
* Distrifusion Support source

* comp comm overlap optimization

* sd3 benchmark

* pixart distrifusion bug fix

* sd3 bug fix and benchmark

* generation bug fix

* naming fix

* add docstring, fix counter and shape error

* add reference

* readme and requirement
2024-07-30 10:43:26 +08:00
Hongxin Liu
7b38964e3a [shardformer] hotfix attn mask (#5947) 2024-07-29 19:10:06 +08:00
Hongxin Liu
9664b1bc19 [shardformer] hotfix attn mask (#5945) 2024-07-29 13:58:27 +08:00
Edenzzzz
2069472e96 [Hotfix] Fix ZeRO typo #5936
Co-authored-by: Edenzzzz <wtan45@wisc.edu>
2024-07-25 09:59:58 +08:00
Hongxin Liu
5fd0592767 [fp8] support all-gather flat tensor (#5932) 2024-07-24 16:55:20 +08:00
Gao, Ruiyuan
5fb958cc83 [FIX BUG] convert env param to int (#5934) 2024-07-24 10:30:40 +08:00
Insu Jang
a521ffc9f8 Add n_fused as an input from native_module (#5894) 2024-07-23 23:15:39 +08:00
Hongxin Liu
e86127925a [plugin] support all-gather overlap for hybrid parallel (#5919)
* [plugin] fixed all-gather overlap support for hybrid parallel
2024-07-18 15:33:03 +08:00
GuangyaoZhang
5b969fd831 fix shardformer fp8 communication training degradation 2024-07-18 07:16:36 +00:00
GuangyaoZhang
6a20f07b80 remove all to all 2024-07-17 07:14:55 +00:00
GuangyaoZhang
5a310b9ee1 fix rebase 2024-07-17 03:43:23 +00:00
GuangyaoZhang
457a0de79f shardformer fp8 2024-07-16 06:56:51 +00:00
アマデウス
530283dba0 fix object_to_tensor usage when torch>=2.3.0 (#5820) 2024-07-16 13:59:25 +08:00
Guangyao Zhang
2e28c793ce [compatibility] support torch 2.2 (#5875)
* Support Pytorch 2.2.2

* keep build_on_pr file and update .compatibility
2024-07-16 13:59:25 +08:00
Guangyao Zhang
1c961b20f3 [ShardFormer] fix qwen2 sp (#5903) 2024-07-15 13:58:06 +08:00
Stephan Kö
45c49dde96 [Auto Parallel]: Speed up intra-op plan generation by 44% (#5446)
* Remove unnecessary calls to deepcopy

* Build DimSpec's difference dict only once

This change considerably speeds up construction of DimSpec objects. The difference_dict is the same for each DimSpec object, so a single copy of it is enough (see the sketch after this entry).

* Fix documentation of DimSpec's difference method
2024-07-15 12:05:06 +08:00
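The change described above is the classic build-once, share-everywhere pattern: since every DimSpec carries an identical difference_dict, the table can be constructed a single time and reused by all instances. A minimal sketch of that pattern with a hypothetical DimSpec-like class (names and table contents are illustrative, not the actual ColossalAI implementation):

```python
class DimSpec:
    """Sketch: cache a lookup table that is identical for every instance."""

    _difference_dict = None  # shared across all instances, built lazily

    def __init__(self, shard_list):
        self.shard_list = shard_list

    @classmethod
    def _get_difference_dict(cls):
        # Build the expensive table once per process instead of once per object,
        # which is the optimization the commit describes.
        if cls._difference_dict is None:
            cls._difference_dict = cls._build_difference_dict()
        return cls._difference_dict

    @staticmethod
    def _build_difference_dict():
        # Placeholder contents: maps (source, target) sharding patterns
        # to a conversion cost.
        return {("R", "S0"): 1, ("S0", "R"): 1, ("R", "R"): 0}

    def difference(self, other):
        key = (",".join(self.shard_list), ",".join(other.shard_list))
        return self._get_difference_dict().get(key, 0)
```

With this layout, constructing many DimSpec objects no longer rebuilds the same dictionary each time, which is where the reported speed-up comes from.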
pre-commit-ci[bot]
51f916b11d [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2024-07-12 07:33:45 +00:00
BurkeHulk
1f1b856354 Merge remote-tracking branch 'origin/feature/fp8_comm' into feature/fp8_comm
# Conflicts:
#	colossalai/quantization/fp8.py
2024-07-12 15:29:41 +08:00
BurkeHulk
e88190184a support fp8 communication in pipeline parallelism 2024-07-12 15:25:25 +08:00
BurkeHulk
1e1959467e fix scaling algorithm in FP8 casting 2024-07-12 15:23:37 +08:00
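FP8 casting generally relies on a per-tensor scale derived from the tensor's maximum magnitude so that values fit the narrow FP8 range. A generic sketch of amax-based scaling (not necessarily the exact algorithm fixed in this commit; requires PyTorch >= 2.1 for torch.float8_e4m3fn):

```python
import torch

def fp8_cast(x: torch.Tensor):
    """Scale a tensor into the E4M3 range, cast to FP8, and return the
    scale so the receiver can dequantize after communication."""
    fp8_max = torch.finfo(torch.float8_e4m3fn).max  # ~448 for E4M3
    amax = x.abs().max().clamp(min=1e-12)           # avoid division by zero
    scale = fp8_max / amax.float()
    x_fp8 = (x.float() * scale).to(torch.float8_e4m3fn)
    return x_fp8, scale

def fp8_dequant(x_fp8: torch.Tensor, scale: torch.Tensor, dtype=torch.float32):
    """Undo the scaling after the FP8 payload has been transferred."""
    return (x_fp8.to(torch.float32) / scale).to(dtype)
```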
Hongxin Liu
c068ef0fa0 [zero] support all-gather overlap (#5898)
* [zero] support all-gather overlap

* [zero] add overlap all-gather flag

* [misc] fix typo

* [zero] update api
2024-07-11 18:59:59 +08:00
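The overlap feature named above follows the usual pattern of launching the parameter all-gather asynchronously so the communication runs alongside compute. A minimal sketch of that pattern with plain torch.distributed (illustrative only, not the ColossalAI plugin code; `flat_shard` is a hypothetical flattened parameter shard):

```python
import torch
import torch.distributed as dist

def prefetch_params(flat_shard: torch.Tensor, group=None):
    """Kick off an async all-gather of a flattened parameter shard so it
    can overlap with ongoing computation."""
    world_size = dist.get_world_size(group)
    gathered = torch.empty(world_size * flat_shard.numel(),
                           dtype=flat_shard.dtype, device=flat_shard.device)
    handle = dist.all_gather_into_tensor(gathered, flat_shard,
                                         group=group, async_op=True)
    return gathered, handle

# Typical use: start the gather for layer i+1, run layer i, then wait.
# gathered, handle = prefetch_params(next_layer_shard)
# out = layer_i(inputs)   # compute overlaps with the in-flight all-gather
# handle.wait()           # full parameters for layer i+1 are now available
```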
GuangyaoZhang
dbfa7d39fc fix typo 2024-07-10 08:13:26 +00:00
Guangyao Zhang
669849d74b [ShardFormer] Add Ulysses Sequence Parallelism support for Command-R, Qwen2 and ChatGLM (#5897) 2024-07-10 11:34:25 +08:00
Edenzzzz
fbf33ecd01 [Feature] Enable PP + SP for llama (#5868)
* fix cross-PP-stage position id length diff bug

* fix typo

* fix typo

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* use a one cross entropy func for all shardformer models

---------

Co-authored-by: Edenzzzz <wtan45@wisc.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-07-09 18:05:20 +08:00
Runyu Lu
66abf1c6e8 [HotFix] CI, import, requirements-test for #5838 (#5892)
* [Hot Fix] CI, import, requirements-test

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-07-08 22:32:06 +08:00