haze188
2d73efdfdd
[bugfix] fix colo attn bug
2024-07-24 06:53:24 +00:00
hxwang
e521890d32
[test] add check
2024-07-23 09:38:05 +00:00
haze188
4b6fbaf956
[moe] deepseek moe sp support
2024-07-23 06:39:49 +00:00
botbw
91f84f6a5f
[bug] fix: logger somehow hangs the program
2024-07-23 06:17:51 +00:00
hxwang
e31d2ebcf7
[test] fix test: test_zero1_2
2024-07-22 05:36:20 +00:00
hxwang
c67e553fd3
[moe] remove ops
2024-07-22 04:00:42 +00:00
hxwang
05a78d2f41
[chore] solve moe ckpt test failure and some other arg-passing failures
2024-07-22 03:53:02 +00:00
hxwang
c27f5d9731
[chore] minor fix after rebase
2024-07-19 07:53:40 +00:00
hxwang
783aafa327
[moe] full test for deepseek and mixtral (pp + sp to fix)
2024-07-19 07:32:56 +00:00
hxwang
162e2d935c
[moe] finalize test (no pp)
2024-07-19 07:32:56 +00:00
haze188
b91cdccf2e
moe sp + ep bug fix
2024-07-19 07:32:55 +00:00
hxwang
8e85523a42
[moe] init moe plugin comm setting with sp
2024-07-19 07:32:54 +00:00
hxwang
f0599a0c19
[chore] minor fix
2024-07-19 07:32:02 +00:00
Haze188
633849f438
[Feature] MoE Ulysses Support (#5918)
* moe sp support
* moe sp bug solve
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-07-19 07:32:01 +00:00
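For context on the Ulysses approach this feature implements: sequence parallelism of this style trades a sequence-dimension shard for an attention-head shard via an all-to-all before attention (and the inverse exchange afterwards). A minimal sketch of that exchange; the function name, layout, and shapes are assumptions for illustration, not the plugin's actual API:

```python
import torch
import torch.distributed as dist

def ulysses_all_to_all(x: torch.Tensor, sp_group) -> torch.Tensor:
    """Trade a sequence shard for a head shard before attention.

    In:  [batch, seq_len // sp_size, num_heads, head_dim] per rank.
    Out: [batch, seq_len, num_heads // sp_size, head_dim] per rank.
    """
    sp_size = dist.get_world_size(sp_group)
    b, s_local, h, d = x.shape
    # Chunk heads by destination rank and make the rank axis leading.
    x = x.reshape(b, s_local, sp_size, h // sp_size, d).permute(2, 0, 1, 3, 4).contiguous()
    out = torch.empty_like(x)
    dist.all_to_all_single(out, x, group=sp_group)
    # out[j] now holds rank j's sequence shard for our head group;
    # concatenate the shards back into the full sequence.
    return out.permute(1, 0, 2, 3, 4).reshape(b, s_local * sp_size, h // sp_size, d)
```

The post-attention exchange is the same collective with the roles of the sequence and head axes swapped.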
hxwang
c8bf2681e3
[moe] clean legacy code
2024-07-19 07:32:01 +00:00
hxwang
8d3d7f3cbd
[moe] test deepseek
2024-07-19 07:32:00 +00:00
botbw
335ad3c6fb
[moe] implement tp
2024-07-19 07:30:17 +00:00
hxwang
18be903ed9
[chore] arg pass & remove drop token
2024-07-19 07:30:16 +00:00
botbw
cbcc818d5a
[chore] trivial fix
2024-07-19 07:30:15 +00:00
botbw
5bc085fc01
[chore] manually revert unintended commit
2024-07-19 07:30:14 +00:00
botbw
1b15cc97f5
[moe] add mixtral dp grad scaling when not all experts are activated
2024-07-19 07:30:14 +00:00
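A note on the gradient-scaling idea this commit names: with experts replicated across the data-parallel group, a standard all-reduce averages gradients over every DP rank even when only some of them routed tokens to a given expert, which dilutes that expert's gradient. A minimal sketch of one plausible correction, with the names and the exact rescaling rule being assumptions rather than the committed code:

```python
import torch
import torch.distributed as dist

def rescale_expert_grads(expert: torch.nn.Module, tokens_routed: int, dp_group) -> None:
    params = [p for p in expert.parameters() if p.grad is not None]
    if not params:
        return
    # 1.0 on ranks where this expert received tokens, 0.0 elsewhere.
    active = torch.tensor([1.0 if tokens_routed > 0 else 0.0], device=params[0].device)
    dist.all_reduce(active, group=dp_group)  # -> number of ranks that activated it
    dp_size = dist.get_world_size(dp_group)
    n_active = active.item()
    if 0 < n_active < dp_size:
        # The DP all-reduce averaged over dp_size ranks; re-weight so the
        # gradient is effectively averaged only over contributing ranks.
        for p in params:
            p.grad.mul_(dp_size / n_active)
```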
botbw
2f9bce6686
[moe] implement submesh initialization
2024-07-19 07:30:13 +00:00
haze188
a613edd517
solve hang when parallel mode = pp + dp
2024-07-19 07:30:13 +00:00
botbw
b303ffe9f3
[zero] solve hang
2024-07-19 07:29:36 +00:00
botbw
2431694564
[moe] implement transition between non-moe tp and ep
2024-07-19 07:29:35 +00:00
botbw
dec6e25e99
[test] pass mixtral shardformer test
2024-07-19 07:29:35 +00:00
hxwang
61109c7843
[zero] solve hang
2024-07-19 07:29:07 +00:00
hxwang
000456bf94
[chore] handle non-member group
2024-07-19 07:29:07 +00:00
hxwang
4fc6f9aa98
[test] mixtral pp shard test
2024-07-19 07:29:06 +00:00
hxwang
5a9490a46b
[moe] fix plugin
2024-07-19 07:29:06 +00:00
hxwang
6a9164a477
[test] add mixtral transformer test
2024-07-19 07:29:05 +00:00
hxwang
229db4bc16
[test] add mixtral for sequence classification
2024-07-19 07:29:05 +00:00
Hongxin Liu
e86127925a
[plugin] support all-gather overlap for hybrid parallel (#5919)
* [plugin] fixed all-gather overlap support for hybrid parallel
2024-07-18 15:33:03 +08:00
アマデウス
530283dba0
fix object_to_tensor usage when torch>=2.3.0 (#5820)
2024-07-16 13:59:25 +08:00
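Background on this fix: torch.distributed's private helper `_object_to_tensor` changed its signature in newer PyTorch releases, so callers need a version gate. A hedged sketch of such a gate; the extra `group` argument for torch>=2.3.0 is inferred from the commit title, not verified against the exact upstream signature:

```python
from packaging import version

import torch
from torch.distributed.distributed_c10d import _object_to_tensor

def object_to_tensor_compat(obj, device, group=None):
    # torch>=2.3.0 is assumed to take an extra group argument (per the
    # commit title); older releases keep the two-argument form.
    if version.parse(torch.__version__) >= version.parse("2.3.0"):
        return _object_to_tensor(obj, device, group)
    return _object_to_tensor(obj, device)
```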
Guangyao Zhang
2e28c793ce
[compatibility] support torch 2.2 (#5875)
* Support PyTorch 2.2.2
* keep build_on_pr file and update .compatibility
2024-07-16 13:59:25 +08:00
Guangyao Zhang
1c961b20f3
[ShardFormer] fix qwen2 sp (#5903)
2024-07-15 13:58:06 +08:00
Stephan Kö
45c49dde96
[Auto Parallel]: Speed up intra-op plan generation by 44% (#5446)
* Remove unnecessary calls to deepcopy
* Build DimSpec's difference dict only once
This change considerably speeds up construction of DimSpec objects: the difference_dict is identical for every DimSpec object, so a single shared copy is enough.
* Fix documentation of DimSpec's difference method
2024-07-15 12:05:06 +08:00
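The second bullet describes a straightforward sharing pattern: since the table is identical for every instance, build it once at the class level instead of copying it per object. A schematic sketch with placeholder fields, not DimSpec's real layout:

```python
class DimSpec:
    _difference_dict = None  # one shared table for every instance

    def __init__(self, shard_list):
        self.shard_list = tuple(shard_list)
        if DimSpec._difference_dict is None:
            # Built exactly once, on first construction.
            DimSpec._difference_dict = self._build_difference_dict()

    @staticmethod
    def _build_difference_dict():
        # Expensive-to-build conversion-cost table; entries are placeholders.
        return {(("R",), ("S0",)): 1, (("S0",), ("R",)): 1}

    def difference(self, other: "DimSpec") -> int:
        # Lookup in the shared table; no per-instance copy, no deepcopy.
        return DimSpec._difference_dict.get((self.shard_list, other.shard_list), 0)
```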
Hongxin Liu
c068ef0fa0
[zero] support all-gather overlap (#5898)
* [zero] support all-gather overlap
* [zero] add overlap all-gather flag
* [misc] fix typo
* [zero] update api
2024-07-11 18:59:59 +08:00
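The overlap added here follows the usual prefetch pattern: launch the all-gather for the next parameter shard asynchronously, then compute on the current one while the collective is in flight. A minimal sketch under assumed names; the real plugin drives this from its own bucketing and the flag added above:

```python
import torch
import torch.distributed as dist

def gather(shard: torch.Tensor, group):
    out = torch.empty(shard.numel() * dist.get_world_size(group),
                      dtype=shard.dtype, device=shard.device)
    handle = dist.all_gather_into_tensor(out, shard, group=group, async_op=True)
    return out, handle

def forward_with_overlap(shards, compute, group):
    gathered, handle = gather(shards[0], group)
    for i in range(len(shards)):
        handle.wait()                      # params for step i are ready
        current = gathered
        if i + 1 < len(shards):            # prefetch step i+1 ...
            gathered, handle = gather(shards[i + 1], group)
        compute(i, current)                # ... and overlap it with compute
```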
Guangyao Zhang
669849d74b
[ShardFormer] Add Ulysses Sequence Parallelism support for Command-R, Qwen2 and ChatGLM (#5897)
2024-07-10 11:34:25 +08:00
Edenzzzz
fbf33ecd01
[Feature] Enable PP + SP for llama (#5868)
* fix cross-PP-stage position id length diff bug
* fix typo
* fix typo
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* use one cross-entropy function for all shardformer models
---------
Co-authored-by: Edenzzzz <wtan45@wisc.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-07-09 18:05:20 +08:00
Runyu Lu
66abf1c6e8
[HotFix] CI, import, requirements-test for #5838 (#5892)
* [Hot Fix] CI, import, requirements-test
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-07-08 22:32:06 +08:00
Runyu Lu
cba20525a8
[Feat] Diffusion Model (PixArtAlpha/StableDiffusion3) Support (#5838)
* Diffusion Model Inference support
* Stable Diffusion 3 Support
* PixArtAlpha support
2024-07-08 16:02:07 +08:00
Edenzzzz
8ec24b6a4d
[Hotfix] Fix CUDA_DEVICE_MAX_CONNECTIONS for comm overlap
Co-authored-by: Edenzzzz <wtan45@wisc.edu>
2024-07-05 20:02:36 +08:00
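For context on this hotfix: CUDA_DEVICE_MAX_CONNECTIONS controls how many hardware work queues a device exposes, and overlap schemes commonly pin it to 1 so communication kernels are issued ahead of compute. It only takes effect if set before CUDA initialization; the exact value and condition used by the hotfix are assumptions here:

```python
import os

# Must run before the first CUDA call (e.g. before torch initializes a
# CUDA context), otherwise the setting is silently ignored.
os.environ.setdefault("CUDA_DEVICE_MAX_CONNECTIONS", "1")

import torch  # CUDA context is created only after the env var is in place
```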
Haze188
3420921101
[shardformer] DeepseekMoE support (#5871)
* [Feature] deepseek moe expert parallel implementation
* [misc] fix typo, remove redundant file (#5867)
* [misc] fix typo
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [Feature] deepseek support & unit test
* [misc] remove debug code & useless print
* [misc] fix typos (#5872)
* [Feature] remove modeling file, use auto config. (#5884)
* [misc] fix typos
* [Feature] deepseek support via auto model, remove modeling file
* [misc] delete useless file
* [misc] fix typos
* [Deepseek] remove redundant code (#5888)
* [misc] fix typos
* [Feature] deepseek support via auto model, remove modeling file
* [misc] delete useless file
* [misc] fix typos
* [misc] remove redundant code
* [Feature/deepseek] resolve comment. (#5889)
* [misc] fix typos
* [Feature] deepseek support via auto model, remove modeling file
* [misc] delete useless file
* [misc] fix typos
* [misc] remove redundant code
* [misc] mv module replacement into if branch
* [misc] add some warning message and modify some code in unit test
* [misc] fix typos
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-07-05 16:13:58 +08:00
Hongxin Liu
7afbc81d62
[quant] fix bitsandbytes version check (#5882)
* [quant] fix bitsandbytes version check
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-07-04 11:33:23 +08:00
Wang Binluo
6cd4c32be4
[shardformer] fix the moe (#5883)
2024-07-03 20:02:19 +08:00
Edenzzzz
eb24fcd914
[Hotfix] Fix OPT gradient checkpointing forward
Co-authored-by: Edenzzzz <wtan45@wisc.edu>
2024-07-03 14:57:57 +08:00
Haze188
ea94c07b95
[hotfix] fix the bug where a large tensor exceeds the maximum capacity of TensorBucket (#5879)
2024-07-02 12:42:02 +08:00
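The failure mode named in this hotfix is the classic bucketing edge case: a tensor larger than the bucket's capacity can never fit, so "flush and retry" never makes room for it and it must be flushed on its own. A sketch of the usual handling, with an interface that is assumed rather than TensorBucket's actual one:

```python
class TensorBucket:
    def __init__(self, max_size: int, flush_fn):
        self.max_size, self.flush_fn = max_size, flush_fn
        self._tensors, self._size = [], 0

    def add(self, t):
        if t.numel() > self.max_size:
            # Larger than the bucket itself: no amount of flushing makes it
            # fit, so send it through as a single-tensor flush.
            self.flush()
            self.flush_fn([t])
            return
        if self._size + t.numel() > self.max_size:
            self.flush()
        self._tensors.append(t)
        self._size += t.numel()

    def flush(self):
        if self._tensors:
            self.flush_fn(self._tensors)  # e.g. flatten + communicate (assumed)
            self._tensors, self._size = [], 0
```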
pre-commit-ci[bot]
7c2f79fa98
[pre-commit.ci] pre-commit autoupdate (#5572)
* [pre-commit.ci] pre-commit autoupdate
updates:
- [github.com/PyCQA/autoflake: v2.2.1 → v2.3.1](https://github.com/PyCQA/autoflake/compare/v2.2.1...v2.3.1)
- [github.com/pycqa/isort: 5.12.0 → 5.13.2](https://github.com/pycqa/isort/compare/5.12.0...5.13.2)
- [github.com/psf/black-pre-commit-mirror: 23.9.1 → 24.4.2](https://github.com/psf/black-pre-commit-mirror/compare/23.9.1...24.4.2)
- [github.com/pre-commit/mirrors-clang-format: v13.0.1 → v18.1.7](https://github.com/pre-commit/mirrors-clang-format/compare/v13.0.1...v18.1.7)
- [github.com/pre-commit/pre-commit-hooks: v4.3.0 → v4.6.0](https://github.com/pre-commit/pre-commit-hooks/compare/v4.3.0...v4.6.0)
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-07-01 17:16:41 +08:00
Jianghai
8ab46b4000
[Shardformer] change qwen2 modeling into gradient checkpointing style (#5874)
2024-07-01 16:45:09 +08:00
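"Gradient checkpointing style" here means routing each decoder layer through torch.utils.checkpoint during training, so activations are recomputed in backward instead of stored. A simplified sketch of the pattern, not the actual qwen2 modeling code:

```python
import torch
from torch.utils.checkpoint import checkpoint

def run_decoder_layers(layers, hidden_states, attention_mask, training: bool):
    for layer in layers:
        if training:
            # Recompute this layer's activations in backward to save memory.
            hidden_states = checkpoint(layer, hidden_states, attention_mask,
                                       use_reentrant=False)
        else:
            hidden_states = layer(hidden_states, attention_mask)
    return hidden_states
```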