ColossalAI

mirror of https://github.com/hpcaitech/ColossalAI.git synced 2025-08-15 22:53:12 +00:00

Tong Li 7bb7e80476 [feat] GRPO with distributed implementation (#6230)	2025-04-21 10:43:49 +08:00

* add reward related function
* add simple grpo
* update grpo
* polish
* modify data loader
* grpo consumer
* update loss
* update reward fn
* update example
* update loader
* add algo selection
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add save
* update select algo
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* update grpo
* update reward fn
* update reward
* fix reward score
* add response length
* detach
* fix tp bug
* fix consumer
* convert to 8 generation
* print results
* setup update
* fix transformers backend
* [Feature] Support Distributed LogProb for GRPO Training (#6247)
* [fix] fix qwen VocabParallelLMHead1D and gather output
* fix tp bug
* fix consumer
* [feat] Support Distributed LogProb for GRPO Training
* [fix] fix loss func
* [fix] fix log prob plugin
* [fix] fix qwen modeling param
* [fix] rm comments
* [fix] rm hard-code;fix non-dist version
* [fix] fix test file param name and benchmark tp gather output=True/False
* [fix] rm non-dist version in dist log prob
* [fix] fix comments
* [fix] fix dis log prob plugin
* [fix] fix test case
* [fix] fix qwen VocabParallelLMHead1D and gather output
* [fix] fix DistLogProb comments
* [fix] restore tp size
* [fix] fix comments
* [fix] fix comment; fix LogSoftmax usage
---------
Co-authored-by: Tong Li <tong.li35271158@gmail.com>
* fix vllm
* fix logprob, add filtering, temperature annealing, lr descent
* simplify vllm preprocessing input ids
* update logging
* [feat] add microbatch forwarding (#6251)
* add microbatch forwarding
* fix forward microbatch
* fix producer OOM
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* change project name
* fix temperature annealing
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* address conversation
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [Distributed RLHF] Integration of PP (#6257)
* update help information
* update style
* fix
* minor fix
* support PP training
* add pp support
* remove unused code
* address conversation
---------
Co-authored-by: Tong Li <tong.li35271158@gmail.com>
* [hot-fix] Fix memory leakage bug, support TP+PP (#6258)
* update help information
* update style
* fix
* minor fix
* support PP training
* add pp support
* remove unused code
* address conversation
* fix memory leakage support tp+pp
* move empty cache
* move empty cache
---------
Co-authored-by: Tong Li <tong.li35271158@gmail.com>
---------
Co-authored-by: Tong Li <tong.li35271158@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: YeAnbang <anbangy2@outlook.com>
Co-authored-by: duanjunwen <935724073@qq.com>
Co-authored-by: YeAnbang <44796419+YeAnbang@users.noreply.github.com>

kit	[release] update version (#6195)	2025-02-20 11:36:18 +08:00
test_analyzer	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
test_auto_parallel	[test] Fix/fix testcase (#5770)	2024-06-03 15:26:01 +08:00
test_autochunk	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
test_booster	[plugin] support get_grad_norm (#6115)	2024-11-05 18:12:47 +08:00
test_checkpoint_io	[shardformer] support pipeline for deepseek v3 and optimize lora save (#6188)	2025-02-14 14:48:54 +08:00
test_cluster	[misc] refactor launch API and tensor constructor (#5666)	2024-04-29 10:40:11 +08:00
test_config	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
test_device	[misc] refactor launch API and tensor constructor (#5666)	2024-04-29 10:40:11 +08:00
test_fp8	[fp8] Disable all_gather intranode. Disable Redundant all_gather fp8 (#6059)	2024-09-14 10:40:01 +08:00
test_fx	[hotfix] fix testcase in test_fx/test_tracer (#5779)	2024-06-05 11:29:32 +08:00
test_infer	[release] update version (#6041)	2024-09-10 10:31:09 +08:00
test_lazy	[fp8] Merge feature/fp8_comm to main branch of Colossalai (#6016)	2024-08-22 09:21:34 +08:00
test_legacy	[FP8] rebase main (#5963)	2024-08-06 16:29:37 +08:00
test_lora	[fp8] Merge feature/fp8_comm to main branch of Colossalai (#6016)	2024-08-22 09:21:34 +08:00
test_moe	[hotfix] moe hybrid parallelism benchmark & follow-up fix (#6048)	2024-09-10 17:30:53 +08:00
test_optimizer	[CI] Cleanup Dist Optim tests with shared helper funcs (#6125)	2025-02-12 13:42:34 +08:00
test_pipeline	[hotfix] fix hybrid checkpointio for sp+dp (#6184)	2025-02-06 17:21:04 +08:00
test_shardformer	[feat] GRPO with distributed implementation (#6230)	2025-04-21 10:43:49 +08:00
test_smoothquant	[inference] Add smmoothquant for llama (#4904)	2023-10-16 11:28:44 +08:00
test_tensor	[misc] refactor launch API and tensor constructor (#5666)	2024-04-29 10:40:11 +08:00
test_zero	[zero] support extra dp (#6123)	2024-11-12 11:20:46 +08:00
__init__.py
conftest.py	[checkpointio] support load-pin overlap (#6177)	2025-01-07 16:16:04 +08:00