Commit Graph

441 Commits

Author SHA1 Message Date
YeAnbang
0d0fef771f disable wandb tb syncing 2025-08-05 13:59:56 +08:00
YeAnbang
280aa0b830 use consumer global step 2025-08-05 13:59:56 +08:00
Tong Li
5a6e4a6d75 [feat] Support prompt level dynamic (#6300)
* adjust to dynamic prompt bs

* remove debug

* update pad seq (#6303)

Co-authored-by: Tong Li <tong.li35271158@gmail.com>

* adjust to dynamic prompt bs

* remove debug

* fix dp issue

* fix

* fix default settings

---------

Co-authored-by: Tong Li <tong.li35271158@gmail.com>
2025-08-05 13:59:53 +08:00
YeAnbang
3416a4fc9c move logging to producer 2025-08-05 13:59:03 +08:00
YeAnbang
af4366f0cb Support evaluation during training 2025-08-05 13:59:03 +08:00
Tong Li
4ac7d065a6 update pad seq (#6303)
Co-authored-by: Tong Li <tong.li35271158@gmail.com>
2025-08-05 13:59:03 +08:00
YeAnbang
9544c51a74 [fix] revert reward update and evaluation (#6295)
* Revert "rewrite reward fn"

This reverts commit d06042b434.

* Revert "upgrade reward math verification"

This reverts commit a6085ff676.

* Revert "fix bug"

This reverts commit 01640ebd65.

* Revert "reuse comm-group"

This reverts commit bd61918dcf.

* Revert "Support evaluation during training"

This reverts commit 57a88395fe.
2025-08-05 13:59:02 +08:00
YeAnbang
06b892bf4d rewrite reward fn 2025-08-05 13:59:02 +08:00
YeAnbang
9642b75581 upgrade reward math verification 2025-08-05 13:59:02 +08:00
YeAnbang
1be993de3e fix bug 2025-08-05 13:59:02 +08:00
YeAnbang
de0c267f5a reuse comm-group 2025-08-05 13:59:02 +08:00
YeAnbang
16600f3509 Support evaluation during training 2025-08-05 13:59:02 +08:00
Tong Li
6a1bd833e0 [feat] Sync shard model (#6289)
* [feat] support hybrid parallel model sync

* update consumer and producer

* update files

* update producer

* remove print

* update

---------

Co-authored-by: duanjunwen <935724073@qq.com>
Co-authored-by: YeAnbang <44796419+YeAnbang@users.noreply.github.com>
Co-authored-by: Tong Li <tong.li35271158@gmail.com>
2025-08-05 13:59:02 +08:00
YeAnbang
e181318d51 [feat] Support boxed math reward (#6284)
* fix pp+tp, fix dataloader

* fixed plugin micro-batch size

* support boxed reward

* add boxed reward

* fix pp state dict incomplete issue

* Revert "fix pp state dict incomplete issue"

This reverts commit 6c1b3b694f.
2025-08-05 13:59:02 +08:00
YeAnbang
fb4e507d00 fix pp+tp, fix dataloader (#6280) 2025-08-05 13:59:02 +08:00
Tong Li
37a8be7651 fix save issue (#6279)
Co-authored-by: Tong Li <tong.li35271158@gmail.com>
2025-08-05 13:59:02 +08:00
YeAnbang
673682e716 fix checkpoint naming; add num_epoch parameter (#6277) 2025-08-05 13:59:02 +08:00
YeAnbang
5f913e8b77 [feat] Support DAPO (#6263)
* update help information

* update style

* fix

* minor fix

* support PP training

* add pp support

* remove unused code

* address conversation

* fix memory leakage support tp+pp

* move empty cache

* move empty cache

* add DAPO support

* remove format reward

* fix filtering, still buggy

* small fix

* add DAPO support

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* tested multi-node training; fix bind_batch bug

* fix conversation; support sleep mode

* support reusing excessive samples

* add dynamic batching control flag

* add dynamic batching control flag

* refactored

* fix logging

---------

Co-authored-by: Tong Li <tong.li35271158@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-08-05 13:59:02 +08:00
Tong Li
b34d707cdc [feat] Add final save at the end (#6274)
* add final save

* default 1 episode
2025-08-05 13:59:02 +08:00
Tong Li
befd4f1487 add prompt template (#6273)
Co-authored-by: Tong Li <tong.li35271158@gmail.com>
2025-08-05 13:59:02 +08:00
YeAnbang
3bd6fa3c67 [hot-fix] Fix memory leakage bug, support TP+PP (#6258)
* update help information

* update style

* fix

* minor fix

* support PP training

* add pp support

* remove unused code

* address conversation

* fix memory leakage support tp+pp

* move empty cache

* move empty cache

---------

Co-authored-by: Tong Li <tong.li35271158@gmail.com>
2025-08-05 13:59:02 +08:00
YeAnbang
5d79b9e692 [Distributed RLHF] Integration of PP (#6257)
* update help information

* update style

* fix

* minor fix

* support PP training

* add pp support

* remove unused code

* address conversation

---------

Co-authored-by: Tong Li <tong.li35271158@gmail.com>
2025-08-05 13:59:02 +08:00
YeAnbang
12da4d14aa [feat] add microbatch forwarding (#6251)
* add microbatch forwarding

* fix forward microbatch

* fix producer OOM

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* change project name

* fix temperature annealing

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* address conversation

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-08-05 13:59:02 +08:00
YeAnbang
c627b60551 update logging 2025-08-05 13:59:02 +08:00
YeAnbang
23aac43dcf simplify vllm preprocessing input ids 2025-08-05 13:59:02 +08:00
YeAnbang
16e68a071d fix logprob, add filtering, temperature annealing, lr descent 2025-08-05 13:59:02 +08:00
YeAnbang
f983071b10 fix vllm 2025-08-05 13:59:02 +08:00
duanjunwen
455185345e [Feature] Support Distributed LogProb for GRPO Training (#6247)
* [fix] fix qwen VocabParallelLMHead1D and gather output

* fix tp bug

* fix consumer

* [feat] Support Distributed LogProb for GRPO Training

* [fix] fix loss func

* [fix] fix log prob plugin

* [fix] fix qwen modeling param

* [fix] rm comments

* [fix] rm hard-code;fix non-dist version

* [fix] fix test file param name and benchmark tp gather output=True/False

* [fix] rm non-dist version in dist log prob

* [fix] fix comments

* [fix] fix dis log prob plugin

* [fix] fix test case

* [fix] fix qwen VocabParallelLMHead1D and gather output

* [fix] fix DistLogProb comments

* [fix] restore tp size

* [fix] fix comments

* [fix] fix comment; fix LogSoftmax usage

---------

Co-authored-by: Tong Li <tong.li35271158@gmail.com>
2025-08-05 13:59:02 +08:00
YeAnbang
35dabd718e fix transformers backend 2025-08-05 13:59:02 +08:00
Tong Li
e224673c44 setup update 2025-08-05 13:59:02 +08:00
Tong Li
bfc45829c3 print results 2025-08-05 13:59:02 +08:00
Tong Li
30c7ddd9f1 convert to 8 generation 2025-08-05 13:59:02 +08:00
Tong Li
a2ae82a417 fix consumer 2025-08-05 13:59:02 +08:00
Tong Li
69a1a325ee detach 2025-08-05 13:59:02 +08:00
Tong Li
b951d0b224 add response length 2025-08-05 13:59:02 +08:00
Tong Li
a4862a2349 fix reward score 2025-08-05 13:59:02 +08:00
Tong Li
a537aa1c20 update reward 2025-08-05 13:59:02 +08:00
Tong Li
c8db826782 update reward fn 2025-08-05 13:59:02 +08:00
Tong Li
fe017d34c5 update grpo 2025-08-05 13:59:02 +08:00
pre-commit-ci[bot]
bc538ba049 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-08-05 13:59:02 +08:00
pre-commit-ci[bot]
f71d422690 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-08-05 13:59:01 +08:00
Tong Li
246f16d7bc update select algo 2025-08-05 13:59:01 +08:00
Tong Li
88eb6e5f04 add save 2025-08-05 13:59:01 +08:00
Tong Li
1f15dc70df add algo selection 2025-08-05 13:59:01 +08:00
Tong Li
cc4cc78169 update loader 2025-08-05 13:59:01 +08:00
Tong Li
5c75d5b07c update example 2025-08-05 13:59:01 +08:00
Tong Li
f8899dda70 update reward fn 2025-08-05 13:59:01 +08:00
Tong Li
9754a11398 update loss 2025-08-05 13:59:01 +08:00
Tong Li
5f178a7d24 grpo consumer 2025-08-05 13:59:01 +08:00
Tong Li
b7842f8a5d modify data loader 2025-08-05 13:59:01 +08:00
Tong Li
718c4b76cc polish 2025-08-05 13:59:01 +08:00
Tong Li
1f07b716bf update grpo 2025-08-05 13:59:01 +08:00
Tong Li
40d601802d add simple grpo 2025-08-05 13:59:01 +08:00
Tong Li
fa1272f9f2 add reward related function 2025-08-05 13:59:01 +08:00
Hongxin Liu
7a2d455136 [feature] fit RL style generation (#6213)
* [feature] fit rl style generation

* [doc] add docstr

* [doc] add docstr
2025-08-05 13:59:01 +08:00
Hongxin Liu
162bb42321 [chat] add distributed impl (#6210) 2025-08-05 13:59:01 +08:00
duanjunwen
44d4053fec [HotFix] update load lora model Readme; (#6240)
* [fix] update load lora model Readme;

* [fix] update lora infer readme

* [fix] remove useless comments
2025-03-07 14:14:26 +08:00
Hongxin Liu
56fe130b15 [hotfix] fix lora load (#6231)
* [hotfix] fix lora load

* [hotfix] fix hp load

* accelerate deepseek loading
2025-03-01 19:04:14 +08:00
pre-commit-ci[bot]
7595c453a5 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-02-20 10:25:19 +00:00
YeAnbang
53834b74b9 fix num_train_step update 2025-02-20 18:24:04 +08:00
YeAnbang
0171884664 fix inference rebatching bug 2025-02-20 17:28:49 +08:00
Hongxin Liu
f73ae55394 [application] add lora sft example data (#6198) 2025-02-18 20:18:18 +08:00
Tong Li
f8b9e88484 [application] Update README (#6196)
* remove unused ray

* remove unused readme

* update readme

* update readme

* update

* update

* add link

* update readme

* update readme

* fix link

* update code

* update cititaion

* update

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update readme

* update project

* add images

* update link

* update note

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-02-18 20:17:56 +08:00
Hongxin Liu
d54642a263 [application] add lora sft example (#6192)
* [application] add lora sft example

* update requirements

* update readme

* update comment

* update ci
2025-02-18 13:06:38 +08:00
YeAnbang
d20c8ffd97 Add GRPO and Support RLVR for PPO (#6186)
* add grpo, support rlvr

* add grpo, support rlvr

* tested deepseek r1 pipeline

* add ci

* verify grpo r1

* verify grpo r1

* update readme, remove unused code

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove path

* clean code

* fix circular import

* fix ci OOM

* fix ci OOM

* skip kto tp, fix qwen generation

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-02-18 09:43:36 +08:00
flybird11111
aaafb38851 [Device]Support npu (#6159)
* support npu

* support pretrain

support pretrain

fix

* support lora

fix

fix

* support chatglm

fix

fxi

fix

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

fix

fix

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

fix

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

fix

fix

fix

* Update train.py

* Update train.py

* fix

* fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

* fix

* fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-17 15:42:39 +08:00
Tong Li
30a9443132 [Coati] Refine prompt for better inference (#6117)
* refine prompt

* update prompt

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-08 11:00:37 +08:00
Tong Li
7a60161035 update readme (#6116) 2024-11-06 17:24:08 +08:00
Tong Li
89a9a600bc [MCTS] Add self-refined MCTS (#6098)
* add reasoner

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update code

* delete llama

* update prompts

* update readme

* update readme

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-24 17:51:19 +08:00
Tong Li
4c8e85ee0d [Coati] Train DPO using PP (#6054)
* update dpo

* remove unsupport plugin

* update msg

* update dpo

* remove unsupport plugin

* update msg

* update template

* update dataset

* add pp for dpo

* update dpo

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add dpo fn

* update dpo

* update dpo

* update dpo

* update dpo

* minor update

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update loss

* update help

* polish code

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-11 19:32:00 +08:00
Camille Zhong
f9546ba0be [ColossalEval] support for vllm (#6056)
* support vllm

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* modify vllm and update readme

* run pre-commit

* remove dupilicated lines and refine code

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update param name

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refine code

* update readme

* refine code

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-09-18 17:09:45 +08:00
Tong Li
c650a906db [Hotfix] Remove deprecated install (#6042)
* remove deprecated install

* remove unused folder
2024-09-03 10:33:18 +08:00
Tong Li
0d3a85d04f add fused norm (#6038) 2024-08-28 17:12:51 +08:00
Tong Li
4a68efb7da [Colossal-LLaMA] Refactor latest APIs (#6030)
* refactor latest code

* update api

* add dummy dataset

* update Readme

* add setup

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update files

* add PP support

* update arguments

* update argument

* reorg folder

* update version

* remove IB infor

* update utils

* update readme

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update save for zero

* update save

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add apex

* update

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-08-28 17:01:58 +08:00
Tong Li
39e2597426 [ColossalChat] Add PP support (#6001)
* support pp training

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update rm

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactor

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update test case

* fix

* change to 4

* fix eval

* test

* add pp

* hotfix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* support pp training

* update rm

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactor

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update test case

* fix

* change to 4

* fix eval

* test

* add pp

* hotfix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

* skip pp eval

* update all reduce

* update sft

* update ignore

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update no cache

* add eval

* remove fi

* remove debug

* remove parentheses to avoid warning

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "add eval"

This reverts commit 3ab2f6fa32.

* add all reduce

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-08-21 10:47:39 +08:00
YeAnbang
ed97d3a5d3 [Chat] fix readme (#5989)
* fix readme

* fix readme, tokenization fully tested

* fix readme, tokenization fully tested

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: root <root@notebook-8f919155-6035-47b4-9c6f-1be133b9e2c9-0.notebook-8f919155-6035-47b4-9c6f-1be133b9e2c9.colossal-ai.svc.cluster.local>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-08-12 14:55:17 +08:00
Tong Li
ad3fa4f49c [Hotfix] README link (#5966)
* update ignore

* update readme

* run style

* update readme
2024-08-08 18:04:47 +08:00
YeAnbang
0b2d55c4ab Support overall loss, update KTO logging 2024-08-02 06:51:38 +00:00
Tong Li
19d1510ea2 [feat] Dist Loader for Eval (#5950)
* support auto distributed data loader

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* support auto distributed data loader

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tp error

* remove unused parameters

* remove unused

* update inference

* update docs

* update inference

---------

Co-authored-by: Michelle <qianranma8@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-08-02 10:06:25 +08:00
Tong Li
1aeb5e8847 [hotfix] Remove unused plan section (#5957)
* remove readme

* fix readme

* update
2024-07-31 17:47:46 +08:00
YeAnbang
66fbf2ecb7 Update README.md (#5958) 2024-07-31 17:44:09 +08:00
YeAnbang
30f4e31a33 [Chat] Fix lora (#5946)
* fix merging

* remove filepath

* fix style
2024-07-31 14:10:17 +08:00
YeAnbang
c8332b9cb5 Merge pull request #5922 from hpcaitech/kto
[Chat] Add KTO
2024-07-29 13:27:00 +08:00
YeAnbang
6fd9e86864 fix style 2024-07-29 01:29:18 +00:00
YeAnbang
de1bf08ed0 fix style 2024-07-26 10:07:15 +00:00
YeAnbang
8a3ff4f315 fix style 2024-07-26 09:55:15 +00:00
zhurunhua
ad35a987d3 [Feature] Add a switch to control whether the model checkpoint needs to be saved after each epoch ends (#5941)
* Add a switch to control whether the model checkpoint needs to be saved after each epoch ends

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-07-26 11:15:20 +08:00
YeAnbang
9688e19b32 remove real data path 2024-07-22 06:13:02 +00:00
YeAnbang
b0e15d563e remove real data path 2024-07-22 06:11:38 +00:00
YeAnbang
12fe8b5858 refactor evaluation 2024-07-22 05:57:39 +00:00
YeAnbang
c5f582f666 fix test data 2024-07-22 01:31:32 +00:00
zhurunhua
4ec17a7cdf [FIX BUG] UnboundLocalError: cannot access local variable 'default_conversation' where it is not associated with a value (#5931)
* cannot access local variable 'default_conversation' where it is not associated with a value

set default value for 'default_conversation'

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-07-21 19:46:01 +08:00
YeAnbang
150505cbb8 Merge branch 'kto' of https://github.com/hpcaitech/ColossalAI into kto 2024-07-19 10:11:05 +00:00
YeAnbang
d49550fb49 refactor tokenization 2024-07-19 10:10:48 +00:00
Tong Li
d08c99be0d Merge branch 'main' into kto 2024-07-19 15:23:31 +08:00
Tong Li
f585d4e38e [ColossalChat] Hotfix for ColossalChat (#5910)
* add ignore and tiny llama

* fix path issue

* run style

* fix issue

* update bash

* add ignore and tiny llama

* fix path issue

* run style

* fix issue

* update bash

* fix ddp issue

* add Qwen 1.5 32B
2024-07-19 13:40:07 +08:00
YeAnbang
544b7a38a1 fix style, add kto data sample 2024-07-18 08:38:56 +00:00
YeAnbang
09d5ffca1a add kto 2024-07-18 07:54:11 +00:00
YeAnbang
b3594d4d68 fix orpo cross entropy loss 2024-07-15 02:12:05 +00:00
YeAnbang
115c4cc5a4 hotfix citation 2024-07-11 06:05:05 +00:00