ColossalAI/applications/ColossalChat/coati/trainer
2025-08-14 11:05:42 +00:00
Name         Last commit                                           Date
callbacks    [ColossalChat] Update RLHF V2 (#5286)                 2024-03-29 14:12:29 +08:00
__init__.py  Add GRPO and Support RLVR for PPO (#6186)             2025-02-18 09:43:36 +08:00
base.py      Add GRPO and Support RLVR for PPO (#6186)             2025-02-18 09:43:36 +08:00
dpo.py       [pre-commit.ci] auto fixes from pre-commit.com hooks  2025-02-20 10:25:19 +00:00
grpo.py      fix num_train_step update                             2025-02-20 18:24:04 +08:00
kto.py       [pre-commit.ci] auto fixes from pre-commit.com hooks  2025-08-14 11:05:42 +00:00
orpo.py      fix num_train_step update                             2025-02-20 18:24:04 +08:00
ppo.py       fix num_train_step update                             2025-02-20 18:24:04 +08:00
rm.py        fix num_train_step update                             2025-02-20 18:24:04 +08:00
sft.py       fix num_train_step update                             2025-02-20 18:24:04 +08:00
utils.py     [feat] Support prompt level dynamic (#6300)           2025-08-05 13:59:53 +08:00