ColossalAI/applications/ColossalChat/coati/trainer
Tong Li aca547623f
[feat] Support prompt level dynamic (#6300)
* adjust to dynamic prompt bs

* remove debug

* update pad seq (#6303)

Co-authored-by: Tong Li <tong.li35271158@gmail.com>

* adjust to dynamic prompt bs

* remove debug

* fix dp issue

* fix

* fix default settings

---------

Co-authored-by: Tong Li <tong.li35271158@gmail.com>
2025-05-14 16:40:35 +08:00
callbacks [ColossalChat] Update RLHF V2 (#5286) 2024-03-29 14:12:29 +08:00
__init__.py Add GRPO and Support RLVR for PPO (#6186) 2025-02-18 09:43:36 +08:00
base.py Add GRPO and Support RLVR for PPO (#6186) 2025-02-18 09:43:36 +08:00
dpo.py Add GRPO and Support RLVR for PPO (#6186) 2025-02-18 09:43:36 +08:00
grpo.py Add GRPO and Support RLVR for PPO (#6186) 2025-02-18 09:43:36 +08:00
kto.py Add GRPO and Support RLVR for PPO (#6186) 2025-02-18 09:43:36 +08:00
orpo.py Add GRPO and Support RLVR for PPO (#6186) 2025-02-18 09:43:36 +08:00
ppo.py Add GRPO and Support RLVR for PPO (#6186) 2025-02-18 09:43:36 +08:00
rm.py Add GRPO and Support RLVR for PPO (#6186) 2025-02-18 09:43:36 +08:00
sft.py Add GRPO and Support RLVR for PPO (#6186) 2025-02-18 09:43:36 +08:00
utils.py [feat] Support prompt level dynamic (#6300) 2025-05-14 16:40:35 +08:00