dataset | fix schedualing for multi-node training | 2025-05-02 19:45:07 +08:00 |
distributed | spot a possible bug | 2025-05-05 18:48:42 +08:00 |
models | Add GRPO and Support RLVR for PPO (#6186) | 2025-02-18 09:43:36 +08:00 |
quant | [ColossalChat] Update RLHF V2 (#5286) | 2024-03-29 14:12:29 +08:00 |
ray | [ColossalChat] Update RLHF V2 (#5286) | 2024-03-29 14:12:29 +08:00 |
trainer | [feat] Support DAPO (#6263) | 2025-04-25 17:39:17 +08:00 |
utils | Add GRPO and Support RLVR for PPO (#6186) | 2025-02-18 09:43:36 +08:00 |
__init__.py | [ColossalChat] Update RLHF V2 (#5286) | 2024-03-29 14:12:29 +08:00 |