Commit Graph

8 Commits

Author SHA1 Message Date
YeAnbang
16e68a071d fix logprob, add filtering, temperature annealing, lr descent 2025-08-05 13:59:02 +08:00
YeAnbang
f983071b10 fix vllm 2025-08-05 13:59:02 +08:00
YeAnbang
35dabd718e fix transformers backend 2025-08-05 13:59:02 +08:00
Tong Li
30c7ddd9f1 convert to 8 generation 2025-08-05 13:59:02 +08:00
Tong Li
718c4b76cc polish 2025-08-05 13:59:01 +08:00
Tong Li
40d601802d add simple grpo 2025-08-05 13:59:01 +08:00
Hongxin Liu
7a2d455136 [feature] fit RL style generation (#6213) 2025-08-05 13:59:01 +08:00
* [feature] fit rl style generation
* [doc] add docstr
* [doc] add docstr
Hongxin Liu
162bb42321 [chat] add distributed impl (#6210) 2025-08-05 13:59:01 +08:00