Default Branch

083766d54c · Add new implementations of RL algorithms (#6383) · Updated 2025-09-03 05:48:06 +00:00

Branches

7f814e71f3 · support agentic with asyncllm · Updated 2025-09-03 07:12:46 +00:00
14 behind · 2 ahead

84723e8bed · [feat][merge] Support one-behind to reduce bubble time. Add profiling code. (#6355) · Updated 2025-09-02 09:05:15 +00:00
161 behind · 79 ahead

e694ff45e2 · [pre-commit.ci] auto fixes from pre-commit.com hooks · Updated 2025-09-01 17:28:50 +00:00
1 behind · 2 ahead

fe1f429574 · Merge branch 'grpo-latest-rebase-main' of https://github.com/hpcaitech/ColossalAI into grpo-latest-rebase-main · Updated 2025-08-15 02:16:49 +00:00
5 behind · 0 ahead · Included

f067e778e9 · merge grpo-latest' · Updated 2025-08-04 03:38:14 +00:00
161 behind · 121 ahead

cd32236e53 · [Fix] Add L2 Regularization (#6372) · Updated 2025-07-29 08:56:52 +00:00
161 behind · 115 ahead

6019434ac9 · Merge pull request #6370 from ChosenQC/feature/pdf-rag · Updated 2025-07-23 06:26:08 +00:00
116 behind · 4 ahead

9f8c97d028 · add entropy · Updated 2025-07-16 08:44:23 +00:00
161 behind · 113 ahead

973dea21c7 · remove assert · Updated 2025-06-27 06:16:23 +00:00
161 behind · 114 ahead

c7d3d0dc8f · remove unused parameter · Updated 2025-06-19 07:14:16 +00:00
161 behind · 109 ahead

2db255bf15 · add profiling, implement memory efficient logprob alculation · Updated 2025-06-18 10:08:22 +00:00
161 behind · 98 ahead

2f02a28777 · Update README.md · Updated 2025-06-12 03:21:31 +00:00
161 behind · 99 ahead

9ca920c1af · [pre-commit.ci] auto fixes from pre-commit.com hooks · Updated 2025-06-09 01:48:20 +00:00
161 behind · 92 ahead

96faf54542 · fix typ and parameter description · Updated 2025-06-05 07:41:14 +00:00
161 behind · 87 ahead

e00c9bbf38 · upgrade python · Updated 2025-06-03 10:51:39 +00:00
121 behind · 0 ahead · Included

5890c8ecdd · Merge pull request #6335 from wangbluo/lazy_deepseek · Updated 2025-06-02 03:30:11 +00:00
151 behind · 114 ahead

f8bd2db33f · add uuid to rollout log · Updated 2025-05-20 01:45:56 +00:00
161 behind · 68 ahead

18f2247a10 · update consumer · Updated 2025-05-14 10:19:47 +00:00
161 behind · 58 ahead

367ae3f233 · Revert "Support evaluation during training" · Updated 2025-05-07 02:52:08 +00:00
161 behind · 58 ahead

16169d1f22 · Revert "[feat] Update reward verification" · Updated 2025-05-06 04:59:30 +00:00
161 behind · 54 ahead