* [fix] support npu
* [feat] multinode 14B
* [feat] enlarge seqlen
* [fix]
* [fix] ready to update
* [fix] ready to merge grpo-latest
* [fix] rm comments
* [feat] support msprof-analyze, add analysis result
* [feat] support ColossalaiRL on Ascend
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [feat] rm comments in qwen modeling
* [Doc] Drafted README.md
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [feat] fix ascend readme format
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [fix] fix readme
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [fix] fix readme
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [fix] fix README, rm irrelevant testcase
* [fix] fix some adapt modification
* [fix] rm comments in modeling qwen
* [fix] rm comm, test and debug print
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: YeAnbang <44796419+YeAnbang@users.noreply.github.com>
* update help information
* update style
* fix
* minor fix
* support PP training
* add pp support
* remove unused code
* address conversation
* fix memory leak; support tp+pp
* move empty cache
* add DAPO support
* remove format reward
* fix filtering, still buggy
* small fix
* add DAPO support
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* tested multi-node training; fix bind_batch bug
* fix conversation; support sleep mode
* support reusing excessive samples
* add dynamic batching control flag
* refactored
* fix logging
---------
Co-authored-by: Tong Li <tong.li35271158@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* add microbatch forwarding
* fix forward microbatch
* fix producer OOM
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* change project name
* fix temperature annealing
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* address conversation
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>