Merge branch 'grpo-latest-ascend' into merge

This commit is contained in:
flybird11111 2025-06-23 14:39:05 +08:00 committed by GitHub
commit 9d7544afc5


@@ -129,6 +129,7 @@ pip install ray
```
Install Other Dependencies
```bash
export ATB_LLM_HCCL_ENABLE=1
export ATB_LLM_COMM_BACKEND="hccl"
@@ -284,6 +285,8 @@ In addition to the two default training settings we provided--- original `GRPO`
train_pipeline_parallelism_size /
train_tensor_parallelism_size)
```
---
## 🧪 Example: single machine 8-GPU Zero2 Strategy
@@ -334,6 +337,54 @@ plugin_config={
}, # for pp, tp
```
```bash
# Hint 1: replace /path/to/Qwen2.5-Math-7B/ with your model path
#         and /path/to/train_data.jsonl with your dataset path
python rl_example.py \
  -m /path/to/Qwen2.5-Math-7B/ \
  -d /path/to/train_data.jsonl \
  --master_address '10.0.0.3' \
  -t 16 \
  -i 16 \
  -p GRPO-Train-Align-Debug \
  -g 2 \
  -ibs 1 \
  -tbs 2 \
  -tMbs 1 \
  -tmbs 2 \
  -imbs 1 \
  -s "Please reason step by step, and put your final answer within \\boxed{}."
```
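With a command this long, a subtle pitfall is repeating a flag: if the script parses its arguments with `argparse` (an assumption about `rl_example.py`'s internals), a repeated flag is not an error — the last occurrence silently wins. A minimal sketch with illustrative flag names mirroring the command above:

```python
import argparse

# Illustrative parser, not the repo's actual one; it only defines two of
# the flags from the command above to demonstrate argparse's behavior.
parser = argparse.ArgumentParser()
parser.add_argument("-p", "--project")
parser.add_argument("-tMbs", "--train-minibatch-size", type=int)

# If a flag appears twice, argparse keeps only the last occurrence.
args = parser.parse_args(["-p", "first", "-tMbs", "1", "-tMbs", "8", "-p", "last"])
print(args.project)               # last
print(args.train_minibatch_size)  # 8
```

This is why a duplicated `-p` or `-tMbs` in a long launch command can change the run's configuration without any warning.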
## 🧪 Example: multi-machine TP+PP Strategy
### Create a ray cluster on multiple machines
For example, suppose we have 4 nodes whose IPs are 10.0.0.3, 10.0.0.4, 10.0.0.5, and 10.0.0.6.
We use 10.0.0.3 as the master node. First, start a ray cluster on 10.0.0.3:
```bash
ray start --head --node-ip-address=10.0.0.3
```
Then, on each slave node (10.0.0.4/10.0.0.5/10.0.0.6), join the ray cluster with:
```bash
ray start --address='10.0.0.3:6379'
```
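For more than a handful of nodes, the two `ray start` invocations above can be generated from a node list; a minimal sketch (the `make_ray_commands` helper is hypothetical, not part of the repo; 6379 is the head port used above):

```python
# Hypothetical helper: build the head and worker `ray start` command
# strings for a cluster, mirroring the invocations shown above.
def make_ray_commands(master_ip: str, worker_ips: list[str], port: int = 6379):
    head = f"ray start --head --node-ip-address={master_ip}"
    workers = [f"ray start --address='{master_ip}:{port}'" for _ in worker_ips]
    return head, workers

head, workers = make_ray_commands("10.0.0.3", ["10.0.0.4", "10.0.0.5", "10.0.0.6"])
print(head)        # ray start --head --node-ip-address=10.0.0.3
print(workers[0])  # ray start --address='10.0.0.3:6379'
```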
Modify `plugin_config` in `./applications/ColossalChat/rl_example.py`:
```python
plugin_config={
"tp_size": 4,
"pp_size": 2,
"microbatch_size": max(
1, args.train_microbatch_size // 2
), # microbatch size should be set to train_microbatch_size // pp_size
"zero_stage": 1,
"max_norm": 1.0,
}, # for pp, tp
```
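The `microbatch_size` expression in the config above can be checked in isolation; a minimal sketch (the `microbatch_size` function name is illustrative):

```python
def microbatch_size(train_microbatch_size: int, pp_size: int) -> int:
    # Microbatch size is train_microbatch_size // pp_size, clamped to at least 1,
    # matching max(1, args.train_microbatch_size // 2) for pp_size = 2 above.
    return max(1, train_microbatch_size // pp_size)

print(microbatch_size(2, 2))  # 1
print(microbatch_size(8, 2))  # 4
print(microbatch_size(1, 2))  # 1 (floor division gives 0; clamped to 1)
```

The clamp matters for small `train_microbatch_size`: without it, a value below `pp_size` would yield a microbatch size of 0.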
```bash
# Hint 1: replace /models/Qwen/Qwen2.5-7B with your model path
# replace /datasets/train-alignment.jsonl with your dataset path