Merge branch 'grpo-latest-ascend' into merge

This commit is contained in:
flybird11111 2025-06-23 14:39:05 +08:00 committed by GitHub
commit 9d7544afc5


@@ -129,6 +129,7 @@ pip install ray
```
Install Other Dependencies
```bash
export ATB_LLM_HCCL_ENABLE=1
export ATB_LLM_COMM_BACKEND="hccl"
@@ -284,6 +285,8 @@ In addition to the two default training settings we provided--- original `GRPO`
train_pipeline_parallelism_size /
train_tensor_parallelism_size)
```
---
## 🧪 Example: single machine 8-GPU Zero2 Strategy
@@ -334,6 +337,54 @@ plugin_config={
}, # for pp, tp
```
```bash
# Hint 1: replace /path/to/Qwen2.5-Math-7B/ with your model path
#         and /path/to/train_data.jsonl with your dataset path
python rl_example.py \
  -m /path/to/Qwen2.5-Math-7B/ \
  -d /path/to/train_data.jsonl \
  --master_address '10.0.0.3' \
  -t 16 \
  -i 16 \
  -p GRPO-Train-Align-Debug \
  -g 2 \
  -ibs 1 \
  -tbs 2 \
  -tMbs 1 \
  -tmbs 2 \
  -imbs 1 \
  -s "Please reason step by step, and put your final answer within \\boxed{}."
```
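With a command this long, a subtle pitfall is repeating a flag: if the script parses its arguments with `argparse` (an assumption about `rl_example.py`'s internals), a repeated flag is not an error — the last occurrence silently wins. A minimal sketch with illustrative flag names mirroring the command above:

```python
import argparse

# Illustrative parser, not the repo's actual one; it only defines two of
# the flags from the command above to demonstrate argparse's behavior.
parser = argparse.ArgumentParser()
parser.add_argument("-p", "--project")
parser.add_argument("-tMbs", "--train-minibatch-size", type=int)

# If a flag appears twice, argparse keeps only the last occurrence.
args = parser.parse_args(["-p", "first", "-tMbs", "1", "-tMbs", "8", "-p", "last"])
print(args.project)               # last
print(args.train_minibatch_size)  # 8
```

This is why a duplicated `-p` or `-tMbs` in a long launch command can change the run's configuration without any warning.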
## 🧪 Example: multi-machine TP+PP Strategy
### Create a ray cluster on multiple machines
For example, suppose we have 4 nodes whose IPs are 10.0.0.3, 10.0.0.4, 10.0.0.5, and 10.0.0.6.
We use 10.0.0.3 as the master node. First, start a ray cluster on 10.0.0.3:
```bash
ray start --head --node-ip-address=10.0.0.3
```
Then, on each slave node (10.0.0.4/10.0.0.5/10.0.0.6), join the ray cluster with:
```bash
ray start --address='10.0.0.3:6379'
```
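For more than a handful of nodes, the two `ray start` invocations above can be generated from a node list; a minimal sketch (the `make_ray_commands` helper is hypothetical, not part of the repo; 6379 is the head port used above):

```python
# Hypothetical helper: build the head and worker `ray start` command
# strings for a cluster, mirroring the invocations shown above.
def make_ray_commands(master_ip: str, worker_ips: list[str], port: int = 6379):
    head = f"ray start --head --node-ip-address={master_ip}"
    workers = [f"ray start --address='{master_ip}:{port}'" for _ in worker_ips]
    return head, workers

head, workers = make_ray_commands("10.0.0.3", ["10.0.0.4", "10.0.0.5", "10.0.0.6"])
print(head)        # ray start --head --node-ip-address=10.0.0.3
print(workers[0])  # ray start --address='10.0.0.3:6379'
```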
Modify `plugin_config` in `./applications/ColossalChat/rl_example.py`:
```python
plugin_config={
"tp_size": 4,
"pp_size": 2,
"microbatch_size": max(
1, args.train_microbatch_size // 2
), # microbatch size should be set to train_microbatch_size // pp_size
"zero_stage": 1,
"max_norm": 1.0,
}, # for pp, tp
```
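The `microbatch_size` expression in the config above can be checked in isolation; a minimal sketch (the `microbatch_size` function name is illustrative):

```python
def microbatch_size(train_microbatch_size: int, pp_size: int) -> int:
    # Microbatch size is train_microbatch_size // pp_size, clamped to at least 1,
    # matching max(1, args.train_microbatch_size // 2) for pp_size = 2 above.
    return max(1, train_microbatch_size // pp_size)

print(microbatch_size(2, 2))  # 1
print(microbatch_size(8, 2))  # 4
print(microbatch_size(1, 2))  # 1 (floor division gives 0; clamped to 1)
```

The clamp matters for small `train_microbatch_size`: without it, a value below `pp_size` would yield a microbatch size of 0.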
```bash
# Hint 1: replace /models/Qwen/Qwen2.5-7B with your model path
# replace /datasets/train-alignment.jsonl with your dataset path