Merge branch 'grpo-latest-ascend' into merge
Commit: 9d7544afc5
@@ -129,6 +129,7 @@ pip install ray
```

Install Other Dependencies

```bash
export ATB_LLM_HCCL_ENABLE=1
export ATB_LLM_COMM_BACKEND="hccl"
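# Note: the two exports above presumably select HCCL as the collective-communication
# backend on Ascend NPUs; keep them set in the shell that launches training.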
@@ -284,6 +285,8 @@ In addition to the two default training settings we provided--- original `GRPO`
train_pipeline_parallelism_size /
train_tensor_parallelism_size)
```
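
The (truncated) expression above divides by both the pipeline- and tensor-parallel sizes. As a rough sketch of that relationship only, assuming it yields the trainer data-parallel degree from the number of training GPUs (the helper `train_dp_size` below is hypothetical, not part of `rl_example.py`):

```python
# Sketch under the stated assumption:
# dp_size = num_train_gpus / (train_pipeline_parallelism_size * train_tensor_parallelism_size)
def train_dp_size(num_train_gpus: int, pp_size: int, tp_size: int) -> int:
    assert num_train_gpus % (pp_size * tp_size) == 0, "GPU count must be divisible by pp * tp"
    return num_train_gpus // (pp_size * tp_size)

print(train_dp_size(16, 2, 4))  # 16 training GPUs, pp=2, tp=4 -> dp=2
```
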
---
## 🧪 Example: single machine 8-GPU Zero2 Strategy
@@ -334,6 +337,54 @@ plugin_config={
}, # for pp, tp
```

```bash
# Hint 1: replace /models/Qwen/Qwen2.5-7B with your model path
#         and /datasets/train-alignment.jsonl with your dataset path
python rl_example.py \
    -m /path/to/Qwen2.5-Math-7B/ \
    -d /path/to/train_data.jsonl \
    --master_address '10.0.0.3' \
    -t 16 \
    -i 16 \
    -p GRPO-Train-Align-Debug \
    -g 2 \
    -ibs 1 \
    -tbs 2 \
    -tMbs 1 \
    -tmbs 2 \
    -imbs 1 \
    -s "Please reason step by step, and put your final answer within \\boxed{}."
```
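
Assuming `rl_example.py` parses these short flags with argparse, running `python rl_example.py --help` prints the full option names and their defaults, which is handy for mapping flags such as `-tMbs` and `-tmbs` to their long forms.
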
## 🧪 Example: multi-machine TP+PP Strategy

### Create a Ray cluster on multiple machines
For example, suppose we have 4 nodes whose IPs are 10.0.0.3, 10.0.0.4, 10.0.0.5, and 10.0.0.6.
We use 10.0.0.3 as the master node. First, start a Ray cluster on 10.0.0.3:
```bash
ray start --head --node-ip-address=10.0.0.3
```

Then, on each worker node (10.0.0.4 / 10.0.0.5 / 10.0.0.6), join the Ray cluster with the following command:
```bash
ray start --address='10.0.0.3:6379'
```
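
Optionally, you can confirm that all four nodes have joined before launching training. A minimal sketch using Ray's Python API, run from any node already connected to the cluster (`ray.init(address="auto")` attaches to the running cluster rather than starting a new one):

```python
import ray

# Attach to the cluster started above instead of creating a new local one.
ray.init(address="auto")

# ray.nodes() lists every node known to the cluster; keep only the live ones.
alive = [node for node in ray.nodes() if node.get("Alive")]
print(f"{len(alive)} live node(s):", [node["NodeManagerAddress"] for node in alive])
```
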

Modify `plugin_config` in `./applications/ColossalChat/rl_example.py`:
```python
plugin_config={
    "tp_size": 4,
    "pp_size": 2,
    "microbatch_size": max(
        1, args.train_microbatch_size // 2
    ),  # microbatch size should be set to train_microbatch_size // pp_size
    "zero_stage": 1,
    "max_norm": 1.0,
},  # for pp, tp
```
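
To make the comment's arithmetic concrete, here is a tiny sketch of the same computation; `train_microbatch_size = 2` is only an illustrative value (it mirrors the `-tmbs 2` flag in the earlier example):

```python
# Per-stage microbatch size as computed in plugin_config above:
# floor-divide the train microbatch size by pp_size, but never go below 1.
pp_size = 2
train_microbatch_size = 2  # illustrative value
microbatch_size = max(1, train_microbatch_size // pp_size)
print(microbatch_size)  # -> 1

# Under this config each model replica spans tp_size * pp_size = 4 * 2 = 8 GPUs.
```
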
```bash
# Hint 1: replace /models/Qwen/Qwen2.5-7B with your model path
#         and /datasets/train-alignment.jsonl with your dataset path