Mirror of https://github.com/hpcaitech/ColossalAI.git (synced 2025-07-23 03:33:05 +00:00)
Merge branch 'grpo-latest-ascend' into merge
commit 9d7544afc5
@ -129,6 +129,7 @@ pip install ray
```

Install Other Dependencies

```bash
export ATB_LLM_HCCL_ENABLE=1
export ATB_LLM_COMM_BACKEND="hccl"
@ -284,6 +285,8 @@ In addition to the two default training settings we provided--- original `GRPO`
train_pipeline_parallelism_size /
train_tensor_parallelism_size)
```

---

## 🧪 Example: single machine 8-GPU Zero2 Strategy
@ -334,6 +337,54 @@ plugin_config={
}, # for pp, tp
```

```bash
# Hint1: replace /models/Qwen/Qwen2.5-7B with your model path
#        replace /datasets/train-alignment.jsonl with your dataset path
python rl_example.py \
  -m /path/to/Qwen2.5-Math-7B/ \
  -d /path/to/train_data.jsonl \
  --master_address '10.0.0.3' \
  -t 16 \
  -i 16 \
  -p GRPO-Train-Align-Debug \
  -g 2 \
  -ibs 1 \
  -tbs 2 \
  -tMbs 1 \
  -tmbs 2 \
  -imbs 1 \
  -s "Please reason step by step, and put your final answer within \\boxed{}." \
  -tMbs 8 \
  -p GRPO-Train-Align-Debug
```
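
The command above passes `-tMbs` and `-p` more than once. Assuming `rl_example.py` parses its options with Python's `argparse` using the default `store` action (an assumption worth checking against the script), the last occurrence of a repeated option is the one that takes effect. The sketch below illustrates this with a stand-in parser; the `dest` names are hypothetical and only the short flags come from the command.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Stand-in parser; dest names are illustrative, not taken from rl_example.py."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", dest="project", type=str)
    parser.add_argument("-tMbs", dest="train_minibatch_size", type=int)
    parser.add_argument("-tmbs", dest="train_microbatch_size", type=int)
    return parser


# Same repetition pattern as the command: -tMbs appears twice with different values.
argv = ["-tMbs", "1", "-tmbs", "2", "-tMbs", "8", "-p", "GRPO-Train-Align-Debug"]
args = build_parser().parse_args(argv)

# With the default "store" action, the later value overwrites the earlier one.
print(args.train_minibatch_size)
```

If the script behaves this way, `-tMbs 1` is silently ignored, so it may be clearer to keep only the intended value.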

## 🧪 Example: multi-machine TP+PP Strategy

### Create a Ray cluster across multiple machines
For example, suppose we have 4 nodes with IPs 10.0.0.3, 10.0.0.4, 10.0.0.5, and 10.0.0.6.
We use 10.0.0.3 as the master node. First, start a Ray cluster on 10.0.0.3:
```bash
ray start --head --node-ip-address=10.0.0.3
```

Then, on each worker node (10.0.0.4/10.0.0.5/10.0.0.6), join the Ray cluster with the following command:
```bash
ray start --address='10.0.0.3:6379'
```
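
With more than a handful of nodes, the per-node commands can be generated instead of typed by hand. The following is a minimal sketch that treats the first IP as the head node and assumes port 6379, as in the command above; `ray_start_commands` is a hypothetical helper, not part of Ray or ColossalAI.

```python
from typing import Dict, List


def ray_start_commands(node_ips: List[str], port: int = 6379) -> Dict[str, str]:
    """Map each node IP to the `ray start` command to run on that node.

    The first IP is used as the head node; all others join it.
    """
    if not node_ips:
        raise ValueError("need at least one node IP")
    head, workers = node_ips[0], node_ips[1:]
    commands = {head: f"ray start --head --node-ip-address={head}"}
    for ip in workers:
        commands[ip] = f"ray start --address='{head}:{port}'"
    return commands


commands = ray_start_commands(["10.0.0.3", "10.0.0.4", "10.0.0.5", "10.0.0.6"])
for ip, cmd in commands.items():
    print(f"{ip}: {cmd}")
```

Each printed command still has to be run on its own node, for example over SSH.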

Modify `plugin_config` in ./applications/ColossalChat/rl_example.py:
```python
plugin_config={
    "tp_size": 4,
    "pp_size": 2,
    "microbatch_size": max(
        1, args.train_microbatch_size // 2
    ),  # microbatch size should be set to train_microbatch_size // pp_size
    "zero_stage": 1,
    "max_norm": 1.0,
},  # for pp, tp
```
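
The comment in the config says the pipeline microbatch size should be `train_microbatch_size // pp_size`, floored at 1. The sketch below works through that arithmetic for the values used in this example (`-t 16`, `-tmbs 2`, `tp_size` 4, `pp_size` 2). It also derives a data-parallel size under the usual hybrid-parallel assumption that the number of trainer processes equals dp × tp × pp; that grouping is an assumption here, not something this diff confirms, and `derive_parallel_settings` is a hypothetical helper.

```python
from typing import Dict


def derive_parallel_settings(
    num_trainers: int, tp_size: int, pp_size: int, train_microbatch_size: int
) -> Dict[str, int]:
    """Derive the data-parallel size and pipeline microbatch size.

    Assumes num_trainers == dp_size * tp_size * pp_size (illustrative assumption).
    """
    model_parallel = tp_size * pp_size
    if num_trainers % model_parallel != 0:
        raise ValueError(
            f"num_trainers={num_trainers} is not divisible by tp*pp={model_parallel}"
        )
    return {
        "dp_size": num_trainers // model_parallel,
        # Same rule as the config comment: train_microbatch_size // pp_size, at least 1.
        "microbatch_size": max(1, train_microbatch_size // pp_size),
    }


settings = derive_parallel_settings(
    num_trainers=16, tp_size=4, pp_size=2, train_microbatch_size=2
)
print(settings)
```

Computing the value from `pp_size` rather than hard-coding `// 2` keeps the microbatch size consistent if the pipeline depth changes.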

```bash
# Hint1: replace /models/Qwen/Qwen2.5-7B with your model path
#        replace /datasets/train-alignment.jsonl with your dataset path