[example] reorganize for community examples (#3557)

examples/community/roberta/pretraining/README.md (new file, 23 lines)

# Pretraining

1. Pretrain RoBERTa by running the script below. Detailed parameter descriptions can be found in `arguments.py`. `data_path_prefix` is the absolute path to the output of the preprocessing step. **You have to modify the *hostfile* according to your cluster.** A filled-in example invocation is sketched after the flag list below.

```bash
bash run_pretrain.sh
```

* `--hostfile`: the hostfile listing the servers' host names, which should match `/etc/hosts`
* `--include`: the subset of servers to use for this run
* `--nproc_per_node`: number of processes (one per GPU) to launch on each server
* `--data_path_prefix`: absolute path to the training data, e.g., /h5/0.h5
* `--eval_data_path_prefix`: absolute path to the evaluation data
* `--tokenizer_path`: path to the Hugging Face tokenizer file, e.g., /tokenizer/tokenizer.json
* `--bert_config`: path to the `config.json` that defines the model
* `--mlm`: backbone model type, either `bert` or `deberta_v2`

2. To resume training from an earlier checkpoint, run the script below. A filled-in example is sketched after the flag list.

```bash
bash run_pretrain_resume.sh
```

* `--resume_train`: whether to resume training from a checkpoint
* `--load_pretrain_model`: absolute path to the model checkpoint
* `--load_optimizer_lr`: absolute path to the optimizer checkpoint