[example] update opt example using booster api (#3918)
The following example of [Colossal-AI](https://github.com/hpcaitech/ColossalAI) demonstrates fine-tuning Causal Language Modelling at low cost.
## Our Modifications
We are using the pre-trained weights of the OPT model provided by Hugging Face Hub, on the raw WikiText-2 (no tokens were replaced before the tokenization).
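For illustration, a minimal sketch of pulling those pre-trained weights from the Hub with the `transformers` library (the toy input below is ours; the demo scripts handle loading internally):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the pre-trained OPT weights and tokenizer from Hugging Face Hub.
model_name = "facebook/opt-350m"  # the checkpoint used by the demo below
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Raw text goes straight into the tokenizer (no tokens replaced beforehand).
batch = tokenizer("Colossal-AI fine-tunes OPT at low cost", return_tensors="pt")
outputs = model(**batch, labels=batch["input_ids"])  # causal LM loss
print(float(outputs.loss))
```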
We adapt the OPT training code to ColossalAI by leveraging the [Booster API](https://colossalai.org/docs/basics/booster_api) loaded with a chosen plugin, where each plugin corresponds to a specific kind of training strategy. This example supports plugins including TorchDDPPlugin, LowLevelZeroPlugin, and GeminiPlugin.
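As a rough sketch of what this looks like in code (following the public Booster docs; the optimizer choice, learning rate, and dummy batch below are our assumptions, not the script's exact setup):

```python
import torch
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import GeminiPlugin  # TorchDDPPlugin / LowLevelZeroPlugin fit the same slot
from colossalai.nn.optimizer import HybridAdam
from transformers import AutoModelForCausalLM

colossalai.launch_from_torch(config={})  # expects a torchrun-style launch

# Each plugin encapsulates one distributed training strategy behind one interface.
plugin = GeminiPlugin()
booster = Booster(plugin=plugin)

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
vocab_size = model.config.vocab_size  # read before the model is wrapped
optimizer = HybridAdam(model.parameters(), lr=2e-5)

# boost() wraps the model and optimizer according to the chosen plugin.
model, optimizer, *_ = booster.boost(model, optimizer)

# One illustrative training step on a dummy batch.
input_ids = torch.randint(0, vocab_size, (2, 16)).cuda()
outputs = model(input_ids=input_ids, labels=input_ids)
booster.backward(outputs.loss, optimizer)  # plugin-aware replacement for loss.backward()
optimizer.step()
optimizer.zero_grad()
```

Launched with, e.g., `torchrun --standalone --nproc_per_node 1 sketch.py`, swapping `GeminiPlugin()` for `TorchDDPPlugin()` or `LowLevelZeroPlugin()` changes the strategy without touching the training loop.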
## Run Demo
By running the following script:
```bash
bash run_demo.sh
```
You will fine-tune a [facebook/opt-350m](https://huggingface.co/facebook/opt-350m) model on this [dataset](https://huggingface.co/datasets/hugginglearners/netflix-shows), which contains more than 8000 comments on Netflix shows.
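If you want to peek at the data first, a small snippet with the `datasets` library should do (the presence of a `train` split is an assumption about this community dataset):

```python
from datasets import load_dataset

# Fetch the Netflix-shows dataset referenced above from Hugging Face Hub.
dataset = load_dataset("hugginglearners/netflix-shows")
print(dataset)               # lists the available splits and columns
print(dataset["train"][0])   # inspect one record, assuming a "train" split
```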
The script can be modified if you want to try another set of hyperparameters or switch to another OPT model of a different size.
The demo code is adapted from this [blog](https://medium.com/geekculture/fine-tune-eleutherai-gpt-neo-to-generate-netflix-movie-descriptions-in-only-47-lines-of-code-40c9b4c32475) and the [HuggingFace Language Modelling examples](https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling).
## Run Benchmark
You can run a benchmark for the OPT model by running the following script:
```bash
bash run_benchmark.sh
```
The script will test performance (throughput & peak memory usage) for each combination of hyperparameters. You can also edit this script to configure your own set of hyperparameters for testing.
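The exact knobs live in the script itself, but throughput and peak memory of this kind can be measured with plain PyTorch counters; a minimal sketch (the `step_fn` argument is a placeholder for one training step):

```python
import time
import torch

def benchmark(step_fn, num_steps: int, samples_per_step: int) -> None:
    """Time `step_fn` and report throughput plus peak GPU memory."""
    torch.cuda.reset_peak_memory_stats()
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(num_steps):
        step_fn()  # one forward/backward/optimizer step
    torch.cuda.synchronize()
    elapsed = time.time() - start
    throughput = num_steps * samples_per_step / elapsed    # samples per second
    peak_mib = torch.cuda.max_memory_allocated() / 2**20   # peak GPU memory in MiB
    print(f"throughput: {throughput:.1f} samples/s | peak memory: {peak_mib:.0f} MiB")
```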