mirror of https://github.com/hpcaitech/ColossalAI.git synced 2025-06-03 04:39:43 +00:00

Wenhao Chen edd75a59ea [chat] remove naive strategy and split colossalai strategy (#4094 ) * feat: remove on_learn_epoch fn as not used * revert: add _on_learn_epoch fn * to: remove the use of NaiveStrategy * test: remove NaiveStrategy tests * feat: remove NaiveStrategy * style: modify comments and params * feat: split ColossalAIStrategy into LowLevelZeroStrategy and GeminiStrategy * fix: remove naive * fix: align with modified colossal strategy * fix: fix ddp _try_init_dist arg		2023-06-29 18:11:00 +08:00
easy_dataset.py	fix some spelling error with applications/Chat/examples/ (#3692 )	2023-05-06 11:27:23 +08:00
easy_models.py	add community example dictionary (#3465 )	2023-04-06 15:04:48 +08:00
README.md	[NFC] fix typo applications/ and colossalai/ (#3735 )	2023-05-15 11:46:25 +08:00
train_peft_prompts.py	[chat] remove naive strategy and split colossalai strategy (#4094 )	2023-06-29 18:11:00 +08:00
train_peft_sft.py	[chat] remove naive strategy and split colossalai strategy (#4094 )	2023-06-29 18:11:00 +08:00

README.md

Add PEFT support for SFT and Prompts model training

The original implementation simply adopts loralib and merges the LoRA layers into the final model. The Hugging Face peft library is a better LoRA implementation, and its models can be easily trained and distributed.
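To make "merges the layers into the final model" concrete, here is a minimal pure-Python sketch of the standard LoRA merge, W' = W + (alpha / r) * B A. It is not the loralib or peft code; the helper names (`matmul`, `matvec`, `merge_lora`) and the toy values are hypothetical and chosen only for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Fold the low-rank update into the base weight: W + (alpha / rank) * B @ A."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Toy layer with 3 inputs and 2 outputs, plus a rank-1 adapter (arbitrary values).
W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]  # shape: out x in
A = [[1.0, 2.0, 3.0]]                    # shape: rank x in
B = [[0.5], [-1.0]]                      # shape: out x rank
alpha, rank = 2.0, 1
x = [1.0, 1.0, 1.0]

# Unmerged path: base output plus the scaled adapter output.
scale = alpha / rank
adapter_out = [base + scale * extra
               for base, extra in zip(matvec(W, x), matvec(B, matvec(A, x)))]

# Merged path: a single matrix that already contains the adapter.
merged = merge_lora(W, A, B, alpha, rank)
merged_out = matvec(merged, x)

print(adapter_out, merged_out)
```

Both paths give the same output, which is why a merged model needs no adapter code at inference time. Keeping the adapter separate, as peft does during training, means only the small A and B matrices have to be trained and saved.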

Since the reward model is relatively small, I keep it as in the original implementation. I suggest training the full model to get a proper reward/critic model.

Preliminary installation

Since the current PyPI peft package (0.2) has some bugs, please install the peft package from source.

git clone https://github.com/huggingface/peft
cd peft
pip install .
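After installing, it can be useful to check which peft version Python actually picks up. The sketch below uses only the standard library; the helper names are hypothetical. It assumes that the affected release is any version string starting with `0.2`, and that a source build reports a different version, which may not hold for every checkout.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_affected_release(version):
    """Assumption: the buggy PyPI release is any 0.2.x version."""
    return version.split(".")[:2] == ["0", "2"]


version = installed_version("peft")
if version is None:
    print("peft is not installed")
elif is_affected_release(version):
    print(f"peft {version} looks like the PyPI 0.2 release; consider a source install")
else:
    print(f"peft {version} is installed")
```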

Usage

For SFT training, just call train_peft_sft.py.

Its arguments are almost identical to those of train_sft.py, except for a new eval_dataset argument, which you can set if you have an evaluation dataset file. The data file is just a plain data file; please check the format in easy_dataset.py.

For stage-3 RLHF training, call train_peft_prompts.py. Its arguments are almost identical to those of train_prompts.py. The only difference is that I use text files to specify the prompt and pretrained data files. The models are included in easy_models.py. Currently only BLOOM models are tested, but technically GPT-2/OPT/LLaMA should be supported.

Data format

Please refer to the formats in test_sft.txt, test_prompts.txt, and test_pretrained.txt.