Commit Graph

189 Commits

Author SHA1 Message Date
binmakeswell
535b896435 [chat] polish tutorial doc (#3551)
* [chat] clean up duplicate tutorial

* [chat] clean up duplicate tutorial

* [chat] clean up duplicate tutorial

* [chat] clean up duplicate tutorial
2023-04-13 18:11:48 +08:00
Yuanchen
7182ac2a04 [chat] add examples of training with limited resources in chat readme (#3536)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-04-12 15:47:09 +08:00
zhang-yi-chi
e6a132a449 [chat]: add vf_coef argument for PPOTrainer (#3318) 2023-04-11 09:54:59 +08:00
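For context, vf_coef is the standard PPO coefficient that weights the critic (value-function) loss against the policy loss in the combined objective; a minimal sketch with illustrative names, not coati's exact internals:

```python
import torch

def ppo_total_loss(actor_loss: torch.Tensor,
                   critic_loss: torch.Tensor,
                   vf_coef: float = 0.5) -> torch.Tensor:
    # vf_coef plays the role of the c1 coefficient in the PPO paper:
    # it scales the value-function loss relative to the clipped policy loss.
    return actor_loss + vf_coef * critic_loss
```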
ver217
89fd10a1c9 [chat] add zero2 cpu strategy for sft training (#3520) 2023-04-10 19:00:13 +08:00
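ZeRO-2 shards gradients and optimizer states across ranks; the CPU variant additionally offloads optimizer states to host memory, trading step time for GPU memory. A hedged sketch of selecting such a strategy in an SFT script (the class and argument names follow coati's style but are assumptions here):

```python
def build_strategy(name: str):
    # Hypothetical strategy factory; the names are assumptions, not the PR's code.
    from coati.trainer.strategies import ColossalAIStrategy, DDPStrategy
    if name == "colossalai_zero2_cpu":
        # Stage 2 shards gradients + optimizer states; placing them on CPU
        # shrinks the GPU footprint enough to fine-tune on limited hardware.
        return ColossalAIStrategy(stage=2, placement_policy="cpu")
    if name == "colossalai_zero2":
        return ColossalAIStrategy(stage=2)
    return DDPStrategy()
```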
binmakeswell
990d4c3e4e [doc] hide diffusion in application path (#3519)
- [ ] Stable Diffusion
- [ ] Dreambooth
Otherwise it's easy for users to think that we don't support them yet. They will be added back after being migrated from examples to applications:
https://github.com/hpcaitech/ColossalAI/tree/main/examples/images
2023-04-10 17:52:24 +08:00
binmakeswell
0c0455700f [doc] add requirement and highlight application (#3516)
* [doc] add requirement and highlight application

* [doc] link example and application
2023-04-10 17:37:16 +08:00
NatalieC323
635d0a1baf [Chat Community] Update README.md (fixed #3487) (#3506)
* Update README.md

* Update README.md

* Update README.md

* Update README.md

---------

Co-authored-by: Fazzie-Maqianli <55798671+Fazziekey@users.noreply.github.com>
2023-04-10 14:36:39 +08:00
gongenlei
a7ca297281 [coati] Fix LlamaCritic (#3475)
* mv LlamaForCausalLM to LlamaModel

* rm unused imports

---------

Co-authored-by: gongenlei <gongenlei@baidu.com>
2023-04-07 11:39:09 +08:00
binmakeswell
891b8e7fac [chat] fix stage3 PPO sample sh command (#3477) 2023-04-06 18:08:16 +08:00
Fazzie-Maqianli
6afeb1202a add community example directory (#3465) 2023-04-06 15:04:48 +08:00
Frank Lee
80eba05b0a [test] refactor tests with spawn (#3452)
* [test] added spawn decorator

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code
2023-04-06 14:51:35 +08:00
YY Lin
62f4e2eb07 [Chat] Add PEFT support & fix the ptx bug (#3433)
* Update ppo.py

Fix the bug of fetching the wrong batch data

* Add PEFT model support in SFT and prompt training

In stage 1 and stage 3, PEFT model support is added, so the trained artifacts are only small LoRA adapter weights instead of the full set of model files (see the sketch after this entry).

* Delete test_prompts.txt

* Delete test_pretrained.txt

* Move the PEFT code to a community folder.

* Move the demo SFT to community

* delete dirty files

* Add instructions to install peft from source

* Remove Chinese comments

* remove the Chinese comments
2023-04-06 11:54:52 +08:00
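The practical upshot of the PEFT change is that only low-rank adapter weights are trained and saved; a minimal sketch with the Hugging Face peft library (base model and hyperparameters are illustrative):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
lora = LoraConfig(task_type=TaskType.CAUSAL_LM,
                  r=8, lora_alpha=32, lora_dropout=0.1)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # only the LoRA adapters require grad

# save_pretrained on a peft model writes just the small adapter files,
# not the full base checkpoint.
model.save_pretrained("opt-350m-lora-adapter")
```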
Dr-Corgi
73afb63594 [chat] fix save_model (#3377)
The save_model function should be part of PPOTrainer (see the sketch after this entry).
2023-04-06 11:19:14 +08:00
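A plausible shape for that fix, with checkpointing owned by the trainer so callers never have to unwrap the strategy-wrapped actor; the method signature is an assumption, not the PR's exact code:

```python
class PPOTrainer:
    def __init__(self, strategy, actor):
        self.strategy = strategy
        self.actor = actor

    def save_model(self, path: str, only_rank0: bool = True) -> None:
        # The trainer holds the (possibly sharded or wrapped) actor, so the
        # strategy can unwrap it and write a portable checkpoint.
        self.strategy.save_model(self.actor, path, only_rank0=only_rank0)
```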
kingkingofall
57a3c4db6d [chat] fix readme (#3429)
* fix stage 2

fix stage 2

* add torch
2023-04-06 10:58:53 +08:00
Camille Zhong
72cb4dd433 [Chat] fix the tokenizer "int too big to convert" error in SFT training (#3453)
* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

* Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

* Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add roberta option in train_reward_model.py
3. add some tests in test_ci

* Update test_ci.sh

* Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

* Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

* Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add roberta option in train_reward_model.py
3. add some tests in test_ci

* Update test_ci.sh

* Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

* update roberta with coati

* chat ci update

* Revert "chat ci update"

This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.

* [Chat] fix the tokenizer "int too big to convert" error in SFT training

fix the tokenizer error during SFT training using Bloom and OPT
2023-04-06 09:30:28 +08:00
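The "int too big to convert" failure typically comes from tokenizers that report a sentinel model_max_length of roughly 1e30 (Hugging Face's placeholder), which overflows a C int when used as a padding/truncation bound. A hedged sketch of the usual guard; the cap value is illustrative, not the PR's exact fix:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")

# Some tokenizers carry a ~1e30 sentinel in model_max_length; passing it
# down to padding/truncation overflows a C int. Cap it explicitly.
if tokenizer.model_max_length > 100_000:
    tokenizer.model_max_length = 512  # illustrative cap

batch = tokenizer("hello world",
                  max_length=tokenizer.model_max_length,
                  padding="max_length", truncation=True,
                  return_tensors="pt")
```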
Yuanchen
b92313903f fix save_model indent error in ppo trainer (#3450)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-04-05 09:45:42 +08:00
Yuanchen
773955abfa fix save_model in naive and ddp strategy (#3436)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-04-04 15:30:01 +08:00
ver217
26b7aac0be [zero] reorganize zero/gemini folder structure (#3424)
* [zero] refactor low-level zero folder structure

* [zero] fix legacy zero import path

* [zero] fix legacy zero import path

* [zero] remove useless import

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor legacy zero import path

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor legacy zero import path

* [zero] fix test import path

* [zero] fix test

* [zero] fix circular import

* [zero] update import
2023-04-04 13:48:16 +08:00
Yuanchen
b09adff724 [chat] fix sft training for bloom, gpt and opt (#3418)
fix sft training for bloom, gpt and opt
2023-04-04 09:46:23 +08:00
Camille Zhong
30412866e0 [chatgpt] add pre-trained model RoBERTa for RLHF stage 2 & 3 (#3223)
* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

* Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

* Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add roberta option in train_reward_model.py
3. add some tests in test_ci

* add test for reward model training

* Update test_ci.sh

* Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

* Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

* Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add roberta option in train_reward_model.py
3. add some tests in test_ci

* Update test_ci.sh

* Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

* update roberta with coati
2023-04-03 10:11:03 +08:00
Andrew
82132f4e3d [chat] correct a few obvious typos and grammar errors (#3338) 2023-03-30 14:18:37 +08:00
Fazzie-Maqianli
0fbadce79c [doc] added authors to the chat application (#3307) 2023-03-29 11:04:30 +08:00
BlueRum
b512893637 Polish readme link (#3306) 2023-03-29 10:25:50 +08:00
github-actions[bot]
cb413ccf28 [format] applied code formatting on changed files in pull request 3300 (#3302)
Co-authored-by: github-actions <github-actions@github.com>
2023-03-29 09:28:24 +08:00
binmakeswell
31c78f2be3 [doc] add ColossalChat news (#3304)
* [doc] add ColossalChat news

* [doc] add ColossalChat news
2023-03-29 09:27:55 +08:00
Frank Lee
e235a24673 [application] updated the README (#3301)
* [application] updated the README

* polish code
2023-03-29 08:47:00 +08:00
BlueRum
8257e1055d [chat] polish prompts training (#3300)
* polish train_prompts

* polish readme
2023-03-29 08:44:16 +08:00
ver217
62f7156131 [coati] fix inference profanity check (#3299) 2023-03-29 04:26:35 +08:00
github-actions[bot]
5134ad5d1a [format] applied code formatting on changed files in pull request 3296 (#3298)
Co-authored-by: github-actions <github-actions@github.com>
2023-03-29 02:35:40 +08:00
BlueRum
c8b723d6c2 [chat] Update README (#3296)
* Update README.md

* Update README.md

* Update README.md

* update example readme
2023-03-29 02:32:17 +08:00
ver217
73b542a124 [coati] inference supports profanity check (#3295) 2023-03-29 02:14:35 +08:00
ver217
ce2cafae76 [coati] add repetition_penalty for inference (#3294) 2023-03-29 01:18:45 +08:00
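For reference, repetition_penalty (from the CTRL paper) rescales the logits of tokens that have already been generated, discouraging loops; with Hugging Face generate it is a single argument:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("The weather today is", return_tensors="pt").input_ids
# Values > 1.0 penalize tokens that already appeared; 1.0 disables it.
out = model.generate(ids, max_new_tokens=40, do_sample=True,
                     repetition_penalty=1.2,
                     pad_token_id=tok.eos_token_id)
print(tok.decode(out[0], skip_special_tokens=True))
```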
Fazzie-Maqianli
a88ed0f83a add limit (#3293) 2023-03-29 00:53:23 +08:00
Fazzie-Maqianli
c5484281aa [ColossalChat] add citation for datasets (#3292) 2023-03-29 00:38:36 +08:00
Fazzie-Maqianli
ec7af22a43 fix image (#3288) 2023-03-28 23:34:21 +08:00
Fazzie-Maqianli
1f7d9afbf8 add example (#3286) 2023-03-28 23:07:15 +08:00
ver217
4905b21b94 [coati] fix inference output (#3285)
* [coati] fix inference requirements

* [coati] add output postprocess

* [coati] update inference readme

* [coati] fix inference requirements
2023-03-28 21:20:28 +08:00
Fazzie-Maqianli
bb6196e71a remove chatgpt (#3284) 2023-03-28 20:29:09 +08:00
Fazzie-Maqianli
b0ce5a1032 [Coati] first commit (#3283) 2023-03-28 20:25:36 +08:00
binmakeswell
d32ef94ad9 [doc] fix typo (#3222)
* [doc] fix typo

* [doc] fix typo
2023-03-24 13:33:35 +08:00
ver217
78fd31f9c1 [chatgpt] add precision option for colossalai (#3233) 2023-03-24 12:15:06 +08:00
Fazzie-Maqianli
bd39877da4 support instruct training (#3230) 2023-03-24 11:45:01 +08:00
Camille Zhong
9bc702ab48 [doc] update chatgpt doc paper link (#3229)
See issue #3189.
2023-03-24 11:21:39 +08:00
Fazzie-Maqianli
bbac6760e5 fix torch version (#3225) 2023-03-23 20:56:35 +08:00
Fazzie-Maqianli
fa97a9cab4 [chatgpt] unify datasets (#3218) 2023-03-23 17:38:30 +08:00
Fazzie-Maqianli
4fd4bd9d9a [chatgpt] support instruct training (#3216) 2023-03-23 16:46:20 +08:00
Yuanchen
9998d5ef64 [chatgpt] add reward model code for deberta (#3199)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-03-22 19:09:39 +08:00
Fazzie-Maqianli
1e1b9d2fea [chatgpt]support llama (#3070) 2023-03-22 15:44:31 +08:00
pgzhang
b429529365 [chatgpt] add supervised learning fine-tune code (#3183)
* [chatgpt] add supervised fine-tune code

* [chatgpt] delete unused code and modify comments

* [chatgpt] use pytorch distributed sampler instead

---------

Co-authored-by: zhangpengpeng <zhangpengpeng@joyy.com>
2023-03-22 09:59:42 +08:00
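Using PyTorch's DistributedSampler gives each rank a disjoint shard of the dataset and keeps shuffling consistent across epochs; a minimal sketch (requires an initialized process group, e.g. launched via torchrun):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

dataset = TensorDataset(torch.arange(1024).float())  # stand-in for SFT data

# DistributedSampler reads rank/world size from the initialized process group.
sampler = DistributedSampler(dataset, shuffle=True)
loader = DataLoader(dataset, batch_size=8, sampler=sampler)

for epoch in range(3):
    sampler.set_epoch(epoch)  # re-seed the shuffle each epoch
    for (batch,) in loader:
        ...  # training step
```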
BlueRum
7548ca5a54 [chatgpt] Reward Model Training Process update (#3133)
* add normalize function to value_head in bloom rm

* add normalization to value_function in gpt_rm

* add normalization to value_head of opt_rm

* add Anthropic/hh-rlhf dataset

* Update __init__.py

* Add LogExpLoss in RM training

* Update __init__.py

* update rm trainer to use accuracy as target

* update example/train_rm

* Update train_rm.sh

* code style

* Update README.md

* Update README.md

* add rm test to ci

* fix tokenizer

* fix typo

* change batch size to avoid OOM in CI

* Update test_ci.sh
2023-03-20 09:59:06 +08:00
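The LogExpLoss item above is the standard pairwise reward-model objective: for reward r_c on the chosen response and r_r on the rejected one, loss = log(1 + exp(r_r - r_c)) = -log sigmoid(r_c - r_r). A minimal sketch, with the pairwise accuracy that the list uses as the target metric:

```python
import torch
import torch.nn.functional as F

def log_exp_loss(chosen_reward: torch.Tensor,
                 rejected_reward: torch.Tensor) -> torch.Tensor:
    # log(1 + exp(r_r - r_c)) == -logsigmoid(r_c - r_r); the logsigmoid
    # form is numerically stable for large reward gaps.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

def pairwise_accuracy(chosen_reward: torch.Tensor,
                      rejected_reward: torch.Tensor) -> torch.Tensor:
    # Fraction of pairs where the chosen response outranks the rejected one.
    return (chosen_reward > rejected_reward).float().mean()
```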