13 Commits

Author SHA1 Message Date
Hongxin Liu
a7790a92e8 [devops] fix example test ci (#5504) 2024-03-26 15:09:05 +08:00
Hongxin Liu
070df689e6 [devops] fix extension building (#5427) 2024-03-05 15:35:54 +08:00
Frank Lee
73f4dc578e [workflow] updated CI image (#5318) 2024-01-29 11:53:07 +08:00
Hongxin Liu
7f3400b560 [devops] update torch version in ci (#5217) 2024-01-03 11:46:33 +08:00
Wenhao Chen
7172459e74 [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088)
* [shardformer] implement policy for all GPT-J models and test

* [shardformer] support interleaved pipeline parallel for bert finetune

* [shardformer] shardformer support falcon (#4883)

* [shardformer]: fix interleaved pipeline for bert model (#5048)

* [hotfix]: disable seq parallel for gptj and falcon, and polish code (#5093)

* Add Mistral support for Shardformer (#5103)

* [shardformer] add tests to mistral (#5105)

---------

Co-authored-by: Pengtai Xu <henryxu880@gmail.com>
Co-authored-by: ppt0011 <143150326+ppt0011@users.noreply.github.com>
Co-authored-by: flybird11111 <1829166702@qq.com>
Co-authored-by: eric8607242 <e0928021388@gmail.com>
2023-11-28 16:54:42 +08:00
Hongxin Liu
b5f9e37c70 [legacy] clean up legacy code (#4743)
* [legacy] remove outdated codes of pipeline (#4692)

* [legacy] remove cli of benchmark and update optim (#4690)

* [legacy] remove cli of benchmark and update optim

* [doc] fix cli doc test

* [legacy] fix engine clip grad norm

* [legacy] remove outdated colo tensor (#4694)

* [legacy] remove outdated colo tensor

* [test] fix test import

* [legacy] move outdated zero to legacy (#4696)

* [legacy] clean up utils (#4700)

* [legacy] clean up utils

* [example] update examples

* [legacy] clean up amp

* [legacy] fix amp module

* [legacy] clean up gpc (#4742)

* [legacy] clean up context

* [legacy] clean core, constants and global vars

* [legacy] refactor initialize

* [example] fix examples ci

* [example] fix examples ci

* [legacy] fix tests

* [example] fix gpt example

* [example] fix examples ci

* [devops] fix ci installation

* [example] fix examples ci
2023-09-18 16:31:06 +08:00
Hongxin Liu
536397cc95 [devops] fix concurrency group (#4667) 2023-09-11 15:32:50 +08:00
Hongxin Liu
a686f9ddc8 [devops] fix concurrency group and compatibility test (#4665)
* [devops] fix concurrency group

* [devops] fix compatibility test

* [devops] fix tensornvme install

* [devops] fix tensornvme install

* [devops] fix colossalai install
2023-09-08 13:49:40 +08:00
Hongxin Liu
c7b60f7547 [devops] cancel previous runs in the PR (#4546) 2023-08-30 23:07:21 +08:00
Frank Lee
4110d1f0d4 [workflow] cancel duplicated workflow jobs (#3960) 2023-06-12 09:50:57 +08:00
Frank Lee
ad93c736ea [workflow] enable testing for develop & feature branch (#3801) 2023-05-23 11:21:15 +08:00
Frank Lee
719c4d5553 [doc] updated readme for CI/CD (#2600) 2023-02-06 17:42:15 +08:00
Frank Lee
ba47517342 [workflow] fixed example check workflow (#2554)
* [workflow] fixed example check workflow

* polish yaml
2023-02-06 13:46:52 +08:00