mirror of https://github.com/hpcaitech/ColossalAI.git synced 2025-09-05 02:51:59 +00:00
Commit Graph

85 Commits

Author SHA1 Message Date
Hongxin Liu
7f8b16635b [misc] refactor launch API and tensor constructor ()
* [misc] remove config arg from initialize

* [misc] remove old tensor constructor

* [plugin] add npu support for ddp

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [devops] fix doc test ci

* [test] fix test launch

* [doc] update launch doc

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-04-29 10:40:11 +08:00
Edenzzzz
15055f9a36 [hotfix] quick fixes to make legacy tutorials runnable ()
Co-authored-by: Edenzzzz <wtan45@wisc.edu>
2024-04-07 12:06:27 +08:00
Frank Lee
8823cc4831 Merge pull request from hpcaitech/feature/npu
Feature/npu
2024-01-29 13:49:39 +08:00
Frank Lee
7cfed5f076 [feat] refactored extension module ()
* [feat] refactored extension module

* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish
2024-01-25 17:01:48 +08:00
digger yu
bce9499ed3 fix some typo () 2024-01-25 13:56:27 +08:00
Hongxin Liu
d202cc28c0 [npu] change device to accelerator api ()
* update accelerator

* fix timer

* fix amp

* update

* fix

* update bug

* add error raise

* fix autocast

* fix set device

* remove doc accelerator

* update doc

* update doc

* update doc

* use nullcontext

* update cpu

* update null context

* change time limit for example

* update

* update

* update

* update

* [npu] polish accelerator code

---------

Co-authored-by: Xuanlei Zhao <xuanlei.zhao@gmail.com>
Co-authored-by: zxl <43881818+oahzxl@users.noreply.github.com>
2024-01-09 10:20:05 +08:00
flybird11111
26cd6d850c [fix] fix weekly running example ()
* [fix] fix weekly running example

* [fix] fix weekly running example
2023-09-25 16:19:33 +08:00
Hongxin Liu
079bf3cb26 [misc] update pre-commit and run all files ()
* [misc] update pre-commit

* [misc] run pre-commit

* [misc] remove useless configuration files

* [misc] ignore cuda for clang-format
2023-09-19 14:20:26 +08:00
Hongxin Liu
b5f9e37c70 [legacy] clean up legacy code ()
* [legacy] remove outdated codes of pipeline ()

* [legacy] remove cli of benchmark and update optim ()

* [legacy] remove cli of benchmark and update optim

* [doc] fix cli doc test

* [legacy] fix engine clip grad norm

* [legacy] remove outdated colo tensor ()

* [legacy] remove outdated colo tensor

* [test] fix test import

* [legacy] move outdated zero to legacy ()

* [legacy] clean up utils ()

* [legacy] clean up utils

* [example] update examples

* [legacy] clean up amp

* [legacy] fix amp module

* [legacy] clean up gpc ()

* [legacy] clean up context

* [legacy] clean core, constants and global vars

* [legacy] refactor initialize

* [example] fix examples ci

* [example] fix examples ci

* [legacy] fix tests

* [example] fix gpt example

* [example] fix examples ci

* [devops] fix ci installation

* [example] fix examples ci
2023-09-18 16:31:06 +08:00
Hongxin Liu
554aa9592e [legacy] move communication and nn to legacy and refactor logger ()
* [legacy] move communication to legacy ()

* [legacy] refactor logger and clean up legacy codes ()

* [legacy] make logger independent of gpc

* [legacy] make optim independent of registry

* [legacy] move test engine to legacy

* [legacy] move nn to legacy ()

* [legacy] move nn to legacy

* [checkpointio] fix save hf config

* [test] remove useless rpc pp test

* [legacy] fix nn init

* [example] skip tutorial hybrid parallel example

* [devops] test doc check

* [devops] test doc check
2023-09-11 16:24:28 +08:00
Hongxin Liu
8accecd55b [legacy] move engine to legacy ()
* [legacy] move engine to legacy

* [example] fix seq parallel example

* [example] fix seq parallel example

* [test] test gemini plugin hang

* [test] test gemini plugin hang

* [test] test gemini plugin hang

* [test] test gemini plugin hang

* [test] test gemini plugin hang

* [example] update seq parallel requirements
2023-09-05 21:53:10 +08:00
Tian Siyuan
f1ae8c9104 [example] change accelerate version ()
Co-authored-by: Siyuan Tian <siyuant@vmware.com>
Co-authored-by: Hongxin Liu <lhx0217@gmail.com>
2023-08-30 22:56:13 +08:00
Hongxin Liu
27061426f7 [gemini] improve compatibility and add static placement policy ()
* [gemini] remove distributed-related part from colotensor ()

* [gemini] remove process group dependency

* [gemini] remove tp part from colo tensor

* [gemini] patch inplace op

* [gemini] fix param op hook and update tests

* [test] remove useless tests

* [test] remove useless tests

* [misc] fix requirements

* [test] fix model zoo

* [test] fix model zoo

* [test] fix model zoo

* [test] fix model zoo

* [test] fix model zoo

* [misc] update requirements

* [gemini] refactor gemini optimizer and gemini ddp ()

* [gemini] update optimizer interface

* [gemini] renaming gemini optimizer

* [gemini] refactor gemini ddp class

* [example] update gemini related example

* [example] update gemini related example

* [plugin] fix gemini plugin args

* [test] update gemini ckpt tests

* [gemini] fix checkpoint io

* [example] fix opt example requirements

* [example] fix opt example

* [example] fix opt example

* [example] fix opt example

* [gemini] add static placement policy ()

* [gemini] add static placement policy

* [gemini] fix param offload

* [test] update gemini tests

* [plugin] update gemini plugin

* [plugin] update gemini plugin docstr

* [misc] fix flash attn requirement

* [test] fix gemini checkpoint io test

* [example] update resnet example result ()

* [example] update bert example result ()

* [doc] update gemini doc ()

* [example] update gemini related examples ()

* [example] update gpt example

* [example] update dreambooth example

* [example] update vit

* [example] update opt

* [example] update palm

* [example] update vit and opt benchmark

* [hotfix] fix bert in model zoo ()

* [hotfix] fix bert in model zoo

* [test] remove chatglm gemini test

* [test] remove sam gemini test

* [test] remove vit gemini test

* [hotfix] fix opt tutorial example ()

* [hotfix] fix opt tutorial example

* [hotfix] fix opt tutorial example
2023-08-24 09:29:25 +08:00
Tian Siyuan
ff836790ae [doc] fix a typo in examples/tutorial/auto_parallel/README.md ()
Co-authored-by: Siyuan Tian <siyuant@vmware.com>
2023-08-15 00:22:57 +08:00
binmakeswell
089c365fa0 [doc] add Series A Funding and NeurIPS news ()
* [doc] add Series A Funding and NeurIPS news

* [kernel] fix mha kernel

* [CI] skip moe

* [CI] fix requirements
2023-08-04 17:42:07 +08:00
github-actions[bot]
4e9b09c222 Automated submodule synchronization ()
Co-authored-by: github-actions <github-actions@github.com>
2023-07-12 17:35:58 +08:00
github-actions[bot]
62c7e67f9f [format] applied code formatting on changed files in pull request 3786 ()
Co-authored-by: github-actions <github-actions@github.com>
2023-05-22 14:42:09 +08:00
binmakeswell
ad2cf58f50 [chat] add performance and tutorial () 2023-05-19 18:03:56 +08:00
digger-yu
b7141c36dd [CI] fix some spelling errors ()
* fix spelling error with examples/community/

* fix spelling error with tests/

* fix some spelling error with tests/ colossalai/ etc.
2023-05-10 17:12:03 +08:00
Hongxin Liu
3bf09efe74 [booster] update prepare dataloader method for plugin ()
* [booster] add prepare dataloader method for plug

* [booster] update examples and docstr
2023-05-08 15:44:03 +08:00
Hongxin Liu
f83ea813f5 [example] add train resnet/vit with booster example ()
* [example] add train vit with booster example

* [example] update readme

* [example] add train resnet with booster example

* [example] enable ci

* [example] enable ci

* [example] add requirements

* [hotfix] fix analyzer init

* [example] update requirements
2023-05-08 10:42:30 +08:00
Hongxin Liu
d556648885 [example] add finetune bert with booster example () 2023-05-06 11:53:13 +08:00
github-actions[bot]
d544ed4345 [bot] Automated submodule synchronization ()
Co-authored-by: github-actions <github-actions@github.com>
2023-04-19 10:38:12 +08:00
binmakeswell
f1b3d60cae [example] reorganize for community examples () 2023-04-14 16:27:48 +08:00
Frank Lee
80eba05b0a [test] refactor tests with spawn ()
* [test] added spawn decorator

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code
2023-04-06 14:51:35 +08:00
Frank Lee
7d8d825681 [booster] fixed the torch ddp plugin with the new checkpoint api () 2023-04-06 09:43:51 +08:00
ver217
573af84184 [example] update examples related to zero/gemini ()
* [zero] update legacy import

* [zero] update examples

* [example] fix opt tutorial

* [example] fix opt tutorial

* [example] fix opt tutorial

* [example] fix opt tutorial

* [example] fix import
2023-04-04 17:32:51 +08:00
ver217
26b7aac0be [zero] reorganize zero/gemini folder structure ()
* [zero] refactor low-level zero folder structure

* [zero] fix legacy zero import path

* [zero] fix legacy zero import path

* [zero] remove useless import

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor legacy zero import path

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor legacy zero import path

* [zero] fix test import path

* [zero] fix test

* [zero] fix circular import

* [zero] update import
2023-04-04 13:48:16 +08:00
YuliangLiu0306
fd6add575d [examples] polish AutoParallel readme () 2023-03-28 10:40:07 +08:00
Frank Lee
73d3e4d309 [booster] implemented the torch ddp + resnet example ()
* [booster] implemented the torch ddp + resnet example

* polish code
2023-03-27 10:24:14 +08:00
github-actions[bot]
0aa92c0409 Automated submodule synchronization ()
Co-authored-by: github-actions <github-actions@github.com>
2023-03-13 08:58:06 +08:00
binmakeswell
018936a3f3 [tutorial] update notes for TransformerEngine () 2023-03-10 16:30:52 +08:00
Kirthi Shankar Sivamani
65a4dbda6c [NVIDIA] Add FP8 example using TE ()
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
2023-03-10 16:24:08 +08:00
binmakeswell
52a5078988 [doc] add ISC tutorial ()
* [doc] add ISC tutorial

* [doc] add ISC tutorial

* [doc] add ISC tutorial

* [doc] add ISC tutorial
2023-03-06 10:36:38 +08:00
github-actions[bot]
827a0af8cc Automated submodule synchronization ()
Co-authored-by: github-actions <github-actions@github.com>
2023-03-03 10:55:45 +08:00
binmakeswell
0afb55fc5b [doc] add os scope, update tutorial install and tips () 2023-02-27 14:59:27 +08:00
Zheng Zeng
597914317b [doc] fix typo in opt inference tutorial () 2023-02-21 17:16:13 +08:00
github-actions[bot]
a5721229d9 Automated submodule synchronization ()
Co-authored-by: github-actions <github-actions@github.com>
2023-02-20 17:35:46 +08:00
github-actions[bot]
d701ef81b1 Automated submodule synchronization ()
Co-authored-by: github-actions <github-actions@github.com>
2023-02-15 09:39:44 +08:00
github-actions[bot]
88416019e7 Automated submodule synchronization ()
Co-authored-by: github-actions <github-actions@github.com>
2023-02-13 18:10:54 +08:00
binmakeswell
9ab14b20b5 [doc] add CVPR tutorial () 2023-02-10 20:43:34 +08:00
Frank Lee
4ae02c4b1c [tutorial] added energonai to opt inference requirements () 2023-02-07 16:58:06 +08:00
binmakeswell
0556f5d468 [tutorial] add video link () 2023-02-07 15:14:51 +08:00
github-actions[bot]
ae86be1fd2 Automated submodule synchronization ()
Co-authored-by: github-actions <github-actions@github.com>
2023-02-07 09:33:27 +08:00
binmakeswell
039b0c487b [tutorial] polish README () 2023-02-04 17:49:52 +08:00
oahzxl
4f5ef73a43 [tutorial] update fastfold tutorial ()
* update readme

* update

* update
2023-02-03 16:54:28 +08:00
YuliangLiu0306
f477a14f4a [hotfix] fix autoparallel demo () 2023-01-31 17:42:45 +08:00
LuGY
ecbad93b65 [example] Add fastfold tutorial ()
* add fastfold example

* pre-commit polish

* pre-commit polish readme and add empty test ci

* Add test_ci and reduce the default sequence length
2023-01-30 17:08:18 +08:00
Frank Lee
8b7495dd54 [example] integrate seq-parallel tutorial with CI () 2023-01-13 14:40:05 +08:00
Frank Lee
e6943e2d11 [example] integrate autoparallel demo with CI ()
* [example] integrate autoparallel demo with CI

* polish code

* polish code

* polish code

* polish code
2023-01-12 16:26:42 +08:00