1644 Commits

Author SHA1 Message Date
Frank Lee
e8dfa2e2e0
[workflow] rebuild cuda kernels when kernel-related files change (#2317) 2023-01-04 17:23:59 +08:00
Jiarui Fang
db6eea3583
[builder] reconfig op_builder for pypi install (#2314) 2023-01-04 16:32:32 +08:00
Fazzie-Maqianli
a9b27b9265
[exmaple] fix dreamblooth format (#2315) 2023-01-04 16:20:00 +08:00
Sze-qq
da1c47f060
update ColossalAI logo (#2316)
Co-authored-by: siqi <siqi@siqis-MacBook-Pro.local>
2023-01-04 15:41:53 +08:00
Junming Wu
4a79c10750
[NFC] polish colossalai/cli/benchmark/__init__.py code style (#2308) 2023-01-04 15:09:57 +08:00
Ofey Chan
87d2defda6
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/layer_norm_handler.py code style (#2305) 2023-01-04 15:09:57 +08:00
ver217
116e3d0b8f
[NFC] polish communication/p2p_v2.py code style (#2303) 2023-01-04 15:09:57 +08:00
xyupeng
b965585d05
[NFC] polish colossalai/amp/torch_amp/torch_amp.py code style (#2290) 2023-01-04 15:09:57 +08:00
Zangwei Zheng
d1e5bafcd4
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/__init__.py code style (#2291) 2023-01-04 15:09:57 +08:00
shenggan
950685873f
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/reshape_handler.py code style (#2292) 2023-01-04 15:09:57 +08:00
Ziheng Qin
3041014089
[NFC] polish colossalai/amp/naive_amp/grad_scaler/dynamic_grad_scaler.py code style (#2299)
Co-authored-by: henryqin1997 <henryqin1997@gamil.com>
2023-01-04 15:09:57 +08:00
アマデウス
49715a78f0
[NFC] polish colossalai/cli/benchmark/benchmark.py code style (#2287) 2023-01-04 15:09:57 +08:00
Zirui Zhu
1c29b173c9
[NFC] polish colossalai/auto_parallel/tensor_shard/node_handler/getitem_handler.py code style (#2289) 2023-01-04 15:09:57 +08:00
Zihao
3a02b46447
[auto-parallel] refactoring ColoTracer (#2118)
* add meta_data_computing

* add checkpoint_annotation

* rename proxy.data to proxy.meta_data and add bias addition pass

* polish code

* delete meta_prop_pass invoke and rename ori_node to orig_node

* add TracerType

* unify meta data computing

* delete TracerType

* handle setitem operation

* operator.setitem
2023-01-04 14:44:22 +08:00
Jiarui Fang
32253315b4
[example] update diffusion readme with official lightning (#2304) 2023-01-04 13:13:38 +08:00
HELSON
5d3a2be3af
[amp] add gradient clipping for unit tests (#2283)
* [amp] add gradient clipping in unit tests

* fix bugs
2023-01-04 11:59:56 +08:00
HELSON
e00cedd181
[example] update gemini benchmark bash (#2306) 2023-01-04 11:59:26 +08:00
Frank Lee
9b765e7a69
[setup] removed the build dependency on colossalai (#2307) 2023-01-04 11:38:42 +08:00
Boyuan Yao
d45695d94e
Merge pull request #2258 from hpcaitech/debug/ckpt-autoparallel
[autockpt] provide option for activation checkpoint search in SPMD solver
2023-01-04 11:37:28 +08:00
binmakeswell
c8144223b8
[doc] update diffusion doc (#2296) 2023-01-03 21:27:44 +08:00
binmakeswell
2fac699923
[doc] update news (#2295) 2023-01-03 21:09:11 +08:00
binmakeswell
4b72b2d4d3
[doc] update news 2023-01-03 21:05:54 +08:00
Jiarui Fang
16cc8e6aa7
[builder] MOE builder (#2277) 2023-01-03 20:29:39 +08:00
Boyuan Yao
b904748210
[autoparallel] bypass MetaInfo when unavailable and modify BCAST_FUNC_OP metainfo (#2293)
* [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline

* [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop

* [autoparallel] specifycomm nodes' memory cost in construct chain

* [autoparallel] fix wrong runtime apply calculation

* [autoparallel] fix wrong runtime apply calculation

* [autoparallel] fix wrong runtime apply calculation

* [autoparallel] bypass metainfo when available and modify BCAST_FUNC_OP
2023-01-03 20:28:01 +08:00
Jiarui Fang
26e171af6c
[version] 0.1.14 -> 0.2.0 (#2286) 2023-01-03 20:25:13 +08:00
Super Daniel
8ea50d999e
[hotfix] pass a parameter. (#2288)
* [autockpt] make it work.

* [autockpt] linearize / merge shape-consistency nodes.

* [autockpt] considering parameter and optimizer weights.

* [hotfix] pass a parameter.
2023-01-03 18:05:06 +08:00
ZijianYY
df1d6dc553
[examples] using args and combining two versions for PaLM (#2284) 2023-01-03 17:49:00 +08:00
zbian
e94c79f15b
improved allgather & reducescatter for 3d 2023-01-03 17:46:08 +08:00
binmakeswell
c719798abe
[doc] add feature diffusion v2, bloom, auto-parallel (#2282) 2023-01-03 17:35:07 +08:00
HELSON
62c38e3330
[zero] polish low level zero optimizer (#2275) 2023-01-03 17:22:34 +08:00
Ziyue Jiang
ac863a01d6
[example] add benchmark (#2276)
* add benchmark

* merge common func

* add total and avg tflops

Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-03 17:20:59 +08:00
Boyuan Yao
22e947f982
[autoparallel] fix runtime apply memory estimation (#2281)
* [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline

* [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop

* [autoparallel] specifycomm nodes' memory cost in construct chain

* [autoparallel] fix wrong runtime apply calculation

* [autoparallel] fix wrong runtime apply calculation

* [autoparallel] fix wrong runtime apply calculation
2023-01-03 17:18:07 +08:00
BlueRum
1405b4381e
[example] fix save_load bug for dreambooth (#2280) 2023-01-03 17:13:29 +08:00
Super Daniel
8e8900ff3f
[autockpt] considering parameter and optimizer weights. (#2279)
* [autockpt] make it work.

* [autockpt] linearize / merge shape-consistency nodes.

* [autockpt] considering parameter and optimizer weights.
2023-01-03 16:55:49 +08:00
YuliangLiu0306
f027ef7913
[hotfix] fix fp16 optimzier bug (#2273) 2023-01-03 16:53:43 +08:00
YuliangLiu0306
fb87322773
[autoparallel] fix spelling error (#2270) 2023-01-03 16:13:00 +08:00
Jiarui Fang
af32022f74
[Gemini] fix the convert_to_torch_module bug (#2269) 2023-01-03 15:55:35 +08:00
Jiarui Fang
879df8b943
[example] GPT polish readme (#2274) 2023-01-03 15:46:52 +08:00
Ziyue Jiang
9654df0e9a
Add GPT PP Example (#2272)
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-03 15:17:26 +08:00
Super Daniel
b0d21d0c4f
[autockpt] linearize / merge shape-consistency nodes. (#2271)
* [autockpt] make it work.

* [autockpt] linearize / merge shape-consistency nodes.
2023-01-03 14:54:22 +08:00
YuliangLiu0306
4b29112ab2
[autoparallel] gpt2 autoparallel examples (#2267)
* [autoparallel] gpt2 autoparallel examples

* polish code

* polish code
2023-01-03 14:23:33 +08:00
Ziyue Jiang
8b045b3c1f
[Pipeline Middleware] Reduce comm redundancy by getting accurate output (#2232)
* move to cpu to avoid dead lock

* get output by offsets

Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-03 13:43:57 +08:00
HELSON
09c0102fe6
[example] fix gpt example with 0.1.10 (#2265) 2023-01-03 13:38:14 +08:00
Boyuan Yao
5c2ef9fc76
[autoparallel] modify comm nodes' memory cost in construct chain (#2263)
* [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline

* [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop

* [autoparallel] specifycomm nodes' memory cost in construct chain
2023-01-03 11:38:48 +08:00
Fazzie-Maqianli
89f048a88a
[example] clear diffuser image (#2262) 2023-01-03 10:57:02 +08:00
Boyuan Yao
1ea99b869e
[autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline (#2261) 2023-01-03 10:30:15 +08:00
Super Daniel
3ccf58aa76
[autockpt] make it work. (#2257) 2023-01-02 23:37:45 +08:00
Boyuan Yao
ac3739930d
[autoparallel] modify construct chain in rotor solver (#2254) 2023-01-02 16:26:12 +08:00
Boyuan Yao
ab38aebace
[autoparallel] Hook all meta information on ResNet nodes for auto activation checkpoint (#2248)
* [autoparallel] hook node meta on graph nodes for checkpoint solver

* [autoparallel] polish code

* [autoparallel] restore some node handlers

* colossalai/auto_parallel/passes/meta_info_prop.py

* [autoparallel] remove some unused import

* [autoparallel] hook bwd_mem_out
2023-01-02 16:25:18 +08:00
Boyuan Yao
c8c79102f0
[autoparallel] patch torch.flatten metainfo for autoparallel (#2247)
* [autoparallel] patch torch.flatten
2023-01-02 15:51:03 +08:00