Commit Graph

3506 Commits

Author SHA1 Message Date
YuliangLiu0306
3a46215135 [autoparallel] add embedding handler (#1620) 2022-09-23 12:34:30 +08:00
YuliangLiu0306
69448f64c4 [autoparallel] protect bcast handler from invalid strategies (#1631) 2022-09-23 12:12:49 +08:00
YuliangLiu0306
0c703189b9 [autoparallel] add layernorm handler (#1629) 2022-09-23 12:00:25 +08:00
YuliangLiu0306
bf77d3ab65 [autoparallel] recover the merged node strategy index (#1613) 2022-09-23 11:52:42 +08:00
Boyuan Yao
d6b01feb66 [fx] Modify offload codegen (#1618)
* [fx] modify offload codegen

* [fx] remove repeated hook definitions

* [fx] modify offload test
2022-09-23 11:04:52 +08:00
YuliangLiu0306
9eae855408 [hotfix] add recompile after graph manipulation (#1621) 2022-09-23 11:00:33 +08:00
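
The hotfix above points at a general torch.fx rule: mutating a GraphModule's graph takes no effect until recompile() regenerates its forward code. A minimal sketch of that pattern, using only stock torch.fx APIs (the toy module and the relu-to-gelu swap are illustrative, not the actual patch):

```python
import torch
import torch.fx as fx

class MLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))

gm = fx.symbolic_trace(MLP())

# Manipulate the graph: swap every relu call_function node for gelu.
for node in gm.graph.nodes:
    if node.op == "call_function" and node.target is torch.relu:
        node.target = torch.nn.functional.gelu

gm.graph.lint()
gm.recompile()  # without this, forward() still runs the stale generated code
print(gm(torch.randn(2, 4)).shape)
```
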
Super Daniel
d967779a32 [fx/profiler] tuned the calculation of memory estimation (#1619)
* [fx] tuned the meta info and rotor solver.

* [fx] remove import.

* [fx] remove import.

* [fx] remove import.

* [fx] tune the meta calculations.

* [fx] polish comments.

* [fx] remove assertions.

* [fx] modify test cases.

* [fx] modify test cases.

* [fx] optimize import.

* [fx
2022-09-23 10:59:47 +08:00
HELSON
f7f2248771 [moe] fix MoE bugs (#1628)
* remove forced FP32 modules

* correct the positions of no_shard contexts
2022-09-22 13:56:30 +08:00
Jiarui Fang
38c68b5b9a [embedding] rollback for better FAW performance (#1625) 2022-09-22 11:16:25 +08:00
Frank Lee
d925122020 [autoparallel] added new linear module handler (#1616) 2022-09-21 12:23:21 +08:00
Kirigaya Kazuto
170fa81095 [pipeline/chimera] test chimera | fix bug of initializing (#1615)
* [pipeline/tuning] improve dispatch performance in both time and space cost

* [pipeline/converge] add interface for testing convergence

* [NFC] polish colossalai/utils/multi_tensor_apply/multi_tensor_apply.py code style

* Update PipelineBase.py

* [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera

* [pipeline/chimera] test chimera | fix bug of initializing
2022-09-20 18:00:39 +08:00
Jiarui Fang
504ff1d101 [embeddings] use cache_ratio instead of cuda_row_num (#1611) 2022-09-20 14:33:04 +08:00
YuliangLiu0306
6a8f8cc05e [hotfix] got sliced types (#1614) 2022-09-20 14:32:42 +08:00
Frank Lee
d397842fa8 [autoparallel] added new node handler (#1612) 2022-09-20 14:17:21 +08:00
YuliangLiu0306
7d1bb71d5d [fx] PoC of runtime shape consistency application (#1607)
* [fx] PoC of runtime shape consistency application

* polish code
2022-09-20 14:00:04 +08:00
YuliangLiu0306
47b11c432c [autoparallel]add bcast matmul strategies (#1605) 2022-09-20 11:26:21 +08:00
Frank Lee
edb67cb378 [autoparallel] refactored the data structure for sharding strategy (#1610) 2022-09-20 11:20:54 +08:00
Boyuan Yao
933b6c6367 [fx] Add pofo solver (#1608)
* [fx] add pofo algorithm

* [fx] Add pofo solver

* [fx] code refactor

* [fx] fix test_linearize import
2022-09-20 11:20:48 +08:00
github-actions[bot]
d32cf84c46 Automated submodule synchronization (#1609)
Co-authored-by: github-actions <github-actions@github.com>
2022-09-20 10:47:08 +08:00
Frank Lee
725666d6a9 [workflow] deactivate conda environment before removing (#1606) 2022-09-19 12:05:33 +08:00
Kirigaya Kazuto
edc9e419ad [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera (#1595)
* [pipeline/tuning] improve dispatch performance in both time and space cost

* [pipeline/converge] add interface for testing convergence

* [NFC] polish colossalai/utils/multi_tensor_apply/multi_tensor_apply.py code style

* Update PipelineBase.py

* [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera
2022-09-19 11:44:18 +08:00
ver217
c9e8ce67b8 fix move fp32 shards (#1604) 2022-09-16 17:33:16 +08:00
YuliangLiu0306
eac1b79371 [autoparallel] add bcast op handler (#1600)
* [autoparallel] add bcast op handler

* polish code

* add more BCAST FUNC OP

* polish code

* add exception handler

* polish
2022-09-16 11:33:01 +08:00
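
For context on what a broadcast-aware op handler has to cover, here is how PyTorch itself broadcasts the leading batch dimensions in torch.matmul; the shapes are illustrative and not taken from the PR:

```python
import torch

a = torch.randn(6, 1, 4, 5)   # batched operand with batch dims (6, 1)
b = torch.randn(8, 5, 3)      # batch dim (8,) broadcasts against a's

# torch.matmul broadcasts the leading dims: (6, 1) x (8,) -> (6, 8),
# and 4x5 @ 5x3 -> 4x3, so the result is (6, 8, 4, 3). A bcast handler
# must enumerate sharding strategies over these implicit dimensions.
out = torch.matmul(a, b)
print(out.shape)  # torch.Size([6, 8, 4, 3])
```
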
Frank Lee
3abf98a633 [autoparallel] added all non-bcast matmul strategies (#1603) 2022-09-16 10:47:32 +08:00
Frank Lee
db98b695b2 [autoparallel] added strategy generator and bmm strategies (#1602) 2022-09-15 16:57:07 +08:00
Jiarui Fang
a19eb80998 [embedding] updates some default parameters 2022-09-15 15:45:17 +08:00
Super Daniel
cd5cf2bcc9 [fx/tuning] tune performance on rotor with meta info. (#1599) 2022-09-15 14:46:36 +08:00
Boyuan Yao
a7cda6f57d [fx] Add offload codegen (#1598)
* [fx] add input activation offload to codegen

* [fx] modify unit test

* [fx] remove two skips in torch11

* [fx] use all_input_nodes instead of _input_nodes
2022-09-14 15:49:06 +08:00
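
As a rough illustration of input-activation offload (not the generated code itself), PyTorch's built-in torch.autograd.graph.save_on_cpu hook achieves the same effect of parking saved activations on the host:

```python
import torch
from torch.autograd.graph import save_on_cpu

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 1024),
    torch.nn.ReLU(),
    torch.nn.Linear(1024, 1024),
)
x = torch.randn(32, 1024, requires_grad=True)

# Tensors saved for backward are packed onto the CPU at forward time
# and unpacked back on demand during backward, trading transfer time
# for accelerator memory.
with save_on_cpu():
    loss = model(x).sum()
loss.backward()
```

The last bullet is an API-hygiene fix: Node.all_input_nodes is the public torch.fx accessor over a node's argument and keyword-argument nodes, while _input_nodes is the private dict behind it.
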
Super Daniel
c8e9b2ad78 [hotfix/rotor] fix variable names (#1597)
* [fx] add some comment and docstrings.

* [fx] add dataflow analysis for an autograd graph.

* add interpretation for graph analysis.

* [fx] before doing save_tensor_hooks.

* [fx] provide an accurate estimation of memory except for GPT-2.

* [fx] provide an accurate estimation of memory except for GPT-2.

* [fx] provide an accurate estimation of memory except for GPT-2.

* [fx] a very accurate version on GPT-2.

* [fx] refactor code.

* [fx] remove redundant inplace=True.

* [fx] refactor code.

* [fx] refactor code.

* [fx] refactor code.

* [fx] dive into backward memory.

* [fx] fix variable names in ckpt_solvers and unskip tests.

* [fx] commit my changes.

* [fx] restore skips.

* [fx] restore skips.

* [fx] change stage into phase.

* [fx] change stage into phase.

* [fx] change stage into phase.
2022-09-14 14:27:04 +08:00
YuliangLiu0306
faa23b9d9a [autoparallel] add reshape handler (#1594)
* [autoparallel] add reshape handler

* polish code
2022-09-14 10:25:45 +08:00
github-actions[bot]
c938dda028 Automated submodule synchronization (#1596)
Co-authored-by: github-actions <github-actions@github.com>
2022-09-14 09:56:38 +08:00
Super Daniel
5c494d4540 [fx] provide an accurate estimation of memory. (#1587)
* [fx] add some comment and docstrings.

* [fx] add dataflow analysis for an autograd graph.

* add interpretation for graph analysis.

* [fx] before doing save_tensor_hooks.

* [fx] provide an accurate estimation of memory except for GPT-2.

* [fx] provide an accurate estimation of memory except for GPT-2.

* [fx] provide an accurate estimation of memory except for GPT-2.

* [fx] a very accurate version on GPT-2.

* [fx] refactor code.

* [fx] remove redundant inplace=True.

* [fx] refactor code.

* [fx] refactor code.

* [fx] refactor code.

* [fx] dive into backward memory.
2022-09-14 09:36:43 +08:00
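
The save_tensor_hooks bullet refers to PyTorch's saved-tensor hooks. A coarse sketch of memory estimation built on the stock torch.autograd.graph.saved_tensors_hooks API (the toy model and byte counting are illustrative; the PR's profiler works on FX meta information instead):

```python
import torch
from torch.autograd.graph import saved_tensors_hooks

saved_bytes = 0

def pack(tensor):
    global saved_bytes
    # Tally every tensor autograd stashes for the backward pass. This is
    # coarse: parameters saved for backward are counted too, and a tensor
    # saved by several ops is counted once per save.
    saved_bytes += tensor.numel() * tensor.element_size()
    return tensor

def unpack(tensor):
    return tensor

model = torch.nn.Sequential(torch.nn.Linear(512, 512), torch.nn.GELU())
x = torch.randn(64, 512, requires_grad=True)

with saved_tensors_hooks(pack, unpack):
    out = model(x).sum()

print(f"memory saved for backward: {saved_bytes / 2**20:.2f} MiB")
```
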
Frank Lee
27fe8af60c [autoparallel] refactored shape consistency to remove redundancy (#1591)
* [autoparallel] refactored shape consistency to remove redundancy

* polish code

* polish code

* polish code
2022-09-13 18:30:18 +08:00
YuliangLiu0306
d164449d00 [autoparallel] add resnet autoparallel unit test and add backward weight communication cost (#1589) 2022-09-13 18:05:05 +08:00
Frank Lee
7c18a588c8 [autoparallel] added generate_sharding_spec to utils (#1590) 2022-09-13 15:43:22 +08:00
Boyuan Yao
49ccf8b5f8 [fx] Improve linearize and rotor solver (#1586)
* [fx] add nested activation_checkpoint codegen

* undo algorithms commits

* solver

* undo some commits

* [fx] torch11 add nested activation checkpoint codegen

* remove some imports

* [fx] add some comments in activation codegen

* [fx] codegen instance error fix

* [fx] improve linearize and rotor solver

* [fx] some comments and format modification
2022-09-13 14:50:04 +08:00
Frank Lee
219f66c571 [autoparallel] added solver option dataclass (#1588) 2022-09-13 14:47:09 +08:00
YuliangLiu0306
82d4376c23 [autoparallel] adapt solver with resnet (#1583)
* [autoparallel] adapt solver with resnet

* polish code

* polish code
2022-09-13 12:07:09 +08:00
CsRic
f3403ff98e [embeddings] add already_split_along_rank flag for tablewise mode (#1584) 2022-09-13 10:50:34 +08:00
github-actions[bot]
77399dc91b Automated submodule synchronization (#1550)
Co-authored-by: github-actions <github-actions@github.com>
2022-09-13 10:03:33 +08:00
Boyuan Yao
f3687e4ee2 [fx] Add nested checkpoint in activation checkpoint codegen (#1585)
* [fx] add nested activation_checkpoint codegen

* undo algorithms commits

* solver

* undo some commits

* [fx] torch11 add nested activation checkpoint codegen

* remove some imports

* [fx] add some comments in activation codegen

* [fx] codegen instance error fix
2022-09-12 20:00:48 +08:00
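
A behavioral sketch of what nested checkpoint regions mean, using PyTorch's stock torch.utils.checkpoint as a stand-in for the generated code (functions and shapes are illustrative; nesting assumes the non-reentrant variant):

```python
import torch
from torch.utils.checkpoint import checkpoint

lin1 = torch.nn.Linear(256, 256)
lin2 = torch.nn.Linear(256, 256)
lin3 = torch.nn.Linear(256, 256)

def inner(x):
    return torch.relu(lin2(x))

def outer(x):
    x = torch.relu(lin1(x))
    # The inner region is checkpointed again: its activations are dropped
    # and recomputed inside the outer region's own recomputation.
    x = checkpoint(inner, x, use_reentrant=False)
    return lin3(x)

x = torch.randn(8, 256, requires_grad=True)
out = checkpoint(outer, x, use_reentrant=False)
out.sum().backward()
```
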
binmakeswell
1c9ec32734 [NFC] add OPT serving (#1581) 2022-09-09 16:56:45 +08:00
Boyuan Yao
20e466527b [NFC] polish ./colossalai/trainer/hooks/_lr_scheduler_hook.py code style (#1576) 2022-09-08 22:11:04 +08:00
Fazzie-Maqianli
06dccdde44 [NFC] polish colossalai/zero/sharded_model/reduce_scatter.py code style (#1554) 2022-09-08 22:11:04 +08:00
CsRic
2ac46f7be4 [NFC] polish utils/tensor_detector/__init__.py code style (#1573)
Co-authored-by: ric <mkkt_bkkt@mail.ustc.edu.cn>
2022-09-08 22:11:04 +08:00
Sze-qq
2144cbae8c [NFC] polish colossalai/nn/lr_scheduler/multistep.py code style (#1572) 2022-09-08 22:11:04 +08:00
superhao1995
e4bf7ae667 [NFC] polish colossalai/nn/lr_scheduler/torch.py code style (#1571)
Co-authored-by: Research <research@soccf-snr3-017.comp.nus.edu.sg>
2022-09-08 22:11:04 +08:00
Jiatong Han
3263cdf57f [NFC] polish colossalai/nn/parallel/data_parallel.py code style (#1570)
Co-authored-by: JThh <jiatong.han@u.nus.edu>
2022-09-08 22:11:04 +08:00
Zirui Zhu
f566c9b98d [NFC] polish colossalai/pipeline/utils.py code style (#1562) 2022-09-08 22:11:04 +08:00
Xue Fuzhao
e070ca45c6 [NFC] polish colossalai/fx/tracer/meta_patch/patched_module/convolution.py code style (#1563) 2022-09-08 22:11:04 +08:00