Commit Graph

1179 Commits

Author SHA1 Message Date
YuliangLiu0306
845ff4a47a [autoparallel] resnet block runtime apply (#1709)
* [autoparallel] resnet block runtime apply

* seperate buffer and parameter in MemoryCost

* polish code

* add comments and todos

* fix test issue
2022-10-17 13:37:38 +08:00
Frank Lee
22a115406b [autoparallel] fixed broken node handler tests (#1708) 2022-10-14 18:25:59 +08:00
HELSON
1468e4bcfc [zero] add constant placement policy (#1705)
* fixes memory leak when paramter is in fp16 in ZeroDDP init.
* bans chunk releasement in CUDA. Only when a chunk is about to offload, it is allowed to release.
* adds a constant placement policy. With it, users can allocate a reserved caching memory space for parameters.
2022-10-14 17:53:16 +08:00
Frank Lee
6c331a5a09 [autoparallel] refactored the autoparallel module for organization (#1706)
* [autoparallel] refactored the autoparallel module for organization

* polish code
2022-10-14 13:27:00 +08:00
Frank Lee
91cd34e6e0 [unittest] added doc for the pytest wrapper (#1704) 2022-10-14 10:56:17 +08:00
YuliangLiu0306
451cd72dea [autoparallel] adapt runtime passes (#1703)
* [autoparallel] adapt runtime passes v2

* polish code
2022-10-14 10:14:07 +08:00
Jiarui Fang
21962e1593 [embedding] rename FreqAwareEmbedding -> CachedEmbedding (#1699) 2022-10-13 22:22:27 +08:00
Frank Lee
0e52f3d3d5 [unittest] supported condititonal testing based on env var (#1701)
polish code
2022-10-13 19:38:45 +08:00
Frank Lee
8283e95db3 [autoparallel] collated all deprecated files (#1700)
* [autoparallel] collated all deprecated files

* polish code
2022-10-13 18:24:11 +08:00
YuliangLiu0306
81f7530ee7 [autoparallel] adapt solver and CostGraph with new handler (#1695)
* [autoparallel] adapt solver and CostGraph with new handler

* fix test issue
2022-10-13 14:04:15 +08:00
YuliangLiu0306
42b882ef06 [autoparallel] add output handler and placeholder handler (#1694)
* [autoparallel] add output handler and placeholder handler

* Delete test_solver_with_resnet.py

* fix test bugs
2022-10-13 13:42:36 +08:00
YuliangLiu0306
56088e6d98 [autoparallel] add pooling handler (#1690)
* [autoparallel] add pooling handler

* polish code
2022-10-13 13:42:13 +08:00
YuliangLiu0306
319d654f79 [autoparallel] where_handler_v2 (#1688)
* where generator

* [autoparallel] where_handler_v2
2022-10-13 11:02:22 +08:00
Boyuan Yao
31d2f03d27 [autoparallel] fix C version rotor inconsistency (#1691) 2022-10-12 15:21:58 +08:00
Frank Lee
4973157ad7 [autoparallel] added sharding spec conversion for linear handler (#1687) 2022-10-12 11:16:18 +08:00
YuliangLiu0306
af718e83f2 [autoparallel] add reshape handler v2 and fix some previous bug (#1683) 2022-10-11 18:12:59 +08:00
Super Daniel
3dd6994427 [fx/profiler] assigned UUID to each unrecorded tensor/ improved performance on GPT-2 (#1679)
* [fx/profiler] modify data_ptr into uuid for all tensors.

* [fx] modify uuid.

* [fx/profiler] tune performance on GPT-2.

* [fx] updates.

* [fx] debug.

* [fx] debug.

* [fx] cuda.
2022-10-11 11:03:35 +08:00
YuliangLiu0306
517b63939a [autoparallel] add unary element wise handler v2 (#1674) 2022-10-09 17:30:42 +08:00
YuliangLiu0306
f6c6a932b8 [autoparallel] add following node generator (#1673)
* [autoparallel] add following node generator

* polish code

* polish code

* update name of arguments
2022-10-09 14:49:18 +08:00
YuliangLiu0306
52fda88796 [autoparallel] add layer norm handler v2 (#1671)
* [autoparallel] add layer norm handler v2

* polish code

* polish code
2022-10-09 14:23:22 +08:00
HELSON
b28991dd0a [feature] A new ZeRO implementation (#1644) 2022-10-09 09:18:51 +08:00
Boyuan Yao
1df98d5b66 [autoparallel] add rotor C version (#1658)
* [autoparallel] add rotor c version

* [fx] remove metainfoprop in rotor solver

* [autoparallel] modify C
 code format

* [autoparallel] remove build.py

* [autoparallel] fix C extension build

* [autoparallel] add C solver consistency test

* [autoparallel] remove some unused imports

* [autoparallel] refactor rotor solver code

* [autoparallel] replace print with colossalai logger

* [autoparallel] ranks fixed
2022-10-03 17:13:30 +08:00
YuliangLiu0306
11ec070e53 [hotfix]unit test (#1670) 2022-09-29 12:49:28 +08:00
Frank Lee
a60024e77a [autoparallel] added utils for broadcast operation (#1665)
* [autoparallel] added utils for broadcast operation

* polish code
2022-09-29 11:22:29 +08:00
YuliangLiu0306
3f068d1409 [autoparallel] update CommSpec (#1667) 2022-09-29 11:20:59 +08:00
YuliangLiu0306
746f8f979d [autoparallel] add batch norm handler v2 (#1666) 2022-09-29 11:02:49 +08:00
Kirigaya Kazuto
9708638ded [pipeline/pytree] add pytree to process args and kwargs | provide data_process_func to process args and kwargs after forward (#1642)
* [pipeline/tuning] improve dispatch performance both time and space cost

* [pipeline/converge] add interface for testing convergence

* [NFC] polish colossalai/utils/multi_tensor_apply/multi_tensor_apply.py code style

* Update PipelineBase.py

* [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera

* [pipeline/chimera] test chimera | fix bug of initializing

* [pipeline/pytree] add pytree to process args and kwargs | provide  to process args and kwargs after forward
2022-09-29 10:58:58 +08:00
Frank Lee
3a4d6f63a8 [autoparallel] added node handler for bmm (#1655) 2022-09-28 11:32:16 +08:00
YuliangLiu0306
095854477f [autoparallel] add conv handler v2 (#1663) 2022-09-28 11:24:59 +08:00
YuliangLiu0306
1e7816a460 [autoparallel] adapt solver with gpt (#1653) 2022-09-28 11:17:26 +08:00
Frank Lee
30e50c8b4a [autoparallel] implemented all matmul strategy generator (#1650) 2022-09-27 12:06:25 +08:00
YuliangLiu0306
03978aad45 [autoparallel] change the following nodes strategies generation logic (#1636)
* [autoparallel] change the following nodes strategies generation logic

* fix unit test
2022-09-27 11:20:52 +08:00
YuliangLiu0306
59f100510a [autoparallel] where handler (#1651)
* [autoparallel] where handler

* fix unit test
2022-09-27 11:20:43 +08:00
Boyuan Yao
5d0fdb9cb4 [fx] fix offload codegen test (#1648)
* [fx] fix offload codegen test

* [fx] modify typing
2022-09-27 10:25:27 +08:00
Frank Lee
45b39a692a [autoparallel] implemented linear projection strategy generator (#1639) 2022-09-26 16:58:14 +08:00
Frank Lee
154d3ef432 [fix] fixed the collective pattern name for consistency (#1649)
* [fix] fixed the collective pattern name for consistency

* polish code
2022-09-26 16:39:37 +08:00
YuliangLiu0306
b2b2a4af98 [autoparallel] adapt solver with mlp (#1638) 2022-09-26 15:26:14 +08:00
Jiarui Fang
c5d39215f6 Revert "[feature] new zero implementation (#1623)" (#1643)
This reverts commit 5be118f405.
2022-09-26 10:06:03 +08:00
HELSON
5be118f405 [feature] new zero implementation (#1623) 2022-09-24 19:58:18 +08:00
HELSON
95c35f73bd [moe] initialize MoE groups by ProcessGroup (#1640) 2022-09-23 17:20:41 +08:00
HELSON
a088022efc [moe] fix moe bugs (#1633) 2022-09-23 15:33:57 +08:00
YuliangLiu0306
702dbc5288 [tensor] use communication autograd func (#1617)
* [tensor] use communication autograd func

* change all to all comm spec info

* rename pattern and distinguish fwd/bwd

* polish code
2022-09-23 13:31:15 +08:00
YuliangLiu0306
0c703189b9 [autoparallel] add layernorm handler (#1629) 2022-09-23 12:00:25 +08:00
YuliangLiu0306
bf77d3ab65 [autoparallel] recover the merged node strategy index (#1613) 2022-09-23 11:52:42 +08:00
Boyuan Yao
d6b01feb66 [fx] Modify offload codegen (#1618)
* [fx] modify offload codegen

* [fx] remove repeated hook definitions

* [fx] modify offload test
2022-09-23 11:04:52 +08:00
YuliangLiu0306
9eae855408 [hotfix] add recompile after graph manipulatation (#1621) 2022-09-23 11:00:33 +08:00
Super Daniel
d967779a32 [fx/profiler] tuned the calculation of memory estimation (#1619)
* [fx] tuned the meta info and rotor solver.

* [fx] remove import.

* [fx] remove import.

* [fx] remove import.

* [fx] tune the meta calculations.

* [fx] polish comments.

* [fx] remove assertions.

* [fx] modify test cases.

* [fx] modify test cases.

* [fx] optimize import.

* [fx
2022-09-23 10:59:47 +08:00
HELSON
f7f2248771 [moe] fix MoE bugs (#1628)
* remove forced FP32 modules

* correct no_shard-contexts' positions
2022-09-22 13:56:30 +08:00
Jiarui Fang
38c68b5b9a [embedding] rollback for better FAW performance (#1625) 2022-09-22 11:16:25 +08:00
Frank Lee
d925122020 [autoparallel] added new linear module handler (#1616) 2022-09-21 12:23:21 +08:00