Commit Graph

41 Commits

Author SHA1 Message Date
digger yu
9265f2d4d7 [NFC]fix typo colossalai/auto_parallel nn utils etc. (#3779) 2023-05-23 15:28:20 +08:00
* fix typo colossalai/autochunk auto_parallel amp
* fix typo colossalai/auto_parallel nn utils etc.
Jiarui Fang
21962e1593 [embedding] rename FreqAwareEmbedding -> CachedEmbedding (#1699) 2022-10-13 22:22:27 +08:00
Jiarui Fang
363fc2861a [embeddings] more detailed timer (#1692) 2022-10-12 12:01:21 +08:00
Jiarui Fang
c638bec028 [embedding] polish async copy (#1657) 2022-09-27 14:37:03 +08:00
Jiarui Fang
988570e4a6 [embedding] add more detail profiling (#1656) 2022-09-27 13:43:59 +08:00
Jiarui Fang
e1f97fd2b8 [embedding] print profiling results (#1654) 2022-09-27 12:50:33 +08:00
Jiarui Fang
04443605a5 [embedding] non-blocking cpu-gpu copy (#1647) 2022-09-26 14:57:57 +08:00
CsRic
0767f67a0f [embedding] isolate cache_op from forward (#1645) 2022-09-26 11:18:59 +08:00
Co-authored-by: ric <mkkt_bkkt@mail.ustc.edu.cn>
Jiarui Fang
e57df80325 [embeddings] cache option (#1635) 2022-09-23 16:40:18 +08:00
Jiarui Fang
38c68b5b9a [embedding] rollback for better FAW performance (#1625) 2022-09-22 11:16:25 +08:00
Jiarui Fang
504ff1d101 [embeddings] use cache_ratio instead of cuda_row_num (#1611) 2022-09-20 14:33:04 +08:00
Jiarui Fang
a19eb80998 [embedding] updates some default parameters 2022-09-15 15:45:17 +08:00
CsRic
f3403ff98e [embeddings] add already_split_along_rank flag for tablewise mode (#1584) 2022-09-13 10:50:34 +08:00
CsRic
a389ac4ec9 [embedding] cache_embedding small improvement (#1564) 2022-09-08 16:41:19 +08:00
Jiarui Fang
64169f3e8f [embedding] polish parallel embedding tablewise (#1545) 2022-09-06 10:41:20 +08:00
CsRic
964123ae0f [embedding] freq_aware_embedding: add small functions for caller application (#1537) 2022-09-05 15:12:53 +08:00
Jiarui Fang
521078ffc9 [embedding] fix a bug in table wise sharding (#1538) 2022-09-02 15:48:35 +08:00
Jiarui Fang
87134524fd [embedding] tablewise sharding polish (#1535) 2022-09-02 11:09:37 +08:00
CsRic
5156d5b4f8 [embedding] add tablewise sharding for FAW (#1526) 2022-09-01 17:55:41 +08:00
Jiarui Fang
4537d39df9 [doc] docstring for FreqAwareEmbeddingBag (#1525) 2022-08-31 13:52:30 +08:00
Jiarui Fang
9a9ef65313 [FAW] cpu caching operations (#1520) 2022-08-30 14:50:02 +08:00
Jiarui Fang
af5438caa2 [FAW] refactor reorder() for CachedParamMgr (#1514) 2022-08-29 14:22:07 +08:00
Jiarui Fang
9feee6d06b [FAW] LFU initialize with dataset freq (#1513) 2022-08-29 12:52:53 +08:00
CsRic
1b8fee8e9c [FAW] shrink freq_cnter size (#1509) 2022-08-29 11:44:55 +08:00
Jiarui Fang
ba61109b6c [FAW] remove code related to chunk (#1501) 2022-08-26 14:23:30 +08:00
Jiarui Fang
d5085bb317 [FAW] add more docs and fix a warning (#1500) 2022-08-26 14:10:21 +08:00
CsRic
0ed2f46131 [FAW] FAW embedding use LRU as eviction strategy intialized with dataset stats (#1494) 2022-08-26 11:24:12 +08:00
CsRic
b8d0e39eaf [FAW] LFU cache for the FAW 2022-08-25 13:08:46 +08:00
Jiarui Fang
cde7b8a5b8 [FAW] init an LFU implementation for FAW (#1488) 2022-08-24 17:37:22 +08:00
Geng Zhang
0aad53c62b [FCE] update interface for frequency statistics in FreqCacheEmbedding (#1462) 2022-08-23 17:38:24 +08:00
Jiarui Fang
a1476ea882 [NFC] polish doc style for ColoTensor (#1457) 2022-08-16 09:21:05 +08:00
Geng Zhang
9f3eed66eb [FAW] reorganize the inheritance struct of FreqCacheEmbedding (#1448) 2022-08-12 15:55:46 +08:00
Jiarui Fang
30b4dd17c0 [FAW] export FAW in _ops (#1438) 2022-08-11 13:43:24 +08:00
HELSON
1b41686461 [hotfix] fix unit test test_module_spec (#1321) 2022-07-15 14:02:32 +08:00
Jiarui Fang
9bcd2fd4af [tensor] a shorter shard and replicate spec (#1245) 2022-07-11 15:51:48 +08:00
Jiarui Fang
ae7d3f4927 [refactor] move process group from _DistSpec to ColoTensor. (#1203) 2022-07-06 16:15:16 +08:00
Jiarui Fang
060b917daf [refactor] remove gpc dependency in colotensor's _ops (#1189) 2022-07-04 18:54:37 +08:00
Ziyue Jiang
dd0420909f [Tensor] rename parallel_action (#1174) 2022-06-27 10:04:45 +08:00
* rename parallel_action
* polish
Jiarui Fang
4b9bba8116 [ColoTensor] rename APIs and add output_replicate to ComputeSpec (#1168) 2022-06-24 13:08:54 +08:00
Jiarui Fang
f4ef224358 [Tensor] remove ParallelAction, use ComputeSpec instread (#1166) 2022-06-23 17:34:59 +08:00
Jiarui Fang
49832b2344 [refactory] add nn.parallel module (#1068) 2022-06-06 15:34:41 +08:00