Hongxin Liu
ccabcf6485
[fp8] support fp8 amp for hybrid parallel plugin (#5975)
* [fp8] support fp8 amp for hybrid parallel plugin
* [test] add fp8 hook test
* [fp8] fix fp8 linear compatibility
2024-08-07 18:21:08 +08:00
Hongxin Liu
76ea16466f
[fp8] add fp8 linear (#5967)
* [fp8] add fp8 linear
* [test] fix fp8 linear test condition
* [test] fix fp8 linear test condition
* [test] fix fp8 linear test condition
2024-08-07 15:41:49 +08:00
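For reference, a minimal sketch of what an fp8 linear can look like numerically, assuming per-tensor scaling and PyTorch's torch.float8_e4m3fn dtype. The name fp8_linear_emulated and the dequantize-then-matmul shortcut are illustrative assumptions, not the code added in this commit; a real kernel would feed the fp8 operands directly to an fp8 GEMM.

    import torch
    import torch.nn.functional as F

    FP8_DTYPE = torch.float8_e4m3fn
    FP8_MAX = torch.finfo(FP8_DTYPE).max  # 448.0 for e4m3

    def quantize_fp8(t: torch.Tensor):
        # Per-tensor scale so the largest magnitude lands at the fp8 max.
        scale = FP8_MAX / t.float().abs().max().clamp(min=1e-12)
        t_fp8 = (t.float() * scale).clamp(-FP8_MAX, FP8_MAX).to(FP8_DTYPE)
        return t_fp8, scale

    def fp8_linear_emulated(x, weight, bias=None):
        # Quantize activations and weights to fp8, then dequantize and run the
        # matmul in the original precision to emulate the fp8 numerics.
        x_fp8, sx = quantize_fp8(x)
        w_fp8, sw = quantize_fp8(weight)
        x_deq = (x_fp8.float() / sx).to(x.dtype)
        w_deq = (w_fp8.float() / sw).to(weight.dtype)
        return F.linear(x_deq, w_deq, bias)

For example, fp8_linear_emulated(torch.randn(4, 16), torch.randn(8, 16)) returns a (4, 8) output.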
flybird11111
afb26de873
[fp8] support all2all fp8 (#5953)
* support all2all fp8
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* fix
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-08-06 16:58:23 +08:00
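As a rough illustration of fp8-compressed all-to-all, here is a sketch that exchanges 1-byte payloads plus per-chunk scales, assuming equally sized chunks and torch.float8_e4m3fn. The name all_to_all_fp8 and the two-round exchange are assumptions, not the code from this PR.

    import torch
    import torch.distributed as dist

    FP8_DTYPE = torch.float8_e4m3fn
    FP8_MAX = torch.finfo(FP8_DTYPE).max

    def all_to_all_fp8(chunks, group=None):
        # chunks: one tensor per destination rank; returns one tensor per source rank.
        # Assumes every chunk has the same shape across ranks.
        sends, send_scales = [], []
        for t in chunks:
            scale = FP8_MAX / t.float().abs().max().clamp(min=1e-12)
            payload = (t.float() * scale).clamp(-FP8_MAX, FP8_MAX).to(FP8_DTYPE)
            sends.append(payload.view(torch.uint8))
            send_scales.append(scale)
        recvs = [torch.empty_like(s) for s in sends]
        recv_scales = [torch.empty_like(s) for s in send_scales]
        # Exchange the 1-byte payloads and their scales.
        dist.all_to_all(recvs, sends, group=group)
        dist.all_to_all(recv_scales, send_scales, group=group)
        return [r.view(FP8_DTYPE).float() / s for r, s in zip(recvs, recv_scales)]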
Guangyao Zhang
53cb9606bd
[Feature] llama shardformer fp8 support (#5938)
* add llama shardformer fp8
* Llama Shardformer Parity
* fix typo
* fix all reduce
* fix pytest failure
* fix reduce op and move function to fp8.py
* fix typo
2024-08-05 10:05:47 +08:00
Hongxin Liu
5fd0592767
[fp8] support all-gather flat tensor (#5932)
2024-07-24 16:55:20 +08:00
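A sketch of an fp8 all-gather over a flat 1-D shard, under the same per-tensor-scaling assumption; all_gather_flat_fp8 and its shapes are illustrative, not the actual implementation in this commit.

    import torch
    import torch.distributed as dist

    FP8_DTYPE = torch.float8_e4m3fn
    FP8_MAX = torch.finfo(FP8_DTYPE).max

    def all_gather_flat_fp8(shard: torch.Tensor, group=None):
        # Quantize the local 1-D shard, gather the 1-byte payloads and scales,
        # then dequantize every rank's slice back to the working dtype.
        world_size = dist.get_world_size(group)
        scale = FP8_MAX / shard.float().abs().max().clamp(min=1e-12)
        shard_fp8 = (shard.float() * scale).clamp(-FP8_MAX, FP8_MAX).to(FP8_DTYPE)
        out = torch.empty(world_size * shard.numel(), dtype=torch.uint8, device=shard.device)
        scales = torch.empty(world_size, dtype=torch.float32, device=shard.device)
        dist.all_gather_into_tensor(out, shard_fp8.view(torch.uint8), group=group)
        dist.all_gather_into_tensor(scales, scale.reshape(1), group=group)
        slices = out.view(FP8_DTYPE).float().view(world_size, -1)
        return (slices / scales.unsqueeze(1)).to(shard.dtype).flatten()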
GuangyaoZhang
6a20f07b80
remove all to all
2024-07-17 07:14:55 +00:00
GuangyaoZhang
5a310b9ee1
fix rebase
2024-07-17 03:43:23 +00:00
GuangyaoZhang
457a0de79f
shardformer fp8
2024-07-16 06:56:51 +00:00
pre-commit-ci[bot]
51f916b11d
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2024-07-12 07:33:45 +00:00
BurkeHulk
e88190184a
support fp8 communication in pipeline parallelism
2024-07-12 15:25:25 +08:00
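To illustrate fp8 point-to-point communication between pipeline stages, a sketch that sends the quantized payload and its scale separately, assuming the receiver already knows the tensor shape (stage I/O shapes are fixed in pipeline parallelism). send_fp8 and recv_fp8 are hypothetical names, not the functions added here.

    import torch
    import torch.distributed as dist

    FP8_DTYPE = torch.float8_e4m3fn
    FP8_MAX = torch.finfo(FP8_DTYPE).max

    def send_fp8(t: torch.Tensor, dst: int):
        # Quantize, then send the 1-byte payload followed by the fp32 scale.
        scale = FP8_MAX / t.float().abs().max().clamp(min=1e-12)
        payload = (t.float() * scale).clamp(-FP8_MAX, FP8_MAX).to(FP8_DTYPE)
        dist.send(payload.view(torch.uint8), dst=dst)
        dist.send(scale.reshape(1), dst=dst)

    def recv_fp8(shape, src: int, device, dtype=torch.bfloat16):
        # Receive the payload and scale, then dequantize to the working dtype.
        payload = torch.empty(shape, dtype=torch.uint8, device=device)
        scale = torch.empty(1, dtype=torch.float32, device=device)
        dist.recv(payload, src=src)
        dist.recv(scale, src=src)
        return (payload.view(FP8_DTYPE).float() / scale).to(dtype)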
BurkeHulk
1e1959467e
fix scaling algorithm in FP8 casting
2024-07-12 15:23:37 +08:00
HangXu
f5a52e1600
fp8 operators for compressed communication
cast_to_fp8, cast_from_fp8, all_reduce_fp8
2024-07-01 13:44:21 +08:00
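For context on these operators, a minimal sketch of per-tensor fp8 casting and one simple realization of an fp8-compressed all-reduce (gather the fp8 payloads, reduce in fp32 locally). It follows the names cast_to_fp8, cast_from_fp8 and all_reduce_fp8 from the commit message, but the bodies are assumptions rather than the actual implementation.

    import torch
    import torch.distributed as dist

    FP8_DTYPE = torch.float8_e4m3fn
    FP8_MAX = torch.finfo(FP8_DTYPE).max  # 448.0 for e4m3

    def cast_to_fp8(x: torch.Tensor):
        # Per-tensor scaling: map the largest magnitude onto the fp8 range.
        amax = x.float().abs().max().clamp(min=1e-12)
        scale = FP8_MAX / amax
        x_fp8 = (x.float() * scale).clamp(-FP8_MAX, FP8_MAX).to(FP8_DTYPE)
        return x_fp8, scale

    def cast_from_fp8(x_fp8: torch.Tensor, scale: torch.Tensor, dtype=torch.float32):
        # Dequantize back to a higher-precision dtype.
        return (x_fp8.float() / scale).to(dtype)

    def all_reduce_fp8(x: torch.Tensor, group=None):
        # Communicate fp8 payloads as raw bytes, accumulate the sum in fp32.
        world_size = dist.get_world_size(group)
        x_fp8, scale = cast_to_fp8(x)
        payloads = [torch.empty_like(x_fp8.view(torch.uint8)) for _ in range(world_size)]
        scales = [torch.empty_like(scale) for _ in range(world_size)]
        dist.all_gather(payloads, x_fp8.view(torch.uint8), group=group)
        dist.all_gather(scales, scale, group=group)
        total = sum(cast_from_fp8(p.view(FP8_DTYPE), s) for p, s in zip(payloads, scales))
        x.copy_(total.to(x.dtype))
        return x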