Commit Graph

381 Commits

Author SHA1 Message Date
flybird11111
611c1247ba Update bert.py 2025-05-27 10:57:06 +08:00
wangbluo
4a077e5dc3 fix falcon 2025-05-22 16:50:40 +08:00
Hanks
6a29abdefd Merge pull request #6298 from wangbluo/upgrade_command
upgrade command
2025-05-22 14:21:58 +08:00
Hanks
6196faad3c Merge pull request #6318 from wangbluo/upgrade_t5
Upgrade T5
2025-05-22 14:21:04 +08:00
Hanks
33614b84ce Merge pull request #6306 from wangbluo/upgrade_sam
Upgrade sam
2025-05-22 14:19:20 +08:00
Hanks
e7ce5821de Merge pull request #6313 from wangbluo/upgrade_gptj
Upgrade gptj
2025-05-22 14:18:49 +08:00
flybird11111
6875a8a1cf [upgrade]upgrade mistral (#6296)
* upgrade mistral

* fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-05-21 16:14:45 +08:00
flybird11111
04516bb756 [upgrade]Upgrade vit (#6308)
* fix

* fix

* fix rotate embedding test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-05-21 16:14:20 +08:00
flybird11111
d0e13b85fd [upgrade]Upgrade mixtral (#6317)
* upgrade mixtral

* fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* upgrade infer

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

* upgrade drafter

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

* upgrade lazy

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* upgrade mixtral

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-05-21 16:14:05 +08:00
flybird11111
2aa295e959 [upgrade]upgrade opt (#6307)
* upgrade opt

* fix
2025-05-21 16:13:32 +08:00
pre-commit-ci[bot]
efb2d98da0 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-05-20 08:17:46 +00:00
wangbluo
07fa048895 fix 2025-05-20 16:13:34 +08:00
wangbluo
4e49f056d0 fix 2025-05-16 15:32:16 +08:00
wangbluo
e1925b36c4 upgrade_gptj 2025-05-16 15:28:04 +08:00
wangbluo
ced6b5e1c3 fix 2025-05-16 11:39:50 +08:00
wangbluo
10bc6af2b1 fix 2025-05-15 17:55:24 +08:00
wangbluo
ba9fb549d5 fix 2025-05-15 17:47:21 +08:00
wangbluo
2223b64931 upgrade_t 2025-05-15 14:31:24 +08:00
wangbluo
0e9d628bb7 add the explanation 2025-05-14 12:50:07 +08:00
wangbluo
b032cf9b16 upgrade_sam 2025-05-14 12:45:34 +08:00
pre-commit-ci[bot]
89917e247b [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-05-14 04:24:24 +00:00
Wang Binluo
0dede489d6 Merge branch 'upgrade_transformers' into upgrade_falcon 2025-05-14 12:23:28 +08:00
Hanks
1ace29b54d Merge pull request #6299 from wangbluo/upgrade_bloom
Upgrade bloom
2025-05-14 10:19:44 +08:00
Hanks
c28b3c39db Merge pull request #6305 from wangbluo/update_bert
update_bert
2025-05-14 10:19:34 +08:00
wangbluo
d665d6740a add explanation 2025-05-14 10:15:25 +08:00
wangbluo
07349e0014 fix 2025-05-14 10:09:35 +08:00
wangbluo
2237531137 update_bloom 2025-05-13 18:21:57 +08:00
flybird11111
f118146564 [upgrade]Upgrade qwen2 (#6302)
* upgrade qwen2

* fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-05-13 15:49:53 +08:00
wangbluo
4fbbf4737a fix 2025-05-13 14:51:54 +08:00
wangbluo
d6f3508910 fix 2025-05-13 10:15:48 +08:00
wangbluo
b124603c68 fix 2025-05-08 18:06:56 +08:00
wangbluo
fe94d73f6b fix 2025-05-08 18:03:53 +08:00
pre-commit-ci[bot]
4eced5cf8a [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-05-08 09:58:05 +00:00
wangbluo
cefdfc4125 add explanation 2025-05-08 17:46:54 +08:00
wangbluo
e78c4560c6 fix 2025-05-08 16:22:08 +08:00
pre-commit-ci[bot]
06724492ca [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-05-08 08:13:35 +00:00
wangbluo
a9bb7cb943 upgrade command 2025-05-08 16:06:05 +08:00
flybird11111
a4c6e189fa [upgrade] upgrade gpt2 (#6291)
* fix

* fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

* fix

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-05-08 14:10:21 +08:00
wangbluo
5480b811c5 upgrade_bloom 2025-05-06 15:58:53 +08:00
wangbluo
08787f0b6e upgrade_bert 2025-05-05 09:50:07 +08:00
wangbluo
885210dc27 fix 2025-04-28 18:17:12 +08:00
wangbluo
5d167f2148 fix 2025-04-28 18:01:53 +08:00
pre-commit-ci[bot]
c6291be1b1 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-24 08:35:03 +00:00
flybird11111
2f615a49fd fix 2025-04-24 16:20:42 +08:00
flybird11111
e891501c55 fix 2025-04-24 15:44:20 +08:00
flybird11111
686982764c upgrade llama 2025-04-24 14:54:15 +08:00
Hongxin Liu
014837e725 [shardformer] support pipeline for deepseek v3 and optimize lora save (#6188)
* [shardformer] support pipeline for deepseek v3

* [checkpointio] fix lora save

* [devops] update ci env

* [booster] optimize lora

* fix test

* fix test
2025-02-14 14:48:54 +08:00
Wenxuan Tan
ec73f1b5e2 [CI] Cleanup Dist Optim tests with shared helper funcs (#6125)
* Refactor and clean up using common helper funcs. Tests passed

* Update comments

* Fix relative import

* Fix param fetching bug
2025-02-12 13:42:34 +08:00
Hongxin Liu
2b415e5999 [shardformer] support ep for deepseek v3 (#6185)
* [feature] support ep for deepseek v3

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix test

* [shardformer] fix deepseek v3 init

* [lazy] fit lora for lazy init

* [example] support npu for deepseek v3

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-02-11 16:10:25 +08:00
duanjunwen
a9bedc7a43 [Shardformer] Support zbv in Shardformer Policy (#6150)
* [feat] Shardformer support zbv

* [feat] support chatglm2, command, deepseek for zbv

* [feat] support zbv in shardformer policy: falcon, gptj, mistral, opt, qwen2, t5, vit, whisper

* [feat] support GPT2FusedLinearConv1D

* [feat] support GPT2FusedLinear (without tp)

* [fix] debug FusedConvLinear

* [shardformer] support gpt2 policy for zbv, support GPT2FusedLinearConv Col and Row.

* [Shardformer] support FusedLinear1D base for zbv

* [shardformer] support zbv in FusedLinear1D base, Col, Row

* [shardformer] support zbv in blip2 and sam policy

* [shardformer] fix bug: incorrect number of gradients; add fusedLinear base testcase;

* [fix] fix incorrect number of gradients;

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [Shardformer] add en doc for zbv;

* [fix] fix typo in Model compatibility table

* [fix] fix API Reference typo

* [Shardformer] add zh-Han doc for zbv

* [fix] fix Linear name; update en & zh doc

* [fix] fix shardformer doc import err

* [fix] fix shardconfig import in doc

* [fix] fix shardformer doc

* [fix] fix shardconfig doc

* [fix] fix config

* [fix] remove shardconfig

* [fix] fix doc

* [feat] add zbv doc string

* [fix] rm doc

* [fix] fix doc

* [fix] empty zbv doc

* [fix] fix torch version

* [fix] fix torch version

* [fix] fix torch versions

* [fix] fix torch versions

* [fix] fix pyramid versions

* [fix] fix pyramid, zope version

* [fix] try fix workflow

* [fix] try import ShardConfig in yml

* [fix] fix workflow

* [fix] fix workflow

* [fix] fix workflow

* [fix] fix workflow

* [fix] fix ci

* [fix] fix zbv doc

* [fix] fix param for qkv linear, gpt2fused linear; fix requirements;

* [fix] fix policy use fused_linear

* [fix] fix weight grad none, err caused by weight ptr change

* [fix] fix comm in WeightGradStore

* [fix] fix WeightGradStore pop param

* [fix] remove useless param in doc; fix gpt2 qkv test;

* [shardformer] simplify execute_w_pass_grad_accum;

* [fix] rm useless comments

* [shardformer] simplify execute_w_pass_grad_accum & execute_w_pass

* [shardformer] Run meaningful doc test

* [shardformer] fix doc test cmd;

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-01-02 10:22:26 +08:00