FoolPlayer
f7774ec0f3
[Shardformer] Downstream bert (#3979)
* add dist dropout in model
* update docstring and bert policy with dropout
* refactor basepolicy and sharded, update bert
* update format
* update gpt2 policy
* update bert policy
* remove unused code
* update readme for new policy usage
* add downstream model of bert
* remove unused code
2023-07-04 16:05:01 +08:00