Yuanheng Zhao
d85d91435a
[Inference/SpecDec] Support GLIDE Drafter Model (#5455)
* add glide-llama policy and modeling
* update glide modeling, compatible with transformers 4.36.2
* revise glide llama modeling/usage
* fix issues of glimpsing large kv
* revise the way params are re-loaded for glide drafter
* fix drafter and engine tests
* enable convert to glide strict=False
* revise glide llama modeling
* revise vicuna prompt template
* revise drafter and tests
* apply usage of glide model in engine
2024-04-10 11:07:52 +08:00