Yuanheng Zhao
1513f20f4d
[kernel] Add flash decoding triton kernel for blocked kv cache (#5249)
* add flash decoding unpad triton kernel
* rename flash decoding kernel
* add kernel testing (draft)
* revise pytest
* support kv group (GQA)
* (trivial) fix api and pytest
* (trivial) func renaming
* (trivial) func/file renaming
* refactor pytest for attention
* (trivial) format and consistent vars of context/decode attn
* (trivial) remove test redundancy
2024-01-11 13:46:14 +00:00