ColossalAI/colossalai
Yuanheng Zhao 5f98a9d68a [Infer] Optimize Blocked KVCache And Kernels Using It (#5325)
* revise shape of kvcache (context attn kernel)

* revise shape of kvcache (flash decoding kernel)

* revise shape of kvcache (kvcache copy) and attn func

* init of kvcache in kvcache manager

* revise llama modeling

* revise block size retrieval

* use torch for rms_norm benchmarking

* revise block size retrieval
2024-01-30 16:06:09 +08:00