mirror of https://github.com/hpcaitech/ColossalAI.git (synced 2025-09-17 15:11:20 +00:00)
[Inference] Add CacheBlock and KV-Cache Manager (#5156)
* [Inference] Add KVCache Manager
* function refactored
* add test for KVCache Manager
* add attr beam width
* Revise alloc func in CacheManager
* Fix docs and pytests
* add tp slicing for head number
* optimize shapes of tensors used as physical cache
* Apply using InferenceConfig on KVCacheManager
* rm duplicate config file
* Optimize cache allocation: use contiguous cache
* Fix config in pytest (and config)
committed by FrankLeeeee
parent fab9b931d9
commit 3de2e62299
4
colossalai/inference/kv_cache/__init__.py
Normal file
@@ -0,0 +1,4 @@
+from .block_cache import CacheBlock
+from .kvcache_manager import KVCacheManager
+
+__all__ = ["CacheBlock", "KVCacheManager"]
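The commit message mentions slicing the head number for tensor parallelism and allocating from a contiguous cache, but this hunk only shows the package re-exports. As a rough illustration of those two ideas, here is a minimal toy sketch. Every name in it (`ToyCacheBlock`, `ToyKVCacheManager`, `heads_per_rank`, `cache_shape`, the buffer layout) is hypothetical and is not the actual `CacheBlock` or `KVCacheManager` API, which is not visible in this diff.

```python
from dataclasses import dataclass


@dataclass
class ToyCacheBlock:
    """Hypothetical stand-in for a cache block; not ColossalAI's CacheBlock."""

    block_id: int
    block_size: int  # number of token slots in this block
    used_slots: int = 0

    @property
    def free_slots(self) -> int:
        return self.block_size - self.used_slots


class ToyKVCacheManager:
    """Hypothetical block allocator; not ColossalAI's KVCacheManager."""

    def __init__(self, num_blocks, block_size, num_heads, head_dim, tp_size=1):
        if num_heads % tp_size != 0:
            raise ValueError("num_heads must be divisible by tp_size")
        self.block_size = block_size
        # Tensor-parallel slicing: each rank caches only its share of the heads.
        self.heads_per_rank = num_heads // tp_size
        # Assumed layout of one contiguous per-layer buffer on this rank.
        self.cache_shape = (num_blocks, self.heads_per_rank, block_size, head_dim)
        self.blocks = [ToyCacheBlock(i, block_size) for i in range(num_blocks)]
        self.free_ids = list(range(num_blocks))

    def allocate(self, num_tokens):
        """Reserve enough blocks to hold num_tokens and return their ids."""
        needed = -(-num_tokens // self.block_size)  # ceiling division
        if needed > len(self.free_ids):
            raise RuntimeError("out of cache blocks")
        ids, self.free_ids = self.free_ids[:needed], self.free_ids[needed:]
        remaining = num_tokens
        for i in ids:
            take = min(remaining, self.block_size)
            self.blocks[i].used_slots = take
            remaining -= take
        return ids

    def free(self, ids):
        """Return blocks to the free list and reset their usage counters."""
        for i in ids:
            self.blocks[i].used_slots = 0
        self.free_ids.extend(ids)
```

In this sketch, a 20-token sequence with 16-slot blocks takes two blocks, the second only partly filled. With 32 heads split across 4 tensor-parallel ranks, each rank's buffer holds 8 heads.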