[Pipeline inference] Combine kvcache with pipeline inference (#4938)

* merge kvcache with pipeline inference and refactor the code structure

* support ppsize > 2

* refactor pipeline code

* do pre-commit

* modify benchmark

* fix benchmark

* polish code

* add docstring and update readme

* refactor the code

* fix some logic bugs in ppinfer

* polish readme

* fix typo

* skip infer test
Bin Jia
2023-10-27 16:19:54 +08:00
committed by GitHub
parent c6cd629e7a
commit 1db6727678
19 changed files with 922 additions and 745 deletions

@@ -1,3 +1,4 @@
from .pipeline import PPInferEngine
__all__ = ["PPInferEngine"]
__all__ = ['PPInferEngine']