[Inference] Add async and sync API server using FastAPI (#5396)

* add api server

* fix

* add

* add completion service and fix bug

* add generation config

* revise shardformer

* fix bugs

* add docstrings and fix some bugs

* fix bugs and add choices for prompt template
Jianghai
2024-03-01 14:47:36 +08:00
committed by CjhHa1
parent d482922035
commit 69cd7e069d
13 changed files with 789 additions and 25 deletions

@@ -164,6 +164,7 @@ class Sequence:
return (
f"(request_id={self.request_id}, "
f"prompt={self.prompt}, "
f"output_token_id={self.output_token_id},"
f"status={self.status.name}, "
f"sample_params={self.sample_params}, "
f"input_len={self.input_len},"