[Inference] Finish Online Serving Test, add streaming output api, continuous batching test and example (#5432)

* finish online test and add examples

* fix test_contionus_batching

* fix some bugs

* fix bash

* fix

* fix inference

* finish revision

* fix typos

* revision
This commit is contained in:
Jianghai
2024-03-18 17:06:05 +08:00
committed by CjhHa1
parent 69cd7e069d
commit de378cd2ab
10 changed files with 214 additions and 94 deletions

View File

@@ -209,6 +209,7 @@ class RequestHandler:
break
num_seqs_to_add = min(len(lst), self.max_batch_size - self.running_list.total_seq_num)
# for now the recycle logic is not working
remove_list.extend(lst[:num_seqs_to_add])
self.running_list.extend(lst[:num_seqs_to_add])