mirror of
https://github.com/hpcaitech/ColossalAI.git
synced 2025-09-28 21:17:08 +00:00
[Inference]Fused kv copy into rotary calculation (#5383)
* revise rotary embedding
* remove useless print
* adapt
* fix
* add
* fix
* modeling
* fix
* fix
* fix
* fused kv copy
* fused copy
* colossalai/kernel/triton/no_pad_rotary_embedding.py
* del padding llama
* del
@@ -204,7 +204,7 @@ def benchmark_inference(args):
torch.cuda.cudart().cudaProfilerStop()
if args.profile:
    ctx.step()

print(f"config:batch_size {args.batch_size}, input_len{ args.seq_len}, output_len {args.output_len}")
print_details_info(model.config, args, whole_end2end, total_token_num)