ColossalAI/examples/inference
2024-01-31 10:41:47 +08:00
..
benchmark_llama.py merge commit 2024-01-31 10:41:47 +08:00
build_smoothquant_weight.py [inference] refactor examples and fix schedule (#5077) 2023-11-21 10:46:03 +08:00
run_benchmark.sh [inference]Optimize the usage of the mid tensors space in flash attn (#5304) 2024-01-26 14:00:10 +08:00
run_llama_inference.py [npu] change device to accelerator api (#5239) 2024-01-09 10:20:05 +08:00