ColossalAI/extensions/pybind/inference
2024-05-14 12:46:54 +08:00
__init__.py
inference_ops_cuda.py [Inference/Feat] Add convert_fp8 op for fp8 test in the future (#5706) 2024-05-10 18:39:54 +08:00
inference.cpp add paged-attentionv2: support seq length split across thread block (#5707) 2024-05-14 12:46:54 +08:00