mirror of https://github.com/hpcaitech/ColossalAI.git synced 2026-04-26 01:35:21 +00:00
279300dc5f34db219c90a297c0996d00221eae96
ColossalAI/tests/test_infer
yuehuayingxueluo 12f10d5b0b [Fix/Inference]Fix CUDA Rotary Rmbedding GQA (#5623)
* fix rotary embedding GQA

* change test_rotary_embdding_unpad.py KH
2024-04-23 13:44:49 +08:00
Name | Last commit | Date
test_models | [inference/model]Adapted to the baichuan2-7B model (#5591) | 2024-04-15 16:53:02 +08:00
test_ops | [Fix/Inference]Fix CUDA Rotary Rmbedding GQA (#5623) | 2024-04-23 13:44:49 +08:00
_utils.py | [Inference] Add the logic of the inference engine (#5173) | 2024-01-11 13:39:56 +00:00
test_batch_bucket.py | [Fix/Inference] Fix format of input prompts and input model in inference engine (#5395) | 2024-02-23 10:51:35 +08:00
test_config_and_struct.py | [Inference] Optimize and Refactor Inference Batching/Scheduling (#5367) | 2024-02-19 17:18:20 +08:00
test_cuda_graph.py | [Feat]Tensor Model Parallel Support For Inference (#5563) | 2024-04-18 16:56:46 +08:00
test_drafter.py | [Inference/SpecDec] Support GLIDE Drafter Model (#5455) | 2024-04-10 11:07:52 +08:00
test_inference_engine.py | [Fix/Inference] Fix GQA Triton and Support Llama3 (#5624) | 2024-04-23 13:09:55 +08:00
test_kvcache_manager.py | [Inference] Optimize and Refactor Inference Batching/Scheduling (#5367) | 2024-02-19 17:18:20 +08:00
test_request_handler.py | [Inference] Optimize and Refactor Inference Batching/Scheduling (#5367) | 2024-02-19 17:18:20 +08:00