Yuanheng Zhao
5d4c1fe8f5
[Fix/Inference] Fix GQA Triton and Support Llama3 (#5624)
* [fix] Fix GQA call into the flash-decoding Triton kernel
* Fix KV cache allocation shape
* Fix rotary embedding Triton kernel for GQA
* Fix sequence max-length assignment
* Rework sequence max-length logic
* Fix scheduling and speculative decoding
* Skip tests when the Triton import fails
* Fix pytest: skip on ImportError instead of failing
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-04-23 13:09:55 +08:00
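The pytest fix above (skipping rather than failing when Triton is absent) can be sketched as follows; `HAS_TRITON` and `run_if_triton_available` are illustrative names, not helpers from the repository:

```python
import importlib.util

# Detect whether Triton is importable without triggering an ImportError;
# find_spec returns None when the package is absent.
# (Illustrative sketch, not the repository's actual helper.)
HAS_TRITON = importlib.util.find_spec("triton") is not None

# In a pytest suite the same intent is typically written as either
#   triton = pytest.importorskip("triton")
# at module scope, or a skip marker on each test:
#   @pytest.mark.skipif(not HAS_TRITON, reason="triton not installed")

def run_if_triton_available(fn):
    """Call fn only when Triton is present; otherwise report a skip."""
    if not HAS_TRITON:
        return "skipped: triton not installed"
    return fn()
```

This keeps CI green on machines without a Triton-capable GPU stack while still exercising the kernels where Triton is installed.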