ColossalAI/colossalai
yuehuayingxueluo 4f28cb43c0 [inference]Optimize the usage of the mid tensors space in flash attn (#5304)
* opt flash attn

* opt tmp tensor

* fix benchmark_llama

* fix code style

* fix None logic for output tensor

* fix adapted to get_xine_cache

* add comment

* fix ci bugs

* fix some codes

* rm duplicated codes

* rm duplicated codes

* fix code style

* add _get_dtype in config.py
2024-01-26 14:00:10 +08:00