Mirror of https://github.com/hpcaitech/ColossalAI.git (synced 2026-04-12 15:14:55 +00:00)
* prevent re-creating intermediate tensors
* add singleton class holding intermediate values
* fix triton kernel api
* add benchmark in pytest
* fix kernel api and add benchmark
* revise flash decoding triton kernel in/out shapes
* fix calling of triton kernel in modeling
* fix pytest: extract to util functions
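
The first two items (avoiding re-creation of intermediate tensors by keeping them in a singleton) might look roughly like the sketch below. This is only an illustration of the general pattern, not the code from this commit: the class name `IntermediateBuffers`, its `get`/`clear` methods, the buffer name `"mid_output"`, and the `zeros` allocator are all hypothetical, and plain Python lists stand in for tensors so the example runs without any dependencies.

```python
import threading


class IntermediateBuffers:
    """Hypothetical process-wide cache for reusable intermediate buffers."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Create the single shared instance on first use; later calls reuse it.
        with cls._lock:
            if cls._instance is None:
                cls._instance = super().__new__(cls)
                cls._instance._buffers = {}
        return cls._instance

    def get(self, name, shape, factory):
        """Return the cached buffer for (name, shape), allocating it only once."""
        key = (name, tuple(shape))
        if key not in self._buffers:
            self._buffers[key] = factory(shape)
        return self._buffers[key]

    def clear(self):
        """Drop all cached buffers, e.g. when shapes change between runs."""
        self._buffers.clear()


def zeros(shape):
    # Stand-in allocator (assumption); real code would allocate a device tensor.
    rows, cols = shape
    return [[0.0] * cols for _ in range(rows)]


# Two separate call sites receive the same underlying buffer.
a = IntermediateBuffers().get("mid_output", (2, 4), zeros)
b = IntermediateBuffers().get("mid_output", (2, 4), zeros)
```

Keying the cache on both name and shape means a change in batch size or sequence length allocates a fresh buffer instead of silently reusing one with the wrong dimensions.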