ColossalAI/colossalai
yuehuayingxueluo 4f28cb43c0 [inference]Optimize the usage of the mid tensors space in flash attn (#5304)
* opt flash attn

* opt tmp tensor

* fix benchmark_llama

* fix code style

* fix None logic for output tensor

* fix adapted to get_xine_cache

* add comment

* fix ci bugs

* fix some codes

* rm duplicated codes

* rm duplicated codes

* fix code style

* add _get_dtype in config.py
2024-01-26 14:00:10 +08:00