ColossalAI

mirror of https://github.com/hpcaitech/ColossalAI.git synced 2025-05-06 07:28:12 +00:00

Xu Kai 611a5a80ca [inference] Add smmoothquant for llama (#4904)	2023-10-16 11:28:44 +08:00

* [inference] add int8 rotary embedding kernel for smoothquant (#4843)
* [inference] add smoothquant llama attention (#4850)
* add smoothquant llama attention
* remove uselss code
* remove useless code
* fix import error
* rename file name
* [inference] add silu linear fusion for smoothquant llama mlp (#4853)
* add silu linear
* update skip condition
* catch smoothquant cuda lib exception
* prcocess exception for tests
* [inference] add llama mlp for smoothquant (#4854)
* add llama mlp for smoothquant
* fix down out scale
* remove duplicate lines
* add llama mlp check
* delete useless code
* [inference] add smoothquant llama (#4861)
* add smoothquant llama
* fix attention accuracy
* fix accuracy
* add kv cache and save pretrained
* refactor example
* delete smooth
* refactor code
* [inference] add smooth function and delete useless code for smoothquant (#4895)
* add smooth function and delete useless code
* update datasets
* remove duplicate import
* delete useless file
* refactor codes (#4902)
* rafactor code
* add license
* add torch-int and smoothquant license
gptq	[NFC] polish code style (#4799)	2023-10-07 13:36:52 +08:00
kernels	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
smoothquant	[inference] Add smmoothquant for llama (#4904)	2023-10-16 11:28:44 +08:00
colossal_C_frontend.cpp	[optimizer] add div_scale for optimizers (#2117)	2022-12-12 17:58:57 +08:00
compat.h	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
cpu_adam.cpp	[hotfix] fix CPUAdam kernel nullptr (#1410)	2022-08-05 19:45:45 +08:00
cpu_adam.h	[hotfix] fix CPUAdam kernel nullptr (#1410)	2022-08-05 19:45:45 +08:00
layer_norm_cuda_kernel.cu	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
layer_norm_cuda.cpp	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
moe_cuda_kernel.cu	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
moe_cuda.cpp	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
multi_tensor_adam.cu	[doc] add deepspeed citation and copyright (#2996)	2023-03-04 20:08:11 +08:00
multi_tensor_apply.cuh	[doc] add deepspeed citation and copyright (#2996)	2023-03-04 20:08:11 +08:00
multi_tensor_l2norm_kernel.cu	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
multi_tensor_lamb.cu	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
multi_tensor_scale_kernel.cu	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
multi_tensor_sgd_kernel.cu	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
multihead_attention_1d.cpp	[hotfix] fix error for torch 2.0 (#2243)	2022-12-30 23:11:55 +08:00
multihead_attention_1d.h	[hotfix] fix error for torch 2.0 (#2243)	2022-12-30 23:11:55 +08:00
scaled_masked_softmax_cuda.cu	[NFC] polish colossalai/kernel/cuda_native/csrc/scaled_masked_softmax_cuda.cu code style (#949)	2022-05-17 10:25:06 +08:00
scaled_masked_softmax.cpp	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
scaled_masked_softmax.h	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
scaled_upper_triang_masked_softmax_cuda.cu	[NFC] polish pre-commit run --files colossalai/kernel/cuda_native/csrc/scaled_upper_triang_masked_softmax_cuda.cu code style (#943)	2022-05-17 10:25:06 +08:00
scaled_upper_triang_masked_softmax.cpp	[NFC] polish colossalai/kernel/cuda_native/csrc/scaled_upper_triang_masked_softmax.cpp code style (#959)	2022-05-17 10:25:06 +08:00
scaled_upper_triang_masked_softmax.h	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
type_shim.h	[bf16] add bf16 support (#3882)	2023-06-05 15:58:31 +08:00