diff --git a/README.md b/README.md
index 8342a9fa0..79f733122 100644
--- a/README.md
+++ b/README.md
@@ -38,6 +38,14 @@
- Why Colossal-AI
- Features
+- Colossal-AI for Real World Applications
- Parallel Training Demo
-- Colossal-AI for Real World Applications
- Installation
@@ -120,6 +120,88 @@ distributed training and inference in a few lines.
- Inference
- [Energon-AI](https://github.com/hpcaitech/EnergonAI)
+
(back to top)
+
+## Colossal-AI in the Real World
+
+### ColossalChat
+
+[ColossalChat](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat): An open-source solution for cloning [ChatGPT](https://openai.com/blog/chatgpt/) with a complete RLHF pipeline. [[code]](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat) [[blog]](https://medium.com/@yangyou_berkeley/colossalchat-an-open-source-solution-for-cloning-chatgpt-with-a-complete-rlhf-pipeline-5edf08fb538b) [[demo]](https://chat.colossalai.org)
+
+- Up to 7.73x faster single-server training and 1.42x faster single-GPU inference
+
+- Up to 10.3x growth in model capacity on a single GPU
+- A mini demo training run requires only 1.62 GB of GPU memory (any consumer-grade GPU)
+
+- Up to 3.7x larger model capacity for fine-tuning on a single GPU
+- Training speed kept sufficiently high throughout
+
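+Under the hood, an RLHF pipeline such as ColossalChat's runs in three stages: supervised fine-tuning on human demonstrations, reward-model training on human preference pairs, and reinforcement learning (PPO) against that reward model. The toy sketch below illustrates only this control flow; every class and method name is a hypothetical placeholder, not ColossalChat's actual API.
+
+```python
+from typing import List, Tuple
+
+
+class ToyPolicy:
+    """Stands in for the language model being aligned (placeholder, not a real API)."""
+
+    def generate(self, prompt: str) -> str:
+        return prompt + " -> generated response"
+
+    def supervised_update(self, prompt: str, target: str) -> None:
+        """Stage 1: imitate a human-written response (supervised fine-tuning)."""
+
+    def reinforce(self, prompt: str, response: str, reward: float) -> None:
+        """Stage 3: PPO-style update nudging the policy toward higher reward."""
+
+
+class ToyRewardModel:
+    """Stands in for the scalar reward model learned from human preferences."""
+
+    def fit_preference(self, prompt: str, chosen: str, rejected: str) -> None:
+        """Stage 2: learn to score the chosen response above the rejected one."""
+
+    def score(self, prompt: str, response: str) -> float:
+        return float(len(response) % 7)  # dummy reward for illustration only
+
+
+def rlhf_pipeline(demos: List[Tuple[str, str]],
+                  comparisons: List[Tuple[str, str, str]],
+                  prompts: List[str]) -> ToyPolicy:
+    policy, reward_model = ToyPolicy(), ToyRewardModel()
+    for prompt, target in demos:                      # stage 1: SFT
+        policy.supervised_update(prompt, target)
+    for prompt, chosen, rejected in comparisons:      # stage 2: reward model
+        reward_model.fit_preference(prompt, chosen, rejected)
+    for prompt in prompts:                            # stage 3: RL loop
+        response = policy.generate(prompt)
+        policy.reinforce(prompt, response, reward_model.score(prompt, response))
+    return policy
+
+
+if __name__ == "__main__":
+    rlhf_pipeline(demos=[("Hi", "Hello!")],
+                  comparisons=[("Hi", "Hello!", "Go away.")],
+                  prompts=["Hi"])
+```
+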
+(back to top)
+
+
+### AIGC
+Acceleration of AIGC (AI-Generated Content) models such as [Stable Diffusion v1](https://github.com/CompVis/stable-diffusion) and [Stable Diffusion v2](https://github.com/Stability-AI/stablediffusion).
+
+- [Training](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/diffusion): Reduce Stable Diffusion memory consumption by up to 5.6x and hardware cost by up to 46x (from A100 to RTX 3060).
+
+- [DreamBooth Fine-tuning](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/dreambooth): Personalize your model using just 3-5 images of the desired subject (see the usage sketch below).
+
+- [Inference](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/diffusion): Reduce inference GPU memory consumption by 2.5x.
+
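+Once DreamBooth fine-tuning has produced a personalized checkpoint, sampling from it looks roughly like the following, assuming the weights are exported in the Hugging Face `diffusers` pipeline format and `diffusers` is installed; the output path and the `sks` subject token are placeholders:
+
+```python
+import torch
+from diffusers import StableDiffusionPipeline
+
+# Load the DreamBooth-personalized weights (placeholder path, diffusers format assumed).
+pipe = StableDiffusionPipeline.from_pretrained("./dreambooth-output", torch_dtype=torch.float16)
+pipe = pipe.to("cuda")
+
+# "sks" stands in for the rare token bound to the 3-5 subject images during fine-tuning.
+image = pipe("a photo of sks dog swimming in a pool", num_inference_steps=50).images[0]
+image.save("personalized_sample.png")
+```
+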
+(back to top)
+
+### Biomedicine
+Acceleration of [AlphaFold](https://alphafold.ebi.ac.uk/) protein structure prediction
+
+- [FastFold](https://github.com/hpcaitech/FastFold): accelerates AlphaFold training and inference on GPU clusters, speeds up data processing, and supports inference on sequences with more than 10,000 residues.
+
+- [FastFold with Intel](https://github.com/hpcaitech/FastFold): 3x inference speedup and a 39% cost reduction.
+
+- [xTrimoMultimer](https://github.com/biomap-research/xTrimoMultimer): accelerates structure prediction of protein monomers and multimers by 11x.
+
(back to top)
## Parallel Training Demo
@@ -213,88 +295,6 @@ Please visit our [documentation](https://www.colossalai.org/) and [examples](htt
- [BLOOM](https://github.com/hpcaitech/EnergonAI/tree/main/examples/bloom): Reduce hardware deployment costs of 176-billion-parameter BLOOM by more than 10 times.
-(back to top)
-
-## Colossal-AI in the Real World
-
-### ColossalChat
-
-
-
-[ColossalChat](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat): An open-source solution for cloning [ChatGPT](https://openai.com/blog/chatgpt/) with a complete RLHF pipeline. [[code]](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat) [[blog]](https://medium.com/@yangyou_berkeley/colossalchat-an-open-source-solution-for-cloning-chatgpt-with-a-complete-rlhf-pipeline-5edf08fb538b) [[demo]](https://chat.colossalai.org)
-
-
-
-
-
-- Up to 7.73 times faster for single server training and 1.42 times faster for single-GPU inference
-
-
-
-
-
-- Up to 10.3x growth in model capacity on one GPU
-- A mini demo training process requires only 1.62GB of GPU memory (any consumer-grade GPU)
-
-
-
-
-
-- Increase the capacity of the fine-tuning model by up to 3.7 times on a single GPU
-- Keep at a sufficiently high running speed
-
-(back to top)
-
-
-### AIGC
-Acceleration of AIGC (AI-Generated Content) models such as [Stable Diffusion v1](https://github.com/CompVis/stable-diffusion) and [Stable Diffusion v2](https://github.com/Stability-AI/stablediffusion).
-
-
-
-
-- [Training](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/diffusion): Reduce Stable Diffusion memory consumption by up to 5.6x and hardware cost by up to 46x (from A100 to RTX3060).
-
-
-
-
-
-- [DreamBooth Fine-tuning](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/dreambooth): Personalize your model using just 3-5 images of the desired subject.
-
-
-
-
-
-- [Inference](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/diffusion): Reduce inference GPU memory consumption by 2.5x.
-
-
-(back to top)
-
-### Biomedicine
-Acceleration of [AlphaFold Protein Structure](https://alphafold.ebi.ac.uk/)
-
-
-
-
-
-- [FastFold](https://github.com/hpcaitech/FastFold): Accelerating training and inference on GPU Clusters, faster data processing, inference sequence containing more than 10000 residues.
-
-
-
-
-
-- [FastFold with Intel](https://github.com/hpcaitech/FastFold): 3x inference acceleration and 39% cost reduce.
-
-
-
-
-
-- [xTrimoMultimer](https://github.com/biomap-research/xTrimoMultimer): accelerating structure prediction of protein monomers and multimer by 11x.
-
-
(back to top)
## Installation
@@ -303,6 +303,8 @@ Requirements:
- PyTorch >= 1.11 (PyTorch 2.x in progress)
- Python >= 3.7
- CUDA >= 11.0
+- [NVIDIA GPU Compute Capability](https://developer.nvidia.com/cuda-gpus) >= 7.0 (V100/RTX20 and higher)
+- Linux OS
If you encounter any problem with installation, you may want to raise an [issue](https://github.com/hpcaitech/ColossalAI/issues/new/choose) in this repository.
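+
+As a quick sanity check against the requirements above, you can query the CUDA build and GPU compute capability through PyTorch. This is a minimal sketch rather than part of Colossal-AI; the thresholds simply mirror the list above:
+
+```python
+import torch
+
+assert torch.cuda.is_available(), "No NVIDIA GPU detected"
+assert torch.version.cuda is not None, "This PyTorch build was compiled without CUDA"
+
+# CUDA toolkit version of this PyTorch build (needs >= 11.0).
+cuda_version = tuple(int(x) for x in torch.version.cuda.split("."))
+# Compute capability of GPU 0 (needs >= 7.0, i.e. V100 / RTX 20 series or newer).
+capability = torch.cuda.get_device_capability(0)
+
+print(f"CUDA {cuda_version}, compute capability {capability}")
+assert cuda_version >= (11, 0) and capability >= (7, 0), "Below Colossal-AI's minimum requirements"
+```
+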
diff --git a/applications/README.md b/applications/README.md
index b6bde313e..a9ea517e9 100644
--- a/applications/README.md
+++ b/applications/README.md
@@ -7,6 +7,8 @@ The list of applications include:
- [X] [Chatbot](./Chat/README.md)
- [ ] Stable Diffusion
- [ ] Dreambooth
-
+- [X] [FastFold](https://github.com/hpcaitech/FastFold): Optimizing AlphaFold (Biomedicine) Training and Inference on GPU Clusters
> Please note that the `Chatbot` application is migrated from the original `ChatGPT` folder.
+
+More example code for the base models and related functions can be found in the [Examples](https://github.com/hpcaitech/ColossalAI/tree/main/examples) directory.
\ No newline at end of file
diff --git a/docs/README-zh-Hans.md b/docs/README-zh-Hans.md
index f43a59530..daa42412c 100644
--- a/docs/README-zh-Hans.md
+++ b/docs/README-zh-Hans.md
@@ -38,6 +38,14 @@
- 为何选择 Colossal-AI
- 特点
+- Colossal-AI 成功案例
- 并行训练样例展示
-- Colossal-AI 成功案例
- 安装
@@ -117,8 +117,88 @@ Colossal-AI 为您提供了一系列并行组件。我们的目标是让您的
(返回顶端)
-## 并行训练样例展示
+## Colossal-AI 成功案例
+### ColossalChat
+
+
+[ColossalChat](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat): 完整RLHF流程0门槛克隆 [ChatGPT](https://openai.com/blog/chatgpt/) [[代码]](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat) [[博客]](https://medium.com/@yangyou_berkeley/colossalchat-an-open-source-solution-for-cloning-chatgpt-with-a-complete-rlhf-pipeline-5edf08fb538b) [[在线样例]](https://chat.colossalai.org)
+
+- 最高可提升单机训练速度7.73倍,单卡推理速度1.42倍
+
+- 单卡模型容量最多提升10.3倍
+- 最小demo训练流程最低仅需1.62GB显存 (任意消费级GPU)
+
+- 提升单卡的微调模型容量3.7倍
+- 同时保持高速运行
+
+(返回顶端)
+
+### AIGC
+加速AIGC(AI内容生成)模型,如[Stable Diffusion v1](https://github.com/CompVis/stable-diffusion) 和 [Stable Diffusion v2](https://github.com/Stability-AI/stablediffusion)
+
+- [训练](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/diffusion): 减少5.6倍显存消耗,硬件成本最高降低46倍(从A100到RTX3060)
+
+- [DreamBooth微调](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/dreambooth): 仅需3-5张目标主题图像个性化微调
+
+- [推理](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/diffusion): GPU推理显存消耗降低2.5倍
+
+(返回顶端)
+
+### 生物医药
+
+加速 [AlphaFold](https://alphafold.ebi.ac.uk/) 蛋白质结构预测
+
+- [FastFold](https://github.com/hpcaitech/FastFold): 加速AlphaFold训练与推理、数据前处理、推理序列长度超过10000残基
+
+- [FastFold with Intel](https://github.com/hpcaitech/FastFold): 3倍推理加速和39%成本节省
+
+- [xTrimoMultimer](https://github.com/biomap-research/xTrimoMultimer): 11倍加速蛋白质单体与复合物结构预测
+
+(返回顶端)
+
+## 并行训练样例展示
### GPT-3
@@ -213,87 +293,6 @@ Colossal-AI 为您提供了一系列并行组件。我们的目标是让您的
(返回顶端)
-## Colossal-AI 成功案例
-### ColossalChat
-
-
-
-[ColossalChat](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat): 完整RLHF流程0门槛克隆 [ChatGPT](https://openai.com/blog/chatgpt/) [[代码]](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat) [[博客]](https://medium.com/@yangyou_berkeley/colossalchat-an-open-source-solution-for-cloning-chatgpt-with-a-complete-rlhf-pipeline-5edf08fb538b) [[在线样例]](https://chat.colossalai.org)
-
-
-
-
-
-- 最高可提升单机训练速度7.73倍,单卡推理速度1.42倍
-
-
-
-
-
-- 单卡模型容量最多提升10.3倍
-- 最小demo训练流程最低仅需1.62GB显存 (任意消费级GPU)
-
-
-
-
-
-- 提升单卡的微调模型容量3.7倍
-- 同时保持高速运行
-
-(back to top)
-
-### AIGC
-加速AIGC(AI内容生成)模型,如[Stable Diffusion v1](https://github.com/CompVis/stable-diffusion) 和 [Stable Diffusion v2](https://github.com/Stability-AI/stablediffusion)
-
-
-
-
-
-- [训练](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/diffusion): 减少5.6倍显存消耗,硬件成本最高降低46倍(从A100到RTX3060)
-
-
-
-
-
-- [DreamBooth微调](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/dreambooth): 仅需3-5张目标主题图像个性化微调
-
-
-
-
-
-- [推理](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/diffusion): GPU推理显存消耗降低2.5倍
-
-
-(返回顶端)
-
-### 生物医药
-
-加速 [AlphaFold](https://alphafold.ebi.ac.uk/) 蛋白质结构预测
-
-
-
-
-
-- [FastFold](https://github.com/hpcaitech/FastFold): 加速AlphaFold训练与推理、数据前处理、推理序列长度超过10000残基
-
-
-
-
-
-- [FastFold with Intel](https://github.com/hpcaitech/FastFold): 3倍推理加速和39%成本节省
-
-
-
-
-
-- [xTrimoMultimer](https://github.com/biomap-research/xTrimoMultimer): 11倍加速蛋白质单体与复合物结构预测
-
-(返回顶端)
-
## 安装
环境要求:
@@ -301,6 +300,8 @@ Colossal-AI 为您提供了一系列并行组件。我们的目标是让您的
- PyTorch >= 1.11 (PyTorch 2.x 正在适配中)
- Python >= 3.7
- CUDA >= 11.0
+- [NVIDIA GPU Compute Capability](https://developer.nvidia.com/cuda-gpus) >= 7.0 (V100/RTX20 and higher)
+- Linux OS
如果你遇到安装问题,可以向本项目 [反馈](https://github.com/hpcaitech/ColossalAI/issues/new/choose)。
diff --git a/docs/source/en/get_started/installation.md b/docs/source/en/get_started/installation.md
index 672fd8ae0..290879219 100644
--- a/docs/source/en/get_started/installation.md
+++ b/docs/source/en/get_started/installation.md
@@ -4,6 +4,8 @@ Requirements:
- PyTorch >= 1.11 (PyTorch 2.x in progress)
- Python >= 3.7
- CUDA >= 11.0
+- [NVIDIA GPU Compute Capability](https://developer.nvidia.com/cuda-gpus) >= 7.0 (V100/RTX20 and higher)
+- Linux OS
If you encounter any problem about installation, you may want to raise an [issue](https://github.com/hpcaitech/ColossalAI/issues/new/choose) in this repository.
diff --git a/docs/source/zh-Hans/get_started/installation.md b/docs/source/zh-Hans/get_started/installation.md
index 7a9b20255..72f853938 100755
--- a/docs/source/zh-Hans/get_started/installation.md
+++ b/docs/source/zh-Hans/get_started/installation.md
@@ -5,6 +5,8 @@
- PyTorch >= 1.11 (PyTorch 2.x 正在适配中)
- Python >= 3.7
- CUDA >= 11.0
+- [NVIDIA GPU Compute Capability](https://developer.nvidia.com/cuda-gpus) >= 7.0 (V100/RTX20 and higher)
+- Linux OS
如果你遇到安装问题,可以向本项目 [反馈](https://github.com/hpcaitech/ColossalAI/issues/new/choose)。
diff --git a/examples/README.md b/examples/README.md
index 710ced101..dd5e7b10a 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -12,6 +12,8 @@
This folder provides several examples accelerated by Colossal-AI. The `tutorial` folder is for everyone to quickly try out the different features in Colossal-AI. Other folders such as `images` and `language` include a wide range of deep learning tasks and applications.
+You can find applications such as Chatbot, Stable Diffusion and Biomedicine in the [Applications](https://github.com/hpcaitech/ColossalAI/tree/main/applications) directory.
+
## Folder Structure
```text