mirror of
https://github.com/hpcaitech/ColossalAI.git
synced 2026-05-05 12:24:38 +00:00
[doc] add requirement and highlight application (#3516)
* [doc] add requirement and highlight application
* [doc] link example and application
182
README.md
@@ -38,6 +38,14 @@
<ul>
<li><a href="#Why-Colossal-AI">Why Colossal-AI</a> </li>
<li><a href="#Features">Features</a> </li>
<li>
<a href="#Colossal-AI-in-the-Real-World">Colossal-AI for Real World Applications</a>
<ul>
<li><a href="#ColossalChat">ColossalChat: An Open-Source Solution for Cloning ChatGPT With a Complete RLHF Pipeline</a></li>
<li><a href="#AIGC">AIGC: Acceleration of Stable Diffusion</a></li>
<li><a href="#Biomedicine">Biomedicine: Acceleration of AlphaFold Protein Structure</a></li>
</ul>
</li>
<li>
<a href="#Parallel-Training-Demo">Parallel Training Demo</a>
<ul>
@@ -64,14 +72,6 @@
<li><a href="#OPT-Serving">OPT-175B Online Serving for Text Generation</a></li>
<li><a href="#BLOOM-Inference">176B BLOOM</a></li>
</ul>
</li>
<li>
<a href="#Colossal-AI-in-the-Real-World">Colossal-AI for Real World Applications</a>
<ul>
<li><a href="#ColossalChat">ColossalChat: An Open-Source Solution for Cloning ChatGPT With a Complete RLHF Pipeline</a></li>
<li><a href="#AIGC">AIGC: Acceleration of Stable Diffusion</a></li>
<li><a href="#Biomedicine">Biomedicine: Acceleration of AlphaFold Protein Structure</a></li>
</ul>
</li>
<li>
<a href="#Installation">Installation</a>
@@ -120,6 +120,88 @@ distributed training and inference in a few lines.
- Inference
- [Energon-AI](https://github.com/hpcaitech/EnergonAI)

<p align="right">(<a href="#top">back to top</a>)</p>

## Colossal-AI in the Real World

### ColossalChat

<div align="center">
<a href="https://chat.colossalai.org/">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/Chat-demo.png" width="700" />
</a>
</div>

[ColossalChat](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat): An open-source solution for cloning [ChatGPT](https://openai.com/blog/chatgpt/) with a complete RLHF pipeline. [[code]](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat) [[blog]](https://medium.com/@yangyou_berkeley/colossalchat-an-open-source-solution-for-cloning-chatgpt-with-a-complete-rlhf-pipeline-5edf08fb538b) [[demo]](https://chat.colossalai.org)

<p id="ColossalChat_scaling" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/ChatGPT%20scaling.png" width=800/>
</p>

- Up to 7.73 times faster for single-server training and 1.42 times faster for single-GPU inference

<p id="ColossalChat-1GPU" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/ChatGPT-1GPU.jpg" width=450/>
</p>

- Up to 10.3x growth in model capacity on one GPU
- A mini demo training process requires only 1.62GB of GPU memory (any consumer-grade GPU)

<p id="ColossalChat-LoRA" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/LoRA%20data.jpg" width=600/>
</p>

- Increase the capacity of the fine-tuning model by up to 3.7 times on a single GPU
- Maintain a sufficiently high running speed

<p align="right">(<a href="#top">back to top</a>)</p>


### AIGC
Acceleration of AIGC (AI-Generated Content) models such as [Stable Diffusion v1](https://github.com/CompVis/stable-diffusion) and [Stable Diffusion v2](https://github.com/Stability-AI/stablediffusion).
<p id="diffusion_train" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/Stable%20Diffusion%20v2.png" width=800/>
</p>

- [Training](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/diffusion): Reduce Stable Diffusion memory consumption by up to 5.6x and hardware cost by up to 46x (from A100 to RTX3060).

<p id="diffusion_demo" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/DreamBooth.png" width=800/>
</p>

- [DreamBooth Fine-tuning](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/dreambooth): Personalize your model using just 3-5 images of the desired subject.

<p id="inference" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/Stable%20Diffusion%20Inference.jpg" width=800/>
</p>

- [Inference](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/diffusion): Reduce inference GPU memory consumption by 2.5x.


<p align="right">(<a href="#top">back to top</a>)</p>

### Biomedicine
Acceleration of [AlphaFold Protein Structure](https://alphafold.ebi.ac.uk/)

<p id="FastFold" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/FastFold.jpg" width=800/>
</p>

- [FastFold](https://github.com/hpcaitech/FastFold): Accelerating training and inference on GPU clusters, faster data processing, and inference on sequences containing more than 10,000 residues.

<p id="FastFold-Intel" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/data%20preprocessing%20with%20Intel.jpg" width=600/>
</p>

- [FastFold with Intel](https://github.com/hpcaitech/FastFold): 3x inference acceleration and 39% cost reduction.

<p id="xTrimoMultimer" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/xTrimoMultimer_Table.jpg" width=800/>
</p>

- [xTrimoMultimer](https://github.com/biomap-research/xTrimoMultimer): Accelerating structure prediction of protein monomers and multimers by 11x.


<p align="right">(<a href="#top">back to top</a>)</p>

## Parallel Training Demo
@@ -213,88 +295,6 @@ Please visit our [documentation](https://www.colossalai.org/) and [examples](htt

- [BLOOM](https://github.com/hpcaitech/EnergonAI/tree/main/examples/bloom): Reduce hardware deployment costs of 176-billion-parameter BLOOM by more than 10 times.

<p align="right">(<a href="#top">back to top</a>)</p>

## Colossal-AI in the Real World

### ColossalChat

<div align="center">
<a href="https://chat.colossalai.org/">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/Chat-demo.png" width="700" />
</a>
</div>

[ColossalChat](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat): An open-source solution for cloning [ChatGPT](https://openai.com/blog/chatgpt/) with a complete RLHF pipeline. [[code]](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat) [[blog]](https://medium.com/@yangyou_berkeley/colossalchat-an-open-source-solution-for-cloning-chatgpt-with-a-complete-rlhf-pipeline-5edf08fb538b) [[demo]](https://chat.colossalai.org)

<p id="ColossalChat_scaling" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/ChatGPT%20scaling.png" width=800/>
</p>

- Up to 7.73 times faster for single-server training and 1.42 times faster for single-GPU inference

<p id="ColossalChat-1GPU" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/ChatGPT-1GPU.jpg" width=450/>
</p>

- Up to 10.3x growth in model capacity on one GPU
- A mini demo training process requires only 1.62GB of GPU memory (any consumer-grade GPU)

<p id="ColossalChat-LoRA" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/LoRA%20data.jpg" width=600/>
</p>

- Increase the capacity of the fine-tuning model by up to 3.7 times on a single GPU
- Maintain a sufficiently high running speed

<p align="right">(<a href="#top">back to top</a>)</p>


### AIGC
Acceleration of AIGC (AI-Generated Content) models such as [Stable Diffusion v1](https://github.com/CompVis/stable-diffusion) and [Stable Diffusion v2](https://github.com/Stability-AI/stablediffusion).
<p id="diffusion_train" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/Stable%20Diffusion%20v2.png" width=800/>
</p>

- [Training](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/diffusion): Reduce Stable Diffusion memory consumption by up to 5.6x and hardware cost by up to 46x (from A100 to RTX3060).

<p id="diffusion_demo" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/DreamBooth.png" width=800/>
</p>

- [DreamBooth Fine-tuning](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/dreambooth): Personalize your model using just 3-5 images of the desired subject.

<p id="inference" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/Stable%20Diffusion%20Inference.jpg" width=800/>
</p>

- [Inference](https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/diffusion): Reduce inference GPU memory consumption by 2.5x.


<p align="right">(<a href="#top">back to top</a>)</p>

### Biomedicine
Acceleration of [AlphaFold Protein Structure](https://alphafold.ebi.ac.uk/)

<p id="FastFold" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/FastFold.jpg" width=800/>
</p>

- [FastFold](https://github.com/hpcaitech/FastFold): Accelerating training and inference on GPU clusters, faster data processing, and inference on sequences containing more than 10,000 residues.

<p id="FastFold-Intel" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/data%20preprocessing%20with%20Intel.jpg" width=600/>
</p>

- [FastFold with Intel](https://github.com/hpcaitech/FastFold): 3x inference acceleration and 39% cost reduction.

<p id="xTrimoMultimer" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/xTrimoMultimer_Table.jpg" width=800/>
</p>

- [xTrimoMultimer](https://github.com/biomap-research/xTrimoMultimer): Accelerating structure prediction of protein monomers and multimers by 11x.


<p align="right">(<a href="#top">back to top</a>)</p>

## Installation
@@ -303,6 +303,8 @@ Requirements:
- PyTorch >= 1.11 (PyTorch 2.x in progress)
- Python >= 3.7
- CUDA >= 11.0
- [NVIDIA GPU Compute Capability](https://developer.nvidia.com/cuda-gpus) >= 7.0 (V100/RTX20 and higher)
- Linux OS

If you encounter any problem with installation, you may want to raise an [issue](https://github.com/hpcaitech/ColossalAI/issues/new/choose) in this repository.
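The requirements listed above can be checked before installing. The following is a minimal sketch, not part of Colossal-AI: the helper names `parse_version`, `check_requirements`, and `probe` are hypothetical, and the script assumes that PyTorch, if installed, reports its CUDA build through `torch.version.cuda` and the GPU capability through `torch.cuda.get_device_capability`.

```python
import platform
import sys


def parse_version(text):
    """Turn a string such as '1.13.1+cu117' into (1, 13) for numeric comparison."""
    parts = []
    for piece in text.split("+")[0].split(".")[:2]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def check_requirements(python, torch, cuda, capability, system):
    """Return the README requirements that are not met (empty list means OK)."""
    problems = []
    if python < (3, 7):
        problems.append("Python >= 3.7")
    if torch is None or parse_version(torch) < (1, 11):
        problems.append("PyTorch >= 1.11")
    if cuda is None or parse_version(cuda) < (11, 0):
        problems.append("CUDA >= 11.0")
    if capability is None or capability < (7, 0):
        problems.append("GPU compute capability >= 7.0")
    if system != "Linux":
        problems.append("Linux OS")
    return problems


def probe():
    """Collect values from the local machine; anything unavailable becomes None."""
    torch_version = cuda = capability = None
    try:
        import torch

        torch_version = torch.__version__
        cuda = torch.version.cuda  # None on CPU-only builds
        if torch.cuda.is_available():
            capability = torch.cuda.get_device_capability(0)
    except ImportError:
        pass
    return sys.version_info[:2], torch_version, cuda, capability, platform.system()


if __name__ == "__main__":
    missing = check_requirements(*probe())
    print("All requirements met" if not missing else "Unmet: " + ", ".join(missing))
```

Keeping `check_requirements` separate from `probe` means the comparison logic can be exercised on a machine without a GPU or PyTorch installed.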