[doc] added reference to related works (#2994)
* [doc] added reference to related works
* polish code
@@ -1,6 +1,7 @@
# Zero Redundancy Optimizer with chunk-based memory management
Author: [Hongxiu Liu](https://github.com/ver217), [Jiarui Fang](https://github.com/feifeibear), [Zijian Ye](https://github.com/ZijianYY)
**Prerequisite:**
- [Define Your Configuration](../basics/define_your_config.md)
@@ -9,9 +10,11 @@ Author: [Hongxiu Liu](https://github.com/ver217), [Jiarui Fang](https://github.c
- [Train GPT with Colossal-AI](https://github.com/hpcaitech/ColossalAI/tree/main/examples/language/gpt)
**Related Papers**
- [ZeRO: Memory Optimizations Toward Training Trillion Parameter Models](https://arxiv.org/abs/1910.02054)
- [ZeRO-Offload: Democratizing Billion-Scale Model Training](https://arxiv.org/abs/2101.06840)
- [ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning](https://arxiv.org/abs/2104.07857)
- [DeepSpeed: System Optimizations Enable Training Deep Learning Models with Over 100 Billion Parameters](https://dl.acm.org/doi/10.1145/3394486.3406703)
- [PatrickStar: Parallel Training of Pre-trained Models via Chunk-based Memory Management](https://arxiv.org/abs/2108.05818)
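
The example link above points to a full training script; as a quick orientation, here is a minimal sketch of how the chunk-based ZeRO (Gemini) pieces are typically wired together. It assumes the `ColoInitContext`, `GeminiDDP`, `ZeroOptimizer` and `HybridAdam` names used in ColossalAI's examples around this release; exact module paths and parameter names may differ between versions, so treat it as an illustration rather than the canonical API.

```python
import torch
import colossalai
from colossalai.nn.optimizer import HybridAdam
from colossalai.nn.parallel import GeminiDDP
from colossalai.utils import get_current_device
from colossalai.utils.model.colo_init_context import ColoInitContext
from colossalai.zero import ZeroOptimizer

# Initialize the distributed environment; the script is expected to be started
# with `torchrun` or `colossalai run` so the rank/world-size env vars exist.
colossalai.launch_from_torch(config={})

# Build the model under ColoInitContext so its parameters can be managed in chunks.
with ColoInitContext(device=get_current_device()):
    model = torch.nn.Linear(1024, 1024)  # stand-in for a real model such as GPT

# GeminiDDP groups parameters into chunks and moves them between GPU and CPU
# according to the placement policy ('auto' lets the runtime decide).
model = GeminiDDP(model,
                  device=get_current_device(),
                  placement_policy='auto',
                  pin_memory=True,
                  search_range_mb=32)

# ZeroOptimizer shards optimizer states along the same chunks and applies loss scaling.
optimizer = ZeroOptimizer(HybridAdam(model.parameters(), lr=1e-3), model, initial_scale=2**5)

# One training step; backward goes through the optimizer, not loss.backward().
# The fp16 input is an assumption matching Gemini's default fp16 parameters.
data = torch.randn(8, 1024, device=get_current_device(), dtype=torch.half)
loss = model(data).sum()
optimizer.zero_grad()
optimizer.backward(loss)
optimizer.step()
```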
## Introduction