[tutorial] added data script and updated readme (#1916)

Frank Lee 2022-11-12 16:38:41 +08:00 committed by GitHub
parent 155e202318
commit d53415bc10
3 changed files with 42 additions and 6 deletions

1
examples/tutorial/.gitignore vendored Normal file

@ -0,0 +1 @@
data/


@ -7,18 +7,33 @@ Welcome to the [Colossal-AI](https://github.com/hpcaitech/ColossalAI) tutorial,
[Colossal-AI](https://github.com/hpcaitech/ColossalAI), a unified deep learning system for the big model era, integrates
many advanced technologies such as multi-dimensional tensor parallelism, sequence parallelism, heterogeneous memory management,
large-scale optimization, adaptive task scheduling, etc. By using Colossal-AI, we could help users to efficiently and
quickly deploy large AI model training and inference, reducing large AI model training budgets and scaling down the labor cost of learning and deployment.
### 🚀 Quick Links
[**Colossal-AI**](https://github.com/hpcaitech/ColossalAI) |
[**Paper**](https://arxiv.org/abs/2110.14883) |
[**Documentation**](https://www.colossalai.org/) |
[**Forum**](https://github.com/hpcaitech/ColossalAI/discussions) |
[**Slack**](https://join.slack.com/t/colossalaiworkspace/shared_invite/zt-z7b26eeb-CBp7jouvu~r0~lcFzX832w)
## Prerequisite
To run these examples, you only need PyTorch and Colossal-AI installed. A sample script to install the dependencies is given below.
```
# install torch 1.12 with CUDA 11.3
# visit https://pytorch.org/get-started/locally/ to download other versions
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
# install Colossal-AI 0.1.11 built for torch 1.12 and CUDA 11.3
# visit https://colossalai.org/download to find the build that matches your setup
pip install colossalai==0.1.11+torch1.12cu11.3 -f https://release.colossalai.org
```
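Before running the tutorials, it can help to confirm that the packages are visible to the current Python environment. The snippet below is a minimal sketch that uses only the standard library; the helper name `check_installed` is hypothetical and is not part of PyTorch or Colossal-AI.

```python
from importlib.util import find_spec


def check_installed(packages):
    """Return a dict mapping each top-level package name to whether it can be found."""
    # find_spec returns None when a top-level package cannot be located.
    return {name: find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    # Package names taken from the install commands above.
    for name, found in check_installed(["torch", "torchvision", "colossalai"]).items():
        print(f"{name}: {'found' if found else 'missing'}")
```

This only checks that the packages can be located; it does not verify that the installed versions match each other or the local CUDA setup.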
## Table of Content
- Multi-dimensional Parallelism
@ -43,7 +58,15 @@ quickly deploy large AI model training and inference, reducing large AI model tr
- Acceleration of Stable Diffusion
- Stable Diffusion with Lightning
- Try Lightning Colossal-AI strategy to optimize memory and accelerate speed
## Prepare Common Dataset
**This tutorial folder aims to let users quickly try out the training scripts**. One major task in deep learning is data preparation. To save time on data preparation, we use `CIFAR10` for most tutorials and synthetic datasets when the required dataset is too large. To share the `CIFAR10` dataset across the different examples, download it into the tutorial root directory with the following command.
```bash
python download_cifar10.py
```
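Because the download script stores the dataset in a `data` folder next to itself, an example that lives in a subfolder of the tutorial root could locate the shared copy relative to its own file. The following is a minimal sketch, assuming each example script sits exactly one directory below the tutorial root; `shared_data_root` and the example path are hypothetical and are not part of the tutorial code.

```python
from pathlib import Path


def shared_data_root(script_path):
    """Return <tutorial root>/data for a script located one level below the root."""
    # Assumption: the layout is <tutorial root>/<example folder>/<script>.py
    tutorial_root = Path(script_path).resolve().parent.parent
    return tutorial_root / "data"


if __name__ == "__main__":
    # Hypothetical example script path, used only for illustration.
    print(shared_data_root("examples/tutorial/some_example/train.py"))
```

Inside a real example script, passing `__file__` as `script_path` would give the same `data` folder the download script writes to, provided the one-level layout assumption holds.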
## Discussion
@ -51,4 +74,3 @@ Discussion about the [Colossal-AI](https://github.com/hpcaitech/ColossalAI) proj
If you think there is a need to discuss anything, you may jump to our [Slack](https://join.slack.com/t/colossalaiworkspace/shared_invite/zt-z7b26eeb-CBp7jouvu~r0~lcFzX832w).
If you encounter any problem while running these tutorials, you may want to raise an [issue](https://github.com/hpcaitech/ColossalAI/issues/new/choose) in this repository.


@ -0,0 +1,13 @@
import os

from torchvision.datasets import CIFAR10


def main():
    dir_path = os.path.dirname(os.path.realpath(__file__))
    data_root = os.path.join(dir_path, 'data')
    dataset = CIFAR10(root=data_root, download=True)


if __name__ == '__main__':
    main()