From 176010f2898cd4353313fc909bf4d2f5a65860a1 Mon Sep 17 00:00:00 2001
From: Maruyama_Aya <china6280111@126.com>
Date: Tue, 6 Jun 2023 14:08:22 +0800
Subject: [PATCH] update performance evaluation

---
 examples/tutorial/new_api/dreambooth/README.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/examples/tutorial/new_api/dreambooth/README.md b/examples/tutorial/new_api/dreambooth/README.md
index bd7e7707a..8e1fdbbc8 100644
--- a/examples/tutorial/new_api/dreambooth/README.md
+++ b/examples/tutorial/new_api/dreambooth/README.md
@@ -40,6 +40,9 @@ We have modified our previous implementation of Dreambooth with our new Booster
 We have also offer a shell script `test_ci.sh` for you to go through all our plugins for the booster.
 For more information about the booster API you can refer to https://colossalai.org/docs/basics/booster_api/.
 
+
+
+
 ## Training
 
 We provide the script `colossalai.sh` to run the training task with colossalai. For instance, the script of training process for [stable-diffusion-v1-4] model can be modified into:
@@ -97,7 +100,22 @@ torchrun --nproc_per_node 2 train_dreambooth_colossalai.py \
   --placement="cuda"
 ```
 
+## Performance
 
+|    Strategy    | #GPU | Batch Size | GPU RAM (GB) | Speedup |
+|:--------------:|:----:|:----------:|:------------:|:-------:|
+|  Traditional   |  1   |     16     |     OOM      |    -    |
+|  Traditional   |  1   |     8      |    61.81     |  1.00   |
+|   torch_ddp    |  4   |     16     |     OOM      |    -    |
+|   torch_ddp    |  4   |     8      |    41.97     |  0.97   |
+|     gemini     |  4   |     16     |    53.29     |    -    |
+|     gemini     |  4   |     8      |    29.36     |  2.00   |
+| low_level_zero |  4   |     16     |    52.80     |    -    |
+| low_level_zero |  4   |     8      |    28.87     |  2.02   |
+
+The evaluation was performed on 4 NVIDIA A100 GPUs with 80 GB of memory each, where GPUs 0 and 1 and GPUs 2 and 3 are connected in pairs via NVLink.
+We fine-tuned the [stable-diffusion-v1-4](https://huggingface.co/stabilityai/stable-diffusion-v1-4) model at 512x512 resolution on the [Teyvat](https://huggingface.co/datasets/Fazzie/Teyvat) dataset and compared
+the memory cost and throughput of the plugins.
 
 ## Invitation to open-source contribution
 Referring to the successful attempts of [BLOOM](https://bigscience.huggingface.co/) and [Stable Diffusion](https://en.wikipedia.org/wiki/Stable_Diffusion), any and all developers and partners with computing powers, datasets, models are welcome to join and build the Colossal-AI community, making efforts towards the era of big AI models!