Mirror of https://github.com/hpcaitech/ColossalAI.git, synced 2025-09-05 11:02:05 +00:00
[doc] Fix typo under colossalai and doc (#3618)
* Fixed several spelling errors under colossalai
* Fix the spelling error in colossalai and docs directory
* Cautious: changed the spelling error under the example folder
* Update runtime_preparation_pass.py: revert "autograft" to "autograd"
* Update search_chunk.py: "utile" to "until"
* Update check_installation.py: change "misteach" to "mismatch" in line 91
* Update 1D_tensor_parallel.md: revert to "perceptron"
* Update 2D_tensor_parallel.md: revert to "perceptron" in line 73
* Update 2p5D_tensor_parallel.md: revert to "perceptron" in line 71
* Update 3D_tensor_parallel.md: revert to "perceptron" in line 80
* Update README.md: revert to "resnet" in line 42
* Update reorder_graph.py: revert to "indice" in line 7
* Update p2p.py: revert to "megatron" in line 94
* Update initialize.py: revert to "torchrun" in line 198
* Update routers.py: change to "detailed" in line 63
* Update routers.py: change to "detailed" in line 146
* Update README.md: revert "random number" in line 402
@@ -26,7 +26,7 @@ def zero_model_wrapper(model: nn.Module,
         zero_stage (int, optional): The stage of ZeRO DDP. You can find more information in ZeRO's paper.
             https://arxiv.org/abs/1910.02054
         gemini_config (dict, optional): The configuration dictionary of `GeminiDDP`. `GeminiDDP` is enabled
-            when the stage is set to 3. You can set the arguemnts of `GeminiDDP` in the gemini_config.
+            when the stage is set to 3. You can set the arguments of `GeminiDDP` in the gemini_config.
             Here is an example where we set the device of the model, the placement policy of Gemini, and the
             size of hidden dimension to help Gemini find out a unified chunk size.
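For context, below is a minimal usage sketch of the `zero_model_wrapper` API this docstring belongs to. The import path and the gemini_config keys (device, placement_policy, hidden_dim) follow what the docstring describes, but treat them as assumptions rather than the exact API of this ColossalAI version; the model is a hypothetical stand-in.

import torch.nn as nn
from colossalai.zero import zero_model_wrapper

# Hypothetical stand-in model; any nn.Module works here.
model = nn.Sequential(nn.Linear(1024, 1024), nn.GELU(), nn.Linear(1024, 1024))

# Stage 3 enables GeminiDDP. The keys mirror the docstring's example:
# model device, Gemini placement policy, and the hidden dimension that
# helps Gemini pick a unified chunk size. Key names are assumptions.
gemini_config = dict(device="cuda", placement_policy="cpu", hidden_dim=1024)
model = zero_model_wrapper(model, zero_stage=3, gemini_config=gemini_config)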
@@ -78,7 +78,7 @@ def zero_optim_wrapper(model: nn.Module,
         max_norm (float, optional): max_norm used for `clip_grad_norm`. You should notice that you shall not do
             clip_grad_norm by yourself when using ZeRO DDP. The ZeRO optimizer will take care of clip_grad_norm.
         norm_type (float, optional): norm_type used for `clip_grad_norm`.
-        optim_config (dict, optinoal): The configuration used for the ZeRO optimizer.
+        optim_config (dict, optional): The configuration used for the ZeRO optimizer.
             Example:

             >>> zero2_config = dict(reduce_bucket_size=12 * 1024 * 1024, overlap_communication=True)
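To round out the hunk above, here is a sketch of how `zero_optim_wrapper` might be called with the stage-2 config from the docstring. The positional (model, optimizer) call pattern and the max_norm keyword are assumptions based on the documented parameters, not a verified signature for this version.

import torch
from colossalai.zero import zero_optim_wrapper

# Assumes `model` was already wrapped by zero_model_wrapper (stage 2 here).
optim = torch.optim.Adam(model.parameters(), lr=1e-3)

# The stage-2 config from the docstring example. Passing max_norm delegates
# gradient clipping to the ZeRO optimizer, so no manual clip_grad_norm call.
zero2_config = dict(reduce_bucket_size=12 * 1024 * 1024, overlap_communication=True)
optim = zero_optim_wrapper(model, optim, optim_config=zero2_config, max_norm=1.0)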