mirror of
https://github.com/hpcaitech/ColossalAI.git
synced 2025-09-02 01:28:31 +00:00
[doc] Fix typo under colossalai and doc(#3618)
* Fixed several spelling errors under colossalai * Fix the spelling error in colossalai and docs directory * Cautious Changed the spelling error under the example folder * Update runtime_preparation_pass.py revert autograft to autograd * Update search_chunk.py utile to until * Update check_installation.py change misteach to mismatch in line 91 * Update 1D_tensor_parallel.md revert to perceptron * Update 2D_tensor_parallel.md revert to perceptron in line 73 * Update 2p5D_tensor_parallel.md revert to perceptron in line 71 * Update 3D_tensor_parallel.md revert to perceptron in line 80 * Update README.md revert to resnet in line 42 * Update reorder_graph.py revert to indice in line 7 * Update p2p.py revert to megatron in line 94 * Update initialize.py revert to torchrun in line 198 * Update routers.py change to detailed in line 63 * Update routers.py change to detailed in line 146 * Update README.md revert random number in line 402
@@ -40,7 +40,7 @@ We provide two stable solutions.
 One utilizes the Gemini to implement hybrid parallel strategies of Gemini, DDP/ZeRO, and Tensor Parallelism for a huggingface GPT model.
 The other one use [Titans](https://github.com/hpcaitech/Titans), a distributed executed model zoo maintained by ColossalAI,to implement the hybrid parallel strategies of TP + ZeRO + PP.
-We recommend using Gemini to qucikly run your model in a distributed manner.
+We recommend using Gemini to quickly run your model in a distributed manner.
 It doesn't require significant changes to the model structures, therefore you can apply it on a new model easily.
 And use Titans as an advanced weapon to pursue a more extreme performance.
 Titans has included the some typical models, such as Vit and GPT.
@@ -27,7 +27,7 @@ pip install transformers
 ## Dataset

-For simplicity, the input data is randonly generated here.
+For simplicity, the input data is randomly generated here.

 ## Training
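The "randomly generated" input this hunk describes can be sketched as follows; this is a minimal illustration, not the example's actual data pipeline, and the batch size, sequence length, and vocabulary size are illustrative defaults:

```python
import torch

# Hypothetical stand-in for a real dataset: random token ids plus an
# all-ones attention mask, shaped like a GPT input batch.
def get_random_batch(batch_size=2, seq_len=16, vocab_size=50257):
    input_ids = torch.randint(0, vocab_size, (batch_size, seq_len))
    attention_mask = torch.ones_like(input_ids)
    return input_ids, attention_mask

ids, mask = get_random_batch()
print(ids.shape)  # torch.Size([2, 16])
```

Random batches like this are enough to benchmark training throughput and memory without downloading a corpus.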
@@ -34,7 +34,7 @@ conda install -c conda-forge coin-or-cbc
 ## Dataset

-For simplicity, the input data is randonly generated here.
+For simplicity, the input data is randomly generated here.

 ## Training
@@ -27,7 +27,7 @@ pip install transformers
 ## Dataset

-For simplicity, the input data is randonly generated here.
+For simplicity, the input data is randomly generated here.

 ## Training
@@ -163,7 +163,7 @@ def main():
     else:
         init_dev = get_current_device()

-    # shard init prameters
+    # shard init parameters
     if args.shardinit:
         logger.info("Sharding initialization !", ranks=[0])
     else:
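The `args.shardinit` branch in this hunk is driven by a command-line switch. A minimal sketch of how such a flag could be wired up — only the flag name comes from the diff; the parser itself is illustrative:

```python
import argparse

# Sketch of the CLI switch gating sharded initialization; the flag name
# matches the diff above, everything else is a hypothetical example.
parser = argparse.ArgumentParser()
parser.add_argument("--shardinit", action="store_true",
                    help="initialize model parameters directly on shards")
args = parser.parse_args(["--shardinit"])
print(args.shardinit)  # True
```

With `action="store_true"` the flag defaults to `False`, so sharded initialization is opt-in.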
@@ -192,7 +192,7 @@ def main():
         config=config,
         local_files_only=False)

-    # enable graident checkpointing
+    # enable gradient checkpointing
     model.gradient_checkpointing_enable()

     numel = sum([p.numel() for p in model.parameters()])
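The parameter count in the last context line is a plain PyTorch idiom; on a toy model it behaves like this (the two-layer model here is illustrative, standing in for the GPT model the diff configures):

```python
import torch.nn as nn

# Toy model standing in for the real GPT: two linear layers.
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))

# Same counting idiom as in the diff above.
numel = sum(p.numel() for p in model.parameters())
# Linear(4, 4): 4*4 weights + 4 biases = 20; Linear(4, 2): 8 + 2 = 10
print(numel)  # 30
```

`gradient_checkpointing_enable()` on the line above it is the Hugging Face Transformers API that trades recomputation for activation memory.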