[hotfix] add copyright for solver and device mesh (#2803)

* [hotfix] add copyright for solver and device mesh

* add readme

* add alpa license

* polish
This commit is contained in:
YuliangLiu0306
2023-02-18 21:14:38 +08:00
committed by GitHub
parent dbd0fd1522
commit 2059fdd6b0
6 changed files with 49 additions and 13 deletions

View File

@@ -37,9 +37,6 @@ Colossal-AIs auto-parallelism searches for strategies in regard to each opera
## Distributed Tensor and Shape-Consistency System
The Colossal-AI system uses a device-mesh, similar to PyTorch's latest DTensor release, to manage its cluster. Colossal-AI uses a sharding-spec to annotate the storage status of each tensor and facilitate their distribution across the cluster. The system also employs a shape-consistency manager to automatically transform tensors between different sharding-specs, allowing for seamless slicing and dicing of tensors, while the shape-consistency manager ensures that the output of upstream operands is consistently stored in the cluster, regardless of how the input of downstream operands is stored. This makes Colossal-AI highly versatile and easy to use without users worrying about the storage status of tensors when performing operations on them.
<figure style={{textAlign: "center"}}>
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/auto_parallel/shape_consistency.png"/>
</figure>
Here are some key advantages of Colossal-AI compared to PyTorch DTensor:
Colossal-AI's device-mesh uses cluster performance metrics and profiling results to estimate the time consumption of different communication operators. This helps Colossal-AI optimize communication between nodes and improve overall system efficiency.

View File

@@ -11,7 +11,3 @@ Detailed instructions can be found in its `README.md`.
Colossal-Auto's automatic search function for activation checkpointing finds the most efficient checkpoint within a given memory budget, rather than just aiming for maximum memory compression. To avoid a lengthy search process for an optimal activation checkpoint, Colossal-Auto has implemented a two-stage search process. This allows the system to find a feasible distributed training solution in a reasonable amount of time while still benefiting from activation checkpointing for memory management. The integration of activation checkpointing in Colossal-AI improves the efficiency and effectiveness of large model training. You can follow the [Resnet example](https://github.com/hpcaitech/ColossalAI/tree/main/examples/tutorial/auto_parallel).
Detailed instructions can be found in its `README.md`.
<figure style={{textAlign: "center"}}>
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/auto_parallel/auto_ckpt.jpg"/>
</figure>