[doc] Add user document for Shardformer (#4702)

* create shardformer doc files

* add docstring for seq-parallel

* update ShardConfig docstring

* add links to llama example

* add outdated message

* finish introduction & supporting information

* finish 'how shardformer works'

* finish shardformer.md English doc

* fix doctest failure

* add Chinese document
Author: Baizhou Zhang
Date: 2023-09-15 10:56:39 +08:00 (committed by GitHub)
Parent: ce97790ed7
Commit: f911d5b09d
11 changed files with 315 additions and 33 deletions


@@ -243,9 +243,11 @@ class HybridParallelPlugin(PipelinePluginBase):
     enable_all_optimization (bool, optional): Whether to switch on all the optimizations supported by Shardformer.
         Currently all the optimization methods include fused normalization, flash attention and JIT.
         Defaults to False.
-    enable_fused_normalization (bool, optional): Whether to switch on fused normalization. Defaults to False.
-    enable_flash_attention (bool, optional): Whether to switch on flash attention. Defaults to False.
-    enable_jit_fused (bool, optional): Whether to switch on JIT. Default to Falase.
+    enable_fused_normalization (bool, optional): Whether to switch on fused normalization in Shardformer. Defaults to False.
+    enable_flash_attention (bool, optional): Whether to switch on flash attention in Shardformer. Defaults to False.
+    enable_jit_fused (bool, optional): Whether to switch on JIT in Shardformer. Default to False.
     enable_sequence_parallelism (bool): Whether to turn on sequence parallelism in Shardformer. Defaults to False.
     enable_sequence_overlap (bool): Whether to turn on sequence overlap in Shardformer. Defaults to False.
     num_microbatches (int, optional): Number of microbatches when using pipeline parallelism. Defaults to None.
     microbatch_size (int, optional): Microbatch size when using pipeline parallelism.
         Either ``num_microbatches`` or ``microbatch_size`` should be provided if using pipeline.
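
For context, a minimal sketch of how the flags documented in this hunk might be passed when constructing the plugin. The keyword names come from the docstring above; the tp_size/pp_size values and the surrounding Booster wiring are assumptions based on ColossalAI's booster API, not content from this commit.

    # Minimal sketch, assuming ColossalAI's Booster API; values are illustrative.
    from colossalai.booster import Booster
    from colossalai.booster.plugin import HybridParallelPlugin

    plugin = HybridParallelPlugin(
        tp_size=2,                         # tensor-parallel group size (assumed value)
        pp_size=2,                         # pipeline-parallel group size (assumed value)
        enable_fused_normalization=True,   # fused normalization in Shardformer
        enable_flash_attention=True,       # flash attention in Shardformer
        enable_jit_fused=True,             # JIT-fused operators in Shardformer
        enable_sequence_parallelism=True,  # sequence parallelism in Shardformer
        enable_sequence_overlap=True,      # overlap sequence-parallel communication
        num_microbatches=4,                # pipeline is on (pp_size > 1), so either
                                           # num_microbatches or microbatch_size is required
    )
    booster = Booster(plugin=plugin)
    # The model, optimizer, criterion, dataloader, and lr_scheduler would then be
    # wrapped via booster.boost(...) before training.

Note that per the docstring, ``num_microbatches`` and ``microbatch_size`` are alternatives: exactly one of them needs to be given once pipeline parallelism is enabled, since either determines how each batch is split across pipeline stages.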