[doc] Add user document for Shardformer (#4702)

* create shardformer doc files

* add docstring for seq-parallel

* update ShardConfig docstring

* add links to llama example

* add outdated message

* finish introduction & supporting information

* finish 'how shardformer works'

* finish shardformer.md English doc

* fix doctest fail

* add Chinese document
Baizhou Zhang
2023-09-15 10:56:39 +08:00
committed by GitHub
parent ce97790ed7
commit f911d5b09d
11 changed files with 315 additions and 33 deletions


@@ -2,6 +2,8 @@
Author: Zhengda Bian, Yongbin Li

> ⚠️ The information on this page is outdated and will be deprecated. Please check [Shardformer](./shardformer.md) for more information.

**Prerequisite**
- [Define Your Configuration](../basics/define_your_config.md)
- [Configure Parallelization](../basics/configure_parallelization.md)
@@ -116,3 +118,5 @@ Output of the first linear layer: torch.Size([16, 512])
Output of the second linear layer: torch.Size([16, 256])
```
The output of the first linear layer is split into 2 partitions (each with shape `[16, 512]`), while the output of the second layer is identical across the GPUs.
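
The shapes above can be reproduced on a single machine with a minimal sketch that simulates the two GPUs using NumPy arrays. This is not the Colossal-AI implementation. The sizes (`batch`, `dim`, `hidden`, `world_size`), the random weights, and the omission of bias and activation are all assumptions made for illustration; only the output shapes `[16, 512]` and `[16, 256]` come from the log above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes, chosen only so the shapes match the log above.
batch, dim, hidden, world_size = 16, 256, 1024, 2

x = rng.standard_normal((batch, dim))
w1 = rng.standard_normal((dim, hidden))   # first linear layer (no bias)
w2 = rng.standard_normal((hidden, dim))   # second linear layer (no bias)

# First layer, split by columns: each simulated rank holds hidden / world_size columns.
w1_shards = np.split(w1, world_size, axis=1)
h_shards = [x @ w for w in w1_shards]     # one partial activation per rank

# Second layer, split by rows: each rank holds the rows matching its activation slice.
w2_shards = np.split(w2, world_size, axis=0)
partials = [h @ w for h, w in zip(h_shards, w2_shards)]

# Summing the partial results plays the role of an all-reduce,
# so every rank ends up with the same full output.
y = sum(partials)

# Unsharded reference for comparison.
reference = (x @ w1) @ w2

print("first layer, per rank:", h_shards[0].shape)
print("second layer, every rank:", y.shape)
print("matches unsharded result:", np.allclose(y, reference))
```

Each simulated rank holds only half of the first layer's output, which is why the log shows a partition rather than the full hidden dimension. The summation step is what makes the second layer's output identical on every rank.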
<!-- doc-test-command: echo -->