... to fix reads of compressed data by docker-archive:

> go get github.com/klauspost/pgzip@master
> make vendor

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Miloslav Trmač
2022-09-30 19:51:07 +02:00
parent 787e10873c
commit 7ebff0f533
7 changed files with 29 additions and 11 deletions

vendor/github.com/klauspost/pgzip/.travis.yml

@@ -1,3 +1,7 @@
+arch:
+  - amd64
+  - ppc64le
 language: go
 os:

vendor/github.com/klauspost/pgzip/LICENSE

@@ -1,4 +1,4 @@
-MIT License
+The MIT License (MIT)
 
 Copyright (c) 2014 Klaus Post
@@ -19,3 +19,4 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.

vendor/github.com/klauspost/pgzip/README.md

@@ -104,13 +104,12 @@ Content is [Matt Mahoneys 10GB corpus](http://mattmahoney.net/dc/10gb.html). Com
 Compressor | MB/sec | speedup | size | size overhead (lower=better)
 ------------|----------|---------|------|---------
-[gzip](http://golang.org/pkg/compress/gzip) (golang) | 15.44MB/s (1 thread) | 1.0x | 4781329307 | 0%
-[gzip](http://github.com/klauspost/compress/gzip) (klauspost) | 135.04MB/s (1 thread) | 8.74x | 4894858258 | +2.37%
-[pgzip](https://github.com/klauspost/pgzip) (klauspost) | 1573.23MB/s| 101.9x | 4902285651 | +2.53%
-[bgzf](https://godoc.org/github.com/biogo/hts/bgzf) (biogo) | 361.40MB/s | 23.4x | 4869686090 | +1.85%
-[pargzip](https://godoc.org/github.com/golang/build/pargzip) (builder) | 306.01MB/s | 19.8x | 4786890417 | +0.12%
+[gzip](http://golang.org/pkg/compress/gzip) (golang) | 16.91MB/s (1 thread) | 1.0x | 4781329307 | 0%
+[gzip](http://github.com/klauspost/compress/gzip) (klauspost) | 127.10MB/s (1 thread) | 7.52x | 4885366806 | +2.17%
+[pgzip](https://github.com/klauspost/pgzip) (klauspost) | 2085.35MB/s| 123.34x | 4886132566 | +2.19%
+[pargzip](https://godoc.org/github.com/golang/build/pargzip) (builder) | 334.04MB/s | 19.76x | 4786890417 | +0.12%
 
-pgzip also contains a [linear time compression](https://github.com/klauspost/compress#linear-time-compression-huffman-only) mode, that will allow compression at ~250MB per core per second, independent of the content.
+pgzip also contains a [huffman only compression](https://github.com/klauspost/compress#linear-time-compression-huffman-only) mode, that will allow compression at ~450MB per core per second, largely independent of the content.
 
 See the [complete sheet](https://docs.google.com/spreadsheets/d/1nuNE2nPfuINCZJRMt6wFWhKpToF95I47XjSsc-1rbPQ/edit?usp=sharing) for different content types and compression settings.
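
As background on the huffman-only mode the updated README describes, here is a minimal sketch of selecting it through pgzip's writer. It assumes pgzip re-exports a `HuffmanOnly` level the way its sibling `klauspost/compress` packages do, and the block size and count passed to `SetConcurrency` are illustrative, not recommendations:

```go
package main

import (
	"io"
	"log"
	"os"

	"github.com/klauspost/pgzip"
)

func main() {
	// Huffman-only compression: no match searching, so throughput is
	// roughly constant per core regardless of content.
	w, err := pgzip.NewWriterLevel(os.Stdout, pgzip.HuffmanOnly)
	if err != nil {
		log.Fatal(err)
	}
	// Illustrative tuning: 1 MB blocks, up to 8 compressed concurrently.
	if err := w.SetConcurrency(1<<20, 8); err != nil {
		log.Fatal(err)
	}
	if _, err := io.Copy(w, os.Stdin); err != nil {
		log.Fatal(err)
	}
	// Close flushes any pending blocks and writes the gzip trailer.
	if err := w.Close(); err != nil {
		log.Fatal(err)
	}
}
```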
@@ -123,7 +122,7 @@ In the example above, the numbers are as follows on a 4 CPU machine:
 Decompressor | Time | Speedup
 -------------|------|--------
 [gzip](http://golang.org/pkg/compress/gzip) (golang) | 1m28.85s | 0%
-[pgzip](https://github.com/klauspost/pgzip) (golang) | 43.48s | 104%
+[pgzip](https://github.com/klauspost/pgzip) (klauspost) | 43.48s | 104%
 
 But wait, since gzip decompression is inherently singlethreaded (aside from CRC calculation) how can it be more than 100% faster? Because pgzip due to its design also acts as a buffer. When using unbuffered gzip, you are also waiting for io when you are decompressing. If the gzip decoder can keep up, it will always have data ready for your reader, and you will not be waiting for input to the gzip decompressor to complete.
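
The buffering effect that paragraph describes falls out of pgzip's readahead: blocks are decoded ahead of the consumer and queued. A minimal sketch using `NewReaderN`; the block size and block count here are assumptions for illustration, not measured values:

```go
package main

import (
	"io"
	"log"
	"os"

	"github.com/klauspost/pgzip"
)

func main() {
	f, err := os.Open("data.gz") // hypothetical input file
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Decode ahead in 1 MB blocks, keeping up to 8 decoded blocks queued,
	// so the consumer is rarely stalled waiting on input or on the decoder.
	zr, err := pgzip.NewReaderN(f, 1<<20, 8)
	if err != nil {
		log.Fatal(err)
	}
	defer zr.Close()

	if _, err := io.Copy(os.Stdout, zr); err != nil {
		log.Fatal(err)
	}
}
```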

vendor/github.com/klauspost/pgzip/gunzip.go

@@ -513,6 +513,19 @@ func (z *Reader) Read(p []byte) (n int, err error) {
 func (z *Reader) WriteTo(w io.Writer) (n int64, err error) {
 	total := int64(0)
+	avail := z.current[z.roff:]
+	if len(avail) != 0 {
+		n, err := w.Write(avail)
+		if n != len(avail) {
+			return total, io.ErrShortWrite
+		}
+		total += int64(n)
+		if err != nil {
+			return total, err
+		}
+		z.blockPool <- z.current
+		z.current = nil
+	}
 	for {
 		if z.err != nil {
 			return total, z.err
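
The added block makes `WriteTo` flush whatever is still buffered in `z.current` from an earlier partial `Read` before draining the readahead, instead of silently skipping it. A minimal sketch of the call pattern this protects (the literal data and byte counts are arbitrary):

```go
package main

import (
	"bytes"
	"compress/gzip"
	"io"
	"log"

	"github.com/klauspost/pgzip"
)

func main() {
	// Build a small gzip stream in memory.
	var src bytes.Buffer
	gw := gzip.NewWriter(&src)
	gw.Write([]byte("hello, pgzip WriteTo"))
	gw.Close()

	zr, err := pgzip.NewReader(&src)
	if err != nil {
		log.Fatal(err)
	}

	// A partial Read leaves the rest of the decoded block buffered in zr.
	head := make([]byte, 5)
	if _, err := io.ReadFull(zr, head); err != nil {
		log.Fatal(err)
	}

	// io.Copy prefers zr.WriteTo (io.WriterTo); with this hunk the buffered
	// tail is written out first rather than being dropped.
	var rest bytes.Buffer
	if _, err := io.Copy(&rest, zr); err != nil {
		log.Fatal(err)
	}
	log.Printf("%s%s", head, rest.Bytes()) // hello, pgzip WriteTo
}
```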