Merge pull request #1764 from mtrmac/pgzip-update

Pgzip update
This commit is contained in:
Miloslav Trmač 2022-09-30 20:36:22 +02:00 committed by GitHub
commit ff2a361a0a
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
8 changed files with 1351 additions and 12 deletions

View File

@ -245,7 +245,7 @@ test-unit-local: bin/skopeo
$(GO) test $(MOD_VENDOR) -tags "$(BUILDTAGS)" $$($(GO) list $(MOD_VENDOR) -tags "$(BUILDTAGS)" -e ./... | grep -v '^github\.com/containers/skopeo/\(integration\|vendor/.*\)$$')
vendor:
$(GO) mod tidy -compat=1.17
$(GO) mod tidy
$(GO) mod vendor
$(GO) mod verify

2
go.mod
View File

@ -52,7 +52,7 @@ require (
github.com/inconshreveable/mousetrap v1.0.0 // indirect
github.com/json-iterator/go v1.1.12 // indirect
github.com/klauspost/compress v1.15.11 // indirect
github.com/klauspost/pgzip v1.2.5 // indirect
github.com/klauspost/pgzip v1.2.6-0.20220930104621-17e8dac29df8 // indirect
github.com/kr/pretty v0.2.1 // indirect
github.com/kr/text v0.2.0 // indirect
github.com/letsencrypt/boulder v0.0.0-20220723181115-27de4befb95e // indirect

1324
go.sum

File diff suppressed because it is too large Load Diff

View File

@ -1,3 +1,7 @@
arch:
- amd64
- ppc64le
language: go
os:

View File

@ -1,4 +1,4 @@
MIT License
The MIT License (MIT)
Copyright (c) 2014 Klaus Post
@ -19,3 +19,4 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@ -104,13 +104,12 @@ Content is [Matt Mahoneys 10GB corpus](http://mattmahoney.net/dc/10gb.html). Com
Compressor | MB/sec | speedup | size | size overhead (lower=better)
------------|----------|---------|------|---------
[gzip](http://golang.org/pkg/compress/gzip) (golang) | 15.44MB/s (1 thread) | 1.0x | 4781329307 | 0%
[gzip](http://github.com/klauspost/compress/gzip) (klauspost) | 135.04MB/s (1 thread) | 8.74x | 4894858258 | +2.37%
[pgzip](https://github.com/klauspost/pgzip) (klauspost) | 1573.23MB/s| 101.9x | 4902285651 | +2.53%
[bgzf](https://godoc.org/github.com/biogo/hts/bgzf) (biogo) | 361.40MB/s | 23.4x | 4869686090 | +1.85%
[pargzip](https://godoc.org/github.com/golang/build/pargzip) (builder) | 306.01MB/s | 19.8x | 4786890417 | +0.12%
[gzip](http://golang.org/pkg/compress/gzip) (golang) | 16.91MB/s (1 thread) | 1.0x | 4781329307 | 0%
[gzip](http://github.com/klauspost/compress/gzip) (klauspost) | 127.10MB/s (1 thread) | 7.52x | 4885366806 | +2.17%
[pgzip](https://github.com/klauspost/pgzip) (klauspost) | 2085.35MB/s| 123.34x | 4886132566 | +2.19%
[pargzip](https://godoc.org/github.com/golang/build/pargzip) (builder) | 334.04MB/s | 19.76x | 4786890417 | +0.12%
pgzip also contains a [linear time compression](https://github.com/klauspost/compress#linear-time-compression-huffman-only) mode, that will allow compression at ~250MB per core per second, independent of the content.
pgzip also contains a [huffman only compression](https://github.com/klauspost/compress#linear-time-compression-huffman-only) mode, that will allow compression at ~450MB per core per second, largely independent of the content.
See the [complete sheet](https://docs.google.com/spreadsheets/d/1nuNE2nPfuINCZJRMt6wFWhKpToF95I47XjSsc-1rbPQ/edit?usp=sharing) for different content types and compression settings.
@ -123,7 +122,7 @@ In the example above, the numbers are as follows on a 4 CPU machine:
Decompressor | Time | Speedup
-------------|------|--------
[gzip](http://golang.org/pkg/compress/gzip) (golang) | 1m28.85s | 0%
[pgzip](https://github.com/klauspost/pgzip) (golang) | 43.48s | 104%
[pgzip](https://github.com/klauspost/pgzip) (klauspost) | 43.48s | 104%
But wait, since gzip decompression is inherently singlethreaded (aside from CRC calculation) how can it be more than 100% faster? Because pgzip due to its design also acts as a buffer. When using unbuffered gzip, you are also waiting for io when you are decompressing. If the gzip decoder can keep up, it will always have data ready for your reader, and you will not be waiting for input to the gzip decompressor to complete.

View File

@ -513,6 +513,19 @@ func (z *Reader) Read(p []byte) (n int, err error) {
func (z *Reader) WriteTo(w io.Writer) (n int64, err error) {
total := int64(0)
avail := z.current[z.roff:]
if len(avail) != 0 {
n, err := w.Write(avail)
if n != len(avail) {
return total, io.ErrShortWrite
}
total += int64(n)
if err != nil {
return total, err
}
z.blockPool <- z.current
z.current = nil
}
for {
if z.err != nil {
return total, z.err

2
vendor/modules.txt vendored
View File

@ -287,7 +287,7 @@ github.com/klauspost/compress/internal/cpuinfo
github.com/klauspost/compress/internal/snapref
github.com/klauspost/compress/zstd
github.com/klauspost/compress/zstd/internal/xxhash
# github.com/klauspost/pgzip v1.2.5
# github.com/klauspost/pgzip v1.2.6-0.20220930104621-17e8dac29df8
## explicit
github.com/klauspost/pgzip
# github.com/kr/pretty v0.2.1