Roland's homepage

My random knot in the Web

Evaluating Zstandard compression

Recently I became aware of the zstd compression program. I wanted to see how it stacks up against gzip, bzip2 and xz.

TL;DR

Compression with zstd is blisteringly fast.

But xz still yields the smallest files.

Test setup

The tests are done on an otherwise idle machine. The test machine is from 2009 and has a Core 2 Quad CPU and a regular SATA hard disk. So not the latest and greatest, but no slouch.

All programs are used with their default settings and single-threaded.
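For repeatable measurements, runs like these can be scripted. What follows is a minimal Python sketch and not the procedure used for the numbers in this post: `time_runs` and `COMMANDS` are names made up for illustration, and only wall-clock time is measured (the "real" column of /usr/bin/time). The command lines and flags are the ones used in the tests in this post.

```python
import subprocess
import sys
import time
from pathlib import Path

# Compressor command lines as used in this post (default settings),
# paired with each tool's default output suffix.
COMMANDS = {
    "gzip": (["gzip", "-k"], ".gz"),
    "bzip2": (["bzip2", "-k"], ".bz2"),
    "xz": (["xz", "-k"], ".xz"),
    "zstd": (["zstd", "-q", "-k"], ".zst"),
}


def time_runs(cmd, output=None, runs=3):
    """Run cmd several times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        if output is not None:
            # Remove the result of a previous run so it is not in the way.
            Path(output).unlink(missing_ok=True)
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        timings.append(time.perf_counter() - start)
    return timings


def benchmark(path, runs=3):
    """Time every compressor on path; keep the last run, as in this post."""
    results = {}
    for name, (argv, suffix) in COMMANDS.items():
        results[name] = time_runs(argv + [path], path + suffix, runs)[-1]
    return results


# Smoke test with a command that exists wherever Python does.
print(time_runs([sys.executable, "-c", "pass"]))
```

Calling `benchmark("maillog.txt")` would then return one wall-clock figure per compressor, provided all four programs are installed.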

Plain text

The first demo file is a 46 MiB mail log file. This is plain text, so it should compress well.

First let’s compress the file and look at execution times. Each test is run three times. The last run is shown below. After that we look at the file sizes.

> /usr/bin/time gzip -k maillog.txt
        1.15 real         1.12 user         0.03 sys
> /usr/bin/time bzip2 -k maillog.txt
       11.24 real        11.21 user         0.02 sys
> /usr/bin/time xz -k maillog.txt
       22.55 real        22.43 user         0.11 sys
> /usr/bin/time zstd -q -k maillog.txt
        0.29 real         0.26 user         0.03 sys
> du maillog.txt*|sort -rn
46880   maillog.txt
5184    maillog.txt.gz
4736    maillog.txt.zst
3584    maillog.txt.bz2
3392    maillog.txt.xz

The speed of zstd is really amazing: it is about four times faster than gzip and almost 80 times faster than xz. Its compression does not live up to xz and bzip2, but it beats gzip comfortably.
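To put numbers on this, the figures above can be turned into compression ratios and throughput. The sketch below assumes that the du figures are in KiB (which matches the stated 46 MiB) and uses the "real" times; `summarize` is just an illustrative helper.

```python
# Size of the uncompressed mail log as reported by du (assumed to be KiB).
ORIGINAL_KIB = 46880

# name: (compressed size in du units, "real" seconds), copied from above.
RESULTS = {
    "gzip": (5184, 1.15),
    "bzip2": (3584, 11.24),
    "xz": (3392, 22.55),
    "zstd": (4736, 0.29),
}


def summarize(original_kib, results):
    """Return (name, compression ratio, input MiB per second) per tool."""
    rows = []
    for name, (size, seconds) in results.items():
        ratio = original_kib / size
        speed = original_kib / 1024 / seconds
        rows.append((name, ratio, speed))
    return rows


for name, ratio, speed in summarize(ORIGINAL_KIB, RESULTS):
    print(f"{name:6s} ratio {ratio:5.2f}  {speed:7.1f} MiB/s")
```

The ratio does not depend on the unit assumption; only the MiB/s column does.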

Mixed tarball

Next is a tar file which contains the source code of a large TeX document with all its images, graphs et cetera, including the complete git history. This is a mix of text and binary files. It will probably not compress terribly well.

Again, we observe compression times first, then file sizes.

> /usr/bin/time gzip -k backup-logbook2016.tar
        6.90 real         6.77 user         0.12 sys
> /usr/bin/time bzip2 -k backup-logbook2016.tar
       33.47 real        33.25 user         0.18 sys
> /usr/bin/time xz -k backup-logbook2016.tar
       67.58 real        66.90 user         0.65 sys
> /usr/bin/time zstd -q -k backup-logbook2016.tar
        1.67 real         1.03 user         0.16 sys
> du backup-logbook2016.tar*|sort -rn
133056    backup-logbook2016.tar
125440    backup-logbook2016.tar.bz2
125152    backup-logbook2016.tar.gz
124608    backup-logbook2016.tar.zst
121856    backup-logbook2016.tar.xz

As expected, it is difficult to compress this file significantly. Interestingly, zstd does well here, taking second place on file size behind xz. It is again the fastest by far.

Code tarball

The next test is the tarball for gcc-4.9.4, which is first decompressed from its bzip2-compressed form. Given the huge size of this tarball, this test is only run once.

> /usr/bin/time gzip -k gcc-4.9.4.tar
       25.79 real        25.41 user         0.36 sys
> /usr/bin/time bzip2 -k gcc-4.9.4.tar
       80.68 real        80.24 user         0.40 sys
> /usr/bin/time xz -k gcc-4.9.4.tar
      291.59 real       290.14 user         1.37 sys
> /usr/bin/time zstd -q -k gcc-4.9.4.tar
        6.15 real         5.73 user         0.40 sys
> du gcc-4.9.4.tar*|sort -rn
566816    gcc-4.9.4.tar
114336    gcc-4.9.4.tar.gz
108384    gcc-4.9.4.tar.zst
88032     gcc-4.9.4.tar.bz2
69952     gcc-4.9.4.tar.xz

Again zstd shows amazing speed and better compression than gzip.

Zstd with maximum compression level

Let’s see what happens when we use zstd with its maximum regular compression setting.

> /usr/bin/time zstd -q -k -19 maillog.txt
       23.43 real        23.35 user         0.07 sys
> /usr/bin/time zstd -q -k -19 backup-logbook2016.tar
       46.77 real        46.39 user         0.33 sys
> /usr/bin/time zstd -q -k -19 gcc-4.9.4.tar
      252.21 real       251.68 user         0.45 sys
> du *.zst|sort -rn
122016   backup-logbook2016.tar.zst
74048    gcc-4.9.4.tar.zst
3936     maillog.txt.zst

The speed is much reduced in this case; for the mail log, zstd is even slightly slower than xz. The compressed files are smaller than those made with the standard settings, but they still don’t beat xz.
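How close zstd -19 gets to xz differs per file. A small sketch that computes the gap from the du figures and "real" times listed in this post (`gap_to_xz` and `RUNS` are illustrative names; sizes are in du units as printed):

```python
# (zstd -19 size, zstd -19 seconds, xz size, xz seconds), copied from the
# du and /usr/bin/time output in this post.
RUNS = {
    "maillog.txt": (3936, 23.43, 3392, 22.55),
    "backup-logbook2016.tar": (122016, 46.77, 121856, 67.58),
    "gcc-4.9.4.tar": (74048, 252.21, 69952, 291.59),
}


def gap_to_xz(zstd_size, zstd_secs, xz_size, xz_secs):
    """Return (percent larger than xz, zstd time as a fraction of xz time)."""
    larger = 100.0 * (zstd_size - xz_size) / xz_size
    return larger, zstd_secs / xz_secs


for name, run in RUNS.items():
    larger, time_frac = gap_to_xz(*run)
    print(f"{name}: {larger:.1f}% larger than xz, {time_frac:.2f}x the time")
```

With these numbers the mail log comes out about 16% larger than the xz result, while the mixed tarball is only about 0.1% larger.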

Conclusion

It is probably time to retire gzip. In all three test cases zstd at its default settings compresses about four times faster than gzip and yields a smaller file.
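That claim can be checked directly against the measurements. A minimal sketch using the default-setting results from the three tests (`TESTS` and `compare` are illustrative names):

```python
# (gzip size, gzip seconds, zstd size, zstd seconds) at default settings,
# copied from the du and /usr/bin/time output in this post.
TESTS = {
    "maillog.txt": (5184, 1.15, 4736, 0.29),
    "backup-logbook2016.tar": (125152, 6.90, 124608, 1.67),
    "gcc-4.9.4.tar": (114336, 25.79, 108384, 6.15),
}


def compare(gz_size, gz_secs, zs_size, zs_secs):
    """Return (speedup of zstd over gzip, percent size saved versus gzip)."""
    return gz_secs / zs_secs, 100.0 * (gz_size - zs_size) / gz_size


for name, row in TESTS.items():
    speedup, saved = compare(*row)
    print(f"{name}: {speedup:.1f}x faster, {saved:.1f}% smaller than gzip")
```

The speedup is around a factor of four in each case, while the size advantage ranges from under half a percent for the mixed tarball to almost nine percent for the mail log.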

With regard to file size, xz is still king of the hill. At its maximum regular compression level (-19), zstd can come close to xz, but the latter still has the edge on file size.

