Evaluating Zstandard compression
Recently I became aware of the zstd compression program. I wanted to see how
it stacks up against gzip, bzip2 and xz.
zstd is blisteringly fast.
xz still yields the smallest files.
The tests are done on an otherwise idle machine. The test machine is from 2009 and has a Core 2 Quad CPU and a regular SATA hard disk. So not the latest and greatest, but no slouch.
All programs are used with their default settings and single-threaded.
The first demo file is a 46 MiB mail log file. This is plain text, so it should compress well.
First let’s compress the file and look at execution times. Each test is run three times. The last run is shown below. After that we look at the file sizes.
    > /usr/bin/time gzip -k maillog.txt
            1.15 real        1.12 user        0.03 sys
    > /usr/bin/time bzip2 -k maillog.txt
           11.24 real       11.21 user        0.02 sys
    > /usr/bin/time xz -k maillog.txt
           22.55 real       22.43 user        0.11 sys
    > /usr/bin/time zstd -q -k maillog.txt
            0.29 real        0.26 user        0.03 sys
    > du maillog.txt*|sort -rn
    46880 maillog.txt
    5184 maillog.txt.gz
    4736 maillog.txt.zst
    3584 maillog.txt.bz2
    3392 maillog.txt.xz
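The procedure described above (run each compressor three times, report the last run) can be sketched in Python. This is only an illustration under stated assumptions, not how the numbers in this article were produced: it uses the `gzip`, `bz2` and `lzma` modules from the standard library instead of the command-line programs, their default levels are not necessarily the same as those of the programs (`gzip.compress` defaults to level 9, for example), and zstd is left out because older Python versions ship no module for it. The helper names `time_last_of` and `benchmark` and the sample data are made up for this sketch.

```python
import bz2
import gzip
import lzma
import time


def time_last_of(func, data, runs=3):
    """Call func(data) `runs` times; return (seconds, result) of the last run."""
    elapsed, result = 0.0, None
    for _ in range(runs):
        start = time.perf_counter()
        result = func(data)
        elapsed = time.perf_counter() - start
    return elapsed, result


def benchmark(data, runs=3):
    """Return {name: (seconds, compressed_size_in_bytes)} for the stdlib codecs."""
    codecs = {
        "gzip": gzip.compress,   # library default level, not the CLI default
        "bzip2": bz2.compress,
        "xz": lzma.compress,
    }
    results = {}
    for name, compress in codecs.items():
        seconds, blob = time_last_of(compress, data, runs)
        results[name] = (seconds, len(blob))
    return results


if __name__ == "__main__":
    # Synthetic stand-in for a log file; this is NOT the mail log from the article.
    sample = b"Jan  1 00:00:00 host postfix/smtpd[123]: connect from unknown\n" * 5000
    for name, (seconds, size) in benchmark(sample).items():
        print(f"{name:6s} {seconds:8.3f} s {size:10d} bytes")
```

Timing in-process library calls leaves out process start-up and disk I/O, so the results are not directly comparable with `/usr/bin/time` on the real programs.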
The speed of zstd is really amazing. The compression of zstd does not
live up to that of xz or bzip2. But it beats gzip.
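To put the file sizes in perspective, the `du` figures for the mail log can be turned into percentages of the original size. The sketch below assumes the `du` output is in KiB, which fits the stated 46 MiB for the original file; the function names are made up for this example.

```python
# Sizes in KiB, copied from the `du` listing for the mail log.
ORIGINAL_KIB = 46880
COMPRESSED_KIB = {"gzip": 5184, "zstd": 4736, "bzip2": 3584, "xz": 3392}


def percent_of_original(compressed_kib, original_kib):
    """Compressed size as a percentage of the original size."""
    return 100.0 * compressed_kib / original_kib


def size_report(original_kib, compressed_kib):
    """Return (name, percent) pairs, largest output first."""
    rows = [(name, percent_of_original(size, original_kib))
            for name, size in compressed_kib.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


for name, percent in size_report(ORIGINAL_KIB, COMPRESSED_KIB):
    print(f"{name:6s} {percent:5.1f}% of original")
```

With these numbers zstd ends up at roughly 10% of the original size, against roughly 11% for gzip and a bit over 7% for xz.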
Next is a tar file which contains the source code of a large TeX document with
all its images, graphs et cetera complete with the complete
This is a mix of text and binary files. It will probably not compress terribly well.
Again, we observe compression times first, then file sizes.
    > /usr/bin/time gzip -k backup-logbook2016.tar
            6.90 real        6.77 user        0.12 sys
    > /usr/bin/time bzip2 -k backup-logbook2016.tar
           33.47 real       33.25 user        0.18 sys
    > /usr/bin/time xz -k backup-logbook2016.tar
           67.58 real       66.90 user        0.65 sys
    > /usr/bin/time zstd -q -k backup-logbook2016.tar
            1.67 real        1.03 user        0.16 sys
    > du backup-logbook2016.tar*|sort -rn
    133056 backup-logbook2016.tar
    125440 backup-logbook2016.tar.bz2
    125152 backup-logbook2016.tar.gz
    124608 backup-logbook2016.tar.zst
    121856 backup-logbook2016.tar.xz
As expected, it is difficult to compress this file significantly. Interestingly,
zstd does well here, reaching second place. It is again the
fastest by far.
The next test is the tarball for
gcc-4.9.4, which is first decompressed from its
bzip2 form. Given the huge size of this tarball, this test is only run once.
    > /usr/bin/time gzip -k gcc-4.9.4.tar
           25.79 real       25.41 user        0.36 sys
    > /usr/bin/time bzip2 -k gcc-4.9.4.tar
           80.68 real       80.24 user        0.40 sys
    > /usr/bin/time xz -k gcc-4.9.4.tar
          291.59 real      290.14 user        1.37 sys
    > /usr/bin/time zstd -q -k gcc-4.9.4.tar
            6.15 real        5.73 user        0.40 sys
    > du gcc-4.9.4.tar*|sort -rn
    566816 gcc-4.9.4.tar
    114336 gcc-4.9.4.tar.gz
    108384 gcc-4.9.4.tar.zst
    88032 gcc-4.9.4.tar.bz2
    69952 gcc-4.9.4.tar.xz
zstd shows amazing speed and better compression than gzip.
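The speed difference is easier to grasp as throughput. The sketch below divides the size of the uncompressed gcc tarball by the wall-clock ("real") times from the listing. It assumes that `du` reports KiB; the function names are invented for this example.

```python
# Input size in KiB (from `du`) and wall-clock seconds (the "real" column).
GCC_TAR_KIB = 566816
REAL_SECONDS = {"gzip": 25.79, "bzip2": 80.68, "xz": 291.59, "zstd": 6.15}


def throughput_mib_per_s(input_kib, seconds):
    """Uncompressed input consumed per second, in MiB/s."""
    return (input_kib / 1024.0) / seconds


def speedup_over(baseline, seconds_by_tool):
    """How many times faster each tool is than `baseline`."""
    base = seconds_by_tool[baseline]
    return {name: base / secs for name, secs in seconds_by_tool.items()}


# Fastest tool first.
for name, secs in sorted(REAL_SECONDS.items(), key=lambda kv: kv[1]):
    print(f"{name:6s} {throughput_mib_per_s(GCC_TAR_KIB, secs):6.1f} MiB/s")
```

On this machine that works out to about 90 MiB/s for zstd at its default level, a little over four times the rate of gzip, while xz manages just under 2 MiB/s.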
Zstd with maximum compression level
Let’s see what happens when we use
zstd with its maximum regular compression level, 19.
    > /usr/bin/time zstd -q -k -19 maillog.txt
           23.43 real       23.35 user        0.07 sys
    > /usr/bin/time zstd -q -k -19 backup-logbook2016.tar
           46.77 real       46.39 user        0.33 sys
    > /usr/bin/time zstd -q -k -19 gcc-4.9.4.tar
          252.21 real      251.68 user        0.45 sys
    > du *.zst|sort -rn
    122016 backup-logbook2016.tar.zst
    74048 gcc-4.9.4.tar.zst
    3936 maillog.txt.zst
The speed is much reduced in this case. The compressed files are smaller than
those made with the standard settings. But they don’t beat xz.
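How `zstd -19` compares with `xz` at its default settings can be worked out from the listings in this article. The sketch below computes, per file, the ratio of the wall-clock times and how much larger the zstd output is. The dictionaries simply repeat the measured values; the function name `compare` is made up for this example.

```python
# (real seconds, size in KiB) per file, copied from the listings.
XZ_DEFAULT = {
    "maillog.txt": (22.55, 3392),
    "backup-logbook2016.tar": (67.58, 121856),
    "gcc-4.9.4.tar": (291.59, 69952),
}
ZSTD_19 = {
    "maillog.txt": (23.43, 3936),
    "backup-logbook2016.tar": (46.77, 122016),
    "gcc-4.9.4.tar": (252.21, 74048),
}


def compare(candidate, reference):
    """Per file: (time ratio, percent size increase) of candidate vs reference."""
    out = {}
    for name, (ref_s, ref_kib) in reference.items():
        cand_s, cand_kib = candidate[name]
        time_ratio = cand_s / ref_s
        extra_percent = 100.0 * (cand_kib - ref_kib) / ref_kib
        out[name] = (time_ratio, extra_percent)
    return out


for name, (ratio, extra) in compare(ZSTD_19, XZ_DEFAULT).items():
    print(f"{name:24s} time x{ratio:4.2f}  size +{extra:5.2f}%")
```

The zstd output is larger than the xz output for every file, from about 0.1% for the TeX backup to about 16% for the mail log. In return, `zstd -19` is quicker than `xz` on two of the three files.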
It is probably time to retire
gzip. In all test cases
zstd in its default
settings compresses much faster and yields a smaller file.
With regard to file size,
xz is still king of the hill. At its best,
zstd can come close to the performance of xz,
but the latter still has the edge on file size.