Should system backups be compressed?
Every now and then I make backups of my FreeBSD system’s filesystems with the
dump program. This is the only program that can capture
all features of the UFS filesystem. The filesystems to be backed up are:
- The root filesystem.
- /usr
- /var
My user data is kept in /home, which is replicated to several other
disks rather than dumped. This is done because of the size of this
filesystem: it’s simply impractical to make complete dumps, and much
easier to synchronize between two disks. I’m also not really
interested in retaining different versions, since I rarely throw
anything away.
Because of the space constraints I originally had, I tend to compress the backups. For convenience, I only use compression programs that are available in the base system. That way I know for sure that I can restore a backup from a rescue disc. This leaves the following choices:
- gzip
- bzip2
- xz
The purpose of this article is to measure the time required to do (de)compression, and the compressed size. The test data is a recent backup I made:
> du -m *.dump
126     root-0-20130305.dump
10403   usr-0-20130305.dump
281     var-0-20130305.dump
First, I’m going to compress them with
gzip. Each compression is
done three times to check for variations:
> time gzip -k root-0-20130305.dump
10.960u 0.094s 0:11.06 99.9% 40+2722k 0+404io 0pf+0w
> rm root*.gz; time gzip -k root-0-20130305.dump
10.930u 0.070s 0:11.00 100.0% 40+2724k 0+404io 0pf+0w
> rm root*.gz; time gzip -k root-0-20130305.dump
10.922u 0.078s 0:11.00 99.9% 40+2723k 0+404io 0pf+0w
> time gzip -k var-0-20130305.dump
9.461u 0.393s 0:09.93 99.1% 40+2722k 2251+753io 0pf+0w
> rm var*.gz; time gzip -k var-0-20130305.dump
9.478u 0.125s 0:09.61 99.7% 40+2727k 0+753io 0pf+0w
> rm var*.gz ; time gzip -k var-0-20130305.dump
9.387u 0.126s 0:09.52 99.7% 40+2726k 0+753io 0pf+0w
> time gzip -k usr-0-20130305.dump
585.734u 15.911s 10:03.60 99.6% 40+2723k 83658+34195io 0pf+0w
> rm usr*.gz ; time gzip -k usr-0-20130305.dump
579.348u 14.895s 9:57.49 99.4% 40+2723k 83656+34195io 0pf+0w
> rm usr*.gz ; time gzip -k usr-0-20130305.dump
578.003u 14.802s 9:55.14 99.6% 40+2722k 83657+34196io 0pf+0w
The size of the data:
> du -m root*
126     root-0-20130305.dump
51      root-0-20130305.dump.gz
> du -m var*
281     var-0-20130305.dump
95      var-0-20130305.dump.gz
> du -m usr*
10403   /tmp/usr-0-20130305.dump
4276    usr-0-20130305.dump.gz
The /var filesystem is the outlier in that it has both the
smallest compressed size relative to the original and the fastest
compression. This filesystem is mainly filled with small text files,
while the other filesystems are more mixed between text and binary
files. We will therefore disregard the values for this filesystem.
The data is compressed to between 40.4% and 41.1% of its original size. The compression speed is between 11.5 MB/s and 17.7 MB/s. There is not much difference in time between the runs, so we’ll skip the multiple runs from now on.
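The percentages and speeds quoted here are simple divisions of the du and time figures. A minimal sketch of that arithmetic follows, assuming awk is available (it is in the base system); the function names ratio and throughput are made up for this illustration. Because du -m rounds to whole megabytes, the results can differ by a tenth or so from the values in the text.

```shell
#!/bin/sh
# ratio ORIGINAL_MB COMPRESSED_MB
# Prints the compressed size as a percentage of the original size.
ratio() {
    awk -v o="$1" -v c="$2" 'BEGIN { printf "%.1f\n", 100 * c / o }'
}

# throughput ORIGINAL_MB SECONDS
# Prints the compression speed in MB/s of uncompressed input.
throughput() {
    awk -v o="$1" -v s="$2" 'BEGIN { printf "%.1f\n", o / s }'
}

# Root dump compressed with gzip: 126 MB became 51 MB in 11.06 s.
ratio 126 51
throughput 126 11.06
```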
The bzip2 compressor is next. This should compress better but
slower. Note that we are using the standard bzip2 here, not the
parallel version from ports.
> time bzip2 -k root-0-20130305.dump
17.285u 0.173s 0:17.53 99.5% 35+2726k 1010+348io 3pf+0w
> time bzip2 -k var-0-20130305.dump
53.122u 0.338s 0:53.54 99.8% 35+2725k 2251+705io 0pf+0w
> time bzip2 -k usr-0-20130305.dump
1830.357u 15.787s 30:51.01 99.7% 35+2727k 83656+31339io 1pf+0w
The size of the data:
> du -m /tmp/*.bz2
44      /tmp/root-0-20130305.dump.bz2
3919    /tmp/usr-0-20130305.dump.bz2
89      /tmp/var-0-20130305.dump.bz2
The data is compressed to between 32% and 38% of its original
size. But the compression speed is only between 5.7 MB/s and 7.3 MB/s.
The compression time has increased significantly compared to gzip
(by a factor of 3.1 for usr), for a modest compression gain.
xz is the newest compression program to have been added to the FreeBSD base system. Earlier, I also compared it with bzip2.
> time xz -k root-0-20130305.dump
90.279u 0.259s 1:30.69 99.8% 71+2691k 1013+178io 5pf+0w
> time xz -k var-0-20130305.dump
178.502u 0.614s 2:59.22 99.9% 71+2691k 0+672io 0pf+0w
> time xz -k usr-0-20130305.dump
6175.208u 29.190s 1:43:32.49 99.8% 71+2692k 83652+26706io 1pf+0w
The size of the data:
> du -m /tmp/*.xz
23      /tmp/root-0-20130305.dump.xz
3340    /tmp/usr-0-20130305.dump.xz
85      /tmp/var-0-20130305.dump.xz
The compression ratio is the best of all. The root dump is compressed
to 18% of its original size. The /var dump is only a little smaller
than with bzip2. The compression is slow, though: between 1.4 MB/s
and 1.7 MB/s.
The dump program generates around 6-10 MB/s of data. Only gzip and
bzip2 can keep up with this when the output of the dump is
piped through them. Using xz would slow down this process
considerably.
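Piping a dump through a compressor could look roughly like the sketch below. It is an illustration under assumptions, not the exact command used for these backups: compressed_dump is a hypothetical helper name, the dump options are the ones I know from the FreeBSD dump(8) manual page, and the output path in the example is made up.

```shell
#!/bin/sh
# Sketch: stream a level 0 dump of a filesystem through a compressor.
# compressed_dump is a hypothetical helper, not part of the base system.
# Usage: compressed_dump FILESYSTEM OUTFILE [COMPRESSOR]
compressed_dump() {
    # -0    full (level 0) dump
    # -a    do not estimate tape size, write until end of media
    # -L    take a snapshot first, for a live filesystem
    # -f -  write the dump to standard output
    # The compressor defaults to gzip and reads the dump from the pipe.
    dump -0 -a -L -f - "$1" | "${3:-gzip}" > "$2"
}

# Example, to be run as root on FreeBSD (the path is an assumption):
#   compressed_dump / /mnt/backup/root-0.dump.gz gzip
```

Note that the exit status of the pipeline is that of the compressor, so a failing dump would go unnoticed unless its messages on standard error are checked.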
Another consideration is that the restore operation becomes more
complicated. Instead of just using the
restore program to open the
dump file, the output of the decompression program should be piped
into the restore program. This complicates matters and makes them slower.
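The extra step on the restore side can be sketched as follows. This is a minimal illustration: restore_compressed is a hypothetical helper name, and it assumes the current directory is the root of the freshly created filesystem that should receive the data, which is what restore -r expects.

```shell
#!/bin/sh
# Sketch: feed a compressed dump file to restore via a pipe.
# restore_compressed is a hypothetical helper, not part of the base system.
# Usage: restore_compressed DUMPFILE [DECOMPRESSOR]
restore_compressed() {
    # -d decompresses and -c writes to standard output; gzip, bzip2
    # and xz all accept these two options.
    # restore -r rebuilds the filesystem in the current directory,
    # and -f - makes it read the dump from standard input.
    "${2:-gzip}" -dc "$1" | restore -r -f -
}

# Example, run from the root of the new, empty filesystem:
#   restore_compressed /mnt/backup/root-0.dump.gz gzip
```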
As opposed to earlier, when I made backups to DVD, I now use USB-connected disk drives. This has done away with the space constraints.
So my conclusion is that compressing backups is not worth the extra complexity for me anymore.
Making the dump of /usr smaller would make the whole backup
process faster. Since the function of a backup is to restore a working
system, I have decided to exclude some directories from
the dump. Looking at the contents of this filesystem, we see:
> du -cm -d 1 /usr/
1       /usr/.snap
55      /usr/bin
1       /usr/games
21      /usr/include
38      /usr/lib
1       /usr/libdata
14      /usr/libexec
5549    /usr/local
2338    /usr/obj
2206    /usr/ports
28      /usr/sbin
52      /usr/share
1539    /usr/src
11867   /usr/
11867   total
Looking at this list, I decided to set the nodump flag on
/usr/obj and /usr/ports, because neither is necessary
for getting a system up and running. Using portsnap one can easily
re-populate the ports tree. And /usr/obj is only used for OS rebuilds.
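Setting the flag could be sketched as below. The helper name exclude_from_dump is made up for illustration. One caveat, as far as I understand the dump(8) manual page: by default the nodump flag is only honored for dumps of level 1 and above, so a level 0 dump needs -h 0 for the flag to have any effect. The dump file name in the example is an assumption.

```shell
#!/bin/sh
# Sketch: mark directories so that dump skips them.
# exclude_from_dump is a hypothetical helper, not part of the base system.
# Usage: exclude_from_dump DIRECTORY...
exclude_from_dump() {
    # chflags(1) sets the user "nodump" flag on every path given.
    chflags nodump "$@"
}

# Example, to be run as root on FreeBSD:
#   exclude_from_dump /usr/obj /usr/ports
#   ls -ldo /usr/obj /usr/ports    # -o lists the file flags
#   dump -0 -h 0 -a -L -f usr-0.dump /usr
```

The flag can be cleared again with chflags dump on the same paths.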
I’m explicitly not excluding /usr/local, because that basically
contains all installed ports, which is very convenient. I’m still on
the fence about excluding /usr/src. On the one hand it is easy to
download via Subversion, on the other hand I like to keep the source
that built the system on hand. So it stays in for now.
For comments, please send me an e-mail.