Every now and then I make backups of my FreeBSD system’s filesystems with the venerable dump program. This is the only program that can capture all features of the UFS filesystem. The filesystems to be backed up are:
- The root filesystem.
- The /usr filesystem.
- The /var filesystem.
My user data is kept in /home, which is replicated with rsync to several other disks. This is done because of the size of this filesystem: it's just impractical to make complete dumps. It's much easier to just synchronize between two disks. I'm also not really interested in retaining different versions, since I rarely throw data away.
Because of space constraints, I tend to compress the backups. For convenience, I only use compression programs that are available in the base system. That way I know for sure that I can restore a backup from a rescue disc. This leaves the following choices:
- gzip
- bzip2
- xz
The purpose of this article is to measure the time required to do (de)compression, and the compressed size. The test data is a recent backup I made:
> du -m *.dump
126     root-0-20130305.dump
10403   usr-0-20130305.dump
281     var-0-20130305.dump
First, I’m going to compress them with gzip. Each compression is done three times to check for variations:
> time gzip -k root-0-20130305.dump
10.960u 0.094s 0:11.06 99.9%    40+2722k 0+404io 0pf+0w
> rm root*.gz; time gzip -k root-0-20130305.dump
10.930u 0.070s 0:11.00 100.0%   40+2724k 0+404io 0pf+0w
> rm root*.gz; time gzip -k root-0-20130305.dump
10.922u 0.078s 0:11.00 99.9%    40+2723k 0+404io 0pf+0w
> time gzip -k var-0-20130305.dump
9.461u 0.393s 0:09.93 99.1%     40+2722k 2251+753io 0pf+0w
> rm var*.gz; time gzip -k var-0-20130305.dump
9.478u 0.125s 0:09.61 99.7%     40+2727k 0+753io 0pf+0w
> rm var*.gz; time gzip -k var-0-20130305.dump
9.387u 0.126s 0:09.52 99.7%     40+2726k 0+753io 0pf+0w
> time gzip -k usr-0-20130305.dump
585.734u 15.911s 10:03.60 99.6% 40+2723k 83658+34195io 0pf+0w
> rm usr*.gz; time gzip -k usr-0-20130305.dump
579.348u 14.895s 9:57.49 99.4%  40+2723k 83656+34195io 0pf+0w
> rm usr*.gz; time gzip -k usr-0-20130305.dump
578.003u 14.802s 9:55.14 99.6%  40+2722k 83657+34196io 0pf+0w
The size of the data:
> du -m root*
126     root-0-20130305.dump
51      root-0-20130305.dump.gz
> du -m var*
281     var-0-20130305.dump
95      var-0-20130305.dump.gz
> du -m usr*
10403   /tmp/usr-0-20130305.dump
4276    usr-0-20130305.dump.gz
The /var filesystem is the outlier in that it has both the smallest compressed size and the fastest compression time. This filesystem is mainly filled with small text files, while the other filesystems are more mixed between text and binary files. We will therefore disregard the values for this filesystem.
The data is compressed to between 40.5% and 41.1% of its original size. The compression speed is between 11.5 and 17.7 MB/s. There is not much difference in time between the runs, so we'll skip the repeated runs from now on.
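As a sanity check, the compression ratios can be recomputed from the du figures above (sizes in MiB):

```shell
# gzip compression ratios, from the sizes reported by du -m.
awk 'BEGIN {
    printf "root: %.1f%%\n", 51 / 126 * 100       # prints "root: 40.5%"
    printf "usr:  %.1f%%\n", 4276 / 10403 * 100   # prints "usr:  41.1%"
}'
```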
The bzip2 compressor is next. This should compress better, but more slowly. Note that we are using the standard bzip2 here, not the parallel version from ports.
> time bzip2 -k root-0-20130305.dump
17.285u 0.173s 0:17.53 99.5%    35+2726k 1010+348io 3pf+0w
> time bzip2 -k var-0-20130305.dump
53.122u 0.338s 0:53.54 99.8%    35+2725k 2251+705io 0pf+0w
> time bzip2 -k usr-0-20130305.dump
1830.357u 15.787s 30:51.01 99.7%        35+2727k 83656+31339io 1pf+0w
The size of the data:
> du -m /tmp/*.bz2
44      /tmp/root-0-20130305.dump.bz2
3919    /tmp/usr-0-20130305.dump.bz2
89      /tmp/var-0-20130305.dump.bz2
The data is compressed to between 32% and 38% of its original size, but the compression speed is only between 5.7 MB/s and 7.3 MB/s. Compared to gzip, the compression time has increased significantly (by a factor of 3.1 for /usr), for a modest compression gain.
Last up is xz. This is the newest compression program to have been added to the FreeBSD base system. Earlier, I also did a comparison between it and bzip2.
> time xz -k root-0-20130305.dump
90.279u 0.259s 1:30.69 99.8%    71+2691k 1013+178io 5pf+0w
> time xz -k var-0-20130305.dump
178.502u 0.614s 2:59.22 99.9%   71+2691k 0+672io 0pf+0w
> time xz -k usr-0-20130305.dump
6175.208u 29.190s 1:43:32.49 99.8%      71+2692k 83652+26706io 1pf+0w
The size of the data:
> du -m /tmp/*.xz
23      /tmp/root-0-20130305.dump.xz
3340    /tmp/usr-0-20130305.dump.xz
85      /tmp/var-0-20130305.dump.xz
The compression ratio is the best of the three: the root dump is compressed to 18% of its original size. The /var dump, however, is only a little smaller than with bzip2. Compression is slow, at no more than 1.4 to 1.7 MB/s.
The dump program generates data at around 6-10 MB/s. Only gzip and bzip2 can keep up with this when the output of dump is piped through them. Using xz would slow down this process considerably.
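Compressing on the fly looks something like the sketch below. The dump flags are my assumption of a typical FreeBSD invocation (-L to snapshot a live filesystem, -a to bypass tape-length calculations, -u to record the dump in /etc/dumpdates, -f - to write to standard output); adjust to taste.

```shell
# Level-0 dump of /usr, compressed on the fly with gzip.
# Output filename is illustrative.
dump -0 -L -a -u -f - /usr | gzip > usr-0.dump.gz
```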
Another consideration is that the restore operation becomes more complicated. Instead of simply opening the dump file with the restore program, the output of the decompression program has to be piped into restore. This complicates matters and makes them slower.
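As a sketch of what that extra step looks like, restoring one of the gzip-compressed dumps from the examples above would mean feeding the decompressed stream to restore on standard input (run from the root of the freshly created target filesystem):

```shell
# Decompress and restore in one pipeline; restore -r rebuilds
# the filesystem from the dump it reads on stdin (-f -).
zcat root-0-20130305.dump.gz | restore -rf -
```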
Unlike earlier, when I made backups to DVD, I now use USB-connected disk drives. This has done away with the space constraints.
So my conclusion is that compressing backups is not worth the extra complexity for me anymore.
Making the dump of /usr smaller would make the whole backup process faster. Since the function of a backup is to restore a working system, I have decided to exclude some directories in /usr from the dump. Looking at the contents of this filesystem, we see
> du -cm -d 1 /usr/
1       /usr/.snap
55      /usr/bin
1       /usr/games
21      /usr/include
38      /usr/lib
1       /usr/libdata
14      /usr/libexec
5549    /usr/local
2338    /usr/obj
2206    /usr/ports
28      /usr/sbin
52      /usr/share
1539    /usr/src
11867   /usr/
11867   total
Looking at this list, I decided to set the nodump flag on /usr/obj and /usr/ports, because neither is necessary for getting a system up and running. Using portsnap, one can easily re-populate the ports tree, and /usr/obj is only used for OS rebuilds. I'm explicitly not excluding /usr/local, because that basically contains all installed ports, which is very convenient. I'm still on the fence about excluding /usr/src. On the one hand it is easy to download via subversion; on the other hand I like to keep the source that built the system on hand. So it stays in for now.
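Setting the flag is a one-liner with chflags. One caveat from dump(8): by default the nodump flag is honored only at dump level 1 and above, so a level-0 dump needs -h 0 to honor it as well. A sketch (the other dump flags and the output path are illustrative):

```shell
# Exclude these trees from future dumps (sets UF_NODUMP recursively).
chflags -R nodump /usr/obj /usr/ports

# Level-0 dumps ignore nodump unless the honor level is lowered (-h 0).
dump -0 -h 0 -L -a -u -f /backup/usr-0.dump /usr
```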