Roland's homepage

My random knot in the Web

ImageMagick: convert vs Wand

The ImageMagick suite has been in my software toolbox for years. It is my go-to tool for manipulating bitmap images. Over the years I have written several front-ends for specific tasks in Python.

In general, I have used the subprocess module to launch convert or mogrify from Python.

With the release of Wand 0.5.0, which supports ImageMagick 7, I decided to try it by porting one of my scripts (foto4lb). This turned out to be slower than using convert directly.

But now it is 2021. Py-wand is at 0.6.7 and it is time to try again.

What foto4lb basically does is take one or more directories of images and, in each, create a subdirectory that contains shrunken versions of the images. It sets the modification time of those images to the time the photo was taken according to the EXIF metadata. It uses concurrent.futures to run multiple conversions in parallel.
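The modification-time step can be sketched as follows. This is a minimal illustration, not the code from foto4lb: the helper names exif_to_timestamp and set_mtime are made up for this example, and it assumes the EXIF date has already been read as a string in the usual "YYYY:MM:DD HH:MM:SS" form and should be interpreted as local time.

```python
import os
from datetime import datetime


def exif_to_timestamp(exif_date):
    """Convert an EXIF date string such as '2021:09:18 14:03:22' to a POSIX timestamp.

    EXIF dates carry no time zone; this sketch assumes local time.
    """
    dt = datetime.strptime(exif_date, "%Y:%m:%d %H:%M:%S")
    # timestamp() on a naive datetime interprets it as local time.
    return dt.timestamp()


def set_mtime(path, exif_date):
    """Set the access and modification time of path to the EXIF date."""
    stamp = exif_to_timestamp(exif_date)
    os.utime(path, (stamp, stamp))
```

Calling set_mtime(oname, date_string) after the shrunken image has been written would give the copy the time the photo was taken.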

Porting

Converting the subprocess call of convert to manipulations of a wand.image.Image instance is pretty straightforward with the aid of the Wand documentation. Basically, I replaced

args = [
    'convert', fname, '-strip', '-resize',
    str(newwidth), '-units', 'PixelsPerInch', '-density', '300', '-unsharp',
    '2x0.5+0.7+0', '-quality', '80', oname
]
rp = subprocess.call(args)

with

with Image(filename=fname) as img:
    scale = newwidth/img.width
    newheight = int(round(img.height * scale, 0))
    img.strip()
    img.resize(width=newwidth, height=newheight)
    img.units = 'pixelsperinch'
    img.resolution = (300, 300)
    img.unsharp_mask(radius=2, sigma=0.5, amount=0.7, threshold=0)
    img.compression_quality = 80  # Wand's counterpart of convert's -quality
    img.save(filename=oname)

The biggest differences are:

  1. Wand supports reading metadata like EXIF tags. In the subprocess version of the program I used Pillow for that.
  2. The Wand version uses a ProcessPoolExecutor, while the original uses a ThreadPoolExecutor combined with subprocess.call.
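The two executor patterns can be sketched as follows. This is an illustration under assumptions, not the code of either program: the function names are hypothetical, and square is only a stand-in for in-process work such as resizing an image with Wand. A thread that sits in subprocess.call is mostly idle while the external program runs, so a thread pool is sufficient for launching convert. In the process-pool variant, the work itself is done inside the worker processes.

```python
import subprocess
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def run_command(args):
    """Run one external command and return its exit status."""
    return subprocess.call(args)


def square(n):
    """Stand-in for work done inside the Python process (hypothetical)."""
    return n * n


def run_in_threads(commands, workers=4):
    """Launch external commands from a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_command, commands))


def run_in_processes(func, items, workers=4):
    """Apply a picklable, module-level function to items in a process pool."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))
```

In the thread variant each element of commands would be an argument list like the one passed to subprocess.call above. In the process variant func has to be defined at module level so that it can be pickled and sent to the workers.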

According to cloc, the version using Wand has 117 lines of code compared to 123 for the subprocess version. The difference is mostly due to the subprocess version's more involved handling of metadata and its check that convert is actually available. Both programs have been formatted using black.

Performance

For performance testing I ran both programs on a directory with thirty-five images from my smartphone camera. The time utility was used to log the run times.

  • CPU: Intel i7-7700
  • OS: FreeBSD 13-STABLE
  • Python: 3.9.7
  • Wand: 0.6.7
  • Image size: 3840x2160
  • Number of images: 35

The subprocess-based program yielded the following results:

7.14 real        26.30 user         1.63 sys
6.95 real        25.57 user         1.78 sys
7.00 real        25.90 user         1.69 sys
7.02 real        26.14 user         1.43 sys
7.01 real        25.98 user         1.67 sys
7.03 real        26.08 user         1.69 sys
7.04 real        25.96 user         1.59 sys
6.99 real        26.01 user         1.48 sys
7.03 real        26.01 user         1.66 sys
7.02 real        25.93 user         1.55 sys

On average:

7.02 real        25.99 user         1.62 sys

The times for the Wand-based version were:

6.29 real        24.00 user         0.37 sys
6.13 real        23.56 user         0.37 sys
6.20 real        23.61 user         0.40 sys
6.12 real        23.42 user         0.41 sys
6.39 real        24.49 user         0.34 sys
6.17 real        23.66 user         0.32 sys
6.37 real        24.50 user         0.38 sys
6.28 real        24.15 user         0.32 sys
6.25 real        23.94 user         0.29 sys
6.21 real        23.80 user         0.40 sys

This averages to:

6.24 real        23.91 user         0.36 sys
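The averages can be reproduced from the raw lines with a few lines of Python. The function name average_times is made up for this sketch, and it assumes every line has the "N real N user N sys" layout shown above.

```python
def average_times(lines):
    """Average the real, user and sys columns of time(1) output lines."""
    totals = {"real": 0.0, "user": 0.0, "sys": 0.0}
    count = 0
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # Skip blank or malformed lines.
        # Fields alternate between a value and its label.
        for value, label in zip(fields[0::2], fields[1::2]):
            totals[label] += float(value)
        count += 1
    if count == 0:
        raise ValueError("no timing lines found")
    # Round to two decimals, like the figures quoted in the text.
    return {label: round(total / count, 2) for label, total in totals.items()}
```

Fed the ten lines of either run, it returns a dictionary with the averaged real, user and sys times.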

Next I used time.monotonic to measure how long the code that does the actual processing of a single image takes. For the Wand version, this was on average 0.68 seconds. For the version using convert it was 0.76 seconds on average. So in these tests, using convert takes around 12% longer per image than using Wand (0.76 / 0.68 ≈ 1.12).
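A per-call measurement of this kind can be sketched as follows. The helper name timed is hypothetical and this is not the instrumentation actually used in foto4lb. It only shows the pattern of reading time.monotonic before and after the call.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed seconds) using a monotonic clock."""
    start = time.monotonic()
    result = func(*args, **kwargs)
    elapsed = time.monotonic() - start
    return result, elapsed
```

A monotonic clock is a good fit for this because it is not affected by adjustments of the system clock during the measurement.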

Conclusion

The Wand-based version is now faster than the one running convert in a subprocess. Not only does it spend less time running userspace code (23.91 versus 25.99 seconds on average), but the time spent in system calls in particular is much lower (0.36 versus 1.62 seconds).


For comments, please send me an e-mail.

