The ImageMagick suite has been in my software toolbox for years. It is my go-to tool for manipulating bitmap images. Over the years I have written several front-ends for specific tasks in Python.
In general, I have used the subprocess module to launch
mogrify from Python.
What foto4lb basically does is take one or more directories of images and
create a subdirectory that contains shrunken versions of the images. It sets
the modification time for those images to the time the photo was taken
according to the EXIF metadata. It uses
concurrent.futures to run multiple
conversions in parallel.
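Setting the modification time from the EXIF data can be sketched roughly as follows. This is not the code from foto4lb: the function name `set_mtime_from_exif` is hypothetical, and the sketch assumes the timestamp is already available as a string in the usual EXIF `YYYY:MM:DD HH:MM:SS` form.

```python
import os
from datetime import datetime


def set_mtime_from_exif(path, exif_datetime):
    """Set the access and modification time of path to an EXIF timestamp.

    exif_datetime is assumed to look like '2015:03:14 09:26:53'.
    The value is interpreted as local time, because these EXIF
    date/time strings carry no time zone.
    """
    taken = datetime.strptime(exif_datetime, '%Y:%m:%d %H:%M:%S')
    seconds = taken.timestamp()
    # os.utime takes (access time, modification time) in seconds.
    os.utime(path, (seconds, seconds))
    return seconds
```

With this, a file manager that sorts by modification time shows the shrunken images in the order in which the photos were taken.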
Converting the subprocess call of
convert to manipulations of a
wand.image.Image instance is pretty straightforward with the aid of the Wand
documentation. Basically, I replaced
    args = [
        'convert', fname, '-strip', '-resize', str(newwidth), '-units',
        'PixelsPerInch', '-density', '300', '-unsharp', '2x0.5+0.7+0',
        '-quality', '80', oname
    ]
    rp = subprocess.call(args)

with

    with Image(filename=fname) as img:
        scale = newwidth/img.width
        newheight = int(round(img.height * scale, 0))
        img.strip()
        img.resize(width=newwidth, height=newheight)
        img.units = 'pixelsperinch'
        img.resolution = (300, 300)
        img.unsharp_mask(radius=2, sigma=0.5, amount=0.7, threshold=0)
        img.quality = 80
        img.save(filename=oname)
The biggest differences are:
- Wand supports reading metadata like EXIF tags. In the other version of the program I used pillow for that.
- The Wand version uses a ProcessPoolExecutor, while the original uses a ThreadPoolExecutor.
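As an illustration of the metadata point, here is a minimal sketch of reading EXIF tags with Wand. It assumes that Wand exposes them through the `metadata` mapping of an image with keys prefixed by `exif:`, as its documentation describes. The helper names `exif_tags` and `read_exif` are hypothetical and not taken from the program.

```python
def exif_tags(metadata):
    """Return the EXIF entries of a metadata mapping, without the prefix."""
    prefix = 'exif:'
    return {
        key[len(prefix):]: value
        for key, value in metadata.items()
        if key.startswith(prefix)
    }


def read_exif(path):
    """Read the EXIF tags of an image file using Wand."""
    # Imported here so that exif_tags can be used without Wand installed.
    from wand.image import Image

    with Image(filename=path) as img:
        return exif_tags(img.metadata)
```

Calling `read_exif` on a photo should then give a dictionary in which a key such as `DateTimeOriginal` holds the time the photo was taken, provided the camera recorded that tag.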
According to cloc, the version using
Wand has 91 lines of code compared
to 114 for the
subprocess version. The difference is mostly due to the
more involved handling of metadata and testing that
convert is actually available.
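One simple way to test that a program such as convert is available is to look it up on the PATH. The sketch below uses `shutil.which` from the standard library. The helper name `program_available` is hypothetical, and the original program may well do this check differently.

```python
import shutil


def program_available(name):
    """Return True if an executable called name can be found on the PATH."""
    return shutil.which(name) is not None
```

A script could call `program_available('convert')` at startup and exit with a clear message when it returns False, instead of failing later inside a worker.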
For performance testing I used both programs on a directory with eight images.
The time utility was used to log the run times.
The subprocess-based program yielded the following results:
    3.81 real    13.03 user    1.25 sys
    3.83 real    12.99 user    1.32 sys
    3.78 real    13.06 user    1.15 sys
    3.87 real    13.22 user    1.37 sys
    3.85 real    13.09 user    1.32 sys
    3.72 real    12.60 user    1.39 sys
    3.73 real    12.77 user    1.23 sys
    3.78 real    12.91 user    1.22 sys
    3.84 real    13.21 user    1.19 sys
    3.70 real    12.72 user    1.23 sys
    ----         -----         ----
    3.79 mean    12.96 mean    1.27 mean
The times for the
Wand-based version were:
    7.29 real    17.87 user    2.51 sys
    7.30 real    18.36 user    2.41 sys
    7.19 real    18.08 user    2.23 sys
    7.29 real    18.28 user    2.38 sys
    7.29 real    18.19 user    2.38 sys
    7.15 real    17.86 user    2.29 sys
    7.24 real    18.30 user    2.20 sys
    7.28 real    18.29 user    2.36 sys
    7.23 real    18.10 user    2.40 sys
    7.16 real    17.94 user    2.32 sys
    ----         -----         ----
    7.24 mean    18.23 mean    2.35 mean
The performance of the
Wand-based version is lower. Initially that
surprised me given that both use the same shared library for image manipulation.
Next I used
time.monotonic to measure how long the functions that do the
actual processing of an image take. For the
Wand version, this was around
2.1 seconds. For the version using
convert it was around 1.8 seconds.
So while there is some overhead from using
Wand, it is not enough to
explain the difference in real runtime.
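Such a per-function measurement can be sketched with a small wrapper around `time.monotonic`. The helper `timed` below is a hypothetical name for illustration and is not the code used for the numbers above.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed seconds) using a monotonic clock."""
    start = time.monotonic()
    result = func(*args, **kwargs)
    elapsed = time.monotonic() - start
    return result, elapsed
```

A monotonic clock is a sensible choice for this kind of measurement because it is not affected by adjustments of the system time while the function runs.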
At the moment I cannot explain why the program using
Wand takes almost
twice as long in real time. Using
Pool.imap_unordered from the
multiprocessing module instead of
ProcessPoolExecutor.map did not
really make a difference. So for now it seems that
ThreadPoolExecutor combined with
subprocess.call is just more efficient.
For large batch jobs that are to be run in parallel, I will stick to using
ThreadPoolExecutor to run
subprocess.call since it
is significantly faster.
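The pattern of running external commands from a thread pool can be sketched as follows. This is a minimal illustration and not the code from the program: the function name `run_commands` is hypothetical. Threads are sufficient here because each worker spends almost all of its time waiting for the external process to finish.

```python
import concurrent.futures as cf
import subprocess


def run_commands(commands, max_workers=None):
    """Run each argument list with subprocess.call in a thread pool.

    Returns the list of return codes, in the same order as commands.
    """
    with cf.ThreadPoolExecutor(max_workers=max_workers) as tp:
        return list(tp.map(subprocess.call, commands))
```

Each element of `commands` would be an argument list like the `args` list shown earlier, one per image. A non-zero entry in the result indicates that the corresponding conversion failed.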
For interactive use in (I)Python, the
Wand module is superior. It presents
a Pythonic interface to ImageMagick.