This article contains several useful tricks for manipulating PDF files.
The focus of this article is on free and open-source software that is available for UNIX-like operating systems. These tools are meant to be used from the command line of a shell.
Adding password restrictions to a PDF file
PDF files can have two passwords:
- user password (Must be supplied to read a document.)
- owner password (Can restrict printing, editing, copying. Not necessary to read the document.)
Adding restrictions is done by “encrypting” the PDF with an owner password. Since this password is easily removed, there is no need to remember it, so I tend to generate one automatically.
The following command uses the SHA-256 checksum of the original file as the owner password.
> qpdf --encrypt '' `sha256 -q unrestricted.pdf` 128 \
    --extract=n --modify=none --use-aes=y --cleartext-metadata -- \
    unrestricted.pdf restricted.pdf
As given, it prevents copying (--extract=n) and modification (--modify=none), but leaves the document metadata unencrypted. By default, printing is allowed. The user password is an empty string, leaving read access open.
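The sha256 -q command used above is the BSD utility. Many Linux systems ship sha256sum instead, and shasum is another common alternative. As a minimal sketch, assuming at least one of these three tools is installed, the password generation can be wrapped in a small helper. The function names file_sha256 and restrict_pdf are made up for this illustration.

```shell
# Print the SHA-256 checksum of the file "$1" as 64 hex digits.
# Tries sha256sum (GNU), then sha256 (BSD), then shasum (Perl-based).
file_sha256() {
    if command -v sha256sum >/dev/null 2>&1; then
        # Reading from stdin keeps the file name out of the output.
        sha256sum < "$1" | cut -d ' ' -f 1
    elif command -v sha256 >/dev/null 2>&1; then
        sha256 -q "$1"
    else
        shasum -a 256 < "$1" | cut -d ' ' -f 1
    fi
}

# Hypothetical wrapper around the qpdf command from the article:
# "$1" is the unrestricted input, "$2" the restricted output.
restrict_pdf() {
    qpdf --encrypt '' "$(file_sha256 "$1")" 128 \
        --extract=n --modify=none --use-aes=y --cleartext-metadata -- \
        "$1" "$2"
}
```

With these definitions loaded, restrict_pdf unrestricted.pdf restricted.pdf should behave like the command shown above.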
Running both through pdfinfo shows the file restrictions. First the unrestricted file.
> pdfinfo unrestricted.pdf
...
Encrypted:      no
...
Contrast that with the output for the restricted file (trimmed for brevity).
> pdfinfo restricted.pdf
...
Encrypted:      yes (print:yes copy:no change:no addNotes:no algorithm:AES)
...
Note that this only protects your documents from laypeople, since qpdf can also remove such restrictions, as shown below.
If you need stronger access control, you should set the user password or use other kinds of encryption that would prevent people from reading the file without knowing the password.
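As a rough sketch of what setting a user password could look like: the helper below encrypts a file with a user password of your choosing, a throwaway random owner password, and a 256-bit key, for which qpdf always uses AES. The names random_password and protect_pdf are invented for this example, and it assumes that head, base64 and /dev/urandom are available.

```shell
# Print a random 32-character password (24 random bytes, base64-encoded).
random_password() {
    head -c 24 /dev/urandom | base64
}

# Hypothetical helper: encrypt "$1" into "$2" so that it cannot be opened
# without the user password given in "$3".
# The owner password is random and is not stored anywhere.
protect_pdf() {
    qpdf --encrypt "$3" "$(random_password)" 256 -- "$1" "$2"
}
```

A call would look like protect_pdf unrestricted.pdf protected.pdf 'some long passphrase'. Keep in mind that a password passed as a command-line argument is visible in the process list while qpdf runs.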
Removing restrictions from a PDF file
If a document only has an owner password, you can use qpdf to remove it, without having to provide the owner password!
Note that this only works with one of the standard encryption handlers (RC4 and AES). If a document was encrypted with a custom encryption handler this might not work.
> qpdf --decrypt restricted.pdf unrestricted2.pdf
> pdfinfo unrestricted2.pdf
...
Encrypted:      no
...
So an owner password is not a protection against knowledgeable people.
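When several files need this treatment, the two commands can be combined. The following is a sketch only: it assumes pdfinfo prints an Encrypted: line as in the output above, and the function names and the -unrestricted.pdf output suffix are arbitrary choices for this example.

```shell
# Read pdfinfo output on stdin and print the first word after
# "Encrypted:", which is "yes" or "no".
encrypted_field() {
    awk '/^Encrypted:/ { print $2; exit }'
}

# Hypothetical batch helper: for every PDF given as an argument that
# pdfinfo reports as encrypted, write a decrypted copy next to it.
# Files that also have a user password will make qpdf fail here,
# because no password is supplied.
strip_restrictions() {
    for f in "$@"; do
        if [ "$(pdfinfo "$f" 2>/dev/null | encrypted_field)" = "yes" ]; then
            qpdf --decrypt "$f" "${f%.pdf}-unrestricted.pdf"
        fi
    done
}
```

Running strip_restrictions *.pdf would then leave unencrypted files alone and only process the restricted ones.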
Changing the metadata in a PDF file
The exiftool program can be used to change the Info dictionary and XMP tags in a PDF file.
For example, I’ve seen an e-book application on an Android device use the “title” from the Info dictionary to label PDFs in the user interface. However, in some PDF files the title is either empty or bears no resemblance to the actual contents. In cases like this you really want to update the metadata.
> exiftool -Title='Alexit hardener 405-25' -overwrite_original ALEXIT-Hardener_405-25_DE.pdf
    1 image files updated
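If many files have missing titles, a title derived from the file name is often better than nothing. The sketch below is one possible approach, not a recommendation for every collection: it assumes file names that use underscores instead of spaces, and the names title_from_filename and set_titles are hypothetical.

```shell
# Derive a title from a file name: drop the directory and the ".pdf"
# extension, then turn underscores into spaces.
title_from_filename() {
    base=$(basename "$1" .pdf)
    printf '%s\n' "$base" | tr '_' ' '
}

# Hypothetical batch helper: set the Title of every PDF given as an
# argument to the title derived from its file name.
set_titles() {
    for f in "$@"; do
        exiftool -Title="$(title_from_filename "$f")" -overwrite_original "$f"
    done
}
```

Afterwards, exiftool -Title followed by a file name prints the title that is now stored in that file.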
Overlaying text and images in a PDF file
This is such a substantial topic that it is located in a separate article.
Converting PDF to bitmap formats
Sometimes a PDF needs to be converted to bitmap format, e.g. for display on a webpage. (This is assuming that generating the same image in SVG format is not possible.)
The convert program used below is part of the ImageMagick suite of tools.
convert -density 1200 -units PixelsPerInch \
    <input.pdf> \
    -scale 25% \
    <output.png>
The first option (which needs to come before the name of the input file) tells it to convert the image to a bitmap at 1200 pixels per inch (“PPI”). The standard resolution used by convert is only 72 PPI.
After the input file, -scale 25% scales the image back down. This reduces the effective resolution to 300 PPI (25% of 1200), but averages the pixels, giving a less pixelated look.
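The relation between the rendering density, the scale factor and the resulting resolution is simple arithmetic, so it can be parameterized. The helper below is a sketch under a few assumptions: the function names are invented, the oversampling factor should divide 100 evenly (2, 4, 5, ...) because the shell only does integer arithmetic, and the program is installed as convert. On ImageMagick 7 the preferred command name is magick.

```shell
# Effective resolution after scaling: density * percent / 100.
# Example: 1200 PPI scaled to 25% gives 300 PPI.
effective_ppi() {
    echo $(( $1 * $2 / 100 ))
}

# Hypothetical wrapper: render "$1" to the PNG "$2" at a target
# resolution "$3" (default 300 PPI), oversampling by the factor "$4"
# (default 4) and scaling back down to smooth the result.
# The body runs in a subshell so its variables do not leak.
pdf_to_png() (
    target=${3:-300}
    factor=${4:-4}
    density=$(( target * factor ))
    percent=$(( 100 / factor ))
    convert -density "$density" -units PixelsPerInch "$1" \
        -scale "${percent}%" "$2"
)
```

With the defaults, pdf_to_png input.pdf output.png renders at 1200 PPI and scales to 25%, which is the same as the command above. A call such as pdf_to_png input.pdf output.png 150 would render at 600 PPI and end up at 150 PPI.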
convert -density 1200 -units PixelsPerInch \
    <input.pdf> \
    -background white -flatten \
    -scale 25% \
    <output.jpg>
Here the -background white and -flatten options are needed to prevent a black background on some PDF files.