PDF to images with Poppler and pdftoppm


PDF: a document format that other people can open

What if there was a portable document format we could use to share our slides—without assuming that the consumer of our slides has a Google account, a Microsoft Office license or a Mac with Keynote installed?

Enter Portable Document Format. Or PDF, as we often say.

PDF was originally developed by Adobe, but became an ISO standard in 2008. Today, most web browsers have excellent support for viewing PDFs, and operating systems often provide a good PDF reader out of the box.

For both Build Your Own Little Memex with Babashka and Lessons learned teaching Elm to kids, I chose to write my slides in a format suited to producing content, export to PDF, and show the PDF during the talk in a very boring PDF reader such as Firefox.

But once in a while, a question arises: could I perhaps please provide the slideshow in PowerPoint? Or perhaps in Google Slides?

Export from PDF with Pandoc?

I’ve tried using Pandoc to convert the PDF to a different slideshow format directly, but I haven’t loved the experience. I get a new file, sure. But it might not look the way I want, and then I have to start tweaking.

Can I solve my problems with tools that Pandoc provides? If the font is wrong, what do I do? What if I want to move an image around? What if I only want some of the slides?

I’ve opted for an approach that gives me far more control in exchange for a bit of manual work.

Convert PDF to image files with Poppler 🤗

A different option is to convert the PDF to a set of images, then copy each image into the slideshow. Plenty of tools can do this. Poppler is one.

Poppler is an open source project for working with PDFs. You can probably install it with your favourite package manager.
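
Package names vary a little between package managers, so here is a rough sketch of what to look for. The helper name `poppler_install_hint` is made up for this post, and it only prints a likely command instead of running it. Check your own package manager if these don't match.

```shell
# Hypothetical helper: print (not run) a likely install command for Poppler's
# command-line tools, given the name of a package manager.
poppler_install_hint() {
  case "$1" in
    brew) echo "brew install poppler" ;;            # Homebrew
    apt)  echo "sudo apt install poppler-utils" ;;  # Debian, Ubuntu
    dnf)  echo "sudo dnf install poppler-utils" ;;  # Fedora
    *)    echo "search your package manager for 'poppler'"; return 1 ;;
  esac
}

poppler_install_hint apt
```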

Now, you can produce your images:

# First, install Poppler.
# After a successful Poppler installation, you should get a binary called `pdftoppm`.
$ which pdftoppm

# Then you need some slides.
# Bring your own, or take some of mine.
$ wget https://www.teodorheggelund.com/static/teaching-kids-elm.pdf

# Then convert to images!
$ mkdir slides
$ pdftoppm -jpeg -r 600 teaching-kids-elm.pdf slides/slide

Above, I chose some options to ensure a smooth import into Google Slides.

pdftoppm argument        explanation
-jpeg                    export images as JPEG
-r 600                   render at a resolution of 600 pixels per inch
teaching-kids-elm.pdf    the slideshow PDF
slides/slide             the image path prefix
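
If you only want some of the slides, `pdftoppm` also accepts `-f` (first page) and `-l` (last page). Here is a minimal sketch that wraps the command above in a function. The name `export_slide_range` and the `PDFTOPPM` override are my own inventions for illustration, not something Poppler provides.

```shell
# Hypothetical helper: export pages FIRST..LAST of a PDF as 600 ppi JPEGs.
# Usage: export_slide_range PDF FIRST LAST OUTDIR
# Set PDFTOPPM=echo to print the arguments instead of converting (dry run).
export_slide_range() {
  pdftoppm_bin=${PDFTOPPM:-pdftoppm}
  if command -v "$pdftoppm_bin" >/dev/null 2>&1; then
    # -f and -l select the first and last page to convert
    mkdir -p "$4" && "$pdftoppm_bin" -jpeg -r 600 -f "$2" -l "$3" "$1" "$4/slide"
  else
    echo "$pdftoppm_bin not found; install Poppler first" >&2
    return 127
  fi
}

# Example: only slides 3 through 5
# export_slide_range teaching-kids-elm.pdf 3 5 slides
```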

600 pixels per inch (ppi) is good enough for me. With 600 ppi, I can’t tell the difference between the PDF (vector graphics) and the exported JPEG images (raster graphics).
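
To get a feel for how large the images become, remember that PDF page sizes are measured in points, and one point is 1/72 of an inch. A rough sketch of the arithmetic follows. The 960 x 540 point page is an assumed example size, not necessarily what my slides use, and `pdftoppm` may round slightly differently. `pdfinfo`, which also ships with Poppler, prints the real page size of a PDF.

```shell
# Approximate pixel length of a page side given in PDF points (1 pt = 1/72 inch).
# Usage: points_to_pixels POINTS PPI
points_to_pixels() {
  echo $(( ($1 * $2 + 36) / 72 ))   # integer arithmetic, rounded to nearest
}

# Assumed 16:9 page of 960 x 540 points
points_to_pixels 960 600   # width at 600 ppi: 8000
points_to_pixels 540 600   # height at 600 ppi: 4500
points_to_pixels 960 150   # width at 150 ppi: 2000
```

So 600 ppi gives sharp but large images. If importing them feels sluggish, a lower `-r` value is the first knob to try.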

Here are some of the resulting files:

$ find slides | sort | head

Now, do what you want with the images :)