Extracting images from a PDF

Sometimes you need to extract the images embedded in a PDF. Here are step-by-step instructions for doing that. Prerequisite: Java must be installed on your machine.

  1. Download pdfbox-app-1.8.16.jar from https://pdfbox.apache.org/download.cgi. WARNING: version 2.0.21 will not work with the command in step 2.
  2. Run the jar, passing it the path to your PDF file, like so:
java -classpath pdfbox-app-1.8.16.jar org.apache.pdfbox.ExtractImages test.pdf

That’s it. The command extracts every image it can find in the PDF and saves them as test-1, test-2 and so on, in the same directory as your PDF.

Reference: https://pdfbox.apache.org/docs/1.8.11/javadocs/org/apache/pdfbox/ExtractImages.html
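If you have several PDFs, the same command can be wrapped in a small shell function. This is only a sketch: the function name extract_all and the PDFBOX_JAR and DRY_RUN variables are made up for illustration, and it assumes the jar sits in the current directory unless PDFBOX_JAR says otherwise.

```shell
#!/bin/sh
# Location of the PDFBox jar (assumption: current directory by default).
PDFBOX_JAR="${PDFBOX_JAR:-pdfbox-app-1.8.16.jar}"

# extract_all is a hypothetical helper: it runs the ExtractImages command
# from step 2 once for each PDF path given as an argument.
extract_all() {
    for pdf in "$@"; do
        # Skip arguments that are not existing files.
        if [ ! -f "$pdf" ]; then
            echo "skipping $pdf: not a file" >&2
            continue
        fi
        if [ -n "${DRY_RUN:-}" ]; then
            # Dry run: only print the command that would be executed.
            echo "java -classpath $PDFBOX_JAR org.apache.pdfbox.ExtractImages $pdf"
        else
            java -classpath "$PDFBOX_JAR" org.apache.pdfbox.ExtractImages "$pdf"
        fi
    done
}
```

After sourcing the file, call it as extract_all *.pdf. Setting DRY_RUN=1 first prints the commands without running them, which is a cheap way to check the paths.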
