Java: Extract Images from a PDF Document

2022-03-22 07:09:00 Written by  jie zou

If you'd like to use the images embedded in a PDF document elsewhere, you can extract and save them in a folder. This article will show you how to programmatically extract images from a PDF document using Spire.PDF for Java.

Install Spire.PDF for Java

First, add the Spire.Pdf.jar file as a dependency in your Java program. The JAR file can be downloaded from the e-iceblue website. If you use Maven, you can instead import the library by adding the following configuration to your project's pom.xml file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf</artifactId>
        <version>10.4.4</version>
    </dependency>
</dependencies>
    

Extract Images from a PDF Document

Spire.PDF for Java offers the PdfPageBase.extractImages() method to extract images from a PDF document. The detailed steps are listed below.

  • Create a PdfDocument instance and load a sample PDF document using the PdfDocument.loadFromFile() method.
  • Loop through all pages of the document and extract the images from each page using the PdfPageBase.extractImages() method.
  • Specify the path and name of each output image file.
  • Save the images as .png files.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class ExtractImage {
    public static void main(String[] args) throws IOException {
        // Create a PdfDocument instance
        PdfDocument doc = new PdfDocument();

        // Load the sample PDF file
        doc.loadFromFile("sample.pdf");

        // Make sure the output folder exists; ImageIO.write() does not create it
        File outputDir = new File("C:\\Users\\Administrator\\Desktop\\ExtractedImages");
        outputDir.mkdirs();

        // Running counter used to give each image a unique file name
        int index = 0;

        // Loop through all pages
        for (PdfPageBase page : (Iterable<PdfPageBase>) doc.getPages()) {

            // Extract the images from the current page
            for (BufferedImage image : page.extractImages()) {

                // Build the output file path and name
                File output = new File(outputDir, String.format("Image_%d.png", index++));

                // Save the image as a .png file
                ImageIO.write(image, "PNG", output);
            }
        }

        // Release the resources held by the document
        doc.close();
    }
}
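
The save step relies on two behaviors of the JDK that are easy to overlook: ImageIO.write() does not create a missing output folder, and it returns false instead of throwing when no writer is available for the requested format. The following is a minimal, JDK-only sketch of a save helper that handles both cases. The class name ImageSaver, the method savePng(), and the relative folder name "ExtractedImages" are hypothetical and are not part of Spire.PDF; the blank BufferedImage stands in for an image returned by page.extractImages().

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.imageio.ImageIO;

public class ImageSaver {

    // Hypothetical helper (not part of Spire.PDF): writes one image as
    // Image_<index>.png inside outputDir and returns the path of the new file.
    public static Path savePng(BufferedImage image, Path outputDir, int index) throws IOException {
        // Create the folder and any missing parents; does nothing if it already exists
        Files.createDirectories(outputDir);

        Path target = outputDir.resolve(String.format("Image_%d.png", index));

        // ImageIO.write() returns false when no suitable writer is found
        if (!ImageIO.write(image, "png", target.toFile())) {
            throw new IOException("No PNG writer available for " + target);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an image returned by page.extractImages()
        BufferedImage sample = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);

        Path saved = savePng(sample, Paths.get("ExtractedImages"), 0);
        System.out.println("Saved " + saved.toAbsolutePath());
    }
}
```

Inside the extraction loop, such a helper could be called once per image with the running index, which keeps the folder handling and the file naming in one place.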

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to lift the feature limitations, you can request a 30-day trial license.

Last modified on Thursday, 27 July 2023 07:10