PDF/A is a variant of the PDF format designed for archiving and long-term preservation of electronic documents. Unlike paper documents, which are easily damaged or smeared, a PDF/A file is meant to be reproduced in exactly the same way even after long-term storage. This article demonstrates how to convert a PDF to a PDF/A-1A, 1B, 2A, 2B, 3A, or 3B compliant file using Spire.PDF for Java.
Install Spire.PDF for Java
First of all, you're required to add the Spire.PDF.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file.
<repositories>
<repository>
<id>com.e-iceblue</id>
<name>e-iceblue</name>
<url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>e-iceblue</groupId>
<artifactId>spire.pdf</artifactId>
<version>11.12.16</version>
</dependency>
</dependencies>
Convert PDF to PDF/A
The detailed steps are as follows:
- Create a PdfStandardsConverter instance, and pass in a sample PDF file as a parameter.
- Convert the sample file to the PdfA1A conformance level using the PdfStandardsConverter.toPdfA1A() method.
- Convert the sample file to the PdfA1B conformance level using the PdfStandardsConverter.toPdfA1B() method.
- Convert the sample file to the PdfA2A conformance level using the PdfStandardsConverter.toPdfA2A() method.
- Convert the sample file to the PdfA2B conformance level using the PdfStandardsConverter.toPdfA2B() method.
- Convert the sample file to the PdfA3A conformance level using the PdfStandardsConverter.toPdfA3A() method.
- Convert the sample file to the PdfA3B conformance level using the PdfStandardsConverter.toPdfA3B() method.
import com.spire.pdf.conversion.PdfStandardsConverter;
public class ConvertPdfToPdfA {
public static void main(String[] args) {
//Create a PdfStandardsConverter instance, and pass in a sample file as a parameter
PdfStandardsConverter converter = new PdfStandardsConverter("sample.pdf");
//Convert to PdfA1A
converter.toPdfA1A("output/ToPdfA1A.pdf");
//Convert to PdfA1B
converter.toPdfA1B("output/ToPdfA1B.pdf");
//Convert to PdfA2A
converter.toPdfA2A("output/ToPdfA2A.pdf");
//Convert to PdfA2B
converter.toPdfA2B("output/ToPdfA2B.pdf");
//Convert to PdfA3A
converter.toPdfA3A("output/ToPdfA3A.pdf");
//Convert to PdfA3B
converter.toPdfA3B("output/ToPdfA3B.pdf");
}
}
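As a quick sanity check after conversion, you can look for the PDF/A identification entries (pdfaid:part and pdfaid:conformance) that a PDF/A file declares in its XMP metadata. The sketch below is not part of Spire.PDF: the class and method names are hypothetical, it uses only the JDK, and it assumes the XMP packet is stored uncompressed in the file. It only reports what the file claims about itself, so it is not a substitute for a real PDF/A validator.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Spire.PDF class
final class PdfAConformanceSniffer {
    // Accept both XMP serializations: <pdfaid:part>1</pdfaid:part> and pdfaid:part="1"
    private static final Pattern PART =
            Pattern.compile("pdfaid:part\\s*(?:=\\s*[\"']|>)\\s*(\\d)");
    private static final Pattern CONFORMANCE =
            Pattern.compile("pdfaid:conformance\\s*(?:=\\s*[\"']|>)\\s*([A-Za-z])");

    /** Returns a label such as "PDF/A-1B", or null if no PDF/A identification is found. */
    static String sniff(String rawPdfText) {
        Matcher part = PART.matcher(rawPdfText);
        Matcher conformance = CONFORMANCE.matcher(rawPdfText);
        if (!part.find() || !conformance.find()) {
            return null;
        }
        return "PDF/A-" + part.group(1) + conformance.group(1).toUpperCase();
    }

    /** Reads the whole file and scans it for the identification entries. */
    static String sniffFile(String path) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(path));
        // ISO-8859-1 maps every byte to one char, so binary content cannot break decoding
        return sniff(new String(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

For example, PdfAConformanceSniffer.sniffFile("output/ToPdfA1B.pdf") should return "PDF/A-1B" if the converted file declares that level and the assumption about uncompressed metadata holds.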

Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
Extract Images from PDF in Java – Preserve Quality & Filter Noise
2024-11-21 07:09:00 Written by zaki zou
When dealing with PDF documents that contain images, such as scanned reports, digital brochures, or design portfolios, you may need to extract these images for reuse or analysis. In this article, we'll show you how to extract images from PDF in Java, covering both basic usage and advanced image extraction techniques using the Spire.PDF for Java library.
Whether you're creating a PDF image extractor in Java or simply looking to extract images from a PDF file using Java code, this guide will walk you through the process step by step.
Guide Outline
- Getting Started – Tools and Setup
- Extract All Images from a PDF in Java
- Advanced Tips for More Precise Image Extraction
- Frequently Asked Questions
- Conclusion
Getting Started – Tools and Setup
Extracting images from PDF files in Java can be challenging without third-party libraries. While PDFs may contain valuable image assets—such as scanned pages, charts, or embedded graphics—these elements are often encoded or compressed in ways that native Java APIs can’t handle directly.
Spire.PDF for Java provides a high-level, reliable way to locate and extract embedded or inline images from PDF files. Whether you’re building an automation tool or a document parser, this library helps you extract image content efficiently and with full quality.
Before getting started, make sure you have the following development tools ready:
- Java Development Kit (JDK) 1.6 or above
- Spire.PDF for Java (Free or commercial version)
- An IDE (e.g., IntelliJ IDEA, Eclipse)
Maven Dependency:
<repositories>
<repository>
<id>com.e-iceblue</id>
<name>e-iceblue</name>
<url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>e-iceblue</groupId>
<artifactId>spire.pdf</artifactId>
<version>11.12.16</version>
</dependency>
</dependencies>
You can use Free Spire.PDF for Java for smaller tasks.
Extract All Images from a PDF in Java
The most straightforward way to extract images from a PDF is by using the PdfImageHelper class in Spire.PDF for Java. This utility scans each page, locates embedded or inline images, and returns both the image data and metadata such as size and position.
Code Example: Basic Image Extraction
import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;
import com.spire.pdf.utilities.PdfImageHelper;
import com.spire.pdf.utilities.PdfImageInfo;
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
public class ExtractAllImagePDF {
public static void main(String[] args) throws IOException {
// Create a PdfDocument instance
PdfDocument pdf = new PdfDocument();
pdf.loadFromFile("input.pdf");
// Create an image helper instance
PdfImageHelper imageHelper = new PdfImageHelper();
// Loop through each page to extract images
for (int i = 0; i < pdf.getPages().getCount(); i++) {
PdfPageBase page = pdf.getPages().get(i);
PdfImageInfo[] imagesInfo = imageHelper.getImagesInfo(page);
for (int j = 0; j < imagesInfo.length; j++) {
BufferedImage image = imagesInfo[j].getImage();
File file = new File("output/Page" + i + "_Image" + j + ".png");
ImageIO.write(image, "png", file);
}
}
pdf.close();
}
}
Make sure the output folder exists before running the code; otherwise ImageIO.write() fails with an IOException.
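If you would rather not create the folder by hand, the JDK can do it before the extraction loop runs. The following is a minimal sketch using only java.nio.file; the class and method names are hypothetical and are not part of Spire.PDF.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not a Spire.PDF class
final class OutputFolders {
    /** Creates the folder (and any missing parents) if needed and returns its path. */
    static Path ensure(String folder) throws IOException {
        // createDirectories does nothing when the directory already exists
        return Files.createDirectories(Paths.get(folder));
    }

    /** Builds the target file for one image, e.g. output/Page0_Image2.png. */
    static File imageFile(Path folder, int pageIndex, int imageIndex, String extension) {
        String name = "Page" + pageIndex + "_Image" + imageIndex + "." + extension;
        return folder.resolve(name).toFile();
    }
}
```

In the example above, you could call OutputFolders.ensure("output") once before the page loop and pass OutputFolders.imageFile(folder, i, j, "png") to ImageIO.write().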
How It Works
- PdfDocument loads and holds the structure of the input PDF.
- PdfPageBase represents a single page inside the PDF.
- PdfImageHelper.getImagesInfo(PdfPageBase) scans a specific page and returns an array of PdfImageInfo, each containing a detected image.
- Each PdfImageInfo includes:
  - The image itself as a BufferedImage
  - Metadata like size, DPI, and page index
- ImageIO.write() supports common formats like "png", "jpg", and "bmp"; you can change the format string as needed.
After running the extraction code, you’ll get a folder containing the exported images from the PDF, each saved in a separate file.

These high-level abstractions save you from manually decoding image XObjects or parsing raw streams—making PDF image extraction in Java easier and cleaner.
To save full PDF pages as images instead of just extracting embedded images, follow our guide on saving PDF pages as images in Java.
Advanced Tips for More Precise Image Extraction
Extracting images from a PDF is not always a one-size-fits-all operation. Some files contain layout elements like background layers, small decorative icons, or embedded metadata images. The following advanced tips help you refine your extraction logic for better results:
Skip Background Images (Optional)
Some PDF files include background images, such as watermarks or decorative layers. When these are defined using standard PDF background settings, they are typically extracted as the first image on the page. To focus on meaningful content, simply skip the first extracted image per page.
for (int i = 1; i < imagesInfo.length; i++) { // Skip background image
BufferedImage image = imagesInfo[i].getImage();
ImageIO.write(image, "PNG", new File("output/image_" + (i - 1) + ".png"));
}
You can also call the getBounds() method on each PdfImageInfo to assess the image's dimensions and placement before deciding whether to skip it.
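One way to act on the bounds is to treat an image as a background only when it covers most of the page, instead of always dropping the first image. The sketch below is a heuristic under stated assumptions: it assumes getBounds() gives you a java.awt.geom.Rectangle2D in the same units as the page size, with the origin at a page corner. The class name, method name, and coverage threshold are hypothetical.

```java
import java.awt.geom.Rectangle2D;

// Hypothetical helper, not a Spire.PDF class
final class BackgroundHeuristic {
    /**
     * Returns true when the part of the image that lies on the page covers
     * at least minCoverage (0.0 to 1.0) of the page area.
     */
    static boolean coversMostOfPage(Rectangle2D imageBounds, double pageWidth,
                                    double pageHeight, double minCoverage) {
        double pageArea = pageWidth * pageHeight;
        if (pageArea <= 0) {
            return false;
        }
        // Clip to the page so an oversized image cannot count for more than 100%
        Rectangle2D page = new Rectangle2D.Double(0, 0, pageWidth, pageHeight);
        Rectangle2D visible = imageBounds.createIntersection(page);
        if (visible.isEmpty()) {
            return false; // the image lies entirely off the page
        }
        double coverage = (visible.getWidth() * visible.getHeight()) / pageArea;
        return coverage >= minCoverage;
    }
}
```

With a threshold such as 0.9, a full-page watermark would be skipped while a large figure that fills only half the page would still be saved.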
Filter by Image Size (Ignore Small Icons)
To exclude small elements like buttons or logos, add a size threshold before saving:
BufferedImage image = imagesInfo[i].getImage();
if (image.getWidth() > 200 && image.getHeight() > 200) {
ImageIO.write(image, "PNG", new File("output/image_" + i + ".png"));
}
This helps keep the output folder clean and focused on relevant image content.
Export Images in Various Formats or Streams
You can output images in various formats or streams depending on your use case:
ImageIO.write(image, "JPEG", new File("output/image_" + i + ".jpg")); // lossy, smaller files
ImageIO.write(image, "BMP", new File("output/image_" + i + ".bmp")); // lossless, larger files
- Use PNG or BMP for lossless quality (e.g., archival or OCR).
- Use JPEG for web or lower storage usage.
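One caveat when choosing JPEG: the format has no alpha channel, and depending on the JDK version the built-in JPEG writer may refuse or mis-render a BufferedImage that carries transparency. If an extracted image happens to have an alpha channel, a common workaround is to paint it onto an opaque RGB canvas first. The helper below is a minimal sketch using only the JDK; its name is hypothetical, and whether your extracted images actually carry alpha depends on the source PDF.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper, not a Spire.PDF class
final class JpegPrep {
    /** Returns an opaque RGB copy of the image, painted over the given background color. */
    static BufferedImage toOpaqueRgb(BufferedImage source, Color background) {
        BufferedImage rgb = new BufferedImage(
                source.getWidth(), source.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = rgb.createGraphics();
        try {
            // Fill first so transparent pixels end up as the background color
            g.setColor(background);
            g.fillRect(0, 0, rgb.getWidth(), rgb.getHeight());
            g.drawImage(source, 0, 0, null);
        } finally {
            g.dispose();
        }
        return rgb;
    }
}
```

You would then pass JpegPrep.toOpaqueRgb(image, Color.WHITE) to ImageIO.write() with the "JPEG" format string, and check the boolean it returns, since ImageIO.write() reports false when no suitable writer is found.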
You can also write images to a ByteArrayOutputStream or other output streams for further processing:
ByteArrayOutputStream stream = new ByteArrayOutputStream();
ImageIO.write(image, "PNG", stream);
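To make the stream variant concrete, here is a small, self-contained sketch that encodes a BufferedImage into a byte array and decodes it again, which is handy for storing images in a database or sending them over a network. It uses only the JDK; the class and method names are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not a Spire.PDF class
final class ImageBytes {
    /** Encodes the image in the given ImageIO format (e.g. "png") and returns the bytes. */
    static byte[] encode(BufferedImage image, String format) throws IOException {
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        // ImageIO.write returns false when no writer supports this format/image combination
        if (!ImageIO.write(image, format, stream)) {
            throw new IOException("No ImageIO writer for format: " + format);
        }
        return stream.toByteArray();
    }

    /** Decodes image bytes; returns null if no registered reader recognizes the data. */
    static BufferedImage decode(byte[] data) throws IOException {
        return ImageIO.read(new ByteArrayInputStream(data));
    }
}
```

Checking the return value of ImageIO.write() matters here because an unsupported format would otherwise leave you with an empty byte array and no error.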
Also Want to Extract Images from PDF Attachments?
If your PDF contains embedded file attachments like .jpg or .png images, you'll need a different approach. See our guide here:
How to Extract Attachments from PDF in Java
FAQ for Image Extraction from PDF in Java
Can I extract images from a PDF file using Java?
Yes. Using Spire.PDF for Java, you can easily extract embedded or inline images from any PDF page with a few lines of code.
Will extracted images retain their original quality?
Yes. Images are extracted at their original pixel resolution. Save them in a lossless format such as PNG or BMP to avoid introducing additional compression artifacts.
What’s the difference between image extraction and rendering PDF as an image?
Rendering a PDF page creates a bitmap version of the entire page (including text and layout), while image extraction pulls out only the embedded image objects that were originally inserted in the file.
Does this work for scanned PDFs?
Yes. Many scanned PDFs contain full-page raster images (e.g., JPGs or TIFFs). These are extracted just like any other embedded image.
Conclusion
Extracting images from PDF files using Java is fast and efficient with Spire.PDF. Whether you're analyzing marketing materials, scanned reports, or design portfolios, this Java PDF image extractor solution helps you programmatically access and save high-quality images embedded in your documents.
For more advanced cases—such as excluding layout images or processing attachments—the API offers enough flexibility to customize your approach.
To fully unlock the capabilities of Spire.PDF for Java without any evaluation limitations, you can apply for a free temporary license.
For PDF documents that contain confidential or sensitive information, you may want to password protect these documents to ensure that only the designated person can access the information. This article will demonstrate how to programmatically encrypt a PDF document and decrypt a password-protected document using Spire.PDF for Java.
Install Spire.PDF for Java
First of all, you're required to add the Spire.PDF.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file.
<repositories>
<repository>
<id>com.e-iceblue</id>
<name>e-iceblue</name>
<url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>e-iceblue</groupId>
<artifactId>spire.pdf</artifactId>
<version>11.12.16</version>
</dependency>
</dependencies>
Encrypt a PDF File with Password
There are two kinds of passwords for encrypting a PDF file - open password and permission password. The former is set to open the PDF file, while the latter is set to restrict printing, contents copying, commenting, etc. If a PDF file is secured with both types of passwords, it can be opened with either password.
The example in this section sets both passwords through a PdfPasswordSecurityPolicy object and applies it with the PdfDocument.encrypt(PdfSecurityPolicy) method. Earlier examples for this library used PdfDocument.getSecurity().encrypt(java.lang.String openPassword, java.lang.String permissionPassword, java.util.EnumSet<PdfPermissionsFlags> permissions, PdfEncryptionKeySize keySize) instead, so check which of the two your version provides. The detailed steps are as follows.
- Create a PdfDocument instance.
- Load a sample PDF file using the PdfDocument.loadFromFile() method.
- Create a PdfPasswordSecurityPolicy with the open password and the permission password.
- Set the encryption algorithm using the PdfSecurityPolicy.setEncryptionAlgorithm() method.
- Set the permissions using the PdfSecurityPolicy.setDocumentPrivilege() method, then allow individual actions as needed.
- Encrypt the PDF file using the PdfDocument.encrypt() method.
- Save the result file using the PdfDocument.saveToFile() method.
// Wildcard imports cover PdfDocument and FileFormat (com.spire.pdf) and the security
// policy classes, which are assumed to live in one of these two packages
import com.spire.pdf.*;
import com.spire.pdf.security.*;
public class EncryptPDF {
public static void main(String[] args) {
// Input file path
String input = "data/encryption.pdf";
// Output file path
String output = "output/encryption_output.pdf";
// Create a new PDF document object
PdfDocument doc = new PdfDocument();
// Load the PDF document from the input file path
doc.loadFromFile(input);
// Create a password-based security policy with open and permission passwords
PdfSecurityPolicy securityPolicy = new PdfPasswordSecurityPolicy("openPwd", "permissionPwd");
// Set the encryption algorithm to AES 256-bit
securityPolicy.setEncryptionAlgorithm(PdfEncryptionAlgorithm.AES_256);
// Set document privilege to forbid all actions
securityPolicy.setDocumentPrivilege(PdfDocumentPrivilege.getForbidAll());
// Allow degraded printing
securityPolicy.getDocumentPrivilege().setAllowDegradedPrinting(true);
// Allow modification of annotations
securityPolicy.getDocumentPrivilege().setAllowModifyAnnotations(true);
// Allow document assembly
securityPolicy.getDocumentPrivilege().setAllowAssembly(true);
// Allow modification of document contents
securityPolicy.getDocumentPrivilege().setAllowModifyContents(true);
// Allow filling form fields
securityPolicy.getDocumentPrivilege().setAllowFillFormFields(true);
// Allow printing
securityPolicy.getDocumentPrivilege().setAllowPrint(true);
// Encrypt the document with the security policy
doc.encrypt(securityPolicy);
// Save the encrypted document to the output file path
doc.saveToFile(output, FileFormat.PDF);
// Dispose of the document resources
doc.dispose();
}
}

Remove Password to Decrypt a PDF File
When you need to remove the password from a PDF file, load the file with its password and call the PdfDocument.decrypt() method before saving it. The detailed steps are as follows.
- Create a PdfDocument object.
- Load the encrypted PDF file with its password using the PdfDocument.loadFromFile(java.lang.String filename, java.lang.String password) method.
- Decrypt the PDF file using the PdfDocument.decrypt() method.
- Save the result file using the PdfDocument.saveToFile() method.
import com.spire.pdf.FileFormat;
import com.spire.pdf.PdfDocument;
public class DecryptPDF {
public static void main(String[] args) throws Exception {
// Specify the input and output file paths
String input = "data/decryption.pdf";
String output = "output/decryption_result.pdf";
//load the pdf document.
PdfDocument doc = new PdfDocument();
doc.loadFromFile(input, "test");
//decrypt the document
doc.decrypt();
//save the file
doc.saveToFile(output, FileFormat.PDF);
// Close the PDF document to release resources
doc.close();
// Dispose of the PDF document to free up system resources
doc.dispose();
}
}
