PDF is a versatile file format that can render text and graphics on its pages as well as serve as a storage container. People can attach files to PDFs and extract them later. Attaching related documents to a PDF can facilitate centralized management and transmission of documents.

Spire.PDF for Java allows you to attach files in two ways:

  • Document Level Attachment: A file attached to a PDF at the document level does not appear on any page; it can be viewed only in the "Attachments" panel of a PDF reader.
  • Annotation Attachment: A file is attached at a specific position on a page. Annotation attachments are shown as an icon, such as a paper clip, on the page; reviewers can double-click the icon to open the file.

This article demonstrates how to add or remove these two types of attachments in a PDF document in Java using Spire.PDF for Java.

Install Spire.PDF for Java

First, you're required to add the Spire.Pdf.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf</artifactId>
        <version>12.4.4</version>
    </dependency>
</dependencies>

Add an Attachment to PDF in Java

Adding an attachment to the "Attachments" panel can be easily done by using PdfDocument.getAttachments().add() method. The following are the detailed steps.

  • Create a PdfDocument object.
  • Load a PDF document using PdfDocument.loadFromFile() method.
  • Create a PdfAttachment object based on an external file.
  • Add the attachment to PDF using PdfDocument.getAttachments().add() method.
  • Save the document to another PDF file using PdfDocument.saveToFile() method.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.attachments.PdfAttachment;

public class AttachFilesToPdf {

    public static void main(String[] args) {

        //Create a PdfDocument object
        PdfDocument doc = new PdfDocument();

        //Load a sample PDF file
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Sample.pdf");

        //Create a PdfAttachment object based on an external file
        PdfAttachment attachment = new PdfAttachment("C:\\Users\\Administrator\\Desktop\\Data.xlsx");

        //Add the attachment to PDF
        doc.getAttachments().add(attachment);

        //Save to file
        doc.saveToFile("Attachment.pdf");
        doc.close();
    }
}
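Before constructing a PdfAttachment, it can help to confirm that the external file actually exists and is readable, so that a wrong path is reported clearly. The following is a minimal sketch that uses only the Java standard library; the class name AttachmentPathCheck and the method requireReadableFile() are hypothetical helpers and are not part of Spire.PDF for Java.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class AttachmentPathCheck {

    // Returns the absolute path as a string if the file exists, is a regular
    // file, and is readable; otherwise throws an IOException with a clear message.
    static String requireReadableFile(String filePath) throws IOException {
        Path path = Paths.get(filePath).toAbsolutePath().normalize();
        if (!Files.isRegularFile(path)) {
            throw new IOException("Not a regular file: " + path);
        }
        if (!Files.isReadable(path)) {
            throw new IOException("File is not readable: " + path);
        }
        return path.toString();
    }

    public static void main(String[] args) throws IOException {
        // Demo with a temporary file so the example runs without any input
        Path tmp = Files.createTempFile("data", ".xlsx");
        try {
            System.out.println("OK: " + requireReadableFile(tmp.toString()));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

The returned path string could then be passed to the PdfAttachment constructor in place of the hard-coded path.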

Java: Add or Remove Attachments in PDF

Add an Annotation Attachment to PDF in Java

An annotation attachment can be found in the "Attachments" panel as well as on a specific page. Below are the steps to add an annotation attachment to PDF using Spire.PDF for Java.

  • Create a PdfDocument object.
  • Load a PDF document using PdfDocument.loadFromFile() method.
  • Get a specific page to add annotation using PdfDocument.getPages().get() method.
  • Create a PdfAttachmentAnnotation object based on an external file.
  • Add the annotation attachment to the page using PdfPageBase.getAnnotationsWidget().add() method.
  • Save the document to another PDF file using PdfDocument.saveToFile() method.
import com.spire.pdf.PdfPageBase;
import com.spire.pdf.annotations.*;
import com.spire.pdf.graphics.*;
import com.spire.pdf.PdfDocument;

import java.awt.*;
import java.awt.geom.Dimension2D;
import java.awt.geom.Rectangle2D;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.EnumSet;

public class AnnotationAttachment {

    public static void main(String[] args) throws IOException {

        //Create a PdfDocument object
        PdfDocument doc = new PdfDocument();

        //Load a sample PDF file
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Sample.pdf");

        //Get a specific page
        PdfPageBase page = doc.getPages().get(0);

        //Draw a label on PDF
        String label = "Here is the report:";
        PdfTrueTypeFont font = new PdfTrueTypeFont(new Font("Arial", Font.PLAIN, 13));
        double x = 35;
        double y = page.getActualSize().getHeight() - 220;
        page.getCanvas().drawString(label, font, PdfBrushes.getRed(), x, y);

        //Attach a file as an annotation
        String filePath = "C:\\Users\\Administrator\\Desktop\\Report.pptx";
        byte[] data = toByteArray(filePath);
        Dimension2D size = font.measureString(label);
        Rectangle2D bound = new Rectangle2D.Float((float) (x + size.getWidth() + 5), (float) y, 10, 15);
        PdfAttachmentAnnotation annotation = new PdfAttachmentAnnotation(bound, filePath, data);
        annotation.setColor(new PdfRGBColor(new Color(0, 128, 128)));
        annotation.setFlags(EnumSet.of(PdfAnnotationFlags.Default));
        annotation.setIcon(PdfAttachmentIcon.Graph);
        annotation.setText("Click here to open the file");
        page.getAnnotationsWidget().add(annotation);

        //Save to file
        doc.saveToFile("Attachments.pdf");
        doc.close();
    }
    //Read the whole file into a byte array
    public static byte[] toByteArray(String filePath) throws IOException {

        File file = new File(filePath);
        long fileSize = file.length();
        if (fileSize > Integer.MAX_VALUE) {
            throw new IOException("File is too large to read into a byte array: "
                    + file.getName());
        }
        byte[] buffer = new byte[(int) fileSize];
        int offset = 0;
        int numRead;

        //try-with-resources closes the stream even if reading fails
        try (FileInputStream fi = new FileInputStream(file)) {
            while (offset < buffer.length
                    && (numRead = fi.read(buffer, offset, buffer.length - offset)) >= 0) {
                offset += numRead;
            }
        }

        if (offset != buffer.length) {
            throw new IOException("Could not completely read file "
                    + file.getName());
        }
        return buffer;
    }
}
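On Java 7 and later, the toByteArray() helper above can usually be replaced by a single call to java.nio.file.Files.readAllBytes(). The sketch below is one possible simplification rather than the approach used in the sample; the class name ReadFileBytes is made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class ReadFileBytes {

    // Reads the whole file into memory; the stream is opened and closed internally
    static byte[] toByteArray(String filePath) throws IOException {
        return Files.readAllBytes(Paths.get(filePath));
    }

    public static void main(String[] args) throws IOException {
        // Demo with a temporary file so the example runs without any input
        Path tmp = Files.createTempFile("report", ".bin");
        try {
            Files.write(tmp, new byte[] {1, 2, 3, 4});
            byte[] data = toByteArray(tmp.toString());
            System.out.println("Read " + data.length + " bytes");
        } finally {
            Files.delete(tmp);
        }
    }
}
```

The resulting byte array could be passed to the PdfAttachmentAnnotation constructor in the same way as before.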


Remove Attachments from PDF in Java

The attachments of a PDF document can be accessed using the PdfDocument.getAttachments() method and removed using the removeAt() or clear() method of the PdfAttachmentCollection object. The detailed steps are as follows.

  • Create a PdfDocument object.
  • Load a PDF document using PdfDocument.loadFromFile() method.
  • Get the attachment collection from the document using PdfDocument.getAttachments() method.
  • Remove a specific attachment using PdfAttachmentCollection.removeAt() method. To remove all attachments at once, use PdfAttachmentCollection.clear() method.
  • Save the document to another PDF file using PdfDocument.saveToFile() method.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.attachments.PdfAttachmentCollection;

public class RemoveAttachments {

    public static void main(String[] args) {

        //Create a PdfDocument object
        PdfDocument doc = new PdfDocument();

        //Load a PDF file
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Attachments.pdf");

        //Get attachment collection, not containing annotation attachments
        PdfAttachmentCollection attachments = doc.getAttachments();

        //Remove all attachments
        attachments.clear();

        //Remove a specific attachment
        //attachments.removeAt(0);

        //save to file
        doc.saveToFile("output/DeleteAttachments.pdf");
        doc.close();
    }
}

Remove Annotation Attachments from PDF in Java

Annotations are page-based elements. To get all annotations from a document, you must loop through the pages and get the annotations from each page. Then, determine whether each annotation is an annotation attachment and, if so, remove it from the annotation collection using the remove() method. The following are the detailed steps.

  • Create a PdfDocument object.
  • Load a PDF document using PdfDocument.loadFromFile() method.
  • Loop through the pages in the document, and get the annotation collection from a specific page using PdfPageBase.getAnnotationsWidget() method.
  • Determine if an annotation is an instance of PdfAttachmentAnnotationWidget. If yes, remove the annotation attachment using PdfAnnotationCollection.remove() method.
  • Save the document to another PDF file using PdfDocument.saveToFile() method.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.annotations.PdfAnnotation;
import com.spire.pdf.annotations.PdfAnnotationCollection;
import com.spire.pdf.annotations.PdfAttachmentAnnotationWidget;

public class RemoveAnnotationAttachments {

    public static void main(String[] args) {

        //Create a PdfDocument object
        PdfDocument doc = new PdfDocument();

        //Load a PDF file
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Attachments.pdf");

        //Loop through the pages
        for (int i = 0; i < doc.getPages().getCount(); i++) {

            //Get the annotation collection
            PdfAnnotationCollection annotationCollection = doc.getPages().get(i).getAnnotationsWidget();

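            //Collect the attachment annotations first, so the collection is not modified while it is being iterated
            java.util.List<PdfAnnotation> toRemove = new java.util.ArrayList<>();
            for (Object annotation : annotationCollection) {

                //Determine if an annotation is an instance of PdfAttachmentAnnotationWidget
                if (annotation instanceof PdfAttachmentAnnotationWidget) {
                    toRemove.add((PdfAnnotation) annotation);
                }
            }

            //Remove the attachment annotations
            for (PdfAnnotation annotation : toRemove) {
                annotationCollection.remove(annotation);
            }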
        }

        //save to file
        doc.saveToFile("output/DeleteAnnotationAttachments.pdf");
        doc.close();
    }
}
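Removing elements from a collection while looping over it needs some care in Java in general. The sketch below illustrates the "check the type, then remove" step with plain java.util collections. Annotation and AttachmentAnnotation are stand-in classes invented for this example, not Spire.PDF types, and whether the same pattern applies to PdfAnnotationCollection depends on the library.

```java
import java.util.ArrayList;
import java.util.List;

final class RemoveByTypeSketch {

    // Stand-in types for illustration only (hypothetical, not Spire.PDF classes)
    static class Annotation {
        final String name;

        Annotation(String name) {
            this.name = name;
        }
    }

    static class AttachmentAnnotation extends Annotation {
        AttachmentAnnotation(String name) {
            super(name);
        }
    }

    // Removes every AttachmentAnnotation from the list and returns how many were removed.
    // removeIf handles the removal internally, so no manual index bookkeeping is needed.
    static int removeAttachmentAnnotations(List<Annotation> annotations) {
        int before = annotations.size();
        annotations.removeIf(a -> a instanceof AttachmentAnnotation);
        return before - annotations.size();
    }

    public static void main(String[] args) {
        List<Annotation> annotations = new ArrayList<>();
        annotations.add(new Annotation("highlight"));
        annotations.add(new AttachmentAnnotation("Report.pptx"));
        annotations.add(new AttachmentAnnotation("Data.xlsx"));

        int removed = removeAttachmentAnnotations(annotations);
        System.out.println(removed + " removed, " + annotations.size() + " left");
    }
}
```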

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Java: Extract Text from PowerPoint

2023-08-11 07:12:00 Written by Koohji

PowerPoint, developed by Microsoft, is a widely used format for creating visually engaging, interactive presentations. A presentation can combine text, images, and other elements, which makes it useful in many scenarios, such as business introductions and academic talks. If you need to edit or process the text of a presentation, extracting it programmatically and saving it to a new file is an effective approach. In this article, we will show you how to extract text from PowerPoint using Spire.Presentation for Java.

Install Spire.Presentation for Java

First of all, you're required to add the Spire.Presentation.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.presentation</artifactId>
        <version>11.3.5</version>
    </dependency>
</dependencies>

Extract Text from the Whole PowerPoint File

Spire.Presentation for Java supports looping through all slides and extracting text from the paragraphs on each slide using ParagraphEx.getText() method. The detailed steps are as follows.

  • Create an object of Presentation class.
  • Load a sample presentation using Presentation.loadFromFile() method.
  • Create an object of StringBuilder class.
  • Loop through shapes in each slide and paragraphs in each shape.
  • Extract all text from these slides by calling ParagraphEx.getText() method and append the extracted text to StringBuilder object.
  • Create an object of FileWriter class, and write the extracted text to a new .txt file.
import com.spire.presentation.*;

import java.io.*;

public class ExtractText {
    public static void main(String[] args) throws Exception {

        //Create an object of Presentation class
        Presentation presentation = new Presentation();

        //Load a sample presentation
        presentation.loadFromFile("sample.pptx");

        //Create a StringBuilder object
        StringBuilder buffer = new StringBuilder();

        //Loop through each slide and extract text 
        for (Object slide : presentation.getSlides()) {
            for (Object shape : ((ISlide) slide).getShapes()) {
                if (shape instanceof IAutoShape) {
                    for (Object tp : ((IAutoShape) shape).getTextFrame().getParagraphs()) {
                        buffer.append(((ParagraphEx) tp).getText()+"\n");
                    }
                }
            }
        }

        //Write the extracted text to a new .txt file
        FileWriter writer = new FileWriter("output/ExtractAllText.txt");
        writer.write(buffer.toString());
        writer.flush();
        writer.close();
        presentation.dispose();
    }
}
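FileWriter in the example above relies on the platform's default character encoding on Java versions before 18, and it fails if the output folder does not exist. The following sketch, which uses only the standard library, writes the extracted text as UTF-8 and creates the folder first. WriteExtractedText and writeText() are hypothetical names, and the sample text stands in for the content of the StringBuilder.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class WriteExtractedText {

    // Writes the text to the target file as UTF-8, creating parent folders if needed
    static void writeText(CharSequence text, Path target) throws IOException {
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        // try-with-resources flushes and closes the writer even if writing fails
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.append(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder text standing in for the extracted paragraphs
        StringBuilder buffer = new StringBuilder();
        buffer.append("Title of slide 1").append("\n");
        buffer.append("First bullet point").append("\n");

        Path dir = Files.createTempDirectory("extract-demo");
        Path target = dir.resolve("output").resolve("ExtractAllText.txt");
        writeText(buffer, target);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```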


Extract Text from a Specific Slide

Spire.Presentation for Java also lets you extract text from a specific slide. Simply get the desired slide by calling the Presentation.getSlides().get() method before extracting text from the paragraphs on it. The following are the detailed steps.

  • Create an object of Presentation class.
  • Load a sample presentation using Presentation.loadFromFile() method.
  • Create an object of StringBuilder class.
  • Get the first slide of this file by calling Presentation.getSlides().get() method.
  • Loop through each shape and the paragraphs in each shape.
  • Extract the text from the first slide by calling ParagraphEx.getText() method and append the extracted text to StringBuilder object.
  • Create an object of FileWriter class, and write the extracted text to a new .txt file.
import com.spire.presentation.*;

import java.io.*;

public class ExtractText {
    public static void main(String[] args) throws Exception {

        //Create an object of Presentation class
        Presentation presentation = new Presentation();

        //Load a sample presentation
        presentation.loadFromFile("sample.pptx");

        //Create a StringBuilder object
        StringBuilder buffer = new StringBuilder();

        //Get the first slide of the presentation
        ISlide slide = presentation.getSlides().get(0);

        //Loop through the paragraphs in each shape and extract text
        for (Object shape : slide.getShapes()) {
            if (shape instanceof IAutoShape) {
                for (Object tp : ((IAutoShape) shape).getTextFrame().getParagraphs()) {
                    buffer.append(((ParagraphEx) tp).getText()+"\n");
                }
            }
        }

        //Write the extracted text to a new .txt file
        FileWriter writer = new FileWriter("output/ExtractSlideText.txt");
        writer.write(buffer.toString());
        writer.flush();
        writer.close();
        presentation.dispose();
    }
}


Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Occasionally, you may need to protect a PowerPoint document, for instance, to prevent unauthorized users from viewing or editing it. Conversely, you may sometimes need to unprotect a document, for example, to make a password-protected presentation accessible to everyone. In this article, we will introduce how to protect or unprotect PowerPoint documents in Java using Spire.Presentation for Java.

Install Spire.Presentation for Java

First of all, you're required to add the Spire.Presentation.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.presentation</artifactId>
        <version>11.3.5</version>
    </dependency>
</dependencies>

Protect a PowerPoint Document with a Password in Java

You can protect a PowerPoint document with a password to ensure that only the people who have the right password can view it.

The following steps demonstrate how to protect a PowerPoint document with a password:

  • Initialize an instance of Presentation class.
  • Load a PowerPoint document using Presentation.loadFromFile() method.
  • Encrypt the document with a password using Presentation.encrypt() method.
  • Save the result document using Presentation.saveToFile() method.
import com.spire.presentation.FileFormat;
import com.spire.presentation.Presentation;

public class ProtectPPTWithPassword {
    public static void main(String[] args) throws Exception {
        //Create a Presentation instance
        Presentation presentation = new Presentation();

        //Load a PowerPoint document
        presentation.loadFromFile("Sample.pptx");

        //Encrypt the document with a password
        presentation.encrypt("your password");

        //Save the result document
        presentation.saveToFile("Encrypted.pptx", FileFormat.PPTX_2013);

    }
}
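The sample hard-codes the password for brevity. In practice, you might prefer to read it from the environment so that it does not end up in source control. The sketch below assumes an environment variable named PPT_PASSWORD, which is a made-up name; PasswordSource and resolvePassword() are likewise hypothetical helpers and not part of Spire.Presentation for Java.

```java
import java.util.Map;

final class PasswordSource {

    // Hypothetical variable name chosen for this example
    static final String ENV_NAME = "PPT_PASSWORD";

    // Returns the password from the given environment map,
    // or throws if the variable is missing or empty
    static String resolvePassword(Map<String, String> env) {
        String value = env.get(ENV_NAME);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("Environment variable " + ENV_NAME + " is not set");
        }
        return value;
    }

    public static void main(String[] args) {
        try {
            String password = resolvePassword(System.getenv());
            System.out.println("Password found (" + password.length() + " characters)");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The returned value could then be passed to Presentation.encrypt() in place of the literal string.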

Java: Protect or Unprotect PowerPoint Documents

Mark a PowerPoint Document as Final in Java

You can mark a PowerPoint document as final to inform readers that the document is final and no further editing is expected.

The following steps demonstrate how to mark a PowerPoint document as final:

  • Initialize an instance of Presentation class.
  • Load a PowerPoint document using Presentation.loadFromFile() method.
  • Mark the document as final using Presentation.getDocumentProperty().set() method.
  • Save the result document using Presentation.saveToFile() method.
import com.spire.presentation.FileFormat;
import com.spire.presentation.Presentation;

public class MarkPPTAsFinal {
    public static void main(String[] args) throws Exception {
        //Create a Presentation instance
        Presentation ppt = new Presentation();
        //Load a PowerPoint document
        ppt.loadFromFile("Sample.pptx");

        //Mark the document as final
        ppt.getDocumentProperty().set("_MarkAsFinal", true);

        //Save the result document
        ppt.saveToFile("MarkAsFinal.pptx", FileFormat.PPTX_2013);
    }
}
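PowerPoint itself records the "Mark as Final" state as a custom document property named _MarkAsFinal inside the docProps/custom.xml part of the .pptx package. Assuming Spire.Presentation writes the property to the same place (an assumption, not something this article confirms), a result file could be checked with the standard java.util.zip API as sketched below. MarkAsFinalCheck and isMarkedAsFinal() are hypothetical names, and the regular expression is a simplification that expects the usual vt: namespace prefix.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class MarkAsFinalCheck {

    // Matches a custom property named _MarkAsFinal whose boolean value is true
    private static final Pattern FINAL_TRUE = Pattern.compile(
            "name=\"_MarkAsFinal\"[^>]*>\\s*<vt:bool>\\s*(true|1)\\s*</vt:bool>");

    // Returns true if docProps/custom.xml exists and sets _MarkAsFinal to true
    static boolean isMarkedAsFinal(String pptxPath) throws IOException {
        try (ZipFile zip = new ZipFile(pptxPath)) {
            ZipEntry entry = zip.getEntry("docProps/custom.xml");
            if (entry == null) {
                return false;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                String xml = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                return FINAL_TRUE.matcher(xml).find();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("Usage: java MarkAsFinalCheck <file.pptx>");
            return;
        }
        System.out.println(args[0] + " marked as final: " + isMarkedAsFinal(args[0]));
    }
}
```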


Remove Password Protection from a PowerPoint Document in Java

You can remove password protection from a PowerPoint document by loading the document with the correct password, then removing the password protection from it.

The following steps demonstrate how to remove password protection from a PowerPoint document:

  • Initialize an instance of Presentation class.
  • Load a PowerPoint document using Presentation.loadFromFile() method.
  • Remove the password protection from the document using Presentation.removeEncryption() method.
  • Save the result document using Presentation.saveToFile() method.
import com.spire.presentation.FileFormat;
import com.spire.presentation.Presentation;

public class RemovePasswordProtectionFromPPT {
    public static void main(String[] args) throws Exception {
        //Create a Presentation instance
        Presentation presentation = new Presentation();

        //Load a password-protected PowerPoint document with the right password
        presentation.loadFromFile("Encrypted.pptx", "your password");

        //Remove password protection from the document
        presentation.removeEncryption();

        //Save the result document
        presentation.saveToFile("RemoveProtection.pptx", FileFormat.PPTX_2013);

    }
}
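A password-encrypted .pptx file is stored in an OLE compound file container, while an unprotected .pptx file is an ordinary ZIP package. The two can be told apart by their first bytes, which gives a quick way to check the result of removing the password. The sketch below uses only the standard library; ContainerSniffer and its members are hypothetical names. Note that legacy binary .ppt files also use the OLE container, so the check is only meaningful for .pptx files.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

final class ContainerSniffer {

    enum Container { ZIP, OLE_COMPOUND, UNKNOWN }

    // "PK\3\4", the local file header signature of a ZIP archive
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // Signature of an OLE compound file
    private static final byte[] OLE_MAGIC = {
            (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
            (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1};

    // Reads the first bytes of the file and compares them with the known signatures
    static Container sniff(Path file) throws IOException {
        byte[] header = new byte[OLE_MAGIC.length];
        int read;
        try (InputStream in = Files.newInputStream(file)) {
            read = in.readNBytes(header, 0, header.length);
        }
        if (read >= OLE_MAGIC.length && Arrays.equals(header, OLE_MAGIC)) {
            return Container.OLE_COMPOUND;
        }
        if (read >= ZIP_MAGIC.length
                && Arrays.equals(Arrays.copyOf(header, ZIP_MAGIC.length), ZIP_MAGIC)) {
            return Container.ZIP;
        }
        return Container.UNKNOWN;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("Usage: java ContainerSniffer <file.pptx>");
            return;
        }
        System.out.println(args[0] + ": " + sniff(Paths.get(args[0])));
    }
}
```

For a .pptx file, a result of OLE_COMPOUND suggests that the file is still encrypted, and ZIP suggests a plain package.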


Remove Mark as Final Option from a PowerPoint Document in Java

The mark as final feature makes a PowerPoint document read-only to prevent further changes. If you decide to make changes to the document later, you can remove the mark as final option from it.

The following steps demonstrate how to remove the mark as final option from a PowerPoint document:

  • Initialize an instance of Presentation class.
  • Load a PowerPoint document using Presentation.loadFromFile() method.
  • Remove the mark as final option from the document using Presentation.getDocumentProperty().set() method.
  • Save the result document using Presentation.saveToFile() method.
import com.spire.presentation.FileFormat;
import com.spire.presentation.Presentation;

public class RemoveMarkAsFinalFromPPT {
    public static void main(String[] args) throws Exception {
        //Create a Presentation instance
        Presentation ppt = new Presentation();
        //Load a PowerPoint document
        ppt.loadFromFile("MarkAsFinal.pptx");

        //Remove mark as final option from the document
        ppt.getDocumentProperty().set("_MarkAsFinal", false);

        //Save the result document
        ppt.saveToFile("RemoveMarkAsFinal.pptx", FileFormat.PPTX_2013);
    }
}


Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
