Java: Find and Replace Text in PDF

2024-03-22 02:57:00 Written by Koohji

Finding and replacing text in PDF documents is essential for updating reports, legal contracts, or any other type of document where accurate and consistent information is crucial. The process involves identifying specific pieces of text and replacing them with new content, allowing users to update placeholder text, correct mistakes, customize document details, or undertake other modifications involving the written word.

This article introduces how to find and replace text in a PDF document in Java by using the Spire.PDF for Java library.

Replace Text in a Specific PDF Page in Java
Replace Text in an Entire PDF Document in Java
Replace the First Instance of the Target Text in Java
Replace Text Based on a Regular Expression in Java

Install Spire.PDF for Java

First of all, you're required to add the Spire.Pdf.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file.

Package Manager

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf</artifactId>
        <version>11.12.16</version>
    </dependency>
</dependencies>

Replace Text in a Specific PDF Page in Java

In Spire.PDF for Java, the PdfTextReplacer class is designed to facilitate text replacement within PDF documents. One of its primary methods, replaceAllText(), enables developers to replace all instances of a specified text on a page with new text.

To replace text on a specific page in Java, follow these steps:

Create a PdfDocument object.
Load a PDF file from a specified path.
Get a specific page from the document.
Create a PdfTextReplaceOptions object, and specify the replace options using its setReplaceType() method.
Create a PdfTextReplacer object based on the page, and apply the replace options using its setOptions() method.
Replace all instances of the target text on the page with new text using the PdfTextReplacer.replaceAllText() method.
Save the document to a different PDF file.

Java

import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;
import com.spire.pdf.texts.PdfTextReplaceOptions;
import com.spire.pdf.texts.PdfTextReplacer;
import com.spire.pdf.texts.ReplaceActionType;

import java.util.EnumSet;

public class ReplaceTextInPage {

    public static void main(String[] args) {

        // Create a PdfDocument object
        PdfDocument doc = new PdfDocument();

        // Load a PDF file
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf");

        // Create a PdfTextReplaceOptions object
        PdfTextReplaceOptions textReplaceOptions = new PdfTextReplaceOptions();

        // Specify the options for text replacement; both flags go into one EnumSet
        // so that a later setReplaceType() call does not overwrite an earlier one
        textReplaceOptions.setReplaceType(EnumSet.of(ReplaceActionType.IgnoreCase, ReplaceActionType.WholeWord));

        // Get a specific page
        PdfPageBase page = doc.getPages().get(0);

        // Create a PdfTextReplacer object based on the page
        PdfTextReplacer textReplacer = new PdfTextReplacer(page);

        // Set the replace options
        textReplacer.setOptions(textReplaceOptions);

        // Replace all instances of target text with new text
        textReplacer.replaceAllText("MySQL", "mysql");

        // Save the document to a different PDF file
        doc.saveToFile("output/ReplaceTextInPage.pdf");

        // Dispose resources
        doc.dispose();
    }
}
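
The replace options are passed to setReplaceType() as an EnumSet, so several flags can travel in one set. The sketch below is plain Java and does not use Spire.PDF: MatchFlag and MatchOptions are hypothetical stand-ins, and the setter is assumed to behave like a conventional setter that stores the set it receives. Under that assumption, flags that should apply together belong in a single EnumSet.of(...) call.

```java
import java.util.EnumSet;

class EnumSetOptionsDemo {

    // Hypothetical stand-in for a flag enum; not a Spire.PDF type.
    enum MatchFlag { IGNORE_CASE, WHOLE_WORD, REGEX }

    // Hypothetical stand-in for an options holder; not a Spire.PDF type.
    static class MatchOptions {
        private EnumSet<MatchFlag> flags = EnumSet.noneOf(MatchFlag.class);

        // A conventional setter: the new set replaces whatever was stored before.
        void setFlags(EnumSet<MatchFlag> newFlags) {
            this.flags = EnumSet.copyOf(newFlags);
        }

        EnumSet<MatchFlag> getFlags() {
            return EnumSet.copyOf(flags);
        }
    }

    // Two separate calls: only the flag from the second call survives.
    static EnumSet<MatchFlag> setTwice() {
        MatchOptions options = new MatchOptions();
        options.setFlags(EnumSet.of(MatchFlag.IGNORE_CASE));
        options.setFlags(EnumSet.of(MatchFlag.WHOLE_WORD));
        return options.getFlags();
    }

    // One call with a combined set: both flags are stored.
    static EnumSet<MatchFlag> setOnce() {
        MatchOptions options = new MatchOptions();
        options.setFlags(EnumSet.of(MatchFlag.IGNORE_CASE, MatchFlag.WHOLE_WORD));
        return options.getFlags();
    }

    public static void main(String[] args) {
        System.out.println("Two calls: " + setTwice());
        System.out.println("One call:  " + setOnce());
    }
}
```

With a setter of this kind, the two-call variant ends up holding only WHOLE_WORD, while the single call holds both flags.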

Replace Text in an Entire PDF Document in Java

You already know how to replace text on one page. To replace every instance of a specific text throughout a PDF document, iterate through the pages of the document and call the PdfTextReplacer.replaceAllText() method on each one.

The following are the steps to replace text in an entire PDF document using Java.

Create a PdfDocument object.
Load a PDF file from a specified path.
Create a PdfTextReplaceOptions object, and specify the replace options using its setReplaceType() method.
Iterate through the pages in the document.
- Create a PdfTextReplacer object based on the current page, and apply the replace options using its setOptions() method.
- Replace all instances of the target text on the page with new text using the PdfTextReplacer.replaceAllText() method.
Save the document to a different PDF file.

Java

import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;
import com.spire.pdf.texts.PdfTextReplaceOptions;
import com.spire.pdf.texts.PdfTextReplacer;
import com.spire.pdf.texts.ReplaceActionType;

import java.util.EnumSet;

public class ReplaceTextInDocument {

    public static void main(String[] args) {

        // Create a PdfDocument object
        PdfDocument doc = new PdfDocument();

        // Load a PDF file
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf");

        // Create a PdfTextReplaceOptions object
        PdfTextReplaceOptions textReplaceOptions = new PdfTextReplaceOptions();

        // Specify the options for text replacement; both flags go into one EnumSet
        // so that a later setReplaceType() call does not overwrite an earlier one
        textReplaceOptions.setReplaceType(EnumSet.of(ReplaceActionType.IgnoreCase, ReplaceActionType.WholeWord));

        for (int i = 0; i < doc.getPages().getCount(); i++) {

            // Get a specific page
            PdfPageBase page = doc.getPages().get(i);

            // Create a PdfTextReplacer object based on the page
            PdfTextReplacer textReplacer = new PdfTextReplacer(page);

            // Set the replace options
            textReplacer.setOptions(textReplaceOptions);

            // Replace all instances of target text with new text
            textReplacer.replaceAllText("MySQL", "mysql");
        }

        // Save the document to a different PDF file
        doc.saveToFile("output/ReplaceTextInDocument.pdf");

        // Dispose resources
        doc.dispose();
    }
}
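
To make the intent of combining the IgnoreCase and WholeWord options with a per-page loop more concrete, here is a minimal sketch in plain Java. It models each page as a String and uses java.util.regex, not Spire.PDF; the class and method names are hypothetical, and how Spire.PDF itself decides what counts as a whole word is not stated in this article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeWordReplaceSketch {

    // Replaces every whole-word, case-insensitive occurrence of target in one page's text.
    static String replaceOnPage(String pageText, String target, String replacement) {
        Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(target) + "\\b", Pattern.CASE_INSENSITIVE);
        return pattern.matcher(pageText).replaceAll(Matcher.quoteReplacement(replacement));
    }

    // Applies the same replacement to each page in turn, mirroring a per-page loop.
    static List<String> replaceInDocument(List<String> pages, String target, String replacement) {
        List<String> result = new ArrayList<>();
        for (String page : pages) {
            result.add(replaceOnPage(page, target, replacement));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> pages = List.of(
                "MySQL is popular. MYSQL runs on Linux.",
                "The MySQLdb driver talks to MySQL.");
        replaceInDocument(pages, "MySQL", "mysql").forEach(System.out::println);
    }
}
```

In this model, "MYSQL" is replaced because matching ignores case, while "MySQLdb" is left alone because the target is not a whole word there.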

Replace the First Instance of the Target Text in Java

To replace the first instance of the target text in a page, you can make use of the replaceText() method from the PdfTextReplacer class. Here are the steps to accomplish this task in Java.

Create a PdfDocument object.
Load a PDF file from a specified path.
Get a specific page from the document.
Create a PdfTextReplaceOptions object, and specify the replace options using its setReplaceType() method.
Create a PdfTextReplacer object based on the page, and apply the replace options using its setOptions() method.
Replace the first occurrence of the target text on the page with new text using the PdfTextReplacer.replaceText() method.
Save the document to a different PDF file.

Java

import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;
import com.spire.pdf.texts.PdfTextReplaceOptions;
import com.spire.pdf.texts.PdfTextReplacer;
import com.spire.pdf.texts.ReplaceActionType;

import java.util.EnumSet;

public class ReplaceFirstInstance {

    public static void main(String[] args) {

        // Create a PdfDocument object
        PdfDocument doc = new PdfDocument();

        // Load a PDF file
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf");

        // Create a PdfTextReplaceOptions object
        PdfTextReplaceOptions textReplaceOptions = new PdfTextReplaceOptions();

        // Specify the options for text replacement; both flags go into one EnumSet
        // so that a later setReplaceType() call does not overwrite an earlier one
        textReplaceOptions.setReplaceType(EnumSet.of(ReplaceActionType.IgnoreCase, ReplaceActionType.WholeWord));

        // Get a specific page
        PdfPageBase page = doc.getPages().get(0);

        // Create a PdfTextReplacer object based on the page
        PdfTextReplacer textReplacer = new PdfTextReplacer(page);

        // Set the replace options
        textReplacer.setOptions(textReplaceOptions);

        // Replace the first instance of target text with new text
        textReplacer.replaceText("MySQL", "mysql");

        // Save the document to a different PDF file
        doc.saveToFile("output/ReplaceFirstInstance.pdf");

        // Dispose resources
        doc.dispose();
    }
}
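
The difference between replacing the first instance and replacing every instance can be previewed on an ordinary string. The following sketch uses only java.lang.String and java.util.regex; the class and method names are hypothetical, and it is an analogy for the behavior described above rather than a view into how Spire.PDF performs the replacement.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstVersusAllSketch {

    // Replaces only the first literal occurrence of target.
    static String replaceFirstLiteral(String text, String target, String replacement) {
        return text.replaceFirst(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    // Replaces every literal occurrence of target.
    static String replaceAllLiteral(String text, String target, String replacement) {
        return text.replace(target, replacement);
    }

    public static void main(String[] args) {
        String text = "MySQL 5.7 and MySQL 8.0";
        System.out.println(replaceFirstLiteral(text, "MySQL", "mysql"));
        System.out.println(replaceAllLiteral(text, "MySQL", "mysql"));
    }
}
```

Pattern.quote() and Matcher.quoteReplacement() keep characters such as "." or "$" in the target or replacement from being interpreted as regular expression syntax.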

Replace Text Based on a Regular Expression in Java

Regular expressions are powerful, flexible patterns commonly used for matching text. With Spire.PDF for Java, you can use a regular expression to search for text within a PDF document and replace every match with new text.

To replace text in a PDF based on a regular expression, you can follow these steps:

Create a PdfDocument object.
Load a PDF file from a specified path.
Get a specific page from the document.
Create a PdfTextReplaceOptions object.
Specify the replace type as Regex using the PdfTextReplaceOptions.setReplaceType() method.
Create a PdfTextReplacer object based on the page, and apply the replace options using its setOptions() method.
Find and replace the text that matches a specified regular expression using the PdfTextReplacer.replaceAllText() method.
Save the document to a different PDF file.

Java

import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;
import com.spire.pdf.texts.PdfTextReplaceOptions;
import com.spire.pdf.texts.PdfTextReplacer;
import com.spire.pdf.texts.ReplaceActionType;

import java.util.EnumSet;

public class ReplaceBasedOnRegularExpression {

    public static void main(String[] args) {

        // Create a PdfDocument object
        PdfDocument doc = new PdfDocument();

        // Load a PDF file
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf");

        // Create a PdfTextReplaceOptions object
        PdfTextReplaceOptions textReplaceOptions = new PdfTextReplaceOptions();

        // Set the replace type as Regex
        textReplaceOptions.setReplaceType(EnumSet.of(ReplaceActionType.Regex));

        // Get a specific page
        PdfPageBase page = doc.getPages().get(0);

        // Create a PdfTextReplacer object based on the page
        PdfTextReplacer textReplacer = new PdfTextReplacer(page);

        // Set the replace options
        textReplacer.setOptions(textReplaceOptions);

        // Specify the regular expression
        String regularExpression = "\\bS\\w*L\\b";

        // Replace all instances that match the regular expression with new text
        textReplacer.replaceAllText(regularExpression, "NEW");

        // Save the document to a different PDF file
        doc.saveToFile("output/ReplaceWithRegularExpression.pdf");

        // Dispose resources
        doc.dispose();
    }
}
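
Before running a regular expression against a PDF, it can help to preview what the pattern matches on sample text. The sketch below tries the same pattern, \bS\w*L\b, with java.util.regex. The class name and sample text are hypothetical, and the article does not state which regular expression dialect Spire.PDF uses, so results inside the library may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexPreviewSketch {

    // Words that start with an uppercase S and end with an uppercase L.
    static final Pattern PATTERN = Pattern.compile("\\bS\\w*L\\b");

    // Collects every match of the pattern in the given text.
    static List<String> findMatches(String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = PATTERN.matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    // Replaces every match of the pattern with the literal replacement.
    static String replaceMatches(String text, String replacement) {
        return PATTERN.matcher(text).replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String sample = "SQL and MySQL, SSL and Shell";
        System.out.println(findMatches(sample));
        System.out.println(replaceMatches(sample, "NEW"));
    }
}
```

On this sample, the pattern matches "SQL" and "SSL". "MySQL" is skipped because there is no word boundary before its "S", and "Shell" is skipped because the pattern is case-sensitive and requires an uppercase "L" at the end.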

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

C#/VB.NET: Convert PDF to Grayscale (Black and White)

2021-11-12 06:43:17 Written by Koohji

Converting a PDF with color images to grayscale can help you reduce the file size and print the PDF in a more affordable mode without consuming colored ink. In this article, you will learn how to achieve the conversion programmatically in C# and VB.NET using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

Package Manager

PM> Install-Package Spire.PDF

Convert PDF to Grayscale

The following are the steps to convert a color PDF to grayscale:

Load a PDF file using the PdfGrayConverter class.
Convert the PDF to grayscale using the PdfGrayConverter.ToGrayPdf() method.

C#

using Spire.Pdf.Conversion;

namespace ConvertPdfToGrayscale
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfGrayConverter instance and load a PDF file
            PdfGrayConverter converter = new PdfGrayConverter(@"Sample.pdf");
            //Convert the PDF to grayscale
            converter.ToGrayPdf("Grayscale.pdf");
            converter.Dispose();
        }
    }
}

VB.NET

Imports Spire.Pdf.Conversion

Namespace ConvertPdfToGrayscale
    Friend Class Program
        Private Shared Sub Main(ByVal args As String())
            'Create a PdfGrayConverter instance and load a PDF file
            Dim converter As PdfGrayConverter = New PdfGrayConverter("Sample.pdf")
            'Convert the PDF to grayscale
            converter.ToGrayPdf("Grayscale.pdf")
            converter.Dispose()
        End Sub
    End Class
End Namespace

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Java: Convert PDF to PostScript

2021-11-12 02:16:01 Written by Koohji

PostScript, also known as PS, is a dynamically typed, concatenative programming language that describes the appearance of a printed page. Because it can offer faster printing and improved quality, you may sometimes need to convert a PDF document to PostScript. In this article, you will learn how to do this using Spire.PDF for Java.

Install Spire.PDF for Java

First of all, you're required to add the Spire.PDF.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file.

Package Manager

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf</artifactId>
        <version>11.12.16</version>
    </dependency>
</dependencies>

Convert PDF to PostScript

The following are the steps to convert PDF to PostScript.

Create a PdfDocument object.
Load a sample PDF file using the PdfDocument.loadFromFile() method.
Save the document as PostScript using the PdfDocument.saveToFile() method.

Java

import com.spire.pdf.*;

public class PDFToPS {
    public static void main(String[] args) {

        //Load a pdf document
        PdfDocument doc = new PdfDocument();
        doc.loadFromFile("sample.pdf");

        //Convert to PostScript file
        doc.saveToFile("output.ps", FileFormat.POSTSCRIPT);

        //Dispose resources
        doc.dispose();

    }
}
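
When converting files other than a fixed sample.pdf, the output name usually has to be derived from the input name. The helper below is a small sketch using only java.nio.file; the class and method names are hypothetical. It also creates the output directory up front, on the assumption (not confirmed by this article) that saveToFile() does not create missing directories itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class OutputPathSketch {

    // Swaps the input file's extension for ".ps" and places the result in outputDir.
    static Path postScriptPathFor(Path inputPdf, Path outputDir) {
        String name = inputPdf.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = (dot > 0) ? name.substring(0, dot) : name;
        return outputDir.resolve(base + ".ps");
    }

    // Creates the output directory (and any missing parents), then returns the target path.
    static Path prepare(Path inputPdf, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        return postScriptPathFor(inputPdf, outputDir);
    }

    public static void main(String[] args) throws IOException {
        Path target = prepare(Paths.get("sample.pdf"), Paths.get("output"));
        System.out.println(target);
    }
}
```

The resulting path can then be passed as a string, for example target.toString(), to the save call shown above.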

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
