Java: Find and Extract Hyperlinks in Word Documents

2022-05-24 01:48:53 Written by Administrator

Hyperlinks in Word documents can lead readers to a webpage, an external file, an email address, or a specific place in the document being read. They are commonly used in Word documents for their convenience. This article shows how to use Spire.Doc for Java to find and extract hyperlinks in Word documents, including their display text and link addresses.

Find and Extract a Specified Hyperlink in a Word Document
Find and Extract All the Hyperlinks in a Word Document

Install Spire.Doc for Java

First, you're required to add the Spire.Doc.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file.

Package Manager

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.doc</artifactId>
        <version>14.1.3</version>
    </dependency>
</dependencies>

Find and Extract a Specified Hyperlink in a Word Document

The detailed steps are as follows:

Create a Document instance and load a Word document from disk using Document.loadFromFile() method.
Create an object of ArrayList<Field>.
Iterate through the items in the sections to find all hyperlinks.
Get the text of the first hyperlink using the Field.getFieldText() method and get its link using the Field.getValue() method.
Save the text and the link of the first hyperlink to a TXT file using custom method writeStringToText().

Java

import com.spire.doc.*;
import com.spire.doc.documents.*;
import com.spire.doc.fields.Field;

import java.io.*;
import java.util.ArrayList;

public class findHyperlinks {
    public static void main(String[] args) throws IOException {
        //Create a Document instance and load a Word document from file
        String input = "D:/testp/test.docx";
        Document doc = new Document();
        doc.loadFromFile(input);

        //Create an object of ArrayList
        ArrayList<Field> hyperlinks = new ArrayList<>();

        //Iterate through the items in the sections to find all hyperlinks
        for (Section section : (Iterable<Section>) doc.getSections()) {
            for (DocumentObject object : (Iterable<DocumentObject>) section.getBody().getChildObjects()) {
                if (object.getDocumentObjectType().equals(DocumentObjectType.Paragraph)) {
                    Paragraph paragraph = (Paragraph) object;
                    for (DocumentObject cObject : (Iterable<DocumentObject>) paragraph.getChildObjects()) {
                        if (cObject.getDocumentObjectType().equals(DocumentObjectType.Field)) {
                            Field field = (Field) cObject;
                            if (field.getType().equals(FieldType.Field_Hyperlink)) {
                                hyperlinks.add(field);
                            }
                        }
                    }
                }
            }
        }

        //Get the text and the address of the first hyperlink
        String hyperlinksText = hyperlinks.get(0).getFieldText();
        String hyperlinkAddress = hyperlinks.get(0).getValue();

        //Save the text and the link of the first hyperlink to a TXT file
        String output = "D:/javaOutput/HyperlinkTextAndLink.txt";
        writeStringToText("Text:\r\n" + hyperlinksText+ "\r\n" + "Link:\r\n" + hyperlinkAddress, output);
    }

    //Create a method to write the text and link of hyperlinks to a TXT file
    public static void writeStringToText(String content, String textFileName) throws IOException {
        File file = new File(textFileName);
        if (file.exists())
        {
            file.delete();
        }
        FileWriter fWriter = new FileWriter(textFileName, true);
        try {
            fWriter.write(content);
        } catch (IOException ex) {
            ex.printStackTrace();
        } finally {
            try {
                fWriter.flush();
                fWriter.close();
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }
    }
}
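
The writeStringToText() helper can also be written more compactly with java.nio.file.Files. The following is a minimal sketch, assuming Java 7 or later; the class name TextFileWriter is made up for this illustration and is not part of Spire.Doc. Files.write() creates the file or truncates an existing one by default, so no explicit delete step is needed, and the sketch also creates the output folder when it is missing.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class; the name is illustrative and not part of Spire.Doc.
class TextFileWriter {

    // Writes content to the given file, replacing any existing file.
    static void writeStringToText(String content, String textFileName) throws IOException {
        Path path = Paths.get(textFileName);

        // Create the output folder if it does not exist yet.
        Path parent = path.toAbsolutePath().getParent();
        if (parent != null && !Files.isDirectory(parent)) {
            Files.createDirectories(parent);
        }

        // Files.write() creates the file or truncates an existing one by default.
        Files.write(path, content.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the output path used in the article.
        Path out = Files.createTempFile("HyperlinkTextAndLink", ".txt");
        writeStringToText("Text:\r\nExample\r\nLink:\r\nhttps://www.example.com", out.toString());
        System.out.println(new String(Files.readAllBytes(out), StandardCharsets.UTF_8));
    }
}
```

Unlike the original helper, this version lets any IOException propagate to the caller instead of printing the stack trace and continuing.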

Find and Extract All the Hyperlinks in a Word Document

The detailed steps are as follows:

Create a Document instance and load a Word document from disk using Document.loadFromFile() method.
Create an object of ArrayList<Field>.
Iterate through the items in the sections to find all hyperlinks.
Get the texts of the hyperlinks using the Field.getFieldText() method and get their links using the Field.getValue() method.
Save the text and the links of the hyperlinks to a TXT file using custom method writeStringToText().

Java

import com.spire.doc.*;
import com.spire.doc.documents.*;
import com.spire.doc.fields.Field;

import java.io.*;
import java.util.ArrayList;

public class findHyperlinks {
    public static void main(String[] args) throws IOException {
        //Create a Document instance and load a Word document from file
        String input = "D:/testp/test.docx";
        Document doc = new Document();
        doc.loadFromFile(input);

        //Create an object of ArrayList
        ArrayList<Field> hyperlinks = new ArrayList<>();
        String hyperlinkText = "";
        String hyperlinkAddress = "";

        //Iterate through the items in the sections to find all hyperlinks
        for (Section section : (Iterable<Section>) doc.getSections()) {
            for (DocumentObject object : (Iterable<DocumentObject>) section.getBody().getChildObjects()) {
                if (object.getDocumentObjectType().equals(DocumentObjectType.Paragraph)) {
                    Paragraph paragraph = (Paragraph) object;
                    for (DocumentObject cObject : (Iterable<DocumentObject>) paragraph.getChildObjects()) {
                        if (cObject.getDocumentObjectType().equals(DocumentObjectType.Field)) {
                            Field field = (Field) cObject;
                            if (field.getType().equals(FieldType.Field_Hyperlink)) {
                                hyperlinks.add(field);
                                //Append the text and the link of each hyperlink
                                hyperlinkText += field.getFieldText() + "\r\n";
                                hyperlinkAddress += field.getValue() + "\r\n";
                            }
                        }
                    }
                }
            }
        }

        //Save the texts and the links of the hyperlinks to a TXT file
        String output = "D:/javaOutput/HyperlinksTextsAndLinks.txt";
        writeStringToText("Text:\r\n" + hyperlinkText + "\r\n" + "Link:\r\n" + hyperlinkAddress + "\r\n", output);
    }

    //Create a method to write the text and link of hyperlinks to a TXT file
    public static void writeStringToText(String content, String textFileName) throws IOException {
        File file = new File(textFileName);
        if (file.exists())
        {
            file.delete();
        }
        FileWriter fWriter = new FileWriter(textFileName, true);
        try {
            fWriter.write(content);
        } catch (IOException ex) {
            ex.printStackTrace();
        } finally {
            try {
                fWriter.flush();
                fWriter.close();
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }
    }
}
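
When a document contains several hyperlinks, the output can be easier to read if each text is kept next to its link. The following is a minimal sketch in plain Java, assuming the texts and links have already been collected into two lists of equal length (for example from getFieldText() and getValue()). HyperlinkReport and format() are hypothetical names, not Spire.Doc API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; it only formats strings and does not depend on Spire.Doc.
class HyperlinkReport {

    // Builds one "Text: ... / Link: ..." pair per hyperlink, in document order.
    static String format(List<String> texts, List<String> links) {
        if (texts.size() != links.size()) {
            throw new IllegalArgumentException("texts and links must have the same size");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < texts.size(); i++) {
            sb.append("Text: ").append(texts.get(i)).append("\r\n")
              .append("Link: ").append(links.get(i)).append("\r\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample values standing in for what was read from the document.
        List<String> texts = Arrays.asList("Home", "Mail");
        List<String> links = Arrays.asList("https://www.example.com", "mailto:someone@example.com");
        System.out.print(format(texts, links));
    }
}
```

The resulting string can be passed to writeStringToText() in place of the concatenated text and link blocks.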

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Published in Hyperlink

Tagged under

doc java Hyperlink

Java: Remove Hyperlinks in Word Documents

2022-05-20 03:42:14 Written by Administrator

Hyperlinks are usually attached to text. By clicking a hyperlink, readers can open a website, a document, an email address, or another element. Some Word documents, especially those generated from web content, may contain unwanted hyperlinks, such as advertisements. This article shows you how to programmatically remove one hyperlink or all hyperlinks in a Word document using Spire.Doc for Java.

Remove a Specified Hyperlink in a Word Document
Remove All the Hyperlinks in a Word Document

Install Spire.Doc for Java

Package Manager

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.doc</artifactId>
        <version>14.1.3</version>
    </dependency>
</dependencies>

Remove a Specified Hyperlink in a Word Document

The detailed steps to remove a specified hyperlink in a Word file are as follows:

Create a Document object and load a Word document from disk using Document.loadFromFile() method.
Find all the hyperlinks using custom method FindAllHyperlinks().
Flatten the first hyperlink using custom method FlattenHyperlinks().
Save the document using Document.saveToFile() method.

Java

import com.spire.doc.*;
import com.spire.doc.documents.DocumentObjectType;
import com.spire.doc.documents.Paragraph;
import com.spire.doc.documents.UnderlineStyle;
import com.spire.doc.fields.Field;
import com.spire.doc.fields.TextRange;

import java.awt.*;
import java.util.ArrayList;

public class removeHyperlink {
    public static void main(String[] args) {
        //Create a Document object and load a Word document from disk
        String input = "D:/testp/test.docx";
        Document doc = new Document();
        doc.loadFromFile(input);

        //Find all hyperlinks
        ArrayList<Field> hyperlinks = FindAllHyperlinks(doc);

        //Flatten the first hyperlink
        FlattenHyperlinks(hyperlinks.get(0));

        //Save the document to file
        String output = "D:/javaOutput/RemoveHyperlinks.docx";
        doc.saveToFile(output, FileFormat.Docx);
    }

    //Create a method FindAllHyperlinks() to get all the hyperlinks from the sample document
    private static ArrayList<Field> FindAllHyperlinks(Document document)
    {
        ArrayList<Field> hyperlinks = new ArrayList<>();

        //Iterate through the items in the sections to find all hyperlinks
        for (Section section : (Iterable<Section>)document.getSections())
        {
            for (DocumentObject object : (Iterable<DocumentObject>)section.getBody().getChildObjects())
            {
                if (object.getDocumentObjectType().equals(DocumentObjectType.Paragraph))
                {
                    Paragraph paragraph = (Paragraph) object;
                    for (DocumentObject cObject : (Iterable<DocumentObject>)paragraph.getChildObjects())
                    {
                        if (cObject.getDocumentObjectType().equals(DocumentObjectType.Field))
                        {
                            Field field = (Field) cObject;
                            if (field.getType().equals(FieldType.Field_Hyperlink))
                            {
                                hyperlinks.add(field);
                            }
                        }
                    }
                }
            }
        }
        return hyperlinks;
    }

    //Create a method FlattenHyperlinks() to flatten the hyperlink field
    public static void FlattenHyperlinks(Field field)
    {
        int ownerParaIndex = field.getOwnerParagraph().getOwnerTextBody().getChildObjects().indexOf(field.getOwnerParagraph());
        int fieldIndex = field.getOwnerParagraph().getChildObjects().indexOf(field);
        Paragraph sepOwnerPara = field.getSeparator().getOwnerParagraph();
        int sepOwnerParaIndex = field.getSeparator().getOwnerParagraph().getOwnerTextBody().getChildObjects().indexOf(field.getSeparator().getOwnerParagraph());
        int sepIndex = field.getSeparator().getOwnerParagraph().getChildObjects().indexOf(field.getSeparator());
        int endIndex = field.getEnd().getOwnerParagraph().getChildObjects().indexOf(field.getEnd());
        int endOwnerParaIndex = field.getEnd().getOwnerParagraph().getOwnerTextBody().getChildObjects().indexOf(field.getEnd().getOwnerParagraph());
        FormatFieldResultText(field.getSeparator().getOwnerParagraph().getOwnerTextBody(), sepOwnerParaIndex, endOwnerParaIndex, sepIndex, endIndex);
        field.getEnd().getOwnerParagraph().getChildObjects().removeAt(endIndex);
        for (int i = sepOwnerParaIndex; i >= ownerParaIndex; i--)
        {
            if (i == sepOwnerParaIndex && i == ownerParaIndex)
            {
                for (int j = sepIndex; j >= fieldIndex; j--)
                {
                    field.getOwnerParagraph().getChildObjects().removeAt(j);
                }
            }
            else if (i == ownerParaIndex)
            {
                for (int j = field.getOwnerParagraph().getChildObjects().getCount() - 1; j >= fieldIndex; j--)
                {
                    field.getOwnerParagraph().getChildObjects().removeAt(j);
                }
            }
            else if (i == sepOwnerParaIndex)
            {
                for (int j = sepIndex; j >= 0; j--)
                {
                    sepOwnerPara.getChildObjects().removeAt(j);
                }
            }
            else
            {
                field.getOwnerParagraph().getOwnerTextBody().getChildObjects().removeAt(i);
            }
        }
    }

    //Create a method FormatFieldResultText() to remove the font color and underline format of the hyperlinks
    private static void FormatFieldResultText(Body ownerBody, int sepOwnerParaIndex, int endOwnerParaIndex, int sepIndex, int endIndex)
    {
        for (int i = sepOwnerParaIndex; i <= endOwnerParaIndex; i++)
        {
            Paragraph para = (Paragraph) ownerBody.getChildObjects().get(i);
            if (i == sepOwnerParaIndex && i == endOwnerParaIndex)
            {
                for (int j = sepIndex + 1; j < endIndex; j++)
                {
                    FormatText((TextRange)para.getChildObjects().get(j));
                }
            }
            else if (i == sepOwnerParaIndex)
            {
                for (int j = sepIndex + 1; j < para.getChildObjects().getCount(); j++)
                {
                    FormatText((TextRange)para.getChildObjects().get(j));
                }
            }
            else if (i == endOwnerParaIndex)
            {
                for (int j = 0; j < endIndex; j++)
                {
                    FormatText((TextRange)para.getChildObjects().get(j));
                }
            }
            else
            {
                for (int j = 0; j < para.getChildObjects().getCount(); j++)
                {
                    FormatText((TextRange)para.getChildObjects().get(j));
                }
            }
        }
    }

    //Create a method FormatText() to change the color of the text to black and remove the underline
    private static void FormatText(TextRange tr)
    {
        //Set the text color to black
        tr.getCharacterFormat().setTextColor(Color.black);

        //Set the text underline style to none
        tr.getCharacterFormat().setUnderlineStyle(UnderlineStyle.None);
    }
}
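
To make the idea behind FlattenHyperlinks() easier to follow, the sketch below models a paragraph as a plain list of strings. It assumes the field layout implied by the code above (begin mark, field code, separator, result text, end mark) and only covers a field that sits inside a single paragraph. FlattenFieldSketch and its marker strings are hypothetical stand-ins for illustration, not Spire.Doc API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model: strings stand in for a paragraph's child objects.
class FlattenFieldSketch {
    static final String BEGIN = "[FIELD BEGIN]";
    static final String SEPARATOR = "[FIELD SEPARATOR]";
    static final String END = "[FIELD END]";

    // Finds the first occurrence of marker at or after index from.
    static int indexFrom(List<String> items, String marker, int from) {
        for (int i = from; i < items.size(); i++) {
            if (items.get(i).equals(marker)) {
                return i;
            }
        }
        throw new IllegalArgumentException("Marker not found: " + marker);
    }

    // Removes the begin mark, field code, separator and end mark of the field
    // starting at beginIndex, and keeps the result text between separator and end.
    static void flatten(List<String> items, int beginIndex) {
        int sepIndex = indexFrom(items, SEPARATOR, beginIndex);
        int endIndex = indexFrom(items, END, sepIndex);

        // Remove the end mark first: it has the highest index, so the lower ones stay valid.
        items.remove(endIndex);

        // Remove separator, field code and begin mark, from back to front.
        for (int j = sepIndex; j >= beginIndex; j--) {
            items.remove(j);
        }
    }

    public static void main(String[] args) {
        List<String> paragraph = new ArrayList<>(Arrays.asList(
                "Visit ", BEGIN, "HYPERLINK \"https://www.example.com\"",
                SEPARATOR, "our site", END, " today."));
        flatten(paragraph, 1);
        System.out.println(paragraph); // prints [Visit , our site,  today.]
    }
}
```

Only the display text survives, which is why the article's version also resets the color and underline of that text before removing the field parts.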

Remove All the Hyperlinks in a Word Document

The detailed steps to remove all the hyperlinks in a Word file are as follows:

Create a Document object and load a Word document from disk using Document.loadFromFile() method.
Find all the hyperlinks using custom method FindAllHyperlinks().
Loop through the hyperlinks, and invoke the custom method FlattenHyperlinks() to flatten the specific hyperlink.
Save the document using Document.saveToFile() method.

Java

import com.spire.doc.*;
import com.spire.doc.documents.DocumentObjectType;
import com.spire.doc.documents.Paragraph;
import com.spire.doc.documents.UnderlineStyle;
import com.spire.doc.fields.Field;
import com.spire.doc.fields.TextRange;

import java.awt.*;
import java.util.ArrayList;

public class removeHyperlink {
    public static void main(String[] args) {
        //Create a Document object and load a Word document from disk
        String input = "D:/testp/test.docx";
        Document doc = new Document();
        doc.loadFromFile(input);

        //Find all the hyperlinks
        ArrayList<Field> hyperlinks = FindAllHyperlinks(doc);

        //Loop through the hyperlinks, and flatten the specific hyperlink.
        for (int i = hyperlinks.size() -1; i >= 0; i--)
        {
            FlattenHyperlinks(hyperlinks.get(i));
        }

        //Save the document to file
        String output = "D:/javaOutput/RemoveHyperlinks.docx";
        doc.saveToFile(output, FileFormat.Docx);
    }

    //Create a method FindAllHyperlinks() to get all the hyperlinks from the sample document
    private static ArrayList<Field> FindAllHyperlinks(Document document)
    {
        ArrayList<Field> hyperlinks = new ArrayList<>();

         //Iterate through the items in the sections to find all hyperlinks
        for (Section section : (Iterable<Section>)document.getSections())
        {
            for (DocumentObject object : (Iterable<DocumentObject>)section.getBody().getChildObjects())
            {
                if (object.getDocumentObjectType().equals(DocumentObjectType.Paragraph))
                {
                    Paragraph paragraph = (Paragraph) object;
                    for (DocumentObject cObject : (Iterable<DocumentObject>)paragraph.getChildObjects())
                    {
                        if (cObject.getDocumentObjectType().equals(DocumentObjectType.Field))
                        {
                            Field field = (Field) cObject;
                            if (field.getType().equals(FieldType.Field_Hyperlink))
                            {
                                hyperlinks.add(field);
                            }
                        }
                    }
                }
            }
        }
        return hyperlinks;
    }

    //Create a method FlattenHyperlinks() to flatten the hyperlink field
    public static void FlattenHyperlinks(Field field)
    {
        int ownerParaIndex = field.getOwnerParagraph().getOwnerTextBody().getChildObjects().indexOf(field.getOwnerParagraph());
        int fieldIndex = field.getOwnerParagraph().getChildObjects().indexOf(field);
        Paragraph sepOwnerPara = field.getSeparator().getOwnerParagraph();
        int sepOwnerParaIndex = field.getSeparator().getOwnerParagraph().getOwnerTextBody().getChildObjects().indexOf(field.getSeparator().getOwnerParagraph());
        int sepIndex = field.getSeparator().getOwnerParagraph().getChildObjects().indexOf(field.getSeparator());
        int endIndex = field.getEnd().getOwnerParagraph().getChildObjects().indexOf(field.getEnd());
        int endOwnerParaIndex = field.getEnd().getOwnerParagraph().getOwnerTextBody().getChildObjects().indexOf(field.getEnd().getOwnerParagraph());
        FormatFieldResultText(field.getSeparator().getOwnerParagraph().getOwnerTextBody(), sepOwnerParaIndex, endOwnerParaIndex, sepIndex, endIndex);
        field.getEnd().getOwnerParagraph().getChildObjects().removeAt(endIndex);
        for (int i = sepOwnerParaIndex; i >= ownerParaIndex; i--)
        {
            if (i == sepOwnerParaIndex && i == ownerParaIndex)
            {
                for (int j = sepIndex; j >= fieldIndex; j--)
                {
                    field.getOwnerParagraph().getChildObjects().removeAt(j);
                }
            }
            else if (i == ownerParaIndex)
            {
                for (int j = field.getOwnerParagraph().getChildObjects().getCount() - 1; j >= fieldIndex; j--)
                {
                    field.getOwnerParagraph().getChildObjects().removeAt(j);
                }
            }
            else if (i == sepOwnerParaIndex)
            {
                for (int j = sepIndex; j >= 0; j--)
                {
                    sepOwnerPara.getChildObjects().removeAt(j);
                }
            }
            else
            {
                field.getOwnerParagraph().getOwnerTextBody().getChildObjects().removeAt(i);
            }
        }
    }

    //Create a method FormatFieldResultText() to format the texts
    private static void FormatFieldResultText(Body ownerBody, int sepOwnerParaIndex, int endOwnerParaIndex, int sepIndex, int endIndex)
    {
        for (int i = sepOwnerParaIndex; i <= endOwnerParaIndex; i++)
        {
            Paragraph para = (Paragraph) ownerBody.getChildObjects().get(i);
            if (i == sepOwnerParaIndex && i == endOwnerParaIndex)
            {
                for (int j = sepIndex + 1; j < endIndex; j++)
                {
                    FormatText((TextRange)para.getChildObjects().get(j));
                }
            }
            else if (i == sepOwnerParaIndex)
            {
                for (int j = sepIndex + 1; j < para.getChildObjects().getCount(); j++)
                {
                    FormatText((TextRange)para.getChildObjects().get(j));
                }
            }
            else if (i == endOwnerParaIndex)
            {
                for (int j = 0; j < endIndex; j++)
                {
                    FormatText((TextRange)para.getChildObjects().get(j));
                }
            }
            else
            {
                for (int j = 0; j < para.getChildObjects().getCount(); j++)
                {
                    FormatText((TextRange)para.getChildObjects().get(j));
                }
            }
        }
    }

    //Create a method FormatText() to change the color of the text to black and remove the underline
    private static void FormatText(TextRange tr)
    {
        //Set the text color to black
        tr.getCharacterFormat().setTextColor(Color.black);

        //Set the text underline style to none
        tr.getCharacterFormat().setUnderlineStyle(UnderlineStyle.None);
    }
}
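
The removal loops in FlattenHyperlinks() count down from the highest index. A likely reason is that removing an item shifts every later item one position to the left, so removing from the back keeps the remaining indices valid. The sketch below illustrates this with a plain java.util.List; ReverseRemovalSketch and its methods are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; it uses only the JDK collections.
class ReverseRemovalSketch {

    // Removes the items at indices from..to (inclusive), highest index first.
    static <T> void removeRangeDescending(List<T> items, int from, int to) {
        for (int j = to; j >= from; j--) {
            items.remove(j);
        }
    }

    // Same loop counting up: each removal shifts the rest, so the wrong items are hit.
    static <T> void removeRangeAscending(List<T> items, int from, int to) {
        for (int j = from; j <= to; j++) {
            items.remove(j);
        }
    }

    public static void main(String[] args) {
        List<String> good = new ArrayList<>(Arrays.asList("a", "b", "c", "d", "e"));
        removeRangeDescending(good, 1, 2);
        System.out.println(good); // prints [a, d, e]

        List<String> bad = new ArrayList<>(Arrays.asList("a", "b", "c", "d", "e"));
        removeRangeAscending(bad, 1, 2);
        System.out.println(bad); // prints [a, c, e]
    }
}
```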

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Published in Hyperlink

Tagged under

doc java Hyperlink

Java: Set the Font Color for the Text String on PDF

2022-05-05 03:39:23 Written by Koohji

When you are drawing text into a PDF, you may need to define colorful brushes or pens in order to make the page more vivid. This article shows how to set the color of a text string in a PDF document using Spire.PDF for Java.

Install Spire.PDF for Java

First, you need to add the Spire.Pdf.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file by adding the following code to your project's pom.xml file.

Package Manager

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf</artifactId>
        <version>11.12.16</version>
    </dependency>
</dependencies>

Set the Font Color for the Text String on PDF

Spire.PDF for Java offers the PdfSolidBrush class to set the brush color for text. The brush color can be defined from an RGB color or an HTML color code.

Create a PdfDocument object.
Add a new page in the PDF using PdfDocument.getPages().add() method.
Create a PdfSolidBrush object based on a particular RGB color or an HTML color code.
Create an object of PdfTrueTypeFont to set the font name, size and style.
Draw text on the page at the specified location using PdfPageBase.getCanvas().drawString() method.
Save the document to another PDF using PdfDocument.saveToFile() method.

Java

import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;
import com.spire.pdf.graphics.PdfRGBColor;
import com.spire.pdf.graphics.PdfSolidBrush;
import com.spire.pdf.graphics.PdfTrueTypeFont;

import java.awt.*;

public class pdfBrush {
    public static void main(String[] args) throws Exception {

        //Create a PdfDocument object
        PdfDocument doc = new PdfDocument();
        //Add a page
        PdfPageBase page = doc.getPages().add();

        //Set the location
        float y = 30;

        //Create solid brush object and define the color
        PdfRGBColor rgb1 = new PdfRGBColor(Color.green);
        PdfSolidBrush brush1 = new PdfSolidBrush(rgb1);

        //RGB Color
        PdfRGBColor rgb2 = new PdfRGBColor(0,197,205);
        PdfSolidBrush brush2 = new PdfSolidBrush(rgb2);

        //HTML code color
        Color color = Color.decode("#A52A2A");
        PdfSolidBrush brush3 = new PdfSolidBrush(new PdfRGBColor(color));

        //Create true type font object
        Font font = new Font("Arial", java.awt.Font.BOLD, 14);
        PdfTrueTypeFont trueTypeFont = new PdfTrueTypeFont(font);

        //Draw text
        page.getCanvas().drawString("Set the text color with brush", trueTypeFont, brush1, 0, (y = y + 30f));
        page.getCanvas().drawString("Set the text color with RGB", trueTypeFont, brush2, 0, (y = y + 50f));
        page.getCanvas().drawString("Set the text color with HTML code color", trueTypeFont, brush3, 0, (y = y + 60f));

        //Save to file
        doc.saveToFile("output/CreatePdf.pdf");
    }
}
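
The HTML color code and the RGB triple are two notations for the same value. The following is a small sketch using only java.awt.Color from the JDK, with no Spire.PDF calls; HtmlColorSketch, fromHtml() and toHtml() are hypothetical names for this illustration. It shows how Color.decode() turns a hex code into red, green and blue components, and how an RGB color can be written back as a hex code.

```java
import java.awt.Color;

// Hypothetical demo class; it uses only java.awt.Color from the JDK.
class HtmlColorSketch {

    // Converts an HTML hex code such as "#A52A2A" into a java.awt.Color.
    static Color fromHtml(String code) {
        return Color.decode(code);
    }

    // Formats a color as an uppercase #RRGGBB code.
    static String toHtml(Color c) {
        return String.format("#%02X%02X%02X", c.getRed(), c.getGreen(), c.getBlue());
    }

    public static void main(String[] args) {
        Color brown = fromHtml("#A52A2A");
        // prints 165, 42, 42
        System.out.println(brown.getRed() + ", " + brown.getGreen() + ", " + brown.getBlue());

        // The RGB color used for brush2 in the article, written as an HTML code.
        System.out.println(toHtml(new Color(0, 197, 205))); // prints #00C5CD
    }
}
```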

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Published in Text

Tagged under

pdf java Text
