Spire.Office Knowledgebase Page 16 | E-iceblue

PDFs are versatile documents that often contain images to enhance their visual appeal and convey information. The ability to manipulate these images - adding new ones, replacing existing ones, or removing unwanted ones - is a valuable skill. In this article, you will learn how to add, replace, or delete images in a PDF document in React using Spire.PDF for JavaScript.

Install Spire.PDF for JavaScript

To get started with manipulating images in PDF in a React application, you can either download Spire.PDF for JavaScript from our website or install it via npm with the following command:

npm i spire.pdf

After that, copy the "Spire.Pdf.Base.js" and "Spire.Pdf.Base.wasm" files to the public folder of your project.

For more details, refer to the documentation: How to Integrate Spire.PDF for JavaScript in a React Project

Add an Image to a PDF Document in JavaScript

Spire.PDF for JavaScript provides the PdfPage.Canvas.DrawImage() method to add an image at a specified location on a PDF page. The main steps are as follows.

  • Load the input image into the Virtual File System (VFS).
  • Create a PdfDocument object with the wasmModule.PdfDocument.Create() method.
  • Add a page to the PDF document using the PdfDocument.Pages.Add() method.
  • Load the image using the wasmModule.PdfImage.FromFile() method.
  • Specify the size of the image.
  • Draw the image at a specified location on the page using the PdfPageBase.Canvas.DrawImage() method.
  • Save the PDF document using the PdfDocument.SaveToFile() method.
  • Trigger the download of the resulting document.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {

        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {

        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []); 

  // Function to add images in PDF
  const AddPdfImage = async () => {
    if (wasmModule) {
      // Specify the input and output file paths
      const inputFileName = "JS.png";
      const outputFileName = "DrawImage.pdf";

      // Fetch the input file and add it to the VFS
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Create a pdf instance
      let pdf = wasmModule.PdfDocument.Create();

      // Add a page
      let page = pdf.Pages.Add();

      // Load the image 
      let image = wasmModule.PdfImage.FromFile(inputFileName);
    
      // Calculate the scaled width and height of the image
      let width = image.Width * 0.6;
      let height = image.Height * 0.6;
    
      // Calculate the x-coordinate to center the image horizontally on the page
      let x = (page.Canvas.ClientSize.Width - width) / 2;
    
      // Draw the image at a specified location on the page
      page.Canvas.DrawImage({image:image, x:x, y: 60, width: width, height: height});

      // Save the result file
      pdf.SaveToFile({fileName: outputFileName});

      // Clean up resources
      pdf.Close();

      // Read the generated PDF file
      const modifiedFileArray = wasmModule.FS.readFile(outputFileName);

      // Create a Blob object from the PDF file
      const modifiedFile = new Blob([modifiedFileArray], { type: "application/pdf" });

      // Create a URL for the Blob
      const url = URL.createObjectURL(modifiedFile);

      // Create an anchor element to trigger the download
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);  
      a.click(); 
      document.body.removeChild(a); 
      URL.revokeObjectURL(url); 
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Add Images in PDF with JavaScript in React</h1>
      <button onClick={AddPdfImage} disabled={!wasmModule}>
        Process
      </button>
    </div>
  );
}

export default App;

Run the code to launch the React app at localhost:3000. Once it's running, click the "Process" button to insert the image into the PDF.

Below is the result file:

Insert a picture to a specified location on a PDF page
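
The scaling and centering arithmetic used in AddPdfImage can be isolated into a small pure function, which makes it easier to reuse and to check without loading the WASM module. The sketch below is only an illustration: the function name centerScaledImage and the sample dimensions are hypothetical, and it assumes the image size and the canvas width are expressed in the same unit.

```javascript
// Compute the drawn size of an image and the x-offset that centers it
// horizontally on a page. All inputs must use the same unit.
function centerScaledImage(imageWidth, imageHeight, pageWidth, scale) {
  const width = imageWidth * scale;
  const height = imageHeight * scale;
  // Split the leftover horizontal space evenly between both sides
  const x = (pageWidth - width) / 2;
  return { x, width, height };
}

// Hypothetical numbers: a 500 x 300 image on a 595-unit-wide canvas at 60 %
const placement = centerScaledImage(500, 300, 595, 0.6);
console.log(placement);
```

The returned x, width, and height correspond to the values passed to DrawImage() in the example above. If the scaled image is wider than the canvas, x becomes negative, so a smaller scale factor may be needed in that case.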

Replace an Image in a PDF Document in JavaScript

To replace an image in a PDF, load a new image and swap it for the existing one using the PdfImageHelper.ReplaceImage() method. The main steps are as follows.

  • Load the input file and image into the Virtual File System (VFS).
  • Create a PdfDocument object with the wasmModule.PdfDocument.Create() method.
  • Load a PDF document using the PdfDocument.LoadFromFile() method.
  • Get a specific page through the PdfDocument.Pages.get_Item() method.
  • Create a PdfImageHelper object with the wasmModule.PdfImageHelper.Create() method.
  • Get the image information on the page using the PdfImageHelper.GetImagesInfo() method.
  • Load the input image using the wasmModule.PdfImage.FromFile() method.
  • Replace an existing image in the page with the new image using the PdfImageHelper.ReplaceImage() method.
  • Save the PDF document using the PdfDocument.SaveToFile() method.
  • Trigger the download of the resulting document.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {

        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {

        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []); 

  // Function to replace an image in PDF
  const ReplacePdfImage = async () => {
    if (wasmModule) {
      // Specify the input and output file paths
      const inputFileName = "DrawImage.pdf";
      const inputImageName = "coding1.jpg";
      const outputFileName = "ReplaceImage.pdf";

      // Fetch the input file and image and add them to the VFS
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS(inputImageName, '', `${process.env.PUBLIC_URL}/`);

      // Create a pdf instance
      let pdf = wasmModule.PdfDocument.Create();
      
      // Load the PDF file
      pdf.LoadFromFile({fileName: inputFileName});

      // Get the first page
      let page = pdf.Pages.get_Item(0);

      // Create a PdfImageHelper instance 
      let helper = wasmModule.PdfImageHelper.Create();

      // Get the image information from the page
      let images = helper.GetImagesInfo(page);

      // Load a new image
      let newImage = wasmModule.PdfImage.FromFile(inputImageName);

      // Replace the first image on the page with the loaded image
      helper.ReplaceImage(images[0], newImage);

      // Save the result file
      pdf.SaveToFile({fileName: outputFileName});

      // Clean up resources
      pdf.Close();

      // Read the generated PDF file
      const modifiedFileArray = wasmModule.FS.readFile(outputFileName);

      // Create a Blob object from the PDF file
      const modifiedFile = new Blob([modifiedFileArray], { type: "application/pdf" });

      // Create a URL for the Blob
      const url = URL.createObjectURL(modifiedFile);

      // Create an anchor element to trigger the download
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);  
      a.click(); 
      document.body.removeChild(a); 
      URL.revokeObjectURL(url); 
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Replace an Image in PDF with JavaScript in React</h1>
      <button onClick={ReplacePdfImage} disabled={!wasmModule}>
        Process
      </button>
    </div>
  );
}

export default App;

Replace a specified existing image with a new image in PDF
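
The example indexes images[0] directly, which only works when the page contains at least one image. A small guard such as the one sketched below could report a clearer error when the page has no images or the index is out of range. The helper name getImageInfoAt is hypothetical, and the sketch assumes that the value returned by GetImagesInfo() is array-like, that is, indexable and exposing a length property.

```javascript
// Return the image-info entry at `index`, or throw a descriptive error.
// `images` is assumed to be array-like (indexable, with a `length`).
function getImageInfoAt(images, index = 0) {
  const count = images ? images.length : 0;
  if (!Number.isInteger(index) || index < 0 || index >= count) {
    throw new RangeError(
      `No image at index ${index}; the page has ${count} image(s).`
    );
  }
  return images[index];
}

// Intended usage inside ReplacePdfImage (not executed here):
//   helper.ReplaceImage(getImageInfoAt(images, 0), newImage);
```

Wrapping the call site in try/catch then lets the component show a message instead of failing inside the library call.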

Remove an Image from a PDF Document in JavaScript

The PdfImageHelper class also provides the DeleteImage() method to remove a specific image from a PDF page. The main steps are as follows.

  • Load the input file into the Virtual File System (VFS).
  • Create a PdfDocument object with the wasmModule.PdfDocument.Create() method.
  • Load a PDF document using the PdfDocument.LoadFromFile() method.
  • Get a specific page using the PdfDocument.Pages.get_Item() method.
  • Create a PdfImageHelper object with the wasmModule.PdfImageHelper.Create() method.
  • Get the image information on the page using the PdfImageHelper.GetImagesInfo() method.
  • Delete a specified image on the page using the PdfImageHelper.DeleteImage() method.
  • Save the PDF document using the PdfDocument.SaveToFile() method.
  • Trigger the download of the resulting document.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {

        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {

        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []); 

  // Function to remove images in PDF
  const DeletePdfImage = async () => {
    if (wasmModule) {
      // Specify the input and output file paths
      const inputFileName = "DrawImage.pdf";
      const outputFileName = "DeleteImage.pdf";

      // Fetch the input file and add it to the VFS
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Create a pdf instance
      let pdf = wasmModule.PdfDocument.Create();
      
      // Load the PDF file
      pdf.LoadFromFile({fileName: inputFileName});

      // Get the first page
      let page = pdf.Pages.get_Item(0);

      // Create a PdfImageHelper instance 
      let helper = wasmModule.PdfImageHelper.Create();

      // Get the image information from the page
      let images = helper.GetImagesInfo(page);

      // Delete the first image on the page
      helper.DeleteImage({imageInfo: images[0]});

      // Save the result file
      pdf.SaveToFile({fileName: outputFileName});

      // Clean up resources
      pdf.Close();

      // Read the generated PDF file
      const modifiedFileArray = wasmModule.FS.readFile(outputFileName);

      // Create a Blob object from the PDF file
      const modifiedFile = new Blob([modifiedFileArray], { type: "application/pdf" });

      // Create a URL for the Blob
      const url = URL.createObjectURL(modifiedFile);

      // Create an anchor element to trigger the download
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);  
      a.click(); 
      document.body.removeChild(a); 
      URL.revokeObjectURL(url); 
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Remove an Image from PDF with JavaScript in React</h1>
      <button onClick={DeletePdfImage} disabled={!wasmModule}>
        Process
      </button>
    </div>
  );
}

export default App;
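
The example above removes only the first image on the page. To remove every image, the same DeleteImage() call could be applied in a loop, as in the following sketch. The function name deleteAllImages is hypothetical, and the sketch rests on two unverified assumptions: that the result of GetImagesInfo() is array-like with a length property, and that the remaining entries stay valid after earlier deletions.

```javascript
// Delete every image described by `images` using the given helper.
// Iterates from the last entry to the first so that removing one image
// cannot shift the position of entries that are still to be processed.
function deleteAllImages(helper, images) {
  let deleted = 0;
  for (let i = images.length - 1; i >= 0; i--) {
    helper.DeleteImage({ imageInfo: images[i] });
    deleted++;
  }
  return deleted;
}

// Intended usage inside DeletePdfImage (not executed here):
//   const removed = deleteAllImages(helper, helper.GetImagesInfo(page));
```

The return value is the number of DeleteImage() calls made, which can be logged or shown to the user.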

Get a Free License

To fully experience the capabilities of Spire.PDF for JavaScript without any evaluation limitations, you can request a free 30-day trial license.

When working with Word documents, batch extraction of hyperlinks has significant practical applications. Manually extracting URLs from technical documents or product manuals is not only inefficient but also prone to omissions and errors. To address this, this article presents an automated solution using C# to accurately extract hyperlink anchor text, corresponding URLs, and screen tips by parsing document elements. The extracted hyperlink data can support data analysis, SEO optimization, and other applications. The following sections demonstrate how to use Spire.Doc for .NET to extract hyperlinks from a Word document with C# code in .NET programs.

Install Spire.Doc for .NET

To begin with, you need to add the DLL files included in the Spire.Doc for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.Doc

Extracting All Hyperlinks from a Word Document Using C#

In a Word document, hyperlinks are stored as fields. To extract them, the first step is to identify all field objects by checking whether each document object is an instance of the Field class. Then, by checking whether the field object's Type property equals FieldType.FieldHyperlink, we can extract all hyperlink fields.

Once the hyperlinks are identified, we can use the Field.FieldText property to retrieve the hyperlink anchor text and the Field.GetFieldCode() method to obtain the full field code in the following format:

Hyperlink Type            | Field Code Example
Standard Hyperlink        | HYPERLINK "https://www.example.com/example"
Hyperlink with ScreenTip  | HYPERLINK "https://www.example.com/example" \o "ScreenTip"

By parsing the field code, we can extract both the hyperlink URL and the screen tip text, enabling complete retrieval of hyperlink information.
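
The parsing logic itself is independent of the programming language. As an illustration only, the sketch below expresses it in JavaScript with regular expressions instead of positional splitting; the function name parseHyperlinkFieldCode is hypothetical and is not part of Spire.Doc. Matching the \o switch by name rather than by position is intended to keep the screen tip correct when other switches, such as \l for a bookmark, appear before it.

```javascript
// Extract the URL and the optional screen tip from a HYPERLINK field code.
function parseHyperlinkFieldCode(fieldCode) {
  // The first quoted string after the HYPERLINK keyword is the URL
  const urlMatch = /HYPERLINK\s+"([^"]*)"/i.exec(fieldCode);
  // The quoted string that follows the \o switch is the screen tip
  const tipMatch = /\\o\s+"([^"]*)"/.exec(fieldCode);
  return {
    url: urlMatch ? urlMatch[1] : null,
    screenTip: tipMatch ? tipMatch[1] : null,
  };
}

const plain = parseHyperlinkFieldCode('HYPERLINK "https://www.example.com/example"');
const withTip = parseHyperlinkFieldCode(
  'HYPERLINK "https://www.example.com/example" \\o "ScreenTip"'
);
console.log(plain.url, plain.screenTip);
console.log(withTip.url, withTip.screenTip);
```

The same two patterns can be used in C# through the Regex class if the positional approach proves too fragile for a given set of documents.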

  • Create a Document object and use the Document.LoadFromFile() method to load the target Word document.
  • Iterate through all sections in the document using foreach (Section section in doc.Sections) to retrieve each section object.
  • For each section, iterate through its child objects using foreach (DocumentObject secObj in section.Body.ChildObjects) to access individual elements.
  • If a child object is of type Paragraph:
    • Iterate through the child objects within the paragraph using foreach (DocumentObject paraObj in paragraph.ChildObjects).
    • If a paragraph child object is of type Field and its Field.Type property value equals FieldType.FieldHyperlink, process the Field object.
  • For each Field object:
    • Extract the anchor text using the Field.FieldText property.
    • Retrieve the field code string using the Field.GetFieldCode() method.
  • Process the field code string:
    • Extract the URL enclosed in quotation marks after "HYPERLINK".
    • Check if the field code contains the \o parameter; if present, extract the screen tip text enclosed in double quotes.
  • Store the extracted hyperlinks and write them to an output file.
  • C#
using System.Collections.Generic;
using System.IO;
using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;

namespace ExtractWordHyperlink
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of Document
            Document doc = new Document();
            // Load a Word document
            doc.LoadFromFile("Sample.docx");

            // Create a string list to store the hyperlink information
            List<string> hyperlinkInfoList = new List<string>();

            // Iterate through the sections in the document
            foreach (Section section in doc.Sections)
            {
                // Iterate through the child objects in the section
                foreach (DocumentObject secObj in section.Body.ChildObjects)
                {
                    // Check if the current document object is a Paragraph instance
                    if (secObj is Paragraph paragraph)
                    {
                        // Iterate through the child objects in the paragraph
                        foreach (DocumentObject paraObj in paragraph.ChildObjects)
                        {
                            // Check if the current child object is a field
                            if (paraObj is Field field && field.Type == FieldType.FieldHyperlink)
                            {
                                string hyperlinkInfo = "";
                                // Get the anchor text
                                string anchorText = field.FieldText;

                                // Get the field code
                                string fieldCode = field.GetFieldCode();
                                // Get the URL from the field code
                                string url = fieldCode.Split('"')[1];
                                // Check if there is a ScreenTip
                                if (fieldCode.Contains("\\o"))
                                {
                                    // Get the ScreenTip text
                                    string screenTip = fieldCode.Split('"')[3].Trim();
                                    // Consolidate the information
                                    hyperlinkInfo += $"Anchor Text: {anchorText}\nURL: {url}\nScreenTip: {screenTip}";
                                }
                                else
                                {
                                    hyperlinkInfo += $"Anchor Text: {anchorText}\nURL: {url}";
                                }
                                hyperlinkInfo += "\n";
                                // Append the hyperlink information to the list
                                hyperlinkInfoList.Add(hyperlinkInfo);

                            }
                        }
                    }
                }
            }

            // Make sure the output folder exists, then write the extracted hyperlink information to a text file
            Directory.CreateDirectory("output");
            File.WriteAllLines("output/ExtractedHyperlinks.txt", hyperlinkInfoList);

            doc.Close();
        }
    }
}

Hyperlinks Extracted from Word Documents Using C#

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents or lift the functional limitations, please request a 30-day trial license.

When printing a PDF, ensuring the content appears as intended is crucial. Depending on your needs, you may want to print the document at the actual size to maintain the original dimensions or scale it to fit the entire page for a better presentation.

C# Print PDF in Actual Size or Fit to Page

To accommodate different printing needs, Spire.PDF for .NET provides flexible printing options that allow developers to control the output easily. This article will demonstrate how to print a PDF either at the actual size or fit to page in C# using the Spire.PDF for .NET library.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.PDF

Print a PDF to Fit the Page Size in C#

When printing a PDF to fit the page, the content is automatically scaled to match the dimensions of the paper. This ensures that the document fits within the printed area, regardless of its original size.

To fit the content to the page, you can use the PdfDocument.PrintSettings.SelectSinglePageLayout(PdfSinglePageScalingMode pageScalingMode, bool autoPortraitOrLandscape) method. The detailed steps are as follows.

  • Create an instance of the PdfDocument class.
  • Load the PDF file using the PdfDocument.LoadFromFile() method.
  • Configure print settings to scale the PDF to fit the page size for printing using the PdfDocument.PrintSettings.SelectSinglePageLayout(PdfSinglePageScalingMode pageScalingMode, bool autoPortraitOrLandscape) method.
  • Call the PdfDocument.Print() method to print the PDF file.
  • C#
using Spire.Pdf;
using Spire.Pdf.Print;

namespace PrintPdfToFitPageSize
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of the PdfDocument class
            PdfDocument pdf = new PdfDocument();
            // Load the specified PDF file into the PdfDocument object
            pdf.LoadFromFile("Sample.pdf");

            // Configure print settings to scale the PDF to fit the page size for printing
            pdf.PrintSettings.SelectSinglePageLayout(PdfSinglePageScalingMode.FitSize, false);
            // Execute the print command to print the loaded PDF document
            pdf.Print();
            

        }
    }
}
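
Conceptually, fitting content to a page means choosing one uniform scale factor so that both the width and the height fit on the paper without distortion. The JavaScript sketch below illustrates that idea only; it is not Spire.PDF's internal implementation, and the function name fitToPage and the sample sizes are hypothetical.

```javascript
// Compute a uniform scale factor that fits content onto a sheet of paper.
// All sizes must use the same unit (for example, points).
function fitToPage(contentWidth, contentHeight, paperWidth, paperHeight) {
  // The more restrictive dimension decides the scale
  const scale = Math.min(paperWidth / contentWidth, paperHeight / contentHeight);
  return {
    scale,
    width: contentWidth * scale,
    height: contentHeight * scale,
  };
}

// Hypothetical case: a 612 x 792 page printed on 595 x 842 paper
const fitted = fitToPage(612, 792, 595, 842);
console.log(fitted.scale.toFixed(4));
```

In this case the width is the limiting dimension, so the content is shrunk slightly and some vertical space on the paper stays unused.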

Print a PDF at the Actual Size in C#

When printing a PDF at the actual size, the original document dimensions are preserved without scaling. This ensures that the printed output matches the PDF's defined measurements.

To print a PDF at its actual size, you can also use the PdfDocument.PrintSettings.SelectSinglePageLayout(PdfSinglePageScalingMode pageScalingMode, bool autoPortraitOrLandscape) method. The detailed steps are as follows.

  • Create an instance of the PdfDocument class.
  • Load the PDF file using the PdfDocument.LoadFromFile() method.
  • Configure print settings to print the PDF at its actual size without scaling using the PdfDocument.PrintSettings.SelectSinglePageLayout(PdfSinglePageScalingMode pageScalingMode, bool autoPortraitOrLandscape) method.
  • Call the PdfDocument.Print() method to print the PDF file.
  • C#
using Spire.Pdf;
using Spire.Pdf.Print;
using System.Drawing.Printing;

namespace PrintPdfAtActualSize
{
    internal class Program
    {
        static void Main(string[] args)
        {         
            // Create a new instance of the PdfDocument class
            PdfDocument pdf = new PdfDocument();
            // Load the PDF file into the PdfDocument object
            pdf.LoadFromFile("Sample.pdf");

            // Set paper margins as 0
            pdf.PrintSettings.SetPaperMargins(0, 0, 0, 0);

            // Configure print settings to print the PDF at its actual size without scaling
            pdf.PrintSettings.SelectSinglePageLayout(PdfSinglePageScalingMode.ActualSize, false);
            // Execute the print command to print the loaded PDF document
            pdf.Print();
        }
    }
}

Print a PDF at the Actual Size on Custom-Sized Paper

In some cases, you may need to print a PDF at its actual size on a specific size of paper. Spire.PDF allows you to define a custom paper size using the PaperSize class and then you can assign it to the print settings of the document using the PdfDocument.PrintSettings.PaperSize property. The detailed steps are as follows.

  • Create an instance of the PdfDocument class.
  • Load the PDF file using the PdfDocument.LoadFromFile() method.
  • Define a custom paper size for printing using the PaperSize class.
  • Assign the custom paper size to the print settings of the file using the PdfDocument.PrintSettings.PaperSize property.
  • Configure print settings to print the PDF at its actual size without scaling using the PdfDocument.PrintSettings.SelectSinglePageLayout(PdfSinglePageScalingMode pageScalingMode, bool autoPortraitOrLandscape) method.
  • Call the PdfDocument.Print() method to print the PDF file.
  • C#
using Spire.Pdf;
using Spire.Pdf.Print;
using System.Drawing.Printing;

namespace PrintPdfOnCustomSizedPaper
{
    internal class Program
    {
        static void Main(string[] args)
        {            
            // Create a new instance of the PdfDocument class
            PdfDocument pdf = new PdfDocument();
            // Load the specified PDF file into the PdfDocument object
            pdf.LoadFromFile("Sample.pdf");

            // Alternatively, use a standard A3 paper size instead of a custom one:
            // PaperSize paperSize = new PaperSize
            // {
            //     RawKind = (int)PaperKind.A3
            // };

            // Define a custom paper size for printing
            // (PaperSize.Width and PaperSize.Height are expressed in hundredths of an inch)
            PaperSize paperSize = new PaperSize
            {
                // Set the width of the paper: 2.83 inches * 100
                Width = 283,
                // Set the height of the paper: 8.26 inches * 100
                Height = 826,
                // Set paper size to custom
                RawKind = (int)PaperKind.Custom
            };

            // Assign the custom paper size to the print settings of the PdfDocument
            pdf.PrintSettings.PaperSize = paperSize;

            // Set paper margins as 0
            pdf.PrintSettings.SetPaperMargins(0, 0, 0, 0);

            // Set print settings to print the PDF at its actual size without scaling
            pdf.PrintSettings.SelectSinglePageLayout(PdfSinglePageScalingMode.ActualSize, false);
            // Execute the print command to print the loaded PDF document
            pdf.Print();
        }
    }
}
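
The Width and Height properties of the .NET PaperSize class are expressed in hundredths of an inch, so paper dimensions given in inches or millimeters need to be converted before they are assigned. The conversion is simple arithmetic; the JavaScript sketch below illustrates it with hypothetical helper names.

```javascript
const MM_PER_INCH = 25.4;

// Convert inches to hundredths of an inch, rounded to a whole number
function inchesToHundredths(inches) {
  return Math.round(inches * 100);
}

// Convert millimeters to hundredths of an inch, rounded to a whole number
function mmToHundredths(mm) {
  return Math.round((mm / MM_PER_INCH) * 100);
}

// A4 paper measures 210 mm x 297 mm
console.log(mmToHundredths(210), mmToHundredths(297)); // 827 1169
// US Letter paper measures 8.5 in x 11 in
console.log(inchesToHundredths(8.5), inchesToHundredths(11)); // 850 1100
```

The resulting whole numbers are the kind of values that the Width and Height properties of a custom PaperSize expect.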

