Conversion

Converting PDF to HTML is important for improving accessibility and interactivity in web environments. While PDFs are widely used for their reliable layout and ease of sharing, they can be restrictive when it comes to online use. HTML provides greater flexibility, allowing content to be displayed more effectively on websites and mobile devices. By converting a PDF document into HTML, developers can enhance search engine visibility, enable easier editing, and create more user-friendly experiences. In this article, we will demonstrate how to convert PDF to HTML in React with JavaScript and the Spire.PDF for JavaScript library.

Install Spire.PDF for JavaScript

To get started with converting PDF to HTML with JavaScript in a React application, you can either download Spire.PDF for JavaScript from our website or install it via npm with the following command:

npm i spire.pdf

After that, copy the "Spire.Pdf.Base.js" and "Spire.Pdf.Base.wasm" files to the public folder of your project. Additionally, include the required font files to ensure accurate and consistent text rendering.

For more details, refer to the documentation: How to Integrate Spire.PDF for JavaScript in a React Project
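Before wiring up a conversion, it can help to confirm that the copied files are actually being served from the public folder. The following is a minimal sketch and not part of Spire.PDF: findMissingAssets is a hypothetical helper, and the file list and base URL are assumptions to adapt to your project. The fetch function is passed in as a parameter so the helper can also be tried outside a browser.

```javascript
// Hypothetical helper (not part of Spire.PDF): reports which files under the
// public folder cannot be fetched.
async function findMissingAssets(baseUrl, fileNames, fetchFn = fetch) {
  const missing = [];
  for (const name of fileNames) {
    try {
      // HEAD avoids downloading the .wasm file just to check that it exists
      const response = await fetchFn(`${baseUrl}/${name}`, { method: 'HEAD' });
      if (!response.ok) {
        missing.push(name);
      }
    } catch (err) {
      // A network error is treated the same way as a missing file
      missing.push(name);
    }
  }
  return missing;
}
```

In a React component this could be called as findMissingAssets(process.env.PUBLIC_URL, ['Spire.Pdf.Base.js', 'Spire.Pdf.Base.wasm', 'ARIAL.TTF']). Some development servers answer unknown paths with a fallback page, so an empty result is a hint rather than a guarantee.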

Convert PDF to HTML in React

The PdfDocument.SaveToFile() method offered by Spire.PDF for JavaScript allows developers to effortlessly convert a PDF file into HTML format. The detailed steps are as follows.

  • Load the required font file and the input PDF file into the Virtual File System (VFS).
  • Create a PdfDocument object with the wasmModule.PdfDocument.Create() method.
  • Load the PDF file using the PdfDocument.LoadFromFile() method.
  • Save the PDF file to HTML format using the PdfDocument.SaveToFile() method.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {

        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {

        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []); 

  // Function to convert PDF to HTML
  const ConvertPdfToHTML = async () => {
    if (wasmModule) {

      // Load the necessary font file into the virtual file system (VFS)
      await wasmModule.FetchFileToVFS('ARIAL.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Load the input PDF file into the VFS
      let inputFileName = 'Input.pdf';
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Create a new document
      const doc = wasmModule.PdfDocument.Create();

      // Load the PDF file
      doc.LoadFromFile(inputFileName); 

      // Define the output file name
      const outputFileName = 'PdfToHtml.html';

      // Save the document to an HTML file
      doc.SaveToFile({ fileName: outputFileName, fileFormat: wasmModule.FileFormat.HTML });

      // Clean up resources
      doc.Close();       
      doc.Dispose();

      // Read the saved file and convert it to a Blob object
      const modifiedFileArray = wasmModule.FS.readFile(outputFileName);
      const modifiedFile = new Blob([modifiedFileArray], { type: 'text/html' });
      
      // Create a URL for the Blob and initiate the download
      const url = URL.createObjectURL(modifiedFile);
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);
      a.click(); 
      document.body.removeChild(a); 
      URL.revokeObjectURL(url); 
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert PDF to HTML in React Using JavaScript</h1>
      <button onClick={ConvertPdfToHTML} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;

Run the code to launch the React app at localhost:3000. Once it's running, click on the "Convert" button to convert the PDF file to HTML format:

React APP for Converting PDF to HTML

Here is the screenshot of the input PDF file and the converted HTML file:

Convert PDF to HTML with JavaScript in React
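The conversion steps in the component above can also be kept in a plain function, which makes them easier to reuse and to exercise with a stub. This is a minimal sketch under stated assumptions: convertPdfToHtmlBytes is a hypothetical name, spire stands for the object the component stores in wasmModule, and every call made on it is taken from the example above. The try/finally block is a design choice of this sketch so that the document is released even when loading or saving throws.

```javascript
// Sketch: the conversion steps as a plain function.
// `spire` is the object the component keeps in its wasmModule state.
async function convertPdfToHtmlBytes(spire, baseUrl, inputFileName, outputFileName) {
  // Load the font and the input PDF into the virtual file system (VFS)
  await spire.FetchFileToVFS('ARIAL.TTF', '/Library/Fonts/', `${baseUrl}/`);
  await spire.FetchFileToVFS(inputFileName, '', `${baseUrl}/`);

  const doc = spire.PdfDocument.Create();
  try {
    doc.LoadFromFile(inputFileName);
    doc.SaveToFile({ fileName: outputFileName, fileFormat: spire.FileFormat.HTML });
  } finally {
    // Release the document even if loading or saving throws
    doc.Close();
    doc.Dispose();
  }

  // Return the bytes of the generated HTML file from the VFS
  return spire.FS.readFile(outputFileName);
}
```

The click handler would then only need to call this function and pass the returned bytes to the download code.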

Customize PDF to HTML Conversion Settings in React

Developers can use the PdfDocument.ConvertOptions.SetPdfToHtmlOptions() method to customize settings during the PDF to HTML conversion process. For instance, they can choose whether to embed SVG or images in the resulting HTML and set the maximum number of pages included in each HTML file. The detailed steps are as follows.

  • Load the required font file and the input PDF file into the Virtual File System (VFS).
  • Create a PdfDocument object with the wasmModule.PdfDocument.Create() method.
  • Load the PDF file using the PdfDocument.LoadFromFile() method.
  • Customize the PDF to HTML conversion settings using the PdfDocument.ConvertOptions.SetPdfToHtmlOptions() method.
  • Save the PDF document to HTML format using the PdfDocument.SaveToFile() method.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {

        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {

        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []); 

  // Function to convert PDF to HTML
  const ConvertPdfToHTML = async () => {
    if (wasmModule) {

      // Load the necessary font file into the virtual file system (VFS)
      await wasmModule.FetchFileToVFS('ARIAL.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Load the input PDF file into the VFS
      let inputFileName = 'Input.pdf';
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Create a new document
      const doc = wasmModule.PdfDocument.Create();

      // Load the PDF file
      doc.LoadFromFile(inputFileName); 

      // Customize the conversion settings
      // Parameters: useEmbeddedSvg: false, useEmbeddedImg: true, maxPageOneFile: 1
      doc.ConvertOptions.SetPdfToHtmlOptions(false, true, 1);

      // Define the output file name
      const outputFileName = 'CustomizePdfToHtmlConversion.html';

      // Save the document to an HTML file
      doc.SaveToFile({ fileName: outputFileName, fileFormat: wasmModule.FileFormat.HTML });

      // Clean up resources
      doc.Close();       
      doc.Dispose();

      // Read the saved file and convert it to a Blob object
      const modifiedFileArray = wasmModule.FS.readFile(outputFileName);
      const modifiedFile = new Blob([modifiedFileArray], { type: 'text/html' });
      
      // Create a URL for the Blob and initiate the download
      const url = URL.createObjectURL(modifiedFile);
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);
      a.click(); 
      document.body.removeChild(a); 
      URL.revokeObjectURL(url); 
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert PDF to HTML in React Using JavaScript</h1>
      <button onClick={ConvertPdfToHTML} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;
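Because SetPdfToHtmlOptions() takes three positional arguments, a call such as (false, true, 1) is hard to read at a glance. The sketch below wraps it in a function that accepts named settings. applyHtmlOptions is a hypothetical helper, the names come from the comment in the example above, and the fallback values simply repeat the values used in that example rather than any documented library defaults. The range check on maxPageOneFile is likewise an assumption of this sketch.

```javascript
// Hypothetical wrapper: gives names to the three positional arguments of
// SetPdfToHtmlOptions (names taken from the comment in the example above).
function applyHtmlOptions(doc, options = {}) {
  const {
    useEmbeddedSvg = false, // value used in the example, not a library default
    useEmbeddedImg = true,  // value used in the example, not a library default
    maxPageOneFile = 1,     // value used in the example, not a library default
  } = options;

  // Assumed sanity check: a page count per file should be a positive integer
  if (!Number.isInteger(maxPageOneFile) || maxPageOneFile < 1) {
    throw new RangeError('maxPageOneFile must be a positive integer');
  }

  doc.ConvertOptions.SetPdfToHtmlOptions(useEmbeddedSvg, useEmbeddedImg, maxPageOneFile);
}
```

A call such as applyHtmlOptions(doc, { maxPageOneFile: 5 }) then states its intent without relying on argument order.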

Convert PDF to HTML Stream in React

Spire.PDF for JavaScript also supports converting a PDF to an HTML stream using the PdfDocument.SaveToStream() method. The detailed steps are as follows.

  • Load the required font file and the input PDF file into the Virtual File System (VFS).
  • Create a PdfDocument object with the wasmModule.PdfDocument.Create() method.
  • Load the PDF file using the PdfDocument.LoadFromFile() method.
  • Create a memory stream using the wasmModule.Stream.CreateByFile() method.
  • Save the PDF document as an HTML stream using the PdfDocument.SaveToStream() method.
  • Write the content of the stream to an HTML file using the wasmModule.FS.writeFile() method.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {

        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {

        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []); 

  // Function to convert PDF to HTML
  const ConvertPdfToHTML = async () => {
    if (wasmModule) {

      // Load the necessary font file into the virtual file system (VFS)
      await wasmModule.FetchFileToVFS('ARIAL.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Load the input PDF file into the VFS
      let inputFileName = 'Input.pdf';
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Create a new document
      const doc = wasmModule.PdfDocument.Create();

      // Load the PDF file
      doc.LoadFromFile(inputFileName); 

      // Define the output file name
      const outputFileName = 'PdfToHtmlStream.html';

      // Create a new memory stream
      let ms = wasmModule.Stream.CreateByFile(outputFileName);

      // Save the PDF document to an HTML stream
      doc.SaveToStream({stream: ms, fileformat: wasmModule.FileFormat.HTML});

      // Write the content of the memory stream to an HTML file
      wasmModule.FS.writeFile(outputFileName, ms.ToArray());

      // Clean up resources
      doc.Close();       
      doc.Dispose();

      // Read the saved file and convert it to a Blob object
      const modifiedFileArray = wasmModule.FS.readFile(outputFileName);
      const modifiedFile = new Blob([modifiedFileArray], { type: 'text/html' });
      
      // Create a URL for the Blob and initiate the download
      const url = URL.createObjectURL(modifiedFile);
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);
      a.click(); 
      document.body.removeChild(a); 
      URL.revokeObjectURL(url); 
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert PDF to HTML in React Using JavaScript</h1>
      <button onClick={ConvertPdfToHTML} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;
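One reason to work with a stream rather than a file is to use the HTML directly in the page instead of downloading it. The sketch below decodes the converted bytes into a string. It assumes that the bytes returned by ms.ToArray() form a typed array and that the generated HTML is UTF-8 encoded; neither point is confirmed here, so verify both against your own output. htmlBytesToString is a hypothetical helper name.

```javascript
// Decodes the bytes of a converted HTML document into a string.
// Assumption: the output is UTF-8 encoded.
function htmlBytesToString(bytes) {
  return new TextDecoder('utf-8').decode(bytes);
}
```

The resulting string could, for example, be stored in component state and shown in an iframe through its srcDoc attribute.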

Get a Free License

To fully experience the capabilities of Spire.PDF for JavaScript without any evaluation limitations, you can request a free 30-day trial license.

Converting PDF files to Word documents is essential for modern web applications focused on document management and editing. Using JavaScript and React, developers can easily integrate this functionality with libraries like Spire.PDF for JavaScript. This guide will walk you through implementing a PDF-to-Word conversion feature in a React application, showing how to load files, configure settings, and enable users to download their converted documents effortlessly.

Install Spire.PDF for JavaScript

To get started with converting PDF to Word with JavaScript in a React application, you can either download Spire.PDF for JavaScript from our website or install it via npm with the following command:

npm i spire.pdf

After that, copy the "Spire.Pdf.Base.js" and "Spire.Pdf.Base.wasm" files to the public folder of your project. Additionally, include the required font files to ensure accurate and consistent text rendering.

For more details, refer to the documentation: How to Integrate Spire.PDF for JavaScript in a React Project

Convert PDF to Word Using PdfToDocConverter Class

The PdfToDocConverter class from Spire.PDF for JavaScript facilitates the conversion of PDF files to Word documents. It includes the DocxOptions property, allowing developers to customize conversion settings, including document properties. The conversion is performed using the SaveToDocx() method.

Steps to convert PDF to Word using the PdfToDocConverter class in React:

  • Load the necessary font files and input PDF file into the virtual file system (VFS).
  • Instantiate a PdfToDocConverter object using the wasmModule.PdfToDocConverter.Create() method, passing the PDF file path.
  • Customize the generated Word file's properties using the DocxOptions property.
  • Use the SaveToDocx() method to convert the PDF document.
  • Trigger the download of the resulting Word file.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {

        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {

        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []); 

  // Function to convert PDF to Word
  const ConvertPdfToWord = async () => {
    if (wasmModule) {

      // Load the necessary font files into the virtual file system (VFS)
      await wasmModule.FetchFileToVFS('GOTHIC.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('GOTHICB.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('GOTHICBI.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('GOTHICI.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Load the input PDF file into the VFS
      let inputFileName = 'input.pdf';
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Create a PdfToDocConverter object
      let converter = wasmModule.PdfToDocConverter.Create({filePath: inputFileName});

      // Set document properties of the generated Word file
      converter.DocxOptions.Subject = "Convert PDF to Word";
      converter.DocxOptions.Authors = "E-ICEBLUE";
 
      // Define the output file name
      const outputFileName = "ToWord.docx";

      // Convert PDF as a Docx file
      converter.SaveToDocx({fileName: outputFileName});
    
      // Read the generated Word file
      const modifiedFileArray = wasmModule.FS.readFile(outputFileName);

      // Create a Blob object from the Word file
      const modifiedFile = new Blob([modifiedFileArray], { type: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document' });

      // Create a URL for the Blob
      const url = URL.createObjectURL(modifiedFile);

      // Create an anchor element to trigger the download
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);  
      a.click(); 
      document.body.removeChild(a); 
      URL.revokeObjectURL(url); 
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert PDF to Word in React</h1>
      <button onClick={ConvertPdfToWord} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;

Run the code to launch the React app at localhost:3000. Click "Convert," and a "Save As" window will appear, prompting you to save the output file in your chosen folder.

Launch react app to convert pdf to word

Below is a screenshot showing the input PDF file and the output Word file:

Convert PDF to Word in React
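Each component in this article repeats the same loading logic: it reads Module and spirepdf from the window object and waits for Module.onRuntimeInitialized. The sketch below wraps that callback in a Promise so the waiting logic lives in one place. waitForSpire and its timeout are hypothetical additions, not part of Spire.PDF. Like the original code, it assumes the runtime has not yet finished initializing when the callback is assigned; if it has, the timeout is what ends the wait.

```javascript
// Hypothetical helper: wraps the onRuntimeInitialized callback in a Promise.
// `win` stands in for the browser's window object.
function waitForSpire(win, timeoutMs = 30000) {
  return new Promise((resolve, reject) => {
    const { Module, spirepdf } = win;
    if (!Module || !spirepdf) {
      reject(new Error('Spire.Pdf.Base.js has not defined Module and spirepdf'));
      return;
    }

    // Give up if the runtime never reports that it is ready
    const timer = setTimeout(
      () => reject(new Error('WASM runtime did not initialize in time')),
      timeoutMs
    );

    Module.onRuntimeInitialized = () => {
      clearTimeout(timer);
      resolve(spirepdf);
    };
  });
}
```

Inside the effect, the script's onload handler could then be written as waitForSpire(window).then(setWasmModule).catch(console.error).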

Convert PDF to Word Using PdfDocument Class

To convert PDF to Word, you can also use the PdfDocument class. This class allows developers to load an existing PDF document, make modifications, and save it as a Word file. This feature is particularly useful for users who need to edit or enhance their PDFs before conversion.

Steps to convert PDF to Word using the PdfDocument class in React:

  • Load the necessary font files and input PDF file into the virtual file system (VFS).
  • Create a PdfDocument object using the wasmModule.PdfDocument.Create() method.
  • Load the PDF document using the PdfDocument.LoadFromFile() method.
  • Convert the PDF document to a Word file using the PdfDocument.SaveToFile() method.
  • Trigger the download of the resulting Word file.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {

        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {

        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []); 

  // Function to convert PDF to Word
  const ConvertPdfToWord = async () => {
    if (wasmModule) {

      // Load the necessary font files into the virtual file system (VFS)
      await wasmModule.FetchFileToVFS('GOTHIC.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('GOTHICB.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('GOTHICBI.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('GOTHICI.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Load the input PDF file into the VFS
      let inputFileName = 'input.pdf';
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Create a PdfDocument object
      let doc = wasmModule.PdfDocument.Create();
      
      // Load the PDF file
      doc.LoadFromFile(inputFileName);
 
      // Define the output file name
      const outputFileName = "ToWord.docx";

      // Convert PDF as a Docx file
      doc.SaveToFile({ fileName: outputFileName, fileFormat: wasmModule.FileFormat.DOCX });
    
      // Read the generated Word file
      const modifiedFileArray = wasmModule.FS.readFile(outputFileName);

      // Create a Blob object from the Word file
      const modifiedFile = new Blob([modifiedFileArray], { type: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document' });

      // Create a URL for the Blob
      const url = URL.createObjectURL(modifiedFile);

      // Create an anchor element to trigger the download
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);  
      a.click(); 
      document.body.removeChild(a); 
      URL.revokeObjectURL(url); 

      // Cleanup resources
      doc.Dispose();
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert PDF to Word in React</h1>
      <button onClick={ConvertPdfToWord} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;
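When a PDF uses several font files, the repeated FetchFileToVFS() calls can be folded into a loop. This is a minimal sketch: loadFonts is a hypothetical helper, and it relies only on the FetchFileToVFS(fileName, vfsDirectory, baseUrl) call shape used in the examples above.

```javascript
// Hypothetical helper: fetches several font files into /Library/Fonts/ in the VFS.
async function loadFonts(spire, fontFiles, baseUrl) {
  for (const file of fontFiles) {
    // Sequential on purpose: mirrors the awaited calls in the examples above
    await spire.FetchFileToVFS(file, '/Library/Fonts/', `${baseUrl}/`);
  }
}
```

With this helper, the four font lines in the handler become a single call: await loadFonts(wasmModule, ['GOTHIC.TTF', 'GOTHICB.TTF', 'GOTHICBI.TTF', 'GOTHICI.TTF'], process.env.PUBLIC_URL).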

In data-driven workflows, converting PDF documents with tables to Excel improves accessibility and usability. While PDFs preserve document integrity, their static nature makes data extraction challenging, often leading to error-prone manual work. By leveraging JavaScript in React, developers can automate the conversion process, seamlessly transferring structured data like financial reports into Excel worksheets for real-time analysis and collaboration. This article explores how to use Spire.PDF for JavaScript to efficiently convert PDFs to Excel files with JavaScript in React applications.

Install Spire.PDF for JavaScript

To get started with converting PDF to Excel with JavaScript in a React application, you can either download Spire.PDF for JavaScript from our website or install it via npm with the following command:

npm i spire.pdf

After that, copy the "Spire.Pdf.Base.js" and "Spire.Pdf.Base.wasm" files to the public folder of your project. Additionally, make sure to include the required font files to ensure accurate and consistent text rendering.

For more details, refer to the documentation: How to Integrate Spire.PDF for JavaScript in a React Project

Steps to Convert PDF to Excel Using JavaScript

With the Spire.PDF for JavaScript WebAssembly module, PDF documents can be loaded from the Virtual File System (VFS) using the PdfDocument.LoadFromFile() method and converted into Excel workbooks using the PdfDocument.SaveToFile() method.

In addition to direct conversion, developers can customize the process by configuring conversion options through the XlsxLineLayoutOptions and XlsxTextLayoutOptions classes, along with the PdfDocument.ConvertOptions.SetPdfToXlsxOptions() method.

The following steps demonstrate how to convert a PDF document to an Excel file using Spire.PDF for JavaScript:

  • Load the Spire.Pdf.Base.js file to initialize the WebAssembly module.
  • Fetch the PDF file into the Virtual File System (VFS) using the wasmModule.FetchFileToVFS() method.
  • Fetch the font files used in the PDF document to the “/Library/Fonts/” folder in the VFS using the wasmModule.FetchFileToVFS() method.
  • Create an instance of the PdfDocument class using the wasmModule.PdfDocument.Create() method.
  • Load the PDF document from the VFS into the PdfDocument instance using the PdfDocument.LoadFromFile() method.
  • (Optional) Customize the conversion options:
    • Create an instance of the XlsxLineLayoutOptions or XlsxTextLayoutOptions class and specify the desired conversion settings.
    • Apply the conversion options using the PdfDocument.ConvertOptions.SetPdfToXlsxOptions() method.
  • Convert the PDF document to an Excel file using the PdfDocument.SaveToFile({ fileName: string, fileFormat: wasmModule.FileFormat.XLSX }) method.
  • Retrieve the converted file from the VFS for download or further use.
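The steps above, including the optional one, can be sketched as a single function. This is an illustration under stated assumptions rather than a definitive implementation: convertPdfToXlsxBytes and its parameter object are hypothetical, spire stands for the initialized module, and the Spire.PDF calls are the ones named in the steps. The conversion options are applied only when a lineLayout object is supplied.

```javascript
// Sketch of the listed steps. `lineLayout` is optional; when it is omitted,
// no conversion options are set and the default settings apply.
async function convertPdfToXlsxBytes(spire, params) {
  const { baseUrl, inputFileName, outputFileName, fontFiles = [], lineLayout } = params;

  // Fetch the PDF and its fonts into the virtual file system (VFS)
  await spire.FetchFileToVFS(inputFileName, '', `${baseUrl}/`);
  for (const font of fontFiles) {
    await spire.FetchFileToVFS(font, '/Library/Fonts/', `${baseUrl}/`);
  }

  // Create the document and load the PDF from the VFS
  const pdf = spire.PdfDocument.Create();
  pdf.LoadFromFile(inputFileName);

  // Optional step: customize the conversion
  if (lineLayout) {
    const options = spire.XlsxLineLayoutOptions.Create(lineLayout);
    pdf.ConvertOptions.SetPdfToXlsxOptions(options);
  }

  // Convert, then retrieve the result from the VFS
  pdf.SaveToFile({ fileName: outputFileName, fileFormat: spire.FileFormat.XLSX });
  return spire.FS.readFile(outputFileName);
}
```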

Simple PDF to Excel Conversion in JavaScript

Developers can directly load a PDF document from the VFS and convert it to an Excel file using the default conversion settings. These settings map one PDF page to one Excel worksheet, preserve rotated and overlapped text, allow cell splitting, and enable text wrapping.

Below is a code example demonstrating this process:

  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to store the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {
        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {
        // Log any errors that occur during module loading
        console.error('Failed to load the WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []);

  // Function to convert PDF to Excel
  const ConvertPDFToExcel = async () => {
    if (wasmModule) {
      // Specify the input and output file names
      const inputFileName = 'Sample.pdf';
      const outputFileName = 'PDFToExcel.xlsx';

      // Fetch the input file and add it to the VFS
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Fetch the font file used in the PDF to the VFS
      await wasmModule.FetchFileToVFS('Calibri.ttf', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('Symbol.ttf', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Create an instance of the PdfDocument class
      const pdf = wasmModule.PdfDocument.Create();

      // Load the PDF document from the VFS
      pdf.LoadFromFile(inputFileName);

      // Convert the PDF document to an Excel file
      pdf.SaveToFile({ fileName: outputFileName, fileFormat: wasmModule.FileFormat.XLSX});

      // Read the Excel file from the VFS
      const excelArray = wasmModule.FS.readFile(outputFileName);

      // Create a Blob object from the Excel file and trigger a download
      const blob = new Blob([excelArray], { type: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' });
      const url = URL.createObjectURL(blob);
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);
      a.click();
      document.body.removeChild(a);
      URL.revokeObjectURL(url);
    }
  };

  return (
      <div style={{ textAlign: 'center', height: '300px' }}>
        <h1>Convert PDF to Excel Using JavaScript in React</h1>
        <button onClick={ConvertPDFToExcel} disabled={!wasmModule}>
          Convert and Download
        </button>
      </div>
  );
}

export default App;

Convert PDF to Excel Without Configuring Options Using JavaScript
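The Blob in the download step is created with a MIME type that matches the output format, so a handler that supports more than one format needs to pick the right one. The sketch below looks the type up from the output file name. mimeTypeFor is a hypothetical helper; the three entries are the standard media types for HTML, DOCX, and XLSX files, and the generic fallback is a choice of this sketch.

```javascript
// Media types for the output formats covered in this article
const MIME_TYPES = new Map([
  ['html', 'text/html'],
  ['docx', 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'],
  ['xlsx', 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'],
]);

// Hypothetical helper: picks a Blob type from the file extension
function mimeTypeFor(fileName) {
  const dot = fileName.lastIndexOf('.');
  const extension = dot === -1 ? '' : fileName.slice(dot + 1).toLowerCase();
  // Fall back to a generic binary type for unknown extensions
  return MIME_TYPES.get(extension) || 'application/octet-stream';
}
```

The download code can then use new Blob([bytes], { type: mimeTypeFor(outputFileName) }) regardless of the target format.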

Convert PDF to Excel with XlsxLineLayoutOptions

Spire.PDF for JavaScript provides the XlsxLineLayoutOptions class for configuring line-based conversion settings when converting PDFs to Excel. By adjusting these options, developers can achieve different conversion results, such as merging all PDF pages into a single worksheet.

The table below outlines the available parameters in XlsxLineLayoutOptions:

Parameter (bool)         Function
convertToMultipleSheet   Specifies whether to convert each page into a separate worksheet.
rotatedText              Specifies whether to retain rotated text.
splitCell                Specifies whether to split cells.
wrapText                 Specifies whether to wrap text within cells.
overlapText              Specifies whether to retain overlapped text.

Special attention should be given to the splitCell parameter, as it significantly impacts the way tables are converted. Setting it to false preserves table cell structures, making the output table cells more faithful to the original PDF. Conversely, setting it to true allows plain text to be split smoothly in cells, which may be useful for text-based layouts rather than structured tables.

Below is a code example demonstrating PDF-to-Excel conversion using XlsxLineLayoutOptions:

  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to store the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {
        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {
        // Log any errors that occur during module loading
        console.error('Failed to load the WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []);

  // Function to convert PDF to Excel with XlsxLineLayoutOptions
  const ConvertPDFToExcelXlsxLineLayoutOptions = async () => {
    if (wasmModule) {
      // Specify the input and output file names
      const inputFileName = 'Sample.pdf';
      const outputFileName = 'PDFToExcelXlsxLineLayoutOptions.xlsx';

      // Fetch the input file and add it to the VFS
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Fetch the font file used in the PDF to the VFS
      await wasmModule.FetchFileToVFS('Calibri.ttf', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('Symbol.ttf', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Create an instance of the PdfDocument class
      const pdf = wasmModule.PdfDocument.Create();

      // Load the PDF document from the VFS
      pdf.LoadFromFile(inputFileName);

      // Create an instance of the XlsxLineLayoutOptions class and specify the conversion options
      const options = wasmModule.XlsxLineLayoutOptions.Create({ convertToMultipleSheet: true, rotatedText: false, splitCell: false, wrapText: false, overlapText: true});

      // Set the XlsxLineLayoutOptions instance as the conversion options
      pdf.ConvertOptions.SetPdfToXlsxOptions(options);

      // Convert the PDF document to an Excel file
      pdf.SaveToFile({ fileName: outputFileName, fileFormat: wasmModule.FileFormat.XLSX});

      // Read the Excel file from the VFS (FS.readFile is synchronous, so no await is needed)
      const excelArray = wasmModule.FS.readFile(outputFileName);

      // Create a Blob object from the Excel file and trigger a download
      const blob = new Blob([excelArray], { type: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' });
      const url = URL.createObjectURL(blob);
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);
      a.click();
      document.body.removeChild(a);
      URL.revokeObjectURL(url);
    }
  };

  return (
      <div style={{ textAlign: 'center', height: '300px' }}>
        <h1>Convert PDF to Excel with XlsxLineLayoutOptions Using JavaScript in React</h1>
        <button onClick={ConvertPDFToExcelXlsxLineLayoutOptions} disabled={!wasmModule}>
          Convert and Download
        </button>
      </div>
  );
}

export default App;

Convert PDF to Excel with XlsxLineLayoutOptions in React

Convert PDF to Excel Using XlsxTextLayoutOptions

Developers can also customize conversion settings using the XlsxTextLayoutOptions class, which focuses on text-based layout formatting. The table below lists its parameters:

Parameter (bool)          Function
convertToMultipleSheet    Specifies whether to convert each page into a separate worksheet.
rotatedText               Specifies whether to retain rotated text.
overlapText               Specifies whether to retain overlapped text.
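
Because XlsxTextLayoutOptions is listed here with fewer parameters than XlsxLineLayoutOptions, settings written for one class may need trimming before they are reused with the other. The following is a minimal sketch of that idea. The helper toTextLayoutOptions is hypothetical, and how the library itself treats extra keys is not covered by this article.

```javascript
// Parameter names listed in this article for XlsxTextLayoutOptions
const TEXT_LAYOUT_PARAMETERS = ['convertToMultipleSheet', 'rotatedText', 'overlapText'];

// Hypothetical helper: keeps only the boolean flags named above
function toTextLayoutOptions(source) {
  const picked = {};
  for (const name of TEXT_LAYOUT_PARAMETERS) {
    if (typeof source[name] === 'boolean') {
      picked[name] = source[name];
    }
  }
  return picked;
}

// Settings originally written for XlsxLineLayoutOptions
const lineLayoutSettings = {
  convertToMultipleSheet: false,
  rotatedText: true,
  splitCell: false,
  wrapText: false,
  overlapText: true,
};

// Keeps convertToMultipleSheet, rotatedText and overlapText;
// drops splitCell and wrapText
const textLayoutSettings = toTextLayoutOptions(lineLayoutSettings);
console.log(Object.keys(textLayoutSettings));
```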

Below is a code example demonstrating PDF-to-Excel conversion using XlsxTextLayoutOptions:

  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to store the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {
        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {
        // Log any errors that occur during module loading
        console.error('Failed to load the WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []);

  // Function to convert PDF to Excel with XlsxTextLayoutOptions
  const ConvertPDFToExcelXlsxTextLayoutOptions = async () => {
    if (wasmModule) {
      // Specify the input and output file names
      const inputFileName = 'Sample.pdf';
      const outputFileName = 'PDFToExcelXlsxTextLayoutOptions.xlsx';

      // Fetch the input file and add it to the VFS
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Fetch the font files used by the PDF into the VFS
      await wasmModule.FetchFileToVFS('Calibri.ttf', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('Symbol.ttf', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Create an instance of the PdfDocument class
      const pdf = wasmModule.PdfDocument.Create();

      // Load the PDF document from the VFS
      pdf.LoadFromFile(inputFileName);

      // Create an instance of the XlsxTextLayoutOptions class and specify the conversion options
      const options = wasmModule.XlsxTextLayoutOptions.Create({ convertToMultipleSheet: false, rotatedText: true, overlapText: true});

      // Set the XlsxTextLayoutOptions instance as the conversion options
      pdf.ConvertOptions.SetPdfToXlsxOptions(options);

      // Convert the PDF document to an Excel file
      pdf.SaveToFile({ fileName: outputFileName, fileFormat: wasmModule.FileFormat.XLSX});

      // Read the Excel file from the VFS (FS.readFile is synchronous, so no await is needed)
      const excelArray = wasmModule.FS.readFile(outputFileName);

      // Create a Blob object from the Excel file and trigger a download
      const blob = new Blob([excelArray], { type: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' });
      const url = URL.createObjectURL(blob);
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);
      a.click();
      document.body.removeChild(a);
      URL.revokeObjectURL(url);
    }
  };

  return (
      <div style={{ textAlign: 'center', height: '300px' }}>
        <h1>Convert PDF to Excel with XlsxTextLayoutOptions Using JavaScript in React</h1>
        <button onClick={ConvertPDFToExcelXlsxTextLayoutOptions} disabled={!wasmModule}>
          Convert and Download
        </button>
      </div>
  );
}

export default App;

Convert PDF to Excel with XlsxTextLayoutOptions Using JavaScript

Get a Free License

To fully experience the capabilities of Spire.PDF for JavaScript without any evaluation limitations, you can request a free 30-day trial license.

Transforming PDF documents into image formats like JPG or PNG is a powerful way to enhance the accessibility and usability of your content. By converting PDF pages into images, you preserve the original layout and design, making it ideal for various applications, from online sharing to incorporation in websites and presentations.

In this article, you will learn how to convert PDF files to images in React using Spire.PDF for JavaScript. We will guide you through the process step-by-step, ensuring you can easily generate high-quality images from your PDF documents.

Install Spire.PDF for JavaScript

To get started with converting PDF to images with JavaScript in a React application, you can either download Spire.PDF for JavaScript from our website or install it via npm with the following command:

npm i spire.pdf

After that, copy the "Spire.Pdf.Base.js" and "Spire.Pdf.Base.wasm" files to the public folder of your project. Additionally, include the required font files to ensure accurate and consistent text rendering.

For more details, refer to the documentation: How to Integrate Spire.PDF for JavaScript in a React Project

Convert PDF to JPG in React

Spire.PDF for JavaScript provides the PdfDocument.SaveAsImage() method to convert a specific page of a PDF into image byte data, which can then be saved as a JPG file using the Save() method. To convert all pages into individual images, iterate through each page.

The following are the steps to convert PDF to JPG in React:

  • Load the required font files and the input PDF file into the Virtual File System (VFS).
  • Create a PdfDocument object with the wasmModule.PdfDocument.Create() method.
  • Load the PDF using the PdfDocument.LoadFromFile() method.
  • Iterate through the document's pages:
    • Convert each page into image byte data using the PdfDocument.SaveAsImage() method.
    • Save the image as a JPG file using the Save() method.
    • Trigger the download of the generated JPG file.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {

        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {

        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []); 

  // Function to convert PDF to JPG
  const ConvertPdfToJpg = async () => {
    if (wasmModule) {

      // Load the necessary font files into the virtual file system (VFS)
      await wasmModule.FetchFileToVFS('GOTHIC.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('GOTHICB.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('GOTHICBI.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('GOTHICI.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Load the input PDF file into the VFS
      let inputFileName = 'input.pdf';
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Create a new document
      const doc = wasmModule.PdfDocument.Create();

      // Load the PDF file
      doc.LoadFromFile(inputFileName); 

      // Iterate through the pages in the document
      for (let i = 0; i < doc.Pages.Count; i++) {

        // Specify the output file name
        const outputFileName = `ToImage-img-${i}.jpg`;

        // Save the specific page to image data
        const imageData = doc.SaveAsImage({pageIndex: i});

        // Save the image data as a JPG file
        imageData.Save(outputFileName);

        // Read the generated JPG file
        const modifiedFileArray = wasmModule.FS.readFile(outputFileName);

        // Create a Blob object from the JPG file
        const modifiedFile = new Blob([modifiedFileArray], { type: 'image/jpeg' });

        // Create a URL for the Blob
        const url = URL.createObjectURL(modifiedFile);

        // Create an anchor element to trigger the download
        const a = document.createElement('a');
        a.href = url;
        a.download = outputFileName;
        document.body.appendChild(a);
        a.click();
        document.body.removeChild(a);
        URL.revokeObjectURL(url);
      }
 
      // Clean up resources
      doc.Dispose();
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert PDF to JPG in React</h1>
      <button onClick={ConvertPdfToJpg} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;

Run the code to launch the React app at localhost:3000. Click "Convert," and a "Save As" window will appear, prompting you to save the output file in your chosen folder.

React app to convert PDF to JPG

Here is a screenshot of the generated JPG files:

Convert PDF to JPG in React
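
In the example above, doc.Dispose() is only reached if every page converts without throwing. One possible refinement, sketched below, is to separate rendering from downloading and release the document in a finally block. The helper renderPagesToFiles and the stand-in objects are hypothetical. Only Pages.Count, SaveAsImage(), Save(), FS.readFile() and Dispose() are taken from the article's code.

```javascript
// Hypothetical helper; `doc` and `fs` are expected to behave like the
// PdfDocument instance and wasmModule.FS used in the example above.
function renderPagesToFiles(doc, fs, extension) {
  const files = [];
  try {
    for (let i = 0; i < doc.Pages.Count; i++) {
      const outputFileName = `ToImage-img-${i}.${extension}`;
      // Same calls as in the article: render one page, then write it to the VFS
      const imageData = doc.SaveAsImage({ pageIndex: i });
      imageData.Save(outputFileName);
      files.push({ name: outputFileName, bytes: fs.readFile(outputFileName) });
    }
  } finally {
    // Runs even if a page fails to render, so the document is always released
    doc.Dispose();
  }
  return files;
}

// Stand-in objects so the sketch can run without the WASM module
function createFakeDocument(pageCount, store) {
  return {
    disposed: false,
    Pages: { Count: pageCount },
    SaveAsImage({ pageIndex }) {
      return { Save: (name) => store.set(name, new Uint8Array([pageIndex])) };
    },
    Dispose() {
      this.disposed = true;
    },
  };
}

const fakeStore = new Map();
const fakeFs = { readFile: (name) => fakeStore.get(name) };
const fakeDoc = createFakeDocument(3, fakeStore);

const rendered = renderPagesToFiles(fakeDoc, fakeFs, 'jpg');
console.log(rendered.map((file) => file.name));
```

In the React component, the returned list could then be turned into Blob downloads one file at a time, exactly as the loop in the example does.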

Convert PDF to PNG in React

To convert a PDF document into individual PNG files, iterate through its pages and use the PdfDocument.SaveAsImage() method to generate image byte data for each page. Then save that data as PNG files.

The following are the steps to convert PDF to PNG in React:

  • Load the required font files and the input PDF file into the Virtual File System (VFS).
  • Create a PdfDocument object with the wasmModule.PdfDocument.Create() method.
  • Load the PDF using the PdfDocument.LoadFromFile() method.
  • Iterate through the document's pages:
    • Convert each page into image byte data using the PdfDocument.SaveAsImage() method.
    • Save the image as a PNG file using the Save() method.
    • Trigger the download of the generated PNG file.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {

        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {

        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []); 

  // Function to convert PDF to PNG
  const ConvertPdfToPng = async () => {
    if (wasmModule) {

      // Load the necessary font files into the virtual file system (VFS)
      await wasmModule.FetchFileToVFS('GOTHIC.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('GOTHICB.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('GOTHICBI.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('GOTHICI.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Load the input PDF file into the VFS
      let inputFileName = 'input.pdf';
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Create a new document
      const doc = wasmModule.PdfDocument.Create();

      // Load the PDF file
      doc.LoadFromFile(inputFileName); 

      // Optional: uncomment the next line to make the background of the generated images transparent
      // doc.ConvertOptions.SetPdfToImageOptions(0);

      // Iterate through the pages in the document
      for (let i = 0; i < doc.Pages.Count; i++) {

        // Specify the output file name
        const outputFileName = `ToImage-img-${i}.png`;

        // Save the specific page to image data
        const imageData = doc.SaveAsImage({pageIndex: i});

        // Save the image data as a PNG file
        imageData.Save(outputFileName);

        // Read the generated PNG file
        const modifiedFileArray = wasmModule.FS.readFile(outputFileName);

        // Create a Blob object from the PNG file
        const modifiedFile = new Blob([modifiedFileArray], { type: 'image/png' });

        // Create a URL for the Blob
        const url = URL.createObjectURL(modifiedFile);

        // Create an anchor element to trigger the download
        const a = document.createElement('a');
        a.href = url;
        a.download = outputFileName;
        document.body.appendChild(a);
        a.click();
        document.body.removeChild(a);
        URL.revokeObjectURL(url);
      }
 
      // Clean up resources
      doc.Dispose();
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert PDF to PNG in React</h1>
      <button onClick={ConvertPdfToPng} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;

Convert PDF to PNG in React
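
The JPG and PNG examples differ only in the output file extension and the type passed to the Blob constructor. One way to keep the two in sync is to derive the type from the file name, as in the sketch below. The helper mimeTypeFor is hypothetical. The MIME strings are the ones used in this article's examples.

```javascript
// MIME types used for the Blob objects in this article's examples
const MIME_TYPES = new Map([
  ['jpg', 'image/jpeg'],
  ['png', 'image/png'],
  ['svg', 'image/svg+xml'],
  ['xlsx', 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'],
]);

// Hypothetical helper: looks up the Blob type from a file name's extension
function mimeTypeFor(fileName) {
  const dot = fileName.lastIndexOf('.');
  const extension = dot === -1 ? '' : fileName.slice(dot + 1).toLowerCase();
  const type = MIME_TYPES.get(extension);
  if (type === undefined) {
    throw new Error(`No MIME type registered for "${fileName}"`);
  }
  return type;
}

console.log(mimeTypeFor('ToImage-img-0.png'));
console.log(mimeTypeFor('ToImage-img-0.jpg'));
```

With such a helper, the Blob in the loop could be created as new Blob([modifiedFileArray], { type: mimeTypeFor(outputFileName) }), so changing the extension is enough to switch formats.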

Convert PDF to SVG in React

To convert each page of a PDF document into an individual SVG file, call the PdfDocument.SaveToFile() method once per page, setting both startIndex and endIndex to that page's index. The detailed steps are as follows:

  • Load the required font files and the input PDF file into the Virtual File System (VFS).
  • Create a PdfDocument object with the wasmModule.PdfDocument.Create() method.
  • Load the PDF using the PdfDocument.LoadFromFile() method.
  • Iterate through the pages:
    • Convert each page into an SVG file using the PdfDocument.SaveToFile() method.
    • Trigger the download of the generated SVG file.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {

        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {

        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []); 

  // Function to convert PDF to SVG
  const ConvertPdfToSvg = async () => {
    if (wasmModule) {

       // Load the necessary font files into the virtual file system (VFS)
       await wasmModule.FetchFileToVFS('GOTHIC.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
       await wasmModule.FetchFileToVFS('GOTHICB.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
       await wasmModule.FetchFileToVFS('GOTHICBI.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
       await wasmModule.FetchFileToVFS('GOTHICI.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Load the input PDF file into the VFS
      let inputFileName = 'input.pdf';
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Create a new document
      const doc = wasmModule.PdfDocument.Create();

      // Load the PDF file
      doc.LoadFromFile(inputFileName); 

      // Iterate through the pages in the document
      for (let i = 0; i < doc.Pages.Count; i++) { 
        
        // Specify the output file name
        let outputFileName = `ToSVG_${i}.svg`;  

        // Save a specific page to SVG
        doc.SaveToFile({fileName: outputFileName, startIndex:i, endIndex:i, fileFormat: wasmModule.FileFormat.SVG});

        // Read the generated SVG file
        const modifiedFileArray = wasmModule.FS.readFile(outputFileName);

        // Create a Blob object from the SVG file
        const modifiedFile = new Blob([modifiedFileArray], { type:'image/svg+xml' });

        // Create a URL for the Blob
        const url = URL.createObjectURL(modifiedFile);

        // Create an anchor element to trigger the download
        const a = document.createElement('a');
        a.href = url;
        a.download = outputFileName;
        document.body.appendChild(a);
        a.click(); 
        document.body.removeChild(a); 
        URL.revokeObjectURL(url); 
      }
 
      // Clean up resources
      doc.Dispose();
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert PDF to SVG in React</h1>
      <button onClick={ConvertPdfToSvg} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;

Convert PDF to SVG in React
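
The per-page SaveToFile() call relies on startIndex and endIndex both pointing at the same zero-based page, as the loop in the example shows. The sketch below builds those argument objects up front so they can be inspected or logged before any conversion runs. The helper buildSvgSaveRequests is hypothetical, and the string 'SVG' merely stands in for wasmModule.FileFormat.SVG so the snippet can run on its own.

```javascript
// Hypothetical helper: builds one SaveToFile() argument object per page
function buildSvgSaveRequests(pageCount, fileFormat) {
  const requests = [];
  for (let i = 0; i < pageCount; i++) {
    requests.push({
      fileName: `ToSVG_${i}.svg`,
      startIndex: i, // first page to export, zero-based as in the example's loop
      endIndex: i,   // same as startIndex, so only this page is exported
      fileFormat,
    });
  }
  return requests;
}

// 'SVG' is a placeholder for wasmModule.FileFormat.SVG
const svgRequests = buildSvgSaveRequests(3, 'SVG');
console.log(svgRequests.map((request) => request.fileName));
```

Inside the component, each entry could be passed straight to doc.SaveToFile(request) before reading the resulting file from the VFS.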

Get a Free License

To fully experience the capabilities of Spire.PDF for JavaScript without any evaluation limitations, you can request a free 30-day trial license.
