Converting PDF to HTML is important for improving accessibility and interactivity in web environments. While PDFs are widely used for their reliable layout and ease of sharing, they can be restrictive when it comes to online use. HTML provides greater flexibility, allowing content to be displayed more effectively on websites and mobile devices. By converting a PDF document into HTML, developers can enhance search engine visibility, enable easier editing, and create more user-friendly experiences. In this article, we will demonstrate how to convert PDF to HTML in React with JavaScript and the Spire.PDF for JavaScript library.
- Convert PDF to HTML in React
- Customize PDF to HTML Conversion Settings in React
- Convert PDF to HTML Stream in React
Install Spire.PDF for JavaScript
To get started with converting PDF to HTML with JavaScript in a React application, you can either download Spire.PDF for JavaScript from our website or install it via npm with the following command:
npm i spire.pdf
After that, copy the "Spire.Pdf.Base.js" and "Spire.Pdf.Base.wasm" files to the public folder of your project. Additionally, include the required font files to ensure accurate and consistent text rendering.
For more details, refer to the documentation: How to Integrate Spire.PDF for JavaScript in a React Project
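The copy step above can be scripted so that it is repeatable after each package update. The following is a minimal Node.js sketch, not part of Spire.PDF: `copyRuntimeFiles` and `RUNTIME_FILES` are hypothetical names, and the source directory is deliberately left as an argument because the location of the two files inside the download or the installed package is not stated here.

```javascript
// copy-spire-runtime.js: illustrative helper, run with Node.js
const fs = require('node:fs');
const path = require('node:path');

// The two runtime files that must be served from the public folder
const RUNTIME_FILES = ['Spire.Pdf.Base.js', 'Spire.Pdf.Base.wasm'];

// Copies each runtime file from sourceDir to publicDir and returns the
// destination paths. Throws if a file is missing so a broken setup fails early.
function copyRuntimeFiles(sourceDir, publicDir, files = RUNTIME_FILES) {
  fs.mkdirSync(publicDir, { recursive: true });
  return files.map((name) => {
    const from = path.join(sourceDir, name);
    const to = path.join(publicDir, name);
    if (!fs.existsSync(from)) {
      throw new Error(`Missing runtime file: ${from}`);
    }
    fs.copyFileSync(from, to);
    return to;
  });
}

// Example (the first argument is wherever you found the files):
// copyRuntimeFiles('<folder containing the Spire.Pdf.Base files>', 'public');
```

Font files such as ARIAL.TTF can be copied the same way by passing a different file list as the third argument.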
Convert PDF to HTML in React
The PdfDocument.SaveToFile() method in Spire.PDF for JavaScript converts a PDF file to HTML when it is called with an output file name and the FileFormat.HTML format. The detailed steps are as follows.
- Load the required font file and the input PDF file into the Virtual File System (VFS).
- Create a PdfDocument object with the wasmModule.PdfDocument.Create() method.
- Load the PDF file using the PdfDocument.LoadFromFile() method.
- Save the PDF file to HTML format using the PdfDocument.SaveToFile() method.
import React, { useState, useEffect } from 'react';

function App() {
  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {
        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;
        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {
        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;
    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []);

  // Function to convert PDF to HTML
  const ConvertPdfToHTML = async () => {
    if (wasmModule) {
      // Load the necessary font file into the virtual file system (VFS)
      await wasmModule.FetchFileToVFS('ARIAL.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      // Load the input PDF file into the VFS
      let inputFileName = 'Input.pdf';
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);
      // Create a new document
      const doc = wasmModule.PdfDocument.Create();
      // Load the PDF file
      doc.LoadFromFile(inputFileName);
      // Define the output file name
      const outputFileName = 'PdfToHtml.html';
      // Save the document to an HTML file
      doc.SaveToFile({fileName: outputFileName, fileFormat: wasmModule.FileFormat.HTML});
      // Clean up resources
      doc.Close();
      doc.Dispose();
      // Read the saved file and convert it to a Blob object
      const modifiedFileArray = wasmModule.FS.readFile(outputFileName);
      const modifiedFile = new Blob([modifiedFileArray], { type: 'text/html' });
      // Create a URL for the Blob and initiate the download
      const url = URL.createObjectURL(modifiedFile);
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);
      a.click();
      document.body.removeChild(a);
      URL.revokeObjectURL(url);
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert PDF to HTML in React Using JavaScript</h1>
      <button onClick={ConvertPdfToHTML} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;
Run the code to launch the React app at localhost:3000. Once it is running, click the "Convert" button to convert the PDF file to HTML. The browser then downloads the result as PdfToHtml.html.

Here is the screenshot of the input PDF file and the converted HTML file:

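If loading or saving throws, the Close() and Dispose() calls in the example above are skipped. One way to guard against that is to move the conversion into a small helper that releases the document in a finally block. The sketch below is an illustration rather than part of the library: `convertPdfToHtmlBytes` is a hypothetical name, it uses only the calls shown in the example, and it assumes the font and the input PDF have already been fetched into the VFS.

```javascript
// Converts a PDF that is already in the VFS to HTML and returns the bytes
// of the generated file. `spire` is the object the component keeps in its
// `wasmModule` state.
function convertPdfToHtmlBytes(spire, inputFileName, outputFileName) {
  const doc = spire.PdfDocument.Create();
  try {
    doc.LoadFromFile(inputFileName);
    doc.SaveToFile({ fileName: outputFileName, fileFormat: spire.FileFormat.HTML });
  } finally {
    // Runs even when loading or saving throws, so the document is always released
    doc.Close();
    doc.Dispose();
  }
  // Read the generated HTML back from the virtual file system
  return spire.FS.readFile(outputFileName);
}
```

Inside the click handler, the returned bytes can be wrapped in a Blob and downloaded exactly as in the example above.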
Customize PDF to HTML Conversion Settings in React
Developers can use the PdfDocument.ConvertOptions.SetPdfToHtmlOptions() method to customize the PDF to HTML conversion. Its arguments control whether SVG is embedded in the resulting HTML, whether images are embedded, and the maximum number of pages included in each HTML file. The detailed steps are as follows.
- Load the required font file and the input PDF file into the Virtual File System (VFS).
- Create a PdfDocument object with the wasmModule.PdfDocument.Create() method.
- Load the PDF file using the PdfDocument.LoadFromFile() method.
- Customize the PDF to HTML conversion settings using the PdfDocument.ConvertOptions.SetPdfToHtmlOptions() method.
- Save the PDF document to HTML format using the PdfDocument.SaveToFile() method.
import React, { useState, useEffect } from 'react';

function App() {
  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {
        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;
        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {
        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;
    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []);

  // Function to convert PDF to HTML
  const ConvertPdfToHTML = async () => {
    if (wasmModule) {
      // Load the necessary font file into the virtual file system (VFS)
      await wasmModule.FetchFileToVFS('ARIAL.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      // Load the input PDF file into the VFS
      let inputFileName = 'Input.pdf';
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);
      // Create a new document
      const doc = wasmModule.PdfDocument.Create();
      // Load the PDF file
      doc.LoadFromFile(inputFileName);
      // Customize the conversion settings
      // Parameters: useEmbeddedSvg: false, useEmbeddedImg: true, maxPageOneFile: 1
      doc.ConvertOptions.SetPdfToHtmlOptions(false, true, 1);
      // Define the output file name
      const outputFileName = 'CustomizePdfToHtmlConversion.html';
      // Save the document to an HTML file
      doc.SaveToFile({fileName: outputFileName, fileFormat: wasmModule.FileFormat.HTML});
      // Clean up resources
      doc.Close();
      doc.Dispose();
      // Read the saved file and convert it to a Blob object
      const modifiedFileArray = wasmModule.FS.readFile(outputFileName);
      const modifiedFile = new Blob([modifiedFileArray], { type: 'text/html' });
      // Create a URL for the Blob and initiate the download
      const url = URL.createObjectURL(modifiedFile);
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);
      a.click();
      document.body.removeChild(a);
      URL.revokeObjectURL(url);
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert PDF to HTML in React Using JavaScript</h1>
      <button onClick={ConvertPdfToHTML} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;
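Because SetPdfToHtmlOptions() takes three positional arguments, a call such as SetPdfToHtmlOptions(false, true, 1) is hard to read at a glance. A thin wrapper with named options can make the intent explicit. This is only a sketch: `applyPdfToHtmlOptions` and `EXAMPLE_HTML_OPTIONS` are hypothetical names, the fallback values are simply the ones used in the example above (not necessarily the library's defaults), and the positive-integer check on `maxPageOneFile` is an assumption about what the library accepts.

```javascript
// Values taken from the example call in this article, used when an option is omitted
const EXAMPLE_HTML_OPTIONS = { useEmbeddedSvg: false, useEmbeddedImg: true, maxPageOneFile: 1 };

// Maps named options onto the positional SetPdfToHtmlOptions() call and
// returns the resolved settings so the caller can log or display them.
function applyPdfToHtmlOptions(doc, options = {}) {
  const resolved = { ...EXAMPLE_HTML_OPTIONS, ...options };
  const { useEmbeddedSvg, useEmbeddedImg, maxPageOneFile } = resolved;
  // Assumption: the page limit must be a whole number of at least 1
  if (!Number.isInteger(maxPageOneFile) || maxPageOneFile < 1) {
    throw new RangeError('maxPageOneFile must be a positive integer');
  }
  doc.ConvertOptions.SetPdfToHtmlOptions(Boolean(useEmbeddedSvg), Boolean(useEmbeddedImg), maxPageOneFile);
  return resolved;
}
```

For example, `applyPdfToHtmlOptions(doc, { maxPageOneFile: 5 })` would keep the embedding flags from the example and change only the page limit.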
Convert PDF to HTML Stream in React
Spire.PDF for JavaScript also supports converting a PDF to an HTML stream using the PdfDocument.SaveToStream() method. The detailed steps are as follows.
- Load the required font file and the input PDF file into the Virtual File System (VFS).
- Create a PdfDocument object with the wasmModule.PdfDocument.Create() method.
- Load the PDF file using the PdfDocument.LoadFromFile() method.
- Create a memory stream using the wasmModule.Stream.CreateByFile() method.
- Save the PDF document as an HTML stream using the PdfDocument.SaveToStream() method.
- Write the content of the stream to an HTML file using the wasmModule.FS.writeFile() method.
import React, { useState, useEffect } from 'react';

function App() {
  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {
        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;
        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {
        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;
    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []);

  // Function to convert PDF to HTML
  const ConvertPdfToHTML = async () => {
    if (wasmModule) {
      // Load the necessary font file into the virtual file system (VFS)
      await wasmModule.FetchFileToVFS('ARIAL.TTF', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      // Load the input PDF file into the VFS
      let inputFileName = 'Input.pdf';
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);
      // Create a new document
      const doc = wasmModule.PdfDocument.Create();
      // Load the PDF file
      doc.LoadFromFile(inputFileName);
      // Define the output file name
      const outputFileName = 'PdfToHtmlStream.html';
      // Create a new memory stream
      let ms = wasmModule.Stream.CreateByFile(outputFileName);
      // Save the PDF document to an HTML stream
      doc.SaveToStream({stream: ms, fileformat: wasmModule.FileFormat.HTML});
      // Write the content of the memory stream to an HTML file
      wasmModule.FS.writeFile(outputFileName, ms.ToArray());
      // Clean up resources
      doc.Close();
      doc.Dispose();
      // Read the saved file and convert it to a Blob object
      const modifiedFileArray = wasmModule.FS.readFile(outputFileName);
      const modifiedFile = new Blob([modifiedFileArray], { type: 'text/html' });
      // Create a URL for the Blob and initiate the download
      const url = URL.createObjectURL(modifiedFile);
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);
      a.click();
      document.body.removeChild(a);
      URL.revokeObjectURL(url);
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert PDF to HTML in React Using JavaScript</h1>
      <button onClick={ConvertPdfToHTML} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;
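When the HTML is meant to be shown in the page rather than downloaded, the bytes from the stream can be decoded into a string directly. The sketch below rests on two assumptions that are not confirmed here: that ToArray() returns the stream content as byte values (a plain array or a Uint8Array), and that the generated HTML is UTF-8 encoded. `htmlStreamToString` is a hypothetical helper name.

```javascript
// Decodes the bytes held by a stream object into an HTML string.
// `stream` is expected to expose ToArray(), as the `ms` object does in the example.
function htmlStreamToString(stream) {
  // Uint8Array.from accepts both plain arrays and typed arrays
  const bytes = Uint8Array.from(stream.ToArray());
  // Assumption: the HTML produced by the conversion is UTF-8 encoded
  return new TextDecoder('utf-8').decode(bytes);
}
```

In the component, the returned string could be kept in state and rendered with an iframe through its srcDoc attribute, for example `<iframe srcDoc={html} title="Converted PDF" />`.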
Get a Free License
To fully experience the capabilities of Spire.PDF for JavaScript without any evaluation limitations, you can request a free 30-day trial license.
