Converting PDF to HTML is important for improving accessibility and interactivity in web environments. While PDFs are widely used for their reliable layout and ease of sharing, they can be restrictive when it comes to online use. HTML provides greater flexibility, allowing content to be displayed more effectively on websites and mobile devices. By converting a PDF document into HTML, developers can enhance search engine visibility, enable easier editing, and create more user-friendly experiences. In this article, we will demonstrate how to convert PDF to HTML in React with JavaScript and the Spire.PDF for JavaScript library.
- Convert PDF to HTML in React
- Customize PDF to HTML Conversion Settings in React
- Convert PDF to HTML Stream in React
Install Spire.PDF for JavaScript
To get started with converting PDF to HTML with JavaScript in a React application, you can either download Spire.PDF for JavaScript from our website or install it via npm with the following command:
npm i spire.office
The downloaded product package integrates Spire.Doc for JavaScript, Spire.XLS for JavaScript, Spire.PDF for JavaScript, and Spire.Presentation for JavaScript. To use Spire.PDF for JavaScript, copy the corresponding files (spire.pdf.js, Spire.Pdf.Wasm.zip, spire.common.js, Spire.Common.Wasm.zip, and the _framework folder) to the public folder of your project. To ensure proper text rendering, also add font files at a path of your choice. The examples below load fonts from public/static/font and the input PDF file from public/static/data.
For more details, refer to the documentation: How to Integrate Spire.PDF for JavaScript in a React Project
Convert PDF to HTML in React
The PdfDocument.SaveToFile() method offered by Spire.PDF for JavaScript allows developers to effortlessly convert a PDF file into HTML format. The detailed steps are as follows.
- Load the required font file and the input PDF file into the Virtual File System (VFS).
- Create a PdfDocument object using the wasmModule.PdfDocument() constructor.
- Load the PDF file using the PdfDocument.LoadFromFile() method.
- Save the PDF file to HTML format using the PdfDocument.SaveToFile() method.
- JavaScript
import React, { useState, useEffect } from 'react';

function App() {
  // Holds the loaded WASM module; the button stays disabled until it is set
  const [wasmModule, setWasmModule] = useState(null);

  // Load spire.pdf.js from the public folder once, when the component mounts
  useEffect(() => {
    (async () => {
      try {
        const publicUrl = process.env.PUBLIC_URL || '';
        const spireModule = await import(/* webpackIgnore: true */ `${publicUrl}/spire.pdf.js`);
        const rawModule = spireModule.default || spireModule;
        window.wasmModule = typeof rawModule === 'function'
          ? await rawModule({ locateFile: p => p.endsWith('.wasm') ? `${publicUrl}/${p}` : p })
          : rawModule;
        setWasmModule(window.wasmModule);
      } catch (error) {
        console.error('Failed to load spire.pdf.js:', error);
      }
    })();
  }, []);

  const ConvertPdfToHTML = async () => {
    // Get WASM module
    const wasmModule = window.wasmModule.spirepdf;
    if (wasmModule) {
      // Load font file to virtual file system (VFS)
      await window.spire.FetchFileToVFS("arial.ttf", "/Library/Fonts/", `${process.env.PUBLIC_URL}/static/font/`);

      // PDF file name to convert
      let inputFileName = "ToHTML.pdf";
      // Load PDF file to virtual file system (VFS)
      await window.spire.FetchFileToVFS(inputFileName, "", `${process.env.PUBLIC_URL}/static/data/`);

      // Create a PdfDocument object
      let doc = new wasmModule.PdfDocument();
      // Load the PDF file
      doc.LoadFromFile(inputFileName);

      // Define the output file name
      const outputFileName = 'PdfToHtml.html';
      // Save the document to an HTML file
      doc.SaveToFile({ fileName: outputFileName, fileFormat: wasmModule.FileFormat.HTML });

      // Read the saved file and convert to a Blob object
      const modifiedFileArray = window.dotnetRuntime.Module.FS.readFile(outputFileName);
      const modifiedFile = new Blob([modifiedFileArray], { type: "text/html" });

      // Create a URL for the Blob
      const url = URL.createObjectURL(modifiedFile);
      // Create an anchor element to trigger the download
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);
      a.click();
      document.body.removeChild(a);
      URL.revokeObjectURL(url);
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert PDF to HTML in React Using JavaScript</h1>
      <button onClick={ConvertPdfToHTML} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;
Run the code to launch the React app at localhost:3000. Once it's running, click the "Convert" button to convert the PDF file to HTML format.

[Screenshot: the input PDF file and the converted HTML file]
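
The example passes a directory URL built from PUBLIC_URL to FetchFileToVFS(). As a minimal sketch, the helper below makes the slash handling explicit, so the result is the same whether or not PUBLIC_URL or the directory carries leading or trailing slashes. The name publicDir is hypothetical and not part of Spire.PDF for JavaScript; it also assumes the files are served from the root of the public folder.

```javascript
// Hypothetical helper (not part of Spire.PDF for JavaScript):
// build a directory URL under the app's public folder with exactly one "/"
// between the base and the directory, and a single trailing "/".
function publicDir(publicUrl, dir) {
  // Drop trailing slashes from the base; treat undefined/null as an empty base
  const base = String(publicUrl || '').replace(/\/+$/, '');
  // Drop leading and trailing slashes from the directory
  const path = String(dir || '').replace(/^\/+|\/+$/g, '');
  return path ? `${base}/${path}/` : `${base}/`;
}

// Possible usage inside the click handler (assumes the same folder layout as above):
// await window.spire.FetchFileToVFS('arial.ttf', '/Library/Fonts/',
//   publicDir(process.env.PUBLIC_URL, 'static/font'));
```

With an empty PUBLIC_URL this yields a root-relative path such as /static/font/, which matches how the loader code resolves spire.pdf.js.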

Customize PDF to HTML Conversion Settings in React
Developers can use the PdfDocument.ConvertOptions.SetPdfToHtmlOptions() method to customize settings during the PDF to HTML conversion process. For instance, they can choose whether to embed SVG (useEmbeddedSvg) or images (useEmbeddedImg) in the resulting HTML and set the maximum number of pages included in each HTML file (maxPageOneFile). The detailed steps are as follows.
- Load the required font file and the input PDF file into the Virtual File System (VFS).
- Create a PdfDocument object using the wasmModule.PdfDocument() constructor.
- Load the PDF file using the PdfDocument.LoadFromFile() method.
- Customize the PDF to HTML conversion settings using the PdfDocument.ConvertOptions.SetPdfToHtmlOptions() method.
- Save the PDF document to HTML format using the PdfDocument.SaveToFile() method.
- JavaScript
import React, { useState, useEffect } from 'react';

function App() {
  // Holds the loaded WASM module; the button stays disabled until it is set
  const [wasmModule, setWasmModule] = useState(null);

  // Load spire.pdf.js from the public folder once, when the component mounts
  useEffect(() => {
    (async () => {
      try {
        const publicUrl = process.env.PUBLIC_URL || '';
        const spireModule = await import(/* webpackIgnore: true */ `${publicUrl}/spire.pdf.js`);
        const rawModule = spireModule.default || spireModule;
        window.wasmModule = typeof rawModule === 'function'
          ? await rawModule({ locateFile: p => p.endsWith('.wasm') ? `${publicUrl}/${p}` : p })
          : rawModule;
        setWasmModule(window.wasmModule);
      } catch (error) {
        console.error('Failed to load spire.pdf.js:', error);
      }
    })();
  }, []);

  // Read a file from the VFS and trigger a browser download
  const downloadFileFromVFS = (fileName) => {
    const fileArray = window.dotnetRuntime.Module.FS.readFile(fileName);
    const fileBlob = new Blob([fileArray], { type: 'text/html' });
    const url = URL.createObjectURL(fileBlob);
    const a = document.createElement('a');
    a.href = url;
    a.download = fileName;
    document.body.appendChild(a);
    a.click();
    document.body.removeChild(a);
    URL.revokeObjectURL(url);
  };

  const ConvertPdfToHTML = async () => {
    // Get WASM module
    const wasmModule = window.wasmModule.spirepdf;
    if (wasmModule) {
      // Load the font file into the VFS
      await window.spire.FetchFileToVFS("MSYH.TTC", "/Library/Fonts/", `${process.env.PUBLIC_URL}/static/font/`);

      // Load the input PDF file into the VFS
      let inputFileName = "ToHTML.pdf";
      await window.spire.FetchFileToVFS(inputFileName, "", `${process.env.PUBLIC_URL}/static/data/`);

      // Create a PdfDocument object and load the PDF file
      let doc = new wasmModule.PdfDocument();
      doc.LoadFromFile(inputFileName);
      const totalPages = doc.Pages.Count;

      // Customize the conversion settings
      doc.ConvertOptions.SetPdfToHtmlOptions({ useEmbeddedSvg: false, useEmbeddedImg: true, maxPageOneFile: 1 });

      // Save the document to an HTML file
      const outputFileName = 'PdfToHtmlOptions.html';
      doc.SaveToFile({ fileName: outputFileName, fileFormat: wasmModule.FileFormat.HTML });

      // Release resources
      doc.Close();
      doc.Dispose();

      console.log(`totalPages: ${totalPages}`);
      // With maxPageOneFile set to 1, each page is saved to its own file
      // named PdfToHtmlOptions_<n>-<n>.html; download each one
      for (let i = 1; i <= totalPages; i++) {
        const fileName = `PdfToHtmlOptions_${i}-${i}.html`;
        downloadFileFromVFS(fileName);
      }
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert PDF to HTML in React Using JavaScript</h1>
      <button onClick={ConvertPdfToHTML} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;
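
The example above downloads one file per page because maxPageOneFile is 1 and the output files follow the PdfToHtmlOptions_<n>-<n>.html pattern. The sketch below generalizes that loop to other values of maxPageOneFile. It rests on an assumption that is not confirmed by the library documentation quoted here: that each file is named with the first and last page it contains, in the form <base>_<first>-<last>.html. The function name splitHtmlFileNames is hypothetical; verify the actual file names in the VFS before relying on it.

```javascript
// Hypothetical helper (not part of Spire.PDF for JavaScript).
// ASSUMPTION: split output files are named `<base>_<first>-<last><ext>`,
// inferred from the `PdfToHtmlOptions_${i}-${i}.html` names seen when
// maxPageOneFile is 1. Other values are unverified.
function splitHtmlFileNames(outputFileName, totalPages, maxPageOneFile) {
  if (!Number.isInteger(totalPages) || totalPages < 0) {
    throw new RangeError('totalPages must be a non-negative integer');
  }
  if (!Number.isInteger(maxPageOneFile) || maxPageOneFile < 1) {
    throw new RangeError('maxPageOneFile must be a positive integer');
  }

  // Split "PdfToHtmlOptions.html" into base name and extension
  const dot = outputFileName.lastIndexOf('.');
  const base = dot > 0 ? outputFileName.slice(0, dot) : outputFileName;
  const ext = dot > 0 ? outputFileName.slice(dot) : '.html';

  // Walk the page range in chunks of maxPageOneFile pages
  const names = [];
  for (let first = 1; first <= totalPages; first += maxPageOneFile) {
    const last = Math.min(first + maxPageOneFile - 1, totalPages);
    names.push(`${base}_${first}-${last}${ext}`);
  }
  return names;
}

// Possible usage in place of the download loop:
// splitHtmlFileNames(outputFileName, totalPages, 1).forEach(downloadFileFromVFS);
```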
Convert PDF to HTML Stream in React
Spire.PDF for JavaScript also supports converting a PDF to an HTML stream using the PdfDocument.SaveToStream() method. The detailed steps are as follows.
- Load the required font file and the input PDF file into the Virtual File System (VFS).
- Create a PdfDocument object using the wasmModule.PdfDocument() constructor.
- Load the PDF file using the PdfDocument.LoadFromFile() method.
- Create a memory stream using the wasmModule.Stream() constructor.
- Save the PDF document as an HTML stream using the PdfDocument.SaveToStream() method.
- Write the content of the stream to an HTML file in the VFS using the Stream.Save() method, then read that file with the window.dotnetRuntime.Module.FS.readFile() method to download it.
- JavaScript
import React, { useState, useEffect } from 'react';

function App() {
  // Holds the loaded WASM module; the button stays disabled until it is set
  const [wasmModule, setWasmModule] = useState(null);

  // Load spire.pdf.js from the public folder once, when the component mounts
  useEffect(() => {
    (async () => {
      try {
        const publicUrl = process.env.PUBLIC_URL || '';
        const spireModule = await import(/* webpackIgnore: true */ `${publicUrl}/spire.pdf.js`);
        const rawModule = spireModule.default || spireModule;
        window.wasmModule = typeof rawModule === 'function'
          ? await rawModule({ locateFile: p => p.endsWith('.wasm') ? `${publicUrl}/${p}` : p })
          : rawModule;
        setWasmModule(window.wasmModule);
      } catch (error) {
        console.error('Failed to load spire.pdf.js:', error);
      }
    })();
  }, []);

  const ConvertPdfToHTML = async () => {
    // Get WASM module
    const wasmModule = window.wasmModule.spirepdf;
    if (wasmModule) {
      // Load the font file into the VFS
      await window.spire.FetchFileToVFS("MSYH.TTC", "/Library/Fonts/", `${process.env.PUBLIC_URL}/static/font/`);

      // Load the input PDF file into the VFS
      let inputFileName = "ToHTML.pdf";
      await window.spire.FetchFileToVFS(inputFileName, "", `${process.env.PUBLIC_URL}/static/data/`);

      // Create a PdfDocument object and load the PDF file
      let doc = new wasmModule.PdfDocument();
      doc.LoadFromFile(inputFileName);

      // Define the output file name
      const outputFileName = 'PdfToHtmlStream.html';

      // Create a new memory stream
      let ms = new wasmModule.Stream();
      // Save the file as HTML stream
      doc.SaveToStream({stream: ms, fileformat: wasmModule.FileFormat.HTML});
      // Write the stream content to a file in the VFS
      ms.Save(outputFileName);

      // Release resources
      ms.Close();
      doc.Close();

      // Read the saved HTML file and convert to a Blob object
      const modifiedFileArray = window.dotnetRuntime.Module.FS.readFile(outputFileName);
      const modifiedFile = new Blob([modifiedFileArray], { type: "text/html" });

      // Create a Blob URL and trigger download
      const url = URL.createObjectURL(modifiedFile);
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);
      a.click();
      document.body.removeChild(a);
      URL.revokeObjectURL(url);
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert PDF to HTML in React Using JavaScript</h1>
      <button onClick={ConvertPdfToHTML} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;
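
Instead of downloading the result, an application may want the converted HTML as a string, for example to render it in the page. The sketch below decodes the bytes read from the VFS with the standard TextDecoder API. It assumes that FS.readFile() returns a Uint8Array and that the generated HTML is UTF-8 encoded; neither is stated in the steps above, so treat both as assumptions. The function name decodeHtmlBytes is hypothetical.

```javascript
// Hypothetical helper (not part of Spire.PDF for JavaScript).
// ASSUMPTIONS: the bytes come from FS.readFile() as a Uint8Array and the
// HTML output is UTF-8 encoded.
function decodeHtmlBytes(bytes, encoding = 'utf-8') {
  if (!(bytes instanceof Uint8Array)) {
    throw new TypeError('Expected a Uint8Array of HTML bytes');
  }
  // TextDecoder is a standard Web API, available in browsers and in Node.js
  return new TextDecoder(encoding).decode(bytes);
}

// Possible usage after ms.Save(outputFileName):
// const bytes = window.dotnetRuntime.Module.FS.readFile(outputFileName);
// const html = decodeHtmlBytes(bytes);
```

The resulting string could then be stored in React state and passed to the srcDoc attribute of an iframe, which keeps the converted markup isolated from the rest of the page.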
Get a Free License
To fully experience the capabilities of Spire.PDF for JavaScript without any evaluation limitations, you can request a free 30-day trial license.
