Converting PowerPoint presentations to PDF ensures that slide content remains intact while making the file easier to share and view across different devices. The PDF format preserves the original layout, text, and images, preventing unintended modifications and ensuring consistent formatting. This conversion is especially useful for professional and academic settings, where maintaining document integrity and accessibility is essential. Additionally, PDFs offer enhanced security features, such as restricted editing and password protection, making them a reliable choice for distributing important presentations. In this article, we will demonstrate how to convert PowerPoint presentations to PDF in React using Spire.Presentation for JavaScript.

Install Spire.Presentation for JavaScript

To get started with converting PowerPoint to PDF in a React application, you can either download Spire.Presentation for JavaScript from the official website or install it via npm with the following command:

npm i spire.office

The downloaded product package integrates Spire.Doc for JavaScript, Spire.XLS for JavaScript, Spire.PDF for JavaScript, and Spire.Presentation for JavaScript. To use Spire.Presentation for JavaScript functionality, you need to copy the corresponding files (spire.presentation.js, Spire.Presentation.Wasm.zip, spire.common.js, Spire.Common.Wasm.zip, and the _framework folder) to the public folder of your project. Additionally, to ensure proper text rendering, font files can be added to a custom path of your choice. In the following example, the font addition path is: public\static\font.
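Files placed under the public folder are fetched at runtime by URL, so the font folder above has to be turned into a directory URL before it can be passed to the library. The sketch below shows one way to build such a URL without doubled or missing slashes. The helper name publicAssetUrl is hypothetical and not part of Spire.Presentation; it only assumes that the base URL may or may not end with a slash.

```javascript
// Hypothetical helper (not a Spire API): join a base URL and path segments
// so that exactly one slash separates them and the result ends with a slash.
function publicAssetUrl(baseUrl, ...segments) {
  // Drop trailing slashes from the base; treat undefined/null as an empty base.
  const base = (baseUrl || '').replace(/\/+$/, '');
  // Trim leading and trailing slashes on each segment and skip empty ones.
  const path = segments
    .map((s) => String(s).replace(/^\/+|\/+$/g, ''))
    .filter(Boolean)
    .join('/');
  return path ? `${base}/${path}/` : `${base}/`;
}

// Example: with an empty base, the font folder resolves to "/static/font/".
console.log(publicAssetUrl('', 'static', 'font'));
```

In a Create React App project the first argument would typically be process.env.PUBLIC_URL, which is an empty string when the app is served from the site root.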

For more details, refer to the documentation: How to Integrate Spire.Presentation for JavaScript in a React Project.

Convert a PowerPoint Presentation to PDF

Converting a PowerPoint presentation to PDF allows you to share the entire document while preserving its original layout. Using the Presentation.SaveToFile() method, developers can export the full presentation to a PDF file. Below are the detailed steps to perform this operation.

  • Create an object of the Presentation class.
  • Load a presentation file using the Presentation.LoadFromFile() method.
  • Save the presentation to a PDF document using the Presentation.SaveToFile() method.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {
   const [wasmModule, setWasmModule] = useState(null);
   useEffect(() => {
    (async () => {
      try {
        const publicUrl = process.env.PUBLIC_URL || '';
        const spireModule = await import(/* webpackIgnore: true */ `${publicUrl}/spire.presentation.js`);
        const rawModule = spireModule.default || spireModule;
        window.wasmModule = typeof rawModule === 'function' 
          ? await rawModule({ locateFile: p => p.endsWith('.wasm') ? `${publicUrl}/${p}` : p })
          : rawModule;       
        setWasmModule(window.wasmModule);
      } catch (error) {
        console.error('Failed to load spire.presentation.js:', error);
      }
    })();
  }, []);

  const ConvertPowerPointToPDF = async () => {
    const wasmModule = window.wasmModule.spirepresentation;
    
    if (wasmModule) {
      // Specify the input file paths
      let inputFileName  = "Sample.pptx";
      await window.spire.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/static/data/`);
      await window.spire.FetchFileToVFS("arial.ttf", "/Library/Fonts/", `${process.env.PUBLIC_URL}/static/font/`);

      // Create a Presentation instance and load the PowerPoint file from the virtual file system
      const ppt =new wasmModule.Presentation();
      ppt.LoadFromFile(inputFileName);

      // Define the output file name
      const outputFileName = "PowerPointToPDF.pdf";

      // Save the PowerPoint file to PDF format
      ppt.SaveToFile({ file: outputFileName, fileFormat: wasmModule.FileFormat.PDF });

      // Read the generated PDF file
      const modifiedFileArray = window.dotnetRuntime.Module.FS.readFile(outputFileName);

      // Create a Blob object from the PDF file
      const modifiedFile = new Blob([modifiedFileArray], { type: "application/pdf" });

      // Create a URL for the Blob
      const url = URL.createObjectURL(modifiedFile);

      // Create an anchor element to trigger the download
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);  
      a.click(); 
      document.body.removeChild(a); 
      URL.revokeObjectURL(url); 
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert a PowerPoint Presentation to PDF in React</h1>
      <button onClick={ConvertPowerPointToPDF} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}


export default App;

Run the code to launch the React app at localhost:3000. Once it's running, click on the "Convert" button to convert the PowerPoint presentation to PDF:

Run the code to launch the React app at localhost:3000

The screenshot below shows the input PowerPoint presentation and the converted PDF:

Convert a PowerPoint Presentation to PDF
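The loader at the top of the sample passes a locateFile callback that redirects requests for .wasm files to the public folder and leaves every other path untouched. The sketch below isolates that rule as a small factory so it can be checked on its own. The name makeLocateFile is hypothetical, and trimming a trailing slash from the base URL is an added assumption that the sample itself does not make.

```javascript
// Hypothetical factory (not a Spire API): returns a callback with the same
// rule as the inline locateFile in the sample above.
function makeLocateFile(publicUrl) {
  // Assumption: tolerate a base URL that ends with a slash.
  const base = (publicUrl || '').replace(/\/+$/, '');
  // .wasm files are resolved against the public folder; other paths pass through.
  return (path) => (path.endsWith('.wasm') ? `${base}/${path}` : path);
}

const locateFile = makeLocateFile('/my-app');
console.log(locateFile('dotnet.native.wasm')); // resolved against /my-app
console.log(locateFile('spire.common.js'));    // returned unchanged
```

The file names in the two console.log calls are placeholders chosen for illustration only.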

Convert a PowerPoint Presentation to PDF with a Custom Page Size

Developers can customize the page size of the resulting PDF by adjusting the slide size using the Presentation.SlideSize.Type property during conversion. This ensures that the converted PDF meets specific formatting or printing needs. Here are the detailed steps for this operation.

  • Create an object of the Presentation class.
  • Load a presentation file using the Presentation.LoadFromFile() method.
  • Set the slide size to A4 using the Presentation.SlideSize.Type property.
  • Save the presentation to a PDF document using the Presentation.SaveToFile() method.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {
   const [wasmModule, setWasmModule] = useState(null);
   useEffect(() => {
    (async () => {
      try {
        const publicUrl = process.env.PUBLIC_URL || '';
        const spireModule = await import(/* webpackIgnore: true */ `${publicUrl}/spire.presentation.js`);
        const rawModule = spireModule.default || spireModule;
        window.wasmModule = typeof rawModule === 'function' 
          ? await rawModule({ locateFile: p => p.endsWith('.wasm') ? `${publicUrl}/${p}` : p })
          : rawModule;       
        setWasmModule(window.wasmModule);
      } catch (error) {
        console.error('Failed to load spire.presentation.js:', error);
      }
    })();
  }, []);

  const ConvertPowerPointToPDF = async () => {
    const wasmModule = window.wasmModule.spirepresentation;
    
    if (wasmModule) {
      // Specify the input file paths
      let inputFileName  = "Sample.pptx";
      await window.spire.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/static/data/`);
      await window.spire.FetchFileToVFS("arial.ttf", "/Library/Fonts/", `${process.env.PUBLIC_URL}/static/font/`);

      // Create a Presentation instance and load the PowerPoint file from the virtual file system
      const ppt =new wasmModule.Presentation();
      ppt.LoadFromFile(inputFileName);

      //Set A4 page size
      ppt.SlideSize.Type = wasmModule.SlideSizeType.A4;

      // Define the output file name
      const outputFileName = "ToPdfWithSpecificPageSize.pdf";      

      // Save the PowerPoint file to PDF format
      ppt.SaveToFile({ file: outputFileName, fileFormat: wasmModule.FileFormat.PDF });

      // Read the generated PDF file
      const modifiedFileArray = window.dotnetRuntime.Module.FS.readFile(outputFileName);

      // Create a Blob object from the PDF file
      const modifiedFile = new Blob([modifiedFileArray], { type: "application/pdf" });

      // Create a URL for the Blob
      const url = URL.createObjectURL(modifiedFile);

      // Create an anchor element to trigger the download
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);  
      a.click(); 
      document.body.removeChild(a); 
      URL.revokeObjectURL(url); 
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert a PowerPoint Presentation to PDF with a Custom Page Size in React</h1>
      <button onClick={ConvertPowerPointToPDF} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}


export default App;

Convert a PowerPoint Presentation to PDF with a Custom Page Size
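The conversion samples create a Presentation object but never release it, while the integration sample later in this document calls ppt.Dispose() once the file has been saved. A minimal sketch of a wrapper that guarantees the Dispose() call, even when loading or saving throws, could look like this. The helper name withDisposable is hypothetical, and it assumes the work done inside the callback is synchronous, as the LoadFromFile() and SaveToFile() calls in the samples appear to be.

```javascript
// Hypothetical helper (not a Spire API): run `work` with a resource and
// always call its Dispose() method afterwards, whether or not `work` throws.
function withDisposable(resource, work) {
  try {
    return work(resource);
  } finally {
    if (resource && typeof resource.Dispose === 'function') {
      resource.Dispose();
    }
  }
}

// Intended use inside the click handler, reusing calls from the sample above:
//   withDisposable(new wasmModule.Presentation(), (ppt) => {
//     ppt.LoadFromFile(inputFileName);
//     ppt.SlideSize.Type = wasmModule.SlideSizeType.A4;
//     ppt.SaveToFile({ file: outputFileName, fileFormat: wasmModule.FileFormat.PDF });
//   });
```

If the callback returned a promise, Dispose() would run before the promise settled, so an asynchronous variant would need to await the work inside the try block.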

Convert a PowerPoint Slide to PDF

Converting a single PowerPoint slide to PDF allows for easy extraction and sharing of individual slides without exporting the entire presentation. Using the ISlide.SaveToFile() method, developers can convert individual slides to PDF with ease. The detailed steps for this operation are as follows.

  • Create an object of the Presentation class.
  • Load a presentation file using the Presentation.LoadFromFile() method.
  • Get a slide using the Presentation.Slides.get_Item() method.
  • Save the slide as a PDF document using the ISlide.SaveToFile() method.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {
   const [wasmModule, setWasmModule] = useState(null);
   useEffect(() => {
    (async () => {
      try {
        const publicUrl = process.env.PUBLIC_URL || '';
        const spireModule = await import(/* webpackIgnore: true */ `${publicUrl}/spire.presentation.js`);
        const rawModule = spireModule.default || spireModule;
        window.wasmModule = typeof rawModule === 'function' 
          ? await rawModule({ locateFile: p => p.endsWith('.wasm') ? `${publicUrl}/${p}` : p })
          : rawModule;       
        setWasmModule(window.wasmModule);
      } catch (error) {
        console.error('Failed to load spire.presentation.js:', error);
      }
    })();
  }, []);

  const ConvertPowerPointSlideToPDF = async () => {
    const wasmModule = window.wasmModule.spirepresentation;
    
    if (wasmModule) {
      // Specify the input file paths
      let inputFileName  = "Sample.pptx";
      await window.spire.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/static/data/`);
      await window.spire.FetchFileToVFS("arial.ttf", "/Library/Fonts/", `${process.env.PUBLIC_URL}/static/font/`);

      // Create a Presentation instance and load the PowerPoint file from the virtual file system
      const ppt =new wasmModule.Presentation();
      ppt.LoadFromFile(inputFileName);

      // Get the second slide
      let slide = ppt.Slides.get_Item(1);

      // Define the output file name
      const outputFileName = "SlideToPdf.pdf";      

      // Save the slide to PDF format
      slide.SaveToFile( outputFileName, wasmModule.FileFormat.PDF);

      // Read the generated PDF file
      const modifiedFileArray = window.dotnetRuntime.Module.FS.readFile(outputFileName);

      // Create a Blob object from the PDF file
      const modifiedFile = new Blob([modifiedFileArray], { type: "application/pdf" });

      // Create a URL for the Blob
      const url = URL.createObjectURL(modifiedFile);

      // Create an anchor element to trigger the download
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);  
      a.click(); 
      document.body.removeChild(a); 
      URL.revokeObjectURL(url); 
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert a PowerPoint Slide to PDF in React</h1>
      <button onClick={ConvertPowerPointSlideToPDF} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}


export default App;

Convert a PowerPoint Slide to PDF
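The sample passes 1 to get_Item() to obtain the second slide, which suggests the slide collection is zero-based. When the slide number comes from user input, a small guard can do the conversion and reject out-of-range values before the library is called. The sketch below is one way to write it; the function names are hypothetical, and the total slide count is assumed to be supplied by the caller.

```javascript
// Hypothetical helper: convert a 1-based slide number (as a user would type it)
// into the zero-based index that get_Item() appears to expect.
function toSlideIndex(slideNumber, slideCount) {
  if (!Number.isInteger(slideNumber) || slideNumber < 1 || slideNumber > slideCount) {
    throw new RangeError(`Slide ${slideNumber} is outside the range 1..${slideCount}`);
  }
  return slideNumber - 1;
}

// Hypothetical helper: derive a per-slide output file name.
function slidePdfName(baseName, slideNumber) {
  return `${baseName}-slide${slideNumber}.pdf`;
}

// Example: the second slide of a five-slide deck maps to index 1.
console.log(toSlideIndex(2, 5), slidePdfName('Sample', 2));
```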

Get a Free License

To fully experience the capabilities of Spire.Presentation for JavaScript without any evaluation limitations, you can request a free 30-day trial license.

In the ever-evolving world of web development, React continues to be the preferred framework for creating engaging and responsive user interfaces. For developers looking to enhance their applications with robust presentation capabilities, Spire.Presentation for JavaScript emerges as an invaluable resource.

In this guide, we'll explore the steps to effectively integrate Spire.Presentation for JavaScript into your React application, ensuring you can leverage its robust features for tasks such as generating slides, editing content, and exporting presentations in various formats.

Benefits of Using Spire.Presentation for JavaScript in React

React, a powerful JavaScript library for building interactive user interfaces, has become a cornerstone in modern web development. Complementing this is Spire.Presentation for JavaScript, a specialized library designed to enhance PowerPoint presentation management within web applications.

By integrating Spire.Presentation for JavaScript into your React project, you can unlock advanced features for creating and manipulating presentations easily. Here are some of the key benefits:

  • Rich Functionality: Spire.Presentation for JavaScript offers a comprehensive range of features for managing PowerPoint files, including creating slides, adding text, images, charts, and shapes. This rich functionality allows developers to build robust presentation applications without needing to rely on external tools.
  • Seamless Integration: Designed to work harmoniously with various JavaScript frameworks, including React, Spire.Presentation for JavaScript integrates smoothly into existing projects, facilitating an efficient and enjoyable development experience.
  • Cross-Platform Compatibility: Spire.Presentation for JavaScript is designed to work across different platforms and devices. Whether your application is run on desktop, tablet, or mobile devices, you can expect consistent performance and functionality.
  • High-Quality Output: Spire.Presentation for JavaScript ensures that the presentations you create are of high quality, maintaining the integrity of fonts, images, and layouts. This quality is crucial for professional presentations and business-related use cases.

Set Up Your Environment

Step 1. Install Node.js and npm

Download and install Node.js from the official website. Make sure to choose the version that matches your operating system.

After the installation is complete, you can verify that Node.js and npm are working correctly by running node -v and npm -v in your terminal:

Check if node.js and npm are successfully installed

Step 2. Create a New React Project

Create a new React project named my-app using Create React App from the terminal:

npx create-react-app my-app

Create a react project

If your React project is compiled successfully, the app will be served at http://localhost:3000, allowing you to view and test your application in a browser.

Launch React app at localhost 3000

To visually browse and manage the files in your project, you can open the project using VS Code.

Open React project in VS Code

Integrate Spire.Presentation for JavaScript in Your Project

Download Spire.Presentation for JavaScript from our website and unzip it to a location on your disk. The downloaded product package integrates Spire.Doc for JavaScript, Spire.XLS for JavaScript, Spire.PDF for JavaScript, and Spire.Presentation for JavaScript. When using the features of Spire.Presentation for JavaScript, the required files are: spire.presentation.js, Spire.Presentation.Wasm.zip, spire.common.js, Spire.Common.Wasm.zip, and the _framework folder.

Download Spire.Presentation for JavaScript library

Alternatively, you can download Spire.Presentation for JavaScript using npm. In the terminal within VS Code, run the following command:

npm i spire.office

Install Spire.Presentation for JavaScript via npm

Once the installation is complete, the product packages will be saved in the node_modules/spire.office path of your project.

The library files downloaded via npm

Copy these five items into the "public" folder in your React project: spire.presentation.js, Spire.Presentation.Wasm.zip, spire.common.js, Spire.Common.Wasm.zip, and the _framework folder.

Copy library to React project
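The copy step can also be scripted so it is repeatable after each npm install. The following Node.js sketch copies the five items listed above and reports any that are missing. It assumes the items sit directly inside the source folder, which may not match the actual package layout, and the function name copySpireAssets is hypothetical.

```javascript
const fs = require('fs');
const path = require('path');

// The five items named in the step above.
const REQUIRED_ITEMS = [
  'spire.presentation.js',
  'Spire.Presentation.Wasm.zip',
  'spire.common.js',
  'Spire.Common.Wasm.zip',
  '_framework',
];

// Hypothetical helper: copy each item from sourceDir into publicDir.
// Returns the names of items that were not found in sourceDir.
function copySpireAssets(sourceDir, publicDir, items = REQUIRED_ITEMS) {
  fs.mkdirSync(publicDir, { recursive: true });
  const missing = [];
  for (const item of items) {
    const from = path.join(sourceDir, item);
    if (!fs.existsSync(from)) {
      missing.push(item);
      continue;
    }
    // recursive: true lets the same call handle both files and the _framework folder.
    fs.cpSync(from, path.join(publicDir, item), { recursive: true });
  }
  return missing;
}

// Example call (assumed layout, adjust the source path to your installation):
//   const missing = copySpireAssets('node_modules/spire.office', 'public');
//   if (missing.length) console.warn('Not found:', missing.join(', '));
```

fs.cpSync requires Node.js 16.7 or later.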

Add font files you plan to use to the "public/static/font" folder in your project. (Not always necessary)

Add font file to React project

Create and Save Presentation Files Using JavaScript

Modify the code in the "App.js" file to generate a PowerPoint file using the WebAssembly (WASM) module.

Modify app.js file

Here is the entire code:

  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {
   const [wasmModule, setWasmModule] = useState(null);
   useEffect(() => {
    (async () => {
      try {
        const publicUrl = process.env.PUBLIC_URL || '';
        const spireModule = await import(/* webpackIgnore: true */ `${publicUrl}/spire.presentation.js`);
        const rawModule = spireModule.default || spireModule;
        window.wasmModule = typeof rawModule === 'function' 
          ? await rawModule({ locateFile: p => p.endsWith('.wasm') ? `${publicUrl}/${p}` : p })
          : rawModule;       
        setWasmModule(window.wasmModule);
      } catch (error) {
        console.error('Failed to load spire.presentation.js:', error);
      }
    })();
  }, []);

  
  const CreatePowerPoint = async () => {
    const wasmModule = window.wasmModule.spirepresentation;
    
    if (wasmModule) {
       // Load the ARIALUNI.TTF font file into the virtual file system (VFS)
        await window.spire.FetchFileToVFS("ARIALUNI.TTF", "/Library/Fonts/", `${process.env.PUBLIC_URL}/static/font/`);

        // Create a PPT document
        const ppt = new wasmModule.Presentation();

        // Add a new shape to the PPT document
        let rec = wasmModule.RectangleF.FromLTRB(ppt.SlideSize.Size.Width / 2 - 250,80,(500 + ppt.SlideSize.Size.Width / 2 - 250),230);
        let shape = ppt.Slides.get_Item(0).Shapes.AppendShape({shapeType:wasmModule.ShapeType.Rectangle,rectangle:rec});

        shape.ShapeStyle.LineColor.Color = wasmModule.Color.get_White();
        shape.Fill.FillType = wasmModule.FillFormatType.None;

        // Add text to the shape
        shape.AppendTextFrame("Hello World!");

        // Set the font and fill style of the text
        let textRange = shape.TextFrame.TextRange;
        textRange.Fill.FillType = wasmModule.FillFormatType.Solid;
        textRange.Fill.SolidColor.Color = wasmModule.Color.get_CadetBlue();
        textRange.FontHeight = 66;
        textRange.LatinFont = wasmModule.TextFont;

        // Define the output file name 
        const outputFileName = "HelloWorld.pptx";

        // Save to file
        ppt.SaveToFile({file:outputFileName,fileFormat:wasmModule.FileFormat.Pptx2013});

        // Read the saved file and convert to a Blob object
        const modifiedFileArray = window.dotnetRuntime.Module.FS.readFile(outputFileName);
        const modifiedFile = new Blob([modifiedFileArray], { type: "application/vnd.openxmlformats-officedocument.presentationml.presentation" });

        // Clean up resources
        ppt.Dispose();

      // Create a URL for the Blob
      const url = URL.createObjectURL(modifiedFile);

      // Create an anchor element to trigger the download
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);
      a.click(); 
      document.body.removeChild(a); 
      URL.revokeObjectURL(url); 
    }
  };
  
  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Create a PowerPoint Document in React</h1>
      <button onClick={CreatePowerPoint} disabled={!wasmModule}>
        Generate
      </button>
    </div>
  );
}

export default App;
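The arguments passed to RectangleF.FromLTRB() in the sample place a 500-unit-wide box in the horizontal center of the slide: the left edge is half the slide width minus 250, and the right edge is 500 units further along. The sketch below spells out that arithmetic as a plain function. The function name is hypothetical, and the slide width of 960 in the example is an arbitrary value chosen for illustration.

```javascript
// Hypothetical helper: compute left/top/right/bottom for a box that is
// horizontally centered on a slide of the given width.
function centeredRectLTRB(slideWidth, boxWidth, top, boxHeight) {
  const left = slideWidth / 2 - boxWidth / 2;
  return {
    left,
    top,
    right: left + boxWidth,
    bottom: top + boxHeight,
  };
}

// With boxWidth = 500, top = 80, and boxHeight = 150 this reproduces the
// expressions used in the sample: (W / 2 - 250, 80, 500 + W / 2 - 250, 230).
console.log(centeredRectLTRB(960, 500, 80, 150));
```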

Save the changes by clicking "File" - "Save".

Save changes

Start the development server by entering the following command in the terminal within VS Code:

npm start

Start your React project by running npm start

Once the React app is successfully compiled, it will open in your default web browser, typically at http://localhost:3000.

React app opens at local host 3000

Click "Generate," and a "Save As" window will prompt you to save the output file in the designated folder.

Save the generated PowerPoint file at the specified folder

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to lift the function limitations, you can request a 30-day trial license.

We are delighted to announce the release of Spire.Presentation for Java 10.2.2. This version enhances the conversion from PowerPoint documents to images. Moreover, some known issues are fixed successfully in this version, such as the issue that it threw "Value cannot be null" when saving a PowerPoint document. More details are listed below.

Here is a list of changes made in this release

Category | ID | Description
Bug | SPIREPPT-2669 | Fixes the issue that the shadow effect of text was lost when converting PowerPoint to images.
Bug | SPIREPPT-2717 | Optimizes the function of adding annotations for specific text.
Bug | SPIREPPT-2718 | Fixes the issue that it threw "StringIndexOutOfBoundsException" when adding annotations for specific text.
Bug | SPIREPPT-2719 | Fixes the issue that the effect of converting PowerPoint to images was incorrect.
Bug | SPIREPPT-2722 | Fixes the issue that it threw "Value cannot be null" when saving a PowerPoint document.
Click the link below to download Spire.Presentation for Java 10.2.2:

We're pleased to announce the release of Spire.Doc 13.2.3. This version optimizes the time and resource consumption when converting Word to PDF, and also adds new interfaces for reading and writing chart titles, data labels, axis, legends, data tables and other chart attributes. More details are listed below.

Here is a list of changes made in this release

Category | ID | Description
New feature | - | Adds new interfaces for reading and writing chart titles, chart data labels, chart axis, chart legends, chart data tables and other attributes.
  • ChartTitle.Text property: Sets the chart title text.
  • ChartDataLabel.ShowValue property: Sets whether the data label includes the value.
  • ChartAxis.CategoryType property: Sets the type of the horizontal axis (automatic, text, or date).
  • ChartLegend.Position property: Sets the position of the legend.
  • ChartDataTable.Show property: Sets whether to display the data table.
New feature | - | Namespace changes:
  • Spire.Doc.Formatting.RowFormat.TablePositioning -> Spire.Doc.Formatting.TablePositioning
  • Spire.Doc.Printing.PagesPreSheet -> Spire.Doc.Printing.PagesPerSheet
New feature | - | Optimizes the time and resource consumption when converting Word to PDF, especially when working with large files or complex layouts.
Click the link to download Spire.Doc 13.2.3:
More information of Spire.Doc new release or hotfix:

We are excited to announce the release of the Spire.XLS for Java 15.2.1. The latest version enhances conversions from Excel to images and PDF. Besides, this update fixes the issue that the program threw a "NullPointerException" when loading an XLSX document. More details are listed below.

Here is a list of changes made in this release

Category | ID | Description
Bug | SPIREXLS-5575 | Fixes the issue that the program threw a "NullPointerException" when loading an XLSX document.
Bug | SPIREXLS-5668 | Fixes the issue that incorrect colors existed when converting Excel to images.
Bug | SPIREXLS-5685 | Fixes the issue that incomplete content displayed when converting Excel to PDF.
Click the link to download Spire.XLS for Java 15.2.1:

In the modern web development landscape, React has become the go-to framework for building dynamic and interactive user interfaces. When it comes to handling PDF documents within a React application, Spire.PDF for JavaScript stands out as a powerful tool.

This guide will walk you through how to integrate Spire.PDF for JavaScript into your React project, explore its benefits, and provide actionable insights to optimize your implementation.

Benefits of Using Spire.PDF for JavaScript in React

React, a widely used JavaScript library for crafting dynamic user interfaces, has become essential in modern web development. In tandem, Spire.PDF for JavaScript is a robust library tailored to enhance PDF document processing in web applications.

By incorporating Spire.PDF for JavaScript into your React project, you can introduce advanced PDF manipulation capabilities to your application. Here are some of the key advantages:

  • Effortless PDF Generation: Spire.PDF for JavaScript facilitates the creation and editing of PDF documents directly within React, allowing for efficient management without the need for external applications.
  • Cross-Platform Functionality: With Spire.PDF for JavaScript, you can generate PDFs that are accessible across various platforms, enabling users to view and edit documents from any location.
  • Comprehensive Features: Spire.PDF for JavaScript provides a wide array of features, including text formatting, image embedding, and annotation capabilities, making it perfect for applications that require detailed PDF manipulation.
  • Smooth Integration: Designed to work seamlessly with various JavaScript frameworks, including React, Spire.PDF for JavaScript integrates effortlessly into existing projects, ensuring a smooth development process.

Set Up Your Environment

Step 1. Install Node.js and npm

Download and install Node.js from the official website. Make sure to choose the version that matches your operating system.

After the installation is complete, you can verify that Node.js and npm are working correctly by running node -v and npm -v in your terminal:

Check if node.js and npm are successfully installed

Step 2. Create a New React Project

Create a new React project named my-app using Create React App from the terminal:

npx create-react-app my-app

Create a react project

If your React project is compiled successfully, the app will be served at http://localhost:3000, allowing you to view and test your application in a browser.

Launch React app at localhost 3000

To visually browse and manage the files in your project, you can open the project using VS Code.

Open React project in VS Code

Integrate Spire.PDF for JavaScript in Your Project

Download Spire.PDF for JavaScript from our website and unzip it to a location on your disk. The downloaded product package integrates Spire.Doc for JavaScript, Spire.XLS for JavaScript, Spire.PDF for JavaScript, and Spire.Presentation for JavaScript. When using the features of Spire.PDF for JavaScript, the required files are: spire.pdf.js, Spire.Pdf.Wasm.zip, spire.common.js, Spire.Common.Wasm.zip, and the _framework folder.

Download Spire.PDF for JavaScript library

Alternatively, you can download Spire.PDF for JavaScript using npm. In the terminal within VS Code, run the following command:

npm i spire.office

Download Spire.PDF for JavaScript via npm

Once the installation is complete, the product packages will be saved in the node_modules/spire.office path of your project.

The library files downloaded via npm

Copy these five items into the "public" folder in your React project: spire.pdf.js, Spire.Pdf.Wasm.zip, spire.common.js, Spire.Common.Wasm.zip, and the _framework folder.

Copy library to React project

Add font files you plan to use to the "public/static/font" folder in your project. (Not always necessary)

Add font files to react project

Create and Save PDF Files Using JavaScript

Modify the code in the "App.js" file to generate a PDF file using the WebAssembly (WASM) module.

Modify app.js file

Here is the entire code:

  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {
   const [wasmModule, setWasmModule] = useState(null);
   useEffect(() => {
    (async () => {
      try {
        const publicUrl = process.env.PUBLIC_URL || '';
        const spireModule = await import(/* webpackIgnore: true */ `${publicUrl}/spire.pdf.js`);
        const rawModule = spireModule.default || spireModule;
        window.wasmModule = typeof rawModule === 'function' 
          ? await rawModule({ locateFile: p => p.endsWith('.wasm') ? `${publicUrl}/${p}` : p })
          : rawModule;       
        setWasmModule(window.wasmModule);
      } catch (error) {
        console.error('Failed to load spire.pdf.js:', error);
      }
    })();
  }, []);

  
  const CreatePdfDocument = async () => {
    const wasmModule = window.wasmModule.spirepdf;
    
    if (wasmModule) {
      // Load the ARIALUNI.TTF font file into the virtual file system (VFS)
      await window.spire.FetchFileToVFS("ARIALUNI.TTF", "/Library/Fonts/", `${process.env.PUBLIC_URL}/static/font/`);

      // Create a pdf instance
      let doc = new wasmModule.PdfDocument();

      // Create one page
      let pagebase = doc.Pages.Add();
            
      const text = "Hello World";
      let pdffont = new wasmModule.PdfFont({fontFamily:wasmModule.PdfFontFamily.Helvetica, size:30.0});
      let pdfBrush = new wasmModule.PdfSolidBrush({pdfRGBColor: new wasmModule.PdfRGBColor({color: wasmModule.Color.get_Black()})});
      // Draw the text
      pagebase.Canvas.DrawString({s: text, font: pdffont, brush: pdfBrush, x: 10, y: 10});

      // Define the output file name
      const outputFileName = "HelloWorld_out.pdf";

      // Save the document to the specified path
      doc.SaveToFile(outputFileName);
      doc.Close();

      // Read the saved file and convert to a Blob object
      const modifiedFileArray = window.dotnetRuntime.Module.FS.readFile(outputFileName);
      const modifiedFile = new Blob([modifiedFileArray], { type: "application/pdf" });

      // Clean up resources
      doc.Dispose();

      // Create a URL for the Blob
      const url = URL.createObjectURL(modifiedFile);

      // Create an anchor element to trigger the download
      const a = document.createElement('a');
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);
      a.click(); 
      document.body.removeChild(a); 
      URL.revokeObjectURL(url); 

    }
  };
  
  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Create a PDF Document in React</h1>
      <button onClick={CreatePdfDocument} disabled={!wasmModule}>
        Generate
      </button>
    </div>
  );
}

export default App;
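The samples in this document hard-code the Blob type for each output: "application/pdf" for PDF files and the presentationml type for .pptx files. If one download routine serves several output formats, the type can be looked up from the file extension instead. The sketch below shows that lookup; the function name mimeTypeFor and the fallback to "application/octet-stream" are assumptions added for illustration.

```javascript
// MIME types taken from the samples in this document, keyed by file extension.
const MIME_TYPES = new Map([
  ['pdf', 'application/pdf'],
  ['pptx', 'application/vnd.openxmlformats-officedocument.presentationml.presentation'],
]);

// Hypothetical helper: pick the Blob type from the output file name.
// Unknown or missing extensions fall back to a generic binary type.
function mimeTypeFor(fileName) {
  const ext = fileName.slice(fileName.lastIndexOf('.') + 1).toLowerCase();
  return MIME_TYPES.get(ext) ?? 'application/octet-stream';
}

// Intended use with the Blob constructor from the samples:
//   new Blob([modifiedFileArray], { type: mimeTypeFor(outputFileName) });
console.log(mimeTypeFor('HelloWorld_out.pdf'));
```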

Save the changes by clicking "File" - "Save".

Save changes

Start the development server by entering the following command in the terminal within VS Code:

npm start

Start your react project by running npm start

Once the React app is successfully compiled, it will open in your default web browser, typically at http://localhost:3000.

React app opens at local host 3000

Click "Generate," and a "Save As" window will prompt you to save the output file in the designated folder.

Save the generated PDF file at the specified folder

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to lift the function limitations, you can request a 30-day trial license.

Wednesday, 08 January 2025 01:04

Python: Recognize Text from Images

In today's digital world, extracting text from images has become essential for many fields, including business, education, and data analysis. OCR (Optical Character Recognition) technology makes this process effortless by converting text in images into editable and searchable formats quickly and accurately. Whether it's turning handwritten notes into digital files or pulling key information from scanned documents, OCR simplifies tasks and makes work more efficient. In this article, we will demonstrate how to recognize text from images in Python using Spire.OCR for Python.

Install Spire.OCR for Python

This scenario requires Spire.OCR for Python and plum-dispatch v1.7.4. Both can be installed on Windows with the following pip command.

pip install Spire.OCR

Download the Model of Spire.OCR for Python

Spire.OCR for Python provides different recognition models for different operating systems. Download the model suited to your system from one of the links below:

After downloading, extract the package and save it to a specific directory on your system.
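
Because the model differs per operating system, a script that runs on more than one machine may want to choose the model folder at run time. The sketch below shows one way to do that with the standard library only. The function name, the base directory, and every folder name except win-x64 (the one used in the examples in this article) are hypothetical placeholders; replace them with the names of the packages you actually extracted.

```python
import os
import platform

# Hypothetical mapping from (system, machine) to the extracted model folder.
# Only "win-x64" appears in this article; the other names are assumptions.
MODEL_FOLDERS = {
    ("Windows", "AMD64"): "win-x64",
    ("Linux", "x86_64"): "linux-x64",   # assumed folder name
    ("Darwin", "x86_64"): "mac-x64",    # assumed folder name
}


def resolve_model_path(base_dir, system=None, machine=None):
    """Return the model folder under base_dir for the given or current platform."""
    system = system or platform.system()
    machine = machine or platform.machine()
    try:
        folder = MODEL_FOLDERS[(system, machine)]
    except KeyError:
        raise RuntimeError(f"No OCR model configured for {system}/{machine}") from None
    return os.path.join(base_dir, folder)
```

The returned string could then be assigned to ConfigureOptions.ModelPath in place of a hard-coded path.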

Recognize Text from Images in Python

Spire.OCR for Python offers the OcrScanner.Scan() method to recognize text from images. Once the recognition is complete, you can use the OcrScanner.Text property to retrieve the recognized text and then save it to a file for further use. The detailed steps are as follows.

  • Create an instance of the OcrScanner class to handle OCR operations.
  • Create an instance of the ConfigureOptions class to configure the OCR settings.
  • Specify the file path to the model and the desired recognition language through the ConfigureOptions.ModelPath and ConfigureOptions.Language properties.
  • Apply the configuration settings to the OcrScanner instance using the OcrScanner.ConfigureDependencies() method.
  • Call the OcrScanner.Scan() method to perform text recognition on the image.
  • Retrieve the recognized text using the OcrScanner.Text property.
  • Save the extracted text to a file for further use.
  • Python
from spire.ocr import *

# Create an instance of the OcrScanner class
scanner = OcrScanner()

# Configure OCR settings
configureOptions = ConfigureOptions()
# Set the file path to the model
configureOptions.ModelPath = r'D:\OCR\win-x64'  
# Set the recognition language. Supported languages include English, Chinese, Chinesetraditional, French, German, Japanese, and Korean.
configureOptions.Language = 'English'  
# Apply the settings to the OcrScanner instance
scanner.ConfigureDependencies(configureOptions)

# Recognize text from the image
scanner.Scan(r'Sample.png')

# Retrieve the recognized text and save it to a file
text = scanner.Text.ToString() + '\n'
with open('output.txt', 'a', encoding='utf-8') as file:
    file.write(text + '\n')

Recognize Text with Coordinates from Images in Python

In scenarios where you need the exact position of text in an image, such as for layout analysis or advanced data processing, extracting coordinate information is essential. With Spire.OCR for Python, you can retrieve recognized text block by block. Each text block includes detailed positional data such as the x and y coordinates, width, and height. The detailed steps are as follows.

  • Create an instance of the OcrScanner class to handle OCR operations.
  • Create an instance of the ConfigureOptions class to configure the OCR settings.
  • Specify the file path to the model and the desired recognition language through the ConfigureOptions.ModelPath and ConfigureOptions.Language properties.
  • Apply the configuration settings to the OcrScanner instance using the OcrScanner.ConfigureDependencies() method.
  • Call the OcrScanner.Scan() method to perform text recognition on the image.
  • Retrieve the recognized text using the OcrScanner.Text property.
  • Iterate through the text blocks in the recognized text. For each block, use the IOCRTextBlock.Text property to get the text and the IOCRTextBlock.Box property to retrieve positional details (x, y, width, and height).
  • Save the results to a text file for further analysis.
  • Python
from spire.ocr import *

# Create an instance of the OcrScanner class
scanner = OcrScanner()

# Configure OCR settings
configureOptions = ConfigureOptions()
# Set the file path to the model
configureOptions.ModelPath = r'D:\OCR\win-x64' 
# Set the recognition language. Supported languages include English, Chinese, Chinesetraditional, French, German, Japanese, and Korean.
configureOptions.Language = 'English' 
# Apply the settings to the OcrScanner instance
scanner.ConfigureDependencies(configureOptions)

# Recognize text from the image
scanner.Scan(r'sample.png')
# Retrieve the recognized text 
text = scanner.Text

# Iterate through the text blocks in the recognized text. For each text block, retrieve its text and positional data (x, y, width, and height)
block_text = ""
for block in text.Blocks:
    rectangle = block.Box
    block_info = f'{block.Text} -> x: {rectangle.X}, y: {rectangle.Y}, w: {rectangle.Width}, h: {rectangle.Height}'
    block_text += block_info + '\n'

# Save the results to a file
with open('output.txt', 'a', encoding='utf-8') as file:
    file.write(block_text + '\n')
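
Once coordinates are available, a common next step is to arrange the blocks in reading order. The following sketch works on plain (text, x, y, width, height) tuples rather than on Spire.OCR objects, so it can be tried without the library. The tuple layout, the function name, and the line_tolerance value are illustrative assumptions, and the sample data is made up.

```python
def sort_reading_order(blocks, line_tolerance=10):
    """Sort (text, x, y, w, h) tuples top-to-bottom, then left-to-right.

    A block joins the current line when its y value is within
    line_tolerance pixels of the first block on that line.
    """
    lines = []
    for block in sorted(blocks, key=lambda b: b[2]):
        if lines and abs(block[2] - lines[-1][0][2]) <= line_tolerance:
            lines[-1].append(block)
        else:
            lines.append([block])

    ordered = []
    for line in lines:
        # Within one line, order blocks by their x coordinate
        ordered.extend(sorted(line, key=lambda b: b[1]))
    return ordered


# Made-up sample data for illustration
sample = [
    ("world", 120, 12, 60, 20),
    ("Hello", 10, 10, 60, 20),
    ("Second line", 10, 50, 140, 20),
]
ordered_text = [b[0] for b in sort_reading_order(sample)]
```

To use it with real results, build each tuple from block.Text and the X, Y, Width, and Height values of block.Box, as in the loop above.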

Custom document properties are user-defined fields within a Word document that store specific metadata. Unlike standard properties, such as title, author, or subject, which are predefined by Microsoft Word, these custom properties provide users with the flexibility to define and manage additional metadata fields according to their specific requirements. In this article, we will demonstrate how to add, extract, and remove custom document properties in Word documents in Python using Spire.Doc for Python.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. Both can be installed on Windows with the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows

Add Custom Document Properties to Word in Python

Spire.Doc for Python provides the CustomDocumentProperties.Add() method, which lets developers assign values of different types, such as text, number, date, or yes/no, to the custom properties of a Word document. The steps below show how to add custom document properties with different value types to a Word document.

  • Initialize an instance of the Document class.
  • Load a Word document using the Document.LoadFromFile() method.
  • Get the custom document properties of the document through the Document.CustomDocumentProperties property.
  • Add custom document properties with different data types to the document using the CustomDocumentProperties.Add(name, value) method.
  • Save the result document using the Document.SaveToFile() method.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()
# Load a Word document
document.LoadFromFile("Example.docx")

# Add custom document properties with different types of values to the document
customProperties = document.CustomDocumentProperties
customProperties.Add("DocumentCategory", String("Technical Report"))
customProperties.Add("RevisionNumber", Int32(5))
customProperties.Add("LastReviewedDate", DateTime(2024, 12, 1, 0, 0, 0, 0))
customProperties.Add("RequiresFollowUp", Boolean(False))

# Save the result document
document.SaveToFile("AddCustomDocumentProperties.docx", FileFormat.Docx2016)
document.Close()

Extract Custom Document Properties in Word in Python

Extracting custom document properties allows developers to access metadata for further analysis, reporting, or integration into other applications. Spire.Doc for Python makes it simple to retrieve the details of these properties using the CustomDocumentProperty.Name and CustomDocumentProperty.Value properties. The detailed steps are as follows.

  • Initialize an instance of the Document class.
  • Load a Word document using the Document.LoadFromFile() method.
  • Get the custom document properties of the document through the Document.CustomDocumentProperties property.
  • Iterate through the custom document properties.
  • Extract the name and value of each custom document property.
  • Save the extracted data to a text file.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()
# Load a Word document
document.LoadFromFile("AddCustomDocumentProperties.docx")

# Open a text file to save the extracted custom properties
with open("ExtractedCustomProperties.txt", "w") as output_file:
    # Iterate through all custom document properties
    for i in range(document.CustomDocumentProperties.Count):
        # Extract the name and value of each custom property
        property_name = document.CustomDocumentProperties.get_Item(i).Name
        property_value = document.CustomDocumentProperties.get_Item(i).Value

        # Write the property details to the text file
        output_file.write(f"{property_name}: {property_value}\n")

document.Close()
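
If the extracted file is later consumed by another script, the "name: value" lines can be read back into a dictionary. The sketch below is a minimal parser for that line format and does not use Spire.Doc. The function name is hypothetical, and the sample lines are illustrative rather than captured output, since the exact text written for each value depends on how the library renders it.

```python
def parse_properties(lines):
    """Parse 'name: value' lines into a dict, skipping blank or malformed lines."""
    properties = {}
    for line in lines:
        # Split at the first ': ' only, so values may themselves contain colons
        name, separator, value = line.rstrip("\n").partition(": ")
        if separator and name:
            properties[name] = value
    return properties


# Illustrative lines in the same format the script above writes
sample_lines = [
    "DocumentCategory: Technical Report\n",
    "RevisionNumber: 5\n",
    "\n",
]
parsed = parse_properties(sample_lines)
```

All values come back as strings; converting them to numbers or dates is left to the caller.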

Remove Custom Document Properties from Word in Python

Cleaning up custom document properties helps maintain confidentiality, reduce file size, and keep outdated or irrelevant information out of the metadata. Spire.Doc for Python allows developers to remove custom properties from a Word document using the CustomDocumentProperties.Remove() method. The detailed steps are as follows.

  • Initialize an instance of the Document class.
  • Load a Word document using the Document.LoadFromFile() method.
  • Get the custom document properties of the document through the Document.CustomDocumentProperties property.
  • Iterate through the custom document properties.
  • Remove each custom document property by its name using the CustomDocumentProperties.Remove() method.
  • Save the result document using the Document.SaveToFile() method.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()
# Load a Word document
document.LoadFromFile("AddCustomDocumentProperties.docx")

# Iterate through all custom document properties
customProperties = document.CustomDocumentProperties
for i in range(customProperties.Count - 1, -1, -1):
    # Remove each custom document property by its name
    customProperties.Remove(customProperties[i].Name)

# Save the result document
document.SaveToFile("RemoveCustomDocumentProperties.docx", FileFormat.Docx2016)
document.Close()
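
The loop above walks the collection from the last index down to 0. The reason applies to index-based collections in general: removing an item shifts the later items down by one, so a forward loop would skip entries. The sketch below shows the effect with a plain Python list as a stand-in. It does not use Spire.Doc, and the names are borrowed from the earlier example purely for illustration.

```python
names = ["DocumentCategory", "RevisionNumber", "LastReviewedDate", "RequiresFollowUp"]

# Forward removal by index: each deletion shifts the remaining items left,
# so every other entry is skipped and the loop stops early.
forward = list(names)
i = 0
while i < len(forward):
    del forward[i]
    i += 1

# Reverse removal by index: positions that have not been visited yet
# are unaffected by each deletion, so every entry is removed.
backward = list(names)
for i in range(len(backward) - 1, -1, -1):
    del backward[i]

print("Left after forward loop:", forward)
print("Left after reverse loop:", backward)
```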

PowerPoint presentations are a powerful tool for presenting information in an organized and engaging manner. To further enhance the organization of slides, PowerPoint allows users to group slides into sections. This feature makes navigating and managing large presentations much easier. In this article, we'll show you how to manage slides within PowerPoint sections in Python using Spire.Presentation for Python. Specifically, we'll cover how to add, retrieve, reorder, and remove slides in these sections.

Install Spire.Presentation for Python

This scenario requires Spire.Presentation for Python and plum-dispatch v1.7.4. Both can be installed on Windows with the following pip command.

pip install Spire.Presentation

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Presentation for Python on Windows

Insert Slides into a PowerPoint Section in Python

Inserting slides is essential when you want to introduce new content to a section. Using Spire.Presentation for Python, you can quickly insert a slide into a section with the Section.Insert() method. The detailed steps are as follows.

  • Create an instance of the Presentation class.
  • Load a PowerPoint presentation using the Presentation.LoadFromFile() method.
  • Get a specific section by its index (0-based) through the Presentation.SectionList property, for example with SectionList.get_Item(index).
  • Add a new slide to the presentation, then insert it into the section using the Section.Insert() method.
  • Remove the added slide from the presentation.
  • Save the resulting presentation using the Presentation.SaveToFile() method.
  • Python
from spire.presentation import *

# Create an instance of the Presentation class
presentation = Presentation()
# Load a PowerPoint presentation
presentation.LoadFromFile("Example.pptx")

# Access the first section
first_section = presentation.SectionList.get_Item(0)

# Add a new slide to the presentation and insert it at the start of the section
slide = presentation.Slides.Append()
first_section.Insert(0, slide)

# Remove the added slide from the presentation
presentation.Slides.Remove(slide)

# Save the modified presentation
presentation.SaveToFile("InsertSlidesInSection.pptx", FileFormat.Pptx2016)
# Close the Presentation object
presentation.Dispose()

Retrieve Slides from a PowerPoint Section in Python

Retrieving slides from a specific section allows you to focus on a smaller group of slides for tasks such as reordering or applying custom formatting. Using the Section.GetSlides() method in Spire.Presentation for Python, you can easily access all the slides in a particular section. The detailed steps are as follows.

  • Create an instance of the Presentation class.
  • Load a PowerPoint presentation using the Presentation.LoadFromFile() method.
  • Get a specific section by its index (0-based) through the Presentation.SectionList property, for example with SectionList.get_Item(index).
  • Retrieve the slides within the section using the Section.GetSlides() method.
  • Iterate through the retrieved slides and get the slide number (1-based) of each slide.
  • Python
from spire.presentation import *

# Create an instance of the Presentation class
presentation = Presentation()
# Load a PowerPoint presentation
presentation.LoadFromFile("Example.pptx")

# Retrieve the slides in the 3rd section
section = presentation.SectionList.get_Item(2)
slides = section.GetSlides()

output_content = "The slide numbers in this section are:\n"

# Get the slide number of each slide in the section
for slide in slides:
    output_content += str(slide.SlideNumber) + " "

# Save the slide number to a text file
with open("slide_numbers.txt", "w") as file:
    file.write(output_content)

Reorder Slides in a PowerPoint Section in Python

Reordering slides is important to ensure related content is in the right order. Spire.Presentation for Python offers the Section.Move() method, which allows you to move a slide to a new position within a section. The detailed steps are as follows.

  • Create an instance of the Presentation class.
  • Load a PowerPoint presentation using the Presentation.LoadFromFile() method.
  • Get a specific section by its index (0-based) through the Presentation.SectionList property, for example with SectionList.get_Item(index).
  • Move a specific slide in the section to another position using the Section.Move() method.
  • Save the resulting presentation using the Presentation.SaveToFile() method.
  • Python
from spire.presentation import *

# Create an instance of the Presentation class
presentation = Presentation()
# Load a PowerPoint presentation
presentation.LoadFromFile("Example.pptx")

# Access the 3rd section
section = presentation.SectionList.get_Item(2)

# Retrieve the slides in the section
slides = section.GetSlides()

# Move the 1st slide in the section to the specified position
section.Move(2, slides[0])

# Save the modified presentation
presentation.SaveToFile("ReorderSlidesInSection.pptx", FileFormat.Pptx2016)
# Close the Presentation object
presentation.Dispose()
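
To reason about what a call such as section.Move(2, slides[0]) is meant to achieve, it can help to model the section as an ordered list. The sketch below is a plain-list model only, not the library's implementation. It assumes the target index is applied after the item has been taken out of its old position, which may or may not match how Section.Move() interprets the index. The function name and the slide labels are hypothetical.

```python
def move_item(items, new_index, item):
    """Return a copy of items with item relocated to new_index."""
    result = list(items)
    result.remove(item)              # take the item out of its current position
    result.insert(new_index, item)   # re-insert it at the target position
    return result


# Hypothetical labels standing in for the slides of one section
section_slides = ["slide A", "slide B", "slide C", "slide D"]
reordered = move_item(section_slides, 2, section_slides[0])
print(reordered)
```

Checking the saved presentation after a real Move() call is the reliable way to confirm the resulting order.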

Remove Slides from a PowerPoint Section in Python

Removing slides from a section streamlines your presentation, particularly when some slides become outdated or unnecessary. With Spire.Presentation for Python, you can easily remove a single slide or multiple slides from a section using the Section.RemoveAt() or Section.RemoveRange() method. The detailed steps are as follows.

  • Create an instance of the Presentation class.
  • Load a PowerPoint presentation using the Presentation.LoadFromFile() method.
  • Get a specific section by its index (0-based) through the Presentation.SectionList property, for example with SectionList.get_Item(index).
  • Remove a specific slide or a range of slides from the section using the Section.RemoveAt() or Section.RemoveRange() method.
  • Save the resulting presentation using the Presentation.SaveToFile() method.
  • Python
from spire.presentation import *

# Create an instance of the Presentation class
presentation = Presentation()
# Load a PowerPoint presentation
presentation.LoadFromFile("Example.pptx")

# Access the 3rd section
section = presentation.SectionList.get_Item(2)

# Remove the first slide from the section
section.RemoveAt(0)

# Or remove a range of slides from the section
# section.RemoveRange(0, 2)

# Save the modified presentation
presentation.SaveToFile("RemoveSlidesInSection.pptx", FileFormat.Pptx2016)
# Close the Presentation object
presentation.Dispose()

Friday, 22 November 2024 08:40

Python: Extract Annotations from PDF

Annotations in PDF documents play a crucial role in enhancing collaboration, emphasizing key points, or providing additional context. Extracting annotations is essential for efficiently analyzing PDF content, but manual extraction can be tedious. This guide demonstrates how to extract annotations from PDF with Python using Spire.PDF for Python, providing a faster and more flexible solution to access important information.

Install Spire.PDF for Python

This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. Both can be installed on Windows with the following pip command.

pip install Spire.PDF

If you are unsure how to install it, please refer to this tutorial: How to Install Spire.PDF for Python on Windows.

Extract Specified Annotations from PDF Documents

Adobe Acrobat offers a built-in one-click annotation export, but it is less flexible when you only need specific annotations: you must locate and copy them manually, which is inefficient for PDFs that contain many annotations. Spire.PDF for Python (Spire.PDF for short) provides the PdfAnnotationCollection.get_Item() method, which retrieves a specific annotation by index and makes targeted extraction straightforward.

Steps to extract specified annotations from PDF:

  • Create an object of the PdfDocument class.
  • Load a PDF document from local storage with the PdfDocument.LoadFromFile() method.
  • Get a page through the PdfDocument.Pages property, and access its annotation collection with the PdfPageBase.AnnotationsWidget property.
  • Create a list to store annotation information.
  • Access the specified annotation using the PdfAnnotationCollection.get_Item() method.
  • Append annotation details to the list.
  • Save the list as a Text file.

Here is the code example of exporting the first annotation on the third page:

  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a new PDF document
pdf = PdfDocument()

# Load the file from disk
pdf.LoadFromFile("Sample.pdf")

# Get the third page (the page index is 0-based)
page = pdf.Pages.get_Item(2)

# Access the annotations on the page
annotations = page.AnnotationsWidget

# Create a list to save information of annotations
sb = []

# Access the first annotation on the page
annotation = annotations.get_Item(0)

# Append the annotation details to the list
sb.append("Annotation information: ")
sb.append("Text: " + annotation.Text)
modifiedDate = annotation.ModifiedDate.ToString()
sb.append("ModifiedDate: " + modifiedDate)

# Save the list as a Text file
with open("GetSpecificAnnotation.txt", "w", encoding="utf-8") as file:
    file.write("\n".join(sb))

# Close the PDF file
pdf.Close()

Extract All Annotations from a PDF Page

To export all annotations from a specified PDF page, you can still use the PdfPageBase.AnnotationsWidget property along with the PdfAnnotationCollection.get_Item() method. However, you need to iterate through all the annotations on the page so that none are missed. The steps and code example below walk through the process.

Steps to extract annotations from PDF pages:

  • Create a PdfDocument instance.
  • Read a PDF document from the local storage with PdfDocument.LoadFromFile() method.
  • Access the annotation collection on the specified page using the PdfPageBase.AnnotationsWidget property.
  • Create a list to store annotation information.
  • Loop through annotations on a certain page.
    • Retrieve each annotation using PdfAnnotationCollection.get_Item() method.
    • Add annotation details to the list.
  • Save the list as a Text file.

Below is the code example of extracting all annotations on the second page:

  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a new PDF document
pdf = PdfDocument()

# Load the file from disk
pdf.LoadFromFile("Sample.pdf")

# Get all annotations from the second page
annotations = pdf.Pages.get_Item(1).AnnotationsWidget

# Create a list to maintain annotation details
sb = []

# Loop through annotations on the page
if annotations.Count > 0:
    for i in range(annotations.Count):
        # Get the current annotation
        annotation = annotations.get_Item(i)

        # Get the annotation details
        if isinstance(annotation, PdfPopupAnnotationWidget):
            continue
        sb.append("Annotation information: ")
        sb.append("Text: " + annotation.Text)
        modifiedDate = annotation.ModifiedDate.ToString()
        sb.append("ModifiedDate: " + modifiedDate)

# Save annotations as a Text file
with open("GetAllAnnotationsFromPage.txt", "w", encoding="utf-8") as file:
    file.write("\n".join(sb))

# Release resources
pdf.Close()

Extract All Annotations from PDF Files

The final section of this guide illustrates how to extract all annotations from a PDF document using Python. The process is similar to exporting annotations from a single page but involves iterating through each page, traversing all annotations, and accessing their details. Finally, the extracted annotation details are saved to a text file for further use. Let’s take a closer look at the detailed steps.

Steps to extract all annotations from a PDF document:

  • Create an instance of PdfDocument class.
  • Read a PDF document from the disk with PdfDocument.LoadFromFile() method.
  • Initialize a list to store annotation information.
  • Loop through all pages and access the annotation collection of each page with the PdfPageBase.AnnotationsWidget property.
    • Iterate through the collection and retrieve each annotation using the PdfAnnotationCollection.get_Item() method.
    • Append annotation details to the list.
  • Output the list as a Text file.

Here is an example of exporting all annotations from a PDF file:

  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a new PDF document
pdf = PdfDocument()

# Load the file from disk 
pdf.LoadFromFile("Sample.pdf")

# Create a list to save annotation details
sb = []

# Iterate through all pages in the PDF document
for pageIndex in range(pdf.Pages.Count):
    sb.append(f"Page {pageIndex + 1}:")

    # Access the annotation collection of the current page
    annotations = pdf.Pages.get_Item(pageIndex).AnnotationsWidget
   
    # Loop through annotations in the collection
    if annotations.Count > 0:
        for i in range(annotations.Count):
            # Get the current annotation
            annotation = annotations.get_Item(i)

            # Skip invalid annotations (empty text and default date)
            if not annotation.Text.strip() and annotation.ModifiedDate.ToString() == "0001/1/1 0:00:00":
                continue
           
            # Extract annotation information
            sb.append("Annotation information: ")
            sb.append("Text: " + (annotation.Text.strip() or "N/A"))
            modifiedDate = annotation.ModifiedDate.ToString()
            sb.append("ModifiedDate: " + modifiedDate)
    else:
        sb.append("No annotations found.")

    # Add a blank line after each page
    sb.append("")

# Save all annotations to a file
with open("GetAllAnnotationsFromDocument.txt", "w", encoding="utf-8") as file:
    file.write("\n".join(sb))

# Close the PDF document
pdf.Close()
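
The three examples in this guide repeat the same lines for formatting annotation details. One way to factor them out is a small helper that turns plain records into output lines. The sketch below operates on dictionaries instead of Spire.PDF annotation objects so that it runs on its own. The function name, the record keys, and the sample data are all hypothetical.

```python
def format_annotations(records):
    """Build output lines from dicts with 'text' and 'modified' keys."""
    lines = []
    for record in records:
        text = (record.get("text") or "").strip()
        lines.append("Annotation information: ")
        # Fall back to "N/A" when the annotation has no text
        lines.append("Text: " + (text or "N/A"))
        lines.append("ModifiedDate: " + str(record.get("modified", "")))
    return lines


# Made-up records standing in for values read from annotation objects
sample = [
    {"text": "  Check this figure ", "modified": "2024-11-22"},
    {"text": "", "modified": "2024-11-22"},
]
report = "\n".join(format_annotations(sample))
```

With real data, each record could be built from annotation.Text and annotation.ModifiedDate.ToString(), as used in the examples above.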
