Spire.Office Knowledgebase Page 17

Python: Check if a PDF is Password Protected and Determine the Correct Password

2025-03-19 01:06:07 Written by Koohji

PDF files are sometimes password protected, which means you cannot view or edit their content without entering the correct password. This guide shows how to check whether a PDF is password protected and how to determine the correct password from a list of candidates, using Python and the Spire.PDF for Python library.

Check if a PDF is Password Protected
Determine the Correct Password for a PDF

Install Spire.PDF for Python

This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. They can be installed on Windows with the following pip command.

Package Manager

pip install Spire.PDF

If you are unsure how to install, please refer to this tutorial: How to Install Spire.PDF for Python on Windows

Check if a PDF is Password Protected

Spire.PDF for Python offers the PdfDocument.IsPasswordProtected(fileName: str) method to check if a PDF file is password protected. The detailed steps are as follows.

Specify the input and output file paths.
Check if the PDF file is password protected or not using the PdfDocument.IsPasswordProtected() method.
Save the result to a text file.

Python

from spire.pdf import *

# Specify the input and output file paths
inputFile = "Secured.pdf"
outputFile = "CheckPasswordProtection.txt"

# Check if the input PDF file is password protected
isProtected = PdfDocument.IsPasswordProtected(inputFile)

# Write the result into a text file
with open(outputFile, "w") as fp:
    fp.write("The PDF is " + ("password protected!" if isProtected else "not password protected!"))
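
When several files need to be checked, the same call can be wrapped in a small helper. The following is a minimal sketch rather than part of the library: `summarize_protection` is a hypothetical name, and the checker is passed in as a parameter so the helper can be tried without Spire.PDF installed. In real use you would pass `PdfDocument.IsPasswordProtected` as the checker.

```python
from typing import Callable, Dict, Iterable


def summarize_protection(
    paths: Iterable[str],
    is_protected: Callable[[str], bool],
) -> Dict[str, str]:
    """Map each file path to a human-readable protection status.

    `is_protected` is any callable that takes a file path and returns a bool.
    With Spire.PDF installed, PdfDocument.IsPasswordProtected can be passed here.
    """
    report = {}
    for path in paths:
        try:
            protected = is_protected(path)
            status = "password protected" if protected else "not password protected"
        except Exception as exc:
            # For example, a missing or unreadable file
            status = f"could not be checked ({exc})"
        report[path] = status
    return report


if __name__ == "__main__":
    # Stand-in checker for demonstration only; it is NOT the Spire.PDF call
    def fake_checker(path: str) -> bool:
        return path.startswith("Secured")

    for name, status in summarize_protection(["Secured.pdf", "Open.pdf"], fake_checker).items():
        print(f"{name}: {status}")
```

Passing the checker as an argument keeps the reporting logic separate from the library call, so a file that cannot be read is recorded in the report instead of stopping the loop.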

Determine the Correct Password for a PDF

While Spire.PDF for Python does not provide a direct method to check if a password is correct, you can achieve this by attempting to load the PDF with the password and catching exceptions. If the password is incorrect, an exception will be thrown. The detailed steps are as follows.

Specify the input and output file paths.
Create a list of potential passwords to test.
Iterate through the list and load the PDF with each password using the PdfDocument.LoadFromFile(filename: str, password: str) method.
If no exception is thrown, the password is correct. Otherwise, the password is incorrect.
Save the results to a text file.

Python

from spire.pdf import *

# Specify the input and output file paths
inputFile = "Secured.pdf"
outputFile = "DetermineCorrectPassword.txt"

# Create a list of potential passwords to test
passwords = ["password1", "password2", "password3", "test", "sample"]

# Create a text file to store the results
with open(outputFile, "w") as fp:
    for value in passwords:
        try:
            # Load the PDF with the current password
            doc = PdfDocument()
            doc.LoadFromFile(inputFile, value)
            # If successful, write that the password is correct
            fp.write(f'Password "{value}" is correct\n')
        except SpireException:
            # If an exception occurs, write that the password is not correct
            fp.write(f'Password "{value}" is not correct\n')
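
If you only need the password that works, rather than a full report, the loop can stop at the first successful attempt. The sketch below illustrates that variation under stated assumptions: `find_password` and `try_open` are hypothetical names, and the loader is injected as a callable so the logic can be run without the library.

```python
from typing import Callable, Iterable, Optional


def find_password(
    candidates: Iterable[str],
    try_open: Callable[[str], None],
) -> Optional[str]:
    """Return the first candidate that opens the document, or None.

    `try_open` must raise an exception when the password is rejected.
    """
    for candidate in candidates:
        try:
            try_open(candidate)
        except Exception:
            # Wrong password: move on to the next candidate.
            # With Spire.PDF, this could be narrowed to SpireException.
            continue
        return candidate
    return None


# With Spire.PDF installed, try_open could be written along these lines:
#
#     def try_open(password):
#         doc = PdfDocument()
#         doc.LoadFromFile("Secured.pdf", password)
#         doc.Close()

if __name__ == "__main__":
    # Stand-in loader for demonstration only; it accepts just "test"
    def fake_open(password: str) -> None:
        if password != "test":
            raise ValueError("incorrect password")

    print(find_password(["password1", "password2", "test", "sample"], fake_open))
```

Returning `None` when no candidate works lets the caller distinguish "no match in the list" from a found password without parsing a text file.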

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, you can request a 30-day trial license.

Published in Security

Tagged under

pdf Python Security

Python: Verify and Extract Digital Signatures in PDF

2025-03-17 01:12:05 Written by Koohji

Verifying digital signatures in a PDF is essential for ensuring the authenticity and integrity of electronically signed documents. It confirms that the signature is valid and that the document has not been altered since signing. Additionally, extracting signature images from a PDF allows you to retrieve and save the visual representation of a signature, making it easier to verify and archive for legal or record-keeping purposes. In this article, we will demonstrate how to verify and extract digital signatures in PDF in Python using Spire.PDF for Python.

Verify Digital Signatures in PDF in Python
Detect Whether a Signed PDF Has Been Modified in Python
Extract Signature Images from PDF in Python

Install Spire.PDF for Python

This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. They can be installed on Windows with the following pip command.

Package Manager

pip install Spire.PDF

If you are unsure how to install, please refer to this tutorial: How to Install Spire.PDF for Python on Windows

Verify Signatures in PDF in Python

Spire.PDF for Python provides the PdfDocument.VerifySignature(signName: str) method to check the validity of a digital signature in a PDF document. The detailed steps are as follows.

Create an object of the PdfDocument class.
Load a PDF file using the PdfDocument.LoadFromFile() method.
Get the form of the PDF file using the PdfDocument.Form property.
Iterate through all fields in the form and find the signature field.
Get the name of the signature field using the PdfSignatureFieldWidget.FullName property.
Verify the validity of the signature using the PdfDocument.VerifySignature(signName: str) method.

Python

from spire.pdf.common import *
from spire.pdf import *

# Load a PDF document
doc = PdfDocument()
doc.LoadFromFile("Signature.pdf")

# Access the form in the document
pdfform = doc.Form
formWidget = PdfFormWidget(pdfform)

# Check if there are any fields in the form
if formWidget.FieldsWidget.Count > 0:
    # Loop through all fields in the form
    for i in range(formWidget.FieldsWidget.Count):
        field = formWidget.FieldsWidget.get_Item(i)

        # Check if the field is a PdfSignatureFieldWidget
        if isinstance(field, PdfSignatureFieldWidget):
            # Typecast the field to a PdfSignatureFieldWidget instance
            signatureField = PdfSignatureFieldWidget(field)
            # Get the name of the signature field
            fullName = signatureField.FullName

            # Verify the signature
            valid = doc.VerifySignature(fullName)
            # Determine the validation status text based on the verification result
            if valid:
                print("The signature is valid")
            else:
                print("The signature is invalid")

doc.Close()

Detect Whether a Signed PDF Has Been Modified in Python

To determine whether a PDF document has been modified after signing, use the Security_PdfSignature.VerifyDocModified() method. This method returns a Boolean value: True indicates that the document has been altered and the signature is no longer valid, while False confirms that the document remains unchanged since it was signed. The detailed steps are as follows.

Create an object of the PdfDocument class.
Load a PDF file using the PdfDocument.LoadFromFile() method.
Get the form of the PDF file using the PdfDocument.Form property.
Iterate through all fields in the form and find the signature field.
Get the signature using the PdfSignatureFieldWidget.Signature property.
Verify if the document has been modified since it was signed using the Security_PdfSignature.VerifyDocModified() method.

Python

from spire.pdf.common import *
from spire.pdf import *

# Load a PDF document
doc = PdfDocument()
doc.LoadFromFile("Signature.pdf")

# Access the form in the document
pdfform = doc.Form
formWidget = PdfFormWidget(pdfform)

# Check if there are any fields in the form
if formWidget.FieldsWidget.Count > 0:
    # Loop through all fields in the form
    for i in range(formWidget.FieldsWidget.Count):
        field = formWidget.FieldsWidget.get_Item(i)

        # Check if the field is a PdfSignatureFieldWidget
        if isinstance(field, PdfSignatureFieldWidget):
            # Typecast the field to a PdfSignatureFieldWidget instance
            signatureField = PdfSignatureFieldWidget(field)

            # Get the signature
            signature = signatureField.Signature
            # Verify if the document has been modified since it was signed
            modified = signature.VerifyDocModified()

            # Determine the validation status text based on the verification result
            if modified:
                print("The document has been modified")
            else:
                print("The document has not been modified")

doc.Close()
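
The result of this check can be read together with the validity check from the previous section. As a rough sketch, the hypothetical helper below (`describe_signature` and `summarize` are illustrative names, not Spire.PDF APIs) turns the two Boolean results into one status line per signature field. The values passed in would come from `doc.VerifySignature(fullName)` and `signature.VerifyDocModified()`.

```python
from typing import Iterable, List, Tuple


def describe_signature(valid: bool, modified: bool) -> str:
    """Combine the two Boolean check results into one status string."""
    if modified:
        # A document changed after signing is reported first,
        # because its signature can no longer be relied on
        return "document modified after signing"
    if not valid:
        return "signature invalid"
    return "signature valid, document unchanged"


def summarize(results: Iterable[Tuple[str, bool, bool]]) -> List[str]:
    """Format (field name, valid, modified) tuples as report lines."""
    return [
        f"{name}: {describe_signature(valid, modified)}"
        for name, valid, modified in results
    ]


if __name__ == "__main__":
    # Made-up sample values for illustration; not output from a real PDF
    sample = [("Signature1", True, False), ("Signature2", True, True)]
    print("\n".join(summarize(sample)))
```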

Extract Signature Images from PDF in Python

Spire.PDF for Python allows extracting all signature images from a PDF document using the PdfFormWidget.ExtractSignatureAsImages property. The detailed steps are as follows.

Create an object of the PdfDocument class.
Load a PDF file using the PdfDocument.LoadFromFile() method.
Get the form of the PDF file using the PdfDocument.Form property.
Extract signature images from the form using the PdfFormWidget.ExtractSignatureAsImages property.
Save the extracted images to image files.

Python

import os

from spire.pdf.common import *
from spire.pdf import *

# Load a PDF document
doc = PdfDocument()
doc.LoadFromFile("Signature.pdf")

# Access the form in the document
pdfform = doc.Form
formWidget = PdfFormWidget(pdfform)

# Make sure the output folder exists before saving
os.makedirs("Signature", exist_ok=True)

# Extract signature images from the form and save them to files
for i, image in enumerate(formWidget.ExtractSignatureAsImages):
    filename = f"Signature/Image-{i}.png"
    # Save the image to a file
    image.Save(filename)

doc.Close()

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, you can request a 30-day trial license.

Published in Security

Tagged under

pdf Python Security

Convert PDF to Excel Using JavaScript in React

2025-03-14 01:06:21 Written by Koohji

In data-driven workflows, converting PDF documents with tables to Excel improves accessibility and usability. While PDFs preserve document integrity, their static nature makes data extraction challenging, often leading to error-prone manual work. By leveraging JavaScript in React, developers can automate the conversion process, seamlessly transferring structured data like financial reports into Excel worksheets for real-time analysis and collaboration. This article explores how to use Spire.PDF for JavaScript to efficiently convert PDFs to Excel files with JavaScript in React applications.

Steps to Convert PDF to Excel Using JavaScript
Simple PDF to Excel Conversion in JavaScript
Convert PDF to Excel with XlsxLineLayoutOptions
Convert PDF to Excel with XlsxTextLayoutOptions

Install Spire.PDF for JavaScript

To get started with converting PDF to Excel with JavaScript in a React application, you can either download Spire.PDF for JavaScript from our website or install it via npm with the following command:

Package Manager

npm i spire.pdf

After that, copy the "Spire.Pdf.Base.js" and "Spire.Pdf.Base.wasm" files to the public folder of your project. Additionally, make sure to include the required font files to ensure accurate and consistent text rendering.

For more details, refer to the documentation: How to Integrate Spire.PDF for JavaScript in a React Project

Steps to Convert PDF to Excel Using JavaScript

With the Spire.PDF for JavaScript WebAssembly module, PDF documents can be loaded from the Virtual File System (VFS) using the PdfDocument.LoadFromFile() method and converted into Excel workbooks using the PdfDocument.SaveToFile() method.

In addition to direct conversion, developers can customize the process by configuring conversion options through the XlsxLineLayoutOptions and XlsxTextLayoutOptions classes, along with the PdfDocument.ConvertOptions.SetPdfToXlsxOptions() method.

The following steps demonstrate how to convert a PDF document to an Excel file using Spire.PDF for JavaScript:

Load the Spire.Pdf.Base.js file to initialize the WebAssembly module.
Fetch the PDF file into the Virtual File System (VFS) using the wasmModule.FetchFileToVFS() method.
Fetch the font files used in the PDF document to the “/Library/Fonts/” folder in the VFS using the wasmModule.FetchFileToVFS() method.
Create an instance of the PdfDocument class using the wasmModule.PdfDocument.Create() method.
Load the PDF document from the VFS into the PdfDocument instance using the PdfDocument.LoadFromFile() method.
(Optional) Customize the conversion options:
- Create an instance of the XlsxLineLayoutOptions or XlsxTextLayoutOptions class and specify the desired conversion settings.
- Apply the conversion options using the PdfDocument.ConvertOptions.SetPdfToXlsxOptions() method.
Convert the PDF document to an Excel file using the PdfDocument.SaveToFile({ fileName: string, fileFormat: wasmModule.FileFormat.XLSX }) method.
Retrieve the converted file from the VFS for download or further use.

Simple PDF to Excel Conversion in JavaScript

Developers can directly load a PDF document from the VFS and convert it to an Excel file using the default conversion settings. These settings map one PDF page to one Excel worksheet, preserve rotated and overlapped text, allow cell splitting, and enable text wrapping.

Below is a code example demonstrating this process:

JavaScript

import React, { useState, useEffect } from 'react';

function App() {

  // State to store the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {
        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {
        // Log any errors that occur during module loading
        console.error('Failed to load the WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []);

  // Function to convert PDF to Excel
  const ConvertPDFToExcel = async () => {
    if (wasmModule) {
      // Specify the input and output file names
      const inputFileName = 'Sample.pdf';
      const outputFileName = 'PDFToExcel.xlsx';

      // Fetch the input file and add it to the VFS
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Fetch the font file used in the PDF to the VFS
      await wasmModule.FetchFileToVFS('Calibri.ttf', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('Symbol.ttf', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Create an instance of the PdfDocument class
      const pdf = wasmModule.PdfDocument.Create();

      // Load the PDF document from the VFS
      pdf.LoadFromFile(inputFileName);

      // Convert the PDF document to an Excel file
      pdf.SaveToFile({ fileName: outputFileName, fileFormat: wasmModule.FileFormat.XLSX});

      // Read the Excel file from the VFS
      const excelArray = await wasmModule.FS.readFile(outputFileName);

      // Create a Blob object from the Excel file and trigger a download
      const blob = new Blob([excelArray], { type: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' });
      const url = URL.createObjectURL(blob);
      const a = document.createElement('a');
      a.href = url;
      a.download = `${outputFileName}`;
      document.body.appendChild(a);
      a.click();
      document.body.removeChild(a);
      URL.revokeObjectURL(url);
    }
  };

  return (
      <div style={{ textAlign: 'center', height: '300px' }}>
        <h1>Convert PDF to Excel Using JavaScript in React</h1>
        <button onClick={ConvertPDFToExcel} disabled={!wasmModule}>
          Convert and Download
        </button>
      </div>
  );
}

export default App;

Convert PDF to Excel with XlsxLineLayoutOptions

Spire.PDF for JavaScript provides the XlsxLineLayoutOptions class for configuring line-based conversion settings when converting PDFs to Excel. By adjusting these options, developers can achieve different conversion results, such as merging all PDF pages into a single worksheet.

The table below outlines the available parameters in XlsxLineLayoutOptions:

Parameter (bool)	Function
convertToMultipleSheet	Specifies whether to convert each page into a separate worksheet.
rotatedText	Specifies whether to retain rotated text.
splitCell	Specifies whether to split cells.
wrapText	Specifies whether to wrap text within cells.
overlapText	Specifies whether to retain overlapped text.

Special attention should be given to the splitCell parameter, as it significantly impacts the way tables are converted. Setting it to false preserves table cell structures, making the output table cells more faithful to the original PDF. Conversely, setting it to true allows plain text to be split smoothly in cells, which may be useful for text-based layouts rather than structured tables.

Below is a code example demonstrating PDF-to-Excel conversion using XlsxLineLayoutOptions:

JavaScript

import React, { useState, useEffect } from 'react';

function App() {

  // State to store the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {
        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {
        // Log any errors that occur during module loading
        console.error('Failed to load the WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []);

  // Function to convert PDF to Excel with XlsxLineLayoutOptions
  const ConvertPDFToExcelXlsxLineLayoutOptions = async () => {
    if (wasmModule) {
      // Specify the input and output file names
      const inputFileName = 'Sample.pdf';
      const outputFileName = 'PDFToExcelXlsxLineLayoutOptions.xlsx';

      // Fetch the input file and add it to the VFS
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Fetch the font file used in the PDF to the VFS
      await wasmModule.FetchFileToVFS('Calibri.ttf', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('Symbol.ttf', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Create an instance of the PdfDocument class
      const pdf = wasmModule.PdfDocument.Create();

      // Load the PDF document from the VFS
      pdf.LoadFromFile(inputFileName);

      // Create an instance of the XlsxLineLayoutOptions class and specify the conversion options
      const options = wasmModule.XlsxLineLayoutOptions.Create({ convertToMultipleSheet: true, rotatedText: false, splitCell: false, wrapText: false, overlapText: true});

      // Set the XlsxLineLayoutOptions instance as the conversion options
      pdf.ConvertOptions.SetPdfToXlsxOptions(options);

      // Convert the PDF document to an Excel file
      pdf.SaveToFile({ fileName: outputFileName, fileFormat: wasmModule.FileFormat.XLSX});

      // Read the Excel file from the VFS
      const excelArray = await wasmModule.FS.readFile(outputFileName);

      // Create a Blob object from the Excel file and trigger a download
      const blob = new Blob([excelArray], { type: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' });
      const url = URL.createObjectURL(blob);
      const a = document.createElement('a');
      a.href = url;
      a.download = `${outputFileName}`;
      document.body.appendChild(a);
      a.click();
      document.body.removeChild(a);
      URL.revokeObjectURL(url);
    }
  };

  return (
      <div style={{ textAlign: 'center', height: '300px' }}>
        <h1>Convert PDF to Excel with XlsxLineLayoutOptions Using JavaScript in React</h1>
        <button onClick={ConvertPDFToExcelXlsxLineLayoutOptions} disabled={!wasmModule}>
          Convert and Download
        </button>
      </div>
  );
}

export default App;

Convert PDF to Excel with XlsxTextLayoutOptions

Developers can also customize conversion settings using the XlsxTextLayoutOptions class, which focuses on text-based layout formatting. The table below lists its parameters:

Parameter (bool)	Function
convertToMultipleSheet	Specifies whether to convert each page into a separate worksheet.
rotatedText	Specifies whether to retain rotated text.
overlapText	Specifies whether to retain overlapped text.

Below is a code example demonstrating PDF-to-Excel conversion using XlsxTextLayoutOptions:

JavaScript

import React, { useState, useEffect } from 'react';

function App() {

  // State to store the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {
        // Access the Module and spirepdf from the global window object
        const { Module, spirepdf } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spirepdf);
        };
      } catch (err) {
        // Log any errors that occur during module loading
        console.error('Failed to load the WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Pdf.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []);

  // Function to convert PDF to Excel with XlsxTextLayoutOptions
  const ConvertPDFToExcelXlsxTextLayoutOptions = async () => {
    if (wasmModule) {
      // Specify the input and output file names
      const inputFileName = 'Sample.pdf';
      const outputFileName = 'PDFToExcelXlsxTextLayoutOptions.xlsx';

      // Fetch the input file and add it to the VFS
      await wasmModule.FetchFileToVFS(inputFileName, '', `${process.env.PUBLIC_URL}/`);

      // Fetch the font file used in the PDF to the VFS
      await wasmModule.FetchFileToVFS('Calibri.ttf', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);
      await wasmModule.FetchFileToVFS('Symbol.ttf', '/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Create an instance of the PdfDocument class
      const pdf = wasmModule.PdfDocument.Create();

      // Load the PDF document from the VFS
      pdf.LoadFromFile(inputFileName);

      // Create an instance of the XlsxTextLayoutOptions class and specify the conversion options
      const options = wasmModule.XlsxTextLayoutOptions.Create({ convertToMultipleSheet: false, rotatedText: true, overlapText: true});

      // Set the XlsxTextLayoutOptions instance as the conversion options
      pdf.ConvertOptions.SetPdfToXlsxOptions(options);

      // Convert the PDF document to an Excel file
      pdf.SaveToFile({ fileName: outputFileName, fileFormat: wasmModule.FileFormat.XLSX});

      // Read the Excel file from the VFS
      const excelArray = await wasmModule.FS.readFile(outputFileName);

      // Create a Blob object from the Excel file and trigger a download
      const blob = new Blob([excelArray], { type: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' });
      const url = URL.createObjectURL(blob);
      const a = document.createElement('a');
      a.href = url;
      a.download = `${outputFileName}`;
      document.body.appendChild(a);
      a.click();
      document.body.removeChild(a);
      URL.revokeObjectURL(url);
    }
  };

  return (
      <div style={{ textAlign: 'center', height: '300px' }}>
        <h1>Convert PDF to Excel with XlsxTextLayoutOptions Using JavaScript in React</h1>
        <button onClick={ConvertPDFToExcelXlsxTextLayoutOptions} disabled={!wasmModule}>
          Convert and Download
        </button>
      </div>
  );
}

export default App;

Get a Free License

To fully experience the capabilities of Spire.PDF for JavaScript without any evaluation limitations, you can request a free 30-day trial license.

Published in Conversion

Tagged under

PDF React Conversion
