Spire.Office Knowledgebase Page 55 | E-iceblue

Python: Find and Replace Text in PDF

2024-02-08 01:26:47 Written by Koohji

Finding and replacing text is a common need in document editing, as it helps users correct minor errors or make adjustments to terms appearing in the document. Although PDF documents have a fixed layout and editing can be challenging, users can still perform small modifications such as replacing text with Python, and achieve a satisfactory editing result. In this article, we will explore how to utilize Spire.PDF for Python to find and replace text in PDF documents within a Python program.

Install Spire.PDF for Python

This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip commands.

pip install Spire.PDF

If you are unsure how to install, please refer to: How to Install Spire.PDF for Python on Windows

Find Text and Replace the First Match in PDF with Python

Spire.PDF for Python enables users to find text and replace the first match in PDF documents with the PdfTextReplacer.ReplaceText(string originalText, string newText) method. This replacement method is great for making simple replacements for words or phrases that only appear once on a single page of a document.

The detailed steps for finding text and replacing the first match are as follows:

  • Create an object of PdfDocument class and load a PDF document using PdfDocument.LoadFromFile() method.
  • Get a page of the document using PdfDocument.Pages.get_Item() method.
  • Create an object of PdfTextReplacer class based on the page.
  • Find specific text and replace the first match on the page using PdfTextReplacer.ReplaceText() method.
  • Save the document using PdfDocument.SaveToFile() method.
  • Python
from spire.pdf import *
from spire.pdf.common import *

# Create an object of PdfDocument
pdf = PdfDocument()

# Load a PDF document
pdf.LoadFromFile("Sample.pdf")

# Get a page
page = pdf.Pages.get_Item(0)

# Create an object of PdfTextReplacer class
replacer = PdfTextReplacer(page)

# Find and replace the first matched text
replacer.ReplaceText("compressing", "comparing")

# Save the document
pdf.SaveToFile("output/ReplaceFirstMatch.pdf")
pdf.Close()

Python: Find and Replace Text in PDF

Find Text and Replace All Matches in PDF with Python

Spire.PDF for Python also provides the PdfTextReplacer.ReplaceAllText(string originalText, string newText, Color textColor) method to find specific text and replace all matches with new text (optionally resetting the text color). The detailed steps are as follows:

  • Create an object of PdfDocument class and load a PDF document using PdfDocument.LoadFromFile() method.
  • Loop through the pages in the document.
  • Get a page using PdfDocument.Pages.get_Item() method.
  • Create an object of PdfTextReplacer class based on the page.
  • Find specific text and replace all the matches with new text in a new color using PdfTextReplacer.ReplaceAllText() method.
  • Save the document using PdfDocument.SaveToFile() method.
  • Python
from spire.pdf import *
from spire.pdf.common import *

# Create an object of PdfDocument
pdf = PdfDocument()

# Load a PDF document
pdf.LoadFromFile("Sample.pdf")

# Loop through the pages in the document
for i in range(pdf.Pages.Count):
    # Get a page
    page = pdf.Pages.get_Item(0)
    # Create an object of PdfTextReplacer class based on the page
    replacer = PdfTextReplacer(page)
    # Find and replace all matched text with a new color
    replacer.ReplaceAllText("PYTHON", "Python", Color.get_Red())

# Save the document
pdf.SaveToFile("output/ReplaceAllMatches.pdf")
pdf.Close()

Python: Find and Replace Text in PDF

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Merging and splitting table cells in PowerPoint are essential features that enable users to effectively organize and present data. By merging cells, users can create larger cells to accommodate more information or establish header rows for better categorization. On the other hand, splitting cells allows users to divide a cell into smaller units to showcase specific details, such as individual data points or subcategories. These operations enhance the visual appeal and clarity of slides, helping the audience better understand and analyze the presented data. In this article, we will demonstrate how to merge and split table cells in PowerPoint in Python using Spire.Presentation for Python.

Install Spire.Presentation for Python

This scenario requires Spire.Presentation for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.

pip install Spire.Presentation

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Presentation for Python on Windows

Merge Table Cells in PowerPoint in Python

Spire.Presentation for Python offers the ITable[columnIndex, rowIndex] property to access specific table cells. Once accessed, you can use the ITable.MergeCells(startCell, endCell, allowSplitting) method to merge them into a larger cell. The detailed steps are as follows.

  • Create an object of the Presentation class.
  • Load a PowerPoint presentation using Presentation.LoadFromFile() method.
  • Get a specific slide using Presentation.Slides[index] property.
  • Find the table on the slide by looping through all shapes.
  • Get the cells you want to merge using ITable[columnIndex, rowIndex] property.
  • Merge the cells using ITable.MergeCells(startCell, endCell, allowSplitting) method.
  • Save the result presentation using Presentation.SaveToFile() method.
  • Python
from spire.presentation.common import *
from spire.presentation import *

# Create a Presentation object
ppt = Presentation()

# Load a PowerPoint presentation
ppt.LoadFromFile("Table1.pptx")

# Get the first slide
slide = ppt.Slides[0]

# Find the table on the first slide
table = None
for shape in slide.Shapes:
    if isinstance(shape, ITable):
        table = shape
        # Get the cell at column 2, row 2
        cell1 = table[1, 1]
        # Get the cell at column 2, row 3
        cell2 = table[1, 2]
        # Check if the content of the cells is the same
        if cell1.TextFrame.Text == cell2.TextFrame.Text:
            # Clear the text in the second cell
            cell2.TextFrame.Paragraphs.Clear()
        # Merge the cells
        table.MergeCells(cell1, cell2, True)

# Save the result presentation to a new file
ppt.SaveToFile("MergeCells.pptx", FileFormat.Pptx2016)
ppt.Dispose()

Python: Merge or Split Table Cells in PowerPoint

Split Table Cells in PowerPoint in Python

In addition to merging specific table cells, Spire.Presentation for Python also empowers you to split a specific table cell into smaller cells by using the Cell.Split(rowCount, colunmCount) method. The detailed steps are as follows.

  • Create an object of the Presentation class.
  • Load a PowerPoint presentation using Presentation.LoadFromFile() method.
  • Get a specific slide using Presentation.Slides[index] property.
  • Find the table on the slide by looping through all shapes.
  • Get the cell you want to split using ITable[columnIndex, rowIndex] property.
  • Split the cell into smaller cells using Cell.Split(rowCount, columnCount) method.
  • Save the result presentation using Presentation.SaveToFile() method.
  • Python
from spire.presentation.common import *
from spire.presentation import *

# Create a Presentation object
ppt = Presentation()

# Load a PowerPoint presentation
ppt.LoadFromFile("Table2.pptx")

# Get the first slide
slide = ppt.Slides[0]

# Find the table on the first slide
table = None
for shape in slide.Shapes:
    if isinstance(shape, ITable):
        table = shape
        # Get the cell at column 2, row 3
        cell = table[1, 2]
        # Split the cell into 3 rows and 2 columns
        cell.Split(3, 2)

# Save the result presentation to a new file
ppt.SaveToFile("SplitCells.pptx", FileFormat.Pptx2016)
ppt.Dispose()

Python: Merge or Split Table Cells in PowerPoint

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Images in Excel can enhance data visualization and help convey information effectively. Apart from inserting/deleting images in Excel with Spire.XLS for Python, you can also use the library to replace existing images with new ones, or extract images for reuse or backup. This article will demonstrate how to replace or extract images in Excel in Python.

Install Spire.XLS for Python

This scenario requires Spire.XLS for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.

pip install Spire.XLS

If you are unsure how to install, please refer to this tutorial: How to Install Spire.XLS for Python on Windows

Replace Images in Excel with Python

To replace a picture in Excel, you can load a new picture and then set it as the value of the ExcelPicture.Picture property. The following are the detailed steps to replace an Excel image with another one.

  • Create a Workbook instance.
  • Load an Excel file using Workbook.LoadFromFile() method.
  • Get a specified worksheet using Workbook.Worksheets[] property.
  • Get a specified picture from the worksheet using Worksheet.Pictures[] property.
  • Load an image and then replace the original picture with it using ExcelPicture.Picture property.
  • Save the result file using Workbook.SaveToFile() method.
  • Python
from spire.xls import *
from spire.xls.common import *

# Create a Workbook instance 
workbook = Workbook()

# Load an Excel file
workbook.LoadFromFile ("ExcelImg.xlsx")

# Get the first worksheet
sheet = workbook.Worksheets[0]

# Get the first picture from the worksheet
excelPicture = sheet.Pictures[0]
            
# Replace the picture with another one 
excelPicture.Picture = Image.FromFile("logo.png")

# Save the result file
workbook.SaveToFile("ReplaceImage.xlsx", ExcelVersion.Version2016)

Python: Replace or Extract Images in Excel

Extract Images from Excel with Python

Spire.XLS for Python provides the ExcelPicture.Picture.Save() method to save the images in Excel to a specified file path. The following are the detailed steps to extract all images in an Excel worksheet at once.

  • Create a Workbook instance.
  • Load an Excel file using Workbook.LoadFromFile() method.
  • Get a specified worksheet using Workbook.Worksheets[] property.
  • Loop through to get all pictures in the worksheet using Worksheet.Pictures property.
  • Extract pictures and save them to a specified file path using ExcelPicture.Picture.Save() method.
  • Python
from spire.xls import *
from spire.xls.common import *

# Create a Workbook instance
workbook = Workbook()

# Load an Excel file
workbook.LoadFromFile("Test.xlsx")

# Get the first worksheet
sheet = workbook.Worksheets[0]

# Get all images in the worksheet
for i in range(sheet.Pictures.Count - 1, -1, -1):
    pic = sheet.Pictures[i]

    # Save each image as a PNG file
    pic.Picture.Save("ExtractImages\\Image-{0:d}.png".format(i), ImageFormat.get_Png())

workbook.Dispose()

Python: Replace or Extract Images in Excel

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

page 55

Coupon Code Copied!

Christmas Sale

Celebrate the season with exclusive savings

Save 10% Sitewide

Use Code:

View Campaign Details