Textboxes in a Word document serve as versatile containers for text, enabling users to enhance layout and design. They allow for the separation of content from the main body, making documents more visually appealing and organized. Extracting or updating textboxes can be essential for improving document efficiency, ensuring information is current, and facilitating data analysis.

In this article, you will learn how to extract or update textboxes in a Word document using Python and Spire.Doc for Python.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows

Extract Text from a Textbox in Word

Using Spire.Doc for Python, you can access a specific text box in a document by utilizing the Document.TextBoxes[index] property. After retrieving the text box, you can iterate through its child objects to identify whether each one is a paragraph or a table. If the object is a paragraph, you can retrieve its text using the Paragraph.Text property. In cases where the object is a table, you will need to loop through each cell to extract text from every individual cell within that table.

The steps to extract text from a text box in a Word document are as follows:

  • Create a Document object.
  • load a Word file by using Document.LoadFromFile() method.
  • Access a specific text box using Document.TextBoxes[index] property.
  • Iterate through the child objects within the text box.
  • Determine if a child object is a paragraph. If it is, retrieve the text from the paragraph using Paragraph.Text property.
  • Check if a child object is a table. If so, iterate through the cells in the table to extract text from each cell.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()

# Load a Word file
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.docx")

# Get a specific textbox
textBox = document.TextBoxes.get_Item(0)

with open('ExtractedText.txt','w') as sw:

    # Iterate through the child objects in the textbox
    for i in range(textBox.ChildObjects.Count):

        # Get a specific child object    
        object = textBox.ChildObjects.get_Item(i)

        # Determine if the child object is paragraph
        if object.DocumentObjectType == DocumentObjectType.Paragraph:

            # Write paragraph text to txt file
            sw.write((object if isinstance(object, Paragraph) else None).Text + "\n")

        # Determine if the child object is table
        if object.DocumentObjectType == DocumentObjectType.Table:
            table = object if isinstance(object, Table) else None
            for i in range(table.Rows.Count):
                row = table.Rows[i]
                for j in range(row.Cells.Count):
                    cell = row.Cells[j]
                    for k in range(cell.Paragraphs.Count):
                        paragraph = cell.Paragraphs.get_Item(k)

                        # Write paragrah text of a specific cell to txt file
                        sw.write(paragraph.Text + "\n")

# Dispose resources
document.Dispose()

Python: Extract or Update Textboxes in a Word Document

Update Text in a Textbox in Word

To update a textbox in a Word document, start by clearing its existing content with the TextBox.ChildObjects.Clear() method. This action removes all child objects, including any paragraphs or tables currently contained within the textbox. After clearing the content, you can add a new paragraph to the text box. Once the paragraph is created, set its text to the desired value.

The steps to update a textbox in a Word document are as follows:

  • Create a Document object.
  • Load a Word file using Document.LoadFromFile() method.
  • Get a specific textbox using Document.TextBoxes[index] property
  • Remove existing content of the textbox using TextBox.ChildObjects.Clear() method.
  • Add a paragraph to the textbox using TextBox.Body.AddParagraph() method.
  • Add text to the paragraph using Paragraph.AppendText() method.
  • Save the document to a different Word file.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()

# Load a Word file
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.docx")

# Get a specific textbox
textBox = document.TextBoxes.get_Item(0)

# Remove child objects of the textbox
textBox.ChildObjects.Clear()

# Add a new paragraph to the textbox
paragraph = textBox.Body.AddParagraph()

# Set line spacing
paragraph.Format.LineSpacing = 15.0

# Add text to the paragraph
textRange = paragraph.AppendText("The text in this textbox has been updated.")

# Set font size
textRange.CharacterFormat.FontSize = 15.0

# Save the document to a different Word file
document.SaveToFile("UpdateTextbox.docx", FileFormat.Docx2019);

# Dispose resources
document.Dispose()

Python: Extract or Update Textboxes in a Word Document

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Thursday, 19 September 2024 00:56

Python: Crop Pages in PDF

When dealing with PDF files, you might sometimes need to crop pages in the PDF to remove unnecessary margins, borders, or unwanted content. By doing so, you can make the document conform to specific design requirements or page sizes, ensuring a more aesthetically pleasing or functionally optimized output. This article will introduce how to crop pages in PDF in Python using Spire.PDF for Python.

Install Spire.PDF for Python

This scenario requires Spire.PDF for Python. It can be easily installed in your Windows through the following pip command.

pip install Spire.PDF

If you are unsure how to install, please refer to this tutorial: How to Install Spire.PDF for Python on Windows

Crop a PDF Page in Python

Spire.PDF for Python allows you specify a rectangular area, and then use the PdfPageBase.CropBox property to crop page to the specified area. The following are the detailed steps.

  • Create a PdfDocument instance.
  • Load a PDF file using PdfDocument.LoadFromFile() method.
  • Get a specified page using PdfDocument.Pages[] property.
  • Crop the page to the specified area using PdfPageBase.CropBox property.
  • Save the result file using PdfDocument.SaveToFile() method.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
pdf = PdfDocument()

# Load a PDF file from disk
pdf.LoadFromFile("Sample1.pdf")

# Get the first page
page = pdf.Pages.get_Item(0)

# Crop the page by the specified area
page.CropBox = RectangleF(0.0, 300.0, 600.0, 260.0)

# Save the result file
pdf.SaveToFile("CropPDF.pdf")
pdf.Close()

Python: Crop Pages in PDF

Crop a PDF Page and Export as an Image in Python

To accomplish this task, you can use the PdfDocument.SaveAsImage(pageIndex: int) method to convert a cropped PDF page to an image stream. The following are the detailed steps.

  • Create a PdfDocument instance.
  • Load a PDF file using PdfDocument.LoadFromFile() method.
  • Get a specified page using PdfDocument.Pages[] property.
  • Crop the page to the specified area using PdfPageBase.CropBox property.
  • Convert the cropped page to an image stream using PdfDocument.SaveAsImage() method.
  • Save the image as a PNG, JPG or BMP file using Stream.Save() method.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
pdf = PdfDocument()

# Load a PDF file from disk
pdf.LoadFromFile("Sample1.pdf")

# Get the first page
page = pdf.Pages.get_Item(0)

# Crop the page by the specified area
page.CropBox = RectangleF(0.0, 300.0, 600.0, 260.0)

# Convert the page to an image
with pdf.SaveAsImage(0) as imageS:

    # Save the image as a PNG file
    imageS.Save("CropPDFSaveAsImage.png")
pdf.Close()

Python: Crop Pages in PDF

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Friday, 13 September 2024 01:05

Python: Set the Transparency of PDF Images

Setting the transparency of images in PDF documents is crucial for achieving professional-grade output, which allows for layering images without hard edges and creating a seamless integration with the background or underlying content. This not only enhances the visual appeal but also creates a polished and cohesive look, especially in graphics-intensive documents. This article will demonstrate how to effectively set the transparency of PDF images using Spire.PDF for Python in Python programs.

Install Spire.PDF for Python

This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.

pip install Spire.PDF

If you are unsure how to install, please refer to: How to Install Spire.PDF for Python on Windows

Add Images with Specified Transparency to PDF

Developers can utilize the PdfPageBase.Canvas.DrawImage() method in Spire.PDF for Python to draw an image at a specified location on a PDF page. Before drawing, developers can set the transparency of the canvas using PdfPageBase.Canvas.SetTransparency() method, which in turn sets the transparency level of the image being drawn. Below are the detailed steps:

  • Create an object of PdfDocument class and load a PDF document using PdfDocument.LoadFromFile() method.
  • Get a page in the document using PdfDocument.Pages.get_Item() method.
  • Load an image using PdfImage.FromFile() method.
  • Set the transparency of the canvas using PdfPageBase.Canvas.SetTransparency() method.
  • Draw the image on the page using PdfPageBase.Canvas.DrawImage() method.
  • Save the document using PdfDocument.SaveToFile() method.
  • Python
from spire.pdf import *

# Create a PdfDocument instance
pdf = PdfDocument()

# Load a PDF file
pdf.LoadFromFile("Sample.pdf")

# Get the first page
page = pdf.Pages.get_Item(0)

# Load an image
image = PdfImage.FromFile("Screen.jpg")

# Set the transparency of the canvas
page.Canvas.SetTransparency(0.2)

# Draw the image at the specified location
page.Canvas.DrawImage(image, PointF(80.0, 80.0))

# Save the document
pdf.SaveToFile("output/AddTranslucentPicture.pdf")
pdf.Close()

Python: Set the Transparency of PDF Images

Adjust the Transparency of Existing Images in PDF

To adjust the transparency of an existing image on a PDF page, developers can retrieve the image along with its bounds, delete the image, and finally redraw the image in the same location with the specified transparency. This process allows for the adjustment of the image's opacity while maintaining its original placement. The detailed steps are as follows:

  • Create an object of PdfDocument class and load a PDF document using PdfDocument.LoadFromFile() method.
  • Get a page in the document using PdfDocument.Pages.get_Item() method.
  • Get an image on the page as a stream through PdfPageBase.ImagesInfo[].Image property and get the bounds of the image through PdfPageBase.ImagesInfo[].Bounds property.
  • Remove the image from the page using PdfPageBase.DeleteImage() method.
  • Create a PdfImage instance with the stream using PdfImage.FromStream() method.
  • Set the transparency of the canvas using PdfPageBase.Canvas.SetTransparency() method.
  • Redraw the image in the same location with the specified transparency using PdfPageBase.Canvas.DrawImage() method.
  • Save the document using PdfDocument.SaveToFile() method.
  • Python
from spire.pdf import *

# Create a PdfDocument instance
pdf = PdfDocument()

# Load a PDF file
pdf.LoadFromFile("Sample1.pdf")

# Get the first page
page = pdf.Pages.get_Item(0)

# Get the first image on the page as a stream and the bounds of the image
imageHelper = PdfImageHelper()
imageInformation = imageHelper.GetImagesInfo(page)
bounds = imageInformation[0].Bounds
imageStream =imageInformation[0].Image

# Delete the original image
imageHelper.DeleteImage(imageInformation[0])

# Create a PdfImage instance using the image stream
image = PdfImage.FromStream(imageStream)

# Create a PdfImage instance using the image stream
image = PdfImage.FromStream(imageStream)

# Set the transparency of the canvas
page.Canvas.SetTransparency(0.3)

# Draw the new image at the same location using the canvas
page.Canvas.DrawImage(image, bounds)

# Save the document
pdf.SaveToFile("output/SetExistingImageTransparency.pdf")
pdf.Close()

Python: Set the Transparency of PDF Images

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Thursday, 05 September 2024 00:53

Python: Add Page Numbers to a PDF Document

Adding page numbers to a PDF enhances its organization and readability, making it easier for readers to navigate the document. Whether for reports, manuals, or e-books, page numbers provide a professional touch and help maintain the flow of information. This process involves determining the placement, alignment, and style of the numbers within the footer or header.

In this article, you will learn how to add page numbers to the PDF footer using Spire.PDF for Python.

Install Spire.PDF for Python

This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.

pip install Spire.PDF

If you are unsure how to install, please refer to this tutorial: How to Install Spire.PDF for Python on Windows

Coordinate System in PDF

When using Spire.PDF for Python to modify a PDF document, the coordinate system's origin is at the top-left corner of the page. The x-axis extends to the right, while the y-axis extends downward.

Page numbers are usually positioned in the header or footer. Thus, it's important to consider the page size and margins when determining the placement of the page numbers.

Python: Add Page Numbers to a PDF Document

Classes and Methods for Creating Page Numbers

Spire.PDF for Python provides the PdfPageNumberField and PdfPageCountField classes to retrieve the current page number and total page count. These can be merged into a single PdfCompositeField that formats the output as "Page X of Y", where X represents the current page number and Y indicates the total number of pages.

To position the PdfCompositeField on the page, use the Location property, and render it with the Draw() method.

Add Left-Aligned Page Numbers to PDF Footer

To add left-aligned page numbers in the footer, you need to consider the left and bottom page margins as well as the page height. For example, you can use coordinates such as (LeftMargin, PageHeight – BottomMargin + SmallNumber). This ensures that the page numbers align with the left side of the text while keeping a comfortable distance from both the content and the edges of the page.

The steps to add left-aligned page numbers to PDF footer are as follows:

  • Create a PdfDocument object.
  • Load a PDF file from a specified path.
  • Create a PdfPageNumberField object and a PdfPageCountField object.
  • Create a PdfCompositeField object to combine page count field and page number field in a single string.
  • Set the position of the composite field through PdfCompositeField.Location property to ensure the page number aligns with the left side of the text.
  • Iterate through the pages in the document, and draw the composite field on each page at the specified location.
  • Save the document to a different PDF file.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF file
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Privacy Policy.pdf")

# Create font, brush and pen
font = PdfTrueTypeFont("Times New Roman", 12.0, PdfFontStyle.Regular, True)
brush = PdfBrushes.get_Black()
pen = PdfPen(brush, 1.0)

# Create a PdfPageNumberField object and a PdfPageCountField object
pageNumberField = PdfPageNumberField()
pageCountField = PdfPageCountField()

# Create a PdfCompositeField object to combine page count field and page number field in a single string
compositeField = PdfCompositeField(font, brush, "Page {0} of {1}", [pageNumberField, pageCountField])

# Get the page size
pageSize = doc.Pages.get_Item(0).Size

# Specify the blank areas around the page
leftMargin = 54.0
rightMargin = 54.0
bottomMargin = 72.0

# Set the location of the composite field
compositeField.Location = PointF(leftMargin, pageSize.Height - bottomMargin + 18.0)

# Iterate through the pages in the document
for i in range(doc.Pages.Count):

    # Get a specific page
    page = doc.Pages.get_Item(i)

    # Draw a line at the specified position
    page.Canvas.DrawLine(pen, leftMargin, pageSize.Height - bottomMargin + 15.0, pageSize.Width - rightMargin, pageSize.Height - bottomMargin + 15.0)

    # Draw the composite field on the page
    compositeField.Draw(page.Canvas, 0.0, 0.0)

# Save to a different PDF file
doc.SaveToFile("Output/LeftAlignedPageNumbers.pdf")

# Dispose resources
doc.Dispose()

Python: Add Page Numbers to a PDF Document

Add Center-Aligned Page Numbers to PDF Footer

To position the page number in the center of the footer, you first need to measure the width of the page number itself. Once you have this measurement, you can calculate the appropriate X coordinate by using the formula (PageWidth - PageNumberWidth) / 2. This ensures the page number is horizontally centered within the footer.

The steps to add center-aligned page numbers to PDF footer are as follows:

  • Create a PdfDocument object.
  • Load a PDF file from a specified path.
  • Create a PdfPageNumberField object and a PdfPageCountField object.
  • Create a PdfCompositeField object to combine page count field and page number field in a single string.
  • Set the position of the composite field through PdfCompositeField.Location property to ensure the page number is perfectly centered in the footer.
  • Iterate through the pages in the document, and draw the composite field on each page at the specified location.
  • Save the document to a different PDF file.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF file
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Privacy Policy.pdf")

# Create font, brush and pen
font = PdfTrueTypeFont("Times New Roman", 12.0, PdfFontStyle.Regular, True)
brush = PdfBrushes.get_Black()
pen = PdfPen(brush, 1.0)

# Specify the blank margins around the page
leftMargin = 54.0
rightMargin = 54.0
bottomMargin = 72.0

# Create a PdfPageNumberField object and a PdfPageCountField object
pageNumberField = PdfPageNumberField()
pageCountField = PdfPageCountField()

# Create a PdfCompositeField object to combine page count field and page number field in a single field
compositeField = PdfCompositeField(font, brush, "Page {0} of {1}", [pageNumberField, pageCountField])

# Iterate through the pages in the document
for i in range(doc.Pages.Count):

    # Get a specific page
    page = doc.Pages.get_Item(i)

    # Get the page size
    pageSize = doc.Pages.get_Item(i).Size

    # Draw a line at the specified position
    page.Canvas.DrawLine(pen, leftMargin, pageSize.Height - bottomMargin + 15.0, pageSize.Width - rightMargin, pageSize.Height - bottomMargin + 15.0)

    # Measure the size the "Page X of Y"
    pageNumberSize = font.MeasureString("Page {} of {}".format(i + 1, doc.Pages.Count))

    # Set the location of the composite field
    compositeField.Location = PointF((pageSize.Width - pageNumberSize.Width)/2, pageSize.Height - bottomMargin + 18.0)

    # Draw the composite field on the page
    compositeField.Draw(page.Canvas, 0.0, 0.0)

# Save to a different PDF file
doc.SaveToFile("Output/CenterAlignedPageNumbers.pdf")

# Dispose resources
doc.Dispose()

Python: Add Page Numbers to a PDF Document

Add Right-Aligned Page Numbers to PDF Footer

To add a right-aligned page number in the footer, measure the width of the page number. Then, calculate the X coordinate using the formula PageWidth - PageNumberWidth - RightMargin. This ensures that the page number aligns with the right side of the text.

The following are the steps to add right-aligned page numbers to PDF footer:

  • Create a PdfDocument object.
  • Load a PDF file from a specified path.
  • Create a PdfPageNumberField object and a PdfPageCountField object.
  • Create a PdfCompositeField object to combine page count field and page number field in a single string.
  • Set the position of the composite field through PdfCompositeField.Location property to ensure the page number aligns with the right side of the text.
  • Iterate through the pages in the document, and draw the composite field on each page at the specified location.
  • Save the document to a different PDF file.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF file
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Privacy Policy.pdf")

# Create font, brush and pen
font = PdfTrueTypeFont("Times New Roman", 12.0, PdfFontStyle.Regular, True)
brush = PdfBrushes.get_Black()
pen = PdfPen(brush, 1.0)

# Specify the blank margins around the page
leftMargin = 54.0
rightMargin = 54.0
bottomMargin = 72.0

# Create a PdfPageNumberField object and a PdfPageCountField object
pageNumberField = PdfPageNumberField()
pageCountField = PdfPageCountField()

# Create a PdfCompositeField object to combine page count field and page number field in a single string
compositeField = PdfCompositeField(font, brush, "Page {0} of {1}", [pageNumberField, pageCountField])

# Iterate through the pages in the document
for i in range(doc.Pages.Count):

    # Get a specific page
    page = doc.Pages.get_Item(i)

    # Get the page size
    pageSize = doc.Pages.get_Item(i).Size

    # Draw a line at the specified position
    page.Canvas.DrawLine(pen, leftMargin, pageSize.Height - bottomMargin + 15.0, pageSize.Width - rightMargin, pageSize.Height - bottomMargin + 15.0)

    # Measure the size the "Page X of Y"
    pageNumberSize = font.MeasureString("Page {} of {}".format(i + 1, doc.Pages.Count))

    # Set the location of the composite field
    compositeField.Location = PointF(pageSize.Width - pageNumberSize.Width - rightMargin, pageSize.Height - bottomMargin + 18.0)

    # Draw the composite field on the page
    compositeField.Draw(page.Canvas, 0.0, 0.0)

# Save to a different PDF file
doc.SaveToFile("Output/RightAlignedPageNumbers.pdf")

# Dispose resources
doc.Dispose()

Python: Add Page Numbers to a PDF Document

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Tuesday, 03 September 2024 00:56

Python: Edit or Modify a Word Document

Programmatic editing of Word documents involves using code to alter or modify the contents of these documents. This approach enables automation and customization, making it particularly advantageous for handling large document collections. Through the use of Spire.Doc library, developers can perform a wide range of operations, including text manipulation, formatting changes, and the addition of images or tables.

The following sections will demonstrate how to edit or modify a Word document in Python using Spire.Doc for Python.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows

Modify Text in a Word Document in Python

In order to alter the content of a paragraph, the initial step is to obtain the desired paragraph from a specific section through the use of the Section.Paragraphs[index] property. Following this, you can replace the existing text with the new content by assigning it to the Paragraph.Text property of the chosen paragraph.

Here are the steps to edit text in a Word document with Python:

  • Create a Document object.
  • Load a Word file from the given file path.
  • Get a specific section using Document.Sections[index] property.
  • Get a specific paragraph using Section.Paragraphs[index] property.
  • Reset the text of the paragraph using Paragraph.Text property.
  • Save the updated document to a different Word file.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()

# Load an existing Word file
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.docx");

# Get a specific section
section = document.Sections[0]

# Get a specific paragraph
paragraph = section.Paragraphs[0]

# Modify the text of the paragraph 
paragraph.Text = "The text has been modified"

# Save the document to a different Word file
document.SaveToFile("output/ModifyText.docx", FileFormat.Docx)

# Dispose resource
document.Dispose()

Python: Edit or Modify a Word Document

Change Formatting of Text in a Word Document in Python

To alter the text appearance of a particular paragraph, you first need to obtain the specified paragraph. Next, go through its child objects to find the individual text ranges. The formatting of each text range can then be updated using the TextRange.CharacterFormat property.

The steps to change text formatting in a Word document are as follows:

  • Create a Document object.
  • Load a Word file from the given file path.
  • Get a specific section using Document.Sections[index] property.
  • Get a specific paragraph using Section.Paragraphs[index] property.
  • Iterate through the child objects in the paragraph.
    • Determine if a child object is a text range.
    • Get a specific text range.
    • Reset the text formatting using TextRange.CharacterFormat property.
  • Save the updated document to a different Word file.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create an object of Document
doc = Document()

# Load a Word document
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.docx")

# Get a specific section
section = document.Sections.get_Item(0)

# Get a specific paragraph
paragraph = section.Paragraphs.get_Item(0)

# Iterate through the child objects in the paragraph
for i in range(paragraph.ChildObjects.Count):
    
    # Determine if a child object is text range
    if isinstance(paragraph.ChildObjects[i], TextRange):

        # Get a specific text range
        textRange = paragraph.ChildObjects[i]

        # Reset font name
        textRange.CharacterFormat.FontName = "Corbel Light"

        # Reset font size
        textRange.CharacterFormat.FontSize = 11.0

        # Reset text color
        textRange.CharacterFormat.TextColor = Color.get_Blue()

        # Apply italic to the text range 
        textRange.CharacterFormat.Italic = True

# Save the document to a different Word file
doc.SaveToFile("output/ChangeFormatting.docx", FileFormat.Docx2019)

# Dispose resource
doc.Dispose()

Python: Edit or Modify a Word Document

Add New Elements to a Word Document in Python

In a Word document, most elements—such as text, images, lists, and charts—are fundamentally organized around the concept of a paragraph. To insert a new paragraph into a specific section, use the Section.AddParagraph() method.

After creating the new paragraph, you can add various elements to it by leveraging the methods and properties of the Paragraph object.

The steps to add new elements (text and images) to a Word document are as follows:

  • Create a Document object.
  • Load a Word file from the given file path.
  • Get a specific section through Document.Sections[index] property.
  • Add a paragraph to the section using Section.AddParagraph() method.
  • Add text to the paragraph using Paragraph.AppendText() method.
  • Add an image to the paragraph using Paragraph.AppendPicture() method.
  • Save the updated document to a different Word file.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create an object of Document
doc = Document()

# Load a Word document
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.docx")

# Get the last section
lastSection = doc.LastSection

# Add a paragraph to the section
paragraph = lastSection.AddParagraph()

# Add an image to the paragraph
picture = paragraph.AppendPicture("C:\\Users\\Administrator\\Desktop\\logo.png");

# Set text wrap style 
picture.TextWrappingStyle = TextWrappingStyle.TopAndBottom

# Add text to the paragraph
paragraph.AppendText("This text and the image above are added by Spire.Doc for Python.")

# Create a paragraph style
style = ParagraphStyle(doc)
style.Name = "FontStyle"
style.CharacterFormat.FontName = "Times New Roman"
style.CharacterFormat.FontSize = 12
doc.Styles.Add(style)

# Apply the style to the paragraph
paragraph.ApplyStyle(style.Name)

# Save the document to a different Word file
doc.SaveToFile("output/AddNewElements.docx", FileFormat.Docx2019)

# Dispose resource
doc.Dispose()

Python: Edit or Modify a Word Document

Remove Paragraphs from a Word Document in Python

To eliminate a specific paragraph from a document, simply invoke the ParagraphCollection.RemoveAt() method and supply the index of the paragraph you intend to delete.

The steps to remove paragraphs from a Word document are as follows:

  • Create a Document object.
  • Load a Word file from the given file path.
  • Get a specific section through Document.Sections[index] property.
  • Remove a specific paragraph from the section using Section.Paragraphs.RemoveAt() method.
  • Save the updated document to a different Word file.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create an object of Document
doc = Document()

# Load a Word document
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.docx")

# Get a specific section
section = doc.Sections[0]

# Remove a specific paragraph
section.Paragraphs.RemoveAt(0)

# Save the document to a different Word file
doc.SaveToFile("output/RemoveParagraph.docx", FileFormat.Docx);

# Dispose resource
doc.Dispose()

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Proper presentation of a PDF document is critical for maintaining its accuracy and professionalism. By checking the orientation and rotation of each PDF page, you can confirm that all elements, including diagrams and images, are displayed correctly as intended on the viewing device or platform, thus avoiding confusion or misinterpretation of content. In this article, you will learn how to detect the orientation and rotation angle of a PDF page in Python using Spire.PDF for Python.

Install Spire.PDF for Python

This scenario requires Spire.PDF for Python. It can be easily installed in your Windows through the following pip command.

pip install Spire.PDF

If you are unsure how to install, please refer to this tutorial: How to Install Spire.PDF for Python on Windows

Detect PDF Page Orientation in Python

Page orientation is determined by the relationship between page width and height. Using Spire.PDF for Python, you can compare these two values to detect whether a page is landscape (width greater than height) or portrait (width less than height). The following are the detailed steps.

  • Create a PdfDocument instance.
  • Load a PDF file using PdfDocument.LoadFromFile() method.
  • Get a specified page using PdfDocument.Pages[] property.
  • Get the width and height of the PDF page using PdfPageBase.Size.Width and PdfPageBase.Size.Height properties.
  • Compare the values of page width and height to detect the page orientation.
  • Print out the result.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
pdf = PdfDocument()

# Load a PDF file from disk
pdf.LoadFromFile("SamplePDF.pdf")

# Get the first page
page = pdf.Pages.get_Item(0)

# Get the width and height of the page
Width = page.Size.Width
Height = page.Size.Height

# Compare the values of page width and height
if Width > Height:
    print("The page orientation is Landscape.")
    
else:
    print("The page orientation is Portrait.")

Python: Detect Page Orientation or Rotation Angle in PDF

Detect PDF Page Rotation Angle in Python

PDF pages can be rotated by a certain angle. To detect the rotation angle of a PDF page, you can use the PdfPageBase.Rotation property. The following are the detailed steps.

  • Create a PdfDocument instance.
  • Load a PDF file using PdfDocument.LoadFromFile() method.
  • Get a specified page using PdfDocument.Pages[] property.
  • Get the rotation angle of the page using PdfPageBase.Rotation property, and then convert it to text string.
  • Print out the result.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
pdf = PdfDocument()

# Load a PDF file from disk
pdf.LoadFromFile("Sample.pdf")

# Get the first page
page = pdf.Pages.get_Item(0)

# Get the rotation angle of the current page
rotationAngle = page.Rotation
rotation = str(rotationAngle)

# Print out the result
print("The rotation angle is: " + rotation)

Python: Detect Page Orientation or Rotation Angle in PDF

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Digital signatures are vital for maintaining the authenticity and integrity of PDF documents. They provide a reliable way to verify the signer's identity and ensure that the document's content has not been tampered with since the signature was applied. By using digital signatures, you can enhance the security and trustworthiness of your documents. In this article, we will explore how to add and remove digital signatures in PDF files in Python using Spire.PDF for Python.

Install Spire.PDF for Python

This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.

pip install Spire.PDF

If you are unsure how to install, please refer to this tutorial: How to Install Spire.PDF for Python on Windows

Add a Digital Signature to PDF in Python

You can use the PdfOrdinarySignatureMaker.MakeSignature(sigFieldName: str, page: PdfPageBase, x: float, y: float, width: float, height: float, signatureAppearance: IPdfSignatureAppearance) method to add a visible digital signature with a custom appearance to a specific page of a PDF document. The detailed steps are as follows.

  • Create a PdfDocument instance.
  • Load a PDF document using PdfDocument.LoadFromFile() method.
  • Create a PdfOrdinarySignatureMaker instance and pass the PdfDocument object, certificate (.pfx) file path and certificate password to the class instructor as parameters.
  • Set signature details, such as the signer’s name, contact information, location, and signature reason, using the properties of the PdfOrdinarySignatureMaker class.
  • Create a PdfSignatureAppearance instance for the signature, and then customize the labels for the signature and set the signature image.
  • Get a specific page in the PDF document using PdfDocument.Pages[] property.
  • Call the PdfOrdinarySignatureMaker.MakeSignature(sigFieldName: str, page: PdfPageBase, x: float, y: float, width: float, height: float, signatureAppearance: IPdfSignatureAppearance) method to add the digital signature to a specific location of the page.
  • Save the result document using the PdfDocument.SaveToFile() method.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument instance
doc = PdfDocument()
# Load a PDF file
doc.LoadFromFile("Sample.pdf")

# Create a signature maker
signatureMaker = PdfOrdinarySignatureMaker(doc, "gary.pfx", "e-iceblue")

# Configure the signature properties like the signer's name, contact information, location and signature reason
signature = signatureMaker.Signature
signature.Name = "Gary"
signature.ContactInfo = "+86 12345678"
signature.Location = "China"
signature.Reason = "I am the author."

# Create a custom signature appearance
appearance = PdfSignatureAppearance(signature)
# Set label for the signer's name
appearance.NameLabel = "Signer: "
# Set label for the contact information
appearance.ContactInfoLabel = "Phone: "
# Set label for the location
appearance.LocationLabel = "Location: "
# Set label for the signature reason
appearance.ReasonLabel = "Reason: "
# Set signature image
appearance.SignatureImage = PdfImage.FromFile("SigImg.png")
# Set the graphic render/display mode for the signature
appearance.GraphicMode = GraphicMode.SignImageAndSignDetail
# Set the layout for the signature image
appearance.SignImageLayout = SignImageLayout.none

# Get the first page
page = doc.Pages.get_Item(0)

# Add the signature to a specified location of the page
signatureMaker.MakeSignature("Signature by Gary", page, 90.0, 600.0, 260.0, 100.0, appearance)

# Save the signed document
doc.SaveToFile("Signed.pdf")
doc.Close()

Python: Add or Remove Digital Signatures in PDF

Add an Invisible Digital Signature to PDF in Python

An invisible signature in a PDF is a type of digital signature that provides all the security and authentication benefits of a standard digital signature but does not appear visibly on the document itself. Using the PdfOrdinarySignatureMaker.MakeSignature(sigFieldName: str) method of Spire.PDF for Python, you can add an invisible digital signature to a PDF document. The detailed steps are as follows.

  • Create a PdfDocument instance.
  • Load a PDF document using PdfDocument.LoadFromFile() method.
  • Create a PdfOrdinarySignatureMaker instance and pass the PdfDocument object, certificate (.pfx) file path and password to the class instructor as parameters.
  • Add an invisible digital signature to a PDF document using the PdfOrdinarySignatureMaker.MakeSignature(sigFieldName: str) method
  • Save the result document using the PdfDocument.SaveToFile() method.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument instance
doc = PdfDocument()
# Load a PDF document
doc.LoadFromFile("test2.pdf")

# Create a signature maker
signatureMaker = PdfOrdinarySignatureMaker(doc, "gary.pfx", "e-iceblue")

# Add an invisible signature to the document
signatureMaker.MakeSignature("Signature by Gary")

# Save the signed document
doc.SaveToFile("InvisibleSignature.pdf")
doc.Close()

Python: Add or Remove Digital Signatures in PDF

Remove Digital Signature from PDF in Python

To remove digital signatures from a PDF document, you need to iterate through all form fields in the document, find the form fields that are of PdfSignatureFieldWidget type and then remove them from the document. The detailed steps are as follows.

  • Create a PdfDocument instance.
  • Load a PDF document using PdfDocument.LoadFromFile() method.
  • Get the form field collection of the document using PdfDocument.Form property.
  • Iterate through the form fields in the collection from the last to the first.
  • Check if the field is a PdfSignatureFieldWidget object.
  • If the result is True, remove the field from the document using PdfFormFieldWidgetCollection.FieldsWidget.RemoveAt(index) method.
  • Save the result document using the PdfDocument.SaveToFile() method.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument instance
doc = PdfDocument()
# Load a PDF document
doc.LoadFromFile("Signed.pdf")

# Get form field collection from the document
pdfForm = doc.Form
formWidget = PdfFormWidget(pdfForm)

# Check if there are any form fields in the collection
if formWidget.FieldsWidget.Count > 0:
    # Loop through all form fields from the last to the first
    for i in range(formWidget.FieldsWidget.Count - 1, -1, -1):
        field = formWidget.FieldsWidget.get_Item(i)
        # Check if the field is a PdfSignatureFieldWidget
        if isinstance(field, PdfSignatureFieldWidget):
            # Remove the field
            formWidget.FieldsWidget.RemoveAt(i)

# Save the document
doc.SaveToFile("RemoveSignature.pdf")
doc.Close()

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Thursday, 22 August 2024 01:04

Python: Add Annotations to PDF Documents

Adding annotations to PDFs is a common practice for adding comments, highlighting text, drawing shapes, and more. This feature is beneficial for collaborative document review, education, and professional presentations. It allows users to mark up documents digitally, enhancing communication and productivity.

In this article, you will learn how to add various types of annotations to a PDF document in Python using Spire.PDF for Python.

Install Spire.PDF for Python

This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.

pip install Spire.PDF

If you are unsure how to install, please refer to this tutorial: How to Install Spire.PDF for Python on Windows

Add a Text Markup Annotation to PDF in Python

Text markup in PDF refers to the ability to emphasize important text by selecting and highlighting it. To add a text markup annotation to a PDF, you first need to locate the specific text within the document with the help of the PdfTextFinder class. Once the text is identified, you can create a PdfTextMarkupAnnotation object and apply it to the document.

The following are the steps to add a text markup annotation to PDF using Python:

  • Create a PdfDocument object.
  • Load a PDF file from the specified location.
  • Get a page from the document.
  • Find a specific piece of text within the page using PdfTextFinder class.
  • Create a PdfTextMarkupAnnotation object based on the text found.
  • Add the annotation to the page using PdfPageBase.AnnotationsWidget.Add() method.
  • Save the modified document to a different PDF file.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF file
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf")

# Get a specific page
page = doc.Pages.get_Item(0)

# Create a PdfTextFinder object based on the page
finder = PdfTextFinder(page)

# Set the find options
finder.Options.Parameter = TextFindParameter.WholeWord
finder.Options.Parameter = TextFindParameter.IgnoreCase

# Find the instances of the specified text
fragments = finder.Find("However, we cannot and do not guarantee that these measures will prevent "+
                        "every unauthorized attempt to access, use, or disclose your information since "+
                         "despite our efforts, no Internet and/or other electronic transmissions can be completely secure.");

# Get the first instance
textFragment = fragments[0]

# Specify annotation text
text = "Here is a markup annotation."

# Iterate through the text bounds
for i in range(len(textFragment.Bounds)):
    
    # Get a specific bound
    rect = textFragment.Bounds[i]

    # Create a text markup annotation
    annotation = PdfTextMarkupAnnotation("Administrator", text, rect)

    # Set the markup color
    annotation.TextMarkupColor = PdfRGBColor(Color.get_Green())

    # Add the annotation to the collection of the annotations
    page.AnnotationsWidget.Add(annotation)

# Save result to file
doc.SaveToFile("output/MarkupAnnotation.pdf")

# Dispose resources
doc.Dispose()

Python: Add Annotations to PDF Documents

Add a Free Text Annotation to PDF in Python

Free text annotations allow adding freeform text comments directly on a PDF. To add a free text annotation at a specific location, you can use the PdfTextFinder class to obtain the coordinate information of the searched text, then create a PdfFreeTextAnnotation object based on that coordinates and add it to the document.

The following are the steps to add a free text annotation to PDF using Python:

  • Create a PdfDocument object.
  • Load a PDF file from the specified location.
  • Get a page from the document.
  • Find a specific piece of text within the page using PdfTextFinder class.
  • Create a PdfFreeTextAnnotation object based on the coordinate information of the text found.
  • Set the annotation content using PdfFreeTextAnnotation.Text property.
  • Add the annotation to the page using PdfPageBase.AnnotationsWidget.Add() method.
  • Save the modified document to a different PDF file.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF file
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf")

# Get a specific page
page = doc.Pages.get_Item(0)

# Create a PdfTextFinder object based on the page
finder = PdfTextFinder(page)

# Set the find options
finder.Options.Parameter = TextFindParameter.WholeWord
finder.Options.Parameter = TextFindParameter.IgnoreCase

# Find the instances of the specified text
fragments = finder.Find("Children");

# Get the first instance
textFragment = fragments[0]

# Get the text bound
rect = textFragment.Bounds[0]

# Get the coordinates to add annotation
right = bound.Right
top = bound.Top

# Create a free text annotation
rectangle = RectangleF(right + 5, top + 2, 160.0, 18.0)
textAnnotation = PdfFreeTextAnnotation(rectangle)

# Set the content of the annotation
textAnnotation.Text = "Here is a free text annotation."

# Set other properties of annotation
font = PdfFont(PdfFontFamily.TimesRoman, 13.0, PdfFontStyle.Regular)
border = PdfAnnotationBorder(1.0)
textAnnotation.Font = font
textAnnotation.Border = border
textAnnotation.BorderColor = PdfRGBColor(Color.get_SkyBlue())
textAnnotation.Color = PdfRGBColor(Color.get_LightBlue())
textAnnotation.Opacity = 1.0

# Add the annotation to the collection of the annotations
page.AnnotationsWidget.Add(textAnnotation)

# Save result to file
doc.SaveToFile("output/FreeTextAnnotation.pdf")

# Dispose resources
doc.Dispose()

Python: Add Annotations to PDF Documents

Add a Popup Annotation to PDF in Python

A popup annotation allows for the display of additional information or content in a pop-up window. To get a specific position to add a popup annotation, you can still use the PdfTextFinder class. Once the coordinate information is obtained, you can create a PdfPopupAnnotation object and add it to the document.

The steps to add a popup annotation to PDF using Python are as follows:

  • Create a PdfDocument object.
  • Load a PDF file from the specified location.
  • Get a page from the document.
  • Find a specific piece of text within the page using PdfTextFinder class.
  • Create a PdfPopupAnnotation object based on the coordinate information of the text found.
  • Set the annotation content using PdfPopupAnnotation.Text property.
  • Add the annotation to the page using PdfPageBase.AnnotationsWidget.Add() method.
  • Save the modified document to a different PDF file.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF file
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf")

# Get a specific page
page = doc.Pages.get_Item(0)

# Create a PdfTextFinder object based on the page
finder = PdfTextFinder(page)

# Set the find options
finder.Options.Parameter = TextFindParameter.WholeWord
finder.Options.Parameter = TextFindParameter.IgnoreCase

# Find the instances of the specified text
fragments = finder.Find("Children");

# Get the first instance
textFragment = fragments[0]

# Get the text bound
bound = textFragment.Bounds[0]

# Get the coordinates to add annotation
right = bound.Right
top = bound.Top

# Create a free text annotation
rectangle = RectangleF(right + 5, top, 30.0, 30.0)
popupAnnotation = PdfPopupAnnotation(rectangle)

# Set the content of the annotation
popupAnnotation.Text = "Here is a popup annotation."

# Set the icon and color of the annotation
popupAnnotation.Icon = PdfPopupIcon.Comment
popupAnnotation.Color = PdfRGBColor(Color.get_Red())

# Add the annotation to the collection of the annotations
page.AnnotationsWidget.Add(popupAnnotation)

# Save result to file
doc.SaveToFile("output/PopupAnnotation.pdf")

# Dispose resources
doc.Dispose()

Python: Add Annotations to PDF Documents

Add a Stamp Annotation to PDF in Python

Stamp annotations in PDF documents are a type of annotation that allow users to add custom "stamps" or symbols to a PDF file. To define the appearance of a stamp annotation, use the PdfTemplate class. Then, create a PdfRubberStampAnnotation object and apply the previously defined template as its appearance. Finally, add the annotation to the PDF document.

The steps to add a stamp annotation to PDF using Python are as follows:

  • Create a PdfDocument object.
  • Load a PDF file from the specified location.
  • Get a page from the document.
  • Create a PdfTemplate object and draw an image on the template.
  • Create a PdfRubberStampAnnotation object and apply the template as its appearance.
  • Add the annotation to the page using PdfPageBase.AnnotationsWidget.Add() method.
  • Save the modified document to a different PDF file.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF document
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf")

# Get a specific page
page = doc.Pages.get_Item(0)

# Load an image file
image = PdfImage.FromFile("C:\\Users\\Administrator\\Desktop\\confidential.png")

# Get the width and height of the image
width = (float)(image.Width)
height = (float)(image.Height)

# Create a PdfTemplate object based on the size of the image
template = PdfTemplate(width, height, True)

# Draw image on the template
template.Graphics.DrawImage(image, 0.0, 0.0, width, height)

# Create a rubber stamp annotation, specifying its location and position
rect = RectangleF((float) (page.ActualSize.Width/2 - width/2), 90.0, width, height)
stamp = PdfRubberStampAnnotation(rect)

# Create a PdfAppearance object
pdfAppearance = PdfAppearance(stamp)

# Set the template as the normal state of the appearance
pdfAppearance.Normal = template

# Apply the appearance to the stamp
stamp.Appearance = pdfAppearance

# Add the stamp annotation to PDF
page.AnnotationsWidget.Add(stamp)

# Save the file
doc.SaveToFile("output/StampAnnotation.pdf")

# Dispose resources
doc.Dispose()

Python: Add Annotations to PDF Documents

Add a Shape Annotation to PDF in Python

Shape annotations in PDF documents are a type of annotation that allow users to add various geometric shapes to the PDF file. Spire.PDF for Python offers the classes such as PdfPolyLineAnnotation, PdfLineAnnotation, and PdfPolygonAnnotation, allowing developers to add different types of shape annotations to PDF.

The steps to add a shape annotation to PDF using Python are as follows:

  • Create a PdfDocument object.
  • Load a PDF file from the specified location.
  • Get a page from the document.
  • Find a specific piece of text within the page using PdfTextFinder class.
  • Create a PdfPolyLineAnnotation object based on the coordinate information of the text found.
  • Set the annotation content using PdfPolyLineAnnotation.Text property.
  • Add the annotation to the page using PdfPageBase.AnnotationsWidget.Add() method.
  • Save the modified document to a different PDF file.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF file
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf")

# Get a specific page
page = doc.Pages.get_Item(0)

# Create a PdfTextFinder object based on the page
finder = PdfTextFinder(page)

# Set the find options
finder.Options.Parameter = TextFindParameter.WholeWord
finder.Options.Parameter = TextFindParameter.IgnoreCase

# Find the instances of the specified text
fragments = finder.Find("Children");

# Get the first instance
textFragment = fragments[0]

# Get the text bound
bound = textFragment.Bounds[0]

# Get the coordinates to add annotation
left = bound.Left
top = bound.Top
right = bound.Right
bottom = bound.Bottom

# Create a shape annotation
polyLineAnnotation = PdfPolyLineAnnotation(page, [PointF(left, top), PointF(right, top), PointF(right - 5, bottom), PointF(left - 5, bottom), PointF(left, top)])

# Set the annotation text
polyLineAnnotation.Text = "Here is a shape annotation."

# Add the annotation to the collection of the annotations
page.AnnotationsWidget.Add(polyLineAnnotation)

# Save result to file
doc.SaveToFile("output/ShapeAnnotation.pdf")

# Dispose resources
doc.Dispose()

Python: Add Annotations to PDF Documents

Add a Web Link Annotation to PDF in Python

Web link annotations in PDF documents enable users to embed clickable links to webpages, providing easy access to additional resources or related information online. You can use the PdfTextFinder class to find the specified text in a PDF document, and then create a PdfUriAnnotation object based on that text.

The following are the steps to add a web link annotation to PDF using Python:

  • Create a PdfDocument object.
  • Load a PDF file from the specified location.
  • Get a page from the document.
  • Find a specific piece of text within the page using PdfTextFinder class.
  • Create a PdfUriAnnotation object based on the bound of the text.
  • Add the annotation to the page using PdfPageBase.AnnotationsWidget.Add() method.
  • Save the modified document to a different PDF file.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF file
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf")

# Get a specific page
page = doc.Pages.get_Item(0)

# Create a PdfTextFinder object based on the page
finder = PdfTextFinder(page)

# Set the find options
finder.Options.Parameter = TextFindParameter.WholeWord
finder.Options.Parameter = TextFindParameter.IgnoreCase

# Find the instances of the specified text
fragments = finder.Find("Children");

# Get the first instance
textFragment = fragments[0]

# Get the text bound
bound = textFragment.Bounds[0]

# Create a Url annotation
urlAnnotation = PdfUriAnnotation(bound, "https://www.e-iceblue.com/");

# Add the annotation to the collection of the annotations
page.AnnotationsWidget.Add(urlAnnotation)

# Save result to file
doc.SaveToFile("output/WebLinkAnnotation.pdf")

# Dispose resources
doc.Dispose()

Python: Add Annotations to PDF Documents

Add a File Link Annotation to PDF in Python

A file link annotation in a PDF document is a type of annotation that allows users to create a clickable link to an external file, such as another PDF, image, or document. Still, you can use the PdfTextFinder class to find the specified text in a PDF document, and then create a PdfFileLinkAnnotation object based on that text.

The following are the steps to add a file link annotation to PDF using Python:

  • Create a PdfDocument object.
  • Load a PDF file from the specified location.
  • Get a page from the document.
  • Find a specific piece of text within the page using PdfTextFinder class.
  • Create a PdfFileLinkAnnotation object based on the bound of the text.
  • Add the annotation to the page using PdfPageBase.AnnotationsWidget.Add() method.
  • Save the modified document to a different PDF file.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF file
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf")

# Get a specific page
page = doc.Pages.get_Item(0)

# Create a PdfTextFinder object based on the page
finder = PdfTextFinder(page)

# Set the find options
finder.Options.Parameter = TextFindParameter.WholeWord
finder.Options.Parameter = TextFindParameter.IgnoreCase

# Find the instances of the specified text
fragments = finder.Find("Children");

# Get the first instance
textFragment = fragments[0]

# Get the text bound
bound = textFragment.Bounds[0]

# Create a file link annotation
fileLinkAnnotation = PdfFileLinkAnnotation(bound, "C:\\Users\\Administrator\\Desktop\\Report.docx")

# Add the annotation to the collection of the annotations
page.AnnotationsWidget.Add(fileLinkAnnotation)

# Save result to file
doc.SaveToFile("output/FileLinkAnnotation.pdf")

# Dispose resources
doc.Dispose()

Python: Add Annotations to PDF Documents

Add a Document Link Annotation to PDF in Python

A document link annotation in a PDF document is a type of annotation that creates a clickable link to a specific location within the same PDF file. To create a document link annotation, you can first use the PdfTextFinder class to identify the target text in the PDF document. Then, a PdfDocumentLinkAnnotation object is instantiated, and its destination property is configured to redirect the user to the specified location in the PDF.

The steps to add a document link annotation to PDF using Python are as follows:

  • Create a PdfDocument object.
  • Load a PDF file from the specified location.
  • Get a page from the document.
  • Find a specific piece of text within the page using PdfTextFinder class.
  • Create a PdfDocumentLinkAnnotation object based on the bound of the text.
  • Set the destination of the annotation using PdfDocumentLinkAnnotation.Destination property.
  • Add the annotation to the page using PdfPageBase.AnnotationsWidget.Add() method.
  • Save the modified document to a different PDF file.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF file
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf")

# Get a specific page
page = doc.Pages.get_Item(0)

# Create a PdfTextFinder object based on the page
finder = PdfTextFinder(page)

# Set the find options
finder.Options.Parameter = TextFindParameter.WholeWord
finder.Options.Parameter = TextFindParameter.IgnoreCase

# Find the instances of the specified text
fragments = finder.Find("Children");

# Get the first instance
textFragment = fragments[0]

# Get the text bound
bound = textFragment.Bounds[0]

# Create a document link annotation
documentLinkAnnotation = PdfDocumentLinkAnnotation(bound)
     
# Set the destination of the annotation
documentLinkAnnotation.Destination = PdfDestination(doc.Pages[1]);

# Add the annotation to the collection of the annotations
page.AnnotationsWidget.Add(documentLinkAnnotation)

# Save result to file
doc.SaveToFile("output/DocumentLinkAnnotation.pdf")

# Dispose resources
doc.Dispose()

Python: Add Annotations to PDF Documents

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Setting the column width in Word tables is crucial for optimizing document readability and aesthetics. Appropriate column widths prevent long lines of text from hindering readability, particularly in text-dense tables. Word offers two approaches: percentages and fixed values. Setting column widths using percentage values allows tables to adapt to different screen sizes, keeping content neatly aligned and enhancing the reading experience. Using fixed values, on the other hand, precisely controls the structure of the table, ensuring consistency and professionalism, making it suitable for designs with strict data alignment requirements or complex layouts. This article will introduce how to set Word table column width based on percentage or fixed value settings using Spire.Doc for .NET in C# projects.

Install Spire.Doc for .NET

To begin with, you need to add the DLL files included in the Spire.Doc for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.Doc

Set Column Width Based on Percentage in C#

When setting column widths in a Word table using percentage values, you first need to set the table's preferred width type to percentage. This is done with Table.PreferredWidth = new PreferredWidth(WidthType.Percentage, (short)100). Then, iterate through each column and set the width to the same or different percentage values as required. Here are the detailed steps:

  • Create a Document object.
  • Load a document using the Document.LoadFromFile() method.
  • Retrieve the first section of the document using Document.Sections[0].
  • Get the first table within the section using Section.Tables[0].
  • Use a for loop to iterate through all rows in the table.
  • Set the column width for cells in different columns to percentage values using the TableRow.Cells[index].SetCellWidth(value, CellWidthType.Percentage) method.
  • Save the changes to the Word document using the Document.SaveToFile() method.
  • C#
using Spire.Doc;

namespace SpireDocDemo
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create a new Document object
            Document doc = new Document();

            // Load the document named "example.docx"
            doc.LoadFromFile("Sample.docx");

            // Get the first Section of the document
            Section section = doc.Sections[0];

            // Cast the first Table in the Section to Table type
            Table table = (Table)section.Tables[0];

            // Create a PreferredWidth object, set the width type to Percentage, and set the width value to 100%
            PreferredWidth percentageWidth = new PreferredWidth(WidthType.Percentage, (short)100);

            // Set the preferred width of the Table to the PreferredWidth object created above
            table.PreferredWidth = percentageWidth;

            // Define a variable of type TableRow
            TableRow tableRow;

            // Iterate through all rows in the Table
            for (int i = 0; i < table.Rows.Count; i++)
            {
                // Get the current row
                tableRow = table.Rows[i];

                // Set the width of the first column cell to 34%, with the type as Percentage
                tableRow.Cells[0].SetCellWidth(34, CellWidthType.Percentage);

                // Set the width of the second column cell to 33%, with the type as Percentage
                tableRow.Cells[1].SetCellWidth(33, CellWidthType.Percentage);

                // Set the width of the third column cell to 33%, with the type as Percentage
                tableRow.Cells[2].SetCellWidth(33, CellWidthType.Percentage);
            }

            // Save the modified document, specifying the file format as Docx2016
            doc.SaveToFile("set_column_width_by_percentage.docx", FileFormat.Docx2016);

            // Close the document
            doc.Close();
        }
    }
}

C#: Set the Column Width of a Word Table

Set Column Width Based on Fixed Value in C#

When setting column widths in a Word table using fixed values, you first need to set the table's layout to fixed. This is done with Table.TableFormat.LayoutType = LayoutType.Fixed. Then, iterate through each column and set the width to the same or different fixed values as required. Here are the detailed steps:

  • Create a Document object.
  • Load a document using the Document.LoadFromFile() method.
  • Retrieve the first section of the document using Document.Sections[0].
  • Get the first table within the section using Section.Tables[0].
  • Use a for loop to iterate through all rows in the table.
  • Set the column width for cells in different columns to fixed values using the TableRow.Cells[index].SetCellWidth(value, CellWidthType.Point) method. Note that value should be replaced with the desired width in points.
  • Save the changes to the Word document using the Document.SaveToFile() method.
  • C#
using Spire.Doc;

namespace SpireDocDemo
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create a new Document object
            Document doc = new Document();

            // Load the document
            doc.LoadFromFile("Sample.docx");

            // Get the first Section of the document
            Section section = doc.Sections[0];

            // Cast the first Table in the Section to Table type
            Table table = (Table)section.Tables[0];

            // Set the table layout type to Fixed
            table.Format.LayoutType = LayoutType.Fixed;

            // Get the left margin
            float leftMargin = section.PageSetup.Margins.Left;

            // Get the right margin
            float rightMargin = section.PageSetup.Margins.Right;

            // Calculate the page width minus the left and right margins
            double pageWidth = section.PageSetup.PageSize.Width - leftMargin - rightMargin;

            // Define a variable of type TableRow
            TableRow tableRow;

            // Loop through all rows in the Table
            for (int i = 0; i < table.Rows.Count; i++)
            {
                // Get the current row
                tableRow = table.Rows[i];

                // Set the width of the first column cell to 34% of the page width
                tableRow.Cells[0].SetCellWidth((float)(pageWidth * 0.34), CellWidthType.Point);

                // Set the width of the second column cell to 33% of the page width
                tableRow.Cells[1].SetCellWidth((float)(pageWidth * 0.33), CellWidthType.Point);

                // Set the width of the third column cell to 33% of the page width
                tableRow.Cells[2].SetCellWidth((float)(pageWidth * 0.33), CellWidthType.Point);
            }

            // Save the modified document, specifying the file format as Docx2016
            doc.SaveToFile("set_column_width_to_fixed_value.docx", FileFormat.Docx2016);

            // Close the document
            doc.Close();
        }
    }
}

C#: Set the Column Width of a Word Table

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Spire.OCR for .NET offers developers a new model to extract text from images. In this article, we will demonstrate how to extract text from images in C# using the new model of Spire.OCR for .NET.

The detailed steps are as follows.

Step 1: Create a Console App (.NET Framework) in Visual Studio.

C#: Extract Text from Images using the New Model of Spire.OCR for .NET

C#: Extract Text from Images using the New Model of Spire.OCR for .NET

Step 2: Change the platform target of the application to x64.

In the application's Solution Explorer, right-click on the project's name and then select "Properties".

C#: Extract Text from Images using the New Model of Spire.OCR for .NET

Change the platform target of the application to x64. This step must be performed since Spire.OCR only supports 64-bit platforms.

C#: Extract Text from Images using the New Model of Spire.OCR for .NET

Step 3: Install Spire.OCR for .NET in your application.

Install Spire.OCR for .NET through NuGet by executing the following command in the NuGet Package Manager Console:

Install-Package Spire.OCR

C#: Extract Text from Images using the New Model of Spire.OCR for .NET

Step 4: Download the new model of Spire.OCR for .NET.

Download the model that fits in with your operating system from one of the following links.

Windows x64

Linux x64

macOS 10.15 and later

Linux aarch

Then extract the package and save it to a specific directory on your computer. In this example, we saved the package to "D:\".

C#: Extract Text from Images using the New Model of Spire.OCR for .NET

Step 5: Use the new model of Spire.OCR for .NET to extract text from images in C#.

The following code example shows how to extract text from an image using C# and the new model of Spire.OCR for .NET:

  • C#
using Spire.OCR;
using System.IO;

namespace NewOCRModel
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Set license key
            // Spire.OCR.License.LicenseProvider.SetLicenseKey("your-license-key");

            // Create an instance of the OcrScanner class
            OcrScanner scanner = new OcrScanner();

            // Create an instance of the ConfigureOptions class to set up the scanner configuration
            ConfigureOptions configureOptions = new ConfigureOptions();

            // Set the path to the new model
            configureOptions.ModelPath = "D:\\win-x64";

            // Set the language for text recognition. The default is English.
            // Supported languages include English, Chinese, Chinesetraditional, French, German, Japanese, and Korean.
            configureOptions.Language = "English";

            // Apply the configuration options to the scanner
            scanner.ConfigureDependencies(configureOptions);

            // Extract text from an image
            scanner.Scan("test.png");

            //Save the extracted text to a text file
            string text = scanner.Text.ToString();
            File.WriteAllText("Output.txt", text);
        }
    }
}

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Page 8 of 46