Python (359)
Grouping rows and columns in Excel provides a more organized and structured view of data, making it easier to analyze and understand complex datasets. After grouping related rows or columns, you can collapse or expand them as needed to focus on specific subsets of information while hiding details. In this article, you will learn how to group or ungroup rows and columns , as well as how to collapse or expand groups in Excel in Python using Spire.XLS for Python.
- Group Rows and Columns in Excel in Python
- Ungroup Rows and Columns in Excel in Python
- Expand or Collapse Groups in Excel in Python
Install Spire.XLS for Python
This scenario requires Spire.XLS for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.
pip install Spire.XLS
If you are unsure how to install, please refer to this tutorial: How to Install Spire.XLS for Python on Windows
Group Rows and Columns in Excel in Python
Spire.XLS for Python provides the Worksheet.GroupByRows() and Worksheet.GroupByColumns() methods to group specific rows and columns in an Excel worksheet. The following are the detailed steps:
- Create a Workbook object.
- Load a sample Excel file using Workbook.LoadFromFile() method.
- Get the specified worksheet using Workbook.Worksheets[] property.
- Group rows using Worksheet.GroupByRows() method.
- Group columns using Worksheet.GroupByColumns() method.
- Save the result file using Workbook.SaveToFile() method.
- Python
from spire.xls import * from spire.xls.common import * inputFile = "Data.xlsx" outputFile = "GroupRowsAndColumns.xlsx" # Create a Workbook object workbook = Workbook() # Load a sample Excel file workbook.LoadFromFile(inputFile) # Get the first worksheet sheet = workbook.Worksheets[0] # Group rows sheet.GroupByRows(2, 6, False) sheet.GroupByRows(8, 13, False) # Group columns sheet.GroupByColumns(4, 6, False) # Save the result file workbook.SaveToFile(outputFile, ExcelVersion.Version2016) workbook.Dispose()

Ungroup Rows and Columns in Excel in Python
Ungrouping rows and columns in Excel refer to the process of reversing the grouping operation and restoring the individual rows or columns to their original state.
To ungroup rows and columns in an Excel worksheet, you can use the Worksheet.UngroupByRows() and Worksheet.UngroupByColumns() methods. The following are the detailed steps:
- Create a Workbook object.
- Load a sample Excel file using Workbook.LoadFromFile() method.
- Get the specified worksheet using Workbook.Worksheets[] property.
- Ungroup rows using Worksheet.UngroupByRows() method.
- Ungroup columns using Worksheet.UngroupByColumns() method.
- Save the result file using Workbook.SaveToFile() method.
- Python
from spire.xls import * from spire.xls.common import * inputFile = "GroupRowsAndColumns.xlsx" outputFile = "UnGroupRowsAndColumns.xlsx" # Create a Workbook object workbook = Workbook() # Load a sample Excel file workbook.LoadFromFile(inputFile) # Get the first worksheet sheet = workbook.Worksheets[0] # UnGroup rows sheet.UngroupByRows(2, 6) sheet.UngroupByRows(8, 13) # UnGroup columns sheet.UngroupByColumns(4, 6) # Save the result file workbook.SaveToFile(outputFile, ExcelVersion.Version2016) workbook.Dispose()

Expand or Collapse Groups in Excel in Python
Expanding or collapsing groups in Excel refers to the action of showing or hiding the detailed information within a grouped section. With Spire.XLS for Python, you can expand or collapse groups through the Worksheet.Range[].ExpandGroup() or Worksheet.Range[].CollapseGroup() methods. The following are the detailed steps:
- Create a Workbook object.
- Load a sample Excel file using Workbook.LoadFromFile() method.
- Get the specified worksheet using Workbook.Worksheets[] property.
- Expand a specific group using the Worksheet.Range[].ExpandGroup() method.
- Collapse a specific group using the Worksheet.Range[].CollapseGroup() method.
- Save the result file using Workbook.SaveToFile() method.
- Python
from spire.xls import * from spire.xls.common import * inputFile = "Grouped.xlsx" outputFile = "ExpandOrCollapseGroups.xlsx" # Create a Workbook object workbook = Workbook() # Load a sample Excel file workbook.LoadFromFile(inputFile) # Get the first worksheet sheet = workbook.Worksheets[0] # Expand a group sheet.Range["A2:G6"].ExpandGroup(GroupByType.ByRows) # Collapse a group sheet.Range["D1:F15"].CollapseGroup(GroupByType.ByColumns) # Save the result file workbook.SaveToFile(outputFile, ExcelVersion.Version2016) workbook.Dispose()

Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
PDF bookmarks are navigational aids that allow users to quickly locate and jump to specific sections or pages in a PDF document. Through a simple click, users can arrive at the target location, which eliminates the need to manually scroll or search for specific content in a lengthy document. In this article, you will learn how to programmatically add, modify and delete bookmarks in PDF files using Spire.PDF for Python.
- Add Bookmarks to a PDF Document
- Edit Bookmarks in a PDF Document
- Delete Bookmarks from a PDF Document
Install Spire.PDF for Python
This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.
pip install Spire.PDF
If you are unsure how to install, please refer to this tutorial: How to Install Spire.PDF for Python on Windows
Add Bookmarks to a PDF Document in Python
Spire.PDF for Python provides a method to add bookmarks to a PDF document: PdfDocument. Bookmarks.Add(). You can use this method to create primary bookmarks for the PDF document and use the PdfBookmarkCollection.Add() method to add sub-bookmarks to the primary bookmarks. Additionally, the PdfBookmark class offers other methods to set properties such as destination, text color, and text style for the bookmarks. The following are the detailed steps for adding bookmarks to a PDF document.
- Create a PdfDocument class instance.
- Load a PDF document using PdfDocument.LoadFromFile() method.
- Add a parent bookmark to the document using PdfDocument.Bookmarks.Add() method.
- Create a PdfDestination class object and set the destination of the parent bookmark using PdfBookmark.Action property.
- Set the text color and style of the parent bookmark.
- Create a PdfBookmarkCollection class object to add sub-bookmark to the parent bookmark using PdfBookmarkCollection.Add() method.
- Use the above methods to set the destination, text color, and text style of the sub-bookmark.
- Save the document using PdfDocument.SaveToFile() method.
- Python
from spire.pdf.common import *
from spire.pdf import *
# Create a PdfDocument object
doc = PdfDocument()
# Load a PDF file
doc.LoadFromFile("Terms of service.pdf")
# Loop through the pages in the PDF file
for i in range(doc.Pages.Count):
page = doc.Pages.get_Item(i)
# Set the title and destination for the bookmark
bookmarkTitle = "Bookmark-{0}".format(i+1)
bookmarkDest = PdfDestination(page, PointF(0.0, 0.0))
# Create and configure the bookmark
bookmark = doc.Bookmarks.Add(bookmarkTitle)
bookmark.Color = PdfRGBColor(Color.get_SaddleBrown())
bookmark.DisplayStyle = PdfTextStyle.Bold
bookmark.Action = PdfGoToAction(bookmarkDest)
# Create a collection to hold child bookmarks
bookmarkColletion = PdfBookmarkCollection(bookmark)
# Set the title and destination for the child bookmark
childBookmarkTitle = "Sub-Bookmark-{0}".format(i+1)
childBookmarkDest = PdfDestination(page, PointF(0.0, 100.0))
# Create and configure the child bookmark
childBookmark = bookmarkColletion.Add(childBookmarkTitle)
childBookmark.Color = PdfRGBColor(Color.get_Coral())
childBookmark.DisplayStyle = PdfTextStyle.Italic
childBookmark.Action = PdfGoToAction(childBookmarkDest)
# Save the PDF file
outputFile = "Bookmark.pdf"
doc.SaveToFile(outputFile)
# Close the document
doc.Close()

Edit Bookmarks in a PDF Document
If you need to update the existing bookmarks, you can use the methods of PdfBookmark class to rename the bookmarks and change their text color, text style. The following are the detailed steps.
- Create a PdfDocument class instance.
- Load a PDF document using PdfDocument.LoadFromFile() method.
- Get a specified bookmark using PdfDocument.Bookmarks[] property.
- Change the title of the bookmark using PdfBookmark.Title property.
- Change the font color of the bookmark using PdfBookmark.Color property.
- Change the text style of the bookmark using PdfBookmark.DisplayStyle property.
- Change the text color and style of the sub-bookmark using the above methods.
- Save the result document using PdfDocument.SaveToFile() method.
- Python
from spire.pdf.common import *
from spire.pdf import *
# Create a PdfDocument object
doc = PdfDocument()
# Load a PDF file
doc.LoadFromFile("Bookmark.pdf")
# Get the first bookmark
bookmark = doc.Bookmarks.get_Item(0)
# Change the title of the bookmark
bookmark.Title = "Modified BookMark"
# Set the color of the bookmark
bookmark.Color = PdfRGBColor(Color.get_Black())
# Set the outline text style of the bookmark
bookmark.DisplayStyle = PdfTextStyle.Bold
# Edit child bookmarks of the parent bookmark
pBookmark = PdfBookmarkCollection(bookmark)
for i in range(pBookmark.Count):
childBookmark = pBookmark.get_Item(i)
childBookmark.Color = PdfRGBColor(Color.get_Blue())
childBookmark.DisplayStyle = PdfTextStyle.Regular
# Save the PDF document
outputFile = "EditBookmark.pdf"
# Close the document
doc.SaveToFile(outputFile)

Delete Bookmarks from a PDF Document
Spire.PDF for Python also provides methods to delete any bookmark in a PDF document. PdfDocument.Bookmarks.RemoveAt() method is used to remove a specific primary bookmark, PdfDocument.Bookmarks.Clear() method is used to remove all bookmarks, and PdfBookmarkCollection.RemoveAt() method is used to remove a specific sub-bookmark of a primary bookmark. The detailed steps of removing bookmarks form a PDF document are as follows.
- Create a PdfDocument class instance.
- Load a PDF document using PdfDocument.LoadFromFile() method.
- Get the first bookmark using PdfDocument.Bookmarks[] property.
- Remove a specified sub-bookmark of the first bookmark using PdfBookmarkCollection.RemoveAt() method.
- Remove a specified bookmark including its sub-bookmarks using PdfDocument.Bookmarks.RemoveAt() method.
- Remove all bookmarks in the PDF file using PdfDocument.Bookmarks.Clear() method.
- Save the document using PdfDocument.SaveToFile() method.
- Python
from spire.pdf.common import *
from spire.pdf import *
# Create a PdfDocument object
doc = PdfDocument()
# Load a PDF file
doc.LoadFromFile("Bookmark.pdf")
# # Delete the first bookmark
# doc.Bookmarks.RemoveAt(0)
# # Get the first bookmark
# bookmark = doc.Bookmarks.get_Item(0)
# # Remove the first child bookmark from first parent bookmark
# pBookmark = PdfBookmarkCollection(bookmark)
# pBookmark.RemoveAt(0)
#Remove all bookmarks
doc.Bookmarks.Clear()
# Save the PDF document
output = "DeleteAllBookmarks.pdf"
doc.SaveToFile(output)
# Close the document
doc.Close()

Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
The conversion from HTML to image allows you to capture the appearance and layout of the HTML content as a static image file. It can be useful for various purposes, such as generating website previews, creating screenshots, archiving web pages, or integrating HTML content into applications that primarily deal with images. In this article, you will learn how to convert an HTML file or an HTML string to an image in Python using Spire.Doc for Python.
Install Spire.Doc for Python
This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.
pip install Spire.Doc
If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows
Convert an HTML File to an Image in Python
When an HTML file is loaded into the Document object using the Document.LoadFromFile() method, its contents are automatically rendered as the contents of a Word page. Then, a specific page can be saved as an image stream using the Document.SaveImageToStreams() method.
The following are the steps to convert an HTML file to an image with Python.
- Create a Document object.
- Load a HTML file using Document.LoadFromFile() method.
- Convert a specific page to an image stream using Document.SaveImageToStreams() method.
- Save the image stream as a PNG file using BufferedWriter.write() method.
- Python
from spire.doc import *
from spire.doc.common import *
# Create a Document object
document = Document()
# Load an HTML file
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Sample.html", FileFormat.Html, XHTMLValidationType.none)
# Save the first page as an image stream
imageStream = document.SaveImageToStreams(0, ImageType.Bitmap)
# Convert the image stream as a PNG file
with open("output/HtmlToImage.png",'wb') as imageFile:
imageFile.write(imageStream.ToArray())
document.Close()

Convert an HTML String to an Image in Python
To render uncomplicated HTML strings (typically text and its formatting) as a Word page, you can utilize the Paragraph.AppendHTML() method. Afterwards, you can convert it to an image stream using the Document.SaveImageToStreams() method.
The following are the steps to convert an HTML string to an image in Python.
- Create a Document object.
- Add a section using Document.AddSection() method.
- Add a paragraph using Section.AddParagraph() method.
- Specify the HTML string, and add the it to the paragraph using Paragraph.AppendHTML() method.
- Convert a specific page to an image stream using Document.SaveImageToStreams() method.
- Save the image stream as a PNG file using BufferedWriter.write() method.
- Python
from spire.doc import *
from spire.doc.common import *
# Create a Document object
document = Document()
# Add a section to the document
sec = document.AddSection()
# Add a paragraph to the section
paragraph = sec.AddParagraph()
# Specify the HTML string
htmlString = """
<html>
<head>
<title>HTML to Word Example</title>
<style>
body {
font-family: Arial, sans-serif;
}
h1 {
color: #FF5733;
font-size: 24px;
margin-bottom: 20px;
}
p {
color: #333333;
font-size: 16px;
margin-bottom: 10px;
}
ul {
list-style-type: disc;
margin-left: 20px;
margin-bottom: 15px;
}
li {
font-size: 14px;
margin-bottom: 5px;
}
table {
border-collapse: collapse;
width: 100%;
margin-bottom: 20px;
}
th, td {
border: 1px solid #CCCCCC;
padding: 8px;
text-align: left;
}
th {
background-color: #F2F2F2;
font-weight: bold;
}
td {
color: #0000FF;
}
</style>
</head>
<body>
<h1>This is a Heading</h1>
<p>This is a paragraph.</p>
<p>Here's an unordered list:</p>
<ul>
<li>Item 1</li>
<li>Item 2</li>
<li>Item 3</li>
</ul>
<p>And here's a table:</p>
<table>
<tr>
<th>Name</th>
<th>Age</th>
<th>Gender</th>
</tr>
<tr>
<td>John Smith</td>
<td>35</td>
<td>Male</td>
</tr>
<tr>
<td>Jenny Garcia</td>
<td>27</td>
<td>Female</td>
</tr>
</table>
</body>
</html>
"""
# Append the HTML string to the paragraph
paragraph.AppendHTML(htmlString)
# Save the first page as an image stream
imageStream = document.SaveImageToStreams(0, ImageType.Bitmap)
# Convert the image stream as a PNG file
with open("output/HtmlToImage2.png",'wb') as imageFile:
imageFile.write(imageStream.ToArray())
document.Close()

Get a Free License
To fully experience the capabilities of Spire.Doc for Python without any evaluation limitations, you can request a free 30-day trial license.
Captions play a significant role in Word documents by serving as markers, explanations, navigation aids, and accessibility features. They are crucial elements for creating professional, accurate, and user-friendly documents. Captions help improve the readability, usability, and accessibility of the document and are essential for understanding and effectively processing the document's content. This article will explain how to use Spire.Doc for Python to add or remove captions in Word documents using Python programs.
- Add Image Captions to a Word document in Python
- Add Table Captions to a Word document in Python
- Remove Captions from a Word document in Python
Install Spire.Doc for Python
This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.
pip install Spire.Doc
If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows
Add Image Captions to a Word document in Python
Spire.Doc for Python provides a convenient method to add captions to images. Simply call the DocPicture.AddCaption(self, name: str, numberingFormat: 'CaptionNumberingFormat', captionPosition: 'CaptionPosition') method to add a caption for the image. The detailed steps are as follows:
- Create an object of the Document class.
- Use the Document.AddSection() method to add a section.
- Add a paragraph using Section.AddParagraph() method.
- Use the Paragraph.AppendPicture(self ,imgFile:str) method to add a DocPicture image object to the paragraph.
- Use the DocPicture.AddCaption(self ,name:str,numberingFormat:'CaptionNumberingFormat',captionPosition:'CaptionPosition') method to add a caption with numbering format as CaptionNumberingFormat.Number.
- Set the Document.IsUpdateFields property to true to update all fields.
- Use the Document.SaveToFile() method to save the resulting document.
- Python
from spire.doc import *
from spire.doc.common import *
# Create a Word document object
document = Document()
# Add a section
section = document.AddSection()
# Add a new paragraph and add an image to it
pictureParagraphCaption = section.AddParagraph()
pictureParagraphCaption.Format.AfterSpacing = 10
pic1 = pictureParagraphCaption.AppendPicture("Data\\1.png")
pic1.Height = 100
pic1.Width = 100
# Add a caption to the image
format = CaptionNumberingFormat.Number
pic1.AddCaption("Image", format, CaptionPosition.BelowItem)
# Add another new paragraph and add an image to it
pictureParagraphCaption = section.AddParagraph()
pic2 = pictureParagraphCaption.AppendPicture("Data\\2.png")
pic2.Height = 100
pic2.Width = 100
# Add a caption to the image
pic2.AddCaption("Image", format, CaptionPosition.BelowItem)
# Update all fields in the document
document.IsUpdateFields = True
# Save the document as a docx file
result = "AddImageCaption.docx"
document.SaveToFile(result, FileFormat.Docx2016)
# Close the document object and release resources
document.Close()
document.Dispose()

Add Table Captions to a Word document in Python
To facilitate the addition of captions to tables, Spire.Doc for Python also provides a convenient method similar to adding captions to images. You can use the Table.AddCaption(self, name:str, format:'CaptionNumberingFormat', captionPosition:'CaptionPosition') method to create a caption for the table. The following are the detailed steps:
- Create an object of the Document class.
- Use the Document.AddSection() method to add a section.
- Create a Table object and add it to the specified section in the document.
- Use the Table.ResetCells(self ,rowsNum:int,columnsNum:int) method to set the number of rows and columns in the table.
- Add a caption to the table using the Table.AddCaption(self ,name:str,format:'CaptionNumberingFormat',captionPosition:'CaptionPosition') method, specifying the caption numbering format as CaptionNumberingFormat.Number.
- Set the Document.IsUpdateFields property to true to update all fields.
- Use the Document.SaveToFile() method to save the resulting document.
- Python
from spire.doc import *
from spire.doc.common import *
# Create a Word document object
document = Document()
# Add a section
section = document.AddSection()
# Add a table
tableCaption = section.AddTable(True)
tableCaption.ResetCells(3, 2)
# Add a caption to the table
tableCaption.AddCaption("Table", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
# Add another table and caption to it
tableCaption = section.AddTable(True)
tableCaption.ResetCells(2, 3)
tableCaption.AddCaption("Table", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
# Update all fields in the document
document.IsUpdateFields = True
# Save the document as a docx file
result = "AddTableCaption.docx"
document.SaveToFile(result, FileFormat.Docx2016)
# Close the document object and release resources
document.Close()
document.Dispose()

Remove Captions from a Word document in Python
Spire.Doc for Python also supports removing captions from Word documents. Here are the detailed steps:
- Create an object of the Document class.
- Use the Document.LoadFromFile() method to load a Word document.
- Create a custom method, named detect_caption_paragraph(paragraph), to determine if a paragraph contains a caption.
- Iterate through all the Paragraph objects in the document using a loop and utilize the custom method, detect_caption_paragraph(paragraph), to identify and delete paragraphs that contain captions.
- Use the Document.SaveToFile() method to save the resulting document.
- Python
from spire.doc import *
from spire.doc.common import *
# Method to detect if a paragraph is a caption paragraph
def detect_caption_paragraph(paragraph):
tag = False
field = None
# Iterate through the child objects in the paragraph
for i in range(len(paragraph.ChildObjects)):
if paragraph.ChildObjects[i].DocumentObjectType == DocumentObjectType.Field:
# Check if the child object is of Field type
field = paragraph.ChildObjects[i]
if field.Type == FieldType.FieldSequence:
# Check if the Field type is FieldSequence, indicating a caption field type
return True
return tag
# Create a Word document object
document = Document()
# Load the sample.docx file
document.LoadFromFile("Data/sample.docx")
# Iterate through all sections
for i in range(len(document.Sections)):
section = document.Sections.get_Item(i)
# Iterate through paragraphs in reverse order within the section
for j in range(len(section.Body.Paragraphs) - 1, -1, -1):
# Check if the paragraph is a caption paragraph
if detect_caption_paragraph(section.Body.Paragraphs[j]):
# If it's a caption paragraph, remove it
section.Body.Paragraphs.RemoveAt(j)
# Save the document after removing captions
result = "DeleteCaptions.docx"
document.SaveToFile(result, FileFormat.Docx2016)
# Close the document object and release resources
document.Close()
document.Dispose()

Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
Page size refers to the dimensions of a document's page. It determines the width and height of the printable area and plays a crucial role in the overall layout and design of the document. Different types of documents may require specific page sizes, such as standard letter size (8.5 x 11 inches) for business letters or A4 size (210 x 297 mm) for international correspondence. Adjusting the page size ensures that your document is compatible with the intended output or presentation medium. In this article, we will demonstrate how to adjust the page size of a Word document in Python using Spire.Doc for Python.
- Adjust the Page Size of a Word Document to a Standard Page Size in Python
- Adjust the Page Size of a Word Document to a Custom Page Size in Python
Install Spire.Doc for Python
This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip commands.
pip install Spire.Doc
If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows
Adjust the Page Size of a Word Document to a Standard Page Size in Python
With Spire.Doc for Python, you can easily adjust the page sizes of Word documents to a variety of standard page sizes, such as A3, A4, A5, A6, B4, B5, B6, letter, legal, and tabloid. The following steps explain how to change the page size of a Word document to a standard page size using Spire.Doc for Python:
- Create an instance of the Document class.
- Load a Word document using the Document.LoadFromFile() method.
- Iterate through the sections in the document.
- Set the page size of each section to a standard page size, such as A4, by setting the Section.PageSetup.PageSize property to PageSize.A4().
- Save the result document using the Document.SaveToFile() method.
- Python
from spire.doc import *
from spire.doc.common import *
# Create an instance of the Document class
doc = Document()
# Load a Word document
doc.LoadFromFile("Input.docx")
# Iterate through the sections in the document
for i in range(doc.Sections.Count):
section = doc.Sections.get_Item(i)
# Change the page size of each section to A4
section.PageSetup.PageSize = PageSize.A4()
# Save the result document
doc.SaveToFile("StandardSize.docx", FileFormat.Docx2016)
doc.Close()

Adjust the Page Size of a Word Document to a Custom Page Size in Python
If you plan to print your document on paper with dimensions that don't match any standard paper size, you can change the page size of your document to a custom page size that matches the exact dimensions of the paper. The following steps explain how to change the page size of a Word document to a custom page size using Spire.Doc for Python:
- Create an instance of the Document class.
- Load a Word document using the Document.LoadFromFile() method.
- Create an instance of the SizeF class with customized dimensions.
- Iterate through the sections in the document.
- Set the page size of each section to a custom page size by assigning the SizeF instance to the Section.PageSetup.PageSize property.
- Save the result document using the Document.SaveToFile() method.
- Python
from spire.doc import *
from spire.doc.common import *
# Create an instance of the Document class
doc = Document()
# Load a Word document
doc.LoadFromFile("Input.docx")
# Create an instance of the SizeF class with customized dimensions
customSize = SizeF(600.0, 800.0)
# Iterate through the sections in the document
for i in range(doc.Sections.Count):
section = doc.Sections.get_Item(i)
# Change the page size of each section to the specified dimensions
section.PageSetup.PageSize = customSize
# Save the result document
doc.SaveToFile("CustomSize.docx", FileFormat.Docx2016)
doc.Close()

Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Converting HTML to PDF in Python is a common need when you want to generate printable reports, preserve web content, or create offline documentation with consistent formatting. In this tutorial, you’ll learn how to convert HTML to PDF in Python— whether you're working with a local HTML file or a HTML string. If you're looking for a simple and reliable way to generate PDF files from HTML in Python, this guide is for you.
Install Spire.Doc to Convert HTML to PDF Easily
To convert HTML to PDF in Python, you’ll need a reliable library that supports HTML parsing and PDF rendering. Spire.Doc for Python is a powerful and easy-to-use HTML to PDF converter library that lets you generate PDF documents from HTML content — without relying on a browser, headless engine, or third-party tools.
Install via pip
You can install the library quickly with pip:
pip install spire.doc
Alternative: Manual Installation
You can also download the Spire.Doc package and perform a custom installation if you need more control over the environment.
Tip: Spire.Doc offers a free version suitable for small projects or evaluation purposes.
Once installed, you're ready to convert HTML to PDF in Python in just a few lines of code.
Convert HTML Files to PDF in Python
Spire.Doc for Python makes it easy to convert HTML files to PDF. The Document.LoadFromFile() method supports loading various file formats, including .html, .doc, and .docx. After loading an HTML file, you can convert it to PDF by calling Document.SaveToFile() method. Follow the steps below to convert an HTML file to PDF in Python using Spire.Doc.
Steps to convert an HTML file to PDF in Python:
- Create a Document object.
- Load an HTML file using Document.LoadFromFile() method.
- Convert it to PDF using Document.SaveToFile() method.
The following code shows how to convert an HTML file directly to PDF in Python:
from spire.doc import *
from spire.doc.common import *
# Create a Document object
document = Document()
# Load an HTML file
document.LoadFromFile("Sample.html", FileFormat.Html, XHTMLValidationType.none)
# Save the HTML file to a pdf file
document.SaveToFile("output/ToPdf.pdf", FileFormat.PDF)
document.Close()

Convert an HTML String to PDF in Python
If you want to convert an HTML string to PDF in Python, Spire.Doc for Python provides a straightforward solution. For simple HTML content like paragraphs, text styles, and basic formatting, you can use the Paragraph.AppendHTML() method to insert the HTML into a Word document. Once added, you can save the document as a PDF using the Document.SaveToFile() method.
Here are the steps to convert an HTML string to a PDF file in Python.
- Create a Document object.
- Add a section using Document.AddSection() method and insert a paragraph using Section.AddParagraph() method.
- Specify the HTML string and add it to the paragraph using Paragraph.AppendHTML() method.
- Save the document as a PDF file using Document.SaveToFile() method.
Here's the complete Python code that shows how to convert an HTML string to a PDF:
from spire.doc import *
from spire.doc.common import *
# Create a Document object
document = Document()
# Add a section to the document
sec = document.AddSection()
# Add a paragraph to the section
paragraph = sec.AddParagraph()
# Specify the HTML string
htmlString = """
<html>
<head>
<title>HTML to Word Example</title>
<style>
body {
font-family: Arial, sans-serif;
}
h1 {
color: #FF5733;
font-size: 24px;
margin-bottom: 20px;
}
p {
color: #333333;
font-size: 16px;
margin-bottom: 10px;
}
ul {
list-style-type: disc;
margin-left: 20px;
margin-bottom: 15px;
}
li {
font-size: 14px;
margin-bottom: 5px;
}
table {
border-collapse: collapse;
width: 100%;
margin-bottom: 20px;
}
th, td {
border: 1px solid #CCCCCC;
padding: 8px;
text-align: left;
}
th {
background-color: #F2F2F2;
font-weight: bold;
}
td {
color: #0000FF;
}
</style>
</head>
<body>
<h1>This is a Heading</h1>
<p>This is a paragraph.</p>
<p>Here's an unordered list:</p>
<ul>
<li>Item 1</li>
<li>Item 2</li>
<li>Item 3</li>
</ul>
<p>And here's a table:</p>
<table>
<tr>
<th>Name</th>
<th>Age</th>
<th>Gender</th>
</tr>
<tr>
<td>John Smith</td>
<td>35</td>
<td>Male</td>
</tr>
<tr>
<td>Jenny Garcia</td>
<td>27</td>
<td>Female</td>
</tr>
</table>
</body>
</html>
"""
# Append the HTML string to the paragraph
paragraph.AppendHTML(htmlString)
# Save the document as a pdf file
document.SaveToFile("output/HtmlStringToPdf.pdf", FileFormat.PDF)
document.Close()

Customize the Conversion from HTML to PDF in Python
While converting HTML to PDF in Python is often straightforward, there are times when you need more control over the output. For example, you may want to set a password to protect the PDF document, or embed fonts to ensure consistent formatting across different devices. In this section, you’ll learn how to customize the HTML to PDF conversion using Spire.Doc for Python.
1. Set a Password to Protect the PDF
To prevent unauthorized viewing or editing, you can encrypt the PDF by specifying a user password and an owner password.
# Create a ToPdfParameterList object
toPdf = ToPdfParameterList()
# Set PDF encryption passwords
userPassword = "viewer"
ownerPassword = "E-iceblue"
toPdf.PdfSecurity.Encrypt(userPassword, ownerPassword, PdfPermissionsFlags.Default, PdfEncryptionKeySize.Key128Bit)
# Save as PDF with password protection
document.SaveToFile("/HtmlToPdfWithPassword.pdf", toPdf)
2. Embed Fonts to Preserve Formatting
To ensure the PDF displays correctly across all devices, you can embed all fonts used in the document.
# Create a ToPdfParameterList object
ppl = ToPdfParameterList()
ppl.IsEmbeddedAllFonts = True
# Save as PDF with embedded fonts
document.SaveToFile("/HtmlToPdfWithEmbeddedFonts.pdf", ppl)
These options give you finer control when you convert HTML to PDF in Python, especially for professional document sharing or long-term storage scenarios.
The Conclusion
Converting HTML to PDF in Python becomes simple and flexible with Spire.Doc for Python. Whether you're handling static HTML files or dynamic HTML strings, or need to secure and customize your PDFs, this library provides everything you need — all in just a few lines of code. Get a free 30-day license and start converting HTML to high-quality PDF documents in Python today!
FAQs
Q1: Can I convert an HTML file to PDF in Python? Yes. Using Spire.Doc for Python, you can convert a local HTML file to PDF with just a few lines of code.
Q2: How do I convert HTML to PDF in Chrome? While Chrome allows manual "Save as PDF", it’s not suitable for batch or automated workflows. If you're working in Python, Spire.Doc provides a better solution for programmatically converting HTML to PDF.
Q3: How do I convert HTML to PDF without losing formatting? To preserve formatting:
- Use embedded or inline CSS (not external files).
- Use absolute URLs for images and resources.
- Embed fonts using Spire.Doc options like IsEmbeddedAllFonts(True).
A page break is a markup that divides the content of a document or spreadsheet into multiple pages for printing or display. This feature can be used to adjust the page layout of a document to ensure that each page contains the appropriate information. By placing page breaks appropriately, you can also ensure that your document is presented in a better format and layout when printed. This article will explain how to insert or remove horizontal/vertical page breaks in Excel on the Python platform by using Spire.XLS for Python.
- Insert Horizontal Page Breaks in Excel Using Python
- Insert Vertical Page Breaks in Excel Using Python
- Remove Horizontal Page Breaks from Excel Using Python
- Remove Vertical Page Breaks from Excel Using Python
Install Spire.XLS for Python
This scenario requires Spire.XLS for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip commands.
pip install Spire.XLS
If you are unsure how to install, please refer to this tutorial: How to Install Spire.XLS for Python on Windows
Insert Horizontal Page Breaks in Excel Using Python
Spire.XLS for Python supports inserting horizontal page breaks to specified cell ranges by calling Worksheet.HPageBreaks.Add(CellRange) method. The following are detailed steps.
- Create a Workbook instance.
- Load an Excel file from disk using Workbook.LoadFromFile() method.
- Get a specified worksheet using Workbook.Worksheets[] property.
- Insert horizontal page breaks to specified cell ranges using Worksheet.HPageBreaks.Add(CellRange) method.
- Set view mode to Preview mode by Worksheet.ViewMode property.
- Save the result file using Workbook.SaveToFile() method.
- Python
from spire.xls import * from spire.xls.common import * inputFile = "C:/Users/Administrator/Desktop/Sample.xlsx" outputFile = "C:/Users/Administrator/Desktop/InsertHPageBreak.xlsx" #Create a Workbook instance workbook = Workbook() #Load an Excel file from disk workbook.LoadFromFile(inputFile) #Get the first worksheet of this file sheet = workbook.Worksheets[0] #Insert horizontal page breaks to specified cell ranges sheet.HPageBreaks.Add(sheet.Range["A12"]) sheet.HPageBreaks.Add(sheet.Range["A20"]) #Set view mode to Preview mode sheet.ViewMode = ViewMode.Preview #Save the result file workbook.SaveToFile(outputFile, ExcelVersion.Version2013) workbook.Dispose()

Insert Vertical Page Breaks in Excel Using Python
Spire.XLS for Python also supports inserting vertical page breaks to specified cell ranges by calling Worksheet.VPageBreaks.Add(CellRange) method. The following are detailed steps.
- Create a Workbook instance.
- Load an Excel file from disk using Workbook.LoadFromFile() method.
- Get a specified worksheet using Workbook.Worksheets[] property.
- Insert vertical page breaks to specified cell ranges using Worksheet.VPageBreaks.Add(CellRange) method.
- Set view mode to Preview mode using Worksheet.ViewMode property.
- Save the result file using Workbook.SaveToFile() method.
- Python
from spire.xls import * from spire.xls.common import * inputFile = "C:/Users/Administrator/Desktop/Sample.xlsx" outputFile = "C:/Users/Administrator/Desktop/InsertVPageBreak.xlsx" #Create a Workbook instance workbook = Workbook() #Load an Excel file from disk workbook.LoadFromFile(inputFile) #Get the first worksheet of this file sheet = workbook.Worksheets[0] #Insert vertical page breaks to specified cell ranges sheet.VPageBreaks.Add(sheet.Range["B1"]) sheet.VPageBreaks.Add(sheet.Range["D3"]) #Set view mode to Preview mode sheet.ViewMode = ViewMode.Preview #Save the result file workbook.SaveToFile(outputFile, ExcelVersion.Version2013) workbook.Dispose()

Remove Horizontal Page Breaks from Excel Using Python
If you want to remove horizontal page breaks from Excel, call the Worksheet.HPageBreaks.RemoveAt() or Worksheet.HPageBreaks.Clear() methods. The following are detailed steps.
- Create a Workbook instance.
- Load an Excel file from disk using Workbook.LoadFromFile() method.
- Get a specified worksheet using Workbook.Worksheets[] property.
- Remove all the horizontal page breaks by calling Worksheet.HPageBreaks.Clear() method or remove a specific horizontal page break by calling Worksheet.HPageBreaks.RemoveAt() method.
- Set view mode to Preview mode using Worksheet.ViewMode property.
- Save the result file using Workbook.SaveToFile() method.
- Python
from spire.xls import * from spire.xls.common import * inputFile = "C:/Users/Administrator/Desktop/InsertHPageBreak.xlsx" outputFile = "C:/Users/Administrator/Desktop/RemoveHPageBreak.xlsx" #Create a Workbook instance workbook = Workbook() #Load an Excel file from disk workbook.LoadFromFile(inputFile) #Get the first worksheet from this file sheet = workbook.Worksheets[0] #Clear all the horizontal page breaks #sheet.HPageBreaks.Clear() #Remove the first horizontal page break sheet.HPageBreaks.RemoveAt(0) #Set view mode to Preview mode sheet.ViewMode = ViewMode.Preview #Save the result file workbook.SaveToFile(outputFile, ExcelVersion.Version2013) workbook.Dispose()

Remove Vertical Page Breaks from Excel Using Python
If you want to remove vertical page breaks from Excel, call the Worksheet.VPageBreaks.RemoveAt() or Worksheet.VPageBreaks.Clear() methods. The following are detailed steps.
- Create a Workbook instance.
- Load an Excel file from disk using Workbook.LoadFromFile() method.
- Get a specified worksheet using Workbook.Worksheets[] property.
- Remove all the vertical page breaks by calling Worksheet.VPageBreaks.Clear() method or remove a specific vertical page break by calling Worksheet.VPageBreaks.RemoveAt() method.
- Set view mode to Preview mode using Worksheet.ViewMode property.
- Save the result file using Workbook.SaveToFile() method.
- Python
from spire.xls import * from spire.xls.common import * inputFile = "C:/Users/Administrator/Desktop/InsertVPageBreak.xlsx" outputFile = "C:/Users/Administrator/Desktop/RemoveVPageBreak.xlsx" #Create a Workbook instance workbook = Workbook() #Load an Excel file from disk workbook.LoadFromFile(inputFile) #Get the first worksheet from this file sheet = workbook.Worksheets[0] #Clear all the vertical page breaks #sheet.VPageBreaks.Clear() #Remove the first vertical page break sheet.VPageBreaks.RemoveAt(0) #Set view mode to Preview mode sheet.ViewMode = ViewMode.Preview #Save the result file workbook.SaveToFile(outputFile, ExcelVersion.Version2013) workbook.Dispose()

Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
EPUB, short for Electronic Publication, is a widely used standard format for eBooks. It is an open and free format based on web standards, enabling compatibility with various devices and software applications. EPUB files are designed to provide a consistent reading experience across different platforms, including e-readers, tablets, smartphones, and computers. By converting your Word document to EPUB, you can ensure that your content is accessible and enjoyable to a broader audience, regardless of the devices and software they use. In this article, we will demonstrate how to convert Word documents to EPUB format in Python using Spire.Doc for Python.
Install Spire.Doc for Python
This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip commands.
pip install Spire.Doc
If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows
Convert Word to EPUB in Python
The Document.SaveToFile(fileName:str, fileFormat:FileFormat) method provided by Spire.Doc for Python supports converting a Word document to EPUB format. The detailed steps are as follows.
- Create an object of the Document class.
- Load a Word document using Document.LoadFromFile() method.
- Save the Word document to EPUB format using Document.SaveToFile(fileName:str, fileFormat:FileFormat) method.
- Python
from spire.doc import * from spire.doc.common import * # Specify the input Word document and output EPUB file paths inputFile = "Sample.docx" outputFile = "ToEpub.epub" # Create an object of the Document class doc = Document() # Load a Word document doc.LoadFromFile(inputFile) # Save the Word document to EPUB format doc.SaveToFile(outputFile, FileFormat.EPub) # Close the Document object doc.Close()

Convert Word to EPUB with a Cover Image in Python
Spire.Doc for Python enables you to convert a Word document to EPUB format and set a cover image for the resulting EPUB file by using the Document.SaveToEpub(fileName:str, coverImage:DocPicture) method. The detailed steps are as follows.
- Create an object of the Document class.
- Load a Word document using Document.LoadFromFile() method.
- Create an object of the DocPicture class, and then load an image using DocPicture.LoadImage() method.
- Save the Word document as an EPUB file and set the loaded image as the cover image of the EPUB file using Document.SaveToEpub(fileName:str, coverImage:DocPicture) method.
- Python
from spire.doc import * from spire.doc.common import * # Specify the input Word document and output EPUB file paths inputFile = "Sample.docx" outputFile = "ToEpubWithCoverImage.epub" # Specify the file path for the cover image imgFile = "Cover.png" # Create a Document object doc = Document() # Load the Word document doc.LoadFromFile(inputFile) # Create a DocPicture object picture = DocPicture(doc) # Load the cover image file picture.LoadImage(imgFile) # Save the Word document as an EPUB file and set the cover image doc.SaveToEpub(outputFile, picture) # Close the Document object doc.Close()

Get a Free License
To fully experience the capabilities of Spire.Doc for Python without any evaluation limitations, you can request a free 30-day trial license.
SVG files are commonly used for web graphics and vector-based illustrations because they can be scaled and adjusted easily. PDF, on the other hand, is a versatile format widely supported across different devices and operating systems. Converting SVG to PDF allows for easy sharing of graphics and illustrations, ensuring that recipients can open and view the files without requiring specialized software or worrying about browser compatibility issues. In this article, we will demonstrate how to convert SVG files to PDF format in Python using Spire.PDF for Python.
Install Spire.PDF for Python
This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.
pip install Spire.PDF
If you are unsure how to install, please refer to this tutorial: How to Install Spire.PDF for Python on Windows
Convert SVG to PDF in Python
Spire.PDF for Python provides the PdfDocument.LoadFromSvg() method, which allows users to load an SVG file. Once loaded, users can use the PdfDocument.SaveToFile() method to save the SVG file as a PDF file. The detailed steps are as follows.
- Create an object of the PdfDocument class.
- Load an SVG file using PdfDocument.LoadFromSvg() method.
- Save the SVG file to PDF format using PdfDocument.SaveToFile() method.
- Python
from spire.pdf.common import *
from spire.pdf import *
# Create a PdfDocument object
doc = PdfDocument()
# Load an SVG file
doc.LoadFromSvg("Sample.svg")
# Save the SVG file to PDF format
doc.SaveToFile("ConvertSvgToPdf.pdf", FileFormat.PDF)
# Close the PdfDocument object
doc.Close()

Add SVG to PDF in Python
In addition to converting SVG to PDF directly, Spire.PDF for Python also supports adding SVG files to specific locations in PDF. The detailed steps are as follows.
- Create an object of the PdfDocument class.
- Load an SVG file using PdfDocument.LoadFromSvg() method.
- Create a template based on the content of the SVG file using PdfDocument. Pages[].CreateTemplate() method.
- Get the width and height of the template.
- Create another object of the PdfDocument class and load a PDF file using PdfDocument.LoadFromFile() method.
- Draw the template with a custom size at a specific location in the PDF file using PdfDocument.Pages[].Canvas.DrawTemplate() method.
- Save the result file using PdfDocument.SaveToFile() method.
- Python
from spire.pdf.common import *
from spire.pdf import *
# Create a PdfDocument object
doc1 = PdfDocument()
# Load an SVG file
doc1.LoadFromSvg("Sample.svg")
# Create a template based on the content of the SVG
template = doc1.Pages.get_Item(0).CreateTemplate()
# Get the width and height of the template
width = template.Width
height = template.Height
# Create another PdfDocument object
doc2 = PdfDocument()
# Load a PDF file
doc2.LoadFromFile(""Sample.pdf"")
# Draw the template with a custom size at a specific location on the first page of the loaded PDF file
doc2.Pages.get_Item(0).Canvas.DrawTemplate(template, PointF(10.0, 100.0), SizeF(width*0.8, height*0.8))
# Save the result file
doc2.SaveToFile("AddSvgToPdf.pdf", FileFormat.PDF)
# Close the PdfDocument objects
doc2.Close()
doc1.Close()

Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
In the context of Excel, Open XML refers to the underlying file format used by Excel to store spreadsheet data, formatting, formulas, and other related information. It provides a powerful and flexible basis for working with Excel files programmatically.
By converting Excel to Open XML, developers gain greater control and automation when working with spreadsheet-related tasks. In turn, you can also generate Excel files from Open XML to take advantage of Excel's built-in capabilities to perform advanced data operations. In this article, you will learn how to convert Excel to Open XML or Open XML to Excel in Python using Spire.XLS for Python.
Install Spire.XLS for Python
This scenario requires Spire.XLS for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.
pip install Spire.XLS
If you are unsure how to install, please refer to this tutorial: How to Install Spire.XLS for Python on Windows
Convert Excel to Open XML in Python
Spire.XLS for Python offers the Workbook.SaveAsXml() method to save an Excel file in Open XML format. The following are the detailed steps.
- Create a Workbook object.
- Load an Excel file using Workbook.LoadFromFile() method.
- Save the Excel file in Open XML format using Workbook.SaveAsXml() method.
- Python
from spire.xls import *
from spire.xls.common import *
# Create a Workbook object
workbook = Workbook()
# Load an Excel file
workbook.LoadFromFile("sample.xlsx")
# Save the Excel file in Open XML file format
workbook.SaveAsXml("ExcelToXML.xml")
workbook.Dispose()

Convert Open XML to Excel in Python
To convert an Open XML file to Excel, you need to load the Open XML file through the Workbook.LoadFromXml() method, and then call the Workbook.SaveToFile() method to save it as an Excel file. The following are the detailed steps.
- Create a Workbook object.
- Load an Open XML file using Workbook.LoadFromXml() method.
- Save the Open XML file to Excel using Workbook.SaveToFile() method.
- Python
from spire.xls import *
from spire.xls.common import *
# Create a Workbook object
workbook = Workbook()
# Load an Open XML file
workbook.LoadFromXml("ExcelToXML.xml")
# Save the Open XML file to Excel XLSX format
workbook.SaveToFile("XMLToExcel.xlsx", ExcelVersion.Version2016)
workbook.Dispose()

Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.