Spire.Office Knowledgebase Page 3

Subscribe to this RSS feed

Knowledgebase (2300)

Children categories

JavaScript (51)

View items...

HTML Parsing in C#: Strings, Files, and URLs

2025-09-16 10:10:02 Written by zaki zou

How to Parse HTML in C#

The need to efficiently parse HTML in C# is a common requirement for many development tasks, from web scraping, data extraction to content automation. While .NET offers built-in tools (e.g., HtmlAgilityPack), Spire.Doc simplifies HTML parsing in C# with its intuitive object model and seamless integration.

This guide explores how to leverage Spire.Doc for .NET to parse HTML, including loading HTML from various sources, navigating document structure, and extracting critical data.

Setting Up Spire.Doc
How Spire.Doc Parses HTML
How to Load and Parse HTML Content
Conclusion
Common Questions

Setting Up Spire.Doc

The easiest way to integrate the C# HTML parser library into your project is via NuGet:

Open your project in Visual Studio.
Right-click the project in the Solution Explorer → Select Manage NuGet Packages.
In the NuGet Package Manager, search for Spire.Doc.
Select the latest stable version and click Install.

Alternatively, download the library directly from the E-iceblue website, extract the ZIP file, and reference Spire.Doc.dll in your project.

How Spire.Doc Parses HTML

Spire.Doc converts HTML into a structured object model, where elements like <p>, <a>, and <table> are mapped to classes you can programmatically access. Key components include:

Document: Acts as the container for parsed HTML content.
Section: Represents a block of content (similar to HTML’s <body> or <div> sections).
Paragraph: Maps to HTML block elements like <p>, <h1>, or <li>.
DocumentObject: Base class for all elements within a Paragraph (images, links, etc.).

This model ensures that HTML structures are preserved and accessible via intuitive C# properties and methods.

How to Load and Parse HTML Content

Spire.Doc supports parsing HTML from strings, local files, or even remote URLs (when combined with HTTP clients). Below are detailed examples for each scenario.

Parse an HTML String in C#

Parse an HTML string (e.g., from a web API or database) into Spire.Doc’s object model for inspection.

using Spire.Doc;
using Spire.Doc.Documents;

namespace ParseHtmlString
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a Document object
            Document doc = new Document();

            // Add a section to act as a container
            Section section = doc.AddSection();
            // Add a paragraph
            Paragraph para = section.AddParagraph();

            // Define HTML content to parse
            string htmlContent = @"
                <h2>Sample HTML String</h2>
                <p>This is a paragraph with <strong>bold text</strong> and a <a href='https://www.e-iceblue.com/'>link</a>.</p>
                <ul>
                    <li>List item 1</li>
                    <li>List item 2</li>
                </ul>
            ";

            // Parse the HTML string into the paragraph
            para.AppendHTML(htmlContent);

            // Print all paragraph text
            Console.WriteLine("Parsed HTML Content:");
            Console.WriteLine("---------------------");

            foreach (Paragraph paragraph in section.Paragraphs)
            {
                Console.WriteLine(paragraph.Text);
            }
        }
    }
}

In this code, the method AppendHTML() automatically converts HTML tags to corresponding Spire.Doc objects (e.g., <h1> → Heading1 style, <ul> → list paragraphs).

Output:

Parse an HTML string using C#

Pro Tip: You can also call the SaveToFile() method to convert the HTML string to Word in C#.

Parse an HTML File in C#

For HTML content stored in a file (e.g., downloaded web pages, static HTML reports), load it via LoadFromFile() and then analyze its structure (e.g., extracting headings, paragraphs).

using Spire.Doc;
using Spire.Doc.Documents;

namespace ParseHtmlFile
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a Document object
            Document doc = new Document();

            // Load an HTML file
            doc.LoadFromFile("sample.html", FileFormat.Html);

            // Traverse sections (HTML body blocks)
            foreach (Section section in doc.Sections)
            {
                Console.WriteLine($"Section {doc.Sections.IndexOf(section) + 1}:");
                Console.WriteLine("---------------------------------");

                // Traverse paragraphs in the section
                foreach (Paragraph para in section.Paragraphs)
                {
                    // Print paragraph text and style (e.g., heading level)
                    string styleName = para.StyleName;
                    Console.WriteLine($"[{styleName}] {para.Text}"+ "\n");
                }
                Console.WriteLine();
            }
        }
    }
}

This C# code example loads a local HTML file and then uses the Paragraph.StyleName and Paragraph.Text properties to extract content along with its styling information.

Output:

Load and parse an HTML file using C#

Spire.Doc’s object model allows you to interact with an HTML file just like you would with a Word document. In addition to extracting text content, you can also extract elements like links, tables from HTML.

Parse a URL in C#

To parse HTML from a web page, combine Spire.Doc with HttpClient to fetch the HTML content first, then parse it.

using Spire.Doc;
using Spire.Doc.Documents;

namespace HtmlUrlParsing
{
    class Program
    {
        // HttpClient instance for web requests
        private static readonly HttpClient httpClient = new HttpClient();

        static async Task Main(string[] args)
        {
            try
            {
                // Fetch HTML from a URL
                string url = "https://www.e-iceblue.com/privacypolicy.html";
                Console.WriteLine($"Fetching HTML from: {url}");
                string htmlContent = await FetchHtmlFromUrl(url);

                // Parse the fetched HTML
                Document doc = new Document();
                Section section = doc.AddSection();
                Paragraph paragraph = section.AddParagraph();
                paragraph.AppendHTML(htmlContent);

                // Extract key information
                Console.WriteLine("\nParsed Content Summary:");
                Console.WriteLine($"Sections: {doc.Sections.Count}");
                Console.WriteLine($"Paragraphs: {section.Paragraphs.Count}");
                Console.WriteLine("-------------------------------------------");

                // Extract all heading paragraphs
                foreach (Paragraph para in section.Paragraphs)
                {
                    if (para.StyleName.StartsWith("Heading"))
                    {
                        string headings = para.Text;
                        Console.WriteLine($"Headings: {headings}");
                    }
                }
                
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Error: {ex.Message}");
            }
        }

        // Helper method to fetch HTML from a URL
        private static async Task<string> FetchHtmlFromUrl(string url)
        {
            // Set a user-agent to avoid being blocked by servers
            httpClient.DefaultRequestHeaders.UserAgent.ParseAdd("Mozilla/5.0 (Windows NT 10.0; Win64; x64)");

            // Send GET request and return HTML content
            HttpResponseMessage response = await httpClient.GetAsync(url);
            response.EnsureSuccessStatusCode(); // Throw on HTTP errors (4xx, 5xx)
            return await response.Content.ReadAsStringAsync();
        }
    }
}

This C# code combines web scraping (fetching HTML from a URL) with document parsing (using Spire.Doc) to extract structured information (like headings) from web content. It’s useful for scenarios like content analysis or web data extraction.

Output:

Parse HTML from a web URL using C#

Conclusion

Spire.Doc for .NET provides a comprehensive solution for reading HTML in C# applications. Whether you're working with HTML strings, local files, or even web URLs, this library streamlines the process with intuitive APIs and reliable performance. By following the examples outlined in this guide, you can efficiently integrate HTML parsing capabilities into your .NET projects.

To fully experience the capabilities of Spire.Doc for .NET, request a free 30-day trial license here.

Common Questions

Q1: Why use Spire.Doc for HTML parsing instead of HtmlAgilityPack?

A: Spire.Doc and HtmlAgilityPack serve different primary goals, so the choice depends on your needs:

HtmlAgilityPack: A lightweight library only for parsing and manipulating raw HTML (e.g., extracting tags, fixing invalid HTML). It does not handle document formatting or export to Word.
Spire.Doc: Designed for document manipulation first - it parses HTML and maps it directly to structured Word elements (sections, paragraphs, styles like headings/bold). This is critical if you need to:
- Preserve HTML structure in an editable Word file.
- Extract styled content (e.g., identify "Heading 1" vs. "Normal" text).
- Export parsed HTML to RTF, TXT, PDF, etc.

Q2. How do I convert HTML to Text in C#

A: To convert an HTML file to plain text in C#, get its text content via the GetText() method and then write the result to a .txt file.

// Create a Document object
Document doc = new Document();

// Load an HTML file
doc.LoadFromFile("sample.html", FileFormat.Html);

// Get text from HTML
string text = doc.GetText();

// Write to a text file
File.WriteAllText("HTMLText.txt", text);

Q3: Can Spire.Doc handle malformed or incomplete HTML?

A: Spire.Doc has good error tolerance and can handle imperfect HTML to some extent. However, severely malformed HTML might cause parsing issues. For best results, ensure your HTML is well-formed or use HTML sanitization libraries before parsing with Spire.Doc.

Q3: Can I use Spire.Doc in ASP.NET Core applications?

A: Yes, Spire.Doc is fully compatible with ASP.NET Core applications. The installation and usage process is the same as in other .NET applications.

Published in Document Operation

Tagged under

doc net Operation

How to Convert XML to CSV in Python: A Complete Guide

2025-09-12 07:20:51 Written by zaki zou

Convert XML to CSV Python

XML is a widely used format for storing structured data, but it’s not ideal for analysis or tabular tools like Excel. Converting XML to CSV allows you to work with the data in a simpler, flat format that’s compatible with spreadsheets and data analysis libraries. By converting XML to CSV, you can easily import data into Excel, perform calculations, or feed it into Python data analysis tools like Pandas.

This approach also helps standardize complex hierarchical data into a format that is easier to read, manipulate, and share across different applications. In this tutorial, we’ll explore how to efficiently transform XML files into CSV using Spire.XLS for Python.

Table of Contents:

Setting Up Spire.XLS for Python
Understanding XML Data Structures
How XML Data Is Extracted and Converted
Basic Example: Convert XML to CSV with Python
Advanced Techniques
Troubleshooting and Common Issues
Conclusion
FAQs

Setting Up Spire.XLS for Python

Before we can start, we need to install the Spire.XLS library. The package is available on PyPI , so installation is straightforward. Run the following command:

pip install spire.xls

Once installed, you can import the required classes:

from spire.xls import *
from spire.xls.common import *

Spire.XLS provides the Workbook and Worksheet objects for managing Excel-like files. You’ll use them to create new CSV files and populate them with your XML data.

Understanding XML Data Structures

XML files are organized in a tree-like hierarchy, where elements (or nodes) are enclosed in tags. Each element can contain text, attributes, or even other child elements.

For example, consider this Books.xml:

<catalog>
  <book isbn="9780451524935">
    <title>The Catcher in the Rye</title>
    <author>J.D. Salinger</author>
    <genre>Fiction</genre>
    <reviews>
      <review>Excellent coming-of-age story</review>
      <review>A true classic</review>
    </reviews>
  </book>
  <book isbn="9780439023528">
    <title>The Hunger Games</title>
    <author>Suzanne Collins</author>
    <genre>Adventure</genre>
  </book>
</catalog>

Root node : <catalog> is the top-level container.
Child nodes : Each <book> is a child of <catalog>.
Elements : <title>, <author>, and <genre> are elements inside each <book>.
Attributes : The isbn in <book isbn="..."> is an attribute attached to the book element.
Nested elements : The <reviews> node contains multiple <review> sub-nodes.

Challenges when converting XML to CSV:

Hierarchical data – XML allows nesting (e.g., <reviews> inside <book>), while CSV is flat.
Attributes vs. elements – Data may be stored as an attribute (isbn) or as a tag (title).
Optional fields – Not all <book> elements may contain the same tags, which can lead to missing values in the CSV.

A robust converter must be able to handle these differences and map hierarchical XML into a flat, tabular CSV format.

How XML Data Is Extracted and Converted

To load and parse an XML file in Python, you can use the built-in xml.etree.ElementTree library. This library lets you navigate the XML tree, retrieve elements, and access attributes. For example:

import xml.etree.ElementTree as ET

# Load the XML file
tree = ET.parse("Books.xml")
root = tree.getroot()

# Iterate through elements
for book in root.findall("book"):
    title = book.findtext("title", "")
    author = book.findtext("author", "")
    isbn = book.attrib.get("isbn", "")

After extracting the XML data, the next step is to map it to a tabular structure . Using Spire.XLS for Python , you can create a workbook, write the extracted values into worksheet cells, and finally export the worksheet as a CSV file for easy analysis.

General Steps to Convert XML to CSV in Python

Use xml.etree.ElementTree to load and retrieve data from the XML file.
Create a Workbook object using Spire.XLS.
Add a worksheet with Workbook.Worksheets.Add() .
Write extracted XML data into the worksheet using Worksheet.SetValue() .
Save the worksheet to a CSV file using Worksheet.SaveToFile() .

Basic Example: Convert XML to CSV with Python

Let’s start with a basic XML-to-CSV conversion. This example automatically generates headers by inspecting the first <book> element and then exports all child nodes into CSV.

from spire.xls import *
from spire.xls.common import *
import xml.etree.ElementTree as ET

# Create a Workbook object
workbook = Workbook()

# Remove default worksheets
workbook.Worksheets.Clear()

# Add a worksheet and name it
worksheet = workbook.Worksheets.Add("Books")

# Load an XML file
xml_tree = ET.parse("C:\\Users\\Administrator\\Desktop\\Books.xml")

# Get the root element of the XML tree
xml_root = xml_tree.getroot()

# Get the first the "book" element
first_book = xml_root.find("book")

# Extract header information and convert it into a list
header = list(first_book.iter())[1:]  

# Write header to Excel
for col_index, header_node in enumerate(header, start=1):
    header_text = header_node.tag
    worksheet.SetValue(1, col_index, header_text)

# Write other data to Excel by iterating over each book element and each data node within it
row_index = 2
for book in xml_root.iter("book"):
    for col_index, data_node in enumerate(list(book.iter())[1:], start=1):  
        value = data_node.text
        header_text = list(header[col_index - 1].iter())[0].tag
        worksheet.SetValue(row_index, col_index, value)
    row_index += 1

# Save the document to an Excel file
worksheet.SaveToFile("output/XmlToCsv.csv", ",", Encoding.get_UTF8())

# Dispose resources
workbook.Dispose()

This script works well for simple, flat XML structures. It automatically generates headers (title, author, genre, price) and fills each row with corresponding values.

Output:

Python Convert XML to CSV

You might also be interested in: Convert XML to Excel and PDF in Python

Advanced Techniques

The basic script works in many cases, but XML often isn’t so simple. Let’s go through some advanced techniques to handle real-world scenarios.

Exporting Only Specific Elements

Sometimes your XML has more data than you need. Maybe you only want to export title and author, ignoring everything else.

from spire.xls import *
from spire.xls.common import *
import xml.etree.ElementTree as ET

# Create a Workbook object
workbook = Workbook()

# Remove default worksheets
workbook.Worksheets.Clear()

# Add a worksheet and name it
worksheet = workbook.Worksheets.Add("Books")

# Load the XML file
xml_tree = ET.parse(r"C:\Users\Administrator\Desktop\Books.xml")
xml_root = xml_tree.getroot()

# Define the elements you want to export
selected_elements = ["title", "author"]

# Write header
for col_index, tag in enumerate(selected_elements, start=1):
    worksheet.SetValue(1, col_index, tag)

# Write data
row_index = 2
for book in xml_root.iter("book"):
    for col_index, tag in enumerate(selected_elements, start=1):
        # Use findtext to handle missing values safely
        worksheet.SetValue(row_index, col_index, book.findtext(tag, ""))
    row_index += 1

# Save the document to a CSV file
worksheet.SaveToFile("output/XmlToCsv_Selected.csv", ",", Encoding.get_UTF8())

# Dispose resources
workbook.Dispose()

This approach ensures your CSV only contains the columns you care about.

Output:

Export specific elements from XML to CSV

Including XML Attributes in CSV

What if your XML contains important data stored as attributes, such as isbn? You can easily include them:

from spire.xls import *
from spire.xls.common import *
import xml.etree.ElementTree as ET

# Create a Workbook object
workbook = Workbook()

# Remove default worksheets
workbook.Worksheets.Clear()

# Add a worksheet and name it
worksheet = workbook.Worksheets.Add("Books")

# Load an XML file
xml_tree = ET.parse(r"C:\Users\Administrator\Desktop\Books.xml")

# Get the root element of the XML tree
xml_root = xml_tree.getroot()

# Get the first the "book" element
first_book = xml_root.find("book")

# Extract header information (child nodes)
header = list(first_book.iter())[1:]

# Write header to Excel
worksheet.SetValue(1, 1, "isbn")   # <-- Add ISBN column first
for col_index, header_node in enumerate(header, start=2):  # start at 2 now
    header_text = header_node.tag
    worksheet.SetValue(1, col_index, header_text)

# Write data
row_index = 2
for book in xml_root.iter("book"):
    # Write isbn as text
    isbn_value = book.attrib.get("isbn", "")
    worksheet.Range[row_index, 1].Text = isbn_value

    # Then write other fields
    for col_index, data_node in enumerate(list(book.iter())[1:], start=2):
        value = data_node.text
        worksheet.SetValue(row_index, col_index, value)

    row_index += 1

# Format the whole ISBN column as text to prevent scientific notation
last_row = row_index - 1   
isbn_range = f"A2:A{last_row}"
worksheet.Range[isbn_range].NumberFormat = "@"

# Save the document to an Excel file (CSV format)
worksheet.SaveToFile("output/XmlToCsv_WithAttributes.csv", ",", Encoding.get_UTF8())

# Dispose resources
workbook.Dispose()

Here we explicitly create an ISBN column, extract it from each <book>’s attributes, and format it as text to prevent Excel from displaying it in scientific notation.

Output:

Include attributes when converting XML to CSV

Handling Nested XML Structures

Nested nodes are common. Suppose your XML looks like this:

<catalog>
  <book>
    <title>1984</title>
    <author>George Orwell</author>
    <reviews>
      <review>Excellent read!</review>
      <review>Still relevant today.</review>
    </reviews>
  </book>
</catalog>

Here’s how to collapse multiple <review> entries into a single CSV column:

from spire.xls import *
from spire.xls.common import *
import xml.etree.ElementTree as ET

# Create a Workbook object
workbook = Workbook()

# Remove default worksheets
workbook.Worksheets.Clear()

# Add a worksheet and name it
worksheet = workbook.Worksheets.Add("Books")

# Load an XML file
xml_tree = ET.parse(r"C:\Users\Administrator\Desktop\Nested.xml")
xml_root = xml_tree.getroot()

# Get the first <book> element
first_book = xml_root.find("book")

# Collect headers (auto-detected)
header = []
for child in first_book:
    if child.tag == "reviews":  
        header.append("reviews")   # collapse nested <review> into one column
    else:
        header.append(child.tag)

# Write header row
for col_index, header_text in enumerate(header, start=1):
    worksheet.SetValue(1, col_index, header_text)

# Write data rows
row_index = 2
for book in xml_root.iter("book"):
    col_index = 1
    for child in book:
        if child.tag == "reviews":
            # Join all <review> texts into a single cell
            reviews = [r.text for r in child.findall("review") if r.text]
            worksheet.SetValue(row_index, col_index, "; ".join(reviews))
        else:
            worksheet.SetValue(row_index, col_index, child.text if child.text else "")
        col_index += 1
    row_index += 1

# Save to CSV
worksheet.SaveToFile("output/XmlToCsv_WithReviews.csv", ",", Encoding.get_UTF8())

# Dispose resources
workbook.Dispose()

Output:

Convert Nested XML to CSV

Spire.XLS not only supports importing data from standard XML files into Excel or CSV, but also allows converting OpenXML (Microsoft's XML-based file format) to Excel. If you're interested, check out this tutorial: How to Convert Excel to OpenXML and OpenXML to Excel in Python.

Troubleshooting and Common Issues

Even with a well-structured script, you may encounter some common issues when converting XML to CSV:

Scientific notation in Excel
- Problem: Long numeric strings like ISBNs may appear as 9.78045E+12 instead of the full number.
- Solution: Format the column as text before saving, for example:
```
worksheet.Range["A2:A{last_row}"].NumberFormat = "@"
```
Missing values causing errors
- Problem: Some <book> elements may lack optional fields (e.g., <genre>). Attempting to access .text directly can cause errors.
- Solution: Use findtext(tag, "") to safely provide a default empty string.
Incomplete or unexpected headers
- Problem: If you generate headers from only the first <book>, you might miss fields that appear later in the XML.
- Solution: Scan multiple elements (or the entire dataset) to build a complete header list before writing data.
Encoding issues
- Problem: Special characters (such as accents or symbols) may not display correctly in the CSV.
- Solution: Always save with UTF-8 encoding:
```
worksheet.SaveToFile("output.csv",",", Encoding.get_UTF8())
```

Conclusion

Converting XML to CSV in Python doesn’t have to be painful. With Spire.XLS for Python, you can automate much of the process, including header generation, handling attributes, and flattening nested nodes. Whether you’re exporting only a few fields, working with complex hierarchies, or cleaning up messy XML, Spire.XLS gives you the flexibility to handle it.

By integrating these approaches into your workflow, you can turn structured XML datasets into clean, analysis-ready CSV files in just a few lines of code.

FAQs

Q1: Can I export directly to Excel (.xlsx) instead of CSV?

Yes. Simply use workbook.SaveToFile("output.xlsx", ExcelVersion.Version2016).

Q2: How do I handle very large XML files?

Use Python’s iterparse() from xml.etree.ElementTree to stream large files instead of loading them fully into memory.

Q3: What if some <book> elements contain additional tags?

You can enhance the header-building step to scan all <book> nodes and collect unique tags dynamically.

Q4: Can I customize the CSV delimiter (e.g., use ; instead of ,)?

Yes. When calling SaveToFile(), replace the delimiter argument:

worksheet.SaveToFile("output.csv", ";", Encoding.get_UTF8())

Q5: How do I export nested XML structures (e.g., multiple <review> nodes)?

Flatten them into a single cell by joining values. For example:

reviews = [r.text for r in book.find("reviews").findall("review")]
worksheet.SetValue(row_index, col_index, "; ".join(reviews))

Get a Free License

To fully experience the capabilities of Spire.XLS for Python without any evaluation limitations, you can request a free 30-day trial license.

Published in Conversion

Tagged under

xls Python Conversion

Format Excel with Python: From Basics to Professional Reports

2025-09-11 09:13:41 Written by zaki zou

Excel formatting with Python tutorial illustration

When working with spreadsheets, readability is just as important as the data itself. A well-formatted Excel file makes it easier to analyze, present, and share information. Instead of manually adjusting styles in Excel, you can use Python for Excel formatting to automate the process and save significant time.

This tutorial shows you how to format Excel with Python using the library Spire.XLS for Python. We’ll cover basic styling, advanced formatting, and practical use cases, while also explaining the key classes and properties that make Excel formatting in Python efficient.

Here's What's Covered:

Why Use Python for Excel Formatting
Setting Up the Environment and Project
Basic Excel Formatting in Python
Extended Excel Formatting in Python
Key APIs for Excel Styling in Python
Use Case: Formatting an Excel Report with Python
Conclusion
FAQ

Why Use Python for Excel Formatting

Formatting Excel manually is time-consuming, especially when handling large datasets or generating reports dynamically. By using Python Excel formatting, you can:

Apply consistent formatting to multiple workbooks.
Automate repetitive tasks like setting fonts, borders, and colors.
Generate styled reports programmatically for business or research.
Save time while improving accuracy and presentation quality.

With Python, you can quickly build scripts that apply professional-looking styles to your spreadsheets. Next, let’s see how to set up the environment.

Setting Up the Environment and Project

To follow this tutorial, you need to install Spire.XLS for Python, a library designed for working with Excel files. It supports creating, reading, modifying, and formatting Excel documents programmatically.

Install Spire.XLS for Python

Install the library via pip:

pip install Spire.XLS

Then import it in your Python script:

from spire.xls import *

Creating or Loading an Excel Workbook

Before we start formatting, we need a workbook to work with.

Create a new workbook:

workbook = Workbook()
sheet = workbook.Worksheets[0]

Or load an existing file:

workbook = Workbook()
workbook.LoadFromFile("input.xlsx")
sheet = workbook.Worksheets[0]

After applying formatting, save the result:

workbook.SaveToFile("output/formatted_output.xlsx", ExcelVersion.Version2016)

With the workbook ready, let’s move on to formatting examples.

Basic Excel Formatting in Python

Before diving into advanced operations, it’s important to master the fundamental Excel formatting features in Python. These basic techniques—such as fonts, alignment, borders, background colors, and adjusting column widths or row heights—are the building blocks of clear, professional spreadsheets. Once familiar with them, you can combine and extend these methods to create more complex styles later.

1. Formatting Fonts

Changing font properties is one of the most frequent tasks when working with styled Excel sheets. In Spire.XLS for Python, font settings are accessed through the CellRange.Style.Font object, which lets you control the typeface, size, color, and emphasis (bold, italic, underline).

cell = sheet.Range[2, 2]
cell.Text = "Python Excel Formatting"

cell.Style.Font.FontName = "Arial"
cell.Style.Font.Size = 14
cell.Style.Font.Color = Color.get_Blue()
cell.Style.Font.IsBold = True

This modifies the text appearance directly within the cell by adjusting its style attributes.

2. Alignment and Wrapping

Cell alignment is managed through the HorizontalAlignment and VerticalAlignment properties of the Style object. In addition, the WrapText property ensures that longer text fits within a cell without overflowing.

cell = sheet.Range[4, 2]
cell.Text = "This text is centered and wrapped."

cell.Style.HorizontalAlignment = HorizontalAlignType.Center
cell.Style.VerticalAlignment = VerticalAlignType.Center
cell.Style.WrapText = True

This produces neatly centered text that remains readable even when it spans multiple lines.

3. Adding Borders

Borders are defined through the Borders collection on the Style object, where you can set the line style and color for each edge individually.

cell = sheet.Range[6, 2]
cell.Text = "Border Example"

cell.Style.Borders[BordersLineType.EdgeBottom].LineStyle = LineStyleType.Thin
cell.Style.Borders[BordersLineType.EdgeBottom].Color = Color.get_Black()

This adds a thin black line to separate the cell from the content below.

4. Background Colors

Cell background is controlled by the Style.Color property. This is often used to highlight headers or important values.

cell = sheet.Range[8, 2]
cell.Text = "Highlighted Cell"
cell.Style.Color = Color.get_Yellow()

This makes the cell stand out in the worksheet.

5. Setting Column Widths and Row Heights

Besides styling text and borders, adjusting the column widths and row heights ensures that your content fits properly without overlapping or leaving excessive blank space.

# Set specific column width
sheet.Columns[1].ColumnWidth = 20
# Set specific row height
sheet.Rows[7].RowHeight = 20

In addition to specifying widths and heights, Excel rows and columns can also be automatically adjusted to fit their content:

Use Worksheet.AutoFitColumn(columnIndex) or Worksheet.AutoFitRow(rowIndex) to adjust a specific column or row.
Use CellRange.AutoFitColumns() or CellRange.AutoFitRows() to adjust all columns or rows in a selected range.

This helps create well-structured spreadsheets where data is both readable and neatly aligned.

Preview of Basic Formatting Result

Here’s what the basic Excel formatting looks like in practice:

Basic Excel formatting with fonts, alignment, borders, background colors, and adjusted column widths

In addition to these visual styles, you can also customize how numbers, dates, or currency values are displayed—see How to Set Excel Number Format in Python for details.

Extended Excel Formatting in Python

Instead of formatting individual cells, you can also work with ranges and reusable styles. These operations typically involve the CellRange object for merging and sizing, and the Workbook.Styles collection for creating reusable Style definitions.

1. Merging Cells

Merging cells is often used to create report titles or section headers that span multiple columns.

range = sheet.Range[2, 2, 2, 4]
range.Merge()
range.Text = "Quarterly Report"
range.Style.HorizontalAlignment = HorizontalAlignType.Center
range.RowHeight = 30

Here, three columns (B2:D2) are merged into one, styled as a bold, centered title with a shaded background—perfect for highlighting key sections in a report.

2. Applying Built-in Styles

Excel comes with a set of predefined styles that can be applied quickly. In Spire.XLS, you can directly assign these built-in styles to ranges.

range.BuiltInStyle = BuiltInStyles.Heading1

This instantly applies Excel’s built-in Heading 1 style, giving the range a consistent and professional appearance without manually setting fonts, borders, or colors.

3. Creating a Custom Style and Applying It

Creating a custom style is useful when you need to apply the same formatting rules across multiple cells or sheets. The Workbook.Styles.Add() method allows you to define a named style that can be reused throughout the workbook.

# Create a custom style
custom_style = workbook.Styles.Add("CustomStyle")
custom_style.Font.FontName = "Calibri"
custom_style.Font.Size = 12
custom_style.Font.Color = Color.get_DarkGreen()
custom_style.Borders[BordersLineType.EdgeTop].LineStyle = LineStyleType.MediumDashDot

# Apply style to a cell
cell = sheet.Range[4, 2]
cell.Text = "Custom Style Applied"
cell.Style = custom_style

This method makes Excel style management in Python more efficient, especially when working with large datasets.

Preview of Advanced Formatting Result

Below is an example showing the results of merged cells, built-in styles, and a custom style applied:

Advanced Excel formatting with merged cells, built-in heading style, and custom reusable style

For highlighting data patterns and trends, you can also use conditional rules—see How to Apply Conditional Formatting in Python Excel.

Key APIs for Excel Styling in Python

After exploring both basic and advanced formatting examples, it’s helpful to step back and look at the core classes, properties, and methods that make Excel styling in Python possible. Understanding these elements will give you the foundation to write scripts that are flexible, reusable, and easier to maintain.

The following table summarizes the most important classes, properties, and methods for formatting Excel with Python. You can use it as a quick reference whenever you need to recall the key styling options.

Class / Property / Method	Description
Workbook / Worksheet	Represent Excel files and individual sheets.
Workbook.LoadFromFile() / SaveToFile()	Load and save Excel files.
Workbook.Styles.Add()	Create and define a custom reusable style that can be applied across cells.
CellRange	Represents one or more cells; used for applying styles or formatting.
CellRange.Style	Represents formatting information (font, alignment, borders, background, text wrapping, etc.).
CellRange.Merge()	Merge multiple cells into a single cell.
CellRange.BuiltInStyle	Apply one of Excel’s predefined built-in styles (e.g., Heading1).
CellRange.BorderAround() / BorderInside()	Apply border formatting to the outside or inside of a range.
CellRange.ColumnWidth / RowHeight	Adjust the width of columns and the height of rows.
CellRange.NumberFormat	Defines the display format for numbers, dates, or currency.

With these building blocks, you can handle common formatting tasks—such as fonts, borders, alignment, colors, and number formats—in a structured and highly customizable way. This ensures that your spreadsheets look professional, remain easy to manage, and can be consistently styled across different Python automation projects. For more details, see the Spire.XLS for Python API reference.

Real-World Use Case: Formatting an Annual Excel Sales Report with Python

Now that we’ve explored both basic and advanced Excel formatting techniques, let’s apply them in a real-world scenario. The following example demonstrates how to generate a comprehensive Excel annual sales report with Python. By combining structured data, regional breakdowns, and advanced formatting, the report becomes much easier to interpret and presents business data in a clear and professional way.

from spire.xls import *

workbook = Workbook()
sheet = workbook.Worksheets[0]
sheet.Name = "Sales Report"

# Report title
title = sheet.Range[1, 1, 1, 7]
title.Merge()
title.Text = "Annual Sales Report - 2024"
title.Style.Font.IsBold = True
title.Style.Font.Size = 16
title.Style.HorizontalAlignment = HorizontalAlignType.Center
title.Style.Color = Color.get_LightGray()
title.RowHeight = 30

# Data
data = [
    ["Product", "Region", "Q1", "Q2", "Q3", "Q4", "Total"],
    ["Laptop", "North", 1200, 1500, 1300, 1600, 5600],
    ["Laptop", "South", 1000, 1200, 1100, 1300, 4600],
    ["Tablet", "North", 800, 950, 1000, 1200, 3950],
    ["Tablet", "South", 700, 850, 900, 1000, 3450],
    ["Phone", "North", 2000, 2200, 2100, 2500, 8800],
    ["Phone", "South", 1800, 1900, 2000, 2200, 7900],
    ["Accessories", "All", 600, 750, 720, 900, 2970],
    ["", "", "", "", "", "Grand Total", 39370]
]

for r in range(len(data)):
    for c in range(len(data[r])):
        sheet.Range[r+2, c+1].Text = str(data[r][c])

# Header formatting
header = sheet.Range[2, 1, 2, 7]
header.Style.Font.IsBold = True
header.Style.Color = Color.get_LightBlue()
header.Style.HorizontalAlignment = HorizontalAlignType.Center
header.Style.Borders[BordersLineType.EdgeBottom].LineStyle = LineStyleType.Thin

# Numeric columns as currency
for row in range(3, 10):
    for col in range(3, 8):
        cell = sheet.Range[row, col]
        if cell.Text.isdigit():
            cell.NumberValue = float(cell.Text)
            cell.NumberFormat = "$#,##0"

# Highlight totals
grand_total = sheet.Range[10, 7]
grand_total.Style.Color = Color.get_LightYellow()
grand_total.Style.Font.IsBold = True

# Freeze the first row and the first two columns
sheet.FreezePanes(2, 3)

# Auto fit columns
sheet.AllocatedRange.AutoFitColumns()

workbook.SaveToFile("output/annual_sales_report.xlsx", ExcelVersion.Version2016)

This script combines several formatting techniques—such as a merged and styled title, bold headers with borders, currency formatting, highlighted totals, and frozen panes—into a single workflow. Applying these styles programmatically improves readability and ensures consistency across every report, which is especially valuable for business and financial data.

Here’s the final styled annual sales report generated with Python:

Annual Excel sales report with a styled title, bold headers, regional breakdown, currency formatting, highlighted grand total, and frozen panes

Conclusion

Formatting Excel with Python is a practical way to automate report generation and ensure professional presentation of data. By combining basic styling with advanced techniques like custom styles and column adjustments, you can create clear, consistent, and polished spreadsheets.

Whether you are working on financial reports, research data, or business dashboards, formatting Excel with Python helps you save time while maintaining presentation quality. With the right use of styles, properties, and formatting options, your spreadsheets will not only contain valuable data but also deliver it in a visually effective way.

You can apply for a free temporary license to unlock the full functionality of Spire.XLS for Python or try Free Spire.XLS for Python to get started quickly.

FAQ - Excel Fpormatting in Python

Q1: Can you format Excel with Python?

Yes. Python provides libraries that allow you to apply fonts, colors, borders, alignment, conditional formatting, and more to Excel files programmatically.

Q2: How to do formatting in Python?

For Excel formatting, you can use a library such as Spire.XLS for Python. It lets you change fonts, set alignment, adjust column widths, merge cells, and apply custom or built-in styles through code.

Q3: Can I use Python to edit Excel?

Yes. Python can not only format but also create, read, modify, and save Excel files, making it useful for dynamic reporting and data automation.

Q4: What’s the best way to automate repeated Excel styling tasks?

Define custom styles or reusable functions that standardize formatting rules across multiple workbooks and sheets. This ensures consistency and saves time.

Published in Cells

Tagged under

xls Python Cells

News Category

Knowledgebase (2300)

Children categories

Setting Up Spire.Doc

How Spire.Doc Parses HTML

How to Load and Parse HTML Content

Parse an HTML String in C#

Parse an HTML File in C#

Parse a URL in C#

Conclusion

Common Questions

Q1: Why use Spire.Doc for HTML parsing instead of HtmlAgilityPack?

Q2. How do I convert HTML to Text in C#

Q3: Can Spire.Doc handle malformed or incomplete HTML?

Q3: Can I use Spire.Doc in ASP.NET Core applications?

Setting Up Spire.XLS for Python

Understanding XML Data Structures

How XML Data Is Extracted and Converted

Basic Example: Convert XML to CSV with Python

Advanced Techniques

Exporting Only Specific Elements

Including XML Attributes in CSV

Handling Nested XML Structures

Troubleshooting and Common Issues

Conclusion

FAQs

Q1: Can I export directly to Excel (.xlsx) instead of CSV?

Q2: How do I handle very large XML files?

Q3: What if some <book> elements contain additional tags?

Q4: Can I customize the CSV delimiter (e.g., use ; instead of ,)?

Q5: How do I export nested XML structures (e.g., multiple <review> nodes)?

Get a Free License

Why Use Python for Excel Formatting

Setting Up the Environment and Project

Install Spire.XLS for Python

Creating or Loading an Excel Workbook

Basic Excel Formatting in Python

1. Formatting Fonts

2. Alignment and Wrapping

3. Adding Borders

4. Background Colors

5. Setting Column Widths and Row Heights

Preview of Basic Formatting Result

Extended Excel Formatting in Python

1. Merging Cells

2. Applying Built-in Styles

3. Creating a Custom Style and Applying It

Preview of Advanced Formatting Result

Key APIs for Excel Styling in Python

Real-World Use Case: Formatting an Annual Excel Sales Report with Python

Conclusion

FAQ - Excel Fpormatting in Python

Q1: Can you format Excel with Python?

Q2: How to do formatting in Python?

Q3: Can I use Python to edit Excel?

Q4: What’s the best way to automate repeated Excel styling tasks?

More...