Spire.Office Knowledgebase Page 1

Subscribe to this RSS feed

Knowledgebase (2300)

Children categories

JavaScript (51)

View items...

Convert CSV to List in Python: Quick & Simple Guide

2025-11-07 06:08:12 Written by zaki zou

Convert CSV to lists and dictionaries through Python

CSV (Comma-Separated Values) is a universal file format for storing tabular data, while lists are Python’s fundamental data structure for easy data manipulation. Converting CSV to lists in Python enables seamless data processing, analysis, and integration with other workflows. While Python’s built-in csv module works for basic cases, Spire.XLS for Python simplifies handling structured CSV data with its intuitive spreadsheet-like interface.

This article will guide you through how to use Python to read CSV into lists (and lists of dictionaries), covering basic to advanced scenarios with practical code examples.

Table of Contents:

Why Choose Spire.XLS for CSV to List Conversion?
Basic Conversion: CSV to Python List
Advanced: Convert CSV to List of Dictionaries
Handle Special Scenarios
- CSV with Custom Delimiters
- Clean Empty Values
Conclusion
Frequently Asked Questions

Why Choose Spire.XLS for CSV to List Conversion?

Spire.XLS is a powerful library designed for spreadsheet processing, and it excels at CSV handling for several reasons:

Simplified Indexing: Uses intuitive 1-based row/column indexing (matching spreadsheet logic).
Flexible Delimiters: Easily specify custom separators (commas, tabs, semicolons, etc.).
Structured Access: Treats CSV data as a worksheet, making row/column traversal straightforward.
Robust Data Handling: Automatically parses numbers, dates, and strings without extra code.

Installation

Before starting, install Spire.XLS for Python using pip:

pip install Spire.XLS

This command installs the latest stable version, enabling immediate use in your projects.

Basic Conversion: CSV to Python List

If your CSV file has no headers (pure data rows), Spire.XLS can directly read rows and convert them to a list of lists (each sublist represents a CSV row).

Step-by-Step Process:

Import the Spire.XLS module.
Create a Workbook object and load the CSV file.
Access the first worksheet (Spire.XLS parses CSV into a worksheet).
Traverse rows and cells, extracting values into a Python list.

CSV to List Python Code Example:

from spire.xls import *
from spire.xls.common import *

# Initialize Workbook and load CSV
workbook = Workbook()
workbook.LoadFromFile("Employee.csv",",")

# Get the first worksheet
sheet = workbook.Worksheets[0]

# Convert sheet data to a list of lists
data_list = []
for i in range(sheet.Rows.Length):
    row = []
    for j in range(sheet.Columns.Length):
        cell_value = sheet.Range[i + 1, j + 1].Value
        row.append(cell_value)
    data_list.append(row)

# Display the result
for row in data_list:
    print(row)

# Dispose resources
workbook.Dispose()

Output:

Convert a CSV file to a list via Python code

If you need to convert the list back to CSV, refer to: Python List to CSV: 1D/2D/Dicts – Easy Tutorial

Advanced: Convert CSV to List of Dictionaries

For CSV files with headers (e.g., name,age,city), converting to a list of dictionaries (where keys are headers and values are row data) is more intuitive for data manipulation.

CSV to Dictionary Python Code Example:

from spire.xls import *

# Initialize Workbook and load CSV
workbook = Workbook()
workbook.LoadFromFile("Customer_Data.csv", ",")

# Get the first worksheet
sheet = workbook.Worksheets[0]

# Extract headers (first row)
headers = []
for j in range(sheet.Columns.Length):
    headers.append(sheet.Range[1, j + 1].Value)

# Convert data rows to list of dictionaries
dict_list = []
for i in range(1, sheet.Rows.Length):  # Skip header row
    row_dict = {}
    for j in range(sheet.Columns.Length):
        key = headers[j]
        value = sheet.Range[i + 1, j + 1].Value
        row_dict[key] = value
    dict_list.append(row_dict)

# Output the result
for record in dict_list:
    print(record)

workbook.Dispose()

Explanation

Load the CSV: Use LoadFromFile() method of Workbook class.
Extracting Headers: Pull the first row of the worksheet to use as dictionary keys.
Map Rows to Dictionaries: For each data row (skipping the header row), create a dictionary where keys are headers and values are cell contents.

Output:

Convert a CSV file to a list of dictionaries via Python code

Handle Special Scenarios

CSV with Custom Delimiters (e.g., Tabs, Semicolons)

To process CSV files with delimiters other than commas (e.g., tab-separated TSV files), specify the delimiter in LoadFromFile:

# Load a tab-separated file
workbook.LoadFromFile("data.tsv", "\t")

# Load a semicolon-separated file 
workbook.LoadFromFile("data_eu.csv", ";")

Clean Empty Values

Empty cells in the CSV are preserved as empty strings ('') in the list. To replace empty strings with a custom value (e.g., "N/A"), modify the cell value extraction:

cell_value = sheet.Range[i + 1, j + 1].Value or "N/A"

Conclusion

Converting CSV to lists in Python using Spire.XLS is efficient, flexible, and beginner-friendly. Whether you need a list of lists for raw data or a list of dictionaries for structured analysis, this library handles parsing, indexing, and resource management efficiently. By following the examples above, you can integrate this conversion into data pipelines, analysis scripts, or applications with minimal effort.

For more advanced features (e.g., CSV to Excel conversion, batch processing), you can visit the Spire.XLS for Python documentation.

Frequently Asked Questions

Q1: Is Spire.XLS suitable for large CSV files?

A: Spire.XLS handles large files efficiently, but for very large datasets (millions of rows), consider processing in chunks or using specialized big data tools. For typical business datasets, it performs excellently.

Q2: How does this compare to using pandas for CSV to list conversion?

A: Spire.XLS offers more control over the parsing process and doesn't require additional data science dependencies. While pandas is great for analysis, Spire.XLS is ideal when you need precise control over CSV parsing or are working in environments without pandas.

Q3: How do I handle CSV files with headers when converting to lists?

A: For headers, use the dictionary conversion method. Extract the first row as headers, then map subsequent rows to dictionaries where keys are header values. This preserves column meaning and enables easy data access by column name.

Q4: How do I convert only specific columns from my CSV to a list?

A: Modify the inner loop to target specific columns:

# Convert only columns 1 and 3 (index 0 and 2)
target_columns = [0, 2]
for i in range(sheet.Rows.Length):
    row = []
    for j in target_columns:
        cell_value = sheet.Range[i + 1, j + 1].Value
        row.append(cell_value)
    data_list.append(row)

Published in Conversion

Tagged under

xls Python Conversion

Parse HTML in Java: Read Files, Fetch URLs & Extract Data

2025-10-28 08:33:11 Written by zaki zou

Learn how to read or parse HTML in Java

HTML parsing is a critical task in Java development, enabling developers to extract structured data, analyze content, and interact with web-based information. Whether you’re building a web scraper, validating HTML content, or extracting text and attributes from web pages, having a reliable tool simplifies the process. In this guide, we’ll explore how to parse HTML in Java using Spire.Doc for Java - a powerful library that combines robust HTML parsing with seamless document processing capabilities.

Why Use Spire.Doc for Java for HTML Parsing
Environment Setup & Installation
Core Guide: Parsing HTML to Extract Elements in Java
- 1. Extract Text from HTML in Java
- 2. Extract Table Data from HTML in Java
Advanced Scenarios: Parse HTML Files & URLs in Java
- 1. Read an HTML File in Java
- 2. Parse a URL in Java
FAQ About Parsing HTML

Why Use Spire.Doc for Java for HTML Parsing

While there are multiple Java libraries for HTML parsing (e.g., Jsoup), Spire.Doc stands out for its seamless integration with document processing and low-code workflow, which is critical for developers prioritizing efficiency. Here’s why it’s ideal for Java HTML parsing tasks:

Intuitive Object Model: Converts HTML into a navigable document structure (e.g., Section, Paragraph, Table), eliminating the need to manually parse raw HTML tags.
Comprehensive Data Extraction: Easily retrieve text, attributes, table rows/cells, and even styles (e.g., headings) without extra dependencies.
Low-Code Workflow: Minimal code is required to load HTML content and process it—reducing development time for common tasks.
Lightweight Integration: Simple to add to Java projects via Maven/Gradle, with minimal dependencies.

Environment Setup & Installation

To start reading HTML in Java, ensure your environment meets these requirements:

Java Development Kit (JDK): Version 8 or higher (JDK 11+ recommended for HttpClient support in URL parsing).
Spire.Doc for Java Library: Latest version (integrated via Maven or manual download).
HTML Source: A sample HTML string, local file, or URL (for testing extraction).

Install Spire.Doc for Java

Maven Setup: Add the Spire.Doc repository and dependency to your project’s pom.xml file. This automatically downloads the library and its dependencies:

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.doc</artifactId>
        <version>13.11.2</version>
    </dependency>
</dependencies>

For manual installation, download the JAR from the official website and add it to your project.

Get a Temporary License (Optional)

By default, Spire.Doc adds an evaluation watermark to output. To remove it and unlock full features, you can request a free 30-day trial license.

Core Guide: Parsing HTML to Extract Elements in Java

Spire.Doc parses HTML into a structured object model, where elements like paragraphs, tables, and fields are accessible as Java objects. Below are practical examples to extract key HTML components.

1. Extract Text from HTML in Java

Extracting text (without HTML tags or formatting) is essential for scenarios like content indexing or data analysis. This example parses an HTML string and extracts text from all paragraphs.

Java Code: Extract Text from an HTML String

import com.spire.doc.*;
import com.spire.doc.documents.*;

public class ExtractTextFromHtml {
    public static void main(String[] args) {
        // Define HTML content to parse
        String htmlContent = "<html>" +
                "<body>" +
                "<h1>Introduction to HTML Parsing</h1>" +
                "<p>Spire.Doc for Java simplifies extracting text from HTML.</p>" +
                "<ul>" +
                "<li>Extract headings</li>" +
                "<li>Extract paragraphs</li>" +
                "<li>Extract list items</li>" +
                "</ul>" +
                "</body>" +
                "</html>";

        // Create a Document object to hold parsed HTML
        Document doc = new Document();
        // Parse the HTML string into the document
        doc.addSection().addParagraph().appendHTML(htmlContent);

        // Extract text from all paragraphs
        StringBuilder extractedText = new StringBuilder();
        for (Section section : (Iterable<Section>) doc.getSections()) {
            for (Paragraph paragraph : (Iterable<Paragraph>) section.getParagraphs()) {
                extractedText.append(paragraph.getText()).append("\n");
            }
        }

        // Print or process the extracted text
        System.out.println("Extracted Text:\n" + extractedText);
    }
}

Output:

Parse an HTML string using Java

2. Extract Table Data from HTML in Java

HTML tables store structured data (e.g., product lists, reports). Spire.Doc parses <table> tags into Table objects, making it easy to extract rows and columns.

Java Code: Extract HTML Table Rows & Cells

import com.spire.doc.*;
import com.spire.doc.documents.*;

public class ExtractTableFromHtml {
    public static void main(String[] args) {
        // HTML content with a table
        String htmlWithTable = "<html>" +
                "<body>" +
                "<table border='1'>" +
                "<tr><th>ID</th><th>Name</th><th>Price</th></tr>" +
                "<tr><td>001</td><td>Laptop</td><td>$999</td></tr>" +
                "<tr><td>002</td><td>Phone</td><td>$699</td></tr>" +
                "</table>" +
                "</body>" +
                "</html>";

        // Parse HTML into Document
        Document doc = new Document();
        doc.addSection().addParagraph().appendHTML(htmlWithTable);

        // Extract table data
        for (Section section : (Iterable<Section>) doc.getSections()) {
            // Iterate through all objects in the section's body
            for (Object obj : section.getBody().getChildObjects()) {
                if (obj instanceof Table) { // Check if the object is a table
                    Table table = (Table) obj;
                    System.out.println("Table Data:");
                    // Loop through rows
                    for (TableRow row : (Iterable<TableRow>) table.getRows()) {
                        // Loop through cells in the row
                        for (TableCell cell : (Iterable<TableCell>) row.getCells()) {
                            // Extract text from each cell's paragraphs
                            for (Paragraph para : (Iterable<Paragraph>) cell.getParagraphs()) {
                                System.out.print(para.getText() + "\t");
                            }
                        }
                        System.out.println(); // New line after each row
                    }
                }
            }
        }
    }
}

Output:

Parse HTML table data using Java

After parsing the HTML string into a Word document via the appendHTML() method, you can leverage Spire.Doc’s APIs to extract hyperlinks as well.

Advanced Scenarios: Parse HTML Files & URLs in Java

Spire.Doc for Java also offers flexibility to parse local HTML files and web URLs, making it versatile for real-world applications.

1. Read an HTML File in Java

To parse a local HTML file using Spire.Doc for Java, simply load it via the loadFromFile(String filename, FileFormat.Html) method for processing.

Java Code: Read & Parse Local HTML Files

import com.spire.doc.*;
import com.spire.doc.documents.*;

public class ParseHtmlFile {
    public static void main(String[] args) {
        // Create a Document object
        Document doc = new Document();
        // Load an HTML file
        doc.loadFromFile("input.html", FileFormat.Html);

        // Extract and print text
        StringBuilder text = new StringBuilder();
        for (Section section : (Iterable<Section>) doc.getSections()) {
            for (Paragraph para : (Iterable<Paragraph>) section.getParagraphs()) {
                text.append(para.getText()).append("\n");
            }
        }
        System.out.println("Text from HTML File:\n" + text);
    }
}

The example extracts text content from the loaded HTML file. If you need to extract the paragraph style (e.g., "Heading1", "Normal") simultaneously, use the Paragraph.getStyleName() method.

Output:

Read an HTML file using Java

You may also need: Convert HTML to Word in Java

2. Parse a URL in Java

For real-world web scraping, you'll need to parse HTML from live web pages. Spire.Doc can work with Java’s built-in HttpClient (JDK 11+) to fetch HTML content from URLs, then parse it.

Java Code: Fetch & Parse a Web URL

import com.spire.doc.*;
import com.spire.doc.documents.*;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class ParseHtmlFromUrl {
    // Reusable HttpClient (configures timeout to avoid hanging)
    private static final HttpClient httpClient = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(10))
            .build();

    public static void main(String[] args) {
        String url = "https://www.e-iceblue.com/privacypolicy.html";

        try {
            // Fetch HTML content from the URL
            System.out.println("Fetching from: " + url);
            String html = fetchHtml(url);

            // Parse HTML with Spire.Doc
            Document doc = new Document();
            Section section = doc.addSection();
            section.addParagraph().appendHTML(html);

            System.out.println("--- Headings ---");

            // Extract headings
            for (Paragraph para : (Iterable<Paragraph>) section.getParagraphs()) {
                // Check if the paragraph style is a heading (e.g., "Heading1", "Heading2")

                if (para.getStyleName() != null && para.getStyleName().startsWith("Heading")) {
                    System.out.println(para.getText());
                }
            }

        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
        }
    }

    // Helper method: Fetches HTML content from a given URL
    private static String fetchHtml(String url) throws Exception {
        // Create HTTP request with User-Agent header (to avoid blocks)
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("User-Agent", "Mozilla/5.0")
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
        // Send request and get response
        HttpResponse<String> response = httpClient.send(request, HttpResponse.BodyHandlers.ofString());

        // Check if the request succeeded (HTTP 200 = OK)
        if (response.statusCode() != 200) {
            throw new Exception("HTTP error: " + response.statusCode());
        }

        return response.body(); // Return the raw HTML content
    }
}

Key Steps:

HTTP Fetching: Uses HttpClient to fetch HTML from the URL, with a User-Agent header to mimic a browser (avoids being blocked).
HTML Parsing: Creates a Document, adds a Section and Paragraph, then uses appendHTML() to load the fetched HTML.
Content Extraction: Extracts headings by checking if paragraph styles start with "Heading".

Output:

Parse HTML from a web URL using Java

Conclusion

Parsing HTML in Java is simplified with the Spire.Doc for Java library. Using it, you can extract text, tables, and data from HTML strings, local files, or URLs with minimal code—no need to manually handle raw HTML tags or manage heavy dependencies.

Whether you’re building a web scraper, analyzing web content, or converting HTML to other formats (e.g., HTML to PDF), Spire.Doc streamlines the workflow. By following the step-by-step examples in this guide, you’ll be able to integrate robust HTML parsing into your Java projects to unlock actionable insights from HTML content.

FAQs About Parsing HTML

Q1: Which library is best for parsing HTML in Java?

A: It depends on your needs:

Use Spire.Doc if you need to extract text/tables and integrate with document processing (e.g., convert HTML to PDF).
Use Jsoup if you only need basic HTML parsing (but it requires more code for table/text extraction).

Q2: How does Spire.Doc handle malformed or poorly structured HTML?

A: Spire.Doc for Java provides a dedicated approach using the loadFromFile method with XHTMLValidationType.None parameter. This configuration disables strict XHTML validation, allowing the parser to handle non-compliant HTML structures gracefully.

// Load and parse the malformed HTML file
// Parameters: file path, file format (HTML), validation type (None)
doc.loadFromFile("input.html", FileFormat.Html, XHTMLValidationType.None);

However, severely malformed HTML may still cause parsing issues.

Q3: Can I modify parsed HTML content and save it back as HTML?

A: Yes. Spire.Doc lets you manipulate parsed content (e.g., edit paragraph text, delete table rows, or add new elements) and then save the modified document back as HTML:

// After parsing HTML into a Document object:
Section section = doc.getSections().get(0);
Paragraph firstPara = section.getParagraphs().get(0);
firstPara.setText("Updated heading!"); // Modify text

// Save back as HTML
doc.saveToFile("modified.html", FileFormat.Html);

Q4: Is an internet connection required to parse HTML with Spire.Doc?

A: No, unless you’re loading HTML directly from a URL. Spire.Doc can parse HTML from local files or strings without an internet connection. If fetching HTML from a URL, you’ll need an internet connection to retrieve the content first, but parsing itself works offline.

Published in Document Operation

Tagged under

doc java Operation

Generate PDF from HTML Template in C# – Full Tutorial

2025-10-21 08:26:43 Written by zaki zou

C-sharp Generate PDF from HTML Template

In many modern .NET applications, generating professional-looking PDF documents is a common requirement — especially for invoices, reports, certificates, and forms. Instead of creating PDFs manually, a smarter approach is to use HTML templates . HTML makes it easy to design layouts using CSS, include company branding, and reuse the same structure across multiple documents.

By dynamically inserting data into HTML and converting it to PDF programmatically, you can automate document generation while maintaining design consistency.

In this tutorial, you’ll learn how to generate a PDF from an HTML template in C# .NET using Spire.PDF for .NET. We’ll guide you step-by-step — from setting up your development environment (including the required HTML-to-PDF plugin), preparing the HTML template, inserting dynamic data, and generating the final PDF file.

On this page:

Why Generate PDFs from HTML Templates in C#?
Set Up Your .NET Environment
Prepare an HTML Template
Insert Dynamic Data into HTML Before Conversion
Convert Updated HTML Template to PDF in C#
Best Practices for Generating PDF from HTML in C#
Final Words
FAQs About C# HTML Template to PDF Conversion

Why Generate PDFs from HTML Templates in C#?

Using HTML templates for PDF generation offers several advantages:

Reusability: Design once, reuse anywhere — perfect for reports, receipts, and forms.
Styling flexibility: HTML + CSS allow rich formatting without complex PDF drawing code.
Dynamic content: Easily inject runtime data such as customer names, order totals, or timestamps.
Consistency: Ensure all generated documents follow the same layout and style guidelines.
Ease of maintenance: You can update the HTML template without changing your C# logic.

Set Up Your .NET Environment

Before you begin coding, make sure your project is properly configured to handle HTML-to-PDF conversion.

1. Install Spire.PDF for .NET

Spire.PDF for .NET is a professional library designed for creating, reading, editing, and converting PDF documents in C# and VB.NET applications—without relying on Adobe Acrobat. It provides powerful APIs for handling text, images, annotations, forms, and HTML-to-PDF conversion.

You can install it via NuGet:

Install-Package Spire.PDF

Or download it directly from the official website and reference the DLL in your project.

2. Install the HTML Rendering Plugin

Spire.PDF relies on an external rendering engine (Qt WebEngine or Chrome) to accurately convert HTML content into PDF. This plugin must be installed separately.

Steps:

Download the plugin package for your platform.

Extract the contents to a local folder, such as: C:\plugins-windows-x64\plugins
In your C# code, register the plugin path before performing the conversion.

HtmlConverter.PluginPath = @"C:\plugins-windows-x64\plugins";

Prepare an HTML Template

Create an HTML file with placeholders for your dynamic data. For example, name it invoice_template.html :

<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>Invoice</title>
  <style>
    body { font-family: Arial; margin: 40px; }
    .header { font-size: 24px; font-weight: bold; margin-bottom: 20px; }
    table { width: 100%; border-collapse: collapse; margin-top: 20px; }
    th, td { border: 1px solid #999; padding: 8px; text-align: left; }
  </style>
</head>
<body>
  <div class="header">Invoice for {CustomerName}</div>
  <p>Date: {InvoiceDate}</p>
  <table>
    <tr><th>Item</th><th>Price</th></tr>
    <tr><td>{Item}</td><td>{Price}</td></tr>
  </table>
</body>
</html>

Tips:

Keep CSS inline or embedded within the HTML file.
Avoid JavaScript or complex animations.
Use placeholders like {CustomerName} and {InvoiceDate} for data replacement.

Insert Dynamic Data into HTML Before Conversion

You can read your HTML file as text, replace placeholders with real values, and then save it as a new temporary file.

using System;
using System.IO;

string template = File.ReadAllText("invoice_template.html");

template = template.Replace("{CustomerName}", "John Doe");
template = template.Replace("{InvoiceDate}", DateTime.Now.ToShortDateString());
template = template.Replace("{Item}", "Wireless Mouse");
template = template.Replace("{Price}", "$25.99");

File.WriteAllText("invoice_ready.html", template);

This approach lets you generate customized PDFs for each user or transaction dynamically.

Convert Updated HTML Template to PDF in C#

Now that your HTML content is ready, you can use the HtmlConverter.Convert() method to directly convert the HTML string into a PDF file.

Below is a full code example to create PDF from HTML template file in C#:

using System;
using System.Collections.Generic;
using System.Drawing;
using System.IO;
using Spire.Additions.Qt;
using Spire.Pdf.Graphics;
using Spire.Pdf.HtmlConverter;

namespace CreatePdfFromHtmlTemplate
{
    class Program
    {
        static void Main(string[] args)
        {
            // Path to the HTML template file
            string htmlFilePath = "invoice_template.html";

            // Step 1: Read the HTML template from file
            if (!File.Exists(htmlFilePath))
            {
                Console.WriteLine("Error: HTML template file not found.");
                return;
            }

            string htmlTemplate = File.ReadAllText(htmlFilePath);

            // Step 2: Define dynamic data for invoice placeholders
            Dictionary<string, string> invoiceData = new Dictionary<string, string>()
            {
                { "INVOICE_NUMBER", "INV-2025-001" },
                { "INVOICE_DATE", DateTime.Now.ToString("yyyy-MM-dd") },
                { "BILLER_NAME", "John Doe" },
                { "BILLER_ADDRESS", "123 Main Street, New York, USA" },
                { "BILLER_EMAIL", "john.doe@example.com" },
                { "ITEM_DESCRIPTION", "Consulting Services" },
                { "ITEM_QUANTITY", "10" },
                { "ITEM_UNIT_PRICE", "$100" },
                { "ITEM_TOTAL", "$1000" },
                { "SUBTOTAL", "$1000" },
                { "TAX_RATE", "5" },
                { "TAX", "$50" },
                { "TOTAL", "$1050" }
            };

            // Step 3: Replace placeholders in the HTML template with real values
            string populatedInvoice = PopulateInvoice(htmlTemplate, invoiceData);

            // Optional: Save the populated HTML for debugging or review
            File.WriteAllText("invoice_ready.html", populatedInvoice);

            // Step 4: Specify the plugin path for the HTML to PDF conversion
            string pluginPath = @"C:\plugins-windows-x64\plugins";
            HtmlConverter.PluginPath = pluginPath;

            // Step 5: Define output PDF file path
            string outputFile = "InvoiceOutput.pdf";

            try
            {
                // Step 6: Convert the HTML string to PDF
                HtmlConverter.Convert(
                    populatedInvoice,
                    outputFile,
                    enableJavaScript: true,
                    timeout: 100000,                // 100 seconds
                    pageSize: new SizeF(595, 842),  // A4 size in points
                    margins: new PdfMargins(20),    // 20-point margins
                    loadHtmlType: LoadHtmlType.SourceCode
                );

                Console.WriteLine($"PDF generated successfully: {outputFile}");
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Error during PDF generation: {ex.Message}");
            }
        }

        /// <summary>
        /// Helper method: Replaces placeholders in the HTML with actual data values.
        /// </summary>
        private static string PopulateInvoice(string template, Dictionary<string, string> data)
        {
            string result = template;
            foreach (var entry in data)
            {
                result = result.Replace("{" + entry.Key + "}", entry.Value);
            }
            return result;
        }
    }
}

How it works :

Create an invoice template with placeholder variables in {VARIABLE_NAME} format.
Set up a dictionary with key-value pairs containing actual invoice data that matches the template placeholders.
Replace all placeholders in the HTML template with actual values from the data dictionary.
Use Spire.PDF with the Qt plugin to render the HTML content as a PDF file.

Result:

Create PDF from Template in C#

Best Practices for Generating PDF from HTML in C#

Use fixed-width layouts: Avoid fluid or responsive designs to maintain consistent rendering.
Embed or inline CSS: Ensure your styles are self-contained.
Use standard fonts: Arial, Times New Roman, or other supported fonts convert reliably.
Keep images lightweight: Compress large images to improve performance.
Test with different page sizes: A4 and Letter are the most common formats.
Avoid unsupported tags: Elements relying on JavaScript (like <canvas>) won’t render.

Final Words

Generating PDFs from HTML templates in C# .NET is a powerful way to automate document creation while preserving visual consistency. By combining Spire.PDF for .NET with the HTML rendering plugin , you can easily transform styled HTML layouts into print-ready PDF files that integrate seamlessly with your applications.

Whether you’re building a reporting system, an invoicing tool, or a document automation service, this approach saves time, reduces complexity, and produces professional results with minimal code.

FAQs About C# HTML Template to PDF Conversion

Q1: Can I use Google Chrome for HTML rendering instead of Qt WebEngine?

Absolutely. For advanced HTML, CSS, or modern JavaScript, we recommend using the Google Chrome engine via the ChromeHtmlConverter class for more precise and reliable PDF results.

For a complete guide, see our article: Convert HTML to PDF using ChromeHtmlConverter

Q2: Do I need to install a plugin for every machine running my application?

Yes, each environment must have access to the HTML rendering plugin (Qt or Chrome engine) for successful HTML-to-PDF conversion.

Q3: Does Spire.PDF support external CSS files or online resources?

Yes, but inline or embedded CSS is recommended for better rendering accuracy.

Q4: Can I use this approach in ASP.NET or web APIs?

Absolutely. You can generate PDFs server-side and return them as downloadable files or streams.

Q5: Is JavaScript supported during HTML rendering?

Limited support. Static elements are rendered correctly, but scripts and dynamic DOM manipulations are not executed.

Get a Free License

To fully experience the capabilities of Spire.PDF for .NET without any evaluation limitations, you can request a free 30-day trial license.

Published in Document Operation

Tagged under

pdf net Document Operation

News Category

Knowledgebase (2300)

Children categories

Why Choose Spire.XLS for CSV to List Conversion?

Basic Conversion: CSV to Python List

Advanced: Convert CSV to List of Dictionaries

Handle Special Scenarios

CSV with Custom Delimiters (e.g., Tabs, Semicolons)

Clean Empty Values

Conclusion

Frequently Asked Questions

Q1: Is Spire.XLS suitable for large CSV files?

Q2: How does this compare to using pandas for CSV to list conversion?

Q3: How do I handle CSV files with headers when converting to lists?

Q4: How do I convert only specific columns from my CSV to a list?

Why Use Spire.Doc for Java for HTML Parsing

Environment Setup & Installation

Install Spire.Doc for Java

Get a Temporary License (Optional)

Core Guide: Parsing HTML to Extract Elements in Java

1. Extract Text from HTML​ in Java

2. Extract Table Data from HTML​ in Java

Advanced Scenarios: Parse HTML Files & URLs in Java

1. Read an HTML File​ in Java

2. Parse a URL​ in Java

Conclusion​

FAQs About Parsing HTML

Q1: Which library is best for parsing HTML in Java?

Q2: How does Spire.Doc handle malformed or poorly structured HTML?

Q3: Can I modify parsed HTML content and save it back as HTML?

Q4: Is an internet connection required to parse HTML with Spire.Doc?

Why Generate PDFs from HTML Templates in C#?

Set Up Your .NET Environment

Prepare an HTML Template

Insert Dynamic Data into HTML Before Conversion

Convert Updated HTML Template to PDF in C#

Best Practices for Generating PDF from HTML in C#

Final Words

FAQs About C# HTML Template to PDF Conversion

Q1: Can I use Google Chrome for HTML rendering instead of Qt WebEngine?

Q2: Do I need to install a plugin for every machine running my application?

Q3: Does Spire.PDF support external CSS files or online resources?

Q4: Can I use this approach in ASP.NET or web APIs?

Q5: Is JavaScript supported during HTML rendering?

Get a Free License

More...

1. Extract Text from HTML in Java

2. Extract Table Data from HTML in Java

1. Read an HTML File in Java

2. Parse a URL in Java

Conclusion