page 1

Subscribe to this RSS feed

.NET (1316)

Children categories

Spire.DataExport for .NET (33)

View items...

Spire.Presentation for .NET (204)

View items...

Spire.Spreadsheet for .NET (1)

View items...

Spire.PDFViewer for MAUI (1)

View items...

How to Print PDF Documents in .NETStandard Platform

2026-01-16 01:20:04 Written by zaki zou

To print Word, Excel, PowerPoint, PDF, and other document types, Spire.Printing is used together with the corresponding Spire.Office document libraries—Spire.Doc, Spire.XLS, Spire.Presentation, and Spire.PDF (especially the .NET Standard version)—to load the source files, save them into IPrintDocumentStream, and send them to the printer.

Published in Print

How to Print PowerPoint Presentations in .NETStandard Platform

2026-01-16 01:19:42 Written by zaki zou

Published in Print

How to Print Excel Files in .NETStandard Platform

2026-01-16 01:19:12 Written by zaki zou

Published in Print

How to Print Word Documents in .NETStandard Platform

2026-01-16 01:13:34 Written by zaki zou

Published in Print

Cross-Platform PDF and Office Printing in C# with Spire.Printing

2026-01-14 09:48:52 Written by zaki zou

Spire.Printing: Professional Printing for C# .NET

Printing documents is a common requirement in desktop applications, background services, and server-side systems. In practical development and business scenarios, developers often need to print files silently, route jobs to specific printers, or control printing behavior programmatically—without relying on user interaction.

This guide demonstrates how to use Spire.Printing to implement flexible, automated printing of PDF and Office documents in C# across Windows, Linux, and macOS. You will learn how to handle printable streams, select printers programmatically, and apply advanced print settings, enabling reliable cross-platform printing in modern .NET applications.

Table of Contents

Installing Spire.Printing
The Core Printing Workflow and Printing Settings
Printing Word, Excel, PowerPoint, PDF, and Other Documents
Advanced Print Settings and Operations
Licensing Notes

Installing Spire.Printing

Spire.Printing is distributed as a NuGet package and can be added to your project in the standard way:

Install-Package Spire.Printing

Platform Compatibility

Spire.Printing is a cross-platform printing library for modern .NET applications. When used together with Spire.Office libraries (with .NET Standard support), it enables printing Word, Excel, PowerPoint, PDF, and other document formats across Windows, Linux, and macOS—without relying on MS Office Interop.

It supports recent .NET runtimes, including .NET 5.0, .NET 6.0, .NET 9.0, and .NET 10.0, and runs on the following platforms:

Windows (x64, x86)
Linux (x64, ARM)
macOS (x64, ARM)

The Core Printing Workflow and Printing Settings

Spire.Printing is designed to send print-ready document streams directly to the printer. On Windows, the printable stream is typically an XPS document, while on Linux and macOS it is a PDF document. It is commonly used with Spire.Office for .NET to implement the printing workflow in .NET applications.

The general process is:

Create an IPrintDocumentStream instance from the document.
Create a PrintDocument instance.
Configure print settings through the PrintSettings property.
Send the job to the printer.

Code Example

using Spire.Printing;

IPrintDocumentStream documentStream;
if (System.Runtime.InteropServices.RuntimeInformation.IsOSPlatform(System.Runtime.InteropServices.OSPlatform.Windows))
{
    // Windows
    documentStream = new XpsPrintDocument("test.xps");
}
else
{
    // Non-Windows(Linux, MacOS)
    documentStream = new PdfPrintDocument("test.pdf");
}
//new PrintDocument
PrintDocument printDocument = new PrintDocument(documentStream);

//Set paper size
printDocument.PrintSettings.PaperSize = PaperSize.A4;

//Set the number of copies to be printed
printDocument.PrintSettings.Copies = 2;

//Select a page range
printDocument.PrintSettings.SelectPageRange(2, 5);

//Duplex Printing
if (printDocument.PrintSettings.CanDuplex)
{
    //Print in duplex
    printDocument.PrintSettings.Duplex = Duplex.Vertical;
}

//Collated vs. uncollated output:
printDocument.PrintSettings.Collate = true;

//Printing a file to a specific printer, if not set, print to the default printer
printDocument.PrintSettings.PrinterName = "Your Printer Name";

// Print to the specified file
printDocument.PrintSettings.PrintToFile("toXps.xps");

//Record printing logs
printDocument.PrintSettings.PrintLogger = new DefaultPrintLogger("log.txt");

//Print
printDocument.Print();

//Dispose
printDocument.Dispose();

This stream-based model keeps the printing workflow consistent across platforms, while allowing all printer behavior to be customized through the PrintSettings API.

Printing Word, Excel, PowerPoint, PDF, and Other Documents

Office and PDF Printing Workflow of in C#

Printing Word Documents in C#

Library Installation

Install-Package Spire.Printing
Install-Package Spire.Docfor.NETStandard

Code Example

using Spire.Doc;
using Spire.Printing;

//Check the system
bool isWindows = System.Runtime.InteropServices.RuntimeInformation.IsOSPlatform(System.Runtime.InteropServices.OSPlatform.Windows);

using (Document document = new Document())
{
    //Use Spire.Doc to save documents as. xps or. pdf document streams
    document.LoadFromFile(@"test.docx");
    Spire.Doc.FileFormat fileFormat = !isWindows ? Spire.Doc.FileFormat.PDF : Spire.Doc.FileFormat.XPS;
    MemoryStream stream = new MemoryStream();
    document.SaveToStream(stream, fileFormat);

    //Save to IPrintDocumentStream according to the system
    IPrintDocumentStream docStream = !isWindows ? new PdfPrintDocument(stream) : new XpsPrintDocument(stream);

    //Print
    PrintDocument printDoc = new PrintDocument(docStream);
    printDoc.PrintSettings.SelectPageRange(1, 1);
    printDoc.Print();

    //Dispose
    printDoc.Dispose();
}

Printing Excel Files in C#

Library Installation

Install-Package Spire.Printing
Install-Package Spire.XLSfor.NETStandard

Code Example

using Spire.Printing;
using Spire.Xls;

//Check the system
bool isWindows = System.Runtime.InteropServices.RuntimeInformation.IsOSPlatform(System.Runtime.InteropServices.OSPlatform.Windows);
using (Workbook wb = new Workbook())
{
    //Use Spire.Xls to save documents as. xps or. pdf document streams
    wb.LoadFromFile("test.xlsx");
    Spire.Xls.FileFormat fileFormat = !isWindows ? Spire.Xls.FileFormat.PDF : Spire.Xls.FileFormat.XPS;
    MemoryStream stream = new MemoryStream();
    wb.SaveToStream(stream, fileFormat);

    //Save to IPrintDocumentStream according to the system
    IPrintDocumentStream xlsStream = !isWindows ? new PdfPrintDocument(stream) : new XpsPrintDocument(stream);

    //Print
    PrintDocument printxls = new PrintDocument(xlsStream);
    printxls.PrintSettings.SelectPageRange(1, 1);
    printxls.Print();

    //Dispose
    printxls.Dispose();
}

Printing PDF Files in C#

Library Installation

Install-Package Spire.Printing
Install-Package Spire.PDFfor.NETStandard

Code Example

using Spire.Pdf;
using Spire.Printing;

//Check the system
bool isWindows = System.Runtime.InteropServices.RuntimeInformation.IsOSPlatform(System.Runtime.InteropServices.OSPlatform.Windows);
using (PdfDocument pdfDocument = new PdfDocument())
{
    //Use Spire.PDF to save documents as. xps or. pdf document streams
    pdfDocument.LoadFromFile("test.pdf");
    Spire.Pdf.FileFormat fileFormat = !isWindows ? Spire.Pdf.FileFormat.PDF : Spire.Pdf.FileFormat.XPS;
    MemoryStream stream = new MemoryStream();
    pdfDocument.SaveToStream(stream, fileFormat);

    //Save to IPrintDocumentStream according to the system
    IPrintDocumentStream pdfStream = !isWindows ? new PdfPrintDocument(stream) : new XpsPrintDocument(stream);

    //Print
    PrintDocument printPdf = new PrintDocument(pdfStream);
    printPdf.PrintSettings.SelectPageRange(1, 1);
    printPdf.Print();

    //Dispose
    printPdf.Dispose();
}

Printing PowerPoint Presentations in C#

Library Installation

Install-Package Spire.Printing
Install-Package Spire.Presentationfor.NETStandard

Code Example

using Spire.Presentation;
using Spire.Printing;

//Check the system
bool isWindows = System.Runtime.InteropServices.RuntimeInformation.IsOSPlatform(System.Runtime.InteropServices.OSPlatform.Windows);
using (Presentation ppt = new Presentation())
{
    //Use Spire.Presentation to save documents as. xps or. pdf document streams
    ppt.LoadFromFile("test.pptx");
    Spire.Presentation.FileFormat fileFormat = !isWindows ? Spire.Presentation.FileFormat.PDF : Spire.Presentation.FileFormat.XPS;
    MemoryStream stream = new MemoryStream();
    ppt.SaveToFile(stream, fileFormat);

    //Save to IPrintDocumentStream according to the system
    IPrintDocumentStream pptStream = !isWindows ? new PdfPrintDocument(stream) : new XpsPrintDocument(stream);

    //Print
    PrintDocument printPpt = new PrintDocument(pptStream);
    printPpt.PrintSettings.SelectPageRange(1, 1);
    printPpt.Print();

    //Dispose
    printPpt.Dispose();
}

Advanced Print Settings and Operations

For automated and cross-platform printing scenarios, Spire.Printing provides additional control over printer selection, paper handling, and page output through the PrintSettings API. These options are commonly used in unattended services and batch-printing workflows.

Discovering and Selecting a Printer

Instead of using the system default, you can enumerate available printers and route the print job to a specific device:

IEnumerable<string> printers = printDocument.PrintSettings.Printers;

// Select the first printer or choose based on your logic
string selectedPrinterName = printers.First();
printDocument.PrintSettings.PrinterName = selectedPrinterName;

This is useful when multiple printers are installed or when deterministic printer routing is required.

Choosing a Supported Paper Size

To ensure compatibility with the selected printer, you can query and apply one of its supported paper sizes:

IEnumerable<PaperSize> paperSizes = printDocument.PrintSettings.PaperSizes;

// Select the first available size or apply custom logic
PaperSize selectedPaperSize = paperSizes.First();
printDocument.PrintSettings.PaperSize = selectedPaperSize;

This guarantees the selected paper size is compatible with the target printer.

Selecting Specific Pages

You can restrict printing to certain pages, either as a continuous range or a specific set:

// Print pages 2 to 5
printDocument.PrintSettings.SelectPageRange(2, 5);

// Print specific pages: 1, 3, 5, 7
int[] pages = { 1, 3, 5, 7 };
printDocument.PrintSettings.SelectSomePages(pages);

Only one of these methods should be used per print job.

These advanced operations allow precise control over print output, making it suitable for automated workflows, batch processing, or scenarios where consistent print settings are required across multiple documents and printers.

Licensing Notes

Without a valid license, Spire.Printing prints only the first 10 pages. This limitation can be removed by applying a license for Spire.Office for .NET or the corresponding document libraries, such as Spire.Doc, Spire.XLS, Spire.PDF, or Spire.Presentation.

Spire.Pdf.License.LicenseProvider.SetLicenseKey(string key);
Spire.Doc.License.LicenseProvider.SetLicenseKey(string key);
Spire.Xls.License.LicenseProvider.SetLicenseKey(string key);
Spire.Presentation.License.LicenseProvider.SetLicenseKey(string key);

For details on how to apply a license key, see the licensing guide.

Conclusion

Spire.Printing provides a flexible and reliable way to implement professional printing in C# applications. It supports stream-based printing of PDF, Word, Excel, and PowerPoint documents across Windows, Linux, and macOS, and works seamlessly with Spire.Office for .NET libraries—especially for .NET Standard—to handle Office and PDF files in modern .NET applications.

With the core printing workflow understood, developers can easily apply advanced print settings, such as selecting printers, paper sizes, or specific pages, to meet real-world business requirements and automated workflows.

For evaluation or short-term testing scenarios, a temporary license can be requested to remove trial limitations during development.

Published in .NET

How to Create PDF in ASP.NET & ASP.NET Core with C#

2025-12-31 08:59:48 Written by zaki zou

Tutorial on PDF Generation in ASP.NET & ASP.NET Core

In many web applications, PDF files are more than just downloadable documents—they are often the final output of business processes. Common examples include invoices, financial reports, contracts, certificates, and data exports that must preserve layout and formatting across devices.

For developers working with ASP.NET, the ability to create PDF files directly on the server side is a frequent requirement. Whether you are building a traditional ASP.NET MVC application or a modern ASP.NET Core service, generating PDFs programmatically allows you to deliver consistent, print-ready documents to end users.

However, implementing PDF generation in ASP.NET is not always straightforward. Developers often encounter challenges such as:

Managing document layout and pagination
Handling fonts and international text
Returning PDF files efficiently to the browser
Supporting both ASP.NET Framework and ASP.NET Core

This article focuses on practical solutions for creating PDF documents in ASP.NET and ASP.NET Core scenarios using Spire.PDF for .NET. You will learn how to generate PDFs using C# in:

ASP.NET Framework applications
ASP.NET Core applications
MVC and Web API–based projects

By the end of this guide, you will have a clear understanding of how ASP.NET PDF generation works and how to apply it in real-world projects.

Quick Navigation

Overview: Common Approaches to Create PDF in ASP.NET
Environment Setup for ASP.NET PDF Generation
How to Create PDF in ASP.NET (Framework) Using C#
Generate PDF in ASP.NET Core Applications
Advanced Scenarios for ASP.NET PDF Generation
Choosing an ASP.NET PDF Library
Why Use Spire.PDF for ASP.NET PDF Creation
FAQ: Frequently Asked Questions

1. Overview: Creating PDF Directly in ASP.NET Using C#

In ASP.NET and ASP.NET Core applications, PDF files are often generated as the final output of server-side processes, such as reports, invoices, and data exports.

One of the most reliable ways to achieve this is creating PDF documents directly through C# code. In this approach, the application controls:

Page creation and pagination
Text formatting and layout
File output and response handling

This tutorial focuses on this code-driven PDF generation approach, which works consistently across ASP.NET Framework and ASP.NET Core and is well suited for server-side scenarios where predictable output and layout control are required.

2. Environment Setup for ASP.NET PDF Generation

Before you start generating PDFs in ASP.NET or ASP.NET Core applications, it is important to ensure that your development environment is properly configured. This will help you avoid common issues and get your projects running smoothly.

2.1. .NET SDK Requirements

ASP.NET Framework: Ensure your project targets .NET Framework 4.6.1 or higher.
ASP.NET Core: Install .NET 6 or .NET 7 SDK, depending on your project target.
Verify your installed SDK version using:

dotnet --version

2.2. Installing the Spire.PDF for .NET Library

To generate PDFs, you need a PDF library compatible with your project. One widely used option is Spire.PDF for .NET, which supports both ASP.NET Framework and ASP.NET Core.

Install via NuGet Package Manager in Visual Studio:

Install-Package Spire.PDF

You can also download Spire.PDF for .NET and install it manually.

Verify the installation by checking that the Spire.Pdf.dll is referenced in your project.

2.3. Project Template Considerations

ASP.NET Framework: Use an MVC or Web Forms project and ensure required assemblies (e.g., System.Web) are referenced.
ASP.NET Core: Use an MVC or API project and configure any required services for the PDF library.

Ensure the environment allows writing files if needed and supports necessary fonts for your documents.

3. How to Create PDF in ASP.NET (Framework) Using C#

This section demonstrates how to create PDF files in ASP.NET Framework applications using C#. These examples apply to classic ASP.NET Web Forms and ASP.NET MVC projects.

3.1 Create a Simple PDF File in ASP.NET

The basic workflow for creating PDF in ASP.NET is:

Create a PdfDocument instance.
Add pages and content.
Save the document using PdfDocument.SaveToFile() method, or return it to the client.

Below is a simple C# example that creates a PDF file and saves it on the server.

using Spire.Pdf;
using Spire.Pdf.Graphics;
using System.Drawing;

PdfDocument document = new PdfDocument();
PdfPageBase page = document.Pages.Add();

PdfFont font = new PdfFont(PdfFontFamily.Helvetica, 12f);
page.Canvas.DrawString(
    "Hello, this PDF was generated in ASP.NET using C#.",
    font,
    PdfBrushes.Black,
    new PointF(40, 40)
);

document.SaveToFile(Server.MapPath("~/Output/Sample.pdf"));
document.Close();

This example demonstrates the core idea of PDF generation in ASP.NET using C#: everything is created programmatically, giving you full control over content and layout.

In real applications, this approach is commonly used to generate:

Confirmation documents
Server-side reports
System-generated notices

If you also want to include images in your PDFs, you can check out our guide on inserting images into PDF files using C# for a step-by-step example.

3.2 Generate PDF in ASP.NET MVC

In ASP.NET MVC projects, PDFs are usually generated inside controller actions and returned directly to the browser. This allows users to download or preview the document without saving it permanently on the server.

A typical PDF generation in MVC implementation looks like this:

using Spire.Pdf;
using Spire.Pdf.Graphics;
using System.Drawing;
using System.IO;
using System.Web.Mvc;

namespace WebApplication.Controllers
{
    public class DefaultController : Controller
    {
        public ActionResult GeneratePdf()
        {
            // Create a PDF document
            using (PdfDocument document = new PdfDocument())
            {
                PdfPageBase page = document.Pages.Add();

                PdfFont font = new PdfFont(PdfFontFamily.Helvetica, 12f);
                page.Canvas.DrawString(
                    "PDF generated in ASP.NET MVC.",
                    font,
                    PdfBrushes.Black,
                    new PointF(40, 40)
                );

                // Save the document to stream and return to browser
                using (MemoryStream stream = new MemoryStream())
                {
                    document.SaveToStream(stream);

                    return File(
                        stream.ToArray(),
                        "application/pdf",
                        "MvcSample.pdf"
                    );
                }
            }
        }
    }
}

Below is the preview of the generated PDF document:

ASP.NET MVC generate PDF using C# example Hello World

Practical Notes for MVC Projects

Returning a FileResult is the most common pattern
Memory streams help avoid unnecessary disk I/O
This approach works well for on-demand PDF generation triggered by user actions

With this method, you can seamlessly integrate ASP.NET PDF generation into existing MVC workflows such as exporting reports or generating invoices.

Tip: If you need to present PDFs to users in a ASP.NET application, you can use Spire.PDFViewer for ASP.NET, a component that allows you to display PDF documents in a web environment.

4. Generate PDF in ASP.NET Core Applications

With the rise of cross-platform development and cloud-native architectures, ASP.NET Core has become the default choice for many new projects. Although the core idea of PDF generation remains similar, there are several implementation details that differ from the traditional ASP.NET Framework.

This section explains how to generate PDF in ASP.NET Core using C#, covering both MVC-style web applications and Web API–based services.

4.1 Generate PDF in ASP.NET Core Web Application

In an ASP.NET Core web application, PDF files are commonly generated inside controller actions and returned as downloadable files. Unlike ASP.NET Framework, ASP.NET Core does not rely on System.Web, so file handling is typically done using streams.

Below is a simple example demonstrating ASP.NET Core PDF generation in a controller.

Create a new ASP.NET Core Web App (Model-View-Controller) project in your IDE and add a new controller named PdfController with an action named CreatePdf() in the Controllers folder.

using Microsoft.AspNetCore.Mvc;
using Spire.Pdf;
using Spire.Pdf.Graphics;
using System.Drawing;

namespace CoreWebApplication.Controllers
{
    public class PdfController : Controller
    {
        public IActionResult CreatePdf()
        {
            using (PdfDocument document = new PdfDocument())
            {
                PdfPageBase page = document.Pages.Add();

                PdfFont font = new PdfFont(PdfFontFamily.Helvetica, 14f, PdfFontStyle.Bold);
                page.Canvas.DrawString(
                    "PDF generated in ASP.NET Core.",
                    font,
                    PdfBrushes.DarkRed,
                    new PointF(40, 40)
                );

                using (MemoryStream stream = new MemoryStream())
                {
                    document.SaveToStream(stream);
                    return File(
                        stream.ToArray(),
                        "application/pdf",
                        "AspNetCoreSample.pdf"
                    );
                }
            }
        }
    }
}

Below is the preview of the generated PDF document:

ASP.NET Core Web App MVC generate PDF using C#

Key Differences from ASP.NET Framework

No dependency on Server.MapPath
Stream-based file handling is the recommended pattern
Works consistently across Windows, Linux, and Docker environments

This approach is suitable for dashboards, admin panels, and internal systems where users trigger ASP.NET Core PDF generation directly from the UI.

If you want to create structured tables in your PDFs, you can check out our guide on generating tables in PDF using ASP.NET Core and C# for a step-by-step example.

4.2 Generate PDF in ASP.NET Core Web API

For front-end and back-end separated architectures, PDF generation is often implemented in ASP.NET Core Web API projects. In this scenario, the API endpoint returns a PDF file as a binary response, which can be consumed by web clients, mobile apps, or other services.

A typical ASP.NET PDF generation in Web API example looks like this:

Add this code inside a controller named PdfApiController in the Controllers folder.

using Microsoft.AspNetCore.Mvc;
using Spire.Pdf;
using Spire.Pdf.Graphics;
using System.Drawing;

[ApiController]
[Route("api/pdf")]
public class PdfApiController : ControllerBase
{
    [HttpGet("generate")]
    public IActionResult GeneratePdf()
    {
        PdfDocument document = new PdfDocument();
        PdfPageBase page = document.Pages.Add();

        PdfFont font = new PdfFont(PdfFontFamily.Helvetica, 14f, PdfFontStyle.Bold);
        page.Canvas.DrawString(
            "PDF generated by ASP.NET Core Web API.",
            font,
            PdfBrushes.BlueViolet,
            new PointF(40, 40)
        );

        using (MemoryStream stream = new MemoryStream())
        {
            document.SaveToStream(stream);
            document.Close();

            return File(
                stream.ToArray(),
                "application/pdf",
                "ApiGenerated.pdf"
            );
        }
    }
}

Below is the preview of the generated PDF document:

ASP.NET Core Web API generate PDF using C#

Practical Considerations for Web API

Always set the correct Content-Type (application/pdf)
Use streams to avoid unnecessary disk access
Suitable for microservices and distributed systems

This pattern is widely used when ASP.NET PDF generation is part of an automated workflow rather than a user-driven action.

5. Advanced Scenarios for ASP.NET PDF Generation

Basic examples are useful for learning, but real-world applications often require more advanced PDF features. This section focuses on scenarios that commonly appear in production systems and demonstrate the practical value of server-side PDF generation.

5.1 Export Dynamic Data to PDF

One of the most frequent use cases is exporting dynamic data—such as database query results—into a structured PDF document.

Typical scenarios include:

Sales reports
Order summaries
Financial statements

The example below demonstrates generating a simple table-like layout using dynamic data.

PdfDocument document = new PdfDocument();
PdfPageBase page = document.Pages.Add();

PdfFont headerFont = new PdfFont(PdfFontFamily.Helvetica, 12f, PdfFontStyle.Bold);
PdfFont bodyFont = new PdfFont(PdfFontFamily.Helvetica, 10f);

float y = 40;

// Header
page.Canvas.DrawString("Order Report", headerFont, PdfBrushes.Black, 40, y);
y += 30;

// Sample dynamic data
string[] orders = { "Order #1001 - $250", "Order #1002 - $180", "Order #1003 - $320" };

foreach (string order in orders)
{
    page.Canvas.DrawString(order, bodyFont, PdfBrushes.Black, 40, y);
    y += 20;
}

document.SaveToFile("OrderReport.pdf");
document.Close();

Output Preview:

Export Dynamic Data to PDF in ASP.NET Using C#

This approach allows you to:

Populate PDFs from databases or APIs
Generate documents dynamically per request
Maintain consistent formatting regardless of data size

5.2 Styling and Layout Control in Generated PDFs

Another important aspect of ASP.NET PDF generation is layout control. In many business documents, appearance matters as much as content.

Common layout requirements include:

Page margins and alignment
Headers and footers
Multi-page content handling

For example, adding a simple header and footer:

PdfPageBase page = document.Pages.Add();

PdfFont font = new PdfFont(PdfFontFamily.Helvetica, 12f);

page.Canvas.DrawString(
    "Company Confidential",
    font,
    PdfBrushes.Gray,
    new PointF(40, 15)
);

page.Canvas.DrawString(
    "Page 1",
    font,
    PdfBrushes.Gray,
    new PointF(page.Canvas.ClientSize.Width - 60, page.Canvas.ClientSize.Height - 30)
);

Output Preview:

Styling and Layout Control in Generated PDFs in ASP.NET Using C#

When working with multi-page documents, it is important to:

Track vertical position (y coordinate)
Add new pages when content exceeds page height
Keep layout logic consistent across pages

These considerations help ensure that generated PDFs are suitable for both on-screen viewing and printing.

5.3 Related PDF Generation Scenarios

In addition to creating PDF files directly via C# code, some ASP.NET applications use other PDF workflows depending on their requirements. Check out the following articles for more examples:

6. Choosing an ASP.NET PDF Library

When implementing PDF generation in ASP.NET or ASP.NET Core, selecting the right PDF library is a critical decision. The choice directly affects development efficiency, long-term maintainability, and runtime performance.

Instead of focusing only on feature lists, it is more practical to evaluate an ASP.NET PDF library based on how it fits real application requirements.

Key Factors to Consider

API Usability

A good PDF library should provide:

Clear object models (documents, pages, fonts, graphics)
Intuitive APIs for drawing text and layout
Minimal boilerplate code for common tasks

This is especially important for projects where PDF generation logic evolves over time.

ASP.NET and ASP.NET Core Compatibility

Many teams maintain both legacy ASP.NET applications and newer ASP.NET Core services. Choosing a library that works consistently across:

ASP.NET Framework
ASP.NET Core
MVC and Web API projects

can significantly reduce migration and maintenance costs.

3. Performance and Stability

In production environments, PDF generation often runs:

On-demand under user requests
As background jobs
Inside high-concurrency services

An ASP.NET PDF generator should be stable under load and capable of handling multi-page documents without excessive memory usage.

In practice, libraries generally fall into categories such as HTML-based converters or code-driven PDF APIs. For applications that require predictable output and fine-grained control, direct PDF creation via C# code is often the preferred approach.

7. Why Use Spire.PDF for ASP.NET PDF Creation

For developers who need to create PDF files in ASP.NET using C#, Spire.PDF for .NET provides a balanced solution that fits both tutorial examples and real-world projects.

Practical Advantages in ASP.NET Scenarios

Native support for ASP.NET and ASP.NET Core The same API can be used across classic ASP.NET, MVC, ASP.NET Core Web Apps, and Web API projects.
Code-driven PDF creation PDFs can be generated directly through C# without relying on external rendering engines or browser components.
Rich PDF features Supports text, images, tables, pagination, headers and footers, making it suitable for reports, invoices, and business documents.
Deployment-friendly Works well in server environments, including containerized and cloud-hosted ASP.NET Core applications.

Because of these characteristics, Spire.PDF fits naturally into PDF generation in ASP.NET workflows where stability, layout control, and cross-version compatibility matter more than quick HTML rendering.

For a complete reference of all available methods and classes, you can consult the official API documentation: Spire.PDF for .NET API Reference.

8. Frequently Asked Questions (FAQ)

Can I generate PDF in ASP.NET Core without MVC?

Yes. PDF generation in ASP.NET Core does not strictly require MVC. In addition to MVC controllers, PDFs can also be generated and returned from:

ASP.NET Core Web API controllers
Minimal APIs
Background services

As long as the application returns a valid PDF byte stream with the correct Content-Type, the approach works reliably.

What is the difference between generating PDF in ASP.NET and ASP.NET Core?

The core PDF creation logic is similar, but there are some differences:

ASP.NET Framework relies on System.Web features such as Server.MapPath
ASP.NET Core uses stream-based file handling
ASP.NET Core is cross-platform and better suited for modern deployment models

From a PDF API perspective, most logic can be shared between the two.

Is it possible to generate PDF directly from C# code in ASP.NET?

Yes. Many production systems generate PDFs entirely through C# code. This approach:

Avoids HTML rendering inconsistencies
Provides precise layout control
Works well for structured documents such as reports and invoices

It is a common pattern in ASP.NET PDF solutions where consistency and reliability are required.

Conclusion

Generating PDF files is a common requirement in ASP.NET and ASP.NET Core applications, especially for scenarios such as reports, invoices, and data exports. By creating PDFs directly through C# code, you gain full control over document structure, layout, and output behavior.

This guide demonstrated how to generate PDFs in both ASP.NET Framework and ASP.NET Core, covering MVC and Web API scenarios, dynamic data output, and basic layout control. It also discussed how to evaluate PDF libraries based on real application requirements.

If you plan to test these examples in a real project environment without functional limitations, you can apply for a temporary license to unlock all full features during evaluation.

Published in Document Operation

Tagged under

pdf net Document Operation

Generate PDF from HTML Template in C# – Full Tutorial

2025-10-21 08:26:43 Written by zaki zou

C-sharp Generate PDF from HTML Template

In many modern .NET applications, generating professional-looking PDF documents is a common requirement — especially for invoices, reports, certificates, and forms. Instead of creating PDFs manually, a smarter approach is to use HTML templates . HTML makes it easy to design layouts using CSS, include company branding, and reuse the same structure across multiple documents.

By dynamically inserting data into HTML and converting it to PDF programmatically, you can automate document generation while maintaining design consistency.

In this tutorial, you’ll learn how to generate a PDF from an HTML template in C# .NET using Spire.PDF for .NET. We’ll guide you step-by-step — from setting up your development environment (including the required HTML-to-PDF plugin), preparing the HTML template, inserting dynamic data, and generating the final PDF file.

On this page:

Why Generate PDFs from HTML Templates in C#?
Set Up Your .NET Environment
Prepare an HTML Template
Insert Dynamic Data into HTML Before Conversion
Convert Updated HTML Template to PDF in C#
Best Practices for Generating PDF from HTML in C#
Final Words
FAQs About C# HTML Template to PDF Conversion

Why Generate PDFs from HTML Templates in C#?

Using HTML templates for PDF generation offers several advantages:

Reusability: Design once, reuse anywhere — perfect for reports, receipts, and forms.
Styling flexibility: HTML + CSS allow rich formatting without complex PDF drawing code.
Dynamic content: Easily inject runtime data such as customer names, order totals, or timestamps.
Consistency: Ensure all generated documents follow the same layout and style guidelines.
Ease of maintenance: You can update the HTML template without changing your C# logic.

Set Up Your .NET Environment

Before you begin coding, make sure your project is properly configured to handle HTML-to-PDF conversion.

1. Install Spire.PDF for .NET

Spire.PDF for .NET is a professional library designed for creating, reading, editing, and converting PDF documents in C# and VB.NET applications—without relying on Adobe Acrobat. It provides powerful APIs for handling text, images, annotations, forms, and HTML-to-PDF conversion.

You can install it via NuGet:

Install-Package Spire.PDF

Or download it directly from the official website and reference the DLL in your project.

2. Install the HTML Rendering Plugin

Spire.PDF relies on an external rendering engine (Qt WebEngine or Chrome) to accurately convert HTML content into PDF. This plugin must be installed separately.

Steps:

Download the plugin package for your platform.

Extract the contents to a local folder, such as: C:\plugins-windows-x64\plugins
In your C# code, register the plugin path before performing the conversion.

HtmlConverter.PluginPath = @"C:\plugins-windows-x64\plugins";

Prepare an HTML Template

Create an HTML file with placeholders for your dynamic data. For example, name it invoice_template.html :

<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>Invoice</title>
  <style>
    body { font-family: Arial; margin: 40px; }
    .header { font-size: 24px; font-weight: bold; margin-bottom: 20px; }
    table { width: 100%; border-collapse: collapse; margin-top: 20px; }
    th, td { border: 1px solid #999; padding: 8px; text-align: left; }
  </style>
</head>
<body>
  <div class="header">Invoice for {CustomerName}</div>
  <p>Date: {InvoiceDate}</p>
  <table>
    <tr><th>Item</th><th>Price</th></tr>
    <tr><td>{Item}</td><td>{Price}</td></tr>
  </table>
</body>
</html>

Tips:

Keep CSS inline or embedded within the HTML file.
Avoid JavaScript or complex animations.
Use placeholders like {CustomerName} and {InvoiceDate} for data replacement.

Insert Dynamic Data into HTML Before Conversion

You can read your HTML file as text, replace placeholders with real values, and then save it as a new temporary file.

using System;
using System.IO;

string template = File.ReadAllText("invoice_template.html");

template = template.Replace("{CustomerName}", "John Doe");
template = template.Replace("{InvoiceDate}", DateTime.Now.ToShortDateString());
template = template.Replace("{Item}", "Wireless Mouse");
template = template.Replace("{Price}", "$25.99");

File.WriteAllText("invoice_ready.html", template);

This approach lets you generate customized PDFs for each user or transaction dynamically.

Convert Updated HTML Template to PDF in C#

Now that your HTML content is ready, you can use the HtmlConverter.Convert() method to directly convert the HTML string into a PDF file.

Below is a full code example to create PDF from HTML template file in C#:

using System;
using System.Collections.Generic;
using System.Drawing;
using System.IO;
using Spire.Additions.Qt;
using Spire.Pdf.Graphics;
using Spire.Pdf.HtmlConverter;

namespace CreatePdfFromHtmlTemplate
{
    class Program
    {
        static void Main(string[] args)
        {
            // Path to the HTML template file
            string htmlFilePath = "invoice_template.html";

            // Step 1: Read the HTML template from file
            if (!File.Exists(htmlFilePath))
            {
                Console.WriteLine("Error: HTML template file not found.");
                return;
            }

            string htmlTemplate = File.ReadAllText(htmlFilePath);

            // Step 2: Define dynamic data for invoice placeholders
            Dictionary<string, string> invoiceData = new Dictionary<string, string>()
            {
                { "INVOICE_NUMBER", "INV-2025-001" },
                { "INVOICE_DATE", DateTime.Now.ToString("yyyy-MM-dd") },
                { "BILLER_NAME", "John Doe" },
                { "BILLER_ADDRESS", "123 Main Street, New York, USA" },
                { "BILLER_EMAIL", "john.doe@example.com" },
                { "ITEM_DESCRIPTION", "Consulting Services" },
                { "ITEM_QUANTITY", "10" },
                { "ITEM_UNIT_PRICE", "$100" },
                { "ITEM_TOTAL", "$1000" },
                { "SUBTOTAL", "$1000" },
                { "TAX_RATE", "5" },
                { "TAX", "$50" },
                { "TOTAL", "$1050" }
            };

            // Step 3: Replace placeholders in the HTML template with real values
            string populatedInvoice = PopulateInvoice(htmlTemplate, invoiceData);

            // Optional: Save the populated HTML for debugging or review
            File.WriteAllText("invoice_ready.html", populatedInvoice);

            // Step 4: Specify the plugin path for the HTML to PDF conversion
            string pluginPath = @"C:\plugins-windows-x64\plugins";
            HtmlConverter.PluginPath = pluginPath;

            // Step 5: Define output PDF file path
            string outputFile = "InvoiceOutput.pdf";

            try
            {
                // Step 6: Convert the HTML string to PDF
                HtmlConverter.Convert(
                    populatedInvoice,
                    outputFile,
                    enableJavaScript: true,
                    timeout: 100000,                // 100 seconds
                    pageSize: new SizeF(595, 842),  // A4 size in points
                    margins: new PdfMargins(20),    // 20-point margins
                    loadHtmlType: LoadHtmlType.SourceCode
                );

                Console.WriteLine($"PDF generated successfully: {outputFile}");
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Error during PDF generation: {ex.Message}");
            }
        }

        /// <summary>
        /// Helper method: Replaces placeholders in the HTML with actual data values.
        /// </summary>
        private static string PopulateInvoice(string template, Dictionary<string, string> data)
        {
            string result = template;
            foreach (var entry in data)
            {
                result = result.Replace("{" + entry.Key + "}", entry.Value);
            }
            return result;
        }
    }
}

How it works :

Create an invoice template with placeholder variables in {VARIABLE_NAME} format.
Set up a dictionary with key-value pairs containing actual invoice data that matches the template placeholders.
Replace all placeholders in the HTML template with actual values from the data dictionary.
Use Spire.PDF with the Qt plugin to render the HTML content as a PDF file.

Result:

Create PDF from Template in C#

Best Practices for Generating PDF from HTML in C#

Use fixed-width layouts: Avoid fluid or responsive designs to maintain consistent rendering.
Embed or inline CSS: Ensure your styles are self-contained.
Use standard fonts: Arial, Times New Roman, or other supported fonts convert reliably.
Keep images lightweight: Compress large images to improve performance.
Test with different page sizes: A4 and Letter are the most common formats.
Avoid unsupported tags: Elements relying on JavaScript (like <canvas>) won’t render.

Final Words

Generating PDFs from HTML templates in C# .NET is a powerful way to automate document creation while preserving visual consistency. By combining Spire.PDF for .NET with the HTML rendering plugin , you can easily transform styled HTML layouts into print-ready PDF files that integrate seamlessly with your applications.

Whether you’re building a reporting system, an invoicing tool, or a document automation service, this approach saves time, reduces complexity, and produces professional results with minimal code.

FAQs About C# HTML Template to PDF Conversion

Q1: Can I use Google Chrome for HTML rendering instead of Qt WebEngine?

Absolutely. For advanced HTML, CSS, or modern JavaScript, we recommend using the Google Chrome engine via the ChromeHtmlConverter class for more precise and reliable PDF results.

For a complete guide, see our article: Convert HTML to PDF using ChromeHtmlConverter

Q2: Do I need to install a plugin for every machine running my application?

Yes, each environment must have access to the HTML rendering plugin (Qt or Chrome engine) for successful HTML-to-PDF conversion.

Q3: Does Spire.PDF support external CSS files or online resources?

Yes, but inline or embedded CSS is recommended for better rendering accuracy.

Q4: Can I use this approach in ASP.NET or web APIs?

Absolutely. You can generate PDFs server-side and return them as downloadable files or streams.

Q5: Is JavaScript supported during HTML rendering?

Limited support. Static elements are rendered correctly, but scripts and dynamic DOM manipulations are not executed.

Get a Free License

To fully experience the capabilities of Spire.PDF for .NET without any evaluation limitations, you can request a free 30-day trial license.

Published in Document Operation

Tagged under

pdf net Document Operation

Edit PDF in C# .NET: Step-by-Step Guide with Code Examples

2025-09-30 09:29:33 Written by zaki zou

How to Edit PDF using C# .NET

PDF (Portable Document Format) is widely used for sharing, distributing, and preserving documents because it maintains a consistent layout and formatting across platforms. Developers often need to edit PDF files in C#, whether it's to replace text, insert images, add watermarks, or extract pages.

In this step-by-step tutorial, you will learn how to programmatically edit PDFs in C# with the Spire.PDF for .NET library.

Why Edit PDFs Programmatically in C#
C# Library to Edit PDFs
Step-by-Step Guide: Editing PDF in C#
Tips for Efficient PDF Editing in C#
Conclusion
FAQs

Why Edit PDFs Programmatically in C

While tools like Adobe Acrobat provide manual PDF editing, programmatically editing PDFs has significant advantages:

Automation: Batch process hundreds of documents without human intervention.
Integration: Edit PDFs as part of a workflow, such as generating reports, invoices, or certificates dynamically.
Consistency: Apply uniform styling, stamps, or watermarks across multiple PDFs.
Flexibility: Extract or replace content programmatically to integrate with databases or external data sources.

C# Library to Edit PDFs

Spire.PDF for .NET is a robust .NET PDF library that enables developers to generate, read, edit, and convert PDF files in .NET applications. It's compatible with both .NET Framework and .NET Core applications.

Spire.PDF - C# Edit PDF Library

This library provides a rich set of features for developers working with PDFs:

PDF Creation: Generate new PDFs from scratch or from existing documents.
Text Editing: Add, replace, or delete text on any page.
Image Editing: Insert images, resize, or remove them.
Page Operations: Insert, remove, extract, or reorder pages.
Annotations: Add stamps, comments, and shapes for marking content.
Watermarking: Add text or image watermarks for branding or security.
Form Handling: Create and fill PDF forms programmatically.
Digital Signatures: Add and validate signatures for authenticity.
Encryption: Apply password protection and user permissions.

Step-by-Step Guide: Editing PDF in C

Modifying a PDF file in C# involves several steps: setting up a C# project, installing the library, loading the PDF file, making necessary changes, and saving the document. Let's break down each step in detail.

Step 1: Set Up Your C# Project

Before you start editing PDFs, you need to create a new C# project by following the steps below:

Open Visual Studio.
Create a new project. You can choose a Console App or a Windows Forms App depending on your use case.
Name your project (e.g., PdfEditorDemo) and click Create.

Step 2: Install Spire.PDF

Next, you need to install the Spire.PDF library, which provides all the functionality required to read, edit, and save PDF files programmatically.

You can simply install it via the NuGet Package Manager Console with the following command:

Install-Package Spire.PDF

Alternatively, you can use the NuGet Package Manager GUI to search for Spire.PDF and click Install.

Step 3: Load an Existing PDF

Before you can modify an existing PDF file, you need to load it into a PdfDocument object. This gives you access to its pages, text, images, and structure.

using Spire.Pdf;
PdfDocument pdf = new PdfDocument();
pdf.LoadFromFile("example.pdf");

Step 4: Edit PDF Content

Text editing, image insertion, page management, and watermarking are common operations when working with PDFs. This step covers all these editing tasks.

4.1 Edit Text

Text editing is one of the most common operations when working with PDFs. Depending on your needs, you might want to replace existing text or add new text to specific pages.

Replace existing text:

Replacing text in PDF allows you to update content across a single page or an entire PDF while maintaining formatting consistency. Using the PdfTextReplacer class, you can quickly find and replace text programmatically:

// Get the first page
PdfPageBase page = pdf.Pages[0];
// Create a PdfTextReplacer
PdfTextReplacer textReplacer = new PdfTextReplacer(page);

// Replace all occurrences of target text with new text
textReplacer.ReplaceAllText("Old text", "New text");

Add new text:

In addition to replacing existing content, you may need to insert new text into a PDF. With just one line of code, you can add text to any location on a PDF page:

page.Canvas.DrawString(
    "Hello, World!",
    new PdfTrueTypeFont(new Font("Arial Unicode MS", 15f, FontStyle.Bold), true),
    new PdfSolidBrush(Color.Black),
    90, 30
);

4.2 Insert and Update Images

PDFs often contain visual elements such as logos, charts, or illustrations. You can insert new images or update outdated graphics to enhance the document's visual appeal.

Insert an Image:

// Load an image
PdfImage image = PdfImage.FromFile("logo.png");
// Draw the image at a specific location with defined size
page.Canvas.DrawImage(image, 100, 150, 200, 100);

Update an image:

// Load the new image
PdfImage newImage = PdfImage.FromFile("image1.jpg");
// Create a PdfImageHelper instance
PdfImageHelper imageHelper = new PdfImageHelper();
// Get the image information from the page
PdfImageInfo[] imageInfo = imageHelper.GetImagesInfo(page);
// Replace the first image on the page with the new image
imageHelper.ReplaceImage(imageInfo[0], newImage);

4.3 Add, Remove, or Extract Pages

Managing page structure is another important aspect of PDF editing, such as adding new pages, removing unwanted pages, and extracting particular pages to a new document.

Add a new page:

// Add a new page
PdfPageBase newPage = pdf.Pages.Add();

Remove a page:

// Remove the last page
pdf.Pages.RemoveAt(pdf.Pages.Count - 1);

Extract a page to a new document:

// Create a new PDF document
PdfDocument newPdf = new PdfDocument();
// Extract the third page to a new PDF document
newPdf.InsertPage(pdf, pdf.Pages[2]);
// Save the new PDF document
newPdf.SaveToFile("extracted_page.pdf");

4.4 Add Watermarks

Adding Watermarks to PDFs can help indicate confidentiality, add branding, or protect intellectual property. You can easily add them programmatically to any page:

// Iterate through each page in the PDF document
foreach (PdfPageBase page in pdf.Pages)
{
    // Create a tiling brush for the watermark
    // The brush size is set to half the page width and one-third of the page height
    PdfTilingBrush brush = new PdfTilingBrush(
        new SizeF(page.Canvas.ClientSize.Width / 2, page.Canvas.ClientSize.Height / 3));

    // Set the brush transparency to 0.3 for a semi-transparent watermark
    brush.Graphics.SetTransparency(0.3f);

    // Save the current graphics state for later restoration
    brush.Graphics.Save();

    // Move the origin of the brush to its center to prepare for rotation
    brush.Graphics.TranslateTransform(brush.Size.Width / 2, brush.Size.Height / 2);

    // Rotate the coordinate system by -45 degrees to angle the watermark
    brush.Graphics.RotateTransform(-45);

    // Draw the watermark text on the brush
    // Using Helvetica font, size 24, violet color, centered alignment
    brush.Graphics.DrawString(
        "DO NOT COPY",
        new PdfFont(PdfFontFamily.Helvetica, 24),
        PdfBrushes.Violet,
        0, 0,
        new PdfStringFormat(PdfTextAlignment.Center));

    // Restore the previously saved graphics state, undoing rotation and translation
    brush.Graphics.Restore();

    // Reset the transparency to fully opaque
    brush.Graphics.SetTransparency(1);

    // Draw the brush over the entire page area to apply the watermark
    page.Canvas.DrawRectangle(brush, new RectangleF(new PointF(0, 0), page.Canvas.ClientSize));
}

Step 5: Save the Modified PDF

After making all the necessary edits, the final step is to save your changes.

// Save the Modified PDF and release resources
pdf.SaveToFile("modified.pdf");
pdf.Close();

Output PDF

The output modified.pdf looks like this:

C# Edit PDF Output

Tips for Efficient PDF Editing in C

When editing PDFs programmatically, it's important to keep a few best practices in mind to ensure the output remains accurate, readable, and efficient.

Batch Processing: For repetitive tasks, process multiple PDF files in a loop rather than handling them individually. This approach improves efficiency and reduces manual effort.
Text Placement: Use coordinates carefully when inserting new text. Proper positioning prevents content from overlapping with existing elements and maintains a clean layout.
Fonts and Encoding: Choose fonts that support the characters you need. This is especially critical for languages such as Chinese, Arabic, or other scripts that require extended font support.
Memory Management: Always release resources by disposing of PdfDocument objects after use. Proper memory management helps avoid performance issues in larger applications.

Conclusion

This tutorial demonstrates how to edit PDF in C# using Spire.PDF. From replacing text, inserting images, and managing pages, to adding watermarks, each step includes practical code examples. Developers can now automate PDF editing, enhance document presentation, and handle PDFs efficiently within professional applications.

FAQs

Q1: How can I programmatically edit text in a PDF using C#?

A1: You can use a C# PDF library like Spire.PDF to replace existing text or add new text to a PDF. Classes such as PdfTextReplacer and page.Canvas.DrawString() provide precise control over text editing while preserving formatting.

Q2: How do I replace or add text in a PDF using C#?

A2: With C#, libraries like Spire.PDF let you search and replace existing text using PdfTextReplacer or add new text anywhere on a page using page.Canvas.DrawString().

Q3: Can I insert or update images in a PDF programmatically?

A3: Yes. You can load images into your project and use classes like PdfImage and PdfImageHelper to draw or replace images on a PDF page.

Q4: Is it possible to add watermarks to a PDF using code?

A4: Absolutely. You can add text or image watermarks programmatically, control transparency, rotation, and position, and apply them to one or all pages of a PDF.

Q5: How can I extract specific pages from a PDF?

A5: You can create a new PDF document and insert selected pages from the original PDF, enabling you to extract single pages or ranges for separate use.

Published in Document Operation

Tagged under

pdf net Document Operation

Convert HTML to Text in C#: Easy Methods + Examples

2025-09-24 02:01:39 Written by zaki zou

Convert HTML to Plain Text using C#

In .NET development, converting HTML to plain text is a common task, whether you need to extract content from web pages, process HTML emails, or generate lightweight text reports. However, HTML’s rich formatting, tags, and structural elements can complicate workflows that require clean, unformatted text. This is why using C# for HTML to text conversion becomes essential.

Spire.Doc for .NET simplifies this process: it’s a robust library for document manipulation that natively supports loading HTML files/strings and converting them to clean plain text. This guide will explore how to convert HTML to plain text in C# using the library, including detailed breakdowns of two core scenarios: converting HTML strings (in-memory content) and HTML files (disk-based content).

Why Use Spire.Doc for HTML to Text Conversion?
Installing Spire.Doc
Convert HTML Strings to Text in C#
Convert HTML File to Text in C#
FAQs
Conclusion

Why Use Spire.Doc for HTML to Text Conversion?

Spire.Doc is a .NET document processing library that stands out for HTML-to-text conversion due to:

Simplified Code: Minimal lines of code to handle even complex HTML.
Structure Preservation: Maintains logical formatting (line breaks, list indentation) in the output text.
Special Character Support: Automatically converts HTML entities to their plain text equivalents.
Lightweight: Avoids heavy dependencies, making it suitable for both desktop and web applications

Installing Spire.Doc

Spire.Doc is available via NuGet, the easiest way to manage dependencies:

In Visual Studio, right-click your project > Manage NuGet Packages.
Search for Spire.Doc and install the latest stable version.
Alternatively, use the Package Manager Console:

Install-Package Spire.Doc

After installing, you can dive into the C# code to extract text from HTML.

Convert HTML Strings to Text in C#

This example renders an HTML string into a Document object, then uses SaveToFile() to save it as a plain text file.

using Spire.Doc;
using Spire.Doc.Documents;

namespace HtmlToTextSaver
{
    class Program
    {
        static void Main(string[] args)
        {
            // Define HTML content
            string htmlContent = @"
                    <html>
                        <body>
                            <h1>Sample HTML Content</h1>
                            <p>This is a paragraph with <strong>bold</strong> and <em>italic</em> text.</p>
                            <p>Another line with a <a href='https://example.com'>link</a>.</p>
                            <ul>
                                <li>List item 1</li>
                                <li>List item 2 (with <em>italic</em> text)</li>
                            </ul>
                            <p>Special characters: &copy; &amp; &reg;</p>
                        </body>
                    </html>";

            // Create a Document object
            Document doc = new Document();
            // Add a section to hold content
            Section section = doc.AddSection();
            // Add a paragraph
            Paragraph paragraph = section.AddParagraph();
            // Render HTML into the paragraph
            paragraph.AppendHTML(htmlContent); 

            // Save as plain text
            doc.SaveToFile("HtmlStringtoText.txt", FileFormat.Txt);
        }
    }
}

How It Works:

HTML String Definition: We start with a sample HTML string containing headings, paragraphs, formatting tags (<strong>, <em>), links, lists, and special characters.
Document Setup: A Document object is created to manage the content, with a Section and Paragraph to structure the HTML rendering.
HTML Rendering: AppendHTML() parses the HTML string and converts it into the document's internal structure, preserving content hierarchy.
Text Conversion: SaveToFile() with FileFormat.Txt converts the rendered content to plain text, stripping HTML tags while retaining readable structure.

Output:

Convert an HTML string to plain text using C#

Extended reading: Parse or Read HTML in C#

Convert HTML File to Text in C#

This example directly loads an HTML file and converts it to text. Ideal for batch processing or working with pre-existing HTML documents (e.g., downloaded web pages, local templates).

using Spire.Doc;
using Spire.Doc.Documents;

namespace HtmlToText
{
    class Program
    {
        static void Main()
        {
            // Create a Document object
            Document doc = new Document();

            // Load an HTML file
            doc.LoadFromFile("sample.html", FileFormat.Html, XHTMLValidationType.None);

            // Convert HTML to plain text
            doc.SaveToFile("HTMLtoText.txt", FileFormat.Txt);
            doc.Dispose();
        }
    }
}

How It Works:

Document Initialization: A Document object is created to handle the file operations.
HTML File Loading: LoadFromFile() imports the HTML file, with FileFormat.Html specifying the input type. XHTMLValidationType.None ensures compatibility with non-strict HTML.
Text Conversion: SaveToFile() with FileFormat.Txt converts the loaded HTML content to plain text.

Convert an HTML file to plain text using C#

To preserve the original formatting and style, you can refer to the C# tutorial to convert the HTML file to Word.

FAQs

Q1: Can Spire.Doc process malformed HTML?

A: Yes. Spire.Doc includes built-in tolerance for malformed HTML, but you may need to disable strict validation to ensure proper parsing.

When loading HTML files, use XHTMLValidationType.None (as shown in the guide) to skip strict XHTML checks:

doc.LoadFromFile("malformed.html", FileFormat.Html, XHTMLValidationType.None);

This setting tells Spire.Doc to parse the HTML like a web browser (which automatically corrects minor issues like unclosed <p> or <li> tags) instead of rejecting non-compliant content.

Q2: Can I extract specific elements from HTML (like only paragraphs or headings)?

A: Yes, after loading the HTML into a Document object, you can access specific elements through the object model (like paragraphs, tables, etc.) and extract text from only those specific elements rather than the entire document.

Q3: Can I convert HTML to other formats besides plain text using Spire.Doc?

A: Yes, Spire.Doc supports conversion to multiple formats, including Word DOC/DOCX, PDF, image, RTF, and more, making it a versatile document processing solution.

Q4: Does Spire.Doc work with .NET Core/.NET 5+?

A: Spire.Doc fully supports .NET Core, .NET 5/6/7/8, and .NET Framework 4.0+. There’s no difference in functionality across these frameworks, which means you can use the same code (e.g., Document, AppendHTML(), SaveToFile()) regardless of which .NET runtime you’re targeting.

Conclusion

Converting HTML to text in C# is straightforward with the Spire.Doc library. Whether you’re working with HTML strings or files, Spire.Doc simplifies the process by handling HTML parsing, structure preservation, and text conversion. By following the examples in this guide, you can seamlessly integrate HTML-to-text conversion into your C# applications.

You can request a free 30-day trial license here to unlock full functionality and remove limitations of the Spire.Doc library.

Published in Conversion

Tagged under

doc net Conversion

HTML Parsing in C#: Strings, Files, and URLs

2025-09-16 10:10:02 Written by zaki zou

How to Parse HTML in C#

The need to efficiently parse HTML in C# is a common requirement for many development tasks, from web scraping, data extraction to content automation. While .NET offers built-in tools (e.g., HtmlAgilityPack), Spire.Doc simplifies HTML parsing in C# with its intuitive object model and seamless integration.

This guide explores how to leverage Spire.Doc for .NET to parse HTML, including loading HTML from various sources, navigating document structure, and extracting critical data.

Setting Up Spire.Doc
How Spire.Doc Parses HTML
How to Load and Parse HTML Content
Conclusion
Common Questions

Setting Up Spire.Doc

The easiest way to integrate the C# HTML parser library into your project is via NuGet:

Open your project in Visual Studio.
Right-click the project in the Solution Explorer → Select Manage NuGet Packages.
In the NuGet Package Manager, search for Spire.Doc.
Select the latest stable version and click Install.

Alternatively, download the library directly from the E-iceblue website, extract the ZIP file, and reference Spire.Doc.dll in your project.

How Spire.Doc Parses HTML

Spire.Doc converts HTML into a structured object model, where elements like <p>, <a>, and <table> are mapped to classes you can programmatically access. Key components include:

Document: Acts as the container for parsed HTML content.
Section: Represents a block of content (similar to HTML’s <body> or <div> sections).
Paragraph: Maps to HTML block elements like <p>, <h1>, or <li>.
DocumentObject: Base class for all elements within a Paragraph (images, links, etc.).

This model ensures that HTML structures are preserved and accessible via intuitive C# properties and methods.

How to Load and Parse HTML Content

Spire.Doc supports parsing HTML from strings, local files, or even remote URLs (when combined with HTTP clients). Below are detailed examples for each scenario.

Parse an HTML String in C#

Parse an HTML string (e.g., from a web API or database) into Spire.Doc’s object model for inspection.

using Spire.Doc;
using Spire.Doc.Documents;

namespace ParseHtmlString
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a Document object
            Document doc = new Document();

            // Add a section to act as a container
            Section section = doc.AddSection();
            // Add a paragraph
            Paragraph para = section.AddParagraph();

            // Define HTML content to parse
            string htmlContent = @"
                <h2>Sample HTML String</h2>
                <p>This is a paragraph with <strong>bold text</strong> and a <a href='https://www.e-iceblue.com/'>link</a>.</p>
                <ul>
                    <li>List item 1</li>
                    <li>List item 2</li>
                </ul>
            ";

            // Parse the HTML string into the paragraph
            para.AppendHTML(htmlContent);

            // Print all paragraph text
            Console.WriteLine("Parsed HTML Content:");
            Console.WriteLine("---------------------");

            foreach (Paragraph paragraph in section.Paragraphs)
            {
                Console.WriteLine(paragraph.Text);
            }
        }
    }
}

In this code, the method AppendHTML() automatically converts HTML tags to corresponding Spire.Doc objects (e.g., <h1> → Heading1 style, <ul> → list paragraphs).

Output:

Parse an HTML string using C#

Pro Tip: You can also call the SaveToFile() method to convert the HTML string to Word in C#.

Parse an HTML File in C#

For HTML content stored in a file (e.g., downloaded web pages, static HTML reports), load it via LoadFromFile() and then analyze its structure (e.g., extracting headings, paragraphs).

using Spire.Doc;
using Spire.Doc.Documents;

namespace ParseHtmlFile
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a Document object
            Document doc = new Document();

            // Load an HTML file
            doc.LoadFromFile("sample.html", FileFormat.Html);

            // Traverse sections (HTML body blocks)
            foreach (Section section in doc.Sections)
            {
                Console.WriteLine($"Section {doc.Sections.IndexOf(section) + 1}:");
                Console.WriteLine("---------------------------------");

                // Traverse paragraphs in the section
                foreach (Paragraph para in section.Paragraphs)
                {
                    // Print paragraph text and style (e.g., heading level)
                    string styleName = para.StyleName;
                    Console.WriteLine($"[{styleName}] {para.Text}"+ "\n");
                }
                Console.WriteLine();
            }
        }
    }
}

This C# code example loads a local HTML file and then uses the Paragraph.StyleName and Paragraph.Text properties to extract content along with its styling information.

Output:

Load and parse an HTML file using C#

Spire.Doc’s object model allows you to interact with an HTML file just like you would with a Word document. In addition to extracting text content, you can also extract elements like links, tables from HTML.

Parse a URL in C#

To parse HTML from a web page, combine Spire.Doc with HttpClient to fetch the HTML content first, then parse it.

using Spire.Doc;
using Spire.Doc.Documents;

namespace HtmlUrlParsing
{
    class Program
    {
        // HttpClient instance for web requests
        private static readonly HttpClient httpClient = new HttpClient();

        static async Task Main(string[] args)
        {
            try
            {
                // Fetch HTML from a URL
                string url = "https://www.e-iceblue.com/privacypolicy.html";
                Console.WriteLine($"Fetching HTML from: {url}");
                string htmlContent = await FetchHtmlFromUrl(url);

                // Parse the fetched HTML
                Document doc = new Document();
                Section section = doc.AddSection();
                Paragraph paragraph = section.AddParagraph();
                paragraph.AppendHTML(htmlContent);

                // Extract key information
                Console.WriteLine("\nParsed Content Summary:");
                Console.WriteLine($"Sections: {doc.Sections.Count}");
                Console.WriteLine($"Paragraphs: {section.Paragraphs.Count}");
                Console.WriteLine("-------------------------------------------");

                // Extract all heading paragraphs
                foreach (Paragraph para in section.Paragraphs)
                {
                    if (para.StyleName.StartsWith("Heading"))
                    {
                        string headings = para.Text;
                        Console.WriteLine($"Headings: {headings}");
                    }
                }
                
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Error: {ex.Message}");
            }
        }

        // Helper method to fetch HTML from a URL
        private static async Task<string> FetchHtmlFromUrl(string url)
        {
            // Set a user-agent to avoid being blocked by servers
            httpClient.DefaultRequestHeaders.UserAgent.ParseAdd("Mozilla/5.0 (Windows NT 10.0; Win64; x64)");

            // Send GET request and return HTML content
            HttpResponseMessage response = await httpClient.GetAsync(url);
            response.EnsureSuccessStatusCode(); // Throw on HTTP errors (4xx, 5xx)
            return await response.Content.ReadAsStringAsync();
        }
    }
}

This C# code combines web scraping (fetching HTML from a URL) with document parsing (using Spire.Doc) to extract structured information (like headings) from web content. It’s useful for scenarios like content analysis or web data extraction.

Output:

Parse HTML from a web URL using C#

Conclusion

Spire.Doc for .NET provides a comprehensive solution for reading HTML in C# applications. Whether you're working with HTML strings, local files, or even web URLs, this library streamlines the process with intuitive APIs and reliable performance. By following the examples outlined in this guide, you can efficiently integrate HTML parsing capabilities into your .NET projects.

To fully experience the capabilities of Spire.Doc for .NET, request a free 30-day trial license here.

Common Questions

Q1: Why use Spire.Doc for HTML parsing instead of HtmlAgilityPack?

A: Spire.Doc and HtmlAgilityPack serve different primary goals, so the choice depends on your needs:

HtmlAgilityPack: A lightweight library only for parsing and manipulating raw HTML (e.g., extracting tags, fixing invalid HTML). It does not handle document formatting or export to Word.
Spire.Doc: Designed for document manipulation first - it parses HTML and maps it directly to structured Word elements (sections, paragraphs, styles like headings/bold). This is critical if you need to:
- Preserve HTML structure in an editable Word file.
- Extract styled content (e.g., identify "Heading 1" vs. "Normal" text).
- Export parsed HTML to RTF, TXT, PDF, etc.

Q2. How do I convert HTML to Text in C#

A: To convert an HTML file to plain text in C#, get its text content via the GetText() method and then write the result to a .txt file.

// Create a Document object
Document doc = new Document();

// Load an HTML file
doc.LoadFromFile("sample.html", FileFormat.Html);

// Get text from HTML
string text = doc.GetText();

// Write to a text file
File.WriteAllText("HTMLText.txt", text);

Q3: Can Spire.Doc handle malformed or incomplete HTML?

A: Spire.Doc has good error tolerance and can handle imperfect HTML to some extent. However, severely malformed HTML might cause parsing issues. For best results, ensure your HTML is well-formed or use HTML sanitization libraries before parsing with Spire.Doc.

Q3: Can I use Spire.Doc in ASP.NET Core applications?

A: Yes, Spire.Doc is fully compatible with ASP.NET Core applications. The installation and usage process is the same as in other .NET applications.

Published in Document Operation

Tagged under

doc net Operation

News Category

.NET (1316)

Children categories

Installing Spire.Printing

Platform Compatibility

The Core Printing Workflow and Printing Settings

Printing Word, Excel, PowerPoint, PDF, and Other Documents

Printing Word Documents in C#

Printing Excel Files in C#

Printing PDF Files in C#

Printing PowerPoint Presentations in C#

Advanced Print Settings and Operations

Discovering and Selecting a Printer

Choosing a Supported Paper Size

Selecting Specific Pages

Licensing Notes

Conclusion

1. Overview: Creating PDF Directly in ASP.NET Using C#

2. Environment Setup for ASP.NET PDF Generation

2.1. .NET SDK Requirements

2.2. Installing the Spire.PDF for .NET Library

2.3. Project Template Considerations

3. How to Create PDF in ASP.NET (Framework) Using C#

3.1 Create a Simple PDF File in ASP.NET

3.2 Generate PDF in ASP.NET MVC

Practical Notes for MVC Projects

4. Generate PDF in ASP.NET Core Applications

4.1 Generate PDF in ASP.NET Core Web Application

Key Differences from ASP.NET Framework

4.2 Generate PDF in ASP.NET Core Web API

Practical Considerations for Web API

5. Advanced Scenarios for ASP.NET PDF Generation

5.1 Export Dynamic Data to PDF

5.2 Styling and Layout Control in Generated PDFs

5.3 Related PDF Generation Scenarios

6. Choosing an ASP.NET PDF Library

Key Factors to Consider

7. Why Use Spire.PDF for ASP.NET PDF Creation

Practical Advantages in ASP.NET Scenarios

8. Frequently Asked Questions (FAQ)

Can I generate PDF in ASP.NET Core without MVC?

What is the difference between generating PDF in ASP.NET and ASP.NET Core?

Is it possible to generate PDF directly from C# code in ASP.NET?

Conclusion

Why Generate PDFs from HTML Templates in C#?

Set Up Your .NET Environment

Prepare an HTML Template

Insert Dynamic Data into HTML Before Conversion

Convert Updated HTML Template to PDF in C#

Best Practices for Generating PDF from HTML in C#

Final Words

FAQs About C# HTML Template to PDF Conversion

Q1: Can I use Google Chrome for HTML rendering instead of Qt WebEngine?

Q2: Do I need to install a plugin for every machine running my application?

Q3: Does Spire.PDF support external CSS files or online resources?

Q4: Can I use this approach in ASP.NET or web APIs?

Q5: Is JavaScript supported during HTML rendering?

Get a Free License

Table of Contents

Why Edit PDFs Programmatically in C

C# Library to Edit PDFs

Step-by-Step Guide: Editing PDF in C

Step 1: Set Up Your C# Project

Step 2: Install Spire.PDF

Step 3: Load an Existing PDF

Step 4: Edit PDF Content

4.1 Edit Text

4.2 Insert and Update Images

4.3 Add, Remove, or Extract Pages

4.4 Add Watermarks

Step 5: Save the Modified PDF

Tips for Efficient PDF Editing in C

Conclusion

FAQs

Q1: How can I programmatically edit text in a PDF using C#?

Q2: How do I replace or add text in a PDF using C#?

Q3: Can I insert or update images in a PDF programmatically?

Q4: Is it possible to add watermarks to a PDF using code?

Q5: How can I extract specific pages from a PDF?

Why Use Spire.Doc for HTML to Text Conversion?