
In .NET development, converting HTML to plain text is a common task, whether you need to extract content from web pages, process HTML emails, or generate lightweight text reports. However, HTML’s rich formatting, tags, and structural elements can complicate workflows that require clean, unformatted text. This is why using C# for HTML to text conversion becomes essential.
Spire.Doc for .NET simplifies this process: it’s a robust library for document manipulation that natively supports loading HTML files/strings and converting them to clean plain text. This guide will explore how to convert HTML to plain text in C# using the library, including detailed breakdowns of two core scenarios: converting HTML strings (in-memory content) and HTML files (disk-based content).
- Why Use Spire.Doc for HTML to Text Conversion?
- Installing Spire.Doc
- Convert HTML Strings to Text in C#
- Convert HTML File to Text in C#
- FAQs
- Conclusion
Why Use Spire.Doc for HTML to Text Conversion?
Spire.Doc is a .NET document processing library that stands out for HTML-to-text conversion due to:
- Simplified Code: Minimal lines of code to handle even complex HTML.
- Structure Preservation: Maintains logical formatting (line breaks, list indentation) in the output text.
- Special Character Support: Automatically converts HTML entities to their plain text equivalents.
- Lightweight: Avoids heavy dependencies, making it suitable for both desktop and web applications.
Installing Spire.Doc
Spire.Doc is available via NuGet, the easiest way to manage dependencies:
- In Visual Studio, right-click your project > Manage NuGet Packages.
- Search for Spire.Doc and install the latest stable version.
- Alternatively, use the Package Manager Console:
Install-Package Spire.Doc
After installing, you can dive into the C# code to extract text from HTML.
Convert HTML Strings to Text in C#
This example renders an HTML string into a Document object, then uses SaveToFile() to save it as a plain text file.
using Spire.Doc;
using Spire.Doc.Documents;
namespace HtmlToTextSaver
{
class Program
{
static void Main(string[] args)
{
// Define HTML content
string htmlContent = @"
<html>
<body>
<h1>Sample HTML Content</h1>
<p>This is a paragraph with <strong>bold</strong> and <em>italic</em> text.</p>
<p>Another line with a <a href='https://example.com'>link</a>.</p>
<ul>
<li>List item 1</li>
<li>List item 2 (with <em>italic</em> text)</li>
</ul>
<p>Special characters: &copy; &amp; &reg;</p>
</body>
</html>";
// Create a Document object
Document doc = new Document();
// Add a section to hold content
Section section = doc.AddSection();
// Add a paragraph
Paragraph paragraph = section.AddParagraph();
// Render HTML into the paragraph
paragraph.AppendHTML(htmlContent);
// Save as plain text
doc.SaveToFile("HtmlStringtoText.txt", FileFormat.Txt);
}
}
}
How It Works:
- HTML String Definition: We start with a sample HTML string containing headings, paragraphs, formatting tags (<strong>, <em>), links, lists, and special characters.
- Document Setup: A Document object is created to manage the content, with a Section and Paragraph to structure the HTML rendering.
- HTML Rendering: AppendHTML() parses the HTML string and converts it into the document's internal structure, preserving content hierarchy.
- Text Conversion: SaveToFile() with FileFormat.Txt converts the rendered content to plain text, stripping HTML tags while retaining readable structure.

Extended reading: Parse or Read HTML in C#
Convert HTML File to Text in C#
This example directly loads an HTML file and converts it to text. Ideal for batch processing or working with pre-existing HTML documents (e.g., downloaded web pages, local templates).
using Spire.Doc;
using Spire.Doc.Documents;
namespace HtmlToText
{
class Program
{
static void Main()
{
// Create a Document object
Document doc = new Document();
// Load an HTML file
doc.LoadFromFile("sample.html", FileFormat.Html, XHTMLValidationType.None);
// Convert HTML to plain text
doc.SaveToFile("HTMLtoText.txt", FileFormat.Txt);
doc.Dispose();
}
}
}
How It Works:
- Document Initialization: A Document object is created to handle the file operations.
- HTML File Loading: LoadFromFile() imports the HTML file, with FileFormat.Html specifying the input type. XHTMLValidationType.None ensures compatibility with non-strict HTML.
- Text Conversion: SaveToFile() with FileFormat.Txt converts the loaded HTML content to plain text.
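When the text is needed in memory rather than on disk (for example, to index it or pass it to another method), the save step can be skipped. The following is a minimal sketch, assuming that the `Document.GetText()` method of Spire.Doc returns the loaded content as a plain-text string; the file name `sample.html` is reused from the example above, and the namespace name is only illustrative.

```csharp
using System;
using Spire.Doc;
using Spire.Doc.Documents;

namespace HtmlToTextInMemory
{
    class Program
    {
        static void Main()
        {
            // Dispose the document automatically when the block ends
            using (Document doc = new Document())
            {
                // Load the HTML file without strict XHTML validation
                doc.LoadFromFile("sample.html", FileFormat.Html, XHTMLValidationType.None);

                // Assumption: GetText() returns the document content as plain text
                string text = doc.GetText();

                // Trim leading and trailing whitespace before further processing
                Console.WriteLine(text.Trim());
            }
        }
    }
}
```

Keeping the result as a string avoids a round trip through the file system when the text is only an intermediate value.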

To preserve the original formatting and style, you can refer to the C# tutorial to convert the HTML file to Word.
FAQs
Q1: Can Spire.Doc process malformed HTML?
A: Yes. Spire.Doc includes built-in tolerance for malformed HTML, but you may need to disable strict validation to ensure proper parsing.
When loading HTML files, use XHTMLValidationType.None (as shown in the guide) to skip strict XHTML checks:
doc.LoadFromFile("malformed.html", FileFormat.Html, XHTMLValidationType.None);
This setting tells Spire.Doc to parse the HTML like a web browser (which automatically corrects minor issues like unclosed <p> or <li> tags) instead of rejecting non-compliant content.
Q2: Can I extract specific elements from HTML (like only paragraphs or headings)?
A: Yes, after loading the HTML into a Document object, you can access specific elements through the object model (like paragraphs, tables, etc.) and extract text from only those specific elements rather than the entire document.
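As a rough illustration of this answer, the sketch below walks the body paragraphs of a loaded HTML file and prints only those that look like headings. It assumes that HTML heading tags are mapped to paragraph styles whose names start with "Heading" and that `Paragraph.StyleName` exposes that name; verify both against your Spire.Doc version. The file name `sample.html` is a placeholder.

```csharp
using System;
using Spire.Doc;
using Spire.Doc.Documents;

namespace HtmlHeadingExtractor
{
    class Program
    {
        static void Main()
        {
            using (Document doc = new Document())
            {
                // Load the HTML file without strict XHTML validation
                doc.LoadFromFile("sample.html", FileFormat.Html, XHTMLValidationType.None);

                foreach (Section section in doc.Sections)
                {
                    foreach (Paragraph paragraph in section.Paragraphs)
                    {
                        // Assumption: <h1>-<h6> become styles named "Heading..."
                        string style = paragraph.StyleName ?? string.Empty;
                        if (style.StartsWith("Heading", StringComparison.OrdinalIgnoreCase))
                        {
                            Console.WriteLine(paragraph.Text);
                        }
                    }
                }
            }
        }
    }
}
```

Note that `section.Paragraphs` covers body-level paragraphs only; text inside tables would need a separate pass over the table cells.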
Q3: Can I convert HTML to other formats besides plain text using Spire.Doc?
A: Yes, Spire.Doc supports conversion to multiple formats, including Word DOC/DOCX, PDF, image, RTF, and more, making it a versatile document processing solution.
Q4: Does Spire.Doc work with .NET Core/.NET 5+?
A: Spire.Doc fully supports .NET Core, .NET 5/6/7/8, and .NET Framework 4.0+. There’s no difference in functionality across these frameworks, which means you can use the same code (e.g., Document, AppendHTML(), SaveToFile()) regardless of which .NET runtime you’re targeting.
Conclusion
Converting HTML to text in C# is straightforward with the Spire.Doc library. Whether you’re working with HTML strings or files, Spire.Doc simplifies the process by handling HTML parsing, structure preservation, and text conversion. By following the examples in this guide, you can seamlessly integrate HTML-to-text conversion into your C# applications.
You can request a free 30-day trial license here to unlock full functionality and remove limitations of the Spire.Doc library.
Convert Markdown to HTML in C# .NET (Strings, Files & Batch)
2025-09-11 06:35:13 Written by zaki zou
Markdown (md) is a widely adopted lightweight markup language known for its simplicity and readability. Developers, technical writers, and content creators often use it for documentation, README files, blogs, and technical notes. While Markdown is easy to write and read in its raw form, displaying it on websites or integrating it into web applications requires HTML. Converting Markdown to HTML is therefore a fundamental task for developers working with content management systems, documentation pipelines, or web-based applications.
In this tutorial, you will learn how to convert Markdown to HTML in C#. The guide covers converting both Markdown strings and files to HTML, as well as batch processing multiple Markdown documents efficiently. By the end, you’ll have practical, ready-to-use examples that you can apply directly to real-world projects.
Table of Contents
- Understanding Markdown and HTML: Key Differences and Use Cases
- C# Library for Markdown to HTML Conversion
- Convert a Markdown String to HTML in C# (Step-by-Step)
- Convert a Single Markdown File to HTML in C# (Step-by-Step)
- Batch Convert Multiple Markdown Files to HTML in C#
- Additional Tips for Efficient Markdown to HTML Conversion in C#
- Conclusion
- FAQs
Understanding Markdown and HTML: Key Differences and Use Cases
What is Markdown?
Markdown is a lightweight markup language that allows developers and writers to create structured documents using plain text. It uses straightforward syntax for headings, lists, links, images, code blocks, and more. Its readability in raw form makes it ideal for writing documentation, README files, technical blogs, and collaborative notes.
Example Markdown:
# Project Title
This is a **bold** statement.
- Feature 1
- Feature 2
What is HTML?
HTML (HyperText Markup Language) is the foundational language of the web. Unlike Markdown, HTML provides precise control over document structure, formatting, multimedia embedding, and web interactivity. While Markdown focuses on simplicity, HTML is indispensable for web pages and application content.
Example HTML Output:
<h1>Project Title</h1>
<p>This is a <strong>bold</strong> statement.</p>
<ul>
<li>Feature 1</li>
<li>Feature 2</li>
</ul>
Key Differences and Use Cases
| Feature | Markdown | HTML |
|---|---|---|
| Complexity | Simple, minimal syntax | More detailed, verbose |
| Readability | Readable in raw form | Harder to read directly |
| Use Cases | Documentation, readmes, blogs | Websites, web apps, emails |
Use Case Tip: Use Markdown for author-friendly writing, then convert it to HTML for web display, automated documentation pipelines, or content management systems.
C# Library for Markdown to HTML Conversion
For C# developers, one of the most practical libraries for Markdown-to-HTML conversion is Spire.Doc for .NET. This library offers robust document processing capabilities, supporting not only loading Markdown files and converting content to HTML, but also extending to other formats, such as Markdown to Word and PDF. With this flexibility, developers can easily choose the output format that best fits their project needs.
Key Features
- Load Markdown files and convert to HTML
- Preserve headings, lists, links, images, and other Markdown formatting in HTML output
- Batch process multiple Markdown documents efficiently
- Integrate seamlessly with .NET applications without requiring Microsoft Office
- Compatible with .NET Framework and .NET Core
Installation
You can easily add the required library to your C# project in two ways:
- Using NuGet (Recommended)
Run the following command in your Package Manager Console:
Install-Package Spire.Doc
This method ensures that the library and its dependencies are automatically downloaded and integrated into your project.
- Manual Installation
Alternatively, you can download the library DLL and manually add it as a reference in your project. This approach is useful if you need offline installation or prefer direct control over the library files.
Tip: Using NuGet is generally recommended for faster setup and easier version management.
Convert a Markdown String to HTML in C# (Step-by-Step)
In many applications, Markdown content may be generated dynamically or stored in a database as a string. This section demonstrates how you can convert a Markdown string into a fully formatted HTML file using C#.
Steps to Convert a Markdown String to HTML
- Prepare the Markdown string that you want to convert.
- Save the Markdown string to a .md file with WriteAllText.
- Load the Markdown file into a Document object using LoadFromFile with FileFormat.Markdown.
- Save the document as an HTML file using SaveToFile with FileFormat.Html.
Example Code
using Spire.Doc;
using System;
using System.IO;
namespace MarkdownToHtml
{
internal class Program
{
static void Main(string[] args)
{
// Define the markdown string
string markdown = @"
# Welcome to C# Markdown Tutorial
This tutorial demonstrates **Markdown syntax** in a more detailed way.
Here is a [link](https://example.com).
## Features
- Headings, bold, and italic text
- Links and images
- Ordered and unordered lists
- Code blocks and inline code
- Blockquotes
- Tables
";
// Define the file paths
string markdownFilePath = "example.md"; // Path to save the Markdown file
string outputHtmlPath = "output.html"; // Path to save the converted HTML file
// Create a Markdown file from the markdown string
File.WriteAllText(markdownFilePath, markdown);
// Load the Markdown file
Document document = new Document();
document.LoadFromFile(markdownFilePath, FileFormat.Markdown);
// Save as HTML
document.SaveToFile(outputHtmlPath, FileFormat.Html);
// Close the document
document.Close();
Console.WriteLine($"Markdown string converted to HTML at: {outputHtmlPath}");
}
}
}
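The example above writes `example.md` into the working directory and leaves it there. In a web application, where several requests may convert strings at the same time, a uniquely named temporary file that is deleted afterwards may be safer. The helper below is a sketch of that idea using only the .NET base class library; `TempMarkdown` and `Use` are hypothetical names, and the conversion itself is supplied by the caller as a delegate.

```csharp
using System;
using System.IO;
using System.Text;

public static class TempMarkdown
{
    // Writes the Markdown to a uniquely named temp file, hands the path to
    // the caller's conversion step, and deletes the file afterwards.
    public static void Use(string markdown, Action<string> convert)
    {
        string path = Path.Combine(
            Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".md");
        try
        {
            // UTF-8 without a byte-order mark
            File.WriteAllText(path, markdown, new UTF8Encoding(false));
            convert(path);
        }
        finally
        {
            // Clean up even if the conversion step throws
            if (File.Exists(path))
            {
                File.Delete(path);
            }
        }
    }
}
```

A caller would pass a lambda that runs the LoadFromFile and SaveToFile steps shown above on the supplied path.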

Convert a Single Markdown File to HTML in C# (Step-by-Step)
If you have a Markdown file ready, converting it to HTML for web pages or email templates is straightforward. With Spire.Doc, you can load your Markdown file and export it as a fully formatted HTML document, preserving all styling, including headings, lists, links, images, and other formatting elements.
Steps to Convert a Markdown File to HTML
- Prepare the Markdown file you want to convert.
- Load the file into a Document object using LoadFromFile with the FileFormat.Markdown parameter.
- Save the loaded document as HTML using SaveToFile with FileFormat.Html.
Example Code
using Spire.Doc;
using System;
namespace MarkdownToHtml
{
internal class Program
{
static void Main(string[] args)
{
// Path to the Markdown file
string markdownFile = @"C:\Docs\example.md";
// Path to save the converted HTML file
string htmlFile = @"C:\Docs\example.html";
// Load the Markdown file
Document document = new Document();
document.LoadFromFile(markdownFile, FileFormat.Markdown);
// Save as HTML file
document.SaveToFile(htmlFile, FileFormat.Html);
// Close the document
document.Close();
Console.WriteLine($"Converted '{markdownFile}' to HTML successfully!");
}
}
}

Batch Convert Multiple Markdown Files to HTML in C#
If you have a collection of Markdown files that need to be converted at once, you can use the following C# example to batch process and convert them into HTML.
Example Code
using Spire.Doc;
using System;
using System.IO;
namespace MarkdownToHtml
{
internal class Program
{
static void Main(string[] args)
{
// Define the input folder containing Markdown files
string inputFolder = @"C:\Docs\MarkdownFiles";
// Define the output folder where converted HTML files will be saved
string outputFolder = @"C:\Docs\HtmlFiles";
// Create the output folder if it does not already exist
Directory.CreateDirectory(outputFolder);
// Loop through all Markdown (.md) files in the input folder
foreach (string file in Directory.GetFiles(inputFolder, "*.md"))
{
// Load the Markdown file into a Document object
Document doc = new Document();
doc.LoadFromFile(file, FileFormat.Markdown);
// Get the file name without extension
string fileName = Path.GetFileNameWithoutExtension(file);
// Build the output path with .html extension
string outputPath = Path.Combine(outputFolder, fileName + ".html");
// Save the document as an HTML file
doc.SaveToFile(outputPath, FileFormat.Html);
// Print a confirmation message for each converted file
Console.WriteLine($"Converted {file} to HTML.");
}
// Print a final message when batch conversion is complete
Console.WriteLine("Batch conversion complete.");
}
}
}
Additional Tips for Efficient Markdown to HTML Conversion in C#
Converting Markdown to HTML is straightforward, but applying a few practical strategies can help handle advanced scenarios, improve performance, and ensure your HTML output is clean and consistent. Here are some key tips to enhance your conversion workflow:
- Implement Error Handling
When processing multiple files, wrap your conversion logic in try-catch blocks to handle invalid Markdown, missing files, or access permission issues. This ensures your batch conversion won’t fail entirely due to a single problematic file.
try
{
    Document doc = new Document();
    doc.LoadFromFile(filePath, FileFormat.Markdown);
    doc.SaveToFile(outputPath, FileFormat.Html);
}
catch (Exception ex)
{
    Console.WriteLine($"Failed to convert {filePath}: {ex.Message}");
}
- Optimize Batch Conversion Performance
For large numbers of Markdown files, consider using asynchronous or parallel processing. This can shorten total conversion time; limit the degree of parallelism if memory usage becomes a concern, since each file is loaded into its own Document object.
Parallel.ForEach(Directory.GetFiles(inputFolder, "*.md"), file =>
{
    // Conversion logic
});
- Post-Process HTML Output
After conversion, you can enhance the HTML by injecting CSS styles, adding custom attributes, or minifying the output. This is especially useful when integrating HTML into web pages or applications.
string htmlContent = File.ReadAllText(outputPath);
htmlContent = "<link rel='stylesheet' href='https://cdn.e-iceblue.com/style.css'>" + htmlContent;
File.WriteAllText(outputPath, htmlContent);
- Maintain UTF-8 Encoding
Always save Markdown and HTML files with UTF-8 encoding to preserve special characters, symbols, and multilingual content, ensuring consistent rendering across browsers and devices.
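The error-handling and parallel-processing tips can be combined. The sketch below is one possible arrangement, not a definitive implementation: `BatchConverter`, `ConvertAll`, `convertOne`, and `maxParallel` are hypothetical names, the per-file conversion is passed in as a delegate (where the LoadFromFile and SaveToFile calls would go), and it assumes that separate Document instances can be used safely on different threads, which this guide does not confirm.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class BatchConverter
{
    // Converts every *.md file in inputFolder by calling
    // convertOne(inputPath, outputPath). Returns the input paths
    // that failed, mapped to their error messages.
    public static IDictionary<string, string> ConvertAll(
        string inputFolder,
        string outputFolder,
        Action<string, string> convertOne,
        int maxParallel = 2)
    {
        Directory.CreateDirectory(outputFolder);
        var failures = new ConcurrentDictionary<string, string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(Directory.GetFiles(inputFolder, "*.md"), options, file =>
        {
            string outputPath = Path.Combine(
                outputFolder, Path.GetFileNameWithoutExtension(file) + ".html");
            try
            {
                convertOne(file, outputPath);
            }
            catch (Exception ex)
            {
                // Record the failure and continue with the remaining files
                failures[file] = ex.Message;
            }
        });

        return failures;
    }
}
```

Capping `MaxDegreeOfParallelism` bounds how many documents are held in memory at once, and returning the failures lets the caller decide whether to retry or report them.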
Conclusion
In this tutorial, you learned how to convert Markdown to HTML in C#, covering single Markdown strings, individual files, and batch processing multiple documents.
These examples provide a solid foundation for integrating Markdown to HTML conversion into various .NET applications, including documentation systems, blogs, and other content-driven projects. By applying these methods, you can efficiently manage Markdown content and produce consistent, well-structured HTML output.
FAQs
Q1: Can I convert Markdown with images and links using Spire.Doc in C#?
A1: Yes. The library allows you to convert Markdown files that include images, hyperlinks, headings, lists, and code blocks into fully formatted HTML. This ensures the output closely matches your source content.
Q2: Do I need Microsoft Office installed to convert Markdown to HTML in C#?
A2: No. Spire.Doc is a standalone library for .NET, so you can convert Markdown to HTML in C# without Microsoft Office, making it easy to integrate into both desktop and web applications.
Q3: How can I batch convert multiple Markdown files to HTML in C# efficiently?
A3: You can loop through all Markdown files in a folder and convert them using Spire.Doc’s Document.LoadFromFile and SaveToFile methods. This approach allows batch conversion of Markdown documents to HTML in .NET quickly and reliably.
Q4: Can I convert Markdown to HTML dynamically in an ASP.NET application using C#?
A4: Yes. You can dynamically convert Markdown content stored as strings or files to HTML in ASP.NET using Spire.Doc, which is useful for web apps, blogs, or CMS platforms.
Q5: Is Spire.Doc compatible with .NET Core and .NET 6 for Markdown to HTML conversion?
A5: Yes. It supports .NET Framework, .NET Core, .NET 5, and .NET 6+, making it ideal for modern C# projects that require Markdown to HTML conversion.
Q6: Can I customize the HTML output after converting Markdown in C#?
A6: Yes. After conversion, you can add CSS, modify HTML tags, or inject styles programmatically in C# to match your website or application’s design requirements.
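As one way to do this, a stylesheet link can be placed inside the document's head element rather than in front of the whole markup. The helper below is a small sketch using plain string operations; `HtmlStyler` and `AddStylesheet` are hypothetical names, and it assumes the generated HTML contains a closing `</head>` tag (it falls back to prepending the link if not).

```csharp
using System;
using System.Net;

public static class HtmlStyler
{
    // Inserts a stylesheet link just before </head>; if the markup has no
    // </head> tag, the link is placed at the start of the markup instead.
    public static string AddStylesheet(string html, string href)
    {
        // Encode the attribute value so quotes or ampersands cannot break the tag
        string link = "<link rel=\"stylesheet\" href=\""
            + WebUtility.HtmlEncode(href) + "\">";

        int index = html.IndexOf("</head>", StringComparison.OrdinalIgnoreCase);
        return index >= 0 ? html.Insert(index, link) : link + html;
    }
}
```

The result can then be written back with File.WriteAllText, as in the post-processing tip earlier in this guide.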
Q7: Can Spire.Doc convert other document formats besides Markdown?
A7: Yes. It can convert a wide range of formats, such as Word to PDF or Word to HTML, giving you flexibility to manage different document types in C# projects.
Q8: How do I preserve special characters and encoding when converting Markdown to HTML in C#?
A8: Always save your Markdown files with UTF-8 encoding to ensure special characters, symbols, and multilingual content are preserved during Markdown to HTML conversion.
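To make the encoding explicit rather than relying on defaults, the read and write calls can be given a UTF-8 encoding object. The sketch below uses only the .NET base class library; `Utf8Files` is a hypothetical helper name. It writes UTF-8 without a byte-order mark (the same as the default behavior of File.WriteAllText without an encoding argument) and throws on invalid byte sequences instead of silently replacing them.

```csharp
using System.IO;
using System.Text;

public static class Utf8Files
{
    // UTF-8 without a byte-order mark; invalid bytes raise an exception
    private static readonly Encoding StrictUtf8 = new UTF8Encoding(false, true);

    // Writes the content as UTF-8
    public static void Write(string path, string content)
    {
        File.WriteAllText(path, content, StrictUtf8);
    }

    // Reads the file back, decoding it as UTF-8
    public static string Read(string path)
    {
        return File.ReadAllText(path, StrictUtf8);
    }
}
```

Failing loudly on invalid bytes makes it easier to spot a file that was saved in a different encoding before it reaches the converter.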

Converting HTML to RTF in C# is a key task for developers working with web content that needs to be transformed into editable, universally compatible documents. HTML excels at web display with dynamic styles and structure, while RTF is ideal for shareable, editable files in tools like Word or WordPad.
For .NET developers, using libraries like Spire.Doc can streamline the process. In this tutorial, we'll explore how to use C# to convert HTML to RTF, covering everything from basic implementations to advanced scenarios such as handling HTML images and batch conversion.
- Why Use Spire.Doc for HTML to RTF Conversion?
- Getting Started
- Convert HTML to RTF (C# Code Examples)
- Advanced Conversion Scenarios
- Final Thoughts
- Common Questions
Why Use Spire.Doc for HTML to RTF Conversion?
Spire.Doc for .NET is a lightweight, feature-rich library for creating, editing, and converting Word and RTF documents in .NET applications (supports .NET Framework, .NET Core, and .NET 5+). For HTML to rich text conversion, it offers key benefits:
- Preserves HTML formatting (fonts, colors, links, lists, tables).
- Supports loading HTML from strings or local files.
- No dependency on Microsoft Word or other third-party software.
- Intuitive API with minimal code required.
Getting Started
1. Create a C# Project
If you’re starting from scratch, create a new Console App (.NET Framework/.NET Core) project in Visual Studio. This example uses a console app for simplicity, but the code works in WinForms, WPF, or ASP.NET projects too.
2. Install Spire.Doc via NuGet
The fastest way to add Spire.Doc to your C# project is through NuGet Package Manager:
- Open your C# project in Visual Studio.
- Right-click the project in the Solution Explorer → Select Manage NuGet Packages.
- Search for Spire.Doc and click Install to add the latest version to your project.
Alternatively, use the NuGet Package Manager Console with this command:
Install-Package Spire.Doc
Convert HTML to RTF (C# Code Examples)
Spire.Doc’s Document class handles HTML loading and RTF saving. Below are two common scenarios:
Scenario 1: Convert HTML String to RTF in C#
Use this when HTML content is dynamic (e.g., from user input, APIs, or databases).
using Spire.Doc;
using Spire.Doc.Documents;
namespace HtmlToRtfConverter
{
class Program
{
static void Main(string[] args)
{
// Create a Document object
Document doc = new Document();
// Define your HTML content
string htmlString = @"
<html>
<body>
<h1 style='color: #00BFFF; font-family: Arial'>HTML to RTF Conversion</h1>
<p>This is a <b>bold paragraph</b> with a <a href='https://www.e-iceblue.com'>link</a>.</p>
<ul>
<li>Item 1 </li>
<li>Item 2</li>
</ul>
<table border='1' cellpadding='5'>
<tr><td>Name</td><td>Gender</td><td>Age</td></tr>
<tr><td>John</td><td>Male</td><td>30</td></tr>
<tr><td>Kate</td><td>Female</td><td>26</td></tr>
</table>
</body>
</html>";
// Add a paragraph in Word
Paragraph para = doc.AddSection().AddParagraph();
// Append the HTML string to the paragraph
para.AppendHTML(htmlString);
// Save the document as RTF
doc.SaveToFile("HtmlStringToRtf.rtf", FileFormat.Rtf);
doc.Dispose();
}
}
}
In this code:
- Document Object: Represents an empty document.
- HTML String: You can customize this to include any valid HTML (styles, media, or dynamic content from databases/APIs).
- AppendHTML(): Parses HTML tags (e.g., <h1>, <table>, <a>) and inserts them into a paragraph.
- SaveToFile(): Writes the converted content to an RTF file.

The SaveToFile method accepts different FileFormat parameters. You can change it to implement HTML to Word conversion in C#.
Scenario 2: Convert HTML File to RTF File
For static HTML files (e.g., templates or saved web pages), use LoadFromFile with parameter FileFormat.Html:
using Spire.Doc;
namespace ConvertHtmlToRTF
{
class Program
{
static void Main()
{
// Create a Document object
Document doc = new Document();
// Load an HTML file
doc.LoadFromFile("Test.html", FileFormat.Html);
// Save the HTML file as rtf format
doc.SaveToFile("HTMLtoRTF.rtf", FileFormat.Rtf);
doc.Dispose();
}
}
}
This code simplifies HTML-to-RTF conversion into three core steps:
- Creates a Document object.
- Loads an existing HTML file using LoadFromFile() with the FileFormat.Html parameter.
- Saves the loaded HTML as an RTF format using SaveToFile() with the FileFormat.Rtf parameter.

Spire.Doc supports bidirectional conversion, so you can convert the RTF file back to HTML in C# when needed.
Advanced Conversion Scenarios
1. Handling Images in HTML
Spire.Doc preserves images embedded in HTML (via <img> tags). For local images, ensure the src path is correct. For remote images (URLs), Spire.Doc automatically downloads and embeds them.
// HTML with local and remote images
string htmlWithImages = @"<html>
<body>
<h3>HTML with Images</h3>
<p>Local image: <img src='C:\Users\Administrator\Desktop\HelloWorld.png' alt='Sample Image' width='200'></p>
<p>Remote image: <img src='https://www.e-iceblue.com/images/art_images/csharp-html-to-rtf.png' alt='Online Image'></p>
</body>
</html>";
// Append the HTML string to a paragraph
Paragraph para = doc.AddSection().AddParagraph();
para.AppendHTML(htmlWithImages);
// Save the document as RTF
doc.SaveToFile("HtmlWithImage.rtf", FileFormat.Rtf);
2. Batch Conversion of Multiple HTML Files
Process an entire directory of HTML files with a loop:
string inputDir = @"C:\Input\HtmlFiles";
string outputDir = @"C:\Output\RtfFiles";
// Create output directory if it doesn't exist
Directory.CreateDirectory(outputDir);
// Get all .html files in input directory
foreach (string htmlFile in Directory.EnumerateFiles(inputDir, "*.html"))
{
using (Document doc = new Document())
{
doc.LoadFromFile(htmlFile, FileFormat.Html, XHTMLValidationType.None);
// Use the same filename but with .rtf extension
string fileName = Path.GetFileNameWithoutExtension(htmlFile) + ".rtf";
string outputPath = Path.Combine(outputDir, fileName);
doc.SaveToFile(outputPath, FileFormat.Rtf);
}
}
Final Thoughts
Converting HTML to RTF in C# is straightforward with Spire.Doc for .NET. This library eliminates the need for manual parsing and ensures consistent formatting across outputs. Whether you’re working with HTML strings or files, this article provides practical code examples to handle both scenarios.
For further exploration, refer to the Spire.Doc documentation.
Common Questions
Q1: Is Spire.Doc free to use?
A: Partly. Spire.Doc offers a free community edition without any watermarks, but with certain page and functionality limits. For large-scale projects, you can request a free 30-day trial license to fully evaluate the library.
Q2: Does Spire.Doc preserve HTML hyperlinks, images, and tables in the RTF output?
A: Yes. Spire.Doc retains most HTML elements:
- Hyperlinks: <a> tags are converted to clickable links in RTF.
- Images: Local (<img src="/path">) and remote (<img src="/URL">) images are embedded in the RTF.
- Tables: HTML tables (with border, cellpadding, etc.) are converted to RTF tables with preserved structure.
Q3: Can I style the RTF output further after loading the HTML?
A: Absolutely. After loading the HTML content into the Document object, you can use the full Spire.Doc API to programmatically modify the document before saving it as RTF.
Q4: Can I convert HTML to other formats with Spire.Doc?
A: Yes. Apart from converting to RTF, the library also supports converting HTML to Word, HTML to XML, and HTML to images, etc.
Markdown, with its lightweight syntax, offers a streamlined approach to web content creation, collaboration, and document sharing, particularly in environments where tools like Git or Markdown-friendly editors are prevalent. By converting Word documents to Markdown files, users can enhance their productivity, facilitate easier version control, and ensure compatibility across different systems and platforms. In this article, we will explore the process of converting Word documents to Markdown files using Spire.Doc for .NET, providing simple C# code examples.
Install Spire.Doc for .NET
To begin with, you need to add the DLL files included in the Spire.Doc for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.
PM> Install-Package Spire.Doc
Convert Word to Markdown with C#
Using Spire.Doc for .NET, we can convert a Word document to a Markdown file by loading the document with the Document.LoadFromFile() method and then saving it with the Document.SaveToFile(filename: String, FileFormat.Markdown) method. The detailed steps are as follows:
- Create an instance of Document class.
- Load a Word document using Document.LoadFromFile() method.
- Convert the document to a Markdown file using Document.SaveToFile(filename: String, FileFormat.Markdown) method.
- Release resources.
using Spire.Doc;
namespace WordToMarkdown
{
class Program
{
static void Main(string[] args)
{
// Create an instance of Document class
Document doc = new Document();
// Load a Word document
doc.LoadFromFile("Sample.docx");
// Convert the document to a Markdown file
doc.SaveToFile("output/WordToMarkdown.md", FileFormat.Markdown);
doc.Dispose();
}
}
}

Convert Word to Markdown Without Images
When using Spire.Doc for .NET to convert Word documents to Markdown files, images are stored in Base64 encoding by default, which can increase the file size and affect compatibility. To address this, we can remove the images during conversion, thereby reducing the file size and enhancing compatibility.
The following steps outline how to convert Word documents to Markdown files without images:
- Create an instance of Document class.
- Load a Word document using Document.LoadFromFile() method.
- Iterate through the sections and then the paragraphs in the document.
- Iterate through the document objects in the paragraphs:
- Get a document object through Paragraph.ChildObjects[] property.
- Check if it’s an instance of DocPicture class. If it is, remove it using Paragraph.ChildObjects.Remove(DocumentObject) method.
- Convert the document to a Markdown file using Document.SaveToFile(filename: String, FileFormat.Markdown) method.
- Release resources.
using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;
namespace WordToMarkdownNoImage
{
class Program
{
static void Main(string[] args)
{
// Create an instance of Document class
Document doc = new Document();
// Load a Word document
doc.LoadFromFile("Sample.docx");
// Iterate through the sections in the document
foreach (Section section in doc.Sections)
{
// Iterate through the paragraphs in the sections
foreach (Paragraph paragraph in section.Paragraphs)
{
// Iterate backwards through the document objects so that a removal does not skip the next item
for (int i = paragraph.ChildObjects.Count - 1; i >= 0; i--)
{
// Get a document object
DocumentObject docObj = paragraph.ChildObjects[i];
// Check if it is an instance of DocPicture class
if (docObj is DocPicture)
{
// Remove the DocPicture instance
paragraph.ChildObjects.Remove(docObj);
}
}
}
}
// Convert the document to a Markdown file
doc.SaveToFile("output/WordToMarkdownNoImage.md", FileFormat.Markdown);
doc.Dispose();
}
}
}
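When removing items from an indexed collection inside a loop, each removal shifts the later items down by one position, so a forward loop that keeps incrementing its index can skip the item right after a removed one (for example, the second of two adjacent pictures). Iterating from the last index to the first avoids this. The sketch below illustrates the pattern on a plain `IList<T>`; `ListCleanup` and `RemoveMatching` are hypothetical names and not part of Spire.Doc.

```csharp
using System;
using System.Collections.Generic;

public static class ListCleanup
{
    // Removes every item matching the predicate by walking the list from the
    // last index to the first, so a removal never shifts an unvisited item.
    public static int RemoveMatching<T>(IList<T> items, Predicate<T> match)
    {
        int removed = 0;
        for (int i = items.Count - 1; i >= 0; i--)
        {
            if (match(items[i]))
            {
                items.RemoveAt(i);
                removed++;
            }
        }
        return removed;
    }
}
```

The same backward-index loop applies to a paragraph's child objects when pictures are removed before saving to Markdown.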

Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
Markdown, as a lightweight markup language, is favored by programmers and technical document writers for its simplicity, readability, and clear syntax. However, in specific scenarios, there is often a need to convert Markdown documents into Word documents with rich formatting capabilities and control over the layout or to generate PDF files suitable for printing and easy viewing. This article is going to demonstrate how to convert Markdown content into Word documents or PDF files with Spire.Doc for .NET to meet various document processing requirements in different scenarios.
Install Spire.Doc for .NET
To begin with, you need to add the DLL files included in the Spire.Doc for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.
PM> Install-Package Spire.Doc
Convert Markdown Files to Word Documents with C#
With Spire.Doc for .NET, we can load a Markdown file using the Document.LoadFromFile(string fileName, FileFormat.Markdown) method and then convert it to other formats using the Document.SaveToFile(string fileName, FileFormat fileFormat) method.
Since images in Markdown files are stored as links, directly converting a Markdown file to a Word document is suitable for Markdown files that do not contain images. If the file contains images, further processing of the images is required after conversion.
Here are the steps to convert a Markdown file to a Word document:
- Create an instance of Document class.
- Load a Markdown file using Document.LoadFromFile(string fileName, FileFormat.Markdown) method.
- Convert the file to a Word document and save it using Document.SaveToFile(string fileName, FileFormat.Docx) method.
using Spire.Doc;
namespace MdToDocx
{
class Program
{
static void Main(string[] args)
{
// Create an object of Document class
Document doc = new Document();
// Load a Markdown file
doc.LoadFromFile("Sample.md", FileFormat.Markdown);
// Convert the Markdown file to a Word document
doc.SaveToFile("MarkdownToWord.docx", FileFormat.Docx);
doc.Close();
}
}
}
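Because images in Markdown are stored as links, it can be useful to check whether a file references any images before converting it. The following is a minimal sketch that scans Markdown text for inline image syntax with a regular expression. It does not use Spire.Doc; the class and method names are hypothetical, and the pattern only covers the inline form ![alt](target), not reference-style images.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MarkdownImageScanner
{
    // Matches inline images of the form ![alt](target "optional title")
    private static readonly Regex ImagePattern =
        new Regex(@"!\[[^\]]*\]\(([^)\s]+)[^)]*\)", RegexOptions.Compiled);

    // Returns the link target of every inline image found in the text
    public static List<string> FindImageTargets(string markdown)
    {
        var targets = new List<string>();
        foreach (Match match in ImagePattern.Matches(markdown))
        {
            targets.Add(match.Groups[1].Value);
        }
        return targets;
    }

    public static void Main()
    {
        string markdown = "# Title\n![logo](images/logo.png)\nPlain [link](page.html).";
        foreach (string target in FindImageTargets(markdown))
        {
            Console.WriteLine(target); // lists image targets only, not ordinary links
        }
    }
}
```

If the returned list is empty, the file can be converted directly; otherwise the images need the further processing mentioned above.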

Convert Markdown Files to PDF Files with C#
We can also convert Markdown files directly to PDF files by passing the FileFormat.PDF enum value as the parameter. Here are the steps to convert a Markdown file to a PDF file:
- Create an instance of Document class.
- Load a Markdown file using Document.LoadFromFile(string fileName, FileFormat.Markdown) method.
- Convert the file to a PDF file and save it using Document.SaveToFile(string fileName, FileFormat.PDF) method.
using Spire.Doc;
namespace MdToPdf
{
class Program
{
static void Main(string[] args)
{
// Create an object of Document class
Document doc = new Document();
// Load a Markdown file
doc.LoadFromFile("Sample.md", FileFormat.Markdown);
// Convert the Markdown file to a PDF file
doc.SaveToFile("MarkdownToPDF.pdf", FileFormat.PDF);
doc.Close();
}
}
}
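The examples above hard-code the output file name. As a small sketch, the output path could instead be derived from the input path with Path.ChangeExtension from System.IO; the OutputNames class and ForPdf method are hypothetical helpers, not part of Spire.Doc.

```csharp
using System;
using System.IO;

public static class OutputNames
{
    // Swaps the extension of the Markdown path for ".pdf"
    public static string ForPdf(string markdownPath)
    {
        return Path.ChangeExtension(markdownPath, ".pdf");
    }

    public static void Main()
    {
        Console.WriteLine(ForPdf("Sample.md")); // prints "Sample.pdf"
    }
}
```

The resulting string can then be passed as the fileName argument of Document.SaveToFile().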

Word and Excel are two completely different file types. Word documents are used to write essays, letters or create reports, while Excel documents are used to save data in tabular form, make charts, or perform mathematical calculations. It is not recommended to convert a complex Word document to an Excel spreadsheet because Excel can hardly render content according to its original layout in Word.
However, if your Word document is primarily composed of tables and you want to analyze the table data in Excel, you can use Spire.Office for .NET to convert Word to Excel while maintaining good readability.
Install Spire.Office for .NET
To begin with, you need to add the DLL files included in the Spire.Office for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.
PM> Install-Package Spire.Office
Convert Word to Excel in C# and VB.NET
This scenario actually uses two libraries in the Spire.Office package. They’re Spire.Doc for .NET and Spire.XLS for .NET. The former is used to read and extract content from a Word document, and the latter is used to create an Excel document and write data in the specific cells. To make this code example easy to understand, we created the following three custom methods that perform specific functions.
- ExportTableInExcel() - Export data from a Word table to specified Excel cells.
- CopyContentInTable() - Copy content from a table cell in Word to an Excel cell.
- CopyTextAndStyle() - Copy text with formatting from a Word paragraph to an Excel cell.
The following steps demonstrate how to export data from an entire Word document to a worksheet using Spire.Office for .NET.
- Create a Document object to load a Word file.
- Create a Workbook object and add a worksheet named "WordToExcel" to it.
- Traverse through all the sections in the Word document, traverse through all the document objects under a certain section, and then determine if a document object is a paragraph or a table.
- If the document object is a paragraph, write the paragraph in a specified cell in Excel using CopyTextAndStyle() method.
- If the document object is a table, export the table data from Word to Excel cells using ExportTableInExcel() method.
- Auto fit the row height and column width in Excel so that the data within a cell will not exceed the bound of the cell.
- Save the workbook to an Excel file using Workbook.SaveToFile() method.
using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;
using Spire.Xls;
using System;
using System.Drawing;
namespace ConvertWordToExcel
{
class Program
{
static void Main(string[] args)
{
//Create a Document object
Document doc = new Document();
//Load a Word file
doc.LoadFromFile(@"C:\Users\Administrator\Desktop\Invoice.docx");
//Create a Workbook object
Workbook wb = new Workbook();
//Remove the default worksheets
wb.Worksheets.Clear();
//Create a worksheet named "WordToExcel"
Worksheet worksheet = wb.CreateEmptySheet("WordToExcel");
int row = 1;
int column = 1;
//Loop through the sections in the Word document
foreach (Section section in doc.Sections)
{
//Loop through the document object under a certain section
foreach (DocumentObject documentObject in section.Body.ChildObjects)
{
//Determine if the object is a paragraph
if (documentObject is Paragraph)
{
CellRange cell = worksheet.Range[row, column];
Paragraph paragraph = documentObject as Paragraph;
//Copy paragraph from Word to a specific cell
CopyTextAndStyle(cell, paragraph);
row++;
}
//Determine if the object is a table
if (documentObject is Table)
{
Table table = documentObject as Table;
//Export table data from Word to Excel
int currentRow = ExportTableInExcel(worksheet, row, table);
row = currentRow;
}
}
}
//Auto fit row height and column width
worksheet.AllocatedRange.AutoFitRows();
worksheet.AllocatedRange.AutoFitColumns();
//Wrap text in cells
worksheet.AllocatedRange.IsWrapText = true;
//Save the workbook to an Excel file
wb.SaveToFile("WordToExcel.xlsx", ExcelVersion.Version2013);
}
//Export data from Word table to Excel cells
private static int ExportTableInExcel(Worksheet worksheet, int row, Table table)
{
CellRange cell;
int column;
foreach (TableRow tbRow in table.Rows)
{
column = 1;
foreach (TableCell tbCell in tbRow.Cells)
{
cell = worksheet.Range[row, column];
cell.BorderAround(LineStyleType.Thin, Color.Black);
CopyContentInTable(tbCell, cell);
column++;
}
row++;
}
return row;
}
//Copy content from a Word table cell to an Excel cell
private static void CopyContentInTable(TableCell tbCell, CellRange cell)
{
Paragraph newPara = new Paragraph(tbCell.Document);
for (int i = 0; i < tbCell.ChildObjects.Count; i++)
{
DocumentObject documentObject = tbCell.ChildObjects[i];
if (documentObject is Paragraph)
{
Paragraph paragraph = documentObject as Paragraph;
foreach (DocumentObject cObj in paragraph.ChildObjects)
{
newPara.ChildObjects.Add(cObj.Clone());
}
if (i < tbCell.ChildObjects.Count - 1)
{
newPara.AppendText("\n");
}
}
}
CopyTextAndStyle(cell, newPara);
}
//Copy text and style of a paragraph to a cell
private static void CopyTextAndStyle(CellRange cell, Paragraph paragraph)
{
RichText richText = cell.RichText;
richText.Text = paragraph.Text;
int startIndex = 0;
foreach (DocumentObject documentObject in paragraph.ChildObjects)
{
if (documentObject is TextRange)
{
TextRange textRange = documentObject as TextRange;
string fontName = textRange.CharacterFormat.FontName;
bool isBold = textRange.CharacterFormat.Bold;
Color textColor = textRange.CharacterFormat.TextColor;
float fontSize = textRange.CharacterFormat.FontSize;
string textRangeText = textRange.Text;
int strLength = textRangeText.Length;
ExcelFont font = cell.Worksheet.Workbook.CreateFont();
font.Color = textColor;
font.IsBold = isBold;
font.Size = fontSize;
font.FontName = fontName;
int endIndex = startIndex + strLength;
richText.SetFont(startIndex, endIndex, font);
startIndex += strLength;
}
if (documentObject is DocPicture)
{
DocPicture picture = documentObject as DocPicture;
cell.Worksheet.Pictures.Add(cell.Row, cell.Column, picture.Image);
cell.Worksheet.SetRowHeightInPixels(cell.Row, 1, picture.Image.Height);
}
}
switch (paragraph.Format.HorizontalAlignment)
{
case HorizontalAlignment.Left:
cell.Style.HorizontalAlignment = HorizontalAlignType.Left;
break;
case HorizontalAlignment.Center:
cell.Style.HorizontalAlignment = HorizontalAlignType.Center;
break;
case HorizontalAlignment.Right:
cell.Style.HorizontalAlignment = HorizontalAlignType.Right;
break;
}
}
}
}
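The CopyTextAndStyle() method above tracks a running character offset so that each text range's font is applied to the right slice of the cell text. The sketch below isolates that bookkeeping in plain C#, without Spire types: each run starts where the previous one ended, and its end is the start plus the run length. The RunOffsets class and the sample strings are assumptions for illustration only; how a given library interprets the end index (inclusive or exclusive) should be checked against its documentation.

```csharp
using System;
using System.Collections.Generic;

public static class RunOffsets
{
    // Returns one (Start, End) pair per run, where End = Start + run length
    public static List<(int Start, int End)> Compute(IEnumerable<string> runs)
    {
        var spans = new List<(int Start, int End)>();
        int start = 0;
        foreach (string run in runs)
        {
            int end = start + run.Length;
            spans.Add((start, end));
            start = end; // the next run begins where this one ended
        }
        return spans;
    }

    public static void Main()
    {
        // Three hypothetical text ranges that together form "Invoice No. 1024"
        foreach (var span in Compute(new[] { "Invoice ", "No. ", "1024" }))
        {
            Console.WriteLine($"{span.Start}..{span.End}");
        }
    }
}
```

For the sample input, the spans are 0..8, 8..12, and 12..16.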

ODT files are OpenDocument Text files created with word processing programs such as the free OpenOffice Writer. Like DOCX files, ODT files can contain content like text, images, objects and styles, but they may not be readable by people who don't have an appropriate application installed. If you plan to share an ODT file, it's best to convert it to PDF so everyone can access it. In this article, we will explain how to convert ODT to PDF in C# and VB.NET using Spire.Doc for .NET.
Install Spire.Doc for .NET
To begin with, you need to add the DLL files included in the Spire.Doc for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.
PM> Install-Package Spire.Doc
Convert ODT to PDF using C# and VB.NET
The following are the steps to convert an ODT file to PDF:
- Create an instance of Document class.
- Load an ODT file using Document.LoadFromFile() method.
- Convert the ODT file to PDF using Document.SaveToFile(string fileName, FileFormat fileFormat) method.
using Spire.Doc;
namespace ConvertOdtToPdf
{
internal class Program
{
static void Main(string[] args)
{
//Create a Document instance
Document doc = new Document();
//Load an ODT file
doc.LoadFromFile("Sample.odt");
//Save the ODT file to PDF
doc.SaveToFile("OdtToPDF.pdf", FileFormat.PDF);
}
}
}

Compared with Word document format, pictures are more convenient to share and preview across platforms, because they do not require MS Word to be installed on machines. Moreover, converting Word to images can preserve the original appearance of the document, which is useful when further modifications are not desired. In this article, you will learn how to convert Word documents to images in C# and VB.NET using Spire.Doc for .NET.
- Convert Word to JPG in C#, VB.NET
- Convert Word to SVG in C#, VB.NET
- Convert Word to PNG with Customized Resolution in C#, VB.NET
Install Spire.Doc for .NET
To begin with, you need to add the DLL files included in the Spire.Doc for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.
PM> Install-Package Spire.Doc
Convert Word to JPG in C#, VB.NET
Spire.Doc for .NET offers the Document.SaveToImages() method to convert a whole Word document into individual Bitmap or Metafile images. Then, a Bitmap or Metafile image can be saved as a BMP, EMF, JPEG, PNG, GIF, or WMF format file. The following are the steps to convert a Word document to JPG images using this library.
- Create a Document object.
- Load a Word document using Document.LoadFromFile() method.
- Convert the document to Bitmap images using Document.SaveToImages() method.
- Loop through the image collection to get the specific one and save it as a JPG file.
using Spire.Doc;
using Spire.Doc.Documents;
using System;
using System.Drawing;
using System.Drawing.Imaging;
namespace ConvertWordToJPG
{
class Program
{
static void Main(string[] args)
{
//Create a Document object
Document doc = new Document();
//Load a Word document
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Template.docx");
//Convert the whole document into individual images
Image[] images = doc.SaveToImages(ImageType.Bitmap);
//Loop through the image collection
for (int i = 0; i < images.Length; i++)
{
//Save the image to a JPEG format file
string outputfile = String.Format("Image-{0}.jpg", i);
images[i].Save("C:\\Users\\Administrator\\Desktop\\Images\\" + outputfile, ImageFormat.Jpeg);
}
}
}
}
Convert Word to SVG in C#, VB.NET
Using Spire.Doc for .NET, you can save a Word document as a queue of byte arrays. Each byte array can then be written as an SVG file. The detailed steps to convert Word to SVG are as follows.
- Create a Document object.
- Load a Word file using Document.LoadFromFile() method.
- Save the document as a queue of byte arrays using Document.SaveToSVG() method.
- Loop through the items in the queue to get a specific byte array.
- Write the byte array to an SVG file.
using Spire.Doc;
using System;
using System.Collections.Generic;
using System.IO;
namespace ConvertWordToSVG
{
class Program
{
static void Main(string[] args)
{
//Create a Document object
Document doc = new Document();
//Load a Word document
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Template.docx");
//Save the document as a queue of byte arrays
Queue<byte[]> svgBytes = doc.SaveToSVG();
//Convert the queue to an array once, before the loop
byte[][] bytes = svgBytes.ToArray();
//Loop through the byte arrays
for (int i = 0; i < bytes.Length; i++)
{
//Specify the output file name
string outputfile = String.Format("Image-{0}.svg", i);
//Write the byte[] to an SVG format file
File.WriteAllBytes("C:\\Users\\Administrator\\Desktop\\Images\\" + outputfile, bytes[i]);
}
}
}
}
Convert Word to PNG with Customized Resolution in C#, VB.NET
An image with a higher resolution is generally clearer. You can customize the image resolution while converting Word to PNG with the following steps.
- Create a Document object.
- Load a Word file using Document.LoadFromFile() method.
- Convert the document to Metafile images using Document.SaveToImages() method.
- Loop through the image collection to get the specific one.
- Call the custom method ResetResolution() to reset the image resolution.
- Save the image as a PNG file.
using Spire.Doc;
using System;
using System.Drawing;
using System.Drawing.Imaging;
using Spire.Doc.Documents;
namespace ConvertWordToPng
{
class Program
{
static void Main(string[] args)
{
//Create a Document object
Document doc = new Document();
//Load a Word document
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Template.docx");
//Convert the whole document into individual images
Image[] images = doc.SaveToImages(ImageType.Metafile);
//Loop through the image collection
for (int i = 0; i < images.Length; i++)
{
//Reset the resolution of a specific image
Image newimage = ResetResolution(images[i] as Metafile, 150);
//Save the image to a PNG format file
string outputfile = String.Format("Image-{0}.png", i);
newimage.Save("C:\\Users\\Administrator\\Desktop\\Images\\" + outputfile, ImageFormat.Png);
}
}
//Set the image resolution by the ResetResolution() method
public static Image ResetResolution(Metafile mf, float resolution)
{
int width = (int)(mf.Width * resolution / mf.HorizontalResolution);
int height = (int)(mf.Height * resolution / mf.VerticalResolution);
Bitmap bmp = new Bitmap(width, height);
bmp.SetResolution(resolution, resolution);
using (Graphics g = Graphics.FromImage(bmp))
{
g.DrawImage(mf, Point.Empty);
}
return bmp;
}
}
}
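The ResetResolution() method above sizes the new bitmap by scaling each pixel dimension by the ratio of the target resolution to the source resolution. A minimal sketch of that arithmetic in plain C#, without System.Drawing, is shown below; the DpiScaling class and the 816 x 1056 pixel, 96 DPI page are assumed values chosen only to make the numbers concrete.

```csharp
using System;

public static class DpiScaling
{
    // Scales a pixel dimension from sourceDpi to targetDpi, truncating to int
    public static int ScalePixels(int sourcePixels, float sourceDpi, float targetDpi)
    {
        return (int)(sourcePixels * targetDpi / sourceDpi);
    }

    public static void Main()
    {
        // Hypothetical page rendered at 96 DPI, rescaled to 150 DPI
        int width = ScalePixels(816, 96f, 150f);
        int height = ScalePixels(1056, 96f, 150f);
        Console.WriteLine($"{width} x {height}"); // prints "1275 x 1650"
    }
}
```

Doubling the target resolution doubles both dimensions, so the pixel count, and roughly the memory needed for the bitmap, grows fourfold.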

It is possible to perform Word to PDF conversion in Azure apps such as Azure Web apps and Azure Functions apps using Spire.Doc for .NET. This article provides a code example for doing so.
The input Word document:

Step 1: Install Spire.Doc NuGet Package as a reference to your project from NuGet.org.

Step 2: Add the following code to convert Word to PDF.
//Create a Document instance
Document document = new Document(false);
//Load the Word document
document.LoadFromFile(@"sample.docx");
//Create a ToPdfParameterList instance
ToPdfParameterList ps = new ToPdfParameterList
{
UsePSCoversion = true
};
//Save Word document to PDF using PS conversion
document.SaveToFile("ToPdf.pdf", ps);
Private Sub SurroundingSub()
Dim document As Document = New Document(false)
document.LoadFromFile("sample.docx")
Dim ps As ToPdfParameterList = New ToPdfParameterList With {
.UsePSCoversion = True
}
document.SaveToFile("ToPdf.pdf", ps)
End Sub
The Output PDF document:

A PCL file is a digital printed document created in the Printer Command Language (more commonly referred to as PCL) page description language. Starting from v7.1.19, Spire.Doc supports converting Word documents to PCL. There are several PCL standards; the PCL here refers to PCL 6 (PCL 6 Enhanced or PCL XL). This article shows how to save a Word document as PCL in C# and VB.NET with only three lines of code.
using Spire.Doc;
namespace DOCPCL
{
class Program
{
static void Main(string[] args)
{
//load the sample document
Document doc = new Document();
doc.LoadFromFile("Sample.docx", FileFormat.Docx2010);
//save the document as a PCL file
doc.SaveToFile("Result.pcl", FileFormat.PCL);
}
}
}
Imports Spire.Doc
Namespace DOCPCL
Class Program
Private Shared Sub Main(args As String())
'load the sample document
Dim doc As New Document()
doc.LoadFromFile("Sample.docx", FileFormat.Docx2010)
'save the document as a PCL file
doc.SaveToFile("Result.pcl", FileFormat.PCL)
End Sub
End Class
End Namespace