Conversion
A brief introduction to Word XML
Word XML is an XML format that lets Word read and write documents stored as XML. It comes in two variants: WordML (supported by Word 2003) and Word XML (supported by Word 2007). If an external application generates data that follows the Word XML structure, Word can process that data. Word XML therefore acts as a bridge between Word and external applications: any XML document based on the Word XML structure can be opened, edited, and saved in Word.
Using C#/VB.NET to convert Word to Word XML via Spire.Doc
Spire.Doc lets you convert a Word document to Word XML format by calling the doc.SaveToFile() method. Follow the detailed steps below.
Note: Before you start, download and install Spire.Doc, then add the Spire.Doc.dll file from the Bin folder as a reference in your project.
Screenshot of the original Word document:

Step 1: Create a new document instance.
Document doc = new Document();
Step 2: Load the sample Word document from file.
doc.LoadFromFile("Spire.Doc for .NET.docx");
Step 3: Save the Word document in Word XML format.
For Word 2003:
doc.SaveToFile("DocxToWordML.xml", FileFormat.WordML);
For Word 2007:
doc.SaveToFile("DocxToWordXML.xml", FileFormat.WordXml);
Screenshot of the result:

Full code:
C#:
using Spire.Doc;
namespace Convert_Word_to_Word_XML
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a document and load the sample Word file
            Document doc = new Document();
            doc.LoadFromFile("Spire.Doc for .NET.docx");
            // Save as Word 2003 XML (WordML)
            doc.SaveToFile("DocxToWordML.xml", FileFormat.WordML);
            // Or save as Word 2007 XML
            //doc.SaveToFile("DocxToWordXML.xml", FileFormat.WordXml);
        }
    }
}
VB.NET:
Imports Spire.Doc
Namespace Convert_Word_to_Word_XML
    Class Program
        Private Shared Sub Main(args As String())
            ' Create a document and load the sample Word file
            Dim doc As New Document()
            doc.LoadFromFile("Spire.Doc for .NET.docx")
            ' Save as Word 2003 XML (WordML)
            doc.SaveToFile("DocxToWordML.xml", FileFormat.WordML)
            ' Or save as Word 2007 XML
            'doc.SaveToFile("DocxToWordXML.xml", FileFormat.WordXml)
        End Sub
    End Class
End Namespace
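To check which of the two variants a generated file actually contains, one option is to look at the root element of the XML. The sketch below uses only System.Xml.Linq and assumes that Spire.Doc writes the same root elements that Word itself uses: w:wordDocument for Word 2003 WordML and pkg:package for the Word 2007 flat XML package. The class and method names (WordXmlKind, Detect) are made up for this illustration and are not part of Spire.Doc.

```csharp
using System;
using System.Xml.Linq;

public static class WordXmlKind
{
    // Namespace of the <w:wordDocument> root used by Word 2003 WordML
    private const string WordMl2003 = "http://schemas.microsoft.com/office/word/2003/wordml";
    // Namespace of the <pkg:package> root used by the Word 2007 flat XML package
    private const string FlatPackage = "http://schemas.microsoft.com/office/2006/xmlPackage";

    // Classifies an XML string by the qualified name of its root element
    public static string Detect(string xml)
    {
        XElement root = XDocument.Parse(xml).Root;
        if (root.Name == XName.Get("wordDocument", WordMl2003)) return "WordML (Word 2003)";
        if (root.Name == XName.Get("package", FlatPackage)) return "Word XML (Word 2007)";
        return "Unknown";
    }

    public static void Main()
    {
        // Minimal stand-in for a generated file; a real file has many child elements
        string sample = "<w:wordDocument xmlns:w=\"" + WordMl2003 + "\"/>";
        Console.WriteLine(Detect(sample)); // prints "WordML (Word 2003)"
    }
}
```

For a file on disk, passing the result of System.IO.File.ReadAllText(path) to Detect should work the same way.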
RTF (Rich Text Format) is a cross-platform document format developed by Microsoft in the 1980s. Most word processors can open and edit RTF files. For sharing and printing documents in daily work, however, it is often better to convert RTF to PDF first. In this article, you will learn how to convert RTF to PDF programmatically using Spire.Doc for .NET.
Install Spire.Doc for .NET
To begin with, you need to add the DLL files included in the Spire.Doc for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.
PM> Install-Package Spire.Doc
Convert RTF to PDF in C# and VB.NET
Spire.Doc for .NET lets you load a file with the .rtf extension directly and convert it to PDF with only three lines of code. The detailed steps are as follows.
- Create a Document instance.
- Load a sample RTF document using the Document.LoadFromFile() method.
- Save the document as a PDF file using the Document.SaveToFile() method.
- C#
using Spire.Doc;
namespace RTFtoPDF
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Document instance
            Document doc = new Document();
            //Load a sample RTF document
            doc.LoadFromFile("sample.rtf", FileFormat.Rtf);
            //Save it to PDF
            doc.SaveToFile("RTFtoPDF.pdf", FileFormat.PDF);
        }
    }
}

Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
Spire.Doc provides strong support for working with RTF files in C# and VB.NET. With Spire.Doc, developers can convert RTF to PDF, HTML, and Word documents in .doc and .docx formats. This article shows how to convert RTF to images and then reset the image resolution.
Download and install Spire.Doc for .NET, then add Spire.Doc.dll from the Bin folder as a reference through the following path: "..\Spire.Doc\Bin\NET4.0\Spire.Doc.dll". The steps below show how to convert RTF to PNG and reset the image resolution in C#.
Step 1: Create a new document and load the RTF file.
Document doc = new Document();
doc.LoadFromFile("sample.rtf", FileFormat.Rtf);
Step 2: Save the RTF document as images.
Image[] images = doc.SaveToImages(Spire.Doc.Documents.ImageType.Metafile);
Step 3: Loop through the array of images, reset the resolution of each one, and save it in PNG format.
for (int i = 0; i < images.Length; i++)
{
    Metafile mf = images[i] as Metafile;
    Image newimage = ResetResolution(mf, 200);
    string outputfile = String.Format("image-{0}.png", i);
    newimage.Save(outputfile, System.Drawing.Imaging.ImageFormat.Png);
}
Step 4: Define the ResetResolution method called in Step 3, which redraws the metafile onto a bitmap at the requested resolution.
public static Image ResetResolution(Metafile mf, float resolution)
{
    int width = (int)(mf.Width * resolution / mf.HorizontalResolution);
    int height = (int)(mf.Height * resolution / mf.VerticalResolution);
    Bitmap bmp = new Bitmap(width, height);
    bmp.SetResolution(resolution, resolution);
    using (Graphics g = Graphics.FromImage(bmp))
    {
        g.DrawImage(mf, Point.Empty);
    }
    return bmp;
}
Screenshot of the image before resetting the resolution:

Screenshot of the image after resetting the resolution:

Full code:
using System;
using Spire.Doc;
using System.Drawing;
using System.Drawing.Imaging;
namespace RTFtoImage
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a new document and load the RTF file
            Document doc = new Document();
            doc.LoadFromFile("sample.rtf", FileFormat.Rtf);
            // Save the RTF document as metafile images, one per page
            Image[] images = doc.SaveToImages(Spire.Doc.Documents.ImageType.Metafile);
            for (int i = 0; i < images.Length; i++)
            {
                Metafile mf = images[i] as Metafile;
                Image newimage = ResetResolution(mf, 200);
                string outputfile = String.Format("image-{0}.png", i);
                newimage.Save(outputfile, System.Drawing.Imaging.ImageFormat.Png);
            }
        }
        // Redraw the metafile onto a bitmap at the requested resolution
        public static Image ResetResolution(Metafile mf, float resolution)
        {
            int width = (int)(mf.Width * resolution / mf.HorizontalResolution);
            int height = (int)(mf.Height * resolution / mf.VerticalResolution);
            Bitmap bmp = new Bitmap(width, height);
            bmp.SetResolution(resolution, resolution);
            using (Graphics g = Graphics.FromImage(bmp))
            {
                g.DrawImage(mf, Point.Empty);
            }
            return bmp;
        }
    }
}
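The width and height formulas in ResetResolution keep the physical size of the page constant while changing the number of pixels. As a worked example of that arithmetic only, the sketch below applies the same formula to a hypothetical page of 816 x 1056 pixels at 96 dpi (8.5 x 11 inches). The class name, method name, and sample numbers are assumptions for illustration; no Spire.Doc or System.Drawing call is involved.

```csharp
using System;

public static class ResolutionMath
{
    // Pixel count that keeps the same physical size at a new resolution:
    // pixels * targetDpi / sourceDpi, truncated like the cast in ResetResolution
    public static int ScalePixels(int pixels, float sourceDpi, float targetDpi)
    {
        return (int)(pixels * targetDpi / sourceDpi);
    }

    public static void Main()
    {
        // Hypothetical page rendered at 96 dpi, rescaled to 200 dpi
        int width = ScalePixels(816, 96f, 200f);
        int height = ScalePixels(1056, 96f, 200f);
        Console.WriteLine(width + " x " + height); // prints "1700 x 2200"
    }
}
```

Both results still describe an 8.5 x 11 inch page (1700 / 200 and 2200 / 200), which is why the output image looks sharper rather than larger when printed.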
PDF/A is an ISO-standardized version of the Portable Document Format (PDF) specialized for the digital preservation of electronic documents. It is widely used for the long-term archiving of PDF files. This article shows how to convert Word documents (.doc and .docx) to PDF/A in C# using Spire.Doc.
Make sure Spire.Doc for .NET version 5.0.26 or above is installed correctly, then add Spire.Doc.dll from the Bin folder as a reference through the following path: "..\Spire.Doc\Bin\NET4.0\Spire.Doc.dll".
First, check the original Word document that will be converted to PDF/A.
The following steps show how to convert a Word document to PDF/A directly using Spire.Doc:
Step 1: Load a Word document from file.
Document document = new Document();
document.LoadFromFile(@"D:\test.docx", FileFormat.Docx);
Step 2: Set the PDF conformance level to PDF/A-1b.
ToPdfParameterList toPdf = new ToPdfParameterList();
toPdf.PdfConformanceLevel = Spire.Pdf.PdfConformanceLevel.Pdf_A1B;
Step 3: Save the Word document as PDF.
document.SaveToFile("result.pdf", toPdf);
Screenshot of the resulting PDF in PDF/A format:
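A PDF/A file declares its conformance level in its XMP metadata through the pdfaid:part and pdfaid:conformance properties, so a quick sanity check of a converted file is to look for those two values. The sketch below does this with a regular expression over the file's text. It is an illustration under stated assumptions, not a PDF/A validator: it assumes the metadata is stored unencoded (as PDF/A-1 requires) and in a single-byte-compatible encoding, and the names PdfAInfo and ConformanceOf are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PdfAInfo
{
    // Finds a pdfaid property written either as an element (<pdfaid:part>1</pdfaid:part>)
    // or as an attribute (pdfaid:part="1")
    private static string Find(string text, string name)
    {
        Match m = Regex.Match(text, "pdfaid:" + name + "(?:>|=\")\\s*([0-9A-Za-z]+)");
        return m.Success ? m.Groups[1].Value : null;
    }

    // Returns a label such as "1B", or null when no PDF/A identification is found
    public static string ConformanceOf(string pdfText)
    {
        string part = Find(pdfText, "part");
        string level = Find(pdfText, "conformance");
        if (part == null || level == null) return null;
        return part + level.ToUpperInvariant();
    }

    public static void Main()
    {
        // Minimal stand-in for the XMP packet inside a PDF/A-1b file
        string xmp = "<pdfaid:part>1</pdfaid:part><pdfaid:conformance>B</pdfaid:conformance>";
        Console.WriteLine(ConformanceOf(xmp)); // prints "1B"
    }
}
```

For a file on disk, the text could be obtained with System.IO.File.ReadAllText(path, System.Text.Encoding.ASCII). A full conformance check still needs a dedicated PDF/A validation tool.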
HTML is the standard format for web pages and online content. However, there are many scenarios where you may need to convert HTML documents into other file formats, such as PDF, XPS, and XML. Whether you're looking to generate a printable version of a web page, share HTML content in a more universally accepted format, or extract data from HTML for further processing, being able to reliably convert HTML documents to these alternate formats is an important skill to have. In this article, we will demonstrate how to convert HTML to PDF, XPS, and XML in C# using Spire.Doc for .NET.
- Convert HTML to PDF in C#
- Convert HTML String to PDF in C#
- Convert HTML to XPS in C#
- Convert HTML to XML in C#
Convert HTML to PDF in C#
Converting HTML to PDF offers several advantages, including enhanced portability, consistent formatting, and easy sharing. PDF files retain the original layout, styling, and visual elements of the HTML content, ensuring that the document appears the same across different devices and platforms.
You can use the Document.SaveToFile(string filename, FileFormat.PDF) method to convert an HTML file to PDF format. The detailed steps are as follows.
- Create an instance of the Document object.
- Load an HTML file using the Document.LoadFromFile() method.
- Save the HTML file to PDF format using the Document.SaveToFile(string filename, FileFormat.PDF) method.
- C#
using Spire.Doc;
using Spire.Doc.Documents;
namespace ConvertHtmlToPdf
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of the Document class
            Document doc = new Document();
            // Load an HTML file
            doc.LoadFromFile("Sample.html", FileFormat.Html, XHTMLValidationType.None);
            // Convert the HTML file to PDF format
            doc.SaveToFile("HtmlToPDF.pdf", FileFormat.PDF);
            doc.Close();
        }
    }
}

Convert HTML String to PDF in C#
In addition to converting HTML files to PDF, you are also able to convert HTML strings to PDF. Spire.Doc for .NET provides the Paragraph.AppendHTML() method to add an HTML string to a Word document. Once the HTML string has been added, you can convert the result document to PDF using the Document.SaveToFile(string filename, FileFormat.PDF) method. The detailed steps are as follows.
- Create an instance of the Document object.
- Add a paragraph to the document using the Document.AddSection().AddParagraph() method.
- Append an HTML string to the paragraph using the Paragraph.AppendHTML() method.
- Save the document to PDF format using the Document.SaveToFile(string filename, FileFormat.PDF) method.
- C#
using Spire.Doc;
using Spire.Doc.Documents;
namespace ConvertHtmlStringToPdf
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of the Document class
            Document doc = new Document();
            // Add a paragraph to the document
            Paragraph para = doc.AddSection().AddParagraph();
            // Specify the HTML string
            string htmlString = @"<h1>This is a Heading</h1>
<p>This is a paragraph.</p>
<ul>
<li>Item 1</li>
<li>Item 2</li>
<li>Item 3</li>
</ul>";
            // Append the HTML string to the paragraph
            para.AppendHTML(htmlString);
            // Convert the document to PDF format
            doc.SaveToFile("HtmlStringToPDF.pdf", FileFormat.PDF);
            doc.Close();
        }
    }
}

Convert HTML to XPS in C#
XPS, or XML Paper Specification, is an alternative format to PDF that provides similar functionality and advantages. Converting HTML to XPS ensures the preservation of document layout, fonts, and images while maintaining high fidelity. XPS files are optimized for printing and can be viewed using XPS viewers or Windows' built-in XPS Viewer.
By using the Document.SaveToFile(string filename, FileFormat.XPS) method, you can convert HTML files to XPS format with ease. The detailed steps are as follows.
- Create an instance of the Document object.
- Load an HTML file using the Document.LoadFromFile() method.
- Save the HTML file to XPS format using the Document.SaveToFile(string filename, FileFormat.XPS) method.
- C#
using Spire.Doc;
using Spire.Doc.Documents;
namespace ConvertHtmlToXps
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of the Document class
            Document doc = new Document();
            // Load an HTML file
            doc.LoadFromFile("Sample.html", FileFormat.Html, XHTMLValidationType.None);
            // Convert the HTML file to XPS format
            doc.SaveToFile("HtmlToXPS.xps", FileFormat.XPS);
            doc.Close();
        }
    }
}

Convert HTML to XML in C#
Converting HTML to XML unlocks the potential for data extraction, manipulation, and integration with other systems. XML is a flexible and extensible markup language that allows for structured representation of data. By converting HTML to XML, you can extract specific elements, organize data hierarchically, and perform data analysis or integration tasks using XML processing tools and techniques.
To convert HTML files to XML format, you can use the Document.SaveToFile(string filename, FileFormat.Xml) method. The detailed steps are as follows.
- Create an instance of the Document object.
- Load an HTML file using the Document.LoadFromFile() method.
- Save the HTML file to XML format using the Document.SaveToFile(string filename, FileFormat.Xml) method.
- C#
using Spire.Doc;
using Spire.Doc.Documents;
namespace ConvertHtmlToXml
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of the Document class
            Document doc = new Document();
            // Load an HTML file
            doc.LoadFromFile("Sample.html", FileFormat.Html, XHTMLValidationType.None);
            // Convert the HTML file to XML format
            doc.SaveToFile("HtmlToXML.xml", FileFormat.Xml);
            doc.Close();
        }
    }
}

Converting Word documents to HTML is a common task for developers. With Spire.Doc for .NET, a professional Word component that does not require Microsoft Word to be installed, developers can convert Word to HTML with only two key lines of C# code. Spire.Doc also supports converting HTML back to a Word document easily and quickly.
This article also covers conversion between Word and HTML, but it focuses on embedding images when a Word document is exported to HTML. Starting with version 4.9.32, Spire.Doc supports the ImageEmbedded option.
Download Spire.Doc (version 4.9.32 or above) for the .NET Framework and follow the steps below:
Convert Word to HTML in C#:
Step 1: Create a Document instance and load the Word document (the file name below is a placeholder).
Document doc = new Document();
doc.LoadFromFile("sample.docx");
Step 2: Set the ImageEmbedded property to true.
doc.HtmlExportOptions.ImageEmbedded = true;
Step 3: Save the Word document as HTML.
doc.SaveToFile("result.html", FileFormat.Html);
Spire.Doc can also load the resulting HTML page and convert it back to a Word document in three lines of code (XHTMLValidationType is in the Spire.Doc.Documents namespace):
Document htmlDoc = new Document();
htmlDoc.LoadFromFile("result.html", FileFormat.Html, XHTMLValidationType.None);
htmlDoc.SaveToFile("htmltoword.docx", FileFormat.Docx);
Besides converting between Word and HTML, Spire.Doc also supports converting Word to PDF, Word to image, and Word to XPS in C#.
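One way to see the effect of the ImageEmbedded option is to inspect the img tags in the exported HTML. The sketch below counts images whose src is a data: URI versus images that point to a separate file. It assumes that embedded images are written as data: URIs, which is the usual way to inline an image in HTML but is not confirmed by this article for Spire.Doc. The class and method names are hypothetical, and the regular expression is a simple approximation rather than a full HTML parser.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmbeddedImageCheck
{
    // Captures the quoted src value of each <img> tag
    private const string ImgSrcPattern = @"<img\b[^>]*?\bsrc\s*=\s*[""']([^""']*)[""']";

    // Counts <img> tags that are embedded (data: URI) or linked (anything else)
    public static int CountImages(string html, bool embedded)
    {
        int count = 0;
        foreach (Match m in Regex.Matches(html, ImgSrcPattern, RegexOptions.IgnoreCase))
        {
            string src = m.Groups[1].Value.TrimStart();
            bool isDataUri = src.StartsWith("data:", StringComparison.OrdinalIgnoreCase);
            if (isDataUri == embedded) count++;
        }
        return count;
    }

    public static void Main()
    {
        // Hypothetical export containing one inline image and one external image
        string html = "<p><img src=\"data:image/png;base64,iVBORw0KGgo=\"/>"
                    + "<img alt='logo' src='result_files/image1.png'/></p>";
        Console.WriteLine(CountImages(html, true));  // prints "1"
        Console.WriteLine(CountImages(html, false)); // prints "1"
    }
}
```

If the option works as described, an export made with ImageEmbedded set to true should report no linked images, so the HTML file can be moved without its image folder.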
This article introduces an easy way to convert Word to EMF using Spire.Doc, a standalone Word component for .NET that does not require Microsoft Word to be installed on the machine. Spire.Doc also supports converting Word and HTML to commonly used image formats such as JPEG, PNG, GIF, BMP, and TIFF.
EMF is the file extension for Enhanced Metafile, a graphics format that the Windows operating system also uses as a graphics language for printer drivers. The Enhanced Metafile format was introduced in 1993 with the 32-bit Win32 GDI and added commands that the older Windows Metafile (WMF) format lacks. Microsoft recommends using the enhanced-format (EMF) functions instead of the older Windows-format (WMF) functions.
Spire.Doc converts Word to EMF with the following five lines of code.
using Spire.Doc;
using System.Drawing.Imaging;
namespace DOCEMF
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of Spire.Doc.Document
            Document doc = new Document();
            // Load the file from the specified path
            doc.LoadFromFile(@"../../Original Word.docx", FileFormat.Docx);
            // Convert the first page of the document to an image
            System.Drawing.Image image = doc.SaveToImages(0, Spire.Doc.Documents.ImageType.Metafile);
            // Save the image to an EMF file
            image.Save(@"../../Convert Word to Image.emf", ImageFormat.Emf);
            // Close the document
            doc.Close();
        }
    }
}
Screenshot of the result:
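It can be worth confirming that the saved file really contains EMF data. The .NET documentation for Image.Save notes that GDI+ has no encoder for WMF or EMF, so a file saved with ImageFormat.Emf may be written as PNG despite its extension. The sketch below identifies the format from the leading bytes: a PNG file starts with a fixed 8-byte signature, while an EMF file starts with an EMR_HEADER record (type 1) and carries the signature " EMF" at byte offset 40. The class and method names are invented for this illustration.

```csharp
using System;

public static class ImageSignature
{
    // Returns "EMF", "PNG", or "Unknown" based on the leading bytes of a file
    public static string Identify(byte[] data)
    {
        if (data == null) return "Unknown";
        // PNG files start with the signature 89 50 4E 47 0D 0A 1A 0A
        byte[] png = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };
        if (Matches(data, 0, png)) return "PNG";
        // EMF files start with record type 1 (little-endian) and hold " EMF" at offset 40
        byte[] headerType = { 0x01, 0x00, 0x00, 0x00 };
        byte[] emf = { 0x20, 0x45, 0x4D, 0x46 };
        if (Matches(data, 0, headerType) && Matches(data, 40, emf)) return "EMF";
        return "Unknown";
    }

    // True when the bytes at the given offset equal the expected sequence
    private static bool Matches(byte[] data, int offset, byte[] expected)
    {
        if (data.Length < offset + expected.Length) return false;
        for (int i = 0; i < expected.Length; i++)
        {
            if (data[offset + i] != expected[i]) return false;
        }
        return true;
    }

    public static void Main()
    {
        // Minimal stand-in for an EMF header; a real file is much longer
        byte[] sample = new byte[44];
        sample[0] = 0x01;
        sample[40] = 0x20; sample[41] = 0x45; sample[42] = 0x4D; sample[43] = 0x46;
        Console.WriteLine(Identify(sample)); // prints "EMF"
    }
}
```

For a file on disk, the bytes can be read with System.IO.File.ReadAllBytes(path). If the result is "PNG", the metafile would need to be written through a different route than Image.Save.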
XPS (XML Paper Specification) is a fixed-layout document format designed to preserve document fidelity and provide device-independent document appearance. It is similar to PDF, but is based on XML rather than PostScript. If you want to save a Word document to a fixed-layout file format, XPS would be an option. This article will demonstrate how to convert Word documents to XPS in C# and VB.NET using Spire.Doc for .NET.
Convert Word to XPS in C# and VB.NET
The following are the detailed steps to convert a Word document to XPS using Spire.Doc for .NET:
- Initialize an instance of the Document class.
- Load a Word document using the Document.LoadFromFile() method.
- Save the Word document to XPS using the Document.SaveToFile(string filePath, FileFormat fileFormat) method.
- C#
using Spire.Doc;
namespace ConvertWordToXps
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Document instance
            Document doc = new Document();
            //Load a Word document
            doc.LoadFromFile("Sample.docx");
            //Convert the document to XPS
            doc.SaveToFile("ToXPS.xps", FileFormat.XPS);
        }
    }
}

EPUB is a standard file format for publishing eBooks or other electronic documents. The content in an EPUB file is reflowable, which means that the content automatically adjusts itself to fit the screen it is being displayed on. People who want to publish their eBooks may need to convert their works stored in Word documents to EPUB files. In this article, you will learn how to programmatically achieve this task using Spire.Doc for .NET.
Convert Word to EPUB
The detailed steps are as follows:
- Create a Document instance.
- Load a sample Word document using the Document.LoadFromFile() method.
- Save the document to EPUB using the Document.SaveToFile() method.
- C#
using Spire.Doc;
namespace WordtoEPUB
{
    class Epub
    {
        static void Main(string[] args)
        {
            //Create a Document instance
            Document document = new Document();
            //Load a sample Word document
            document.LoadFromFile("demo.docx");
            //Convert the Word document to EPUB
            document.SaveToFile("ToEpub.epub", FileFormat.EPub);
        }
    }
}

Convert Word to EPUB with a Cover Image
The detailed steps are as follows.
- Create a Document instance.
- Load a sample Word document using the Document.LoadFromFile() method.
- Create a DocPicture instance, and then load an image using the DocPicture.LoadImage() method.
- Save the Word document to EPUB with a cover image using the Document.SaveToEpub(String, DocPicture) method.
- C#
using Spire.Doc;
using Spire.Doc.Fields;
using System.Drawing;
namespace ConvertWordToEpubWithCoverImage
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Document instance
            Document doc = new Document();
            //Load a sample Word document
            doc.LoadFromFile("demo.docx");
            //Create a DocPicture instance
            DocPicture picture = new DocPicture(doc);
            //Load an image
            picture.LoadImage(Image.FromFile("CoverImage.png"));
            //Save the Word document to EPUB with cover image
            doc.SaveToEpub("ToEpubWithCoverImage.epub", picture);
        }
    }
}
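An EPUB file is a ZIP archive that contains an entry named mimetype whose content is application/epub+zip, so the output of either conversion can be sanity-checked without an e-reader. The sketch below opens a stream as a ZIP archive and reads that entry. It uses only System.IO.Compression, and the class and method names are invented for this illustration. The demo builds a tiny archive in memory instead of reading a converted file, so it checks the technique rather than Spire.Doc's output.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class EpubCheck
{
    // True when the archive has a "mimetype" entry declaring application/epub+zip
    public static bool HasEpubMimetype(Stream epub)
    {
        using (var zip = new ZipArchive(epub, ZipArchiveMode.Read, true))
        {
            ZipArchiveEntry entry = zip.GetEntry("mimetype");
            if (entry == null) return false;
            using (var reader = new StreamReader(entry.Open(), Encoding.ASCII))
            {
                return reader.ReadToEnd().Trim() == "application/epub+zip";
            }
        }
    }

    public static void Main()
    {
        // Build a minimal in-memory archive that stands in for an EPUB file
        var buffer = new MemoryStream();
        using (var zip = new ZipArchive(buffer, ZipArchiveMode.Create, true))
        {
            using (var writer = new StreamWriter(zip.CreateEntry("mimetype").Open()))
            {
                writer.Write("application/epub+zip");
            }
        }
        buffer.Position = 0;
        Console.WriteLine(HasEpubMimetype(buffer)); // prints "True"
    }
}
```

For a converted file, the stream could come from System.IO.File.OpenRead(path). On the .NET Framework, the project also needs a reference to the System.IO.Compression assembly.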

Converting HTML to images turns dynamic web content, such as text, graphics, and layouts, into static formats like PNG or JPEG. This process is ideal for capturing web pages for documentation, generating thumbnails, or ensuring visually consistent content across platforms, providing both accuracy and versatility.
In this article, you'll learn how to convert HTML files and strings to images using C# with Spire.Doc for .NET.
Convert an HTML File to Image in C#
Using Spire.Doc for .NET, you can directly load an HTML file by utilizing the Document.LoadFromFile() method. Once loaded, you can convert the document into Bitmap images with the Document.SaveToImages() method. Afterward, you can loop through the generated images and save each one in widely-used image formats such as PNG, JPG, or BMP.
The following are the steps to convert an HTML file to images using Spire.Doc in C#:
- Create a Document object.
- Load an HTML file using the Document.LoadFromFile() method.
- Adjust properties such as margins, which will affect the output image's layout.
- Call the Document.SaveToImages() method to convert the loaded document into an array of Bitmap images.
- Iterate through the images and save each one to your desired output format.
- C#
using Spire.Doc;
using Spire.Doc.Documents;
using System.Drawing;
using System.Drawing.Imaging;
namespace ConvertHtmlFileToPng
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a Document object
            Document document = new Document();
            // Load an HTML file
            document.LoadFromFile(@"C:\Users\Administrator\Desktop\MyHtml.html", FileFormat.Html, XHTMLValidationType.None);
            // Get the first section
            Section section = document.Sections[0];
            // Set the page margins
            section.PageSetup.Margins.All = 2;
            // Convert the document to an array of bitmap images
            Image[] images = document.SaveToImages(ImageType.Bitmap);
            // Iterate through the images
            for (int index = 0; index < images.Length; index++)
            {
                // Specify the output file name
                string fileName = string.Format(@"C:\Users\Administrator\Desktop\Output\image_{0}.png", index);
                // Save each image as a PNG file
                images[index].Save(fileName, ImageFormat.Png);
            }
            // Dispose resources
            document.Dispose();
        }
    }
}

Convert an HTML String to Image in C#
In certain scenarios, you might need to convert an HTML string directly into an image. This approach is especially beneficial for handling dynamically generated content or when you prefer not to depend on external HTML files.
Here is how you can convert an HTML string to images using Spire.Doc in C#:
- Create a Document object.
- Add a section and a paragraph to the document.
- Adjust properties such as margins, which will affect the output image's layout.
- Read the HTML string from a file or define it directly in the script.
- Use the Paragraph.AppendHTML() method to render the HTML content in the document.
- Call the Document.SaveToImages() method to convert the document into an array of Bitmap images.
- Iterate through the images and save each one to your desired output format.
- C#
using Spire.Doc;
using Spire.Doc.Documents;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
namespace ConvertHtmlStringToPng
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a Document object
            Document document = new Document();
            // Add a section to the document
            Section section = document.AddSection();
            // Set the page margins
            section.PageSetup.Margins.All = 2;
            // Add a paragraph to the section
            Paragraph paragraph = section.AddParagraph();
            // Read HTML string from a file
            string htmlFilePath = @"C:\Users\Administrator\Desktop\Html.html";
            string htmlString = File.ReadAllText(htmlFilePath, System.Text.Encoding.UTF8);
            // Append the HTML string to the paragraph
            paragraph.AppendHTML(htmlString);
            // Convert the document to an array of bitmap images
            Image[] images = document.SaveToImages(ImageType.Bitmap);
            // Iterate through the images
            for (int index = 0; index < images.Length; index++)
            {
                // Specify the output file name
                string fileName = string.Format(@"C:\Users\Administrator\Desktop\Output\image_{0}.png", index);
                // Save each image as a PNG file
                images[index].Save(fileName, ImageFormat.Png);
            }
            // Dispose resources
            document.Dispose();
        }
    }
}
