Spire.XLS for .NET is a professional Excel API for .NET that can create, read, write, convert, and print Excel files in any type of .NET application (C#, VB.NET, ASP.NET, .NET Core, .NET 5.0, .NET 6.0, MonoAndroid, and Xamarin.iOS). It offers an object-model API that speeds up Excel programming on the .NET platform: creating new Excel documents from templates, editing existing Excel documents, and converting Excel files.
Spire.XLS for .NET enjoys a good reputation among both enterprise and individual customers, including banks, data processing houses, educational institutions, government organizations, insurance firms, legal institutions, and postal/cargo services.
Extracting images from a Word document programmatically can be useful for automating document processing tasks. In this article, we’ll demonstrate how to extract images from a Word file using C# and the Spire.Doc for .NET library. Spire.Doc is a powerful .NET library that enables developers to manipulate Word documents efficiently.
- Getting Started: Installing Spire.Doc
- Steps for Extracting Images from Word
- Using the Code
- Additional Tips & Best Practices
- Conclusion
Getting Started: Installing Spire.Doc
Before you can start extracting images, you need to install Spire.Doc for .NET. Here's how:
- Using NuGet Package Manager:
  - Open your Visual Studio project.
  - Right-click on the project in the Solution Explorer and select "Manage NuGet Packages."
  - Search for "Spire.Doc" and install the latest version.
- Manual Installation:
  - Download the Spire.Doc package from the official website.
  - Extract the files and reference the DLLs in your project.
Once installed, you're ready to begin.
Steps for Extracting Images from Word
- Import the Spire.Doc namespaces.
- Load the Word document.
- Iterate through sections, paragraphs, and child objects.
- Identify images and save them to a specified location.
Using the Code
The following C# code demonstrates how to extract images from a Word document:
- C#
    using System.IO;
    using Spire.Doc;
    using Spire.Doc.Documents;
    using Spire.Doc.Fields;

    namespace ExtractImages
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Initialize a Document object
                Document document = new Document();
                // Load the Word file
                document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.docx");
                // Make sure the output folder exists before saving into it
                Directory.CreateDirectory("output");
                // Counter for image files
                int index = 0;
                // Loop through each section in the document
                foreach (Section section in document.Sections)
                {
                    // Loop through paragraphs in the section
                    foreach (Paragraph paragraph in section.Paragraphs)
                    {
                        // Loop through objects in the paragraph
                        foreach (DocumentObject docObject in paragraph.ChildObjects)
                        {
                            // Check if the object is an image
                            if (docObject.DocumentObjectType == DocumentObjectType.Picture)
                            {
                                // Save the image as a PNG file
                                DocPicture picture = docObject as DocPicture;
                                picture.Image.Save(string.Format("output/image_{0}.png", index), System.Drawing.Imaging.ImageFormat.Png);
                                index++;
                            }
                        }
                    }
                }
                // Dispose resources
                document.Dispose();
            }
        }
    }
The extracted images will be saved in the "output" folder with filenames like image_0.png, image_1.png, etc.

Additional Tips & Best Practices
- Handling Different Image Formats:
  - Convert images to another format (such as JPEG or BMP) by replacing ImageFormat.Png in the Save call.
  - Consider ImageFormat.Jpeg for smaller files; note that JPEG is lossy and does not preserve transparency.
- Error Handling:
  - C#
    try
    {
        // extraction code
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Error: {ex.Message}");
    }
- Performance Optimization:
  - For large documents, consider using parallel processing
  - Implement progress reporting for user feedback
- Advanced Extraction Scenarios:
  - Extract images from headers/footers by checking Section.HeadersFooters
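The header/footer tip can be sketched as follows. This is a minimal sketch, not a definitive implementation: it assumes that Section.HeadersFooters exposes Header and Footer objects whose Paragraphs can be walked the same way as a section's paragraphs, and it only looks at the default header and footer (not first-page or even-page variants). The input path, the output file naming, and the class name HeaderFooterImages are placeholders chosen for illustration.

```csharp
using System.IO;
using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;

class HeaderFooterImages
{
    static void Main()
    {
        Document document = new Document();
        // Placeholder path: point this at your own file
        document.LoadFromFile("input.docx");
        // Make sure the output folder exists before saving into it
        Directory.CreateDirectory("output");
        int index = 0;

        foreach (Section section in document.Sections)
        {
            // Assumption: only the default header and footer are inspected
            HeaderFooter[] parts =
            {
                section.HeadersFooters.Header,
                section.HeadersFooters.Footer
            };

            foreach (HeaderFooter part in parts)
            {
                foreach (Paragraph paragraph in part.Paragraphs)
                {
                    foreach (DocumentObject docObject in paragraph.ChildObjects)
                    {
                        // Skip anything that is not a picture
                        if (docObject is DocPicture picture)
                        {
                            picture.Image.Save(
                                string.Format("output/headerfooter_{0}.png", index),
                                System.Drawing.Imaging.ImageFormat.Png);
                            index++;
                        }
                    }
                }
            }
        }

        document.Dispose();
    }
}
```

Like the main example, this only checks pictures that are direct children of a paragraph; pictures nested deeper (for instance inside a table in the header) would need a recursive or queue-based walk.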
Conclusion
Using Spire.Doc in C# simplifies the process of extracting images from Word documents. This approach is efficient and can be integrated into larger document-processing workflows.
Beyond images, Spire.Doc also supports extracting various other elements from Word documents, including:
- Text
- Metadata
- Tables
- Comments
- Textboxes
- Hyperlinks
- OLE Objects
Whether you're building a document management system or automating report generation, Spire.Doc provides a reliable way to handle Word documents programmatically.
Get a Free License
To fully experience the capabilities of Spire.Doc for .NET without any evaluation limitations, you can request a free 30-day trial license.
Document Tree Traversal
- C#
    using System;
    using System.Collections.Generic;
    using Spire.Doc;
    using Spire.Doc.Documents;
    using Spire.Doc.Fields;
    using Spire.Doc.Interface;
    using Spire.Doc.Collections;

    namespace ExtractText
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Open a Word document.
                Document document = new Document("Sample.doc");
                IList<IDocumentObject> nodes = GetAllObjects(document);
                foreach (IDocumentObject node in nodes)
                {
                    // Print the text of every TextRange node.
                    if (node.DocumentObjectType == DocumentObjectType.TextRange)
                    {
                        TextRange textNode = node as TextRange;
                        Console.WriteLine(textNode.Text);
                    }
                }
            }

            private static IList<IDocumentObject> GetAllObjects(Document document)
            {
                // List that collects every object found below the document.
                List<IDocumentObject> nodes = new List<IDocumentObject>();
                // Queue of containers whose children still have to be visited.
                Queue<ICompositeObject> containers = new Queue<ICompositeObject>();
                // Start from the document itself.
                containers.Enqueue(document);
                while (containers.Count > 0)
                {
                    ICompositeObject container = containers.Dequeue();
                    DocumentObjectCollection docObjects = container.ChildObjects;
                    foreach (DocumentObject docObject in docObjects)
                    {
                        nodes.Add(docObject);
                        // Objects that can hold children are queued for a later visit.
                        if (docObject is ICompositeObject)
                        {
                            containers.Enqueue(docObject as ICompositeObject);
                        }
                    }
                }
                return nodes;
            }
        }
    }
- VB.NET
    Imports System
    Imports System.Collections.Generic
    Imports Spire.Doc
    Imports Spire.Doc.Documents
    Imports Spire.Doc.Fields
    Imports Spire.Doc.Interface
    Imports Spire.Doc.Collections

    Module Module1
        Sub Main()
            'Open a Word document.
            Dim document As New Document("Sample.doc")
            Dim nodes As IList(Of IDocumentObject) = GetAllObjects(document)
            For Each node As IDocumentObject In nodes
                'Print the text of every TextRange node.
                If node.DocumentObjectType = DocumentObjectType.TextRange Then
                    Dim textNode As TextRange = TryCast(node, TextRange)
                    Console.WriteLine(textNode.Text)
                End If
            Next
        End Sub

        Function GetAllObjects(ByVal document As Document) As IList(Of IDocumentObject)
            'List that collects every object found below the document.
            Dim nodes As New List(Of IDocumentObject)()
            'Queue of containers whose children still have to be visited.
            Dim containers As New Queue(Of ICompositeObject)()
            'Start from the document itself.
            containers.Enqueue(document)
            While containers.Count > 0
                Dim container As ICompositeObject = containers.Dequeue()
                Dim docObjects As DocumentObjectCollection = container.ChildObjects
                For Each docObject As DocumentObject In docObjects
                    nodes.Add(docObject)
                    'Objects that can hold children are queued for a later visit.
                    If TypeOf docObject Is ICompositeObject Then
                        containers.Enqueue(TryCast(docObject, ICompositeObject))
                    End If
                Next
            End While
            Return nodes
        End Function
    End Module
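The GetAllObjects helper above is a queue-based, breadth-first walk: each container is taken from the queue, all of its children are recorded, and any child that can hold children itself is queued for a later visit. The library-free sketch below shows the same pattern on a small hand-built tree so the visiting order is easy to see. The Node and TreeWalker types are hypothetical stand-ins invented for this illustration and are not part of Spire.Doc.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a document object; not a Spire.Doc type.
class Node
{
    public string Name { get; }
    public List<Node> Children { get; } = new List<Node>();

    public Node(string name, params Node[] children)
    {
        Name = name;
        Children.AddRange(children);
    }
}

static class TreeWalker
{
    // Returns the names of all descendants of root, level by level.
    // As in GetAllObjects, the root itself is not part of the result.
    public static List<string> GetAllNames(Node root)
    {
        var names = new List<string>();
        var containers = new Queue<Node>();
        containers.Enqueue(root);

        while (containers.Count > 0)
        {
            Node container = containers.Dequeue();
            foreach (Node child in container.Children)
            {
                names.Add(child.Name);
                // Only nodes that actually have children need another visit
                if (child.Children.Count > 0)
                {
                    containers.Enqueue(child);
                }
            }
        }
        return names;
    }
}

class Program
{
    static void Main()
    {
        var document = new Node("document",
            new Node("section",
                new Node("paragraph", new Node("text"), new Node("picture")),
                new Node("table", new Node("row"))));

        // Prints: section, paragraph, table, text, picture, row
        Console.WriteLine(string.Join(", ", TreeWalker.GetAllNames(document)));
    }
}
```

Because the walk descends into every container, filtering the resulting list by object type reaches items nested inside tables or other containers, which a loop over section paragraphs alone would miss.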


