C#: Add Gutters on Word Document Pages

2024-03-27 01:31:57 Written by Koohji

A gutter is extra space added to a page margin so that text is not hidden when the printed document is bound. Adding gutters to a Word document keeps the layout readable after binding and reproduces the binding edge of a physical document. This article explains how to use Spire.Doc for .NET to add gutters to Word document pages in a C# project.

Add a Gutter at the Top of a Word Document Page using C#
Add a Gutter at the Left of a Word Document Page using C#

Install Spire.Doc for .NET

To begin, add the DLL files included in the Spire.Doc for .NET package as references in your .NET project. The DLL files can be either downloaded directly or installed via NuGet.

Package Manager

PM> Install-Package Spire.Doc

Add a Gutter at the Top of a Word Document Page using C#

To enable the top gutter on a page, set section.PageSetup.IsTopGutter to true. By default the gutter area is blank; this example also shows how to add text within the gutter area. Here are the detailed steps:

Create a Document object.
Load a document using the Document.LoadFromFile() method.
Iterate through all sections of the document using a for loop over the Document.Sections collection.
Set Section.PageSetup.IsTopGutter to true to display the gutter at the top of the page.
Use the Section.PageSetup.Gutter property to set the width of the gutter.
Call the custom AddTopGutterText() method to add text to the gutter area.
Save the document using the Document.SaveToFile() method.

using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;
using Spire.Doc.Formatting;
using System.Drawing;
using System.Text;

namespace SpireDocDemo
{
	internal class Program
	{
		static void Main(string[] args)
		{
			// Create a document object
			Document document = new Document();

			// Load the document
			document.LoadFromFile("Sample1.docx");

			// Iterate through all sections of the document
			for (int i = 0; i < document.Sections.Count; i++)
			{
				// Get the current section
				Section section = document.Sections[i];

				// Set whether to add a gutter at the top of the page to true
				section.PageSetup.IsTopGutter = true;

				// Set the width of the gutter to 100f
				section.PageSetup.Gutter = 100f;

				// Call a method to add text on the top gutter
				AddTopGutterText(section);
			}

			// Save the modified document to a file
			document.SaveToFile("Add Gutter Line at the Top of the Page.docx", FileFormat.Docx2016);

			// Release document resources
			document.Dispose();
		}
		// Method to add text on the top gutter 
		static void AddTopGutterText(Section section)
		{
			// Get the header of the section
			HeaderFooter header = section.HeadersFooters.Header;

			// Set the width of the text box to the page width
			float width = section.PageSetup.PageSize.Width;

			// Set the height of the text box to 40
			float height = 40;

			// Add a text box in the header
			TextBox textBox = header.AddParagraph().AppendTextBox(width, height);

			// Set the text box without border
			textBox.Format.NoLine = true;

			// Set the vertical starting position of the text box to the top margin area
			textBox.VerticalOrigin = VerticalOrigin.TopMarginArea;

			// Set the vertical position of the text box
			textBox.VerticalPosition = 140;

			// Set the horizontal alignment of the text box to left
			textBox.HorizontalAlignment = ShapeHorizontalAlignment.Left;

			// Set the horizontal starting position of the text box to the left margin area
			textBox.HorizontalOrigin = HorizontalOrigin.LeftMarginArea;

			// Set the text anchor to bottom
			textBox.Format.TextAnchor = ShapeVerticalAlignment.Bottom;

			// Set the text wrapping style to in front of text
			textBox.Format.TextWrappingStyle = TextWrappingStyle.InFrontOfText;

			// Set the text wrapping type to both sides
			textBox.Format.TextWrappingType = TextWrappingType.Both;

			// Create a paragraph object
			Paragraph paragraph = new Paragraph(section.Document);

			// Set the paragraph to be horizontally centered
			paragraph.Format.HorizontalAlignment = HorizontalAlignment.Center;

			// Create a font object
			Font font = new Font("Times New Roman", 8);

			// Create a graphics object for measuring text
			Graphics graphics = Graphics.FromImage(new Bitmap(1, 1));
			string text1 = " - ";

			// Measure the text in pixels and convert to points (assumes 96 DPI)
			SizeF size1 = graphics.MeasureString(text1, font);
			float textWidth1 = size1.Width / 96 * 72;

			// Repeat the text enough times to span the width of the text box
			int count = (int)(textBox.Width / textWidth1);
			StringBuilder stringBuilder = new StringBuilder();
			for (int i = 1; i < count; i++)
			{
				stringBuilder.Append(text1);
			}

			// Create a character format object
			CharacterFormat characterFormat = new CharacterFormat(section.Document);
			characterFormat.FontName = font.Name;
			characterFormat.FontSize = font.Size;
			TextRange textRange = paragraph.AppendText(stringBuilder.ToString());
			textRange.ApplyCharacterFormat(characterFormat);

			// Add the paragraph to the text box
			textBox.ChildObjects.Add(paragraph);
		}
	}
}
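
The dashed line in AddTopGutterText() is sized by measuring one " - " token in pixels, converting that width to points, and dividing the text box width by it. The sketch below isolates that arithmetic so it can be checked without a document. The class and method names are hypothetical helpers, not part of Spire.Doc, and the 96 DPI default mirrors the assumption made in the listing above.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of Spire.Doc) isolating the sizing arithmetic.
public static class GutterLineMath
{
    // Convert a pixel measurement to points (1 point = 1/72 inch).
    // The 96 DPI default is an assumption carried over from the listing above.
    public static float PixelsToPoints(float pixels, float dpi = 96f)
    {
        return pixels / dpi * 72f;
    }

    // Number of whole tokens that fit into the available length (both in points).
    public static int CountTokens(float availablePoints, float tokenWidthPoints)
    {
        if (tokenWidthPoints <= 0f)
            throw new ArgumentOutOfRangeException(nameof(tokenWidthPoints));
        return (int)(availablePoints / tokenWidthPoints);
    }

    // Build the repeated string. Like the loop in the listing, this starts at 1
    // and therefore appends count - 1 tokens.
    public static string BuildLine(string token, int count)
    {
        var builder = new StringBuilder();
        for (int i = 1; i < count; i++)
        {
            builder.Append(token);
        }
        return builder.ToString();
    }
}
```

For example, a token measured at 16 pixels is 12 points wide at 96 DPI, so a 612-point-wide text box has room for 51 tokens.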

Add a Gutter at the Left of a Word Document Page using C#

To set the left-side gutter on the page, ensure that you set the Section.PageSetup.IsTopGutter property to false. Here are the detailed steps:

Create a Document object.
Load a document using the Document.LoadFromFile() method.
Iterate through all sections of the document using a for loop over the Document.Sections collection.
Set Section.PageSetup.IsTopGutter to false to display the gutter on the left side of the page.
Use the Section.PageSetup.Gutter property to set the width of the gutter.
Call the custom AddLeftGutterText() method to add text to the gutter area.
Save the document using the Document.SaveToFile() method.

using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;
using Spire.Doc.Formatting;
using System.Drawing;
using System.Text;

namespace SpireDocDemo
{
	internal class Program
	{
		static void Main(string[] args)
		{
			// Create a document object
			Document document = new Document();

			// Load the document
			document.LoadFromFile("Sample1.docx");

			// Iterate through all sections of the document
			for (int i = 0; i < document.Sections.Count; i++)
			{
				// Get the current section
				Section section = document.Sections[i];

				// Set IsTopGutter to false so the gutter is placed on the left side of the page
				section.PageSetup.IsTopGutter = false;

				// Set the width of the gutter to 100f
				section.PageSetup.Gutter = 100f;

				// Call a method to add text on the left gutter 
				AddLeftGutterText(section);
			}

			// Save the modified document to a file
			document.SaveToFile("Add Gutter Line on the Left Side of the Page.docx", FileFormat.Docx2016);

			// Release document resources
			document.Dispose();
		}
		// Method to add text on the left gutter 
		static void AddLeftGutterText(Section section)
		{
			// Get the header of the section
			HeaderFooter header = section.HeadersFooters.Header;

			// Set the width of the text box to 40
			float width = 40;

			// Get the page height
			float height = section.PageSetup.PageSize.Height;

			// Add a text box in the header
			TextBox textBox = header.AddParagraph().AppendTextBox(width, height);

			// Set the text box without border
			textBox.Format.NoLine = true;

			// Set the text direction in the text box from right to left
			textBox.Format.LayoutFlowAlt = TextDirection.RightToLeft;

			// Set the horizontal starting position of the text box
			textBox.HorizontalOrigin = HorizontalOrigin.LeftMarginArea;

			// Set the horizontal position of the text box
			textBox.HorizontalPosition = 140;

			// Set the vertical alignment of the text box to top
			textBox.VerticalAlignment = ShapeVerticalAlignment.Top;

			// Set the vertical starting position of the text box to the top margin area
			textBox.VerticalOrigin = VerticalOrigin.TopMarginArea;

			// Set the text anchor to top
			textBox.Format.TextAnchor = ShapeVerticalAlignment.Top;

			// Set the text wrapping style to in front of text
			textBox.Format.TextWrappingStyle = TextWrappingStyle.InFrontOfText;

			// Set the text wrapping type to both sides
			textBox.Format.TextWrappingType = TextWrappingType.Both;

			// Create a paragraph object
			Paragraph paragraph = new Paragraph(section.Document);

			// Set the paragraph to be horizontally centered
			paragraph.Format.HorizontalAlignment = HorizontalAlignment.Center;

			// Create a font object
			Font font = new Font("Times New Roman", 8);

			// Create a graphics object for measuring text
			Graphics graphics = Graphics.FromImage(new Bitmap(1, 1));
			string text1 = " - ";

			// Measure the text in pixels and convert to points (assumes 96 DPI)
			SizeF size1 = graphics.MeasureString(text1, font);
			float textWidth1 = size1.Width / 96 * 72;

			// Repeat the text enough times to span the height of the text box
			int count = (int)(textBox.Height / textWidth1);
			StringBuilder stringBuilder = new StringBuilder();
			for (int i = 1; i < count; i++)
			{
				stringBuilder.Append(text1);
			}

			// Create a character format object
			CharacterFormat characterFormat = new CharacterFormat(section.Document);
			characterFormat.FontName = font.Name;
			characterFormat.FontSize = font.Size;
			TextRange textRange = paragraph.AppendText(stringBuilder.ToString());
			textRange.ApplyCharacterFormat(characterFormat);

			// Add the paragraph to the text box
			textBox.ChildObjects.Add(paragraph);
		}
	}
}

Apply for a Temporary License

To remove the evaluation message from the generated documents, or to lift the function limitations, request a 30-day trial license.

Published in Page Setup

C#: Add or Remove Editable Area in a Word Document

2024-03-14 01:16:32 Written by Koohji

Adding an editable area to a Word document lets you mark specific sections that others may edit while the rest of the document stays protected from accidental modification. This is particularly useful for collaborative documents, document reviews, and comments. Removing editable areas returns the document to a read-only state when those sections no longer need editing, which helps preserve the integrity of the content. This article explains how to use Spire.Doc for .NET to add or remove an editable area in a Word document in a C# project.

Add Editable Area in a Word Document in C#
Remove Editable Area in a Word Document in C#

Install Spire.Doc for .NET

Package Manager

PM> Install-Package Spire.Doc

Add Editable Area in a Word Document in C#

To add an editable area, insert PermissionStart and PermissionEnd objects into the document and then apply read-only protection, so that the content between the two markers can be edited while the rest remains read-only. Here are the detailed steps:

Create a Document object.
Load a Word document using the Document.LoadFromFile() method.
Access a section of the document through the Document.Sections[index] property.
Create a PermissionStart object using PermissionStart permissionStart = new PermissionStart(document, id) to mark the beginning of the editable area.
Create a PermissionEnd object using PermissionEnd permissionEnd = new PermissionEnd(document, id) to mark the end of the editable area.
Access a paragraph using the Section.Paragraphs[index] property.
Insert the permission start object at the beginning of the paragraph using the Paragraph.ChildObjects.Insert(0, permissionStart) method.
Add the permission end object at the end of the paragraph using the Paragraph.ChildObjects.Add(permissionEnd) method.
Set the document to read-only protection mode and restrict editing permissions using the Document.Protect(ProtectionType.AllowOnlyReading, password) method.
Save the resulting document using the Document.SaveToFile() method.

using Spire.Doc;
using Spire.Doc.Documents;

namespace SpireDocDemo
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create a new document object
            Document document = new Document();

            // Load the document from the specified path
            document.LoadFromFile("Sample1.docx");

            // Get the first section of the document
            Section section = document.Sections[0];

            // Create a permission start object
            PermissionStart permissionStart = new PermissionStart(document, "restricted1");

            // Create a permission end object
            PermissionEnd permissionEnd = new PermissionEnd(document, "restricted1");

            // Get the second paragraph in the section
            Paragraph paragraph = section.Paragraphs[1];

            // Insert the permission start object at the beginning of the paragraph
            paragraph.ChildObjects.Insert(0, permissionStart);

            // Add the permission end object at the end of the paragraph
            paragraph.ChildObjects.Add(permissionEnd);

            // Set the document to be read-only protected
            document.Protect(ProtectionType.AllowOnlyReading, "123456");

            // Save the modified document to the specified path
            document.SaveToFile("AddedEditingPermissionsArea.docx", FileFormat.Docx);

            // Close the document and release the resources occupied by the document object
            document.Close();
            document.Dispose();
        }
    }
}

Remove Editable Area in a Word Document in C#

The key steps to remove editable area in a Word document involve iterating through each paragraph of the document and removing the PermissionStart and PermissionEnd objects. Here are the detailed steps:

Create a Document object.
Load a Word document using the Document.LoadFromFile() method.
Iterate through each paragraph in each section of the document, check for the presence of PermissionStart or PermissionEnd objects, and remove them.
Save the resulting document using the Document.SaveToFile() method.

using Spire.Doc;
using Spire.Doc.Documents;

namespace SpireDocDemo
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create a new document object
            Document document = new Document();

            // Load the document from the specified path
            document.LoadFromFile("Sample2.docx");

            // Iterate through the sections of the document
            for (int a = 0; a < document.Sections.Count; a++)
            {
                // Get the body of the current section
                Body body = document.Sections[a].Body;

                // Iterate through the child objects of the body
                for (int i = 0; i < body.ChildObjects.Count; i++)
                {
                    // Check if the child object is a paragraph
                    if (body.ChildObjects[i] is Paragraph)
                    {
                        // Get the current paragraph
                        Paragraph paragraph = (Paragraph)body.ChildObjects[i];

                        // Iterate backwards from the last child object of the paragraph
                        for (int j = paragraph.ChildObjects.Count - 1; j >= 0; j--)
                        {
                            // Get the current child object
                            DocumentObject documentObject = paragraph.ChildObjects[j];

                            // Remove the current child object if it is a permission start or end object
                            if (documentObject.DocumentObjectType == DocumentObjectType.PermissionStart ||
                                documentObject.DocumentObjectType == DocumentObjectType.PermissionEnd)
                            {
                                paragraph.ChildObjects.RemoveAt(j);
                            }
                        }
                    }
                }
            }

            // Save the modified document to the specified path
            document.SaveToFile("RemovedEditingPermissionsArea.docx", FileFormat.Docx);

            // Close the document and release the resources occupied by the document object
            document.Close();
            document.Dispose();
        }
    }
}
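
The inner loop above walks the paragraph's child objects from last to first. That direction matters: RemoveAt shifts every later item down by one, so a forward loop would skip the item that slides into the slot just vacated. The sketch below shows the same pattern on an ordinary list. The class and method names are hypothetical and it does not use Spire.Doc; it only illustrates the iteration order.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of the backward-removal pattern used above.
public static class BackwardRemoval
{
    // Remove every item matching the predicate, scanning from the end so that
    // RemoveAt never shifts an element that has not been visited yet.
    // Returns the number of items removed.
    public static int RemoveMatching<T>(IList<T> items, Func<T, bool> shouldRemove)
    {
        int removed = 0;
        for (int j = items.Count - 1; j >= 0; j--)
        {
            if (shouldRemove(items[j]))
            {
                items.RemoveAt(j);
                removed++;
            }
        }
        return removed;
    }
}
```

Two adjacent matching items, such as a PermissionEnd directly followed by another marker, are both removed because neither removal changes the index of an unvisited item.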

Published in Security

Read Word Document in C# .NET: Extract Text, Tables, Images

2024-02-29 01:18:43 Written by hayes Liu

Word documents (.doc and .docx) are widely used in business, education, and professional workflows for reports, contracts, manuals, and other essential content. As a C# developer, you may find the need to programmatically read these files to extract information, analyze content, and integrate document data into applications.

This guide walks through reading Word documents in C# and covers the following scenarios:

Extracting text, paragraphs, and formatting details
Retrieving images and structured table data
Accessing comments and document metadata
Reading headers and footers for comprehensive document analysis

By the end of this guide, you will have a solid understanding of how to efficiently parse Word documents in C#, allowing your applications to access and utilize document content with accuracy and ease.

Set Up Your Development Environment for Reading Word Documents in C#
Load Word Document (.doc/.docx) in C#
Read and Extract Content from Word Document in C#
Advanced Tips and Best Practices for Reading Word Documents in C#
Conclusion
FAQs

Set Up Your Development Environment for Reading Word Documents in C#

Before you can read Word documents in C#, it’s crucial to ensure that your development environment is properly set up. This section outlines the necessary prerequisites and step-by-step installation instructions to get you ready for seamless Word document handling.

Prerequisites

Development Environment: Ensure you have Visual Studio or another compatible C# IDE installed.
.NET Requirement: Ensure you have .NET Framework or .NET Core installed.
Library Requirement: Spire.Doc for .NET, a versatile library that allows developers to:
- Create Word documents from scratch
- Edit and format existing Word documents
- Read and extract text, tables, images, and other content programmatically
- Convert Word documents to PDF, HTML, and other formats
- Work independently without requiring Microsoft Word installation

Install Spire.Doc

To incorporate Spire.Doc into your C# project, follow these steps to install it via NuGet:

Open your project in Visual Studio.
Right-click on your project in the Solution Explorer and select Manage NuGet Packages.
In the Browse tab, search for "Spire.Doc" and click Install.

Alternatively, you can use the Package Manager Console with the following command:

PM> Install-Package Spire.Doc

This installation adds the necessary references, enabling you to programmatically work with Word documents.

Load Word Document (.doc/.docx) in C#

To begin, you need to load a Word document into your project. The following example demonstrates how to load a .docx or .doc file in C#:

using Spire.Doc;
using Spire.Doc.Documents;
using System;

namespace LoadWordExample
{
    class Program
    {
        static void Main(string[] args)
        {
            // Specify the path of the Word document
            string filePath = @"C:\Documents\Sample.docx";

            // Create a Document object
            using (Document document = new Document())
            {
                // Load the Word .docx or .doc document
                document.LoadFromFile(filePath);
            }
        }
    }
}

This code loads a Word file from the specified path into a Document object, which is the entry point for accessing all document elements.

Read and Extract Content from Word Document in C#

After loading the Word document into a Document object, you can access its contents programmatically. This section covers various methods for extracting different types of content effectively.

Extract Text

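Extracting text is often the first step in reading Word documents. You can retrieve all text content using the built-in GetText() method. The snippets in this section assume a loaded document object and using directives for System.IO and System.Text: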

using (StreamWriter writer = new StreamWriter("ExtractedText.txt", false, Encoding.UTF8))
{
    // Get all text from the document
    string allText = document.GetText();
    
    // Write the entire text to a file
    writer.Write(allText);
}

This method extracts all text, disregarding formatting and non-text elements like images.

Read Paragraphs and Formatting Information

When working with Word documents, it is often useful not only to access the text content of paragraphs but also to understand how each paragraph is formatted. This includes details such as alignment and spacing after the paragraph, which can affect layout and readability.

The following example demonstrates how to iterate through all paragraphs in a Word document and retrieve their text content and paragraph-level formatting in C#:

using (StreamWriter writer = new StreamWriter("Paragraphs.txt", false, Encoding.UTF8))
{
    // Loop through all sections
    foreach (Section section in document.Sections)
    {
        // Loop through all paragraphs in the section
        foreach (Paragraph paragraph in section.Paragraphs)
        {
            // Get paragraph alignment
            HorizontalAlignment alignment = paragraph.Format.HorizontalAlignment;

            // Get spacing after paragraph
            float afterSpacing = paragraph.Format.AfterSpacing;

            // Write paragraph formatting and text to the file
            writer.WriteLine($"[Alignment: {alignment}, AfterSpacing: {afterSpacing}]");
            writer.WriteLine(paragraph.Text);
            writer.WriteLine(); // Add empty line between paragraphs
        }
    }
}

This approach allows you to extract both the text and key paragraph formatting attributes, which can be useful for tasks such as document analysis, conditional processing, or preserving layout when exporting content.

Extract Images

Images embedded within Word documents play a vital role in conveying information. To extract these images, you will examine each paragraph's content, identify images (typically represented as DocPicture objects), and save them for further use:

// Create the folder if it does not exist
string imageFolder = "ExtractedImages";
if (!Directory.Exists(imageFolder))
    Directory.CreateDirectory(imageFolder);

int imageIndex = 1;

// Loop through sections and paragraphs to find images
foreach (Section section in document.Sections)
{
    foreach (Paragraph paragraph in section.Paragraphs)
    {
        foreach (DocumentObject obj in paragraph.ChildObjects)
        {
            if (obj is DocPicture picture)
            {
                // Save each image as a separate PNG file
                string fileName = Path.Combine(imageFolder, $"Image_{imageIndex}.png");
                picture.Image.Save(fileName, System.Drawing.Imaging.ImageFormat.Png);
                imageIndex++;
            }
        }
    }
}

This code saves the images found in the body paragraphs of the document as separate PNG files. To save in another format such as JPEG or BMP, pass a different ImageFormat value and change the file extension accordingly.

Extract Table Data

Tables are commonly used to organize structured data, such as financial reports or survey results. To access this data, iterate through the tables in each section and retrieve the content of individual cells:

// Create a folder to store tables
string tableDir = "Tables";
if (!Directory.Exists(tableDir))
    Directory.CreateDirectory(tableDir);

// Loop through each section
for (int sectionIndex = 0; sectionIndex < document.Sections.Count; sectionIndex++)
{
    Section section = document.Sections[sectionIndex];
    TableCollection tables = section.Tables;

    // Loop through all tables in the section
    for (int tableIndex = 0; tableIndex < tables.Count; tableIndex++)
    {
        ITable table = tables[tableIndex];
        string fileName = Path.Combine(tableDir, $"Section{sectionIndex + 1}_Table{tableIndex + 1}.txt");

        using (StreamWriter writer = new StreamWriter(fileName, false, Encoding.UTF8))
        {
            // Loop through each row
            for (int rowIndex = 0; rowIndex < table.Rows.Count; rowIndex++)
            {
                TableRow row = table.Rows[rowIndex];

                // Loop through each cell
                for (int cellIndex = 0; cellIndex < row.Cells.Count; cellIndex++)
                {
                    TableCell cell = row.Cells[cellIndex];

                    // Loop through each paragraph in the cell
                    for (int paraIndex = 0; paraIndex < cell.Paragraphs.Count; paraIndex++)
                    {
                        writer.Write(cell.Paragraphs[paraIndex].Text.Trim() + " ");
                    }

                    // Add tab between cells
                    if (cellIndex < row.Cells.Count - 1) writer.Write("\t");
                }

                // Add newline after each row
                writer.WriteLine();
            }
        }
    }
}

This method allows efficient extraction of structured data, making it ideal for generating reports or integrating content into databases.
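
The files written above hold one table row per line with cells separated by tabs, which makes them easy to read back for a report or a database import. A minimal sketch of that read-back step follows, assuming no cell text contains a tab or a line break. TableTextParser is a hypothetical helper, not part of Spire.Doc.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper that reads back the tab-separated text produced above.
public static class TableTextParser
{
    // Split the content into rows and cells. Each cell is trimmed because the
    // extraction code appends a space after every paragraph in a cell.
    public static List<string[]> ParseRows(string content)
    {
        var rows = new List<string[]>();
        using (var reader = new StringReader(content))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Skip blank lines
                if (line.Length == 0)
                    continue;

                string[] cells = line.Split('\t');
                for (int i = 0; i < cells.Length; i++)
                {
                    cells[i] = cells[i].Trim();
                }
                rows.Add(cells);
            }
        }
        return rows;
    }
}
```

A typical call would be TableTextParser.ParseRows(File.ReadAllText(fileName)), where fileName is one of the files written by the extraction loop.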

Read Comments

Comments are valuable for collaboration and feedback within documents. Extracting them is crucial for auditing and understanding the document's revision history.

The Document object provides a Comments collection, which allows you to access all comments in a Word document. Each comment contains one or more paragraphs, and you can extract their text for further processing or save them into a file.

using (StreamWriter writer = new StreamWriter("Comments.txt", false, Encoding.UTF8))
{
    // Loop through all comments in the document
    foreach (Comment comment in document.Comments)
    {
        // Loop through each paragraph in the comment
        foreach (Paragraph p in comment.Body.Paragraphs)
        {
            writer.WriteLine(p.Text);
        }
        // Add empty line to separate different comments
        writer.WriteLine();
    }
}

This code retrieves the content of all comments and outputs it into a single text file.

Retrieve Document Metadata

Word documents contain metadata such as the title, author, and subject. These metadata items are stored as document properties, which can be accessed through the BuiltinDocumentProperties property of the Document object:

using (StreamWriter writer = new StreamWriter("Metadata.txt", false, Encoding.UTF8))
{
    // Write built-in document properties to file
    writer.WriteLine("Title: " + document.BuiltinDocumentProperties.Title);
    writer.WriteLine("Author: " + document.BuiltinDocumentProperties.Author);
    writer.WriteLine("Subject: " + document.BuiltinDocumentProperties.Subject);
}

Read Headers and Footers

Headers and footers frequently contain essential content like page numbers and titles. To programmatically access this information, iterate through each section's header and footer paragraphs and retrieve the text of each paragraph:

using (StreamWriter writer = new StreamWriter("HeadersFooters.txt", false, Encoding.UTF8))
{
    // Loop through all sections
    foreach (Section section in document.Sections)
    {
        // Write header paragraphs
        foreach (Paragraph headerParagraph in section.HeadersFooters.Header.Paragraphs)
        {
            writer.WriteLine("Header: " + headerParagraph.Text);
        }

        // Write footer paragraphs
        foreach (Paragraph footerParagraph in section.HeadersFooters.Footer.Paragraphs)
        {
            writer.WriteLine("Footer: " + footerParagraph.Text);
        }
    }
}

This captures the header and footer text of each section during document processing.

Advanced Tips and Best Practices for Reading Word Documents in C#

The following tips can help improve efficiency, reliability, and code maintainability when reading Word documents programmatically:

Use using Statements: Always wrap Document objects in using to ensure proper memory management.
Check for Null or Empty Sections: Prevent errors by verifying sections, paragraphs, tables, or images exist before accessing them.
Batch Reading Multiple Documents: Loop through a folder of Word files and apply the same extraction logic to each file. This helps automate workflows and consolidate extracted content efficiently.
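
The batch-reading tip can be sketched as a small folder loop. To keep the sketch independent of any particular extraction logic, the per-file work is passed in as a delegate. WordBatchReader and its members are hypothetical names introduced for illustration, and the choice to log a failed file and continue is an assumption about the desired error handling.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for applying one extraction routine to a folder of Word files.
public static class WordBatchReader
{
    // True for .doc and .docx files, ignoring case.
    public static bool IsWordFile(string path)
    {
        string extension = Path.GetExtension(path);
        return extension.Equals(".doc", StringComparison.OrdinalIgnoreCase)
            || extension.Equals(".docx", StringComparison.OrdinalIgnoreCase);
    }

    // Run extractText on every Word file in the folder and collect the results
    // keyed by file path. A file that fails is reported and skipped.
    public static Dictionary<string, string> ExtractAll(string folder, Func<string, string> extractText)
    {
        var results = new Dictionary<string, string>();
        foreach (string file in Directory.EnumerateFiles(folder))
        {
            if (!IsWordFile(file))
                continue;

            try
            {
                results[file] = extractText(file);
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine($"Skipped {file}: {ex.Message}");
            }
        }
        return results;
    }
}
```

The delegate would typically create a Document inside a using statement, call LoadFromFile(path), and return GetText(), which also covers the first tip about disposing each document.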

Conclusion

Efficiently reading Word documents programmatically in C# involves handling various content types. With the techniques outlined in this guide, developers can:

Load Word documents (.doc and .docx) with ease.
Extract text, paragraphs, and formatting details for thorough analysis.
Retrieve images, structured table data, and comments.
Access headers, footers, and document metadata for complete insights.

FAQs

Q1: Can I read Word documents without installing Microsoft Word?

A1: Yes, libraries like Spire.Doc enable you to read and process Word files without requiring Microsoft Word installation.

Q2: Does this support both .doc and .docx formats?

A2: Absolutely, all methods discussed in this guide work seamlessly with both legacy (.doc) and modern (.docx) Word files.

Q3: Can I extract only specific sections of a document?

A3: Yes, by iterating through sections and paragraphs, you can selectively filter and extract the desired content.
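
One way to do this filtering is to collect the paragraph texts and keep only those between two headings. The sketch below works on plain strings so it can be tried without a document. ParagraphRange is a hypothetical helper, and matching headings by exact trimmed text is an assumption that may need adjusting for real documents.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper that selects the paragraphs between two headings.
public static class ParagraphRange
{
    // Return the paragraphs that follow the one equal to startHeading, up to
    // (but not including) the one equal to endHeading. If endHeading never
    // appears, everything after startHeading is returned.
    public static List<string> Between(IEnumerable<string> paragraphs, string startHeading, string endHeading)
    {
        var result = new List<string>();
        bool inside = false;
        foreach (string text in paragraphs)
        {
            string trimmed = text.Trim();
            if (!inside)
            {
                // Keep scanning until the start heading is found
                if (trimmed == startHeading)
                    inside = true;
                continue;
            }
            if (trimmed == endHeading)
                break;
            result.Add(text);
        }
        return result;
    }
}
```

The input sequence could be built by looping over document.Sections and section.Paragraphs and adding each paragraph.Text to a list, as in the paragraph-reading example earlier in this guide.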

Published in Document Operation

C#: Add, Insert, or Delete Pages in Word Documents

2024-02-27 07:24:02 Written by Koohji

Adding, inserting, and deleting pages in a Word document is crucial for managing and presenting content. By adding or inserting a new page in Word, you can expand the document to accommodate more content, making it more structured and readable. Deleting pages can help streamline the document by removing unnecessary information or erroneous content. This article will explain how to use Spire.Doc for .NET to add, insert, or delete a page in a Word document within a C# project.

Add a Page in a Word Document using C#
Insert a Page in a Word Document using C#
Delete a Page from a Word Document using C#

Install Spire.Doc for .NET

Package Manager

PM> Install-Package Spire.Doc

Add a Page in a Word Document using C#

The steps to add a new page at the end of a Word document involve first obtaining the last section, then inserting a page break at the end of the last paragraph of that section to ensure that subsequently added content appears on a new page. Here are the detailed steps:

Create a Document object.
Load a Word document using the Document.LoadFromFile() method.
Get the body of the last section of the document using Document.LastSection.Body.
Add a page break by calling Paragraph.AppendBreak(BreakType.PageBreak) method.
Create a new ParagraphStyle object.
Add the new paragraph style to the document's style collection using Document.Styles.Add() method.
Create a new Paragraph object and set the text content.
Apply the previously created paragraph style to the new paragraph using Paragraph.ApplyStyle(ParagraphStyle.Name) method.
Add the new paragraph to the document using Body.ChildObjects.Add(Paragraph) method.
Save the resulting document using the Document.SaveToFile() method.

using Spire.Doc;
using Spire.Doc.Documents;

// Create a new document object
Document document = new Document();

// Load a document
document.LoadFromFile("Sample.docx");

// Get the body of the last section of the document
Body body = document.LastSection.Body;

// Insert a page break after the last paragraph in the body
body.LastParagraph.AppendBreak(BreakType.PageBreak);

// Create a new paragraph style
ParagraphStyle paragraphStyle = new ParagraphStyle(document);
paragraphStyle.Name = "CustomParagraphStyle1";
paragraphStyle.ParagraphFormat.LineSpacing = 12;
paragraphStyle.ParagraphFormat.AfterSpacing = 8;
paragraphStyle.CharacterFormat.FontName = "Microsoft YaHei";
paragraphStyle.CharacterFormat.FontSize = 12;

// Add the paragraph style to the document's style collection
document.Styles.Add(paragraphStyle);

// Create a new paragraph and set the text content
Paragraph paragraph = new Paragraph(document);
paragraph.AppendText("Thank you for using our Spire.Doc for .NET product. The trial version will add a red watermark to the generated document and only supports converting the first 10 pages to other formats. Upon purchasing and applying a license, these watermarks will be removed, and the functionality restrictions will be lifted.");

// Apply the paragraph style
paragraph.ApplyStyle(paragraphStyle.Name);

// Add the paragraph to the body's content collection
body.ChildObjects.Add(paragraph);

// Create another new paragraph and set the text content
paragraph = new Paragraph(document);
paragraph.AppendText("To experience our product more fully, we provide a one-month temporary license free of charge to each of our customers. Please send an email to sales@e-iceblue.com, and we will send the license to you within one working day.");

// Apply the paragraph style
paragraph.ApplyStyle(paragraphStyle.Name);

// Add the paragraph to the body's content collection
body.ChildObjects.Add(paragraph);

// Save the document to the specified path
document.SaveToFile("Add a Page.docx", FileFormat.Docx);

// Close the document
document.Close();

// Release the resources of the document object
document.Dispose();

Insert a Page in a Word Document using C#

Before inserting a new page, it is necessary to determine the ending position index of the specified page content within the section. Subsequently, add the content of the new page to the document one by one after this position. Finally, to separate the content from the following pages, adding a page break is essential. The detailed steps are as follows:

Create a Document object.
Load a Word document using the Document.LoadFromFile() method.
Create a FixedLayoutDocument object.
Obtain the FixedLayoutPage object of a page in the document.
Determine the index position of the last paragraph on the page within the section.
Create a new ParagraphStyle object.
Add the new paragraph style to the document's style collection using Document.Styles.Add() method.
Create a new Paragraph object and set the text content.
Apply the previously created paragraph style to the new paragraph using the Paragraph.ApplyStyle(ParagraphStyle.Name) method.
Insert the new paragraph at the specified position using the Body.ChildObjects.Insert(index, Paragraph) method.
Create another new paragraph object, set its text content, add a page break by calling the Paragraph.AppendBreak(BreakType.PageBreak) method, apply the previously created paragraph style, and then insert this paragraph into the document.
Save the resulting document using the Document.SaveToFile() method.

using Spire.Doc;
using Spire.Doc.Pages;
using Spire.Doc.Documents;

namespace SpireDocDemo
{
    internal class Program
    {
        static void Main(string[] args)
        {
           // Create a new document object
            Document document = new Document();

            // Load the sample document from a file
            document.LoadFromFile("Sample.docx");

            // Create a fixed layout document object
            FixedLayoutDocument layoutDoc = new FixedLayoutDocument(document);

            // Get the first page
            FixedLayoutPage page = layoutDoc.Pages[0];

            // Get the body of the section that contains the page
            Body body = page.Section.Body;

            // Get the last paragraph of the current page
            Paragraph paragraphEnd = page.Columns[0].Lines[page.Columns[0].Lines.Count - 1].Paragraph;

            // Initialize the end index
            int endIndex = 0;
            if (paragraphEnd != null)
            {
                // Get the index of the last paragraph
                endIndex = body.ChildObjects.IndexOf(paragraphEnd);
            }

            // Create a new paragraph style
            ParagraphStyle paragraphStyle = new ParagraphStyle(document);
            paragraphStyle.Name = "CustomParagraphStyle1";
            paragraphStyle.ParagraphFormat.LineSpacing = 12;
            paragraphStyle.ParagraphFormat.AfterSpacing = 8;
            paragraphStyle.CharacterFormat.FontName = "Microsoft YaHei";
            paragraphStyle.CharacterFormat.FontSize = 12;

            // Add the paragraph style to the document's style collection
            document.Styles.Add(paragraphStyle);

            // Create a new paragraph and set the text content
            Paragraph paragraph = new Paragraph(document);
            paragraph.AppendText("Thank you for using our Spire.Doc for .NET product. The trial version will add a red watermark to the generated document and only supports converting the first 10 pages to other formats. Upon purchasing and applying a license, these watermarks will be removed, and the functionality restrictions will be lifted.");

            // Apply the paragraph style
            paragraph.ApplyStyle(paragraphStyle.Name);

            // Insert the paragraph at the specified position
            body.ChildObjects.Insert(endIndex + 1, paragraph);

            // Create another new paragraph
            paragraph = new Paragraph(document);
            paragraph.AppendText("To experience our product more fully, we provide a one-month temporary license free of charge to each of our customers. Please send an email to sales@e-iceblue.com, and we will send the license to you within one working day.");

            // Apply the paragraph style
            paragraph.ApplyStyle(paragraphStyle.Name);

            // Add a page break
            paragraph.AppendBreak(BreakType.PageBreak);

            // Insert the paragraph at the specified position
            body.ChildObjects.Insert(endIndex + 2, paragraph);

            // Save the document to the specified path
            document.SaveToFile("Insert a Page.docx", Spire.Doc.FileFormat.Docx);

            // Close and release the original document
            document.Close();
            document.Dispose();
        }
    }
}
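
The index arithmetic in the example above (inserting at endIndex + 1 and then endIndex + 2) can be illustrated without Spire.Doc. The following is a minimal sketch that uses a plain List<string> as a stand-in for Body.ChildObjects; the class name PageInsertDemo and the helper InsertAfter are hypothetical and are not part of the library.

```csharp
using System;
using System.Collections.Generic;

public static class PageInsertDemo
{
    // Inserts newItems directly after the element at endIndex, preserving their order.
    // The k-th new item goes to position endIndex + 1 + k, which matches the
    // endIndex + 1 and endIndex + 2 positions used for two new paragraphs.
    public static void InsertAfter(List<string> items, int endIndex, IList<string> newItems)
    {
        for (int k = 0; k < newItems.Count; k++)
        {
            items.Insert(endIndex + 1 + k, newItems[k]);
        }
    }

    public static void Main()
    {
        // Stand-in for the section body: two paragraphs on page 1, one on page 2
        var body = new List<string> { "page1-a", "page1-b", "page2-a" };

        // The last paragraph of page 1 sits at index 1
        InsertAfter(body, 1, new[] { "new-1", "new-2" });

        Console.WriteLine(string.Join(", ", body));
        // prints "page1-a, page1-b, new-1, new-2, page2-a"
    }
}
```

Each inserted item shifts the following items one position to the right, which is why the second paragraph must use the next higher index to stay after the first.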

Delete a Page from a Word Document using C#

To delete the content of a page, first determine the index positions of the starting and ending elements of that page in the document. Then, you can utilize a loop to systematically remove these elements one by one. The detailed steps are as follows:

Create a Document object.
Load a Word document using the Document.LoadFromFile() method.
Create a FixedLayoutDocument object.
Obtain the FixedLayoutPage object of the page to delete (the second page in this example).
Use the FixedLayoutPage.Section property to get the section where the page is located.
Determine the index position of the first paragraph on the page within the section.
Determine the index position of the last paragraph on the page within the section.
Use a for loop to remove the content of the page one by one.
Save the resulting document using the Document.SaveToFile() method.

using Spire.Doc;
using Spire.Doc.Pages;
using Spire.Doc.Documents;

namespace SpireDocDemo
{
    internal class Program
    {
        static void Main(string[] args)
        {
           // Create a new document object
            Document document = new Document();

            // Load the sample document from a file
            document.LoadFromFile("Sample.docx");

            // Create a fixed layout document object
            FixedLayoutDocument layoutDoc = new FixedLayoutDocument(document);

            // Get the second page
            FixedLayoutPage page = layoutDoc.Pages[1];

            // Get the section of the page
            Section section = page.Section;

            // Get the first paragraph on the page
            Paragraph paragraphStart = page.Columns[0].Lines[0].Paragraph;
            int startIndex = 0;
            if (paragraphStart != null)
            {
                // Get the index of the starting paragraph
                startIndex = section.Body.ChildObjects.IndexOf(paragraphStart);
            }

            // Get the last paragraph on the page
            Paragraph paragraphEnd = page.Columns[0].Lines[page.Columns[0].Lines.Count - 1].Paragraph;

            int endIndex = 0;
            if (paragraphEnd != null)
            {
                // Get the index of the ending paragraph
                endIndex = section.Body.ChildObjects.IndexOf(paragraphEnd);
            }

            // Delete all content within the specified range
            for (int i = 0; i <= (endIndex - startIndex); i++)
            {
                section.Body.ChildObjects.RemoveAt(startIndex);
            }

            // Save the document to the specified path
            document.SaveToFile("Delete a Page.docx", Spire.Doc.FileFormat.Docx);

            // Close and release the original document
            document.Close();
            document.Dispose();
        }
    }
}
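
The removal loop above always calls RemoveAt(startIndex) instead of RemoveAt(startIndex + i). The following minimal sketch shows why that works, using a plain List<string> as a stand-in for Body.ChildObjects; the class name RangeRemovalDemo and the helper RemoveRange are hypothetical and are not part of Spire.Doc.

```csharp
using System;
using System.Collections.Generic;

public static class RangeRemovalDemo
{
    // Removes the items at positions startIndex..endIndex (inclusive).
    // After each RemoveAt, the remaining items shift left by one, so the
    // next item to delete is always found at startIndex.
    public static void RemoveRange(List<string> items, int startIndex, int endIndex)
    {
        for (int i = 0; i <= endIndex - startIndex; i++)
        {
            items.RemoveAt(startIndex);
        }
    }

    public static void Main()
    {
        // Stand-in for the section body; p1..p3 play the role of the page to delete
        var body = new List<string> { "p0", "p1", "p2", "p3", "p4" };

        RemoveRange(body, 1, 3);

        Console.WriteLine(string.Join(", ", body)); // prints "p0, p4"
    }
}
```

Removing at startIndex + i instead would skip every other element, because the collection shrinks while the index grows.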

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Published in Document Operation

Tagged under

doc net Operation

C#: Add or Remove Captions in Word Documents

2024-01-03 01:14:57 Written by Koohji

Captions are important elements in a Word document that enhance readability and organizational structure. They provide explanations and supplementary information for images, tables, and other content, improving the clarity and comprehensibility of the document. Captions are also used to emphasize key points and essential information, facilitating referencing and indexing of specific content. By using captions effectively, readers can better understand and interpret data and images within the document while quickly locating the desired information. This article will demonstrate how to use Spire.Doc for .NET to add or remove captions in a Word document within a C# project.

Add Image Captions to a Word document in C#
Add Table Captions to a Word document in C#
Remove Captions from a Word document in C#

Install Spire.Doc for .NET

Package Manager

PM> Install-Package Spire.Doc

Add Image Captions to a Word document in C#

To add a caption to an image in a Word document, create a paragraph, add the image to it, and call the DocPicture.AddCaption(string name, CaptionNumberingFormat format, CaptionPosition captionPosition) method, which generates a caption with the specified name, numbering format, and position. The following are the detailed steps:

Create an object of the Document class.
Use the Document.AddSection() method to add a section.
Add a paragraph using Section.AddParagraph() method.
Use the Paragraph.AppendPicture(Image image) method to add a DocPicture image object to the paragraph.
Use the DocPicture.AddCaption(string name, CaptionNumberingFormat format, CaptionPosition captionPosition) method to add a caption with numbering format as CaptionNumberingFormat.Number.
Set the Document.IsUpdateFields property to true to update all fields.
Use the Document.SaveToFile() method to save the resulting document.

using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;
using System.Drawing;
namespace AddPictureCaption
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create a Word document object
            Document document = new Document();

            // Add a section
            Section section = document.AddSection();

            // Add a new paragraph and insert an image
            Paragraph pictureParagraphCaption = section.AddParagraph();
            pictureParagraphCaption.Format.AfterSpacing = 10;
            DocPicture pic1 = pictureParagraphCaption.AppendPicture(Image.FromFile("Data\\1.png"));
            pic1.Height = 100;
            pic1.Width = 100;

            // Add a caption to the image
            CaptionNumberingFormat format = CaptionNumberingFormat.Number;
            pic1.AddCaption("Image", format, CaptionPosition.BelowItem);

            // Add another paragraph and insert another image
            pictureParagraphCaption = section.AddParagraph();
            DocPicture pic2 = pictureParagraphCaption.AppendPicture(Image.FromFile("Data\\2.png"));
            pic2.Height = 100;
            pic2.Width = 100;

            // Add a caption to the second image
            pic2.AddCaption("Image", format, CaptionPosition.BelowItem);

            // Update all fields in the document
            document.IsUpdateFields = true;

            // Save to a docx document
            string result = "AddImageCaption.docx";
            document.SaveToFile(result, Spire.Doc.FileFormat.Docx2016);

            // Close and dispose of the document object to release resources
            document.Close();
            document.Dispose();
        }
    }
}

Add Table Captions to a Word document in C#

To add a caption to a table in a Word document, create the table and call the Table.AddCaption(string name, CaptionNumberingFormat format, CaptionPosition captionPosition) method to generate a numbered caption. The steps involved are as follows:

Create an object of the Document class.
Use the Document.AddSection() method to add a section.
Create a Table object and add it to the specified section in the document.
Use the Table.ResetCells(int rowsNum, int columnsNum) method to set the number of rows and columns in the table.
Add a caption to the table using the Table.AddCaption(string name, CaptionNumberingFormat format, CaptionPosition captionPosition) method, specifying the caption numbering format as CaptionNumberingFormat.Number.
Set the Document.IsUpdateFields property to true to update all fields.
Use the Document.SaveToFile() method to save the resulting document.

using Spire.Doc;
namespace AddTableCaption
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create a Word document object
            Document document = new Document();

            // Add a section
            Section section = document.AddSection();

            // Add a table
            Table tableCaption = section.AddTable(true);
            tableCaption.ResetCells(3, 2);

            // Add a caption to the table
            tableCaption.AddCaption("Table", CaptionNumberingFormat.Number, CaptionPosition.BelowItem);

            // Add another table and caption
            tableCaption = section.AddTable(true);
            tableCaption.ResetCells(2, 3);
            tableCaption.AddCaption("Table", CaptionNumberingFormat.Number, CaptionPosition.BelowItem);

            // Update all fields in the document
            document.IsUpdateFields = true;

            // Save to a docx document
            string result = "AddTableCaption.docx";
            document.SaveToFile(result, Spire.Doc.FileFormat.Docx2016);

            // Close and dispose of the document object to release resources
            document.Close();
            document.Dispose();
        }
    }
}

Remove Captions from a Word document in C#

Spire.Doc for .NET can also facilitate the removal of captions from an existing Word document. Here are the detailed steps:

Create an object of the Document class.
Use the Document.LoadFromFile() method to load a Word document.
Create a custom method, named DetectCaptionParagraph(Paragraph paragraph), to determine if a paragraph contains a caption.
Iterate through all the Paragraph objects in the document using a loop and utilize the custom method, DetectCaptionParagraph(Paragraph paragraph), to identify and delete paragraphs that contain captions.
Use the Document.SaveToFile() method to save the resulting document.

using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;
namespace DeleteCaptions
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create a Word document object
            Document document = new Document();

            // Load the Sample.docx file
            document.LoadFromFile("Data/Sample.docx");
            Section section;

            // Iterate through all sections
            for (int i = 0; i < document.Sections.Count; i++)
            {
                section = document.Sections[i];

                // Iterate through paragraphs in reverse order
                for (int j = section.Body.Paragraphs.Count - 1; j >= 0; j--)
                {
                    // Check if the paragraph is a caption paragraph
                    if (DetectCaptionParagraph(section.Body.Paragraphs[j]))
                    {
                        // If it's a caption paragraph, remove it
                        section.Body.Paragraphs.RemoveAt(j);
                    }
                }
            }

            // Save the document after removing captions
            string result = "RemoveCaptions.docx";
            document.SaveToFile(result, Spire.Doc.FileFormat.Docx2016);

            // Close and dispose of the document object to release resources
            document.Close();
            document.Dispose();
        }
        // Method to detect if a paragraph is a caption paragraph
        static bool DetectCaptionParagraph(Paragraph paragraph)
        {
            // Iterate through the child objects in the paragraph
            for (int i = 0; i < paragraph.ChildObjects.Count; i++)
            {
                // Only child objects of Field type can mark a caption
                if (paragraph.ChildObjects[i].DocumentObjectType == DocumentObjectType.Field)
                {
                    Field field = (Field)paragraph.ChildObjects[i];

                    // A FieldSequence field indicates a caption
                    if (field.Type == FieldType.FieldSequence)
                    {
                        return true;
                    }
                }
            }
            return false;
        }
    }
}
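
The example above walks the paragraphs from last to first while removing items. The following minimal sketch shows that pattern on a plain List<string>, as a stand-in for Body.Paragraphs; the class name ReverseRemovalDemo, the helper RemoveMatching, and the "Figure" prefix test are hypothetical and only serve the illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ReverseRemovalDemo
{
    // Removes every item that satisfies isMatch and returns how many were removed.
    // Iterating from the end means a removal only shifts items that have
    // already been visited, so no item is skipped.
    public static int RemoveMatching(List<string> items, Predicate<string> isMatch)
    {
        int removed = 0;
        for (int j = items.Count - 1; j >= 0; j--)
        {
            if (isMatch(items[j]))
            {
                items.RemoveAt(j);
                removed++;
            }
        }
        return removed;
    }

    public static void Main()
    {
        // Two adjacent "captions" in a row: a forward loop would skip the second one
        var paragraphs = new List<string> { "Intro", "Figure 1", "Figure 2", "Body" };

        int count = RemoveMatching(paragraphs, p => p.StartsWith("Figure", StringComparison.Ordinal));

        Console.WriteLine(count + ": " + string.Join(", ", paragraphs)); // prints "2: Intro, Body"
    }
}
```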

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Published in Others

Tagged under

doc net Others

C#: Compare PDF Documents

2023-12-15 00:55:52 Written by Koohji

PDF has become the standard format for sharing and preserving documents across different platforms, playing a ubiquitous role in both professional and personal settings. However, creating high-quality PDF documents requires multiple checks and revisions. In this context, knowing how to efficiently compare PDF files and pinpoint their differences becomes crucial, which enables document editors to quickly identify discrepancies between different versions of a document, resulting in significant time savings during the document creation and review process. This article aims to demonstrate how to compare PDF documents effortlessly using Spire.PDF for .NET in C# programs.

Compare Two PDF Documents in C#
Compare a Specific Page Range of Two PDF Documents

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

Package Manager

PM> Install-Package Spire.PDF

Compare Two PDF Documents in C#

With Spire.PDF for .NET, developers can create an instance of the PdfComparer class, passing two PdfDocument objects as parameters, and then utilize the PdfComparer.Compare(String fileName) method to compare the two documents. The resulting comparison is saved as a new PDF document, allowing for further analysis or review of the differences between the two PDFs.

The resulting PDF document displays the two original documents on the left and the right, with the deleted items in red and the added items in yellow.

The following are the detailed steps for comparing two PDF documents:

Create two objects of PdfDocument class and load two PDF documents using PdfDocument.LoadFromFile() method.
Create an instance of PdfComparer class and pass the two PdfDocument objects as parameters.
Compare the two documents and save the result as another PDF document using PdfComparer.Compare() method.

using Spire.Pdf;
using Spire.Pdf.Comparison;

namespace ComparePdfDocuments
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create an object of PdfDocument class and load a PDF document
            PdfDocument pdf1 = new PdfDocument();
            pdf1.LoadFromFile("Sample1.pdf");

            //Create another object of PdfDocument class and load another PDF document
            PdfDocument pdf2 = new PdfDocument();
            pdf2.LoadFromFile("Sample2.pdf");

            //Create an object of PdfComparer class with the two documents
            PdfComparer comparer = new PdfComparer(pdf1, pdf2);

            //Compare the two documents and save the comparison result to another PDF document
            comparer.Compare("output/ComparingResult.pdf");
            pdf1.Close();
            pdf2.Close();
        }
    }
}

Compare a Specific Page Range of Two PDF Documents

After creating an instance of the PdfComparer class, developers can also use the PdfComparer.Options.SetPageRanges() method to set the page range to be compared. This allows for comparing only the specified page range in two PDF documents. The detailed steps are as follows:

Create two objects of PdfDocument class and load two PDF documents using PdfDocument.LoadFromFile() method.
Create an instance of PdfComparer class and pass the two PdfDocument objects as parameters.
Set the page range to be compared using the PdfComparer.Options.SetPageRanges() method.
Compare the specified page range in the two PDF documents and save the result as another PDF document using PdfComparer.Compare() method.

using Spire.Pdf;
using Spire.Pdf.Comparison;

namespace ComparePdfPageRange
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create an object of PdfDocument class and load a PDF document
            PdfDocument pdf1 = new PdfDocument();
            pdf1.LoadFromFile("Sample1.pdf");

            //Create another object of PdfDocument class and load another PDF document
            PdfDocument pdf2 = new PdfDocument();
            pdf2.LoadFromFile("Sample2.pdf");

            //Create an object of PdfComparer class with the two documents
            PdfComparer comparer = new PdfComparer(pdf1, pdf2);

            //Set the page range to be compared
            comparer.Options.SetPageRanges(1, 1, 1, 1);
            
            //Compare the specified page range and save the comparing result to another PDF document
            comparer.Compare("output/PageRangeComparingResult.pdf");
            pdf1.Close();
            pdf2.Close();
        }
    }
}

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Published in Document Operation

Tagged under

pdf net Document Operation

C#/VB.NET: Merge and Split Table Cells in PowerPoint

2023-09-18 01:06:43 Written by Koohji

Merging and splitting table cells in PowerPoint are two common functions, mainly used to adjust the layout and structure of the table. Merging cells involves combining adjacent cells into a larger one. It allows users to create title cells that span multiple columns or rows. On the other hand, splitting cells means dividing a cell into several smaller ones, which is useful for creating detailed layouts or accommodating diverse content. In this article, we will show you how to merge and split table cells in PowerPoint programmatically by using Spire.Presentation for .NET.

Merge Table Cells in PowerPoint
Split Table Cells in PowerPoint

Install Spire.Presentation for .NET

To begin with, you need to add the DLL files included in the Spire.Presentation for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

Package Manager

PM> Install-Package Spire.Presentation

Merge Table Cells in PowerPoint

Spire.Presentation for .NET provides the ITable[int columnIndex, int rowIndex] property to get specific cells and the ITable.MergeCells(Cell startCell, Cell endCell, bool allowSplitting) method to merge them. The detailed steps are as follows.

Create an object of Presentation class.
Load a sample file using Presentation.LoadFromFile() method.
Get the table from the first slide by looping through all shapes.
Get the specific cells using the ITable[int columnIndex, int rowIndex] property and merge them using the ITable.MergeCells(Cell startCell, Cell endCell, bool allowSplitting) method.
Save the result file using Presentation.SaveToFile() method.

using Spire.Presentation;

namespace MergeCells
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create an object of Presentation class 
            Presentation presentation = new Presentation();

            //Load a PowerPoint presentation
            presentation.LoadFromFile("sample.pptx");

            //Get the table from the first slide by looping through all shapes
            ITable table = null;
            foreach (IShape shape in presentation.Slides[0].Shapes)
            {
                if (shape is ITable)
                {
                    table = (ITable)shape;

                    //Merge the cells from [0,0] to [4,0]
                    table.MergeCells(table[0, 0], table[4, 0], false);
                }
            }

            //Save the result document
            presentation.SaveToFile("MergeCells.pptx", FileFormat.Pptx2010);
            presentation.Dispose();
        }
    }
}

Split Table Cells in PowerPoint

Spire.Presentation for .NET also lets you get a specific cell through the ITable[int columnIndex, int rowIndex] property and split it into smaller cells using the Cell.Split(int RowCount, int ColumnCount) method. The detailed steps are as follows.

Create an object of Presentation class.
Load a sample file using Presentation.LoadFromFile() method.
Get the table from the first slide by looping through all shapes.
Get the specific cell by ITable[int columnIndex, int rowIndex] property and split it into 2 rows and 2 columns by using Cell.Split(int RowCount, int ColumnCount) method.
Save the result file using Presentation.SaveToFile() method.

using Spire.Presentation;

namespace SplitCells
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create an object of Presentation class
            Presentation presentation = new Presentation();

            //Load a PowerPoint presentation
            presentation.LoadFromFile("sample.pptx");

            //Get the table from the first slide by looping through all shapes
            ITable table = null;
            foreach (IShape shape in presentation.Slides[0].Shapes)
            {
                if (shape is ITable)
                {
                    table = (ITable)shape;

                    //Split cell [2, 2] into 2 rows and 2 columns
                    table[2, 2].Split(2, 2);
                }
            }

            //Save the result document
            presentation.SaveToFile("SplitCells.pptx", FileFormat.Pptx2013);
            presentation.Dispose();
        }
    }
}

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Published in Table

Tagged under

ppt net Table

C#/VB.NET: Change or Delete Hyperlinks in PDF

2023-07-11 00:39:08 Written by Administrator

Hyperlinks in PDF documents allow users to jump to pages or open documents, making PDF files more interactive and easier to use. However, if the target site of the link has been changed or the link points to the wrong page, it may cause trouble or misunderstanding to the document users. Therefore, it is very important to change or remove wrong or invalid hyperlinks in PDF documents to ensure the accuracy and usability of the hyperlinks, so as to provide a better reading experience for users. This article will introduce how to change or remove hyperlinks in PDF documents through .NET programs using Spire.PDF for .NET.

Change the URL of a Hyperlink in PDF in C#/VB.NET
Remove Hyperlinks from PDF in C#/VB.NET

Install Spire.PDF for .NET

Package Manager

PM> Install-Package Spire.PDF

Change the URL of a Hyperlink in PDF

To change the URL of a hyperlink on a PDF page, it is necessary to get the hyperlink annotation widget and use the PdfUriAnnotationWidget.Uri property to reset the URL. The detailed steps are as follows:

Create an object of PdfDocument class.
Load a PDF file using the PdfDocument.LoadFromFile() method.
Get the first page of the document using PdfDocument.Pages[] property.
Get the first hyperlink widget on the page using PdfPageBase.AnnotationsWidget[] property.
Reset the URL of the hyperlink using PdfUriAnnotationWidget.Uri property.
Save the document using PdfDocument.SaveToFile() method.

using Spire.Pdf;
using Spire.Pdf.Annotations;
using System;

namespace ChangeHyperlink
{
    internal class Program
    {
        static void Main(string[] args)
        {
            //Create an object of PdfDocument
            PdfDocument pdf = new PdfDocument();

            //Load a PDF file
            pdf.LoadFromFile("Sample.pdf");

            //Get the first page
            PdfPageBase page = pdf.Pages[0];

            //Get the first hyperlink
            PdfUriAnnotationWidget url = (PdfUriAnnotationWidget)page.Annotations[0];

            //Reset the url of the hyperlink
            url.Uri = "https://en.wikipedia.org/wiki/Climate_change";

            //Save the PDF file
            pdf.SaveToFile("ChangeHyperlink.pdf");
            pdf.Dispose();
        }
    }
}
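
The example above casts the first annotation on the page directly to PdfUriAnnotationWidget. In C#, such a cast throws an InvalidCastException when the object is of a different type, so if a page may contain other kinds of annotations, a type check before the cast is safer. The following minimal sketch shows the pattern with hypothetical stand-in classes (Annotation, UriAnnotation, FirstLinkDemo); they are not Spire.PDF types and only illustrate the lookup logic.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for annotation types; these are NOT Spire.PDF classes
public class Annotation { }

public class UriAnnotation : Annotation
{
    public string Uri { get; set; }
}

public static class FirstLinkDemo
{
    // Returns the first UriAnnotation in the list, or null if there is none
    public static UriAnnotation FindFirstLink(IList<Annotation> annotations)
    {
        foreach (Annotation annotation in annotations)
        {
            // Pattern matching checks the type and casts in one step
            if (annotation is UriAnnotation link)
            {
                return link;
            }
        }
        return null;
    }

    public static void Main()
    {
        // The first annotation is not a link, so a blind cast of index 0 would fail
        var annotations = new List<Annotation>
        {
            new Annotation(),
            new UriAnnotation { Uri = "https://example.com/old" }
        };

        UriAnnotation link = FindFirstLink(annotations);
        if (link != null)
        {
            link.Uri = "https://example.com/new";
            Console.WriteLine(link.Uri); // prints "https://example.com/new"
        }
    }
}
```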

Remove Hyperlinks from PDF

Spire.PDF for .NET provides the PdfPageBase.AnnotationsWidget.RemoveAt() method to remove a hyperlink on a PDF page by its index. Eliminating all hyperlinks from a PDF document requires iterating through the pages, obtaining the annotation widgets of each page, verifying whether an annotation is an instance of the PdfUriAnnotationWidget class, and deleting the annotation if it is. The following are the detailed steps:

Create an object of PdfDocument class.
Load a PDF document using PdfDocument.LoadFromFile() method.
To remove a specific hyperlink, get the page containing the hyperlink and remove the hyperlink by its index using PdfPageBase.AnnotationsWidget.RemoveAt() method.
To remove all hyperlinks, loop through the pages in the document to get the annotation collection of each page using PdfPageBase.AnnotationsWidget property.
Check if an annotation widget is an instance of PdfUriAnnotationWidget class and remove the annotation widget using PdfAnnotationCollection.Remove(PdfUriAnnotationWidget) method if it is.
Save the document using PdfDocument.SaveToFile() method.

C#

using Spire.Pdf;
using Spire.Pdf.Annotations;
using System;

namespace DeleteHyperlink
{
    internal class Program
    {
        static void Main(string[] args)
        {
            //Create an object of PdfDocument
            PdfDocument pdf = new PdfDocument();

            //Load a PDF file
            pdf.LoadFromFile("Sample.pdf");

            //Remove the second hyperlink on the first page
            //PdfPageBase page = pdf.Pages[0];
            //page.AnnotationsWidget.RemoveAt(1);

            //Remove all hyperlinks in the document
            //Loop through pages in the document
            foreach (PdfPageBase page in pdf.Pages)
            {
                //Get the annotation collection of a page
                PdfAnnotationCollection collection = page.Annotations;
                for (int i = collection.Count - 1; i >= 0; i--)
                {
                    PdfAnnotation annotation = collection[i];
                    //Check if an annotation is an instance of PdfUriAnnotationWidget
                    if (annotation is PdfUriAnnotationWidget)
                    {
                        PdfUriAnnotationWidget url = (PdfUriAnnotationWidget)annotation;
                        //Remove the hyperlink
                        collection.Remove(url);
                    }
                }
            }

            //Save the document
            pdf.SaveToFile("DeleteHyperlink.pdf");
            pdf.Dispose();
        }
    }
}
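
The inner loop above walks the annotation collection from the last index down to zero. Removing an item shifts every later item down by one position, so a backward loop never skips an element that has not been checked yet. A minimal sketch of that pattern, assuming a plain list as a stand-in for the annotation collection (RemoveAllOfType is a hypothetical helper, not a Spire.PDF method):

```csharp
using System;
using System.Collections.Generic;

public static class ReverseRemovalSketch
{
    // Hypothetical helper: removes every element of type T from the list and
    // returns how many were removed. Iterating backwards means a removal only
    // shifts items that have already been visited.
    public static int RemoveAllOfType<T>(IList<object> items)
    {
        int removed = 0;
        for (int i = items.Count - 1; i >= 0; i--)
        {
            if (items[i] is T)
            {
                items.RemoveAt(i);
                removed++;
            }
        }
        return removed;
    }
}
```

For instance, if strings stand in for hyperlink annotations and integers for other annotation types, calling RemoveAllOfType<string>() on a list that mixes both leaves only the integers, in their original order.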

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

C#/VB.NET: Create a Line Chart in Word

2023-07-07 01:00:51 Written by Koohji

Charts in Word documents are a valuable tool for presenting and analyzing data in a visually appealing and understandable format. They help summarize key trends, patterns, or relationships within the data, which is especially useful when you are creating company reports, business proposals or research papers. In this article, you will learn how to programmatically add a line chart to a Word document using Spire.Doc for .NET.

Install Spire.Doc for .NET

To begin with, you need to add the DLL files included in the Spire.Doc for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

Package Manager

PM> Install-Package Spire.Doc

Create a Line Chart in Word in C# and VB.NET

A line chart is a common type of chart that connects a series of data points with a continuous line. To add a line chart in Word, Spire.Doc for .NET offers the Paragraph.AppendChart(ChartType.Line, float width, float height) method. The following are the detailed steps.

Create a Document object.
Add a section and then add a paragraph to the section.
Add a line chart with specified size to the paragraph using Paragraph.AppendChart(ChartType.Line, float width, float height) method.
Get the chart and then set the chart title using Chart.Title.Text property.
Add a custom series to the chart using Chart.Series.Add(string seriesName, string[] categories, double[] values) method.
Set the legend position using Chart.Legend.Position property.
Save the result document using Document.SaveToFile() method.

C#

using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields.Shapes.Charts;
using Spire.Doc.Fields;

namespace WordLineChart
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Document object
            Document document = new Document();

            //Add a section
            Section section = document.AddSection();

            //Add a paragraph to the section
            Paragraph newPara = section.AddParagraph();

            //Add a line chart with specified size to the paragraph
            ShapeObject shape = newPara.AppendChart(ChartType.Line, 460, 300);

            //Get the chart
            Chart chart = shape.Chart;

            //Set chart title
            chart.Title.Text = "Sales Report";

            //Clear the default series data of the chart
            chart.Series.Clear();

            //Add three custom series with specified series names, category names, and series values to chart
            string[] categories = { "Jan", "Feb", "Mar", "Apr"};
            chart.Series.Add("Team A", categories, new double[] { 1000, 2000, 2500, 4200 });
            chart.Series.Add("Team B", categories, new double[] { 1500, 1800, 3500, 4000 });
            chart.Series.Add("Team C", categories, new double[] { 1200, 2500, 2900, 3600 });

            //Set the legend position
            chart.Legend.Position = LegendPosition.Bottom;

            //Save the result document
            document.SaveToFile("AppendLineChart.docx", FileFormat.Docx);
        }
    }
}
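
Each call to Chart.Series.Add() in the example passes exactly one value per category. This article does not say how the library behaves when the two arrays differ in length, so a small guard in front of the call may be worth having. The following is a sketch only: SeriesGuard.Check is a hypothetical helper and not part of Spire.Doc.

```csharp
using System;

public static class SeriesGuard
{
    // Hypothetical helper: returns the values unchanged when there is exactly
    // one value per category, otherwise throws with a descriptive message.
    public static double[] Check(string seriesName, string[] categories, double[] values)
    {
        if (categories == null) throw new ArgumentNullException(nameof(categories));
        if (values == null) throw new ArgumentNullException(nameof(values));
        if (categories.Length != values.Length)
        {
            throw new ArgumentException(
                $"Series '{seriesName}' has {values.Length} values for {categories.Length} categories.");
        }
        return values;
    }
}
```

Because the helper returns the array it was given, it can wrap the third argument of an existing call, for example SeriesGuard.Check("Team A", categories, new double[] { 1000, 2000, 2500, 4200 }).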

C#/VB.NET: Extract Tables from PDF to Excel

2023-07-06 07:17:34 Written by Koohji

Extracting tables from PDFs and converting them into Excel format offers numerous advantages, such as enabling data manipulation, analysis, and visualization in a more versatile and familiar environment. This task is particularly valuable for researchers, analysts, and professionals dealing with large amounts of tabular data. In this article, you will learn how to extract tables from PDF to Excel in C# and VB.NET using Spire.Office for .NET.

Install Spire.Office for .NET

To begin with, you need to add the Spire.Pdf.dll and the Spire.Xls.dll included in the Spire.Office for .NET package as references in your .NET project. Spire.PDF is responsible for extracting data from PDF tables, and Spire.XLS is responsible for creating an Excel document based on the data obtained from PDF.

The DLL files can be either downloaded from this link or installed via NuGet.

Package Manager

PM> Install-Package Spire.Office

Extract Tables from PDF to Excel in C#, VB.NET

Spire.PDF for .NET offers the PdfTableExtractor.ExtractTable(int pageIndex) method to extract tables from a specific page of a searchable PDF document. The text of a specific cell can be accessed using the PdfTable.GetText(int rowIndex, int columnIndex) method. This value can then be written to a worksheet through the Worksheet.Range[int row, int column].Value property offered by Spire.XLS for .NET. The following are the detailed steps.

Create an instance of PdfDocument class.
Load the sample PDF document using PdfDocument.LoadFromFile() method.
Extract tables from a specific page using PdfTableExtractor.ExtractTable() method.
Get text of a certain table cell using PdfTable.GetText() method.
Create a Workbook object.
Write the cell data obtained from PDF into a worksheet through Worksheet.Range.Value property.
Save the workbook to an Excel file using Workbook.SaveToFile() method.

The following code example extracts all tables from a PDF document and writes each of them into an individual worksheet within a workbook.

C#

using Spire.Pdf;
using Spire.Pdf.Utilities;
using Spire.Xls;
using System;

namespace ExtractTablesToExcel
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            //Load the sample PDF file
            doc.LoadFromFile(@"C:\Users\Administrator\Desktop\table.pdf");

            //Create a Workbook object
            Workbook workbook = new Workbook();

            //Clear default worksheets
            workbook.Worksheets.Clear();

            //Initialize an instance of PdfTableExtractor class
            PdfTableExtractor extractor = new PdfTableExtractor(doc);

            //Declare a PdfTable array 
            PdfTable[] tableList = null;

            int sheetNumber = 1;

            //Loop through the pages 
            for (int pageIndex = 0; pageIndex < doc.Pages.Count; pageIndex++)
            {
                //Extract tables from a specific page
                tableList = extractor.ExtractTable(pageIndex);

                //Check that the table list is not null or empty
                if (tableList != null && tableList.Length > 0)
                {
                    //Loop through the tables in the list
                    foreach (PdfTable table in tableList)
                    {
                        //Add a worksheet
                        Worksheet sheet = workbook.Worksheets.Add(String.Format("sheet{0}", sheetNumber));

                        //Get the row count and column count of the table
                        int row = table.GetRowCount();
                        int column = table.GetColumnCount();

                        //Loop through the rows and columns
                        for (int i = 0; i < row; i++)
                        {
                            for (int j = 0; j < column; j++)
                            {
                                //Get text from the specific cell
                                string text = table.GetText(i, j);

                                //Write text to a specified cell
                                sheet.Range[i + 1, j + 1].Value = text;
                            }
                        }
                        sheetNumber++;
                    }
                }
            }

            //Save to file
            workbook.SaveToFile("ToExcel.xlsx", ExcelVersion.Version2013);
        }
    }
}
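
In the example above, PdfTable.GetText() is called with indexes that start at 0, while the worksheet cell is addressed as Range[i + 1, j + 1] because worksheet rows and columns are numbered from 1. A minimal sketch of that offset in isolation, assuming a two-dimensional string array as a stand-in for an extracted table (ToOneBasedCells is a hypothetical helper, not part of Spire.PDF or Spire.XLS):

```csharp
using System;
using System.Collections.Generic;

public static class CellIndexSketch
{
    // Hypothetical helper: flattens a zero-based grid into (row, column, text)
    // entries whose row and column numbers start at 1.
    public static List<(int Row, int Column, string Text)> ToOneBasedCells(string[,] table)
    {
        var cells = new List<(int Row, int Column, string Text)>();
        for (int i = 0; i < table.GetLength(0); i++)
        {
            for (int j = 0; j < table.GetLength(1); j++)
            {
                // Shift both indexes by one to get the worksheet-style address
                cells.Add((i + 1, j + 1, table[i, j]));
            }
        }
        return cells;
    }
}
```

The top-left entry of the grid, table[0, 0], comes out as row 1, column 1, which matches the cell the extraction example writes first.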
