C#/VB.NET: Create a PowerPoint Document

2022-01-25 07:43:30 Written by Koohji

PowerPoint is a presentation format typically used for product introductions, performance reports, teaching, and similar purposes. Because designing a presentation is a visual task that needs constant fine-tuning, creating PowerPoint documents from scratch programmatically is generally not recommended. If you do need to create PowerPoint documents in C# or VB.NET, however, you can try the solution provided by Spire.Presentation for .NET.

Install Spire.Presentation for .NET

To begin with, you need to add the DLL files included in the Spire.Presentation for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.Presentation

Create a Simple PowerPoint Document

Spire.Presentation for .NET offers the Presentation class and the ISlide interface to represent a PowerPoint document and a slide respectively. It is quite straightforward and simple for developers to use the properties and methods under them to create or manipulate PowerPoint files. The following are the steps to generate a simple PowerPoint document using it.

  • Create a Presentation object, and set the slide size type to screen 16x9 through the Presentation.SlideSize.Type property.
  • Get the first slide through the Presentation.Slides[] property.
  • Set the background image of the slide using ISlide.SlideBackground property.
  • Add a rectangle to the slide using ISlide.Shapes.AppendShape() method, positioning the shape at the center of the slide using IAutoShape.SetShapeAlignment() method.
  • Set the fill color, line style, font color, and text of the shape through other properties under the IAutoShape object.
  • Save the presentation to a .pptx file using Presentation.SaveToFile() method.

C#:

using System.Drawing;
using Spire.Presentation;
using Spire.Presentation.Drawing;

namespace CreatePowerPoint
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Presentation object
            Presentation presentation = new Presentation();

            //Set the slide size type to screen 16x9
            presentation.SlideSize.Type = SlideSizeType.Screen16x9;

            //Get the first slide
            ISlide slide = presentation.Slides[0];

            //Set the background image
            string imgPath = @"C:\Users\Administrator\Desktop\bgImage.jpg";
            IImageData imageData = presentation.Images.Append(Image.FromFile(imgPath));
            slide.SlideBackground.Type = Spire.Presentation.Drawing.BackgroundType.Custom;
            slide.SlideBackground.Fill.FillType = Spire.Presentation.Drawing.FillFormatType.Picture;
            slide.SlideBackground.Fill.PictureFill.FillType = PictureFillType.Stretch;
            slide.SlideBackground.Fill.PictureFill.Picture.EmbedImage = imageData;

            //Insert a rectangle shape
            Rectangle rect = new Rectangle(100, 100, 500, 80);
            IAutoShape shape = slide.Shapes.AppendShape(ShapeType.Rectangle, rect);

            //Position the shape at the center of the slide
            shape.SetShapeAlignment(ShapeAlignment.AlignCenter);
            shape.SetShapeAlignment(ShapeAlignment.DistributeVertically);

            //Set the fill color, line style and font color of the shape
            shape.Fill.FillType = FillFormatType.Solid;
            shape.Fill.SolidColor.Color = Color.BlueViolet;
            shape.ShapeStyle.LineStyleIndex = 0;//no line
            shape.ShapeStyle.FontColor.Color = Color.White;

            //Set the text of the shape
            shape.TextFrame.Text = "This article shows you how to create a simple PowerPoint document using Spire.Presentation for .NET.";

            //Save to file
            presentation.SaveToFile("CreatePowerPoint.pptx", FileFormat.Pptx2013);
        }
    }
}

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

C#/VB.NET: Convert PDF to Linearized

2021-12-10 03:38:02 Written by Koohji

PDF linearization, also known as "Fast Web View", is a way of optimizing PDF files. Ordinarily, users can view a multipage PDF file online only when their web browsers have downloaded all pages from the server. However, if the PDF file is linearized, the browsers can display the first page very quickly even if the full download has not been completed. This article will demonstrate how to convert a PDF to linearized in C# and VB.NET using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

  • Package Manager
PM> Install-Package Spire.PDF

Convert PDF to Linearized

The following are the steps to convert a PDF file to linearized:

  • Load a PDF file using PdfToLinearizedPdfConverter class.
  • Convert the file to linearized using PdfToLinearizedPdfConverter.ToLinearizedPdf() method.

C#:

using Spire.Pdf.Conversion;

namespace ConvertPdfToLinearized
{
    class Program
    {
        static void Main(string[] args)
        {
            //Load a PDF file
            PdfToLinearizedPdfConverter converter = new PdfToLinearizedPdfConverter("Sample.pdf");
            //Convert the file to a linearized PDF
            converter.ToLinearizedPdf("Linearized.pdf");
        }
    }
}

VB.NET:

Imports Spire.Pdf.Conversion

Namespace ConvertPdfToLinearized
    Friend Class Program
        Private Shared Sub Main(ByVal args As String())
            'Load a PDF file
            Dim converter As PdfToLinearizedPdfConverter = New PdfToLinearizedPdfConverter("Sample.pdf")
            'Convert the file to a linearized PDF
            converter.ToLinearizedPdf("Linearized.pdf")
        End Sub
    End Class
End Namespace

Open the result file in Adobe Acrobat and look at the document properties. The value of “Fast Web View” is Yes, which means the file is linearized.
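
If Adobe Acrobat is not at hand, a rough programmatic check is also possible. The sketch below is not part of Spire.PDF. It relies on the PDF specification's requirement that a linearized file keeps its linearization parameter dictionary (which contains the /Linearized key) within the first 1024 bytes. The LinearizationCheck class and its LooksLinearized methods are hypothetical names, and the check is a heuristic rather than a full validation.

```csharp
using System;
using System.IO;
using System.Text;

namespace CheckLinearized
{
    public static class LinearizationCheck
    {
        //Heuristic: look for the /Linearized key in the first 'count' bytes (at most 1024)
        public static bool LooksLinearized(byte[] data, int count)
        {
            int length = Math.Min(Math.Min(count, data.Length), 1024);
            //ASCII decoding is enough here because the key itself is plain ASCII
            string head = Encoding.ASCII.GetString(data, 0, length);
            return head.Contains("/Linearized");
        }

        //Read up to 1024 bytes from the start of the file and apply the heuristic
        public static bool LooksLinearized(string path)
        {
            byte[] buffer = new byte[1024];
            int total = 0;
            using (FileStream stream = File.OpenRead(path))
            {
                int read;
                while (total < buffer.Length &&
                       (read = stream.Read(buffer, total, buffer.Length - total)) > 0)
                {
                    total += read;
                }
            }
            return LooksLinearized(buffer, total);
        }

        static void Main(string[] args)
        {
            //Assumed default: the output file name used in the example above
            string path = args.Length > 0 ? args[0] : "Linearized.pdf";
            if (File.Exists(path))
                Console.WriteLine("Looks linearized: " + LooksLinearized(path));
            else
                Console.WriteLine("File not found: " + path);
        }
    }
}
```

A result of true only indicates that the key is present near the start of the file; a viewer's document properties remain the more reliable confirmation.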


Converting a PDF with color images to grayscale can help you reduce the file size and print the PDF more economically, without consuming colored ink. In this article, you will learn how to perform the conversion programmatically in C# and VB.NET using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

  • Package Manager
PM> Install-Package Spire.PDF 

Convert PDF to Grayscale

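The following are the steps to convert a color PDF to grayscale:

  • Create a PdfGrayConverter instance and load a PDF file.
  • Convert the PDF to grayscale using the PdfGrayConverter.ToGrayPdf() method.
  • Release the converter's resources using the PdfGrayConverter.Dispose() method.

C#:
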
using Spire.Pdf.Conversion;
 
namespace ConvertPdfToGrayscale
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfGrayConverter instance and load a PDF file
            PdfGrayConverter converter = new PdfGrayConverter(@"Sample.pdf");
            //Convert the PDF to grayscale
            converter.ToGrayPdf("Grayscale.pdf");
            converter.Dispose();
        }
    }
}

VB.NET:

Imports Spire.Pdf.Conversion

Namespace ConvertPdfToGrayscale
    Friend Class Program
        Private Shared Sub Main(ByVal args As String())
            'Create a PdfGrayConverter instance and load a PDF file
            Dim converter As PdfGrayConverter = New PdfGrayConverter("Sample.pdf")
            'Convert the PDF to grayscale
            converter.ToGrayPdf("Grayscale.pdf")
            converter.Dispose()
        End Sub
    End Class
End Namespace

The input PDF:

[Image: C#/VB.NET: Convert PDF to Grayscale (Black and White)]

The output PDF:

[Image: C#/VB.NET: Convert PDF to Grayscale (Black and White)]


When you create a new table in PowerPoint, the rows and columns are evenly distributed by default. As you insert data into the table cells, the row heights and column widths are automatically adjusted to fit the contents. To keep the table neatly organized, you may want to redistribute the rows and columns. This article demonstrates how to accomplish this task in C# and VB.NET using Spire.Presentation for .NET.

Install Spire.Presentation for .NET

To begin with, you need to add the DLL files included in the Spire.Presentation for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.Presentation

Distribute Table Rows and Columns

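The following are the steps to distribute table rows and columns evenly in PowerPoint.

  • Create a Presentation instance and load the PowerPoint document using the Presentation.LoadFromFile() method.
  • Get the first slide through the Presentation.Slides[] property.
  • Loop through the shapes on the slide and determine whether each shape is an ITable.
  • Distribute the table rows using the ITable.DistributeRows() method and the table columns using the ITable.DistributeColumns() method.
  • Save the result to a .pptx file using the Presentation.SaveToFile() method.

C#:
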
using Spire.Presentation;

namespace DistributeRowsAndColumns
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Presentation instance
            Presentation presentation = new Presentation();

            //Load the PowerPoint document
            presentation.LoadFromFile(@"C:\Users\Administrator\Desktop\Table.pptx");

            //Get the first slide
            ISlide slide = presentation.Slides[0];

            //Loop through the shapes
            for (int i = 0; i < slide.Shapes.Count; i++)
            {
                //Determine if a shape is table
                if (slide.Shapes[i] is ITable)
                {
                    //Get the table in the slide
                    ITable table = (ITable)slide.Shapes[i];

                    //Distribute table rows
                    table.DistributeRows(0, table.TableRows.Count-1);

                    //Distribute table columns
                    table.DistributeColumns(0, table.ColumnsList.Count-1);

                }
            }

            //Save the result to file
            presentation.SaveToFile("DistributeRowsAndColumns.pptx", FileFormat.Pptx2013);
        }
    }
}

VB.NET:

Imports Spire.Presentation
 
Namespace DistributeRowsAndColumns
    Class Program
        Shared  Sub Main(ByVal args() As String)
            'Create a Presentation instance
            Dim presentation As Presentation =  New Presentation() 
 
            'Load the PowerPoint document
            presentation.LoadFromFile("C:\Users\Administrator\Desktop\Table.pptx")
 
            'Get the first slide
            Dim slide As ISlide =  presentation.Slides(0) 
 
            'Loop through the shapes
            For i As Integer = 0 To slide.Shapes.Count - 1
                'Determine if a shape is table
                If TypeOf slide.Shapes(i) Is ITable Then
                    'Get the table in the slide
                    Dim table As ITable = CType(slide.Shapes(i), ITable)
 
                    'Distribute table rows
                    table.DistributeRows(0, table.TableRows.Count-1)
 
                    'Distribute table columns
                    table.DistributeColumns(0, table.ColumnsList.Count-1)
 
                End If
            Next
 
            'Save the result to file
            presentation.SaveToFile("DistributeRowsAndColumns.pptx", FileFormat.Pptx2013)
        End Sub
    End Class
End Namespace
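
Conceptually, distributing rows or columns means giving each one an equal share of the space they occupy together. The sketch below illustrates that arithmetic only. It assumes the combined size is preserved, as with PowerPoint's own Distribute Rows command, and it does not call Spire.Presentation; the Distribution class and the DistributeEvenly method are hypothetical names.

```csharp
using System;
using System.Linq;

namespace DistributeEvenlySketch
{
    public static class Distribution
    {
        //Return equal sizes whose sum matches the sum of the original sizes
        public static double[] DistributeEvenly(double[] sizes)
        {
            if (sizes.Length == 0)
            {
                return new double[0];
            }
            double each = sizes.Sum() / sizes.Length;
            return Enumerable.Repeat(each, sizes.Length).ToArray();
        }

        static void Main(string[] args)
        {
            //Three rows whose heights drifted apart after data was entered
            double[] rowHeights = { 40, 25, 85 };

            //prints "50, 50, 50"
            Console.WriteLine(string.Join(", ", DistributeEvenly(rowHeights)));
        }
    }
}
```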


Sometimes after you have finished an Excel workbook, you may need to replace some of the existing pictures with better ones for the purpose of making the workbook more appealing and persuasive. In this tutorial, you will learn how to replace a picture in Excel using Spire.XLS for .NET.

Install Spire.XLS for .NET

To begin with, you need to add the DLL files included in the Spire.XLS for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

  • Package Manager
PM> Install-Package Spire.XLS

Replace a Picture in Excel

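The following are the detailed steps to replace a picture with another one using Spire.XLS for .NET.

  • Create a Workbook instance and load the Excel file using the Workbook.LoadFromFile() method.
  • Get the first worksheet through the Workbook.Worksheets[] property.
  • Get the picture collection through the Worksheet.Pictures property, and get the first picture from the collection.
  • Replace the picture by assigning a new image to the ExcelPicture.Picture property.
  • Save the document using the Workbook.SaveToFile() method.

C#:
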
using Spire.Xls;
using Spire.Xls.Collections;
namespace ReplacePictureinExcel
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Workbook instance 
            Workbook workbook = new Workbook();
            //Load the Excel file
            workbook.LoadFromFile("input.xls");

            //Get the first sheet
            Worksheet sheet = workbook.Worksheets[0];

            //Get Excel picture collection
            PicturesCollection pictureCollection = sheet.Pictures;

            //Get the first picture from the collection 
            ExcelPicture excelPicture = pictureCollection[0];

            // Creates an Image from the specified file.
            excelPicture.Picture = System.Drawing.Image.FromFile("input.png");

            //Save the document
            workbook.SaveToFile("ReplaceImage.xlsx", ExcelVersion.Version2013);
        }
    }
}

VB.NET:

Imports Spire.Xls
Imports Spire.Xls.Collections
Imports System.Drawing

Namespace ReplacePictureinExcel
	Class Program
		Private Shared Sub Main(args As String())

			'Create a Workbook instance
			Dim workbook As New Workbook()
			'Load the Excel file
			workbook.LoadFromFile("input.xls")

			'Get the first sheet
			Dim sheet As Worksheet = workbook.Worksheets(0)

			'Get Excel picture collection
			Dim pictureCollection As PicturesCollection = sheet.Pictures

			'Get the first picture from the collection
			Dim excelPicture As ExcelPicture = pictureCollection(0)

			' Creates an Image from the specified file.
			excelPicture.Picture = Image.FromFile("input.png")

			'Save the document
			workbook.SaveToFile("ReplaceImage.xlsx", ExcelVersion.Version2013)
		End Sub
	End Class
End Namespace

The original file:

[Image: C#/VB.NET: Replace a Picture in Excel]

The generated file:

[Image: C#/VB.NET: Replace a Picture in Excel]


When you're dealing with Excel documents, a common task is to copy data from a main workbook and paste it into a separate workbook. You can copy either a selected cell range or an entire worksheet between different workbooks. This article demonstrates how to copy a selected cell range from one workbook to another by using Spire.XLS for .NET.

Install Spire.XLS for .NET

To begin with, you need to add the DLL files included in the Spire.XLS for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

  • Package Manager
PM> Install-Package Spire.XLS

Copy a Cell Range Between Different Workbooks

Spire.XLS offers the Worksheet.Copy() method to copy data from a source range to a destination range. The destination range can be in the same workbook or in a different workbook. The following are the steps to copy a cell range from one workbook to another.

  • Create a Workbook object to load the source Excel document.
  • Get the source worksheet and the source cell range using Workbook.Worksheets property and Worksheet.Range property respectively.
  • Create another Workbook object to load the destination Excel document.
  • Get the destination worksheet and cell range.
  • Copy the data from the source range to the destination range using Worksheet.Copy(CellRange source, CellRange destRange).
  • Copy the column widths from the source range to the destination range, so that the data can display properly in the destination workbook.
  • Save the destination workbook to an Excel file using Workbook.SaveToFile() method.

C#:

using Spire.Xls;

namespace CopyCellRange
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Workbook object
            Workbook sourceBook = new Workbook();

            //Load the source workbook
            sourceBook.LoadFromFile(@"C:\Users\Administrator\Desktop\source.xlsx");

            //Get the source worksheet
            Worksheet sourceSheet = sourceBook.Worksheets[0];

            //Get the source cell range
            CellRange sourceRange = sourceSheet.Range["A1:E4"];

            //Create another Workbook object
            Workbook destBook = new Workbook();

            //Load the destination workbook
            destBook.LoadFromFile(@"C:\Users\Administrator\Desktop\destination.xlsx");

            //Get the destination worksheet
            Worksheet destSheet = destBook.Worksheets[0];

            //Get the destination cell range
            CellRange destRange = destSheet.Range["B2:F5"];

            //Copy data from the source range to the destination range
            sourceSheet.Copy(sourceRange, destRange);

            //Loop through the columns in the source range
            for (int i = 0; i < sourceRange.Columns.Length; i++)
            {
                //Copy the column widths also from the source range to destination range
                destRange.Columns[i].ColumnWidth = sourceRange.Columns[i].ColumnWidth;
            }
            
            //Save the destination workbook to an Excel file
            destBook.SaveToFile("CopyRange.xlsx");
        }
    }
}

VB.NET:

Imports Spire.Xls
 
Namespace CopyCellRange
    Class Program
        Shared  Sub Main(ByVal args() As String)
            'Create a Workbook object
            Dim sourceBook As Workbook =  New Workbook() 
 
            'Load the source workbook
            sourceBook.LoadFromFile("C:\Users\Administrator\Desktop\source.xlsx")
 
            'Get the source worksheet
            Dim sourceSheet As Worksheet =  sourceBook.Worksheets(0) 
 
            'Get the source cell range
            Dim sourceRange As CellRange =  sourceSheet.Range("A1:E4") 
 
            'Create another Workbook object
            Dim destBook As Workbook =  New Workbook() 
 
            'Load the destination workbook
            destBook.LoadFromFile("C:\Users\Administrator\Desktop\destination.xlsx")
 
            'Get the destination worksheet
            Dim destSheet As Worksheet =  destBook.Worksheets(0) 
 
            'Get the destination cell range
            Dim destRange As CellRange =  destSheet.Range("B2:F5") 
 
            'Copy data from the source range to the destination range
            sourceSheet.Copy(sourceRange, destRange)
 
            'Loop through the columns in the source range
            For i As Integer = 0 To sourceRange.Columns.Length - 1
                'Copy the column widths also from the source range to destination range
                destRange.Columns(i).ColumnWidth = sourceRange.Columns(i).ColumnWidth
            Next
 
            'Save the destination workbook to an Excel file
            destBook.SaveToFile("CopyRange.xlsx")
        End Sub
    End Class
End Namespace


Extract tables from PDF files in C#/.NET

Extracting tables from PDF files is a common requirement in data processing, reporting, and automation tasks. PDFs are widely used for sharing structured data, but extracting tables programmatically can be challenging due to their complex layout. Fortunately, with the right tools, this process becomes straightforward. In this guide, we’ll explore how to extract tables from PDF in C# using the Spire.PDF for .NET library, and export the results to TXT and CSV formats for easy reuse.


Prerequisites for Reading PDF Tables in C#

Spire.PDF for .NET is a powerful library for processing PDF files in C# and VB.NET. It supports a wide range of PDF operations, including table extraction, text extraction, image extraction, and more.

The easiest way to add the Spire.PDF library is via NuGet Package Manager.

1. Open Visual Studio and create a new C# project. (Here we create a Console App)

2. In Visual Studio, right-click your project > Manage NuGet Packages.

3. Search for “Spire.PDF” and install the latest version.


Understanding PDF Table Structure

Before coding, let’s clarify how PDFs store tables. Unlike Excel (which explicitly defines rows/columns), PDFs use:

  • Text Blocks: Individual text elements positioned with coordinates.
  • Borders/Lines: Visual cues (horizontal/vertical lines) that humans interpret as table edges.
  • Spacing: Consistent gaps between text blocks to indicate cells.

The Spire.PDF library infers table structure by analyzing these visual cues, matching text blocks to rows/columns based on proximity and alignment.
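
To make the idea concrete, the sketch below groups positioned text blocks into rows by comparing their vertical coordinates, then orders each row from left to right. It is an illustration only and not Spire.PDF's actual algorithm; the TextBlock type, the GroupIntoRows method, and the tolerance value are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace TableInferenceSketch
{
    // Hypothetical type: a piece of text and the position where it is drawn
    public class TextBlock
    {
        public string Text;
        public double X;
        public double Y;

        public TextBlock(string text, double x, double y)
        {
            Text = text;
            X = x;
            Y = y;
        }
    }

    public static class RowGrouping
    {
        // Blocks whose Y differs from the row's first block by at most
        // 'tolerance' are treated as belonging to the same row
        public static List<List<string>> GroupIntoRows(IEnumerable<TextBlock> blocks, double tolerance)
        {
            var rows = new List<List<TextBlock>>();
            foreach (TextBlock block in blocks.OrderBy(b => b.Y))
            {
                List<TextBlock> current = rows.Count > 0 ? rows[rows.Count - 1] : null;
                if (current != null && Math.Abs(block.Y - current[0].Y) <= tolerance)
                {
                    current.Add(block);
                }
                else
                {
                    rows.Add(new List<TextBlock> { block });
                }
            }

            // Order the cells of each row from left to right
            return rows
                .Select(r => r.OrderBy(b => b.X).Select(b => b.Text).ToList())
                .ToList();
        }

        static void Main(string[] args)
        {
            var blocks = new List<TextBlock>
            {
                new TextBlock("Qty", 200, 100.4),
                new TextBlock("Item", 50, 100),
                new TextBlock("2", 200, 120),
                new TextBlock("Pen", 50, 119.8)
            };

            // Prints "Item | Qty" and then "Pen | 2"
            foreach (List<string> row in GroupIntoRows(blocks, 2.0))
            {
                Console.WriteLine(string.Join(" | ", row));
            }
        }
    }
}
```

A real extractor also has to infer column boundaries, merged cells, and ruling lines, which is why a dedicated library is usually preferable to hand-written grouping.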


How to Extract Tables from PDF in C#

If you need a quick way to preview table data (e.g., debugging or verifying extraction), printing it to the console is a great starting point.

Key classes and methods used to extract data from a PDF table:

  • PdfDocument: Represents a PDF file.
  • LoadFromFile: Loads the PDF file for processing.
  • PdfTableExtractor: Analyzes the PDF to detect tables using visual cues (borders, spacing).
  • ExtractTable(pageIndex): Returns an array of PdfTable objects for the specified page.
  • GetRowCount()/GetColumnCount(): Retrieve the dimensions of each table.
  • GetText(rowIndex, columnIndex): Extracts text from the cell at the specified row and column.
using System;
using Spire.Pdf;
using Spire.Pdf.Utilities;

namespace ExtractPdfTable
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a PdfDocument object
            PdfDocument pdf = new PdfDocument();

            // Load a PDF file
            pdf.LoadFromFile("invoice.pdf");

            // Initialize an instance of PdfTableExtractor class
            PdfTableExtractor extractor = new PdfTableExtractor(pdf);


            // Loop through the pages 
            for (int pageIndex = 0; pageIndex < pdf.Pages.Count; pageIndex++)
            {
                // Extract tables from a specific page
                PdfTable[] tableList = extractor.ExtractTable(pageIndex);

                // Determine if the table list is null
                if (tableList != null && tableList.Length > 0)
                {
                    int tableNumber = 1;
                    // Loop through the table in the list
                    foreach (PdfTable table in tableList)
                    {
                        Console.WriteLine($"\nTable {tableNumber} on Page {pageIndex + 1}:");
                        Console.WriteLine("-----------------------------------");

                        // Get row number and column number of a certain table
                        int row = table.GetRowCount();
                        int column = table.GetColumnCount();

                        // Loop through rows and columns 
                        for (int i = 0; i < row; i++)
                        {
                            for (int j = 0; j < column; j++)
                            {
                                // Get text from the specific cell
                                string text = table.GetText(i, j);

                                // Print cell text to console with a separator
                                Console.Write($"{text}\t");
                            }
                            // New line after each row
                            Console.WriteLine();
                        }
                        tableNumber++;
                    }
                }
            }

            // Close the document
            pdf.Close();
        }
    }
}

When to Use This Method

  • Quick debugging or validation of extracted data.
  • Small datasets where you don’t need persistent storage.

Output: Retrieve PDF table data and output to the console

[Image: Extract data from a PDF table]

Extract PDF Tables to a Text File in C#

For lightweight, human-readable storage, saving tables to a text file is ideal. This method uses StringBuilder to efficiently compile table data, preserving row breaks for readability.

Key features of extracting PDF tables and exporting to TXT:

  • Efficiency: StringBuilder minimizes memory overhead compared to string concatenation.
  • Persistent Storage: Saves data to a text file for later review or sharing.
  • Row Preservation: Uses \r\n to maintain row structure, making the text file easy to scan.
using Spire.Pdf;
using Spire.Pdf.Utilities;
using System.IO;
using System.Text;

namespace ExtractTableToTxt
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a PdfDocument object
            PdfDocument pdf = new PdfDocument();

            // Load a PDF file
            pdf.LoadFromFile("invoice.pdf");

            // Create a StringBuilder object
            StringBuilder builder = new StringBuilder();

            // Initialize an instance of PdfTableExtractor class
            PdfTableExtractor extractor = new PdfTableExtractor(pdf);

            // Declare a PdfTable array 
            PdfTable[] tableList = null;

            // Loop through the pages 
            for (int pageIndex = 0; pageIndex < pdf.Pages.Count; pageIndex++)
            {
                // Extract tables from a specific page
                tableList = extractor.ExtractTable(pageIndex);

                // Determine if the table list is null
                if (tableList != null && tableList.Length > 0)
                {
                    // Loop through the table in the list
                    foreach (PdfTable table in tableList)
                    {
                        // Get row number and column number of a certain table
                        int row = table.GetRowCount();
                        int column = table.GetColumnCount();

                        // Loop through the rows and columns 
                        for (int i = 0; i < row; i++)
                        {
                            for (int j = 0; j < column; j++)
                            {
                                // Get text from the specific cell
                                string text = table.GetText(i, j);

                                // Add text to the string builder
                                builder.Append(text).Append(' ');
                            }
                            builder.Append("\r\n");
                        }
                    }
                }
            }

            // Write to a .txt file
            File.WriteAllText("ExtractPDFTable.txt", builder.ToString());
        }
    }
}

When to Use This Method

  • Archiving table data in a lightweight, universally accessible format.
  • Sharing with teams that need to scan data without spreadsheet tools.
  • Using as input for basic scripts (e.g., PowerShell) to extract specific values.

Output: Extract PDF table data and save to a text file.

[Image: Extract table data from PDF to a TXT file]

Pro Tip: For VB.NET demos, convert the above code using our C# ⇆ VB.NET Converter.

Export PDF Tables to CSV in C#

CSV (Comma-Separated Values) is a widely supported format for tabular data, compatible with Excel, Google Sheets, and most databases. This method formats the extracted tables into valid CSV by wrapping every cell in double quotes and doubling any double quotes that appear inside the cell text.

Key features of extracting tables from PDF to CSV:

  • StreamWriter: Writes data incrementally to the CSV file, reducing memory usage for large PDFs.
  • Quoted Cells: Cells are wrapped in double quotes (" ") to avoid misinterpreting commas within text as column separators.
  • UTF-8 Encoding: Supports special characters in cell text.
  • Spreadsheet Ready: Directly opens in Excel, Google Sheets, or spreadsheet tools for analysis.
using Spire.Pdf;
using Spire.Pdf.Utilities;
using System.Collections.Generic;
using System.IO;
using System.Text;

namespace ExtractTableToCsv
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a PdfDocument object
            PdfDocument pdf = new PdfDocument();

            // Load a PDF file
            pdf.LoadFromFile("invoice.pdf");

            // Create a StreamWriter object for efficient CSV writing
            using (StreamWriter csvWriter = new StreamWriter("PDFtable.csv", false, Encoding.UTF8))
            {
                // Create a PdfTableExtractor object
                PdfTableExtractor extractor = new PdfTableExtractor(pdf);

                // Loop through the pages 
                for (int pageIndex = 0; pageIndex < pdf.Pages.Count; pageIndex++)
                {
                    // Extract tables from a specific page
                    PdfTable[] tableList = extractor.ExtractTable(pageIndex);

                    // Determine if the table list is null
                    if (tableList != null && tableList.Length > 0)
                    {
                        // Loop through the table in the list
                        foreach (PdfTable table in tableList)
                        {
                            // Get row number and column number of a certain table
                            int row = table.GetRowCount();
                            int column = table.GetColumnCount();

                            // Loop through the rows
                            for (int i = 0; i < row; i++)
                            {
                                // Creates a list to store data 
                                List<string> rowData = new List<string>();
                                // Loop through the columns
                                for (int j = 0; j < column; j++)
                                {
                                    // Retrieve text from table cells
                                    string cellText = table.GetText(i, j).Replace("\"", "\"\"");
                                    // Add the cell text to the list and wrap in double quotes
                                    rowData.Add($"\"{cellText}\"");
                                }
                                // Join cells with commas and write to CSV
                                csvWriter.WriteLine(string.Join(",", rowData));
                            }
                        }
                    }
                }
            }
        }
    }
}
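
The quoting rule used in the example above can be isolated into a small helper, which makes it easier to reuse and to test without a PDF file. The following is a minimal sketch using only the .NET base class library; the Csv class and the QuoteField and ToCsvLine methods are hypothetical names, and treating a null cell as an empty string is an assumption rather than documented Spire.PDF behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace CsvQuotingSketch
{
    public static class Csv
    {
        // Double any embedded double quotes, then wrap the whole field in double quotes
        public static string QuoteField(string field)
        {
            string value = field ?? string.Empty; // assumption: null becomes an empty cell
            return "\"" + value.Replace("\"", "\"\"") + "\"";
        }

        // Quote every field and join the results with commas
        public static string ToCsvLine(IEnumerable<string> fields)
        {
            return string.Join(",", fields.Select(f => QuoteField(f)));
        }

        static void Main(string[] args)
        {
            string[] row = { "Widget, large", "5\" screen", "19.99" };

            // Prints: "Widget, large","5"" screen","19.99"
            Console.WriteLine(ToCsvLine(row));
        }
    }
}
```

Because every field is quoted, commas and line breaks inside a cell no longer split it into extra columns or rows when the file is opened in a spreadsheet.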

When to Use This Method

  • Data analysis (import into Excel for calculations).
  • Migrating PDF tables to databases (e.g., SQL Server, PostgreSQL, MySQL).
  • Collaborating with teams that rely on spreadsheets.

Output: Parse PDF table data and export to a CSV file.

[Image: Extract table data from PDF to a CSV file]

Recommendation: Integrate with Spire.XLS for .NET to extract tables from PDF to Excel directly.


Conclusion

This guide has outlined three efficient methods for extracting tables from PDFs in C#. By leveraging the Spire.PDF for .NET library, you can automate the PDF table extraction process and export results to console, TXT, or CSV for further analysis. Whether you’re building a data pipeline, report generator, or business tool, these approaches streamline workflows, save time, and minimize human error.

Refer to the online documentation and obtain a free trial license here to explore more advanced PDF operations.


FAQs

Q1: Why use Spire.PDF for .NET to extract tables?

A: Spire.PDF provides a dedicated PdfTableExtractor class that detects tables based on visual cues (borders, spacing, and text alignment), simplifying the process of parsing structured data from PDFs.

Q2: Can Spire.PDF extract tables from scanned (image-based) PDFs?

A: No. The .NET PDF library works only with text-based PDFs (where text is selectable). For scanned PDFs, use Spire.OCR to extract text before parsing tables.

Q3: Can I extract tables from multiple PDFs at once?

A: Yes. To batch-process multiple PDFs, use Directory.GetFiles() to list all PDF files in a folder, then loop through each file and run the extraction logic. For example:

string[] pdfFiles = Directory.GetFiles(@"C:\Invoices\", "*.pdf");
foreach (string file in pdfFiles)
{
    // Run the extraction code for each file
}
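
In a batch run, one damaged PDF should not stop the remaining files from being processed. The following is a minimal sketch of that idea using only the .NET base class library. The PdfBatch class and the processFile callback are hypothetical names; the callback stands in for whatever extraction logic you run per file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PdfBatch
{
    // Runs processFile on every *.pdf file in the folder and returns the paths that failed.
    // processFile is a hypothetical callback that holds the per-file extraction logic.
    public static List<string> ProcessFolder(string folder, Action<string> processFile)
    {
        var failed = new List<string>();

        // EnumerateFiles yields paths lazily instead of building the whole array first.
        foreach (string file in Directory.EnumerateFiles(folder, "*.pdf"))
        {
            try
            {
                processFile(file);
            }
            catch (Exception ex)
            {
                // Record the failure and continue with the next file.
                Console.Error.WriteLine($"Skipped {file}: {ex.Message}");
                failed.Add(file);
            }
        }

        return failed;
    }
}
```

A call such as PdfBatch.ProcessFolder(@"C:\Invoices\", path => { /* extraction code */ }) returns the list of files that could not be processed, which can then be logged or retried.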

Q4: How can I improve performance when extracting tables from large PDFs?

A: For large PDFs (100+ pages), optimize performance by:

  • Processing pages in batches instead of loading the entire PDF at once.
  • Disposing of unused PdfTable or PdfDocument objects with using statements to free memory.
  • Skipping pages with no tables early (using if (tableList == null || tableList.Length == 0)).

Sometimes you may want to print a Word document on a custom paper size instead of a standard one. In this article, you will learn how to do this using Spire.Doc for .NET.

Install Spire.Doc for .NET

To begin with, you need to add the DLL files included in the Spire.Doc for .NET package as references in your .NET project. The DLL files can either be downloaded from this link or installed via NuGet.

  • Package Manager
PM> Install-Package Spire.Doc

Print Word on a Custom Paper Size

The table below shows a list of core classes, methods and properties utilized in this scenario.

Name | Description
Document Class | Represents a document model for Word.
PaperSize Class | Specifies the size of a piece of paper.
PrintDocument Class | Defines a reusable object that sends output to a printer, when printing from a Windows Forms application.
PrintDocument.DefaultPageSettings Property | Gets or sets page settings that are used as defaults for all pages to be printed.
Document.PrintDocument Property | Gets the PrintDocument object.
DefaultPageSettings.PaperSize Property | Sets the custom paper size.
Document.LoadFromFile() Method | Loads the sample document.
PrintDocument.Print() Method | Prints the document.

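The following are the steps to print Word on a custom paper size.

  • Create a Document object and load a Word file using the Document.LoadFromFile() method.
  • Get the PrintDocument object through the Document.PrintDocument property.
  • Set a custom paper size through the PrintDocument.DefaultPageSettings.PaperSize property.
  • Print the document using the PrintDocument.Print() method.

C#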
using Spire.Doc;
using System.Drawing.Printing;

namespace PrintWord
{
    class Program
    {
        static void Main(string[] args)
        {
            //Instantiate a Document object
            Document doc = new Document();

            //Load the document
            doc.LoadFromFile(@"Sample.docx");

            //Get the PrintDocument object
            PrintDocument printDoc = doc.PrintDocument;

            //Customize the paper size (width and height are in hundredths of an inch, so this is 9 x 8 inches)
            printDoc.DefaultPageSettings.PaperSize = new PaperSize("custom", 900, 800);

            //Print the document
            printDoc.Print();

        }
    }
}
VB.NET
Imports Spire.Doc
Imports System.Drawing.Printing

Namespace PrintWord
	Class Program
		Private Shared Sub Main(args As String())
			'Instantiate a Document object.
			Dim doc As New Document()

			'Load the document
			doc.LoadFromFile("Sample.docx")

			'Get the PrintDocument object
			Dim printDoc As PrintDocument = doc.PrintDocument

			'Customize the paper size (width and height are in hundredths of an inch, so this is 9 x 8 inches)
			printDoc.DefaultPageSettings.PaperSize = New PaperSize("custom", 900, 800)

			'Print the document
			printDoc.Print()

		End Sub
	End Class
End Namespace
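
The PaperSize constructor takes its width and height in hundredths of an inch, which is easy to get wrong when the target size is given in millimeters. The following is a minimal sketch of a conversion helper; the PaperUnits class and its method names are hypothetical and are not part of Spire.Doc or System.Drawing.

```csharp
using System;

public static class PaperUnits
{
    private const double MillimetersPerInch = 25.4;

    // Convert millimeters to hundredths of an inch, rounded to the nearest whole unit.
    public static int FromMillimeters(double millimeters)
    {
        return (int)Math.Round(millimeters / MillimetersPerInch * 100.0, MidpointRounding.AwayFromZero);
    }

    // Convert inches to hundredths of an inch, rounded to the nearest whole unit.
    public static int FromInches(double inches)
    {
        return (int)Math.Round(inches * 100.0, MidpointRounding.AwayFromZero);
    }
}
```

For example, a 148 mm x 210 mm sheet becomes new PaperSize("custom", PaperUnits.FromMillimeters(148), PaperUnits.FromMillimeters(210)), that is, 583 x 827 in the units the constructor expects.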

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

LaTeX is a powerful typesetting system for mathematical equations. It supports a wide range of mathematical symbols and notations, such as fractions, integrals, and matrices.

The Spire.Presentation API lets developers create mathematical equations from LaTeX code and add them to shapes on a PowerPoint slide. The following steps demonstrate how to do this using Spire.Presentation:

  • Create a Presentation instance.
  • Get the reference of a slide by using its index.
  • Use ShapeList.AppendShape method to add a shape to the first slide.
  • Use ParagraphCollection.AddParagraphFromLatexMathCode(string) method to create a mathematical equation from LaTeX code and add it to the shape.
  • Save the result document using Presentation.SaveToFile(string, FileFormat) method.

The following code shows how to add mathematical equations to PowerPoint in C#.

using Spire.Presentation;
using System.Drawing;

namespace MathEquations
{
    class Program
    {
        static void Main(string[] args)
        {            
            //The LaTeX codes
            string latexCode1 = @"x^{2} + \sqrt{x^{2}+1}=2";
            string latexCode2 = @"F(x) &= \int^a_b \frac{1}{3}x^3";
            string latexCode3 = @"\alpha + \beta  \geq \gamma";
            string latexCode4 = @"\overrightarrow{abc}";
            string latexCode5 = @"\begin{bmatrix} 1 & 0 & \cdots & 0\\ 1 & 0 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 1 & 0 & 0 & 0 \end{bmatrix}";
            string latexCode6 = @"\log_a{b}";

            //Create a Presentation instance
            Presentation ppt = new Presentation();

            //Get the first slide by using its index
            ISlide slide = ppt.Slides[0];

            //Add a shape to the slide
            IAutoShape shape = slide.Shapes.AppendShape(ShapeType.Rectangle, new RectangleF(30, 100, 200, 30));
            shape.TextFrame.Paragraphs.Clear();
            //Add a math equation to the shape using the LaTeX code 
            shape.TextFrame.Paragraphs.AddParagraphFromLatexMathCode(latexCode1);

            //Add a shape to the slide
            shape = slide.Shapes.AppendShape(ShapeType.Rectangle, new RectangleF(240, 100, 200, 40));
            shape.TextFrame.Paragraphs.Clear();
            //Add a math equation to the shape using the LaTeX code 
            shape.TextFrame.Paragraphs.AddParagraphFromLatexMathCode(latexCode2);

            //Add a shape to the slide
            shape = slide.Shapes.AppendShape(ShapeType.Rectangle, new RectangleF(30, 180, 200, 40));
            shape.TextFrame.Paragraphs.Clear();
            //Add a math equation to the shape using the LaTeX code 
            shape.TextFrame.Paragraphs.AddParagraphFromLatexMathCode(latexCode3);

            //Add a shape to the slide
            shape = slide.Shapes.AppendShape(ShapeType.Rectangle, new RectangleF(240, 180, 200, 40));
            shape.TextFrame.Paragraphs.Clear();
            //Add a math equation to the shape using the LaTeX code 
            shape.TextFrame.Paragraphs.AddParagraphFromLatexMathCode(latexCode4);

            //Add a shape to the slide
            shape = slide.Shapes.AppendShape(ShapeType.Rectangle, new RectangleF(30, 280, 200, 70));
            shape.TextFrame.Paragraphs.Clear();
            //Add a math equation to the shape using the LaTeX code 
            shape.TextFrame.Paragraphs.AddParagraphFromLatexMathCode(latexCode5);

            //Add a shape to the slide
            shape = slide.Shapes.AppendShape(ShapeType.Rectangle, new RectangleF(240, 280, 200, 40));
            shape.TextFrame.Paragraphs.Clear();
            //Add a math equation to the shape using the LaTeX code 
            shape.TextFrame.Paragraphs.AddParagraphFromLatexMathCode(latexCode6);

            //Remove the fill and the outline of each shape
            for (int i = 0; i < slide.Shapes.Count; i++)
            {
                slide.Shapes[i].Fill.FillType = Spire.Presentation.Drawing.FillFormatType.None;
                slide.Shapes[i].Line.FillType = Spire.Presentation.Drawing.FillFormatType.None;
            }

            //Save the result document
            ppt.SaveToFile("MathEquations.pptx", FileFormat.Pptx2013);
        }
    }
}

The following code shows how to add mathematical equations to PowerPoint in VB.NET.

Imports Spire.Presentation
Imports System.Drawing

Namespace MathEquations
    Friend Class Program
        Private Shared Sub Main(ByVal args As String())
            'The LaTeX codes
            Dim latexCode1 As String = "x^{2} + \sqrt{x^{2}+1}=2"
            Dim latexCode2 As String = "F(x) &= \int^a_b \frac{1}{3}x^3"
            Dim latexCode3 As String = "\alpha + \beta  \geq \gamma"
            Dim latexCode4 As String = "\overrightarrow{abc}"
            Dim latexCode5 As String = "\begin{bmatrix} 1 & 0 & \cdots & 0\\ 1 & 0 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 1 & 0 & 0 & 0 \end{bmatrix}"
            Dim latexCode6 As String = "\log_a{b}"

            'Create a Presentation instance
            Dim ppt As Presentation = New Presentation()

            'Get the first slide by using its index
            Dim slide As ISlide = ppt.Slides(0)

            'Add a shape to the slide
            Dim shape As IAutoShape = slide.Shapes.AppendShape(ShapeType.Rectangle, New RectangleF(30, 100, 200, 30))
            shape.TextFrame.Paragraphs.Clear()
            'Add a math equation to the shape using the LaTeX code 
            shape.TextFrame.Paragraphs.AddParagraphFromLatexMathCode(latexCode1)

            'Add a shape to the slide
            shape = slide.Shapes.AppendShape(ShapeType.Rectangle, New RectangleF(240, 100, 200, 40))
            shape.TextFrame.Paragraphs.Clear()
            'Add a math equation to the shape using the LaTeX code 
            shape.TextFrame.Paragraphs.AddParagraphFromLatexMathCode(latexCode2)

            'Add a shape to the slide
            shape = slide.Shapes.AppendShape(ShapeType.Rectangle, New RectangleF(30, 180, 200, 40))
            shape.TextFrame.Paragraphs.Clear()
            'Add a math equation to the shape using the LaTeX code 
            shape.TextFrame.Paragraphs.AddParagraphFromLatexMathCode(latexCode3)

            'Add a shape to the slide
            shape = slide.Shapes.AppendShape(ShapeType.Rectangle, New RectangleF(240, 180, 200, 40))
            shape.TextFrame.Paragraphs.Clear()
            'Add a math equation to the shape using the LaTeX code 
            shape.TextFrame.Paragraphs.AddParagraphFromLatexMathCode(latexCode4)

            'Add a shape to the slide
            shape = slide.Shapes.AppendShape(ShapeType.Rectangle, New RectangleF(30, 280, 200, 70))
            shape.TextFrame.Paragraphs.Clear()
            'Add a math equation to the shape using the LaTeX code 
            shape.TextFrame.Paragraphs.AddParagraphFromLatexMathCode(latexCode5)

            'Add a shape to the slide
            shape = slide.Shapes.AppendShape(ShapeType.Rectangle, New RectangleF(240, 280, 200, 40))
            shape.TextFrame.Paragraphs.Clear()
            'Add a math equation to the shape using the LaTeX code 
            shape.TextFrame.Paragraphs.AddParagraphFromLatexMathCode(latexCode6)

            'Remove the fill and the outline of each shape
            For i As Integer = 0 To slide.Shapes.Count - 1
                slide.Shapes(i).Fill.FillType = Spire.Presentation.Drawing.FillFormatType.None
                slide.Shapes(i).Line.FillType = Spire.Presentation.Drawing.FillFormatType.None
            Next

            'Save the result document
            ppt.SaveToFile("MathEquations.pptx", FileFormat.Pptx2013)
        End Sub
    End Class
End Namespace
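
The listings above type the position of each shape by hand. When there are many equations, the rectangles can be computed instead. The following is a minimal sketch of a two-column grid in C#; the EquationGrid class, its parameters, and the default spacing are hypothetical, and the third row lands at y = 260 rather than the 280 used above.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

public static class EquationGrid
{
    // Returns one rectangle per equation, filling a two-column grid row by row.
    // All default values are illustrative; adjust them to the slide layout.
    public static List<RectangleF> Layout(int count, float left = 30f, float top = 100f,
        float cellWidth = 200f, float cellHeight = 40f, float columnGap = 10f, float rowGap = 40f)
    {
        var cells = new List<RectangleF>();
        for (int i = 0; i < count; i++)
        {
            int column = i % 2; // 0 = left column, 1 = right column
            int row = i / 2;    // integer division gives the row index
            float x = left + column * (cellWidth + columnGap);
            float y = top + row * (cellHeight + rowGap);
            cells.Add(new RectangleF(x, y, cellWidth, cellHeight));
        }
        return cells;
    }
}
```

Each rectangle returned by EquationGrid.Layout(6) can then be passed to slide.Shapes.AppendShape(ShapeType.Rectangle, rect), followed by the same Paragraphs.Clear() and AddParagraphFromLatexMathCode() calls used in the listings above.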

The following is the output document after adding mathematical equations:

Add Math Equations to PowerPoint using LaTeX Code in C#, VB.NET


Spire.Presentation for .NET lets you replace text that matches a regular expression using the ReplaceTextWithRegex method of the IShape interface. The ReplaceTextWithRegex method accepts the following parameters:

Regex: the regular expression used to search for text.

string: the replacement text.

The following example demonstrates how to replace text with regular expression in a PowerPoint document using Spire.Presentation for .NET.

C#
using Spire.Presentation;
using System.Text.RegularExpressions;

namespace ReplaceTextWithRegex
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Presentation instance
            Presentation ppt = new Presentation();
            //Load a sample document
            ppt.LoadFromFile("Sample.pptx");

            //Get the first slide
            ISlide slide = ppt.Slides[0];

            //Replace "ABC" and everything after it up to the end of the line with "ABC DEF"
            Regex regex = new Regex("ABC.*");
            string newvalue = "ABC DEF";
            foreach (IShape shape in slide.Shapes)
            {
                shape.ReplaceTextWithRegex(regex, newvalue);
            }

            //Save the result document
            ppt.SaveToFile("ReplaceTextWithRegex.pptx", FileFormat.Pptx2013);
        }
    }
}
VB.NET
Imports Spire.Presentation
Imports System.Text.RegularExpressions

Namespace ReplaceTextWithRegex
    Friend Class Program
        Private Shared Sub Main(ByVal args As String())
            'Create a Presentation instance
            Dim ppt As Presentation = New Presentation()
            'Load the sample document
            ppt.LoadFromFile("Sample.pptx")

            'Get the first slide
            Dim slide As ISlide = ppt.Slides(0)

            'Replace "ABC" and everything after it up to the end of the line with "ABC DEF"
            Dim regex As Regex = New Regex("ABC.*")
            Dim newvalue As String = "ABC DEF"

            For Each shape As IShape In slide.Shapes
                shape.ReplaceTextWithRegex(regex, newvalue)
            Next

            'Save the result document
            ppt.SaveToFile("ReplaceTextWithRegex.pptx", FileFormat.Pptx2013)
        End Sub
    End Class
End Namespace
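
Before running the replacement against a presentation, it can help to preview what the pattern matches on plain text. The following is a minimal sketch in C# that uses only System.Text.RegularExpressions. It assumes ReplaceTextWithRegex applies the pattern the way Regex.Replace does, which the example above does not confirm; the RegexPreview class is a hypothetical name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexPreview
{
    // Apply the same pattern and replacement text as the example above to a plain string.
    public static string Apply(string input)
    {
        // "." does not match a line feed by default, so ".*" stops at the end of the line.
        Regex regex = new Regex("ABC.*");
        return regex.Replace(input, "ABC DEF");
    }
}
```

For instance, RegexPreview.Apply("Code: ABC 123\nNext line") returns "Code: ABC DEF\nNext line": the text after "ABC" on the first line is replaced, and the second line is left unchanged.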

The input PowerPoint document:

Replace Text with Regular Expression (Regex) in PowerPoint in C#, VB.NET

The output PowerPoint document:

Replace Text with Regular Expression (Regex) in PowerPoint in C#, VB.NET
