Java: Export Table Data from PDF to Excel
Tables are commonly seen in PDF invoices and financial reports. You may need to export PDF table data to Excel so that you can analyze it with the tools MS Excel provides. This article explains how to extract tables from a PDF page and export each of them as an individual Excel worksheet using Spire.Office for Java.
Install Spire.Office for Java
The scenario actually uses Spire.PDF for Java to extract tables from the PDF, and Spire.XLS for Java to generate the Excel file. To use both in the same project, add the Spire.Office.jar file as a dependency in your Java program.
The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project’s pom.xml file.
<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.office</artifactId>
        <version>10.10.0</version>
    </dependency>
</dependencies>
Export Table Data from PDF to Excel
The following are the main steps to extract all tables from a certain page and save each of them as an individual worksheet in an Excel document.
- Load a sample PDF document while initializing the PdfDocument object.
- Create a PdfTableExtractor object, and call its extractTable(int pageIndex) method to extract all tables on the first page.
- Create a Workbook instance.
- Loop through the tables in the PdfTable[] array, and get the specific one by its index.
- Add a worksheet to the workbook using the Workbook.getWorksheets().add() method.
- Loop through the cells in the PDF table, and get the value of each cell using the PdfTable.getText(int rowIndex, int columnIndex) method. Then insert the value into the worksheet using the Worksheet.get(int row, int column).setText(String string) method.
- Save the workbook to an Excel document using Workbook.saveToFile() method.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.utilities.PdfTable;
import com.spire.pdf.utilities.PdfTableExtractor;
import com.spire.xls.ExcelVersion;
import com.spire.xls.Workbook;
import com.spire.xls.Worksheet;

public class ExtractTableDataAndSaveInExcel {
    public static void main(String[] args) {
        //Load a sample PDF document
        PdfDocument pdf = new PdfDocument("C:\\Users\\Administrator\\Desktop\\Tables.pdf");
        //Create a PdfTableExtractor instance
        PdfTableExtractor extractor = new PdfTableExtractor(pdf);
        //Extract tables from the first page
        PdfTable[] pdfTables = extractor.extractTable(0);
        //Create a Workbook object
        Workbook wb = new Workbook();
        //Remove default worksheets
        wb.getWorksheets().clear();
        //If any tables are found
        if (pdfTables != null && pdfTables.length > 0) {
            //Loop through the tables
            for (int tableNum = 0; tableNum < pdfTables.length; tableNum++) {
                //Add a worksheet to workbook
                String sheetName = String.format("Table - %d", tableNum + 1);
                Worksheet sheet = wb.getWorksheets().add(sheetName);
                //Loop through the rows in the current table
                for (int rowNum = 0; rowNum < pdfTables[tableNum].getRowCount(); rowNum++) {
                    //Loop through the columns in the current table
                    for (int colNum = 0; colNum < pdfTables[tableNum].getColumnCount(); colNum++) {
                        //Extract data from the current table cell
                        String text = pdfTables[tableNum].getText(rowNum, colNum);
                        //Insert data into a specific cell
                        sheet.get(rowNum + 1, colNum + 1).setText(text);
                    }
                }
                //Auto fit column width
                for (int sheetColNum = 0; sheetColNum < sheet.getColumns().length; sheetColNum++) {
                    sheet.autoFitColumn(sheetColNum + 1);
                }
            }
        }
        //Save the workbook to an Excel file
        wb.saveToFile("output/ExportTableToExcel.xlsx", ExcelVersion.Version2016);
    }
}
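The + 1 offsets in the listing above suggest that PdfTable.getText() takes 0-based indices while Worksheet.get() expects 1-based row and column numbers. The following stdlib-only sketch illustrates that mapping, with a plain String[][] standing in for an extracted PdfTable. The class CellMappingSketch and its helpers toA1, mapCells, and sheetName are hypothetical names used only for illustration; they are not part of Spire.Office.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: a String[][] stands in for an extracted PdfTable.
class CellMappingSketch {

    // Convert a 1-based row/column pair to an A1-style address, e.g. (2, 2) -> "B2".
    static String toA1(int row, int col) {
        StringBuilder letters = new StringBuilder();
        for (int c = col; c > 0; c = (c - 1) / 26) {
            letters.insert(0, (char) ('A' + (c - 1) % 26));
        }
        return letters.toString() + row;
    }

    // Map every 0-based table cell to the 1-based worksheet address it would occupy.
    static Map<String, String> mapCells(String[][] table) {
        Map<String, String> cells = new LinkedHashMap<>();
        for (int r = 0; r < table.length; r++) {
            for (int c = 0; c < table[r].length; c++) {
                cells.put(toA1(r + 1, c + 1), table[r][c]);
            }
        }
        return cells;
    }

    // Same naming scheme as the listing: table index 0 becomes "Table - 1".
    static String sheetName(int tableIndex) {
        return String.format("Table - %d", tableIndex + 1);
    }

    public static void main(String[] args) {
        String[][] table = {{"ID", "Name"}, {"1", "David"}};
        System.out.println(sheetName(0) + " " + mapCells(table));
    }
}
```

Keeping the offset in one place makes an off-by-one mistake easier to spot when the extraction loop is adapted to other layouts.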

Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
Java: Extract Table Data from PDF Document
Tables are among the most commonly used formatting elements in PDF documents. In some cases, you may need to extract data from PDF tables for further analysis. In this article, you will learn how to achieve this task programmatically in Java using Spire.PDF for Java.
Install Spire.PDF for Java
First of all, you're required to add the Spire.PDF.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file.
<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf</artifactId>
        <version>11.11.11</version>
    </dependency>
</dependencies>
Extract Table Data from PDF Document
Spire.PDF for Java uses the PdfTableExtractor.extractTable(int pageIndex) method to detect and extract tables from a desired PDF page.
The following are the steps to extract table data from a PDF file:
- Load a sample PDF document using PdfDocument class.
- Create a StringBuilder instance and a PdfTableExtractor instance.
- Loop through the pages in the PDF, and extract the tables on each page into a PdfTable array using the PdfTableExtractor.extractTable(int pageIndex) method.
- Loop through the tables in the array.
- Loop through the rows and columns in each table, extract the data from each cell using the PdfTable.getText(int rowIndex, int columnIndex) method, and append it to the StringBuilder instance using the StringBuilder.append() method.
- Write the extracted data to a .txt document using the Writer.write() method.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.utilities.PdfTable;
import com.spire.pdf.utilities.PdfTableExtractor;
import java.io.FileWriter;

public class ExtractTableData {
    public static void main(String[] args) throws Exception {
        //Load a sample PDF document
        PdfDocument pdf = new PdfDocument("Sample.pdf");
        //Create a StringBuilder instance
        StringBuilder builder = new StringBuilder();
        //Create a PdfTableExtractor instance
        PdfTableExtractor extractor = new PdfTableExtractor(pdf);
        //Loop through the pages in the PDF
        for (int pageIndex = 0; pageIndex < pdf.getPages().getCount(); pageIndex++) {
            //Extract tables from the current page into a PdfTable array
            PdfTable[] tableLists = extractor.extractTable(pageIndex);
            //If any tables are found
            if (tableLists != null && tableLists.length > 0) {
                //Loop through the tables in the array
                for (PdfTable table : tableLists) {
                    //Loop through the rows in the current table
                    for (int i = 0; i < table.getRowCount(); i++) {
                        //Loop through the columns in the current table
                        for (int j = 0; j < table.getColumnCount(); j++) {
                            //Extract data from the current table cell and append to the StringBuilder
                            String text = table.getText(i, j);
                            builder.append(text).append(" | ");
                        }
                        builder.append("\r\n");
                    }
                }
            }
        }
        //Write data into a .txt document; try-with-resources closes the writer even if write() fails
        try (FileWriter fw = new FileWriter("ExtractTable.txt")) {
            fw.write(builder.toString());
        }
    }
}
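The listing appends " | " after every cell, so each output line ends with a trailing separator. If that is unwanted, the row formatting can be isolated in a small helper. The sketch below is a stdlib-only illustration in which a List<String[]> stands in for the cell text returned by PdfTable.getText(). The class RowFormatSketch and its methods formatRow and formatTable are hypothetical names, not part of Spire.PDF.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: each String[] stands in for the cell texts of one table row.
class RowFormatSketch {

    // Join the cells of one row with " | " between them (no trailing separator).
    static String formatRow(String[] cells) {
        return String.join(" | ", cells);
    }

    // Format all rows, ending each line with "\r\n" as the listing does.
    static String formatTable(List<String[]> rows) {
        StringBuilder builder = new StringBuilder();
        for (String[] row : rows) {
            builder.append(formatRow(row)).append("\r\n");
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[]{"ID", "Name"},
                new String[]{"1", "David"});
        System.out.print(formatTable(rows));
    }
}
```

In the real loop, the cells of a row could be collected into an array first and then passed to a helper like formatRow before being appended to the StringBuilder.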
Java: Create Tables in a PDF Document
A table presents information or data in horizontal rows and vertical columns. A table is often more efficient than describing the data in paragraph text, especially when the data is numerical or voluminous, and the tabular layout makes the data easier to read and understand. In this article, you will learn how to create tables in a PDF document in Java using Spire.PDF for Java.
Spire.PDF for Java offers the PdfTable and PdfGrid classes for working with tables in a PDF document. The PdfTable class is used to quickly create simple, regular tables without much formatting, while the PdfGrid class is used to create more complex tables.
The table below lists the differences between these two classes.
| | PdfTable | PdfGrid |
| Formatting | | |
| Row | Can be set through events. No API support. | Can be set through API. |
| Column | Can be set through API. | Can be set through API. |
| Cell | Can be set through events. No API support. | Can be set through API. |
| Others | | |
| Column span | Not supported. | Can be set through API. |
| Row span | Can be set through events. No API support. | Can be set through API. |
| Nested table | Can be set through events. No API support. | Can be set through API. |
| Events | BeginCellLayout, EndCellLayout, BeginRowLayout, EndRowLayout, BeginPageLayout, EndPageLayout. | BeginPageLayout, EndPageLayout. |
The following sections demonstrate how to create a table in PDF using the PdfTable class and the PdfGrid class, respectively.
Install Spire.PDF for Java
First of all, you're required to add the Spire.PDF.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file.
<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf</artifactId>
        <version>11.11.11</version>
    </dependency>
</dependencies>
Create a Table in PDF Using PdfTable Class
The following are the steps to create a table with the PdfTable class in Spire.PDF for Java.
- Create a PdfDocument object.
- Add a page to it using PdfDocument.getPages().add() method.
- Create a PdfTable object.
- Set the table style using the methods of the PdfTableStyle object returned by the PdfTable.getStyle() method.
- Insert data into the table using the PdfTable.setDataSource() method.
- Set the row height and row color through the BeginRowLayout event.
- Draw the table on the PDF page using the PdfTable.draw() method.
- Save the document to a PDF file using PdfDocument.saveToFile() method.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;
import com.spire.pdf.PdfPageSize;
import com.spire.pdf.graphics.*;
import com.spire.pdf.tables.*;
import java.awt.*;
import java.awt.geom.Point2D;

public class CreateTable {
    public static void main(String[] args) {
        //Create a PdfDocument object
        PdfDocument doc = new PdfDocument();
        //Add a page
        PdfPageBase page = doc.getPages().add(PdfPageSize.A4, new PdfMargins(40));
        //Create a PdfTable object
        PdfTable table = new PdfTable();
        //Set font for header and the rest cells
        table.getStyle().getDefaultStyle().setFont(new PdfTrueTypeFont(new Font("Times New Roman", Font.PLAIN, 12), true));
        table.getStyle().getHeaderStyle().setFont(new PdfTrueTypeFont(new Font("Times New Roman", Font.BOLD, 12), true));
        //Define data
        String[] data = {"ID;Name;Department;Position;Level",
                "1; David; IT; Manager; 1",
                "3; Julia; HR; Manager; 1",
                "4; Sophie; Marketing; Manager; 1",
                "7; Wickey; Marketing; Sales Rep; 2",
                "9; Wayne; HR; HR Supervisor; 2",
                "11; Mia; Dev; Developer; 2"};
        String[][] dataSource = new String[data.length][];
        for (int i = 0; i < data.length; i++) {
            dataSource[i] = data[i].split("[;]", -1);
        }
        //Set data as the table data
        table.setDataSource(dataSource);
        //Set the first row as header row
        table.getStyle().setHeaderSource(PdfHeaderSource.Rows);
        table.getStyle().setHeaderRowCount(1);
        //Show header (the header is hidden by default)
        table.getStyle().setShowHeader(true);
        //Set font color and background color of header row
        table.getStyle().getHeaderStyle().setBackgroundBrush(PdfBrushes.getGray());
        table.getStyle().getHeaderStyle().setTextBrush(PdfBrushes.getWhite());
        //Set text alignment in header row
        table.getStyle().getHeaderStyle().setStringFormat(new PdfStringFormat(PdfTextAlignment.Center, PdfVerticalAlignment.Middle));
        //Set text alignment in other cells
        for (int i = 0; i < table.getColumns().getCount(); i++) {
            table.getColumns().get(i).setStringFormat(new PdfStringFormat(PdfTextAlignment.Center, PdfVerticalAlignment.Middle));
        }
        //Register with BeginRowLayout event
        table.beginRowLayout.add(new BeginRowLayoutEventHandler() {
            public void invoke(Object sender, BeginRowLayoutEventArgs args) {
                Table_BeginRowLayout(sender, args);
            }
        });
        //Draw table on the page
        table.draw(page, new Point2D.Float(0, 30));
        //Save the document to a PDF file
        doc.saveToFile("output/PdfTable.pdf");
    }

    //Event handler
    private static void Table_BeginRowLayout(Object sender, BeginRowLayoutEventArgs args) {
        //Set row height
        args.setMinimalHeight(20f);
        //Alternate color of rows except the header row
        if (args.getRowIndex() == 0) {
            return;
        }
        if (args.getRowIndex() % 2 == 0) {
            args.getCellStyle().setBackgroundBrush(PdfBrushes.getLightGray());
        } else {
            args.getCellStyle().setBackgroundBrush(PdfBrushes.getWhite());
        }
    }
}
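One detail of the data preparation in the listing is easy to miss: split("[;]", -1) keeps trailing empty fields, and it does not trim the spaces that follow each semicolon in the sample data, so a value such as " David" keeps its leading space. The stdlib-only sketch below shows one way to trim the fields while keeping the same limit. The class DataSourceSketch and its parse helper are hypothetical names used only for illustration; they are not part of Spire.PDF.

```java
import java.util.Arrays;

// Illustration only: prepares a String[][] like the one passed to setDataSource() in the listing.
class DataSourceSketch {

    // Split each "a; b; c" line into trimmed fields; the -1 limit keeps trailing empty fields.
    static String[][] parse(String[] lines) {
        String[][] rows = new String[lines.length][];
        for (int i = 0; i < lines.length; i++) {
            String[] fields = lines[i].split(";", -1);
            for (int j = 0; j < fields.length; j++) {
                fields[j] = fields[j].trim();
            }
            rows[i] = fields;
        }
        return rows;
    }

    public static void main(String[] args) {
        String[][] rows = parse(new String[]{"ID;Name;Department", "1; David; IT"});
        System.out.println(Arrays.deepToString(rows));
    }
}
```

Whether trimming matters depends on the alignment in use: with centered text the extra space is barely visible, but with left alignment it shifts the cell content.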

Create a Table in PDF Using PdfGrid Class
Below are the steps to create a table in PDF with the PdfGrid class in Spire.PDF for Java.
- Create a PdfDocument object.
- Add a page to it using PdfDocument.getPages().add() method.
- Create a PdfGrid object.
- Set the table style using the methods of the PdfGridStyle object returned by the PdfGrid.getStyle() method.
- Add rows and columns to the table using the PdfGrid.getRows().add() and PdfGrid.getColumns().add() methods.
- Insert data into specific cells using the PdfGridCell.setValue() method.
- Span cells across columns or rows using the PdfGridCell.setColumnSpan() or PdfGridCell.setRowSpan() method.
- Set the formatting of a specific cell using the PdfGridCell.setStringFormat() method and the methods of the PdfGridCellStyle object.
- Draw the table on the PDF page using the PdfGrid.draw() method.
- Save the document to a PDF file using PdfDocument.saveToFile() method.
import com.spire.pdf.*;
import com.spire.pdf.graphics.*;
import com.spire.pdf.grid.PdfGrid;
import com.spire.pdf.grid.PdfGridRow;
import java.awt.*;
import java.awt.geom.Point2D;

public class CreateGrid {
    public static void main(String[] args) {
        //Create a PdfDocument object
        PdfDocument doc = new PdfDocument();
        //Add a page
        PdfPageBase page = doc.getPages().add(PdfPageSize.A4, new PdfMargins(40));
        //Create a PdfGrid
        PdfGrid grid = new PdfGrid();
        //Set cell padding
        grid.getStyle().setCellPadding(new PdfPaddings(1, 1, 1, 1));
        //Set font
        grid.getStyle().setFont(new PdfTrueTypeFont(new Font("Times New Roman", Font.PLAIN, 13), true));
        //Add rows and columns
        PdfGridRow row1 = grid.getRows().add();
        PdfGridRow row2 = grid.getRows().add();
        PdfGridRow row3 = grid.getRows().add();
        PdfGridRow row4 = grid.getRows().add();
        grid.getColumns().add(4);
        //Set column width
        for (int i = 0; i < grid.getColumns().getCount(); i++) {
            grid.getColumns().get(i).setWidth(120);
        }
        //Write data into specific cells
        row1.getCells().get(0).setValue("Order and Payment Status");
        row2.getCells().get(0).setValue("Order number");
        row2.getCells().get(1).setValue("Date");
        row2.getCells().get(2).setValue("Customer");
        row2.getCells().get(3).setValue("Paid or not");
        row3.getCells().get(0).setValue("00223");
        row3.getCells().get(1).setValue("2022/06/02");
        row3.getCells().get(2).setValue("Brick Lane Realty");
        row3.getCells().get(3).setValue("Yes");
        row4.getCells().get(0).setValue("00224");
        row4.getCells().get(1).setValue("2022/06/03");
        row4.getCells().get(3).setValue("No");
        //Span cell across columns
        row1.getCells().get(0).setColumnSpan(4);
        //Span cell across rows
        row3.getCells().get(2).setRowSpan(2);
        //Set text alignment of specific cells
        row1.getCells().get(0).setStringFormat(new PdfStringFormat(PdfTextAlignment.Center));
        row3.getCells().get(2).setStringFormat(new PdfStringFormat(PdfTextAlignment.Left, PdfVerticalAlignment.Middle));
        //Set background color of specific cells
        row1.getCells().get(0).getStyle().setBackgroundBrush(PdfBrushes.getOrange());
        row4.getCells().get(3).getStyle().setBackgroundBrush(PdfBrushes.getLightGray());
        //Format cell border
        PdfBorders borders = new PdfBorders();
        borders.setAll(new PdfPen(new PdfRGBColor(Color.ORANGE), 0.8f));
        //Iterate over the rows that actually exist (row count, not collection capacity)
        for (int i = 0; i < grid.getRows().getCount(); i++) {
            PdfGridRow gridRow = grid.getRows().get(i);
            gridRow.setHeight(20f);
            for (int j = 0; j < gridRow.getCells().getCount(); j++) {
                gridRow.getCells().get(j).getStyle().setBorders(borders);
            }
        }
        //Draw table on the page
        grid.draw(page, new Point2D.Float(0, 30));
        //Save the document to a PDF file
        doc.saveToFile("output/PdfGrid.pdf");
    }
}
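The listing sets no value for the third cell of row4, presumably because the cell above it spans two rows and covers that position. The sketch below is a plain-Java model of that bookkeeping, not Spire.PDF's layout code: given a spanning cell's origin and its row and column span, it lists the grid positions the span covers. The class SpanSketch and its coveredCells method are hypothetical names used only for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Illustration only: models which grid positions a spanning cell covers.
class SpanSketch {

    // Return "row,col" keys (0-based) of the cells hidden by a span that starts at (row, col),
    // excluding the origin cell itself.
    static Set<String> coveredCells(int row, int col, int rowSpan, int colSpan) {
        Set<String> covered = new LinkedHashSet<>();
        for (int r = row; r < row + rowSpan; r++) {
            for (int c = col; c < col + colSpan; c++) {
                if (r != row || c != col) {
                    covered.add(r + "," + c);
                }
            }
        }
        return covered;
    }

    public static void main(String[] args) {
        // Title cell: row 0, column 0, spanning 4 columns (row1 in the listing).
        System.out.println(coveredCells(0, 0, 1, 4));
        // Customer cell: row 2, column 2, spanning 2 rows (row3 in the listing).
        System.out.println(coveredCells(2, 2, 2, 1));
    }
}
```

A check like this can help decide which cells need values when a grid is filled from data rather than by hand.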
