Word documents often contain valuable data in tables, which can be used for reporting, data analysis, and record-keeping. Manually copying these tables into other formats is time-consuming and error-prone. Automating the process with Python saves time and keeps the results accurate and consistent. Spire.Doc for Python can read the tables in a Word document so that their data can be written to more accessible and manageable files. This article demonstrates how to use Spire.Doc for Python to extract tables from Word documents and write them into text files and Excel worksheets.
- Extract Tables from Word Documents to Text Files with Python
- Extract Tables from Word Documents to Excel Workbooks with Python
Install Spire.Doc for Python
This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be installed on Windows with the following pip command.
pip install Spire.Doc
If you are unsure how to install it, please refer to: How to Install Spire.Doc for Python on Windows
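To confirm that the installation succeeded before running the scripts in this article, a quick check along the following lines may help. It is only a sketch: `installed_version` is a hypothetical helper name, and the distribution names are assumed to match the names used with pip.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Distribution names are assumed to match the names used with pip
for name in ("Spire.Doc", "plum-dispatch"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If either line reports "not installed", rerun the pip command above in the same Python environment that will run the scripts.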
Extract Tables from Word Documents to Text Files with Python
Spire.Doc for Python offers the Section.Tables property to retrieve a collection of tables within a section of a Word document. Then, developers can use the properties and methods under the ITable class to access the data in the tables and write it into a text file. This provides a convenient solution for converting Word document tables into text files.
The detailed steps for extracting tables from Word documents to text files are as follows:
- Create an object of Document class and load a Word document using Document.LoadFromFile() method.
- Iterate through the sections in the document and get the table collection of each section through Section.Tables property.
- Iterate through the tables and create a string object for each table.
- Iterate through the rows in each table and the cells in each row, get the text of each cell through TableCell.Paragraphs[].Text property, and add the cell text to the string.
- Save each string to a text file.
- Python
from spire.doc import *
from spire.doc.common import *
# Create an instance of Document
doc = Document()
# Load a Word document
doc.LoadFromFile("Sample.docx")
# Loop through the sections
for s in range(doc.Sections.Count):
    # Get a section
    section = doc.Sections.get_Item(s)
    # Get the tables in the section
    tables = section.Tables
    # Loop through the tables
    for i in range(0, tables.Count):
        # Get a table
        table = tables.get_Item(i)
        # Initialize a string to store the table data
        tableData = ''
        # Loop through the rows of the table
        for j in range(0, table.Rows.Count):
            # Loop through the cells of the row
            for k in range(0, table.Rows.get_Item(j).Cells.Count):
                # Get a cell
                cell = table.Rows.get_Item(j).Cells.get_Item(k)
                # Get the text in the cell
                cellText = ''
                for para in range(cell.Paragraphs.Count):
                    paragraphText = cell.Paragraphs.get_Item(para).Text
                    cellText += (paragraphText + ' ')
                # Add the text to the string
                tableData += cellText
                if k < table.Rows.get_Item(j).Cells.Count - 1:
                    tableData += '\t'
            # Add a new line
            tableData += '\n'
        # Save the table data to a text file
        with open(f'output/Tables/WordTable_{s+1}_{i+1}.txt', 'w', encoding='utf-8') as f:
            f.write(tableData)
doc.Close()
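The nested loops above can also be split into two small helpers, which makes the extraction easier to reuse and test. The following is a minimal sketch rather than part of the Spire.Doc API: `table_to_rows` and `rows_to_tsv` are hypothetical names, and the code assumes only the object model already used in the script above (collections exposing `Count` and `get_Item()`, paragraphs exposing `Text`). It joins paragraphs with single spaces so that no trailing space is left in each cell, and it uses Python's `csv` module so that a cell containing a tab or a line break is quoted instead of breaking the column layout.

```python
import csv
import io


def table_to_rows(table):
    """Collect the cell text of a table as a list of rows (lists of strings).

    Assumes the object model used in the script above: Rows, Cells and
    Paragraphs collections exposing Count and get_Item(), and paragraphs
    exposing Text.
    """
    rows = []
    for j in range(table.Rows.Count):
        row = table.Rows.get_Item(j)
        cells = []
        for k in range(row.Cells.Count):
            cell = row.Cells.get_Item(k)
            paragraphs = [cell.Paragraphs.get_Item(p).Text
                          for p in range(cell.Paragraphs.Count)]
            # Join with single spaces so no trailing space is left behind
            cells.append(' '.join(paragraphs))
        rows.append(cells)
    return rows


def rows_to_tsv(rows):
    """Serialize rows as tab-separated text.

    Cells that contain a tab or a line break are quoted by the csv module.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter='\t', lineterminator='\n')
    writer.writerows(rows)
    return buffer.getvalue()
```

Inside the table loop, `f.write(rows_to_tsv(table_to_rows(table)))` would then replace the manual string building. Note that `open()` does not create missing folders, so calling `os.makedirs('output/Tables', exist_ok=True)` before writing avoids a `FileNotFoundError` when the output directory does not exist yet.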

Extract Tables from Word Documents to Excel Workbooks with Python
Developers can also utilize Spire.Doc for Python to retrieve table data and then use Spire.XLS for Python to write the table data into an Excel worksheet, thereby enabling the conversion of Word document tables into Excel workbooks.
Install Spire.XLS for Python via PyPI:
pip install Spire.XLS
The detailed steps for extracting tables from Word documents to Excel workbooks are as follows:
- Create an object of Document class and load a Word document using Document.LoadFromFile() method.
- Create an object of Workbook class and clear the default worksheets using Workbook.Worksheets.Clear() method.
- Iterate through the sections in the document and get the table collection of each section through Section.Tables property.
- Iterate through the tables and create a worksheet for each table using Workbook.Worksheets.Add() method.
- Iterate through the rows in each table and the cells in each row, get the text of each cell through TableCell.Paragraphs[].Text property, and write the text to the worksheet using Worksheet.SetCellValue() method.
- Save the workbook using Workbook.SaveToFile() method.
- Python
from spire.doc import *
from spire.doc.common import *
from spire.xls import *
from spire.xls.common import *
# Create an instance of Document
doc = Document()
# Load a Word document
doc.LoadFromFile('Sample.docx')
# Create an instance of Workbook
wb = Workbook()
# Remove the default worksheets
wb.Worksheets.Clear()
# Loop through sections in the document
for i in range(doc.Sections.Count):
    # Get a section
    section = doc.Sections.get_Item(i)
    # Loop through tables in the section
    for j in range(section.Tables.Count):
        # Get a table
        table = section.Tables.get_Item(j)
        # Create a worksheet
        ws = wb.Worksheets.Add(f'Table_{i+1}_{j+1}')
        # Write the table to the worksheet
        for row in range(table.Rows.Count):
            # Get a row
            tableRow = table.Rows.get_Item(row)
            # Loop through cells in the row
            for cell in range(tableRow.Cells.Count):
                # Get a cell
                tableCell = tableRow.Cells.get_Item(cell)
                # Get the text in the cell
                cellText = ''
                for p in range(tableCell.Paragraphs.Count):
                    paragraph = tableCell.Paragraphs.get_Item(p)
                    cellText = cellText + (paragraph.Text + ' ')
                # Write the cell text to the worksheet (row and column indexes start at 1)
                ws.SetCellValue(row + 1, cell + 1, cellText)
# Save the workbook
wb.SaveToFile('output/Tables/WordTableToExcel.xlsx', FileFormat.Version2016)
doc.Close()
wb.Dispose()
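The writing step can be isolated in the same way. The sketch below assumes only what the script above already relies on, namely a worksheet object exposing `SetCellValue(row, column, text)` with indexes that start at 1. `write_rows` is a hypothetical helper name, not a Spire.XLS function. It takes table data that has already been collected as a list of rows and strips the surrounding whitespace from each cell before writing it.

```python
def write_rows(worksheet, rows, strip=True):
    """Write a list of rows into a worksheet, one cell per value.

    Assumes a worksheet exposing SetCellValue(row, column, text) with
    1-based indexes, as used in the script above.
    Returns the number of cells written.
    """
    written = 0
    for r, row in enumerate(rows, start=1):
        for c, text in enumerate(row, start=1):
            # Optionally drop leading/trailing whitespace such as the
            # space appended after each paragraph
            worksheet.SetCellValue(r, c, text.strip() if strip else text)
            written += 1
    return written
```

With such a helper, the body of the table loop reduces to creating the worksheet and calling `write_rows(ws, rows)`, where `rows` holds the cell text gathered from the Word table.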

Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents or lift the functional limitations, please request a 30-day trial license.
