Tables are a powerful formatting tool in Word, allowing you to organize and present data effectively. However, the default table borders may not always align with your document's style and purpose. By selectively changing or removing the borders, you can achieve a variety of visual effects to suit your requirements. In this article, we will explore how to change and remove borders for tables in Word documents in Python using Spire.Doc for Python.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python. It can be easily installed on Windows through the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows

Change Borders for a Table in Word in Python

Spire.Doc for Python empowers you to retrieve the borders collection of a table by using the Table.TableFormat.Borders property. Once retrieved, you can access individual borders (like top border, bottom border, left border, right border, horizontal border, and vertical border) from the collection and then modify them by adjusting their line style, width, and color. The detailed steps are as follows.

  • Create an object of the Document class.
  • Load a Word document using Document.LoadFromFile() method.
  • Get a specific section using Document.Sections[index] property.
  • Get a specific table using Section.Tables[index] property.
  • Get the borders collection of the table using Table.TableFormat.Borders property.
  • Get an individual border, such as the top border from the borders collection using Borders.Top property, and then change its line style, width and color.
  • Refer to the above step to get other individual borders from the borders collection, and then change their line style, width and color.
  • Save the resulting document using Document.SaveToFile() method.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create an object of the Document class
document = Document()
# Load a Word document
document.LoadFromFile("Table.docx")

# Get the first section of the document
section = document.Sections[0]

# Get the first table in the section
table = section.Tables[0] if isinstance(section.Tables[0], Table) else None

# Get the collection of the borders
borders = table.TableFormat.Borders

# Get the top border and change border style, line width, and color
topBorder = borders.Top
topBorder.BorderType = BorderStyle.Single
topBorder.LineWidth = 1.0
topBorder.Color = Color.get_YellowGreen()

# Get the left border and change border style, line width, and color
leftBorder = borders.Left
leftBorder.BorderType = BorderStyle.Single
leftBorder.LineWidth = 1.0
leftBorder.Color = Color.get_YellowGreen()

# Get the right border and change border style, line width, and color
rightBorder = borders.Right
rightBorder.BorderType = BorderStyle.Single
rightBorder.LineWidth = 1.0
rightBorder.Color = Color.get_YellowGreen()

# Get the bottom border and change border style, line width, and color
bottomBorder = borders.Bottom
bottomBorder.BorderType = BorderStyle.Single
bottomBorder.LineWidth = 1.0
bottomBorder.Color = Color.get_YellowGreen()

# Get the horizontal border and change border style, line width, and color
horizontalBorder = borders.Horizontal
horizontalBorder.BorderType = BorderStyle.Dot
horizontalBorder.LineWidth = 1.0
horizontalBorder.Color = Color.get_Orange()

# Get the vertical border and change border style, line width, and color
verticalBorder = borders.Vertical
verticalBorder.BorderType = BorderStyle.Dot
verticalBorder.LineWidth = 1.0
verticalBorder.Color = Color.get_CornflowerBlue()

# Save the resulting document
document.SaveToFile("ChangeBorders.docx", FileFormat.Docx2013)
document.Close()
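
The example above repeats the same three assignments for every border. The following is a minimal sketch of how that repetition could be folded into a loop. The helper name style_borders and the stub objects are hypothetical and not part of Spire.Doc; the sketch only assumes that the borders collection exposes Top, Bottom, Left, Right, Horizontal, and Vertical attributes with BorderType, LineWidth, and Color, as used in the article. Plain strings stand in for the BorderStyle and Color values so the sketch runs without the library.

```python
from types import SimpleNamespace

# Hypothetical helper (not part of Spire.Doc): applies one style to several
# borders of a collection that exposes them as attributes, as in the article.
OUTER_BORDERS = ("Top", "Bottom", "Left", "Right")
INNER_BORDERS = ("Horizontal", "Vertical")


def style_borders(borders, names, border_type, line_width, color):
    """Set BorderType, LineWidth and Color on each named border."""
    for name in names:
        border = getattr(borders, name)
        border.BorderType = border_type
        border.LineWidth = line_width
        border.Color = color


def make_stub_borders():
    """Stand-in for Table.TableFormat.Borders so the sketch runs without Spire.Doc."""
    return SimpleNamespace(
        **{name: SimpleNamespace(BorderType=None, LineWidth=0.0, Color=None)
           for name in OUTER_BORDERS + INNER_BORDERS}
    )


stub = make_stub_borders()
# Strings stand in for BorderStyle.Single / Color.get_YellowGreen() and so on
style_borders(stub, OUTER_BORDERS, "Single", 1.0, "YellowGreen")
style_borders(stub, INNER_BORDERS, "Dot", 1.0, "Orange")
print(stub.Top.BorderType, stub.Vertical.Color)
```

With the real library, the call would presumably look like style_borders(table.TableFormat.Borders, OUTER_BORDERS, BorderStyle.Single, 1.0, Color.get_YellowGreen()).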

Python: Change or Remove Borders for Tables in Word

Remove Borders from a Table in Word in Python

To remove borders from a table, you need to set the BorderType property of the borders to BorderStyle.none. The detailed steps are as follows.

  • Create an object of the Document class.
  • Load a Word document using Document.LoadFromFile() method.
  • Get a specific section using Document.Sections[index] property.
  • Get a specific table using Section.Tables[index] property.
  • Get the borders collection of the table using Table.TableFormat.Borders property.
  • Get an individual border, such as the top border from the borders collection using Borders.Top property. Then set the BorderType property of the top border to BorderStyle.none.
  • Refer to the above step to get other individual borders from the borders collection and then set the BorderType property of the borders to BorderStyle.none.
  • Save the resulting document using Document.SaveToFile() method.
  • Python
from spire.doc import *
from spire.doc.common import *

# Initialize an instance of the Document class
document = Document()
document.LoadFromFile("ChangeBorders.docx")

# Get the first section of the document
section = document.Sections[0]

# Get the first table in the section
table = section.Tables[0] if isinstance(section.Tables[0], Table) else None

# Get the borders collection of the table
borders = table.TableFormat.Borders

# Remove top border
topBorder = borders.Top
topBorder.BorderType = BorderStyle.none

# Remove left border
leftBorder = borders.Left
leftBorder.BorderType = BorderStyle.none

# Remove right border
rightBorder = borders.Right
rightBorder.BorderType = BorderStyle.none

# Remove bottom border
bottomBorder = borders.Bottom
bottomBorder.BorderType = BorderStyle.none

# Remove inside horizontal border
horizontalBorder = borders.Horizontal
horizontalBorder.BorderType = BorderStyle.none

# Remove inside vertical border
verticalBorder = borders.Vertical
verticalBorder.BorderType = BorderStyle.none

# Save the resulting document
document.SaveToFile("RemoveBorders.docx", FileFormat.Docx2013)
document.Close()


Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Various written documents, such as academic papers, reports, and legal materials, often have specific formatting guidelines that encompass word count, page count, and other essential metrics. Accurately measuring these elements is crucial as it ensures that your document adheres to the required standards and meets the expected quality benchmarks. In this article, we will explain how to count words, pages, characters, paragraphs, and lines in a Word document in Python using Spire.Doc for Python.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python. It can be easily installed on Windows through the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to: How to Install Spire.Doc for Python on Windows

Count Words, Pages, Characters, Paragraphs, and Lines in a Word Document in Python

Spire.Doc for Python offers the BuiltinDocumentProperties class that empowers you to retrieve crucial information from your Word document. By utilizing this class, you can access a wealth of details, including the built-in document properties, as well as the number of words, pages, characters, paragraphs, and lines contained within the document.

The steps below explain how to get the number of words, pages, characters, paragraphs, and lines in a Word document in Python using Spire.Doc for Python:

  • Create an object of the Document class.
  • Load a Word document using the Document.LoadFromFile() method.
  • Get the BuiltinDocumentProperties object using the Document.BuiltinDocumentProperties property.
  • Get the number of words, characters, paragraphs, lines, and pages in the document using the WordCount, CharCount, ParagraphCount, LinesCount, PageCount properties of the BuiltinDocumentProperties class, and append the result to a list.
  • Write the content of the list into a text file.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create an object of the Document class
doc = Document()
# Load a Word document
doc.LoadFromFile("Input.docx")

# Create a list
sb = []

# Get the built-in properties of the document
properties = doc.BuiltinDocumentProperties

# Get the number of words, characters, paragraphs, lines, and pages and append the result to the list
sb.append("The number of words: " + str(properties.WordCount))
sb.append("The number of characters: " + str(properties.CharCount))
sb.append("The number of paragraphs: " + str(properties.ParagraphCount))
sb.append("The number of lines: " + str(properties.LinesCount))
sb.append("The number of pages: " + str(properties.PageCount))

# Save the data in the list to a text file
with open("result.txt", "w") as file:
    file.write("\n".join(sb))

doc.Close()
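
As a cross-check that does not use Spire.Doc at all, the statistics that the saving application stored in a .docx package can be read with the standard library. A .docx file is a ZIP archive, and its docProps/app.xml part usually carries Words, Characters, Paragraphs, Lines, and Pages elements. This is a sketch of a different technique, not the library's method: the function names are hypothetical, the stored values can be stale or missing, and the article does not say whether BuiltinDocumentProperties reads them or recomputes them. The sample package is built in memory purely for demonstration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the extended-properties part (docProps/app.xml) in a .docx package
APP_NS = "{http://schemas.openxmlformats.org/officeDocument/2006/extended-properties}"
STAT_TAGS = ("Words", "Characters", "Paragraphs", "Lines", "Pages")


def read_stored_statistics(docx_file):
    """Return the statistics stored in docProps/app.xml as a dict of ints.

    docx_file may be a path or a binary file-like object. Tags that are
    missing from the file are left out of the result.
    """
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("docProps/app.xml"))
    stats = {}
    for tag in STAT_TAGS:
        element = root.find(APP_NS + tag)
        if element is not None and element.text:
            stats[tag] = int(element.text)
    return stats


def make_sample_package():
    """Build a tiny in-memory package holding only app.xml (sample values)."""
    app_xml = (
        '<Properties xmlns="http://schemas.openxmlformats.org/officeDocument/'
        '2006/extended-properties">'
        "<Pages>2</Pages><Words>350</Words><Characters>1995</Characters>"
        "<Lines>16</Lines><Paragraphs>4</Paragraphs></Properties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/app.xml", app_xml)
    buffer.seek(0)
    return buffer


sample_stats = read_stored_statistics(make_sample_package())
print(sample_stats)
```

For a real file, read_stored_statistics("Input.docx") could be compared against the values written to result.txt.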

Python: Count Words, Pages, Characters, Paragraphs and Lines in Word

Count Words and Characters in a Specific Paragraph of a Word Document in Python

In addition to retrieving the overall word count, page count, and other metrics for an entire Word document, you are also able to get the word count and character count for a specific paragraph by using the Paragraph.WordCount and Paragraph.CharCount properties.

The steps below explain how to get the number of words and characters of a paragraph in a Word document in Python using Spire.Doc for Python:

  • Create an object of the Document class.
  • Load a Word document using the Document.LoadFromFile() method.
  • Get a specific paragraph using the Document.Sections[sectionIndex].Paragraphs[paragraphIndex] property.
  • Get the number of words and characters in the paragraph using the Paragraph.WordCount and Paragraph.CharCount properties, and append the result to a list.
  • Write the content of the list into a text file.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create an object of the Document class
doc = Document()
# Load a Word document
doc.LoadFromFile("Input.docx")

# Get a specific paragraph
paragraph = doc.Sections[0].Paragraphs[0]

# Create a list
sb = []

# Get the number of words and characters in the paragraph and append the result to the list
sb.append("The number of words: " + str(paragraph.WordCount))
sb.append("The number of characters: " + str(paragraph.CharCount))

# Save the data in the list to a text file
with open("result.txt", "w") as file:
    file.write("\n".join(sb))

doc.Close()
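
To sanity-check the numbers reported for a paragraph, a rough count can be computed from its plain text. The sketch below uses simple whitespace splitting; the function names are hypothetical, and the counting rules applied by Spire.Doc (for punctuation, hyphenated words, or East Asian text, for example) may differ, so small discrepancies are to be expected.

```python
def count_words(text):
    """Count whitespace-separated tokens in text."""
    return len(text.split())


def count_characters(text, include_spaces=True):
    """Count characters, optionally ignoring all whitespace."""
    if include_spaces:
        return len(text)
    return sum(1 for ch in text if not ch.isspace())


sample = "Spire.Doc for Python counts words."
print(count_words(sample))
print(count_characters(sample))
print(count_characters(sample, include_spaces=False))
```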



When dealing with a large volume of customized documents such as contracts, reports, or personal letters, the variable feature in Word documents becomes crucial. Variables allow you to store and reuse information like dates, names, or product details, making the documents more personalized and dynamic. This article will delve into how to use Spire.Doc for Python to insert, count, retrieve, and delete variables in Word documents, enhancing the efficiency and flexibility of document management.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed on Windows through the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows

Add Variables into Word Documents with Python

The way Word variables work is based on the concept of "fields". When you insert a variable into a Word document, what you're actually doing is inserting a field, which points to a value stored either in the document properties or an external data source. Upon updating the fields, Word recalculates them to display the most current information.

Spire.Doc for Python offers the VariableCollection.Add(name, value) method to insert variables into Word documents. Here are the detailed steps:

  • Create a Document object.
  • Call the Document.AddSection() method to create a new section.
  • Call the Section.AddParagraph() method to create a new paragraph.
  • Call the Paragraph.AppendField(fieldName, fieldType) method to add a variable field (FieldDocVariable) within the paragraph.
  • Call the Document.Variables.Add(name, value) method to store the variable's name and value in the document.
  • Set Document.IsUpdateFields to True to update the fields.
  • Save the document using the Document.SaveToFile() method.
  • Python
from spire.doc import *

# Create a Document object
document = Document()

# Add a new section to the document
section = document.AddSection()

# Add a new paragraph within the newly created section
paragraph = section.AddParagraph()

# Append a FieldDocVariable type field named "CompanyName" to the paragraph
paragraph.AppendField("CompanyName", FieldType.FieldDocVariable)

# Add the variable to the document's variable collection
document.Variables.Add("CompanyName", "E-ICEBLUE")

# Update fields
document.IsUpdateFields = True

# Save the document to a specified path
document.SaveToFile("AddVariable.docx", FileFormat.Docx2016)

# Dispose the document
document.Dispose()
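
When a document needs several variables, the field and the stored value can be added together in a loop so that the two never get out of sync. The sketch below assumes only the calls shown above (Section.AddParagraph, Paragraph.AppendField, Document.Variables.Add). The helper add_doc_variables, the stub classes, and the sample values are hypothetical and exist only so the sketch runs without the library.

```python
def add_doc_variables(document, section, values, field_type):
    """Add one paragraph, one field and one stored variable per entry of values.

    field_type is supplied by the caller; with Spire.Doc it would be
    FieldType.FieldDocVariable, as in the example above.
    """
    for name, value in values.items():
        paragraph = section.AddParagraph()
        paragraph.AppendField(name, field_type)
        document.Variables.Add(name, value)


# --- Stand-ins so the sketch runs without Spire.Doc installed ---
class StubParagraph:
    def __init__(self):
        self.fields = []

    def AppendField(self, name, field_type):
        self.fields.append((name, field_type))


class StubSection:
    def __init__(self):
        self.paragraphs = []

    def AddParagraph(self):
        paragraph = StubParagraph()
        self.paragraphs.append(paragraph)
        return paragraph


class StubVariables:
    def __init__(self):
        self.store = {}

    def Add(self, name, value):
        self.store[name] = value


class StubDocument:
    def __init__(self):
        self.Variables = StubVariables()


demo_document, demo_section = StubDocument(), StubSection()
add_doc_variables(
    demo_document,
    demo_section,
    {"CompanyName": "E-ICEBLUE", "ContactName": "Sample Contact"},
    "FieldDocVariable",  # placeholder for FieldType.FieldDocVariable
)
print(demo_document.Variables.store)
```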

Python: Add, Count, Retrieve and Remove Word Variables

Count the Number of Variables in a Word Document with Python

Here are the detailed steps to use the Document.Variables.Count property to get the number of variables:

  • Create a Document object.
  • Call the Document.LoadFromFile() method to load the document that contains the variables.
  • Use the Document.Variables.Count property to obtain the number of variables.
  • Print the count in console.
  • Python
from spire.doc import *

# Create a Document object
document = Document()

# Load an existing document
document.LoadFromFile("AddVariable.docx")

# Get the count of variables in the document
count = document.Variables.Count

# Print to console
print(f"The count of variables:{count}")


Retrieve Variables from a Word Document with Python

Spire.Doc for Python provides the GetNameByIndex(int index) and GetValueByIndex(int index) methods to retrieve variable names and values by their indices. Below are the detailed steps:

  • Create a Document object.
  • Call the Document.LoadFromFile() method to load the document that contains the variables.
  • Call the Document.Variables.GetNameByIndex(index) method to obtain the variable name.
  • Call the Document.Variables.GetValueByIndex(index) method to obtain the variable value.
  • Call the Document.Variables.get_Item(name) method to obtain the variable value through the variable name.
  • Print the variable name and values in console.
  • Python
from spire.doc import *

# Create a Document object
document = Document()

# Load an existing document
document.LoadFromFile("AddVariable.docx")

# Obtain variable name based on index 0
name = document.Variables.GetNameByIndex(0)

# Obtain variable value based on index 0
value = document.Variables.GetValueByIndex(0)

# Obtain variable value through the variable name
value1 = document.Variables.get_Item("CompanyName")

# Print to console
print("Variable Name:", name)
print("Variable Value:", value)
print("Variable Value (by name):", value1)
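
The example above reads only the variable at index 0. To list every variable, the index-based methods can be combined with the Count property from the previous section. This is a minimal sketch: variables_to_dict and the stub class are hypothetical, and the sketch assumes that Count, GetNameByIndex, and GetValueByIndex behave as described in the article.

```python
def variables_to_dict(variables):
    """Collect every name/value pair from a variable collection."""
    return {
        variables.GetNameByIndex(i): variables.GetValueByIndex(i)
        for i in range(variables.Count)
    }


class StubVariables:
    """Stand-in for Document.Variables so the sketch runs without Spire.Doc."""

    def __init__(self, pairs):
        self._pairs = list(pairs)

    @property
    def Count(self):
        return len(self._pairs)

    def GetNameByIndex(self, index):
        return self._pairs[index][0]

    def GetValueByIndex(self, index):
        return self._pairs[index][1]


demo_variables = StubVariables([("CompanyName", "E-ICEBLUE"), ("Year", "2024")])
all_variables = variables_to_dict(demo_variables)
print(all_variables)
```

With a loaded document, the call would presumably be variables_to_dict(document.Variables).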


Delete Variables from a Word Document with Python

The VariableCollection.Remove(name) method can be used to delete a specified variable from the document, with the parameter being the name of the variable.

  • Create a Document object.
  • Call the Document.LoadFromFile() method to load the document that contains the variables.
  • Call the Document.Variables.Remove(name) method to remove the variable.
  • Set Document.IsUpdateFields to True to update the fields.
  • Save the document by Document.SaveToFile() method.
  • Python
from spire.doc import *

# Create a Document object
document = Document()

# Load an existing document
document.LoadFromFile("AddVariable.docx")

# Remove the variable named "CompanyName"
document.Variables.Remove("CompanyName")

# Update fields
document.IsUpdateFields = True

# Save the document
document.SaveToFile("RemoveVariable.docx", FileFormat.Docx2016)

# Dispose the document
document.Dispose()


Spire.Doc for Python is a robust library that enables you to read and write Microsoft Word documents using Python. With Spire.Doc, you can create, read, edit, and convert both DOC and DOCX file formats without requiring Microsoft Word to be installed on your system.

This article demonstrates how to install Spire.Doc for Python on Mac.

Step 1

Download the most recent version of Python for macOS and install it on your Mac. If you have already completed this step, proceed directly to step 2.

How to Install Spire.Doc for Python on Mac

Step 2

Open VS Code and search for 'Python' in the Extensions panel. Click 'Install' to add support for Python in your VS Code.


Step 3

Click 'Explorer' > 'NO FOLDER OPENED' > 'Open Folder'.


Choose an existing folder as the workspace, or you can create a new folder and then open it.


Add a .py file to the folder you just opened and name it whatever you want (in this case, HelloWorld.py).


Step 4

Use the keyboard shortcut Ctrl + ` (backtick) to open the Terminal. Then, install Spire.Doc for Python by entering the following command in the terminal.

pip3 install spire.doc

Note that pip3 always installs packages for a Python 3.x interpreter, whereas a bare pip command may be bound to a different Python installation depending on how your system is set up. If pip on your Mac already points to Python 3, you can use the pip command instead.


Step 5

Open a Terminal window on your Mac, and type the following command to obtain the installation path of Python on your system.

python3 -m pip --version
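
To confirm from Python itself that the package landed in the interpreter you are running, you can ask the import system whether the module can be found. The sketch below uses only the standard library; the helper name is_importable is hypothetical, and the module name spire.doc is taken from the import statements used in this article.

```python
import importlib.util


def is_importable(module_name):
    """Return True if the import system can locate module_name."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example "spire") is missing
        return False


for name in ("spire.doc", "json"):
    print(name, "->", "found" if is_importable(name) else "missing")
```

If spire.doc is reported as missing even though pip3 finished without errors, the terminal and the editor are probably using different Python installations.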


Step 6

Add the following code snippet to the 'HelloWorld.py' file.

  • Python
from spire.doc.common import *
from spire.doc import *

document = Document()
section = document.AddSection()
paragraph = section.AddParagraph()
paragraph.AppendText("Hello World")
document.SaveToFile("HelloWorld.docx", FileFormat.Docx2019)
document.Dispose()


After executing the Python file, you will find the resulting Word document in the 'EXPLORER' panel.


Section breaks in Word allow users to divide a document into sections, each with unique formatting options. This is especially useful when working with long documents where you want to apply different layouts, headers, footers, margins or page orientations within the same document. In this article, you will learn how to insert or remove section breaks in Word in Python using Spire.Doc for Python.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed on Windows through the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows

Insert Section Breaks in Word in Python

Spire.Doc for Python provides the Paragraph.InsertSectionBreak(breakType: SectionBreakType) method to insert a specified type of section break to a paragraph. The following table provides an overview of the supported section break types, along with their corresponding Enums and descriptions:

Section Break   Enum                         Description
New page        SectionBreakType.NewPage     Start the new section on a new page.
Continuous      SectionBreakType.NoBreak     Start the new section on the same page, allowing for continuous content flow.
Odd page        SectionBreakType.OddPage     Start the new section on the next odd-numbered page.
Even page       SectionBreakType.EvenPage    Start the new section on the next even-numbered page.
New column      SectionBreakType.NewColumn   Start the new section in the next column if columns are enabled.

The following are the detailed steps to insert a continuous section break:

  • Create a Document instance.
  • Load a Word document using Document.LoadFromFile() method.
  • Get a specified section using Document.Sections[] property.
  • Get a specified paragraph of the section using Section.Paragraphs[] property.
  • Add a section break to the end of the paragraph using Paragraph.InsertSectionBreak() method.
  • Save the result document using Document.SaveToFile() method.
  • Python
from spire.doc import *
from spire.doc.common import *

inputFile = "sample.docx"
outputFile = "InsertSectionBreak.docx"

# Create a Document instance
document = Document()

# Load a Word document
document.LoadFromFile(inputFile)

# Get the first section in the document
section = document.Sections[0]

# Get the second paragraph in the section
paragraph = section.Paragraphs[1]

# Insert a continuous section break
paragraph.InsertSectionBreak(SectionBreakType.NoBreak)

# Save the result document
document.SaveToFile(outputFile, FileFormat.Docx2016)
document.Close()

Python: Insert or Remove Section Breaks in Word

Remove Section Breaks in Word in Python

To delete all section breaks in a Word document, we need to access the first section in the document, then copy the contents of the other sections to the first section and delete them. The following are the detailed steps:

  • Create a Document instance.
  • Load a Word document using Document.LoadFromFile() method.
  • Get the first section using Document.Sections[] property.
  • Iterate through other sections in the document.
  • Get the second section, and then iterate through to get its child objects.
  • Clone the child objects of the second section and add them to the first section using Section.Body.ChildObjects.Add() method.
  • Delete the second section using Document.Sections.Remove() method.
  • Repeat the process to copy and delete the remaining sections.
  • Save the result document using Document.SaveToFile() method.
  • Python
from spire.doc import *
from spire.doc.common import *

inputFile = "Report.docx"
outputFile = "RemoveSectionBreaks.docx"

# Create a Document instance
document = Document()

# Load a Word document
document.LoadFromFile(inputFile)

# Get the first section in the document
sec = document.Sections[0]

# Iterate through other sections in the document
for i in range(document.Sections.Count - 1):
    # Get the second section in the document
    section = document.Sections[1]
    
    # Iterate through all child objects of the second section
    for j in range(section.Body.ChildObjects.Count):
        # Get the child objects
        obj = section.Body.ChildObjects.get_Item(j)
        # Clone the child objects to the first section
        sec.Body.ChildObjects.Add(obj.Clone())

    # Remove the second section once all of its child objects have been copied
    document.Sections.Remove(section)

# Save the result document
document.SaveToFile(outputFile, FileFormat.Docx2016)
document.Close()
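
The merge logic is easier to follow on plain data. In the sketch below, nested Python lists stand in for sections and their child objects; it is a model of the algorithm, not Spire.Doc code, and merge_into_first is a hypothetical name. Because the second section is removed after each pass, the next section to merge always moves into index 1, which is why the loop keeps reading index 1 instead of index i.

```python
def merge_into_first(sections):
    """Append the items of every later section to the first, then drop them.

    sections is a list of lists; each inner list models one section's body.
    The list is modified in place and returned for convenience.
    """
    first = sections[0]
    for _ in range(len(sections) - 1):
        second = sections[1]
        for item in second:
            # Stands in for sec.Body.ChildObjects.Add(obj.Clone())
            first.append(item)
        # Stands in for document.Sections.Remove(section)
        del sections[1]
    return sections


demo_sections = [["intro"], ["chapter 1", "table 1"], ["chapter 2"]]
merged = merge_into_first(demo_sections)
print(merged)
```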



Knowing how to remove headers or footers in Word is an essential skill as there may be times you need to change the formatting of your document or collaborate with others who do not need the headers or footers. In this article, you will learn how to remove headers or footers in Word in Python using Spire.Doc for Python.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed on Windows through the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows

Remove Headers in a Word Document in Python

Spire.Doc for Python supports getting the different headers on the first page, odd pages, and even pages, and then deleting all of them through the HeaderFooter.ChildObjects.Clear() method. The following are the detailed steps:

  • Create a Document instance.
  • Load a Word document using Document.LoadFromFile() method.
  • Get a specified section using Document.Sections[] property.
  • Get the headers for the first, odd, and even pages using Section.HeadersFooters[hfType: HeaderFooterType] property, and then delete them using HeaderFooter.ChildObjects.Clear() method.
  • Save the result document using Document.SaveToFile() method.
  • Python
from spire.doc import *
from spire.doc.common import *

inputFile = "HeaderFooter.docx"
outputFile = "RemoveHeaders.docx"

# Create a Document instance
doc = Document()

# Load a Word document
doc.LoadFromFile(inputFile)

# Get the first section
section = doc.Sections[0]

# The headers belong to the section itself, so there is no need to
# iterate through the body paragraphs before clearing them

# Delete header in the first page
header = section.HeadersFooters[HeaderFooterType.HeaderFirstPage]
if header is not None:
    header.ChildObjects.Clear()

# Delete headers in the odd pages
header = section.HeadersFooters[HeaderFooterType.HeaderOdd]
if header is not None:
    header.ChildObjects.Clear()

# Delete headers in the even pages
header = section.HeadersFooters[HeaderFooterType.HeaderEven]
if header is not None:
    header.ChildObjects.Clear()

# Save the result document
doc.SaveToFile(outputFile, FileFormat.Docx)
doc.Close()
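
The three header types are handled identically, so the clearing step can be written once and driven by a list of types. The following sketch assumes only what the example above uses: indexing Section.HeadersFooters by a type and calling ChildObjects.Clear() on the result. The helper clear_parts and the stub classes are hypothetical, and plain strings stand in for the HeaderFooterType members so the sketch runs without the library.

```python
def clear_parts(section, part_types):
    """Clear each listed header/footer part and return how many were cleared."""
    cleared = 0
    for part_type in part_types:
        part = section.HeadersFooters[part_type]
        if part is not None:
            part.ChildObjects.Clear()
            cleared += 1
    return cleared


# --- Stand-ins so the sketch runs without Spire.Doc installed ---
class StubChildObjects:
    def __init__(self, items):
        self.items = list(items)

    def Clear(self):
        self.items.clear()


class StubPart:
    def __init__(self, items):
        self.ChildObjects = StubChildObjects(items)


class StubHeadersFooters:
    def __init__(self, parts):
        self._parts = parts

    def __getitem__(self, part_type):
        # Returns None for a part that does not exist in this stub
        return self._parts.get(part_type)


class StubSection:
    def __init__(self, parts):
        self.HeadersFooters = StubHeadersFooters(parts)


demo_section = StubSection({
    "HeaderOdd": StubPart(["Company report"]),
    "HeaderEven": StubPart(["Confidential"]),
})
cleared_count = clear_parts(demo_section, ["HeaderFirstPage", "HeaderOdd", "HeaderEven"])
print(cleared_count)
```

With the real library, the list would presumably hold HeaderFooterType.HeaderFirstPage, HeaderFooterType.HeaderOdd, and HeaderFooterType.HeaderEven, and the same helper could be called for each section of a multi-section document.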

Python: Remove Headers or Footers in Word

Remove Footers in a Word Document in Python

Deleting footers is similar to deleting headers: get the footers for the first, odd, and even pages, and then clear each of them. The following are the detailed steps:

  • Create a Document instance.
  • Load a Word document using Document.LoadFromFile() method.
  • Get a specified section using Document.Sections[] property.
  • Get the footers for the first, odd, and even pages using Section.HeadersFooters[hfType: HeaderFooterType] property, and then delete them using HeaderFooter.ChildObjects.Clear() method.
  • Save the result document using Document.SaveToFile() method.
  • Python
from spire.doc import *
from spire.doc.common import *

inputFile = "HeaderFooter.docx"
outputFile = "RemoveFooters.docx"

# Create a Document instance
doc = Document()

# Load a Word document
doc.LoadFromFile(inputFile)

# Get the first section
section = doc.Sections[0]

# The footers belong to the section itself, so there is no need to
# iterate through the body paragraphs before clearing them

# Delete footer in the first page
footer = section.HeadersFooters[HeaderFooterType.FooterFirstPage]
if footer is not None:
    footer.ChildObjects.Clear()

# Delete footers in the odd pages
footer = section.HeadersFooters[HeaderFooterType.FooterOdd]
if footer is not None:
    footer.ChildObjects.Clear()

# Delete footers in the even pages
footer = section.HeadersFooters[HeaderFooterType.FooterEven]
if footer is not None:
    footer.ChildObjects.Clear()

# Save the result document
doc.SaveToFile(outputFile, FileFormat.Docx)
doc.Close()



Python: Extract Comments from Word

2024-06-07 07:59:40 Written by Koohji

Comments in Word documents are often used for collaborative review and feedback purposes. They may contain text and images that provide valuable information to guide document improvements. Extracting the text and images from comments allows you to analyze and evaluate the feedback provided by reviewers, helping you gain a comprehensive understanding of the strengths, weaknesses, and suggestions related to the document. In this article, we will demonstrate how to extract text and images from Word comments in Python using Spire.Doc for Python.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed on Windows through the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows

Extract Text from Word Comments in Python

You can easily retrieve the author and text of a Word comment using the Comment.Format.Author and Comment.Body.Paragraphs[index].Text properties provided by Spire.Doc for Python. The detailed steps are as follows.

  • Create an object of the Document class.
  • Load a Word document using the Document.LoadFromFile() method.
  • Create a list to store the extracted comment data.
  • Iterate through the comments in the document.
  • For each comment, iterate through the paragraphs of the comment body.
  • For each paragraph, get the text using the Comment.Body.Paragraphs[index].Text property.
  • Get the author of the comment using the Comment.Format.Author property.
  • Add the text and author of the comment to the list.
  • Save the content of the list to a text file.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create an object of the Document class
document = Document()
# Load a Word document containing comments
document.LoadFromFile("Comments.docx")

# Create a list to store the extracted comment data
comments = []

# Iterate through the comments in the document
for i in range(document.Comments.Count):
    comment = document.Comments[i]
    comment_text = ""

    # Iterate through the paragraphs in the comment body
    for j in range(comment.Body.Paragraphs.Count):
        paragraph = comment.Body.Paragraphs[j]
        comment_text += paragraph.Text + "\n"

    # Get the comment author
    comment_author = comment.Format.Author

    # Append the comment data to the list
    comments.append({
        "author": comment_author,
        "text": comment_text
    })

# Write the comment data to a file
with open("comment_data.txt", "w", encoding="utf-8") as file:
    for comment in comments:
        file.write(f"Author: {comment['author']}\nText: {comment['text']}\n\n")
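
Once the comments are available as a list of dictionaries like the one built above, ordinary Python is enough to analyze them. As a small sketch, the function below groups the extracted text by author; group_by_author is a hypothetical name, and the sample records are invented for illustration and mirror the "author" and "text" keys used in the example.

```python
from collections import defaultdict


def group_by_author(comments):
    """Map each author to the list of their comment texts (whitespace trimmed)."""
    grouped = defaultdict(list)
    for comment in comments:
        grouped[comment["author"]].append(comment["text"].strip())
    return dict(grouped)


# Invented sample records in the same shape as the list built above
sample_comments = [
    {"author": "Reviewer A", "text": "Please cite a source.\n"},
    {"author": "Reviewer B", "text": "Typo in the heading.\n"},
    {"author": "Reviewer A", "text": "Consider a shorter title.\n"},
]
by_author = group_by_author(sample_comments)
for author, texts in by_author.items():
    print(author, len(texts))
```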


Extract Images from Word Comments in Python

To extract images from Word comments, you need to iterate through the child objects in the paragraphs of the comments to find the DocPicture objects, get the image data using the DocPicture.ImageBytes property, and finally save the image data to image files. The detailed steps are as follows.

  • Create an object of the Document class.
  • Load a Word document using the Document.LoadFromFile() method.
  • Create a list to store the extracted image data.
  • Iterate through the comments in the document.
  • For each comment, iterate through the paragraphs of the comment body.
  • For each paragraph, iterate through the child objects of the paragraph.
  • Check if the object is a DocPicture object.
  • If the object is a DocPicture, get the image data using the DocPicture.ImageBytes property and add it to the list.
  • Save the image data in the list to individual image files.
  • Python
import os

from spire.doc import *
from spire.doc.common import *
 
# Create an object of the Document class
document = Document()
# Load a Word document containing comments
document.LoadFromFile("Comments.docx")
 
# Create a list to store the extracted image data
images = []
 
# Iterate through the comments in the document
for i in range(document.Comments.Count):
    comment = document.Comments[i]
    # Iterate through the paragraphs in the comment body
    for j in range(comment.Body.Paragraphs.Count):
        paragraph = comment.Body.Paragraphs[j]
        # Iterate through the child objects in the paragraph
        for o in range(paragraph.ChildObjects.Count):
            obj = paragraph.ChildObjects[o]
            # Find the images
            if isinstance(obj, DocPicture):
                picture = obj
                # Get the image data and add it to the list
                data_bytes = picture.ImageBytes
                images.append(data_bytes)
 
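# Create the output folder if it does not already exist
output_dir = "CommentImages"
os.makedirs(output_dir, exist_ok=True)

# Save the image data to image files
for i, image_data in enumerate(images):
    file_name = f"CommentImage-{i}.png"
    with open(os.path.join(output_dir, file_name), 'wb') as image_file:
        image_file.write(image_data)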
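
The code above saves every image with a .png extension. If the pictures embedded in the comments are not all PNG files (an assumption about the input document), the extension can be chosen from the leading bytes of the data instead. The helper below is a hypothetical standard-library sketch that recognizes only a few common signatures; it is not part of Spire.Doc.

```python
def guess_image_extension(data: bytes, default: str = ".bin") -> str:
    """Guess a file extension from well-known image file signatures."""
    signatures = (
        (b"\x89PNG\r\n\x1a\n", ".png"),
        (b"\xff\xd8\xff", ".jpg"),
        (b"GIF87a", ".gif"),
        (b"GIF89a", ".gif"),
        (b"BM", ".bmp"),
    )
    for magic, extension in signatures:
        if data.startswith(magic):
            return extension
    # Unknown signature: fall back to a neutral extension
    return default


# Example: build a file name for the first extracted image
sample_bytes = b"\xff\xd8\xff\xe0" + b"\x00" * 16  # hypothetical JPEG header
file_name = f"CommentImage-0{guess_image_extension(sample_bytes)}"
print(file_name)
```

In the saving loop, the file name could then be built as f"CommentImage-{i}{guess_image_extension(image_data)}".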

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Python: Get Revisions of Word Document

2024-06-05 01:02:25 Written by Koohji

With the increasing popularity of team collaboration, the track changes function in Word documents has become a cornerstone of version control and content review. However, for developers who pursue automation and efficiency, extracting this revision information from Word documents programmatically remains a significant challenge. This article shows how to use Spire.Doc for Python to obtain revision information from Word documents.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed on Windows through the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows
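
After running the pip command, it can be useful to confirm which versions were actually installed. The following is a minimal sketch using the standard library's importlib.metadata; the helper name installed_version is hypothetical, and the distribution names are taken from the requirement stated above.

```python
from importlib import metadata


def installed_version(distribution: str):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Report the two packages this scenario depends on
for name in ("Spire.Doc", "plum-dispatch"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```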

Get Revisions of Word Document in Python

Spire.Doc for Python provides the IsInsertRevision and IsDeleteRevision properties to determine whether an element in a Word document is an insertion revision or a deletion revision. Here are the detailed steps:

  • Create an instance of the Document class and load the Word document that contains revisions.
  • Initialize lists to collect insertion and deletion revision information.
  • Iterate through the sections of the document and their body elements.
  • Obtain the paragraphs in the body and use the IsInsertRevision property to determine if the paragraph is an insertion revision.
  • Get the type, author, and associated text of the insertion revision.
  • Use the IsDeleteRevision property to determine if the paragraph is a deletion revision, and obtain its revision type, author, and associated text.
  • Iterate through the child elements of the paragraph, similarly checking if the TextRange is an insertion or deletion revision, and retrieve the revision type, author, and associated text.
  • Define a WriteAllText function to save the insertion and deletion revision information to TXT documents.
  • Python
from spire.doc import *

# Function to write text to a file
def WriteAllText(fname: str, text: str):
    with open(fname, "w", encoding='utf-8') as fp:
        fp.write(text)

# Input and output file names
inputFile = "sample.docx"
outputFile1 = "InsertRevision.txt"
outputFile2 = "DeleteRevision.txt"

# Create a Document object
document = Document()

# Load the Word document
document.LoadFromFile(inputFile)

# Initialize lists to store insert and delete revisions
insert_revisions = []
delete_revisions = []

# Iterate through sections in the document
for k in range(document.Sections.Count):
    sec = document.Sections.get_Item(k)

    # Iterate through body elements in the section
    for m in range(sec.Body.ChildObjects.Count):
        # Check if the item is a Paragraph
        docItem = sec.Body.ChildObjects.get_Item(m)
        if isinstance(docItem, Paragraph):
            para = docItem
            para.AppendField("",FieldType.FieldDocVariable)

            # Check if the paragraph is an insertion revision
            if para.IsInsertRevision:
                insRevison = para.InsertRevision
                insType = insRevison.Type
                insAuthor = insRevison.Author

                # Add insertion revision details to the list
                insert_revisions.append(f"Revision Type: {insType.name}\n")
                insert_revisions.append(f"Revision Author: {insAuthor}\n")
                insert_revisions.append(f"Insertion Text: {para.Text}\n")
            # Check if the paragraph is a deletion revision
            elif para.IsDeleteRevision:
                delRevison = para.DeleteRevision
                delType = delRevison.Type
                delAuthor = delRevison.Author

                # Add deletion revision details to the list
                delete_revisions.append(f"Revision Type: {delType.name}\n")
                delete_revisions.append(f"Revision Author: {delAuthor}\n")
                delete_revisions.append(f"Deletion Text: {para.Text}\n")
            else:
                # Iterate through all child objects of Paragraph
                for j in range(para.ChildObjects.Count):
                    obj = para.ChildObjects.get_Item(j)
                    # Check if the current object is an instance of TextRange
                    if isinstance(obj, TextRange):
                        textRange = obj

                        # Check if the textrange is an insertion revision
                        if textRange.IsInsertRevision:
                            insRevison = textRange.InsertRevision
                            insType = insRevison.Type
                            insAuthor = insRevison.Author

                            # Add insertion revision details to the list
                            insert_revisions.append(f"Revision Type: {insType.name}\n")
                            insert_revisions.append(f"Revision Author: {insAuthor}\n")
                            insert_revisions.append(f"Insertion Text: {textRange.Text}\n")
                        # Check if the textrange is a deletion revision
                        elif textRange.IsDeleteRevision:
                            delRevison = textRange.DeleteRevision
                            delType = delRevison.Type
                            delAuthor = delRevison.Author

                            # Add deletion revision details to the list
                            delete_revisions.append(f"Revision Type: {delType.name}\n")
                            delete_revisions.append(f"Revision Author: {delAuthor}\n")
                            delete_revisions.append(f"Deletion Text: {textRange.Text}\n")

# Write all the insertion revision details to the 'outputFile1' file
WriteAllText(outputFile1, ''.join(insert_revisions))

# Write all the deletion revision details to the 'outputFile2' file
WriteAllText(outputFile2, ''.join(delete_revisions))

# Dispose the document
document.Dispose()

Python: Get Revisions of Word Document
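
As a cross-check that does not rely on Spire.Doc, tracked changes can also be read straight from the document's XML. The sketch below uses only the standard library and assumes the usual WordprocessingML markup, in which inserted runs sit inside w:ins elements, deleted runs sit inside w:del elements with their text in w:delText, and the author is stored in the w:author attribute. The function names and the sample XML are hypothetical, and paragraph-mark revisions and tracked formatting changes are not handled.

```python
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def collect_revisions(document_xml):
    """Return (kind, author, text) tuples for tracked insertions and deletions."""
    root = ET.fromstring(document_xml)
    revisions = []
    for kind, tag, text_tag in (("Insertion", "ins", "t"), ("Deletion", "del", "delText")):
        for element in root.iter(f"{{{W_NS}}}{tag}"):
            author = element.get(f"{{{W_NS}}}author", "")
            text = "".join(node.text or "" for node in element.iter(f"{{{W_NS}}}{text_tag}"))
            if text:  # skip empty markers such as paragraph-mark revisions
                revisions.append((kind, author, text))
    return revisions


def revisions_from_docx(path):
    """Read word/document.xml from a .docx package and collect its revisions."""
    with zipfile.ZipFile(path) as package:
        return collect_revisions(package.read("word/document.xml"))


# Hypothetical fragment with one insertion and one deletion
SAMPLE_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p>'
    '<w:ins w:id="1" w:author="Alice"><w:r><w:t>added text</w:t></w:r></w:ins>'
    '<w:del w:id="2" w:author="Bob"><w:r><w:delText>removed text</w:delText></w:r></w:del>'
    '</w:p></w:body></w:document>'
)

for kind, author, text in collect_revisions(SAMPLE_XML):
    print(f"{kind} by {author}: {text}")
```

The results are grouped by kind (all insertions first, then all deletions), which mirrors the two separate output files used above.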

Content controls in Word documents are interactive elements that let users add, remove, or adjust specified content sections while preserving the document structure, which makes documents easier to edit, manage, and customize. This article shows how to use Spire.Doc for Python to modify content controls in Word documents within a Python project.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed in VS Code through the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows

Modify Content Controls in the Body using Python

In Spire.Doc, the object type for the body content control is StructureDocumentTag. To modify these controls, one needs to traverse the Section.Body.ChildObjects collection to locate objects of type StructureDocumentTag. Below are the detailed steps:

  • Create a Document object.
  • Use the Document.LoadFromFile() method to load a Word document into memory.
  • Retrieve the body of a section in the document using Section.Body.
  • Traverse the collection of child objects within Body.ChildObjects, identifying those that are of type StructureDocumentTag.
  • Within the StructureDocumentTag.ChildObjects sub-collection, perform modifications based on the type of each child object.
  • Finally, utilize the Document.SaveToFile() method to save the changes back to the Word document.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a new document object
doc = Document()

# Load the document content from a file
doc.LoadFromFile("Sample1.docx")

# Get the body of the document
body = doc.Sections.get_Item(0).Body

# Create lists for paragraphs and tables
paragraphs = []
tables = []

for i in range(body.ChildObjects.Count):
    obj = body.ChildObjects.get_Item(i)
    # If it is a StructureDocumentTag object
    if obj.DocumentObjectType == DocumentObjectType.StructureDocumentTag:
        sdt = (StructureDocumentTag)(obj)

        # If the tag is "c1" or the alias is "c1"
        if sdt.SDTProperties.Tag == "c1" or sdt.SDTProperties.Alias == "c1":
            for j in range(sdt.ChildObjects.Count):
                child_obj = sdt.ChildObjects.get_Item(j)

                # If it is a paragraph object
                if child_obj.DocumentObjectType == DocumentObjectType.Paragraph:
                    paragraphs.append(child_obj)

                # If it is a table object
                elif child_obj.DocumentObjectType == DocumentObjectType.Table:
                    tables.append(child_obj)

# Modify the text content of the first paragraph
if paragraphs:
    (Paragraph)(paragraphs[0]).Text = "Spire.Doc for Python is a totally independent Python Word class library which doesn't require Microsoft Office installed on system."

if tables:
    # Reset the cells of the first table
    (Table)(tables[0]).ResetCells(5, 4)

# Save the modified document to a file
doc.SaveToFile("ModifyBodyContentControls.docx", FileFormat.Docx2016)

# Release document resources
doc.Close()
doc.Dispose()

Python: Modify Content Controls in a Word Document
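
Before matching on a tag such as "c1", it helps to know which Tag and Alias values a document actually contains. The sketch below lists them from the raw document XML using only the standard library. It assumes the usual WordprocessingML markup, in which each content control is a w:sdt element whose w:sdtPr child holds w:tag and w:alias elements; the function name and the sample XML are hypothetical and not part of Spire.Doc.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_content_controls(document_xml):
    """Return (tag, alias) pairs for every content control in the XML."""
    root = ET.fromstring(document_xml)
    controls = []
    for sdt in root.iter(f"{{{W_NS}}}sdt"):
        properties = sdt.find(f"{{{W_NS}}}sdtPr")
        tag = alias = None
        if properties is not None:
            tag_element = properties.find(f"{{{W_NS}}}tag")
            alias_element = properties.find(f"{{{W_NS}}}alias")
            if tag_element is not None:
                tag = tag_element.get(f"{{{W_NS}}}val")
            if alias_element is not None:
                alias = alias_element.get(f"{{{W_NS}}}val")
        controls.append((tag, alias))
    return controls


# Hypothetical fragment: one block-level control and one inline control
SAMPLE_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    '<w:sdt><w:sdtPr><w:alias w:val="c1"/><w:tag w:val="c1"/></w:sdtPr>'
    '<w:sdtContent><w:p/></w:sdtContent></w:sdt>'
    '<w:p><w:sdt><w:sdtPr><w:tag w:val="text1"/></w:sdtPr>'
    '<w:sdtContent><w:r><w:t>x</w:t></w:r></w:sdtContent></w:sdt></w:p>'
    '</w:body></w:document>'
)

print(list_content_controls(SAMPLE_XML))
```

To run it on a real file, the XML can be read from the word/document.xml entry of the .docx package with zipfile.ZipFile.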

Modify Content Controls within Paragraphs using Python

In Spire.Doc, the object type for content controls within paragraphs is StructureDocumentTagInline. To modify these, you would traverse the Paragraph.ChildObjects collection to locate objects of type StructureDocumentTagInline. Here are the detailed steps:

  • Instantiate a Document object.
  • Load a Word document using the Document.LoadFromFile() method.
  • Get the body of a section in the document via Section.Body.
  • Retrieve the first paragraph of the text body using Body.Paragraphs.get_Item(0).
  • Traverse the collection of child objects within Paragraph.ChildObjects, identifying those that are of type StructureDocumentTagInline.
  • Within the StructureDocumentTagInline.ChildObjects sub-collection, execute modification operations according to the type of each child object.
  • Save the changes back to the Word document using the Document.SaveToFile() method.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a new Document object
doc = Document()

# Load document content from a file
doc.LoadFromFile("Sample2.docx")

# Get the body of the document
body = doc.Sections.get_Item(0).Body

# Get the first paragraph in the body
paragraph = body.Paragraphs.get_Item(0)

# Iterate through child objects in the paragraph
for i in range(paragraph.ChildObjects.Count):
    obj = paragraph.ChildObjects.get_Item(i)

    # Check if the child object is StructureDocumentTagInline
    if obj.DocumentObjectType == DocumentObjectType.StructureDocumentTagInline:

        # Convert the child object to StructureDocumentTagInline type
        structure_document_tag_inline = (StructureDocumentTagInline)(obj)

        # Check if the Tag property is "text1"
        if structure_document_tag_inline.SDTProperties.Tag == "text1":

            # Iterate through child objects in the StructureDocumentTagInline object
            for j in range(structure_document_tag_inline.ChildObjects.Count):
                obj2 = structure_document_tag_inline.ChildObjects.get_Item(j)

                # Check if the child object is a TextRange object
                if obj2.DocumentObjectType == DocumentObjectType.TextRange:

                    # Convert the child object to TextRange type
                    # (named text_range so the built-in range() is not shadowed)
                    text_range = (TextRange)(obj2)

                    # Set the text content to a specified content
                    text_range.Text = "97-2003/2007/2010/2013/2016/2019"

        # Check if the Tag property is "logo1"
        if structure_document_tag_inline.SDTProperties.Tag == "logo1":

            # Iterate through child objects in the StructureDocumentTagInline object
            for j in range(structure_document_tag_inline.ChildObjects.Count):
                obj2 = structure_document_tag_inline.ChildObjects.get_Item(j)

                # Check if the child object is an image
                if obj2.DocumentObjectType == DocumentObjectType.Picture:

                    # Convert the child object to DocPicture type
                    doc_picture = (DocPicture)(obj2)

                    # Load a specified image
                    doc_picture.LoadImage("DOC-Python.png")

                    # Set the width and height of the image
                    doc_picture.Width = 100
                    doc_picture.Height = 100

# Save the modified document to a new file
doc.SaveToFile("ModifiedContentControlsInParagraph.docx", FileFormat.Docx2016)

# Release resources of the Document object
doc.Close()
doc.Dispose()

Modify Content Controls Wrapping Table Rows using Python

In Spire.Doc, the object type for content controls within table rows is StructureDocumentTagRow. To modify these controls, you need to traverse the Table.ChildObjects collection to find objects of type StructureDocumentTagRow. Here are the detailed steps:

  • Create a Document object.
  • Load a Word document using the Document.LoadFromFile() method.
  • Retrieve the body of a section within the document using Section.Body.
  • Obtain the first table in the text body via Body.Tables.get_Item(0).
  • Traverse the collection of child objects within Table.ChildObjects, identifying those that are of type StructureDocumentTagRow.
  • Access StructureDocumentTagRow.Cells collection to iterate through the cells within this controlled row, and then execute the appropriate modification actions on the cell contents.
  • Lastly, use the Document.SaveToFile() method to persist the changes made to the document.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a new document object
doc = Document()

# Load the document from a file
doc.LoadFromFile("Sample3.docx")

# Get the body of the document
body = doc.Sections.get_Item(0).Body

# Get the first table
table = body.Tables.get_Item(0)

# Iterate through the child objects in the table
for i in range(table.ChildObjects.Count):
    obj = table.ChildObjects.get_Item(i)

    # Check if the child object is of type StructureDocumentTagRow
    if obj.DocumentObjectType == DocumentObjectType.StructureDocumentTagRow:

        # Convert the child object to a StructureDocumentTagRow object
        structureDocumentTagRow = (StructureDocumentTagRow)(obj)

        # Check if the Tag property of the StructureDocumentTagRow is "row1"
        if structureDocumentTagRow.SDTProperties.Tag == "row1":

            # Clear the paragraphs in the cell
            structureDocumentTagRow.Cells.get_Item(0).Paragraphs.Clear()

            # Add a paragraph in the cell and set the text
            textRange = structureDocumentTagRow.Cells.get_Item(0).AddParagraph().AppendText("Arts")
            textRange.CharacterFormat.TextColor = Color.get_Blue()
      
# Save the modified document to a file
doc.SaveToFile("ModifiedTableRowContentControl.docx", FileFormat.Docx2016)

# Release document resources
doc.Close()
doc.Dispose()

Modify Content Controls Wrapping Table Cells using Python

In Spire.Doc, the object type for content controls within table cells is StructureDocumentTagCell. To manipulate these controls, you need to traverse the TableRow.ChildObjects collection to locate objects of type StructureDocumentTagCell. Here are the detailed steps:

  • Create a Document object.
  • Load a Word document using the Document.LoadFromFile() method.
  • Retrieve the body of a section in the document using Section.Body.
  • Obtain the first table in the body using Body.Tables.get_Item(0).
  • Traverse the collection of rows in the table.
  • Within each TableRow, traverse its child objects TableRow.ChildObjects to identify those of type StructureDocumentTagCell.
  • Access StructureDocumentTagCell.Paragraphs collection. This allows you to iterate through the paragraphs within the cell and apply the necessary modification operations to the content.
  • Finally, use the Document.SaveToFile() method to save the modified document.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a new document object
doc = Document()

# Load the document from a file
doc.LoadFromFile("Sample4.docx")

# Get the body of the document
body = doc.Sections.get_Item(0).Body

# Get the first table in the document
table = body.Tables.get_Item(0)

# Iterate through the rows of the table
for i in range(table.Rows.Count):
    row = table.Rows.get_Item(i)

    # Iterate through the child objects in each row
    for j in range(row.ChildObjects.Count):
        obj = row.ChildObjects.get_Item(j)

        # Check if the child object is a StructureDocumentTagCell
        if obj.DocumentObjectType == DocumentObjectType.StructureDocumentTagCell:

            # Convert the child object to StructureDocumentTagCell type
            structureDocumentTagCell = (StructureDocumentTagCell)(obj)

            # Check if the Tag property of structureDocumentTagCell is "cell1"
            if structureDocumentTagCell.SDTProperties.Tag == "cell1":
                
                # Clear the paragraphs in the cell
                structureDocumentTagCell.Paragraphs.Clear()

                # Add a new paragraph and add text to it
                textRange = structureDocumentTagCell.AddParagraph().AppendText("92")
                textRange.CharacterFormat.TextColor = Color.get_Blue()

# Save the modified document to a new file
doc.SaveToFile("ModifiedTableCellContentControl.docx", FileFormat.Docx2016)

# Dispose of the document object
doc.Close()
doc.Dispose()

Modify Content Controls within Table Cells using Python

This case demonstrates modifying content controls within paragraphs inside table cells. The process involves navigating to the paragraph collection TableCell.Paragraphs within each cell, then iterating through each paragraph's child objects (Paragraph.ChildObjects) to locate StructureDocumentTagInline objects for modification. Here are the detailed steps:

  • Initiate a Document instance.
  • Use the Document.LoadFromFile() method to load a Word document.
  • Retrieve the body of a section in the document with Section.Body.
  • Obtain the first table in the body via Body.Tables.get_Item(0).
  • Traverse the table rows collection (Table.Rows), engaging with each TableRow object.
  • For each TableRow, navigate its cells collection (TableRow.Cells), entering each TableCell object.
  • Within each TableCell, traverse its paragraph collection (TableCell.Paragraphs), examining each Paragraph object.
  • In each paragraph, traverse its child objects (Paragraph.ChildObjects), identifying StructureDocumentTagInline instances for modification.
  • Within the StructureDocumentTagInline.ChildObjects collection, apply the appropriate edits based on the type of each child object.
  • Finally, utilize Document.SaveToFile() to commit the changes to the document.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a new Document object
doc = Document()

# Load document content from file
doc.LoadFromFile("Sample5.docx")

# Get the body of the document
body = doc.Sections.get_Item(0).Body

# Get the first table
table = body.Tables.get_Item(0)

# Iterate through the rows of the table
for r in range(table.Rows.Count):
    row = table.Rows.get_Item(r)
    for c in range(row.Cells.Count):
        cell = row.Cells.get_Item(c)
        for p in range(cell.Paragraphs.Count):
            paragraph = cell.Paragraphs.get_Item(p)
            for i in range(paragraph.ChildObjects.Count):
                obj = paragraph.ChildObjects.get_Item(i)

                # Check if the child object is of type StructureDocumentTagInline
                if obj.DocumentObjectType == DocumentObjectType.StructureDocumentTagInline:

                    # Convert to StructureDocumentTagInline object
                    structure_document_tag_inline = (StructureDocumentTagInline)(obj)

                    # Check if the Tag property of StructureDocumentTagInline is "test1"
                    if structure_document_tag_inline.SDTProperties.Tag == "test1":

                        # Iterate through the child objects of StructureDocumentTagInline
                        for j in range(structure_document_tag_inline.ChildObjects.Count):
                            obj2 = structure_document_tag_inline.ChildObjects.get_Item(j)

                            # Check if the child object is of type TextRange
                            if obj2.DocumentObjectType == DocumentObjectType.TextRange:

                                # Convert to TextRange object
                                textRange = (TextRange)(obj2)

                                # Set the text content
                                textRange.Text = "89"

                                # Set text color
                                textRange.CharacterFormat.TextColor = Color.get_Blue()

# Save the modified document to a new file
doc.SaveToFile("ModifiedContentControlInParagraphOfTableCell.docx", FileFormat.Docx2016)

# Dispose of the Document object resources
doc.Close()
doc.Dispose()

A table of contents makes a Word document easier to navigate and read. It serves as a road map that lets readers review the structure at a glance and jump to any section, which is particularly valuable for lengthy reports, papers, or manuals. It is also easy to maintain: after the document is restructured, the table of contents can be updated to reflect the latest organization. This article demonstrates how to use Spire.Doc for Python to create a table of contents in a newly created Word document within a Python project.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed on Windows through the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to: How to Install Spire.Doc for Python on Windows

Python Create a Table Of Contents Using Heading Styles

Using heading styles is the default way to generate a table of contents in Word: titles and subtitles are marked with different levels of heading styles, and Word's table of contents feature then populates the entries automatically. Here are the detailed steps:

  • Create a Document object.
  • Add a section using the Document.AddSection() method.
  • Add a paragraph using the Section.AddParagraph() method.
  • Create a table of contents object using the Paragraph.AppendTOC(int lowerLevel, int upperLevel) method.
  • Create a CharacterFormat object and set the font.
  • Apply a heading style to the paragraph using the Paragraph.ApplyStyle(BuiltinStyle.Heading1) method.
  • Add text content using the Paragraph.AppendText() method.
  • Apply character formatting to the text using the TextRange.ApplyCharacterFormat() method.
  • Update the table of contents using the Document.UpdateTableOfContents() method.
  • Save the document using the Document.SaveToFile() method.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a new document object
doc = Document()

# Add a section to the document
section = doc.AddSection()

# Append a Table of Contents (TOC) paragraph
TOC_paragraph = section.AddParagraph()
TOC_paragraph.AppendTOC(1, 3)

# Create and set character format objects for font
character_format1 = CharacterFormat(doc)
character_format1.FontName = "Microsoft YaHei"

character_format2 = CharacterFormat(doc)
character_format2.FontName = "Microsoft YaHei"
character_format2.FontSize = 12

# Add a paragraph with Heading 1 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading1)

# Add text and apply character formatting
text_range1 = paragraph.AppendText("Overview")
text_range1.ApplyCharacterFormat(character_format1)

# Insert normal content
paragraph = section.Body.AddParagraph()
text_range2 = paragraph.AppendText("Spire.Doc for Python is a professional Python Word development component that enables developers to easily integrate Word document creation, reading, editing, and conversion functionalities into their own Python applications. As a completely standalone component, Spire.Doc for Python does not require the installation of Microsoft Word on the runtime environment.")

# Add a paragraph with Heading 1 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading1)
text_range1 = paragraph.AppendText("Main Functions")
text_range1.ApplyCharacterFormat(character_format1)

# Add a paragraph with Heading 2 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading2)
textRange1 = paragraph.AppendText("Only Spire.Doc, No Microsoft Office Automation")
textRange1.ApplyCharacterFormat(character_format1)

# Add regular content
paragraph = section.Body.AddParagraph()
textRange2 = paragraph.AppendText("Spire.Doc for Python is a totally independent Python Word class library which doesn't require Microsoft Office installed on system. Microsoft Office Automation is proved to be unstable, slow and not scalable to produce MS Word documents. Spire.Doc for Python is many times faster than Microsoft Word Automation and with much better stability and scalability.")
textRange2.ApplyCharacterFormat(character_format2)

# Add a paragraph with Heading 3 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading3)
textRange1 = paragraph.AppendText("Word Versions")
textRange1.ApplyCharacterFormat(character_format1)
paragraph = section.Body.AddParagraph()
textRange2 = paragraph.AppendText("Word97-03  Word2007  Word2010  Word2013  Word2016  Word2019")
textRange2.ApplyCharacterFormat(character_format2)

# Add a paragraph with Heading 2 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading2)
textRange1 = paragraph.AppendText("Convert File Documents with High Quality")
textRange1.ApplyCharacterFormat(character_format1)

# Add regular content
paragraph = section.Body.AddParagraph()
textRange2 = paragraph.AppendText("By using Spire.Doc for Python, users can save Word Doc/Docx to stream, save as web response and convert Word Doc/Docx to XML, Markdown, RTF, EMF, TXT, XPS, EPUB, HTML, SVG, ODT and vice versa. Spire.Doc for Python also supports to convert Word Doc/Docx to PDF and HTML to image.")
textRange2.ApplyCharacterFormat(character_format2)

# Add a paragraph with Heading 2 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading2)
textRange1 = paragraph.AppendText("Other Technical Features")
textRange1.ApplyCharacterFormat(character_format1)

# Add regular content
paragraph = section.Body.AddParagraph()
textRange2 = paragraph.AppendText("By using Spire.Doc for Python, developers can build any type of a 64-bit Python application to create and handle Word documents.")
textRange2.ApplyCharacterFormat(character_format2)

# Update the table of contents
doc.UpdateTableOfContents()

# Save the document
doc.SaveToFile("CreateTOCUsingHeadingStyles.docx", FileFormat.Docx2016)

# Release resources
doc.Dispose()

Python: Create a Table Of Contents for a Newly Created Word Document
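
The two arguments of AppendTOC(1, 3) are the lower and upper heading levels to include. The following standard-library sketch only illustrates that level filter on a list of (level, title) pairs taken from the example above. It is a conceptual model under that assumption, not how Spire.Doc or Word builds the field, and the function name build_toc is hypothetical.

```python
def build_toc(headings, lower_level=1, upper_level=3, indent="    "):
    """Return indented TOC lines for headings whose level is within range."""
    lines = []
    for level, title in headings:
        if lower_level <= level <= upper_level:
            # Indent each entry by its depth relative to the lowest level
            lines.append(indent * (level - lower_level) + title)
    return lines


# Heading levels and titles used in the example document above
headings = [
    (1, "Overview"),
    (1, "Main Functions"),
    (2, "Only Spire.Doc, No Microsoft Office Automation"),
    (3, "Word Versions"),
    (2, "Convert File Documents with High Quality"),
    (2, "Other Technical Features"),
]

for line in build_toc(headings, 1, 3):
    print(line)
```

Narrowing the range, for example build_toc(headings, 1, 2), drops the Heading 3 entry in the same way that a smaller upper level leaves it out of the generated table of contents.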

Python Create a Table Of Contents Using Outline Level Styles

In a Word document, you can also create a table of contents using outline level styles. Assign an outline level to a paragraph style using the ParagraphFormat.OutlineLevel property, then map the styles to table of contents levels using the TableOfContent.SetTOCLevelStyle() method. Here are the detailed steps:

  • Create a Document object.
  • Add a section using the Document.AddSection() method.
  • Create a ParagraphStyle object and set the outline level using ParagraphStyle.ParagraphFormat.OutlineLevel = OutlineLevel.Level1.
  • Add the created ParagraphStyle object to the document using the Document.Styles.Add() method.
  • Add a paragraph using the Section.AddParagraph() method.
  • Create a table of contents object using the Paragraph.AppendTOC(int lowerLevel, int upperLevel) method.
  • Turn off the default use of heading styles for the table of contents by setting TableOfContent.UseHeadingStyles = False.
  • Apply the outline level style to the table of contents rules using the TableOfContent.SetTOCLevelStyle(int levelNumber, string styleName) method.
  • Create a CharacterFormat object and set the font.
  • Apply the style to the paragraph using the Paragraph.ApplyStyle(ParagraphStyle.Name) method.
  • Add text content using the Paragraph.AppendText() method.
  • Apply character formatting to the text using the TextRange.ApplyCharacterFormat() method.
  • Update the table of contents using the Document.UpdateTableOfContents() method.
  • Save the document using the Document.SaveToFile() method.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a document object
doc = Document()

# Add a section to the document
section = doc.AddSection()

# Define Outline Level 1
titleStyle1 = ParagraphStyle(doc)
titleStyle1.Name = "T1S"
titleStyle1.ParagraphFormat.OutlineLevel = OutlineLevel.Level1
titleStyle1.CharacterFormat.Bold = True
titleStyle1.CharacterFormat.FontName = "Microsoft YaHei"
titleStyle1.CharacterFormat.FontSize = 18
titleStyle1.ParagraphFormat.HorizontalAlignment = HorizontalAlignment.Left
doc.Styles.Add(titleStyle1)

# Define Outline Level 2
titleStyle2 = ParagraphStyle(doc)
titleStyle2.Name = "T2S"
titleStyle2.ParagraphFormat.OutlineLevel = OutlineLevel.Level2
titleStyle2.CharacterFormat.Bold = True
titleStyle2.CharacterFormat.FontName = "Microsoft YaHei"
titleStyle2.CharacterFormat.FontSize = 16
titleStyle2.ParagraphFormat.HorizontalAlignment = HorizontalAlignment.Left
doc.Styles.Add(titleStyle2)

# Define Outline Level 3
titleStyle3 = ParagraphStyle(doc)
titleStyle3.Name = "T3S"
titleStyle3.ParagraphFormat.OutlineLevel = OutlineLevel.Level3
titleStyle3.CharacterFormat.Bold = True
titleStyle3.CharacterFormat.FontName = "Microsoft YaHei"
titleStyle3.CharacterFormat.FontSize = 14
titleStyle3.ParagraphFormat.HorizontalAlignment = HorizontalAlignment.Left
doc.Styles.Add(titleStyle3)

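# Add a paragraph and append a table of contents covering levels 1 to 3
TOCparagraph = section.AddParagraph()
toc = TOCparagraph.AppendTOC(1, 3)

# Build the table of contents from the custom styles instead of the built-in heading styles
toc.UseHeadingStyles = False
toc.UseHyperlinks = True
toc.UseTableEntryFields = False
toc.RightAlignPageNumbers = True

# Map each table of contents level to the custom style defined above
toc.SetTOCLevelStyle(1, titleStyle1.Name)
toc.SetTOCLevelStyle(2, titleStyle2.Name)
toc.SetTOCLevelStyle(3, titleStyle3.Name)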

# Define character format
characterFormat = CharacterFormat(doc)
characterFormat.FontName = "Microsoft YaHei"
characterFormat.FontSize = 12

# Add a paragraph and apply outline level style 1
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle1.Name)
paragraph.AppendText("Overview")

# Add a paragraph and set the text content
paragraph = section.Body.AddParagraph()
textRange = paragraph.AppendText("Spire.Doc for Python is a professional Word Python API specifically designed for developers to create, read, write, convert, and compare Word documents with fast and high-quality performance.")
textRange.ApplyCharacterFormat(characterFormat)

# Add a paragraph and apply outline level style 1
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle1.Name)
paragraph.AppendText("Main Functions")

# Add a paragraph and apply outline level style 2
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle2.Name)
paragraph.AppendText("Only Spire.Doc, No Microsoft Office Automation")

# Add a paragraph and set the text content
paragraph = section.Body.AddParagraph()
textRange = paragraph.AppendText("Spire.Doc for Python is a totally independent Python Word class library which doesn't require Microsoft Office installed on system. Microsoft Office Automation is proved to be unstable, slow and not scalable to produce MS Word documents. Spire.Doc for Python is many times faster than Microsoft Word Automation and with much better stability and scalability.")
textRange.ApplyCharacterFormat(characterFormat)

# Add a paragraph and apply outline level style 3
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle3.Name)
paragraph.AppendText("Word Versions")

# Add a paragraph and set the text content
paragraph = section.Body.AddParagraph()
textRange = paragraph.AppendText("Word97-03  Word2007  Word2010  Word2013  Word2016  Word2019")
textRange.ApplyCharacterFormat(characterFormat)

# Add a paragraph and apply outline level style 2
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle2.Name)
paragraph.AppendText("Convert File Documents with High Quality")

# Add a paragraph and set the text content
paragraph = section.Body.AddParagraph()
textRange = paragraph.AppendText("By using Spire.Doc for Python, users can save Word Doc/Docx to stream, save as web response and convert Word Doc/Docx to XML, RTF, EMF, TXT, XPS, EPUB, HTML, SVG, ODT and vice versa. Spire.Doc for Python also supports to convert Word Doc/Docx to PDF and HTML to image.")
textRange.ApplyCharacterFormat(characterFormat)

# Add a paragraph and apply outline level style 2
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle2.Name)
paragraph.AppendText("Other Technical Features")

# Add a paragraph and set the text content
paragraph = section.Body.AddParagraph()
textRange = paragraph.AppendText("By using Spire.Doc for Python, developers can build any type of a 64-bit Python application to create and handle Word documents.")
textRange.ApplyCharacterFormat(characterFormat)

# Update the table of contents
doc.UpdateTableOfContents()

# Save the document
doc.SaveToFile("CreateTOCUsingOutlineStyles.docx", FileFormat.Docx2016)

# Release resources
doc.Dispose()
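
To make the effect of SetTOCLevelStyle() easier to picture, here is a minimal, library-free sketch of the selection logic. It is only a simplified model and not how Spire.Doc works internally: the function build_toc and the "Body" style name are hypothetical, while the style names T1S, T2S and T3S and the heading texts are taken from the example above.

```python
# Simplified, library-free model of a style-driven table of contents.
# Each (style name, text) pair mirrors a paragraph added in the example above;
# "Body" is a hypothetical name for paragraphs without an outline-level style.
paragraphs = [
    ("T1S", "Overview"),
    ("Body", "Introductory body text"),
    ("T1S", "Main Functions"),
    ("T2S", "Only Spire.Doc, No Microsoft Office Automation"),
    ("T3S", "Word Versions"),
    ("T2S", "Convert File Documents with High Quality"),
    ("T2S", "Other Technical Features"),
]

# The same level-to-style mapping the example registers with SetTOCLevelStyle()
level_styles = {1: "T1S", 2: "T2S", 3: "T3S"}


def build_toc(paragraphs, level_styles, lower=1, upper=3):
    """Return (level, text) entries for paragraphs whose style maps to a level in range."""
    style_to_level = {style: level for level, style in level_styles.items()}
    entries = []
    for style, text in paragraphs:
        level = style_to_level.get(style)
        if level is not None and lower <= level <= upper:
            entries.append((level, text))
    return entries


toc_entries = build_toc(paragraphs, level_styles)
for level, text in toc_entries:
    # Indent each entry according to its level
    print("    " * (level - 1) + text)
```

With the range 1 to 3 the sketch lists all six headings and skips the body paragraph. Narrowing the range to 1 to 2 would drop "Word Versions", which roughly mirrors the role of the two arguments passed to AppendTOC().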

Python Create a Table Of Contents Using Image Captions

Using the Spire.Doc library, you can create a table of contents based on image captions by constructing a TableOfContent object with the field switches " \\h \\z \\c \"Picture\"". Below are the detailed steps:

  • Create a Document object.
  • Add a section using the Document.AddSection() method.
  • Create a table of contents object with tocForImage = TableOfContent(Document, " \\h \\z \\c \"Picture\""), where the switch string specifies the style of the table of contents.
  • Add a paragraph using the Section.AddParagraph() method.
  • Add the table of content object to the paragraph using the Paragraph.Items.Add(tocForImage) method.
  • Add a field separator using the Paragraph.AppendFieldMark(FieldMarkType.FieldSeparator) method.
  • Add the text content "TOC" using the Paragraph.AppendText("TOC") method.
  • Add a field end mark using the Paragraph.AppendFieldMark(FieldMarkType.FieldEnd) method.
  • Add an image using the Paragraph.AppendPicture() method.
  • Add a caption paragraph for the image using the DocPicture.AddCaption() method, including product information and formatting.
  • Update the table of contents to reflect changes in the document using the Document.UpdateTableOfContents(tocForImage) method.
  • Save the document using the Document.SaveToFile() method.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a new document object
doc = Document()

# Add a section to the document
section = doc.AddSection()

# Create a table of content object for images
tocForImage = TableOfContent(doc, " \\h \\z \\c \"Picture\"")

# Add a paragraph to the section
tocParagraph = section.Body.AddParagraph()

# Add the TOC object to the paragraph
tocParagraph.Items.Add(tocForImage)

# Add a field separator
tocParagraph.AppendFieldMark(FieldMarkType.FieldSeparator)

# Add text content
tocParagraph.AppendText("TOC")

# Add a field end mark
tocParagraph.AppendFieldMark(FieldMarkType.FieldEnd)

# Add a blank paragraph to the section
section.Body.AddParagraph()

# Add a paragraph to the section
paragraph = section.Body.AddParagraph()

# Add an image
docPicture = paragraph.AppendPicture("images/DOC-Python.png")
docPicture.Width = 100
docPicture.Height = 100

# Add a caption paragraph for the image
obj = docPicture.AddCaption("Picture", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)

# Convert the returned object to a Paragraph so that text can be appended
paragraph = Paragraph(obj)
paragraph.AppendText("  Spire.Doc for Python product")
paragraph.Format.AfterSpacing = 20

# Continue adding paragraphs to the section
paragraph = section.Body.AddParagraph()
docPicture = paragraph.AppendPicture("images/PDF-Python.png")
docPicture.Width = 100
docPicture.Height = 100
obj = docPicture.AddCaption("Picture", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
paragraph = Paragraph(obj)
paragraph.AppendText("  Spire.PDF for Python product")
paragraph.Format.AfterSpacing = 20

paragraph = section.Body.AddParagraph()
docPicture = paragraph.AppendPicture("images/XLS-Python.png")
docPicture.Width = 100
docPicture.Height = 100
obj = docPicture.AddCaption("Picture", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
paragraph = Paragraph(obj)
paragraph.AppendText("  Spire.XLS for Python product")
paragraph.Format.AfterSpacing = 20

paragraph = section.Body.AddParagraph()
docPicture = paragraph.AppendPicture("images/PPT-Python.png")
docPicture.Width = 100
docPicture.Height = 100
obj = docPicture.AddCaption("Picture", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
paragraph = Paragraph(obj)
paragraph.AppendText("  Spire.Presentation for Python product")
paragraph.Format.AfterSpacing = 20

# Update the table of contents
doc.UpdateTableOfContents(tocForImage)

# Save the document to a file
doc.SaveToFile("CreateTOCWithImageCaptions.docx", FileFormat.Docx2016)

# Dispose of the document object
doc.Dispose()
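
The switch string passed to TableOfContent is easy to mistype because of the doubled backslashes and escaped quotes. The sketch below, which does not need Spire.Doc, builds the same text with a raw f-string. The helper name caption_toc_switches is hypothetical and not part of the library. In Word's TOC field syntax, \h turns the entries into hyperlinks, \z hides the tab leader and page numbers in Web Layout view, and \c collects the captions that carry the given label.

```python
def caption_toc_switches(label):
    """Build the TOC field switches for a caption label such as "Picture" or "Table"."""
    # Raw f-string: backslashes stay literal, so they do not need to be doubled
    return rf' \h \z \c "{label}"'


picture_switches = caption_toc_switches("Picture")
print(picture_switches)

# The escaped literal used in the example above produces exactly the same text
assert picture_switches == " \\h \\z \\c \"Picture\""
```

Assuming the constructor accepts any string with this content, the result could be passed in place of the hand-escaped literal, for example TableOfContent(doc, caption_toc_switches("Picture")).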

Python Create a Table Of Contents Using Table Captions

Similarly, you can create a table of contents based on table captions by constructing a TableOfContent object with the field switches " \\h \\z \\c \"Table\"". Here are the detailed steps:

  • Create a Document object.
  • Add a section using the Document.AddSection() method.
  • Create a table of contents object with tocForTable = TableOfContent(Document, " \\h \\z \\c \"Table\""), where the switch string specifies the style of the table of contents.
  • Add a paragraph using the Section.AddParagraph() method.
  • Add the table of content object to the paragraph using the Paragraph.Items.Add(tocForTable) method.
  • Add a field separator using the Paragraph.AppendFieldMark(FieldMarkType.FieldSeparator) method.
  • Add the text content "TOC" using the Paragraph.AppendText("TOC") method.
  • Add a field end mark using the Paragraph.AppendFieldMark(FieldMarkType.FieldEnd) method.
  • Add a table using the Section.AddTable() method and set the number of rows and columns using the Table.ResetCells(int rowsNum, int columnsNum) method.
  • Add a table caption paragraph using the Table.AddCaption() method, including product information and formatting.
  • Update the table of contents to reflect changes in the document using the Document.UpdateTableOfContents(tocForTable) method.
  • Save the document using the Document.SaveToFile() method.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a new document
doc = Document()

# Add a section to the document
section = doc.AddSection()

# Create a TableOfContent object
tocForTable = TableOfContent(doc,  " \\h \\z \\c \"Table\"")

# Add a paragraph in the section to place the TableOfContent object
tocParagraph = section.Body.AddParagraph()
tocParagraph.Items.Add(tocForTable)
tocParagraph.AppendFieldMark(FieldMarkType.FieldSeparator)
tocParagraph.AppendText("TOC")
tocParagraph.AppendFieldMark(FieldMarkType.FieldEnd)

# Add two empty paragraphs in the section
section.Body.AddParagraph()
section.Body.AddParagraph()

# Add a table in the section
table = section.Body.AddTable(True)
table.ResetCells(1, 3)

# Add a caption paragraph for the table
obj = table.AddCaption("Table", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
paragraph = Paragraph(obj)
paragraph.AppendText(" One row three columns")
paragraph.Format.AfterSpacing = 20

# Add a new table in the section
table = section.Body.AddTable(True)
table.ResetCells(3, 3)

# Add a caption paragraph for the second table
obj = table.AddCaption("Table", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
paragraph = Paragraph(obj)
paragraph.AppendText(" Three rows three columns")
paragraph.Format.AfterSpacing = 20

# Add another new table in the section
table = section.Body.AddTable(True)
table.ResetCells(5, 3)

# Add a caption paragraph for the third table
obj = table.AddCaption("Table", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
paragraph = Paragraph(obj)
paragraph.AppendText(" Five rows three columns")
paragraph.Format.AfterSpacing = 20

# Update the table of contents
doc.UpdateTableOfContents(tocForTable)

# Save the document to a specified file
doc.SaveToFile("CreateTOCUsingTableCaptions.docx", FileFormat.Docx2016)

# Dispose resources
doc.Dispose()
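
As a rough illustration of why the \c "Table" switch picks up only table captions, the following library-free sketch numbers captions per label and then filters them by label. It is a simplified model under stated assumptions: the helper names number_captions and list_for_label are hypothetical, and the exact caption text produced by AddCaption() is assumed to be the label followed by a running number.

```python
from collections import defaultdict


def number_captions(items):
    """Assign a separate running number to each label, e.g. Table 1, Table 2, Picture 1."""
    counters = defaultdict(int)
    captions = []
    for label, text in items:
        counters[label] += 1
        captions.append((label, f"{label} {counters[label]}{text}"))
    return captions


def list_for_label(captions, label):
    """Keep only the captions whose label matches, as a \\c switch would."""
    return [caption for caption_label, caption in captions if caption_label == label]


# Caption texts reused from the examples in this article
items = [
    ("Table", " One row three columns"),
    ("Picture", "  Spire.Doc for Python product"),
    ("Table", " Three rows three columns"),
    ("Table", " Five rows three columns"),
]
captions = number_captions(items)
table_list = list_for_label(captions, "Table")
for entry in table_list:
    print(entry)
```

Because each label keeps its own counter, a document can mix picture and table captions and still produce one list per label.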

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
