Python: Modify Content Controls in a Word Document

2024-05-31 01:15:59 Written by Koohji

Content controls in Word documents are interactive elements that let users add, remove, or adjust specific sections of content while preserving the document structure, which makes documents easier to edit, iterate on, and customize. This article shows how to use Spire.Doc for Python to modify content controls in Word documents within a Python project.

Modify Content Controls in the Body using Python
Modify Content Controls within Paragraphs using Python
Modify Content Controls Wrapping Table Rows using Python
Modify Content Controls Wrapping Table Cells using Python
Modify Content Controls within Table Cells using Python

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be installed from the VS Code terminal with the following pip command.

Package Manager

pip install Spire.Doc

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows

Modify Content Controls in the Body using Python

In Spire.Doc, the object type for the body content control is StructureDocumentTag. To modify these controls, one needs to traverse the Section.Body.ChildObjects collection to locate objects of type StructureDocumentTag. Below are the detailed steps:

Create a Document object.
Use the Document.LoadFromFile() method to load a Word document into memory.
Retrieve the body of a section in the document using Section.Body.
Traverse the collection of child objects within Body.ChildObjects, identifying those that are of type StructureDocumentTag.
Within the StructureDocumentTag.ChildObjects sub-collection, perform modifications based on the type of each child object.
Finally, utilize the Document.SaveToFile() method to save the changes back to the Word document.

Python

from spire.doc import *
from spire.doc.common import *

# Create a new document object
doc = Document()

# Load the document content from a file
doc.LoadFromFile("Sample1.docx")

# Get the body of the document
body = doc.Sections.get_Item(0).Body

# Create lists for paragraphs and tables
paragraphs = []
tables = []

for i in range(body.ChildObjects.Count):
    obj = body.ChildObjects.get_Item(i)
    # If it is a StructureDocumentTag object
    if obj.DocumentObjectType == DocumentObjectType.StructureDocumentTag:
        sdt = (StructureDocumentTag)(obj)

        # If the tag is "c1" or the alias is "c1"
        if sdt.SDTProperties.Tag == "c1" or sdt.SDTProperties.Alias == "c1":
            for j in range(sdt.ChildObjects.Count):
                child_obj = sdt.ChildObjects.get_Item(j)

                # If it is a paragraph object
                if child_obj.DocumentObjectType == DocumentObjectType.Paragraph:
                    paragraphs.append(child_obj)

                # If it is a table object
                elif child_obj.DocumentObjectType == DocumentObjectType.Table:
                    tables.append(child_obj)

# Modify the text content of the first paragraph
if paragraphs:
    (Paragraph)(paragraphs[0]).Text = "Spire.Doc for Python is a totally independent Python Word class library which doesn't require Microsoft Office installed on system."

if tables:
    # Reset the cells of the first table
    (Table)(tables[0]).ResetCells(5, 4)

# Save the modified document to a file
doc.SaveToFile("ModifyBodyContentControls.docx", FileFormat.Docx2016)

# Release document resources
doc.Close()
doc.Dispose()
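
The index-based loops above (Count plus get_Item) recur in every example in this article. As a minimal sketch, they could be wrapped in two small generator helpers. The names iter_children and children_of_type are hypothetical and not part of Spire.Doc; they assume only the ChildObjects.Count, ChildObjects.get_Item() and DocumentObjectType members used above. The stand-in objects at the end exist only so the sketch runs without the library installed.

```python
from types import SimpleNamespace


def iter_children(parent):
    """Yield each item of parent.ChildObjects using the Count/get_Item pattern."""
    children = parent.ChildObjects
    for index in range(children.Count):
        yield children.get_Item(index)


def children_of_type(parent, object_type):
    """Yield only the children whose DocumentObjectType equals object_type."""
    for child in iter_children(parent):
        if child.DocumentObjectType == object_type:
            yield child


def _fake_parent(type_names):
    """Build a stand-in parent (assumption: mimics only the members used above)."""
    items = [SimpleNamespace(DocumentObjectType=name) for name in type_names]
    collection = SimpleNamespace(Count=len(items), get_Item=lambda i: items[i])
    return SimpleNamespace(ChildObjects=collection)


# Demo with plain strings in place of DocumentObjectType enum members
demo = _fake_parent(["Paragraph", "Table", "Paragraph"])
print(len(list(children_of_type(demo, "Paragraph"))))
```

With Spire.Doc loaded, the same helper would presumably be called as children_of_type(body, DocumentObjectType.StructureDocumentTag).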

Modify Content Controls within Paragraphs using Python

In Spire.Doc, the object type for content controls within paragraphs is StructureDocumentTagInline. To modify these, you would traverse the Paragraph.ChildObjects collection to locate objects of type StructureDocumentTagInline. Here are the detailed steps:

Instantiate a Document object.
Load a Word document using the Document.LoadFromFile() method.
Get the body of a section in the document via Section.Body.
Retrieve the first paragraph of the text body using Body.Paragraphs.get_Item(0).
Traverse the collection of child objects within Paragraph.ChildObjects, identifying those that are of type StructureDocumentTagInline.
Within the StructureDocumentTagInline.ChildObjects sub-collection, execute modification operations according to the type of each child object.
Save the changes back to the Word document using the Document.SaveToFile() method.

Python

from spire.doc import *
from spire.doc.common import *

# Create a new Document object
doc = Document()

# Load document content from a file
doc.LoadFromFile("Sample2.docx")

# Get the body of the document
body = doc.Sections.get_Item(0).Body

# Get the first paragraph in the body
paragraph = body.Paragraphs.get_Item(0)

# Iterate through child objects in the paragraph
for i in range(paragraph.ChildObjects.Count):
    obj = paragraph.ChildObjects.get_Item(i)

    # Check if the child object is StructureDocumentTagInline
    if obj.DocumentObjectType == DocumentObjectType.StructureDocumentTagInline:

        # Convert the child object to StructureDocumentTagInline type
        structure_document_tag_inline = (StructureDocumentTagInline)(obj)

        # Check if the Tag property is "text1"
        if structure_document_tag_inline.SDTProperties.Tag == "text1":

            # Iterate through child objects in the StructureDocumentTagInline object
            for j in range(structure_document_tag_inline.ChildObjects.Count):
                obj2 = structure_document_tag_inline.ChildObjects.get_Item(j)

                # Check if the child object is a TextRange object
                if obj2.DocumentObjectType == DocumentObjectType.TextRange:

                    # Convert the child object to TextRange type
                    # (a variable named "range" would shadow the built-in range())
                    text_range = (TextRange)(obj2)

                    # Set the text content to a specified content
                    text_range.Text = "97-2003/2007/2010/2013/2016/2019"

        # Check if the Tag property is "logo1"
        if structure_document_tag_inline.SDTProperties.Tag == "logo1":

            # Iterate through child objects in the StructureDocumentTagInline object
            for j in range(structure_document_tag_inline.ChildObjects.Count):
                obj2 = structure_document_tag_inline.ChildObjects.get_Item(j)

                # Check if the child object is an image
                if obj2.DocumentObjectType == DocumentObjectType.Picture:

                    # Convert the child object to DocPicture type
                    doc_picture = (DocPicture)(obj2)

                    # Load a specified image
                    doc_picture.LoadImage("DOC-Python.png")

                    # Set the width and height of the image
                    doc_picture.Width = 100
                    doc_picture.Height = 100

# Save the modified document to a new file
doc.SaveToFile("ModifiedContentControlsInParagraph.docx", FileFormat.Docx2016)

# Release resources of the Document object
doc.Close()
doc.Dispose()
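
The code above compares only the Tag property, whereas the body example also accepts a matching Alias. A small predicate can keep that check in one place. This is a sketch: has_tag_or_alias is a hypothetical helper name, and it assumes only the SDTProperties.Tag and SDTProperties.Alias properties used earlier in this article. The SimpleNamespace objects are stand-ins so the snippet runs without Spire.Doc.

```python
from types import SimpleNamespace


def has_tag_or_alias(sdt, name):
    """Return True if the control's Tag or Alias equals name."""
    props = sdt.SDTProperties
    return props.Tag == name or props.Alias == name


# Stand-in controls (assumption: real objects expose the same two properties)
by_tag = SimpleNamespace(SDTProperties=SimpleNamespace(Tag="text1", Alias=""))
by_alias = SimpleNamespace(SDTProperties=SimpleNamespace(Tag="", Alias="logo1"))

print(has_tag_or_alias(by_tag, "text1"))
print(has_tag_or_alias(by_alias, "logo1"))
print(has_tag_or_alias(by_tag, "logo1"))
```

Whether to match on Alias as well is a design choice: a tag is usually the more stable identifier, while the alias is the title shown to the user in Word.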

Modify Content Controls Wrapping Table Rows using Python

In Spire.Doc, the object type for content controls that wrap table rows is StructureDocumentTagRow. To modify these controls, you need to traverse the Table.ChildObjects collection to find objects of type StructureDocumentTagRow. Here are the detailed steps:

Create a Document object.
Load a Word document using the Document.LoadFromFile() method.
Retrieve the body of a section within the document using Section.Body.
Obtain the first table in the text body via Body.Tables.get_Item(0).
Traverse the collection of child objects within Table.ChildObjects, identifying those that are of type StructureDocumentTagRow.
Access StructureDocumentTagRow.Cells collection to iterate through the cells within this controlled row, and then execute the appropriate modification actions on the cell contents.
Lastly, use the Document.SaveToFile() method to persist the changes made to the document.

Python

from spire.doc import *
from spire.doc.common import *

# Create a new document object
doc = Document()

# Load the document from a file
doc.LoadFromFile("Sample3.docx")

# Get the body of the document
body = doc.Sections.get_Item(0).Body

# Get the first table
table = body.Tables.get_Item(0)

# Iterate through the child objects in the table
for i in range(table.ChildObjects.Count):
    obj = table.ChildObjects.get_Item(i)

    # Check if the child object is of type StructureDocumentTagRow
    if obj.DocumentObjectType == DocumentObjectType.StructureDocumentTagRow:

        # Convert the child object to a StructureDocumentTagRow object
        structureDocumentTagRow = (StructureDocumentTagRow)(obj)

        # Check if the Tag property of the StructureDocumentTagRow is "row1"
        if structureDocumentTagRow.SDTProperties.Tag == "row1":

            # Clear the paragraphs in the cell
            structureDocumentTagRow.Cells.get_Item(0).Paragraphs.Clear()

            # Add a paragraph in the cell and set the text
            textRange = structureDocumentTagRow.Cells.get_Item(0).AddParagraph().AppendText("Arts")
            textRange.CharacterFormat.TextColor = Color.get_Blue()
      
# Save the modified document to a file
doc.SaveToFile("ModifiedTableRowContentControl.docx", FileFormat.Docx2016)

# Release document resources
doc.Close()
doc.Dispose()

Modify Content Controls Wrapping Table Cells using Python

In Spire.Doc, the object type for content controls that wrap table cells is StructureDocumentTagCell. To manipulate these controls, you need to traverse the TableRow.ChildObjects collection to locate objects of type StructureDocumentTagCell. Here are the detailed steps:

Create a Document object.
Load a Word document using the Document.LoadFromFile() method.
Retrieve the body of a section in the document using Section.Body.
Obtain the first table in the body using Body.Tables.get_Item(0).
Traverse the collection of rows in the table.
Within each TableRow, traverse its child objects TableRow.ChildObjects to identify those of type StructureDocumentTagCell.
Access StructureDocumentTagCell.Paragraphs collection. This allows you to iterate through the paragraphs within the cell and apply the necessary modification operations to the content.
Finally, use the Document.SaveToFile() method to save the modified document.

Python

from spire.doc import *
from spire.doc.common import *

# Create a new document object
doc = Document()

# Load the document from a file
doc.LoadFromFile("Sample4.docx")

# Get the body of the document
body = doc.Sections.get_Item(0).Body

# Get the first table in the document
table = body.Tables.get_Item(0)

# Iterate through the rows of the table
for i in range(table.Rows.Count):
    row = table.Rows.get_Item(i)

    # Iterate through the child objects in each row
    for j in range(row.ChildObjects.Count):
        obj = row.ChildObjects.get_Item(j)

        # Check if the child object is a StructureDocumentTagCell
        if obj.DocumentObjectType == DocumentObjectType.StructureDocumentTagCell:

            # Convert the child object to StructureDocumentTagCell type
            structureDocumentTagCell = (StructureDocumentTagCell)(obj)

            # Check if the Tag property of structureDocumentTagCell is "cell1"
            if structureDocumentTagCell.SDTProperties.Tag == "cell1":
                
                # Clear the paragraphs in the cell
                structureDocumentTagCell.Paragraphs.Clear()

                # Add a new paragraph and add text to it
                textRange = structureDocumentTagCell.AddParagraph().AppendText("92")
                textRange.CharacterFormat.TextColor = Color.get_Blue()

# Save the modified document to a new file
doc.SaveToFile("ModifiedTableCellContentControl.docx", FileFormat.Docx2016)

# Dispose of the document object
doc.Close()
doc.Dispose()

Modify Content Controls within Table Cells using Python

This case demonstrates modifying content controls within paragraphs inside table cells. The process involves navigating to the paragraph collection TableCell.Paragraphs within each cell, then iterating through each paragraph's child objects (Paragraph.ChildObjects) to locate StructureDocumentTagInline objects for modification. Here are the detailed steps:

Initiate a Document instance.
Use the Document.LoadFromFile() method to load a Word document.
Retrieve the body of a section in the document with Section.Body.
Obtain the first table in the body via Body.Tables.get_Item(0).
Traverse the table rows collection (Table.Rows), engaging with each TableRow object.
For each TableRow, navigate its cells collection (TableRow.Cells), entering each TableCell object.
Within each TableCell, traverse its paragraph collection (TableCell.Paragraphs), examining each Paragraph object.
In each paragraph, traverse its child objects (Paragraph.ChildObjects), identifying StructureDocumentTagInline instances for modification.
Within the StructureDocumentTagInline.ChildObjects collection, apply the appropriate edits based on the type of each child object.
Finally, utilize Document.SaveToFile() to commit the changes to the document.

Python

from spire.doc import *
from spire.doc.common import *

# Create a new Document object
doc = Document()

# Load document content from file
doc.LoadFromFile("Sample5.docx")

# Get the body of the document
body = doc.Sections.get_Item(0).Body

# Get the first table
table = body.Tables.get_Item(0)

# Iterate through the rows of the table
for r in range(table.Rows.Count):
    row = table.Rows.get_Item(r)
    for c in range(row.Cells.Count):
        cell = row.Cells.get_Item(c)
        for p in range(cell.Paragraphs.Count):
            paragraph = cell.Paragraphs.get_Item(p)
            for i in range(paragraph.ChildObjects.Count):
                obj = paragraph.ChildObjects.get_Item(i)

                # Check if the child object is of type StructureDocumentTagInline
                if obj.DocumentObjectType == DocumentObjectType.StructureDocumentTagInline:

                    # Convert to StructureDocumentTagInline object
                    structure_document_tag_inline = (StructureDocumentTagInline)(obj)

                    # Check if the Tag property of StructureDocumentTagInline is "test1"
                    if structure_document_tag_inline.SDTProperties.Tag == "test1":

                        # Iterate through the child objects of StructureDocumentTagInline
                        for j in range(structure_document_tag_inline.ChildObjects.Count):
                            obj2 = structure_document_tag_inline.ChildObjects.get_Item(j)

                            # Check if the child object is of type TextRange
                            if obj2.DocumentObjectType == DocumentObjectType.TextRange:

                                # Convert to TextRange object
                                textRange = (TextRange)(obj2)

                                # Set the text content
                                textRange.Text = "89"

                                # Set text color
                                textRange.CharacterFormat.TextColor = Color.get_Blue()

# Save the modified document to a new file
doc.SaveToFile("ModifiedContentControlInParagraphOfTableCell.docx", FileFormat.Docx2016)

# Dispose of the Document object resources
doc.Close()
doc.Dispose()
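
For background, it can help to see where these controls live in the underlying file. In WordprocessingML, a content control is a w:sdt element, and what it wraps depends on its parent element. The following sketch uses only the Python standard library to classify the controls in a hand-written XML fragment. The fragment, the function name classify_content_controls and the labels block, inline, row and cell are illustrative assumptions; the correspondence to StructureDocumentTag, StructureDocumentTagInline, StructureDocumentTagRow and StructureDocumentTagCell is inferred from the traversal paths described in this article, not taken from Spire.Doc's documentation.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hand-written sample fragment (assumption: not one of the article's files)
SAMPLE = f"""<w:document xmlns:w="{W}">
  <w:body>
    <w:sdt><w:sdtPr><w:tag w:val="c1"/></w:sdtPr>
      <w:sdtContent><w:p><w:r><w:t>Block text</w:t></w:r></w:p></w:sdtContent>
    </w:sdt>
    <w:p>
      <w:sdt><w:sdtPr><w:tag w:val="text1"/></w:sdtPr>
        <w:sdtContent><w:r><w:t>Inline text</w:t></w:r></w:sdtContent>
      </w:sdt>
    </w:p>
    <w:tbl>
      <w:sdt><w:sdtPr><w:tag w:val="row1"/></w:sdtPr>
        <w:sdtContent><w:tr><w:tc><w:p/></w:tc></w:tr></w:sdtContent>
      </w:sdt>
      <w:tr>
        <w:sdt><w:sdtPr><w:tag w:val="cell1"/></w:sdtPr>
          <w:sdtContent><w:tc><w:p/></w:tc></w:sdtContent>
        </w:sdt>
      </w:tr>
    </w:tbl>
  </w:body>
</w:document>"""

# Parent element name -> kind of content control (labels are illustrative)
KIND_BY_PARENT = {"body": "block", "p": "inline", "tbl": "row", "tr": "cell"}


def classify_content_controls(xml_text):
    """Map each control's tag value to a kind derived from its parent element."""
    root = ET.fromstring(xml_text)
    result = {}
    for parent in root.iter():
        parent_name = parent.tag.split("}")[-1]  # strip the namespace
        for sdt in parent.findall("w:sdt", NS):  # direct children only
            tag = sdt.find("w:sdtPr/w:tag", NS)
            name = tag.get(f"{{{W}}}val") if tag is not None else None
            result[name] = KIND_BY_PARENT.get(parent_name, "other")
    return result


print(classify_content_controls(SAMPLE))
```

A control inside a paragraph of a table cell has a w:p parent, so it is classified as inline here, which matches the last example above.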

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Python: Create a Table Of Contents for a Newly Created Word Document

2024-05-31 01:00:34 Written by Koohji

A table of contents makes a Word document easier to navigate and read. It works as a road map, letting readers see the structure at a glance and jump to any section, which is particularly valuable for lengthy reports, papers, or manuals. It is also easy to maintain: after the document is restructured, the table of contents can be updated to reflect the new organization. This article demonstrates how to use Spire.Doc for Python to create a table of contents in a newly created Word document within a Python project.

Python Create a Table Of Contents Using Heading Styles
Python Create a Table Of Contents Using Outline Level Styles
Python Create a Table Of Contents Using Image Captions
Python Create a Table Of Contents Using Table Captions

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be installed on Windows with the following pip command.

Package Manager

pip install Spire.Doc

If you are unsure how to install, please refer to: How to Install Spire.Doc for Python on Windows

Python Create a Table Of Contents Using Heading Styles

Generating a table of contents from heading styles is the default approach in Word: titles and subtitles are marked with different levels of heading styles, and Word's table of contents feature then populates the entries automatically. Here are the detailed steps:

Create a Document object.
Add a section using the Document.AddSection() method.
Add a paragraph using the Section.AddParagraph() method.
Create a table of contents object using the Paragraph.AppendTOC(int lowerLevel, int upperLevel) method.
Create a CharacterFormat object and set the font.
Apply a heading style to the paragraph using the Paragraph.ApplyStyle(BuiltinStyle.Heading1) method.
Add text content using the Paragraph.AppendText() method.
Apply character formatting to the text using the TextRange.ApplyCharacterFormat() method.
Update the table of contents using the Document.UpdateTableOfContents() method.
Save the document using the Document.SaveToFile() method.

Python

from spire.doc import *
from spire.doc.common import *

# Create a new document object
doc = Document()

# Add a section to the document
section = doc.AddSection()

# Append a Table of Contents (TOC) paragraph
TOC_paragraph = section.AddParagraph()
TOC_paragraph.AppendTOC(1, 3)

# Create and set character format objects for font
character_format1 = CharacterFormat(doc)
character_format1.FontName = "Microsoft YaHei"

character_format2 = CharacterFormat(doc)
character_format2.FontName = "Microsoft YaHei"
character_format2.FontSize = 12

# Add a paragraph with Heading 1 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading1)

# Add text and apply character formatting
text_range1 = paragraph.AppendText("Overview")
text_range1.ApplyCharacterFormat(character_format1)

# Insert normal content
paragraph = section.Body.AddParagraph()
text_range2 = paragraph.AppendText("Spire.Doc for Python is a professional Python Word development component that enables developers to easily integrate Word document creation, reading, editing, and conversion functionalities into their own Python applications. As a completely standalone component, Spire.Doc for Python does not require the installation of Microsoft Word on the runtime environment.")

# Add a paragraph with Heading 1 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading1)
text_range1 = paragraph.AppendText("Main Functions")
text_range1.ApplyCharacterFormat(character_format1)

# Add a paragraph with Heading 2 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading2)
textRange1 = paragraph.AppendText("Only Spire.Doc, No Microsoft Office Automation")
textRange1.ApplyCharacterFormat(character_format1)

# Add regular content
paragraph = section.Body.AddParagraph()
textRange2 = paragraph.AppendText("Spire.Doc for Python is a totally independent Python Word class library which doesn't require Microsoft Office installed on system. Microsoft Office Automation is proved to be unstable, slow and not scalable to produce MS Word documents. Spire.Doc for Python is many times faster than Microsoft Word Automation and with much better stability and scalability.")
textRange2.ApplyCharacterFormat(character_format2)

# Add a paragraph with Heading 3 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading3)
textRange1 = paragraph.AppendText("Word Versions")
textRange1.ApplyCharacterFormat(character_format1)
paragraph = section.Body.AddParagraph()
textRange2 = paragraph.AppendText("Word97-03  Word2007  Word2010  Word2013  Word2016  Word2019")
textRange2.ApplyCharacterFormat(character_format2)

# Add a paragraph with Heading 2 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading2)
textRange1 = paragraph.AppendText("Convert File Documents with High Quality")
textRange1.ApplyCharacterFormat(character_format1)

# Add regular content
paragraph = section.Body.AddParagraph()
textRange2 = paragraph.AppendText("By using Spire.Doc for Python, users can save Word Doc/Docx to stream, save as web response and convert Word Doc/Docx to XML, Markdown, RTF, EMF, TXT, XPS, EPUB, HTML, SVG, ODT and vice versa. Spire.Doc for Python also supports to convert Word Doc/Docx to PDF and HTML to image.")
textRange2.ApplyCharacterFormat(character_format2)

# Add a paragraph with Heading 2 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading2)
textRange1 = paragraph.AppendText("Other Technical Features")
textRange1.ApplyCharacterFormat(character_format1)

# Add regular content
paragraph = section.Body.AddParagraph()
textRange2 = paragraph.AppendText("By using Spire.Doc for Python, developers can build any type of a 64-bit Python application to create and handle Word documents.")
textRange2.ApplyCharacterFormat(character_format2)

# Update the table of contents
doc.UpdateTableOfContents()

# Save the document
doc.SaveToFile("CreateTOCUsingHeadingStyles.docx", FileFormat.Docx2016)

# Release resources
doc.Dispose()
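
The example above repeats the same four calls for every paragraph. As a sketch, that pattern could be folded into one function. The name add_styled_paragraph is hypothetical; it relies only on Section.Body.AddParagraph(), Paragraph.ApplyStyle(), Paragraph.AppendText() and TextRange.ApplyCharacterFormat() as used above. The MagicMock object is a stand-in section so the snippet can run without Spire.Doc installed.

```python
from unittest.mock import MagicMock


def add_styled_paragraph(section, text, style=None, character_format=None):
    """Add a paragraph with optional style and character formatting."""
    paragraph = section.Body.AddParagraph()
    if style is not None:
        paragraph.ApplyStyle(style)
    text_range = paragraph.AppendText(text)
    if character_format is not None:
        text_range.ApplyCharacterFormat(character_format)
    return paragraph


# Stand-in for a real Section (assumption: records calls instead of editing a document)
section = MagicMock()
heading = add_styled_paragraph(section, "Overview", style="Heading1")
print(heading.ApplyStyle.call_args)
```

With the real library, the style argument would be a value such as BuiltinStyle.Heading1 and character_format a CharacterFormat object like those created above.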

Python Create a Table Of Contents Using Outline Level Styles

In a Word document, you can also create a table of contents from outline levels. Assign an outline level to a paragraph style through the ParagraphFormat.OutlineLevel property, then map each style to a table of contents level with the TableOfContent.SetTOCLevelStyle() method. Here are the detailed steps:

Create a Document object.
Add a section using the Document.AddSection() method.
Create a ParagraphStyle object and set the outline level using ParagraphStyle.ParagraphFormat.OutlineLevel = OutlineLevel.Level1.
Add the created ParagraphStyle object to the document using the Document.Styles.Add() method.
Add a paragraph using the Section.AddParagraph() method.
Create a table of contents object using the Paragraph.AppendTOC(int lowerLevel, int upperLevel) method.
Turn off the default generation from heading styles by setting TableOfContent.UseHeadingStyles = False.
Apply the outline level style to the table of contents rules using the TableOfContent.SetTOCLevelStyle(int levelNumber, string styleName) method.
Create a CharacterFormat object and set the font.
Apply the style to the paragraph using the Paragraph.ApplyStyle(ParagraphStyle.Name) method.
Add text content using the Paragraph.AppendText() method.
Apply character formatting to the text using the TextRange.ApplyCharacterFormat() method.
Update the table of contents using the Document.UpdateTableOfContents() method.
Save the document using the Document.SaveToFile() method.

Python

from spire.doc import *
from spire.doc.common import *

# Create a document object
doc = Document()

# Add a section to the document
section = doc.AddSection()

# Define Outline Level 1
titleStyle1 = ParagraphStyle(doc)
titleStyle1.Name = "T1S"
titleStyle1.ParagraphFormat.OutlineLevel = OutlineLevel.Level1
titleStyle1.CharacterFormat.Bold = True
titleStyle1.CharacterFormat.FontName = "Microsoft YaHei"
titleStyle1.CharacterFormat.FontSize = 18
titleStyle1.ParagraphFormat.HorizontalAlignment = HorizontalAlignment.Left
doc.Styles.Add(titleStyle1)

# Define Outline Level 2
titleStyle2 = ParagraphStyle(doc)
titleStyle2.Name = "T2S"
titleStyle2.ParagraphFormat.OutlineLevel = OutlineLevel.Level2
titleStyle2.CharacterFormat.Bold = True
titleStyle2.CharacterFormat.FontName = "Microsoft YaHei"
titleStyle2.CharacterFormat.FontSize = 16
titleStyle2.ParagraphFormat.HorizontalAlignment = HorizontalAlignment.Left
doc.Styles.Add(titleStyle2)

# Define Outline Level 3
titleStyle3 = ParagraphStyle(doc)
titleStyle3.Name = "T3S"
titleStyle3.ParagraphFormat.OutlineLevel = OutlineLevel.Level3
titleStyle3.CharacterFormat.Bold = True
titleStyle3.CharacterFormat.FontName = "Microsoft YaHei"
titleStyle3.CharacterFormat.FontSize = 14
titleStyle3.ParagraphFormat.HorizontalAlignment = HorizontalAlignment.Left
doc.Styles.Add(titleStyle3)

# Add a paragraph
TOCparagraph = section.AddParagraph()
toc = TOCparagraph.AppendTOC(1, 3)
toc.UseHeadingStyles = False
toc.UseHyperlinks = True
toc.UseTableEntryFields = False
toc.RightAlignPageNumbers = True
toc.SetTOCLevelStyle(1, titleStyle1.Name)
toc.SetTOCLevelStyle(2, titleStyle2.Name)
toc.SetTOCLevelStyle(3, titleStyle3.Name)

# Define character format
characterFormat = CharacterFormat(doc)
characterFormat.FontName = "Microsoft YaHei"
characterFormat.FontSize = 12

# Add a paragraph and apply outline level style 1
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle1.Name)
paragraph.AppendText("Overview")

# Add a paragraph and set the text content
paragraph = section.Body.AddParagraph()
textRange = paragraph.AppendText("Spire.Doc for Python is a professional Word Python API specifically designed for developers to create, read, write, convert, and compare Word documents with fast and high-quality performance.")
textRange.ApplyCharacterFormat(characterFormat)

# Add a paragraph and apply outline level style 1
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle1.Name)
paragraph.AppendText("Main Functions")

# Add a paragraph and apply outline level style 2
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle2.Name)
paragraph.AppendText("Only Spire.Doc, No Microsoft Office Automation")

# Add a paragraph and set the text content
paragraph = section.Body.AddParagraph()
textRange = paragraph.AppendText("Spire.Doc for Python is a totally independent Python Word class library which doesn't require Microsoft Office installed on system. Microsoft Office Automation is proved to be unstable, slow and not scalable to produce MS Word documents. Spire.Doc for Python is many times faster than Microsoft Word Automation and with much better stability and scalability.")
textRange.ApplyCharacterFormat(characterFormat)

# Add a paragraph and apply outline level style 3
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle3.Name)
paragraph.AppendText("Word Versions")

# Add a paragraph and set the text content
paragraph = section.Body.AddParagraph()
textRange = paragraph.AppendText("Word97-03  Word2007  Word2010  Word2013  Word2016  Word2019")
textRange.ApplyCharacterFormat(characterFormat)

# Add a paragraph and apply outline level style 2
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle2.Name)
paragraph.AppendText("Convert File Documents with High Quality")

# Add a paragraph and set the text content
paragraph = section.Body.AddParagraph()
textRange = paragraph.AppendText("By using Spire.Doc for Python, users can save Word Doc/Docx to stream, save as web response and convert Word Doc/Docx to XML, RTF, EMF, TXT, XPS, EPUB, HTML, SVG, ODT and vice versa. Spire.Doc for Python also supports to convert Word Doc/Docx to PDF and HTML to image.")
textRange.ApplyCharacterFormat(characterFormat)

# Add a paragraph and apply outline level style 2
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle2.Name)
paragraph.AppendText("Other Technical Features")

# Add a paragraph and set the text content
paragraph = section.Body.AddParagraph()
textRange = paragraph.AppendText("By using Spire.Doc for Python, developers can build any type of a 64-bit Python application to create and handle Word documents.")
textRange.ApplyCharacterFormat(characterFormat)

# Update the table of contents
doc.UpdateTableOfContents()

# Save the document
doc.SaveToFile("CreateTOCUsingOutlineStyles.docx", FileFormat.Docx2016)

# Release resources
doc.Dispose()
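
Conceptually, the two arguments of AppendTOC(1, 3) bound the outline levels that are listed. The pure-Python sketch below illustrates that filtering on the headings used in this example. It is not how Spire.Doc builds the field internally; the function name toc_lines and the indentation scheme are assumptions made for illustration.

```python
def toc_lines(headings, lower_level=1, upper_level=3, indent="  "):
    """Return indented entries for headings whose level is within the range."""
    lines = []
    for level, title in headings:
        if lower_level <= level <= upper_level:
            lines.append(indent * (level - lower_level) + title)
    return lines


# (outline level, heading text) pairs taken from the example above
headings = [
    (1, "Overview"),
    (1, "Main Functions"),
    (2, "Only Spire.Doc, No Microsoft Office Automation"),
    (3, "Word Versions"),
    (2, "Convert File Documents with High Quality"),
    (2, "Other Technical Features"),
]

for line in toc_lines(headings, 1, 3):
    print(line)
```

Narrowing the range, for example toc_lines(headings, 1, 2), drops the level 3 entry "Word Versions" from the list.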

Python Create a Table Of Contents Using Image Captions

Using the Spire.Doc library, you can create a table of contents based on image captions by constructing a TableOfContent(Document, " \\h \\z \\c \"Picture\"") object. Below are the detailed steps:

Create a Document object.
Add a section using the Document.AddSection() method.
Create a table of contents object with tocForImage = TableOfContent(Document, " \\h \\z \\c \"Picture\"") and specify the style of the table of contents.
Add a paragraph using the Section.AddParagraph() method.
Add the table of content object to the paragraph using the Paragraph.Items.Add(tocForImage) method.
Add a field separator using the Paragraph.AppendFieldMark(FieldMarkType.FieldSeparator) method.
Add the text content "TOC" using the Paragraph.AppendText("TOC") method.
Add a field end mark using the Paragraph.AppendFieldMark(FieldMarkType.FieldEnd) method.
Add an image using the Paragraph.AppendPicture() method.
Add a caption paragraph for the image using the DocPicture.AddCaption() method, including product information and formatting.
Update the table of contents to reflect changes in the document using the Document.UpdateTableOfContents(tocForImage) method.
Save the document using the Document.SaveToFile() method.

Python

from spire.doc import *
from spire.doc.common import *

# Create a new document object
doc = Document()

# Add a section to the document
section = doc.AddSection()

# Create a table of content object for images
tocForImage = TableOfContent(doc, " \\h \\z \\c \"Picture\"")

# Add a paragraph to the section
tocParagraph = section.Body.AddParagraph()

# Add the TOC object to the paragraph
tocParagraph.Items.Add(tocForImage)

# Add a field separator
tocParagraph.AppendFieldMark(FieldMarkType.FieldSeparator)

# Add text content
tocParagraph.AppendText("TOC")

# Add a field end mark
tocParagraph.AppendFieldMark(FieldMarkType.FieldEnd)

# Add a blank paragraph to the section
section.Body.AddParagraph()

# Add a paragraph to the section
paragraph = section.Body.AddParagraph()

# Add an image
docPicture = paragraph.AppendPicture("images/DOC-Python.png")
docPicture.Width = 100
docPicture.Height = 100

# Add a caption paragraph for the image
obj = docPicture.AddCaption("Picture",CaptionNumberingFormat.Number,CaptionPosition.BelowItem)

paragraph = (Paragraph)(obj)
paragraph.AppendText("  Spire.Doc for Python product")
paragraph.Format.AfterSpacing = 20

# Continue adding paragraphs to the section
paragraph = section.Body.AddParagraph()
docPicture = paragraph.AppendPicture("images/PDF-Python.png")
docPicture.Width = 100
docPicture.Height = 100
obj = docPicture.AddCaption("Picture",CaptionNumberingFormat.Number,CaptionPosition.BelowItem)
paragraph = (Paragraph)(obj)
paragraph.AppendText("  Spire.PDF for Python product")
paragraph.Format.AfterSpacing = 20

paragraph = section.Body.AddParagraph()
docPicture = paragraph.AppendPicture("images/XLS-Python.png")
docPicture.Width = 100
docPicture.Height = 100
obj = docPicture.AddCaption("Picture",CaptionNumberingFormat.Number,CaptionPosition.BelowItem)
paragraph = (Paragraph)(obj)
paragraph.AppendText("  Spire.XLS for Python product")
paragraph.Format.AfterSpacing = 20

paragraph = section.Body.AddParagraph()
docPicture = paragraph.AppendPicture("images/PPT-Python.png")
docPicture.Width = 100
docPicture.Height = 100
obj = docPicture.AddCaption("Picture",CaptionNumberingFormat.Number,CaptionPosition.BelowItem)
paragraph = (Paragraph)(obj)
paragraph.AppendText("  Spire.Presentation for Python product")
paragraph.Format.AfterSpacing = 20

# Update the table of contents
doc.UpdateTableOfContents(tocForImage)

# Save the document to a file
doc.SaveToFile("CreateTOCWithImageCaptions.docx", FileFormat.Docx2016)

# Dispose of the document object
doc.Dispose()
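
The switch string passed to TableOfContent is the only part that changes between a caption-based table of contents for pictures and one for tables. As a minimal sketch in plain Python (no Spire.Doc calls), the hypothetical helper caption_toc_switches below builds that string for any caption label. The switch meanings in the comments follow Word's TOC field codes; the helper name and its validation rule are assumptions made for illustration.

```python
def caption_toc_switches(label: str) -> str:
    """Build the TOC field switch string for one caption label (hypothetical helper)."""
    if not label or '"' in label:
        raise ValueError("label must be non-empty and must not contain double quotes")
    # \h turns the entries into hyperlinks
    # \z hides tab leaders and page numbers in Web Layout view
    # \c "Label" lists only the items captioned with that label
    return f' \\h \\z \\c "{label}"'


picture_switches = caption_toc_switches("Picture")
table_switches = caption_toc_switches("Table")
print(picture_switches)
print(table_switches)
```

The returned value is the same string the code above passes as the second argument, so it could be used as TableOfContent(doc, caption_toc_switches("Picture")).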


Python Create a Table Of Contents Using Table Captions

Similarly, you can create a table of contents based on table captions by constructing a TableOfContent(Document, " \\h \\z \\c \"Table\"") object. Here are the detailed steps:

Create a Document object.
Add a section using the Document.AddSection() method.
Create a table of contents object with tocForTable = TableOfContent(Document, " \\h \\z \\c \"Table\"") and specify the style of the table of contents.
Add a paragraph using the Section.AddParagraph() method.
Add the table of content object to the paragraph using the Paragraph.Items.Add(tocForTable) method.
Add a field separator using the Paragraph.AppendFieldMark(FieldMarkType.FieldSeparator) method.
Add the text content "TOC" using the Paragraph.AppendText("TOC") method.
Add a field end mark using the Paragraph.AppendFieldMark(FieldMarkType.FieldEnd) method.
Add a table using the Section.AddTable() method and set the number of rows and columns using the Table.ResetCells(int rowsNum, int columnsNum) method.
Add a table caption paragraph using the Table.AddCaption() method, including product information and formatting.
Update the table of contents to reflect changes in the document using the Document.UpdateTableOfContents(tocForTable) method.
Save the document using the Document.SaveToFile() method.

Python

from spire.doc import *
from spire.doc.common import *

# Create a new document
doc = Document()

# Add a section to the document
section = doc.AddSection()

# Create a TableOfContent object
tocForTable = TableOfContent(doc,  " \\h \\z \\c \"Table\"")

# Add a paragraph in the section to place the TableOfContent object
tocParagraph = section.Body.AddParagraph()
tocParagraph.Items.Add(tocForTable)
tocParagraph.AppendFieldMark(FieldMarkType.FieldSeparator)
tocParagraph.AppendText("TOC")
tocParagraph.AppendFieldMark(FieldMarkType.FieldEnd)

# Add two empty paragraphs in the section
section.Body.AddParagraph()
section.Body.AddParagraph()

# Add a table in the section
table = section.Body.AddTable(True)
table.ResetCells(1, 3)

# Add a caption paragraph for the table
obj = table.AddCaption("Table", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
paragraph = (Paragraph)(obj)
paragraph.AppendText(" One row three columns")
paragraph.Format.AfterSpacing = 20

# Add a new table in the section
table = section.Body.AddTable(True)
table.ResetCells(3, 3)

# Add a caption paragraph for the second table
obj = table.AddCaption("Table", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
paragraph = (Paragraph)(obj)
paragraph.AppendText(" Three rows three columns")
paragraph.Format.AfterSpacing = 20

# Add another new table in the section
table = section.Body.AddTable(True)
table.ResetCells(5, 3)

# Add a caption paragraph for the third table
obj = table.AddCaption("Table", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
paragraph = (Paragraph)(obj)
paragraph.AppendText(" Five rows three columns")
paragraph.Format.AfterSpacing = 20

# Update the table of contents
doc.UpdateTableOfContents(tocForTable)

# Save the document to a specified file
doc.SaveToFile("CreateTOCUsingTableCaptions.docx", FileFormat.Docx2016)

# Dispose resources
doc.Dispose()


Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Published in Document Operation

Tagged under

doc Python Document Operation

Python: Rearrange Slides in a PowerPoint Document

2024-05-30 09:42:59 Written by Koohji

Rearranging slides in a PowerPoint presentation is a simple but essential skill. Whether you need to change the order of your points, group related slides together, or move a slide to a different location, the ability to efficiently reorganize your slides can help you create a more coherent and impactful presentation.

In this article, you will learn how to rearrange slides in a PowerPoint document in Python using Spire.Presentation for Python.

Install Spire.Presentation for Python

This scenario requires Spire.Presentation for Python and plum-dispatch v1.7.4. They can be easily installed in your system through the following pip command.

Package Manager

pip install Spire.Presentation

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Presentation for Python on Windows

Rearrange Slides in a PowerPoint Document in Python

To reorder the slides in a PowerPoint document, create two Presentation objects: one for loading the original document and one for building the new document. Copying the slides from the original document to the new one in the desired sequence rearranges the slide order.

The following are the steps to rearrange slides in a PowerPoint document using Python.

Create a Presentation object.
Load a PowerPoint document using Presentation.LoadFromFile() method.
Specify the slide order within a list.
Create another Presentation object for creating a new presentation.
Add the slides from the original document to the new presentation in the specified order using Presentation.Slides.AppendBySlide() method.
Save the new presentation to a PPTX file using Presentation.SaveToFile() method.

Python

from spire.presentation.common import *
from spire.presentation import *

# Create a Presentation object
presentation = Presentation()

# Load a PowerPoint file
presentation.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.pptx")

# Specify the new slide order within a list
newSlideOrder = [4,2,1,3]

# Create another Presentation object
new_presentation =  Presentation()

# Remove the default slide
new_presentation.Slides.RemoveAt(0)

# Iterate through the list
for i in range(len(newSlideOrder)):

    # Add the slides from the original PowerPoint file to the new PowerPoint document in the new order
    new_presentation.Slides.AppendBySlide(presentation.Slides[newSlideOrder[i] - 1])

# Save the new presentation to file
new_presentation.SaveToFile("output/NewOrder.pptx", FileFormat.Pptx2019)

# Dispose resources
presentation.Dispose()
new_presentation.Dispose()
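
The order list in the code above uses 1-based slide positions, and a duplicated or out-of-range number would silently produce a wrong deck or an index error. The following plain-Python sketch (no Spire.Presentation calls) shows the same reordering logic on a list of strings, with a check that the order is a valid permutation. The function name reorder_by_position and the sample slide titles are hypothetical.

```python
def reorder_by_position(items, new_order):
    """Return items rearranged according to a list of 1-based positions."""
    expected = list(range(1, len(items) + 1))
    if sorted(new_order) != expected:
        raise ValueError("new_order must list every position from 1 to len(items) exactly once")
    # Position p in new_order refers to items[p - 1], as in the slide loop above
    return [items[position - 1] for position in new_order]


slides = ["Title", "Agenda", "Results", "Summary"]
reordered = reorder_by_position(slides, [4, 2, 1, 3])
print(reordered)
```

Running a check like this before the copy loop is one way to catch a mistyped order list early.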



Published in Document Operation

Tagged under

ppt Python Document Operation

Python: Extract Tables from Word Documents

2024-05-30 01:01:30 Written by Koohji

Word documents often contain valuable data in the form of tables, which can be used for reporting, data analysis, and record-keeping. However, manually extracting and transferring these tables to other formats can be a time-consuming and error-prone task. By automating this process using Python, we can save time, ensure accuracy, and maintain consistency. Spire.Doc for Python provides a seamless solution for the table extraction task, making it effortless to create accessible and manageable files with data from Word document tables. This article will demonstrate how to leverage Spire.Doc for Python to extract tables from Word documents and write them into text files and Excel worksheets.

Extract Tables from Word Documents to Text Files with Python
Extract Tables from Word Documents to Excel Workbooks with Python

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed on Windows through the following pip command.

Package Manager

pip install Spire.Doc

If you are unsure how to install, please refer to: How to Install Spire.Doc for Python on Windows

Extract Tables from Word Documents to Text Files with Python

Spire.Doc for Python offers the Section.Tables property to retrieve a collection of tables within a section of a Word document. Then, developers can use the properties and methods under the ITable class to access the data in the tables and write it into a text file. This provides a convenient solution for converting Word document tables into text files.

The detailed steps for extracting tables from Word documents to text files are as follows:

Create an object of Document class and load a Word document using Document.LoadFromFile() method.
Iterate through the sections in the document and get the table collection of each section through Section.Tables property.
Iterate through the tables and create a string object for each table.
Iterate through the rows in each table and the cells in each row, get the text of each cell through TableCell.Paragraphs[].Text property, and add the cell text to the string.
Save each string to a text file.

Python

from spire.doc import *
from spire.doc.common import *

# Create an instance of Document
doc = Document()

# Load a Word document
doc.LoadFromFile("Sample.docx")

# Loop through the sections
for s in range(doc.Sections.Count):
    # Get a section
    section = doc.Sections.get_Item(s)
    # Get the tables in the section
    tables = section.Tables
    # Loop through the tables
    for i in range(0, tables.Count):
        # Get a table
        table = tables.get_Item(i)
        # Initialize a string to store the table data
        tableData = ''
        # Loop through the rows of the table
        for j in range(0, table.Rows.Count):
            # Loop through the cells of the row
            for k in range(0, table.Rows.get_Item(j).Cells.Count):
                # Get a cell
                cell = table.Rows.get_Item(j).Cells.get_Item(k)
                # Get the text in the cell
                cellText = ''
                for para in range(cell.Paragraphs.Count):
                    paragraphText = cell.Paragraphs.get_Item(para).Text
                    cellText += (paragraphText + ' ')
                # Add the text to the string
                tableData += cellText
                if k < table.Rows.get_Item(j).Cells.Count - 1:
                    tableData += '\t'
            # Add a new line
            tableData += '\n'
    
        # Save the table data to a text file
        with open(f'output/Tables/WordTable_{s+1}_{i+1}.txt', 'w', encoding='utf-8') as f:
            f.write(tableData)
doc.Close()
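
The text layout produced above is one line per table row, with cells separated by tabs and the paragraphs of a cell joined by spaces. As a minimal sketch of just that formatting step, the plain-Python function below works on nested lists that stand in for the Spire.Doc table objects. The name table_to_text and the sample data are hypothetical. Unlike the code above, it joins paragraphs with str.join, so no trailing space is left at the end of each cell.

```python
def table_to_text(rows):
    """Format a table as tab-delimited text.

    rows is a list of rows, each row a list of cells,
    and each cell a list of paragraph strings.
    """
    lines = []
    for row in rows:
        # Join the paragraphs of each cell with a space, then the cells with a tab
        cells = [" ".join(paragraphs) for paragraphs in row]
        lines.append("\t".join(cells))
    # One newline-terminated line per row; an empty table gives an empty string
    return "".join(line + "\n" for line in lines)


sample = [
    [["Name"], ["Notes"]],
    [["Widget"], ["In stock", "Ships in 2 days"]],
]
text = table_to_text(sample)
print(text, end="")
```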


Extract Tables from Word Documents to Excel Workbooks with Python

Developers can also utilize Spire.Doc for Python to retrieve table data and then use Spire.XLS for Python to write the table data into an Excel worksheet, thereby enabling the conversion of Word document tables into Excel workbooks.

Install Spire.XLS for Python via PyPI:

Package Manager

pip install Spire.XLS

The detailed steps for extracting tables from Word documents to Excel workbooks are as follows:

Create an object of Document class and load a Word document using Document.LoadFromFile() method.
Create an object of Workbook class and clear the default worksheets using Workbook.Worksheets.Clear() method.
Iterate through the sections in the document and get the table collection of each section through Section.Tables property.
Iterate through the tables and create a worksheet for each table using Workbook.Worksheets.Add() method.
Iterate through the rows in each table and the cells in each row, get the text of each cell through TableCell.Paragraphs[].Text property, and write the text to the worksheet using Worksheet.SetCellValue() method.
Save the workbook using Workbook.SaveToFile() method.

Python

from spire.doc import *
from spire.doc.common import *
from spire.xls import *
from spire.xls.common import *

# Create an instance of Document
doc = Document()

# Load a Word document
doc.LoadFromFile('Sample.docx')

# Create an instance of Workbook
wb = Workbook()
wb.Worksheets.Clear()

# Loop through sections in the document
for i in range(doc.Sections.Count):
    # Get a section
    section = doc.Sections.get_Item(i)
    # Loop through tables in the section
    for j in range(section.Tables.Count):
        # Get a table
        table = section.Tables.get_Item(j)
        # Create a worksheet
        ws = wb.Worksheets.Add(f'Table_{i+1}_{j+1}')
        # Write the table to the worksheet
        for row in range(table.Rows.Count):
            # Get a row
            tableRow = table.Rows.get_Item(row)
            # Loop through cells in the row
            for cell in range(tableRow.Cells.Count):
                # Get a cell
                tableCell = tableRow.Cells.get_Item(cell)
                # Get the text in the cell
                cellText = ''
                for para_index in range(tableCell.Paragraphs.Count):
                    paragraph = tableCell.Paragraphs.get_Item(para_index)
                    cellText = cellText + (paragraph.Text + ' ')
                # Write the cell text to the worksheet
                ws.SetCellValue(row + 1, cell + 1, cellText)

# Save the workbook
wb.SaveToFile('output/Tables/WordTableToExcel.xlsx', FileFormat.Version2016)
doc.Close()
wb.Dispose()
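
The code above reads Word rows and cells with 0-based indices but calls SetCellValue with row + 1 and cell + 1, which suggests the worksheet coordinates start at 1. The plain-Python sketch below isolates that index shift using nested lists as a stand-in for the table. The generator name worksheet_cells is hypothetical.

```python
def worksheet_cells(rows):
    """Yield (row, column, text) tuples with 1-based coordinates."""
    for row_index, row in enumerate(rows):
        for col_index, text in enumerate(row):
            # Shift both 0-based indices by one for the worksheet
            yield row_index + 1, col_index + 1, text


table = [["Item", "Price"], ["Pen", "1.20"]]
cells = list(worksheet_cells(table))
for row, column, text in cells:
    print(row, column, text)
```

Each yielded tuple corresponds to the three arguments passed to SetCellValue in the loop above.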



Published in Table

Tagged under

doc Python Table

Python: Reorder Columns or Rows in Excel

2024-05-29 01:05:22 Written by Koohji

Reordering columns or rows in Excel is a simple process that allows you to change the arrangement of data within your spreadsheet. This can be useful for better organizing your data or aligning it with other columns or rows. You can reorder by using drag-and-drop, cut and paste, or keyboard shortcuts depending on the version of Excel you are using.

This article focuses on how to programmatically reorder columns or rows in an Excel worksheet in Python using Spire.XLS for Python.

Reorder Columns in Excel in Python
Reorder Rows in Excel in Python

Install Spire.XLS for Python

This scenario requires Spire.XLS for Python and plum-dispatch v1.7.4. They can be easily installed in your system through the following pip command.

Package Manager

pip install Spire.XLS

If you are unsure how to install, please refer to this tutorial: How to Install Spire.XLS for Python on Windows

Reorder Columns in Excel in Python

Spire.XLS does not provide a straightforward way to reorganize the order of columns or rows within an Excel worksheet. The solution requires creating a duplicate of the target worksheet. Then, you can copy the columns or rows from the copied worksheet and paste them into the original worksheet in the new preferred column or row sequence.

The following are the steps to reorder columns in an Excel worksheet using Python.

Create a Workbook object.
Load an Excel document from the specified file path.
Get the target worksheet using Workbook.Worksheets[index] property.
Specify the new column order within a list.
Create a temporary sheet and copy the data from the target sheet into it.
Copy the columns from the temporary worksheet to the target worksheet in the desired order using Worksheet.Columns[index].Copy() method.
Remove the temporary sheet.
Save the workbook to a different Excel document.

Python

from spire.xls import *
from spire.xls.common import *

# Create a Workbook object
workbook = Workbook()

# Load the Excel document
workbook.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.xlsx")

# Get a specific worksheet
targetSheet = workbook.Worksheets[0]

# Specify the new column order in a list (the column index starts from 0)
newColumnOrder = [3, 0, 1, 2, 4, 5, 6, 7]

# Add a temporary worksheet
tempSheet = workbook.Worksheets.Add("temp")

# Copy data from the target worksheet to the temporary sheet
tempSheet.CopyFrom(targetSheet)

# Iterate through the newColumnOrder list
for i in range(len(newColumnOrder)):

    # Copy the column from the temporary sheet to the target sheet in the new order
    tempSheet.Columns[newColumnOrder[i]].Copy(targetSheet.Columns[i], True, True)

    # Reset the column width in the target sheet
    targetSheet.Columns[i].ColumnWidth = tempSheet.Columns[newColumnOrder[i]].ColumnWidth

# Remove the temporary sheet
workbook.Worksheets.Remove(tempSheet)

# Save the workbook to another Excel file
workbook.SaveToFile("output/ReorderColumns.xlsx", FileFormat.Version2016)

# Dispose resources
workbook.Dispose()
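
The temporary sheet matters because copying columns within the same sheet would overwrite source columns before they are read. The plain-Python sketch below shows this on a list of rows, with a snapshot playing the role of the temporary sheet. Both function names are hypothetical, and no Spire.XLS calls are involved.

```python
def reorder_columns(rows, new_order):
    """Reorder columns using a snapshot, like the temporary sheet above."""
    snapshot = [list(row) for row in rows]
    for row, original in zip(rows, snapshot):
        for target_index, source_index in enumerate(new_order):
            row[target_index] = original[source_index]
    return rows


def reorder_columns_without_copy(rows, new_order):
    """Naive variant: reads cells that earlier steps have already overwritten."""
    for row in rows:
        for target_index, source_index in enumerate(new_order):
            row[target_index] = row[source_index]
    return rows


order = [3, 0, 1, 2]
good = reorder_columns([["a", "b", "c", "d"]], order)
bad = reorder_columns_without_copy([["a", "b", "c", "d"]], order)
print(good)
print(bad)
```

With this order, the snapshot version yields the columns d, a, b, c, while the naive version ends up with d in every column.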


Reorder Rows in Excel in Python

Rearranging the rows in an Excel spreadsheet follows a similar approach to reorganizing the columns. The steps to reorder the rows within an Excel worksheet are as outlined below.

Create a Workbook object.
Load an Excel document from the specified file path.
Get the target worksheet using Workbook.Worksheets[index] property.
Specify the new row order within a list.
Create a temporary sheet and copy the data from the target sheet into it.
Copy the rows from the temporary worksheet to the target worksheet in the desired order using Worksheet.Rows[index].Copy() method.
Remove the temporary sheet.
Save the workbook to a different Excel document.

Python

from spire.xls import *
from spire.xls.common import *

# Create a Workbook object
workbook = Workbook()

# Load the Excel document
workbook.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.xlsx")

# Get a specific worksheet
targetSheet = workbook.Worksheets[0]

# Specify the new row order in a list (the row index starts from 0)
newRowOrder = [0, 2, 3, 1, 4, 5, 6, 7, 8, 9, 10, 11, 12]

# Add a temporary worksheet
tempSheet = workbook.Worksheets.Add("temp")

# Copy data from the first worksheet to the temporary sheet
tempSheet.CopyFrom(targetSheet)

# Iterate through the newRowOrder list
for i in range(len(newRowOrder)):

    # Copy the row from the temporary sheet to the target sheet in the new order
    tempSheet.Rows[newRowOrder[i]].Copy(targetSheet.Rows[i], True, True)

    # Reset the row height in the target sheet
    targetSheet.Rows[i].RowHeight = tempSheet.Rows[newRowOrder[i]].RowHeight

# Remove the temporary sheet
workbook.Worksheets.Remove(tempSheet)

# Save the workbook to another Excel file
workbook.SaveToFile("output/ReorderRows.xlsx", FileFormat.Version2016)

# Dispose resources
workbook.Dispose()
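
Typing a full order list by hand gets error-prone as the sheet grows. When the goal is to move a single row, the list can be generated instead. The sketch below is plain Python, and the helper name order_after_move is hypothetical. It reproduces the newRowOrder list used above by moving index 1 to index 3 in a 13-row sheet.

```python
def order_after_move(count, source, destination):
    """Return the 0-based order list that moves one index to a new position."""
    order = list(range(count))
    moved = order.pop(source)          # take the index out of its old place
    order.insert(destination, moved)   # put it back at the new place
    return order


new_row_order = order_after_move(13, 1, 3)
print(new_row_order)
```

The same helper works for columns, since the column example also uses a 0-based order list.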



Published in Worksheet

Tagged under

xls Python Worksheet

Python: Read or Remove Document Properties in Excel

2024-05-27 01:11:02 Written by Koohji

Document properties provide additional information about an Excel file, such as author, title, subject, and other metadata associated with the file. Retrieving these properties from Excel can help users gain insight into the file content and history, enabling better organization and management of files. At times, users may also need to remove document properties to protect the privacy and confidentiality of the information contained in the file. In this article, you will learn how to read or remove document properties in Excel in Python using Spire.XLS for Python.

Read Standard and Custom Document Properties in Excel
Remove Standard and Custom Document Properties in Excel

Install Spire.XLS for Python

This scenario requires Spire.XLS for Python and plum-dispatch v1.7.4. They can be easily installed on Windows through the following pip command.

Package Manager

pip install Spire.XLS

If you are unsure how to install, please refer to this tutorial: How to Install Spire.XLS for Python on Windows

Read Standard and Custom Document Properties in Excel in Python

Excel properties are divided into two main categories:

Standard Properties: These are predefined properties that are built into Excel files. They typically include basic details about the file such as title, subject, author, keywords, etc.
Custom Properties: These are user-defined attributes that can be added to Excel to track additional information about the file based on your specific needs.

Spire.XLS for Python allows you to read both the standard and custom document properties of an Excel file. The following are the detailed steps:

Create a Workbook instance.
Load an Excel file using Workbook.LoadFromFile() method.
Create a list to collect the output lines (the Python code below uses a list in place of a StringBuilder).
Get a collection of all standard document properties using Workbook.DocumentProperties property.
Get specific standard document properties using the properties of the BuiltInDocumentProperties class and append them to the list.
Get a collection of all custom document properties using Workbook.CustomDocumentProperties property.
Iterate through the collection.
Get the name, type, and value of each custom document property using ICustomDocumentProperties[].Name, ICustomDocumentProperties[].PropertyType and ICustomDocumentProperties[].Value properties.
Determine the specific property type, and then convert the property value to the corresponding data type.
Append the property name and converted property value to the list using the list.append() method.
Write the content of the list to a text file.

Python

from spire.xls import *
from spire.xls.common import *

def AppendAllText(fname: str, text: List[str]):
    # Write each collected line to the file, overwriting any existing content
    with open(fname, "w", encoding="utf-8") as fp:
        for s in text:
            fp.write(s + "\n")

inputFile = "Budget Template.xlsx"
outputFile = "GetExcelProperties.txt"

# Create a Workbook instance
workbook = Workbook()

# Load an Excel document from disk
workbook.LoadFromFile(inputFile)

# Create a list to collect the output lines
builder = []

# Get a collection of all standard document properties
standardProperties = workbook.DocumentProperties

# Get specific standard properties and append them to the list
builder.append("Standard Document Properties:")
builder.append("Title: " + standardProperties.Title)
builder.append("Subject: " + standardProperties.Subject)
builder.append("Category: " + standardProperties.Category)
builder.append("Keywords: " + standardProperties.Keywords)
builder.append("Comments: " + standardProperties.Comments)
builder.append("")

# Get a collection of all custom document properties
customProperties = workbook.CustomDocumentProperties

builder.append("Custom Properties:")
# Iterate through the collection
for i in range(len(customProperties)):
    
    # Get the name, type, and value of each custom document property
    name = customProperties[i].Name
    propType = customProperties[i].PropertyType
    obj = customProperties[i].Value

    # Determine the specific property type, and then convert the property value to the value of the corresponding data type
    value = None
    if propType == PropertyType.Double:
        value = Double(obj).Value
    elif propType == PropertyType.DateTime:
        value = DateTime(obj).ToShortDateString()
    elif propType == PropertyType.Bool:
        value = Boolean(obj).Value
    elif propType == PropertyType.Int:
        value = Int32(obj).Value
    elif propType == PropertyType.Int32:
        value = Int32(obj).Value
    else:
        value = String(obj).Value

    # Append the property name and converted property value to the list
    builder.append(name + ": " + str(value))

# Write the content of the list into a text file
AppendAllText(outputFile, builder)
workbook.Dispose()
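
The conversion step above picks a formatting rule based on the property type before building the "name: value" line. As a rough analogue in plain Python, the sketch below dispatches on native Python types rather than on the Spire.XLS PropertyType enum. The function name format_property, the sample property names, and the ISO date format are assumptions made for illustration.

```python
from datetime import date, datetime


def format_property(name, value):
    """Return a 'name: value' line, formatting the value by its Python type."""
    # bool is tested first because bool is a subclass of int in Python
    if isinstance(value, bool):
        text = "True" if value else "False"
    elif isinstance(value, (datetime, date)):
        text = value.strftime("%Y-%m-%d")
    else:
        text = str(value)
    return f"{name}: {text}"


lines = [
    format_property("Approved", True),
    format_property("Revision", 3),
    format_property("Reviewed", date(2024, 5, 27)),
    format_property("Budget", 1250.5),
]
for line in lines:
    print(line)
```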


Remove Standard and Custom Document Properties in Excel in Python

You can easily delete standard document properties from an Excel file by setting their values as empty. For custom document properties, you can use the ICustomDocumentProperties.Remove() method to delete them. The following are the detailed steps:

Create a Workbook instance.
Load a sample Excel file using Workbook.LoadFromFile() method.
Get a collection of all standard document properties using Workbook.DocumentProperties property.
Set the values of specific standard document properties as empty through the corresponding properties of the BuiltInDocumentProperties class.
Get a collection of all custom document properties using Workbook.CustomDocumentProperties property.
Iterate through the collection.
Delete each custom property from the collection by its name using ICustomDocumentProperties.Remove() method.
Save the result file using Workbook.SaveToFile() method.

Python

from spire.xls import *
from spire.xls.common import *

inputFile = "Budget Template.xlsx"
outputFile = "RemoveExcelProperties.xlsx"

# Create a Workbook instance
workbook = Workbook()

# Load an Excel document from disk
workbook.LoadFromFile(inputFile)

# Get a collection of all standard document properties
standardProperties = workbook.DocumentProperties

# Set the value of each standard document property as empty
standardProperties.Title = ""
standardProperties.Subject = ""
standardProperties.Category = ""
standardProperties.Keywords = ""
standardProperties.Comments = ""

# Get a collection of all custom document properties
customProperties = workbook.CustomDocumentProperties

# Iterate through the collection
for i in range(len(customProperties) - 1, -1, -1):
    # Delete each custom document property from the collection by its name
    customProperties.Remove(customProperties[i].Name)

# Save the result file
workbook.SaveToFile(outputFile, ExcelVersion.Version2016)
workbook.Dispose()
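
The removal loop above walks the collection from the last index down to 0. Deleting while counting upward shifts the remaining items and skips some of them. The plain-Python sketch below shows both behaviors on an ordinary list. The function names and sample property names are hypothetical, and the list only stands in for the custom property collection.

```python
def remove_all_backward(items):
    """Delete every element by index, starting from the end."""
    for i in range(len(items) - 1, -1, -1):
        del items[i]
    return items


def remove_all_forward(items):
    """Delete by increasing index: every other element is skipped."""
    i = 0
    while i < len(items):
        del items[i]
        i += 1
    return items


names = ["Editor", "Phone", "Reviewed", "Approved"]
print(remove_all_backward(list(names)))
print(remove_all_forward(list(names)))
```

The backward version empties the list, while the forward version leaves "Phone" and "Approved" behind.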



Published in Document Operation

Tagged under

xls Python Document Operation

Python: Set Alignment for Table and Table Text in Word

2024-05-24 01:02:59 Written by Koohji

Proper alignment of tables and text in Microsoft Word is crucial for creating visually appealing and easy-to-read documents. By aligning table headers, numeric data, and text appropriately, you can enhance the organization and clarity of your information, making it more accessible to your readers. In this article, we will demonstrate how to align tables and the text in table cells in Microsoft Word in Python using Spire.Doc for Python.

Align Tables in Word in Python
Align the Text in Table Cells in Word in Python

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed on Windows through the following pip command.

Package Manager

pip install Spire.Doc

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows

Align Tables in Word in Python

A table in a Word document can be aligned to the left, center, or right side by using the Table.TableFormat.HorizontalAlignment property. The detailed steps are as follows.

Create an instance of the Document class.
Load a Word document using Document.LoadFromFile() method.
Get a specific section in the document using Document.Sections[index] property.
Get a specific table in the section using Section.Tables[index] property.
Set the alignment for the table using Table.TableFormat.HorizontalAlignment property.
Save the result document using Document.SaveToFile() method.

Python

from spire.doc import *
from spire.doc.common import *

# Create an instance of the Document class
document = Document()
# Load a Word document
document.LoadFromFile("Tables.docx")

# Get the first section in the document
section = document.Sections[0]

# Get the first, second, and third tables in the section
table1 = section.Tables[0]
table2 = section.Tables[1]
table3 = section.Tables[2]

# Align the first table to the left
table1.TableFormat.HorizontalAlignment = RowAlignment.Left
# Align the second table to the center
table2.TableFormat.HorizontalAlignment = RowAlignment.Center
# Align the third table to the right
table3.TableFormat.HorizontalAlignment = RowAlignment.Right

# Save the result document
document.SaveToFile("AlignTable.docx", FileFormat.Docx2013)
document.Close()


Align the Text in Table Cells in Word in Python

The text within a table cell can be horizontally aligned to the left, center, or right side using the TableCell.Paragraphs[index].Format.HorizontalAlignment property. It can also be vertically aligned to the top, center, or bottom of the cell using the TableCell.CellFormat.VerticalAlignment property. The detailed steps are as follows.

Create an instance of the Document class.
Load a Word document using Document.LoadFromFile() method.
Get a specific section in the document using Document.Sections[index] property.
Get a specific table in the section using Section.Tables[index] property.
Loop through the rows in the table.
Loop through the cells in each row.
Set the vertical alignment for the text in each cell using TableCell.CellFormat.VerticalAlignment property.
Loop through the paragraphs in each cell.
Set the horizontal alignment for each paragraph using TableCell.Paragraphs[index].Format.HorizontalAlignment property.
Save the result document using Document.SaveToFile() method.

Python

from spire.doc import *
from spire.doc.common import *

# Create an instance of the Document class
document = Document()
# Load a Word document
document.LoadFromFile("Table.docx")

# Get the first section in the document
section = document.Sections[0]

# Get the first table in the section
table = section.Tables[0]

# Loop through the rows in the table
for row_index in range(table.Rows.Count):
    row = table.Rows[row_index]
    # Loop through the cells in the row
    for cell_index in range(row.Cells.Count):
        cell = row.Cells[cell_index]
        # Vertically align the text in the cell to the center
        cell.CellFormat.VerticalAlignment = VerticalAlignment.Middle
        # Horizontally align the text in the cell to the center
        for para_index in range(cell.Paragraphs.Count):
            paragraph = cell.Paragraphs[para_index]
            paragraph.Format.HorizontalAlignment = HorizontalAlignment.Center

# Save the result document
document.SaveToFile("AlignTableText.docx", FileFormat.Docx2013)
document.Close()
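
The loop above can also be wrapped in a small reusable helper. The following is a minimal sketch, not part of Spire.Doc: the function name align_table_text is hypothetical, and it only assumes that the table object exposes the same Rows, Cells, and Paragraphs collections (with a Count property and integer indexing) that the example above already relies on.

```python
def align_table_text(table, vertical, horizontal):
    """Apply one vertical and one horizontal alignment to every cell of a table.

    `table` is assumed to expose Rows / Cells / Paragraphs collections with a
    Count property and integer indexing, as used in the example above.
    Returns the number of cells that were updated.
    """
    updated = 0
    for row_index in range(table.Rows.Count):
        row = table.Rows[row_index]
        for cell_index in range(row.Cells.Count):
            cell = row.Cells[cell_index]
            # Vertical alignment is a property of the cell itself
            cell.CellFormat.VerticalAlignment = vertical
            # Horizontal alignment is set on each paragraph inside the cell
            for para_index in range(cell.Paragraphs.Count):
                cell.Paragraphs[para_index].Format.HorizontalAlignment = horizontal
            updated += 1
    return updated
```

With the objects from the example above, it could be called as align_table_text(section.Tables[0], VerticalAlignment.Middle, HorizontalAlignment.Center) before saving the document.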

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Python: Create and Scan QR Codes

2024-05-23 09:02:53 Written by Koohji

QR codes are a type of two-dimensional barcode that can store a variety of information, including URLs, contact details, and even payment information. QR codes have become increasingly popular, allowing for quick and convenient access to digital content, making them a useful tool in our modern, technology-driven world.

In this article, you will learn how to create and scan QR codes in Python using Spire.Barcode for Python.

Create a QR Code in Python
Scan a QR Code Image in Python

Get a Free Trial License

The trial version of Spire.Barcode for Python does not support scanning QR code images without a valid license being applied. Additionally, it displays an evaluation message on any QR code images that are generated.

To remove these limitations, you can get a 30-day trial license for free.

Create a QR Code in Python

Spire.Barcode for Python offers the BarcodeSettings class, which enables you to configure the settings for generating a barcode. These settings encompass the barcode type, the data to be encoded, the color, the margins, and the horizontal and vertical resolution.

After you have set up the desired settings, you can create a BarCodeGenerator instance using those configurations. Subsequently, you can invoke the GenerateImage() method of the generator to produce the barcode image.

The following are the steps to create a QR code in Python.

Create a BarcodeSettings object.
Set the barcode type to QR code using BarcodeSettings.Type property.
Set the data of the 2D barcode using BarcodeSettings.Data2D property.
Set other attributes of the barcode using the properties under the BarcodeSettings object.
Create a BarCodeGenerator object based on the settings.
Create a QR code image using BarCodeGenerator.GenerateImage() method.

Python

from spire.barcode import *

# Write all bytes to a file
def WriteAllBytes(fname: str, data):
    # The with statement closes the file automatically on exit
    with open(fname, "wb") as fp:
        fp.write(data)

# Apply license key
License.SetLicenseKey("license key")

# Create a BarcodeSettings object
barcodeSettings = BarcodeSettings()

# Set the type of barcode to QR code
barcodeSettings.Type = BarCodeType.QRCode

# Set the data for the 2D barcode
barcodeSettings.Data2D = "Hello, World"

# Set margins
barcodeSettings.LeftMargin = 0.2
barcodeSettings.RightMargin = 0.2
barcodeSettings.TopMargin = 0.2
barcodeSettings.BottomMargin = 0.2

# Set the horizontal resolution
barcodeSettings.DpiX = 500

# Set the vertical resolution
barcodeSettings.DpiY = 500

# Set error correction level
barcodeSettings.QRCodeECL = QRCodeECL.M

# Do not display text on barcode
barcodeSettings.ShowText = False
  
# Add a logo at the center of the QR code
barcodeSettings.SetQRCodeLogoImage("C:\\Users\\Administrator\\Desktop\\logo.png")

# Create an instance of BarCodeGenerator with the specified settings
barCodeGenerator = BarCodeGenerator(barcodeSettings)

# Generate the image for the barcode
image = barCodeGenerator.GenerateImage()

# Write the PNG image to disk
WriteAllBytes("output/QRCode.png", image)
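
Note that writing to "output/QRCode.png" fails if the output folder does not exist yet. As an alternative to the WriteAllBytes helper, the sketch below uses only the Python standard library to create the parent folder first. The function name write_image_bytes is hypothetical, and it assumes the generated image is a bytes-like object, as the example above implies.

```python
from pathlib import Path


def write_image_bytes(file_name: str, data: bytes) -> Path:
    """Write raw image bytes to file_name, creating parent folders if needed."""
    path = Path(file_name)
    # Create "output/" (or any missing parent folder) before writing
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path
```

It could replace the final call in the example above, for instance write_image_bytes("output/QRCode.png", image).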

Scan a QR Code Image in Python

Spire.Barcode provides the BarcodeScanner class, which is responsible for barcode image recognition. This class offers several methods to extract data from barcodes, including:

ScanOneFile(): Scans a single barcode image file and returns the extracted data.
ScanFile(): Scans all barcodes present in a specified image file and returns the extracted data.
ScanStream(): Scans barcodes from a stream of image data and returns the extracted information.

The following code demonstrates how to scan a QR code image using the BarcodeScanner class.

Python

from spire.barcode import *

# Apply license key
License.SetLicenseKey("license key")

# Scan an image file that contains one barcode
result = BarcodeScanner.ScanOneFile("C:\\Users\\Administrator\\Desktop\\QRCode.png")

# Scan an image file that contains multiple barcodes
# results = BarcodeScanner.ScanFile("C:\\Users\\Administrator\\Desktop\\Image.png")

# Print the result
print(result)
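
When switching between ScanOneFile() and ScanFile(), it can help to treat both outputs the same way. The following sketch is an illustration only: the helper name normalize_scan_results is hypothetical, and it assumes that ScanOneFile() returns a single string while ScanFile() returns a sequence of strings.

```python
def normalize_scan_results(result):
    """Return scan output as a list of non-empty strings.

    Accepts None, a single string, or any iterable of strings.
    """
    if result is None:
        return []
    if isinstance(result, str):
        result = [result]
    # Drop empty entries so callers can simply loop over real values
    return [text for text in result if text]


# Works the same for one decoded value or several
for index, text in enumerate(normalize_scan_results(["Hello, World", ""]), start=1):
    print(f"Barcode {index}: {text}")
```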

Highlight Text in PowerPoint Presentation in Python

Spire.Presentation for Python provides a method called IAutoShape.TextFrame.HighLightText(text: str, color: Color, options: TextHighLightingOptions) to highlight specific text within the shapes of a PowerPoint presentation.

Follow the steps below to highlight specified text in your presentation using Spire.Presentation for Python:

Create an instance of the Presentation class.
Load a PowerPoint presentation using the Presentation.LoadFromFile() method.
Create an instance of the TextHighLightingOptions class, and set the text highlighting options such as whole words only and case sensitive through the TextHighLightingOptions.WholeWordsOnly and TextHighLightingOptions.CaseSensitive properties.
Loop through the slides in the presentation and the shapes on each slide.
Check if the current shape is of IAutoShape type.
If the result is true, typecast it to an IAutoShape object.
Highlight all matches of specific text in the shape using the IAutoShape.TextFrame.HighLightText(text: str, color: Color, options: TextHighLightingOptions) method.
Save the result presentation to a new file using the Presentation.SaveToFile() method.

Python

from spire.presentation.common import *
from spire.presentation import *

# Specify the input and output file paths
input_file = "Example.pptx"
output_file = "HighlightText.pptx"

# Create an instance of the Presentation class
ppt = Presentation()
# Load the PowerPoint presentation
ppt.LoadFromFile(input_file)

# Specify the text to highlight
text_to_highlight = "Spire.Presentation"
# Specify the highlight color
highlight_color = Color.get_Yellow()

# Create an instance of the TextHighLightingOptions class
options = TextHighLightingOptions()
# Set the highlight options (case sensitivity and whole word highlighting)
options.WholeWordsOnly = True
options.CaseSensitive = True

# Loop through the slides in the presentation
for slide in ppt.Slides:
    # Loop through the shapes on each slide
    for shape in slide.Shapes:
        # Check if the shape is of IAutoShape type
        if isinstance(shape, IAutoShape):
            # Typecast the shape to an IAutoShape object
            auto_shape = IAutoShape(shape)
            # Search and highlight specified text within the shape
            auto_shape.TextFrame.HighLightText(text_to_highlight, highlight_color, options)

# Save the result presentation to a new PPTX file
ppt.SaveToFile(output_file, FileFormat.Pptx2013)
ppt.Dispose()
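
To get a feel for what the WholeWordsOnly and CaseSensitive options mean, the following sketch reproduces the two ideas with Python's standard re module. It is only an illustration under assumptions: find_matches is a hypothetical helper, and the exact word-boundary rules used by Spire.Presentation may differ.

```python
import re


def find_matches(text: str, target: str, whole_words_only: bool, case_sensitive: bool):
    """Return the (start, end) spans of target in text under the two options."""
    pattern = re.escape(target)
    if whole_words_only:
        # Reject matches that are glued to a letter, digit, or underscore
        pattern = r"(?<!\w)" + pattern + r"(?!\w)"
    flags = 0 if case_sensitive else re.IGNORECASE
    return [match.span() for match in re.finditer(pattern, text, flags)]


sample = "Spire.Presentation and spire.presentation; Spire.PresentationX"
# Strict: only the first occurrence qualifies
print(find_matches(sample, "Spire.Presentation", True, True))
# Relaxed: all three occurrences are found
print(find_matches(sample, "Spire.Presentation", False, False))
```

With both options enabled, the lowercase variant and the occurrence followed by "X" are skipped, which mirrors the intent of the settings used in the example above.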

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Python: Get Coordinates of the Specified Text or Image in PDF

2024-05-21 01:58:08 Written by Administrator

Retrieving the coordinates of text or images within a PDF document lets you quickly locate specific elements, which is valuable for extracting content from PDFs. It also makes it possible to add annotations, marks, or stamps at the desired locations in a PDF, allowing for more advanced document processing and manipulation.

In this article, you will learn how to get coordinates of the specified text or image in a PDF document using Spire.PDF for Python.

Get Coordinates of the Specified Text in PDF in Python
Get Coordinates of the Specified Image in PDF in Python

Install Spire.PDF for Python

This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. They can be easily installed on Windows through the following pip command.

Package Manager

pip install Spire.PDF

If you are unsure how to install, please refer to this tutorial: How to Install Spire.PDF for Python on Windows

Coordinate System in Spire.PDF

When using Spire.PDF to process an existing PDF document, the origin of the coordinate system is located at the top left corner of the page. The X-axis extends horizontally from the origin to the right, and the Y-axis extends vertically downward from the origin.
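
Other tools often measure Y upward from the bottom left corner of the page, which is the default in the PDF format itself. If coordinates need to be exchanged with such a tool, a conversion along the following lines may help. This is a minimal sketch: the function name and parameters are hypothetical, and it assumes all values use the same unit (for example, points).

```python
def top_left_to_bottom_left(y: float, page_height: float, object_height: float = 0.0) -> float:
    """Convert a Y value measured down from the top edge into one measured
    up from the bottom edge.

    Pass object_height to get the Y of the object's lower edge instead of
    its upper edge. X values are the same in both systems.
    """
    return page_height - y - object_height


# Example with an A4 page that is 842 points high
print(top_left_to_bottom_left(100, 842))       # upper edge of the object
print(top_left_to_bottom_left(100, 842, 20))   # lower edge of a 20-point-high object
```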

Get Coordinates of the Specified Text in PDF in Python

To find the coordinates of a specific piece of text within a PDF document, you must first use the PdfTextFinder.Find() method to locate all instances of the target text on a particular page. Once you have found these instances, you can then access the PdfTextFragment.Positions property to retrieve the precise (X, Y) coordinates for each instance of the text.

The steps to get coordinates of the specified text in PDF are as follows.

Create a PdfDocument object.
Load a PDF document from a specified path.
Get a specific page from the document.
Create a PdfTextFinder object.
Specify find options through PdfTextFinder.Options property.
Search for a string within the page using PdfTextFinder.Find() method.
Get a specific instance of the search results.
Get X and Y coordinates of the text through PdfTextFragment.Positions[0].X and PdfTextFragment.Positions[0].Y properties.

Python

from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF document
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Privacy Policy.pdf")

# Get a specific page
page = doc.Pages.get_Item(0)

# Create a PdfTextFinder object
textFinder = PdfTextFinder(page)

# Specify find options
findOptions = PdfTextFindOptions()
# Note: the second assignment replaces the first, so only WholeWord is likely to take effect
findOptions.Parameter = TextFindParameter.IgnoreCase
findOptions.Parameter = TextFindParameter.WholeWord
textFinder.Options = findOptions
 
# Search for the string "PRIVACY POLICY" within the page
findResults = textFinder.Find("PRIVACY POLICY") 

# Get the first instance of the results
result = findResults[0]

# Get X/Y coordinates of the found text
x = int(result.Positions[0].X)
y = int(result.Positions[0].Y)
print("The coordinates of the first instance of the found text are:", (x, y))

# Dispose resources
doc.Dispose()
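
The example above reads only the first search result and raises an IndexError if the text is not found on the page. The sketch below shows one way to collect the coordinates of every match while tolerating empty results. It is an illustration under assumptions: collect_positions is a hypothetical helper, and it assumes the search results and each Positions value behave like Python lists whose items expose X and Y, as the example above suggests.

```python
def collect_positions(find_results):
    """Return the (x, y) of the first position of every found text fragment."""
    coordinates = []
    for fragment in find_results:
        positions = fragment.Positions
        # Skip fragments that report no position at all
        if len(positions) == 0:
            continue
        coordinates.append((positions[0].X, positions[0].Y))
    return coordinates
```

With the objects from the example above, collect_positions(findResults) would return an empty list when nothing matches, so the caller can check the result before indexing into it.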

Get Coordinates of the Specified Image in PDF in Python

Spire.PDF for Python provides the PdfImageHelper class, which allows users to extract image details from a specific page within a PDF file. You can then use the PdfImageInfo.Bounds property to retrieve the (X, Y) coordinates of an individual image.

The steps to get coordinates of the specified image in PDF are as follows.

Create a PdfDocument object.
Load a PDF document from a specified path.
Get a specific page from the document.
Create a PdfImageHelper object.
Get the image information from the page using PdfImageHelper.GetImagesInfo() method.
Get X and Y coordinates of a specific image through PdfImageInfo.Bounds property.

Python

from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF document
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Privacy Policy.pdf")

# Get a specific page 
page = doc.Pages.get_Item(0)

# Create a PdfImageHelper object
imageHelper = PdfImageHelper()

# Get image information from the page
imageInformation = imageHelper.GetImagesInfo(page)

# Get X/Y coordinates of a specific image
x = int(imageInformation[0].Bounds.X)
y = int(imageInformation[0].Bounds.Y)
print("The coordinates of the specified image are:", (x, y))

# Dispose resources
doc.Dispose()
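
The X and Y values describe the top left corner of the image. To work with the whole area, for example to test whether a point falls inside the image, the size is needed as well. The following sketch uses a plain Python class for that purpose. Rect is a hypothetical helper, not a Spire.PDF type, and filling it from Bounds assumes that Bounds also exposes Width and Height.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in a coordinate system with a top left origin."""
    x: float
    y: float
    width: float
    height: float

    @property
    def bottom_right(self):
        # Y grows downward, so the lower edge has the larger Y value
        return (self.x + self.width, self.y + self.height)

    def contains(self, px: float, py: float) -> bool:
        """Return True if the point lies inside or on the edge of the rectangle."""
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


area = Rect(72, 100, 200, 50)
print(area.bottom_right)
print(area.contains(100, 120))
```

With the objects from the example above, and under the stated assumption, a rectangle could be built as Rect(b.X, b.Y, b.Width, b.Height), where b is imageInformation[0].Bounds.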

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
