We are happy to announce the release of Spire.Doc for Python 12.2.1. This version supports obtaining page content through fixed layout. More details are listed below.
Here is a list of changes made in this release:
| Category | ID | Description |
| --- | --- | --- |
| New feature | - | Supports obtaining page content through fixed layout.
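from spire.doc import *
from spire.doc.common import *
# Helper that writes the given text to a file as UTF-8
def WriteAllText(fpath: str, content: str):
    with open(fpath, 'w', encoding="utf-8") as fp:
        fp.write(content)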
# Specify the file path
inputFile = "./Data/Sample.docx"
outputFile = "output.txt"
# Create a new instance of Document
doc = Document()
# Load the document from the specified file
doc.LoadFromFile(inputFile, FileFormat.Docx)
# Create a FixedLayoutDocument object using the loaded document
layoutDoc = FixedLayoutDocument(doc)
result = ''
# Get the first line on the first page
line = layoutDoc.Pages[0].Columns[0].Lines[0]
result += "Line: "
result += line.Text
result += "\n"
# Retrieve the original paragraph associated with the line
para = line.Paragraph
result += "Paragraph text: "
result += para.Text
result += "\n"
# Retrieve all the text that appears on the first page in plain text format (including headers and footers).
pageText = layoutDoc.Pages[0].Text
result += pageText
result += "\n"
# Loop through each page in the document and print how many lines appear on each page.
pages = layoutDoc.Pages
for i in range(pages.Count):
    page = pages[i]
    lines = page.GetChildEntities(LayoutElementType.Line, True)
    result += "Page "
    result += str(page.PageIndex)
    result += " has "
    result += str(lines.Count)
    result += " lines."
    result += "\n"
# Perform a reverse lookup of layout entities for the first paragraph
result += "\n"
result += "The lines of the first paragraph:"
result += "\n"
tempChild = doc.FirstChild
section = Section(tempChild)
para = section.Body.Paragraphs[0]
paragraphLines = layoutDoc.GetLayoutEntitiesOfNode(para)
for i in range(paragraphLines.Count):
    tempLine = paragraphLines[i]
    paragraphLine = FixedLayoutLine(tempLine)
    result += (paragraphLine.Text).strip()
    result += "\n"
    result += paragraphLine.Rectangle.ToString()
    result += "\n"
    result += "\n"
# Write the extracted text to a file
WriteAllText(outputFile, result)
# Dispose of the document resources
doc.Dispose()
|
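The per-page line count and the final `Dispose()` call in the sample above can be factored into small reusable helpers. The following is a minimal sketch, not part of the Spire.Doc API: `count_lines_per_page`, `format_line_counts`, and `disposing` are hypothetical names introduced here for illustration. The helpers only touch members that already appear in the sample (`Pages`, `Count`, `GetChildEntities`, `PageIndex`, `Dispose`), and the entity type is passed in as an argument so the block itself does not need to import the library.

```python
from contextlib import contextmanager


def count_lines_per_page(layout_doc, line_type):
    """Map each page's PageIndex to the number of layout lines on it.

    layout_doc: a FixedLayoutDocument (or any object with the same members).
    line_type:  the entity type to query, e.g. LayoutElementType.Line.
    """
    counts = {}
    pages = layout_doc.Pages
    for i in range(pages.Count):
        page = pages[i]
        # True mirrors the second argument used in the release sample.
        lines = page.GetChildEntities(line_type, True)
        counts[page.PageIndex] = lines.Count
    return counts


def format_line_counts(counts):
    """Render the counts in the same "Page N has M lines." form as the sample."""
    return "".join(
        f"Page {index} has {count} lines.\n" for index, count in counts.items()
    )


@contextmanager
def disposing(resource):
    """Yield the resource and call its Dispose() on exit, even after an error."""
    try:
        yield resource
    finally:
        resource.Dispose()
```

With the objects from the sample, this would presumably be used as `with disposing(doc): result += format_line_counts(count_lines_per_page(layoutDoc, LayoutElementType.Line))`. Wrapping the work in `disposing` is a design choice made here, so that the document is released even if a layout call raises before the final `doc.Dispose()` line is reached.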
Click the link below to get Spire.Doc for Python 12.2.1: