Python: Extract Bookmarks from PDF
PDF files often contain bookmarks, which are clickable links that make navigating lengthy documents easier. Extracting these bookmarks can be beneficial for creating an outline of the document, analyzing document structure, or identifying key topics or sections. In this article, you will learn how to extract PDF bookmarks with Python using Spire.PDF for Python.
Install Spire.PDF for Python
This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. Both can be installed on Windows with the following pip command.
pip install Spire.PDF
If you are unsure how to install, please refer to this tutorial: How to Install Spire.PDF for Python on Windows
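To confirm that the installation succeeded, the installed versions can be read from package metadata. The following is a minimal sketch that uses only the standard library. installed_version is a hypothetical helper name, and the distribution names are assumed to match the pip command and the requirement stated above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are assumptions taken from the install instructions.
    for name in ("Spire.PDF", "plum-dispatch"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```

If either line reports "not installed", rerunning the pip command in the same Python environment is the first thing to try.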
Extract Bookmarks from PDF Using Python
With Spire.PDF for Python, you can define the custom functions GetBookmarks() and GetChildBookmark() to get the titles and text styles of both parent and child bookmarks in a PDF file, then export them to a TXT file. The following are the detailed steps.
- Create a PdfDocument instance.
- Load a PDF file using the PdfDocument.LoadFromFile() method.
- Get the bookmarks collection of the PDF file using the PdfDocument.Bookmarks property.
- Call the custom functions GetBookmarks() and GetChildBookmark() to get the titles and text styles of the parent and child bookmarks.
- Export the extracted PDF bookmarks to a TXT file.
from spire.pdf.common import *
from spire.pdf import *

inputFile = "AnnualReport.pdf"
result = "GetPdfBookmarks.txt"

def GetChildBookmark(parentBookmark, content):
    if parentBookmark.Count > 0:
        # Iterate through each child bookmark in the parent bookmarks
        for i in range(parentBookmark.Count):
            childBookmark = parentBookmark.get_Item(i)
            # Get the title
            content.append(childBookmark.Title)
            # Get the text style
            textStyle = str(childBookmark.DisplayStyle)
            content.append(textStyle)
            # Recurse into the children of this bookmark
            cldBk = PdfBookmarkCollection(childBookmark)
            GetChildBookmark(cldBk, content)

def GetBookmarks(bookmarks, result):
    # Create a list to collect the bookmark text
    content = []
    # Get PDF bookmarks information
    if bookmarks.Count > 0:
        content.append("Pdf bookmarks:")
        # Iterate through each parent bookmark
        for i in range(bookmarks.Count):
            parentBookmark = bookmarks.get_Item(i)
            # Get the title
            content.append(parentBookmark.Title)
            # Get the text style
            textStyle = str(parentBookmark.DisplayStyle)
            content.append(textStyle)
            # Collect the child bookmarks of this parent
            cldBk = PdfBookmarkCollection(parentBookmark)
            GetChildBookmark(cldBk, content)
    # Save to a TXT file (UTF-8 so that non-ASCII titles are written safely)
    with open(result, "w", encoding="utf-8") as file:
        file.write("\n".join(content))

# Create a PdfDocument instance
pdf = PdfDocument()
# Load a PDF file from disk
pdf.LoadFromFile(inputFile)
# Get bookmarks collection of the PDF file
bookmarks = pdf.Bookmarks
# Get the contents of bookmarks and save them to a TXT file
GetBookmarks(bookmarks, result)
pdf.Close()
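The exported text lists parents and children at the same level, so the hierarchy is lost. As one possible variation, the recursion can carry a depth counter and indent each title accordingly. The sketch below assumes only the members used in the code above (Count, get_Item() and Title); collect_outline, FakeBookmark and FakeCollection are hypothetical names, and the stand-in classes exist only so the example runs without a PDF file.

```python
def collect_outline(bookmarks, wrap_children, depth=0, lines=None):
    """Return bookmark titles as a list of strings, indented by nesting depth.

    bookmarks     -- any object exposing Count and get_Item(i)
    wrap_children -- callable that turns a bookmark into its child collection
    """
    if lines is None:
        lines = []
    for i in range(bookmarks.Count):
        bookmark = bookmarks.get_Item(i)
        lines.append("  " * depth + bookmark.Title)
        collect_outline(wrap_children(bookmark), wrap_children, depth + 1, lines)
    return lines


# Stand-in classes (hypothetical) that mimic just the members used above.
class FakeBookmark:
    def __init__(self, title, children=()):
        self.Title = title
        self.children = list(children)


class FakeCollection:
    def __init__(self, items):
        self._items = list(items)

    @property
    def Count(self):
        return len(self._items)

    def get_Item(self, index):
        return self._items[index]


tree = FakeCollection([
    FakeBookmark("Chapter 1", [FakeBookmark("Section 1.1"), FakeBookmark("Section 1.2")]),
    FakeBookmark("Chapter 2"),
])
outline = collect_outline(tree, lambda b: FakeCollection(b.children))
print("\n".join(outline))
```

With a real document, the equivalent call would presumably be collect_outline(pdf.Bookmarks, PdfBookmarkCollection), since the code above builds child collections the same way.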

Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents or lift the functional limitations, please request a 30-day trial license.
Python: Expand or Collapse Bookmarks in PDF
PDF bookmarks are key tools for navigating a document. When bookmarks are expanded, readers can see the sub-level entries and click any title to jump to the corresponding chapter, which gives direct access to the document's deeper structure. When bookmarks are collapsed, the sub-bookmarks at that level are hidden, which simplifies the view and keeps the focus on the higher-level structure. Together, these two operations make complex, multi-level PDF documents easier to read. This article introduces how to programmatically expand and collapse bookmarks in a PDF using Spire.PDF for Python.
Expand or Collapse All Bookmarks in Python
Spire.PDF for Python provides the BookMarkExpandOrCollapse property to expand or collapse bookmarks. Setting it to True expands all bookmarks; setting it to False collapses all bookmarks. The following are the detailed steps for expanding bookmarks in a PDF document.
- Create a PdfDocument class instance.
- Load a PDF document using PdfDocument.LoadFromFile() method.
- Expand all bookmarks by setting the PdfDocument.ViewerPreferences.BookMarkExpandOrCollapse property to True.
- Save the document using PdfDocument.SaveToFile() method.
from spire.pdf.common import *
from spire.pdf import *
# Create a PdfDocument object
doc = PdfDocument()
# Load a PDF file
doc.LoadFromFile("Terms of service.pdf")
# Set BookMarkExpandOrCollapse as True to expand all bookmarks, set False to collapse all bookmarks
doc.ViewerPreferences.BookMarkExpandOrCollapse = True
# Save the document
outputFile = "ExpandAllBookmarks.pdf"
doc.SaveToFile(outputFile)
# Close the document
doc.Close()

Expand or Collapse a Specific Bookmark in Python
If you need to expand or collapse only a specific bookmark, you can use the property ExpandBookmark. The following are the detailed steps.
- Create a PdfDocument class instance.
- Load a PDF document using PdfDocument.LoadFromFile() method.
- Get a specific bookmark using PdfDocument.Bookmarks.get_Item() method.
- Expand the bookmark by setting its ExpandBookmark property to True, or collapse it by setting the property to False.
- Save the result document using PdfDocument.SaveToFile() method.
from spire.pdf.common import *
from spire.pdf import *
# Create a PdfDocument object
doc = PdfDocument()
# Load a PDF file
doc.LoadFromFile("Terms of service.pdf")
# Set ExpandBookmark as True for the third bookmark
doc.Bookmarks.get_Item(2).ExpandBookmark = True
# Save the document
outputFile = "ExpandSpecifiedBookmarks.pdf"
doc.SaveToFile(outputFile)
# Close the document
doc.Close()
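The example above assumes the document has at least three top-level bookmarks. A small guard can make that assumption explicit before the property is set. The sketch below relies only on the Count, get_Item() and ExpandBookmark members shown in this section; set_bookmark_expanded and FakeCollection are hypothetical names, and the stand-in objects are there only so the example runs without a PDF file.

```python
from types import SimpleNamespace


def set_bookmark_expanded(bookmarks, index, expanded=True):
    """Set ExpandBookmark on the bookmark at `index`, checking the range first."""
    count = bookmarks.Count
    if not 0 <= index < count:
        raise IndexError(f"bookmark index {index} out of range (document has {count})")
    bookmark = bookmarks.get_Item(index)
    bookmark.ExpandBookmark = expanded
    return bookmark


# Stand-in collection (hypothetical) exposing only Count and get_Item().
class FakeCollection:
    def __init__(self, items):
        self._items = list(items)

    @property
    def Count(self):
        return len(self._items)

    def get_Item(self, index):
        return self._items[index]


demo = FakeCollection(
    [SimpleNamespace(Title=t, ExpandBookmark=False) for t in ("A", "B", "C")]
)
set_bookmark_expanded(demo, 2)  # expand the third bookmark
print([demo.get_Item(i).ExpandBookmark for i in range(demo.Count)])
```

With a loaded document, the same helper would presumably be called as set_bookmark_expanded(doc.Bookmarks, 2).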

Python: Add, Edit, or Delete Bookmarks in PDF
PDF bookmarks are navigational aids that allow users to quickly locate and jump to specific sections or pages in a PDF document. Through a simple click, users can arrive at the target location, which eliminates the need to manually scroll or search for specific content in a lengthy document. In this article, you will learn how to programmatically add, modify and delete bookmarks in PDF files using Spire.PDF for Python.
- Add Bookmarks to a PDF Document
- Edit Bookmarks in a PDF Document
- Delete Bookmarks from a PDF Document
Add Bookmarks to a PDF Document in Python
Spire.PDF for Python provides the PdfDocument.Bookmarks.Add() method to add bookmarks to a PDF document. You can use this method to create primary bookmarks and use the PdfBookmarkCollection.Add() method to add sub-bookmarks to them. Additionally, the PdfBookmark class offers properties for setting the destination, text color, and text style of a bookmark. The following are the detailed steps for adding bookmarks to a PDF document.
- Create a PdfDocument class instance.
- Load a PDF document using PdfDocument.LoadFromFile() method.
- Add a parent bookmark to the document using PdfDocument.Bookmarks.Add() method.
- Create a PdfDestination class object and set the destination of the parent bookmark using PdfBookmark.Action property.
- Set the text color and style of the parent bookmark.
- Create a PdfBookmarkCollection class object and add a sub-bookmark to the parent bookmark using the PdfBookmarkCollection.Add() method.
- Set the destination, text color, and text style of the sub-bookmark in the same way as for the parent bookmark.
- Save the document using PdfDocument.SaveToFile() method.
from spire.pdf.common import *
from spire.pdf import *
# Create a PdfDocument object
doc = PdfDocument()
# Load a PDF file
doc.LoadFromFile("Terms of service.pdf")
# Loop through the pages in the PDF file
for i in range(doc.Pages.Count):
    page = doc.Pages[i]
    # Set the title and destination for the bookmark
    bookmarkTitle = "Bookmark-{0}".format(i+1)
    bookmarkDest = PdfDestination(page, PointF(0.0, 0.0))
    # Create and configure the bookmark
    bookmark = doc.Bookmarks.Add(bookmarkTitle)
    bookmark.Color = PdfRGBColor(Color.get_SaddleBrown())
    bookmark.DisplayStyle = PdfTextStyle.Bold
    bookmark.Action = PdfGoToAction(bookmarkDest)
    # Create a collection to hold child bookmarks
    bookmarkCollection = PdfBookmarkCollection(bookmark)
    # Set the title and destination for the child bookmark
    childBookmarkTitle = "Sub-Bookmark-{0}".format(i+1)
    childBookmarkDest = PdfDestination(page, PointF(0.0, 100.0))
    # Create and configure the child bookmark
    childBookmark = bookmarkCollection.Add(childBookmarkTitle)
    childBookmark.Color = PdfRGBColor(Color.get_Coral())
    childBookmark.DisplayStyle = PdfTextStyle.Italic
    childBookmark.Action = PdfGoToAction(childBookmarkDest)
# Save the PDF file
outputFile = "Bookmark.pdf"
doc.SaveToFile(outputFile)
# Close the document
doc.Close()
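The example above names bookmarks after page numbers. In practice the titles usually come from a table of contents, and it can help to validate that list before touching the document. The following is a minimal sketch of such a check in plain Python; plan_bookmarks and the sample entries are hypothetical, and the commented lines at the end only reuse the calls from the example above.

```python
def plan_bookmarks(toc, page_count):
    """Convert (title, 1-based page number) pairs to (title, 0-based page index).

    Raises ValueError for an empty title or a page number outside the document.
    """
    plan = []
    for title, page_number in toc:
        if not title.strip():
            raise ValueError("bookmark title must not be empty")
        if not 1 <= page_number <= page_count:
            raise ValueError(
                f"page {page_number} is outside 1..{page_count} for {title!r}"
            )
        plan.append((title.strip(), page_number - 1))
    return plan


# Hypothetical table of contents for a five-page document.
toc = [("Introduction", 1), ("Terms", 2), ("Privacy", 4)]
plan = plan_bookmarks(toc, page_count=5)
print(plan)

# Applying the plan with the calls from the example above (not run here):
# for title, page_index in plan:
#     bookmark = doc.Bookmarks.Add(title)
#     dest = PdfDestination(doc.Pages[page_index], PointF(0.0, 0.0))
#     bookmark.Action = PdfGoToAction(dest)
```

Validating first means a bad page number is reported before any bookmark has been added, so the document is never left half-updated.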

Edit Bookmarks in a PDF Document
If you need to update existing bookmarks, you can use the properties of the PdfBookmark class to rename them and change their text color and text style. The following are the detailed steps.
- Create a PdfDocument class instance.
- Load a PDF document using PdfDocument.LoadFromFile() method.
- Get a specified bookmark using PdfDocument.Bookmarks[] property.
- Change the title of the bookmark using PdfBookmark.Title property.
- Change the font color of the bookmark using PdfBookmark.Color property.
- Change the text style of the bookmark using PdfBookmark.DisplayStyle property.
- Change the text color and style of the sub-bookmarks using the same properties.
- Save the result document using PdfDocument.SaveToFile() method.
from spire.pdf.common import *
from spire.pdf import *
# Create a PdfDocument object
doc = PdfDocument()
# Load a PDF file
doc.LoadFromFile("Bookmark.pdf")
# Get the first bookmark
bookmark = doc.Bookmarks[0]
# Change the title of the bookmark
bookmark.Title = "Modified BookMark"
# Set the color of the bookmark
bookmark.Color = PdfRGBColor(Color.get_Black())
# Set the outline text style of the bookmark
bookmark.DisplayStyle = PdfTextStyle.Bold
# Edit child bookmarks of the parent bookmark
pBookmark = PdfBookmarkCollection(bookmark)
for i in range(pBookmark.Count):
    childBookmark = pBookmark.get_Item(i)
    childBookmark.Color = PdfRGBColor(Color.get_Blue())
    childBookmark.DisplayStyle = PdfTextStyle.Regular
# Save the PDF document
outputFile = "EditBookmark.pdf"
doc.SaveToFile(outputFile)
# Close the document
doc.Close()
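The example above edits the first bookmark and its direct children only. To apply the same change at every nesting level, one option is a small recursive visitor. The sketch below assumes only the Count and get_Item() members used in this article; for_each_bookmark, tidy_title and the stand-in classes are hypothetical and exist only so the example runs without a PDF file.

```python
def for_each_bookmark(bookmarks, wrap_children, action, depth=0):
    """Call action(bookmark, depth) on every bookmark, parents before children.

    Returns the number of bookmarks visited.
    """
    visited = 0
    for i in range(bookmarks.Count):
        bookmark = bookmarks.get_Item(i)
        action(bookmark, depth)
        visited += 1
        visited += for_each_bookmark(
            wrap_children(bookmark), wrap_children, action, depth + 1
        )
    return visited


# Stand-in classes (hypothetical) that mimic just the members used above.
class FakeBookmark:
    def __init__(self, title, children=()):
        self.Title = title
        self.children = list(children)


class FakeCollection:
    def __init__(self, items):
        self._items = list(items)

    @property
    def Count(self):
        return len(self._items)

    def get_Item(self, index):
        return self._items[index]


def tidy_title(bookmark, depth):
    # Example edit: strip stray whitespace from the title.
    bookmark.Title = bookmark.Title.strip()


tree = FakeCollection([
    FakeBookmark(" Chapter 1 ", [FakeBookmark("Section 1.1  ")]),
    FakeBookmark("Chapter 2"),
])
visited = for_each_bookmark(tree, lambda b: FakeCollection(b.children), tidy_title)
print(visited, tree.get_Item(0).Title)
```

With a real document, the call would presumably be for_each_bookmark(doc.Bookmarks, PdfBookmarkCollection, action), where action sets Color or DisplayStyle as in the example above.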

Delete Bookmarks from a PDF Document
Spire.PDF for Python also provides methods to delete any bookmark in a PDF document. The PdfDocument.Bookmarks.RemoveAt() method removes a specific primary bookmark, the PdfDocument.Bookmarks.Clear() method removes all bookmarks, and the PdfBookmarkCollection.RemoveAt() method removes a specific sub-bookmark of a primary bookmark. The detailed steps for removing bookmarks from a PDF document are as follows.
- Create a PdfDocument class instance.
- Load a PDF document using PdfDocument.LoadFromFile() method.
- Get the first bookmark using PdfDocument.Bookmarks[] property.
- Remove a specified sub-bookmark of the first bookmark using PdfBookmarkCollection.RemoveAt() method.
- Remove a specified bookmark including its sub-bookmarks using PdfDocument.Bookmarks.RemoveAt() method.
- Remove all bookmarks in the PDF file using PdfDocument.Bookmarks.Clear() method.
- Save the document using PdfDocument.SaveToFile() method.
from spire.pdf.common import *
from spire.pdf import *
# Create a PdfDocument object
doc = PdfDocument()
# Load a PDF file
doc.LoadFromFile("Bookmark.pdf")
# Alternative 1 (uncomment to use): delete the first bookmark, including its sub-bookmarks
# doc.Bookmarks.RemoveAt(0)
# Alternative 2 (uncomment to use): remove the first child bookmark of the first parent bookmark
# bookmark = doc.Bookmarks[0]
# pBookmark = PdfBookmarkCollection(bookmark)
# pBookmark.RemoveAt(0)
# Remove all bookmarks
doc.Bookmarks.Clear()
doc.Bookmarks.Clear()
# Save the PDF document
output = "DeleteAllBookmarks.pdf"
doc.SaveToFile(output)
# Close the document
doc.Close()
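When several bookmarks are removed by position, each removal shifts the positions of the bookmarks after it. A common way to avoid removing the wrong entries is to delete from the highest index down. The sketch below assumes only the Count and RemoveAt() members mentioned in this section; remove_bookmarks_at and FakeCollection are hypothetical names, and the stand-in class is there only so the example runs without a PDF file.

```python
def remove_bookmarks_at(bookmarks, indices):
    """Remove the bookmarks at the given positions; return how many were removed.

    Positions refer to the collection as it is before any removal.
    """
    unique = sorted(set(indices), reverse=True)
    count = bookmarks.Count
    # Validate everything first so a bad index leaves the collection untouched.
    for index in unique:
        if not 0 <= index < count:
            raise IndexError(f"bookmark index {index} out of range (count is {count})")
    # Highest index first, so earlier positions are not shifted.
    for index in unique:
        bookmarks.RemoveAt(index)
    return len(unique)


# Stand-in collection (hypothetical) exposing only Count and RemoveAt().
class FakeCollection:
    def __init__(self, titles):
        self.titles = list(titles)

    @property
    def Count(self):
        return len(self.titles)

    def RemoveAt(self, index):
        del self.titles[index]


demo = FakeCollection(["A", "B", "C", "D"])
removed = remove_bookmarks_at(demo, [0, 2])
print(removed, demo.titles)
```

With a loaded document, the helper would presumably be called as remove_bookmarks_at(doc.Bookmarks, [0, 2]) before saving.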
