1. Environment Setup
1.1 Install the Free Library
Use the NuGet Package Manager Console in Visual Studio to install Free Spire.PDF:
Install-Package FreeSpire.PDF
The free version supports basic operations such as reading PDF bookmarks and does not require an additional license file, but it is limited to 10 pages per document.
1.2 Import Namespaces
Add the following namespaces to your code:
using System;
using System.IO;
using System.Text;
using Spire.Pdf;
using Spire.Pdf.Bookmarks;
2. Core Implementation Logic
The overall process can be broken down into four steps:
Load the target PDF document.
Retrieve the document's PdfBookmarkCollection.
Recursively traverse each bookmark and its child bookmarks to extract titles and display styles.
Write the extracted content to a text file.
2.1 Load the Document and Retrieve the Bookmark Collection
PdfDocument pdf = new PdfDocument();
pdf.LoadFromFile(@"D:\test.pdf");
PdfBookmarkCollection bookmarks = pdf.Bookmarks;
The Bookmarks property returns a collection containing the top-level bookmarks. If the document has no bookmarks, the Count will be 0.
2.2 Recursively Traverse the Bookmark Tree
The bookmark structure is a typical tree: each bookmark node may contain a collection of child bookmarks (accessible via the Count property and indexer). We design two methods:
GetBookmarks: Handles top-level bookmarks, initializes a StringBuilder, and starts the recursion.
GetChildBookmark: Recursively processes child bookmarks.
public static void GetBookmarks(PdfBookmarkCollection bookmarks, string result)
{
    StringBuilder content = new StringBuilder();
    if (bookmarks.Count > 0)
    {
        content.AppendLine("Pdf bookmarks:");
        foreach (PdfBookmark parentBookmark in bookmarks)
        {
            // Retrieve the title
            content.AppendLine(parentBookmark.Title);
            // Retrieve the display style (e.g., regular, bold, italic, etc.)
            content.AppendLine(parentBookmark.DisplayStyle.ToString());
            // Recursively process child bookmarks
            GetChildBookmark(parentBookmark, content);
        }
    }
    File.WriteAllText(result, content.ToString());
}
Recursive method:
public static void GetChildBookmark(PdfBookmark parentBookmark, StringBuilder content)
{
    if (parentBookmark.Count > 0)
    {
        foreach (PdfBookmark childBookmark in parentBookmark)
        {
            content.AppendLine(childBookmark.Title);
            content.AppendLine(childBookmark.DisplayStyle.ToString());
            GetChildBookmark(childBookmark, content);
        }
    }
}
2.3 Complete Code Example
Below is a complete console application example that outputs bookmark information to a file named GetPdfBookmarks.txt.
using System;
using System.IO;
using System.Text;
using Spire.Pdf;
using Spire.Pdf.Bookmarks;
namespace GetBookmark
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Load the document and get its top-level bookmark collection
            PdfDocument pdf = new PdfDocument();
            pdf.LoadFromFile(@"D:\testp\test.pdf");
            PdfBookmarkCollection bookmarks = pdf.Bookmarks;
            string result = "GetPdfBookmarks.txt";
            GetBookmarks(bookmarks, result);
            Console.WriteLine("Bookmark extraction completed. The results have been saved to: " + result);
        }
        public static void GetBookmarks(PdfBookmarkCollection bookmarks, string result)
        {
            StringBuilder content = new StringBuilder();
            if (bookmarks.Count > 0)
            {
                content.AppendLine("Pdf bookmarks:");
                foreach (PdfBookmark parentBookmark in bookmarks)
                {
                    // Title first, display style second, then all descendants
                    content.AppendLine(parentBookmark.Title);
                    content.AppendLine(parentBookmark.DisplayStyle.ToString());
                    GetChildBookmark(parentBookmark, content);
                }
            }
            else
            {
                // Write a message so the output file is never empty
                content.AppendLine("The PDF document does not contain any bookmarks.");
            }
            File.WriteAllText(result, content.ToString());
        }
        public static void GetChildBookmark(PdfBookmark parentBookmark, StringBuilder content)
        {
            if (parentBookmark.Count > 0)
            {
                foreach (PdfBookmark childBookmark in parentBookmark)
                {
                    content.AppendLine(childBookmark.Title);
                    content.AppendLine(childBookmark.DisplayStyle.ToString());
                    GetChildBookmark(childBookmark, content);
                }
            }
        }
    }
}
3. Output Format Description
Each bookmark in the generated text file is represented by two lines: the first line is the title, and the second line is the display style. For example:
Pdf bookmarks:
Chapter 1 Introduction
Regular
1.1 Background
Bold
1.2 Objectives
Italic
Chapter 2 Implementation
Regular
2.1 Environment Setup
Regular
DisplayStyle is an enumeration with the following possible values:
Regular: Normal text
Bold: Bold
Italic: Italic
The output will vary depending on the actual bookmark styles defined in the PDF document.
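Because every bookmark occupies exactly two lines after the "Pdf bookmarks:" header, the file can be read back by pairing lines. The following is a minimal sketch of that idea; BookmarkOutputReader and Parse are hypothetical names introduced here for illustration, not part of Free Spire.PDF, and the sketch assumes no bookmark title contains a line break.

```csharp
using System;
using System.Collections.Generic;

public static class BookmarkOutputReader
{
    // Hypothetical helper: turns the lines of the generated text file into
    // (title, style) pairs. Assumes the first line is the "Pdf bookmarks:" header.
    public static List<KeyValuePair<string, string>> Parse(string[] lines)
    {
        var entries = new List<KeyValuePair<string, string>>();
        // Start at index 1 to skip the header, then consume two lines per bookmark.
        for (int i = 1; i + 1 < lines.Length; i += 2)
        {
            entries.Add(new KeyValuePair<string, string>(lines[i], lines[i + 1]));
        }
        return entries;
    }
}
```

A possible call is BookmarkOutputReader.Parse(File.ReadAllLines("GetPdfBookmarks.txt")), which needs System.IO. A file that contains only the "no bookmarks" message yields an empty list, since a single line leaves nothing to pair.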
4. Notes and Extensions
4.1 Bookmarks May Be Empty
If the PDF has no bookmarks, bookmarks.Count will be 0. In this case, the complete example in section 2.3 writes a message to the file instead of producing an empty file.
4.2 Retrieving Target Page Numbers and Actions
The above example only retrieves the title and style. If you also need to get the target page number a bookmark links to, you can use the PdfBookmark.Action property (be sure to check the action type). For example:
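// Assumes "using Spire.Pdf.Actions;" and that the PdfDocument instance (pdf) is in scope here.
if (parentBookmark.Action is PdfGoToAction goToAction)
{
    // IndexOf is zero-based, so add 1 to get the page number shown to readers
    int pageIndex = pdf.Pages.IndexOf(goToAction.Destination.Page);
    content.AppendLine($"Navigates to page {pageIndex + 1}");
}
Free Spire.PDF provides fairly comprehensive support for Action, so you can extend the functionality based on your specific needs.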
4.3 Performance Considerations
For PDFs containing thousands of bookmarks, recursive traversal typically does not cause noticeable performance issues. However, if extraction needs to be performed frequently, consider using a StreamWriter for streaming writes instead of a StringBuilder to reduce memory usage.
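As a rough sketch of the streaming variant, the recursion can write each bookmark straight to a writer instead of accumulating text in memory. The names StreamingBookmarkExport, Export, and WriteBookmark are hypothetical; the Spire.PDF members used (Title, DisplayStyle, and foreach over a bookmark) are the same ones the examples above rely on, and the block needs the FreeSpire.PDF package to compile.

```csharp
using System.IO;
using System.Text;
using Spire.Pdf;
using Spire.Pdf.Bookmarks;

public static class StreamingBookmarkExport
{
    // Hypothetical helper: writes the bookmark tree line by line.
    public static void Export(PdfBookmarkCollection bookmarks, string path)
    {
        // UTF8Encoding(false) writes UTF-8 without a byte order mark,
        // matching the default behavior of File.WriteAllText.
        using (var writer = new StreamWriter(path, false, new UTF8Encoding(false)))
        {
            foreach (PdfBookmark bookmark in bookmarks)
            {
                WriteBookmark(bookmark, writer);
            }
        }
    }

    // Writes one bookmark (title, then style) followed by all of its descendants.
    private static void WriteBookmark(PdfBookmark bookmark, TextWriter writer)
    {
        writer.WriteLine(bookmark.Title);
        writer.WriteLine(bookmark.DisplayStyle.ToString());
        foreach (PdfBookmark child in bookmark)
        {
            WriteBookmark(child, writer);
        }
    }
}
```

This sketch omits the "Pdf bookmarks:" header and the empty-document message, so add them before the loop if the output must match the format described earlier.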
4.4 Encoding Handling
File.WriteAllText uses UTF-8 encoding by default. If you need a different encoding (such as GB2312), pass an Encoding to the File.WriteAllText overload that accepts one, or use a StreamWriter instead.
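One way to make the encoding explicit is a small wrapper around StreamWriter, sketched below. BookmarkTextWriter and Write are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class BookmarkTextWriter
{
    // Hypothetical helper: writes the extracted text using the encoding the caller chooses.
    public static void Write(string path, string content, Encoding encoding)
    {
        // append: false overwrites any existing file at the given path.
        using (var writer = new StreamWriter(path, false, encoding))
        {
            writer.Write(content);
        }
    }
}
```

For GB2312, the call would look like BookmarkTextWriter.Write(result, content.ToString(), Encoding.GetEncoding("GB2312")). On .NET Core and .NET 5 or later, code page encodings such as GB2312 are not available until Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) has been called; on .NET Framework they are available by default.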
5. Summary
This article demonstrates how to fully extract multi-level bookmark information from a PDF document using a free .NET library. The key points include:
Accessing the root bookmark collection via PdfDocument.Bookmarks.
Recursively traversing PdfBookmark nodes using the Count property and indexer.
Reading the Title and DisplayStyle properties.
Writing the structured data to a text file.
This approach does not rely on Adobe Acrobat or any other GUI tools, making it ideal for integration into backend services or document processing pipelines. Developers can further extend this approach to retrieve bookmark page numbers, zoom settings, or even modify the bookmark structure.