dump_bookmarks

Extract PDF bookmarks into YAML or JSON

Usage

pdftl <input> dump_bookmarks [json] [no_resolve] [output <output>]

Details

Extracts the document’s bookmarks (its outline, or table of contents) as YAML by default, or as JSON when the json flag is given.

The extraction preserves both the hierarchy of the bookmarks and their individual properties. The resulting file can be edited in any text editor and applied back to the same PDF, or to a different one, using the update_bookmarks operation.

Supported Bookmark Properties

Each bookmark node is represented as a dictionary (key-value mapping). The extractor outputs the following properties:

  • title: The display text of the bookmark.

  • page: The 1-indexed target page number.

  • children: A nested list of sub-bookmarks to maintain outline hierarchy.

  • color: An RGB array for the bookmark text color (e.g., [1.0, 0.0, 0.0] for red).

  • bold / italic: Boolean flags for text styling (true or false).

  • uri: An external web link (used if the bookmark points to a URL rather than a page).

  • dest: A string reference to a Named Destination embedded inside the PDF.

  • view: The precise zoom/viewport array (e.g., ["XYZ", 0, 700, 2.5], ["FitH", 500]).
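
The node layout described above can be sketched in Python as follows. This is an illustration only, not pdftl's own code: the BookmarkNode alias and the walk helper are hypothetical names, the sample data is made up, and the sketch assumes the dumped file has already been loaded into plain lists and dictionaries.

```python
from typing import Any, Iterator

# Hypothetical alias: a bookmark node is a plain dict using the keys listed above.
BookmarkNode = dict[str, Any]


def walk(nodes: list[BookmarkNode], depth: int = 0) -> Iterator[tuple[int, BookmarkNode]]:
    """Yield (depth, node) pairs in outline order, descending into children."""
    for node in nodes:
        yield depth, node
        # A node without a "children" key is treated as a leaf.
        yield from walk(node.get("children", []), depth + 1)


# Sample data for illustration only.
outline = [
    {"title": "Chapter 1", "page": 1, "bold": True,
     "children": [{"title": "Sub-section A", "page": 2, "view": ["FitH", 800]}]},
    {"title": "External Resources", "uri": "https://example.com"},
]

# Render the outline as indented titles, two spaces per nesting level.
lines = ["  " * depth + node["title"] for depth, node in walk(outline)]
print("\n".join(lines))
```

Because children is just a nested list of the same kind of node, a short recursive generator is enough to visit every bookmark in outline order.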

Skipping Destination Resolution

By default, named destinations are resolved into explicit page and view values, which makes the data more immediately useful. To skip this step and output only the original dest names, without deriving their page targets, pass the no_resolve flag.

This may be useful if you want to edit the output from dump_bookmarks and pass it to the update_bookmarks operation. See the update_bookmarks help for rules on dest versus page and view precedence.

Format Example

- title: Chapter 1
  page: 1
  bold: true
  children:
    - title: Sub-section A
      page: 2
      view: ["FitH", 800]
    - title: Important Chart
      page: 3
      color: [1.0, 0.0, 0.0]
- title: External Resources
  uri: https://example.com
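
As one example of the kind of edit this format allows, the sketch below shifts every page number by a fixed offset, which might be useful before applying bookmarks to a different PDF that has extra front matter. It is a sketch under assumptions: the JSON text is hand-written and assumes the json flag produces the same structure as the YAML above, and shift_pages is a hypothetical helper, not part of pdftl.

```python
import json

# Hand-written JSON equivalent of the example above (assumed to mirror the YAML structure).
raw = """
[
  {"title": "Chapter 1", "page": 1, "bold": true,
   "children": [
     {"title": "Sub-section A", "page": 2, "view": ["FitH", 800]},
     {"title": "Important Chart", "page": 3, "color": [1.0, 0.0, 0.0]}
   ]},
  {"title": "External Resources", "uri": "https://example.com"}
]
"""


def shift_pages(nodes, offset):
    """Return a copy of the outline with every 'page' value moved by offset."""
    shifted_nodes = []
    for node in nodes:
        new_node = dict(node)  # shallow copy so the input is left untouched
        if "page" in new_node:
            new_node["page"] += offset
        if "children" in new_node:
            new_node["children"] = shift_pages(new_node["children"], offset)
        shifted_nodes.append(new_node)
    return shifted_nodes


bookmarks = json.loads(raw)
shifted = shift_pages(bookmarks, 10)
print(json.dumps(shifted, indent=2))
```

Nodes that carry only a uri have no page key and pass through unchanged. The edited data would then be written back to a file and handed to update_bookmarks.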

Dependency note

YAML extraction requires the pyyaml library. If it is not installed, install it with pip install pdftl[yaml-bookmarks], or use the json flag, which relies only on Python’s standard JSON library.
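
A script that wraps pdftl could pick the output format based on this dependency. The following is a minimal sketch, not part of pdftl: dump_command is a hypothetical helper, and it assumes the script runs in the same Python environment as pdftl, so that checking for the yaml module here reflects what pdftl will find.

```python
import importlib.util


def dump_command(pdf_path: str, stem: str = "bookmarks") -> list[str]:
    """Build a dump_bookmarks command line, preferring YAML when pyyaml is importable."""
    # pyyaml is imported under the module name "yaml".
    have_yaml = importlib.util.find_spec("yaml") is not None
    if have_yaml:
        return ["pdftl", pdf_path, "dump_bookmarks", "output", f"{stem}.yaml"]
    # Fall back to the json flag, which needs no extra dependency.
    return ["pdftl", pdf_path, "dump_bookmarks", "json", "output", f"{stem}.json"]


cmd = dump_command("in.pdf")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run to execute the command.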

Examples

Print YAML bookmark data to standard output

pdftl in.pdf dump_bookmarks

Dump YAML bookmark data to bookmarks.yaml

pdftl in.pdf dump_bookmarks output bookmarks.yaml

Dump YAML bookmarks, keeping named destinations unresolved

pdftl in.pdf dump_bookmarks no_resolve output bookmarks.yaml

Dump JSON bookmark data to bookmarks.json

pdftl in.pdf dump_bookmarks json output bookmarks.json

Tags: info, metadata, bookmarks

Source: pdftl.operations.dump_bookmarks

Read online: https://pdftl.readthedocs.io/en/latest/operations/dump_bookmarks.html

Type: Operation