# `dump_data`

Metadata, page and bookmark info (XML-escaped or JSON)

## Usage

> pdftl `` `dump_data` `[json]` `[output` `]`

## Details

Extracts document-level metadata and structural information from the input PDF and prints it to the console (or a specified file). This operation is the primary way to export data for inspection or for later use by the `update_info` operation.

By default, all string values in the output are processed with XML-style escaping (e.g., `<` becomes `&lt;`). Alternatively, passing the `json` parameter will produce a structured JSON output, which is often easier for other programs to parse.

### Output Format Details (Stanza Format)

The default output is a plain text, line-based, key-value format. It consists of both simple top-level fields and multi-line "stanzas". A stanza is a block of related data that begins with a line like `InfoBegin` or `BookmarkBegin`. The data from this command is consumed by `update_info`.

#### Top-Level Fields

These fields appear as simple `Key: Value` lines.

* `PdfID0: `
  * The first part of the PDF's unique file identifier.
  * *Updatable by `update_info`.*
* `PdfID1: `
  * The second part of the PDF's unique file identifier.
  * *Not updatable by `update_info`.*
* `NumberOfPages: `
  * The total number of pages in the document.

#### Stanzas

These are multi-line blocks, each describing a single record. These can all be updated by `update_info`.

##### 1. Info Stanza (Document Metadata)

Each metadata entry (e.g., Title, Author) gets its own stanza.

* `InfoBegin`
* `InfoKey: ` Standard keys include `Title`, `Author`, `Subject`, `Keywords`, `Creator`, `Producer`, `CreationDate`, `ModDate`.
* `InfoValue: `

##### 2. Bookmark Stanza

Represents a single bookmark (outline) item.

* `BookmarkBegin`
* `BookmarkTitle: `
* `BookmarkLevel: ` (1 is top level)
* `BookmarkPageNumber: ` The 1-indexed target page number.

##### 3. PageMedia Stanza (Page-level Boxes)

Describes geometry boxes for a page.
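
All stanza types, including the page-level ones, share the same `...Begin` / `Key: Value` layout, so a single line-based reader can handle them. The following is a minimal sketch in Python, not part of pdftl itself: `parse_dump` is a hypothetical helper, and it rests on two assumptions that are inferred from the field names above rather than stated by the format: every key inside a stanza starts with the stanza's name (e.g. `Info`, `Bookmark`), and the XML-style escaping uses standard entities that `html.unescape` can reverse.

```python
import html


def parse_dump(text):
    """Split stanza-format text into top-level fields and a list of stanzas.

    Assumption (not confirmed by the documentation): keys that belong to a
    stanza share its name as a prefix, e.g. InfoKey/InfoValue after InfoBegin.
    """
    top_level = {}
    stanzas = []      # list of (stanza_name, {key: value}) pairs
    prefix = None     # name of the stanza currently being read, if any
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ":" not in line and line.endswith("Begin"):
            # A line such as "InfoBegin" opens a new stanza.
            prefix = line[: -len("Begin")]
            current = {}
            stanzas.append((prefix, current))
            continue
        key, _, value = line.partition(":")
        # Undo the XML-style escaping applied to string values.
        value = html.unescape(value.strip())
        if prefix is not None and key.startswith(prefix):
            current[key] = value
        else:
            # A key without the stanza prefix is treated as a top-level field.
            prefix, current = None, None
            top_level[key] = value
    return top_level, stanzas


# Hand-written sample in the documented layout, not real tool output.
sample = """\
InfoBegin
InfoKey: Title
InfoValue: Tom &amp; Jerry &lt;draft&gt;
NumberOfPages: 3
BookmarkBegin
BookmarkTitle: Chapter 1
BookmarkLevel: 1
BookmarkPageNumber: 1
"""

fields, stanzas = parse_dump(sample)
print(fields)
for kind, record in stanzas:
    print(kind, record)
```

When a program is the consumer, the `json` parameter is likely the simpler route; a reader like this is mainly useful for inspecting or editing the text form before passing it to `update_info`.
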
Coordinates are in PDF points, space-separated (e.g., `0 0 595 842`).

* `PageMediaBegin`
* `PageMediaNumber: ` The 1-indexed page number.
* `PageMediaRotation: <0|90|180|270>`
* `PageMediaRect: ` (MediaBox) Always present.
* `PageMediaDimensions: `
* `PageMediaCropRect: ` Omitted if identical to `PageMediaRect`.
* `PageMediaBleedRect: ` Omitted if identical to `PageMediaCropRect` (or `PageMediaRect` if no crop).
* `PageMediaTrimRect: ` Omitted if identical to `PageMediaCropRect` (or `PageMediaRect` if no crop).

##### 4. PageLabel Stanza (Logical Page Numbers)

Defines a page numbering style range.

* `PageLabelBegin`
* `PageLabelNewIndex: ` The 1-indexed physical starting page for this numbering.
* `PageLabelStart: ` The starting number for this labelling (e.g., 1).
* `PageLabelPrefix: ` String to prepend to page label (e.g., `A-`).
* `PageLabelNumStyle: