Python API Reference

This reference documents the dynamic Python API exposed by pdftl. All operations return a pdftl.core.core_types.OpResult object.

Note

These functions are generated dynamically at runtime via pdftl.api.
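Every function listed below shares the same keyword-only signature, so a call can be assembled from a plain dictionary of documented parameter names. The following is a minimal sketch rather than an official usage pattern: cat_kwargs and merge_pdfs are hypothetical helper names, the file names are placeholders, and this reference does not state whether cat needs operation_args in order to concatenate every input.

```python
from typing import Any


def cat_kwargs(paths: list[str], out_path: str) -> dict[str, Any]:
    """Build keyword arguments using only parameter names from the signature.

    Hypothetical helper: it is not part of pdftl.
    """
    return {"inputs": list(paths), "output": out_path}


def merge_pdfs(paths: list[str], out_path: str):
    """Call pdftl.api.cat with the assembled keyword arguments.

    Assumption: passing only inputs and output is enough for a plain
    concatenation. pdftl is imported lazily so that this sketch can be
    loaded on a machine where pdftl is not installed.
    """
    import pdftl.api

    return pdftl.api.cat(**cat_kwargs(paths, out_path))


# Placeholder file names, shown only to illustrate the argument shape.
print(cat_kwargs(["a.pdf", "b.pdf"], "merged.pdf"))
```

Because all parameters are keyword-only, positional calls such as cat(["a.pdf"]) would be rejected by Python itself.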

pdftl.api.add_bookmarks(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Add one or more top-level bookmarks to a PDF outline.

pdftl.api.add_images(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → 'OpResult'

Applies all parsed add_images rules to a PDF in-place.

This function coordinates the rule parser and the ImageStamper engine to apply overlays/underlays to the input PDF.

pdftl.api.add_text(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult

Applies all parsed add_text rules to a PDF in-place.

This function coordinates the parser and the TextDrawer to apply text overlays to the input PDF.

pdftl.api.attach_files(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Attach files to a PDF document, according to the specified options

pdftl.api.background(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Apply overlay or underlay with optional OCG layering and page-range filtering.

pdftl.api.barcode(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Main entry point for the barcode operation.

pdftl.api.booklet(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → 'OpResult': Imposes pages into a booklet sequence.

pdftl.api.burst(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult

Split one or more PDFs into multiple files, single-page files by default.

Args:

opened_pdfs (list): A list of opened PDF files to burst

operation_args (list): User-supplied arguments

output_pattern (str): A C-style format string for the output filenames, e.g., “page_%03d.pdf”.

Returns: The first input PDF (for pipeline chainability).

Note: The actual bursting is performed as a side effect of the hook.
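To illustrate how a C-style pattern such as the one above expands into filenames, here is a small sketch. Python's % operator is used only as a stand-in; how pdftl expands the pattern internally is not described in this reference and may differ. The helper name expand is hypothetical.

```python
pattern = "page_%03d.pdf"


def expand(pattern: str, page_number: int) -> str:
    """Return the filename the pattern yields for one 1-based page number.

    Uses Python's printf-style formatting as a stand-in for pdftl's own logic.
    """
    return pattern % page_number


filenames = [expand(pattern, n) for n in (1, 2, 10)]
print(filenames)  # ['page_001.pdf', 'page_002.pdf', 'page_010.pdf']
```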

pdftl.api.cat(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Concatenates pages from input PDFs into a new PDF, then rebuilds all links and named destinations, including transforming link target coordinates.

pdftl.api.chop(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult

Chops specified pages of a PDF into multiple smaller pages.

BUG FIXME: currently does strange things with page rotation (pages out of order?)

pdftl.api.clip(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Crop or clip pages in a PDF using specs like ‘1-3(10pt,5%)’.

pdftl.api.create(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Create a new PDF from scratch with blank pages.

pdftl.api.crop(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Crop or clip pages in a PDF using specs like ‘1-3(10pt,5%)’.

pdftl.api.delete(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Deletes pages from input PDF, and otherwise leaves the PDF structure essentially unchanged.

pdftl.api.delete_actions(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Delete actions from a PDF, optionally filtered by selector specs.

pdftl.api.delete_annots(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Delete annotations from a PDF, optionally filtered by selector specs. Without selectors, deletes all annotations from all pages.

pdftl.api.delete_attachments(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Deletes attachments matching criteria from the document NameTree and scrubs corresponding page annotations.

pdftl.api.delete_blank(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult

Delete pages that are blank or visually uniform.

Each spec is evaluated independently. A page is deleted if any spec marks it as blank. Deletion happens in reverse page order so indices remain valid throughout.
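The reverse-order deletion described above can be sketched with an ordinary list. This is an illustration of the technique only, not pdftl's code: the page list, the two sets of flagged indices, and the helper delete_pages are all hypothetical.

```python
def delete_pages(pages: list, blank_indices: set) -> list:
    """Delete the given 0-based indices in place, highest index first.

    Deleting from the end means earlier indices never shift under us.
    """
    for index in sorted(blank_indices, reverse=True):
        del pages[index]
    return pages


# Two hypothetical specs each flag some pages; a page is deleted if any spec flags it.
flagged_by_spec_a = {1, 4}
flagged_by_spec_b = {4, 2}

pages = ["p1", "p2", "p3", "p4", "p5"]
remaining = delete_pages(pages, flagged_by_spec_a | flagged_by_spec_b)
print(remaining)  # ['p1', 'p4']
```

Deleting in ascending order instead would remove the wrong pages, because each deletion shifts every later index down by one.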

pdftl.api.delete_bookmarks(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Delete bookmarks from a PDF outline, optionally filtered by page spec.

pdftl.api.delete_images(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Overwrites images matching criteria with 1x1 transparent stubs. Defaults to global search if no page range is provided.

pdftl.api.deskew(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: CLI Adapter for deskew: Parses arguments, detects text skew, and rotates pages.

pdftl.api.diff_text(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → 'OpResult'

Performs a highly granular, spatially-aware text comparison between two PDFs. Outputs a JSON array of structural changes, including precise bounding box coordinates for where the changes occurred on the page.

Options

`granularity=<level>`: Controls how the diff engine groups text before comparing. Options: char, word, line, paragraph. Using word prevents sub-word shredding on typos. (Default: word)
`ignore_whitespace=<bool>`: If true, drops changes where the only difference is invisible space (e.g., reflow line-breaks). (Default: true)
`ignore_soft_hyphens=<bool>`: If true, strips \ufffe soft hyphens before comparing. Useful for ignoring hyphenation differences caused by text reflowing across margins. (Default: false)
`include_bboxes=<bool>`: If true, includes spatial bounding box coordinates for every change. Turn this off for a cleaner, text-only JSON output. (Default: true)
`margin_top=<float>`, `margin_bottom=<float>`, `margin_left=<float>`, `margin_right=<float>`: Filters out changes that fall entirely within these margins (in points). Useful for removing noisy page headers, footers, or marginalia. (Default: 0)
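The margin options drop only changes that lie entirely inside a margin band. A rough sketch of that test follows, under several assumptions that this reference does not confirm: bounding boxes are [x0, y0, x1, y1] in points, the origin is at the bottom-left of the page, and the page is 612 x 792 points. The function falls_in_margin and the sample changes are hypothetical.

```python
def falls_in_margin(bbox, page_width, page_height,
                    top=0.0, bottom=0.0, left=0.0, right=0.0) -> bool:
    """Return True if bbox lies entirely inside one of the margin bands.

    Assumes a bottom-left origin; a margin of 0 disables that band.
    """
    x0, y0, x1, y1 = bbox
    return bool(
        (top > 0 and y0 >= page_height - top)
        or (bottom > 0 and y1 <= bottom)
        or (left > 0 and x1 <= left)
        or (right > 0 and x0 >= page_width - right)
    )


# Hypothetical changes on a 612 x 792 pt page, filtered with margin_top=50.
changes = [
    {"text": "Page 3", "bbox": [72, 760, 140, 775]},          # running header
    {"text": "revised clause", "bbox": [72, 400, 300, 415]},  # body text
]
kept = [c for c in changes if not falls_in_margin(c["bbox"], 612, 792, top=50)]
print([c["text"] for c in kept])  # ['revised clause']
```

A change that straddles the margin boundary is kept, since it does not fall entirely within the margin.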

pdftl.api.dump_actions(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Dumps all actions from a PDF in JSON format, optionally filtered.

pdftl.api.dump_annots(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Dumps all annotations from a PDF in JSON format, with compact arrays. Optionally filtered by selector specs using the same syntax as modify_annots.

pdftl.api.dump_bookmarks(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult

Extracts the document’s Table of Contents (bookmarks) into a highly readable and editable YAML format (or JSON).

This faithful extraction preserves both the hierarchical structure and the complex properties of your bookmarks. The resulting file can be easily edited in any text editor and applied back to the same (or a different) PDF using the update_bookmarks operation.

Supported Bookmark Properties

Each bookmark node is represented as a dictionary (key-value mapping). The extractor natively supports and outputs the following properties:

`title`: The display text of the bookmark.
`page`: The 1-indexed target page number.
`children`: A nested list of sub-bookmarks to maintain outline hierarchy.
`color`: An RGB array for the bookmark text color (e.g., [1.0, 0.0, 0.0] for red).
`bold` / `italic`: Boolean flags for text styling (true or false).
`uri`: An external web link (used if the bookmark points to a URL rather than a page).
`dest`: A string reference to a Named Destination embedded inside the PDF.
`view`: The precise zoom/viewport array (e.g., [“XYZ”, 0, 700, 2.5], [“FitH”, 500]).

Skipping Destination Resolution

By default, named destinations are automatically resolved into exact page and view parameters to make the data more immediately useful. If you want to skip this step and only output the original dest names without deriving their page targets, pass the no_resolve flag.

This may be useful if you want to edit the output from dump_bookmarks and pass it to the update_bookmarks operation. See the update_bookmarks help for rules on dest versus page and view precedence.

Format Example

.. code-block:: yaml

    - title: Chapter 1
      page: 1
      bold: true
      children:
        - title: Sub-section A
          page: 2
          view: ["FitH", 800]
        - title: Important Chart
          page: 3
          color: [1.0, 0.0, 0.0]
    - title: External Resources
      uri: https://example.com
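Once loaded into Python, the bookmark data is a nested list of dictionaries, so it can be walked recursively. The sketch below hard-codes the data from the format example rather than reading a file. It assumes that both “Sub-section A” and “Important Chart” are children of “Chapter 1”; the helper flatten is hypothetical and not part of pdftl.

```python
bookmarks = [
    {
        "title": "Chapter 1",
        "page": 1,
        "bold": True,
        "children": [
            {"title": "Sub-section A", "page": 2, "view": ["FitH", 800]},
            {"title": "Important Chart", "page": 3, "color": [1.0, 0.0, 0.0]},
        ],
    },
    {"title": "External Resources", "uri": "https://example.com"},
]


def flatten(nodes, depth=0):
    """Yield (depth, title, page) for every bookmark, parents before children."""
    for node in nodes:
        # URI bookmarks have no page, so .get() yields None for them.
        yield depth, node["title"], node.get("page")
        yield from flatten(node.get("children", []), depth + 1)


for depth, title, page in flatten(bookmarks):
    print("  " * depth + title, page)
```

The same traversal works on data loaded with json.load or a YAML parser, since both produce plain lists and dictionaries.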

Dependency note

YAML extraction requires the pyyaml library. If it is not installed, you can install it via pip install pdftl[yaml-bookmarks], or simply use the json flag to extract using Python’s standard JSON library instead.

pdftl.api.dump_colorspaces(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Report color spaces used throughout a PDF document.

pdftl.api.dump_data(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Imitate pdftk’s dump_data output, writing to a file or stdout.

pdftl.api.dump_data_annots(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Dumps annotation data from a PDF in pdftk style

pdftl.api.dump_data_fields(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Extracts form field data from the PDF using a Hybrid Tree Walk.

pdftl.api.dump_data_fields_utf8(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Extracts form field data from the PDF using a Hybrid Tree Walk.

pdftl.api.dump_data_utf8(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Imitate pdftk’s dump_data output, writing to a file or stdout.

pdftl.api.dump_dests(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Traverses the /Dests name tree of a PDF using pikepdf.NameTree. This provides a robust, iterable interface to the destinations.

pdftl.api.dump_encryption(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Dumps comprehensive encryption and permission data from a PDF.

pdftl.api.dump_files(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: List files attached to the PDF. Returns a list of dicts with all available attachment metadata.

pdftl.api.dump_fonts(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Dump detailed font metadata records from a PDF file.

pdftl.api.dump_images(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Dump embedded image metadata of a PDF file.

pdftl.api.dump_layers(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Extract OCG (Layer) data and write as JSON.

pdftl.api.dump_signatures(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Validates PDF signatures and returns a list of validation results.

pdftl.api.dump_streams(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → 'OpResult': Dump page content streams in the form that replace operates on them.

pdftl.api.dump_tables(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Extract tables from a PDF file.

pdftl.api.dump_tags(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → 'OpResult': Inspect the PDF structure tree in one of three modes: reading_order (default), tree, or issues.

pdftl.api.dump_text(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Dump text content of a PDF file.

pdftl.api.embed_fonts(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → 'OpResult': Locate equivalents for unembedded fonts and inject their binary streams back into the PDF structures.

pdftl.api.export_fonts(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → 'OpResult': Export fonts with unified sidecar mappings and metrics.

pdftl.api.export_images(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Export embedded images to a directory with a JSON manifest.

pdftl.api.fill_form(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Fill in a form, treating the first argument as a filename for data.

pdftl.api.filter(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Return the given PDF.

pdftl.api.generate_fdf(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Output FDF data for the given PDF

pdftl.api.grep(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → 'OpResult'

The grep operation searches the text content of a PDF for a specified regular expression or literal string. It outputs a structured JSON array detailing matches, page locations, context snippets, and precise coordinate bounding boxes.

Arguments:

<pattern>: The regular expression or text string to search for.
[pages…]: Optional page ranges (e.g., 1-5, 9-end) to restrict the search. If omitted, the entire document is searched.

Search Options:

regex=<b>: If true, treats the pattern as a Python-compatible regular expression. If false, matches the pattern as a plain literal string. (Default: true)
ignore_case=<b> or i=<b>: If true, performs a case-insensitive search. (Default: false)
multiline=<b> or m=<b>: If true, ^ and $ match the start and end of lines. (Default: true)
dotall=<b> or s=<b>: If true, the . special character matches any character, including newlines. (Default: false)
max_count=<N>: Stop searching and parsing after locating <N> total matches.
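The search options read much like the flags of Python's re module. The sketch below shows one plausible mapping, using the defaults listed above. It is an assumption for illustration, not grep's actual implementation; compile_pattern is a hypothetical helper.

```python
import re


def compile_pattern(pattern, regex=True, ignore_case=False,
                    multiline=True, dotall=False):
    """Compile a pattern the way the documented options suggest.

    Assumed mapping: ignore_case -> re.IGNORECASE, multiline -> re.MULTILINE,
    dotall -> re.DOTALL, and regex=False -> a literal match via re.escape.
    """
    flags = 0
    if ignore_case:
        flags |= re.IGNORECASE
    if multiline:
        flags |= re.MULTILINE
    if dotall:
        flags |= re.DOTALL
    source = pattern if regex else re.escape(pattern)
    return re.compile(source, flags)


text = "Total: 10\ntotal: 20"
matches = [m.group() for m in compile_pattern(r"^total", ignore_case=True).finditer(text)]
print(matches)  # ['Total', 'total']
```

With regex=False, a pattern such as a.b matches only the literal text a.b, because the dot is escaped.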

Context Options:

context=<N>: Number of surrounding lines of text to include before and after each match. (Default: 0)
before_context=<N>: Number of surrounding lines to include strictly before the match.
after_context=<N>: Number of surrounding lines to include strictly after the match.

Typographic Filtering:

You can restrict matches to text that meets specific visual criteria.

min_size=<F>, max_size=<F>: Only match text within a given point-size range.
font_match=<S>: Only match if the font name contains this substring (e.g., “Bold”).
require_bold=<b>: Only match if the text is explicitly bold.
require_italic=<b>: Only match if the text is explicitly italicized.
fonts=<b>: Always extract and output font metadata for matches. Automatically enabled if any typographic filters are used. (Default: false)

Output Format:

The results are written as a JSON object containing global metadata, a match metrics summary block (count), and a list of hits. Each hit contains:

page, line: 1-indexed page and line numbers where the match begins.
text: The exact string matching the main query.
bboxes: Coordinate bounding boxes [x0, y0, x1, y1] grouped per line.
context_match: The full string of the line(s) containing the match.
match_start_idx, match_end_idx: 0-indexed character offsets marking where the match resides within context_match.
context_before, context_after: Arrays of surrounding context lines (if requested).
captures: If the regex utilizes capture groups (e.g., Invoice:\s*(\d+)), this array automatically populates with the group number, exact text, and precise bboxes for every distinct captured sub-pattern.
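As a sketch of how a consumer might use the offsets in a hit, the example below builds a hypothetical hit with Python's re module and marks the match inside context_match. The hit contains only a subset of the documented keys, its values are invented for illustration, and treating match_end_idx as exclusive (Python slice style) is an assumption.

```python
import re

line = "Ref 7. Invoice: 4821 due on receipt"
match = re.search(r"Invoice:\s*(\d+)", line)

# Hypothetical hit using a subset of the documented keys.
hit = {
    "page": 1,
    "line": 12,
    "text": match.group(0),
    "context_match": line,
    "match_start_idx": match.start(),
    "match_end_idx": match.end(),  # assumed exclusive, like a Python slice
}


def mark_match(hit: dict) -> str:
    """Wrap the matched span of context_match in square brackets."""
    start, end = hit["match_start_idx"], hit["match_end_idx"]
    context = hit["context_match"]
    return context[:start] + "[" + context[start:end] + "]" + context[end:]


print(mark_match(hit))   # Ref 7. [Invoice: 4821] due on receipt
print(match.group(1))    # 4821 (the single capture group)
```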

pdftl.api.highlight(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Applies all parsed highlight rules to a PDF in-place.

pdftl.api.import_fonts(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → 'OpResult'

The import_fonts operation reads a JSON manifest (generated by export_fonts) and overwrites the corresponding internal PDF font structures with the edited assets from a directory.

It automatically handles:

Binary Font stream injection if MD5 hashes differ (safe skip on unmodified files).
Character metrics updates based on the ‘width_sync_mode’ parameter.
Re-compilation of sidecar ToUnicode JSON maps back to compliant PostScript CMaps.
/FontDescriptor property edits from the sidecar’s descriptor block.
/CIDToGIDMap restoration, from either “Identity” or an explicit .cid2gid.json sidecar.
Type 3 glyph procedure reconstruction (including inline images) from a .charprocs file.

If both .json and .ps files exist for a given font’s /ToUnicode map, an error is raised to prevent ambiguity. The user must delete or rename one of them first.

encoding_cmap edits are limited: the only supported change is switching a Type0 font’s /Encoding between /Identity-H and /Identity-V. Any other value written into encoding_cmap is rejected with a warning, and /Encoding is left untouched — there is no general mechanism here for re-encoding into an arbitrary CMap.

Width Sync Modes

Each font’s unified JSON sidecar (font_{obj_id}_{gen_id}_{name}.json) has a top-level width_sync_mode field, defaulting to auto on export. This controls how import_fonts reconciles the PDF’s /Widths (or /W for CID fonts) with any edits:

auto (default): Reads the true metrics out of the edited font binary (if its MD5 changed since export) and writes those into the PDF. If the binary is unchanged or unreadable, falls back to manual behavior using any width.pdf values present in the sidecar mappings.

manual: Writes the width.pdf values from the sidecar mappings directly into the PDF, ignoring the font binary.

patch_font_metrics: Dynamically patches the font binary’s horizontal metrics table in-memory to match the sidecar’s width.pdf values, and writes the patched stream directly into the PDF (leaving workspace files untouched).

squash_font_vectors: Dynamically rescales the font binary’s glyph outlines in-memory to visually fit the sidecar’s width.pdf values, and writes the squashed stream directly into the PDF (leaving workspace files untouched).

preserve: Leaves the PDF’s existing /Widths untouched entirely.

patch_font_metrics and squash_font_vectors require the embedded font file to be present in the directory; without it, no width sync occurs for that font.

For both of these modes, the in-memory binary edit is only ever best-effort: if it can’t be applied (an unresolvable CID-to-GID mapping, no character codes in the font actually matched, or a malformed/unreadable font program), import_fonts still writes the sidecar’s width.pdf values into /Widths or /W directly, so a requested width edit is never silently dropped even when the font program itself couldn’t be patched.
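The rules for the auto, manual, and preserve modes can be paraphrased as a small decision function. This is a sketch of the behavior described above and not import_fonts itself: width_source is a hypothetical name, an absent font file stands in for “unreadable”, and the patch_font_metrics and squash_font_vectors modes are deliberately left out.

```python
import hashlib
from typing import Optional


def width_source(mode: str, exported_md5: str, font_bytes: Optional[bytes]) -> str:
    """Say where widths would come from: 'font_binary', 'sidecar', or 'existing'.

    font_bytes is the edited font file's content, or None if it is missing.
    """
    if mode == "preserve":
        return "existing"  # leave /Widths untouched
    if mode == "manual":
        return "sidecar"   # use the width.pdf values as written
    if mode == "auto":
        changed = (
            font_bytes is not None
            and hashlib.md5(font_bytes).hexdigest() != exported_md5
        )
        # An unchanged or missing binary falls back to manual behavior.
        return "font_binary" if changed else "sidecar"
    raise ValueError(f"mode not covered by this sketch: {mode}")


original = b"original font program"
exported_md5 = hashlib.md5(original).hexdigest()
print(width_source("auto", exported_md5, original))             # sidecar
print(width_source("auto", exported_md5, b"edited font data"))  # font_binary
```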

Format-Specific Limitations

squash_font_vectors relies on direct affine coordinate scaling, which is only supported for TrueType outlines (glyf tables). For OpenType-CFF (CFF/CFF2) and classic Adobe Type 1 fonts, which have no TrueType glyf table, the operation silently degrades to patch_font_metrics mode: the font binary’s internal character metrics are updated in-memory to prevent layout mismatches, but the glyph outlines remain unscaled.

pdftl.api.import_images(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Import edited images from a directory and overwrite internal PDF streams.

pdftl.api.import_streams(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Import and apply streams into the PDF.

pdftl.api.inject(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Injects code at the start and/or end of page content streams.

pdftl.api.insert(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Insert blank pages into the PDF.

pdftl.api.modify_annots(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Modifies properties of existing annotations in a PDF.

pdftl.api.modify_images(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → 'OpResult'

Applies parallelized bitmap image pipeline transformations to target pages.

Raises InvalidArgumentError on unknown image modifiers.

pdftl.api.modify_layers(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult

The modify_layers operation allows you to alter Optional Content Groups (layers) in a PDF. You can permanently merge/strip them, or change their default visibility and behavior.

The command reads a sequence of action-target pairs. If no target is provided after an action, it defaults to affecting “all” layers.

Available Actions

Structural (Permanent):

merge: The visual content of the layer is permanently baked into the page. The layer is removed from the PDF’s layer menu.
strip: The visual content of the layer is completely deleted from the document, and the layer is removed from the PDF’s layer menu.

State & Behavior (Non-destructive):

show / hide: Sets the default visibility when the document is opened.
lock / unlock: Prevents/allows the user from toggling the layer in viewers.
print / noprint: Overrides behavior to force layer visibility ON or OFF when printing.
screen / noscreen: Overrides behavior to force layer visibility ON or OFF on digital displays.

Utility:

keep: Used primarily to exclude a specific layer when targeting “all” others.

Note on State vs. Usage: Layer visibility on screen (`show`/`hide`) is independent of its visibility when printing. Modifying a usage state (like `noprint`) without changing its base state will leave its on-screen visibility unchanged.

Target Specifications

name=<string>: Sloppy match. Applies the action to all layers with this name.
id=<integer>: Strict match. Applies the action to the exact underlying PDF object.
all: Explicitly targets all layers.
<string>: If no key= prefix is provided, it defaults to a name= match.
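
The pairing and defaulting rules above can be sketched as a small parser. This is an illustration only, not pdftl's own implementation: parse_target, parse_layer_args, and the ACTIONS set are hypothetical names, and treating any token that is not an action keyword as a target is an assumption made for this sketch.

```python
# Hypothetical sketch of the action-target pairing rules described above.
ACTIONS = {
    "merge", "strip", "show", "hide", "lock", "unlock",
    "print", "noprint", "screen", "noscreen", "keep",
}


def parse_target(token):
    """Turn one target token into a (kind, value) tuple."""
    if token == "all":
        return ("all", None)
    if token.startswith("id="):
        return ("id", int(token[3:]))
    if token.startswith("name="):
        return ("name", token[5:])
    # No key= prefix: treated as a name= match.
    return ("name", token)


def parse_layer_args(tokens):
    """Group tokens into (action, target) pairs; a missing target means 'all'."""
    pairs = []
    i = 0
    while i < len(tokens):
        action = tokens[i]
        if action not in ACTIONS:
            raise ValueError(f"expected an action, got {action!r}")
        i += 1
        # Assumption: a following token that is itself an action keyword
        # starts the next pair rather than naming a layer.
        if i < len(tokens) and tokens[i] not in ACTIONS:
            target = parse_target(tokens[i])
            i += 1
        else:
            target = ("all", None)
        pairs.append((action, target))
    return pairs


print(parse_layer_args(["hide", "name=Draft", "lock", "id=12", "merge"]))
```

Under this assumption, a layer whose name collides with an action keyword (for example a layer called print) would need the explicit name= prefix.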

pdftl.api.montage(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Imposes pages onto new canvas pages based on a grid or layout strategy.

pdftl.api.move(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: CLI Adapter for move: Parses string arguments into a spec, then runs logic.

pdftl.api.multibackground(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Apply overlay or underlay with optional OCG layering and page-range filtering.

pdftl.api.multistamp(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Apply overlay or underlay with optional OCG layering and page-range filtering.

pdftl.api.mutate_content(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult

The mutate_content operation allows you to run custom Python logic against the raw instruction streams of a PDF. You specify a Python script and, optionally, the function in that script to run. If no function is specified, the function named mutate is used.

The Context Dictionary

Your function receives a context dictionary with the following keys:

pdf: The current pikepdf.Pdf object.
page_num: The current page number (1-indexed) or None for XObjects.
is_xobject: Boolean, True if mutating a Form XObject.
object: The pikepdf.Object (either a page or an XObject) containing the content stream.
args: A list of extra arguments passed via CLI after the script name.

Passing Arguments

You can pass custom parameters to your script:

pdftl in.pdf mutate_content script.py 1.5 2.0 output out.pdf

In this case, context['args'] would be ['1.5', '2.0'].
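
A minimal sketch of such a script follows. It reads only the context keys documented above. The helper describe, the file name, and the stand-in context are hypothetical, and what pdftl expects the function to return or modify is not covered here, so this mutate only reports what it was given.

```python
# mutate_example.py (hypothetical script name)

def describe(context):
    """Summarize the documented context keys as a string."""
    # Extra CLI arguments arrive as strings, so convert them explicitly.
    factors = [float(arg) for arg in context["args"]]
    if context["is_xobject"]:
        where = "Form XObject"  # page_num is None for XObjects
    else:
        where = f"page {context['page_num']}"
    return f"{where}: args={factors}"


def mutate(context):
    """Default entry point name; here it only reports what it was given."""
    print(describe(context))


# Stand-in context so the sketch can be tried without pdftl or pikepdf.
fake_context = {
    "pdf": None,
    "page_num": 1,
    "is_xobject": False,
    "object": None,
    "args": ["1.5", "2.0"],
}
mutate(fake_context)
```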

pdftl.api.normalize(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Normalize page content streams.

pdftl.api.optimize_images(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Optimize images in the given PDF.

pdftl.api.place(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult

Applies geometric transformations (direct similarities) to the content of selected pages.

`<spec>` syntax:

[<pages>](<operation>…)

Operations:

shift=dx, dy: Moves content by the specified x and y distances. Supports units (pt, in, cm, mm) and percentages relative to page size. Example: shift=1in, 50%
scale=factor[:anchor]: Scales content by a multiplier (e.g., 0.5 for half size). Optional anchor determines the fixed point (default: center).
spin=angle[:anchor]: Rotates content by degrees clockwise. Optional anchor determines the pivot point (default: center).

More than one operation can be given. They should be separated by semicolons, ‘;’. Operations are applied in the order they appear, from left to right.

Anchors:

Anchors define the center of scaling or rotation.

Named: center (default), top-left, top, top-right, left, right, bottom-left, bottom, bottom-right.
Coordinate: x,y (e.g., 0,0 for bottom-left corner).
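
The geometry behind these operations can be sketched on a single point. This is not pdftl code: the function names are hypothetical, distances are taken to be in points already, and the y-axis is assumed to point up as in default PDF user space, so a clockwise spin corresponds to a negative mathematical angle.

```python
import math


def shift(point, dx, dy):
    """Translate a point by (dx, dy)."""
    x, y = point
    return (x + dx, y + dy)


def scale(point, factor, anchor):
    """Scale a point about a fixed anchor."""
    x, y = point
    ax, ay = anchor
    return (ax + factor * (x - ax), ay + factor * (y - ay))


def spin(point, degrees, anchor):
    """Rotate a point clockwise by `degrees` about an anchor (y-axis up)."""
    x, y = point
    ax, ay = anchor
    t = math.radians(degrees)
    rx, ry = x - ax, y - ay
    return (ax + rx * math.cos(t) + ry * math.sin(t),
            ay - rx * math.sin(t) + ry * math.cos(t))


# A 200 x 100 pt page; its center is the default anchor.
center = (100.0, 50.0)

# Operations are applied left to right, as in "shift=...;scale=...;spin=...".
steps = [
    lambda p: shift(p, 10, 0),
    lambda p: scale(p, 0.5, center),
    lambda p: spin(p, 90, center),
]

corner = (0.0, 0.0)  # bottom-left corner of the page
for step in steps:
    corner = step(corner)
print(corner)
```

Because each step is a translation, uniform scaling, or rotation, the composition is again a direct similarity, and changing the order of the steps generally changes the result.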

pdftl.api.recolor_images(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → 'OpResult': Finds and grayscales color images on targeted pages using parallel tasks.

pdftl.api.recolor_vectors(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Orchestrator entry point registered into pdftl’s core layout pipeline.

pdftl.api.render(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult

The render operation converts PDF pages into raster images or a single PDF. It respects page rotation, cropping, and current pipeline modifications.

You can specify a page range using standard page specifications (e.g., 1-5, even). If no pages are specified, all pages are rendered.

The dpi=<val> argument sets the raster image resolution, in dots per inch (default: 150). It must be a positive number.
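
As a rough guide to what a given dpi means in pixels, the sketch below converts a page size in points (1/72 inch in default PDF user space) to raster dimensions. The function name raster_size is hypothetical, and the rounding shown is an assumption; pdftl's renderer may round differently and also accounts for rotation and cropping.

```python
POINTS_PER_INCH = 72  # default PDF user-space units per inch


def raster_size(width_pt, height_pt, dpi=150):
    """Approximate pixel dimensions of a page rendered at `dpi`."""
    if dpi <= 0:
        raise ValueError("dpi must be a positive number")
    factor = dpi / POINTS_PER_INCH
    return (round(width_pt * factor), round(height_pt * factor))


# US Letter (612 x 792 pt) at the default 150 dpi, then at 72 dpi.
print(raster_size(612, 792))
print(raster_size(612, 792, dpi=72))
```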

The png_compression=<level> argument sets the PNG compression, for PNG output. It must be an integer between 1 and 9, where 9 is the highest compression level, and the slowest. The default level is 9.

The default <template> is page_%d.png. The parameter %d is replaced with the output page counter value, starting at 1. Standard formatting directives like %03d are supported.
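
The effect of the %d and %03d directives can be previewed with Python's own printf-style formatting. This assumes pdftl expands the template the same way for these simple directives; it is not a call into pdftl.

```python
# Preview how a template would expand for the first few output pages.
default_names = ["page_%d.png" % n for n in range(1, 4)]
padded_names = ["page_%03d.png" % n for n in (1, 12)]

print(default_names)
print(padded_names)
```

Zero-padded names such as page_001.png sort correctly in file listings, which is the usual reason to prefer %03d for longer documents.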

Over the API: the server never honors client-supplied filesystem paths, so output <template> has no effect there. Use format=<png|jpg|pdf> instead to select the response shape: png/jpg return a zip of one image per page, pdf returns a single combined PDF. format= is accepted (and ignored by output-based CLI rendering) on the CLI too, for consistency.

Single PDF Output: If the output template ends with .pdf and contains no % directive (e.g., output out.pdf), all rendered pages will be combined into a single PDF file. Note: This keeps all page images in memory until saved.

Image Output: If rendering to images, the output format is guessed from the <template> extension (e.g., .png, .jpg). If no extension is given, PNG is used.
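
The two output rules above can be summarized as a small decision function. This is a sketch of the documented behavior, not pdftl's code: classify_template is a hypothetical name, and a template that ends in .pdf but does contain a % directive is reported as unspecified because the text above does not cover that case.

```python
import os


def classify_template(template):
    """Return (mode, format) for an output template, per the rules above."""
    _, ext = os.path.splitext(template)
    ext = ext.lower()
    if ext == ".pdf":
        if "%" not in template:
            return ("single-pdf", "pdf")
        # Not described in the documentation above.
        return ("unspecified", "pdf")
    if not ext:
        return ("images", "png")  # no extension: PNG is used
    return ("images", ext.lstrip("."))


for name in ("out.pdf", "page_%d.jpg", "page_%03d"):
    print(name, classify_template(name))
```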

pdftl.api.replace(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Replace in page content streams.

pdftl.api.resample_images(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → 'OpResult': Resample images exceeding the dpi threshold using a ThreadPoolExecutor.

pdftl.api.rotate(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Applies rotations and/or scaling to specified pages of a PDF.

pdftl.api.server(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Bootstrap and start the stateless HTTP daemon server.

pdftl.api.set(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Orchestrates setting various document-level metadata and preferences.

pdftl.api.shuffle(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Shuffles (interleaves) pages from multiple PDFs, applying transformations like rotation and scaling.

pdftl.api.simplify_vectors(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → 'OpResult': Entry point registered with the pdftl operation registry.

pdftl.api.stamp(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Apply overlay or underlay with optional OCG layering and page-range filtering.

pdftl.api.stamp_fields(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Entry point for the stamp_fields operation.

pdftl.api.style_text(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Apply text styles in page content streams.

pdftl.api.tag(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Auto-tag a PDF using OpenDataLoader and return the tagged pikepdf.Pdf document.

pdftl.api.unpack_files(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult: Unpacks attachments from a single PDF file. Returns a generator yielding (filename, bytes).

pdftl.api.unpause(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult

The unpause operation removes intermediate animation frames from PDF slide decks, such as those produced by LaTeX Beamer with pause or uncover directives.

The algorithm renders each page at low resolution and checks whether all ink pixels from the previous page are still present on the current page. If they are, the previous page is considered an intermediate animation frame and is discarded. Only pages where ink disappears or moves are kept, plus the final page.

The dpi=<val> argument controls render resolution for comparison (default: 72). Higher values are slower but more accurate for fine detail.

The ink=<val> argument is the pixel darkness threshold (0-255) below which a pixel is considered ink (default: auto). In auto mode, Otsu’s method is used per page.

The survival=<val> argument is the minimum fraction (0.0-1.0) of ink pixels from the previous page that must survive on the current page for it to be considered a continuation (default: 0.98). Genuine Beamer transitions produce survival=1.00; new slides typically produce <0.20.
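
The comparison described above can be sketched with plain Python sets standing in for rendered pages. All names here are hypothetical, a fixed ink threshold replaces the auto (Otsu) mode, and treating a page with no ink as fully surviving is an assumption of this sketch, not documented behavior.

```python
def ink_pixels(gray_rows, threshold=128):
    """Coordinates of pixels darker than `threshold` (0 = black, 255 = white)."""
    return {
        (x, y)
        for y, row in enumerate(gray_rows)
        for x, value in enumerate(row)
        if value < threshold
    }


def survival(prev_ink, cur_ink):
    """Fraction of the previous page's ink pixels still inked on the current page."""
    if not prev_ink:
        return 1.0  # assumption: nothing to lose, so treat as a continuation
    return len(prev_ink & cur_ink) / len(prev_ink)


def pages_to_keep(pages_ink, threshold=0.98):
    """0-based indices of pages kept; the final page is always kept."""
    keep = []
    for i in range(len(pages_ink) - 1):
        # A page is dropped when the next page continues it.
        if survival(pages_ink[i], pages_ink[i + 1]) < threshold:
            keep.append(i)
    if pages_ink:
        keep.append(len(pages_ink) - 1)
    return keep


# Two slides: the first built up over three frames, the second over two.
frames = [
    {(0, 0)},
    {(0, 0), (1, 0)},
    {(0, 0), (1, 0), (2, 0)},
    {(5, 5)},
    {(5, 5), (6, 5)},
]
print(pages_to_keep(frames))
```

Only the last frame of each build-up survives, which matches the intent of keeping pages where ink disappears or moves, plus the final page.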

If the document has custom page labels, they are remapped so that every surviving page keeps its own original label.

pdftl.api.update_bookmarks(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult

Replaces the document’s Table of Contents (bookmarks) entirely using a specified YAML or JSON file, or YAML data on stdin if bookmarks_file is set to "-".

This operation consumes the structured format generated by the dump_bookmarks command. It applies validation to prevent broken PDFs: it validates page boundaries, catches typos in property names, and ensures the hierarchical tree structure is sound.

Important Rules & Behaviors

Full Replacement: This command does not merge or append bookmarks. It completely overwrites the existing Table of Contents with the contents of your file.
Page Indexing: Target page numbers must be 1-indexed (e.g., page: 1 points to the very first physical page of the document).
Boundary Checking: If a bookmark points to a page number that exceeds the total number of pages in the PDF, the operation will fail safely to prevent document corruption.
Strict Validation: Unrecognized properties (such as a mistyped pagee instead of page) will trigger an error rather than being silently ignored. This ensures your routing logic always behaves as expected.
Precedence: dump_bookmarks output may contain a dest field alongside resolved page and view fields. This means that the bookmark uses a named destination; for convenience, the dump also includes the resolved page number and view parameters. For update_bookmarks, the dest field, if present, always takes precedence. If dest appears together with page or view, a warning is issued on first encounter. To update page and view, delete the dest field. To avoid the warning and keep only the dest field, delete page and view. Alternatively, use the no_resolve keyword when using dump_bookmarks to avoid this issue.
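
When editing a dump by hand, the "keep dest only" choice can be applied mechanically before passing the data back. The sketch below walks any nested structure and drops page and view wherever dest is present. The function name prefer_dest is hypothetical, and the title and children keys in the sample data are illustrative assumptions about the dump layout, not a description of it.

```python
def prefer_dest(node):
    """Recursively drop 'page' and 'view' from any entry that has 'dest'."""
    if isinstance(node, list):
        return [prefer_dest(item) for item in node]
    if isinstance(node, dict):
        has_dest = "dest" in node
        return {
            key: prefer_dest(value)
            for key, value in node.items()
            if not (has_dest and key in ("page", "view"))
        }
    return node


# Illustrative data only; 'title' and 'children' are assumed key names.
sample = [
    {
        "title": "Intro",
        "dest": "intro",
        "page": 1,
        "view": "Fit",
        "children": [{"title": "Details", "page": 2}],
    }
]
print(prefer_dest(sample))
```

Entries without a dest field are returned unchanged, so their page and view values remain authoritative.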

Clearing Bookmarks

If you want to completely remove the Table of Contents from a PDF, you can provide a file containing just an empty list.

YAML file content: []
JSON file content: { "bookmarks": [] }
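
Both clearing files can be produced with the standard library alone. This sketch only writes the files described above into a temporary directory; the file names are hypothetical, and passing a file to update_bookmarks is left out.

```python
import json
import pathlib
import tempfile

out_dir = pathlib.Path(tempfile.mkdtemp())

# JSON form: an object whose "bookmarks" key holds an empty list.
json_path = out_dir / "clear_bookmarks.json"
json_path.write_text(json.dumps({"bookmarks": []}), encoding="utf-8")

# YAML form: a bare empty list, written as plain text (no pyyaml needed).
yaml_path = out_dir / "clear_bookmarks.yaml"
yaml_path.write_text("[]\n", encoding="utf-8")

print(json_path.read_text(encoding="utf-8"))
print(yaml_path.read_text(encoding="utf-8"))
```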

Dependency note

Loading YAML files requires the pyyaml library. If it is not installed, you can install it via pip install pdftl[yaml-bookmarks], or simply use .json files, which need no extra dependency.

pdftl.api.update_info(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult

CLI Adapter for update_info: parses string arguments into a spec, then runs logic.

Expects op_args to have valid operation arguments (see pdftl help update_info).

xml_strings controls whether to XML-decode string fields in the input.

pdftl.api.update_info_utf8(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult

CLI Adapter for update_info: parses string arguments into a spec, then runs logic.

Expects op_args to have valid operation arguments (see pdftl help update_info).

xml_strings controls whether to XML-decode string fields in the input.

pdftl.api.zoom(*, inputs: list[str] = None, opened_pdfs: list[pikepdf._core.Pdf] = None, operation_args: list[str] = None, password: str = None, output: str = None, run_cli_hook: bool = False, full_result: bool = False, aliases: dict[str, Any] = None, options: dict[str, Any] = None) → pdftl.core.core_types.OpResult

The zoom operation rescales entire pages (including the MediaBox) to fit a specified target dimension. Unlike place, which only moves content within existing boundaries, zoom physically transforms the page size.

This is an in-place operation: unspecified pages are left unchanged.

Syntax:

zoom "<pages>(<target>[,<options>])"

Target Formats:

Relative/Percentage: (50%) or (200%). Resizes the page relative to its current dimensions.
Single Value/Paper: (A4) or (100mm). Scales the page uniformly so that it fits inside a bounding box of that size (aspect ratio preserved).
Explicit Box: (100mm,200mm). Scales the page uniformly to fit inside the specified width and height.
Axis Specific: (width=A4) or (height=11in). Scales the page proportionally based only on the specified dimension.

Options:

shrink: Only scale pages down. If the page is already smaller than the target, it remains unchanged.
grow: Only scale pages up. If the page is already larger than the target, it remains unchanged.

Note: Scaling is always uniform. If a target rectangle is provided, the operation uses the limiting dimension to ensure the entire page fits inside the “envelope”.
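
The "limiting dimension" rule and the shrink/grow options can be sketched as a single scale-factor computation. The function name zoom_factor is hypothetical, sizes are assumed to be in the same unit already, and unit or paper-name parsing is left out.

```python
def zoom_factor(page_w, page_h, target_w, target_h, option=None):
    """Uniform scale factor that fits the page inside the target box."""
    # The smaller ratio is the limiting dimension: it guarantees that
    # both width and height fit inside the envelope.
    factor = min(target_w / page_w, target_h / page_h)
    if option == "shrink" and factor > 1:
        return 1.0  # page already smaller than the target: unchanged
    if option == "grow" and factor < 1:
        return 1.0  # page already larger than the target: unchanged
    return factor


# A 200 x 100 page fitted into a 100 x 100 box is limited by its width.
print(zoom_factor(200, 100, 100, 100))
print(zoom_factor(200, 100, 100, 100, option="grow"))
print(zoom_factor(50, 50, 100, 200, option="shrink"))
```

A percentage target corresponds to using the percentage directly as the factor, and an axis-specific target corresponds to using only one of the two ratios.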