# `dump_streams`

Dump page content streams as seen by [`replace`]()

## Usage

> pdftl `` `dump_streams` `[normalize=false]` `[recurse=false]` `[resources=true]` `[annotate]` `[...]` `[output` `]`

## Details

The `dump_streams` operation outputs page content streams in the same form that the [`replace`]() operation operates on: by default normalized (one PDF operator per line), with Form XObjects recursively included.

This is the primary tool for crafting a regular expression to pass to [`replace`](). Instead of reaching for `mutool show` or an external PDF inspector, run `dump_streams` to see exactly the text that [`replace`]() will match against.

### Options

* `normalize=false`: output the raw, un-normalized stream bytes as stored in the PDF, instead of the normalized form. Annotation is suppressed when normalization is disabled.
* `recurse=false`: restrict output to top-level page content streams only, skipping Form XObjects. Mirrors the same flag on [`replace`]().
* `resources=true`: pretty-print the associated structural dictionary mapping for each Page and Form XObject. Very helpful to inspect Font and Form maps.
* `annotate`: append a PDF-style `%` comment to each operator line explaining what the operator does (e.g. `% show/text: Show text`). Particularly useful when learning the PDF content stream format or hunting for the right operator to target with [`replace`]().

### Output format

Each content stream is preceded by a labelled header block:

    ================
    === Page
    ================

For Form XObjects:

    ============================================
    === Page / XObject (:)
    ============================================

When an XObject is shared across multiple pages, a warning appears in the header identifying the other pages that reference it.

Stream content follows as decoded text (latin-1). Annotation comments, when requested, use standard PDF `%` comment syntax so the output remains valid PDF content stream text.
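
Since the dump is plain text with one operator per line, a candidate pattern can be tried against it before it is handed to [`replace`](). The following is a minimal Python sketch of that workflow, not part of pdftl. It makes several assumptions: the pattern uses Python `re` syntax, matching is tried line by line, and the sample text (including its annotation wording) is hand-written for illustration rather than real `dump_streams` output. The helper names `strip_annotation` and `find_matches` are hypothetical.

```python
import re

# Hand-written sample resembling annotated, normalized output.
# The annotation wording is illustrative only (assumption).
SAMPLE = "\n".join([
    "BT % begin text",
    "/F1 12 Tf % set font",
    "72 720 Td % move text position",
    "(Hello 100% World) Tj % show/text: Show text",
    "ET % end text",
])


def strip_annotation(line: str) -> str:
    """Drop a trailing PDF '%' comment, ignoring '%' inside (...) strings."""
    depth = 0        # nesting depth of literal-string parentheses
    escaped = False  # True right after a backslash inside a string
    for i, ch in enumerate(line):
        if escaped:
            escaped = False
        elif ch == "\\" and depth > 0:
            escaped = True
        elif ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif ch == "%" and depth == 0:
            return line[:i].rstrip()
    return line.rstrip()


def find_matches(pattern: str, dump_text: str) -> list[tuple[int, str]]:
    """Return (line number, operator line) for each line the pattern matches."""
    regex = re.compile(pattern)
    hits = []
    for lineno, line in enumerate(dump_text.splitlines(), start=1):
        content = strip_annotation(line)
        if regex.search(content):
            hits.append((lineno, content))
    return hits


if __name__ == "__main__":
    for lineno, content in find_matches(r"\(Hello.*\) Tj", SAMPLE):
        print(lineno, content)
```

Annotations are stripped before matching because they exist only in the dump, not in the stream text that [`replace`]() operates on. The same helpers could be applied to text saved with `output`, provided the file is read with the encoding it was written in.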
### Page specification

Standard page specs are supported (e.g. `1`, `2-4`, `1 3-5`). Default is all pages.

### Relationship to [`replace`]()

`dump_streams` intentionally mirrors [`replace`]()'s behavior:

| Behavior                    | [`replace`]() | `dump_streams` |
|-----------------------------|---------------|----------------|
| Normalizes page streams     | yes           | yes (default)  |
| Normalizes XObject streams  | yes           | yes (default)  |
| Recurses into Form XObjects | yes (default) | yes (default)  |

## Examples

> Print normalized content streams for all pages to stdout

```
pdftl in.pdf dump_streams
```

> Dump page content streams along with their pretty-printed resource blocks

```
pdftl in.pdf dump_streams resources=true
```

> Dump normalized content streams for pages 1-3 to a file

```
pdftl in.pdf dump_streams 1-3 output streams.txt
```

> Dump streams with operator annotations to help write a replace spec

```
pdftl in.pdf dump_streams annotate
```

> Dump the raw (un-normalized) content stream for page 1

```
pdftl in.pdf dump_streams normalize=false 1
```

> Dump only top-level page content streams, skipping Form XObjects

```
pdftl in.pdf dump_streams recurse=false
```

**Tags**: info, content_stream, replace

*Source: pdftl.operations.dump_streams*

*Read online: [https://pdftl.readthedocs.io/en/latest/operations/dump_streams.html](https://pdftl.readthedocs.io/en/latest/operations/dump_streams.html)*

*Type: Operation*