# `dump_streams`

Dump page content streams as seen by [`replace`](<replace.md>).

## Usage
> pdftl `<input>` `dump_streams` `[normalize=false]` `[recurse=false]` `[resources=true]` `[annotate]` `[<page_spec>...]` `[output` `<output>]`

## Details
The `dump_streams` operation outputs page content streams in the same form
that the [`replace`](<replace.md>) operation operates on: by default normalized (one PDF
operator per line), with Form XObjects recursively included.

This is the primary tool for crafting a regular expression to pass to
[`replace`](<replace.md>). Instead of reaching for `mutool show` or an external PDF
inspector, run `dump_streams` to see exactly the text that [`replace`](<replace.md>) will
match against.
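
A candidate pattern can be tried out on dumped text before it is handed to [`replace`](<replace.md>). The following is a minimal sketch in Python, not part of pdftl itself. The sample stream is hypothetical (the exact spacing of the normalized output may differ), and the sketch assumes the pattern syntax accepted by [`replace`](<replace.md>) is compatible with Python's `re` module, which this page does not state.

```python
import re

# Hypothetical sample of normalized output: one PDF operator per line.
# The exact spacing pdftl emits is an assumption, not taken from the tool.
sample = "\n".join([
    "BT",
    "/F1 12 Tf",
    "72 720 Td",
    "(Hello World) Tj",
    "ET",
])

# Candidate pattern: a literal string operand followed by the Tj operator,
# anchored to a whole line because the text has one operator per line.
pattern = re.compile(r"^\((?P<text>[^()]*)\) Tj$", re.MULTILINE)

# Collect what the pattern captures, then preview a substitution.
matches = [m.group("text") for m in pattern.finditer(sample)]
rewritten = pattern.sub("(Goodbye) Tj", sample)

print(matches)
print(rewritten)
```

In practice the `sample` string would be replaced by text saved with `output streams.txt`, so the pattern is checked against the real stream content.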

### Options

* `normalize=false` — output the raw, un-normalized stream bytes as stored
  in the PDF, instead of the normalized form. Annotation is suppressed when
  normalization is disabled.
* `recurse=false` — restrict output to top-level page content streams only,
  skipping Form XObjects. Mirrors the same flag on [`replace`](<replace.md>).
* `resources=true` — pretty-print the resource dictionary associated with
  each page and Form XObject. Useful for inspecting font and Form XObject
  name mappings.
* `annotate` — append a PDF-style `%` comment to each operator line
  explaining what the operator does (e.g. `% show/text: Show text`).
  Particularly useful when learning the PDF content stream format or
  hunting for the right operator to target with [`replace`](<replace.md>).

### Output format

Each content stream is preceded by a labelled header block:

    ================
    === Page <N>
    ================

For Form XObjects:

    ============================================
    === Page <N> / XObject <name> (<obj>:<gen>)
    ============================================

When an XObject is shared across multiple pages, a warning appears in
the header identifying the other pages that reference it.
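
The header layout above makes it possible to split saved output back into per-stream sections. The sketch below is one hedged way to do that in Python. It assumes XObject names are printed without spaces (for example `/Fm0`, a hypothetical name) and that rule lines consist only of `=` characters. The format of the shared-XObject warning is not specified here, so the header pattern is not anchored at the end of the line, and any additional header lines would end up in the section's `lines`.

```python
import re

# Header line: "=== Page <N>" optionally followed by
# " / XObject <name> (<obj>:<gen>)". Not anchored at the end of the line.
HEADER = re.compile(r"^=== Page (\d+)(?: / XObject (\S+) \((\d+):(\d+)\))?")
RULER = re.compile(r"^=+$")  # lines made only of '=' frame each header


def split_sections(dump_text):
    """Split dump_streams output into a list of per-stream sections."""
    sections = []
    current = None
    for line in dump_text.splitlines():
        m = HEADER.match(line)
        if m:
            page, name, obj, gen = m.groups()
            current = {
                "page": int(page),
                "xobject": name,  # None for a top-level page stream
                "ref": (int(obj), int(gen)) if obj else None,
                "lines": [],
            }
            sections.append(current)
        elif RULER.match(line):
            continue  # skip the '=' rules around headers
        elif current is not None:
            current["lines"].append(line)
    return sections


# Hypothetical dump text built to follow the documented header layout.
sample = "\n".join([
    "=" * 16, "=== Page 1", "=" * 16,
    "BT", "(Hi) Tj", "ET",
    "=" * 44, "=== Page 1 / XObject /Fm0 (12:0)", "=" * 44,
    "q", "Q",
])

sections = split_sections(sample)
for s in sections:
    print(s["page"], s["xobject"], s["ref"], len(s["lines"]))
```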

Stream content follows as decoded text (latin-1). Annotation comments,
when requested, use standard PDF `%` comment syntax so the output
remains valid PDF content stream text.
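
Latin-1 is a natural choice for content streams because it maps every byte value to exactly one character, so arbitrary stream bytes can be shown as text and recovered unchanged. The short Python sketch below illustrates that property in general terms. It says nothing about how pdftl encodes its output when writing to a file or terminal, which this page does not describe.

```python
# Every possible byte value, as might occur in a raw content stream.
raw = bytes(range(256))

# latin-1 decodes each byte to the code point with the same number...
text = raw.decode("latin-1")
code_points = [ord(ch) for ch in text]

# ...so encoding again gives back the original bytes exactly.
round_trip = text.encode("latin-1")

# By contrast, a lone 0xE9 byte is not valid UTF-8 and cannot be decoded.
try:
    bytes([0xE9]).decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(round_trip == raw, utf8_ok)
```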

### Page specification

Standard page specs are supported (e.g. `1`, `2-4`, `1 3-5`).
If no page spec is given, all pages are dumped.

### Relationship to [`replace`](<replace.md>)

`dump_streams` intentionally mirrors [`replace`](<replace.md>)'s behavior:

| Behavior                    | [`replace`](<replace.md>) | `dump_streams` |
|-----------------------------|---------------------------|----------------|
| Normalizes page streams     | yes                       | yes (default)  |
| Normalizes XObject streams  | yes                       | yes (default)  |
| Recurses into Form XObjects | yes (default)             | yes (default)  |

## Examples

> Print normalized content streams for all pages to stdout
```
pdftl in.pdf dump_streams
```

> Dump page content streams along with their pretty-printed resource blocks
```
pdftl in.pdf dump_streams resources=true
```

> Dump normalized content streams for pages 1-3 to a file
```
pdftl in.pdf dump_streams 1-3 output streams.txt
```

> Dump streams with operator annotations to help write a replace spec
```
pdftl in.pdf dump_streams annotate
```

> Dump the raw (un-normalized) content stream for page 1
```
pdftl in.pdf dump_streams normalize=false 1
```

> Dump only top-level page content streams, skipping Form XObjects
```
pdftl in.pdf dump_streams recurse=false
```


**Tags**: info, content_stream, replace

*Source: pdftl.operations.dump_streams*

*Read online: [https://pdftl.readthedocs.io/en/latest/operations/dump_streams.html](https://pdftl.readthedocs.io/en/latest/operations/dump_streams.html)*

*Type: Operation*