replace

Regex replacement on page content streams

Usage

pdftl <input> replace [<spec>...] output <output>

Details

The replace operation applies regular-expression replacements to the page content streams of the PDF file. Each <spec> can specify a page range; the default page range is all pages. The <spec> specification is:

  [optional page range]/<from>/<to>/[count]

where <from> is a regular expression and <to> is the replacement string, which may contain backreferences such as \1. Both use the syntax described at https://docs.python.org/3/library/re.html.

The delimiter / can be replaced with any other non-alphanumeric character. It must break the <spec> into exactly 4 parts (where the first may be empty). The delimiter is defined as the final character of <spec>, ignoring any trailing digits.

Any trailing digits are interpreted as count, which is the maximum number of times the expression will be matched for each page content stream.
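
The delimiter and count rules above can be sketched in Python. This is only an illustration of the rules as stated on this page, not pdftl's own parsing code: the function name parse_spec, its return shape, and the use of None for a missing count are all assumptions made for the example.

```python
def parse_spec(spec):
    """Split a replace <spec> into (page_range, from, to, count).

    Hypothetical helper for illustration only; not part of pdftl.
    """
    # Any trailing digits form the optional count.
    body = spec.rstrip("0123456789")
    count_text = spec[len(body):]
    if not body:
        raise ValueError("spec contains no delimiter")

    # The delimiter is the final character once trailing digits are ignored.
    delimiter = body[-1]
    if delimiter.isalnum():
        raise ValueError(f"delimiter {delimiter!r} must be non-alphanumeric")

    # body ends with the delimiter, so the fourth piece is always empty here;
    # the count itself was already taken from the trailing digits.
    parts = body.split(delimiter)
    if len(parts) != 4:
        raise ValueError(f"expected 4 parts, got {len(parts)}")

    page_range, pattern, replacement, _ = parts
    # Assumption: None stands for "no count given".
    count = int(count_text) if count_text else None
    return page_range, pattern, replacement, count
```

Under these assumptions, a spec such as '|a/b|c|2' uses | as the delimiter, which leaves / free to appear inside <from>, and yields an empty page range with a count of 2.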

Before and after the replacement is applied, the page content stream is normalized (see the normalize operation), which results in it appearing with one operator per line.

Examples

Replace red with blue on pages 1-3

pdftl in.pdf replace '1-3/1 0 0 (RG|rg)/0 0 1 \1/' output out.pdf
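
To preview what this <from>/<to> pair does, the same regular expression can be tried with Python's re module on a made-up, already normalized content stream. This is a sketch only: the sample stream is invented, and it is an assumption that pdftl applies the pattern in the same way as re.sub with default flags.

```python
import re

# Invented sample of a normalized content stream (one operator per line).
stream = "q\n1 0 0 RG\n1 0 0 rg\n0 0 m\n100 0 l\nS\nQ\n"

pattern = r"1 0 0 (RG|rg)"   # <from>: red as stroke (RG) or fill (rg) color
replacement = r"0 0 1 \1"    # <to>: blue, keeping whichever operator matched

# No count: every match in the stream is replaced.
all_replaced = re.sub(pattern, replacement, stream)

# With a count of 1, only the first match is replaced.
first_only = re.sub(pattern, replacement, stream, count=1)

print(all_replaced)
print(first_only)
```

Note that the pattern is not anchored, so it would also match inside a longer operand list such as "0.1 0 0 rg". Anchoring <from> more tightly may be worthwhile on real files.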

Tags: in_place, content_stream, dangerous

Source: pdftl.operations.replace

Read online: https://pdftl.readthedocs.io/en/stable/operations/replace.html

Type: Operation