# `diff_text`

Diff the text content of two PDFs and output bounding boxes

## Usage

> pdftl `` `diff_text` `` `[options...]` `[output` `]`

## Details

Performs a highly granular, spatially-aware text comparison between two PDFs. Outputs a JSON array of structural changes, including precise bounding box coordinates for where the changes occurred on the page.

### Options

* **`granularity=`**: Controls how the diff engine groups text before comparing. Options: `char`, `word`, `line`, `paragraph`. Using `word` prevents sub-word shredding on typos. *(Default: word)*
* **`ignore_whitespace=`**: If true, drops changes where the only difference is invisible space (e.g., reflow line-breaks). *(Default: true)*
* **`ignore_soft_hyphens=`**: If true, strips `\ufffe` soft hyphens before comparing. Useful for ignoring hyphenation differences caused by text reflowing across margins. *(Default: false)*
* **`include_bboxes=`**: If true, includes spatial bounding box coordinates for every change. Turn this off for a cleaner, text-only JSON output. *(Default: true)*
* **`margin_top=`**, **`margin_bottom=`**, **`margin_left=`**, **`margin_right=`**: Filters out changes that fall entirely within these margins (in points). Excellent for removing noisy page headers, footers, or marginalia. *(Default: 0)*

**Tags**: text, compare, utility

*Source: pdftl.operations.diff_text*

*Read online: [https://pdftl.readthedocs.io/en/latest/operations/diff_text.html](https://pdftl.readthedocs.io/en/latest/operations/diff_text.html)*

*Type: Operation*
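
The behaviour described under Options can be illustrated with a small standalone sketch. This is not pdftl's implementation: it uses Python's standard `difflib` to show why `word` granularity avoids sub-word shredding, how a whitespace-only change might be dropped, and how a margin filter could test a bounding box. The function names (`tokenize`, `diff_tokens`, `in_margins`), the shape of the change records, and the bounding box layout `(x0, y0, x1, y1)` in points with the origin at the bottom-left of the page are all assumptions made for illustration. The actual JSON schema produced by `diff_text` is not shown here.

```python
import difflib
import re


def tokenize(text, granularity="word"):
    """Split text into comparison units (hypothetical helper, not pdftl API)."""
    if granularity == "char":
        return list(text)
    if granularity == "word":
        # Whitespace runs are kept as their own tokens.
        return re.findall(r"\s+|\S+", text)
    if granularity == "line":
        return text.splitlines(keepends=True)
    if granularity == "paragraph":
        # Paragraphs are assumed to be separated by blank lines.
        return [p for p in re.split(r"\n\s*\n", text) if p]
    raise ValueError(f"unknown granularity: {granularity!r}")


def diff_tokens(old, new, granularity="word", ignore_whitespace=True):
    """Return a list of non-equal edit operations between two strings."""
    a = tokenize(old, granularity)
    b = tokenize(new, granularity)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    changes = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            continue
        removed = "".join(a[i1:i2])
        added = "".join(b[j1:j2])
        # Drop changes whose only difference is whitespace.
        if ignore_whitespace and removed.split() == added.split():
            continue
        changes.append({"op": op, "old": removed, "new": added})
    return changes


def in_margins(bbox, page_width, page_height,
               top=0, bottom=0, left=0, right=0):
    """True if bbox lies entirely inside one of the margin bands.

    Assumes bbox = (x0, y0, x1, y1) in points, origin at bottom-left.
    A margin of 0 disables that band.
    """
    x0, y0, x1, y1 = bbox
    return (
        (top > 0 and y0 >= page_height - top)
        or (bottom > 0 and y1 <= bottom)
        or (left > 0 and x1 <= left)
        or (right > 0 and x0 >= page_width - right)
    )


if __name__ == "__main__":
    old, new = "The quick brown fox", "The quick brwon fox"
    print(diff_tokens(old, new, "word"))  # one whole-word replacement
    print(diff_tokens(old, new, "char"))  # several single-character edits
    print(in_margins((72, 760, 200, 780), 612, 792, top=50))
```

In this sketch the typo yields a single `replace` record at `word` granularity, while `char` granularity splits the same typo into several one-character edits. A box that only partly overlaps a margin band is kept, matching the "entirely within" wording above.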