Quick start
Redact your first PDF in under five minutes.
1. Pick a provider
kuroi supports three LLM providers. Pick one:
- Anthropic API (cloud, default). Set ANTHROPIC_API_KEY once per shell.
- Claude CLI (cloud, subscription). If you have a Claude Code plan, no API key is needed. Run claude /login once, then pass --provider claude-cli.
- Ollama (local, offline). Install Ollama and pull a model: no API key, no outbound HTTP. Pass --provider ollama --model <name>.
Run kuroi setup to configure interactively, or jump to
LLM providers for full details on each.
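Before picking a provider, it can help to check which prerequisites are already in place. The sketch below is a hypothetical preflight helper, not part of kuroi: detect_providers is an invented name, and the label "anthropic" for the default provider is assumed from the sample cost line. It only checks that the environment variable is set and that the claude and ollama binaries are on PATH; it does not verify login state or pulled models.

```python
import os
import shutil
from typing import Callable, Mapping, Optional


def detect_providers(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the provider names whose basic prerequisites appear to be met."""
    found = []
    # Anthropic API: needs a non-empty key exported in the current shell.
    if env.get("ANTHROPIC_API_KEY"):
        found.append("anthropic")  # assumed label for the default provider
    # Claude CLI: needs the `claude` binary on PATH (login state is not checked).
    if which("claude"):
        found.append("claude-cli")
    # Ollama: needs the `ollama` binary on PATH (pulled models are not checked).
    if which("ollama"):
        found.append("ollama")
    return found


if __name__ == "__main__":
    print("Usable providers:", detect_providers())
```

An empty list means none of the three options is ready yet; kuroi setup remains the supported way to configure one.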
2. Run the redactor
kuroi prints a cost estimate, asks you to confirm, then writes the redacted output:
Estimated cost: $0.0042 (3219 input tokens, anthropic/claude-opus-4-7)
Found 47 candidate redactions.
Apply redactions? [Y/n]: y
Wrote report.redacted.pdf
Backup: ~/.local/share/kuroi/backups/2026-04-30T09-31-02Z-a1b2c3/report.pdf
Audit: ~/.local/share/kuroi/audit/2026-04-30T09-31-02Z-a1b2c3.jsonl
The provider/model in the cost line reflects whatever is configured.
For Claude CLI runs the cost shows as subscription; for Ollama it shows
as local (free).
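If you wrap kuroi in a script, the cost line can be picked apart with a regular expression. This is a minimal sketch based only on the sample line above: parse_estimate and Estimate are hypothetical names, and the layout of the subscription and local (free) variants is an assumption (the same line with a different cost label).

```python
import re
from typing import NamedTuple, Optional

# Layout taken from the sample output; other variants are assumed to match it.
COST_LINE = re.compile(
    r"^Estimated cost: (?P<cost>.+?) "
    r"\((?P<tokens>\d+) input tokens, (?P<provider>[^/\s]+)/(?P<model>[^)\s]+)\)$"
)


class Estimate(NamedTuple):
    usd: Optional[float]  # None when the label is not a dollar amount
    label: str            # raw cost label, e.g. "$0.0042"
    input_tokens: int
    provider: str
    model: str


def parse_estimate(line: str) -> Optional[Estimate]:
    """Parse one cost-estimate line; return None if it does not match."""
    m = COST_LINE.match(line.strip())
    if m is None:
        return None
    label = m["cost"]
    # Only labels of the form "$<number>" are converted to a float.
    usd = float(label[1:]) if re.fullmatch(r"\$\d+(\.\d+)?", label) else None
    return Estimate(usd, label, int(m["tokens"]), m["provider"], m["model"])
```

A script could use the usd field to refuse runs above a budget, treating None as "no metered cost".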
The Backup: line points at a copy of the original, kept so you can
roll back while the backup is retained (24 hours by default). Pass
--no-backup to skip the copy when you don't need an undo path.
The -o flag selects the output path. Use --in-place if you want
kuroi to overwrite the input. You must pass one of the two. See
Batch redaction for output-resolution rules.
3. Inspect the diff
To see what kuroi changed, run kuroi diff with both the original and
redacted PDFs:
$ kuroi diff report.pdf report.redacted.pdf
Page 3: 2 redactions
- [40,120,180,138] 'j.doe@example.com'
- [200,400,310,418] '+33 6 12 34 56 78'
Page 7: 1 redaction
- [60,210,220,230] 'Jane M. Doe'
The text renderer prints each redaction as a bounding box followed by the
text that was there before redaction. Use --format json for machine-readable
output or --format html for a side-by-side view.
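For quick checks, the text output shown above is regular enough to parse directly. The sketch below assumes the layout of the sample (a Page N: header followed by - [a,b,c,d] 'text' lines with integer coordinates); parse_diff and Redaction are hypothetical names, and the meaning of the four coordinates is not interpreted. For anything long-lived, --format json is likely the better input, but its schema is not shown in this guide.

```python
import re
from dataclasses import dataclass

# Patterns mirror the sample text output; real output may differ in detail.
PAGE = re.compile(r"^Page (\d+): \d+ redactions?$")
ITEM = re.compile(r"^- \[(\d+),(\d+),(\d+),(\d+)\] '(.*)'$")


@dataclass(frozen=True)
class Redaction:
    page: int
    bbox: tuple[int, int, int, int]  # the four numbers as printed, uninterpreted
    before: str                      # text that was present before redaction


def parse_diff(text: str) -> list[Redaction]:
    """Turn the text renderer's output into a flat list of records."""
    records = []
    page = None
    for raw in text.splitlines():
        line = raw.strip()
        if m := PAGE.match(line):
            page = int(m.group(1))  # following items belong to this page
        elif (m := ITEM.match(line)) and page is not None:
            *coords, before = m.groups()
            records.append(Redaction(page, tuple(map(int, coords)), before))
    return records
```

Feeding it the captured output of kuroi diff report.pdf report.redacted.pdf gives one record per redaction, which is convenient for counting redactions per page or spotting unexpected matches.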
4. Restore if you need to
If anything looks wrong, restore the latest backup with kuroi undo. It restores the most recent backup in
$XDG_DATA_HOME/kuroi/backups/ (default ~/.local/share/kuroi/backups/).
Pass --backup-dir to point at a different directory. Backups are kept
for 24 hours by default; use kuroi backups gc --max-age <hours> to drop
old backups manually.
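The directory lookup and the 24-hour window can be reproduced in a few lines, which is handy when checking whether a backup is still within the retention period. This is a sketch under assumptions: the helper names are hypothetical, the timestamp is read from the directory name as it appears in the sample Backup: line, and kuroi itself may decide age differently (for example from file modification times).

```python
import os
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Iterable, Mapping, Optional


def backup_root(env: Mapping[str, str] = os.environ) -> Path:
    """Resolve $XDG_DATA_HOME/kuroi/backups, falling back to ~/.local/share."""
    base = env.get("XDG_DATA_HOME") or str(Path.home() / ".local" / "share")
    return Path(base) / "kuroi" / "backups"


def backup_time(dirname: str) -> datetime:
    """Read the UTC timestamp from a name like 2026-04-30T09-31-02Z-a1b2c3."""
    stamp = dirname[:20]  # "YYYY-MM-DDTHH-MM-SSZ" is 20 characters long
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H-%M-%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def expired(dirname: str, now: datetime, max_age_hours: float = 24.0) -> bool:
    """True if the backup is older than the retention window."""
    return now - backup_time(dirname) > timedelta(hours=max_age_hours)


def latest(dirnames: Iterable[str]) -> Optional[str]:
    """Name of the most recent backup, or None if there are none."""
    return max(dirnames, key=backup_time, default=None)


if __name__ == "__main__":
    root = backup_root()
    names = [p.name for p in root.iterdir()] if root.is_dir() else []
    print("Latest backup:", latest(names))
```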
What just happened?
- kuroi ran the PII rule pack over the PDF text using regex.
- Snippets the regex couldn't classify were sent to the configured LLM for context-aware judgement.
- PyMuPDF rewrote the content stream so the redacted text is no longer recoverable, not just covered with a black box.
- A full audit record was written so you can always trace why a span was redacted. See Audit & undo.
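The first three steps above can be sketched in Python. This is an illustration under stated assumptions, not kuroi's implementation: the two regex rules are toy stand-ins for the real PII rule pack, llm_judge is a hypothetical callback standing in for the configured provider, and triage and redact_pdf are invented names. The PyMuPDF calls (search_for, add_redact_annot, apply_redactions) are the library's standard redaction API.

```python
import re
from typing import Callable, Iterable

# Toy rules for illustration only; the real rule pack is not shown in this guide.
RULES = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\+\d{1,3}(?:[ .-]?\d{1,4}){3,6}"),
}


def triage(snippets: Iterable[str], llm_judge: Callable[[str], bool]) -> list[str]:
    """Regex first; only snippets no rule fully matches go to the LLM callback."""
    to_redact = []
    for snippet in snippets:
        if any(rule.fullmatch(snippet) for rule in RULES.values()):
            to_redact.append(snippet)   # classified by a rule, no LLM call
        elif llm_judge(snippet):
            to_redact.append(snippet)   # context-aware judgement said "redact"
    return to_redact


def redact_pdf(src: str, dst: str, needles: Iterable[str]) -> int:
    """Remove every occurrence of each needle from the PDF; return the count."""
    import pymupdf  # PyMuPDF, imported lazily; older releases import as `fitz`

    needles = list(needles)  # iterated once per page, so materialize it
    count = 0
    with pymupdf.open(src) as doc:
        for page in doc:
            for needle in needles:
                for rect in page.search_for(needle):
                    page.add_redact_annot(rect, fill=(0, 0, 0))
                    count += 1
            # Deletes the text under the annotations instead of only covering it.
            page.apply_redactions()
        doc.save(dst, garbage=4, deflate=True)
    return count
```

Keeping the LLM behind a plain callback means the regex stage can be exercised offline with a stub, and the same code path works for any of the three providers.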
Next steps
- Batch redaction — process a folder of PDFs.
- LLM providers — switch to local-only Ollama.
- Audit, diff, undo & backups — verify and reverse.