
Quick start

Redact your first PDF in under five minutes.

1. Pick a provider

kuroi supports three LLM providers — pick one:

  • Anthropic API (cloud, default). Set ANTHROPIC_API_KEY once per shell:
    export ANTHROPIC_API_KEY=sk-ant-...
    
  • Claude CLI (cloud, subscription). If you have a Claude Code plan, no API key is needed. Run claude /login once, then pass --provider claude-cli.
  • Ollama (local, offline). Install Ollama and pull a model — no API key, no outbound HTTP. Pass --provider ollama --model <name>.

Run kuroi setup to configure interactively, or jump to LLM providers for full details on each.
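
As a rough illustration of the three options above, the sketch below maps a provider choice to the extra command-line arguments this section describes. The helper name provider_args and the "anthropic" key are assumptions made for illustration. Only ANTHROPIC_API_KEY and the --provider and --model flags come from the text above.

```python
import os


def provider_args(provider, model=None, env=None):
    """Return the extra kuroi arguments for a provider choice.

    Hypothetical helper: it only uses the flags named in this guide.
    """
    env = os.environ if env is None else env
    if provider == "anthropic":
        # Default provider: no flag needed, but the API key must be exported.
        if not env.get("ANTHROPIC_API_KEY"):
            raise RuntimeError("ANTHROPIC_API_KEY is not set")
        return []
    if provider == "claude-cli":
        # The subscription login replaces the API key.
        return ["--provider", "claude-cli"]
    if provider == "ollama":
        # Local models must be named explicitly.
        if not model:
            raise ValueError("ollama needs a model name")
        return ["--provider", "ollama", "--model", model]
    raise ValueError(f"unknown provider: {provider}")
```

The returned list can be appended to a kuroi run command line. The "anthropic" branch returns an empty list because that provider is the default.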

2. Run the redactor

$ kuroi run report.pdf -o report.redacted.pdf

kuroi prints a cost estimate, asks you to confirm, then writes the redacted output:

  Estimated cost: $0.0042  (3219 input tokens, anthropic/claude-opus-4-7)
  Found 47 candidate redactions.
Apply redactions? [Y/n]: y
  Wrote report.redacted.pdf
  Backup: ~/.local/share/kuroi/backups/2026-04-30T09-31-02Z-a1b2c3/report.pdf
  Audit: ~/.local/share/kuroi/audit/2026-04-30T09-31-02Z-a1b2c3.jsonl

The provider/model in the cost line reflects whatever is configured. For Claude CLI runs the cost shows as subscription; for Ollama it shows as local (free).

The Backup: line points at a copy of the original, kept so you can roll back for as long as the backup is retained (24 hours by default; see step 4). Pass --no-backup to skip the copy when you don't need an undo path.

The -o flag selects the output path. Use --in-place instead if you want kuroi to overwrite the input. You must pass one of the two. See Batch redaction for output-resolution rules.

3. Inspect the diff

To see what kuroi changed, run kuroi diff with both the original and redacted PDFs:

$ kuroi diff report.pdf report.redacted.pdf
Page 3: 2 redactions
  - [40,120,180,138]  'j.doe@example.com'
  - [200,400,310,418]  '+33 6 12 34 56 78'
Page 7: 1 redaction
  - [60,210,220,230]  'Jane M. Doe'

The text renderer prints each redaction as a bounding box (bbox) followed by the text that was removed. Use --format json for machine-readable output, or --format html for a side-by-side view.
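
For quick scripting against the text output shown above, a small parser along these lines may be enough. It is only a sketch: it assumes the layout is exactly as in the sample (integer bbox values, single-quoted snippets), and parse_diff is a hypothetical helper, not part of kuroi. For anything durable, --format json is likely the better choice.

```python
import re

# Patterns inferred from the sample output above; the real format may vary.
PAGE_RE = re.compile(r"^Page (\d+): (\d+) redactions?$")
ITEM_RE = re.compile(r"^\s*- \[(\d+),(\d+),(\d+),(\d+)\]\s+'(.*)'$")


def parse_diff(text):
    """Parse text-renderer output into a list of redaction records."""
    records = []
    page = None
    for line in text.splitlines():
        m = PAGE_RE.match(line)
        if m:
            page = int(m.group(1))  # remember the page for the items below it
            continue
        m = ITEM_RE.match(line)
        if m and page is not None:
            *bbox, snippet = m.groups()
            records.append({
                "page": page,
                "bbox": tuple(int(v) for v in bbox),
                "text": snippet,
            })
    return records


SAMPLE = """\
Page 3: 2 redactions
  - [40,120,180,138]  'j.doe@example.com'
  - [200,400,310,418]  '+33 6 12 34 56 78'
Page 7: 1 redaction
  - [60,210,220,230]  'Jane M. Doe'
"""
```

Calling parse_diff(SAMPLE) yields three records, two on page 3 and one on page 7.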

4. Restore if you need to

If anything looks wrong, restore the latest backup:

$ kuroi undo

kuroi undo restores the most recent backup in $XDG_DATA_HOME/kuroi/backups/ (default ~/.local/share/kuroi/backups/). Pass --backup-dir to point at a different directory. Backups are kept for 24 hours by default; use kuroi backups gc --max-age <hours> to drop old backups manually.
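
The path resolution and retention window described above can be sketched as follows. The directory-name format is taken from the Backup: line in step 2. That retention is measured from the timestamp in the directory name is an assumption, and all three function names are hypothetical.

```python
import os
from datetime import datetime, timedelta, timezone
from pathlib import Path


def backup_root(env=None):
    """Resolve $XDG_DATA_HOME/kuroi/backups, falling back to ~/.local/share."""
    env = os.environ if env is None else env
    data_home = env.get("XDG_DATA_HOME") or str(Path.home() / ".local" / "share")
    return Path(data_home) / "kuroi" / "backups"


def backup_time(name):
    """Parse the UTC timestamp prefix of a name like 2026-04-30T09-31-02Z-a1b2c3."""
    stamp = name[:20]  # "2026-04-30T09-31-02Z"
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H-%M-%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def latest_backup(names):
    """Pick the most recent backup directory name, or None if there are none."""
    return max(names, key=backup_time, default=None)


def is_expired(name, now, max_age_hours=24):
    """Assumption: a backup expires once its timestamp is older than max_age_hours."""
    return now - backup_time(name) > timedelta(hours=max_age_hours)
```

Passing the directory names under backup_root() to latest_backup gives the backup that a plain kuroi undo would presumably pick.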

What just happened?

  1. kuroi ran the regex-based PII rule pack over the PDF text.
  2. Snippets the rules couldn't classify were sent to the configured LLM for context-aware judgement.
  3. PyMuPDF rewrote the content stream so the redacted text is removed from the file and no longer recoverable, rather than just covered with a black box.
  4. A full audit record was written so you can always trace why a span was redacted. See Audit & undo.
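
The regex-first, LLM-fallback flow in steps 1 and 2 can be pictured with a minimal sketch like the one below. It is not kuroi's implementation: the two patterns, the label names, and the ask_llm callback are all hypothetical stand-ins, and the real rule pack is not shown in this guide.

```python
import re

# Hypothetical two-rule pack for illustration only.
RULES = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\+\d{1,3}(?: \d{1,2}){2,5}"),
}


def triage(snippets, ask_llm):
    """Classify each snippet with regex first, then fall back to ask_llm.

    ask_llm is any callable that takes a snippet and returns a label,
    or None when the snippet should not be redacted.
    """
    decisions = []
    for snippet in snippets:
        label = next(
            (name for name, rx in RULES.items() if rx.fullmatch(snippet)),
            None,
        )
        source = "regex"
        if label is None:
            # Only snippets the rules could not classify reach the model.
            label = ask_llm(snippet)
            source = "llm"
        if label is not None:
            decisions.append({"text": snippet, "label": label, "source": source})
    return decisions
```

Keeping the source of each decision next to the label mirrors the idea behind the audit record: every redaction can be traced to the rule or model call that produced it.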

Next steps