
LLM providers

kuroi uses an LLM as a "judge" to spot context-sensitive redactions that plain regex misses. Three providers ship with kuroi: Anthropic (cloud, per-token), Claude CLI (cloud, subscription), and Ollama (local).

Provider comparison

Provider    Hosting  API key required         Cost                  Best for
Anthropic   Cloud    Yes (ANTHROPIC_API_KEY)  Per-token             Highest-quality detection on small batches.
Claude CLI  Cloud    No (subscription)        Subscription          Heavy use under a Claude Code plan.
Ollama      Local    No                       Free (your hardware)  Offline workflows; sensitive data that must not leave the host.

List available models

$ kuroi models

Anthropic                                                   cloud
  claude-opus-4-7        (default)   $15.00 / $75.00 per Mtok
  claude-sonnet-4-6                  $3.00 / $15.00 per Mtok
  claude-haiku-4-5-20251001          $1.00 / $5.00 per Mtok
  seed support: temperature=0 only (best-effort, recorded in audit)

Claude CLI                                                  cloud
  claude-opus-4-7                    subscription billing
  claude-sonnet-4-6                  subscription billing
  claude-haiku-4-5-20251001          subscription billing
  seed support: not available

Ollama                                                      local
  llama3.1:8b            (installed) free
  llama3.1:70b                       free
  seed support: full

Default: anthropic / claude-opus-4-7   (configurable)
Pricing last updated: 2026-05-09

(installed) is shown for Ollama models the local daemon currently serves; if the Ollama daemon at the configured URL isn't reachable, the column is simply blank. Use kuroi models --json for machine-readable output, or kuroi models ollama to filter to one provider.

Configure a provider

Anthropic

Pass per invocation:

$ export ANTHROPIC_API_KEY=sk-ant-...
$ kuroi run document.pdf --provider anthropic --model claude-opus-4-7

Or persist by editing ~/.config/kuroi/config.toml:

provider = "anthropic"
model = "claude-opus-4-7"

kuroi setup will write this file for you interactively.

Claude CLI

Pass per invocation:

$ pip install claude-agent-sdk          # or `npm install -g @anthropic-ai/claude-code`
$ claude /login                          # one-time, authenticates your subscription
$ kuroi run document.pdf --provider claude-cli --model claude-opus-4-7

If you installed kuroi with pipx, inject the SDK into kuroi's isolated environment instead:

$ pipx inject kuroi claude-agent-sdk

Or persist by editing ~/.config/kuroi/config.toml:

provider = "claude-cli"
model = "claude-opus-4-7"

# Optional — both fields default sensibly.
[claude_cli]
cli_path = "/usr/local/bin/claude"
timeout_s = 300

kuroi setup will probe the CLI, verify it's authenticated, and write this file for you.

Or pass overrides per invocation without editing the config:

$ kuroi run document.pdf --provider claude-cli \
    --claude-cli-path /opt/claude/bin/claude \
    --claude-cli-timeout 600

--claude-cli-path overrides the bundled claude-agent-sdk binary (point it at a system-installed claude if you prefer). --claude-cli-timeout is the per-call timeout in seconds (default 300); raise it for long documents that exceed the default budget.

ANTHROPIC_API_KEY shadowing

If ANTHROPIC_API_KEY is set in your environment when you select the claude-cli provider, the Claude CLI will prefer API (per-token) billing over your subscription. kuroi prints a warning at startup so the behavior is visible. Unset the variable to force subscription billing.
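
When kuroi is launched from a script, one way to follow the advice above is to give the child process an environment with ANTHROPIC_API_KEY removed. The following is a minimal Python sketch, not part of kuroi: the helper names subscription_env and run_with_subscription are hypothetical, and the command uses only flags already shown on this page.

```python
import os
import subprocess


def subscription_env(env=None):
    """Return a copy of the environment without ANTHROPIC_API_KEY.

    Hypothetical helper, not part of kuroi. The caller's mapping is left
    untouched, so other tools in the same process still see the key.
    """
    base = dict(os.environ if env is None else env)
    base.pop("ANTHROPIC_API_KEY", None)
    return base


def run_with_subscription(document):
    """Run kuroi with the claude-cli provider and the scrubbed environment."""
    cmd = ["kuroi", "run", document, "--provider", "claude-cli"]
    return subprocess.run(cmd, env=subscription_env(), check=False)
```

Calling run_with_subscription("document.pdf") leaves the parent shell's variables as they are, which is convenient when the same session also runs the anthropic provider.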

Per-category model: overrides work for every provider; see Per-category model routing below.

Ollama

Pass per invocation:

$ ollama serve &  # start the daemon if it's not already running
$ ollama pull llama3.1:8b
$ kuroi run document.pdf --provider ollama --model llama3.1:8b

Or persist by editing ~/.config/kuroi/config.toml:

provider = "ollama"
model = "llama3.1:8b"

[ollama]
url = "http://localhost:11434"

kuroi setup will probe the daemon, list installed models, and write this file for you.

Per-category model routing

Every category in a rule pack accepts an optional model: field. When set, calls for that category are dispatched against the named model instead of the provider's default — useful for routing cheap, repetitive categories (e.g. emails) to a small fast model and reserving the heavy default model for harder ones (e.g. names, addresses).

- id: contact_info
  llm: true
  model: claude-haiku-4-5-20251001

- id: full_name
  llm: true
  # falls back to the configured default model

The chunker groups dispatches by model and runs each group concurrently per batch, so per-category routing does not serialize calls. The override applies regardless of whether the provider is anthropic, claude-cli, or ollama — pick a model id the chosen provider can serve.
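
To make the grouping step concrete, here is a small Python sketch of how categories could be bucketed by model before dispatch. It is an illustration only, not kuroi's chunker: the function name group_by_model is hypothetical, and it assumes that categories without llm: true are never sent to a model.

```python
from collections import defaultdict


def group_by_model(categories, default_model):
    """Map each model id to the category ids that would be sent to it.

    `categories` mirrors rule-pack entries: dicts with an `id`, an `llm`
    flag, and an optional `model`. Illustrative only.
    """
    groups = defaultdict(list)
    for cat in categories:
        if not cat.get("llm"):
            continue  # assumption: non-LLM categories are skipped here
        # A missing or empty `model` falls back to the configured default.
        groups[cat.get("model") or default_model].append(cat["id"])
    return dict(groups)


rules = [
    {"id": "contact_info", "llm": True, "model": "claude-haiku-4-5-20251001"},
    {"id": "full_name", "llm": True},
]
print(group_by_model(rules, "claude-opus-4-7"))
# {'claude-haiku-4-5-20251001': ['contact_info'], 'claude-opus-4-7': ['full_name']}
```

Each key in the result corresponds to one group that can run concurrently with the others, which is why routing a category to a different model need not serialize calls.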

Configuration precedence

kuroi resolves configuration in this order (later overrides earlier):

  1. Built-in defaults.
  2. ~/.config/kuroi/config.toml (or $XDG_CONFIG_HOME/kuroi/config.toml).
  3. Environment variables (ANTHROPIC_API_KEY, KUROI_PROVIDER, KUROI_MODEL).
  4. CLI flags (--provider, --model, --ollama-url).

See Configuration for the full key list.
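
The precedence order above can be pictured as a layered merge in which each later source overwrites the keys it sets. The Python sketch below is a hedged illustration rather than kuroi's implementation: the resolve function and the ENV_KEYS mapping are hypothetical, and it assumes KUROI_PROVIDER and KUROI_MODEL map onto the provider and model keys.

```python
# Assumed mapping from environment variables to config keys.
ENV_KEYS = {"KUROI_PROVIDER": "provider", "KUROI_MODEL": "model"}


def resolve(defaults, file_cfg, env, cli_flags):
    """Merge settings so that later layers override earlier ones."""
    env_cfg = {key: env[var] for var, key in ENV_KEYS.items() if env.get(var)}
    merged = {}
    for layer in (defaults, file_cfg, env_cfg, cli_flags):
        # Flags the user did not pass arrive as None and must not override.
        merged.update({k: v for k, v in layer.items() if v is not None})
    return merged


settings = resolve(
    defaults={"provider": "anthropic", "model": "claude-opus-4-7"},
    file_cfg={"provider": "ollama", "model": "llama3.1:8b"},
    env={"KUROI_MODEL": "llama3.1:70b"},
    cli_flags={"provider": None, "model": None},
)
print(settings)  # {'provider': 'ollama', 'model': 'llama3.1:70b'}
```

In this example the config file selects ollama, the environment variable replaces the model, and the unset CLI flags change nothing.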

Fully offline workflow

  1. Install Ollama.
  2. Pull a model: ollama pull llama3.1:8b.
  3. Run kuroi setup and pick ollama + your model, or write ~/.config/kuroi/config.toml by hand (see the Ollama tab above).
  4. Run kuroi run document.pdf. No outbound HTTP except to localhost.
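
Before running on sensitive documents, it can be worth confirming that the configured Ollama URL really points at the local machine. The Python sketch below is a standalone check under stated assumptions, not a kuroi feature: the function name is_loopback_url is hypothetical, and the check inspects only the URL text and does not resolve DNS.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url):
    """Return True if the URL's host is localhost or a loopback IP address."""
    host = urlsplit(url).hostname
    if host is None:
        return False  # no scheme or no host, so nothing to verify
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a named host other than localhost


print(is_loopback_url("http://localhost:11434"))  # True
print(is_loopback_url("http://10.0.0.5:11434"))   # False
```

A check like this fits naturally in a wrapper script that refuses to start when the [ollama] url setting has been pointed at a remote host.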

Switching providers mid-project

You can change provider per-invocation without persisting the choice. The audit record stores which provider produced each finding, so you can tell later which run did what:

$ kuroi run document.pdf --provider anthropic
$ kuroi diff document.pdf document.redacted.pdf --format json | head -1
{"page":3,"before_text":"...","after_text":"...","redactions":[{"bbox":[40,120,180,138],"kind":"","replacement":"","before_text":"j.doe@example.com"}]}

For the per-finding provider and kind, read the audit JSONL directly (see Audit & undo).

Next steps