kuroi.providers.base
base
The Provider Protocol every LLM client implements.
Provider
Bases: Protocol
Interface kuroi uses to talk to any LLM, cloud or local.
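Because `Provider` is a `typing.Protocol`, a client should satisfy it structurally, by exposing a matching `detect_redactions` method, rather than by inheriting from it. The sketch below illustrates that idea only. `Page`, `Finding`, and `ChunkRecord` are placeholder dataclasses with invented fields, the `Provider` class is a local re-declaration of the documented signature, and the `@runtime_checkable` decorator and `EchoProvider` are assumptions made for the example, not part of kuroi.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol, runtime_checkable


# Placeholder types for illustration only; kuroi's real classes will differ.
@dataclass(frozen=True)
class Page:
    words: tuple[str, ...]


@dataclass(frozen=True)
class Finding:
    category_id: str
    text: str


@dataclass(frozen=True)
class ChunkRecord:
    prompt: str
    raw_response: str


@runtime_checkable  # assumption: added here so isinstance() can be demonstrated
class Provider(Protocol):
    """Local re-declaration mirroring the documented signature."""

    def detect_redactions(
        self,
        pages: tuple[Page, ...],
        llm_category_ids: tuple[str, ...],
        *,
        instructions: tuple[str, ...] = (),
        seed: int | None = None,
        attempt: int = 0,
        layout_aware: bool = False,
        model: str | None = None,
    ) -> tuple[list[Finding], list[ChunkRecord]]: ...


class EchoProvider:
    """Hypothetical provider: no base class, only a matching method."""

    def detect_redactions(
        self,
        pages,
        llm_category_ids,
        *,
        instructions=(),
        seed=None,
        attempt=0,
        layout_aware=False,
        model=None,
    ):
        # A real client would call an LLM here; this one finds nothing.
        return [], []


provider: Provider = EchoProvider()
# runtime_checkable protocols only verify that the method exists,
# not that its signature or return type match.
assert isinstance(provider, Provider)
```

A static type checker is the stronger check here, since it compares the full signature, while the runtime `isinstance` test only confirms that a `detect_redactions` attribute is present.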
detect_redactions
detect_redactions(pages: tuple[Page, ...], llm_category_ids: tuple[str, ...], *, instructions: tuple[str, ...] = (), seed: int | None = None, attempt: int = 0, layout_aware: bool = False, model: str | None = None) -> tuple[list[Finding], list[ChunkRecord]]
Identify spans to redact across the supplied pages.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pages | tuple[Page, ...] | Word-indexed pages produced by the PDF extractor. | required |
| llm_category_ids | tuple[str, ...] | Category ids the provider is responsible for. | required |
| instructions | tuple[str, ...] | Free-text redaction instructions from the user. | () |
| seed | int \| None | Optional sampling seed for reproducible runs. | None |
| attempt | int | Zero-based retry index from the chunker. Providers may use it to scale per-call timeouts (e.g. Ollama gives slow models more time on each retry). | 0 |
| layout_aware | bool | When True, wrap the prompt with PyMuPDF block boundaries (…) | False |
| model | str \| None | If set, dispatch this call against the named model instead of the provider instance's configured … | None |
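The `attempt` description says providers may scale per-call timeouts with the retry index. One way a provider might do that is geometric growth with a ceiling, sketched below. The function name `timeout_for_attempt`, the 60-second base, the doubling factor, and the 600-second cap are all hypothetical values chosen for illustration; kuroi's providers may use a different rule.

```python
def timeout_for_attempt(
    attempt: int,
    base_seconds: float = 60.0,   # hypothetical starting timeout
    factor: float = 2.0,          # hypothetical growth per retry
    cap_seconds: float = 600.0,   # hypothetical upper bound
) -> float:
    """Return a per-call timeout that grows with the zero-based retry index."""
    if attempt < 0:
        raise ValueError("attempt must be zero-based and non-negative")
    # Grow geometrically, but never exceed the cap.
    return min(base_seconds * factor**attempt, cap_seconds)


for attempt in range(5):
    print(attempt, timeout_for_attempt(attempt))
# 0 60.0
# 1 120.0
# 2 240.0
# 3 480.0
# 4 600.0
```

Capping the value keeps a long retry chain from waiting indefinitely on a model that is never going to answer.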
Returns:
| Type | Description |
|---|---|
| tuple[list[Finding], list[ChunkRecord]] | A tuple of (redactions, chunk_records). chunk_records audit the prompts and raw responses for each chunk dispatched to the model. |
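To show how the return value and the `attempt` index fit together from the caller's side, here is a small sketch of a retry loop that unpacks the `(redactions, chunk_records)` tuple. It is only an illustration: kuroi's chunker owns the real retry logic, and `FlakyProvider`, `detect_with_retries`, the use of `TimeoutError`, the `"names"` category id, and the string placeholders standing in for `Finding` and `ChunkRecord` objects are all assumptions.

```python
class FlakyProvider:
    """Hypothetical stand-in that times out until its third attempt."""

    def __init__(self):
        self.attempts_seen = []

    def detect_redactions(
        self,
        pages,
        llm_category_ids,
        *,
        instructions=(),
        seed=None,
        attempt=0,
        layout_aware=False,
        model=None,
    ):
        self.attempts_seen.append(attempt)
        if attempt < 2:
            raise TimeoutError(f"attempt {attempt} timed out")
        # Placeholder payloads; a real provider returns Finding and
        # ChunkRecord objects here.
        return ["finding-placeholder"], ["chunk-record-placeholder"]


def detect_with_retries(provider, pages, llm_category_ids, *, max_attempts=3, seed=None):
    """Call detect_redactions, passing the zero-based retry index each time."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            return provider.detect_redactions(
                pages, llm_category_ids, seed=seed, attempt=attempt
            )
        except TimeoutError as exc:  # assumption: timeouts surface this way
            last_error = exc
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error


provider = FlakyProvider()
redactions, chunk_records = detect_with_retries(
    provider, pages=(), llm_category_ids=("names",), seed=42
)
print(provider.attempts_seen)  # [0, 1, 2]
print(len(redactions), len(chunk_records))  # 1 1
```

Passing the same `seed` on every retry keeps the run reproducible, while the changing `attempt` value lets the provider adjust its timeout.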