API Reference

This section contains the auto-generated API documentation from the kanoa source code.

kanoa: AI-powered interpretation of data science outputs.

Core

class kanoa.core.interpreter.AnalyticsInterpreter(backend='gemini', kb_path=None, kb_content=None, api_key=None, max_tokens=3000, enable_caching=True, track_costs=True, system_prompt=None, user_prompt=None, grounding_mode='local', knowledge_base=None, **backend_kwargs)[source]

Bases: object

AI-powered analytics interpreter with multi-backend support.

Supports:
  • Multiple AI backends (vLLM, Gemini, Claude, OpenAI)
  • Knowledge base grounding (text, PDFs, or none)
  • Multiple input types (figures, DataFrames, dicts)
  • Cost tracking and optimization

Install backends with:

pip install kanoa[local]   # vLLM (Molmo, Gemma 3)
pip install kanoa[gemini]  # Google Gemini
pip install kanoa[claude]  # Anthropic Claude
pip install kanoa[openai]  # OpenAI GPT models
pip install kanoa[all]     # All backends

__init__(backend='gemini', kb_path=None, kb_content=None, api_key=None, max_tokens=3000, enable_caching=True, track_costs=True, system_prompt=None, user_prompt=None, grounding_mode='local', knowledge_base=None, **backend_kwargs)[source]

Initialize analytics interpreter.

Parameters:
  • backend (Literal['vllm', 'gemini', 'claude', 'github-copilot', 'openai']) – AI backend to use ('vllm', 'gemini', 'claude', 'github-copilot', 'openai')

  • kb_path (Union[str, Path, None]) – Path to knowledge base directory

  • kb_content (Optional[str]) – Pre-loaded knowledge base string

  • api_key (Optional[str]) – API key for cloud backends (or use env vars)

  • max_tokens (int) – Maximum tokens for response

  • enable_caching (bool) – Enable context caching for cost savings

  • track_costs (bool) – Track token usage and costs

  • system_prompt (Optional[str]) – Custom system prompt template (overrides default). Use {kb_context} placeholder for knowledge base content.

  • user_prompt (Optional[str]) – Custom user prompt template (overrides default). Use {context_block} and {focus_block} placeholders.

  • grounding_mode (str) – Knowledge base grounding strategy: 'local' loads KB files into the prompt context (default, traditional approach); 'rag_engine' uses Vertex AI RAG Engine for semantic retrieval.

  • knowledge_base (Optional[BaseKnowledgeBase]) – BaseKnowledgeBase instance (required if grounding_mode='rag_engine')

  • **backend_kwargs (Any) – Additional backend-specific arguments

Example

>>> # Traditional KB grounding (context stuffing)
>>> interp = AnalyticsInterpreter(
...     kb_path="kbs/papers/",
...     grounding_mode="local"
... )
>>>
>>> # RAG Engine grounding (semantic retrieval)
>>> from kanoa.knowledge_base.vertex_rag import VertexRAGKnowledgeBase
>>> rag_kb = VertexRAGKnowledgeBase(
...     project_id="my-project",
...     corpus_display_name="research-papers"
... )
>>> rag_kb.create_corpus()
>>> rag_kb.import_files("gs://my-bucket/papers/")
>>> interp = AnalyticsInterpreter(
...     grounding_mode="rag_engine",
...     knowledge_base=rag_kb
... )
Raises:
  • ImportError – If the requested backend’s dependencies aren’t installed

  • ValueError – If the backend name is unknown or the grounding_mode is invalid

with_kb(kb_path=None, kb_content=None)[source]

Create a new interpreter instance with a specific knowledge base, sharing the same backend and cost tracking state.

Return type:

AnalyticsInterpreter

Behavior:
  • REPLACES any existing knowledge base.

  • Shares the underlying backend instance (and thus cost stats).

  • Returns a new AnalyticsInterpreter instance.

Example

# Base interpreter (no KB)
interp = AnalyticsInterpreter()

# Specialized interpreter (shares costs with base)
env_interp = interp.with_kb("kbs/environmental")

interpret(fig=None, data=None, context=None, focus=None, include_kb=True, display_result=None, custom_prompt=None, stream=True, **kwargs)[source]

Interpret analytical output using configured backend.

Parameters:
  • fig (Optional[Figure]) – Matplotlib figure to interpret

  • data (Optional[Any]) – DataFrame/dict/other data to interpret

  • context (Optional[str]) – Brief description of the output

  • focus (Optional[str]) – Specific aspects to analyze

  • include_kb (bool) – Whether to include knowledge base context

  • display_result (Optional[bool]) – Auto-display as Markdown in Jupyter. If None, uses kanoa.options.display_result (default: True)

  • custom_prompt (Optional[str]) – Override default prompt template

  • stream (bool) – Whether to stream results (default: True)

  • **kwargs (Any) – Additional backend-specific arguments

Return type:

Union[Iterator[InterpretationChunk], InterpretationResult]

Returns:

Iterator[InterpretationChunk] if stream=True (the default), or InterpretationResult if stream=False.

Raises:

ValueError – If no input (fig, data, context, focus, or custom_prompt) is provided
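
A minimal usage sketch covering both modes. The DataFrame contents and context strings are illustrative, and the import path simply follows the fully qualified class name shown above:

import pandas as pd

from kanoa.core.interpreter import AnalyticsInterpreter

interp = AnalyticsInterpreter(backend="gemini")
df = pd.DataFrame({"month": [1, 2, 3], "rainfall_mm": [120, 95, 143]})

# Streaming (default): iterate over InterpretationChunk objects
for chunk in interp.interpret(data=df, context="Monthly rainfall totals"):
    if chunk.type == "text":
        print(chunk.content, end="")

# Non-streaming: a single InterpretationResult
result = interp.interpret(data=df, focus="Seasonal trend", stream=False)
print(result.text)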

interpret_figure(fig=None, **kwargs)[source]

Convenience method for matplotlib figures.

Return type:

InterpretationResult

interpret_dataframe(df, **kwargs)[source]

Convenience method for DataFrames.

Return type:

InterpretationResult
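
A short sketch of both convenience methods. The toy figure and DataFrame are illustrative; each call returns an InterpretationResult (non-streaming), and extra keyword arguments are assumed to pass through to interpret():

import matplotlib.pyplot as plt
import pandas as pd

from kanoa.core.interpreter import AnalyticsInterpreter

interp = AnalyticsInterpreter()

# Matplotlib figure
fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4], [10, 12, 9, 15])
fig_result = interp.interpret_figure(fig, context="Toy time series")
print(fig_result.text)

# pandas DataFrame
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [10, 12, 9, 15]})
df_result = interp.interpret_dataframe(df, focus="Trend and outliers")
print(df_result.text)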

get_cost_summary()[source]

Get summary of token usage and costs.

Return type:

Dict[str, Any]
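
A quick sketch; the exact keys of the summary dict are backend-dependent and not specified here, so it is printed generically:

from kanoa.core.interpreter import AnalyticsInterpreter

interp = AnalyticsInterpreter(track_costs=True)
# ... run one or more interpret() calls ...

summary = interp.get_cost_summary()
for key, value in summary.items():
    print(f"{key}: {value}")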

get_kb()[source]

Get the active knowledge base.

Return type:

KnowledgeBaseManager

Returns:

The active KnowledgeBaseManager instance.

Raises:

RuntimeError – If no knowledge base has been configured.

reload_knowledge_base()[source]

Reload knowledge base from source.

Return type:

None
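
A small sketch combining get_kb() and reload_knowledge_base(); it assumes only what the signatures above guarantee (a KnowledgeBaseManager instance or a RuntimeError, and an in-place reload):

from kanoa.core.interpreter import AnalyticsInterpreter

interp = AnalyticsInterpreter(kb_path="kbs/papers/")

try:
    kb = interp.get_kb()  # KnowledgeBaseManager for the active KB
except RuntimeError:
    kb = None             # no knowledge base configured

# After editing files under kbs/papers/, pick up the changes
interp.reload_knowledge_base()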

check_kb_cost()[source]

Check the cost/token count of the current knowledge base.

Return type:

Any

Returns:

TokenCheckResult or None if not supported/empty.

get_cache_status()[source]

Check the status of the context cache for the current KB.

Return type:

Dict[str, Any]

Returns:

Dict with cache status details (exists, source, tokens, etc.) or {'exists': False, 'reason': ...} if not supported/found.
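
A sketch of the two inspection helpers. Only the behavior documented above is assumed: check_kb_cost() may return None when unsupported or empty, and get_cache_status() always reports an 'exists' key:

from kanoa.core.interpreter import AnalyticsInterpreter

interp = AnalyticsInterpreter(kb_path="kbs/papers/", enable_caching=True)

# Token/cost footprint of the loaded knowledge base
kb_cost = interp.check_kb_cost()
if kb_cost is not None:
    print(kb_cost)

# Context-cache status for the current KB
status = interp.get_cache_status()
if status.get("exists"):
    print("Cache active:", status)
else:
    print("No cache:", status.get("reason", "n/a"))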

get_prompts()[source]

Get the current prompt templates used by this interpreter.

Returns a dictionary with the active prompt templates:
  • system_prompt: Template for the system instruction (with {kb_context} placeholder)
  • user_prompt: Template for the user prompt (with {context_block} and {focus_block} placeholders)

Example

>>> interp = AnalyticsInterpreter()
>>> prompts = interp.get_prompts()
>>> print(prompts["system_prompt"])
You are an expert data analyst...
Returns:

Dictionary with ‘system_prompt’ and ‘user_prompt’ keys

Return type:

Dict[str, str]

preview_prompt(context=None, focus=None, include_kb=True, custom_prompt=None)[source]

Preview the exact prompt that would be sent to the LLM.

This method builds the complete prompt using the current templates and configuration, allowing you to see exactly what the AI will receive.

Parameters:
  • context (Optional[str]) – Brief description of the analytical output

  • focus (Optional[str]) – Specific aspects to analyze

  • include_kb (bool) – Whether to include knowledge base context

  • custom_prompt (Optional[str]) – Custom prompt to preview (overrides templates)

Example

>>> interp = AnalyticsInterpreter(kb_path="./my_kb")
>>> prompt = interp.preview_prompt(
...     context="Inertial sensor calibration data",
...     focus="Drift compensation and alignment"
... )
>>> print(prompt)
Returns:

The complete rendered prompt string

Return type:

str

set_prompts(system_prompt=None, user_prompt=None)[source]

Update prompt templates at runtime (chainable).

This method allows you to modify the system and/or user prompt templates after the interpreter has been initialized.

Parameters:
  • system_prompt (Optional[str]) – New system prompt template (or None to keep current). Use {kb_context} placeholder for knowledge base content.

  • user_prompt (Optional[str]) – New user prompt template (or None to keep current). Use {context_block} and {focus_block} placeholders.

Example

>>> interp = AnalyticsInterpreter()
>>> interp.set_prompts(
...     user_prompt="Provide exactly 3 bullet points..."
... ).interpret(data=df)
>>> # Chain multiple configuration calls
>>> interp.set_prompts(
...     system_prompt="You are an environmental data scientist..."
... ).with_kb("./conservation_kb")
Return type:

AnalyticsInterpreter

Returns:

Self for method chaining

reset_chat()[source]

Reset conversation history (if supported by backend).

For backends that maintain state (like GitHub Copilot), this clears the active session and chat history. For stateless backends, this is a no-op.

Return type:

None

class kanoa.core.types.GroundingSource(uri, score, text, chunk_id=None)[source]

Bases: object

Source attribution for RAG-grounded responses.

uri

Source document URI (e.g., GCS path).

score

Relevance score (0-1) from semantic retrieval.

text

Excerpt from the source document.

chunk_id

Unique identifier for the retrieved chunk.

uri: str
score: float
text: str
chunk_id: str | None = None
__init__(uri, score, text, chunk_id=None)
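
A sketch of reading source attributions from a RAG-grounded result. The corpus is assumed to already exist and be populated (see the constructor example above); grounding_sources may be None for non-RAG responses:

from kanoa.core.interpreter import AnalyticsInterpreter
from kanoa.knowledge_base.vertex_rag import VertexRAGKnowledgeBase

rag_kb = VertexRAGKnowledgeBase(
    project_id="my-project",
    corpus_display_name="research-papers",
)
interp = AnalyticsInterpreter(grounding_mode="rag_engine", knowledge_base=rag_kb)

result = interp.interpret(context="Habitat fragmentation metrics", stream=False)
for source in result.grounding_sources or []:
    print(f"{source.score:.2f}  {source.uri}")
    print(f"    {source.text[:80]}")
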
class kanoa.core.types.CacheCreationResult(name, created, token_count=0)[source]

Bases: object

Result from cache creation operation.

name: str | None
created: bool
token_count: int = 0
__init__(name, created, token_count=0)
class kanoa.core.types.UsageInfo(input_tokens, output_tokens, cost, cached_tokens=None, cache_created=False, savings=None, model=None, tier=None)[source]

Bases: object

Token usage and cost information.

input_tokens: int
output_tokens: int
cost: float
cached_tokens: int | None = None
cache_created: bool = False
savings: float | None = None
model: str | None = None
tier: str | None = None
property cache_savings: float | None

Calculate estimated cost savings from caching.

Returns the savings amount if available.

__init__(input_tokens, output_tokens, cost, cached_tokens=None, cache_created=False, savings=None, model=None, tier=None)
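
A sketch of reading usage information from a non-streaming result; all fields shown are documented above, and usage may be None when cost tracking is unavailable. interp and df are as in the interpret() sketch:

result = interp.interpret(data=df, stream=False)

if result.usage is not None:
    u = result.usage
    print(f"model={u.model} in={u.input_tokens} out={u.output_tokens} cost=${u.cost:.4f}")
    if u.cache_savings is not None:
        print(f"cache savings: ${u.cache_savings:.4f}")
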
class kanoa.core.types.InterpretationResult(text, backend, usage=None, metadata=None, grounding_sources=None)[source]

Bases: object

Result from interpretation.

Note: This class does NOT implement _repr_markdown_() by design. Display is handled explicitly via display_interpretation() when appropriate. This prevents double-display in streaming mode and makes display side-effects explicit. See docs/source/developer_guide/design_philosophy.md for rationale.

text: str
backend: str
usage: UsageInfo | None = None
metadata: Dict[str, Any] | None = None
grounding_sources: List[GroundingSource] | None = None
__init__(text, backend, usage=None, metadata=None, grounding_sources=None)
class kanoa.core.types.InterpretationChunk(content, type, is_final=False, usage=None, metadata=None)[source]

Bases: object

A chunk of streaming interpretation data.

content: str
type: str
is_final: bool = False
usage: UsageInfo | None = None
metadata: Dict[str, Any] | None = None
__init__(content, type, is_final=False, usage=None, metadata=None)

Backends

Data Types

InterpretationChunk

@dataclass
class InterpretationChunk:
    content: str               # Text delta or status message
    type: str                  # "text", "status", "usage", or "meta"
    is_final: bool = False     # True if this is the last chunk
    usage: Optional[UsageInfo] = None
    metadata: Optional[Dict[str, Any]] = None
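
A sketch of dispatching on chunk.type while consuming a stream. The four type values come from the comment above; the handling of each is illustrative, with interp and df as in the interpret() sketch:

pieces = []
for chunk in interp.interpret(data=df, context="Quarterly KPIs"):
    if chunk.type == "text":
        pieces.append(chunk.content)
    elif chunk.type == "status":
        print(f"[status] {chunk.content}")
    elif chunk.type == "usage" and chunk.usage is not None:
        print(f"[usage] cost=${chunk.usage.cost:.4f}")
    if chunk.is_final:
        break

full_text = "".join(pieces)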

InterpretationResult

@dataclass
class InterpretationResult:
    text: str
    backend: str
    usage: Optional[UsageInfo] = None
    metadata: Optional[Dict[str, Any]] = None
    grounding_sources: Optional[List[GroundingSource]] = None

UsageInfo
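
The fields below mirror the kanoa.core.types.UsageInfo attributes documented above (the cache_savings property is omitted):

@dataclass
class UsageInfo:
    input_tokens: int
    output_tokens: int
    cost: float
    cached_tokens: Optional[int] = None
    cache_created: bool = False
    savings: Optional[float] = None
    model: Optional[str] = None
    tier: Optional[str] = None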

Knowledge Base