API Reference
This section contains the auto-generated API documentation from the kanoa source code.
kanoa: AI-powered interpretation of data science outputs.
Core
- class kanoa.core.interpreter.AnalyticsInterpreter(backend='gemini', kb_path=None, kb_content=None, api_key=None, max_tokens=3000, enable_caching=True, track_costs=True, system_prompt=None, user_prompt=None, grounding_mode='local', knowledge_base=None, **backend_kwargs)[source]
Bases: object
AI-powered analytics interpreter with multi-backend support.
Supports:
- Multiple AI backends (vLLM, Gemini, Claude, OpenAI)
- Knowledge base grounding (text, PDFs, or none)
- Multiple input types (figures, DataFrames, dicts)
- Cost tracking and optimization
- Install backends with:
pip install kanoa[local]   # vLLM (Molmo, Gemma 3)
pip install kanoa[gemini]  # Google Gemini
pip install kanoa[claude]  # Anthropic Claude
pip install kanoa[openai]  # OpenAI GPT models
pip install kanoa[all]     # All backends
- __init__(backend='gemini', kb_path=None, kb_content=None, api_key=None, max_tokens=3000, enable_caching=True, track_costs=True, system_prompt=None, user_prompt=None, grounding_mode='local', knowledge_base=None, **backend_kwargs)[source]
Initialize analytics interpreter.
- Parameters:
  - backend (Literal['vllm', 'gemini', 'claude', 'github-copilot', 'openai']) – AI backend to use ('vllm', 'gemini', 'claude', 'openai')
  - kb_path (Union[str, Path, None]) – Path to knowledge base directory
  - kb_content (Optional[str]) – Pre-loaded knowledge base string
  - api_key (Optional[str]) – API key for cloud backends (or use env vars)
  - max_tokens (int) – Maximum tokens for response
  - enable_caching (bool) – Enable context caching for cost savings
  - track_costs (bool) – Track token usage and costs
  - system_prompt (Optional[str]) – Custom system prompt template (overrides default). Use the {kb_context} placeholder for knowledge base content.
  - user_prompt (Optional[str]) – Custom user prompt template (overrides default). Use the {context_block} and {focus_block} placeholders.
  - grounding_mode (str) – Knowledge base grounding strategy. 'local': load KB files into context (default, traditional approach); 'rag_engine': use Vertex AI RAG Engine for semantic retrieval
  - knowledge_base (Optional[BaseKnowledgeBase]) – BaseKnowledgeBase instance (required if grounding_mode='rag_engine')
  - **backend_kwargs (Any) – Additional backend-specific arguments
Example
>>> # Traditional KB grounding (context stuffing)
>>> interp = AnalyticsInterpreter(
...     kb_path="kbs/papers/",
...     grounding_mode="local"
... )
>>>
>>> # RAG Engine grounding (semantic retrieval)
>>> from kanoa.knowledge_base.vertex_rag import VertexRAGKnowledgeBase
>>> rag_kb = VertexRAGKnowledgeBase(
...     project_id="my-project",
...     corpus_display_name="research-papers"
... )
>>> rag_kb.create_corpus()
>>> rag_kb.import_files("gs://my-bucket/papers/")
>>> interp = AnalyticsInterpreter(
...     grounding_mode="rag_engine",
...     knowledge_base=rag_kb
... )
- Raises:
ImportError – If the requested backend’s dependencies aren’t installed
ValueError – If the backend name is unknown or grounding_mode is invalid
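If the optional dependencies for a backend are missing, construction fails with ImportError. A minimal handling sketch (the import path follows the class path shown above; the backend name is just an example):
>>> from kanoa.core.interpreter import AnalyticsInterpreter
>>> try:
...     interp = AnalyticsInterpreter(backend="claude")
... except ImportError:
...     print("Install the backend extras first: pip install kanoa[claude]")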
- with_kb(kb_path=None, kb_content=None)[source]
Create a new interpreter instance with a specific knowledge base, sharing the same backend and cost tracking state.
- Return type:
  AnalyticsInterpreter
- Behavior:
REPLACES any existing knowledge base.
Shares the underlying backend instance (and thus cost stats).
Returns a new AnalyticsInterpreter instance.
Example
# Base interpreter (no KB)
interp = AnalyticsInterpreter()

# Specialized interpreter (shares costs with base)
env_interp = interp.with_kb("kbs/environmental")
- interpret(fig=None, data=None, context=None, focus=None, include_kb=True, display_result=None, custom_prompt=None, stream=True, **kwargs)[source]
Interpret analytical output using configured backend.
- Parameters:
  - data (Optional[Any]) – DataFrame/dict/other data to interpret
  - include_kb (bool) – Whether to include knowledge base context
  - display_result (Optional[bool]) – Auto-display as Markdown in Jupyter. If None, uses kanoa.options.display_result (default: True)
  - custom_prompt (Optional[str]) – Override default prompt template
  - stream (bool) – Whether to stream results (default: True)
  - **kwargs (Any) – Additional backend-specific arguments
- Return type:
  Union[Iterator[InterpretationChunk], InterpretationResult]
- Returns:
  Iterator[InterpretationChunk] if stream=True (default); InterpretationResult if stream=False
- Raises:
ValueError – If no input (fig, data, context, focus, or custom_prompt) is provided
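A minimal usage sketch, assuming an existing AnalyticsInterpreter instance interp and a pandas DataFrame df; chunk and result fields follow the dataclasses documented under Data Types below:
>>> # Streaming (default): print text deltas as they arrive
>>> for chunk in interp.interpret(data=df, focus="outliers and trends"):
...     print(chunk.content, end="")
>>>
>>> # Non-streaming: a single InterpretationResult
>>> result = interp.interpret(data=df, focus="outliers and trends", stream=False)
>>> print(result.text)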
- interpret_figure(fig=None, **kwargs)[source]
Convenience method for matplotlib figures.
- Return type:
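A short sketch with a matplotlib figure; extra keyword arguments such as focus are assumed to be forwarded to interpret():
>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots()
>>> _ = ax.plot([1, 2, 3, 4], [1.2, 2.1, 2.9, 4.2])
>>> for chunk in interp.interpret_figure(fig=fig, focus="linearity of the trend"):
...     print(chunk.content, end="")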
- get_kb()[source]
Get the active knowledge base.
- Return type:
  KnowledgeBaseManager
- Returns:
  The active KnowledgeBaseManager instance.
- Raises:
RuntimeError – If no knowledge base has been configured.
- check_kb_cost()[source]
Check the cost/token count of the current knowledge base.
- Return type:
  Optional[TokenCheckResult]
- Returns:
  TokenCheckResult, or None if not supported or the knowledge base is empty.
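A small sketch of checking knowledge base size before interpreting (the printed representation of TokenCheckResult is whatever the class provides; no specific fields are assumed):
>>> interp = AnalyticsInterpreter(kb_path="kbs/papers/")
>>> check = interp.check_kb_cost()
>>> if check is not None:
...     print(check)  # token count / cost estimate for the loaded KB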
- get_prompts()[source]
Get the current prompt templates used by this interpreter.
Returns a dictionary with the active prompt templates:
- system_prompt: Template for the system instruction (with {kb_context} placeholder)
- user_prompt: Template for the user prompt (with {context_block} and {focus_block} placeholders)
Example
>>> interp = AnalyticsInterpreter()
>>> prompts = interp.get_prompts()
>>> print(prompts["system_prompt"])
You are an expert data analyst...
- preview_prompt(context=None, focus=None, include_kb=True, custom_prompt=None)[source]
Preview the exact prompt that would be sent to the LLM.
This method builds the complete prompt using the current templates and configuration, allowing you to see exactly what the AI will receive.
- Parameters:
Example
>>> interp = AnalyticsInterpreter(kb_path="./my_kb")
>>> prompt = interp.preview_prompt(
...     context="Inertial sensor calibration data",
...     focus="Drift compensation and alignment"
... )
>>> print(prompt)
- Returns:
The complete rendered prompt string
- Return type:
  str
- set_prompts(system_prompt=None, user_prompt=None)[source]
Update prompt templates at runtime (chainable).
This method allows you to modify the system and/or user prompt templates after the interpreter has been initialized.
- Parameters:
  - system_prompt – New system prompt template (uses the {kb_context} placeholder)
  - user_prompt – New user prompt template (uses the {context_block} and {focus_block} placeholders)
Example
>>> interp = AnalyticsInterpreter()
>>> interp.set_prompts(
...     user_prompt="Provide exactly 3 bullet points..."
... ).interpret(data=df)

>>> # Chain multiple configuration calls
>>> interp.set_prompts(
...     system_prompt="You are an environmental data scientist..."
... ).with_kb("./conservation_kb")
- Return type:
  AnalyticsInterpreter
- Returns:
  Self, for method chaining
- class kanoa.core.types.GroundingSource(uri, score, text, chunk_id=None)[source]
Bases: object
Source attribution for RAG-grounded responses.
- uri
Source document URI (e.g., GCS path).
- score
Relevance score (0-1) from semantic retrieval.
- text
Excerpt from the source document.
- chunk_id
Unique identifier for the retrieved chunk.
- __init__(uri, score, text, chunk_id=None)
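A short sketch of reading source attributions from a non-streaming result, assuming an interpreter interp configured with grounding_mode='rag_engine' and a DataFrame df; fields are as documented above:
>>> result = interp.interpret(data=df, stream=False)
>>> for src in result.grounding_sources or []:
...     print(f"{src.score:.2f}  {src.uri}")
...     print(f"    {src.text[:80]}...")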
- class kanoa.core.types.CacheCreationResult(name, created, token_count=0)[source]
Bases: object
Result from cache creation operation.
- __init__(name, created, token_count=0)
- class kanoa.core.types.UsageInfo(input_tokens, output_tokens, cost, cached_tokens=None, cache_created=False, savings=None, model=None, tier=None)[source]
Bases: object
Token usage and cost information.
- property cache_savings: float | None
Calculate estimated cost savings from caching.
Returns the savings amount if available.
- __init__(input_tokens, output_tokens, cost, cached_tokens=None, cache_created=False, savings=None, model=None, tier=None)
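A minimal sketch of inspecting usage on a non-streaming result (assumes interp and df as in the earlier examples; field names follow the __init__ signature above):
>>> result = interp.interpret(data=df, stream=False)
>>> usage = result.usage
>>> if usage is not None:
...     print(usage.input_tokens, usage.output_tokens, usage.cost)
...     print(usage.cache_savings)  # None when savings can't be estimated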
- class kanoa.core.types.InterpretationResult(text, backend, usage=None, metadata=None, grounding_sources=None)[source]
Bases: object
Result from interpretation.
Note: This class does NOT implement _repr_markdown_() by design. Display is handled explicitly via display_interpretation() when appropriate. This prevents double-display in streaming mode and makes display side-effects explicit. See docs/source/developer_guide/design_philosophy.md for rationale.
- grounding_sources: List[GroundingSource] | None = None
- __init__(text, backend, usage=None, metadata=None, grounding_sources=None)
Backends
Data Types
InterpretationChunk
@dataclass
class InterpretationChunk:
content: str # Text delta or status message
type: str # "text", "status", "usage", or "meta"
is_final: bool = False # True if this is the last chunk
usage: Optional[UsageInfo] = None
metadata: Optional[Dict[str, Any]] = None
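A sketch of consuming a chunk stream, assuming an existing interpreter interp and DataFrame df; it keys off the type and is_final fields documented above:
# Collect streamed text deltas and capture the final usage info
text_parts = []
final_usage = None
for chunk in interp.interpret(data=df):
    if chunk.type == "text":
        text_parts.append(chunk.content)
    if chunk.is_final:
        final_usage = chunk.usage

full_text = "".join(text_parts)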
InterpretationResult
@dataclass
class InterpretationResult:
    text: str
    backend: str
    usage: Optional[UsageInfo] = None
    metadata: Optional[Dict[str, Any]] = None
    grounding_sources: Optional[List[GroundingSource]] = None