Streaming API

kanoa uses a streaming-first architecture to provide real-time feedback during long-running interpretation tasks. This guide explains how to work with the streaming interface effectively.

Core Concepts

The AnalyticsInterpreter.interpret method returns an iterator of InterpretationChunk objects by default (stream=True).

InterpretationChunk

Each chunk yielded by the stream has the following structure:

from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class InterpretationChunk:
    content: str               # The actual data (text delta or status message)
    type: str                  # "text", "status", "usage", or "meta"
    is_final: bool = False     # True if this is the last chunk
    usage: Optional[UsageInfo] = None
    metadata: Optional[Dict] = None

  • type="text": A partial text update from the LLM. Concatenate these to build the full response.

  • type="status": A status update (e.g., “Connecting…”, “Generating…”). Useful for UI spinners or logs.

  • type="usage": Emitted at the end of the stream, containing token usage and cost information.
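As an illustration, the dispatch over chunk types can be exercised against a hand-built stream. The UsageInfo stand-in and all chunk values below are made up for the example; only the chunk structure mirrors the dataclass above:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UsageInfo:                      # minimal stand-in for kanoa's UsageInfo
    cost: float

@dataclass
class InterpretationChunk:            # mirrors the structure shown above
    content: str
    type: str
    is_final: bool = False
    usage: Optional[UsageInfo] = None

def handle(chunks):
    """Route chunks by type: statuses to a log, text deltas into a buffer."""
    statuses, parts, cost = [], [], None
    for chunk in chunks:
        if chunk.type == "status":
            statuses.append(chunk.content)
        elif chunk.type == "text":
            parts.append(chunk.content)
        elif chunk.type == "usage" and chunk.usage is not None:
            cost = chunk.usage.cost
    return "".join(parts), statuses, cost

stream = [
    InterpretationChunk("Connecting...", "status"),
    InterpretationChunk("The figure shows ", "text"),
    InterpretationChunk("a clear upward trend.", "text"),
    InterpretationChunk("", "usage", is_final=True, usage=UsageInfo(0.0042)),
]
text, statuses, cost = handle(stream)
```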

Basic Usage

Streaming (Default)

The most efficient way to use kanoa is to consume the stream directly:

iterator = interpreter.interpret(fig=plt.gcf())

print("Status:", end=" ")
for chunk in iterator:
    if chunk.type == "status":
        print(f"[{chunk.content}]", end=" ", flush=True)
    elif chunk.type == "text":
        print(chunk.content, end="", flush=True)
    elif chunk.type == "usage":
        print(f"\nTotal Cost: ${chunk.usage.cost:.4f}")

Blocking (Legacy Behavior)

If you prefer a single result object after generation is complete, pass stream=False:

result = interpreter.interpret(fig=plt.gcf(), stream=False)
print(result.text)
print(f"Cost: ${result.usage.cost:.4f}")

Note: stream=False simply consumes the iterator internally and aggregates the result. It does not disable streaming at the API level.
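That aggregation can be sketched as a small wrapper over the iterator. InterpretationResult here is a hypothetical stand-in, not kanoa's actual result type:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class InterpretationResult:  # hypothetical stand-in for kanoa's result object
    text: str
    usage: Optional[object] = None

def consume(chunks):
    """Aggregate a chunk stream the way stream=False plausibly does."""
    parts, usage = [], None
    for chunk in chunks:
        if chunk.type == "text":
            parts.append(chunk.content)   # join the text deltas
        elif chunk.type == "usage":
            usage = chunk.usage           # keep the final usage/cost record
    return InterpretationResult(text="".join(parts), usage=usage)
```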

Jupyter Notebooks

In Jupyter notebooks, kanoa automatically handles streaming display if display_result=True (default).

To get the live display without writing a loop yourself, pass stream=False:

# In a notebook cell
interpreter.interpret(fig=plt.gcf(), stream=False)
# Output updates in real-time below the cell, and returns final result object

With stream=True (the default), you must consume the returned iterator to drive the streaming display:

# In a notebook cell
for _ in interpreter.interpret(fig=plt.gcf()):
    pass
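When you only need the side effect of the streaming display, the standard-library idiom collections.deque with maxlen=0 exhausts an iterator with minimal overhead, equivalent to the `for _ in ...: pass` loop above:

```python
from collections import deque

def drain(iterator):
    """Exhaust an iterator, discarding its items."""
    deque(iterator, maxlen=0)

# In a notebook cell (assuming an `interpreter` instance exists):
# drain(interpreter.interpret(fig=plt.gcf()))
```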

Backend Support

All supported backends implement the streaming interface:

  • Gemini: Supports full text streaming.

  • Claude: Supports full text streaming.

  • OpenAI: Supports full text streaming.

  • vLLM: Supports full text streaming.