Streaming API

kanoa uses a streaming-first architecture to provide real-time feedback during long-running interpretation tasks. This guide explains how to work with the streaming interface effectively.

Core Concepts

The AnalyticsInterpreter.interpret method returns an iterator of InterpretationChunk objects by default (stream=True).

InterpretationChunk

Each chunk yielded by the stream has the following structure:

from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class InterpretationChunk:
    content: str               # The actual data (text delta or status message)
    type: str                  # "text", "status", "usage", or "meta"
    is_final: bool = False     # True if this is the last chunk
    usage: Optional[UsageInfo] = None
    metadata: Optional[Dict] = None

  • type="text": A partial text update from the LLM. Concatenate these to build the full response.

  • type="status": A status update (e.g., “Connecting…”, “Generating…”). Useful for UI spinners or logs.

  • type="usage": Emitted at the end of the stream, containing token usage and cost information.
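As an illustration, the dispatch over chunk types can be exercised against a hand-built stream. The UsageInfo stand-in and all chunk values below are made up for the example; only the chunk structure mirrors the dataclass above:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UsageInfo:                      # minimal stand-in for kanoa's UsageInfo
    cost: float

@dataclass
class InterpretationChunk:            # mirrors the structure shown above
    content: str
    type: str
    is_final: bool = False
    usage: Optional[UsageInfo] = None

def handle(chunks):
    """Route chunks by type: statuses to a log, text deltas into a buffer."""
    statuses, parts, cost = [], [], None
    for chunk in chunks:
        if chunk.type == "status":
            statuses.append(chunk.content)
        elif chunk.type == "text":
            parts.append(chunk.content)
        elif chunk.type == "usage" and chunk.usage is not None:
            cost = chunk.usage.cost
    return "".join(parts), statuses, cost

stream = [
    InterpretationChunk("Connecting...", "status"),
    InterpretationChunk("The figure shows ", "text"),
    InterpretationChunk("a clear upward trend.", "text"),
    InterpretationChunk("", "usage", is_final=True, usage=UsageInfo(0.0042)),
]
text, statuses, cost = handle(stream)
```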

Basic Usage

Streaming (Default)

The most efficient way to use kanoa is to consume the stream directly:

iterator = interpreter.interpret(fig=plt.gcf())

print("Status:", end=" ")
for chunk in iterator:
    if chunk.type == "status":
        print(f"[{chunk.content}]", end=" ", flush=True)
    elif chunk.type == "text":
        print(chunk.content, end="", flush=True)
    elif chunk.type == "usage":
        print(f"\nTotal Cost: ${chunk.usage.cost:.4f}")

Blocking (Legacy Behavior)

If you prefer a single result object after generation is complete, pass stream=False:

result = interpreter.interpret(fig=plt.gcf(), stream=False)
print(result.text)
print(f"Cost: ${result.usage.cost:.4f}")

Note: stream=False simply consumes the iterator internally and aggregates the result. It does not disable streaming at the API level.
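That aggregation can be sketched as a small wrapper over the iterator. InterpretationResult here is a hypothetical stand-in, not kanoa's actual result type:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class InterpretationResult:  # hypothetical stand-in for kanoa's result object
    text: str
    usage: Optional[object] = None

def consume(chunks):
    """Aggregate a chunk stream the way stream=False plausibly does."""
    parts, usage = [], None
    for chunk in chunks:
        if chunk.type == "text":
            parts.append(chunk.content)   # join the text deltas
        elif chunk.type == "usage":
            usage = chunk.usage           # keep the final usage/cost record
    return InterpretationResult(text="".join(parts), usage=usage)
```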

Jupyter Notebooks

In Jupyter notebooks, kanoa automatically handles streaming display if display_result=True (default).

To get the live display without writing a loop yourself, pass stream=False:

# In a notebook cell
interpreter.interpret(fig=plt.gcf(), stream=False)
# Output updates in real-time below the cell, and returns final result object

With stream=True (the default), you must consume the returned iterator to drive the streaming display:

# In a notebook cell
for _ in interpreter.interpret(fig=plt.gcf()):
    pass
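When you only need the side effect of the streaming display, the standard-library idiom collections.deque with maxlen=0 exhausts an iterator with minimal overhead, equivalent to the `for _ in ...: pass` loop above:

```python
from collections import deque

def drain(iterator):
    """Exhaust an iterator, discarding its items."""
    deque(iterator, maxlen=0)

# In a notebook cell (assuming an `interpreter` instance exists):
# drain(interpreter.interpret(fig=plt.gcf()))
```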

Backend Support

All supported backends implement the streaming interface:

  • Gemini: Supports full text streaming.

  • Claude: Supports full text streaming.

  • OpenAI: Supports full text streaming.

  • vLLM: Supports full text streaming.