SSE Events¶

This page is the wire-format reference for every event the IndoxHub streaming pipeline emits. For a higher-level walkthrough of how to make a streaming call, see Streaming.

Event-frame anatomy¶

Every frame follows the W3C SSE spec. There are two flavours:

| Shape | Has `event:` line? | What it carries |
| --- | --- | --- |
| Named event | yes, e.g. `event: usage_start` | IndoxHub-injected accounting / lifecycle signals |
| Data-only frame | no | Token deltas, finish markers, error frames, and the OpenAI Responses-API envelope |

Both shapes always include a data: line whose value is JSON (except the final data: [DONE] terminator, which is literal).
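As a rough illustration of the two shapes, the sketch below groups already-decoded lines into (event name, data) pairs. `iter_frames` and `_value` are hypothetical helpers, not part of any IndoxHub SDK, and the sketch skips the `id:` and `retry:` fields and comment lines, none of which appear in the examples on this page.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def _value(line: str, field: str) -> str:
    """Strip 'field:' and at most one following space, as the SSE format specifies."""
    value = line[len(field) + 1:]
    return value[1:] if value.startswith(" ") else value


def iter_frames(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (event_name, data) per frame; event_name is None for data-only frames."""
    event: Optional[str] = None
    data: List[str] = []
    for line in lines:
        if line == "":  # a blank line terminates the current frame
            if data:
                yield event, "\n".join(data)
            event, data = None, []
        elif line.startswith("event:"):
            event = _value(line, "event")
        elif line.startswith("data:"):
            data.append(_value(line, "data"))


sample_lines = [
    "event: usage_start",
    'data: {"type":"usage_start","input_tokens":15}',
    "",
    'data: {"type":"content","data":"Hello"}',
    "",
    "data: [DONE]",
    "",
]
frames = list(iter_frames(sample_lines))
# frames[0] is a named event, frames[1] is data-only, frames[2] is the literal terminator
```

Callers still need to compare the data string against `[DONE]` before passing it to a JSON parser.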

`usage_start` — once, near the start¶

Emitted as soon as the upstream provider acknowledges the request. Useful for driving a "thinking…" UI state that shows the actual prompt-token count.

event: usage_start
data: {"type":"usage_start","request_id":"req-1","provider":"openai","model":"gpt-4o-mini","input_tokens":15}

| Field | Type | Notes |
| --- | --- | --- |
| `request_id` | string | Mirrors the `X-Request-ID` response header. |
| `provider` | string | One of: `openai`, `anthropic`, `google`, `bedrock`, `deepseek`, `xai`, `mistral`, `qwen`, `huggingface`. |
| `model` | string | The provider-side model id. |
| `input_tokens` | int | Prompt tokens, as reported by the provider's first SSE event. May be `0` for providers that report tokens only at end of stream. |
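Because `input_tokens` can legitimately be `0` at this point, a UI that shows the count may want to fall back to a plain placeholder. A minimal sketch, where `start_status` and the wording of the status line are purely illustrative:

```python
def start_status(payload: dict) -> str:
    """Build a placeholder status line from a usage_start payload."""
    label = f"{payload['provider']}/{payload['model']}"
    tokens = payload.get("input_tokens", 0)
    if tokens > 0:
        return f"thinking… ({label}, {tokens} prompt tokens)"
    # No count reported yet; the confirmed figure arrives later in usage_final.
    return f"thinking… ({label})"


start = {"type": "usage_start", "request_id": "req-1", "provider": "openai",
         "model": "gpt-4o-mini", "input_tokens": 15}
status_with_count = start_status(start)
status_without_count = start_status({**start, "input_tokens": 0})
```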

Content frames — many¶

Standard token deltas. No event: line; just data:.

data: {"type":"content","data":"Hello","provider":"openai","choices":[{"delta":{"content":"Hello"},"index":0,"finish_reason":null}]}

The choices shape mirrors OpenAI's chat.completion.chunk for SDK compatibility. The flatter type / data / provider fields are added by IndoxHub for provider-agnostic clients.
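Since the same text appears in both places, a client can read whichever shape it prefers. The sketch below tries the flat `data` field first and falls back to the OpenAI-style `choices` array; `extract_text` is a hypothetical helper, and the order of preference is an arbitrary choice.

```python
import json


def extract_text(payload: dict) -> str:
    """Return the token delta carried by a content frame, or "" for other frames."""
    if payload.get("type") == "content" and isinstance(payload.get("data"), str):
        return payload["data"]  # flat field added by IndoxHub
    # Fall back to the chat.completion.chunk-style shape.
    for choice in payload.get("choices") or []:
        text = (choice.get("delta") or {}).get("content")
        if text:
            return text
    return ""


content_frame = json.loads(
    '{"type":"content","data":"Hello","provider":"openai",'
    '"choices":[{"delta":{"content":"Hello"},"index":0,"finish_reason":null}]}'
)
openai_only = {"choices": [{"delta": {"content": "Hi"}, "index": 0, "finish_reason": None}]}
finish_frame = {"type": "finish", "finish_reason": "stop",
                "choices": [{"delta": {}, "index": 0, "finish_reason": "stop"}]}
texts = [extract_text(content_frame), extract_text(openai_only), extract_text(finish_frame)]
```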

`finish` — once, before the terminal frames¶

Stream-level finish marker. Carries the upstream provider's finish_reason translated to a normalized vocabulary.

data: {"type":"finish","provider":"openai","finish_reason":"stop","choices":[{"delta":{},"index":0,"finish_reason":"stop"}]}

Possible finish_reason values: stop, length, content_filter, tool_calls, function_call, error.
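One way a client might act on these values is to bucket them into outcomes. The grouping below (which reasons count as complete, truncated, or failed) is an assumption made for illustration, not something the gateway defines, and `classify_finish` is a hypothetical helper.

```python
KNOWN_FINISH_REASONS = {
    "stop", "length", "content_filter", "tool_calls", "function_call", "error",
}


def classify_finish(reason: str) -> str:
    """Map a normalized finish_reason to a coarse client-side outcome."""
    if reason in ("stop", "tool_calls", "function_call"):
        return "complete"   # the model ended its turn on its own
    if reason in ("length", "content_filter"):
        return "truncated"  # output was cut short
    if reason == "error":
        return "failed"
    return "unknown"        # tolerate values added in the future


outcomes = {reason: classify_finish(reason) for reason in sorted(KNOWN_FINISH_REASONS)}
```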

`usage_final` — once, before [DONE]¶

The accounting event. Carries totals so clients never need a second API call to bill or log.

event: usage_final
data: {"type":"usage_final","request_id":"req-1","provider":"openai","model":"gpt-4o-mini","input_tokens":15,"output_tokens":1,"cost_usd":2.85e-06,"latency_ms":4693}

| Field | Type | Notes |
| --- | --- | --- |
| `input_tokens` | int | Final prompt-token count after the provider has confirmed it. |
| `output_tokens` | int | Generated tokens through end of stream (does not include `[DONE]`). |
| `cost_usd` | float | Computed against the IndoxHub pricing registry. `0.0` if pricing is missing; a missing price never blocks the frame. |
| `latency_ms` | int | Wall-clock time from request acceptance to the terminal frame. |

This event fires even if the upstream stream ends without a usage chunk — the injector tracks tokens locally as a fallback.
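For logging or billing, the payload maps naturally onto a small record type. The sketch below is one possible shape: `UsageFinal` and `parse_usage_final` are hypothetical names, and the field list is taken from the example frame above.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class UsageFinal:
    request_id: str
    provider: str
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: float
    latency_ms: int

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


def parse_usage_final(data: str) -> UsageFinal:
    """Parse the data: value of a usage_final frame, ignoring unknown keys."""
    payload = json.loads(data)
    return UsageFinal(**{f.name: payload[f.name] for f in fields(UsageFinal)})


usage = parse_usage_final(
    '{"type":"usage_final","request_id":"req-1","provider":"openai",'
    '"model":"gpt-4o-mini","input_tokens":15,"output_tokens":1,'
    '"cost_usd":2.85e-06,"latency_ms":4693}'
)
cost_display = f"${usage.cost_usd:.8f}"
```

Per-request costs can be far below a cent, so formatting with enough decimal places (eight here) avoids rounding small values away.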

`response.done` — once, after `usage_final`¶

OpenAI Responses-API envelope for clients that mirror that shape. Contains the same usage totals as usage_final for redundancy, under the field names prompt_tokens, completion_tokens, and total_tokens.

data: {"type":"response.done","response":{"id":"req-1","object":"response","status":"completed","usage":{"prompt_tokens":15,"completion_tokens":1,"total_tokens":16}}}
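A client that consumes both frames can cross-check them. The sketch below assumes the field mapping implied by the two examples (`input_tokens` to `prompt_tokens`, `output_tokens` to `completion_tokens`, and their sum to `total_tokens`); `totals_match` is a hypothetical helper.

```python
def totals_match(usage_final: dict, response_done: dict) -> bool:
    """Return True when response.done repeats the usage_final token totals."""
    usage = response_done.get("response", {}).get("usage", {})
    prompt = usage_final.get("input_tokens")
    completion = usage_final.get("output_tokens")
    if prompt is None or completion is None:
        return False
    return (
        usage.get("prompt_tokens") == prompt
        and usage.get("completion_tokens") == completion
        and usage.get("total_tokens") == prompt + completion
    )


final = {"type": "usage_final", "request_id": "req-1", "input_tokens": 15, "output_tokens": 1}
done = {"type": "response.done", "response": {"id": "req-1", "object": "response",
        "status": "completed",
        "usage": {"prompt_tokens": 15, "completion_tokens": 1, "total_tokens": 16}}}
consistent = totals_match(final, done)
```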

`[DONE]` — terminator¶

data: [DONE]

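A literal sentinel rather than JSON, following the convention used by OpenAI's streaming API to signal end-of-stream over a long-lived HTTP connection. The SSE spec itself defines no terminator frame, so clients must check for this value before parsing the payload.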

`error` — fired in place of `usage_final` on upstream failure¶

data: {"type":"error","data":"Provider returned 502 Bad Gateway","provider":"openai"}

When this fires, no usage_final arrives. Clients should treat the stream as finished after the next [DONE].
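The rule above can be sketched as a small outcome check over the frames of one stream. `stream_outcome` is a hypothetical helper; it takes already-parsed payloads plus the literal `"[DONE]"` string, and the status names are arbitrary.

```python
from typing import Iterable, Optional, Union


def stream_outcome(items: Iterable[Union[dict, str]]) -> dict:
    """Summarize a stream as ok, error, or incomplete."""
    error: Optional[str] = None
    usage: Optional[dict] = None
    done = False
    for item in items:
        if isinstance(item, str):
            if item == "[DONE]":
                done = True
                break  # nothing meaningful follows the terminator
            continue
        kind = item.get("type")
        if kind == "error":
            error = item.get("data")
        elif kind == "usage_final":
            usage = item
    if error is not None:
        status = "error"       # no usage_final is expected in this case
    elif done and usage is not None:
        status = "ok"
    else:
        status = "incomplete"  # connection dropped before the terminal frames
    return {"status": status, "error": error, "usage": usage}


failed = stream_outcome([
    {"type": "content", "data": "Hi"},
    {"type": "error", "data": "Provider returned 502 Bad Gateway", "provider": "openai"},
    "[DONE]",
])
succeeded = stream_outcome([
    {"type": "content", "data": "Hi"},
    {"type": "finish", "finish_reason": "stop"},
    {"type": "usage_final", "input_tokens": 15, "output_tokens": 1},
    "[DONE]",
])
```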

Reserved (not emitted today)¶

The injector is wired to support these events, but no production codepath emits them yet. Documented so clients can ignore them safely:

| Event | Future use |
| --- | --- |
| `rate_limit_warning` | Mid-stream warning when the user is within 10% of their per-minute limit. |
| `cache_hit` | Sent when a response was served from the prompt cache (not a stream cache: the first token came from a cached upstream prefix). |
| `provider_fallback` | Sent when the gateway transparently retried on a different provider. |
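Ignoring these safely mostly means not failing on an event name the client has no handler for. The sketch below assumes the reserved events would arrive as named events, which this page does not state; `HANDLERS` and `dispatch` are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

seen: List[Tuple[str, int]] = []

# Only the named events the client cares about get a handler.
HANDLERS: Dict[str, Callable[[dict], None]] = {
    "usage_start": lambda p: seen.append(("start", p["input_tokens"])),
    "usage_final": lambda p: seen.append(("final", p["output_tokens"])),
}


def dispatch(event_name: Optional[str], payload: dict) -> bool:
    """Run the handler for a named event; return False if none is registered."""
    handler = HANDLERS.get(event_name) if event_name else None
    if handler is None:
        return False  # unknown or reserved event: skip it instead of raising
    handler(payload)
    return True


results = [
    dispatch("usage_start", {"input_tokens": 15}),
    dispatch("cache_hit", {"anything": True}),
    dispatch("provider_fallback", {}),
    dispatch("usage_final", {"output_tokens": 1}),
]
```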

Complete parser¶

This Python example handles every event type above except response.done, which it parses and ignores. It prints content as it arrives and logs the usage_final totals at the end:

import json

import requests

response = requests.post(
    "https://api.indoxhub.com/api/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "openai/gpt-4o-mini",
        "messages": [{"role": "user", "content": "Write one short sentence."}],
        "stream": True,
    },
    stream=True,
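)
response.raise_for_status()  # fail fast on HTTP errors before reading the stream
# SSE payloads are UTF-8. Without a charset in Content-Type, requests assumes
# ISO-8859-1 for text/* responses, which would garble non-ASCII tokens.
response.encoding = "utf-8"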

current_event, content = None, ""
for raw in response.iter_lines(decode_unicode=True):
    if not raw:
        current_event = None  # blank line ends a frame
        continue
    if raw.startswith("event: "):
        current_event = raw[7:]
        continue
    if not raw.startswith("data: "):
        continue
    if raw == "data: [DONE]":
        break

    payload = json.loads(raw[6:])
    if current_event == "usage_start":
        print(f"[start] in_tokens={payload['input_tokens']}")
    elif current_event == "usage_final":
        print(f"[final] cost=${payload['cost_usd']:.6f} "
              f"in={payload['input_tokens']} out={payload['output_tokens']} "
              f"latency={payload['latency_ms']}ms")
    elif payload.get("type") == "content":
        chunk = payload.get("data", "")
        content += chunk
        print(chunk, end="", flush=True)
    elif payload.get("type") == "finish":
        print(f"\n[finish] reason={payload['finish_reason']}")
    elif payload.get("type") == "error":
        print(f"\n[error] {payload.get('data')}")

print(f"\nFinal text: {content!r}")

Sample output:

[start] in_tokens=15
The sun set over the horizon.
[finish] reason=stop
[final] cost=$0.000003 in=15 out=8 latency=924ms

Final text: 'The sun set over the horizon.'
