Feature · May 2, 2026

SSE async pipeline live in production — 9 providers, in-band cost events

Phase 7B canary now serves real production traffic for all 9 supported providers (OpenAI, Anthropic, Google, AWS Bedrock, DeepSeek, xAI, Mistral, Qwen, HuggingFace). Every stream emits usage_start + usage_final events with input/output tokens, cost_usd, and latency_ms baked into the wire format — no second API call needed for billing. Bedrock added as the 9th provider; nginx tuned for SSE; Locust load scenario shipped; Python client `indoxhub` v0.2.2 published to PyPI.

What's new since 2026-04-27

| Area | Change |
| --- | --- |
| Providers | Bedrock added: 9 / 9 covered. Live-validated on us.amazon.nova-micro-v1:0. |
| Route | Phase 7B canary wires all 9 providers through dispatch_stream(provider_id, …) behind SSE_ASYNC_STREAMING_ENABLED. Default-off in prod; flip the GitHub Variable and re-run PROD_setup → PROD_build_docker to activate. |
| nginx | Dedicated SSE `location =` blocks with proxy_buffering off, 600 s timeouts, HTTP/1.1 keep-alive. |
| Load test | tests/load/locustfile_sse.py: 50-stream scenario records TTFC, total stream duration, max inter-chunk gap. |
| Docs | Streaming page rewritten, plus a new SSE Events reference page. 17 / 17 usage pages now icon-bearing. |
| Python client | pip install indoxhub==0.2.2: gold-standard release with R2 mirror surface + full Resemble AI namespace. |
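The three load-test metrics can be derived from per-chunk arrival timestamps. The following is a minimal sketch, not the contents of locustfile_sse.py: it reads TTFC as "time to first chunk" (an assumption), and the names `StreamTiming` and `summarize_stream` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StreamTiming:
    ttfc_s: float     # request start to first chunk (assumed meaning of TTFC)
    total_s: float    # request start to last chunk
    max_gap_s: float  # longest pause between two consecutive chunks


def summarize_stream(start: float, chunk_times: Sequence[float]) -> StreamTiming:
    """Summarize one stream from monotonic timestamps given in seconds."""
    if not chunk_times:
        raise ValueError("stream produced no chunks")
    # Pairwise differences between consecutive chunk arrivals.
    gaps = [later - earlier for earlier, later in zip(chunk_times, chunk_times[1:])]
    return StreamTiming(
        ttfc_s=chunk_times[0] - start,
        total_s=chunk_times[-1] - start,
        max_gap_s=max(gaps, default=0.0),  # a single-chunk stream has no gaps
    )


timing = summarize_stream(10.0, [10.5, 10.75, 12.0])
```

In a real run the timestamps would typically come from `time.perf_counter()` recorded as each chunk is read, so that wall-clock adjustments do not distort the gaps.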

Live verification (today)

A single test stream against the canary on production:

event: usage_start
data: {"type":"usage_start","request_id":"…","provider":"openai","model":"gpt-4o-mini","input_tokens":15}

data: {"type":"content","data":"pong","provider":"openai","choices":[…]}

event: usage_final
data: {"type":"usage_final","input_tokens":15,"output_tokens":1,"cost_usd":2.85e-06,"latency_ms":4693}

data: [DONE]

Total live validation spend across all 9 provider wrappers: under $0.001.
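As a sanity check on the usage_final event above, the reported cost_usd can be reproduced from the token counts. This sketch assumes rates of $0.15 per million input tokens and $0.60 per million output tokens for gpt-4o-mini; those rates are not stated in this changelog, and `estimate_cost_usd` is a hypothetical helper, not part of the indoxhub client.

```python
from decimal import Decimal

# Assumed list prices in USD per one million tokens (not from this changelog).
INPUT_PER_MTOK = Decimal("0.15")
OUTPUT_PER_MTOK = Decimal("0.60")


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Price a request from its token counts using the assumed rates."""
    million = Decimal(1_000_000)
    return (input_tokens * INPUT_PER_MTOK + output_tokens * OUTPUT_PER_MTOK) / million


# Token counts taken from the sample stream: 15 input, 1 output.
cost = estimate_cost_usd(15, 1)
```

Under these assumed rates, 15 input tokens contribute 2.25e-06 USD and 1 output token contributes 0.60e-06 USD, which sums to the 2.85e-06 shown in the event.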

Wire-format reference

The full per-event schema and a complete Python parser are documented on the new SSE Events page. The Streaming page now also carries the updated wire-format walkthrough.
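For orientation, a stream shaped like the sample in the live verification section could be consumed roughly as follows. This is a minimal sketch based only on that sample, not the parser from the SSE Events page. The function name `iter_sse_events` is hypothetical, and the elided fields (request_id, choices) are left out of the demo input.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, payload) pairs until the [DONE] sentinel."""
    event_name, data_parts = "message", []  # "message" is the SSE default name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
        elif line == "":
            # A blank line terminates the current event.
            if data_parts:
                data = "\n".join(data_parts)
                if data == "[DONE]":
                    return
                yield event_name, json.loads(data)
            event_name, data_parts = "message", []
    # Flush a final event if the stream ended without a trailing blank line.
    if data_parts and "\n".join(data_parts) != "[DONE]":
        yield event_name, json.loads("\n".join(data_parts))


sample = [
    "event: usage_start",
    'data: {"type":"usage_start","provider":"openai","model":"gpt-4o-mini","input_tokens":15}',
    "",
    'data: {"type":"content","data":"pong","provider":"openai"}',
    "",
    "event: usage_final",
    'data: {"type":"usage_final","input_tokens":15,"output_tokens":1,"cost_usd":2.85e-06,"latency_ms":4693}',
    "",
    "data: [DONE]",
    "",
]
events = list(iter_sse_events(sample))
text = "".join(p["data"] for _, p in events if p.get("type") == "content")
final = next(p for name, p in events if name == "usage_final")
```

Because billing data arrives in usage_final, a client can record `final["cost_usd"]` and `final["latency_ms"]` as soon as the stream closes, without a follow-up request.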

Operational rollback

The canary is gated by a single repository-level GitHub Variable. To activate it, set SSE_ASYNC_STREAMING_ENABLED=true, then re-run PROD_setup → PROD_build_docker. Rollback is the same toggle in reverse: no code change, no source redeploy.
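On the application side, a gate of this kind is often a single boolean read at startup. The sketch below assumes the GitHub Variable reaches the service as an environment variable of the same name, which this changelog does not state. The helper names and the path labels "async_sse" and "legacy" are hypothetical.

```python
import os
from typing import Mapping, Optional

FLAG = "SSE_ASYNC_STREAMING_ENABLED"
_TRUTHY = {"1", "true", "yes", "on"}


def sse_async_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when the flag is explicitly set to a truthy value."""
    env = os.environ if env is None else env
    # A missing variable falls back to "", so the default is off.
    return env.get(FLAG, "").strip().lower() in _TRUTHY


def choose_stream_path(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick a streaming path label; both labels are illustrative only."""
    return "async_sse" if sse_async_enabled(env) else "legacy"


path_on = choose_stream_path({FLAG: "true"})
path_off = choose_stream_path({})
```

Treating anything other than an explicit truthy value as off keeps the default-off behavior intact if the variable is missing or misspelled.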

#sse #streaming #async #canary #production #openai #anthropic #google #bedrock #deepseek #xai #mistral #qwen #huggingface #pypi #indoxhub-client