Resemble AI — Open-Question Decisions (implementation defaults)
Date: 2026-04-19.
Context: Captures the defaults chosen for the 8 open questions raised
during initial Resemble integration scoping. Each entry is marked
REVISIT with the concrete trigger that would cause us to reopen the
decision. The original integration / production-readiness planning docs
have been removed now that the work is complete; this file is the
durable record of "why we picked X".
Q1 — Markup % on top of Resemble's per-second cost
Default implemented: RESEMBLE_MARKUP_PCT = 20.0 (settable via env).
Charge formula: charged_amount = provider_cost × (1 + markup/100).
Why 20 %: Middle of the range we use elsewhere (OpenAI reseller markup is 10–30 %). Covers reconciliation drift + S3 egress + support overhead.
REVISIT when: Finance signs off on a final rate card, or when we move a capability to Business-plan pricing.
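The charge formula above can be sketched as follows. This is a minimal illustration, not the billing code itself: the function names and the six-decimal rounding rule are assumptions; only RESEMBLE_MARKUP_PCT and its 20.0 default come from this document.

```python
import os
from decimal import Decimal, ROUND_HALF_UP


def markup_pct_from_env() -> Decimal:
    """Read the markup percentage, falling back to the documented default."""
    return Decimal(os.environ.get("RESEMBLE_MARKUP_PCT", "20.0"))


def charged_amount(provider_cost: Decimal, markup_pct: Decimal) -> Decimal:
    """charged_amount = provider_cost * (1 + markup / 100)."""
    raw = provider_cost * (Decimal(1) + markup_pct / Decimal(100))
    # Assumption: round to 6 decimal places; the real ledger precision may differ.
    return raw.quantize(Decimal("0.000001"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # A provider cost of 0.50 USD at 20 % markup is charged as 0.60 USD.
    print(charged_amount(Decimal("0.50"), Decimal("20.0")))
```

Decimal is used instead of float so that repeated small charges do not accumulate binary rounding error before reconciliation.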
Q2 — Billing schema migration
Default implemented: NEW dedicated tables
(provider_resemble_usage, provider_resemble_request, provider_resemble_job,
provider_resemble_asset, provider_resemble_webhook_event) rather than
extending api_requests / billing_*.
Why: Resemble's units (audio_seconds, images_processed,
searches, voice_subscriptions) don't map onto the token-centric
api_requests schema. Cleaner to keep provider-specific ledgers in their
own tables and roll them into the global totals at reporting time.
REVISIT when: We add a second non-token provider (ElevenLabs, AssemblyAI,
etc.) — at that point we may want a common provider_usage table.
Q3 — Audio storage policy
Policy (revised 2026-04-27): All new media uploads are routed to
Cloudflare R2 via app/utils/storage/. AWS S3 stays in place as a
fallback backend (selectable via STORAGE_BACKEND=s3) and continues to
serve any pre-existing S3 URLs. Per-asset retention is enforced by R2
lifecycle rules + DB provider_resemble_asset.expires_at based on
which prefix the file lands under.
Per-asset retention table:
| Asset class | Bucket prefix | TTL | Rationale |
|---|---|---|---|
| TTS generated audio | tts-output/ | 7 days | Regenerable; user can re-synthesize |
| STT input audio | stt-input/ | 30 days | Non-regenerable user upload |
| STT transcript | stt-output/ | 30 days | Tiny text files; matches input window |
| Audio enhance/edit results | audio-jobs/ | 7 days | Regenerable derivative |
| Watermark results | watermark/ | 7 days | Regenerable derivative |
| Voice-clone source recordings | voice-recordings/ | PERMANENT | Identity asset; user investment |
| Built voice models metadata | voice-models/ | PERMANENT | Identity asset |
| Voice design candidates | voice-design/ | 14 days | Long enough to evaluate + decide |
| Image generation output | images/ | 30 days | Small files; moderate window |
| Bedrock video output | videos/ | 14 days | Large files = real cost |
| Chat-uploaded images | chat-images/ | 90 days | Tied to conversation lifetime |
| Generic uploads | uploads/ | 30 days | Default when purpose unspecified |
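One way to derive an expires_at value from the table is a prefix lookup. The sketch below is illustrative only: RETENTION_DAYS and expires_at_for are hypothetical names, and the TTL values are copied from the retention table (None stands for PERMANENT).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# TTL in days per bucket prefix, copied from the retention table.
# None means PERMANENT (no expiry is written).
RETENTION_DAYS = {
    "tts-output/": 7,
    "stt-input/": 30,
    "stt-output/": 30,
    "audio-jobs/": 7,
    "watermark/": 7,
    "voice-recordings/": None,
    "voice-models/": None,
    "voice-design/": 14,
    "images/": 30,
    "videos/": 14,
    "chat-images/": 90,
    "uploads/": 30,
}


def expires_at_for(key: str, uploaded_at: datetime) -> Optional[datetime]:
    """Return the expiry timestamp for an object key, or None if permanent."""
    for prefix, days in RETENTION_DAYS.items():
        if key.startswith(prefix):
            return None if days is None else uploaded_at + timedelta(days=days)
    # Assumption: an unknown prefix is treated as a bug rather than defaulted.
    raise ValueError(f"no retention rule for key: {key!r}")


if __name__ == "__main__":
    uploaded = datetime(2026, 4, 27, tzinfo=timezone.utc)
    print(expires_at_for("tts-output/u123/2026/04/27/090503-req-9.mp3", uploaded))
```

Keeping the mapping in one place means the DB column and the dashboard lifecycle rules can be checked against the same source.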
Key naming pattern: {prefix}{user_id}/{YYYY}/{MM}/{DD}/{HHMMSS}-{request_id}.{ext}
- Per-user GDPR delete: aws s3 rm s3://<bucket>/{prefix}{user_id}/ --recursive, run once per prefix (add --endpoint-url when targeting R2)
- Date-partitioned for lifecycle rules + reporting
- Collision-proof and self-documenting
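The key naming pattern can be sketched as a small helper. The function names are hypothetical, and the use of UTC for the date partition is an assumption; the document does not state which timezone the keys use.

```python
from datetime import datetime, timezone
from typing import Optional


def build_object_key(
    prefix: str,
    user_id: str,
    request_id: str,
    ext: str,
    now: Optional[datetime] = None,
) -> str:
    """Build {prefix}{user_id}/{YYYY}/{MM}/{DD}/{HHMMSS}-{request_id}.{ext}."""
    ts = now or datetime.now(timezone.utc)  # assumption: UTC timestamps
    return f"{prefix}{user_id}/{ts:%Y/%m/%d/%H%M%S}-{request_id}.{ext.lstrip('.')}"


def user_prefix(prefix: str, user_id: str) -> str:
    """Prefix covering every object a user owns under one asset class."""
    return f"{prefix}{user_id}/"


if __name__ == "__main__":
    when = datetime(2026, 4, 27, 9, 5, 3, tzinfo=timezone.utc)
    print(build_object_key("tts-output/", "u123", "req-9", "mp3", when))
    # tts-output/u123/2026/04/27/090503-req-9.mp3
```

Because the user ID comes directly after the asset-class prefix, user_prefix yields the path a per-user delete has to target for each prefix.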
Three retention principles:
1. Regenerable output → short TTL (7d).
2. User-supplied source → medium TTL (30d).
3. Identity assets → PERMANENT.
Why R2 over S3:
- Zero egress fees (vs. S3's ~$0.09/GB). Voice/audio playback hits the bucket from end-user browsers, so egress dominates the bill at scale.
- S3-compatible API: same boto3 client with a different endpoint/region.
Why keep S3 around:
- Legacy assets already there; URLs continue to resolve.
- Rollback path: STORAGE_BACKEND=s3 re-routes new uploads instantly
with no code change.
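A rough sketch of how one setting can switch backends without touching call sites. Only STORAGE_BACKEND and its s3 value come from this document; the r2 default value, the R2_* and AWS_REGION variable names, and the function name are assumptions. The R2 endpoint format and the "auto" region follow Cloudflare's S3-compatibility convention.

```python
from typing import Dict, Mapping


def s3_client_kwargs(env: Mapping[str, str]) -> Dict[str, str]:
    """Return keyword arguments for an S3-compatible client."""
    backend = env.get("STORAGE_BACKEND", "r2").lower()  # assumption: r2 is the default
    if backend == "s3":
        # Plain AWS S3: default endpoint, credentials from the usual AWS chain.
        return {"region_name": env.get("AWS_REGION", "us-east-1")}
    if backend == "r2":
        # Hypothetical variable names for the R2 account and credentials.
        return {
            "endpoint_url": f"https://{env['R2_ACCOUNT_ID']}.r2.cloudflarestorage.com",
            "region_name": "auto",
            "aws_access_key_id": env["R2_ACCESS_KEY_ID"],
            "aws_secret_access_key": env["R2_SECRET_ACCESS_KEY"],
        }
    raise ValueError(f"unknown STORAGE_BACKEND: {backend!r}")


if __name__ == "__main__":
    demo_env = {
        "R2_ACCOUNT_ID": "abc123",
        "R2_ACCESS_KEY_ID": "key",
        "R2_SECRET_ACCESS_KEY": "secret",
    }
    print(s3_client_kwargs(demo_env)["endpoint_url"])
```

The result can be passed straight to boto3, for example boto3.client("s3", **s3_client_kwargs(os.environ)), so upload code stays identical for both backends.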
Implementation status (2026-04-27): Live in code on
feat/resemble-phase-6-agents. R2-mirror tested end-to-end against
local server for TTS synthesize, uploads (purpose=voice_clone,
purpose=stt_input, default), audio enhance/edit GET, and watermark
apply GET. Lifecycle rules need to be added in the Cloudflare dashboard
per the table above (one rule per prefix, no rule on
voice-recordings/ or voice-models/).
Soft-delete window (planned): Lifecycle moves expired objects to
_deleted/<original-prefix>/ for 7 days before hard delete, so support
can restore on user request. Not yet wired.
REVISIT when: R2 storage cost crosses 1% of Resemble revenue (rough break-even line) — if it ever does, look at tiered cold storage before shortening any TTL.
Q4 — API shape (OpenAI-compat vs Resemble-native vs both)
Default implemented: Resemble-native routes under /resemble/*.
OpenAI-compat TTS (/audio/tts/generations) is NOT auto-wired for
Resemble yet: Resemble's voice selection (UUID) doesn't map onto
OpenAI's voice-name enum, and the per-character cost model of the OpenAI
shape doesn't match Resemble's per-second billing.
Why: Shipping native first gets a working feature out fast; adding the compatible shape later is additive and can live behind a thin adapter.
REVISIT when: A user explicitly needs the OpenAI shape for a drop-in client migration — add the adapter then.
Q5 — Key management (single account key vs BYOK)
Default implemented: Single IndoxHub account key via
settings.RESEMBLE_API_KEY. BYOK is NOT supported on Resemble routes yet.
Why: BYOK adds schema + UI work (storing user-provided keys encrypted) and defeats the rate-limit distribution work in #10. A single key is also how all other non-Bedrock providers work here today.
REVISIT when: A paying customer asks for BYOK, or when Resemble's per-key concurrency caps become the bottleneck despite the rate-limit fairness layer.
Q6 — Business-plan upgrade decision
Default implemented: Not upgraded. Voice cloning
(/voices, /voices/{uuid}/build) and voice design
(/voice-design/*) remain live in code but will 403 on upstream until
Resemble Business is purchased. A documented admin toggle would gate the
routes instead — see business-plan.md in this folder.
REVISIT when: We see ≥ N users explicitly asking for cloning or when the marketing case (named-voice features) justifies the monthly cost. Owner: product lead.
Q7 — SLA / failover behavior
Default implemented: NO failover. On Resemble 5xx / timeout the route
returns 502 Bad Gateway ("Resemble upstream error"). No retry policy
beyond what ResembleClient already does. No second provider wired in.
Why: Failover to ElevenLabs / Cartesia is a whole provider package's worth of work. Not shippable in this phase.
REVISIT when: Uptime SLO is formally defined and we can't hit it with Resemble alone. Candidate: introduce a circuit breaker (pybreaker) first, then a failover provider if the breaker trips often.
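The no-failover behavior amounts to a small mapping from the upstream outcome to the route's response. The sketch below is illustrative: the function name, the {"detail": ...} body shape, and the pass-through of non-5xx statuses are assumptions; only the 502 status and the "Resemble upstream error" message come from this document.

```python
from typing import Optional, Tuple

UPSTREAM_ERROR_BODY = {"detail": "Resemble upstream error"}


def map_upstream_outcome(status: Optional[int], body: dict) -> Tuple[int, dict]:
    """Translate an upstream result into the route's (status, body).

    status is None when the upstream call timed out.
    """
    if status is None or 500 <= status <= 599:
        # No retry and no second provider: surface a 502 to the caller.
        return 502, UPSTREAM_ERROR_BODY
    # Assumption: every other status is passed through unchanged.
    return status, body


if __name__ == "__main__":
    print(map_upstream_outcome(503, {}))
    print(map_upstream_outcome(None, {}))
```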
Q8 — Rate-limit fairness across users
Default implemented: Redis sliding-window counter per
(user_id, capability). Caps encoded in
app/utils/resemble_rate_limit.py::_LIMITS. Failing the check surfaces a
429 with Retry-After. If Redis is unavailable, checks fail open (requests
are allowed).
Why: Simple, matches existing rate-limit infra, prevents one user from filling all concurrency slots on our shared key.
REVISIT when: We add tiered rate limits (free / paid) — extend the
lookup to be a function of user_tier, not just capability.
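To illustrate the per-(user_id, capability) check, here is an in-memory sliding-window sketch. It is a stand-in, not the Redis-backed implementation: the class name, the example caps, and the timestamp-log variant of the sliding window are assumptions, and the real _LIMITS values are not reproduced here.

```python
import math
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional, Tuple


class SlidingWindowLimiter:
    """Keeps a log of request timestamps per (user_id, capability)."""

    def __init__(self, limits: Dict[str, Tuple[int, float]]) -> None:
        # limits maps capability -> (max requests, window length in seconds).
        self._limits = limits
        self._hits: Dict[Tuple[str, str], Deque[float]] = defaultdict(deque)

    def check(
        self, user_id: str, capability: str, now: Optional[float] = None
    ) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds); retry_after is 0 when allowed."""
        now = time.monotonic() if now is None else now
        max_hits, window = self._limits[capability]
        hits = self._hits[(user_id, capability)]
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - window:
            hits.popleft()
        if len(hits) >= max_hits:
            # The oldest hit leaving the window frees the next slot.
            retry_after = math.ceil(hits[0] + window - now)
            return False, max(retry_after, 1)
        hits.append(now)
        return True, 0


if __name__ == "__main__":
    limiter = SlidingWindowLimiter({"tts": (2, 60.0)})  # hypothetical cap
    for t in (0.0, 10.0, 20.0):
        print(t, limiter.check("u123", "tts", now=t))
```

A denied check maps to a 429 response with the second tuple element as the Retry-After value. Extending the lookup to tiers would mean keying limits on (user_tier, capability) instead of capability alone.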
Also implemented as defaults (not on the original list)
- Markup currency: USD only (provider_resemble_usage.currency defaults to USD). Multi-currency will require FX lookup at write time.
- Reconciliation window: 24 h rolling. Drift threshold: ±1 %. Alerts write to logger.error only; no paging integration yet.
- Webhook signature scheme: HMAC-SHA256 over the raw request body, header X-Resemble-Signature, secret RESEMBLE_WEBHOOK_SECRET. If the secret is unset we accept unsigned payloads (dev convenience); set it in prod.
- Asset expiry: Not written by default; provider_resemble_asset.expires_at is nullable. A cleanup Celery task can be added later.
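The webhook signature check described in the list above can be sketched with the standard library. The function name is hypothetical, and hex encoding of the digest in X-Resemble-Signature is an assumption; the document only specifies HMAC-SHA256 over the raw body and the unsigned-payload fallback when the secret is unset.

```python
import hashlib
import hmac
from typing import Optional


def verify_webhook_signature(
    raw_body: bytes, header_value: Optional[str], secret: Optional[str]
) -> bool:
    """Check the X-Resemble-Signature header against the raw request body."""
    if not secret:
        # Documented dev convenience: unsigned payloads pass when
        # RESEMBLE_WEBHOOK_SECRET is unset. Always set it in prod.
        return True
    if not header_value:
        return False
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # Assumption: the header carries a hex digest. Compare in constant time.
    received = header_value.strip().lower()
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


if __name__ == "__main__":
    body = b'{"event": "job.completed"}'
    sig = hmac.new(b"s3cret", body, hashlib.sha256).hexdigest()
    print(verify_webhook_signature(body, sig, "s3cret"))
```

The digest must be computed over the bytes exactly as received; re-serializing parsed JSON before hashing would change the byte sequence and break verification.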