Performance Playbook

  • Edge-first rendering via Cloudflare Pages
  • SSE and WebSocket pipelines targeting sub-150 ms p99 latency
  • Durable Object partitioning per session/tenant
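One way to picture the per-session/tenant partitioning is as a naming scheme: every request for the same tenant and session resolves to the same Durable Object name. The sketch below is illustrative only; `PartitionScope`, `partitionName`, and the `tenant:…/session:…` format are assumptions, not names from this playbook.

```typescript
// Hypothetical scope type: a tenant, optionally narrowed to one session.
type PartitionScope = { tenantId: string; sessionId?: string };

// Hypothetical helper: build a stable, collision-safe object name.
// Encoding each part keeps "/" or ":" inside an ID from blurring the boundary.
function partitionName(scope: PartitionScope): string {
  const tenant = encodeURIComponent(scope.tenantId);
  if (scope.sessionId === undefined) {
    return `tenant:${tenant}`;
  }
  return `tenant:${tenant}/session:${encodeURIComponent(scope.sessionId)}`;
}

// In a Worker, the name would typically be turned into an object ID with the
// namespace binding's idFromName(name); the binding name itself is
// deployment-specific and not shown here.
```

Keeping the name deterministic means any edge location can route a session to the same object without a lookup table.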

Latency Budgets

Component        Avg     p99      Optimization
Edge Routing     5 ms    12 ms    Anycast
Auth Validation  20 ms   50 ms    Token cache
LLM Streaming    80 ms   150 ms   Shard models, reduce hops
WS Broadcast     15 ms   30 ms    Affinity, minimal payloads
DB Persistence   10 ms   25 ms    Edge DB and batching
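
A budget table is most useful when measured latencies are checked against it. The following is a minimal sketch of such a check, assuming raw per-component latency samples in milliseconds are available; `BUDGETS`, `percentile`, and `overBudget` are hypothetical names, and the nearest-rank percentile is one common choice rather than a method prescribed by this playbook.

```typescript
type Budget = { avgMs: number; p99Ms: number };

// Budgets copied from the table above.
const BUDGETS: Record<string, Budget> = {
  "Edge Routing": { avgMs: 5, p99Ms: 12 },
  "Auth Validation": { avgMs: 20, p99Ms: 50 },
  "LLM Streaming": { avgMs: 80, p99Ms: 150 },
  "WS Broadcast": { avgMs: 15, p99Ms: 30 },
  "DB Persistence": { avgMs: 10, p99Ms: 25 },
};

// Nearest-rank percentile over raw latency samples (milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("no samples");
  }
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Returns one message per component whose observed p99 exceeds its budget.
// Components without a budget entry are ignored.
function overBudget(samplesByComponent: Record<string, number[]>): string[] {
  const violations: string[] = [];
  for (const name of Object.keys(samplesByComponent)) {
    if (!(name in BUDGETS)) {
      continue;
    }
    const budget = BUDGETS[name];
    const p99 = percentile(samplesByComponent[name], 99);
    if (p99 > budget.p99Ms) {
      violations.push(`${name}: p99 ${p99}ms > ${budget.p99Ms}ms`);
    }
  }
  return violations;
}
```

A check like this could run in CI against load-test output or periodically against production metrics.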

Caching and Batching

  • Edge cache for static assets
  • Prompt/result caching where safe
  • Batch DB writes and debounce broadcasts
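
The batching and debouncing items above can be sketched as two small helpers. This is an illustration under stated assumptions, not the playbook's implementation: `WriteBatcher` and `BroadcastDebouncer` are hypothetical names, the sink callback stands in for a real database write, and the clock is passed in explicitly so that the timer or alarm driving `flush` and `poll` is left to the caller.

```typescript
// Buffers writes and hands them to `sink` in groups of up to `maxSize`.
class WriteBatcher<T> {
  private buffer: T[] = [];
  private maxSize: number;
  private sink: (batch: T[]) => void;

  constructor(maxSize: number, sink: (batch: T[]) => void) {
    this.maxSize = maxSize;
    this.sink = sink;
  }

  add(item: T): void {
    this.buffer.push(item);
    if (this.buffer.length >= this.maxSize) {
      this.flush();
    }
  }

  // Also call this from a timer or alarm so a partly filled batch is not stranded.
  flush(): void {
    if (this.buffer.length === 0) {
      return;
    }
    const batch = this.buffer;
    this.buffer = [];
    this.sink(batch);
  }
}

// Trailing-edge debounce with an injected clock: keeps only the latest payload
// and releases it once `quietMs` has passed without a newer update.
class BroadcastDebouncer<T> {
  private pending: { value: T } | null = null;
  private lastUpdateMs = 0;
  private quietMs: number;

  constructor(quietMs: number) {
    this.quietMs = quietMs;
  }

  update(value: T, nowMs: number): void {
    this.pending = { value };
    this.lastUpdateMs = nowMs;
  }

  // Returns the payload to broadcast, or null while still inside the quiet window.
  poll(nowMs: number): T | null {
    if (this.pending === null || nowMs - this.lastUpdateMs < this.quietMs) {
      return null;
    }
    const value = this.pending.value;
    this.pending = null;
    return value;
  }
}
```

Larger batches and longer quiet windows reduce write and fan-out volume at the cost of added delay, so both values should be chosen against the latency budgets.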

See also: Observability & SRE, Architecture.