Problem
Content generation that’s actually useful needs (a) multiple models compared head-to-head, not a single black-box choice, and (b) citations grounded in retrieval, not hallucinated.
Approach
A Worker API on Cloudflare exposes a curated model-list endpoint (so the frontend never has to ship a stale enum). The frontend (Vite + Pages) calls the Worker's /score and /propose endpoints with models[] and count parameters and gets parallel multi-model output back.
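A minimal sketch of that routing, assuming a hypothetical callModel() helper wrapping the provider calls; the model IDs, request shape, and endpoint bodies here are illustrative, not the actual implementation:

```ts
// Sketch: curated model-list endpoint plus a /propose handler that fans out
// across the requested models in parallel. Names and shapes are assumptions.

interface Env {} // bindings/secrets omitted in this sketch

type ProposeRequest = { prompt: string; models: string[]; count: number };

// The curated list lives in one place on the backend; the frontend fetches it
// from /models instead of shipping its own enum.
const CURATED_MODELS = ["model-a", "model-b"]; // placeholder IDs

// Hypothetical helper that asks one model for `count` proposals.
declare function callModel(
  env: Env,
  model: string,
  prompt: string,
  count: number
): Promise<string[]>;

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const url = new URL(req.url);

    if (url.pathname === "/models") {
      return Response.json({ models: CURATED_MODELS });
    }

    if (url.pathname === "/propose" && req.method === "POST") {
      const { prompt, models, count } = (await req.json()) as ProposeRequest;

      // Fan out across models in parallel; a failure in one model is recorded
      // per-model rather than failing the whole request.
      const results = await Promise.all(
        models.map(async (model) => {
          try {
            return { model, proposals: await callModel(env, model, prompt, count) };
          } catch (err) {
            return { model, error: String(err) };
          }
        })
      );

      return Response.json({ results });
    }

    return new Response("Not found", { status: 404 });
  },
};
```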
Citations come from AI-Search as the primary source, with the Worker handling fan-out, dedup, and response shaping.
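A sketch of the dedup/shaping step, assuming citations arrive from the retrieval fan-out as { url, title, snippet } records; the field names are illustrative, not the actual AI-Search response shape:

```ts
// Collapse the merged retrieval results by normalized URL so the same source
// isn't cited twice across parallel queries.

interface Citation {
  url: string;
  title: string;
  snippet: string;
}

function dedupeCitations(batches: Citation[][]): Citation[] {
  const seen = new Map<string, Citation>();
  for (const batch of batches) {
    for (const c of batch) {
      // Strip query/fragment and trailing slash before comparing.
      const key = c.url.replace(/[?#].*$/, "").replace(/\/$/, "").toLowerCase();
      if (!seen.has(key)) seen.set(key, c);
    }
  }
  return [...seen.values()];
}
```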
What I shipped
- Multi-model proposal + scoring across N models with per-call counts
- Curated model list endpoint (so the FE doesn’t drift from the BE)
- AI-Search citation retrieval pipeline as the primary grounding source
- CORS hardening via an ALLOWED_ORIGINS Worker secret, added after a production-blocking misconfiguration (see the sketch after this list)
- Bug fix for GPT-5 returning empty content on certain prompt shapes
- Stale-page bug fix on the production frontend
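A minimal sketch of the CORS hardening, assuming ALLOWED_ORIGINS is a comma-separated Worker secret; the real handler has more routes and headers:

```ts
// Only echo back origins that appear on the allowlist; anything else gets no
// CORS headers and the browser blocks the cross-origin call.

interface Env {
  ALLOWED_ORIGINS: string; // e.g. "https://app.example.com,https://staging.example.com"
}

function corsHeaders(req: Request, env: Env): HeadersInit {
  const origin = req.headers.get("Origin") ?? "";
  const allowed = env.ALLOWED_ORIGINS.split(",").map((o) => o.trim());
  if (!allowed.includes(origin)) return {};
  return {
    "Access-Control-Allow-Origin": origin,
    "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type",
    Vary: "Origin",
  };
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const headers = corsHeaders(req, env);
    if (req.method === "OPTIONS") {
      // Preflight: respond with the allowlist-derived headers only.
      return new Response(null, { status: 204, headers });
    }
    // ...route to /models, /propose, /score here...
    return new Response("OK", { headers });
  },
};
```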