What is Cloudflare AI Gateway?


Cloudflare's proxy in front of LLM providers. Every AI call b/cited makes — embeddings, brief generation, AEO citation runs — routes through it. Adds caching, retry, observability, and a unified billing surface across OpenAI, Anthropic, and Perplexity.

AI Gateway is a Cloudflare service that sits between your application and LLM providers (OpenAI, Anthropic, Perplexity, Google Gemini, others). Your code calls the gateway URL instead of the provider's URL directly; the gateway forwards the request, caches eligible responses, logs every call, and surfaces analytics across providers.

The change is one URL swap — the OpenAI client points at:

https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_slug}/openai

instead of https://api.openai.com/v1. Everything else stays identical.
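As a rough illustration of that swap, the sketch below builds the options an OpenAI-style client would be constructed with. The helper name, the `ClientOptions` shape, and the placeholder IDs are assumptions made for this example, not b/cited's actual code; only the URL pattern comes from the text above.

```typescript
// Hypothetical helper for illustration; names and values are placeholders.
interface ClientOptions {
  baseURL: string; // where the client sends requests
  apiKey: string;  // provider API key, still required by the provider
}

// Builds client options that route OpenAI traffic through the gateway
// instead of https://api.openai.com/v1.
function openAiOptionsViaGateway(
  accountId: string,
  gatewaySlug: string,
  apiKey: string,
): ClientOptions {
  const baseURL =
    "https://gateway.ai.cloudflare.com/v1/" +
    `${encodeURIComponent(accountId)}/${encodeURIComponent(gatewaySlug)}/openai`;
  return { baseURL, apiKey };
}

// Usage: pass the result to the client constructor, e.g. `new OpenAI(gatewayOptions)`
// with the official `openai` npm package, which accepts a `baseURL` option.
const gatewayOptions = openAiOptionsViaGateway("my-account-id", "my-gateway", "sk-placeholder");
// gatewayOptions.baseURL is
// "https://gateway.ai.cloudflare.com/v1/my-account-id/my-gateway/openai"
```

The request bodies, model names, and response handling do not change; only the base URL does.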

What the gateway adds:

Caching — identical requests return cached responses without re-paying the provider. Saves real money on repeated embeddings or deterministic prompts.
Retry with backoff — provider hiccups (rate limits, transient 5xx) get automatic retries before failing.
Cost + latency analytics — per-prompt, per-provider, per-day breakdown.
Single auth point — provider API keys live in Cloudflare, not scattered across services.
Rate limiting + budget caps — bounded spend if a workflow goes runaway.
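
To make "retry with backoff" concrete, here is a minimal client-side model of the idea. It is not Cloudflare's implementation; the function names, the choice of which statuses count as retryable, and the doubling schedule are assumptions chosen for illustration.

```typescript
// Illustrative model only; the gateway's real retry policy may differ.

// Treat rate limits (429) and server errors (5xx) as transient.
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Delay before each retry: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelays(retries: number, baseMs: number, maxMs: number): number[] {
  const delays: number[] = [];
  for (let i = 0; i < retries; i++) {
    delays.push(Math.min(baseMs * 2 ** i, maxMs));
  }
  return delays;
}

// Calls `call` once, then retries after each delay while the status is retryable.
// `sleep` is injected so the schedule can be exercised without real waiting.
async function withRetry(
  call: () => Promise<{ status: number }>,
  delays: number[],
  sleep: (ms: number) => Promise<void>,
): Promise<{ status: number; attempts: number }> {
  let res = await call();
  let attempts = 1;
  for (const delay of delays) {
    if (!isRetryable(res.status)) break;
    await sleep(delay);
    res = await call();
    attempts++;
  }
  return { status: res.status, attempts };
}
```

With `backoffDelays(4, 200, 1000)` the waits would be 200, 400, 800, and 1000 ms, so a brief provider outage is absorbed before the caller ever sees a failure.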

Why it matters

b/cited makes a lot of provider calls per ingest:

Hundreds of embedding calls (queries are embedded in batches of 100, one call per batch)
A handful of brief-generation calls (one per cluster being briefed)
AEO citation runs across three providers per tracked prompt

Cached embeddings alone save more than the gateway costs at scale. The cross-provider analytics give us the cost-per-AEO-run breakdown, visible at dash.cloudflare.com/?to=/ai/aigateway.

What b/cited does with it

One gateway slug per account; all three Workers (web, api, ingest) route through it
AI_GATEWAY_SLUG is a Worker var, not a secret: the gateway URL is public, and only the provider API key sent in the upstream request is sensitive
Anthropic + Perplexity routes use the same gateway via their respective /anthropic and /perplexity-ai suffixes
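
The routing described in these points could be sketched as follows. The `Env` shape and the `CF_ACCOUNT_ID` name are assumptions for this example; `AI_GATEWAY_SLUG` and the `/anthropic` and `/perplexity-ai` suffixes are taken from the text.

```typescript
// Sketch of per-provider routing through a single gateway.
// CF_ACCOUNT_ID is a hypothetical var name; AI_GATEWAY_SLUG is from the text.
interface Env {
  CF_ACCOUNT_ID: string;   // assumed: plain Worker var holding the account ID
  AI_GATEWAY_SLUG: string; // plain Worker var, not a secret
}

type Provider = "openai" | "anthropic" | "perplexity";

// URL suffix each provider uses on the gateway.
const PROVIDER_SUFFIX: Record<Provider, string> = {
  openai: "openai",
  anthropic: "anthropic",
  perplexity: "perplexity-ai",
};

// Builds the gateway base URL for one provider from the Worker's vars.
function providerBaseUrl(env: Env, provider: Provider): string {
  return (
    "https://gateway.ai.cloudflare.com/v1/" +
    `${env.CF_ACCOUNT_ID}/${env.AI_GATEWAY_SLUG}/${PROVIDER_SUFFIX[provider]}`
  );
}
```

Provider API keys would still be stored as secrets and attached to each request separately; nothing in the URL itself needs protecting.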
