An embedding is a list of numbers that represents the meaning of a piece of text. Two pieces of text with similar meaning have similar embeddings, and that's the whole trick.
The embeddings BCited uses have 1,536 numbers each (the dimension of OpenAI's text-embedding-3-small). You can think of each one as a point in 1,536-dimensional space.
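To make "similar embeddings" concrete, here is a minimal sketch that compares vectors with cosine similarity. Cosine similarity is a common choice for embeddings, but it is an assumption here, not a statement of which measure BCited uses. The three-number vectors are made up for illustration; real embeddings have 1,536 numbers.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings", invented for this example.
running_shoes = [0.9, 0.1, 0.0]
jogging_sneakers = [0.8, 0.2, 0.1]
tax_filing = [0.0, 0.1, 0.9]

similar = cosine_similarity(running_shoes, jogging_sneakers)
unrelated = cosine_similarity(running_shoes, tax_filing)
print(f"similar pair: {similar:.3f}, unrelated pair: {unrelated:.3f}")
```

A score near 1 means the two vectors point in almost the same direction, and a score near 0 means they have little in common.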
Why this matters for AEO and SEO
Modern search engines and LLMs use embeddings internally to:
- Decide whether two queries mean the same thing (cluster them)
- Decide whether a page is relevant to a query (semantic ranking)
- Decide which sources to cite when answering (retrieval-augmented generation)
A site whose pages embed close to relevant queries tends to be cited and ranked more often. To land close to a query's embedding, write content that uses the same concepts, in a similar shape, with similar named entities. Matching the exact keywords is not required.
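As a rough illustration of semantic ranking, the sketch below scores a few pages against one query and sorts them by closeness. It assumes cosine similarity as the measure; the function name `rank_pages`, the URLs, and the toy vectors are all hypothetical, and real search engines combine many more signals than this.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def rank_pages(query_vec, page_vecs):
    """Return (url, score) pairs sorted from most to least similar."""
    scored = [(url, cosine(query_vec, vec)) for url, vec in page_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors, invented for this example.
query = [0.9, 0.2, 0.1]  # stands in for "best shoes for trail running"
pages = {
    "/guides/trail-running-shoes": [0.85, 0.25, 0.05],
    "/blog/marathon-nutrition": [0.30, 0.80, 0.20],
    "/legal/privacy-policy": [0.05, 0.10, 0.95],
}

for url, score in rank_pages(query, pages):
    print(f"{score:.3f}  {url}")
```

The page at the top of the list is the one whose embedding sits closest to the query, even though nothing in the comparison looks at keywords.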
How BCited uses embeddings
We embed three things per project:
- Every ranking query (powers cluster discovery + duplicate detection)
- Every cluster centroid (powers cluster naming + cluster-to-URL linking)
- The top URLs on your site (powers internal-link suggestions)
All three kinds of vectors live in Cloudflare Vectorize, namespaced per project. Lookups take under 100 ms.
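For a feel of how query embeddings can feed cluster centroids and duplicate detection, here is a minimal sketch. It assumes a centroid is the plain average of a cluster's query vectors and that near-duplicates are pairs above a cosine similarity threshold. The 0.95 threshold, the function names, and the toy vectors are hypothetical and are not BCited's actual settings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def centroid(vectors):
    """Average a list of equal-length vectors, one dimension at a time."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def find_near_duplicates(embeddings, threshold=0.95):
    """Return pairs of keys whose vectors are at least `threshold` similar."""
    names = sorted(embeddings)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if cosine(embeddings[first], embeddings[second]) >= threshold:
                pairs.append((first, second))
    return pairs


# Toy query embeddings, invented for this example.
queries = {
    "best running shoes": [0.90, 0.10, 0.00],
    "top running shoes": [0.88, 0.12, 0.01],
    "running shoe sizing": [0.50, 0.50, 0.10],
}

cluster_center = centroid(list(queries.values()))
duplicates = find_near_duplicates(queries)
print("centroid:", [round(value, 3) for value in cluster_center])
print("near-duplicates:", duplicates)
```

Comparing every pair like this is fine for a handful of queries. At project scale, a vector index is what keeps the same kind of lookup fast.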