Structured data is the umbrella term for any markup that declares "this content is a thing of type X, with properties Y and Z" in a way crawlers and LLMs can parse without natural-language interpretation.
Three formats are common on the modern web:
- JSON-LD: JSON embedded in <script type="application/ld+json">. Google's preferred format; what schema.org documentation defaults to.
- Microdata: itemscope/itemprop attributes sprinkled inline through HTML. Older, harder to maintain.
- RDFa: similar to microdata, even older, mostly historical.
Open Graph (og:title, og:description, og:image) and Twitter Card meta tags are also structured data — they describe the page to social platforms in a machine-readable way.
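As a rough illustration of what those tags look like in a page's head, here is a minimal Python sketch that renders them from three values. The function name `social_meta_tags` and the sample values are hypothetical; the `summary_large_image` card type is one of several a page might choose.

```python
from html import escape


def social_meta_tags(title: str, description: str, image_url: str) -> str:
    """Render Open Graph and Twitter Card meta tags for one page."""
    og = {
        "og:title": title,
        "og:description": description,
        "og:image": image_url,
    }
    twitter = {
        "twitter:card": "summary_large_image",  # assumed card type
        "twitter:title": title,
        "twitter:description": description,
        "twitter:image": image_url,
    }
    # Open Graph tags use the `property` attribute; Twitter Card tags use `name`.
    lines = [
        f'<meta property="{key}" content="{escape(value, quote=True)}">'
        for key, value in og.items()
    ]
    lines += [
        f'<meta name="{key}" content="{escape(value, quote=True)}">'
        for key, value in twitter.items()
    ]
    return "\n".join(lines)


# Hypothetical sample values.
tags = social_meta_tags(
    "Structured data & AEO",
    "What structured data is and why it matters.",
    "https://example.com/cover.png",
)
print(tags)
```

Escaping the attribute values keeps characters such as `&` or `"` in a title from breaking the tag.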
Why it matters for AEO
LLMs read structured data directly when deciding what a page contains. An Article with proper author, datePublished, and publisher properties is far easier to cite confidently than the same content as unmarked prose — the model knows who wrote it, when, and on behalf of whom.
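To make the Article example concrete, the sketch below builds a schema.org Article with the three properties named above and wraps it in a JSON-LD script tag. The helper name `article_jsonld` and every sample value are hypothetical; a real page would usually carry more properties than this.

```python
import json


def article_jsonld(headline: str, author_name: str,
                   date_published: str, publisher_name: str) -> str:
    """Return a schema.org Article as a JSON-LD <script> tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,  # ISO 8601 date string
        "publisher": {"@type": "Organization", "name": publisher_name},
    }
    # Escape "</" so the payload cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical sample values.
tag = article_jsonld(
    headline="What is structured data?",
    author_name="Jane Doe",
    date_published="2024-01-15",
    publisher_name="Example Media",
)
print(tag)
```

Pasting the resulting tag into the page's head (or body) is enough for a parser to recover who wrote the article, when, and for which publisher, without reading the prose.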
The hierarchy of impact:
- JSON-LD @type declarations (highest signal: tells the LLM exactly what type of thing this is)
- Open Graph + Twitter Card meta (for social share + AI engines that crawl that surface)
- Microdata (lower signal, harder to extract reliably)
What b/cited does about it
The site readiness audit checks for:
- JSON-LD presence + validity
- Specific schema types appropriate to the page type
- Open Graph completeness
- Author / Person schema (the E-E-A-T lever)
Findings surface in the audit grade. Closing schema markup gaps tends to be the cheapest grade improvement available: the markup is a one-time addition that compounds.
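The checks listed above can be approximated with the Python standard library. The sketch below is only an illustration of the idea, not b/cited's actual implementation: the names `MarkupCollector`, `audit`, and `REQUIRED_OG`, and the shape of the report, are assumptions. It ignores `@graph` containers and does not judge whether a schema type suits the page type.

```python
import json
from html.parser import HTMLParser

# Assumed minimum set of Open Graph properties for "completeness".
REQUIRED_OG = ("og:title", "og:description", "og:image")


class MarkupCollector(HTMLParser):
    """Collect raw JSON-LD script bodies and Open Graph meta tags."""

    def __init__(self):
        super().__init__()
        self.og = {}
        self.jsonld_raw = []
        self._in_jsonld = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_jsonld = True
            self.jsonld_raw.append("")
        elif tag == "meta" and (attrs.get("property") or "").startswith("og:"):
            self.og[attrs["property"]] = attrs.get("content") or ""

    def handle_data(self, data):
        # Script content may arrive in several chunks, so concatenate.
        if self._in_jsonld:
            self.jsonld_raw[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False


def audit(html: str) -> dict:
    """Return a small structured-data report for one HTML document."""
    collector = MarkupCollector()
    collector.feed(html)
    collector.close()

    blocks, invalid = [], 0
    for raw in collector.jsonld_raw:
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            invalid += 1
            continue
        # A JSON-LD script may hold a single object or a list of objects.
        items = parsed if isinstance(parsed, list) else [parsed]
        blocks.extend(item for item in items if isinstance(item, dict))

    return {
        "jsonld_present": bool(collector.jsonld_raw),
        "jsonld_invalid_blocks": invalid,
        "schema_types": [block.get("@type") for block in blocks],
        "missing_og": [p for p in REQUIRED_OG if not collector.og.get(p)],
        "has_author": any("author" in block for block in blocks),
    }


# Hypothetical page: valid Article markup, but no og:image.
sample = """<html><head>
<meta property="og:title" content="Structured data">
<meta property="og:description" content="What it is">
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article",
 "author": {"@type": "Person", "name": "Jane Doe"}}
</script>
</head></html>"""

report = audit(sample)
print(report)
```

For the sample page, the report flags `og:image` as the only missing item, which is the kind of one-line fix the audit is meant to surface.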