Schema.org

A structured-data vocabulary founded by Google, Microsoft, Yahoo, and Yandex. It defines 800+ types and 1,400+ properties so that search engines and AI answer engines can read a page's meaning from explicit declarations instead of inferring it.

#schema.org #structured data #JSON-LD #Microdata #RDFa #search engine standard

What is Schema.org?

Schema.org is a structured-data vocabulary launched in 2011 by Google, Microsoft, and Yahoo, with Yandex joining later that year. It currently defines more than 800 types and 1,400 properties, and its markup can be written in three syntaxes: JSON-LD, Microdata, and RDFa. Search engines and AI answer engines (ChatGPT, Claude, Gemini, Perplexity) can use it as a primary signal for identifying a page as an entity and deciding whether to trust it.

Why is Schema.org Central to GEO (Generative Engine Optimization)?

LLMs can infer page meaning from natural-language HTML alone, but that inference is probabilistic, and its accuracy varies by page, language, and domain. Schema.org replaces the guess with an explicit, machine-readable statement. When a page declares "this company is X, this article is Y, the author is Z," LLM entity identification, citation decisions, and answer composition all become more stable.
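As a minimal sketch of such an explicit declaration, the Python below builds a schema.org Article description with its author and publisher and wraps it in a JSON-LD script tag. The function names (`build_article_jsonld`, `as_script_tag`) and all example values are hypothetical and chosen only for illustration; the property names (`headline`, `url`, `author`, `publisher`) are standard schema.org terms.

```python
import json


def build_article_jsonld(headline, author_name, publisher_name, url):
    """Return a schema.org Article description as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",  # "this article is Y"
        "headline": headline,
        "url": url,
        "author": {"@type": "Person", "name": author_name},  # "the author is Z"
        "publisher": {"@type": "Organization", "name": publisher_name},  # "this company is X"
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


def as_script_tag(jsonld):
    """Wrap a JSON-LD string in the script element used to embed it in HTML."""
    # Escape "</" so a value such as "</script>" cannot close the tag early.
    # "\/" is a valid JSON escape for "/", so the payload still parses.
    safe = jsonld.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + safe + "\n</script>"
```

Calling `as_script_tag(build_article_jsonld("What is Schema.org?", "Jane Doe", "Example Inc.", "https://example.com/schema-org"))` would yield a block that can be placed in the page's `head`; the names and URL here are placeholders.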

The bias of AI search toward earned media (third-party sources), which Chen et al. 2025 report quantitatively, also depends on accurate entity identification. If an AI system cannot identify the page as an entity, third-party mentions cannot be attributed back to the brand. Schema.org is the base layer of entity attribution.

13 Recommended Types (RanketAI Site Check)

Of Schema.org's 800+ types, the RanketAI site check looks for 13 core types when evaluating a site, grouped into four categories:

| Group | Types | Responsibility |
| --- | --- | --- |
| A · Identity | Organization, Corporation, Person | Primary entity signal, E-E-A-T |
| B · Page Meta | WebSite, WebPage, BreadcrumbList, CollectionPage | Page hierarchy, site-level context |
| C · Content | Article, FAQPage | Most frequently cited in AI answers |
| D · Business | Product, Service, LocalBusiness, Event | Rich Results, commerce, local search |
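The grouping above can be turned into a simple coverage check. The sketch below is hypothetical and is not RanketAI's actual implementation: it assumes the page's JSON-LD has already been parsed into Python objects, and it matches type names exactly, so a subtype such as `NewsArticle` would not count as `Article`. The names `RECOMMENDED_TYPES`, `collect_types`, and `coverage_by_group` are invented for this example.

```python
# The 13 recommended types, keyed by the group names from the table.
RECOMMENDED_TYPES = {
    "Identity": ["Organization", "Corporation", "Person"],
    "Page Meta": ["WebSite", "WebPage", "BreadcrumbList", "CollectionPage"],
    "Content": ["Article", "FAQPage"],
    "Business": ["Product", "Service", "LocalBusiness", "Event"],
}


def collect_types(node, found=None):
    """Recursively collect every @type value in a parsed JSON-LD structure."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            found.add(declared)
        elif isinstance(declared, list):  # @type may list several types
            found.update(t for t in declared if isinstance(t, str))
        for value in node.values():  # descend into nested entities and @graph
            collect_types(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_types(item, found)
    return found


def coverage_by_group(jsonld_docs):
    """Map each group to the recommended types actually present."""
    found = collect_types(jsonld_docs)
    return {
        group: [t for t in types if t in found]
        for group, types in RECOMMENDED_TYPES.items()
    }
```

An empty list for a group signals that none of its types is declared anywhere in the supplied markup, including nested entities such as an Article's author.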

Preferred Syntax: JSON-LD

Among the three syntaxes, Google's documentation recommends JSON-LD. Because the markup lives in its own script block, separate from the HTML body, it can be extracted without parsing the visible page structure, and layout changes do not break it. JSON-LD is also a W3C Recommendation (version 1.1, July 2020).
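To illustrate why this separation makes extraction robust, here is a minimal sketch that pulls JSON-LD blocks out of a page using only Python's standard library. The class and function names (`JsonLdExtractor`, `extract_jsonld`) are hypothetical, and silently skipping malformed blocks is an assumption made for brevity; real crawlers may behave differently.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        script_type = dict(attrs).get("type") or ""
        if tag == "script" and script_type.lower() == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:  # script text may arrive in several chunks
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # assumption: ignore malformed blocks


def extract_jsonld(html):
    """Return a list of parsed JSON-LD documents found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents
```

The extractor never inspects the body markup, so the same code keeps working when the visible layout changes. Microdata and RDFa, by contrast, are attributes woven into the body elements themselves.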
