AI Business, Funding & Market · 2026-05-09 · Author: RanketAI Editorial Team · Updated: 2026-05-09

RanketAI Guide #06: Schema.org 13 Types and GEO Impact

Maps RanketAI site check's 13 recommended schema.org types (Organization, Article, FAQPage, BreadcrumbList, etc.) to their GEO impact — using KDD 2024 + Chen 2025 + Google Rich Results + Ahrefs 2026-02. JSON-LD rationale and 4-group classification included.

AI-assisted draft · Editorially reviewed

This blog content may use AI tools for drafting and structuring, and is published after editorial review by the RanketAI Editorial Team.

Summary (as of 2026-05-09): The second gate of GEO measurement is schema.org. If the first gate (#05 — robots.txt policy) is bot access, schema.org is the layer at which bots accurately understand and identify content as an entity. This guide maps the GEO impact of the 13 schema types that RanketAI site check recommends, using schema.org + Google Rich Results + KDD 2024 academic backing + Ahrefs Feb 2026 analysis.

Why Schema.org is the Second GEO Gate

In the previous post #05 — The Four AI Crawler Policies, we established that robots.txt is the first gate of GEO. If bots can't access the page, the nine GEO strategies (Aggarwal et al. KDD 2024) all collapse to zero impact. But there is another decisive branch after a bot fetches the page — what did the bot understand this page to be?

LLMs can infer page meaning from natural-language HTML alone. But that inference is probabilistic, and accuracy diverges by page, language, and domain. Schema.org makes that inference deterministic. When a page explicitly declares "this company is RanketAI, this article is Guide series #06, the author is RanketAI Editorial," LLM entity identification, citation decisions, and answer composition all stabilize.

The "AI search bias toward earned media (third-party sources)" that Chen et al. 2025 quantitatively proves also operates on top of entity-identification accuracy. If AI cannot identify the page as an entity, third-party mentions can't be attributed back to the brand. Schema.org is the base layer of entity attribution.

Schema.org's Standard Status

Schema.org was launched in 2011 by Google, Microsoft (Bing), and Yahoo, with Yandex joining later that year. It currently defines 800+ types and 1400+ properties and can be expressed in three syntaxes: JSON-LD, Microdata, and RDFa.

The differences between the three syntaxes are summarized in this single table:

Syntax	Location	Extraction Robustness	Google Recommendation
JSON-LD	`<script type="application/ld+json">` (separated from head/body)	✅ (independent of HTML body)	✅ Preferred
Microdata	HTML attributes (`itemscope`, `itemtype`)	△ (bound to HTML body)	Compatible
RDFa	HTML attributes (`vocab`, `typeof`)	△ (bound to HTML body)	Compatible

Google Search Central's structured data guide recommends JSON-LD as the preferred format. Two practical reasons support that choice:

Separation from HTML body — JSON-LD is written as standalone JSON inside `<script>` blocks, so changes to the HTML markup don't break the schema data.
Extraction robustness — Microdata and RDFa are interleaved with the HTML parse tree, so a parser has to walk the page markup correctly to recover them, and some bots may fail to do so. A JSON-LD block can be read with an ordinary JSON parser regardless of how the body is structured.
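
To make the separation concrete, here is a minimal sketch that pulls JSON-LD out of a page using only Python's standard-library `html.parser`. The class name `JsonLdExtractor`, the helper `extract_jsonld`, and the sample HTML are illustrative assumptions; this is not how any particular crawler or RanketAI site check is implemented.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects every JSON-LD block in a page and ignores all other markup."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip a malformed block instead of failing the whole page


def extract_jsonld(html):
    """Return the parsed JSON-LD blocks found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example Co"}
</script>
</head><body><div class="redesigned"><p>Body markup can change freely.</p></div></body></html>
"""

print(extract_jsonld(SAMPLE_HTML))
```

Because the extractor only looks at the script element, wrapping the body in new containers or renaming CSS classes leaves the result unchanged. That is the practical meaning of "independent of HTML body" in the table above.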

JSON-LD 1.1 is also a W3C Recommendation (2020-07), which gives the format a stable, formally standardized specification. This is the basis for RanketAI site check's recommendation to convert Microdata to JSON-LD when detected.

13 Types in 4 Groups — Why These 13?

The 13 types that site check recommends were not picked to reach a number. Each one covers a distinct stage of entity identification, and together they fall into 4 groups.

Group A — Identity (3 types)

Organization — Company / institution identity. The primary signal for "who is this company?" in AI answers. The most important type for cross-LLM entity disambiguation. (See #02 LLM Citation Algorithm.)
Corporation — Subtype of Organization, used for for-profit legal entities. Reinforces business registration / legal entity information.
Person — Representative / author / expert. The Authoritativeness signal under E-E-A-T.
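
As a rough illustration of Group A, the sketch below builds an Organization node and a Person node and serializes each as a JSON-LD script block. The helper name `to_jsonld_script`, the `example.com` URLs, and the names "Example Co" and "Jane Doe" are placeholders, not values from any real site.

```python
import json


def to_jsonld_script(data):
    """Serialize a dict as a JSON-LD <script> block."""
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so a string value can never close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": "https://example.com/#organization",  # hypothetical identifier
    "name": "Example Co",
    "url": "https://example.com/",
    "logo": "https://example.com/logo.png",
}

author = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Jane Doe",  # placeholder author
    "jobTitle": "Editor",
    # Point back at the Organization node instead of repeating its fields.
    "worksFor": {"@id": organization["@id"]},
}

print(to_jsonld_script(organization))
print(to_jsonld_script(author))
```

Linking the Person to the Organization through a shared `@id` keeps the two declarations consistent, which is the point of treating them as one identity group.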

Group B — Page Meta (4 types)

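WebSite — Site-level metadata. Carries site-level signals such as the site name and alternateName. (Google retired the sitelinks search box feature in November 2024, so that signal no longer applies.)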
WebPage — Per-page metadata. The most basic wrapper for any page.
BreadcrumbList — Page hierarchy. The signal AI answers use to recognize "which category does this article belong to?". Also directly used in Google Rich Results' SERP path display.
CollectionPage — Listing / archive pages. Suitable for category, tag, and blog-index pages.
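
The page hierarchy described for BreadcrumbList can be sketched as a small builder. The function name `breadcrumb_list` and the `example.com` URLs are assumptions for illustration only.

```python
import json


def breadcrumb_list(trail):
    """Build a BreadcrumbList from (name, url) pairs ordered from root to leaf."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


trail = [
    ("Home", "https://example.com/"),
    ("Blog", "https://example.com/blog/"),
    ("Guide #06", "https://example.com/blog/guide-06/"),  # hypothetical path
]

print(json.dumps(breadcrumb_list(trail), indent=2))
```

Positions start at 1 and follow the order of the trail, so the category path stays readable even when the visible breadcrumb markup changes.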

Group C — Content (2 types)

Article — News / blog / guide. The #1 type for AI answer citation. The four fields headline, author, datePublished, and publisher form the core of entity attribution.
FAQPage — Q&A structure. The core of AEO (Answer Engine Optimization). Alongside Article, one of the types cited most frequently in AI answers.
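
A minimal sketch of the two content types follows. The builder names `article` and `faq_page` are hypothetical; the headline, date, and author string are taken from this post, and modeling the author as an Organization is an assumption made because the byline is a team rather than an individual.

```python
import json


def article(headline, author_name, date_published, publisher_name):
    """Article node carrying the four attribution fields named above."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Organization", "name": author_name},
        "datePublished": date_published,
        "publisher": {"@type": "Organization", "name": publisher_name},
    }


def faq_page(pairs):
    """FAQPage node built from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


guide = article(
    "RanketAI Guide #06: Schema.org 13 Types and GEO Impact",
    "RanketAI Editorial Team",
    "2026-05-09",
    "RanketAI",
)
faq = faq_page([("Which structured data format does Google prefer?", "JSON-LD.")])

print(json.dumps(guide, indent=2))
print(json.dumps(faq, indent=2))
```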

Group D — Business Actions (4 types)

Product — Products. The primary schema for commerce pages. price and availability (declared on a nested Offer) and review trigger Rich Results' star ratings and price displays.
Service — Services. SaaS, consulting, and professional services. Fields: provider, areaServed, serviceType.
LocalBusiness — Local business. For brick-and-mortar shops or branches. address, geo, and openingHours feed local search and AI answers' location citations.
Event — Events. Conferences, seminars, webinars, launches. startDate, location, organizer.
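
One way to check the fields listed for Group D is a small completeness test. The `KEY_FIELDS` map below simply restates the fields named in this list and is not an official required-property list; `missing_key_fields` and the sample webinar are hypothetical. Product is left out because its price and availability sit on a nested Offer rather than on the Product node itself.

```python
# Fields this guide names for each type (an assumption, not Google's requirements).
KEY_FIELDS = {
    "Service": ["provider", "areaServed", "serviceType"],
    "LocalBusiness": ["address", "geo", "openingHours"],
    "Event": ["startDate", "location", "organizer"],
}


def missing_key_fields(node):
    """Return the expected fields that a schema node does not declare."""
    expected = KEY_FIELDS.get(node.get("@type"), [])
    return [field for field in expected if field not in node]


webinar = {
    "@context": "https://schema.org",
    "@type": "Event",
    "name": "Example GEO Webinar",  # placeholder event
    "startDate": "2026-06-01T10:00:00+09:00",
    "location": {"@type": "VirtualLocation", "url": "https://example.com/webinar"},
}

print(missing_key_fields(webinar))  # ['organizer']
```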

What these 13 types share is that each maps onto an information unit AI needs when composing an answer (entity / claim / location / time). In other words, the list is the result of mapping those information units to schema vocabulary.

GEO Impact — Academic + Industry Evidence (as of 2026-05-09)

The impact of these 13 types is verified by both academic research (KDD 2024 / Chen 2025) and 2026 industry analysis (Ahrefs / Bing / Google).

Ahrefs Feb 2026 Quantification — AI Citations Are Expanding Beyond SERPs

According to an Ahrefs analysis covered by Search Engine Land in 2026, which examined 863,000 keyword SERPs and 4M AI Overview URLs, the share of AI Overview cited pages that also rank in the top 10 SERP positions fell from 76% (mid-2025) to 38% (February 2026). AI answers are increasingly citing pages outside SERP visibility, so as of 2026 schema.org's role in entity attribution carries more weight. When a cited page has no top-10 ranking to vouch for it, explicit structured data is one of the few page-level signals left for an LLM to identify and trust it. Bing (March 2025) and Google (April 2025) have publicly acknowledged that schema markup contributes to LLM-era search (Copilot content understanding and a Google search advantage, respectively).
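
The size of that shift is easier to see as arithmetic. The short calculation below uses only the two percentages quoted above; the variable names are illustrative.

```python
overlap_mid_2025 = 76  # % of AI Overview citations also in the top 10 (mid-2025)
overlap_feb_2026 = 38  # same share in February 2026

point_drop = overlap_mid_2025 - overlap_feb_2026   # absolute change in percentage points
relative_drop = point_drop / overlap_mid_2025      # change relative to the starting share
outside_top10 = 100 - overlap_feb_2026             # cited pages not ranking in the top 10

print(f"{point_drop} points, {relative_drop:.0%} relative, {outside_top10}% outside top 10")
# prints: 38 points, 50% relative, 62% outside top 10
```

Put differently, the overlap halved, and by these figures roughly six in ten cited pages no longer come from the top 10.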

Cite-source Strategy (KDD 2024)

Aggarwal et al. KDD 2024 quantitatively evaluated nine GEO strategies and reported visibility gains of up to about 40% in generative engine responses, with the Cite-source strategy (explicitly citing sources) among the strongest performers. Schema.org's Article.author, Organization.url, and Article.citation fields can be read as a structured implementation of this strategy. Declaring sources via schema is more robust to LLM extraction than writing "Source: ..." in body prose.
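
A sketch of declaring sources through Article.citation might look like the following. The helper `with_citations` is hypothetical; the two cited works are the papers this post references, and the URLs follow the standard arXiv abstract pattern for the IDs given in Data Basis.

```python
import json


def with_citations(article_node, sources):
    """Return a copy of an Article node with a structured citation list."""
    cited = dict(article_node)
    cited["citation"] = [
        {"@type": "CreativeWork", "name": name, "url": url} for name, url in sources
    ]
    return cited


base_article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "RanketAI Guide #06: Schema.org 13 Types and GEO Impact",
}

sources = [
    ("GEO: Generative Engine Optimization", "https://arxiv.org/abs/2311.09735"),
    ("How to Dominate AI Search", "https://arxiv.org/abs/2509.08919"),
]

print(json.dumps(with_citations(base_article, sources), indent=2))
```

The original node is copied rather than mutated, so the same base Article can be reused with different source lists.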

Earned Media Bias (Chen et al. 2025)

The "AI search bias toward earned media" that Chen et al. 2025 proves quantitatively boils down to how a third-party mention is attributed back to the brand. Schema.org's Organization.sameAs field (cross-reference to authoritative sources like Wikipedia, Crunchbase, LinkedIn) is the base layer of that attribution.
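
The sameAs cross-reference can be sketched as a small merge step. The helper `with_same_as` and every profile URL below are placeholders; a real deployment would list the organization's actual profiles.

```python
def with_same_as(org_node, profile_urls):
    """Return a copy of an Organization node with a de-duplicated sameAs list."""
    existing = org_node.get("sameAs", [])
    if isinstance(existing, str):  # JSON-LD allows a single string value here
        existing = [existing]
    merged = list(existing) + list(profile_urls)
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    unique = list(dict.fromkeys(url for url in merged if url.startswith("https://")))
    result = dict(org_node)
    result["sameAs"] = unique
    return result


org = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "sameAs": "https://www.linkedin.com/company/example-co",  # hypothetical profile
}

linked = with_same_as(org, [
    "https://www.linkedin.com/company/example-co",        # duplicate, dropped
    "https://www.crunchbase.com/organization/example-co",  # hypothetical profile
    "http://insecure.example.com/profile",                  # non-https, skipped
])
print(linked["sameAs"])
```

Skipping non-https URLs is a design choice of this sketch, not a schema.org requirement.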

E-E-A-T (Google's Official Frame)

Google's E-E-A-T frame consists of four evaluation axes: Experience, Expertise, Authoritativeness, and Trust. Schema.org's Person.knowsAbout, Article.author, Organization.foundingDate, and Review.author serve as primary signals for each axis. The same frame applies to AI Overviews' answer-quality evaluation.

RanketAI Site Check Recommendation Matrix

The rationale for each of the 13 type recommendations, in one table:

Type	Group	Google Rich Results	AI Answer Citation Frequency	Rationale
Organization	A	✅ Logo · sameAs	High	Primary entity signal (every page)
Corporation	A	(Organization-compatible)	Medium	Reinforces legal entity info
Person	A	(Indirect)	High	E-E-A-T Authoritativeness
WebSite	B	✅ Site name (sitelinks search box retired in 2024)	Medium	Site-level wrapper
WebPage	B	(Baseline)	Medium	Baseline for every page
BreadcrumbList	B	✅ SERP path display	High	Page hierarchy recognition
CollectionPage	B	(Indirect)	Medium	Listing-page identification
Article	C	✅ headline · author · image	Top	#1 for AI answer citation
FAQPage	C	△ FAQ display (limited to well-known government and health sites since 2023)	Top	Core of AEO
Product	D	✅ Star rating · price · stock	High	Commerce
Service	D	(Indirect)	Medium	SaaS · B2B
LocalBusiness	D	✅ Local Pack · Map	Medium	Brick-and-mortar businesses
Event	D	✅ Event card	Medium	Schedule information

Key patterns:

Article + FAQPage — Highest AI answer citation frequency. Apply to all content pages first.
Organization + Person — Primary entity signals. Publishing them once site-wide (or on a representative page) is enough.
BreadcrumbList — Direct trigger for Rich Results' SERP path display. Add to every page as a baseline.
Product / LocalBusiness / Event — Apply selectively by business model.

Connection to RanketAI Site Check

RanketAI site check evaluates the following items during page analysis:

Site check evaluation item	Mapping to 13 types
Number of detected schema.org types	How many of the 13 types appear on the page
JSON-LD format adoption	Recommend converting Microdata / RDFa to JSON-LD
Missing-recommended-type analysis	Which of the 13 are missing + page-type priorities
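
The first and third evaluation items can be approximated with a short coverage report over already-parsed JSON-LD. This is a sketch under the assumption that detection means finding an `@type` value anywhere in the data, including nested nodes and `@graph` arrays; it is not RanketAI site check's actual implementation, and the function names are hypothetical.

```python
RECOMMENDED_TYPES = [
    "Organization", "Corporation", "Person",                  # Group A
    "WebSite", "WebPage", "BreadcrumbList", "CollectionPage",  # Group B
    "Article", "FAQPage",                                      # Group C
    "Product", "Service", "LocalBusiness", "Event",            # Group D
]


def collect_types(node, found=None):
    """Recursively gather every @type value in parsed JSON-LD."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        types = node.get("@type")
        if isinstance(types, str):
            found.add(types)
        elif isinstance(types, list):
            found.update(t for t in types if isinstance(t, str))
        for value in node.values():
            collect_types(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_types(item, found)
    return found


def coverage_report(blocks):
    """Summarize which of the 13 recommended types appear in the page's JSON-LD."""
    found = collect_types(blocks)
    detected = [t for t in RECOMMENDED_TYPES if t in found]
    missing = [t for t in RECOMMENDED_TYPES if t not in found]
    return {"count": len(detected), "detected": detected, "missing": missing}


page_blocks = [
    {
        "@type": "Article",
        "author": {"@type": "Person", "name": "Jane Doe"},
        "publisher": {"@type": "Organization", "name": "Example Co"},
    },
    {"@type": "BreadcrumbList", "itemListElement": []},
]

print(coverage_report(page_blocks))
```

For the sample page the report detects four of the 13 types (Organization, Person, BreadcrumbList, Article) and lists the remaining nine as missing.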

When the diagnosis is weak, prioritize fixes by the following causal mapping:

Weak scenario	Recommended type to add first
Company site without Organization	Organization (Group A) — every page
Article / blog without Article	Article (Group C) — content pages
FAQ section without FAQPage	FAQPage (Group C) — Q&A pages
Category path without BreadcrumbList	BreadcrumbList (Group B) — every page
Commerce site without Product	Product (Group D) — product pages
Microdata only (no JSON-LD)	Re-author the same schema in JSON-LD
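
The mapping in this table can be expressed as an ordered rule list. The flag names (`is_company_site`, `has_faq_section`, and so on) and the `recommend` function are hypothetical; how a page would be classified into those flags is outside the scope of this sketch.

```python
# (page condition, type to add) pairs, in the order the table lists them.
RULES = [
    ("is_company_site", "Organization"),
    ("is_article", "Article"),
    ("has_faq_section", "FAQPage"),
    ("has_category_path", "BreadcrumbList"),
    ("is_commerce", "Product"),
]


def recommend(page_flags, detected_types, uses_jsonld=True):
    """Return the fixes to apply first, given page traits and detected types."""
    fixes = [
        schema_type
        for flag, schema_type in RULES
        if page_flags.get(flag) and schema_type not in detected_types
    ]
    if not uses_jsonld:
        fixes.append("Re-author existing schema in JSON-LD")
    return fixes


flags = {"is_company_site": True, "is_article": True, "has_category_path": True}
print(recommend(flags, {"Article"}))  # ['Organization', 'BreadcrumbList']
```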

In short, RanketAI site check's schema diagnosis is not an abstract guideline but a frame that diagnoses AI-answer citation potential at the page level. When a weakness is found, prioritize additions by the mapping above.

Conclusion — Standard + Academic + Industry + Measurement (4-Axis Consensus, as of 2026-05-09)

This is a 4-axis consensus frame, adding 2026 industry analysis to the 3-axis frame from #05.

Standard — schema.org · Google Rich Results · JSON-LD W3C Recommendation — Vocabulary and format consensus for the 13 types
Academic — KDD 2024 · Chen et al. 2025 — Quantitative validation of Cite-source strategy and earned-media bias
Industry (2026) — Search Engine Land 2026 (Ahrefs coverage) — AI Overview citations rapidly expanding beyond SERPs (top-10 SERP overlap dropped from 76% to 38%) + Bing/Google 2025 official acknowledgment of schema's contribution
Measurement — RanketAI site check — Per-page detection of the 13 types + weakness diagnosis

The 4-axis consensus reaches a clear conclusion:

JSON-LD first — Microdata / RDFa have lower extraction robustness. Adopt JSON-LD as the single format.
Group C (Article · FAQPage) is top for citation frequency — Apply first to content pages.
Group A (Organization · Person) is the primary entity signal — Publishing once site-wide is sufficient.
BreadcrumbList triggers Rich Results — Add to every page as a baseline.
Group D (Product · Service · LocalBusiness · Event) — Apply selectively by business model.

Schema.org review is the second starting point of GEO work. When weaknesses are found in your measurement results (the schema-type detection count of RanketAI site check), reinforce by priority following the 13-type matrix above.

⚠ Schema specifications are updated frequently. The 13-type recommendations and priorities in this guide are based on the 2026-05-09 snapshot. Before applying them in production, please verify against the official references listed under Data Basis below (especially Google Rich Results guides, individual schema.org type pages, and the latest Ahrefs / Search Engine Land analysis).

Further reading: #01 — Why SEO Alone Isn't Enough in the AI Search Era · #02 — LLM Citation Algorithm Anatomy · #03 — Korea's AI Visibility Gap · #04 — GEO Academia vs Industry vs Measurement Mapping · #05 — The Four AI Crawler Policies

Execution Summary

Item	Practical guideline
Core topic	RanketAI Guide #06: Schema.org 13 Types and GEO Impact
Best fit	Prioritize for AI Business, Funding & Market workflows
Primary action	Define a measurable success KPI (cost, time, or quality) before starting any AI initiative
Risk check	Validate ROI assumptions with a small pilot before committing the full budget
Next step	Establish a quarterly review cadence to track KPI movement and adjust scope

Data Basis

Schema.org official standard (schema.org) — A structured data vocabulary maintained by the W3C Schema.org Community Group, defining 800+ types and 1400+ properties. Specifies three syntaxes — JSON-LD, Microdata, and RDFa. Co-sponsored by Google, Microsoft, Yahoo, and Yandex, making it the de-facto search engine standard.
Google Search Central — Structured Data guide (developers.google.com/search/docs/appearance/structured-data). The official list of schema types eligible for Rich Results. Explicitly recommends JSON-LD over alternatives, with dedicated spec pages for Article, FAQPage, BreadcrumbList, Product, and Organization.
Aggarwal et al. "GEO: Generative Engine Optimization" (Princeton · IIT Delhi · Georgia Tech, KDD 2024, arXiv:2311.09735) — The academic origin of quantitative GEO strategy validation. Quantifies how citation, quotation, and statistics signals influence visibility in LLM answers. The best-performing strategies, Cite-source among them, improve visibility in generative engine responses by up to about 40%.
Chen · Wang · Chen · Koudas. "How to Dominate AI Search" (2025-09, arXiv:2509.08919) — Quantitatively shows AI search is systematically biased toward earned media (third-party sources) and structured authority signals. Provides academic backing for the citation effect of Schema.org Organization, Person, and Article.
Google — Helpful, reliable, people-first content (updated 2024) — The E-E-A-T frame (Experience · Expertise · Authoritativeness · Trust) treats schema.org's author, publisher, and reviewedBy fields as primary entity-identification signals. The same frame applies to AI Overviews answer-quality evaluation.
JSON-LD 1.1 W3C Recommendation (2020-07) — The official W3C Recommendation for JSON for Linked Data. The `<script type="application/ld+json">` pattern, which separates from HTML body, is superior to Microdata and RDFa in extraction robustness and maintainability. JSON-LD-first extraction is observed in Google, OpenAI, Anthropic, and Perplexity LLM citations.
Microsoft Bing Webmaster — Structured Markup Validator and JSON-LD-first recommendation. The Bing search index (shared backend with Copilot and Bing Chat) uses Schema.org vocabulary as the primary signal for entity disambiguation (homonyms — same-name people or companies).
Search Engine Land 2026 (covering Ahrefs' February 2026 study). Analysis of 863,000 keyword SERPs and 4M AI Overview URLs. The share of AI Overview citations that also rank in the top 10 SERP positions dropped from 76% (mid-2025) to 38% (February 2026). AI answers are rapidly expanding their citation pool beyond SERP-visible pages — including pages identified through schema.org. Bing (March 2025) and Google (April 2025) have publicly acknowledged schema's LLM contribution (Copilot content understanding + Google search advantage).

Key Claims and Sources

This section maps key claims to their supporting sources one by one for fast verification. Full details for each source are listed under Data Basis above.

Claim: Schema.org is the de-facto official vocabulary, co-sponsored by Google, Microsoft, Yahoo, and Yandex
Source: Schema.org official
Claim: Google explicitly recommends JSON-LD over Microdata and RDFa
Source: Google Structured Data guide
Claim: Article, FAQPage, and BreadcrumbList are the core schema types for Google Rich Results eligibility
Source: Google Search Central
Claim: KDD 2024 validates 9 GEO strategies; Cite-source and the other top strategies improve visibility in generative engine responses by up to about 40%
Source: Aggarwal et al. KDD 2024 (arXiv:2311.09735)
Claim: AI search is systematically biased toward structured authority signals (Organization, Person, Article schema)
Source: Chen et al. 2025 (arXiv:2509.08919)
Claim: JSON-LD 1.1 is the official W3C Recommendation, superior to Microdata in extraction robustness due to HTML body separation
Source: W3C JSON-LD 1.1 Recommendation
Claim: Under E-E-A-T, schema.org author and publisher fields serve as primary entity-identification signals
Source: Google Helpful Content guide
Claim: Ahrefs (Feb 2026): top-10 SERP overlap of AI Overview citations dropped from 76% (mid-2025) to 38%; AI answers expanded their citation pool beyond SERP-visible pages, raising the role of schema.org entity attribution
Source: Search Engine Land 2026 (Ahrefs coverage)

