Insight / 07

Schema Markup for AI Search

A practical guide to JSON-LD for the generative era. Which types matter, how to chain them with stable entity IDs, and the three mistakes that kill citation share.

TL;DR

AI answer engines parse schema as confirmation of what the prose claims. The types that matter most for GEO (generative engine optimization) are Organization, Service/Product, Article, FAQPage, BreadcrumbList and HowTo. The high-leverage move is chaining them with stable @id references so every page participates in one coherent knowledge graph.

Why schema matters more in the AI era, not less

There's a persistent myth that because LLMs can parse unstructured text, schema is obsolete. The opposite is true. LLMs parse unstructured text generatively — which means they also generate hallucinations, ambiguities and confident wrong answers. Schema is how you give the engine a machine-readable fact-check of what the prose is claiming. It's the difference between "the model thinks your brand is a pharma company" and "the model knows your brand is a pharma company because the schema says so and the prose confirms it." Every engine we've measured rewards the coherent pair.

The schema types that actually matter

Organization — one per site, canonical node. Include name, url, logo, sameAs (pointing to Wikidata, LinkedIn, Crunchbase, etc), foundingDate, address, and contactPoint. This is the root node every other schema block should reference.
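
As a rough illustration, a root Organization node might look like the sketch below, placed inside a script tag of type application/ld+json. The @id follows the https://yoursite.com/#organization convention used in this guide. Everything else (the company name, logo path, Wikidata ID, profile URLs, address and email) is a placeholder to replace with your own verified details. The founding year should match whatever your About page states.

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "@id": "https://yoursite.com/#organization",
  "name": "Example Co",
  "url": "https://yoursite.com/",
  "logo": "https://yoursite.com/assets/logo.png",
  "sameAs": [
    "https://www.wikidata.org/wiki/Q00000000",
    "https://www.linkedin.com/company/example-co",
    "https://www.crunchbase.com/organization/example-co"
  ],
  "foundingDate": "2015",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "1 Example Street",
    "addressLocality": "Example City",
    "postalCode": "00000",
    "addressCountry": "GB"
  },
  "contactPoint": {
    "@type": "ContactPoint",
    "contactType": "customer support",
    "email": "hello@yoursite.com"
  }
}
```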

ProfessionalService / Service / Product — for every offering. Name, description, serviceType, provider (pointing back to the Organization by @id), areaServed, offers. This is where most sites are either absent or shipping junk.
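
A minimal sketch of a Service block, assuming a hypothetical service page at /services/example-service/. The service name, description, serviceType, area, price and currency are illustrative placeholders. The part that matters is the provider, which points back to the Organization by @id rather than repeating its details.

```json
{
  "@context": "https://schema.org",
  "@type": "Service",
  "@id": "https://yoursite.com/services/example-service/#service",
  "name": "Example Service",
  "description": "One or two sentences that match the opening paragraph of the service page.",
  "serviceType": "Consulting",
  "provider": { "@id": "https://yoursite.com/#organization" },
  "areaServed": { "@type": "Country", "name": "United Kingdom" },
  "offers": {
    "@type": "Offer",
    "url": "https://yoursite.com/services/example-service/",
    "priceCurrency": "GBP",
    "price": "5000"
  }
}
```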

Article — on every editorial page. Headline, description, datePublished, dateModified, author (with @id), publisher (with @id pointing to the Organization), mainEntityOfPage. Critical for Article citations.
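
For an editorial page, the block could look something like this. The URL path, dates and author (Jane Doe) are assumptions made up for illustration. The publisher is only a reference to the Organization @id, and the author carries an @id of its own so the same person can be reused across articles.

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "@id": "https://yoursite.com/insights/schema-markup-for-ai-search/#article",
  "headline": "Schema Markup for AI Search",
  "description": "A practical guide to JSON-LD for the generative era.",
  "datePublished": "2025-01-15",
  "dateModified": "2025-02-01",
  "author": {
    "@type": "Person",
    "@id": "https://yoursite.com/team/jane-doe/#person",
    "name": "Jane Doe"
  },
  "publisher": { "@id": "https://yoursite.com/#organization" },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://yoursite.com/insights/schema-markup-for-ai-search/"
  }
}
```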

FAQPage — on service and article pages that answer questions. Each Question gets its Answer explicitly, structured, and matching what the prose says. Do not use FAQPage as a keyword-stuffing vehicle. Engines will ignore the markup or demote the page.
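
A short sketch of the shape, using two questions from this page's own FAQ. The @id path is a hypothetical placeholder. Each answer text should mirror the visible answer on the page rather than add claims the prose does not make.

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "@id": "https://yoursite.com/insights/schema-markup-for-ai-search/#faq",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Does schema still matter if AI can read unstructured text?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "More than ever. Schema is how you give the engine a machine-readable confirmation of what the prose claims."
      }
    },
    {
      "@type": "Question",
      "name": "Should I use JSON-LD or microdata?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "JSON-LD. It keeps the schema separate from the rendered HTML, which is easier to maintain."
      }
    }
  ]
}
```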

BreadcrumbList — on every page. Unglamorous. High compounding. It tells the engine where the page sits in the site's hierarchy.
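
As an example, a three-level trail for an article might be marked up as below. The Insights section and the URL paths are assumed for illustration. Positions start at 1 and follow the hierarchy from the homepage down to the current page.

```json
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    {
      "@type": "ListItem",
      "position": 1,
      "name": "Home",
      "item": "https://yoursite.com/"
    },
    {
      "@type": "ListItem",
      "position": 2,
      "name": "Insights",
      "item": "https://yoursite.com/insights/"
    },
    {
      "@type": "ListItem",
      "position": 3,
      "name": "Schema Markup for AI Search",
      "item": "https://yoursite.com/insights/schema-markup-for-ai-search/"
    }
  ]
}
```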

HowTo — on step-based content. Heavily used by Google AI Overviews for "how do I" queries.
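
A minimal HowTo sketch, assuming a hypothetical step-based page about validating markup. The step names and wording are illustrative only. Each HowToStep should correspond to a step a reader can actually see on the page.

```json
{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to validate schema markup",
  "step": [
    {
      "@type": "HowToStep",
      "position": 1,
      "name": "Run the Rich Results Test",
      "text": "Paste the page URL into Google's Rich Results Test and review the detected items."
    },
    {
      "@type": "HowToStep",
      "position": 2,
      "name": "Run the Schema Markup Validator",
      "text": "Paste the same URL into the Schema Markup Validator and check for errors and warnings."
    },
    {
      "@type": "HowToStep",
      "position": 3,
      "name": "Fix and re-test",
      "text": "Correct the reported issues, redeploy, and run both tools again."
    }
  ]
}
```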

The high-leverage move: stable @id chaining

The move that separates serious schema from decorative schema is using stable @id references to build one coherent graph across the whole site. Every page's Organization block uses the same @id (e.g. https://yoursite.com/#organization). Every Service block references that same @id as its provider. Every Article references the same Organization as publisher. The result: an answer engine parsing any single page can reconstruct the whole knowledge graph of the brand. This is what LLMs reward — because it's how their internal models of entities are structured anyway.
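
One way to picture the chaining is a single @graph on a service page, sketched below. The #website and #service fragments, the page path and the names are assumptions chosen for illustration. Only the #organization convention comes from this guide. Every node either defines an @id or points at one, so a parser can walk from the page to the site to the brand.

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://yoursite.com/#organization",
      "name": "Example Co",
      "url": "https://yoursite.com/"
    },
    {
      "@type": "WebSite",
      "@id": "https://yoursite.com/#website",
      "url": "https://yoursite.com/",
      "name": "Example Co",
      "publisher": { "@id": "https://yoursite.com/#organization" }
    },
    {
      "@type": "WebPage",
      "@id": "https://yoursite.com/services/example-service/",
      "url": "https://yoursite.com/services/example-service/",
      "isPartOf": { "@id": "https://yoursite.com/#website" },
      "mainEntity": { "@id": "https://yoursite.com/services/example-service/#service" }
    },
    {
      "@type": "Service",
      "@id": "https://yoursite.com/services/example-service/#service",
      "name": "Example Service",
      "provider": { "@id": "https://yoursite.com/#organization" }
    }
  ]
}
```

The same Organization and WebSite @ids would appear unchanged on every other page, with only the page-level nodes differing.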

Three mistakes that kill citation share

Mistake 1 — shipping schema that contradicts the prose. If your schema says "founded in 2015" and your About page says "founded in 2018," the engine drops trust in both. Consistency is the whole game.

Mistake 2 — keyword-stuffing FAQPage blocks. Every modern answer engine has filters for this. Stuffing FAQPage with marketing questions that aren't really questions will get your schema ignored or the page demoted. Only ship FAQ schema for questions real users actually ask.

Mistake 3 — unstable or unresolvable @ids. If your @ids don't resolve to real URL fragments, or change between page loads, the engine cannot link nodes into a graph. Pick a convention (e.g. https://yoursite.com/#organization) and use it everywhere.

A minimal schema architecture to start from

Every page: BreadcrumbList. The homepage: Organization + WebSite. Every service page: Service + FAQPage + BreadcrumbList. Every article: Article + FAQPage + BreadcrumbList. Every product page: Product + FAQPage + BreadcrumbList. This covers 95% of the GEO-relevant cases for a typical B2B site and takes one senior engineer about two weeks to implement cleanly.

FAQ

Common questions.

Does schema still matter if AI can read unstructured text?

More than ever. Schema is how you give the engine a machine-readable confirmation of what the prose claims, which is exactly what reduces hallucination and increases citation confidence.

Should I use JSON-LD or microdata?

JSON-LD. It's the format Google recommends, the other major engines parse it, and it keeps the schema separate from the rendered HTML, which is easier to maintain.

How do I validate schema?

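Google's Rich Results Test and the Schema Markup Validator at validator.schema.org (this is the Schema.org validator; the two names refer to the same tool). Run both. The first checks eligibility for Google's rich results, and the second checks general Schema.org syntax and vocabulary, so they catch different issues.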
