In brief
Schema.org is a standardised vocabulary of structured data, created in 2011
by Google, Microsoft, Yahoo and Yandex. It lets you annotate HTML content
so that machines (search bots, LLMs, AI agents) understand the nature of what they read, not just the words.
An article with an Article schema is immediately identifiable
as a dated publication. A FAQ page with a FAQPage schema
exposes its questions and answers in a directly usable form, without HTML parsing.
This guide covers the most important schemas for visibility in
AI answer engines, with ready-to-copy JSON-LD examples.
1. Why schema.org matters for LLMs
LLMs interact with your content at two distinct moments:
- Training (GPTBot, ClaudeBot, Applebot-Extended...). Training crawlers collect billions of pages. Schema.org allows them to categorise the document (article, FAQ, organisation, person), detect its date and author, and understand relationships between entities (sameAs, memberOf). This influences what the model "knows" about your brand after training.
- Web retrieval (web RAG) (Perplexity, ChatGPT Search). The system crawls the page and extracts passages. The Article schema with dateModified is used to evaluate freshness. The FAQPage schema directly exposes question/answer pairs to retrieval: perfectly structured chunks.
In parallel, schema.org improves performance in Google Search (Featured Snippets, Rich Snippets, AI Overviews) and in Bing, which is the underlying data source for Perplexity and ChatGPT Search. The impact is therefore both direct (LLMs read schema) and indirect (better ranking in the source indexes of web RAG).
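To make the "machines read schema" idea concrete, here is a minimal Python sketch of how a crawler-side script could pull JSON-LD out of a page using only the standard library. The class and function names are illustrative assumptions, not the code of any actual bot, and real crawlers may process pages quite differently.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_schema_objects(html):
    """Return every top-level JSON-LD object on the page, skipping invalid blocks."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    objects = []
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # a malformed block is dropped entirely
        items = data if isinstance(data, list) else [data]
        objects.extend(item for item in items if isinstance(item, dict))
    return objects


# Hypothetical page used only for illustration.
page = """<html><head><script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article",
 "headline": "Exact article title", "dateModified": "2026-04-22"}
</script></head><body><p>Body text</p></body></html>"""

for obj in extract_schema_objects(page):
    print(obj.get("@type"), obj.get("dateModified"))
```

The page type and the modification date are available without touching the visible HTML, which is the point made above.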
2. Priority schemas by page type
2.1 Article / BlogPosting
To use on all editorial content pages (articles, guides, analyses). Essential fields:
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "Exact article title",
"description": "150-160 character summary",
"datePublished": "2026-04-22",
"dateModified": "2026-04-22",
"inLanguage": "en",
"author": {
"@type": "Organization",
"name": "Your organisation",
"url": "https://your-site.com"
},
"mainEntityOfPage": "https://your-site.com/article/"
}
Common errors: omitting dateModified
(LLMs cannot detect freshness), leaving description empty or
identical to the title, omitting inLanguage on multilingual sites.
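The common errors above are easy to check automatically before publishing. The following Python sketch is one possible lint, assuming the Article schema is available as a dictionary; the function name `lint_article` and the `multilingual` flag are hypothetical.

```python
def lint_article(schema, multilingual=False):
    """Return warnings for the common Article errors: missing dateModified,
    empty or duplicated description, missing inLanguage on multilingual sites."""
    warnings = []
    if not schema.get("dateModified"):
        warnings.append("dateModified is missing")

    description = (schema.get("description") or "").strip()
    headline = (schema.get("headline") or "").strip()
    if not description:
        warnings.append("description is empty")
    elif description == headline:
        warnings.append("description is identical to headline")

    if multilingual and not schema.get("inLanguage"):
        warnings.append("inLanguage is missing on a multilingual site")
    return warnings


# Deliberately flawed example: no dateModified, no inLanguage,
# and a description copied from the headline.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Exact article title",
    "description": "Exact article title",
    "datePublished": "2026-04-22",
}

for warning in lint_article(article, multilingual=True):
    print(warning)
```

Running a check like this in a build step catches the three mistakes before the page is deployed.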
2.2 FAQPage
To use on pages containing a questions/answers section. This schema is the most directly exploited by LLMs: it pre-digests the chunking work by exposing self-contained Q/A pairs.
{
"@context": "https://schema.org",
"@type": "FAQPage",
"mainEntity": [
{
"@type": "Question",
"name": "What is RAG?",
"acceptedAnswer": {
"@type": "Answer",
"text": "RAG (Retrieval Augmented Generation) is a system that allows an LLM to consult external sources before generating its response, to produce cited and up-to-date answers."
}
}
]
}
Quality rule: each answer (text) must
be self-contained (understandable without reading the question) and complete
(no "see above" or implicit reference). An answer of fewer
than 40 words is often too short to be useful.
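The quality rule can be approximated with a simple script. This Python sketch flags answers under the 40-word rule of thumb and answers containing a few outward-pointing phrases. The phrase list is a hypothetical, non-exhaustive assumption, and a word count is only a rough proxy for self-containment.

```python
import re

MIN_WORDS = 40  # threshold taken from the rule of thumb above

# Hypothetical, non-exhaustive list of phrases that point outside the answer.
DANGLING_REFERENCES = re.compile(
    r"\b(see above|see below|as mentioned|as stated earlier)\b", re.IGNORECASE
)


def review_faq(faq_schema):
    """Return (question, problem) pairs for answers that look too short
    or that refer to content outside the answer itself."""
    problems = []
    for question in faq_schema.get("mainEntity", []):
        name = question.get("name", "")
        text = question.get("acceptedAnswer", {}).get("text", "")
        word_count = len(text.split())
        if word_count < MIN_WORDS:
            problems.append((name, f"only {word_count} words"))
        if DANGLING_REFERENCES.search(text):
            problems.append((name, "refers to content outside the answer"))
    return problems


# Deliberately weak answer used for illustration.
faq = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "What is RAG?",
            "acceptedAnswer": {"@type": "Answer", "text": "See above."},
        }
    ],
}

for name, problem in review_faq(faq):
    print(f"{name}: {problem}")
```

A flagged answer still needs a human re-read; the script only narrows down where to look.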
2.3 Organization
To place on the homepage or About page. This schema is the main
vector of entity disambiguation: it links
your site to your Wikidata, Wikipedia, LinkedIn, Crunchbase profiles via
sameAs. LLMs use these links to build
a coherent representation of your organisation.
{
"@context": "https://schema.org",
"@type": "Organization",
"name": "Name of your organisation",
"url": "https://your-site.com",
"logo": "https://your-site.com/logo.png",
"description": "Factual description in 1-2 sentences",
"foundingDate": "2024",
"sameAs": [
"https://www.wikidata.org/wiki/Q...",
"https://www.linkedin.com/company/...",
"https://en.wikipedia.org/wiki/..."
]
}
2.4 BreadcrumbList
To place on all pages except homepage. Strong site structure signal for LLMs, which use it to understand the thematic hierarchy of your content.
{
"@context": "https://schema.org",
"@type": "BreadcrumbList",
"itemListElement": [
{
"@type": "ListItem",
"position": 1,
"name": "Home",
"item": "https://your-site.com/"
},
{
"@type": "ListItem",
"position": 2,
"name": "Insights",
"item": "https://your-site.com/insights/"
},
{
"@type": "ListItem",
"position": 3,
"name": "Schema.org and LLMs"
}
]
}
2.5 HowTo
To use on pages describing a step-by-step procedure.
It may still inform Google AI Overviews for "how" queries, although Google retired its dedicated HowTo rich results in 2023.
Each HowToStep can become a self-contained chunk in a RAG pipeline.
{
"@context": "https://schema.org",
"@type": "HowTo",
"name": "How to optimise your site for LLMs",
"step": [
{
"@type": "HowToStep",
"position": 1,
"name": "Step 1: Audit robots.txt",
"text": "Verify that the major AI bots (GPTBot, PerplexityBot, ClaudeBot) are not blocked in your robots.txt."
},
{
"@type": "HowToStep",
"position": 2,
"name": "Step 2: Add Article schema",
"text": "Add a JSON-LD Article block with datePublished and dateModified on each content page."
}
]
}
2.6 WebSite
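To place only on the homepage. Allows search engines
and LLMs to understand that your site is a coherent entity.
The optional potentialAction field (a SearchAction) used to enable the sitelinks search box
in Google, a feature Google retired in November 2024; it is omitted from the example here.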
{
"@context": "https://schema.org",
"@type": "WebSite",
"name": "Site name",
"url": "https://your-site.com",
"inLanguage": "en",
"description": "Short site description"
}
Note: do not duplicate the WebSite schema across multiple pages; one instance on the homepage is sufficient.
3. Injecting multiple schemas on the same page
The recommended technique in 2026 is to inject a JSON-LD array
containing multiple schema objects in a single
<script type="application/ld+json"> tag:
<script type="application/ld+json">
[
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "...",
"datePublished": "2026-04-22",
"dateModified": "2026-04-22"
},
{
"@context": "https://schema.org",
"@type": "FAQPage",
"mainEntity": [...]
}
]
</script>
This approach is validated by Google and correctly read by LLMs that parse JSON-LD. Alternatively, two separate script tags can be used.
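If the page is generated by a template or build script, the array can be serialised programmatically rather than written by hand. The Python sketch below shows one way to do it; `render_jsonld` is a hypothetical helper name, and the field values are placeholders reused from the examples in this guide.

```python
import json


def render_jsonld(schemas):
    """Serialise several schema objects into a single JSON-LD script tag."""
    payload = json.dumps(schemas, ensure_ascii=False, indent=2)
    # Escape "</" so a string value cannot close the script element early.
    # "\/" is a valid JSON escape for "/", so the data is unchanged once parsed.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Exact article title",
    "datePublished": "2026-04-22",
    "dateModified": "2026-04-22",
}
faq = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "What is RAG?",
            "acceptedAnswer": {"@type": "Answer", "text": "Placeholder answer text."},
        }
    ],
}

print(render_jsonld([article, faq]))
```

Generating the tag from data structures also removes the risk of hand-typed syntax errors such as a missing comma.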
4. Schema errors that harm AI visibility
4.1 Schema inconsistent with visible content
If your FAQPage schema lists questions absent from the
visible HTML, Google and LLMs detect the inconsistency. Rule: schema
must reflect what a user would see on the page, not invisible or
truncated content.
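A rough version of this consistency check can be scripted. The Python sketch below extracts the text nodes of a page and reports FAQPage questions that do not appear in them. It is a simplification under stated assumptions: it ignores script and style content but cannot tell whether text is hidden by CSS, and the function names are illustrative.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text nodes, ignoring <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def normalise(text):
    """Collapse whitespace and lowercase so comparisons are less brittle."""
    return " ".join(text.split()).lower()


def questions_missing_from_page(html, faq_schema):
    """Return the FAQPage question names that do not occur in the page text."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    visible = normalise(" ".join(parser.chunks))
    return [
        question.get("name", "")
        for question in faq_schema.get("mainEntity", [])
        if normalise(question.get("name", "")) not in visible
    ]


# Hypothetical page and schema: the second question exists only in the schema.
html = "<html><body><h2>What is RAG?</h2><p>An answer.</p></body></html>"
faq = {
    "@type": "FAQPage",
    "mainEntity": [
        {"@type": "Question", "name": "What is RAG?"},
        {"@type": "Question", "name": "What is GEO?"},
    ],
}

print(questions_missing_from_page(html, faq))
```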
4.2 Incorrect or missing date data
datePublished: "2026" gives only a year, not a full date, and is not usable as a publication date.
Use "2026-04-22" (YYYY-MM-DD) or
"2026-04-22T10:00:00+00:00" (with time and timezone), both ISO 8601 formats.
A value that a parser cannot read as a date is typically ignored.
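The two recommended formats can be checked with a short Python function. This sketch is deliberately stricter than ISO 8601 itself: it accepts only a full calendar date, or a date with time and an explicit timezone, which matches the recommendation above. The function name is an assumption.

```python
import re
from datetime import date, datetime

# Only the two shapes recommended above are accepted.
DATE_ONLY = re.compile(r"\d{4}-\d{2}-\d{2}")
DATE_TIME = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(:\d{2})?(Z|[+-]\d{2}:\d{2})")


def is_valid_schema_date(value):
    """True for YYYY-MM-DD or a date-time with an explicit timezone."""
    try:
        if DATE_ONLY.fullmatch(value):
            date.fromisoformat(value)  # rejects impossible dates such as month 13
            return True
        if DATE_TIME.fullmatch(value):
            # Older Python versions do not accept a trailing "Z".
            datetime.fromisoformat(value.replace("Z", "+00:00"))
            return True
    except ValueError:
        return False
    return False


for candidate in ["2026", "2026-04-22", "2026-04-22T10:00:00+00:00", "22/04/2026", "2026-13-01"]:
    print(candidate, is_valid_schema_date(candidate))
```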
4.3 Organization.sameAs pointing to broken URLs
Systems that follow sameAs links (during training or retrieval) need
them to resolve to a page that mentions
your entity. An empty Wikidata entry or a LinkedIn 404 link weakens
the entity signal rather than strengthening it.
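A periodic audit of sameAs targets can be sketched as follows. To keep the example self-contained and runnable offline, the HTTP call is injected: `fetch` is a hypothetical caller-supplied function that takes a URL and returns a status code and the page text. In practice it would wrap whatever HTTP client you already use. The URLs below are placeholders.

```python
def audit_same_as(organization, fetch):
    """Check each sameAs URL with the supplied fetch(url) -> (status, body).

    A link is "ok" only if it resolves with HTTP 200 and the page text
    mentions the organisation name.
    """
    name = organization.get("name", "").lower()
    report = {}
    for url in organization.get("sameAs", []):
        status, body = fetch(url)
        if status != 200:
            report[url] = f"unreachable (HTTP {status})"
        elif name not in body.lower():
            report[url] = "resolves but does not mention the entity"
        else:
            report[url] = "ok"
    return report


# Stand-in for a real HTTP client, with canned responses for illustration.
FAKE_RESPONSES = {
    "https://example.com/profile-a": (200, "Acme Analytics is a data company."),
    "https://example.com/profile-b": (404, ""),
    "https://example.com/profile-c": (200, "This entry is empty."),
}


def fake_fetch(url):
    return FAKE_RESPONSES.get(url, (404, ""))


organization = {
    "@type": "Organization",
    "name": "Acme Analytics",
    "sameAs": list(FAKE_RESPONSES),
}

for url, verdict in audit_same_as(organization, fake_fetch).items():
    print(url, "->", verdict)
```

A simple name match is a weak test of whether a page really describes the entity, so treat the result as a prompt for manual review rather than a verdict.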
4.4 WebSite schema duplicated across multiple pages
A common error in CMSs that automatically inject WebSite schema
on all pages. Google's documentation asks for WebSite markup on the homepage, so copies elsewhere add noise without benefit.
Limit WebSite to the homepage.
4.5 Malformed JSON-LD
Invalid JSON (missing comma, unescaped quote, unclosed brace) causes the parser to completely ignore the block. Verify with the Rich Results Test before any deployment.
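A syntax check can also run locally, before the Rich Results Test, since malformed JSON is detectable with any JSON parser. A minimal Python sketch, with a hypothetical function name:

```python
import json


def check_jsonld(raw):
    """Return "ok" if the block parses as JSON, otherwise a short error report."""
    try:
        json.loads(raw)
    except json.JSONDecodeError as error:
        return f"invalid JSON at line {error.lineno}, column {error.colno}: {error.msg}"
    return "ok"


valid_block = '{"@context": "https://schema.org", "@type": "Article", "headline": "Title"}'
missing_comma = '{"@context": "https://schema.org" "@type": "Article"}'
unclosed_brace = '{"@context": "https://schema.org", "@type": "Article"'

for block in (valid_block, missing_comma, unclosed_brace):
    print(check_jsonld(block))
```

This only confirms that the block is well-formed JSON. It does not check that the properties are valid schema.org terms, which is what the dedicated validators are for.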
5. Schema.org and AI surfaces in 2026
| Surface | Most impactful schema | Observed impact |
|---|---|---|
| Google AI Overviews | Article, FAQPage, HowTo | Strong: AIO reads and cites Q/A pairs from FAQPage |
| Bing Copilot | Article, Organization | Moderate: improves Bing ranking as RAG source |
| Perplexity | Article (dateModified) | Moderate: freshness detected, preferred for current queries |
| ChatGPT Search | Article, FAQPage | Moderate: same logic as Perplexity via Bing |
| LLMs (training) | Organization + sameAs | Long term: entity disambiguation, brand representation |
| Google Featured Snippets | FAQPage, HowTo | Strong and immediate |
6. Action plan: implementation priorities
- Week 1. Add Article schema with datePublished and dateModified on all content pages. Verify with the Rich Results Test.
- Week 2. Add FAQPage schema on pages that already contain a FAQ section or Q/A. Re-read each answer to verify self-containment.
- Week 3. Add Organization schema on the homepage with sameAs fields completed (Wikidata, LinkedIn, Wikipedia if available). Create the Wikidata entry if it does not exist.
- Week 4. Add BreadcrumbList on all pages except homepage. Audit existing schemas to detect WebSite duplicates and incorrect date formats.
Schema.org checklist for LLMs
- Article schema with datePublished + dateModified on all content pages
- dateModified updated on each substantial revision
- FAQPage schema on pages with Q/A sections
- Each FAQPage answer self-contained (at least 40 words)
- Organization schema with sameAs on the homepage
- WebSite schema only on the homepage (not duplicated)
- BreadcrumbList schema on all pages except homepage
- JSON-LD validated via Rich Results Test before deployment
- Schema/visible content consistency verified
FAQ
Is schema.org essential to be cited in LLMs?
Not essential - pages without schema are cited. But schema.org improves the precision with which LLMs interpret your content: page type, publication date, author, structured questions and answers. It reduces ambiguity and increases eligibility for enriched surfaces (Featured Snippets, AI Overviews, Bing Answers).
Which schema is most useful for AI SEO?
FAQPage is the most direct: it explicitly exposes questions and answers that LLMs can extract. Article or BlogPosting with dateModified improves freshness detection. Organization with sameAs creates the brand entity. In order of priority: 1) Article/BlogPosting, 2) FAQPage on suitable pages, 3) Organization on the homepage, 4) BreadcrumbList on all pages.
Can you use multiple schemas on the same page?
Yes, and it is recommended. An article can simultaneously carry an Article schema (publication information) and a FAQPage schema (if the article contains a FAQ section). The technique consists of injecting a JSON-LD array containing multiple objects. Google and LLMs read all of them.
How do you verify that your schema is correctly read?
Three tools: the Google Rich Results Test (search.google.com/test/rich-results) to verify validity and eligibility for rich snippets, Schema.org Validator (validator.schema.org) for standard compliance, and the "Structured Data" section in Google Search Console to monitor production errors.
Does schema serve ChatGPT or Perplexity directly?
No official confirmation, but indirectly: schema improves ranking in Google and Bing, which are the underlying sources for the web RAG systems of ChatGPT Search and Perplexity. A better rank in these indexes increases the probability of being in the retrieval candidate pool. Additionally, training crawlers (GPTBot, ClaudeBot) read and index schema.org to understand the nature and date of content.