In brief
- ChatGPT Search and Perplexity crawl your site via AI bots; these bots must be allowed in your robots.txt and by your WAF.
- Generative engines extract passages, not whole pages: each paragraph must be readable out of context.
- Cited claims have three characteristics: clear, dated, sourced.
- Domain authority remains the first-level filter: without good SEO, pure GEO (generative engine optimisation) is ineffective.
- There is no paid placement in the organic responses of these engines in 2026.
How ChatGPT and Perplexity work when they respond
Before optimising, understand the mechanism. When a user asks Perplexity or ChatGPT Search a question, the system operates in two stages:
- Retrieval. The engine breaks the web (or its index) into chunks and selects the most relevant passages for the query.
- Generation. The language model drafts a response drawing on the retrieved passages, which it cites as sources.
Optimising for this system means optimising for both stages: first being found during retrieval (technical, crawlability, authority), then being chosen during generation (clear passage, sharp assertion, credible sourcing).
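To make the retrieval stage concrete, here is a toy sketch in Python. It assumes paragraph-level chunks and a naive word-overlap score; real engines use their own, undisclosed chunking and ranking, so the function names and the scoring here are purely illustrative.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def split_into_chunks(document: str) -> list[str]:
    """Split a document into paragraph chunks on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Rank chunks by how many query words they contain (toy scoring)."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(chunk)), chunk) for chunk in chunks]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Keep only chunks that share at least one word with the query.
    return [chunk for score, chunk in ranked[:top_k] if score > 0]

document = """PerplexityBot is the crawler Perplexity uses to index pages.

Our team enjoys hiking at weekends.

To allow PerplexityBot, add an Allow rule for it in robots.txt."""

passages = retrieve(
    "how to allow PerplexityBot in robots.txt",
    split_into_chunks(document),
)
for passage in passages:
    print(passage)
```

The paragraph about hiking is never selected: only passages that answer the query on their own reach the generation stage, which is why each paragraph has to stand alone.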
Prerequisites: your site must be accessible
Before any editorial optimisation, verify that AI search bots can access your site. Blocked bots are the most common reason a site with relevant content is absent from AI responses.
Verify that AI bots are allowed
Real-time search bots are distinct from training bots. To be visible in ChatGPT Search and Perplexity responses, the following bots must be able to crawl your site:
| Engine | Search bot (real-time) | Training bot |
|---|---|---|
| ChatGPT Search | OAI-SearchBot | GPTBot |
| Perplexity | PerplexityBot | PerplexityBot |
| Google AI Overviews | Googlebot | Google-Extended |
| Claude (search) | Claude-SearchBot, Claude-User | ClaudeBot |
Common trap: Cloudflare and cloud WAFs sometimes block AI bots by default ("Block AI scrapers" option). Check your rules before anything else.
To verify, run curl -A 'PerplexityBot' https://yoursite.com/page and check the status code: a 403 means the bot is blocked.
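The curl test covers the WAF side. For the robots.txt side, a minimal sketch using Python's standard urllib.robotparser is shown below. The sample robots.txt and the example.com URL are made up for illustration; in practice you would paste in your own file. Note that this only checks robots.txt rules, not firewall behaviour.

```python
from urllib.robotparser import RobotFileParser

# Bot names taken from the text above; extend the list as needed.
AI_SEARCH_BOTS = ["OAI-SearchBot", "PerplexityBot", "ClaudeBot"]

def blocked_bots(robots_txt: str, url: str, bots: list[str] = AI_SEARCH_BOTS) -> list[str]:
    """Return the bots that the given robots.txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, url)]

# Hypothetical robots.txt: blocks the training bot (fine for search visibility)
# but also blocks PerplexityBot (a problem for visibility in Perplexity).
sample_robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

blocked = blocked_bots(sample_robots_txt, "https://example.com/page")
print(blocked)
```

With this sample file, only PerplexityBot is reported as blocked; OAI-SearchBot and ClaudeBot fall through to the wildcard rule.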
Content accessible without JavaScript
AI bots rarely execute JavaScript. If your site is a SPA that renders content client-side, bots see a blank page. Verify that critical content is present in the static HTML returned by the server (SSR or SSG).
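One way to check this is to look for key phrases in the raw HTML, ignoring anything inside script tags. The sketch below uses Python's standard html.parser; the two HTML samples and the phrase are invented for illustration, and in practice the input would be the server response for your page.

```python
from html.parser import HTMLParser

class VisibleTextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())

def missing_phrases(html: str, phrases: list[str]) -> list[str]:
    """Return the phrases that do not appear in the static, visible text."""
    extractor = VisibleTextExtractor()
    extractor.feed(html)
    extractor.close()
    text = " ".join(extractor.parts).lower()
    return [p for p in phrases if p.lower() not in text]

# Client-side rendered page: the content only exists inside a script.
spa_html = '<html><body><div id="root"></div><script>render("Pricing starts at 10 euros")</script></body></html>'
# Server-side rendered page: the content is in the HTML itself.
ssr_html = "<html><body><h1>Pricing</h1><p>Pricing starts at 10 euros per month.</p></body></html>"

key_phrases = ["Pricing starts at 10 euros"]
print(missing_phrases(spa_html, key_phrases))
print(missing_phrases(ssr_html, key_phrases))
```

The phrase is reported missing for the client-rendered sample and found for the server-rendered one, which mirrors what a bot that does not run JavaScript would see.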
Editorial optimisation: making your content citable
Technical work gives access to the site. Editorial work determines whether your content is cited rather than a competitor's.
Self-contained paragraphs
Each paragraph must answer a question autonomously. Test it: copy a paragraph and read it independently. If someone who has not read the rest understands the message, the paragraph is self-contained.
- Avoid orphan pronouns: "it enables" without an explicit referent is weak.
- Re-name entities at the start of sections: "ChatGPT Search" rather than "the engine".
- Date temporal claims: "in April 2026", not "recently".
- Define acronyms locally, not only at the top of the page.
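Two of these checks can be roughly automated. The sketch below is a simple heuristic, not a real editorial tool: the word lists are assumptions chosen for illustration, and it will produce false positives (a paragraph opening with "This guide" is flagged although the referent is explicit).

```python
import re

# Assumed word lists; adjust them to your own style guide.
ORPHAN_OPENERS = {"it", "this", "that", "they", "these", "those"}
VAGUE_TIME_WORDS = {"recently", "currently", "soon", "lately", "nowadays"}

def lint_paragraph(paragraph: str) -> list[str]:
    """Return warnings for a paragraph that may not read well out of context."""
    words = re.findall(r"[a-z']+", paragraph.lower())
    warnings = []
    if words and words[0] in ORPHAN_OPENERS:
        warnings.append(f"starts with pronoun '{words[0]}'")
    vague = sorted(VAGUE_TIME_WORDS & set(words))
    if vague:
        warnings.append("undated time reference: " + ", ".join(vague))
    return warnings

print(lint_paragraph("It enables faster indexing and was recently updated."))
print(lint_paragraph("In April 2026, ChatGPT Search cited the page."))
```

The first paragraph triggers both warnings; the second, which names its subject and gives a date, triggers none.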
Clear and sourced assertions
A model cites what it can state with confidence. A citable claim has three attributes: sharp ("Perplexity exceeded 10 million daily active users in 2025" rather than "Perplexity is growing fast"), time-stamped (specific date or period), and verifiable (named source).
Query-oriented heading structure
Each H2 should correspond to a distinct search intent. Retrieval systems split documents into chunks, and HTML boundaries (H2, paragraphs) influence these cuts.
- Descriptive H2s: "How to check if your AI bots are blocked" is better than "Verification".
- One intent per section, not two mixed topics under the same H2.
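To see why H2 boundaries matter, here is a minimal sketch of heading-based splitting with Python's standard html.parser. It assumes a chunker that cuts at every H2, which is a simplification: the actual splitting logic of each engine is not public. The sample page is invented.

```python
from html.parser import HTMLParser

class H2Chunker(HTMLParser):
    """Group the text of a page into one chunk per H2 section."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[dict[str, str]] = []
        self._in_h2 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self.chunks.append({"heading": "", "text": ""})

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False

    def handle_data(self, data):
        if not self.chunks or not data.strip():
            return  # ignore whitespace and any text before the first H2
        key = "heading" if self._in_h2 else "text"
        self.chunks[-1][key] = (self.chunks[-1][key] + " " + data.strip()).strip()

def chunk_by_h2(html: str) -> list[dict[str, str]]:
    chunker = H2Chunker()
    chunker.feed(html)
    chunker.close()
    return chunker.chunks

page = (
    "<h1>GEO guide</h1>"
    "<h2>How to check if your AI bots are blocked</h2>"
    "<p>Run a curl test.</p><p>Then read robots.txt.</p>"
    "<h2>Verification</h2><p>Check it.</p>"
)
for chunk in chunk_by_h2(page):
    print(chunk["heading"], "->", chunk["text"])
```

Each chunk carries its heading with it. A descriptive H2 gives the chunk a usable label when it is read in isolation, whereas "Verification" says nothing about what the passage answers.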
Specificities of each engine
ChatGPT Search (OpenAI)
ChatGPT Search is activated automatically when OpenAI judges a query "web-dependent" (recent information, news, comparisons, prices). OAI-SearchBot crawls pages in real time.
- Favours sources with strong domain authority; ChatGPT Search relies on the Bing index and its own signals.
- Pages with recent update dates are better represented on current-events queries.
- The heading hierarchy is directly used for passage splitting.
Perplexity
Perplexity was built as an AI search engine from the start and is more aggressive on sourcing than ChatGPT. It systematically displays numbered sources.
- Crawls in real time for all queries, not only those judged "web-dependent".
- Responses often include 3 to 6 distinct sources: more spots to compete for than in AI Overviews.
- Freshness is a strong criterion: re-dating an existing page sometimes suffices to recover lost citations.
- Readily references blogs and specialist sites alongside major media, an opening for niche sites.
Google AI Overviews
AI Overviews appear at the top of Google results for certain queries, before classic links, and cite 3 to 8 sources.
- Built on Google's classic index: ranking well in Google is a strong prerequisite.
- FAQPage and HowTo schemas increase the probability of being selected.
- Triggered mainly on informational queries ("how to", "what is", "difference between").
- Use Google Search Console (AI Overviews report) to track your impressions on this surface.
Brand entity: the underestimated element
LLMs associate your content with an entity. If your brand is vague, poorly defined or confused with another of the same name, the model will not cite you consistently, even if your content is excellent.
- Name your organisation systematically on every page, not only the homepage.
- Publish an About page with dates, locations, activities, sector, key team members.
- Use sameAs in your Organization schema pointing to Wikipedia, LinkedIn, Wikidata, Crunchbase.
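As an illustration, the sketch below builds a schema.org Organization block with a sameAs list and wraps it in a JSON-LD script tag. The organisation name and every URL are placeholders, not real profiles; replace them with your own.

```python
import json

def organization_jsonld(name: str, url: str, same_as: list[str]) -> str:
    """Serialise a minimal schema.org Organization with sameAs links."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": same_as,
    }
    return json.dumps(data, indent=2)

# Placeholder values for illustration only.
snippet = organization_jsonld(
    name="Example Co",
    url="https://www.example.com",
    same_as=[
        "https://www.wikidata.org/wiki/Q0000000",
        "https://www.linkedin.com/company/example-co",
    ],
)

# JSON-LD is embedded in the page inside a script tag of this type.
script_tag = f'<script type="application/ld+json">\n{snippet}\n</script>'
print(script_tag)
```

Generating the block from one function keeps the organisation name and profile links identical on every page, which is the point of the entity work described above.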
Domain authority remains the first-level filter
All AI engines with real-time search use an existing web index and apply quality filters. A site without inbound links, without organic traffic, without indexation history has very little chance of being selected, even with perfectly structured content.
Classic SEO and GEO are not in competition: a good Google ranking directly increases the probability of being cited by AI engines.
Common mistakes
- Optimising editorial content without checking bot access: if OAI-SearchBot or PerplexityBot is blocked, everything else is useless.
- Thinking that publishing llms.txt is enough: useful but minor, it is not a ticket to AI responses.
- Unrevised AI-generated content: models devalue hollow formulations. One well-revised 2,000-word article beats ten unrefined AI articles.
- Copying competitor content: models favour original sources. Bring data, field experience, novel angles.
- Publishing without freshness: a page not updated in 18 months loses relevance on time-sensitive queries. Re-date strategic content every 6 months.
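The freshness rule is easy to track with a small script. The sketch below flags pages whose last update is older than roughly six months; the page paths, dates and the 183-day threshold are assumptions for illustration, and the last-updated dates would normally come from your CMS or sitemap.

```python
from datetime import date

def stale_pages(last_updated: dict[str, date], today: date, max_age_days: int = 183) -> list[str]:
    """Return the pages last updated more than max_age_days before today."""
    return sorted(
        url for url, updated in last_updated.items()
        if (today - updated).days > max_age_days
    )

# Hypothetical inventory of strategic pages and their last update dates.
pages = {
    "/pricing": date(2026, 1, 15),
    "/guide": date(2024, 9, 1),
    "/about": date(2025, 6, 1),
}

print(stale_pages(pages, today=date(2026, 4, 1)))
```

Updating the date only makes sense when the content itself has been reviewed; the script identifies candidates for review, nothing more.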
4-week action plan
- Week 1, diagnosis. Verify that OAI-SearchBot, PerplexityBot and ClaudeBot are not blocked. Test 10 key queries on Perplexity and ChatGPT Search. Note who is cited.
- Week 2, technical. Fix robots.txt and WAF if needed. Verify rendering (static HTML). Validate your schemas (Organization, Article, FAQPage).
- Week 3, editorial. Take your 3 most relevant pages and rewrite each paragraph for self-containment. Add figures, dates, sources. Restructure H2s around query intents.
- Week 4, authority and measurement. Publish or update a complete About page. Add sameAs to Organization schema. Submit URLs via IndexNow. Re-test the 10 queries.
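For the IndexNow step in week 4, the sketch below prepares a submission request with Python's standard library without sending it. The host, key and URLs are hypothetical, and it assumes the key file is hosted at the site root as KEY.txt; check the IndexNow documentation for the key requirements before using it.

```python
import json
from urllib.request import Request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"

def build_indexnow_request(host: str, key: str, urls: list[str]) -> Request:
    """Prepare (but do not send) an IndexNow bulk submission."""
    payload = {
        "host": host,
        "key": key,
        # Assumption: the key file lives at the root of the site.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": urls,
    }
    return Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )

# Hypothetical host, key and URLs.
request = build_indexnow_request(
    host="www.example.com",
    key="hypothetical-key-123",
    urls=[
        "https://www.example.com/about",
        "https://www.example.com/guide",
    ],
)
print(request.get_method(), request.full_url)
# To submit for real: urllib.request.urlopen(request)
```

Building the request separately from sending it lets you inspect the payload first, which is useful when the URL list comes from the pages rewritten in week 3.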
Frequently asked questions
- Can you pay to appear in ChatGPT or Perplexity?
- No. None of these engines offer paid placement in their organic responses in 2026. OpenAI, Perplexity and Google do not accept payment to favour sources in their generative responses. The only path is editorial and technical optimisation. Some engines offer advertising formats that are separate from responses: this is a different surface, clearly labelled.
- Does ChatGPT visit my site in real time?
- It depends on the mode. In standard mode (without browsing), ChatGPT relies on its training data and does not crawl your site live. In ChatGPT Search mode, OAI-SearchBot crawls pages in real time. To appear in search responses, OAI-SearchBot must be allowed in your robots.txt and not blocked by your WAF; allow GPTBot as well if you also want your content to be included in training data.
- How long does it take to appear in AI responses?
- There is no guaranteed timeline, and this is a fundamental difference from classic SEO. AI visibility depends on crawling (days to weeks for real-time search bots), training (months to years for base models), and query relevance. On Perplexity and ChatGPT Search, improvements are sometimes noticeable within 2 to 8 weeks after publishing structured content.
- Do I need different content for each AI engine?
- No. Good GEO content is optimised once and works across all engines, because they share the same criteria: self-contained passage, clear assertion, sourcing. Differences between ChatGPT Search and Perplexity are marginal at the editorial level. The key is structured, fresh, citable content, not engine-specific adaptation.
- My competitor is being cited instead of me. What can I do?
- First analyse why: is their passage shorter, more direct, better sourced? Is their page structured with clear H2 sections? Do they have stronger domain authority? The answer is almost always editorial. Create content that answers the query more directly, with figures, dates, clear assertions. Also verify that your site is not blocking AI search bots.