ChatGPT Search: what exactly are we talking about?
ChatGPT Search is the web search feature integrated into ChatGPT, progressively deployed since November 2024. When a user asks a question requiring recent information, ChatGPT can trigger web retrieval via the OAI-SearchBot bot, then synthesise the results citing sources with clickable reference numbers.
It is important to distinguish the three OpenAI bots:
- GPTBot: crawls to feed OpenAI model training. Blocking it has no effect on ChatGPT Search: your pages can still be retrieved and cited.
- OAI-SearchBot: crawls for ChatGPT Search - the bot to allow if you want to be cited.
- ChatGPT-User: executes requests on demand from users via ChatGPT browsing. Useful to allow as a complement.
Check your robots.txt: if you have a Disallow: / on GPTBot, it does not
apply to OAI-SearchBot. Each bot must be managed separately.
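To see how a given robots.txt treats each of the three bots, a minimal sketch using Python's standard urllib.robotparser can help. The robots.txt content and the example.com URL below are illustrative assumptions, not taken from a real site.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: GPTBot is blocked, the other bots are not mentioned.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""

OPENAI_BOTS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User")


def bot_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Return, for each OpenAI bot, whether robots.txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in OPENAI_BOTS}


if __name__ == "__main__":
    access = bot_access(ROBOTS_TXT, "https://example.com/article")
    for bot, allowed in access.items():
        print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

With this sample file, only GPTBot is reported as blocked, which illustrates that a rule written for one bot does not carry over to the others.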
ChatGPT Search architecture: retrieval + model memory
The main difference from Perplexity is architectural. Perplexity is almost exclusively RAG-based: its answer is built from real-time retrieval. ChatGPT Search works differently: GPT-4o has a dense knowledge base (training up to mid-2024) and decides, during generation, whether additional retrieval is needed.
This hybrid architecture has important practical consequences:
- Cited sources are fewer than in Perplexity (often 2 to 4 versus 6 to 10) because the model completes the answer with its own memory.
- Retrieval is triggered selectively, primarily for recent data, prices, ongoing events, and time-sensitive statistics.
- Competition for citation is more intense: fewer available slots in the answer means more rigorous selection.
Source selection criteria for ChatGPT Search
1. Accessibility to OAI-SearchBot
A necessary but not sufficient condition. Your page must be crawlable by OAI-SearchBot, server-rendered (SSR/SSG, not an SPA without server rendering), and return a stable 200. Content behind a login, paywall or heavy JavaScript will not be indexed.
2. Domain authority in the sector
ChatGPT Search evaluates the thematic authority of the domain. Observed signals: high Domain Rating (Ahrefs) or Domain Authority (Moz), significant organic traffic, and presence on reference sites in adjacent fields. A site with narrow but deep topical authority (an expert on one specific subject) outperforms generalists with the same DR.
3. Content freshness
ChatGPT Search favours recently updated pages for queries where freshness is critical.
Signals: dateModified in the Article schema, <meta name="last-modified"> tag,
HTTP Last-Modified header, and the visible date in the content. All of these signals must be consistent with one another.
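The consistency check can be automated. The sketch below compares three of the signals (schema dateModified, HTTP Last-Modified, and the visible date) and assumes they are supplied as strings: ISO 8601 for the schema and visible dates, and the standard HTTP date format for the header. The function names and the "same UTC calendar day" rule are assumptions chosen for illustration.

```python
from datetime import date, datetime, timezone
from email.utils import parsedate_to_datetime


def to_utc_date(value: datetime) -> date:
    """Reduce a datetime to its UTC calendar day (naive values are assumed UTC)."""
    if value.tzinfo is None:
        value = value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc).date()


def freshness_dates_agree(
    schema_date_modified: str,
    http_last_modified: str,
    visible_date: str,
) -> bool:
    """Return True when all three signals fall on the same UTC calendar day."""
    schema_day = to_utc_date(datetime.fromisoformat(schema_date_modified))
    header_day = to_utc_date(parsedate_to_datetime(http_last_modified))
    visible_day = date.fromisoformat(visible_date)
    return schema_day == header_day == visible_day


if __name__ == "__main__":
    print(freshness_dates_agree(
        "2026-01-15T09:30:00+00:00",      # dateModified from the Article schema
        "Thu, 15 Jan 2026 09:30:00 GMT",  # HTTP Last-Modified header
        "2026-01-15",                     # date shown in the article
    ))
```

Comparing at day granularity is a deliberate simplification: a CMS often stamps the header a few seconds after the schema value, so an exact timestamp match would raise false alarms.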
4. Self-contained sections
Like any RAG system, ChatGPT Search chunks content and selects the most relevant passages. A page where each section can be understood out of context (self-contained) increases the probability that a chunk will be selected and cited. A section that starts with "As mentioned above..." is an unusable chunk.
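A rough way to spot sections that depend on earlier text is to scan their opening words for back-references. The sketch below is a heuristic only: the list of phrases is an assumption to extend for your own content, and it says nothing about how ChatGPT Search actually chunks pages.

```python
import re

# Assumed list of openers that signal dependence on earlier text.
BACK_REFERENCE = re.compile(
    r"^\s*(as (mentioned|noted|explained|seen|discussed) (above|earlier|previously)"
    r"|as we saw|see above|in the previous section)",
    re.IGNORECASE,
)


def dependent_sections(sections: dict[str, str]) -> list[str]:
    """Return the headings of sections whose body opens with a back-reference."""
    return [
        heading
        for heading, body in sections.items()
        if BACK_REFERENCE.match(body)
    ]


if __name__ == "__main__":
    page = {
        "Allowing OAI-SearchBot": "OAI-SearchBot is the crawler used by ChatGPT Search.",
        "Freshness": "As mentioned above, dates matter for time-sensitive queries.",
    }
    print(dependent_sections(page))
```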
5. E-E-A-T signals and structured data
Implement the Article schema with author, datePublished and dateModified, and the
Organization schema with sameAs. These structured signals are the E-E-A-T
proxies most directly readable by a retrieval system.
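As an illustration, the following sketch builds an Article JSON-LD object with a nested Organization publisher and prints it as a script tag. The property names are standard schema.org vocabulary; the function name and every value passed to it (author, organisation, URLs, dates) are hypothetical placeholders.

```python
import json


def article_jsonld(
    headline: str,
    author_name: str,
    date_published: str,
    date_modified: str,
    org_name: str,
    org_url: str,
    same_as: list[str],
) -> dict:
    """Build an Article JSON-LD object with a nested Organization publisher."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,
        "dateModified": date_modified,
        "publisher": {
            "@type": "Organization",
            "name": org_name,
            "url": org_url,
            "sameAs": same_as,
        },
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    data = article_jsonld(
        headline="ChatGPT Search: how sources are selected",
        author_name="Jane Doe",
        date_published="2025-11-03",
        date_modified="2026-01-15",
        org_name="Example Agency",
        org_url="https://example.com",
        same_as=["https://www.linkedin.com/company/example-agency"],
    )
    print('<script type="application/ld+json">')
    print(json.dumps(data, indent=2))
    print("</script>")
```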
6. Factual density
ChatGPT Search is more selective than Perplexity on factual density: it prefers sources that bring figures, dates, names, precise definitions. Generic content ("it is important to note that...", "several factors come into play...") has little chance of being selected when a more factual source is available.
ChatGPT Search vs Perplexity: comparison table
| Dimension | ChatGPT Search | Perplexity |
|---|---|---|
| Architecture | Model + selective RAG | RAG-first, near-exclusive |
| Sources cited per answer | 2 to 4 on average | 6 to 10 on average |
| Retrieval bot | OAI-SearchBot | PerplexityBot |
| Retrieval triggering | Selective (freshness, recent facts) | Systematic |
| Source bias | Towards known domains, high authority | More open to specialist sources |
| Freshness sensitivity | Very high for recent facts | High (real-time default) |
| Visibility measurement | chatgpt.com traffic + third-party monitoring | perplexity.ai traffic + third-party monitoring |
6 optimisation levers for ChatGPT Search
Lever 1 - Allow OAI-SearchBot and ChatGPT-User
Check your robots.txt. Lines to add if absent:
User-agent: OAI-SearchBot
Disallow:
User-agent: ChatGPT-User
Disallow:
An empty Disallow: means "everything is allowed". Also verify that these bots are not
blocked by a WAF or by Cloudflare's Bot Fight Mode.
Lever 2 - Strict server rendering
ChatGPT Search does not execute JavaScript to render the main content. Your pages must
return their text content in the initial HTML (SSR or SSG). Test with curl -A "OAI-SearchBot"
on a URL: if the response HTML contains your content, the page is correctly server-rendered.
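The same test can be scripted across several URLs. The sketch below fetches the raw HTML with Python's standard library (no JavaScript is executed) and reports which expected phrases are missing. The short "OAI-SearchBot" user-agent mirrors the curl test and is an assumption: the real bot sends a longer user-agent string. The function names and the example URL are also illustrative.

```python
import urllib.request


def fetch_initial_html(url: str, user_agent: str = "OAI-SearchBot", timeout: float = 10.0) -> str:
    """Fetch the raw HTML, without executing JavaScript, using the given User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def missing_phrases(html: str, expected_phrases: list[str]) -> list[str]:
    """Return the expected phrases that do not appear in the HTML (case-insensitive)."""
    lowered = html.lower()
    return [phrase for phrase in expected_phrases if phrase.lower() not in lowered]


if __name__ == "__main__":
    # Placeholder URL and phrase: replace with a real page and a sentence from its body.
    html = fetch_initial_html("https://example.com/")
    missing = missing_phrases(html, ["Example Domain"])
    print("server-rendered" if not missing else f"missing from initial HTML: {missing}")
```

Choosing phrases from the body text rather than from the title is the safer option, since a client-rendered SPA shell often ships the title but not the article content.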
Lever 3 - Visible and consistent freshness
Update dateModified in your Article schema on each significant revision.
Display the last update date visibly within the article. Ensure that the
HTTP Last-Modified header is consistent with the schema date.
Lever 4 - Self-contained sections with factual headings
Each H2/H3 section must function as an autonomous answer. The section heading must include the key concept ("OAI-SearchBot", "ChatGPT Search retrieval") so that the selected chunk can be used directly. End each section with an actionable conclusion or a key figure.
Lever 5 - Targeted factual density
Include in each page at least 3 to 5 factual claims with precise figures, dates or concrete examples. These should not be filler; each fact must be sourceable. Content with high factual density is what ChatGPT Search seeks to cite in order to lend credibility to its answers.
Lever 6 - Build domain thematic authority
ChatGPT Search favours domains recognised in their sector. Two priority actions: publish reference content regularly on your specialty (deep topical authority), and earn mentions and backlinks from recognised thematic sources that are likely to appear in OpenAI training corpora.
Measuring your visibility in ChatGPT Search
In the absence of an official ChatGPT Search console, available proxies in 2026:
- Referral traffic from chatgpt.com: in Google Analytics 4 or Plausible, segment source = chatgpt.com. This traffic is under-estimated (many users copy-paste without clicking) but gives a measurable floor.
- OAI-SearchBot requests in server logs: Google Search Console does not show third-party bots, but server logs (Nginx/Apache/Cloudflare) show OAI-SearchBot requests with the crawled URLs.
- Active LLM monitoring: tools such as Profound, AthenaHQ, Peec and Otterly regularly query ChatGPT on target queries and detect when your site is cited. This is the most direct method even if it remains token-intensive.
- Regular manual testing: each week, ask ChatGPT 5 to 10 queries from your sector with the search feature enabled (the globe icon). Note whether your domain appears in the cited sources.
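For the server-log proxy, a short script can count OAI-SearchBot requests per URL. The sketch below assumes access logs in the common "combined" format used by default in Nginx and Apache; the regular expression, function name and sample lines are illustrative. Since user-agent strings can be spoofed, treat the counts as indicative rather than exact.

```python
import re
from collections import Counter

# Combined log format: ... "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def count_bot_hits(log_lines, bot_token: str = "OAI-SearchBot") -> Counter:
    """Count requests per URL path whose user-agent contains bot_token."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and bot_token.lower() in match.group("agent").lower():
            hits[match.group("path")] += 1
    return hits


if __name__ == "__main__":
    # Made-up sample lines; in practice, iterate over an open log file instead.
    sample = [
        '203.0.113.5 - - [15/Jan/2026:09:30:00 +0000] "GET /guide HTTP/1.1" 200 5120 "-" "Mozilla/5.0; compatible; OAI-SearchBot/1.0"',
        '203.0.113.9 - - [15/Jan/2026:09:31:00 +0000] "GET /guide HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    ]
    for path, count in count_bot_hits(sample).most_common():
        print(path, count)
```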
FAQ - ChatGPT Search and optimisation
- What is the difference between GPTBot and OAI-SearchBot?
- GPTBot crawls to feed OpenAI model training. OAI-SearchBot crawls for ChatGPT Search, the real-time search feature. They are two distinct bots. Blocking GPTBot does not prevent ChatGPT Search from citing you.
- Does ChatGPT Search cite sources like Perplexity?
- Yes, but less systematically. Perplexity is almost exclusively RAG-based. ChatGPT Search combines model memory with selective retrieval: cited sources are fewer (2 to 4 vs 6 to 10) and more strictly selected.
- Should you allow OAI-SearchBot in robots.txt?
- If you want to be cited in ChatGPT Search, yes. OAI-SearchBot is the retrieval bot for ChatGPT Search. Blocking it means you are not a citation candidate.
- Does ChatGPT Search favour certain types of sites?
- Based on observations available in 2026, ChatGPT Search more frequently cites sites with high organic traffic, authoritative sources in their sector, and pages with clean structured data. Narrow specialists outperform generalists.
- Can you measure your visibility in ChatGPT Search?
- Not directly. Available proxies: referral traffic from chatgpt.com, OAI-SearchBot requests in server logs, active LLM monitoring (Profound, AthenaHQ, Peec, Otterly), and regular manual testing with the ChatGPT search feature enabled.
ChatGPT Search checklist (8 points)
- OAI-SearchBot and ChatGPT-User are allowed in robots.txt and not blocked by WAF.
- Key pages are rendered SSR or SSG; content is in the initial HTML.
- dateModified in Article schema is up to date and consistent with visible date and HTTP Last-Modified header.
- Each H2/H3 section is self-contained, names the key concept in its heading, and avoids back-references such as "as mentioned above".
- Each page contains at least 3 factual claims with figures, dates or concrete examples.
- Article schema is implemented with author, datePublished and dateModified.
- ChatGPT Search visibility monitoring is in place (referral traffic + LLM monitoring).
- Domain topical authority is built via a cluster of reference content on the main subject.