Origin and context: why llms.txt was created
In September 2024, Jeremy Howard, co-founder of Answer.AI and fast.ai, published a proposal for a convention called llms.txt. The starting point was simple: LLMs read HTML pages designed for humans. Navigation, menus, banners and scripts are structural noise that must be filtered out before useful information can be extracted, and that filtering introduces loss and imprecision.
llms.txt is a direct response to this problem: give LLMs a clean Markdown file that says "here is what I am, here are my most important pages, here is how to describe them". It is a navigation and context signal optimised for automated retrieval, not for the human eye.
The convention draws inspiration from robots.txt: a text file at the domain root, readable without authentication, with minimal syntax. It does not replace robots.txt (which manages crawl permissions) or sitemap.xml (which lists URLs for indexing), but complements them with a semantic layer oriented towards LLMs.
If you are looking for a tool to generate your llms.txt automatically, the sister site llmtxt.info offers a free online generator based on the official specification.
The specification: what llms.txt contains
An llms.txt file is a Markdown file located at https://yourdomain.com/llms.txt. It follows a three-zone structure:
Zone 1: header with entity description
The file opens with a description of the entity: what the site is, who its audience is, what its value proposition is. This description should be concise (2 to 4 sentences), factual, and include the key terms of your domain. This is the passage LLMs use to create their initial representation of your site.
Zone 2: thematic sections with annotated links
The body of the file lists your key pages, organised by theme, with a description of each page. The syntax is standard Markdown: H2 headings for sections, Markdown links for pages with 1 to 3 lines of description.
Zone 3: optional section for contextual exclusions
An ## Optional section can list URLs to de-prioritise for LLM context, such as legal pages or transactional pages without informational value. This is not a crawl exclusion directive and should not be confused with robots.txt; it is a contextual relevance signal.
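To make the three zones concrete, here is a minimal sketch of how a consumer might split an llms.txt file into its title, description and sections. The function name parse_llms_txt, the link pattern ("- [title](url): description") and the example.com sample are assumptions made for illustration; they are not part of any official tooling.

```python
import re

# Assumed link syntax: "- [Title](https://...): optional description"
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?$"
)


def parse_llms_txt(text):
    """Split an llms.txt document into title, description and H2 sections."""
    result = {"title": None, "description": [], "sections": {}}
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()          # Zone 1: H1 title
        elif line.startswith("## "):
            current = line[3:].strip()                  # Zones 2 and 3: H2 sections
            result["sections"][current] = []
        elif line.startswith(">") and current is None:
            result["description"].append(line.lstrip("> ").strip())
        else:
            match = LINK_RE.match(line)
            if match and current is not None:
                result["sections"][current].append(match.groupdict())
    result["description"] = " ".join(result["description"])
    return result


# Hypothetical sample file used only for this illustration.
SAMPLE = """\
# Example Site

> Example Site publishes guides on static site tooling.

## Main pages

- [Getting started](https://example.com/start): Installation and first build.

## Optional

- [Legal notice](https://example.com/legal)
"""

parsed = parse_llms_txt(SAMPLE)
print(parsed["title"])
print(list(parsed["sections"]))
```

In this sketch the Optional section is parsed like any other section; it is up to the consumer to treat its links as lower priority.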
llms-full.txt: the full-content variant
The convention provides for a second file, /llms-full.txt, which contains the full text of your most important pages, without HTML, without navigation, only editorial content in Markdown format.
- llms.txt is read first for discovery and navigation
- llms-full.txt is used when an LLM needs full context (RAG chatbot, autonomous agent, site analysis tool)
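As an illustration, an llms-full.txt file could be assembled by concatenating the Markdown bodies of your key pages. This is only a sketch: the function build_llms_full, the "Source:" line and the one-H2-per-page layout are assumptions chosen for readability, not a layout prescribed by the convention.

```python
def build_llms_full(site_name, pages):
    """Concatenate Markdown page bodies into a single llms-full.txt string.

    `pages` is a list of (title, url, markdown_body) tuples. The bodies are
    assumed to be clean Markdown already (no HTML, no navigation).
    """
    parts = [f"# {site_name}"]
    for title, url, body in pages:
        # One H2 per page, followed by its source URL and its full text.
        parts.append(f"## {title}\n\nSource: {url}\n\n{body.strip()}")
    return "\n\n".join(parts) + "\n"


# Hypothetical pages used only for this illustration.
full_text = build_llms_full(
    "Example Site",
    [
        ("Getting started", "https://example.com/start", "Install the tool, then run a first build.\n"),
        ("Configuration", "https://example.com/config", "All options live in a single file.\n"),
    ],
)
print(full_text)
```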
Adoption state in 2026
The most comprehensive data available comes from a Trakkr Research study published in March 2026. The study scanned 37,894 AI-cited domains and found that only 13.3% had implemented a llms.txt file (Trakkr, "The llms.txt Effect", March 2026).
This figure is instructive on two levels. First, adoption remains a minority practice even among sites already cited by AI engines. Second, implementing llms.txt now can still set a site apart, since the majority has not yet done so.
| Crawler / LLM | Reads llms.txt | Reads llms-full.txt | Notes |
|---|---|---|---|
| Perplexity (PerplexityBot) | Yes (confirmed) | Partial | Uses llms.txt for contextual navigation |
| ChatGPT Search (OAI-SearchBot) | In progress | Not confirmed | OpenAI mentioned the convention without formal commitment |
| Claude (ClaudeBot) | In progress | Not confirmed | Anthropic follows the convention without formal announcement |
| Google AI Overviews | Not documented | Not documented | Google uses its own signals, no statement on llms.txt |
Implementation: step-by-step guide
Step 1: write llms.txt
Recommended minimal structure:
- H1 title = site name
- Blockquote (>) = 2 to 4 sentence description (who you are, what you do, for whom)
- Section ## Main pages with your 5 to 15 most important pages
- Section ## Recent articles if you have an editorial section
- Section ## Optional to signal pages with no contextual value
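The recommended structure above could be produced by a small script along these lines. It is a sketch under stated assumptions: render_llms_txt is a hypothetical helper, the example.com pages are placeholders, and the "- [title](url): note" link form is one common way to annotate links.

```python
def render_llms_txt(name, description, sections):
    """Render llms.txt: H1 title, blockquote description, H2 sections of links.

    `sections` maps a section heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {name}", "", f"> {description}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            suffix = f": {note}" if note else ""  # the note is optional
            lines.append(f"- [{title}]({url}){suffix}")
        lines.append("")
    return "\n".join(lines)


# Placeholder content; replace with your own pages and descriptions.
llms_txt = render_llms_txt(
    "Example Site",
    "Example Site publishes guides on static site tooling for developers.",
    {
        "Main pages": [
            ("Getting started", "https://example.com/start", "Installation and first build."),
        ],
        "Recent articles": [
            ("Release notes", "https://example.com/blog/release", "What changed in the latest version."),
        ],
        "Optional": [
            ("Legal notice", "https://example.com/legal", ""),
        ],
    },
)
print(llms_txt)
```

Calling such a function from your build or publishing workflow is one way to keep the file in sync with new content.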
To generate your file automatically, llmtxt.info provides a free online tool that analyses your site and produces a ready-to-use llms.txt structure.
Step 2: publish at the domain root
The file must be accessible at https://yourdomain.com/llms.txt, not in a subdirectory, not behind authentication. The server must return Content-Type text/plain with HTTP 200.
- Astro: create public/llms.txt
- Next.js: create public/llms.txt
- WordPress: upload via FTP to the root
- Cloudflare Pages / Netlify: place in the public/ directory
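The publishing requirements stated in this step (root path, HTTP 200, Content-Type text/plain) can also be checked from a script. The sketch below uses only the Python standard library; the helper names headers_ok and check_llms_txt are hypothetical, and the check simply mirrors the requirement as written in this guide.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def headers_ok(status, content_type):
    """Return True when the status is 200 and the media type is text/plain."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = (content_type or "").split(";")[0].strip().lower()
    return status == 200 and media_type == "text/plain"


def check_llms_txt(base_url, timeout=10):
    """Send a HEAD request to <base_url>/llms.txt and check the response headers."""
    request = Request(base_url.rstrip("/") + "/llms.txt", method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return headers_ok(response.status, response.headers.get("Content-Type"))
    except HTTPError:
        # urlopen raises for 4xx/5xx responses, e.g. a 404 when the file is missing.
        return False
```

For example, check_llms_txt("https://yourdomain.com") would return True only if the file is served from the root with the expected status and media type.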
Step 3: verify and monitor
After publishing, verify accessibility with curl -I https://yourdomain.com/llms.txt. You should see a 200 and the correct Content-Type. Then monitor your server logs for AI crawler requests.
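For the monitoring part, a short script can count requests to /llms.txt and /llms-full.txt by AI crawlers. This is a sketch that assumes the "combined" access log format commonly used by nginx and Apache; the crawler names come from this article, and the sample log lines (including their user-agent strings) are invented for illustration.

```python
import re
from collections import Counter

# Crawler names mentioned in this article; extend the tuple as needed.
AI_CRAWLERS = ("OAI-SearchBot", "PerplexityBot", "ClaudeBot")

# Assumes the "combined" log format: request, status, size, referrer, user agent.
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"$'
)


def count_ai_hits(log_lines, paths=("/llms.txt", "/llms-full.txt")):
    """Count requests to the given paths, grouped by (crawler, path, status)."""
    hits = Counter()
    for line in log_lines:
        match = LOG_RE.search(line.strip())
        if not match or match["path"] not in paths:
            continue
        for bot in AI_CRAWLERS:
            if bot.lower() in match["agent"].lower():
                hits[(bot, match["path"], match["status"])] += 1
    return hits


# Invented log lines for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Mar/2026:12:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    '203.0.113.9 - - [10/Mar/2026:12:01:00 +0000] "GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '203.0.113.7 - - [10/Mar/2026:12:02:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

hits = count_ai_hits(SAMPLE_LOG)
print(hits)
```

On a real server you would pass the lines of your access log file instead of SAMPLE_LOG; if your log format differs, the regular expression needs adjusting.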
Best practices and pitfalls to avoid
Do not duplicate robots.txt content in llms.txt. They are two files with distinct functions. llms.txt is not an access control mechanism.
Avoid generic descriptions. "Our site homepage" adds no contextual value. Each description should contain precise and informative terms.
Update regularly. An llms.txt that lists old articles but does not mention recent content deprives LLMs of your updates. Ideally automate generation with each new publication.
Frequently asked questions
- Is llms.txt an official standard recognised by Google or OpenAI?
- No. llms.txt is a convention proposed by Jeremy Howard (Answer.AI) in September 2024. It has not been formally adopted by Google, OpenAI or Anthropic as a mandatory standard. However, Perplexity has confirmed implementation, and adoption was 13.3% among AI-cited domains as of March 2026 (Trakkr Research, n=37,894 domains).
- Should I choose between llms.txt and llms-full.txt, or do I need both?
- Both files serve different purposes. llms.txt is a navigation index: it lists and describes your key pages. llms-full.txt is a content aggregate: it contains the full text of your most important pages. For a small site, llms-full.txt alone may suffice. For a medium-sized site, combining both is optimal.
- What is the impact of llms.txt on classic Google SEO?
- None, or neutral at most. llms.txt is an additional text file that does not modify your HTML pages and does not replace robots.txt or sitemap.xml. Google Search does not use it for classic indexing. It specifically targets LLMs and AI answer engines. You can implement it with no risk to your existing SEO.
- How do I know if an LLM has read my llms.txt?
- There is no direct confirmation. Check your server logs for requests to /llms.txt by known crawlers (OAI-SearchBot, PerplexityBot, ClaudeBot, etc.). Then test directly: ask Perplexity or ChatGPT Search to describe your site and see if the response reflects the descriptions in your llms.txt. Changes typically propagate within 1 to 4 weeks.
- Is there a tool to generate llms.txt automatically?
- Yes. llmtxt.info is a free online generator that produces a structured llms.txt file from your site URL. It is based on the official Answer.AI specification. For static sites like Astro or Next.js, programmatic approaches also allow generating the file automatically at build time.