llms.txt: The robots.txt for AI — What It Is and Why You Need One

llms.txt is an emerging web standard that provides AI language models with structured, machine-readable information about a website's content, purpose, and key pages. The web has had robots.txt for decades — it tells search engine crawlers which pages they can and cannot access. llms.txt fills the equivalent gap for AI agents: not access control, but context.
If robots.txt is a traffic sign for crawlers, llms.txt is a welcome packet for AI agents. It tells them who you are, what you offer, and where to find what matters most — in a format they can actually use. With AI-referred website sessions growing 527% between January and May 2025 (SparkToro), and ChatGPT now serving 900 million weekly active users (OpenAI, 2025), making your site readable to these systems is no longer a nice-to-have.
This guide explains what llms.txt is, why it matters, and how to create one for your website.
What Is llms.txt and Where Does It Live?
llms.txt is a plain text file hosted at the root of your website (typically at /llms.txt or /.well-known/llms.txt) that provides a structured summary of your site optimized for large language models.
The format was proposed by Jeremy Howard of fast.ai and is now a de facto standard for AI-friendly websites. Unlike your regular HTML pages — which are designed for human eyes and browsers — llms.txt gives AI agents a clean, structured entry point that is easy to parse and understand.
Think of it this way:
- robots.txt tells crawlers where they can go
- sitemap.xml tells crawlers what pages exist
- llms.txt tells AI agents what your site is actually about
You need all three. Most websites have the first two and are missing the third.
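An agent discovering a site typically probes these well-known paths before anything else. Here is a minimal sketch of that discovery step — the helper name, the `.well-known` fallback, and the probe order are illustrative assumptions, not part of any spec:

```python
def discovery_urls(base):
    """Well-known files an agent can probe on a site.

    base: origin without a trailing slash, e.g. "https://example.com".
    """
    return [
        f"{base}/robots.txt",            # crawl permissions
        f"{base}/sitemap.xml",           # inventory of pages
        f"{base}/llms.txt",              # AI-oriented site summary
        f"{base}/.well-known/llms.txt",  # alternate location some sites use
    ]

print(discovery_urls("https://example.com")[2])  # → https://example.com/llms.txt
```

An agent that finds all four files gets permissions, inventory, and meaning in a handful of cheap HTTP requests.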
Why Do AI Agents Need llms.txt?
AI agents do not browse the way humans do. When an agent like ChatGPT, Claude, or Perplexity visits your website, it makes HTTP requests and reads raw HTML. It does not execute JavaScript by default, it cannot click cookie banners, and it does not scroll down to find buried content.
Even if an agent successfully reads your homepage, it still faces a challenge: understanding what your site is about from raw HTML is noisy. Navigation menus, footer links, cookie notices, ads, and boilerplate all get in the way of the actual content.
llms.txt solves this by giving agents a clean, curated summary. Instead of trying to extract meaning from a complex HTML page, an agent can read your llms.txt and immediately understand:
- What your website is about
- Who it is for
- What the most important pages are
- How to navigate your content
This leads to more accurate recommendations, better citations, and improved visibility in AI-generated responses.
What Does the llms.txt Specification Look Like?
The format is intentionally simple. A valid llms.txt file uses Markdown formatting and follows a specific structure:
Required: H1 Title
The file must begin with an H1 heading containing your site or company name:
# YourCompany
Required: Blockquote Description
Immediately after the title, a blockquote provides a one-line description of what your site does:
> One clear sentence describing what your site offers and who it is for.
Optional: Section Content
After the required elements, you can add any combination of:
- Paragraphs of descriptive text about your site
- Markdown links to your most important pages
- H2 sections to organize different areas of your site
A Real Example
Here is the llms.txt file for AgentSpeed:
# AgentSpeed
> The first AI Agent Readiness score for websites. Like PageSpeed, but for AI agents.
## What is AgentSpeed?
AgentSpeed scans websites and measures how accessible they are to AI agents
like ChatGPT, Claude, and Perplexity. It runs 10 automated checks across two
tiers and produces a score from 0 to 100 with actionable fix recommendations.
## How It Works
Enter any website URL and AgentSpeed will:
1. Fetch the homepage and key files (robots.txt, sitemap.xml, llms.txt)
2. Run 10 checks across two tiers
3. Calculate a weighted score
4. Generate a detailed report with prioritized fixes
## Links
- Homepage: https://agentspeed.dev
- Scan a website: https://agentspeed.dev/scan?url={url}
- View a report: https://agentspeed.dev/report/{domain}
This tells any AI agent exactly what AgentSpeed is and how to use it — in under 300 words.
What Is the Difference Between robots.txt and llms.txt?
People often confuse these two files. They serve completely different purposes.
robots.txt is a permissions file. It tells crawlers what they are and are not allowed to access. A typical entry looks like:
User-Agent: GPTBot
Allow: /
Disallow: /admin/
This says: GPTBot can crawl everything except the admin section.
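You can evaluate rules like these with Python's standard library. One caveat in this sketch: `urllib.robotparser` applies the first matching rule rather than the longest match, so the Disallow line is listed before the blanket Allow here:

```python
from urllib.robotparser import RobotFileParser

# Rules equivalent to the entry above. Python's stdlib parser uses
# first-match-wins ordering, so Disallow: /admin/ must precede Allow: /.
rules = """\
User-Agent: GPTBot
Disallow: /admin/
Allow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("GPTBot", "/pricing"))      # True
print(rp.can_fetch("GPTBot", "/admin/users"))  # False
```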
llms.txt is a context file. It does not control access — it provides understanding. There are no allow or disallow directives. Instead, it explains what your site contains and points to what matters most.
You need both. robots.txt ensures AI crawlers can access your content. llms.txt ensures they understand it.
What Are the Most Common llms.txt Mistakes?
Too Long
llms.txt works because it is concise. An agent reading a 10,000-word llms.txt is not getting the fast context it needs. Keep it under 1,000 words. If you need to provide detailed documentation, link to it rather than including it inline.
Missing the H1
The H1 title is required by the spec. Files without it are technically invalid and may be ignored by stricter implementations.
Copying Your About Page
Your llms.txt should be written for machines, not for marketing. Skip the adjectives ("industry-leading", "revolutionary"). State facts: what you offer, who it is for, what the key pages are.
Not Updating It
If your site structure changes significantly — new products, renamed sections, updated URLs — update your llms.txt. Stale information is worse than no information.
Only Putting It at One Location
The spec allows both /llms.txt and /.well-known/llms.txt. Serve the file at both locations (or redirect one to the other) so every agent finds it, whichever path it checks first.
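The structural mistakes above are easy to catch automatically. Here is a rough lint sketch based on this article's guidelines — the function name and the 1,000-word threshold come from this post, not from the spec:

```python
def lint_llms_txt(text, max_words=1000):
    """Flag common llms.txt problems; returns a list of messages.

    max_words is the concision rule of thumb above, not a spec requirement.
    """
    problems = []
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("missing required H1 title on the first line")
    if len(lines) < 2 or not lines[1].startswith("> "):
        problems.append("missing blockquote description after the title")
    if len(text.split()) > max_words:
        problems.append(f"over {max_words} words; link to docs instead")
    return problems

print(lint_llms_txt("# Example\n> A short description."))  # []
```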
How Do You Create Your llms.txt?
Creating llms.txt takes about 15 minutes. Here is a simple process:
Step 1: Write the header
Start with your H1 title and blockquote description. This is the most important part — make the description clear and specific.
Step 2: Describe your site
Write 2-4 sentences explaining what your site does, who it serves, and what makes it useful. No marketing language. Just facts.
Step 3: List your most important pages
Add a "Links" or "Key Pages" section with your most valuable pages and a one-line description of each. For a SaaS product, this might be your homepage, pricing page, documentation, and blog. For an e-commerce site, your category pages and best-sellers.
Step 4: Add optional sections
If your site has distinct areas — a blog, a product catalog, a help center — add H2 sections for each with brief descriptions and key links.
Step 5: Deploy and verify
Upload the file to your web root so it is accessible at yourdomain.com/llms.txt. Then verify it appears correctly by visiting that URL.
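You can also script the verification step using only the standard library. A sketch, with illustrative function names — swap in your own domain before running:

```python
import urllib.request

def fetch_llms_txt(domain, timeout=10):
    """Fetch https://<domain>/llms.txt; return its text, or None on failure."""
    url = f"https://{domain}/llms.txt"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except OSError:
        return None

def starts_with_h1(text):
    """Spec sanity check: the first non-empty line must be an H1 title."""
    lines = [line for line in text.splitlines() if line.strip()]
    return bool(lines) and lines[0].startswith("# ")

# Example usage (requires network access):
# text = fetch_llms_txt("yourdomain.com")
# print(text is not None and starts_with_h1(text))
```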
You can also generate an llms.txt automatically using our free llms.txt generator, which crawls your site and produces a ready-to-deploy file.
Does llms.txt Actually Make a Difference?
There is a common objection: if major AI crawlers are not currently reading llms.txt, why bother?
The answer is the difference between AI crawlers and AI agents.
AI crawlers (GPTBot, ClaudeBot, Googlebot) scrape content to build training data and search indexes. These crawlers do not use llms.txt today.
AI agents — the interactive tools that complete tasks on behalf of users — are a different story. Coding agents like Claude Code and Cursor already support llms.txt as a discovery mechanism. When you tell an agent to "research AgentSpeed," it will often fetch /llms.txt as one of its first steps.
As agent usage grows, llms.txt support will follow. According to Ahrefs (December 2025), brand mentions correlate 3x more strongly with AI visibility than backlinks — and a well-structured llms.txt is one of the clearest brand signals you can provide to an AI system. Only 11% of domains are cited by both ChatGPT and Google AI Overviews for the same query, meaning every discoverability advantage matters. Implementing llms.txt now means you are ready when adoption accelerates — and it signals to any agent that visits your site that you take machine readability seriously.
Check Your Current llms.txt Score
The AgentSpeed free scan checks whether your website has a valid llms.txt file and validates its content. It takes about two seconds and gives you an immediate score across all 10 agent readiness factors.
If your site is missing an llms.txt, our free generator will create one for you in under 30 seconds.