
The Complete Guide to llms.txt: How to Prepare Your Site for AI Agents

When an AI agent — Claude, GPT-4o, a Perplexity research workflow — visits your website, it needs to quickly understand what your site does. The llms.txt file is a plain-text document at the root of your site that tells AI systems exactly that. Think robots.txt, but for LLMs.

What Is llms.txt?

llms.txt was proposed by Jeremy Howard (fast.ai) in 2024 as a lightweight standard for helping AI systems understand websites. The concept is simple: a Markdown-formatted file at yourdomain.com/llms.txt that gives AI agents a structured overview of your site.

Unlike robots.txt (which tells crawlers what not to do), llms.txt is positive — it tells AI what your site is, what you do, and where to find your most important content.

Who Uses llms.txt

  • Claude (Anthropic) — reads llms.txt when using web browsing tools
  • GPT-4o with browsing — fetches llms.txt during agentic workflows
  • Perplexity Deep Research — uses llms.txt to prioritize which pages to read
  • AI coding assistants — use llms.txt in API and docs discovery
  • Custom AI agents — any agent using the llms.txt standard

The llms.txt Format

The file uses Markdown with a specific structure. Here's the minimal valid format:

# Your Company Name

> One-line description of what you do and who you serve.

## Product
- [Feature 1](https://yourdomain.com/feature-1): What it does
- [Feature 2](https://yourdomain.com/feature-2): What it does

## Documentation
- [Getting Started](https://yourdomain.com/docs/start): Installation and setup
- [API Reference](https://yourdomain.com/docs/api): Full API documentation

## About
- [Pricing](https://yourdomain.com/pricing): Plans and pricing
- [About Us](https://yourdomain.com/about): Company background
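Once you have a draft, it helps to sanity-check that it follows this shape. Here is a minimal parser sketch in Python (stdlib only; the function name and regexes are illustrative, not part of any official llms.txt tooling):

```python
import re

def parse_llms_txt(text: str) -> dict:
    """Extract the H1 title, blockquote summary, and section links
    from an llms.txt document. Illustrative sketch, not a spec parser."""
    title = re.search(r"^# (.+)$", text, re.MULTILINE)
    # A blockquote may span several "> " lines; join them into one summary.
    summary = " ".join(
        m.group(1).strip() for m in re.finditer(r"^> ?(.*)$", text, re.MULTILINE)
    )
    links = re.findall(r"^- \[([^\]]+)\]\(([^)]+)\)", text, re.MULTILINE)
    return {
        "title": title.group(1) if title else None,
        "summary": summary or None,
        "links": links,  # list of (label, url) tuples
    }
```

If `title` or `summary` comes back empty, the file is missing one of the two required elements described below.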

A Real Example: Causabi's llms.txt

# Causabi — GEO Fix Engine

> Causabi is a GEO (Generative Engine Optimization) platform that analyzes
> websites and automatically generates optimizations to improve citation rates
> in ChatGPT, Perplexity, Gemini, and Yandex GPT.

## Product
- [GEO Score](https://causabi.com): Free AI search readiness score (0-100)
- [GEO Fix Engine](https://causabi.com/dashboard): Automated fixes for GEO issues
- [Monitoring](https://causabi.com/monitor): Track citation rate across AI engines

## Scoring Dimensions
- robots_txt: AI bot access check (GPTBot, ClaudeBot, PerplexityBot)
- schema_org: Organization/Product/SoftwareApplication structured data
- faq_schema: FAQPage JSON-LD (highest impact signal, +41% citation rate)
- content_depth: Page content quality and depth
- brand_signals: Domain authority and external mentions
- freshness: Content recency

## Integrations
- [PyPI Package](https://pypi.org/project/causabi-geo/): CLI tool for local GEO analysis
- [MCP Server](https://causabi.com/mcp): Use GEO tools directly in Claude

## Contact
- Email: hello@causabi.com
- GitHub: https://github.com/SHADRINMMM/geo-optimizer

llms-full.txt: The Extended Version

The standard also defines llms-full.txt — a more comprehensive version that includes full documentation content inline, rather than just links. This is primarily for developer tools and API docs where agents need the complete content without making multiple requests.

For most websites, llms.txt (with links) is sufficient. Use llms-full.txt if you have:

  • An API or SDK with documentation you want AI agents to be able to use
  • Content that changes frequently (agents can re-fetch the full file)
  • Docs that are hard to navigate via regular links
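As a sketch of how an llms-full.txt can be assembled, the snippet below inlines locally stored Markdown docs behind each link. The `docs/<slug>.md` layout is an assumption for illustration; adapt the mapping to your own docs structure:

```python
import re
from pathlib import Path

def build_llms_full(llms_txt: str, docs_dir: str = "docs") -> str:
    """Expand an llms.txt into llms-full.txt by inlining the content of
    local Markdown docs. Assumes each link's last path segment maps to
    <docs_dir>/<slug>.md -- an illustrative convention, not a standard."""
    parts = [llms_txt]
    for label, url in re.findall(r"\[([^\]]+)\]\(([^)]+)\)", llms_txt):
        slug = url.rstrip("/").rsplit("/", 1)[-1]
        md = Path(docs_dir) / f"{slug}.md"
        if md.is_file():
            parts.append(f"\n## {label} (full content)\n\n{md.read_text()}")
    return "\n".join(parts)
```

Regenerate the file whenever the underlying docs change, so agents re-fetching it always see current content.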

What to Include in Your llms.txt

Required sections

  • H1 header: Your brand/product name
  • Blockquote description: 1-3 sentences describing what you do and for whom. This is the most important field — it's what AI agents read first.

Strongly recommended sections

  • Key features/product pages — link to pages that explain what you do in depth
  • Documentation or guides — your best explanatory content
  • Contact/About — helps establish legitimacy

What NOT to include

  • Don't list every page on your site — link to the 10-20 most important ones
  • Don't include transactional pages (cart, checkout, login)
  • Don't add marketing fluff — AI agents respond to factual, concrete descriptions
  • Don't include internal URLs, staging links, or anything not publicly accessible
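These rules are easy to enforce mechanically. A small lint sketch (the keyword list and the 20-link cap come from the guidance above; tune both for your site):

```python
# Keywords that usually indicate transactional or non-public pages.
SUSPECT_WORDS = ("cart", "checkout", "login", "staging", "localhost")

def lint_llms_links(urls, max_links=20):
    """Flag common llms.txt mistakes: too many links, or transactional /
    non-public URLs that should not be listed. Heuristic, not a spec check."""
    issues = []
    if len(urls) > max_links:
        issues.append(f"too many links: {len(urls)} (keep it under {max_links})")
    for url in urls:
        if any(word in url.lower() for word in SUSPECT_WORDS):
            issues.append(f"suspect URL: {url}")
    return issues
```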

How to Create It

Option 1 — Manual: Create an llms.txt file in your public directory and write it yourself using the format above.

Option 2 — Automated with Causabi CLI:

pip install causabi-geo

# Generate llms.txt from your site
geo-optimizer fix llms-txt yourdomain.com

# Output: llms.txt ready to publish

The CLI crawls your site, extracts key pages, and generates a properly formatted llms.txt using your existing content.

Technical Deployment

Place the file at /llms.txt (root of your domain). Serve it with:

  • Content-Type: text/plain
  • Charset: UTF-8
  • No authentication — must be publicly accessible without login
  • Cache-Control: public, max-age=86400 (1 day is fine)

For Next.js, add the file to /public/llms.txt. For static sites, add it to your root. For WordPress, add it via a plugin that serves static files or via functions.php.
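In frameworks where you control the response directly, the headers above can be set explicitly. A framework-agnostic Python sketch of the response logic (`public/llms.txt` is an assumed path; adapt the return value to your framework's response object):

```python
from pathlib import Path

def llms_txt_response(request_path: str, file_path: str = "public/llms.txt"):
    """Build (status, headers, body) for an llms.txt request,
    applying the recommended Content-Type and Cache-Control headers."""
    if request_path != "/llms.txt" or not Path(file_path).is_file():
        return 404, {}, b""
    body = Path(file_path).read_bytes()
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        "Cache-Control": "public, max-age=86400",  # 1 day, as noted above
    }
    return 200, headers, body
```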

Does It Actually Work?

The honest answer: llms.txt is a signal, not a guarantee. Not all AI agents read it yet; adoption is growing but still far from universal. However:

  • It has zero downside — it's a small text file
  • Claude and GPT-4o do fetch it during agentic tasks that involve browsing your domain
  • Having it signals technical sophistication and AI-readiness to both AI systems and human developers
  • It contributes to your overall GEO score (brand signals + robots/crawl signals)

It's not the highest-impact GEO action (that's FAQ schema), but it's one of the easiest to implement and has compounding value as agentic AI use grows.

Generate your llms.txt automatically

Causabi's GEO Fix Engine generates a production-ready llms.txt from your existing site content — no manual writing required.

Check your GEO score →