What is llms.txt? An Honest Guide for Small Businesses (2026)
Quick answer: llms.txt is a plain text file that lives at the root of your website (yoursite.com/llms.txt) and gives AI engines a curated, machine-readable map of your most important content. It was proposed in September 2024 by Jeremy Howard of Answer.AI. It's not yet a proven citation lever — Google has publicly said they don't use it, ChatGPT doesn't proactively fetch it, and there's no measurable evidence that adding one boosts AI citations. But it's cheap to add, forward-compatible, and useful for specific cases (developer tools, agent workflows). Should you have one? Probably yes — but for honest reasons, not hype.
This guide explains what llms.txt actually is, what the evidence says about whether it works, and why we built it into Web Gerek anyway.
The problem llms.txt is trying to solve
When ChatGPT or Perplexity needs to answer a question about your business, here's what happens behind the scenes: the AI fetches one or more pages from your website, parses the HTML, tries to figure out what's important, and synthesizes an answer.
This works, but it's slow and noisy. AI crawlers waste time on navigation menus, footer junk, JavaScript-heavy sections, and structurally ambiguous content. They sometimes miss important pages entirely. And if your site has 50 pages, the AI doesn't know which 5 are the ones that actually matter.
llms.txt tries to fix this with a simple idea: instead of making AI engines reverse-engineer your site, give them a curated index. Hand them a markdown-formatted file that says "here's who I am, here's a one-paragraph summary, here are my 10 most important pages with descriptions, here are my key resources." Clean, structured, fast to parse.
The file format is intentionally simple — H1 site name, blockquote summary, sectioned lists of links with short descriptions. You can read the spec at llmstxt.org.
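Following that spec, a minimal valid file looks like the sketch below (the café and its URLs are invented for illustration; the "Optional" section name is the spec's own convention for links an AI may skip when context is tight):

```markdown
# Example Café

> Independent specialty coffee shop in Kadıköy, Istanbul, serving
> single-origin espresso and fresh pastries. Open daily 08:00-20:00.

## Important Pages

- [Menu](https://example.com/menu): Current drinks and prices
- [Location](https://example.com/visit): Address, hours, and directions

## Optional

- [Blog](https://example.com/blog): Brewing guides and origin stories
```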
How llms.txt is different from robots.txt and sitemap.xml
These three files often get confused. They serve different purposes:
| File | Purpose | Audience |
|---|---|---|
| robots.txt | Tells crawlers what they're allowed to access | Any crawler (Googlebot, Bingbot, GPTBot, etc.) |
| sitemap.xml | Lists every URL on your site for discovery | Search engine crawlers |
| llms.txt | Curated, human-readable summary of your most important content | AI language models |
robots.txt is access control. sitemap.xml is exhaustive discovery. llms.txt is understanding — telling AI what you are and what matters, in a format optimized for LLM context windows.
The honest answer: does it actually work?
This is where most articles about llms.txt go off the rails. They promise citation lifts, faster ChatGPT visibility, and "the AI SEO magic button." The evidence in 2026 doesn't support those promises. Here's the actual picture:
Google has publicly said no. Google's John Mueller stated in July 2025 that no AI system at Google uses llms.txt and there are no plans to support it. He compared the approach to the long-discredited keywords meta tag.
Major AI engines don't proactively fetch it. Server log studies across multiple test sites show GPTBot, ClaudeBot, and PerplexityBot rarely request the /llms.txt URL during normal crawling. They look for content, not manifest files.
No measurable citation uplift. ALLMO.ai analyzed 94,000+ AI-cited URLs across ChatGPT, Claude, and Perplexity and found no correlation between having an llms.txt file and getting cited more often.
Top cited brands don't have one. SE Ranking's January 2026 study found 0 out of the top 20 most-AI-cited media companies have an llms.txt file. The same study found ~10% adoption across 300k domains, mostly on smaller and developer-focused sites.
Adoption is concentrated in dev tools. The brands that have implemented it well — Anthropic, Stripe, Zapier, Cloudflare, Vercel, Mintlify — are companies whose primary audience is developers using AI coding assistants. For those use cases, llms.txt does provide value: tools like Cursor, Continue, Aider, and various RAG frameworks actively read it when present.
There are real fetch events. Mintlify reported 436 AI crawler visits after adding their llms.txt — small but non-zero. When you paste a URL directly into ChatGPT, Claude, or Perplexity, they read llms.txt files perfectly. Anthropic's Claude has been observed reading llms.txt during web crawling in some workflows.
So the honest summary: llms.txt is not currently the lever that moves AI search visibility for a typical small business. It's a forward-compatible standard with mixed evidence and clear utility in specific niches.
Why we built it into Web Gerek anyway
Four reasons: three strategic, one candid.
One: the cost is essentially zero. Generating a good llms.txt for a small business takes roughly 100 lines of code that auto-pull from existing data: business name, description, top pages, blog posts. There's no ongoing maintenance burden and essentially no downside. When the cost of a bet is near zero and the potential upside is non-zero, you take the bet.
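To make "about 100 lines that auto-pull from existing data" concrete, here is a minimal sketch of such a generator. The `Business` and `Page` dataclasses and all field names are hypothetical stand-ins for whatever data store your platform actually uses; this is not Web Gerek's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Page:
    title: str
    url: str
    description: str  # 5-15 words, per the principles below

@dataclass
class Business:
    name: str
    summary: str                      # 60-80 word plain-language description
    about: str
    services: list[str] = field(default_factory=list)
    pages: list[Page] = field(default_factory=list)

def generate_llms_txt(biz: Business) -> str:
    """Render a curated llms.txt from existing business data."""
    lines = [f"# {biz.name}", ""]
    # Blockquote summary: the chunk most likely to be quoted verbatim.
    lines += [f"> {biz.summary}", ""]
    lines += ["## About", "", biz.about, ""]
    if biz.services:
        lines += ["## Services", ""]
        lines += [f"- {s}" for s in biz.services]
        lines.append("")
    if biz.pages:
        lines += ["## Important Pages", ""]
        lines += [f"- [{p.title}]({p.url}): {p.description}" for p in biz.pages]
        lines.append("")
    return "\n".join(lines)

biz = Business(
    name="Example Barbershop",
    summary="Family-run barbershop in Sultanahmet, Istanbul. Walk-ins accepted.",
    about="Open Tuesday-Sunday 09:00-19:00. Cuts, shaves, and beard trims.",
    services=["Haircut: classic cut, 30 minutes", "Hot towel shave"],
    pages=[Page("Homepage", "https://example.com", "Overview and contact")],
)
print(generate_llms_txt(biz))
```

Hook a function like this into your publish pipeline and the file regenerates itself whenever the underlying business data changes.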
Two: forward compatibility matters. AI search is evolving fast. Today the major engines don't proactively fetch llms.txt, but the standard is gaining adoption (10% of sites and growing) and a critical mass would change vendor behavior. If/when ChatGPT or Perplexity decide to start preferring llms.txt-equipped sites, every Web Gerek site is already set up. We'd rather build it once and forget than scramble later.
Three: it works today for specific cases. When someone pastes your URL into Claude or ChatGPT and asks "what does this business do," having a clean llms.txt produces a measurably better answer than letting the AI reverse-engineer your homepage. That's not nothing — it's how a fraction of your prospects already learn about you.
Four (the honest one): it's a credibility signal. When a developer or sophisticated buyer checks your site for llms.txt and finds a well-structured one, that's a small but real trust signal. It says you're paying attention to where the web is heading. For a small business in a crowded category, small signals compound.
We didn't build it because we believe it's a magic citation lever. We built it because the cost was near-zero and the strategic logic was clear.
What a good llms.txt actually looks like
Here's the structure (simplified) of what every Web Gerek site ships with:
# [Business Name]
> [60-80 word summary in plain language: what the business does, who it serves,
> what makes it distinctive, location.]
## About
[2-3 paragraphs explaining the business, its services, hours, contact info,
and any structural details an AI would need to answer "what does this
business do" or "is this business open right now."]
## Services / Products
- [Service 1]: [Short description, price if applicable]
- [Service 2]: [Short description]
- [Service 3]: [Short description]
## Locations & Hours
- [Address, city, country]
- [Hours, broken down by day]
## Important Pages
- [Homepage](https://example.com): Overview and contact
- [Booking](https://example.com/booking): Online appointment scheduling
- [Menu](https://example.com/menu): Current menu and prices
- [Blog](https://example.com/blog): Tips and guides
## Contact
- Phone: +90 ...
- Email: ...
- Address: ...
The principles behind this structure:
- The summary is 60-80 words. That's the chunk most likely to be quoted verbatim if an AI cites you.
- Every link has a 5-15 word description. Don't just list URLs — tell the AI what's at each one.
- Information that varies (hours, prices, services) lives in the file. The AI can answer hours questions from the manifest without crawling pages.
- Markdown is intentional. AI engines parse markdown more reliably than HTML.
- It's bilingual if your site is. Web Gerek's llms.txt has both Turkish and English sections so AI engines can answer correctly in either language.
Common mistakes (some we made early)
Listing every page on the site. llms.txt is not sitemap.xml. The whole point is curation. List 10-30 key pages, not 200.
Marketing copy instead of facts. AI engines aren't your customers. "We are passionate about delivering excellence" is meaningless. "Family-run barbershop in Sultanahmet, open Tuesday-Sunday 09:00-19:00, walk-ins accepted" is what gets cited.
Letting it go stale. If hours, services, or prices change, the file should update too. We auto-regenerate on every publish. If you write yours by hand, set a calendar reminder to review quarterly.
Skipping the summary block. The blockquote summary at the top is the single most-quoted part of an llms.txt. Skipping it is like a website without a <title> tag — possible, but you lose your best lever.
Making it too long. Aim for 1,500-3,000 tokens (~1,000-2,000 words). Longer files get truncated by AI context windows.
Pointing to pages that don't exist. Sounds obvious. Happens constantly when sites get rebuilt. Always verify the links resolve.
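Two of these mistakes, broken links and excessive length, are easy to catch automatically. Here is a rough sanity checker, a sketch rather than a polished tool: the ~4 characters-per-token estimate is a crude heuristic, not a real tokenizer, and the function names are our own.

```python
import re
import urllib.request

LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")

def url_resolves(url: str) -> bool:
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status < 400
    except Exception:
        return False

def check_llms_txt(text: str, resolver=url_resolves) -> list[str]:
    """Flag likely problems in an llms.txt; empty list means it passed."""
    problems = []
    est_tokens = len(text) / 4  # rough heuristic: ~4 characters per token
    if est_tokens > 3000:
        problems.append(
            f"likely too long (~{est_tokens:.0f} tokens; aim for 1,500-3,000)"
        )
    if not any(line.startswith("> ") for line in text.splitlines()):
        problems.append("missing the blockquote summary at the top")
    for title, url in LINK_RE.findall(text):
        if not resolver(url):
            problems.append(f"broken link: {title} -> {url}")
    return problems
```

The `resolver` parameter is injectable so you can stub out the network in tests, or swap in a client that retries and follows redirects.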
Who should add llms.txt today
Not everyone, despite what some hype articles claim. The honest tiering:
Definitely yes:
- Developer tools, APIs, documentation sites — the original use case where it demonstrably works
- AI agent platforms or anything in the AI/ML space (signaling matters in this niche)
- Small service businesses on a platform that ships it for free (cost is zero, why not)
Probably yes:
- Content-heavy blogs and publishers — helps direct AI to your cornerstone content
- Local service businesses with a clear NAP, hours, and service list to communicate
Optional / low priority:
- Pure e-commerce stores (product feeds and schema.org/Product matter more)
- Lead-generation landing pages with no real "documentation" to reference
- Businesses where AI search isn't part of the customer journey
Skip for now:
- Sites you'd have to write the file for by hand, as a non-technical owner whose time isn't free. Spend that time on Google Business Profile, reviews, and schema instead; those have proven impact.
The future of llms.txt
Here's where I'll editorialize. The evidence today says llms.txt isn't moving citations. But three things make me think the standard isn't going away:
Adoption is compounding, not stalling. From "a few sites in 2024" to ~10% in early 2026 is a real curve. WordPress's Yoast plugin shipped one-click support. Cloudflare added a generator. Most modern static-site frameworks have plugins. When tooling catches up, adoption accelerates.
The agentic web is real. As AI agents (not just chatbots) start performing actions on websites — booking appointments, comparing options, making purchases — they need machine-readable manifests. llms.txt is currently the leading proposal for that interface.
Vendor incentives may shift. Right now, major AI engines don't need llms.txt because they have enough resources to crawl and parse aggressively. As query volume scales and inference costs compound, providing a curated manifest becomes a polite-but-real way to reduce the cost of being a good AI citizen.
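To make the "machine-readable manifest" point concrete, here is a sketch of how an agent might parse an llms.txt into structured data, assuming the spec's H1/blockquote/H2 layout. The parsing approach and output shape are our own choices, not part of the spec.

```python
import re

# Matches spec-style link lines: "- [Title](url): optional description"
LINK_RE = re.compile(r"-\s*\[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?")

def parse_llms_txt(text: str) -> dict:
    """Parse an llms.txt into {name, summary, sections: {title: [links]}}."""
    doc = {"name": None, "summary": [], "sections": {}}
    current = None
    for line in text.splitlines():
        if line.startswith("# ") and doc["name"] is None:
            doc["name"] = line[2:].strip()          # H1: site name
        elif line.startswith("> "):
            doc["summary"].append(line[2:].strip()) # blockquote summary
        elif line.startswith("## "):
            current = line[3:].strip()              # H2: new section
            doc["sections"][current] = []
        elif (m := LINK_RE.match(line.strip())) and current:
            title, url, desc = m.groups()
            doc["sections"][current].append(
                {"title": title, "url": url, "description": desc or ""}
            )
    doc["summary"] = " ".join(doc["summary"])
    return doc

sample = """# Example Café

> Independent coffee shop in Kadıköy.

## Important Pages

- [Menu](https://example.com/menu): Current drinks and prices
"""
parsed = parse_llms_txt(sample)
```

A booking or comparison agent that can pull this structure in one request has no reason to crawl and de-noise your full HTML first, which is exactly the cost argument above.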
If you're betting on llms.txt being important in 2026 specifically — that's a weak bet. If you're betting on it being important by 2028 — that's a much more reasonable bet. Either way, the cost of being wrong is near zero.
FAQ
Does Google use llms.txt? No. John Mueller confirmed in 2025 that Google's AI systems do not use it and have no plans to.
Does ChatGPT read my llms.txt? Not proactively during normal web search. But if someone gives ChatGPT your URL directly, it reads llms.txt and uses it to answer questions. Some inference workflows in Claude have been observed reading it during retrieval.
If it doesn't work, why are big companies like Anthropic and Stripe using it? Most of those implementations target developer-tool use cases (AI coding assistants, RAG frameworks, agent workflows) where llms.txt demonstrably does work. For those companies, the file is consumer-product infrastructure, not search marketing.
Will having an llms.txt hurt my rankings? No. It's not visible in search results, doesn't affect crawl budget meaningfully, and isn't a ranking signal for any major engine. The only risk is misconfiguration — if you accidentally point your llms.txt at the wrong URLs, you could mislead AI engines about your business.
How big should it be? Aim for 1,500-3,000 tokens (~1,000-2,000 words). Larger files often get truncated. Smaller files don't include enough context to be useful.
How often should I update it? On every meaningful business change — new services, updated hours, price changes, address changes. Web Gerek regenerates yours automatically on every publish. If you maintain it manually, review it quarterly.
Is it the same as llms-full.txt?
The spec defines two file types: llms.txt (curated index, ~1-3K tokens) and llms-full.txt (complete content, all pages flattened). Most sites just need llms.txt. llms-full.txt matters more for documentation-heavy sites.
If you want a properly structured llms.txt without writing one yourself, every site published on Web Gerek ships with one — auto-generated from your business info, bilingual where applicable, regenerated on every publish. You don't need to do anything.
For the broader picture of how AI search optimization actually works in 2026, see our guide on how to show up in ChatGPT and other AI engines.
Related Articles
Will Your Small Business Show Up in ChatGPT? AI Search Guide for 2026
How small service businesses get cited by ChatGPT, Perplexity, and Gemini in 2026. Practical 10-minute audit, the 5 signals that matter, and why this is the biggest discoverability shift since Google Maps.
Local SEO Checklist for Small Businesses: The Complete Guide
The complete local SEO checklist: Google Business Profile, website optimization, citations, reviews, schema markup, and content strategy. Actionable steps for every local business.
SEO Tips: How to Rank Your Small Business Website on Google
Practical SEO guide for small businesses: Google Business Profile optimization, mobile-first design, page speed, meta tags, local SEO, structured data, and review strategy.