webtomcp

2026-05-17 · KY · ~5 min read

llms.txt vs MCP: When You Need a Static File and When You Need a Server

Two different technologies. Two different jobs. llms.txt is a plain-text file you put at the root of your site to tell AI crawlers and language models what your site contains. Think of it as a structured table of contents aimed at AI readers. MCP (Model Context Protocol) is a live query protocol that lets AI assistants ask questions about your site on demand and get retrieved answers with citations. You don't have to pick one or the other. Most sites with meaningful content end up wanting both, because the two do different jobs at different times.

What llms.txt is

llms.txt is a community-proposed convention (llmstxt.org) for a plain Markdown file placed at https://yourdomain.com/llms.txt. The file summarizes your site: what you do, key facts, links to important pages. Like robots.txt, it sits at a well-known path at the domain root, but where robots.txt tells crawlers what they may access, llms.txt tells them what the site is about.
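To make the file shape concrete, here is a minimal sketch that renders an llms.txt body in the layout the llmstxt.org proposal describes: a top-level title, a blockquote summary, then sections of Markdown links. The site name, URLs, and the `Link` and `render_llms_txt` names are made up for illustration; they are not part of any tool mentioned in this post.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    url: str
    note: str = ""


def render_llms_txt(name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render an llms.txt body: H1 title, blockquote summary, H2 link sections."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            # The note after the colon is optional.
            suffix = f": {link.note}" if link.note else ""
            lines.append(f"- [{link.title}]({link.url}){suffix}")
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site, used only for illustration.
example = render_llms_txt(
    name="Example Co",
    summary="Example Co sells widgets and publishes widget maintenance guides.",
    sections={
        "Docs": [
            Link("Pricing", "https://yourdomain.com/pricing", "current plans"),
            Link("Guides", "https://yourdomain.com/guides"),
        ]
    },
)
print(example)
```

Save the printed text as llms.txt at your domain root. For a small site, writing the file by hand works just as well.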

When an AI crawler (GPTBot, ClaudeBot, PerplexityBot) ingests your site, it may find and read llms.txt first. If the file is well-written, the crawler gets a clean, structured summary of your site's purpose and content without having to parse every page. AI models that answer questions from their training data or via retrieval-augmented search can use this as a signal for what your site is about and what its key URLs are. Crawlers are not required to read the file, so treat it as a hint rather than a guarantee.
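From the reader's side, the file is easy to consume because it is just Markdown with a predictable shape. The sketch below pulls out the title, summary, and per-section links. It is an illustration of why the format is convenient, not a description of how any particular crawler actually parses it; `parse_llms_txt` and the sample text are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Matches "- [Title](url)" with an optional ": note" suffix.
LINK_RE = re.compile(r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$")


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    links: dict[str, list[tuple[str, str]]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Extract the H1 title, blockquote summary, and links grouped by H2 section."""
    doc = LlmsTxt()
    section = ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith("> ") and not doc.summary:
            doc.summary = line[2:].strip()
        elif line.startswith("## "):
            section = line[3:].strip()
            doc.links[section] = []
        elif section and (m := LINK_RE.match(line)):
            doc.links[section].append((m["title"], m["url"]))
    return doc


SAMPLE = """# Example Co

> Example Co sells widgets.

## Docs

- [Pricing](https://yourdomain.com/pricing): current plans
- [Guides](https://yourdomain.com/guides)
"""

parsed = parse_llms_txt(SAMPLE)
print(parsed.title, parsed.links)
```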

llms.txt is passive and discoverable. You write it once, update it occasionally, and AI crawlers pick it up. You don't control when or whether a specific AI user gets the information — you're broadcasting to any crawler that visits.

What MCP is

MCP (Model Context Protocol) is an open spec published by Anthropic that defines a standard wire format for AI clients to query external data sources interactively. An MCP server exposes tools — typically search(query) and fetch(url) — and an AI client calls those tools when it needs information from that source.
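On the wire, MCP messages are JSON-RPC 2.0, and a tool invocation is a `tools/call` request carrying the tool name and its arguments. The sketch below builds such requests for the `search` and `fetch` tools named above. The argument names `query` and `url` follow this post's description and may differ on a given server; the initial handshake and transport are left out.

```python
import json
from itertools import count

# JSON-RPC requests carry a unique id so responses can be matched to them.
_request_ids = count(1)


def tool_call_request(tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


search_msg = tool_call_request("search", {"query": "pricing"})
fetch_msg = tool_call_request("fetch", {"url": "https://yourdomain.com/pricing"})
print(search_msg)
print(fetch_msg)
```

In practice the AI client builds these messages for you; the point is only that a tool call is a small, structured request rather than a page crawl.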

When a user asks Claude or ChatGPT a question and the AI has an MCP server connected, it issues a live query to that server, gets back retrieved passages with source URLs, and incorporates them into the answer. The user sees a cited, current response pulled from the actual indexed content.

MCP is active and on-demand. When a question calls for your content, the AI client fires a live retrieval query. The response reflects the most recently indexed content, not what was in a training snapshot. The AI doesn't guess; it queries.
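The server side of that exchange can be sketched with an in-memory index and a naive keyword search. This is a toy under stated assumptions, not how WebToMCP or any real MCP server is implemented: the `INDEX` contents, the scoring, and the `handle_tools_call` name are all hypothetical, and a production server would use an MCP SDK and a proper retrieval backend. The last lines show the freshness point: once the index changes, the same query returns the new text.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str


# Hypothetical index contents, for illustration only.
INDEX = [
    Page("https://yourdomain.com/pricing", "Pro plan costs 20 USD per month."),
    Page("https://yourdomain.com/about", "We build widgets for small teams."),
]


def search(query: str, limit: int = 3) -> list[Page]:
    """Rank pages by how often the query terms occur in their text."""
    terms = query.lower().split()
    scored = []
    for page in INDEX:
        body = page.text.lower()
        score = sum(body.count(term) for term in terms)
        if score:
            scored.append((score, page))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [page for _, page in scored[:limit]]


def handle_tools_call(request: dict) -> dict:
    """Answer a `tools/call` request for the `search` tool with cited passages."""
    params = request["params"]
    if params["name"] != "search":
        error = {"code": -32602, "message": f"Unknown tool: {params['name']}"}
        return {"jsonrpc": "2.0", "id": request["id"], "error": error}
    hits = search(params["arguments"]["query"])
    content = [{"type": "text", "text": f"{p.text}\nSource: {p.url}"} for p in hits]
    return {"jsonrpc": "2.0", "id": request["id"], "result": {"content": content, "isError": False}}


request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "search", "arguments": {"query": "pro plan"}},
}
before = handle_tools_call(request)

# Simulate a re-crawl that picks up a changed page.
INDEX[0].text = "Pro plan costs 25 USD per month."
after = handle_tools_call(request)
print(after["result"]["content"][0]["text"])
```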

The core difference, with examples

A table helps:

| Aspect | llms.txt | MCP server |
| --- | --- | --- |
| What it is | Static Markdown file | Live query endpoint |
| Who reads it | AI crawlers, training pipelines | AI assistant clients (Claude, ChatGPT, Cursor) |
| When it's used | Crawl time, before any user question | Query time, when a user asks a question |
| Freshness | As fresh as you update the file | As fresh as your last index re-crawl |
| Coverage | Summary + key URLs | Full site, every page, queryable |
| Requires user action | No: crawlers find it automatically | Yes: user must connect the MCP endpoint to their AI client |
| Setup cost | Write one file, ~30 minutes | Index site, ~5 minutes with a managed service |

Concrete example: a user asks ChatGPT about your pricing. If you only have llms.txt, ChatGPT may have seen a summary of your site during training or a crawl, but it has no direct way to query your content at question time, so it answers from whatever was in that summary or its training data. If your site is indexed behind an MCP server and the user has connected it, ChatGPT queries the index, pulls the exact pricing page, and returns a cited answer with the numbers from your most recent index.

When to use each — and when to use both

Use llms.txt alone if your site has minimal dynamic content (a five-page product landing page, a simple portfolio), you don't need real-time retrieval, and you're mainly concerned with how AI crawlers describe you in training pipelines and web searches.

Use MCP alone if your audience is technical users who will actively connect an AI assistant to your content (developer tool companies, API docs, engineering wikis). They'll use the MCP endpoint directly; they may never look at your llms.txt.

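Use both if you run a content site with more than a few dozen pages, which is the right answer for most such sites. llms.txt covers crawl time, when crawlers and training pipelines form a picture of your site; the MCP server covers query time, when a user asks a question that needs a specific page.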

They're complementary, not competing. One is your AI-facing table of contents; the other is your AI-facing search index.
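The guidance above can be written down as a small decision helper. This is one reading of the rules, not an official checklist: the `SiteProfile` fields are hypothetical, and the cutoff of 36 pages is an assumption standing in for "a few dozen", since the post gives no exact number.

```python
from dataclasses import dataclass


@dataclass
class SiteProfile:
    page_count: int
    needs_live_retrieval: bool
    audience_connects_ai_clients: bool  # e.g. developers wiring docs into an assistant


# Assumption: "a few dozen pages" is read as 36 here.
FEW_DOZEN = 36


def recommend(site: SiteProfile) -> set[str]:
    """Map a site profile to the artifacts suggested in the text above."""
    if site.audience_connects_ai_clients:
        # Technical users go straight to the endpoint.
        return {"mcp"}
    if site.page_count <= FEW_DOZEN and not site.needs_live_retrieval:
        # Small, mostly static site: the summary file is enough.
        return {"llms.txt"}
    # Default for larger content sites: both.
    return {"llms.txt", "mcp"}


landing_page = SiteProfile(page_count=5, needs_live_retrieval=False, audience_connects_ai_clients=False)
api_docs = SiteProfile(page_count=400, needs_live_retrieval=True, audience_connects_ai_clients=True)
content_site = SiteProfile(page_count=120, needs_live_retrieval=True, audience_connects_ai_clients=False)

for site in (landing_page, api_docs, content_site):
    print(site.page_count, sorted(recommend(site)))
```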

WebToMCP generates both for you

When you index a site with WebToMCP, you get the MCP endpoint URL (for live querying) plus a generated llms.txt for your site that you can download and put at your domain root. The llms.txt is auto-populated from your indexed content: key summaries, canonical page URLs, section structure.

You don't have to pick. Sign in, submit your URL, get both. The MCP endpoint goes into your AI client; the llms.txt goes at the root of your domain for crawlers.

Free tier covers both. Sign in with Google to get started, or try the live demo endpoints on the landing page first.


Questions? developer@webtomcp.net. Or sign in with Google to index your own site (free).