
There's a quiet but significant shift happening in how AI tools discover and read website content, and it could matter a lot for how your business shows up in AI-generated answers. A proposed standard called llms.txt, put forward by Australian technologist Jeremy Howard, aims to give website owners a straightforward way to tell AI models exactly what to read, and how.
Key takeaways
- llms.txt is a proposed standard that makes your website content easier for AI and large language models (LLMs) to read and use accurately.
- It works similarly to robots.txt but is designed for AI crawlers rather than traditional search engines.
- It uses simple Markdown format and can contain anything from a list of URLs to the full flattened text of your entire website.
- It's not a blocking tool. Think of it more as a guide that helps AI models find and understand your best content.
- Adoption is growing, with major tech companies already publishing their own llms.txt files.
What problem does llms.txt actually solve?
AI language models are hungry for web content, but they have a real limitation: their context windows are too small to process most websites in one go. On top of that, converting a typical webpage, complete with navigation menus, adverts, JavaScript and other clutter, into clean, readable text is messy and imprecise.
Howard's proposal states the problem directly: "While websites serve both human readers and LLMs, the latter benefit from more concise, expert-level information gathered in a single, accessible location."
In other words, your website is built for people. llms.txt lets you offer a version of it built for AI.
How llms.txt works
The idea is straightforward. You create a plain text file written in Markdown, name it llms.txt, and place it in the root directory of your website, so it lives at something like yourdomain.com/llms.txt. AI models that support the standard can then read that file to understand your content without having to crawl every single page.
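As a rough illustration of the location rule above, here is a minimal Python sketch that works out where a site's llms.txt would live, given any page URL on that site. The function name `llms_txt_url` is our own invention for this example, and the sketch only builds the address; it doesn't fetch anything.

```python
from urllib.parse import urlsplit, urlunsplit


def llms_txt_url(page_url: str) -> str:
    """Return the root-level llms.txt address for the site hosting page_url."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected an absolute URL, got {page_url!r}")
    # Keep the scheme and host, swap the path for /llms.txt,
    # and drop any query string or fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


print(llms_txt_url("https://yourdomain.com/blog/some-post?ref=home"))
# prints https://yourdomain.com/llms.txt
```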
What goes in the file is flexible. You can include:
- A simple list of URLs pointing to sections of your site
- URLs paired with short summaries of what each page covers
- The full raw text of your website flattened into a single file
You can also create an llms-full.txt variant that contains your complete site content in one place. Howard himself has a file on one of his websites that is 115,378 words long and 966 KB in size, containing the complete flattened text of the site. Your own file can be smaller or larger, or split across multiple directories, depending on your site's structure.
Another option is to create individual Markdown (.md) versions of specific pages you want AI models to pay particular attention to.
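To make the "URLs paired with short summaries" option more concrete, here is a small Python sketch that assembles such a file from a list of pages. The layout it produces (a title line, a one-line summary, then headed lists of links) follows our reading of the published proposal rather than anything spelled out in this article, so treat it as an assumption. The `Page` class, the `build_llms_txt` function and the example.com addresses are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    summary: str = ""  # optional one-line description of the page


def build_llms_txt(site_name: str, site_summary: str,
                   sections: dict[str, list[Page]]) -> str:
    """Assemble Markdown text for an llms.txt file (assumed layout)."""
    lines = [f"# {site_name}", "", f"> {site_summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            entry = f"- [{page.title}]({page.url})"
            if page.summary:
                entry += f": {page.summary}"
            lines.append(entry)
        lines.append("")
    # Finish with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    text = build_llms_txt(
        "Example Co",
        "Web design and maintenance for small businesses.",
        {
            "Services": [
                Page("Web design", "https://example.com/design.md",
                     "What a design project includes"),
                Page("Maintenance", "https://example.com/maintenance.md"),
            ],
        },
    )
    print(text)
```

You would save the returned text as llms.txt in your site's root directory, after reading it through yourself.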
Is it like robots.txt?
In some ways, yes. Both are simple text files that sit in your website's root directory and provide instructions to automated agents. But there's an important difference: llms.txt is not a blocking tool.
Robots.txt uses directives like "Disallow" to stop crawlers from accessing certain pages. llms.txt doesn't work that way. It has no robots.txt-style directives and isn't intended to include them. It's more about choosing which content to highlight and provide context for, rather than blocking anything outright.
As with robots.txt, an AI agent can choose to follow the llms.txt file or ignore it entirely. There's no enforcement mechanism. But for AI platforms that do honour it, you're giving them a cleaner, more useful signal about your content.
Who's already using it?
Several well-known technology companies have already published llms.txt or llms-full.txt files. Anthropic, Hugging Face, Perplexity, Zapier and others all have live examples you can look at. A resource called llms.txt Hub has compiled a list of AI developers using the standard for documentation and describes itself as one of the largest such resources for tracking adoption.
It's worth noting that llms.txt isn't just for developers or tech companies. It's designed for any website owner or content creator who wants to give AI models a better way to understand their site.
What's in it for your business?
The most immediate benefit is relevance. If an AI model can read a clean, well-structured version of your website, it's more likely to pull accurate information from it when answering a user's question. That matters increasingly as more people use AI tools like ChatGPT, Perplexity and others to find answers rather than browsing search results.
There's also a useful secondary benefit for you as a business owner. A fully flattened version of your site in a single file is genuinely handy for your own analysis: reviewing site content, spotting gaps, checking consistency of messaging, or understanding your site's structure at a glance.
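If you want a quick feel for what that kind of self-review could look like, the sketch below reports a few basic numbers for a flattened file: word count, size in bytes and the number of Markdown headings. The function name `flattened_stats` and the sample text are made up for illustration, and the heading count simply assumes that any line starting with "#" is a heading.

```python
def flattened_stats(text: str) -> dict[str, int]:
    """Return rough size figures for a flattened site file."""
    words = len(text.split())  # whitespace-separated tokens
    size_bytes = len(text.encode("utf-8"))
    # Assumption: any line that starts with "#" is a Markdown heading.
    headings = sum(
        1 for line in text.splitlines() if line.lstrip().startswith("#")
    )
    return {"words": words, "bytes": size_bytes, "headings": headings}


if __name__ == "__main__":
    sample = (
        "# Home\n\nWe build websites.\n\n"
        "## Services\n\nDesign and maintenance.\n"
    )
    print(flattened_stats(sample))
```

For a real file, you would read llms-full.txt from disk and pass its contents in. Unusually short sections or missing headings are often a sign of a content gap worth a closer look.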
For anyone already thinking about SEO and how their content performs in search, llms.txt is worth watching. It sits at the intersection of content strategy and the growing discipline sometimes called Generative Engine Optimisation (GEO), which is about ensuring your content surfaces accurately in AI-generated responses.
How to generate an llms.txt file
You don't need a developer to create a basic llms.txt file. There are generator tools available online, including Markdowner, which is a free, open-source tool that converts website content into structured Markdown. Many tools will process smaller sites for free, while larger sites may require a custom approach.
A few important cautions if you go down this route:
- Vet any third-party tool carefully before using it. Research its security practices before giving it access to your site content.
- Review the file it generates before you upload it to your server. Don't publish anything you haven't checked.
- If your site is large or complex, it may be worth asking a developer to build a custom solution rather than relying on a generic tool.
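As one way to support the review step above, here is a rough pre-publish check in Python. It assumes the generated file uses standard Markdown links, and flags any link that isn't https or that points away from your own domain. The function name `review_links` and the rules it applies are our own illustration, not part of the proposal, and it is no substitute for reading the file yourself.

```python
import re
from urllib.parse import urlsplit

# Matches Markdown links of the form [text](url) and captures the URL.
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def review_links(llms_txt: str, expected_host: str) -> list[str]:
    """Return a list of human-readable warnings about links in the file."""
    problems = []
    for number, line in enumerate(llms_txt.splitlines(), start=1):
        for url in LINK_PATTERN.findall(line):
            parts = urlsplit(url)
            if parts.scheme != "https":
                problems.append(f"line {number}: {url} is not an https URL")
            elif parts.hostname != expected_host:
                problems.append(
                    f"line {number}: {url} points outside {expected_host}"
                )
    return problems


if __name__ == "__main__":
    draft = (
        "# Example Co\n"
        "- [Home](https://example.com/)\n"
        "- [Old page](http://example.com/old)\n"
        "- [Partner](https://other.example.net/page)\n"
    )
    for warning in review_links(draft, "example.com"):
        print(warning)
```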
If your site runs on WordPress and you'd like help setting this up properly, that's exactly the kind of thing our website maintenance service covers.
Should you bother right now?
llms.txt is still a proposal rather than a formally ratified standard, and it has both supporters and sceptics. But the direction of travel is clear: AI models are consuming web content at scale, and the tools and standards around that are developing fast. Getting ahead of it now, while the implementation effort is relatively low, is a sensible move for most businesses.
At minimum, understanding what llms.txt is puts you in a better position to act when adoption becomes more widespread. At best, having a well-structured file in place now could give your content a genuine edge in how AI tools represent your business.
If you want help thinking through how this fits with your broader content and SEO strategy, get in touch with the IceBox team. We're happy to walk you through it.
Frequently asked questions
What is llms.txt and what does it do?
llms.txt is a proposed standard created by Australian technologist Jeremy Howard. It's a Markdown-formatted text file you place in your website's root directory to help AI language models read and understand your content more accurately, without having to crawl every page individually.
Is llms.txt the same as robots.txt?
They're similar in that both are simple text files placed in your website's root directory. But robots.txt uses directives like "Disallow" to block crawlers, while llms.txt has no blocking function. It's a guide for AI models about what content to read, not a gatekeeper.
Does llms.txt actually stop AI from using my content?
No. llms.txt doesn't block AI models from accessing your content. Like robots.txt, any AI agent can choose to follow it or ignore it. It gives context and direction, but there's no enforcement mechanism built in.
Who is already using llms.txt?
Several major technology companies have already published llms.txt or llms-full.txt files, including Anthropic, Hugging Face, Perplexity and Zapier. A resource called llms.txt Hub tracks adoption across AI developers and documentation sites.