Robots.txt Generator — Create & Validate robots.txt

This free robots.txt generator helps you create and validate the robots.txt file that tells search-engine crawlers which paths they may request. It builds User-agent, Allow, and Disallow directives plus the Sitemap line, and explains the key limitation: robots.txt controls crawling, not indexing, so a blocked URL can still appear in results — use a noindex meta tag to truly hide a page. Avoid the classic mistake of shipping Disallow: / from staging. Everything is generated locally in your browser; nothing is uploaded.

What is robots.txt?

Robots.txt is a plain text file placed at the root of your website (e.g., yoursite.com/robots.txt) that tells web crawlers which pages or directories they may and may not access. It follows the Robots Exclusion Protocol, a long-standing convention that was formalized as RFC 9309 in 2022 and is honored by the major search engines and most well-behaved bots.

The file uses simple directives: User-agent specifies which bot the rules apply to (* means all), Allow permits access to a path, Disallow blocks it, and Sitemap points crawlers to your sitemap. Robots.txt is a request, not a security measure: malicious bots ignore it entirely. Never use robots.txt to hide sensitive pages; use proper authentication instead. Common use cases include keeping crawlers out of /admin, /checkout, staging environments, and duplicate URL patterns created by query parameters. You can also use robots.txt to opt out of AI training crawlers: GPTBot and Google-Extended were introduced in 2023, and Common Crawl's CCBot can be blocked the same way.
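To make the directive structure concrete, here is a minimal Python sketch of how a generator might assemble the file. The function name build_robots_txt and its input shape are illustrative assumptions, not the code this tool actually runs.

```python
from typing import List, Mapping, Optional, Tuple

# Hypothetical helper for illustration; not this tool's real implementation.
def build_robots_txt(
    groups: Mapping[str, List[Tuple[str, str]]],
    sitemap: Optional[str] = None,
) -> str:
    """Render groups of (directive, path) rules, keyed by user-agent."""
    lines: List[str] = []
    for agent, rules in groups.items():
        lines.append(f"User-agent: {agent}")
        for directive, path in rules:
            if directive not in ("Allow", "Disallow"):
                raise ValueError(f"unsupported directive: {directive}")
            lines.append(f"{directive}: {path}")
        lines.append("")  # blank line separates groups
    if sitemap:
        lines.append(f"Sitemap: {sitemap}")
    # Exactly one trailing newline, with or without a Sitemap line.
    return "\n".join(lines).rstrip("\n") + "\n"


if __name__ == "__main__":
    print(build_robots_txt(
        {
            "*": [("Disallow", "/admin/"), ("Disallow", "/checkout")],
            "GPTBot": [("Disallow", "/")],
        },
        sitemap="https://example.com/sitemap.xml",
    ))
```

Each user-agent gets its own group, so the GPTBot block above applies only to that crawler while every other bot follows the * group.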

What robots.txt controls — and what it doesn't

A robots.txt file at your site root tells crawlers which paths they may request, using User-agent, Allow and Disallow directives. It manages crawl behaviour, not indexing: a URL blocked here can still appear in search results if other sites link to it. To truly keep a page out of the index, use a noindex meta tag instead — and don't also block it in robots.txt, or the crawler can't see the noindex.

Common directives

User-agent: *
Disallow: /admin/
Disallow: /cart
Allow: /
Sitemap: https://example.com/sitemap.xml
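One way to sanity-check rules like these before publishing is Python's standard-library urllib.robotparser. The sketch below is only an approximation of how a real crawler behaves: the bot name ExampleBot is made up, and the standard-library parser has historically applied rules in file order, whereas Google picks the most specific (longest) matching path. For this sample both approaches give the same answers.

```python
from urllib.robotparser import RobotFileParser

SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /cart
Allow: /
Sitemap: https://example.com/sitemap.xml
"""

def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching a URL."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

rules = load_rules(SAMPLE)

# "ExampleBot" is a placeholder user-agent; it falls under the * group.
for path in ("/", "/blog/post", "/admin/users", "/cart"):
    allowed = rules.can_fetch("ExampleBot", "https://example.com" + path)
    print(f"{path}: {'allowed' if allowed else 'blocked'}")

# site_maps() requires Python 3.8 or newer.
print(rules.site_maps())
```

Here / and /blog/post are allowed while /admin/users and /cart are blocked. Paths match by prefix, so Disallow: /cart also blocks a URL such as /cartoons; add a trailing slash when you mean the directory only.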

Mistakes that cost traffic

The most damaging error is shipping Disallow: / from a staging site to production, which blocks crawling of the entire site and can cause pages to drop out of search results. Other pitfalls: blocking your CSS or JS (Google needs them to render and rank the page), forgetting the Sitemap: line, and assuming robots.txt provides security. The file is public and purely advisory, so never list secret paths in it.
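A simple pre-deploy check can catch the blanket Disallow: / before it reaches production. The sketch below is a deliberately conservative assumption of how such a guard might look: the function name blocks_everything is hypothetical, and it flags any bare Disallow: / in the matching group without weighing Allow overrides or wildcards.

```python
def blocks_everything(robots_txt: str, agent: str = "*") -> bool:
    """Return True if the group for `agent` contains a bare 'Disallow: /'."""
    current_agents = []   # user-agents named by the group being read
    in_rules = False      # True once the group has seen Allow/Disallow
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:
                # A User-agent line after rules starts a new group.
                current_agents = []
                in_rules = False
            current_agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            if (field == "disallow" and value == "/"
                    and agent.lower() in current_agents):
                return True
    return False


if __name__ == "__main__":
    staging = "User-agent: *\nDisallow: /\n"
    if blocks_everything(staging):
        print("Refusing to deploy: robots.txt blocks the whole site")
```

Run in a release pipeline against the file about to be published, a check like this turns the staging mistake into a failed build rather than lost traffic.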

Frequently asked questions

Where do I put the robots.txt file?

It must be placed in the root directory of your website (e.g., yoursite.com/robots.txt).

Does every site need one?

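While not strictly required, it's best practice for managing crawl budget and keeping crawlers out of low-value areas such as admin pages or parameterized duplicate URLs. It does not prevent indexing or protect private content; use a noindex meta tag or authentication for that.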

Reviewed by the ToolsmithPro editorial team · Last updated June 2026. Every calculation and conversion runs entirely in your browser — your inputs are never uploaded, stored or shared. Formulas and methodology are documented on our about page; spot an error? tell us and we'll fix it.