Free Robots.txt Generator
Configure user-agent rules, allow & disallow paths, and sitemap URLs — then download your finished robots.txt file ready to deploy. No signup, no limits.
Quick Presets
Configuration
Use * for all crawlers, or enter a specific bot name.
Generated robots.txt
User-agent: *
Disallow: /admin/
Allow: /
Sitemap: https://example.com/sitemap.xml
Place this file at the root of your domain: https://yourdomain.com/robots.txt
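For readers who want to script the same output, here is a minimal sketch of how a generator like this one might assemble the file. The function name build_robots_txt and the dict layout of each group are assumptions made for illustration, not this tool's actual code.

```python
def build_robots_txt(groups, sitemaps=()):
    """Render robots.txt text from a list of group dicts (hypothetical shape)."""
    lines = []
    for group in groups:
        lines.append(f"User-agent: {group['user_agent']}")
        for path in group.get("disallow", []):
            lines.append(f"Disallow: {path}")
        for path in group.get("allow", []):
            lines.append(f"Allow: {path}")
        if group.get("crawl_delay") is not None:
            lines.append(f"Crawl-delay: {group['crawl_delay']}")
        lines.append("")  # a blank line separates groups
    for url in sitemaps:
        lines.append(f"Sitemap: {url}")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip("\n") + "\n"


robots_txt = build_robots_txt(
    [{"user_agent": "*", "disallow": ["/admin/"], "allow": ["/"]}],
    sitemaps=["https://example.com/sitemap.xml"],
)
print(robots_txt)
```

Writing the returned string to a file named robots.txt and uploading it to the domain root is all the deployment step requires.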
What This Tool Does
Everything you need to generate a valid, production-ready robots.txt file.
Block Specific Bots
Target any crawler by name — Googlebot, GPTBot, CCBot, Bingbot, and more. Use presets for common configurations.
Allow & Disallow Rules
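Add multiple allow and disallow paths per user-agent group. When more than one rule matches a URL, crawlers that follow RFC 9309 (including Googlebot) apply the most specific rule, meaning the longest matching path, not the first one listed.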
Sitemap Declaration
Add one or more sitemap URLs directly to your robots.txt so crawlers can discover your full content index.
Download Ready
Download the finished file as robots.txt — ready to deploy at the root of your domain in one step.
robots.txt Rules Explained
A robots.txt file is made up of one or more groups. Each group starts with a User-agent line and contains rules that apply to that bot.
User-agent: *
Applies the following rules to all crawlers. Replace * with a bot name like Googlebot to target only that crawler.
Example: User-agent: GPTBot
Disallow: /path/
Tells the crawler not to access this path or anything under it. Use Disallow: / to block your entire site.
Example: Disallow: /admin/
Allow: /path/
Explicitly permits access to a path, even if a broader Disallow rule would block it. Used for exceptions.
Example: Allow: /public/
Crawl-delay: 10
Asks the crawler to wait 10 seconds between requests. Reduces server load. Googlebot ignores this directive.
Example: Crawl-delay: 5
Sitemap: URL
Points the crawler to your sitemap file. Can appear multiple times. Helps crawlers discover all your pages.
Example: Sitemap: https://example.com/sitemap.xml
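How an Allow exception interacts with a broader Disallow is easier to see in code. The sketch below assumes the precedence described in RFC 9309: the longest matching path wins, and Allow wins a tie. The function name is_allowed and the tuple format are illustrative assumptions, and wildcards (* and $) are deliberately not handled.

```python
def is_allowed(rules, path):
    """Return True if `path` may be crawled under one group's `rules`.

    `rules` is a list of (directive, path_prefix) tuples,
    for example ("disallow", "/admin/").
    """
    best_len = -1
    allowed = True  # with no matching rule, the path is allowed
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # skip empty and non-matching rules
        is_allow = directive.lower() == "allow"
        # A longer prefix wins; on a tie, Allow beats Disallow.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


rules = [("disallow", "/admin/"), ("allow", "/admin/public/"), ("allow", "/")]
print(is_allowed(rules, "/admin/settings"))         # False
print(is_allowed(rules, "/admin/public/logo.png"))  # True
print(is_allowed(rules, "/blog/"))                  # True
```

Not every parser resolves conflicts this way. Some evaluate rules in file order instead, so listing the more specific rule first is a reasonable defensive habit.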
Common robots.txt Mistakes
Small errors in robots.txt can block search engines from your entire site or leave sensitive areas exposed.
✕ Blocking CSS and JS files
Never disallow your stylesheet or script directories. Google needs to render your pages to understand them — blocking assets breaks that.
✕ Using robots.txt to hide sensitive pages
robots.txt is public. If you list /private/ in Disallow, anyone can read it and know that path exists. Use authentication instead.
✕ Forgetting to remove staging Disallow
A common incident: staging has Disallow: / and that config gets deployed to production. Always check your live robots.txt after every deploy.
✕ Thinking Disallow prevents indexing
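Disallow blocks crawling, not indexing. A blocked page can still appear in search if other sites link to it. To keep a page out of search results, use a noindex meta tag or X-Robots-Tag header, and leave the page crawlable: a crawler that is blocked from the page never sees the noindex.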
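The staging mistake described above lends itself to an automated post-deploy check. The following is a rough sketch, assuming the robots.txt text has already been fetched as a string. The function name blocks_entire_site is hypothetical, and the check only looks for a literal Disallow: / in a group that includes User-agent: *, without weighing any Allow rules in the same group.

```python
def blocks_entire_site(robots_txt):
    """Return True if a group containing `User-agent: *` has `Disallow: /`."""
    agents = []       # user-agents named by the group being read
    in_rules = False  # True once the current group has an allow/disallow line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a rule was already seen, so a new group starts here
                agents, in_rules = [], False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value == "/" and "*" in agents:
                return True
    return False


staging = "User-agent: *\nDisallow: /\n"
production = "User-agent: *\nDisallow: /admin/\nAllow: /\n"
print(blocks_entire_site(staging))     # True
print(blocks_entire_site(production))  # False
```

Run against the live file after each deploy, a True result is a strong hint that a staging configuration reached production.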
Frequently Asked Questions
Common questions about robots.txt files and how crawlers use them.
What is a robots.txt file?
A robots.txt file tells web crawlers which parts of your site they can and cannot access. It sits at the root of your domain and is checked by most search engine bots before they start crawling.
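To illustrate the check a well-behaved crawler performs before fetching a page, here is a small sketch using Python's standard urllib.robotparser module. The bot name ExampleBot and the URLs are placeholders, and the file content is parsed from a string rather than downloaded.

```python
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Ask whether a given bot may fetch a given URL.
admin_ok = parser.can_fetch("ExampleBot", "https://example.com/admin/users")
blog_ok = parser.can_fetch("ExampleBot", "https://example.com/blog/post")
sitemaps = parser.site_maps()  # available in Python 3.8+

print(admin_ok)  # False
print(blog_ok)   # True
print(sitemaps)  # ['https://example.com/sitemap.xml']
```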
Does robots.txt prevent Google from indexing pages?
Blocking a URL in robots.txt stops Googlebot from crawling it, but doesn't guarantee it won't appear in search results. If other sites link to the page, Google may still index the URL from those signals. To exclude a page from search results, add a noindex meta tag or X-Robots-Tag header and do not block the page in robots.txt, because Googlebot has to crawl the page to see the noindex.
Should I block AI bots with robots.txt?
If you don't want your content used for AI training, you can block crawlers like GPTBot (OpenAI), CCBot, Google-Extended, and anthropic-ai by name. Compliance is voluntary: the major AI companies say their crawlers honor robots.txt, but the file cannot enforce anything. The 'Block AI bots' preset in this tool covers the main ones.
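As a rough idea of what such a block could look like, the sketch below builds a single group that lists several User-agent lines sharing one Disallow: / rule, which RFC 9309 permits. The bot names are the ones mentioned above; the function name block_bots_group is hypothetical, and the preset's actual output may differ.

```python
AI_BOTS = ["GPTBot", "CCBot", "Google-Extended", "anthropic-ai"]


def block_bots_group(bot_names):
    """Build one group that disallows the whole site for the named bots."""
    lines = [f"User-agent: {name}" for name in bot_names]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


ai_block = block_bots_group(AI_BOTS)
print(ai_block)
```

Crawler names change over time, so it is worth checking each vendor's current documentation before relying on a fixed list.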
Where do I put the robots.txt file?
It must be at your domain root — https://yourdomain.com/robots.txt. A file in a subdirectory does nothing. Subdomains each need their own robots.txt.
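The location rule can be expressed as a short helper that maps any page URL to the robots.txt that governs it. This is a sketch using the standard urllib.parse module; the function name robots_url_for is an assumption for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url_for(page_url):
    """Return the robots.txt URL for the host that serves `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any subdomain and port); drop the rest.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url_for("https://yourdomain.com/blog/post?id=1"))
# https://yourdomain.com/robots.txt
print(robots_url_for("https://shop.yourdomain.com/cart"))
# https://shop.yourdomain.com/robots.txt
```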
What is Crawl-delay?
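Crawl-delay tells a bot how many seconds to wait between page requests. It reduces server load from aggressive crawlers, but it is a nonstandard directive and support varies by crawler. Note: Googlebot ignores Crawl-delay. Google retired the crawl rate limiter in Search Console in early 2024; its documentation now suggests temporarily returning 500, 503, or 429 responses if Googlebot is overloading your server.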
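For crawlers that do honor the directive, reading it is straightforward. The sketch below uses the crawl_delay method of Python's standard urllib.robotparser; the bot names SlowBot and OtherBot are made up for the example.

```python
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: SlowBot
Crawl-delay: 10
Disallow: /search/

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

slow_delay = parser.crawl_delay("SlowBot")      # 10
default_delay = parser.crawl_delay("OtherBot")  # None: no delay was set for *

print(slow_delay, default_delay)
```

A polite crawler would sleep for that many seconds between requests and fall back to its own default when the value is None.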