What robots.txt actually does
A robots.txt file is a plain-text file at the root of your website that tells search engine crawlers which parts of the site they may or may not request. When a well-behaved crawler like Googlebot or Bingbot visits, it reads this file first and follows the Allow and Disallow rules you set for its user-agent. Compliance is voluntary, so the file is a set of instructions rather than an access control. It is the simplest, oldest part of technical SEO, and also one of the easiest to get dangerously wrong, because a single stray Disallow: / can quietly remove an entire site from search results.
This generator gives you a structured form instead of a blank text file, so the syntax is always correct. You add one or more user-agent groups, list the paths each crawler should avoid or be allowed into, and optionally point crawlers at your sitemap. The output is standards-compliant and ready to paste.
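To make the idea of "a structured form instead of a blank text file" concrete, here is a minimal Python sketch of how user-agent groups and a sitemap URL could be rendered into robots.txt text. It is not this generator's actual code: the Group class, the build_robots_txt function, and the example.com paths are all hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Group:
    """One user-agent group: a crawler name plus its path rules (hypothetical structure)."""
    user_agent: str = "*"
    disallow: list[str] = field(default_factory=list)
    allow: list[str] = field(default_factory=list)


def build_robots_txt(groups: list[Group], sitemaps: list[str] | None = None) -> str:
    """Render groups and optional sitemap URLs as robots.txt text."""
    lines: list[str] = []
    for group in groups:
        lines.append(f"User-agent: {group.user_agent}")
        for path in group.allow:
            lines.append(f"Allow: {path}")
        for path in group.disallow:
            lines.append(f"Disallow: {path}")
        if not group.allow and not group.disallow:
            # An empty Disallow value means nothing is blocked for this group.
            lines.append("Disallow:")
        lines.append("")  # a blank line separates groups
    for url in sitemaps or []:
        lines.append(f"Sitemap: {url}")
    return "\n".join(lines).rstrip("\n") + "\n"


# Placeholder paths and sitemap URL, loosely modelled on an e-commerce preset.
example = build_robots_txt(
    [Group(user_agent="*", disallow=["/cart/", "/search"])],
    sitemaps=["https://example.com/sitemap.xml"],
)
print(example)
```

Because every line is produced from a field rather than typed by hand, a misspelled directive or a missing colon cannot slip in, which is the main benefit of generating the file.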
How to use it
- Pick a starting preset (allow all, block all, WordPress, or e-commerce) or build from scratch.
- For each user-agent group, set the crawler name (* means all crawlers) and add Disallow or Allow paths, one per line.
- Add your sitemap URL so crawlers can discover your pages faster.
- Copy or download the generated file and upload it to your site root as robots.txt.
- After deploying, test it in Google Search Console › robots.txt report to confirm it parses as expected.
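Before uploading, you can also sanity-check a file locally. The sketch below uses Python's standard urllib.robotparser module to ask whether a crawler may fetch a few URLs. The file contents, the check helper, and the example.com URLs are placeholders, and Python's matcher is simpler than a search engine's own (it applies the first matching rule in file order), so treat it as a quick pre-flight check rather than a substitute for the Search Console report.

```python
from __future__ import annotations

from urllib.robotparser import RobotFileParser

# Hypothetical file contents; the paths and sitemap URL are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /search

Sitemap: https://example.com/sitemap.xml
"""


def check(robots_txt: str, user_agent: str, urls: list[str]) -> dict[str, bool]:
    """Return {url: allowed} for one crawler, according to Python's parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(user_agent, url) for url in urls}


results = check(
    ROBOTS_TXT,
    "Googlebot",
    [
        "https://example.com/",
        "https://example.com/cart/checkout",
        "https://example.com/search?q=shoes",
    ],
)
for url, allowed in results.items():
    print("allowed" if allowed else "blocked", url)
```

With these placeholder rules, the home page is reported as allowed and the cart and search URLs as blocked.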
Important rules and gotchas
- It is public. Anyone can read yourdomain.com/robots.txt. Never list secret URLs there, because you are advertising them. Protect private pages with authentication instead.
- Disallow does not guarantee de-indexing. A blocked page can still appear in results if other sites link to it. To keep a page out of the index, allow crawling and use a noindex meta tag (a crawler that cannot fetch the page never sees the tag).
- Paths are case-sensitive and matched as prefixes. Disallow: /admin blocks /admin, /admin/ and /administrator.
- One file per site, always at the root. Subdomains need their own robots.txt.
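The prefix rule is easy to misjudge, so here is a small Python sketch of it. The is_disallowed helper is hypothetical and deliberately simplified: it does a plain, case-sensitive prefix comparison and ignores the * and $ wildcards that major crawlers also support.

```python
def is_disallowed(path: str, disallow_rule: str) -> bool:
    """Case-sensitive prefix match; an empty rule blocks nothing.

    Simplification: wildcards (* and $) are not handled here.
    """
    return bool(disallow_rule) and path.startswith(disallow_rule)


RULE = "/admin"
for path in ["/admin", "/admin/", "/administrator", "/Admin", "/blog/admin"]:
    print(f"{path:16} blocked={is_disallowed(path, RULE)}")
```

The first three paths are blocked because they begin with /admin. /Admin is not, because case matters, and /blog/admin is not, because the rule has to match from the start of the path. If you only want the admin directory, Disallow: /admin/ (with the trailing slash) leaves /administrator alone.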
Frequently asked questions
Will robots.txt block my AdSense or analytics?
It should not. Google’s ad crawler (Mediapartners-Google) and the AdSense review need to read your content pages. Blocking your own pages with Disallow: / can cause both indexing and ad-serving problems, so only block genuine admin, cart, or search-result paths.
What is crawl-delay and should I use it?
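Crawl-delay asks crawlers to wait a number of seconds between requests. It is a non-standard directive: Google ignores it and adjusts its crawl rate based on how your server responds (the old crawl-rate limiter in Search Console has been retired), but Bing and some other crawlers honour it. Only add it if your server is genuinely struggling with crawl load.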
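As an illustration of how a per-crawler delay is scoped, the sketch below gives Bingbot its own group with a Crawl-delay and reads the value back with the crawl_delay method of Python's urllib.robotparser (available since Python 3.6). The 10-second value and the /search path are placeholders, not recommendations.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: only the bingbot group carries a Crawl-delay.
ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 10
Disallow: /search

User-agent: *
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

bing_delay = parser.crawl_delay("bingbot")     # delay declared in bingbot's own group
other_delay = parser.crawl_delay("Googlebot")  # falls back to the * group, which sets none
print(bing_delay, other_delay)
```

Note that the bingbot group repeats the Disallow rule: a crawler that finds a group naming it follows only that group, so rules in the * group are not inherited.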
Do I even need a robots.txt file?
If you want every page crawled, a robots.txt is optional — but having one that explicitly allows everything and links your sitemap is good practice and avoids 404s on the /robots.txt request.
Is anything sent to a server?
No. The file is assembled in your browser from the form fields. Nothing you enter is transmitted or stored.
