robots.txt Generator

Using the robots.txt Generator

This tool helps you create a robots.txt file to guide search engine crawlers on your site. It provides a user-friendly interface for setting both general and specific rules.

  1. Set a Default Policy: Choose whether to allow or disallow all crawlers by default. "Allow all" is the most common setting.
  2. Add Specific Rules (Optional): Click "Add User-agent Rule" to create rules for specific crawlers (like Googlebot) or to disallow access to certain directories (like /admin/).
  3. Add Your Sitemap: Paste the full URL to your sitemap.xml file. This is highly recommended.
  4. Copy the Code: The generator builds the file content in real time. Copy the code, create a new file named robots.txt, paste the content in, and upload the file to the root directory of your website. A sample of the generated output appears below.
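
For example, the most common configuration (an "Allow all" default plus a sitemap) produces output along these lines; the sitemap URL is a placeholder you would replace with your own:

    User-agent: *
    Disallow:

    Sitemap: https://www.example.com/sitemap.xml

An empty Disallow: line blocks nothing, so every crawler may request any page on the site.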

What is a robots.txt file?

A robots.txt file is a simple text file that lives in the root directory of your website (e.g., https://www.example.com/robots.txt). Its purpose is to provide instructions to web crawlers, also known as bots or spiders, about which pages or files the crawler can or cannot request from your site.

It is important to note that this file is a guideline, not a gatekeeper. Malicious bots will likely ignore it completely, so it should never be used to hide private information. Its primary purpose is to manage crawler traffic, keeping your server from being overwhelmed with requests and steering crawlers away from low-value pages (such as internal search results).

Key Directives Explained

  • User-agent: Specifies which crawler the rules that follow apply to. User-agent: * is a wildcard that applies to all crawlers; you can also target specific bots, such as User-agent: Googlebot.
  • Disallow: Tells the user-agent not to crawl a specific URL path. For example, Disallow: /images/ asks crawlers not to access the /images/ directory.
  • Allow: Counteracts a Disallow rule. For example, you might disallow an entire directory but specifically allow one file within it.
  • Sitemap: Points crawlers to the location of your XML sitemap, which helps them discover all the pages on your site that you want them to index.
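
Putting these directives together, a complete file might look like the following sketch; the /private/ directory and the public-report.html file inside it are hypothetical paths used purely for illustration:

    User-agent: *
    Disallow: /private/
    Allow: /private/public-report.html

    User-agent: Googlebot
    Disallow: /images/

    Sitemap: https://www.example.com/sitemap.xml

Note that major crawlers generally obey only the most specific group that matches them, so Googlebot would follow the second group here rather than the first.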

Best Practices

  • Location is Key: The file must be named robots.txt (all lowercase) and placed in the root directory of your domain.
  • One Directive Per Line: Each User-agent, Allow, Disallow, or Sitemap directive must be on its own line.
  • Use Comments: You can add comments by starting a line with a hash symbol (#). Crawlers ignore them, but they are useful for explaining complex rules to colleagues and to your future self; a short commented example follows this list.
  • Do Not Use for Security: A robots.txt file is publicly accessible. Never use it to block access to sensitive or private user information. Use proper authentication and server-side rules for that.
  • Test Your File: After uploading your file, use the robots.txt Tester in Google Search Console to ensure it works as you expect and does not accidentally block important content.
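
As a small sketch of these practices combined, here is a short commented file; the /internal-search/ path is hypothetical and stands in for whatever low-value section you want crawlers to skip:

    # Keep crawlers out of internal search result pages
    User-agent: *
    Disallow: /internal-search/

    # Tell crawlers where to find the sitemap
    Sitemap: https://www.example.com/sitemap.xml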