FreeSEOTools.io

Robots.txt Analyzer

Fetch and analyze any website's robots.txt file. Instantly check if AI bots (GPTBot, ClaudeBot, PerplexityBot, Google-Extended) are allowed or blocked, view all crawl rules, and discover sitemap declarations.


How to Use the Robots.txt Analyzer

Enter any domain or URL (e.g. example.com or https://example.com) and click Analyze. The tool fetches the robots.txt file from the server and parses all directives. Results include:

  • AI bot access status (allowed, blocked, or partial)
  • All declared sitemap URLs
  • Complete rule table with User-agent, Allow, Disallow, and Crawl-delay values
  • Raw robots.txt content (expandable)
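The status classification described above can be sketched in a few lines of Python. This is a simplified illustration, not the tool's actual implementation: it assumes the rules have already been parsed into (user_agent, directive, path) tuples, and it ignores the precedence rule that a bot-specific group overrides the * group.

```python
# Simplified sketch of the AI bot status classification.
# Assumes rules are pre-parsed (user_agent, directive, path) tuples;
# names are illustrative. Real matching also honors group precedence
# (a bot-specific group overrides the * group), omitted here.

def classify_bot(rules, bot):
    """Return 'allowed', 'blocked', or 'partial' for one user agent."""
    matching = [(d, p) for agent, d, p in rules
                if agent.lower() in (bot.lower(), "*")]
    disallows = [p for d, p in matching if d == "disallow" and p]
    allows = [p for d, p in matching if d == "allow" and p]

    if not disallows:
        return "allowed"   # no Disallow rules for this bot at all
    if "/" in disallows:
        # Disallow: / blocks everything unless Allow rules carve out paths
        return "partial" if allows else "blocked"
    return "partial"       # some paths blocked, others open
```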

Understanding the AI Bot Status

Allowed: No Disallow rules found for this bot (either no rules at all, or an explicit Allow: /)
Blocked: A Disallow: / rule was found; this bot cannot access any page
Partial: Some paths are disallowed while others are allowed, or Disallow: / appears alongside specific Allow rules
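For example, each status corresponds to rule groups like these (illustrative robots.txt fragments):

```
# Blocked: Disallow: / denies everything
User-agent: GPTBot
Disallow: /

# Partial: only /private/ is off-limits
User-agent: ClaudeBot
Disallow: /private/

# Allowed: an empty Disallow permits everything
User-agent: PerplexityBot
Disallow:
```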

AI Bots Checked

GPTBot (OpenAI)

Trains ChatGPT and GPT models

ClaudeBot (Anthropic)

Trains Claude AI models

PerplexityBot (Perplexity AI)

Powers Perplexity search and AI answers

CCBot (Common Crawl)

Builds the Common Crawl dataset used to train many LLMs

Google-Extended (Google)

Trains Gemini (formerly Bard); separate from Googlebot

Googlebot (Google)

Main Google search crawler for indexing

Common Robots.txt Mistakes That Hurt SEO

Blocking CSS and JavaScript

If Googlebot cannot fetch your CSS/JS files, it cannot render your pages correctly. This can cause Google to misunderstand your content or miss important structured data.

Disallow during development

Staging sites often have "Disallow: /" set correctly, but this rule sometimes gets deployed to production. Check robots.txt after every major deployment.

Missing Sitemap declaration

Adding Sitemap: https://example.com/sitemap.xml to robots.txt ensures all major crawlers can find your sitemap, even if you have never submitted it to them directly.

Case-sensitive paths

User-agent matching is case-insensitive, but path matching is always case-sensitive: Disallow: /Admin will not block /admin.
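You can verify this path case sensitivity with Python's standard-library parser; the domain below is just a placeholder.

```python
# Demonstrates case-sensitive path matching with Python's built-in
# robots.txt parser; example.com is a placeholder domain.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse("""\
User-agent: *
Disallow: /Admin
""".splitlines())

print(rp.can_fetch("*", "https://example.com/Admin/users"))  # False: blocked
print(rp.can_fetch("*", "https://example.com/admin/users"))  # True: rule does not match
```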

Frequently Asked Questions

What is a robots.txt file and why does it matter for SEO?

A robots.txt file is a plain text file at the root of your domain (e.g. example.com/robots.txt) that instructs web crawlers which pages or sections they can or cannot access. It is part of the Robots Exclusion Protocol. For SEO, robots.txt controls which pages Googlebot crawls, helps manage crawl budget on large sites, and prevents duplicate or low-value pages from consuming crawl resources. Misconfigured robots.txt can accidentally block entire sections of your site from being indexed.

What is the difference between Disallow and noindex?

Disallow in robots.txt prevents crawlers from accessing a URL, but the URL can still appear in search results if other pages link to it (Google may index the URL without crawling the content). The noindex meta tag or HTTP header tells crawlers that they can crawl the page but should not include it in search results. To prevent indexing, use noindex. To save crawl budget on pages you definitely do not want crawled (like admin areas), use Disallow. Never use both Disallow and noindex on the same page β€” if Disallow is set, Google cannot read the noindex directive.
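For reference, the standard way to set noindex in page markup looks like this:

```html
<!-- In the page's <head>: the page stays crawlable but is excluded
     from search results -->
<meta name="robots" content="noindex">
```

For non-HTML resources such as PDFs, the equivalent is the X-Robots-Tag: noindex HTTP response header.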

How do I block AI bots like GPTBot and ClaudeBot?

To block AI training bots, add specific User-agent rules to your robots.txt. For example: User-agent: GPTBot followed by Disallow: / will block all OpenAI GPTBot access. For ClaudeBot (Anthropic), add User-agent: ClaudeBot with Disallow: /. You can also use User-agent: * with Disallow: / to block all bots, then selectively allow Googlebot. Note that blocking AI bots does not affect your search engine rankings unless you accidentally block Googlebot or other important crawlers.
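Putting those rules together, a robots.txt that blocks the AI training bots checked by this tool, while leaving search crawlers untouched, would look like this:

```
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```

Googlebot is not listed, so it keeps its default access and your search indexing is unaffected.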

Does robots.txt affect Core Web Vitals or page speed?

Robots.txt itself does not affect Core Web Vitals. However, blocking Googlebot from CSS and JavaScript files can prevent Google from rendering your pages correctly, which may cause Google to misinterpret your content and potentially affect rankings. Always allow Googlebot access to all resources needed to render your pages, including JS and CSS files. Use Google Search Console's URL Inspection tool to see how Google renders your pages.


Need a Full Technical SEO Audit?

Our SEO experts review your robots.txt, sitemap, crawl budget, and technical configuration to build a complete action plan for your site.

Get a Free SEO Audit