FreeSEOTools.io

Robots.txt Analyzer

Fetch and analyze any website's robots.txt file. Instantly check if AI bots (GPTBot, ClaudeBot, PerplexityBot, Google-Extended) are allowed or blocked, view all crawl rules, and discover sitemap declarations.


How to Use the Robots.txt Analyzer

Enter any domain or URL (e.g. example.com or https://example.com) and click Analyze. The tool fetches the robots.txt file from the server and parses all directives. Results include:

  • AI bot access status (allowed, blocked, or partial)
  • All declared sitemap URLs
  • Complete rule table with User-agent, Allow, Disallow, and Crawl-delay values
  • Raw robots.txt content (expandable)
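The status classification described above can be sketched in a few lines of Python. This is a simplified illustration, not the tool's actual implementation: it assumes the rules have already been parsed into (user_agent, directive, path) tuples, and it ignores the precedence rule that a bot-specific group overrides the * group.

```python
# Simplified sketch of the AI bot status classification.
# Assumes rules are pre-parsed (user_agent, directive, path) tuples;
# names are illustrative. Real matching also honors group precedence
# (a bot-specific group overrides the * group), omitted here.

def classify_bot(rules, bot):
    """Return 'allowed', 'blocked', or 'partial' for one user agent."""
    matching = [(d, p) for agent, d, p in rules
                if agent.lower() in (bot.lower(), "*")]
    disallows = [p for d, p in matching if d == "disallow" and p]
    allows = [p for d, p in matching if d == "allow" and p]

    if not disallows:
        return "allowed"   # no Disallow rules for this bot at all
    if "/" in disallows:
        # Disallow: / blocks everything unless Allow rules carve out paths
        return "partial" if allows else "blocked"
    return "partial"       # some paths blocked, others open
```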

Understanding the AI Bot Status

Allowed: No Disallow rules found for this bot (either no rules at all, or an explicit Allow: /)
Blocked: A Disallow: / rule was found; this bot cannot access any page
Partial: Some paths are disallowed while others are allowed, or Disallow: / appears alongside specific Allow rules
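For example, each status corresponds to rule groups like these (illustrative robots.txt fragments):

```
# Blocked: Disallow: / denies everything
User-agent: GPTBot
Disallow: /

# Partial: only /private/ is off-limits
User-agent: ClaudeBot
Disallow: /private/

# Allowed: an empty Disallow permits everything
User-agent: PerplexityBot
Disallow:
```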

AI Bots Checked

GPTBot (OpenAI)

Trains ChatGPT and GPT models

ClaudeBot (Anthropic)

Trains Claude AI models

PerplexityBot (Perplexity AI)

Powers Perplexity search and AI answers

CCBot (Common Crawl)

Builds the Common Crawl dataset used to train many LLMs

Google-Extended (Google)

Trains Gemini (formerly Bard); separate from Googlebot

Googlebot (Google)

Main Google search crawler for indexing

Common Robots.txt Mistakes That Hurt SEO

Blocking CSS and JavaScript

If Googlebot cannot fetch your CSS/JS files, it cannot render your pages correctly. This can cause Google to misunderstand your content or miss important structured data.

Disallow during development

Staging sites often have "Disallow: /" set correctly, but this rule sometimes gets deployed to production. Check robots.txt after every major deployment.

Missing Sitemap declaration

Adding Sitemap: https://example.com/sitemap.xml to robots.txt ensures all major crawlers can find your sitemap, even if you have never submitted it to them directly.

Case-sensitive paths

User-agent matching is case-insensitive, but path matching is always case-sensitive: Disallow: /Admin will not block /admin.
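You can verify this path case sensitivity with Python's standard-library parser; the domain below is just a placeholder.

```python
# Demonstrates case-sensitive path matching with Python's built-in
# robots.txt parser; example.com is a placeholder domain.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse("""\
User-agent: *
Disallow: /Admin
""".splitlines())

print(rp.can_fetch("*", "https://example.com/Admin/users"))  # False: blocked
print(rp.can_fetch("*", "https://example.com/admin/users"))  # True: rule does not match
```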

Frequently Asked Questions

What is a robots.txt file and why does it matter for SEO?

A robots.txt file is a plain text file at the root of your domain (e.g. example.com/robots.txt) that instructs web crawlers which pages or sections they can or cannot access. It is part of the Robots Exclusion Protocol. For SEO, robots.txt controls which pages Googlebot crawls, helps manage crawl budget on large sites, and prevents duplicate or low-value pages from consuming crawl resources. Misconfigured robots.txt can accidentally block entire sections of your site from being indexed.

What is the difference between Disallow and noindex?

Disallow in robots.txt prevents crawlers from accessing a URL, but the URL can still appear in search results if other pages link to it (Google may index the URL without crawling the content). The noindex meta tag or HTTP header tells crawlers that they can crawl the page but should not include it in search results. To prevent indexing, use noindex. To save crawl budget on pages you definitely do not want crawled (like admin areas), use Disallow. Never use both Disallow and noindex on the same page β€” if Disallow is set, Google cannot read the noindex directive.
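For reference, the standard way to set noindex in page markup looks like this:

```html
<!-- In the page's <head>: the page stays crawlable but is excluded
     from search results -->
<meta name="robots" content="noindex">
```

For non-HTML resources such as PDFs, the equivalent is the X-Robots-Tag: noindex HTTP response header.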

How do I block AI bots like GPTBot and ClaudeBot?

To block AI training bots, add specific User-agent rules to your robots.txt. For example: User-agent: GPTBot followed by Disallow: / will block all OpenAI GPTBot access. For ClaudeBot (Anthropic), add User-agent: ClaudeBot with Disallow: /. You can also use User-agent: * with Disallow: / to block all bots, then selectively allow Googlebot. Note that blocking AI bots does not affect your search engine rankings unless you accidentally block Googlebot or other important crawlers.
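Putting those rules together, a robots.txt that blocks the AI training bots checked by this tool, while leaving search crawlers untouched, would look like this:

```
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```

Googlebot is not listed, so it keeps its default access and your search indexing is unaffected.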

Does robots.txt affect Core Web Vitals or page speed?

Robots.txt itself does not affect Core Web Vitals. However, blocking Googlebot from CSS and JavaScript files can prevent Google from rendering your pages correctly, which may cause Google to misinterpret your content and potentially affect rankings. Always allow Googlebot access to all resources needed to render your pages, including JS and CSS files. Use Google Search Console's URL Inspection tool to see how Google renders your pages.


Need a Full Technical SEO Audit?

Our SEO experts review your robots.txt, sitemap, crawl budget, and technical configuration to build a complete action plan for your site.

Get a Free SEO Audit