What is robots.txt?
robots.txt is a plain text file placed at the root of a website that tells web crawlers (such as Googlebot) which pages or sections they may or may not crawl.

Basic Syntax
```plaintext
User-agent: *
Allow: /
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
```

Key Directives
| Directive | Purpose |
|-----------|---------|
| User-agent | Specifies which crawler the rules apply to |
| Allow | Permits crawling of specified paths |
| Disallow | Blocks crawling of specified paths |
| Sitemap | Points to your XML sitemap |
| Crawl-delay | Suggests a delay between requests (not universally supported; Googlebot ignores it) |
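These directives can be checked programmatically. Below is a minimal sketch using Python's standard-library urllib.robotparser to fetch a site's robots.txt and query its rules; the example.com URL and user-agent strings are placeholders, and the results depend on what the live file actually contains.

```python
from urllib.robotparser import RobotFileParser

# Point the parser at the site's robots.txt (example.com is a placeholder).
rp = RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()  # fetch and parse the live file

# Ask whether a given crawler may fetch a given URL.
print(rp.can_fetch("Googlebot", "https://example.com/private/page.html"))
print(rp.can_fetch("*", "https://example.com/index.html"))

# Crawl-delay and Sitemap entries, where present (site_maps() needs Python 3.8+).
print(rp.crawl_delay("*"))  # None if no Crawl-delay directive applies
print(rp.site_maps())       # list of Sitemap URLs, or None
```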
Important Notes
• robots.txt is a suggestion, not enforcement: well-behaved crawlers follow it, but malicious ones can ignore it
• It does not prevent indexing: use a noindex meta tag for that
• Keep it simple: overly complex rules can accidentally block important pages
• Test it: use Google Search Console's robots.txt tester, or validate a draft locally as in the sketch below
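A draft robots.txt can also be validated locally before it goes live. The sketch below feeds a candidate file to urllib.robotparser and asserts the intended behavior; the rules and paths are illustrative. Note that Python's parser applies rules in file order (first match wins) rather than Google's longest-match precedence, which is why the Disallow line comes before the blanket Allow here.

```python
from urllib.robotparser import RobotFileParser

# A candidate robots.txt to validate before deploying (contents are illustrative).
# urllib.robotparser uses first-match-wins ordering, so the more specific
# Disallow rule is listed before the blanket Allow.
draft = """\
User-agent: *
Disallow: /private/
Allow: /
"""

rp = RobotFileParser()
rp.parse(draft.splitlines())  # parse() accepts the file as a list of lines

# Verify the rules behave as intended before publishing the file.
assert rp.can_fetch("*", "/index.html")
assert not rp.can_fetch("*", "/private/secret.html")
print("Draft rules behave as expected.")
```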