
    robots.txt

    A text file that tells search engine crawlers which pages they can and cannot access on a website.

    What is robots.txt?

robots.txt is a plain text file served from the root of a website (e.g. https://example.com/robots.txt) that tells web crawlers such as Googlebot which pages or sections they may crawl and which they should avoid.

    Basic Syntax

```plaintext
User-agent: *
Allow: /
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
```
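
A compliant crawler evaluates these rules before fetching a URL. Here is a minimal sketch using Python's standard-library urllib.robotparser to check URLs against rules equivalent to the example above. One caveat: urllib.robotparser applies rules in file order (first match wins) rather than by longest match, so the Disallow line is listed first in this sketch, and the /private/report path is invented for illustration:

```python
from urllib.robotparser import RobotFileParser

# Rules equivalent to the example above. urllib.robotparser is
# order-sensitive (first matching rule wins), so Disallow is listed
# before the catch-all Allow.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The homepage is allowed...
print(parser.can_fetch("Googlebot", "https://example.com/"))                # True
# ...but everything under /private/ is blocked.
print(parser.can_fetch("Googlebot", "https://example.com/private/report"))  # False
```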

    Key Directives

    | Directive | Purpose |
    |-----------|---------|
    | User-agent | Specifies which crawler the rules apply to |
    | Allow | Permits crawling of specified paths |
    | Disallow | Blocks crawling of specified paths |
    | Sitemap | Points to your XML sitemap |
    | Crawl-delay | Suggests delay between requests (not universally supported) |
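
The Crawl-delay and Sitemap directives can also be read programmatically with the same standard-library parser, which is a quick way to see how a crawler might interpret them. A hedged sketch (site_maps() requires Python 3.8+; the slowbot user agent and 10-second delay are made up for illustration):

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse("""\
User-agent: slowbot
Crawl-delay: 10

User-agent: *
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
""".splitlines())

print(parser.crawl_delay("slowbot"))  # 10 (honored only by crawlers that support it)
print(parser.crawl_delay("*"))        # None: no delay requested for other agents
print(parser.site_maps())             # ['https://example.com/sitemap.xml']
```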

    Important Notes

- robots.txt is a suggestion, not enforcement: well-behaved crawlers honor it, but nothing forces compliance
- It does not prevent indexing: a disallowed URL can still appear in search results if other pages link to it; use a noindex meta tag (on a page crawlers can actually fetch) for that
- Keep it simple: overly complex rules can accidentally block important pages
- Test it: use Google Search Console's robots.txt report, or a quick programmatic check like the sketch below
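
For that programmatic check, the same standard-library parser can fetch and evaluate a live file; a minimal sketch, with example.com standing in for your own domain:

```python
from urllib.robotparser import RobotFileParser

# example.com is a placeholder; point this at your own site.
parser = RobotFileParser()
parser.set_url("https://example.com/robots.txt")
parser.read()  # downloads and parses the live file

# Check the specific user agent you care about.
print(parser.can_fetch("Googlebot", "https://example.com/private/report"))
```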
