    Chunking

    The process of breaking large documents into smaller, meaningful pieces for better AI retrieval and processing.

    What is Chunking?

    Chunking is the process of splitting large documents or web pages into smaller, semantically meaningful segments. It is a critical step in building retrieval-augmented generation (RAG) systems, where each segment is embedded and retrieved on its own.

    Why Chunking Matters

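    Context windows: LLMs can process only a limited number of tokens per request, so retrieved text has to be small enough to fit alongside the question.
    Precision: A smaller chunk covers fewer topics, so it can match a specific query more closely than a whole page would.
    Relevance: Only the best-matching chunks are sent to the model, not the entire document.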

    Chunking Strategies

    | Strategy | Description | Best For |
    |----------|-------------|----------|
    | Fixed-size | Split every N characters/tokens | Simple documents |
    | Semantic | Split at paragraph/section boundaries | Structured content |
    | Recursive | Progressively split using multiple separators | General purpose |
    | Sentence | Split at sentence boundaries | FAQ content |
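
    As a rough illustration of the first and third rows of the table, here is a minimal sketch in plain Python. The function names, the character-based size limit, and the separator order are assumptions chosen for this example, not part of any particular library.

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Fixed-size strategy: cut every `size` characters, ignoring content boundaries."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def recursive_chunks(
    text: str,
    max_chars: int,
    separators: tuple[str, ...] = ("\n\n", "\n", ". ", " "),  # assumed order: coarse to fine
) -> list[str]:
    """Recursive strategy: split on the coarsest separator first and fall back
    to finer separators only for pieces that are still too long."""
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    if not separators:
        # Nothing left to split on: fall back to a hard fixed-size cut.
        return fixed_size_chunks(text, max_chars)

    head, rest = separators[0], separators[1:]
    chunks: list[str] = []
    current = ""
    for piece in text.split(head):
        candidate = piece if not current else current + head + piece
        if len(candidate) <= max_chars:
            current = candidate  # keep merging pieces while they fit
        else:
            if current:
                chunks.extend(recursive_chunks(current, max_chars, rest))
            current = piece
    if current:
        chunks.extend(recursive_chunks(current, max_chars, rest))
    return chunks


sample = "Intro paragraph.\n\nSecond paragraph is a bit longer than the first one.\n\nEnd."
for chunk in recursive_chunks(sample, max_chars=40):
    print(repr(chunk))
```

    The recursive version keeps short paragraphs whole and breaks only the long one at word boundaries, which is why it is usually a safer default than a blind fixed-size cut.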

    Optimal Chunk Size

    There's no one-size-fits-all answer, but common guidelines are:
    100-200 tokens for FAQ-style content
    200-500 tokens for Q&A and support chatbots
    500-1000 tokens for detailed documentation
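
    To show how a token budget like the ones above could be enforced, the sketch below packs whole sentences into chunks until the budget is reached. It is an illustration under stated assumptions: `approx_tokens` is a crude word-count stand-in for a real tokenizer, and `pack_sentences` is a hypothetical helper name. Real token counts depend on the model's tokenizer, so pass your own counter through `count_tokens` if you have one.

```python
import re
from typing import Callable


def approx_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def pack_sentences(
    text: str,
    max_tokens: int,
    count_tokens: Callable[[str], int] = approx_tokens,
) -> list[str]:
    """Group consecutive sentences into chunks of at most `max_tokens`.
    A single sentence longer than the budget becomes its own oversized chunk."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []
    used = 0
    for sentence in sentences:
        n = count_tokens(sentence)
        if current and used + n > max_tokens:
            chunks.append(" ".join(current))  # budget reached: close the chunk
            current, used = [], 0
        current.append(sentence)
        used += n
    if current:
        chunks.append(" ".join(current))
    return chunks


faq = (
    "How do I reset my password? Open Settings and choose Reset. "
    "What is the refund policy? Refunds are issued within 30 days."
)
for chunk in pack_sentences(faq, max_tokens=12):
    print(chunk)
```

    Packing by sentence rather than cutting at an exact token count means no chunk ends mid-sentence, at the cost of chunks that are slightly smaller than the budget.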

    In SiteSupport

    When you crawl your website, SiteSupport automatically chunks each page into optimal segments, generates embeddings for each chunk, and indexes them for fast retrieval.
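
    The general chunk, embed, index, retrieve flow can be sketched as follows. This is not SiteSupport's implementation, which the text above does not describe in detail. Every name here is hypothetical, and `toy_embed` is a bag-of-words count vector standing in for a real embedding model, used only so the example runs without external services.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Index step: store each chunk next to its vector."""
    return [(chunk, toy_embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], query: str, k: int = 1) -> list[str]:
    """Retrieval step: return the k chunks most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


page_chunks = [
    "Refunds are issued within 30 days of purchase.",
    "Reset your password from the Settings page.",
    "Shipping takes 3 to 5 business days.",
]
index = build_index(page_chunks)
print(retrieve(index, "how do I reset my password"))
```

    In a production system the count vectors would be replaced by dense embeddings and the sorted list by a vector index, but the shape of the pipeline stays the same.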
