    RAG (Retrieval-Augmented Generation)

    An AI technique that combines information retrieval with text generation to produce more accurate, grounded responses.

    What is RAG?

    Retrieval-Augmented Generation (RAG) is an AI architecture that enhances large language models (LLMs) by retrieving relevant information from external knowledge bases before generating responses.

    How RAG Works

    1. Query Processing: The user's question is converted into a vector embedding.
    2. Retrieval: The system searches a knowledge base for the documents or chunks whose embeddings are most similar to the query embedding.
    3. Augmentation: The retrieved context is combined with the original query into a single prompt.
    4. Generation: The LLM generates a response grounded in the retrieved information.
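
    The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the knowledge base contents are made up, `embed` is a bag-of-words stand-in for a learned embedding model, and `generate` is a placeholder where a real LLM call would go. All function names here are hypothetical.

```python
import math
import re
from collections import Counter

# Toy knowledge base; these documents are invented for illustration.
KNOWLEDGE_BASE = [
    "Refunds are available within 30 days of purchase.",
    "Our support team is available Monday through Friday.",
    "Shipping to Europe usually takes five to seven business days.",
]


def embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words count vector, not a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Steps 1 and 2: embed the query, then rank documents by similarity."""
    query_vec = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, embed(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Step 3: combine the retrieved context with the original query."""
    context_block = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}"
    )


def generate(prompt: str) -> str:
    """Step 4: placeholder for an LLM call (assumption: swap in a real client)."""
    return f"[LLM response for a prompt of {len(prompt)} characters]"


question = "How long does shipping to Europe take?"
passages = retrieve(question, KNOWLEDGE_BASE)
answer = generate(build_prompt(question, passages))
```

    In a real system the bag-of-words vectors would typically be replaced by embeddings from a trained model, and the linear scan over documents by a vector database query.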

    Why RAG Matters

    A standalone LLM can rely only on its training data, which may be outdated or incomplete. RAG addresses this by:
    Reducing hallucinations: Responses are grounded in actual source material, so the model has less need to guess.
    Staying current: The knowledge base can be updated without retraining the model.
    Supporting domain specificity: You can point the system at your own data for specialized answers.

    RAG in Customer Support

    Platforms like SiteSupport use RAG to ground AI chatbots in your website content. When a visitor asks a question, the system retrieves relevant pages from your site and generates an answer based on that content.

    Key Components

    | Component | Purpose |
    |-----------|---------|
    | Embeddings | Convert text into numerical vectors for similarity search |
    | Vector Database | Store and query document embeddings efficiently |
    | LLM | Generate natural language responses from retrieved context |
    | Chunking | Break documents into appropriately sized pieces for retrieval |
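
    As an illustration of the chunking component, here is a minimal sketch of fixed-size chunking with overlap. It assumes word-based splitting; the function name `chunk_text` and the default sizes are hypothetical. Real pipelines often split on tokens, sentences, or document structure instead.

```python
def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
pieces = chunk_text(sample, chunk_size=4, overlap=1)
```

    The overlap means a sentence that straddles a chunk boundary still appears largely intact in at least one chunk, at the cost of storing some text twice.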
