The Essential RAG Book

Baseline RAG Pipeline

Key Takeaways

  • The baseline Retrieval-Augmented Generation (RAG) model integrates dense retrieval with a sequence-to-sequence generator.
  • The baseline pipeline operates in three sequential phases: retrieval, context fusion, and generation.

The baseline Retrieval-Augmented Generation (RAG) model integrates dense retrieval with a sequence-to-sequence generator. It represents the canonical form of RAG and provides a foundation for subsequent variants like Dynamic and Context-Aware RAG.

[User Query] → [Retriever] → [Top-k Contexts]
                                   ↓
                           [Generator (LLM)]
                                   ↓
                               [Response]

Figure 3 - Standard Baseline RAG Pipeline.
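The flow in Figure 3 can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `embed` is a toy bag-of-words vector standing in for a dense encoder, `generate` is a placeholder for a real language model call, and all function names and the sample corpus are assumptions made for this example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a dense encoder: a bag-of-words count vector (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    """Phase 1: rank passages by similarity to the query and keep the top k."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


def fuse(query, contexts):
    """Phase 2: concatenate the retrieved passages and the query into one prompt."""
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(contexts, start=1))
    return f"Context:\n{numbered}\n\nQuestion: {query}\nAnswer:"


def generate(prompt):
    """Phase 3: placeholder for a generator call (hypothetical, no model is loaded)."""
    return f"<model output conditioned on {len(prompt)} prompt characters>"


def rag_answer(query, corpus, k=2):
    """Run retrieval, context fusion, and generation in sequence."""
    contexts = retrieve(query, corpus, k)
    prompt = fuse(query, contexts)
    return generate(prompt), contexts, prompt


# Illustrative corpus (invented for this sketch).
corpus = [
    "DPR is a bi-encoder that embeds queries and passages separately.",
    "BART is a sequence-to-sequence model used as a generator.",
    "Paris is the capital of France.",
]
```

Calling `rag_answer("What is the capital of France?", corpus)` returns the placeholder output together with the retrieved passages and the fused prompt. In a real system, `embed` would likely be replaced by a trained encoder with a vector index, and `generate` by a call to the chosen language model; the three-function split mirrors the modular design of the pipeline.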

The baseline pipeline operates in three sequential phases: retrieval, context fusion, and generation. A retriever encodes queries and documents into a shared vector space, often with a bi-encoder architecture such as DPR. The top-k context passages are selected by cosine-similarity search over the embedding index. These passages are concatenated or otherwise fused with the query and passed to a language model such as BART, T5, or Llama-2-Chat for conditioned generation.

The retriever and generator may be jointly trained or decoupled. In the decoupled case, the retriever is trained with a contrastive objective, while the generator is fine-tuned on supervised QA pairs. Joint training optimizes retrieval and generation together by maximizing the marginal likelihood of the output over the retrieved passages, so gradients from the generation loss also reach the retriever.

Although simple, baseline RAG provides strong grounding and adapts efficiently to external data. Its modular design allows drop-in replacement of retrievers and generators, which makes it well suited to production systems and research baselines.
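The two training regimes can be written out explicitly. The notation below is an assumption introduced for illustration and does not come from the text: x is the query, y the target output, z a retrieved passage, q(·) and d(·) the query and document encoders, p⁺ a relevant passage, and p⁻ⱼ the negative passages. It shows one common formulation, with a softmax contrastive loss for the decoupled retriever and sequence-level marginalization for joint training; other variants exist.

```latex
% Decoupled retriever: contrastive loss over one positive and n negative passages
\mathcal{L}_{\text{retr}}
  = -\log \frac{\exp\big(\mathrm{sim}(q(x), d(p^{+}))\big)}
               {\exp\big(\mathrm{sim}(q(x), d(p^{+}))\big)
                + \sum_{j=1}^{n} \exp\big(\mathrm{sim}(q(x), d(p^{-}_{j}))\big)}

% Retrieval distribution induced by the similarity scores
p_{\eta}(z \mid x) \propto \exp\big(\mathrm{sim}(q(x), d(z))\big)

% Joint training: marginal likelihood over the top-k retrieved passages
p(y \mid x) \approx \sum_{z \in \text{top-}k} p_{\eta}(z \mid x)\, p_{\theta}(y \mid x, z),
\qquad
\mathcal{L}_{\text{joint}} = -\log p(y \mid x)
```

Because the retrieval probability appears inside the sum, minimizing the joint loss updates the retriever parameters η along with the generator parameters θ. The top-k selection itself is not differentiated through; only the scores of the selected passages receive gradients.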
