References — RAG Module

Core papers, documentation, and guides for deep RAG study.


Primary Documentation

Anthropic

LangChain

LlamaIndex


Foundational Papers

Original RAG Paper

  • Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
    Lewis et al., Facebook AI Research (2020)
    https://arxiv.org/abs/2005.11401
    The paper that introduced the RAG framework. Proposes the RAG-Sequence and RAG-Token
    variants and fine-tunes the retriever's query encoder jointly with the generator,
    keeping the document index fixed. Essential reading for understanding the original
    formulation before it became an engineering pattern.
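
For orientation, the two variants differ in where the sum over retrieved documents sits. The block below is a sketch of the paper's marginalization, writing x for the input, y for the output sequence of length N, z for a retrieved document, p_η for the retriever, and p_θ for the generator; check the notation against the paper before relying on it.

```latex
% RAG-Sequence: one retrieved document conditions the whole output sequence
p_{\text{RAG-Sequence}}(y \mid x) \approx
  \sum_{z \in \operatorname{top-}k\left(p_\eta(\cdot \mid x)\right)}
  p_\eta(z \mid x) \prod_{i=1}^{N} p_\theta(y_i \mid x, z, y_{1:i-1})

% RAG-Token: each output token marginalizes over the retrieved documents
p_{\text{RAG-Token}}(y \mid x) \approx
  \prod_{i=1}^{N}
  \sum_{z \in \operatorname{top-}k\left(p_\eta(\cdot \mid x)\right)}
  p_\eta(z \mid x)\, p_\theta(y_i \mid x, z, y_{1:i-1})
```

In RAG-Sequence the product over tokens sits inside the sum, so each document has to explain the full answer. In RAG-Token the sum sits inside the product, so different tokens can draw on different documents.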

HyDE Paper

  • Precise Zero-Shot Dense Retrieval without Relevance Labels
    Gao et al. (2022)
    https://arxiv.org/abs/2212.10496
    Introduces Hypothetical Document Embeddings (HyDE): an LLM generates a hypothetical
    answer document for the query, and that document's embedding is used for dense
    retrieval instead of the raw query's embedding. Reports consistent improvements on
    BEIR tasks without any relevance labels or fine-tuning of the retriever.
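
The HyDE flow described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `embed` is a toy bag-of-words stand-in for a dense encoder, and `generate_hypothetical_document` is a hypothetical placeholder for an LLM call (here it returns a canned answer, or the query itself when it has none).

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a dense encoder: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def generate_hypothetical_document(query):
    """Hypothetical placeholder for an LLM that drafts a plausible answer."""
    canned = {
        "why is the sky blue?": (
            "Rayleigh scattering in the atmosphere scatters short blue "
            "wavelengths of sunlight more than red."
        ),
    }
    # Fall back to the raw query when no draft answer is available.
    return canned.get(query.lower(), query)


def hyde_retrieve(query, corpus, k=1):
    """Rank real documents by similarity to the hypothetical document."""
    hypo_vec = embed(generate_hypothetical_document(query))
    ranked = sorted(corpus, key=lambda doc: cosine(hypo_vec, embed(doc)), reverse=True)
    return ranked[:k]


corpus = [
    "Sunlight is scattered by the atmosphere; Rayleigh scattering favors blue wavelengths.",
    "The ocean appears blue because water absorbs red light.",
    "Bread rises because yeast produces carbon dioxide.",
]
top = hyde_retrieve("Why is the sky blue?", corpus, k=1)
print(top[0])
```

The hypothetical document may contain factual errors; the idea is that it still lands near relevant real documents in embedding space, and only real documents are returned to the generator.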

Evaluation

RAGAS

  • RAGAS Documentation
    https://docs.ragas.io
    A widely used framework for RAG evaluation. Covers Faithfulness, Answer Relevancy,
    Context Precision, Context Recall, and more. Includes guides for building evaluation
    datasets and integrating with LangChain / LlamaIndex.

Vector Databases

Chroma

  • Chroma Documentation
    https://www.trychroma.com/docs
    Official docs for the Chroma open-source vector database. Covers collections,
    embedding functions, metadata filtering, persistent vs in-memory modes, and the
    Python/JavaScript clients.

Pinecone

  • Pinecone Documentation
    https://docs.pinecone.io
    Managed vector database docs. Covers indexes, namespaces, serverless vs pod-based
    architecture, metadata filtering, hybrid search, and production best practices.

Additional Resources

Surveys and Deep Dives

  • BEIR: A Heterogeneous Benchmark for Zero-Shot Evaluation of Information Retrieval Models
    Thakur et al. (2021) — https://arxiv.org/abs/2104.08663
    The standard benchmark for comparing retrieval models. If you want to evaluate
    embedding models or retrieval strategies, BEIR is the reference.

  • Lost in the Middle: How Language Models Use Long Contexts
    Liu et al. (2023) — https://arxiv.org/abs/2307.03172
    Demonstrates empirically that LLMs underutilize information in the middle of long
    contexts. Directly relevant to how you order retrieved chunks in the prompt.

  • Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
    Asai et al. (2023) — https://arxiv.org/abs/2310.11511
    Proposes training a model to decide when to retrieve and to critique its own outputs.
    Related to agentic RAG patterns.

  • FLARE: Active Retrieval Augmented Generation
    Jiang et al. (2023) — https://arxiv.org/abs/2305.06983
    Forward-looking active retrieval: the model triggers retrieval mid-generation when
    uncertain. Relevant to the FLARE section in the README.
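
One common way to act on the Lost in the Middle finding listed above is to reorder retrieved chunks so the strongest ones sit at the edges of the prompt and the weakest in the middle. The sketch below assumes the input is already sorted from most to least relevant; `reorder_for_long_context` is a hypothetical helper name, not an API from any library mentioned here.

```python
def reorder_for_long_context(chunks_by_relevance):
    """Place the most relevant chunks at the start and end of the context.

    Input must be sorted from most to least relevant. Even-ranked chunks fill
    the front in order; odd-ranked chunks fill the back in reverse, so the
    least relevant chunks end up in the middle.
    """
    front, back = [], []
    for rank, chunk in enumerate(chunks_by_relevance):
        if rank % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)
    return front + back[::-1]


ranked = ["A", "B", "C", "D", "E"]  # "A" is the most relevant chunk
ordered = reorder_for_long_context(ranked)
print(ordered)  # ['A', 'C', 'E', 'D', 'B']
```

Whether this helps depends on the model and context length, so it is worth measuring against plain relevance order on your own evaluation set.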

BM25 and Hybrid Retrieval

ColBERT and Reranking

  • ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction
    Khattab & Zaharia (2020) — https://arxiv.org/abs/2004.12832
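    Introduces late interaction: query and document tokens are encoded independently and
    compared with the MaxSim operator, aiming for quality close to a cross-encoder at a
    much lower query-time cost.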

  • RAGatouille
    https://github.com/bclavie/RAGatouille
    Python library making ColBERT practical for RAG systems. Wraps training, indexing,
    and retrieval in a simple API.
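
The MaxSim operator from the ColBERT entry above is small enough to write out directly. This is a pure-Python sketch on hand-made 2-D vectors, assuming token embeddings are already L2-normalized; real ColBERT embeddings come from a trained BERT encoder and are scored in batches.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for each query token, take its best-matching
    document token similarity, then sum those maxima over the query tokens."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy token embeddings (assumed unit length); two query tokens.
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]  # contains an exact match for each query token
doc_b = [[0.6, 0.8]]                          # a single token that partially matches both

score_a = maxsim_score(query_vecs, doc_a)
score_b = maxsim_score(query_vecs, doc_b)
print(score_a > score_b)  # True
```

Because document token embeddings do not depend on the query, they can be computed and indexed offline; only the query encoding and the MaxSim step run at search time.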

Chunking


Tools Referenced in This Module

Tool                    Purpose                            Link
chromadb                Local/embedded vector DB           https://www.trychroma.com
rank-bm25               BM25 implementation in Python      https://github.com/dorianbrown/rank_bm25
sentence-transformers   Local embedding models             https://www.sbert.net
BAAI/bge-m3             Top open-source embedding model    https://huggingface.co/BAAI/bge-m3
RAGAS                   RAG evaluation framework           https://docs.ragas.io
Cohere Rerank           Reranking API                      https://docs.cohere.com/reference/rerank
LangChain               RAG orchestration framework        https://python.langchain.com
LlamaIndex              RAG orchestration framework        https://docs.llamaindex.ai
Qdrant                  High-performance vector DB         https://qdrant.tech/documentation
pgvector                PostgreSQL vector extension        https://github.com/pgvector/pgvector
FAISS                   Meta’s ANN library                 https://faiss.ai