Implementation Guide — RAG Examples

This directory originally contained Python example scripts that were removed from this archive.
An AI assistant can recreate them by following the instructions below.

`basic_rag.py`

Purpose: Implement a minimal Retrieval-Augmented Generation pipeline.

What to implement:

Ingestion: Load a small set of plain-text documents (can be hardcoded strings).
Chunking: Split each document into ~300-token chunks with 50-token overlap.
Embedding: Embed each chunk using sentence-transformers (all-MiniLM-L6-v2) or a hosted embeddings API such as OpenAI's (Anthropic does not offer its own embedding model and points users to Voyage AI).
Storage: Store chunk text + embedding in an in-memory list (or numpy array).
Retrieval: On a user query, embed the query, compute cosine similarity, return top-3 chunks.
Generation: Pass retrieved chunks as context into a Claude prompt and return the answer.
Demo: Run 2–3 sample questions and print retrieved context + final answer.
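The chunking and retrieval steps above can be sketched with the standard library alone. This is a minimal illustration, not the removed script: `chunk_tokens`, `toy_embed`, `cosine`, and `retrieve` are hypothetical names, whitespace-separated words stand in for tokens, and the bag-of-words `toy_embed` is only a placeholder for a real embedding model such as all-MiniLM-L6-v2.

```python
import math
from collections import Counter


def chunk_tokens(text, chunk_size=300, overlap=50):
    """Split text into chunks of up to chunk_size tokens; neighbours share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    tokens = text.split()  # assumption: whitespace words approximate model tokens
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a sparse word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=3):
    """Return the top_k (score, chunk) pairs, highest cosine similarity first."""
    query_vec = toy_embed(query)
    scored = [(cosine(query_vec, toy_embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

In a real pipeline the chunk embeddings would be computed once at ingestion time and stored alongside the text, rather than recomputed per query as this sketch does for brevity.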

How to run: python basic_rag.py
Dependencies: anthropic, sentence-transformers, numpy

`hybrid_retrieval.py`

Purpose: Combine dense (vector) and sparse (BM25) retrieval for better recall.

What to implement:

Reuse the chunking/document setup from basic_rag.py.
Dense retrieval: Same embedding + cosine similarity approach.
Sparse retrieval: Use rank_bm25 (BM25Okapi) for keyword-based scoring.
Fusion: Implement Reciprocal Rank Fusion (RRF) to merge both ranked lists:
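- score(d) = Σ_i 1 / (k + rank_i(d)), summed over every ranked list i that contains d, where rank_i(d) is the 1-based position of d in list i and k=60.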
Pass the top-3 fused results to Claude.
Compare outputs: print which chunks were selected by each method alone vs. hybrid.
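The fusion step can be sketched as follows. This is an illustrative version under stated assumptions: `reciprocal_rank_fusion` is a hypothetical helper name, the chunk ids are made up, and each input ranking is assumed to be a list of chunk ids ordered best match first (as produced by sorting the dense or BM25 scores).

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=3):
    """Merge ranked lists of chunk ids: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the output is deterministic.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:top_n]


# Hypothetical chunk ids, best match first in each list.
dense_ranking = ["c2", "c0", "c3", "c1"]
sparse_ranking = ["c0", "c4", "c2", "c1"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Chunks that rank well in both lists (here `c0` and `c2`) rise to the top, while a chunk that appears in only one list contributes a single term to its score. Because RRF uses ranks rather than raw scores, the dense and BM25 scores do not need to be normalized to a common scale.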

How to run: python hybrid_retrieval.py
Dependencies: anthropic, sentence-transformers, numpy, rank-bm25

Study Notes by Niladri & AI
