RAG document question answering

A retrieval-augmented generation system that answers questions about municipal policy documents using TF-IDF and BM25 retrieval with term-overlap re-ranking.

Information retrieval, NLP, TF-IDF, BM25, Question answering, Text search

0.82

Mean reciprocal rank

Interactive dashboard

Four-page Streamlit application for document retrieval and evaluation

Query

Natural language question input
Top-k retrieved passages with scores
Highlighted matching terms

Documents

15 municipal policy documents
Chunk length distribution
Full document viewer

Evaluation

30 Q&A pairs with ground truth
Precision and recall at different k
Per-query reciprocal rank details

Metrics dashboard

TF-IDF vs BM25 side-by-side comparison
MRR sensitivity to chunk size
Retrieval score distributions

Launch live app

$ pip install -r requirements.txt && streamlit run app.py

Key results

TF-IDF and BM25 retrieval evaluated on 30 municipal policy questions

0.82

MRR (TF-IDF)

0.89

Precision@3

15

Policy documents

30

Evaluation questions

Methodology

Fifteen synthetic municipal policy documents, covering Calgary bylaws, transit, water services, housing, parks, emergency management, and other topics, are split into overlapping chunks of 500 characters with a 50-character overlap. Two retrieval methods are compared: TF-IDF with cosine similarity and Okapi BM25. A term-overlap re-ranker then makes a second pass over the retrieved chunks to improve ranking quality. Evaluation uses 30 hand-crafted questions, each with a ground truth document ID, and measures precision@k, recall@k, and mean reciprocal rank (MRR).

Document chunking

500 characters, 50-character overlap
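As a rough illustration of the chunking step, the sketch below splits a string into 500-character windows that share 50 characters with their neighbour. The function name chunk_text and the handling of the final short chunk are assumptions for illustration, not taken from the project code.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap by `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # with the defaults, a new chunk starts every 450 characters
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(1000))
    for i, chunk in enumerate(chunk_text(sample)):
        print(i, len(chunk))
```

Character-based windows are simple and deterministic, but they can cut words and sentences in half; the overlap reduces the chance that an answer is split across two chunks.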

TF-IDF / BM25

Index and score
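The two scoring methods can be sketched in pure Python as follows. This is a minimal sketch under stated assumptions: the tokenizer, the smoothed IDF variant, the non-negative BM25 IDF, and the parameter defaults k1=1.5 and b=0.75 are illustrative choices, and the project itself may rely on a library or different settings.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumeric characters (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_cosine_scores(query: str, chunks: list[str]) -> list[float]:
    """Cosine similarity between TF-IDF vectors of the query and each chunk."""
    docs = [Counter(tokenize(c)) for c in chunks]
    n_docs = len(docs)
    df = Counter(term for d in docs for term in d)  # document frequency per term
    # Smoothed IDF (assumed variant): ln((1 + N) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + f)) + 1 for t, f in df.items()}

    def weigh(counts):
        # Terms that never occur in the corpus carry no weight.
        return {t: c * idf[t] for t, c in counts.items() if t in idf}

    def cosine(u, v):
        dot = sum(w * v[t] for t, w in u.items() if t in v)
        norm_u = math.sqrt(sum(w * w for w in u.values()))
        norm_v = math.sqrt(sum(w * w for w in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    q_vec = weigh(Counter(tokenize(query)))
    return [cosine(q_vec, weigh(d)) for d in docs]


def bm25_scores(query: str, chunks: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Okapi BM25 score of each chunk for the query (k1 and b are assumed defaults)."""
    docs = [tokenize(c) for c in chunks]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs if n_docs else 0.0
    df = Counter(term for d in docs for term in set(d))
    q_terms = set(tokenize(query))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in q_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant: ln(1 + (N - df + 0.5) / (df + 0.5))
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores
```

Both functions return one score per chunk in input order, so the top-k passages come from sorting chunk indices by score in descending order.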

Re-ranking

Term overlap scoring
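The page does not spell out how the term-overlap score is computed, so the following is only one plausible reading: re-order the first-pass candidates by the fraction of distinct query terms each chunk contains, breaking ties with the first-pass score. The function name rerank_by_term_overlap and the tie-breaking rule are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Distinct lowercase alphanumeric tokens (an assumed tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rerank_by_term_overlap(query: str, candidates: list[tuple[str, float]]):
    """Re-rank (chunk, first_pass_score) pairs by query-term overlap.

    Returns (chunk, first_pass_score, overlap) triples, best first.
    Overlap is the share of distinct query terms found in the chunk.
    """
    q_terms = tokenize(query)

    def overlap(chunk: str) -> float:
        if not q_terms:
            return 0.0
        return len(q_terms & tokenize(chunk)) / len(q_terms)

    rescored = [(chunk, score, overlap(chunk)) for chunk, score in candidates]
    # Sort by overlap first, then by the original retrieval score.
    return sorted(rescored, key=lambda item: (item[2], item[1]), reverse=True)


if __name__ == "__main__":
    first_pass = [
        ("Parking permits are issued by zone.", 0.9),
        ("A snow route parking ban applies after heavy snowfall.", 0.7),
    ]
    for chunk, score, ov in rerank_by_term_overlap("snow route parking ban", first_pass):
        print(f"{ov:.2f}  {score:.2f}  {chunk}")
```

A blended score, for example a weighted sum of the first-pass score and the overlap, would be an equally reasonable design; the strict ordering here just makes the effect of the second pass easy to see.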

Evaluation

MRR, P@k, R@k
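The three metrics can be computed from a ranked list of document IDs, one ID per retrieved chunk, and the ground truth document ID for each question. The sketch below assumes that a retrieved chunk counts as relevant when its document ID matches the ground truth; the function names are illustrative.

```python
def reciprocal_rank(ranked_doc_ids: list[str], relevant_id: str) -> float:
    """1 / rank of the first result from the relevant document, or 0 if it is absent."""
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id == relevant_id:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: list[tuple[list[str], str]]) -> float:
    """Average reciprocal rank over (ranked_doc_ids, relevant_id) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


def precision_at_k(ranked_doc_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Share of the top-k results whose document ID is relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_doc_ids[:k] if doc_id in relevant_ids)
    return hits / k


def recall_at_k(ranked_doc_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Share of the relevant documents that appear among the top-k results."""
    if not relevant_ids:
        return 0.0
    found = set(ranked_doc_ids[:k]) & relevant_ids
    return len(found) / len(relevant_ids)


if __name__ == "__main__":
    runs = [
        (["d1", "d2", "d3"], "d1"),  # relevant document at rank 1
        (["d4", "d5", "d6"], "d5"),  # relevant document at rank 2
        (["d7", "d8", "d9"], "d0"),  # relevant document not retrieved
    ]
    print(mean_reciprocal_rank(runs))
```

Because several chunks can come from the same document, precision@k under this assumption can exceed 1/k even with a single ground truth document per question.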

How to run

Set up the project locally in three commands

$ pip install -r requirements.txt
$ python data/generate_data.py
$ streamlit run app.py

Data source

The document corpus consists of 15 synthetic municipal policy texts modeled on Calgary city bylaws, strategic plans, and public service descriptions. Topics include land use zoning, public transit, water services, affordable housing, parks and recreation, emergency management, climate action, transportation infrastructure, business licensing, community safety, waste services, economic development, snow control, arts and culture, and property assessment. The 30 evaluation questions were written two per document, each paired with the ID of its ground truth relevant document.

Links

View code on GitHub

Source, notebooks, and document data

View notebooks

EDA, feature engineering, modeling, evaluation

Interactive Streamlit dashboard