Project 12
Predicting 500,000+ Calgary property values with XGBoost and explaining every individual valuation with SHAP waterfall plots, to help owners and real estate professionals understand their assessments
Streamlit application
Four pages covering property valuation predictions, SHAP-based explanations, geographic assessment maps, and historical trend analysis.
Valuator
Enter property attributes and get an instant assessed value prediction with a confidence range
SHAP explainer
Per-prediction waterfall plots showing which features push the value up or down and by how much
Map
Geographic visualization of assessed values by community with median price overlays and heatmaps
Trends
Historical assessment trends by community and property type with year-over-year change analysis
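The per-prediction breakdown that a waterfall plot displays can be illustrated with a minimal sketch. It computes exact Shapley values by brute-force subset enumeration for a toy pricing function, so it is neither SHAP TreeExplainer nor the project's XGBoost model. The names `toy_model`, `area_m2`, `detached`, and `age_years` are hypothetical, and a single reference property stands in for the background data that SHAP would normally average over.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction, by enumerating feature subsets.

    Features outside a coalition are held at their baseline value.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude `name`.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_f = {**without, name: x[name]}
                total += weight * (predict(with_f) - predict(without))
        phi[name] = total
    return phi


def toy_model(p):
    # Hypothetical pricing function with an area x detached interaction.
    return (200_000
            + 1_500 * p["area_m2"]
            + 40_000 * p["detached"]
            + 500 * p["area_m2"] * p["detached"]
            - 1_000 * p["age_years"])


baseline = {"area_m2": 100, "detached": 0, "age_years": 30}  # reference property
x = {"area_m2": 140, "detached": 1, "age_years": 10}         # property to explain

phi = shapley_values(toy_model, x, baseline)

# Waterfall-style listing: start at the baseline value, add each contribution.
print(f"baseline value: {toy_model(baseline):,.0f}")
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"  {name}: {contribution:+,.0f}")
print(f"prediction: {toy_model(baseline) + sum(phi.values()):,.0f}")
```

The contributions always sum to the gap between the prediction and the baseline value, which is the property that lets a waterfall plot walk from one to the other bar by bar. Brute-force enumeration grows exponentially with the number of features, which is why tree-specific algorithms are used in practice.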
Key results
R-squared
0.77
XGBoost regressor on 617K+ property assessments
MAE
$42K
Mean absolute error in predicted assessed value
Properties
617K
Property assessment records from Calgary Open Data via the Socrata API
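For readers unfamiliar with the two headline metrics, here is a small sketch of how R-squared and MAE are defined. The values are made up for illustration and are not the project's data. The sketch computes both metrics in dollars, which is an assumption: the page does not say whether R-squared was measured on raw or log-transformed values.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Share of variance in `actual` explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up assessed values in dollars, for illustration only.
actual = [400_000, 520_000, 310_000, 760_000]
predicted = [430_000, 500_000, 350_000, 700_000]

mae = mean_absolute_error(actual, predicted)
r2 = r_squared(actual, predicted)
print(f"MAE: ${mae:,.0f}  R-squared: {r2:.3f}")
```

MAE stays in the units of the target, so it reads directly as a typical dollar error, while R-squared is unitless and compares the model against always predicting the mean.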
Methodology
Fetched 617,000+ property assessments from Calgary Open Data via the Socrata API. Cleaned, de-duplicated, and log-transformed the right-skewed value distribution. Engineered community-level aggregates and land-use frequencies. Compared Ridge, Random Forest, Gradient Boosting, and XGBoost regressors, then applied SHAP TreeExplainer for global feature importance and per-prediction waterfall plots.
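The preparation steps above can be sketched in plain Python. This is an illustration under assumptions, not the project's pipeline: the field names (`roll_number`, `community`, `land_use`, `assessed_value`) are hypothetical, the records are invented, the community aggregate is taken to be a median, and `log1p` is one common choice for the log transform.

```python
import math
from collections import Counter, defaultdict
from statistics import median

# Invented records; field names are assumptions, not the dataset's schema.
records = [
    {"roll_number": "A1", "community": "BELTLINE", "land_use": "RES", "assessed_value": 300_000},
    {"roll_number": "A1", "community": "BELTLINE", "land_use": "RES", "assessed_value": 300_000},
    {"roll_number": "B2", "community": "BELTLINE", "land_use": "RES", "assessed_value": 500_000},
    {"roll_number": "C3", "community": "VARSITY", "land_use": "COM", "assessed_value": 900_000},
]


def prepare(records):
    # De-duplicate on the assumed unique key, keeping the first occurrence.
    seen, unique = set(), []
    for r in records:
        if r["roll_number"] not in seen:
            seen.add(r["roll_number"])
            unique.append(r)

    # Community-level aggregate: median assessed value per community.
    by_community = defaultdict(list)
    for r in unique:
        by_community[r["community"]].append(r["assessed_value"])
    community_median = {c: median(v) for c, v in by_community.items()}

    # Land-use frequency: share of records with each land-use code.
    land_use_counts = Counter(r["land_use"] for r in unique)

    return [
        {
            "roll_number": r["roll_number"],
            "community_median_value": community_median[r["community"]],
            "land_use_frequency": land_use_counts[r["land_use"]] / len(unique),
            # Log-transform the right-skewed target; invert with math.expm1.
            "log_value": math.log1p(r["assessed_value"]),
        }
        for r in unique
    ]


rows = prepare(records)
for row in rows:
    print(row)
```

When the aggregate is built from the target itself, as in this sketch, it would normally be computed on the training split only so that held-out values do not leak into the features. Predictions made on the log scale are mapped back to dollars with `math.expm1`.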