Project 06

Neighborhood livability segmentation

Choosing where to live in Calgary involves weighing safety, density, economic vitality, and housing mix across 200+ communities. This project applies unsupervised machine learning to group communities into distinct livability segments for residents, planners, and policymakers.

K-Means | PCA | Unsupervised learning | Calgary open data | 200+ communities
0.42 Silhouette score (K-Means)

Streamlit dashboard pages

01

Cluster map

Interactive map of Calgary with each community color-coded by its assigned livability segment

02

Cluster profiles

Detailed breakdown of each segment showing average crime, population, and business metrics

03

Community comparison

Side-by-side comparison of any two communities across all 10 livability features

04

Radar chart

Multi-dimensional radar visualization of community feature profiles and cluster centroids
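The cluster profile and community comparison pages can each be reduced to a small pandas operation. The sketch below is illustrative only: the column names (`segment`, `crime_rate`, `population`, `business_count`), the community labels, and every value are hypothetical placeholders, not the project's actual data or schema.

```python
import pandas as pd

# Hypothetical community-level table; names and values are illustrative only.
communities = pd.DataFrame(
    {
        "community": ["A", "B", "C", "D"],
        "segment": [0, 0, 1, 1],
        "crime_rate": [12.0, 8.0, 30.0, 26.0],
        "population": [5000, 7000, 12000, 9000],
        "business_count": [40, 60, 210, 150],
    }
).set_index("community")

feature_cols = ["crime_rate", "population", "business_count"]


def cluster_profiles(df: pd.DataFrame) -> pd.DataFrame:
    """Average each feature within each segment (one row per segment)."""
    return df.groupby("segment")[feature_cols].mean()


def compare_communities(df: pd.DataFrame, first: str, second: str) -> pd.DataFrame:
    """Return a feature-by-community table for two communities plus their difference."""
    side_by_side = df.loc[[first, second], feature_cols].T.copy()
    side_by_side["difference"] = side_by_side[first] - side_by_side[second]
    return side_by_side


profiles = cluster_profiles(communities)
comparison = compare_communities(communities, "A", "C")
```

The same per-segment averages could also serve as the centroid values drawn on the radar chart, provided each feature is first rescaled to a common range.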

Key results

0.42 Silhouette score (K-Means clustering)
~60% PCA variance explained (first two components)
200+ Communities segmented (four data sources integrated)
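For readers unfamiliar with the variance figure, here is a minimal sketch of how the share explained by the first two components is typically read off scikit-learn's `PCA`. The data is synthetic (random correlated features standing in for a 200 by 10 community matrix), and standardizing the features first is an assumption, so the resulting number will not match the ~60% reported here.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature matrix: 200 rows, 10 correlated features.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.5 * rng.normal(size=(200, 10))

# Assumption: features are standardized so no single scale dominates the components.
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=2)
coords = pca.fit_transform(X_scaled)  # 2-D coordinates, e.g. for a scatter plot

# Fraction of total variance captured by the first two components.
variance_first_two = pca.explained_variance_ratio_.sum()
print(f"Variance explained by two components: {variance_first_two:.1%}")
```

The rows of `pca.components_` give each feature's loading on a component, which is the usual starting point for interpreting what a component represents.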

Methodology

Integrated four Calgary Open Data sources: census, crime, business licences, and building permits. Built a community-level matrix of 10 features, including population, crime rate, and business diversity. Fitted K-Means for k = 2 to 10 alongside Agglomerative clustering and compared the candidates by silhouette score. Reduced dimensionality with PCA for visualization and component interpretation.
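The silhouette-based selection described above can be sketched as follows. This is a minimal example under stated assumptions, not the project's code: the data comes from `make_blobs` rather than Calgary sources, the features are standardized before clustering, and the same k range is applied to Agglomerative clustering for comparison. The helper `silhouette_by_k` is a hypothetical name introduced for illustration.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 10-feature community matrix.
X, _ = make_blobs(n_samples=200, n_features=10, centers=4, random_state=0)
X_scaled = StandardScaler().fit_transform(X)  # assumption: features are standardized


def silhouette_by_k(X, make_model, k_values):
    """Fit one model per k and return {k: silhouette score}."""
    scores = {}
    for k in k_values:
        labels = make_model(k).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return scores


k_values = range(2, 11)  # k = 2..10

kmeans_scores = silhouette_by_k(
    X_scaled,
    lambda k: KMeans(n_clusters=k, n_init=10, random_state=0),
    k_values,
)
agglo_scores = silhouette_by_k(
    X_scaled,
    lambda k: AgglomerativeClustering(n_clusters=k),
    k_values,
)

# Pick the k with the highest K-Means silhouette score.
best_k = max(kmeans_scores, key=kmeans_scores.get)
print(f"Best k for K-Means: {best_k} (silhouette {kmeans_scores[best_k]:.2f})")
```

The silhouette score ranges from -1 to 1, with higher values indicating tighter, better-separated clusters, so comparing `kmeans_scores` and `agglo_scores` at the same k is one simple way to choose between the two algorithms.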

01 Integrate four open data sources
02 Build 10-feature community matrix
03 K-Means with silhouette selection
04 PCA dimensionality reduction
05 Profile cluster characteristics
06 Deploy Streamlit dashboard
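Steps 01 and 02 amount to aggregating each source to one row per community and joining the results. The sketch below shows one plausible shape for that step in pandas. All table layouts, column names (`incident_count`, `licence_type`, `permit_id`), and values are hypothetical, and the three derived features shown are examples rather than the project's actual 10 features.

```python
import pandas as pd

# Hypothetical extracts of the four sources, keyed by community name.
census = pd.DataFrame(
    {"community": ["A", "B", "C"], "population": [5000, 7000, 12000]}
)
crime = pd.DataFrame(
    {
        "community": ["A", "A", "B", "C", "C", "C"],
        "incident_count": [3, 2, 4, 10, 5, 6],
    }
)
licences = pd.DataFrame(
    {
        "community": ["A", "B", "B", "C"],
        "licence_type": ["Retail", "Food", "Retail", "Food"],
    }
)
permits = pd.DataFrame({"community": ["A", "C", "C"], "permit_id": [1, 2, 3]})

# Aggregate each source to one value per community.
crime_total = crime.groupby("community")["incident_count"].sum().rename("crime_incidents")
business_types = licences.groupby("community")["licence_type"].nunique().rename("business_types")
permit_count = permits.groupby("community").size().rename("permit_count")

# Left-join onto the census table so every community keeps a row;
# communities missing from a source get 0 for that feature.
features = (
    census.set_index("community")
    .join(crime_total)
    .join(business_types)
    .join(permit_count)
    .fillna(0)
)

# Normalize crime by population so large communities are not penalized for size.
features["crime_per_1000"] = 1000 * features["crime_incidents"] / features["population"]
```

Joining from the census table outward keeps communities that have no recorded incidents, licences, or permits, which would otherwise be dropped by an inner join.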