Project 02

Community crime classifier

Calgary has over 200 communities, each with distinct crime patterns shaped by demographics and location. This project classifies communities by crime risk level using 77K+ crime records and census data, helping city planners and residents make informed decisions about safety and resource allocation.

Gradient Boosting, Classification, Census data, Calgary open data, 77K records
85% Accuracy (Gradient Boosting)

Streamlit dashboard pages

01

Overview

Dataset summary with crime categories, community counts, and risk level distributions

02

Crime map

Interactive map of Calgary communities color-coded by predicted risk classification

03

Trends

Temporal crime trends, category breakdowns, and per-capita rate analysis across communities

04

Classifier

Predict crime risk level for any community based on demographic and geographic features

Key results

85% Accuracy (Gradient Boosting classifier)
0.84 Weighted F1 score (across low/medium/high classes)
77K+ Crime records (with census demographics)

Methodology

Fetched crime statistics and civic census data from Calgary Open Data via the Socrata API. Aggregated crime counts at the community level with per-capita rates and category breakdowns. Created low, medium, and high risk labels using percentile-based thresholds. Trained and compared Logistic Regression, Decision Tree, Random Forest, and Gradient Boosting classifiers, evaluating each with accuracy and weighted F1 score.
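As a rough illustration of the aggregation and labeling steps, the sketch below sums crime counts per community, joins census population, computes a rate per 1,000 residents, and cuts that rate into three risk bands. The column names, the toy data, the function name `label_risk`, and the choice of the 33rd and 67th percentiles applied to the per-capita rate are all assumptions made for this example; the project's actual thresholds and schema are not stated here.

```python
import pandas as pd

# Toy stand-ins for the crime and census tables (column names are assumptions).
crimes = pd.DataFrame({
    "community": ["A", "A", "B", "C", "C", "D", "E", "F"],
    "category": ["Theft", "Assault", "Theft", "Theft", "Break-in",
                 "Theft", "Assault", "Theft"],
    "crime_count": [30, 10, 5, 60, 20, 12, 8, 90],
})
census = pd.DataFrame({
    "community": ["A", "B", "C", "D", "E", "F"],
    "population": [4000, 5000, 2000, 6000, 8000, 3000],
})


def label_risk(crimes, census, low_q=1 / 3, high_q=2 / 3):
    """Aggregate to community level and assign low/medium/high risk labels.

    low_q and high_q are hypothetical percentile cut points, not the
    project's documented thresholds.
    """
    # Total crime count per community across all categories.
    totals = crimes.groupby("community", as_index=False)["crime_count"].sum()

    # Attach population; drop communities with no residents to avoid dividing by zero.
    df = totals.merge(census, on="community", how="inner")
    df = df[df["population"] > 0].copy()
    df["rate_per_1000"] = df["crime_count"] / df["population"] * 1000

    # Percentile-based thresholds computed from the observed rates.
    # pd.cut raises an error if the two cut points coincide.
    low_cut, high_cut = df["rate_per_1000"].quantile([low_q, high_q])
    df["risk"] = pd.cut(
        df["rate_per_1000"],
        bins=[float("-inf"), low_cut, high_cut, float("inf")],
        labels=["low", "medium", "high"],
    )
    return df


labeled = label_risk(crimes, census)
print(labeled[["community", "rate_per_1000", "risk"]])
```

Using per-capita rates rather than raw counts keeps large communities from being labeled high risk purely because of their population.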

01 Fetch crime and census data
02 Aggregate per-capita crime rates
03 Create risk labels by percentile
04 Train four classifiers
05 Evaluate with accuracy and F1
06 Deploy Streamlit dashboard
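
Steps 04 and 05 could look something like the following sketch, which fits the four classifier families named above with scikit-learn and reports accuracy and weighted F1 for each. It runs on synthetic three-class data from `make_classification` as a stand-in for the community feature table, so its scores will not match the reported 85% accuracy or 0.84 weighted F1. The hyperparameters, the 75/25 split, and the scaling step for Logistic Regression are assumptions, not the project's confirmed settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for community features with three risk classes.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_redundant=0,
    n_classes=3, random_state=42,
)

# Stratify so each risk class keeps its share in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

models = {
    # Logistic Regression benefits from scaled inputs; tree models do not need it.
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, pred),
        # Weighted F1 averages per-class F1, weighted by class support.
        "weighted_f1": f1_score(y_test, pred, average="weighted"),
    }

# Print models from best to worst weighted F1.
for name, s in sorted(scores.items(), key=lambda kv: kv[1]["weighted_f1"], reverse=True):
    print(f"{name:<20} accuracy={s['accuracy']:.3f}  weighted_f1={s['weighted_f1']:.3f}")
```

With only a few hundred communities to learn from, a single held-out split can be noisy, so cross-validation (for example `cross_val_score`) may give a steadier comparison.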