Project 13
Combining time-series ridership forecasting with network graph analysis on 7,700+ Calgary transit stops to identify bottlenecks, predict demand, and optimize route planning
Streamlit application
Four pages covering transit network visualization, ridership demand forecasting, bottleneck identification, and route optimization recommendations.
Network
Transit network graph visualization with degree and betweenness centrality highlighting critical transfer points
Ridership forecast
Monthly ridership predictions using lag features, rolling means, and year-over-year change indicators
Bottlenecks
Stops with the highest betweenness centrality, where a disruption would cascade through the network
Optimizer
Route frequency and capacity recommendations based on forecast demand and network connectivity analysis
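The centrality ranking behind the Network and Bottlenecks pages can be illustrated with a minimal NetworkX sketch. The stop IDs and links below are a made-up toy network, not Calgary data, and treating the network as an undirected, unweighted graph is an assumption rather than the project's confirmed graph construction.

```python
import networkx as nx

# Hypothetical toy network: stop IDs and links are invented for illustration.
edges = [
    ("A", "B"), ("B", "C"), ("C", "D"),  # one line feeding into stop D
    ("D", "E"), ("E", "F"),              # a second line leaving stop D
    ("D", "G"),                          # a short branch off stop D
]

# Assumption: undirected, unweighted graph with one node per stop.
G = nx.Graph()
G.add_edges_from(edges)

# Degree centrality: share of other stops a stop is directly linked to.
degree = nx.degree_centrality(G)

# Betweenness centrality: share of shortest paths that pass through a stop.
betweenness = nx.betweenness_centrality(G, normalized=True)

# Rank stops by betweenness; the top entries are bottleneck candidates.
top_bottlenecks = sorted(betweenness, key=betweenness.get, reverse=True)[:3]

for stop in top_bottlenecks:
    print(stop, round(betweenness[stop], 3), round(degree[stop], 3))
```

In this toy graph every path between the three branches runs through stop D, so D ranks first. On a real network with thousands of stops, exact betweenness can be slow; the `k` argument of `betweenness_centrality` allows a sampled approximation.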
Key results
R-squared
0.80
XGBoost regressor on monthly ridership holdout set
MAPE
~8%
Mean absolute percentage error on ridership forecasts
Stops
7.7K
Transit stops analyzed for network centrality and connectivity
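For reference, the two forecast metrics above follow their standard definitions. The sketch below computes both in plain Python; the sample values are invented for illustration and are not the project's holdout data.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical monthly ridership values, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(f"R-squared: {r_squared(actual, predicted):.3f}")
print(f"MAPE: {mape(actual, predicted):.2f}%")
```

R-squared measures how much of the variance in ridership the model explains, while MAPE expresses the typical miss as a percentage of actual ridership, which makes it easier to compare across months with very different volumes.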
Methodology
Fetched monthly ridership and transit stop data from Calgary Open Data. Engineered lag features (1-, 3-, and 12-month), rolling means, and year-over-year change indicators. Compared Ridge Regression, Random Forest, and XGBoost for ridership forecasting. Built a transit network graph with NetworkX to compute degree and betweenness centrality, identifying bottleneck stops and underserved areas.
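The feature engineering step can be sketched with pandas as follows. The column names, the synthetic series, the 3-month rolling window, and the one-month shift before the rolling mean are all assumptions made for illustration; the project's actual window sizes and schema are not stated here.

```python
import pandas as pd

# Synthetic monthly series standing in for the ridership data (hypothetical values).
months = pd.date_range("2020-01-01", periods=24, freq="MS")
df = pd.DataFrame(
    {"ridership": [float(1000 + 10 * i) for i in range(24)]},
    index=months,
)

# Lag features: ridership 1, 3, and 12 months earlier.
for lag in (1, 3, 12):
    df[f"lag_{lag}"] = df["ridership"].shift(lag)

# Rolling mean over the previous 3 months (window size is an assumption).
# Shifting by one month first keeps the current month out of its own feature.
df["roll_mean_3"] = df["ridership"].shift(1).rolling(window=3).mean()

# Year-over-year change: relative difference from the same month last year.
df["yoy_change"] = df["ridership"] / df["lag_12"] - 1.0

# Drop the warm-up rows that lack a full 12 months of history.
features = df.dropna()
print(features.head())
```

Note that a year-over-year feature built from the current month's ridership is descriptive rather than predictive; if it is used as a model input, it would typically be computed from lagged values only so that the target does not leak into the features.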