Hockey Data Centre
Advanced hockey analytics for casual fans or supernerds. No paywalls.
xG values are generated by a custom XGBoost model (with isotonic regression calibration) trained on ~1.5 million shots from 2015 to 2025. It uses 40+ numeric, categorical and ordinal features, including shot location, distance, angle, game state, player strength state and shift data. On unseen shots, the model achieves close to open-source SOTA performance (MSE/Brier: 0.04287, LogLoss: 0.15970, AUC: 0.83716), but training is still a WIP and the shot feature set is expanding. Alternative architectures, such as simple MLPs, are next. Other models (win likelihood, entertainment value, defensive prowess) are in development. Data is refreshed regularly from the NHL API. YT videos with data stories are also in development.
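For readers curious what the three reported metrics measure, here is a minimal sketch in plain Python. It is not the site's evaluation code: the function names and the toy shots at the bottom are made up for illustration. For a 0/1 goal outcome, MSE and the Brier score are the same quantity, which is why they are reported together.

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted xG and the outcome (1 = goal, 0 = no goal)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def log_loss(y_true, y_prob, eps=1e-15):
    """Average negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def roc_auc(y_true, y_prob):
    """Chance that a random goal gets a higher xG than a random non-goal (ties count half).

    Pairwise O(n^2) version for clarity; fine for a toy example, too slow for millions of shots.
    """
    goals = [p for y, p in zip(y_true, y_prob) if y == 1]
    misses = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not goals or not misses:
        raise ValueError("AUC needs at least one goal and one non-goal")
    wins = 0.0
    for g in goals:
        for m in misses:
            if g > m:
                wins += 1.0
            elif g == m:
                wins += 0.5
    return wins / (len(goals) * len(misses))


if __name__ == "__main__":
    # Hypothetical shots: outcome (1 = goal) and the model's predicted xG.
    outcomes = [0, 0, 1, 0, 1]
    xg = [0.05, 0.10, 0.30, 0.02, 0.60]
    print("Brier:", brier_score(outcomes, xg))
    print("LogLoss:", log_loss(outcomes, xg))
    print("AUC:", roc_auc(outcomes, xg))
```

Lower is better for Brier and LogLoss; higher is better for AUC, where 0.5 means the ranking is no better than chance.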