Model performance

Walk-forward out-of-sample backtest: each match is predicted before its result is known, then the ratings are updated with that result. The target is binary: a head-to-head home win or not.
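The predict-then-update order can be sketched as below. This is a minimal illustration, not the model behind these numbers: the Elo-style update, the 1500 starting rating, and the `k` and `home_adv` parameters are all assumptions, and `walk_forward` is a hypothetical name.

```python
from collections import defaultdict


def walk_forward(matches, k=20.0, home_adv=50.0):
    """Score matches in chronological order, predicting before updating.

    matches: iterable of (home, away, home_won) tuples, oldest first.
    Returns a list of (p_home, home_won) pairs.
    The Elo-style rating rule here is an assumed stand-in.
    """
    ratings = defaultdict(lambda: 1500.0)  # assumed starting rating
    scored = []
    for home, away, home_won in matches:
        # 1. Predict using only ratings built from earlier matches.
        diff = ratings[home] + home_adv - ratings[away]
        p_home = 1.0 / (1.0 + 10.0 ** (-diff / 400.0))
        scored.append((p_home, int(home_won)))

        # 2. Only now let the result move the ratings.
        delta = k * (int(home_won) - p_home)
        ratings[home] += delta
        ratings[away] -= delta
    return scored
```

Because every prediction is recorded before its own result touches the ratings, each scored match is out-of-sample with respect to the state that produced it.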

Matches scored: 5095
Accuracy: 69.8%
Brier: 0.205857
Baseline Brier: 0.25

Brier 0.205857 vs 0.25 for a constant 0.5 (coin-flip) forecast: a 17.7% reduction in Brier score. Log loss 0.600864.
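For reference, these metrics can be computed as in the sketch below, using the standard definitions of Brier score and binary log loss. The function names are hypothetical, and the clipping constant `eps` is an assumption added to avoid taking the log of zero.

```python
import math


def brier(probs, outcomes):
    """Mean squared error between forecast probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood; probabilities clipped to [eps, 1 - eps]."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def brier_reduction(model_brier, baseline_brier=0.25):
    """Relative reduction in Brier score against a baseline."""
    return 1.0 - model_brier / baseline_brier


# Reported figures: 1 - 0.205857 / 0.25 = 0.176572, i.e. 17.7%.
print(round(brier_reduction(0.205857) * 100, 1))
```

A constant 0.5 forecast scores (0.5)^2 = 0.25 on every match regardless of outcome, which is where the baseline comes from. The same forecast has a log loss of ln 2, about 0.693.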

Reliability buckets

P(home)    Count    Avg prediction    Actual rate
0.2-0.3       48             26.6%           0.0%
0.3-0.4      278             36.2%          14.0%
0.4-0.5      949             45.6%          30.9%
0.5-0.6     1850             55.2%          53.2%
0.6-0.7     1440             64.5%          79.2%
0.7-0.8      492             73.7%          91.9%
0.8-0.9       38             82.0%          97.4%
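A reliability table of this shape can be built by grouping predictions into equal-width probability bins, as in the sketch below. The function name, the ten-bin default, and the handling of a prediction that falls exactly on a bin edge are assumptions, not details taken from the backtest.

```python
def reliability_buckets(probs, outcomes, n_bins=10):
    """Group forecasts into equal-width bins over [0, 1].

    Returns rows of (low, high, count, avg_prediction, actual_rate),
    one per non-empty bin, in ascending order.
    """
    sums = {}
    for p, y in zip(probs, outcomes):
        # p == 1.0 is folded into the top bin.
        idx = min(int(p * n_bins), n_bins - 1)
        count, p_sum, wins = sums.get(idx, (0, 0.0, 0))
        sums[idx] = (count + 1, p_sum + p, wins + y)

    rows = []
    for idx in sorted(sums):
        count, p_sum, wins = sums[idx]
        rows.append((idx / n_bins, (idx + 1) / n_bins,
                     count, p_sum / count, wins / count))
    return rows


# Toy usage with made-up forecasts, not backtest data.
for low, high, count, avg_pred, actual in reliability_buckets(
        [0.25, 0.27, 0.65], [0, 1, 1]):
    print(f"{low:.1f}-{high:.1f}  {count}  {avg_pred:.1%}  {actual:.1%}")
```

For a well-calibrated model the last two columns would roughly agree within each bucket. In the table above, the actual rate is below the average prediction in every bucket under 0.6 and above it in every bucket from 0.6 up.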