Model Documentation
A full technical breakdown of the Glicko-2 + Elo + XGBoost pipeline, VOPO calculation, and why Pinnacle is the right benchmark.
TennisGlicko runs four parallel rating tracks for every ATP and WTA player: three surface-specific Glicko-2 models (hard, clay, grass) and one general Elo model. Ratings are recalculated after every completed match.
Glicko-2 extends the classic Elo formula by tracking two extra quantities per player: rating deviation (RD), which measures how uncertain the rating is, and volatility, which measures how much the rating is expected to fluctuate. A player returning from injury carries a high RD, so predictions involving them are less confident. As they play more matches, RD shrinks and the rating stabilizes.
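To make the effect of RD concrete, here is a minimal sketch of the expected-outcome formula from Glickman's Glicko description, on the familiar 1500-centered scale. Whether TennisGlicko uses exactly this form is an assumption, and the ratings and RD values below are invented for illustration.

```python
import math

Q = math.log(10) / 400  # Glicko scale constant


def g(rd: float) -> float:
    """Attenuation factor: approaches 1 for small RD, shrinks as RD grows."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def win_probability(r_a: float, rd_a: float, r_b: float, rd_b: float) -> float:
    """Probability that player A beats player B, given both ratings and RDs."""
    combined_rd = math.sqrt(rd_a**2 + rd_b**2)
    return 1.0 / (1.0 + 10 ** (-g(combined_rd) * (r_a - r_b) / 400))


# Same 200-point rating gap, different uncertainty about player A (made-up numbers).
settled = win_probability(1900, 50, 1700, 50)
uncertain = win_probability(1900, 250, 1700, 50)
print(f"settled: {settled:.3f}, uncertain: {uncertain:.3f}")
```

With the higher RD, the same rating gap produces a probability closer to 50%, which is how uncertainty about a returning player shows up in the prediction.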
Surface separation is critical: Rafael Nadal's clay rating and hard-court rating are maintained as entirely independent values. A loss on hard courts does not affect the clay track. This captures the real performance differential that aggregate rankings ignore.
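One way to picture the independent tracks is a per-player record with one rating per surface plus the general Elo value. The sketch below assumes a match changes only the rating for the surface it was played on and the general track. The class, function, starting value, and deltas are all hypothetical, and the rating changes themselves would come from the Glicko-2 and Elo models, which are not shown.

```python
from dataclasses import dataclass, field

SURFACES = ("hard", "clay", "grass")


@dataclass
class PlayerTracks:
    # 1500 is a placeholder starting value, not a documented TennisGlicko default.
    surface: dict = field(default_factory=lambda: {s: 1500.0 for s in SURFACES})
    elo: float = 1500.0


def record_match(tracks: PlayerTracks, surface: str,
                 surface_delta: float, elo_delta: float) -> None:
    """Apply precomputed rating changes to the played surface and the general track."""
    if surface not in tracks.surface:
        raise ValueError(f"unknown surface: {surface}")
    tracks.surface[surface] += surface_delta
    tracks.elo += elo_delta


player = PlayerTracks()
record_match(player, "hard", surface_delta=-12.0, elo_delta=-8.0)  # a hard-court loss
print(player.surface, player.elo)
```

After the hard-court loss, the clay and grass entries are unchanged.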
The training set covers 440,000+ historical ATP and WTA matches. Both circuits are modeled separately to avoid cross-circuit rating contamination.
Raw Glicko-2 and Elo win probabilities are fed as features into an XGBoost classifier alongside additional signals: H2H record on surface, days since last match (fatigue/rust proxy), round of tournament, and rank differential.
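As an illustration of what one input row might look like, the sketch below assembles the listed signals into a flat record. The field names, the round encoding, and the choice to express H2H as a win/loss difference are assumptions, not the documented feature schema. For brevity it tracks days since the last match for one player only.

```python
from dataclasses import dataclass, asdict
from datetime import date

# Hypothetical ordinal encoding of tournament rounds.
ROUND_INDEX = {"R128": 0, "R64": 1, "R32": 2, "R16": 3, "QF": 4, "SF": 5, "F": 6}


@dataclass
class MatchFeatures:
    p_glicko_surface: float     # surface-specific Glicko-2 win probability
    p_elo: float                # general Elo win probability
    h2h_surface_diff: int       # H2H wins minus losses on this surface
    days_since_last_match: int  # fatigue/rust proxy
    round_index: int
    rank_diff: int              # player rank minus opponent rank


def build_features(p_glicko_surface: float, p_elo: float, h2h_wins: int,
                   h2h_losses: int, last_match: date, match_day: date,
                   round_name: str, rank_a: int, rank_b: int) -> MatchFeatures:
    return MatchFeatures(
        p_glicko_surface=p_glicko_surface,
        p_elo=p_elo,
        h2h_surface_diff=h2h_wins - h2h_losses,
        days_since_last_match=(match_day - last_match).days,
        round_index=ROUND_INDEX[round_name],
        rank_diff=rank_a - rank_b,
    )


# Made-up example values.
row = asdict(build_features(0.64, 0.61, 3, 1, date(2024, 5, 1),
                            date(2024, 5, 20), "QF", 12, 30))
print(row)
```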
XGBoost outputs a calibrated win probability. The final internal estimate is the average of the Glicko-2 surface probability and the XGBoost output — blending the structural rating data with the calibrated probabilistic model.
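The blend described above is a plain arithmetic mean of two probabilities. A minimal sketch, with a hypothetical function name and made-up inputs:

```python
def blend(p_glicko_surface: float, p_xgb: float) -> float:
    """Equal-weight average of the surface Glicko-2 probability and the XGBoost output."""
    for p in (p_glicko_surface, p_xgb):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return (p_glicko_surface + p_xgb) / 2.0


final_estimate = blend(0.62, 0.58)
print(round(final_estimate, 4))
```

Because both inputs lie in [0, 1], the average does too, so no further normalization is needed.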
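Model performance is measured by Brier Score, the mean squared difference between predicted probability and actual outcome. It is a proper scoring rule where lower is better: 0 is perfect, and 0.25 is what a forecaster scores by always predicting 50%. The current Brier Score is 0.178 over the full historical dataset.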
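For reference, the Brier Score for binary outcomes is the mean of squared differences between forecast and result. The sketch below uses toy data, not TennisGlicko results, to show the two reference points: a constant 50% forecast scores 0.25 and a perfect forecast scores 0.

```python
def brier_score(probs: list[float], outcomes: list[int]) -> float:
    """Mean squared error between predicted win probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and the same length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


coin_flip = brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
perfect = brier_score([1.0, 0.0], [1, 0])
print(coin_flip, perfect)
```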
Pinnacle's no-vig implied probability strips the bookmaker margin from both sides of the market and re-normalizes to 100%. This gives the closest available market estimate of true win probability.
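A minimal sketch of that calculation for a two-way market, assuming decimal odds and proportional normalization (one of several ways to remove the margin, and the one that matches the "re-normalizes to 100%" description). The odds are invented.

```python
def no_vig_probabilities(odds_a: float, odds_b: float) -> tuple[float, float]:
    """Convert two-way decimal odds to implied probabilities that sum to 1."""
    if odds_a <= 1.0 or odds_b <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    raw_a, raw_b = 1.0 / odds_a, 1.0 / odds_b
    overround = raw_a + raw_b  # exceeds 1.0 by the bookmaker margin
    return raw_a / overround, raw_b / overround


p_a, p_b = no_vig_probabilities(1.80, 2.10)
print(f"{p_a:.4f} {p_b:.4f}")
```

Here the raw implied probabilities sum to about 103.2%, and dividing each by that total removes the margin.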
Why Pinnacle? Pinnacle accepts sharp bettors and does not limit winning accounts. Their lines are continuously corrected by professional money, making them the hardest market to beat and the most reliable benchmark. A model that consistently finds positive VOPO against Pinnacle has a demonstrable edge; the industry-standard test is beating their closing line.
VOPO updates live as Pinnacle moves their lines. The model probability is static per match (recalculated when new ratings are published), while the market probability shifts with betting volume.
Green EV fires when two conditions are simultaneously true:
The 12% threshold was calibrated empirically on the historical dataset to balance signal frequency against false-positive rate. Matches below the threshold may still carry positive EV — Green EV is a high-conviction filter, not a comprehensive EV ranking.
TennisGlicko does not incorporate breaking news — injury withdrawals, walkovers, or late lineup changes are not reflected until after they affect rating data. Pinnacle's market typically reacts to this information faster than the model.
The model does not account for player motivation, match-fixing risk, or tournament-specific strategic play (e.g., protecting a ranking position). These factors exist but are not quantifiable from available data.
VOPO is not a guarantee of profit. Even a positive-EV bet loses more often than it wins whenever its win probability is below 50%. Bankroll management and a large sample size are required to realize expected value.
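The arithmetic behind this warning can be sketched with the standard expected-value formula for a unit stake at decimal odds. The probability and odds below are made up, and this is not the VOPO calculation itself.

```python
def expected_value(p_win: float, decimal_odds: float) -> float:
    """Expected profit per unit staked: win (odds - 1) with p_win, lose 1 otherwise."""
    return p_win * decimal_odds - 1.0


p_win = 0.40
ev = expected_value(p_win, 3.00)
loss_rate = 1.0 - p_win
print(f"EV per unit: {ev:+.2f}, loses {loss_rate:.0%} of the time")
```

An underdog bet like this one has positive expected value yet still loses six times in ten, which is why the edge only shows up over many bets.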