Athlemetrics · Rating Analysis

Rating Model · Evaluation

Regression model evaluation and interpretability: generalization, error analysis, feature importance, and role-level comparisons (charts from assets/img/base_score/).

R² ≈ 0.97 · Overall fit & generalization
MAE ≈ 0.014 · Mean absolute error
RMSE ≈ 0.022 · Root mean squared error

Overall Performance

Train vs. Test Metrics

Train vs Test
Low MAE/RMSE with a small train–test gap

Indicates strong generalization with low overfitting risk.
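The headline metrics above (MAE, RMSE, R²) can be reproduced with a few lines of NumPy; the arrays below are hypothetical stand-ins, not the model's actual predictions:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R² for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2

# Hypothetical test-set scores with small errors, roughly the regime reported above.
y_test = [0.62, 0.71, 0.55, 0.80, 0.67]
y_hat  = [0.63, 0.70, 0.56, 0.78, 0.68]
mae, rmse, r2 = regression_metrics(y_test, y_hat)
```

Computing the same three numbers on the train split and comparing them to the test split gives the train–test gap the chart summarizes.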

Cross-Validation Stability

GroupKFold R² scores
R² remains stable across grouped splits

Robust to season/team grouping, which better approximates real-world deployment.
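GroupKFold's defining rule is that every row from one group (here, a season) lands on the same side of each split. A minimal sketch of that rule with hypothetical season labels; in practice scikit-learn's GroupKFold provides the production version:

```python
import numpy as np

def group_kfold_indices(groups, n_splits):
    """Yield (train_idx, test_idx) pairs where no group appears on both sides."""
    groups = np.asarray(groups)
    unique = np.unique(groups)                 # sorted unique group labels
    folds = np.array_split(unique, n_splits)   # one held-out bucket per fold
    for held_out in folds:
        test_mask = np.isin(groups, held_out)
        yield np.where(~test_mask)[0], np.where(test_mask)[0]

# Hypothetical season labels as the grouping key, two rows per season.
seasons = ["2019", "2019", "2020", "2020", "2021", "2021", "2022", "2022"]
splits = list(group_kfold_indices(seasons, n_splits=4))
```

Scoring the model on each of these splits and inspecting the spread of the per-fold R² values is what the stability chart shows.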

Seasonal Trend Fit

Season score trend
Tracks historical season-level shifts

Useful for cross-season comparison, anomaly spotting, and storytelling.
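The season trend line is just a season-level aggregation of scores. A sketch with hypothetical (season, score) records:

```python
import numpy as np

# Hypothetical (season, score) pairs; the real data would come from the rating table.
records = [("2020", 0.61), ("2020", 0.65), ("2021", 0.68), ("2021", 0.70)]

by_season = {}
for season, score in records:
    by_season.setdefault(season, []).append(score)

# Per-season mean score — the points on the trend chart.
season_means = {s: float(np.mean(v)) for s, v in by_season.items()}
```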

Accuracy & Error Analysis

Predicted vs. Actual

Actual vs predicted
Tight clustering along the diagonal

High accuracy with no obvious systematic bias across roles/positions.

Residual Distribution

Residual histogram
Centered, near-symmetric residuals

Residuals are centered near zero, suggesting low statistical bias.
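"Centered near zero" can be made concrete by comparing the mean residual to the residual spread; the residual values here are hypothetical:

```python
import numpy as np

# Hypothetical residuals (actual − predicted); a centered distribution
# has a mean that is tiny relative to its standard deviation.
residuals = np.array([-0.02, -0.01, 0.0, 0.0, 0.01, 0.02])
mean_res = float(residuals.mean())
std_res = float(residuals.std())
is_centered = abs(mean_res) < 0.1 * std_res
```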

Residuals by Role

Residuals by role
Median error is consistent across roles

Operationally fair: no role is systematically favored or penalized.
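The fairness check behind the chart is a per-role median of residuals. A sketch with hypothetical roles and scores:

```python
import numpy as np

# Hypothetical rows: (role, actual, predicted).
rows = [
    ("guard",   0.70, 0.71), ("guard",   0.64, 0.63),
    ("forward", 0.58, 0.60), ("forward", 0.75, 0.74),
    ("center",  0.66, 0.66), ("center",  0.80, 0.78),
]

by_role = {}
for role, actual, pred in rows:
    by_role.setdefault(role, []).append(actual - pred)

# Median residual per role — all near zero means no role is
# systematically over- or under-scored.
median_residual = {role: float(np.median(r)) for role, r in by_role.items()}
```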

Feature & Role Insights

Score Distributions

Score distribution by role
Score distributions differ by role

Supports both within-role comparison and cross-role interpretation.

Feature Importance

Feature importances
Interpretable signals for product UX

Combine with show_features to explain why a score is assigned.
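show_features is referenced above without a definition; the sketch below is one plausible shape for it, where the name, signature, and feature list are assumptions, and the importances would in practice come from the fitted model (e.g. a tree ensemble's feature_importances_):

```python
# Hypothetical sketch of a show_features-style helper (name and
# signature assumed, not taken from the actual codebase).
def show_features(names, importances, top_k=3):
    """Return the top-k (name, importance) pairs, highest first."""
    ranked = sorted(zip(names, importances), key=lambda p: p[1], reverse=True)
    return ranked[:top_k]

# Hypothetical feature names and importance weights.
names = ["minutes", "usage_rate", "efficiency", "turnovers"]
weights = [0.15, 0.40, 0.35, 0.10]
top = show_features(names, weights)
```

Surfacing the top pairs next to a score is one way to explain why that score was assigned.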

Score Bounds

Score bounds by role
Expected range (5th–95th percentile)

Useful for outlier detection and communicating floor/ceiling expectations.
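The 5th–95th percentile bounds per role reduce to a percentile computation over each role's score distribution; the scores below are hypothetical:

```python
import numpy as np

# Hypothetical per-role score samples; bounds = 5th–95th percentile, as charted.
scores = {"guard": [0.52, 0.60, 0.63, 0.70, 0.74, 0.81]}

bounds = {
    role: (float(np.percentile(v, 5)), float(np.percentile(v, 95)))
    for role, v in scores.items()
}
lo, hi = bounds["guard"]  # new scores outside (lo, hi) are outlier candidates
```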
