IRON LAW: Elo Assumes Each Matchup Is Independent and Stationary
Rating changes are based on surprise: beating a higher-rated opponent
gains more points than beating a lower-rated one. K-factor controls
update speed: high K (32) = volatile, fast adaptation. Low K (16) =
stable, slow adaptation. Choose K based on how quickly skill changes.

```json
{
  "ratings": [{"id": "player_A", "rating": 1720, "matches": 50, "wins": 35, "losses": 15}],
  "metadata": {"k_factor": 32, "initial_rating": 1500, "total_matches": 500}
}
```

| Input | Expected | Why |
|---|---|---|
| 1500 beats 2000 | Large rating gain (~30 pts at K=32) | Huge upset, large surprise |
| 2000 beats 1500 | Small rating gain (~2 pts at K=32) | Expected outcome, minimal surprise |
| Draw between equals | No change | Expected outcome exactly matches actual |
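
The three cases in the table can be checked with a minimal sketch of the update rule. It assumes the conventional logistic expected-score curve on a 400-point scale, which this document does not state explicitly. The function names `expected_score` and `rating_delta` are illustrative and are not taken from `scripts/elo.py`.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B (assumed: logistic curve, 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def rating_delta(rating_a: float, rating_b: float, score_a: float, k: float = 32.0) -> float:
    """Rating change for A. score_a is 1.0 for a win, 0.5 for a draw, 0.0 for a loss."""
    # The change is K times the "surprise": actual score minus expected score.
    return k * (score_a - expected_score(rating_a, rating_b))


if __name__ == "__main__":
    print(round(rating_delta(1500, 2000, 1.0), 1))  # upset win        -> 30.3
    print(round(rating_delta(2000, 1500, 1.0), 1))  # expected win     -> 1.7
    print(round(rating_delta(1500, 1500, 0.5), 1))  # draw, equals     -> 0.0
```

Under that assumption, halving K to 16 halves each of these changes, which is the volatility trade-off described earlier.
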

| Script | Description | Usage |
|---|---|---|
| `scripts/elo.py` | Update Elo ratings (single match or batch) with zero-sum verification | `python scripts/elo.py --verify` |
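
As a rough illustration of what a batch update with a zero-sum check might look like, the sketch below applies a list of results and confirms that the total rating pool is unchanged. It is not the contents of `scripts/elo.py`. The names `apply_match` and `apply_batch`, the `(player_a, player_b, score_a)` tuple format, and the tolerance are all assumptions made for this example, as is the 400-point logistic curve.

```python
from typing import Dict, Iterable, Tuple

Match = Tuple[str, str, float]  # (player_a, player_b, score_a); hypothetical format


def apply_match(ratings: Dict[str, float], a: str, b: str, score_a: float, k: float = 32.0) -> None:
    """Update both players in place; whatever A gains, B loses."""
    expected_a = 1.0 / (1.0 + 10.0 ** ((ratings[b] - ratings[a]) / 400.0))
    delta = k * (score_a - expected_a)
    ratings[a] += delta
    ratings[b] -= delta


def apply_batch(
    ratings: Dict[str, float],
    matches: Iterable[Match],
    k: float = 32.0,
    initial: float = 1500.0,
    tol: float = 1e-6,
) -> Dict[str, float]:
    """Apply matches in order, then verify the rating pool total did not drift."""
    matches = list(matches)
    # Register unseen players first so the "before" total covers everyone.
    for a, b, _ in matches:
        ratings.setdefault(a, initial)
        ratings.setdefault(b, initial)
    total_before = sum(ratings.values())
    for a, b, score_a in matches:
        apply_match(ratings, a, b, score_a, k)
    total_after = sum(ratings.values())
    if abs(total_after - total_before) > tol:
        raise ValueError(f"zero-sum check failed: {total_before} -> {total_after}")
    return ratings


if __name__ == "__main__":
    results = [("alice", "bob", 1.0), ("bob", "carol", 0.5), ("carol", "alice", 0.0)]
    print(apply_batch({}, results))
```

Matches are applied in the order given, so a different ordering of the same results can produce different final ratings. The check only holds when both players use the same K, so a variable-K scheme would need a different invariant.
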

- `references/bradley-terry.md`
- `references/variable-k.md`