NCM plays 20,000 games between each Stockfish dev build and Stockfish 14. The results yield an approximate Elo difference and establish confidence in the strength of the dev builds.
Commit ID | 1fee23a598a54739571e01cfe6aef1c3488b0e74 |
---|---|
Author | Joona Kiiski |
Date | 2014-01-07 04:41:16 UTC |
Tweak King PST tables
First tested with 50K games at very short TC of 5+0.05
ELO: 3.11 +-2.0 (95%) LOS: 99.9%
Total: 49665 W: 10941 L: 10497 D: 28227
Then retested with usual SPRT at short TC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 16875 W: 3198 L: 3049 D: 10628
And at long TC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 5890 W: 985 L: 857 D: 4048
bench: 7800379
|
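The figures quoted in the commit message can be roughly reproduced from the raw win/loss/draw counts. The sketch below is a minimal illustration, not the testing framework's own code. It assumes the standard logistic Elo model, a normal approximation for the 95% error margin, the common `erf`-based formula for LOS, and SPRT error rates of alpha = beta = 0.05 (an assumption that is consistent with the ±2.94 bounds shown above). All function names are hypothetical.

```python
import math


def elo_from_score(score: float) -> float:
    """Convert a mean score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins: int, losses: int, draws: int, z: float = 1.96):
    """Return (elo, margin); margin is the approximate 95% half-width in Elo."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + losses * (0.0 - score) ** 2
        + draws * (0.5 - score) ** 2
    ) / games
    std_err = math.sqrt(variance / games)
    upper = elo_from_score(score + z * std_err)
    lower = elo_from_score(score - z * std_err)
    return elo_from_score(score), (upper - lower) / 2.0


def likelihood_of_superiority(wins: int, losses: int) -> float:
    """Probability that the new build is stronger; draws carry no information here."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))


def sprt_bounds(alpha: float = 0.05, beta: float = 0.05):
    """Wald's log-likelihood-ratio stopping bounds (alpha and beta are assumed values)."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)


if __name__ == "__main__":
    # Counts from the 50K-game run at 5+0.05 quoted in the commit message.
    elo, margin = elo_with_margin(10941, 10497, 28227)
    los = likelihood_of_superiority(10941, 10497)
    low, high = sprt_bounds()
    print(f"ELO: {elo:.2f} +-{margin:.1f} (95%) LOS: {los:.1%}")
    print(f"SPRT bounds: ({low:.2f},{high:.2f})")
```

Under these assumptions the first run's counts give values close to the reported Elo, margin, and LOS. The LLR values themselves (2.96 and 2.95) are not recomputed here, because they depend on the specific likelihood model the test framework used, which the commit message does not state.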