NCM plays each Stockfish dev build 20,000 times against Stockfish 15. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
Host | Duration | Avg Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo |
---|
ID | Host | Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo | CLI | PGN |
---|
Commit ID | 7ff865b924a5027ab3ecb5a6525ad165262fe3c2 |
---|---|
Author | Gary Linscott |
Date | 2014-06-26 19:21:06 UTC |
Merge pull request #9 from glinscott/pawnspan
Scale down endgames with pawns on one or two adjacent files
Passed STC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 16081 W: 2745 L: 2604 D: 10732
Passed LTC
LLR: 2.95 (-2.94,2.94) [1.00,6.00]
Total: 123832 W: 17292 L: 16584 D: 89956
128k games to measure ELO at 15+0.05:
ELO: 2.07 +-1.1 (95%) LOS: 100.0%
Total: 128000 W: 21632 L: 20869 D: 85499
New bench: 8028792
|