NCM plays each Stockfish dev build 20,000 times against Stockfish 15. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
Host | Duration | Avg Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo |
---|
ID | Host | Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo | CLI | PGN |
---|
Commit ID | c45a4e0b48d7f5c570637d63592a07e185476334 |
---|---|
Author | Marco Costalba |
Date | 2012-11-11 10:20:39 UTC |
Revert "Don't prune discovered checks"
Both Lucas re-test and Jean-Francois confirrm it
is a regression.
Here Jean-Francois's results after 3600 games :
Score of 96d3b1c92b8 vs 3b87314: 690 - 729 - 2181 [0.495] 3600
ELO: -3.86 +- 99%: 14.94 95%: 11.35
LOS: 15.03%
Wins: 690 Losses: 729 Draws: 2181 Total: 3600
Bench: 5404066
|