NCM plays each Stockfish dev build 20,000 times against Stockfish 15. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
Host | Duration | Avg Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo |
---|
ID | Host | Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo | CLI | PGN |
---|
Commit ID | 67f88f5e3e4ee6e4b49151173c2898253178506f |
---|---|
Author | Joerg Oster |
Date | 2014-02-15 07:20:33 UTC |
Return static eval when reaching MAX_PLY
Makes more sense than returning a draw score. Tested
with reduced MAX_PLY = 30 and passed both short TC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 17434 W: 3345 L: 3194 D: 10895
And long TC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 2610 W: 488 L: 373 D: 1749
With current limit of MAX_PLY = 100 the patch should not
introduce any measurable change, nevertheless is the correct
approach.
Idea of returning eval is from Michel Van den Bergh.
bench: 8430785
|