NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
Host | Duration | Avg Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo |
---|
ID | Host | Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo | CLI | PGN |
---|
Commit ID | b6482472a03833287dc21bdaa783f156978ac63e |
---|---|
Author | Guenther Demetz |
Date | 2019-12-10 07:07:34 UTC |
Refine improving-logic
Don't rely on the assumption that we are improving after surviving a
check. Instead, compare with the static eval of 2 moves before.
STC
https://tests.stockfishchess.org/tests/view/5dedfd7f3cff9a249bb9e44d
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 38859 W: 8621 L: 8397 D: 21841
LTC
https://tests.stockfishchess.org/tests/view/5dee1b5a3cff9a249bb9e465
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 51130 W: 8308 L: 7996 D: 34826
Bench: 5371271
|