NCM plays 20,000 games between each Stockfish dev build and Stockfish 14. The results yield an approximate Elo difference and establish confidence in the strength of the dev builds.
Host | Duration | Avg Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo |
---|---|---|---|---|---|---|---|
ID | Host | Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo | CLI | PGN |
---|---|---|---|---|---|---|---|---|---|
Commit ID | 1e2f0511033945d07e1c8856980ed72cdbe42822 |
---|---|
Author | Linmiao Xu |
Date | 2024-07-23 17:24:00 UTC |
Replace simple eval with psqt in re-eval condition

As a result, re-eval depends only on smallnet outputs
so an extra call to simple eval can be removed.

Passed non-regression STC:
https://tests.stockfishchess.org/tests/view/669743054ff211be9d4ec232
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 214912 W: 55801 L: 55777 D: 103334
Ptnml(0-2): 746, 24597, 56760, 24593, 760

https://github.com/official-stockfish/Stockfish/pull/5501

Bench: 1440277
|
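The "Standard Elo" and "Gamepair Elo" columns can be related to the W/L/D and Ptnml(0-2) counts with a short calculation. The following is a minimal sketch, not NCM's actual code. It assumes the usual logistic Elo model, and it assumes that "Gamepair Elo" treats each two-game pair as a single result: a pair scoring more than 1 point counts as a win, exactly 1 point as a draw, and less than 1 point as a loss. The function names are hypothetical; the input numbers are the STC totals quoted in the commit message above.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by an average score in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def standard_elo(wins: int, losses: int, draws: int) -> float:
    """Elo estimate from per-game win/loss/draw counts."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return elo_from_score(score)


def gamepair_elo(ptnml: list[int]) -> float:
    """Elo estimate treating each game pair as one result (assumed definition).

    ptnml[i] is the number of pairs that scored i/2 points:
    index 0 -> 0, 1 -> 0.5, 2 -> 1, 3 -> 1.5, 4 -> 2.
    """
    pair_losses = ptnml[0] + ptnml[1]
    pair_draws = ptnml[2]
    pair_wins = ptnml[3] + ptnml[4]
    pairs = pair_losses + pair_draws + pair_wins
    score = (pair_wins + 0.5 * pair_draws) / pairs
    return elo_from_score(score)


if __name__ == "__main__":
    # STC totals from the commit message.
    print(f"Standard Elo: {standard_elo(55801, 55777, 103334):+.3f}")
    print(f"Gamepair Elo: {gamepair_elo([746, 24597, 56760, 24593, 760]):+.3f}")
```

For these inputs both estimates come out well under 0.1 Elo, which is what one would expect from a change that passed a non-regression test. Note that the five pentanomial counts sum to half the total game count, since each entry represents a pair of games.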