NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
Host | Duration | Avg Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo |
---|---|---|---|---|---|---|---|

ID | Host | Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo | CLI | PGN |
---|---|---|---|---|---|---|---|---|---|
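As a rough illustration of how a win/loss/draw (WLD) record can be turned into an approximate Elo difference, the sketch below uses the standard logistic Elo formula plus a normal-approximation error margin. It is not NCM's code: the function names and the sample counts are hypothetical, and treating every game as independent is a simplifying assumption that ignores the game-pair structure reflected in the Ptnml column.

```python
import math


def _elo_from_score(score: float) -> float:
    """Logistic Elo difference for an expected score strictly between 0 and 1."""
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Approximate Elo difference implied by a win/loss/draw record."""
    games = wins + losses + draws
    if games == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / games
    return _elo_from_score(score)


def elo_error_95(wins: int, losses: int, draws: int) -> float:
    """Half-width of an approximate 95% interval, in Elo.

    Assumes independent games (a simplification) and a normal
    approximation for the mean score.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    std_err = math.sqrt(variance / games)
    low = _elo_from_score(score - 1.96 * std_err)
    high = _elo_from_score(score + 1.96 * std_err)
    return (high - low) / 2.0


if __name__ == "__main__":
    # Hypothetical 20,000-game record, not taken from any real test run.
    w, l, d = 6000, 2000, 12000
    print(f"{elo_from_wld(w, l, d):+.2f} +/- {elo_error_95(w, l, d):.2f}")
```

With 20,000 games the standard error of the mean score is small, which is why a run of that length gives a fairly tight Elo estimate.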
Commit ID | 44cddbd962c738678f407a7414efa5b93f0710d9 |
---|---|
Author | Joost VandeVondele |
Date | 2024-06-15 10:06:45 UTC |
Add matetrack to CI
verifies that all mate PVs printed for finished iterations (i.e. no lower or upper bounds),
are complete, i.e. of the expected length and ending in mate, and do not contain drawing
or illegal moves.
based on a set of 2000 positions and the code in https://github.com/vondele/matetrack
closes https://github.com/official-stockfish/Stockfish/pull/5390
No functional change
|
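To make the commit message's checks more concrete, here is a minimal sketch of two of them: skipping bounded scores and testing that a mate PV has the expected length. It is not the matetrack code. The function names are hypothetical, and it assumes UCI-style `info` lines in which `score mate N` counts moves rather than plies, so a mate in N for the side to move needs 2N - 1 plies and being mated in N needs 2N plies. Checking that every move is legal, non-drawing, and that the line ends in mate would need a chess library and is left out.

```python
from typing import List, Optional, Tuple


def parse_mate_info(line: str) -> Optional[Tuple[int, List[str]]]:
    """Return (mate_score, pv_moves) for an exact mate score, else None.

    Lines that are not UCI "info" lines, that carry a non-mate score,
    or that are marked lowerbound/upperbound are ignored.
    """
    tokens = line.split()
    if not tokens or tokens[0] != "info" or "score" not in tokens:
        return None
    i = tokens.index("score")
    if i + 2 >= len(tokens) or tokens[i + 1] != "mate":
        return None
    if "lowerbound" in tokens or "upperbound" in tokens:
        return None  # not a finished iteration's exact score
    if "pv" not in tokens:
        return None
    mate = int(tokens[i + 2])
    pv = tokens[tokens.index("pv") + 1:]
    return mate, pv


def expected_pv_plies(mate: int) -> int:
    """Number of plies a complete PV should have for a given mate score."""
    if mate > 0:
        return 2 * mate - 1  # side to move delivers the final move
    return -2 * mate         # side to move is mated (0 plies for mate 0)


def pv_has_expected_length(line: str) -> Optional[bool]:
    """True/False for exact mate lines, None for lines that are skipped."""
    parsed = parse_mate_info(line)
    if parsed is None:
        return None
    mate, pv = parsed
    return len(pv) == expected_pv_plies(mate)


if __name__ == "__main__":
    # The moves are placeholders; only the PV length is examined here.
    sample = "info depth 9 score mate 2 nodes 1234 pv a1a8 h8h7 a8h8"
    print(pv_has_expected_length(sample))
```

A fuller checker would replay each PV on a board from the test position and confirm that the final position is checkmate.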