NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
Host | Duration | Avg Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo |
---|
ID | Host | Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo | CLI | PGN |
---|
Commit ID | ce2d9e27ea8b10abbd69ebd5dd73e7dcf0aa0655 |
---|---|
Author | FauziAkram |
Date | 2024-11-13 19:35:02 UTC |
Simplify big-net reevaluation
Passed STC:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 37408 W: 9699 L: 9477 D: 18232
Ptnml(0-2): 130, 4326, 9577, 4534, 137
https://tests.stockfishchess.org/tests/view/672ffd8086d5ee47d953e633
Passed LTC:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 151062 W: 38087 L: 37999 D: 74976
Ptnml(0-2): 63, 16686, 41958, 16748, 76
https://tests.stockfishchess.org/tests/view/673087aa86d5ee47d953e66b
closes https://github.com/official-stockfish/Stockfish/pull/5674
Bench: 848812
|