NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
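
As a rough illustration of how a win/loss/draw tally maps to an Elo difference, the sketch below applies the standard logistic Elo formula to a WLD record. The function name `elo_from_wld` and the sample numbers are hypothetical and chosen only for illustration; this is not necessarily how NCM computes the Standard Elo or Gamepair Elo columns.

```python
import math


def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Estimate the Elo difference implied by a win/loss/draw record.

    Uses the logistic Elo model: elo = -400 * log10(1 / score - 1),
    where score is the fraction of points earned (a draw counts as half).
    """
    games = wins + losses + draws
    if games == 0:
        raise ValueError("at least one game is required")

    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        # The formula diverges at 0% and 100% scores.
        raise ValueError("score must be strictly between 0 and 1")

    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    # Made-up 20,000-game record, not taken from any real NCM run.
    print(round(elo_from_wld(wins=9000, losses=1000, draws=10000), 2))
```

With 20,000 games the estimate is fairly stable, but it remains approximate: the result depends on the opening book, time control, and hardware used for the match.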

| Host | Duration | Avg Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo |
|---|---|---|---|---|---|---|---|

| ID | Host | Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo | CLI | PGN |
|---|---|---|---|---|---|---|---|---|---|

| Commit ID | 54cf226604cfc9d17f432fa0b5bca56277e5561c |
|---|---|
| Author | FauziAkram |
| Date | 2024-11-13 19:09:13 UTC |

Revert VLTC regression from #5634
https://tests.stockfishchess.org/tests/view/671bf61b86d5ee47d953cf23

And thanks to @xu-shawn for suggesting running a VLTC regress test since
depth modifications affect scaling. Also, the LTC was showing a slight
regress after 680+k games (~= -0.34), for reference:
https://tests.stockfishchess.org/tests/view/67042b1f86d5ee47d953be7c

closes https://github.com/official-stockfish/Stockfish/pull/5663

Bench: 1307308