NCM plays each Stockfish dev build 20,000 times against Stockfish 15. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
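As a rough illustration of how an approximate Elo difference can be derived from match results, the sketch below converts win/loss/draw counts into an Elo estimate using the standard logistic Elo formula, with a simple normal-approximation interval. The function names are hypothetical, and this is not necessarily the exact calculation NCM performs (the page also reports a game-pair Elo, which this sketch does not cover). The sample counts are the ones quoted in the commit message further down, which come from a separate STC test run rather than from NCM's own match.

```python
import math


def score_to_elo(score):
    """Convert an expected score in (0, 1) to a logistic Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_from_wld(wins, losses, draws, z=1.96):
    """Estimate the Elo difference and an approximate interval from W/L/D counts.

    Assumes independent games (trinomial model); z=1.96 gives roughly a
    95% interval under a normal approximation.
    """
    games = wins + losses + draws
    if games == 0:
        raise ValueError("no games played")

    # Mean score per game: win = 1, draw = 0.5, loss = 0.
    score = (wins + 0.5 * draws) / games
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")

    # Per-game variance of the score, then standard error of the mean.
    variance = (
        wins * (1.0 - score) ** 2
        + losses * (0.0 - score) ** 2
        + draws * (0.5 - score) ** 2
    ) / games
    std_err = math.sqrt(variance / games)

    # Clamp the interval endpoints so the logarithm stays defined.
    eps = 1e-9
    low = max(score - z * std_err, eps)
    high = min(score + z * std_err, 1.0 - eps)
    return score_to_elo(score), score_to_elo(low), score_to_elo(high)


if __name__ == "__main__":
    elo, low, high = elo_from_wld(wins=3836, losses=4029, draws=14589)
    print(f"Elo: {elo:.2f} (approx. interval {low:.2f} to {high:.2f})")
```

With more games the standard error shrinks, which is why a long match such as the 20,000-game one described above gives a tighter estimate than a short one.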
| Host | Duration | Avg Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo |
|---|---|---|---|---|---|---|---|

| ID | Host | Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo | CLI | PGN |
|---|---|---|---|---|---|---|---|---|---|
| Commit ID | 145d29314211536e7508d3f07ee9d68c171370cc |
|---|---|
| Author | Marco Costalba |
| Date | 2014-05-04 07:42:32 UTC |
Revert stalemate detection in evaluation
Unfortunately we have a slowdown that causes
a regression in STC with no-regression mode:
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22454 W: 3836 L: 4029 D: 14589
bench: 8678654