Dev Builds » 20140714-2314

NCM plays each Stockfish dev build 20,000 times against Stockfish 15. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.

Summary

Host Duration Avg Base NPS Games WLD Standard Elo Ptnml(0-2) Gamepair Elo

Test Detail

ID Host Base NPS Games WLD Standard Elo Ptnml(0-2) Gamepair Elo CLI PGN

Commit

Commit ID 67a5e1ecf97eae3c74f5c84ebdbb7e3719bf90bb
Author lucasart
Date 2014-07-14 23:14:58 UTC
Contempt = 20

Also raise the admissible bounds to (-100,100), as there is no reason to prevent users from using high values if they want to.

Does not regress in self play:
ELO: 0.10 +-2.0 (95%) LOS: 53.7%
Total: 40000 W: 7084 L: 7073 D: 25843

master vs SF 3
ELO: 182.86 +-2.7 (95%) LOS: 100.0%
Total: 40000 W: 21843 L: 2541 D: 15616

Contempt = 20 vs SF 3
ELO: 189.25 +-2.8 (95%) LOS: 100.0%
Total: 40000 W: 22721 L: 2859 D: 14420

Diff is therefore 6.4 +/- 3.9 elo against a 180-190 elo weaker engine, which is significantly positive, as expected. This elo difference is likely understated, because of FishTest aggressive draw adjudication though.

We could push Contempt further, but after 20cp, it would get in the way of FishTest draw adjudication rule, and is likely to reduce the testing throughput as a result.

bench 8198667
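The Elo figures quoted in the commit message can be checked from the W/L/D totals alone. The sketch below assumes the standard logistic Elo model (a draw counts as half a point) and assumes the two matches against SF 3 are independent, so their 95% margins combine in quadrature. The commit does not state which formulas were used; these assumptions are merely consistent with the numbers it reports. The function names `elo_from_wld` and `combine_margins` are illustrative, not part of FishTest or Stockfish.

```python
import math


def elo_from_wld(wins, losses, draws):
    """Elo difference implied by a W/L/D record under the logistic model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # average points per game
    # Invert the expected-score formula: score = 1 / (1 + 10 ** (-elo / 400))
    return 400.0 * math.log10(score / (1.0 - score))


def combine_margins(margin_a, margin_b):
    """Margin of a difference of two independent estimates (assumption)."""
    return math.hypot(margin_a, margin_b)


# W/L/D totals copied from the commit message
self_play = elo_from_wld(7084, 7073, 25843)
master = elo_from_wld(21843, 2541, 15616)
contempt = elo_from_wld(22721, 2859, 14420)

diff = contempt - master
margin = combine_margins(2.7, 2.8)  # the two 95% margins quoted in the commit

print(f"self play:        {self_play:.2f}")
print(f"master vs SF 3:   {master:.2f}")
print(f"contempt vs SF 3: {contempt:.2f}")
print(f"difference:       {diff:.1f} +/- {margin:.1f}")
```

Under these assumptions the script reproduces the three quoted Elo values and the stated difference to within rounding.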
Copyright 2011–2024 Next Chess Move LLC