Dev Builds » 20181101-1500

You are viewing an old NCM Stockfish dev build test. The most recent dev build tests use Stockfish 15 as the baseline.

NCM plays each Stockfish dev build 20,000 times against Stockfish 7. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
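As a rough illustration of how a win/loss/draw record maps to an Elo difference and an error margin, the sketch below uses the standard logistic Elo model and a normal-approximation interval on the mean score. The function names and the choice of z = 1.96 (about 95% confidence) are assumptions made for this example; the page does not state how NCM computes its figures. The inputs are the 20,000-game totals reported for this build.

```python
import math


def elo_from_score(score: float) -> float:
    """Convert a mean score in (0, 1) to an Elo difference (logistic model)."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_estimate(wins: int, losses: int, draws: int, z: float = 1.96):
    """Return (elo, margin) for a win/loss/draw record.

    The margin is half the width of a normal-approximation interval on the
    mean score, mapped through the Elo curve. z = 1.96 is an assumption.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    half_width = z * math.sqrt(variance / games)
    low = elo_from_score(score - half_width)
    high = elo_from_score(score + half_width)
    return elo_from_score(score), (high - low) / 2.0


# Totals for this dev build against Stockfish 7.
elo, margin = elo_estimate(wins=11012, losses=435, draws=8553)
print(f"{elo:+.2f} ± {margin:.2f}")
```

Under these assumptions the result should land close to the total reported in the Summary. Note that the estimate is computed from the pooled game counts, not by averaging the per-host Elo values.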

Summary

Host       Duration  Avg Base NPS  Games  Wins   Losses  Draws  Elo
ncm-et-3   07:30:26  1968499       2917   1617   60      1240   +206.86 ± 9.5
ncm-et-4   08:48:04  1949327       3391   1819   92      1480   +195.18 ± 8.71
ncm-et-9   08:47:53  1977580       3425   1949   75      1401   +213.43 ± 8.97
ncm-et-10  08:48:12  1968866       3434   1902   65      1467   +207.44 ± 8.72
ncm-et-13  08:48:27  1978848       3420   1873   72      1475   +203.4 ± 8.7
ncm-et-15  08:48:03  1964011       3413   1852   71      1490   +201.11 ± 8.65
Total                              20000  11012  435     8553   +204.48 ± 3.62

Test Detail

ID Host Started (UTC) Duration Base NPS Games Wins Losses Draws Elo
58147 ncm-et-4 2018-11-03 03:14 01:00:16 1966305 391 222 10 159 +210.99 ± 26.79
58146 ncm-et-15 2018-11-03 03:12 01:03:15 1963541 413 233 5 175 +215.87 ± 25.2
58145 ncm-et-3 2018-11-03 03:10 01:04:54 1964456 417 228 7 182 +205.03 ± 24.73
58144 ncm-et-13 2018-11-03 03:09 01:05:05 1975772 420 216 12 192 +184.29 ± 24.16
58143 ncm-et-9 2018-11-03 03:08 01:06:16 1973909 425 220 12 193 +185.98 ± 24.11
58142 ncm-et-10 2018-11-03 03:08 01:06:52 1972355 434 239 5 190 +209.5 ± 24.08
58141 ncm-et-4 2018-11-03 01:56 01:17:47 1969753 500 255 13 232 +183.51 ± 21.9
58140 ncm-et-15 2018-11-03 01:52 01:18:17 1965867 500 266 7 227 +199.29 ± 21.99
58139 ncm-et-3 2018-11-03 01:52 01:16:27 1965226 500 266 10 224 +196.45 ± 22.28
58138 ncm-et-13 2018-11-03 01:51 01:17:39 1970959 500 268 7 225 +201.19 ± 22.11
58137 ncm-et-9 2018-11-03 01:50 01:17:54 1965245 500 281 8 211 +212.86 ± 23.02
58136 ncm-et-10 2018-11-03 01:49 01:16:57 1964317 500 273 7 220 +206.01 ± 22.41
58135 ncm-et-4 2018-11-03 00:39 01:15:53 1985723 500 252 9 239 +184.42 ± 21.36
58134 ncm-et-15 2018-11-03 00:35 01:16:18 1966927 500 270 7 223 +203.11 ± 22.23
58133 ncm-et-3 2018-11-03 00:34 01:16:45 1975174 500 255 12 233 +184.42 ± 21.81
58132 ncm-et-9 2018-11-03 00:32 01:16:04 1983164 500 288 8 204 +219.87 ± 23.48
58131 ncm-et-13 2018-11-03 00:32 01:17:21 1979816 500 284 12 204 +211.87 ± 23.6
58130 ncm-et-10 2018-11-03 00:31 01:17:10 1968932 500 266 8 226 +198.34 ± 22.08
58129 ncm-et-3 2018-11-02 23:17 01:15:48 1966152 500 304 6 190 +238.66 ± 24.39
58128 ncm-et-15 2018-11-02 23:17 01:17:05 1966939 500 283 15 202 +207.95 ± 23.81
58127 ncm-et-4 2018-11-02 23:16 01:21:16 1885671 500 256 15 229 +182.61 ± 22.13
58126 ncm-et-9 2018-11-02 23:15 01:16:44 1967860 500 304 16 180 +228.08 ± 25.39
58125 ncm-et-13 2018-11-02 23:14 01:17:21 1982082 500 272 7 221 +205.04 ± 22.35
58124 ncm-et-10 2018-11-02 23:12 01:18:05 1959704 500 287 11 202 +215.85 ± 23.71
58123 ncm-et-4 2018-11-02 21:59 01:16:24 1973343 500 289 17 194 +211.87 ± 24.39
58122 ncm-et-3 2018-11-02 21:58 01:18:50 1975221 500 273 12 215 +201.19 ± 22.89
58121 ncm-et-15 2018-11-02 21:57 01:18:40 1953108 500 259 14 227 +186.25 ± 22.22
58120 ncm-et-13 2018-11-02 21:57 01:16:02 1989904 500 293 11 196 +221.9 ± 24.12
58119 ncm-et-10 2018-11-02 21:56 01:15:18 1969859 500 285 11 204 +213.85 ± 23.58
58118 ncm-et-9 2018-11-02 21:55 01:18:26 1996586 500 272 9 219 +203.11 ± 22.55
58117 ncm-et-4 2018-11-02 20:41 01:16:26 1972553 500 287 15 198 +211.87 ± 24.08
58116 ncm-et-3 2018-11-02 20:39 01:17:42 1964766 500 291 13 196 +217.85 ± 24.17
58115 ncm-et-13 2018-11-02 20:38 01:17:17 1980598 500 262 10 228 +192.71 ± 22.04
58114 ncm-et-15 2018-11-02 20:38 01:17:38 1968007 500 265 8 227 +197.4 ± 22.02
58113 ncm-et-10 2018-11-02 20:38 01:16:45 1984882 500 277 12 211 +205.04 ± 23.15
58112 ncm-et-9 2018-11-02 20:37 01:16:44 1971599 500 281 10 209 +210.89 ± 23.22
58111 ncm-et-9 2018-11-02 19:21 01:15:45 1984701 500 303 12 185 +231.21 ± 24.94
58110 ncm-et-15 2018-11-02 19:20 01:16:50 1963692 500 276 15 209 +201.19 ± 23.35
58109 ncm-et-4 2018-11-02 19:20 01:20:02 1891946 500 258 13 229 +186.25 ± 22.07
58108 ncm-et-10 2018-11-02 19:20 01:17:05 1962019 500 275 11 214 +204.07 ± 22.93
58107 ncm-et-13 2018-11-02 19:20 01:17:42 1972805 500 278 13 209 +205.04 ± 23.3

Commit

Commit ID 3f1eb85a1ceb1b408f8f51cb82064b69e095399d
Author Joost VandeVondele
Date 2018-11-01 15:00:56 UTC
Fix issues from using adjustedDepth too broadly

The recently committed Fail-High patch (081af9080542a0d076a5482da37103a96ee15f64) had a number of changes beyond adjusting the depth of search on fail high, with some undesirable side effects.

1) Decreasing depth on PV output, confusing GUIs and players alike as described in issue #1787. The depth printed is anyway a convention, let's consider adjustedDepth an implementation detail, and continue to print rootDepth. Depth, nodes, time and move quality all increase as we compute more. (fixing this output has no effect on play).

2) Fixes go depth output (now based on rootDepth again, no effect on play), also reported in issue #1787

3) The depth lastBestDepth is used to compute how long a move is stable, a new move found during fail-high is incorrectly considered stable if based on adjustedDepth instead of rootDepth (this changes time management). Reverting this passed STC and LTC:

STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 82982 W: 17810 L: 17808 D: 47364
http://tests.stockfishchess.org/tests/view/5bd391a80ebc595e0ae1e993

LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 109083 W: 17602 L: 17619 D: 73862
http://tests.stockfishchess.org/tests/view/5bd40c820ebc595e0ae1f1fb

4) In the thread voting scheme, the rank of the fail-high thread is now artificially low, incorrectly since the quality of the move is much better than what adjustedDepth suggests (e.g. if it takes 10 iterations to find VALUE_KNOWN_WIN, it has very low depth). Further evidence comes from a test that showed that the move of highest depth is not better than that of the last PV (which is potentially of much lower adjustedDepth). I.e. this test http://tests.stockfishchess.org/tests/view/5bd37a120ebc595e0ae1e7c3 failed SPRT[0, 5]:

LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 10609 W: 2266 L: 2345 D: 5998

In a running 5+0.05 th 8 test (more than 10000 games) a positive Elo estimate is shown (strong enough for a [-3,1], possibly not [0,4]):

http://tests.stockfishchess.org/tests/view/5bd421be0ebc595e0ae1f315
LLR: -0.13 (-2.94,2.94) [0.00,4.00]
Total: 13644 W: 2573 L: 2532 D: 8539
Elo 1.04 [-2.52,4.61] / LOS 71%

Thus, restore old behavior as a bugfix, keeping the core of the fail-high patch idea as resolving scheme. This is non-functional for bench, but changes searches via time management and in the threaded case.

Bench: 3556672
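The test results quoted in the commit message report an LLR alongside stopping bounds of (-2.94, 2.94), and one of them reports an LOS figure. The sketch below shows where such numbers can come from: Wald's sequential probability ratio test bounds, and a common normal approximation for the likelihood of superiority. The error rates alpha = beta = 0.05 are an assumption inferred from the quoted bounds, and the function names are hypothetical; this is not taken from the testing framework's source.

```python
import math


def sprt_bounds(alpha: float = 0.05, beta: float = 0.05):
    """Wald SPRT stopping bounds on the log-likelihood ratio.

    The test stops (rejecting the patch) when the LLR falls to the lower
    bound, and stops (accepting it) when the LLR reaches the upper bound.
    alpha = beta = 0.05 is an assumption for this example.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


def likelihood_of_superiority(wins: int, losses: int) -> float:
    """Normal-approximation LOS from decisive games only.

    Draws are ignored because they carry no information about which
    side is stronger.
    """
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))


lower, upper = sprt_bounds()
print(f"SPRT bounds: ({lower:.2f},{upper:.2f})")

# Win/loss counts from the running 8-thread test quoted in the commit message.
los = likelihood_of_superiority(wins=2573, losses=2532)
print(f"LOS: {los:.1%}")
```

With these error rates the bounds are symmetric, which is consistent with the (-2.94,2.94) pair shown in each quoted result. The LOS value from this approximation may differ slightly from the reported figure depending on how the framework rounds.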
Copyright 2011–2024 Next Chess Move LLC