Dev Builds » 20211121-2018

You are viewing an old NCM Stockfish dev build test. The most recent dev build tests use Stockfish 15 as the baseline.

NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.

Summary

Host        Duration  Avg Base NPS  Games  W     L     D      Standard Elo   Ptnml(0-2)             Gamepair Elo
ncm-dbt-01  10:01:56  1233790       3342   1118  568   1656   +57.7 ± 5.96   4 188 753 706 20       +115.06 ± 12.41
ncm-dbt-02  10:00:18  1234250       3326   1046  604   1676   +46.45 ± 5.83  1 214 802 634 12       +92.15 ± 12.02
ncm-dbt-03  10:02:39  1273192       3352   1053  600   1699   +47.24 ± 5.87  4 210 806 641 15       +93.84 ± 11.99
ncm-dbt-04  10:02:55  1255149       3342   1058  621   1663   +45.69 ± 6.07  4 231 784 628 24       +88.57 ± 12.17
ncm-dbt-05  09:59:29  1267370       3292   1009  546   1737   +49.19 ± 5.93  3 202 788 635 18       +97.01 ± 12.13
ncm-dbt-06  10:04:22  1260052       3346   1088  591   1667   +51.99 ± 5.84  1 198 795 661 18       +102.56 ± 12.07
Total                               20000  6372  3530  10098  +49.71 ± 2.42  17 1243 4728 3905 107  +98.14 ± 4.95
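
The page does not state how the Elo columns are derived. The sketch below assumes the usual logistic Elo model, treats a game pair as won, drawn, or lost according to its Ptnml bucket, and estimates the error margin with a 95% delta-method interval on the pentanomial counts. All function names are illustrative, not part of any NCM tooling.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference for an expected score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))


def standard_elo(wins, losses, draws):
    """Elo from per-game results: a win counts 1 point, a draw 0.5."""
    games = wins + losses + draws
    return elo_from_score((wins + 0.5 * draws) / games)


def gamepair_elo(ptnml):
    """Elo from game pairs. ptnml[k] counts pairs that scored k/2 points.

    Assumption: buckets 0-1 are lost pairs, bucket 2 is a drawn pair,
    buckets 3-4 are won pairs.
    """
    pairs = sum(ptnml)
    won = ptnml[3] + ptnml[4]
    drawn = ptnml[2]
    return elo_from_score((won + 0.5 * drawn) / pairs)


def standard_elo_margin(ptnml, z=1.96):
    """Approximate +/- margin on the standard Elo (delta method, assumed)."""
    pairs = sum(ptnml)
    points = [0.0, 0.25, 0.5, 0.75, 1.0]  # per-game score of each bucket
    mean = sum(p * n for p, n in zip(points, ptnml)) / pairs
    var = sum(n * (p - mean) ** 2 for p, n in zip(points, ptnml)) / pairs
    stderr = math.sqrt(var / pairs)
    slope = 400.0 / (math.log(10) * mean * (1.0 - mean))  # d(Elo)/d(score)
    return z * stderr * slope


# Totals row of the summary table
total_ptnml = [17, 1243, 4728, 3905, 107]
print(round(standard_elo(6372, 3530, 10098), 2))
print(round(gamepair_elo(total_ptnml), 2))
print(round(standard_elo_margin(total_ptnml), 2))
```

Under these assumptions the three printed values agree with the totals row (+49.71, +98.14, and a margin of about 2.42), which suggests the table uses equivalent formulas.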

Test Detail

ID Host Base NPS Games W L D Standard Elo Ptnml(0-2) Gamepair Elo
207928 ncm-dbt-05 1210105 132 46 22 64 +63.87 ± 31.78 0 9 25 31 1 +126.37 ± 68.65
207927 ncm-dbt-02 1189942 160 51 39 70 +26.11 ± 26.51 0 14 40 26 0 +52.51 ± 54.29
207926 ncm-dbt-01 1204807 176 65 26 85 +78.28 ± 24.36 0 4 44 37 3 +150.97 ± 50.82
207925 ncm-dbt-04 1230656 176 56 40 80 +31.67 ± 25.66 0 14 45 28 1 +59.81 ± 51.08
207924 ncm-dbt-06 1243688 176 50 34 92 +31.67 ± 23.8 0 11 51 25 1 +59.81 ± 47.08
207923 ncm-dbt-03 1244071 180 62 26 92 +70.43 ± 24.21 0 8 38 44 0 +147.19 ± 55.94
207922 ncm-dbt-05 1238529 500 147 77 276 +48.96 ± 14.47 0 26 131 90 3 +95.44 ± 29.59
207921 ncm-dbt-02 1198655 500 147 94 259 +36.97 ± 14.83 1 34 126 89 0 +76.25 ± 30.37
207920 ncm-dbt-06 1224694 500 170 81 249 +62.51 ± 15.24 0 24 119 101 6 +119.89 ± 31.25
207919 ncm-dbt-04 1225038 500 163 99 238 +44.72 ± 15.28 0 35 118 95 2 +88.0 ± 31.44
207918 ncm-dbt-01 1238987 500 157 79 264 +54.65 ± 14.49 1 23 124 101 1 +112.14 ± 30.53
207917 ncm-dbt-03 1231496 500 155 101 244 +37.67 ± 14.74 0 34 130 84 2 +73.34 ± 29.83
207916 ncm-dbt-05 1246056 500 156 86 258 +48.96 ± 14.61 0 29 123 97 1 +98.44 ± 30.73
207915 ncm-dbt-02 1196643 500 154 87 259 +46.84 ± 14.88 0 32 120 97 1 +93.95 ± 31.17
207914 ncm-dbt-06 1234698 500 158 86 256 +50.38 ± 15.2 0 33 113 103 1 +101.46 ± 32.14
207913 ncm-dbt-04 1207170 500 144 87 269 +39.78 ± 15.74 0 40 116 91 3 +76.25 ± 31.7
207912 ncm-dbt-01 1177871 500 172 97 231 +52.51 ± 15.56 0 32 115 99 4 +101.46 ± 31.86
207911 ncm-dbt-03 1231999 500 147 84 269 +44.01 ± 16.8 2 38 111 93 6 +83.57 ± 32.37
207910 ncm-dbt-05 1232122 500 165 87 248 +54.64 ± 15.15 0 30 114 104 2 +109.07 ± 32.0
207909 ncm-dbt-02 1203116 500 158 94 248 +44.72 ± 15.53 0 36 117 94 3 +86.52 ± 31.58
207908 ncm-dbt-06 1214402 500 170 87 243 +58.21 ± 14.93 0 25 121 100 4 +113.68 ± 30.97
207907 ncm-dbt-01 1241708 500 165 71 264 +66.1 ± 14.73 0 24 109 116 1 +135.76 ± 32.75
207906 ncm-dbt-04 1233653 500 163 90 247 +51.09 ± 16.1 1 35 107 104 3 +101.46 ± 32.96
207905 ncm-dbt-03 1247467 500 153 85 262 +47.55 ± 14.92 1 30 119 100 0 +98.44 ± 31.3
207904 ncm-dbt-06 1214394 500 158 92 250 +46.13 ± 15.36 1 32 119 96 2 +92.46 ± 31.31
207903 ncm-dbt-05 1222589 500 146 85 269 +42.6 ± 15.78 2 32 123 89 4 +83.57 ± 30.77
207902 ncm-dbt-02 1194378 500 162 84 254 +54.64 ± 15.02 0 28 119 100 3 +107.54 ± 31.29
207901 ncm-dbt-03 1250913 500 143 95 262 +33.46 ± 14.99 1 37 125 87 0 +68.99 ± 30.52
207900 ncm-dbt-01 1203931 500 164 91 245 +51.09 ± 15.74 2 30 113 103 2 +104.49 ± 32.14
207899 ncm-dbt-04 1219974 500 157 97 246 +41.89 ± 16.46 2 35 121 85 7 +77.71 ± 31.04
207898 ncm-dbt-05 1257406 500 166 79 255 +61.08 ± 15.7 1 25 116 102 6 +118.33 ± 31.7
207897 ncm-dbt-02 1193729 500 158 93 249 +45.42 ± 14.55 0 30 126 93 1 +90.97 ± 30.33
207896 ncm-dbt-04 1223344 500 165 94 241 +49.67 ± 15.29 0 32 118 97 3 +96.94 ± 31.44
207895 ncm-dbt-03 1226814 500 167 90 243 +53.93 ± 15.63 0 31 116 98 5 +102.97 ± 31.72
207894 ncm-dbt-01 1202536 500 160 82 258 +54.64 ± 15.91 0 35 105 107 3 +107.54 ± 33.26
207893 ncm-dbt-06 1224479 500 175 84 241 +63.94 ± 14.5 0 23 114 112 1 +130.94 ± 31.98
206833 ncm-dbt-02 1209412 500 161 83 256 +54.64 ± 15.28 0 30 115 102 3 +107.54 ± 31.86
206832 ncm-dbt-05 1236406 500 140 80 280 +41.89 ± 14.87 0 34 123 92 1 +83.57 ± 30.77
206831 ncm-dbt-03 1252012 500 167 89 244 +54.64 ± 14.76 0 27 120 101 2 +109.07 ± 31.14
206830 ncm-dbt-01 1104727 500 171 95 234 +53.22 ± 15.84 1 32 110 104 3 +106.01 ± 32.56
206829 ncm-dbt-06 1230497 500 152 94 254 +40.48 ± 15.42 0 38 118 92 2 +79.17 ± 31.44
206828 ncm-dbt-04 1219828 500 164 80 256 +58.93 ± 15.22 0 26 119 100 5 +113.68 ± 31.27
177757 ncm-dbt-02 1488127 166 55 30 81 +52.72 ± 26.43 0 10 39 33 1 +103.41 ± 55.07
177756 ncm-dbt-01 1495758 166 64 27 75 +78.76 ± 28.22 0 8 33 39 3 +151.2 ± 60.1
177755 ncm-dbt-06 1493565 170 55 33 82 +45.21 ± 26.55 0 12 40 32 1 +87.65 ± 54.35
177754 ncm-dbt-03 1500769 172 59 30 83 +59.14 ± 22.06 0 5 47 34 0 +121.93 ± 48.72
177753 ncm-dbt-05 1495749 160 43 30 87 +28.29 ± 28.77 0 17 33 30 0 +56.96 ± 59.19
177752 ncm-dbt-04 1481531 166 46 34 86 +25.16 ± 27.49 1 14 40 28 0 +54.87 ± 54.3

Commit

Commit ID a5a89b27c8e3225fb453d603bc4515d32bb351c3
Author Stéphane Nicolet
Date 2021-11-21 20:18:08 UTC
Introduce Optimism

Current master implements a scaling of the raw NNUE output value with a formula equivalent to 'eval = alpha * NNUE_output', where the scale factor alpha varies between 1.8 (for early middle game) and 0.9 (for pure endgames). This feature allows Stockfish to keep material on the board when she thinks she has the advantage, and to seek exchanges and simplifications when she thinks she has to defend.

This patch slightly offsets the turning point between these two strategies, by adding to Stockfish's evaluation a small "optimism" value before actually doing the scaling. The effect is that SF will play a little bit more risky, trying to keep the tension a little bit longer when she is defending, and keeping even more material on the board when she has an advantage.

We note that this patch is similar in spirit to the old "Contempt" idea we used to have in classical Stockfish, but this implementation differs in two key points:

a) it has been tested as an Elo-gainer against master;
b) the values output by the search are not changed on average by the implementation (in other words, the optimism value changes the tension/exchange strategy, but a displayed value of 1.0 pawn has the same signification before and after the patch).

See the old comment https://github.com/official-stockfish/Stockfish/pull/1361#issuecomment-359165141 for some images illustrating the ideas.

-------

finished yellow at STC:
LLR: -2.94 (-2.94,2.94) <0.00,2.50>
Total: 165048 W: 41705 L: 41611 D: 81732
Ptnml(0-2): 565, 18959, 43245, 19327, 428
https://tests.stockfishchess.org/tests/view/61942a3dcd645dc8291c876b

passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,3.00>
Total: 121656 W: 30762 L: 30287 D: 60607
Ptnml(0-2): 87, 12558, 35032, 13095, 56
https://tests.stockfishchess.org/tests/view/61962c58cd645dc8291c8877

-------

How to continue from there?

a) the shape (slope and amplitude) of the sigmoid used to compute the optimism value could be tweaked to try to gain more Elo, so the parameters of the sigmoid function in line 391 of search.cpp could be tuned with SPSA. Manual tweaking is also possible using this Desmos page: https://www.desmos.com/calculator/jhh83sqq92
b) in a similar vein, with two recent patches affecting the scaling of the NNUE evaluation in evaluate.cpp, now could be a good time to try a round of SPSA tuning of the NNUE network;
c) this patch will tend to keep tension in middlegame a little bit longer, so any patch improving the defensive aspect of play via search extensions in risky, tactical positions would be welcome.

-------

closes https://github.com/official-stockfish/Stockfish/pull/3797

Bench: 6184852
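
The "turning point" described in the commit message can be illustrated with a toy model. This is not Stockfish's code: the linear interpolation of alpha between 0.9 and 1.8, the constant optimism value, and every name below are assumptions made for illustration. The commit itself computes optimism with a sigmoid in search.cpp, which is not reproduced here.

```python
def scale_factor(material_fraction):
    """Hypothetical alpha: 0.9 with no material left, 1.8 with full material."""
    return 0.9 + 0.9 * material_fraction


def scaled_eval(nnue_output, material_fraction, optimism=0.0):
    """Toy version of 'eval = alpha * (NNUE_output + optimism)'."""
    return scale_factor(material_fraction) * (nnue_output + optimism)


def prefers_keeping_material(nnue_output, optimism=0.0):
    """Compare the same raw NNUE value with full material vs. a bare endgame.

    Assumption: exchanging material leaves the raw NNUE output unchanged,
    so only alpha differs between the two options.
    """
    keep = scaled_eval(nnue_output, 1.0, optimism)
    trade = scaled_eval(nnue_output, 0.0, optimism)
    return keep > trade


# Without optimism, a slightly worse position favours simplification.
print(prefers_keeping_material(-10))
# With an optimism of 20, the same position now favours keeping tension.
print(prefers_keeping_material(-10, optimism=20))
```

In this model the side to move prefers to keep material exactly when NNUE_output + optimism is positive, so a positive optimism moves the switch from 0 to -optimism. That matches the commit's description of keeping the tension a little longer when defending.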
Copyright 2011–2024 Next Chess Move LLC