Stockfish Dev Build 20230223-1227

You are viewing an old NCM Stockfish dev build test. More recent dev build tests use Stockfish 15 as the baseline.

NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.

Summary

Host Duration Avg Base NPS Games WLD Standard Elo Ptnml(0-2) Gamepair Elo
ncm-dbt-01 04:59:42 1123131 1662 673 165 824 +109.7 ± 7.64 0 29 280 507 15 +237.19 ± 20.38
ncm-dbt-02 04:58:36 1229735 1648 681 149 818 +116.32 ± 7.59 1 22 261 524 16 +256.12 ± 21.11
ncm-dbt-03 05:01:21 1238053 1684 710 173 801 +114.79 ± 7.7 1 27 267 528 19 +249.86 ± 20.89
ncm-dbt-04 05:01:20 1227014 1672 695 161 816 +114.98 ± 7.36 0 22 271 530 13 +253.71 ± 20.7
ncm-dbt-05 04:58:52 1226688 1666 684 165 817 +111.95 ± 7.73 0 33 263 522 15 +243.57 ± 21.05
ncm-dbt-06 05:01:07 1236692 1668 698 182 788 +111.12 ± 8.04 0 38 264 510 22 +236.69 ± 21.0
ncm-et-3 06:24:16 1298031 1668 715 160 793 +120.18 ± 7.34 0 22 248 551 13 +269.29 ± 21.67
ncm-et-4 06:23:41 1304310 1660 686 170 804 +111.69 ± 7.96 1 30 274 502 23 +238.22 ± 20.61
ncm-et-9 06:26:19 1300182 1668 679 164 825 +110.89 ± 7.96 0 39 259 518 18 +238.62 ± 21.2
ncm-et-10 06:26:12 1290267 1670 680 154 836 +113.28 ± 7.5 1 22 277 520 15 +248.07 ± 20.47
ncm-et-13 06:24:44 1304398 1670 672 162 836 +109.6 ± 7.72 2 29 274 517 13 +239.53 ± 20.61
ncm-et-15 06:26:57 1298588 1664 702 157 805 +118.15 ± 7.62 0 23 262 526 21 +257.48 ± 21.07
Total 20000 8275 1962 9763 +113.54 ± 2.22 6 336 3200 6255 203 +247.19 ± 6.02
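
The two Elo columns can be cross-checked against the raw counts. The following is a minimal sketch, assuming the standard logistic Elo formula, a 95% normal-approximation margin computed from the pentanomial counts, and a "gamepair" score in which each two-game pair counts as a single win, draw, or loss. These are assumptions rather than NCM's documented method, and the function names are illustrative; under them, the totals row is reproduced to within rounding.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_with_margin(counts, values, z=1.96):
    """Return (Elo, approximate 95% margin) from outcome counts.

    counts[i] is how often an outcome worth values[i] points occurred.
    The margin uses the delta method, which is an assumption about how
    the published margins were derived.
    """
    n = sum(counts)
    mean = sum(c * v for c, v in zip(counts, values)) / n
    var = sum(c * (v - mean) ** 2 for c, v in zip(counts, values)) / n
    std_err = math.sqrt(var / n)
    # d(Elo)/d(score) = 400 / (ln(10) * s * (1 - s))
    slope = 400.0 / (math.log(10) * mean * (1.0 - mean))
    return elo_from_score(mean), z * slope * std_err


# Totals row of the Summary table.
wins, losses, draws = 8275, 1962, 9763
ptnml = [6, 336, 3200, 6255, 203]  # game pairs scoring 0, 0.5, 1, 1.5, 2 points

# Standard Elo from the plain win/loss/draw score.
wld_score = (wins + 0.5 * draws) / (wins + losses + draws)
standard_elo = elo_from_score(wld_score)

# Margin for standard Elo, using per-game scores of each pair outcome.
_, standard_margin = elo_with_margin(ptnml, [0.0, 0.25, 0.5, 0.75, 1.0])

# Gamepair Elo: collapse each pair into a loss, draw, or win.
pair_counts = [ptnml[0] + ptnml[1], ptnml[2], ptnml[3] + ptnml[4]]
pair_elo, pair_margin = elo_with_margin(pair_counts, [0.0, 0.5, 1.0])

print(f"Standard Elo: {standard_elo:+.2f} ± {standard_margin:.2f}")
print(f"Gamepair Elo: {pair_elo:+.2f} ± {pair_margin:.2f}")
```

Because a won pair needs more than one point out of two, the gamepair score is far more lopsided than the per-game score, which is why the gamepair Elo is roughly twice the standard Elo here.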

Test Detail

ID Host Base NPS Games WLD Standard Elo Ptnml(0-2) Gamepair Elo
201329 ncm-dbt-02 1231078 148 59 20 69 +93.76 ± 26.43 0 4 28 41 1 +197.17 ± 65.7
201328 ncm-dbt-01 1123086 162 66 11 85 +122.82 ± 23.16 0 2 23 55 1 +279.59 ± 73.46
201327 ncm-dbt-05 1241386 166 74 18 74 +121.97 ± 20.85 0 0 28 54 1 +277.08 ± 64.7
201326 ncm-dbt-06 1238499 168 66 16 86 +106.62 ± 25.31 0 4 28 50 2 +225.71 ± 65.87
201325 ncm-dbt-04 1221084 172 75 17 80 +121.92 ± 20.45 0 0 29 56 1 +277.16 ± 63.52
201324 ncm-dbt-03 1233575 184 75 17 92 +113.38 ± 20.76 0 2 30 60 0 +257.84 ± 63.47
201323 ncm-dbt-01 1117164 500 204 59 237 +103.73 ± 13.71 0 7 96 142 5 +219.87 ± 34.73
201322 ncm-dbt-02 1223737 500 205 42 253 +117.55 ± 12.82 0 4 82 161 3 +263.42 ± 37.69
201321 ncm-dbt-06 1238845 500 210 59 231 +108.3 ± 14.68 0 13 78 154 5 +232.26 ± 38.82
201320 ncm-dbt-04 1229633 500 210 39 251 +123.81 ± 14.32 0 7 74 160 9 +268.17 ± 39.97
201319 ncm-dbt-05 1218290 500 204 44 252 +115.22 ± 14.53 0 12 71 162 5 +251.89 ± 40.68
201318 ncm-dbt-03 1220526 500 208 54 238 +110.6 ± 14.22 0 10 81 154 5 +238.66 ± 38.15
201317 ncm-dbt-01 1130699 500 207 49 244 +113.68 ± 14.22 0 11 74 161 4 +249.64 ± 39.91
201316 ncm-dbt-02 1220978 500 206 44 250 +116.77 ± 13.88 0 8 77 160 5 +256.44 ± 39.15
201315 ncm-dbt-06 1240890 500 202 56 242 +104.49 ± 14.35 0 11 87 147 5 +221.9 ± 36.76
201314 ncm-dbt-04 1231841 500 201 51 248 +107.54 ± 13.23 0 8 85 156 1 +238.66 ± 37.17
201313 ncm-dbt-05 1222303 500 202 52 246 +107.54 ± 14.52 0 10 87 146 7 +226.0 ± 36.75
201312 ncm-dbt-03 1256618 500 217 49 234 +121.45 ± 13.66 0 4 82 156 8 +263.42 ± 37.69
201311 ncm-dbt-01 1121578 500 196 46 258 +107.54 ± 14.05 0 9 87 149 5 +230.16 ± 36.73
201310 ncm-dbt-02 1243147 500 211 43 246 +121.46 ± 14.34 1 6 74 162 7 +268.17 ± 39.97
201309 ncm-dbt-05 1224776 500 204 51 245 +109.83 ± 13.89 0 11 77 160 2 +243.0 ± 39.13
201308 ncm-dbt-03 1241495 500 210 53 237 +112.91 ± 15.0 1 11 74 158 6 +245.2 ± 39.87
201307 ncm-dbt-06 1228536 500 220 51 229 +122.24 ± 14.98 0 10 71 159 10 +261.07 ± 40.78
201306 ncm-dbt-04 1225500 500 209 54 237 +111.37 ± 13.22 0 7 83 158 2 +247.41 ± 37.61
166994 ncm-et-4 1316726 160 70 18 72 +117.16 ± 23.46 0 2 25 52 1 +261.95 ± 70.05
166993 ncm-et-15 1303018 164 69 20 75 +107.07 ± 22.25 0 1 32 48 1 +232.99 ± 60.33
166992 ncm-et-3 1299163 168 72 17 79 +118.08 ± 24.5 0 3 25 54 2 +258.14 ± 70.13
166991 ncm-et-10 1289360 170 68 14 88 +114.31 ± 27.43 0 5 25 51 4 +234.51 ± 69.61
166990 ncm-et-13 1294113 170 69 18 83 +107.53 ± 20.87 0 0 35 49 1 +234.5 ± 56.59
166989 ncm-et-9 1293786 168 67 16 85 +108.9 ± 25.33 0 4 27 51 2 +231.91 ± 67.16
166988 ncm-et-15 1292375 500 224 43 233 +131.74 ± 14.03 0 5 69 166 10 +290.66 ± 41.43
166987 ncm-et-3 1306191 500 221 52 227 +122.24 ± 13.12 0 5 75 166 4 +275.45 ± 39.63
166986 ncm-et-4 1302034 500 207 47 246 +115.22 ± 15.0 0 9 83 147 11 +238.66 ± 37.66
166985 ncm-et-9 1303775 500 202 49 249 +109.83 ± 13.89 0 11 77 160 2 +243.0 ± 39.13
166984 ncm-et-13 1306611 500 201 42 257 +114.45 ± 14.37 1 7 80 156 6 +249.64 ± 38.38
166983 ncm-et-10 1297988 500 206 46 248 +115.22 ± 13.03 0 5 83 159 3 +256.44 ± 37.51
166982 ncm-et-3 1285991 500 217 50 233 +120.67 ± 13.5 0 8 70 169 3 +273.0 ± 41.13
166981 ncm-et-4 1283914 500 187 55 258 +93.95 ± 14.53 0 14 94 138 4 +196.45 ± 35.33
166980 ncm-et-9 1299396 500 197 50 253 +105.25 ± 14.51 0 11 87 146 6 +221.9 ± 36.76
166979 ncm-et-15 1289815 500 205 50 245 +111.37 ± 13.89 0 10 78 159 3 +245.2 ± 38.89
166978 ncm-et-13 1308444 500 204 44 252 +115.22 ± 13.38 0 5 85 155 5 +251.89 ± 37.03
166977 ncm-et-10 1286920 500 203 46 251 +112.91 ± 13.56 1 4 86 155 4 +249.64 ± 36.79
166976 ncm-et-4 1314567 500 222 50 228 +124.6 ± 14.14 1 5 72 165 7 +277.93 ± 40.53
166975 ncm-et-3 1300781 500 205 41 254 +118.33 ± 13.35 0 6 78 162 4 +263.42 ± 38.85
166974 ncm-et-13 1308427 500 198 58 244 +99.95 ± 15.05 1 17 74 157 1 +219.87 ± 39.53
166973 ncm-et-10 1286800 500 203 48 249 +111.37 ± 13.73 0 8 83 155 4 +243.0 ± 37.64
166972 ncm-et-9 1303771 500 213 49 238 +118.33 ± 15.15 0 13 68 161 8 +254.16 ± 41.47
166971 ncm-et-15 1309147 500 204 44 252 +115.22 ± 14.05 0 7 83 153 7 +247.41 ± 37.61

Commit

Commit ID 69639d764bde566e524b8c2566119bf677cb2622
Author Linmiao Xu
Date 2023-02-23 12:27:57 UTC
Reintroduce nnue pawn scaling with lower lazy thresholds

Params found with the nevergrad TBPSA optimizer via nevergrad4sf modified to:

* use SPRT LLR with fishtest STC elo gainer bounds [0, 2] as the objective function
* increase the game batch size after each new optimal point is found

The params were the optimal point after TBPSA iteration 7 and 160 nevergrad evaluations with:

* initial batch size of 96 games per evaluation
* batch size increase of 64 games after each iteration
* a budget of 512 evaluations
* TC: fixed 1.5 million nodes per move, no time limit

nevergrad4sf enables optimizing stockfish params with TBPSA:
https://github.com/vondele/nevergrad4sf

Using pentanomial game results with smaller game batch sizes was inspired by:

Use of SPRT LLR calculated from pentanomial game results as the objective function was an experiment at maximizing the information from game batches to reduce the computational cost for TBPSA to converge on good parameters.

For the exact code used to find the params:
https://github.com/linrock/tuning-fork

Passed STC:
https://tests.stockfishchess.org/tests/view/63f4ef5ee74a12625bcd114a
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 66552 W: 17736 L: 17390 D: 31426
Ptnml(0-2): 164, 7229, 18166, 7531, 186

Passed LTC:
https://tests.stockfishchess.org/tests/view/63f56028e74a12625bcd2550
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 71264 W: 19150 L: 18787 D: 33327
Ptnml(0-2): 23, 6728, 21771, 7083, 27

closes https://github.com/official-stockfish/Stockfish/pull/4401

bench 3687580
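
The LLR bounds (-2.94,2.94) quoted for both tests are consistent with Wald's SPRT stopping thresholds at error rates α = β = 0.05. The commit message does not state the error rates, so those values are an assumption, and `sprt_bounds` is an illustrative helper rather than fishtest code. A minimal sketch:

```python
import math


def sprt_bounds(alpha, beta):
    """Wald SPRT thresholds on the log-likelihood ratio.

    alpha: probability of accepting H1 when H0 is true.
    beta:  probability of accepting H0 when H1 is true.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


# Assumed error rates; not stated in the commit message.
lower, upper = sprt_bounds(0.05, 0.05)
print(f"({lower:.2f},{upper:.2f})")  # prints "(-2.94,2.94)"
```

A test stops and passes once the LLR reaches the upper threshold, which matches the "LLR: 2.94" reported for both the STC and LTC runs.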
Copyright 2011–2024 Next Chess Move LLC