Dev Builds » 20231230-1008

You are viewing an old NCM Stockfish dev build test. The most recent dev build tests use Stockfish 15 as the baseline.

NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
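
As a rough illustration of how a win/loss/draw record maps to an Elo difference, the sketch below applies the standard logistic Elo formula to the totals reported in the summary for this build. The function name `elo_from_wld` is illustrative only and is not part of NCM's tooling.

```python
import math

def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Elo difference implied by a W/L/D record under the logistic Elo model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # average points per game, in (0, 1)
    return -400.0 * math.log10(1.0 / score - 1.0)

# Totals for this build against the baseline: 8622 wins, 1560 losses, 9818 draws.
print(round(elo_from_wld(8622, 1560, 9818), 1))
```

The formula is undefined when the score is exactly 0 or 1, so a record with no losses and no draws (or no wins and no draws) would need special handling.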

Summary

| Host | Duration | Avg Base NPS | Games | W | L | D | Standard Elo | Ptnml(0-2) | Gamepair Elo |
|---|---|---|---|---|---|---|---|---|---|
| ncm-dbt-01 | 09:51:10 | 1215381 | 3344 | 1419 | 255 | 1670 | +126.21 ± 5.21 | 1 32 482 1116 41 | +283.1 ± 15.51 |
| ncm-dbt-02 | 09:46:19 | 1227884 | 3296 | 1411 | 262 | 1623 | +126.42 ± 5.23 | 0 31 479 1096 42 | +282.77 ± 15.55 |
| ncm-dbt-03 | 09:50:59 | 1243527 | 3332 | 1459 | 275 | 1598 | +129.09 ± 5.42 | 1 45 438 1133 49 | +289.28 ± 16.29 |
| ncm-dbt-04 | 09:51:15 | 1228172 | 3350 | 1461 | 253 | 1636 | +131.18 ± 5.3 | 0 35 451 1135 54 | +293.92 ± 16.06 |
| ncm-dbt-05 | 09:47:05 | 1231940 | 3316 | 1440 | 244 | 1632 | +131.21 ± 5.47 | 0 38 450 1106 64 | +289.85 ± 16.08 |
| ncm-dbt-06 | 09:51:07 | 1233478 | 3362 | 1432 | 271 | 1659 | +125.12 ± 5.32 | 0 45 473 1120 43 | +278.6 ± 15.68 |
| Total | | | 20000 | 8622 | 1560 | 9818 | +128.2 ± 2.17 | 2 226 2773 6706 293 | +286.2 ± 6.47 |
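
The Ptnml(0-2) column counts game pairs by the points the dev build scored in the pair (0, 0.5, 1, 1.5 or 2). The sketch below shows one way to recompute both Elo columns from those counts. It rests on assumptions inferred from the figures on this page rather than from NCM's source: that Gamepair Elo treats a pair scoring more than 1 point as a win, less than 1 as a loss and exactly 1 as a draw, and that the ± values are 95% intervals (1.96 standard errors of the mean score) converted to Elo as half the width between the interval's endpoints. The helper names are illustrative.

```python
import math

def elo(score: float) -> float:
    """Logistic Elo difference for an average score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(counts, points):
    """Return (elo, margin) for outcome counts and the score each outcome is worth."""
    n = sum(counts)
    mean = sum(c * p for c, p in zip(counts, points)) / n
    var = sum(c * (p - mean) ** 2 for c, p in zip(counts, points)) / n
    delta = 1.96 * math.sqrt(var / n)  # assumed 95% half-width on the score scale
    margin = (elo(mean + delta) - elo(mean - delta)) / 2.0
    return elo(mean), margin

ptnml = [2, 226, 2773, 6706, 293]  # totals row: pairs scoring 0, 0.5, 1, 1.5, 2

# Standard Elo: express each pair outcome as points per game (0 to 1).
standard = elo_with_margin(ptnml, [0.0, 0.25, 0.5, 0.75, 1.0])

# Gamepair Elo: collapse the five buckets into pair loss / draw / win.
pairs = [ptnml[0] + ptnml[1], ptnml[2], ptnml[3] + ptnml[4]]
gamepair = elo_with_margin(pairs, [0.0, 0.5, 1.0])

print("standard: %+.2f +/- %.2f" % standard)
print("gamepair: %+.2f +/- %.2f" % gamepair)
```

Computing the variance over pairs rather than single games accounts for the correlation between the two games of a pair, which are typically played from the same opening with colors reversed.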

Test Detail

| ID | Host | Base NPS | Games | W | L | D | Standard Elo | Ptnml(0-2) | Gamepair Elo |
|---|---|---|---|---|---|---|---|---|---|
| 243346 | ncm-dbt-02 | 1245659 | 296 | 126 | 21 | 149 | +128.84 ± 16.77 | 0 1 45 98 4 | +289.64 ± 51.14 |
| 243345 | ncm-dbt-05 | 1240035 | 316 | 135 | 18 | 163 | +135.05 ± 17.39 | 0 2 44 105 7 | +298.76 ± 52.1 |
| 243344 | ncm-dbt-03 | 1242275 | 332 | 151 | 15 | 166 | +151.19 ± 14.92 | 0 2 30 130 4 | +377.09 ± 64.2 |
| 243343 | ncm-dbt-01 | 1206234 | 344 | 147 | 35 | 162 | +117.39 ± 17.61 | 0 5 57 103 7 | +246.56 ± 45.57 |
| 243342 | ncm-dbt-04 | 1227690 | 350 | 156 | 38 | 156 | +121.9 ± 16.8 | 0 4 55 110 6 | +263.42 ± 46.42 |
| 243341 | ncm-dbt-06 | 1241077 | 362 | 161 | 23 | 178 | +139.48 ± 15.7 | 0 0 52 120 9 | +310.14 ± 47.27 |
| 243340 | ncm-dbt-02 | 1224475 | 500 | 211 | 37 | 252 | +126.17 ± 13.42 | 0 4 75 164 7 | +280.42 ± 39.57 |
| 243339 | ncm-dbt-05 | 1232519 | 500 | 217 | 26 | 257 | +139.81 ± 14.55 | 0 6 60 171 13 | +309.64 ± 44.55 |
| 243338 | ncm-dbt-03 | 1243883 | 500 | 210 | 39 | 251 | +123.81 ± 15.13 | 0 14 58 171 7 | +273.0 ± 44.48 |
| 243337 | ncm-dbt-04 | 1222317 | 500 | 209 | 36 | 255 | +125.38 ± 13.07 | 0 5 71 170 4 | +285.49 ± 40.81 |
| 243336 | ncm-dbt-01 | 1249892 | 500 | 213 | 36 | 251 | +128.55 ± 13.38 | 0 3 75 164 8 | +285.49 ± 39.5 |
| 243335 | ncm-dbt-06 | 1226284 | 500 | 220 | 32 | 248 | +137.37 ± 13.53 | 0 7 54 183 6 | +321.19 ± 46.91 |
| 243334 | ncm-dbt-02 | 1220001 | 500 | 217 | 40 | 243 | +128.55 ± 12.43 | 0 3 70 174 3 | +298.62 ± 41.01 |
| 243333 | ncm-dbt-05 | 1217162 | 500 | 214 | 46 | 240 | +121.45 ± 14.98 | 0 9 75 155 11 | +256.44 ± 39.69 |
| 243332 | ncm-dbt-03 | 1231469 | 500 | 220 | 40 | 240 | +130.94 ± 13.87 | 0 5 69 167 9 | +290.66 ± 41.43 |
| 243331 | ncm-dbt-04 | 1239248 | 500 | 216 | 34 | 250 | +132.54 ± 14.18 | 0 9 57 177 7 | +301.33 ± 45.48 |
| 243330 | ncm-dbt-06 | 1239776 | 500 | 211 | 35 | 254 | +127.76 ± 13.57 | 0 6 68 170 6 | +288.06 ± 41.76 |
| 243329 | ncm-dbt-01 | 1230785 | 500 | 205 | 32 | 263 | +125.38 ± 12.89 | 0 3 76 166 5 | +282.94 ± 39.21 |
| 243328 | ncm-dbt-02 | 1224239 | 500 | 208 | 42 | 250 | +119.89 ± 13.16 | 0 5 78 163 4 | +268.17 ± 38.8 |
| 243327 | ncm-dbt-05 | 1234272 | 500 | 222 | 41 | 237 | +131.74 ± 13.85 | 0 7 62 174 7 | +298.62 ± 43.77 |
| 243326 | ncm-dbt-03 | 1259620 | 500 | 220 | 45 | 235 | +126.97 ± 14.45 | 0 5 77 156 12 | +270.57 ± 39.07 |
| 243325 | ncm-dbt-04 | 1232207 | 500 | 209 | 35 | 256 | +126.17 ± 13.24 | 0 3 77 163 7 | +280.42 ± 38.93 |
| 243324 | ncm-dbt-06 | 1234129 | 500 | 218 | 43 | 239 | +126.97 ± 13.58 | 0 7 66 172 5 | +288.06 ± 42.4 |
| 243323 | ncm-dbt-01 | 1219935 | 500 | 209 | 37 | 254 | +124.6 ± 13.8 | 0 5 76 161 8 | +273.0 ± 39.35 |
| 243322 | ncm-dbt-02 | 1228269 | 500 | 215 | 38 | 247 | +128.55 ± 13.73 | 0 5 71 166 8 | +285.49 ± 40.81 |
| 243321 | ncm-dbt-05 | 1233118 | 500 | 207 | 35 | 258 | +124.6 ± 14.47 | 0 6 77 156 11 | +265.78 ± 39.12 |
| 243320 | ncm-dbt-06 | 1227210 | 500 | 207 | 45 | 248 | +116.77 ± 13.54 | 0 7 78 161 4 | +258.75 ± 38.88 |
| 243319 | ncm-dbt-03 | 1240126 | 500 | 224 | 35 | 241 | +138.18 ± 12.93 | 0 2 65 175 8 | +318.25 ± 42.59 |
| 243318 | ncm-dbt-04 | 1231058 | 500 | 218 | 38 | 244 | +130.94 ± 13.51 | 0 2 76 162 10 | +288.06 ± 39.11 |
| 243317 | ncm-dbt-01 | 1209804 | 500 | 208 | 38 | 254 | +123.02 ± 13.29 | 0 7 69 171 3 | +280.42 ± 41.44 |
| 243316 | ncm-dbt-02 | 1217392 | 500 | 213 | 41 | 246 | +124.6 ± 14.31 | 0 8 70 164 8 | +273.0 ± 41.13 |
| 243315 | ncm-dbt-05 | 1234554 | 500 | 219 | 42 | 239 | +128.55 ± 13.38 | 0 3 75 164 8 | +285.49 ± 39.5 |
| 243314 | ncm-dbt-06 | 1235298 | 500 | 211 | 46 | 243 | +119.11 ± 13.52 | 0 4 84 155 7 | +258.75 ± 37.2 |
| 243313 | ncm-dbt-01 | 1231521 | 500 | 215 | 42 | 243 | +125.39 ± 14.13 | 1 6 68 169 6 | +282.94 ± 41.76 |
| 243312 | ncm-dbt-03 | 1245267 | 500 | 219 | 49 | 232 | +123.02 ± 13.64 | 1 6 68 172 3 | +282.94 ± 41.76 |
| 243311 | ncm-dbt-04 | 1230284 | 500 | 224 | 34 | 242 | +138.99 ± 13.85 | 0 6 57 178 9 | +318.25 ± 45.73 |
| 243310 | ncm-dbt-02 | 1235155 | 500 | 221 | 43 | 236 | +129.35 ± 13.72 | 0 5 70 167 8 | +288.06 ± 41.11 |
| 243309 | ncm-dbt-05 | 1231921 | 500 | 226 | 36 | 238 | +138.99 ± 13.29 | 0 5 57 181 7 | +324.17 ± 45.77 |
| 243308 | ncm-dbt-06 | 1230574 | 500 | 204 | 47 | 249 | +112.91 ± 15.0 | 0 14 71 159 6 | +243.0 ± 40.56 |
| 243307 | ncm-dbt-01 | 1159502 | 500 | 222 | 35 | 243 | +136.56 ± 12.39 | 0 3 61 182 4 | +324.17 ± 44.15 |
| 243306 | ncm-dbt-03 | 1242055 | 500 | 215 | 52 | 233 | +117.55 ± 14.53 | 0 11 71 162 6 | +256.44 ± 40.73 |
| 243305 | ncm-dbt-04 | 1214403 | 500 | 229 | 38 | 233 | +139.81 ± 14.19 | 0 6 58 175 11 | +315.35 ± 45.33 |

Commit

Commit ID f12035c88c58a5fd568d26cde9868f73a8d7b839
Author Linmiao Xu
Date 2023-12-30 10:08:03 UTC
Update default net to nn-b1e55edbea57.nnue

Created by retraining the master big net `nn-0000000000a0.nnue` on the same dataset with the ranger21 optimizer and more WDL skipping at training time.

More WDL skipping is meant to increase lambda accuracy and train on fewer misevaluated positions where position scores are unlikely to correlate with game outcomes. Inspired by:
- repeated reports in discord #events-discuss about SF misplaying due to wrong endgame evals, possibly due to Leela's endgame weaknesses reflected in training data
- an attempt to reduce the skewed dataset piece count distribution where there are much more positions with less than 16 pieces, since the target piece count distribution in the trainer is symmetric around 16

The faster convergence seen with ranger21 is meant to:
- prune experiment ideas more quickly since fewer epochs are needed to reach elo maxima
- research faster potential trainings by shortening each run

```yaml
experiment-name: 2560-S7-Re-514G-ranger21-more-wdl-skip
training-dataset: /data/S6-514G.binpack
early-fen-skipping: 28
start-from-engine-test-net: True
nnue-pytorch-branch: linrock/nnue-pytorch/r21-more-wdl-skip
num-epochs: 1200
lr: 4.375e-4
gamma: 0.995
start-lambda: 1.0
end-lambda: 0.7
```

Experiment yaml configs converted to easy_train.sh commands with:
https://github.com/linrock/nnue-tools/blob/4339954/yaml_easy_train.py

Implementations based off of Sopel's NNUE training & experimentation log:
https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY
- Experiment 336 - ranger21 https://github.com/Sopel97/nnue-pytorch/tree/experiment_336
- Experiment 351 - more WDL skipping

The version of the ranger21 optimizer used is:
https://github.com/lessw2020/Ranger21/blob/b507df6/ranger21/ranger21.py

The dataset is the exact same as in:
https://github.com/official-stockfish/Stockfish/pull/4782

Local elo at 25k nodes per move:
nn-epoch619.nnue : 6.2 +/- 4.2

Passed STC:
https://tests.stockfishchess.org/tests/view/658a029779aa8af82b94fbe6
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 46528 W: 11985 L: 11650 D: 22893
Ptnml(0-2): 154, 5489, 11688, 5734, 199

Passed LTC:
https://tests.stockfishchess.org/tests/view/658a448979aa8af82b95010f
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 265326 W: 66378 L: 65574 D: 133374
Ptnml(0-2): 153, 30175, 71254, 30877, 204

This was additionally tested with the latest DualNNUE and passed SPRTs:

Passed STC vs. https://github.com/official-stockfish/Stockfish/pull/4919
https://tests.stockfishchess.org/tests/view/658bcd5c79aa8af82b951846
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 296128 W: 76273 L: 75554 D: 144301
Ptnml(0-2): 1223, 35768, 73617, 35979, 1477

Passed LTC vs. https://github.com/official-stockfish/Stockfish/pull/4919
https://tests.stockfishchess.org/tests/view/658c988d79aa8af82b95240f
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 75618 W: 19085 L: 18680 D: 37853
Ptnml(0-2): 45, 8420, 20497, 8779, 68

closes https://github.com/official-stockfish/Stockfish/pull/4942

Bench: 1304666
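
The SPRT lines in the commit message above report a log-likelihood ratio (LLR) next to the bounds `(-2.94,2.94)`. Those bounds are consistent with Wald's sequential test thresholds at error rates alpha = beta = 0.05. The sketch below computes the thresholds under that assumption; the function name `sprt_bounds` is illustrative and is not taken from the testing framework's code.

```python
import math

def sprt_bounds(alpha: float = 0.05, beta: float = 0.05):
    """Wald SPRT thresholds on the log-likelihood ratio.

    The test stops and accepts the patch once the LLR reaches the upper
    bound, and rejects it once the LLR falls to the lower bound.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper

lower, upper = sprt_bounds()
print(f"({lower:.2f},{upper:.2f})")
```

Read this way, each "Passed" entry above stopped once its LLR reached roughly the upper bound, which is why the reported values cluster around 2.93 to 2.95.
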
Copyright 2011–2024 Next Chess Move LLC