Dev Builds » 20220210-1854


NCM plays each Stockfish dev build 20,000 times against Stockfish 15. This yields an approximate Elo difference for the dev build, together with error bounds that indicate how much confidence to place in that measurement.
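The "Standard Elo" point estimates in the tables below can be reproduced from the W/L/D counts with the standard logistic Elo model; how NCM derives the ± error bounds is not stated on this page. A minimal sketch:

    import math

    def elo_from_wld(wins: int, losses: int, draws: int) -> float:
        """Elo difference implied by a W/L/D record under the logistic model."""
        games = wins + losses + draws
        score = (wins + 0.5 * draws) / games       # average points per game
        return 400.0 * math.log10(score / (1.0 - score))

    # First summary row below: 938 wins, 1090 losses, 1972 draws over 4000 games.
    print(round(elo_from_wld(938, 1090, 1972), 2))  # -13.21, matching the table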

Summary

Host  Duration  Avg Base NPS  Games  W  L  D  Standard Elo  Ptnml(0-2)  Gamepair Elo
ncm-dbt-01 06:55:34 584084 4000 938 1090 1972 -13.21 ± 5.34 5 555 1030 407 3 -26.11 ± 10.61
ncm-dbt-02 06:55:17 585849 4000 947 1087 1966 -12.17 ± 5.18 3 525 1083 387 2 -24.19 ± 10.31
ncm-dbt-03 06:55:48 584431 4000 938 1124 1938 -16.17 ± 5.1 3 540 1097 360 0 -31.88 ± 10.22
ncm-dbt-04 06:56:23 568707 4004 946 1084 1974 -11.98 ± 5.04 7 489 1142 363 1 -22.94 ± 9.97
ncm-dbt-05 06:57:05 580397 3996 942 1092 1962 -13.05 ± 5.18 1 535 1078 381 3 -26.48 ± 10.34
Total 20000 4711 5477 9812 -13.31 ± 2.31 19 2644 5430 1898 9 -26.32 ± 4.6
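The Ptnml(0-2) columns are pentanomial counts: games are played in pairs with colors reversed, and the five numbers count the pairs in which the dev build scored 0, 0.5, 1, 1.5, or 2 points. A quick consistency check against the totals row above (illustrative only, not NCM's code):

    # Totals row: 20000 games, W/L/D = 4711/5477/9812,
    # Ptnml(0-2) = 19, 2644, 5430, 1898, 9.
    ptnml = [19, 2644, 5430, 1898, 9]
    pair_scores = [0.0, 0.5, 1.0, 1.5, 2.0]

    print(sum(ptnml))                                      # 10000 pairs = 20000 games
    print(sum(n * s for n, s in zip(ptnml, pair_scores)))  # 9617.0 points from the pairs
    print(4711 + 0.5 * 9812)                               # 9617.0 points from W + D/2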

Test Detail

ID  Host  Base NPS  Games  W  L  D  Standard Elo  Ptnml(0-2)  Gamepair Elo
433284 ncm-dbt-04 570308 4 1 1 2 0.0 ± 17.0 0 0 2 0 0 -0.0 ± 15.22
433283 ncm-dbt-05 579496 496 123 142 231 -13.32 ± 14.0 0 61 146 40 1 -28.08 ± 27.75
433282 ncm-dbt-03 584579 500 129 144 227 -10.42 ± 13.65 1 55 152 42 0 -19.48 ± 27.0
433281 ncm-dbt-01 581610 500 123 135 242 -8.34 ± 14.39 1 59 141 49 0 -15.3 ± 28.49
433280 ncm-dbt-02 585464 500 123 133 244 -6.95 ± 15.03 0 66 128 56 0 -13.9 ± 30.16
433279 ncm-dbt-04 568116 500 116 141 243 -17.38 ± 14.87 1 70 132 47 0 -33.46 ± 29.64
433278 ncm-dbt-05 583614 500 117 136 247 -13.21 ± 14.67 0 67 136 46 1 -27.85 ± 29.13
433277 ncm-dbt-02 587834 500 123 144 233 -14.6 ± 14.91 1 68 132 49 0 -27.85 ± 29.65
433276 ncm-dbt-01 583405 500 104 136 260 -22.27 ± 14.73 1 72 136 40 1 -44.72 ± 29.1
433275 ncm-dbt-03 586435 500 101 146 253 -31.35 ± 14.85 1 82 128 39 0 -61.79 ± 30.13
433274 ncm-dbt-04 568991 500 114 135 251 -14.6 ± 14.27 0 66 139 45 0 -29.25 ± 28.74
433273 ncm-dbt-05 581902 500 118 124 258 -4.17 ± 14.54 0 60 136 54 0 -8.34 ± 29.15
433272 ncm-dbt-02 586097 500 110 138 252 -19.48 ± 14.13 0 69 140 41 0 -39.08 ± 28.58
433271 ncm-dbt-01 585928 500 124 132 244 -5.56 ± 15.41 0 66 128 54 2 -13.9 ± 30.16
433270 ncm-dbt-03 584159 500 114 149 237 -24.36 ± 14.37 0 75 135 40 0 -48.96 ± 29.23
433269 ncm-dbt-04 568633 500 121 140 239 -13.21 ± 14.28 0 64 142 43 1 -27.85 ± 28.34
433268 ncm-dbt-05 582193 500 124 139 237 -10.43 ± 15.07 0 69 127 54 0 -20.87 ± 30.28
433267 ncm-dbt-02 586181 500 126 124 250 +1.39 ± 14.42 0 55 138 57 0 +2.78 ± 28.89
433266 ncm-dbt-01 582569 500 119 128 253 -6.25 ± 15.58 0 70 119 61 0 -12.51 ± 31.26
433265 ncm-dbt-03 581735 500 109 137 254 -19.48 ± 15.15 0 77 124 49 0 -39.08 ± 30.66
433264 ncm-dbt-04 566730 500 126 137 237 -7.64 ± 14.46 2 56 143 49 0 -12.51 ± 28.23
433263 ncm-dbt-05 578506 500 109 141 250 -22.27 ± 14.73 0 75 133 41 1 -46.13 ± 29.5
433262 ncm-dbt-01 583112 500 115 141 244 -18.08 ± 15.66 0 80 116 54 0 -36.26 ± 31.64
433261 ncm-dbt-04 569071 500 113 138 249 -17.38 ± 14.87 2 67 135 46 0 -32.05 ± 29.26
433260 ncm-dbt-03 582152 500 121 134 245 -9.04 ± 14.32 0 62 139 49 0 -18.08 ± 28.75
433259 ncm-dbt-02 585126 500 108 135 257 -18.78 ± 14.73 1 68 140 39 2 -39.08 ± 28.58
433258 ncm-dbt-02 585717 500 120 155 225 -24.36 ± 14.37 1 72 138 39 0 -47.55 ± 28.83
433257 ncm-dbt-03 584790 500 123 138 239 -10.42 ± 13.92 1 57 148 44 0 -19.48 ± 27.55
433256 ncm-dbt-05 578877 500 108 146 246 -26.46 ± 13.99 0 74 140 36 0 -53.22 ± 28.53
433255 ncm-dbt-01 584579 500 117 145 238 -19.48 ± 14.26 0 70 138 42 0 -39.08 ± 28.85
433254 ncm-dbt-04 568752 500 118 127 255 -6.25 ± 14.47 2 55 143 50 0 -9.73 ± 28.23
433253 ncm-dbt-02 585506 500 124 128 248 -2.78 ± 15.17 0 64 126 60 0 -5.56 ± 30.41
433252 ncm-dbt-04 566493 500 118 131 251 -9.04 ± 13.38 0 55 153 42 0 -18.08 ± 26.86
433251 ncm-dbt-05 577274 500 128 147 225 -13.21 ± 15.53 1 72 122 55 0 -25.06 ± 30.9
433250 ncm-dbt-01 585717 500 115 134 251 -13.21 ± 15.77 2 71 121 56 0 -23.66 ± 31.02
433249 ncm-dbt-03 585042 500 118 137 245 -13.21 ± 14.8 0 69 131 50 0 -26.46 ± 29.78
433248 ncm-dbt-02 584874 500 113 130 257 -11.82 ± 14.17 0 63 141 46 0 -23.66 ± 28.48
433247 ncm-dbt-03 586562 500 123 139 238 -11.12 ± 14.24 0 63 140 47 0 -22.27 ± 28.62
433246 ncm-dbt-01 585759 500 121 139 240 -12.51 ± 14.99 1 67 131 51 0 -23.66 ± 29.78
433245 ncm-dbt-05 581319 500 115 117 268 -1.39 ± 14.42 0 57 138 55 0 -2.78 ± 28.89
433244 ncm-dbt-04 571270 500 119 134 247 -10.43 ± 13.37 0 56 153 41 0 -20.87 ± 26.85

Commit

Commit ID cb9c2594fcedc881ae8f8bfbfdf130cf89840e4c
Author Tomasz Sobczyk
Date 2022-02-10 18:54:31 UTC
Update architecture to "SFNNv4". Update network to nn-6877cd24400e.nnue.

Architecture:

The diagram of the "SFNNv4" architecture:
https://user-images.githubusercontent.com/8037982/153455685-cbe3a038-e158-4481-844d-9d5fccf5c33a.png

The most important architectural changes are the following:

* 1024x2 [activated] neurons are pairwise, elementwise multiplied (not quite pairwise due to implementation details, see diagram), which introduces a non-linearity that exhibits similar benefits to the previously tested sigmoid activation (quantmoid4) while being slightly faster.
* The following layer therefore has 2x fewer inputs, which we compensate for by giving it 2 more outputs. It is possible that reducing the number of outputs might be beneficial (it was as low as 8 before). The layer is now 1024->16.
* The 16 outputs are split into 15 and 1. The 1-wide output is added to the network output (after some necessary scaling due to quantization differences). The 15-wide part is activated and follows the usual path through a set of linear layers. The additional 1-wide output is at least neutral, has shown a slightly positive trend in training compared to networks without it (all 16 outputs through the usual path), and possibly allows an additional stage of lazy evaluation to be introduced in the future.

Additionally, the inference code was rewritten and no longer uses a recursive implementation. This was necessitated by the splitting of the 16-wide intermediate result into two, which was impossible to do with the old implementation without ugly hacks. This is hopefully overall for the better.
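For orientation, here is a minimal sketch of the output head described above, written in PyTorch style. It is illustrative only and not the actual Stockfish or nnue-pytorch code: the exact pairing used in the multiplication, the hidden-layer sizes after the 15-wide split, the activation details, and all names below are assumptions.

    import torch
    import torch.nn as nn

    class Sfnnv4HeadSketch(nn.Module):
        """Rough sketch of the SFNNv4 output head (assumed names and sizes)."""

        def __init__(self):
            super().__init__()
            self.l1 = nn.Linear(1024, 16)   # the 1024->16 layer described above
            self.l2 = nn.Linear(15, 32)     # "usual path" hidden layer (size assumed)
            self.out = nn.Linear(32, 1)

        def forward(self, ft_us: torch.Tensor, ft_them: torch.Tensor) -> torch.Tensor:
            # ft_us / ft_them: the 1024x2 feature-transformer outputs for the
            # side to move and the opponent, shape (batch, 1024) each.
            x = torch.cat([ft_us, ft_them], dim=1).clamp(0.0, 1.0)  # clipped activation
            # Pairwise, elementwise multiplication -> 1024 values.
            # (The real pairing differs slightly; see the diagram linked above.)
            x = x[:, :1024] * x[:, 1024:]
            y = self.l1(x)                       # 1024 -> 16
            skip, rest = y[:, :1], y[:, 1:]      # split into 1 + 15
            rest = rest.clamp(0.0, 1.0)          # 15-wide part is activated ...
            z = self.out(self.l2(rest).clamp(0.0, 1.0))  # ... and follows the usual linear path
            return z + skip                      # 1-wide output added to the network output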
First session:

The first session trained a network from scratch (random initialization). The exact trainer used was slightly different (older) from the one used in the second session, but it should not have a measurable effect. The purpose of this session is to establish a strong network base for the second session. Small deviations in strength do not harm the learnability in the second session.

The training was done using the following command:

    python3 train.py \
        /home/sopel/nnue/nnue-pytorch-training/data/nodes5000pv2_UHO.binpack \
        /home/sopel/nnue/nnue-pytorch-training/data/nodes5000pv2_UHO.binpack \
        --gpus "$3," \
        --threads 4 \
        --num-workers 4 \
        --batch-size 16384 \
        --progress_bar_refresh_rate 20 \
        --random-fen-skipping 3 \
        --features=HalfKAv2_hm^ \
        --lambda=1.0 \
        --gamma=0.992 \
        --lr=8.75e-4 \
        --max_epochs=400 \
        --default_root_dir ../nnue-pytorch-training/experiment_$1/run_$2

Every 20th net was saved and its playing strength measured against some baseline at 25k nodes per move with pure NNUE evaluation (modified binary). The exact setup is not important as long as it is consistent; the purpose is to sift good candidates from bad ones.

The dataset can be found at https://drive.google.com/file/d/1UQdZN_LWQ265spwTBwDKo0t1WjSJKvWY/view

Second session:

The second training session started from the best network (as determined by strength testing) from the first session. It is important that it is resumed from a .pt model and NOT a .ckpt model; the conversion can be performed directly using serialize.py.

The LR schedule was modified to use gamma=0.995 instead of gamma=0.992 and LR=4.375e-4 instead of LR=8.75e-4 to flatten the LR curve and allow for longer training. The training then ran for 800 epochs instead of 400 (though it is possibly mostly noise after around epoch 600).

The training was done using the following command:

    python3 train.py \
        /data/sopel/nnue/nnue-pytorch-training/data/T60T70wIsRightFarseerT60T74T75T76.binpack \
        /data/sopel/nnue/nnue-pytorch-training/data/T60T70wIsRightFarseerT60T74T75T76.binpack \
        --gpus "$3," \
        --threads 4 \
        --num-workers 4 \
        --batch-size 16384 \
        --progress_bar_refresh_rate 20 \
        --random-fen-skipping 3 \
        --features=HalfKAv2_hm^ \
        --lambda=1.0 \
        --gamma=0.995 \
        --lr=4.375e-4 \
        --max_epochs=800 \
        --resume-from-model /data/sopel/nnue/nnue-pytorch-training/data/exp295/nn-epoch399.pt \
        --default_root_dir ../nnue-pytorch-training/experiment_$1/run_$run_id

In particular, note that we now use lambda=1.0 instead of lambda=0.8 (previous nets), because tests show that the WDL-skipping introduced by vondele performs better with lambda=1.0. Nets were saved every 20th epoch. In total 16 runs were made with these settings, and the best nets were chosen according to playing strength at 25k nodes per move with pure NNUE evaluation; these are the 4 nets that have been put on fishtest.

The dataset can be found either at ftp://ftp.chessdb.cn/pub/sopel/data_sf/T60T70wIsRightFarseerT60T74T75T76.binpack in its entirety (the download might be painfully slow because it is hosted in China) or can be assembled in the following way:

Get the https://github.com/official-stockfish/Stockfish/blob/5640ad48ae5881223b868362c1cbeb042947f7b4/script/interleave_binpacks.py script.
Download T60T70wIsRightFarseer.binpack: https://drive.google.com/file/d/1_sQoWBl31WAxNXma2v45004CIVltytP8/view
Download farseerT74.binpack: http://trainingdata.farseer.org/T74-May13-End.7z
Download farseerT75.binpack: http://trainingdata.farseer.org/T75-June3rd-End.7z
Download farseerT76.binpack: http://trainingdata.farseer.org/T76-Nov10th-End.7z
Run: python3 interleave_binpacks.py T60T70wIsRightFarseer.binpack farseerT74.binpack farseerT75.binpack farseerT76.binpack T60T70wIsRightFarseerT60T74T75T76.binpack

Tests:

STC: https://tests.stockfishchess.org/tests/view/6203fb85d71106ed12a407b7
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 16952 W: 4775 L: 4521 D: 7656
Ptnml(0-2): 133, 1818, 4318, 2076, 131

LTC: https://tests.stockfishchess.org/tests/view/62041e68d71106ed12a40e85
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 14944 W: 4138 L: 3907 D: 6899
Ptnml(0-2): 21, 1499, 4202, 1728, 22

closes https://github.com/official-stockfish/Stockfish/pull/3927

Bench: 4919707
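The STC and LTC results above are fishtest SPRT runs: the bracketed Elo ranges <0.00,2.50> and <0.50,3.00> are the two Elo hypotheses being compared, and the test stops once the log-likelihood ratio (LLR) hits one of the stopping bounds. Assuming the error rates alpha = beta = 0.05 that fishtest commonly uses (not stated on this page), the ±2.94 bounds follow directly:

    import math

    # SPRT stopping bounds, assuming alpha = beta = 0.05.
    alpha = beta = 0.05
    upper = math.log((1 - beta) / alpha)     # accept "the patch gains Elo"
    lower = math.log(beta / (1 - alpha))     # accept "the patch does not gain enough Elo"
    print(round(upper, 2), round(lower, 2))  # 2.94 -2.94, matching "LLR: 2.94 (-2.94,2.94)"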