Update NNUE architecture to SFNNv7 with larger L1 size of 2048
Creating this net involved:
- a 5-step training process from scratch
- greedy permuting L1 weights with https://github.com/official-stockfish/Stockfish/pull/4620
- leb128 compression with https://github.com/glinscott/nnue-pytorch/pull/251
- greedy 2- and 3- cycle permuting with https://github.com/official-stockfish/Stockfish/pull/4640
The 5 training steps were:
1. 400 epochs, lambda 1.0, lr 9.75e-4
UHOx2-wIsRight-multinet-dfrc-n5000-largeGensfen-d9.binpack (178G)
nodes5000pv2_UHO.binpack
data_pv-2_diff-100_nodes-5000.binpack
wrongIsRight_nodes5000pv2.binpack
multinet_pv-2_diff-100_nodes-5000.binpack
dfrc_n5000.binpack
large_gensfen_multipvdiff_100_d9.binpack
ep399 chosen as start model for step2
2. 800 epochs, end-lambda 0.75, skip 16
LeelaFarseer-T78juntoaugT79marT80dec.binpack (141G)
T60T70wIsRightFarseerT60T74T75T76.binpack
test78-junjulaug2022-16tb7p.no-db.min.binpack
test79-mar2022-16tb7p.no-db.min.binpack
test80-dec2022-16tb7p.no-db.min.binpack
ep559 chosen as start model for step3
3. 800 epochs, end-lambda 0.725, skip 20
leela96-dfrc99-v2-T80dectofeb-sk20-mar-v6-T77decT78janfebT79apr.binpack (223G)
leela96-filt-v2.min.binpack
dfrc99-16tb7p-eval-filt-v2.min.binpack
test80-dec2022-16tb7p-filter-v6-sk20.min-mar2023.binpack
test80-jan2023-16tb7p-filter-v6-sk20.min-mar2023.binpack
test80-feb2023-16tb7p-filter-v6-sk20.min-mar2023.binpack
test80-mar2023-2tb7p-filter-v6.min.binpack
test77-dec2021-16tb7p.no-db.min.binpack
test78-janfeb2022-16tb7p.no-db.min.binpack
test79-apr2022-16tb7p.no-db.min.binpack
ep499 chosen as start model for step4
4. 800 epochs, end-lambda 0.7, skip 24
0dd1cebea57 dataset https://github.com/official-stockfish/Stockfish/pull/4606
ep599 chosen as start model for step5
5. 800 epochs, end-lambda 0.7, skip 28
same dataset as step4
ep619 became nn-1b951f8b449d.nnue
For the final step5 training:
python3 easy_train.py \
--experiment-name L1-2048-S5-sameData-sk28-S4-0dd1cebea57-shuffled-S3-leela96-dfrc99-v2-T80dectofeb-sk20-mar-v6-T77decT78janfebT79apr-sk20-S2-LeelaFarseerT78T79T80-ep399-S1-UHOx2-wIsRight-multinet-dfrc-n5000-largeGensfen-d9 \
--training-dataset /data/leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr.binpack \
--early-fen-skipping 28 \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-L1-2048 \
--engine-test-branch linrock/Stockfish/L1-2048 \
--start-from-engine-test-net False \
--start-from-model /data/experiments/experiment_L1-2048-S4-0dd1cebea57-shuffled-S3-leela96-dfrc99-v2-T80dectofeb-sk20-mar-v6-T77decT78janfebT79apr-sk20-S2-LeelaFarseerT78T79T80-ep399-S1-UHOx2-wIsRight-multinet-dfrc-n5000-largeGensfen-d9/training/run_0/nn-epoch599.nnue
--max_epoch 800 \
--lr 4.375e-4 \
--gamma 0.995 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--tui False \
--seed $RANDOM \
--gpus 0
SF training data components for the step1 dataset:
https://drive.google.com/drive/folders/1yLCEmioC3Xx9KQr4T7uB6GnLm5icAYGU
Leela training data for steps 2-5 can be found at:
https://robotmoon.com/nnue-training-data/
Due to larger L1 size and slower inference, the speed penalty loses elo
at STC. Measurements from 100 bench runs at depth 13 with x86-64-modern
on Intel Core i5-1038NG7 2.00GHz:
sf_base = 1240730 +/- 3443 (95%)
sf_test = 1153341 +/- 2832 (95%)
diff = -87388 +/- 1616 (95%)
speedup = -7.04330% +/- 0.130% (95%)
Local elo at 25k nodes per move (vs. L1-1536 nn-fdc1d0fe6455.nnue):
nn-epoch619.nnue : 21.1 +/- 3.2
Failed STC:
https://tests.stockfishchess.org/tests/view/6498ee93dc7002ce609cf979
LLR: -2.95 (-2.94,2.94) <0.00,2.00>
Total: 11680 W: 3058 L: 3299 D: 5323
Ptnml(0-2): 44, 1422, 3149, 1181, 44
LTC:
https://tests.stockfishchess.org/tests/view/649b32f5dc7002ce609d20cf
Elo: 0.68 ± 1.5 (95%) LOS: 80.5%
Total: 40000 W: 10887 L: 10809 D: 18304
Ptnml(0-2): 36, 3938, 11958, 4048, 20
nElo: 1.50 ± 3.4 (95%) PairsRatio: 1.02
Passed VLTC 180+1.8:
https://tests.stockfishchess.org/tests/view/64992b43dc7002ce609cfd20
LLR: 3.06 (-2.94,2.94) <0.00,2.00>
Total: 38086 W: 10612 L: 10338 D: 17136
Ptnml(0-2): 9, 3316, 12115, 3598, 5
Passed VLTC SMP 60+0.6 th 8:
https://tests.stockfishchess.org/tests/view/649a21fedc7002ce609d0c7d
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 38936 W: 11091 L: 10820 D: 17025
Ptnml(0-2): 1, 2948, 13305, 3207, 7
closes https://github.com/official-stockfish/Stockfish/pull/4646
Bench: 2505168