Update NNUE architecture to SFNNv8: L1-2560 nn-ac1dbea57aa3.nnue
Creating this net involved:
- a 6-stage training process from scratch. The datasets used in stages 1-5 were fully minimized.
- permuting L1 weights with https://github.com/official-stockfish/nnue-pytorch/pull/254
After each training stage, a strong epoch was chosen as the starting point for the next. The 6 stages were:
```
1. 400 epochs, lambda 1.0, default LR and gamma
UHOx2-wIsRight-multinet-dfrc-n5000 (135G)
nodes5000pv2_UHO.binpack
data_pv-2_diff-100_nodes-5000.binpack
wrongIsRight_nodes5000pv2.binpack
multinet_pv-2_diff-100_nodes-5000.binpack
dfrc_n5000.binpack
2. 800 epochs, end-lambda 0.75, LR 4.375e-4, gamma 0.995, skip 12
LeelaFarseer-T78juntoaugT79marT80dec.binpack (141G)
T60T70wIsRightFarseerT60T74T75T76.binpack
test78-junjulaug2022-16tb7p.no-db.min.binpack
test79-mar2022-16tb7p.no-db.min.binpack
test80-dec2022-16tb7p.no-db.min.binpack
3. 800 epochs, end-lambda 0.725, LR 4.375e-4, gamma 0.995, skip 20
leela93-v1-dfrc99-v2-T78juntosepT80jan-v6dd-T78janfebT79aprT80aprmay.min.binpack
leela93-filt-v1.min.binpack
dfrc99-16tb7p-filt-v2.min.binpack
test78-juntosep2022-16tb7p-filter-v6-dd.min-mar2023.binpack
test80-jan2023-3of3-16tb7p-filter-v6-dd.min-mar2023.binpack
test78-janfeb2022-16tb7p.min.binpack
test79-apr2022-16tb7p.min.binpack
test80-apr2022-16tb7p.min.binpack
test80-may2022-16tb7p.min.binpack
4. 800 epochs, end-lambda 0.7, LR 4.375e-4, gamma 0.995, skip 24
leela96-dfrc99-v2-T78juntosepT79mayT80junsepnovjan-v6dd-T80mar23-v6-T60novdecT77decT78aprmayT79aprT80may23.min.binpack
leela96-filt-v2.min.binpack
dfrc99-16tb7p-filt-v2.min.binpack
test78-juntosep2022-16tb7p-filter-v6-dd.min-mar2023.binpack
test79-may2022-16tb7p.filter-v6-dd.min.binpack
test80-jun2022-16tb7p.filter-v6-dd.min.binpack
test80-sep2022-16tb7p.filter-v6-dd.min.binpack
test80-nov2022-16tb7p.filter-v6-dd.min.binpack
test80-jan2023-3of3-16tb7p-filter-v6-dd.min-mar2023.binpack
test80-mar2023-2tb7p.v6-sk16.min.binpack
test60-novdec2021-16tb7p.min.binpack
test77-dec2021-16tb7p.min.binpack
test78-aprmay2022-16tb7p.min.binpack
test79-apr2022-16tb7p.min.binpack
test80-may2023-2tb7p.min.binpack
5. 960 epochs, end-lambda 0.7, LR 4.375e-4, gamma 0.995, skip 28
Increased max-epoch to 960 near the end of the first 800 epochs
5af11540bbfe dataset: https://github.com/official-stockfish/Stockfish/pull/4635
6. 1000 epochs, end-lambda 0.7, LR 4.375e-4, gamma 0.995, skip 28
Increased max-epoch to 1000 near the end of the first 800 epochs
1ee1aba5ed dataset: https://github.com/official-stockfish/Stockfish/pull/4782
```
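The per-stage hyperparameters above follow one pattern: the learning rate decays geometrically by gamma each epoch, and lambda (the weight on the search-eval target relative to the game result) is annealed from 1.0 toward the stage's end-lambda. A schematic sketch of those two schedules (not the exact nnue-pytorch code; the linear lambda interpolation in particular is an assumption):

```python
# Schematic sketch of the schedules implied by the stage parameters above.
# NOT the exact nnue-pytorch implementation; the linear lambda ramp is an assumption.

def lr_at_epoch(epoch: int, base_lr: float = 4.375e-4, gamma: float = 0.995) -> float:
    """Learning rate after geometric decay: lr = base_lr * gamma**epoch."""
    return base_lr * gamma ** epoch

def lambda_at_epoch(epoch: int, max_epochs: int,
                    start: float = 1.0, end: float = 0.7) -> float:
    """lambda blends the eval target (1.0) with the game result (0.0);
    here it is interpolated linearly from `start` to the stage's end-lambda."""
    t = epoch / max(1, max_epochs - 1)
    return start + (end - start) * t

# Stage 4 numbers: 800 epochs, end-lambda 0.7, LR 4.375e-4, gamma 0.995
print(f"epoch   0: lr={lr_at_epoch(0):.3e}  lambda={lambda_at_epoch(0, 800):.3f}")
print(f"epoch 799: lr={lr_at_epoch(799):.3e}  lambda={lambda_at_epoch(799, 800):.3f}")
```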
L1 weights permuted with:
```bash
python3 serialize.py $nnue $nnue_permuted \
--features=HalfKAv2_hm \
--ft_optimize \
--ft_optimize_data=/data/fishpack32.binpack \
--ft_optimize_count=10000
```
Speed measurements from 100 bench runs at depth 13, using a profile-build for x86-64-avx2:
```
sf_base = 1329051 +/- 2224 (95%)
sf_test = 1163344 +/- 2992 (95%)
diff = -165706 +/- 4913 (95%)
speedup = -12.46807% +/- 0.370% (95%)
```
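The derived lines above follow directly from the two measured means: diff = sf_test - sf_base and speedup = diff / sf_base (error propagation for the 95% intervals is omitted in this sketch; small last-digit differences come from rounding of the means):

```python
# Recompute the derived bench figures from the reported mean nodes/s.
sf_base = 1329051  # mean of 100 bench runs, baseline net
sf_test = 1163344  # mean of 100 bench runs, this net

diff = sf_test - sf_base
speedup = diff / sf_base

print(f"diff    = {diff}")
print(f"speedup = {speedup:.5%}")
```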
Training data can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move (vs. L1-2048 nn-1ee1aba5ed4c.nnue)
ep959 : 16.2 +/- 2.3
Failed 10+0.1 STC:
https://tests.stockfishchess.org/tests/view/6501beee2cd016da89abab21
LLR: -2.92 (-2.94,2.94) <0.00,2.00>
Total: 13184 W: 3285 L: 3535 D: 6364
Ptnml(0-2): 85, 1662, 3334, 1440, 71
Failed 180+1.8 VLTC:
https://tests.stockfishchess.org/tests/view/6505cf9a72620bc881ea908e
LLR: -2.94 (-2.94,2.94) <0.00,2.00>
Total: 64248 W: 16224 L: 16374 D: 31650
Ptnml(0-2): 26, 6788, 18640, 6650, 20
Passed 60+0.6 th 8 VLTC SMP (STC bounds):
https://tests.stockfishchess.org/tests/view/65084a4618698b74c2e541dc
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 90630 W: 23372 L: 23033 D: 44225
Ptnml(0-2): 13, 8490, 27968, 8833, 11
Passed 60+0.6 th 8 VLTC SMP:
https://tests.stockfishchess.org/tests/view/6501d45d2cd016da89abacdb
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 137804 W: 35764 L: 35276 D: 66764
Ptnml(0-2): 31, 13006, 42326, 13522, 17
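Each fishtest block reports W/L/D totals alongside the pentanomial game-pair distribution. A rough point estimate of the Elo difference can be recovered from the totals via the logistic model (this sketch ignores the pentanomial correlations that the SPRT itself accounts for):

```python
import math

def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Approximate Elo difference from match totals using the logistic model.
    Ignores game-pair (pentanomial) correlations, so it is only a rough estimate."""
    total = wins + losses + draws
    score = (wins + 0.5 * draws) / total
    return -400.0 * math.log10(1.0 / score - 1.0)

# Final VLTC SMP run above: Total: 137804 W: 35764 L: 35276 D: 66764
print(f"{elo_from_wld(35764, 35276, 66764):.2f} Elo")
```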
closes https://github.com/official-stockfish/Stockfish/pull/4795
bench 1246812