Commit ID b939c805139e4b37f04fbf177f580c35ebe9f130
Author MichaelB7
Date 2021-07-24 16:04:59 UTC
Update the default net to nn-76a8a7ffb820.nnue.

Combined work by Sergio Vieri, Michael Byrne, and Jonathan D (aka SFisGod), based on top of previous developments, by restarts from good nets.

Sergio generated the net:

The initial net nn-d8609abe8caf.nnue was produced by generating around 16B of training data from the last master net nn-9e3c6298299a.nnue, then training on it, continuing from the master net, with lambda=0.2 and a sampling ratio of 1. Training started with LR=2e-3, and the LR was dropped by a factor of 0.5 until it reached LR=5e-4. in_scaling is set to 361. No other significant changes were made to the pytorch trainer.

Training data gen command (generates in chunks of 200k positions):

    generate_training_data min_depth 9 max_depth 11 count 200000 random_move_count 10 random_move_max_ply 80 random_multi_pv 12 random_multi_pv_diff 100 random_multi_pv_depth 8 write_min_ply 10 eval_limit 1500 book noob_3moves.epd output_file_name gendata/$(date +"%Y%m%d-%H%M")_${HOSTNAME}.binpack

PyTorch trainer command (note that this only trains for 20 epochs; train repeatedly until convergence):

    python --features "HalfKAv2^" --max_epochs 20 --smart-fen-skipping --random-fen-skipping 500 --batch-size 8192 --default_root_dir $dir --seed $RANDOM --threads 4 --num-workers 32 --gpus $gpuids --track_grad_norm 2 --gradient_clip_val 0.05 --lambda 0.2 --log_every_n_steps 50 $resumeopt $data $val

See for the scripts used to generate data.
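
The learning-rate schedule described for the initial net (start at 2e-3, multiply by 0.5 until 5e-4 is reached) can be written out as a short sketch. The function name halving_schedule is hypothetical and is not part of the pytorch trainer. The commit message does not say when each drop was applied, so the sketch only lists the sequence of values.

```python
def halving_schedule(start_lr, factor, final_lr):
    """Return the successive learning rates from start_lr down to final_lr."""
    if not 0.0 < factor < 1.0:
        raise ValueError("factor must be in (0, 1)")
    rates = []
    lr = start_lr
    # Small tolerance so floating-point rounding does not drop the last step.
    while lr >= final_lr * (1.0 - 1e-9):
        rates.append(lr)
        lr *= factor
    return rates


if __name__ == "__main__":
    # Values from the commit message: LR=2e-3, factor 0.5, final LR=5e-4.
    for lr in halving_schedule(2e-3, 0.5, 5e-4):
        print(f"{lr:.1e}")
```

With the values above the schedule has three stages, so the LR was halved twice between the first and the last stage.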

Based on that, Michael generated nn-76a8a7ffb820.nnue in the following way:

The net being submitted was trained with the pytorch trainer:

    python i:/bin/all.binpack i:/bin/all.binpack --gpus 1 --threads 4 --num-workers 30 --batch-size 16384 --progress_bar_refresh_rate 30 --smart-fen-skipping --random-fen-skipping 3 --features=HalfKAv2^ --auto_lr_find True --lambda=1.0 --max_epochs=240 --seed %random%%random% --default_root_dir exp/run_109 --resume-from-model ./pt/

This run is thus started from Sergio Vieri's net nn-d8609abe8caf.nnue.

all.binpack equaled 4 parts Wrong_NNUE_2.binpack plus two parts of Training_Data.binpack. Each set was concatenated together, making one large Wrong_NNUE_2 binpack and one large Training_Data binpack, so that they were approximately equal in size. They were then interleaved together. The idea was to give Wrong_NNUE.binpack closer to equal weighting with the Training_Data binpack.

modifications:

    loss = torch.pow(torch.abs(p - q), 2.6).mean()

LR = 8.0e-5 calculated as follows: 1.5e-3*(.992^360) - the idea here was to take a highly trained net and just use all.binpack as a finishing micro refinement touch for the last 2 Elo or so. This net was discovered on the 59th epoch.

    optimizer = ranger.Ranger(train_params, betas=(.90, 0.999), eps=1.0e-7, gc_loc=False, use_gc=False)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.992)

For this micro optimization, I had set the period to "5" in This changes the checkpoint output so that every 5th checkpoint file is created.

The final touches were to adjust the NNUE scale, as was done by Jonathan in tests running at the same time.

passed LTC
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 53040 W: 1732 L: 1575 D: 49733
Ptnml(0-2): 14, 1432, 23474, 1583, 17

passed STC
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 37928 W: 3178 L: 3001 D: 31749
Ptnml(0-2): 100, 2446, 13695, 2623, 100.

closes

Bench: 5169957
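
The modified loss raises the absolute error to the power 2.6 before averaging, rather than squaring it. A pure-Python equivalent of that one line may make the effect easier to see. It is only a sketch: power_loss is a hypothetical name, and p and q are assumed here to be equal-length sequences of floats, whereas in the trainer they are tensors.

```python
def power_loss(p, q, exponent=2.6):
    """Mean of |p_i - q_i| ** exponent over two equal-length sequences."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    if not p:
        raise ValueError("p and q must not be empty")
    return sum(abs(a - b) ** exponent for a, b in zip(p, q)) / len(p)


if __name__ == "__main__":
    # Illustrative values only, not taken from any training run.
    predicted = [0.50, 0.62, 0.35]
    target = [0.55, 0.60, 0.50]
    print("exponent 2.6:", power_loss(predicted, target))
    print("exponent 2.0:", power_loss(predicted, target, exponent=2.0))
```

For errors smaller than 1, the exponent 2.6 gives a smaller penalty than the squared error does, and the gap widens as the error shrinks.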
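
The LR figure in the commit message comes from compounding the StepLR decay: with step_size=1 and gamma=0.992 the rate is multiplied by 0.992 once per epoch. The arithmetic 1.5e-3*(.992^360) can be checked in plain Python without torch. The helper name decayed_lr is hypothetical.

```python
def decayed_lr(base_lr, gamma, epochs):
    """Learning rate after `epochs` multiplicative decays by `gamma`."""
    if epochs < 0:
        raise ValueError("epochs must be non-negative")
    return base_lr * gamma ** epochs


if __name__ == "__main__":
    # Values from the commit message: 1.5e-3 * (0.992 ^ 360).
    print(f"{decayed_lr(1.5e-3, 0.992, 360):.2e}")
```

The result lands slightly above 8e-5, in line with the rounded LR = 8.0e-5 quoted in the message.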
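
The W/L/D totals in the test results can be turned into a rough Elo estimate with the standard logistic formula. This is a sketch under assumptions: the function name elo_from_wld is hypothetical, the estimate ignores the pentanomial (Ptnml) statistics, and it is not how the LLR of the SPRT is computed.

```python
import math


def elo_from_wld(wins, losses, draws):
    """Point estimate of the Elo difference from a win/loss/draw record."""
    games = wins + losses + draws
    if games == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / games  # average score per game
    if score <= 0.0 or score >= 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    # Totals copied from the LTC and STC results in the commit message.
    print(f"LTC: {elo_from_wld(1732, 1575, 49733):+.2f} Elo")
    print(f"STC: {elo_from_wld(3178, 3001, 31749):+.2f} Elo")
```

Both records give small positive estimates, on the order of 1 to 2 Elo, which matches the description of this net as a micro refinement.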
