Dev Builds » 20220704-1342

You are viewing an old NCM Stockfish dev build test. The most recent dev build tests use Stockfish 15 as the baseline.

NCM plays each Stockfish dev build 20,000 times against Stockfish 7. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.

Summary

Host       Duration  Avg Base NPS  Games  Wins   Losses  Draws  Elo
ncm-et-3   11:14:38  1961026       3384   2894   9       481    +439.64 ± 15.54
ncm-et-4   11:14:32  1885221       3210   2802   14      394    +461.08 ± 17.2
ncm-et-9   11:14:45  1960848       3354   2883   7       464    +446.02 ± 15.83
ncm-et-10  11:10:57  1958836       3350   2873   8       469    +443.08 ± 15.74
ncm-et-13  11:14:55  1953515       3337   2864   6       467    +444.68 ± 15.77
ncm-et-15  11:14:38  1956924       3365   2915   2       448    +457.07 ± 16.09
Total                              20000  17231  46      2723   +448.36 ± 6.52
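
The Elo column can be approximated from the win, loss and draw counts alone. The sketch below assumes the standard logistic Elo formula and a 95% normal-approximation interval on the mean score; NCM's exact method is not stated on this page, and the function names are illustrative. Under those assumptions the totals row comes out close to the published +448.36 ± 6.52.

```python
import math


def elo_from_score(score: float) -> float:
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins: int, losses: int, draws: int, z: float = 1.96):
    """Return (elo, margin) for a win/loss/draw record.

    Assumption: a normal-approximation interval on the mean score is mapped
    through the Elo formula, and the margin is half the interval width.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    half_width = z * math.sqrt(variance / games)
    lo = elo_from_score(score - half_width)
    hi = elo_from_score(score + half_width)
    return elo_from_score(score), (hi - lo) / 2.0


# Totals row of the summary table.
elo, margin = elo_with_margin(wins=17231, losses=46, draws=2723)
print(f"{elo:+.2f} ± {margin:.2f}")
```

With a score this close to 100%, a small change in the draw or loss count moves the Elo estimate noticeably, which is why the 500-game rows in the detail table carry margins of roughly ±40.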

Test Detail

ID Host Started (UTC) Duration Base NPS Games Wins Losses Draws Elo
159326 ncm-et-4 2022-07-05 03:45 00:43:34 1934004 210 190 2 18 +502.98 ± 84.84
159325 ncm-et-13 2022-07-05 03:20 01:08:30 1950091 337 299 0 38 +489.46 ± 56.78
159324 ncm-et-9 2022-07-05 03:18 01:11:11 1959721 354 305 0 49 +451.47 ± 49.57
159323 ncm-et-10 2022-07-05 03:17 01:11:10 1952854 350 305 1 44 +461.13 ± 52.59
159322 ncm-et-15 2022-07-05 03:16 01:12:26 1962611 365 317 0 48 +461.01 ± 50.13
159321 ncm-et-3 2022-07-05 03:11 01:17:50 1947670 384 334 0 50 +462.85 ± 49.08
159320 ncm-et-4 2022-07-05 02:03 01:41:53 1922845 500 431 2 67 +446.7 ± 42.25
159319 ncm-et-13 2022-07-05 01:37 01:42:29 1950019 500 436 1 63 +463.16 ± 43.6
159318 ncm-et-9 2022-07-05 01:37 01:39:45 1965845 500 435 0 65 +463.15 ± 42.83
159317 ncm-et-10 2022-07-05 01:36 01:40:48 1963686 500 425 0 75 +436.43 ± 39.73
159316 ncm-et-15 2022-07-05 01:34 01:41:20 1956028 500 430 0 70 +449.35 ± 41.19
159315 ncm-et-3 2022-07-05 01:31 01:39:40 1967543 500 419 1 80 +419.61 ± 38.47
159314 ncm-et-4 2022-07-05 00:16 01:46:24 1871765 500 434 3 63 +452.04 ± 43.62
159313 ncm-et-10 2022-07-04 23:57 01:38:34 1970817 500 434 0 66 +460.32 ± 42.48
159312 ncm-et-9 2022-07-04 23:56 01:41:01 1956485 500 437 0 63 +468.95 ± 43.54
159311 ncm-et-13 2022-07-04 23:54 01:42:49 1953168 500 423 1 76 +429.05 ± 39.52
159310 ncm-et-15 2022-07-04 23:54 01:39:49 1960481 500 437 0 63 +468.95 ± 43.54
159309 ncm-et-3 2022-07-04 23:51 01:38:27 1960013 500 432 0 68 +454.76 ± 41.82
159308 ncm-et-4 2022-07-04 22:28 01:47:13 1848590 500 437 4 59 +457.52 ± 45.06
159307 ncm-et-10 2022-07-04 22:16 01:39:51 1959545 500 433 4 63 +446.7 ± 43.57
159306 ncm-et-13 2022-07-04 22:15 01:38:24 1961999 500 425 2 73 +431.48 ± 40.4
159305 ncm-et-9 2022-07-04 22:14 01:40:33 1963840 500 432 1 67 +452.04 ± 42.22
159304 ncm-et-15 2022-07-04 22:12 01:41:15 1956945 500 437 0 63 +468.95 ± 43.54
159303 ncm-et-3 2022-07-04 22:10 01:40:57 1962609 500 414 1 85 +408.38 ± 37.27
159302 ncm-et-4 2022-07-04 20:35 01:52:12 1781689 500 434 0 66 +460.32 ± 42.48
159301 ncm-et-13 2022-07-04 20:33 01:41:02 1945733 500 441 1 58 +477.99 ± 45.54
159300 ncm-et-9 2022-07-04 20:33 01:40:43 1956485 500 431 1 68 +449.35 ± 41.89
159299 ncm-et-15 2022-07-04 20:32 01:39:15 1966462 500 425 1 74 +433.94 ± 40.08
159298 ncm-et-3 2022-07-04 20:32 01:37:33 1967386 500 434 1 65 +457.52 ± 42.89
159297 ncm-et-10 2022-07-04 20:31 01:41:17 1951934 500 423 1 76 +429.05 ± 39.52
159296 ncm-et-13 2022-07-04 18:53 01:40:00 1956330 500 425 1 74 +433.94 ± 40.08
159295 ncm-et-15 2022-07-04 18:52 01:39:03 1960310 500 436 1 63 +463.16 ± 43.6
159294 ncm-et-9 2022-07-04 18:50 01:42:24 1957250 500 421 4 75 +417.32 ± 39.84
159293 ncm-et-4 2022-07-04 18:49 01:44:45 1869346 500 437 3 60 +460.32 ± 44.73
159292 ncm-et-10 2022-07-04 18:49 01:41:11 1949484 500 427 0 73 +441.5 ± 40.29
159291 ncm-et-3 2022-07-04 18:49 01:42:09 1954117 500 425 2 73 +431.48 ± 40.4
159290 ncm-et-9 2022-07-04 17:10 01:39:08 1966310 500 422 1 77 +426.65 ± 39.25
159289 ncm-et-15 2022-07-04 17:10 01:41:30 1935637 500 433 0 67 +457.52 ± 42.15
159288 ncm-et-3 2022-07-04 17:10 01:38:02 1967850 500 436 4 60 +454.76 ± 44.68
159287 ncm-et-13 2022-07-04 17:10 01:41:41 1957266 500 415 0 85 +412.8 ± 37.2
159286 ncm-et-4 2022-07-04 17:10 01:38:31 1968314 500 439 0 61 +474.93 ± 44.28
159285 ncm-et-10 2022-07-04 17:10 01:38:06 1963532 500 426 2 72 +433.94 ± 40.69

Commit

Commit ID 85f8ee6199f8578fbc082fc0f37e1985813e637a
Author Joost VandeVondele
Date 2022-07-04 13:42:34 UTC
Update default net to nn-3c0054ea9860.nnu

First things first... this PR is being made from court. Today, Tord and Stéphane, with broad support of the developer community are defending their complaint, filed in Munich, against ChessBase. With their products Houdini 6 and Fat Fritz 2, both Stockfish derivatives, ChessBase violated repeatedly the Stockfish GPLv3 license. Tord and Stéphane have terminated their license with ChessBase permanently. Today we have the opportunity to present our evidence to the judge and enforce that termination. To read up, have a look at our blog post https://stockfishchess.org/blog/2022/public-court-hearing-soon/ and https://stockfishchess.org/blog/2021/our-lawsuit-against-chessbase/

This PR introduces a net trained with an enhanced data set and a modified loss function in the trainer. A slight adjustment for the scaling was needed to get a pass on standard chess.

passed STC:
https://tests.stockfishchess.org/tests/view/62c0527a49b62510394bd610
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 135008 W: 36614 L: 36152 D: 62242
Ptnml(0-2): 640, 15184, 35407, 15620, 653

passed LTC:
https://tests.stockfishchess.org/tests/view/62c17e459e7d9997a12d458e
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 28864 W: 8007 L: 7749 D: 13108
Ptnml(0-2): 47, 2810, 8466, 3056, 53

Local testing at a fixed 25k nodes resulted in
Test run1026/easy_train_data/experiments/experiment_2/training/run_0/nn-epoch799.nnue localElo: 4.2 +- 1.6

The real strength of the net is in FRC and DFRC chess where it gains significantly.

Tested at STC with slightly different scaling:

FRC:
https://tests.stockfishchess.org/tests/view/62c13a4002ba5d0a774d20d4
Elo: 29.78 +-3.4 (95%) LOS: 100.0%
Total: 10000 W: 2007 L: 1152 D: 6841
Ptnml(0-2): 31, 686, 2804, 1355, 124
nElo: 59.24 +-6.9 (95%) PairsRatio: 2.06

DFRC:
https://tests.stockfishchess.org/tests/view/62c13a5702ba5d0a774d20d9
Elo: 55.25 +-3.9 (95%) LOS: 100.0%
Total: 10000 W: 2984 L: 1407 D: 5609
Ptnml(0-2): 51, 636, 2266, 1779, 268
nElo: 96.95 +-7.2 (95%) PairsRatio: 2.98

Tested at LTC with identical scaling:

FRC:
https://tests.stockfishchess.org/tests/view/62c26a3c9e7d9997a12d6caf
Elo: 16.20 +-2.5 (95%) LOS: 100.0%
Total: 10000 W: 1192 L: 726 D: 8082
Ptnml(0-2): 10, 403, 3727, 831, 29
nElo: 44.12 +-6.7 (95%) PairsRatio: 2.08

DFRC:
https://tests.stockfishchess.org/tests/view/62c26a539e7d9997a12d6cb2
Elo: 40.94 +-3.0 (95%) LOS: 100.0%
Total: 10000 W: 2215 L: 1042 D: 6743
Ptnml(0-2): 10, 410, 3053, 1451, 76
nElo: 92.77 +-6.9 (95%) PairsRatio: 3.64

This is due to the mixing in a significant fraction of DFRC training data in the final training round.
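
The Elo and PairsRatio figures in the FRC and DFRC results above can be recomputed from the Ptnml counts. The sketch below assumes the usual fishtest reading of Ptnml(0-2): five counts of game pairs ordered by pair score 0, 0.5, 1, 1.5 and 2, with PairsRatio taken as pairs won over pairs lost. The function name is illustrative, not part of fishtest.

```python
import math


def pentanomial_stats(ptnml):
    """Return (elo, pairs_ratio) from game-pair counts ordered by pair score."""
    pairs = sum(ptnml)
    # Pair scores run 0, 0.5, 1, 1.5, 2; each pair is two games, so the
    # mean per-game score is total points divided by 2 * pairs.
    points = sum(count * 0.5 * i for i, count in enumerate(ptnml))
    score = points / (2 * pairs)
    elo = -400.0 * math.log10(1.0 / score - 1.0)
    # Pairs won (score 1.5 or 2) divided by pairs lost (score 0 or 0.5).
    pairs_ratio = (ptnml[3] + ptnml[4]) / (ptnml[0] + ptnml[1])
    return elo, pairs_ratio


# Ptnml counts quoted above for the STC FRC test.
elo, ratio = pentanomial_stats([31, 686, 2804, 1355, 124])
print(f"Elo: {elo:.2f}  PairsRatio: {ratio:.2f}")
```

The nElo figures are not reproduced here because they also depend on the variance of the pair scores.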

The net is trained using the easy_train.py script in the following way:

```
python easy_train.py \
  --training-dataset=../Leela-dfrc_n5000.binpack \
  --experiment-name=2 \
  --nnue-pytorch-branch=vondele/nnue-pytorch/lossScan4 \
  --additional-training-arg=--param-index=2 \
  --start-lambda=1.0 \
  --end-lambda=0.75 \
  --gamma=0.995 \
  --lr=4.375e-4 \
  --start-from-engine-test-net True \
  --tui=False \
  --seed=$RANDOM \
  --max_epoch=800 \
  --auto-exit-timeout-on-training-finished=900 \
  --network-testing-threads 8 \
  --num-workers 12
```

where the data set used (Leela-dfrc_n5000.binpack) is a combination of our previous best data set (mix of Leela and some SF data) and DFRC data, interleaved to form:

The data is available in https://drive.google.com/drive/folders/1S9-ZiQa_3ApmjBtl2e8SyHxj4zG4V8gG?usp=sharing
Leela mix: https://drive.google.com/file/d/1JUkMhHSfgIYCjfDNKZUMYZt6L5I7Ra6G/view?usp=sharing
DFRC: https://drive.google.com/file/d/17vDaff9LAsVo_1OfsgWAIYqJtqR8aHlm/view?usp=sharing

The training branch used is https://github.com/vondele/nnue-pytorch/commits/lossScan4
A PR to the main trainer repo will be made later. This contains a revised loss function, now computing the loss from the score based on the win rate model, which is a more accurate representation than what we had before. Scaling constants are tweaked there as well.

closes https://github.com/official-stockfish/Stockfish/pull/4100

Bench: 5186781
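
To get a feel for what the --lr, --gamma, --start-lambda, --end-lambda and --max_epoch values in the training command imply, the sketch below tabulates a possible schedule. It rests on two assumptions the commit message does not confirm: that gamma multiplies the learning rate once per epoch, and that lambda is interpolated linearly between its start and end values over the run. The helper name is hypothetical.

```python
def schedule(epoch, max_epoch=800, lr0=4.375e-4, gamma=0.995,
             start_lambda=1.0, end_lambda=0.75):
    """Learning rate and lambda at a given epoch (hypothetical helper).

    Assumptions: exponential per-epoch learning-rate decay by gamma, and
    linear interpolation of lambda from start_lambda to end_lambda.
    """
    lr = lr0 * gamma ** epoch
    lam = start_lambda + (end_lambda - start_lambda) * epoch / max_epoch
    return lr, lam


# Print the assumed schedule at the start, middle and last epoch of the run.
for epoch in (0, 400, 799):
    lr, lam = schedule(epoch)
    print(f"epoch {epoch:3d}: lr={lr:.3e} lambda={lam:.3f}")
```

Under these assumptions the learning rate falls by a factor of roughly 55 over 800 epochs, while lambda moves from 1.0 towards 0.75.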
Copyright 2011–2024 Next Chess Move LLC