Dev Builds » 20220704-1342

You are viewing an old NCM Stockfish dev build test. You may find the most recent dev build tests using Stockfish 15 as the baseline here.


NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.

Summary

| Host | Duration | Avg Base NPS | Games | W | L | D | Standard Elo | Ptnml(0-2) | Gamepair Elo |
|---|---|---|---|---|---|---|---|---|---|
| ncm-dbt-01 | 10:07:56 | 1289438 | 3376 | 1341 | 413 | 1622 | +98.02 ± 5.67 | 1 88 619 942 38 | +204.0 ± 13.7 |
| ncm-dbt-02 | 10:01:59 | 1286401 | 3324 | 1302 | 388 | 1634 | +98.06 ± 5.59 | 2 80 608 946 26 | +207.71 ± 13.82 |
| ncm-dbt-03 | 10:06:47 | 1321123 | 3364 | 1307 | 404 | 1653 | +95.6 ± 5.66 | 0 101 603 952 26 | +200.91 ± 13.88 |
| ncm-dbt-04 | 10:07:29 | 1293349 | 3382 | 1339 | 410 | 1633 | +97.95 ± 5.62 | 2 85 619 952 33 | +205.54 ± 13.7 |
| ncm-dbt-05 | 10:03:07 | 1300224 | 3366 | 1323 | 401 | 1642 | +97.66 ± 5.53 | 2 74 638 938 31 | +205.33 ± 13.47 |
| ncm-dbt-06 | 09:27:03 | 1214332 | 3188 | 1253 | 381 | 1554 | +97.51 ± 5.75 | 2 72 607 878 35 | +203.28 ± 13.81 |
| Total | | | 20000 | 7865 | 2397 | 9738 | +97.47 ± 2.3 | 9 500 3694 5608 189 | +204.46 ± 5.6 |
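The Standard Elo and Gamepair Elo figures above appear consistent with the usual logistic Elo formula applied to a score fraction. The following is a minimal sketch under two assumptions that this page does not state: that the logistic formula is the one used, and that a game pair counts as won when it scores more than 1 point out of 2. The function names are illustrative only.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by a score fraction in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def standard_elo(wins, losses, draws):
    """Elo from per-game results: a win scores 1, a draw 0.5."""
    games = wins + losses + draws
    return elo_from_score((wins + 0.5 * draws) / games)

def gamepair_elo(ptnml):
    """Elo from game pairs; ptnml holds counts of pairs scoring 0, 0.5, 1, 1.5, 2."""
    pair_losses = ptnml[0] + ptnml[1]   # assumed: pair lost if it scores below 1
    pair_draws = ptnml[2]
    pair_wins = ptnml[3] + ptnml[4]     # assumed: pair won if it scores above 1
    pairs = pair_losses + pair_draws + pair_wins
    return elo_from_score((pair_wins + 0.5 * pair_draws) / pairs)

# Totals reported in the summary: W=7865, L=2397, D=9738
print(round(standard_elo(7865, 2397, 9738), 2))           # 97.47
print(round(gamepair_elo([9, 500, 3694, 5608, 189]), 2))  # 204.46
```

Both values agree with the totals reported in the summary (+97.47 and +204.46), which supports the assumptions but does not confirm how NCM computes them.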

Test Detail

| ID | Host | Base NPS | Games | W | L | D | Standard Elo | Ptnml(0-2) | Gamepair Elo |
|---|---|---|---|---|---|---|---|---|---|
| 203588 | ncm-dbt-02 | 1192318 | 130 | 52 | 13 | 65 | +107.53 ± 25.55 | 0 2 22 41 0 | +240.82 ± 74.81 |
| 203587 | ncm-dbt-03 | 1228453 | 156 | 65 | 18 | 73 | +108.02 ± 24.85 | 0 4 23 51 0 | +242.22 ± 72.99 |
| 203586 | ncm-dbt-05 | 1222939 | 166 | 65 | 20 | 81 | +96.59 ± 25.34 | 0 4 32 45 2 | +199.32 ± 61.23 |
| 203585 | ncm-dbt-04 | 1219660 | 182 | 77 | 24 | 81 | +104.19 ± 21.72 | 0 2 35 53 1 | +225.71 ± 58.02 |
| 203584 | ncm-dbt-01 | 1194293 | 178 | 67 | 26 | 85 | +81.48 ± 26.28 | 0 7 37 42 3 | +158.49 ± 56.74 |
| 203583 | ncm-dbt-06 | 1217824 | 188 | 76 | 20 | 92 | +106.72 ± 22.04 | 0 3 33 57 1 | +232.85 ± 60.34 |
| 203582 | ncm-dbt-02 | 1193782 | 500 | 194 | 57 | 249 | +97.69 ± 14.28 | 0 12 93 141 4 | +206.01 ± 35.51 |
| 203581 | ncm-dbt-03 | 1237941 | 500 | 192 | 54 | 254 | +98.44 ± 14.44 | 0 15 84 149 2 | +211.87 ± 37.38 |
| 203580 | ncm-dbt-05 | 1233190 | 500 | 194 | 54 | 252 | +99.95 ± 13.36 | 0 7 99 141 3 | +213.85 ± 34.13 |
| 203579 | ncm-dbt-04 | 1232130 | 500 | 198 | 60 | 242 | +98.44 ± 14.89 | 1 10 96 136 7 | +204.07 ± 34.89 |
| 203578 | ncm-dbt-06 | 1208047 | 500 | 197 | 53 | 250 | +102.97 ± 14.64 | 0 11 91 141 7 | +213.85 ± 35.9 |
| 203577 | ncm-dbt-01 | 1207050 | 500 | 199 | 60 | 241 | +99.2 ± 15.47 | 0 15 90 136 9 | +200.24 ± 36.12 |
| 203576 | ncm-dbt-03 | 1251187 | 500 | 185 | 66 | 249 | +84.3 ± 14.47 | 0 17 99 132 2 | +176.33 ± 34.4 |
| 203575 | ncm-dbt-02 | 1202633 | 500 | 201 | 58 | 241 | +102.22 ± 13.86 | 0 11 87 150 2 | +221.9 ± 36.76 |
| 203574 | ncm-dbt-06 | 1226505 | 500 | 207 | 58 | 235 | +106.78 ± 14.82 | 1 8 90 143 8 | +223.94 ± 36.07 |
| 203573 | ncm-dbt-04 | 1210896 | 500 | 197 | 58 | 245 | +99.2 ± 14.6 | 0 12 93 139 6 | +206.01 ± 35.51 |
| 203572 | ncm-dbt-05 | 1208579 | 500 | 179 | 58 | 263 | +85.78 ± 14.94 | 0 17 100 128 5 | +174.55 ± 34.22 |
| 203571 | ncm-dbt-01 | 1215852 | 500 | 192 | 74 | 234 | +83.57 ± 15.43 | 0 21 95 129 5 | +169.27 ± 35.09 |
| 203570 | ncm-dbt-02 | 1196866 | 500 | 190 | 63 | 247 | +90.22 ± 15.03 | 0 15 100 128 7 | +181.7 ± 34.21 |
| 203569 | ncm-dbt-03 | 1240906 | 500 | 188 | 57 | 255 | +93.2 ± 14.51 | 0 15 92 140 3 | +196.45 ± 35.72 |
| 203568 | ncm-dbt-05 | 1223805 | 500 | 210 | 63 | 227 | +105.25 ± 14.04 | 0 9 90 146 5 | +223.94 ± 36.07 |
| 203567 | ncm-dbt-06 | 1209595 | 500 | 191 | 54 | 255 | +97.69 ± 14.58 | 0 11 98 134 7 | +200.24 ± 34.5 |
| 203566 | ncm-dbt-04 | 1198246 | 500 | 194 | 55 | 251 | +99.2 ± 14.75 | 0 14 88 143 5 | +207.95 ± 36.54 |
| 203565 | ncm-dbt-01 | 1204046 | 500 | 199 | 72 | 229 | +90.22 ± 14.6 | 1 13 97 136 3 | +190.85 ± 34.75 |
| 203564 | ncm-dbt-03 | 1233114 | 500 | 195 | 62 | 243 | +94.69 ± 14.69 | 0 13 97 134 6 | +194.57 ± 34.74 |
| 203563 | ncm-dbt-02 | 1193286 | 500 | 200 | 57 | 243 | +102.22 ± 14.02 | 0 12 85 151 2 | +221.9 ± 37.21 |
| 203562 | ncm-dbt-05 | 1228121 | 500 | 194 | 54 | 252 | +99.95 ± 14.15 | 0 13 86 149 2 | +215.85 ± 36.98 |
| 203561 | ncm-dbt-04 | 1197158 | 500 | 197 | 69 | 234 | +90.96 ± 14.76 | 0 16 94 136 4 | +189.0 ± 35.33 |
| 203560 | ncm-dbt-06 | 1213611 | 500 | 191 | 51 | 258 | +99.95 ± 14.31 | 0 11 93 141 5 | +209.91 ± 35.49 |
| 203559 | ncm-dbt-01 | 1191454 | 500 | 196 | 54 | 250 | +101.46 ± 14.01 | 0 8 98 138 6 | +211.87 ± 34.38 |
| 203558 | ncm-dbt-03 | 1230132 | 500 | 201 | 68 | 231 | +94.69 ± 14.98 | 0 18 84 145 3 | +200.24 ± 37.29 |
| 203557 | ncm-dbt-05 | 1208246 | 500 | 201 | 74 | 225 | +90.22 ± 15.17 | 2 11 101 130 6 | +187.16 ± 33.99 |
| 203556 | ncm-dbt-02 | 1192518 | 500 | 192 | 57 | 251 | +96.19 ± 15.7 | 2 14 87 141 6 | +202.15 ± 36.72 |
| 203555 | ncm-dbt-01 | 1204425 | 500 | 201 | 51 | 248 | +107.54 ± 13.89 | 0 10 83 154 3 | +234.38 ± 37.67 |
| 203554 | ncm-dbt-04 | 1231342 | 500 | 191 | 66 | 243 | +88.74 ± 14.28 | 0 14 100 133 3 | +185.33 ± 34.19 |
| 203553 | ncm-dbt-06 | 1215219 | 500 | 190 | 69 | 241 | +85.78 ± 14.22 | 0 16 98 135 1 | +181.7 ± 34.58 |
| 203552 | ncm-dbt-03 | 1222938 | 500 | 200 | 52 | 248 | +106.01 ± 15.26 | 0 16 76 152 6 | +223.94 ± 39.17 |
| 203551 | ncm-dbt-05 | 1215702 | 500 | 202 | 56 | 242 | +104.49 ± 13.55 | 0 7 94 145 4 | +223.94 ± 35.14 |
| 203550 | ncm-dbt-02 | 1192529 | 500 | 194 | 60 | 246 | +95.44 ± 14.25 | 0 12 96 138 4 | +200.24 ± 34.91 |
| 203549 | ncm-dbt-06 | 1209526 | 500 | 201 | 76 | 223 | +88.74 ± 14.86 | 1 12 104 127 6 | +181.7 ± 33.45 |
| 203548 | ncm-dbt-01 | 1208734 | 500 | 201 | 58 | 241 | +102.22 ± 15.08 | 0 13 89 140 8 | +209.91 ± 36.34 |
| 203547 | ncm-dbt-04 | 1226206 | 500 | 207 | 56 | 237 | +108.3 ± 14.21 | 0 10 84 151 5 | +232.26 ± 37.43 |
| 175855 | ncm-dbt-03 | 1441080 | 8 | 2 | 0 | 6 | +88.62 ± 93.84 | 0 0 2 2 0 | +190.67 ± 458.56 |
| 175849 | ncm-dbt-02 | 1442678 | 44 | 20 | 4 | 20 | +132.36 ± 45.38 | 0 0 7 14 1 | +289.19 ± 143.82 |
| 175847 | ncm-dbt-01 | 1439631 | 48 | 22 | 4 | 22 | +136.94 ± 35.23 | 0 0 6 18 0 | +337.98 ± 165.55 |
| 175846 | ncm-dbt-04 | 1423897 | 50 | 23 | 7 | 20 | +115.21 ± 47.81 | 0 1 8 15 1 | +240.82 ± 132.75 |
| 175845 | ncm-dbt-05 | 1443843 | 50 | 21 | 4 | 25 | +123.01 ± 42.3 | 0 1 6 18 0 | +288.06 ± 162.53 |
| 175844 | ncm-dbt-03 | 1440647 | 50 | 19 | 8 | 23 | +77.69 ± 50.32 | 0 2 11 11 1 | +147.19 ± 107.43 |
| 175843 | ncm-dbt-01 | 1436451 | 50 | 20 | 1 | 29 | +138.96 ± 41.07 | 0 0 7 17 1 | +315.3 ± 146.24 |
| 175842 | ncm-dbt-02 | 1446470 | 50 | 18 | 6 | 26 | +85.04 ± 41.74 | 0 1 11 13 0 | +181.7 ± 107.11 |
| 175841 | ncm-dbt-05 | 1436599 | 50 | 15 | 6 | 29 | +63.21 ± 52.66 | 0 3 11 10 1 | +115.23 ± 106.91 |
| 175840 | ncm-dbt-04 | 1426017 | 50 | 18 | 4 | 28 | +99.94 ± 52.02 | 0 3 5 17 0 | +219.87 ± 157.08 |
| 175839 | ncm-dbt-03 | 1444718 | 50 | 21 | 5 | 24 | +115.2 ± 36.62 | 0 0 9 16 0 | +263.38 ± 120.78 |
| 175838 | ncm-dbt-01 | 1442386 | 50 | 24 | 5 | 21 | +138.97 ± 34.19 | 0 0 6 19 0 | +346.06 ± 166.42 |
| 175837 | ncm-dbt-02 | 1450868 | 50 | 20 | 6 | 24 | +99.94 ± 42.49 | 0 1 9 15 0 | +219.87 ± 122.63 |
| 175836 | ncm-dbt-05 | 1438621 | 50 | 22 | 6 | 22 | +115.21 ± 61.06 | 0 2 8 12 3 | +200.24 ± 129.69 |
| 175835 | ncm-dbt-04 | 1427713 | 50 | 20 | 5 | 25 | +107.54 ± 60.64 | 1 1 6 16 1 | +240.82 ± 153.02 |
| 175834 | ncm-dbt-03 | 1445743 | 50 | 19 | 9 | 22 | +70.42 ± 45.16 | 0 1 14 9 1 | +130.94 ± 89.9 |
| 175833 | ncm-dbt-05 | 1442823 | 50 | 20 | 6 | 24 | +99.93 ± 36.88 | 0 0 11 14 0 | +219.84 ± 104.3 |
| 175832 | ncm-dbt-02 | 1446468 | 50 | 21 | 7 | 22 | +99.93 ± 36.88 | 0 0 11 14 0 | +219.84 ± 104.3 |
| 175831 | ncm-dbt-04 | 1433580 | 50 | 17 | 6 | 27 | +77.7 ± 45.94 | 0 2 10 13 0 | +164.07 ± 113.84 |
| 175830 | ncm-dbt-03 | 1436619 | 50 | 20 | 5 | 25 | +107.51 ± 47.72 | 0 0 12 11 2 | +200.21 ± 97.77 |
| 175829 | ncm-dbt-01 | 1439501 | 50 | 20 | 8 | 22 | +85.04 ± 41.74 | 0 1 11 13 0 | +181.7 ± 107.11 |
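The ± margins in the tables appear consistent with a 95% normal-approximation interval on the mean score, converted to Elo at both ends and halved. The sketch below assumes that method and that confidence level, neither of which is stated on this page. The helper names are hypothetical.

```python
import math

Z95 = 1.959964  # two-sided 95% normal quantile; the confidence level is an assumption

def elo_from_score(score):
    """Logistic Elo difference implied by a score fraction in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin(counts, values):
    """Half-width of the Elo interval for outcome counts with the given scores."""
    n = sum(counts)
    mean = sum(c * v for c, v in zip(counts, values)) / n
    var = sum(c * (v - mean) ** 2 for c, v in zip(counts, values)) / n
    half = Z95 * math.sqrt(var / n)  # interval half-width on the mean score
    # No guard for mean +/- half leaving (0, 1), which can happen on tiny samples.
    return (elo_from_score(mean + half) - elo_from_score(mean - half)) / 2.0

def standard_margin(ptnml):
    # Pair scores 0, 0.5, 1, 1.5, 2 rescaled to a per-game fraction
    return elo_margin(ptnml, [0.0, 0.25, 0.5, 0.75, 1.0])

def gamepair_margin(ptnml):
    # Collapse to pair losses, pair draws, pair wins (assumed definition)
    counts = [ptnml[0] + ptnml[1], ptnml[2], ptnml[3] + ptnml[4]]
    return elo_margin(counts, [0.0, 0.5, 1.0])

row = [0, 12, 93, 141, 4]  # Ptnml of test 203582
print(standard_margin(row), gamepair_margin(row))
```

For test 203582 this gives margins close to the reported ± 14.28 and ± 35.51. Because the variance is taken over game pairs rather than single games, the interval reflects the correlation between the two games of a pair.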

Commit

Commit ID 85f8ee6199f8578fbc082fc0f37e1985813e637a
Author Joost VandeVondele
Date 2022-07-04 13:42:34 UTC
Update default net to nn-3c0054ea9860.nnu

First things first... this PR is being made from court. Today, Tord and Stéphane, with broad support of the developer community are defending their complaint, filed in Munich, against ChessBase.

With their products Houdini 6 and Fat Fritz 2, both Stockfish derivatives, ChessBase violated repeatedly the Stockfish GPLv3 license. Tord and Stéphane have terminated their license with ChessBase permanently. Today we have the opportunity to present our evidence to the judge and enforce that termination. To read up, have a look at our blog post
https://stockfishchess.org/blog/2022/public-court-hearing-soon/
and
https://stockfishchess.org/blog/2021/our-lawsuit-against-chessbase/

This PR introduces a net trained with an enhanced data set and a modified loss function in the trainer. A slight adjustment for the scaling was needed to get a pass on standard chess.

passed STC:
https://tests.stockfishchess.org/tests/view/62c0527a49b62510394bd610
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 135008 W: 36614 L: 36152 D: 62242
Ptnml(0-2): 640, 15184, 35407, 15620, 653

passed LTC:
https://tests.stockfishchess.org/tests/view/62c17e459e7d9997a12d458e
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 28864 W: 8007 L: 7749 D: 13108
Ptnml(0-2): 47, 2810, 8466, 3056, 53

Local testing at a fixed 25k nodes resulted in
Test run1026/easy_train_data/experiments/experiment_2/training/run_0/nn-epoch799.nnue localElo: 4.2 +- 1.6

The real strength of the net is in FRC and DFRC chess where it gains significantly.
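The "LLR: 2.94 (-2.94,2.94)" lines report a sequential probability ratio test: the first number is the log-likelihood ratio reached, the pair in parentheses holds the stopping bounds, and the pair in angle brackets gives the Elo bounds of the two hypotheses. A minimal sketch of Wald's stopping bounds follows, assuming error rates alpha = beta = 0.05. The commit message does not state these rates; they are chosen here because they reproduce the ±2.94 shown.

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's SPRT stopping bounds for the log-likelihood ratio.

    alpha and beta are assumed type I and type II error rates.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper

def sprt_decision(llr, alpha=0.05, beta=0.05):
    """Classify a log-likelihood ratio against the stopping bounds."""
    lower, upper = sprt_bounds(alpha, beta)
    if llr >= upper:
        return "accept H1"   # test passes
    if llr <= lower:
        return "accept H0"   # test fails
    return "continue"        # keep playing games

lower, upper = sprt_bounds()
print(round(lower, 2), round(upper, 2))  # -2.94 2.94
```

A test that "passed" is one whose LLR reached the upper bound, which is why both the STC and LTC runs end at 2.94.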
Tested at STC with slightly different scaling:

FRC:
https://tests.stockfishchess.org/tests/view/62c13a4002ba5d0a774d20d4
Elo: 29.78 +-3.4 (95%) LOS: 100.0%
Total: 10000 W: 2007 L: 1152 D: 6841
Ptnml(0-2): 31, 686, 2804, 1355, 124
nElo: 59.24 +-6.9 (95%) PairsRatio: 2.06

DFRC:
https://tests.stockfishchess.org/tests/view/62c13a5702ba5d0a774d20d9
Elo: 55.25 +-3.9 (95%) LOS: 100.0%
Total: 10000 W: 2984 L: 1407 D: 5609
Ptnml(0-2): 51, 636, 2266, 1779, 268
nElo: 96.95 +-7.2 (95%) PairsRatio: 2.98

Tested at LTC with identical scaling:

FRC:
https://tests.stockfishchess.org/tests/view/62c26a3c9e7d9997a12d6caf
Elo: 16.20 +-2.5 (95%) LOS: 100.0%
Total: 10000 W: 1192 L: 726 D: 8082
Ptnml(0-2): 10, 403, 3727, 831, 29
nElo: 44.12 +-6.7 (95%) PairsRatio: 2.08

DFRC:
https://tests.stockfishchess.org/tests/view/62c26a539e7d9997a12d6cb2
Elo: 40.94 +-3.0 (95%) LOS: 100.0%
Total: 10000 W: 2215 L: 1042 D: 6743
Ptnml(0-2): 10, 410, 3053, 1451, 76
nElo: 92.77 +-6.9 (95%) PairsRatio: 2.64

This is due to the mixing in a significant fraction of DFRC training data in the final training round.
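The Elo, nElo and PairsRatio figures in these results can be recovered from the Ptnml counts alone. The sketch below assumes the logistic Elo formula, a normalized Elo defined as the score advantage divided by the per-game standard deviation and scaled by 800/ln(10), and a pairs ratio of won pairs to lost pairs. These definitions are inferred from the reported numbers rather than stated in the commit message, and the function name is illustrative.

```python
import math

def pentanomial_stats(ptnml):
    """Return (elo, nelo, pairs_ratio) from counts of pairs scoring 0, 0.5, 1, 1.5, 2."""
    values = [0.0, 0.25, 0.5, 0.75, 1.0]  # pair score rescaled to a per-game fraction
    pairs = sum(ptnml)
    mean = sum(c * v for c, v in zip(ptnml, values)) / pairs
    var = sum(c * (v - mean) ** 2 for c, v in zip(ptnml, values)) / pairs

    # Logistic Elo from the mean score
    elo = -400.0 * math.log10(1.0 / mean - 1.0)

    # Normalized Elo: advantage in units of the per-game standard deviation,
    # taking the per-game variance as twice the per-pair variance (assumption)
    sigma_per_game = math.sqrt(2.0 * var)
    nelo = (mean - 0.5) / sigma_per_game * 800.0 / math.log(10.0)

    # Pairs scoring above 1 point divided by pairs scoring below 1 point
    pairs_ratio = (ptnml[3] + ptnml[4]) / (ptnml[0] + ptnml[1])
    return elo, nelo, pairs_ratio

# FRC result at STC
elo, nelo, ratio = pentanomial_stats([31, 686, 2804, 1355, 124])
print(round(elo, 2), round(nelo, 2), round(ratio, 2))  # 29.78 59.24 2.06
```

Normalized Elo is less sensitive to the draw rate than logistic Elo, which is one reason the nElo gains at LTC stay close to the STC ones even though the logistic Elo gains shrink.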
The net is trained using the easy_train.py script in the following way:

```
python easy_train.py \
  --training-dataset=../Leela-dfrc_n5000.binpack \
  --experiment-name=2 \
  --nnue-pytorch-branch=vondele/nnue-pytorch/lossScan4 \
  --additional-training-arg=--param-index=2 \
  --start-lambda=1.0 \
  --end-lambda=0.75 \
  --gamma=0.995 \
  --lr=4.375e-4 \
  --start-from-engine-test-net True \
  --tui=False \
  --seed=$RANDOM \
  --max_epoch=800 \
  --auto-exit-timeout-on-training-finished=900 \
  --network-testing-threads 8 \
  --num-workers 12
```

where the data set used (Leela-dfrc_n5000.binpack) is a combination of our previous best data set (mix of Leela and some SF data) and DFRC data, interleaved to form:

The data is available in
https://drive.google.com/drive/folders/1S9-ZiQa_3ApmjBtl2e8SyHxj4zG4V8gG?usp=sharing
Leela mix: https://drive.google.com/file/d/1JUkMhHSfgIYCjfDNKZUMYZt6L5I7Ra6G/view?usp=sharing
DFRC: https://drive.google.com/file/d/17vDaff9LAsVo_1OfsgWAIYqJtqR8aHlm/view?usp=sharing

The training branch used is
https://github.com/vondele/nnue-pytorch/commits/lossScan4

A PR to the main trainer repo will be made later. This contains a revised loss function, now computing the loss from the score based on the win rate model, which is a more accurate representation than what we had before. Scaling constants are tweaked there as well.

closes https://github.com/official-stockfish/Stockfish/pull/4100

Bench: 5186781
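To give a feel for the schedule implied by the training command, the sketch below tabulates the learning rate and lambda per epoch. It rests on two assumptions about the trainer that the commit message does not spell out: that --gamma is a multiplicative learning-rate decay applied once per epoch, and that lambda is interpolated linearly from --start-lambda to --end-lambda over --max_epoch epochs. The function names are hypothetical and are not part of easy_train.py.

```python
def lr_at_epoch(epoch, lr=4.375e-4, gamma=0.995):
    """Learning rate after `epoch` decay steps (assumed: one step per epoch)."""
    return lr * gamma ** epoch

def lambda_at_epoch(epoch, start=1.0, end=0.75, max_epoch=800):
    """Linearly interpolated lambda (assumed interpolation scheme)."""
    return start + (end - start) * (epoch / max_epoch)

# Print the assumed schedule at a few points of the 800-epoch run
for epoch in (0, 200, 400, 600, 799):
    print(epoch, lr_at_epoch(epoch), lambda_at_epoch(epoch))
```

Under these assumptions the learning rate falls by roughly a factor of 50 over the run while lambda moves from 1.0 towards 0.75.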
Copyright 2011–2024 Next Chess Move LLC