NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
Host | Duration | Avg Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo |
---|---|---|---|---|---|---|---|
ncm-dbt-01 | 02:55:56 | 1192962 | 1000 | 417 79 504 | +122.24 ± 9.28 | 0 9 153 329 9 | +274.22 ± 27.57 |
ncm-dbt-02 | 02:52:37 | 1213330 | 1000 | 447 86 467 | +131.34 ± 10.22 | 0 11 141 324 24 | +284.22 ± 28.81 |
ncm-dbt-03 | 02:53:16 | 1231246 | 1000 | 440 69 491 | +135.35 ± 9.0 | 0 6 128 355 11 | +315.35 ± 30.23 |
ncm-dbt-05 | 02:52:43 | 1201398 | 1000 | 451 67 482 | +140.62 ± 8.94 | 0 2 128 354 16 | +327.18 ± 30.08 |
ncm-dbt-06 | 02:56:46 | 1231718 | 1000 | 451 95 454 | +129.35 ± 9.25 | 0 7 142 339 12 | +293.29 ± 28.64 |
Total | | | 5000 | 2206 396 2398 | +131.74 ± 4.19 | 0 35 692 1701 72 | +298.08 ± 12.92 |
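The Elo columns can be related to the raw counts with the usual logistic Elo formula. The sketch below is an illustration, not NCM's published code: the function names are hypothetical, the WLD column is read as wins, losses, draws, and the Ptnml column is read as counts of game pairs scoring 0, 0.5, 1, 1.5 and 2 points. The ± margin is computed under the assumption that it is a 95% interval derived from the pentanomial variance, which is not stated on the page.

```python
import math


def elo_from_score(score: float) -> float:
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def standard_elo(wins: int, losses: int, draws: int) -> float:
    """Elo from per-game results: a draw counts as half a point."""
    games = wins + losses + draws
    return elo_from_score((wins + 0.5 * draws) / games)


def gamepair_elo(ptnml) -> float:
    """Elo from game pairs: a pair is won above 1 point, lost below, else drawn."""
    lost_0, lost_half, even, won_1_5, won_2 = ptnml
    pairs = sum(ptnml)
    return elo_from_score((won_1_5 + won_2 + 0.5 * even) / pairs)


def standard_elo_margin(ptnml, z: float = 1.96) -> float:
    """Half-width of the Elo interval (assumed: pentanomial variance, 95%)."""
    pairs = sum(ptnml)
    per_game = [0.0, 0.25, 0.5, 0.75, 1.0]  # pair score divided by 2
    mean = sum(n * x for n, x in zip(ptnml, per_game)) / pairs
    var = sum(n * (x - mean) ** 2 for n, x in zip(ptnml, per_game)) / pairs
    half = z * math.sqrt(var / pairs)
    return (elo_from_score(mean + half) - elo_from_score(mean - half)) / 2.0


# Totals row of the table above.
total_wld = (2206, 396, 2398)
total_ptnml = (0, 35, 692, 1701, 72)

print(round(standard_elo(*total_wld), 2))
print(round(gamepair_elo(total_ptnml), 2))
print(round(standard_elo_margin(total_ptnml), 2))
```

For the totals row, the three printed values should agree with the reported +131.74, +298.08 and ± 4.19 to within rounding. Gamepair Elo comes out much larger than Standard Elo because most pairs end 1.5 to 0.5 in favour of the dev build, so nearly every pair counts as a win even though half of the individual games are drawn.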
ID | Host | Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo |
---|---|---|---|---|---|---|---|
368833 | ncm-dbt-01 | 1181164 | 500 | 212 38 250 | +126.17 ± 13.24 | 0 6 68 172 4 | +288.06 ± 41.76 |
cutechess-cli -rounds 259 -games 2 -concurrency 9 -srand 838789782 -pgnout ncm-dbt-20240526-1824-010.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=64 option.Threads=2 -engine name=20240526-1824 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4d876275cf127b9e7cf91cef984deafa2abb47d9 -engine name=sf14 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:14
368832 | ncm-dbt-02 | 1196993 | 500 | 222 46 232 | +127.76 ± 14.1 | 0 3 80 155 12 | +273.0 ± 38.12 |
cutechess-cli -rounds 259 -games 2 -concurrency 9 -srand 3646926651 -pgnout ncm-dbt-20240526-1824-009.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=64 option.Threads=2 -engine name=20240526-1824 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4d876275cf127b9e7cf91cef984deafa2abb47d9 -engine name=sf14 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:14
368831 | ncm-dbt-06 | 1203554 | 500 | 230 43 227 | +136.56 ± 12.59 | 0 1 68 174 7 | +315.35 ± 41.44 |
cutechess-cli -rounds 259 -games 2 -concurrency 9 -srand 1285259460 -pgnout ncm-dbt-20240526-1824-008.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=64 option.Threads=2 -engine name=20240526-1824 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4d876275cf127b9e7cf91cef984deafa2abb47d9 -engine name=sf14 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:14
368830 | ncm-dbt-03 | 1237742 | 500 | 218 41 241 | +128.55 ± 13.56 | 0 5 70 168 7 | +288.06 ± 41.11 |
cutechess-cli -rounds 259 -games 2 -concurrency 9 -srand 2342868829 -pgnout ncm-dbt-20240526-1824-007.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=64 option.Threads=2 -engine name=20240526-1824 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4d876275cf127b9e7cf91cef984deafa2abb47d9 -engine name=sf14 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:14
368829 | ncm-dbt-05 | 1196118 | 500 | 227 36 237 | +139.81 ± 12.47 | 0 1 64 178 7 | +327.18 ± 42.83 |
cutechess-cli -rounds 259 -games 2 -concurrency 9 -srand 3708605781 -pgnout ncm-dbt-20240526-1824-006.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=64 option.Threads=2 -engine name=20240526-1824 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4d876275cf127b9e7cf91cef984deafa2abb47d9 -engine name=sf14 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:14
368828 | ncm-dbt-02 | 1229667 | 500 | 225 40 235 | +134.95 ± 14.82 | 0 8 61 169 12 | +295.94 ± 44.08 |
cutechess-cli -rounds 259 -games 2 -concurrency 9 -srand 2235850260 -pgnout ncm-dbt-20240526-1824-005.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=64 option.Threads=2 -engine name=20240526-1824 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4d876275cf127b9e7cf91cef984deafa2abb47d9 -engine name=sf14 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:14
368827 | ncm-dbt-06 | 1259883 | 500 | 221 52 227 | +122.24 ± 13.48 | 0 6 74 165 5 | +273.0 ± 39.95 |
cutechess-cli -rounds 259 -games 2 -concurrency 9 -srand 2986218125 -pgnout ncm-dbt-20240526-1824-004.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=64 option.Threads=2 -engine name=20240526-1824 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4d876275cf127b9e7cf91cef984deafa2abb47d9 -engine name=sf14 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:14
368826 | ncm-dbt-03 | 1224750 | 500 | 222 28 250 | +142.25 ± 11.73 | 0 1 58 187 4 | +346.12 ± 45.18 |
cutechess-cli -rounds 259 -games 2 -concurrency 9 -srand 504209726 -pgnout ncm-dbt-20240526-1824-003.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=64 option.Threads=2 -engine name=20240526-1824 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4d876275cf127b9e7cf91cef984deafa2abb47d9 -engine name=sf14 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:14
368825 | ncm-dbt-01 | 1204760 | 500 | 205 41 254 | +118.33 ± 13.0 | 0 3 85 157 5 | +261.07 ± 36.86 |
cutechess-cli -rounds 259 -games 2 -concurrency 9 -srand 4105442839 -pgnout ncm-dbt-20240526-1824-002.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=64 option.Threads=2 -engine name=20240526-1824 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4d876275cf127b9e7cf91cef984deafa2abb47d9 -engine name=sf14 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:14
368824 | ncm-dbt-05 | 1206679 | 500 | 224 31 245 | +141.44 ± 12.81 | 0 1 64 176 9 | +327.18 ± 42.83 |
cutechess-cli -rounds 259 -games 2 -concurrency 9 -srand 743102950 -pgnout ncm-dbt-20240526-1824-001.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=64 option.Threads=2 -engine name=20240526-1824 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4d876275cf127b9e7cf91cef984deafa2abb47d9 -engine name=sf14 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:14
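The ten commands above are identical apart from the `-srand` seed and the `-pgnout` file name. As a rough sketch of that structure, the Python below assembles the same argument list from those two values. The helper names `docker_engine` and `build_command` are hypothetical, every flag and value is copied from the commands listed above, and nothing is executed.

```python
import shlex

# Commit hash of the dev build under test, as shown in the commands above.
COMMIT = "4d876275cf127b9e7cf91cef984deafa2abb47d9"


def docker_engine(name: str, image: str) -> list:
    """Arguments for one engine that runs inside a Docker container."""
    return [
        "-engine", f"name={name}", "cmd=docker",
        "arg=run", "arg=-i", "arg=--rm",
        "arg=--entrypoint=/engine", f"arg={image}",
    ]


def build_command(srand: int, pgnout: str) -> list:
    """One run's cutechess-cli argument list; only seed and PGN name vary."""
    return [
        "cutechess-cli", "-rounds", "259", "-games", "2", "-concurrency", "9",
        "-srand", str(srand), "-pgnout", pgnout,
        "-openings", "file=UHO_4060_v2.epd", "format=epd", "order=random",
        "-repeat",
        "-resign", "movecount=3", "score=600",
        "-draw", "movenumber=34", "movecount=8", "score=5",
        "-each", "tc=30+0.3", "timemargin=10000", "proto=uci",
        "option.Hash=64", "option.Threads=2",
        *docker_engine("20240526-1824", f"dev_build:{COMMIT}"),
        *docker_engine("sf14", "stockfish:14"),
    ]


cmd = build_command(838789782, "ncm-dbt-20240526-1824-010.pgn")
print(shlex.join(cmd))  # single-line, shell-safe form of run 368833
```

Keeping the arguments as a list avoids shell quoting problems if the command is later passed to something like `subprocess.run`.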
Commit ID | 4d876275cf127b9e7cf91cef984deafa2abb47d9 |
---|---|
Author | Stéphane Nicolet |
Date | 2024-05-26 18:24:05 UTC |
Simplify material weights in evaluation
This patch uses the same material weights for the nnue
amplification term and the optimism term in evaluate().
STC:
LLR: 2.99 (-2.94,2.94) <-1.75,0.25>
Total: 83360 W: 21489 L: 21313 D: 40558
Ptnml(0-2): 303, 9934, 21056, 10058, 329
https://tests.stockfishchess.org/tests/view/664eee69928b1fb18de500d9
LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 192648 W: 48675 L: 48630 D: 95343
Ptnml(0-2): 82, 21484, 53161, 21501, 96
https://tests.stockfishchess.org/tests/view/664fa17aa86388d5e27d7d6e
closes https://github.com/official-stockfish/Stockfish/pull/5287
Bench: 1495602
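The `(-2.94,2.94)` bounds quoted in the STC and LTC lines of the commit message match the stopping thresholds of a sequential probability ratio test (SPRT) with 5% error rates. The sketch below shows that arithmetic. The values alpha = beta = 0.05 are an assumption inferred from the bounds, not something stated in the commit message, and `sprt_bounds` is a hypothetical helper name.

```python
import math


def sprt_bounds(alpha: float, beta: float):
    """Wald SPRT thresholds for the log-likelihood ratio (LLR).

    The test accepts H0 once LLR <= lower and accepts H1 once LLR >= upper.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


# Assumed error rates: 5% false positives, 5% false negatives.
lower, upper = sprt_bounds(0.05, 0.05)
print(f"({lower:.2f}, {upper:.2f})")  # prints (-2.94, 2.94)
```

Read this way, both tests stopped at the upper threshold (LLR 2.99 and 2.94), which means the simplification was accepted under the `<-1.75,0.25>` Elo bounds.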