Dev Builds » 20260504-0632

You are viewing an old NCM Stockfish dev build test. You may find the most recent dev build tests using Stockfish 15 as the baseline here.

NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
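As a rough illustration of how a win/loss/draw record maps to an Elo difference, the sketch below applies the standard logistic Elo formula. It is not NCM's own code: the function name `elo_from_wld` and the sample record are hypothetical, and NCM may compute its figures differently.

```python
import math


def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Logistic Elo difference implied by a win/loss/draw record."""
    games = wins + losses + draws
    if games == 0:
        raise ValueError("no games played")
    # Score fraction: a win counts 1 point, a draw half a point.
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo is unbounded for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Hypothetical 20,000-game record, not taken from any NCM test.
print(round(elo_from_wld(6000, 5000, 9000), 1))
```

A positive result means the first engine (here, the dev build) scored above 50%; swapping wins and losses flips the sign.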

Summary

Host	Duration	Avg Base NPS	Games	WLD	Standard Elo	Ptnml(0-2)	Gamepair Elo

Test Detail

ID	Host	Base NPS	Games	WLD	Standard Elo	Ptnml(0-2)	Gamepair Elo	CLI	PGN

Commit

Commit ID	ab8f901d25def6c17f83b7a1b4b84eaec404c0f4
Author	anematode
Date	2026-05-04 06:32:42 UTC
Remove small net

Failed VVLTC non-regression
https://tests.stockfishchess.org/tests/view/69d562b84088e069540a2288
LLR: -2.96 (-2.94,2.94) <-1.75,0.25>
Total: 386998 W: 99181 L: 99760 D: 188057
Ptnml(0-2): 35, 35792, 122429, 35203, 40

Failed STC non-regression
https://tests.stockfishchess.org/tests/view/69f3c6601e5788938e86a99e
LLR: -2.93 (-2.94,2.94) <-1.75,0.25>
Total: 33696 W: 8492 L: 8795 D: 16409
Ptnml(0-2): 124, 4209, 8504, 3868, 143

Many thanks to Dubslow, Torom, ces42, Shawn, vondele, Disservin and others for discussion.

## Summary

The venerable small net has been around for quite some time now, and while the big net architecture has substantially advanced with TI, the small net has stayed with plain HalfKA. It therefore presents a few burdens: multiple net architectures to maintain, multiple nets to train, and a whole lot of templates to deal with the variable L1 size.

Locally I measure a slowdown of -2.5% in NPS with this branch – and it's probably more on non-AVX512 architectures – but a pure slowdown of that magnitude would lead to more dramatic losses (even at VVLTC) than exhibited in the above tests, suggesting that the small net's lower eval quality is deleterious.

Bonus: Shawn found this interesting PGN among the VVLTC games: https://lichess.org/study/hvo8jflc/OeTOityv

`master` seems to misevaluate the fortress because all positions go to small net (the material difference is larger than the threshold).

closes https://github.com/official-stockfish/Stockfish/pull/6796

Bench: 2877007
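
The Ptnml(0-2) figures quoted in the commit message can be turned into a score and a rough error bar. The sketch below assumes the five counts are game pairs that scored 0, 0.5, 1, 1.5 and 2 points, and uses a plain normal-approximation 95% interval. That interval is an illustrative choice, not the SPRT/LLR calculation the test framework reports, and the helper names `score_to_elo` and `pentanomial_stats` are hypothetical.

```python
import math


def score_to_elo(score: float) -> float:
    """Logistic Elo difference for a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def pentanomial_stats(ptnml):
    """Return (mean per-game score, standard error) from five game-pair counts."""
    # Per-game score of a pair that earned 0, 0.5, 1, 1.5 or 2 points.
    pair_scores = (0.0, 0.25, 0.5, 0.75, 1.0)
    pairs = sum(ptnml)
    mean = sum(n * s for n, s in zip(ptnml, pair_scores)) / pairs
    variance = sum(n * (s - mean) ** 2 for n, s in zip(ptnml, pair_scores)) / pairs
    return mean, math.sqrt(variance / pairs)


# Ptnml(0-2) of the VVLTC test quoted in the commit message.
vvltc = (35, 35792, 122429, 35203, 40)
mean, stderr = pentanomial_stats(vvltc)
elo = score_to_elo(mean)
# Assumed 95% interval: mean score +/- 1.96 standard errors, converted to Elo.
low = score_to_elo(mean - 1.96 * stderr)
high = score_to_elo(mean + 1.96 * stderr)
print(f"score={mean:.6f} elo={elo:.2f} interval=({low:.2f}, {high:.2f})")
```

The mean score derived from the pair counts matches the one implied by the W/L/D totals of the same test; the pair counts additionally give a variance estimate that accounts for both games of a pair sharing an opening.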