Dev Builds » 20250418-1232

You are viewing an old NCM Stockfish dev build test. The most recent dev build tests use Stockfish 15 as the baseline.


NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
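How a set of game results turns into an Elo estimate can be sketched as follows. This is a minimal illustration, assuming the conventional logistic Elo model and a normal approximation over game pairs; NCM's exact procedure is not stated on this page, and the function names are hypothetical. The input figures are the ones quoted for the preliminary LTC test in the commit message on this page.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by an average score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def standard_elo(wins, losses, draws):
    """Elo estimate from plain win/loss/draw counts."""
    games = wins + losses + draws
    return elo_from_score((wins + 0.5 * draws) / games)

def pentanomial_margin(ptnml, z=1.96):
    """Half-width of the Elo interval computed from game-pair counts.

    ptnml[i] is the number of game pairs that scored i/2 points
    (0, 0.5, 1, 1.5 or 2 points out of 2).
    """
    pairs = sum(ptnml)
    per_game = [i / 4.0 for i in range(5)]  # pair score rescaled to 0..1
    mean = sum(n * x for n, x in zip(ptnml, per_game)) / pairs
    var = sum(n * (x - mean) ** 2 for n, x in zip(ptnml, per_game)) / pairs
    stderr = math.sqrt(var / pairs)
    upper = elo_from_score(mean + z * stderr)
    lower = elo_from_score(mean - z * stderr)
    return (upper - lower) / 2.0

# Figures quoted in the commit message (preliminary LTC test).
elo = standard_elo(wins=6226, losses=6223, draws=11967)
margin = pentanomial_margin([12, 2497, 7185, 2504, 10])
print(f"Elo: {elo:.2f} +/- {margin:.1f}")
```

Using game pairs rather than individual games for the error estimate matters because the two games of a pair share an opening and are therefore correlated; treating them as independent would overstate the uncertainty.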

Summary

Host	Duration	Avg Base NPS	Games	WLD	Standard Elo	Ptnml(0-2)	Gamepair Elo

Test Detail

ID	Host	Base NPS	Games	WLD	Standard Elo	Ptnml(0-2)	Gamepair Elo	CLI	PGN

Commit

Commit ID	d2d046c2a497b2d70debde07ccc414ca633d550b
Author	pb00067
Date	2025-04-18 12:32:26 UTC
Improve stalemate detection during search

Currently SF's quiescence search, like that of most alpha-beta based engines, doesn't verify for stalemate, because doing so at each leaf position is too expensive and costs Elo. However, in certain positions this creates a blind spot for SF: it does not recognize soon enough that the opponent can reach a stalemate by sacrificing his last mobile heavy piece(s). This tactical motif and its measure are similar to zugzwang and verification search: the measure itself does not gain Elo, but prevents SF from losing or drawing games in an awkward way.

The fix consists of 3 measures:

1. Make qsearch verify for stalemate on transitions to pure KP material for the side to move, when our last rook/queen has just been captured. In fact this is the scenario where stalemate happens with the highest frequency. The stalemate verification itself is optimized by merely checking for pawn pushes and king mobility (captures were already tried by qsearch).

2. Another culprit for the issue turned out to be SEE-based pruning for checks in step 14. Here the move forcing the stalemate (or forcing the opponent not to retake) often gets pruned away, and it takes too much time to reach enough depth. To counter this we verify the following conditions:
   - a) the side to move is happy with a draw (alpha < 0)
   - b) we are about to sacrifice our last heavy and only mobile piece in this position
   - c) this piece doesn't move away from our king ring, giving the king a new square to move to
   When all 3 conditions are met we don't prune the move, because there is a good chance that capturing the piece means stalemate.

3. Store terminal nodes (mates and stalemates) in the TT with a higher depth than the searched depth. This prevents SF from:
   - reanalyzing the node (i.e. trying to generate legal moves) in vain at each iterative deepening step
   - overwriting an already correct draw evaluation from a previous shallow normal search with one from a qsearch which doesn't recognize stalemate and might store a very erratic evaluation
   This is due to the constant 4 in the TT-overwrite condition d - DEPTH_ENTRY_OFFSET + 2 * pv > depth8 - 4, which allows qs to override entries made by normal searches with depth <= 4.
   This 3rd measure, however, is not essential for fixing the issue, but tests (one of vdv's and one of mine) seem to suggest that it brings some small benefit.

Another position where SF benefits from this fix is, for instance:
Position FEN 8/8/8/1B6/6p1/8/3K1Ppp/3N2kr w - - 0 1 bm f4 +M9

P.S.: This issue also depends heavily on the net used, i.e. on how good the net is at evaluating such mobility-restricted positions. SF16 was pretty good at solving 2rr4/5pBk/PqP3p1/1N3pPp/1PQ1bP1P/8/3R4/R4K2 b - - 0 40 bm Rxc6 (< 1 second), while SF16_1, with the introduction of the dual net, needs about 1.5 minutes and SF17.1 requires 3 minutes to find the drawing move Rxc6.

P.S.2: Using more threads produces indeterminism, and using high hash pressure makes SF reevaluate explored positions more often, which makes it more likely to solve the position. To get stable, meaningful results I therefore tested with one single thread and low hash pressure.

Preliminary LTC test at 30k games
https://tests.stockfishchess.org/tests/view/67ece7a931d7cf8afdc44e18
Elo: 0.04 ± 2.0 (95%) LOS: 51.7%
Total: 24416 W: 6226 L: 6223 D: 11967
Ptnml(0-2): 12, 2497, 7185, 2504, 10
nElo: 0.09 ± 4.4 (95%) PairsRatio: 1.00

Passed LTC no-regression SPRT
https://tests.stockfishchess.org/tests/view/67ee8e4631d7cf8afdc452fb
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 401556 W: 101612 L: 101776 D: 198168
Ptnml(0-2): 152, 42241, 116170, 42049, 166

closes https://github.com/official-stockfish/Stockfish/pull/5983
fixes https://github.com/official-stockfish/Stockfish/issues/5899

Bench: 1721673
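
The depth clause of the TT-overwrite condition quoted in the third measure can be modelled in a few lines. This is only a sketch in Python, not Stockfish's C++ code: the function name is hypothetical, the value -3 for DEPTH_ENTRY_OFFSET is an assumption, and the other reasons an entry may be replaced (an exact bound, a different key, entry age) are deliberately left out.

```python
# Assumed value; the commit message only names the constant.
DEPTH_ENTRY_OFFSET = -3

def depth_clause_allows_overwrite(new_depth, stored_depth, pv,
                                  offset=DEPTH_ENTRY_OFFSET):
    """Model of: d - DEPTH_ENTRY_OFFSET + 2 * pv > depth8 - 4.

    new_depth    -- depth d of the search that wants to store a result
    stored_depth -- depth of the search that wrote the existing entry
    pv           -- True if the storing node is a PV node
    """
    depth8 = stored_depth - offset  # entries keep the depth shifted by the offset
    return new_depth - offset + 2 * int(pv) > depth8 - 4

QS_DEPTH = 0  # assumption: qsearch stores its results with depth 0

# A shallow normal-search entry can be replaced by a qsearch result...
print(depth_clause_allows_overwrite(QS_DEPTH, stored_depth=2, pv=False))
# ...while an entry stored with a much higher depth is protected.
print(depth_clause_allows_overwrite(QS_DEPTH, stored_depth=20, pv=False))
```

Under these assumptions the first call prints True and the second False, which is the effect the third measure relies on: a terminal node stored with an artificially high depth is not replaced by a later qsearch evaluation through this clause.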