Dev Builds » 20260519-1636

You are viewing an old NCM Stockfish dev build test. You may find the most recent dev build tests using Stockfish 15 as the baseline here.

NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
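As a rough illustration of how a win/loss/draw record maps to an Elo difference, the sketch below applies the standard logistic Elo formula. This is an assumption about how such a figure is typically derived, not NCM's documented method, and the function name `elo_from_wld` is hypothetical.

```python
import math


def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Estimate the Elo difference implied by a W/L/D record.

    Uses the standard logistic model: score = 1 / (1 + 10 ** (-elo / 400)).
    This is an illustrative assumption, not NCM's documented procedure.
    """
    games = wins + losses + draws
    if games == 0:
        raise ValueError("no games played")
    # A draw counts as half a point.
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    # W/L/D taken from the avx2/bmi2 STC result quoted in the commit message.
    print(round(elo_from_wld(5469, 5192, 10331), 2))
```

A point estimate like this says nothing about uncertainty, which is why a large fixed number of games is played before the result is reported.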

Summary

Host	Duration	Avg Base NPS	Games	WLD	Standard Elo	Ptnml(0-2)	Gamepair Elo

Test Detail

ID	Host	Base NPS	Games	WLD	Standard Elo	Ptnml(0-2)	Gamepair Elo	CLI	PGN

Commit

Commit ID	fff35786bf4a5941a017066842f6977398ddac7e
Author	anematode
Date	2026-05-19 16:36:11 UTC
Use bitset representation for nnz and move computation into feature transformer

Passed avx2/bmi2 STC:
https://tests.stockfishchess.org/tests/view/69ff99eb9392f0c317213eda
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 20992 W: 5469 L: 5192 D: 10331
Ptnml(0-2): 38, 2197, 5751, 2470, 40

Passed avxvnni STC:
https://tests.stockfishchess.org/tests/view/6a00022b9392f0c317213f7c
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 47328 W: 12328 L: 12009 D: 22991
Ptnml(0-2): 112, 5148, 12847, 5423, 134

Passed NEON STC:
https://tests.stockfishchess.org/tests/view/69ff99d69392f0c317213ed8
LLR: 2.96 (-2.94,2.94) <0.00,2.00>
Total: 29600 W: 7664 L: 7376 D: 14560
Ptnml(0-2): 48, 3074, 8277, 3344, 57

Passed avx512icl non-regression:
https://tests.stockfishchess.org/tests/view/6a002b699392f0c317213f91
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 90112 W: 23130 L: 22975 D: 44007
Ptnml(0-2): 192, 9939, 24633, 10106, 186

Measurements from vondele (Neoverse V2, armv8-dotprod):

==== master ====
==== Bench: 2344696 ====
1 Nodes/second : 280724660
2 Nodes/second : 280647282
3 Nodes/second : 282055192
Average (over 3): 281142378

==== 50a44640a3 ====
==== Bench: 2344696 ====
1 Nodes/second : 284271937
2 Nodes/second : 285638071
3 Nodes/second : 284349426
Average (over 3): 284753144

The patch's benefit is non-uniform and, unfortunately, is ~0 for avx512icl, although I think we should be able to find something there.

## Background/explanation

The idea here is to move the non-zero block computation back to the previous layer and overlap the work better. Then, to avoid emulating compress instructions on targets that do not support them (i.e., everything except AVX512), we use a `pop_lsb` loop on a bitset enumerating the non-zero blocks, rather than first writing them out as indices.

An early AVX2 implementation worked on fishtest (https://tests.stockfishchess.org/tests/view/69c3410534b6988b1e472db4) and linrock demonstrated that it also worked for NEON (https://tests.stockfishchess.org/tests/view/69ce0dc73ddc0eccd617188f). Thanks to him for encouraging me to push this idea over the finish line.

To abstract over the particular NNZ representation (bitset or index list), we use `NNZInfo` and `NNZCursor`. The cursor is used to write to one or the other perspective of the NNZ list. I'd appreciate ideas on making the code cleaner/more readable, as imo it's still a bit ugly.

## Fixing GCC 15 regression

Separately, vondele noticed a regression in ARM performance from GCC 15 caused by suboptimal codegen. See https://discord.com/channels/435943710472011776/813919248455827515/1502837886381461605 for more info, but the easiest fix that I could come up with was inserting a couple of `asm` optimization barriers. Single-threaded data:

```
average: stockfish.master.13.3 1041619
average: stockfish.master.15.2 999214
average: stockfish.patched.15.2 1031369
```

There are definitely regressions elsewhere as well, which we should chase down, but at least this should unblock the arm64 universal binary work.

closes https://github.com/official-stockfish/Stockfish/pull/6814

No functional change
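The `pop_lsb` loop described in the commit message can be sketched in a few lines. The following is a minimal illustration of the general technique in Python, not the Stockfish implementation (which is SIMD C++): one bit is set per non-zero block, and the set bits are then visited directly without first materializing an index list. The helper names `nnz_bitset` and `iter_nonzero_blocks` are hypothetical.

```python
def pop_lsb(bits):
    """Return (index of the lowest set bit, bits with that bit cleared)."""
    if bits == 0:
        raise ValueError("pop_lsb called on an empty bitset")
    lowest = bits & -bits              # isolate the lowest set bit
    return lowest.bit_length() - 1, bits & (bits - 1)


def nnz_bitset(blocks):
    """Build a bitset with bit i set when block i has any non-zero value.

    Hypothetical helper: in a real engine this mask would come from SIMD
    compare/movemask-style operations rather than a Python loop.
    """
    bits = 0
    for i, block in enumerate(blocks):
        if any(value != 0 for value in block):
            bits |= 1 << i
    return bits


def iter_nonzero_blocks(bits):
    """Yield indices of non-zero blocks in ascending order via pop_lsb."""
    while bits:
        index, bits = pop_lsb(bits)
        yield index


if __name__ == "__main__":
    blocks = [[0, 0, 0, 0], [0, 3, 0, 0], [0, 0, 0, 0], [7, 0, 0, 1]]
    mask = nnz_bitset(blocks)
    for index in iter_nonzero_blocks(mask):
        print(index, blocks[index])
```

The loop does work proportional to the number of set bits, which is the property that makes a bitset attractive when a hardware compress instruction is unavailable.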