Dev Builds » 20140405-0926

You are viewing an old NCM Stockfish dev build test. You may find the most recent dev build tests using Stockfish 15 as the baseline here.

NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
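As a rough illustration of how a win/loss/draw record maps to an Elo difference, the sketch below uses the standard logistic Elo formula. This is an assumption about the general approach, not a description of NCM's own tooling; the function names are hypothetical, and the example record is the STC total quoted in the commit message on this page.

```python
import math


def score_fraction(wins, losses, draws):
    """Fraction of available points scored (a draw counts as half a point)."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games


def elo_from_wld(wins, losses, draws):
    """Elo difference implied by a W/L/D record under the logistic Elo model.

    Only defined when the score fraction lies strictly between 0 and 1.
    """
    score = score_fraction(wins, losses, draws)
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    # STC record from the commit message: W: 3132 L: 3005 D: 10914
    print(f"score = {score_fraction(3132, 3005, 10914):.4f}")
    print(f"elo   = {elo_from_wld(3132, 3005, 10914):+.2f}")
```

A score fraction just above 0.5 corresponds to a small positive Elo difference, which is why many thousands of games are needed before the estimate becomes meaningful.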

Summary

Host	Duration	Avg Base NPS	Games	WLD	Standard Elo	Ptnml(0-2)	Gamepair Elo

Test Detail

ID	Host	Base NPS	Games	WLD	Standard Elo	Ptnml(0-2)	Gamepair Elo	CLI	PGN

Commit

Commit ID	be641e881fdfdf3354453381f832fe7822e7c731
Author	Lucas Braesch
Date	2014-04-05 09:26:44 UTC
Remove QueenOn7th and QueenOnPawn

Small simplification. Passed SPRT(-3,1) at both STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 17051 W: 3132 L: 3005 D: 10914

and LTC:
LLR: 4.55 (-2.94,2.94) [-3.00,1.00]
Total: 24890 W: 3842 L: 3646 D: 17402

The rationale behind this is that I've never managed to add a Queen on 7th rank bonus in DiscoCheck, because it never proved positive (even slightly) in testing. The only thing that worked is Rook on 7th rank.

In terms of SF code, it seemed natural to group it with QueenOnPawn as well, as those are done together. I know you're against grouping in general, but when it comes to non-regression tests, you are being more conservative by grouping. If the group passes SPRT(-3,1), it's safer to commit than to test every component in SPRT(-3,1) and end up with the risk of committing several -1 Elo regressions instead of just one -1 Elo regression.

In chess terms, perhaps it's just easier to maneuver a Queen (which can also move diagonally) than a Rook. Therefore you can let the search do its job without needing ad-hoc eval terms to guide it. For the Rook, which takes more moves to maneuver, such eval terms can be (marginally) useful.

bench: 7473314
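The (-2.94,2.94) interval quoted next to each LLR is consistent with the usual Wald SPRT stopping bounds when both error rates are 5%. The sketch below reproduces those bounds and applies the stopping rule to the two reported LLR values. The 5% error rates are an assumption (the commit message does not state them), and the function names are hypothetical. It does not recompute the LLR itself, which depends on the Elo model used by the testing framework.

```python
import math


def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald SPRT stopping bounds on the log-likelihood ratio.

    alpha: assumed probability of accepting H1 when H0 is true.
    beta:  assumed probability of accepting H0 when H1 is true.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


def sprt_decision(llr, lower, upper):
    """Apply the stopping rule to an observed log-likelihood ratio."""
    if llr >= upper:
        return "accept H1 (test passes)"
    if llr <= lower:
        return "accept H0 (test fails)"
    return "continue testing"


if __name__ == "__main__":
    lower, upper = sprt_bounds()
    print(f"bounds: ({lower:.2f}, {upper:.2f})")
    # LLR values reported in the commit message for STC and LTC.
    for label, llr in (("STC", 2.95), ("LTC", 4.55)):
        print(label, sprt_decision(llr, lower, upper))
```

For a non-regression test such as SPRT(-3,1), H0 is the lower Elo bound (-3) and H1 the upper bound (1), so crossing the upper LLR bound means the simplification is accepted as not losing a meaningful amount of strength.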