Dev Builds » 20180714-0613

You are viewing an old NCM Stockfish dev build test. You may find the most recent dev build tests using Stockfish 15 as the baseline here.


NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
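As a rough illustration of how a win/loss/draw record maps to an Elo difference, here is a minimal sketch using the standard logistic Elo formula with a normal-approximation 95% interval. The function names are hypothetical and this is not NCM's actual tooling; the sample input is the LTC total quoted in the commit message on this page (W: 5852, L: 5588, D: 21343).

```python
import math


def score_to_elo(score: float) -> float:
    """Convert an expected score in (0, 1) to a logistic Elo difference."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


def elo_from_wld(wins: int, losses: int, draws: int, z: float = 1.96):
    """Return (elo, low, high) for a W/L/D record.

    The interval uses a normal approximation on the per-game score
    (win = 1, draw = 0.5, loss = 0); z = 1.96 gives roughly 95% coverage.
    """
    games = wins + losses + draws
    if games == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the score around its mean.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    stderr = math.sqrt(variance / games)
    return (
        score_to_elo(score),
        score_to_elo(score - z * stderr),
        score_to_elo(score + z * stderr),
    )


if __name__ == "__main__":
    # LTC totals from the commit message on this page.
    elo, low, high = elo_from_wld(wins=5852, losses=5588, draws=21343)
    print(f"Elo {elo:+.2f} (95% interval {low:+.2f} to {high:+.2f})")
```

This treats every game as independent, which is only an approximation when games are played in opening pairs; the Ptnml and Gamepair Elo columns exist for that case and are not covered by this sketch.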

Summary

Host	Duration	Avg Base NPS	Games	WLD	Standard Elo	Ptnml(0-2)	Gamepair Elo

Test Detail

ID	Host	Base NPS	Games	WLD	Standard Elo	Ptnml(0-2)	Gamepair Elo	CLI	PGN

Commit

Commit ID	d2d4e85f25061aacd65a8b458b79cad15b74a5bb
Author	candirufish
Date	2018-07-14 06:13:15 UTC
Tuned Values after 2 million spsa games

Various king and pawn eval values tuned after 2 million games. Rounding slightly adjusted.

LTC:
http://tests.stockfishchess.org/tests/view/5b477a260ebc5978f4be3ed4
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 32783 W: 5852 L: 5588 D: 21343

STC:
http://tests.stockfishchess.org/tests/view/5b472d420ebc5978f4be3e4d
LLR: 3.23 (-2.94,2.94) [0.00,4.00]
Total: 44380 W: 10201 L: 9841 D: 24338

I think I reached the limit of the fishtest framework. It frequently crashed at 2 million games already. The small values also moved a lot throughout the entire tuning session though with smaller margin. The passed danger and close enemies values seems the most sensitive (changing close enemies alone to 6 failed before but now it passes), whether or not they are close to optimal I don't know, but it seems some parameters are also correlated to others.

Closes https://github.com/official-stockfish/Stockfish/pull/1670

bench: 5103722