Dev Builds » 20140714-2314

You are viewing an old NCM Stockfish dev build test. You may find the most recent dev build tests using Stockfish 15 as the baseline here.

NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.

Summary

Host	Duration	Avg Base NPS	Games	WLD	Standard Elo	Ptnml(0-2)	Gamepair Elo

Test Detail

ID	Host	Base NPS	Games	WLD	Standard Elo	Ptnml(0-2)	Gamepair Elo	CLI	PGN

Commit

Commit ID	67a5e1ecf97eae3c74f5c84ebdbb7e3719bf90bb
Author	lucasart
Date	2014-07-14 23:14:58 UTC
Contempt = 20

Also raise the admissible bounds to (-100,100), as there is no reason to prevent users from using high values if they want to.

Does not regress in self play:
ELO: 0.10 +-2.0 (95%) LOS: 53.7%
Total: 40000 W: 7084 L: 7073 D: 25843

master vs SF 3
ELO: 182.86 +-2.7 (95%) LOS: 100.0%
Total: 40000 W: 21843 L: 2541 D: 15616

Contempt = 20 vs SF 3
ELO: 189.25 +-2.8 (95%) LOS: 100.0%
Total: 40000 W: 22721 L: 2859 D: 14420

Diff is therefore 6.4 +/- 3.9 elo against a 180-190 elo weaker engine, which is significantly positive, as expected. This elo difference is likely understated, because of FishTest aggressive draw adjudication though.

We could push Contempt further, but after 20cp, it would get in the way of FishTest draw adjudication rule, and is likely to reduce the testing throughput as a result.

bench 8198667
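
The Elo figures quoted in the commit message can be checked from the win/loss/draw totals. The sketch below is illustrative only. It assumes the standard logistic Elo formula applied to the average score, and it assumes the two 95% margins are independent and can be combined in quadrature. The commit message does not state how FishTest computed its numbers, and the function names `elo_from_wld` and `combine_margins` are hypothetical.

```python
import math


def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Logistic Elo difference implied by a win/loss/draw record.

    Assumption: the standard logistic model, not necessarily the exact
    method FishTest used to produce the quoted figures.
    """
    games = wins + losses + draws
    if games == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / games  # a draw counts as half a point
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def combine_margins(a: float, b: float) -> float:
    """Margin on the difference of two estimates.

    Assumption: the two error margins are independent, so they add
    in quadrature.
    """
    return math.hypot(a, b)


# W/L/D totals copied from the commit message.
self_play = elo_from_wld(7084, 7073, 25843)
master_vs_sf3 = elo_from_wld(21843, 2541, 15616)
contempt_vs_sf3 = elo_from_wld(22721, 2859, 14420)

# Gain from Contempt = 20 when both versions face the same weaker opponent.
diff = contempt_vs_sf3 - master_vs_sf3
margin = combine_margins(2.7, 2.8)

print(f"self play:            {self_play:.2f}")
print(f"master vs SF 3:       {master_vs_sf3:.2f}")
print(f"Contempt=20 vs SF 3:  {contempt_vs_sf3:.2f}")
print(f"difference:           {diff:.1f} +/- {margin:.1f}")
```

Under these assumptions the computed values agree with the quoted 0.10, 182.86 and 189.25 to within rounding, and the difference and its margin come out close to the stated 6.4 +/- 3.9.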