NCM plays each Stockfish dev build 20,000 times against Stockfish 15. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
Host | Duration | Avg Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo |
ncm-dbt-01 | 02:16:53 | 583632 | 1338 | 455 247 636 | +54.45 ± 8.5 | 1 56 348 262 2 | +111.14 ± 18.06 |
ncm-dbt-02 | 02:16:34 | 586849 | 1316 | 437 256 623 | +48.09 ± 8.88 | 0 75 328 254 1 | +97.53 ± 18.77 |
ncm-dbt-03 | 02:17:09 | 586469 | 1324 | 454 248 622 | +54.5 ± 8.43 | 0 55 348 257 2 | +110.66 ± 18.03 |
ncm-dbt-04 | 02:16:51 | 572559 | 1320 | 420 229 671 | +50.63 ± 9.06 | 0 75 324 256 5 | +100.64 ± 18.9 |
ncm-dbt-05 | 02:15:28 | 586134 | 1304 | 428 242 634 | +49.9 ± 9.22 | 0 79 312 257 4 | +99.63 ± 19.29 |
Total | | | 6602 | 2194 1222 3186 | +51.53 ± 3.94 | 1 340 1660 1286 14 | +103.93 ± 8.32 |
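The Elo figures in the table can be approximately reproduced from the Ptnml counts alone. The sketch below is not NCM's code: it assumes the standard logistic Elo formula, a normal approximation for the 95% margin, and that "Gamepair Elo" collapses each game pair into a loss, draw, or win. The function names are hypothetical. Under those assumptions the results land close to the totals row.

```python
import math

def elo_from_score(s):
    """Logistic Elo difference implied by an expected score s in (0, 1)."""
    return -400.0 * math.log10(1.0 / s - 1.0)

def elo_with_margin(counts, values, z=1.96):
    """Return (elo, margin) for outcome counts with the given per-outcome scores.

    The margin is a z-sigma interval on the mean score, mapped to Elo by
    taking half the width of the resulting Elo interval (an assumption).
    """
    n = sum(counts)
    mean = sum(c * v for c, v in zip(counts, values)) / n
    var = sum(c * (v - mean) ** 2 for c, v in zip(counts, values)) / n
    half = z * math.sqrt(var / n)
    elo = elo_from_score(mean)
    margin = (elo_from_score(mean + half) - elo_from_score(mean - half)) / 2.0
    return elo, margin

# Totals row: Ptnml(0-2) counts game pairs scoring 0, 0.5, 1, 1.5, 2 points.
ptnml = [1, 340, 1660, 1286, 14]

# Standard Elo: use the per-game score of each pair outcome.
std_elo, std_margin = elo_with_margin(ptnml, [0.0, 0.25, 0.5, 0.75, 1.0])

# Gamepair Elo (assumed definition): collapse each pair to loss / draw / win.
pair_counts = [ptnml[0] + ptnml[1], ptnml[2], ptnml[3] + ptnml[4]]
pair_elo, pair_margin = elo_with_margin(pair_counts, [0.0, 0.5, 1.0])

print(f"standard: {std_elo:+.2f} ± {std_margin:.2f}")
print(f"gamepair: {pair_elo:+.2f} ± {pair_margin:.2f}")
```

Working from pair outcomes rather than single games accounts for the correlation between the two games played from the same opening, which is presumably why the table reports Ptnml alongside WLD.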
ID | Host | Base NPS | Games | WLD | Standard Elo | Ptnml(0-2) | Gamepair Elo | CLI | PGN |
404081 | ncm-dbt-05 | 580987 | 304 | 106 62 136 | +50.64 ± 19.34 | 0 18 74 58 2 | +98.56 ± 39.73 |
cutechess-cli -rounds 266 -games 2 -concurrency 16 -srand 3347443247 -pgnout ncm-dbt-20230723-2356-015.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=128 option.Threads=8 -engine name=20230723-2356 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 -engine name=sf15 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:15
404080 | ncm-dbt-02 | 584832 | 316 | 100 57 159 | +47.57 ± 18.03 | 0 18 79 61 0 | +97.0 ± 38.37 |
cutechess-cli -rounds 266 -games 2 -concurrency 16 -srand 922372598 -pgnout ncm-dbt-20230723-2356-014.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=128 option.Threads=8 -engine name=20230723-2356 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 -engine name=sf15 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:15
404079 | ncm-dbt-04 | 573406 | 320 | 96 51 173 | +49.18 ± 18.99 | 0 22 71 67 0 | +100.42 ± 40.61 |
cutechess-cli -rounds 266 -games 2 -concurrency 16 -srand 3957196249 -pgnout ncm-dbt-20230723-2356-013.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=128 option.Threads=8 -engine name=20230723-2356 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 -engine name=sf15 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:15
404078 | ncm-dbt-03 | 589027 | 324 | 108 58 158 | +54.05 ± 18.1 | 0 17 79 65 1 | +108.48 ± 38.4 |
cutechess-cli -rounds 266 -games 2 -concurrency 16 -srand 2135273267 -pgnout ncm-dbt-20230723-2356-012.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=128 option.Threads=8 -engine name=20230723-2356 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 -engine name=sf15 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:15
404077 | ncm-dbt-01 | 585337 | 338 | 115 65 158 | +51.78 ± 16.65 | 0 15 89 65 0 | +105.96 ± 35.82 |
cutechess-cli -rounds 266 -games 2 -concurrency 16 -srand 1363891962 -pgnout ncm-dbt-20230723-2356-011.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=128 option.Threads=8 -engine name=20230723-2356 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 -engine name=sf15 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:15
404076 | ncm-dbt-05 | 589283 | 500 | 150 90 260 | +41.89 ± 14.48 | 0 32 126 92 0 | +85.04 ± 30.35 |
cutechess-cli -rounds 266 -games 2 -concurrency 16 -srand 2166345239 -pgnout ncm-dbt-20230723-2356-010.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=128 option.Threads=8 -engine name=20230723-2356 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 -engine name=sf15 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:15
404075 | ncm-dbt-02 | 586562 | 500 | 162 95 243 | +46.84 ± 14.09 | 0 27 129 94 0 | +95.44 ± 29.88 |
cutechess-cli -rounds 266 -games 2 -concurrency 16 -srand 2462317313 -pgnout ncm-dbt-20230723-2356-009.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=128 option.Threads=8 -engine name=20230723-2356 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 -engine name=sf15 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:15
404074 | ncm-dbt-04 | 571270 | 500 | 167 81 252 | +60.36 ± 14.62 | 0 22 124 100 4 | +118.33 ± 30.49 |
cutechess-cli -rounds 266 -games 2 -concurrency 16 -srand 843449531 -pgnout ncm-dbt-20230723-2356-008.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=128 option.Threads=8 -engine name=20230723-2356 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 -engine name=sf15 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:15
404073 | ncm-dbt-03 | 585843 | 500 | 171 85 244 | +60.36 ± 13.2 | 0 16 132 102 0 | +124.6 ± 29.14 |
cutechess-cli -rounds 266 -games 2 -concurrency 16 -srand 781350874 -pgnout ncm-dbt-20230723-2356-007.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=128 option.Threads=8 -engine name=20230723-2356 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 -engine name=sf15 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:15
404072 | ncm-dbt-01 | 583992 | 500 | 168 91 241 | +53.93 ± 13.34 | 0 19 135 96 0 | +110.6 ± 28.84 |
cutechess-cli -rounds 266 -games 2 -concurrency 16 -srand 408628355 -pgnout ncm-dbt-20230723-2356-006.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=128 option.Threads=8 -engine name=20230723-2356 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 -engine name=sf15 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:15
404071 | ncm-dbt-05 | 588132 | 500 | 172 90 238 | +57.5 ± 15.16 | 0 29 112 107 2 | +115.23 ± 32.29 |
cutechess-cli -rounds 266 -games 2 -concurrency 16 -srand 322008249 -pgnout ncm-dbt-20230723-2356-005.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=128 option.Threads=8 -engine name=20230723-2356 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 -engine name=sf15 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:15
404070 | ncm-dbt-02 | 589155 | 500 | 175 104 221 | +49.67 ± 14.77 | 0 30 120 99 1 | +99.95 ± 31.16 |
cutechess-cli -rounds 266 -games 2 -concurrency 16 -srand 2723293583 -pgnout ncm-dbt-20230723-2356-004.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=128 option.Threads=8 -engine name=20230723-2356 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 -engine name=sf15 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:15
404069 | ncm-dbt-01 | 581569 | 500 | 172 91 237 | +56.78 ± 14.6 | 1 22 124 101 2 | +115.23 ± 30.51 |
cutechess-cli -rounds 266 -games 2 -concurrency 16 -srand 3760565956 -pgnout ncm-dbt-20230723-2356-003.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=128 option.Threads=8 -engine name=20230723-2356 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 -engine name=sf15 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:15
404068 | ncm-dbt-04 | 573002 | 500 | 157 97 246 | +41.89 ± 14.48 | 0 31 129 89 1 | +83.57 ± 29.93 |
cutechess-cli -rounds 266 -games 2 -concurrency 16 -srand 3786834321 -pgnout ncm-dbt-20230723-2356-002.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=128 option.Threads=8 -engine name=20230723-2356 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 -engine name=sf15 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:15
404067 | ncm-dbt-03 | 584537 | 500 | 175 105 220 | +48.96 ± 13.65 | 0 22 137 90 1 | +98.44 ± 28.66 |
cutechess-cli -rounds 266 -games 2 -concurrency 16 -srand 875184062 -pgnout ncm-dbt-20230723-2356-001.pgn -openings file=UHO_4060_v2.epd format=epd order=random -repeat -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=5 -each tc=30+0.3 timemargin=10000 proto=uci option.Hash=128 option.Threads=8 -engine name=20230723-2356 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=dev_build:4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 -engine name=sf15 cmd=docker arg=run arg=-i arg=--rm arg=--entrypoint=/engine arg=stockfish:15
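The commands in the CLI column differ only in the random seed and the PGN output name. As an illustration, the shared structure can be captured in a small helper that assembles the argument list. This is a sketch only: `build_cutechess_args` and `docker_engine` are hypothetical names, and how NCM actually launches its runs is not shown on this page. Every flag and value is copied from the commands above.

```python
def build_cutechess_args(seed, pgn_name, dev_tag, commit):
    """Assemble the argument list for one run, mirroring the CLI column above."""

    def docker_engine(name, image):
        # Each engine runs inside a throwaway Docker container.
        return ["-engine", f"name={name}", "cmd=docker", "arg=run", "arg=-i",
                "arg=--rm", "arg=--entrypoint=/engine", f"arg={image}"]

    return [
        "cutechess-cli",
        "-rounds", "266", "-games", "2", "-concurrency", "16",
        "-srand", str(seed),
        "-pgnout", pgn_name,
        "-openings", "file=UHO_4060_v2.epd", "format=epd", "order=random",
        "-repeat",
        "-resign", "movecount=3", "score=600",
        "-draw", "movenumber=34", "movecount=8", "score=5",
        "-each", "tc=30+0.3", "timemargin=10000", "proto=uci",
        "option.Hash=128", "option.Threads=8",
        *docker_engine(dev_tag, f"dev_build:{commit}"),
        *docker_engine("sf15", "stockfish:15"),
    ]

# Values taken from run 404067 in the table above.
args = build_cutechess_args(
    seed=875184062,
    pgn_name="ncm-dbt-20230723-2356-001.pgn",
    dev_tag="20230723-2356",
    commit="4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0",
)
print(" ".join(args))
```

Building a list rather than a single string means it could be handed to `subprocess.run` without any shell quoting.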
Commit ID | 4b2979760f3862700c6a0b8d3ab0f6a6e0a638c0 |
Author | Joost VandeVondele |
Date | 2023-07-23 23:56:20 UTC |
Check clock more often
This patch changes how often the time is checked, from every 1024 counted
nodes to every 512 counted nodes. The master value was tuned for the old
classical eval; the patch takes the roughly 2x slowdown in nps with
SFNNUEv7 into account. This could slightly reduce the losses on time on
fishtest, but those are probably unrelated.
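The idea of checking the clock only every N counted nodes can be illustrated with a countdown. The sketch below is not Stockfish's implementation (which is C++); `ClockChecker` and `count_checks` are hypothetical names used only to show how halving the interval doubles the number of clock reads for the same node count.

```python
import time

class ClockChecker:
    """Reads the clock once every `interval` counted nodes."""

    def __init__(self, deadline, interval=512):
        self.deadline = deadline      # time.monotonic() value at which to stop
        self.interval = interval
        self.calls_left = interval
        self.checks = 0               # number of actual clock reads
        self.stop = False

    def on_node(self):
        # Cheap countdown on every node; the clock is read only at zero.
        self.calls_left -= 1
        if self.calls_left > 0:
            return
        self.calls_left = self.interval
        self.checks += 1
        if time.monotonic() >= self.deadline:
            self.stop = True

def count_checks(nodes, interval):
    """Count clock reads over `nodes` nodes with a deadline that never fires."""
    checker = ClockChecker(deadline=float("inf"), interval=interval)
    for _ in range(nodes):
        checker.on_node()
    return checker.checks

print(count_checks(102_400, 1024))  # 100
print(count_checks(102_400, 512))   # 200
```

If nodes per second roughly halve, halving the interval keeps the wall-clock time between checks about the same, which is the adjustment the commit message describes.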
passed STC:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 76576 W: 19677 L: 19501 D: 37398
Ptnml(0-2): 274, 8592, 20396, 8736, 290
No functional change
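The interval (-2.94,2.94) on the LLR line is consistent with the usual SPRT stopping bounds for error rates of 5% each. A short check, assuming alpha = beta = 0.05 (these values are not stated in the commit message), with the hypothetical helper `sprt_bounds`:

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's SPRT stopping bounds on the log-likelihood ratio."""
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this
    return lower, upper

lower, upper = sprt_bounds()
print(f"({lower:.2f},{upper:.2f})")  # (-2.94,2.94)
```

The reported LLR of 2.95 lies above the upper bound, which matches the "passed STC" status.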