Dev Builds » 20260703-1829

You are viewing an old NCM Stockfish dev build test. You may find the most recent dev build tests using Stockfish 15 as the baseline here.


NCM plays each Stockfish dev build 20,000 times against Stockfish 14. This yields an approximate Elo difference and establishes confidence in the strength of the dev builds.
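As a rough illustration of how a win/loss/draw record turns into an approximate Elo difference, the sketch below applies the standard logistic Elo formula. It is not NCM's own code, and the function name `elo_from_wld` is hypothetical. The sample record is the STC result quoted in the commit message further down this page, not an NCM match result.

```python
import math


def elo_from_wld(wins, losses, draws):
    """Approximate Elo difference implied by a W/L/D record (logistic model)."""
    games = wins + losses + draws
    if games == 0:
        raise ValueError("no games played")
    # A draw counts as half a point.
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    # Invert the expected-score formula: score = 1 / (1 + 10 ** (-elo / 400)).
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    # W/L/D taken from the STC line in the commit message on this page.
    print(round(elo_from_wld(527, 108, 517), 1))
```

This point estimate says nothing about its own uncertainty; the pentanomial (Ptnml) counts reported alongside WLD are what a game-pair-based error estimate would start from.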

Summary

Host	Duration	Avg Base NPS	Games	WLD	Standard Elo	Ptnml(0-2)	Gamepair Elo

Test Detail

ID	Host	Base NPS	Games	WLD	Standard Elo	Ptnml(0-2)	Gamepair Elo	CLI	PGN

Commit

Commit ID	d5bbc6b67d3184fc39fa87a18596e77e92af990c
Author	anematode
Date	2026-07-03 18:29:50 UTC
[RfC] RISC-V port and universal binary

Performance on Spacemit K3, thanks @edolnx for testing

master:
Total time (ms) : 65200
Nodes searched : 3493826
Nodes/second : 53586

riscv-scalable-port:
Total time (ms) : 15834
Nodes searched : 3493826
Nodes/second : 220653

Also thanks to @camel-cdr for guidance on RVV programming, and https://cloud-v.co for supplying an RVV instance to test with

passed STC:
LLR: 2.81 (-2.94,2.94) <0.00,2.00>
Total: 1152 W: 527 L: 108 D: 517
Ptnml(0-2): 0, 17, 167, 348, 44
https://tests.stockfishchess.org/tests/view/6a39895b3036e45021aeb368

## Summary

We've had a `riscv64` target for a while, but haven't really optimized for it, in particular the vector extension (RVV).

RVV, like SVE, is based on a scalable vector system where the vector length ranges from 128 to 65536. In practice implementations are between 128 and 2048, and 256 bits is quite common (e.g. the Spacemit K3 system above). Unfortunately this doesn't fit well into the rest of our code which assumes a fixed vector length, so what I've done is bypass the `VECTOR` ifdef (which now basically means "FIXED_LENGTH_VECTOR") and just have RVV-specific paths.

The ability to explicitly control `vl` makes the code quite readable, in my opinion. We use LMUL>1 in most places to take advantage of multi-vector instructions. Generally the LMULs were chosen to best support a 256-bit vlen, which is very common, but by virtue of how the vlen control works, the code works with any vlen. In a couple places, i.e., `get_changed_pieces` and `AffineTransformSparseInput::propagate`, we have separate implementations depending on the vlen, because the optimal LMUL varies a lot between implementations.

One little wrinkle is that `load_as` is compiled to a sequence of byte loads, because although unaligned loads are legal in RVA23, the spec says that they may be extremely slow (even though they usually aren't, in actual hw), so compilers are conservative. Thus I aligned the relevant buffers and made the semantics of `load_as` that the operand is aligned, by adding a runtime assertion.

### Universal binary

Adding a universal binary is pretty easy and we can just cross-compile. There are two targets: baseline rv64gc and riscv64-rva23, which is actually a smaller subset of RVA23 that also works on some older processors that don't support the full thing. We use clang because GCC, until recently, has a nasty bug with LTO and RVV.

Like the universal ARM and x86 builds, we check all the builds in CI. In this case we run bench with multiple vlens, 128 through 1024. In the meantime I deleted the existing broken and unused riscv64 tests.

### Follow-ups

- Optimizations
- zvdot4a8i path

closes https://github.com/official-stockfish/Stockfish/pull/6920

No functional change
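
The claim in the commit message that explicit `vl` control lets the same code work with any vlen can be pictured with a small scalar model of a strip-mined loop. This is only a sketch in Python, not the Stockfish RVV code: the helpers `vlmax`, `vsetvl` and `strip_mined_add` are hypothetical, the element width (16 bits) and LMUL (2) are arbitrary choices, and `vsetvl` is simplified to `min(avl, VLMAX)`, which is one behaviour the RVV specification allows.

```python
def vlmax(vlen_bits, sew_bits, lmul):
    """Elements per vector register group: VLEN * LMUL / SEW."""
    return vlen_bits * lmul // sew_bits


def vsetvl(avl, vlen_bits, sew_bits, lmul):
    """Simplified model: grant as many elements as requested, capped at VLMAX."""
    return min(avl, vlmax(vlen_bits, sew_bits, lmul))


def strip_mined_add(a, b, vlen_bits, sew_bits=16, lmul=2):
    """Element-wise add processed in chunks of vl, like a strip-mined RVV loop.

    Returns the result and the number of loop iterations that were needed.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = []
    done = 0
    iterations = 0
    while done < len(a):
        # Ask for everything that is left; the "hardware" decides how much fits.
        vl = vsetvl(len(a) - done, vlen_bits, sew_bits, lmul)
        out.extend(x + y for x, y in zip(a[done:done + vl], b[done:done + vl]))
        done += vl
        iterations += 1
    return out, iterations


if __name__ == "__main__":
    a = list(range(100))
    b = list(range(100, 200))
    # Same loop, different vector lengths: only the iteration count changes.
    for vlen in (128, 256, 512, 1024):
        result, iterations = strip_mined_add(a, b, vlen)
        print(vlen, iterations, result == [x + y for x, y in zip(a, b)])
```

The loop body never mentions the vector length; only the number of iterations changes with it. That is also why running the same bench at several vlens, as the CI described above does, is a meaningful check.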