-
-
Notifications
You must be signed in to change notification settings - Fork 264
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RVV support enabled accidentally and leads to SIGILL on older kernels #1705
Comments
Workaround for buggy RVV detection in zlib-ng, see zlib-ng/zlib-ng#1705
Workaround for buggy RVV detection in zlib-ng, see zlib-ng/zlib-ng#1705
We enable everything by default if we can't detect availability of specific instructions... This is to encourage people to upgrade to at least oldest usable kernel and C library versions... Flipping the default to not enable everything would require the change for all architectures. We still allow disabling every optimisation to make it possible to compile for older kernels. |
We are using this workaround based on the fix from highway diff --git a/arch/riscv/riscv_features.c b/arch/riscv/riscv_features.c
index b066f42..259a63a 100644
--- a/arch/riscv/riscv_features.c
+++ b/arch/riscv/riscv_features.c
@@ -42,4 +42,20 @@ void Z_INTERNAL riscv_check_features(struct riscv_cpu_features *features) {
riscv_check_features_runtime(features);
else
riscv_check_features_compile_time(features);
+ if (features->has_rvv) {
+ size_t e8m1_vec_len;
+ int64_t vtype_reg_val;
+ // Check that a vuint8m1_t vector is at least 16 bytes and that tail
+ // agnostic and mask agnostic mode are supported
+ //
+ __asm__ volatile(
+ "vsetvli %0, zero, e8, m1, ta, ma\n\t"
+ "csrr %1, vtype"
+ : "=r"(e8m1_vec_len), "=r"(vtype_reg_val));
+
+ // The RVV target is supported if the VILL bit of VTYPE (the MSB bit of
+ // VTYPE) is not set and the length of a vuint8m1_t vector is at least 16
+ // bytes
+ features->has_rvv = (vtype_reg_val >= 0 && e8m1_vec_len >= 16);
+ }
} For 32 bit the |
The sophgo 2042 (milk-v pioneer) has RVV 0.7.1 which is not fully compatible. Fix the RVV test. Patch is adapted from: google/highway#2127 Note that it only works on 64 bit RISC-V. upstream: zlib-ng/zlib-ng#1705
When the type depends on if the target is 32-bit or 64-bit, the variable type should be You should make a pull request with the patch, so it can be reviewed by contributors who use RISC-V. |
@alexsifivetw Can you check the patch above, if it can be cleaned up for zlib-ng? |
@ncopa |
I'm not familiar with Rust and Cargo. It will install a prebuilt version, right? Could you specify the no-RVV version in Cargo? |
I'd rather see this fixed sooner than later, so if @ncopa doesn't want or have time to make a PR, I can try to make it even though I don't have RISC-V hardware to test it on... |
Original version posted by @ncopa in zlib-ng#1705.
Original version posted by @ncopa in zlib-ng#1705.
Original version posted by @ncopa in zlib-ng#1705.
It looks like the change in 2.2.2 from #1770, meant to fix RVV auto-detection on systems with older kernels (and thus possibly to fix this issue altogether?), is ineffective on at least some systems, because SIGILL now occurs during detection itself. In 2.2.1:
In 2.2.2:
Full backtraces and context are available at rust-lang/libz-sys#200 (comment). The system is a 64-bit RISC-V machine without RVV support, hosted by Scaleway with a "system type" of EM-RV1-C4M16S128-A (with these specifications), and running a 5.10.113-scw1 kernel:
My guess is that the continued problem, in this new form, is not specific to one particular chip and also not specific to Scaleway builds of the kernel, but I am not certain. Furthermore, even if I am correct, it may be that this old of a kernel version is uncommonly used elsewhere; I am not sure about that either. I can attempt to emulate a separate RISC-V system without RVV support and produce the problem with a generic kernel, but since that would involve some delays and might not be necessary, I figured I'd report this first. Please let me know if there is further information I should provide. I encountered this when using this library through the libz-sys Rust crate (itself appearing as a dependency of gitoxide). But I have also verified by running the zlib-ng test suite that a number of this project's own tests fail on the same system due to SIGILL (directly or indirectly). All the tests that failed in 2.2.1 also failed in 2.2.2, but several more tests failed in 2.2.2. I presume this is because detection sometimes occurs when the results of the detection would not be used. Running the tests on the tip of the That is to say that, without passing any -cmake .
+cmake -DWITH_RVV=OFF .
cmake --build . --config Release
ctest --verbose -C Release |
PR #1585 has added runtime detection of RISC-V vector support for kernels newer than or equal to 6.5, and if the kernel is too old, zlib-ng would use compile time compiler support as fallback. However, this behavior is not safe for older kernels in conjunction with new compilers. In my case, I was using Linux 6.1.61 and GCC 13.2.1 when trying to compile starship 1.18.1 with default features enabled, including zlib-ng. Some git test cases of starship would fail because of the SIGILL signal coming from
adler32_rvv
.I suggest using a more conservative assumption about runtime support of RVV when hwcap is not available. When the kernel is too old, or hwcap is disabled by the sandbox mechanism, RVV support should be disabled.
The text was updated successfully, but these errors were encountered: