This function, for example,
#define SIMDPP_ARCH_X86_AVX2
#include <simdpp/simd.h>

void sum(double* out, double const* lhs, double const* rhs) {
    using vec_t = simdpp::float64<1>;
    auto l = simdpp::load_u<vec_t>(lhs);
    auto r = simdpp::load_u<vec_t>(rhs);
    simdpp::store_u(out, l + r);
}
will load and write 4 doubles instead of a single one, which may result in an unexpected buffer overflow. Is this the intended behavior?
It's not just sizes smaller than the smallest native size. For example, the above code generates the correct instructions if we use float64<2>, but not with float64<3> (loads/stores 4 doubles) or even float64<6> (loads/stores 8 doubles).
I think vectors whose size N is a power of 2 and smaller than the smallest native size could be implemented as an unaligned std::array<double, N>, for example.
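To make the std::array idea concrete, here is a minimal sketch in plain C++. It does not use libsimdpp at all; `small_f64`, `load_exact`, `store_exact` and `sum1` are hypothetical names made up for illustration, and the only point is that loads and stores touch exactly N doubles.

```cpp
#include <array>
#include <cstddef>

// Hypothetical fallback type (not part of libsimdpp): N doubles held in a
// std::array, so no access ever goes beyond N elements.
template <std::size_t N>
struct small_f64 {
    std::array<double, N> v;
};

// Reads exactly N doubles starting at p; no alignment requirement.
template <std::size_t N>
small_f64<N> load_exact(double const* p) {
    small_f64<N> r;
    for (std::size_t i = 0; i < N; ++i) r.v[i] = p[i];
    return r;
}

// Writes exactly N doubles starting at p.
template <std::size_t N>
void store_exact(double* p, small_f64<N> const& a) {
    for (std::size_t i = 0; i < N; ++i) p[i] = a.v[i];
}

// Element-wise addition.
template <std::size_t N>
small_f64<N> operator+(small_f64<N> const& a, small_f64<N> const& b) {
    small_f64<N> r;
    for (std::size_t i = 0; i < N; ++i) r.v[i] = a.v[i] + b.v[i];
    return r;
}

// Same shape as sum() above, but with the hypothetical single-element type.
void sum1(double* out, double const* lhs, double const* rhs) {
    store_exact(out, load_exact<1>(lhs) + load_exact<1>(rhs));
}
```

With this, `sum1` reads one double from each input and writes one double to `out`, so a neighbouring element in the output buffer is left alone.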
As for the ones whose size N is larger than the largest native size, we could implement them like this (assuming the largest size is 4, for example)
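The snippet that went with this comment is not shown here. As a rough illustration of one way the idea could work (not necessarily what was proposed), the sketch below splits N into full native-width chunks plus a scalar tail. It is plain C++ with no libsimdpp calls; `native_width`, `add_native` and `sum_n` are hypothetical names, and the width of 4 is the assumption stated in the comment.

```cpp
#include <cstddef>

// Assumed width of the largest native vector, in doubles (4, as above).
constexpr std::size_t native_width = 4;

// Stand-in for one native-width operation. A real implementation would use
// a SIMD load/add/store here; this loop only models the memory it touches.
inline void add_native(double* out, double const* a, double const* b) {
    for (std::size_t i = 0; i < native_width; ++i) out[i] = a[i] + b[i];
}

// Adds exactly N doubles: N / native_width full chunks, then the remaining
// N % native_width elements one at a time, so nothing past N is accessed.
template <std::size_t N>
void sum_n(double* out, double const* lhs, double const* rhs) {
    constexpr std::size_t full = N / native_width;
    constexpr std::size_t tail = N % native_width;
    for (std::size_t c = 0; c < full; ++c) {
        std::size_t k = c * native_width;
        add_native(out + k, lhs + k, rhs + k);
    }
    for (std::size_t i = 0; i < tail; ++i) {
        std::size_t k = full * native_width + i;
        out[k] = lhs[k] + rhs[k];
    }
}
```

For N = 6 this does one chunk of 4 and then 2 scalar adds, so it touches 6 doubles rather than 8.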