dp4a is now also supported on AMD RDNA3+ GPUs
Updated OpenCL-Wrapper, added hardware-supported dp4a for AMD/Intel GPUs too, updated benchmark examples in Readme
Minor cosmetics
Automatically use zero-copy buffers on CPUs/iGPUs, bandwidth kernels now write non-zero data
Removed wait() call at the end of benchmark on Linux
Cosmetics
Fixed terrible performance on ARM GPUs by macro-replacing fused-multiply-add (fma) with a*b+c, added automatic OS detection in make.sh
Added operating system info to OpenCL device driver version printout
Fixed several issues with macOS
Minor cosmetics in Readme
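
For readers unfamiliar with the dp4a operation named in the commits above: it is commonly defined as a dot product of four packed signed 8-bit lanes, added to a 32-bit accumulator. The following is a minimal host-side sketch of that arithmetic in plain C++, not the repository's OpenCL code. The names `dp4a_reference` and `pack4` are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>

// Hypothetical software reference for dp4a (not taken from the repository):
// treats a and b as four packed signed 8-bit lanes, multiplies them lane-wise,
// and adds the four products to the 32-bit accumulator c.
int32_t dp4a_reference(uint32_t a, uint32_t b, int32_t c) {
    for (int i = 0; i < 4; i++) {
        // Extract lane i as an unsigned byte, then reinterpret it as signed.
        const int8_t ai = (int8_t)(uint8_t)((a >> (8 * i)) & 0xFFu);
        const int8_t bi = (int8_t)(uint8_t)((b >> (8 * i)) & 0xFFu);
        c += (int32_t)ai * (int32_t)bi;
    }
    return c;
}

// Hypothetical helper: packs four signed 8-bit values into one 32-bit word,
// with x0 in the lowest byte and x3 in the highest byte.
uint32_t pack4(int8_t x0, int8_t x1, int8_t x2, int8_t x3) {
    return (uint32_t)(uint8_t)x0
         | ((uint32_t)(uint8_t)x1 << 8)
         | ((uint32_t)(uint8_t)x2 << 16)
         | ((uint32_t)(uint8_t)x3 << 24);
}
```

Hardware-supported dp4a performs the same computation in a single instruction, which is why it matters for 8-bit integer throughput benchmarks.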