Remove unnecessary dependencies from KA #549

Merged
vchuravy merged 4 commits into main from vc/unsafe_atomics on Dec 10, 2024

Conversation

vchuravy
Member

No description provided.

codecov bot commented Dec 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 0.00%. Comparing base (265e5b8) to head (580716b).
Report is 2 commits behind head on main.

Additional details and impacted files
@@          Coverage Diff          @@
##            main    #549   +/-   ##
=====================================
  Coverage   0.00%   0.00%           
=====================================
  Files         12      12           
  Lines        751     751           
=====================================
  Misses       751     751           


vchuravy marked this pull request as ready for review on December 10, 2024, 06:37
vchuravy changed the title from "Support UnsafeAtomics 0.3" to "Remove unnecessary dependencies from KA" on Dec 10, 2024
vchuravy merged commit d373ee0 into main on Dec 10, 2024
32 of 36 checks passed
vchuravy deleted the vc/unsafe_atomics branch on December 10, 2024, 06:47
Contributor

github-actions bot commented Dec 10, 2024

Benchmark Results

| Benchmark | main | 9076406... | main/9076406766a50d... |
|---|---|---|---|
| saxpy/default/Float16/1024 | 0.545 ± 0.0067 μs | 0.535 ± 0.006 μs | 1.02 |
| saxpy/default/Float16/1048576 | 0.174 ± 0.0048 ms | 0.172 ± 0.0022 ms | 1.01 |
| saxpy/default/Float16/16384 | 3.13 ± 0.024 μs | 3.14 ± 0.054 μs | 0.998 |
| saxpy/default/Float16/2048 | 0.714 ± 0.0075 μs | 0.709 ± 0.0068 μs | 1.01 |
| saxpy/default/Float16/256 | 0.419 ± 0.0077 μs | 0.416 ± 0.0053 μs | 1.01 |
| saxpy/default/Float16/262144 | 0.0441 ± 0.00082 ms | 0.0443 ± 0.00099 ms | 0.995 |
| saxpy/default/Float16/32768 | 5.8 ± 0.05 μs | 5.81 ± 0.11 μs | 0.998 |
| saxpy/default/Float16/4096 | 1.11 ± 0.019 μs | 1.09 ± 0.016 μs | 1.02 |
| saxpy/default/Float16/512 | 0.458 ± 0.0042 μs | 0.456 ± 0.0058 μs | 1.01 |
| saxpy/default/Float16/64 | 0.385 ± 0.0057 μs | 0.381 ± 0.0048 μs | 1.01 |
| saxpy/default/Float16/65536 | 11.4 ± 0.13 μs | 11.6 ± 0.22 μs | 0.988 |
| saxpy/default/Float32/1024 | 0.438 ± 0.006 μs | 0.427 ± 0.0056 μs | 1.03 |
| saxpy/default/Float32/1048576 | 0.232 ± 0.02 ms | 0.223 ± 0.019 ms | 1.04 |
| saxpy/default/Float32/16384 | 2.57 ± 0.28 μs | 2.92 ± 0.98 μs | 0.879 |
| saxpy/default/Float32/2048 | 0.552 ± 0.013 μs | 0.551 ± 0.0099 μs | 1 |
| saxpy/default/Float32/256 | 0.385 ± 0.0044 μs | 0.383 ± 0.0047 μs | 1.01 |
| saxpy/default/Float32/262144 | 0.0565 ± 0.0035 ms | 0.0547 ± 0.0042 ms | 1.03 |
| saxpy/default/Float32/32768 | 5 ± 0.52 μs | 5.48 ± 1.6 μs | 0.913 |
| saxpy/default/Float32/4096 | 0.925 ± 0.072 μs | 0.903 ± 0.0075 μs | 1.02 |
| saxpy/default/Float32/512 | 0.406 ± 0.0054 μs | 0.398 ± 0.0047 μs | 1.02 |
| saxpy/default/Float32/64 | 0.372 ± 0.0043 μs | 0.375 ± 0.0038 μs | 0.993 |
| saxpy/default/Float32/65536 | 12.4 ± 1.1 μs | 12.4 ± 1.7 μs | 1 |
| saxpy/default/Float64/1024 | 0.556 ± 0.015 μs | 0.533 ± 0.011 μs | 1.04 |
| saxpy/default/Float64/1048576 | 0.498 ± 0.037 ms | 0.486 ± 0.035 ms | 1.02 |
| saxpy/default/Float64/16384 | 5.06 ± 0.52 μs | 5.09 ± 1.5 μs | 0.994 |
| saxpy/default/Float64/2048 | 0.934 ± 0.068 μs | 0.904 ± 0.014 μs | 1.03 |
| saxpy/default/Float64/256 | 0.416 ± 0.0078 μs | 0.395 ± 0.0045 μs | 1.05 |
| saxpy/default/Float64/262144 | 0.0973 ± 0.021 ms | 0.11 ± 0.0099 ms | 0.882 |
| saxpy/default/Float64/32768 | 12.5 ± 0.89 μs | 12.6 ± 1.8 μs | 0.994 |
| saxpy/default/Float64/4096 | 1.49 ± 0.15 μs | 1.6 ± 0.23 μs | 0.934 |
| saxpy/default/Float64/512 | 0.442 ± 0.0077 μs | 0.434 ± 0.0064 μs | 1.02 |
| saxpy/default/Float64/64 | 0.387 ± 0.0069 μs | 0.375 ± 0.0041 μs | 1.03 |
| saxpy/default/Float64/65536 | 28.3 ± 1.8 μs | 28.2 ± 2.8 μs | 1 |
| saxpy/static workgroup=(1024,)/Float16/1024 | 1.93 ± 0.023 μs | 1.93 ± 0.022 μs | 1 |
| saxpy/static workgroup=(1024,)/Float16/1048576 | 0.158 ± 0.0069 ms | 0.158 ± 0.0053 ms | 1 |
| saxpy/static workgroup=(1024,)/Float16/16384 | 4.16 ± 0.09 μs | 4.28 ± 0.12 μs | 0.974 |
| saxpy/static workgroup=(1024,)/Float16/2048 | 2.1 ± 0.031 μs | 2.1 ± 0.036 μs | 1 |
| saxpy/static workgroup=(1024,)/Float16/256 | 2.59 ± 0.027 μs | 2.57 ± 0.025 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float16/262144 | 0.0422 ± 0.0015 ms | 0.0425 ± 0.0028 ms | 0.993 |
| saxpy/static workgroup=(1024,)/Float16/32768 | 6.54 ± 0.15 μs | 6.8 ± 0.25 μs | 0.962 |
| saxpy/static workgroup=(1024,)/Float16/4096 | 2.42 ± 0.032 μs | 2.41 ± 0.029 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float16/512 | 3.03 ± 0.051 μs | 3.01 ± 0.025 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float16/64 | 2.31 ± 0.041 μs | 2.27 ± 0.042 μs | 1.02 |
| saxpy/static workgroup=(1024,)/Float16/65536 | 12.2 ± 0.3 μs | 12.6 ± 0.59 μs | 0.97 |
| saxpy/static workgroup=(1024,)/Float32/1024 | 1.96 ± 0.03 μs | 1.95 ± 0.02 μs | 1 |
| saxpy/static workgroup=(1024,)/Float32/1048576 | 0.233 ± 0.02 ms | 0.225 ± 0.02 ms | 1.03 |
| saxpy/static workgroup=(1024,)/Float32/16384 | 4.2 ± 0.74 μs | 4.76 ± 0.061 μs | 0.882 |
| saxpy/static workgroup=(1024,)/Float32/2048 | 2.1 ± 0.036 μs | 2.09 ± 0.03 μs | 1 |
| saxpy/static workgroup=(1024,)/Float32/256 | 2.44 ± 0.046 μs | 2.44 ± 0.037 μs | 1 |
| saxpy/static workgroup=(1024,)/Float32/262144 | 0.06 ± 0.0035 ms | 0.0571 ± 0.0042 ms | 1.05 |
| saxpy/static workgroup=(1024,)/Float32/32768 | 7.11 ± 0.48 μs | 8.1 ± 0.15 μs | 0.878 |
| saxpy/static workgroup=(1024,)/Float32/4096 | 2.4 ± 0.072 μs | 2.36 ± 0.021 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float32/512 | 2.44 ± 0.037 μs | 2.45 ± 0.036 μs | 0.996 |
| saxpy/static workgroup=(1024,)/Float32/64 | 2.71 ± 8.3 μs | 2.82 ± 8.6 μs | 0.96 |
| saxpy/static workgroup=(1024,)/Float32/65536 | 15.2 ± 1 μs | 15.8 ± 1.3 μs | 0.961 |
| saxpy/static workgroup=(1024,)/Float64/1024 | 2.08 ± 0.037 μs | 2.05 ± 0.032 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float64/1048576 | 0.518 ± 0.037 ms | 0.52 ± 0.036 ms | 0.996 |
| saxpy/static workgroup=(1024,)/Float64/16384 | 7.32 ± 0.76 μs | 8.25 ± 1.1 μs | 0.888 |
| saxpy/static workgroup=(1024,)/Float64/2048 | 2.34 ± 0.061 μs | 2.33 ± 0.028 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float64/256 | 2.43 ± 0.058 μs | 2.41 ± 0.058 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float64/262144 | 0.118 ± 0.01 ms | 0.115 ± 0.0096 ms | 1.03 |
| saxpy/static workgroup=(1024,)/Float64/32768 | 15.3 ± 1 μs | 16 ± 1.8 μs | 0.952 |
| saxpy/static workgroup=(1024,)/Float64/4096 | 2.99 ± 0.24 μs | 3.08 ± 0.2 μs | 0.972 |
| saxpy/static workgroup=(1024,)/Float64/512 | 2.41 ± 0.052 μs | 2.42 ± 0.042 μs | 0.996 |
| saxpy/static workgroup=(1024,)/Float64/64 | 2.41 ± 13 μs | 2.41 ± 11 μs | 0.997 |
| saxpy/static workgroup=(1024,)/Float64/65536 | 31.2 ± 2.1 μs | 30.7 ± 2.8 μs | 1.02 |
| time_to_load | 0.784 ± 0.0067 s | 0.32 ± 0.0029 s | 2.45 |
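
The last column's header, `main/9076406766a50d...`, indicates that it is the time on main divided by the time on the PR commit, so values above 1 mean the PR is faster. A minimal sketch of that arithmetic for the `time_to_load` row follows. The helper name `speedup` is hypothetical and not part of the benchmark tooling. Ratios recomputed from the rounded values shown in the table can differ in the last digit from the reported ones, which were presumably computed from unrounded measurements.

```python
def speedup(main_time: float, pr_time: float) -> float:
    """Baseline (main) time divided by PR time; a value above 1 means the PR is faster.

    Hypothetical helper for illustration only, not the benchmark bot's code.
    """
    if pr_time <= 0:
        raise ValueError("pr_time must be positive")
    return main_time / pr_time


# time_to_load row from the table above, both values in seconds
ratio = speedup(0.784, 0.32)
print(f"time_to_load ratio: {ratio:.2f}")
```

For this row the recomputed ratio agrees with the reported 2.45, which means package load time on the PR commit is roughly 40% of what it was on main.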

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).
