@ashvardanian
Owner

@ashvardanian ashvardanian commented Oct 31, 2024

It started as a straightforward optimization request from the @albumentations-team: to improve the special case of the wsum (Weighted Sum) operation for the "non-weighted" scenario and to add APIs for scalar multiplication and addition. This update introduces new public APIs in both C and Python:

  1. scale: Implements $\alpha * A_i + \beta$
  2. sum: Computes $A_i + B_i$
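
To make the semantics of the two formulas concrete, here is a minimal NumPy sketch. It does not call the library itself: the helper names `scale_reference` and `sum_reference` are hypothetical, and the identical-dtype check is an assumption that mirrors the note about input types further down.

```python
import numpy as np

def scale_reference(a: np.ndarray, alpha: float, beta: float) -> np.ndarray:
    """Element-wise alpha * a[i] + beta (hypothetical reference helper)."""
    return alpha * a + beta

def sum_reference(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Element-wise a[i] + b[i]; this sketch insists on matching shape and dtype."""
    if a.shape != b.shape or a.dtype != b.dtype:
        raise ValueError("inputs must have identical shape and dtype")
    return a + b

a = np.array([1.0, 2.0, 3.0], dtype=np.float32)
b = np.array([0.5, 0.5, 0.5], dtype=np.float32)

scaled = scale_reference(a, alpha=2.0, beta=1.0)  # 2 * a + 1
summed = sum_reference(a, b)                      # a + b
```

A reference like this is mostly useful as a correctness oracle when testing an optimized kernel against small inputs.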

Recognizing the value of consistency with widely-used libraries, we’ve also added "aliases" aligned with names familiar to developers using NumPy and OpenCV for element-wise addition and multiplication across vectors and scalars:

| NumPy | OpenCV | SimSIMD |
| --- | --- | --- |
| np.add | cv.add | simd.add |
| np.multiply | cv.multiply | simd.multiply |

Note: SimSIMD and NumPy differ in how they handle certain corner cases. SimSIMD offers broader support: up to 64 tensor dimensions (compared to NumPy’s 32), wider compatibility with Python versions, operating systems, hardware, and numeric types, and, of course, greater speed! However, SimSIMD requires input vectors to be of identical types. For integers, it also saturates results at the type's limits instead of letting them overflow or underflow, which can simplify debugging but may be unexpected for some developers.
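
The difference between wrapping and saturating integer arithmetic can be illustrated with plain NumPy. This is only a sketch of the behavior described in the note: `saturating_add_i8` is a hypothetical helper that emulates saturation by widening and clamping, not the library's kernel.

```python
import numpy as np

def saturating_add_i8(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Hypothetical reference: add int8 arrays, clamping to [-128, 127]."""
    # Widen to int16 so the intermediate sum cannot overflow.
    wide = a.astype(np.int16) + b.astype(np.int16)
    info = np.iinfo(np.int8)
    return np.clip(wide, info.min, info.max).astype(np.int8)

a = np.array([100, -100, 5], dtype=np.int8)
b = np.array([100, -100, 5], dtype=np.int8)

wrapped = np.add(a, b)               # NumPy wraps around modulo 2**8
saturated = saturating_add_i8(a, b)  # clamps at the int8 limits

print(wrapped.tolist())    # [-56, 56, 10]
print(saturated.tolist())  # [127, -128, 10]
```

With wrapping, an overflow silently flips the sign of the result; with saturation, it pins the result to the nearest representable value, which is usually easier to spot in image or audio data.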

The real excitement came when we realized that larger projects would take time to adopt emerging numeric types like bfloat16 and float8, which are well-known in AI circles. To bridge this gap, SimSIMD now introduces an AnyTensor type designed for maximum interoperability via CPython's Buffer Protocol and beyond, setting it apart from similar types in NumPy, PyTorch, TensorFlow, and JAX.
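
For readers unfamiliar with the Buffer Protocol, the sketch below shows the mechanism using only the Python standard library. It does not use the AnyTensor type, whose API is not shown in this description. `array.array` plays the role of the exporter and `memoryview` the role of a zero-copy consumer.

```python
import array

# A stdlib container that exports the buffer protocol.
values = array.array("f", [1.0, 2.0, 3.0, 4.0])

# memoryview reads the exporter's metadata (format string, item size,
# shape, strides) and shares its memory instead of copying the data.
view = memoryview(values)

# Reinterpret the same bytes as a 2x2 matrix; a cast must pass through
# a byte format, and it still copies nothing.
grid = view.cast("B").cast("f", shape=[2, 2])

# A write through the exporter is visible through every view.
values[0] = 42.0

print(grid.format, grid.shape, grid.strides)
print(grid.tolist())
```

Any library that consumes this protocol can read such memory without a copy, provided it understands the element format string the exporter reports.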

Tensor Class for C, Python, and Rust 🦀

Element-wise Operations 🧮

Reduced Range Trigonometry 📐

Geospatial Operations 🛰️


Breaking:

  • the project is renamed from "SimSIMD" to "MathKong", aligning with StringZilla.
  • "cosine" distance is now called "angular" to avoid confusion with trigonometric element-wise functions.
  • the flush_denormals function becomes configure_thread, which also enables AMX and SME.
  • DistancesTensor in Python is replaced with NDArray to match the NumPy API.
  • kernels have different output types instead of applying simsimd_distance_t uniformly.
  • complex products output complex types.
  • the ABI of element-wise operations changed: scaling factors are now taken by pointer.

If you have any feedback regarding the limitations of current array-processing software in single- or multi-node AI training settings, I am all ears 👂

@ashvardanian changed the title from "Element-wise BLAS-like APIs" to "Element-wise BLASAPIs & new Tensor for Python" on Nov 1, 2024
@ashvardanian changed the title from "Element-wise BLASAPIs & new Tensor for Python" to "Element-wise BLAS APIs & new Tensor for Python" on Nov 1, 2024
@ashvardanian force-pushed the main-elementwise branch 2 times, most recently from 14fd5d3 to 18c41fd, on November 5, 2024 22:32
@ashvardanian changed the title from "Element-wise BLAS APIs & new Tensor for Python" to "Element-wise BLAS APIs & new Tensor for Python: ⬆️ 450 kernels" on Nov 9, 2024