
added linear regression square root approximation #1565

Open
AMNTDEVIL wants to merge 1 commit into TheAlgorithms:master from AMNTDEVIL:master

Conversation

@AMNTDEVIL

Description

This PR introduces a fast square root approximation using least squares linear regression,
optimized for the interval [1.0, 2.0]. It is designed for performance-critical environments
such as game engines, signal processors, and embedded systems where calling the standard
sqrt() is too expensive.

The core idea is to fit a linear equation y = αx + β that minimizes the squared error
against √x over the target range, then use bit-level tricks to extend it to all positive
floats via IEEE 754 exponent manipulation.


How It Works

1. Coefficient Calculation

Uses the normal equations of linear regression over N = 1000 sampled points in [1.0, 2.0]
to compute the optimal slope α and intercept β:

$$\alpha = \frac{N \sum x_i y_i - \sum x_i \sum y_i}{N \sum x_i^2 - (\sum x_i)^2}, \quad \beta = \frac{\sum y_i - \alpha \sum x_i}{N}$$

2. IEEE 754 Bit-Level Representation

The coefficients are inspected at the bit level, which is useful for low-level systems
that need hardcoded hex constants instead of runtime computation:

alpha = 0.411859  →  0x3ED2DF36
beta  = 0.601160  →  0x3F19E5A4

3. Fixed-Point Conversion (16.16 format)

For hardware without an FPU, the coefficients are scaled by 2^16 = 65536 and stored as integers:

alpha_fixed = 26991  →  0x696F
beta_fixed  = 39397  →  0x99E5

4. Range Extension via Bit Manipulation

To handle inputs outside [1.0, 2.0]:

  1. Extract the IEEE 754 exponent to normalize x into [1.0, 2.0)
  2. Apply the linear approximation on the mantissa
  3. Rescale the result by 2^(exp/2)
  4. Handle odd exponents by multiplying by the hardcoded constant √2 ≈ 1.41421356

Performance vs Accuracy

x       Approx      Actual      Error
1.00    1.004923    1.000000    +0.49%
1.50    1.225748    1.224745    +0.08%
2.00    1.419493    1.414214    +0.37%
4.00    2.008845    2.000000    +0.44%
16.00   4.017689    4.000000    +0.44%

Sub-1% error across the primary range, with no calls to math.h.


Use Cases

  • Game engines and real-time graphics — similar in spirit to the Quake III fast inverse sqrt
  • Embedded systems and microcontrollers — runs on hardware without an FPU
  • Signal processing pipelines — where throughput matters more than exact precision
  • Education — demonstrates how linear regression applies to function approximation

Checklist

  • Description of change added
  • Filename matches file name guidelines
  • Tests and examples added, tests pass
  • Relevant documentation and comments updated
  • PR title follows semantic commit guidelines
  • Checked for duplicate suggestions

I acknowledge that all my contributions will be made under the project's license.
