BUG: fix negative samples generated by Wald distribution (#29609) by charris · Pull Request #29926 · numpy/numpy

charris · 2025-10-10T22:25:42Z

Backport of #29603.

In numpy.random, when generating Wald samples, if the mean and scale have a large discrepancy, the current implementation sometimes generates negative samples. An explicit example of this has been added to test_generator_mt19937.py, mean=1e8, scale=2.25 (which is specific to the seed used in that test file).

The key line implicated in distributions.c:

X = mean + mu_2l * (Y - sqrt(4 * scale * Y + Y * Y));

which has numerical issues when Y >> scale.

I have replaced this with the equivalent rationalized form:

d = Y + sqrt(Y) * sqrt(Y + 4 * scale);
X = mean - (2 * mean * Y) / d;

This can be seen my noting the following algebraic identity:

Y - sqrt(4 * scale * Y + Y * Y))
= - 4 * scale * Y / (Y + sqrt(Y^2 + 4 * scale * Y))

As mu_2l = mean / (2 * scale), we then have:

X = mean - 2 * mean * Y / (Y + sqrt(Y^2 + 4 * scale * Y))

Introduce a new variable for the denominator and factoring gives the final expressions:

d = Y + sqrt(Y) * sqrt(Y + 4 * scale)
X = mean - (2 * mean * Y) / d

And now the test passes, i.e., all samples are non-negative.

I have not made this change to the legacy implementation (nor the legacy test) as I figure they are deprecated, although it also suffers from the same issue.

Disclaimer: I found this failing property during a research project I am working on using LLM agents to write property-based tests (i.e., in Hypothesis). This property, that all Wald samples should be non-negative, was proposed and written "autonomously" by the agent. I spent several hours analyzing the failure and becoming acquainted with the NumPy codebase myself in order to verify it was a real issue, running it with the seed in the test suite, etc. I also pinpointed the cause of the problem and wrote this fix and the PR myself. I take full responsibility for it.

The Hypothesis test that caught the issue was simply:

import numpy.random
from hypothesis import given, strategies as st, settings
from numpy.random import MT19937, Generator

@given(
    st.floats(min_value=1e8, max_value=1e15),
    st.floats(min_value=0.1, max_value=10.0)
)
@settings(max_examples=50)
def test_wald_negative_values(mean, scale):
    """Wald distribution should never produce negative values"""
    random = Generator(MT19937(1234567890))
    samples = random.wald(mean, scale, size=1000)
    assert all(s >= 0 for s in samples), f"Found negative values with mean={mean}, scale={scale}"

Note that even after this PR, this property fails, albeit the minimal example is now mean=280432326968266.0, scale=0.25 (i.e., even more extreme values). But, at such values, it seems like we hit the limits of floating point computations.

In numpy.random, when generating Wald samples, if the mean and scale have a large discrepancy, the current implementation suffers from catastrophic cancellation. An explicit example of this has been added to test_generator_mt19937.py. The key line implicated in distributions.c: `X = mean + mu_2l * (Y - sqrt(4 * scale * Y + Y * Y));` which has numerical issues when Y >> scale. I have replaced this with the equivalent rationalized form: ``` d = Y + sqrt(Y) * sqrt(Y + 4 * scale); X = mean - (2 * mean * Y) / d; ``` And now the test passes.

charris added this to the 2.3.4 release milestone Oct 10, 2025

charris added 00 - Bug component: numpy.random 08 - Backport Used to tag backport PRs labels Oct 10, 2025

charris merged commit e2f0383 into numpy:maintenance/2.3.x Oct 10, 2025
85 of 90 checks passed

charris deleted the backport-29609 branch October 10, 2025 23:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

BUG: fix negative samples generated by Wald distribution (#29609)#29926

BUG: fix negative samples generated by Wald distribution (#29609)#29926
charris merged 1 commit intonumpy:maintenance/2.3.xfrom
charris:backport-29609

charris commented Oct 10, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

charris commented Oct 10, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants