You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
loglik_obs(::Binomial, y, μ, wt, ϕ) =logpdf(Binomial(Int(wt), μ), Int(y*wt))
(Sidebar: The PDF function will actually accept non integer values, returning -Inf if they're not within 1eps of an integer or the value of the corresponding integer if they are.)
As part of an analysis at the day job, I found a pathological case where we currently get an InexactError:
julia>using BenchmarkTools
julia>using Distributions
# values from a fitted model
julia> y, μ, wt, ϕ =0.6376811594202898, 0.8492925285671102, 69.0, NaN# phi isn't used here
(0.6376811594202898, 0.8492925285671102, 69.0, NaN)
julia># definition from GLM.jlloglik_obs(::Binomial, y, μ, wt, ϕ) =logpdf(Binomial(Int(wt), μ), Int(y*wt))
loglik_obs (generic function with 1 method)
julia>loglik_obs(Binomial(), y, μ, wt, ϕ) # error
ERROR: InexactError:Int64(43.99999999999999)
Stacktrace:
[1] Int64
@ ./float.jl:788 [inlined]
[2] loglik_obs(#unused#::Binomial{Float64}, y::Float64, μ::Float64, wt::Float64, ϕ::Float64)
@ Main ./REPL[76]:2
[3] top-level scope
@ REPL[77]:1
If we look at the numbers creating the problem, we have
julia> y * wt
43.99999999999999
julia>nextfloat(y * wt)
44.0
julia>44/ wt == y
true
julia>44/ y == wt
true
So this really is just a floating point issue. We can fix this by defining a safer_int for "close enough"
functionsafer_int(x::T) where {T<:Base.IEEEFloat}
r =round(Int, x)
abs(x - r) <=eps(x) &&return r
throw(InexactError(:safer_int, T, x))
endloglik_obs_safer(::Binomial, y, μ, wt, ϕ) =logpdf(Binomial(Int(wt), μ), safer_int(y*wt))
and then everything works:
julia>loglik_obs_safer(Binomial(), y, μ, wt, ϕ) # works-11.628163156011077
What happens to performance though for a non pathological case? Let's take a look
julia> y, μ, wt, ϕ =1/3 , 0.5, 3, NaN
(0.3333333333333333, 0.5, 3, NaN)
julia>@benchmarkloglik_obs(Binomial(), $y, $μ, $wt, $ϕ)
BenchmarkTools.Trial:10000 samples with 987 evaluations.
Range (min … max):50.616 ns …86.458 ns ┊ GC (min … max):0.00%…0.00%
Time (median):50.785 ns ┊ GC (median):0.00%
Time (mean ± σ):51.102 ns ±1.389 ns ┊ GC (mean ± σ):0.00%±0.00%
▆█▄ ▁ ▁ ▁
▆████▆▃▄▃▃▃▂▅▅▅▃▅▄▄▄▄▅▄▅██▆▇▇▇▆▅▄▃▅▅▆█▇▅▆▆▅▅▅▅▅▅▆▇▆▆▅▅▆▅▆▅▅ █
50.6 ns Histogram:log(frequency) by time 55.2 ns <
Memory estimate:0 bytes, allocs estimate:0.
julia>@benchmarkloglik_obs_safer(Binomial(), $y, $μ, $wt, $ϕ)
BenchmarkTools.Trial:10000 samples with 986 evaluations.
Range (min … max):53.414 ns …88.446 ns ┊ GC (min … max):0.00%…0.00%
Time (median):53.542 ns ┊ GC (median):0.00%
Time (mean ± σ):53.847 ns ±1.313 ns ┊ GC (mean ± σ):0.00%±0.00%
▂█▇▃ ▁▂▁ ▂▂ ▂
████▆▇▅▄▄▃▄▅▄▅▄▅▄▄▄▅▃▆▅▃▃▄███████▇▆▆▇██▅▄▄▅▅▃▅▅▅▃▁▄▄▅▅▅▄▅▆▆ █
53.4 ns Histogram:log(frequency) by time 57.9 ns <
Memory estimate:0 bytes, allocs estimate:0.
So only about 2-3ns worse.
Tested on Apple silicon:
julia>versioninfo()
Julia Version 1.8.2
Commit 36034abf260 (2022-09-2915:21 UTC)
Platform Info:
OS: macOS (arm64-apple-darwin21.3.0)
CPU:8× Apple M1
WORD_SIZE:64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, apple-m1)
Threads:4 on 4 virtual cores
@nalimilan if you have no strong objections, I would just go ahead and open a PR to add this.
Hat-tip @ararslan who rubber-ducked this with me and @haberdashPI who found the pathological case.
The text was updated successfully, but these errors were encountered:
Currently,
loglik_obs(::Binomial)
forces the second argument tologpdf(::Binomial, x)
to be be an integer:GLM.jl/src/glmtools.jl
Line 515 in 1459737
(Sidebar: The PDF function will actually accept non integer values, returning -Inf if they're not within 1eps of an integer or the value of the corresponding integer if they are.)
As part of an analysis at the day job, I found a pathological case where we currently get an
InexactError
:If we look at the numbers creating the problem, we have
So this really is just a floating point issue. We can fix this by defining a
safer_int
for "close enough"and then everything works:
What happens to performance though for a non pathological case? Let's take a look
So only about 2-3ns worse.
Tested on Apple silicon:
@nalimilan if you have no strong objections, I would just go ahead and open a PR to add this.
Hat-tip @ararslan who rubber-ducked this with me and @haberdashPI who found the pathological case.
The text was updated successfully, but these errors were encountered: