-
Notifications
You must be signed in to change notification settings - Fork 19
/
nitrofreq.qmd
164 lines (122 loc) · 5.04 KB
/
nitrofreq.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
---
title: Poisson GLMM using frequentist methods
author: "[Julian Faraway](https://julianfaraway.github.io/)"
date: "`r format(Sys.time(), '%d %B %Y')`"
format:
gfm:
toc: true
---
```{r global_options, include=FALSE}
knitr::opts_chunk$set(comment=NA,
echo = TRUE,
fig.path="figs/",
dev = 'svglite',
fig.ext = ".svg",
warning=FALSE,
message=FALSE)
knitr::opts_knit$set(global.par = TRUE)
```
```{r graphopts, include=FALSE}
par(mgp=c(1.5,0.5,0), mar=c(3.1,3.1,0.1,0), pch=20)
ggplot2::theme_set(ggplot2::theme_bw())
```
See the [introduction](../index.md) for an overview.
See a [mostly Bayesian analysis](nitrofen.md) analysis of the same
data.
This example is discussed in more detail in the book
[Bayesian Regression Modeling with INLA](https://julianfaraway.github.io/brinlabook/chaglmm.html#sec:poissonglmm)
Packages used:
```{r}
library(ggplot2)
```
# Data
In [Davison and Hinkley, 1997]( https://doi.org/10.1017/CBO9780511802843), the results of a study on Nitrofen, a herbicide, are reported. Due to concern regarding the effect on animal life, 50 female water fleas were divided into five groups of ten each and treated with different concentrations of the herbicide. The number of offspring in three subsequent broods for each flea was recorded. We start by loading the data from the `boot` package: (the `boot` package comes with base R distribution so there is no need to download this)
```{r}
#| label: nitrofendat
data(nitrofen, package="boot")
head(nitrofen)
```
We need to rearrange the data to have one response value per line:
```{r}
lnitrofen = data.frame(conc = rep(nitrofen$conc,each=3),
live = as.numeric(t(as.matrix(nitrofen[,2:4]))),
id = rep(1:50,each=3),
brood = rep(1:3,50))
head(lnitrofen)
```
Make a plot of the data:
```{r}
#| label: fig-nitrodat
#| fig-cap: "The number of live offspring varies with the concentration of Nitrofen and the brood number."
lnitrofen$jconc <- lnitrofen$conc + rep(c(-10,0,10),50)
lnitrofen$fbrood = factor(lnitrofen$brood)
ggplot(lnitrofen, aes(x=jconc,y=live, shape=fbrood, color=fbrood)) +
geom_point(position = position_jitter(w = 0, h = 0.5)) +
xlab("Concentration") + labs(shape = "Brood")
```
# Model
Since the response is a small count, a Poisson model is a natural
choice. We expect the rate of the response to vary with the brood and
concentration level. The plot of the data suggests these two predictors
may have an interaction. The three observations for a single flea are
likely to be correlated. We might expect a given flea to tend to produce
more, or less, offspring over a lifetime. We can model this with an
additive random effect. The linear predictor is:
```{math}
\eta_i = x_i^T \beta + u_{j(i)}, \quad i=1, \dots, 150. \quad j=1, \dots 50,
```
where $x_i$ is a vector from the design matrix encoding the information
about the $i^{th}$ observation and $u_j$ is the random affect associated
with the $j^{th}$ flea. The response has distribution
$Y_i \sim Poisson(\exp(\eta_i))$.
# LME4
We fit a model using penalized quasi-likelihood
(PQL) using the `lme4` package:
```{r}
library(lme4)
lnitrofen$brood = factor(lnitrofen$brood)
glmod <- glmer(live ~ I(conc/300)*brood + (1|id), nAGQ=25,
family=poisson, data=lnitrofen)
summary(glmod, correlation = FALSE)
```
We scaled the concentration by dividing by 300 (the maximum value is
310) to avoid scaling problems encountered with `glmer()`. This is
helpful in any case since it puts all the parameter estimates on a
similar scale. The first brood is the reference level so the slope for
this group is estimated as $-0.0437$ and is not statistically
significant, confirming the impression from the plot. We can see that
numbers of offspring in the second and third broods start out
significantly higher for zero concentration of the herbicide, with
estimates of $1.1688$ and $1.3512$. But as concentration increases, we
see that the numbers decrease significantly, with slopes of $-1.6730$
and $-1.8312$ relative to the first brood. The individual SD is
estimated at $0.302$ which is noticeably smaller than the estimates
above, indicating that the brood and concentration effects outweigh the
individual variation.
We would need to work harder to test the significance of the effects as
`lmerTest` is of no help for GLMMs. We do have the z-tests but in this
example, they do not give us the tests we want.
# NLME
Cannot fit GLMMs (other than normal response).
# MMRM
Cannot fit GLMMs (other than normal response).
# GLMMTMB
See the discussion for the [single random effect example](pulpfreq.md#GLMMTMB)
for some introduction.
```{r}
library(glmmTMB)
```
We can fit the model with:
```{r}
gtmod <- glmmTMB(live ~ I(conc/300)*brood + (1|id),
family=poisson, data=lnitrofen)
summary(gtmod)
```
Almost the same output as with `lme4` although a different optimizer
has been used for the fit.
# Discussion
Only `lme4` and `glmmTMB` can manage this.
# Package version info
```{r}
sessionInfo()
```