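This article is supposed to be the day 12 entry of R Advent Calendar 2021.
Yesterday's article was a from-scratch implementation of Lasso regression. It is seriously good, and a must-read for anyone trying to fully understand regularized regression.
And tomorrow, apparently, logistic regression will be implemented from scratch. From-scratch implementations are great.
Not that I write them myself.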
- Describes how to fit generalized linear mixed models in R.
- Covers the estimation of random effects as far as I have managed to follow it as of December 11, 2021.
- Notes on extensions of mixed models (regularization and nonlinear models), kept for future investigation and research.
It's been a while
I have not updated this blog since May.
I may write a separate article looking back on the year, but the main things that happened were:
- I got worn out job hunting.
- While I was busy with that, a junior colleague I had spent a year training got poached.
- I have been doing recruiting.
- I touch things other than R more often (lately PowerPoint, Python, Rust, and so on).
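What I'm writing about today
In the 2019 Advent Calendar I wrote, rather loosely, about the progression from the normal linear model to the generalized linear mixed model.
This year I'd like to go into detail on the generalized linear mixed model in particular.
Why?
Because it's interesting.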
Environment
The scripts in this article were run in the following environment.
    version
    # platform       x86_64-pc-linux-gnu
    # arch           x86_64
    # os             linux-gnu
    # system         x86_64, linux-gnu
    # status
    # major          4
    # minor          1.1
    # year           2021
    # month          08
    # day            10
    # svn rev        80725
    # language       R
    # version.string R version 4.1.1 (2021-08-10)
    # nickname       Kick Things
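One of the good things about R, I think, is that it depends very little on the OS.
These days it matters a great deal to spin up a virtual environment, even locally, to avoid environment dependence. Still, for people who came to programming through data analysis, whose mother tongue is R and who love R, I think being able to write code end to end in a local environment first is an important step.
Incidentally, I wrote an article on folder management in the past, which I hope can serve as a light reference.
Let's manage data analysis properly with R: folder structure edition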
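Data
We use the penguin data that can be loaded with `palmerpenguins`.
With measurement year, penguin species, habitat and so on, it is a good dataset for applying mixed models too.
Even for people who are complete beginners at data analysis, it includes basic concepts such as missing values and categorical variables, which I think is very nice.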
    library(palmerpenguins)
    summary(penguins)
    #       species          island    bill_length_mm  bill_depth_mm
    #  Adelie   :152   Biscoe   :168   Min.   :32.10   Min.   :13.10
    #  Chinstrap: 68   Dream    :124   1st Qu.:39.23   1st Qu.:15.60
    #  Gentoo   :124   Torgersen: 52   Median :44.45   Median :17.30
    #                                  Mean   :43.92   Mean   :17.15
    #                                  3rd Qu.:48.50   3rd Qu.:18.70
    #                                  Max.   :59.60   Max.   :21.50
    #                                  NA's   :2       NA's   :2
    #
    #  flipper_length_mm  body_mass_g       sex           year
    #  Min.   :172.0      Min.   :2700   female:165   Min.   :2007
    #  1st Qu.:190.0      1st Qu.:3550   male  :168   1st Qu.:2007
    #  Median :197.0      Median :4050   NA's  : 11   Median :2008
    #  Mean   :200.9      Mean   :4202                Mean   :2008
    #  3rd Qu.:213.0      3rd Qu.:4750                3rd Qu.:2009
    #  Max.   :231.0      Max.   :6300                Max.   :2009
    #  NA's   :2          NA's   :2
Implementation
I suspect most people are thinking "I don't care about the theory, just show me the code!"
I break down the theory in my own way later in the article, so come back and read it if you get lost.
There are several packages for fitting linear mixed models in R. They differ in the details, but in this article I demonstrate with `lme4`, my favorite.
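Happily, anyone who knows how to run a regression in R should be able to run this without trouble.
The only code you need to learn is the part about the variables whose random effects you want to estimate.
Concretely, those variables are written in the form `(variable | grouping variable)`.
To add an intercept, include `1`. (lme4 also adds the random intercept when it is left out; writing `0 +` removes it.)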
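As a rough illustration of the syntax just described, the formulas below spell out three common variants. The names `y`, `x`, and `g` are placeholders introduced only for this sketch; they are not columns of the article's data, and nothing is fitted here.

```r
# Placeholder names (assumptions): y = response, x = predictor, g = grouping variable.
# These are plain R formula objects; no model is fitted.

# Random intercept only: each group gets its own intercept.
f_intercept <- y ~ x + (1 | g)

# Random slope only: "0 +" removes the group-level intercept.
f_slope <- y ~ x + (0 + x | g)

# Random intercept and random slope, allowed to correlate.
f_both <- y ~ x + (1 + x | g)

# The variables a formula refers to, in order of appearance.
all.vars(f_both)
```

Any of these can be passed as the first argument of `lme4::lmer()` once the placeholder names are replaced with real columns.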
Here we regress penguin bill length (`bill_length_mm`) on penguin flipper length (`flipper_length_mm`).
For the random effects, based on the hypothesis that the intercept and the effect of flipper length differ by penguin species (`species`), we write `(1 + flipper_length_mm | species)`.
    # define data
    # for simplicity, drop rows that contain NA
    usedata <- na.omit(penguins)

    # glmm model run (for comparison)
    glmm_model <- lme4::lmer(
      bill_length_mm ~ flipper_length_mm + (1 + flipper_length_mm | species),
      data = usedata
    )
The modeling part takes just two statements and a single function. Very easy.
Let's look at the result.
    summary(glmm_model)
    # Linear mixed model fit by REML ['lmerMod']
    # Formula: bill_length_mm ~ flipper_length_mm + (1 + flipper_length_mm | species)
    #    Data: usedata
    #
    # REML criterion at convergence: 1596.2
    #
    # Scaled residuals:
    #     Min      1Q  Median      3Q     Max
    # -2.6469 -0.6595  0.0297  0.6924  4.9957
    #
    # Random effects:
    #  Groups   Name              Variance  Std.Dev. Corr
    #  species  (Intercept)       2.6567656 1.62996
    #           flipper_length_mm 0.0009521 0.03086  -1.00
    #  Residual                   6.7064017 2.58967
    # Number of obs: 333, groups:  species, 3
    #
    # Fixed effects:
    #                   Estimate Std. Error t value
    # (Intercept)        1.25267    4.34480   0.288
    # flipper_length_mm  0.21789    0.02767   7.873
    #
    # Correlation of Fixed Effects:
    #             (Intr)
    # flppr_lngt_ -0.886
    # optimizer (nloptwrap) convergence code: 0 (OK)
    # unable to evaluate scaled gradient
    # Model failed to converge: degenerate  Hessian with 1 negative eigenvalues
This time we got a few warnings, but let's set those aside for now*1.
The `Fixed effects` part can be interpreted like ordinary regression coefficients. The `Random effects` part reports only the estimated variances. To pull out the coefficients themselves, use the `ranef()` function.
    ranef(glmm_model)
    # $species
    #           (Intercept) flipper_length_mm
    # Adelie      1.4836890      -0.028092178
    # Chinstrap  -1.7997485       0.034099870
    # Gentoo      0.3160595      -0.006007692
For the intercept and flipper length, the "coefficients by penguin species" can be computed as the sum of the fixed effect and the random effect.
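The sum described above can be sketched as follows. This refits the article's model so the block is self-contained (so it may emit the same convergence warnings), and compares the hand-computed sum with `coef()`, which lme4 provides for exactly this purpose. The object names `by_hand` and `by_coef` are made up for this sketch.

```r
library(lme4)
library(palmerpenguins)

usedata <- na.omit(penguins)

# Same model as in the article; it may emit the same convergence warnings.
glmm_model <- lmer(
  bill_length_mm ~ flipper_length_mm + (1 + flipper_length_mm | species),
  data = usedata
)

fe <- fixef(glmm_model)                     # population-level coefficients
re <- as.matrix(ranef(glmm_model)$species)  # species-level deviations

# Add each fixed effect to the matching column of deviations.
by_hand <- sweep(re, 2, fe[colnames(re)], "+")

# coef() returns the per-species coefficients directly.
by_coef <- as.matrix(coef(glmm_model)$species)

by_hand
all.equal(by_hand, by_coef, check.attributes = FALSE)
```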
Fitting the model is very easy. The modeling process itself, however, is very hard.
In this example there was a hypothesis that "the size or length of each body part differs by penguin species", and it turned out that this could be evaluated with a reasonable degree of validity.
In real problems, when such a hypothesis cannot be expressed in a defensible form, the generalized linear mixed model ends up being a model that is "too complex". That is not good.
So do set up a proper hypothesis before analyzing. That said, you never learn to use a tool without using it, so one approach (unorthodox as it is) might be to define "the most complex model" the data could support and then work downward from there, asking "can this be solved more simply?"
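One way to sketch that "start complex, then simplify" workflow is to fit the full model and a simpler candidate, check whether the full one is singular, and compare the two. The simpler random-intercept model `m_simple` is an assumption chosen for illustration; the article itself fits only the full model.

```r
library(lme4)
library(palmerpenguins)

usedata <- na.omit(penguins)

# "Most complex" candidate: random intercept and slope by species.
m_full <- lmer(
  bill_length_mm ~ flipper_length_mm + (1 + flipper_length_mm | species),
  data = usedata
)

# Simpler candidate (assumption): random intercept only.
m_simple <- lmer(
  bill_length_mm ~ flipper_length_mm + (1 | species),
  data = usedata
)

# TRUE when a variance component sits on the boundary of its range,
# for example a correlation of -1 or +1.
isSingular(m_full)

# Likelihood ratio comparison; anova() refits both models with ML.
cmp <- anova(m_simple, m_full)
cmp
```

With only three species, the data carry little information about a 2 x 2 covariance matrix, so a singular fit for the full model would not be surprising. The comparison table is a starting point for judgment rather than a verdict.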
The R Advent Calendar part ends here, so everything from this point on is a bonus.
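What is a generalized linear mixed model?
Broadly speaking, the generalized linear mixed model is a member of the linear regression family.
Its distinguishing feature is that it requires weaker assumptions than ordinary regression.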
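The ordinary regression model
Suppose we draw a reasonably good sample from the whole population of Japan and consider a simple regression model that regresses income on explanatory variables (denoted ${X}$) such as age, sex, and occupation (hereafter "personal attributes", since the list is long).
$$Income_i = \sum_{j=1}^{J}\beta_j X_{i,j} + \epsilon_i$$
Writing this more neatly in matrix form, we consider
$${Income} = {X}{\beta} + {\epsilon}$$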
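To make the matrix form concrete, here is a small sketch on simulated data. The variables `age` and `female` and all numeric values are invented for illustration and have nothing to do with real income data. It shows that solving the normal equations with the design matrix ${X}$ gives the same coefficients as `lm()`.

```r
set.seed(1)

# Simulated stand-in data (assumption: not from the article).
n <- 200
age <- runif(n, 20, 65)
female <- rbinom(n, 1, 0.5)
income <- 150 + 8 * age - 30 * female + rnorm(n, sd = 40)

# Design matrix X, including a column of ones for the intercept.
X <- model.matrix(~ age + female)

# Least-squares estimate from the normal equations: (X'X) beta = X'y.
beta_hat <- solve(crossprod(X), crossprod(X, income))

# The same fit via lm().
fit <- lm(income ~ age + female)

cbind(normal_eq = drop(beta_hat), lm = coef(fit))
```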
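Statistical models usually come with assumptions. The model above assumes that the relationship is evaluated for the country as a whole. In other words, it models the "national average" relationship between income and personal attributes across Japan. That result is not meaningless, but when looking at data such as the Employment Status Survey, a hypothesis like "don't the regression coefficients of personal attributes differ by prefecture of residence?" may come up.
With the model above, however, it is hard to build a model in which "the regression coefficients change by prefecture". How should we rebuild it?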
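Linear mixed models
If the regression coefficients differ by prefecture, then for a single variable such as age we can imagine a model that yields 47 regression coefficients.
Each observation then only needs to pick up the coefficient of its own prefecture, so we might be able to rewrite the model in a form like the following.
$${Income} = {X}{\beta} + {X}{b} + {\epsilon}$$
The newly introduced ${b}$ collects the prefecture-specific coefficients, arranged so that each person (row) is matched to their prefecture (column). Strictly speaking, the explanatory variables have to be expanded by prefecture for this product to be well defined, which is the role of the matrix ${Z}$ introduced next.
A model that evaluates, for example, the effect of occupation separately for each prefecture in this way is the basic structure of the (generalized) linear mixed model.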
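It does not have to be prefectures. As another example, when a linear mixed model is used for sales forecasting with time series from multiple stores, the explanatory variables that should vary by store are entered as the variable matrix (design matrix) ${Z}$ that multiplies ${b}$. That is, the model is usually written in the form
$${Income} = {X}{\beta} + {Z}{b} + {\epsilon}$$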
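For the penguin model fitted earlier in the article, the two design matrices can be pulled out of the fitted object with `getME()`. This is only a sketch for looking at their shapes; the model is refit here so that the block runs on its own.

```r
library(lme4)
library(palmerpenguins)

usedata <- na.omit(penguins)

glmm_model <- lmer(
  bill_length_mm ~ flipper_length_mm + (1 + flipper_length_mm | species),
  data = usedata
)

X <- getME(glmm_model, "X")  # fixed-effects design matrix
Z <- getME(glmm_model, "Z")  # random-effects design matrix (sparse)

dim(X)  # rows = penguins, columns = intercept + slope
dim(Z)  # rows = penguins, columns = 3 species x (intercept + slope)

# Each row of Z is non-zero only in the columns of that penguin's species.
head(as.matrix(Z), 3)
```

So ${Z}$ is essentially ${X}$ copied into a separate block of columns for each group, which is what lets ${b}$ hold one set of coefficients per species.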
Fixed effects and random effects
Among the regression coefficients, ${\beta}$ is called the "fixed effect" and ${b}$ the "random effect".
Wikipedia also has a concise note on why they are called this: $\beta$ is a constant, whereas ${b}$ is treated as a random variable.
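Estimating ${b}$
Now, ${b}$ is a matrix of regression coefficients, that is, a set of parameters, so it has to be estimated from the structure of the data.
At the same time, as the name "random effect" suggests, it has to be treated as a random variable.
Treating it this way is what makes it possible to estimate the parameters probabilistically.
There is always a gap between theory and implementation, so I will organize this part based on Bates et al. (2015).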
Formulation
Consider the following linear mixed model.
$$Y = {X}{\beta} + {Z}{b} + {\epsilon}$$
We assume that the response variable $Y$ follows a particular distribution (here, the normal distribution).
$$ Y \sim N(\mu, \sigma ^2)$$
Estimating $\mu$ under this assumption gives the same formulation as parameter estimation in ordinary linear regression.
Here we have a mixed model and want to estimate ${b}$, so let's think of the distribution of $Y$ conditional on $b$. That is,
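$$(Y|B = b) \sim N(X \beta + Z b + o, \sigma ^2 W^{-1})$$
This corresponds to equation (2) of Bates et al. (2015). The model says: given the random effect ${b}$, let $Y$ follow a normal distribution. The mean $\mu$ in that case is given by the regression equation. Note that the mean contains no error term $\epsilon$; the $o$ is a known offset, which is zero unless one is specified. ${W}$ is a diagonal matrix of weights on the variance. It hardly comes up in this article, so just read it as "a weight that controls the variance".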
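Our interest is in estimating ${b}$, but estimating its elements carefully one at a time is a lot of work. For prefecture-level parameters, each variable needs 47 of them, and they have to be matched to every observation, so trying to estimate each one individually easily leaves the design matrix rank deficient.
For that reason, it helps if the number of parameters to estimate can be reduced. For example, if ${b}$ can be taken to follow some distribution, we only need to determine the parameters of that distribution, which is convenient.
If it follows a multivariate normal distribution, for instance, it becomes very easy to handle. Or rather, my understanding is that we regard ${b}$ as a random variable and assume the following.
$$B\sim{N}({0}, {\Sigma})$$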
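Here $\Sigma$ is a variance-covariance matrix. This is an assumption about the distribution of the random-effect parameters, not about the distribution of the response, so the response distribution can be extended within the framework of generalized linear models (that is, to any distribution in the exponential family). Indeed, in `lme4` the `glmer` function can fit logistic regression and Poisson regression.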
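As a minimal sketch of the `glmer` call mentioned above, the block below fits a logistic mixed model. It uses the `cbpp` dataset bundled with lme4 rather than the penguin data, purely because it is a standard binomial example; that choice is an assumption of this sketch, not something from the article.

```r
library(lme4)

# cbpp ships with lme4 (assumption: used only as a convenient binomial example).
data(cbpp, package = "lme4")

# Logistic regression with a random intercept per herd.
glmm_logit <- glmer(
  cbind(incidence, size - incidence) ~ period + (1 | herd),
  data = cbpp,
  family = binomial
)

fixef(glmm_logit)    # fixed effects on the log-odds scale
VarCorr(glmm_logit)  # estimated spread of the herd intercepts
```

For count data the same call would take `family = poisson`, with the random-effects part of the formula written exactly as in `lmer`.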
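So we have arrived at this: if ${\Sigma}$ can be estimated, the random effects ${b}$ can be generated probabilistically.
The nice thing here is that the problem has been swapped for a different one. We no longer need to estimate every element of ${b}$ individually; it is enough to estimate the variance components of a multivariate normal distribution.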
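The point that a handful of variance components can stand in for many individual effects can be sketched by simulation. The matrix `Sigma` and the number of groups below are illustrative values chosen for this sketch, not estimates from the article.

```r
set.seed(42)

# Illustrative values (assumptions, not estimates from the article).
Sigma <- matrix(c( 4.0, -0.5,
                  -0.5,  0.25), nrow = 2, byrow = TRUE)
n_groups <- 5000

# Draw b_g ~ N(0, Sigma) for every group via the Cholesky factor:
# if u ~ N(0, I) then L u ~ N(0, L L'), and L L' = Sigma.
L <- t(chol(Sigma))
u <- matrix(rnorm(2 * n_groups), nrow = 2)
b <- t(L %*% u)
colnames(b) <- c("intercept", "slope")

# The sample covariance of the simulated effects should be close to Sigma.
round(cov(b), 2)
```

Ten thousand simulated numbers are summarized here by the three distinct entries of `Sigma`, which is the reduction in parameters the text is describing.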
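Now we finally get to the estimation. To let the computer calculate it well, this ${\Sigma}$ apparently has to be decomposed and massaged in various ways, but I do not understand it very well myself, so I will add to this once I do. The deadline takes priority*2.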
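As a hedged aside, the decomposition can at least be inspected on a fitted model. In Bates et al. (2015) the covariance of the random effects is written as $\Sigma = \sigma^2 \Lambda(\theta) \Lambda(\theta)^T$, where $\Lambda(\theta)$ is a lower-triangular (Cholesky-type) factor; any valid covariance matrix can be factored this way. The sketch below rebuilds $\Sigma$ from `getME(., "Lambda")` and compares it with what `VarCorr()` reports. It refits the article's model, so it may emit the same convergence warnings.

```r
library(lme4)
library(palmerpenguins)

usedata <- na.omit(penguins)

glmm_model <- lmer(
  bill_length_mm ~ flipper_length_mm + (1 + flipper_length_mm | species),
  data = usedata
)

Lambda <- getME(glmm_model, "Lambda")  # relative covariance factor (sparse)
s2 <- sigma(glmm_model)^2              # residual variance

# Covariance of all random effects: sigma^2 * Lambda %*% t(Lambda).
Sigma_b <- as.matrix(s2 * Matrix::tcrossprod(Lambda))

# Lambda is block diagonal with one 2 x 2 block per species, so the
# leading block should match the matrix reported by VarCorr().
Sigma_b[1:2, 1:2]
VarCorr(glmm_model)$species
```

One practical benefit of working with $\theta$ (the free entries of $\Lambda$) is that the optimizer can search over it with simple box constraints, and $\Sigma$ stays a valid covariance matrix even when a variance hits zero.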
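On the significance of effects
It is something of a hidden secret, but `lme4` does not compute the significance of regression coefficients. Or rather, it works under a policy of "not computing it". The reason is simply that "the calculation is hard".
Judging significance requires the degrees of freedom of the test statistic, and in a mixed model those are not straightforward to determine.
The `lmerTest` library, on the other hand, obtains these degrees of freedom through an approximation.
However, as far as I recall, this covers only the significance of the fixed effects, and the random effects are not evaluated in the same coefficient table. (For random-effect terms, `lmerTest` offers likelihood ratio tests through `ranova()`, which is a separate procedure.)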
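A minimal sketch of what `lmerTest` adds is shown below. To keep the fit well behaved it uses a random-intercept-only model, which is a simplification assumed for this example rather than the model fitted in the article.

```r
library(lmerTest)        # masks lme4::lmer with a version that adds tests
library(palmerpenguins)

usedata <- na.omit(penguins)

# Assumption: random intercept only, simpler than the article's model.
m_test <- lmer(
  bill_length_mm ~ flipper_length_mm + (1 | species),
  data = usedata
)

# The coefficient table now carries "df" and "Pr(>|t|)" columns
# (Satterthwaite's approximation is the default method).
coef(summary(m_test))
```

The `df` column is typically not an integer, which is a visible reminder that these degrees of freedom are approximated rather than counted.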
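Extensions to statistical learning
From here on, this is about my current interests in the context of statistical learning.
I plan to write the details up in an article once I fully understand them, but the motivation is that I would like to explore regularization of mixed models and their extension to nonlinear models such as decision trees.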
It is known from the work of Tibshirani and others that regularization can be applied within the framework of generalized linear models.
Since this is an R Advent Calendar, let's talk about R: Lasso, Ridge, and ElasticNet, for example, are easy to fit with the `glmnet` package.
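For reference, a minimal sketch of a lasso fit with `glmnet` on the penguin data might look like the following. The choice of predictors and the penalty value `s = 0.5` are arbitrary assumptions made for illustration, and `species` enters as ordinary dummy variables here, not as a random effect.

```r
library(glmnet)
library(palmerpenguins)

usedata <- na.omit(penguins)

# glmnet wants a numeric matrix; drop the intercept column because
# glmnet adds its own intercept.
x <- model.matrix(
  bill_length_mm ~ flipper_length_mm + bill_depth_mm + body_mass_g + species,
  data = usedata
)[, -1]
y <- usedata$bill_length_mm

# alpha = 1 is the lasso, alpha = 0 is ridge, values in between are elastic net.
fit_lasso <- glmnet(x, y, alpha = 1)

# Coefficients at one (arbitrarily chosen) penalty strength.
coef(fit_lasso, s = 0.5)
```

In practice the penalty strength would usually be chosen by cross-validation, for example with `cv.glmnet()`, rather than fixed by hand.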
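On the other hand, the parameters of a mixed model are split into fixed effects and random effects, and I suspect that is hard to handle with `glmnet`.
What is more, for the random effects there are often cases where you cannot settle heuristically on which variables are needed as random effects. In other words, the demand for variable selection is relatively high. Such regularized generalized linear mixed models can be fitted with the `glmmLasso` package.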
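For mixed models in the decision tree family, a model called GPBoost has recently been proposed.
It adds a mixed-effects part on top of LightGBM, so regularization terms can also be added through LightGBM's parameters.
Of course it can be used from R as well, where it is implemented as `gpboost`.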
References
Baayen, R. H., Davidson, D. J., and Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language.
Bates, D., Mächler, M., Bolker, B., and Walker, S. (2015). Fitting Linear Mixed-Effects Models Using lme4.
Groll, A. and Tutz, G. (2014). Variable selection for generalized linear mixed models by L1-penalized estimation.
Sigrist, F. (2020). Gaussian Process Boosting.
Tibshirani, R. (1996). Regression Shrinkage and Selection via the Lasso.
On lme4 not printing p-values, and on the p-values of lmerTest
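*1: The parameter estimation of the model has not converged, so strictly speaking this is not something that should be ignored.
*2: To say just a little about it: it seems that $\Sigma$ is decomposed as $\sigma ^2 \Lambda(\theta) \Lambda(\theta) ^T$, using the distribution parameter of $Y$ together with $\theta$ and $\Lambda$, which makes it easier to estimate. I do not know why that is possible.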