Now that MXnet and Keras have really caught on, we've entered an era where (more or less) anyone can casually get hands-on with Deep Learning — that's the ground my previous post covered.

So, now that it's easy to try, the natural question is: how does Deep Learning actually behave in practice? In this series of technical posts I'd like to get a feel for the reality, at a leisurely pace, by swapping in various sample datasets and watching how training and prediction behave.
As for the framework, either MXnet or Keras would do, but purely because it's handier in my own environment I'll standardize on MXnet. If there are requests I'll append Keras versions later, so if you really can't figure out how to make something run in Keras, please let me know in the comments.
About the addendum
A very important point is contained in the addendum: the R examples written there are the correct ones, so please be sure to read that part as well.
Tennis: the Grand Slam tennis dataset
Tennis is one of the classic datasets from the UCI ML Repository; I've covered it on this blog many times, so many of you probably remember it. It also came up again in a recent post.
So we'll use it again this time. A lightly preprocessed version is up on my GitHub, so load it into your R workspace as follows*1.
# Men's matches
> dm <- read.csv('https://github.com/ozt-ca/tjo.hatenablog.samples/raw/master/r_samples/public_lib/jp/exp_uci_datasets/tennis/men.txt', header=T, sep='\t')
# Women's matches
> dw <- read.csv('https://github.com/ozt-ca/tjo.hatenablog.samples/raw/master/r_samples/public_lib/jp/exp_uci_datasets/tennis/women.txt', header=T, sep='\t')
# Drop the uninformative variables
> dm <- dm[,-c(1,2,6,16,17,18,19,20,21,24,34,35,36,37,38,39)]
> dw <- dw[,-c(1,2,6,16,17,18,19,20,21,24,34,35,36,37,38,39)]
With that, only the informative explanatory variables should now be sitting in the two data frames dm and dw.
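Just to be safe, here is a quick sanity-check sketch (my own addition; it assumes, consistent with the code below, that the first column Result is the 0/1 match outcome used as the label):

# Sanity check: both frames should have the same columns, and the
# first column (Result) should be a roughly balanced 0/1 outcome
> dim(dm); dim(dw)
> identical(names(dm), names(dw))
> table(dm$Result); table(dw$Result)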
Training on the men's data and predicting the women's results with Deep Learning (DNN) in MXnet
Now let's finally try classifying the Grand Slam tennis dataset with Deep Learning. This dataset doesn't have any particular spatial structure, so a plain DNN should be plenty*2. As noted at the top, we're using MXnet; first let's just get the data into shape.
> library(mxnet)
> train <- data.matrix(dm)
> test <- data.matrix(dw)
> train.x <- train[,-1]
> train.y <- train[,1]
> test.x <- test[,-1]
> test.y <- test[,1]
After that, you simply write out the DNN model definition and run it through the mx.model.FeedForward.create method to build the DNN.
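In other words, every run below follows the same two-step pattern: declare the network symbolically, then hand it to the trainer. A minimal sketch of that pattern (layer sizes here are placeholders, not a recommendation):

# 1) Declare a symbolic graph: input -> hidden layer(s) -> 2-way softmax
> data <- mx.symbol.Variable("data")
> fc <- mx.symbol.FullyConnected(data, name="fc", num_hidden=10)
> act <- mx.symbol.Activation(fc, name="tanh", act_type="tanh")
> out <- mx.symbol.FullyConnected(act, name="out", num_hidden=2)
> softmax <- mx.symbol.SoftmaxOutput(out, name="softmax")
# 2) Train; predict() returns per-class probabilities (classes x samples),
#    so max.col(t(preds)) - 1 recovers the 0/1 label
> model <- mx.model.FeedForward.create(softmax, X=train.x, y=train.y,
+     ctx=mx.cpu(), num.round=10, array.batch.size=100,
+     learning.rate=0.03, momentum=0.99, eval.metric=mx.metric.accuracy,
+     array.layout = "rowmajor")
> preds <- predict(model, test.x, array.layout = "rowmajor")
> pred.label <- max.col(t(preds)) - 1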
Without thinking too hard, try two hidden layers at (feature dims ×10, ×10, ×5), as with MNIST
I remembered that when classifying MNIST with a DNN, two hidden layers at roughly 2400-2400-1200 units*3 generally produced good results, so as a first shot let's build a DNN at (feature dims ×10, feature dims ×10, feature dims ×5). Incidentally, since the output goes through softmax, the final 2-unit layer is merely a read-out layer.
> data <- mx.symbol.Variable("data")
> fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=220)
> act1 <- mx.symbol.Activation(fc1, name="tanh1", act_type="tanh")
> fc2 <- mx.symbol.FullyConnected(act1, name="fc2", num_hidden=220)
> act2 <- mx.symbol.Activation(fc2, name="tanh2", act_type="tanh")
> fc3 <- mx.symbol.FullyConnected(act2, name="fc3", num_hidden=110)
> act3 <- mx.symbol.Activation(fc3, name="tanh3", act_type="tanh")
> fc4 <- mx.symbol.FullyConnected(act3, name="fc4", num_hidden=2)
> softmax <- mx.symbol.SoftmaxOutput(fc4, name="softmax")
> devices <- mx.cpu()
> mx.set.seed(71)
> model <- mx.model.FeedForward.create(softmax, X=train.x, y=train.y,
+     ctx=devices, num.round=100, array.batch.size=100,
+     learning.rate=0.03, momentum=0.99, eval.metric=mx.metric.accuracy,
+     initializer=mx.init.uniform(0.5), array.layout = "rowmajor",
+     epoch.end.callback=mx.callback.log.train.metric(100))
Start training with 1 devices
[1] Train-accuracy=0.5425
[2] Train-accuracy=0.618
[3] Train-accuracy=0.57
...
[99] Train-accuracy=0.502
[100] Train-accuracy=0.534
> preds <- predict(model, test.x, array.layout = "rowmajor")
> pred.label <- max.col(t(preds)) - 1
> table(test.y, pred.label)
      pred.label
test.y   1
     0 227
     1 225
Hahahahaha, completely hopeless orz. Worse, Train-accuracy sat around 0.5-0.6 from the first epoch to the last and never went anywhere. Apparently the unit counts themselves are simply wrong for this data.
Try a two-hidden-layer DNN at (feature dims ×2, ×2, ×1)
It felt like the unit counts were wildly excessive, so I cut every layer to one fifth. Let's see how that goes.
> data <- mx.symbol.Variable("data")
> fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=46)
> act1 <- mx.symbol.Activation(fc1, name="tanh1", act_type="tanh")
> fc2 <- mx.symbol.FullyConnected(act1, name="fc2", num_hidden=46)
> act2 <- mx.symbol.Activation(fc2, name="tanh2", act_type="tanh")
> fc3 <- mx.symbol.FullyConnected(act2, name="fc3", num_hidden=23)
> act3 <- mx.symbol.Activation(fc3, name="tanh3", act_type="tanh")
> fc4 <- mx.symbol.FullyConnected(act3, name="fc4", num_hidden=2)
> softmax <- mx.symbol.SoftmaxOutput(fc4, name="softmax")
> devices <- mx.cpu()
> mx.set.seed(71)
> model <- mx.model.FeedForward.create(softmax, X=train.x, y=train.y,
+     ctx=devices, num.round=100, array.batch.size=100,
+     learning.rate=0.03, momentum=0.99, eval.metric=mx.metric.accuracy,
+     initializer=mx.init.uniform(0.5), array.layout = "rowmajor",
+     epoch.end.callback=mx.callback.log.train.metric(100))
Start training with 1 devices
[1] Train-accuracy=0.535
[2] Train-accuracy=0.58
[3] Train-accuracy=0.61
[4] Train-accuracy=0.612
[5] Train-accuracy=0.666
[6] Train-accuracy=0.702
[7] Train-accuracy=0.72
[8] Train-accuracy=0.782
[9] Train-accuracy=0.838
[10] Train-accuracy=0.786
[11] Train-accuracy=0.778
[12] Train-accuracy=0.632
...
[99] Train-accuracy=0.51
[100] Train-accuracy=0.466
> preds <- predict(model, test.x, array.layout = "rowmajor")
> pred.label <- max.col(t(preds)) - 1
> table(test.y, pred.label)
      pred.label
test.y   0
     0 227
     1 225
Hmm, still completely hopeless orz. One thing that did catch my eye, though, was the trajectory of Train-accuracy: watching it, the model appears to start overfitting partway through. So next let's trim the epoch count down to around ten. From here on it's trial and error.
> data <- mx.symbol.Variable("data")
> fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=46)
> act1 <- mx.symbol.Activation(fc1, name="tanh1", act_type="tanh")
> fc2 <- mx.symbol.FullyConnected(act1, name="fc2", num_hidden=46)
> act2 <- mx.symbol.Activation(fc2, name="tanh2", act_type="tanh")
> fc3 <- mx.symbol.FullyConnected(act2, name="fc3", num_hidden=23)
> act3 <- mx.symbol.Activation(fc3, name="tanh3", act_type="tanh")
> fc4 <- mx.symbol.FullyConnected(act3, name="fc4", num_hidden=2)
> softmax <- mx.symbol.SoftmaxOutput(fc4, name="softmax")
> devices <- mx.cpu()
> mx.set.seed(71)
> model <- mx.model.FeedForward.create(softmax, X=train.x, y=train.y,
+     ctx=devices, num.round=9, array.batch.size=100,
+     learning.rate=0.03, momentum=0.99, eval.metric=mx.metric.accuracy,
+     initializer=mx.init.uniform(0.5), array.layout = "rowmajor",
+     epoch.end.callback=mx.callback.log.train.metric(100))
Start training with 1 devices
[1] Train-accuracy=0.535
[2] Train-accuracy=0.58
[3] Train-accuracy=0.61
[4] Train-accuracy=0.612
[5] Train-accuracy=0.666
[6] Train-accuracy=0.702
[7] Train-accuracy=0.72
[8] Train-accuracy=0.782
[9] Train-accuracy=0.838
> preds <- predict(model, test.x, array.layout = "rowmajor")
> pred.label <- max.col(t(preds)) - 1
> table(test.y, pred.label)
      pred.label
test.y   0   1
     0 192  35
     1  15 210
> sum(diag(table(test.y, pred.label)))/nrow(test)
[1] 0.8893805
With this it finally behaves like a classifier. At a test accuracy of 0.889, I'd say it's classifying reasonably well.
Keep going for a while, varying the unit counts for good measure
So 46-46-23 gives 0.889 — but what about other combinations? Running through each one here would be tedious, so I'll just list the results (a sketch for automating such sweeps follows the list).
- 46-23-23: 0.602
- 46-23-46: 0.867
- 46-46-46: 0.502
- 69-23-46: 0.520
- 46-23-8: 0.584
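If you'd rather not copy-paste a block per configuration, here is a minimal sketch for automating such sweeps. eval_dnn is a hypothetical helper name of my own; it only reuses the calls from the runs above. Since the post doesn't state the exact epoch counts behind the listed numbers, don't expect this to reproduce them verbatim.

# Hypothetical helper: build, train, and evaluate one DNN for a given
# vector of hidden-layer sizes; reuses the setup from the runs above
eval_dnn <- function(hidden, rounds) {
  sym <- mx.symbol.Variable("data")
  for (i in seq_along(hidden)) {
    sym <- mx.symbol.FullyConnected(sym, name=paste0("fc",i), num_hidden=hidden[i])
    sym <- mx.symbol.Activation(sym, name=paste0("tanh",i), act_type="tanh")
  }
  sym <- mx.symbol.FullyConnected(sym, name="fc_out", num_hidden=2)
  softmax <- mx.symbol.SoftmaxOutput(sym, name="softmax")
  mx.set.seed(71)
  model <- mx.model.FeedForward.create(softmax, X=train.x, y=train.y,
    ctx=mx.cpu(), num.round=rounds, array.batch.size=100,
    learning.rate=0.03, momentum=0.99, eval.metric=mx.metric.accuracy,
    initializer=mx.init.uniform(0.5), array.layout = "rowmajor")
  preds <- predict(model, test.x, array.layout = "rowmajor")
  pred.label <- max.col(t(preds)) - 1
  mean(pred.label == test.y)  # test accuracy
}
# Sweep a few configurations and print the test accuracies
for (h in list(c(46,23,23), c(46,23,46), c(46,46,46), c(69,23,46), c(46,23,8))) {
  cat(paste(h, collapse="-"), ":", eval_dnn(h, rounds=9), "\n")
}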
As the list shows, no amount of further fiddling lifts the test accuracy orz. Maybe the number of layers is the problem.
Next, try a single hidden layer
Perhaps there were simply too many hidden layers to begin with! So let's cut down further (the sweep sketch above works here too, with a shorter hidden vector).
- 46-23: 0.712
- 46-8: 0.861
- 32-8: 0.894
- 69-8: 0.825
- 30-30: 0.531
Hmm, nothing clicks; every one of these falls flat. And note that by this point we're not doing Deep Learning anymore — it's just a plain NN orz
Addendum: normalizing the data, following comments from a Kaggler I greatly respect
Right after this article went up, @toshi_k_datasci, a Kaggler I greatly respect, left me the following comments.
@TJO_datasci Doesn't mxnet need a normalization step (e.g., transforming each variable to mean 0, variance 1)?
— toshi_k (@toshi_k_datasci) June 25, 2016
@TJO_datasci It does run, but I don't think you'll get any real accuracy that way! I gave it a quick try, so please have a look: https://t.co/cxVMx24vVh
— toshi_k (@toshi_k_datasci) June 25, 2016
Yikes, my mistake: I hadn't been doing this sort of thing seriously of late and had completely forgotten orz. The thing is, lots of packages and libraries silently normalize (scale) the features for you by default...*4 So I decided to normalize properly and rerun. And since normalization very often changes the outcome, I redid the other trial-and-error runs as well.
The case of hidden layers at (feature dims ×10, ×10, ×5)
This follows exactly the same pattern as @toshi_k_datasci's script, but let me paste the results from my own machine. First, everything up through preprocessing.
# Standardize every column to mean 0, sd 1
> train_means <- apply(train.x, 2, mean)
> train_stds <- apply(train.x, 2, sd)
> test_means <- apply(test.x, 2, mean)
> test_stds <- apply(test.x, 2, sd)
> train.x <- t((t(train.x)-train_means)/train_stds)
> test.x <- t((t(test.x)-test_means)/test_stds)
With that, the data is normalized just as @toshi_k_datasci advised. Now let's actually run it.
> data <- mx.symbol.Variable("data")
> fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=220)
> act1 <- mx.symbol.Activation(fc1, name="tanh1", act_type="tanh")
> fc2 <- mx.symbol.FullyConnected(act1, name="fc2", num_hidden=220)
> act2 <- mx.symbol.Activation(fc2, name="tanh2", act_type="tanh")
> fc3 <- mx.symbol.FullyConnected(act2, name="fc3", num_hidden=110)
> act3 <- mx.symbol.Activation(fc3, name="tanh3", act_type="tanh")
> fc4 <- mx.symbol.FullyConnected(act3, name="fc4", num_hidden=2)
> softmax <- mx.symbol.SoftmaxOutput(fc4, name="softmax")
> devices <- mx.cpu()
> mx.set.seed(71)
> model <- mx.model.FeedForward.create(softmax, X=train.x, y=train.y,
+     ctx=devices, num.round=100, array.batch.size=100,
+     learning.rate=0.03, momentum=0.99, eval.metric=mx.metric.accuracy,
+     initializer=mx.init.uniform(0.5), array.layout = "rowmajor",
+     epoch.end.callback=mx.callback.log.train.metric(100))
Start training with 1 devices
[1] Train-accuracy=0.7375
[2] Train-accuracy=0.914
[3] Train-accuracy=0.952
[4] Train-accuracy=0.968
[5] Train-accuracy=0.98
[6] Train-accuracy=0.992
[7] Train-accuracy=1
[8] Train-accuracy=1
[9] Train-accuracy=0.998
[10] Train-accuracy=1
[11] Train-accuracy=1
...
[99] Train-accuracy=1
[100] Train-accuracy=1
> preds <- predict(model, test.x, array.layout = "rowmajor")
> pred.label <- max.col(t(preds)) - 1
> table(test.y, pred.label)
      pred.label
test.y   0   1
     0 209  18
     1  19 206
> sum(diag(table(test.y, pred.label)))/nrow(dw)
[1] 0.9181416
Test accuracy improved overwhelmingly, to 0.918. Considering that before normalization the model learned nothing and just marched along at 0.5, it seems almost too good to be true. It still looks like it overfits partway through, though, so as a rough form of early stopping let's cut the epoch count to 10.
> data <- mx.symbol.Variable("data")
> fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=220)
> act1 <- mx.symbol.Activation(fc1, name="tanh1", act_type="tanh")
> fc2 <- mx.symbol.FullyConnected(act1, name="fc2", num_hidden=220)
> act2 <- mx.symbol.Activation(fc2, name="tanh2", act_type="tanh")
> fc3 <- mx.symbol.FullyConnected(act2, name="fc3", num_hidden=110)
> act3 <- mx.symbol.Activation(fc3, name="tanh3", act_type="tanh")
> fc4 <- mx.symbol.FullyConnected(act3, name="fc4", num_hidden=2)
> softmax <- mx.symbol.SoftmaxOutput(fc4, name="softmax")
> devices <- mx.cpu()
> mx.set.seed(71)
> model <- mx.model.FeedForward.create(softmax, X=train.x, y=train.y,
+     ctx=devices, num.round=10, array.batch.size=100,
+     learning.rate=0.03, momentum=0.99, eval.metric=mx.metric.accuracy,
+     initializer=mx.init.uniform(0.5), array.layout = "rowmajor",
+     epoch.end.callback=mx.callback.log.train.metric(100))
Start training with 1 devices
[1] Train-accuracy=0.7375
[2] Train-accuracy=0.914
[3] Train-accuracy=0.952
[4] Train-accuracy=0.968
[5] Train-accuracy=0.98
[6] Train-accuracy=0.992
[7] Train-accuracy=1
[8] Train-accuracy=1
[9] Train-accuracy=0.998
[10] Train-accuracy=1
> preds <- predict(model, test.x, array.layout = "rowmajor")
> pred.label <- max.col(t(preds)) - 1
> table(test.y, pred.label)
      pred.label
test.y   0   1
     0 213  14
     1  21 204
> sum(diag(table(test.y, pred.label)))/nrow(dw)
[1] 0.9225664
Test accuracy climbed to 0.923. If we've come this far, I'd call it OK for now.
Trying (feature dims ×2, ×2, ×1)
As we already saw above, the unit counts seemed possibly too high, so I cut them to one fifth.
> data <- mx.symbol.Variable("data")
> fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=46)
> act1 <- mx.symbol.Activation(fc1, name="tanh1", act_type="tanh")
> fc2 <- mx.symbol.FullyConnected(act1, name="fc2", num_hidden=46)
> act2 <- mx.symbol.Activation(fc2, name="tanh2", act_type="tanh")
> fc3 <- mx.symbol.FullyConnected(act2, name="fc3", num_hidden=23)
> act3 <- mx.symbol.Activation(fc3, name="tanh3", act_type="tanh")
> fc4 <- mx.symbol.FullyConnected(act3, name="fc4", num_hidden=2)
> softmax <- mx.symbol.SoftmaxOutput(fc4, name="softmax")
> devices <- mx.cpu()
> mx.set.seed(71)
> model <- mx.model.FeedForward.create(softmax, X=train.x, y=train.y,
+     ctx=devices, num.round=30, array.batch.size=100,
+     learning.rate=0.03, momentum=0.99, eval.metric=mx.metric.accuracy,
+     initializer=mx.init.uniform(0.5), array.layout = "rowmajor",
+     epoch.end.callback=mx.callback.log.train.metric(100))
Start training with 1 devices
[1] Train-accuracy=0.66
[2] Train-accuracy=0.814
[3] Train-accuracy=0.864
[4] Train-accuracy=0.902
[5] Train-accuracy=0.904
[6] Train-accuracy=0.932
[7] Train-accuracy=0.934
[8] Train-accuracy=0.954
[9] Train-accuracy=0.944
[10] Train-accuracy=0.952
[11] Train-accuracy=0.958
[12] Train-accuracy=0.962
[13] Train-accuracy=0.964
[14] Train-accuracy=0.976
[15] Train-accuracy=0.976
[16] Train-accuracy=0.976
[17] Train-accuracy=0.976
[18] Train-accuracy=0.982
[19] Train-accuracy=0.982
[20] Train-accuracy=0.99
[21] Train-accuracy=0.994
[22] Train-accuracy=0.994
[23] Train-accuracy=0.996
[24] Train-accuracy=0.994
[25] Train-accuracy=0.998
[26] Train-accuracy=1
[27] Train-accuracy=1
[28] Train-accuracy=1
[29] Train-accuracy=1
[30] Train-accuracy=1
> preds <- predict(model, test.x, array.layout = "rowmajor")
> pred.label <- max.col(t(preds)) - 1
> table(test.y, pred.label)
      pred.label
test.y   0   1
     0 211  16
     1  18 207
> sum(diag(table(test.y, pred.label)))/nrow(dw)
[1] 0.9247788
Test accuracy rose to 0.925. That's arguably sufficient accuracy. Incidentally, I tried the other unit-count settings in the same way, and while none of them exceeded 0.925, every single one now cleared a test accuracy of 0.900. So it's clear that with normalization the DNN finally works as it should.
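By the way, the manual standardization in the addendum can also be written with base R's scale(), which performs the same column-wise centering and scaling (each matrix by its own column means and standard deviations, exactly as above); a minimal equivalent sketch:

# Equivalent to the t()/apply() standardization above: scale() centers
# each column to mean 0 and divides by its standard deviation
> train.x <- scale(data.matrix(dm)[,-1])
> test.x <- scale(data.matrix(dw)[,-1])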
Comparing against other classifiers
I actually gave the trick away when I spoke in Akihabara on June 15: the relationship between these athletes' wins and losses and their match stats is fairly simple, being full of variables of the form "the winner played more efficiently / the loser played less efficiently". Consequently, this Grand Slam tennis dataset behaves as an effectively linearly separable pattern. It was therefore already clear that the best score would come from L1-regularized logistic regression or a linear SVM, and trying them gives the following.
# L1-regularized logistic regression
> library(glmnet)
> dm.l1 <- cv.glmnet(as.matrix(dm[,-1]), as.matrix(dm[,1]), family='binomial', alpha=1)
> table(dw$Result, round(predict(dm.l1, newx=as.matrix(dw[,-1]), type='response', s=dm.l1$lambda.min),0))
      0   1
  0 215  12
  1  18 207
> sum(diag(table(dw$Result, round(predict(dm.l1, newx=as.matrix(dw[,-1]), type='response', s=dm.l1$lambda.min),0))))/nrow(dw)
[1] 0.9336283
That's a test accuracy of 0.934, beating every single DNN score we sweated over with MXnet above orz
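Incidentally, it's worth peeking at which match stats survive the L1 penalty; a quick sketch against the dm.l1 model just fitted (the non-zero coefficients are the variables the sparse model actually uses):

# List the non-zero coefficients of the cross-validated L1 model above
> cf <- coef(dm.l1, s = dm.l1$lambda.min)
> cf[cf[,1] != 0, , drop = FALSE]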
# Linear SVM
> library(e1071)
> dm.svm.l <- svm(as.factor(Result)~., dm, kernel='linear')
> table(dw$Result, predict(dm.svm.l, newdata=dw[,-1]))
      0   1
  0 214  13
  1  16 209
> sum(diag(table(dw$Result, predict(dm.svm.l, newdata=dw[,-1]))))/nrow(dw)
[1] 0.9358407
A test accuracy of 0.936, which again beats every DNN orz. The conclusion: on the Grand Slam tennis dataset, classical linear classifiers like L1-regularized logistic regression and linear SVM simply perform better than anything we trained with Deep Learning (DNN).
As a bonus, look at the behavior on a linearly separable 2D pattern
By this point I'd grown suspicious of how DNNs behave on linearly separable patterns (separable, mind you, not inseparable!), so I ran a quick test. The setup is a 4-3-3 DNN; the script is below.
# Import the data, preprocess it for MXnet, and prepare the plotting
# grid together with its ground-truth labels
> dbi <- read.csv('https://github.com/ozt-ca/tjo.hatenablog.samples/raw/master/r_samples/public_lib/jp/dbi_large.txt', header=T, sep='\t')
> dbi$label <- dbi$label-1
> px <- seq(-4,4,0.03)
> py <- seq(-4,4,0.03)
> pgrid <- expand.grid(px,py)
> names(pgrid) <- names(dbi)[-3]
> dbi_train <- data.matrix(dbi)
> dbi_train.x <- dbi_train[,-3]
> dbi_train.y <- dbi_train[,3]
> dbi_test <- data.matrix(pgrid)
> label_grid <- rep(0, nrow(pgrid))
> for (i in 1:nrow(pgrid)){
+     if (pgrid[i,1]+pgrid[i,2]<0){label_grid[i] <- 1}
+ }
# Run a 4-3-3 two-hidden-layer DNN with MXnet
> data <- mx.symbol.Variable("data")
> fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=4)
> act1 <- mx.symbol.Activation(fc1, name="tanh1", act_type="tanh")
> fc2 <- mx.symbol.FullyConnected(act1, name="fc2", num_hidden=3)
> act2 <- mx.symbol.Activation(fc2, name="tanh2", act_type="tanh")
> fc3 <- mx.symbol.FullyConnected(act2, name="fc3", num_hidden=3)
> act3 <- mx.symbol.Activation(fc3, name="tanh3", act_type="tanh")
> fc4 <- mx.symbol.FullyConnected(act3, name="fc4", num_hidden=2)
> softmax <- mx.symbol.SoftmaxOutput(fc4, name="softmax")
> devices <- mx.cpu()
> mx.set.seed(71)
> model <- mx.model.FeedForward.create(softmax, X=dbi_train.x, y=dbi_train.y,
+     ctx=devices, num.round=20, array.batch.size=100,
+     learning.rate=0.03, momentum=0.99, eval.metric=mx.metric.accuracy,
+     initializer=mx.init.uniform(0.5), array.layout = "rowmajor",
+     epoch.end.callback=mx.callback.log.train.metric(100))
Start training with 1 devices
[1] Train-accuracy=0.944848484848485
...
[20] Train-accuracy=0.9479
> preds <- predict(model, dbi_test, array.layout = "rowmajor")
> pred.label <- max.col(t(preds)) - 1
> table(label_grid, pred.label)
          pred.label
label_grid     0     1
         0 33157  2354
         1   287 35491
> sum(diag(table(label_grid, pred.label)))/nrow(pgrid)
[1] 0.9629536
# Plot the points, the true boundary, and the estimated decision boundary
> plot(c(), type='n', xlim=c(-4,4), ylim=c(-4,4), xlab='', ylab='')
> par(new=T)
> polygon(c(-4,4,4), c(4,-4,4), col='#dddddd')
> par(new=T)
> polygon(c(-4,-4,4), c(4,-4,-4), col='#ffdddd')
> par(new=T)
> plot(dbi[,-3], pch=19, cex=0.5, col=dbi$label+1, xlim=c(-4,4), ylim=c(-4,4), xlab='', ylab='')
> par(new=T)
> contour(px, py, array(pred.label, dim=c(length(px),length(py))), col='purple', lwd=5, levels=0.5, drawlabels=F, xlim=c(-4,4), ylim=c(-4,4))
That looks pretty good. Test accuracy against the true class boundary is 0.963, which isn't bad. So what happens if we assemble an extreme DNN, say a 200-200-100 configuration?
> data <- mx.symbol.Variable("data")
> fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=200)
> act1 <- mx.symbol.Activation(fc1, name="tanh1", act_type="tanh")
> fc2 <- mx.symbol.FullyConnected(act1, name="fc2", num_hidden=200)
> act2 <- mx.symbol.Activation(fc2, name="tanh2", act_type="tanh")
> fc3 <- mx.symbol.FullyConnected(act2, name="fc3", num_hidden=100)
> act3 <- mx.symbol.Activation(fc3, name="tanh3", act_type="tanh")
> fc4 <- mx.symbol.FullyConnected(act3, name="fc4", num_hidden=2)
> softmax <- mx.symbol.SoftmaxOutput(fc4, name="softmax")
> devices <- mx.cpu()
> mx.set.seed(71)
> model <- mx.model.FeedForward.create(softmax, X=dbi_train.x, y=dbi_train.y,
+     ctx=devices, num.round=20, array.batch.size=100,
+     learning.rate=0.03, momentum=0.99, eval.metric=mx.metric.accuracy,
+     initializer=mx.init.uniform(0.5), array.layout = "rowmajor",
+     epoch.end.callback=mx.callback.log.train.metric(100))
Start training with 1 devices
[1] Train-accuracy=0.935656565656566
[2] Train-accuracy=0.938
[3] Train-accuracy=0.9425
[4] Train-accuracy=0.939
[5] Train-accuracy=0.9352
[6] Train-accuracy=0.9379
[7] Train-accuracy=0.936300000000001
[8] Train-accuracy=0.938000000000001
[9] Train-accuracy=0.9439
[10] Train-accuracy=0.9343
[11] Train-accuracy=0.9426
[12] Train-accuracy=0.9412
[13] Train-accuracy=0.9389
[14] Train-accuracy=0.9372
[15] Train-accuracy=0.9372
[16] Train-accuracy=0.9385
[17] Train-accuracy=0.941000000000001
[18] Train-accuracy=0.937900000000001
[19] Train-accuracy=0.9419
[20] Train-accuracy=0.9371
> preds <- predict(model, dbi_test, array.layout = "rowmajor")
> pred.label <- max.col(t(preds)) - 1
> table(label_grid, pred.label)
          pred.label
label_grid     0     1
         0 28881  6630
         1     6 35772
> sum(diag(table(label_grid, pred.label)))/nrow(pgrid)
[1] 0.9069141
> plot(c(), type='n', xlim=c(-4,4), ylim=c(-4,4), xlab='', ylab='')
> par(new=T)
> polygon(c(-4,4,4), c(4,-4,4), col='#dddddd')
> par(new=T)
> polygon(c(-4,-4,4), c(4,-4,-4), col='#ffdddd')
> par(new=T)
> plot(dbi[,-3], pch=19, cex=0.5, col=dbi$label+1, xlim=c(-4,4), ylim=c(-4,4), xlab='', ylab='')
> par(new=T)
> contour(px, py, array(pred.label, dim=c(length(px),length(py))), col='purple', lwd=5, levels=0.5, drawlabels=F, xlim=c(-4,4), ylim=c(-4,4))
In a sense this is exactly as expected: it overfits in odd places and traces out a bizarre decision boundary, and test accuracy against the true class boundary indeed deteriorates all the way to 0.907.
This isn't specific to DNNs: when you build Deep Learning models through a framework, it's easy to set unit counts on a "bigger is probably better" whim. But the results on the Grand Slam tennis dataset in the first half and on the linearly separable 2D pattern in the second half both hint that inflating unit counts without justification invites overfitting.
Keeping that point in mind, I'd like to continue this series with a variety of other datasets.
Closing thoughts, addendum included
As described above, once I normalized properly following @toshi_k_datasci's advice, performance did improve sharply. It reconfirmed for me that there's a solid reason why the standard textbooks — "Hajipata", the "Castella book", and even the "yellow book", i.e. PRML — all hammer on "always normalize your features" orz. For that matter, LIBSVM (the engine behind {e1071}, which the linear SVM above uses) normalizes the data by default, so a comparison without normalization was unfair from the start.
Still, even after the DNNs reached what looks like their full performance, they could not catch linear classifiers like L1-regularized logistic regression and linear SVM on this Grand Slam tennis dataset. That, I suspect, is precisely what makes it hard to judge where DNNs are the right tool.
So, above all else: whether it's MXnet or Keras, when using a Deep Learning framework I'll make sure never to forget normalization — scaling, that is orz. A weekend that taught me a good lesson.
*1: It's of course fine to just git clone the repo and read the files from a local copy.
*2: If anything, it might be interesting to see what happens if you threw a CNN at this.
*3: The first of these being the input layer.
*4: Some, like svm in {e1071}, even normalize silently unless you explicitly set scale = F.