For some reason most libraries in today's machine learning world are for Python or R, but Common Lisp has fast native compilers and can call C/C++ libraries, so it works perfectly well for machine learning too.
One library actually implemented in Common Lisp that makes deep learning possible is MGL, which is published on Github under the MIT license. So at last month's Lispmeetup #39 I gave a talk introducing MGL. (Slides from that talk.)
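MGL's author, Gábor Melis, won the Higgs Boson Machine Learning Challenge, a 2014 machine learning competition on Kaggle, and is apparently now at DeepMind. The code for that competition is public, and you can see that MGL was used in it. Besides deep learning, MGL also has packages for evolutionary computation, natural language processing and so on. The documentation is properly written too, so the basic material needed to use it appears to be in place.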
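For matrix operations MGL uses MGL-MAT, which I introduced earlier. It can therefore train quickly using OpenBLAS, MKL and the like on the CPU, or CUDA on the GPU. I once implemented a multilayer neural network myself, but that was a naive implementation and could not be called practical. MGL, by contrast, shows performance comparable to Python/C++ libraries (see the benchmark below).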
MGL should run on at least Linux and Mac, but I have only tried it on Linux, so if you got it working on a Mac please let me know.*1
A GPU is not strictly required; if you only use the CPU there is no need to install CUDA either.
- What MGL can do at present
  - Multilayer neural networks (ReLU, Maxout and other activation functions, with Dropout)
  - Recurrent neural networks, LSTM
  - Unsupervised pretraining based on restricted Boltzmann machines (RBM): DBN, DBM
  - GPU computation via cl-cuda. Very fast
  - CPU computation using OpenBLAS etc. via LLA. Fairly fast
- What it cannot do
  - Multi-GPU does not seem to be supported
  - Convolutional neural networks*2
Installing MGL
MGL-MAT depends on LLA and cl-cuda. For installing LLA and MGL-MAT, see my earlier article.
Once MGL-MAT is installed, installing MGL is easy: download the MGL source from Github and quickload it. The other dependencies are pulled in from Quicklisp.
cd ~/quicklisp/local-projects
git clone https://github.com/melisgl/mgl.git
(ql:quickload :mgl)
The MNIST sample that ships with MGL is long, so I made a tidied-up version. The rest of this post follows that sample code.
cd ~/quicklisp/local-projects
git clone https://github.com/masatoi/mgl-user.git
(ql:quickload :mgl-user)
Reading src/example.lisp in that repository should give you the overall flow.
Configuring MGL-MAT
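Deep learning does not need much numerical precision, and single precision is apparently enough.*3 So change the variable *default-mat-ctype*, which specifies the default numeric type MGL-MAT works with, from :double to :float. On the GPU in particular, this setting makes a big difference in speed. Whether CUDA is used can be controlled with the variable *cuda-enabled*.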
;;; Settings for GPU use
(setf *default-mat-ctype* :float)
(setf *cuda-enabled* t)
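Since the text says a GPU is optional, the CPU-only counterpart is presumably just the same two variables with CUDA switched off. This is a sketch under the assumption that MGL-MAT is already loaded and its symbols are visible in the current package, exactly as in the GPU snippet above:

```lisp
;;; CPU-only settings (sketch; assumes MGL-MAT is loaded and that
;;; these symbols are accessible, as in the GPU snippet above)
(setf *default-mat-ctype* :float) ; single precision is still enough
(setf *cuda-enabled* nil)         ; do not use CUDA
```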
Loading the MNIST data
Download the four files from the MNIST data distribution page and unpack them into a suitable directory with gunzip or similar.
Next, set the path to that directory.
(defparameter *mnist-data-dir* "/path/to/mnist-data/")
With that in place, evaluating the following stores the data in the variables *training-data* and *test-data*.
(progn (training-data) (test-data) 'done)
These data-loading functions are defined in load-mnist.lisp. Naturally, if you work with data in another format you have to write this part yourself.
The loaded data is a vector of datum structures, each representing one data point.
MGL-USER> (aref *training-data* 0)
#S(DATUM
   :ID 1
   :LABEL 5
   :ARRAY #<MAT 784 B #(0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
                        0.0 0.0 0.011764706 0.07058824 0.07058824 0.07058824 0.49411765
                        0.53333336 0.6862745 0.101960786 0.6509804 1.0 0.96862745 0.49803922
                        0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
                        0.11764706 0.14117648 0.36862746 0.6039216 ... 0.0 0.0 0.0)>)
A datum has three slots: the serial number id, the correct label label, and the actual data array. Note that array is not an ordinary Common Lisp array but a MAT structure from MGL-MAT.
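To make the shape of a data point concrete, here is a stand-in for the structure. The real definition lives in load-mnist.lisp and keeps a MAT in the array slot; this sketch substitutes a plain Common Lisp vector, and *example-datum* is a made-up name. Only the slot names and the datum-array accessor come from the output above and the code later in this post. Evaluate it in a scratch package so it does not clash with the real definition.

```lisp
;; Stand-in for the DATUM structure (hypothetical; the real one is
;; defined in load-mnist.lisp and holds a MAT in ARRAY).
(defstruct datum
  id      ; serial number
  label   ; correct class label, 0-9 for MNIST
  array)  ; pixel data; a plain vector here instead of a MAT

(defparameter *example-datum*
  (make-datum :id 1
              :label 5
              :array (make-array 784 :element-type 'single-float
                                     :initial-element 0.0)))

;; Slot accessors generated by DEFSTRUCT
(datum-label *example-datum*)          ; label of this data point
(length (datum-array *example-datum*)) ; number of pixels
```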
Defining the model
A feedforward neural network is created as an object of class fnn. There is a macro build-fnn that wraps make-instance of fnn, so using it to build a network with two hidden layers looks like this.
(defparameter my-fnn
  (build-fnn (:class 'fnn :max-n-stripes 100)     ; mini-batch size 100
    (inputs (->input :size 784))                   ; 784 input dimensions
    (f1 (->relu (->activation inputs :size 256)))  ; 256 units, ReLU activation
    (f2 (->relu (->activation f1 :size 256)))      ; 256 units, ReLU activation
    (prediction (->softmax-xe-loss (->activation f2 :size 10))))) ; 10 output classes, softmax activation
Looking inside the network defined here gives the following.
MGL-USER> (describe my-fnn)
#<FNN {1014AC5903}>
 BPN description:
   CLUMPS = #(#<->INPUT INPUTS :SIZE 784 1/100 :NORM 0.00000>
              #<->ACTIVATION (#:G1298 :ACTIVATION) :STRIPES 1/100 :CLUMPS 4>
              #<->RELU F1 :SIZE 256 1/100 :NORM 0.00000>
              #<->ACTIVATION (#:G1299 :ACTIVATION) :STRIPES 1/100 :CLUMPS 4>
              #<->RELU F2 :SIZE 256 1/100 :NORM 0.00000>
              #<->ACTIVATION (#:G1300 :ACTIVATION) :STRIPES 1/100 :CLUMPS 4>
              #<->SOFTMAX-XE-LOSS PREDICTION :SIZE 10 1/100 :NORM 0.00000>)
   N-STRIPES = 1
   MAX-N-STRIPES = 100
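The network itself is a collection of objects of classes called lumps, which play a role similar to layers; lump names carry the prefix ->. Each lump class also has a function of the same name that creates it. For example, (->input :size 784) calls the function that makes an instance of the ->input class representing an input layer of size 784.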
The keyword option :class of build-fnn is the class of the network object to create, and :max-n-stripes is the mini-batch size. The body of build-fnn is much like let* (it actually expands into let*): a sequence of pairs, each consisting of a name for a lump and the function call that creates it. As with let*, a variable bound earlier can be used in later definitions.
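Lumps come in two kinds: those that hold unit values, such as ->input and ->relu, and ->activation, which handles the linear map between layers. In this example ->input, ->relu and ->softmax-xe-loss hold unit values. Concretely, they have a slot nodes for the forward-propagated values and a slot derivatives for the back-propagated values, both of which are matrices of size batch size × number of units.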
For example, displaying the input-layer lump gives the following; you can see that nodes and derivatives hold matrices made by stacking a batch of 784-dimensional inputs.
MGL-USER> (describe (aref (clumps my-fnn) 0))
#<->INPUT INPUTS :SIZE 784 100/100 :NORM 91.96476>
  [standard-object]
Slots with :INSTANCE allocation:
  NAME = INPUTS
  SIZE = 784
  NODES = #<MAT 100x784 AF #2A((0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0..
  DERIVATIVES = #<MAT 100x784 A #2A((-1.6850416e-7 1.7476857e-6 3.683108e-6..
  DEFAULT-VALUE = 0
  SHARED-WITH-CLUMP = NIL
  X = #<->INPUT INPUTS :SIZE 784 100/100 :NORM 91.96476>
  DROPOUT = NIL
  MASK = NIL
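Next, the hidden layers. The lump for a hidden layer is chosen according to the activation function for that layer, from ->relu, ->sigmoid, ->tanh, ->max (used for Maxout below) and so on. These days ReLU and Maxout are the common choices for deep learning.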
For the output layer, since MNIST is a classification problem, I use ->softmax-xe-loss, whose activation function is softmax and whose loss function is cross-entropy. ->softmax-xe-loss has a slot called target, which holds a list of the correct labels for one batch. For a regression problem the activation would be the identity map and the loss the squared error, so you would use ->squared-difference instead. I hope to cover regression in a separate article.
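As a reminder of what softmax with cross-entropy loss computes for a single sample, here is a plain Common Lisp sketch. It does not use MGL at all; the function names softmax and cross-entropy are my own, and this only illustrates the formulas, not how ->softmax-xe-loss is implemented.

```lisp
;; Softmax over a vector of activations, shifted by the maximum for
;; numerical stability.
(defun softmax (activations)
  (let* ((m (reduce #'max activations))
         (exps (map 'vector (lambda (a) (exp (- a m))) activations))
         (sum (reduce #'+ exps)))
    (map 'vector (lambda (e) (/ e sum)) exps)))

;; Cross-entropy loss for one sample whose correct class is LABEL.
(defun cross-entropy (probabilities label)
  (- (log (aref probabilities label))))

(let ((p (softmax #(1.0 2.0 3.0))))
  (list (reduce #'+ p)          ; the probabilities sum to 1
        (cross-entropy p 2)))   ; loss when the correct class is 2
```

The loss is small when the correct class gets a high probability and grows as that probability shrinks.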
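->activation is the lump that computes the unit activations fed into the next layer from the output of the previous layer. It is itself a sub-network holding a collection of lumps, and this is where the things being learned, the inter-layer weights and biases, live.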
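For one sample, the computation ->activation stands for is an affine map of the previous layer's output. A plain Common Lisp sketch, independent of MGL; the function name affine and the weight layout (a 2D array with one row per output unit) are assumptions made for illustration, not how MGL stores its weights:

```lisp
(defun affine (weights bias input)
  "Return W*x + b for a 2D array WEIGHTS and vectors BIAS and INPUT."
  (let* ((n-out (array-dimension weights 0))
         (n-in (array-dimension weights 1))
         (output (make-array n-out)))
    (dotimes (i n-out output)
      (setf (aref output i)
            (+ (aref bias i)
               (loop for j below n-in
                     sum (* (aref weights i j) (aref input j))))))))

;; 3 output units, 2 input units
(affine #2A((1 2) (3 4) (5 6)) #(0 1 2) #(10 20))
;; => #(50 111 172)
```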
Maxout and Dropout
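Using Maxout as the activation function is known to make training work well even when the network has many layers. Maxout is a function built from several linear units: by taking, for a given input, the maximum over several straight lines, it becomes a convex function. You therefore have to specify how many lines to use with the parameter :group-size. With five lines, for example, you write the following.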
(->max activations :group-size 5)
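The grouping idea can be sketched in plain Common Lisp, without MGL. I am assuming here that groups are formed from consecutive units; the function name maxout is my own and this is not the code behind ->max.

```lisp
(defun maxout (activations group-size)
  "Take the maximum of each consecutive group of GROUP-SIZE activations."
  (let* ((n-groups (floor (length activations) group-size))
         (output (make-array n-groups)))
    (dotimes (g n-groups output)
      (setf (aref output g)
            (loop for k below group-size
                  maximize (aref activations (+ (* g group-size) k)))))))

;; 10 linear units with group size 5 give 2 Maxout units
(maxout #(0.1 -0.3 0.7 0.2 0.0 -1.0 -0.5 -0.2 -0.9 -0.4) 5)
;; => #(0.7 -0.2)
```

Under this grouping, a layer of 1200 activations with a group size of 5 would leave 240 values.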
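Dropout disables a subset of the units with a fixed probability at each training step. This is like ensemble learning over a large number of sub-networks, and it has a strong regularizing effect. For example, to randomly disable half of the units of the activation function above, you write this.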
(->dropout (->max activations :group-size 5) :dropout 0.5)
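What "disabling units with a fixed probability" means can be sketched as follows in plain Common Lisp. The function name dropout is my own, and the rescaling of surviving units that some implementations apply is deliberately left out; this is not MGL's ->dropout.

```lisp
(defun dropout (units rate &optional (random-state *random-state*))
  "Return a copy of UNITS where each element is zeroed with probability RATE."
  (map 'vector
       (lambda (u)
         ;; (random 1.0) is uniform in [0, 1), so this test succeeds
         ;; with probability RATE
         (if (< (random 1.0 random-state) rate) 0.0 u))
       units))

;; Each element is either kept or replaced by 0.0
(dropout #(0.7 -0.2 1.5 0.3) 0.5)
```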
Putting these together, the whole network can be written like this.
(defparameter fnn-maxout-dropout
  (let ((group-size 5))
    (build-fnn (:class 'fnn :max-n-stripes 100)
      (inputs (->input :size 784 :dropout 0.2)) ; input units are disabled with probability 0.2
      (f1-activations (->activation inputs :name 'f1 :size 1200))
      (f1* (->max f1-activations :group-size group-size))
      (f1 (->dropout f1* :dropout 0.5))         ; hidden units are disabled with probability 0.5
      (f2-activations (->activation f1 :name 'f2 :size 1200))
      (f2* (->max f2-activations :group-size group-size))
      (f2 (->dropout f2* :dropout 0.5))
      (f3-activations (->activation f2 :name 'f3 :size 1200))
      (f3* (->max f3-activations :group-size group-size))
      (f3 (->dropout f3* :dropout 0.5))
      (prediction (->softmax-xe-loss (->activation f3 :name 'prediction :size 10)
                                     :name 'prediction)))))
Setting the network's inputs and outputs
We can now create the network object, but before training or prediction we first have to set the inputs and outputs on the network.
The input goes into the nodes slot of ->input, and the output into the target slot of ->softmax-xe-loss. The method that does this is set-input, defined for example as follows.
(defmethod set-input (samples (bpn fnn))
  (let* ((inputs (find-clump 'inputs bpn))
         (prediction (find-clump 'prediction bpn)))
    (clamp-data samples (nodes inputs))                      ; set one batch of inputs
    (setf (target prediction) (label-target-list samples)))) ; set one batch of outputs (correct labels)
As long as set-input puts the right data values into the network, the representation of the data itself can be anything you like.
Defining the training part
Next, define a function that performs the training.
This is where you decide which algorithm to use, whether to monitor training progress, and so on. For example, optimizing with momentum SGD can be written as follows.
(defun train-fnn (fnn training &key (n-epochs 3000) (learning-rate 0.1) (momentum 0.9))
  (let ((optimizer (make-instance 'segmented-gd-optimizer
                     :segmenter (constantly
                                 (make-instance 'sgd-optimizer
                                   :learning-rate learning-rate
                                   :momentum momentum
                                   :batch-size (max-n-stripes fnn)))))
        (learner (make-instance 'bp-learner :bpn fnn))
        (dataset (make-sampler training :n-epochs n-epochs)))
    (minimize optimizer learner :dataset dataset)
    fnn))
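The core of the optimization is minimize, which takes an optimizer representing the optimization algorithm and a learner representing the thing being trained. Here the optimizer is momentum SGD, but others such as ADAM are available too. Each optimization algorithm has its own parameters, such as learning rate and momentum.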
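To show what the learning rate and momentum parameters control, here is the textbook momentum SGD update in plain Common Lisp. The function name momentum-sgd-step is my own, and MGL's sgd-optimizer may differ in its details (for instance in how gradients are averaged over a batch), so treat this as a sketch of the idea only.

```lisp
(defun momentum-sgd-step (weights velocity gradient
                          &key (learning-rate 0.1) (momentum 0.9))
  "Destructively update WEIGHTS and VELOCITY for one step; return WEIGHTS."
  (dotimes (i (length weights) weights)
    ;; v <- momentum * v - learning-rate * g
    (setf (aref velocity i)
          (- (* momentum (aref velocity i))
             (* learning-rate (aref gradient i))))
    ;; w <- w + v
    (incf (aref weights i) (aref velocity i))))

;; Two steps with the same gradient: the second step moves further
;; than the first because the velocity has built up.
(let ((w (vector 1.0 -2.0))
      (v (vector 0.0 0.0)))
  (momentum-sgd-step w v #(0.5 -1.0))
  (momentum-sgd-step w v #(0.5 -1.0))
  w)
```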
An optimizer or learner can have monitor functions attached that report training progress. For example, to run log-bpn-test-error, a function that prints the accuracy on the training and test data, at regular intervals during optimization, do this.
(defun train-fnn-with-monitor (fnn training test &key (n-epochs 3000) (learning-rate 0.1) (momentum 0.9))
  (let ((optimizer (monitor-optimization-periodically
                    (make-instance 'segmented-gd-optimizer-with-data
                      :training training :test test
                      :segmenter (constantly
                                  (make-instance 'sgd-optimizer
                                    :learning-rate learning-rate
                                    :momentum momentum
                                    :batch-size (max-n-stripes fnn))))
                    '((:fn log-bpn-test-error :period log-test-period)
                      (:fn reset-optimization-monitors :period log-training-period :last-eval 0))))
        (learner (make-instance 'bp-learner :bpn fnn))
        (dataset (make-sampler training :n-epochs n-epochs)))
    (minimize optimizer learner :dataset dataset)
    fnn))
Using these, the whole training procedure can be written like this.
(defun train-fnn-process (fnn training test &key (n-epochs 30) (learning-rate 0.1) (momentum 0.9))
  (with-cuda* ()
    (repeatably ()
      (init-bpn-weights fnn :stddev 0.01) ; initialize the weights of fnn
      (train-fnn-with-monitor fnn training test
                              :n-epochs n-epochs
                              :learning-rate learning-rate
                              :momentum momentum)))
  fnn)
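Here with-cuda* is the macro that enables computation with CUDA, which also appeared in the MGL-MAT article, and init-bpn-weights is a function that sets the initial weights by drawing from a normal distribution. n-epochs is the number of passes over the training data.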
Running the training process
Training the my-fnn defined earlier destructively updates the weight parameters inside my-fnn. The training process prints the following.
MGL-USER> (train-fnn-process my-fnn *training-data* *test-data* :n-epochs 10)
...
2016-05-22 18:40:16: ---------------------------------------------------
2016-05-22 18:40:17: training at n-instances: 600000
2016-05-22 18:40:17: test at n-instances: 600000
2016-05-22 18:40:18: pred. train bpn PREDICTION acc.: 99.75% (10000)
2016-05-22 18:40:18: pred. train bpn PREDICTION xent: 9.027d-5 (10000)
2016-05-22 18:40:18: pred. test bpn PREDICTION acc.: 98.10% (10000)
2016-05-22 18:40:18: pred. test bpn PREDICTION xent: 8.772d-4 (10000)
...
The final accuracy on the test data is thus 98.10%.
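The n-instances figure in the log lines up with the epoch count: the MNIST training set has 60,000 images, so 10 epochs see 600,000 instances. A small sketch of the arithmetic in plain Common Lisp; n-updates is a made-up helper, and it assumes every epoch visits each sample exactly once:

```lisp
(defun n-updates (n-samples n-epochs batch-size)
  "Number of mini-batch updates when every epoch visits all N-SAMPLES once."
  (* n-epochs (ceiling n-samples batch-size)))

(* 60000 10)             ; => 600000 training instances in 10 epochs
(n-updates 60000 10 100) ; => 6000 mini-batch updates of size 100
```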
With Dropout added, running the following network for about 100 epochs reached 98.96%.
(defparameter fnn-relu-dropout
  (build-fnn (:class 'fnn :max-n-stripes 100)
    (inputs (->input :size 784 :dropout 0.2))
    (f1-activations (->activation inputs :name 'f1 :size 1200))
    (f1* (->relu f1-activations))
    (f1 (->dropout f1* :dropout 0.5))
    (f2-activations (->activation f1 :name 'f2 :size 1200))
    (f2* (->relu f2-activations))
    (f2 (->dropout f2* :dropout 0.5))
    (f3-activations (->activation f2 :name 'f3 :size 1200))
    (f3* (->relu f3-activations))
    (f3 (->dropout f3* :dropout 0.5))
    (prediction (->softmax-xe-loss (->activation f3 :name 'prediction :size 10)
                                   :name 'prediction))))
According to the comments in the sample file, pretraining with a DBM and then fine-tuning with Dropout apparently gets up to 99.22%.
Prediction
For evaluating on a data set there is a function called monitor-bpn-results. It is used like this, for example.
(defun test-fnn (fnn test)
  (monitor-bpn-results (make-sampler test :max-n (length test))
                       fnn
                       (make-datum-label-monitors fnn)))
MGL-USER> (test-fnn my-fnn *test-data*)
(#<CLASSIFICATION-ACCURACY-COUNTER bpn PREDICTION acc.: 98.10% (10000)>
 #<CROSS-ENTROPY-COUNTER bpn PREDICTION xent: 8.772d-4 (10000)>)
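To predict an individual data point, set the values in the nodes of the network's input layer, call the forward function with the network as its argument, and then look at the nodes of the output layer, which will contain the prediction.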
(defun predict-datum (fnn datum)
  (let* ((a (datum-array datum))
         (len (mat-dimension a 0))
         (input-nodes (nodes (find-clump 'inputs fnn)))
         (output-nodes (nodes (find-clump 'prediction fnn))))
    ;; set input
    (loop for i below len
          do (setf (mref input-nodes 0 i) (mref a i)))
    ;; run
    (forward fnn)
    ;; return output
    (reshape output-nodes (mat-dimension output-nodes 1))))
For example, predicting the first item of the training data with the trained my-fnn prints a list of probabilities that sum to 1, as below.
MGL-USER> (predict-datum my-fnn (aref *training-data* 0))
#<MAT 0+10+990 - #(1.6357367e-20 1.007323e-17 1.829405e-18 1.9152902e-5 3.730603e-23 0.9999808 4.772671e-19 6.1250017e-20 1.6942288e-18 2.232391e-16)>
The class with the highest probability, 5, is the prediction. The correct label of this data point is also 5, so the prediction is right.
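Picking the most probable class is just an argmax over that vector. A plain Common Lisp sketch; argmax is my own helper and works on an ordinary vector, so the MAT returned by predict-datum would first have to be converted to a plain array or read element by element with mref.

```lisp
(defun argmax (probabilities)
  "Index of the largest element of the vector PROBABILITIES."
  (let ((best 0))
    (loop for i from 1 below (length probabilities)
          when (> (aref probabilities i) (aref probabilities best))
            do (setf best i))
    best))

;; The probabilities printed above, copied into a plain vector
(argmax #(1.6357367e-20 1.007323e-17 1.829405e-18 1.9152902e-5 3.730603e-23
          0.9999808 4.772671e-19 6.1250017e-20 1.6942288e-18 2.232391e-16))
;; => 5
```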
Benchmark
Let's compare MGL with Tensorflow and Chainer on training a multilayer neural network on MNIST.
Since LLA is using OpenBLAS, I made Numpy use OpenBLAS as well. (Reference: http://qiita.com/unnonouno/items/8ab453a1868d77a93679)
Put the following in ~/.numpy-site.cfg:
[openblas]
libraries = openblas
library_dirs = /usr/lib/openblas-base/
include_dirs = /usr/include/
runtime_library_dirs = /usr/lib/openblas-base/
Then running pip install numpy gives a Numpy that uses OpenBLAS.
The Tensorflow multilayer neural network example is based on this one. Chainer ships with an MNIST sample, so I based the Chainer version on that.
The setup is two hidden layers of 256 units each, ReLU activation, and momentum SGD as the optimizer. I compare the time taken by the training part alone when training for 15 epochs with a batch size of 100. The machine is a Core i5 4670 with a GeForce GTX 750Ti.
The results came out like this.
Summary
- Features of MGL
  - Fast
  - Everything except the matrix operations is implemented in Common Lisp and can be extended in Common Lisp
  - It uses side effects heavily for speed, so it cannot be called functional in style
- What you have to implement or decide for each use case
  - The part that reads the data into some data structure
  - The part that sets one batch of data on the network (the set-input method)
  - The structure of the network
    - Number of hidden layers, number of units per layer, choice of activation function, and group-size for Maxout
    - The probability of disabling a unit if you use Dropout
  - The choice of algorithm and its parameters in the optimizer
    - For momentum SGD, for example, the learning rate and momentum
  - Monitoring functions, logging functions and so on
TODO
- The MNIST sample also applies L2 regularization, so look into that in more detail
- An example of a regression problem
- Examples of RNN/LSTM
- Implementing Dropconnect
- Whether preprocessing improves accuracy: Global contrast normalization etc. http://www.slideshare.net/Takayosi/miru2014-tutorial-deeplearning-37219713 P.53
*1: For reference, my environment is a self-built machine with a Core i5 4670 and a GeForce GTX 750Ti, running Ubuntu 14.04 (64bit), CUDA 7.5, OpenBLAS 0.2.8 and SBCL 1.3.1.
*2: That said, max pooling is said to have almost the same effect as Maxout, so a Maxout+Dropout DNN also has some tolerance to translation.
*3: Recent GPUs apparently go even faster by doing SIMD operations in half precision.