Notes on Kaggle CIFAR-10
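I wrote earlier that I was taking part in Kaggle CIFAR-10. It ended about two weeks ago. The competition is still in the "Validating Final Results" state, but two weeks have passed with no sign of when that will finish, and the results will probably not change, so I am writing this up now.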
CIFAR-10 consists of small 32x32 images, each showing one of 10 kinds of objects such as cats, dogs, and birds. The task is to guess what is shown in a given image.
(Results on the Kaggle CIFAR-10 dataset are comparable with results on the regular CIFAR-10, but the images have been modified so that their hash values differ, to prevent cheating, and the test set contains 290,000 junk images.)
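My result was 0.9415 (94.15% accuracy). According to "Classification datasets results", the state of the art is 91.78%, so this exceeds it and reaches the 94% accuracy attributed to humans. Even so, this score was only good enough for 5th place. 1st place went to DeepCNet with an astonishing 95.53%. 2nd place combined the results of DeepCNet and DropConnect, so DeepCNet was clearly the strongest method (the DeepCNet code had been made public before the competition ended). The 3rd and 4th place teams have not published their methods, so those are unknown.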
My method
kaggle-cifar10-torch7 - GitHub
The code is published in the repository above. It is implemented in Torch7.
Nothing here is particularly original. I combined several common techniques and adapted the model that VGG (the Visual Geometry Group at the University of Oxford) used in ILSVRC2014 to CIFAR-10.
What I did:
- Increased the training data 36-fold (Data Augmentation)
- Normalized with GCN + ZCA Whitening
- Trained a Convolutional Neural Network (CNN) based on the VGG model
- Trained 6 copies of that model with different initial weights and different Mini-Batch SGD update orders, and output the average of the classifiers as the prediction
Increasing the training data 36-fold (Data Augmentation)
Empirically, as long as a neural network does not overfit, more layers and more units are better, hence the general "Deeeep" mood. With too little data the network overfits, so I increase the data to make even a complex model less prone to overfitting.
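The trick is to augment only with transformations that stay within the "plausible range". With images you can generate any number of patterns using artificial noise or distortion, but it is harmful to skew the training distribution with patterns that never appear in the test set. Ideally, each transformation produces a pattern that differs from the original data but could still appear at test time.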
This time I used the following three methods.
- Cropping: CIFAR-10 training images are 32x32. I split each one into 24x24 sub-images. Cutting a window every 4px gives 3x3 = 9 patterns, so the data grows 9-fold. (The training images become smaller.)
- Scaling: Setting the crop size to 28x28 and cutting a window every 2px also gives 3x3 = 9 patterns. Each of these sub-images is resized (zoomed out) to 24x24 and used as a training image.
- Horizontal reflection: A left-right flip. Every image produced so far is used in both its original and mirrored form, doubling the data.
This gives (9 + 9) * 2 = 36 times as much data. The training images are 24x24 rather than 32x32.
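As a rough illustration of the three steps above, here is a minimal pure-Python sketch. It is not the Torch7 code from the repository: the helper names (`crop`, `resize_nearest`, `hflip`, `augment`) are made up for this example, images are plain lists of rows, and nearest-neighbour resizing stands in for whatever interpolation the real implementation uses.

```python
def crop(img, top, left, size):
    """Return a size x size window of an image stored as a list of rows."""
    return [row[left:left + size] for row in img[top:top + size]]


def resize_nearest(img, size):
    """Nearest-neighbour resize of a square image (assumed stand-in for real interpolation)."""
    src = len(img)
    return [[img[y * src // size][x * src // size] for x in range(size)]
            for y in range(size)]


def hflip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def augment(img, full=32, out=24):
    views = []
    # Cropping: 24x24 windows every 4 px -> offsets 0, 4, 8 per axis (3 x 3 = 9 views).
    for top in range(0, full - 24 + 1, 4):
        for left in range(0, full - 24 + 1, 4):
            views.append(crop(img, top, left, 24))
    # Scaling: 28x28 windows every 2 px -> offsets 0, 2, 4 per axis (9 views), shrunk to 24x24.
    for top in range(0, full - 28 + 1, 2):
        for left in range(0, full - 28 + 1, 2):
            views.append(resize_nearest(crop(img, top, left, 28), out))
    # Horizontal reflection: keep each view and add its mirror image.
    return views + [hflip(v) for v in views]


# Toy single-channel 32x32 "image" whose pixel value encodes its position.
image = [[y * 32 + x for x in range(32)] for y in range(32)]
views = augment(image)
print(len(views))  # 36
```

Every returned view is 24x24, so the same function can be applied to a test image when predictions are averaged over the 36 views.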
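At prediction time, the image to be classified is expanded into 36 images in the same way, a prediction is made for each, and the results are averaged to give the final prediction. Naturally, prediction takes correspondingly longer.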
This processing improved accuracy by about 2 to 3%.
(Figure: row 1 is Cropping, row 2 is Cropping + Scaling, and rows 3 and 4 are Horizontal reflection.)
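The prediction-time averaging described in this section can be sketched as follows. This is only an illustration: `toy_predict` is a hypothetical stand-in for the trained network, and the "views" are plain integers rather than real 24x24 images.

```python
def average_predictions(prob_lists):
    """Element-wise mean of several class-probability vectors."""
    n = len(prob_lists)
    k = len(prob_lists[0])
    return [sum(p[c] for p in prob_lists) / n for c in range(k)]


def predict_with_views(predict, views):
    """Run the classifier on every augmented view and average the outputs."""
    mean = average_predictions([predict(v) for v in views])
    label = max(range(len(mean)), key=mean.__getitem__)
    return label, mean


def toy_predict(view):
    # Hypothetical classifier: disagrees with itself on every third view.
    return [0.7, 0.3] if view % 3 else [0.2, 0.8]


views = list(range(36))  # placeholders for the 36 augmented views of one image
label, mean = predict_with_views(toy_predict, views)
print(label, [round(m, 3) for m in mean])
```

Averaging probabilities rather than taking a majority vote keeps the confidence of each view in play, which is one plausible reading of "averaged" in the text.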
Normalizing with GCN + ZCA Whitening
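GCN (Global Contrast Normalization) is the same thing as what is called standardization or the z-score: compute the mean and standard deviation of each element over the whole dataset, subtract the mean, and divide by the standard deviation. The input values are normalized to roughly the range -2 to 2, and even axes with different scales end up approximately in this range. Frequently occurring values land close to the mean, and rare values get a large absolute value. This suppresses the influence of large-scale axes and speeds up convergence during training.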
Addendum (2014/12/20)
The explanation above was wrong. GCN apparently does not span images: it computes the mean and spread within each image, subtracts the mean, and divides by the standard deviation. In the implementation for "An Analysis of Single-Layer Networks" this operation is called local contrast normalization, and the z-score is called global standardization. In the Maxout paper, that same local contrast normalization is called global contrast normalization. I got confused about what "global" and "local" refer to and misunderstood. My implementation, however, uses the z-score as described in this post.
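To make the distinction concrete, here is a small sketch of the two normalizations on toy data. The function names and the `eps` term that guards against division by zero are assumptions for illustration, not taken from the repository.

```python
from statistics import mean, pstdev


def standardize_per_feature(data, eps=1e-8):
    """z-score: mean and standard deviation per pixel position, over the whole dataset."""
    cols = list(zip(*data))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(x - m) / (s + eps) for x, m, s in zip(row, mus, sds)]
            for row in data]


def normalize_per_image(data, eps=1e-8):
    """Per-image contrast normalization: statistics computed inside each image."""
    out = []
    for row in data:
        m, s = mean(row), pstdev(row)
        out.append([(x - m) / (s + eps) for x in row])
    return out


# Three toy "images" of three pixels each (one row per image).
data = [[0.0, 10.0, 20.0],
        [2.0, 14.0, 26.0],
        [4.0, 18.0, 32.0]]

z = standardize_per_feature(data)   # each column now has mean 0
g = normalize_per_image(data)       # each row now has mean 0
```

The first function is what this post's implementation uses; the second corresponds to what the Maxout paper calls global contrast normalization.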
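ZCA Whitening rotates the data onto its principal axes using the eigenvectors of the covariance matrix of the whole dataset, standardizes it in the transformed space, and then rotates it back to the original space. In natural images, each pixel is strongly correlated with its neighbors. Removing this correlation gives a result that looks like edge detection while still retaining color information. The data is returned to the original space because a CNN assumes the spatial structure of the original image.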
ZCA Whitening is the preprocessing that produced impressive results in "An Analysis of Single-Layer Networks in Unsupervised Feature Learning" (Andrew Ng). I implemented that method some time ago, and there, doing or not doing ZCA Whitening changes CIFAR-10 accuracy by about 15%. With recent deep CNNs, however, it makes almost no difference, so I suspect it was unnecessary here. With my previous model (Network In Network) it improved accuracy very slightly, and I did not have time to check whether accuracy would stay the same without it, so I left it in out of inertia.
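For reference, the ZCA Whitening transform described in this section is commonly written as below. The small constant $\epsilon$ is an assumption added for numerical stability; it is not mentioned in the text, and the value used in the repository is not stated here.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\Sigma = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)(x_i - \mu)^{\top} = U \Lambda U^{\top},
\qquad
x_i^{\mathrm{ZCA}} = U \,(\Lambda + \epsilon I)^{-1/2}\, U^{\top} (x_i - \mu)
```

Reading the last expression from right to left: $U^{\top}$ rotates the centered data onto the principal axes, $(\Lambda + \epsilon I)^{-1/2}$ standardizes each axis, and $U$ rotates back to the original pixel space.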
Training a Deep Convolutional Neural Network based on the VGG model
I built the classifier with a Deep Convolutional Neural Network based on the one proposed in [1409.1556] Very Deep Convolutional Networks for Large-Scale Image Recognition.
A traditional CNN architecture for CIFAR-10 looks like conv 5x5 -> maxpool -> conv 5x5 -> maxpool -> conv 5x5 -> maxpool -> fc (fully connected) -> softmax. Here, each conv 5x5 is replaced with a stack of two or three 3x3 kernels.
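This has the following effects:
- The number of layers increases
- A large linear kernel is replaced by a stack of 3x3 kernels that is non-linear (because a ReLU sits after every conv), so representational power goes up
- Convolving several times with small kernels costs less than convolving once with a large kernel (5x5 > 3x3x2, 7x7 > 3x3x3)
In addition, adding 1px of padding to a 3x3 kernel means the convolution layer no longer shrinks the image, so in theory the number of layers can be increased without limit. This is welcome when the input images are small, as in CIFAR-10, because if convolutions keep reducing the size there is a limit to how many layers can be added.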
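The cost comparison and the effect of padding can be checked with a little arithmetic. The sketch below assumes the same number of input and output channels in every layer (`c = 64` is an arbitrary illustrative value) and ignores biases.

```python
def conv_weights(kernel, channels, layers=1):
    """Weights in `layers` stacked kernel x kernel convs, channels in == channels out."""
    return layers * kernel * kernel * channels * channels


def receptive_field(kernel, layers):
    """Width of the input region seen by a stack of stride-1 convolutions."""
    return 1 + layers * (kernel - 1)


def conv_output_size(size, kernel, padding=0, stride=1):
    """Spatial size after one convolution."""
    return (size + 2 * padding - kernel) // stride + 1


c = 64  # hypothetical channel count
print(conv_weights(5, c), conv_weights(3, c, layers=2))   # 102400 73728
print(conv_weights(7, c), conv_weights(3, c, layers=3))   # 200704 110592
print(receptive_field(3, 2), receptive_field(3, 3))       # 5 7
print(conv_output_size(24, 3, padding=1))                 # 24 (size preserved)
print(conv_output_size(24, 5))                            # 20 (shrinks without padding)
```

Two stacked 3x3 layers cover the same 5x5 region with 18 instead of 25 weights per channel pair, and three cover a 7x7 region with 27 instead of 49.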
The architecture I finally used is
conv 3x3 -> conv 3x3 -> maxpool -> conv 3x3 -> conv 3x3 -> maxpool -> conv 3x3 -> conv 3x3 -> conv 3x3 -> conv 3x3 -> maxpool -> fc -> fc -> softmax
which is quite Deeeep. For details, see the table on the source code page.
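As a sanity check on how the spatial size evolves through this stack, the sketch below walks the architecture string. It assumes 1px padding and stride 1 for every conv, as discussed above, and 2x2 max pooling with stride 2; the pooling parameters are an assumption, so the table in the repository is the authority on the real values.

```python
ARCH = ("conv 3x3 -> conv 3x3 -> maxpool -> conv 3x3 -> conv 3x3 -> maxpool -> "
        "conv 3x3 -> conv 3x3 -> conv 3x3 -> conv 3x3 -> maxpool -> "
        "fc -> fc -> softmax")


def trace_sizes(arch, size=24):
    """Return (layer, spatial size after layer) for the convolutional part."""
    trace = []
    for layer in arch.split(" -> "):
        if layer.startswith("conv"):
            kernel = int(layer.split()[1].split("x")[0])
            size = size + 2 * 1 - kernel + 1   # assumed: padding 1, stride 1
        elif layer == "maxpool":
            size //= 2                         # assumed: 2x2 window, stride 2
        else:
            break                              # fc / softmax: no spatial dimension
        trace.append((layer, size))
    return trace


trace = trace_sizes(ARCH)
n_conv = sum(1 for layer, _ in trace if layer.startswith("conv"))
print(n_conv, [size for _, size in trace])
```

Under these assumptions the 24x24 input passes through 8 convolution layers and leaves the last pooling layer as a 3x3 map.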
Training 6 copies of the model with different initial weights and Mini-Batch SGD update orders, and outputting the average of the classifiers as the prediction
One easy way to raise the accuracy of a neural network model is to train several models and average them. Neural networks depend on their initial values and are not globally optimized, so different random seeds produce models that output (slightly) different results. Training several of them and averaging therefore stabilizes the result.
The usual approach is Bagging (Committee Network), but Bagging requires tuning things such as the sampling ratio. I only decided to do this three days before the deadline, so there was no time for tuning and I had one shot. Reasoning that it might help and was unlikely to hurt, I used the following setup:
- Same training data
- Different initial weights
- Different update order
The model takes about 20 hours to train, and on my home-built machine I could only have trained two of them, so I launched six GPU instances as EC2 Spot Instances and trained in parallel.
In the end, a single model scored 93.33% and the average of 6 models scored 94.15%, so this step improved accuracy by about 0.8 percentage points (94.15 - 93.33 = 0.82).
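The setup above (same data, different initial weights, different update order, then averaging) can be illustrated with a toy model. This is a sketch only: the one-parameter regressor `train_member` is a hypothetical stand-in for a 20-hour CNN training run, and the seed controls both the initial weight and the shuffle order.

```python
import random


def train_member(seed, data, epochs=20, lr=0.1):
    """Toy stand-in for one training run: fit y ~ w * x by per-sample SGD."""
    rng = random.Random(seed)
    w = rng.uniform(-1.0, 1.0)       # different initial weight for each seed
    order = list(data)               # same training data for every member
    for _ in range(epochs):
        rng.shuffle(order)           # different update order for each seed
        for x, y in order:
            w -= lr * (w * x - y) * x
    return w


def ensemble_predict(weights, x):
    """Average the predictions of all ensemble members."""
    return sum(w * x for w in weights) / len(weights)


data = [(0.5, 1.1), (1.0, 1.9), (1.5, 3.2), (2.0, 3.9)]
weights = [train_member(seed, data) for seed in range(6)]
print([round(w, 3) for w in weights])
print(round(ensemble_predict(weights, 1.0), 3))
```

Each member ends up with a slightly different weight because of its seed, and the averaged prediction always lies between the most extreme individual predictions.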
Other notes
Before the VGG paper was published, I was using Network In Network. It eventually reached 92.4% accuracy, so I think it was not bad.
Toward the end I felt I could never beat DeepCNet as things stood, so I tried implementing GoogLeNet. Training was painfully slow and it only reached 88% on validation, so I gave up (I still do not know whether it needed more tuning, was a poor fit for the problem, or whether I got something wrong).
These two models are kept in the source code directory for reference (nin_model.lua and inception_model.lua).