By binarizing the parameters of a deep learning model (encoding them as -1 and +1), the required memory footprint can be reduced and detection (inference) can be sped up.
The following two binarization methods are well known.
1. BinaryConnect: the parameters themselves are kept as continuous values, and at forward time the weight parameter w is converted to +1 or -1, either deterministically or stochastically. The deterministic rule maps w to +1 when w >= 0 and to -1 when w < 0. The stochastic rule maps w to +1 with probability p = σ(w) (a hard sigmoid function) and to -1 with probability 1-p; in other words, the more positive w is, the more likely it is to become +1. No binarization is performed at backward time, but when w is updated, a clipping operation keeps its value within [-1, 1]. (A small code sketch of these rules follows the paper reference below.)
For details, see the following paper.
BinaryConnect: Training Deep Neural Networks with binary weights during propagations
Matthieu Courbariaux, Yoshua Bengio, Jean-Pierre David
http://arxiv.org/abs/1511.00363
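Concretely, the two conversion rules and the clipping step can be sketched as follows. This is a minimal NumPy illustration; the helper names (hard_sigmoid, binarize_weights, clip_weights) are my own and not taken from the paper's code.

import numpy as np

def hard_sigmoid(w):
    # hard sigmoid: clip((w + 1) / 2, 0, 1)
    return np.clip((w + 1.0) / 2.0, 0.0, 1.0)

def binarize_weights(w, stochastic=False, rng=None):
    """Convert real-valued weights w to +1/-1 (BinaryConnect-style sketch)."""
    if stochastic:
        rng = rng or np.random.default_rng()
        p = hard_sigmoid(w)  # probability of mapping to +1
        return np.where(rng.random(w.shape) < p, 1.0, -1.0).astype(w.dtype)
    # deterministic: +1 for w >= 0, -1 otherwise
    return np.where(w >= 0, 1.0, -1.0).astype(w.dtype)

def clip_weights(w):
    # applied to the real-valued weights at update time
    return np.clip(w, -1.0, 1.0)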
2. Binarized Neural Networks: the way weights are binarized at forward time, and the use of clipping instead of binarization at backward time, are basically the same as in BinaryConnect. The difference is that at forward time, after a linear transformation such as a convolution, batch normalization is applied to normalize the distribution, and the result is binarized again. If there is an activation function such as ReLU, binarization is applied again after it, so the output of every layer (the feature maps, in the case of conv layers) becomes binary.
In addition, at backward time a straight-through estimator is used to estimate the gradient through the binarization: the incoming gradient is passed through unchanged, but only where the absolute value of the pre-binarization input is at most 1; elsewhere it is canceled. (A sketch of this binary activation follows the paper reference below.)
For details, see the following paper.
Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1
Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio
http://arxiv.org/abs/1602.02830
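To make the straight-through estimator concrete, a sign activation with this backward rule could be written as a custom Chainer function roughly like the one below. This is my own illustrative sketch using the old chainer.Function API (BinaryActivation and binary_activation are hypothetical names); the repository linked below provides a real implementation as bst.bst.

import numpy
import chainer

class BinaryActivation(chainer.Function):
    """Sign activation with a straight-through estimator (CPU-only sketch)."""

    def forward(self, inputs):
        x, = inputs
        # forward: binarize the pre-activation to +1 / -1
        y = numpy.where(x >= 0, 1, -1).astype(x.dtype)
        return y,

    def backward(self, inputs, grad_outputs):
        x, = inputs
        gy, = grad_outputs
        # straight-through estimator: pass the gradient through unchanged,
        # but only where |x| <= 1; cancel it elsewhere
        gx = gy * (numpy.abs(x) <= 1)
        return gx.astype(x.dtype),

def binary_activation(x):
    return BinaryActivation()(x)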
Below is an excerpt of the pseudo code from the paper.
Okanohara-san has published a Chainer implementation of Binarized Neural Networks for linear (fully connected) layers here:
https://github.com/hillbig/binary_net
Using it, I implemented the fully connected layers of a CNN as shown below.
import chainer
import chainer.functions as F
import chainer.links as L
# link_binary_linear and bst come from the binary_net repository above
import link_binary_linear
import bst


class MnistCNN_binaryLinear(chainer.Chain):
    """An example of a convolutional neural network for the MNIST dataset."""

    def __init__(self, channel=1, c1=16, c2=32, c3=64, f1=256,
                 f2=512, filter_size1=3, filter_size2=3, filter_size3=3):
        super(MnistCNN_binaryLinear, self).__init__(
            # conv layers keep real-valued weights
            conv1=L.Convolution2D(channel, c1, filter_size1),
            conv2=L.Convolution2D(c1, c2, filter_size2),
            conv3=L.Convolution2D(c2, c3, filter_size3),
            # fully connected layers use binarized weights
            l1=link_binary_linear.BinaryLinear(f1, f2),
            l2=link_binary_linear.BinaryLinear(f2, 10),
            bnorm1=L.BatchNormalization(c1),
            bnorm2=L.BatchNormalization(c2),
            bnorm3=L.BatchNormalization(c3),
            bnorm4=L.BatchNormalization(f2),
            bnorm5=L.BatchNormalization(10)
        )

    def __call__(self, x):
        # param x --- chainer.Variable of array
        x.data = x.data.reshape((len(x.data), 1, 28, 28))
        h = F.relu(self.bnorm1(self.conv1(x)))
        h = F.max_pooling_2d(h, 2)
        h = F.relu(self.bnorm2(self.conv2(h)))
        h = F.max_pooling_2d(h, 2)
        h = F.relu(self.bnorm3(self.conv3(h)))
        h = F.max_pooling_2d(h, 2)
        # binary activation (straight-through estimator) after batch norm
        h = bst.bst(self.bnorm4(self.l1(h)))
        y = self.bnorm5(self.l2(h))
        return y
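As a quick sanity check, the model can be run on a dummy MNIST-shaped batch along the following lines (a usage sketch; the batch size and random inputs are arbitrary, and a real training loop would add the usual Chainer optimizer setup).

import numpy as np
import chainer
import chainer.functions as F

model = MnistCNN_binaryLinear()
x = chainer.Variable(np.random.rand(8, 784).astype(np.float32))    # dummy MNIST batch
t = chainer.Variable(np.random.randint(0, 10, 8).astype(np.int32))  # dummy labels

y = model(x)                          # forward pass; FC layers use binarized weights
loss = F.softmax_cross_entropy(y, t)
print(loss.data)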
Below are the results without binarization and with binarization.
Without binarization:
With binarization:
You can see that, despite the binarization, the accuracy hardly drops.