@nishio told me about this: apparently a Python neural network library called Lasagne gets used fairly often on Kaggle.
Read as Italian it is "lasagne", the plural of lasagna, so for Japanese speakers calling it "lasagna" seems fine.
As of June 2015 the version is still 0.1.dev, which is as good as a sign reading "guinea pigs wanted".
In fact, getting a model of my own design to run was quite a struggle, so I am leaving these notes.
Installation itself is not particularly difficult.
However, since it is built on Theano, it only works with Python 2.7, and getting it to run on Windows is likely a thorny path.
Also, although the documentation says "Install from PyPI", it cannot actually be installed with pip (gotcha #1).
Google it, and exchanges like this turn up:

"I can't install Lasagne from PyPI, though."
"You can install it with git clone."
"I know, but the documentation says 'Install from PyPI', right?"
"Whatever it says, what can't be done can't be done, so quit grumbling and install it from git."
So, just do the obedient thing: git clone & python setup.py.
After installation, the directory you cloned into contains an examples directory with the obligatory sample code using MNIST.
The samples mnist.py and mnist_conv.py run fine even in an environment without GPGPU, so playing with those first is a good way in.
Even though these are the standard samples, a warning like this appears:
The uniform initializer no longer uses Glorot et al.'s approach to determine the bounds, but defaults to the range (-0.01, 0.01) instead. Please use the new GlorotUniform initializer to get the old behavior. GlorotUniform is now the default for all layers.
You probably lose the moment you let it bother you.
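If it nags at you anyway, the fix is to name the initializer yourself. A minimal sketch, assuming the lasagne.init API of that era and reusing the layer names from the sample further down:

    import lasagne

    # pass W explicitly instead of relying on the (changed) default;
    # GlorotUniform is the new default, Uniform the old pre-0.1 behavior
    l_hidden1 = lasagne.layers.DenseLayer(
        l_in,
        num_units=512,
        W=lasagne.init.GlorotUniform(),   # or lasagne.init.Uniform((-0.01, 0.01))
        nonlinearity=lasagne.nonlinearities.rectify,
    )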
mnist.py 㯠512 åãã¤ã®ã¦ããããæã¤ï¼æ®µã®é ã層ãããªãå¤ãè¯ããã¥ã¼ã©ã«ãããã¯ã¼ã¯ã§ãç°å¢ã«ãããã ããã2æéåãããå¦ç¿ã㦠98.5% ãããã®ç²¾åº¦ãåºãã
mnist_conv.py is a modern-style deep neural net: two rounds of 5x5 convolution plus 2x2 max-pooling, followed by a 256-unit hidden layer and dropout. This one is understandably heavy; even so, it finishes training in about 27 hours and cranks out 99.4% accuracy.
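For a feel of what such a network looks like in Lasagne's layer DSL, here is my own rough sketch, not mnist_conv.py itself; the filter counts are illustrative and keyword names may differ in a 0.1.dev checkout:

    import lasagne

    batch_size = 100
    # two conv + max-pool stages, then a dense layer, with dropout in between
    l_in = lasagne.layers.InputLayer(shape=(batch_size, 1, 28, 28))
    l_conv1 = lasagne.layers.Conv2DLayer(l_in, num_filters=32, filter_size=(5, 5),
                                         nonlinearity=lasagne.nonlinearities.rectify)
    l_pool1 = lasagne.layers.MaxPool2DLayer(l_conv1, pool_size=(2, 2))
    l_conv2 = lasagne.layers.Conv2DLayer(l_pool1, num_filters=32, filter_size=(5, 5),
                                         nonlinearity=lasagne.nonlinearities.rectify)
    l_pool2 = lasagne.layers.MaxPool2DLayer(l_conv2, pool_size=(2, 2))
    l_hidden = lasagne.layers.DenseLayer(
        lasagne.layers.DropoutLayer(l_pool2, p=0.5),
        num_units=256, nonlinearity=lasagne.nonlinearities.rectify)
    l_out = lasagne.layers.DenseLayer(
        lasagne.layers.DropoutLayer(l_hidden, p=0.5),
        num_units=10, nonlinearity=lasagne.nonlinearities.softmax)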
All of this runs just by dashing off a little Python code (a lie), so doesn't it sound fun?
Looking at the mnist.py code, defining a model seems easy, so you assume the library is easy to use; but the moment you modify mnist.py to feed your own data into The Strongest Model I Ever Devised, it refuses to work the way you expect.
For one thing, the mnist.py code is needlessly complex. It is presumably meant to be generic, but there are all sorts of unstated requirements, and mysterious type errors come flying out one after another.
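One such requirement you can reconstruct from the sample further down: labels must be int32 to match T.ivector, and features should match theano.config.floatX. A hedged sketch; raw_features and raw_labels are hypothetical stand-ins for your own data:

    import numpy
    import theano

    raw_features = [[0.0] * 64] * 10   # placeholder for whatever you loaded
    raw_labels = [0] * 10              # placeholder labels
    # cast before feeding the compiled functions, or Theano greets you with type errors
    X = numpy.asarray(raw_features, dtype=theano.config.floatX)   # matches T.matrix
    y = numpy.asarray(raw_labels, dtype=numpy.int32)              # matches T.ivector; int64 will not do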
So give up on modifying it and write code from scratch. The documentation duly has the word TUTORIAL in it (gotcha #2).
Open it, and:
Understand the MNIST example
TODO:
Good thing it wasn't a paper manual, or I would have slammed it against the wall. Long live digitization.
"Fine then, I'll just go and understand the MNIST sample code myself."
I started reading with that resolve, only to find that the code for training and prediction runs over 100 lines. For someone grown soft on scikit-learn (a lukewarm bath) where machine learning takes just a few lines, this is mighty painful.
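For contrast, the kind of scikit-learn few-liner that complaint has in mind might look like this (LogisticRegression as a stand-in model; my example, not anything from the Lasagne docs):

    import sklearn.datasets
    from sklearn.linear_model import LogisticRegression

    data = sklearn.datasets.load_digits()
    # train on all but the last 400 digits, score on the held-out 400
    clf = LogisticRegression().fit(data.data[:-400], data.target[:-400])
    print(clf.score(data.data[-400:], data.target[-400:]))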
So, here is a small Lasagne sample, pared down to the bare minimum, which I worked through and believe I more or less understand.
import numpy
import lasagne
import theano
import theano.tensor as T

#### dataset
def digits_dataset(test_N = 400):
    import sklearn.datasets
    data = sklearn.datasets.load_digits()

    numpy.random.seed(0)
    z = numpy.arange(data.data.shape[0])
    numpy.random.shuffle(z)
    X = data.data[z>=test_N, :]
    y = numpy.array(data.target[z>=test_N], dtype=numpy.int32)
    test_X = data.data[z<test_N, :]
    test_y = numpy.array(data.target[z<test_N], dtype=numpy.int32)
    return X, y, test_X, test_y

X, y, test_X, test_y = digits_dataset()
N, input_dim = X.shape
n_classes = 10
print(X.shape, test_X.shape)

#### model
batch_size = 100

l_in = lasagne.layers.InputLayer(
    shape=(batch_size, input_dim),
)
l_hidden1 = lasagne.layers.DenseLayer(
    l_in,
    num_units=512,
    nonlinearity=lasagne.nonlinearities.rectify,
)
l_hidden2 = lasagne.layers.DenseLayer(
    l_hidden1,
    num_units=64,
    nonlinearity=lasagne.nonlinearities.rectify,
)
model = lasagne.layers.DenseLayer(
    l_hidden2,
    num_units=n_classes,
    nonlinearity=lasagne.nonlinearities.softmax,
)

#### loss function
objective = lasagne.objectives.Objective(model,
    loss_function=lasagne.objectives.categorical_crossentropy)

X_batch = T.matrix('x')
y_batch = T.ivector('y')
loss_train = objective.get_loss(X_batch, target=y_batch)

#### update function
learning_rate = 0.01
momentum = 0.9
all_params = lasagne.layers.get_all_params(model)
updates = lasagne.updates.nesterov_momentum(
    loss_train, all_params, learning_rate, momentum)

#### training
train = theano.function(
    [X_batch, y_batch], loss_train,
    updates=updates
)

#### prediction
loss_eval = objective.get_loss(X_batch, target=y_batch, deterministic=True)
pred = T.argmax(
    lasagne.layers.get_output(model, X_batch, deterministic=True), axis=1)
accuracy = T.mean(T.eq(pred, y_batch), dtype=theano.config.floatX)

test = theano.function([X_batch, y_batch], [loss_eval, accuracy])

#### inference
numpy.random.seed()
nlist = numpy.arange(N)
for i in xrange(100):
    numpy.random.shuffle(nlist)
    for j in xrange(N / batch_size):
        ns = nlist[batch_size*j:batch_size*(j+1)]
        train_loss = train(X[ns], y[ns])
    loss, acc = test(test_X, test_y)
    print("%d: train_loss=%.4f, test_loss=%.4f, test_accuracy=%.4f"
          % (i+1, train_loss, loss, acc))
What is this code doing?
- The data is digits, bundled with scikit-learn's datasets: 1,797 images of the digits 0 through 9 (8x8 pixels, 16 gray levels). That is the only thing scikit-learn is used for this time (lol)*1
- 400 samples go to the test set and the remaining 1,397 to the training set. Making the test set a nice round number is groundwork for next time.
- The model has two hidden layers (512 units in the first, 64 in the second). After 100 passes of training it reaches around 97% accuracy. (A quick sanity check follows this list.)
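As that sanity check of the bullet points, a sketch that reuses digits_dataset and model from the listing above:

    # shapes follow from 1797 samples of 8x8 = 64 features, with 400 held out
    X, y, test_X, test_y = digits_dataset()
    print(X.shape, test_X.shape)   # -> (1397, 64) (400, 64)
    # parameter count of the 64 -> 512 -> 64 -> 10 network, weights plus biases:
    # 64*512+512 + 512*64+64 + 64*10+10 = 66762
    params = lasagne.layers.get_all_params(model)
    print(sum(p.get_value().size for p in params))   # -> 66762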
I'll save the detailed walkthrough for next time, but if you keep in mind that Lasagne's whole job is just one thing, generating the loss (objective) function from a model described in internal-DSL style, I think this code reads without any particular pain.
Parameter updates during training, evaluation on the test data, and so on are left almost entirely to Theano, and at present users have to write that glue themselves (which is why there is so much code you must write). Well, it is still 0.1.dev.
Also, this code does not save the trained model (at this sample-data scale there is no real need to save it anyway). Still, that will not fly in general, and that feature request has no doubt been raised already.
When you do need it, lasagne.layers.get_all_params(model) returns the list of Theano SharedVariables holding the parameters, so persist those by whatever means you like.
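For instance, a minimal sketch with numpy.savez; this is my own approach rather than an official Lasagne API, and it assumes the model variable from the listing above:

    import numpy

    params = lasagne.layers.get_all_params(model)
    # save: pull the raw arrays out of the SharedVariables into one .npz file
    numpy.savez("model.npz", *[p.get_value() for p in params])

    # load: read them back and push them into a model built with the same architecture
    f = numpy.load("model.npz")
    values = [f["arr_%d" % i] for i in range(len(f.files))]
    for p, v in zip(params, values):
        p.set_value(v)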
To be continued.
*1: which is why the code deliberately wraps things in a def to cut a scope and imports sklearn there.