Trying Out Sonnet, DeepMind's Deep Learning Library
Hi, Ryobot here! The season has come round when gazing at the evening cherry blossoms puts a smile on your face.
Update (4/19): DeepMind has released a Sonnet implementation of Differentiable Neural Computers. Here's hoping that implementations of PathNet, Elastic Weight Consolidation, and the like follow.
Sonnet is a deep learning library made by DeepMind, released just on April 7. It was originally a TensorFlow wrapper library used internally at DeepMind, and it appears to have been open-sourced to make sharing paper implementations easier. Sonnet's most distinctive feature is that computation graphs are built by connecting reusable modules, each any number of times. Since Sonnet can be used together with TensorFlow Core functions and other high-level libraries (TF Slim and the like), the underlying TensorFlow code shows through more than with Keras, another wrapper. By abstracting the general-purpose but verbose TensorFlow toward neural networks, it lets you describe complex networks such as RNNs readably and with little code.
Sonnet seems to be inspired by the deep learning library Torch and adopts a similar object-oriented style (before adopting TensorFlow as its standard library, DeepMind used Torch). For more details, see the following:
- Press release: https://deepmind.com/blog/open-sourcing-sonnet/
- GitHub: https://github.com/deepmind/sonnet
- Qiita: 「DeepMindのSonnetを触ったので、TensorFlowやKerasと比較しながら解説してみた」 (a hands-on comparison with TensorFlow and Keras, in Japanese)
In this post, I will cover everything from installation to concrete usage and examples.
Installation
The repository README states compatibility with Linux / Mac OS X and Python 2.7, but judging from the pull request "Python3 compatibility? #10", it more or less runs on Python 3 after changes in a handful of places. I tried the post-merge code in a Python 3.5 environment myself, and a few errors remained. The examples will probably run cleanly on Python 3 within the week or so. (He says, counting on somebody else to make that happen.)
Installing Sonnet requires compiling the library with bazel (see the bazel documentation).
On a Mac, bazel is easy to install with Homebrew:
$ brew install bazel
Naturally, TensorFlow must also be installed beforehand; see here for instructions. One caveat: any version other than the latest, 1.0.1, raises an error when installing Sonnet.
# If a version other than the latest is already installed,
# uninstall TensorFlow first.
$ sudo pip uninstall tensorflow
# Install the latest TensorFlow 1.0.1 into the Mac Python 2.7 environment.
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.0.1-py2-none-any.whl
# If an error like "Cannot remove entries from nonexistent file ... easy-install.pth"
# appears, adding --ignore-installed may help.
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.0.1-py2-none-any.whl --ignore-installed
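To confirm that the expected version is the one actually active, a quick check of my own (nothing Sonnet-specific, just standard TensorFlow):

$ python -c "import tensorflow as tf; print(tf.__version__)"
1.0.1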
Installing Sonnet
# Download the Sonnet source code.
$ git clone --recursive https://github.com/deepmind/sonnet
# Run configure.
$ cd sonnet/tensorflow
$ ./configure
$ cd ../
# Run the install script to create a wheel file.
$ mkdir /tmp/sonnet
$ bazel build --config=opt :install
$ ./bazel-bin/install /tmp/sonnet
# Install the generated wheel file.
# Any TensorFlow other than the latest 1.0.1 fails here (1.0.0 did not work for me).
$ pip install /tmp/sonnet/*.whl
That completes the installation!
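As a smoke test (my own, not from the README), try importing Sonnet from outside the source tree, so Python does not pick up the local sonnet/ directory by accident:

$ cd ~
$ python -c "import sonnet as snt; print(snt.__name__)"
sonnet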
Using Sonnet
Sonnet's basic usage pattern is to create modules that define forward propagation, in a Torch-style object-oriented way. Concretely, you build Python objects that subclass sonnet.AbstractModule (these are called Modules, and each is a piece of a neural network), then connect each of them to the computation graph individually, any number of times. Variables declared inside a module are automatically shared across its connection calls, and TensorFlow's low-level machinery for controlling variable sharing, such as name scopes and the reuse= flag, is abstracted away (hidden) from the user.
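For contrast, here is what that sharing looks like in raw TensorFlow 1.0, where the caller has to manage reuse by hand. This is my own illustration, not from the Sonnet docs, and all the names in it are made up:

import tensorflow as tf

def linear(inputs, output_size):
    # A plain TF linear layer built with tf.get_variable so it can be shared.
    input_size = inputs.get_shape().as_list()[1]
    w = tf.get_variable("w", shape=[input_size, output_size])
    b = tf.get_variable("b", shape=[output_size])
    return tf.matmul(inputs, w) + b

x_train = tf.placeholder(tf.float32, [None, 10])
x_test = tf.placeholder(tf.float32, [None, 10])

# The caller must flip reuse on at exactly the right moment.
with tf.variable_scope("model") as scope:
    train_out = linear(x_train, 5)
    scope.reuse_variables()  # forget this line and the next call raises a ValueError
    test_out = linear(x_test, 5)

In Sonnet, by contrast, the connection calls in the next section do this bookkeeping automatically.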
Basics
import sonnet as snt
import tensorflow as tf

# Training data and test data.
train_data = get_training_data()
test_data = get_test_data()

# A Linear module is a fully connected layer holding variables
# (the weight and bias of a linear transformation).
lin_to_hidden = snt.Linear(output_size=FLAGS.hidden_size, name='inp_to_hidden')
hidden_to_out = snt.Linear(output_size=FLAGS.output_size, name='hidden_to_out')

# A Sequential module applies a series of internal modules or
# operations (ops) to the given data. This example applies, in order,
# a Linear module, the sigmoid op, and another Linear module. Low-level
# TF ops such as sigmoid hold no variables, so they can be mixed freely
# with Sonnet modules.
mlp = snt.Sequential([lin_to_hidden, tf.sigmoid, hidden_to_out])

# A Sequential can be connected to the computation graph multiple times.
train_predictions = mlp(train_data)
test_predictions = mlp(test_data)

# Initialize the weights and biases.
# tf.truncated_normal(shape, mean=0.0, stddev=1.0,
#                     dtype=tf.float32, seed=None, name=None)
# tf.truncated_normal samples randomly from a normal distribution,
# discarding and redrawing values more than 2 standard deviations
# from the mean.
initializers = {"w": tf.truncated_normal_initializer(stddev=1.0),
                "b": tf.truncated_normal_initializer(stddev=1.0)}

# Regularize the weights and biases.
# L1 regularization makes the weights sparse; L2 regularization
# prevents overfitting to the training data. scale is the coefficient
# of the regularization term; at 0.0 the regularization disappears.
regularizers = {"w": tf.contrib.layers.l1_regularizer(scale=0.1),
                "b": tf.contrib.layers.l2_regularizer(scale=0.1)}

# A Linear module with initialization and regularization attached.
linear_regression_module = snt.Linear(output_size=FLAGS.output_size,
                                      initializers=initializers,
                                      regularizers=regularizers)

# The computed regularization losses are added to a collection named
# tf.GraphKeys.REGULARIZATION_LOSSES.
# Fetch the collection's regularization losses and take their sum.
graph_regularizers = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
total_regularization_loss = tf.reduce_sum(graph_regularizers)

# Minimize the sum of the ordinary loss and the regularization loss.
train_op = optimizer.minimize(loss + total_regularization_loss)
Custom modules
Now let's define a custom module by writing a new class that inherits from snt.AbstractModule. The class definition reads much like Chainer. The module's constructor __init__ first calls the superclass constructor, super(MyClass, self).__init__, passing the module name name. The name keyword argument of __init__ should always come last in the argument list, and its default value is the snake_case form of the class (module) name, e.g. my_mlp (lowercased, underscore-separated).
class MyMLP(snt.AbstractModule):
    def __init__(self, hidden_size, output_size,
                 nonlinearity=tf.tanh, name="my_mlp"):
        super(MyMLP, self).__init__(name)
        self._hidden_size = hidden_size
        self._output_size = output_size
        self._nonlinearity = nonlinearity
Next, implement the _build() method. _build() is called every time the module is connected to the tf.Graph, and it receives as input a structure containing any number of tensors (possibly none, or just one). When there are multiple tensors, they are given as a tuple or namedtuple, whose elements are in turn tensors or tuples/namedtuples. Input tensors are given as batches, and if there is a color channel it goes in the last dimension. Incidentally, lists and dicts are not supported, since their mutability is a frequent source of bugs.
    def _build(self, inputs):
        """Computes output tensors from input tensors."""
        lin_x_to_h = snt.Linear(output_size=self._hidden_size, name="x_to_h")
        lin_h_to_o = snt.Linear(output_size=self._output_size, name="h_to_o")
        return lin_h_to_o(self._nonlinearity(lin_x_to_h(inputs)))
The _build() method has roughly the following three roles:
- using internal modules (submodules)
- using existing modules passed to the constructor
- creating variables
Variables must always be created with tf.get_variable. Calling the tf.Variable constructor instead works the first time the module is connected, but the second call raises an error.
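As an illustration, a minimal sketch of my own (the MyBias class is hypothetical, loosely mirroring what snt.Linear does internally) of a module that creates its own variable with tf.get_variable inside _build():

class MyBias(snt.AbstractModule):
    """Adds a learned bias vector to its input."""

    def __init__(self, name="my_bias"):
        super(MyBias, self).__init__(name=name)

    def _build(self, inputs):
        input_size = inputs.get_shape().as_list()[1]
        # tf.get_variable cooperates with Sonnet's scoping, so the same
        # variable is picked up again on every connection of this module.
        b = tf.get_variable("b", shape=[input_size],
                            initializer=tf.zeros_initializer())
        return inputs + b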
In the MyMLP code above, a new snt.Linear instance is created each time _build() is called. You might expect the variables created that way to be distinct and unshared, but that is not the case: however many times the MLP is connected to the computation graph, only four variables are created (two per Linear, the weight and the bias).
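This is easy to check directly. A sanity check of my own, using the MyMLP class above with arbitrary sizes: connecting the module twice does not grow the variable count.

mlp = MyMLP(hidden_size=16, output_size=4)
out1 = mlp(tf.placeholder(tf.float32, [None, 8]))
out2 = mlp(tf.placeholder(tf.float32, [None, 8]))
# Two Linears, each with a weight and a bias, shared across both connections.
print(len(tf.trainable_variables()))  # => 4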
Declaring submodules
As with Sequential, a module can receive and use other modules that were constructed externally.
A submodule is a module constructed inside the code of another module, called its parent module. Most implementations, including LSTM, construct one or more Linear modules internally. Submodules should be created inside _build() so that the variable scopes nest correctly.
class ParentModule(snt.AbstractModule):
    def __init__(self, hidden_size, name="parent_module"):
        super(ParentModule, self).__init__(name=name)
        self._hidden_size = hidden_size

    def _build(self, inputs):
        lin_mod = snt.Linear(self._hidden_size)  # construct the submodule
        return tf.nn.relu(lin_mod(inputs))       # connect it
The variables created by Linear get names like parent_module/linear/w.
For practical reasons, some users will want to construct submodules inside the constructor. In that case, the submodules must be constructed within a call to self._enter_variable_scope so that the variables nest properly.
class OtherParentModule(snt.AbstractModule):
    def __init__(self, hidden_size, name="other_parent_module"):
        super(OtherParentModule, self).__init__(name=name)
        self._hidden_size = hidden_size
        with self._enter_variable_scope():  # this line is the important one
            self._lin_mod = snt.Linear(self._hidden_size)  # construct the submodule here

    def _build(self, inputs):
        return tf.nn.relu(self._lin_mod(inputs))  # connect the module built in advance
Recurrent modules
Sonnet provides recurrent core modules, the counterpart of TensorFlow's cells, each of which performs one timestep of computation. They can be unrolled along the time axis with TensorFlow's dynamic_rnn() function. For the LSTM module it looks like this:
hidden_size = 5
batch_size = 20

# input_sequence is a tensor of shape
# [num_timesteps, batch_size, input_features].
input_sequence = ...
lstm = snt.LSTM(hidden_size)
initial_state = lstm.initial_state(batch_size)
output_sequence, final_state = tf.nn.dynamic_rnn(
    lstm, input_sequence, initial_state=initial_state, time_major=True)
The batch_size passed to the initial_state() method may also be an int32 tensor rather than a Python integer.
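For example, a sketch of my own of that case (assuming initial_state accepts a scalar tensor, per the sentence above), reading the batch size off the input tensor at graph-construction time:

# A scalar int32 tensor derived from the input works as batch_size too.
batch_size_t = tf.shape(input_sequence)[1]
initial_state = lstm.initial_state(batch_size_t)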
Finally, let's define a custom recurrent module.
A recurrent module is a subclass of snt.RNNCore, which inherits from both snt.AbstractModule and tf.RNNCell. Thanks to this multiple inheritance you can use Sonnet's variable-sharing model and still use TensorFlow's RNN containers: the best of both worlds.
class Add1RNN(snt.RNNCore):
    """A simple core that adds 1 to the recurrent internal state and outputs 0.

    This core computes:
    (`input`, (`state1`, `state2`)) -> (`output`, (`next_state1`, `next_state2`))

    (`state1`, `state2`) are the batched memory cell and hidden vector.
    All elements are tensors, with `next_statei` = `statei` + 1 and `output` = 0.
    All outputs (`output` and `state`) have shape (`batch_size`, `hidden_size`).
    """

    def __init__(self, hidden_size, name="add1_rnn"):
        """Constructor of the module.

        Args:
          hidden_size: int, output size of the module.
          name: name of the module.
        """
        super(Add1RNN, self).__init__(name=name)
        self._hidden_size = hidden_size

    def _build(self, inputs, state):
        """Builds the TensorFlow subgraph that runs one timestep of computation."""
        batch_size = tf.TensorShape([inputs.get_shape()[0]])
        outputs = tf.zeros(shape=batch_size.concatenate(self.output_size))
        state1, state2 = state
        next_state = (state1 + 1, state2 + 1)
        return outputs, next_state

    @property
    def state_size(self):
        """Returns the sizes of the internal state, without the batch dimension."""
        return (tf.TensorShape([self._hidden_size]),
                tf.TensorShape([self._hidden_size]))

    @property
    def output_size(self):
        """Returns the output size, without the batch dimension."""
        return tf.TensorShape([self._hidden_size])

    def initial_state(self, batch_size, dtype):
        """Returns the internal state initialized to 0.

        NOTE: this method is spelled out for explanatory purposes;
        a corresponding method already exists in the superclass.
        """
        sz1, sz2 = self.state_size
        # Prepend the batch size to the shape of each internal state
        # and create zeros.
        return (tf.zeros([batch_size] + sz1.as_list(), dtype=dtype),
                tf.zeros([batch_size] + sz2.as_list(), dtype=dtype))
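To see the core in action, here is a short unroll (my own usage sketch, with arbitrary sizes): after five timesteps both state parts should be 5 while the outputs stay 0.

batch_size, hidden_size, num_steps = 2, 3, 5

core = Add1RNN(hidden_size)
inputs = tf.zeros([num_steps, batch_size, hidden_size])  # contents are ignored
initial_state = core.initial_state(batch_size, tf.float32)
outputs, final_state = tf.nn.dynamic_rnn(
    core, inputs, initial_state=initial_state, time_major=True)

with tf.Session() as sess:
    state1, state2 = sess.run(final_state)
    print(state1)  # every entry is 5.0 after 5 timesteps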
Trying the rnn_shakespeare example (a multi-layer LSTM)
rnn_shakespeare.py is the source for a character-level language model (one that generates text one character at a time) built on a multi-layer LSTM.
Let's look at the model code. The rest, such as the training loop, is ordinary TensorFlow, so I won't go into it here.
class TextModel(snt.AbstractModule):
  """A deep LSTM model using the small Shakespeare dataset."""

  def __init__(self, num_embedding, num_hidden, lstm_depth, output_size,
               use_dynamic_rnn=True, use_skip_connections=True,
               name="text_model"):
    """Constructs a `TextModel`.

    Args:
      num_embedding: Size of the embedding used after the one-hot encoded input.
      num_hidden: Number of hidden units in each LSTM layer.
      lstm_depth: Number of LSTM layers.
      output_size: Size of the output layer on top of the deep RNN.
      use_dynamic_rnn: Whether to use TensorFlow's dynamic_rnn;
        `False` uses static_rnn. Default `True`.
      use_skip_connections: Whether to use skip connections in
        `snt.DeepRNN`. Default `True`.
      name: Name of the module.
    """
    super(TextModel, self).__init__(name=name)
    self._num_embedding = num_embedding
    self._num_hidden = num_hidden
    self._lstm_depth = lstm_depth
    self._output_size = output_size
    self._use_dynamic_rnn = use_dynamic_rnn
    self._use_skip_connections = use_skip_connections

    with self._enter_variable_scope():
      self._embed_module = snt.Linear(self._num_embedding, name="linear_embed")
      self._output_module = snt.Linear(self._output_size, name="linear_output")
      self._lstms = [
          snt.LSTM(self._num_hidden, name="lstm_{}".format(i))
          for i in range(self._lstm_depth)
      ]  # A list of lstm_depth LSTM modules.
      self._core = snt.DeepRNN(self._lstms, skip_connections=True,
                               name="deep_lstm")

  def _build(self, one_hot_input_sequence):
    """Builds the subgraph of the deep LSTM model.

    Args:
      one_hot_input_sequence: Tensor of the input sequence encoded as
        one-hot; its shape is `[truncation_length, batch_size, output_size]`.

    Returns:
      output_sequence_logits: Tensor of the batched output logits; its
        shape is `[truncation_length, batch_size, output_size]`.
      final_state: final_state of the unrolled core.
    """
    input_shape = one_hot_input_sequence.get_shape()
    batch_size = input_shape[1]

    # The BatchApply module merges the time and batch dimensions of the
    # tensor, applies the inner module, and then splits the merged
    # dimension back apart.
    batch_embed_module = snt.BatchApply(self._embed_module)
    input_sequence = batch_embed_module(one_hot_input_sequence)
    input_sequence = tf.nn.relu(input_sequence)

    initial_state = self._core.initial_state(batch_size)

    if self._use_dynamic_rnn:
      output_sequence, final_state = tf.nn.dynamic_rnn(
          cell=self._core,
          inputs=input_sequence,
          time_major=True,
          initial_state=initial_state)
    else:
      # tf.unstack unstacks the first dimension of the tensor and
      # returns a list of tensors.
      rnn_input_sequence = tf.unstack(input_sequence)
      output, final_state = tf.contrib.rnn.static_rnn(
          cell=self._core,
          inputs=rnn_input_sequence,
          initial_state=initial_state)
      output_sequence = tf.stack(output)  # tf.stack stacks a list of tensors.

    batch_output_module = snt.BatchApply(self._output_module)
    output_sequence_logits = batch_output_module(output_sequence)

    return output_sequence_logits, final_state

  # Decorating with @snt.experimental.reuse_vars lets the method
  # reuse variables just like _build() does.
  @snt.experimental.reuse_vars
  def generate_string(self, initial_logits, initial_state, sequence_length):
    """Builds a subgraph that generates a string sampled from the model.

    Args:
      initial_logits: Initial logits to sample from.
      initial_state: Initial internal state of the RNN core.
      sequence_length: Number of characters to sample.

    Returns:
      generated_string: Tensor of characters; its shape is
        `[sequence_length, batch_size, output_size]`.
    """
    current_logits = initial_logits
    current_state = initial_state

    generated_letters = []
    for _ in xrange(sequence_length):
      # Sample a character index from the distribution.
      char_index = tf.squeeze(tf.multinomial(current_logits, 1))
      char_one_hot = tf.one_hot(char_index, self._output_size, 1.0, 0.0)
      generated_letters.append(char_one_hot)

      # Feed the one-hot character into the deep_lstm and get the logits.
      gen_out_seq, current_state = self._core(
          tf.nn.relu(self._embed_module(char_one_hot)),
          current_state)
      current_logits = self._output_module(gen_out_seq)

    generated_string = tf.stack(generated_letters)

    return generated_string
Now run training and testing.
The FLAGS defaults give 10000 iterations, a 3-layer LSTM, batch size 32, embedding size 32, and hidden layer size 128.
$ python rnn_shakespeare.py
INFO:tensorflow:Create CheckpointSaverHook.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tf/rnn_shakespeare/model.ckpt.
INFO:tensorflow:0: Training loss 270.180481.
INFO:tensorflow:1: Training loss 248.135193.
INFO:tensorflow:2: Training loss 249.848511.
...
INFO:tensorflow:498: Training loss 106.377846.
INFO:tensorflow:Saving checkpoints for 500 into /tmp/tf/rnn_shakespeare/model.ckpt.
INFO:tensorflow:499: Training loss 103.404739.
...
INFO:tensorflow:9996: Training loss 76.107834.
INFO:tensorflow:9997: Training loss 79.831573.
INFO:tensorflow:9998: Training loss 79.587143.
INFO:tensorflow:Saving checkpoints for 10000 into /tmp/tf/rnn_shakespeare/model.ckpt.
INFO:tensorflow:9999: Training loss 82.172134. Validation loss 92.072212. Sample = b_0: jealouses|Wherevoctmen to rail this night, my earnest lie.| Not say, fare you wench them: lest not lip fight:| Then, to old Prince what false as heart:| But sight God straight your lordship his tongue| To shameliness smell there. Concey, I'll keep| the deserves of substigues to me to me from mine:| As ever aimish thee, to do no mannerers,| Farewell. Let no strength-houser sorrow!| Forbight but easy and stay all doing, Marcius,| Romeo, most smaction upon our thanks| The petition of princely man, that boldy's soldier;| RIbe the rambroak and bird makes it have I din| Wedly seal and meet live.||BENVOLIO:| What with the tribunes? Why, this is the| upon speaks of tell thou? I'll six him untutor'd their ease:| 'Tis tear in the boar of his infrict so drinks;| And he harks out upon this issue.||Nurse:| Ay, to do him in rise and death.| Potness me to going and solely secret's| And my pentice comes the that in, about a word,| Ready with words, till the poor prevail'd too| But counterft and happy hath told thee.|Art t
INFO:tensorflow:Reducing learning rate.
INFO:tensorflow:Test loss 117.373634
Looking at the generated text, it has not quite become English, presumably for lack of training.
Closing thoughts
There are many deep learning libraries, and lately PyTorch and DyNet seem to be gaining momentum alongside TensorFlow and Keras. PyTorch and DyNet (and Chainer) both adopt Define-by-Run, an approach where the flow of data determines the network structure. The graph structures created by Define-by-Run are called dynamic computation graphs or dynamic neural networks, and they feel like the biggest trend right now. Incidentally, other libraries such as TensorFlow, Caffe, Theano, and Torch adopt Define-and-Run, an approach where the network structure is fixed first and data is fed through it afterwards. Each design philosophy has advantages and disadvantages in ease of coding, saving fixed graphs, support for distributed systems, and so on, so neither is unconditionally superior. See here for details.
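As a caricature of the difference (my own sketch, not tied to any particular library): in Define-and-Run you declare a fixed graph and then push data through it, while in Define-by-Run the "graph" is simply the trace of whatever the program executes, so ordinary control flow shapes the computation per example.

import tensorflow as tf

# Define-and-Run (TensorFlow): declare the graph first, feed data later.
x = tf.placeholder(tf.float32, [None, 4])
y = tf.reduce_sum(x * 2.0)
with tf.Session() as sess:
    print(sess.run(y, feed_dict={x: [[1, 2, 3, 4]]}))  # 20.0

# Define-by-Run (PyTorch/Chainer style), mimicked here in plain Python:
# the structure is whatever operations the data actually flows through.
def forward(values):
    total = 0.0
    for v in values:      # the loop itself determines the structure
        if v > 0:         # data-dependent branching, no tf.cond required
            total += v * 2.0
    return total

print(forward([1.0, -2.0, 3.0]))  # 8.0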
Sonnet, being TensorFlow-based, sits on the Define-and-Run side, yet it makes complex networks such as RNNs fairly easy to write. The library's true worth is still an open question, but given the chance that DeepMind's state-of-the-art work (e.g., AlphaGo, DNC, EWC) gets released as Sonnet implementations, keeping an eye on its development seems the safe bet.
On a tangent: I only just got around to trying TensorFlow's TensorBoard, and its convenience took me by surprise. The video below shows MNIST embeddings being visualized with t-SNE.
TensorFlow recently gained dynamic computation graph support through TensorFlow Fold, and it now has no fewer than six excellent wrappers and high-level libraries: Keras, Sonnet, Edward, seq2seq, tf.contrib.learn, and TensorFlow-Slim. In NARUTO terms, that's cheat-level strength, like Nagato plus the Six Paths of Pain.