What this is about
This post explains how to run distributed training of TensorFlow models on Google Cloud ML. I'm publishing it as a quick memo for my own reference.
There are several patterns for distributed training; here I describe the simplest one, data parallelism. Each node applies its own portion of the training data to the same model and computes the gradient vectors used to update the Variables. The gradients computed by each node are then applied to the shared, common Variables.
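To make the idea concrete, here is a minimal sketch of data-parallel SGD in plain Python. This is not the actual TensorFlow mechanism; the toy loss, numbers, and names are made up purely for illustration. Each "worker" computes a gradient on its own mini-batch, and every gradient is applied to the same shared parameter (in the simple scheme used in this post, the updates arrive asynchronously).

# Conceptual sketch of data-parallel SGD (plain Python; illustration only).
shared_w = 0.0                      # the shared "Variable" held by the parameter server
learning_rate = 0.1

def worker_gradient(w, batch):
    # Gradient of the toy loss sum((w - y)^2) / len(batch) with respect to w.
    return sum(2.0 * (w - y) for y in batch) / len(batch)

# One mini-batch per worker; in a real cluster these run in parallel.
batches = [[1.0, 1.2], [0.8, 1.1], [1.3, 0.9]]
for batch in batches:
    grad = worker_gradient(shared_w, batch)
    shared_w -= learning_rate * grad   # the parameter server applies each update
print(shared_w)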
Prerequisites
Distributed training in TensorFlow uses three kinds of nodes.
・Parameter Server: updates the Variables using the gradient vectors computed by the Workers.
・Worker: computes gradient vectors from the training data.
・Master: performs the same work as a Worker, plus additional tasks such as saving the trained model and evaluating it against the test set.
Typically you run one or two Parameter Server nodes, as many Worker nodes as needed, and a single Master node. When you submit a job to Cloud ML, these nodes are created as containers and the code is executed on each of them.
Getting the cluster configuration and the node's role
The TensorFlow code is identical on every node; inside the code, processing branches according to the node's role. The cluster configuration and the node's own role are obtained through the environment variable TF_CONFIG. The following code builds the cluster configuration object cluster_spec and stores the node's role in job_name and task_index.
# Get cluster and node info
env = json.loads(os.environ.get('TF_CONFIG', '{}'))
cluster_info = env.get('cluster', None)
cluster_spec = tf.train.ClusterSpec(cluster_info)
task_info = env.get('task', None)
job_name, task_index = task_info['type'], task_info['index']
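For reference, Cloud ML sets TF_CONFIG automatically. A hypothetical value for a small cluster might decode to something like the following; the host names and port numbers are made-up examples, and only the structure matters for the parsing code above.

# Hypothetical content of TF_CONFIG (host names and ports are made up).
example_tf_config = {
    'cluster': {
        'ps':     ['ps-0:2222'],
        'master': ['master-0:2222'],
        'worker': ['worker-0:2222', 'worker-1:2222'],
    },
    'task': {'type': 'worker', 'index': 1},
}
# With this value, the code above yields job_name == 'worker' and task_index == 1,
# and cluster_spec describes all four nodes.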
job_name contains one of the strings "ps", "master", or "worker". task_index holds a sequential number starting from 0 when there are multiple nodes of the same kind. Using this information, create a Server object as follows; this object handles the communication with the other nodes in the cluster.
server = tf.train.Server(cluster_spec, job_name=job_name, task_index=task_index)
Processing on the Parameter Server
On a Parameter Server, all that is needed is the following call on the Server object; after that, the Server object provides the Parameter Server functionality by itself. This call does not return until the process is stopped from outside.
if job_name == "ps": # Parameter server
    server.join()
Processing on the Workers
Workers (and the Master) need to run the training loop. To coordinate with the other nodes while doing so, you first create a Supervisor object and then perform operations such as saving checkpoint files and creating the session through it.
The following is an example of creating the Supervisor object.
if job_name == "master" or job_name == "worker": # Worker node
    is_chief = (job_name == "master")
    ...
    global_step = tf.Variable(0, trainable=False)
    init_op = tf.global_variables_initializer()
    saver = tf.train.Saver()

    # Create a supervisor
    sv = tf.train.Supervisor(is_chief=is_chief, logdir=LOG_DIR, init_op=init_op,
                             saver=saver, summary_op=None,
                             global_step=global_step, save_model_secs=0)
・is_chief: pass True on the Master node.
・logdir: directory where checkpoint files are saved
・init_op: operation that initializes the Variables when the session is created
・saver: Saver object used to save checkpoint files
・global_step: Variable that counts how many times the optimization step has been executed
・summary_op: summary operation for TensorBoard (specify None when saving summaries yourself rather than through the Supervisor)
・save_model_secs: interval for periodic checkpoint saving (specify 0, as in the code above, to disable automatic saving and save checkpoints explicitly instead)
When defining the model, use job_name and task_index to set the node's own role with tf.device, as shown below, and define the model inside that with block.
device_fn = tf.train.replica_device_setter(
    cluster=cluster_spec,
    worker_device="/job:%s/task:%d" % (job_name, task_index)
)
...
if job_name == "master" or job_name == "worker": # Worker node
    is_chief = (job_name == "master")
    with tf.Graph().as_default() as graph:
        with tf.device(device_fn):
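If it helps to see what replica_device_setter does, the following standalone sketch uses a made-up two-node cluster: Variables created inside the with block are placed on the ps job, while ordinary operations stay on the device given by worker_device. This is my illustration, not part of the original code.

import tensorflow as tf

# Minimal sketch with a made-up cluster; no servers need to be running,
# since device placement is only graph metadata at this point.
cluster = tf.train.ClusterSpec({'ps': ['ps-0:2222'], 'worker': ['worker-0:2222']})
setter = tf.train.replica_device_setter(cluster=cluster,
                                        worker_device='/job:worker/task:0')
with tf.device(setter):
    v = tf.Variable(0.0)   # placed on the ps job (e.g. /job:ps/task:0)
    y = v * 2.0            # stays on /job:worker/task:0
print(v.device, y.device)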
Next comes the part that creates the session and runs the training loop.
# Create a session and run training loops
with sv.managed_session(server.target) as sess:
    reports, step = 0, 0
    start_time = time.time()
    while not sv.should_stop() and step < MAX_STEPS:
        images, labels = mnist_data.train.next_batch(BATCH_SIZE)
        feed_dict = {x:images, t:labels, keep_prob:0.5}
        _, loss_val, step = sess.run([train_step, loss, global_step],
                                     feed_dict=feed_dict)
        if step > CHECKPOINT * reports:
            reports += 1
            logging.info("Step: %d, Train loss: %f" % (step, loss_val))
            if is_chief:
                # Save checkpoint
                sv.saver.save(sess, sv.save_path, global_step=step)
                ...
                # Save summary
                feed_dict = {test_loss:loss_val, test_accuracy:acc_val}
                sv.summary_computed(sess,
                    sess.run(summary, feed_dict=feed_dict), step)
                sv.summary_writer.flush()
The key point here is that global_step is evaluated together with the optimization step train_step that updates the Variables. global_step is incremented by 1 on every update, so its value gives the total number of training steps performed across all nodes. In this example the value is stored in the variable step, and a progress log is written every CHECKPOINT steps overall. In addition, is_chief (True on the Master), defined earlier, is used to perform extra work only on the Master; in this example that work is writing the summary and saving a checkpoint. sv.save_path contains the directory that was specified as logdir when the Supervisor was created.
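The reason evaluating train_step also advances global_step is that global_step is passed to the optimizer's minimize() when the training op is defined (see get_trainer in mnist.py below). A stripped-down, self-contained sketch of that coupling, with a toy loss of my own:

import tensorflow as tf

# Sketch: passing global_step to minimize() makes the optimizer increment it
# by 1 on every update, so its value is the total number of updates applied.
w = tf.Variable(3.0)
loss = tf.square(w - 1.0)
global_step = tf.Variable(0, trainable=False)
train_step = tf.train.AdamOptimizer(0.0001).minimize(loss, global_step=global_step)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(3):
        _, step = sess.run([train_step, global_step])  # fetch the step together
    print(sess.run(global_step))   # 3: one increment per update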
Saving the trained model
One tricky aspect of distributed training is how to save the trained model. The model built above has node placement information attached for distributed training, but this information is unnecessary when using the trained model for classification. Therefore, we rebuild a classification-only model that can be restored on a single node, and save that instead.
First, execute the following right after leaving the training loop. This saves the final state of the Variables to a checkpoint file and then calls export_model, which rebuilds and saves the model.
if is_chief: # Export the final model
    sv.saver.save(sess, sv.save_path, global_step=sess.run(global_step))
    export_model(tf.train.latest_checkpoint(LOG_DIR))
The rebuild-and-save function looks like this:
def export_model(last_checkpoint):
    # Create a session with a new graph.
    with tf.Session(graph=tf.Graph()) as sess:
        x = tf.placeholder(tf.float32, [None, 784])
        p = mnist.get_model(x, None, training=False)

        # Define key elements
        input_key = tf.placeholder(tf.int64, [None,])
        output_key = tf.identity(input_key)

        # Define API inputs/outputs object
        inputs = {'key': input_key.name, 'image': x.name}
        outputs = {'key': output_key.name, 'scores': p.name}
        tf.add_to_collection('inputs', json.dumps(inputs))
        tf.add_to_collection('outputs', json.dumps(outputs))

        init_op = tf.global_variables_initializer()
        sess.run(init_op)

        # Restore the latest checkpoint and save the model
        saver = tf.train.Saver()
        saver.restore(sess, last_checkpoint)
        saver.export_meta_graph(filename=MODEL_DIR + '/export.meta')
        saver.save(sess, MODEL_DIR + '/export', write_meta_graph=False)
Here we prepare a new graph and session, define only the minimal relationship that computes the prediction p from the input x, and then restore the contents of the checkpoint saved earlier. Values of Variables that are not contained in this session are simply ignored. In addition, the names of the input and output tensors are stored as JSON in collections so that the classification code can retrieve them. output_key is a tensor that simply passes through whatever value is fed into the placeholder input_key; it is used to match each output to its corresponding input when processing multiple records at once.
The complete code
The model definition is an ordinary CNN for MNIST, prepared without much tuning.
trainer
├── __init__.py   # empty file
├── mnist.py      # model definition
└── task.py       # training code
trainer/task.py
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
import time, json, os, logging

import mnist

flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_integer('batch_size', 100,
                     'Batch size. Must divide evenly into the dataset sizes.')
flags.DEFINE_integer('max_steps', 10000, 'Number of steps to run trainer.')
flags.DEFINE_integer('checkpoint', 100, 'Interval steps to save checkpoint.')
flags.DEFINE_string('log_dir', '/tmp/logs',
                    'Directory to store checkpoints and summary logs')
flags.DEFINE_string('model_dir', '/tmp/model', 'Directory to store trained model')

# Global flags
BATCH_SIZE = FLAGS.batch_size
MODEL_DIR = FLAGS.model_dir
LOG_DIR = FLAGS.log_dir
MAX_STEPS = FLAGS.max_steps
CHECKPOINT = FLAGS.checkpoint


def export_model(last_checkpoint):
    # Create a session with a new graph.
    with tf.Session(graph=tf.Graph()) as sess:
        x = tf.placeholder(tf.float32, [None, 784])
        p = mnist.get_model(x, None, training=False)

        # Define key elements
        input_key = tf.placeholder(tf.int64, [None,])
        output_key = tf.identity(input_key)

        # Define API inputs/outputs object
        inputs = {'key': input_key.name, 'image': x.name}
        outputs = {'key': output_key.name, 'scores': p.name}
        tf.add_to_collection('inputs', json.dumps(inputs))
        tf.add_to_collection('outputs', json.dumps(outputs))

        init_op = tf.global_variables_initializer()
        sess.run(init_op)

        # Restore the latest checkpoint and save the model
        saver = tf.train.Saver()
        saver.restore(sess, last_checkpoint)
        saver.export_meta_graph(filename=MODEL_DIR + '/export.meta')
        saver.save(sess, MODEL_DIR + '/export', write_meta_graph=False)


def run_training():
    # Get cluster and node info
    env = json.loads(os.environ.get('TF_CONFIG', '{}'))
    cluster_info = env.get('cluster', None)
    cluster_spec = tf.train.ClusterSpec(cluster_info)
    task_info = env.get('task', None)
    job_name, task_index = task_info['type'], task_info['index']

    device_fn = tf.train.replica_device_setter(
        cluster=cluster_spec,
        worker_device="/job:%s/task:%d" % (job_name, task_index)
    )

    logging.info('Start job:%s, index:%d' % (job_name, task_index))

    # Create server
    server = tf.train.Server(cluster_spec,
                             job_name=job_name, task_index=task_index)

    if job_name == "ps": # Parameter server
        server.join()

    if job_name == "master" or job_name == "worker": # Worker node
        is_chief = (job_name == "master")

        with tf.Graph().as_default() as graph:
            with tf.device(device_fn):
                # Prepare training data
                mnist_data = input_data.read_data_sets("/tmp/data/", one_hot=True)

                # Create placeholders
                x = tf.placeholder_with_default(
                        tf.zeros([BATCH_SIZE, 784], tf.float32), shape=[None, 784])
                t = tf.placeholder_with_default(
                        tf.zeros([BATCH_SIZE, 10], tf.float32), shape=[None, 10])
                keep_prob = tf.placeholder_with_default(
                        tf.zeros([], tf.float32), shape=[])
                global_step = tf.Variable(0, trainable=False)

                # Add test loss and test accuracy to summary
                test_loss = tf.placeholder_with_default(
                                tf.zeros([], tf.float32), shape=[])
                test_accuracy = tf.placeholder_with_default(
                                    tf.zeros([], tf.float32), shape=[])
                tf.summary.scalar("Test_loss", test_loss)
                tf.summary.scalar("Test_accuracy", test_accuracy)

                # Define a model
                p = mnist.get_model(x, keep_prob, training=True)
                train_step, loss, accuracy = mnist.get_trainer(p, t, global_step)

                init_op = tf.global_variables_initializer()
                saver = tf.train.Saver()
                summary = tf.summary.merge_all()

                # Create a supervisor
                sv = tf.train.Supervisor(is_chief=is_chief, logdir=LOG_DIR,
                                         init_op=init_op, saver=saver,
                                         summary_op=None,
                                         global_step=global_step,
                                         save_model_secs=0)

                # Create a session and run training loops
                with sv.managed_session(server.target) as sess:
                    reports, step = 0, 0
                    start_time = time.time()
                    while not sv.should_stop() and step < MAX_STEPS:
                        images, labels = mnist_data.train.next_batch(BATCH_SIZE)
                        feed_dict = {x:images, t:labels, keep_prob:0.5}
                        _, loss_val, step = sess.run(
                            [train_step, loss, global_step], feed_dict=feed_dict)
                        if step > CHECKPOINT * reports:
                            reports += 1
                            logging.info("Step: %d, Train loss: %f"
                                         % (step, loss_val))
                            if is_chief:
                                # Save checkpoint
                                sv.saver.save(sess, sv.save_path, global_step=step)

                                # Evaluate the test loss and test accuracy
                                loss_vals, acc_vals = [], []
                                for _ in range(len(mnist_data.test.labels)
                                               // BATCH_SIZE):
                                    images, labels = \
                                        mnist_data.test.next_batch(BATCH_SIZE)
                                    feed_dict = {x:images, t:labels, keep_prob:1.0}
                                    loss_val, acc_val = sess.run(
                                        [loss, accuracy], feed_dict=feed_dict)
                                    loss_vals.append(loss_val)
                                    acc_vals.append(acc_val)
                                loss_val, acc_val = \
                                    np.sum(loss_vals), np.mean(acc_vals)

                                # Save summary
                                feed_dict = {test_loss:loss_val,
                                             test_accuracy:acc_val}
                                sv.summary_computed(sess,
                                    sess.run(summary, feed_dict=feed_dict), step)
                                sv.summary_writer.flush()

                                logging.info("Time elapsed: %d"
                                             % (time.time() - start_time))
                                logging.info(
                                    "Step: %d, Test loss: %f, Test accuracy: %f"
                                    % (step, loss_val, acc_val))

                    # Finish training
                    if is_chief: # Export the final model
                        sv.saver.save(sess, sv.save_path,
                                      global_step=sess.run(global_step))
                        export_model(tf.train.latest_checkpoint(LOG_DIR))

                sv.stop()


def main(_):
    run_training()


if __name__ == '__main__':
    logging.basicConfig(level=logging.INFO)
    tf.app.run()
trainer/mnist.py
import tensorflow as tf
import json


def get_model(x, keep_prob, training=True):
    num_filters1 = 32
    num_filters2 = 64

    with tf.name_scope('cnn'):
        with tf.name_scope('convolution1'):
            x_image = tf.reshape(x, [-1,28,28,1])
            W_conv1 = tf.Variable(tf.truncated_normal([5,5,1,num_filters1],
                                                      stddev=0.1))
            h_conv1 = tf.nn.conv2d(x_image, W_conv1,
                                   strides=[1,1,1,1], padding='SAME')
            b_conv1 = tf.Variable(tf.constant(0.1, shape=[num_filters1]))
            h_conv1_cutoff = tf.nn.relu(h_conv1 + b_conv1)
            h_pool1 = tf.nn.max_pool(h_conv1_cutoff, ksize=[1,2,2,1],
                                     strides=[1,2,2,1], padding='SAME')

        with tf.name_scope('convolution2'):
            W_conv2 = tf.Variable(
                        tf.truncated_normal([5,5,num_filters1,num_filters2],
                                            stddev=0.1))
            h_conv2 = tf.nn.conv2d(h_pool1, W_conv2,
                                   strides=[1,1,1,1], padding='SAME')
            b_conv2 = tf.Variable(tf.constant(0.1, shape=[num_filters2]))
            h_conv2_cutoff = tf.nn.relu(h_conv2 + b_conv2)
            h_pool2 = tf.nn.max_pool(h_conv2_cutoff, ksize=[1,2,2,1],
                                     strides=[1,2,2,1], padding='SAME')

        with tf.name_scope('fully-connected'):
            h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*num_filters2])
            num_units1 = 7*7*num_filters2
            num_units2 = 1024
            w2 = tf.Variable(tf.truncated_normal([num_units1, num_units2]))
            b2 = tf.Variable(tf.constant(0.1, shape=[num_units2]))
            hidden2 = tf.nn.relu(tf.matmul(h_pool2_flat, w2) + b2)

        with tf.name_scope('output'):
            if training:
                hidden2_drop = tf.nn.dropout(hidden2, keep_prob)
            else:
                hidden2_drop = hidden2
            w0 = tf.Variable(tf.zeros([num_units2, 10]))
            b0 = tf.Variable(tf.zeros([10]))
            p = tf.nn.softmax(tf.matmul(hidden2_drop, w0) + b0)

    tf.summary.histogram("conv_filters1", W_conv1)
    tf.summary.histogram("conv_filters2", W_conv2)

    return p


def get_trainer(p, t, global_step):
    with tf.name_scope('optimizer'):
        loss = -tf.reduce_sum(t * tf.log(p), name='loss')
        train_step = tf.train.AdamOptimizer(0.0001).minimize(
                         loss, global_step=global_step)

    with tf.name_scope('evaluator'):
        correct_prediction = tf.equal(tf.argmax(p, 1), tf.argmax(t, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32),
                                  name='accuracy')

    return train_step, loss, accuracy
To train with this code on Cloud ML, run the following commands from Cloud Shell. (Run them in the parent directory of the trainer directory.)
$ PROJECT_ID=project01 # your project ID
$ TRAIN_BUCKET="gs://$PROJECT_ID-mldata"
$ gsutil mb $TRAIN_BUCKET

$ cat << EOF > config.yaml
trainingInput:
  # Use a cluster with many workers and a few parameter servers.
  scaleTier: STANDARD_1
EOF

$ JOB_NAME="job01"
$ gsutil rm -rf $TRAIN_BUCKET/$JOB_NAME
$ touch .dummy
$ gsutil cp .dummy $TRAIN_BUCKET/$JOB_NAME/train/
$ gsutil cp .dummy $TRAIN_BUCKET/$JOB_NAME/model/
$ gcloud beta ml jobs submit training ${JOB_NAME} \
    --package-path=trainer \
    --module-name=trainer.task \
    --staging-bucket="${TRAIN_BUCKET}" \
    --region=us-central1 \
    --config=config.yaml \
    -- \
    --log_dir=$TRAIN_BUCKET/$JOB_NAME/train \
    --model_dir=$TRAIN_BUCKET/$JOB_NAME/model \
    --max_steps=10000
To watch the progress with TensorBoard, run:
$ tensorboard --port 8080 --logdir $TRAIN_BUCKET/$JOB_NAME/train
When training finishes, the trained model files (export.meta, together with the export.data-xxxx variable files) are written under $TRAIN_BUCKET/$JOB_NAME/model.
The following is an example of restoring the trained model and running predictions with it.
#!/usr/bin/python
import tensorflow as tf
import numpy as np
import json
from tensorflow.examples.tutorials.mnist import input_data

model_meta = 'gs://project01-mldata/job01/model/export.meta'
model_param = 'gs://project01-mldata/job01/model/export'

with tf.Graph().as_default() as graph:
    sess = tf.InteractiveSession()
    saver = tf.train.import_meta_graph(model_meta)
    saver.restore(sess, model_param)
    inputs = json.loads(tf.get_collection('inputs')[0])
    outputs = json.loads(tf.get_collection('outputs')[0])
    x = graph.get_tensor_by_name(inputs['image'])
    input_key = graph.get_tensor_by_name(inputs['key'])
    p = graph.get_tensor_by_name(outputs['scores'])
    output_key = graph.get_tensor_by_name(outputs['key'])

    mnist_data = input_data.read_data_sets("/tmp/data/", one_hot=True)
    images, labels = mnist_data.test.next_batch(10)
    index = range(10)

    keys, preds = sess.run([output_key, p],
                           feed_dict={input_key:index, x:images})
    for key, pred, label in zip(keys, preds, labels):
        print key, np.argmax(pred), np.argmax(label)
Disclaimer: All code snippets are released under Apache 2.0 License. This is not an official Google product.