Can Deep Learning Tell the Osomatsu-san Sextuplets Apart? (Implementation)
Last time, as preparation for telling the Osomatsu-san brothers apart with deep learning, I collected 5,644 face images of them. This time I'll use those images to build and evaluate a classifier trained with deep learning.
Collected images
| Character | Count | Example |
|---|---|---|
| Osomatsu | 1126 | ![]() |
| Karamatsu | 769 | ![]() |
| Choromatsu | 1047 | ![]() |
| Ichimatsu | 736 | ![]() |
| Jushimatsu | 855 | ![]() |
| Todomatsu | 729 | ![]() |
| Other | 383 | |
Framework used
Google recently announced TensorFlow, a new deep learning framework. I wrote up how to use it on our company blog, but since I'm not yet used to it, I'll use Chainer this time. Conveniently, the Chainer samples include the ImageNet NIN model (a multi-layer convolutional neural network) that has already produced strong results, so I'll adapt that.
For how to use the ImageNet sample, I referred to this article and this one.
Training dataset
To train with the ImageNet sample, you need two files, train.txt and test.txt. Both look like
```
/path/to/image/hoge1.jpg 1
/path/to/image/hoge2.jpg 5
/path/to/image/hoge3.jpg 2
/path/to/image/hoge4.jpg 4
```
that is, one image per line in the form "image-path label-number".
Of the collected images, 85% go into the training data and 15% into the test data.
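To produce train.txt and test.txt with this 85/15 split, something like the following works. This is a minimal sketch: the per-character directory layout and the `label_dirs` mapping are my assumptions, not from the original article.

```python
import os
import random


def list_entries(root, label_dirs):
    # Collect "path label" lines for every image under each labeled directory.
    # label_dirs maps a directory name to its integer class label (assumed layout).
    entries = []
    for dirname, label in sorted(label_dirs.items()):
        d = os.path.join(root, dirname)
        for fname in sorted(os.listdir(d)):
            entries.append("%s %d" % (os.path.join(d, fname), label))
    return entries


def split_entries(entries, train_ratio=0.85, seed=0):
    # Shuffle deterministically, then split into train/test lists.
    entries = list(entries)
    random.Random(seed).shuffle(entries)
    n_train = int(len(entries) * train_ratio)
    return entries[:n_train], entries[n_train:]
```

Writing the first list to train.txt and the second to test.txt (one entry per line) gives files in the format shown above.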
Code used for training
I basically use the ImageNet sample code, with a few changes. First, so that a model trained on the GPU can also be used on a CPU, I convert it before saving: in train_imagenet.py, when the final model is saved, convert it to CPU format first.
```python
# Change the final lines: save the model in CPU-usable form
model.to_cpu()
pickle.dump(model, open(args.out, 'wb'), -1)
```
Also, nin.py has no method that directly returns classification results, so I add a predict method that returns the scores passed through a softmax. I used this article as a reference.
```python
def predict(self, x_data, train=False):
    x = chainer.Variable(x_data, volatile=True)
    h = F.relu(self.conv1(x))
    h = F.relu(self.conv1a(h))
    h = F.relu(self.conv1b(h))
    h = F.max_pooling_2d(h, 3, stride=2)
    h = F.relu(self.conv2(h))
    h = F.relu(self.conv2a(h))
    h = F.relu(self.conv2b(h))
    h = F.max_pooling_2d(h, 3, stride=2)
    h = F.relu(self.conv3(h))
    h = F.relu(self.conv3a(h))
    h = F.relu(self.conv3b(h))
    h = F.max_pooling_2d(h, 3, stride=2)
    h = F.dropout(h, train=train)
    h = F.relu(self.conv4(h))
    h = F.relu(self.conv4a(h))
    h = F.relu(self.conv4b(h))
    h = F.reshape(F.average_pooling_2d(h, 6), (x_data.shape[0], 1000))
    return F.softmax(h)
```
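The softmax at the end of this method turns the raw class scores into probabilities that sum to 1. A minimal NumPy sketch of just that final step, independent of Chainer:

```python
import numpy as np


def softmax(scores):
    # Shift by the max for numerical stability, then normalize the exponentials.
    e = np.exp(scores - np.max(scores))
    return e / e.sum()


probs = softmax(np.array([2.0, 1.0, 0.1]))
# probs sums to 1, and the largest raw score keeps the largest probability.
```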
Training
Now I generate the mean image and start training. The images need to be 256×256, so I used this crop.py to adjust the size.
```
$ ./crop.py /path/to/image/directory /path/to/image/directory
$ ./compute_mean.py train.txt --root /path/to/image/directory
$ ./train_imagenet.py train.txt test.txt --batchsize 14 --val_batchsize 80 --epoch 50 --gpu 0 --root /path/to/image/directory
```
Results on the training data
From the log data, the error rate on the training data evolved as follows.
For the plot I used this plot.py.
By around 10,000 iterations the error has mostly bottomed out.
The final error rate is 0.02857 on train and 0.0360 on val, i.e. 96.4% accuracy.
About the Osomatsu-san classifier
Using the face classifier built above (the deep learning part) together with the face detector (HOG + SVM) from the previous article, I make it recognize the brothers in capture images.
The processing steps are:

1. Resize the image
2. Cut out faces with the face detector
3. Run each cropped face through the classifier
4. Draw the results on the original image

The source is below.
```python
#!/usr/bin/env python
#! -*- coding: utf-8 -*-
import os
import sys
import random
import dlib
from skimage import io
import numpy as np
import cv2
import argparse
from PIL import Image
import six
import cPickle as pickle
from six.moves import queue

parser = argparse.ArgumentParser(
    description='Learning convnet from ILSVRC2012 dataset')
parser.add_argument('image', help='Path to image')
parser.add_argument('--mean', '-m', default='mean.npy',
                    help='Path to the mean file (computed by compute_mean.py)')
parser.add_argument('--detector', '-d', default='detector.svm',
                    help='Path to the detector file ')
parser.add_argument('--model', '-mo', default='model',
                    help='Path to the model file')
args = parser.parse_args()

# Load the detector, input image, model, mean image, and labels
detector = dlib.simple_object_detector(args.detector)
img = cv2.imread(args.image, 1)

import nin
model = pickle.load(open(args.model, 'rb'))
mean_image = pickle.load(open(args.mean, 'rb'))
categories = np.loadtxt("labels.txt", str, delimiter="\t")
cropwidth = 256 - model.insize
out = img.copy()


def resize(img):
    # Scale the shorter side to 256, then center-crop to 256x256
    height, width, depth = img.shape
    print "height:" + str(height) + " width:" + str(width)
    output_side_length = 256
    new_height = output_side_length
    new_width = output_side_length
    if height > width:
        new_height = output_side_length * height / width
    else:
        new_width = output_side_length * width / height
    resized_img = cv2.resize(img, (new_width, new_height))
    height_offset = (new_height - output_side_length) / 2
    width_offset = (new_width - output_side_length) / 2
    cropped_img = resized_img[height_offset:height_offset + output_side_length,
                              width_offset:width_offset + output_side_length]
    return cropped_img


def read_image(src_img, center=True, flip=False):
    # Data loading routine: crop to the model input size and normalize
    image = np.asarray(Image.open(src_img)).transpose(2, 0, 1)
    if center:
        top = left = cropwidth / 2
    else:
        top = random.randint(0, cropwidth - 1)
        left = random.randint(0, cropwidth - 1)
    bottom = model.insize + top
    right = model.insize + left
    image = image[:, top:bottom, left:right].astype(np.float32)
    image -= mean_image[:, top:bottom, left:right]
    image /= 255
    return image

# Face detection
dets = detector(img)
print "faces:" + str(len(dets))
height, width, depth = img.shape
x = np.ndarray((len(dets), 3, model.insize, model.insize), dtype=np.float32)
faces = [() for i in range(len(dets))]
for i, d in enumerate(dets):
    # Clamp the face region to the image bounds
    f_top = max((0, d.top()))
    f_bottom = min((d.bottom(), height - 1))
    f_left = max((0, d.left()))
    f_right = min((d.right(), width - 1))
    print "%d %d %d %d" % (f_top, f_bottom, f_left, f_right)
    faces[i] = (f_top, f_bottom, f_left, f_right)
    face_img = img[f_top:f_bottom, f_left:f_right]
    resized_face = resize(face_img)
    cv2.imwrite("temp.jpg", resized_face)
    x[i] = read_image("temp.jpg")

# Classification scores for the cropped faces
scores = model.predict(x)

# Draw the results
face_info = []
for i, face in enumerate(faces):
    prediction = zip(scores.data[i], categories)
    prediction.sort(cmp=lambda x, y: cmp(x[0], y[0]), reverse=True)
    score, name = prediction[0]
    for j in range(6):
        print('%s score:%4.1f%%' % (prediction[j][1], prediction[j][0] * 100))
    if name == "osomatsu":
        color = (0, 0, 255)
    elif name == "karamatsu":
        color = (255, 0, 0)
    elif name == "choromatsu":
        color = (0, 255, 0)
    elif name == "ichimatsu":
        color = (133, 22, 200)
    elif name == "jushimatsu":
        color = (0, 255, 255)
    elif name == "todomatsu":
        color = (167, 160, 255)
    else:
        color = (255, 255, 255)
    cv2.rectangle(out, (face[2], face[0]), (face[3], face[1]), color, 3)
    cv2.putText(out, "%s" % (name), (face[2], face[1] + 15),
                cv2.FONT_HERSHEY_COMPLEX, 0.5, color)
    cv2.putText(out, "%4.1f%%" % (score * 100), (face[2], face[1] + 30),
                cv2.FONT_HERSHEY_COMPLEX, 0.5, color)

cv2.imshow('image', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
```
Trying it out
From episodes in the training data
First, let's try frames from episodes included in the training data. Recognition results are shown by drawing a box around each face in that brother's color.
Just to recap, the brothers are: Osomatsu, Karamatsu, Choromatsu, Ichimatsu, Jushimatsu, and Todomatsu.
From the OP
The error rate on the training data is only a few percent, so these are recognized correctly.
A few more frames
From episodes outside the training data
The real question is whether it can recognize faces from episodes not included in the training data. Since the training data only covers up to partway through a certain episode, I extracted frames from that episode and analyzed them.
Jushimatsu, who was mistaken for Ichimatsu at one point, is correctly recognized as Jushimatsu (impressive — I can't tell them apart myself just from the image).
On the other hand, it misrecognized Ichimatsu as Todomatsu.
Looking at the scores for the misrecognized face:
```
todomatsu  score:56.3%
karamatsu  score:37.9%
ichimatsu  score: 3.8%
osomatsu   score: 1.9%
jushimatsu score: 0.2%
```
The other, correctly recognized faces score nearly 100%, so the confidence here is comparatively low.
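The ranking above is just the six-way softmax output sorted in descending order. In Python 3 the `zip`/`cmp`-based sort from the script can be written as follows; the score vector here is made up for illustration, not real model output:

```python
import numpy as np

categories = ["osomatsu", "karamatsu", "choromatsu",
              "ichimatsu", "jushimatsu", "todomatsu"]
# Hypothetical softmax output for one detected face (illustrative values only).
scores = np.array([0.019, 0.379, 0.001, 0.038, 0.002, 0.561])

# Pair each score with its label and sort descending, like the script does.
ranking = sorted(zip(scores, categories), reverse=True)
for score, name in ranking:
    print("%s score:%4.1f%%" % (name, score * 100))
```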
Judging from the other results, images with few distinguishing features seem to tend toward high Todomatsu scores.
Here, everyone is recognized correctly.
This time everyone except Ichimatsu came out as Todomatsu.
Ichimatsu and Jushimatsu — the two I can also tell apart myself — seem to get high accuracy.
Having tried it out
Honestly, I had worried that even after training it wouldn't be able to tell them apart, so my impression is that it distinguishes them better than I expected. Face images that are hard to tell apart with the human eye probably fall into one of two cases:

- the image contains few distinguishing facial features, or
- we don't know what the distinguishing points are.

In the former case the faces simply aren't drawn distinctly, so even deep learning will have trouble. In the latter, the faces are drawn distinctly but humans haven't figured out how; in that case deep learning can find the features and tell them apart.
As for accuracy, I think there is still room for improvement. The quality of the training images needs to go up. When cropping from captures, characters that don't move produce identical images, so the effective number of images isn't very large. For example, suppose all six brothers are on screen and only Choromatsu moves for about 60 frames. Cropping all 60 frames yields 360 face images, but 295 of them are duplicates (each of the five static brothers contributes 59 duplicates). And even among the 60 crops of the moving Choromatsu, the mouth only has three or four patterns — open, half-open, closed — so of the 360 images there are effectively only about nine distinct ones. I avoided capturing the same frame twice as much as possible during collection, but duplicates are still common. So it would be better to remove identical images and generate new ones from each remaining image by adding noise or applying various filters.
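A minimal sketch of the kind of augmentation suggested above, using NumPy alone; in practice you would likely add OpenCV blurs, rotations, brightness jitter, and so on. The specific perturbations and parameters here are my assumptions, not from the article:

```python
import numpy as np


def augment(img, rng):
    # img: H x W x 3 uint8 face crop. Returns a list of perturbed copies.
    out = []
    # 1) Horizontal flip
    out.append(img[:, ::-1, :])
    # 2) Additive Gaussian noise (sigma=10 is an arbitrary choice), clipped
    #    back to the valid pixel range
    noisy = img.astype(np.float32) + rng.normal(0, 10, img.shape)
    out.append(np.clip(noisy, 0, 255).astype(np.uint8))
    # 3) Simple brightness shift
    out.append(np.clip(img.astype(np.int16) + 30, 0, 255).astype(np.uint8))
    return out
```

Running each deduplicated face through a few such transforms multiplies the effective dataset size without recapturing frames.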
Also, the Osomatsu-san face detector built in the previous article isn't very accurate and misses quite a few faces, so better selection of face images is needed to improve the detector as well. I'll keep working toward higher accuracy.
References
Training Chainer's NIN on your own image set for recognition
Trying out TensorFlow, the artificial intelligence library released by Google