This is a write-up of the talk and demo I gave at Kyoto.なんか #3, held on the 19th.
Introduction: why the detector matters
I have been working on idol face identification for a long time. Identifying and classifying faces (Classification) works with a CNN, but there is another task I still have not handled well.
That task is detecting face regions in an image (Detection, Localization).
To identify who appears in an image, you first have to detect the faces in that image. Each detected face is then passed to the classifier, which labels it as "this face is person A" or "this face is person B".
Detecting the face region is needed to crop the input images given to the classifier, and the dataset used to train that classifier is also built by detecting and cropping face regions from all kinds of images and then labeling them. So face region detection is indispensable to the face identification task.
The previous approach
Until now, to collect face images for the dataset, I used a home-made detector built on OpenCV with rotation correction.
Region detection with OpenCV's Haar feature-based cascade classifiers ships with pretrained data for frontal faces, eyes and so on, which makes it arguably the easiest detector to get started with. Its weakness is that accuracy drops sharply on tilted faces, so it often fails on idol selfies, where the face is frequently at an angle. To get around this, I generated copies of the source image rotated in small increments, ran the detector on each copy, and merged the results, which gave me a detector that handles tilted faces with reasonable accuracy.
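The post does not say how the per-rotation results were merged, so the following is only a sketch of one common way to do it. It assumes the detections from every rotated copy have already been mapped back into the original image as axis-aligned `(x, y, w, h)` boxes, and it drops any box that overlaps an already kept one too much. The function names and the `0.5` threshold are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def merge_detections(boxes, threshold=0.5):
    """Greedy merge (assumed strategy): keep larger boxes first and
    skip any box whose IoU with a kept box reaches the threshold."""
    kept = []
    for box in sorted(boxes, key=lambda b: b[2] * b[3], reverse=True):
        if all(iou(box, k) < threshold for k in kept):
            kept.append(box)
    return kept


# Two near-identical detections from different rotations plus one distinct face.
merged = merge_detections([(10, 10, 100, 100), (15, 12, 100, 100), (300, 300, 50, 50)])
```

Here the two overlapping boxes collapse into one and the distant box survives, so `merged` holds two faces.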
See this earlier post for details.
This detects face regions with reasonable accuracy. Because it also detects the positions of both eyes along with the face, the tilt angle can be computed as the arctangent atan2 of the y and x differences between the detected eye coordinates.
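The tilt calculation can be written as a minimal sketch like the one below. It assumes each eye is given as an `(x, y)` center point in image coordinates (y grows downward); the function name `tilt_degrees` is made up for illustration.

```python
import math


def tilt_degrees(left_eye, right_eye):
    """Angle of the line through both eye centers, in degrees.

    With image coordinates (y pointing down), a positive value means
    the right-hand eye sits lower in the image than the left-hand one.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


level = tilt_degrees((100, 100), (200, 100))   # eyes on a horizontal line
tilted = tilt_degrees((100, 100), (200, 200))  # right eye lower by the same amount
```

Using `atan2` instead of `atan(dy / dx)` avoids a division by zero when the eyes are stacked vertically and keeps the sign of the angle correct in every quadrant.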
That covered most of what I wanted, so in my idol face identification project I used this detector to extract face images for the dataset.
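Still, this detector had problems left:
- It is simply slow and takes a long time
  - No surprise, since it creates several rotated images and runs detection on each of them
- There are still many false detections
  - It often detects patterns on walls or clothes as faces
- It is hard to train or customize
  - Apparently it can be trained if you prepare a dataset, but...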
Slowness is not much of a problem when collecting images for a dataset, but it is fatal where an interactive response is expected, as with a face identification bot. For lack of a better option, the bot currently uses the Cloud Vision API for detection.
Accuracy was also a bit of a problem: for people with long hair in particular, the detected region often came out larger than the actual face. Perhaps the face and hair are hard to tell apart?
Switching to a detector based on other features such as LBP, or to another library such as dlib, might improve things. But since I was at it, I wanted to build a detector with Deep Learning and train it on my own data, so that is what I decided to try this time.
Object detection with Deep Learning
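Object detection with Deep Learning has been studied from many angles and seems to have advanced remarkably in recent years. The article below introduces the representative methods that have been proposed.
SSD (Single Shot MultiBox Detector), introduced there as the most recent one, looked promising because it detects very quickly and with high accuracy, so I skimmed the paper as well. The original implementation is in caffe, and several people had written TensorFlow versions. I felt I roughly understood the principle and set out to write my own in TensorFlow as a learning exercise, but it was too hard and I gave up partway through. That was around January this year.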
Object Detection API
Time passed, and in mid-June this year the "Object Detection API" was released in TensorFlow Models, the official TensorFlow model repository. It comes with five general object detection models pretrained on the MS COCO dataset.
| Model name | Speed | COCO mAP |
| --- | --- | --- |
| ssd_mobilenet_v1_coco | fast | 21 |
| ssd_inception_v2_coco | fast | 24 |
| rfcn_resnet101_coco | medium | 30 |
| faster_rcnn_resnet101_coco | medium | 32 |
| faster_rcnn_inception_resnet_v2_atrous_coco | slow | 37 |
The further down the table, the higher the accuracy, but the models also get larger and slower. The top two are SSD models. The original paper used VGG16 as the base CNN, whereas here there are two variants, one using the lightweight MobileNet and one using Inception V2. Their detection accuracy is lower than that of the Faster RCNN based models, but SSD does look far faster.
The repository also provides, and carefully documents, a mechanism for retraining on a different dataset by transfer learning from a pretrained model.
In other words, if you can prepare a tfrecord file in the format the models in this repository expect, you can easily train and use a detector on it.
There was no reason not to use it, so I gave it a try.
Building a training dataset from the FDDB dataset
I could have prepared data from the idol face images I have been collecting, but I wanted to start with a publicly available dataset. Looking around, I found one called FDDB.
It contains 2,845 images. Each face in them is represented as an ellipse, and the annotations give its center coordinates, major and minor axes, and tilt angle, for a total of 5,171 faces.
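As an illustration of how such an ellipse relates to a rectangular box, here is a minimal sketch. It assumes each annotation line has the form `major_radius minor_radius angle center_x center_y`, with the angle in radians, optionally followed by a trailing label. The function names are hypothetical, and the post does not state that bounding boxes were derived exactly this way.

```python
import math


def parse_fddb_ellipse(line):
    """Parse one annotation line (assumed field order, see the lead-in)."""
    major, minor, angle, cx, cy = [float(v) for v in line.split()[:5]]
    return {'major': major, 'minor': minor, 'angle': angle, 'cx': cx, 'cy': cy}


def ellipse_to_bbox(e):
    """Tight axis-aligned box (xmin, ymin, xmax, ymax) around a rotated ellipse."""
    cos_t, sin_t = math.cos(e['angle']), math.sin(e['angle'])
    # Half extents of an ellipse with semi-axes (major, minor) rotated by angle.
    half_w = math.hypot(e['major'] * cos_t, e['minor'] * sin_t)
    half_h = math.hypot(e['major'] * sin_t, e['minor'] * cos_t)
    return (e['cx'] - half_w, e['cy'] - half_h, e['cx'] + half_w, e['cy'] + half_h)


# An upright face: the major axis points straight up (angle = pi / 2).
ellipse = parse_fddb_ellipse('100.0 60.0 1.5707963267948966 200.0 150.0  1')
bbox = ellipse_to_bbox(ellipse)
```

For this upright example the box is 120 pixels wide and 200 pixels tall, centered on (200, 150).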
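That should be enough to train face region detection, but on its own it would not give me the tilt of a face. As with the OpenCV detector, the angle could be computed if the positions of both eyes were known, but unfortunately the bundled annotations do not include eye positions.
However, since both the position and the tilt of each face are given, it should be possible to target that region and detect the eyes with OpenCV.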
For each face annotation, I
- rotate the image to cancel out the given tilt, and
- crop a slightly oversized square of size major axis * 1.1, centered on the face center.
These steps produce an image of the region where the face should appear upright, and I run OpenCV face detection on that. Converting the coordinates of the detected eye regions back to pre-rotation coordinates then gives the eye regions in the original image.
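The conversion back to pre-rotation coordinates amounts to applying the inverse of the affine transform used for the crop. The sketch below does this in plain Python for a 2x3 matrix laid out as `[[a, b, tx], [c, d, ty]]`. The helper names are hypothetical, and it assumes eyes are reported as `(x, y, w, h)` relative to the detected face rectangle, which is itself `(x, y, w, h)` inside the rotated crop.

```python
def invert_affine(m):
    """Invert a 2x3 affine matrix [[a, b, tx], [c, d, ty]]."""
    (a, b, tx), (c, d, ty) = m
    det = a * d - b * c
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return [[ia, ib, -(ia * tx + ib * ty)],
            [ic, id_, -(ic * tx + id_ * ty)]]


def apply_affine(m, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


def eye_center_in_original(eye, face, m):
    """Map the center of an eye box back into the original image."""
    # Eye boxes are relative to the face crop, so add the face offset first.
    cx = face[0] + eye[0] + eye[2] / 2.0
    cy = face[1] + eye[1] + eye[3] / 2.0
    return apply_affine(invert_affine(m), (cx, cy))


# Round trip with a 90 degree rotation plus translation.
M = [[0.0, 1.0, 10.0], [-1.0, 0.0, 20.0]]
forward = apply_affine(M, (3.0, 4.0))
back = apply_affine(invert_affine(M), forward)
```

If OpenCV is already in use, `cv2.invertAffineTransform` should give the same inverse matrix; the hand-written version is shown only to make the arithmetic visible.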
There are still some false detections, so I filter them roughly to correct or exclude them. The code looks something like this:
```python
import cv2
import math
import os

CASCADES_DIR = os.path.normpath(os.path.join(cv2.__file__, '..', '..', '..', '..', 'share', 'OpenCV', 'haarcascades'))
FACE_CASCADE = cv2.CascadeClassifier(os.path.join(CASCADES_DIR, 'haarcascade_frontalface_default.xml'))
EYES_CASCADE = cv2.CascadeClassifier(os.path.join(CASCADES_DIR, 'haarcascade_eye.xml'))


def detect_faces(img, lines):
    results = []
    for line in lines:
        e = line.split(' ')
        size = max(float(e[0]), float(e[1])) * 1.1
        # Discard faces that are too small
        if size < 60.0:
            break
        # Crop the region in which the face should be upright
        center = (int(float(e[3])), int(float(e[4])))
        angle = float(e[2]) / math.pi * 180.0
        if angle < 0:
            angle += 180.0
        M = cv2.getRotationMatrix2D(center, angle - 90.0, 1)
        M[0, 2] -= float(e[3]) - size
        M[1, 2] -= float(e[4]) - size
        target = cv2.warpAffine(img, M, (int(size * 2), int(size * 2)))
        # Detect the face and eyes in the cropped image
        faces = FACE_CASCADE.detectMultiScale(target)
        if len(faces) != 1:
            print('{} faces found...'.format(len(faces)))
            break
        face = faces[0]
        face_img = target[face[1]:face[1] + face[3], face[0]:face[0] + face[2]]
        eyes = []
        for eye in EYES_CASCADE.detectMultiScale(face_img):
            # An eye whose top edge is not in the upper half of the face image is probably a false detection
            if eye[1] > face_img.shape[0] / 2:
                break
            eyes.append(eye)
        if len(eyes) != 2:
            print('{} eyes found...'.format(len(eyes)))
            break
        # Eyes of very different sizes are implausible, so treat that as a failed detection
        if not (2. / 3. < eyes[0][2] / eyes[1][2] < 3. / 2. and 2. / 3. < eyes[0][3] / eyes[1][3] < 3. / 2.):
            break
        ...
```
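This detected things correctly for the most part.
I built the dataset only from images where this method succeeded, that is, where the same number of faces as in the given annotations were detected correctly together with both eyes.
In the end, 936 of the 2,845 images were usable.
That is a bit few, but it can't be helped.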
It looked like the data had to be split into train and validation sets, so I split it 843:93.
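A split like this can be made reproducible with a seeded shuffle. The sketch below is an assumption about how one might do it, not the post's actual procedure: `split_train_val` is a hypothetical helper, and the integers stand in for the 936 usable image records.

```python
import random


def split_train_val(items, val_count, seed=0):
    """Shuffle deterministically, then hold out val_count items for validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[val_count:], shuffled[:val_count]


# 936 usable images, of which 93 (roughly one tenth) go to validation.
train_items, val_items = split_train_val(range(936), val_count=93)
```

Fixing the seed keeps the same images in the validation set across runs, which matters when comparing models trained at different times.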
Then, for each image, I write out the information in tfrecord format under keys such as image/object/bbox/* and image/object/class/*.
```python
feature = {
    'image/height': tf.train.Feature(int64_list=tf.train.Int64List(value=[h])),
    'image/width': tf.train.Feature(int64_list=tf.train.Int64List(value=[w])),
    'image/filename': tf.train.Feature(bytes_list=tf.train.BytesList(value=[filepath.encode('utf-8')])),
    'image/source_id': tf.train.Feature(bytes_list=tf.train.BytesList(value=[filepath.encode('utf-8')])),
    'image/encoded': tf.train.Feature(bytes_list=tf.train.BytesList(value=[encoded])),
    'image/format': tf.train.Feature(bytes_list=tf.train.BytesList(value=['jpeg'.encode('utf-8')])),
    'image/object/bbox/xmin': tf.train.Feature(float_list=tf.train.FloatList(value=xmin)),
    'image/object/bbox/xmax': tf.train.Feature(float_list=tf.train.FloatList(value=xmax)),
    'image/object/bbox/ymin': tf.train.Feature(float_list=tf.train.FloatList(value=ymin)),
    'image/object/bbox/ymax': tf.train.Feature(float_list=tf.train.FloatList(value=ymax)),
    'image/object/class/text': tf.train.Feature(bytes_list=tf.train.BytesList(value=class_text)),
    'image/object/class/label': tf.train.Feature(int64_list=tf.train.Int64List(value=class_label)),
}
example = tf.train.Example(features=tf.train.Features(feature=feature))
writer.write(example.SerializeToString())
```
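The snippet above does not show how `xmin`, `xmax`, `ymin`, `ymax`, `class_text` and `class_label` are built. The Object Detection API expects box coordinates normalized to the range 0 to 1 by the image width and height, so a minimal sketch of that preparation step could look like this. The helper name, the pixel box layout `(left, top, right, bottom)`, and the single `face` class with id 1 are assumptions for illustration.

```python
def to_normalized_boxes(boxes, width, height):
    """Convert pixel boxes (left, top, right, bottom) into the four
    normalized coordinate lists used by the bbox feature keys."""
    def clamp(v):
        # Keep values inside [0, 1] even if a box pokes outside the image.
        return min(max(v, 0.0), 1.0)

    xmin = [clamp(b[0] / float(width)) for b in boxes]
    ymin = [clamp(b[1] / float(height)) for b in boxes]
    xmax = [clamp(b[2] / float(width)) for b in boxes]
    ymax = [clamp(b[3] / float(height)) for b in boxes]
    return xmin, ymin, xmax, ymax


pixel_boxes = [(50, 20, 150, 120)]
xmin, ymin, xmax, ymax = to_normalized_boxes(pixel_boxes, width=200, height=200)

# One entry per box; class ids start at 1 because 0 is reserved for background.
class_text = ['face'.encode('utf-8')] * len(pixel_boxes)
class_label = [1] * len(pixel_boxes)
```

All six lists must have the same length, one element per face in the image.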
That gave me a dataset of sorts, so the next step was training on it, by fine-tuning from the pretrained ssd_inception_v2_coco model.
The documentation described how to use Google Cloud Machine Learning, but for some reason I could not get it to work (to be retried), so this time I trained on an EC2 g2.2xlarge instance.
At a little under 2 seconds per step, training advanced by about 50,000 steps in a full day, and it looked like the model had mostly learned.
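As a quick sanity check of those two figures, the arithmetic can be spelled out. This is only a back-of-the-envelope sketch; the helper name is hypothetical and "a full day" is taken to mean exactly 24 hours.

```python
def seconds_per_step(total_steps, hours=24.0):
    """Average wall-clock seconds spent on each training step."""
    return hours * 3600.0 / total_steps


# About 50,000 steps in one day works out to roughly 1.73 seconds per step,
# which matches "a little under 2 seconds".
avg = seconds_per_step(50000)
```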
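The image at the top of this post shows the result of running detection with this model.
The dataset I prepared included a fair number of tilted faces, so the model seems able to detect those to some extent as well. Given that it detects this well after training on just a little over 800 images, my impression is that this is good enough. I have a feeling accuracy will keep improving as I add more data.
What remains is the question of how to annotate and manage the selfie-heavy images actually used for face identification, and how to evaluate performance on them.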
Turning it into a web app
From here on it is purely a digression. Having built a model in TensorFlow that detects faces quickly, I wanted to make it available as a web service. The face detection model can be exposed as a JSON API with Flask. That leaves the front end: all I have to do is put together a UI somehow.
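The post does not show the API itself, so the following only sketches the part that turns detector output into a JSON body, using the standard library so it runs without Flask. It assumes boxes arrive as normalized `(ymin, xmin, ymax, xmax)` tuples with one score each, which is the order the Object Detection API uses for its detection boxes. The field names `faces`, `score` and `bbox` and the `0.5` threshold are made up for illustration.

```python
import json


def detections_to_json(boxes, scores, threshold=0.5):
    """Serialize detections above the score threshold as a JSON string."""
    faces = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < threshold:
            continue  # Skip low-confidence detections.
        faces.append({
            'score': float(score),
            'bbox': {
                'xmin': float(xmin), 'ymin': float(ymin),
                'xmax': float(xmax), 'ymax': float(ymax),
            },
        })
    return json.dumps({'faces': faces})


body = detections_to_json(
    boxes=[(0.1, 0.2, 0.5, 0.6), (0.0, 0.0, 1.0, 1.0)],
    scores=[0.9, 0.3],
)
```

In a Flask view, a string like `body` could be returned with a JSON content type, or the dictionary could be passed to Flask's `jsonify` instead of `json.dumps`.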
I had tinkered with React and webpack before to build similar things, so much of it is reused, but this time I wrote .tsx in TypeScript and transpiled it with ts-loader.
Adding types makes the code easier to read and write, but I could not get used to it and struggled more than I expected...
Anyway, I got it working at a bare minimum and published it here.
As usual, I figured I would run it on Heroku, but once deployed it kept throwing "Memory quota exceeded" errors, so memory usage is apparently too tight. It does run, but it feels like it could stop at any moment. An identification model with four convolutional layers was fine, but a model of this size seems to be too much.
Upgrading to a Heroku dyno with more memory costs around $25, so at that price it might be better to rent a VPS with about 2GB somewhere. I will think about it if I ever run this seriously.
Repository
- https://github.com/sugyan/tf-face-detector (the detector itself)
- https://github.com/sugyan/tf-face-detector-app (Web App)