Visualizing the Power of Imagination with Chainer this Christmas
Hello, everyone. How have you been? I finally finished writing this.
The year is nearly over. Last year I built カノジョ (a virtual girlfriend), and the year before that I seriously analyzed 友利奈緒 (Tomori Nao). Time really does fly.
With that in mind, I thought about what to do this year. For day 25 of the Chainer Advent Calendar, I would like to give concrete form to everyone's imagination.
- About imagination
- Removing mosaics
- Collecting the dataset
- The mosaic-removal algorithm
- Results
- Closing remarks
About imagination
Now, think for a moment about what imagination is. When a spot in an image looks unnatural, or is somehow hard to make out, you probably find yourself guessing what it really looks like. That guessing is inference, in other words, completion.
Which brings us to the main topic: when you see a partially mosaicked image, you can't help imagining what might be underneath. So today I will try to give concrete form to everyone's imagination, that is, to remove the mosaic.
Removing mosaics
What is mosaic processing?
Mosaic processing seems to be defined as follows (Wikipedia):
Mosaic processing (English: mosaic processing), or pixelization (English: pixelization), is image processing that obscures, at the pixel level, the parts of a photo, image, still, or video that should not be displayed. (from Wikipedia)
In short, it is processing that makes part of an image hard to see. The mosaic processing itself is described in detail later.
Types of mosaic processing
Mosaic processing falls into two broad categories: irreversible (lossy) and reversible (lossless) methods.
An irreversible mosaic cannot be undone; the original image cannot be fully recovered from the mosaicked one. A reversible mosaic, on the other hand, can be undone. These days the irreversible kind appears to be mainstream.
(A well-known example of the reversible kind was apparently a tool called FLMASK; a colleague at my company told me about it.)
Since irreversible methods are the mainstream today, this experiment is also based on that kind of mosaic.
The mosaic methods selected for this work are the following (a minimal sketch of all three appears after the implementation note below):
- Black mosaic: part of the image is simply blacked out.
- Median mosaic: a median filter is applied to the region to be hidden.
- Gaussian mosaic: a Gaussian filter is applied to the region to be hidden.
This time I emphasized practicality, so the mosaic is applied to part of the image rather than all of it. Cases where the whole image must be hidden are rare; usually only the parts that must not be seen are hidden. Filling the region with random noise looks far too unnatural, so it is not used.
Implementing the mosaic processing
This processing is implemented in Python. Sixty-four randomly mosaicked outputs are shown below.
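As a reference for the three variants above, here is a minimal sketch using OpenCV. The helper `apply_mosaic`, the box size, and the file name are my own placeholders, not code from the repository:

```python
import cv2
import numpy as np

def apply_mosaic(img, x, y, w, h, mode="gaussian"):
    """Apply one of the three mosaic types to the box (x, y, w, h)."""
    region = img[y:y + h, x:x + w]
    if mode == "black":
        img[y:y + h, x:x + w] = 0                               # black out the region
    elif mode == "median":
        img[y:y + h, x:x + w] = cv2.medianBlur(region, 15)      # median filter
    elif mode == "gaussian":
        img[y:y + h, x:x + w] = cv2.GaussianBlur(region, (15, 15), 0)
    return img

# usage: mosaic a 48x48 box at a random position of a 128x128 image
img = cv2.imread("sample.png")                # hypothetical 128x128 input
x, y = np.random.randint(0, 128 - 48, size=2)
img = apply_mosaic(img, x, y, 48, 48, mode="median")
```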
Collecting the dataset
Let's collect the dataset used for this image processing. This time I use Love Live! data for the mosaic-removal experiment.
Googliser
Googliser is a piece of software that fetches images via Google image search. I use it to gather the Love Live! data. To run googliser, clone the repository and execute the googliser.sh inside it:
```
./googliser.sh -p "矢澤にこ" --number 1000
```
The mosaic-removal algorithm
Removing the mosaic
As explained at the beginning, a typical mosaic is irreversible. Strictly speaking, then, we do not remove the mosaic; we complete (inpaint) the region where the mosaic was applied.
This completion is achieved by applying GAN techniques from generative modeling together with ideas from semantic segmentation. The implementation follows "Globally and Locally Consistent Image Completion", presented at SIGGRAPH 2017.
Globally and Locally Consistent Image Completion
Globally and Locally Consistent Image Completion (GLCIC) is an image-completion algorithm based on deep learning. Training proceeds in three phases: first the Generator is trained on its own, then the Discriminator, and finally both are trained together in the usual GAN fashion; the completed region is then blended with its surroundings using an algorithm called the fast marching method. The paper includes several finer internal techniques, but I only borrowed the network structure as-is and did not implement those extra processing steps (including the fast marching method).
When training the Generator, it produces a restored image and is updated from the error on the masked part only. Dilated convolutions capture large-scale features without dropping the resolution, and the downstream Discriminator takes both the whole image and the partial (mask-centered) image as inputs for its real/fake decision.
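Written out as formulas (my paraphrase; it matches the `__call__` and `loss_gen` code below), the Generator is pretrained with a masked MSE, and the joint phase adds a weighted adversarial term:

$$
L_{\mathrm{rec}} = \mathrm{MSE}\bigl(M \odot G(x),\; M \odot x_{\mathrm{orig}}\bigr),
\qquad
L_{\mathrm{gen}} = \alpha \, L_{\mathrm{adv}} + L_{\mathrm{rec}}
$$

where $M$ is the binary mask (1 on the mosaicked region), $\odot$ is elementwise multiplication, and $\alpha$ is a small weight (4e-4 in the Updater below).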
The mosaic-removal model built with Chainer
Now, let's build it. The code ended up fairly long, so the full version will be put on GitHub later; the model and updater parts are described below. The code is not written all that rigorously, so it would need a bit more work to be clean. Images are fed in at 128x128 (for implementation convenience).
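One note before the model code: `conv0` below takes 4 input channels. My reading (an assumption based on the paper and on the `img_with_mask_batch` name in the Updater) is that the input is RGB plus the binary mask as a fourth channel. A tiny helper sketch, with `make_generator_input` being my own name:

```python
import numpy as np

def make_generator_input(img_rgb, mask):
    # img_rgb: float32 array (3, 128, 128) in [0, 1]
    # mask:    float32 array (1, 128, 128), 1 inside the mosaicked region
    # conv0 below expects 4 input channels (RGB + mask)
    return np.concatenate([img_rgb, mask], axis=0)[np.newaxis]  # (1, 4, 128, 128)
```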
Generator
First, the Generator code. The key points are that dilated convolutions are used to capture large-scale features, and that the loss-computation part (__call__) takes a msk argument: when the mask is passed in together with the input, the error outside the mosaicked region is zeroed out.
```python
import chainer
import chainer.functions as F
import chainer.links as L


class GLCICGenerator(chainer.Chain):
    def __init__(self):
        super(GLCICGenerator, self).__init__()
        with self.init_scope():
            # encoder: RGB + mask (4 channels) in, downsampled twice
            self.conv0 = L.Convolution2D(4, 64, ksize=3, stride=1, pad=1)
            self.bn0 = L.BatchNormalization(64)
            self.conv1_1 = L.Convolution2D(64, 128, ksize=3, stride=2, pad=1)
            self.bn1_1 = L.BatchNormalization(128)
            self.conv1_2 = L.Convolution2D(128, 128, ksize=3, stride=1, pad=1)
            self.bn1_2 = L.BatchNormalization(128)
            self.conv2_1 = L.Convolution2D(128, 256, ksize=3, stride=2, pad=1)
            self.bn2_1 = L.BatchNormalization(256)
            self.conv2_2 = L.Convolution2D(256, 256, ksize=3, stride=1, pad=1)
            self.bn2_2 = L.BatchNormalization(256)
            self.conv2_3 = L.Convolution2D(256, 256, ksize=3, stride=1, pad=1)
            self.bn2_3 = L.BatchNormalization(256)
            # dilated convolutions widen the receptive field without losing resolution
            self.conv2_4 = L.DilatedConvolution2D(256, 256, ksize=3, stride=1, pad=2, dilate=2)
            self.bn2_4 = L.BatchNormalization(256)
            self.conv2_5 = L.DilatedConvolution2D(256, 256, ksize=3, stride=1, pad=4, dilate=4)
            self.bn2_5 = L.BatchNormalization(256)
            self.conv2_6 = L.DilatedConvolution2D(256, 256, ksize=3, stride=1, pad=8, dilate=8)
            self.bn2_6 = L.BatchNormalization(256)
            self.conv2_7 = L.Convolution2D(256, 256, ksize=3, stride=1, pad=1)
            self.bn2_7 = L.BatchNormalization(256)
            self.conv2_8 = L.Convolution2D(256, 256, ksize=3, stride=1, pad=1)
            self.bn2_8 = L.BatchNormalization(256)
            # decoder: upsample back to 128x128 and emit RGB
            self.deconv2_1 = L.Deconvolution2D(256, 128, ksize=4, stride=2, pad=1)
            self.debn2_1 = L.BatchNormalization(128)
            self.deconv2_2 = L.Convolution2D(128, 128, ksize=3, stride=1, pad=1)
            self.debn2_2 = L.BatchNormalization(128)
            self.deconv1_1 = L.Deconvolution2D(128, 128, ksize=4, stride=2, pad=1)
            self.debn1_1 = L.BatchNormalization(128)
            self.deconv1_2 = L.Convolution2D(128, 64, ksize=3, stride=1, pad=1)
            self.debn1_2 = L.BatchNormalization(64)
            self.deconv0 = L.Convolution2D(64, 3, ksize=3, stride=1, pad=1)

    def predict(self, x):
        h = F.relu(self.bn0(self.conv0(x)))
        h = F.relu(self.bn1_1(self.conv1_1(h)))
        h = F.relu(self.bn1_2(self.conv1_2(h)))
        h = F.relu(self.bn2_1(self.conv2_1(h)))
        h = F.relu(self.bn2_2(self.conv2_2(h)))
        h = F.relu(self.bn2_3(self.conv2_3(h)))
        h = F.relu(self.bn2_4(self.conv2_4(h)))
        h = F.relu(self.bn2_5(self.conv2_5(h)))
        h = F.relu(self.bn2_6(self.conv2_6(h)))
        h = F.relu(self.bn2_7(self.conv2_7(h)))
        h = F.relu(self.bn2_8(self.conv2_8(h)))
        h = F.relu(self.debn2_1(self.deconv2_1(h)))
        h = F.relu(self.debn2_2(self.deconv2_2(h)))
        h = F.relu(self.debn1_1(self.deconv1_1(h)))
        h = F.relu(self.debn1_2(self.deconv1_2(h)))
        return F.sigmoid(self.deconv0(h))

    def __call__(self, x, msk=None, t=None):
        h = self.predict(x)
        if msk is not None:
            # zero out everything outside the mosaic region so that only the
            # masked area contributes to the reconstruction loss
            h = msk * h
            t = msk * t
            loss = F.mean_squared_error(h, t)
            chainer.report({'loss': loss}, self)
            return loss
        else:
            return h
```
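A quick shape check of the Generator, just as a sketch continuing from the imports above:

```python
import numpy as np

gen = GLCICGenerator()
x = np.zeros((1, 4, 128, 128), dtype=np.float32)  # RGB + mask, as described above
with chainer.using_config('train', False):
    y = gen(x)  # msk/t omitted -> returns the completed image instead of a loss
print(y.shape)  # expected: (1, 3, 128, 128), values in [0, 1] from the sigmoid
```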
Discriminator
Next, the Discriminator. Its distinguishing feature is the pair of inputs: the whole image and a 64x64 image centered on the mosaicked region. The two feature maps are concatenated at the end to compute the output.
```python
class GLCICDiscriminator(chainer.Chain):
    def __init__(self):
        super(GLCICDiscriminator, self).__init__()
        with self.init_scope():
            # local branch: the 64x64 crop around the mosaic region
            self.c0_l = L.Convolution2D(3, 32, 3, 2, 1)
            self.bn0_l = L.BatchNormalization(32)
            self.c1_l = L.Convolution2D(32, 64, 3, 2, 1)
            self.bn1_l = L.BatchNormalization(64)
            self.c2_l = L.Convolution2D(64, 128, 3, 2, 1)
            self.bn2_l = L.BatchNormalization(128)
            self.c3_l = L.Convolution2D(128, 256, 3, 2, 1)
            self.bn3_l = L.BatchNormalization(256)
            self.c4_l = L.Convolution2D(256, 512, 3, 2, 1)
            self.bn4_l = L.BatchNormalization(512)
            # global branch: the full 128x128 image
            self.c0_g = L.Convolution2D(3, 16, 3, 2, 1)
            self.bn0_g = L.BatchNormalization(16)
            self.c1_g = L.Convolution2D(16, 32, 3, 2, 1)
            self.bn1_g = L.BatchNormalization(32)
            self.c2_g = L.Convolution2D(32, 64, 3, 2, 1)
            self.bn2_g = L.BatchNormalization(64)
            self.c3_g = L.Convolution2D(64, 128, 3, 2, 1)
            self.bn3_g = L.BatchNormalization(128)
            self.c4_g = L.Convolution2D(128, 256, 3, 2, 1)
            self.bn4_g = L.BatchNormalization(256)
            self.c5_g = L.Convolution2D(256, 512, 3, 2, 1)
            self.bn5_g = L.BatchNormalization(512)
            self.fc = L.Linear(None, 1)

    def __call__(self, x1, x2):
        # x1: local 64x64 patch, x2: global 128x128 image
        h1 = F.leaky_relu(self.bn0_l(self.c0_l(x1)))
        h1 = F.leaky_relu(self.bn1_l(self.c1_l(h1)))
        h1 = F.leaky_relu(self.bn2_l(self.c2_l(h1)))
        h1 = F.leaky_relu(self.bn3_l(self.c3_l(h1)))
        h1 = F.leaky_relu(self.bn4_l(self.c4_l(h1)))
        h2 = F.leaky_relu(self.bn0_g(self.c0_g(x2)))
        h2 = F.leaky_relu(self.bn1_g(self.c1_g(h2)))
        h2 = F.leaky_relu(self.bn2_g(self.c2_g(h2)))
        h2 = F.leaky_relu(self.bn3_g(self.c3_g(h2)))
        h2 = F.leaky_relu(self.bn4_g(self.c4_g(h2)))
        h2 = F.leaky_relu(self.bn5_g(self.c5_g(h2)))
        # fuse the local and global views into a single real/fake score
        concat_h = F.concat([h1, h2])
        return self.fc(concat_h)
```
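And the corresponding smoke test for the Discriminator (again only a sketch):

```python
import numpy as np

dis = GLCICDiscriminator()
local_patch = np.zeros((1, 3, 64, 64), dtype=np.float32)    # 64x64 crop around the mosaic
whole_image = np.zeros((1, 3, 128, 128), dtype=np.float32)  # the full 128x128 image
with chainer.using_config('train', False):
    score = dis(local_patch, whole_image)
print(score.shape)  # expected: (1, 1) -- one unnormalized real/fake score per sample
```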
Updater
Finally, the Updater that performs the parameter updates. It serves both the Discriminator-only phase and the joint GAN phase. The main body is the update_core method, which does the following:
- compute the quantities needed for the Discriminator loss (original image)
- generate an image with the Generator
- compute the quantities needed for the Discriminator loss (generated image)
- apply the updates via the optimizers
The methods and their roles are listed next.
| Method | Description |
|---|---|
| loss_dis | Updates the Discriminator |
| loss_gen | Updates the Generator |
| extract_img | Crops the image around the mosaicked region |
| extract_mosaic_area | Crops the images around the mosaicked regions (the batched version) |
| update_core | Core routine that calls the update functions |
```python
import random

import cv2
from chainer import Variable


class GLCICUpdater(chainer.training.StandardUpdater):
    def __init__(self, is_gen_training=True, alpha=4e-4, *args, **kwargs):
        self.gen, self.dis = kwargs.pop('models')
        self.is_gen_training = is_gen_training
        self.alpha = alpha  # weight of the adversarial term relative to the pixel loss
        super(GLCICUpdater, self).__init__(*args, **kwargs)  # fixed: was GLCCICUpdater

    def loss_dis(self, dis, y_fake, y_real):
        # standard GAN discriminator loss in softplus form
        batchsize = len(y_fake)
        L1 = F.sum(F.softplus(-y_real)) / batchsize
        L2 = F.sum(F.softplus(y_fake)) / batchsize
        loss = (L1 + L2) * self.alpha
        chainer.report({'loss': loss}, dis)
        return loss

    def loss_gen(self, gen, y_fake, x_fake, img_batch_variable, masks):
        # adversarial term plus masked pixel-wise MSE against the original image
        batchsize = len(y_fake)
        h = masks * x_fake
        t = masks * img_batch_variable
        abs_pixel_loss = F.mean_squared_error(h, t)
        loss = (F.sum(F.softplus(-y_fake)) * self.alpha) / batchsize + abs_pixel_loss
        chainer.report({'loss': loss, 'pixel_loss': abs_pixel_loss}, gen)
        return loss

    def extract_img(self, img, bbox):
        # pick a random 64x64 window around the mosaic bbox;
        # retry until the window lies entirely inside the image
        while True:
            min_h = max(min(bbox[3], 127) - 64, 0)
            max_h = min(bbox[2], 63)
            min_w = max(min(bbox[1], 127) - 64, 0)
            max_w = min(bbox[0], 63)
            start_h = random.randint(min_h, max_h)
            end_h = start_h + 64
            start_w = random.randint(min_w, max_w)
            end_w = start_w + 64
            if start_h >= 0 and start_w >= 0 and end_w < img.shape[1] and end_h < img.shape[2]:
                return img[:, start_h: end_h, start_w: end_w]

    def extract_mosaic_area(self, images, bboxs):
        # batched version of extract_img: crop around each mosaic and resize to 64x64
        mosaic_region_imgs = []
        for fake_variable, bbox_variable in zip(images.data, bboxs):
            fake = chainer.cuda.to_cpu(fake_variable)
            bbox = chainer.cuda.to_cpu(bbox_variable)
            mosaic_region_img = self.extract_img(fake, bbox).transpose((1, 2, 0))
            mosaic_region_imgs.append(cv2.resize(mosaic_region_img, (64, 64)).transpose((2, 0, 1)))
        return mosaic_region_imgs

    def update_core(self):
        if self.is_gen_training:
            gen_optimizer = self.get_optimizer('gen')
        dis_optimizer = self.get_optimizer('dis')
        batch = self.get_iterator('main').next()
        img_batch, mosaic_batch_imgs, img_with_mask_batch, bbox_batch, masks = \
            self.converter(batch, self.device)
        img_batch_variable = Variable(img_batch)
        x_real = Variable(img_with_mask_batch)
        xp = chainer.cuda.get_array_module(x_real.data)
        gen, dis = self.gen, self.dis

        # real side: crop the region around the mosaic from the original image
        region_real_images = self.extract_mosaic_area(img_batch_variable, bbox_batch)
        region_real_images_variable = Variable(xp.asarray(region_real_images))
        y_real = dis(region_real_images_variable, img_batch_variable)

        # fake side: let the Generator fill the masked image, then crop the same region
        x_fake = gen(x_real)
        region_fake_images = self.extract_mosaic_area(x_fake, bbox_batch)
        region_fake_images_variable = Variable(xp.asarray(region_fake_images))
        y_fake = dis(region_fake_images_variable, x_fake)

        dis_optimizer.update(self.loss_dis, dis, y_fake, y_real)
        if self.is_gen_training is True:
            gen_optimizer.update(self.loss_gen, gen, y_fake, x_fake, img_batch_variable, masks)
```
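For completeness, here is roughly how I would wire this up with a Trainer. This is a sketch under assumptions: `load_dataset` and `mosaic_converter` are placeholders for the dataset/converter code (which must produce the five-tuple unpacked in `update_core`) that will be in the full repository, and the Adam settings are not tuned values from this post:

```python
gen = GLCICGenerator()
dis = GLCICDiscriminator()

opt_gen = chainer.optimizers.Adam(alpha=1e-4)
opt_gen.setup(gen)
opt_dis = chainer.optimizers.Adam(alpha=1e-4)
opt_dis.setup(dis)

train_iter = chainer.iterators.SerialIterator(load_dataset('images/'), batch_size=16)
updater = GLCICUpdater(models=(gen, dis),
                       iterator=train_iter,
                       optimizer={'gen': opt_gen, 'dis': opt_dis},
                       converter=mosaic_converter,  # placeholder: builds the five-tuple
                       device=0)
trainer = chainer.training.Trainer(updater, (400, 'epoch'), out='result')
trainer.run()
```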
Results
Generator
First, I used only the Generator part. The images were collected with Googliser, and a subset of them was held out for testing, so there may be duplicates where searches overlapped; please bear with that.
1 epoch: nothing recognizable is produced yet.
150 epochs: some structure is gradually appearing?
400 epochs: not much change.
GAN
Next, generation with the GAN training.
1 epoch: where the output is disappointing, it looks like a poster?
150 epochs
They are rather small and perhaps hard to make out, but these are debug images: they show the Generator's raw output as-is, without pasting it back over the original image.
When only the Generator is trained:
With the GAN:
Love Live! images that were presumably not in the training data (taken from Sunshine!!):
Original images:
A large batch of generated mosaics (restored images):
Original images:
A large batch of generated mosaics (restored images):
They came out more cleanly than I had expected.
Closing remarks
It looks like fully visualizing our imagination is still a long way off. Still, it was quite fun as a stunt, so trying it again with larger images could be interesting.
※ If you find any bugs or mistakes, please let me know nicely.