16. Face Recognition (Principles)
Representative methods
Eigen Face [Turk1991]
Bunch Graph Matching [Wiskott1997]
Deep Face [Taigman2014]
Eigen Face
Credit: [Wikipedia]
Deep Face
Credit: [Taigman2014]
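As a rough illustration of the Eigen Face idea listed above, the sketch below projects flattened face images onto a few principal components and identifies a probe image by its nearest neighbor in that subspace. It is a minimal pure-Python sketch, not the formulation of [Turk1991] itself: the function names, the power-iteration eigen solver, and the 4-pixel toy "images" are all assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def eigenfaces(centered, k, iters=300):
    """Top-k principal axes of mean-centered images (hypothetical helper).

    Uses the small n x n Gram matrix instead of the pixel covariance,
    with power iteration and deflation as a simple eigen solver.
    """
    n, dim = len(centered), len(centered[0])
    gram = [[dot(a, b) for b in centered] for a in centered]
    faces = []
    for _ in range(k):
        v = [1.0 + 0.37 * i for i in range(n)]  # arbitrary start vector
        for _ in range(iters):
            w = [dot(row, v) for row in gram]
            norm = math.sqrt(dot(w, w))
            v = [x / norm for x in w]
        lam = dot(v, [dot(row, v) for row in gram])  # eigenvalue estimate
        # Map the Gram eigenvector back to pixel space and normalize it.
        u = [sum(v[i] * centered[i][d] for i in range(n)) for d in range(dim)]
        unorm = math.sqrt(dot(u, u))
        faces.append([x / unorm for x in u])
        # Deflation: remove the component that was just found.
        gram = [[gram[i][j] - lam * v[i] * v[j] for j in range(n)]
                for i in range(n)]
    return faces

def project(image, mean, faces):
    centered = [a - m for a, m in zip(image, mean)]
    return [dot(centered, f) for f in faces]

def identify(probe, gallery, labels, k=2):
    """Label of the gallery image closest to the probe in eigenface space."""
    mean = [sum(col) / len(gallery) for col in zip(*gallery)]
    centered = [[a - m for a, m in zip(row, mean)] for row in gallery]
    faces = eigenfaces(centered, k)
    target = project(probe, mean, faces)

    def sq_dist(i):
        coords = project(gallery[i], mean, faces)
        return sum((a - b) ** 2 for a, b in zip(target, coords))

    return labels[min(range(len(gallery)), key=sq_dist)]

# Toy data (assumption): two people, two 4-pixel "images" each.
GALLERY = [[10, 0, 10, 0], [9, 1, 10, 0], [0, 10, 0, 10], [1, 9, 0, 10]]
LABELS = ["A", "A", "B", "B"]
```

Here `k` must stay below the number of gallery images, because mean-centered data has at most n - 1 non-zero components. A practical system would use a linear algebra library rather than this hand-written solver.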
19. Facial Landmark Detection (Principles)
Representative methods
Active Appearance Model [Matthews2004]
Constrained Local Model [Cristinacce2006]
Explicit Shape Regression [Cao2012]
Constrained Local Model
Credit: [Saragih2011]
36. Libraries
Open Source
Active Appearance Model
AAM-OpenCV https://code.google.com/p/aam-opencv/
AAM-API http://www2.imm.dtu.dk/~aam/
Constrained Local Model
https://sites.google.com/site/xgyanhome/home/projects/clm-implementation
JavaScript implementation of [Saragih2011]
https://github.com/auduno/clmtrackr
Pictorial Structure [Andriluka2009]
http://www.d2.mpi-inf.mpg.de/andriluka_cvpr09
37. References
[Viola2001]Viola, P., & Jones, M. (2001). Rapid object detection using a
boosted cascade of simple features. IEEE International Conference on
Computer Vision and Pattern Recognition (CVPR).
[Turk1991]Turk, M., & Pentland, A. (1991). Eigenfaces for Recognition.
Journal of Cognitive Neuroscience, 3(1), 71–86.
[Wiskott1997]Wiskott, L., Fellous, J.-M., Kruger, N., & Malsburg, C. von
der. (1997). Face recognition by elastic bunch graph matching. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 19(7), 775–
779.
[Taigman2014]Taigman, Y., Yang, M., Ranzato, M. A., & Wolf, L. (2014).
DeepFace: Closing the Gap to Human-Level Performance in Face Verification.
In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[Belhumeur1997]Belhumeur, P. N., Hespanha, J. P., & Kriegman, D. J.
(1997). Eigenfaces vs. Fisherfaces: Recognition Using Class Specific
Linear Projection. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 19(7), 711–720.
38. References
[Matthews2004]Matthews, I., & Baker, S. (2004). Active
appearance models revisited. International Journal of Computer
Vision, 60(2), 135–164.
[Cristinacce2006]Cristinacce, D., & Cootes, T. (2006). Feature
detection and tracking with constrained local models. In Proc.
British Machine Vision Conference (Vol. 3, pp. 929–938).
[Saragih2011]Saragih, J. M., Lucey, S., & Cohn, J. F. (2011).
Deformable Model Fitting by Regularized Landmark Mean-Shift.
International Journal of Computer Vision, 91(2), 200–215.
[Cao2012]Cao, X., Wei, Y., Wen, F., & Sun, J. (2012). Face
Alignment by Explicit Shape Regression. In IEEE Conference on
Computer Vision and Pattern Recognition (CVPR).
[Dalal2005]Dalal, N., & Triggs, B. (2005). Histograms of Oriented
Gradients for Human Detection. IEEE Conference on Computer
Vision and Pattern Recognition (CVPR).
39. References
[Felzenswalb2009]Felzenszwalb, P. F., Girshick, R. B., McAllester, D., &
Ramanan, D. (2009). Object detection with discriminatively trained part-
based models. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 32(9), 1627–1645.
[Toshev2014]Toshev, A., & Szegedy, C. (2014). DeepPose: Human pose
estimation via deep neural networks. In IEEE Conference on Computer
Vision and Pattern Recognition (CVPR).
[Felzenszwalb2005]Felzenszwalb, P. F., & Huttenlocher, D. P. (2005).
Pictorial Structures for Object Recognition. International Journal of
Computer Vision, 61(1), 55–79.
[Ferrari2008]Ferrari, V., Marin-Jimenez, M., & Zisserman, A. (2008).
Progressive Search Space Reduction for Human Pose Estimation. In IEEE
Conference on Computer Vision and Pattern Recognition (CVPR).
[Shotton2011]Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio,
M., Moore, R., Kipman, A., & Blake, A. (2011). Real-time human pose
recognition in parts from single depth images. In IEEE Conference on
Computer Vision and Pattern Recognition (CVPR).
40. References
[Andriluka2009]Andriluka, M., Roth, S., & Schiele, B. (2009). Pictorial
structures revisited: People detection and articulated pose estimation. In
IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[Zhang2008]Zhang, W., Sun, J., & Tang, X. (2008). Cat Head Detection -
How to Effectively Exploit Shape. In IEEE Conference on Computer
Vision and Pattern Recognition (CVPR).
[Kozakaya2009]Kozakaya, T., Ito, S., Kubota, S., & Yamaguchi, O. (2009).
Cat face detection with two heterogeneous features. In IEEE
International Conference on Image Processing (pp. 1209–1212).
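44. Specific Object Recognition (Principles)
Representative methods
Local features such as SIFT + approximate nearest neighbor search [Lowe1999]
For large-scale databases, Bag-of-Features is used [Sivic2003]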
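The Bag-of-Features idea for large databases can be sketched as follows: each local descriptor is quantized to its nearest "visual word", an image becomes a normalized word histogram, and database images are ranked by histogram similarity. This is a minimal pure-Python sketch under stated assumptions: the vocabulary is given (in practice it would be learned, for example by k-means), the 2-D descriptors are toy data, and the tf-idf weighting and inverted index of [Sivic2003] are omitted. All names are hypothetical.

```python
import math
from collections import Counter

def nearest_word(desc, vocabulary):
    """Index of the visual word closest to a descriptor (squared L2)."""
    return min(range(len(vocabulary)),
               key=lambda w: sum((a - b) ** 2
                                 for a, b in zip(desc, vocabulary[w])))

def bag_of_features(descriptors, vocabulary):
    """L2-normalized histogram of visual-word counts for one image."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    hist = [counts.get(w, 0) for w in range(len(vocabulary))]
    norm = math.sqrt(sum(c * c for c in hist)) or 1.0
    return [c / norm for c in hist]

def rank_images(query_descs, database, vocabulary):
    """Database image names sorted by cosine similarity to the query."""
    q = bag_of_features(query_descs, vocabulary)
    scores = {
        name: sum(a * b for a, b in zip(q, bag_of_features(descs, vocabulary)))
        for name, descs in database.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Toy data (assumption): three visual words and two database images.
VOCABULARY = [[0, 0], [10, 0], [0, 10]]
DATABASE = {
    "book": [[0, 1], [1, 0], [9, 0]],
    "cd": [[0, 9], [1, 10], [10, 1]],
}
```

Because every image is reduced to a fixed-length histogram, the cost of comparing a query against one database image no longer depends on how many local features that image contains.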
[Figure: Histogram of Gradient Orientations → DB → matching + voting]
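The matching + voting step can be sketched as follows: each query descriptor is matched to its nearest database descriptor, ambiguous matches are discarded, and each surviving match casts one vote for the object it belongs to. This is a minimal sketch under stated assumptions: brute-force search stands in for approximate nearest neighbor search, the ratio test used to reject ambiguous matches is an added assumption rather than something stated on the slide, and the 2-D descriptors and names are hypothetical.

```python
from collections import Counter

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def recognize(query_descs, db_descs, ratio=0.8):
    """Vote for the database object that best explains the query.

    db_descs is a list of (object_id, descriptor) pairs and must hold
    at least two entries. Returns (best_object_id_or_None, votes).
    """
    votes = Counter()
    for q in query_descs:
        # Brute-force search; a real system would use an approximate index.
        ranked = sorted(db_descs, key=lambda entry: sq_dist(q, entry[1]))
        best, second = ranked[0], ranked[1]
        # Ratio test on squared distances, hence ratio ** 2.
        if sq_dist(q, best[1]) < (ratio ** 2) * sq_dist(q, second[1]):
            votes[best[0]] += 1
    winner = votes.most_common(1)[0][0] if votes else None
    return winner, votes

# Toy data (assumption): two objects with two descriptors each.
DB = [("book", [0, 0]), ("book", [10, 0]), ("cd", [0, 10]), ("cd", [10, 10])]
QUERY = [[0.5, 0], [9.5, 0.2], [5, 5]]
```

In this toy query the third descriptor is equally far from every database entry, so it fails the ratio test and casts no vote.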
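45. Specific Object Recognition (Implementation Examples)
Applications
Google Goggles (recognizes landmarks, books, etc.)
Amazon Fire Phone (books, CD covers, etc.)
Marketing / sales promotion
TSUTAYA: provides title information from a photo of a DVD cover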
http://www.nikkei.com/article/DGXNASDD0301Y_T00C13A8TJC000/
Rakuten: automatically searches for products photographed with a smartphone
http://www.nikkei.com/article/DGXNASDD180LC_Y3A710C1TJ1000/
Google Goggles
(Google)
66. Software / Libraries
Deep Learning
List of deep learning software
http://deeplearning.net/software_links/
Theano, http://deeplearning.net/software/theano
Python implementations of Convolutional Neural Network (CNN), Deep Belief Net (DBN),
and Deep Boltzmann Machine (DBM), with CUDA and BLAS
EBlearn, http://eblearn.cs.nyu.edu:21991/
C++ implementation of CNN, with IPP/SSE/OpenMP
cuda-convnet, http://code.google.com/p/cuda-convnet
CNN implementation, with CUDA, that won first place by a wide margin in
ILSVRC2012, a large-scale generic object recognition contest
Caffe, http://caffe.berkeleyvision.org/index.html
C++ implementation of CNN, with CUDA
ConvNetJS, http://cs.stanford.edu/people/karpathy/convnetjs/
JavaScript implementation of CNN
67. References
[Lowe1999]Lowe, D. G. (1999). Object recognition from local scale-
invariant features. In IEEE International Conference on Computer Vision
(pp. 1150–1157 vol.2).
[Sivic2003]Sivic, J., & Zisserman, A. (2003). Video Google: a text
retrieval approach to object matching in videos. In IEEE International
Conference on Computer Vision (ICCV).
[Csurka2004]Csurka, G., Dance, C. R., Fan, L., Willamowski, J., & Bray, C.
(2004). Visual categorization with bags of keypoints. In Workshop on
statistical learning in computer vision, ECCV (Vol. 1, p. 22).
[Krizhevsky2012]Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012).
ImageNet Classification with Deep Convolutional Neural Networks. In
Advances in Neural Information Processing Systems (NIPS) (pp. 1106–
1114).
[Kumar2012]Kumar, N., Belhumeur, P. N., Biswas, A., Jacobs, D. W., Kress,
W. J., Lopez, I., & Soares, J. V. B. (2012). Leafsnap: A Computer Vision
System for Automatic Plant Species Identification. In European
Conference on Computer Vision.
68. References
[Berg2014]Berg, T., Liu, J., Lee, S. W., Alexander, M. L., Jacobs, D.
W., & Belhumeur, P. N. (2014). Birdsnap: Large-scale Fine-grained
Visual Categorization of Birds. In IEEE Conference on Computer
Vision and Pattern Recognition (CVPR).
[Itti2000]Itti, L., & Koch, C. (2000). A saliency-based search
mechanism for overt and covert shifts of visual attention. Vision
Research, 40(10–12), 1489–1506.
[Wang2012]Wang, P., Wang, J., Zeng, G., Feng, J., Zha, H., & Li, S.
(2012). Salient object detection for searched web images via
global saliency. In IEEE Conference on Computer Vision and
Pattern Recognition (CVPR).
[木村2012]Kimura, A., Yonetani, R., & Hirayama, T. (2012). [Survey paper]
Computational models of human visual attention. IEICE Technical Report.
[Alexe2012]Alexe, B., Deselaers, T., & Ferrari, V. (2012). Measuring
the objectness of image windows. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 34(11), 1–14.
69. References
[Cheng2014]Cheng, M.-M., Zhang, Z., Lin, W.-Y., & Torr, P. (2014).
BING : Binarized Normed Gradients for Objectness Estimation at
300fps. In IEEE Conference on Computer Vision and Pattern
Recognition (CVPR).
[LeCun1998]LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998).
Gradient-based learning applied to document recognition.
Proceedings of the IEEE, 86(11), 2278–2324.
102. References
[Tomasi1998]Tomasi, C., & Manduchi, R. (1998). Bilateral filtering for gray
and color images. In International Conference on Computer Vision (ICCV).
[Buades2005]Buades, A., Coll, B., & Morel, J.-M. (2005). A non-local
algorithm for image denoising. In IEEE Conference on Computer Vision
and Pattern Recognition (CVPR).
[Dabov2007]Dabov, K., Foi, A., Katkovnik, V., & Egiazarian, K. (2007).
Image denoising by sparse 3D transform-domain collaborative filtering.
IEEE Transactions on Image Processing, 16(8), 2080–2095.
[Freeman2002]Freeman, W. T., Jones, T. R., & Pasztor, E. C. (2002).
Example-based super-resolution. Computer Graphics and Applications,
22(2), 56–65.
[Yang2008]Yang, J., Wright, J., Ma, Y., & Huang, T. (2008). Image super-
resolution as sparse representation of raw image patches. In IEEE
Conference on Computer Vision and Pattern Recognition (CVPR).
103. References
[Osher1988]Osher, S., & Sethian, J. A. (1988). Fronts propagating with
curvature-dependent speed: algorithms based on Hamilton-Jacobi
formulations. Journal of Computational Physics, 79(1), 12–49.
[Comaniciu2002]Comaniciu, D., & Meer, P. (2002). Mean shift: A robust
approach toward feature space analysis. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 24(5), 603–619.
[Rother2004]Rother, C., Kolmogorov, V., & Blake, A. (2004). Grabcut:
Interactive foreground extraction using iterated graph cuts. In
Conference on Computer Graphics and Interactive Techniques
(SIGGRAPH).
[Pérez2003]Pérez, P., Gangnet, M., & Blake, A. (2003). Poisson image
editing. In Conference on Computer Graphics and Interactive Techniques
(SIGGRAPH).
[Agarwala2004]Agarwala, A., Dontcheva, M., Agrawala, M., Drucker, S.,
Colburn, A., Curless, B., … Cohen, M. (2004). Interactive digital
photomontage. In Conference on Computer Graphics and Interactive
Techniques (SIGGRAPH) (Vol. 23).
104. References
[Chen2009]Chen, T., Cheng, M.-M., Tan, P., Shamir, A., & Hu, S.-M.
(2009). Sketch2Photo: internet image montage. In Conference on
Computer Graphics and Interactive Techniques (SIGGRAPH).
[Brown2003]Brown, M., & Lowe, D. G. (2003). Recognising
Panoramas. In International Conference on Computer Vision
(ICCV).
[Seitz1996]Seitz, S. M., & Dyer, C. R. (1996). View morphing.
Conference on Computer Graphics and Interactive Techniques
(SIGGRAPH).
[Bertalmio2000]Bertalmio, M., Sapiro, G., Caselles, V., &
Ballester, C. (2000). Image inpainting. In Conference on Computer
Graphics and Interactive Techniques (SIGGRAPH) (pp. 417–424).
[Criminisi2004]Criminisi, A., Pérez, P., & Toyama, K. (2004). Region
filling and object removal by exemplar-based image inpainting.
IEEE Transactions on Image Processing, 13(9), 1200–1212.
105. References
[Telea2004]Telea, A. (2004). An image inpainting technique based
on the fast marching method. Journal of Graphics Tools, 9(1), 25–
36.
[Avidan2007]Avidan, S., & Shamir, A. (2007). Seam carving for
content-aware image resizing. In Conference on Computer
Graphics and Interactive Techniques (SIGGRAPH).
[Hays2007]Hays, J., & Efros, A. A. (2007). Scene completion using
millions of photographs. Conference on Computer Graphics and
Interactive Techniques (SIGGRAPH).
[Barnes2009]Barnes, C., Shechtman, E., Finkelstein, A., & Goldman,
D. B. (2009). PatchMatch: A randomized correspondence algorithm
for structural image editing. In Conference on Computer Graphics
and Interactive Techniques (SIGGRAPH).
[Blanz1999]Blanz, V., & Vetter, T. (1999). A morphable model for
the synthesis of 3D faces. In Conference on Computer Graphics
and Interactive Techniques (SIGGRAPH) (pp. 187–194).
106. References
[Hoiem2005]Hoiem, D., & Efros, A. A. (2005). Automatic photo
pop-up. In Conference on Computer Graphics and Interactive
Techniques (SIGGRAPH).
[Saxena2008]Saxena, A., Sun, M., & Ng, A. Y. (2008). Make3D:
Depth Perception from a Single Still Image. In AAAI National
Conference on Artificial Intelligence (pp. 1571–1576).
110. Reconstructing a City in 3D from Collected Images
Representative projects (demo videos etc. available at the links)
Photo Tourism [Snavely2006]
http://phototour.cs.washington.edu/
Building Rome in a Day [Agarwal2009]
http://grail.cs.washington.edu/rome/
Building Rome on a cloudless day [Frahm2010]
https://www.youtube.com/watch?v=4cEQZreQ2zQ
118. References
[Snavely2006]Snavely, N., Seitz, S. M., & Szeliski, R. (2006). Photo
tourism: exploring photo collections in 3D. In Conference on Computer
Graphics and Interactive Techniques (SIGGRAPH).
[Agarwal2009]Agarwal, S., Snavely, N., Simon, I., Seitz, S. M., & Szeliski, R.
(2009). Building Rome in a day. In International Conference on Computer
Vision (pp. 72–79).
[Frahm2010]Frahm, J., Fite-Georgel, P., Gallup, D., Johnson, T., Raguram,
R., Wu, C., … Pollefeys, M. (2010). Building Rome on a Cloudless Day. In
European Conference on Computer Vision (pp. 368–381).
[Hays2008]Hays, J., & Efros, A. A. (2008). IM2GPS: estimating geographic
information from a single image. In IEEE Conference on Computer Vision
and Pattern Recognition (CVPR).
[Chen2011]Chen, C.-Y., & Grauman, K. (2011). Clues from the beaten path:
Location estimation with bursty sequences of tourist photos. In IEEE
Conference on Computer Vision and Pattern Recognition (CVPR).
119. References
[Dhar2011]Dhar, S., Ordonez, V., & Berg, T. L. (2011). High Level
Describable Attributes for Predicting Aesthetics and Interestingness. In
IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[Nishiyama2011]Nishiyama, M., Okabe, T., Sato, I., & Sato, Y. (2011).
Aesthetic quality classification of photographs based on color harmony.
In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[Tang2011]Tang, X., Luo, W., & Wang, X. (2011). Content-Based Photo
Quality Assessment. In IEEE International Conference on Computer Vision.
[Khosla2014]Khosla, A., Das Sarma, A., & Hamid, R. (2014). What makes
an image popular? In International World Wide Web Conference (WWW).
[Kim2014]Kim, G., & Xing, E. P. (2014). Reconstructing Storyline Graphs
for Image Recommendation from Web Community Photos. In IEEE
Conference on Computer Vision and Pattern Recognition (CVPR).