These slides explain how to detect image feature points with SIFT (Scale-Invariant Feature Transform), which is based on the Difference of Gaussians (DoG).
This document contains contact information for several researchers from the Machine Perception and Robotics Group at Chubu University in Japan, including professors, lecturers, and research assistants. It lists their names, titles, contact details such as phone numbers and email addresses, and web links for the group's website. The group is part of the Department of Robotics Science and Technology or Department of Computer Science within the College of Engineering at Chubu University.
This document contains contact information for several members of the Machine Perception and Robotics Group at Chubu University in Japan, including professors Hironobu Fujiyoshi and Takayoshi Yamashita. It lists their names, titles, departments, addresses, phone numbers, and email addresses. Brief biographies are also provided for Professors Fujiyoshi and Yamashita, mentioning their research interests and accomplishments.
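64. Harris corner detection
• Corner detection using the Harris matrix
$$H = g(\sigma) \otimes \begin{bmatrix} I_x^2(\mathbf{x}) & I_x I_y(\mathbf{x}) \\ I_x I_y(\mathbf{x}) & I_y^2(\mathbf{x}) \end{bmatrix}$$
$g(\cdot)$: Gaussian function; $I_x(\cdot)$: first derivative along the x-axis; $I_y(\cdot)$: first derivative along the y-axis
(Smoothing the first-derivative values with a Gaussian function substitutes for computing second derivatives)
Discriminant: $R = \det(H) - k\,\mathrm{tr}(H)^2$ ($k = 0.04 \sim 0.06$)
First eigenvalue of H: α
Second eigenvalue of H: β
R ≈ 0 (α and β both small): flat
R << 0 (α >> β or β >> α): edge
R >> 0 (α and β both large): corner
[Figure: the α-β plane divided into Flat, Edge, and Corner regions]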
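The Harris response above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the implementation used for the slides: the function name `harris_response`, the central-difference derivatives, the window `radius`, `sigma = 1.0`, and the border clamping are all choices made here; `k = 0.04` is taken from the range given on the slide.

```python
import math

def harris_response(img, k=0.04, sigma=1.0, radius=2):
    """Return the Harris response R for every pixel of a 2-D list `img`.

    Hypothetical helper: derivative scheme, sigma and radius are assumptions.
    """
    h, w = len(img), len(img[0])

    def px(y, x):
        # Clamp coordinates so border pixels reuse their nearest neighbour.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    # First derivatives I_x and I_y by central differences.
    ix = [[(px(y, x + 1) - px(y, x - 1)) / 2.0 for x in range(w)] for y in range(h)]
    iy = [[(px(y + 1, x) - px(y - 1, x)) / 2.0 for x in range(w)] for y in range(h)]

    # Normalised Gaussian weights g over a (2*radius+1) x (2*radius+1) window.
    offsets = range(-radius, radius + 1)
    g = {(dy, dx): math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
         for dy in offsets for dx in offsets}
    total = sum(g.values())
    g = {key: val / total for key, val in g.items()}

    resp = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            a = b = c = 0.0  # Gaussian-smoothed I_x^2, I_x*I_y, I_y^2
            for (dy, dx), wgt in g.items():
                yy = min(max(y + dy, 0), h - 1)
                xx = min(max(x + dx, 0), w - 1)
                a += wgt * ix[yy][xx] ** 2
                b += wgt * ix[yy][xx] * iy[yy][xx]
                c += wgt * iy[yy][xx] ** 2
            det = a * c - b * b          # det(H)
            trace = a + c                # tr(H)
            resp[y][x] = det - k * trace * trace
    return resp

# Synthetic test image: a bright 10x10 square on a dark 20x20 background.
size = 20
img = [[1.0 if 5 <= y < 15 and 5 <= x < 15 else 0.0 for x in range(size)]
       for y in range(size)]
R = harris_response(img)
print("corner:", R[5][5], "edge:", R[5][10], "flat:", R[1][1])
```

On this synthetic image the response is positive at the square's corner, negative at the middle of its top edge, and zero in the flat background, matching the three cases listed on the slide.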
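65. FAST: Features from Accelerated Segment Test [Rosten2010]
• Examine the 16 pixels on a circle around the pixel of interest p
Condition for the pixel of interest p to be a corner:
at least n contiguous pixels on the circle are all brighter than p by at least the threshold t, or all darker than p by at least t (dashed line in the figure)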
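The segment test described above can be sketched as follows. This is a minimal illustration, not the optimized detector of [Rosten2010]: the function name `is_fast_corner`, the radius-3 circle offsets in `CIRCLE`, and the default `n = 9` are assumptions made here, and the machine-learned decision tree that makes FAST fast is omitted.

```python
# Offsets (dx, dy) of the 16 pixels on a radius-3 circle around p,
# listed in order around the circle (assumed layout).
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_fast_corner(img, x, y, t, n=9):
    """Segment test at pixel (x, y); it must lie at least 3 pixels from the border."""
    p = img[y][x]
    # Label each circle pixel: +1 if brighter than p by at least t,
    # -1 if darker than p by at least t, 0 otherwise.
    labels = []
    for dx, dy in CIRCLE:
        v = img[y + dy][x + dx]
        labels.append(1 if v - p >= t else -1 if p - v >= t else 0)
    # Search for n contiguous pixels with the same non-zero label.
    # Appending the first n-1 labels lets a run wrap around the circle.
    for sign in (1, -1):
        run = 0
        for lab in labels + labels[:n - 1]:
            run = run + 1 if lab == sign else 0
            if run >= n:
                return True
    return False

# 9x9 test images: the corner of a bright square, a straight edge, a flat patch.
corner_img = [[200 if x >= 4 and y >= 4 else 50 for x in range(9)] for y in range(9)]
edge_img = [[200 if x >= 4 else 50 for x in range(9)] for y in range(9)]
flat_img = [[100] * 9 for _ in range(9)]

print(is_fast_corner(corner_img, 4, 4, t=20))
print(is_fast_corner(edge_img, 4, 4, t=20))
print(is_fast_corner(flat_img, 4, 4, t=20))
```

In the corner image 11 contiguous circle pixels are darker than p, so the test passes; on the straight edge only 7 are, so it fails for n = 9.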
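88. Integral image
• Build the integral image in advance
• The sum of intensities over any region can then be computed quickly
‒ Four lookups combined by one addition and two subtractions: $S = A - B - C + D$
[Figure: region S in an image with origin O; A, B, C, and D are the integral-image values used for the region, and I(i, j) is the intensity at pixel (i, j)]
Integral image: $S(x, y) = \sum_{i=0}^{x} \sum_{j=0}^{y} I(i, j)$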
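A minimal sketch of the integral image and the constant-time region sum is given below. The function names `integral_image` and `region_sum` are hypothetical, and the assignment of A, B, C, D to particular corners of the region is an assumption made here, since the original figure is not available in the transcript.

```python
def integral_image(img):
    """Build a padded integral image: ii[y + 1][x + 1] holds S(x, y)."""
    h, w = len(img), len(img[0])
    # An extra zero row and column removes special cases at the top/left border.
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]                      # running sum along row y
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum  # add everything above
    return ii

def region_sum(ii, x0, y0, x1, y1):
    """Sum of intensities over the inclusive rectangle [x0, x1] x [y0, y1]."""
    a = ii[y0][x0]          # A: above-left of the region (assumed labelling)
    b = ii[y0][x1 + 1]      # B: above-right
    c = ii[y1 + 1][x0]      # C: below-left
    d = ii[y1 + 1][x1 + 1]  # D: bottom-right corner of the region
    return a - b - c + d    # S = A - B - C + D

img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12]]
ii = integral_image(img)
print(region_sum(ii, 1, 0, 2, 1))  # 2 + 3 + 6 + 7 = 18
print(region_sum(ii, 0, 0, 3, 2))  # sum of the whole image = 78
```

Once `ii` is built, every rectangle costs the same four lookups regardless of its size, which is what makes box-filter based detectors such as SURF cheap to evaluate.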
155. 1. References on SIFT
‒ [Lowe2004] D. G. Lowe, "Distinctive image features from scale-invariant keypoints", Int. Journal of Computer Vision, Vol. 60, No. 2, pp. 91-110, 2004.
‒ [Lindeberg1998] T. Lindeberg, "Feature detection with automatic scale selection", Int. Journal of Computer Vision, Vol. 30, No. 2, pp. 79-116, 1998.
‒ [高木2008] M. Takaki, H. Fujiyoshi, "Traffic sign recognition using SIFT features", IEEJ Transactions on Electronics, Information and Systems, Vol. 129-C, No. 5, pp. 824-831, 2009.
‒ [Csurka2004] G. Csurka, C. Bray, C. Dance, L. Fan, "Visual categorization with bags of keypoints", Workshop on Statistical Learning in Computer Vision, European Conference on Computer Vision, pp. 1-22, 2004.
‒ [Brown2007] M. Brown, D. G. Lowe, "Automatic Panoramic Image Stitching using Invariant Features", Int. Journal of Computer Vision, Vol. 74, No. 1, pp. 59-73, 2007.
‒ [Mikolajczyk2005] K. Mikolajczyk, C. Schmid, "A performance evaluation of local descriptors" (GLOH), IEEE Trans. on Pattern Analysis and Machine Intelligence, pp. 1615-1630, 2005.
‒ [Tola2007] E. Tola, V. Lepetit, P. Fua, "A Fast Local Descriptor for Dense Matching", Computer Vision and Pattern Recognition, 2008.
156. 2. References on keypoint detectors
‒ [Bay2006] H. Bay, T. Tuytelaars, L. Van Gool, "SURF: Speeded Up Robust Features", European Conference on Computer Vision, pp. 404-417, 2006.
‒ [Grabner2006] M. Grabner, H. Grabner, H. Bischof, "Fast Approximated SIFT", Asian Conference on Computer Vision, pp. 918-927, 2006.
‒ [Sinha2006] S. N. Sinha, J. Frahm, M. Pollefeys, Y. Genc, "GPU-based Video Feature Tracking And Matching", Workshop on Edge Computing Using New Commodity Architectures, 2006.
‒ [Mikolajczyk2004] K. Mikolajczyk, C. Schmid, "Scale & affine invariant interest point detectors", Int. Journal of Computer Vision, pp. 63-86, 2004.
‒ [Matas2007] J. Matas, O. Chum, M. Urban, T. Pajdla, "Robust Wide Baseline Stereo from Maximally Stable Extremal Regions", British Machine Vision Conference, pp. 384-393, 2002.
‒ [Rosten2010] E. Rosten, R. Porter, T. Drummond, "Faster and Better: A Machine Learning Approach To Corner Detection", IEEE Trans. on Pattern Analysis and Machine Intelligence, pp. 105-119, 2010.
‒ [長谷川2013] T. Hasegawa, Y. Yamauchi, H. Fujiyoshi, M. Ambai, Y. Yoshida, "Keypoint detection by Cascaded FAST", Symposium on Sensing via Image Information (画像センシングシンポジウム), 2013.
157. 3. References on keypoint descriptors
‒ [Mikolajczyk2005] K. Mikolajczyk, C. Schmid, "A Performance Evaluation of Local Descriptors", IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 27, No. 10, pp. 1615-1630, 2005.
‒ [Tola2008] E. Tola, V. Lepetit, P. Fua, "A Fast Local Descriptor for Dense Matching", Computer Vision and Pattern Recognition, 2008.
‒ [Ke2004] Y. Ke, R. Sukthankar, "PCA-SIFT: A more distinctive representation for local image descriptors", Computer Vision and Pattern Recognition, pp. 506-513, 2004.
‒ [Bay2006] H. Bay, T. Tuytelaars, L. Van Gool, "SURF: Speeded Up Robust Features", European Conference on Computer Vision, pp. 404-417, 2006.
‒ [M.Calonder2010] M. Calonder, V. Lepetit, C. Strecha, P. Fua, "BRIEF: Binary Robust Independent Elementary Features", European Conference on Computer Vision, pp. 778-792, 2010.
‒ [Leutenegger2011] S. Leutenegger, M. Chli, R. Y. Siegwart, "BRISK: Binary Robust Invariant Scalable Keypoints", Int. Conference on Computer Vision, 2011.
‒ [Rublee2011] E. Rublee, V. Rabaud, K. Konolige, G. Bradski, "ORB: an efficient alternative to SIFT or SURF", Int. Conference on Computer Vision, 2011.
‒ [Alahi2012] A. Alahi, R. Ortiz, P. Vandergheynst, "FREAK: Fast Retina Keypoint", IEEE Conference on Computer Vision and Pattern Recognition, 2012.
‒ [Ambai2011] M. Ambai, Y. Yoshida, "CARD: Compact And Real-time Descriptors", Int. Conference on Computer Vision, 2011.
158. 3. References on keypoint descriptors
‒ [Trzchinski2012] T. Trzcinski, V. Lepetit, "Efficient Discriminative Projections for Compact Binary Descriptors", European Conference on Computer Vision, pp. 228-242, 2012.
‒ [Trzcinski2013] T. Trzcinski, M. Christoudias, P. Fua, V. Lepetit, "Boosting Binary Keypoint Descriptors", IEEE Conference on Computer Vision and Pattern Recognition, 2013.
159. Other references
• Evaluation and implementation
‒ [Heinly2012] J. Heinly, E. Dunn, J. Frahm, "Comparative Evaluation of Binary Features", European Conference on Computer Vision, 2012.
• Tutorial
‒ A. Vedaldi, J. Matas, K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, "Modern features: advances, applications and software", European Conference on Computer Vision, 2012.
• Review article
‒ [藤吉2011] H. Fujiyoshi, M. Ambai, "Local gradient feature extraction: approaches since SIFT", Journal of the Japan Society for Precision Engineering, Vol. 77, No. 12, pp. 1109-1116, 2011.
• Survey
‒ [Tuytelaars2008] T. Tuytelaars, K. Mikolajczyk, "Local invariant feature detectors: a survey", Foundations and Trends in Computer Graphics and Vision, Vol. 3, No. 3, 2008.
• Mendeley
‒ Group name: ImageLocalFeature