3. From a word sequence to a label: $\hat{y} = \operatorname*{argmax}_{y \in Y} P(y \mid \boldsymbol{x})$
The movie is the best I’ve ever seen!
The movie is coming soon on cinemas.
This movie is rubbish!!!
• Models: naive Bayes, perceptron, logistic regression, support vector machines
[Figure: a word sequence $\boldsymbol{x}$ is scored by $P(y \mid \boldsymbol{x})$ to produce the predicted label $\hat{y}$.]
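To make the decision rule concrete, here is a minimal sketch (not from the slides) of $\hat{y} = \operatorname*{argmax}_{y} P(y \mid \boldsymbol{x})$ using a bag-of-words softmax (logistic-regression) classifier; the vocabulary, labels, and random weights are hypothetical stand-ins for a trained model.

```python
import numpy as np

# Hypothetical vocabulary, labels, and weights; a trained model would
# estimate W and b from labeled data.
vocab = {"movie": 0, "best": 1, "rubbish": 2, "soon": 3}
labels = ["negative", "neutral", "positive"]
rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, (len(labels), len(vocab)))
b = np.zeros(len(labels))

def predict(tokens):
    x = np.zeros(len(vocab))              # bag-of-words vector for the sentence
    for t in tokens:
        if t in vocab:
            x[vocab[t]] += 1.0
    z = W @ x + b
    p = np.exp(z - z.max())
    p /= p.sum()                          # softmax gives P(y | x)
    return labels[int(np.argmax(p))]      # y_hat = argmax_y P(y | x)

print(predict("the movie is the best i have ever seen".split()))
```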
4. From a word sequence to a sequence of labels: $\hat{\boldsymbol{y}} = \operatorname*{argmax}_{\boldsymbol{y} \in Y^m} P(\boldsymbol{y} \mid \boldsymbol{x})$
• Models: hidden Markov models, conditional random fields, encoder–decoder
• Search: pointwise prediction, dynamic programming, beam search, … (see the sketch after the example below)
(input)       In March 2005, the New York Times acquired About, Inc.
(POS)         IN NNP CD DT NNP NNP NNP VBD NNP NNP
(chunks)      O B-NP I-NP B-NP I-NP I-NP I-NP B-VP B-NP B-NP
(translation) 2005年 3月 , ニューヨーク・タイムズ は About 社 を 買収 し た .
(dialogue)    I heard Google and Yahoo were among the other bidders.
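The dynamic-programming search listed above can be sketched with the Viterbi algorithm, which finds $\hat{\boldsymbol{y}} = \operatorname*{argmax}_{\boldsymbol{y}} P(\boldsymbol{y} \mid \boldsymbol{x})$ under first-order label transitions; the emission and transition log-scores below are random stand-ins, not values from the slides.

```python
import numpy as np

def viterbi(emissions, transitions):
    """Return the argmax label sequence for per-position emission
    log-scores (m x L) and transition log-scores (L x L)."""
    m, L = emissions.shape
    score = emissions[0].copy()           # best score ending in each label
    back = np.zeros((m, L), dtype=int)    # backpointers
    for i in range(1, m):
        total = score[:, None] + transitions + emissions[i]
        back[i] = total.argmax(axis=0)    # best previous label per current label
        score = total.max(axis=0)
    y = [int(score.argmax())]             # best final label, then backtrack
    for i in range(m - 1, 0, -1):
        y.append(int(back[i, y[-1]]))
    return y[::-1]

rng = np.random.default_rng(0)
emissions = np.log(rng.dirichlet(np.ones(3), size=5))    # 5 tokens, 3 labels
transitions = np.log(rng.dirichlet(np.ones(3), size=3))
print(viterbi(emissions, transitions))                   # e.g. [0, 2, 1, 1, 0]
```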
7. Advances of DNNs in language processing
• Distributed representations of words
• Encoder–decoder
• Composition of distributed representations
• Attention
[Figure: each of the four techniques illustrated with the phrase "very good movie"; the encoder–decoder example translates it into "とても 良い 映画".]
8. Learning distributed representations of words: Skip-gram with negative sampling (Mikolov+ 13)
[Figure: skip-gram with negative sampling on the corpus fragment "… pubs offer draught beer, cider, and wine …" (case of $h = 2$, $k = 1$).]
• Each word $w$ has a word vector $\boldsymbol{v}_w$ ($d$-dimensional) and each context word $c$ a prediction vector $\tilde{\boldsymbol{v}}_c$ ($d$-dimensional).
• Update policy for the vectors: update so that the $h$ context words on each side of the target are predicted, pushing the inner product $\boldsymbol{v}_w^\top \tilde{\boldsymbol{v}}_c \to +\infty$ (positive examples).
• Sample $k$ words from the unigram distribution and update so that they are not predicted, pushing the inner product $\to -\infty$ (negative examples); the same word may be sampled more than once.
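As a concrete illustration of this update policy, here is a minimal SGD sketch of skip-gram with negative sampling; the toy corpus, hyperparameters ($h$, $k$, $d$, learning rate), and the uniform negative-sampling distribution are simplifying assumptions, not details from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
corpus = "pubs offer draught beer cider and wine".split()   # toy corpus
vocab = {w: i for i, w in enumerate(sorted(set(corpus)))}
d, h, k, lr = 16, 2, 1, 0.05             # illustrative hyperparameters
V = rng.normal(0, 0.1, (len(vocab), d))  # word vectors v_w
C = rng.normal(0, 0.1, (len(vocab), d))  # prediction (context) vectors v~_c

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for i, w in enumerate(corpus):
    wi = vocab[w]
    for j in range(max(0, i - h), min(len(corpus), i + h + 1)):
        if j == i:
            continue
        vw = V[wi].copy()
        # positive pair: push the inner product v_w . v~_c toward +inf
        ci = vocab[corpus[j]]
        g = sigmoid(vw @ C[ci]) - 1.0
        V[wi] -= lr * g * C[ci]
        C[ci] -= lr * g * vw
        # k negatives (uniform here for brevity; word2vec samples from a
        # smoothed unigram distribution): push the inner product toward -inf
        for ni in rng.integers(0, len(V), size=k):
            g = sigmoid(vw @ C[ni])
            V[wi] -= lr * g * C[ni]
            C[ni] -= lr * g * vw
```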
16. Relational knowledge (Nickel+ 16)
• A set of (subject, predicate, object) triples
Leonard Nimoy was an actor who played the character Spock in the science-fiction movie Star Trek
Subject (s)      Predicate (r)   Object (t)
Leonard_Nimoy    profession      Actor
Leonard_Nimoy    starredIn       Star_Trek
Leonard_Nimoy    played          Spock
Spock            characterIn     Star_Trek
Star_Trek        genre           Science_Fiction
17. Factorizing relational knowledge (into distributed representations)
• RESCAL (Nickel+ 11): $\mathrm{score}(s, r, t) = \boldsymbol{x}_s^\top W_r \boldsymbol{x}_t$, with $\boldsymbol{x}_s \in \mathbb{R}^d$, $\boldsymbol{x}_t \in \mathbb{R}^d$, $W_r \in \mathbb{R}^{d \times d}$
• TransE (Bordes+ 13): $\mathrm{score}(s, r, t) = -\lVert \boldsymbol{x}_s + \boldsymbol{w}_r - \boldsymbol{x}_t \rVert_2^2$, with $\boldsymbol{x}_s \in \mathbb{R}^d$, $\boldsymbol{w}_r \in \mathbb{R}^d$, $\boldsymbol{x}_t \in \mathbb{R}^d$
• When minimizing a max-margin loss function:
$$J = \sum_{(s,r,t) \in D} \max\bigl(0,\ 1 - \mathrm{score}(s, r, t) + \mathrm{score}(s^*, r, t^*)\bigr)$$
where $D$ is the set of triples in the knowledge base and $(s^*, r, t^*)$ is a triple not in $D$ (a negative example).
[Figure: in TransE the relation vector for capital translates Japan → Tokyo and UK → London, e.g. the triple (Japan, capital, Tokyo).]
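A minimal sketch of the TransE score and the max-margin loss on the toy triples from the figure; the embeddings are random stand-ins for learned parameters, and the corrupted triples are chosen by hand rather than sampled.

```python
import numpy as np

rng = np.random.default_rng(0)
# random stand-ins for learned entity and relation embeddings (d = 8)
ents = {e: rng.normal(0, 0.1, 8) for e in ["Japan", "Tokyo", "UK", "London"]}
rels = {"capital": rng.normal(0, 0.1, 8)}

def score(s, r, t):
    # TransE: score(s, r, t) = -||x_s + w_r - x_t||_2^2
    diff = ents[s] + rels[r] - ents[t]
    return -float(diff @ diff)

def margin_loss(positives, negatives):
    # J = sum over triples of max(0, 1 - score(s,r,t) + score(s*,r,t*))
    return sum(max(0.0, 1.0 - score(*p) + score(*n))
               for p, n in zip(positives, negatives))

pos = [("Japan", "capital", "Tokyo"), ("UK", "capital", "London")]
neg = [("Japan", "capital", "London"), ("UK", "capital", "Tokyo")]  # not in D
print(margin_loss(pos, neg))
```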
25. Reading comprehension with dynamic entity representations (Kobayashi+ 16)
Text: Once X1 was the U.S. president. X1 faced criticism for affairs. Later X1 was divorced with the wife X2.
Question: [?] was the wife of the president.
• A bidirectional LSTM encodes a vector for each occurrence of an entity X (the vectors of the first and last words are also concatenated), yielding representations 1–3 of X1, representation 1 of X2, and a representation of [?].
• Attention merges the vectors of X from its different contexts into a single representation per entity (one for X1, one for X2).
• The model answers with the entity whose representation has the larger inner product with the representation of [?].
• Initialization by max-pooling.
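A minimal sketch of the answer step under simplifying assumptions: the biLSTM context encodings are replaced by random vectors, and the attention weights are derived from inner products with the query representation rather than the paper's learned scoring.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
q = rng.normal(size=d)                       # representation of [?]
contexts = {"X1": rng.normal(size=(3, d)),   # three occurrences of X1
            "X2": rng.normal(size=(1, d))}   # one occurrence of X2

def attend(H, q):
    # attention over an entity's context vectors: weight each occurrence
    # by its compatibility with the query, then take the weighted sum
    z = H @ q
    a = np.exp(z - z.max())
    a /= a.sum()
    return a @ H

# answer with the entity whose merged representation has the larger
# inner product with the representation of [?]
scores = {e: float(attend(H, q) @ q) for e, H in contexts.items()}
print(max(scores, key=scores.get))
```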
35. References
• D Bahdanau, K Cho, Y Bengio: Neural Machine Translation by Jointly Learning to Align and Translate, in Proc. of ICLR (2015)
• Y Bengio, R Ducharme, P Vincent, C Janvin: A Neural Probabilistic Language Model, Journal of Machine Learning Research, Vol. 3, pp. 1137–1155 (2003)
• A Bordes, N Usunier, A Garcia-Duran, J Weston, O Yakhnenko: Translating Embeddings for Modeling Multi-relational Data, in Proc. of NIPS, pp. 2787–2795 (2013)
• D Chen, J Bolton, C D Manning: A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task, in Proc. of ACL (to appear) (2016)
• K Cho, B van Merrienboer, C Gulcehre, D Bahdanau, F Bougares, H Schwenk, Y Bengio: Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, in Proc. of EMNLP, pp. 1724–1734 (2014)
• W W Cohen: TensorLog: A Differentiable Deductive Database, CoRR, Vol. abs/1605.06523 (2016)
• R Collobert, J Weston: A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning, in Proc. of ICML, pp. 160–167 (2008)
• A Graves: Generating Sequences With Recurrent Neural Networks, CoRR, Vol. abs/1308.0850 (2013)
• M Lee, X He, W-T Yih, J Gao, L Deng, P Smolensky: Reasoning in Vector Space: An Exploratory Study of Question Answering, in Proc. of ICLR (2016)
• M-T Luong, H Pham, C D Manning: Effective Approaches to Attention-based Neural Machine Translation, in Proc. of EMNLP, pp. 1412–1421 (2015)
• K Guu, J Miller, P Liang: Traversing Knowledge Graphs in Vector Space, in Proc. of EMNLP, pp. 318–327 (2015)
• T Mikolov, I Sutskever, K Chen, G S Corrado, J Dean: Distributed Representations of Words and Phrases and their Compositionality, in Proc. of NIPS, pp. 3111–3119 (2013)
• K M Hermann, T Kocisky, E Grefenstette, L Espeholt, W Kay, M Suleyman, P Blunsom: Teaching Machines to Read and Comprehend, in Proc. of NIPS, pp. 1684–1692 (2015)
36. References (continued)
• F Hill, A Bordes, S Chopra, J Weston: The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations, in Proc. of ICLR (2016)
• S Kobayashi, R Tian, N Okazaki, K Inui: Dynamic Entity Representation with Max-pooling Improves Machine Reading, in Proc. of NAACL-HLT, pp. 850–855 (2016)
• A Kumar, P Ondruska, M Iyyer, J Bradbury, I Gulrajani, V Zhong, R Paulus, R Socher: Ask Me Anything: Dynamic Memory Networks for Natural Language Processing, in Proc. of ICML (2016)
• M Nickel, V Tresp, H-P Kriegel: A Three-Way Model for Collective Learning on Multi-Relational Data, in Proc. of ICML, pp. 809–816 (2011)
• M Nickel, K Murphy, V Tresp, E Gabrilovich: A Review of Relational Machine Learning for Knowledge Graphs, Proceedings of the IEEE, Vol. 104, No. 1, pp. 11–33 (2016)
• P Rajpurkar, J Zhang, K Lopyrev, P Liang: SQuAD: 100,000+ Questions for Machine Comprehension of Text, CoRR, Vol. abs/1606.05250 (2016)
• D Paperno, G Kruszewski, A Lazaridou, Q Pham, R Bernardi, S Pezzelle, M Baroni, G Boleda, R Fernandez: The LAMBADA Dataset: Word Prediction Requiring a Broad Discourse Context, in Proc. of ACL (2016)
• S Riedel, L Yao, A McCallum: Latent Relation Representations for Universal Schemas, in Proc. of ICLR (2013)
• P Smolensky: Tensor Product Variable Binding and the Representation of Symbolic Structures in Connectionist Systems, Artificial Intelligence, Vol. 46, No. 1–2 (1990)
• S Sukhbaatar, A Szlam, J Weston, R Fergus: End-to-End Memory Networks, in Proc. of NIPS (2015)
• I Sutskever, J Martens, G Hinton: Generating Text with Recurrent Neural Networks, in Proc. of ICML, pp. 1017–1024 (2011)
• I Sutskever, O Vinyals, Q V Le: Sequence to Sequence Learning with Neural Networks, in Proc. of NIPS, pp. 3104–3112 (2014)
• S Takase, N Okazaki, K Inui: Composing Distributed Representations of Relational Patterns, in Proc. of ACL (2016)
• K Toutanova, D Chen, P Pantel, H Poon, P Choudhury, M Gamon: Representing Text for Joint Embedding of Text and Knowledge Bases, in Proc. of EMNLP, pp. 1499–1509 (2015)
• J Weston, A Bordes, S Chopra, A M Rush, B van Merrienboer, A Joulin, T Mikolov: Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks, in Proc. of ICLR (2016)