1. The document discusses the history of and recent developments in natural language processing and deep learning. It provides an overview of seminal NLP papers from the 1990s to the 2010s and of deep learning architectures from 2003 to the present.
2. Key deep learning models discussed include neural language models, word2vec (see the short illustrative sketch after this summary), convolutional neural networks, and LSTMs. The document also notes the surge of interest and research in deep learning among tech companies such as Google, Facebook, and Baidu starting around 2012.
3. Application examples mentioned include search engines, conversational agents, social media and news summarization tools.
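As a rough illustration of the word-embedding models mentioned in point 2 (word2vec in particular), the short Python sketch below computes cosine similarity between word vectors. The three words and their 4-dimensional vectors are made-up placeholder values for demonstration only, not outputs of any model discussed in the talk; real word2vec embeddings are learned from large corpora and typically have 100-300 dimensions.

import numpy as np

# Toy word vectors (placeholder values chosen by hand for this sketch).
vectors = {
    "king":  np.array([0.8, 0.1, 0.7, 0.2]),
    "queen": np.array([0.7, 0.2, 0.8, 0.1]),
    "apple": np.array([0.1, 0.9, 0.0, 0.6]),
}

def cosine(u, v):
    # Cosine similarity: the standard way to compare embedding vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(vectors["king"], vectors["queen"]))  # related words -> high score
print(cosine(vectors["king"], vectors["apple"]))  # unrelated words -> low score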
47. 2. Industry investment in deep learning
! 2013/3: Google acquires Geoffrey Hinton's DNNresearch
! 2013/4: Baidu establishes its Institute of Deep Learning
! 2013/8, 10: Yahoo! acquires IQ Engines and LookFlow
! 2013/12: Facebook opens an AI lab headed by Yann LeCun
! 2014/1: Google acquires DeepMind
! 2014/5: Andrew Ng joins Baidu
! 2014/8: IBM unveils its SyNAPSE chip
93. References (1/4)
! [Brown+93] Peter F. Brown, Vincent J. Della Pietra, Stephen A. Della Pietra, Robert L. Mercer. The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics, Vol. 19 (2), 1993.
! [Berger+96] Adam L. Berger, Vincent J. Della Pietra, Stephen A. Della Pietra. A Maximum Entropy Approach to Natural Language Processing. Computational Linguistics, Vol. 22 (1), 1996.
! [Lafferty+01] John Lafferty, Andrew McCallum, Fernando C. N. Pereira. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. ICML 2001.
94. References (2/4)
! [Blei+03] David M. Blei, Andrew Y. Ng, Michael I. Jordan. Latent Dirichlet Allocation. JMLR, Vol. 3, 2003.
! [Teh06] Yee Whye Teh. A Hierarchical Bayesian Language Model based on Pitman-Yor Processes. ACL 2006.
! [Clarke+06] James Clarke, Mirella Lapata. Constraint-Based Sentence Compression: An Integer Programming Approach. COLING/ACL 2006.
! [Riedel+06] Sebastian Riedel, James Clarke. Incremental Integer Linear Programming for Non-projective Dependency Parsing. COLING/ACL 2006.
95. References (3/4)
! [Koo+10] Terry Koo, Alexander M. Rush, Michael Collins, Tommi Jaakkola, David Sontag. Dual Decomposition for Parsing with Non-Projective Head Automata. EMNLP 2010.
! [Rush+10] Alexander M. Rush, David Sontag, Michael Collins, Tommi Jaakkola. On Dual Decomposition and Linear Programming Relaxations for Natural Language Processing. EMNLP 2010.
! [Bengio+03] Yoshua Bengio, Réjean Ducharme, Pascal Vincent, Christian Jauvin. A Neural Probabilistic Language Model. JMLR, 2003.
96. References (4/4)
! [Mikolov+10] Tomas Mikolov, Martin Karafiát, Lukáš Burget, Jan "Honza" Černocký, Sanjeev Khudanpur. Recurrent Neural Network Based Language Model. Interspeech 2010.
! [Mikolov+13] Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. CoRR, 2013.
! [Socher+12] Richard Socher, Brody Huval, Christopher D. Manning, Andrew Y. Ng. Semantic Compositionality through Recursive Matrix-Vector Spaces. EMNLP 2012.
! [Kalchbrenner+14] Nal Kalchbrenner, Edward Grefenstette, Phil Blunsom. A Convolutional Neural Network for Modelling Sentences. ACL 2014.