NLP
(Lecture 1: Introduction)
Areas of AI: Logic, Machine Learning, NLP, Vision, Knowledge Representation, Planning, Robotics, Expert Systems
Books etc.
Main Text(s):
Foundations of Statistical NLP: Manning and Schütze
Natural Language Understanding: James Allen
Speech and Language Processing: Jurafsky and Martin
Journals
Computational Linguistics, Natural Language Engineering, AI, AI
Magazine, IEEE SMC
Conferences
ACL, EACL, COLING, MT Summit, EMNLP, IJCNLP, HLT,
ICON, SIGIR, WWW, ICML, ECML
Allied Disciplines
Philosophy
Linguistics
Cognitive Science
Psychology
Brain Science
Physics
Shallow Processing
Part of Speech Tagging and Chunking using HMM, MEMM, CRF, and
Rule Based Systems
EM Algorithm
Language Modeling
N-grams
Probabilistic CFGs
Deep Parsing
What is NLP
Branch of AI
Two Goals
Science Goal: Understand the way language operates
Engineering Goal: Build systems that analyse and generate language; reduce the man-machine gap
[Setup: a test conductor interacting with a machine and a human]
Can the test conductor find out which is the machine and which is the human?
Inspired ELIZA
http://www.manifestation.com/neurotoys/eliza.php3
(sample interactions shown in the lecture)
Phonetics
It is concerned with the processing of speech
Challenges
Homophones: bank (finance) vs. bank (river
bank)
Near Homophones: maatraa vs. maatra (hin)
Word Boundary
aajaayenge: aa jaayenge (will come) or aaj aayenge (will come today)
I got [ua] plate: "I got a plate" or "I got up late"
Phrase boundary
mtech1 students are especially urged to attend, as such seminars are integral to one's post-graduate education.
Disfluency: ah, um, ahem, etc. (used by the speaker to gain time to organize thoughts)
Morphology
It deals with word formation rules from root words.
Nouns: Plural (boy-boys); Gender marking (czar-czarina)
Verbs: Tense (stretch-stretched); Aspect (e.g. perfective: sit - had sat); Modality (e.g. request: khaanaa - khaaiie)
Crucial first step in NLP
Languages rich in morphology: e.g., Dravidian, Hungarian,
Turkish
Languages poor in morphology: Chinese, English
Languages with rich morphology have the advantage of easier processing at higher stages
A task of interest to computer science: Finite State Machines
for Word Morphology
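A minimal Python sketch of how such finite-state style rules for English noun pluralisation might look; the suffix rules below are illustrative assumptions, not the lecture's analyser:

# Illustrative finite-state style rules for English noun plural morphology.
# The rules are assumptions for this sketch, not a complete analyser.

def pluralise(root):
    """Generate the plural surface form of a noun root."""
    if root.endswith(("s", "x", "z", "ch", "sh")):        # sibilant ending
        return root + "es"                                 # box -> boxes
    if root.endswith("y") and root[-2] not in "aeiou":     # consonant + y
        return root[:-1] + "ies"                           # city -> cities
    return root + "s"                                      # boy -> boys

def analyse(surface):
    """Analyse a surface form: propose (root, feature) candidates by stripping suffixes."""
    candidates = [(surface, "")]                           # maybe already a root
    if surface.endswith("ies"):
        candidates.append((surface[:-3] + "y", "+PL"))
    elif surface.endswith("es"):
        candidates.append((surface[:-2], "+PL"))
    elif surface.endswith("s"):
        candidates.append((surface[:-1], "+PL"))
    return candidates

print(pluralise("boy"), pluralise("box"), pluralise("city"))   # boys boxes cities
print(analyse("cities"))                                       # [('cities', ''), ('city', '+PL')]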
Lexical Analysis
Essentially refers to dictionary access and
obtaining the properties of the word
e.g. dog
noun (lexical property)
take-s-in-plural (morph property)
animate (semantic property)
4-legged (semantic property)
carnivore (semantic property)
Challenge: Lexical or word sense disambiguation
Question: why do we need to store this information in the dictionary?
When we produce the dictionary entries in a lexicon, one of our main concerns is how to embed richness in this data structure.
The properties of any word are available in the dictionary.
Example: How many years does a dog live?
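A small Python sketch of how such richness could be embedded in a lexicon entry; the field names and the lifespan value are illustrative assumptions:

# One possible lexicon data structure; field names and values are illustrative.
LEXICON = {
    "dog": {
        "pos": "noun",                                     # lexical property
        "morph": "takes -s in plural",                     # morphological property
        "semantic": ["animate", "4-legged", "carnivore"],  # semantic properties
        "lifespan_years": "about 10-13",                   # world knowledge (assumed value)
    },
}

def lookup(word):
    """Dictionary access: return all stored properties of a word."""
    return LEXICON.get(word.lower(), {})

entry = lookup("dog")
print(entry["pos"], entry["semantic"])   # noun ['animate', '4-legged', 'carnivore']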
Lexical Disambiguation
First step: part of Speech Disambiguation
Dog as a noun (animal)
Dog as a verb (to pursue)
Ex: Wherever Ram went, misfortune dogged him
Sense Disambiguation
Dog (as animal)
Dog (as a very detestable person)
Many meanings of the word come into play depending on the context. A word typically follows one sense per discourse.
Ex: when we see dog in a text, we generally take dog as an animal. Very rarely does dog mean a detestable person.
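A tiny Python sketch of the "one sense per discourse" heuristic: re-tag every occurrence of a word with the sense most frequent for it in the discourse (the provisional senses are assumed to come from some earlier step):

from collections import Counter

def one_sense_per_discourse(tagged_tokens):
    """tagged_tokens: list of (word, provisional_sense) pairs from one discourse.
    Return the tokens re-tagged with each word's majority sense."""
    by_word = {}
    for word, sense in tagged_tokens:
        by_word.setdefault(word, Counter())[sense] += 1
    majority = {w: c.most_common(1)[0][0] for w, c in by_word.items()}
    return [(word, majority[word]) for word, _ in tagged_tokens]

doc = [("dog", "animal"), ("dog", "animal"), ("dog", "detestable_person")]
print(one_sense_per_discourse(doc))   # all three occurrences tagged 'animal'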
[Parse tree of "I like mangoes": S dominates NP (I) and VP; the VP dominates V (like) and NP (mangoes)]
Parsing Strategy
Driven by grammar
S -> NP VP
NP -> N | PRON
VP -> V NP | V PP
N -> mangoes
PRON -> I
V -> like
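A minimal pure-Python top-down parse of "I like mangoes" using exactly this grammar; a real system would use a chart parser, so this sketch is only for illustration:

# Top-down recursive-descent parse driven by the grammar above.
GRAMMAR = {
    "S":    [["NP", "VP"]],
    "NP":   [["N"], ["PRON"]],
    "VP":   [["V", "NP"], ["V", "PP"]],
    "N":    [["mangoes"]],
    "PRON": [["I"]],
    "V":    [["like"]],
}

def parse(symbol, tokens, pos):
    """Try to expand `symbol` starting at tokens[pos]; return (tree, next_pos) or None."""
    if symbol not in GRAMMAR:                      # terminal: must match the next token
        if pos < len(tokens) and tokens[pos] == symbol:
            return symbol, pos + 1
        return None
    for production in GRAMMAR[symbol]:             # try each production in turn
        children, cur = [], pos
        for part in production:
            result = parse(part, tokens, cur)
            if result is None:
                break
            child, cur = result
            children.append(child)
        else:
            return (symbol, children), cur
    return None

tree, end = parse("S", ["I", "like", "mangoes"], 0)
print(tree)   # ('S', [('NP', [('PRON', ['I'])]), ('VP', [('V', ['like']), ('NP', [('N', ['mangoes'])])])])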
Structural Ambiguity
Overheard
I did not know my PDA had a phone for 3 months
Semantic Analysis
Representation in terms of
Predicate calculus/Semantic
Nets/Frames/Conceptual Dependencies and
Scripts
John gave a book to Mary
Give action: Agent: John, Object: Book,
Recipient: Mary
Challenge: ambiguity in semantic role labeling
(Eng) Visiting aunts can be a nuisance
(Hin) aapko mujhe mithaai khilaanii padegii
(ambiguous in Marathi and Bengali too; not in
Dravidian languages)
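A small Python sketch of a case-frame representation for "John gave a book to Mary"; the slot names follow the Agent/Object/Recipient roles above, and the extra tense slot is an assumption:

frame = {
    "predicate": "give",
    "agent": "John",        # who performs the action
    "object": "book",       # what is transferred
    "recipient": "Mary",    # who receives it
    "tense": "past",        # illustrative extra slot (assumption)
}

def to_logic(f):
    """Render the frame in a flat predicate-calculus style."""
    return "give(agent=%s, object=%s, recipient=%s)" % (f["agent"], f["object"], f["recipient"])

print(to_logic(frame))   # give(agent=John, object=book, recipient=Mary)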
Pragmatics
Very hard problem
Model user intention
Tourist (in a hurry, checking out of the hotel,
motioning to the service boy): Boy, go
upstairs and see if my sandals are under the
divan. Do not be late. I just have 15 minutes
to catch the train.
Boy (running upstairs and coming back
panting): yes sir, they are there.
Discourse
Processing of sequence of sentences
Mother to John:
John, go to school. It is open today. Should you bunk? Father will be very angry.
Ambiguity of open
bunk what?
Why will the father be angry?
Complex chain of reasoning and application of
world knowledge
Ambiguity of father
father as parent
or
father as headmaster
Complexity of Connected Text
John was returning from school dejected; today was the math test
He couldn't control the class
Teacher shouldn't have made him responsible
After all he is just a gatekeeper
Structure Disambiguation is as critical as Sense Disambiguation
Scope (portion of text in the scope of a
modifier)
Old men and women will be taken to safe
locations
No smoking areas allow hookahs inside
Clause
I told the child that I liked that he came to
the game on time
Preposition
I saw the boy with a telescope
Structure Disambiguation is as critical as Sense Disambiguation (contd.): semantic role, postposition
Preposition Attachment Disambiguation
Problem definition
4-tuples of the form V N1 P N2
saw (V) boys (N1) with (P) telescopes (N2)
Example: lexical association
A table entry is considered a definite instance of the
prepositional phrase attaching to the verb if:
the verb definitely licenses the prepositional phrase
E.g. from PropBank, the absolve frameset:
absolve.XX: NP-ARG0 NP-ARG2-of obj-ARG1
On Friday , the firms filed a suit *ICH*-1 against West
Virginia in New York state court asking for [ ARG0 a
declaratory judgment] [rel absolving] [ARG1 them] of
[ARG2-of liability] .
Core steps
Seven different procedures for
deciding whether a table entry is an
instance of no attachment, sure
noun attach, sure verb attach, or
ambiguous attach
able to extract frequency
information, counting the number of
times a particular verb or noun
attaches with a particular preposition
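A hedged Python sketch of the kind of score such attachment counts can feed, in the spirit of Hindle-and-Rooth lexical association; the smoothing and the exact log-ratio form are assumptions for illustration:

import math
from collections import Counter

verb_p, noun_p = Counter(), Counter()          # (v, p) and (n, p) attachment counts
verb_total, noun_total = Counter(), Counter()

def observe_verb_attach(v, p):
    verb_p[(v, p)] += 1
    verb_total[v] += 1

def observe_noun_attach(n, p):
    noun_p[(n, p)] += 1
    noun_total[n] += 1

def lexical_association(v, n, p, smooth=0.5):
    """Log ratio of P(p | v) to P(p | n); positive favours verb attachment."""
    p_given_v = (verb_p[(v, p)] + smooth) / (verb_total[v] + 1.0)
    p_given_n = (noun_p[(n, p)] + smooth) / (noun_total[n] + 1.0)
    return math.log2(p_given_v / p_given_n)

def decide(v, n, p):
    return "verb-attach" if lexical_association(v, n, p) > 0 else "noun-attach"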
Critique
Limited by the number of
relationships in the training corpora
Too large a parameter space
Model acquired during training is
represented in a huge table of
probabilities, precluding any
straightforward analysis of its
workings
Example Transformations
Accuracy:
Method                            Accuracy         # of transformation rules
Hindle and Rooth (baseline)       70.4 to 75.8%    NA
Transformations                   79.2%            418
Transformations (word classes)    81.8%            266
Core formulation
We denote the partially parsed verb phrase, i.e., the verb phrase without the attachment decision, as a history h, and the conditional probability of an attachment as P(d|h), where d corresponds to a noun or verb attachment, 0 or 1, respectively.
Features
Two types of binary-valued
questions:
Questions about the presence of any n-gram of the four head words, e.g., a bigram may be V == "is", P == "of"
Features comprised solely of
questions on words are denoted as
word features
Features (contd.)
Questions that involve the class
membership of a head word
Binary hierarchy of classes derived
by mutual information
Features (contd.)
Given a binary class hierarchy,
we can associate a bit string with every word in
the vocabulary
Then, by querying the value of certain bit
positions we can construct
binary questions.
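A Python sketch of how such binary questions could be generated for a (V, N1, P, N2) tuple; the feature naming scheme and the bit-string prefix lengths queried are assumptions:

from itertools import combinations

def word_features(v, n1, p, n2):
    """Word features: presence of any n-gram over the four head words."""
    heads = [("V", v), ("N1", n1), ("P", p), ("N2", n2)]
    feats = []
    for r in range(1, 5):                              # unigrams .. 4-grams
        for combo in combinations(heads, r):
            feats.append("+".join("%s=%s" % hw for hw in combo))
    return feats

def class_features(v, n1, p, n2, bitstrings, prefix_lengths=(4, 8, 12)):
    """Class features: query prefixes of each head word's mutual-information
    derived bit string (bitstrings is an assumed word -> bit-string map)."""
    feats = []
    for role, word in (("V", v), ("N1", n1), ("P", p), ("N2", n2)):
        bits = bitstrings.get(word, "")
        for k in prefix_lengths:
            if len(bits) >= k:
                feats.append("%s-class[:%d]=%s" % (role, k, bits[:k]))
    return feats

print(word_features("saw", "boys", "with", "telescopes")[:5])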
Probabilistic formulation
P(d|h) is estimated by maximum likelihood from counts over the head words, backing off from more specific to less specific contexts when counts are too low.
The cut-off frequencies (c1, c2, ...) are thresholds determining whether to back off or not at each level: counts lower than ci at stage i are deemed to be too low to give an accurate estimate, so in this case backing-off continues.
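A Python sketch of such a back-off estimate; the particular back-off chain and cut-off values below are assumptions, the point is only the mechanism:

from collections import Counter

count = [Counter() for _ in range(3)]   # count[i][(context, d)]
total = [Counter() for _ in range(3)]   # total[i][context]

def contexts(v, n1, p, n2):
    """Back-off chain, most specific first (the exact chain is an assumption)."""
    return [(v, n1, p, n2),             # full 4-tuple
            (v, p, n2),                 # drop N1
            (p,)]                       # preposition alone

def observe(v, n1, p, n2, d):
    for i, h in enumerate(contexts(v, n1, p, n2)):
        count[i][(h, d)] += 1
        total[i][h] += 1

def p_attach(v, n1, p, n2, d=1, cutoffs=(3, 3, 1)):
    """MLE of P(d|h), backing off when a stage's count is below its cut-off ci."""
    for i, h in enumerate(contexts(v, n1, p, n2)):
        if total[i][h] >= cutoffs[i]:
            return count[i][(h, d)] / total[i][h]
    return 0.5                          # nothing reliable seen (assumed uniform prior)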
Lower bound: always choosing the most frequent attachment
Upper bound: human experts looking at the 4 head words only
Results: Transformation Learning (Brill et al.)
Experimental setup
Training Data:
Brown corpus (raw text). The corpus size is 6 MB; it consists of 51,763 sentences and nearly 1,027,000 words.
Most frequent Prepositions in the syntactic context N1-P-N2:
of, in, for, to, with, on, at, from, by
Most frequent Prepositions in the syntactic context V-P-N: in,
to, by, with, on, for, from, at, of
Extracted unambiguous N1-P-N2 tuples: 54,030; V-P-N tuples: 22,362
Test Data:
Penn Treebank Wall Street Journal (WSJ) data extracted by
Ratnaparkhi
It consists of V-N1-P-N2 tuples: 20,801 (training), 4,039 (development) and 3,097 (test)
Chunker -> Extraction Heuristics -> Output
The professional conduct of the doctors is guided by Indian
Medical Association.
The_DT professional_JJ conduct_NN of_IN the_DT doctors_NNS is_VBZ guided_VBN by_IN Indian_NNP Medical_NNP Association_NNP ._.
[The_DT professional_JJ conduct_NN] of_IN [the_DT doctors_NNS] (is_VBZ guided_VBN) by_IN [Indian_NNP Medical_NNP Association_NNP].
After replacing each chunk by its head word, this results in:
conduct_NN of_IN doctors_NNS guided_VBN by_IN
Association_NNP
N1PN2: conduct of doctors and VPN: guided by Association
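A small Python sketch of the head-replacement step above: each [...] noun chunk and (...) verb group is replaced by its last word, taken as the head (an assumption of this sketch):

import re

CHUNK = re.compile(r"\[([^\]]+)\]|\(([^)]+)\)|(\S+)")

def heads(chunked):
    """Replace every chunk by its last token (assumed head word)."""
    out = []
    for noun_grp, verb_grp, single in CHUNK.findall(chunked):
        grp = noun_grp or verb_grp or single
        out.append(grp.split()[-1])
    return out

s = ("[The_DT professional_JJ conduct_NN ] of_IN [the_DT doctors_NNS ] "
     "(is_VBZ guided_VBN) by_IN [Indian_NNP Medical_NNP Association_NNP]")
print(" ".join(heads(s)))
# conduct_NN of_IN doctors_NNS guided_VBN by_IN Association_NNP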
Morphing
DSRP (Synset
Replacement)
Example of DSR by
inferencing
V1-P-N1: play in garden and V2-P-N1:
sit in garden
V1-P-N2: play in house and V2-P-N2:
sit in house
V3-P-N2: jump in house exists
Infer the existence of V3-P-N1:
jump in garden
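A Python sketch of the inference pattern in this example: if two verbs are seen with both nouns (for the same preposition), a third verb seen with one of the nouns is inferred to occur with the other as well. This coding of the DSR idea is an assumption:

def infer_new_tuples(observed):
    """observed: set of (verb, prep, noun) tuples; return inferred new tuples."""
    inferred = set()
    for (v1, p, n1) in observed:
        for (v2, p2, n2) in observed:
            if p != p2 or n1 == n2 or v1 == v2:
                continue
            # v1 and v2 both attested with n1 and with n2?
            if {(v1, p, n2), (v2, p, n1), (v2, p, n2)} <= observed:
                # any third verb attested with n2 is inferred to occur with n1 too
                for (v3, p3, n3) in observed:
                    if p3 == p and n3 == n2 and v3 not in (v1, v2):
                        inferred.add((v3, p, n1))
    return inferred - observed

seen = {("play", "in", "garden"), ("sit", "in", "garden"),
        ("play", "in", "house"), ("sit", "in", "house"),
        ("jump", "in", "house")}
print(infer_new_tuples(seen))   # {('jump', 'in', 'garden')}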
Results