Probability Theory and Mathematical Statistics

MAREK FISZ

THIRD EDITION

A Wiley Publication in Mathematical Statistics
John Wiley & Sons, Inc., New York · London · Sydney

Polish title: Rachunek prawdopodobieństwa i statystyka matematyczna

Copyright 1963 by John Wiley & Sons, Inc.
Library of Congress Catalog Card Number: 63-7554
Although great care has been taken to make this book mathematically rigorous, the intuitive approach as well as the applicability of the concepts and theorems presented are heavily stressed.
For the most part, the theorems are given with complete proofs. Some proofs, which are either too lengthy or require mathematical knowledge far beyond the scope of this book, were omitted.
The entire text of the book may be read by students with some background in calculus and algebra. However, no advanced knowledge of these fields or a knowledge of measure and integration theory is required. Some necessary advanced concepts (for instance, that of the Stieltjes integral) are presented in the text. Furthermore, this book is provided with a Supplement, in which some basic concepts and theorems of modern measure and integration theory are presented.
Every chapter is followed by "Problems and Complements." A large part of these problems are relatively easy and are to be solved by the reader; the remaining ones are given for information and stimulation.
This book may be used for systematic one-year courses in either probability theory or mathematical statistics, for either senior undergraduate or graduate students. I have presented parts of the material covered by this book in courses at the University of Warsaw (Poland) for nine academic years, from 1951/1952 to 1959/1960, at Peking University (China) in the spring term of 1957, and in this country at the University of Washington and at Stanford, Columbia, and New York Universities for the last several years.
This book is also suitable for nonmathematicians, as far as concepts, theorems, and methods of application are concerned.
* * *
I started to write this book at the end of 1950. Its first edition (374 pages) was published in Polish in 1954. All copies were sold within a few months. I then prepared the second, revised and extended, Polish edition.
* * *
MAREK
FISZ
Contents

PART 1. PROBABILITY THEORY

1. RANDOM EVENTS

2. RANDOM VARIABLES

3.
   3.1  Expected values
   3.2  Moments
   3.3  The Chebyshev inequality
   3.4  Absolute moments
   3.5  Order parameters
   3.6  Moments of random vectors
   3.7  Regression of the first type
   3.8  Regression of the second type
        Problems and Complements

4. CHARACTERISTIC FUNCTIONS

5.
   5.7  The normal distribution
   5.8  The gamma distribution
   5.9  The beta distribution
   5.10 The Cauchy and Laplace distributions
   5.11 The multidimensional normal distribution

6. LIMIT THEOREMS
   6.1  Preliminary remarks
   6.2  Stochastic convergence
   6.3  Bernoulli's law of large numbers
   6.4  The convergence of a sequence of distribution functions
   6.5  The Riemann-Stieltjes integral
   6.6  The Lévy-Cramér theorem
   6.7  The de Moivre-Laplace theorem
   6.8  The Lindeberg-Lévy theorem
   6.9  The Lapunov theorem
   6.10 The Gnedenko theorem
   6.11 Poisson's, Chebyshev's, and Khintchin's laws of large numbers
   6.12 The strong law of large numbers
   6.13 Multidimensional limit distributions
   6.14 Limit theorems for rational functions of some random variables
   6.15 Final remarks
        Problems and Complements

7. MARKOV CHAINS
   7.1  Preliminary remarks
   7.2  Homogeneous Markov chains
   7.3  The transition matrix
   7.4  The ergodic theorem
   7.5  Random variables forming a homogeneous Markov chain

8. STOCHASTIC PROCESSES

PART 2. MATHEMATICAL STATISTICS

9.

10. ORDER STATISTICS

11.

12. SIGNIFICANCE TESTS

13.
   13.1 Preliminary notions
   13.2 Consistent estimates
   13.3 Unbiased estimates
   13.4 The sufficiency of an estimate
   13.5 The efficiency of an estimate
   13.6 Asymptotically most efficient estimates
   13.7 Methods of finding estimates
   13.8 Confidence intervals
   13.9 Bayes theorem and estimation

14.
   14.1 Preliminary remarks
   14.2 Methods of random sampling
   14.3 Schemes of independent and dependent random sampling
   14.4 Schemes of unrestricted and stratified random sampling
   14.5 Random errors of measurements

15.
   15.1 One-way classification
   15.2 Multiple classification
   15.3 A modified regression problem

16.
   16.1 Preliminary remarks
   16.2 The power function and the OC function
   16.3 Most powerful tests
   16.4 Uniformly most powerful tests
   16.5 Unbiased tests
   16.6 The power and consistency of nonparametric tests
   16.7 Additional remarks

17.
   17.1  Preliminary remarks
   17.2  The sequential probability ratio test
   17.3  Auxiliary theorems
   17.4  The fundamental identity
   17.5  The OC function of the sequential probability ratio test
   17.6  The expected value E(n)
   17.7  The determination of A and B
   17.8  Testing a hypothesis concerning the parameter p of a zero-one distribution
   17.9  Testing a hypothesis concerning the expected value m of a normal population
   17.10 Additional remarks
         Problems and Complements

SUPPLEMENT
REFERENCES
STATISTICAL TABLES
AUTHOR INDEX
SUBJECT INDEX
PART 1

Probability Theory

CHAPTER 1

Random Events

1.1  PRELIMINARY REMARKS

A
Probability theory is a part of mathematics which is useful in discovering and investigating the regular features of random events. The following examples show what is ordinarily understood by the term random event.
Example 1.1.1. Let us toss a symmetric coin. The result may be either a head or a tail. For any one throw we cannot predict the result, although it is obvious that it is determined by definite causes. Among them are the initial velocity of the coin, the initial angle of the throw, and the smoothness of the table on which the coin falls. However, since we cannot control all these parameters, we cannot predetermine the result of any particular toss. Thus the result of a coin tossing, head or tail, is a random event.
Example 1.1.2. Suppose that we observe the average monthly temperature at a definite place and for a definite month, for instance, for January in Warsaw. This average depends on many causes, such as the humidity and the direction and strength of the wind. The effect of these causes changes year by year. Hence Warsaw's average temperature in January is not always the same. Here we can determine the causes for a given average temperature, but often we cannot determine the reasons for the causes themselves. As a result, we are not able to predict with a sufficient degree of accuracy what the average temperature for a certain January will be. Thus we refer to it as a random event.
B
It might seem that there is no regularity in the examples given. But if the number of observations is large, that is, if we deal with a mass phenomenon, some regularity appears.
Let us return to example 1.1.1. We cannot predict the result of any particular toss, but if we perform a long series of tossings, we notice that the number of times heads occur is approximately equal to the number of times tails appear. Let n denote the number of all our tosses and m the number of times heads appear. The fraction m/n is called the frequency of the appearance of heads.
Table 1.1.1

Year of   Number of Births        Total Number   Frequency of Births
Birth     Boys (m)   Girls (f)    of Births      Boys (p1)   Girls (p2)
1927       496,544    462,189        958,733       0.518       0.482
1928       513,654    477,339        990,993       0.518       0.482
1929       514,765    479,336        994,101       0.518       0.482
1930       528,072    494,739      1,022,811       0.516       0.484
1931       496,986    467,587        964,573       0.515       0.485
1932       482,431    452,232        934,663       0.516       0.484
Total    3,032,452  2,833,422      5,865,874       0.517       0.483
Buffon tossed a coin 4040 times and obtained heads 2048 times; hence the frequency of heads was m/n = 0.50693. In 24,000 tosses K. Pearson obtained a frequency of heads equal to 0.5005. We can see quite clearly that the observed frequencies oscillate about the number 0.5.
As a result of long observation, we can also notice certain regularities
in example 1.1.2. We investigate this more closely in example 12.5.1.
Example 1.1.3. We cannot predict the sex of a newborn baby in any particular case. We treat this phenomenon as a random event. But if we observe a large number of births, that is, if we deal with a mass phenomenon, we are able to predict with considerable accuracy what the percentages of boys and girls among all newborn babies will be. Let us consider the numbers of births of boys and girls in Poland in the years 1927 to 1932. The data are presented in Table 1.1.1. In this table m and f denote, respectively, the numbers of births of boys and girls in particular years. Denote the frequencies of these births by p1 and p2, respectively; then

        p1 = m / (m + f),        p2 = f / (m + f).

One can see that the values of p1 oscillate about the number 0.517, and the values of p2 oscillate about the number 0.483.
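The figures in the last row of Table 1.1.1 can be recomputed directly from the column totals; the following lines are a quick arithmetic check, not part of the original text.

```python
boys, girls = 3_032_452, 2_833_422        # totals from Table 1.1.1
total = boys + girls                      # all births, 1927 to 1932
p1, p2 = boys / total, girls / total      # frequencies of boys and girls
print(total, round(p1, 3), round(p2, 3))  # 5865874 0.517 0.483
```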
Among the subsets of the set E of n elementary events there are

        (n choose 1)   one-element events,
        (n choose 2)   two-element events,
        . . .
        (n choose n−1) (n−1)-element events.

If two events A and B consist of the same elementary events, we write A = B.
We now postulate the following properties of Z.
Property 1.2.1. The set Z of random events contains as an element the whole set E.
Property 1.2.2. The set Z of random events contains as an element the empty set (0).
These two properties state that the set Z of random events contains as elements the sure and the impossible events (Fig. 1.2.1).
Definition 1.2.5. We say that two events A and B are exclusive if they do not have any common element of the set E.
Example 1.2.2. Consider the random event A that two persons from the group of n persons born in Warsaw in 1950 will still be alive in the year 2000 and the event B that two or more persons from the group considered will still be alive in the year 2000. Events A and B are not exclusive.
If, however, we consider the event B' that only one person will still be alive in the year 2000, events A and B' will be exclusive.
Let us analyze this example more closely. In the group of n elements being considered it may happen that 1, or 2, or 3, ... up to n persons will still be alive in the year 2000, and it may happen that none of them will be alive at that time. Then the set E consists of n + 1 elementary events e0, e1, ..., en, where the indices 0, 1, ..., n denote the number of persons from the group being considered who will still be alive in the year 2000. The random event A in this example contains only one element, namely, the elementary event e2. The random event B contains n − 1 elementary events, namely, e2, e3, ..., en. The common element of the two events A and B is the elementary event e2, and hence these two events are not exclusive. However, event B' contains only one element, namely, the elementary event e1. Thus events A and B' have no common element, and are exclusive.
The alternative (sum) of the events A1, A2, ... is written

        A1 ∪ A2 ∪ ...        or        A1 + A2 + ...,

and read: A1 or A2 or .... (Figs. 1.2.2 and 1.2.3.)

The product (intersection) of the events A1, A2, ... is written

        A1 ∩ A2 ∩ ...,        or        A1A2 ...,        or        ∏_i Ai.
two. Consider also the event B that on the farm there is exactly one horse and at most one plow. We find the product of events A and B.
In this example the set of elementary events has 9 elements, which are denoted by the symbols

        e00, e01, e02, e10, e11, e12, e20, e21, e22,

the first index denoting the number of horses, and the second the number of plows.
The random event A contains four elementary events, e11, e12, e21, e22, and the random event B contains two elementary events, e10 and e11. The product A ∩ B contains one elementary event, e11; hence the event A ∩ B occurs if and only if on the chosen farm there is exactly one horse and exactly one plow.
Definition 1.2.9. The difference of events E − A is called the complement of the event A and is denoted by Ā.
The complement of an event is illustrated by Fig. 1.2.5, where square E represents the set of elementary events, and circle A denotes some event; the shaded area represents the complement Ā of A.
This definition may also be formulated in the following way: Event Ā occurs if and only if event A does not occur.
According to properties 1.2.1 and 1.2.4 of the set Z of random events, the complement Ā of A is a random event.
Example 1.2.5. Suppose we have a number of electric light bulbs. We are interested in the time t that they glow. We fix a certain value t0 such that if the bulb burns out in a time shorter than t0, we consider it to be defective. We select a bulb at random. Consider the random event A that we select a defective bulb. Then the random event that we select a good one, that is, a bulb that glows for a time no shorter than t0, is the event Ā, the complement of the event A.
We now give the definition (see Supplement) of the Borel field of events which was mentioned earlier.
Definition 1.2.10. A set Z of subsets of the set E of elementary events with properties 1.2.1 to 1.2.5 is called a Borel field of events, and its elements are called random events.
In the sequel we consider only random events, and often instead of writing "random event" we simply write "event."
D
The following definitions will facilitate some of the formulations and proofs given in the subsequent parts of this book.
Definition 1.2.11. The sequence {An} (n = 1, 2, ...) of events is called nonincreasing if for every n we have An ⊃ An+1.
The product of a nonincreasing sequence of events is called the limit of this sequence. We write

        A = ∏_{n≥1} An = lim_{n→∞} An.

Definition 1.2.12. The sequence {An} (n = 1, 2, ...) of events is called nondecreasing if for every n we have An ⊂ An+1.
The sum of a nondecreasing sequence of events is called the limit of this sequence. We write

        A = ∑_{n≥1} An = lim_{n→∞} An.
Example 1.3.1. Suppose there are only black balls in an urn. Let the random experiment consist in drawing a ball from the urn. Let m/n denote, as before, the frequency of appearance of the black ball. It is obvious that in this example we shall always have m/n = 1. Here, drawing a black ball out of the urn is a sure event, and we see that its frequency equals one.
Taking into account this property of the sure event, we formulate the following axiom.
Axiom II. The probability of the sure event equals one. We write

        P(E) = 1.

We shall see in Section 2.3 that the converse of axiom II is not true: if the probability of a random event A equals one, or P(A) = 1, the set A may not include all the elementary events of the set E.
We have already seen that the frequency of appearance of face 6 in throwing a die oscillates about the number 1/6. The same is true for face 2. We notice that these two events are exclusive and that the frequency of occurrence of either face 6 or face 2 (that is, the frequency of the alternative of these events), which equals the sum of their frequencies, oscillates about the number 1/6 + 1/6 = 1/3.
Experience shows that if a card is selected from a deck of 52 cards (4 suits of 13 cards each) many times over, the frequency of appearance of any one of the four aces equals about 4/52, and the frequency of appearance of any spade equals about 13/52. Nevertheless, the frequency of appearance of the alternative, ace or spade, oscillates not about the number 4/52 + 13/52 = 17/52 but about the number 16/52. This phenomenon is explained by the fact that ace and spade are not exclusive random events (we could select the ace of spades). Therefore the frequency of the alternative, ace or spade, is not equal to the sum of the frequencies of ace and spade. Taking into account this property of the frequency of the alternative of events, we formulate the last axiom.
Axiom III. The probability of the alternative of a finite or denumerable number of pairwise exclusive events equals the sum of the probabilities of these events.
Thus, if we have a finite or countable sequence of pairwise exclusive events {Ak}, k = 1, 2, ..., then, according to axiom III, the following formula holds:

(1.3.1)        P(∑_k Ak) = ∑_k P(Ak).

In particular, if a random event contains a finite or countable number of elementary events ek and (ek) ∈ Z (k = 1, 2, ...), then

        P((e1, e2, ...)) = P(e1) + P(e2) + ⋯.

The property expressed by axiom III is called the countable (or complete) additivity of probability.¹
Axiom III concerns only the sums of pairwise exclusive events. Now let A and B be two arbitrary random events, exclusive or not. We shall find the probability of their alternative.
We can write

        A ∪ B = A ∪ (B − AB),        B = AB ∪ (B − AB),

where on the right-hand sides we have alternatives of exclusive events. Hence, by axiom III,

        P(A ∪ B) = P(A) + P(B − AB),        P(B) = P(AB) + P(B − AB),

and therefore

(1.3.2)        P(A ∪ B) = P(A) + P(B) − P(AB).

Formula (1.3.2) can be generalized to an arbitrary finite number of random events. It is

        P(⋃_{k=1}^n Ak) = ∑_{k=1}^n P(Ak) − ∑_{k1<k2} P(Ak1 Ak2) + ∑_{k1<k2<k3} P(Ak1 Ak2 Ak3) − ⋯ + (−1)^{n+1} P(A1 A2 ⋯ An).
¹ We could have said that the probability P(A), satisfying axioms I to III, is a normed, non-negative, and countably additive measure on the Borel field Z of subsets of E.
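The generalized formula can be verified mechanically on a small finite set of equally probable elementary events; the three events in the following Python sketch are arbitrary choices made only for the check, and are not from the text.

```python
from itertools import combinations

E = set(range(8))            # 8 equally probable elementary events
def P(S):
    """Probability of an event S, a subset of E."""
    return len(S) / len(E)

events = [{0, 1, 2, 3}, {2, 3, 4, 5}, {1, 3, 5, 7}]   # arbitrary A1, A2, A3

lhs = P(set().union(*events))        # probability of the alternative

# inclusion-exclusion: alternating sum over all non-empty subfamilies
rhs = sum((-1) ** (s + 1) * P(set.intersection(*c))
          for s in range(1, len(events) + 1)
          for c in combinations(events, s))

print(lhs, rhs)  # 0.875 0.875
```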
B
B
Consider a finite or countable number of random events Ak, where k = 1, 2, .... If every elementary event of the set E belongs to at least one of the random events A1, A2, ..., we say that these events exhaust the set of elementary events E. The alternative ∑_k Ak contains all the elementary events of the set E and therefore is the sure event. By axiom II we obtain
Theorem 1.3.1. If the events A1, A2, ... exhaust the set of elementary events E, then

(1.3.3)        P(∑_k Ak) = 1.
Example 1.3.2. Let the set of all non-negative integers form the set of elementary events. Let (en) be the event of obtaining the number n, where n = 0, 1, 2, .... Suppose that

        P(en) = c (1/n!)        (n = 0, 1, 2, ...).

Since the events en exhaust the set of elementary events, theorem 1.3.1 gives

        1 = P(∑_{n=0}^∞ en) = c ∑_{n=0}^∞ 1/n! = ce,

and hence c = 1/e.
Now let A be an arbitrary random event and Ā its complement. The events A and Ā are exclusive, and their alternative is the sure event. Hence, by axioms II and III,

        1 = P(A ∪ Ā) = P(A) + P(Ā),

and finally

(1.3.4)        P(A) + P(Ā) = 1.
        A ∪ E = E.

If A is the impossible event (it does not contain any of the elementary events), A and E are exclusive because they have no common element. Applying axiom III, we obtain

        P(A) + P(E) = P(E),

and hence P(A) = 0.
We shall see in Section 2.3 that the converse is not true; from the fact that the probability of some event equals zero it does not follow that this event is impossible.
C
The following two theorems have numerous applications.
Theorem 1.3.4. Let {An}, n = 1, 2, ..., be a nonincreasing sequence of events and let A be their product. Then

(1.3.5)        P(A) = lim_{n→∞} P(An).

Proof. We have

        An = ∑_{k=n}^∞ Ak Āk+1 + A.

Hence, by formula (1.3.2),

(1.3.6)        P(An) = P(∑_{k=n}^∞ Ak Āk+1) + P(A) − P(A ∑_{k=n}^∞ Ak Āk+1).

We note that

        A ∑_{k=n}^∞ Ak Āk+1 = ∑_{k=n}^∞ A Ak Āk+1.

Since A ⊂ Ak+1 for every k, each product A Ak Āk+1 is impossible; therefore

        P(A Ak Āk+1) = 0.

Since the events under the summation sign on the right-hand side of formula (1.3.6) are exclusive, we have

(1.3.7)        P(An) = ∑_{k=n}^∞ P(Ak Āk+1) + P(A).

The series ∑_{k=1}^∞ P(Ak Āk+1) is convergent, since its partial sums are bounded by one; hence its remainder tends to zero,

(1.3.8)        lim_{n→∞} ∑_{k=n}^∞ P(Ak Āk+1) = 0.

From (1.3.7) and (1.3.8) we obtain (1.3.5).
Theorem 1.3.5. Let {An}, n = 1, 2, ..., be a nondecreasing sequence of events and let A be their sum. Then

        P(A) = lim_{n→∞} P(An).

Proof. Consider the sequence of events {Ān}, the complements of the events An. From the assumption that {An} is a nondecreasing sequence it follows that {Ān} is a nonincreasing sequence. Let Ā be the product of the events Ān. From theorem 1.3.4 it follows that

        P(Ā) = lim_{n→∞} P(Ān).

Hence, by (1.3.4),

        1 − P(A) = lim_{n→∞} [1 − P(An)],

and finally

        P(A) = lim_{n→∞} P(An).

Theorem 1.3.6. If A ⊂ B, then

        P(A) ≤ P(B).

Proof. Let us write

        B = A + (B − A).

The events A and B − A are exclusive; hence, by axiom III,

        P(B) = P(A) + P(B − A) ≥ P(A).
Example 1.4.3. Let us toss a coin three times. What is the probability that heads appear twice?
The number of all possible combinations which may occur as a result of three successive tosses equals 2³ = 8. Denote the appearance of heads by H and the appearance of tails by T. We have the following possible combinations:

        HHH, HHT, HTH, THH, HTT, THT, TTH, TTT.

Heads appear exactly twice in three of them, namely HHT, HTH, and THH. If every possible result of the n tosses is equally probable, the probability that heads appear m times is

(1.4.1)        (1/2^n) (n choose m) = n! / [2^n m! (n − m)!];

hence the required probability equals 3/8.
Example 1.4.4. Compute the probability that heads appear at least twice in three successive tosses of a coin.
The random event under consideration will occur if in three tosses heads appear two or three times. According to formula (1.4.1), the probability that heads appear three times equals

        3! / (2³ 3! 0!) = 1/8,

and the probability that heads appear twice equals 3/8, as we already know. Hence, according to axiom III, the required probability is

        1/8 + 3/8 = 1/2.
In examples 1.4.1 to 1.4.4 the equiprobability of all elementary events was assumed. This assumption was obviously satisfied in our examples, but it is not always acceptable.
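Examples 1.4.3 and 1.4.4 can be checked by enumerating all equally probable outcomes; the Python sketch below (a modern addition to the text) also compares the counts with formula (1.4.1).

```python
from itertools import product
from math import comb

outcomes = list(product("HT", repeat=3))   # the 2**3 = 8 equally probable results
assert len(outcomes) == 8

p_exactly_two = sum(o.count("H") == 2 for o in outcomes) / 8
p_at_least_two = sum(o.count("H") >= 2 for o in outcomes) / 8

# formula (1.4.1): heads m times in n tosses has probability C(n, m) / 2**n
assert p_exactly_two == comb(3, 2) / 2**3
assert p_at_least_two == (comb(3, 2) + comb(3, 3)) / 2**3

print(p_exactly_two, p_at_least_two)  # 0.375 0.5
```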
Example 1.5.1. A. Markov [4] has investigated the probability of the appearance of these pairs of letters in Russian: vowel after vowel, and vowel after consonant. To compute these probabilities he counted the corresponding pairs of letters in Pushkin's poem Eugene Onegin on the basis of a text of 20,000 letters, and he accepted the observed frequencies as probabilities.¹ The experiment yielded the following results: there were 8638 vowels, and the pair "vowel after vowel" appeared 1104 times.
Let us analyze this example. Denote a vowel by a and a consonant by b. As elementary events we shall consider the pairs aa, ba, ab, bb; the set of elementary events is then (aa, ab, ba, bb).
Consider event B that a pair of letters will appear in which a vowel is in second place. Event B may be written as (aa, ba). It is known that a vowel appears 8638 times. These vowels follow either another vowel (in the pairs aa) or a consonant (in the pairs ba). Because no vowel appears at the beginning of the text considered,

        "Мой дядя самых честных правил ...,"²

we obtain

        P(B) = 8638 / 20,000 = 0.432.

Consider now event A that the pair of letters occurs with a vowel in first place. Event A may be written as (aa, ab).

¹ The methods of verification of such hypotheses are given in Part 2 of this book.
² It means, "My uncle's shown his good intentions."
Since the pair "vowel after vowel" appeared 1104 times among the 8638 pairs whose first letter is a vowel, the frequency of a vowel following a vowel equals

        1104 / 8638 = 0.128.
B
In general, let B be an event in the set of elementary events E. The set B is then an element of the Borel field Z of subsets of the set E of all elementary events. Suppose P(B) > 0.
Let us consider B as a new set of elementary events and denote by Z' the Borel field of all subsets of B which belong to the field Z.
Consider an arbitrary event A from the field Z. It may happen in particular cases that the event A belongs to the field Z', namely, when A is a subset of B. If, however, A contains any element of E which does not belong to B, A is not an element of Z'; yet some part of A may be a random event in Z', namely, when A and B have common elements, that is, when the product AB is not empty.
Now let B denote a fixed element of the field Z, where P(B) > 0, while A runs over all possible elements of Z; then all elements of Z' are products of the form AB. To stress the fact that the product AB is now being considered as an element of Z' (and not of Z) we denote it by the symbol A | B and read: "A provided that B" or "A provided that B has occurred." If A contains B, A | B is the sure event (in the field Z').
Event A | B is illustrated by Fig. 1.5.1. Here square E represents the set of all elementary events, and circles A and B denote some random events. The shaded area represents the random event B, and the doubly shaded area represents the random event A | B, that is, "event A provided that B has occurred."
The probability of the event A | B in the field Z' will be denoted by P(A | B) and read: the conditional probability of A provided B has occurred.
As will be shown shortly, this probability can be defined by using the probability in the field Z; hence there is no need to postulate separately the existence of the probability P(A | B) and its properties.
Suppose that in a series of n observations the event B has occurred m times, and the event AB has occurred k times, where k ≤ m. The frequency of the event A, computed only for those observations in which B has occurred, is

(1.5.1)        k/m = (k/n) / (m/n).

Since the frequencies k/n and m/n oscillate about the probabilities P(AB) and P(B), respectively, we define the conditional probability by the formula

(1.5.2)        P(A | B) = P(AB) / P(B),        where P(B) > 0.

Similarly,

(1.5.3)        P(B | A) = P(AB) / P(A),        where P(A) > 0.

From (1.5.2) and (1.5.3) we obtain

(1.5.4)        P(AB) = P(B) P(A | B) = P(A) P(B | A).
For three events A1, A2, A3 with P(A1A2) > 0 we have

(1.5.5)        P(A3 | A1A2) = P(A1A2A3) / P(A1A2).

From (1.5.5) and (1.5.3) we obtain for the probability of the product of three events the relations

(1.5.6)        P(A1A2A3) = P(A1) P(A2 | A1) P(A3 | A1A2).
By induction, for n events we obtain

(1.5.7)        P(A1A2 ⋯ An) = P(A1) P(A2 | A1) P(A3 | A1A2) ⋯ P(An | A1 ⋯ An−1).

D
We shall show that the conditional probability satisfies axioms I to III.
We notice that the following inequality is true:

(1.5.8)        P(AB) ≤ P(B).

In fact, event B may occur either when event A occurs or when event A does not occur; hence

        B = AB ∪ ĀB,

where Ā is the complement of A. Thus AB ⊂ B, and from theorem 1.3.6 we obtain (1.5.8).
Since P(AB) ≥ 0 and P(B) > 0, we obtain from formula (1.5.8)

        0 ≤ P(A | B) ≤ 1,

so that axiom I is satisfied. If A ⊃ B, then A | B is the sure event in the field Z' and P(A | B) = P(AB)/P(B) = P(B)/P(B) = 1; hence axiom II is satisfied.
Finally, let {Ai} be a finite or countable sequence of pairwise exclusive events of the field Z'. Then

        ∑_i (Ai | B) = (∑_i Ai) | B,

and hence

        P[(∑_i Ai) | B] = P[(∑_i Ai)B] / P(B) = P(∑_i AiB) / P(B) = ∑_i P(AiB) / P(B) = ∑_i P(Ai | B).

Thus axiom III is also satisfied.
Example 1.6.1. We have two urns. There are 3 white and 2 black balls in the first urn and 1 white and 4 black balls in the second. From an urn chosen at random we select one ball at random. What is the probability of obtaining a white ball if the probability of selecting each of the urns equals 0.5?
Denote by A1 and A2, respectively, the events of selecting the first or second urn, and by B the event of selecting a white ball. Event B may happen either together with event A1 or together with event A2; hence we have

        B = A1B + A2B,

and therefore

        P(B) = P(A1B) + P(A2B).

By formula (1.5.4) this may be written

(1.6.1)        P(B) = P(A1) P(B | A1) + P(A2) P(B | A2).

In this example we have P(A1) = P(A2) = 0.5, P(B | A1) = 0.6, and P(B | A2) = 0.2. Placing these values into (1.6.1), we obtain P(B) = 0.4.
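The value P(B) = 0.4 obtained from (1.6.1) agrees with a direct simulation of the two-urn experiment; the following Python sketch is a modern illustration, and the seed is arbitrary.

```python
import random

def draw_ball(rng):
    """Pick an urn with probability 0.5 each, then a ball at random from it."""
    urn = "WWWBB" if rng.random() < 0.5 else "WBBBB"   # urn 1: 3W 2B; urn 2: 1W 4B
    return rng.choice(urn)

rng = random.Random(7)   # arbitrary seed
n = 200_000
freq_white = sum(draw_ball(rng) == "W" for _ in range(n)) / n
print(freq_white)        # oscillates about P(B) = 0.4
```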
More generally, let the pairwise exclusive events A1, A2, ... with positive probabilities exhaust the set of elementary events, and let B be an arbitrary event. Then we obtain the formula for the total probability,

(1.6.2)        P(B) = P(A1) P(B | A1) + P(A2) P(B | A2) + ⋯.

Indeed, B = A1B + A2B + ⋯, and

(1.6.3)        P(B) = P(A1B) + P(A2B) + ⋯;

applying (1.5.4) to each term, we obtain (1.6.2).
Under the same assumptions, formulas (1.5.2), (1.5.4), and (1.6.2) give, for P(B) > 0, the Bayes formula

(1.6.5)        P(Ai | B) = P(Ai) P(B | Ai) / [P(A1) P(B | A1) + P(A2) P(B | A2) + ⋯].
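Applied to the data of example 1.6.1, the Bayes formula answers the reverse question: given that a white ball was drawn, how probable is it that it came from the first urn? The Python sketch below is a modern addition; the function name is our own.

```python
def posterior(priors, likelihoods, i):
    """P(A_i | B) computed by the Bayes formula (1.6.5)."""
    total = sum(p * l for p, l in zip(priors, likelihoods))  # denominator, cf. (1.6.2)
    return priors[i] * likelihoods[i] / total

# data of example 1.6.1: P(A1) = P(A2) = 0.5, P(B | A1) = 0.6, P(B | A2) = 0.2
p = posterior([0.5, 0.5], [0.6, 0.2], 0)
print(round(p, 6))  # 0.75: a white ball most likely came from the first urn
```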
The case in which

(1.7.1)        P(A | B) = P(A)

is of special importance. Then the fact that B has occurred does not have any influence on the probability of A; or, we could say, the probability of A is independent of the occurrence of B.
We notice that if (1.7.1) is satisfied, formula (1.5.4) gives

(1.7.2)        P(AB) = P(A) P(B),

and hence also

(1.7.3)        P(B | A) = P(B).

Formula (1.7.2) was derived from formula (1.5.4), where it was assumed that P(A) > 0 and P(B) > 0; nevertheless, this formula is also valid when one of these probabilities equals zero.
We now define the notion of independence of two random events.
Definition 1.7.1. Two events A and B are called independent if their probabilities satisfy (1.7.2), that is, if the probability of the product AB is equal to the product of the probabilities of A and B.
It follows from this definition that the notion of independence of two events is symmetric with respect to these events.
As we have already established, formula (1.7.2) can be obtained from either of the formulas (1.7.1) and (1.7.3). We also notice that formulas (1.5.4) and (1.7.2) with P(A) > 0 and P(B) > 0 give formulas (1.7.1) and (1.7.3). We thus deduce that each of the last formulas is a necessary and sufficient condition for the independence of events A and B with positive probabilities.
B
The notion of independence of two events can be generalized to the case of any finite number of events.
Definition 1.7.2. Events A1, A2, ..., An are independent if for all integer indices k1, k2, ..., ks satisfying the conditions

        1 ≤ k1 < k2 < ⋯ < ks ≤ n        (s = 2, 3, ..., n),

we have

(1.7.4)        P(Ak1 Ak2 ⋯ Aks) = P(Ak1) P(Ak2) ⋯ P(Aks).
It is possible that A1, A2, ..., An are pairwise independent, that is, each two events among A1, A2, ..., An are independent, that each three of them are independent, and so on, and yet A1, A2, ..., An are not independent. This is illustrated by the example given by Bernstein.
Example 1.7.1. There are four slips of paper of identical size in an urn. Each slip is marked with one of the numbers 110, 101, 011, 000, and there are no two slips marked with the same number. Consider event A1 that on the slip selected the number 1 appears in the first place, event A2 that 1 appears in the second place, and A3 that 1 appears in the third place. The number of slips of each category is 2, and the number of all slips is 4; hence, if we assume that each slip has the same probability of being selected, we have

        P(A1) = P(A2) = P(A3) = 1/2.

Let A denote the product A1A2A3. P(A) = 0, since event A is impossible (no slip is marked 111); hence P(A) ≠ P(A1) P(A2) P(A3) = 1/8, and the events A1, A2, A3 are not independent. On the other hand,

        P(A1A2) = 1/4 = P(A1) P(A2),

since there are only two slips having 1 in the first place, and only one among them with 1 in the second place. In a similar way we could show that the remaining pairs are independent.
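Bernstein's example is small enough to be checked exhaustively; the following Python sketch (a modern addition to the text) enumerates the four slips directly.

```python
from itertools import combinations

slips = ["110", "101", "011", "000"]                # four equally probable slips
def P(event):
    """Probability of an event, given as a predicate on a slip."""
    return sum(event(s) for s in slips) / len(slips)

A = [lambda s, i=i: s[i] == "1" for i in range(3)]  # events A1, A2, A3

# every pair satisfies P(Ai Aj) = P(Ai) P(Aj) ...
for Ai, Aj in combinations(A, 2):
    assert P(lambda s: Ai(s) and Aj(s)) == P(Ai) * P(Aj)

# ... but the triple does not: P(A1 A2 A3) = 0, not 1/8
p_all = P(lambda s: all(Ai(s) for Ai in A))
print(p_all, P(A[0]) * P(A[1]) * P(A[2]))  # 0.0 0.125
```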
Independence of a countable number of events is defined in the following way.
Definition 1.7.3. Events A1, A2, ... are independent if for every n = 2, 3, ... the events A1, A2, ..., An are independent.
From definitions 1.7.3 and 1.7.2 it follows that if the random events A1, A2, ... are independent, then for n = 2, 3, ... and for arbitrary indices k1, k2, ..., kn the events Ak1, Ak2, ..., Akn are independent.
To stress the fact that we consider independence in the sense of definitions 1.7.2 and 1.7.3, and not the independence of pairs, triples, and so on, the term independence en bloc, or mutual independence, is used in probability theory. We shall avoid these terms, and independence will always be understood in the sense of the given definitions.
Problems and Complements
1.1. Prove that the operations of addition and multiplication of random events are commutative and satisfy the following associative and distributive laws:

        A1 + A2 = A2 + A1,        A1A2 = A2A1,
        A1 + (A2 + A3) = (A1 + A2) + A3,        A1(A2A3) = (A1A2)A3,
        A1(A2 + A3) = A1A2 + A1A3.

1.2. Prove that for n (n ≥ 2) random events

        ⋃_{i=1}^n Ai = A1 + (A2 − A1A2) + (A3 − A1A3 − A2A3) + ⋯ + (An − A1An − ⋯ − An−1An).
1.6. Prove that properties 1.2.2 and 1.2.5 follow from properties 1.2.1, 1.2.3, and 1.2.4.
1.7. Let {An} (n = 1, 2, ...) be an arbitrary sequence of random events. The random event A* which contains all the elementary events which belong to an infinite number of the events An will be called the upper limit of the sequence {An},

        A* = lim sup_{n→∞} An = ∏_{n=1}^∞ ∑_{k=n}^∞ Ak.

The random event A_* which contains all the elementary events which belong to all but a finite number of the events An will be called the lower limit of the sequence {An},

        A_* = lim inf_{n→∞} An = ∑_{n=1}^∞ ∏_{k=n}^∞ Ak.

1.8. Prove that if {An} is a nondecreasing sequence, then

        A* = A_* = lim_{n→∞} An = ∑_{n=1}^∞ An,

and if {An} is a nonincreasing sequence, then

        A* = A_* = lim_{n→∞} An = ∏_{n=1}^∞ An.
1.9. (a) Prove that for an arbitrary sequence of random events {An}

        P(lim sup_{n→∞} An) ≥ lim sup_{n→∞} P(An)

and

        P(lim inf_{n→∞} An) ≤ lim inf_{n→∞} P(An).

(b) Prove that if the sequence {An} has a limit (see Problems 1.5(a) and (b)), then

        P(lim_{n→∞} An) = lim_{n→∞} P(An).

1.10. Prove that

        P(⋃_{i=1}^n Ai) ≤ ∑_{i=1}^n P(Ai),        P(⋃_{i=1}^∞ Ai) ≤ ∑_{i=1}^∞ P(Ai).
In combinatorial Problems 1.11 to 1.16 assume that all the possible outcomes have the same probability.
1.11. A deck of cards contains 52 cards. Player G has been dealt 13 of them. Compute the probability that player G has
(a) exactly 3 aces,
(b) at least 3 aces,
(c) any 3 face cards of the same face value,
(d) any 3 cards of the same face value from the five highest denominations,
(e) any 3 cards of the same face value from the eight lowest denominations,
(f) any 3 cards of the same face value,
(g) three successive spades,
(h) at least three successive spades,
(i) three successive cards of any suit,
(j) at least three successive cards of any suit.
1.12. Three dice are thrown once. Compute the probability of obtaining
(a) face 2 on one die,
(b) face 3 on at least one die,
(c) an even sum,
(d) a sum divisible by 3,
(e) a sum exceeding 7,
(f) a sum smaller than 12,
(g) a sum which is a prime number.
1.13. (Chevalier de Méré's problem) Find which of the following two events is more likely to occur: (1) to obtain at least one ace in a simultaneous throw of four dice; (2) to obtain at least one double ace in a series of 24 throws of a pair of dice.
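For Problem 1.13 the two probabilities follow from the independence of successive throws; the numerical check below (a modern addition) confirms de Méré's empirical observation that the first event is the more likely.

```python
# P(at least one six in 4 throws of one die)
p1 = 1 - (5 / 6) ** 4
# P(at least one double six in 24 throws of a pair of dice)
p2 = 1 - (35 / 36) ** 24
print(round(p1, 4), round(p2, 4))  # 0.5177 0.4914
```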
1.14. (Banach's problem) A mathematician carries two boxes of matches,
each of which originally contained n matches. Each time he lights a cigarette
he selects one box at random. Compute the probability that when he eventually
selects an empty box, the other will contain r matches, where r = 0, 1, ... , n.
1.15. An urn contains m white and n − m black balls. Two players successively draw balls at random, putting the drawn ball back into the urn before the next drawing. The player who first succeeds in drawing a white ball wins. Compute the probability of winning for the player who starts the game.
1.16. There are 28 slips of paper; on each of them one letter is written. The letters and their frequencies are presented in the following table:

        Letter           a  c  e  h  i  j  l  m  n  o  s  t  y  ż
        Number of slips  3  1  3  1  1  2  2  1  2  2  2  4  2  2

The slips are then arranged in random order. What is the probability of obtaining the sentence "Sto lat sto lat niech żyje żyje nam"?
1.17. The famous poet Goethe once gave his guest, the famous chemist Runge, a box of coffee beans. Runge used this gift (at that time very valuable) for scientific experiments, and for the first time obtained pure caffeine. Is it possible to compute the probability of this event? If so, is the answer unique? What are the factors which determine the precise formulation of the random event whose probability we compute?
1.18. (Bertrand's paradox) A circle is drawn around an equilateral triangle with side a. Then a random chord is drawn in this circle. The event A occurs if and only if the length l of this chord satisfies the relation l > a. State the conditions under which (a) P(A) = 0.5, (b) P(A) = 1/3, (c) P(A) = 1/4. Should these results be considered as paradoxical?
1.19. (Buffon's problem) A needle of length 2l is thrown at random on a plane on which parallel lines are drawn at a distance 2a apart (a > l). What is the probability of the needle intersecting one of these lines?
1.20. The probability that both of a pair of twins are boys equals 0.32, and the probability that both of them are girls equals 0.28. Find the conditional probability that
(a) the second twin is a boy, provided the first is a boy,
(b) the second twin is a girl, provided the first is a girl.
Hint: Use example 1.1.3.
1.21. (a) What should n be in order that the probability of obtaining face 6 at least once in a series of n independent throws of a die will exceed 3/4?
(b) The events A1, A2, ... are independent and P(Aj) = p (j = 1, 2, ...). Find the least n such that
CHAPTER 2

Random Variables

2.1  THE CONCEPT OF A RANDOM VARIABLE