Ravish R Singh
Director
Thakur Ramnarayan College of Arts & Commerce
Mumbai, Maharashtra
Mukul Bhatt
Assistant Professor
Thakur Ramnarayan College of Arts & Commerce
Mumbai, Maharashtra
Information contained in this work has been obtained by McGraw Hill Education (India), from sources
believed to be reliable. However, neither McGraw Hill Education (India) nor its authors guarantee the accuracy
or completeness of any information published herein, and neither McGraw Hill Education (India) nor its
authors shall be responsible for any errors, omissions, or damages arising out of use of this information. This
work is published with the understanding that McGraw Hill Education (India) and its authors are supplying
information but are not attempting to render engineering or other professional services. If such services are
required, the assistance of an appropriate professional should be sought.
6.16 Chi-square (χ²) Test
6.17 Chi-square Test: Goodness of Fit
6.18 Chi-square Test for Independence of Attributes
7. Curve Fitting
7.1 Introduction
7.2 Least Square Method
7.3 Fitting of Linear Curves
7.4 Fitting of Quadratic Curves
7.5 Fitting of Exponential and Logarithmic Curves
Appendix
Index
Preface
Probability and Statistics is a key area of study in any engineering course. A sound
knowledge of this subject will help engineering students develop analytical skills, and
thus enable them to solve numerical problems encountered in real life, as well as apply
mathematical principles to physical problems, particularly in the field of engineering.
Users
This book is designed for the first year GTU engineering students pursuing the course
Probability and Statistics, Subject Code: 3130006, in their 3rd Semester. It covers the
complete GTU syllabus for the course on Probability and Statistics.
Objective
The crisp and complete explanation of topics will help students easily understand the basic
concepts. The tutorial approach (i.e., teach by example) followed in the text will enable
students to develop a logical perspective on solving problems.
Features
Each topic has been explained from the examination point of view, with the theory
presented in an easy-to-understand, student-friendly style. Full coverage of concepts is
supported by numerous solved examples of varied complexity, aligned to the latest GTU
syllabus. The fundamental, sequential explanation of topics is aided by examples and
exercises. The solutions of the examples follow a ‘tutorial’ approach, which will help
students from any background grasp the concepts. Exercises with answers immediately
follow the solved examples, enforcing a practice-based approach. We hope that students
will gain a logical understanding from the solved problems and then reinforce it by
solving similar exercise problems themselves. The unique blend of theory and application
caters to the requirements of both students and faculty.
Highlights
• Crisp content strictly as per the latest GTU syllabus of Probability and Statistics
• Comprehensive coverage with lucid presentation style
• Each section concludes with an exercise to test understanding of topics
• Rich exam-oriented pedagogy:
  – Solved examples within chapters: 360+
  – Unsolved exercises: 330+
Chapter Organization
The content spans the following 7 chapters, which wholly and sequentially cover each
module of the syllabus.
Chapter 1 introduces Probability.
Chapter 2 discusses Random Variables.
Chapter 3 presents Basic Statistics.
Chapter 4 covers Correlation and Regression.
Chapter 5 deals with Some Special Probability Distributions.
Chapter 6 presents Applied Statistics: Test of Hypothesis.
Chapter 7 presents Curve Fitting.
Acknowledgements
We are grateful to the following reviewers who reviewed sample chapters of the book and
generously shared their valuable comments:
We would also like to thank all the staff at McGraw Hill Education (India), especially Navneet
Kumar, Hemant K Jha, Satinder Singh Baveja and Anuj Shriwastava for coordinating with
us during the editorial, copyediting, and production stages of this book.
Our acknowledgements would be incomplete without a mention of the contribution of
all our family members. We extend a heartfelt thanks to them for always motivating and
supporting us throughout the project.
Constructive suggestions for the improvement of the book will always be welcome.
Ravish R Singh
Mukul Bhatt
Publisher’s Note
Remember to write to us. We look forward to receiving your feedback, comments and
ideas to enhance the quality of this book. You can reach us at [Link]@[Link].
Please mention the title and authors’ name as the subject. In case you spot piracy of this
book, please do let us know.
Roadmap to the Syllabus
Probability and Statistics
Subject Code: 3130006
CHAPTER 1: Probability
Go To
CHAPTER 2: Random Variables
Probability
1
Chapter Outline
1.1 Introduction
1.2 Some Important Terms and Concepts
1.3 Definitions of Probability
1.4 Theorems on Probability
1.5 Conditional Probability
1.6 Multiplicative Theorem for Independent Events
1.7 Bayes’ Theorem
1.1 Introduction
The concept of probability originated from the analysis of the games of chance. Even
today, a large number of problems exist which are based on the games of chance, such
as tossing of a coin, throwing of dice, and playing of cards. The utility of probability
in business and economics is most emphatically revealed in the field of predictions for
the future. Probability is a concept which measures the degree of uncertainty and that
of certainty as a corollary.
The word probability or ‘chance’ is used commonly in day-to-day life. Daily, we come
across sentences like ‘it may rain today’, ‘India may win the forthcoming cricket
match against Sri Lanka’, ‘the chances of making profits by investing in shares of
Company A are very bright’, etc. Each of the above sentences involves an element
of uncertainty. A numerical measure of uncertainty is provided by a very important
branch of mathematics called theory of probability. Before we study the probability
theory in detail, it is appropriate to explain certain terms which are essential for the
study of the theory of probability.
If the outcome is not unique but may be any one of the possible outcomes, the
experiment is called a random experiment, e.g., tossing a coin, throwing a dice.
(a) In tossing a coin, the events head or tail are mutually exclusive since both head
and tail cannot occur at the same time.
(b) In throwing a dice, all the six events, i.e., getting 1 or 2 or 3 or 4 or 5 or 6 are
mutually exclusive events.
6. Equally Likely Events The outcomes of a random experiment are said to be
equally likely if the occurrence of none of them is expected in preference to others. For
example, consider the following:
In throwing of two dice, the favourable events of getting the sum 5 are (1, 4), (4, 1),
(2, 3), (3, 2), i.e., 4 in number.
(a) In a random experiment of tossing of a coin, the sample space consists of two
elementary events.
S = {H, T}
1.4 Chapter 1 Probability
(b) In a random experiment of throwing of a dice, the sample space consists of six
elementary events.
S = {1, 2, 3, 4, 5, 6}
The elements of S can either be single elements or ordered pairs. If two coins are
tossed, each element of the sample space consists of the following ordered pairs:
S = {(H, H), (H, T), (T, H), (T, T)}
2. Event Any subset of a sample space is called an event. In the experiment of
throwing of a dice, the sample space is S = {1, 2, 3, 4, 5, 6}. Let A be the event that
an odd number appears on the dice. Then A = {1, 3, 5} is a subset of S. Similarly, let
B be the event of getting a number greater than 3. Then B = {4, 5, 6} is another subset
of S.
i.e., the probability of a union of mutually exclusive events is the sum of probabilities
of the events themselves.
Example 1
What is the probability that a leap year selected at random will have
53 Sundays?
Solution
A leap year has 366 days, i.e., 52 weeks and 2 days. These 2 days can occur in the
following possible ways:
(i) Monday and Tuesday (ii) Tuesday and Wednesday
(iii) Wednesday and Thursday (iv) Thursday and Friday
(v) Friday and Saturday (vi) Saturday and Sunday
(vii) Sunday and Monday
Number of exhaustive cases n = 7
Number of favourable cases m = 2 [cases (vi) and (vii) contain a Sunday]
Required probability $= \frac{m}{n} = \frac{2}{7}$
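The counting argument above can be checked with a short Python sketch. The names `DAYS` and `prob_53` are illustrative and not part of the text; the sketch assumes, as the solution does, that the 7 possible pairs of extra days are equally likely.

```python
from fractions import Fraction

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def prob_53(day: str) -> Fraction:
    """Probability that a leap year contains 53 occurrences of `day`."""
    # The 2 extra days are consecutive, wrapping around the week.
    pairs = [(DAYS[i], DAYS[(i + 1) % 7]) for i in range(7)]
    favourable = [p for p in pairs if day in p]
    return Fraction(len(favourable), len(pairs))

print(prob_53("Sunday"))  # 2/7
```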
1.3 Definitions of Probability 1.5
Example 2
Three unbiased coins are tossed. Find the probability of getting
(i) exactly two heads, (ii) at least one tail, (iii) at most two heads, (iv) a
head on the second coin, and (v) exactly two heads in succession.
Solution
When three coins are tossed, the sample space S is given by
S = {HHH, HTH, THH, HHT, TTT, THT, TTH, HTT}
$n(S) = 8$
(i) Let A be the event of getting exactly two heads.
A = {HTH, THH, HHT}
$n(A) = 3$
$P(A) = \frac{n(A)}{n(S)} = \frac{3}{8}$
(ii) Let B be the event of getting at least one tail.
B = {HTH, THH, HHT, TTT, THT, TTH, HTT}
$n(B) = 7$
$P(B) = \frac{n(B)}{n(S)} = \frac{7}{8}$
(iii) Let C be the event of getting at most two heads.
C = {HTH, THH, HHT, TTT, THT, TTH, HTT}
$n(C) = 7$
$P(C) = \frac{n(C)}{n(S)} = \frac{7}{8}$
(iv) Let D be the event of getting a head on the second coin.
D = {HHH, THH, HHT, THT}
$n(D) = 4$
$P(D) = \frac{n(D)}{n(S)} = \frac{4}{8} = \frac{1}{2}$
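As a cross-check of parts (i) to (iv), the sample space can be enumerated directly. This is a minimal sketch; the helper `prob` and the variable names are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction
from itertools import product

# All 8 outcomes of tossing three coins, e.g. "HTH".
S = ["".join(t) for t in product("HT", repeat=3)]

def prob(event) -> Fraction:
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

p_exactly_two_heads = prob(lambda s: s.count("H") == 2)
p_at_least_one_tail = prob(lambda s: "T" in s)
p_at_most_two_heads = prob(lambda s: s.count("H") <= 2)
p_head_on_second = prob(lambda s: s[1] == "H")

print(p_exactly_two_heads, p_at_least_one_tail,
      p_at_most_two_heads, p_head_on_second)  # 3/8 7/8 7/8 1/2
```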
Example 3
A fair dice is thrown. Find the probability of getting (i) an even number,
(ii) a perfect square, and (iii) an integer greater than or equal to 3.
Solution
When a dice is thrown, the sample space S is given by
S = {1, 2, 3, 4, 5, 6}
$n(S) = 6$
(i) Let A be the event of getting an even number.
A = {2, 4, 6}
$n(A) = 3$
$P(A) = \frac{n(A)}{n(S)} = \frac{3}{6} = \frac{1}{2}$
(ii) Let B be the event of getting a perfect square.
B = {1, 4}
$n(B) = 2$
$P(B) = \frac{n(B)}{n(S)} = \frac{2}{6} = \frac{1}{3}$
(iii) Let C be the event of getting an integer greater than or equal to 3.
C = {3, 4, 5, 6}
$n(C) = 4$
$P(C) = \frac{n(C)}{n(S)} = \frac{4}{6} = \frac{2}{3}$
Example 4
A card is drawn from a well-shuffled pack of 52 cards. Find the probability
of (i) getting a king card, (ii) getting a face card, (iii) getting a red card,
(iv) getting a card between 2 and 7, both inclusive, and (v) getting a
card between 2 and 8, both exclusive.
Solution
Total number of cards = 52
One card out of 52 cards can be drawn in ${}^{52}C_1$ ways.
$n(S) = {}^{52}C_1 = 52$
(i) Let A be the event of getting a king card. There are 4 king cards and one of them
can be drawn in ${}^{4}C_1$ ways.
$n(A) = {}^{4}C_1 = 4$
$P(A) = \frac{n(A)}{n(S)} = \frac{4}{52} = \frac{1}{13}$
(ii) Let B be the event of getting a face card. There are 12 face cards and one of them
can be drawn in ${}^{12}C_1$ ways.
$n(B) = {}^{12}C_1 = 12$
$P(B) = \frac{n(B)}{n(S)} = \frac{12}{52} = \frac{3}{13}$
(iii) Let C be the event of getting a red card. There are 26 red cards and one of them
can be drawn in ${}^{26}C_1$ ways.
$n(C) = {}^{26}C_1 = 26$
$P(C) = \frac{n(C)}{n(S)} = \frac{26}{52} = \frac{1}{2}$
(iv) Let D be the event of getting a card between 2 and 7, both inclusive. There are
6 such cards in each suit giving a total of 6 × 4 = 24 cards. One of them can be
drawn in ${}^{24}C_1$ ways.
$n(D) = {}^{24}C_1 = 24$
$P(D) = \frac{n(D)}{n(S)} = \frac{24}{52} = \frac{6}{13}$
(v) Let E be the event of getting a card between 2 and 8, both exclusive. There are 5
such cards in each suit giving a total of 5 × 4 = 20 cards. One of them can be drawn
in ${}^{20}C_1$ ways.
$n(E) = {}^{20}C_1 = 20$
$P(E) = \frac{n(E)}{n(S)} = \frac{20}{52} = \frac{5}{13}$
Example 5
A bag contains 2 black, 3 red, and 5 blue balls. Three balls are drawn
at random. Find the probability that the three balls drawn (i) are blue
(ii) consist of 2 blue and 1 red ball, and (iii) consist of exactly one black
ball.
Solution
Total number of balls = 10
3 balls out of 10 balls can be drawn in ${}^{10}C_3$ ways.
$n(S) = {}^{10}C_3 = 120$
(i) Let A be the event that the three balls drawn are blue. 3 blue balls out of 5 blue
balls can be drawn in ${}^{5}C_3$ ways.
$n(A) = {}^{5}C_3 = 10$
$P(A) = \frac{n(A)}{n(S)} = \frac{10}{120} = \frac{1}{12}$
(ii) Let B be the event that the three balls drawn consist of 2 blue and 1 red ball.
2 blue balls out of 5 blue balls can be drawn in ${}^{5}C_2$ ways. 1 red ball out of 3 red
balls can be drawn in ${}^{3}C_1$ ways.
$n(B) = {}^{5}C_2 \times {}^{3}C_1 = 30$
$P(B) = \frac{n(B)}{n(S)} = \frac{30}{120} = \frac{1}{4}$
(iii) Let C be the event that the three balls drawn consist of exactly one black ball, i.e.,
the remaining two balls are drawn from the 3 red and 5 blue balls. One black ball can
be drawn from 2 black balls in ${}^{2}C_1$ ways and the remaining 2 balls can be drawn
from 8 balls in ${}^{8}C_2$ ways.
$n(C) = {}^{2}C_1 \times {}^{8}C_2 = 56$
$P(C) = \frac{n(C)}{n(S)} = \frac{56}{120} = \frac{7}{15}$
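The combinations in this example can be evaluated with Python's standard `math.comb`. The sketch below only recomputes the three parts; the constant and variable names are chosen here for illustration.

```python
from fractions import Fraction
from math import comb

BLACK, RED, BLUE = 2, 3, 5
total = comb(BLACK + RED + BLUE, 3)  # 10C3 = 120 ways to draw 3 balls

# (i) all three blue
p_all_blue = Fraction(comb(BLUE, 3), total)
# (ii) two blue and one red
p_two_blue_one_red = Fraction(comb(BLUE, 2) * comb(RED, 1), total)
# (iii) exactly one black, the other two from the 8 non-black balls
p_one_black = Fraction(comb(BLACK, 1) * comb(RED + BLUE, 2), total)

print(p_all_blue, p_two_blue_one_red, p_one_black)  # 1/12 1/4 7/15
```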
Example 6
A class consists of 6 girls and 10 boys. If a committee of three is chosen
at random from the class, find the probability that (i) three boys are
selected, and (ii) exactly two girls are selected.
Solution
Total number of students = 16
A committee of 3 students from 16 students can be selected in ${}^{16}C_3$ ways.
$n(S) = {}^{16}C_3 = 560$
(i) Let A be the event that 3 boys are selected.
$n(A) = {}^{10}C_3 = 120$
$P(A) = \frac{n(A)}{n(S)} = \frac{120}{560} = \frac{3}{14}$
(ii) Let B be the event that exactly 2 girls are selected. 2 girls from 6 girls can be
selected in ${}^{6}C_2$ ways and one boy from 10 boys can be selected in ${}^{10}C_1$ ways.
$n(B) = {}^{6}C_2 \times {}^{10}C_1 = 150$
$P(B) = \frac{n(B)}{n(S)} = \frac{150}{560} = \frac{15}{56}$
Example 7
From a collection of 10 bulbs, of which 4 are defective, 3 bulbs are
selected at random and fitted into lamps. Find the probability that (i) all
three bulbs glow, and (ii) the room is lit.
Solution
Total number of bulbs = 10
3 bulbs can be selected from 10 bulbs in ${}^{10}C_3$ ways.
$n(S) = {}^{10}C_3 = 120$
(i) Let A be the event that all three bulbs glow. This event will occur when 3 bulbs are
selected from the 6 nondefective bulbs in ${}^{6}C_3$ ways.
$n(A) = {}^{6}C_3 = 20$
$P(A) = \frac{n(A)}{n(S)} = \frac{20}{120} = \frac{1}{6}$
(ii) Let B be the event that the room is lit. Let $\bar{B}$ be the event that the room is dark.
The event $\bar{B}$ will occur when 3 bulbs are selected from the 4 defective bulbs in ${}^{4}C_3$
ways.
$n(\bar{B}) = {}^{4}C_3 = 4$
$P(\bar{B}) = \frac{n(\bar{B})}{n(S)} = \frac{4}{120} = \frac{1}{30}$
$\therefore P(B) = 1 - P(\bar{B}) = 1 - \frac{1}{30} = \frac{29}{30}$
Example 8
There are 20 tickets numbered 1, 2, ..., 20. One ticket is drawn at random.
Find the probability that the ticket bears a number which is (i) even,
(ii) a perfect square, and (iii) multiple of 3.
Solution
There are 20 tickets numbered from 1 to 20.
$n(S) = 20$
(i) Let A be the event that a ticket bears a number which is even.
A = {2, 4, 6, 8, 10, 12, 14, 16, 18, 20}
$n(A) = 10$
$P(A) = \frac{n(A)}{n(S)} = \frac{10}{20} = \frac{1}{2}$
(ii) Let B be the event that a ticket bears a number which is a perfect square.
B = {1, 4, 9, 16}
$n(B) = 4$
$P(B) = \frac{n(B)}{n(S)} = \frac{4}{20} = \frac{1}{5}$
(iii) Let C be the event that a ticket bears a number which is a multiple of 3.
C = {3, 6, 9, 12, 15, 18}
$n(C) = 6$
$P(C) = \frac{n(C)}{n(S)} = \frac{6}{20} = \frac{3}{10}$
Example 9
Four letters of the word ‘THURSDAY’ are arranged in all possible ways.
Find the probability that the word formed is ‘HURT’.
Solution
Total number of letters in the word ‘THURSDAY’ = 8
Four letters from 8 letters can be arranged in ${}^{8}P_4$ ways.
$n(S) = {}^{8}P_4 = 1680$
Let A be the event that the word formed is ‘HURT’. The word ‘HURT’ can be formed
in one way only.
$n(A) = 1$
$P(A) = \frac{n(A)}{n(S)} = \frac{1}{1680}$
Example 10
A bag contains 5 red, 4 blue, and m green balls. If the probability of
getting two green balls when two balls are selected at random is $\frac{1}{7}$,
find m.
Solution
Total number of balls = 5 + 4 + m = 9 + m
2 balls out of 9 + m balls can be drawn in ${}^{9+m}C_2$ ways.
$n(S) = {}^{9+m}C_2$
Let A be the event that both the balls drawn are green.
2 green balls out of m green balls can be drawn in ${}^{m}C_2$ ways.
$n(A) = {}^{m}C_2$
$P(A) = \frac{n(A)}{n(S)} = \frac{{}^{m}C_2}{{}^{9+m}C_2}$
But $P(A) = \frac{1}{7}$
$\frac{{}^{m}C_2}{{}^{9+m}C_2} = \frac{1}{7}$
$\frac{m(m-1)}{(m+9)(m+8)} = \frac{1}{7}$
$(m+9)(m+8) = 7m(m-1)$
$m^2 + 17m + 72 = 7m^2 - 7m$
$6m^2 - 24m - 72 = 0$
$3m^2 - 12m - 36 = 0$
$3m^2 - 18m + 6m - 36 = 0$
$3m(m-6) + 6(m-6) = 0$
$(3m+6)(m-6) = 0$
$3m + 6 = 0$ or $m - 6 = 0$
$m = -2$ or $m = 6$
But $m \neq -2$
$\therefore m = 6$
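The value of m can also be confirmed by a brute-force search instead of solving the quadratic. This is only a sketch: the function name `p_two_green` and the search bound of 100 are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def p_two_green(m: int) -> Fraction:
    """P(both balls green) with 5 red, 4 blue and m green balls."""
    return Fraction(comb(m, 2), comb(9 + m, 2))

# Search bound 100 is arbitrary; the quadratic has only one positive root.
solutions = [m for m in range(2, 101) if p_two_green(m) == Fraction(1, 7)]
print(solutions)  # [6]
```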
Exercise 1.1
2. An unbiased coin is tossed twice. Find the probability of (i) exactly one
head, (ii) at most one head, (iii) at least one head, and (iv) same face on
both the coins.
[Ans.: (i) $\frac{1}{2}$ (ii) $\frac{3}{4}$ (iii) $\frac{3}{4}$ (iv) $\frac{1}{2}$]
3. A fair dice is thrown thrice. Find the probability that the sum of the
numbers obtained is 10.
[Ans.: $\frac{1}{8}$]
4. A ball is drawn at random from a box containing 12 red, 18 white, 19 blue,
and 15 orange balls. Find the probability that (i) it is red or blue, and
(ii) it is white, blue, or orange.
[Ans.: (i) $\frac{2}{5}$ (ii) $\frac{43}{55}$]
5. Eight boys and three girls are to sit in a row for a photograph. Find the
probability that no two girls are together.
[Ans.: $\frac{28}{55}$]
6. If four persons are chosen from a group of 3 men, 2 women, and 4
children, find the probability that exactly two of them will be children.
[Ans.: $\frac{10}{21}$]
7. A box contains 2 white, 3 red, and 5 black balls. Three balls are drawn at
random. What is the probability that they will be of different colours?
[Ans.: $\frac{1}{4}$]
8. Two cards are drawn from a well-shuffled pack of 52 cards. Find the
probability of getting (i) 2 king cards, (ii) 1 king card and 1 queen card,
and (iii) 1 king card and 1 spade card.
[Ans.: (i) $\frac{1}{221}$ (ii) $\frac{8}{663}$ (iii) $\frac{1}{26}$]
9. A four-digit number is to be formed using the digits 0, 1, 2, 3, 4, 5. All the
digits are to be different. Find the probability that the number formed is
(i) odd, (ii) greater than 4000, (iii) greater than 3400, and (iv) a multiple
of 5.
[Ans.: (i) $\frac{12}{25}$ (ii) $\frac{2}{5}$ (iii) $\frac{12}{25}$ (iv) $\frac{9}{25}$]
1.4 Theorems on Probability 1.13
$P(\overline{A \cup B}) = P(\bar{A} \cap \bar{B})$
$P(\overline{A \cap B}) = P(\bar{A} \cup \bar{B})$
$B = (A \cap B) \cup (\bar{A} \cap B)$
$P(B) = P[(A \cap B) \cup (\bar{A} \cap B)]$
Fig. 1.1
Since $(A \cap B)$ and $(\bar{A} \cap B)$ are mutually exclusive events,
$P(B) = P(A \cap B) + P(\bar{A} \cap B)$
$P(\bar{A} \cap B) = P(B) - P(A \cap B)$
Similarly, it can be shown that
$P(A \cap \bar{B}) = P(A) - P(A \cap B)$
$A \cup B = A \cup (\bar{A} \cap B)$
$P(A \cup B) = P[A \cup (\bar{A} \cap B)]$
Since A and $(\bar{A} \cap B)$ are mutually exclusive events,
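Remarks
The probability of occurrence of exactly one of the two events is
$P[(A \cap \bar{B}) \cup (\bar{A} \cap B)]$
$= P(A \cap \bar{B}) + P(\bar{A} \cap B)$ [∵ $(A \cap \bar{B}) \cap (\bar{A} \cap B) = \phi$]
$= P(A) - P(A \cap B) + P(B) - P(A \cap B)$ [Using Theorem 3]
$= P(A) + P(B) - 2P(A \cap B)$
$= P(A \cup B) - P(A \cap B)$ [Using Theorem 4]
= P(at least one of the two events occurs)
– P(the two events occur simultaneously)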
Corollary 3 The addition theorem can be applied for more than two events. If A,
B, and C are three events of a sample space S then the probability of occurrence of at
least one of them is given by
$P(A \cup B \cup C) = P[A \cup (B \cup C)]$
$= P(A) + P(B \cup C) - P[A \cap (B \cup C)]$
$= P(A) + P(B \cup C) - P[(A \cap B) \cup (A \cap C)]$
$= P(A) + P(B) + P(C) - P(B \cap C) - P(A \cap B) - P(A \cap C) + P(A \cap B \cap C)$
[Applying Theorem 4 on the second and third terms]
Alternatively, the probability of occurrence of at least one of the three events can also
be written as
$P(A \cup B \cup C) = 1 - P(\bar{A} \cap \bar{B} \cap \bar{C})$
If A, B, and C are mutually exclusive events,
$P(A \cup B \cup C) = P(A) + P(B) + P(C)$
Corollary 4 The probability of occurrence of at least two of the three events is
given by
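$P[(A \cap B) \cup (B \cap C) \cup (A \cap C)]$
$= P(A \cap B) + P(B \cap C) + P(A \cap C) - 3P(A \cap B \cap C) + P(A \cap B \cap C)$ [Using Corollary 3]
$= P(A \cap B) + P(B \cap C) + P(A \cap C) - 2P(A \cap B \cap C)$
The probability of occurrence of exactly two of the three events is given by
$P[(A \cap B \cap \bar{C}) \cup (A \cap \bar{B} \cap C) \cup (\bar{A} \cap B \cap C)]$
$= P[(A \cap B) \cup (B \cap C) \cup (A \cap C)] - P(A \cap B \cap C)$ [Using Corollary 2]
$= P(A \cap B) + P(B \cap C) + P(A \cap C) - 3P(A \cap B \cap C)$ [Using Corollary 4]
The probability of occurrence of exactly one of the three events is given by
$P[(A \cap \bar{B} \cap \bar{C}) \cup (\bar{A} \cap B \cap \bar{C}) \cup (\bar{A} \cap \bar{B} \cap C)]$
$= P(A) + P(B) + P(C) - 2P(A \cap B) - 2P(B \cap C) - 2P(A \cap C) + 3P(A \cap B \cap C)$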
Example 1
A card is drawn from a well-shuffled pack of cards. What is the probability
that it is either a spade or an ace?
Solution
Let A and B be the events of getting a spade and an ace card respectively.
$P(A) = \frac{{}^{13}C_1}{{}^{52}C_1} = \frac{13}{52}$
$P(B) = \frac{{}^{4}C_1}{{}^{52}C_1} = \frac{4}{52}$
$P(A \cap B) = \frac{{}^{1}C_1}{{}^{52}C_1} = \frac{1}{52}$
Probability of getting either a spade or an ace card
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$= \frac{13}{52} + \frac{4}{52} - \frac{1}{52}$
$= \frac{4}{13}$
Example 2
Two cards are drawn from a pack of cards. Find the probability that they
will be both red or both pictures.
Solution
Let A and B be the events that both cards drawn are red and pictures respectively.
$P(A) = \frac{{}^{26}C_2}{{}^{52}C_2} = \frac{325}{1326}$
$P(B) = \frac{{}^{12}C_2}{{}^{52}C_2} = \frac{66}{1326}$
$P(A \cap B) = \frac{{}^{6}C_2}{{}^{52}C_2} = \frac{15}{1326}$
Probability that both cards drawn are red or pictures
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$= \frac{325}{1326} + \frac{66}{1326} - \frac{15}{1326}$
$= \frac{188}{663}$
Example 3
The probability that a contractor will get a plumbing contract is $\frac{2}{3}$
and the probability that he will not get an electric contract is $\frac{5}{9}$. If the
probability of getting any one contract is $\frac{4}{5}$, what is the probability that
he will get both the contracts?
Solution
Let A and B be the events that the contractor will get plumbing and electric contracts
respectively.
$P(A) = \frac{2}{3}, \quad P(\bar{B}) = \frac{5}{9}, \quad P(A \cup B) = \frac{4}{5}$
$P(B) = 1 - P(\bar{B}) = 1 - \frac{5}{9} = \frac{4}{9}$
Probability that the contractor will get any one contract
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Probability that the contractor will get both the contracts
$P(A \cap B) = P(A) + P(B) - P(A \cup B)$
$= \frac{2}{3} + \frac{4}{9} - \frac{4}{5}$
$= \frac{14}{45}$
Example 4
A person applies for a job in two firms A and B, the probability of his
being selected in the firm A is 0.7 and being rejected in the firm B is 0.5.
The probability of at least one of the applications being rejected is 0.6.
What is the probability that he will be selected in one of the two firms?
Solution
Let A and B be the events that the person is selected in firms A and B respectively.
Example 5
In a group of 1000 persons, there are 650 who can speak Hindi, 400 can
speak English, and 150 can speak both Hindi and English. If a person is
selected at random, what is the probability that he speaks (i) Hindi only,
(ii) English only, (iii) only one of the two languages, and (iv) at least one of
the two languages?
Solution
Let A and B be the events that a person selected at random speaks Hindi and English
respectively.
$P(A) = \frac{650}{1000}, \quad P(B) = \frac{400}{1000}, \quad P(A \cap B) = \frac{150}{1000}$
(i) Probability that a person selected at random speaks Hindi only
$P(A \cap \bar{B}) = P(A) - P(A \cap B)$
$= \frac{650}{1000} - \frac{150}{1000}$
$= \frac{1}{2}$
(ii) Probability that a person selected at random speaks English only
$P(\bar{A} \cap B) = P(B) - P(A \cap B)$
$= \frac{400}{1000} - \frac{150}{1000}$
$= \frac{1}{4}$
(iii) Probability that a person selected at random speaks only one of the languages
$P[(A \cap \bar{B}) \cup (\bar{A} \cap B)] = P(A) + P(B) - 2P(A \cap B)$
$= \frac{650}{1000} + \frac{400}{1000} - 2\left(\frac{150}{1000}\right)$
$= \frac{3}{4}$
(iv) Probability that a person selected at random speaks at least one of the two
languages
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$= \frac{650}{1000} + \frac{400}{1000} - \frac{150}{1000}$
$= \frac{9}{10}$
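The four results of this example follow mechanically from P(A), P(B), and P(A ∩ B). The sketch below recomputes them with exact fractions; the variable names are illustrative assumptions.

```python
from fractions import Fraction

N = 1000
p_h = Fraction(650, N)     # P(A): speaks Hindi
p_e = Fraction(400, N)     # P(B): speaks English
p_both = Fraction(150, N)  # P(A and B)

hindi_only = p_h - p_both
english_only = p_e - p_both
exactly_one = p_h + p_e - 2 * p_both
at_least_one = p_h + p_e - p_both

print(hindi_only, english_only, exactly_one, at_least_one)  # 1/2 1/4 3/4 9/10
```

Note that `exactly_one` equals `hindi_only + english_only`, which is the statement that the two "only" events are mutually exclusive.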
Example 6
A box contains 4 white, 6 red, 5 black balls, and 5 balls of other colours.
Two balls are drawn from the box at random. Find the probability that
(i) both are white or both are red, and (ii) both are red or both are
black.
Solution
Let A, B, and C be the events that both balls drawn are white, both are red, and both are
black respectively.
$P(A) = \frac{{}^{4}C_2}{{}^{20}C_2} = \frac{3}{95}$
$P(B) = \frac{{}^{6}C_2}{{}^{20}C_2} = \frac{3}{38}$
$P(C) = \frac{{}^{5}C_2}{{}^{20}C_2} = \frac{1}{19}$
(i) Probability that both balls are white or both are red
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$= \frac{3}{95} + \frac{3}{38} - 0$
$= \frac{21}{190}$
(ii) Probability that both balls are red or both are black
$P(B \cup C) = P(B) + P(C) - P(B \cap C)$
$= \frac{3}{38} + \frac{1}{19} - 0$
$= \frac{5}{38}$
Example 7
Three students A, B, C are in a running race. A and B have the same
probability of winning and each is twice as likely to win as C. Find the
probability that B or C wins.
Solution
Let A, B, and C be the events that students A, B, and C win the race respectively.
$P(A) = P(B) = 2P(C)$
$P(A) + P(B) + P(C) = 1$
$2P(C) + 2P(C) + P(C) = 1$
$P(C) = \frac{1}{5}$
$\therefore P(A) = \frac{2}{5}$ and $P(B) = \frac{2}{5}$
Probability that student B or C wins
$P(B \cup C) = P(B) + P(C) - P(B \cap C)$
$= \frac{2}{5} + \frac{1}{5} - 0$
$= \frac{3}{5}$
Example 8
A card is drawn from a pack of 52 cards. Find the probability of getting
a king or a heart or a red card.
Solution
Let A, B and C be the events that the card drawn is a king, a heart and a red card
respectively.
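$P(A) = \frac{{}^{4}C_1}{{}^{52}C_1} = \frac{4}{52}$
$P(B) = \frac{{}^{13}C_1}{{}^{52}C_1} = \frac{13}{52}$
$P(C) = \frac{{}^{26}C_1}{{}^{52}C_1} = \frac{26}{52}$
$P(A \cap B) = \frac{{}^{1}C_1}{{}^{52}C_1} = \frac{1}{52}$
$P(B \cap C) = \frac{{}^{13}C_1}{{}^{52}C_1} = \frac{13}{52}$
$P(A \cap C) = \frac{{}^{2}C_1}{{}^{52}C_1} = \frac{2}{52}$
$P(A \cap B \cap C) = \frac{{}^{1}C_1}{{}^{52}C_1} = \frac{1}{52}$
Probability of getting a king or a heart or a red card
$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(B \cap C) - P(A \cap C) + P(A \cap B \cap C)$
$= \frac{4}{52} + \frac{13}{52} + \frac{26}{52} - \frac{1}{52} - \frac{13}{52} - \frac{2}{52} + \frac{1}{52}$
$= \frac{7}{13}$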
Example 9
From a city, 3 newspapers A, B, C are being published. A is read by
20%, B is read by 16%, C is read by 14%, both A and B are read by 8%,
both A and C are read by 5%, both B and C are read by 4% and all three
A, B, C are read by 2%. What is the probability that a randomly chosen
person (i) reads at least one of these newspapers, and (ii) reads none of
these newspapers?
Solution
Let A, B, and C be the events that the person reads newspapers A, B, and C respectively.
$P(A) = 0.2, \quad P(B) = 0.16, \quad P(C) = 0.14$
$P(A \cap B) = 0.08, \quad P(A \cap C) = 0.05, \quad P(B \cap C) = 0.04$
$P(A \cap B \cap C) = 0.02$
(i) Probability that the person reads at least one of these newspapers
$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$
$= 0.2 + 0.16 + 0.14 - 0.08 - 0.05 - 0.04 + 0.02$
$= 0.35$
(ii) Probability that the person reads none of these newspapers
$P(\bar{A} \cap \bar{B} \cap \bar{C}) = 1 - P(A \cup B \cup C)$
$= 1 - 0.35$
$= 0.65$
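Alternatively, the problem can be solved by a Venn diagram (Fig. 1.2).
(i) P(the person reads at least one paper) $= 1 - \frac{65}{100} = 0.35$
(ii) P(the person reads none of these papers) $= \frac{65}{100} = 0.65$
Fig. 1.2 [Venn diagram of A, B, and C, in per cent: A only 9, B only 6, C only 7, A and B only 6, A and C only 3, B and C only 2, all three 2, none 65]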
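The same figures can be reproduced with exact arithmetic, which avoids floating-point rounding when many terms are added and subtracted. This is a minimal sketch; the helper `pct` and the variable names are not from the text.

```python
from fractions import Fraction

def pct(x: int) -> Fraction:
    """Convert a percentage to an exact probability."""
    return Fraction(x, 100)

p_a, p_b, p_c = pct(20), pct(16), pct(14)
p_ab, p_ac, p_bc, p_abc = pct(8), pct(5), pct(4), pct(2)

# Addition theorem for three events.
at_least_one = p_a + p_b + p_c - p_ab - p_ac - p_bc + p_abc
none = 1 - at_least_one

print(float(at_least_one), float(none))  # 0.35 0.65
```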
Exercise 1.2
1. The probability that a student passes a Physics test is $\frac{2}{3}$ and the
probability that he passes both Physics and English tests is $\frac{14}{45}$. The
probability that he passes at least one test is $\frac{4}{5}$. What is the probability
that the student passes the English test?
[Ans.: $\frac{4}{9}$]
2. What is the probability of drawing a black card or a king from a well-
shuffled pack of playing cards?
[Ans.: $\frac{7}{13}$]
3. A pair of unbiased dice is thrown. Find the probability that (i) the sum
of spots is either 5 or 10, and (ii) either there is a doublet or a sum less
than 6.
[Ans.: (i) $\frac{7}{36}$ (ii) $\frac{7}{18}$]
[Ans.: (i) $\frac{2}{5225}$ (ii) $\frac{56}{5225}$ (iii) $\frac{38}{85}$ (iv) $\frac{47}{85}$
(v) $\frac{703}{1700}$ (vi) $\frac{839}{850}$ (vii) $\frac{997}{1700}$]
9. From a set of 16 cards numbered 1 to 16, one card is drawn at random.
Find the probability that (i) the number obtained is divisible by 3 or 7,
and (ii) not divisible by 3 and 7.
[Ans.: (i) $\frac{7}{16}$ (ii) $\frac{9}{16}$]
1.6 Multiplicative Theorem for Independent Events 1.25
10. There are 12 bulbs in a basket of which 4 are working. A person tries
to fit them in 3 sockets choosing 3 of the bulbs at random. What is
the probability that there will be (i) some light, and (ii) no light in the
room?
[Ans.: (i) $\frac{41}{55}$ (ii) $\frac{14}{55}$]
For any two events A and B in a sample space S, the probability of their simultaneous
occurrence, i.e., both the events occurring simultaneously, is given by
$P(A \cap B) = P(A)\,P(B/A)$
or $P(A \cap B) = P(B)\,P(A/B)$
where P(B/A) is the conditional probability of B given that A has already occurred, and
P(A/B) is the conditional probability of A given that B has already occurred.
If A and B are two independent events, the probability of their simultaneous occurrence
is given by
$P(A \cap B) = P(A)\,P(B)$
$P(A \cap B) = P(B)\,P(A/B)$ ...(1.1)
Proof $A = (A \cap B) \cup (A \cap \bar{B})$
Since $(A \cap B)$ and $(A \cap \bar{B})$ are mutually exclusive events,
Remark The additive law is used to find the probability of A or B, i.e., P(A » B).
The multiplicative law is used to find the probability of A and B, i.e., P(A « B).
Example 1
If A and B are two events such that $P(A) = \frac{2}{3}$, $P(\bar{A} \cap B) = \frac{1}{6}$ and
$P(A \cap B) = \frac{1}{3}$, find $P(B)$, $P(A \cup B)$, $P(A/B)$, $P(B/A)$, $P(\bar{A} \cup B)$ and
$P(\bar{B})$. Also, examine whether the events A and B are (i) equally likely,
(ii) exhaustive, (iii) mutually exclusive, and (iv) independent.
Solution
$P(B) = P(\bar{A} \cap B) + P(A \cap B) = \frac{1}{6} + \frac{1}{3} = \frac{1}{2}$
$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{2}{3} + \frac{1}{2} - \frac{1}{3} = \frac{5}{6}$
$P(A/B) = \frac{P(A \cap B)}{P(B)} = \frac{1/3}{1/2} = \frac{2}{3}$
$P(B/A) = \frac{P(A \cap B)}{P(A)} = \frac{1/3}{2/3} = \frac{1}{2}$
$P(\bar{A} \cup B) = P(\bar{A}) + P(B) - P(\bar{A} \cap B) = \frac{1}{3} + \frac{1}{2} - \frac{1}{6} = \frac{2}{3}$
$P(\bar{A} \cap \bar{B}) = 1 - P(A \cup B) = 1 - \frac{5}{6} = \frac{1}{6}$
$P(\bar{B}) = 1 - P(B) = 1 - \frac{1}{2} = \frac{1}{2}$
(i) Since $P(A) \neq P(B)$, A and B are not equally likely events.
(ii) Since $P(A \cup B) \neq 1$, A and B are not exhaustive events.
Example 2
If A and B are two events such that P(A) = 0.3, P(B) = 0.4,
P(A ∩ B) = 0.2, find (i) $P(A \cup B)$, (ii) $P(\bar{A}/B)$, and (iii) $P(A/\bar{B})$.
Solution
(i) $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$= 0.3 + 0.4 - 0.2$
$= 0.5$
(ii) $P(\bar{A}/B) = \frac{P(\bar{A} \cap B)}{P(B)}$
$= \frac{P(B) - P(A \cap B)}{P(B)}$
$= \frac{0.4 - 0.2}{0.4}$
$= 0.5$
(iii) $P(A/\bar{B}) = \frac{P(A \cap \bar{B})}{P(\bar{B})}$
$= \frac{P(A) - P(A \cap B)}{1 - P(B)}$
$= \frac{0.3 - 0.2}{1 - 0.4}$
$= \frac{1}{6}$
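The conditional probabilities in this example reduce to ratios of quantities already known. A short sketch with exact fractions, using illustrative variable names that are not part of the text, is:

```python
from fractions import Fraction

p_a = Fraction(3, 10)   # P(A)
p_b = Fraction(4, 10)   # P(B)
p_ab = Fraction(2, 10)  # P(A and B)

p_union = p_a + p_b - p_ab
# P(not A / B) = P(not A and B) / P(B) = [P(B) - P(A and B)] / P(B)
p_notA_given_B = (p_b - p_ab) / p_b
# P(A / not B) = P(A and not B) / P(not B) = [P(A) - P(A and B)] / [1 - P(B)]
p_A_given_notB = (p_a - p_ab) / (1 - p_b)

print(p_union, p_notA_given_B, p_A_given_notB)  # 1/2 1/2 1/6
```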
Example 3
If A and B are two events with $P(A) = \frac{1}{3}$, $P(B) = \frac{1}{4}$, $P(A \cap B) = \frac{1}{12}$.
Find (i) $P(A/B)$, (ii) $P(B/A)$, (iii) $P(B/\bar{A})$, and (iv) $P(A \cap \bar{B})$.
Solution
(i) $P(A/B) = \frac{P(A \cap B)}{P(B)} = \frac{1/12}{1/4} = \frac{1}{3}$
(ii) $P(B/A) = \frac{P(A \cap B)}{P(A)} = \frac{1/12}{1/3} = \frac{1}{4}$
(iii) $P(B/\bar{A}) = \frac{P(B \cap \bar{A})}{P(\bar{A})}$
$= \frac{P(B) - P(B \cap A)}{1 - P(A)}$
$= \frac{\frac{1}{4} - \frac{1}{12}}{1 - \frac{1}{3}}$
$= \frac{1}{4}$
(iv) $P(A \cap \bar{B}) = P(A) - P(A \cap B)$
$= \frac{1}{3} - \frac{1}{12}$
$= \frac{1}{4}$
Example 4
Find the probability of drawing a queen and a king from a pack of cards
in two consecutive draws, the cards drawn not being replaced.
Solution
Let A be the event that the card drawn in the first draw is a queen.
$P(A) = \frac{{}^{4}C_1}{{}^{52}C_1} = \frac{4}{52} = \frac{1}{13}$
Let B be the event that the card drawn in the second draw is a king given that the
first card drawn is a queen.
$P(B/A) = \frac{{}^{4}C_1}{{}^{51}C_1} = \frac{4}{51}$
Example 5
A bag contains 3 red and 4 white balls. Two draws are made without
replacement. What is the probability that both the balls are red?
Solution
Let A be the event that the ball drawn is red in the first draw.
$P(A) = \frac{3}{7}$
Let B be the event that the ball drawn is red in the second draw given that the first ball
drawn is red.
$P(B/A) = \frac{2}{6}$
Probability that both the balls are red
$P(A \cap B) = P(A)\,P(B/A)$
$= \frac{3}{7} \times \frac{2}{6}$
$= \frac{1}{7}$
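The multiplicative law used here can be compared with a direct count of all ordered draws without replacement. The sketch below assumes the balls are distinguishable by position in a list; the names `bag`, `draws`, and `both_red` are illustrative.

```python
from fractions import Fraction
from itertools import permutations

bag = ["R"] * 3 + ["W"] * 4  # 3 red and 4 white balls

# Ordered pairs of distinct ball positions: 7 * 6 = 42 equally likely draws.
draws = list(permutations(range(len(bag)), 2))
both_red = [d for d in draws if bag[d[0]] == "R" and bag[d[1]] == "R"]

p_enumerated = Fraction(len(both_red), len(draws))
p_multiplicative = Fraction(3, 7) * Fraction(2, 6)  # P(A) * P(B/A)

print(p_enumerated, p_multiplicative)  # 1/7 1/7
```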
Example 6
A bag contains 8 red and 5 white balls. Two successive draws of 3 balls
each are made such that (i) the balls are replaced before the second
trial, and (ii) the balls are not replaced before the second trial. Find the
probability that the first draw will give 3 white and the second, 3 red balls.
Solution
Let A be the event that all 3 balls obtained at the first draw are white, and B be the event
that all the 3 balls obtained at the second draw are red.
(i) When the balls are replaced before the second trial,
$P(A) = \frac{{}^{5}C_3}{{}^{13}C_3} = \frac{5}{143}$
$P(B) = \frac{{}^{8}C_3}{{}^{13}C_3} = \frac{28}{143}$
Probability that the first draw will give 3 white and the second, 3 red balls
$P(A \cap B) = P(A)\,P(B)$
$= \frac{5}{143} \times \frac{28}{143}$
$= \frac{140}{20449}$
(ii) When the balls are not replaced before the second trial,
$P(B/A) = \frac{{}^{8}C_3}{{}^{10}C_3} = \frac{7}{15}$
Probability that the first draw will give 3 white and the second, 3 red balls
$P(A \cap B) = P(A)\,P(B/A)$
$= \frac{5}{143} \times \frac{7}{15}$
$= \frac{7}{429}$
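The only difference between the two cases is the number of balls available for the second draw. A sketch with `math.comb` makes that explicit; the variable names are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

RED, WHITE = 8, 5
total = RED + WHITE

# First draw: 3 white balls out of all 13.
p_first_white = Fraction(comb(WHITE, 3), comb(total, 3))

# Second draw, 3 red balls: the bag has 13 balls again with replacement,
# but only 10 balls (8 red, 2 white) without replacement.
p_red_with = Fraction(comb(RED, 3), comb(total, 3))
p_red_without = Fraction(comb(RED, 3), comb(total - 3, 3))

with_replacement = p_first_white * p_red_with        # independent events
without_replacement = p_first_white * p_red_without  # P(A) * P(B/A)

print(with_replacement, without_replacement)  # 140/20449 7/429
```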
Example 7
From a bag containing 4 white and 6 black balls, two balls are drawn at
random. If the balls are drawn one after the other without replacement,
find the probability that the first ball is white and the second ball is
black.
Solution
Let A be the event that the first ball drawn is white and B be the event that the second
ball drawn is black given that the first ball drawn is white.
$P(A) = \frac{4}{10}$
$P(B/A) = \frac{6}{9}$
Probability that the first ball is white and the second ball is black
$P(A \cap B) = P(A)\,P(B/A)$
$= \frac{4}{10} \times \frac{6}{9}$
$= \frac{4}{15}$
Example 8
Data on readership of a certain magazine show that the proportion of
male readers under 35 is 0.40 and that over 35 is 0.20. If the proportion
of readers under 35 is 0.70, find the probability of subscribers that are
females over 35 years. Also, calculate the probability that a randomly
selected male subscriber is under 35 years of age.
Solution
Let A be the event that the reader of the magazine is a male. Let B be the event that
reader of the magazine is over 35 years of age.
\[ P(A \cap \bar{B}) = 0.40, \quad P(A \cap B) = 0.20, \quad P(\bar{B}) = 0.7 \]
\[ P(B) = 1 - P(\bar{B}) = 1 - 0.7 = 0.3 \]
(i) Probability of subscribers that are females over 35 years
\[ P(\bar{A} \cap B) = P(B) - P(A \cap B) = 0.3 - 0.2 = 0.1 \]
(ii) Probability that a randomly selected male subscriber is under 35 years of age
\[ P(\bar{B}/A) = \frac{P(A \cap \bar{B})}{P(A)} = \frac{P(A \cap \bar{B})}{P(A \cap B) + P(A \cap \bar{B})} = \frac{0.4}{0.2 + 0.4} = \frac{0.4}{0.6} = \frac{2}{3} \]
Example 9
From a city population, the probability of selecting (a) a male or a
smoker is 7/10, (b) a male smoker is 2/5, and (c) a male, if a smoker is
already selected, is 2/3. Find the probability of selecting (i) a nonsmoker,
(ii) a male, and (iii) a smoker, if a male is first selected.
Solution
Let A be the event that a male is selected. Let B be the event that a smoker is
selected.
\[ P(A \cup B) = \frac{7}{10}, \quad P(A \cap B) = \frac{2}{5}, \quad P(A/B) = \frac{2}{3} \]
(i) Probability of selecting a nonsmoker
\[ P(\bar{B}) = 1 - P(B) = 1 - \frac{P(A \cap B)}{P(A/B)} = 1 - \frac{2/5}{2/3} = \frac{2}{5} \]
(ii)
\[ P(B) = 1 - P(\bar{B}) = 1 - \frac{2}{5} = \frac{3}{5} \]
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \qquad \ldots (1) \]
Probability of selecting a male
\[ P(A) = P(A \cup B) + P(A \cap B) - P(B) \qquad \text{[Using Eq. (1)]} \]
\[ = \frac{7}{10} + \frac{2}{5} - \frac{3}{5} = \frac{1}{2} \]
(iii) Probability of selecting a smoker if a male is first selected
\[ P(B/A) = \frac{P(A \cap B)}{P(A)} = \frac{2/5}{1/2} = \frac{4}{5} \]
Example 10
Sixty per cent of the employees of the XYZ corporation are college
graduates. Of these, ten percent are in sales. Of the employees who
did not graduate from college, eighty percent are in sales. What is the
probability that
(i) an employee selected at random is in sales?
(ii) an employee selected at random is neither in sales nor a college
graduate?
Solution
Let A be the event that an employee is a college graduate. Let B be the event that an
employee is in sales.
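\[ P(A) = 0.6, \quad P(B/A) = 0.10, \quad P(B/\bar{A}) = 0.8 \]
(i) Probability that an employee selected at random is in sales
\[ P(B) = P(A)\,P(B/A) + P(\bar{A})\,P(B/\bar{A}) = (0.60 \times 0.10) + (0.40 \times 0.80) = 0.38 \]
(ii) Probability that an employee selected at random is neither in sales nor a college graduate
\[ P(\bar{A} \cap \bar{B}) = 1 - P(A \cup B) \]
\[ = 1 - [P(A) + P(B) - P(A)\,P(B/A)] \]
\[ = 1 - [0.60 + 0.38 - (0.60 \times 0.10)] \]
\[ = 0.08 \]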
Example 11
If A and B are two events such that P(A) = 3/8, P(B) = 5/8 and
P(A ∪ B) = 3/4, find P(A/B) and P(B/A). Show whether A and B are
independent.
Solution
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]
\[ \frac{3}{4} = \frac{3}{8} + \frac{5}{8} - P(A \cap B) \]
\[ P(A \cap B) = \frac{1}{4} \]
\[ P(A/B) = \frac{P(A \cap B)}{P(B)} = \frac{1/4}{5/8} = \frac{2}{5} \]
\[ P(B/A) = \frac{P(A \cap B)}{P(A)} = \frac{1/4}{3/8} = \frac{2}{3} \]
\[ P(A)\,P(B) = \frac{3}{8} \times \frac{5}{8} = \frac{15}{64} \]
\[ P(A \cap B) \neq P(A)\,P(B) \]
Hence, the events A and B are not independent.
Example 12
The probability that a student A solves a mathematics problem is 2/5 and
the probability that a student B solves it is 2/3. What is the probability
that (i) the problem is not solved, (ii) the problem is solved, and (iii) both
A and B, working independently of each other, solve the problem?
Solution
Let A and B be events that students A and B solve the problem respectively.
\[ P(A) = \frac{2}{5}, \quad P(B) = \frac{2}{3} \]
Events A and B are independent.
Probability that the student A does not solve the problem
\[ P(\bar{A}) = 1 - P(A) = 1 - \frac{2}{5} = \frac{3}{5} \]
Probability that the student B does not solve the problem
\[ P(\bar{B}) = 1 - P(B) = 1 - \frac{2}{3} = \frac{1}{3} \]
(i) Probability that the problem is not solved
\[ P(\bar{A} \cap \bar{B}) = P(\bar{A})\,P(\bar{B}) = \frac{3}{5} \times \frac{1}{3} = \frac{1}{5} \]
(ii) Probability that the problem is solved
\[ P(A \cup B) = 1 - P(\bar{A} \cap \bar{B}) = 1 - \frac{1}{5} = \frac{4}{5} \]
(iii) Probability that both A and B solve the problem
\[ P(A \cap B) = P(A)\,P(B) = \frac{2}{5} \times \frac{2}{3} = \frac{4}{15} \]
Example 13
The probability that the machine A will perform a usual function in
5 years' time is 1/4, while the probability that the machine B will perform
the function in 5 years' time is 1/3. Find the probability that both machines
will perform the usual function.
Solution
Let A and B be the events that machines A and B will perform the usual function
respectively.
\[ P(A) = \frac{1}{4} \]
\[ P(B) = \frac{1}{3} \]
Events A and B are independent.
Probability that both machines will perform the usual function
\[ P(A \cap B) = P(A)\,P(B) = \frac{1}{4} \times \frac{1}{3} = \frac{1}{12} \]
Example 14
A person A is known to hit a target in 3 out of 4 shots, whereas another
person B is known to hit the same target in 2 out of 3 shots. Find the
probability of the target being hit at all when they both try.
[Summer 2015]
Solution
Let A and B be the events that the persons A and B hit the target respectively.
\[ P(A) = \frac{3}{4} \]
\[ P(B) = \frac{2}{3} \]
Events A and B are independent.
Probability that the person A will not hit the target
\[ P(\bar{A}) = 1 - P(A) = 1 - \frac{3}{4} = \frac{1}{4} \]
Probability that the person B will not hit the target
\[ P(\bar{B}) = 1 - P(B) = 1 - \frac{2}{3} = \frac{1}{3} \]
Probability that the target is not hit at all
\[ P(\bar{A} \cap \bar{B}) = P(\bar{A})\,P(\bar{B}) = \frac{1}{4} \times \frac{1}{3} = \frac{1}{12} \]
Probability that the target is hit at all when they both try
\[ P(A \cup B) = 1 - P(\bar{A} \cap \bar{B}) = 1 - \frac{1}{12} = \frac{11}{12} \]
Aliter
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]
\[ = P(A) + P(B) - P(A)\,P(B) \qquad [\because A \text{ and } B \text{ independent}] \]
\[ = \frac{3}{4} + \frac{2}{3} - \frac{3}{4} \times \frac{2}{3} = \frac{11}{12} \]
Example 15
The odds against A speaking the truth are 4 : 6 while the odds in favour
of B speaking the truth are 7 : 3. What is the probability that A and B
contradict each other in stating the same fact?
Solution
Let A and B be events that A and B speak the truth respectively.
\[ P(A) = \frac{6}{10} \]
\[ P(B) = \frac{7}{10} \]
Events A and B are independent.
Probability that A speaks a lie
\[ P(\bar{A}) = 1 - P(A) = 1 - \frac{6}{10} = \frac{4}{10} \]
Probability that B speaks a lie
\[ P(\bar{B}) = 1 - P(B) = 1 - \frac{7}{10} = \frac{3}{10} \]
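The solution as extracted stops before its final step. A minimal sketch of that step follows, assuming that A and B contradict each other exactly when one of them speaks the truth and the other lies; the variable names are chosen for illustration only.

```python
from fractions import Fraction

p_a = Fraction(6, 10)   # odds against A speaking the truth are 4 : 6
p_b = Fraction(7, 10)   # odds in favour of B speaking the truth are 7 : 3

# Contradiction: exactly one of the two independent events A, B occurs.
p_contradict = p_a * (1 - p_b) + (1 - p_a) * p_b
print(p_contradict)
```

Under that assumption, the two cases are mutually exclusive, so their probabilities are simply added.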
Example 16
An urn contains 10 red, 5 white and 5 blue balls. Two balls are drawn at
random. Find the probability that they are not of the same colour.
Solution
Let A, B, and C be the events that both balls drawn are red, both are white, and
both are blue respectively.
\[ P(A) = \frac{{}^{10}C_{2}}{{}^{20}C_{2}} = \frac{9}{38} \]
\[ P(B) = \frac{{}^{5}C_{2}}{{}^{20}C_{2}} = \frac{1}{19} \]
\[ P(C) = \frac{{}^{5}C_{2}}{{}^{20}C_{2}} = \frac{1}{19} \]
Events A, B, and C are mutually exclusive.
Probability that both balls drawn are of same colour
\[ P(A \cup B \cup C) = P(A) + P(B) + P(C) = \frac{9}{38} + \frac{1}{19} + \frac{1}{19} = \frac{13}{38} \]
Probability that both balls drawn are not of the same colour
\[ P(\bar{A} \cap \bar{B} \cap \bar{C}) = 1 - P(A \cup B \cup C) = 1 - \frac{13}{38} = \frac{25}{38} \]
Example 17
A problem in statistics is given to three students A, B and C, whose
chances of solving it independently are 1/2, 1/3, and 1/4 respectively. Find
the probability that
(i) the problem is solved
(ii) at least two of them are able to solve the problem
(iii) exactly two of them are able to solve the problem
(iv) exactly one of them is able to solve the problem
Solution
Let A, B, and C be the events that students A, B, and C solve the problem respec-
tively.
\[ P(A) = \frac{1}{2}, \quad P(B) = \frac{1}{3}, \quad P(C) = \frac{1}{4} \]
Events A, B, and C are independent.
(i) The problem is solved when at least one of them is able to solve it.
\[ P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C) \]
\[ = P(A) + P(B) + P(C) - P(A)P(B) - P(A)P(C) - P(B)P(C) + P(A)P(B)P(C) \]
\[ = \frac{1}{2} + \frac{1}{3} + \frac{1}{4} - \left(\frac{1}{2} \times \frac{1}{3}\right) - \left(\frac{1}{2} \times \frac{1}{4}\right) - \left(\frac{1}{3} \times \frac{1}{4}\right) + \left(\frac{1}{2} \times \frac{1}{3} \times \frac{1}{4}\right) = \frac{3}{4} \]
(ii) Probability that at least two of them are able to solve the problem
\[ P[(A \cap B) \cup (B \cap C) \cup (A \cap C)] = P(A \cap B) + P(B \cap C) + P(A \cap C) - 2P(A \cap B \cap C) \]
\[ = P(A)P(B) + P(B)P(C) + P(A)P(C) - 2P(A)P(B)P(C) \]
\[ = \left(\frac{1}{2} \times \frac{1}{3}\right) + \left(\frac{1}{3} \times \frac{1}{4}\right) + \left(\frac{1}{2} \times \frac{1}{4}\right) - 2\left(\frac{1}{2} \times \frac{1}{3} \times \frac{1}{4}\right) = \frac{7}{24} \]
(iii) Probability that exactly two of them are able to solve the problem
\[ P[(A \cap B \cap \bar{C}) \cup (A \cap \bar{B} \cap C) \cup (\bar{A} \cap B \cap C)] \]
\[ = P(A \cap B) + P(B \cap C) + P(A \cap C) - 3P(A \cap B \cap C) \]
\[ = P(A)P(B) + P(B)P(C) + P(A)P(C) - 3P(A)P(B)P(C) \]
\[ = \left(\frac{1}{2} \times \frac{1}{3}\right) + \left(\frac{1}{3} \times \frac{1}{4}\right) + \left(\frac{1}{2} \times \frac{1}{4}\right) - 3\left(\frac{1}{2} \times \frac{1}{3} \times \frac{1}{4}\right) = \frac{1}{4} \]
(iv) Probability that exactly one of them is able to solve the problem
\[ P[(A \cap \bar{B} \cap \bar{C}) \cup (\bar{A} \cap B \cap \bar{C}) \cup (\bar{A} \cap \bar{B} \cap C)] \]
\[ = P(A) + P(B) + P(C) - 2P(A \cap B) - 2P(B \cap C) - 2P(A \cap C) + 3P(A \cap B \cap C) \]
\[ = \frac{1}{2} + \frac{1}{3} + \frac{1}{4} - 2\left(\frac{1}{2} \times \frac{1}{3}\right) - 2\left(\frac{1}{3} \times \frac{1}{4}\right) - 2\left(\frac{1}{2} \times \frac{1}{4}\right) + 3\left(\frac{1}{2} \times \frac{1}{3} \times \frac{1}{4}\right) = \frac{11}{24} \]
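The four answers of Example 17 can be cross-checked by listing all eight solve/fail outcomes instead of applying the inclusion-exclusion formulas. This is a minimal sketch; the names p_solve and dist are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

p_solve = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]  # students A, B, C

# dist[k] = probability that exactly k of the three students solve the problem
dist = [Fraction(0)] * 4
for outcome in product([True, False], repeat=3):
    p = Fraction(1)
    for solved, q in zip(outcome, p_solve):
        p *= q if solved else 1 - q   # independence: multiply the factors
    dist[sum(outcome)] += p

at_least_one = 1 - dist[0]
at_least_two = dist[2] + dist[3]
print(at_least_one, at_least_two, dist[2], dist[1])
```

Enumeration scales poorly with the number of events, but for three students it gives an independent check on each part.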
Example 18
A husband and wife appeared in an interview for two vacancies in an
office. The probability of the husband's selection is 1/7 and that of the
wife's selection is 1/5. Find the probability that (i) both of them are
selected, (ii) only one of them is selected, (iii) none of them is selected,
and (iv) at least one of them is selected.
Solution
Let A and B be the events that the husband and wife are selected respectively.
\[ P(A) = \frac{1}{7}, \quad P(B) = \frac{1}{5} \]
Events A and B are independent.
(i) Probability that both of them are selected
\[ P(A \cap B) = P(A)\,P(B) = \frac{1}{7} \times \frac{1}{5} = \frac{1}{35} \]
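Parts (ii) to (iv) do not appear in the text as extracted. The sketch below computes all four parts from the stated probabilities, assuming, as the solution does, that the two selections are independent; the variable names are illustrative.

```python
from fractions import Fraction

p_h = Fraction(1, 7)   # husband is selected
p_w = Fraction(1, 5)   # wife is selected

both = p_h * p_w
only_one = p_h * (1 - p_w) + (1 - p_h) * p_w
none = (1 - p_h) * (1 - p_w)
at_least_one = 1 - none
print(both, only_one, none, at_least_one)
```

The events "both", "only one" and "none" are mutually exclusive and exhaustive, so their probabilities add up to 1.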
Example 19
There are two bags. The first contains 2 red and 1 white ball, whereas
the second bag has only 1 red and 2 white balls. One ball is taken out at
random from the first bag and put in the second. Then a ball is chosen at
random from the second bag. What is the probability that this last ball
is red?
Solution
There are two mutually exclusive cases.
Case I: A red ball is transferred from the first bag to the second bag and a red ball is
drawn from it.
Case II: A white ball is transferred from the first bag to the second bag and then a red
ball is drawn from it.
Let A be the event of transferring a red ball from the first bag, and B be the event of
transferring a white ball from the first bag.
\[ P(A) = \frac{2}{3} \]
\[ P(B) = \frac{1}{3} \]
Let E be the event of drawing a red ball from the second bag.
\[ P(E/A) = \frac{2}{4} \]
\[ P(E/B) = \frac{1}{4} \]
\[ P(\text{Case I}) = P(A \cap E) = P(A)\,P(E/A) = \frac{2}{3} \times \frac{2}{4} = \frac{1}{3} \]
\[ P(\text{Case II}) = P(B \cap E) = P(B)\,P(E/B) = \frac{1}{3} \times \frac{1}{4} = \frac{1}{12} \]
\[ P[(A \cap E) \cup (B \cap E)] = P(A \cap E) + P(B \cap E) = \frac{1}{3} + \frac{1}{12} = \frac{5}{12} \]
Example 20
An urn contains four tickets marked with numbers 112, 121, 211, and
222, and one ticket is drawn. Let Ai (i = 1, 2, 3) be the event that the ith
digit of the ticket drawn is 1. Show that the events A1, A2, A3 are pairwise
independent but not mutually independent.
Solution
\[ A_1 = \{112, 121\}, \quad A_2 = \{112, 211\}, \quad A_3 = \{121, 211\} \]
\[ A_1 \cap A_2 = \{112\}, \quad A_1 \cap A_3 = \{121\}, \quad A_2 \cap A_3 = \{211\} \]
\[ P(A_1) = \frac{2}{4} = \frac{1}{2} = P(A_2) = P(A_3) \]
\[ P(A_1 \cap A_2) = \frac{1}{4} = P(A_1 \cap A_3) = P(A_2 \cap A_3) \]
\[ P(A_1 \cap A_2) = P(A_1)\,P(A_2) = \frac{1}{4} \]
\[ P(A_2 \cap A_3) = P(A_2)\,P(A_3) = \frac{1}{4} \]
\[ P(A_1 \cap A_3) = P(A_1)\,P(A_3) = \frac{1}{4} \]
Hence, events A1, A2, and A3 are pairwise independent.
\[ P(A_1 \cap A_2 \cap A_3) = P(\phi) = 0 \]
\[ P(A_1 \cap A_2 \cap A_3) \neq P(A_1)\,P(A_2)\,P(A_3) \]
Hence, events A1, A2, and A3 are not mutually independent.
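The distinction between pairwise and mutual independence in Example 20 can be verified by direct counting. A minimal sketch, with the helper prob introduced only for illustration:

```python
from fractions import Fraction
from itertools import combinations

tickets = ["112", "121", "211", "222"]


def prob(event):
    """Probability of a set of tickets when every ticket is equally likely."""
    return Fraction(len(event), len(tickets))


# events[i] holds the tickets whose (i + 1)-th digit is 1
events = [{t for t in tickets if t[i] == "1"} for i in range(3)]

pairwise = all(
    prob(e1 & e2) == prob(e1) * prob(e2)
    for e1, e2 in combinations(events, 2)
)
mutual = prob(events[0] & events[1] & events[2]) == (
    prob(events[0]) * prob(events[1]) * prob(events[2])
)
print(pairwise, mutual)
```

The product rule holds for every pair of events but fails for the triple, because no ticket has 1 in all three positions.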
Exercise 1.3
What is the probability that the problem will be solved when both try
independently of each other?
[ans.: 37/55]
12. A bag contains 6 white and 9 black balls. Four balls are drawn at random
twice. Find the probability that the first draw will give 4 white balls
and the second draw will give 4 black balls if (i) the balls are replaced,
and (ii) the balls are not replaced before the second draw.
[ans.: (i) 6/5915 (ii) 3/715]
13. An urn contains 10 white and 3 black balls. Another urn contains 3
white and 5 black balls. Two balls are transferred from the first urn to
the second urn and then one ball is drawn from the latter. What is the
probability that the ball drawn is white?
[ans.: 5/26]
14. A man wants to marry a girl having the following qualities: fair
complexion (the probability of getting such a girl is 1/20), handsome
dowry (the probability is 1/50), and westernized manners and etiquettes
(the probability of this is 1/100). Find the probability of his getting
married to such a girl when the possessions of these three attributes are
independent.
[ans.: 1/100000]
15. A small town has one fire engine and one ambulance available for
emergencies. The probability that the fire engine is available when
needed is 0.98 and the probability that the ambulance is available when
called is 0.92. In the event of an injury resulting from a burning building,
find the probability that both the fire engine and ambulance will be
available.
[ans.: 0.9016 ]
16. In a certain community, 36% of the families own a dog and 22% of the
families that own a dog also own a cat. In addition, 30% of the families
own a cat. What is the probability that (i) a randomly selected family
owns both a dog and a cat, and (ii) a randomly selected family owns a
dog given that it owns a cat?
[ans.: (i) 0.0792 (ii) 0.264]
1.7 Bayes' Theorem
Let A1, A2, ..., An be n mutually exclusive and exhaustive events with P(Ai) ≠ 0 for
i = 1, 2, ..., n in a sample space S. Let B be an event that can occur in combination with
any one of the events A1, A2, ..., An with P(B) ≠ 0. The probability of the event Ai when
the event B has actually occurred is given by
\[ P(A_i/B) = \frac{P(A_i)\,P(B/A_i)}{\sum_{i=1}^{n} P(A_i)\,P(B/A_i)} \]
Proof Since A1, A2, ..., An are n mutually exclusive and exhaustive events of the
sample space S,
\[ S = A_1 \cup A_2 \cup \cdots \cup A_n \]
Since B is another event that can occur in combination with any of the mutually
exclusive and exhaustive events A1, A2, ..., An,
\[ B = (A_1 \cap B) \cup (A_2 \cap B) \cup \cdots \cup (A_n \cap B) \]
Taking probability of both the sides,
\[ P(B) = P(A_1 \cap B) + P(A_2 \cap B) + \cdots + P(A_n \cap B) = \sum_{i=1}^{n} P(A_i)\,P(B/A_i) \]
The conditional probability of an event Ai given that B has already occurred is given
by
\[ P(A_i/B) = \frac{P(A_i \cap B)}{P(B)} = \frac{P(A_i)\,P(B/A_i)}{P(B)} = \frac{P(A_i)\,P(B/A_i)}{\sum_{i=1}^{n} P(A_i)\,P(B/A_i)} \]
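The formula translates directly into a short routine. The following is a minimal sketch; the function name bayes_posteriors and the input figures are hypothetical and serve only to illustrate the computation.

```python
from fractions import Fraction


def bayes_posteriors(priors, likelihoods):
    """Return P(A_i / B) for every i, given P(A_i) and P(B / A_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(A_i) P(B / A_i)
    total = sum(joint)                                    # P(B)
    return [j / total for j in joint]


# Hypothetical data: two equally likely causes with P(B / A_1) = 1/5
# and P(B / A_2) = 3/5.
posteriors = bayes_posteriors(
    [Fraction(1, 2), Fraction(1, 2)],
    [Fraction(1, 5), Fraction(3, 5)],
)
print(posteriors[0], posteriors[1])
```

The denominator is computed once as P(B) and reused, which mirrors the last step of the proof. The posteriors always sum to 1 because the events A_i are exhaustive.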
Example 1
A company has two plants to manufacture hydraulic machines. Plant I
manufactures 70% of the hydraulic machines, and Plant II manufactures
30%. At Plant I, 80% of hydraulic machines are rated standard quality;
and at Plant II, 90% of hydraulic machines are rated standard quality.
A machine is picked up at random and is found to be of standard quality.
What is the chance that it has come from Plant I? [Summer 2015]
Solution
Let A1 and A2 be the events that the hydraulic machines are manufactured in Plant I
and Plant II respectively. Let B be the event that the machine picked up is found to be
of standard quality.
\[ P(A_1) = \frac{70}{100} = 0.7 \]
\[ P(A_2) = \frac{30}{100} = 0.3 \]
[Fig. 1.3]
Probability that the machine is of standard quality given that it is manufactured in
Plant I
\[ P(B/A_1) = \frac{80}{100} = 0.8 \]
Probability that the machine is of standard quality given that it is manufactured in
Plant II
\[ P(B/A_2) = \frac{90}{100} = 0.9 \]
Probability that a machine is manufactured in Plant I given that it is of standard quality
\[ P(A_1/B) = \frac{P(A_1)\,P(B/A_1)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2)} = \frac{0.7 \times 0.8}{0.7 \times 0.8 + 0.3 \times 0.9} = 0.6747 \]
Example 2
A bag A contains 2 white and 3 red balls, and a bag B contains 4 white
and 5 red balls. One ball is drawn at random from one of the bags and
it is found to be red. Find the probability that the red ball is drawn from
the bag B.
Solution
Let A1 and A2 be the events that the ball is drawn from bags A and B respectively. Let
B be the event that the ball drawn is red.
\[ P(A_1) = \frac{1}{2} \]
\[ P(A_2) = \frac{1}{2} \]
[Fig. 1.4: branches A1 (1/2) to B (3/5) and A2 (1/2) to B (5/9)]
Probability that the ball drawn is red given that it is drawn from the bag A
\[ P(B/A_1) = \frac{3}{5} \]
Probability that the ball drawn is red given that it is drawn from the bag B
\[ P(B/A_2) = \frac{5}{9} \]
Probability that the ball is drawn from the bag B given that it is red
\[ P(A_2/B) = \frac{P(A_2)\,P(B/A_2)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2)} = \frac{\frac{1}{2} \times \frac{5}{9}}{\left(\frac{1}{2} \times \frac{3}{5}\right) + \left(\frac{1}{2} \times \frac{5}{9}\right)} = \frac{25}{52} \]
Example 3
The chance that Doctor A will diagnose a disease X correctly is
60%. The chance that a patient will die under his treatment after a correct
diagnosis is 40% and the chance of death after a wrong diagnosis is 70%.
A patient of Doctor A, who had the disease X, died. What is the chance
that his disease was diagnosed correctly?
Solution
Let A1 be the event that the disease X is diagnosed correctly by Doctor A. Let A2 be the
event that the disease X is not diagnosed correctly by Doctor A. Let B be the event that
a patient of Doctor A who has the disease X, dies.
\[ P(A_1) = \frac{60}{100} = 0.6 \]
\[ P(A_2) = P(\bar{A}_1) = 1 - P(A_1) = 0.4 \]
Probability that the patient of Doctor A who has the disease X dies given that the
disease X is diagnosed correctly
\[ P(B/A_1) = \frac{40}{100} = 0.4 \]
Probability that the patient of Doctor A who has the disease X dies given that the
disease X is not diagnosed correctly
\[ P(B/A_2) = \frac{70}{100} = 0.7 \]
[Fig. 1.5: branches A1 (0.6) to B (0.4) and A2 (0.4) to B (0.7)]
Probability that the disease X is diagnosed correctly given that a patient of Doctor A
who has the disease X dies
\[ P(A_1/B) = \frac{P(A_1)\,P(B/A_1)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2)} = \frac{0.6 \times 0.4}{(0.6 \times 0.4) + (0.4 \times 0.7)} = \frac{6}{13} \]
Example 4
In a bolt factory, machines A, B, C manufacture 25%, 35%, and 40%
of the total output and, of their respective outputs, 5%, 4%, and 2%
are defective bolts. A bolt is drawn at random from the product and is
found to be defective. Find the probabilities that it is manufactured from
(i) Machine A, (ii) Machine B, and (iii) Machine C.
Solution
Let A1, A2 and A3 be the events that bolts are manufactured by machines A, B, and C
respectively. Let B be the event that the bolt drawn is defective.
\[ P(A_1) = \frac{25}{100} = 0.25 \]
\[ P(A_2) = \frac{35}{100} = 0.35 \]
\[ P(A_3) = \frac{40}{100} = 0.4 \]
[Fig. 1.6]
Probability that the bolt drawn is defective given that it is manufactured from
Machine A
\[ P(B/A_1) = \frac{5}{100} = 0.05 \]
Probability that the bolt drawn is defective given that it is manufactured from
Machine B
\[ P(B/A_2) = \frac{4}{100} = 0.04 \]
Probability that the bolt drawn is defective given that it is manufactured from
Machine C
\[ P(B/A_3) = \frac{2}{100} = 0.02 \]
(i) Probability that a bolt is manufactured from Machine A given that it is defective
\[ P(A_1/B) = \frac{P(A_1)\,P(B/A_1)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2) + P(A_3)\,P(B/A_3)} \]
\[ = \frac{0.25 \times 0.05}{(0.25 \times 0.05) + (0.35 \times 0.04) + (0.4 \times 0.02)} = 0.3623 \]
(ii) Probability that a bolt is manufactured from Machine B given that it is defective
\[ P(A_2/B) = \frac{P(A_2)\,P(B/A_2)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2) + P(A_3)\,P(B/A_3)} \]
\[ = \frac{0.35 \times 0.04}{(0.25 \times 0.05) + (0.35 \times 0.04) + (0.4 \times 0.02)} = 0.4058 \]
(iii) Probability that a bolt is manufactured from Machine C given that it is defective
\[ P(A_3/B) = \frac{P(A_3)\,P(B/A_3)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2) + P(A_3)\,P(B/A_3)} \]
\[ = \frac{0.4 \times 0.02}{(0.25 \times 0.05) + (0.35 \times 0.04) + (0.4 \times 0.02)} = 0.2319 \]
Example 5
A businessman goes to hotels X, Y, Z for 20%, 50%, 30% of the time
respectively. It is known that 5%, 4%, 8% of the rooms in X, Y, Z hotels
have faulty plumbing. Given that the businessman is assigned a room with
faulty plumbing, what is the probability that the room is in Hotel Z?
Solution
Let A1, A2 and A3 be the events that the businessman goes to hotels X, Y, Z respectively.
Let B be the event that the room has faulty plumbing.
\[ P(A_1) = \frac{20}{100} = 0.2 \]
\[ P(A_2) = \frac{50}{100} = 0.5 \]
\[ P(A_3) = \frac{30}{100} = 0.3 \]
[Fig. 1.7]
Probability that the room has faulty plumbing given that it belongs to Hotel X
\[ P(B/A_1) = \frac{5}{100} = 0.05 \]
Probability that the room has faulty plumbing given that it belongs to Hotel Y
\[ P(B/A_2) = \frac{4}{100} = 0.04 \]
Probability that the room has faulty plumbing given that it belongs to Hotel Z
\[ P(B/A_3) = \frac{8}{100} = 0.08 \]
Probability that the businessman's room belongs to Hotel Z given that the room has
faulty plumbing
\[ P(A_3/B) = \frac{P(A_3)\,P(B/A_3)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2) + P(A_3)\,P(B/A_3)} \]
\[ = \frac{0.3 \times 0.08}{(0.2 \times 0.05) + (0.5 \times 0.04) + (0.3 \times 0.08)} = \frac{4}{9} \]
Example 6
Of three persons the chances that a politician, a businessman, or an
academician would be appointed the Vice Chancellor (VC) of a university
are 0.5, 0.3, 0.2 respectively. Probabilities that research is promoted by
these persons if they are appointed as VC are 0.3, 0.7, 0.8 respectively.
(i) Determine the probability that research is promoted.
(ii) If research is promoted, what is the probability that the VC is an
academician?
Solution
Let A1, A2 and A3 be the events that a politician, a businessman or an academician will
be appointed as the VC respectively. Let B be the event that research is promoted.
\[ P(A_1) = 0.5 \]
\[ P(A_2) = 0.3 \]
\[ P(A_3) = 0.2 \]
[Fig. 1.8]
Probability that research is promoted given that a politician is appointed as VC
\[ P(B/A_1) = 0.3 \]
Probability that research is promoted given that a businessman is appointed as VC
\[ P(B/A_2) = 0.7 \]
Probability that research is promoted given that an academician is appointed as VC
\[ P(B/A_3) = 0.8 \]
(i) Probability that research is promoted
\[ P(B) = P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2) + P(A_3)\,P(B/A_3) \]
\[ = (0.5 \times 0.3) + (0.3 \times 0.7) + (0.2 \times 0.8) = 0.52 \]
(ii) Probability that the VC is an academician given that research is promoted
\[ P(A_3/B) = \frac{P(A_3)\,P(B/A_3)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2) + P(A_3)\,P(B/A_3)} = \frac{0.2 \times 0.8}{0.52} = \frac{4}{13} \]
Example 7
The contents of urns I, II, and III are as follows:
1 white, 2 red, and 3 black balls,
2 white, 3 red, and 1 black ball, and
3 white, 1 red, and 2 black balls.
One urn is chosen at random and two balls are drawn. They happen
to be white and red. Find the probability that they came from (i) Urn I,
(ii) Urn II, and (iii) Urn III.
Solution
Let A1, A2, and A3 be the events that urns I, II and III are chosen respectively. Let B be
the event that 2 balls drawn are white and red.
\[ P(A_1) = \frac{1}{3}, \quad P(A_2) = \frac{1}{3}, \quad P(A_3) = \frac{1}{3} \]
[Fig. 1.9]
Probability that 2 balls drawn are white and red given that they are chosen from the
urn I
\[ P(B/A_1) = \frac{{}^{1}C_{1} \times {}^{2}C_{1}}{{}^{6}C_{2}} = \frac{1 \times 2}{15} = \frac{2}{15} \]
Probability that 2 balls drawn are white and red given that they are chosen from the
urn II
\[ P(B/A_2) = \frac{{}^{2}C_{1} \times {}^{3}C_{1}}{{}^{6}C_{2}} = \frac{2 \times 3}{15} = \frac{6}{15} \]
Probability that 2 balls drawn are white and red given that they are chosen from the
urn III
\[ P(B/A_3) = \frac{{}^{3}C_{1} \times {}^{1}C_{1}}{{}^{6}C_{2}} = \frac{3 \times 1}{15} = \frac{3}{15} \]
(i) Probability that 2 balls came from the urn I given that they are white and red
\[ P(A_1/B) = \frac{P(A_1)\,P(B/A_1)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2) + P(A_3)\,P(B/A_3)} \]
\[ = \frac{\frac{1}{3} \times \frac{2}{15}}{\left(\frac{1}{3} \times \frac{2}{15}\right) + \left(\frac{1}{3} \times \frac{6}{15}\right) + \left(\frac{1}{3} \times \frac{3}{15}\right)} = \frac{2}{11} \]
(ii) Probability that 2 balls came from the urn II given that they are white and red
\[ P(A_2/B) = \frac{P(A_2)\,P(B/A_2)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2) + P(A_3)\,P(B/A_3)} \]
\[ = \frac{\frac{1}{3} \times \frac{6}{15}}{\left(\frac{1}{3} \times \frac{2}{15}\right) + \left(\frac{1}{3} \times \frac{6}{15}\right) + \left(\frac{1}{3} \times \frac{3}{15}\right)} = \frac{6}{11} \]
(iii) Probability that 2 balls came from the urn III given that they are white and red
\[ P(A_3/B) = \frac{P(A_3)\,P(B/A_3)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2) + P(A_3)\,P(B/A_3)} \]
\[ = \frac{\frac{1}{3} \times \frac{3}{15}}{\left(\frac{1}{3} \times \frac{2}{15}\right) + \left(\frac{1}{3} \times \frac{6}{15}\right) + \left(\frac{1}{3} \times \frac{3}{15}\right)} = \frac{3}{11} \]
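The three posterior probabilities of the urn example can be reproduced in a few lines. This is a minimal sketch; the tuple layout (white, red, black) and the variable names are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

# (white, red, black) counts in urns I, II and III
urns = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]
prior = Fraction(1, 3)   # each urn is equally likely to be chosen

# P(B / A_i): one white and one red ball among two balls drawn from urn i
likelihoods = [Fraction(w * r, comb(w + r + b, 2)) for w, r, b in urns]
joint = [prior * l for l in likelihoods]
posteriors = [j / sum(joint) for j in joint]
print(*posteriors)
```

Because the priors are equal, the posteriors are proportional to the likelihoods 2/15, 6/15 and 3/15.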
Exercise 1.4
1. There are 4 boys and 2 girls in Room A and 5 boys and 3 girls in Room B.
A girl from one of the two rooms laughed loudly. What is the probability
the girl who laughed was from Room B?
[ans.: 9/17]
2. The probabilities of X, Y, and Z becoming managers are 4/9, 2/9, and 1/3
respectively. The probabilities that the bonus scheme will be introduced
if X, Y, and Z become managers are 3/10, 1/2, and 4/5 respectively. (i) What
is the probability that the bonus scheme will be introduced? (ii) If the
bonus scheme has been introduced, what is the probability that the
manager appointed was X?
[ans.: (i) 23/45 (ii) 6/23]
3. A factory has two machines, A and B. Past records show that the machine A
produces 30% of the total output and the machine B, the remaining 70%.
Machine A produces 5% defective articles and Machine B produces 1%
defective items. An item is drawn at random and found to be defective.
What is the probability that it was produced (i) by the machine A, and
(ii) by the Machine B?
[ans.: (i) 0.682 (ii) 0.318]
9. Vijay has 5 one-rupee coins and one of them is known to have two heads.
He takes out a coin at random and tosses it 5 times; it always falls head
upward. What is the probability that it is a coin with two heads?
[ans.: 8/9]
10. Stores A, B, and C have 50, 75, and 100 employees and, respectively
50, 60, 70 per cent of these are women. Resignations are equally likely
among all employees, regardless of sex. One employee resigns and this
is a woman. What is the probability that she works in Store C?
[ans.: 0.5]
CHAPTER 2
Random Variables
Chapter Outline
2.1 Introduction
2.2 Random Variables
2.3 Probability Mass Function
2.4 Discrete Distribution Function
2.5 Probability Density Function
2.6 Continuous Distribution Function
2.7 Two-Dimensional Discrete Random Variables
2.8 Two-Dimensional Continuous Random Variables
2.1 Introduction
The outcomes of random experiments are, in general, abstract quantities or, in other
words, most of the time they are not in any numerical form. However, the outcomes
of a random experiment can be expressed in quantitative terms, in particular, by
means of real numbers. Hence, a function can be defined that takes a definite real
value corresponding to each outcome of an experiment. This gives a rationale for
the concept of random variables about which probability statements can be made.
In probability and statistics, a probability distribution assigns a probability to each
measurable subset of the possible outcomes of a random experiment. Important and
commonly encountered probability distributions include binomial distribution, Poisson
distribution, and normal distribution.
Example 1
Identify the random variables as either discrete or continuous in each
of the following cases:
(i) A page in a book can have at most 300 words
X = Number of misprints on a page
(ii) Number of students present in a class of 50 students
(iii) A player goes to the gymnasium regularly
X = Reduction in his weight in a month
(iv) Number of attempts required by a candidate to clear the IAS
examination
(v) Height of a skyscraper
Solution
(i) X = Number of misprints on a page
The page may have no misprint, 1 misprint, 2 misprints, …, or 300 misprints.
Thus, X takes values 0, 1, 2, …, 300. Hence, X is a discrete random variable.
(ii) Let X be the random variable denoting the number of students present in a
class. X takes values 0, 1, 2, …, 50. Hence, X is a discrete random variable.
(iii) Reduction in weight cannot take isolated values 0, 1, 2, etc., but it takes any
continuous value.
Hence, X is a continuous random variable.
(iv) Let X be a random variable denoting the number of attempts required by a
candidate. Thus, X takes values 1, 2, 3, …. Hence, X is a discrete random
variable.
(v) Since height can have any fractional value, it is a continuous random variable.
Probability distribution of a random variable is the set of its possible values together
with their respective probabilities. Let X be a discrete random variable which takes
the values x1, x2, …, xn. The probability of each possible outcome xi is pi = p(xi) =
P(X = xi) for i = 1, 2, …, n. The numbers p(xi), i = 1, 2, …, n must satisfy the following
conditions:
(i) p(xi) ≥ 0 for all values of i
(ii) \[ \sum_{i=1}^{n} p(x_i) = 1 \]
The function p(xi) is called the probability function or probability mass function of the
random variable X. The set of pairs {xi, p(xi)}, i = 1, 2, …, n is called the probability
distribution of the random variable which can be displayed in the form of a table as
shown below:
X = xi      x1     x2     x3     …  xi     …  xn
P(X = xi)   p(x1)  p(x2)  p(x3)  …  p(xi)  …  p(xn)
Let X be a discrete random variable which takes the values x1, x2, … such that
x1 < x2 < … with probabilities p(x1), p(x2), … such that p(xi) ≥ 0 for all values of i and
\[ \sum_{i=1}^{\infty} p(x_i) = 1. \]
Example 1
A fair die is tossed once. If the random variable X takes the value 1 when an even
number appears and 0 otherwise, find the probability distribution of X.
Solution
When a fair die is tossed,
S = {1, 2, 3, 4, 5, 6}
Let X be the random variable that is 1 if an even number appears and 0 otherwise.
Hence, X can take the values 0 and 1.
\[ P(X = 0) = P(\{1, 3, 5\}) = \frac{3}{6} = \frac{1}{2} \]
\[ P(X = 1) = P(\{2, 4, 6\}) = \frac{3}{6} = \frac{1}{2} \]
Hence, the probability distribution of X is
X = x      0    1
P(X = x)   1/2  1/2
Also, \[ \sum P(X = x) = \frac{1}{2} + \frac{1}{2} = 1 \]
Example 2
Find the probability distribution of the number of heads when three
coins are tossed.
Solution
When three coins are tossed,
S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
Let X be the random variable of getting heads in tossing of three coins. Hence X can
take the values 0, 1, 2, 3.
\[ P(X = 0) = P(\text{no head}) = P(TTT) = \frac{1}{8} \]
\[ P(X = 1) = P(\text{one head}) = P(HTT, THT, TTH) = \frac{3}{8} \]
\[ P(X = 2) = P(\text{two heads}) = P(HHT, THH, HTH) = \frac{3}{8} \]
\[ P(X = 3) = P(\text{three heads}) = P(HHH) = \frac{1}{8} \]
Hence, the probability distribution of X is
X = x      0    1    2    3
P(X = x)   1/8  3/8  3/8  1/8
Also, \[ \sum P(X = x) = \frac{1}{8} + \frac{3}{8} + \frac{3}{8} + \frac{1}{8} = 1 \]
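The same distribution can be obtained by counting heads over the eight equally likely outcomes. A minimal sketch, with variable names chosen for illustration:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=3))           # the 8 equally likely outcomes
heads = Counter(o.count("H") for o in outcomes)    # outcomes per number of heads
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(heads.items())}

for x, p in pmf.items():
    print(x, p)
```

Summing the values of pmf gives 1, which is the second condition a probability mass function must satisfy.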
Example 3
State with reasons whether the following represent the probability mass
function of a random variable:
(i)
X = x      0    1    2    3
P(X = x)   0.4  0.3  0.2  0.1
(ii)
X = x      0    1    2    3
P(X = x)   1/2  1/3  1/6  1/4
(iii)
X = x      0     1    2    3
P(X = x)   -1/2  1/2  1/4  3/4
Solution
(i) Here, 0 ≤ P(X = x) ≤ 1 is satisfied for all values of X.
\[ \sum P(X = x) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 0.4 + 0.3 + 0.2 + 0.1 = 1 \]
Since ΣP(X = x) = 1, it represents a probability mass function.
(ii) Here, 0 ≤ P(X = x) ≤ 1 is satisfied for all values of X.
\[ \sum P(X = x) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = \frac{1}{2} + \frac{1}{3} + \frac{1}{6} + \frac{1}{4} = \frac{5}{4} > 1 \]
Since ΣP(X = x) > 1, it does not represent a probability mass function.
(iii) Here, 0 ≤ P(X = x) ≤ 1 is not satisfied for all the values of X as
\[ P(X = 0) = -\frac{1}{2}. \]
Hence, P(X = x) does not represent a probability mass function.
Example 4
Verify whether the following functions can be regarded as the probability
mass function for the given values of X:
(i) P(X = x) = 1/5 for x = 0, 1, 2, 3, 4
             = 0 otherwise
(ii) P(X = x) = (x - 2)/5 for x = 1, 2, 3, 4, 5
              = 0 otherwise
(iii) P(X = x) = x^2/30 for x = 0, 1, 2, 3, 4
               = 0 otherwise
Solution
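(i) \[ P(X = 0) = P(X = 1) = P(X = 2) = P(X = 3) = P(X = 4) = \frac{1}{5} \]
P(X = x) ≥ 0 for all values of x
\[ \sum P(X = x) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) = \frac{1}{5} + \frac{1}{5} + \frac{1}{5} + \frac{1}{5} + \frac{1}{5} = 1 \]
Hence, P(X = x) is a probability mass function.
(ii) \[ P(X = 1) = \frac{1 - 2}{5} = -\frac{1}{5} < 0 \]
Hence, P(X = x) is not a probability mass function.
(iii) \[ P(X = 0) = 0, \quad P(X = 1) = \frac{1}{30}, \quad P(X = 2) = \frac{4}{30}, \quad P(X = 3) = \frac{9}{30}, \quad P(X = 4) = \frac{16}{30} \]
P(X = x) ≥ 0 for all values of x
\[ \sum P(X = x) = 0 + \frac{1}{30} + \frac{4}{30} + \frac{9}{30} + \frac{16}{30} = 1 \]
Hence, P(X = x) is a probability mass function.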
Example 5
A random variable X has the probability mass function given by
X          1    2    3    4
P(X = x)   0.1  0.2  0.5  0.2
Find (i) P(2 ≤ X < 4), (ii) P(X > 2), (iii) P(X is odd), and (iv) P(X is
even).
Solution
(i) P(2 ≤ X < 4) = P(X = 2) + P(X = 3)
= 0.2 + 0.5
= 0.7
(ii) P(X > 2) = P(X = 3) + P(X = 4)
= 0.5 + 0.2
= 0.7
(iii) P(X is odd) = P(X = 1) + P(X = 3)
= 0.1 + 0.5
= 0.6
(iv) P(X is even) = P(X = 2) + P(X = 4)
= 0.2 + 0.2
= 0.4
Example 6
If the random variable X takes the value 1, 2, 3, and 4 such that
2P(X = 1) = 3P(X = 2) = P(X = 3) = 5P(X = 4). Find the probability
distribution.
Solution
Let 2P(X = 1) = 3P(X = 2) = P(X = 3) = 5P(X = 4) = k
\[ P(X = 1) = \frac{k}{2}, \quad P(X = 2) = \frac{k}{3}, \quad P(X = 3) = k, \quad P(X = 4) = \frac{k}{5} \]
Since ΣP(X = x) = 1,
\[ \frac{k}{2} + \frac{k}{3} + k + \frac{k}{5} = 1 \]
\[ k = \frac{30}{61} \]
Hence, the probability distribution is
X          1      2      3      4
P(X = x)   15/61  10/61  30/61  6/61
Example 7
A random variable X has the following probability distribution:
X 0 1 2 3 4 5 6 7
P(X = x) a 4a 3a 7a 8a 10a 6a 9a
(i) Find the value of a.
(ii) Find P(X < 3).
(iii) Find the smallest value of m for which P(X ≤ m) ≥ 0.6.
Solution
(i) Since P(X = x) is a probability distribution function,
\[ \sum P(X = x) = 1 \]
\[ P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6) + P(X = 7) = 1 \]
\[ a + 4a + 3a + 7a + 8a + 10a + 6a + 9a = 1 \]
\[ a = \frac{1}{48} \]
(ii) \[ P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2) = a + 4a + 3a = 8a = 8\left(\frac{1}{48}\right) = \frac{1}{6} \]
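Part (iii) is not worked out in the text as extracted. The sketch below accumulates the probabilities to find the smallest m with P(X ≤ m) ≥ 0.6; the names weights, cdf and smallest_m are illustrative assumptions.

```python
from fractions import Fraction
from itertools import accumulate

weights = [1, 4, 3, 7, 8, 10, 6, 9]              # multiples of a for X = 0, ..., 7
a = Fraction(1, sum(weights))                    # probabilities must sum to 1
cdf = list(accumulate(w * a for w in weights))   # cdf[m] = P(X <= m)

smallest_m = next(m for m, f in enumerate(cdf) if f >= Fraction(6, 10))
print(a, cdf[2], smallest_m)
```

Here cdf[4] = 23/48 falls short of 0.6 while cdf[5] = 33/48 exceeds it, so the search stops at m = 5.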
Example 8
The probability mass function of a random variable X is zero
except at the points X = 0, 1, 2. At these points, it has the values
P(X = 0) = 3c^3, P(X = 1) = 4c - 10c^2, P(X = 2) = 5c - 1.
Find (i) c, (ii) P(X < 1), (iii) P(1 < X ≤ 2), and (iv) P(0 < X ≤ 2).
Solution
(i) Since P(X = x) is a probability mass function,
ΣP(X = x) = 1
P(X = 0) + P(X = 1) + P(X = 2) = 1
3c³ + 4c − 10c² + 5c − 1 = 1
3c³ − 10c² + 9c − 2 = 0
(3c − 1)(c − 2)(c − 1) = 0
c = 1/3, 2, 1
But c < 1; otherwise, the given probabilities will be greater than one or less than
zero.
∴ c = 1/3
Hence, the probability distribution is
X 0 1 2
P(X = x) 1/9 2/9 2/3
(ii) P(X < 1) = P(X = 0) = 1/9
(iii) P(1 < X ≤ 2) = P(X = 2) = 2/3
(iv) P(0 < X ≤ 2) = P(X = 1) + P(X = 2)
= 2/9 + 2/3
= 8/9
Example 9
From a lot of 10 items containing 3 defectives, a sample of 4 items is
drawn at random. Let the random variable X denote the number of
defective items in the sample. Find the probability distribution of X.
Solution
The random variable X can take the value 0, 1, 2, or 3.
Total number of items = 10
Number of good items = 7
Number of defective items = 3
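\[ P(X = 0) = P(\text{no defective}) = \frac{{}^{7}C_{4}}{{}^{10}C_{4}} = \frac{1}{6} \]
\[ P(X = 1) = P(\text{one defective and three good items}) = \frac{{}^{3}C_{1}\,{}^{7}C_{3}}{{}^{10}C_{4}} = \frac{1}{2} \]
\[ P(X = 2) = P(\text{two defectives and two good items}) = \frac{{}^{3}C_{2}\,{}^{7}C_{2}}{{}^{10}C_{4}} = \frac{3}{10} \]
\[ P(X = 3) = P(\text{three defectives and one good item}) = \frac{{}^{3}C_{3}\,{}^{7}C_{1}}{{}^{10}C_{4}} = \frac{1}{30} \]
X 0 1 2 3
P(X = x) 1/6 1/2 3/10 1/30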
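The counting argument used here (choose the defectives, choose the good items, divide by the number of possible samples) can be sketched with Python's `math.comb`. The function name `defective_count_pmf` and its parameter names are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction
from math import comb


def defective_count_pmf(total, defective, sample):
    """P(X = k) for the number k of defective items in a sample drawn without replacement."""
    good = total - defective
    return {
        k: Fraction(comb(defective, k) * comb(good, sample - k), comb(total, sample))
        for k in range(min(defective, sample) + 1)
    }


# Lot of 10 items containing 3 defectives, sample of 4 items.
pmf = defective_count_pmf(total=10, defective=3, sample=4)
for k, p in pmf.items():
    print(k, p)
```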
Example 10
Construct the distribution function of the discrete random variable X
whose probability distribution is as given below:
X 1 2 3 4 5 6 7
P(X = x) 0.1 0.15 0.25 0.2 0.15 0.1 0.05
Solution
Distribution function of X
X P(X = x) F(x)
1 0.1 0.1
2 0.15 0.25
3 0.25 0.5
4 0.2 0.7
5 0.15 0.85
6 0.1 0.95
7 0.05 1
Example 11
A random variable X has the probability function given below:
X 0 1 2
P(X = x) k 2k 3k
Find (i) k, (ii) P(X < 2), P(X ≤ 2), P(0 < X < 2), and (iii) the distribution
function.
Solution:
(i) Since P(X = x) is a probability mass function,
ΣP(X = x) = 1
k + 2k + 3k = 1
6k = 1
k = 1/6
Hence, the probability distribution is
X 0 1 2
P(X = x) 1/6 2/6 3/6
(ii) P(X < 2) = P(X = 0) + P(X = 1) = 1/6 + 2/6 = 1/2
P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 1/6 + 2/6 + 3/6 = 1
P(0 < X < 2) = P(X = 1) = 1/3
(iii) Distribution function
X P(X = x) F(x)
0 1/6 1/6
1 2/6 1/2
2 3/6 1
Example 12
A random variable X takes the values –3, –2, –1, 0, 1, 2, 3, such that
P(X = 0) = P(X > 0) = P(X < 0),
P(X = –3) = P(X = –2) = P(X = –1) = P(X = 1) = P(X = 2) = P(X = 3).
Obtain the probability distribution and the distribution function of X.
Solution
Let P(X = 0) = P(X > 0) = P(X < 0) = k₁
Since ΣP(X = x) = 1,
k₁ + k₁ + k₁ = 1
∴ k₁ = 1/3
P(X = 0) = P(X > 0) = P(X < 0) = 1/3
Let P(X = 1) = P(X = 2) = P(X = 3) = k₂
P(X > 0) = P(X = 1) + P(X = 2) + P(X = 3)
1/3 = k₂ + k₂ + k₂
∴ k₂ = 1/9
P(X = 1) = P(X = 2) = P(X = 3) = 1/9
Similarly, P(X = –3) = P(X = –2) = P(X = –1) = 1/9
Probability distribution and distribution function
X P(X = x) F(x)
–3 1/9 1/9
–2 1/9 2/9
–1 1/9 3/9
0 1/3 6/9
1 1/9 7/9
2 1/9 8/9
3 1/9 1
Example 13
A discrete random variable X has the following distribution function:
\[ F(x) = \begin{cases} 0 & x < 1 \\ \dfrac{1}{3} & 1 \le x < 4 \\ \dfrac{1}{2} & 4 \le x < 6 \\ \dfrac{5}{6} & 6 \le x < 10 \\ 1 & x \ge 10 \end{cases} \]
Find (i) P(2 < X ≤ 6), (ii) P(X = 5), (iii) P(X = 4), (iv) P(X ≤ 6), and
(v) P(X = 6).
Solution
(i) P(2 < X ≤ 6) = F(6) – F(2) = 5/6 − 1/3 = 3/6 = 1/2
(ii) P(X = 5) = P(X ≤ 5) – P(X < 5) = F(5) – P(X < 5) = 1/2 − 1/2 = 0
(iii) P(X = 4) = P(X ≤ 4) – P(X < 4) = F(4) – P(X < 4) = 1/2 − 1/3 = 1/6
(iv) P(X ≤ 6) = F(6) = 5/6
(v) P(X = 6) = P(X ≤ 6) – P(X < 6) = F(6) – P(X < 6) = 5/6 − 1/2 = 1/3
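The idea used in parts (ii), (iii), and (v), that P(X = x) is the size of the jump of F at x, can be sketched in Python. The representation of the step function as a list of (jump point, level) pairs and the helper names `F`, `F_left`, and `point_mass` are illustrative assumptions.

```python
from fractions import Fraction

# (jump point, value of F from that point up to the next jump point)
steps = [(1, Fraction(1, 3)), (4, Fraction(1, 2)), (6, Fraction(5, 6)), (10, Fraction(1))]


def F(x):
    """F(x) = P(X <= x): level of the last jump point that is <= x."""
    value = Fraction(0)
    for point, level in steps:
        if x >= point:
            value = level
    return value


def F_left(x):
    """P(X < x): level of the last jump point that is strictly below x."""
    value = Fraction(0)
    for point, level in steps:
        if x > point:
            value = level
    return value


def point_mass(x):
    """P(X = x) is the jump of F at x."""
    return F(x) - F_left(x)


print(F(6) - F(2), point_mass(5), point_mass(4), F(6), point_mass(6))
```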
Exercise 2.1
1. Verify whether the following functions can be considered as probability
mass functions:
(i) P(X = x) = (x² + 1)/18, x = 0, 1, 2, 3 [Ans.: Yes]
(ii) P(X = x) = (x² − 2)/8, x = 1, 2, 3 [Ans.: No]
(iii) P(X = x) = (2x + 1)/18, x = 0, 1, 2, 3 [Ans.: No]
2. The probability mass function of a random variable X is
X 0 1 2 3 4 5 6
Find (i) k, (ii) P(X < 5), (iii) P(X > 5), and (iv) P(0 ≤ X ≤ 5)
[Ans.: (i) 1/8 (ii) 49/64 (iii) 3/32 (iv) 29/32]
Find (i) k, (ii) P(X ≥ 2), and (iii) P(–2 < X < 2).
[Ans.: (i) 1/15 (ii) 1/2 (iii) 2/5]
5. Given the following probability function of a discrete random variable X:
X 0 1 2 3 4 5 6 7
Find (i) c, (ii) P(X ≥ 6), (iii) P(X < 6), and (iv) k if P(X ≤ k) > 1/2,
where k is a positive integer.
X 0 1 2
Ans.: 1 3 2
P(X = x)
5 5 5
9. Five defective bolts are accidentally mixed with 20 good ones. Find the
probability distribution of the number of defective bolts, if four bolts are
drawn at random from this lot.
Ans.:
X 0 1 2 3 4
P(X = x) 969/2530 1140/2530 380/2530 40/2530 1/2530
10. Two dice are rolled at once. Find the probability distribution of the
sum of the numbers on them.
Ans.:
X 2 3 4 5 6 7 8 9 10 11 12
P(X = x) 1/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36
Ans.: (i)
X –3 –2 –1 0 1 2 3
P(X = x) 0.08 0.12 0.2 0.25 0.15 0.1 0.1
(ii) 0.72 (iii) 0.35
Let X be a continuous random variable such that the probability of the variable X
falling in the small interval $x - \frac{1}{2}dx$ to $x + \frac{1}{2}dx$ is f(x) dx, i.e.,
\[ P\left(x - \tfrac{1}{2}dx \le X \le x + \tfrac{1}{2}dx\right) = f(x)\,dx \]
The function f(x) is called the probability density function of the random variable X
and the continuous curve y = f(x) is called the probability curve.
Properties of Probability Density Function
(i) f(x) ≥ 0, –∞ < x < ∞
(ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
(iii) $P(a < X < b) = \int_{a}^{b} f(x)\,dx$
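If X is a continuous random variable having the probability density function f(x), then
the function
\[ F(x) = P(X \le x) = \int_{-\infty}^{x} f(x)\,dx, \quad -\infty < x < \infty \]
is called the distribution function (or cumulative distribution function) of X.
(v) $F'(x) = \dfrac{d}{dx}F(x) = f(x)$, f(x) ≥ 0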
Example 1
Show that the function f(x) defined by
\[ f(x) = \begin{cases} \dfrac{1}{7} & 1 < x < 8 \\ 0 & \text{otherwise} \end{cases} \]
is a probability density function for a random variable. Hence, find
P(3 < X < 10).
Solution
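f(x) ≥ 0 in 1 < x < 8
\[ \begin{aligned}
\int_{-\infty}^{\infty} f(x)\,dx &= \int_{-\infty}^{1} f(x)\,dx + \int_{1}^{8} f(x)\,dx + \int_{8}^{\infty} f(x)\,dx \\
&= 0 + \int_{1}^{8} \frac{1}{7}\,dx + 0 \\
&= \frac{1}{7}\Big[x\Big]_{1}^{8} \\
&= \frac{1}{7}(8 - 1) \\
&= 1
\end{aligned} \]
Hence, f(x) is a probability density function.
\[ \begin{aligned}
P(3 < X < 10) &= \int_{3}^{10} f(x)\,dx \\
&= \int_{3}^{8} f(x)\,dx + \int_{8}^{10} f(x)\,dx \\
&= \int_{3}^{8} \frac{1}{7}\,dx + 0 \\
&= \frac{1}{7}\Big[x\Big]_{3}^{8} \\
&= \frac{1}{7}(8 - 3) \\
&= \frac{5}{7}
\end{aligned} \]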
Example 2
Is the function f(x) defined by
\[ f(x) = \begin{cases} e^{-x} & x \ge 0 \\ 0 & x < 0 \end{cases} \]
a probability density function? If so, find the probability that the variate
having this density falls in the interval (1, 2).
Solution
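f(x) ≥ 0 in (0, ∞)
\[ \begin{aligned}
\int_{-\infty}^{\infty} f(x)\,dx &= \int_{-\infty}^{0} f(x)\,dx + \int_{0}^{\infty} f(x)\,dx \\
&= 0 + \int_{0}^{\infty} e^{-x}\,dx \\
&= \Big[-e^{-x}\Big]_{0}^{\infty} \\
&= -e^{-\infty} + 1 \\
&= 1
\end{aligned} \]
Hence, f(x) is a probability density function.
\[ \begin{aligned}
P(1 \le X \le 2) &= \int_{1}^{2} f(x)\,dx \\
&= \int_{1}^{2} e^{-x}\,dx \\
&= \Big[-e^{-x}\Big]_{1}^{2} \\
&= -e^{-2} + e^{-1} \\
&= 0.233
\end{aligned} \]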
Example 3
If a random variable has the probability density function f(x) as
\[ f(x) = \begin{cases} 2e^{-2x} & x > 0 \\ 0 & x \le 0 \end{cases} \]
find the probabilities that it will take on a value (i) between 1 and 3,
and (ii) greater than 0.5.
2.6 Continuous Distribution Function 2.21
Solution
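(i) Probability that the variable will take a value between 1 and 3
\[ \begin{aligned}
P(1 < X < 3) &= \int_{1}^{3} f(x)\,dx \\
&= \int_{1}^{3} 2e^{-2x}\,dx \\
&= 2\left[\frac{e^{-2x}}{-2}\right]_{1}^{3} \\
&= -(e^{-6} - e^{-2}) \\
&= e^{-2} - e^{-6}
\end{aligned} \]
(ii) Probability that the variable will take a value greater than 0.5
\[ \begin{aligned}
P(X > 0.5) &= \int_{0.5}^{\infty} f(x)\,dx \\
&= \int_{0.5}^{\infty} 2e^{-2x}\,dx \\
&= 2\left[\frac{e^{-2x}}{-2}\right]_{0.5}^{\infty} \\
&= -(e^{-\infty} - e^{-1}) \\
&= e^{-1}
\end{aligned} \]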
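The two answers of this example can be evaluated numerically. The sketch below is illustrative and not part of the original text; it assumes the closed form F(x) = 1 − e^(−2x) for x > 0, obtained by integrating the given density, and the function name `cdf` is hypothetical.

```python
import math


def cdf(x):
    """F(x) = 1 - exp(-2x) for x > 0 and 0 otherwise (integral of f(t) = 2 exp(-2t))."""
    return 1.0 - math.exp(-2.0 * x) if x > 0 else 0.0


p_between_1_and_3 = cdf(3) - cdf(1)      # equals e^-2 - e^-6
p_greater_than_half = 1.0 - cdf(0.5)     # equals e^-1

print(round(p_between_1_and_3, 4), round(p_greater_than_half, 4))
```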
Example 4
Find the constant k such that the function
\[ f(x) = \begin{cases} kx^{2} & 0 < x < 3 \\ 0 & \text{otherwise} \end{cases} \]
is a probability density function and compute (i) P(1 < X < 2),
(ii) P(X < 2), and (iii) P(X ≥ 2).
Solution
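Since f(x) is a probability density function,
\[ \begin{aligned}
\int_{-\infty}^{\infty} f(x)\,dx &= 1 \\
\int_{-\infty}^{0} f(x)\,dx + \int_{0}^{3} f(x)\,dx + \int_{3}^{\infty} f(x)\,dx &= 1 \\
0 + \int_{0}^{3} kx^{2}\,dx + 0 &= 1 \\
k\left[\frac{x^{3}}{3}\right]_{0}^{3} &= 1 \\
\frac{k}{3}(27 - 0) &= 1 \\
9k &= 1 \\
k &= \frac{1}{9}
\end{aligned} \]
\[ \text{Hence, } f(x) = \begin{cases} \dfrac{1}{9}x^{2} & 0 < x < 3 \\ 0 & \text{otherwise} \end{cases} \]
\[ \begin{aligned}
\text{(i)}\quad P(1 < X < 2) &= \int_{1}^{2} f(x)\,dx = \int_{1}^{2} \frac{1}{9}x^{2}\,dx \\
&= \frac{1}{9}\left[\frac{x^{3}}{3}\right]_{1}^{2} = \frac{1}{27}(8 - 1) = \frac{7}{27}
\end{aligned} \]
\[ \begin{aligned}
\text{(ii)}\quad P(X < 2) &= \int_{-\infty}^{2} f(x)\,dx = \int_{-\infty}^{0} f(x)\,dx + \int_{0}^{2} f(x)\,dx \\
&= 0 + \int_{0}^{2} \frac{1}{9}x^{2}\,dx = \frac{1}{9}\left[\frac{x^{3}}{3}\right]_{0}^{2} = \frac{1}{27}(8 - 0) = \frac{8}{27}
\end{aligned} \]
\[ \text{(iii)}\quad P(X \ge 2) = 1 - P(X < 2) = 1 - \frac{8}{27} = \frac{19}{27} \]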
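Results of this kind can be cross-checked by numerical integration. The sketch below uses a hand-written composite Simpson's rule (an assumption made to keep the example free of external libraries; the names `simpson` and `pdf` are illustrative). The density is coded on the closed interval [0, 3]; including the endpoints does not change any probability.

```python
def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] by composite Simpson's rule."""
    if n % 2:
        n += 1                      # Simpson's rule needs an even number of subintervals
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def pdf(x):
    """f(x) = x^2 / 9 on [0, 3], zero elsewhere (k = 1/9)."""
    return x * x / 9 if 0 <= x <= 3 else 0.0


total_probability = simpson(pdf, 0, 3)   # should be close to 1
p_1_to_2 = simpson(pdf, 1, 2)            # should be close to 7/27
p_below_2 = simpson(pdf, 0, 2)           # should be close to 8/27

print(total_probability, p_1_to_2, p_below_2, 1 - p_below_2)
```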
Example 5
If the probability density function of a random variable is given by
\[ f(x) = \begin{cases} k(1 - x^{2}) & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases} \]
find the value of k and the probabilities that a random variable having
this probability density will take on a value (i) between 0.1 and 0.2, and
(ii) greater than 0.5.
Solution
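Since f(x) is a probability density function,
\[ \begin{aligned}
\int_{-\infty}^{\infty} f(x)\,dx &= 1 \\
\int_{-\infty}^{0} f(x)\,dx + \int_{0}^{1} f(x)\,dx + \int_{1}^{\infty} f(x)\,dx &= 1 \\
0 + \int_{0}^{1} k(1 - x^{2})\,dx + 0 &= 1 \\
k\left[x - \frac{x^{3}}{3}\right]_{0}^{1} &= 1 \\
k\left(1 - \frac{1}{3}\right) &= 1 \\
k &= \frac{3}{2}
\end{aligned} \]
\[ \text{Hence, } f(x) = \begin{cases} \dfrac{3}{2}(1 - x^{2}) & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases} \]
(i) Probability that the variable will take on a value between 0.1 and 0.2
\[ \begin{aligned}
P(0.1 < X < 0.2) &= \int_{0.1}^{0.2} f(x)\,dx = \int_{0.1}^{0.2} \frac{3}{2}(1 - x^{2})\,dx \\
&= \frac{3}{2}\left[x - \frac{x^{3}}{3}\right]_{0.1}^{0.2} \\
&= \frac{3}{2}\left[\left(0.2 - \frac{0.008}{3}\right) - \left(0.1 - \frac{0.001}{3}\right)\right] \\
&= 0.1465
\end{aligned} \]
(ii) Probability that the variable will take on a value greater than 0.5
\[ \begin{aligned}
P(X > 0.5) &= \int_{0.5}^{\infty} f(x)\,dx = \int_{0.5}^{1} f(x)\,dx + \int_{1}^{\infty} f(x)\,dx \\
&= \int_{0.5}^{1} \frac{3}{2}(1 - x^{2})\,dx + 0 \\
&= \frac{3}{2}\left[x - \frac{x^{3}}{3}\right]_{0.5}^{1} \\
&= \frac{3}{2}\left[\left(1 - \frac{1}{3}\right) - \left(0.5 - \frac{0.125}{3}\right)\right] \\
&= 0.3125
\end{aligned} \]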
Example 6
If X is a continuous random variable with pdf
\[ f(x) = \begin{cases} x^{2} & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \]
If P(a ≤ X ≤ 1) = 19/81, find the value of a.
Solution
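\[ \begin{aligned}
P(a \le X \le 1) &= \frac{19}{81} \\
\int_{a}^{1} f(x)\,dx &= \frac{19}{81} \\
\int_{a}^{1} x^{2}\,dx &= \frac{19}{81} \\
\left[\frac{x^{3}}{3}\right]_{a}^{1} &= \frac{19}{81} \\
\frac{1}{3}(1 - a^{3}) &= \frac{19}{81} \\
1 - a^{3} &= \frac{19}{27} \\
a^{3} &= \frac{8}{27} \\
a &= \frac{2}{3}
\end{aligned} \]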
Example 7
Let X be a continuous random variable with pdf
f(x) = kx(1 – x), 0 ≤ x ≤ 1
Find k and determine a number b such that P(X ≤ b) = P(X ≥ b).
Solution
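Since f(x) is a probability density function,
\[ \begin{aligned}
\int_{-\infty}^{\infty} f(x)\,dx &= 1 \\
\int_{-\infty}^{0} f(x)\,dx + \int_{0}^{1} f(x)\,dx + \int_{1}^{\infty} f(x)\,dx &= 1 \\
0 + \int_{0}^{1} kx(1 - x)\,dx + 0 &= 1 \\
k\int_{0}^{1} (x - x^{2})\,dx &= 1 \\
k\left[\frac{x^{2}}{2} - \frac{x^{3}}{3}\right]_{0}^{1} &= 1 \\
k\left[\left(\frac{1}{2} - \frac{1}{3}\right) - (0 - 0)\right] &= 1 \\
k\left(\frac{1}{6}\right) &= 1 \\
k &= 6
\end{aligned} \]
Hence, f(x) = 6(x – x²), 0 ≤ x ≤ 1
Since total probability is 1 and P(X ≤ b) = P(X ≥ b),
\[ \begin{aligned}
P(X \le b) &= \frac{1}{2} \\
\int_{0}^{b} f(x)\,dx &= \frac{1}{2} \\
6\int_{0}^{b} (x - x^{2})\,dx &= \frac{1}{2} \\
6\left[\frac{x^{2}}{2} - \frac{x^{3}}{3}\right]_{0}^{b} &= \frac{1}{2} \\
\frac{b^{2}}{2} - \frac{b^{3}}{3} &= \frac{1}{12} \\
6b^{2} - 4b^{3} &= 1 \\
4b^{3} - 6b^{2} + 1 &= 0 \\
(2b - 1)(2b^{2} - 2b - 1) &= 0 \\
b = \frac{1}{2} \quad &\text{or} \quad b = \frac{1 \pm \sqrt{3}}{2}
\end{aligned} \]
b lies in (0, 1).
∴ b = 1/2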
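The value of b can also be located numerically by solving F(b) = 1/2 on (0, 1). The sketch below uses plain bisection; this is an illustrative technique chosen here, not the method of the text, and the names `bisect_root` and `cdf` are assumptions.

```python
def bisect_root(g, lo, hi, tol=1e-12):
    """Find a root of g in [lo, hi], assuming g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = g(mid)
        if (g_mid < 0) == (g_lo < 0):
            lo, g_lo = mid, g_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return (lo + hi) / 2


def cdf(b):
    """F(b) = 6 (b^2/2 - b^3/3) = 3 b^2 - 2 b^3 for 0 <= b <= 1."""
    return 3 * b**2 - 2 * b**3


b = bisect_root(lambda t: cdf(t) - 0.5, 0.0, 1.0)
print(round(b, 6))
```

Restricting the search to (0, 1) plays the same role as discarding the roots (1 ± √3)/2 in the algebraic solution.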
Example 8
The length of time (in minutes) that a certain lady speaks on the
telephone is found to be a random phenomenon, with a probability
function specified by the function
\[ f(x) = \begin{cases} A e^{-x/5} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \]
(i) Find the value of A that makes f(x) a probability density function.
(ii) What is the probability that the number of minutes that she will talk
over the phone is more than 10 minutes?
Solution
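(i) For f(x) to be a probability density function,
\[ \begin{aligned}
\int_{-\infty}^{\infty} f(x)\,dx &= 1 \\
\int_{-\infty}^{0} f(x)\,dx + \int_{0}^{\infty} f(x)\,dx &= 1 \\
0 + \int_{0}^{\infty} A e^{-x/5}\,dx &= 1 \\
A\left[\frac{e^{-x/5}}{-\frac{1}{5}}\right]_{0}^{\infty} &= 1 \\
-5A(e^{-\infty} - e^{-0}) &= 1 \\
-5A(0 - 1) &= 1 \\
5A &= 1 \\
A &= \frac{1}{5}
\end{aligned} \]
\[ \text{Hence, } f(x) = \begin{cases} \dfrac{1}{5}e^{-x/5} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \]
\[ \begin{aligned}
\text{(ii)}\quad P(X > 10) &= \int_{10}^{\infty} f(x)\,dx \\
&= \int_{10}^{\infty} \frac{1}{5}e^{-x/5}\,dx \\
&= \frac{1}{5}\left[\frac{e^{-x/5}}{-\frac{1}{5}}\right]_{10}^{\infty} \\
&= -(e^{-\infty} - e^{-2}) \\
&= -(0 - e^{-2}) \\
&= \frac{1}{e^{2}}
\end{aligned} \]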
Example 9
A continuous random variable X has a pdf f(x) = 3x², 0 ≤ x ≤ 1. Find
a and b such that
(i) P(X ≤ a) = P(X > a) and
(ii) P(X > b) = 0.05
Solution
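(i) Since total probability is 1 and P(X ≤ a) = P(X > a),
\[ \begin{aligned}
P(X \le a) &= \frac{1}{2} \\
\int_{0}^{a} f(x)\,dx &= \frac{1}{2} \\
\int_{0}^{a} 3x^{2}\,dx &= \frac{1}{2} \\
3\left[\frac{x^{3}}{3}\right]_{0}^{a} &= \frac{1}{2} \\
a^{3} &= \frac{1}{2} \\
a &= \left(\frac{1}{2}\right)^{1/3}
\end{aligned} \]
\[ \begin{aligned}
\text{(ii)}\quad P(X > b) &= 0.05 \\
\int_{b}^{1} f(x)\,dx &= 0.05 \\
\int_{b}^{1} 3x^{2}\,dx &= 0.05 \\
3\left[\frac{x^{3}}{3}\right]_{b}^{1} &= 0.05 \\
1 - b^{3} &= 0.05 \\
b^{3} &= \frac{19}{20} \\
b &= \left(\frac{19}{20}\right)^{1/3}
\end{aligned} \]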
Example 10
Let the continuous random variable X have the probability density
function
\[ f(x) = \begin{cases} \dfrac{2}{x^{3}} & 1 < x < \infty \\ 0 & \text{otherwise} \end{cases} \]
Find F(x).
Solution
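\[ \begin{aligned}
F(x) &= \int_{-\infty}^{x} f(x)\,dx \\
&= \int_{-\infty}^{1} f(x)\,dx + \int_{1}^{x} f(x)\,dx \\
&= 0 + \int_{1}^{x} \frac{2}{x^{3}}\,dx \\
&= 2\left[\frac{x^{-2}}{-2}\right]_{1}^{x} \\
&= -\left[\frac{1}{x^{2}}\right]_{1}^{x} \\
&= -\left(\frac{1}{x^{2}} - 1\right) \\
&= 1 - \frac{1}{x^{2}}
\end{aligned} \]
\[ \text{Hence, } F(x) = \begin{cases} 1 - \dfrac{1}{x^{2}} & 1 < x < \infty \\ 0 & \text{otherwise} \end{cases} \]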
Example 11
Verify that the function F(x) is a distribution function.
\[ F(x) = \begin{cases} 0 & x < 0 \\ 1 - e^{-x/4} & x \ge 0 \end{cases} \]
Also, find the probabilities P(X ≤ 4), P(X ≥ 8), P(4 ≤ X ≤ 8).
Solution
For the function F(x),
(i) F(–∞) = 0
(ii) F(∞) = 1 – e^(–∞) = 1 – 0 = 1
(iii) 0 ≤ F(x) ≤ 1, –∞ < x < ∞
If f(x) is the corresponding probability density function,
\[ f(x) = F'(x) = \begin{cases} 0 & x < 0 \\ \dfrac{1}{4}e^{-x/4} & x \ge 0 \end{cases} \]
\[ \begin{aligned}
\int_{-\infty}^{\infty} f(x)\,dx &= \int_{-\infty}^{0} f(x)\,dx + \int_{0}^{\infty} f(x)\,dx \\
&= 0 + \int_{0}^{\infty} \frac{1}{4}e^{-x/4}\,dx \\
&= \frac{1}{4}\left[\frac{e^{-x/4}}{-\frac{1}{4}}\right]_{0}^{\infty} \\
&= -\Big[e^{-x/4}\Big]_{0}^{\infty} \\
&= -(0 - 1) \\
&= 1
\end{aligned} \]
Hence, F(x) is a distribution function.
\[ P(X \le 4) = F(4) = 1 - e^{-1} = 1 - \frac{1}{e} = \frac{e - 1}{e} \]
\[ P(X \ge 8) = 1 - P(X \le 8) = 1 - F(8) = 1 - (1 - e^{-2}) = e^{-2} = \frac{1}{e^{2}} \]
\[ P(4 \le X \le 8) = F(8) - F(4) = (1 - e^{-2}) - (1 - e^{-1}) = e^{-1} - e^{-2} = \frac{1}{e} - \frac{1}{e^{2}} = \frac{e - 1}{e^{2}} \]
Example 12
The troubleshooting capacity of an IC chip in a circuit is a random
variable X whose distribution function is given by
\[ F(x) = \begin{cases} 0 & x \le 3 \\ 1 - \dfrac{9}{x^{2}} & x > 3 \end{cases} \]
where x denotes the number of years. Find the probability that the IC chip
will work properly (i) less than 8 years, (ii) beyond 8 years, (iii) between
5 to 7 years, and (iv) anywhere from 2 to 5 years.
Solution
(i) P(X ≤ 8) = F(8) = 1 − 9/8² = 0.8594
(ii) P(X > 8) = 1 − P(X ≤ 8) = 1 − F(8) = 1 − 0.8594 = 0.1406
(iii) P(5 ≤ X ≤ 7) = F(7) − F(5) = (1 − 9/7²) − (1 − 9/5²) = 0.1763
(iv) P(2 ≤ X ≤ 5) = F(5) − F(2) = (1 − 9/5²) − 0 = 0.64
Example 13
The probability density function of a continuous random variable X is
given by
\[ f(x) = \begin{cases} ax & 0 \le x \le 1 \\ a & 1 \le x \le 2 \\ 3a - ax & 2 \le x \le 3 \\ 0 & \text{otherwise} \end{cases} \]
(i) Find the value of a, and (ii) find the cdf of X.
Solution
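(i) Since f(x) is a probability density function,
\[ \begin{aligned}
\int_{-\infty}^{\infty} f(x)\,dx &= 1 \\
\int_{-\infty}^{0} f(x)\,dx + \int_{0}^{1} f(x)\,dx + \int_{1}^{2} f(x)\,dx + \int_{2}^{3} f(x)\,dx + \int_{3}^{\infty} f(x)\,dx &= 1 \\
0 + \int_{0}^{1} ax\,dx + \int_{1}^{2} a\,dx + \int_{2}^{3} (3a - ax)\,dx + 0 &= 1 \\
a\left[\frac{x^{2}}{2}\right]_{0}^{1} + a\Big[x\Big]_{1}^{2} + \left[3ax - \frac{ax^{2}}{2}\right]_{2}^{3} &= 1 \\
a\left(\frac{1}{2} - 0\right) + a(2 - 1) + \left[\left(9a - \frac{9a}{2}\right) - (6a - 2a)\right] &= 1 \\
\frac{1}{2}a + a + \frac{9a}{2} - 4a &= 1 \\
2a &= 1 \\
a &= \frac{1}{2}
\end{aligned} \]
\[ \text{(ii)}\quad F(x) = \int_{-\infty}^{x} f(x)\,dx \]
For 0 ≤ x ≤ 1,
\[ \begin{aligned}
F(x) &= \int_{-\infty}^{0} f(x)\,dx + \int_{0}^{x} f(x)\,dx \\
&= 0 + \int_{0}^{x} ax\,dx = a\left[\frac{x^{2}}{2}\right]_{0}^{x} = \frac{ax^{2}}{2}
\end{aligned} \]
For 1 ≤ x ≤ 2,
\[ \begin{aligned}
F(x) &= \int_{-\infty}^{0} f(x)\,dx + \int_{0}^{1} f(x)\,dx + \int_{1}^{x} f(x)\,dx \\
&= 0 + \int_{0}^{1} ax\,dx + \int_{1}^{x} a\,dx \\
&= a\left[\frac{x^{2}}{2}\right]_{0}^{1} + a\Big[x\Big]_{1}^{x} \\
&= a\left(\frac{1}{2} - 0\right) + a(x - 1) \\
&= \frac{a}{2} + ax - a \\
&= ax - \frac{a}{2}
\end{aligned} \]
For 2 ≤ x ≤ 3,
\[ \begin{aligned}
F(x) &= \int_{-\infty}^{0} f(x)\,dx + \int_{0}^{1} f(x)\,dx + \int_{1}^{2} f(x)\,dx + \int_{2}^{x} f(x)\,dx \\
&= 0 + \int_{0}^{1} ax\,dx + \int_{1}^{2} a\,dx + \int_{2}^{x} (3a - ax)\,dx \\
&= a\left[\frac{x^{2}}{2}\right]_{0}^{1} + a\Big[x\Big]_{1}^{2} + \left[3ax - \frac{ax^{2}}{2}\right]_{2}^{x} \\
&= a\left(\frac{1}{2} - 0\right) + a(2 - 1) + \left[\left(3ax - \frac{ax^{2}}{2}\right) - (6a - 2a)\right] \\
&= \frac{a}{2} + a + 3ax - \frac{ax^{2}}{2} - 4a \\
&= 3ax - \frac{ax^{2}}{2} - \frac{5a}{2}
\end{aligned} \]
\[ \text{Hence, } F(x) = \begin{cases} \dfrac{ax^{2}}{2} & 0 \le x \le 1 \\ ax - \dfrac{a}{2} & 1 \le x \le 2 \\ 3ax - \dfrac{ax^{2}}{2} - \dfrac{5a}{2} & 2 \le x \le 3 \end{cases} \quad \text{with } a = \frac{1}{2} \]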
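A piecewise cdf like this one is easy to get wrong at the joins, so it helps to evaluate it at the break points. The sketch below is illustrative: it substitutes a = 1/2, and it assumes the usual completion F(x) = 0 for x < 0 and F(x) = 1 for x > 3, which the worked solution does not state explicitly.

```python
from fractions import Fraction

A = Fraction(1, 2)  # value of a found in part (i)


def cdf(x):
    """Piecewise cdf from part (ii) with a = 1/2; the tails (0 and 1) are assumed."""
    x = Fraction(x)
    if x < 0:
        return Fraction(0)
    if x <= 1:
        return A * x**2 / 2
    if x <= 2:
        return A * x - A / 2
    if x <= 3:
        return 3 * A * x - A * x**2 / 2 - 5 * A / 2
    return Fraction(1)


for point in (0, 1, 2, 3):
    print(point, cdf(point))
```

The values at x = 1 and x = 2 agree whichever neighbouring formula is used, and F(3) = 1, which is a useful consistency check on a = 1/2.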
Example 14
The pdf of a continuous random variable X is
\[ f(x) = \frac{1}{2}e^{-|x|} \]
Find cdf F(x).
Solution
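\[ f(x) = \begin{cases} \dfrac{1}{2}e^{x} & -\infty < x < 0 \\ \dfrac{1}{2}e^{-x} & 0 < x < \infty \end{cases} \]
\[ F(x) = \int_{-\infty}^{x} f(x)\,dx \]
For x ≤ 0,
\[ \begin{aligned}
F(x) &= \int_{-\infty}^{x} \frac{1}{2}e^{x}\,dx \\
&= \frac{1}{2}\Big[e^{x}\Big]_{-\infty}^{x} \\
&= \frac{1}{2}(e^{x} - e^{-\infty}) \\
&= \frac{1}{2}e^{x}
\end{aligned} \]
For x > 0,
\[ \begin{aligned}
F(x) &= \int_{-\infty}^{0} f(x)\,dx + \int_{0}^{x} f(x)\,dx \\
&= \int_{-\infty}^{0} \frac{1}{2}e^{x}\,dx + \int_{0}^{x} \frac{1}{2}e^{-x}\,dx \\
&= \frac{1}{2}\Big[e^{x}\Big]_{-\infty}^{0} + \frac{1}{2}\Big[-e^{-x}\Big]_{0}^{x} \\
&= \frac{1}{2}(1 - e^{-\infty}) + \frac{1}{2}(-e^{-x} + e^{0}) \\
&= \frac{1}{2} - \frac{1}{2}e^{-x} + \frac{1}{2} \\
&= 1 - \frac{1}{2}e^{-x}
\end{aligned} \]
\[ \text{Hence, } F(x) = \begin{cases} \dfrac{1}{2}e^{x} & x \le 0 \\ 1 - \dfrac{1}{2}e^{-x} & x > 0 \end{cases} \]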
Example 15
Find the value of k and the distribution function F(x) given the probability
density function of a random variable X as
\[ f(x) = \frac{k}{x^{2} + 1}, \quad -\infty < x < \infty \]
Solution
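Since f(x) is the probability density function,
\[ \begin{aligned}
\int_{-\infty}^{\infty} f(x)\,dx &= 1 \\
\int_{-\infty}^{\infty} \frac{k}{x^{2} + 1}\,dx &= 1 \\
k\int_{-\infty}^{\infty} \frac{1}{x^{2} + 1}\,dx &= 1 \\
k\Big[\tan^{-1} x\Big]_{-\infty}^{\infty} &= 1 \\
k\left[\tan^{-1}\infty - \tan^{-1}(-\infty)\right] &= 1 \\
k\left[\frac{\pi}{2} - \left(-\frac{\pi}{2}\right)\right] &= 1 \\
k\pi &= 1 \\
k &= \frac{1}{\pi}
\end{aligned} \]
\[ \text{Hence, } f(x) = \frac{1}{\pi}\,\frac{1}{x^{2} + 1}, \quad -\infty < x < \infty \]
\[ \begin{aligned}
F(x) &= \int_{-\infty}^{x} f(x)\,dx \\
&= \frac{1}{\pi}\int_{-\infty}^{x} \frac{1}{x^{2} + 1}\,dx \\
&= \frac{1}{\pi}\Big[\tan^{-1} x\Big]_{-\infty}^{x} \\
&= \frac{1}{\pi}\left[\tan^{-1} x - \tan^{-1}(-\infty)\right] \\
&= \frac{1}{\pi}\left(\tan^{-1} x + \frac{\pi}{2}\right)
\end{aligned} \]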
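The closed form for F(x) can be evaluated directly with `math.atan`. This short sketch is illustrative only (the function name `cdf` is an assumption); it checks a few values that follow from tan⁻¹ 0 = 0 and tan⁻¹ 1 = π/4.

```python
import math


def cdf(x):
    """F(x) = (arctan(x) + pi/2) / pi for the density f(x) = 1 / (pi (x^2 + 1))."""
    return (math.atan(x) + math.pi / 2) / math.pi


print(cdf(-1), cdf(0), cdf(1))
```

By symmetry of the density about 0, F(0) = 1/2, and half of the probability lies between −1 and 1.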
Example 16
Find the constant k such that
\[ f(x) = \begin{cases} kx^{2} & 0 < x < 3 \\ 0 & \text{otherwise} \end{cases} \]
is a probability density function. Also, find the distribution function F(x) and
P(1 < X ≤ 2).
Solution
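Since f(x) is a probability density function,
\[ \begin{aligned}
\int_{-\infty}^{\infty} f(x)\,dx &= 1 \\
\int_{-\infty}^{0} f(x)\,dx + \int_{0}^{3} f(x)\,dx + \int_{3}^{\infty} f(x)\,dx &= 1 \\
0 + \int_{0}^{3} kx^{2}\,dx + 0 &= 1 \\
k\left[\frac{x^{3}}{3}\right]_{0}^{3} &= 1 \\
k(9 - 0) &= 1 \\
k &= \frac{1}{9}
\end{aligned} \]
\[ \text{Hence, } f(x) = \begin{cases} \dfrac{1}{9}x^{2} & 0 < x < 3 \\ 0 & \text{otherwise} \end{cases} \]
For 0 < x < 3,
\[ \begin{aligned}
F(x) &= \int_{-\infty}^{x} f(x)\,dx \\
&= \int_{-\infty}^{0} f(x)\,dx + \int_{0}^{x} f(x)\,dx \\
&= 0 + \int_{0}^{x} \frac{1}{9}x^{2}\,dx \\
&= \frac{1}{9}\left[\frac{x^{3}}{3}\right]_{0}^{x} \\
&= \frac{1}{27}x^{3}
\end{aligned} \]
\[ \text{Hence, } F(x) = \begin{cases} 0 & x \le 0 \\ \dfrac{1}{27}x^{3} & 0 < x < 3 \\ 1 & x \ge 3 \end{cases} \]
\[ \begin{aligned}
P(1 < X \le 2) &= \int_{1}^{2} f(x)\,dx \\
&= \int_{1}^{2} \frac{1}{9}x^{2}\,dx \\
&= \frac{1}{9}\left[\frac{x^{3}}{3}\right]_{1}^{2} \\
&= \frac{1}{27}(8 - 1) \\
&= \frac{7}{27}
\end{aligned} \]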
Exercise 2.2
1. Verify whether the following functions are probability density functions:
(i) $f(x) = k e^{-kx}$, x ≥ 0, k > 0
(ii) $f(x) = \frac{1}{2}e^{-|x|}$, –∞ < x < ∞
(iii) $f(x) = \frac{2}{9}x\left(2 - \frac{x}{2}\right)$, 0 ≤ x ≤ 3
[Ans.: (i) Yes (ii) Yes (iii) Yes]
2. Find the value of k if the following are probability density functions:
(i) $f(x) = k(1 + x)$, 2 ≤ x ≤ 5
(ii) $f(x) = k(x - x^{2})$, 0 ≤ x ≤ 1
(iii) $f(x) = kx\,e^{-4x^{2}}$, 0 ≤ x < ∞
(iv) $f(x) = kx\,e^{-x^{2}/4}$, 0 ≤ x < ∞
[Ans.: (i) 2/27 (ii) 6 (iii) 8 (iv) 1/2]
3. A function is defined as
\[ f(x) = \begin{cases} 0 & x < 2 \\ \dfrac{2x + 3}{18} & 2 \le x \le 4 \\ 0 & x > 4 \end{cases} \]
Show that f(x) is a probability density function and find P(2 < X < 3).
[Ans.: 4/9]
4. Let X be a continuous random variable with probability distribution
Ïx
Ô +k 0£x£3
f ( x) = Ì 6
ÔÓ0 otherwise
Find k, and P(1 £ X £ 2).
È 1˘
Í ans.: 1, 3 ˙
Î ˚
5. Find the value of k such that f(x) is a probability density function. Find
also, P(X ≤ 1.5).
\[ f(x) = \begin{cases} kx & 0 \le x \le 1 \\ k & 1 \le x \le 2 \\ k(3 - x) & 2 \le x \le 3 \end{cases} \]
[Ans.: 1/2, 1/2]
6. If X is a continuous random variable whose probability density function is
given by
\[ f(x) = \begin{cases} k(4x - 2x^{2}) & 0 < x < 2 \\ 0 & \text{otherwise} \end{cases} \]
Find (i) the value of k, and (ii) P(X > 1).
[Ans.: (i) 3/8 (ii) 1/2]
7. If a random variable has the probability density function
\[ f(x) = \begin{cases} k(x^{2} - 1) & -1 \le x \le 3 \\ 0 & \text{otherwise} \end{cases} \]
Find (i) the value of k, and (ii) P(1/2 ≤ X ≤ 5/2).
[Ans.: (i) 3/28 (ii) 19/56]
8. The probability density function is
f (x) = k(3x 2 - 1) -1 £ x £ 2
=0 otherwise
Ê 1ˆ Ê 1ˆ
(i) Find P Á X £ ˜ and P Á X > ˜ .
Ë 2¯ Ë 2¯
1
(ii) Find a number k such that P( X £ k) = .
2 7
È ˘
Í ans.: (i) 16 (ii) 0.452˙
Î ˚
11. The distribution function of a random variable X is given by
\[ F(x) = \begin{cases} 1 - e^{-x^{2}} & x > 0 \\ 0 & \text{otherwise} \end{cases} \]
Find the probability density function.
[Ans.: $f(x) = 2x\,e^{-x^{2}}$ for x > 0, and f(x) = 0 otherwise]
Find the pdf and P(1/2 ≤ X ≤ 4/5). [Ans.: 0.195]
13. Find the distribution function corresponding to the following probability
density functions:
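(i) \[ f(x) = \begin{cases} \dfrac{1}{2}x^{2}e^{-x} & 0 \le x < \infty \\ 0 & \text{otherwise} \end{cases} \]
(ii) \[ f(x) = \begin{cases} x & 0 \le x \le 1 \\ 2 - x & 1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases} \]
(iii) \[ f(x) = \begin{cases} \lambda(x - 1)^{4} & 1 \le x \le 3,\ \lambda > 0 \\ 0 & \text{otherwise} \end{cases} \]
Ans.:
(i) \[ F(x) = \begin{cases} 1 - e^{-x}\left(1 + x + \dfrac{x^{2}}{2}\right) & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \]
(ii) \[ F(x) = \begin{cases} 0 & x < 0 \\ \dfrac{x^{2}}{2} & 0 \le x \le 1 \\ 2x - 0.5x^{2} - 1 & 1 \le x \le 2 \\ 1 & x > 2 \end{cases} \]
(iii) \[ \lambda = \frac{5}{32}, \quad F(x) = \begin{cases} 0 & x \le 1 \\ \dfrac{(x - 1)^{5}}{32} & 1 \le x \le 3 \\ 1 & x \ge 3 \end{cases} \]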
14. A continuous random variable X has the following probability density
function
\[ f(x) = \frac{a}{x^{5}}, \quad 2 \le x \le 10 \]
2.7 Two-Dimensional Discrete Random Variables 2.41
For a one-dimensional random variable, the outcome of an experiment has only one
characteristic. In many situations, the outcome of a random experiment depends on
two or more characteristics, e.g., both voltage and current are measured in a certain
experiment.
Let X and Y be two random variables defined on the same sample space S. Then
the function (X, Y) that assigns a point in R² to each outcome is called a
two-dimensional random variable.
A two-dimensional random variable is said to be discrete if it takes at most a
countable number of points in R². When (X, Y) is a two-dimensional discrete random
variable, the possible values of (X, Y) may be represented as (x_i, y_j), i = 1, 2, ..., m, ...;
j = 1, 2, ... , n, ...
(ii) $\sum_{j=1}^{n} \sum_{i=1}^{m} p_{XY}(x_i, y_j) = 1$
Properties of cdf
(i) F(–∞, y) = 0 = F(x, –∞) and F(∞, ∞) = 1
(ii) P(a < X < b, Y ≤ y) = F(b, y) – F(a, y)
(iii) P(X ≤ x, c < Y < d) = F(x, d) – F(x, c)
$= p_{i*}$
and is known as the marginal probability mass function or discrete marginal density
function of X.
Similarly,
\[ p_Y(y_j) = P(Y = y_j) = \sum_{i=1}^{m} p_{ij} = \sum_{i=1}^{m} p(x_i, y_j) = p_{*j} \]
is the marginal probability mass function of Y.
2.7.4 Conditional Probability Function
Let (X, Y) be a two-dimensional discrete random variable. Then the conditional discrete
density function or conditional probability mass function of X, given Y = y, denoted by
$p_{X/Y}(x/y)$, is defined as
\[ p_{X/Y}(x/y) = P(X = x / Y = y) = \frac{P(X = x,\ Y = y)}{P(Y = y)}, \quad \text{provided } P(Y = y) \ne 0. \]
Example 1
From the following table for bivariate distribution of (X, Y), find
(i) P(X ≤ 1), (ii) P(Y ≤ 3), (iii) P(X ≤ 1, Y ≤ 3), (iv) P(X ≤ 1/Y ≤ 3),
(v) P(Y ≤ 3/X ≤ 1), (vi) P(X + Y ≤ 4)
X \ Y 1 2 3 4 5 6
0 0 0 1/32 2/32 2/32 3/32
1 1/16 1/16 1/8 1/8 1/8 1/8
2 1/32 1/32 1/64 1/64 0 2/64
Solution
Marginal distributions
X \ Y 1 2 3 4 5 6 pX(x)
0 0 0 1/32 2/32 2/32 3/32 8/32
1 1/16 1/16 1/8 1/8 1/8 1/8 10/16
2 1/32 1/32 1/64 1/64 0 2/64 8/64
pY(y) 3/32 3/32 11/64 13/64 6/32 16/64 Σp(x) = Σp(y) = 1
(i) P(X ≤ 1) = P(X = 0) + P(X = 1)
= 8/32 + 10/16
= 7/8
(ii) P(Y ≤ 3) = P(Y = 1) + P(Y = 2) + P(Y = 3)
= 3/32 + 3/32 + 11/64
= 23/64
(iii) P(X ≤ 1, Y ≤ 3) = P(X = 0, Y = 1) + P(X = 0, Y = 2) + P(X = 0, Y = 3)
+ P(X = 1, Y = 1) + P(X = 1, Y = 2) + P(X = 1, Y = 3)
= 0 + 0 + 1/32 + 1/16 + 1/16 + 1/8
= 9/32
(iv) P(X ≤ 1/Y ≤ 3) = P(X ≤ 1, Y ≤ 3) / P(Y ≤ 3)
= (9/32) / (23/64)
= 18/23
(v) P(Y ≤ 3/X ≤ 1) = P(X ≤ 1, Y ≤ 3) / P(X ≤ 1)
= (9/32) / (7/8)
= 9/28
(vi) P(X + Y ≤ 4) = P(X = 0, Y = 1) + P(X = 0, Y = 2) + P(X = 0, Y = 3)
+ P(X = 0, Y = 4) + P(X = 1, Y = 1) + P(X = 1, Y = 2)
+ P(X = 1, Y = 3) + P(X = 2, Y = 1) + P(X = 2, Y = 2)
= 0 + 0 + 1/32 + 2/32 + 1/16 + 1/16 + 1/8 + 1/32 + 1/32
= 13/32
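All six parts of this example reduce to summing cells of the joint table that satisfy a condition. The following Python sketch illustrates that idea; it is not part of the original text, the helper name `prob` is an assumption, and the table entries are those read from the joint distribution given in the problem.

```python
from fractions import Fraction as Fr

# joint[x][y - 1] = P(X = x, Y = y) for x = 0, 1, 2 and y = 1, ..., 6
joint = {
    0: [Fr(0), Fr(0), Fr(1, 32), Fr(2, 32), Fr(2, 32), Fr(3, 32)],
    1: [Fr(1, 16), Fr(1, 16), Fr(1, 8), Fr(1, 8), Fr(1, 8), Fr(1, 8)],
    2: [Fr(1, 32), Fr(1, 32), Fr(1, 64), Fr(1, 64), Fr(0), Fr(2, 64)],
}


def prob(event):
    """Sum P(X = x, Y = y) over every cell (x, y) for which event(x, y) is true."""
    return sum(p for x, row in joint.items()
               for y, p in enumerate(row, start=1) if event(x, y))


p_x = prob(lambda x, y: x <= 1)                 # (i)
p_y = prob(lambda x, y: y <= 3)                 # (ii)
p_xy = prob(lambda x, y: x <= 1 and y <= 3)     # (iii)
p_x_given_y = p_xy / p_y                        # (iv)
p_y_given_x = p_xy / p_x                        # (v)
p_sum = prob(lambda x, y: x + y <= 4)           # (vi)

print(p_x, p_y, p_xy, p_x_given_y, p_y_given_x, p_sum)
```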
Example 2
For the following joint distribution of X and Y, find the marginal
distributions:
Y \ X 0 1 2
0 3/28 9/28 3/28
1 3/14 3/14 0
2 1/28 0 0
Solution
Marginal distributions
Y \ X 0 1 2 pY(y)
0 3/28 9/28 3/28 15/28
1 3/14 3/14 0 6/14
2 1/28 0 0 1/28
pX(x) 10/28 15/28 3/28 Σp(x) = Σp(y) = 1
Marginal distributions of X
P(X = 0) = P(X = 0, Y = 0) + P(X = 0, Y = 1) + P(X = 0, Y = 2)
= 3/28 + 3/14 + 1/28
= 10/28
P(X = 1) = P(X = 1, Y = 0) + P(X = 1, Y = 1) + P(X = 1, Y = 2)
= 9/28 + 3/14 + 0
= 15/28
P(X = 2) = P(X = 2, Y = 0) + P(X = 2, Y = 1) + P(X = 2, Y = 2)
= 3/28 + 0 + 0
= 3/28
Marginal distributions of Y
P(Y = 0) = P(X = 0, Y = 0) + P(X = 1, Y = 0) + P(X = 2, Y = 0)
= 3/28 + 9/28 + 3/28
= 15/28
P(Y = 1) = P(X = 0, Y = 1) + P(X = 1, Y = 1) + P(X = 2, Y = 1)
= 3/14 + 3/14 + 0
= 6/14
P(Y = 2) = P(X = 0, Y = 2) + P(X = 1, Y = 2) + P(X = 2, Y = 2)
= 1/28 + 0 + 0
= 1/28
Example 3
The joint distribution of X and Y is given by
f(x, y) = (x + y)/21, x = 1, 2, 3; y = 1, 2
Find the marginal distributions.
Solution
Marginal distributions
Y \ X 1 2 3 pY(y)
1 2/21 3/21 4/21 9/21
2 3/21 4/21 5/21 12/21
pX(x) 5/21 7/21 9/21 Σp(x) = Σp(y) = 1
Marginal distributions of X
P(X = 1) = P(X = 1, Y = 1) + P(X = 1, Y = 2) = 2/21 + 3/21 = 5/21
P(X = 2) = P(X = 2, Y = 1) + P(X = 2, Y = 2) = 3/21 + 4/21 = 7/21
P(X = 3) = P(X = 3, Y = 1) + P(X = 3, Y = 2) = 4/21 + 5/21 = 9/21
Marginal distributions of Y
P(Y = 1) = P(X = 1, Y = 1) + P(X = 2, Y = 1) + P(X = 3, Y = 1)
= 2/21 + 3/21 + 4/21
= 9/21
P(Y = 2) = P(X = 1, Y = 2) + P(X = 2, Y = 2) + P(X = 3, Y = 2)
= 3/21 + 4/21 + 5/21
= 12/21
Example 4
Given is the joint distribution of X and Y
Y \ X 0 1 2
0 0.02 0.08 0.1
1 0.05 0.2 0.25
2 0.03 0.12 0.15
Y \ X 0 1 2 pY(y)
0 0.02 0.08 0.1 0.2
1 0.05 0.2 0.25 0.5
2 0.03 0.12 0.15 0.3
pX(x) 0.1 0.4 0.5 Σp(x) = Σp(y) = 1
Marginal distributions of X
P( X = 0) = P( X = 0, Y = 0) + P( X = 0, Y = 1) + P( X = 0, Y = 2)
= 0.02 + 0.05 + 0.03
= 0.1
P( X = 1) = P( X = 1, Y = 0) + P( X = 1, Y = 1) + P( X = 1, Y = 2)
= 0.08 + 0.2 + 0.12
= 0.4
P( X = 2) = P( X = 2, Y = 0) + P( X = 2, Y = 1) + P( X = 2, Y = 2)
= 0.1 + 0.25 + 0.15
= 0.5
Marginal distributions of Y
P(Y = 0) = P( X = 0, Y = 0) + P( X = 1, Y = 0) + P( X = 2, Y = 0)
= 0.02 + 0.08 + 0.1
= 0.2
P(Y = 1) = P( X = 0, Y = 1) + P ( X = 1, Y = 1) + P( X = 2, Y = 1)
= 0.05 + 0.2 + 0.25
= 0.5
P(Y = 2) = P( X = 0, Y = 2) + P( X = 1, Y = 2) + P( X = 2, Y = 2)
= 0.03 + 0.12 + 0.15
= 0.3
P(X = 0/Y = 0) = P(X = 0, Y = 0) / P(Y = 0) = 0.02/0.2 = 0.1
P(X = 1/Y = 0) = P(X = 1, Y = 0) / P(Y = 0) = 0.08/0.2 = 0.4
P(X = 2/Y = 0) = P(X = 2, Y = 0) / P(Y = 0) = 0.1/0.2 = 0.5
X = x 0 1 2
P(X = x/Y = 0) 0.1 0.4 0.5
Example 5
The joint probability distribution of two random variables X and Y is
given by
P(X = 0, Y = 1) = 1/3, P(X = 1, Y = –1) = 1/3 and P(X = 1, Y = 1) = 1/3.
Find (i) marginal distributions of X and Y and (ii) the conditional
probability distributions of X given Y = 1.
Solution
Marginal distributions
Y \ X –1 0 1 pY(y)
–1 0 0 1/3 1/3
0 0 0 0 0
1 0 1/3 1/3 2/3
pX(x) 0 1/3 2/3 Σp(x) = Σp(y) = 1
Marginal distributions of X
P(X = –1) = P(X = –1, Y = –1) + P(X = –1, Y = 0) + P(X = –1, Y = 1)
= 0
P(X = 0) = P(X = 0, Y = –1) + P(X = 0, Y = 0) + P(X = 0, Y = 1)
= 0 + 0 + 1/3
= 1/3
P(X = 1) = P(X = 1, Y = –1) + P(X = 1, Y = 0) + P(X = 1, Y = 1)
= 1/3 + 0 + 1/3
= 2/3
Marginal distributions of Y
P(Y = –1) = P(X = –1, Y = –1) + P(X = 0, Y = –1) + P(X = 1, Y = –1)
= 0 + 0 + 1/3
= 1/3
P(Y = 0) = P(X = –1, Y = 0) + P(X = 0, Y = 0) + P(X = 1, Y = 0)
= 0
P(Y = 1) = P(X = –1, Y = 1) + P(X = 0, Y = 1) + P(X = 1, Y = 1)
= 0 + 1/3 + 1/3
= 2/3
Conditional probability distribution of X given Y = 1 is
P(X = x/Y = y) = P(X = x, Y = y) / P(Y = y)
P(X = –1/Y = 1) = P(X = –1, Y = 1) / P(Y = 1) = 0
P(X = 0/Y = 1) = P(X = 0, Y = 1) / P(Y = 1) = (1/3)/(2/3) = 1/2
P(X = 1/Y = 1) = P(X = 1, Y = 1) / P(Y = 1) = (1/3)/(2/3) = 1/2
Example 6
If the joint probability mass function of (X, Y) is given by
P(x, y) = k(2x + 3y), x = 0, 1, 2; y = 1, 2, 3
find all the marginal probability distributions. Also, find the probability
distribution of (X + Y).
Solution
P(x, y) = k(2x + 3y)
Marginal distributions
Y \ X 0 1 2 pY(y)
1 3k 5k 7k 15k
2 6k 8k 10k 24k
3 9k 11k 13k 33k
pX(x) 18k 24k 30k 72k
Since the total probability is 1, 72k = 1, i.e., k = 1/72.
Y \ X 0 1 2 pY(y)
1 3/72 5/72 7/72 15/72
2 6/72 8/72 10/72 24/72
3 9/72 11/72 13/72 33/72
pX(x) 18/72 24/72 30/72 1
Probability distribution of (X + Y)
X + Y P
1 p01 = 3/72
2 p02 + p11 = 11/72
3 p03 + p12 + p21 = 24/72
4 p13 + p22 = 21/72
5 p23 = 13/72
Total = 1
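The distribution of X + Y is obtained by grouping the cells of the joint table by the value of x + y. A minimal Python sketch of that grouping follows; the variable names are illustrative assumptions, and k is recovered from the condition that the probabilities sum to 1.

```python
from collections import defaultdict
from fractions import Fraction

# Unnormalised weights 2x + 3y for x = 0, 1, 2 and y = 1, 2, 3.
weights = {(x, y): 2 * x + 3 * y for x in (0, 1, 2) for y in (1, 2, 3)}

k = Fraction(1, sum(weights.values()))      # total probability must be 1

# Add up the probabilities of all cells that share the same value of x + y.
sum_pmf = defaultdict(Fraction)
for (x, y), w in weights.items():
    sum_pmf[x + y] += k * w

for s in sorted(sum_pmf):
    print(s, sum_pmf[s])
```

Note that `Fraction` prints results in lowest terms, so 24/72 appears as 1/3.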
Example 7
Let X and Y have the following joint probability distribution, shown with its
marginal distributions. Examine whether X and Y are independent.
X \ Y 0 1 2 pX(x)
0 0.1 0.04 0.06 0.2
1 0.2 0.08 0.12 0.4
2 0.2 0.08 0.12 0.4
pY(y) 0.5 0.2 0.3 Σp(x) = Σp(y) = 1
Solution
X and Y are independent, if pij = pi* p*j for all i and j.
p0* = 0.1 + 0.04 + 0.06 = 0.2
p1* = 0.2 + 0.08 + 0.12 = 0.4
p2* = 0.2 + 0.08 + 0.12 = 0.4
p*0 = 0.1 + 0.2 + 0.2 = 0.5
p*1 = 0.04 + 0.08 + 0.08 = 0.2
p*2 = 0.06 + 0.12 + 0.12 = 0.3
Now, p0* p*0 = (0.2) (0.5) = 0.1 = p00
p0* p*1 = (0.2) (0.2) = 0.04 = p01
p0* p*2 = (0.2) (0.3) = 0.06 = p02
Similarly, it can be verified that
p1* p*0 = p10 ; p1* p*1 = p11 ; p1* p*2 = p12
p2* p*0 = p20 ; p2* p*1 = p21 ; p2* p*2 = p22
Hence, the random variables X and Y are independent.
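The independence test p_ij = p_i* · p_*j has to hold for every cell, which is tedious by hand but immediate in code. The sketch below is illustrative; the variable names are assumptions, and the probabilities are entered in hundredths so that the comparison is exact.

```python
from fractions import Fraction

# joint[i][j] = P(X = i, Y = j), entered in hundredths to keep the arithmetic exact.
joint = [[Fraction(n, 100) for n in row]
         for row in ([10, 4, 6], [20, 8, 12], [20, 8, 12])]

p_x = [sum(row) for row in joint]            # row sums: marginal of X
p_y = [sum(col) for col in zip(*joint)]      # column sums: marginal of Y

independent = all(joint[i][j] == p_x[i] * p_y[j]
                  for i in range(3) for j in range(3))

print(p_x, p_y, independent)
```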
Exercise 2.3
1. Find the marginal distributions of X and Y from the bivariate distribution
of (X, Y) given below:
Y
1 2
X
1 0.1 0.2
2 0.3 0.4
Ans.:
X = x 1 2
P(X = x) 0.3 0.7
Y = y 1 2
P(Y = y) 0.4 0.6
4 3 2 1 10
1
36 36 36 36 36
1 3 3 2 9
2
36 36 36 36 36
5 1 1 1 8
3
36 36 36 36 36
1 2 1 5 9
4
36 36 36 36 36
11 9 7 9
Total 1
36 36 36 36
10 9 8 9 11 9 7 9
P(X = x) P(Y = y)
36 36 36 36 36 36 36 36
4 1 5 1 1 1 1 2
P(X = x/Y = 1) P(Y = y/X = 2)
11 11 11 11 9 3 3 9
X
0 1 2
Y
1 2
0 0
3 3
2 3 4
1
9 9 9
4 5 6
2
15 15 15
1 1 1
2
8 24 12
1 1
4 0
4 4
1 1 1
6
8 24 12
Find P(X < 4), P(Y > 1), P(X < 4/Y > 1), P(2 £ X £ 5, Y > 1), P(Y = 3/X = 2),
P(X + Y £ 7).
[Ans.: 1/4, 1/2, 1/4, 3/8, 1/6, 19/24]
6. For the following joint probability distribution of X and Y, find (i) marginal
distributions of X and Y, (ii) conditional distributions of X given Y = 2,
(iii) Are X and Y independent?
Y \ X 1 2 3
1 0.1 0.1 0.2
2 0.2 0.3 0.1
Ans.:
(i) X = x 1 2 3
P(X = x) 0.3 0.4 0.3
Y = y 1 2
P(Y = y) 0.4 0.6
(ii) X = x 1 2 3
P(X/Y = 2) 1/3 1/2 1/6
(iii) No
Ans.:
(i) X=x 1 2 Y=y 1 2
P(X = x) 0.3 0.7 P(Y = y) 0.4 0.6
(ii) X=x 1 2
P(X = x) 0.25 0.75
(iii) X+Y 2 3 4
P(X + Y) 0.1 0.5 0.4
Similarly, $f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$ is the marginal probability density function of Y.
Example 1
The joint probability density function of a two dimensional random variable is
\[ f(x, y) = \begin{cases} \dfrac{1}{2}x\,e^{-y} & 0 < x < 2,\ y > 0 \\ 0 & \text{otherwise} \end{cases} \]
Find the cumulative distribution function.
Solution
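The cumulative distribution function is given by
\[ \begin{aligned}
F(x, y) &= \int_{-\infty}^{y} \int_{-\infty}^{x} f(x, y)\,dx\,dy \\
&= \int_{0}^{y} \int_{0}^{x} \frac{1}{2}x\,e^{-y}\,dx\,dy \\
&= \frac{1}{2}\int_{0}^{y} \left[\frac{x^{2}}{2}\right]_{0}^{x} e^{-y}\,dy \\
&= \frac{1}{4}x^{2}\Big[-e^{-y}\Big]_{0}^{y} \\
&= \frac{1}{4}x^{2}(-e^{-y} + e^{0}) \\
&= \frac{1}{4}x^{2}(-e^{-y} + 1)
\end{aligned} \]
\[ F(x, y) = \begin{cases} \dfrac{1}{4}x^{2}(1 - e^{-y}) & 0 < x < 2,\ y > 0 \\ 1 - e^{-y} & x \ge 2,\ y > 0 \\ 0 & \text{otherwise} \end{cases} \]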
2.8 Two-Dimensional Continuous Random Variables 2.59
Example 2
The joint probability density function of a two dimensional random
variable (X, Y) is $f(x, y) = x\,e^{-x(y + 1)}$, x > 0, y > 0. Examine whether the
variables X and Y are independent.
Solution
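\[ \begin{aligned}
f_X(x) &= \int_{-\infty}^{\infty} f(x, y)\,dy \\
&= \int_{0}^{\infty} x\,e^{-x(y+1)}\,dy \\
&= x\left[\frac{e^{-x(y+1)}}{-x}\right]_{0}^{\infty} \\
&= -(e^{-\infty} - e^{-x}) \\
&= e^{-x}, \quad x > 0
\end{aligned} \]
\[ \begin{aligned}
f_Y(y) &= \int_{-\infty}^{\infty} f(x, y)\,dx \\
&= \int_{0}^{\infty} x\,e^{-x(y+1)}\,dx \\
&= \left[x\,\frac{e^{-x(y+1)}}{-(y+1)} - (1)\,\frac{e^{-x(y+1)}}{(y+1)^{2}}\right]_{0}^{\infty} \\
&= \frac{1}{(y+1)^{2}}, \quad y > 0
\end{aligned} \]
\[ f_X(x) \cdot f_Y(y) = e^{-x}\,\frac{1}{(y+1)^{2}} \]
\[ f(x, y) = x\,e^{-x(y+1)} \]
\[ f(x, y) \ne f_X(x) \cdot f_Y(y) \]
Hence, X and Y are not independent.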
Example 3
Two random variables X and Y have the joint pdf
\[ f(x, y) = \begin{cases} A e^{-(2x + y)} & x, y \ge 0 \\ 0 & \text{otherwise} \end{cases} \]
Find (i) A (ii) marginal pdf of X and Y (iii) f(y/x)
Solution
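(i) Since f(x, y) is a pdf,
\[ \begin{aligned}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy &= 1 \\
\int_{0}^{\infty} \int_{0}^{\infty} A e^{-(2x+y)}\,dx\,dy &= 1 \\
A\int_{0}^{\infty} \left[\int_{0}^{\infty} e^{-2x}\,dx\right] e^{-y}\,dy &= 1 \\
A\int_{0}^{\infty} \left[\frac{e^{-2x}}{-2}\right]_{0}^{\infty} e^{-y}\,dy &= 1 \\
\frac{A}{-2}\int_{0}^{\infty} (e^{-\infty} - e^{0})\,e^{-y}\,dy &= 1 \\
\frac{A}{-2}\int_{0}^{\infty} (-1)\,e^{-y}\,dy &= 1 \\
\frac{A}{2}\Big[-e^{-y}\Big]_{0}^{\infty} &= 1 \\
\frac{A}{2}(-e^{-\infty} + e^{0}) &= 1 \\
A &= 2
\end{aligned} \]
\[ \begin{aligned}
\text{(ii)}\quad f_X(x) &= \int_{-\infty}^{\infty} f(x, y)\,dy \\
&= \int_{0}^{\infty} A e^{-(2x+y)}\,dy \\
&= 2\Big[-e^{-(2x+y)}\Big]_{0}^{\infty} \\
&= 2(-e^{-\infty} + e^{-2x}) \\
&= 2e^{-2x}, \quad x \ge 0
\end{aligned} \]
\[ \therefore\ f_X(x) = \begin{cases} 2e^{-2x} & x \ge 0 \\ 0 & x < 0 \end{cases} \]
\[ \begin{aligned}
f_Y(y) &= \int_{-\infty}^{\infty} f(x, y)\,dx \\
&= \int_{0}^{\infty} A e^{-(2x+y)}\,dx \\
&= 2\left[\frac{e^{-(2x+y)}}{-2}\right]_{0}^{\infty} \\
&= -\frac{2}{2}(e^{-\infty} - e^{-y}) \\
&= -(0 - e^{-y}) \\
&= e^{-y}, \quad y \ge 0
\end{aligned} \]
\[ \therefore\ f_Y(y) = \begin{cases} e^{-y} & y \ge 0 \\ 0 & y < 0 \end{cases} \]
\[ \text{(iii)}\quad f(y/x) = \frac{f(x, y)}{f_X(x)} = \frac{2e^{-(2x+y)}}{2e^{-2x}} = e^{-y}, \quad y \ge 0 \]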
Example 4
The joint probability distribution of X and Y is given by
\[ f(x, y) = \begin{cases} \dfrac{6 - x - y}{8} & 0 < x < 2,\ 2 < y < 4 \\ 0 & \text{otherwise} \end{cases} \]
Find f(y/x = 2).
Solution
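\[ f(y/x) = \frac{f(x, y)}{f_X(x)} \]
\[ \begin{aligned}
f_X(x) &= \int_{-\infty}^{\infty} f(x, y)\,dy \\
&= \int_{2}^{4} \frac{6 - x - y}{8}\,dy \\
&= \frac{1}{8}\left[6y - xy - \frac{y^{2}}{2}\right]_{2}^{4} \\
&= \frac{1}{8}\left[(24 - 4x - 8) - (12 - 2x - 2)\right] \\
&= \frac{1}{8}(6 - 2x)
\end{aligned} \]
\[ f(y/x) = \frac{6 - x - y}{6 - 2x} \]
Putting x = 2,
\[ f(y/x = 2) = \frac{4 - y}{2} \]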
Example 5
The joint pdf of a two dimensional variable (X, Y) is given by
\[ f(x, y) = kxy\,e^{-(x^{2} + y^{2})}, \quad x > 0,\ y > 0. \]
Find the value of k and prove that X and Y are independent.
Solution
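Since f(x, y) is a pdf,
\[ \begin{aligned}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy &= 1 \\
\int_{0}^{\infty} \int_{0}^{\infty} kxy\,e^{-(x^{2}+y^{2})}\,dx\,dy &= 1 \\
k\int_{0}^{\infty} y\,e^{-y^{2}}\,dy \cdot \int_{0}^{\infty} x\,e^{-x^{2}}\,dx &= 1 \qquad \ldots(1)
\end{aligned} \]
Putting $x^{2} = t$, $x = \sqrt{t}$, $dx = \dfrac{1}{2\sqrt{t}}\,dt$
When x = 0, t = 0
When x = ∞, t = ∞
\[ \begin{aligned}
\int_{0}^{\infty} x\,e^{-x^{2}}\,dx &= \int_{0}^{\infty} \sqrt{t}\,e^{-t}\,\frac{1}{2\sqrt{t}}\,dt \\
&= \frac{1}{2}\Big[-e^{-t}\Big]_{0}^{\infty} \\
&= \frac{1}{2}(-e^{-\infty} + e^{0}) \\
&= \frac{1}{2}
\end{aligned} \]
\[ \text{Similarly, } \int_{0}^{\infty} y\,e^{-y^{2}}\,dy = \frac{1}{2} \]
Putting both integral values in Eq. (1),
\[ k \cdot \frac{1}{2} \cdot \frac{1}{2} = 1 \]
\[ k = 4 \]
If X and Y are independent,
\[ f_X(x) \cdot f_Y(y) = f(x, y) \]
\[ \begin{aligned}
f_X(x) &= \int_{-\infty}^{\infty} f(x, y)\,dy \\
&= \int_{0}^{\infty} kxy\,e^{-(x^{2}+y^{2})}\,dy \\
&= kx\,e^{-x^{2}} \int_{0}^{\infty} y\,e^{-y^{2}}\,dy \\
&= 4x\,e^{-x^{2}} \cdot \frac{1}{2} \\
&= 2x\,e^{-x^{2}}, \quad x > 0
\end{aligned} \]
\[ \begin{aligned}
f_Y(y) &= \int_{-\infty}^{\infty} f(x, y)\,dx \\
&= \int_{0}^{\infty} kxy\,e^{-(x^{2}+y^{2})}\,dx \\
&= ky\,e^{-y^{2}} \int_{0}^{\infty} x\,e^{-x^{2}}\,dx \\
&= 4y\,e^{-y^{2}} \cdot \frac{1}{2} \\
&= 2y\,e^{-y^{2}}, \quad y > 0
\end{aligned} \]
\[ \begin{aligned}
f_X(x) \cdot f_Y(y) &= 2x\,e^{-x^{2}} \cdot 2y\,e^{-y^{2}}, \quad x, y > 0 \\
&= 4xy\,e^{-(x^{2}+y^{2})} \\
&= f(x, y), \quad x > 0,\ y > 0
\end{aligned} \]
Hence, X and Y are independent.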
Example 6
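The joint probability density function of a two dimensional random variable (X, Y) is
\[ f(x, y) = \begin{cases} kx(x - y), & 0 < x < 2,\ -x < y < x \\ 0, & \text{otherwise} \end{cases} \]
Find (i) k, (ii) f_X(x), (iii) f_Y(y), and (iv) f(y/x).
Solution
(i) Since f(x, y) is a pdf,
\[ \int_0^2\int_{-x}^{x} kx(x - y)\,dy\,dx = 1 \]
\[ k\int_0^2 x\left[xy - \frac{y^2}{2}\right]_{-x}^{x} dx = 1 \]
\[ k\int_0^2 x\left[\left(x^2 - \frac{x^2}{2}\right) - \left(-x^2 - \frac{x^2}{2}\right)\right] dx = 1 \]
\[ k\int_0^2 2x^3\,dx = 1 \]
\[ k\left[\frac{2x^4}{4}\right]_0^2 = 1 \]
\[ k(8) = 1 \]
\[ k = \frac{1}{8} \]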
(ii) The region of integration is ΔOAB (Fig. 2.1).
In ΔOAB, along vertical strip RS,
Limits of y : y = –x to y = x
and x varies from x = 0 to x = 2.
\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_{-x}^{x} kx(x - y)\,dy = kx\left[xy - \frac{y^2}{2}\right]_{-x}^{x} \]
\[ = kx\left[\left(x^2 - \frac{x^2}{2}\right) - \left(-x^2 - \frac{x^2}{2}\right)\right] = kx(2x^2) = \frac{1}{8}(2x^3) = \frac{x^3}{4}, \quad 0 < x < 2 \]
(iii) For limits of x, ΔOAB is divided into two parts, ΔOBC and ΔOAC.
In ΔOBC, along horizontal strip PQ,
Limits of x: x = –y to x = 2 and y varies from y = –2 to y = 0.
In ΔOAC, along horizontal strip P′Q′,
Limits of x: x = y to x = 2 and y varies from y = 0 to y = 2.
\[ f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \begin{cases} \displaystyle\int_{-y}^{2} kx(x - y)\,dx, & -2 \le y \le 0 \\ \displaystyle\int_{y}^{2} kx(x - y)\,dx, & 0 \le y \le 2 \end{cases} \]
Now,
\[ \int_{-y}^{2} kx(x - y)\,dx = k\left[\frac{x^3}{3} - \frac{x^2 y}{2}\right]_{-y}^{2} = k\left[\left(\frac{8}{3} - 2y\right) - \left(-\frac{y^3}{3} - \frac{y^3}{2}\right)\right] = \frac{1}{8}\left(\frac{8}{3} - 2y + \frac{5y^3}{6}\right) = \frac{1}{3} - \frac{y}{4} + \frac{5}{48}y^3 \]
Also,
\[ \int_{y}^{2} kx(x - y)\,dx = k\left[\frac{x^3}{3} - \frac{x^2 y}{2}\right]_{y}^{2} = k\left[\left(\frac{8}{3} - 2y\right) - \left(\frac{y^3}{3} - \frac{y^3}{2}\right)\right] = \frac{1}{8}\left(\frac{8}{3} - 2y + \frac{y^3}{6}\right) = \frac{1}{3} - \frac{y}{4} + \frac{y^3}{48} \]
Hence,
\[ f_Y(y) = \begin{cases} \dfrac{1}{3} - \dfrac{y}{4} + \dfrac{5}{48}y^3, & -2 \le y \le 0 \\[2mm] \dfrac{1}{3} - \dfrac{y}{4} + \dfrac{y^3}{48}, & 0 \le y \le 2 \end{cases} \]
(iv)
\[ f(y/x) = \frac{f(x, y)}{f_X(x)} = \frac{\frac{1}{8}x(x - y)}{\frac{x^3}{4}} = \frac{x - y}{2x^2}, \quad -x < y < x \]
Example 7
The joint pdf of a two-dimensional random variable (X, Y) is given by
\[ f_{XY}(x, y) = \begin{cases} \dfrac{8}{9}xy, & 1 \le y \le 2,\ 1 \le x \le y \\ 0, & \text{otherwise} \end{cases} \]
Find the marginal density function of X and Y.
Solution
The region of integration is ΔABC (Fig. 2.2).
In ΔABC, along vertical strip PQ,
limits of y: y = x to y = 2
and x varies from x = 1 to x = 2.
Marginal density function of X is
\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_x^2 \frac{8}{9}xy\,dy = \frac{8}{9}x\left[\frac{y^2}{2}\right]_x^2 = \frac{4}{9}x(4 - x^2), \quad 1 \le x \le 2 \]
In ΔABC, along horizontal strip P′Q′, limits of x : x = 1 to x = y and y varies from
y = 1 to y = 2.
Marginal density function of Y is
\[ f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_1^y \frac{8}{9}xy\,dx = \frac{8}{9}y\left[\frac{x^2}{2}\right]_1^y = \frac{4}{9}y(y^2 - 1), \quad 1 \le y \le 2 \]
Example 8
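If the joint distribution function of X and Y is given by
\[ F(x, y) = \begin{cases} (1 - e^{-x})(1 - e^{-y}), & x > 0,\ y > 0 \\ 0, & \text{otherwise} \end{cases} \]
(i) Find f_X(x) and f_Y(y). (ii) Are X and Y independent? (iii) Find P(1 < X < 3, 1 < Y < 2).
Solution
\[ F(x, y) = (1 - e^{-x})(1 - e^{-y}) \]
The joint pdf is given by
\[ \begin{aligned} f(x, y) &= \frac{\partial^2 F}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial F}{\partial y}\right) \\ &= \frac{\partial}{\partial x}\left[\frac{\partial}{\partial y}(1 - e^{-x})(1 - e^{-y})\right] \\ &= \frac{\partial}{\partial x}\left[(1 - e^{-x})\,e^{-y}\right] \\ &= e^{-x}e^{-y} = e^{-(x + y)}, \quad x > 0,\ y > 0 \end{aligned} \]
\[ \therefore\ f(x, y) = \begin{cases} e^{-(x + y)}, & x > 0,\ y > 0 \\ 0, & \text{otherwise} \end{cases} \]
(i)
\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^{\infty} e^{-(x + y)}\,dy = \left[-e^{-(x + y)}\right]_0^{\infty} = (-e^{-\infty} + e^{-x}) = e^{-x}, \quad x > 0 \]
\[ f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_0^{\infty} e^{-(x + y)}\,dx = \left[-e^{-(x + y)}\right]_0^{\infty} = (-e^{-\infty} + e^{-y}) = e^{-y}, \quad y > 0 \]
(ii)
\[ f_X(x)\cdot f_Y(y) = e^{-x}\cdot e^{-y} = e^{-(x + y)} = f(x, y), \quad x > 0,\ y > 0 \]
Hence, X and Y are independent.
(iii) Since X and Y are independent,
\[ \begin{aligned} P(1 < X < 3,\ 1 < Y < 2) &= P(1 < X < 3)\cdot P(1 < Y < 2) \\ &= \int_1^3 f_X(x)\,dx \cdot \int_1^2 f_Y(y)\,dy \\ &= \int_1^3 e^{-x}\,dx \cdot \int_1^2 e^{-y}\,dy \\ &= \left[-e^{-x}\right]_1^3 \cdot \left[-e^{-y}\right]_1^2 \\ &= (-e^{-3} + e^{-1})\cdot(-e^{-2} + e^{-1}) \\ &= e^{-5} - e^{-4} - e^{-3} + e^{-2} \end{aligned} \]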
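As a cross-check of part (iii), the same probability can be read directly off the joint distribution function, since P(x1 < X < x2, y1 < Y < y2) = F(x2, y2) − F(x1, y2) − F(x2, y1) + F(x1, y1). The sketch below assumes nothing beyond the F(x, y) of Example 8; the function names are illustrative only.

```python
import math


def F(x, y):
    """Joint distribution function of Example 8."""
    if x > 0 and y > 0:
        return (1 - math.exp(-x)) * (1 - math.exp(-y))
    return 0.0


def rectangle_probability(x1, x2, y1, y2):
    """P(x1 < X < x2, y1 < Y < y2) by inclusion-exclusion on F."""
    return F(x2, y2) - F(x1, y2) - F(x2, y1) + F(x1, y1)


p_rect = rectangle_probability(1, 3, 1, 2)
# Product of the two marginal probabilities, as used in the worked solution.
p_product = (math.exp(-1) - math.exp(-3)) * (math.exp(-1) - math.exp(-2))
print(p_rect, p_product)  # the two values should agree
```

Both routes give roughly 0.074, which is a quick way to confirm the signs in the expanded exponential form.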
Example 9
The joint probability density of two random variables is given by
\[ f(x, y) = \begin{cases} 15e^{-3x - 5y}, & x > 0,\ y > 0 \\ 0, & \text{elsewhere} \end{cases} \]
Find (i) P(1 < X < 2, 0.2 < Y < 0.3) (ii) P(X < 2, Y > 0.2) (iii) marginal
probability density functions of X and Y.
Solution
(i)
\[ \begin{aligned} P(1 < X < 2,\ 0.2 < Y < 0.3) &= \int_{0.2}^{0.3}\int_1^2 f(x, y)\,dx\,dy = \int_{0.2}^{0.3}\int_1^2 15e^{-3x - 5y}\,dx\,dy \\ &= 15\int_{0.2}^{0.3} e^{-5y}\left[\int_1^2 e^{-3x}\,dx\right] dy = 15\int_{0.2}^{0.3} e^{-5y}\left[\frac{e^{-3x}}{-3}\right]_1^2 dy \\ &= -5\int_{0.2}^{0.3} e^{-5y}(e^{-6} - e^{-3})\,dy = -5(e^{-6} - e^{-3})\left[\frac{e^{-5y}}{-5}\right]_{0.2}^{0.3} \\ &= (e^{-6} - e^{-3})(e^{-1.5} - e^{-1.0}) \\ &= 6.85 \times 10^{-3} \end{aligned} \]
(ii)
\[ \begin{aligned} P(X < 2,\ Y > 0.2) &= \int_{0.2}^{\infty}\int_0^2 f(x, y)\,dx\,dy = \int_{0.2}^{\infty}\int_0^2 15e^{-3x - 5y}\,dx\,dy \\ &= 15\int_{0.2}^{\infty}\left[\int_0^2 e^{-3x}\,dx\right] e^{-5y}\,dy = 15\int_{0.2}^{\infty}\left[\frac{e^{-3x}}{-3}\right]_0^2 e^{-5y}\,dy \\ &= -5\int_{0.2}^{\infty} (e^{-6} - 1)\,e^{-5y}\,dy = -5(e^{-6} - 1)\left[\frac{e^{-5y}}{-5}\right]_{0.2}^{\infty} \\ &= (e^{-6} - 1)(e^{-\infty} - e^{-1.0}) = (e^{-6} - 1)(-e^{-1.0}) \\ &= 0.367 \end{aligned} \]
(iii) The region of integration is the first quadrant (Fig. 2.3).
Hence, x and y both vary from 0 to ∞.
\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^{\infty} 15e^{-3x - 5y}\,dy = 15e^{-3x}\left[\frac{e^{-5y}}{-5}\right]_0^{\infty} = -3e^{-3x}(e^{-\infty} - e^{0}) = 3e^{-3x}, \quad x > 0 \]
\[ f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_0^{\infty} 15e^{-3x - 5y}\,dx = 15e^{-5y}\left[\frac{e^{-3x}}{-3}\right]_0^{\infty} = -5e^{-5y}(e^{-\infty} - e^{0}) = 5e^{-5y}, \quad y > 0 \]
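Because the density of Example 9 factors as (3e^{-3x})(5e^{-5y}), every rectangle probability is a product of two exponential interval probabilities. The short sketch below evaluates parts (i) and (ii) that way; `exp_interval` is a helper name introduced here for illustration.

```python
import math


def exp_interval(rate, lo, hi):
    """P(lo < T < hi) for the density rate * exp(-rate * t), t > 0."""
    return math.exp(-rate * lo) - math.exp(-rate * hi)


# f(x, y) = 15 e^{-3x - 5y} = (3 e^{-3x}) (5 e^{-5y}), so rectangle probabilities factor.
p_i = exp_interval(3, 1, 2) * exp_interval(5, 0.2, 0.3)        # part (i)
p_ii = exp_interval(3, 0, 2) * exp_interval(5, 0.2, math.inf)  # part (ii)
print(p_i, p_ii)
```

The factorisation is also exactly the statement that X and Y are independent, with marginals 3e^{-3x} and 5e^{-5y}.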
Example 10
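The joint pdf of (X, Y) is given by
\[ f(x, y) = \frac{1}{4}e^{-|x| - |y|}, \quad -\infty < x < \infty,\ -\infty < y < \infty \]
(i) Are X and Y independent?
(ii) Find the probability that X ≤ 1 and Y < 0.
Solution
\[ |x| = \begin{cases} -x, & -\infty < x \le 0 \\ x, & 0 \le x < \infty \end{cases} \]
Similarly,
\[ |y| = \begin{cases} -y, & -\infty < y \le 0 \\ y, & 0 \le y < \infty \end{cases} \]
\[ \therefore\ f(x, y) = \frac{1}{4}e^{-|x| - |y|} = \begin{cases} \dfrac{1}{4}e^{x + y}, & -\infty < x \le 0,\ -\infty < y \le 0 \\[2mm] \dfrac{1}{4}e^{-x - y}, & 0 \le x < \infty,\ 0 \le y < \infty \end{cases} \]
(i)
\[ \begin{aligned} f_X(x) &= \int_{-\infty}^{\infty} f(x, y)\,dy = \int_{-\infty}^{\infty} \frac{1}{4}e^{-|x| - |y|}\,dy = \frac{1}{4}e^{-|x|}\int_{-\infty}^{\infty} e^{-|y|}\,dy \\ &= \frac{1}{4}e^{-|x|}\left[\int_{-\infty}^{0} e^{y}\,dy + \int_0^{\infty} e^{-y}\,dy\right] = \frac{1}{4}e^{-|x|}\left[\left[e^{y}\right]_{-\infty}^{0} + \left[-e^{-y}\right]_0^{\infty}\right] \\ &= \frac{1}{4}e^{-|x|}(1 + 1) = \frac{1}{2}e^{-|x|}, \quad -\infty < x < \infty \end{aligned} \]
\[ \begin{aligned} f_Y(y) &= \int_{-\infty}^{\infty} f(x, y)\,dx = \int_{-\infty}^{\infty} \frac{1}{4}e^{-|x| - |y|}\,dx = \frac{1}{4}e^{-|y|}\int_{-\infty}^{\infty} e^{-|x|}\,dx \\ &= \frac{1}{4}e^{-|y|}\left[\int_{-\infty}^{0} e^{x}\,dx + \int_0^{\infty} e^{-x}\,dx\right] = \frac{1}{4}e^{-|y|}\left[\left[e^{x}\right]_{-\infty}^{0} + \left[-e^{-x}\right]_0^{\infty}\right] \\ &= \frac{1}{4}e^{-|y|}(1 + 1) = \frac{1}{2}e^{-|y|}, \quad -\infty < y < \infty \end{aligned} \]
\[ f_X(x)\cdot f_Y(y) = \frac{1}{2}e^{-|x|}\cdot\frac{1}{2}e^{-|y|} = \frac{1}{4}e^{-|x| - |y|} = f(x, y), \quad -\infty < x < \infty,\ -\infty < y < \infty \]
Hence, X and Y are independent.
(ii)
\[ \begin{aligned} P(X \le 1,\ Y < 0) &= \int_{-\infty}^{0}\int_{-\infty}^{1} f(x, y)\,dx\,dy = \int_{-\infty}^{0}\int_{-\infty}^{1} \frac{1}{4}e^{-|x| - |y|}\,dx\,dy \\ &= \frac{1}{4}\int_{-\infty}^{0} e^{-|y|}\left[\int_{-\infty}^{0} e^{x}\,dx + \int_0^{1} e^{-x}\,dx\right] dy \\ &= \frac{1}{4}\int_{-\infty}^{0} e^{-|y|}\left[\left[e^{x}\right]_{-\infty}^{0} + \left[-e^{-x}\right]_0^{1}\right] dy \\ &= \frac{1}{4}\int_{-\infty}^{0} e^{-|y|}(1 - e^{-1} + 1)\,dy \\ &= \frac{1}{4}(2 - e^{-1})\int_{-\infty}^{0} e^{y}\,dy = \frac{1}{4}(2 - e^{-1})\left[e^{y}\right]_{-\infty}^{0} \\ &= \frac{1}{4}(2 - e^{-1})(1) = \frac{1}{4}(2 - e^{-1}) \end{aligned} \]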
Example 11
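The joint pdf of (X, Y) is given by
\[ f(x, y) = \begin{cases} ke^{-x}\cos y, & 0 \le x \le 2,\ 0 \le y \le \dfrac{\pi}{2} \\ 0, & \text{otherwise} \end{cases} \]
Find (i) k (ii) $P\left(X + Y \ge \dfrac{\pi}{2}\right)$.
Solution
(i) Since f(x, y) is a pdf,
\[ \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1 \]
\[ \int_0^{\pi/2}\int_0^2 ke^{-x}\cos y\,dx\,dy = 1 \]
\[ k\int_0^{\pi/2} \cos y\left[-e^{-x}\right]_0^2 dy = 1 \]
\[ k\int_0^{\pi/2} \cos y\,(-e^{-2} + 1)\,dy = 1 \]
\[ k(1 - e^{-2})\left[\sin y\right]_0^{\pi/2} = 1 \]
\[ k(1 - e^{-2})(1) = 1 \]
\[ k = \frac{1}{1 - e^{-2}} \]
(ii)
\[ P\left(X + Y \ge \frac{\pi}{2}\right) = 1 - P\left(X + Y < \frac{\pi}{2}\right) \]
The region of integration $x + y < \dfrac{\pi}{2}$ is ΔOAB (Fig. 2.4), with $A\left(\dfrac{\pi}{2}, 0\right)$ and $B\left(0, \dfrac{\pi}{2}\right)$ on the line $x + y = \dfrac{\pi}{2}$. In ΔOAB, along horizontal strip P′Q′,
Limits of x : x = 0 to $x = \dfrac{\pi}{2} - y$
Limits of y : y = 0 to $y = \dfrac{\pi}{2}$
\[ \begin{aligned} P\left(X + Y < \frac{\pi}{2}\right) &= \int_0^{\pi/2}\int_0^{\frac{\pi}{2} - y} ke^{-x}\cos y\,dx\,dy \\ &= k\int_0^{\pi/2} \cos y\left[-e^{-x}\right]_0^{\frac{\pi}{2} - y} dy \\ &= k\int_0^{\pi/2} \cos y\left[-e^{-\left(\frac{\pi}{2} - y\right)} + e^{0}\right] dy \\ &= k\int_0^{\pi/2} \cos y\left[-e^{-\frac{\pi}{2}}e^{y} + 1\right] dy \\ &= k\left[-e^{-\frac{\pi}{2}}\int_0^{\pi/2} e^{y}\cos y\,dy + \int_0^{\pi/2} \cos y\,dy\right] \\ &= k\left[-e^{-\frac{\pi}{2}}\left[\frac{e^{y}}{1 + 1}(\cos y + \sin y)\right]_0^{\pi/2} + \left[\sin y\right]_0^{\pi/2}\right] \\ &= k\left[-e^{-\frac{\pi}{2}}\left\{\frac{e^{\frac{\pi}{2}}}{2}\left(\cos\frac{\pi}{2} + \sin\frac{\pi}{2}\right) - \frac{1}{2}\right\} + \sin\frac{\pi}{2}\right] \\ &= k\left[-e^{-\frac{\pi}{2}}\left\{\frac{e^{\frac{\pi}{2}}}{2}(1) - \frac{1}{2}\right\} + 1\right] \\ &= k\left(-\frac{1}{2} + \frac{e^{-\frac{\pi}{2}}}{2} + 1\right) \\ &= \frac{k}{2}\left(1 + e^{-\frac{\pi}{2}}\right) \end{aligned} \]
\[ P\left(X + Y \ge \frac{\pi}{2}\right) = 1 - \frac{k}{2}\left(1 + e^{-\frac{\pi}{2}}\right) = 1 - \frac{1 + e^{-\frac{\pi}{2}}}{2(1 - e^{-2})} \]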
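The closed form obtained in Example 11 can be compared against a brute-force numerical estimate. The sketch below is an illustration under stated assumptions: it uses a plain midpoint grid over the support rectangle (grid size `n` is an arbitrary choice) and the function names are hypothetical.

```python
import math

K = 1 / (1 - math.exp(-2))  # normalising constant k from part (i)


def density(x, y):
    """Joint pdf of Example 11 on 0 <= x <= 2, 0 <= y <= pi/2."""
    if 0 <= x <= 2 and 0 <= y <= math.pi / 2:
        return K * math.exp(-x) * math.cos(y)
    return 0.0


def prob_sum_at_least(c, n=800):
    """Midpoint-grid estimate of P(X + Y >= c) over the support rectangle."""
    hx, hy = 2 / n, (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        for j in range(n):
            y = (j + 0.5) * hy
            if x + y >= c:  # keep only cells whose centre lies in the event
                total += density(x, y)
    return total * hx * hy


closed_form = 1 - (1 + math.exp(-math.pi / 2)) / (2 * (1 - math.exp(-2)))
estimate = prob_sum_at_least(math.pi / 2)
print(closed_form, estimate)  # the estimate should be close to the closed form
```

The grid estimate is only approximate near the line x + y = π/2, so agreement to two or three decimal places is what one should expect from it.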
Example 12
The joint p.d.f of a two-dimensional random variable (X, Y) is given by
\[ f(x, y) = \begin{cases} \dfrac{1}{8}(6 - x - y), & 0 < x < 2,\ 2 < y < 4 \\ 0, & \text{otherwise} \end{cases} \]
Find (i) P(X < 1, Y < 3) (ii) P(X < 1/Y < 3).
Solution
(i)
\[ \begin{aligned} P(X < 1,\ Y < 3) &= \int_2^3\int_0^1 f(x, y)\,dx\,dy = \int_2^3\int_0^1 \frac{1}{8}(6 - x - y)\,dx\,dy \\ &= \frac{1}{8}\int_2^3 \left[6x - \frac{x^2}{2} - xy\right]_0^1 dy = \frac{1}{8}\int_2^3 \left(6 - \frac{1}{2} - y\right) dy \\ &= \frac{1}{8}\int_2^3 \left(\frac{11}{2} - y\right) dy = \frac{1}{8}\left[\frac{11}{2}y - \frac{y^2}{2}\right]_2^3 \\ &= \frac{1}{8}\left[\left(\frac{33}{2} - \frac{9}{2}\right) - (11 - 2)\right] = \frac{3}{8} \end{aligned} \]
(ii)
\[ P(X < 1/Y < 3) = \frac{P(X < 1,\ Y < 3)}{P(Y < 3)} \qquad \ldots(1) \]
\[ \begin{aligned} P(Y < 3) &= \int_2^3\int_0^2 f(x, y)\,dx\,dy = \int_2^3\int_0^2 \frac{1}{8}(6 - x - y)\,dx\,dy \\ &= \frac{1}{8}\int_2^3 \left[6x - \frac{x^2}{2} - xy\right]_0^2 dy = \frac{1}{8}\int_2^3 (12 - 2 - 2y)\,dy \\ &= \frac{1}{8}\int_2^3 (10 - 2y)\,dy = \frac{1}{8}\left[10y - y^2\right]_2^3 \\ &= \frac{1}{8}\left[(30 - 9) - (20 - 4)\right] = \frac{5}{8} \end{aligned} \]
Substituting in Eq. (1),
\[ P(X < 1/Y < 3) = \frac{3/8}{5/8} = \frac{3}{5} \]
Example 13
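The joint pdf of a two-dimensional random variable (X, Y) is given by
\[ f_{XY}(x, y) = \begin{cases} xy^2 + \dfrac{x^2}{8}, & 0 < x < 2,\ 0 < y < 1 \\ 0, & \text{otherwise} \end{cases} \]
Find (i) P(X > 1) (ii) $P\left(Y < \frac{1}{2}\right)$ (iii) $P\left(X > 1 / Y < \frac{1}{2}\right)$
(iv) $P\left(Y < \frac{1}{2} / X > 1\right)$ (v) P(X < Y) (vi) P(X + Y ≤ 1)
Solution
(i) (Fig. 2.5)
\[ \begin{aligned} P(X > 1) &= \int_0^1\int_1^2 f(x, y)\,dx\,dy = \int_0^1\int_1^2 \left(xy^2 + \frac{x^2}{8}\right) dx\,dy = \int_0^1 \left[\frac{x^2 y^2}{2} + \frac{x^3}{24}\right]_1^2 dy \\ &= \int_0^1 \left[\left(2y^2 + \frac{1}{3}\right) - \left(\frac{y^2}{2} + \frac{1}{24}\right)\right] dy = \int_0^1 \left(\frac{3y^2}{2} + \frac{7}{24}\right) dy \\ &= \left[\frac{y^3}{2} + \frac{7y}{24}\right]_0^1 = \frac{1}{2} + \frac{7}{24} = \frac{19}{24} \end{aligned} \]
(ii) (Fig. 2.6)
\[ \begin{aligned} P\left(Y < \frac{1}{2}\right) &= \int_0^{1/2}\int_0^2 f(x, y)\,dx\,dy = \int_0^{1/2}\int_0^2 \left(xy^2 + \frac{x^2}{8}\right) dx\,dy = \int_0^{1/2} \left[\frac{x^2 y^2}{2} + \frac{x^3}{24}\right]_0^2 dy \\ &= \int_0^{1/2} \left(2y^2 + \frac{1}{3}\right) dy = \left[\frac{2y^3}{3} + \frac{1}{3}y\right]_0^{1/2} = \frac{1}{12} + \frac{1}{6} = \frac{1}{4} \end{aligned} \]
(iii) (Fig. 2.7)
\[ \begin{aligned} P\left(X > 1,\ Y < \frac{1}{2}\right) &= \int_0^{1/2}\int_1^2 f(x, y)\,dx\,dy = \int_0^{1/2}\int_1^2 \left(xy^2 + \frac{x^2}{8}\right) dx\,dy = \int_0^{1/2} \left[\frac{x^2 y^2}{2} + \frac{x^3}{24}\right]_1^2 dy \\ &= \int_0^{1/2} \left[\left(2y^2 + \frac{1}{3}\right) - \left(\frac{y^2}{2} + \frac{1}{24}\right)\right] dy = \int_0^{1/2} \left(\frac{3y^2}{2} + \frac{7}{24}\right) dy \\ &= \left[\frac{y^3}{2} + \frac{7y}{24}\right]_0^{1/2} = \frac{1}{16} + \frac{7}{48} = \frac{5}{24} \end{aligned} \]
\[ P\left(X > 1 / Y < \frac{1}{2}\right) = \frac{P\left(X > 1,\ Y < \frac{1}{2}\right)}{P\left(Y < \frac{1}{2}\right)} = \frac{5/24}{1/4} = \frac{5}{6} \]
(iv)
\[ P\left(Y < \frac{1}{2} / X > 1\right) = \frac{P\left(X > 1,\ Y < \frac{1}{2}\right)}{P(X > 1)} = \frac{5/24}{19/24} = \frac{5}{19} \]
(v)
\[ \begin{aligned} P(X < Y) &= \int_0^1\int_0^y f(x, y)\,dx\,dy = \int_0^1\int_0^y \left(xy^2 + \frac{x^2}{8}\right) dx\,dy = \int_0^1 \left[\frac{x^2 y^2}{2} + \frac{x^3}{24}\right]_0^y dy \\ &= \int_0^1 \left(\frac{y^4}{2} + \frac{y^3}{24}\right) dy = \left[\frac{y^5}{10} + \frac{y^4}{96}\right]_0^1 = \frac{1}{10} + \frac{1}{96} = \frac{53}{480} \end{aligned} \]
(vi)
\[ \begin{aligned} P(X + Y \le 1) &= \int_0^1\int_0^{1 - y} f(x, y)\,dx\,dy = \int_0^1\int_0^{1 - y} \left(xy^2 + \frac{x^2}{8}\right) dx\,dy = \int_0^1 \left[\frac{x^2 y^2}{2} + \frac{x^3}{24}\right]_0^{1 - y} dy \\ &= \int_0^1 \left\{\frac{(1 - y)^2 y^2}{2} + \frac{(1 - y)^3}{24}\right\} dy = \int_0^1 \left\{\frac{(1 - 2y + y^2)\,y^2}{2} + \frac{(1 - y)^3}{24}\right\} dy \\ &= \int_0^1 \left\{\frac{1}{2}(y^2 - 2y^3 + y^4) + \frac{1}{24}(1 - y)^3\right\} dy \\ &= \left[\frac{1}{2}\left(\frac{y^3}{3} - \frac{y^4}{2} + \frac{y^5}{5}\right) + \frac{1}{24}\,\frac{(1 - y)^4}{(-4)}\right]_0^1 \\ &= \frac{1}{2}\left(\frac{1}{3} - \frac{1}{2} + \frac{1}{5}\right) + \frac{1}{24}\cdot\frac{1}{4} = \frac{13}{480} \end{aligned} \]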
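All six parts of Example 13 reduce to one inner integral in x followed by an outer integral in y. A minimal sketch of that structure is given below; it uses the antiderivative x²y²/2 + x³/24 for the inner step and a composite Simpson rule (an assumption of this sketch, chosen for convenience) for the outer step. The names `inner` and `simpson` are illustrative.

```python
def inner(x_lo, x_hi, y):
    """Integral of x*y**2 + x**2/8 with respect to x from x_lo to x_hi."""
    def G(x):
        return x**2 * y**2 / 2 + x**3 / 24  # antiderivative in x
    return G(x_hi) - G(x_lo)


def simpson(g, lo, hi, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    s = g(lo) + g(hi)
    s += 4 * sum(g(lo + i * h) for i in range(1, n, 2))
    s += 2 * sum(g(lo + i * h) for i in range(2, n, 2))
    return s * h / 3


p_x_gt_1 = simpson(lambda y: inner(1, 2, y), 0, 1)        # (i)
p_y_lt_half = simpson(lambda y: inner(0, 2, y), 0, 0.5)   # (ii)
p_both = simpson(lambda y: inner(1, 2, y), 0, 0.5)        # numerator of (iii) and (iv)
p_x_lt_y = simpson(lambda y: inner(0, y, y), 0, 1)        # (v)
p_sum_le_1 = simpson(lambda y: inner(0, 1 - y, y), 0, 1)  # (vi)

print(p_x_gt_1, p_y_lt_half, p_both / p_y_lt_half, p_both / p_x_gt_1)
print(p_x_lt_y, p_sum_le_1)
```

Only the x-limits change from part to part, which is why the same two helpers cover every case.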
Example 14
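The joint pdf of a two dimensional random variable (X, Y) is given by
\[ f(x, y) = \frac{1}{2\pi a^2}\,e^{-\frac{x^2 + y^2}{2a^2}}, \quad -\infty < x, y < \infty \]
Find $P(X^2 + Y^2 \le 4)$.
Solution
\[ P(X^2 + Y^2 \le 4) = \iint_{x^2 + y^2 \le 4} f(x, y)\,dx\,dy = \iint_{x^2 + y^2 \le 4} \frac{1}{2\pi a^2}\,e^{-\frac{x^2 + y^2}{2a^2}}\,dx\,dy \]
Changing to polar coordinates, x = r cos θ, y = r sin θ, dx dy = r dr dθ, the region becomes 0 ≤ r ≤ 2, 0 ≤ θ ≤ 2π.
\[ \begin{aligned} P(X^2 + Y^2 \le 4) &= \frac{1}{2\pi}\int_0^{2\pi}\int_0^2 -e^{-\frac{r^2}{2a^2}}\left(-\frac{r}{a^2}\right) dr\,d\theta \\ &= \frac{1}{2\pi}\int_0^{2\pi} \left[-e^{-\frac{r^2}{2a^2}}\right]_0^2 d\theta \qquad \left[\because \int e^{f(x)} f'(x)\,dx = e^{f(x)}\right] \\ &= \frac{1}{2\pi}\int_0^{2\pi} \left(-e^{-\frac{2}{a^2}} + 1\right) d\theta \\ &= \frac{1}{2\pi}\left(1 - e^{-\frac{2}{a^2}}\right)\left[\theta\right]_0^{2\pi} \\ &= \frac{1}{2\pi}\left(1 - e^{-\frac{2}{a^2}}\right)(2\pi) \\ &= 1 - e^{-\frac{2}{a^2}} \end{aligned} \]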
Example 15
A gun is aimed at a certain point (origin of the co-ordinate system).
Because of the random factors, the actual hit point can be any point
(X, Y) in a circle of radius ‘a’ about the origin. Assume that the joint
density of X and Y is constant in this circle and is given by
\[ f(x, y) = \begin{cases} c, & x^2 + y^2 \le a^2 \\ 0, & \text{otherwise} \end{cases} \]
Find (i) c (ii) f_X(x).
Solution
(i) Since f(x, y) is a probability density function,
\[ \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1 \]
\[ \iint_{x^2 + y^2 \le a^2} c\,dx\,dy = 1 \]
\[ c\iint_{x^2 + y^2 \le a^2} dx\,dy = 1 \]
\[ c\,(\text{area of circle } x^2 + y^2 = a^2) = 1 \]
\[ c(\pi a^2) = 1 \]
\[ c = \frac{1}{\pi a^2} \]
(ii) The region of integration is the interior of the circle $x^2 + y^2 = a^2$ (Fig. 2.9).
In the region along the vertical strip AB, limits of y: $y = -\sqrt{a^2 - x^2}$ to $y = \sqrt{a^2 - x^2}$
and x varies from x = –a to x = a.
\[ \begin{aligned} f_X(x) &= \int_{-\infty}^{\infty} f(x, y)\,dy = \int_{-\sqrt{a^2 - x^2}}^{\sqrt{a^2 - x^2}} c\,dy = c\left[y\right]_{-\sqrt{a^2 - x^2}}^{\sqrt{a^2 - x^2}} \\ &= c\left(\sqrt{a^2 - x^2} + \sqrt{a^2 - x^2}\right) = \frac{1}{\pi a^2}\left(2\sqrt{a^2 - x^2}\right) \\ &= \frac{2}{\pi a^2}\sqrt{a^2 - x^2}, \quad -a \le x \le a \end{aligned} \]
Exercise 2.4
1. The joint pdf of a two dimensional random variable (X, Y) is given by
f (x, y ) = 2, 0 < x < 1, 0 < y < 1
= 0, otherwise
[Ans.: (i) $k = \frac{1}{16}$, (ii) $f_X(x) = \frac{1}{8}(x^3 + 2x),\ 0 \le x \le 2$; $f_Y(y) = \frac{1}{8}(y^3 + 2y),\ 0 \le y \le 2$, (iii) $f(y/x) = \dfrac{y(x^2 + y^2)}{2(x^2 + 2)},\ 0 \le y \le 2$; $f(x/y) = \dfrac{x(x^2 + y^2)}{2(y^2 + 2)},\ 0 \le x \le 2$]
5. The joint pdf of (X, Y) is given by
\[ f(x, y) = \begin{cases} \dfrac{1}{3}(3x^2 + xy), & 0 < x \le 1,\ 0 < y \le 2 \\ 0, & \text{otherwise} \end{cases} \]
Find P(X + Y ≥ 1).
[Ans.: $\frac{65}{72}$]
6. The joint pdf of (X, Y) is given by
f(x, y) = k(6 – x – y), 0 < x < 2, 2 < y < 4
Find (i) k (ii) P(X < 1, Y < 3), (iii) P(X + Y < 3), (iv) P(X < 1/Y < 3)
[Ans.: (i) $\frac{1}{8}$, (ii) $\frac{3}{8}$, (iii) $\frac{5}{24}$, (iv) $\frac{3}{5}$]
7. The joint pdf of (X, Y) is given by
\[ f(x, y) = \begin{cases} e^{-y}, & x > 0,\ y > x \\ 0, & \text{otherwise} \end{cases} \]
Find (i) P(X > 1 / Y < 5) (ii) marginal distributions of X and Y.
[Ans.: (i) $\dfrac{e^4 - 5}{e^5 - 6}$, (ii) $f_X(x) = e^{-x},\ x > 0$; $f_Y(y) = ye^{-y},\ y > 0$]
8. The joint pdf of (X, Y) is given by
\[ f(x, y) = \begin{cases} \dfrac{1}{12}\,e^{-\frac{x}{4} - \frac{y}{3}}, & x \ge 0,\ y \ge 0 \\ 0, & \text{otherwise} \end{cases} \]
(i) Find conditional density functions of X and Y.
(ii) Are X, Y independent?
[Ans.: (i) $f(y/x) = \frac{1}{3}e^{-\frac{y}{3}},\ y \ge 0$; $f(x/y) = \frac{1}{4}e^{-\frac{x}{4}},\ x \ge 0$, (ii) Yes]
9. The joint pdf of (X, Y) is given by
\[ f(x, y) = \begin{cases} \dfrac{2}{(1 + x + y)^3}, & x > 0,\ y > 0 \\ 0, & \text{otherwise} \end{cases} \]
Find (i) F(x, y) (ii) f_X(x) (iii) f(y/x).
[Ans.: (i) $F(x, y) = 1 - \dfrac{1}{1 + x} + \dfrac{1}{1 + x + y} - \dfrac{1}{1 + y}$, (ii) $f_X(x) = \dfrac{1}{(1 + x)^2},\ x > 0$; $= 0$, otherwise, (iii) $f(y/x) = \dfrac{2(1 + x)^2}{(1 + x + y)^3}$]
10. The joint pdf of (X, Y) is given by
\[ f(x, y) = \begin{cases} x^2 + \dfrac{xy}{3}, & 0 < x < 1,\ 0 < y < 2 \\ 0, & \text{otherwise} \end{cases} \]
Find (i) $P\left(X > \frac{1}{2}\right)$ (ii) P(Y < X) (iii) $P\left(Y < \frac{1}{2} / X < \frac{1}{2}\right)$
[Ans.: (i) $\frac{5}{6}$, (ii) $\frac{7}{24}$, (iii) $\frac{5}{32}$]
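11. The joint pdf of (X, Y) is given by
\[ f(x, y) = \begin{cases} \dfrac{1}{4}(1 + xy), & |x| < 1,\ |y| < 1 \\ 0, & \text{otherwise} \end{cases} \]
Show that X and Y are not independent. (Here $f_X(x) = \frac{1}{2}$ and $f_Y(y) = \frac{1}{2}$, so $f_X(x)\cdot f_Y(y) = \frac{1}{4} \ne f(x, y)$.)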
12. The joint pdf of (X, Y) is given by $f(x, y) = Ae^{-x - 2y}$. Show that X and Y
are independent.
CHAPTER
3
Basic Statistics
Chapter Outline
3.1 Introduction
3.2 Measures of Central Tendency
3.3 Measures of Dispersion
3.4 Moments
3.5 Skewness
3.6 Kurtosis
3.7 Measures of Statistics for Continuous Random Variables
3.8 Expected Values of Two Dimensional Random Variables
3.9 Bounds on Probabilities
3.10 Chebyshev’s Inequality
3.1 Introduction
where p(x) is the probability mass function of the discrete random variable X.
Expectation of any function f(x) of a random variable X is given by
\[ E[f(x)] = \sum_{i=1}^{\infty} f(x_i)\,p(x_i) = \sum f(x)\,p(x) \]
\[ M = \frac{1}{2}(x_k + x_{k+1}) \]
where $F(x_k) < \frac{1}{2}$ and $F(x_{k+1}) > \frac{1}{2}$, and $x_k$ and $x_{k+1}$ are two consecutive values of X.
3. Mode The mode is the value of discrete random variable X for which the probability is maximum.
where p(x) is the probability mass function of the discrete random variable X.
5. Harmonic Mean The harmonic mean H of a random variable X is defined by $\dfrac{1}{H} = E\left(\dfrac{1}{X}\right)$. The harmonic mean of the probability distribution of a discrete random variable X is given by
\[ \frac{1}{H} = \sum_{i=1}^{\infty} \frac{1}{x_i}\,p(x_i) = \sum \frac{1}{x}\,p(x) \]
where p(x) is the probability mass function of the discrete random variable X.
\[ = \sum |x - \mu|\,p(x) \]
where p(x) is the probability mass function of the discrete random variable X.
3. Standard Deviation Standard deviation is the positive square root of the arithmetic mean of the squares of the deviations of the given values from their arithmetic mean. It is denoted by the Greek letter σ.
\[ SD = \sigma = \sqrt{\sum_{i=1}^{\infty} x_i^2\,p(x_i) - \mu^2} = \sqrt{E(X^2) - \mu^2} = \sqrt{E(X^2) - [E(X)]^2} \]
Variance Variance characterizes the variability in the distributions since two
distributions with same mean can still have different dispersion of data about their means.
Variance of the probability distribution of a discrete random variable X is given by
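\[ \begin{aligned} \text{Var}(X) = \sigma^2 &= E(X - \mu)^2 \\ &= E(X^2 - 2X\mu + \mu^2) \\ &= E(X^2) - E(2X\mu) + E(\mu^2) \\ &= E(X^2) - 2\mu E(X) + \mu^2 \qquad [\because E(\text{constant}) = \text{constant}] \\ &= E(X^2) - 2\mu\mu + \mu^2 \\ &= E(X^2) - \mu^2 \\ &= E(X^2) - [E(X)]^2 \end{aligned} \]
Some important results on variance:
(i) $\text{Var}(k) = 0$
(ii) $\text{Var}(kX) = k^2\,\text{Var}(X)$
(iii) $\text{Var}(X + k) = \text{Var}(X)$
(iv) $\text{Var}(aX + b) = a^2\,\text{Var}(X)$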
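The variance formula and the results listed above can be checked mechanically on any small distribution. The following sketch is illustrative only: the pmf is stored as a dictionary of exact fractions, and the particular distribution (probabilities 1/36, 3/36, ..., 11/36 on the values 1 to 6) and the constants 3 and 5 are arbitrary choices made for the demonstration.

```python
from fractions import Fraction


def mean(pmf):
    """E(X) for a pmf given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) = E(X^2) - [E(X)]^2."""
    m = mean(pmf)
    return sum(x * x * p for x, p in pmf.items()) - m * m


def transform(pmf, a, b):
    """pmf of aX + b for a != 0, obtained by relabelling the values."""
    return {a * x + b: p for x, p in pmf.items()}


# Illustrative distribution: P(X = x) = (2x - 1)/36 for x = 1, ..., 6.
pmf = {x: Fraction(2 * x - 1, 36) for x in range(1, 7)}

print(mean(pmf), variance(pmf))
print(variance(transform(pmf, 3, 5)) == 9 * variance(pmf))  # Var(aX + b) = a^2 Var(X)
print(variance(transform(pmf, 1, 5)) == variance(pmf))      # Var(X + k) = Var(X)
```

Working in `Fraction` avoids the rounding that creeps in when the mean is first rounded to two decimals and then squared.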
Example 1
A random variable X has the following distribution:
X          1      2      3      4      5      6
P(X = x)   1/36   3/36   5/36   7/36   9/36   11/36
Find (i) mean, (ii) variance, and (iii) P(1 < X < 6).
Solution
(i)
\[ \text{Mean} = \mu = \sum x\,p(x) = 1\left(\frac{1}{36}\right) + 2\left(\frac{3}{36}\right) + 3\left(\frac{5}{36}\right) + 4\left(\frac{7}{36}\right) + 5\left(\frac{9}{36}\right) + 6\left(\frac{11}{36}\right) = \frac{161}{36} = 4.47 \]
(ii)
\[ \begin{aligned} \text{Variance} = \sigma^2 &= \sum x^2 p(x) - \mu^2 \\ &= 1\left(\frac{1}{36}\right) + 4\left(\frac{3}{36}\right) + 9\left(\frac{5}{36}\right) + 16\left(\frac{7}{36}\right) + 25\left(\frac{9}{36}\right) + 36\left(\frac{11}{36}\right) - \left(\frac{161}{36}\right)^2 \\ &= \frac{791}{36} - \frac{25921}{1296} = \frac{2555}{1296} \\ &= 1.97 \end{aligned} \]
(iii)
\[ P(1 < X < 6) = P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) = \frac{3}{36} + \frac{5}{36} + \frac{7}{36} + \frac{9}{36} = \frac{24}{36} = 0.67 \]
Example 2
The probability distribution of a random variable X is given below. Find
(i) E(X), (ii) Var(X), (iii) E(2X – 3), and (iv) Var (2X – 3)
X –2 –1 0 1 2
P(X = x) 0.2 0.1 0.3 0.3 0.1
Solution
(i) $E(X) = \sum x\,p(x)$
= –2(0.2) – 1(0.1) + 0 + 1(0.3) + 2(0.1)
= 0
(ii) $\text{Var}(X) = \sum x^2 p(x) - [E(X)]^2$
= 4(0.2) + 1(0.1) + 0 + 1(0.3) + 4(0.1) – 0
= 1.6
(iii) E(2X – 3) = 2E(X) – 3
= 2(0) – 3
= –3
(iv) $\text{Var}(2X - 3) = (2)^2\,\text{Var}(X)$
= 4(1.6)
= 6.4
Example 3
Mean and standard deviation of a random variable X are 5 and 4
respectively. Find $E(X^2)$ and standard deviation of (5 – 3X).
Solution
\[ E(X) = \mu = 5 \]
\[ SD = \sigma = 4 \]
\[ \therefore\ \text{Var}(X) = \sigma^2 = 16 \]
\[ \text{Var}(X) = E(X^2) - [E(X)]^2 \]
\[ 16 = E(X^2) - (5)^2 \]
\[ \therefore\ E(X^2) = 41 \]
\[ \text{Var}(5 - 3X) = \text{Var}(5) + (-3)^2\,\text{Var}(X) = 0 + 9(16) = 144 \]
\[ SD(5 - 3X) = \sqrt{\text{Var}(5 - 3X)} = \sqrt{144} = 12 \]
Example 4
A machine produces an average of 500 items during the first week of the
month and on average of 400 items during the last week of the month,
the probability for these being 0.68 and 0.32 respectively. Determine the
expected value of the production. [Summer 2015]
Solution
Let X be the random variable which denotes the items produced by the machine. The
probability distribution is
X 500 400
Example 5
The monthly demand for Allwyn watches is known to have the following
probability distribution:
Demand (x) 1 2 3 4 5 6 7 8
Probability p(x) 0.08 0.12 0.19 0.24 0.16 0.10 0.07 0.04
Find the expected demand for watches. Also, compute the variance.
Solution
\[ \begin{aligned} E(X) &= \sum x\,p(x) \\ &= 1(0.08) + 2(0.12) + 3(0.19) + 4(0.24) + 5(0.16) + 6(0.10) + 7(0.07) + 8(0.04) \\ &= 4.06 \end{aligned} \]
\[ \begin{aligned} \text{Var}(X) &= E(X^2) - [E(X)]^2 = \sum x^2 p(x) - [E(X)]^2 \\ &= 1(0.08) + 4(0.12) + 9(0.19) + 16(0.24) + 25(0.16) + 36(0.10) + 49(0.07) + 64(0.04) - (4.06)^2 \\ &= 19.7 - 16.48 \\ &= 3.22 \end{aligned} \]
Example 6
A discrete random variable has the probability mass function given
below:
X          –2     –1     0      1      2      3
P(X = x)   2/10   3/25   1/10   6/25   1/10   6/25
\[ \begin{aligned} \text{Mean} = E(X) &= \sum x\,p(x) \\ &= (-2)\left(\frac{2}{10}\right) + (-1)\left(\frac{3}{25}\right) + 0 + 1\left(\frac{6}{25}\right) + 2\left(\frac{1}{10}\right) + 3\left(\frac{6}{25}\right) \\ &= \frac{16}{25} \end{aligned} \]
\[ \begin{aligned} \text{Variance} = \text{Var}(X) &= E(X^2) - [E(X)]^2 = \sum x^2 p(x) - [E(X)]^2 \\ &= 4\left(\frac{2}{10}\right) + 1\left(\frac{3}{25}\right) + 0 + 1\left(\frac{6}{25}\right) + 4\left(\frac{1}{10}\right) + 9\left(\frac{6}{25}\right) - \left(\frac{16}{25}\right)^2 \\ &= \frac{93}{25} - \frac{256}{625} \\ &= \frac{2069}{625} \end{aligned} \]
Example 7
A random variable X has the following probability function:
x 0 1 2 3 4 5 6 7
(i) Determine k. (ii) Evaluate P(X < 6), P(X ≥ 6), P(0 < X < 5) and
P(0 £ X £ 4). (iii) Determine the distribution function of X. (iv) Find the
mean. (v) Find the variance.
Solution
(i) Since p(x) is a probability mass function,
\[ \sum p(x) = 1 \]
\[ 0 + k + 2k + 2k + 3k + k^2 + 2k^2 + 7k^2 + k = 1 \]
\[ 10k^2 + 9k - 1 = 0 \]
\[ (10k - 1)(k + 1) = 0 \]
\[ k = \frac{1}{10} \ \text{ or } \ k = -1 \]
\[ k = \frac{1}{10} = 0.1 \qquad [\because p(x) \ge 0,\ k \ne -1] \]
(ii) P( X < 6) = P( X = 0) + P( X = 1) + P( X = 2) + P( X = 3) + P( X = 4) + P( X = 5)
= 0 + 0.1 + 0.2 + 0.2 + 0.3 + 0.01
= 0.81
P( X ≥ 6) = 1 - P( X < 6)
= 1 - 0.81
= 0.19
P (0 < X < 5) = P ( X = 1) + P( X = 2) + P( X = 3) + P( X = 4)
= 0.1 + 0.2 + 0.2 + 0.3
= 0.8
P(0 £ X £ 4) = P ( X = 0) + P ( X = 1) + P( X = 2) + P ( X = 3) + P( X = 4)
= 0 + 0.1 + 0.2 + 0.2 + 0.3
= 0.8
(iii) Distribution function of X
x p(x) F(x)
0 0 0
1 0.1 0.1
2 0.2 0.3
3 0.2 0.5
4 0.3 0.8
5 0.01 0.81
6 0.02 0.83
7 0.17 1
(iv) $\mu = \sum x\,p(x)$
= 0 + 1(0.1) + 2(0.2) + 3(0.2) + 4(0.3) + 5(0.01) + 6(0.02) + 7(0.17)
= 3.66
(v) $\text{Var}(X) = \sigma^2 = \sum x^2 p(x) - \mu^2$
= 0 + 1(0.1) + 4(0.2) + 9(0.2) + 16(0.3) + 25(0.01) + 36(0.02)
+ 49(0.17) – (3.66)²
= 3.4044
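The whole of Example 7 can be reproduced in a few lines. The sketch below assumes the probabilities are 0, k, 2k, 2k, 3k, k², 2k², 7k² + k for x = 0, ..., 7, which is what the normalisation equation and the tabulated values in the solution imply; the variable names are illustrative.

```python
import math
from itertools import accumulate

# Normalisation: k + 2k + 2k + 3k + k^2 + 2k^2 + (7k^2 + k) = 1, i.e. 10k^2 + 9k - 1 = 0.
a, b, c = 10, 9, -1
disc = math.sqrt(b * b - 4 * a * c)
roots = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]
k = next(r for r in roots if r > 0)  # probabilities must be non-negative

pmf = [0, k, 2 * k, 2 * k, 3 * k, k**2, 2 * k**2, 7 * k**2 + k]
cdf = list(accumulate(pmf))          # distribution function F(x) for x = 0, ..., 7

mean = sum(x * p for x, p in enumerate(pmf))
var = sum(x * x * p for x, p in enumerate(pmf)) - mean**2

print(k)
print([round(v, 2) for v in cdf])
print(round(mean, 2), round(var, 4))
```

`itertools.accumulate` gives the running totals directly, which is the same column-by-column addition used to build the F(x) table by hand.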
Example 8
A fair dice is tossed. Let the random variable X denote twice the
number appearing on the dice. Write the probability distribution of X.
Calculate mean and variance.
Solution
Let X be the random variable which denotes twice the number appearing on the dice.
(i) Probability distribution of X
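x      2     4     6     8     10    12
p(x)   1/6   1/6   1/6   1/6   1/6   1/6
(iii)
\[ \begin{aligned} \text{Variance} = \sigma^2 &= \sum x^2 p(x) - \mu^2 \\ &= 4\left(\frac{1}{6}\right) + 16\left(\frac{1}{6}\right) + 36\left(\frac{1}{6}\right) + 64\left(\frac{1}{6}\right) + 100\left(\frac{1}{6}\right) + 144\left(\frac{1}{6}\right) - (7)^2 \\ &= 11.67 \end{aligned} \]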
Example 9
Two unbiased dice are thrown at random. Find the probability distribution
of the sum of the numbers on them. Also, find mean and variance.
Solution
Let X be the random variable which denotes the sum of the numbers on two unbiased
dice. The random variable X can take values 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12. The
probability distribution is
X          2      3      4      5      6      7      8      9      10     11     12
P(X = x)   1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
\[ \begin{aligned} \text{Mean} = \mu &= \sum x\,p(x) \\ &= 2\left(\frac{1}{36}\right) + 3\left(\frac{2}{36}\right) + 4\left(\frac{3}{36}\right) + 5\left(\frac{4}{36}\right) + 6\left(\frac{5}{36}\right) + 7\left(\frac{6}{36}\right) + 8\left(\frac{5}{36}\right) \\ &\quad + 9\left(\frac{4}{36}\right) + 10\left(\frac{3}{36}\right) + 11\left(\frac{2}{36}\right) + 12\left(\frac{1}{36}\right) \\ &= \frac{252}{36} = 7 \end{aligned} \]
\[ \begin{aligned} \text{Variance} = \sigma^2 &= \sum x^2 p(x) - \mu^2 \\ &= 4\left(\frac{1}{36}\right) + 9\left(\frac{2}{36}\right) + 16\left(\frac{3}{36}\right) + 25\left(\frac{4}{36}\right) + 36\left(\frac{5}{36}\right) + 49\left(\frac{6}{36}\right) + 64\left(\frac{5}{36}\right) \\ &\quad + 81\left(\frac{4}{36}\right) + 100\left(\frac{3}{36}\right) + 121\left(\frac{2}{36}\right) + 144\left(\frac{1}{36}\right) - (7)^2 \\ &= \frac{1974}{36} - 49 \\ &= 5.83 \end{aligned} \]
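The distribution of the sum of two dice can also be built by enumerating all 36 equally likely outcomes. The sketch below does this with exact fractions; the variable names are illustrative and nothing beyond the Python standard library is assumed.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 outcomes give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

mean = sum(s * p for s, p in pmf.items())
variance = sum(s * s * p for s, p in pmf.items()) - mean**2

print(pmf[7], mean, variance, float(variance))
```

The exact variance comes out as a fraction whose decimal value rounds to the 5.83 obtained by hand.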
Example 10
A sample of 3 items is selected at random from a box containing 10
items of which 4 are defective. Find the expected number of defective
items.
Solution
Let X be the random variable which denotes the defective items.
Total number of items = 10
Number of good items = 6
Number of defective items = 4
\[ P(X = 0) = P(\text{no defective item}) = \frac{{}^{6}C_3}{{}^{10}C_3} = \frac{1}{6} \]
\[ P(X = 1) = P(\text{one defective item}) = \frac{{}^{6}C_2\ {}^{4}C_1}{{}^{10}C_3} = \frac{1}{2} \]
\[ P(X = 2) = P(\text{two defective items}) = \frac{{}^{6}C_1\ {}^{4}C_2}{{}^{10}C_3} = \frac{3}{10} \]
\[ P(X = 3) = P(\text{three defective items}) = \frac{{}^{4}C_3}{{}^{10}C_3} = \frac{1}{30} \]
P(X = x)   1/6   1/2   3/10   1/30
Example 11
A player tosses two fair coins. He wins ₹ 100 if a head appears and
₹ 200 if two heads appear. On the other hand, he loses ₹ 500 if no
head appears. Determine the expected value of the game. Is the game
favourable to the player?
Solution
Let X be the random variable which denotes the number of heads appearing in tosses
of two fair coins.
S = {HH, HT, TH, TT}
\[ p(x_1) = P(X = 0) = P(\text{no heads}) = \frac{1}{4} \]
\[ p(x_2) = P(X = 1) = P(\text{one head}) = \frac{2}{4} = \frac{1}{2} \]
\[ p(x_3) = P(X = 2) = P(\text{two heads}) = \frac{1}{4} \]
Amount to be lost if no head appears = $x_1$ = – ₹ 500
Amount to be won if one head appears = $x_2$ = ₹ 100
Amount to be won if two heads appear = $x_3$ = ₹ 200
\[ \begin{aligned} \text{Expected value of the game} = \mu &= \sum x\,p(x) = x_1 p(x_1) + x_2 p(x_2) + x_3 p(x_3) \\ &= -500\left(\frac{1}{4}\right) + 100\left(\frac{1}{2}\right) + 200\left(\frac{1}{4}\right) \\ &= -25 \end{aligned} \]
The expected value is – ₹ 25. Hence, the game is not favourable to the player.
Example 12
Amit plays a game of tossing a dice. If a number less than 3 appears, he
gets ₹ a, otherwise he has to pay ₹ 10. If the game is fair, find a.
Solution
Let X be the random variable which denotes the amount Amit receives on one toss of the dice.
Probability of getting a number less than 3, i.e., 1 or 2 = $p(x_1) = \frac{2}{6} = \frac{1}{3}$
Probability of getting a number more than or equal to 3, i.e., 3, 4, 5, or 6 = $p(x_2) = \frac{4}{6} = \frac{2}{3}$
Amount to be received for a number less than 3 = $x_1$ = ₹ a
Amount to be paid for numbers more than or equal to 3 = $x_2$ = – ₹ 10
\[ E(X) = \sum x\,p(x) = x_1 p(x_1) + x_2 p(x_2) = a\left(\frac{1}{3}\right) + (-10)\left(\frac{2}{3}\right) = \frac{a}{3} - \frac{20}{3} \]
For a fair game, E(X) = 0.
\[ \frac{a}{3} - \frac{20}{3} = 0 \]
\[ a = 20 \]
Example 13
A man draws 2 balls from a bag containing 3 white and 5 black balls.
If he is to receive ₹ 14 for every white ball which he draws and ₹ 7 for
every black ball, what is his expectation?
Solution
Let X be the random variable which denotes the amount received for the balls drawn from the bag. The 2 balls drawn
may be either (i) both white, or (ii) both black, or (iii) one white and one black.
\[ \text{Probability of drawing 2 white balls} = p(x_1) = \frac{{}^{3}C_2}{{}^{8}C_2} = \frac{3}{28} \]
\[ \text{Probability of drawing 2 black balls} = p(x_2) = \frac{{}^{5}C_2}{{}^{8}C_2} = \frac{10}{28} \]
\[ \text{Probability of drawing 1 white and 1 black ball} = p(x_3) = \frac{{}^{3}C_1\ {}^{5}C_1}{{}^{8}C_2} = \frac{15}{28} \]
Amount to be received for 2 white balls = $x_1$ = ₹ 14 × 2 = ₹ 28
Amount to be received for 2 black balls = $x_2$ = ₹ 7 × 2 = ₹ 14
Amount to be received for 1 white and 1 black ball = $x_3$ = ₹ 14 + ₹ 7 = ₹ 21
\[ \begin{aligned} \text{Expectation} = E(X) &= \sum x\,p(x) = x_1 p(x_1) + x_2 p(x_2) + x_3 p(x_3) \\ &= 28\left(\frac{3}{28}\right) + 14\left(\frac{10}{28}\right) + 21\left(\frac{15}{28}\right) \\ &= 19.25 \end{aligned} \]
The expectation is ₹ 19.25.
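The same expectation can be computed for any urn composition by looping over the possible numbers of white balls drawn. This is a minimal sketch; the function name and its parameters are hypothetical, and `math.comb` supplies the binomial coefficients.

```python
from fractions import Fraction
from math import comb


def expected_payout(white, black, draws, pay_white, pay_black):
    """Expected amount received when `draws` balls are taken without replacement."""
    total = comb(white + black, draws)
    expectation = Fraction(0)
    for w in range(draws + 1):       # w = number of white balls drawn
        b = draws - w                # b = number of black balls drawn
        p = Fraction(comb(white, w) * comb(black, b), total)
        expectation += p * (w * pay_white + b * pay_black)
    return expectation


result = expected_payout(3, 5, 2, 14, 7)
print(result, float(result))
```

Calling it with other counts (for example 6 white, 4 black and 3 draws, with payouts 0 and 1) turns it into an expected-count calculator for hypergeometric sampling.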
Example 14
The probability that there is at least one error in an account statement
prepared by A is 0.2 and for B and C, they are 0.25 and 0.4 respectively.
A, B, and C prepared 10, 16, and 20 statements respectively. Find the
expected number of correct statements in all.
Solution
Let p(x1), p(x2) and p(x3) be the probabilities of the events that there is no error in the
account statements prepared by A, B, and C respectively.
p( x1 ) = 1 - (Probability of at least one error in the account
statement prepared by A)
= 1 - 0.2
= 0.8
Similarly, p(x2) = 1 – 0.25 = 0.75
p(x3) = 1 – 0.4 = 0.6
Also, x1 = 10, x2 = 16, x3 = 20
Expected number of correct statements = E ( X ) = Â x p( x )
= x1 p( x1 ) + x2 p( x2 ) + x3 p( x3 )
= 10(0.8) + 16 (0.75) + 20 (0.6)
= 32
Example 15
A man has the choice of running either a hot-snack stall or an ice-cream
stall at a seaside resort during the summer season. If it is a fairly cool
Exercise 3.1
1. The probability distribution of a random variable X is given by
X –2 –1 0 1 2 3
P(X = x) 0.1 k 0.2 2k 0.3 k
X 0 10 15
k-6 2 14
P(X = x)
5 k 5k
X 0 10 15
Ans.: 8, 2 13 , 31
F(X) 1 4
5 20
4. For the following distribution,
X –3 –2 –1 0 1 2
Find (i) P(X ≥ 1), (ii) P(X < 0), (iii) E(X), and (iv) Var(X)
ÈÎans.: (i) 0.35 (ii) 0.35 (iii) 0.05 (iv) 1.8475˘˚
5. A random variable X has the following probability function:
X 0 1 2 3 4 5 6 7 8
k k k k 2k 6k 7k 8k 4k
P(X = x)
45 15 9 5 45 45 45 45 45
X 1 2 3 4 5
Ans.: (i) 1 1 1 1 1
P(X = x)
2 4 8 16 16
(ii) 1.9
7. Let X denotes the minimum of two numbers that appear when a pair
of fair dice is thrown once. Determine (i) probability distribution,
(ii) expectation, and (iii) variance.
X 1 2 3 4 5 6
Ans.: (i) 11 9 7 5 3 1
P(X = x)
36 36 36 36 36 36
10. An urn contains 6 white and 4 black balls; 3 balls are drawn without
replacement. What is the expected number of black balls that will be
obtained?
È 6˘
Í ans.: 5 ˙
Î ˚
11. A six-faced dice is tossed. If a prime number occurs, Anil wins that
number of rupees but if a nonprime number occurs, he loses that number
of rupees. Determine whether the game is favourable to the player.
14. A bag contains 2 white balls and 3 black balls. Four persons A, B, C, D in
the order named each draws one ball and does not replace it. The first
to draw a white ball receives ₹ 20. Determine their expectations.
[Ans.: ₹ 8, ₹ 6, ₹ 4, ₹ 2]
3.4 Moments
Moment is the arithmetic mean of the various powers of the deviations of items from
their assumed mean or actual mean. If the deviations of the items are taken from the
arithmetic mean of the distribution, it is known as central moment. If the mean of the
first power of deviations is taken, the first moment about the mean is obtained and
is denoted by $\mu_1$. The mean of the second power of the deviations gives the second
moment about the mean and is denoted by $\mu_2$. Similarly, the mean of the cubes of
deviations gives the third moment about the mean and is denoted by $\mu_3$. The mean of
the fourth power of the deviations from the mean gives the fourth moment about the
mean and is denoted by $\mu_4$. Thus, the mean of the rth power of deviations gives the rth
moment about mean or rth central moment and is denoted by $\mu_r$.
\[ \mu_r = \sum (x - \mu)^r\,p(x) \]
(iv) The fourth moment about the mean measures kurtosis. It gives information on
the peakedness or height of the peak of a frequency distribution, i.e., whether
it is more peaked or more flat topped than a normal curve.
\[ \text{Kurtosis } \beta_2 = \frac{\mu_4}{\mu_2^2} \]
(v) In a symmetric distribution, all odd moments are zero, i.e., $\mu_1 = \mu_3 = \mu_5 = \ldots = \mu_{2r+1} = 0$.
\[ = \sum (x - a)^r\,p(x) \]
\[ \mu_r' = E\{X^r\} = \sum_{i=1}^{\infty} x_i^r\,p(x_i) = \sum x^r p(x) = \frac{\sum f x^r}{n} \]
Second central moment
\[ \mu_2 = \mu_2' - (\mu_1')^2 \]
Example 1
Calculate the first four moments from the following data:
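x    0    1    2    3    4    5    6    7    8
f    5    10   15   20   25   20   15   10   5
\[ \bar{x} = \frac{\sum fx}{N} = \frac{500}{125} = 4 \]
\[ \mu_1 = \frac{\sum f(x - \mu)}{N} = \frac{0}{125} = 0 \]
\[ \mu_2 = \frac{\sum f(x - \mu)^2}{N} = \frac{500}{125} = 4 \]
\[ \mu_3 = \frac{\sum f(x - \mu)^3}{N} = \frac{0}{125} = 0 \]
\[ \mu_4 = \frac{\sum f(x - \mu)^4}{N} = \frac{4700}{125} = 37.6 \]
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{0}{64} = 0 \]
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{37.6}{16} = 2.35 \]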
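The sums needed for the central moments of a frequency table are tedious by hand but short in code. The sketch below recomputes Example 1 with exact fractions; the function name `central_moments` and its return layout are choices made for this illustration.

```python
from fractions import Fraction


def central_moments(xs, fs, orders=(1, 2, 3, 4)):
    """Return (mean, {r: r-th central moment}) for values xs with frequencies fs."""
    n = sum(fs)
    mean = Fraction(sum(f * x for x, f in zip(xs, fs)), n)
    moments = {
        r: sum(f * (x - mean) ** r for x, f in zip(xs, fs)) / n
        for r in orders
    }
    return mean, moments


xs = range(9)                                # x = 0, 1, ..., 8
fs = [5, 10, 15, 20, 25, 20, 15, 10, 5]      # frequencies from Example 1
mean, mu = central_moments(xs, fs)

beta1 = mu[3] ** 2 / mu[2] ** 3
beta2 = mu[4] / mu[2] ** 2
print(mean, mu[1], mu[2], mu[3], float(mu[4]), beta1, float(beta2))
```

Since the frequencies are symmetric about x = 4, the odd central moments vanish, which the output makes easy to confirm.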
Example 2
Calculate the first four moments of the following distribution about the
mean:
x 0 1 2 3 4 5 6 7 8
f 1 8 28 56 70 56 28 8 1
Solution
Let a = 4 be the arbitrary origin.
\[ \mu_1' = \frac{\sum f(x - a)}{N} = \frac{0}{256} = 0 \]
\[ \mu_2' = \frac{\sum f(x - a)^2}{N} = \frac{512}{256} = 2 \]
\[ \mu_3' = \frac{\sum f(x - a)^3}{N} = \frac{0}{256} = 0 \]
\[ \mu_4' = \frac{\sum f(x - a)^4}{N} = \frac{2816}{256} = 11 \]
Moments about the actual mean:
\[ \mu_1 = 0 \]
\[ \mu_2 = \mu_2' - (\mu_1')^2 = 2 - 0 = 2 \]
\[ \mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 = 0 - 3(2)(0) + 2(0)^3 = 0 \]
Example 3
The first four moments of distribution about x = 2 are 1, 2.5, 5.5, and 16.
Calculate the four moments about m.
Solution
\[ \mu_1' = 1, \quad \mu_2' = 2.5, \quad \mu_3' = 5.5, \quad \mu_4' = 16 \]
Moments about the mean:
\[ \mu_1 = 0 \]
\[ \mu_2 = \mu_2' - (\mu_1')^2 = 2.5 - (1)^2 = 1.5 \]
\[ \mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 \]
Example 4
The first three moments of a distribution about the value 2 of the
variables are 1, 16, and –40. Show that the mean = 3, variance = 15
and m3 = –86.
Solution
\[ a = 2, \quad \mu_1' = 1, \quad \mu_2' = 16, \quad \mu_3' = -40 \]
\[ \mu_1' = \mu - a \]
\[ 1 = \mu - 2 \]
\[ \therefore\ \mu = 3 \]
Mean = 3
\[ \mu_2 = \mu_2' - (\mu_1')^2 = 16 - (1)^2 = 15 \]
Variance = $\mu_2$ = 15
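The conversion from moments about an arbitrary point a to central moments uses the same three relations each time, so it is convenient to wrap them in one function. The sketch below relies on the standard identities μ2 = μ2′ − (μ1′)², μ3 = μ3′ − 3μ2′μ1′ + 2(μ1′)³ and μ4 = μ4′ − 4μ3′μ1′ + 6μ2′(μ1′)² − 3(μ1′)⁴; the function name and the use of `None` for moments that were not supplied are assumptions of this illustration.

```python
def central_from_raw(m1, m2, m3=None, m4=None):
    """Convert moments about a point a (m1..m4) into central moments (mu2, mu3, mu4).

    Entries that cannot be computed from the supplied moments are returned as None.
    """
    mu2 = m2 - m1**2
    mu3 = None if m3 is None else m3 - 3 * m2 * m1 + 2 * m1**3
    mu4 = None
    if m3 is not None and m4 is not None:
        mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1**2 - 3 * m1**4
    return mu2, mu3, mu4


# Moments about a = 2 given as 1, 16, -40 (the mean is then a + m1 = 3).
a, m1 = 2, 1
print(a + m1, central_from_raw(1, 16, -40))

# Moments about x = 2 given as 1, 2.5, 5.5, 16.
print(central_from_raw(1, 2.5, 5.5, 16))
```

The first call reproduces the variance of 15 and the third central moment of –86 quoted in the statement of Example 4.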
Exercise 3.2
1. Calculate the first four moments about the mean from the following
data:
x 1 2 3 4 5
f 2 3 5 4 1
f 1 8 28 156 170 56 28 8 1
6. The first four moments about the working mean 28.5 of a distribution
are 0.294, 7.144, 14.409, and 454.98. Calculate the moments about the
mean. Also, evaluate b1 and b2.
[Ans.: 28.794, 7.058, 36.151, 408.738, 3.717, 8.205]
3.5 Skewness
Fig. 3.1
Fig. 3.2
Skewness gives an idea of the nature and degree of concentration of observations about
the mean.
If the value of the mean is greater than the mode, the skewness will be positive and if
the value of the mean is less than the mode, the skewness will be negative.
The relative measure of skewness is called the coefficient of skewness.
3.6 Kurtosis
Measures of central tendency, dispersion and skewness of a random variable cannot give
a complete idea about the probability distribution. In order to analyse the probability
distribution completely, another characteristic, Kurtosis is required. Kurtosis means
the convexity of the probability curve of the distribution. It measures the degree of
peakedness of distribution and is given by
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{\mu_4}{\sigma^4} \]
Fig. 3.3
Curves with \(\beta_2 > 3\) are called leptokurtic and those with \(\beta_2 < 3\) are called platykurtic.
The normal curve, for which \(\beta_2 = 3\), is called mesokurtic.
As \(\beta_1 = \frac{\mu_3^2}{\mu_2^3}\) and \(\beta_2 = \frac{\mu_4}{\mu_2^2}\) determine the shape of the probability curve, they are
called Pearson's shape coefficients.
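The shape coefficients can be checked numerically. The following is a minimal sketch, not part of the original text, that computes the central moments and then \(\beta_1\) and \(\beta_2\) for the frequency table x = 0, ..., 8 with f = 1, 8, 28, 56, 70, 56, 28, 8, 1 used in the moments examples. The function names are illustrative assumptions.

```python
from fractions import Fraction

def central_moments(xs, fs):
    """Return (mu2, mu3, mu4) of a frequency distribution about its mean."""
    n = sum(fs)
    mean = Fraction(sum(f * x for x, f in zip(xs, fs)), n)

    def mu(r):
        # r-th central moment: sum of f * (x - mean)^r divided by N
        return sum(f * (x - mean) ** r for x, f in zip(xs, fs)) / n

    return mu(2), mu(3), mu(4)

def shape_coefficients(xs, fs):
    """Pearson's beta1 = mu3^2 / mu2^3 and beta2 = mu4 / mu2^2."""
    mu2, mu3, mu4 = central_moments(xs, fs)
    return mu3 ** 2 / mu2 ** 3, mu4 / mu2 ** 2

xs = list(range(9))
fs = [1, 8, 28, 56, 70, 56, 28, 8, 1]
beta1, beta2 = shape_coefficients(xs, fs)
print(beta1, beta2)  # 0 11/4
```

Here \(\beta_1 = 0\) reflects the symmetry of the table, and \(\beta_2 = 2.75 < 3\) places the curve on the platykurtic side.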
Example 1
From the marks scored by 100 students in Section A and 100 students in
Section B of a class, the following measures were obtained:
Section A m A = 55 sA = 15.4 Mode = 58.72
Example 2
For a group of 10 items, Âx = 452, Âx2 = 24270, and mode = 43.7. Find
Karl Pearson’s coefficient of skewness.
Solution
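\[ n = 10, \quad \sum x = 452, \quad \sum x^2 = 24270, \quad \text{mode} = 43.7 \]
\[ \mu = \frac{\sum x}{n} = \frac{452}{10} = 45.2 \]
\[ \sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2} = \sqrt{\frac{24270}{10} - \left(\frac{452}{10}\right)^2} = 19.59 \]
\[ S_k = \frac{\text{Mean} - \text{Mode}}{\sigma} = \frac{45.2 - 43.7}{19.59} = 0.077 \]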
Example 3
In a distribution, the mean = 65, median = 70, coefficient of skewness =
–0.6. Find the mode and coefficient of variation.
Solution
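\[ \mu = 65, \quad \text{Median} = 70, \quad S_k = -0.6 \]
Mode = 3 Median – 2 Mean = 3(70) – 2(65) = 80
\[ S_k = \frac{\text{Mean} - \text{Mode}}{\sigma} \]
\[ -0.6 = \frac{65 - 80}{\sigma} \quad \therefore \sigma = 25 \]
\[ CV = \frac{\sigma}{\bar{x}} \times 100 = \frac{25}{65} \times 100 = 38.46\% \]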
Example 4
The following information was obtained from the records of a factory
relating to wages:
Arithmetic mean = Rs 56.8, Median = Rs 59.5, Standard deviation = Rs 12.4
Give the information about the distribution of wages.
Solution
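\[ \mu = 56.8, \quad \text{Median} = 59.5, \quad \sigma = 12.4 \]
\[ S_k = \frac{3(\text{Mean} - \text{Median})}{\sigma} = \frac{3(56.8 - 59.5)}{12.4} = -0.65 \]
\[ \text{Mode} = 3\,\text{Median} - 2\,\text{Mean} = 3(59.5) - 2(56.8) = 64.9 \]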
Example 5
For a moderately skewed distribution of retail prices for men's shoes,
it is found that the mean price is Rs 20 and the median price is Rs 17.
If the coefficient of variation is 20%, find Pearson's coefficient of
skewness.
Solution
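\[ \mu = 20, \quad \text{Median} = 17, \quad CV = 20\% \]
\[ CV = \frac{\sigma}{\bar{x}} \times 100 \]
\[ 20 = \frac{\sigma}{20} \times 100 \quad \therefore \sigma = 4 \]
\[ S_k = \frac{3(\text{Mean} - \text{Median})}{\sigma} = \frac{3(20 - 17)}{4} = 2.25 \]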
Example 6
Find the mean, SD, quartiles, median and Karl Pearson’s coefficient of
skewness for the following probability distribution:
X=x 1 2 3 4 5 6 7 8
p(x) 0.008 0.032 0.142 0.216 0.240 0.206 0.143 0.013
Solution
(i) Mean = \(\mu = \sum x\,p(x)\)
= 1(0.008) + 2(0.032) + 3(0.142) + 4(0.216) + 5(0.240) + 6(0.206)
+ 7(0.143) + 8(0.013)
= 4.903
(ii) Var(X) = \(\sigma^2 = \sum x^2 p(x) - \mu^2\)
= 1(0.008) + 4(0.032) + 9(0.142) + 16(0.216) + 25(0.240) + 36(0.206)
+ 49(0.143) + 64(0.013) – (4.903)^2
= 2.086
\[ SD = \sqrt{Var(X)} = \sqrt{2.086} = 1.444 \]
(iii) F(3) = 0.008 + 0.032 + 0.142 = 0.182 < 0.25
F(4) = 0.008 + 0.032 + 0.142 + 0.216 = 0.398 > 0.25
\[ Q_1 = \frac{1}{2}(3 + 4) = 3.5 \]
Since F(4) = 0.398 < 0.5 and F(5) = 0.638 > 0.5,
\[ Q_2 = \frac{1}{2}(4 + 5) = 4.5 \]
F(5) = 0.008 + 0.032 + 0.142 + 0.216 + 0.240 = 0.638 < 0.75
F(6) = 0.008 + 0.032 + 0.142 + 0.216 + 0.240 + 0.206 = 0.844 > 0.75
\[ Q_3 = \frac{1}{2}(5 + 6) = 5.5 \]
(iv) Median = \(Q_2\) = 4.5
(v) Pearson's coefficient of skewness
\[ S_k = \frac{\text{Mean} - \text{Median}}{SD} = \frac{4.903 - 4.5}{1.444} = 0.279 \]
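The mean, variance and standard deviation of a discrete distribution such as the one above are easy to verify by machine. The sketch below is an illustration only; the helper name `pmf_summary` is an assumption and not part of the text.

```python
import math

def pmf_summary(xs, ps):
    """Return (mean, variance, standard deviation) of a discrete pmf."""
    mean = sum(x * p for x, p in zip(xs, ps))
    # Var(X) = sum of x^2 p(x) minus the squared mean
    var = sum(x * x * p for x, p in zip(xs, ps)) - mean ** 2
    return mean, var, math.sqrt(var)

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ps = [0.008, 0.032, 0.142, 0.216, 0.240, 0.206, 0.143, 0.013]

mean, var, sd = pmf_summary(xs, ps)
print(round(mean, 3), round(var, 3), round(sd, 3))
```

Rounded to three decimals, the printed values agree with the hand computation of the mean, variance and SD in parts (i) and (ii).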
Example 7
Find the mean, median, QD, MD, SD, \(\beta_1\) and \(\beta_2\) of the following probability
distribution:
X=x 0 1 2 3 4 5 6 7 8
p(x) 0.004 0.036 0.1 0.232 0.280 0.204 0.112 0.028 0.004
Solution
(i) Mean = \(\mu = \sum x\,p(x)\)
= 0 + 1(0.036) + 2(0.1) + 3(0.232) + 4(0.280) + 5(0.204) + 6(0.112)
+ 7(0.028) + 8(0.004)
= 3.972
(ii) Median
F(3) = 0.004 + 0.036 + 0.1 + 0.232 = 0.372 < 0.5
F(4) = 0.004 + 0.036 + 0.1 + 0.232 + 0.280 = 0.652 > 0.5
\[ \text{Median } M = \frac{1}{2}(3 + 4) = 3.5 \]
(iii) Mode is the value of X for which P(X = x) is maximum.
Mode = 4 [∵ P(X = 4) = 0.280 is the maximum probability]
(iv) Variance = \(\sigma^2 = \sum x^2 p(x) - \mu^2\)
Exercise 3.3
[ans.: 40 ]
2. From the marks scored by 120 students in Section A and 120 students in
Section B of a class, the following measures are obtained:
1. Mean The mean or average value (\(\mu\)) of the probability distribution of a continuous
random variable X is called the expectation and is denoted by E(X).
\[ \mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx \]
3.7 Measures of Statistics for Continuous Random Variables 3.33
2. Median The median is the point which divides the entire distribution into two
equal parts. In case of a continuous distribution, the median is the point which divides
the total area into two equal parts. Thus, if a continuous random variable X is defined
from a to b and M is the median,
\[ \int_a^M f(x)\,dx = \int_M^b f(x)\,dx = \frac{1}{2} \]
\[ MD = \int_{-\infty}^{\infty} |x - \mu|\, f(x)\,dx \]
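10. Moments The central moments, or moments about the actual mean, of the probability
distribution of a continuous random variable X are given by
\[ \mu_r = \int_{-\infty}^{\infty} (x - \mu)^r f(x)\,dx \]
The moments about an arbitrary value a and about the origin are, respectively,
\[ \mu_r' = \int_{-\infty}^{\infty} (x - a)^r f(x)\,dx \]
\[ \mu_r' = \int_{-\infty}^{\infty} x^r f(x)\,dx \]
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} \]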
Example 1
For the continuous random variable having pdf
\[ f(x) = \begin{cases} 4x^3 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \]
Find the mean and variance of X.
Solution
\[ \text{Mean} = \mu = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{-\infty}^{0} x f(x)\,dx + \int_0^1 x f(x)\,dx + \int_1^{\infty} x f(x)\,dx \]
\[ = 0 + \int_0^1 x(4x^3)\,dx + 0 = 4\int_0^1 x^4\,dx = 4\left[\frac{x^5}{5}\right]_0^1 = 4\left(\frac{1}{5} - 0\right) = \frac{4}{5} \]
\[ Var(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 = 0 + \int_0^1 x^2(4x^3)\,dx + 0 - \left(\frac{4}{5}\right)^2 \]
\[ = 4\int_0^1 x^5\,dx - \frac{16}{25} = 4\left[\frac{x^6}{6}\right]_0^1 - \frac{16}{25} = \frac{4}{6} - \frac{16}{25} = \frac{2}{75} \]
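When the density is a polynomial on a finite interval, the integrals for the mean and variance can be evaluated exactly with rational arithmetic. The sketch below assumes a hypothetical representation in which a polynomial is stored as a coefficient list (index k holds the coefficient of x^k), and it rechecks the density f(x) = 4x^3 on [0, 1].

```python
from fractions import Fraction

def integrate_poly(coeffs, lo, hi):
    """Integrate sum(coeffs[k] * x**k) exactly from lo to hi."""
    total = Fraction(0)
    for k, c in enumerate(coeffs):
        # integral of c * x^k is c * x^(k+1) / (k+1)
        total += Fraction(c) * (Fraction(hi) ** (k + 1) - Fraction(lo) ** (k + 1)) / (k + 1)
    return total

def raw_moment(coeffs, lo, hi, r):
    """E[X**r] for the pdf sum(coeffs[k] * x**k) on [lo, hi]."""
    shifted = [0] * r + list(coeffs)  # multiplying by x**r shifts the coefficients
    return integrate_poly(shifted, lo, hi)

pdf = [0, 0, 0, 4]  # f(x) = 4x^3 on [0, 1]
total = integrate_poly(pdf, 0, 1)
mean = raw_moment(pdf, 0, 1, 1)
var = raw_moment(pdf, 0, 1, 2) - mean ** 2
print(total, mean, var)  # 1 4/5 2/75
```

The first printed value confirms that the density integrates to 1; the other two reproduce the mean and variance obtained by hand.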
Example 2
For the triangular distribution
\[ f(x) = \begin{cases} x & 0 < x \le 1 \\ 2 - x & 1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases} \]
Find the mean and variance.
Solution
\[ \mu = \int_{-\infty}^{\infty} x f(x)\,dx = 0 + \int_0^1 x \cdot x\,dx + \int_1^2 x(2 - x)\,dx + 0 \]
\[ = \int_0^1 x^2\,dx + \int_1^2 (2x - x^2)\,dx = \left[\frac{x^3}{3}\right]_0^1 + \left[x^2 - \frac{x^3}{3}\right]_1^2 \]
\[ = \left(\frac{1}{3} - 0\right) + \left[\left(4 - \frac{8}{3}\right) - \left(1 - \frac{1}{3}\right)\right] = \frac{1}{3} + \frac{4}{3} - \frac{2}{3} = 1 \]
\[ Var(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 = 0 + \int_0^1 x^2 \cdot x\,dx + \int_1^2 x^2(2 - x)\,dx + 0 - 1 \]
\[ = \int_0^1 x^3\,dx + \int_1^2 (2x^2 - x^3)\,dx - 1 = \left[\frac{x^4}{4}\right]_0^1 + \left[\frac{2x^3}{3} - \frac{x^4}{4}\right]_1^2 - 1 \]
\[ = \left(\frac{1}{4} - 0\right) + \left[\left(\frac{16}{3} - \frac{16}{4}\right) - \left(\frac{2}{3} - \frac{1}{4}\right)\right] - 1 = \frac{7}{6} - 1 = \frac{1}{6} \]
Example 3
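If the probability density function of X is given by
\[ f(x) = \begin{cases} \dfrac{x}{2} & 0 < x \le 1 \\ \dfrac{1}{2} & 1 < x \le 2 \\ \dfrac{3 - x}{2} & 2 < x < 3 \\ 0 & \text{otherwise} \end{cases} \]
Find the expected value of \(\phi(x) = x^2 - 5x + 3\).
Solution
\[ E[\phi(X)] = \int_{-\infty}^{\infty} \phi(x) f(x)\,dx \]
\[ E(X^2 - 5X + 3) = \int_{-\infty}^{\infty} (x^2 - 5x + 3) f(x)\,dx \]
\[ = \int_0^1 (x^2 - 5x + 3)\frac{x}{2}\,dx + \int_1^2 (x^2 - 5x + 3)\frac{1}{2}\,dx + \int_2^3 (x^2 - 5x + 3)\left(\frac{3 - x}{2}\right)dx \]
\[ = \frac{1}{2}\int_0^1 (x^3 - 5x^2 + 3x)\,dx + \frac{1}{2}\int_1^2 (x^2 - 5x + 3)\,dx + \frac{1}{2}\int_2^3 (-x^3 + 8x^2 - 18x + 9)\,dx \]
\[ = \frac{1}{2}\left[\frac{x^4}{4} - \frac{5x^3}{3} + \frac{3x^2}{2}\right]_0^1 + \frac{1}{2}\left[\frac{x^3}{3} - \frac{5x^2}{2} + 3x\right]_1^2 + \frac{1}{2}\left[-\frac{x^4}{4} + \frac{8x^3}{3} - \frac{18x^2}{2} + 9x\right]_2^3 \]
\[ = \frac{1}{2}\left(\frac{1}{4} - \frac{5}{3} + \frac{3}{2}\right) + \frac{1}{2}\left(\frac{8}{3} - 10 + 6 - \frac{1}{3} + \frac{5}{2} - 3\right) + \frac{1}{2}\left(-\frac{81}{4} + \frac{216}{3} - \frac{162}{2} + 27 + \frac{16}{4} - \frac{64}{3} + \frac{72}{2} - 18\right) \]
\[ = \frac{1}{24} - \frac{13}{12} - \frac{19}{24} = -\frac{11}{6} \]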
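An expectation over a piecewise density can also be approximated numerically, piece by piece. The sketch below uses composite Simpson's rule (a swapped-in numerical technique, not the book's analytic integration) on the density of this example; the names `simpson`, `pdf` and `phi` are illustrative assumptions.

```python
def simpson(g, lo, hi, n=200):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def pdf(x):
    """Piecewise density: x/2 on (0,1], 1/2 on (1,2], (3-x)/2 on (2,3)."""
    if 0 < x <= 1:
        return x / 2
    if 1 < x <= 2:
        return 0.5
    if 2 < x < 3:
        return (3 - x) / 2
    return 0.0

def phi(x):
    return x * x - 5 * x + 3

pieces = [(0, 1), (1, 2), (2, 3)]
total_prob = sum(simpson(pdf, a, b) for a, b in pieces)
expectation = sum(simpson(lambda x: phi(x) * pdf(x), a, b) for a, b in pieces)
print(round(total_prob, 6), round(expectation, 6))
```

Integrating each piece separately keeps every integrand a cubic or lower, for which Simpson's rule is exact up to rounding, so the result should match -11/6 closely.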
Example 4
A continuous random variable has the probability density function
\[ f(x) = \begin{cases} k x e^{-\lambda x} & x \ge 0,\ \lambda > 0 \\ 0 & \text{otherwise} \end{cases} \]
Determine (i) k, (ii) mean, and (iii) variance.
Solution
(i) Since f(x) is a probability density function,
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ 0 + \int_0^{\infty} k x e^{-\lambda x}\,dx = 1 \]
\[ k\left[x\left(\frac{e^{-\lambda x}}{-\lambda}\right) - (1)\left(\frac{e^{-\lambda x}}{\lambda^2}\right)\right]_0^{\infty} = 1 \]
\[ k\left[(0 - 0) - \left(0 - \frac{1}{\lambda^2}\right)\right] = 1 \]
\[ k = \lambda^2 \]
Hence, \(f(x) = \lambda^2 x e^{-\lambda x}\) for \(x \ge 0,\ \lambda > 0\), and f(x) = 0 otherwise.
(ii) \[ \text{Mean} = \mu = \int_{-\infty}^{\infty} x f(x)\,dx = 0 + \int_0^{\infty} x \lambda^2 x e^{-\lambda x}\,dx = \lambda^2 \int_0^{\infty} x^2 e^{-\lambda x}\,dx \]
\[ = \lambda^2 \left[x^2\left(\frac{e^{-\lambda x}}{-\lambda}\right) - 2x\left(\frac{e^{-\lambda x}}{\lambda^2}\right) + 2\left(\frac{e^{-\lambda x}}{-\lambda^3}\right)\right]_0^{\infty} \]
\[ = \lambda^2 \left[(0 - 0 + 0) - \left(0 - 0 - \frac{2}{\lambda^3}\right)\right] = \frac{2}{\lambda} \]
(iii) \[ \text{Variance} = \sigma^2 = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 = 0 + \int_0^{\infty} x^2 \lambda^2 x e^{-\lambda x}\,dx - \left(\frac{2}{\lambda}\right)^2 = \lambda^2 \int_0^{\infty} x^3 e^{-\lambda x}\,dx - \frac{4}{\lambda^2} \]
\[ = \lambda^2 \left[x^3\left(\frac{e^{-\lambda x}}{-\lambda}\right) - 3x^2\left(\frac{e^{-\lambda x}}{\lambda^2}\right) + 6x\left(\frac{e^{-\lambda x}}{-\lambda^3}\right) - 6\left(\frac{e^{-\lambda x}}{\lambda^4}\right)\right]_0^{\infty} - \frac{4}{\lambda^2} \]
\[ = \lambda^2 \left[(0 - 0 + 0 - 0) - \left(0 - 0 + 0 - \frac{6}{\lambda^4}\right)\right] - \frac{4}{\lambda^2} = \frac{6}{\lambda^2} - \frac{4}{\lambda^2} = \frac{2}{\lambda^2} \]
Example 5
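The probability density f(x) of a continuous random variable is given
by \(f(x) = k e^{-|x|}\), \(-\infty < x < \infty\). (i) Show that \(k = \frac{1}{2}\), and (ii) find the mean
and variance of the distribution. (iii) Also, find the probability that the
variate lies between 0 and 4.
Solution
(i) Since f(x) is a probability density function,
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ k \int_{-\infty}^{\infty} e^{-|x|}\,dx = 1 \]
\[ 2k \int_0^{\infty} e^{-|x|}\,dx = 1 \quad [\because e^{-|x|} \text{ is an even function}] \]
\[ 2k \int_0^{\infty} e^{-x}\,dx = 1 \quad [\because |x| = x,\ 0 \le x < \infty] \]
\[ 2k \left[-e^{-x}\right]_0^{\infty} = 1 \]
\[ -2k(0 - 1) = 1 \]
\[ k = \frac{1}{2} \]
Hence, \(f(x) = \frac{1}{2} e^{-|x|}\), \(-\infty < x < \infty\).
(ii) \[ \mu = \int_{-\infty}^{\infty} x f(x)\,dx = \frac{1}{2}\int_{-\infty}^{\infty} x e^{-|x|}\,dx = 0 \quad [\because \text{the integrand is an odd function}] \]
\[ Var(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 = 2\left(\frac{1}{2}\right)\int_0^{\infty} x^2 e^{-|x|}\,dx \quad [\because \text{the integrand is an even function}] \]
\[ = \int_0^{\infty} x^2 e^{-x}\,dx = \left[x^2\left(\frac{e^{-x}}{-1}\right) - 2x\left(\frac{e^{-x}}{1}\right) + 2\left(\frac{e^{-x}}{-1}\right)\right]_0^{\infty} = 0 - (-2) = 2 \]
(iii) Probability that the variate lies between 0 and 4
\[ P(0 < X < 4) = \int_0^4 f(x)\,dx = \frac{1}{2}\int_0^4 e^{-|x|}\,dx = \frac{1}{2}\int_0^4 e^{-x}\,dx \quad [\because |x| = x,\ 0 < x < 4] \]
\[ = \frac{1}{2}\left[-e^{-x}\right]_0^4 = -\frac{1}{2}(e^{-4} - 1) = 0.4908 \]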
Example 6
The daily consumption of electric power is a random variable X with
probability density function
\[ f(x) = \begin{cases} k x e^{-x/3} & x > 0 \\ 0 & x \le 0 \end{cases} \]
Find the value of k, the expectation of X, and the probability that on a
given day, the electric consumption is more than the expected value.
Solution
Since f(x) is a probability density function,
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ 0 + \int_0^{\infty} k x e^{-x/3}\,dx = 1 \]
\[ k\left[x\left(\frac{e^{-x/3}}{-\frac{1}{3}}\right) - (1)\left(\frac{e^{-x/3}}{\frac{1}{9}}\right)\right]_0^{\infty} = 1 \]
\[ k\,[(0 - 0) - (0 - 9)] = 1 \]
\[ 9k = 1, \quad k = \frac{1}{9} \]
Hence, \(f(x) = \frac{1}{9} x e^{-x/3}\) for x > 0, and f(x) = 0 for x ≤ 0.
\[ E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = 0 + \int_0^{\infty} x \cdot \frac{1}{9} x e^{-x/3}\,dx = \frac{1}{9}\int_0^{\infty} x^2 e^{-x/3}\,dx \]
\[ = \frac{1}{9}\left[x^2\left(\frac{e^{-x/3}}{-\frac{1}{3}}\right) - 2x\left(\frac{e^{-x/3}}{\frac{1}{9}}\right) + 2\left(\frac{e^{-x/3}}{-\frac{1}{27}}\right)\right]_0^{\infty} = \frac{1}{9}(0 - 0 + 0 + 54) = 6 \]
\[ P(X > 6) = \int_6^{\infty} f(x)\,dx = \frac{1}{9}\int_6^{\infty} x e^{-x/3}\,dx \]
\[ = \frac{1}{9}\left[x\left(\frac{e^{-x/3}}{-\frac{1}{3}}\right) - (1)\left(\frac{e^{-x/3}}{\frac{1}{9}}\right)\right]_6^{\infty} = \frac{1}{9}\left[(0 - 0) - \left(-18e^{-2} - 9e^{-2}\right)\right] \]
\[ = 3e^{-2} = 0.406 \]
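The probability of exceeding the expected value is the upper-tail integral of the density from 6 to infinity. As a rough cross-check (a sketch under the assumption that truncating the infinite range at x = 200 is harmless), the closed form obtained by integration by parts, P(X > t) = (1 + t/3) e^{-t/3}, can be compared with a midpoint-rule approximation. The function names are illustrative.

```python
import math

def pdf(x):
    """f(x) = (1/9) x e^(-x/3) for x > 0, else 0."""
    return x * math.exp(-x / 3) / 9 if x > 0 else 0.0

def tail_probability(t):
    """P(X > t) for t >= 0, from integrating the pdf by parts."""
    return (1 + t / 3) * math.exp(-t / 3)

def midpoint_integral(g, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

exact = tail_probability(6)               # equals 3 * e^(-2)
numeric = midpoint_integral(pdf, 6, 200)  # upper limit 200 stands in for infinity
print(round(exact, 3), round(numeric, 3))
```

Both values round to 0.406, in agreement with the result 3e^{-2}.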
Example 7
Let X be a random variable with E(X) = 10 and Var(X) = 25. Find the
positive values of a and b such that Y = aX – b has an expectation of
0 and a variance of 1.
Solution
\[ E(Y) = E(aX - b) \]
\[ 0 = aE(X) - b = a(10) - b \]
\[ 10a - b = 0 \]
\[ Var(Y) = Var(aX - b) \]
\[ 1 = a^2\,Var(X) = a^2(25) \]
\[ 25a^2 = 1 \]
\[ a = \frac{1}{5} \]
\[ b = 10a = 2 \]
Example 8
A continuous random variable X is distributed over the interval [0, 1]
with pdf f (x) = ax2 + bx, where a, b are constants. If the mean of X is 0.5,
find the values of a and b.
Solution
Since f(x) is a probability density function,
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ 0 + \int_0^1 (ax^2 + bx)\,dx + 0 = 1 \]
\[ \left[\frac{ax^3}{3} + \frac{bx^2}{2}\right]_0^1 = 1 \]
\[ \frac{a}{3} + \frac{b}{2} = 1 \]
\[ 2a + 3b = 6 \quad ...(1) \]
Also, \(\mu = 0.5\)
\[ \int_0^1 x f(x)\,dx = 0.5 \]
\[ \int_0^1 (ax^3 + bx^2)\,dx = 0.5 \]
\[ \left[\frac{ax^4}{4} + \frac{bx^3}{3}\right]_0^1 = 0.5 \]
\[ \frac{a}{4} + \frac{b}{3} = 0.5 \]
\[ 3a + 4b = 6 \quad ...(2) \]
Solving Eqs (1) and (2),
\[ a = -6, \quad b = 6 \]
Example 9
A continuous random variable X has the pdf defined by f(x) = A + Bx,
0 ≤ x ≤ 1. If the mean of the distribution is \(\frac{1}{3}\), find A and B.
Solution
Since f(x) is a probability density function,
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ 0 + \int_0^1 (A + Bx)\,dx + 0 = 1 \]
\[ \left[Ax + \frac{Bx^2}{2}\right]_0^1 = 1 \]
\[ A + \frac{B}{2} = 1 \quad ...(1) \]
Also, \(\mu = \frac{1}{3}\)
\[ \int_{-\infty}^{\infty} x f(x)\,dx = \frac{1}{3} \]
\[ 0 + \int_0^1 x(A + Bx)\,dx = \frac{1}{3} \]
\[ \int_0^1 (Ax + Bx^2)\,dx = \frac{1}{3} \]
\[ \left[\frac{Ax^2}{2} + \frac{Bx^3}{3}\right]_0^1 = \frac{1}{3} \]
\[ \frac{A}{2} + \frac{B}{3} = \frac{1}{3} \]
\[ 3A + 2B = 2 \quad ...(2) \]
Solving Eqs (1) and (2),
\[ A = 2, \quad B = -2 \]
Example 10
A continuous random variable has probability density function
\[ f(x) = 6(x - x^2), \quad 0 \le x \le 1. \]
Find the (i) mean, (ii) variance, (iii) median, and (iv) mode.
Solution
(i) \[ \mu = \int_{-\infty}^{\infty} x f(x)\,dx = 0 + \int_0^1 x \cdot 6(x - x^2)\,dx + 0 = 6\int_0^1 (x^2 - x^3)\,dx \]
\[ = 6\left[\frac{x^3}{3} - \frac{x^4}{4}\right]_0^1 = 6\left(\frac{1}{3} - \frac{1}{4}\right) = \frac{1}{2} \]
(ii) \[ Var(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 = 0 + \int_0^1 x^2 \cdot 6(x - x^2)\,dx + 0 - \frac{1}{4} = 6\int_0^1 (x^3 - x^4)\,dx - \frac{1}{4} \]
\[ = 6\left[\frac{x^4}{4} - \frac{x^5}{5}\right]_0^1 - \frac{1}{4} = 6\left(\frac{1}{4} - \frac{1}{5}\right) - \frac{1}{4} = \frac{6}{20} - \frac{1}{4} = \frac{1}{20} \]
(iii) \[ \int_a^M f(x)\,dx = \int_M^b f(x)\,dx = \frac{1}{2} \]
\[ \int_0^M 6(x - x^2)\,dx = \frac{1}{2} \]
\[ 6\left[\frac{x^2}{2} - \frac{x^3}{3}\right]_0^M = \frac{1}{2} \]
\[ 6\left(\frac{M^2}{2} - \frac{M^3}{3}\right) = \frac{1}{2} \]
\[ 3M^2 - 2M^3 = \frac{1}{2} \]
\[ 4M^3 - 6M^2 + 1 = 0 \]
\[ (2M - 1)(2M^2 - 2M - 1) = 0 \]
\[ M = \frac{1}{2} \quad \text{or} \quad M = \frac{1 \pm \sqrt{3}}{2} \]
Only \(M = \frac{1}{2}\) lies in (0, 1).
Hence, median \(M = \frac{1}{2}\)
(iv) Mode is the value of x for which f(x) is maximum. For f(x) to be maximum,
\(f'(x) = 0\) and \(f''(x) < 0\).
\[ f'(x) = 0 \]
\[ 6(1 - 2x) = 0 \]
\[ x = \frac{1}{2} \]
\[ f''(x) = -12 \]
At \(x = \frac{1}{2}\), \(f''(x) = -12 < 0\)
Hence, f(x) is maximum at \(x = \frac{1}{2}\).
\[ \text{Mode} = \frac{1}{2} \]
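The median condition F(M) = 1/2 can also be solved numerically instead of factoring the cubic. The sketch below assumes the cumulative distribution function F(x) = 3x^2 - 2x^3 obtained by integrating 6(x - x^2) on [0, 1] and applies bisection; the function names are illustrative, and bisection is a swapped-in numerical technique rather than the method of the text.

```python
def cdf(x):
    """F(x) = 3x^2 - 2x^3 for 0 <= x <= 1, from integrating 6(x - x^2)."""
    return 3 * x ** 2 - 2 * x ** 3

def quantile(p, lo=0.0, hi=1.0, tol=1e-12):
    """Solve cdf(x) = p on [lo, hi] by bisection; cdf is increasing there."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

median = quantile(0.5)
print(round(median, 6))
```

The same `quantile` helper gives the quartiles when called with p = 0.25 and p = 0.75.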
Example 11
The probability density function of a random variable X is
\[ f(x) = \begin{cases} \dfrac{1}{2}\sin x & 0 \le x \le \pi \\ 0 & \text{otherwise} \end{cases} \]
Find the mean, mode, and median of the distribution, and also find the
probability between 0 and \(\frac{\pi}{2}\).
Solution
(i) \[ \mu = \int_{-\infty}^{\infty} x f(x)\,dx = 0 + \int_0^{\pi} x\left(\frac{1}{2}\sin x\right)dx + 0 = \frac{1}{2}\int_0^{\pi} x \sin x\,dx \]
\[ = \frac{1}{2}\left[-x\cos x + \sin x\right]_0^{\pi} = \frac{\pi}{2} \]
(ii) Mode is the value of x for which f(x) is maximum. For f(x) to be maximum,
\(f'(x) = 0\) and \(f''(x) < 0\).
\[ f'(x) = 0 \]
\[ \frac{1}{2}\cos x = 0 \]
\[ x = \frac{\pi}{2} \]
\[ f''(x) = -\frac{1}{2}\sin x \]
At \(x = \frac{\pi}{2}\), \(f''(x) = -\frac{1}{2} < 0\)
Hence, f(x) is maximum at \(x = \frac{\pi}{2}\).
\[ \text{Mode} = \frac{\pi}{2} \]
(iii) \[ \int_a^M f(x)\,dx = \int_M^b f(x)\,dx = \frac{1}{2} \]
\[ \int_0^M \frac{1}{2}\sin x\,dx = \int_M^{\pi} \frac{1}{2}\sin x\,dx = \frac{1}{2} \]
\[ \frac{1}{2}\left[-\cos x\right]_0^M = \frac{1}{2} \]
\[ -\frac{1}{2}(\cos M - 1) = \frac{1}{2} \]
\[ 1 - \cos M = 1 \]
\[ \cos M = 0 \]
\[ M = \frac{\pi}{2} \]
Hence, median \(M = \frac{\pi}{2}\)
(iv) \[ P\left(0 < X < \frac{\pi}{2}\right) = \int_0^{\pi/2} f(x)\,dx = \int_0^{\pi/2} \frac{1}{2}\sin x\,dx = \frac{1}{2}\left[-\cos x\right]_0^{\pi/2} = -\frac{1}{2}(0 - 1) = \frac{1}{2} \]
Example 12
The cumulative distribution function of a continuous random variable X
is
\[ F(x) = \begin{cases} 1 - e^{-2x} & x \ge 0 \\ 0 & x < 0 \end{cases} \]
Find (i) the probability density function, (ii) mean, and
(iii) variance.
Solution
(i) \[ f(x) = \frac{d}{dx} F(x) \]
\[ f(x) = \begin{cases} 2e^{-2x} & x \ge 0 \\ 0 & x < 0 \end{cases} \]
(ii) \[ \mu = \int_{-\infty}^{\infty} x f(x)\,dx = 0 + \int_0^{\infty} x \cdot 2e^{-2x}\,dx = 2\int_0^{\infty} x e^{-2x}\,dx \]
\[ = 2\left[x\left(\frac{e^{-2x}}{-2}\right) - (1)\left(\frac{e^{-2x}}{4}\right)\right]_0^{\infty} = 2\left[(0 - 0) - \left(0 - \frac{1}{4}\right)\right] = \frac{1}{2} \]
(iii) \[ Var(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 = 0 + \int_0^{\infty} x^2 \cdot 2e^{-2x}\,dx - \left(\frac{1}{2}\right)^2 = 2\int_0^{\infty} x^2 e^{-2x}\,dx - \frac{1}{4} \]
\[ = 2\left[x^2\left(\frac{e^{-2x}}{-2}\right) - 2x\left(\frac{e^{-2x}}{4}\right) + 2\left(\frac{e^{-2x}}{-8}\right)\right]_0^{\infty} - \frac{1}{4} \]
\[ = 2\left[(0 - 0 - 0) - \left(0 - 0 - \frac{1}{4}\right)\right] - \frac{1}{4} = \frac{1}{2} - \frac{1}{4} = \frac{1}{4} \]
Example 13
A continuous random variable X has the distribution function
\[ F(x) = \begin{cases} 0 & x \le 1 \\ k(x - 1)^4 & 1 < x \le 3 \\ 1 & x > 3 \end{cases} \]
Determine (i) f(x), (ii) k, and (iii) mean.
Solution
(i) \[ f(x) = \frac{d}{dx} F(x) \]
\[ f(x) = \begin{cases} 0 & x \le 1 \\ 4k(x - 1)^3 & 1 < x \le 3 \\ 0 & x > 3 \end{cases} \]
(ii) Since f(x) is a probability density function,
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ 0 + \int_1^3 4k(x - 1)^3\,dx + 0 = 1 \]
\[ 4k\left[\frac{(x - 1)^4}{4}\right]_1^3 = 1 \]
\[ k(16 - 0) = 1 \]
\[ k = \frac{1}{16} \]
Hence,
\[ f(x) = \begin{cases} 0 & x \le 1 \\ \dfrac{1}{4}(x - 1)^3 & 1 < x \le 3 \\ 0 & x > 3 \end{cases} \]
(iii) \[ \mu = \int_{-\infty}^{\infty} x f(x)\,dx = 0 + \int_1^3 x \cdot \frac{1}{4}(x - 1)^3\,dx + 0 = \frac{1}{4}\int_1^3 x(x - 1)^3\,dx \]
Putting x – 1 = t: when x = 1, t = 0; when x = 3, t = 2.
\[ \mu = \frac{1}{4}\int_0^2 (t + 1)t^3\,dt = \frac{1}{4}\int_0^2 (t^4 + t^3)\,dt = \frac{1}{4}\left[\frac{t^5}{5} + \frac{t^4}{4}\right]_0^2 = \frac{1}{4}\left[\left(\frac{2^5}{5} + \frac{2^4}{4}\right) - 0\right] = 2.6 \]
Example 14
If the density function of a random variable X is given by
f(x) = kx (1 – x), 0 £ x £ 1,
find (i) AM, (ii) HM, (iii) Median, (iv) Mode, (v) SD, (vi) MD about the
mean.
Solution
(i) Since f(x) is a probability density function,
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ \int_0^1 kx(1 - x)\,dx = 1 \]
\[ k\left[\frac{x^2}{2} - \frac{x^3}{3}\right]_0^1 = 1 \]
\[ k\left(\frac{1}{2} - \frac{1}{3}\right) = 1 \]
\[ k = 6 \]
Hence, f(x) = 6x(1 – x), 0 ≤ x ≤ 1
(ii) \[ AM = \mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^1 x \cdot 6x(1 - x)\,dx = 6\int_0^1 (x^2 - x^3)\,dx \]
\[ = 6\left[\frac{x^3}{3} - \frac{x^4}{4}\right]_0^1 = 6\left(\frac{1}{3} - \frac{1}{4}\right) = \frac{1}{2} \]
(iii) \[ \frac{1}{H} = \int_{-\infty}^{\infty} \frac{1}{x} f(x)\,dx = \int_0^1 \frac{1}{x} \cdot 6x(1 - x)\,dx = 6\int_0^1 (1 - x)\,dx \]
\[ = 6\left[x - \frac{x^2}{2}\right]_0^1 = 6\left(1 - \frac{1}{2}\right) = 3 \]
\[ H = \frac{1}{3} \]
(iv) \[ \int_0^M f(x)\,dx = \frac{1}{2} \]
\[ 6\int_0^M (x - x^2)\,dx = \frac{1}{2} \]
\[ 6\left[\frac{x^2}{2} - \frac{x^3}{3}\right]_0^M = \frac{1}{2} \]
\[ 6\left(\frac{M^2}{2} - \frac{M^3}{3}\right) = \frac{1}{2} \]
\[ 3M^2 - 2M^3 = \frac{1}{2} \]
\[ 6M^2 - 4M^3 = 1 \]
\[ 4M^3 - 6M^2 + 1 = 0 \]
\[ M = \frac{1}{2},\ \frac{1}{2} \pm \frac{\sqrt{3}}{2} \]
The values \(M = \frac{1}{2} \pm \frac{\sqrt{3}}{2}\) lie outside (0, 1).
Hence, \(M = \frac{1}{2}\)
(v) Mode is the value of x for which f(x) is maximum. For f(x) to be maximum,
\(f'(x) = 0\) and \(f''(x) < 0\).
\[ f'(x) = 0 \]
\[ 6 - 12x = 0 \]
\[ x = \frac{1}{2} \]
\[ f''(x) = -12 < 0 \]
Hence, f(x) is maximum at \(x = \frac{1}{2}\).
\[ \text{Mode} = \frac{1}{2} \]
As the mean, median and mode are equal, the distribution is symmetrical.
(vi) \[ E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_0^1 x^2 \cdot 6x(1 - x)\,dx = 6\int_0^1 (x^3 - x^4)\,dx \]
\[ = 6\left[\frac{x^4}{4} - \frac{x^5}{5}\right]_0^1 = 6\left(\frac{1}{4} - \frac{1}{5}\right) = \frac{3}{10} \]
\[ Var(X) = E(X^2) - \{E(X)\}^2 = \frac{3}{10} - \left(\frac{1}{2}\right)^2 = \frac{1}{20} \]
\[ SD = \sqrt{Var(X)} = \frac{1}{\sqrt{20}} = \frac{1}{2\sqrt{5}} \]
(vii) Mean deviation about the mean
\[ MD = \int_{-\infty}^{\infty} |x - \mu|\, f(x)\,dx = \int_0^1 \left|x - \frac{1}{2}\right| 6x(1 - x)\,dx \]
\[ = \int_0^{1/2} \left(\frac{1}{2} - x\right) 6x(1 - x)\,dx + \int_{1/2}^1 \left(x - \frac{1}{2}\right) 6x(1 - x)\,dx \]
\[ = \int_0^{1/2} (3x - 9x^2 + 6x^3)\,dx + \int_{1/2}^1 (-3x + 9x^2 - 6x^3)\,dx \]
\[ = \left[\frac{3x^2}{2} - 3x^3 + \frac{3x^4}{2}\right]_0^{1/2} + \left[-\frac{3x^2}{2} + 3x^3 - \frac{3x^4}{2}\right]_{1/2}^1 \]
\[ = \left(\frac{3}{8} - \frac{3}{8} + \frac{3}{32}\right) + \left(-\frac{3}{2} + 3 - \frac{3}{2}\right) - \left(-\frac{3}{8} + \frac{3}{8} - \frac{3}{32}\right) = \frac{3}{16} \]
Example 15
Prove that geometric mean G of the distribution
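\[ f(x) = 6(2 - x)(x - 1), \quad 1 \le x \le 2 \]
is given by 6 log(16G) = 19.
Solution
\[ \log G = \int_{-\infty}^{\infty} (\log x) f(x)\,dx = \int_1^2 (\log x)\, 6(2 - x)(x - 1)\,dx = -6\int_1^2 (x^2 - 3x + 2)\log x\,dx \]
\[ = -6\left\{\left[\left(\frac{x^3}{3} - \frac{3x^2}{2} + 2x\right)\log x\right]_1^2 - \int_1^2 \left(\frac{x^3}{3} - \frac{3x^2}{2} + 2x\right)\frac{1}{x}\,dx\right\} \]
\[ = -6\left\{\left(\frac{8}{3} - 6 + 4\right)\log 2 - \int_1^2 \left(\frac{x^2}{3} - \frac{3x}{2} + 2\right)dx\right\} \]
\[ = -6\left\{\frac{2}{3}\log 2 - \left[\frac{x^3}{9} - \frac{3x^2}{4} + 2x\right]_1^2\right\} \]
\[ = -6\left\{\frac{2}{3}\log 2 - \left(\frac{8}{9} - 3 + 4\right) + \left(\frac{1}{9} - \frac{3}{4} + 2\right)\right\} \]
\[ = -6\left\{\frac{2}{3}\log 2 - \frac{17}{9} + \frac{49}{36}\right\} = -4\log 2 + \frac{19}{6} \]
\[ \log G + 4\log 2 = \frac{19}{6} \]
\[ \log(G \times 2^4) = \frac{19}{6} \]
\[ \log(16G) = \frac{19}{6}, \quad \text{i.e.,} \quad 6\log(16G) = 19 \]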
Example 16
The probability distribution of a random variable X is
\[ f(x) = k \sin\frac{\pi x}{5}, \quad 0 \le x \le 5 \]
Determine the constant k and obtain the median and quartiles of the
distribution.
Solution
Since f(x) is a probability distribution,
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ \int_0^5 k \sin\frac{\pi x}{5}\,dx = 1 \]
\[ k\left[\frac{-\cos\frac{\pi x}{5}}{\frac{\pi}{5}}\right]_0^5 = 1 \]
\[ \frac{5k}{\pi}(-\cos\pi + \cos 0) = 1 \]
\[ \frac{5k}{\pi}[-(-1) + 1] = 1 \]
\[ \frac{10k}{\pi} = 1, \quad k = \frac{\pi}{10} \]
Hence, \(f(x) = \frac{\pi}{10}\sin\frac{\pi x}{5}\), 0 ≤ x ≤ 5
The rth quartile \(Q_r\) is given by
\[ \int_{-\infty}^{Q_r} f(x)\,dx = \frac{r}{4}, \quad r = 1, 2, 3 \]
\[ \int_0^{Q_r} \frac{\pi}{10}\sin\frac{\pi x}{5}\,dx = \frac{r}{4} \]
\[ \frac{\pi}{10}\left[\frac{-\cos\frac{\pi x}{5}}{\frac{\pi}{5}}\right]_0^{Q_r} = \frac{r}{4} \]
\[ \frac{1}{2}\left(-\cos\frac{\pi Q_r}{5} + \cos 0\right) = \frac{r}{4} \]
\[ -\cos\frac{\pi Q_r}{5} + 1 = \frac{r}{2} \]
\[ \cos\frac{\pi Q_r}{5} = 1 - \frac{r}{2} \]
\[ \frac{\pi}{5} Q_r = \cos^{-1}\left(1 - \frac{r}{2}\right) \]
\[ Q_r = \frac{5}{\pi}\cos^{-1}\left(1 - \frac{r}{2}\right) \]
\[ Q_1 = \frac{5}{\pi}\cos^{-1}\left(\frac{1}{2}\right) = \frac{5}{\pi}\left(\frac{\pi}{3}\right) = \frac{5}{3} \]
\[ Q_2 = \frac{5}{\pi}\cos^{-1}(0) = \frac{5}{\pi}\left(\frac{\pi}{2}\right) = \frac{5}{2} \]
\[ Q_3 = \frac{5}{\pi}\cos^{-1}\left(-\frac{1}{2}\right) = \frac{5}{\pi}\left(\frac{2\pi}{3}\right) = \frac{10}{3} \]
\[ \text{Median} = Q_2 = \frac{5}{2} \]
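The quartile formula derived above is easy to evaluate and to check against the distribution function. In this short sketch the function names `quartile` and `cdf` are assumptions chosen for illustration; `cdf` is the integral of the density from 0 to x.

```python
import math

def quartile(r):
    """Q_r = (5 / pi) * arccos(1 - r / 2) for r = 1, 2, 3."""
    return 5 / math.pi * math.acos(1 - r / 2)

def cdf(x):
    """F(x) = (1 - cos(pi x / 5)) / 2 on 0 <= x <= 5."""
    return (1 - math.cos(math.pi * x / 5)) / 2

for r in (1, 2, 3):
    q = quartile(r)
    # each quartile should satisfy F(Q_r) = r / 4
    print(r, round(q, 4), round(cdf(q), 4))
```

Each row shows the quartile and the value of F at that point, which should equal r/4.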
Example 17
Find the median, mode and quartile deviation of the continuous random
variable X, given that its density function is
\[ f(x) = \frac{k}{1 + x^2}, \quad -\infty < x < \infty. \]
Solution
(i) Since f(x) is a probability density function,
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ \int_{-\infty}^{\infty} \frac{k}{1 + x^2}\,dx = 1 \]
\[ 2k\int_0^{\infty} \frac{1}{1 + x^2}\,dx = 1 \quad \left[\because \int_{-a}^{a} f(x)\,dx = 2\int_0^a f(x)\,dx \text{ if } f(x) \text{ is an even function}\right] \]
\[ 2k\left[\tan^{-1} x\right]_0^{\infty} = 1 \]
\[ 2k(\tan^{-1}\infty - \tan^{-1} 0) = 1 \]
\[ 2k\left(\frac{\pi}{2}\right) = 1 \]
\[ k = \frac{1}{\pi} \]
Hence, \(f(x) = \frac{1}{\pi(1 + x^2)}\), \(-\infty < x < \infty\)
\[ f'(x) = 0 \]
\[ -\frac{2x}{\pi(1 + x^2)^2} = 0 \]
\[ x = 0 \]
\[ f''(x) = -\frac{2}{\pi}\left[\frac{(1 + x^2)^2 - x \cdot 2(1 + x^2) \cdot 2x}{(1 + x^2)^4}\right] = \frac{2}{\pi}\left[\frac{3x^2 - 1}{(1 + x^2)^3}\right] \]
\[ f''(0) = -\frac{2}{\pi} < 0 \]
Hence, f(x) is maximum at x = 0.
Mode = 0
Example 18
Find the mean, variance and the coefficients b1, b2 of the distribution
\[ f(x) = kx^2 e^{-x}, \quad 0 < x < \infty \]
Solution
Since f(x) is a probability density function,
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ \int_0^{\infty} kx^2 e^{-x}\,dx = 1 \]
\[ k\left[x^2(-e^{-x}) - 2x e^{-x} + 2(-e^{-x})\right]_0^{\infty} = 1 \]
\[ k(2e^0) = 1 \]
\[ k = \frac{1}{2} \]
Hence, \(f(x) = \frac{1}{2}x^2 e^{-x}\), \(0 < x < \infty\)
\[ \mu_r' = \int_{-\infty}^{\infty} x^r f(x)\,dx = \int_0^{\infty} x^r \cdot \frac{1}{2}x^2 e^{-x}\,dx = \frac{1}{2}\int_0^{\infty} e^{-x} x^{r+2}\,dx \]
\[ = \frac{1}{2}\,\Gamma(r + 3) = \frac{1}{2}(r + 2)! \]
\[ \mu_1' = \frac{1}{2}(3!) = 3 \]
\[ \mu_2' = \frac{1}{2}(4!) = 12 \]
\[ \mu_3' = \frac{1}{2}(5!) = 60 \]
\[ \mu_4' = \frac{1}{2}(6!) = 360 \]
\[ \mu_2 = \mu_2' - (\mu_1')^2 = 12 - (3)^2 = 3 \]
Example 19
The probability density function of a random variable X is given by
f(x) = kx(2 – x), 0 ≤ x ≤ 2. Find the mean, variance, \(\beta_1\) and \(\beta_2\).
Solution
Since f(x) is a probability density function,
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ \int_0^2 kx(2 - x)\,dx = 1 \]
\[ k\int_0^2 (2x - x^2)\,dx = 1 \]
\[ k\left[x^2 - \frac{x^3}{3}\right]_0^2 = 1 \]
\[ k\left(4 - \frac{8}{3}\right) = 1 \]
\[ k = \frac{3}{4} \]
Hence, \(f(x) = \frac{3}{4}x(2 - x)\), 0 ≤ x ≤ 2
\[ \mu_r' = \int_{-\infty}^{\infty} x^r f(x)\,dx = \int_0^2 x^r \cdot \frac{3}{4}x(2 - x)\,dx = \frac{3}{4}\int_0^2 x^{r+1}(2 - x)\,dx = \frac{3(2^{r+1})}{(r + 2)(r + 3)} \]
\[ \mu_1' = \frac{3(2^2)}{(3)(4)} = 1 \]
\[ \mu_2' = \frac{3(2^3)}{(4)(5)} = \frac{6}{5} \]
\[ \mu_3' = \frac{3(2^4)}{(5)(6)} = \frac{8}{5} \]
\[ \mu_4' = \frac{3(2^5)}{(6)(7)} = \frac{16}{7} \]
\[ \mu_2 = \mu_2' - (\mu_1')^2 = \frac{6}{5} - 1 = \frac{1}{5} \]
\[ \mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 = \frac{8}{5} - 3\left(\frac{6}{5}\right)(1) + 2 = 0 \]
\[ \mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4 = \frac{16}{7} - 4\left(\frac{8}{5}\right)(1) + 6\left(\frac{6}{5}\right)(1)^2 - 3(1)^4 = \frac{3}{35} \]
Mean = \(\mu_1'\) = 1
Variance = \(\mu_2 = \frac{1}{5}\)
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} = 0 \]
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{\frac{3}{35}}{\left(\frac{1}{5}\right)^2} = \frac{15}{7} \]
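The conversion from raw moments to central moments used in this example follows a fixed pattern, so it can be packaged once and reused. The sketch below is illustrative only: the function names are assumptions, and `raw_moment` encodes the closed form 3(2^{r+1}) / ((r + 2)(r + 3)) that produces the values 1, 6/5, 8/5 and 16/7 above.

```python
from fractions import Fraction

def central_from_raw(m1, m2, m3, m4):
    """Convert the first four raw moments into (mu2, mu3, mu4)."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4

def raw_moment(r):
    """r-th raw moment of f(x) = (3/4) x (2 - x) on [0, 2]."""
    return Fraction(3 * 2 ** (r + 1), (r + 2) * (r + 3))

raw = [raw_moment(r) for r in range(1, 5)]
mu2, mu3, mu4 = central_from_raw(*raw)
beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2
print(mu2, mu3, mu4, beta1, beta2)  # 1/5 0 3/35 0 15/7
```

Using exact fractions avoids rounding, so the printed values can be compared directly with the hand computation.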
Example 20
Show that for the symmetrical distribution
\[ f(x) = \frac{2a}{\pi}\left(\frac{1}{a^2 + x^2}\right), \quad -a \le x \le a \]
\[ \mu_2 = \frac{a^2(4 - \pi)}{\pi} \quad \text{and} \quad \mu_4 = a^4\left(1 - \frac{8}{3\pi}\right) \]
Solution
\[ \int_{-\infty}^{\infty} f(x)\,dx = \int_{-a}^{a} \frac{2a}{\pi}\left(\frac{1}{a^2 + x^2}\right)dx = \frac{2a}{\pi}\left[\frac{1}{a}\tan^{-1}\frac{x}{a}\right]_{-a}^{a} = \frac{2}{\pi}\left[\tan^{-1}(1) - \tan^{-1}(-1)\right] = 1 \]
Hence, f(x) represents a probability density function.
\[ \mu_1' = \int_{-\infty}^{\infty} x f(x)\,dx = \frac{2a}{\pi}\int_{-a}^{a} \frac{x}{a^2 + x^2}\,dx = \frac{2a}{\pi}\left[\frac{1}{2}\log(a^2 + x^2)\right]_{-a}^{a} = 0 \quad [\because \text{the integrand is an odd function of } x] \]
\[ \mu_2' = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \frac{2a}{\pi}\int_{-a}^{a} \frac{x^2}{a^2 + x^2}\,dx = \frac{4a}{\pi}\int_0^a \frac{x^2 + a^2 - a^2}{a^2 + x^2}\,dx \]
\[ = \frac{4a}{\pi}\int_0^a \left(1 - \frac{a^2}{a^2 + x^2}\right)dx = \frac{4a}{\pi}\left[x - a\tan^{-1}\frac{x}{a}\right]_0^a = \frac{4a}{\pi}\left(a - a\tan^{-1} 1\right) \]
\[ = \frac{4a}{\pi}\left(a - a\frac{\pi}{4}\right) = \frac{a^2(4 - \pi)}{\pi} \]
\[ \mu_2 = \mu_2' - (\mu_1')^2 = \frac{a^2(4 - \pi)}{\pi} - 0 = \frac{a^2(4 - \pi)}{\pi} \]
\[ \mu_4 = \mu_4' \quad (\because \mu_1' = 0) \]
\[ \mu_4 = \int_{-\infty}^{\infty} x^4 f(x)\,dx = \frac{2a}{\pi}\int_{-a}^{a} \frac{x^4}{a^2 + x^2}\,dx = \frac{4a}{\pi}\int_0^a \left(x^2 - a^2 + \frac{a^4}{a^2 + x^2}\right)dx \]
\[ = \frac{4a}{\pi}\left[\frac{1}{3}x^3 - a^2 x + a^3\tan^{-1}\frac{x}{a}\right]_0^a = \frac{4a}{\pi}\left(\frac{a^3}{3} - a^3 + a^3\tan^{-1} 1\right) \]
\[ = \frac{4a}{\pi}\left(\frac{a^3}{3} - a^3 + a^3\frac{\pi}{4}\right) = a^4\left(1 - \frac{8}{3\pi}\right) \]
Exercise 3.4
1. If the probability density function is given by
\[ f(x) = \begin{cases} kx^2(1 - x^3) & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \]
find (i) k, (ii) \(P\left(0 < X < \frac{1}{2}\right)\), (iii) \(\bar{X}\), and (iv) \(\sigma^2\).
[Ans.: (i) 6 (ii) 15/64 (iii) 9/14 (iv) 9/245]
2. If the probability density function of a random variable is given by
f (x) = kx 0£x£2
= 2k 2£x£4
= 6 k - kx 4£x£6
—
Find (i) k, (ii) P(1 £ X £ 3), and (iii) X .
È 1 1 383 ˘
Í ans.: (i) 2 (ii) 3 (iii) 36 ˙
Î ˚
14. If the continuous random variable has the density function
\(f(x) = \frac{kx}{(1 + x)^3}\), x ≥ 0, find the value of k, median and mode.
[Ans.: 2, \(1 + \sqrt{2}\), 1/2]
15. The density function of a continuous random variable X is given by
\(f(x) = \frac{3}{4}x(2 - x)\), 0 ≤ x ≤ 2. Find the mean, median, mode, harmonic
mean, MD about mean and SD.
[Ans.: 1, 1, 1, 2/3, 3/8, \(1/\sqrt{5}\)]
If (X, Y) is a two dimensional discrete random variable with joint probability mass
function \(P(x_i, y_j) = p_{ij}\), then the mathematical expectation of a function g(x, y) is given
by
\[ E[g(X, Y)] = \sum_{j=1}^{\infty}\sum_{i=1}^{\infty} g(x_i, y_j)\, p_{ij} = \sum_x \sum_y g(x, y) f(x, y) \]
If (X, Y) is a two dimensional continuous random variable with joint probability density
function f(x, y), then the mathematical expectation of a function g(x, y) is given by
\[ E[g(X, Y)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x, y) f(x, y)\,dx\,dy \]
3.8 Expected Values of Two Dimensional Random Variables 3.69
\[ E(X / Y = y) = \frac{\int_{-\infty}^{\infty} x f(x, y)\,dx}{f_Y(y)} \]
Similarly,
\[ E(Y / X = x) = \frac{\int_{-\infty}^{\infty} y f(x, y)\,dy}{f_X(x)} \]
Example 1
Given a pair of discrete random variables X and Y whose joint probability
distribution is given by

Y \ X     2       4
1         0.1     0.15
2         0.2     0.3
3         0.1     0.15

Find the expected value of the function g(X, Y) given that g(X, Y) = 2X + Y.
Solution
\[ E[g(X, Y)] = \sum_x \sum_y g(x, y) f(x, y) = \sum_x \sum_y (2x + y) f(x, y) \]
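The double sum for E[g(X, Y)] maps directly onto a loop over the cells of the joint table. The sketch below stores the table of this example as a dictionary keyed by (x, y); the representation and the helper name `expectation` are assumptions made for illustration.

```python
# Joint pmf of (X, Y): keys are (x, y) pairs, values are probabilities.
joint = {
    (2, 1): 0.10, (4, 1): 0.15,
    (2, 2): 0.20, (4, 2): 0.30,
    (2, 3): 0.10, (4, 3): 0.15,
}

def expectation(g, joint):
    """E[g(X, Y)] = sum over all cells of g(x, y) * f(x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

value = expectation(lambda x, y: 2 * x + y, joint)
e_x = expectation(lambda x, y: x, joint)
e_y = expectation(lambda x, y: y, joint)
print(round(value, 2), round(2 * e_x + e_y, 2))
```

The two printed numbers agree, illustrating the linearity E(2X + Y) = 2E(X) + E(Y).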
Example 2
Let X and Y be two random variables each taking values –1, 0 and 1 and
having the joint probability distribution as given below:
Y \ X          –1      0       1       Total p(y)
–1             0       0.1     0.1     0.2
0              0.2     0.2     0.2     0.6
1              0       0.1     0.1     0.2
Total p(x)     0.2     0.4     0.4     1.0
(iii) \[ E(X^2) = \sum x^2 p(x) = (-1)^2(0.2) + 0(0.4) + (1)^2(0.4) = 0.6 \]
\[ Var(X) = E(X^2) - \{E(X)\}^2 = 0.6 - (0.2)^2 = 0.56 \]
\[ E(Y^2) = \sum y^2 p(y) = (-1)^2(0.2) + 0(0.6) + (1)^2(0.2) = 0.4 \]
\[ Var(Y) = E(Y^2) - \{E(Y)\}^2 = 0.4 - 0 = 0.4 \]
(iv) \[ P(X = -1 / Y = 0) = \frac{P(X = -1, Y = 0)}{P(Y = 0)} = \frac{0.2}{0.6} = \frac{1}{3} \]
\[ P(X = 0 / Y = 0) = \frac{P(X = 0, Y = 0)}{P(Y = 0)} = \frac{0.2}{0.6} = \frac{1}{3} \]
\[ P(X = 1 / Y = 0) = \frac{P(X = 1, Y = 0)}{P(Y = 0)} = \frac{0.2}{0.6} = \frac{1}{3} \]
Example 3
If the joint pdf of (X, Y) is given by
\[ f(x, y) = \begin{cases} \dfrac{16y}{x^3} & x > 2,\ 0 < y < 1 \\ 0 & \text{elsewhere} \end{cases} \]
then find E(XY).
Solution
\[ E(XY) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy f(x, y)\,dx\,dy = \int_0^1\int_2^{\infty} xy\left(\frac{16y}{x^3}\right)dx\,dy = 16\int_0^1\int_2^{\infty} \frac{y^2}{x^2}\,dx\,dy \]
\[ = 16\int_0^1 y^2\left[-\frac{1}{x}\right]_2^{\infty} dy = 16\int_0^1 \frac{1}{2}y^2\,dy = 8\left[\frac{y^3}{3}\right]_0^1 = \frac{8}{3}(1 - 0) = \frac{8}{3} \]
Example 4
The joint PDF of (X,Y) is given by
f(x, y) = 24xy , x > 0, y > 0, x + y £ 1
= 0 , elsewhere
Find the conditional mean and variance of Y, given X.
Solution
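The region of integration is ΔOAB.
In ΔOAB, along the vertical strip PQ, y varies from y = 0 to y = 1 – x and x varies from x = 0 to
x = 1.
[Figure: triangle OAB bounded by the x-axis, the y-axis and the line x + y = 1, with A(1, 0) and vertical strip PQ]
\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^{1-x} 24xy\,dy = 24x\left[\frac{y^2}{2}\right]_0^{1-x} = 12x(1 - x)^2, \quad 0 < x < 1 \]
\[ f(y/x) = \frac{f(x, y)}{f_X(x)} = \frac{2y}{(1 - x)^2}, \quad 0 < y < 1 - x \]
\[ E(Y/x) = \int_0^{1-x} y\,\frac{2y}{(1 - x)^2}\,dy = \frac{2}{(1 - x)^2}\left[\frac{y^3}{3}\right]_0^{1-x} = \frac{2}{3}(1 - x) \]
\[ E(Y^2/x) = \int_{-\infty}^{\infty} y^2 f(y/x)\,dy = \int_0^{1-x} y^2\,\frac{2y}{(1 - x)^2}\,dy = \frac{2}{(1 - x)^2}\left[\frac{y^4}{4}\right]_0^{1-x} = \frac{1}{2}(1 - x)^2 \]
\[ Var(Y/x) = E(Y^2/x) - \{E(Y/x)\}^2 = \frac{1}{2}(1 - x)^2 - \left\{\frac{2}{3}(1 - x)\right\}^2 = \frac{1}{2}(1 - x)^2 - \frac{4}{9}(1 - x)^2 = \frac{1}{18}(1 - x)^2 \]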
\[ = \frac{1}{96}\int_1^5 y^2\left[\frac{x^3}{3}\right]_0^4 dy = \frac{1}{288}\int_1^5 64y^2\,dy = \frac{2}{9}\left[\frac{y^3}{3}\right]_1^5 = \frac{2}{27}(125 - 1) = \frac{248}{27} \]
(iv) \[ E(2X + 3Y) = 2E(X) + 3E(Y) = 2\left(\frac{8}{3}\right) + 3\left(\frac{31}{9}\right) = \frac{47}{3} \]
(v) \[ E(X^2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^2 f(x, y)\,dx\,dy = \int_1^5\int_0^4 x^2\left(\frac{xy}{96}\right)dx\,dy = \frac{1}{96}\int_1^5 y\left[\frac{x^4}{4}\right]_0^4 dy \]
\[ = \frac{1}{384}\int_1^5 256y\,dy = \frac{2}{3}\left[\frac{y^2}{2}\right]_1^5 = \frac{1}{3}(25 - 1) = 8 \]
\[ Var(X) = E(X^2) - \{E(X)\}^2 = 8 - \left(\frac{8}{3}\right)^2 = \frac{8}{9} \]
(vi) \[ E(Y^2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y^2 f(x, y)\,dx\,dy = \int_1^5\int_0^4 y^2\left(\frac{xy}{96}\right)dx\,dy = \frac{1}{96}\int_1^5 y^3\left[\frac{x^2}{2}\right]_0^4 dy \]
\[ = \frac{1}{192}\int_1^5 16y^3\,dy = \frac{1}{12}\left[\frac{y^4}{4}\right]_1^5 = \frac{1}{48}(625 - 1) = 13 \]
\[ Var(Y) = E(Y^2) - \{E(Y)\}^2 = 13 - \left(\frac{31}{9}\right)^2 = \frac{92}{81} \]
(vii) \[ Cov(X, Y) = E(XY) - E(X)E(Y) = \frac{248}{27} - \left(\frac{8}{3}\right)\left(\frac{31}{9}\right) = 0 \]
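For the density f(x, y) = xy/96 on 0 < x < 4, 1 < y < 5, every joint moment factors into two one-dimensional power integrals, which makes the results above easy to confirm exactly. The following sketch relies on that factorisation; the helper names are assumptions, and the approach would not apply to a density that does not separate into a function of x times a function of y on a rectangle.

```python
from fractions import Fraction

def power_integral(n, lo, hi):
    """Exact integral of t**n from lo to hi (integer limits)."""
    return Fraction(hi ** (n + 1) - lo ** (n + 1), n + 1)

def joint_moment(a, b):
    """E[X^a Y^b] for f(x, y) = xy/96 on 0 < x < 4, 1 < y < 5."""
    # integrand x^a y^b * (xy/96) separates into x^(a+1) and y^(b+1)
    return power_integral(a + 1, 0, 4) * power_integral(b + 1, 1, 5) / 96

e_x, e_y, e_xy = joint_moment(1, 0), joint_moment(0, 1), joint_moment(1, 1)
var_x = joint_moment(2, 0) - e_x ** 2
var_y = joint_moment(0, 2) - e_y ** 2
cov = e_xy - e_x * e_y
print(e_x, e_y, e_xy, var_x, var_y, cov)  # 8/3 31/9 248/27 8/9 92/81 0
```

The zero covariance is expected here, because a joint density that factors over a rectangle corresponds to independent X and Y.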
Example 5
Two random variables X and Y have the following joint probability density
function:
\[ f(x, y) = \begin{cases} 2 - x - y, & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0, & \text{otherwise} \end{cases} \]
Find (i) Marginal probability density functions of X and Y
(ii) Conditional density functions
(iii) Var(X) and Var(Y)
(iv) Covariance between X and Y
Solution
(i) \[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^1 (2 - x - y)\,dy = \left[2y - xy - \frac{y^2}{2}\right]_0^1 = 2 - x - \frac{1}{2} = \frac{3}{2} - x \]
\[ \therefore f_X(x) = \begin{cases} \dfrac{3}{2} - x, & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases} \]
Similarly,
\[ f_Y(y) = \begin{cases} \dfrac{3}{2} - y, & 0 < y < 1 \\ 0, & \text{otherwise} \end{cases} \]
(ii) \[ f_{X/Y}(x/y) = \frac{f(x, y)}{f_Y(y)} = \frac{2 - x - y}{\frac{3}{2} - y}, \quad 0 < (x, y) < 1 \]
\[ f_{Y/X}(y/x) = \frac{f(x, y)}{f_X(x)} = \frac{2 - x - y}{\frac{3}{2} - x} \]
(iii) \[ E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^1 x\left(\frac{3}{2} - x\right)dx = \left[\frac{3x^2}{4} - \frac{x^3}{3}\right]_0^1 = \frac{3}{4} - \frac{1}{3} = \frac{5}{12} \]
\[ E(Y) = \int_{-\infty}^{\infty} y f_Y(y)\,dy = \int_0^1 y\left(\frac{3}{2} - y\right)dy = \left[\frac{3y^2}{4} - \frac{y^3}{3}\right]_0^1 = \frac{3}{4} - \frac{1}{3} = \frac{5}{12} \]
\[ E(X^2) = \int_{-\infty}^{\infty} x^2 f_X(x)\,dx = \int_0^1 x^2\left(\frac{3}{2} - x\right)dx = \left[\frac{x^3}{2} - \frac{x^4}{4}\right]_0^1 = \frac{1}{2} - \frac{1}{4} = \frac{1}{4} \]
\[ Var(X) = E(X^2) - \{E(X)\}^2 = \frac{1}{4} - \left(\frac{5}{12}\right)^2 = \frac{11}{144} \]
Similarly, \(Var(Y) = \frac{11}{144}\)
Example 6
If the joint pdf of (X, Y) is given by
f(x, y) = 24y(1 – x), 0 ≤ y ≤ x ≤ 1,
then find E(XY).
Solution
The region of integration is ΔOAB. In ΔOAB, along the horizontal strip P′Q′,
x varies from x = y to x = 1 and y varies from y = 0 to y = 1.
Fig. 3.5 [triangle OAB bounded by the x-axis, the line x = 1 and the line y = x, with A(1, 0), B(1, 1) and horizontal strip P′Q′]
\[ E(XY) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy f(x, y)\,dx\,dy = \int_0^1\int_y^1 xy \cdot 24y(1 - x)\,dx\,dy = 24\int_0^1\int_y^1 xy^2(1 - x)\,dx\,dy \]
\[ = 24\int_0^1 y^2\left[\frac{x^2}{2} - \frac{x^3}{3}\right]_y^1 dy = 24\int_0^1 y^2\left(\frac{1}{2} - \frac{1}{3} - \frac{y^2}{2} + \frac{y^3}{3}\right)dy \]
\[ = 24\int_0^1 y^2\left(\frac{1}{6} - \frac{y^2}{2} + \frac{y^3}{3}\right)dy = 24\int_0^1 \left(\frac{y^2}{6} - \frac{y^4}{2} + \frac{y^5}{3}\right)dy \]
\[ = 24\left[\frac{y^3}{18} - \frac{y^5}{10} + \frac{y^6}{18}\right]_0^1 = 24\left(\frac{1}{18} - \frac{1}{10} + \frac{1}{18}\right) = \frac{4}{15} \]
Example 7
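Two random variables have joint pdf
\[ f(x, y) = \begin{cases} \dfrac{xy}{96}, & 0 < x < 4,\ 1 < y < 5 \\ 0, & \text{elsewhere} \end{cases} \]
Find (i) E(X) (ii) E(Y) (iii) E(XY) (iv) E(2X + 3Y) (v) Var(X) (vi) Var(Y)
(vii) Cov(X, Y)
Solution
(i) \[ E(X) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x f(x, y)\,dx\,dy = \int_1^5\int_0^4 x\left(\frac{xy}{96}\right)dx\,dy = \frac{1}{96}\int_1^5 y\left[\frac{x^3}{3}\right]_0^4 dy \]
\[ = \frac{1}{96}\int_1^5 y\left(\frac{64}{3}\right)dy = \frac{2}{9}\left[\frac{y^2}{2}\right]_1^5 = \frac{2}{9}\left(\frac{25}{2} - \frac{1}{2}\right) = \frac{8}{3} \]
(ii) \[ E(Y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y f(x, y)\,dx\,dy = \int_1^5\int_0^4 y\left(\frac{xy}{96}\right)dx\,dy = \frac{1}{96}\int_1^5 y^2\left[\frac{x^2}{2}\right]_0^4 dy \]
\[ = \frac{1}{96}\int_1^5 8y^2\,dy = \frac{1}{12}\left[\frac{y^3}{3}\right]_1^5 = \frac{1}{36}(125 - 1) = \frac{31}{9} \]
(iii) \[ E(XY) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy f(x, y)\,dx\,dy = \int_1^5\int_0^4 xy\left(\frac{xy}{96}\right)dx\,dy = \frac{1}{96}\int_1^5\int_0^4 x^2 y^2\,dx\,dy \]
(iv) \[ E(XY) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy f(x, y)\,dx\,dy = \int_0^1\int_0^1 xy(2 - x - y)\,dx\,dy = \int_0^1 y\left[x^2 - \frac{x^3}{3} - \frac{x^2 y}{2}\right]_0^1 dy \]
\[ = \int_0^1 y\left(1 - \frac{1}{3} - \frac{y}{2}\right)dy = \int_0^1 \left(\frac{2y}{3} - \frac{y^2}{2}\right)dy = \left[\frac{y^2}{3} - \frac{y^3}{6}\right]_0^1 = \frac{1}{3} - \frac{1}{6} = \frac{1}{6} \]
(v) \[ Cov(X, Y) = E(XY) - E(X)E(Y) = \frac{1}{6} - \left(\frac{5}{12}\right)\left(\frac{5}{12}\right) = -\frac{1}{144} \]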
Example 8
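Let
\[ f(x, y) = \begin{cases} 8xy, & 0 < x < y < 1 \\ 0, & \text{elsewhere} \end{cases} \]
Find (i) E(Y/X = x) (ii) E(XY/X = x) (iii) Var(Y/X = x).
Solution
The region of integration is ΔOAB. In ΔOAB, along the vertical strip PQ, y varies from y = x
to y = 1 and x varies from x = 0 to x = 1.
Fig. 3.6 [triangle OAB bounded by the y-axis, the line y = 1 and the line y = x, with A(1, 1) and vertical strip PQ]
\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_x^1 8xy\,dy = 8x\left[\frac{y^2}{2}\right]_x^1 = 4x(1 - x^2), \quad 0 < x < 1 \]
\[ f_{Y/X}(y/x) = \frac{f(x, y)}{f_X(x)} = \frac{2y}{1 - x^2}, \quad x < y < 1 \]
(i) \[ E(Y/X = x) = \int_x^1 y\left(\frac{2y}{1 - x^2}\right)dy = \frac{2}{3}\left(\frac{1 - x^3}{1 - x^2}\right) = \frac{2}{3}\left(\frac{1 + x + x^2}{1 + x}\right) \]
(ii) \[ E(XY/X = x) = x\,E(Y/X = x) = \frac{2}{3}\,\frac{x(1 + x + x^2)}{1 + x} \]
(iii) \[ E(Y^2/X = x) = \int_{-\infty}^{\infty} y^2 f_{Y/X}(y/x)\,dy = \int_x^1 y^2\left(\frac{2y}{1 - x^2}\right)dy = \frac{2}{1 - x^2}\left[\frac{y^4}{4}\right]_x^1 = \frac{1}{2}\left(\frac{1 - x^4}{1 - x^2}\right) = \frac{1 + x^2}{2} \]
\[ Var(Y/X = x) = E(Y^2/X = x) - \{E(Y/X = x)\}^2 = \frac{1 + x^2}{2} - \left[\frac{2}{3}\left(\frac{1 + x + x^2}{1 + x}\right)\right]^2 = \frac{1 + x^2}{2} - \frac{4}{9}\,\frac{(1 + x + x^2)^2}{(1 + x)^2} \]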
Exercise 3.5
1. If the pdf of (X, Y) is given by
f(x, y) = 2 – x – y, 0 ≤ x ≤ y ≤ 1
find E(X) and E(Y).
[Ans.: 5/12, 5/12]
2. If
\[
f(x, y) = \begin{cases} \dfrac{1}{\pi}, & 0 < x^2 + y^2 < 1 \\ 0, & x^2 + y^2 > 1 \end{cases}
\]
find the covariance of X and Y.
[Ans.: 0]
3. The joint pdf of X and Y is given by
f(x, y) = 3(x + y), 0 ≤ x ≤ 1, 0 ≤ y ≤ 1
Find E(Y/X = x) and Cov(X, Y).
[Ans.: (1 – x)(x + 2) / (3(1 + x)), –13/320]
4. Let f_XY(x, y) = e^{–(x + y)}, 0 ≤ x < ∞, 0 < y < ∞.
Find Cov(X, Y).
[Ans.: 0]
3.10 Chebyshev’s Inequality
If the probability distribution of a random variable X is known, E(X) and Var(X) can be computed. Conversely, if only E(X) and Var(X) are known, the probability distribution of X cannot be constructed, and quantities such as P{|X – E(X)| ≤ k} cannot be evaluated exactly. Several approximation techniques have been developed to yield upper and/or lower bounds for such probabilities. The most important of these techniques is Chebyshev’s inequality.
If X is a random variable with mean μ and variance σ², then for any positive number k,
\[
P\{|X - \mu| \ge k\sigma\} \le \frac{1}{k^2}
\]
or
\[
P\{|X - \mu| < k\sigma\} \ge 1 - \frac{1}{k^2}
\]
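Proof
Let X be a continuous random variable with pdf f(x).
\[
\begin{aligned}
\sigma^2 &= E[X - E(X)]^2 \\
&= E[X - \mu]^2 \qquad [\because \mu = E(X)] \\
&= \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\, dx \\
&= \int_{-\infty}^{\mu - k\sigma} (x - \mu)^2 f(x)\, dx + \int_{\mu - k\sigma}^{\mu + k\sigma} (x - \mu)^2 f(x)\, dx + \int_{\mu + k\sigma}^{\infty} (x - \mu)^2 f(x)\, dx \\
&\ge \int_{-\infty}^{\mu - k\sigma} (x - \mu)^2 f(x)\, dx + \int_{\mu + k\sigma}^{\infty} (x - \mu)^2 f(x)\, dx \qquad \ldots(1)
\end{aligned}
\]
In both integrals of Eq. (1), |x – μ| ≥ kσ, so that (x – μ)² ≥ k²σ². Hence,
\[
\begin{aligned}
\sigma^2 &\ge k^2 \sigma^2 \left[ \int_{-\infty}^{\mu - k\sigma} f(x)\, dx + \int_{\mu + k\sigma}^{\infty} f(x)\, dx \right] \\
&= k^2 \sigma^2 \left[ P(X \le \mu - k\sigma) + P(X \ge \mu + k\sigma) \right] \\
&= k^2 \sigma^2 \left[ P(X - \mu \le -k\sigma) + P(X - \mu \ge k\sigma) \right] \\
&= k^2 \sigma^2\, P\{|X - \mu| \ge k\sigma\}
\end{aligned}
\]
\[
\therefore\ P\{|X - \mu| \ge k\sigma\} \le \frac{1}{k^2}
\]
Since P{|X – μ| ≥ kσ} + P{|X – μ| < kσ} = 1,
\[
P\{|X - \mu| < k\sigma\} = 1 - P\{|X - \mu| \ge k\sigma\} \ge 1 - \frac{1}{k^2}
\]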
Note
1. If kσ = c > 0,
\[
P\{|X - \mu| \ge c\} \le \frac{\sigma^2}{c^2} \quad \text{and} \quad P\{|X - \mu| < c\} \ge 1 - \frac{\sigma^2}{c^2}
\]
2. To find the lower bound of probabilities, the following form of Chebyshev’s inequality is used:
\[
P\{|X - \mu| < k\sigma\} \ge 1 - \frac{1}{k^2} \quad \text{or} \quad P\{|X - \mu| < c\} \ge 1 - \frac{\sigma^2}{c^2}
\]
3. To find the upper bound of probabilities, the following form of Chebyshev’s inequality is used:
\[
P\{|X - \mu| \ge k\sigma\} \le \frac{1}{k^2} \quad \text{or} \quad P\{|X - \mu| \ge c\} \le \frac{\sigma^2}{c^2}
\]
Example 1
A random variable X has a mean m = 12 and a variance s2 = 9 and
unknown probability distribution. Find P(6 < X < 18).
Solution
μ = 12, σ² = 9
σ = 3
By Chebyshev’s inequality,
\[
\begin{aligned}
P\{|X - \mu| < k\sigma\} &\ge 1 - \frac{1}{k^2} \\
P\{-k\sigma < X - \mu < k\sigma\} &\ge 1 - \frac{1}{k^2} \\
P\{\mu - k\sigma < X < \mu + k\sigma\} &\ge 1 - \frac{1}{k^2} \\
P\{12 - 3k < X < 12 + 3k\} &\ge 1 - \frac{1}{k^2}
\end{aligned}
\]
Comparing with P(6 < X < 18),
12 – 3k = 6
12 + 3k = 18
∴ k = 2
\[
P\{6 < X < 18\} \ge 1 - \frac{1}{4} = \frac{3}{4}
\]
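The same bound can be computed mechanically for any interval centred on the mean. The sketch below is illustrative only; the function name `chebyshev_lower_bound` is an assumption, and it uses the form P{|X − μ| < c} ≥ 1 − σ²/c² with c equal to half the interval width.

```python
def chebyshev_lower_bound(mean, variance, a, b):
    """Lower bound for P(a < X < b) when (a, b) is symmetric about the mean.

    Uses P{|X - mean| < c} >= 1 - variance / c**2 with c = (b - a) / 2.
    """
    if abs((a + b) / 2 - mean) > 1e-12:
        raise ValueError("interval must be centred on the mean")
    c = (b - a) / 2
    # A probability bound below zero carries no information, so clamp at 0.
    return max(0.0, 1 - variance / c**2)

# Data of the example above: mean 12, variance 9, interval (6, 18).
print(chebyshev_lower_bound(12, 9, 6, 18))
```

For an interval that is not symmetric about the mean, this form of the inequality does not apply directly, which is why the sketch raises an error in that case.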
3.10 Chebyshev’s Inequality 3.87
Example 2
A random variable X has a mean 10 and a variance 4 and
unknown probability distribution. Find the value of c such that
P{|X – 10| ≥ c} ≤ 0.04.
Solution
μ = 10, σ² = 4
σ = 2
By Chebyshev’s inequality,
\[
P\{|X - \mu| \ge k\sigma\} \le \frac{1}{k^2}
\]
Comparing with P{|X – 10| ≥ c} ≤ 0.04,
\[
\frac{1}{k^2} = 0.04, \qquad k = 5
\]
and kσ = c
c = 5(2) = 10
Example 3
A random variable X has pdf f(x) = e^{–x}, x ≥ 0. Use Chebyshev’s inequality to show that P{|X – 1| > 2} ≤ 1/4 and also show that the actual probability is given by e^{–3}.
Solution
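f(x) = e^{–x}
The random variable X follows an exponential distribution with parameter λ = 1.
\[
E(X) = \mu = \frac{1}{\lambda} = 1, \qquad \operatorname{Var}(X) = \sigma^2 = \frac{1}{\lambda^2} = 1
\]
By Chebyshev’s inequality,
\[
P\{|X - \mu| > k\sigma\} \le \frac{1}{k^2}
\]
Comparing with P{|X – 1| > 2},
kσ = 2
k(1) = 2
k = 2
\[
\therefore\ P\{|X - 1| > 2\} \le \frac{1}{4}
\]
The actual probability is given by
\[
\begin{aligned}
P\{|X - 1| > 2\} &= 1 - P\{|X - 1| \le 2\} \\
&= 1 - P\{-1 \le X \le 3\} \\
&= 1 - P\{0 \le X \le 3\} \qquad [\because x \ge 0] \\
&= 1 - \int_0^3 e^{-x}\, dx \\
&= 1 - \left[ -e^{-x} \right]_0^3 \\
&= 1 - (1 - e^{-3}) \\
&= e^{-3}
\end{aligned}
\]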
Example 4
A random variable X is exponentially distributed with parameter 1. Use Chebyshev’s inequality to show that P{–1 ≤ X ≤ 3} ≥ 3/4. Find the actual probability also.
Solution
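For an exponential distribution with parameter λ = 1,
\[
E(X) = \mu = \frac{1}{\lambda} = 1, \qquad \operatorname{Var}(X) = \sigma^2 = \frac{1}{\lambda^2} = 1, \qquad \sigma = 1
\]
By Chebyshev’s inequality,
\[
\begin{aligned}
P\{|X - \mu| < k\sigma\} &\ge 1 - \frac{1}{k^2} \\
P\{-k\sigma < X - \mu < k\sigma\} &\ge 1 - \frac{1}{k^2} \\
P\{\mu - k\sigma < X < \mu + k\sigma\} &\ge 1 - \frac{1}{k^2} \\
P\{1 - k < X < 1 + k\} &\ge 1 - \frac{1}{k^2}
\end{aligned}
\]
Comparing with P{–1 ≤ X ≤ 3} ≥ 3/4,
1 – k = –1
k = 2
\[
\therefore\ P\{-1 \le X \le 3\} \ge 1 - \frac{1}{4} = \frac{3}{4}
\]
The actual probability is given by
\[
\begin{aligned}
P\{-1 \le X \le 3\} &= P\{0 \le X \le 3\} \qquad [\because x \ge 0 \text{ for the exponential distribution}] \\
&= \int_0^3 f(x)\, dx \\
&= \int_0^3 e^{-x}\, dx \\
&= \left[ -e^{-x} \right]_0^3 \\
&= -e^{-3} + e^{0} \\
&= 1 - e^{-3} \\
&= 0.9502
\end{aligned}
\]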
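The gap between the Chebyshev bound and the exact value can be seen by evaluating both side by side. This is a small sketch under the assumptions of the example (exponential distribution with parameter 1); the helper name `exponential_interval_probability` is illustrative.

```python
import math

def exponential_interval_probability(lam, a, b):
    """Exact P(a <= X <= b) for X ~ Exponential(lam).

    X takes only non-negative values, so negative end points are clipped to 0.
    """
    lo, hi = max(a, 0.0), max(b, 0.0)
    return math.exp(-lam * lo) - math.exp(-lam * hi)

k = 2                       # the interval (-1, 3) is mean ± k·sigma with mean = sigma = 1
bound = 1 - 1 / k**2        # Chebyshev lower bound
exact = exponential_interval_probability(1.0, -1.0, 3.0)
print(bound, round(exact, 4))
```

Chebyshev’s inequality uses only the mean and variance, so the bound is valid for every distribution with those moments and is usually far from tight for any particular one.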
Example 5
A fair dice is tossed 120 times. Use Chebyshev’s inequality to find a
lower bound for the probability of getting 80 to 120 sixes.
Solution
Let X be the random variable which denotes the number of sixes obtained when a fair dice is tossed 720 times.
n = 720
Probability of getting a six in a single toss:
\[
p = \frac{1}{6}, \qquad q = 1 - p = 1 - \frac{1}{6} = \frac{5}{6}
\]
Example 6
Two dice are thrown once. If X is the sum of the numbers showing up, prove that P{|X – 7| ≥ 3} ≤ 35/54. Compare this value with the exact probability.
Solution
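Let X₁ and X₂ be the random variables which denote the outcomes of the first and second dice.
\[
E(X_1) = E(X_2) = \frac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = \frac{7}{2}
\]
\[
E(X) = E(X_1) + E(X_2) = \mu = \frac{7}{2} + \frac{7}{2} = 7
\]
\[
E(X_1^2) = E(X_2^2) = \frac{1}{6}(1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2) = \frac{91}{6}
\]
\[
\operatorname{Var}(X_1) = \operatorname{Var}(X_2) = \frac{91}{6} - \left( \frac{7}{2} \right)^2 = \frac{35}{12}
\]
\[
\operatorname{Var}(X) = \operatorname{Var}(X_1 + X_2) = (1)^2 \operatorname{Var}(X_1) + (1)^2 \operatorname{Var}(X_2)
\]
\[
\sigma^2 = \frac{35}{12} + \frac{35}{12} = \frac{35}{6}, \qquad \sigma = \sqrt{\frac{35}{6}}
\]
By Chebyshev’s inequality,
\[
P\{|X - \mu| \ge k\sigma\} \le \frac{1}{k^2}
\]
Comparing with P{|X – 7| ≥ 3},
μ = 7
kσ = 3
\[
k \sqrt{\frac{35}{6}} = 3, \qquad k = 3 \sqrt{\frac{6}{35}}
\]
\[
\therefore\ P\{|X - 7| \ge 3\} \le \frac{1}{\left( 3 \sqrt{\frac{6}{35}} \right)^2} = \frac{35}{54}
\]
The actual probability is given by
\[
\begin{aligned}
P\{|X - 7| \ge 3\} &= P\{X = 2, 3, 4, 10, 11, 12\} \\
&= \frac{1}{36} + \frac{2}{36} + \frac{3}{36} + \frac{3}{36} + \frac{2}{36} + \frac{1}{36} \\
&= \frac{1}{3}
\end{aligned}
\]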
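Because two dice have only 36 equally likely outcomes, the mean, the variance, the Chebyshev bound and the exact probability can all be obtained by direct enumeration. The following sketch is illustrative (variable names are assumptions) and uses exact fractions to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely sums of two fair dice.
outcomes = [a + b for a, b in product(range(1, 7), repeat=2)]
n = len(outcomes)

mean = Fraction(sum(outcomes), n)
variance = Fraction(sum(s * s for s in outcomes), n) - mean**2

c = 3
chebyshev_bound = variance / c**2                                    # sigma^2 / c^2
exact = Fraction(sum(1 for s in outcomes if abs(s - mean) >= c), n)  # P{|X - 7| >= 3}
print(mean, variance, chebyshev_bound, exact)
```

Only the sums 2, 3, 4, 10, 11 and 12 satisfy |X − 7| ≥ 3, so the enumeration counts 12 of the 36 outcomes.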
Example 7
Use Chebyshev’s inequality to find how many times a fair coin must be tossed in order that the probability that the ratio of the number of heads to the number of tosses will be between 0.45 and 0.55 is at least 0.95.
Solution
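Let X be the random variable which denotes the number of heads obtained when a fair coin is tossed n times.
\[
p = q = \frac{1}{2}
\]
X follows a binomial distribution with mean np and Var(X) = npq.
Mean of the required ratio X/n:
\[
E\left( \frac{X}{n} \right) = \frac{1}{n} E(X) = \frac{1}{n}\, np = p = \frac{1}{2} \qquad \therefore\ \mu = \frac{1}{2}
\]
\[
\operatorname{Var}\left( \frac{X}{n} \right) = \left( \frac{1}{n} \right)^2 \operatorname{Var}(X) = \frac{1}{n^2}\, npq = \frac{pq}{n}
\]
\[
\sigma = \sqrt{\frac{pq}{n}} = \sqrt{\frac{\frac{1}{2} \cdot \frac{1}{2}}{n}} = \frac{1}{2\sqrt{n}}
\]
By Chebyshev’s inequality,
\[
\begin{aligned}
P\left\{ \left| \frac{X}{n} - \mu \right| < k\sigma \right\} &\ge 1 - \frac{1}{k^2} \\
P\left\{ -k\sigma < \frac{X}{n} - \mu < k\sigma \right\} &\ge 1 - \frac{1}{k^2} \\
P\left\{ \mu - k\sigma < \frac{X}{n} < \mu + k\sigma \right\} &\ge 1 - \frac{1}{k^2}
\end{aligned}
\]
But P{0.45 < X/n < 0.55} ≥ 0.95. Hence,
\[
1 - \frac{1}{k^2} = 0.95, \qquad \frac{1}{k^2} = 0.05, \qquad k^2 = 20, \qquad k = \sqrt{20}
\]
μ – kσ = 0.45
\[
0.5 - \sqrt{20} \left( \frac{1}{2\sqrt{n}} \right) = 0.45
\]
n = 2000
Hence, the fair coin must be tossed 2000 times.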
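The same calculation can be written once for any success probability p, tolerance ε and confidence level. The sketch below is an illustration, assuming the bound P{|X/n − p| < ε} ≥ 1 − pq/(nε²), which is Chebyshev’s inequality applied to X/n; the function name `tosses_needed` is hypothetical.

```python
from fractions import Fraction
from math import ceil

def tosses_needed(p, eps, confidence):
    """Smallest n for which the Chebyshev bound 1 - p*q / (n * eps**2)
    is at least `confidence`."""
    q = 1 - p
    return ceil(p * q / ((1 - confidence) * eps**2))

# Fair coin, ratio within 0.05 of 1/2, probability at least 0.95.
n = tosses_needed(Fraction(1, 2), Fraction(5, 100), Fraction(95, 100))
print(n)
```

Exact fractions are used so that the ceiling is not affected by floating-point rounding.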
Example 8
If X is the number on a dice when it is thrown, prove that P{|X – μ| ≥ 2.5} ≤ 0.47, where μ is the mean.
Solution
Let X be the random variable which denotes the number on a dice. The probability function is

X           1     2     3     4     5     6
P(X = x)   1/6   1/6   1/6   1/6   1/6   1/6

\[
\begin{aligned}
E(X) = \mu &= \sum x\, p(x) \\
&= 1\left(\tfrac{1}{6}\right) + 2\left(\tfrac{1}{6}\right) + 3\left(\tfrac{1}{6}\right) + 4\left(\tfrac{1}{6}\right) + 5\left(\tfrac{1}{6}\right) + 6\left(\tfrac{1}{6}\right) \\
&= \frac{7}{2}
\end{aligned}
\]
\[
\begin{aligned}
\operatorname{Var}(X) = \sigma^2 &= \sum x^2 p(x) - \mu^2 \\
&= 1\left(\tfrac{1}{6}\right) + 4\left(\tfrac{1}{6}\right) + 9\left(\tfrac{1}{6}\right) + 16\left(\tfrac{1}{6}\right) + 25\left(\tfrac{1}{6}\right) + 36\left(\tfrac{1}{6}\right) - \left(\tfrac{7}{2}\right)^2 \\
&= 2.9167
\end{aligned}
\]
σ = 1.707
By Chebyshev’s inequality,
\[
P\{|X - \mu| \ge k\sigma\} \le \frac{1}{k^2}
\]
Comparing with P{|X – μ| ≥ 2.5},
kσ = 2.5
k(1.707) = 2.5
k = 1.46
\[
\therefore\ P\{|X - \mu| \ge 2.5\} \le \frac{1}{(1.46)^2}
\]
\[
P\{|X - \mu| \ge 2.5\} \le 0.47
\]
Example 9
The number of planes landing at an airport in a 30-minute interval obeys the Poisson law with mean 25. Use Chebyshev’s inequality to find the least chance that the number of planes landing within a given 30-minute interval will be between 15 and 35.
Solution
Let X be the random variable which denotes the number of planes landing at the airport.
For a Poisson distribution,
\[
E(X) = \mu = 25, \qquad \operatorname{Var}(X) = \sigma^2 = \mu = 25, \qquad \sigma = 5
\]
By Chebyshev’s inequality,
\[
\begin{aligned}
P\{|X - \mu| < k\sigma\} &\ge 1 - \frac{1}{k^2} \\
P\{-k\sigma < X - \mu < k\sigma\} &\ge 1 - \frac{1}{k^2} \\
P\{\mu - k\sigma < X < \mu + k\sigma\} &\ge 1 - \frac{1}{k^2} \\
P\{25 - 5k < X < 25 + 5k\} &\ge 1 - \frac{1}{k^2}
\end{aligned}
\]
Comparing with P{15 < X < 35},
25 – 5k = 15 and 25 + 5k = 35
k = 2
\[
\therefore\ P\{15 < X < 35\} \ge 1 - \frac{1}{(2)^2} = \frac{3}{4}
\]
Exercise 3.6
1. A discrete random variable takes the values –1, 0, 1 with probabilities 1/8, 3/4, 1/8 respectively. Find P{|X – μ| ≥ 2σ}.
[Ans.: 1/4]
2. Use Chebyshev’s inequality to prove that P{X = μ} = 1 if Var(X) = 0.
3. If X is a random variable with E(X) = 3 and E(X²) = 13, find the lower bound for P(–2 < X < 8) using Chebyshev’s inequality.
[Ans.: 21/25]
4. Can we find a random variable for which P{μ – 2σ < X < μ + 2σ} = 0.6?
[Ans.: No]
5. If X denotes the sum of the numbers obtained when 2 dice are thrown, obtain an upper bound for P{|X – 7| ≥ 4}. Compare with the actual probability.
[Ans.: 35/96, 1/6]
6. A fair dice is tossed 720 times. Use Chebyshev’s inequality to find a lower bound for the probability of getting 100 to 140 sixes.
[Ans.: 3/4]
7. A pair of dice is rolled 900 times and X denotes the number of times a total of 9 occurs. Find P(80 ≤ X ≤ 120) using Chebyshev’s inequality.
[Ans.: 2/9]
8. A discrete random variable X can assume the values x = 1, 2, 3, … with probability 2^{–x}. Show that P{|X – 2| ≥ 2} ≤ 1/2, while the actual probability is 1/8.
9. A random variable X has the pmf P(X = 1) = 1/18, P(X = 2) = 16/18, P(X = 3) = 1/18. Show that there is a value of c such that P{|X – μ| ≥ c} = σ²/c²,
Chapter Outline
4.1 Introduction
4.2 Correlation
4.3 Types of Correlations
4.4 Methods of Studying Correlation
4.5 Scatter Diagram
4.6 Simple Graph
4.7 Karl Pearson’s Coefficient of Correlation
4.8 Properties of Coefficient of Correlation
4.9 Rank Correlation
4.10 Regression
4.11 Types of Regression
4.12 Methods of Studying Regression
4.13 Lines of Regression
4.14 Regression Coefficients
4.15 Properties of Regression Coefficients
4.16 Properties of Lines of Regression (Linear Regression)
4.1 Introduction
Correlation and regression are the most commonly used techniques for investigating the
relationship between two quantitative variables. Correlation refers to the relationship
of two or more variables. It measures the closeness of the relationship between the
variables. Regression establishes a functional relationship between the variables. In
correlation, both the variables x and y are random variables, whereas in regression, x is
a random variable and y is a fixed variable. The coefficient of correlation is a relative
measure whereas the regression coefficient is an absolute figure.
4.2 Chapter 4 Correlation and Regression
4.2 Correlation
Correlation is the relationship that exists between two or more variables. Two variables are said to be correlated if a change in one variable is accompanied by a change in the other variable. Data connecting two variables are called bivariate data. Thus, correlation is a statistical analysis which measures and analyses the degree or extent to which two variables fluctuate with reference to each other. Some examples of such a relationship are as follows:
1. Relationship between heights and weights.
2. Relationship between price and demand of commodity.
3. Relationship between rainfall and yield of crops.
4. Relationship between age of husband and age of wife.
2. Multiple Correlation When more than two variables are studied, the relationship
is described as multiple correlation, e.g., relationship of price, demand, and supply of
a commodity.
Milk (l)    5   10   15   20   25   30
Curd (kg)   2    4    6    8   10   12
There are two different methods of studying correlation, (1) Graphic methods
(2) Mathematical methods.
Graphic methods are (a) scatter diagram, and (b) simple graph.
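The coefficient of correlation is the measure of correlation between two random variables X and Y, and is denoted by r.
\[
r = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y}
\]
where cov(X, Y) is the covariance of variables X and Y, σ_X is the standard deviation of variable X, and σ_Y is the standard deviation of variable Y.
This expression is known as Karl Pearson’s coefficient of correlation or Karl Pearson’s product-moment coefficient of correlation.
\[
\operatorname{cov}(X, Y) = \frac{1}{n} \sum (x - \bar{x})(y - \bar{y}), \qquad
\sigma_X = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}, \qquad
\sigma_Y = \sqrt{\frac{\sum (y - \bar{y})^2}{n}}
\]
\[
\therefore\ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\, \sqrt{\sum (y - \bar{y})^2}}
\]
The above expression can be further modified.
\[
\begin{aligned}
r &= \frac{\sum (xy - \bar{x} y - x \bar{y} + \bar{x}\bar{y})}{\sqrt{\sum (x^2 - 2x\bar{x} + \bar{x}^2)}\, \sqrt{\sum (y^2 - 2y\bar{y} + \bar{y}^2)}} \\
&= \frac{\sum xy - \bar{y} \sum x - \bar{x} \sum y + \bar{x}\bar{y} \sum 1}{\sqrt{\sum x^2 - 2\bar{x} \sum x + \bar{x}^2 \sum 1}\, \sqrt{\sum y^2 - 2\bar{y} \sum y + \bar{y}^2 \sum 1}} \\
&= \frac{\sum xy - \frac{\sum y}{n} \sum x - \frac{\sum x}{n} \sum y + \frac{\sum x}{n} \cdot \frac{\sum y}{n} \cdot n}{\sqrt{\sum x^2 - 2 \frac{\sum x}{n} \sum x + \left( \frac{\sum x}{n} \right)^2 n}\, \sqrt{\sum y^2 - 2 \frac{\sum y}{n} \sum y + \left( \frac{\sum y}{n} \right)^2 n}} \\
&= \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\sum x^2 - \frac{(\sum x)^2}{n}}\, \sqrt{\sum y^2 - \frac{(\sum y)^2}{n}}}
\end{aligned}
\]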
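Proof Let x̄ and ȳ be the means of the x and y series, and σ_x and σ_y be their respective standard deviations.
\[
\sum \left( \frac{x - \bar{x}}{\sigma_x} \pm \frac{y - \bar{y}}{\sigma_y} \right)^2 \ge 0 \qquad [\because \text{a sum of squares of real quantities cannot be negative}]
\]
\[
\frac{\sum (x - \bar{x})^2}{\sigma_x^2} + \frac{\sum (y - \bar{y})^2}{\sigma_y^2} \pm \frac{2 \sum (x - \bar{x})(y - \bar{y})}{\sigma_x \sigma_y} \ge 0
\]
\[
\begin{aligned}
n + n \pm 2nr &\ge 0 \\
2n \pm 2nr &\ge 0 \\
2n(1 \pm r) &\ge 0 \\
1 \pm r &\ge 0
\end{aligned}
\]
i.e., 1 + r ≥ 0 and 1 – r ≥ 0
r ≥ –1 and r ≤ 1
Hence, the coefficient of correlation lies between –1 and 1, i.e., –1 ≤ r ≤ 1.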
4.8 Properties of Coefficient of Correlation 4.7
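2. Correlation coefficient is independent of change of origin and change of scale.
Proof Let
\[
d_x = \frac{x - a}{h}, \qquad d_y = \frac{y - b}{k}
\]
\[
x = a + h d_x, \qquad y = b + k d_y
\]
\[
\begin{aligned}
r_{xy} &= \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\, \sqrt{\sum (y - \bar{y})^2}} \\
&= \frac{\sum h (d_x - \bar{d}_x)\, k (d_y - \bar{d}_y)}{\sqrt{\sum h^2 (d_x - \bar{d}_x)^2}\, \sqrt{\sum k^2 (d_y - \bar{d}_y)^2}} \\
&= \frac{\sum (d_x - \bar{d}_x)(d_y - \bar{d}_y)}{\sqrt{\sum (d_x - \bar{d}_x)^2}\, \sqrt{\sum (d_y - \bar{d}_y)^2}} \\
&= r_{d_x d_y}
\end{aligned}
\]
Hence,
\[
r = \frac{\sum d_x d_y - \frac{\sum d_x \sum d_y}{n}}{\sqrt{\sum d_x^2 - \frac{(\sum d_x)^2}{n}}\, \sqrt{\sum d_y^2 - \frac{(\sum d_y)^2}{n}}}
\]
\[
\sum (x - \bar{x})(y - \bar{y}) = 0 \quad \text{or} \quad \operatorname{cov}(X, Y) = 0
\]
∴ r = 0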
Example 1
Calculate the correlation coefficient between x and y using the following data:

x    2    4    5    6    8   11
y   18   12   10    8    7    5

Solution
n = 6

  x     y     x²     y²     xy
  2    18      4    324     36
  4    12     16    144     48
  5    10     25    100     50
  6     8     36     64     48
  8     7     64     49     56
 11     5    121     25     55
Σx = 36   Σy = 60   Σx² = 266   Σy² = 706   Σxy = 293

\[
\begin{aligned}
r &= \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\sum x^2 - \frac{(\sum x)^2}{n}}\, \sqrt{\sum y^2 - \frac{(\sum y)^2}{n}}} \\
&= \frac{293 - \frac{(36)(60)}{6}}{\sqrt{266 - \frac{(36)^2}{6}}\, \sqrt{706 - \frac{(60)^2}{6}}} \\
&= -0.9203
\end{aligned}
\]
Note Σx, Σy, Σx², Σy², Σxy can be directly obtained with the help of a scientific calculator.
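The sums in the table and the value of r can also be reproduced with a few lines of code. This is a minimal sketch of the sum form of Karl Pearson’s formula; the function name `pearson_r` is an illustrative choice, and the data are those of the example above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r from the sums Σx, Σy, Σx², Σy² and Σxy."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    numerator = sxy - sx * sy / n
    denominator = sqrt(sxx - sx**2 / n) * sqrt(syy - sy**2 / n)
    return numerator / denominator

x = [2, 4, 5, 6, 8, 11]
y = [18, 12, 10, 8, 7, 5]
print(round(pearson_r(x, y), 4))
```

The function divides by zero if either series is constant, since the correlation coefficient is undefined in that case.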
Example 2
Calculate the coefficient of correlation from the following data:

x   12    9    8   10   11   13    7
y   14    8    6    9   11   12    3

Solution
n = 7

  x     y     x²     y²     xy
 12    14    144    196    168
  9     8     81     64     72
  8     6     64     36     48
 10     9    100     81     90
 11    11    121    121    121
 13    12    169    144    156
  7     3     49      9     21
Σx = 70   Σy = 63   Σx² = 728   Σy² = 651   Σxy = 676

\[
\begin{aligned}
r &= \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\sum x^2 - \frac{(\sum x)^2}{n}}\, \sqrt{\sum y^2 - \frac{(\sum y)^2}{n}}} \\
&= \frac{676 - \frac{(70)(63)}{7}}{\sqrt{728 - \frac{(70)^2}{7}}\, \sqrt{651 - \frac{(63)^2}{7}}} \\
&= 0.949
\end{aligned}
\]
Example 3
Calculate the coefficient of correlation for the following data:

x    9    8    7    6    5    4    3    2    1
y   15   16   14   13   11   12   10    8    9

Solution
n = 9

  x     y     x²     y²     xy
  9    15     81    225    135
  8    16     64    256    128
  7    14     49    196     98
  6    13     36    169     78
  5    11     25    121     55
  4    12     16    144     48
  3    10      9    100     30
  2     8      4     64     16
  1     9      1     81      9
Σx = 45   Σy = 108   Σx² = 285   Σy² = 1356   Σxy = 597

\[
\begin{aligned}
r &= \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\sum x^2 - \frac{(\sum x)^2}{n}}\, \sqrt{\sum y^2 - \frac{(\sum y)^2}{n}}} \\
&= \frac{597 - \frac{(45)(108)}{9}}{\sqrt{285 - \frac{(45)^2}{9}}\, \sqrt{1356 - \frac{(108)^2}{9}}} \\
&= 0.95
\end{aligned}
\]
Example 4
Calculate the correlation coefficient for the following data:

x    5    9   13   17   21
y   12   20   25   33   35

Solution
n = 5
\[
\bar{x} = \frac{\sum x}{n} = \frac{65}{5} = 13, \qquad \bar{y} = \frac{\sum y}{n} = \frac{125}{5} = 25
\]

  x     y    x – x̄   y – ȳ   (x – x̄)²   (y – ȳ)²   (x – x̄)(y – ȳ)
  5    12     –8     –13       64        169         104
  9    20     –4      –5       16         25          20
 13    25      0       0        0          0           0
 17    33      4       8       16         64          32
 21    35      8      10       64        100          80
Σx = 65   Σy = 125   Σ(x – x̄) = 0   Σ(y – ȳ) = 0   Σ(x – x̄)² = 160   Σ(y – ȳ)² = 358   Σ(x – x̄)(y – ȳ) = 236

\[
r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\, \sqrt{\sum (y - \bar{y})^2}} = \frac{236}{\sqrt{160}\, \sqrt{358}} = 0.986
\]
Note Since Σx, Σy, Σx², Σy², Σxy can be directly obtained with the help of a scientific calculator, the correlation coefficient can be calculated without using the mean.
Example 5
Calculate the correlation coefficient for the following values of demand and the corresponding price of a commodity:

Demand in quintals       65   66   67   67   68   69   70   72
Price in rupees per kg   67   68   65   68   72   72   69   71

Solution
Let the demand in quintals be denoted by x and the price in rupees per kg be denoted by y.
n = 8
\[
\bar{x} = \frac{\sum x}{n} = \frac{544}{8} = 68, \qquad \bar{y} = \frac{\sum y}{n} = \frac{552}{8} = 69
\]

  x     y    x – x̄   y – ȳ   (x – x̄)²   (y – ȳ)²   (x – x̄)(y – ȳ)
 65    67     –3      –2        9          4           6
 66    68     –2      –1        4          1           2
 67    65     –1      –4        1         16           4
 67    68     –1      –1        1          1           1
 68    72      0       3        0          9           0
 69    72      1       3        1          9           3
 70    69      2       0        4          0           0
 72    71      4       2       16          4           8
Σx = 544   Σy = 552   Σ(x – x̄) = 0   Σ(y – ȳ) = 0   Σ(x – x̄)² = 36   Σ(y – ȳ)² = 44   Σ(x – x̄)(y – ȳ) = 24

\[
r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\, \sqrt{\sum (y - \bar{y})^2}} = \frac{24}{\sqrt{36}\, \sqrt{44}} = 0.603
\]
Example 6
Calculate the coefficient of correlation for the following pairs of x and y:

x   17   19   21   26   20   28   26   27
y   23   27   25   26   27   25   30   33

Solution
Let a = 23 and b = 27 be the assumed means of the x and y series respectively.
d_x = x – a = x – 23
d_y = y – b = y – 27
n = 8

  x     y    d_x   d_y   d_x²   d_y²   d_x d_y
 17    23    –6    –4     36     16      24
 19    27    –4     0     16      0       0
 21    25    –2    –2      4      4       4
 26    26     3    –1      9      1      –3
 20    27    –3     0      9      0       0
 28    25     5    –2     25      4     –10
 26    30     3     3      9      9       9
 27    33     4     6     16     36      24
Σd_x = 0   Σd_y = 0   Σd_x² = 124   Σd_y² = 70   Σd_x d_y = 48

\[
\begin{aligned}
r &= \frac{\sum d_x d_y - \frac{\sum d_x \sum d_y}{n}}{\sqrt{\sum d_x^2 - \frac{(\sum d_x)^2}{n}}\, \sqrt{\sum d_y^2 - \frac{(\sum d_y)^2}{n}}} \\
&= \frac{48 - 0}{\sqrt{124 - 0}\, \sqrt{70 - 0}} \\
&= 0.515
\end{aligned}
\]
Note Since Σx, Σy, Σx², Σy², Σxy can be directly obtained with the help of a scientific calculator, the correlation coefficient can be calculated without using an assumed mean.
Example 7
Calculate the correlation coefficient from the following data:

x   23   27   28   29   30   31   33   35   36   39
y   18   22   23   24   25   26   28   29   30   32

Solution
Let a = 30 and b = 25 be the assumed means of the x and y series respectively.
d_x = x – a = x – 30
d_y = y – b = y – 25
n = 10

  x     y    d_x   d_y   d_x²   d_y²   d_x d_y
 23    18    –7    –7     49     49      49
 27    22    –3    –3      9      9       9
 28    23    –2    –2      4      4       4
 29    24    –1    –1      1      1       1
 30    25     0     0      0      0       0
 31    26     1     1      1      1       1
 33    28     3     3      9      9       9
 35    29     5     4     25     16      20
 36    30     6     5     36     25      30
 39    32     9     7     81     49      63
Σd_x = 11   Σd_y = 7   Σd_x² = 215   Σd_y² = 163   Σd_x d_y = 186

\[
\begin{aligned}
r &= \frac{\sum d_x d_y - \frac{\sum d_x \sum d_y}{n}}{\sqrt{\sum d_x^2 - \frac{(\sum d_x)^2}{n}}\, \sqrt{\sum d_y^2 - \frac{(\sum d_y)^2}{n}}} \\
&= \frac{186 - \frac{(11)(7)}{10}}{\sqrt{215 - \frac{(11)^2}{10}}\, \sqrt{163 - \frac{(7)^2}{10}}} \\
&= 0.996
\end{aligned}
\]
Example 8
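Calculate the coefficient of correlation between the ages of cars and annual maintenance costs.

Age of cars (years)            2      4      6      7      8     10     12
Annual maintenance cost (₹)  1600   1500   1800   1900   1700   2100   2000

Solution
Let the ages of cars in years be denoted by x and the annual maintenance costs in rupees be denoted by y.
Let a = 7 and b = 1800 be the assumed means of the x and y series respectively.
Let h = 1, k = 100
\[
d_x = \frac{x - a}{h} = \frac{x - 7}{1} = x - 7, \qquad d_y = \frac{y - b}{k} = \frac{y - 1800}{100}
\]
n = 7

  x      y     d_x   d_y   d_x²   d_y²   d_x d_y
  2    1600    –5    –2     25      4      10
  4    1500    –3    –3      9      9       9
  6    1800    –1     0      1      0       0
  7    1900     0     1      0      1       0
  8    1700     1    –1      1      1      –1
 10    2100     3     3      9      9       9
 12    2000     5     2     25      4      10
Σd_x = 0   Σd_y = 0   Σd_x² = 70   Σd_y² = 28   Σd_x d_y = 37

\[
\begin{aligned}
r &= \frac{\sum d_x d_y - \frac{\sum d_x \sum d_y}{n}}{\sqrt{\sum d_x^2 - \frac{(\sum d_x)^2}{n}}\, \sqrt{\sum d_y^2 - \frac{(\sum d_y)^2}{n}}} \\
&= \frac{37 - 0}{\sqrt{70 - 0}\, \sqrt{28 - 0}} \\
&= 0.836
\end{aligned}
\]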
Example 9
Calculate Karl Pearson’s coefficient of correlation for the data given below:

x   10   14   18   22   26   30
y   18   12   24    6   30   36

Solution
Let a = 22 and b = 24 be the assumed means of the x and y series respectively.
Let h = 4, k = 6
\[
d_x = \frac{x - a}{h} = \frac{x - 22}{4}, \qquad d_y = \frac{y - b}{k} = \frac{y - 24}{6}
\]
n = 6

  x     y    d_x   d_y   d_x²   d_y²   d_x d_y
 10    18    –3    –1      9      1       3
 14    12    –2    –2      4      4       4
 18    24    –1     0      1      0       0
 22     6     0    –3      0      9       0
 26    30     1     1      1      1       1
 30    36     2     2      4      4       4
Σd_x = –3   Σd_y = –3   Σd_x² = 19   Σd_y² = 19   Σd_x d_y = 12

\[
\begin{aligned}
r &= \frac{\sum d_x d_y - \frac{\sum d_x \sum d_y}{n}}{\sqrt{\sum d_x^2 - \frac{(\sum d_x)^2}{n}}\, \sqrt{\sum d_y^2 - \frac{(\sum d_y)^2}{n}}} \\
&= \frac{12 - \frac{(-3)(-3)}{6}}{\sqrt{19 - \frac{(-3)^2}{6}}\, \sqrt{19 - \frac{(-3)^2}{6}}} \\
&= 0.6
\end{aligned}
\]
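The use of d_x and d_y in place of x and y rests on the property that r is unaffected by a change of origin and scale. The sketch below checks this numerically for the data of this example; the function name `pearson_r` and the variable names are illustrative, and the deviation form of the formula is used.

```python
from math import sqrt

def pearson_r(u, v):
    """Karl Pearson's r computed from deviations about the means."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / sqrt(suu * svv)

x = [10, 14, 18, 22, 26, 30]
y = [18, 12, 24, 6, 30, 36]

# Change of origin (a = 22, b = 24) and scale (h = 4, k = 6), as in the example.
dx = [(xi - 22) / 4 for xi in x]
dy = [(yi - 24) / 6 for yi in y]

r_raw = pearson_r(x, y)
r_coded = pearson_r(dx, dy)
print(round(r_raw, 4), round(r_coded, 4))
```

Both values agree because the factors h and k cancel between numerator and denominator, provided h and k are positive.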
Example 10
The coefficient of correlation between two variables X and Y is 0.48. The
covariance is 36. The variance of X is 16. Find the standard deviation
of Y.
Solution
r = 0.48, cov(X, Y) = 36, σ_X² = 16
∴ σ_X = 4
\[
r = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y}
\]
\[
0.48 = \frac{36}{4\, \sigma_Y}
\]
∴ σ_Y = 18.75
Example 11
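Given n = 10, σ_X = 5.4, σ_Y = 6.2, and the sum of the products of deviations from the means of x and y is 66. Find the correlation coefficient.
Solution
n = 10, σ_X = 5.4, σ_Y = 6.2
Σ(x – x̄)(y – ȳ) = 66
\[
\sigma_X = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}, \qquad 5.4 = \sqrt{\frac{\sum (x - \bar{x})^2}{10}} \qquad \therefore\ \sum (x - \bar{x})^2 = 291.6
\]
\[
\sigma_Y = \sqrt{\frac{\sum (y - \bar{y})^2}{n}}, \qquad 6.2 = \sqrt{\frac{\sum (y - \bar{y})^2}{10}} \qquad \therefore\ \sum (y - \bar{y})^2 = 384.4
\]
\[
r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\, \sqrt{\sum (y - \bar{y})^2}} = \frac{66}{\sqrt{291.6}\, \sqrt{384.4}} = 0.197
\]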
Example 12
From the following information, calculate the value of n.
Σx = 4, Σy = 4, Σx² = 44, Σy² = 44, Σxy = –40, r = –1
Solution
\[
r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\sum x^2 - \frac{(\sum x)^2}{n}}\, \sqrt{\sum y^2 - \frac{(\sum y)^2}{n}}}
\]
\[
-1 = \frac{-40 - \frac{(4)(4)}{n}}{\sqrt{44 - \frac{(4)^2}{n}}\, \sqrt{44 - \frac{(4)^2}{n}}}
\]
∴ n = 8
Example 13
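From the following data, find the number of items n.
r = 0.5, Σ(x – x̄)(y – ȳ) = 120, σ_Y = 8, Σ(x – x̄)² = 90
Solution
\[
\sigma_Y = \sqrt{\frac{\sum (y - \bar{y})^2}{n}}, \qquad 8 = \sqrt{\frac{\sum (y - \bar{y})^2}{n}}, \qquad \sum (y - \bar{y})^2 = 64n
\]
\[
r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\, \sqrt{\sum (y - \bar{y})^2}}
\]
\[
0.5 = \frac{120}{\sqrt{90}\, \sqrt{64n}}
\]
∴ n = 10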
Example 14
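Calculate the correlation coefficient between x and y from the following data:
n = 10, Σx = 140, Σy = 150, Σ(x – 10)² = 180, Σ(y – 15)² = 215, Σ(x – 10)(y – 15) = 60
Solution
Σd_x² = Σ(x – 10)² = 180
Σd_y² = Σ(y – 15)² = 215
Σd_x d_y = Σ(x – 10)(y – 15) = 60
a = 10
b = 15
n = 10
\[
\bar{x} = \frac{\sum x}{n} = \frac{140}{10} = 14, \qquad \bar{y} = \frac{\sum y}{n} = \frac{150}{10} = 15
\]
\[
\bar{x} = a + \frac{\sum d_x}{n}, \qquad 14 = 10 + \frac{\sum d_x}{10} \qquad \therefore\ \sum d_x = 40
\]
\[
\bar{y} = b + \frac{\sum d_y}{n}, \qquad 15 = 15 + \frac{\sum d_y}{10} \qquad \therefore\ \sum d_y = 0
\]
\[
\begin{aligned}
r &= \frac{\sum d_x d_y - \frac{\sum d_x \sum d_y}{n}}{\sqrt{\sum d_x^2 - \frac{(\sum d_x)^2}{n}}\, \sqrt{\sum d_y^2 - \frac{(\sum d_y)^2}{n}}} \\
&= \frac{60 - \frac{(40)(0)}{10}}{\sqrt{180 - \frac{(40)^2}{10}}\, \sqrt{215 - \frac{0}{10}}} \\
&= 0.915
\end{aligned}
\]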
Example 15
A computer operator while calculating the coefficient between two
variates x and y for 25 pairs of observations obtained the following
constants:
Similarly,
Corrected Σy = 100 – (14 + 6) + (12 + 8) = 100
Corrected Σx² = 650 – (6² + 8²) + (8² + 6²) = 650
Corrected Σy² = 460 – (14² + 6²) + (12² + 8²) = 436
Corrected Σxy = 508 – (84 + 48) + (96 + 48) = 520
\[
\begin{aligned}
r &= \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\sum x^2 - \frac{(\sum x)^2}{n}}\, \sqrt{\sum y^2 - \frac{(\sum y)^2}{n}}} \\
&= \frac{520 - \frac{(125)(100)}{25}}{\sqrt{650 - \frac{(125)^2}{25}}\, \sqrt{436 - \frac{(100)^2}{25}}} \\
&= 0.67
\end{aligned}
\]
Exercise 4.1
x 62 64 65 69 70 71 72 74
y 126 125 139 145 165 152 180 208
[Ans.: 0.9032]
8. The following data gave the growth of employment in lacs in the organized sector in India between 1988 and 1995:
Σx = 127, Σy = 100, Σx² = 760, Σy² = 449, Σxy = 500
Later on, it was found that two pairs of values were taken as (8, 14) and (8, 6) instead of the correct values (8, 12) and (6, 8). Find the corrected coefficient between x and y.
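[Ans.: –0.31]
\[
\begin{aligned}
\sum (x - \bar{x})^2 &= \sum (x^2 - 2x\bar{x} + \bar{x}^2) \\
&= \sum x^2 - 2\bar{x} \sum x + \bar{x}^2 \sum 1 \\
&= \sum x^2 - 2n\bar{x}^2 + n\bar{x}^2 \qquad \left[ \because \sum x = n\bar{x} \text{ and } \sum 1 = n \right] \\
&= \sum x^2 - n\bar{x}^2 \\
&= (1^2 + 2^2 + \cdots + n^2) - n \left( \frac{n + 1}{2} \right)^2 \\
&= \frac{n(n + 1)(2n + 1)}{6} - \frac{n(n + 1)^2}{4} = \frac{1}{12}(n^3 - n)
\end{aligned}
\]
Similarly, Σ(y – ȳ)² = (n³ – n)/12. Since x̄ = ȳ, the difference of ranks is d = x – y = (x – x̄) – (y – ȳ).
\[
\begin{aligned}
\sum d^2 &= \sum \left[ (x - \bar{x}) - (y - \bar{y}) \right]^2 \\
&= \sum (x - \bar{x})^2 + \sum (y - \bar{y})^2 - 2 \sum (x - \bar{x})(y - \bar{y})
\end{aligned}
\]
\[
\begin{aligned}
\sum (x - \bar{x})(y - \bar{y}) &= \frac{1}{2} \left[ \sum (x - \bar{x})^2 + \sum (y - \bar{y})^2 - \sum d^2 \right] \\
&= \frac{1}{12}(n^3 - n) - \frac{1}{2} \sum d^2
\end{aligned}
\]
Hence, the coefficient of correlation between these variables is
\[
\begin{aligned}
r &= \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\, \sqrt{\sum (y - \bar{y})^2}} \\
&= \frac{\frac{1}{12}(n^3 - n) - \frac{1}{2} \sum d^2}{\frac{1}{12}(n^3 - n)} \\
&= 1 - \frac{6 \sum d^2}{n^3 - n} \\
&= 1 - \frac{6 \sum d^2}{n(n^2 - 1)}
\end{aligned}
\]
This is called Spearman’s rank correlation coefficient and is denoted by r.
Note Σd = Σ(x – y) = Σx – Σy = n(x̄ – ȳ) = 0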
Example 1
Ten participants in a contest are ranked by two judges as follows:
x 1 3 7 5 4 6 2 10 9 8
y 3 1 4 5 6 9 7 8 10 2
Solution
n = 10
\[
r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6(96)}{10 \left[ (10)^2 - 1 \right]} = 0.418
\]
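The value Σd² = 96 used above follows directly from the two rows of ranks. The sketch below, with the illustrative function name `spearman_rho`, computes Σd² and the rank correlation coefficient for untied ranks.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman's rank correlation 1 - 6Σd² / (n(n² - 1)) for untied ranks."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))

# Ranks awarded by the two judges in the example above.
x = [1, 3, 7, 5, 4, 6, 2, 10, 9, 8]
y = [3, 1, 4, 5, 6, 9, 7, 8, 10, 2]
print(round(spearman_rho(x, y), 3))
```

The inputs must already be ranks; raw scores have to be converted to ranks first, and tied ranks need the correction term discussed later in this section.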
Example 2
Ten competitors in a musical test were ranked by the three judges A, B,
and C in the following order:
Rank by A 1 6 5 10 3 2 4 9 7 8
Rank by B 3 5 8 4 7 10 2 1 6 9
Rank by C 6 4 9 8 1 2 3 10 5 7
Using the rank correlation method, find which pair of judges has the
nearest approach to common liking in music. [Summer 2015]
Solution
n = 10
\[
r(x, y) = 1 - \frac{6 \sum d_1^2}{n(n^2 - 1)} = 1 - \frac{6(200)}{10 \left[ (10)^2 - 1 \right]} = -0.21
\]
\[
r(y, z) = 1 - \frac{6 \sum d_2^2}{n(n^2 - 1)} = 1 - \frac{6(214)}{10 \left[ (10)^2 - 1 \right]} = -0.297
\]
\[
r(z, x) = 1 - \frac{6 \sum d_3^2}{n(n^2 - 1)} = 1 - \frac{6(60)}{10 \left[ (10)^2 - 1 \right]} = 0.64
\]
Since r(z, x) is maximum, the pair of judges A and C has the nearest common approach.
Example 3
Ten students got the following percentage of marks in mathematics and
physics:
Mathematics (x) 8 36 98 25 75 82 92 62 65 35
Physics (y) 84 51 91 60 68 62 86 58 35 49
\[
r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6(90)}{10 \left[ (10)^2 - 1 \right]} = 0.455
\]
Example 4
The coefficient of rank correlation of the marks obtained by 10 students in physics and chemistry was found to be 0.5. It was later discovered that the difference in ranks in the two subjects obtained by one of the students was wrongly taken as 3 instead of 7. Find the correct coefficient of rank correlation.
Solution
n = 10
\[
r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}
\]
\[
0.5 = 1 - \frac{6 \sum d^2}{10(100 - 1)} \qquad \therefore\ \sum d^2 = 82.5
\]
\[
\text{Corrected } \sum d^2 = 82.5 - 3^2 + 7^2 = 122.5
\]
\[
\text{Corrected } r = 1 - \frac{6(122.5)}{10(100 - 1)} = 0.2576
\]
Example 1
Obtain the rank correlation coefficient from the following data:
x 10 12 18 18 15 40
y 12 18 25 25 50 25
Solution
Here, n = 6
There are two items in the x series having equal values at the rank 4. Each is given the rank 4.5. Similarly, there are three items in the y series at the rank 3. Each of them is given the rank 4.
m₁ = 2, m₂ = 3
\[
\begin{aligned}
r &= 1 - \frac{6 \left[ \sum d^2 + \frac{1}{12}(m_1^3 - m_1) + \frac{1}{12}(m_2^3 - m_2) \right]}{n(n^2 - 1)} \\
&= 1 - \frac{6 \left[ 13.50 + \frac{1}{12}(8 - 2) + \frac{1}{12}(27 - 3) \right]}{6 \left[ (6)^2 - 1 \right]} \\
&= 0.5429
\end{aligned}
\]
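The tied-rank procedure (average ranks plus the correction term (m³ − m)/12 for each group of m tied items) can be sketched as follows. The function names `average_ranks` and `spearman_tied` are illustrative, and ranks are assigned in ascending order of value, as in the solution above.

```python
from collections import Counter

def average_ranks(values):
    """Rank values in ascending order; tied values share the mean of their positions."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def spearman_tied(x, y):
    """Rank correlation with the (m³ - m)/12 correction for every tied group."""
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    correction = sum(
        (m**3 - m) / 12
        for series in (x, y)
        for m in Counter(series).values()
        if m > 1
    )
    return 1 - 6 * (sum_d2 + correction) / (n * (n**2 - 1))

x = [10, 12, 18, 18, 15, 40]
y = [12, 18, 25, 25, 50, 25]
print(average_ranks(x), average_ranks(y), round(spearman_tied(x, y), 4))
```

Ranking both series in descending order instead would leave Σd² and the result unchanged, since every rank r is simply replaced by n + 1 − r.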
Exercise 4.2
x 36 56 20 42 33 44 50 15 60
y 50 35 70 58 75 60 45 80 38
[Ans.: 0.92]
4. Ten competitors in a voice test are ranked by three judges in the
following order:
Rank by First Judge 6 10 2 9 8 1 5 3 4 7
Rank by Second Judge 5 4 10 1 9 3 8 7 2 6
Rank by Third Judge 4 8 2 10 7 6 9 1 3 6
Use the method of rank correlation to gauge which pairs of judges has
the nearest approach to common liking in voice.
[Ans.: The first and third judge]
5. The following table gives the scores obtained by 11 students in English
and Tamil translation. Find the rank correlation coefficient.
Scores in English 40 46 54 60 70 80 82 85 85 90 95
Scores in Tamil 45 45 50 43 40 75 55 72 65 42 70
[Ans.: 0.36]
6. Calculate Spearman’s coefficient of rank correlation for the following
data:
x 53 98 95 81 75 71 59 55
y 47 25 32 37 30 40 39 45
[Ans.: —0.905]
7. Following are the scores of ten students in a class and their IQ:
Score 35 40 25 55 85 90 65 55 45 50
IQ 100 100 110 140 150 130 100 120 140 110
Calculate the rank correlation coefficient between the score IQ.
[Ans.: 0.47]
4.10 Regression
Regression is defined as a method of estimating the value of one variable when that
of the other is known and the variables are correlated. Regression analysis is used to
predict or estimate one variable in terms of the other variable. It is a highly valuable tool
for prediction purpose in economics and business. It is useful in statistical estimation
of demand curves, supply curves, production function, cost function, consumption
function, etc.
2. Multiple Regression The regression analysis for studying more than two
variables at a time is known as multiple regression.
2. Nonlinear Regression If the regression curve is not a straight line i.e., not a
first-degree equation in the variables x and y, the regression is said to be nonlinear
or curvilinear. In this case, the regression equation will have a functional relation
between the variables x and y involving terms in x and y of the degree higher than one,
i.e., involving terms of the type x2, y2, x3, y3, xy, etc.
4.13 Lines of Regression
If the variables, which are highly correlated, are plotted on a graph, the points lie in a narrow strip. If all the points in the scatter diagram cluster around a straight line, the line is called the line of regression. The line of regression is the line of best fit and is obtained by the principle of least squares.
Line of Regression of y on x
It is the line which gives the best estimate for the values of y for any given values of x. The regression equation of y on x is given by
\[
y - \bar{y} = r \frac{\sigma_y}{\sigma_x} (x - \bar{x})
\]
It is also written as
y = a + bx
Line of Regression of x on y
It is the line which gives the best estimate for the values of x for any given values of y. The regression equation of x on y is given by
\[
x - \bar{x} = r \frac{\sigma_x}{\sigma_y} (y - \bar{y})
\]
It is also written as
x = a + by
where x̄ and ȳ are the means of the x series and y series respectively, σ_x and σ_y are the standard deviations of the x series and y series respectively, and r is the correlation coefficient between x and y.
The slope b of the line of regression of y on x is also called the coefficient of regression of y on x. It represents the increment in the value of y corresponding to a unit change in the value of x.
\[
b_{yx} = \text{Regression coefficient of } y \text{ on } x = r \frac{\sigma_y}{\sigma_x}
\]
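(i) We know that
\[
r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\, \sqrt{\sum (y - \bar{y})^2}}, \qquad
\sigma_x = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}, \qquad
\sigma_y = \sqrt{\frac{\sum (y - \bar{y})^2}{n}}
\]
\[
b_{yx} = r \frac{\sigma_y}{\sigma_x} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}
\]
and
\[
b_{xy} = r \frac{\sigma_x}{\sigma_y} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2}
\]
(ii) We know that
\[
r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\sum x^2 - \frac{(\sum x)^2}{n}}\, \sqrt{\sum y^2 - \frac{(\sum y)^2}{n}}}
\]
\[
\sigma_x = \sqrt{\frac{1}{n} \left[ \sum x^2 - \frac{(\sum x)^2}{n} \right]}, \qquad
\sigma_y = \sqrt{\frac{1}{n} \left[ \sum y^2 - \frac{(\sum y)^2}{n} \right]}
\]
\[
b_{yx} = r \frac{\sigma_y}{\sigma_x} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}
\]
and
\[
b_{xy} = r \frac{\sigma_x}{\sigma_y} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum y^2 - \frac{(\sum y)^2}{n}}
\]
(iii) We know that
\[
r = \frac{\sum d_x d_y - \frac{\sum d_x \sum d_y}{n}}{\sqrt{\sum d_x^2 - \frac{(\sum d_x)^2}{n}}\, \sqrt{\sum d_y^2 - \frac{(\sum d_y)^2}{n}}}
\]
\[
\sigma_x = \sqrt{\frac{1}{n} \left[ \sum d_x^2 - \frac{(\sum d_x)^2}{n} \right]}, \qquad
\sigma_y = \sqrt{\frac{1}{n} \left[ \sum d_y^2 - \frac{(\sum d_y)^2}{n} \right]}
\]
\[
b_{yx} = r \frac{\sigma_y}{\sigma_x} = \frac{\sum d_x d_y - \frac{\sum d_x \sum d_y}{n}}{\sum d_x^2 - \frac{(\sum d_x)^2}{n}}
\]
and
\[
b_{xy} = r \frac{\sigma_x}{\sigma_y} = \frac{\sum d_x d_y - \frac{\sum d_x \sum d_y}{n}}{\sum d_y^2 - \frac{(\sum d_y)^2}{n}}
\]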
1. The coefficient of correlation is the geometric mean of the coefficients of regression, i.e., r = √(b_yx b_xy).
2. If one of the regression coefficients is greater than one, the other must be less than one.
Proof Let b_yx > 1.
We know that
r² ≤ 1 and r² = b_yx b_xy
\[
b_{yx} b_{xy} \le 1, \qquad b_{xy} \le \frac{1}{b_{yx}} < 1
\]
3. The arithmetic mean of regression coefficients is greater than or equal to the coefficient of correlation.
Proof We have to prove that
\[
\frac{1}{2}(b_{yx} + b_{xy}) \ge r
\]
\[
\text{i.e.,} \quad \frac{1}{2} \left( r \frac{\sigma_y}{\sigma_x} + r \frac{\sigma_x}{\sigma_y} \right) \ge r
\]
\[
\text{i.e.,} \quad \frac{\sigma_y}{\sigma_x} + \frac{\sigma_x}{\sigma_y} \ge 2
\]
\[
\text{i.e.,} \quad \sigma_y^2 + \sigma_x^2 - 2\sigma_x \sigma_y \ge 0
\]
\[
\text{i.e.,} \quad (\sigma_y - \sigma_x)^2 \ge 0
\]
which is always true, since the square of a real quantity is ≥ 0.
4. Regression coefficients are independent of the change of origin but not of scale.
Proof Let
\[
d_x = \frac{x - a}{h}, \qquad d_y = \frac{y - b}{k}
\]
\[
x = a + h d_x, \qquad y = b + k d_y
\]
where a, b, h (> 0) and k (> 0) are constants.
\[
r_{d_x d_y} = r_{xy}, \qquad \sigma_{d_x}^2 = \frac{1}{h^2} \sigma_x^2, \qquad \sigma_{d_y}^2 = \frac{1}{k^2} \sigma_y^2
\]
\[
\begin{aligned}
b_{d_x d_y} &= r_{d_x d_y} \frac{\sigma_{d_x}}{\sigma_{d_y}} \\
&= r_{xy} \frac{\sigma_x}{h} \cdot \frac{k}{\sigma_y} \\
&= \frac{k}{h}\, r_{xy} \frac{\sigma_x}{\sigma_y} \\
&= \frac{k}{h}\, b_{xy}
\end{aligned}
\]
\[
\text{Similarly,} \quad b_{d_y d_x} = \frac{h}{k}\, b_{yx}
\]
5. Both regression coefficients will have the same sign, i.e., either both are positive or both are negative.
6. The sign of correlation is the same as that of the regression coefficients, i.e., r > 0 if b_xy > 0 and b_yx > 0; and r < 0 if b_xy < 0 and b_yx < 0.
Example 1
The regression lines of a sample are x + 6y = 6 and 3x + 2y = 10. Find
(i) sample means x̄ and ȳ, and
(ii) the coefficient of correlation between x and y.
(iii) Also estimate y when x = 12.
Solution
(i) The regression lines pass through the point (x̄, ȳ).
x̄ + 6ȳ = 6 ...(1)
3x̄ + 2ȳ = 10 ...(2)
Solving Eqs (1) and (2),
x̄ = 3, ȳ = 1/2
(ii) Let the line x + 6y = 6 be the line of regression of y on x.
6y = –x + 6
\[
y = -\frac{1}{6} x + 1 \qquad \therefore\ b_{yx} = -\frac{1}{6}
\]
Let the line 3x + 2y = 10 be the line of regression of x on y.
3x = –2y + 10
\[
x = -\frac{2}{3} y + \frac{10}{3} \qquad \therefore\ b_{xy} = -\frac{2}{3}
\]
\[
r = \sqrt{b_{yx} b_{xy}} = \sqrt{\left( -\frac{1}{6} \right) \left( -\frac{2}{3} \right)} = \frac{1}{3}
\]
Since b_yx and b_xy are negative, r is negative.
\[
r = -\frac{1}{3}
\]
(iii) The estimated value of y when x = 12 is
\[
y = -\frac{1}{6}(12) + 1 = -1
\]
Example 2
If the two lines of regression are 4x – 5y + 30 = 0 and 20x – 9y – 107 = 0, which of these are the lines of regression of x on y and y on x? Find r_xy, and σ_y when σ_x = 3.
Solution
For the line 4x – 5y + 30 = 0,
–5y = –4x – 30
y = 0.8x + 6
∴ b_yx = 0.8
For the line 20x – 9y – 107 = 0,
20x = 9y + 107
x = 0.45y + 5.35
∴ b_xy = 0.45
Both b_yx and b_xy are positive, and their product b_yx b_xy = 0.36 does not exceed 1.
Hence, the line 4x – 5y + 30 = 0 is the line of regression of y on x and the line 20x – 9y – 107 = 0 is the line of regression of x on y.
\[
r = \sqrt{b_{yx} b_{xy}} = \sqrt{(0.8)(0.45)} = 0.6
\]
\[
b_{yx} = r \frac{\sigma_y}{\sigma_x}, \qquad 0.8 = 0.6 \left( \frac{\sigma_y}{3} \right)
\]
∴ σ_y = 4
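The choice between the two possible assignments of the lines can be automated by testing the condition b_yx b_xy = r² ≤ 1 together with the equal-sign property. The sketch below is illustrative only: the function name `classify_regression_lines` and the (a, b, c) representation of a line ax + by + c = 0 are assumptions made for this example.

```python
from math import sqrt

def classify_regression_lines(line1, line2):
    """Each line is (a, b, c), meaning a*x + b*y + c = 0.

    Tries both assignments and returns (byx, bxy, r) for the first one whose
    coefficients have the same sign and a product not exceeding 1.
    """
    for y_on_x, x_on_y in ((line1, line2), (line2, line1)):
        byx = -y_on_x[0] / y_on_x[1]   # slope when the line is solved for y
        bxy = -x_on_y[1] / x_on_y[0]   # slope when the line is solved for x
        if 0 <= byx * bxy <= 1:
            r = sqrt(byx * bxy)
            return byx, bxy, (r if byx > 0 else -r)
    raise ValueError("no valid assignment of the two lines")

byx, bxy, r = classify_regression_lines((4, -5, 30), (20, -9, -107))
sigma_x = 3
sigma_y = byx * sigma_x / r            # from byx = r * sigma_y / sigma_x
print(byx, bxy, round(r, 4), round(sigma_y, 4))
```

If both assignments happen to satisfy the condition, the function returns the first one tried, so the order of the arguments then matters.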
Example 3
The following data regarding the heights (y) and weights (x) of 100
college students are given:
 x = 15000,  x 2 = 2272500,  y = 6800
 y2 = 463025,  xy = 1022250
Find the coefficient of correlation between height and weight and also
the equation of regression of height and weight.
Solution
n = 100
4.38 Chapter 4 Correlation and Regression
 x y
 xy - n
byx =
(Â x )
2
Âx 2
-
n
(15000)(6800)
1022250 -
= 100
(15000)2
2272500 -
100
= 0.1
 x y
 xy - n
bxy =
(Â y )
2
Ây 2
-
n
(15000)(6800)
1022250 -
= 100
(6800)2
463025 -
100
= 3.6
x=
 x = 15000 = 150
n 100
y=
 y = 6800 = 68
n 100
The equation of the line of regression of y on x is
y - y = byx ( x - x )
y - 68 = 0.1( x - 150)
y = 0.1x + 53
The equation of the line of regression of x on y is
x - x = bxy ( y - y )
x - 150 = 3.6( y - 68)
x = 3.6 y - 94.8
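When only the five sums are given, as in Example 3, the regression coefficients and the correlation coefficient follow directly from them. The Python sketch below illustrates this; the function name `regression_from_sums` and its argument names are hypothetical.

```python
import math

def regression_from_sums(n, sx, sy, sxx, syy, sxy):
    """Return (b_yx, b_xy, r) from n, Σx, Σy, Σx², Σy², Σxy."""
    cov = sxy - sx * sy / n              # shared numerator
    b_yx = cov / (sxx - sx ** 2 / n)     # regression coefficient of y on x
    b_xy = cov / (syy - sy ** 2 / n)     # regression coefficient of x on y
    # r takes the common sign of the two regression coefficients
    r = math.copysign(math.sqrt(b_yx * b_xy), cov)
    return b_yx, b_xy, r

b_yx, b_xy, r = regression_from_sums(100, 15000, 6800, 2272500, 463025, 1022250)
print(b_yx, b_xy, round(r, 4))
```

Taking the sign of r from the shared numerator applies the property that r and both regression coefficients have the same sign.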
Example 4
For a bivariate data, the mean value of x is 20 and the mean value of y is
45. The regression coefficient of y on x is 4 and that of x on y is \(\frac{1}{9}\).
Find
(i) the coefficient of correlation, and
(ii) the standard deviation of x if the standard deviation of y is 12.
(iii) Also write down the equations of regression lines.
Solution
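\[ \bar{x} = 20, \quad \bar{y} = 45, \quad b_{yx} = 4, \quad b_{xy} = \frac{1}{9} \]
(i)
\[ r = \sqrt{b_{yx}\, b_{xy}} = \sqrt{(4)\left(\frac{1}{9}\right)} = \frac{2}{3} = 0.667 \]
(ii)
\[ b_{yx} = r\,\frac{\sigma_y}{\sigma_x} \]
\[ 4 = \frac{2}{3}\left(\frac{12}{\sigma_x}\right) \]
\[ \therefore\; \sigma_x = 2 \]
(iii) The equation of the regression line of y on x is
\[ y - \bar{y} = b_{yx}(x - \bar{x}) \]
\[ y - 45 = 4(x - 20) \]
\[ y = 4x - 35 \]
The equation of the regression line of x on y is
\[ x - \bar{x} = b_{xy}(y - \bar{y}) \]
\[ x - 20 = \frac{1}{9}(y - 45) \]
\[ x = \frac{1}{9}y + 15 \]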
Example 5
From the following results, obtain the two regression equations and
estimate the yield when the rainfall is 29 cm and the rainfall when the
yield is 600 kg:
          Yield in kg    Rainfall in cm
Mean      508.4          26.7
SD        36.8           4.6
The coefficient of correlation between yield and rainfall is 0.52.
Solution
Let rainfall in cm be denoted by x and yield in kg be denoted by y.
\[ \bar{x} = 26.7, \quad \bar{y} = 508.4, \quad \sigma_x = 4.6, \quad \sigma_y = 36.8, \quad r = 0.52 \]
\[ b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = 0.52\left(\frac{36.8}{4.6}\right) = 4.16 \]
\[ b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = 0.52\left(\frac{4.6}{36.8}\right) = 0.065 \]
The equation of the line of regression of y on x is
\[ y - \bar{y} = b_{yx}(x - \bar{x}) \]
\[ y - 508.4 = 4.16(x - 26.7) \]
\[ y = 4.16x + 397.328 \]
The equation of the line of regression of x on y is
\[ x - \bar{x} = b_{xy}(y - \bar{y}) \]
\[ x - 26.7 = 0.065(y - 508.4) \]
\[ x = 0.065y - 6.346 \]
Estimated yield when the rainfall is 29 cm is
\[ y = 4.16(29) + 397.328 = 517.968 \text{ kg} \]
Estimated rainfall when the yield is 600 kg is
\[ x = 0.065(600) - 6.346 = 32.654 \text{ cm} \]
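The steps of Example 5 (means, standard deviations and r in; two regression lines out) can be sketched in Python as follows. The helper name `lines_from_moments` is an assumption introduced for illustration, and each line is returned as a (slope, intercept) pair.

```python
def lines_from_moments(x_bar, y_bar, sd_x, sd_y, r):
    """Return ((b_yx, c_yx), (b_xy, c_xy)) for y = b_yx*x + c_yx and x = b_xy*y + c_xy."""
    b_yx = r * sd_y / sd_x
    b_xy = r * sd_x / sd_y
    # Both lines pass through the point of means (x_bar, y_bar)
    return (b_yx, y_bar - b_yx * x_bar), (b_xy, x_bar - b_xy * y_bar)

(b_yx, c_yx), (b_xy, c_xy) = lines_from_moments(26.7, 508.4, 4.6, 36.8, 0.52)
yield_at_29_cm = b_yx * 29 + c_yx        # use y on x to estimate yield
rainfall_at_600_kg = b_xy * 600 + c_xy   # use x on y to estimate rainfall
print(round(yield_at_29_cm, 3), round(rainfall_at_600_kg, 3))
```

Note that each estimate uses the line whose dependent variable is the quantity being estimated.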
Example 6
Find the regression coefficients byx and bxy and hence, find the correlation
coefficient between x and y for the following data:
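x    4    2    3    4    2
y    2    3    2    4    4
Solution
n = 5
x     y     x²    y²    xy
4     2     16    4     8
2     3     4     9     6
3     2     9     4     6
4     4     16    16    16
2     4     4     16    8
Σx = 15   Σy = 15   Σx² = 49   Σy² = 49   Σxy = 44
\[ b_{yx} = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}} = \frac{44 - \dfrac{(15)(15)}{5}}{49 - \dfrac{(15)^2}{5}} = -0.25 \]
\[ b_{xy} = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum y^2 - \dfrac{(\sum y)^2}{n}} = \frac{44 - \dfrac{(15)(15)}{5}}{49 - \dfrac{(15)^2}{5}} = -0.25 \]
Since both regression coefficients are negative, r is negative.
\[ r = -\sqrt{b_{yx}\, b_{xy}} = -\sqrt{(-0.25)(-0.25)} = -0.25 \]
Note Σx, Σy, Σx², Σy², Σxy can be directly obtained with the help of a scientific calculator.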
Example 7
The following data give the experience of machine operators and their
performance rating as given by the number of good parts turned out per
100 pieces.
Operator 1 2 3 4 5 6
Performance rating (x) 23 43 53 63 73 83
Experience (y) 5 6 7 8 9 10
Solution
n=6
x      y      y²     xy
23     5      25     115
43     6      36     258
53     7      49     371
63     8      64     504
73     9      81     657
83     10     100    830
Σx = 338   Σy = 45   Σy² = 355   Σxy = 2735
\[ b_{xy} = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum y^2 - \dfrac{(\sum y)^2}{n}} = \frac{2735 - \dfrac{(338)(45)}{6}}{355 - \dfrac{(45)^2}{6}} = 11.429 \]
\[ \bar{x} = \frac{\sum x}{n} = \frac{338}{6} = 56.33 \]
\[ \bar{y} = \frac{\sum y}{n} = \frac{45}{6} = 7.5 \]
The equation of regression line of x on y is
\[ x - \bar{x} = b_{xy}(y - \bar{y}) \]
\[ x - 56.33 = 11.429(y - 7.5) \]
\[ x = 11.429y - 29.3875 \]
Estimated performance if y = 11 is
\[ x = 11.429(11) - 29.3875 = 96.3315 \]
Example 8
The number of bacterial cells (y) per unit volume in a culture at different
hours (x) is given below:
x 0 1 2 3 4 5 6 7 8 9
y 43 46 82 98 123 167 199 213 245 272
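\[ b_{yx} = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}} = \frac{8924 - \dfrac{(45)(1488)}{10}}{285 - \dfrac{(45)^2}{10}} = 27.0061 \]
\[ b_{xy} = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum y^2 - \dfrac{(\sum y)^2}{n}} = \frac{8924 - \dfrac{(45)(1488)}{10}}{282290 - \dfrac{(1488)^2}{10}} = 0.0366 \]
\[ \bar{x} = \frac{\sum x}{n} = \frac{45}{10} = 4.5 \]
\[ \bar{y} = \frac{\sum y}{n} = \frac{1488}{10} = 148.8 \]
The equation of the line of regression of y on x is
\[ y - \bar{y} = b_{yx}(x - \bar{x}) \]
\[ y - 148.8 = 27.0061(x - 4.5) \]
\[ y = 27.0061x + 27.2726 \]
The equation of the line of regression of x on y is
\[ x - \bar{x} = b_{xy}(y - \bar{y}) \]
\[ x - 4.5 = 0.0366(y - 148.8) \]
\[ x = 0.0366y - 0.9461 \]
At x = 15 hours,
\[ y = 27.0061(15) + 27.2726 = 432.3641 \]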
Example 9
Find the regression coefficient of y on x for the following data:
x 1 2 3 4 5
y 160 180 140 180 200
Solution
n=5
\[ \bar{x} = \frac{\sum x}{n} = \frac{15}{5} = 3 \]
\[ \bar{y} = \frac{\sum y}{n} = \frac{860}{5} = 172 \]
x     y      x − x̄    y − ȳ    (x − x̄)²    (x − x̄)(y − ȳ)
1     160    –2       –12      4           24
2     180    –1       8        1           –8
3     140    0        –32      0           0
4     180    1        8        1           8
5     200    2        28       4           56
Σx = 15   Σy = 860   Σ(x − x̄) = 0   Σ(y − ȳ) = 0   Σ(x − x̄)² = 10   Σ(x − x̄)(y − ȳ) = 80
\[ b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{80}{10} = 8 \]
Note Since Σx, Σy, Σx², Σy², Σxy can be directly obtained with the help of a scientific
calculator, the regression coefficient can be calculated without using the mean.
Example 10
Calculate the two regression coefficients from the data and find
correlation coefficient.
x 7 4 8 6 5
y 6 5 9 8 2
Solution
n=5
\[ \bar{x} = \frac{\sum x}{n} = \frac{30}{5} = 6 \]
\[ \bar{y} = \frac{\sum y}{n} = \frac{30}{5} = 6 \]
x    y    x − x̄    y − ȳ    (x − x̄)²    (y − ȳ)²    (x − x̄)(y − ȳ)
7    6    1        0        1           0           0
4    5    –2       –1       4           1           2
8    9    2        3        4           9           6
6    8    0        2        0           4           0
5    2    –1       –4       1           16          4
Σx = 30   Σy = 30   Σ(x − x̄) = 0   Σ(y − ȳ) = 0   Σ(x − x̄)² = 10   Σ(y − ȳ)² = 30   Σ(x − x̄)(y − ȳ) = 12
\[ b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{12}{10} = 1.2 \]
\[ b_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2} = \frac{12}{30} = 0.4 \]
\[ r = \sqrt{b_{yx}\, b_{xy}} = \sqrt{(1.2)(0.4)} = 0.693 \]
Example 11
Obtain the two regression lines from the following data and hence, find
the correlation coefficient.
x 6 2 10 4 8
y 9 11 5 8 7
[Summer 2015]
Solution
n=5
\[ \bar{x} = \frac{\sum x}{n} = \frac{30}{5} = 6 \]
\[ \bar{y} = \frac{\sum y}{n} = \frac{40}{5} = 8 \]
x     y     x − x̄    y − ȳ    (x − x̄)²    (y − ȳ)²    (x − x̄)(y − ȳ)
6     9     0        1        0           1           0
2     11    –4       3        16          9           –12
10    5     4        –3       16          9           –12
4     8     –2       0        4           0           0
8     7     2        –1       4           1           –2
Σx = 30   Σy = 40   Σ(x − x̄) = 0   Σ(y − ȳ) = 0   Σ(x − x̄)² = 40   Σ(y − ȳ)² = 20   Σ(x − x̄)(y − ȳ) = –26
\[ b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{-26}{40} = -0.65 \]
\[ b_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2} = \frac{-26}{20} = -1.3 \]
The equation of regression line of y on x is
\[ y - \bar{y} = b_{yx}(x - \bar{x}) \]
\[ y - 8 = -0.65(x - 6) \]
\[ y = -0.65x + 11.9 \]
The equation of regression line of x on y is
\[ x - \bar{x} = b_{xy}(y - \bar{y}) \]
\[ x - 6 = -1.3(y - 8) \]
\[ x = -1.3y + 16.4 \]
Since both regression coefficients are negative, r is negative.
\[ r = -\sqrt{b_{yx}\, b_{xy}} = -\sqrt{(-0.65)(-1.3)} = -0.9192 \]
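The deviation-from-mean method of Example 11, including the sign convention for r, can be sketched in Python as below. The function name `regression_by_deviations` is hypothetical. The sign of r is taken from Σ(x − x̄)(y − ȳ), which is the common numerator of both regression coefficients.

```python
import math

def regression_by_deviations(xs, ys):
    """Return (b_yx, b_xy, r) using deviations from the actual means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b_yx = sxy / sxx
    b_xy = sxy / syy
    # The square root alone loses the sign, so restore it from sxy
    r = math.copysign(math.sqrt(b_yx * b_xy), sxy)
    return b_yx, b_xy, r

b_yx, b_xy, r = regression_by_deviations([6, 2, 10, 4, 8], [9, 11, 5, 8, 7])
print(b_yx, b_xy, round(r, 4))
```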
Example 12
Calculate the regression coefficients and find the two lines of regression
from the following data:
x 57 58 59 59 60 61 62 64
y 67 68 65 68 72 72 69 71
Solution
n=8
\[ \bar{x} = \frac{\sum x}{n} = \frac{480}{8} = 60 \]
\[ \bar{y} = \frac{\sum y}{n} = \frac{552}{8} = 69 \]
x     y     x − x̄    y − ȳ    (x − x̄)²    (y − ȳ)²    (x − x̄)(y − ȳ)
57    67    –3       –2       9           4           6
58    68    –2       –1       4           1           2
59    65    –1       –4       1           16          4
59    68    –1       –1       1           1           1
60    72    0        3        0           9           0
61    72    1        3        1           9           3
62    69    2        0        4           0           0
64    71    4        2        16          4           8
Σx = 480   Σy = 552   Σ(x − x̄) = 0   Σ(y − ȳ) = 0   Σ(x − x̄)² = 36   Σ(y − ȳ)² = 44   Σ(x − x̄)(y − ȳ) = 24
\[ b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{24}{36} = 0.667 \]
\[ b_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2} = \frac{24}{44} = 0.545 \]
The equation of regression line of y on x is
\[ y - \bar{y} = b_{yx}(x - \bar{x}) \]
\[ y - 69 = 0.667(x - 60) \]
\[ y = 0.667x + 28.98 \]
The equation of regression line of x on y is
\[ x - \bar{x} = b_{xy}(y - \bar{y}) \]
\[ x - 60 = 0.545(y - 69) \]
\[ x = 0.545y + 22.395 \]
Value of y when x = 66 is
\[ y = 0.667(66) + 28.98 = 73.002 \]
Example 13
The following data represents rainfall (x) and yield of paddy per hectare
(y) in a particular area. Find the linear regression of x on y.
x 113 102 95 120 140 130 125
y 1.8 1.5 1.3 1.9 1.1 2.0 1.7
Solution
Let a = 120 and b = 1.8 be the assumed means of x and y series respectively.
\[ d_x = x - a = x - 120 \]
\[ d_y = y - b = y - 1.8 \]
n = 7
x      y      dx      dy      dy²     dxdy
113    1.8    –7      0       0       0
102    1.5    –18     –0.3    0.09    5.4
95     1.3    –25     –0.5    0.25    12.5
120    1.9    0       0.1     0.01    0
140    1.1    20      –0.7    0.49    –14
130    2.0    10      0.2     0.04    2.0
125    1.7    5       –0.1    0.01    –0.5
Σx = 825   Σy = 11.3   Σdx = –15   Σdy = –1.3   Σdy² = 0.89   Σdxdy = 5.4
\[ b_{xy} = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{n}}{\sum d_y^2 - \dfrac{(\sum d_y)^2}{n}} = \frac{5.4 - \dfrac{(-15)(-1.3)}{7}}{0.89 - \dfrac{(-1.3)^2}{7}} = 4.03 \]
\[ \bar{x} = \frac{\sum x}{n} = \frac{825}{7} = 117.86 \]
\[ \bar{y} = \frac{\sum y}{n} = \frac{11.3}{7} = 1.614 \]
The equation of the regression line of x on y is
\[ x - \bar{x} = b_{xy}(y - \bar{y}) \]
\[ x - 117.86 = 4.03(y - 1.614) \]
\[ x = 4.03y + 111.36 \]
Note Since Σx, Σy, Σx², Σy², Σxy can be directly obtained with the help of a scientific
calculator, the regression coefficient can be calculated without using the assumed mean.
Example 14
Find the two lines of regression from the following data:
Age of husband (x) 25 22 28 26 35 20 22 40 20 18
Age of wife (y) 18 15 20 17 22 14 16 21 15 14
Hence, estimate (i) the age of the husband when the age of the wife is 19,
and (ii) the age of the wife when the age of the husband is 30.
Solution
Let a = 26 and b = 17 be the assumed means of x and y series respectively.
\[ d_x = x - a = x - 26 \]
\[ d_y = y - b = y - 17 \]
n = 10
\[ b_{yx} = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{n}}{\sum d_x^2 - \dfrac{(\sum d_x)^2}{n}} = \frac{172 - \dfrac{(-4)(2)}{10}}{450 - \dfrac{(-4)^2}{10}} = 0.385 \]
\[ b_{xy} = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{n}}{\sum d_y^2 - \dfrac{(\sum d_y)^2}{n}} = \frac{172 - \dfrac{(-4)(2)}{10}}{78 - \dfrac{(2)^2}{10}} = 2.227 \]
\[ \bar{x} = \frac{\sum x}{n} = \frac{256}{10} = 25.6 \]
\[ \bar{y} = \frac{\sum y}{n} = \frac{172}{10} = 17.2 \]
The equation of the regression line of y on x is
\[ y - \bar{y} = b_{yx}(x - \bar{x}) \]
\[ y - 17.2 = 0.385(x - 25.6) \]
\[ y = 0.385x + 7.344 \]
The equation of the regression line of x on y is
\[ x - \bar{x} = b_{xy}(y - \bar{y}) \]
\[ x - 25.6 = 2.227(y - 17.2) \]
\[ x = 2.227y - 12.704 \]
Estimated age of the husband when the age of the wife is 19 is
\[ x = 2.227(19) - 12.704 = 29.609 \text{ or 30 nearly} \]
Age of the husband = 30 years
Estimated age of the wife when the age of the husband is 30 is
\[ y = 0.385(30) + 7.344 = 18.894 \text{ or 19 nearly} \]
Age of the wife = 19 years
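The assumed-mean method works because the regression coefficients do not change when a constant is subtracted from every x or every y. The Python sketch below illustrates this with the data of Example 14; the function name `coefficients` and its parameters `a` and `b` (the assumed means) are illustrative choices.

```python
def coefficients(xs, ys, a=0, b=0):
    """Return (b_yx, b_xy) computed from deviations d_x = x - a and d_y = y - b."""
    n = len(xs)
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    cov = sum(u * v for u, v in zip(dx, dy)) - sum(dx) * sum(dy) / n
    b_yx = cov / (sum(u * u for u in dx) - sum(dx) ** 2 / n)
    b_xy = cov / (sum(v * v for v in dy) - sum(dy) ** 2 / n)
    return b_yx, b_xy

husband = [25, 22, 28, 26, 35, 20, 22, 40, 20, 18]
wife = [18, 15, 20, 17, 22, 14, 16, 21, 15, 14]

shifted = coefficients(husband, wife, a=26, b=17)   # assumed means from the text
raw = coefficients(husband, wife)                   # no shift at all
print([round(v, 3) for v in shifted], [round(v, 3) for v in raw])
```

Both calls give the same pair of coefficients up to floating-point rounding, so the assumed means only serve to keep the hand arithmetic small.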
Example 15
From the following data, obtain the two regression lines and correlation
coefficient.
Sales (x) 100 98 78 85 110 93 80
Purchase (y) 85 90 70 72 95 81 74
Solution
Let a = 93 and b = 81 be the assumed means of x and y series respectively.
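\[ d_x = x - a = x - 93 \]
\[ d_y = y - b = y - 81 \]
n = 7
\[ b_{yx} = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{n}}{\sum d_x^2 - \dfrac{(\sum d_x)^2}{n}} = \frac{639 - \dfrac{(-7)(0)}{7}}{821 - \dfrac{(-7)^2}{7}} = 0.785 \]
\[ b_{xy} = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{n}}{\sum d_y^2 - \dfrac{(\sum d_y)^2}{n}} = \frac{639 - \dfrac{(-7)(0)}{7}}{544 - \dfrac{(0)^2}{7}} = 1.1746 \]
\[ \bar{x} = \frac{\sum x}{n} = \frac{644}{7} = 92 \]
\[ \bar{y} = \frac{\sum y}{n} = \frac{567}{7} = 81 \]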
Exercise 4.3
 x = 11.34,  y = 20.78,  x 2
= 12.16, Â y 2 = 84.96, Â xy = 22.13
From the above data, show how to compute the coefficients of the
equation y = a + bx.
[Ans.: a = 0.0005, b = 1.82 ]
6. In the estimation of regression equations of two variables x and y, the
following results were obtained:
\[ \bar{x} = 90, \quad \bar{y} = 70, \quad n = 10, \quad \sum (x - \bar{x})^2 = 6360, \quad \sum (y - \bar{y})^2 = 2860, \quad \sum (x - \bar{x})(y - \bar{y}) = 3900 \]
Obtain the two lines of regression.
[Ans.: x = 1.361 y - 5.27, y = 0.613 x + 14.812]
7. Find the likely production corresponding to a rainfall of 40 cm from the
following data:
Rainfall (in cm) Output (in quintals)
mean 30 50
SD 5 10
r = 0.8
[Ans.: 66 quintals]
8. The following table gives the age of a car of a certain make and annual
maintenance cost. Obtain the equation of the line of regression of cost
on age.
Age of a car 2 4 6 8
Maintenance 1 2 2.5 3
[Ans.: x = 0.325 y + 0.5]
9. Obtain the equation of the line of regression of y on x from the following
data and estimate y for x = 73.
x 70 72 74 76 78 80
y 163 170 179 188 196 220
[Ans.: y = 5.31 x - 212.57, y = 175.37]
10. The heights in cm of fathers (x) and of the eldest sons (y) are given
below:
x 165 160 170 163 173 158 178 168 173 170 175 180
y 173 168 173 165 175 168 173 165 180 170 173 178
Estimate the height of the eldest son if the height of the father is
172 cm and the height of the father if the height of the eldest son is
173 cm. Also, find the coefficient of correlation between the heights of
fathers and sons.
[Ans.: (i) y = 1.016 x - 5.123 (ii) x = 0.476 y + 98.98
(iii) 169.97, 173.45 (iv) r = 0.696]
11. Find (i) the lines of regression, and (ii) coefficient of correlation for
the following data:
x 65 66 67 67 68 69 70 72
y 67 68 65 66 72 72 69 71
[Ans.: (i) y = 19.64 + 0.72 x, x = 33.29 + 0.5 y, (ii) r = 0.604]
12. Find the line of regression for the following data and estimate y
corresponding to x = 15.5.
x 10 12 13 16 17 20 25
y 19 22 24 27 29 33 37
[Ans.: y = 1.21x + 7.71, y = 26.465]
13. The following data give the heights in inches (x) and weights in lbs (y)
of a random sample of 10 students:
x 61 68 68 64 65 70 63 62 64 67
y 112 123 130 115 110 125 100 113 116 126
Estimate the weight of a student of height 59 inches.
[Ans.: 126.4 lbs]
14. Find the regression equations of y on x from the data given below
taking deviations from actual mean of x and y.
Price in rupees (x) 10 12 13 12 16 15
Demand (y) 40 38 43 45 37 43
Estimate the demand when the price is ₹20.
[Ans.: y = -0.25 x + 44.25, y = 39.25]
CHAPTER
5
Some Special
Probability
Distributions
Chapter Outline
5.1 Introduction
5.2 Binomial Distribution
5.3 Poisson Distribution
5.4 Normal Distribution
5.5 Exponential Distribution
5.6 Gamma Distribution
5.1 Introduction
Certain specific distributions are used frequently in practice. There is a random
experiment behind each of these distributions. Since these random experiments model
many real-life phenomena, these special distributions appear in many different
applications. Often, the random variable X associated with a random experiment that we
encounter in practice follows one of these standard distributions.
This chapter discusses special random variables and their distributions. These
include the binomial distribution, Poisson distribution, normal distribution, exponential
distribution and gamma distribution.
5.2 Chapter 5 Some Special Probability Distributions
\[ = np + n(n-1)p^2 (q+p)^{n-2} - \mu^2 \]
\[ = np + n(n-1)p^2 - \mu^2 \qquad [\because\; p + q = 1] \]
\[ = np\,[1 + (n-1)p] - \mu^2 \]
\[ = np\,[1 - p + np] - \mu^2 \]
\[ = np\,[q + np] - \mu^2 \qquad [\because\; 1 - p = q] \]
\[ = np(q + np) - (np)^2 \]
\[ = npq \]
3. Standard Deviation of the Binomial Distribution
\[ \text{SD} = \sqrt{\text{Variance}} = \sqrt{npq} \]
4. Mode of the Binomial Distribution
Mode of the binomial distribution is the value of x at which p(x) has maximum value.
Mode = integral part of (n + 1)p, if (n + 1)p is not an integer
= (n +1) p and (n + 1) p – 1, if (n + 1) p is an integer.
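The mean, variance, standard deviation and mode rules above can be collected into one short Python sketch. The function name `binomial_summary` is hypothetical, p is assumed to lie strictly between 0 and 1, and exact fractions are used so that the test "(n + 1)p is an integer" is not disturbed by floating-point rounding.

```python
import math
from fractions import Fraction as F

def binomial_summary(n, p):
    """Return (mean, variance, sd, modes) of a binomial distribution with 0 < p < 1."""
    p = F(p)
    q = 1 - p
    mean = n * p
    variance = n * p * q
    t = (n + 1) * p
    if t.denominator == 1:
        modes = [int(t) - 1, int(t)]   # (n + 1)p is an integer: two modes
    else:
        modes = [math.floor(t)]        # otherwise the integral part of (n + 1)p
    return mean, variance, math.sqrt(variance), modes

print(binomial_summary(9, F(2, 3)))    # (n + 1)p = 20/3, single mode
print(binomial_summary(5, F(1, 2)))    # (n + 1)p = 3, two modes
```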
Example 1
The mean and standard deviation of a binomial distribution are 5 and 2.
Determine the distribution.
Solution
\[ \mu = np = 5 \]
\[ \text{SD} = \sqrt{npq} = 2 \]
\[ npq = 4 \]
5.2 Binomial Distribution 5.5
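\[ \frac{npq}{np} = \frac{4}{5} \]
\[ \therefore\; q = \frac{4}{5} \]
\[ p = 1 - q = 1 - \frac{4}{5} = \frac{1}{5} \]
\[ np = 5 \]
\[ n\left(\frac{1}{5}\right) = 5 \]
\[ \therefore\; n = 25 \]
Hence, the binomial distribution is
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{25}C_{x} \left(\frac{1}{5}\right)^{x} \left(\frac{4}{5}\right)^{25-x}, \quad x = 0, 1, 2, \ldots, 25 \]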
Example 2
The mean and variance of a binomial variate are 8 and 6. Find
P(X ≥ 2).
Solution
\[ \mu = np = 8 \]
\[ \sigma^2 = npq = 6 \]
\[ \frac{npq}{np} = \frac{6}{8} = \frac{3}{4} \]
\[ \therefore\; q = \frac{3}{4} \]
\[ p = 1 - q = 1 - \frac{3}{4} = \frac{1}{4} \]
\[ np = 8 \]
\[ n\left(\frac{1}{4}\right) = 8 \]
\[ \therefore\; n = 32 \]
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{32}C_{x} \left(\frac{1}{4}\right)^{x} \left(\frac{3}{4}\right)^{32-x}, \quad x = 0, 1, 2, \ldots, 32 \]
\[ P(X \ge 2) = 1 - P(X < 2) = 1 - [P(X = 0) + P(X = 1)] \]
\[ = 1 - \sum_{x=0}^{1} {}^{32}C_{x} \left(\frac{1}{4}\right)^{x} \left(\frac{3}{4}\right)^{32-x} \]
\[ = 0.9988 \]
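The pattern used in Examples 1 and 2 (divide the variance by the mean to get q, then recover p and n) can be sketched in Python. The helper names `params_from_moments` and `pmf` are assumptions made for this illustration, and exact fractions are used so that n comes out as an exact integer.

```python
from fractions import Fraction as F
from math import comb

def params_from_moments(mean, variance):
    """Recover (n, p) of a binomial distribution from its mean np and variance npq."""
    q = F(variance) / F(mean)
    if not 0 < q < 1:
        raise ValueError("q = variance/mean must lie strictly between 0 and 1")
    p = 1 - q
    n = F(mean) / p
    if n.denominator != 1:
        raise ValueError("n = mean/p is not an integer")
    return int(n), p

def pmf(n, p, x):
    """P(X = x) for a binomial distribution with parameters n and p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = params_from_moments(8, 6)
tail = 1 - pmf(n, p, 0) - pmf(n, p, 1)     # P(X >= 2)
print(n, p, round(float(tail), 4))
```

A variance larger than the mean makes q exceed 1, which the sketch rejects; that is the usual test for an impossible binomial distribution.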
Example 3
Suppose P(X = 0) = 1 – P(X = 1). If E(X) = 3 Var (X), find P(X = 0).
Solution
\[ E(X) = 3\,\text{Var}(X) \]
\[ np = 3npq \]
\[ 1 = 3q \]
\[ \therefore\; q = \frac{1}{3} \]
\[ p = 1 - q = 1 - \frac{1}{3} = \frac{2}{3} \]
Let P(X = 1) = p
\[ P(X = 0) = 1 - P(X = 1) = 1 - p = 1 - \frac{2}{3} = \frac{1}{3} \]
Example 4
The mean and variance of a binomial distribution are 4 and \(\frac{4}{3}\)
respectively. Find P(X ≥ 1).
Solution
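\[ \mu = np = 4 \]
\[ \sigma^2 = npq = \frac{4}{3} \]
\[ \frac{npq}{np} = \frac{4/3}{4} = \frac{1}{3} \]
\[ \therefore\; q = \frac{1}{3} \]
\[ p = 1 - q = 1 - \frac{1}{3} = \frac{2}{3} \]
\[ np = 4 \]
\[ n\left(\frac{2}{3}\right) = 4 \]
\[ \therefore\; n = 6 \]
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{6}C_{x} \left(\frac{2}{3}\right)^{x} \left(\frac{1}{3}\right)^{6-x}, \quad x = 0, 1, 2, \ldots, 6 \]
\[ P(X \ge 1) = 1 - P(X < 1) = 1 - P(X = 0) \]
\[ = 1 - {}^{6}C_{0} \left(\frac{2}{3}\right)^{0} \left(\frac{1}{3}\right)^{6} \]
\[ = 0.9986 \]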
Example 5
A discrete random variable X has mean 6 and variance 2. If it is assumed
that the distribution is binomial, find the probability that 5 ≤ X ≤ 7.
Solution
\[ \mu = np = 6 \]
\[ \sigma^2 = npq = 2 \]
\[ \frac{npq}{np} = \frac{2}{6} = \frac{1}{3} \]
\[ \therefore\; q = \frac{1}{3} \]
\[ p = 1 - q = 1 - \frac{1}{3} = \frac{2}{3} \]
\[ np = 6 \]
\[ n\left(\frac{2}{3}\right) = 6 \]
\[ \therefore\; n = 9 \]
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{9}C_{x} \left(\frac{2}{3}\right)^{x} \left(\frac{1}{3}\right)^{9-x}, \quad x = 0, 1, 2, \ldots, 9 \]
\[ P(5 \le X \le 7) = P(X = 5) + P(X = 6) + P(X = 7) \]
\[ = \sum_{x=5}^{7} {}^{9}C_{x} \left(\frac{2}{3}\right)^{x} \left(\frac{1}{3}\right)^{9-x} \]
\[ = \frac{4672}{6561} = 0.7121 \]
Example 6
With the usual notation, find p for a binomial distribution if n = 6 and
9P(X = 4) = P(X = 2).
Solution
For the binomial distribution,
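\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n \]
\[ n = 6 \]
\[ 9P(X = 4) = P(X = 2) \]
\[ 9\,{}^{6}C_{4}\, p^4 q^2 = {}^{6}C_{2}\, p^2 q^4 \]
\[ 9p^2 = q^2 = (1 - p)^2 \]
\[ 9p^2 = 1 - 2p + p^2 \]
\[ 8p^2 + 2p - 1 = 0 \]
\[ p = \frac{-2 \pm \sqrt{4 + 32}}{2 \times 8} = \frac{-2 \pm 6}{16} = -\frac{1}{2},\; \frac{1}{4} \]
Since probability cannot be negative, \(p = \frac{1}{4}\).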
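As a quick check on Example 6, the quadratic 8p² + 2p − 1 = 0 can be solved numerically and the admissible root substituted back into the original condition 9P(X = 4) = P(X = 2). The Python sketch below uses illustrative variable names only.

```python
import math
from math import comb

a, b, c = 8, 2, -1                       # coefficients of 8p^2 + 2p - 1 = 0
disc = b * b - 4 * a * c                 # discriminant of the quadratic
roots = [(-b + s * math.sqrt(disc)) / (2 * a) for s in (1, -1)]

# Keep only the root that is a valid probability
p = next(root for root in roots if 0 <= root <= 1)
q = 1 - p

lhs = 9 * comb(6, 4) * p ** 4 * q ** 2   # 9 P(X = 4)
rhs = comb(6, 2) * p ** 2 * q ** 4       # P(X = 2)
print(roots, p, lhs, rhs)
```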
Example 7
In a binomial distribution consisting of 5 independent trials, the
probability of 1 and 2 successes are 0.4096 and 0.2048 respectively.
Find the parameter p of the distribution.
Solution
n = 5, P( X = 1) = 0.4096, P( X = 2) = 0.2048
Example 8
In a binomial distribution, the sum and product of the mean and variance
are \(\frac{25}{3}\) and \(\frac{50}{3}\) respectively. Determine the distribution.
Solution
For the binomial distribution,
\[ np + npq = \frac{25}{3} \]
\[ np(1 + q) = \frac{25}{3} \qquad \ldots(1) \]
and
\[ np(npq) = \frac{50}{3} \]
\[ n^2 p^2 q = \frac{50}{3} \qquad \ldots(2) \]
Example 9
If the probability of a defective bolt is \(\frac{1}{8}\), find the (i) mean, and
(ii) variance for the distribution of defective bolts in a total of 640 bolts.
Solution
\[ p = \frac{1}{8}, \quad n = 640 \]
\[ \mu = np = \frac{640}{8} = 80 \]
\[ q = 1 - p = 1 - \frac{1}{8} = \frac{7}{8} \]
\[ \text{Variance of the distribution} = npq = 640\left(\frac{1}{8}\right)\left(\frac{7}{8}\right) = 70 \]
Example 10
In eight throws of a die, 5 or 6 is considered as a success. Find the mean
number of success and the standard deviation.
Solution
Let p be the probability of success.
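\[ p = \frac{1}{6} + \frac{1}{6} = \frac{1}{3} \]
\[ q = 1 - p = 1 - \frac{1}{3} = \frac{2}{3} \]
\[ n = 8 \]
\[ \mu = np = 8\left(\frac{1}{3}\right) = \frac{8}{3} \]
\[ \text{SD} = \sqrt{npq} = \sqrt{8\left(\frac{1}{3}\right)\left(\frac{2}{3}\right)} = \frac{4}{3} \]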
Example 11
4 coins are tossed simultaneously. What is the probability of getting
(i) 2 heads? (ii) at least 2 heads? (iii) at most 2 heads?
Solution
Let p be the probability of getting a head in the toss of a coin.
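\[ p = \frac{1}{2}, \quad q = 1 - p = 1 - \frac{1}{2} = \frac{1}{2}, \quad n = 4 \]
The probability of getting x heads when 4 coins are tossed
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{4}C_{x} \left(\frac{1}{2}\right)^{x} \left(\frac{1}{2}\right)^{4-x}, \quad x = 0, 1, 2, 3, 4 \]
(i) Probability of getting 2 heads when 4 coins are tossed
\[ P(X = 2) = {}^{4}C_{2} \left(\frac{1}{2}\right)^{2} \left(\frac{1}{2}\right)^{2} = \frac{3}{8} \]
(ii) Probability of getting at least two heads when 4 coins are tossed
\[ P(X \ge 2) = P(X = 2) + P(X = 3) + P(X = 4) \]
\[ = \sum_{x=2}^{4} {}^{4}C_{x} \left(\frac{1}{2}\right)^{x} \left(\frac{1}{2}\right)^{4-x} = \frac{11}{16} \]
(iii) Probability of getting at most 2 heads when 4 coins are tossed
\[ P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) \]
\[ = \sum_{x=0}^{2} {}^{4}C_{x} \left(\frac{1}{2}\right)^{x} \left(\frac{1}{2}\right)^{4-x} = \frac{11}{16} \]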
Example 12
Two dice are thrown five times. Find the probability of getting the sum as
7 (i) at least once, (ii) two times, and (iii) P(1 < X < 5).
Solution
In a single throw of two dice, a sum of 7 can occur in 6 ways out of 6 × 6 = 36 ways.
(1, 6), (6, 1), (2, 5), (5, 2), (3, 4), (4, 3)
Let p be the probability of getting the sum as 7 in a single throw of a pair of dice.
\[ p = \frac{6}{36} = \frac{1}{6}, \quad q = 1 - p = 1 - \frac{1}{6} = \frac{5}{6}, \quad n = 5 \]
Probability of getting the sum 7 x times in 5 throws of a pair of dice
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{5}C_{x} \left(\frac{1}{6}\right)^{x} \left(\frac{5}{6}\right)^{5-x}, \quad x = 0, 1, 2, \ldots, 5 \]
(i) Probability of getting the sum as 7 at least once in 5 throws of two dice
\[ P(X \ge 1) = 1 - P(X = 0) = 1 - {}^{5}C_{0} \left(\frac{1}{6}\right)^{0} \left(\frac{5}{6}\right)^{5} = 1 - \frac{3125}{7776} = \frac{4651}{7776} \]
(ii) Probability of getting the sum as 7 two times in 5 throws of two dice
\[ P(X = 2) = {}^{5}C_{2} \left(\frac{1}{6}\right)^{2} \left(\frac{5}{6}\right)^{3} = \frac{625}{3888} \]
(iii) Probability P(1 < X < 5) of getting the sum as 7 in 5 throws of two dice
\[ P(1 < X < 5) = P(X = 2) + P(X = 3) + P(X = 4) \]
\[ = \sum_{x=2}^{4} {}^{5}C_{x} \left(\frac{1}{6}\right)^{x} \left(\frac{5}{6}\right)^{5-x} = \frac{1525}{7776} \]
Example 13
If 10% of the screws produced by a machine are defective, find the
probability that out of 5 screws chosen at random, (i) none is defective,
(ii) one is defective, and (iii) at most two are defective.
Solution
Let p be the probability of defective screws.
p = 0.1, q = 1 – p = 1 – 0.1 = 0.9, n = 5
Probability that x screws out of 5 screws are defective
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{5}C_{x} (0.1)^{x} (0.9)^{5-x}, \quad x = 0, 1, 2, \ldots, 5 \]
(i) Probability that none of the screws out of 5 screws is defective
\[ P(X = 0) = {}^{5}C_{0} (0.1)^{0} (0.9)^{5} = 0.5905 \]
(ii) Probability that one screw out of 5 screws is defective
\[ P(X = 1) = {}^{5}C_{1} (0.1)^{1} (0.9)^{4} = 0.3281 \]
(iii) Probability that at most 2 screws out of 5 screws are defective
\[ P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) \]
\[ = \sum_{x=0}^{2} {}^{5}C_{x} (0.1)^{x} (0.9)^{5-x} = 0.9914 \]
Example 14
A multiple-choice test consists of 8 questions with 3 answers to each
question (of which only one is correct). A student answers each question
by rolling a balanced die and checking the first answer if he gets 1 or 2,
the second answer if he gets 3 or 4, and the third answer if he gets
5 or 6. To get a distinction, the student must secure at least 75% correct
answers. If there is no negative marking, what is the probability that the
student secures a distinction? [Summer 2015]
Solution
Let p be the probability of getting an answer to a question correctly. There are three
answers to each question, out of which only one is correct.
\[ p = \frac{1}{3}, \quad q = 1 - p = 1 - \frac{1}{3} = \frac{2}{3}, \quad n = 8 \]
Probability of getting x correct answers in an 8-question test
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{8}C_{x} \left(\frac{1}{3}\right)^{x} \left(\frac{2}{3}\right)^{8-x}, \quad x = 0, 1, 2, \ldots, 8 \]
Probability of securing a distinction, i.e., getting at least 6 correct answers out of the
8 questions
\[ P(X \ge 6) = P(X = 6) + P(X = 7) + P(X = 8) \]
\[ = \sum_{x=6}^{8} {}^{8}C_{x} \left(\frac{1}{3}\right)^{x} \left(\frac{2}{3}\right)^{8-x} \]
\[ = \frac{43}{2187} = 0.0197 \]
Example 15
A and B play a game in which their chances of winning are in the ratio
3:2. Find A’s chance of winning at least three games out of the five
games played.
Solution
Let p be the probability that A wins the game.
\[ p = \frac{3}{3 + 2} = \frac{3}{5}, \quad q = 1 - p = 1 - \frac{3}{5} = \frac{2}{5}, \quad n = 5 \]
Example 16
It has been claimed that in 60% of all solar heat installations the
utility bill is reduced by at least one-third. Accordingly, what are the
probabilities that the utility bill will be reduced by at least one third in
(i) four of five installations? (ii) at least four of five installations?
Solution
Let p be the probability that the utility bill is reduced by at least one-third in a solar
heat installation.
p = 60% = 0.6, q = 1 – p = 1 – 0.6 = 0.4, n = 5
Probability that the utility bill is reduced by at least one-third in x installations out of 5
installations
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{5}C_{x} (0.6)^{x} (0.4)^{5-x}, \quad x = 0, 1, 2, \ldots, 5 \]
(i) Probability that the utility bill is reduced by at least one-third in 4 of 5 installations
\[ P(X = 4) = {}^{5}C_{4} (0.6)^{4} (0.4)^{1} = \frac{162}{625} \]
(ii) Probability that the utility bill is reduced by at least one-third in at least 4 of 5 installations
\[ P(X \ge 4) = P(X = 4) + P(X = 5) \]
\[ = \sum_{x=4}^{5} {}^{5}C_{x} (0.6)^{x} (0.4)^{5-x} \]
\[ = \frac{1053}{3125} = 0.337 \]
Example 17
The incidence of an occupational disease in an industry is such that the
workers have a 20% chance of suffering from it. What is the probability
that out of 6 workers chosen at random, four or more will suffer from
the disease?
Solution
Let p be the probability of a worker suffering from the disease.
p = 0.2, q = 1 – p = 1 – 0.2 = 0.8, n = 6
Probability that x workers will suffer from the disease
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{6}C_{x} (0.2)^{x} (0.8)^{6-x}, \quad x = 0, 1, 2, \ldots, 6 \]
Probability that 4 or more workers will suffer from the disease
\[ P(X \ge 4) = P(X = 4) + P(X = 5) + P(X = 6) \]
\[ = \sum_{x=4}^{6} {}^{6}C_{x} (0.2)^{x} (0.8)^{6-x} \]
\[ = \frac{53}{3125} = 0.017 \]
Example 18
The probability that a man aged 60 will live up to 70 is 0.65. What is
the probability that out of 10 such men now at 60 at least 7 will live up
to 70?
Solution
Let p be the probability that a man will live up to 70.
p = 0.65, q = 1 – p = 1 – 0.65 = 0.35, n = 10
Probability that x men out of 10 will live up to 70
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{10}C_{x} (0.65)^{x} (0.35)^{10-x}, \quad x = 0, 1, 2, \ldots, 10 \]
Probability that at least 7 men out of 10 will live up to 70
\[ P(X \ge 7) = P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10) \]
\[ = \sum_{x=7}^{10} {}^{10}C_{x} (0.65)^{x} (0.35)^{10-x} \]
\[ = 0.5138 \]
Example 19
In a multiple-choice examination, there are 20 questions. Each question
has 4 alternative answers following it and the student must select one
correct answer. 4 marks are given for a correct answer and 1 mark is
deducted for a wrong answer. A student must secure at least 50% of the
maximum possible marks to pass the examination. Suppose a student
has not studied at all, so that he answers the questions by guessing only.
What is the probability that he will pass the examination?
Solution
Since there are 20 questions and each carries 4 marks, the maximum marks are
80. If the student answers 12 questions correctly and 8 questions wrongly, he gets 48 – 8
= 40 marks, which is the minimum required for passing. If he gets more than 12 correct
answers, he gets more than 40 marks. Let p be the probability of getting a correct answer.
\[ p = \frac{1}{4}, \quad q = 1 - p = 1 - \frac{1}{4} = \frac{3}{4}, \quad n = 20 \]
Probability of getting x correct answers out of 20 answers
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{20}C_{x} \left(\frac{1}{4}\right)^{x} \left(\frac{3}{4}\right)^{20-x}, \quad x = 0, 1, 2, \ldots, 20 \]
Probability of passing the examination, i.e., probability of getting at least 12 correct
answers out of 20 answers
\[ P(X \ge 12) = \sum_{x=12}^{20} P(X = x) = \sum_{x=12}^{20} {}^{20}C_{x} \left(\frac{1}{4}\right)^{x} \left(\frac{3}{4}\right)^{20-x} \]
\[ = 9.3539 \times 10^{-4} \]
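Both steps of Example 19 (finding the smallest number of correct answers that reaches the pass mark, then summing the binomial tail) can be reproduced with a short Python sketch. The names `marks` and `need` are illustrative.

```python
from math import comb

n, p = 20, 0.25            # 20 questions, one correct option out of four
max_marks = 4 * n          # 80

def marks(correct):
    """Score with +4 per correct answer and -1 per wrong answer."""
    return 4 * correct - (n - correct)

# Smallest number of correct answers giving at least 50% of the maximum marks
need = min(k for k in range(n + 1) if marks(k) >= 0.5 * max_marks)

# Binomial tail probability P(X >= need)
tail = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(need, n + 1))
print(need, tail)
```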
Example 20
The probability of a man hitting a target is \(\frac{1}{3}\). (i) If he fires 5 times, what
is the probability of his hitting the target at least twice? (ii) How many
times must he fire so that the probability of his hitting the target at least
once is more than 90%?
Solution
Let p be the probability of hitting the target.
\[ p = \frac{1}{3}, \quad q = 1 - p = 1 - \frac{1}{3} = \frac{2}{3}, \quad n = 5 \]
Probability of hitting the target x times out of 5 times
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{5}C_{x} \left(\frac{1}{3}\right)^{x} \left(\frac{2}{3}\right)^{5-x}, \quad x = 0, 1, 2, \ldots, 5 \]
(i) Probability of hitting the target at least twice out of 5 times
\[ P(X \ge 2) = P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) \]
\[ = \sum_{x=2}^{5} {}^{5}C_{x} \left(\frac{1}{3}\right)^{x} \left(\frac{2}{3}\right)^{5-x} \]
\[ = \frac{131}{243} = 0.5391 \]
(ii) Let n be the number of times he fires. The probability of hitting the target at least
once out of n times must exceed 0.9.
\[ P(X \ge 1) > 0.9 \]
\[ 1 - P(X = 0) > 0.9 \]
\[ 1 - {}^{n}C_{0} \left(\frac{1}{3}\right)^{0} \left(\frac{2}{3}\right)^{n} > 0.9 \]
\[ 1 - \left(\frac{2}{3}\right)^{n} > 0.9 \]
\[ \text{For } n = 6, \quad 1 - \left(\frac{2}{3}\right)^{6} = 0.9122 \]
Hence, the man must fire 6 times so that the probability of hitting the target at least once
is more than 90%.
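Part (ii) of Example 20 amounts to finding the smallest n with 1 − qⁿ > 0.9. The Python sketch below finds it by direct search and compares it with the logarithmic bound n > ln(0.1)/ln(2/3); the variable names are illustrative.

```python
import math

p = 1 / 3          # probability of hitting the target on a single shot
target = 0.9       # required probability of at least one hit

# Direct search: increase n until P(at least one hit) exceeds the target
n = 1
while 1 - (1 - p) ** n <= target:
    n += 1

# Equivalent closed form: n must exceed ln(1 - target) / ln(1 - p)
bound = math.log(1 - target) / math.log(1 - p)
print(n, bound, 1 - (1 - p) ** n)
```

For n = 5 the probability of at least one hit is still below 0.9, which is why 6 is the minimum.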
Example 21
In sampling a large number of parts manufactured by a machine, the
mean number of defectives in a sample of 20 is 2. Out of 1000 such
samples, how many would be expected to contain exactly two defective
parts? [Summer 2015]
Solution
Let p be the probability of parts being defective.
Example 22
An irregular 6-faced die is thrown such that the probability that it gives
3 even numbers in 5 throws is twice the probability that it gives 2 even
numbers in 5 throws. How many sets of exactly 5 trials can be expected
to give no even number out of 2500 sets?
Solution
Let p be the probability of getting an even number in a throw of a die.
n = 5, N = 2500
Probability of getting x even numbers in 5 throws of a die
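\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{5}C_{x}\, p^x q^{5-x}, \quad x = 0, 1, 2, \ldots, 5 \]
\[ P(X = 3) = 2P(X = 2) \]
\[ {}^{5}C_{3}\, p^3 q^2 = 2\left({}^{5}C_{2}\, p^2 q^3\right) \]
\[ 10p^3 q^2 = 20p^2 q^3 \]
\[ p = 2q \]
\[ p = 2(1 - p) = 2 - 2p \]
\[ \therefore\; p = \frac{2}{3} \]
\[ q = 1 - p = 1 - \frac{2}{3} = \frac{1}{3} \]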
Example 23
Out of 800 families with 5 children each, how many would you expect
to have (i) 3 boys? (ii) 5 girls? (iii) either 2 or 3 boys? (iv) at least one
boy? Assume equal probabilities for boys and girls.
Solution
Let p be the probability of having a boy in each family.
\[ p = \frac{1}{2}, \quad q = 1 - p = 1 - \frac{1}{2} = \frac{1}{2}, \quad n = 5, \quad N = 800 \]
Probability of having x boys out of 5 children in each family
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{5}C_{x} \left(\frac{1}{2}\right)^{x} \left(\frac{1}{2}\right)^{5-x}, \quad x = 0, 1, 2, \ldots, 5 \]
(i) Probability of having 3 boys out of 5 children in each family
\[ P(X = 3) = {}^{5}C_{3} \left(\frac{1}{2}\right)^{3} \left(\frac{1}{2}\right)^{2} = \frac{5}{16} \]
Expected number of families having 3 boys out of 5 children = N P(X = 3)
\[ = 800\left(\frac{5}{16}\right) = 250 \]
(ii) Probability of having 5 girls, i.e., no boys out of 5 children in each family
\[ P(X = 0) = {}^{5}C_{0} \left(\frac{1}{2}\right)^{0} \left(\frac{1}{2}\right)^{5} = \frac{1}{32} \]
Expected number of families having 5 girls out of 5 children = N P(X = 0)
\[ = 800\left(\frac{1}{32}\right) = 25 \]
(iii) Probability of having either 2 or 3 boys out of 5 children in each family
\[ P(X = 2) + P(X = 3) = \sum_{x=2}^{3} {}^{5}C_{x} \left(\frac{1}{2}\right)^{x} \left(\frac{1}{2}\right)^{5-x} = \frac{5}{8} \]
Expected number of families having either 2 or 3 boys out of 5 children
\[ = N[P(X = 2) + P(X = 3)] = 800\left(\frac{5}{8}\right) = 500 \]
(iv) Probability of having at least one boy out of 5 children in each family
\[ P(X \ge 1) = P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) \]
\[ = \sum_{x=1}^{5} {}^{5}C_{x} \left(\frac{1}{2}\right)^{x} \left(\frac{1}{2}\right)^{5-x} = \frac{31}{32} \]
Expected number of families having at least one boy out of 5 children
\[ = N P(X \ge 1) = 800\left(\frac{31}{32}\right) = 775 \]
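Expected frequencies such as those in Example 23 are N times the corresponding binomial probability. The Python sketch below recomputes the four parts with exact fractions; the dictionary keys and the helper name `pmf` are illustrative.

```python
from fractions import Fraction as F
from math import comb

N, n, p = 800, 5, F(1, 2)      # families, children per family, P(boy)

def pmf(x):
    """P(X = x): probability of exactly x boys among n children."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

expected = {
    "3 boys": N * pmf(3),
    "5 girls": N * pmf(0),
    "2 or 3 boys": N * (pmf(2) + pmf(3)),
    "at least one boy": N * (1 - pmf(0)),   # complement of "no boys"
}
for event, count in expected.items():
    print(event, count)
```

Part (iv) is computed through the complement 1 − P(X = 0), which is shorter than adding five terms and gives the same value.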
Example 24
If hens of a certain breed lay eggs on 5 days a week on an average, find
on how many days during a season of 100 days a poultry keeper with
5 hens of this breed will expect to receive at least 4 eggs.
Solution
Let p be the probability of hen laying an egg on any day of a week.
\[ p = \frac{5}{7}, \quad q = 1 - p = 1 - \frac{5}{7} = \frac{2}{7}, \quad n = 5, \quad N = 100 \]
Probability of x hens laying eggs on any day of a week
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{5}C_{x} \left(\frac{5}{7}\right)^{x} \left(\frac{2}{7}\right)^{5-x}, \quad x = 0, 1, 2, \ldots, 5 \]
Example 25
Seven unbiased coins are tossed 128 times and the number of heads
obtained is noted as given below:
No. of heads 0 1 2 3 4 5 6 7
Frequency 7 6 19 35 30 23 7 1
Fit a binomial distribution to the data.
Solution
Since the coin is unbiased,
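\[ p = \frac{1}{2}, \quad q = \frac{1}{2}, \quad n = 7, \quad N = 128 \]
For binomial distribution,
\[ P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{7}C_{x} \left(\frac{1}{2}\right)^{x} \left(\frac{1}{2}\right)^{7-x}, \quad x = 0, 1, 2, \ldots, 7 \]
Theoretical or expected frequency f(x) = N P(X = x)
\[ f(x) = 128\,{}^{7}C_{x} \left(\frac{1}{2}\right)^{x} \left(\frac{1}{2}\right)^{7-x} = 128\,{}^{7}C_{x} \left(\frac{1}{2}\right)^{7} \]
\[ f(0) = 128\,{}^{7}C_{0} \left(\frac{1}{2}\right)^{7} = 1 \]
\[ f(1) = 128\,{}^{7}C_{1} \left(\frac{1}{2}\right)^{7} = 7 \]
\[ f(2) = 128\,{}^{7}C_{2} \left(\frac{1}{2}\right)^{7} = 21 \]
\[ f(3) = 128\,{}^{7}C_{3} \left(\frac{1}{2}\right)^{7} = 35 \]
\[ f(4) = 128\,{}^{7}C_{4} \left(\frac{1}{2}\right)^{7} = 35 \]
\[ f(5) = 128\,{}^{7}C_{5} \left(\frac{1}{2}\right)^{7} = 21 \]
\[ f(6) = 128\,{}^{7}C_{6} \left(\frac{1}{2}\right)^{7} = 7 \]
\[ f(7) = 128\,{}^{7}C_{7} \left(\frac{1}{2}\right)^{7} = 1 \]
Binomial distribution
No. of heads x             0    1    2     3     4     5     6    7
Expected frequency f(x)    1    7    21    35    35    21    7    1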
Example 26
Fit a binomial distribution to the following data:
x 0 1 2 3 4 5
f 2 14 20 34 22 8
Solution
\[ \text{Mean} = \frac{\sum fx}{\sum f} = \frac{2(0) + 14(1) + 20(2) + 34(3) + 22(4) + 8(5)}{2 + 14 + 20 + 34 + 22 + 8} = \frac{284}{100} = 2.84 \]
For binomial distribution,
\[ n = 5 \]
\[ \mu = np = 2.84 \]
\[ 5p = 2.84 \]
\[ \therefore\; p = 0.568 \]
\[ q = 1 - p = 1 - 0.568 = 0.432 \]
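With p estimated from the sample mean, the fitted frequencies follow from f(x) = N P(X = x), in the same way as in Example 25. The Python sketch below carries out that step for the data of Example 26; it is an illustration under that assumption, with variable names chosen here, and it prints the fitted values rather than quoting them.

```python
from math import comb

xs = [0, 1, 2, 3, 4, 5]
fs = [2, 14, 20, 34, 22, 8]        # observed frequencies

N = sum(fs)                        # total frequency
n = max(xs)                        # number of trials, taken as the largest x
mean = sum(f * x for f, x in zip(fs, xs)) / N
p = mean / n                       # estimate of p from mean = n p
q = 1 - p

# Expected (theoretical) frequencies f(x) = N * P(X = x)
expected = [N * comb(n, x) * p ** x * q ** (n - x) for x in xs]
for x, observed, fitted in zip(xs, fs, expected):
    print(x, observed, round(fitted, 2))
```

The fitted frequencies add up to N, so rounding them to whole numbers may require a small adjustment to keep the total equal to the observed total.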
Exercise 5.1
1. Find the fallacy if any in the following statements:
(a) The mean of a binomial distribution is 6 and SD is 4.
(b) The mean of a binomial distribution is 9 and its SD is 4.
[Ans.: (a) False, q = 8/3 is impossible
(b) False, q = 16/9 is impossible]
2. The mean and variance of a binomial distribution are 3 and 1.2
respectively. Find n, p, and P(X < 4).
[Ans.: 5, 0.6, 2072/3125]
3. Find the binomial distribution if the mean is 5 and the variance is \(\frac{10}{3}\).
Find P(X = 2).
[Ans.: \(P(X = x) = {}^{15}C_{x} \left(\frac{1}{3}\right)^{x} \left(\frac{2}{3}\right)^{15-x}\), 0.0599]
4. In a binomial distribution, the mean and variance are 4 and 3 respectively.
Find P(X ≥ 1).
[ans.: 0.9899]
5. The odds in favour of X winning a game against Y are 4:3. Find the
probability of Y winning 3 games out of 7 played.
[ans.: 0.0929]
6. On an average, 3 out of 10 students fail in an examination. What is the
probability that out of 10 students that appear for the examination
none will fail?
[ans.: 0.0282]
7. If on the average rain falls on 10 days in every thirty, find the probability
(i) that the first three days of a week will be fine and remaining wet,
and (ii) that rain will fall on just three days of a week.
[ans.: (i) 8/2187 (ii) 280/2187]
8. Two unbiased dice are thrown three times. Find the probability that the
sum nine would be obtained (i) once, and (ii) twice.
11. An insurance salesman sells policies to 5 men, all of identical age and
good health. According to the actuarial tables, the probability that a
man of this particular age will be alive 30 years hence is 2/3. Find the
probability that 30 years hence (i) at least 1 man will be alive, (ii) at
least 3 men will be alive, and (iii) all 5 men will be alive.
[ans.: (i) 242/243 (ii) 64/81 (iii) 32/243]
12. A company has appointed 10 new secretaries out of which 7 are trained.
If a particular executive is to get three secretaries selected at random,
what is the chance that at least one of them will be untrained?
[ans.: 0.7083]
13. The overall pass rate in a university examination is 70%. Four candidates
take up such an examination. What is the probability that (i) at least
one of them will pass? (ii) all of them will pass the examination?
[ans.: (i) 0.9919 (ii) 0.2401]
14. The normal rate of infection of a certain disease in animals is known
to be 25%. In an experiment with a new vaccine, it was observed that
none of the animals caught the infection. Calculate the probability of
the observed result.
[ans.: 729/4096]
15. Suppose that weather records show that on the average, 5 out of 31
days in October are rainy days. Assuming a binomial distribution with
each day of October as an independent trial, find the probability that
the next October will have at most three rainy days.
[ans.: 0.2403]
16. Assuming that half the population of a village is female and assuming
that 100 samples each of 10 individuals are taken, how many samples
would you expect to have 3 or less females?
[ans.: 17]
17. Assuming that half the population of a town is vegetarian so that the
chance of an individual being vegetarian is 1/2, and assuming that 100
investigators can take a sample of 10 individuals to see whether they
are vegetarians, how many investigators would you expect to report
that three people or less in the sample were vegetarians?
[ans.: 17]
x 0 1 2 3 4
f 12 66 109 59 10
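Putting \( p = \dfrac{\lambda}{n} \),
\begin{align*}
P(X = x) &= \frac{n(n-1)(n-2)\cdots(n-x+1)}{x!} \left( \frac{\dfrac{\lambda}{n}}{1 - \dfrac{\lambda}{n}} \right)^{x} \left(1 - \frac{\lambda}{n}\right)^{n} \\
&= \frac{n(n-1)(n-2)\cdots(n-x+1)}{x!} \cdot \frac{\lambda^{x}}{n^{x}} \cdot \frac{1}{\left(1 - \dfrac{\lambda}{n}\right)^{x}} \left(1 - \frac{\lambda}{n}\right)^{n} \\
&= \frac{n(n-1)(n-2)\cdots(n-x+1)}{x!} \cdot \frac{\lambda^{x}}{n^{x}} \left(1 - \frac{\lambda}{n}\right)^{n-x} \\
&= \frac{1 \left(1 - \dfrac{1}{n}\right)\left(1 - \dfrac{2}{n}\right) \cdots \left[1 - \left(\dfrac{x-1}{n}\right)\right]}{x!}\, \lambda^{x} \left(1 - \frac{\lambda}{n}\right)^{n-x}
\end{align*}
Since
\[ \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{n-x} = e^{-\lambda} \]
and
\[ \lim_{n \to \infty} \left(1 - \frac{1}{n}\right) = \lim_{n \to \infty} \left(1 - \frac{2}{n}\right) = \cdots = 1 \]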
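\begin{align*}
\operatorname{Var}(X) &= E(X^{2}) - \mu^{2} \\
&= \sum_{x=0}^{\infty} x^{2} p(x) - \mu^{2} \\
&= \sum_{x=0}^{\infty} x^{2}\, \frac{e^{-\lambda} \lambda^{x}}{x!} - \lambda^{2} \\
&= \sum_{x=0}^{\infty} [x(x-1) + x]\, \frac{e^{-\lambda} \lambda^{x}}{x!} - \lambda^{2} \\
&= \sum_{x=0}^{\infty} \frac{x(x-1)\, e^{-\lambda} \lambda^{x}}{x!} + \sum_{x=0}^{\infty} \frac{x\, e^{-\lambda} \lambda^{x}}{x!} - \lambda^{2} \\
&= \sum_{x=2}^{\infty} \frac{x(x-1)\, e^{-\lambda} \lambda^{x-2} \lambda^{2}}{x(x-1)(x-2) \cdots 1} + \lambda - \lambda^{2} \\
&= e^{-\lambda} \lambda^{2} \sum_{x=2}^{\infty} \frac{\lambda^{x-2}}{(x-2)!} + \lambda - \lambda^{2} \\
&= e^{-\lambda} \lambda^{2} \left(1 + \lambda + \frac{\lambda^{2}}{2!} + \cdots \right) + \lambda - \lambda^{2} \\
&= e^{-\lambda} \lambda^{2} e^{\lambda} + \lambda - \lambda^{2} \\
&= \lambda^{2} + \lambda - \lambda^{2} \\
&= \lambda
\end{align*}
\[ \text{SD} = \sqrt{\text{Variance}} = \sqrt{\lambda} \]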
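The result Var(X) = λ can be verified numerically by summing the series directly. This is a rough check under the assumption that 200 terms are enough for moderate λ; `poisson_moments` is an illustrative name.

```python
from math import exp

def poisson_moments(lam, terms=200):
    """Approximate (mean, variance) by summing the first `terms` probabilities."""
    p = exp(-lam)              # p(0)
    mean = second = 0.0
    for x in range(terms):
        mean += x * p          # accumulates sum of x p(x)
        second += x * x * p    # accumulates sum of x^2 p(x)
        p *= lam / (x + 1)     # p(x + 1) = lam / (x + 1) * p(x)
    return mean, second - mean**2

print(poisson_moments(3.0))    # both values should be close to 3
```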
\[ \frac{e^{-\lambda} \lambda^{x}}{x!} \ge \frac{e^{-\lambda} \lambda^{x+1}}{(x+1)!} \]
\[ 1 \ge \frac{\lambda}{x+1} \]
\[ (x + 1) \ge \lambda \]
\[ x \ge \lambda - 1 \qquad \ldots(5.1) \]
Similarly, for p(x) ≥ p(x – 1),
\[ x \le \lambda \qquad \ldots(5.2) \]
Case II If λ is not an integer, the distribution is unimodal and the mode of the Poisson
distribution is the integral part of λ, i.e., the integer lying between λ – 1 and λ.
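Conditions (5.1) and (5.2) say that a mode x satisfies λ – 1 ≤ x ≤ λ. A minimal sketch follows; the function name is hypothetical, and the integer case (two modes, λ – 1 and λ) is the standard result, assumed here because it is not shown in this passage.

```python
from math import floor

def poisson_modes(lam):
    """Return the integers x with lam - 1 <= x <= lam, i.e. the mode(s)."""
    if lam >= 1 and lam == floor(lam):
        # Integer lam: both lam - 1 and lam satisfy (5.1) and (5.2)
        return [int(lam) - 1, int(lam)]
    # Non-integer lam: the integral part of lam is the only such integer
    return [floor(lam)]

print(poisson_modes(2.5))
print(poisson_modes(3))
```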
Example 1
Find out the fallacy if any in the statement. “The mean of a Poisson
distribution is 2 and the variance is 3.”
Solution
In a Poisson distribution, the mean and variance are same. Hence, the above statement
is false.
Example 2
If the mean of the Poisson distribution is 4, find
P(λ - 2σ < X < λ + 2σ).
Solution
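For a Poisson distribution,
\[ \text{Variance} = \lambda \]
\[ \text{Mean} = \lambda = 4, \quad \sigma = \sqrt{\lambda} = 2 \]
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-4}\, 4^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
\begin{align*}
P(\lambda - 2\sigma < X < \lambda + 2\sigma) &= P(0 < X < 8) \\
&= \sum_{x=1}^{7} P(X = x) \\
&= \sum_{x=1}^{7} \frac{e^{-4}\, 4^{x}}{x!} \\
&= 0.9306
\end{align*}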
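Sums of this kind are tedious by hand. The sketch below evaluates them directly; `poisson_pmf` and `poisson_between` are illustrative helper names rather than anything defined in the text.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_between(lo, hi, lam):
    """P(lo <= X <= hi) for integer bounds lo and hi."""
    return sum(poisson_pmf(x, lam) for x in range(lo, hi + 1))

# Example 2: P(0 < X < 8) with lambda = 4, i.e. x = 1, ..., 7
print(poisson_between(1, 7, 4))
```

The same helper covers strict inequalities by adjusting the integer bounds, for instance P(1 < X < 4) is `poisson_between(2, 3, lam)`.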
Example 3
If the mean of a Poisson variable is 1.8, find (i) P(X > 1), (ii) P(X = 5),
and (iii) P(0 < X < 5).
Solution
For a Poisson distribution,
\[ \lambda = 1.8 \]
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-1.8}\, 1.8^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
(i)
\begin{align*}
P(X > 1) &= 1 - P(X \le 1) \\
&= 1 - [P(X = 0) + P(X = 1)] \\
&= 1 - \sum_{x=0}^{1} \frac{e^{-1.8}\, 1.8^{x}}{x!} \\
&= 0.5372
\end{align*}
(ii)
\[ P(X = 5) = \frac{e^{-1.8}\, 1.8^{5}}{5!} = 0.026 \]
(iii)
\begin{align*}
P(0 < X < 5) &= P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) \\
&= \sum_{x=1}^{4} \frac{e^{-1.8}\, 1.8^{x}}{x!} \\
&= 0.7983
\end{align*}
Example 4
If a random variable has a Poisson distribution such that P(X = 1) =
P(X = 2), find (i) the mean of the distribution, (ii) P(X = 4), (iii) P(X ≥ 1),
and (iv) P(1 < X < 4).
Solution
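For a Poisson distribution,
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
(i)
\begin{align*}
P(X = 1) &= P(X = 2) \\
\frac{e^{-\lambda} \lambda^{1}}{1!} &= \frac{e^{-\lambda} \lambda^{2}}{2!} \\
\lambda^{2} &= 2\lambda \\
\lambda^{2} - 2\lambda &= 0 \\
\lambda(\lambda - 2) &= 0 \\
\lambda = 0 \ &\text{or} \ \lambda = 2
\end{align*}
Since λ ≠ 0, λ = 2
Hence,
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-2}\, 2^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
(ii)
\[ P(X = 4) = \frac{e^{-2}\, 2^{4}}{4!} = 0.0902 \]
(iii)
\begin{align*}
P(X \ge 1) &= 1 - P(X < 1) \\
&= 1 - P(X = 0) \\
&= 1 - \frac{e^{-2}\, 2^{0}}{0!} \\
&= 0.8647
\end{align*}
(iv)
\begin{align*}
P(1 < X < 4) &= P(X = 2) + P(X = 3) \\
&= \sum_{x=2}^{3} \frac{e^{-2}\, 2^{x}}{x!} \\
&= 0.4511
\end{align*}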
Example 5
If X is a Poisson variate such that P(X = 0) = P(X = 1), find P(X = 0)
and using recurrence relation formula, find the probabilities at x = 1, 2,
3, 4, and 5.
Solution
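For a Poisson distribution,
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
\begin{align*}
P(X = 0) &= P(X = 1) \\
\frac{e^{-\lambda} \lambda^{0}}{0!} &= \frac{e^{-\lambda} \lambda^{1}}{1!} \\
\lambda &= 1
\end{align*}
Hence,
\[ P(X = x) = \frac{e^{-1}\, 1^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
(i)
\[ P(X = 0) = \frac{e^{-1}\, 1^{0}}{0!} = 0.3678 \]
(ii) By recurrence relation,
\[ p(x + 1) = \frac{\lambda}{x + 1}\, p(x) \]
\[ p(x + 1) = \frac{1}{x + 1}\, p(x) \qquad [\because \lambda = 1] \]
\begin{align*}
p(1) &= p(0) = 0.3678 \\
p(2) &= \tfrac{1}{2}\, p(1) = \tfrac{1}{2}(0.3678) = 0.1839 \\
p(3) &= \tfrac{1}{3}\, p(2) = \tfrac{1}{3}(0.1839) = 0.0613 \\
p(4) &= \tfrac{1}{4}\, p(3) = \tfrac{1}{4}(0.0613) = 0.015325 \\
p(5) &= \tfrac{1}{5}\, p(4) = \tfrac{1}{5}(0.015325) = 0.003065
\end{align*}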
Example 6
If the variance of a Poisson variate is 3, find the probability that (i) X = 0,
(ii) 0 < X ≤ 3, and (iii) 1 ≤ X < 4.
Solution
For a Poisson distribution,
\[ \text{Variance} = \text{Mean} = \lambda = 3 \]
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-3}\, 3^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
(i)
\[ P(X = 0) = \frac{e^{-3}\, 3^{0}}{0!} = 0.0498 \]
(ii)
\begin{align*}
P(0 < X \le 3) &= P(X = 1) + P(X = 2) + P(X = 3) \\
&= \sum_{x=1}^{3} \frac{e^{-3}\, 3^{x}}{x!} \\
&= 0.5974
\end{align*}
(iii)
\begin{align*}
P(1 \le X < 4) &= P(X = 1) + P(X = 2) + P(X = 3) \\
&= \sum_{x=1}^{3} \frac{e^{-3}\, 3^{x}}{x!} \\
&= 0.5974
\end{align*}
Example 7
If a Poisson distribution is such that (3/2) P(X = 1) = P(X = 3), find
(i) P(X ≥ 1), (ii) P(X ≤ 3), and (iii) P(2 ≤ X ≤ 5).
Solution
For a Poisson distribution,
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
\begin{align*}
\frac{3}{2} P(X = 1) &= P(X = 3) \\
\frac{3}{2} \cdot \frac{e^{-\lambda} \lambda^{1}}{1!} &= \frac{e^{-\lambda} \lambda^{3}}{3!} \\
\frac{3}{2} \lambda &= \frac{\lambda^{3}}{6} \\
\lambda^{3} - 9\lambda &= 0 \\
\lambda(\lambda^{2} - 9) &= 0 \\
\lambda &= 0, \ 3, \ -3
\end{align*}
Since λ > 0, λ = 3
Hence,
\[ P(X = x) = \frac{e^{-3}\, 3^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
(i)
\begin{align*}
P(X \ge 1) &= 1 - P(X < 1) \\
&= 1 - P(X = 0) \\
&= 1 - \frac{e^{-3}\, 3^{0}}{0!} \\
&= 0.9502
\end{align*}
(ii)
\begin{align*}
P(X \le 3) &= P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) \\
&= \sum_{x=0}^{3} \frac{e^{-3}\, 3^{x}}{x!} \\
&= 0.6472
\end{align*}
(iii)
\begin{align*}
P(2 \le X \le 5) &= P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) \\
&= \sum_{x=2}^{5} \frac{e^{-3}\, 3^{x}}{x!} \\
&= 0.7169
\end{align*}
Example 8
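If X is a Poisson variate such that
\[ P(X = 2) = 9 P(X = 4) + 90 P(X = 6) \]
Find (i) the mean of X, (ii) the variance of X, (iii) P(X < 2), (iv) P(X > 4),
and (v) P(X ≥ 1).
Solution
For a Poisson distribution,
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
\begin{align*}
P(X = 2) &= 9 P(X = 4) + 90 P(X = 6) \\
\frac{e^{-\lambda} \lambda^{2}}{2!} &= 9\, \frac{e^{-\lambda} \lambda^{4}}{4!} + 90\, \frac{e^{-\lambda} \lambda^{6}}{6!} \\
&= e^{-\lambda} \lambda^{2} \left( \frac{9\lambda^{2}}{4!} + \frac{90\lambda^{4}}{6!} \right) \\
\frac{1}{2} &= \frac{9\lambda^{2}}{4!} + \frac{90\lambda^{4}}{6!} \\
\frac{1}{2} &= \frac{3\lambda^{2}}{8} + \frac{\lambda^{4}}{8} \\
\lambda^{4} + 3\lambda^{2} - 4 &= 0 \\
\lambda^{2} &= \frac{-3 \pm \sqrt{9 + 16}}{2} = \frac{-3 \pm 5}{2} = 1, \ -4
\end{align*}
Since λ is real and λ > 0, λ² = 1 gives λ = 1
(i) Mean = λ = 1
(ii) Variance = λ = 1
\[ P(X = x) = \frac{e^{-1}\, 1^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
(iii)
\begin{align*}
P(X < 2) &= P(X = 0) + P(X = 1) \\
&= \sum_{x=0}^{1} \frac{e^{-1}\, 1^{x}}{x!} \\
&= 0.7358
\end{align*}
(iv)
\begin{align*}
P(X > 4) &= 1 - P(X \le 4) \\
&= 1 - [P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)] \\
&= 1 - \sum_{x=0}^{4} \frac{e^{-1}\, 1^{x}}{x!} \\
&= 0.00366
\end{align*}
(v)
\begin{align*}
P(X \ge 1) &= 1 - P(X = 0) \\
&= 1 - \frac{e^{-1}\, 1^{0}}{0!} \\
&= 0.6321
\end{align*}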
Example 9
If a Poisson distribution is such that (3/2) P(X = 1) = P(X = 3), find
(i) P(X ≥ 1), (ii) P(X ≤ 3), and (iii) P(2 ≤ X ≤ 5).
Solution
\begin{align*}
\frac{3}{2} P(X = 1) &= P(X = 3) \\
\frac{3}{2} \cdot \frac{e^{-\lambda} \lambda^{1}}{1!} &= \frac{e^{-\lambda} \lambda^{3}}{3!} \\
\frac{3}{2} &= \frac{\lambda^{2}}{6} \\
\lambda^{2} &= 9 \\
\lambda &= \pm 3
\end{align*}
Since λ > 0, λ = 3
\[ P(X = x) = \frac{e^{-3}\, 3^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
(i)
\begin{align*}
P(X \ge 1) &= 1 - P(X < 1) \\
&= 1 - P(X = 0) \\
&= 1 - \frac{e^{-3}\, 3^{0}}{0!} \\
&= 0.9502
\end{align*}
(ii)
\begin{align*}
P(X \le 3) &= P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) \\
&= \sum_{x=0}^{3} \frac{e^{-3}\, 3^{x}}{x!} \\
&= 0.6472
\end{align*}
(iii)
\begin{align*}
P(2 \le X \le 5) &= P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) \\
&= \sum_{x=2}^{5} \frac{e^{-3}\, 3^{x}}{x!} \\
&= 0.7169
\end{align*}
Example 10
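If X is a Poisson variate such that
\[ 3 P(X = 4) = \frac{1}{2} P(X = 2) + P(X = 0) \]
Find (i) the mean of X, and (ii) P(X ≤ 2).
Solution
(i) For a Poisson distribution,
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
\begin{align*}
3 P(X = 4) &= \frac{1}{2} P(X = 2) + P(X = 0) \\
3\, \frac{e^{-\lambda} \lambda^{4}}{4!} &= \frac{1}{2} \cdot \frac{e^{-\lambda} \lambda^{2}}{2!} + \frac{e^{-\lambda} \lambda^{0}}{0!} \\
\frac{\lambda^{4}}{8} &= \frac{\lambda^{2}}{4} + 1 \\
\lambda^{4} - 2\lambda^{2} - 8 &= 0 \\
(\lambda^{2} - 4)(\lambda^{2} + 2) &= 0 \\
\lambda &= \pm 2 \qquad (\because \lambda \text{ is real}) \\
\lambda &= 2 \qquad (\because \lambda > 0)
\end{align*}
Mean = λ = 2
Hence,
\[ P(X = x) = \frac{e^{-2}\, 2^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
(ii)
\begin{align*}
P(X \le 2) &= P(X = 0) + P(X = 1) + P(X = 2) \\
&= \sum_{x=0}^{2} \frac{e^{-2}\, 2^{x}}{x!} \\
&= 0.6767
\end{align*}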
Example 11
A manufacturer of cotterpins knows that 5% of his products are defective.
If he sells cotterpins in boxes of 100 and guarantees that not more than
10 pins will be defective, what is the approximate probability that a box
will fail to meet the guaranteed quality?
Solution
Let p be the probability of a pin being defective.
\[ p = 5\% = 0.05, \quad n = 100 \]
Since p is very small and n is large, Poisson distribution is used.
\[ \lambda = np = 100 \times 0.05 = 5 \]
Let X be the random variable which denotes the number of defective pins in a box of
100. A box fails to meet the guaranteed quality when it contains more than 10
defective pins.
\begin{align*}
P(X > 10) &= 1 - P(X \le 10) \\
&= 1 - \sum_{x=0}^{10} \frac{e^{-5}\, 5^{x}}{x!} \\
&= 0.0137
\end{align*}
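To see how close the Poisson approximation is to the exact binomial model for this kind of problem, both tail probabilities can be computed side by side. The function names below are illustrative, and the comparison itself is an addition for illustration, not part of the worked solution.

```python
from math import comb, exp, factorial

def poisson_tail(k, lam):
    """P(X > k) for a Poisson distribution with mean lam."""
    return 1 - sum(exp(-lam) * lam**x / factorial(x) for x in range(k + 1))

def binomial_tail(k, n, p):
    """P(X > k) for a binomial distribution with parameters n and p."""
    return 1 - sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

# Example 11: more than 10 defective pins in a box of 100 with p = 0.05
print(poisson_tail(10, 5))           # Poisson approximation with lambda = np = 5
print(binomial_tail(10, 100, 0.05))  # exact binomial value for comparison
```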
Example 12
A car-hire firm has two cars, which it hires out day by day. The number
of demands for a car on each day is distributed as a Poisson distribution
with a mean of 1.5. Calculate the proportion of days on which (i) neither
car is used, and (ii) the proportion of days on which some demand is
refused.
Solution
\[ \lambda = 1.5 \]
Let X be the random variable which denotes the number of demands for a car on each
day.
Probability of days on which there are x demands for a car
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-1.5}\, 1.5^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
(i) Proportion or probability of days on which neither car is used
\[ P(X = 0) = \frac{e^{-1.5}\, 1.5^{0}}{0!} = 0.2231 \]
(ii) Proportion or probability of days on which some demand is refused
\begin{align*}
P(X > 2) &= 1 - P(X \le 2) \\
&= 1 - \sum_{x=0}^{2} P(X = x) \\
&= 1 - \sum_{x=0}^{2} \frac{e^{-1.5}\, 1.5^{x}}{x!} \\
&= 0.1912
\end{align*}
Example 13
Six coins are tossed 6400 times. Using the Poisson distribution, what is
the approximate probability of getting six heads 10 times?
Solution
Let p be the probability of getting a head with one coin.
\[ p = \frac{1}{2} \]
Probability of getting 6 heads with 6 coins
\[ = \left(\frac{1}{2}\right)^{6} = \frac{1}{64} \]
\[ n = 6400 \]
\[ \lambda = 6400 \left(\frac{1}{64}\right) = 100 \]
Probability of getting six heads x times
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-100}\, 100^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
Probability of getting six heads 10 times
\[ P(X = 10) = \frac{e^{-100}\, 100^{10}}{10!} = 1.025 \times 10^{-30} \]
Example 14
If 2% of lightbulbs are defective, find the probability that (i) at least one
is defective, and (ii) exactly 7 are defective. Also, find P(1 < X < 8) in a
sample of 100.
Solution
Let p be the probability of defective bulb.
p = 2% = 0.02
n = 100
Since p is very small and n is large, Poisson distribution is used.
l = np = 100(0.02) = 2
Let X be the random variable which denotes the number of defective bulbs in a sample
of 100.
Probability of x defective bulbs in a sample of 100
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-2}\, 2^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
Example 15
An insurance company insured 4000 people against loss of both eyes
in a car accident. Based on previous data, the rates were computed on
the assumption that on the average, 10 persons in 100000 will have
car accidents each year that result in this type of injury. What is the
probability that more than 3 of the insured will collect on their policy in
a given year?
Solution
Let p be the probability of loss of both eyes in a car accident.
\[ p = \frac{10}{100000} = 0.0001 \]
\[ n = 4000 \]
Since p is very small and n is large, Poisson distribution is used.
\[ \lambda = np = 4000(0.0001) = 0.4 \]
Let X be the random variable which denotes the number of car accidents in a group of
4000 people.
Probability of x car accidents in a group of 4000 people
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-0.4}\, 0.4^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
Probability that more than 3 of the insured will collect on their policy, i.e., probability
of more than 3 car accidents in a group of 4000 people
\begin{align*}
P(X > 3) &= 1 - P(X \le 3) \\
&= 1 - [P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)] \\
&= 1 - \sum_{x=0}^{3} P(X = x) \\
&= 1 - \sum_{x=0}^{3} \frac{e^{-0.4}\, 0.4^{x}}{x!} \\
&= 0.00077
\end{align*}
Example 16
Two cards are drawn from a pack of 52 cards, and a trial is a success if both
cards are diamonds. Using Poisson distribution, find the probability of
getting two diamonds at least 3 times in 51 consecutive trials of drawing
two cards each time.
Solution
Let p be the probability of getting two diamonds from a pack of 52 cards.
\[ p = \frac{{}^{13}C_{2}}{{}^{52}C_{2}} = \frac{3}{51}, \quad n = 51 \]
Since p is very small and n is large, Poisson distribution is used.
\[ \lambda = np = 51 \left(\frac{3}{51}\right) = 3 \]
Let X be the random variable which denotes the number of trials in which two
diamond cards are drawn.
Probability of x trials of drawing two diamond cards in 51 trials
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-3}\, 3^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
Probability of getting two diamond cards at least 3 times in 51 trials
\begin{align*}
P(X \ge 3) &= 1 - P(X < 3) \\
&= 1 - [P(X = 0) + P(X = 1) + P(X = 2)] \\
&= 1 - \sum_{x=0}^{2} \frac{e^{-3}\, 3^{x}}{x!} \\
&= 0.5768
\end{align*}
Example 17
Suppose a book of 585 pages contains 43 typographical errors. If
these errors are randomly distributed throughout the book, what is the
probability that 10 pages, selected at random, will be free from errors?
Solution
Let p be the probability of an error in a page.
\[ p = \frac{43}{585} = 0.0735, \quad n = 10 \]
Since p is very small and n is large, Poisson distribution is used.
\[ \lambda = np = 10(0.0735) = 0.735 \]
Let X be the random variable which denotes the number of errors in the 10 pages.
Probability of x errors in the 10 pages selected from a book of 585 pages
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-0.735}\, 0.735^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
Probability that a random sample of 10 pages will contain no error
\[ P(X = 0) = \frac{e^{-0.735}\, 0.735^{0}}{0!} = 0.4795 \]
Example 18
A hospital switchboard receives an average of 4 emergency calls in a
10-minute interval. What is the probability that (i) there are at most
2 emergency calls? (ii) there are exactly 3 emergency calls in an interval
of 10 minutes?
Solution
Let p be the average number of emergency calls received per minute and n the
length of the interval in minutes.
\[ p = \frac{4}{10} = 0.4, \quad n = 10 \]
\[ \lambda = np = 10(0.4) = 4 \]
Let X be the random variable which denotes the number of emergency calls in a
10-minute interval.
Probability of x emergency calls in a 10-minute interval
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-4}\, 4^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
Probability that there are at most 2 emergency calls
\begin{align*}
P(X \le 2) &= P(X = 0) + P(X = 1) + P(X = 2) \\
&= \sum_{x=0}^{2} P(X = x) \\
&= \sum_{x=0}^{2} \frac{e^{-4}\, 4^{x}}{x!} \\
&= 0.238
\end{align*}
Example 19
A manufacturer, who produces medicine bottles, finds that 0.1% of
the bottles are defective. The bottles are packed in boxes containing
500 bottles. A drug manufacturer buys 100 boxes from the producer of
bottles. Using Poisson distribution, find how many boxes will contain
(i) no defective bottles and (ii) at least 2 defective bottles.
Solution
Let p be the probability of a bottle being defective.
\[ p = 0.1\% = 0.001 \]
\[ n = 500 \]
\[ \lambda = np = 500(0.001) = 0.5 \]
Let X be the random variable which denotes the number of defective bottles in a box
of 500.
Probability of x defective bottles in a box of 500
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-0.5}\, 0.5^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
(i) Probability of no defective bottles in a box
\[ P(X = 0) = \frac{e^{-0.5}\, 0.5^{0}}{0!} = 0.6065 \]
Number of boxes containing no defective bottles
f(x) = N P(X = 0) = 100(0.6065) ≈ 61
(ii) Probability of at least 2 defective bottles
\begin{align*}
P(X \ge 2) &= 1 - P(X < 2) \\
&= 1 - [P(X = 0) + P(X = 1)] \\
&= 1 - \sum_{x=0}^{1} P(X = x) \\
&= 1 - \sum_{x=0}^{1} \frac{e^{-0.5}\, 0.5^{x}}{x!} \\
&= 0.0902
\end{align*}
Number of boxes containing at least 2 defective bottles
f(x) = N P(X ≥ 2) = 100(0.0902) ≈ 9
Example 20
In a certain factory turning out blades, there is a small chance of 1/500
for any blade to be defective. The blades are supplied in packets of 10.
Use the Poisson distribution to calculate the approximate number of
packets containing no defective, one defective, and two defective blades
in a consignment of 10000 packets.
Solution
Let p be the probability of a blade being defective.
\[ p = \frac{1}{500}, \quad n = 10, \quad N = 10000 \]
\[ \lambda = np = 10 \left(\frac{1}{500}\right) = 0.02 \]
Let X be the random variable which denotes the number of defective blades in a
packet.
Probability of x defective blades in a packet
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-0.02}\, 0.02^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
(i) Probability of no defective blades in a packet
\[ P(X = 0) = \frac{e^{-0.02}\, 0.02^{0}}{0!} = 0.9802 \]
Number of packets with no defective blades
f(x) = N P(X = 0) = 10000(0.9802) = 9802
(ii) Probability of one defective blade in a packet
\[ P(X = 1) = \frac{e^{-0.02}\, 0.02^{1}}{1!} = 0.0196 \]
Number of packets with one defective blade
f(x) = N P(X = 1) = 10000(0.0196) = 196
(iii) Probability of two defective blades in a packet
\[ P(X = 2) = \frac{e^{-0.02}\, 0.02^{2}}{2!} = 1.96 \times 10^{-4} \]
Number of packets with 2 defective blades
f(x) = N P(X = 2) = 10000(1.96 × 10^{-4}) = 1.96 ≈ 2
Example 21
The number of accidents in a year attributed to taxi drivers in a city
follows Poisson distribution with a mean of 3. Out of 1000 taxi drivers,
Example 22
Fit a Poisson distribution to the following data:
Number of deaths (x) 0 1 2 3 4
Frequency (f) 122 60 15 2 1
Solution
\[ \text{Mean} = \frac{\sum fx}{\sum f} = \frac{122(0) + 60(1) + 15(2) + 2(3) + 1(4)}{122 + 60 + 15 + 2 + 1} = \frac{100}{200} = 0.5 \]
Example 23
Assuming that the typing mistakes per page committed by a typist follows
a Poisson distribution, find the expected frequencies for the following
distribution of typing mistakes:
Number of mistakes per page 0 1 2 3 4 5
Number of pages 40 30 20 15 10 5
Solution
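\[ \text{Mean} = \frac{\sum fx}{\sum f} = \frac{40(0) + 30(1) + 20(2) + 15(3) + 10(4) + 5(5)}{40 + 30 + 20 + 15 + 10 + 5} = \frac{180}{120} = 1.5 \]
For a Poisson distribution,
\[ \lambda = 1.5 \]
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-1.5}\, 1.5^{x}}{x!}, \quad x = 0, 1, 2, 3, 4, 5 \]
\[ N = \sum f = 120 \]
Expected frequency f(x) = N P(X = x)
\[ f(x) = \frac{120\, e^{-1.5}\, 1.5^{x}}{x!} \]
\begin{align*}
f(0) &= \frac{120\, e^{-1.5}\, 1.5^{0}}{0!} = 26.78 \approx 27 \\
f(1) &= \frac{120\, e^{-1.5}\, 1.5^{1}}{1!} = 40.16 \approx 40 \\
f(2) &= \frac{120\, e^{-1.5}\, 1.5^{2}}{2!} = 30.12 \approx 30 \\
f(3) &= \frac{120\, e^{-1.5}\, 1.5^{3}}{3!} = 15.06 \approx 15 \\
f(4) &= \frac{120\, e^{-1.5}\, 1.5^{4}}{4!} = 5.65 \approx 6 \\
f(5) &= \frac{120\, e^{-1.5}\, 1.5^{5}}{5!} = 1.69 \approx 2
\end{align*}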
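The fitting procedure (estimate λ by the sample mean, then scale the probabilities by N) can be sketched in Python as follows. The name `fit_poisson` is an illustrative choice.

```python
from math import exp, factorial

def fit_poisson(values, freqs):
    """Estimate lambda by the sample mean; return (lambda, expected frequencies)."""
    total = sum(freqs)                                        # N = sum of f
    lam = sum(x * f for x, f in zip(values, freqs)) / total   # mean = sum fx / sum f
    expected = [total * exp(-lam) * lam**x / factorial(x) for x in values]
    return lam, expected

# Data of Example 23: typing mistakes per page
lam, expected = fit_poisson(range(6), [40, 30, 20, 15, 10, 5])
print(lam, [round(f) for f in expected])
```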
Exercise 5.2
1. The mean and variance of a probability distribution is 2. Write down
the distribution.
[ans.: P(X = x) = e^{-2} 2^{x} / x!, x = 0, 1, 2, ...]
2. In a Poisson distribution, the probability P(X = 0) is 20 per cent. Find the
mean of the distribution.
[ans.: 1.6094]
3. If X is a Poisson variate and P(X = 0) = 6 P(X = 3), find P(X = 2).
[ans.: 0.1839]
19. It is known that 0.5% of ballpen refills produced by a factory are
defective. These refills are dispatched in packaging of equal numbers.
Using a Poisson distribution, determine the number of refills in a
packing to be sure that at least 95% of them contain no defective
refills.
[ans.: 10]
20. A manufacturer finds that the average demand per day for the
mechanics to repair his new product is 1.5 over a period of one
year and the demand per day is distributed as a Poisson variate. He
employs two mechanics. On how many days in one year (i) would both
mechanics be free? (ii) would some demand be refused?
X 0 1 2 3 4
f 211 90 19 5 0
No. of pieces 43 40 25 10 2
X 0 1 2 3 4 5
f 142 156 69 27 5 1
X 0 1 2 3 4 5 6 7 8
f 56 156 132 92 37 22 4 0 1
Fig. 5.2
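Putting \( \dfrac{x - \mu}{\sigma} = t \), \( dx = \sigma\, dt \)
\begin{align*}
E(X) &= \int_{-\infty}^{\infty} (\mu + \sigma t)\, \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} t^{2}}\, dt \\
&= \mu \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} t^{2}}\, dt + \sigma \int_{-\infty}^{\infty} \frac{t}{\sqrt{2\pi}}\, e^{-\frac{1}{2} t^{2}}\, dt
\end{align*}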
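\begin{align*}
\operatorname{Var}(X) &= \int_{-\infty}^{\infty} (x - \mu)^{2} f(x)\, dx \\
&= \int_{-\infty}^{\infty} (x - \mu)^{2}\, \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^{2}}\, dx
\end{align*}
Putting \( \dfrac{x - \mu}{\sigma} = t \), \( dx = \sigma\, dt \)
\begin{align*}
\operatorname{Var}(X) &= \int_{-\infty}^{\infty} \sigma^{2} t^{2}\, \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} t^{2}}\, dt \\
&= \frac{\sigma^{2}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} t^{2} e^{-\frac{1}{2} t^{2}}\, dt \\
&= \frac{2\sigma^{2}}{\sqrt{2\pi}} \int_{0}^{\infty} t^{2} e^{-\frac{1}{2} t^{2}}\, dt \qquad [\because \text{the integrand is an even function}]
\end{align*}
Putting \( \dfrac{t^{2}}{2} = u \),
\[ t = \sqrt{2u} \]
\[ dt = \frac{\sqrt{2}}{2\sqrt{u}}\, du = \frac{1}{\sqrt{2u}}\, du \]
When t = 0, u = 0
When t = ∞, u = ∞
\begin{align*}
\operatorname{Var}(X) &= \frac{2\sigma^{2}}{\sqrt{2\pi}} \int_{0}^{\infty} 2u\, e^{-u}\, \frac{1}{\sqrt{2u}}\, du \\
&= \frac{2\sigma^{2}}{\sqrt{\pi}} \int_{0}^{\infty} e^{-u} u^{\frac{1}{2}}\, du \\
&= \frac{2\sigma^{2}}{\sqrt{\pi}}\, \Gamma\!\left(\frac{3}{2}\right) \qquad \left[ \because \int_{0}^{\infty} e^{-x} x^{n-1}\, dx = \Gamma(n) \right] \\
&= \frac{2\sigma^{2}}{\sqrt{\pi}} \cdot \frac{1}{2}\, \Gamma\!\left(\frac{1}{2}\right) \\
&= \frac{2\sigma^{2}}{\sqrt{\pi}} \cdot \frac{1}{2} \sqrt{\pi} \\
&= \sigma^{2}
\end{align*}
3. Standard Deviation of the Normal Distribution
\[ \text{SD} = \sigma \]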
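The results E(X) = μ and Var(X) = σ² can be checked by numerical integration. The sketch below uses a simple midpoint rule over μ ± 10σ; the function names, the step count, and the integration width are assumptions chosen for illustration.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and SD sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def normal_mean_variance(mu, sigma, steps=20_000, width=10.0):
    """Midpoint-rule estimates of E(X) and Var(X) over mu +/- width * sigma."""
    a = mu - width * sigma
    h = 2 * width * sigma / steps
    xs = [a + (i + 0.5) * h for i in range(steps)]
    mean = sum(x * normal_pdf(x, mu, sigma) for x in xs) * h
    var = sum((x - mean) ** 2 * normal_pdf(x, mu, sigma) for x in xs) * h
    return mean, var

print(normal_mean_variance(30, 5))   # should be close to (30, 25)
```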
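Putting \( \dfrac{x - \mu}{\sigma} = t \) in the first integral,
\[ dx = \sigma\, dt \]
When x = –∞, t = –∞
When x = μ, t = 0
\begin{align*}
\frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{\mu} e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^{2}}\, dx &= \frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{0} e^{-\frac{1}{2} t^{2}}\, \sigma\, dt \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{0} e^{-\frac{1}{2} t^{2}}\, dt \\
&= \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} e^{-\frac{1}{2} t^{2}}\, dt \qquad [\text{By symmetry}] \\
&= \frac{1}{\sqrt{2\pi}} \sqrt{\frac{\pi}{2}} \\
&= \frac{1}{2} \qquad \ldots(5.4)
\end{align*}
From Eqs (5.3) and (5.4),
\[ \frac{1}{2} + \frac{1}{\sigma \sqrt{2\pi}} \int_{\mu}^{M} e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^{2}}\, dx = \frac{1}{2} \]
\[ \frac{1}{\sigma \sqrt{2\pi}} \int_{\mu}^{M} e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^{2}}\, dx = 0 \]
\[ \int_{\mu}^{M} e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^{2}}\, dx = 0 \]
\[ \mu = M \qquad \left[ \because \text{if } \int_{a}^{b} f(x)\, dx = 0 \text{ then } a = b, \text{ where } f(x) > 0 \right] \]
Hence, mean = median for the normal distribution.
Note For normal distribution,
mean = median = mode = μ
Hence, the normal distribution is symmetrical.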
\begin{align*}
P(x_{1} \le X \le x_{2}) &= P\left[ \left( \frac{x_{1} - \mu}{\sigma} \right) \le \left( \frac{X - \mu}{\sigma} \right) \le \left( \frac{x_{2} - \mu}{\sigma} \right) \right] \\
&= P(z_{1} \le Z \le z_{2})
\end{align*}
where \( z_{1} = \dfrac{x_{1} - \mu}{\sigma} \) and \( z_{2} = \dfrac{x_{2} - \mu}{\sigma} \)
This probability is equal to the area under the standard normal curve between the
ordinates at Z = z1 and Z = z2.
When X < x1, Z < z1, the probability P(Z < z1) can be found for two cases as follows:
Case I If z1 > 0 (Fig. 5.8),
\begin{align*}
P(X < x_{1}) &= P(Z < z_{1}) \\
&= 1 - P(Z \ge z_{1}) \\
&= 1 - [0.5 - P(0 < Z < z_{1})] \\
&= 0.5 + P(0 < Z < z_{1}) \\
&= 0.5 + (\text{Area under the curve from 0 to } z_{1})
\end{align*}
Case II If the standardized value is negative, written as –z1 with z1 > 0 (Fig. 5.9),
\begin{align*}
P(X < x_{1}) &= P(Z < -z_{1}) \\
&= 1 - P(Z \ge -z_{1}) \\
&= 1 - [0.5 + P(-z_{1} \le Z \le 0)] \\
&= 1 - [0.5 + P(0 \le Z \le z_{1})] \qquad [\text{By symmetry}] \\
&= 0.5 - P(0 \le Z \le z_{1}) \\
&= 0.5 - (\text{Area under the curve from 0 to } z_{1})
\end{align*}
Note
(i) \( P(X < x_{1}) = F(x_{1}) = \displaystyle\int_{-\infty}^{x_{1}} f(x)\, dx \)
(iii) If P(X < x1) > 0.5, the point x1 lies to the right of x = μ and the
corresponding value of standard normal variate will be positive (Fig. 5.11).
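The two cases translate directly into code. In the sketch below, the area from 0 to z is obtained from the error function in Python's standard library, which stands in for a table lookup; the function names are illustrative.

```python
from math import erf, sqrt

def area_0_to_z(z):
    """P(0 <= Z <= z) for z >= 0, the area under the standard normal curve."""
    return 0.5 * erf(z / sqrt(2))

def prob_less_than(x1, mu, sigma):
    """P(X < x1) for a normal variate, following Case I and Case II."""
    z1 = (x1 - mu) / sigma
    if z1 >= 0:
        return 0.5 + area_0_to_z(z1)    # Case I: 0.5 + area from 0 to z1
    return 0.5 - area_0_to_z(-z1)       # Case II: 0.5 - area from 0 to |z1|

print(round(area_0_to_z(1.96), 4))
print(round(prob_less_than(20, 12, 4), 4))
```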
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
Example 1
What is the probability that a standard normal variate Z will be (i) greater
than 1.09? (ii) less than –1.65? (iii) lying between –1 and 1.96?
(iv) lying between 1.25 and 2.75?
Solution
(i) Z > 1.09 (Fig. 5.12)
\begin{align*}
P(Z > 1.09) &= 0.5 - P(0 \le Z \le 1.09) \\
&= 0.5 - 0.3621 \\
&= 0.1379
\end{align*}
(ii) Z ≤ –1.65 (Fig. 5.13)
\begin{align*}
P(Z \le -1.65) &= 1 - P(Z > -1.65) \\
&= 1 - [0.5 + P(-1.65 < Z < 0)] \\
&= 1 - [0.5 + P(0 < Z < 1.65)] \qquad [\text{By symmetry}] \\
&= 0.5 - P(0 < Z < 1.65) \\
&= 0.5 - 0.4505 \\
&= 0.0495
\end{align*}
(iii) –1 < Z < 1.96 (Fig. 5.14)
\begin{align*}
P(-1 < Z < 1.96) &= P(-1 < Z < 0) + P(0 < Z < 1.96) \\
&= P(0 < Z < 1) + P(0 < Z < 1.96) \qquad [\text{By symmetry}] \\
&= 0.3413 + 0.4750 \\
&= 0.8163
\end{align*}
Example 2
If X is a normal variate with a mean of 30 and an SD of 5, find the
probabilities that (i) 26 ≤ X ≤ 40, and (ii) X ≥ 45.
Solution
\[ \mu = 30, \quad \sigma = 5 \]
\[ Z = \frac{X - \mu}{\sigma} \]
(i) When X = 26, \( Z = \dfrac{26 - 30}{5} = -0.8 \)
When X = 40, \( Z = \dfrac{40 - 30}{5} = 2 \)
\begin{align*}
P(26 \le X \le 40) &= P(-0.8 \le Z \le 2) \qquad (\text{Fig. 5.16}) \\
&= P(-0.8 \le Z \le 0) + P(0 \le Z \le 2) \\
&= P(0 \le Z \le 0.8) + P(0 \le Z \le 2) \qquad [\text{By symmetry}] \\
&= 0.2881 + 0.4772 \\
&= 0.7653
\end{align*}
(ii) When X = 45, \( Z = \dfrac{45 - 30}{5} = 3 \)
\begin{align*}
P(X \ge 45) &= P(Z \ge 3) \qquad (\text{Fig. 5.17}) \\
&= 0.5 - P(0 < Z < 3) \\
&= 0.5 - 0.4987 \\
&= 0.0013
\end{align*}
Example 3
X is normally distributed and the mean of X is 12 and the SD is 4. Find
out the probability of the following:
(i) X ≥ 20 (ii) X ≤ 20 (iii) 0 ≤ X ≤ 12.
Solution
\[ \mu = 12, \quad \sigma = 4 \]
\[ Z = \frac{X - \mu}{\sigma} \]
(i) When X = 20, \( Z = \dfrac{20 - 12}{4} = 2 \)
\begin{align*}
P(X \ge 20) &= P(Z \ge 2) \qquad (\text{Fig. 5.18}) \\
&= 0.5 - P(0 < Z < 2) \\
&= 0.5 - 0.4772 \\
&= 0.0228
\end{align*}
(ii)
\begin{align*}
P(X \le 20) &= 1 - P(X > 20) \\
&= 1 - 0.0228 \\
&= 0.9772
\end{align*}
(iii) When X = 0, \( Z = \dfrac{0 - 12}{4} = -3 \)
When X = 12, \( Z = \dfrac{12 - 12}{4} = 0 \)
\begin{align*}
P(0 \le X \le 12) &= P(-3 \le Z \le 0) \qquad (\text{Fig. 5.19}) \\
&= P(0 \le Z \le 3) \qquad [\text{By symmetry}] \\
&= 0.4987
\end{align*}
Example 4
If X is normally distributed with a mean of 2 and an SD of 0.1, find
P(|X - 2| ≥ 0.01).
Solution
\[ \mu = 2, \quad \sigma = 0.1 \]
\[ Z = \frac{X - \mu}{\sigma} \]
When X = 1.99, \( Z = \dfrac{1.99 - 2}{0.1} = -0.1 \) (Fig. 5.20)
When X = 2.01, \( Z = \dfrac{2.01 - 2}{0.1} = 0.1 \)
Example 5
If X is a normal variate with a mean of 120 and a standard deviation of
10, find c such that (i) P(X > c) = 0.02, and (ii) P(X< c) = 0.05.
Solution
For normal variate X,
\[ \mu = 120, \quad \sigma = 10 \]
\[ Z = \frac{X - \mu}{\sigma} \]
(i) P(X > c) = 0.02
P(X < c) = 1 – P(X ≥ c)
= 1 – 0.02
= 0.98
Since P(X < c) > 0.5, the corresponding value of Z will be positive.
P(X > c) = P(Z > z1) (Fig. 5.21)
0.02 = 0.5 – P(0 ≤ Z ≤ z1)
P(0 ≤ Z ≤ z1) = 0.48
∴ z1 = 2.05 [From normal table]
\[ Z = \frac{c - 120}{10} = z_{1} = 2.05 \]
\[ c = 2.05(10) + 120 = 140.5 \]
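Reading z1 backwards from the table is an inverse lookup. As a cross-check, Python's standard library offers `statistics.NormalDist.inv_cdf`; the wrapper name `cutoff_for_upper_tail` below is an illustrative assumption. The result differs slightly from the table-based value because the table is read only to two decimals.

```python
from statistics import NormalDist

def cutoff_for_upper_tail(mu, sigma, tail):
    """Return c such that P(X > c) = tail for a normal variate X."""
    # P(X > c) = tail is the same as P(X < c) = 1 - tail
    return NormalDist(mu, sigma).inv_cdf(1 - tail)

print(round(cutoff_for_upper_tail(120, 10, 0.02), 2))
```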
(ii) Since P(X < c) < 0.5, the corresponding value of Z will be negative.
P(X < c) = P(Z < –z1) (Fig. 5.22)
0.05 = 1 – P(Z ≥ –z1)
0.05 = 1 – [0.5 + P(–z1 ≤ Z ≤ 0)]
Example 6
A manufacturer knows from his experience that the resistances of
resistors he produces is normal with m = 100 ohms and SD = s = 2 ohms.
What percentage of resistors will have resistances between 98 ohms and
102 ohms?
Solution
Let X be the random variable which denotes the resistance of a resistor.
\[ \mu = 100, \quad \sigma = 2 \]
\[ Z = \frac{X - \mu}{\sigma} \]
When X = 98, \( Z = \dfrac{98 - 100}{2} = -1 \)
When X = 102, \( Z = \dfrac{102 - 100}{2} = 1 \)
\begin{align*}
P(98 \le X \le 102) &= P(-1 \le Z \le 1) \qquad (\text{Fig. 5.23}) \\
&= P(-1 \le Z \le 0) + P(0 \le Z \le 1) \\
&= P(0 \le Z \le 1) + P(0 \le Z \le 1) \qquad [\text{By symmetry}] \\
&= 2 P(0 \le Z \le 1) \\
&= 2(0.3413) \\
&= 0.6826
\end{align*}
Hence, the percentage of resistors having resistances between 98 ohms and 102 ohms =
68.26%.
Example 7
The average seasonal rainfall in a place is 16 inches with an SD of
4 inches. What is the probability that the rainfall in that place will be
between 20 and 24 inches in a year?
Solution
Let X be the random variable which denotes the seasonal rainfall in a year.
\[ \mu = 16, \quad \sigma = 4 \]
\[ Z = \frac{X - \mu}{\sigma} \]
When X = 20, \( Z = \dfrac{20 - 16}{4} = 1 \)
When X = 24, \( Z = \dfrac{24 - 16}{4} = 2 \)
\begin{align*}
P(20 < X < 24) &= P(1 < Z < 2) \qquad (\text{Fig. 5.24}) \\
&= P(0 < Z < 2) - P(0 < Z < 1) \\
&= 0.4772 - 0.3413 \\
&= 0.1359
\end{align*}
Example 8
The lifetime of a certain kind of batteries has a mean life of 400 hours
and the standard deviation as 45 hours. Assuming the distribution of
lifetime to be normal, find (i) the percentage of batteries with a lifetime
of at least 470 hours, (ii) the proportion of batteries with a lifetime
between 385 and 415 hours, and (iii) the minimum life of the best 5%
of batteries.
Solution
Let X be the random variable which denotes the lifetime of a battery.
\[ \mu = 400, \quad \sigma = 45 \]
\[ Z = \frac{X - \mu}{\sigma} \]
(i) When X = 470,
\[ Z = \frac{470 - 400}{45} = 1.56 \]
\begin{align*}
P(X \ge 470) &= P(Z \ge 1.56) \qquad (\text{Fig. 5.25}) \\
&= 0.5 - P(0 < Z < 1.56) \\
&= 0.5 - 0.4406 \\
&= 0.0594
\end{align*}
Hence, the percentage of batteries with a lifetime of at least 470 hours = 5.94%.
Example 9
If the weights of 300 students are normally distributed with a mean of
68 kg and a standard deviation of 3 kg, how many students have weights
(i) greater than 72 kg? (ii) less than or equal to 64 kg? (iii) between
65 kg and 71 kg inclusive?
Solution
Let X be the random variable which denotes the weight of a student.
\[ \mu = 68, \quad \sigma = 3, \quad N = 300 \]
\[ Z = \frac{X - \mu}{\sigma} \]
(i) When X = 72, \( Z = \dfrac{72 - 68}{3} = 1.33 \)
Fig. 5.28
= 27.54
≈ 28
(iii) When X = 65, \( Z = \dfrac{65 - 68}{3} = -1 \)
When X = 71, \( Z = \dfrac{71 - 68}{3} = 1 \)
Example 10
The mean yield for a one-acre plot is 662 kg with an SD of 32 kg.
Assuming normal distribution, how many one-acre plots in a batch of
1000 plots would you expect to have yields (i) over 700 kg? (ii) below
650 kg? (iii) What is the lowest yield of the best 100 plots?
Solution
Let X be the random variable which denotes the yield for the one-acre plot.
\[ \mu = 662, \quad \sigma = 32, \quad N = 1000 \]
\[ Z = \frac{X - \mu}{\sigma} \]
(i) When X = 700, \( Z = \dfrac{700 - 662}{32} = 1.19 \)
\begin{align*}
P(X > 700) &= P(Z > 1.19) \qquad (\text{Fig. 5.31}) \\
&= 0.5 - P(0 \le Z \le 1.19) \\
&= 0.5 - 0.3830 \\
&= 0.1170
\end{align*}
Expected number of plots with yields over 700 kg = N P(X > 700)
= 1000(0.1170)
= 117
= 0.352
Expected number of plots with yields below 650 kg = N P(X < 650)
= 1000(0.352)
= 352
(iii) The lowest yield, say, x1 of the best 100 plots is given by
\[ P(X > x_{1}) = \frac{100}{1000} = 0.1 \]
When X = x1, \( Z = \dfrac{x_{1} - 662}{32} = z_{1} \)
\begin{align*}
P(X > x_{1}) &= P(Z > z_{1}) \\
0.1 &= 0.5 - P(0 \le Z \le z_{1}) \\
P(0 \le Z \le z_{1}) &= 0.4
\end{align*}
∴ z1 = 1.28 (approx.) [From normal table]
\[ \frac{x_{1} - 662}{32} = 1.28 \]
\[ x_{1} = 702.96 \]
Hence, the best 100 plots have yields over 702.96 kg.
Example 11
Assume that the mean height of Indian soldiers is 68.22 inches with a
variance of 10.8 square inches. How many soldiers in a regiment of 1000
would you expect to be over 6 feet tall?
Solution
Let X be the continuous random variable which denotes the heights of Indian
soldiers.
$\mu = 68.22,\ \sigma^2 = 10.8,\ N = 1000$
$\sigma = 3.29$
$$Z = \frac{X - \mu}{\sigma}$$
When $X$ = 6 feet = 72 inches,
$$Z = \frac{72 - 68.22}{3.29} = 1.15$$
$$P(X > 72) = P(Z > 1.15) \quad \text{(Fig. 5.33)}$$
$$= 0.5 - P(0 \le Z \le 1.15) = 0.5 - 0.3749 = 0.1251$$
Expected number of Indian soldiers having heights over 6 feet (72 inches)
$= N\,P(X > 72) = 1000(0.1251) = 125.1 \approx 125$
Example 12
The marks obtained by students in a college are normally distributed
with a mean of 65 and a variance of 25. If 3 students are selected at
random from this college, what is the probability that at least one of
them would have scored more than 75 marks?
Solution
Let X be the continuous random variable which denotes the marks of a student.
$\mu = 65,\ \sigma^2 = 25$
$\sigma = 5$
$$Z = \frac{X - \mu}{\sigma}$$
When $X = 75$, $Z = \dfrac{75 - 65}{5} = 2$
$$P(X > 75) = P(Z > 2) \quad \text{(Fig. 5.34)}$$
$$= 0.5 - P(0 \le Z \le 2) = 0.5 - 0.4772 = 0.0228$$
If $p$ is the probability of scoring more than 75 marks,
$p = 0.0228$, $q = 1 - p = 1 - 0.0228 = 0.9772$
P(at least one student would have scored more than 75 marks)
$$= \sum_{x=1}^{3} {}^{3}C_x\, p^x q^{3-x} = \sum_{x=1}^{3} {}^{3}C_x\, (0.0228)^x (0.9772)^{3-x} = 0.0668$$
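The binomial sum over x = 1, 2, 3 equals the complement of "no student scores above 75", which is quicker to evaluate. The sketch below compares the two forms, taking p = 0.0228 from the table-based step above; the variable names are illustrative.

```python
from math import comb

p = 0.0228   # P(X > 75) for one student, from the normal table
n = 3        # students selected

# Direct sum over x = 1, 2, 3 successes.
sum_form = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(1, n + 1))

# Complement of "none of the three scores above 75".
complement_form = 1 - (1 - p)**n

print(round(sum_form, 4), round(complement_form, 4))
```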
Example 13
Find the mean and standard deviation of a normal distribution in which
7% of items are under 35 and 89% are under 63.
Solution
Let $\mu$ be the mean and $\sigma$ be the standard deviation of the normal curve.
$$P(X < 35) = 0.07$$
$$P(X < 63) = 0.89$$
$$= 0.5 - 0.07 = 0.43$$
$z_1 = 1.48$ [From normal table]
$$P(0 < Z < z_2) = 0.5 - P(Z \ge z_2) = 0.5 - 0.11 = 0.39$$
$z_2 = 1.23$ [From normal table]
Hence,
$$\frac{35 - \mu}{\sigma} = -1.48$$
$$-1.48\,\sigma + \mu = 35 \qquad \ldots(1)$$
and
$$\frac{63 - \mu}{\sigma} = 1.23$$
$$1.23\,\sigma + \mu = 63 \qquad \ldots(2)$$
Solving Eqs (1) and (2),
$$\mu = 50.29,\quad \sigma = 10.33$$
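Equations (1) and (2) form a pair of linear equations in the mean and standard deviation. A minimal sketch of the same procedure, assuming `NormalDist().inv_cdf` replaces the table lookup; the helper name `fit_normal_from_two_quantiles` is hypothetical.

```python
from statistics import NormalDist

def fit_normal_from_two_quantiles(x1, p1, x2, p2):
    """Return (mean, sd) of the normal curve with P(X < x1) = p1 and P(X < x2) = p2."""
    std_normal = NormalDist()
    z1 = std_normal.inv_cdf(p1)   # negative when p1 < 0.5
    z2 = std_normal.inv_cdf(p2)
    # Subtract x1 = mean + z1*sd from x2 = mean + z2*sd to isolate sd.
    sd = (x2 - x1) / (z2 - z1)
    mean = x1 - z1 * sd
    return mean, sd

mean, sd = fit_normal_from_two_quantiles(35, 0.07, 63, 0.89)
print(f"mean = {mean:.2f}, sd = {sd:.2f}")
```

Small differences from 10.33 in the second decimal of the standard deviation come from the two-decimal rounding of the tabulated z-values.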
Example 14
In an examination, it is laid down that a student passes if he secures 40 %
or more. He is placed in the first, second, and third division according to
whether he secures 60% or more marks, between 50% and 60% marks
and between 40% and 50% marks respectively. He gets a distinction in
case he secures 75% or more. It is noticed from the result that 10% of
$$= 0.5 - 0.10 = 0.40$$
$z_1 = 1.28$ [From normal table]
$$P(0 < Z < z_2) = 0.5 - P(Z \ge z_2) = 0.5 - 0.05 = 0.45$$
$z_2 = 1.64$ [From normal table]
Hence,
$$\frac{40 - \mu}{\sigma} = -1.28$$
$$\mu - 1.28\,\sigma = 40 \qquad \ldots(1)$$
and
$$\frac{75 - \mu}{\sigma} = 1.64$$
$$\mu + 1.64\,\sigma = 75 \qquad \ldots(2)$$
Example 1
Fit a normal curve from the following distribution. It is given that the
mean of the distribution is 43.7 and its standard deviation is 14.8.
Frequency 20 28 40 60 32 20 8
Solution
$\mu = 43.7,\ \sigma = 14.8,\ N = \sum f = 200$
The series is converted into an inclusive series.
Example 2
Fit a normal distribution to the following data:
X 125 135 145 155 165 175 185 195 205
Y 1 1 14 22 25 19 13 3 2
Exercise 5.3
1. If X is normally distributed with a mean and standard deviation of 4,
find (i) P(5 ≤ X ≤ 10), (ii) P(X ≥ 15), (iii) P(10 ≤ X ≤ 15), and (iv) P(X ≤ 5).
[Ans.: (i) 0.3345 (ii) 0.003 (iii) 0.0638 (iv) 0.4013]
2. A normal distribution has a mean of 5 and a standard deviation of 3.
What is the probability that the deviation from the mean of an item
taken at random will be negative?
[Ans.: 0.0575]
3. If X is a normal variate with a mean of 30 and an SD of 6, find the value
of X = x1 such that P(X ≥ x1) = 0.05.
[Ans.: 39.84]
4. If X is a normal variate with a mean of 25 and an SD of 5, find the value of
X = x1 such that P(X ≤ x1) = 0.01.
[Ans.: 11.02]
5. The weights of 4000 students are found to be normally distributed with
a mean of 50 kg and an SD of 5 kg. Find the probability that a student
selected at random will have weight (i) less than 45 kg, and (ii) between
45 and 60 kg.
[Ans.: (i) 0.1587 (ii) 0.8185]
6. The daily sales of a firm are normally distributed with a mean of ₹ 8000
and a variance of ₹ 10000. (i) What is the probability that on a certain
day the sales will be less than ₹ 8210? (ii) What is the percentage of
days on which the sales will be between ₹ 8100 and ₹ 8200?
[Ans.: (i) 0.482 (ii) 14%]
7. The mean height of Indian soldiers is 68.22 inches with a variance of
10.8 square inches. Find the expected number of soldiers in a regiment
of 1000 whose height will be more than 6 feet.
[Ans.: 125]
8. The life of army shoes is normally distributed with a mean of 8 months
and a standard deviation of 2 months. If 5000 pairs are issued, how
many pairs would be expected to need replacement after 12 months?
[Ans.: 2386]
9. In an intelligence test administered to 1000 students, the average was
42 and the standard deviation was 24. Find the number of students
(i) exceeding 50, (ii) between 30 and 54, and (iii) the least score of the
top 100 students.
[Ans.: (i) 129 (ii) 383 (iii) 72.72]
10. In a test of 2000 electric bulbs, it was found that the life of a
particular make was normally distributed with an average life
of 2040 hours and a standard deviation of 60 hours. Estimate the
number of bulbs likely to burn for (i) more than 2150 hours, and
(ii) less than 1950 hours.
[Ans.: (i) 67 (ii) 184]
11. The marks of 1000 students of a university are found to be normally
distributed with a mean of 70 and a standard deviation of 5. Estimate
the number of students whose marks will be (i) between 60 and 75,
(ii) more than 75, and (iii) less than 68.
14. The marks obtained by students in an examination follow a normal
distribution. If 30% of the students got marks below 35 and 10% got
marks above 60, find the mean, the standard deviation, and the
percentage of students who got marks between 40 and 50.
[Ans.: 42.23, 13.88, 28%]
15. Fit a normal distribution to the following data:
[Ans.: Expected frequency: 3, 31, 148, 322, 319, 144, 30, 3]
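$$= \frac{\displaystyle\int_{s+t}^{\infty} \lambda e^{-\lambda x}\,dx}{\displaystyle\int_{s}^{\infty} \lambda e^{-\lambda x}\,dx}
= \frac{\lambda\left[\dfrac{e^{-\lambda x}}{-\lambda}\right]_{s+t}^{\infty}}{\lambda\left[\dfrac{e^{-\lambda x}}{-\lambda}\right]_{s}^{\infty}}
= \frac{e^{-\lambda (s+t)}}{e^{-\lambda s}} = e^{-\lambda t} \qquad \ldots(5.5)$$
$$P(X > t) = \int_{t}^{\infty} \lambda e^{-\lambda x}\,dx = \lambda\left[\frac{e^{-\lambda x}}{-\lambda}\right]_{t}^{\infty} = e^{-\lambda t} \qquad \ldots(5.6)$$
From Eq. (5.5) and Eq. (5.6),
$$P(X > s + t \mid X > s) = P(X > t), \quad \text{for } s, t > 0$$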
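The memoryless property can be checked numerically for any particular rate and pair of times. A minimal sketch, with λ, s and t chosen arbitrarily for illustration and a hypothetical helper `survival`:

```python
from math import exp, isclose

def survival(x, lam):
    """P(X > x) = e^(-lam * x) for an exponential variable with rate lam, x >= 0."""
    return exp(-lam * x)

lam, s, t = 0.5, 3.0, 2.0   # illustrative values, not taken from the text

# P(X > s + t | X > s) = P(X > s + t) / P(X > s), since {X > s + t} is inside {X > s}.
conditional = survival(s + t, lam) / survival(s, lam)
unconditional = survival(t, lam)

print(conditional, unconditional)
```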
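$$= \lambda \cdot \frac{1}{\lambda^2} = \frac{1}{\lambda}$$
$$= \int_{0}^{\infty} x^2 \lambda e^{-\lambda x}\,dx$$
$$= \lambda\left[x^2\,\frac{e^{-\lambda x}}{-\lambda} - 2x\,\frac{e^{-\lambda x}}{\lambda^2} + 2\,\frac{e^{-\lambda x}}{-\lambda^3}\right]_{0}^{\infty}
= \lambda\left(\frac{2}{\lambda^3}\right) = \frac{2}{\lambda^2}$$
Substituting in Eq. (5.7),
$$\mathrm{Var}(X) = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2} \qquad \left[\because \mu = \frac{1}{\lambda}\right]$$
$$\mathrm{SD} = \sqrt{\mathrm{Var}(X)} = \sqrt{\frac{1}{\lambda^2}} = \frac{1}{\lambda}$$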
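$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x > 0 \\ 0, & x \le 0 \end{cases}$$
$$-\left(e^{-\lambda M} - 1\right) = \frac{1}{2}$$
$$-e^{-\lambda M} = \frac{1}{2} - 1 = -\frac{1}{2}$$
$$e^{-\lambda M} = \frac{1}{2}$$
$$-\lambda M \log e = \log\frac{1}{2} = -\log 2$$
$$\lambda M = \log 2$$
$$M = \frac{1}{\lambda}\log 2$$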
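As a quick check of the median formula, the cumulative probability at M = (1/λ) log 2 should be exactly one half. The sketch below assumes an arbitrary illustrative rate; `exponential_cdf` is a hypothetical helper.

```python
from math import exp, log

def exponential_cdf(x, lam):
    """P(X <= x) for an exponential variable with rate lam."""
    return 1 - exp(-lam * x) if x > 0 else 0.0

lam = 0.25                 # illustrative rate
median = log(2) / lam      # M = (1/lam) * log 2, natural logarithm
mean = 1 / lam

print(median, exponential_cdf(median, lam))
```

Since log 2 is about 0.693, the median always lies below the mean 1/λ.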
Example 1
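Let X be a random variable with pdf
$$f(x) = \begin{cases} \dfrac{1}{5}\, e^{-x/5}, & x > 0 \\ 0, & \text{otherwise} \end{cases}$$
Find (i) P(X > 5) (ii) P(3 ≤ X ≤ 6) (iii) mean (iv) variance.
Solution
$$\lambda = \frac{1}{5}$$
(i)
$$P(X > 5) = \int_{5}^{\infty} f(x)\,dx = \int_{5}^{\infty} \frac{1}{5}\, e^{-x/5}\,dx = \frac{1}{5}\left[\frac{e^{-x/5}}{-\frac{1}{5}}\right]_{5}^{\infty} = -\left[e^{-x/5}\right]_{5}^{\infty}$$
$$= -\left(e^{-\infty} - e^{-1}\right) = e^{-1} = 0.3679$$
(ii)
$$P(3 \le X \le 6) = \int_{3}^{6} f(x)\,dx = \int_{3}^{6} \frac{1}{5}\, e^{-x/5}\,dx = \frac{1}{5}\left[\frac{e^{-x/5}}{-\frac{1}{5}}\right]_{3}^{6} = -\left[e^{-x/5}\right]_{3}^{6}$$
$$= -\left(e^{-6/5} - e^{-3/5}\right) = e^{-3/5} - e^{-6/5} = 0.2476$$
(iii) Mean $\mu = \dfrac{1}{\lambda} = \dfrac{1}{\left(\frac{1}{5}\right)} = 5$
(iv) Variance $= \mathrm{Var}(X) = \dfrac{1}{\lambda^2} = \dfrac{1}{\left(\frac{1}{5}\right)^2} = 25$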
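The closed-form answer in part (ii) can be compared with a direct numerical integration of the density. This is only a sketch: the midpoint rule and the helper names `pdf` and `integrate` are choices made for illustration, not part of the worked solution.

```python
from math import exp

def pdf(x, lam=1 / 5):
    """Density of Example 1: lam * e^(-lam * x) for x > 0, else 0."""
    return lam * exp(-lam * x) if x > 0 else 0.0

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

p_3_to_6_numeric = integrate(pdf, 3, 6)
p_3_to_6_closed = exp(-3 / 5) - exp(-6 / 5)   # result of part (ii)
p_gt_5_closed = exp(-1)                       # result of part (i)

print(round(p_3_to_6_numeric, 4), round(p_3_to_6_closed, 4), round(p_gt_5_closed, 4))
```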
Example 2
A random variable has pdf $f(x) = ce^{-2x}$ for $x > 0$. Find (i) $P(X > 2)$
(ii) $P\left(X < \dfrac{1}{c}\right)$.
Solution
Since f(x) is a probability density function,
$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$
$$\int_{0}^{\infty} ce^{-2x}\,dx = 1$$
$$\left[\frac{ce^{-2x}}{-2}\right]_{0}^{\infty} = 1$$
$$-\frac{c}{2}\left[e^{-2x}\right]_{0}^{\infty} = 1$$
$$-\frac{c}{2}\left(e^{-\infty} - e^{0}\right) = 1$$
$$\frac{c}{2} = 1$$
$$c = 2$$
$$\therefore f(x) = 2e^{-2x}, \quad x > 0$$
(i)
$$P(X > 2) = \int_{2}^{\infty} f(x)\,dx = \int_{2}^{\infty} 2e^{-2x}\,dx = 2\left[\frac{e^{-2x}}{-2}\right]_{2}^{\infty} = -\left[e^{-2x}\right]_{2}^{\infty}$$
$$= -\left(e^{-\infty} - e^{-4}\right) = e^{-4} = 0.0183$$
(ii)
$$P\left(X < \frac{1}{c}\right) = P\left(X < \frac{1}{2}\right) = \int_{0}^{1/2} f(x)\,dx = \int_{0}^{1/2} 2e^{-2x}\,dx = 2\left[\frac{e^{-2x}}{-2}\right]_{0}^{1/2} = -\left[e^{-2x}\right]_{0}^{1/2}$$
$$= -\left(e^{-1} - e^{0}\right) = -e^{-1} + 1 = 0.6321$$
Example 3
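If X is a random variable which follows an exponential distribution with
parameter $\lambda$ with $P(X \le 1) = P(X > 1)$, find Var(X).
Solution
Since X is a random variable which follows an exponential distribution,
$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0$$
$$P(X \le 1) = P(X > 1)$$
$$1 - P(X > 1) = P(X > 1)$$
$$2P(X > 1) = 1$$
$$P(X > 1) = \frac{1}{2}$$
$$\int_{1}^{\infty} f(x)\,dx = \frac{1}{2}$$
$$\int_{1}^{\infty} \lambda e^{-\lambda x}\,dx = \frac{1}{2}$$
$$\lambda\left[\frac{e^{-\lambda x}}{-\lambda}\right]_{1}^{\infty} = \frac{1}{2}$$
$$-\left[e^{-\lambda x}\right]_{1}^{\infty} = \frac{1}{2}$$
$$-\left(e^{-\infty} - e^{-\lambda}\right) = \frac{1}{2}$$
$$e^{-\lambda} = \frac{1}{2}$$
$$\frac{1}{e^{\lambda}} = \frac{1}{2}$$
$$e^{\lambda} = 2$$
$$\lambda = \log_e 2$$
$$\mathrm{Var}(X) = \frac{1}{\lambda^2} = \frac{1}{(\log_e 2)^2}$$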
Example 4
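If X is an exponentially distributed random variable with parameter $\lambda$,
find the value of k such that $\dfrac{P(X > k)}{P(X \le k)} = a$.
Solution
$$\frac{P(X > k)}{P(X \le k)} = a$$
$$\frac{P(X > k)}{1 - P(X > k)} = a$$
$$P(X > k) = a\,[1 - P(X > k)]$$
$$P(X > k)(1 + a) = a$$
$$P(X > k) = \frac{a}{1 + a}$$
$$\int_{k}^{\infty} f(x)\,dx = \frac{a}{1 + a}$$
$$\int_{k}^{\infty} \lambda e^{-\lambda x}\,dx = \frac{a}{1 + a}$$
$$\lambda\left[\frac{e^{-\lambda x}}{-\lambda}\right]_{k}^{\infty} = \frac{a}{1 + a}$$
$$-\left[e^{-\lambda x}\right]_{k}^{\infty} = \frac{a}{1 + a}$$
$$-\left(e^{-\infty} - e^{-\lambda k}\right) = \frac{a}{1 + a}$$
$$e^{-\lambda k} = \frac{a}{1 + a}$$
$$\frac{1}{e^{\lambda k}} = \frac{a}{1 + a}$$
$$e^{\lambda k} = \frac{1 + a}{a}$$
$$\lambda k = \log\left(\frac{1 + a}{a}\right)$$
$$k = \frac{1}{\lambda}\log\left(\frac{1 + a}{a}\right)$$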
Example 5
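If the density function of a continuous random variable X is
$f(x) = ce^{-b(x - a)},\ a \le x$, where a, b, c are constants, show that
$b = c = \dfrac{1}{\sigma}$ and $a = \mu - \sigma$, where $\mu = E(X)$ and $\sigma^2 = \mathrm{Var}(X)$.
Solution
Since f(x) is a density function,
$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$
$$\int_{a}^{\infty} ce^{-b(x - a)}\,dx = 1$$
$$c\left[\frac{e^{-b(x - a)}}{-b}\right]_{a}^{\infty} = 1$$
$$-\frac{c}{b}\left[e^{-b(x - a)}\right]_{a}^{\infty} = 1$$
$$-\frac{c}{b}\left(e^{-\infty} - e^{0}\right) = 1$$
$$\frac{c}{b} = 1$$
$$b = c \qquad \ldots(1)$$
$$\mu = E(X) = \int_{a}^{\infty} bx\,e^{-b(x - a)}\,dx$$
$$= be^{ab}\left[x\left(\frac{e^{-bx}}{-b}\right) - \frac{e^{-bx}}{b^2}\right]_{a}^{\infty}$$
$$= be^{ab}\left(\frac{a}{b}\,e^{-ab} + \frac{1}{b^2}\,e^{-ab}\right)$$
$$= a + \frac{1}{b} \qquad \ldots(2)$$
$$E(X^2) = \int_{a}^{\infty} bx^2 e^{-b(x - a)}\,dx$$
$$= be^{ab}\left[x^2\left(\frac{e^{-bx}}{-b}\right) - 2x\left(\frac{e^{-bx}}{(-b)^2}\right) + 2\left(\frac{e^{-bx}}{(-b)^3}\right)\right]_{a}^{\infty}$$
$$= b\left(\frac{a^2}{b} + \frac{2a}{b^2} + \frac{2}{b^3}\right)$$
$$= \frac{1}{b^2}\left(a^2b^2 + 2ab + 2\right)$$
$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$$
$$\sigma^2 = \frac{1}{b^2}\left(a^2b^2 + 2ab + 2\right) - \left(a^2 + \frac{2a}{b} + \frac{1}{b^2}\right) = \frac{1}{b^2}$$
$$\sigma = \frac{1}{b} \qquad \ldots(3)$$
From Eq. (1) and (3),
$$b = c = \frac{1}{\sigma}$$
Subtracting Eq. (3) from Eq. (2),
$$\mu - \sigma = a$$
$$\therefore a = \mu - \sigma$$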
Example 6
The mileage which car owners get with a certain kind of radial tire
is a random variable having an exponential distribution with mean
4000 km. Find the probabilities that one of these tires will last (i) at least
2000 km (ii) at most 3000 km.
Solution
Let X be the random variable which denotes the mileage obtained with the tire.
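Mean $\mu = \dfrac{1}{\lambda} = 4000$ km
$$f(x) = \lambda e^{-\lambda x} = \frac{1}{4000}\, e^{-x/4000}, \quad x > 0$$
(i)
$$P(X \ge 2000) = \int_{2000}^{\infty} f(x)\,dx = \int_{2000}^{\infty} \frac{1}{4000}\, e^{-x/4000}\,dx = \frac{1}{4000}\left[\frac{e^{-x/4000}}{-\frac{1}{4000}}\right]_{2000}^{\infty}$$
$$= -\left[e^{-x/4000}\right]_{2000}^{\infty} = -\left(e^{-\infty} - e^{-0.5}\right) = e^{-0.5} = 0.6065$$
(ii)
$$P(X \le 3000) = \int_{0}^{3000} f(x)\,dx = \int_{0}^{3000} \frac{1}{4000}\, e^{-x/4000}\,dx = \frac{1}{4000}\left[\frac{e^{-x/4000}}{-\frac{1}{4000}}\right]_{0}^{3000}$$
$$= -\left[e^{-x/4000}\right]_{0}^{3000} = -\left(e^{-0.75} - e^{0}\right) = -e^{-0.75} + 1 = 0.5276$$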
Example 7
If the number of kilometers that a car can run before its battery wears
out is exponentially distributed with an average value of 10000 km and
if the owner desires to take a 5000 km trip, what is the probability that
he will be able to complete his trip without having to replace the car
battery? Assume that the car has been used for some time.
Solution
Let X be the random variable which denotes the number of kilometers that a car can
run before its battery wears out.
Mean $\mu = \dfrac{1}{\lambda} = 10000$
$$f(x) = \lambda e^{-\lambda x} = \frac{1}{10000}\, e^{-x/10000}, \quad x > 0$$
$$P(X > 5000) = \int_{5000}^{\infty} f(x)\,dx = \int_{5000}^{\infty} \frac{1}{10000}\, e^{-x/10000}\,dx = \frac{1}{10000}\left[\frac{e^{-x/10000}}{-\frac{1}{10000}}\right]_{5000}^{\infty}$$
$$= -\left[e^{-x/10000}\right]_{5000}^{\infty} = -\left(e^{-\infty} - e^{-0.5}\right) = e^{-0.5} = 0.6065$$
Example 8
The average time it takes to serve a customer at a petrol pump is 6 minutes.
The service time follows exponential distribution. Calculate the
probability that
(i) A customer will take less than 2 minutes to complete the service.
(ii) A customer will take between 4 and 5 minutes to get the service.
(iii) A customer will take more than 10 minutes for his service.
Solution
Let X be the random variable which denotes the service time.
Mean $\mu = \dfrac{1}{\lambda} = 6$
$$f(x) = \lambda e^{-\lambda x} = \frac{1}{6}\, e^{-x/6}, \quad x > 0$$
(i)
$$P(X < 2) = \int_{0}^{2} f(x)\,dx = \int_{0}^{2} \frac{1}{6}\, e^{-x/6}\,dx = \frac{1}{6}\left[\frac{e^{-x/6}}{-\frac{1}{6}}\right]_{0}^{2} = -\left[e^{-x/6}\right]_{0}^{2}$$
$$= -\left(e^{-1/3} - e^{0}\right) = -e^{-1/3} + 1 = 0.2835$$
(ii)
$$P(4 < X < 5) = \int_{4}^{5} f(x)\,dx = \int_{4}^{5} \frac{1}{6}\, e^{-x/6}\,dx = \frac{1}{6}\left[\frac{e^{-x/6}}{-\frac{1}{6}}\right]_{4}^{5} = -\left[e^{-x/6}\right]_{4}^{5}$$
$$= -\left(e^{-5/6} - e^{-2/3}\right) = 0.0788$$
(iii)
$$P(X > 10) = \int_{10}^{\infty} f(x)\,dx = \int_{10}^{\infty} \frac{1}{6}\, e^{-x/6}\,dx = \frac{1}{6}\left[\frac{e^{-x/6}}{-\frac{1}{6}}\right]_{10}^{\infty} = -\left[e^{-x/6}\right]_{10}^{\infty}$$
$$= -\left(e^{-\infty} - e^{-10/6}\right) = e^{-10/6} = 0.1889$$
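Every part of Example 8 reduces to differences of the cumulative distribution function F(x) = 1 − e^(−x/μ). A minimal sketch of that shortcut; the helper name `exponential_cdf` and its `mean` parameter are illustrative choices.

```python
from math import exp

def exponential_cdf(x, mean):
    """P(X <= x) for an exponential variable with the given mean (rate = 1/mean)."""
    return 1 - exp(-x / mean) if x > 0 else 0.0

mean_service = 6  # minutes, from Example 8

p_less_2 = exponential_cdf(2, mean_service)
p_4_to_5 = exponential_cdf(5, mean_service) - exponential_cdf(4, mean_service)
p_more_10 = 1 - exponential_cdf(10, mean_service)

print(round(p_less_2, 4), round(p_4_to_5, 4), round(p_more_10, 4))
```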
Example 9
The length of time X to complete a job is exponentially distributed with
$E(X) = \mu = \dfrac{1}{\lambda} = 10$ hours. (i) Compute the probability of job
completion between two consecutive jobs exceeding 20 hours. (ii) The cost of
job completion is given by $C = 4 + 2X + 2X^2$. Find the expected value
of C.
Solution
Let X be a random variable which denotes the length of time to complete a job.
$$E(X) = \mu = \frac{1}{\lambda} = 10$$
$$f(x) = \lambda e^{-\lambda x} = \frac{1}{10}\, e^{-x/10}$$
(i)
$$P(X > 20) = \int_{20}^{\infty} f(x)\,dx = \int_{20}^{\infty} \frac{1}{10}\, e^{-x/10}\,dx = \frac{1}{10}\left[\frac{e^{-x/10}}{-\frac{1}{10}}\right]_{20}^{\infty} = -\left[e^{-x/10}\right]_{20}^{\infty}$$
$$= -\left(e^{-\infty} - e^{-2}\right) = e^{-2} = 0.1353$$
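Part (ii) can be approached through linearity of expectation, E(C) = 4 + 2E(X) + 2E(X²), using E(X) = 1/λ and E(X²) = 2/λ² for an exponential variable. The sketch below assumes exactly these two moment formulas; the variable names are illustrative.

```python
mean = 10                 # E(X) = 1/lambda = 10 hours
e_x = mean                # first moment, 1/lambda
e_x2 = 2 * mean**2        # second moment, 2/lambda^2

# Linearity of expectation applied to C = 4 + 2X + 2X^2.
expected_cost = 4 + 2 * e_x + 2 * e_x2

print(expected_cost)
```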
Example 10
The time (in hours) required to repair a machine is exponentially
distributed with parameter $\lambda = \dfrac{1}{2}$.
(i) What is the probability that the repair time exceeds 2 hours?
(ii) What is the conditional probability that a repair takes at least
11 hours given that its duration exceeds 8 hours?
Solution
Let X be the random variable which denotes the time to repair the machine.
$$\lambda = \frac{1}{2}$$
$$f(x) = \lambda e^{-\lambda x} = \frac{1}{2}\, e^{-x/2}, \quad x > 0$$
(i)
$$P(X > 2) = \int_{2}^{\infty} f(x)\,dx = \int_{2}^{\infty} \frac{1}{2}\, e^{-x/2}\,dx = \frac{1}{2}\left[\frac{e^{-x/2}}{-\frac{1}{2}}\right]_{2}^{\infty} = -\left[e^{-x/2}\right]_{2}^{\infty}$$
$$= -\left(e^{-\infty} - e^{-1}\right) = e^{-1} = 0.3679$$
(ii) By the memoryless property, $P(X \ge 11 \mid X > 8) = P(X > 3)$
$$= -\left[e^{-x/2}\right]_{3}^{\infty} = -\left(e^{-\infty} - e^{-1.5}\right) = e^{-1.5} = 0.2231$$
Example 11
The daily consumption of milk in excess of 20000 gallons is approximately
exponentially distributed with $\lambda = \dfrac{1}{3000}$. The city has a daily
stock of 35000 gallons. What is the probability that of 2 days selected at
random, the stock is insufficient for both the days?
Solution
Let Y be a random variable which denotes the daily consumption of milk.
The random variable X = Y – 20000 has an exponential distribution.
$$\lambda = \frac{1}{3000}$$
$$f(x) = \lambda e^{-\lambda x} = \frac{1}{3000}\, e^{-x/3000}, \quad x > 0$$
$$P(Y > 35000) = P(X > 15000) = -\left[e^{-x/3000}\right]_{15000}^{\infty} = -\left(e^{-\infty} - e^{-5}\right) = e^{-5} = 0.0067$$
Exercise 5.4
1. If X is exponentially distributed, prove that probability that X exceeds
its expected value is less than 0.5.
2. The amount of time that a watch will run without having to be reset
is a random variable having an exponential distribution with mean 120
days. Find the probability that such a watch will
(a) have to be reset in less than 24 days.
(b) not have to be reset in at least 180 days.
[Ans.: (a) 0.1813, (b) 0.2231]
3. The length of the shower on a tropical island during rainy season has
an exponential distribution with parameter 2, time being measured
in minutes. What is the probability that a shower will last more than
3 minutes? If a shower has already lasted for 2 minutes, what is the
probability that it will last for at least one more minute?
[Ans.: (a) 0.0025, (b) 0.1353]
4. If X is exponentially distributed with parameter $\lambda$, find the value of k
such that $P(X > k)/P(X \le k) = a$.
[Ans.: $\dfrac{1}{\lambda}\log\left(1 + \dfrac{1}{a}\right)$]
5. The life length X of an electronic component follows an exponential
distribution. There are 2 processes by which the component may be
manufactured. The expected life length of the component is 100 hrs
if process I is used to manufacture, while it is 150 hrs if process II is
used. The cost of manufacturing a single component by process I is
₹10, while it is ₹20 for process II. Moreover, if the component lasts less
than the guaranteed life of 200 hrs, a loss of ₹50 is to be borne by the
manufacturer. Which process is advantageous to the manufacturer?
[Ans.: Process I is advantageous to the manufacturer]
6. The life of an electronic component follows exponential distribution
with a mean of 4 years. The manufacturer of this component gives a
replacement warranty of 3 years.
(a) What proportion of components will be replaced in the period
of warranty?
(b) What is the probability that a randomly selected component
will have life within two standard deviations of the mean life?
[Ans.: (a) 0.5276, (b) 0.9502]
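$$= \frac{\lambda^r}{\Gamma(r)} \cdot \frac{\Gamma(r+1)}{\lambda^{r+1}} \qquad \left[\because \int_{0}^{\infty} e^{-kx} x^{n-1}\,dx = \frac{\Gamma(n)}{k^n}\right]$$
$$= \frac{\lambda^r\, r\,\Gamma(r)}{\Gamma(r) \cdot \lambda^{r+1}} = \frac{r}{\lambda}$$
$$= \frac{\lambda^r}{\Gamma(r)} \cdot \frac{\Gamma(r+2)}{\lambda^{r+2}} \qquad \left[\because \int_{0}^{\infty} e^{-kx} x^{n-1}\,dx = \frac{\Gamma(n)}{k^n}\right]$$
$$= \frac{(r+1)\, r\,\Gamma(r)}{\Gamma(r) \cdot \lambda^{2}} = \frac{r^2 + r}{\lambda^2}$$
$$\mathrm{Var}(X) = \frac{r^2 + r}{\lambda^2} - \frac{r^2}{\lambda^2} = \frac{r}{\lambda^2}$$
$$\mathrm{SD} = \sqrt{\mathrm{Var}(X)} = \sqrt{\frac{r}{\lambda^2}} = \frac{\sqrt{r}}{\lambda}$$
$$f(x) = \begin{cases} \dfrac{\lambda^r}{\Gamma(r)}\, x^{r-1} e^{-\lambda x}, & x > 0 \\ 0, & x \le 0 \end{cases}$$
Differentiating w.r.t. x,
$$f'(x) = \frac{\lambda^r}{\Gamma(r)}\left[(r-1)x^{r-2}e^{-\lambda x} + x^{r-1}e^{-\lambda x}(-\lambda)\right] = \frac{\lambda^r}{\Gamma(r)}\, x^{r-2} e^{-\lambda x}\left[(r-1) - \lambda x\right]$$
For maximum value of f(x),
$$f'(x) = 0$$
$$(r-1) - \lambda x = 0$$
$$x = \frac{r-1}{\lambda}$$
Differentiating f′(x) w.r.t. x,
$$f''(x) = \frac{\lambda^r}{\Gamma(r)}\left[(r-2)x^{r-3}e^{-\lambda x}(r-1-\lambda x) + x^{r-2}e^{-\lambda x}(-\lambda)(r-1-\lambda x) + x^{r-2}e^{-\lambda x}(-\lambda)\right]$$
$$= \frac{\lambda^r}{\Gamma(r)}\, x^{r-3} e^{-\lambda x}\left[(r-2)(r-1-\lambda x) - \lambda x(r-1-\lambda x) - \lambda x\right]$$
Putting $x = \dfrac{r-1}{\lambda}$,
$$f''(x) = \frac{\lambda^r}{\Gamma(r)}\, x^{r-3} e^{-\lambda x}\left[(r-2)(r-1-r+1) - \lambda x(r-1-r+1) - (r-1)\right] = \frac{\lambda^r}{\Gamma(r)}\, x^{r-3} e^{-\lambda x}(1 - r)$$
f(x) is maximum when $x = \dfrac{r-1}{\lambda}$, if f″(x) < 0,
$$f''(x) < 0 \ \text{ if } \ 1 - r < 0, \ \text{ i.e., } \ r > 1$$
Hence, $x = \dfrac{r-1}{\lambda}$ is the mode of the gamma distribution for r > 1.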
Example 1
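Given a Gamma random variable X with r = 3 and $\lambda$ = 2. Compute
E(X), Var(X) and P(X ≤ 1.5 years).
Solution
$\lambda = 2,\ r = 3$
$$f(x) = \frac{\lambda^r}{\Gamma(r)}\, x^{r-1} e^{-\lambda x}, \quad x > 0$$
(a) $E(X) = \dfrac{r}{\lambda} = \dfrac{3}{2} = 1.5$ years
(b) $\mathrm{Var}(X) = \dfrac{r}{\lambda^2} = \dfrac{3}{(2)^2} = 0.75$
(c)
$$P(X \le 1.5 \text{ years}) = \int_{0}^{1.5} f(x)\,dx = \int_{0}^{1.5} \frac{2^3}{\Gamma(3)}\, x^2 e^{-2x}\,dx$$
$$= 4\left[x^2\left(\frac{e^{-2x}}{-2}\right) - 2x\left(\frac{e^{-2x}}{4}\right) + 2\left(\frac{e^{-2x}}{-8}\right)\right]_{0}^{1.5}$$
$$= 4\left[(1.5)^2\left(\frac{e^{-3}}{-2}\right) - 2(1.5)\left(\frac{e^{-3}}{4}\right) + 2\left(\frac{e^{-3}}{-8}\right) + \frac{1}{4}\right]$$
$$= 0.5768$$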
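When r is a positive integer, repeated integration by parts collapses the gamma integral into a finite sum: P(X ≤ x) = 1 − e^(−λx) Σ_{k=0}^{r−1} (λx)^k / k!. The sketch below assumes integer r and uses that identity; `erlang_cdf` is a hypothetical helper name.

```python
from math import exp, factorial

def erlang_cdf(x, r, lam):
    """P(X <= x) for a gamma variate with positive integer shape r and rate lam."""
    if x <= 0:
        return 0.0
    tail = exp(-lam * x) * sum((lam * x) ** k / factorial(k) for k in range(r))
    return 1 - tail

# Example 1: r = 3, lambda = 2, P(X <= 1.5).
p_example_1 = erlang_cdf(1.5, r=3, lam=2)
print(round(p_example_1, 4))
```

The same helper gives upper-tail probabilities as `1 - erlang_cdf(x, r, lam)`, which is the form needed when a stock or capacity is exceeded.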
Example 2
The daily consumption of milk in a city, in excess of 20000 litres, is
approximately distributed as a Gamma variate with parameters $\lambda = \dfrac{1}{10000}$
and r = 2. The city has a daily stock of 30000 litres. What is the
probability that the stock is insufficient on a particular day?
Solution
Let Y be the random variable which denotes the daily consumption of milk (in litres)
in a city. The random variable X = Y – 20000 has a gamma distribution.
$$\lambda = \frac{1}{10000},\quad r = 2$$
$$f(x) = \frac{\lambda^r}{\Gamma(r)}\, x^{r-1} e^{-\lambda x}, \quad x > 0$$
$$= \frac{\left(\frac{1}{10000}\right)^2}{\Gamma(2)}\, x^{2-1} e^{-x/10000} = \frac{x\,e^{-x/10000}}{(10000)^2}$$
Probability that the stock is insufficient on a particular day
$$P(Y > 30000) = P(X > 10000) = \int_{10000}^{\infty} f(x)\,dx = \int_{10000}^{\infty} \frac{x\,e^{-x/10000}}{(10000)^2}\,dx$$
$$= \frac{1}{10^8}\int_{10^4}^{\infty} x\,e^{-10^{-4}x}\,dx$$
$$= \frac{1}{10^8}\left[\frac{x\,e^{-10^{-4}x}}{-10^{-4}} - \frac{e^{-10^{-4}x}}{(-10^{-4})^2}\right]_{10^4}^{\infty}$$
$$= \frac{1}{10^8}\left(\frac{e^{-1}}{10^{-8}} + \frac{e^{-1}}{10^{-8}}\right) = e^{-1} + e^{-1} = 2e^{-1} = 0.7358$$
Example 3
In a certain city, the daily consumption of electric power in millions
of kilowatt-hours can be treated as a random variable having gamma
distribution with parameters $\lambda = \dfrac{1}{2}$ and r = 3. If the power plant of this
city has a daily capacity of 12 million kilowatt-hours, what is the
probability that this power supply will be inadequate on any given day?
Solution
Let X be a random variable which denotes the daily consumption of electric power in
millions of kilowatt-hours.
$$f(x) = \frac{\lambda^r}{\Gamma(r)}\, x^{r-1} e^{-\lambda x} = \frac{\left(\frac{1}{2}\right)^3}{\Gamma(3)}\, x^2 e^{-x/2}, \quad x > 0$$
$$P(\text{power supply is inadequate}) = P(X > 12) = \int_{12}^{\infty} f(x)\,dx = \int_{12}^{\infty} \frac{1}{\Gamma(3)} \cdot \frac{1}{2^3}\, x^2 e^{-x/2}\,dx$$
$$= \frac{1}{16}\left[x^2\left(\frac{e^{-x/2}}{-\frac{1}{2}}\right) - 2x\left(\frac{e^{-x/2}}{\frac{1}{4}}\right) + 2\left(\frac{e^{-x/2}}{-\frac{1}{8}}\right)\right]_{12}^{\infty}$$
$$= \frac{1}{16}\, e^{-6}(288 + 96 + 16) = 25e^{-6} = 0.062$$
Example 4
If a company employs n sales persons, its gross sales in thousands
of rupees may be regarded as a random variable having a gamma
1
distribution with l = and r = 80 n. If the sales cost is `8000 per
2
Example 5
Consumer demand for milk in a certain locality, per month, is known to
be a general gamma random variable. If the average demand is ‘a’ litres
and the most likely demand is ‘b’ litres (b < a), what is the variance of
the demand?
Solution
Let X be the random variable which denotes the monthly consumer demand of milk.
Average demand is the value of E(X). Most likely demand is the value of the mode of
X or the value of X for which its probability density function is maximum.
$$f(x) = \frac{\lambda^r}{\Gamma(r)}\, x^{r-1} e^{-\lambda x}, \quad x > 0$$
$$f'(x) = \frac{\lambda^r}{\Gamma(r)}\left[(r-1)x^{r-2}e^{-\lambda x} - \lambda x^{r-1}e^{-\lambda x}\right] = \frac{\lambda^r}{\Gamma(r)}\, x^{r-2}e^{-\lambda x}\left[(r-1) - \lambda x\right]$$
For maximum value of f(x),
$$f'(x) = 0$$
$$(r-1) - \lambda x = 0$$
$$x = \frac{r-1}{\lambda}$$
Differentiating f′(x) w.r.t. x,
$$f''(x) = \frac{\lambda^r}{\Gamma(r)}\left[-\lambda x^{r-2}e^{-\lambda x} + \{(r-1) - \lambda x\}\frac{d}{dx}\left(x^{r-2}e^{-\lambda x}\right)\right]$$
$$= \frac{\lambda^r}{\Gamma(r)}\, x^{r-3}e^{-\lambda x}(1 - r) \quad \text{when } x = \frac{r-1}{\lambda}$$
f(x) is maximum when $x = \dfrac{r-1}{\lambda}$ if f″(x) < 0
$$f''(x) < 0 \ \text{ if } \ 1 - r < 0, \ \text{ i.e., } \ r > 1$$
Most likely demand $= \dfrac{r-1}{\lambda} = b,\ r > 1$
$$\frac{r-1}{\lambda} = b$$
$$\frac{r}{\lambda} = b + \frac{1}{\lambda} \qquad \ldots(1)$$
Average demand $= E(X) = \dfrac{r}{\lambda} = a \qquad \ldots(2)$
Putting in Eq. (1),
$$a = b + \frac{1}{\lambda}$$
$$\frac{1}{\lambda} = a - b \qquad \ldots(3)$$
$$\mathrm{Var}(X) = \frac{r}{\lambda^2} = \frac{r}{\lambda} \cdot \frac{1}{\lambda} = a(a - b) \qquad \text{[from Eq. (2) and (3)]}$$
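The result Var(X) = a(a − b) can be sanity-checked by recovering λ and r from a chosen mean and mode and then evaluating r/λ² directly. The numbers a = 12 and b = 9 below are illustrative only, and the helper name is hypothetical.

```python
from math import isclose

def gamma_params_from_mean_and_mode(a, b):
    """Return (lam, r) of the gamma distribution with mean a and mode b, where b < a."""
    lam = 1 / (a - b)   # from 1/lam = a - b, Eq. (3)
    r = a * lam         # from r/lam = a, Eq. (2)
    return lam, r

a, b = 12.0, 9.0        # illustrative mean and mode
lam, r = gamma_params_from_mean_and_mode(a, b)

variance_direct = r / lam**2     # r / lam^2
variance_formula = a * (a - b)   # closed form derived above

print(lam, r, variance_direct, variance_formula)
```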
Exercise 5.5
1. Find the probabilities that the value of a random variable will exceed
4, if it has gamma distribution with
(a) $\lambda = \dfrac{1}{3},\ r = 2$ (b) $\lambda = \dfrac{1}{4},\ r = 3$
Chapter Outline
6.1 Introduction
6.2 Terms Related to Tests of Hypothesis
6.3 Procedure for Testing of Hypothesis
6.4 Test of Significance for Large Samples
6.5 Test of Significance for Single Proportion – Large Samples
6.6 Test of Significance for Difference between Two Proportions – Large
Samples
6.7 Test of Significance for Single Mean – Large Samples
6.8 Test of Significance for Difference between Two Means – Large Samples
6.9 Test of Significance for Difference of Standard Deviations – Large Samples
6.10 Small Sample Tests
6.11 Student’s t-distribution
6.12 t-test: Test of Significance for Single Mean
6.13 t-test: Test of Significance for Difference of Means
6.14 t-test: Test of Significance for Correlation Coefficients
6.15 Snedecor’s F-test for Ratio of Variances
6.16 Chi-square ($\chi^2$) Test
6.17 Chi-square Test: Goodness of Fit
6.18 Chi-square Test for Independence of Attributes
6.1 Introduction
The main purpose behind the sampling theory is the study of the Tests of Hypothesis
or Tests of significance. In many situations, assumptions are made about the population
(1) Parameters: The statistical constants of a population, such as the mean ($\mu$),
standard deviation ($\sigma$), correlation coefficient ($\rho$), population proportion (P),
etc., are called parameters. Greek letters are used to denote the population
parameters.
(2) Statistic: The statistical constants for a sample drawn from the given
population, such as the mean ($\bar{x}$), standard deviation (s), correlation coefficient (r),
sample proportion (p), etc., are called statistics. Roman letters are used to denote
the sample statistics.
(3) Sampling Distribution: Consider all possible samples of size ‘n’ which can be
drawn from a population of size ‘N’. These samples will give different values
of a statistic. The means of the samples will not be identical. If these different
means are arranged according to their frequencies, the frequency distribution
formed is called the sampling distribution of the mean. Similarly, the sampling
distribution of other statistics can be defined.
(4) Standard Error: The standard deviation of the sampling distribution of a
statistic is known as its standard error (SE). Standard error plays a very important
role in the large sample theory and forms the basis of the testing of hypothesis.
(5) Null Hypothesis: Null hypothesis is the hypothesis which is tested for possible
rejection under the assumption that it is true. It is denoted by $H_0$. It asserts that
there is no significant difference between the statistic and the population
parameter and whatever observed difference exists is merely due to the fluctuations
in sampling from the same population.
(6) Alternative Hypothesis: Any hypothesis which is complementary to the null
hypothesis is called an alternative hypothesis. It is denoted by $H_1$. It is set
in such a way that the rejection of the null hypothesis implies the acceptance of
the alternative hypothesis. For example, if the null hypothesis is that the average
height of the students of a college is 166 cm, i.e., $\mu_0 = 166$ cm, say, then the null
hypothesis is
$$H_0 : \mu = 166\ (= \mu_0)$$
and the alternative hypothesis could be
(i) $H_1 : \mu \ne \mu_0$ (i.e., $\mu > \mu_0$ or $\mu < \mu_0$)
(ii) $H_1 : \mu > \mu_0$
(iii) $H_1 : \mu < \mu_0$
Thus, there can be more than one alternative hypothesis.
(7) Test Statistic: After setting up the null hypothesis and alternative hypothesis,
test statistic is calculated. The test statistic is a statistic based on appropriate
Fig. 6.1 Two tailed test
For example, in a test for testing the mean (m) of the population
Null Hypothesis H 0 : m = m0
(12) Confidence Limits: The limits within which a hypothesis should lie with
specified probability are called confidence limits or fiducial limits. Generally,
the confidence limits are set up with 5% or 1% level of significance.
If the sample value lies between the confidence limits, the hypothesis is
accepted; if it does not, then the hypothesis is rejected at the specified level
of significance. Suppose that the sampling distribution of a statistic S is
normal with mean $\mu$ and standard deviation $\sigma$. The sample statistic S can
be expected to lie in the interval ($\mu - 1.96\sigma$, $\mu + 1.96\sigma$) 95% of the time
(Fig. 6.4). Because of this, ($S - 1.96\sigma$, $S + 1.96\sigma$) is called the 95% confidence
interval for estimation of $\mu$. The ends of this interval, i.e., $S \pm 1.96\sigma$,
are called 95% confidence limits for S. Similarly, $S \pm 2.58\sigma$ are 99% confidence
limits. The numbers 1.96, 2.58, etc. are called confidence coefficients.
Fig. 6.4 Confidence Limits
(vi) Decision: Compare the calculated value of Z with the tabulated value Za.
If |Z| < Za i.e., if the calculated value of Z is less than tabulated value Za at
the level of significance a, the null hypothesis is accepted. If |Z| > Za i.e.,
if the calculated value of Z is more than tabulated value Za at the level of
significance a, the null hypothesis is rejected.
If a sample consists of more than 30 items, i.e., n > 30, it is considered as large sample.
The following assumptions are applied for significance tests of large samples:
(i) The random sampling distribution of statistic has the properties of the normal
curve.
(ii) Values (i.e., statistics) given by the samples are sufficiently close to the
population values (i.e., parameters) and can be used in their place for calculating
the standard error (SE) of the estimate.
For example, if SD of the population is not known, SE can be calculated by SD of the
sample.
Suppose the hypothesis to be tested is that the probability of success in each trial is
p. Assuming it to be true, the mean $\mu$ and the standard deviation $\sigma$ of the sampling
distribution of the number of successes are $np$ and $\sqrt{npq}$ respectively, as the sampling
distribution of the number of successes follows a binomial probability distribution.
If x is the observed number of successes in the sample and Z is the standard normal
variate, then
$$Z = \frac{x - \mu}{\sigma}$$
The tests of significance are as follows:
(i) If |Z| < 1.96, the difference between the observed and expected number of
successes is not significant.
(ii) If |Z| > 1.96, the difference is significant at 5% level of significance.
(iii) If |Z| > 2.58, the difference is significant at 1% level of significance.
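The three rules above can be sketched as a small routine. The function names `z_for_successes` and `verdict` are hypothetical, and the thresholds 1.96 and 2.58 are the two-tailed critical values quoted in the rules.

```python
from math import sqrt

def z_for_successes(x, n, p):
    """Standard normal variate for x observed successes in n trials, success probability p."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return (x - mu) / sigma

def verdict(z):
    """Classify |Z| against the two-tailed critical values 1.96 (5%) and 2.58 (1%)."""
    if abs(z) > 2.58:
        return "significant at the 1% level"
    if abs(z) > 1.96:
        return "significant at the 5% level"
    return "not significant"

# Illustrative inputs: 183 successes in 960 trials with p = 1/2,
# and 184 successes in 960 trials with p = 1/6.
z_coin = z_for_successes(183, 960, 1 / 2)
z_die = z_for_successes(184, 960, 1 / 6)
print(round(z_coin, 2), verdict(z_coin))
print(round(z_die, 2), verdict(z_die))
```

A result that is significant at 5% but not at 1%, as in the second call, leads to different decisions depending on the level of significance chosen beforehand.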
Example 1
A coin was tossed 960 times and returned heads 183 times. Test the
hypothesis that the coin is unbiased. Use a 0.05 level of significance.
Solution
$n = 960$
$p$ = probability of getting head $= \dfrac{1}{2}$
$q = 1 - p = 1 - \dfrac{1}{2} = \dfrac{1}{2}$
$\mu = np = 960\left(\dfrac{1}{2}\right) = 480$
$\sigma = \sqrt{npq} = \sqrt{960 \times \dfrac{1}{2} \times \dfrac{1}{2}} = 15.49$
$x$ = number of successes = 183
(i) Null Hypothesis $H_0$: The coin is unbiased.
(ii) Alternative Hypothesis $H_1$: The coin is biased.
(iii) Level of significance: $\alpha = 0.05$
(iv) Test statistic: $Z = \dfrac{x - \mu}{\sigma} = \dfrac{183 - 480}{15.49} = -19.17$
$|Z| = 19.17$
(v) Critical value: $|Z_{0.05}| = 1.96$
(vi) Decision: Since $|Z| > |Z_{0.05}|$, the null hypothesis is rejected at 5% level of
significance, i.e., the coin is biased.
Example 2
A die is tossed 960 times and it falls with 5 upwards 184 times. Is the
die unbiased at a level of significance of 0.01?
Solution
$n = 960$
$p$ = probability of throwing 5 with one die $= \dfrac{1}{6}$
$q = 1 - p = 1 - \dfrac{1}{6} = \dfrac{5}{6}$
$\mu = np = 960\left(\dfrac{1}{6}\right) = 160$
$\sigma = \sqrt{npq} = \sqrt{960 \times \dfrac{1}{6} \times \dfrac{5}{6}} = 11.55$
$x$ = number of successes = 184
(i) Null Hypothesis $H_0$: The die is unbiased.
(ii) Alternative Hypothesis $H_1$: The die is biased.
(iii) Level of significance: $\alpha = 0.01$
(iv) Test statistic: $Z = \dfrac{x - \mu}{\sigma} = \dfrac{184 - 160}{11.55} = 2.08$
$|Z| = 2.08$
(v) Critical value: $|Z_{0.01}| = 2.58$
(vi) Decision: Since $|Z| < |Z_{0.01}|$, the null hypothesis is accepted at 1% level of
significance, i.e., the die is unbiased.
Let p be the sample proportion in a large random sample of size n drawn from a
population having proportion P. Also, the population proportion P has a specified
value P0.
Working Rule
(i) Null Hypothesis $H_0$: $P = P_0$, i.e., the population proportion P has a specified
value $P_0$.
(ii) Alternative Hypothesis $H_1$: $P \ne P_0$ (i.e., $P > P_0$ or $P < P_0$)
or $H_1$: $P > P_0$
or $H_1$: $P < P_0$
(iii) Level of significance: Select the level of significance $\alpha$
(iv) Test statistic: $Z = \dfrac{p - P}{\sqrt{\dfrac{PQ}{n}}}$, where $Q = 1 - P$
(v) Critical Value: Find the critical value (tabulated value) $Z_\alpha$ of Z at the given
level of significance.
(vi) Decision: If $|Z| < Z_\alpha$ at the level of significance $\alpha$, the null hypothesis is
accepted. If $|Z| > Z_\alpha$ at the level of significance $\alpha$, the null hypothesis is
rejected.
Note
1. Null Hypothesis $H_0$ is rejected when $|Z| > 3$ without mentioning any level of
significance.
2. Confidence limits:
(i) 95% confidence limits $= p \pm 1.96\sqrt{\dfrac{PQ}{n}}$
(ii) 99% confidence limits $= p \pm 2.58\sqrt{\dfrac{PQ}{n}}$
If the population proportions P and Q are not known, p and q are used in equations.
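The working rule and the confidence limits can be sketched in a few lines. The function names and the optional `P` argument are illustrative choices; when `P` is omitted the sample proportion is used in the standard error, as the note above allows. The sample figures (a sample proportion of 0.48 in 1000 observations against P₀ = 0.5) are chosen only for illustration.

```python
from math import sqrt

def proportion_z(p_hat, P0, n):
    """Test statistic Z = (p - P) / sqrt(PQ/n) for a single proportion."""
    return (p_hat - P0) / sqrt(P0 * (1 - P0) / n)

def confidence_limits(p_hat, n, coefficient=1.96, P=None):
    """Return p_hat -/+ coefficient * sqrt(PQ/n); uses p_hat in place of P when P is unknown."""
    base = p_hat if P is None else P
    half_width = coefficient * sqrt(base * (1 - base) / n)
    return p_hat - half_width, p_hat + half_width

z = proportion_z(0.48, 0.5, 1000)
low, high = confidence_limits(0.48, 1000)          # 95% limits
low99, high99 = confidence_limits(0.48, 1000, 2.58)  # 99% limits

print(round(z, 3), round(low, 4), round(high, 4))
```

Here |Z| is below 1.96, so at the 5% level the null hypothesis P = 0.5 would be accepted, and the 95% limits contain 0.5.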
Example 1
A manufacturer claimed that at least 95% of the equipment which he
supplied to a factory conformed to specification. An examination of a
sample of 200 pieces of equipment revealed that 18 were faulty. Test his
claim at 5% level of significance.
Solution
n = 200
Example 2
In a hospital 480 female and 520 male babies were born in a week. Do
these figures confirm the hypothesis that males and females were born
in equal numbers?
Solution
n = Total number of births = 480 + 520 = 1000
480
p = Sample proportion of females born = = 0.48
1000
P = Population proportion of females born = 0.5
Q = 1 – P = 1 – 0.5 = 0.5
(i) Null Hypothesis $H_0$: $P = 0.5$, i.e., the males and females were born in equal
numbers.
(ii) Alternative Hypothesis $H_1$: $P \ne 0.5$ (Two tailed test)
(iii) Level of significance: $\alpha = 0.05$ (assumption)
(iv) Test statistic: $Z = \dfrac{p - P}{\sqrt{\dfrac{PQ}{n}}} = \dfrac{0.48 - 0.5}{\sqrt{\dfrac{(0.5)(0.5)}{1000}}} = -1.265$
$|Z| = 1.265$
(v) Critical value: $|Z_{0.05}| = 1.96$
(vi) Decision: Since $|Z| < |Z_{0.05}|$, the null hypothesis is accepted at 5% level of
significance, i.e., males and females were born in equal proportions.
Example 3
In a study designed to investigate whether certain detonators used with
explosives in coal mining meet the requirement that at least 90% will
ignite the explosive when charged, it is found that 174 of 200 detonators
function properly. Test the null hypothesis P = 0.9 against the alternative
hypothesis P < 0.9 at the 0.05 level of significance.
Solution
n = 200
p = Sample proportion of detonators functioning properly = $\dfrac{174}{200} = 0.87$
P = Population proportion of detonators functioning properly = 0.9
Q = 1 – P = 1 – 0.9 = 0.1
(i) Null Hypothesis H0: P = 0.9
(ii) Alternative Hypothesis H1: P < 0.9 (Left tailed test)
(iii) Level of significance: $\alpha = 0.05$
(vi) Decision: Since |Z| < |Z0.05|, the null hypothesis is accepted at 5% level of
significance.
Example 4
A salesman in a departmental store claims that at most 60 percent of the
shoppers entering the store leave without making a purchase. A random
sample of 50 shoppers showed that 35 of them left without making a
purchase. Are these sample results consistent with the claim of the
salesman? Use a level of significance of 0.05.
Solution
n = 50
p = Sample proportion of shoppers not making a purchase = $\dfrac{35}{50} = 0.7$
(vi) Decision: Since |Z| < |Z0.05|, the null hypothesis is accepted, i.e., the sample
results are consistent with the claim of the salesman.
Example 5
The fatality rate of typhoid patients is believed to be 17.26%. In a certain
year 640 patients suffering from typhoid were treated in a metropolitan
hospital and only 63 patients died. Can you consider the hospital efficient
at 1% level of significance?
Solution
n = 640
p = Sample proportion of typhoid patients who died = $\dfrac{63}{640} = 0.0984$
P = Population proportion of typhoid patients died = 0.1726
Q = 1 – P = 1 – 0.1726 = 0.8274
(i) Null Hypothesis H0: P = 0.1726, i.e., the fatality rate at the hospital equals the
believed rate and the hospital is not efficient.
(ii) Alternative Hypothesis H1: P < 0.1726 (Left tailed test)
(iii) Level of significance: $\alpha = 0.01$
(iv) Test statistic: $Z = \dfrac{p - P}{\sqrt{\dfrac{PQ}{n}}} = \dfrac{0.0984 - 0.1726}{\sqrt{\dfrac{(0.1726)(0.8274)}{640}}} = -4.97$
|Z| = 4.97
(v) Critical value: |Z0.01| = 2.33
(vi) Decision: Since |Z| > |Z0.01|, the null hypothesis is rejected at 1% level of
significance, i.e., the hospital is efficient.
Example 6
In a big city, 325 men out of 600 were found to be smokers. Does this
information support the conclusion that the majority of men in this city
are smokers?
Solution
n = 600
p = Sample proportion of smokers in city = $\dfrac{325}{600} = 0.542$
P = Population proportion of smokers in city = 0.5
Q = 1 – P = 1 – 0.5 = 0.5
(i) Null Hypothesis H0: P = 0.5, i.e., the proportion of smokers in the city is
50%.
(ii) Alternative Hypothesis H1: P > 0.5 (Right tailed test)
(iii) Level of significance: $\alpha = 0.05$ (assumption)
(iv) Test statistic: $Z = \dfrac{p - P}{\sqrt{\dfrac{PQ}{n}}} = \dfrac{0.542 - 0.5}{\sqrt{\dfrac{(0.5)(0.5)}{600}}} = 2.06$
|Z| = 2.06
(v) Critical value: |Z0.05| = 1.645
(vi) Decision: Since |Z| > |Z0.05|, the null hypothesis is rejected at 5% level of
significance, i.e., the proportion of smokers in the city is more than 50% and the majority of
men in the city are smokers.
Example 7
In a random sample of 160 workers exposed to a certain amount of
radiation, 24 experienced some ill effects. Construct a 95% confidence
interval for the corresponding true percentage.
Solution
n = 160
p = Sample proportion of workers who experienced ill effects = $\dfrac{24}{160} = 0.15$
q = 1 – p = 1 – 0.15 = 0.85
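The 95% confidence limits $p \pm 1.96\sqrt{pq/n}$ for this sample can be evaluated with a short Python sketch. The function name `proportion_confidence_limits` is an assumption made for illustration; since P and Q are unknown here, the sample values p and q are used, as the note on confidence limits allows.

```python
import math


def proportion_confidence_limits(x, n, z=1.96):
    """Confidence limits p +/- z * sqrt(pq/n), using sample p and q."""
    p = x / n
    q = 1 - p
    margin = z * math.sqrt(p * q / n)
    return p - margin, p + margin


# 24 of 160 workers experienced ill effects
lower, upper = proportion_confidence_limits(24, 160)
print(f"95% confidence limits: {lower:.4f} to {upper:.4f}")
```

Multiplying the two limits by 100 expresses the interval as a percentage, which is the form the question asks for.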
6.6 Test of Significance for Difference of Proportions — Large Samples 6.13
Let p1 and p2 be the sample proportions in two large samples of sizes n1 and n2 drawn
from two populations having proportions P1 and P2.
Working Rule
(i) Null Hypothesis H0: P1 = P2, i.e., there is no significant difference in the two
population proportions P1 and P2.
(ii) Alternative Hypothesis H1: $P_1 \neq P_2$
or H1: P1 > P2
or H1: P1 < P2
(iii) Level of significance: Select the level of significance $\alpha$.
(iv) Test statistic: There are two cases:
(a) When the population proportions P1 and P2 are known
$Z = \dfrac{P_1 - P_2}{\sqrt{\dfrac{P_1 Q_1}{n_1} + \dfrac{P_2 Q_2}{n_2}}}$
(b) When the population proportions P1 and P2 are not known but sample
proportions p1 and p2 are known
There are two methods to estimate P1 and P2.
Method of Substitution: In this method, sample proportions p1 and p2
are substituted for P1 and P2.
$Z = \dfrac{p_1 - p_2}{\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}}$
Method of pooling: In this method, the estimated value of the two population
proportions is obtained by pooling the two sample proportions p1
and p2 into a single proportion p.
$p = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2}, \quad q = 1 - p$
$Z = \dfrac{p_1 - p_2}{\sqrt{pq\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
(v) Critical value: Find the critical value (tabulated value) of Z at given level of
significance.
(vi) Decision: If $|Z| < Z_\alpha$ at the level of significance $\alpha$, the null hypothesis is
accepted. If $|Z| > Z_\alpha$ at the level of significance $\alpha$, the null hypothesis is
rejected.
Note
1. Null Hypothesis H0 is rejected when $|Z| > 3$ without mentioning any level of
significance.
2. Confidence limits:
(i) 95% confidence limits = $(p_1 - p_2) \pm 1.96\sqrt{\dfrac{P_1 Q_1}{n_1} + \dfrac{P_2 Q_2}{n_2}}$
(ii) 99% confidence limits = $(p_1 - p_2) \pm 2.58\sqrt{\dfrac{P_1 Q_1}{n_1} + \dfrac{P_2 Q_2}{n_2}}$
If the population proportions P1 and P2 are not known, p1, p2, q1 and q2 are used in the
equations.
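The two ways of estimating the standard error when P1 and P2 are unknown (substitution and pooling) can be compared with a small Python sketch. The function name `two_proportion_z` and its `pooled` flag are assumptions made for illustration only; the data are those of Example 2 of this section (20% of 900 and 18.5% of 1600).

```python
import math


def two_proportion_z(p1, n1, p2, n2, pooled=True):
    """Z statistic for H0: P1 = P2 from two sample proportions.

    pooled=True  -> method of pooling
    pooled=False -> method of substitution
    """
    if pooled:
        p = (n1 * p1 + n2 * p2) / (n1 + n2)          # pooled proportion
        se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    else:
        se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se


z_pooled = two_proportion_z(0.2, 900, 0.185, 1600, pooled=True)
z_subst = two_proportion_z(0.2, 900, 0.185, 1600, pooled=False)
print(f"Z (pooling) = {z_pooled:.3f}, Z (substitution) = {z_subst:.3f}")
```

For large samples the two methods usually give close values of Z, so the decision is rarely affected by the choice.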
Example 1
Random samples of 400 men and 600 women were asked whether they
would like to have a flyover near their residence. 200 men and 325
women were in favour of the proposal. Test the hypothesis that propor-
tions of men and women in favour of the proposal are same at 5% level
of significance.
Solution
n1 = 400, n2 = 600
p1 = Proportion of men = $\dfrac{200}{400} = 0.5$
p2 = Proportion of women = $\dfrac{325}{600} = 0.541$
$p = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \dfrac{(400)(0.5) + (600)(0.541)}{400 + 600} = 0.525$
q = 1 - p = 1 - 0.525 = 0.475
(i) Null Hypothesis H0: P1 = P2, i.e., there is no significant difference in pro-
portion of men and women in favour of the proposal.
(ii) Alternative Hypothesis H1: $P_1 \neq P_2$ (Two tailed test)
(iii) Level of significance: $\alpha = 0.05$
(iv) Test statistic: $Z = \dfrac{p_1 - p_2}{\sqrt{pq\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{0.5 - 0.541}{\sqrt{(0.525)(0.475)\left(\dfrac{1}{400} + \dfrac{1}{600}\right)}} = -1.28$
|Z| = 1.28
(v) Critical value: |Z0.05| = 1.96
(vi) Decision: Since |Z | < |Z0.05|, the null hypothesis is accepted at 5% level of
significance, i.e., there is no significant difference of opinion between
men and women in favour of the proposal.
Example 2
In a city A, 20% of a random sample of 900 school boys had a certain
slight physical defect. In another city B, 18.5% of a random sample
of 1600 school boys had the same defect. Is the difference between the
proportions significant at 0.05 level of significance?
Solution
n1 = 900, n2 = 1600
p1 = Proportion of school boys in city A = 0.2
p2 = Proportion of school boys in city B = 0.185
$p = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \dfrac{(900)(0.2) + (1600)(0.185)}{900 + 1600} = 0.1904$
q = 1 – p = 1 – 0.1904 = 0.8096
(i) Null Hypothesis H0: P1 = P2, i.e., there is no significant difference in propor-
tion of two city school boys.
(ii) Alternative Hypothesis H1: $P_1 \neq P_2$ (Two tailed test)
(iii) Level of significance: $\alpha = 0.05$
(iv) Test statistic: $Z = \dfrac{p_1 - p_2}{\sqrt{pq\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{0.2 - 0.185}{\sqrt{(0.1904)(0.8096)\left(\dfrac{1}{900} + \dfrac{1}{1600}\right)}} = 0.916$
|Z| = 0.916
(v) Critical value: |Z0.05| = 1.96
(vi) Decision: Since |Z| < |Z0.05|, the null hypothesis is accepted at 5% level of
significance, i.e., there is no significant difference between the proportions of
school boys with the defect in the two cities.
Example 3
Before an increase in excise duty on tea, 800 people out of a sample of
1000 were consumers of tea. After an increase in excise duty, 800 people
were consumers of tea in a sample of 1200 persons. Find whether there
is significant decrease in the consumption of tea after the increase in
duty.
Solution
n1 = 1000, n2 = 1200
p1 = Proportion of consumers of tea before increase in excise duty = $\dfrac{800}{1000} = 0.8$
p2 = Proportion of consumers of tea after increase in excise duty = $\dfrac{800}{1200} = 0.67$
$p = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \dfrac{(1000)(0.8) + (1200)(0.67)}{1000 + 1200} = 0.73$
q = 1 – p = 1 – 0.73 = 0.27
(i) Null Hypothesis H0: P1 = P2, i.e., there is no significant decrease in the con-
sumption of tea after the increase in duty.
(ii) Alternative Hypothesis H1: P1 > P2 (Right tailed test)
(iii) Level of significance: $\alpha = 0.05$ (assumption)
(iv) Test statistic: $Z = \dfrac{p_1 - p_2}{\sqrt{pq\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{0.8 - 0.67}{\sqrt{(0.73)(0.27)\left(\dfrac{1}{1000} + \dfrac{1}{1200}\right)}} = 6.84$
|Z| = 6.84
(v) Critical value: |Z0.05| = 1.645
(vi) Decision: Since |Z| > |Z0.05|, the null hypothesis is rejected at 5% level of
significance, i.e., there is a significant decrease in the consumption of tea after the
increase in duty.
Example 4
15.5% of a random sample of 1600 undergraduates were smokers, whereas
20% of a random sample of 900 postgraduates were smokers in a state.
|Z| = 2.87
(vi) Decision: Since |Z| > |Z0.05|, the null hypothesis is rejected at 5% level of
significance, i.e., the proportion of smokers among undergraduates is less than that among
postgraduates.
Example 5
A machine produced 20 defective articles in a batch of 400. After
overhauling it produced 10 defective articles in a batch of 300. Has the
machine improved?
Solution
n1 = 400, n2 = 300
p1 = Proportion of defective articles before overhauling = $\dfrac{20}{400} = 0.05$
p2 = Proportion of defective articles after overhauling = $\dfrac{10}{300} = 0.033$
$p = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \dfrac{(400)(0.05) + (300)(0.033)}{400 + 300} = 0.043$
q = 1 – p = 1 – 0.043 = 0.957
(i) Null Hypothesis H0: P1 = P2, i.e., the proportions of defective articles before
and after overhauling are equal.
(ii) Alternative Hypothesis H1: P1 > P2 (Right tailed test)
(iii) Level of significance: $\alpha = 0.05$ (assumption)
(iv) Test statistic: $Z = \dfrac{p_1 - p_2}{\sqrt{pq\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{0.05 - 0.033}{\sqrt{(0.043)(0.957)\left(\dfrac{1}{400} + \dfrac{1}{300}\right)}} = 1.097$
|Z| = 1.097
(v) Critical value: |Z0.05| = 1.645
(vi) Decision: Since |Z| < |Z0.05|, the null hypothesis is accepted at 5% level of
significance, i.e., the proportions of defective articles before and after overhauling are
equal and the machine has not improved.
Example 6
In two large populations, there are 30% and 25% fair haired people
respectively. Is this difference likely to be hidden in samples of 1200 and
900 respectively from the two populations?
Solution
n1 = 1200, n2 = 900
P1 = Proportion of fair haired people in the first population = 0.3
Q1 = 1 - P1 = 1 - 0.3 = 0.7
P2 = Proportion of fair haired people in the second population = 0.25
Example 7
A random sample of 300 shoppers at a supermarket includes 204 who
regularly use cents-off coupons. Another sample of 500 shoppers at
a supermarket includes 75 who regularly use cents-off coupons. Obtain
95% confidence limits for the difference in the population proportions.
Solution
n1 = 300, n2 = 500
p1 = Proportion of shoppers who use cents-off coupons in the first sample = $\dfrac{204}{300} = 0.68$
q1 = 1 – p1 = 1 – 0.68 = 0.32
p2 = Proportion of shoppers who use cents-off coupons in the second sample = $\dfrac{75}{500} = 0.15$
q2 = 1 – p2 = 1 – 0.15 = 0.85
$SE = \sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}} = \sqrt{\dfrac{(0.68)(0.32)}{300} + \dfrac{(0.15)(0.85)}{500}} = 0.031$
95% confidence limits for the difference in population proportions are
$(p_1 - p_2) - 1.96\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}, \quad (p_1 - p_2) + 1.96\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$
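The limits in the last expression can be evaluated numerically with a short Python sketch. The function name `difference_confidence_limits` is an assumption made for illustration; the inputs are the sample proportions and sizes computed above.

```python
import math


def difference_confidence_limits(p1, n1, p2, n2, z=1.96):
    """Return (lower, upper, SE) for (p1 - p2) +/- z * SE."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se, se


lower, upper, se = difference_confidence_limits(0.68, 300, 0.15, 500)
print(f"SE = {se:.3f}, 95% confidence limits: {lower:.3f} to {upper:.3f}")
```

Passing `z=2.58` instead gives the corresponding 99% limits.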
Exercise 6.1
1. A manufacturer claims at least 95% of the items he produces are failure
free. Examinations of a random sample of 600 items showed 39 to be
defective. Test the claim at a significance level of 0.05.
[Ans.: Claim is rejected]
2. In a sample of 400 parts manufactured by a factory, the number of
defective parts was found to be 30. The company, however, claim that
only 5% of their product is defective. Is the claim tenable?
[Ans.: Claim is rejected]
3. A sample of 600 persons selected at random from a large city shows
that the percentage of males in the sample is 53%. It is believed that
the ratio of males to the total population in the city is $\dfrac{1}{2}$. Test whether this
12. A company wanted to introduce a new plan of work and a survey was
conducted for this purpose. Out of sample of 500 workers in one
group, 62% favoured the new plan and another group of sample of
400 workers, 41% were against the new plan. Is there any significant
difference between the two groups in their attitude towards the new
plan at 5% level of significance?
[Ans.: There is no significant difference between the
two groups in their attitude towards the new plan]
13. In a random sample of 1000 persons from town A, 400 are found to be
consumers of wheat. In a sample of 800 from town B, 400 are found to
be consumers of wheat. Do these data reveal a significant difference
between town A and town B, so far as the proportion of wheat consumers
is concerned?
[Ans.: There is significant difference between town A and town B
as the proportion of wheat consumers is concerned]
14. 100 articles from a factory are examined and 10 are found to be
defective. Out of 500 similar articles from a second factory 15 are
found to be defective. Test the significance between the difference of
two proportions at 5% level.
[Ans.: There is a significant difference between the two proportions]
(v) Critical value: Find the critical value (tabulated value) $Z_\alpha$ of Z at the given level
of significance $\alpha$.
(vi) Decision: If $|Z| < Z_\alpha$ at the level of significance $\alpha$, the null hypothesis is
accepted. If $|Z| > Z_\alpha$ at the level of significance $\alpha$, the null hypothesis is
rejected.
Note
1. Null Hypothesis H0 is rejected when $|Z| > 3$ without mentioning any level of
significance.
2. Confidence limits:
(i) 95% confidence limits = $\bar{x} \pm 1.96\left(\dfrac{\sigma}{\sqrt{n}}\right)$
(ii) 99% confidence limits = $\bar{x} \pm 2.58\left(\dfrac{\sigma}{\sqrt{n}}\right)$
If the standard deviation $\sigma$ of the population is not known, the sample standard deviation s is used in the equations.
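The test statistic and confidence limits for a single mean can be sketched in Python as below. The function names `single_mean_z` and `mean_confidence_limits` are assumptions made for illustration; the data are those of Example 4 of this section (n = 900, sample mean 3.4 cm, population mean 3.25 cm, SD 2.61 cm).

```python
import math


def single_mean_z(x_bar, mu, sd, n):
    """Z = (x_bar - mu) / (sd / sqrt(n)); sd is sigma, or s if sigma is unknown."""
    return (x_bar - mu) / (sd / math.sqrt(n))


def mean_confidence_limits(x_bar, sd, n, z=1.96):
    """Confidence limits x_bar +/- z * sd / sqrt(n)."""
    margin = z * sd / math.sqrt(n)
    return x_bar - margin, x_bar + margin


z = single_mean_z(3.4, 3.25, 2.61, 900)
lower, upper = mean_confidence_limits(3.4, 2.61, 900)
print(f"Z = {z:.2f}, 95% limits: {lower:.4f} to {upper:.4f}")
```

Since |Z| is compared with 1.96 for a two tailed test at the 5% level, the null hypothesis is accepted exactly when the hypothesised mean lies inside the 95% limits built with the same standard deviation.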
Example 1
A random sample of 100 Indians has an average life span of 71.8 years
with standard deviation of 8.9 years. Can it be concluded that the
average life span of an Indian is 70 years?
Solution
n = 100, $\bar{x} = 71.8$ years, $\mu = 70$ years, s = 8.9 years
(i) Null Hypothesis H0: $\mu = 70$ years, i.e., the average life span of an Indian is
70 years.
(ii) Alternative Hypothesis H1: $\mu \neq 70$ years (Two tailed test)
(iii) Level of significance: $\alpha = 0.05$ (assumption)
(iv) Test statistic: $Z = \dfrac{\bar{x} - \mu}{\left(\dfrac{s}{\sqrt{n}}\right)} = \dfrac{71.8 - 70}{\left(\dfrac{8.9}{\sqrt{100}}\right)} = 2.02$
|Z| = 2.02
(v) Critical value: |Z0.05| = 1.96
(vi) Decision: Since |Z| > |Z0.05|, the null hypothesis is rejected at 5% level of
significance, i.e., the average life span of an Indian is not 70 years.
Example 2
A random sample of 50 items gives the mean 6.2 and variance 10.24.
Can it be regarded as drawn from a normal population with mean 5.4 at
5% level of significance?
6.7 Test of Significance for Single Mean — Large Samples 6.23
Solution
n = 50, $\bar{x} = 6.2$, $\mu = 5.4$, $s^2 = 10.24$, so $s = \sqrt{10.24} = 3.2$
(i) Null Hypothesis H0: $\mu = 5.4$, i.e., the sample is drawn from a normal population
with mean 5.4.
(ii) Alternative Hypothesis H1: $\mu \neq 5.4$ (Two tailed test)
(iii) Level of significance: $\alpha = 0.05$
(iv) Test statistic: $Z = \dfrac{\bar{x} - \mu}{\left(\dfrac{s}{\sqrt{n}}\right)} = \dfrac{6.2 - 5.4}{\left(\dfrac{\sqrt{10.24}}{\sqrt{50}}\right)} = 1.77$
|Z| = 1.77
(v) Critical value: |Z0.05| = 1.96
(vi) Decision: Since |Z| < |Z0.05|, the null hypothesis is accepted at 5% level of
significance, i.e., the sample is drawn from a normal population with mean 5.4.
Example 3
A random sample of 400 members is found to have a mean of 4.45 cm.
Can it be reasonably regarded as a sample from a large population
whose mean is 5 cm and variance is 4 cm²?
Solution
n = 400, $\bar{x} = 4.45$ cm, $\mu = 5$ cm, $\sigma = \sqrt{4} = 2$ cm
(i) Null Hypothesis H0: $\mu = 5$ cm, i.e., the sample is drawn from a large population
with mean 5 cm.
(ii) Alternative Hypothesis H1: $\mu \neq 5$ cm (Two tailed test)
(iii) Level of significance: $\alpha = 0.05$ (assumption)
(iv) Test statistic: $Z = \dfrac{\bar{x} - \mu}{\left(\dfrac{\sigma}{\sqrt{n}}\right)} = \dfrac{4.45 - 5}{\left(\dfrac{2}{\sqrt{400}}\right)} = -5.5$
|Z| = 5.5
(v) Critical value: |Z0.05| = 1.96
(vi) Decision: Since |Z| > |Z0.05|, the null hypothesis is rejected at 5% level of
significance, i.e., the sample is not drawn from the large population with mean 5 cm.
Example 4
A sample of 900 members has a mean of 3.4 cm and SD 2.61 cm. Is the
sample from a large population of mean 3.25 cm and SD 2.61 cm? If
the population is normal and its mean is unknown, find the 95% fiducial
limits of its true mean.
Solution
n = 900, $\bar{x} = 3.4$ cm, s = 2.61 cm, $\mu = 3.25$ cm, $\sigma = 2.61$ cm
(i) Null Hypothesis H0: $\mu = 3.25$ cm, i.e., the sample has been drawn from the
population with mean $\mu = 3.25$ cm and SD = 2.61 cm.
(ii) Alternative Hypothesis H1: $\mu \neq 3.25$ cm (Two tailed test)
(iii) Level of significance: $\alpha = 0.05$
(iv) Test statistic: $Z = \dfrac{\bar{x} - \mu}{\left(\dfrac{\sigma}{\sqrt{n}}\right)} = \dfrac{3.4 - 3.25}{\left(\dfrac{2.61}{\sqrt{900}}\right)} = 1.72$
|Z| = 1.72
(v) Critical value: |Z0.05| = 1.96
(vi) Decision: Since |Z| < |Z0.05|, the null hypothesis is accepted at 5% level of
significance, i.e., the sample has been drawn from the population with mean
$\mu = 3.25$ cm.
95% fiducial limits:
$\bar{x} \pm 1.96\dfrac{s}{\sqrt{n}} = 3.4 \pm 1.96\dfrac{2.61}{\sqrt{900}} = 3.4 \pm 0.1705$,
i.e., 3.5705 and 3.2295
Example 5
A tyre company claims that the lives of tyres have mean 42000 km
with s.d. of 4000 km. A change in the production process is believed to
result in better product. A test sample of 81 new tyres has a mean life of
42500 km. Test at 5% level of significance that the new product is
significantly better than the old one.
Solution
n = 81, $\bar{x} = 42500$ km, $\mu = 42000$ km, $\sigma = 4000$ km
(i) Null Hypothesis H0: $\mu = 42000$ km, i.e., the new product is not significantly
better than the old one.
(ii) Alternative Hypothesis H1: $\mu > 42000$ km (Right tailed test)
(iii) Level of significance: $\alpha = 0.05$
(iv) Test statistic: $Z = \dfrac{\bar{x} - \mu}{\left(\dfrac{\sigma}{\sqrt{n}}\right)} = \dfrac{42500 - 42000}{\left(\dfrac{4000}{\sqrt{81}}\right)} = 1.125$
|Z| = 1.125
(v) Critical value: |Z0.05| = 1.645
(vi) Decision: Since |Z| < |Z0.05|, the null hypothesis is accepted at 5% level of
significance, i.e., the new product is not significantly better than the old one.
Example 6
The mean breaking strength of cables supplied by a manufacturer is 1800
with standard deviation 100. By a new technique in the manufacturing
process it is claimed that the breaking strength of the cable has increased.
In order to test the claim a sample of 50 cables is tested. It is found that
the mean breaking strength is 1850. Can we support the claim at 1%
level of significance?
Solution
n = 50, $\bar{x} = 1850$, $\mu = 1800$, $\sigma = 100$
(i) Null Hypothesis H0: $\mu = 1800$, i.e., the mean breaking strength of cables
supplied by the manufacturer is 1800.
(ii) Alternative Hypothesis H1: $\mu > 1800$ (Right tailed test)
(iii) Level of significance: $\alpha = 0.01$
(iv) Test statistic: $Z = \dfrac{\bar{x} - \mu}{\left(\dfrac{\sigma}{\sqrt{n}}\right)} = \dfrac{1850 - 1800}{\left(\dfrac{100}{\sqrt{50}}\right)} = 3.54$
|Z| = 3.54
(v) Critical value: |Z0.01| = 2.33
(vi) Decision: Since |Z| > |Z0.01|, the null hypothesis is rejected at 1% level of
significance, i.e., the mean breaking strength of cables supplied is more than
1800.
Example 7
An ambulance service claims that it takes on the average 10 minutes
to reach its destination in emergency calls. A sample of 36 calls has a
|Z| = 1.5
(v) Critical value: |Z0.05| = 1.645
(vi) Decision: Since |Z| < |Z0.05|, the null hypothesis is accepted at 5% level of
significance, i.e., the ambulance service takes on the average 10 minutes to reach its
destination.
Let $\bar{x}_1$ and $\bar{x}_2$ be the sample means of two independent large random samples with sizes
n1 and n2 (n1 > 30, n2 > 30) drawn from two populations with means $\mu_1$ and $\mu_2$ and
standard deviations $\sigma_1$ and $\sigma_2$.
Working Rule
(i) Null Hypothesis H0: $\mu_1 = \mu_2$, i.e., the two samples have been drawn from
two different populations having the same means and equal standard deviations.
(ii) Alternative Hypothesis H1: $\mu_1 \neq \mu_2$ (two tailed test)
or H1: $\mu_1 < \mu_2$ (one tailed test)
or H1: $\mu_1 > \mu_2$ (one tailed test)
(iii) Level of significance: Select the level of significance $\alpha$.
(iv) Test statistic: There are two cases for calculating test statistic.
(a) When the population standard deviations $\sigma_1$ and $\sigma_2$ are known
$Z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
6.8 Test of Significance for Difference of Means — Large Samples 6.27
(b) When the population standard deviations $\sigma_1$ and $\sigma_2$ are not known
$Z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
where s1 and s2 are sample standard deviations.
(v) Critical value: Find the critical value (tabulated value) $Z_\alpha$ of Z at the given
level of significance.
(vi) Decision: If $|Z| < Z_\alpha$ at the level of significance $\alpha$, the null hypothesis is
accepted. If $|Z| > Z_\alpha$ at the level of significance $\alpha$, the null hypothesis is
rejected.
Note:
1. Null Hypothesis H0 is rejected when $|Z| > 3$ without mentioning any level of
significance.
2. Confidence limits:
(i) 95% confidence limits = $(\bar{x}_1 - \bar{x}_2) \pm 1.96\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
(ii) 99% confidence limits = $(\bar{x}_1 - \bar{x}_2) \pm 2.58\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
If the population standard deviations $\sigma_1$ and $\sigma_2$ are not known, s1 and s2 are used in the
equations.
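The test statistic for the difference of two means can be sketched in Python as follows. The function name `two_mean_z` is an assumption made for illustration, and the critical value is taken from the standard library's `statistics.NormalDist` in place of the table; the data are those of Example 4 of this section (32 boys with mean 72 and SD 8, 36 girls with mean 70 and SD 6, right tailed test at the 1% level).

```python
import math
from statistics import NormalDist


def two_mean_z(x1, sd1, n1, x2, sd2, n2):
    """Z = (x1 - x2) / sqrt(sd1^2/n1 + sd2^2/n2).

    sd1, sd2 are the population SDs if known, otherwise the sample SDs.
    """
    return (x1 - x2) / math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


z = two_mean_z(72, 8, 32, 70, 6, 36)
z_crit = NormalDist().inv_cdf(1 - 0.01)   # right tailed test at alpha = 0.01
decision = "rejected" if z > z_crit else "accepted"
print(f"Z = {z:.4f}, critical value = {z_crit:.2f}, H0 is {decision}")
```

For a two tailed test, |Z| would instead be compared with `NormalDist().inv_cdf(1 - alpha / 2)`.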
Example 1
Test the significance of the difference between the means of two normal
population with the same standard deviation from the following data:
Size Mean SD
Sample I 100 64 6
Sample II 200 67 8
Solution
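n1 = 100, n2 = 200, $\bar{x}_1 = 64$, $\bar{x}_2 = 67$, s1 = 6, s2 = 8
(i) Null Hypothesis H0: $\mu_1 = \mu_2$, i.e., there is no significant difference between
two means.
(ii) Alternative Hypothesis H1: $\mu_1 \neq \mu_2$ (Two tailed test)
(iii) Level of significance: $\alpha = 0.05$ (assumption)
(iv) Test statistic: Since the two populations have the same standard deviation, the common
variance is estimated by pooling, $\dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}$, which gives
$Z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_2} + \dfrac{s_2^2}{n_1}}} = \dfrac{64 - 67}{\sqrt{\dfrac{(6)^2}{200} + \dfrac{(8)^2}{100}}} = -3.31$
|Z| = 3.31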
(v) Critical value: |Z0.05| = 1.96
(vi) Decision: Since |Z| > |Z0.05|, the null hypothesis is rejected at 5% level of
significance, i.e., the samples do not support the hypothesis that the two
population have the same mean although they may have the same standard
deviation.
Example 2
The means of simple samples of sizes 1000 and 2000 are 67.5 and 68 cm
respectively. Can the samples be regarded as drawn from the same
population of S.D. 2.5 cm.
Solution
n1 = 1000, n2 = 2000, $\bar{x}_1 = 67.5$ cm, $\bar{x}_2 = 68$ cm, $\sigma = 2.5$ cm
(i) Null Hypothesis H0: $\mu_1 = \mu_2$, i.e., the samples have been drawn from the
same population of S.D. 2.5 cm.
(ii) Alternative Hypothesis H1: $\mu_1 \neq \mu_2$ (Two tailed test)
(iii) Level of significance: $\alpha = 0.05$ (assumption)
(iv) Test statistic: $Z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma^2}{n_1} + \dfrac{\sigma^2}{n_2}}} = \dfrac{67.5 - 68}{\sqrt{\dfrac{(2.5)^2}{1000} + \dfrac{(2.5)^2}{2000}}} = -5.16$
|Z| = 5.16
(v) Critical value: |Z0.05| = 1.96
(vi) Decision: Since |Z| > |Z0.05|, the null hypothesis is rejected at 5% level of
significance, i.e., the samples cannot be regarded as drawn from the same
population of SD 2.5 cm.
Example 3
The mean life of a sample of 10 electric bulbs was found to be 1456
hours with SD of 423 hours. A second sample of 17 bulbs chosen from
a different batch showed a mean life of 1280 with SD of 398 hours. Is
there a significant difference between the means of two batches?
Solution
n1 = 10, n2 = 17, $\bar{x}_1 = 1456$ hours, $\bar{x}_2 = 1280$ hours, s1 = 423 hours, s2 = 398 hours
(i) Null Hypothesis H0: $\mu_1 = \mu_2$, i.e., there is no significant difference between
the means of two batches.
(ii) Alternative Hypothesis H1: $\mu_1 \neq \mu_2$ (Two tailed test)
(iii) Level of significance: $\alpha = 0.05$ (assumption)
(iv) Test statistic: $Z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{1456 - 1280}{\sqrt{\dfrac{(423)^2}{10} + \dfrac{(398)^2}{17}}} = 1.07$
|Z| = 1.07
(v) Critical value: |Z0.05| = 1.96
(vi) Decision: Since |Z| < |Z0.05|, the null hypothesis is accepted at 5% level of
significance, i.e., there is no significant difference between the means of
two batches.
Example 4
The average of marks scored by 32 boys is 72 with standard deviation 8
while that of 36 girls is 70 with standard deviation 6. Test at 1% level of
significance whether the boys perform better than the girls.
Solution
n1 = 32, n2 = 36, $\bar{x}_1 = 72$, $\bar{x}_2 = 70$, s1 = 8, s2 = 6
(i) Null Hypothesis H0: $\mu_1 = \mu_2$, i.e., there is no significant difference between the
performance of boys and girls.
(ii) Alternative Hypothesis H1: $\mu_1 > \mu_2$ (Right tailed test)
(iii) Level of significance: $\alpha = 0.01$
(iv) Test statistic: $Z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{72 - 70}{\sqrt{\dfrac{(8)^2}{32} + \dfrac{(6)^2}{36}}} = 1.1547$
|Z| = 1.1547
(v) Critical value: |Z0.01| = 2.33
(vi) Decision: Since |Z| < |Z0.01|, the null hypothesis is accepted at 1% level of
significance, i.e., the boys do not perform better than the girls.
Example 5
A simple sample of heights of 6400 English men has a mean of 170 cm
and a s.d. of 6.4 cm, while a simple sample of heights of 1600 Americans
has a mean of 172 cm and a s.d. of 6.3 cm. Do the data indicate that
Americans are, on the average, taller than the English men?
Solution
n1 = 1600, n2 = 6400, $\bar{x}_1 = 172$ cm, $\bar{x}_2 = 170$ cm, s1 = 6.3 cm, s2 = 6.4 cm
(i) Null Hypothesis H0: $\mu_1 = \mu_2$, i.e., there is no significant difference in heights of
Americans and English men.
(ii) Alternative Hypothesis H1: $\mu_1 > \mu_2$ (Right tailed test)
(iii) Level of significance: $\alpha = 0.01$ (assumption)
(iv) Test statistic: $Z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{172 - 170}{\sqrt{\dfrac{(6.3)^2}{1600} + \dfrac{(6.4)^2}{6400}}} = 11.32$
|Z| = 11.32
(v) Critical value: |Z0.01| = 2.33
(vi) Decision: Since |Z| > |Z0.01|, the null hypothesis is rejected at 1% level of
significance, i.e., Americans are, on the average, taller than English men.
Example 6
In a certain factory there are two different processes of manufacturing
the same item. The average weight in a sample of 250 items produced
from one process is found to be 120 gm with a s.d. of 12 gm; the
corresponding figures in a sample of 400 items from the other process
are 124 gm and 14 gm. Is this difference between the two sample means
significant?
Solution
n1 = 250, n2 = 400, $\bar{x}_1 = 120$ gm, $\bar{x}_2 = 124$ gm, s1 = 12 gm, s2 = 14 gm
(i) Null Hypothesis H0: $\mu_1 = \mu_2$, i.e., there is no significant difference between the
two sample means.
(ii) Alternative Hypothesis H1: $\mu_1 \neq \mu_2$ (Two tailed test)
(iii) Level of significance: $\alpha = 0.05$ (assumption)
(iv) Test statistic: $Z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{120 - 124}{\sqrt{\dfrac{(12)^2}{250} + \dfrac{(14)^2}{400}}} = -3.87$
|Z| = 3.87
(v) Critical value: |Z0.05| = 1.96
(vi) Decision: Since |Z| > |Z0.05|, the null hypothesis is rejected at 5% level of
significance, i.e., there is a significant difference between the two sample means.
6.9 Test of Significance for Difference of Standard Deviations — Large Samples 6.31
Example 7
The mean height of 50 male students who participate in sports is 68.2
inches with a s.d. of 2.5 inches. The mean height of 50 male students who
have not participated in sport is 67.2 inches with a s.d. of 2.8 inches.
Test the hypothesis that the height of students who have participated in
sports is more than the students who have not participated in sports.
Solution
n1 = 50, n2 = 50, $\bar{x}_1 = 68.2$ inch, $\bar{x}_2 = 67.2$ inch, s1 = 2.5 inch, s2 = 2.8 inch
(i) Null Hypothesis H0: $\mu_1 = \mu_2$, i.e., there is no significant difference in heights of
students who have participated in sports or not.
(ii) Alternative Hypothesis H1: $\mu_1 > \mu_2$ (Right tailed test)
(iii) Level of significance: $\alpha = 0.05$ (assumption)
(iv) Test statistic: $Z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{68.2 - 67.2}{\sqrt{\dfrac{(2.5)^2}{50} + \dfrac{(2.8)^2}{50}}} = 1.88$
|Z| = 1.88
(v) Critical value: |Z0.05| = 1.645
(vi) Decision: Since |Z| > |Z0.05|, the null hypothesis is rejected at 5% level of
significance, i.e., the height of students who have participated in sports is more
than that of the students who have not participated in sports.
Let s1 and s2 be the standard deviations of two independent large random samples with
sizes n1 and n2 (n1 > 30, n2 > 30) drawn from two populations with standard deviations
$\sigma_1$ and $\sigma_2$.
Working Rule
(i) Null Hypothesis H0: $\sigma_1 = \sigma_2$, i.e., the two samples have been drawn from two
different populations having the same standard deviation.
(ii) Alternative Hypothesis H1: $\sigma_1 \neq \sigma_2$ (Two tailed test)
or H1: $\sigma_1 < \sigma_2$ (One tailed test)
or H1: $\sigma_1 > \sigma_2$ (One tailed test)
(iii) Level of significance: Select the level of significance $\alpha$.
(iv) Test statistic: There are two cases for calculating test statistic.
(a) When the population standard deviations $\sigma_1$ and $\sigma_2$ are known
$Z = \dfrac{s_1 - s_2}{\sqrt{\dfrac{\sigma_1^2}{2n_1} + \dfrac{\sigma_2^2}{2n_2}}}$
(b) When the population standard deviations $\sigma_1$ and $\sigma_2$ are not known
$Z = \dfrac{s_1 - s_2}{\sqrt{\dfrac{s_1^2}{2n_1} + \dfrac{s_2^2}{2n_2}}}$
where s1 and s2 are the sample standard deviations.
(v) Critical value: Find the critical value (tabulated value) $Z_\alpha$ of Z at the given level
of significance.
(vi) Decision: If $|Z| < Z_\alpha$ at the level of significance $\alpha$, the null hypothesis is
accepted. If $|Z| > Z_\alpha$ at the level of significance $\alpha$, the null hypothesis is
rejected.
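Case (b) of the working rule can be sketched in Python as below. The function name `two_sd_z` is an assumption made for illustration; the data are those of Example 3 of this section (sample sizes 100 and 200 with SDs 5 and 7).

```python
import math


def two_sd_z(s1, n1, s2, n2):
    """Z = (s1 - s2) / sqrt(s1^2/(2 n1) + s2^2/(2 n2)) for large samples."""
    se = math.sqrt(s1 ** 2 / (2 * n1) + s2 ** 2 / (2 * n2))
    return (s1 - s2) / se


z = two_sd_z(5, 100, 7, 200)
# Two tailed test at the 5% level: compare |Z| with 1.96
decision = "rejected" if abs(z) > 1.96 else "accepted"
print(f"Z = {z:.2f}, H0 is {decision}")
```

When the population standard deviations are known, the same function can be adapted by using them in place of s1 and s2 inside the square root only, as in case (a).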
Example 1
The SD of a random sample of 1000 is found to be 2.6 and the SD
of another random sample of 500 is 2.7. Assuming the samples to
be independent, find whether the two samples could have come from
populations with the same SD.
Solution
n1 = 1000, n2 = 500, s1 = 2.6, s2 = 2.7
(i) Null Hypothesis H0: $\sigma_1 = \sigma_2$, i.e., there is no significant difference between
two standard deviations.
(ii) Alternative Hypothesis H1: $\sigma_1 \neq \sigma_2$ (Two tailed test)
(iii) Level of significance: $\alpha = 0.05$ (assumption)
(iv) Test statistic: $Z = \dfrac{s_1 - s_2}{\sqrt{\dfrac{s_1^2}{2n_1} + \dfrac{s_2^2}{2n_2}}} = \dfrac{2.6 - 2.7}{\sqrt{\dfrac{(2.6)^2}{2(1000)} + \dfrac{(2.7)^2}{2(500)}}} = -0.97$
|Z| = 0.97
(v) Critical value: |Z0.05| = 1.96
(vi) Decision: Since |Z| < |Z0.05|, the null hypothesis H0 is accepted at 5% level
of significance, i.e., there is no significant difference between the two standard
deviations and the two samples could have come from populations with the
same SD.
Example 2
Random samples drawn from two countries gave the following data
relating to the heights of adult males:
Country A Country B
Standard deviation (in inches) 2.58 2.50
Number in samples 1000 1200
(i) Null Hypothesis H0: $\sigma_1 = \sigma_2$, i.e., there is no significant difference between
two standard deviations.
(ii) Alternative Hypothesis H1: $\sigma_1 \neq \sigma_2$ (Two tailed test)
(iii) Level of significance: $\alpha = 0.05$ (assumption)
(iv) Test statistic: $Z = \dfrac{s_1 - s_2}{\sqrt{\dfrac{s_1^2}{2n_1} + \dfrac{s_2^2}{2n_2}}} = \dfrac{2.58 - 2.50}{\sqrt{\dfrac{(2.58)^2}{2(1000)} + \dfrac{(2.50)^2}{2(1200)}}} = \dfrac{0.08}{0.077} = 1.04$
|Z| = 1.04
(v) Critical value: |Z0.05| = 1.96
(vi) Decision: Since |Z| < |Z0.05|, the null hypothesis is accepted at 5% level of
significance, i.e., there is no significant difference between the standard
deviations.
Example 3
Examine whether the two samples for which the data are given in the
following table could have been drawn from populations with the same
SD.
Size SD
Sample I 100 5
Sample II 200 7
Solution
n1 = 100, n2 = 200, s1 = 5, s2 = 7
(i) Null Hypothesis H0: $\sigma_1 = \sigma_2$, i.e., the two samples could have been drawn from
populations with the same SD.
(ii) Alternative Hypothesis H1: $\sigma_1 \neq \sigma_2$ (Two tailed test)
(iii) Level of significance: $\alpha = 0.05$ (assumption)
(iv) Test statistic: $Z = \dfrac{s_1 - s_2}{\sqrt{\dfrac{s_1^2}{2n_1} + \dfrac{s_2^2}{2n_2}}} = \dfrac{5 - 7}{\sqrt{\dfrac{(5)^2}{2(100)} + \dfrac{(7)^2}{2(200)}}} = -4.02$
|Z| = 4.02
(v) Critical value: |Z0.05| = 1.96
(vi) Decision: Since |Z| > |Z0.05|, the null hypothesis is rejected at 5% level of
significance, i.e., the two samples could not have been drawn from populations
with the same SD.
Exercise 6.2
1. A random sample of 100 students gave a mean weight of 58 kg with a
SD of 4 kg. Test the hypothesis that the mean weight in the population
is 60 kg.
[Ans.: The mean weight in the population is not 60 kg]
2. A sample of 400 items is taken from a normal population whose mean
is 4 and whose variance is also 4. If the sample mean is 4.45, can the
sample be regarded as truly random sample?
[Ans.: Sample cannot be regarded as truly random sample]
3. The mean IQ of a sample of 1600 children was 99. Is it likely that this
was a random sample from a population with mean IQ 100 and SD 15?
[Ans.: Sample was not drawn from a
population with mean 100 and SD 15]
4. In a random sample of 60 workers, the average time taken by them to
get to work is 33.8 minutes with a standard deviation of 6.1 minutes. Can
we reject the null hypothesis $\mu = 32.6$ minutes in favour of alternative
hypothesis $\mu > 32.6$ at $\alpha = 0.025$ level of significance?
[Ans.: The null hypothesis is accepted]
5. It is claimed that a random sample of 49 tyres has a mean life of 15200
km. This sample was drawn from a population whose mean is 15150 km
and a standard deviation of 1200 km. Test the significance at 0.05 level.
[Ans.: The null hypothesis is accepted]
Test at 5% level of significance that the mean height is the same for
children at two places.
[Ans.: The mean height is same for children at two places]
10. The mean life of a sample of 10 electric bulbs was found to be 1456
hours with SD of 423 hours. A second sample of 17 bulbs chosen from a
different batch showed a mean life of 1280 hours with SD of 398 hours.
Is there a significant difference between the means of two batches?
[Ans.: There is no difference between
the mean life of two batches]
11. The SD of a random sample of 900 members is 4.6 and that of another
independent sample of 1600 members is 4.8. Examine if the two samples
could have been drawn from a population with SD 4?
[Ans.: Two samples could have been drawn
from a population with SD 4]
If the samples are large (n > 30) then the sampling distribution of a statistic is
approximately normal. But if the samples are small (n ≤ 30) then the above result does
not hold good. For estimation of the parameter as well as for testing a hypothesis, the
following distributions are used:
(i) Student's t-distribution
(ii) Snedecor's F-distribution
(iii) Chi-square (χ²) distribution
The theory of small or exact samples was developed by the statistician William S.
Gosset, who wrote under the pen name "Student". The quantity t is defined as
\[ t = \frac{\text{Difference of population parameter and the corresponding statistic}}{\text{Standard error of statistic}} \]
For a sample mean, the test statistic
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n-1}}, \qquad s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} \]
where x̄ is the sample mean and s is the sample standard deviation (so that ns²/(n − 1)
is an unbiased estimate of σ²), is a random variable having the t-distribution with
ν = n − 1 degrees of freedom and with probability density function
\[ f(t) = c\left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu + 1}{2}} \]
where ν = n − 1 and c is a constant required to make the area under the curve unity, i.e.,
\[ \int_{-\infty}^{\infty} f(t)\,dt = 1. \]
6.12 t-test: Test of Significance for Single Mean
The t-distribution is used when (i) the sample size is less than or equal to 30, and
(ii) population standard deviation is not known.
If x1, x2, …, xn is a random sample of size n (n ≤ 30) drawn from a normal population
with mean μ and SD σ, and it is required to test whether the sample mean x̄ differs
significantly from the population mean μ, then Student's t statistic is given by
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n-1}}, \quad \text{where } s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} \text{ with } \nu = n - 1 \]
Note: Confidence Limits
(i) 95% confidence limits:
\[ \bar{x} \pm t_{0.05}\left(\frac{s}{\sqrt{n-1}}\right) \]
where t0.05 is the 5% critical value of t for ν = n − 1 degrees of freedom for a
two tailed test.
(ii) 99% confidence limits:
\[ \bar{x} \pm t_{0.01}\left(\frac{s}{\sqrt{n-1}}\right) \]
where t0.01 is the 1% critical value of t for ν = n − 1 degrees of freedom for a
two tailed test.
Example 1
A machinist is making engine parts with axle diameter of 0.7 cm. A
random sample of 10 parts shows a mean diameter of 0.742 cm with a
standard deviation of 0.04 cm. Compute the statistic you would use to
test whether work is meeting the specification at 0.05 level of signifi-
cance.
Solution
n = 10, x̄ = 0.742 cm, s = 0.04 cm, μ = 0.7 cm
(i) Null Hypothesis H0: μ = 0.7 cm, i.e., the product is meeting the specification.
(ii) Alternative Hypothesis H1: μ ≠ 0.7 cm (Two tailed test)
(iii) Level of significance: α = 0.05
(iv) Test statistic:
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n-1}} = \frac{0.742 - 0.7}{0.04/\sqrt{10-1}} = 3.15 \]
|t| = 3.15
(v) Critical value: ν = n − 1 = 10 − 1 = 9
t0.05 (ν = 9) = 2.262
(vi) Decision: Since |t| > t0.05, the null hypothesis is rejected at 5% level of
significance, i.e., the product is not meeting the specification.
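As a quick check of the working above, the statistic can be computed directly. This is a minimal sketch that assumes the text's convention (sample SD s computed with divisor n, standard error s/√(n − 1)); the function name `t_single_mean` is illustrative, and the critical value is copied from the t-table rather than computed.

```python
import math

def t_single_mean(sample_mean: float, mu: float, s: float, n: int) -> float:
    """Student's t for a single mean; s is the sample SD with divisor n."""
    return (sample_mean - mu) / (s / math.sqrt(n - 1))

t = t_single_mean(sample_mean=0.742, mu=0.7, s=0.04, n=10)
T_CRITICAL = 2.262  # two-tailed 5% value for 9 degrees of freedom (t-table)
print(round(t, 2), "reject H0" if abs(t) > T_CRITICAL else "accept H0")
```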
Example 2
Ten objects are chosen at random from a large population and their
weights are found to be in grams: 63, 63, 64, 65, 66, 69, 69, 70, 70, 71.
Discuss the suggestion that the mean weight is 65 g.
Solution
n = 10, μ = 65 g
x̄ = 67 g, s = 2.966 g (from calculator)
(i) Null Hypothesis H0: μ = 65 g, i.e., there is no significant difference in the
mean weight of sample and population.
(ii) Alternative Hypothesis H1: μ ≠ 65 g (Two tailed test)
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic:
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n-1}} = \frac{67 - 65}{2.966/\sqrt{10-1}} = 2.023 \]
|t| = 2.023
(v) Critical value: ν = n − 1 = 10 − 1 = 9
t0.05 (ν = 9) = 2.262
(vi) Decision: Since |t| < t0.05, the null hypothesis is accepted at 5% level of
significance, i.e., the mean weight is 65 g.
Example 3
The mean lifetime of a sample of 25 bulbs is found as 1550 hours with a
SD of 120 hours. The company manufacturing the bulbs claims that the
average life of their bulbs is 1600 hours. Is the claim acceptable at 5%
level of significance?
Solution
|t| = 2.04
(v) Critical value: ν = n − 1 = 25 − 1 = 24
t0.05 (ν = 24) = 1.711
(vi) Decision: Since |t| > t0.05, the null hypothesis is rejected at 5% level of
significance, i.e., the average life of bulbs is less than 1600 hours and the claim is
unacceptable.
Example 4
A soap manufacturing company was distributing a particular brand of
soap through a large number of retail shops. Before a heavy advertisement
campaign, the mean sales per week per shop was 140 dozens. After the
campaign, a sample of 26 shops was taken and the mean sales was
found to be 147 dozens with standard deviation 16. Can you consider
the advertisement effective?
Solution
n = 26, x̄ = 147 dozens, s = 16 dozens, μ = 140 dozens
(i) Null Hypothesis H0: μ = 140 dozens, i.e., the advertisement is not effective.
(ii) Alternative Hypothesis H1: μ > 140 dozens (One tailed test)
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic:
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n-1}} = \frac{147 - 140}{16/\sqrt{26-1}} = 2.1875 \]
|t| = 2.1875
(v) Critical value: ν = n − 1 = 26 − 1 = 25
t0.05 (ν = 25) = 1.708
(vi) Decision: Since |t| > t0.05, the null hypothesis is rejected at 5% level of
significance, i.e., the advertisement is effective.
Example 5
A random sample of size 16 from a normal population showed a mean of
103.75 cm and sum of squares of deviations from the mean 843.75 cm2.
Can we say that the population has a mean of 108.75 cm?
Solution
n = 16, x̄ = 103.75 cm, Σ(x − x̄)² = 843.75 cm², μ = 108.75 cm
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} = \sqrt{\frac{843.75}{16}} = 7.26 \text{ cm} \]
(i) Null Hypothesis H0: μ = 108.75 cm, i.e., the population has a mean of 108.75 cm.
(ii) Alternative Hypothesis H1: μ ≠ 108.75 cm (Two tailed test)
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic:
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n-1}} = \frac{103.75 - 108.75}{7.26/\sqrt{16-1}} = -2.67 \]
|t| = 2.67
(v) Critical value: ν = n − 1 = 16 − 1 = 15
t0.05 (ν = 15) = 2.132
(vi) Decision: Since |t| > t0.05, the null hypothesis is rejected at 5% level of
significance, i.e., the population does not have a mean of 108.75 cm.
Example 6
A random sample of 10 boys had the following IQs:
70, 120, 110, 101, 88, 83, 95, 98, 107 and 100.
(a) Do these data support the assumption of a population mean IQ of
100?
(b) Find 95% confidence limits for the mean IQ.
Solution
n = 10
x̄ = 97.2, s = 13.54 (from calculator)
(i) Null Hypothesis H0: μ = 100, i.e., the population has mean IQ of 100.
(ii) Alternative Hypothesis H1: μ ≠ 100 (Two tailed test)
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic:
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n-1}} = \frac{97.2 - 100}{13.54/\sqrt{10-1}} = -0.62 \]
|t| = 0.62
Example 7
The heights of 10 males of a given locality are found to be 175, 168, 155,
170, 152, 170, 175, 160, 160 and 165 cm. Based on this sample, find the
95% confidence limits for the heights of males in that locality.
Solution
n = 10
x̄ = 165, s = 7.6 (from calculator)
ν = n − 1 = 10 − 1 = 9
From t-table
t0.05 (ν = 9) = 2.262 (Two tailed test)
The 95% confidence limits for μ are
\[ \left[\bar{x} - t_{0.05}\left(\frac{s}{\sqrt{n-1}}\right),\ \bar{x} + t_{0.05}\left(\frac{s}{\sqrt{n-1}}\right)\right] \]
\[ \text{i.e., } \left[165 - \frac{2.262(7.6)}{\sqrt{10-1}},\ 165 + \frac{2.262(7.6)}{\sqrt{10-1}}\right] \]
i.e., [159.27, 170.73]
i.e., the mean height of males in the locality is likely to lie between 159.27 cm and 170.73 cm.
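The confidence limits can also be computed from the raw heights. The sketch below is illustrative only: the function name `confidence_limits` is an assumption, the critical value 2.262 is read from the t-table, and `statistics.pstdev` is used because it divides by n, matching the text's definition of s.

```python
import math
import statistics

heights = [175, 168, 155, 170, 152, 170, 175, 160, 160, 165]

def confidence_limits(data, t_critical):
    """Return (lower, upper) limits for the population mean."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.pstdev(data)  # sample SD with divisor n, as in the text
    margin = t_critical * s / math.sqrt(n - 1)
    return mean - margin, mean + margin

lower, upper = confidence_limits(heights, t_critical=2.262)
print(round(lower, 2), round(upper, 2))
```

Note that s/√(n − 1) with divisor-n s equals the more familiar S/√n with divisor-(n − 1) S, so both conventions give the same limits.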
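normal population with means μ1 and μ2 and same standard deviations. Student's
t statistic is given by
\[ t = \frac{\bar{x} - \bar{y}}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \text{ with } \nu = n_1 + n_2 - 2 \]
where
\[ \bar{x} = \frac{\sum x}{n_1}, \qquad \bar{y} = \frac{\sum y}{n_2} \]
and
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2}{n_1 + n_2 - 2}} \]
In terms of standard deviations s1 and s2,
\[ t = \frac{\bar{x} - \bar{y}}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \]
where
\[ s = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} \]
and
\[ s_1 = \sqrt{\frac{\sum (x - \bar{x})^2}{n_1}}, \qquad s_2 = \sqrt{\frac{\sum (y - \bar{y})^2}{n_2}} \]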
Note
1. If n1 = n2 = n and the samples are independent, i.e., the observations in the two
samples are not at all related, then the test statistic is given by
\[ t = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_1^2 + s_2^2}{n - 1}}} \text{ with } \nu = 2n - 2 \]
2. If n1 = n2 = n and if the pairs of values of x and y are associated or correlated
in some way (or not independent), the above formula for testing of hypothesis
cannot be used.
Let di = xi − yi denote the difference (with proper sign) in the values of x and y
for the ith pair (i = 1, 2, ..., n).
The test statistic is given by
\[ t = \frac{\bar{d}}{s/\sqrt{n-1}} \text{ with } \nu = n - 1 \]
where d̄ and s denote the mean and standard deviation of the differences di
respectively, i.e.,
\[ \bar{d} = \frac{\sum d_i}{n}, \qquad s = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n}} = \sqrt{\frac{\sum d_i^2}{n} - \left(\frac{\sum d_i}{n}\right)^2} \]
3. Confidence Limits
(i) 95% confidence limits:
\[ (\bar{x} - \bar{y}) \pm t_{0.05}\left(s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\right) \]
where t0.05 is the 5% critical value of t for ν = n1 + n2 − 2 degrees of freedom
for a two tailed test.
(ii) 99% confidence limits:
\[ (\bar{x} - \bar{y}) \pm t_{0.01}\left(s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\right) \]
where t0.01 is the 1% critical value of t for ν = n1 + n2 − 2 degrees of freedom
for a two tailed test.
Example 1
The means of two random samples of size 9 and 7 are 196.42 and 198.82
respectively. The sum of squares of the deviation from the mean are
26.94 and 18.73 respectively. Can the sample be considered to have
been drawn from the same population?
Solution
n1 = 9, n2 = 7, x̄ = 196.42, ȳ = 198.82
Σ(x − x̄)² = 26.94, Σ(y − ȳ)² = 18.73
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2}{n_1 + n_2 - 2}} = \sqrt{\frac{26.94 + 18.73}{9 + 7 - 2}} = 1.806 \]
(i) Null Hypothesis H0: μ1 = μ2, i.e., the samples are drawn from the same population.
(ii) Alternative Hypothesis H1: μ1 ≠ μ2 (Two tailed test)
6.13 t-test: Test of Significance for Difference of Means 6.45
Example 2
Samples of two types of electric bulbs were tested for length of life and
the following data were obtained.
Size Mean SD
Sample 1 8 1234 hr 36 hr
Sample 2 7 1036 hr 40 hr
Is the difference in the means sufficient to warrant that type 1 bulbs are
superior to type 2 bulbs?
Solution
n1 = 8, n2 = 7, x̄1 = 1234 hr, x̄2 = 1036 hr
s1 = 36 hr, s2 = 40 hr
(vi) Decision: Since t > t0.05 , the null hypothesis is rejected at 5% level of signifi-
cance, i.e., the type 1 bulbs are superior to type 2 bulbs.
Example 3
The mean height and SD height of 8 randomly chosen soldiers are
166.9 cm and 8.29 cm respectively. The corresponding values of 6
randomly chosen sailors are 170.3 cm and 8.50 cm respectively. Based
on this data, can we conclude that soldiers are, in general, shorter than
sailors?
Solution
n1 = 8, n2 = 6, x̄1 = 166.9 cm, x̄2 = 170.3 cm
s1 = 8.29 cm, s2 = 8.50 cm
Example 4
Two types of batteries are tested for their length of life and the following
data are obtained:
(i) Null Hypothesis H0: μ1 = μ2, i.e., there is no significant difference in two
means.
(ii) Alternative Hypothesis H1: μ1 ≠ μ2 (Two tailed test)
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic:
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{600 - 640}{12.22\sqrt{\dfrac{1}{9} + \dfrac{1}{8}}} = -6.74 \]
|t| = 6.74
(v) Critical value: ν = n1 + n2 − 2 = 9 + 8 − 2 = 15
t0.05 (ν = 15) = 2.132
(vi) Decision: Since |t| > t0.05, the null hypothesis is rejected at 5% level of
significance, i.e., there is significant difference in the two means.
95% confidence limits for (μ1 − μ2):
\[ (\bar{x}_1 - \bar{x}_2) \pm t_{0.05}\left(s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\right) = (600 - 640) \pm 2.132\left(12.22\sqrt{\frac{1}{9} + \frac{1}{8}}\right) = -40 \pm 12.66 \]
i.e., −27.34 and −52.66
Example 5
A group of 5 patients treated with medicine A weigh 42, 39, 48, 60 and
41 kg. Second group of 7 patients from the same hospital treated with
medicine B weigh 38, 42, 56, 64, 68, 69 and 62 kg. Do you agree with
the claim that medicine B increases the weight significantly?
Solution
n1 = 5, n2 = 7
x̄ = 46 kg, ȳ = 57 kg, s1 = 7.62 kg, s2 = 11.5 kg (from calculator)
Example 6
The following data represent the biological values of protein from cow’s
milk and buffalo’s milk at a certain level:
Cow’s milk 1.82 2.02 1.88 1.61 1.81 1.54
Buffalo’s milk 2.00 1.83 1.86 2.03 2.19 1.88
n = 6
x̄1 = 1.78, x̄2 = 1.965, s1 = 0.16, s2 = 0.124 (from calculator)
(i) Null Hypothesis H0: μ1 = μ2, i.e., there is no significant difference in average
values of proteins in two milk samples.
(ii) Alternative Hypothesis H1: μ1 ≠ μ2 (Two tailed test)
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic:
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2 + s_2^2}{n - 1}}} = \frac{1.78 - 1.965}{\sqrt{\dfrac{(0.16)^2 + (0.124)^2}{6 - 1}}} = -2.043 \]
|t| = 2.043
(v) Critical value: ν = 2n − 2 = 2(6) − 2 = 10
t0.05 (ν = 10) = 2.228
(vi) Decision: Since |t| < t0.05, the null hypothesis is accepted at 5% level of
significance, i.e., there is no significant difference in average values of proteins in
two milk samples.
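The equal-sample-size formula used here is easy to verify numerically. The following sketch takes the rounded summary values from the solution; the function name `t_equal_sizes` is illustrative, and the critical value is the tabulated figure quoted in step (v).

```python
import math

def t_equal_sizes(mean1: float, mean2: float, s1: float, s2: float, n: int) -> float:
    """t for two independent samples of equal size n (SDs use divisor n)."""
    return (mean1 - mean2) / math.sqrt((s1**2 + s2**2) / (n - 1))

t = t_equal_sizes(mean1=1.78, mean2=1.965, s1=0.16, s2=0.124, n=6)
T_CRITICAL = 2.228  # two-tailed 5% value for 2n - 2 = 10 degrees of freedom
print(round(t, 3), "reject H0" if abs(t) > T_CRITICAL else "accept H0")
```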
Example 7
A certain injection administered to 12-patients resulted in the following
changes of blood pressure:
5, 2, 8, –1, 3, 0, 6, –2, 1, 5, 0, 4
Can it be concluded that the injection will be in general accompanied by
an increase in blood pressure?
Solution
Here, ‘the changes’ d = x – y in blood pressure are given, i.e., x is the final blood pressure
after administering the injection and y is the initial blood pressure. It is required to test
whether the mean blood pressure has increased, i.e., μ1 is greater than μ2.
n = 12, Σdi = 31, Σdi² = 185
\[ \bar{d} = \frac{\sum d_i}{n} = \frac{31}{12} = 2.58 \]
\[ s = \sqrt{\frac{\sum d_i^2}{n} - \left(\frac{\sum d_i}{n}\right)^2} = \sqrt{\frac{185}{12} - \left(\frac{31}{12}\right)^2} = 2.96 \]
(i) Null Hypothesis H0: μ1 = μ2, i.e., mean blood pressure has not increased.
(ii) Alternative Hypothesis H1: μ1 > μ2 (One tailed test)
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic:
\[ t = \frac{\bar{d}}{s/\sqrt{n-1}} = \frac{2.58}{2.96/\sqrt{12-1}} = 2.89 \]
|t| = 2.89
(v) Critical value: ν = n − 1 = 12 − 1 = 11
t0.05 (ν = 11) = 1.796
(vi) Decision: Since |t| > t0.05, the null hypothesis is rejected, i.e., the injection will,
in general, be accompanied by an increase in blood pressure.
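The paired-difference calculation can be reproduced from the listed changes. This is a sketch under the text's conventions (SD of the differences with divisor n); the function name `paired_t` is illustrative. Without intermediate rounding the statistic comes out near 2.90 rather than the 2.89 obtained above from rounded values, which does not affect the decision.

```python
import math

changes = [5, 2, 8, -1, 3, 0, 6, -2, 1, 5, 0, 4]

def paired_t(differences):
    """t statistic for paired differences, with nu = n - 1."""
    n = len(differences)
    d_bar = sum(differences) / n
    # SD of the differences with divisor n: sqrt(mean of squares - square of mean)
    s = math.sqrt(sum(d * d for d in differences) / n - d_bar**2)
    return d_bar / (s / math.sqrt(n - 1))

t = paired_t(changes)
T_CRITICAL = 1.796  # one-tailed 5% value for 11 degrees of freedom (t-table)
print(round(t, 2), "reject H0" if t > T_CRITICAL else "accept H0")
```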
Example 8
Scores obtained in a shooting competition by 10 soldiers before and
after intensive training are given below:
Score before training 67 24 57 55 63 54 56 68 33 43
Score after training 70 38 58 58 56 67 68 75 42 38
\[ s = \sqrt{\frac{\sum d_i^2}{n} - \left(\frac{\sum d_i}{n}\right)^2} = \sqrt{\frac{732}{10} - \left(\frac{-50}{10}\right)^2} = 6.94 \]
(i) Null Hypothesis H0: μ1 = μ2, i.e., there is no significant effect of intensive
training.
(ii) Alternative Hypothesis H1: μ1 < μ2 (One tailed test)
(iii) Level of significance: α = 0.05
(iv) Test statistic:
\[ t = \frac{\bar{d}}{s/\sqrt{n-1}} = \frac{-5}{6.94/\sqrt{10-1}} = -2.16 \]
|t| = 2.16
(v) Critical value: ν = n − 1 = 10 − 1 = 9
t0.05 (ν = 9) = 1.833 (One tailed test)
(vi) Decision: Since |t| > t0.05, the null hypothesis is rejected at 5% level of
significance, i.e., intensive training is useful.
6.14 t-test: Test of Significance for Correlation Coefficients
Let (x1, y1), (x2, y2), …, (xn, yn) be n pairs of observations of a random sample from
a bivariate normal population and let r be the observed correlation coefficient in the
sample. It is required to test whether this sample correlation coefficient indicates any
significant correlation in the population, i.e., whether the value of the population
correlation coefficient ρ is zero and the observed value of r has arisen due to
fluctuation of sampling. Student's t statistic is given by
\[ t = \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} \text{ with } \nu = n - 2 \]
Example 1
A random sample of 18 pairs of observations from a bivariate normal
population gives a correlation coefficient of 0.3. Is it likely that vari-
ables are uncorrelated in the population?
Solution
n = 18, r = 0.3
(i) Null Hypothesis H0: ρ = 0, i.e., the variables are uncorrelated.
(ii) Alternative Hypothesis H1: ρ ≠ 0 (Two tailed test)
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic:
\[ t = \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} = \frac{0.3\sqrt{18-2}}{\sqrt{1 - (0.3)^2}} = 1.26 \]
|t| = 1.26
Example 2
A random sample of 10 nations gives a correlation coefficient of 0.5
between literacy rate and political stability. Is the relationship signifi-
cant?
Solution
n = 10, r = 0.5
(i) Null Hypothesis H0: ρ = 0, i.e., there is no relationship between literacy rate
and political stability.
(ii) Alternative Hypothesis H1: ρ ≠ 0 (Two tailed test)
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic:
\[ t = \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} = \frac{0.5\sqrt{10-2}}{\sqrt{1 - (0.5)^2}} = 1.63 \]
|t| = 1.63
(v) Critical value: ν = n − 2 = 10 − 2 = 8
t0.05 (ν = 8) = 2.306
(vi) Decision: Since |t| < t0.05, the null hypothesis is accepted at 5% level of
significance, i.e., there is no significant relationship between literacy rate and
political stability.
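The correlation test statistic is a one-line computation. The sketch below (function name `t_correlation` is illustrative) reproduces the value for this example and, for comparison, the value 1.26 obtained in Example 1 for n = 18, r = 0.3; critical values still have to be read from the t-table.

```python
import math

def t_correlation(r: float, n: int) -> float:
    """t statistic for testing rho = 0 from a sample correlation r of n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r**2)

t_nations = t_correlation(r=0.5, n=10)  # this example, nu = 8
t_pairs = t_correlation(r=0.3, n=18)    # Example 1, nu = 16
print(round(t_nations, 2), round(t_pairs, 2))
```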
Example 3
Find the least value of r in samples of 18 pairs of observations from a
bivariate normal population, which is significant at 5% level.
Solution
The value of r for n = 18 will be significant at 5% level if
\[ \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} \ge t_{0.05}(\nu = 16) \]
\[ \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} \ge 2.12 \]
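Squaring the inequality r√(n − 2)/√(1 − r²) ≥ t and collecting terms in r² gives r²(n − 2 + t²) ≥ t², i.e., |r| ≥ t/√(t² + n − 2). A minimal sketch of that rearrangement follows; the function name `least_significant_r` is illustrative and the critical value 2.12 is the tabulated figure quoted above.

```python
import math

def least_significant_r(t_critical: float, n: int) -> float:
    """Smallest |r| whose t statistic reaches t_critical for n pairs."""
    return t_critical / math.sqrt(t_critical**2 + n - 2)

r_min = least_significant_r(t_critical=2.12, n=18)
print(round(r_min, 3))
```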
Exercise 6.3
1. A sample of 26 bulbs gives a mean life of 990 hours with a SD of 20
hours. The manufacturer claims that the mean life of bulbs is 1000
hours. Is the sample not up to standard?
[Ans.: The sample is not up to the standard]
Horse B 29 30 30 24 27 29
Test whether the two horses have the same running capacity.
[Ans.: The two horses do not have the same running capacity]
9. To examine the hypothesis that the husbands are more intelligent than
the wives, an investigator took a sample of 10 couples and administered
them a test which measures the IQ. The results are as follows:
Husbands 117 105 97 105 123 109 86 78 103 107
Wives 106 98 87 104 116 95 90 69 108 85
Test the hypothesis with a reasonable test at the level of significance of
0.05.
[Ans.: There is no significant difference in IQs]
10. Two independent samples of 8 and 7 items respectively had the following
values:
Sample I 11 11 13 11 15 9 12 14
Sample II 9 11 10 13 9 8 10 –
Is the difference between the means of samples significant?
[Ans.: The difference between the mean of samples is not significant]
11. Random samples of specimens of coal from two mines A and B are
drawn and their heat-producing capacity (in millions of calories/ton)
were measured yielding the following results:
Mine A 8350 8070 8340 8130 8260 –
Mine B 7900 8140 7920 7840 7890 7950
Is there significant difference between the means of these two samples
at 0.01 level of significance?
[Ans.: There is significant difference between
the means of two samples]
6.15 Snedecor’s F-test for Ratio of Variances 6.55
12. A random sample of 27 pairs of observations from a bivariate normal
population gives a correlation coefficient of 0.42. Is it likely that the
variables are uncorrelated in the population?
[Ans.: correlated]
13. Find the least value of r in a sample of 27 pairs from a bivariate normal
population which is significant at 5% level.
[Ans.: |r| = 0.487]
Let x1, x2, …, x_{n1} and y1, y2, …, y_{n2} be the values of two independent random samples
of sizes n1 and n2 (n1 ≤ 30, n2 ≤ 30) with means x̄ and ȳ drawn from the normal
population with mean μ and standard deviation σ. The test statistic of Snedecor's
F-test in terms of unbiased estimates of standard deviations S1 and S2 of population is
given by
\[ F = \frac{S_1^2}{S_2^2}, \quad \text{where } S_1^2 > S_2^2 \]
and
\[ S_1^2 = \frac{\sum (x - \bar{x})^2}{n_1 - 1}, \qquad S_2^2 = \frac{\sum (y - \bar{y})^2}{n_2 - 1} \]
with numerator degrees of freedom ν1 = n1 − 1 and denominator degrees of freedom
ν2 = n2 − 1.
If s1 and s2 are standard deviations of samples then
\[ s_1^2 = \frac{\sum (x - \bar{x})^2}{n_1}, \qquad s_2^2 = \frac{\sum (y - \bar{y})^2}{n_2} \]
\[ \therefore \sum (x - \bar{x})^2 = n_1 s_1^2, \qquad \sum (y - \bar{y})^2 = n_2 s_2^2 \]
\[ S_1^2 = \frac{n_1 s_1^2}{n_1 - 1}, \qquad S_2^2 = \frac{n_2 s_2^2}{n_2 - 1} \]
[Figure: F-distribution curve, P(F) plotted against F]
where the constant c depends on ν1 and ν2. It is so chosen that the area under the curve
is unity.
Example 1
In two independent samples of sizes 8 and 10, the sum of squares of
deviations of the sample values from the respective means were 84.4
Example 2
The standard deviations calculated from two random samples of sizes
9 and 13 are 2.1 and 1.8 respectively. Can the samples be regarded as
drawn from normal populations with the same SD?
Solution
n1 = 9, n2 = 13, s1 = 2.1, s2 = 1.8
\[ S_1^2 = \frac{n_1 s_1^2}{n_1 - 1} = \frac{9(2.1)^2}{9 - 1} = 4.96 \]
\[ S_2^2 = \frac{n_2 s_2^2}{n_2 - 1} = \frac{13(1.8)^2}{13 - 1} = 3.51 \]
Example 3
Two random samples are drawn from two populations and the following
results were obtained:
Sample I 16 17 18 19 20 21 22 24 26 27
Sample II 19 22 25 25 26 28 29 30 31 32 35 36
Find the variances of the two samples and test whether the two popula-
tions have the same variances.
Solution
n1 = 10, n2 = 12
x̄1 = 21, x̄2 = 28, s1 = 3.55, s2 = 4.98 (from calculator)
\[ S_1^2 = \frac{n_1 s_1^2}{n_1 - 1} = \frac{10(3.55)^2}{10 - 1} = 14 \]
\[ S_2^2 = \frac{n_2 s_2^2}{n_2 - 1} = \frac{12(4.98)^2}{12 - 1} = 27.05 \]
(i) Null Hypothesis H0: σ1² = σ2², i.e., two populations have the same variances.
(ii) Alternative Hypothesis H1: σ1² > σ2²
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic: Since S2² > S1²,
\[ F = \frac{S_2^2}{S_1^2} = \frac{27.05}{14} = 1.93 \]
(v) Critical value: ν1 = n1 − 1 = 10 − 1 = 9
ν2 = n2 − 1 = 12 − 1 = 11
F0.05 (ν2 = 11, ν1 = 9) = 3.10
(vi) Decision: Since F < F0.05, the null hypothesis is accepted at 5% level of
significance, i.e., two populations have the same variances.
Example 4
In a test given to two groups of students drawn from two normal
populations, the marks obtained were as follows:
Group I 18 20 36 50 49 36 34 49 41
Group II 29 28 26 35 30 44 46
\[ S_1^2 = \frac{n_1 s_1^2}{n_1 - 1} = \frac{9(11.2225)^2}{9 - 1} = 141.75 \]
\[ S_2^2 = \frac{n_2 s_2^2}{n_2 - 1} = \frac{7(7.426)^2}{7 - 1} = 64.33 \]
(i) Null Hypothesis H0: σ1² = σ2², i.e., the two populations have the same variances.
(iv) Test statistic:
\[ F = \frac{S_1^2}{S_2^2} = \frac{141.75}{64.33} = 2.203 \]
Example 5
A group of 10 rats fed on diet A and another group of 8 rats fed on diet
B recorded following increase in weight:
Diet A 5 6 8 1 12 4 3 9 6 10 gm
Diet B 2 3 6 8 1 10 2 8 gm
(vi) Decision: Since F < F0.05, the null hypothesis is accepted at 5% level of signifi-
cance, i.e., the two variances are not significantly different.
Example 6
Two random samples gave the following data:
Size Mean Variance
Sample I 8 9.6 1.2
Sample II 11 16.5 2.5
Can we conclude that the two samples have been drawn from the same
normal population?
Solution
A normal distribution has two parameters, mean μ and variance σ². To conclude that the
two samples have been drawn from the same normal population, we have to test for
(i) Equality of two means H0 (μ1 = μ2) by t-test
(ii) Equality of two variances H0 (σ1² = σ2²) by F-test.
F-test:
\[ F = \frac{S_2^2}{S_1^2} = \frac{2.75}{1.37} = 2.007 \]
(v) Critical value: ν1 = n1 − 1 = 8 − 1 = 7
ν2 = n2 − 1 = 11 − 1 = 10
F0.05 (ν2 = 10, ν1 = 7) = 3.64
(vi) Decision: Since F < F0.05, the null hypothesis is accepted at 5% level of
significance, i.e., two populations have the same variances.
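The F ratio above starts from the sample variances 1.2 and 2.5 converted to unbiased estimates S² = ns²/(n − 1). A minimal sketch of that conversion and of placing the larger estimate in the numerator is given below; the function names are illustrative, and the critical value 3.64 is the tabulated figure from step (v). Without rounding S1² to 1.37, F comes out as about 2.005 rather than 2.007.

```python
def unbiased_variance(sample_variance: float, n: int) -> float:
    """Convert a divisor-n sample variance s^2 into S^2 = n s^2 / (n - 1)."""
    return n * sample_variance / (n - 1)

def f_ratio(var_a: float, n_a: int, var_b: float, n_b: int):
    """Return (F, numerator df, denominator df) with the larger S^2 on top."""
    S_a = unbiased_variance(var_a, n_a)
    S_b = unbiased_variance(var_b, n_b)
    if S_a >= S_b:
        return S_a / S_b, n_a - 1, n_b - 1
    return S_b / S_a, n_b - 1, n_a - 1

F, df_num, df_den = f_ratio(var_a=1.2, n_a=8, var_b=2.5, n_b=11)
F_CRITICAL = 3.64  # F(0.05) for (10, 7) degrees of freedom, from the F-table
print(round(F, 3), (df_num, df_den), "reject H0" if F > F_CRITICAL else "accept H0")
```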
t-test:
\[ s = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{8(1.2) + 11(2.5)}{8 + 11 - 2}} = 1.48 \]
(i) Null Hypothesis H0: μ1 = μ2, i.e., means of two populations are equal.
(ii) Alternative Hypothesis H1: μ1 ≠ μ2 (Two tailed test)
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic:
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{9.6 - 16.5}{1.48\sqrt{\dfrac{1}{8} + \dfrac{1}{11}}} = -10.03 \]
|t| = 10.03
(v) Critical value: ν = n1 + n2 − 2 = 8 + 11 − 2 = 17
t0.05 (ν = 17) = 2.11
(vi) Decision: Since |t| > t0.05, the null hypothesis is rejected at 5% level of
significance, i.e., the two populations do not have the same means.
Hence, the two samples could not have been drawn from the same normal
population.
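The pooled t-test used in this example can be sketched as follows. The function name `pooled_t` is illustrative and the inputs are the sample variances given in the problem (divisor n). Carrying full precision gives about −10.05 instead of −10.03, which rests on s rounded to 1.48; the decision is unchanged.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t with pooled SD; var1, var2 are divisor-n sample variances."""
    s = math.sqrt((n1 * var1 + n2 * var2) / (n1 + n2 - 2))
    t = (mean1 - mean2) / (s * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

t, dof = pooled_t(mean1=9.6, var1=1.2, n1=8, mean2=16.5, var2=2.5, n2=11)
T_CRITICAL = 2.11  # two-tailed 5% value for 17 degrees of freedom (t-table)
print(round(t, 2), dof, "reject H0" if abs(t) > T_CRITICAL else "accept H0")
```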
Example 7
Two nicotine contents in two random samples of tobacco are given
below:
Sample I 21 24 25 26 27
Sample II 22 27 28 30 31 36
Can we say that two samples came from the same population?
Solution
F-test:
n1 = 5, n2 = 6
x̄1 = 24.6, x̄2 = 29, s1 = 2.06, s2 = 4.24 (from calculator)
\[ S_1^2 = \frac{n_1 s_1^2}{n_1 - 1} = \frac{5(2.06)^2}{5 - 1} = 5.30 \]
\[ S_2^2 = \frac{n_2 s_2^2}{n_2 - 1} = \frac{6(4.24)^2}{6 - 1} = 21.57 \]
(i) Null Hypothesis H0: σ1² = σ2², i.e., variances of two populations are equal.
(ii) Alternative Hypothesis H1: σ1² > σ2²
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic: Since S2² > S1²,
\[ F = \frac{S_2^2}{S_1^2} = \frac{21.57}{5.30} = 4.07 \]
(v) Critical value: ν1 = n1 − 1 = 5 − 1 = 4
ν2 = n2 − 1 = 6 − 1 = 5
F0.05 (ν2 = 5, ν1 = 4) = 6.26
(vi) Decision: Since F < F0.05, the null hypothesis is accepted at 5% level of
significance, i.e., the two populations have the same variances.
t-test:
\[ s = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{5(2.06)^2 + 6(4.24)^2}{5 + 6 - 2}} = \sqrt{14.34} = 3.79 \]
(i) Null Hypothesis H0: μ1 = μ2, i.e., means of two populations are equal.
(ii) Alternative Hypothesis H1: μ1 ≠ μ2 (Two tailed test)
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic:
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{24.6 - 29}{3.79\sqrt{\dfrac{1}{5} + \dfrac{1}{6}}} = -1.92 \]
|t| = 1.92
(v) Critical value: ν = n1 + n2 − 2 = 5 + 6 − 2 = 9
t0.05 (ν = 9) = 2.262
(vi) Decision: Since |t| < t0.05, the null hypothesis is accepted at 5% level of
significance, i.e., two populations have the same means.
Hence, the two samples came from the same population.
Exercise 6.4
1. Two independent samples of sizes n1 = 13 and n2 = 7 are taken from
a normal population. What is the probability that the variance of the
first sample will be at least four times as large as that of the second
sample?
[Ans.: 0.05]
Method I 20 16 26 27 22
Method II 27 33 42 35 32 34 38
Test the hypothesis that the variance of brand A is more than that of B.
[Ans.: Variance of brand A is not more than the variance of brand B ]
6. In a laboratory experiment two samples gave the following results:
6.16 Chi-square (χ²) Test
The chi-square (χ²) test is a useful measure for comparing experimentally obtained
results with those expected theoretically and based on a hypothesis. It is used as a test
statistic in testing a hypothesis that provides a set of theoretical frequencies with which
observed frequencies are compared. The magnitude of discrepancy between observed
and theoretical frequencies is given by the quantity χ² (pronounced as chi-square). If
χ² = 0, the observed and expected frequencies completely coincide. As the value of χ²
increases, the discrepancy between the observed and theoretical frequencies increases.
If fo1, fo2, ..., fon be a set of observed frequencies and fe1, fe2, ..., fen be the
corresponding set of expected (or theoretical) frequencies then χ² is defined by
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \]
6.16.1 Chi-Square Distribution
If x1, x2, …, xn are n independent normal variates with mean zero and standard deviation
unity then x1² + x2² + ... + xn² is a random variate having χ² distribution with probability
density function given by
\[ P(\chi^2) = y_0\,(\chi^2)^{\frac{\nu - 1}{2}}\, e^{-\frac{\chi^2}{2}} \]
[Figure: χ² distribution curves, P(χ²) plotted against χ² for ν = 1, 3 and 5]
6.16.2 Properties of χ²-Distribution
(i) The chi-square distribution curve is always positively skewed.
(ii) The mean of chi-square distribution is the number of degrees of freedom.
(iii) The standard deviation of chi-square distribution = √(2ν).
(iv) Chi-square values increase with the increase in degrees of freedom.
(v) The value of χ² lies between zero and infinity.
(vi) For different values of degrees of freedom, the shape of the curve will be
different.
Test of Significance
Let fo1, fo2, ..., fon be a set of observed frequencies and fe1, fe2, ..., fen be the
corresponding set of expected or theoretical frequencies. The χ² statistic is given by
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \]
Working Rule
(i) Set up a null hypothesis.
(ii) Set up an alternative hypothesis.
(iii) Set a level of significance α.
(iv) Calculate χ².
(v) Find the degrees of freedom and find the corresponding value of χ² at the given
level of significance α.
(vi) If the calculated value of χ² is less than the tabulated value of χ² at the level
of significance α, the null hypothesis is accepted. If the calculated value of χ²
is more than the tabulated value of χ² at the level of significance α, the null
hypothesis is rejected.
Example 1
A die was thrown 132 times and the following frequencies were observed:
No. obtained 1 2 3 4 5 6 Total
Frequency 15 20 25 15 29 28 132
6.17 Chi-square Test: Goodness of Fit 6.67
No. obtained   Observed frequency, fo   Expected frequency, fe   (fo − fe)²/fe
1              15                       22                       2.23
2              20                       22                       0.18
3              25                       22                       0.41
4              15                       22                       2.23
5              29                       22                       2.23
6              28                       22                       1.64
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 8.92 \]
Example 2
The number of car accidents in a metropolitan city was found to be 20,
17, 12, 6, 7, 15, 8, 5, 16 and 14 per month respectively. Use χ² test to
check whether these frequencies are in agreement with the belief that the
occurrence of accidents was the same during 10 months period. Test at
5% level of significance.
Solution
n = 10
(i) Null Hypothesis H0: Occurrence of accidents was the same during the 10 months
period.
(ii) Alternative Hypothesis H1: Occurrence of accidents was not the same during the
10 months period.
(iii) Level of significance: α = 0.05
(iv) Test statistic: If the occurrence of accidents is the same, the expected frequency of
accidents per month is
\[ f_e = \frac{20 + 17 + 12 + 6 + 7 + 15 + 8 + 5 + 16 + 14}{10} = 12 \]
Observed frequency, fo   Expected frequency, fe   (fo − fe)²/fe
20                       12                       5.33
17                       12                       2.08
12                       12                       0
6                        12                       3
7                        12                       2.08
15                       12                       0.75
8                        12                       1.33
5                        12                       4.08
16                       12                       1.33
14                       12                       0.33
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 20.31 \]
Example 3
200 digits were chosen at random from a set of tables. The frequencies of
the digits are shown below:
Digits 0 1 2 3 4 5 6 7 8 9
Frequency 18 19 23 21 16 25 22 20 21 15
Use the χ²-test to assess the correctness of the hypothesis that the digits
were distributed in equal number in the tables from which these were
chosen.
Solution
n = 10
(i) Null Hypothesis H0: The digits were distributed in equal number in the tables.
(ii) Alternative Hypothesis H1: The digits were not distributed in equal number in
the tables.
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic: Expected frequency of each digit
\[ f_e = \frac{200}{10} = 20 \]
Observed frequency, fo   Expected frequency, fe   (fo − fe)²/fe
18                       20                       0.2
19                       20                       0.05
23                       20                       0.45
21                       20                       0.05
16                       20                       0.8
25                       20                       1.25
22                       20                       0.2
20                       20                       0
21                       20                       0.05
15                       20                       1.25
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 4.3 \]
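The χ² sum in the table can be reproduced in a few lines. This is a sketch only: the function name `chi_square` is illustrative, and the critical value 16.919 is the standard tabulated 5% figure for 10 − 1 = 9 degrees of freedom, which is an assumption here since the remaining steps of the solution are not shown.

```python
def chi_square(observed, expected):
    """Sum of (fo - fe)^2 / fe over all classes."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

observed = [18, 19, 23, 21, 16, 25, 22, 20, 21, 15]
# Equal distribution of digits: every class has the same expected frequency.
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = chi_square(observed, expected)
CHI2_CRITICAL = 16.919  # assumed: 5% table value for 9 degrees of freedom
print(round(chi2, 2), "reject H0" if chi2 > CHI2_CRITICAL else "accept H0")
```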
Example 4
Theory predicts that the proportion of beans in the four groups A, B, C, D
should be 9 : 3 : 3: 1. In an experiment among 1600 beans, the numbers
in the four groups were 882, 313, 287 and 118. Do the experimental
results support the theory?
Solution
n = 4
(i) Null Hypothesis H0: The proportion of the beans in the four groups A, B, C, D
is 9 : 3 : 3 : 1.
(ii) Alternative Hypothesis H1: The proportion of the beans in the four groups A,
B, C, D is not 9 : 3 : 3 : 1.
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic:
Group   Observed frequency, fo   Expected frequency, fe   (fo − fe)²/fe
A       882                      (9/16) × 1600 = 900      0.36
B       313                      (3/16) × 1600 = 300      0.56
C       287                      (3/16) × 1600 = 300      0.56
D       118                      (1/16) × 1600 = 100      3.24
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 4.72 \]
Example 5
The following mistakes per page were observed in a book:
No. of mistakes per page 0 1 2 3 4
No. of pages 211 90 19 5 0
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 0.07 \]
(v) Critical value: The number of degrees of freedom is 1 for each class. There
are 5 classes originally. Hence, the degrees of freedom originally is 5. Since
the classes are reduced by 2, the degrees of freedom is reduced by 2. Further,
while calculating the parameter λ, two sums Σfx and Σf are used. Hence, the
degrees of freedom is again reduced by 2.
Hence, the number of degrees of freedom ν = 5 − (2 + 2) = 1
χ²0.05 = 3.84
(vi) Decision: Since χ² < χ²0.05, the null hypothesis is accepted at 5% level of significance
Example 6
A set of five similar coins is tossed 320 times and result is obtained as
follows:
No. of heads 0 1 2 3 4 5
Frequency 6 27 72 112 71 32
Test the hypothesis that the data follow a binomial distribution.
Solution
(i) Null Hypothesis H0: The data follow a binomial distribution.
(ii) Alternative Hypothesis H1: The data do not follow binomial distribution.
(iii) Level of significance: α = 0.05
(iv) Test statistic: Probability of getting a head $p = \frac{1}{2}$
Probability of getting a tail $q = \frac{1}{2}$
By binomial distribution,
$p(x) = {}^{n}C_x \, p^x q^{n-x} = {}^{5}C_x \left(\frac{1}{2}\right)^x \left(\frac{1}{2}\right)^{5-x}, \quad x = 0, 1, 2, 3, 4, 5$
N = 320
Expected frequency $f_e = N\,p(x) = 320 \left[ {}^{5}C_x \left(\frac{1}{2}\right)^x \left(\frac{1}{2}\right)^{5-x} \right]$
No. of heads    Observed frequency f_o    Expected frequency f_e    f_o – f_e    (f_o − f_e)²/f_e
0               6                         10                        –4           1.6
1               27                        50                        –23          10.58
2               72                        100                       –28          7.84
3               112                       100                       12           1.44
4               71                        50                        21           8.82
5               32                        10                        22           48.4
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 78.68$
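The expected frequencies and the statistic for this example can be reproduced as follows. This is a sketch under the example's own assumption of fair coins (p = 1/2); the helper name `binomial_expected` is hypothetical.

```python
# Expected binomial frequencies and the chi-square statistic for the
# five-coin experiment (n = 5 coins, N = 320 tosses, p = 1/2 assumed).
from math import comb

def binomial_expected(n_trials, p, total):
    """Expected frequency of x successes, x = 0..n_trials, in `total` repetitions."""
    return [
        total * comb(n_trials, x) * p**x * (1 - p) ** (n_trials - x)
        for x in range(n_trials + 1)
    ]

observed = [6, 27, 72, 112, 71, 32]        # heads = 0, 1, ..., 5
expected = binomial_expected(5, 0.5, 320)
chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

print(expected)
print(round(chi2, 2))
```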
Example 7
Fit the equation of the best fitting normal curve to the following data:
x 135 145 155 165 175 185 195 205 Total
f 2 14 22 25 19 13 3 2 100
Compare the theoretical and observed frequencies. Using the χ² test, find the
goodness of fit. Given that μ = 165.6 and σ = 15.02.
Solution
μ = 165.6, σ = 15.02, N = Σf = 100
The data is first converted into class intervals with inclusive series
x      f_o    f_e    f_o − f_e    (f_o − f_e)²/f_e
135     2      4  }
145    14     11  }      1         0.067
155    22     21         1         0.048
165    25     26        –1         0.038
175    19     21        –2         0.19
185    13     12  }
195     3      4  }      1         0.0588
205     2      1  }
(Classes joined by braces are pooled before f_o − f_e is computed.)
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 0.4018$
Critical value: There are 5 frequencies. While calculating mean and standard deviation,
three sums Σf, Σfx, and Σfx² are used. Hence, the number of degrees of freedom
ν = 5 − 3 = 2
$\chi^2_{0.05} = 5.99$
Since $\chi^2 < \chi^2_{0.05}$ at 5% level of significance, the fit is good and the distribution is nearly
normal.
Independence of Attributes
Two attributes A and B are said to be independent if they are not related to each other.
If two attributes A and B are not independent, they are associated on the basis of cell
frequencies. It is required to test whether two attributes A and B are associated or not.
Under null hypothesis H0 (attributes are independent), the expected frequency fe of any
cell is given by
$f_e = \frac{(\text{Row total}) \times (\text{Column total})}{\text{Total frequency}} = \frac{(A_i)(B_j)}{N}$
6.18 Chi-square Test for Independence of Attributes 6.75
Yates' Correction
In a 2 × 2 table, there is only one degree of freedom. If any of the expected frequencies
is less than 10, Yates' correction is applied in the chi-square formula.
$\chi^2 = \sum \left[ \frac{\{ |f_o - f_e| - 0.5 \}^2}{f_e} \right]$
Example 1
A total of 3759 individuals were interviewed in a public opinion survey on
a political proposal. Of them 1872 were men and the rest were women.
A total of 2257 individuals were in favour of the proposal and 917 were
opposed to it. A total of 243 men were undecided and 442 women were
opposed to it. Do you justify or contradict the hypothesis that there is no
association between sex and attitude at 5% level of significance?
Solution
N = 3759
Opinion about political proposal
Favoured Opposed Undecided Total
Men 1154 475 243 1872
Women 1103 442 342 1887
Total 2257 917 585 3759
(i) Null Hypothesis H0: There is no association between sex and attitude i.e., sex
and attitude are independent.
(ii) Alternative Hypothesis H1: There is association between sex and attitude.
(iii) Level of significance: α = 0.05
(iv) Test statistic:
Observed frequency f_o    Expected frequency f_e = (A_i)(B_j)/N    (f_o − f_e)²/f_e
1154                      1872 × 2257 / 3759 ≈ 1124                0.8
475                       1872 × 917 / 3759 ≈ 457                  0.71
243                       1872 × 585 / 3759 ≈ 291                  7.92
1103                      1887 × 2257 / 3759 ≈ 1133                0.79
442                       1887 × 917 / 3759 ≈ 460                  0.70
342                       1887 × 585 / 3759 ≈ 294                  7.84
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 18.76$
Example 2
A sample of 400 students of undergraduate and 400 students of post-
graduate classes was taken to know their opinion about autonomous
colleges. 290 of the undergraduate and 310 of the postgraduate students
favoured the autonomous status. Present these facts in the form of a
table and test at 5% level of significance, that the opinion regarding
autonomous status of colleges is independent of the level of classes of
students.
Solution
N = 800
Opinion about autonomous colleges
(i) Null Hypothesis H0 : There is no relation between the classes of students and
opinion, i.e., two attributes are independent.
(ii) Alternative Hypothesis H1: There is relation between the classes of students
and opinion.
(iii) Level of significance: α = 0.05
(iv) Test statistic:
Observed frequency f_o    Expected frequency f_e = (A_i)(B_j)/N    (f_o − f_e)²/f_e
290                       400 × 600 / 800 = 300                    0.33
110                       400 × 200 / 800 = 100                    1.00
310                       400 × 600 / 800 = 300                    0.33
90                        400 × 200 / 800 = 100                    1.00
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 2.66$
Example 3
In an experiment on immunisation of cattle from tuberculosis the follow-
ing results were obtained:
Observed frequency f_o    Expected frequency f_e = (A_i)(B_j)/N    (f_o − f_e)²/f_e
267                       294 × 1024 / 1206 ≈ 250                  1.156
27                        294 × 182 / 1206 ≈ 44                    6.568
757                       912 × 1024 / 1206 ≈ 774                  0.37
155                       912 × 182 / 1206 ≈ 138                   2.09
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 10.19$
Example 4
Given the following contingency table for hair colour and eye colour.
Find the value of χ². Is there good association between the two?
                        Hair colour
Eye colour    Fair    Brown    Black    Total
Blue          15      5        20       40
Grey          20      10       20       50
Brown         25      15       20       60
Total         60      30       60       150
Solution
N = 150
(i) Null Hypothesis H0: There is no association between two attributes, hair and
eye colours.
(ii) Alternative Hypothesis H1: There is association between two attributes, hair
and eye colours.
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic:
Observed frequency f_o    Expected frequency f_e = (A_i)(B_j)/N    (f_o − f_e)²/f_e
15                        40 × 60 / 150 = 16                       0.0625
5                         40 × 30 / 150 = 8                        1.125
20                        40 × 60 / 150 = 16                       1
20                        50 × 60 / 150 = 20                       0
10                        50 × 30 / 150 = 10                       0
20                        50 × 60 / 150 = 20                       0
25                        60 × 60 / 150 = 24                       0.042
15                        60 × 30 / 150 = 12                       0.75
20                        60 × 60 / 150 = 24                       0.667
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 3.6465$
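The expected frequencies (row total × column total / N) and the statistic for a contingency table can be computed in one pass. The sketch below uses the hair colour and eye colour table; the function name `chi_square_independence` is hypothetical, and the degrees of freedom are taken as (rows − 1)(columns − 1), the usual rule for an r × c table.

```python
# Chi-square test statistic for independence in an r x c contingency table.
# chi_square_independence is an illustrative helper name.

def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n   # expected frequency of cell (i, j)
            chi2 += (fo - fe) ** 2 / fe
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

# Rows: eye colour Blue, Grey, Brown; columns: hair colour Fair, Brown, Black.
table = [
    [15, 5, 20],
    [20, 10, 20],
    [25, 15, 20],
]
chi2, dof = chi_square_independence(table)
print(round(chi2, 4), dof)
```

Because the script keeps unrounded terms, its result differs from the tabulated total only in the fourth decimal place.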
Example 5
Two researchers adopted different sampling techniques while investi-
gating some group of students to find the number of students falling into
different intelligence level. The results are as follows:
Researchers    Below average    Average    Above average    Genius    Total
X              86               60         44               10        200
Y              40               33         25               2         100
Total          126              93         69               12        300
Would you say that the sampling techniques adopted by the two researchers
are significantly different?
Solution
N = 300
(i) Null Hypothesis H0: There is no significant difference in the sampling
techniques adopted by the two researchers.
(ii) Alternative Hypothesis H1: There is significant difference in the sampling
techniques adopted by the two researchers.
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic:
Observed frequency f_o    Expected frequency f_e = (A_i)(B_j)/N    (f_o − f_e)²/f_e
86                        200 × 126 / 300 = 84                     0.0476
60                        200 × 93 / 300 = 62                      0.0645
44                        200 × 69 / 300 = 46                      0.0869
10                        200 × 12 / 300 = 8                       0.5
40                        100 × 126 / 300 = 42                     0.0952
33                        100 × 93 / 300 = 31                      0.129
25                        100 × 69 / 300 = 23                      0.1739
2                         100 × 12 / 300 = 4                       1
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 2.0971$
Example 6
The following table gives the level of education and the marriage
adjustment score for a sample of married women:
Level of                   Marriage adjustment
education        Very low    Low    High    Very high    Total
College          24          97     62      58           241
High school      22          28     30      41           121
Middle school    32          10     11      20           73
Total            78          135    103     119          435
Can you conclude from the above data that the higher the level of education,
the greater is the degree of adjustment in marriage?
Solution
N = 435
(i) Null Hypothesis H0: There is no relation between the level of education and
adjustment in marriage, i.e., two attributes are independent.
(ii) Alternative Hypothesis H1: There is relation between level of education and
adjustment in marriage.
Observed frequency f_o    Expected frequency f_e = (A_i)(B_j)/N    (f_o − f_e)²/f_e
24                        241 × 78 / 435 ≈ 43                      8.3953
97                        241 × 135 / 435 ≈ 75                     6.4533
62                        241 × 103 / 435 ≈ 57                     0.4386
58                        241 × 119 / 435 ≈ 66                     0.9697
22                        121 × 78 / 435 ≈ 22                      0
28                        121 × 135 / 435 ≈ 37                     2.1892
30                        121 × 103 / 435 ≈ 29                     0.0345
41                        121 × 119 / 435 ≈ 33                     1.9394
32                        73 × 78 / 435 ≈ 13                       27.7692
10                        73 × 135 / 435 ≈ 23                      7.3478
11                        73 × 103 / 435 ≈ 17                      2.1176
20                        73 × 119 / 435 ≈ 20                      0
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 57.6546$
Example 7
Two batches each of 12 animals are taken for test of inoculation. One
batch was inoculated and the other batch was not inoculated. The
number of dead and surviving animals are given in the following table
for both the cases. Can the inoculation be regarded as effective against
the disease? Make Yates' correction for continuity of χ².
Dead Survived Total
Inoculated 2 10 12
Not inoculated 8 4 12
Total 10 14 24
Solution
N = 24
(i) Null hypothesis H0: There is no relation between inoculation and death i.e.,
inoculation and effect on disease are independent.
(ii) Alternative Hypothesis H1: There is relation between inoculation and death.
(iii) Level of significance: α = 0.05 (assumption)
(iv) Test statistic: Yates' correction is used only when ν = 1 and when some
expected frequencies are small, i.e., less than 10. Here, expected frequencies are
less than 10 each.
Observed frequency f_o    Expected frequency f_e = (A_i)(B_j)/N    {|f_o − f_e| − 0.5}²/f_e
2                         12 × 10 / 24 = 5                         1.25
10                        12 × 14 / 24 = 7                         0.89
8                         12 × 10 / 24 = 5                         1.25
4                         12 × 14 / 24 = 7                         0.89
$\chi^2 = \sum \frac{\{ |f_o - f_e| - 0.5 \}^2}{f_e} = 4.28$
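The corrected statistic for the inoculation table can be reproduced with the sketch below. The function name `chi_square_yates` is hypothetical; the script uses unrounded terms, so it prints a value slightly above the 4.28 obtained from the two-decimal table entries.

```python
# Chi-square statistic with Yates' continuity correction for a 2 x 2 table.
# chi_square_yates is an illustrative helper name.

def chi_square_yates(table):
    """Sum of (|fo - fe| - 0.5)^2 / fe over the four cells of a 2 x 2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n
            chi2 += (abs(fo - fe) - 0.5) ** 2 / fe
    return chi2

# Rows: inoculated, not inoculated; columns: dead, survived.
table = [[2, 10], [8, 4]]
chi2 = chi_square_yates(table)
print(round(chi2, 4))
```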
Exercise 6.5
1. A dice is thrown 264 times with the following results. Show that the
dice is biased. [Given $\chi^2_{0.05} = 11.07$ for 5 df]
No. appeared on the dice 1 2 3 4 5 6
Frequency 40 32 28 58 54 52
2. A pair of dice are thrown 360 times and frequency of each sum is given
below:
Sum 2 3 4 5 6 7 8 9 10 11 12
Frequency 8 24 35 37 44 65 51 42 26 14 14
Would you say that the dice are fair on the basis of the chi-square test
at 0.05 level of significance?
[Ans.: The dice are fair]
3. 4 coins are tossed 160 times and the following results were obtained:
No. of heads 0 1 2 3 4
Observed frequencies 17 52 54 31 6
Under the assumption that coins are balanced, find the expected
frequencies of 0, 1, 2, 3 or 4 heads, and test the goodness of fit
(α = 0.05).
[Ans.: Expected frequencies: 10, 40, 60, 40, 10,
the data do not follow binomial distribution]
4. Fit a Poisson distribution to the following data and for its goodness of
fit at level of significance 0.05:
x 0 1 2 3 4
f 419 352 154 56 19
5. The following table gives the number of accidents in a city during a week.
Find whether the accidents are uniformly distributed over a week.
Day Sun Mon Tue Wed Thu Fri Sat Total
No. of accidents 13 15 9 11 12 10 14 84
6. Weights in kilograms of 10 students are given below: 38, 40, 45, 53, 47,
43, 55, 48, 52, 49
Can we say that the variance of the normal distribution from which the
above sample is drawn is 20 kg²?
[Ans.: The sample is drawn from the
normal population with variance 20]
7. Five dice are thrown 192 times and the number of times 4, 5 or 6 are
obtained are as follows:
No. of dice showing 4, 5, 6 5 4 3 2 1 0
Frequency 6 46 70 48 20 2
Calculate χ². [Ans.: 16.94]
8. The distribution of defects in printed circuit board is hypothesised to
follow Poisson distribution. A random sample of 60 printed boards shows
the following data:
No. of defects 0 1 2 3
Observed frequency 32 15 9 4
Is the hypothesis of a Poisson distribution appropriate?
[Ans.: The defects follow Poisson distribution]
9. Based on the following data, determine if there is a relation between
literacy and smoking.
Smokers Non-smokers
Literates 83 57
Illiterates 45 68
[Ans.: χ² = 9.19, yes]
10. Table below shows the performances of students in mathematics and
physics. Test the hypothesis that the performance in mathematics is
independent of performance in physics.
Grades in Grades in Mathematics
Physics High Medium Low
High 56 71 12
Medium 47 163 38
Low 14 42 81
[Ans.: χ² = 132.31, Hypothesis is rejected]
11. Investigate the association between the darkness of eye colour in father
and son from the following data:
Test whether the two attributes merit and economic condition are
associated or not.
[Ans.: χ² = 9.30, The two attributes are associated]
CHAPTER 7
Curve Fitting
Chapter Outline
7.1 Introduction
7.2 Least Square Method
7.3 Fitting of Linear Curves
7.4 Fitting of Quadratic Curves
7.5 Fitting of Exponential and Logarithmic Curves
7.1 Introduction
Curve fitting is the process of finding the ‘best-fit’ curve for a given set of data. It is
the representation of the relationship between two variables by means of an algebraic
equation. On the basis of this mathematical equation, predictions can be made in many
statistical problems.
Suppose a set of n points of values (x1, y1), (x2, y2), …, (xn, yn) of the two variables
x and y are given. These values are plotted on a rectangular coordinate system, i.e.,
the xy-plane. The resulting set of points is known as a scatter diagram (Fig. 7.1).
The scatter diagram exhibits the trend and it is possible to visualize a smooth curve
approximating the data. Such a curve is known as an approximating curve.
[Fig. 7.1: scatter diagrams, with x on the horizontal axis and y on the vertical axis]
The distance QP is known as deviation, error, or residual and is denoted by di. It may
be positive, negative, or zero depending upon whether P lies above, below, or on the
curve. Similar residuals or errors corresponding to the remaining (n – 1) points may be
obtained. The sum of squares of residuals, denoted by E, is given as
\[ E = \sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} [y_i - f(x_i)]^2 \]
If E = 0 then all the n points will lie on y = f(x). If E ≠ 0, f(x) is chosen such that E is
minimum, i.e., the best fitting curve to the set of points is that for which E is minimum.
This method is known as the least square method. This method does not attempt to
determine the form of the curve y = f (x) but it determines the values of the parameters
of the equation of the curve.
Let (xi, yi), i = 1, 2, …, n be the set of n values and let the relation between x and y be
y = a + bx. The constants a and b are selected such that the straight line is the best fit to
the data.
The residual at x = x_i is
\[ d_i = y_i - f(x_i) = y_i - (a + bx_i), \quad i = 1, 2, \ldots, n \]
\[ E = \sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} [y_i - (a + bx_i)]^2 = \sum_{i=1}^{n} (y_i - a - bx_i)^2 \]
For E to be minimum,
(i) $\frac{\partial E}{\partial a} = 0$
\[ \sum_{i=1}^{n} 2(y_i - a - bx_i)(-1) = 0 \]
\[ \sum_{i=1}^{n} (y_i - a - bx_i) = 0 \]
\[ \sum_{i=1}^{n} y_i = a\sum_{i=1}^{n} 1 + b\sum_{i=1}^{n} x_i \]
\[ \sum y = na + b\sum x \]
(ii) $\frac{\partial E}{\partial b} = 0$
\[ \sum_{i=1}^{n} 2(y_i - a - bx_i)(-x_i) = 0 \]
\[ \sum_{i=1}^{n} (x_i y_i - ax_i - bx_i^2) = 0 \]
\[ \sum_{i=1}^{n} x_i y_i = a\sum_{i=1}^{n} x_i + b\sum_{i=1}^{n} x_i^2 \]
\[ \sum xy = a\sum x + b\sum x^2 \]
These two equations are known as normal equations. These equations can be solved
simultaneously to give the best values of a and b. The best fitting straight line is
obtained by substituting the values of a and b in the equation y = a + bx .
Example 1
Fit a straight line to the following data:
x 1 2 3 4 6 8
y 2.4 3 3.6 4 5 6
Solution
Let the straight line to be fitted to the data be
y = a + bx
Here, n = 6
x    y      x²    xy
1    2.4    1     2.4
2    3      4     6
3    3.6    9     10.8
4    4      16    16
6    5      36    30
8    6      64    48
Note Σx, Σy, Σx², Σxy can be directly obtained with the help of a scientific calculator.
Example 2
Fit a straight line to the following data. Also, estimate the value of y at
x = 2.5.
x 0 1 2 3 4
y 1 1.8 3.3 4.5 6.3
Solution
Let the straight line to be fitted to the data be
y = a + bx
The normal equations are
$\sum y = na + b\sum x$ …(1)
$\sum xy = a\sum x + b\sum x^2$ …(2)
Here, n = 5
x    y      x²    xy
0    1      0     0
1    1.8    1     1.8
2    3.3    4     6.6
3    4.5    9     13.5
4    6.3    16    25.2
At x = 2.5,
y (2.5) = 0.72 + 1.33 (2.5) = 4.045
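The two normal equations can be solved in closed form, which makes the fit easy to check by machine. The following is a minimal sketch using the data of this example; the function name `fit_line` is hypothetical, and the closed-form expressions for a and b are simply the solution of Eqs (1) and (2).

```python
# Least squares straight line y = a + b*x from the normal equations
#   sum(y)  = n*a      + b*sum(x)
#   sum(xy) = a*sum(x) + b*sum(x^2)
# fit_line is an illustrative helper name.

def fit_line(xs, ys):
    """Return (a, b) minimising the sum of squared residuals of y = a + b*x."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx**2)
    a = (sy - b * sx) / n
    return a, b

xs = [0, 1, 2, 3, 4]
ys = [1, 1.8, 3.3, 4.5, 6.3]
a, b = fit_line(xs, ys)
y_at_2_5 = a + b * 2.5

print(round(a, 2), round(b, 2), round(y_at_2_5, 3))
```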
Example 3
A simply supported beam carries a concentrated load P(lb) at its
midpoint. Corresponding to various values of P, the maximum deflection
Y(in) is measured. The data is given below:
P    100     120     140     160     180     200
Y    0.45    0.55    0.60    0.70    0.80    0.85
$\sum Y = na + b\sum P$ ...(1)
$\sum PY = a\sum P + b\sum P^2$ ...(2)
Here, n = 6
P      Y       P²       PY
100    0.45    10000    45
120    0.55    14400    66
140    0.60    19600    84
160    0.70    25600    112
180    0.80    32400    144
200    0.85    40000    170
ΣP = 900    ΣY = 3.95    ΣP² = 142000    ΣPY = 621
Example 4
Fit a straight line to the following data. Also, estimate the value of y at
x = 70.
x 71 68 73 69 67 65 66 67
y 69 72 70 70 68 67 68 64
Solution
Since the values of x and y are large, we choose the origin for x and y at 69 and 67
respectively.
Let X = x − 69 and Y = y − 67
Let the straight line to be fitted to the data be
Y = a + bX
The normal equations are
$\sum Y = na + b\sum X$ …(1)
$\sum XY = a\sum X + b\sum X^2$ …(2)
Here, n = 8
x     y     X     Y     X²    XY
71    69    2     2     4     4
68    72    −1    5     1     −5
73    70    4     3     16    12
69    70    0     3     0     0
67    68    −2    1     4     −2
65    67    −4    0     16    0
66    68    −3    1     9     −3
67    64    −2    −3    4     6
ΣX = −6    ΣY = 12    ΣX² = 54    ΣXY = 12
Note Since Σx, Σy, Σx², Σxy can be directly obtained with the help of a scientific
calculator, the problem can be solved without shifting the origin.
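The note above can be verified numerically: fitting the shifted variables and fitting the raw data describe the same line. The sketch below is illustrative only; `fit_line` is a hypothetical helper, and the value it prints for x = 70 is computed by this script rather than quoted from the text.

```python
# Check that shifting the origin does not change the fitted straight line.
# fit_line is an illustrative helper name.

def fit_line(xs, ys):
    """Solve the two normal equations for y = a + b*x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx**2)
    a = (sy - b * sx) / n
    return a, b

xs = [71, 68, 73, 69, 67, 65, 66, 67]
ys = [69, 72, 70, 70, 68, 67, 68, 64]

# Fit in shifted coordinates X = x - 69, Y = y - 67.
a_shift, b_shift = fit_line([x - 69 for x in xs], [y - 67 for y in ys])

# Fit on the raw data directly.
a_raw, b_raw = fit_line(xs, ys)

# Convert the shifted line back: y - 67 = a_shift + b_shift * (x - 69).
a_back = 67 + a_shift - 69 * b_shift

y_at_70 = a_raw + b_raw * 70
print(round(a_raw, 4), round(b_raw, 4), round(y_at_70, 4))
```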
Example 5
Fit a straight line to the following data taking x as the dependent variable.
x 1 3 4 6 8 9 11 14
y 1 2 4 4 5 7 8 9
Solution
If x is considered the dependent variable and y the independent variable, the equation
of the straight line to be fitted to the data is
x = a + by
The normal equations are
$\sum x = na + b\sum y$ …(1)
$\sum xy = a\sum y + b\sum y^2$ …(2)
Here, n = 8
x     y    y²    xy
1     1    1     1
3     2    4     6
4     4    16    16
6     4    16    24
8     5    25    40
9     7    49    63
11    8    64    88
14    9    81    126
Σx = 56    Σy = 40    Σy² = 256    Σxy = 364
Substituting these values in Eqs (1) and (2),
56 = 8a + 40b …(3)
364 = 40 a + 256b …(4)
Solving Eqs (3) and (4),
a = − 0.5
b = 1.5
Hence, the required equation of the straight line is
x = - 0.5 + 1.5 y
Example 6
If P is the pull required to lift a load W by means of a pulley block, find
a linear law of the form P = mW + c connecting P and W using the
following data:
P 12 15 21 25
W 50 70 100 120
P     W      W²       PW
12    50     2500     600
15    70     4900     1050
21    100    10000    2100
25    120    14400    3000
ΣP = 73    ΣW = 340    ΣW² = 31800    ΣPW = 6750
Exercise 7.1
1. Fit the line of best fit to the following data:
x 0 5 10 15 20 25
y 12 15 17 22 24 30
t°C 19 25 30 36 40 45 50
R 76 77 79 80 82 83 85
[Ans.: y = 19 + 9.7x]
V 60 65 70 75 80 85 90
R 109 114 118 123 127 130 133
Let (x_i, y_i), i = 1, 2, …, n be the set of n values and let the relation between x and y be
y = a + bx + cx². The constants a, b, and c are selected such that the parabola is the
best fit to the data. The residual at x = x_i is
\[ d_i = y_i - f(x_i) = y_i - (a + bx_i + cx_i^2) \]
\[ E = \sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} [y_i - (a + bx_i + cx_i^2)]^2 = \sum_{i=1}^{n} (y_i - a - bx_i - cx_i^2)^2 \]
For E to be minimum,
(i) $\frac{\partial E}{\partial a} = 0$
\[ \sum_{i=1}^{n} 2(y_i - a - bx_i - cx_i^2)(-1) = 0 \]
\[ \sum_{i=1}^{n} (y_i - a - bx_i - cx_i^2) = 0 \]
\[ \sum_{i=1}^{n} y_i = a\sum_{i=1}^{n} 1 + b\sum_{i=1}^{n} x_i + c\sum_{i=1}^{n} x_i^2 \]
\[ \sum y = na + b\sum x + c\sum x^2 \]
(ii) $\frac{\partial E}{\partial b} = 0$
\[ \sum_{i=1}^{n} 2(y_i - a - bx_i - cx_i^2)(-x_i) = 0 \]
\[ \sum_{i=1}^{n} x_i y_i = a\sum_{i=1}^{n} x_i + b\sum_{i=1}^{n} x_i^2 + c\sum_{i=1}^{n} x_i^3 \]
\[ \sum xy = a\sum x + b\sum x^2 + c\sum x^3 \]
(iii) $\frac{\partial E}{\partial c} = 0$
\[ \sum_{i=1}^{n} 2(y_i - a - bx_i - cx_i^2)(-x_i^2) = 0 \]
\[ \sum_{i=1}^{n} (x_i^2 y_i - ax_i^2 - bx_i^3 - cx_i^4) = 0 \]
\[ \sum_{i=1}^{n} x_i^2 y_i = a\sum_{i=1}^{n} x_i^2 + b\sum_{i=1}^{n} x_i^3 + c\sum_{i=1}^{n} x_i^4 \]
\[ \sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4 \]
These equations are known as normal equations. These equations can be solved
simultaneously to give the best values of a, b, and c. The best fitting parabola is obtained by
substituting the values of a, b, and c in the equation y = a + bx + cx².
Example 1
Fit a least squares quadratic curve to the following data:
x 1 2 3 4
y 1.7 1.8 2.3 3.2
Estimate y(2.4).
Solution
Let the equation of the least squares quadratic curve (parabola) be y = a + bx + cx².
The normal equations are
$\sum y = na + b\sum x + c\sum x^2$ …(1)
$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$ …(2)
$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4$ …(3)
Here, n = 4
x    y      x²    x³    x⁴     xy      x²y
1    1.7    1     1     1      1.7     1.7
2    1.8    4     8     16     3.6     7.2
3    2.3    9     27    81     6.9     20.7
4    3.2    16    64    256    12.8    51.2
Σx = 10    Σy = 9    Σx² = 30    Σx³ = 100    Σx⁴ = 354    Σxy = 25    Σx²y = 80.8
Note Σx, Σy, Σx², Σx³, Σx⁴, Σxy, Σx²y can be directly obtained with the help of a
scientific calculator.
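The three normal equations form a 3 × 3 linear system in a, b and c. The sketch below builds that system from the data of this example and solves it by Gaussian elimination; the helper names `solve_linear_system` and `fit_parabola` are hypothetical, and the printed coefficients are computed by the script rather than quoted from the text.

```python
# Least squares parabola y = a + b*x + c*x^2 via the three normal equations.
# solve_linear_system and fit_parabola are illustrative helper names.

def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution

def fit_parabola(xs, ys):
    """Return [a, b, c] for the least squares curve y = a + b*x + c*x^2."""
    n = len(xs)
    s = lambda p: sum(x**p for x in xs)            # sum of x^p
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x * x * y for x, y in zip(xs, ys))
    matrix = [
        [n,    s(1), s(2)],
        [s(1), s(2), s(3)],
        [s(2), s(3), s(4)],
    ]
    return solve_linear_system(matrix, [sy, sxy, sx2y])

a, b, c = fit_parabola([1, 2, 3, 4], [1.7, 1.8, 2.3, 3.2])
y_at_2_4 = a + b * 2.4 + c * 2.4**2
print(round(a, 3), round(b, 3), round(c, 3), round(y_at_2_4, 3))
```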
Example 2
Fit a second-degree polynomial using least square method to the
following data:
x 0 1 2 3 4
y 1 1.8 1.3 2.5 6.3
[Summer 2015]
Solution
Let the equation of the least squares quadratic curve be y = a + bx + cx². The normal
equations are
$\sum y = na + b\sum x + c\sum x^2$ ...(1)
$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$ ...(2)
$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4$ ...(3)
Here, n = 5
x    y      x²    x³    x⁴     xy      x²y
0    1      0     0     0      0       0
1    1.8    1     1     1      1.8     1.8
2    1.3    4     8     16     2.6     5.2
3    2.5    9     27    81     7.5     22.5
4    6.3    16    64    256    25.2    100.8
Σx = 10    Σy = 12.9    Σx² = 30    Σx³ = 100    Σx⁴ = 354    Σxy = 37.1    Σx²y = 130.3
Example 3
By the method of least squares, fit a parabola to the following data:
x 1 2 3 4 5
y 5 12 26 60 97
Also, estimate y at x = 6.
Solution
Let the equation of the parabola be y = a + bx + cx2. The normal equations are
Ây = na + bÂx + cÂx2 ...(1)
7.14 Chapter 7 Curve Fitting
x y x2 x3 x4 xy x2y
1 5 1 1 1 5 5
2 12 4 8 16 24 48
3 26 9 27 81 78 234
4 60 16 64 256 240 960
5 97 25 125 625 485 2425
Âx = 15 Ây = 200 Âx = 55
2
Âx = 225 Âx = 979 Âxy = 832
3 4
Âx y = 3672
2
Example 4
Fit a second-degree parabolic curve to the following data.
x 1 2 3 4 5 6 7 8 9
y 2 6 7 8 10 11 11 10 9
Solution
Let X = x − 5
Y = y − 10
Let the equation of the parabola be Y = a + bX + cX².
The normal equations are
$\sum Y = na + b\sum X + c\sum X^2$ …(1)
$\sum XY = a\sum X + b\sum X^2 + c\sum X^3$ …(2)
$\sum X^2 Y = a\sum X^2 + b\sum X^3 + c\sum X^4$ …(3)
Here, n = 9
x y X Y X² X³ X⁴ XY X²Y
1 2 −4 −8 16 −64 256 32 −128
2 6 −3 −4 9 −27 81 12 −36
3 7 −2 −3 4 −8 16 6 −12
4 8 −1 −2 1 −1 1 2 −2
5 10 0 0 0 0 0 0 0
6 11 1 1 1 1 1 1 1
7 11 2 1 4 8 16 2 4
8 10 3 0 9 27 81 0 0
9 9 4 −1 16 64 256 −4 −16
Note Since Σx, Σy, Σx², Σx³, Σx⁴, Σxy, Σx²y can be directly obtained with the help
of a scientific calculator, the problem can be solved without shifting the origin.
Example 5
Fit a second-degree parabola y = a + bx² to the following data:
x 1 2 3 4 5
y 1.8 5.1 8.9 14.1 19.8
Solution
Let the curve to be fitted to the data be
y = a + bx²
The normal equations are
$\sum y = na + b\sum x^2$ ...(1)
$\sum x^2 y = a\sum x^2 + b\sum x^4$ ...(2)
Here, n = 5
x    y       x²    x⁴     x²y
1    1.8     1     1      1.8
2    5.1     4     16     20.4
3    8.9     9     81     80.1
4    14.1    16    256    225.6
5    19.8    25    625    495
Σy = 49.7    Σx² = 55    Σx⁴ = 979    Σx²y = 822.9
Example 6
Fit a curve y = ax + bx² for the following data:
x 1 2 3 4 5 6
y 2.51 5.82 9.93 14.84 20.55 27.06
Solution
Let the curve to be fitted to the data be
y = ax + bx²
The normal equations are
$\sum xy = a\sum x^2 + b\sum x^3$ …(1)
$\sum x^2 y = a\sum x^3 + b\sum x^4$ …(2)
x y x² x³ x⁴ xy x²y
1 2.51 1 1 1 2.51 2.51
2 5.82 4 8 16 11.64 23.28
3 9.93 9 27 81 29.79 89.37
4 14.84 16 64 256 59.36 237.44
5 20.55 25 125 625 102.75 513.75
6 27.06 36 216 1296 162.36 974.16
Exercise 7.2
1. Fit a parabola to the following data:
x −2 −1 0 1 2
y 1.0 1.8 1.3 2.5 6.3
2. Fit a curve y = ax + bx² to the following data:
x −2 −1 0 1 2
y −72 −46 −12 35 93
x 0 2 5 10
y 4 7 6.4 −6
x 3 5 7 9 11 13
y 2 3 4 6 5 8
Let (x_i, y_i), i = 1, 2, …, n be the set of n values and let the relation between x and y be
$y = ab^x$.
Taking logarithm on both the sides of the equation $y = ab^x$,
$\log_e y = \log_e a + x \log_e b$
Putting $\log_e y = Y$, $\log_e a = A$, $x = X$ and $\log_e b = B$,
Y = A + BX
This is a linear equation in X and Y. The normal equations are
$\sum Y = nA + B\sum X$
$\sum XY = A\sum X + B\sum X^2$
Solving these equations, A and B, and, hence, a and b can be found. The best fitting
exponential curve is obtained by substituting the values of a and b in the equation
$y = ab^x$.
Similarly, the best fitting curves for the relations $y = ax^b$ and $y = ae^{bx}$ can be
obtained.
Example 1
Find the law of the form $y = ab^x$ to the following data:
x 1 2 3 4 5 6 7 8
y 1 1.2 1.8 2.5 3.6 4.7 6.6 9.1
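Solution
$y = ab^x$
Taking logarithm on both the sides,
$\log_e y = \log_e a + x \log_e b$
Putting $\log_e y = Y$, $\log_e a = A$, $x = X$ and $\log_e b = B$,
Y = A + BX
The normal equations are
$\sum Y = nA + B\sum X$ …(1)
$\sum XY = A\sum X + B\sum X^2$ …(2)
Here, n = 8
x y X Y X² XY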
1 1 1 0.0000 1 0.0000
2 1.2 2 0.1823 4 0.3646
3 1.8 3 0.5878 9 1.7634
4 2.5 4 0.9163 16 3.6652
5 3.6 5 1.2809 25 6.4045
6 4.7 6 1.5476 36 9.2856
7 6.6 7 1.8871 49 13.2097
8 9.1 8 2.2083 64 17.6664
Example 2
Fit a curve of the form $y = ab^x$ to the following data by the method of
least squares:
x 1 2 3 4 5 6 7
y 87 97 113 129 202 195 193
Solution
$y = ab^x$
Taking logarithm on both the sides,
$\log_e y = \log_e a + x \log_e b$
Putting $\log_e y = Y$, $\log_e a = A$, $x = X$ and $\log_e b = B$,
Y = A + BX
The normal equations are
$\sum Y = nA + B\sum X$ ...(1)
$\sum XY = A\sum X + B\sum X^2$ ...(2)
Here, n = 7
x    y      X    Y         X²    XY
1    87     1    4.4659    1     4.4659
2    97     2    4.5747    4     9.1494
3    113    3    4.7274    9     14.1822
4    129    4    4.8598    16    19.4392
5    202    5    5.3083    25    26.5415
6    195    6    5.2730    36    31.6380
7    193    7    5.2627    49    36.8389
ΣX = 28    ΣY = 34.4718    ΣX² = 140    ΣXY = 142.2551
Substituting these values in Eqs (1) and (2),
34.4718 = 7A + 28B ...(3)
142.2551 = 28A + 140B ...(4)
Solving Eqs (3) and (4),
A = 4.3006
B = 0.156
$\log_e a = A$
$\log_e a = 4.3006$
a = 73.744
$\log_e b = B$
$\log_e b = 0.156$
b = 1.1688
Hence, the required curve is
$y = 73.744\,(1.1688)^x$
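The log transformation used above turns the exponential fit into a straight-line fit, so the result can be checked with a few lines of code. This is a sketch only; `fit_line` and `fit_exponential` are hypothetical helper names, and small differences from the hand calculation come from the four-decimal logarithms used in the table.

```python
# Fit y = a * b**x by least squares on Y = log(y) against X = x.
# fit_line and fit_exponential are illustrative helper names.
from math import exp, log

def fit_line(xs, ys):
    """Solve the two normal equations for Y = A + B*X."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx**2)
    intercept = (sy - slope * sx) / n
    return intercept, slope

def fit_exponential(xs, ys):
    """Return (a, b) for y = a * b**x using natural logarithms."""
    A, B = fit_line(xs, [log(y) for y in ys])
    return exp(A), exp(B)

xs = [1, 2, 3, 4, 5, 6, 7]
ys = [87, 97, 113, 129, 202, 195, 193]
a, b = fit_exponential(xs, ys)
print(round(a, 2), round(b, 4))
```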
Example 3
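Fit a curve of the form $y = ax^b$ to the following data:
x 20 16 10 11 14
y 22 41 120 89 56
Solution
$y = ax^b$
Taking logarithm on both the sides,
$\log_e y = \log_e a + b \log_e x$
Putting $\log_e y = Y$, $\log_e a = A$, $b = B$ and $\log_e x = X$,
Y = A + BX
The normal equations are
$\sum Y = nA + B\sum X$ …(1)
$\sum XY = A\sum X + B\sum X^2$ …(2)
Here, n = 5
x y X Y X² XY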
20 22 2.9957 3.0910 8.9742 9.2597
16 41 2.7726 3.7136 7.6873 10.2963
10 120 2.3026 4.7875 5.3019 11.0237
11 89 2.3979 4.4886 5.7499 10.7632
14 56 2.6391 4.0254 6.9648 10.6234
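For the power law, both variables are transformed: X = log x and Y = log y. The sketch below applies this to the data of this example and, as a sanity check, to synthetic data generated from y = 2x³ (an assumption made purely for testing). The helper names `fit_line` and `fit_power_law` are hypothetical, and no fitted values are quoted here because the worked solution is not reproduced in this passage.

```python
# Fit y = a * x**b by least squares on Y = log(y) against X = log(x).
# fit_line and fit_power_law are illustrative helper names.
from math import exp, log

def fit_line(xs, ys):
    """Solve the two normal equations for Y = A + B*X."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx**2)
    intercept = (sy - slope * sx) / n
    return intercept, slope

def fit_power_law(xs, ys):
    """Return (a, b) for y = a * x**b; all x and y must be positive."""
    A, B = fit_line([log(x) for x in xs], [log(y) for y in ys])
    return exp(A), B

# Data of the example.
a, b = fit_power_law([20, 16, 10, 11, 14], [22, 41, 120, 89, 56])
print(a, b)

# Synthetic check: points lying exactly on y = 2 * x**3 are recovered.
a_check, b_check = fit_power_law([1, 2, 4, 8], [2, 16, 128, 1024])
print(a_check, b_check)
```

The fitted exponent for the example data is negative, which matches the table: y falls as x rises.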
Example 4
Fit a curve of the form $y = ae^{bx}$ to the following data:
x 1 3 5 7 9
y 115 105 95 85 80
Solution
$y = ae^{bx}$
Taking logarithm on both the sides,
$\log_e y = \log_e a + bx \log_e e = \log_e a + bx$
$\sum Y = nA + B\sum X$ …(1)
$\sum XY = A\sum X + B\sum X^2$ …(2)
Here, n = 5
x    y      X    Y         X²    XY
1    115    1    4.7449    1     4.7449
3    105    3    4.6539    9     13.9617
5    95     5    4.5539    25    22.7695
7    85     7    4.4427    49    31.0989
9    80     9    4.3820    81    39.438
a = 120.2653
and b = B = −0.0469
Hence, the required equation of the curve is
$y = 120.2653\, e^{-0.0469x}$
Example 5
Fit the exponential curve $y = ae^{bx}$ to the following data:
x 0 2 4 6 8
y 150 63 28 12 5.6
[Summer 2015]
Solution
$y = ae^{bx}$
Taking logarithm on both the sides,
$\log_e y = \log_e a + bx \log_e e = \log_e a + bx$
Putting $\log_e y = Y$, $\log_e a = A$, $b = B$ and $x = X$,
Y = A + BX
The normal equations are
$\sum Y = nA + B\sum X$ ...(1)
$\sum XY = A\sum X + B\sum X^2$ ...(2)
Here, n = 5
x y X Y X² XY
0 150 0 5.0106 0 0
2 63 2 4.1431 4 8.2862
4 28 4 3.3322 16 13.3288
6 12 6 2.4849 36 14.9094
8 5.6 8 1.7228 64 13.7824
Example 6
The pressure and volume of a gas are related by the equation $PV^{\gamma} = c$.
Fit this curve to the following data:
P 0.5 1.0 1.5 2.0 2.5 3.0
V 1.62 1.00 0.75 0.62 0.52 0.46
Solution
$PV^{\gamma} = c$
Taking logarithm on both the sides,
$\log_e P + \gamma \log_e V = \log_e c$
$\log_e V = \frac{1}{\gamma} \log_e c - \frac{1}{\gamma} \log_e P$
Putting $\log_e V = y$, $\frac{1}{\gamma} \log_e c = a$, $\log_e P = x$, $-\frac{1}{\gamma} = b$,
y = a + bx
P      V       x          y          x²        xy
0.5    1.62    –0.6931    0.4824     0.4804    –0.3343
1.0    1.00    0          0          0         0
1.5    0.75    0.4055     –0.2877    0.1644    –0.1166
2.0    0.62    0.6931     –0.4780    0.4804    –0.3313
2.5    0.52    0.9163     –0.6539    0.8396    –0.5992
3.0    0.46    1.0986     –0.7765    1.2069    –0.8531
Σx = 2.4204    Σy = –1.7137    Σx² = 3.1717    Σxy = –2.2345
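With the substitution y = log V and x = log P, the slope of the fitted line is b = −1/γ and the intercept is a = (1/γ) log c, so both constants follow from a straight-line fit. The sketch below carries this out for the tabulated data; `fit_line` is a hypothetical helper, and the printed values are computed by the script rather than quoted from the text.

```python
# Estimate gamma and c in P * V**gamma = c from a straight-line fit of
# y = log(V) against x = log(P), where b = -1/gamma and a = log(c)/gamma.
# fit_line is an illustrative helper name.
from math import exp, log

def fit_line(xs, ys):
    """Solve the two normal equations for y = a + b*x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx**2)
    a = (sy - b * sx) / n
    return a, b

P = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
V = [1.62, 1.00, 0.75, 0.62, 0.52, 0.46]

a, b = fit_line([log(p) for p in P], [log(v) for v in V])
gamma = -1 / b            # from b = -1/gamma
c = exp(a * gamma)        # from a = log(c)/gamma

print(round(gamma, 4), round(c, 4))
```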
Exercise 7.3
1. Fit the curve $y = ab^x$ to the following data:
x 2 3 4 5 6
y 144 172.3 207.4 248.8 298.5
[Ans.: $y = 100\,(1.2)^x$]
x 1 2 3 4
y 2.50 8.00 19.00 50.00
[Ans.: $y = 2.227x^{2.09}$]
4. Estimate γ by fitting the ideal gas law $PV^{\gamma} = c$ to the following data:
P 16.6 39.7 78.5 115.5 195.3 546.1
V 50 30 20 15 10 5
[Ans.: γ = 1.504]
Appendix
Standard Normal Distribution Table
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
t-Distribution Table
Significance level = α
Degrees of     .005 (1-tail)  .01 (1-tail)   .025 (1-tail)  .05 (1-tail)   .10 (1-tail)   .25 (1-tail)
freedom (ν)    .01 (2-tails)  .02 (2-tails)  .05 (2-tails)  .10 (2-tails)  .20 (2-tails)  .50 (2-tails)
1 63.657 31.821 12.706 6.314 3.078 1.000
2 9.925 6.965 4.303 2.920 1.886 0.816
3 5.841 4.541 3.182 2.353 1.638 0.765
4 4.604 3.747 2.776 2.132 1.533 0.741
5 4.032 3.365 2.571 2.015 1.476 0.727
6 3.707 3.143 2.447 1.943 1.440 0.718
7 3.500 2.998 2.365 1.895 1.415 0.711
8 3.355 2.896 2.306 1.860 1.397 0.706
9 3.250 2.821 2.262 1.833 1.383 0.703
10 3.169 2.764 2.228 1.812 1.372 0.700
11 3.106 2.718 2.201 1.796 1.363 0.697
12 3.054 2.681 2.179 1.782 1.356 0.696
13 3.012 2.650 2.160 1.771 1.350 0.694
14 2.977 2.625 2.145 1.761 1.345 0.692
15 2.947 2.602 2.132 1.753 1.341 0.691
16 2.921 2.584 2.120 1.746 1.337 0.690
17 2.898 2.567 2.110 1.740 1.333 0.689
18 2.878 2.552 2.101 1.734 1.330 0.688
19 2.861 2.540 2.093 1.729 1.328 0.688
20 2.845 2.528 2.086 1.725 1.325 0.687
21 2.831 2.518 2.080 1.721 1.323 0.686
22 2.819 2.508 2.074 1.717 1.321 0.686
23 2.807 2.500 2.069 1.714 1.320 0.685
24 2.797 2.492 2.064 1.711 1.318 0.685
25 2.787 2.485 2.060 1.708 1.316 0.684
26 2.779 2.479 2.056 1.706 1.315 0.684
27 2.771 2.473 2.052 1.703 1.314 0.684
28 2.763 2.467 2.048 1.701 1.313 0.683
29 2.756 2.462 2.045 1.699 1.311 0.683
Large 2.575 2.327 1.960 1.645 1.282 0.675
Appendix A.3
n 0.995 0.990 0.975 0.950 0.900 0.10 0.05 0.025 0.010 0.005
1 0.000039 0.00016 0.00098 0.0039 0.0158 2.71 3.84 5.02 6.63 7.88
2 0.0100 0.0201 0.0506 0.103 0.211 4.61 5.99 7.38 9.21 10.60
3 0.0717 0.115 0.216 0.352 0.584 6.25 7.81 9.35 11.34 12.84
4 0.207 0.297 0.484 0.711 1.06 7.78 9.49 11.14 13.28 14.86
5 0.412 0.554 0.831 1.15 1.61 9.24 11.07 12.83 15.09 16.75
6 0.676 0.872 1.24 1.64 2.20 10.64 12.59 14.45 16.81 18.55
7 0.989 1.24 1.69 2.17 2.83 12.02 14.07 16.01 18.48 20.28
8 1.34 1.65 2.18 2.73 3.49 13.36 15.51 17.53 20.09 21.96
9 1.73 2.09 2.70 3.33 4.17 14.68 16.92 19.02 21.67 23.59
10 2.16 2.56 3.25 3.94 4.87 15.99 18.31 20.48 23.21 25.19
11 2.60 3.05 3.82 4.57 5.58 17.28 19.68 21.92 24.73 26.76
12 3.07 3.57 4.40 5.23 6.30 18.55 21.03 23.34 26.22 28.30
13 3.57 4.11 5.01 5.89 7.04 19.81 22.36 24.74 27.69 29.82
14 4.07 4.66 5.63 6.57 7.79 21.06 23.68 26.12 29.14 31.32
15 4.60 5.23 6.26 7.26 8.55 22.31 25.00 27.49 30.58 32.80
16 5.14 5.81 6.91 7.96 9.31 23.54 26.30 28.85 32.00 34.27
17 5.70 6.41 7.56 8.67 10.08 24.77 27.59 30.19 33.41 35.72
18 6.26 7.01 8.23 9.39 10.86 25.99 28.87 31.53 34.81 37.16
19 6.84 7.63 8.91 10.12 11.65 27.20 30.14 32.85 36.19 38.58
20 7.43 8.26 9.59 10.85 12.44 28.41 31.41 34.17 37.57 40.00
21 8.03 8.90 10.28 11.59 13.24 29.62 32.67 35.48 38.93 41.40
22 8.64 9.54 10.98 12.34 14.04 30.81 33.92 36.78 40.29 42.80
23 9.26 10.20 11.69 13.09 14.85 32.01 35.17 38.08 41.64 44.18
24 9.89 10.86 12.40 13.85 15.66 33.20 36.42 39.36 42.98 45.56
25 10.52 11.52 13.12 14.61 16.47 34.38 37.65 40.65 44.31 46.93
26 11.16 12.20 13.84 15.38 17.29 35.56 38.88 41.92 45.64 48.29
27 11.81 12.88 14.57 16.15 18.11 36.74 40.11 43.19 46.96 49.64
28 12.46 13.56 15.31 16.93 18.94 37.92 41.34 44.46 48.28 50.99
29 13.12 14.26 16.05 17.71 19.77 39.09 42.56 45.72 49.59 52.34
30 13.79 14.95 16.79 18.49 20.60 40.26 43.77 46.98 50.89 53.67
40 20.71 22.16 24.43 26.51 29.05 51.81 55.76 59.34 63.69 66.77
60 35.53 37.48 40.48 43.19 46.46 74.40 79.08 83.30 88.38 91.95
120 83.85 86.92 91.58 95.70 100.62 140.23 146.57 152.21 158.95 163.65
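Entries of the chi-square table can be recomputed when a value is in doubt or when the required degrees of freedom are not tabulated. The following is a minimal sketch using only the Python standard library. It assumes that the column headings are upper-tail probabilities, that is, each entry c satisfies P(X > c) = heading for n degrees of freedom. The function names chi2_upper_tail and chi2_critical are illustrative, and the series-plus-bisection approach is one simple choice among several.

```python
import math


def chi2_upper_tail(x, df):
    """P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, h = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, h):
    # P(a, h) = h^a e^(-h) / Gamma(a + 1) * sum_{k>=0} h^k / ((a+1)...(a+k))
    term = 1.0
    total = 1.0
    k = 1
    while term > 1e-16 * total:
        term *= h / (a + k)
        total += term
        k += 1
    log_prefactor = a * math.log(h) - h - math.lgamma(a + 1.0)
    return 1.0 - math.exp(log_prefactor) * total


def chi2_critical(alpha, df, tol=1e-10):
    """Value c with P(X > c) = alpha, located by bisection."""
    lo, hi = 0.0, df + 10.0
    while chi2_upper_tail(hi, df) > alpha:  # widen until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, df) > alpha:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2.0


# Example: 5% point for 10 degrees of freedom (tabulated as 18.31)
print(round(chi2_critical(0.05, 10), 2))
```

For two degrees of freedom the upper-tail probability reduces to exp(-x/2), so the row n = 2 can also be checked by hand: for example, -2 ln(0.05) is approximately 5.99.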
F-Distribution Table
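Upper 5% points; n1 (columns) = degrees of freedom of the numerator, n2 (rows) = degrees of freedom of the denominator
n2 \ n1 1 2 3 4 5 6 7 8 9 10 11 12 13 14 16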
1 161 200 216 225 230 234 237 239 241 242 243 244 245 245 246
2 18.5 19.0 19.2 19.2 19.3 19.3 19.4 19.4 19.4 19.4 19.4 19.4 19.4 19.4 19.4
3 10.1 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79 8.76 8.74 8.73 8.71 8.69
4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96 5.94 5.91 5.89 5.87 5.84
5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.73 4.70 4.68 4.66 4.64 4.60
6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06 4.03 4.00 3.98 3.96 3.92
7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64 3.60 3.57 3.55 3.53 3.49
8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35 3.31 3.28 3.26 3.24 3.20
9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.14 3.10 3.07 3.05 3.03 2.99
10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.98 2.94 2.91 2.89 2.86 2.83
11 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.85 2.82 2.79 2.76 2.74 2.70
12 4.75 3.89 3.49 3.25 3.11 3.00 2.91 2.85 2.80 2.75 2.72 2.69 2.66 2.64 2.60
13 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71 2.67 2.63 2.60 2.58 2.55 2.51
14 4.60 3.74 3.35 3.11 2.96 2.85 2.76 2.70 2.65 2.60 2.57 2.53 2.51 2.48 2.44
16 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 2.49 2.46 2.42 2.40 2.37 2.33
18 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46 2.41 2.37 2.34 2.31 2.29 2.25
20 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39 2.35 2.31 2.28 2.25 2.22 2.18
22 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34 2.30 2.26 2.23 2.20 2.17 2.13
24 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30 2.25 2.21 2.18 2.15 2.13 2.09
26 4.23 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27 2.22 2.18 2.15 2.12 2.09 2.05
28 4.20 3.34 2.95 2.71 2.56 2.45 2.36 2.29 2.24 2.19 2.15 2.12 2.09 2.06 2.02
30 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21 2.16 2.13 2.09 2.06 2.04 1.99
40 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12 2.08 2.04 2.00 1.97 1.95 1.90
50 4.03 3.18 2.79 2.56 2.40 2.29 2.20 2.13 2.07 2.03 1.99 1.95 1.92 1.89 1.85
60 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04 1.99 1.95 1.92 1.89 1.86 1.82
80 3.96 3.11 2.72 2.49 2.33 2.21 2.13 2.06 2.00 1.95 1.91 1.88 1.84 1.82 1.77
100 3.94 3.09 2.70 2.46 2.31 2.19 2.10 2.03 1.97 1.93 1.89 1.85 1.82 1.79 1.75
200 3.89 3.04 2.65 2.42 2.26 2.14 2.06 1.98 1.93 1.88 1.84 1.80 1.77 1.74 1.69
500 3.86 3.01 2.62 2.39 2.23 2.12 2.03 1.96 1.90 1.85 1.81 1.77 1.74 1.71 1.66
∞ 3.84 3.00 2.60 2.37 2.21 2.10 2.01 1.94 1.88 1.83 1.79 1.75 1.72 1.69 1.64
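The three tables in this appendix are linked by two standard identities, which give a quick way to spot misprints. For n1 = 1, the 5% point of F equals the square of the two-tailed 5% value of t with n2 degrees of freedom. In the last row (infinite n2), the 5% point of F equals the 5% chi-square value for n1 degrees of freedom divided by n1. The sketch below checks a few entries copied from the tables. It assumes that the F table gives upper 5% points and that the third column of the t-table corresponds to a one-tailed probability of 0.025; the dictionary and function names are illustrative.

```python
# Values copied from the tables in this appendix.
T_025 = {7: 2.365, 10: 2.228, 20: 2.086}   # t-table, third column (assumed 0.025 one-tailed)
F_N1_1 = {7: 5.59, 10: 4.96, 20: 4.35}     # F table, column n1 = 1
CHI2_05 = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49, 5: 11.07, 10: 18.31}  # chi-square, 0.05 column
F_INF = {1: 3.84, 2: 3.00, 3: 2.60, 4: 2.37, 5: 2.21, 10: 1.83}      # F table, last row


def max_gap_t_squared():
    """Largest difference between t^2 and the tabulated F(1, n2)."""
    return max(abs(T_025[n2] ** 2 - F_N1_1[n2]) for n2 in T_025)


def max_gap_chi2_over_n1():
    """Largest difference between chi-square/n1 and the tabulated F(n1, infinity)."""
    return max(abs(CHI2_05[n1] / n1 - F_INF[n1]) for n1 in CHI2_05)


print(max_gap_t_squared(), max_gap_chi2_over_n1())
```

Both gaps should stay below 0.01, which is the level expected from rounding the tabulated values to two or three decimal places.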
Index
A
Additive Law of Probability (Addition Theorem) 1.15
Alternative Hypothesis 6.2
Applications of t-distribution 6.37
Assumptions for t-test 6.37
Axiomatic Definition of Probability 1.3
B
Bayes’ Theorem 1.47
Binomial Distribution 5.2
Binomial Frequency Distribution 5.4
Bivariate Data 4.2
Bounds on Probabilities 3.84
C
Central Moment 3.18
Central Moments or Moments about Actual Mean 3.18
Chebyshev’s Inequality 3.84
Chi-square Test for Independence of Attributes 6.74
Chi-square Test: Goodness of Fit 6.66
Chi-square Distribution 6.65
Chi-square (χ²) Test 6.65
Classical Definition of Probability 1.3
Coefficient of Variation
Conditional Expectation and Conditional Variance 3.69
Conditional Probability Function 2.42, 2.57
Conditional Probability Theorem 1.25
Conditions for Binomial Distribution 5.2
Conditions of Poisson Distributions 5.29
Confidence Limits 6.5
Constants of the Binomial Distribution 5.2
Constants of the Exponential Distribution 5.80
Constants of the Gamma Distribution 5.97
Constants of the Normal Distribution 5.54
Constants of the Poisson Distribution 5.29
Continuous Distribution Function 2.18
Continuous Random Variable 2.2
Correlation 4.2
Correlation Coefficient 6.2
Critical Region 6.3
Cumulative Distribution Function 2.4, 2.41, 2.57
Cumulative Probability Distribution 2.4
D
De Morgan’s Laws 1.14
Definition of Probability 1.4
Definitions of Probability 1.3
Deviation 7.2
Discrete Distribution Function 2.4
Discrete Probability Distribution 2.3
Discrete Random Variables 2.2
E
Empirical or Statistical Definition of Probability 1.3
Equally Likely Events 1.2
Errors in Hypothesis Testing 6.3
Event 1.2, 1.4
Examples of Binomial Distribution 5.2
Examples of Poisson Distribution 5.29
Exhaustive Event 1.2
Expectation 3.2
Expected Values of Two Dimensional Random Variables 3.68
Exponential Distribution 5.79
Expressions for Regression Coefficients 4.32
F
Favourable Events 1.3
Fitting a Normal Distribution 5.75
Fitting of Exponential and Logarithmic Curves 7.18
L
Least Square Method 7.2
Left Tailed Test 6.4
Level of Significance 6.3
Line of Regression of x on y 4.31
Line of Regression of y on x 4.31
Linear Correlation 4.3
Linear Regression 4.30
M
Marginal Probability Function 2.42, 2.57
Mean 3.2, 3.32
Mean Deviation 3.3, 3.33
Mean of the Binomial Distribution 5.2
Mean of the Exponential Distribution 5.80
Mean of the Gamma Distribution 5.97
Mean of the Normal Distribution 5.54
Mean of the Poisson Distribution 5.29
Measures for Continuous Random Variables 3.32
N
Negative Correlation 4.2
No Correlation 4.4
Nonlinear Correlation 4.3
Nonlinear Regression 4.30
Normal Distribution 5.53
Normal Equations 7.3
Null Hypothesis 6.2
O
Outcome 1.2
P
Pairwise Independence 1.26
Parameters 6.2
Partial Correlation 4.3
Perfect Negative Correlation 4.4
Perfect Positive Correlation 4.4
Poisson Approximation to the Binomial Distribution 5.28