Chapter 5 Probability and Statistics

Probability, probability distribution, statistics, binomial, bernoulli, confidence interval, hypothesis test, simple linear regression, least square method, sample correlation

Uploaded by

Michelle Lee Sze Yee

CHAPTER 5 PROBABILITY AND STATISTICS

Definition of statistics: The mathematics of the collection, organization and interpretation of numerical data, especially the analysis of a population by inference from sampling.

Let $P(A)$ denote the probability of an event $A$, which is a subset of a sample space.

5.1 Rules of probability

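1. Complement rule: $P(A^c) = 1 - P(A)$
2. Addition rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
3. For disjoint events, $P(A \cup B) = P(A) + P(B)$
4. Product rule: $P(A \cap B) = P(A \mid B)\,P(B)$; thus, if $P(B) > 0$, $P(A \mid B) = P(A \cap B) / P(B)$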

5.2 Conditional probability

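(a) If $A$ and $B$ are any events with $P(B) > 0$, then $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$.

(b) If $A$ and $B$ are any events with $P(B) > 0$ and $A$ and $B$ are independent, then $P(A \mid B) = P(A)$.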

5.3 Multiplication rule

If $A$ and $B$ are any events, then $P(A \cap B) = P(A \mid B)\,P(B)$.

5.4 Total probability rule

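If $E_1, E_2, \ldots, E_k$ are mutually exclusive and exhaustive events, then

$P(A) = P(A \cap E_1) + P(A \cap E_2) + \cdots + P(A \cap E_k)$

$\quad\;\; = P(A \mid E_1)P(E_1) + P(A \mid E_2)P(E_2) + \cdots + P(A \mid E_k)P(E_k)$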

5.5 Bayes Theorem

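If $E_1, E_2, \ldots, E_k$ are mutually exclusive events, one of which occurs given that another event $A$ occurs, then

$P(E_i \mid A) = \dfrac{P(A \mid E_i)\,P(E_i)}{\sum_{j=1}^{k} P(A \mid E_j)\,P(E_j)}$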

Example 5.1 Three machines produce similar car parts. A produces 40% of the total output, machines B and C
produce 25% and 15% respectively. The proportions of the output from each machine that do not conform to the
specification are 10% for A, 5% for B and 1% for C. What proportion of these parts that do not conform to the
specification are produced by machine A?
Solution
Let D represent the event that a particular part is defective. Then the overall proportion of defective parts is

Using Bayes theorem,

Example 5.2 Suppose that 0.1% of the people in a certain area have a disease D and that a mass screening test is
used to detect cases. The test gives either a positive result or a negative result for each person. In practice the test
gives a positive result with probability 99.9% for a person who has D and a probability of 0.2% for a person who has
not. What is the probability that a person for whom the test is positive actually has the disease?
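Solution
Let $T$ represent the event that the test gives a positive result.
Then $P(D) = 0.001$, $P(T \mid D) = 0.999$ and $P(T \mid D^c) = 0.002$, so by the total probability rule

$P(T) = P(T \mid D)P(D) + P(T \mid D^c)P(D^c) = (0.999)(0.001) + (0.002)(0.999) = 0.002997$

Using Bayes theorem,

$P(D \mid T) = \dfrac{P(T \mid D)P(D)}{P(T)} = \dfrac{0.000999}{0.002997} \approx 0.333$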
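The screening-test calculation in Example 5.2 can be checked numerically. The following is a minimal Python sketch of the total probability rule and Bayes theorem; the function name `bayes_posterior` and the variable names are illustrative choices, not part of the original notes.

```python
def bayes_posterior(priors, likelihoods):
    """Return ([P(E_i | A)], P(A)) given P(E_i) and P(A | E_i).

    The events E_i are assumed mutually exclusive and exhaustive.
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(A and E_i)
    total = sum(joint)                                    # P(A), total probability rule
    return [j / total for j in joint], total


# Example 5.2: E_1 = person has disease D, E_2 = person does not have D.
priors = [0.001, 0.999]        # P(D), P(not D)
likelihoods = [0.999, 0.002]   # P(T | D), P(T | not D)

posterior, p_positive = bayes_posterior(priors, likelihoods)
print(f"P(T) = {p_positive:.6f}")
print(f"P(D | T) = {posterior[0]:.4f}")
```

Although the test is very accurate, only about one third of the positive results belong to people who have the disease, because the disease itself is rare.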

5.6 Random variables

A random variable (rv) has a sample space of possible numerical values together with a distribution of probabilities.
Examples: (a) the number of defectives in a process (b) number of successful projects.
Random variables can be discrete or continuous.
Discrete random variables and distributions
Definition
If $X$ is a discrete random variable, then $p(x) = P(X = x)$ is called a probability mass function or probability distribution if, for each outcome $x$,

(a) $p(x) \ge 0$

(b) $\sum_{x} p(x) = 1$

Cumulative distribution functions

The cumulative distribution function $F(x)$ for a discrete random variable $X$ with probability distribution $p(x) = P(X = x)$ is

$F(x) = P(X \le x) = \sum_{t \le x} P(X = t)$

Properties of the cumulative distribution functions

$F(x)$ satisfies the following properties:

(a) $F(x) = P(X \le x) = \sum_{t \le x} P(X = t)$
(b) $0 \le F(x) \le 1$
(c) If $x \le y$, then $F(x) \le F(y)$

Mean of a discrete random variable

If $X$ is a discrete random variable with probability distribution $p(x) = P(X = x)$, then the mean or expected value for $X$, which is denoted by $\mu_X$ or $E(X)$, is given by

$\mu_X = E(X) = \sum_{x} x\,p(x)$

Variance of a discrete random variable

If $X$ is a discrete random variable with probability distribution $p(x) = P(X = x)$, then the variance for $X$, which is denoted by $V(X)$ or $\sigma_X^2$, is given by

$V(X) = \sigma_X^2 = E\left[(X - \mu_X)^2\right] = \sum_{x} (x - \mu_X)^2\,p(x)$

Standard deviation of a discrete random variable

The standard deviation of a discrete random variable, denoted as $\sigma_X$, is the positive square root of the variance, $\sigma_X = \sqrt{\sigma_X^2}$.
Example 5.3
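The number of successful projects $X$ per day obtained by a small engineering firm can be described by the following probability distribution:

$P(X = x) = \dfrac{x}{10}$ for $x = 0, 1, 2, 3, 4$, and $P(X = x) = 0$ otherwise.

Find the cumulative distribution function for $X$. Find the mean and variance for the number of successful projects per day.
Solution
The cumulative distribution function for $X$ is given by

$F_X(x) = P(X \le x) = \sum_{t \le x} P_X(X = t)$

For $x = 0$, $F(0) = P(X \le 0) = P(0) = \frac{0}{10} = 0$

For $x = 1$, $F(1) = P(X \le 1) = P(0) + P(1) = 0 + \frac{1}{10} = 0.1$

For $x = 2$, $F(2) = P(X \le 2) = P(0) + P(1) + P(2) = 0 + \frac{1}{10} + \frac{2}{10} = 0.3$

For $x = 3$, $F(3) = P(X \le 3) = P(0) + P(1) + P(2) + P(3) = 0 + \frac{1}{10} + \frac{2}{10} + \frac{3}{10} = 0.6$

For $x = 4$, $F(4) = P(X \le 4) = P(0) + P(1) + P(2) + P(3) + P(4) = 0 + \frac{1}{10} + \frac{2}{10} + \frac{3}{10} + \frac{4}{10} = 1.0$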
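The worked solution above lists the cumulative distribution function only; the mean and variance asked for in Example 5.3 follow from the definitions of $E(X)$ and $V(X)$ given earlier. The sketch below evaluates all three with exact fractions. It is an illustrative check, and the variable names are assumptions made for this example.

```python
from fractions import Fraction
from itertools import accumulate

# Probability mass function of Example 5.3: P(X = x) = x / 10 for x = 0..4.
support = [0, 1, 2, 3, 4]
pmf = {x: Fraction(x, 10) for x in support}
assert sum(pmf.values()) == 1  # property (b) of a probability mass function

# Cumulative distribution function F(x) = sum of P(X = t) for t <= x.
cdf = dict(zip(support, accumulate(pmf[x] for x in support)))

# Mean and variance from their definitions.
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

for x in support:
    print(f"F({x}) = {float(cdf[x]):.1f}")
print("mean =", mean, " variance =", variance)
```

With these probabilities the mean is 3 successful projects per day and the variance is 1.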

5.7 Continuous random variables and distributions

Definition
If $X$ is a continuous random variable defined over a set of real numbers, then $f(x)$ is called a probability density function if

(a) $f(x) \ge 0$

(b) $\int_{-\infty}^{\infty} f(x)\,dx = 1$

(c) $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$, where $X$ lies in the interval $[a, b]$

Cumulative distribution functions

The cumulative distribution function $F(x)$ for a continuous random variable $X$ with probability density function $f(x)$ is

$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$ for $-\infty < x < \infty$

Properties of the cumulative distribution functions

$F(x)$ satisfies the following properties:

(a) $P(X \le a) = \int_{-\infty}^{a} f(t)\,dt$ for $-\infty < a < \infty$

(b) $P(X \ge a) = \int_{a}^{\infty} f(t)\,dt$

(c) $P(a \le X \le b) = \int_{a}^{b} f(t)\,dt$ for $a \le b$

Mean of a continuous random variable

If $X$ is a continuous random variable with probability density function $f(x)$, then the mean or expected value for $X$, which is denoted by $\mu_X$ or $E(X)$, is given by

$\mu_X = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$

Variance of a continuous random variable

If $X$ is a continuous random variable with probability density function $f(x)$, then the variance for $X$, which is denoted by $V(X)$ or $\sigma_X^2$, is given by

$V(X) = \sigma_X^2 = E\left[(X - \mu_X)^2\right] = \int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,dx = \int_{-\infty}^{\infty} x^2 f_X(x)\,dx - \mu_X^2$

Standard deviation of a continuous random variable

The standard deviation of a continuous random variable, denoted as $\sigma_X$, is the positive square root of the variance, $\sigma_X = \sqrt{\sigma_X^2}$.

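Example 5.4 Assume that the particle size of an air pollutant (in micrometers) can be described by the following probability function:

$f_X(x) = \dfrac{3}{x^4}$ for $x > 1$, and $f_X(x) = 0$ otherwise.

(a) Show that $f(x)$ is a probability density function.
(b) Find the cumulative distribution function.
(c) Determine the mean and standard deviation.
Solution

(a) $f(x)$ is a probability density function if it satisfies $\int_{-\infty}^{\infty} f(x)\,dx = 1$. Here

$\int_{-\infty}^{\infty} f_X(x)\,dx = \int_{1}^{\infty} \dfrac{3}{x^4}\,dx = \left[-x^{-3}\right]_{1}^{\infty} = 0 - (-1) = 1$

Therefore $f(x)$ is a probability density function.

(b) The cumulative distribution function for $X$ is given by

$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt = \int_{1}^{x} \dfrac{3}{t^4}\,dt = \left[-t^{-3}\right]_{1}^{x} = 1 - \dfrac{1}{x^3}$ for $x > 1$

(c) The mean for $X$ is given by

$\mu_X = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{1}^{\infty} x \cdot \dfrac{3}{x^4}\,dx = \int_{1}^{\infty} \dfrac{3}{x^3}\,dx = \left[-\dfrac{3}{2x^2}\right]_{1}^{\infty} = \dfrac{3}{2}$ micrometers

The variance for $X$ is given by

$V(X) = \sigma_X^2 = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu_X^2 = \int_{1}^{\infty} x^2 \cdot \dfrac{3}{x^4}\,dx - \left(\dfrac{3}{2}\right)^2 = \int_{1}^{\infty} \dfrac{3}{x^2}\,dx - \dfrac{9}{4} = \left[-\dfrac{3}{x}\right]_{1}^{\infty} - \dfrac{9}{4} = 3 - \dfrac{9}{4} = \dfrac{3}{4}$ sq. micrometers

Hence the standard deviation is $\sigma_X = \sqrt{3/4} \approx 0.866$ micrometers.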
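The integrals in Example 5.4 can be cross-checked numerically. The sketch below is one possible approach, not the method used in the notes: it substitutes $x = 1/u$ so that the infinite range $[1, \infty)$ becomes $(0, 1]$, then applies the midpoint rule. The helper name `integrate_tail` and the number of subintervals are assumptions made for this illustration.

```python
from math import sqrt


def integrate_tail(g, n=100_000):
    """Approximate the integral of g over [1, inf) by the midpoint rule.

    Uses the substitution x = 1/u, dx = du / u**2, so the range becomes (0, 1].
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h          # midpoint of the k-th subinterval
        x = 1.0 / u
        total += g(x) / (u * u) * h
    return total


def pdf(x):
    """Density of Example 5.4 on x > 1."""
    return 3.0 / x**4


total_probability = integrate_tail(pdf)
mean = integrate_tail(lambda x: x * pdf(x))
second_moment = integrate_tail(lambda x: x * x * pdf(x))
variance = second_moment - mean**2
std_dev = sqrt(variance)

print(f"integral of f = {total_probability:.6f}")
print(f"mean = {mean:.4f}, variance = {variance:.4f}, sd = {std_dev:.4f}")
```

The numerical values agree with the closed-form results: total probability 1, mean 3/2 and variance 3/4.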
5.8 Discrete distributions
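Bernoulli distribution
PMF: $P(X = x) = p^x (1 - p)^{1 - x}$
Range: $x = 0, 1$ and $0 \le p \le 1$
Mean: $p$
Variance: $p(1 - p)$

Binomial distribution
PMF: $P(X = x) = \dbinom{n}{x} p^x (1 - p)^{n - x}$
Range: $x = 0, 1, \ldots, n$ and $0 \le p \le 1$
Parameters: $n$ and $p$
Mean: $np$
Variance: $np(1 - p)$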
Example 5.5 Suppose a road is flooded with probability

during a year and not more than one flood occurs
during a year. What is the probability that it will be flooded at least once during a five year period?
Solution
Let X be the event a flood occurs in a year.
Then,

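Poisson distribution
PMF: $P(X = x) = \dfrac{\lambda^x e^{-\lambda}}{x!}$
Range: $x = 0, 1, 2, \ldots$
Parameter: $\lambda$
Mean: $\lambda$
Variance: $\lambda$

If $n$ is large and $p$ is small, the binomial distribution can be approximated by the Poisson distribution with $\lambda = np$.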
Example 5.6 The number of flaws for a thin copper wire follows a Poisson distribution with a mean of 2.3 flaws per mm. (a) Determine the probability of exactly two flaws in 1 mm of wire. (b) Determine the probability of ten flaws in 5 mm of wire.
Solution
(a) Let $X$ be the number of flaws in 1 mm of wire.
Given that $\lambda = 2.3$, thus

$P(X = 2) = \dfrac{e^{-2.3}(2.3)^2}{2!} \approx 0.265$

(b) Let $X$ be the number of flaws in 5 mm of wire. Then $X$ has a Poisson distribution with $\lambda = 5 \times 2.3 = 11.5$ flaws.

$P(X = 10) = \dfrac{e^{-11.5}(11.5)^{10}}{10!} \approx 0.113$
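The two Poisson probabilities in Example 5.6 can be evaluated directly from the probability mass function. This is a minimal sketch; the function name `poisson_pmf` is an illustrative choice.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return lam**x * exp(-lam) / factorial(x)


rate_per_mm = 2.3  # mean number of flaws per mm, from Example 5.6

# (a) exactly two flaws in 1 mm of wire
p_a = poisson_pmf(2, rate_per_mm)

# (b) exactly ten flaws in 5 mm of wire: the mean scales with the length
p_b = poisson_pmf(10, 5 * rate_per_mm)

print(f"(a) P(X = 2)  = {p_a:.4f}")
print(f"(b) P(X = 10) = {p_b:.4f}")
```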

5.9 Continuous distribution

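Normal distribution

PDF: $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left[-\dfrac{1}{2}\left(\dfrac{x - \mu}{\sigma}\right)^2\right]$

Range: $-\infty < x < \infty$, $-\infty < \mu < \infty$, $\sigma > 0$

Parameters: $\mu$: location parameter, $\sigma$: scale parameter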

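If $X$ follows a normal distribution then $E(X) = \mu$ and $V(X) = \sigma^2$.

Also, $Z = \dfrac{X - \mu}{\sigma}$ follows the standard normal distribution.

[Figure: plot of the normal probability density function $f(x)$ against $x$]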
5.10 Sample measures and parameter estimates

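Let $X_1, X_2, \ldots, X_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2$. Then the point estimates for $\mu$ and $\sigma^2$ are $\hat{\mu} = \bar{x}$ and $\hat{\sigma}^2 = s^2$,

where

$\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$ is the sample mean

and

$s^2 = \dfrac{1}{n - 1}\sum_{i=1}^{n} (x_i - \bar{x})^2$ is the sample variance.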
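As an illustration of the two point estimates, the sketch below computes the sample mean and the sample variance (with divisor $n - 1$) for the ten clay moisture readings that appear later in Example 5.9, first from the formulas and then with Python's standard `statistics` module. The variable names are illustrative.

```python
from statistics import mean, variance

# Moisture content readings (in %) from Example 5.9.
data = [1.81, 2.00, 2.74, 3.56, 2.13, 4.64, 3.64, 4.62, 4.47, 3.12]
n = len(data)

# Direct use of the formulas.
x_bar = sum(data) / n
s_squared = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# The standard library gives the same estimates.
assert abs(x_bar - mean(data)) < 1e-9
assert abs(s_squared - variance(data)) < 1e-9

print(f"sample mean     = {x_bar:.3f}")
print(f"sample variance = {s_squared:.3f}")
```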

5.11 Confidence interval for the mean based on the normal distribution
(1) Population variance is known

The $100(1 - \alpha)\%$ confidence interval for the mean $\mu$ is given by

$\bar{X} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$

where
(a) $\bar{X}$ is the sample mean.
(b) $z_{\alpha/2}$ is the $100(1 - \alpha/2)$th quantile of the standard normal distribution, which is given in Table 1.

Assumptions:
(a) $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from a population which has a normal distribution with mean $\mu$ and variance $\sigma^2$.
(b) The sample size $n$ can either be small or large.

(2) Population variance is unknown

The $100(1 - \alpha)\%$ confidence interval for the mean $\mu$ is given by

$\bar{X} - z_{\alpha/2}\dfrac{S}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\dfrac{S}{\sqrt{n}}$

where
(a) $\bar{X}$ is the sample mean and $S$ is the sample standard deviation.
(b) $z_{\alpha/2}$ is the $100(1 - \alpha/2)$th quantile of the standard normal distribution, which is given in Table 1.

Assumptions:
(a) $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from a population which has a normal distribution with mean $\mu$ and variance $\sigma^2$.
(b) The sample size $n$ is large.

Table 1: Cumulative distribution function for the standard normal distribution

$P(Z \le z) = \int_{-\infty}^{z} \dfrac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\,dx$

z     .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
0.0   0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1   0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2   0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
0.3   0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
0.4   0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
0.5   0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
0.6   0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
0.7   0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
0.8   0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
0.9   0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
1.0   0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
1.1   0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
1.2   0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
1.3   0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
1.4   0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
1.5   0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
1.6   0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
1.7   0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
1.8   0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
1.9   0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
2.0   0.9772  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817
2.1   0.9821  0.9826  0.9830  0.9834  0.9838  0.9842  0.9846  0.9850  0.9854  0.9857
2.2   0.9861  0.9864  0.9868  0.9871  0.9875  0.9878  0.9881  0.9884  0.9887  0.9890
2.3   0.9893  0.9896  0.9898  0.9901  0.9904  0.9906  0.9909  0.9911  0.9913  0.9916
2.4   0.9918  0.9920  0.9922  0.9925  0.9927  0.9929  0.9931  0.9932  0.9934  0.9936
2.5   0.9938  0.9940  0.9941  0.9943  0.9945  0.9946  0.9948  0.9949  0.9951  0.9952
2.6   0.9953  0.9955  0.9956  0.9957  0.9959  0.9960  0.9961  0.9962  0.9963  0.9964
2.7   0.9965  0.9966  0.9967  0.9968  0.9969  0.9970  0.9971  0.9972  0.9973  0.9974
2.8   0.9974  0.9975  0.9976  0.9977  0.9977  0.9978  0.9979  0.9979  0.9980  0.9981
2.9   0.9981  0.9982  0.9982  0.9983  0.9984  0.9984  0.9985  0.9985  0.9986  0.9986
3.0   0.9987  0.9987  0.9987  0.9988  0.9988  0.9989  0.9989  0.9989  0.9990  0.9990
3.1   0.9990  0.9991  0.9991  0.9991  0.9992  0.9992  0.9992  0.9992  0.9993  0.9993
3.2   0.9993  0.9993  0.9994  0.9994  0.9994  0.9994  0.9994  0.9995  0.9995  0.9995
3.3   0.9995  0.9995  0.9995  0.9996  0.9996  0.9996  0.9996  0.9996  0.9996  0.9997
3.4   0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9998
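Entries of Table 1 can be reproduced, and quantiles obtained without interpolation, using the Python standard library. The sketch below is illustrative: `phi` is a helper name chosen here, built on the identity $\Phi(z) = \frac{1}{2}\left[1 + \operatorname{erf}(z/\sqrt{2})\right]$, and `statistics.NormalDist` (Python 3.8 or later) supplies the inverse of the cumulative distribution function.

```python
from math import erf, sqrt
from statistics import NormalDist


def phi(z):
    """Standard normal cumulative distribution function P(Z <= z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Reproduce a few entries of Table 1.
for z in (0.0, 1.0, 1.65, 1.96, 2.57):
    print(f"P(Z <= {z:4.2f}) = {phi(z):.4f}")

# Quantiles: z such that P(Z <= z) equals a given probability.
z_90 = NormalDist().inv_cdf(0.95)    # used for a 90% two-sided interval
z_99 = NormalDist().inv_cdf(0.995)   # used for a 99% two-sided interval
print(f"z_0.05 = {z_90:.3f}, z_0.005 = {z_99:.3f}")
```

The exact quantiles are slightly different from the rounded table look-ups 1.65 and 2.57 used in the worked examples, because the table is tabulated in steps of 0.01.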

Example 5.7
A study was done to determine the wind speed distribution in Penang. The following monthly wind speed data (measured in m/s) was obtained.

15.42  12.85  10.28  13.36  15.42  20.56  16.28  25.70  15.42   9.25
10.28   9.25   8.22  11.31  14.91  16.45  13.36  15.42  13.36  12.85
11.31  11.31  12.85  11.82  14.39  15.42  16.96  21.59  15.42  15.42
12.85  12.85  11.82  14.39  12.34  24.67  12.85  20.05  27.24  22.62

Find a 90% confidence interval for the true mean wind speed in Penang.

Solution
Let $\mu$ be the true mean wind speed (in m/s) in Penang.

Since the sample size is large ($n = 40$), the following confidence interval is used.

The 90% confidence interval for the true population mean is given by

$\bar{X} - z_{\alpha/2}\dfrac{S}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\dfrac{S}{\sqrt{n}}$

$\bar{X} - z_{0.05}\dfrac{S}{\sqrt{n}} \le \mu \le \bar{X} + z_{0.05}\dfrac{S}{\sqrt{n}}$

$14.953 - 1.65\dfrac{4.489}{\sqrt{40}} \le \mu \le 14.953 + 1.65\dfrac{4.489}{\sqrt{40}}$

$14.953 - 1.65(0.710) \le \mu \le 14.953 + 1.65(0.710)$

$14.953 - 1.172 \le \mu \le 14.953 + 1.172$

$13.781 \le \mu \le 16.125$

Thus the 90% confidence interval for the true mean wind speed in Penang is from 13.781 m/s to 16.125 m/s.

Calculations

$\bar{X} = \dfrac{X_1 + X_2 + \cdots + X_{40}}{40} = \dfrac{15.42 + 10.28 + \cdots + 15.42 + 22.62}{40} = 14.953$

$S^2 = \dfrac{1}{39}\sum_{i=1}^{40}(X_i - \bar{X})^2 = \dfrac{(15.42 - 14.953)^2 + (10.28 - 14.953)^2 + \cdots + (22.62 - 14.953)^2}{39} = 4.489^2 = 20.149$

From Table 1, $z_{0.05} = 1.65$
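The large-sample interval of Example 5.7 can be reproduced from the raw data. The sketch below is illustrative rather than part of the original notes; it uses the table value $z_{0.05} = 1.65$ as in the worked solution, and the variable names are assumptions of this example.

```python
from math import sqrt
from statistics import mean, stdev

# Monthly wind speed data (m/s) from Example 5.7.
wind = [
    15.42, 12.85, 10.28, 13.36, 15.42, 20.56, 16.28, 25.70, 15.42, 9.25,
    10.28, 9.25, 8.22, 11.31, 14.91, 16.45, 13.36, 15.42, 13.36, 12.85,
    11.31, 11.31, 12.85, 11.82, 14.39, 15.42, 16.96, 21.59, 15.42, 15.42,
    12.85, 12.85, 11.82, 14.39, 12.34, 24.67, 12.85, 20.05, 27.24, 22.62,
]

n = len(wind)
x_bar = mean(wind)
s = stdev(wind)   # sample standard deviation (divisor n - 1)
z = 1.65          # z_{0.05} read from Table 1

half_width = z * s / sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width

print(f"n = {n}, mean = {x_bar:.3f}, S = {s:.3f}")
print(f"90% CI: {lower:.3f} <= mu <= {upper:.3f}")
```

Differences in the third decimal place relative to the hand calculation come from rounding the intermediate values there.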

Example 5.8
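The flow discharge of Sungai Kerian (measured in m³/s) was obtained at random. 50 readings were collected and the mean flow discharge was found to be 3.512 m³/s with a standard deviation of 0.5 m³/s. Construct a 99% confidence interval for the true mean flow discharge of Sungai Kerian.
Solution
Let $\mu$ be the true mean flow discharge of Sungai Kerian.

Since the sample size is large ($n = 50$), the following confidence interval is used.

The 99% confidence interval for the true population mean is given by

$\bar{X} - z_{\alpha/2}\dfrac{S}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\dfrac{S}{\sqrt{n}}$

$\bar{X} - z_{0.005}\dfrac{S}{\sqrt{n}} \le \mu \le \bar{X} + z_{0.005}\dfrac{S}{\sqrt{n}}$

$3.512 - 2.57\dfrac{0.5}{\sqrt{50}} \le \mu \le 3.512 + 2.57\dfrac{0.5}{\sqrt{50}}$

$3.512 - 2.57(0.071) \le \mu \le 3.512 + 2.57(0.071)$

$3.512 - 0.182 \le \mu \le 3.512 + 0.182$

$3.330 \le \mu \le 3.694$

Thus the 99% confidence interval for the true mean flow discharge of Sungai Kerian is from 3.330 m³/s to 3.694 m³/s.

Calculations

$\bar{X} = 3.512$, $n = 50$, $S = 0.5$. From Table 1, $z_{0.005} = 2.57$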

5.12 Confidence intervals for the mean based on the t distribution

The $100(1 - \alpha)\%$ confidence interval for the mean $\mu$ is given by

$\bar{X} - t_{\alpha/2,\,n-1}\dfrac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\dfrac{S}{\sqrt{n}}$

where
(a) $\bar{X}$ is the sample mean.
(b) $S$ is the sample standard deviation.
(c) $t_{\alpha/2,\,n-1}$ is the $100(1 - \alpha/2)$th quantile of the t distribution with $n - 1$ degrees of freedom. The critical values of the t distribution are given in Table 2.

Assumptions:
(a) $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from a population which has a normal distribution with mean $\mu$ and variance $\sigma^2$.
(b) The sample size $n$ is small.

Example 5.9
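The moisture content (measured in percentage) of clay in Batu Ferringhi was investigated. The following data was obtained from a random sample.

1.81  2.00  2.74  3.56  2.13  4.64  3.64  4.62  4.47  3.12

Construct a 98% confidence interval for the true moisture content for clay by assuming that the sample is from a normal distribution.
Solution
Let $\mu$ be the true mean moisture content (in percentage) for clay.

Since the sample size is small ($n = 10$), the following confidence interval is used.

The 98% confidence interval for the true population mean is given by

$\bar{X} - t_{\alpha/2,\,n-1}\dfrac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\dfrac{S}{\sqrt{n}}$

$\bar{X} - t_{0.01,\,9}\dfrac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{0.01,\,9}\dfrac{S}{\sqrt{n}}$

$3.273 - 2.821\dfrac{1.091}{\sqrt{10}} \le \mu \le 3.273 + 2.821\dfrac{1.091}{\sqrt{10}}$

$3.273 - 2.821(0.345) \le \mu \le 3.273 + 2.821(0.345)$

$3.273 - 0.973 \le \mu \le 3.273 + 0.973$

$2.300 \le \mu \le 4.246$

Thus the 98% confidence interval for the true mean moisture content for clay is from 2.300% to 4.246%.

Calculations

$\bar{X} = \dfrac{X_1 + X_2 + \cdots + X_{10}}{10} = \dfrac{1.81 + 4.64 + \cdots + 2.13 + 3.12}{10} = 3.273$

$S^2 = \dfrac{1}{9}\sum_{i=1}^{10}(X_i - \bar{X})^2 = \dfrac{(1.81 - 3.273)^2 + (4.64 - 3.273)^2 + \cdots + (3.12 - 3.273)^2}{9} = 1.091^2 = 1.190$

From Table 2, $t_{0.01,\,9} = 2.821$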
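The small-sample interval of Example 5.9 can be reproduced in a few lines. The Python standard library has no t distribution, so this sketch takes the critical value $t_{0.01,\,9} = 2.821$ from Table 2 as an input; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Moisture content readings (in %) from Example 5.9.
moisture = [1.81, 2.00, 2.74, 3.56, 2.13, 4.64, 3.64, 4.62, 4.47, 3.12]

n = len(moisture)
x_bar = mean(moisture)
s = stdev(moisture)   # sample standard deviation (divisor n - 1)
t_crit = 2.821        # t_{0.01, 9} read from Table 2

half_width = t_crit * s / sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width

print(f"mean = {x_bar:.3f}, S = {s:.3f}")
print(f"98% CI: {lower:.3f} <= mu <= {upper:.3f}")
```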

Table 2: Critical values $t_{\alpha,\nu}$ for the t distribution with $\nu$ degrees of freedom

ν \ α   0.40    0.30    0.20    0.15    0.10    0.05    0.025    0.02     0.015    0.01
1       0.325   0.727   1.376   1.963   3.078   6.314   12.706   15.895   21.205   31.821
2       0.289   0.617   1.061   1.386   1.886   2.920   4.303    4.849    5.643    6.965
3       0.277   0.584   0.978   1.250   1.638   2.353   3.182    3.482    3.896    4.541
4       0.271   0.569   0.941   1.190   1.533   2.132   2.776    2.999    3.298    3.747
5       0.267   0.559   0.920   1.156   1.476   2.015   2.571    2.757    3.003    3.365
6       0.265   0.553   0.906   1.134   1.440   1.943   2.447    2.612    2.829    3.143
7       0.263   0.549   0.896   1.119   1.415   1.895   2.365    2.517    2.715    2.998
8       0.262   0.546   0.889   1.108   1.397   1.860   2.306    2.449    2.634    2.897
9       0.261   0.543   0.883   1.100   1.383   1.833   2.262    2.398    2.574    2.821
10      0.260   0.542   0.879   1.093   1.372   1.813   2.228    2.359    2.528    2.764
11      0.260   0.540   0.876   1.088   1.363   1.796   2.201    2.328    2.491    2.718
12      0.259   0.539   0.873   1.083   1.356   1.782   2.179    2.303    2.461    2.681
13      0.259   0.538   0.870   1.080   1.350   1.771   2.160    2.282    2.436    2.650
14      0.258   0.537   0.868   1.076   1.345   1.761   2.145    2.264    2.415    2.625
15      0.258   0.536   0.866   1.074   1.341   1.753   2.131    2.249    2.397    2.603
16      0.258   0.535   0.865   1.071   1.337   1.746   2.120    2.235    2.382    2.584
17      0.257   0.534   0.863   1.069   1.333   1.740   2.110    2.224    2.368    2.567
18      0.257   0.534   0.862   1.067   1.330   1.734   2.101    2.214    2.356    2.552
19      0.257   0.533   0.861   1.066   1.328   1.729   2.093    2.205    2.346    2.540
20      0.257   0.533   0.860   1.064   1.325   1.725   2.086    2.197    2.336    2.528
21      0.257   0.532   0.859   1.063   1.323   1.721   2.080    2.189    2.328    2.518
22      0.256   0.532   0.858   1.061   1.321   1.717   2.074    2.183    2.320    2.508
23      0.256   0.532   0.858   1.060   1.320   1.714   2.069    2.177    2.313    2.500
24      0.256   0.531   0.857   1.059   1.318   1.711   2.064    2.172    2.307    2.492
25      0.256   0.531   0.856   1.058   1.316   1.708   2.060    2.167    2.301    2.485
26      0.256   0.531   0.856   1.058   1.315   1.706   2.056    2.162    2.296    2.479
27      0.256   0.531   0.855   1.057   1.314   1.703   2.052    2.158    2.291    2.473
28      0.256   0.530   0.855   1.056   1.313   1.701   2.048    2.154    2.286    2.467
29      0.256   0.530   0.854   1.055   1.311   1.699   2.045    2.150    2.282    2.462
30      0.256   0.530   0.854   1.055   1.310   1.697   2.042    2.147    2.278    2.457
40      0.255   0.529   0.851   1.050   1.303   1.684   2.021    2.123    2.250    2.423
60      0.254   0.527   0.848   1.046   1.296   1.671   2.000    2.099    2.223    2.390
120     0.254   0.526   0.845   1.041   1.289   1.658   1.980    2.076    2.196    2.358
∞       0.253   0.524   0.842   1.036   1.282   1.645   1.960    2.054    2.170    2.326

5.13 Tests of hypotheses for the mean based on the normal distribution
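(1) Population variance is known

Hypotheses
One tail tests: $H_0: \mu = d_0$ against $H_1: \mu > d_0$ (or $H_1: \mu < d_0$)
Two tail tests: $H_0: \mu = d_0$ against $H_1: \mu \ne d_0$

Test statistic

$Z = \dfrac{\bar{X} - d_0}{\sqrt{\sigma^2 / n}}$

Rejection region
One tail tests: reject $H_0$ if $Z > z_{\alpha}$ (or $Z < -z_{\alpha}$)
Two tail tests: reject $H_0$ if $Z > z_{\alpha/2}$ or $Z < -z_{\alpha/2}$

Notes:
(a) $d_0$ is a constant.
(b) $\bar{X}$ is the sample mean.
(c) $z_{\alpha}$ is the $100(1 - \alpha)$th quantile of the standard normal distribution, which is given in Table 1.

Assumptions:
(a) $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from a population which has a normal distribution with mean $\mu$ and variance $\sigma^2$.
(b) The sample size $n$ can either be small or large.

(2) Population variance is unknown

Hypotheses
One tail tests: $H_0: \mu = d_0$ against $H_1: \mu > d_0$ (or $H_1: \mu < d_0$)
Two tail tests: $H_0: \mu = d_0$ against $H_1: \mu \ne d_0$

Test statistic

$Z = \dfrac{\bar{X} - d_0}{\sqrt{S^2 / n}}$

Rejection region
One tail tests: reject $H_0$ if $Z > z_{\alpha}$ (or $Z < -z_{\alpha}$)
Two tail tests: reject $H_0$ if $Z > z_{\alpha/2}$ or $Z < -z_{\alpha/2}$

Notes:
(a) $d_0$ is a constant.
(b) $\bar{X}$ is the sample mean and $S$ is the sample standard deviation.
(c) $z_{\alpha}$ is the $100(1 - \alpha)$th quantile of the standard normal distribution, which is given in Table 1.

Assumptions:
(a) $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from a population which has a normal distribution with mean $\mu$ and variance $\sigma^2$.
(b) The sample size $n$ is large.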

Example 5.10
A study was done to determine the wind speed distribution in Penang. The following monthly wind speed data (measured in m/s) was obtained.

15.42  12.85  10.28  13.36  15.42  20.56  16.28  25.70  15.42   9.25
10.28   9.25   8.22  11.31  14.91  16.45  13.36  15.42  13.36  12.85
11.31  11.31  12.85  11.82  14.39  15.42  16.96  21.59  15.42  15.42
12.85  12.85  11.82  14.39  12.34  24.67  12.85  20.05  27.24  22.62

Can you conclude that the mean wind speed in Penang is less than 12 m/s? Use $\alpha = 0.10$.

Solution
We will follow the six step procedure to solve this problem.
Step 1: Define the population parameter of interest.
Let $\mu$ be the true mean wind speed (in m/s) in Penang.

Since the sample size is large ($n = 40$), the following hypothesis test is used.

Step 2: Define the null and alternative hypotheses

$H_0: \mu = 12$
$H_1: \mu < 12$

Step 3: Calculate the test statistic

$Z = \dfrac{\bar{X} - d_0}{\sqrt{S^2 / n}} = \dfrac{14.953 - 12}{\sqrt{20.149 / 40}} = \dfrac{2.953}{0.710} = 4.159$

Calculations

$\bar{X} = \dfrac{X_1 + X_2 + \cdots + X_{40}}{40} = \dfrac{15.42 + 10.28 + \cdots + 15.42 + 22.62}{40} = 14.953$

$S^2 = \dfrac{1}{39}\sum_{i=1}^{40}(X_i - \bar{X})^2 = \dfrac{(15.42 - 14.953)^2 + (10.28 - 14.953)^2 + \cdots + (22.62 - 14.953)^2}{39} = 4.489^2 = 20.149$

Step 4: Determine the rejection region

Reject $H_0$ if $Z < -z_{\alpha} = -z_{0.10} = -1.28$ (from Table 1).

Step 5: Result
Since $Z = 4.159$ is not less than $-1.28$, the null hypothesis cannot be rejected.

Step 6: Conclusion
At $\alpha = 0.10$, there is insufficient evidence to show that the true mean wind speed in Penang is less than 12 m/s.
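The large-sample test statistic is simple to compute from summary statistics. The sketch below is illustrative; the helper name `z_statistic` is an assumption of this example. It uses the summary values of Example 5.10 (one tail test, $\alpha = 0.10$) and of Example 5.11, which follows (two tail test, $\alpha = 0.05$), with critical values read from Table 1.

```python
from math import sqrt


def z_statistic(sample_mean, sample_var, n, d0):
    """Large-sample test statistic Z = (X_bar - d0) / sqrt(S^2 / n)."""
    return (sample_mean - d0) / sqrt(sample_var / n)


# Example 5.10: H0: mu = 12 against H1: mu < 12, alpha = 0.10.
z_wind = z_statistic(14.953, 20.149, 40, 12.0)
reject_wind = z_wind < -1.28           # lower tail rejection region

# Example 5.11: H0: mu = 4 against H1: mu != 4, alpha = 0.05.
z_flow = z_statistic(3.512, 0.5**2, 50, 4.0)
reject_flow = abs(z_flow) > 1.96       # two tail rejection region

print(f"wind: Z = {z_wind:.3f}, reject H0: {reject_wind}")
print(f"flow: Z = {z_flow:.3f}, reject H0: {reject_flow}")
```

The statistics differ slightly from the hand calculations because the worked solutions round the standard error to three decimals before dividing; the decisions are the same.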

Example 5.11
The flow discharge of Sungai Kerian (measured in m³/s) was obtained at random. Fifty readings were collected and the mean flow discharge was found to be 3.512 m³/s with a standard deviation of 0.5 m³/s. Show that the true mean flow discharge at Sungai Kerian is not equal to 4 m³/s. Use $\alpha = 0.05$.

Solution
We will follow the six step procedure to solve this problem.
Step 1: Define the population parameter of interest.
Let $\mu$ be the true mean flow discharge of Sungai Kerian.

Since the sample size is large ($n = 50$), the following hypothesis test is used.

Step 2: Define the null and alternative hypotheses

$H_0: \mu = 4$
$H_1: \mu \ne 4$

Step 3: Calculate the test statistic

$Z = \dfrac{\bar{X} - d_0}{\sqrt{S^2 / n}}$, where $\bar{X} = 3.512$, $S^2 = 0.25$, $n = 50$

$Z = \dfrac{3.512 - 4}{\sqrt{0.25 / 50}} = \dfrac{-0.488}{0.071} = -6.873$

Step 4: Determine the rejection region

Reject $H_0$ if $Z > z_{\alpha/2} = z_{0.025} = 1.96$ or $Z < -z_{\alpha/2} = -z_{0.025} = -1.96$ (from Table 1).

Step 5: Result
Since $Z = -6.873 < -1.96$, the null hypothesis is rejected.

Step 6: Conclusion
At $\alpha = 0.05$, there is sufficient evidence to show that the true mean flow discharge of Sungai Kerian is not equal to 4 m³/s.

5.14 Test of hypothesis for the mean based on the t distribution

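Hypotheses
One tail tests: $H_0: \mu = d_0$ against $H_1: \mu > d_0$ (or $H_1: \mu < d_0$)
Two tail tests: $H_0: \mu = d_0$ against $H_1: \mu \ne d_0$

Test statistic

$T = \dfrac{\bar{X} - d_0}{\sqrt{S^2 / n}}$

Rejection region
One tail tests: reject $H_0$ if $T > t_{\alpha,\,n-1}$ (or $T < -t_{\alpha,\,n-1}$)
Two tail tests: reject $H_0$ if $T > t_{\alpha/2,\,n-1}$ or $T < -t_{\alpha/2,\,n-1}$

Notes:
(a) $d_0$ is a constant.
(b) $\bar{X}$ is the sample mean.
(c) $S$ is the sample standard deviation.
(d) $t_{\alpha,\,n-1}$ is the $100(1 - \alpha)$th quantile of the t distribution with $n - 1$ degrees of freedom. The critical values of the t distribution are given in Table 2.

Assumptions:
(a) $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from a population which has a normal distribution with mean $\mu$ and variance $\sigma^2$.
(b) The sample size $n$ is small.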

Example 5.12
The moisture content (measured in percentage) of clay in Batu Ferringhi was investigated. The following data was obtained from a random sample.

1.81  2.00  2.74  3.56  2.13  4.64  3.64  4.62  4.47  3.12

Is the moisture content greater than 3.0%? Use $\alpha = 0.05$.

Solution
We will follow the six step procedure to solve this problem.
Step 1: Define the population parameter of interest.
Let $\mu$ be the true mean moisture content (in percentage) for clay.

Since the sample size is small ($n = 10$), the following hypothesis test is used.
Step 2: Define the null and alternative hypotheses

$H_0: \mu = 3.0$
$H_1: \mu > 3.0$

Step 3: Calculate the test statistic

$T = \dfrac{\bar{X} - d_0}{\sqrt{S^2 / n}} = \dfrac{3.273 - 3.0}{\sqrt{1.190 / 10}} = \dfrac{0.273}{0.345} = 0.791$

Calculations

$\bar{X} = \dfrac{X_1 + X_2 + \cdots + X_{10}}{10} = \dfrac{1.81 + 4.64 + \cdots + 2.13 + 3.12}{10} = 3.273$

$S^2 = \dfrac{1}{9}\sum_{i=1}^{10}(X_i - \bar{X})^2 = \dfrac{(1.81 - 3.273)^2 + (4.64 - 3.273)^2 + \cdots + (3.12 - 3.273)^2}{9} = 1.091^2 = 1.190$

Step 4: Determine the rejection region

Reject $H_0$ if $T > t_{\alpha,\,n-1} = t_{0.05,\,9} = 1.833$ (from Table 2).
Step 5: Result
Since $T = 0.791$ is not greater than 1.833, the null hypothesis cannot be rejected.

Step 6: Conclusion
At $\alpha = 0.05$, there is insufficient evidence to show that the true mean moisture content (in percentage) for clay is greater than 3%.
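The t test of Example 5.12 can be checked from the raw data. This sketch is illustrative: it divides the sample variance by the sample size $n = 10$, as the test statistic formula $T = (\bar{X} - d_0)/\sqrt{S^2/n}$ requires, and takes the critical value $t_{0.05,\,9} = 1.833$ from Table 2 because the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, variance

# Moisture content readings (in %) from Example 5.12.
moisture = [1.81, 2.00, 2.74, 3.56, 2.13, 4.64, 3.64, 4.62, 4.47, 3.12]

n = len(moisture)
d0 = 3.0                      # hypothesized mean under H0
x_bar = mean(moisture)
s_squared = variance(moisture)  # sample variance (divisor n - 1)

t_stat = (x_bar - d0) / sqrt(s_squared / n)
t_crit = 1.833                # t_{0.05, 9} read from Table 2
reject = t_stat > t_crit      # upper tail test, H1: mu > 3.0

print(f"T = {t_stat:.3f}, critical value = {t_crit}, reject H0: {reject}")
```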

5.15 Sample correlation

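Correlation measures the linear relationship between two variables, $X$ and $Y$.
The sample correlation coefficient of $n$ pairs of observations $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, denoted by $r$, is given by

$r = \dfrac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2 \sum_{i=1}^{n}(Y_i - \bar{Y})^2}} = \dfrac{\sum_{i=1}^{n} X_i Y_i - \dfrac{\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n}}{\sqrt{\left[\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}\right]\left[\sum_{i=1}^{n} Y_i^2 - \dfrac{\left(\sum_{i=1}^{n} Y_i\right)^2}{n}\right]}}$

The strength of the linear relationship is determined by the following:

If $0.80 \le |r| \le 1.00$ then the relationship is very strong.
If $0.60 \le |r| \le 0.79$ then the relationship is strong.
If $0.40 \le |r| \le 0.59$ then the relationship is moderate.
If $0.20 \le |r| \le 0.39$ then the relationship is weak.
If $0.00 \le |r| \le 0.19$ then the relationship is very weak.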

Example 5.13

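The cost, $Y$, of a manufacturing product usually depends on the lot size, $X$. The following data on the cost of the manufacturing product and its lot size is given below:

Y    30    70    140    270    530    1000    2000    3000
X     1     5     10     25     50     100     250     500

Find the value of the correlation coefficient for the above data.
Solution

The correlation coefficient between $Y$ and $X$ is given by

$r = \dfrac{\sum_{i=1}^{n} X_i Y_i - \dfrac{\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n}}{\sqrt{\left[\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}\right]\left[\sum_{i=1}^{n} Y_i^2 - \dfrac{\left(\sum_{i=1}^{n} Y_i\right)^2}{n}\right]}}$

$\;\; = \dfrac{2135030 - \dfrac{(941)(7040)}{8}}{\sqrt{\left[325751 - \dfrac{941^2}{8}\right]\left[14379200 - \dfrac{7040^2}{8}\right]}} = \dfrac{1306950}{(463.75)(2860.8)} = \dfrac{1306950}{1326696} = 0.985$

Therefore, there is a very strong linear relationship between cost and lot size.
Calculations

$n = 8$, $\sum_{i=1}^{8} X_i = 941$, $\sum_{i=1}^{8} Y_i = 7040$, $\sum_{i=1}^{8} X_i Y_i = 2135030$, $\sum_{i=1}^{8} X_i^2 = 325751.00$, $\sum_{i=1}^{8} Y_i^2 = 14379200$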
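The sums and the correlation coefficient of Example 5.13 can be reproduced with the computational form of $r$. The sketch below is illustrative; the function name `sample_correlation` is an assumption of this example.

```python
from math import sqrt


def sample_correlation(xs, ys):
    """Sample correlation coefficient r from the computational formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt((sum_xx - sum_x**2 / n) * (sum_yy - sum_y**2 / n))
    return numerator / denominator


# Data of Example 5.13: lot size X and cost Y.
lot_size = [1, 5, 10, 25, 50, 100, 250, 500]
cost = [30, 70, 140, 270, 530, 1000, 2000, 3000]

r = sample_correlation(lot_size, cost)
print(f"sum X = {sum(lot_size)}, sum Y = {sum(cost)}")
print(f"r = {r:.3f}")
```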

5.16 Simple linear regression

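Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ be $n$ pairs of random variables. Then the simple linear regression model is given by

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, $\quad i = 1, 2, \ldots, n$

where
$Y_i$ is the dependent or response variable
$X_i$ is the independent or regressor or explanatory or predictor variable
$\beta_0$ is the intercept of the regression model
$\beta_1$ is the slope of the regression model
$\varepsilon_i$ is the random error term

Assumptions
The assumptions of the random error term are:
(a) $E(\varepsilon_i) = 0$
(b) $V(\varepsilon_i) = \sigma^2$ (a constant)
(c) The probability distribution is normal
(d) The random error terms are independent

Method of least squares
The method of least squares can be used to estimate the values of the intercept ($\beta_0$) and slope ($\beta_1$) parameters. This method minimizes the sum of squares of the random error term, that is

$L = \min \sum_{i=1}^{n} \varepsilon_i^2 = \min \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2$

Hence,

$\dfrac{\partial L}{\partial \beta_0} = -2\sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = 0$

$\dfrac{\partial L}{\partial \beta_1} = -2\sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) X_i = 0$

Simplifying yields,

$n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} X_i = \sum_{i=1}^{n} Y_i$

$\hat{\beta}_0 \sum_{i=1}^{n} X_i + \hat{\beta}_1 \sum_{i=1}^{n} X_i^2 = \sum_{i=1}^{n} Y_i X_i$

Solving the two equations yields,

$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$ and $\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} Y_i X_i - \dfrac{\left(\sum_{i=1}^{n} Y_i\right)\left(\sum_{i=1}^{n} X_i\right)}{n}}{\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}}$

where $\bar{Y} = \dfrac{1}{n}\sum_{i=1}^{n} Y_i$ and $\bar{X} = \dfrac{1}{n}\sum_{i=1}^{n} X_i$.

Thus the fitted or estimated regression model is

$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$, $\quad i = 1, 2, \ldots, n$

$e_i = Y_i - \hat{Y}_i$ is called the residual.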

Example 5.14
The yield of a chemical process (in percentage) is hypothesized to be linearly related with the amount of catalyst (in grams). Let $Y$ denote the yield of the chemical process and $X$ be the amount of catalyst. The data is given below.

X     0.9     1.4     1.6     1.7     1.8     2.0     2.1
Y   60.54   63.86   63.76   60.15   66.66   71.66   70.81

Fit a simple linear regression model.

Solution
The following simple linear regression model is fitted

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, $\quad i = 1, 2, \ldots, 7$

where
$Y_i$ is the yield of a chemical process
$X_i$ is the amount of catalyst

By using the least squares method, the estimates for $\beta_0$ and $\beta_1$ are

$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} Y_i X_i - \dfrac{\left(\sum_{i=1}^{n} Y_i\right)\left(\sum_{i=1}^{n} X_i\right)}{n}}{\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}} = \dfrac{760.17 - 751.5086}{19.87 - 18.8929} = \dfrac{8.6614}{0.9771} = 8.8644$

and

$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 65.3486 - 8.8644(1.643) = 65.3486 - 14.5642 = 50.7844$

Therefore the fitted simple linear regression model is

$\hat{Y}_i = 50.784 + 8.864 X_i$ for $i = 1, 2, \ldots, 7$
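The least squares estimates of Example 5.14 can be reproduced from the data with the formulas of Section 5.16. The sketch below is illustrative; the function name `least_squares_fit` is an assumption of this example.

```python
def least_squares_fit(xs, ys):
    """Return (intercept, slope) of the simple linear regression of ys on xs."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    slope = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x**2 / n)
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope


# Data of Example 5.14: amount of catalyst X (grams) and yield Y (%).
catalyst = [0.9, 1.4, 1.6, 1.7, 1.8, 2.0, 2.1]
yield_pct = [60.54, 63.86, 63.76, 60.15, 66.66, 71.66, 70.81]

b0, b1 = least_squares_fit(catalyst, yield_pct)
residuals = [y - (b0 + b1 * x) for x, y in zip(catalyst, yield_pct)]

print(f"fitted model: Y = {b0:.3f} + {b1:.3f} X")
print(f"sum of residuals = {sum(residuals):.6f}")
```

The intercept differs from the hand calculation in the third decimal place because the worked solution rounds the mean of X to 1.643. The residuals of a least squares fit with an intercept sum to zero, which gives a quick sanity check.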

Example 5.15
A study was conducted to determine the relationship between bridge pier scour depths, $D$, and discharge intensity, $q$. A regression model of the form $D = \beta_0 q^{\beta_1}$ was proposed. The following data was obtained:

D      q      D      q      D      q      D      q
35.67  52.51  12.62  11.99  20.73  25.56  11.48  13.22
31.71  52.04   9.76  10.33  11.24   7.39   8.71  11.21
17.84  22.58   8.54   8.36   8.80   6.71   4.94   2.61
14.63   8.51  13.87   8.24  12.44  13.28  10.07  13.21
12.71  11.15  11.60   6.29   9.20   6.49   5.50   1.62
13.72  13.75  19.51  22.03   9.76   6.42   7.13   7.72
12.88  14.31  11.89  11.15  11.42   7.78   6.85   4.68
19.35   9.20  13.72  18.59  11.22  11.85   4.00   3.40
11.92   8.60  11.89  13.66  10.47   9.78   4.07   4.00
14.98  11.43  12.80  15.99   9.48   7.48   4.08   3.18

Determine the simple linear regression model for this problem.

Solution
The proposed model is given by

$D = \beta_0 q^{\beta_1}$

The above model can be transformed into a simple linear regression model by taking natural logarithms as follows:

$\ln D = \ln\left(\beta_0 q^{\beta_1}\right)$
$\ln D = \ln \beta_0 + \ln q^{\beta_1}$
$\ln D = \ln \beta_0 + \beta_1 \ln q$

Letting $Y_i = \ln D$, $\beta_0^{*} = \ln \beta_0$ and $X_i = \ln q$, we will obtain the following linear regression model

$Y_i = \beta_0^{*} + \beta_1 X_i + \varepsilon_i$, $\quad i = 1, 2, \ldots, 40$

The following data gives the new values for $Y_i = \ln D$ and $X_i = \ln q$

Y     X     Y     X     Y     X     Y     X
3.57  3.96  2.54  2.48  3.03  3.24  2.44  2.58
3.46  3.95  2.28  2.34  2.42  2.00  2.16  2.42
2.88  3.12  2.14  2.12  2.17  1.90  1.60  0.96
2.68  2.14  2.63  2.11  2.52  2.59  2.31  2.58
2.54  2.41  2.45  1.84  2.22  1.87  1.70  0.48
2.62  2.62  2.97  3.09  2.28  1.86  1.96  2.04
2.56  2.66  2.48  2.41  2.44  2.05  1.92  1.54
2.96  2.22  2.62  2.92  2.42  2.47  1.39  1.22
2.48  2.15  2.48  2.61  2.35  2.28  1.40  1.39
2.71  2.44  2.55  2.77  2.25  2.01  1.41  1.16

By using the least squares method, the estimates for $\beta_0^{*}$ and $\beta_1$ are

$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} Y_i X_i - \dfrac{\left(\sum_{i=1}^{n} Y_i\right)\left(\sum_{i=1}^{n} X_i\right)}{n}}{\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}} = \dfrac{230.09 - 218.4492}{226.25 - 207.1615} = \dfrac{11.6408}{19.0885} = 0.6098$

and

$\hat{\beta}_0^{*} = \bar{Y} - \hat{\beta}_1 \bar{X} = 2.3997 - 0.6098(2.2757) = 2.3997 - 1.3877 = 1.012$

Here $\beta_0^{*} = \ln \beta_0$, so $\hat{\beta}_0 = e^{\hat{\beta}_0^{*}} = e^{1.012} = 2.7511$

Therefore the fitted model is

$\hat{D}_i = \hat{\beta}_0 q_i^{\hat{\beta}_1} = 2.7511\, q_i^{0.6098}$ for $i = 1, 2, \ldots, 40$

Calculations

$\bar{Y} = \dfrac{1}{40}\sum_{i=1}^{40} Y_i = \dfrac{3.57 + 3.46 + \cdots + 1.40 + 1.41}{40} = \dfrac{95.99}{40} = 2.3997$

$\bar{X} = \dfrac{1}{40}\sum_{i=1}^{40} X_i = \dfrac{3.96 + 3.95 + \cdots + 1.39 + 1.16}{40} = \dfrac{91.03}{40} = 2.2757$

$\sum_{i=1}^{40} Y_i X_i = (3.57)(3.96) + (3.46)(3.95) + \cdots + (1.40)(1.39) + (1.41)(1.16) = 230.09$

$\dfrac{\left(\sum_{i=1}^{40} Y_i\right)\left(\sum_{i=1}^{40} X_i\right)}{n} = \dfrac{(95.99)(91.03)}{40} = \dfrac{8737.9697}{40} = 218.4492$

$\sum_{i=1}^{40} X_i^2 = 3.96^2 + 3.95^2 + 3.12^2 + \cdots + 1.22^2 + 1.39^2 + 1.16^2 = 15.69 + 15.62 + 9.72 + \cdots + 1.50 + 1.92 + 1.34 = 226.25$

$\dfrac{\left(\sum_{i=1}^{40} X_i\right)^2}{n} = \dfrac{91.03^2}{40} = \dfrac{8286.4609}{40} = 207.1615$
0% (1)
Advanced Mathematics Curriculum
2 pages
What Is Chaos Theory
No ratings yet
What Is Chaos Theory
2 pages
Advanced Communication Theories Assignment
No ratings yet
Advanced Communication Theories Assignment
2 pages
Quantum Mechanics Review and Schrödinger
No ratings yet
Quantum Mechanics Review and Schrödinger
34 pages
Solved Examples of Cramer Rao Lower Bound
100% (1)
Solved Examples of Cramer Rao Lower Bound
6 pages
Talcot Parson
No ratings yet
Talcot Parson
13 pages
Regression Analysis: Causal Relationship Between The Explanatory and
No ratings yet
Regression Analysis: Causal Relationship Between The Explanatory and
17 pages
Diminishing Marginal Utility Explained
No ratings yet
Diminishing Marginal Utility Explained
17 pages
Human Rationality in Politics
100% (1)
Human Rationality in Politics
8 pages
Lattice QCD and Strong-Interaction Matter
No ratings yet
Lattice QCD and Strong-Interaction Matter
72 pages
2012 Basic Business Statistics 12e Berenson Tables
100% (1)
2012 Basic Business Statistics 12e Berenson Tables
5 pages
Wick's Theorem for Quantum Operators
No ratings yet
Wick's Theorem for Quantum Operators
2 pages
Business Statistics Final Exam 2016
No ratings yet
Business Statistics Final Exam 2016
7 pages
MX Theory
No ratings yet
MX Theory
22 pages
Guide To Math Needed To Study String Theory
100% (1)
Guide To Math Needed To Study String Theory
7 pages
Final Examination in Educ-Pa 502
No ratings yet
Final Examination in Educ-Pa 502
3 pages
Shannon's Communication Theory of Secrecy
No ratings yet
Shannon's Communication Theory of Secrecy
2 pages
Social Bondis
No ratings yet
Social Bondis
2 pages
Classical Mechanics Final Paper
No ratings yet
Classical Mechanics Final Paper
2 pages
Newtons Law Question
No ratings yet
Newtons Law Question
4 pages
Statistical Hypothesis Testing - One Way & Two Way
No ratings yet
Statistical Hypothesis Testing - One Way & Two Way
49 pages
Interpretation of Paired-Samples t Test
No ratings yet
Interpretation of Paired-Samples t Test
13 pages
The Standard Model Explained
No ratings yet
The Standard Model Explained
15 pages
PH4401 03 Entanglement
No ratings yet
PH4401 03 Entanglement
21 pages
ANOVA for Statistical Analysis
100% (3)
ANOVA for Statistical Analysis
52 pages
Regression Analysis: SW318 Social Work Statistics Slide 1
No ratings yet
Regression Analysis: SW318 Social Work Statistics Slide 1
64 pages
Bmsi Et 7
No ratings yet
Bmsi Et 7
16 pages
Understanding ANOVA: Types & Applications
No ratings yet
Understanding ANOVA: Types & Applications
11 pages
Neha Zaidi's Dissertation on Particle Physics
No ratings yet
Neha Zaidi's Dissertation on Particle Physics
95 pages