Probability & Statistical Modelling
Estimation & Confidence Intervals
Topic & Structure of the lesson
This topic is divided into 2 parts:
Part A: Sampling Distribution
Part B: Estimation & Confidence Intervals
CT042-3-2 Probability and Statistical Modeling
Estimation & Confidence Intervals
Slide 2 (of 57)
Learning Outcomes
Part A: Sampling distribution:
At the end of this section, you should
be able to:
Recognise what a sampling distribution
is and how it differs from other types of
probability distribution
Make statistical statements that rely
upon knowledge of the central limit
theorem.
Sampling Distribution
Introduction
A sampling distribution is the probability distribution of all
possible sample means of n items drawn from a population.
Such a distribution exists not only for the mean but for any
point estimate.
Properties
Approximately normally distributed when the sample size is large
The mean of the sampling distribution is the same as the
population mean.
It has a standard deviation, which is called the standard error.
Standard Error
The standard deviation of the
sampling distribution.
It measures the extent to which we
expect the means from the
different samples to vary because
of the chance error in the sampling
process.
Required to know how to:
Determine the standard error of the mean
- With finite population size
  - and known population standard deviation
  - and unknown population standard deviation
- With infinite population size
  - and known population standard deviation
  - and unknown population standard deviation
Determine the standard error of the proportion
- With finite population size
- With infinite population size
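Standard error of the mean and of the proportion
(N = population size, n = sample size, q = 1 - p)
Mean
- Known population standard deviation
  Population finite: $\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$
  Population infinite: $\frac{\sigma}{\sqrt{n}}$
- Unknown population standard deviation
  Population finite: $\frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$
  Population infinite: $\frac{s}{\sqrt{n}}$
Proportion
  Population finite: $\sqrt{\frac{pq}{n}} \sqrt{\frac{N-n}{N-1}}$
  Population infinite: $\sqrt{\frac{pq}{n}}$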
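The standard error formulas (standard deviation over the square root of n, multiplied by the finite population correction sqrt((N - n)/(N - 1)) when the population is finite) can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the function names and the numbers used at the bottom are hypothetical.

```python
import math


def finite_population_correction(N, n):
    """Return sqrt((N - n) / (N - 1)), used when the population is finite."""
    return math.sqrt((N - n) / (N - 1))


def standard_error_mean(sd, n, N=None):
    """Standard error of the mean.

    sd is the population standard deviation if it is known, otherwise
    the sample standard deviation s. N is the population size; leave it
    as None for an infinite (or very large) population.
    """
    se = sd / math.sqrt(n)
    if N is not None:
        se *= finite_population_correction(N, n)
    return se


def standard_error_proportion(p, n, N=None):
    """Standard error of a proportion, sqrt(pq/n) with q = 1 - p."""
    se = math.sqrt(p * (1 - p) / n)
    if N is not None:
        se *= finite_population_correction(N, n)
    return se


# Hypothetical numbers, chosen only for illustration.
se_infinite = standard_error_mean(sd=12.0, n=36)        # 12 / 6
se_finite = standard_error_mean(sd=12.0, n=36, N=500)   # correction shrinks it
se_prop = standard_error_proportion(p=0.5, n=100)

print(se_infinite, se_finite, se_prop)
```

The correction factor is always below 1 when n > 1, so the finite population standard error is smaller than the infinite population one.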
Central limit theorem
It states that as the sample size
increases, the sampling distribution of
the mean approaches the normal
distribution in form, regardless of the
form of the population distribution.
For practical purposes, the sampling
distribution of the mean can be assumed
to be approximately normal, regardless
of the population distribution whenever
the sample size is greater than 30.
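A small simulation can make the central limit theorem concrete. The sketch below is an illustration only: the exponential population, the seed, the sample size of 40 and the number of repetitions are all assumptions chosen for the demonstration. It draws repeated samples from a strongly skewed population and compares the spread of the sample means with the standard error sigma/sqrt(n).

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 40     # n > 30, in line with the rule of thumb
NUM_SAMPLES = 2000   # number of repeated samples (an arbitrary choice)

# Exponential population with rate 1: mean 1, standard deviation 1,
# and a strongly right-skewed (non-normal) shape.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_se = 1.0 / SAMPLE_SIZE ** 0.5

print(f"mean of sample means: {mean_of_means:.3f} (population mean is 1)")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {theoretical_se:.3f})")
```

Plotting a histogram of sample_means should show a roughly bell-shaped curve even though the population itself is skewed.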
Statistical inference
Statistical inference can be defined
as the process by which conclusions
are drawn about some measure or
attributes of a population based
upon analysis of sample data.
Statistical inference can be divided
into two types
Estimation
Hypothesis testing
Learning Outcomes
Part B: Estimation
At the end of this section, you should
be able to:
Make a confidence interval estimate of a
population mean.
Make a confidence interval estimate of a
population proportion.
Determine the appropriate sample size
for interval estimation of means and
proportions.
Estimation
Introduction
Deals with the estimation of
population characteristics from
sample statistics
For large samples, the distribution of sample means
approximately follows a normal curve.
Point & Interval Estimates
A point estimate is a single number;
a confidence interval provides additional
information about variability.
Point Estimates
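We can estimate a population parameter
with a sample statistic (a point estimate):
Mean: population parameter $\mu$, point estimate $\bar{x}$
Proportion: population parameter $p$, point estimate $\hat{p}$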
How much uncertainty is associated with a
point estimate of a population parameter?
An interval estimate provides more
information about a population characteristic
than does a point estimate
Such interval estimates are called
confidence intervals
Confidence Interval Estimate
An interval gives a range of values:
Takes into consideration variation in sample
statistics from sample to sample
Based on observations from one sample
Gives information about closeness to unknown
population parameters
Stated in terms of level of confidence
Never 100% sure
Estimation Process
Population (mean, $\mu$, is unknown)
-> Random sample (mean $\bar{x} = 50$)
-> "I am 95% confident that $\mu$ is between 40 and 60."
The general formula for all confidence
intervals is:
Point Estimate ± (Critical Value)(Standard Error)
Confidence level
The confidence with which the interval will
contain the unknown population parameter
A percentage (less than 100%)
Suppose confidence level = 95%
Also written $(1 - \alpha) = .95$
A relative frequency interpretation:
In the long run, 95% of all the confidence
intervals that can be constructed will contain
the unknown true parameter
A specific interval either will contain or will
not contain the true parameter
No probability involved in a specific interval
Confidence Intervals
- Population Mean
  - $\sigma$ known
  - $\sigma$ unknown
- Population Proportion
Confidence interval for $\mu$ ($\sigma$ known)
Assumptions
Population standard deviation $\sigma$ is known
Population is normally distributed
If population is not normal, use large sample
Confidence interval estimate
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
Finding the critical value
Consider a 95% confidence interval:
$1 - \alpha = .95$, so $\alpha/2 = .025$ lies in each tail and $z_{\alpha/2} = 1.96$
Commonly used confidence levels are
90%, 95%, and 99%
Confidence Level    Confidence Coefficient, $1 - \alpha$    z value, $z_{\alpha/2}$
80%                 .80                                     1.28
90%                 .90                                     1.645
95%                 .95                                     1.96
98%                 .98                                     2.33
99%                 .99                                     2.576
99.8%               .998                                    3.09
99.9%               .999                                    3.29
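The z values in the table can be reproduced from the standard normal distribution instead of being read from a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `z_critical` is an assumption introduced here for illustration.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a confidence level in (0, 1)."""
    alpha = 1 - confidence
    # The upper tail holds alpha/2, so look up the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%}  z = {z_critical(level):.3f}")
```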
Margin of Error (e): the amount added and
subtracted to the point estimate to form the
confidence interval
Example: Margin of error for estimating $\mu$, $\sigma$ known:
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$, so $e = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
Factors Affecting Margin of Error
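$e = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
Data variation, $\sigma$: e decreases as $\sigma$ decreases
Sample size, n: e decreases as n increases
Level of confidence, $1 - \alpha$: e increases if $1 - \alpha$ increases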
Quick Review Question
Example:
A sample of 11 circuits from a large normal
population has a mean resistance of 2.20 ohms.
We know from past testing that the population
standard deviation is .35 ohms.
Determine a 95% confidence interval for the true
mean resistance of the population.
95% confidence interval:
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96 (.35/\sqrt{11}) = 2.20 \pm .2068$
$1.9932 \le \mu \le 2.4068$
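The same calculation can be checked in a few lines of Python. This is a minimal sketch under the assumptions of the example (normal population, known standard deviation); the function name `z_interval` is hypothetical, and the critical value is computed with `statistics.NormalDist` rather than read from a table.

```python
import math
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)   # margin of error e
    return x_bar - margin, x_bar + margin


# Circuit example: n = 11, sample mean 2.20 ohms, sigma = 0.35 ohms.
lower, upper = z_interval(x_bar=2.20, sigma=0.35, n=11, confidence=0.95)
print(f"{lower:.4f} to {upper:.4f}")
```

Rounded to four decimal places, the result agrees with the hand calculation (1.9932 to 2.4068).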
Interpretation
We are 95% confident that the true mean resistance is
between 1.9932 and 2.4068 ohms
Although the true mean may or may not be in this interval,
95% of intervals formed in this manner will contain the true
mean
An incorrect interpretation is that there is 95% probability
that this interval contains the true population mean.
(This interval either does or does not contain the true mean; there
is no probability for a single interval.)
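The long-run reading of "95% confident" can be illustrated by simulation. The sketch below is only an illustration: the population mean and standard deviation, the sample size, the seed and the number of trials are all hypothetical. It builds many known-sigma intervals from repeated samples and counts how many contain the true mean.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical normal population with known sigma.
TRUE_MEAN = 50.0
SIGMA = 10.0
N = 25
TRIALS = 4000

z = NormalDist().inv_cdf(0.975)        # critical value for 95% confidence
margin = z * SIGMA / math.sqrt(N)      # same margin for every sample

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    x_bar = fmean(sample)
    # Each individual interval either contains the true mean or it does not.
    if x_bar - margin <= TRUE_MEAN <= x_bar + margin:
        hits += 1

coverage = hits / TRIALS
print(f"{coverage:.1%} of the {TRIALS} intervals contain the true mean")
```

The printed proportion should be close to 95%, which is a statement about the procedure over many samples and not about any one interval.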
Example 5.3
After a particularly wet night, 10 worms
surfaced on the lawn. Their lengths,
measured in cm, were
9.5 9.5 11.2 10.6 9.9 11.1 10.9
9.8 10.1 10.2
Assuming that this sample came from a
normal population with variance 4, calculate
a 95% confidence interval for the mean
length of all the worms in the garden.
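One possible way to work through Example 5.3 is sketched below. It assumes, as the question states, a normal population with variance 4 (so the standard deviation is 2), and it uses the known-sigma interval with a critical value from `statistics.NormalDist`. The variable names are illustrative only.

```python
import math
from statistics import NormalDist, fmean

lengths_cm = [9.5, 9.5, 11.2, 10.6, 9.9, 11.1, 10.9, 9.8, 10.1, 10.2]

sigma = math.sqrt(4)               # population variance is given as 4
n = len(lengths_cm)
x_bar = fmean(lengths_cm)          # sample mean of the 10 lengths

z = NormalDist().inv_cdf(0.975)    # 95% confidence, about 1.96
margin = z * sigma / math.sqrt(n)

lower, upper = x_bar - margin, x_bar + margin
print(f"sample mean: {x_bar:.2f} cm")
print(f"95% CI for the mean length: {lower:.2f} cm to {upper:.2f} cm")
```

With a sample mean of 10.28 cm, the interval comes out at roughly 9.04 cm to 11.52 cm.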
Confidence interval for $\mu$ ($\sigma$ unknown)
If the population standard deviation is
unknown, we can substitute the sample standard
deviation, s
This introduces extra uncertainty, since s is
variable from sample to sample
So we use the t distribution instead of the normal
distribution
Assumptions
Population standard deviation is unknown
Population is normally distributed
If population is not normal, use large sample
Use Student's t distribution
Confidence Interval Estimate
$\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$
The t distribution is a family of distributions
The t value depends on degrees of freedom
(d.f.)
Number of observations that are free to vary after
sample mean has been calculated
d.f. = n - 1
Student's t-distribution
Note: $t \to z$ as n increases
Student's t-table
t distribution values compared with the z value
Confidence Level    t (10 d.f.)    t (20 d.f.)    t (30 d.f.)    z
.80                 1.372          1.325          1.310          1.28
.90                 1.812          1.725          1.697          1.645
.95                 2.228          2.086          2.042          1.96
.99                 3.169          2.845          2.750          2.576
Note: $t \to z$ as n increases
Quick Review Question
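A random sample of n = 25 has $\bar{x} = 50$ and
s = 8. Form a 95% confidence interval for $\mu$.
d.f. = n - 1 = 24, so
$t_{\alpha/2,\,n-1} = t_{.025,24} = 2.0639$
The confidence interval is
$\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} = 50 \pm (2.0639) \frac{8}{\sqrt{25}}$
$46.698 \le \mu \le 53.302$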
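The arithmetic of this t interval can be verified with a short Python sketch. The Python standard library has no t distribution, so the critical value 2.0639 is passed in as the table value quoted in the example; it is not computed here. The function name `t_interval` is hypothetical.

```python
import math


def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is the critical value t_{alpha/2, n-1}, looked up in a t-table
    for n - 1 degrees of freedom.
    """
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# n = 25, so d.f. = 24; t_{.025, 24} = 2.0639 is taken from the example.
lower, upper = t_interval(x_bar=50, s=8, n=25, t_crit=2.0639)
print(f"{lower:.3f} to {upper:.3f}")
```

Here the margin is 2.0639 x 8/5 = 3.30224, which reproduces the limits 46.698 and 53.302.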
Approximation for Large samples
Since t approaches z as the sample size
increases, an approximation is sometimes used
when $n \ge 30$:
Technically correct: $\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$
Approximation for large n: $\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$
Example 5.5
A random sample of 120 measurements
taken from a normal population gave the
following data:
n = 120, x 1008 , s = 1.44
Find
(a) a 97% confidence interval
(b) a 99% confidence interval
Determining Sample Size
The required sample size can be found to reach a
desired margin of error (e) and level of confidence $(1 - \alpha)$
Required sample size, $\sigma$ known:
$e = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$, so $n = \frac{z_{\alpha/2}^2 \sigma^2}{e^2}$
Quick Review Question
Example:
If $\sigma = 45$, what sample size is needed to be 90% confident of
being correct within $\pm 5$?
$n = \frac{z_{\alpha/2}^2 \sigma^2}{e^2} = \frac{1.645^2 (45)^2}{5^2} = 219.19$
So the required sample size is n = 220
(Always round up)
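The sample size calculation, including the "always round up" rule, can be sketched in Python with `math.ceil`. The function name `sample_size_for_mean` is an assumption introduced for illustration, and the critical value is computed with `statistics.NormalDist` rather than taken from the table.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, e, confidence):
    """Smallest n whose margin of error is at most e, with sigma known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z * sigma / e)^2, rounded up to the next whole observation.
    return math.ceil((z * sigma / e) ** 2)


n_required = sample_size_for_mean(sigma=45, e=5, confidence=0.90)
print(n_required)
```

Rounding down would leave the margin of error slightly larger than the target, which is why the result is always rounded up.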
If $\sigma$ is unknown
If unknown, $\sigma$ can be estimated when using
the required sample size formula
Use a value for $\sigma$ that is expected to be at least
as large as the true $\sigma$
Select a pilot sample and estimate $\sigma$ with the
sample standard deviation, s
Confidence Intervals for the Population
Proportion, p
An interval estimate for the population
proportion ($p$) can be calculated by adding an
allowance for uncertainty to the sample
proportion ($\hat{p}$)
Recall that the distribution of the sample
proportion is approximately normal if the
sample size is large, with standard deviation
$\sigma_p = \sqrt{\frac{p(1-p)}{n}}$
We will estimate this with sample data:
$s_p = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Confidence interval endpoints
Upper and lower confidence limits for the
population proportion are calculated with the
formula
$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
where
$z_{\alpha/2}$ is the standard normal value for the level of confidence
desired
$\hat{p}$ is the sample proportion
n is the sample size
Quick Review Question
Example:
A random sample of 100 people shows that 25
are left-handed.
Form a 95% confidence interval for the true
proportion of left-handers
1. $\hat{p} = 25/100 = .25$
2. $s_p = \sqrt{\hat{p}(1-\hat{p})/n} = \sqrt{.25(.75)/100} = .0433$
3. $.25 \pm 1.96 (.0433)$
$0.1651 \le p \le 0.3349$
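The three steps can be combined into one small Python function. This is a sketch of the large-sample (normal approximation) interval only; the function name `proportion_interval` is hypothetical, and the critical value comes from `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n                               # step 1: sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    s_p = math.sqrt(p_hat * (1 - p_hat) / n)            # step 2: estimated standard error
    margin = z * s_p                                    # step 3: margin of error
    return p_hat - margin, p_hat + margin


# 25 left-handers in a random sample of 100 people.
lower, upper = proportion_interval(successes=25, n=100)
print(f"{lower:.4f} to {upper:.4f}")
```

Calling `proportion_interval(successes=50, n=200)` keeps the centre at .25 but gives a narrower interval of roughly .19 to .31.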
Interpretation
We are 95% confident that the true percentage of
left-handers in the population is between 16.51%
and 33.49%.
Although this range may or may not contain the
true proportion, 95% of intervals formed from
samples of size 100 in this manner will contain the
true proportion.
Changing the sample size
Increases in the sample size reduce
the width of the confidence interval.
Example:
If the sample size in the above example is
doubled to 200, and if 50 are left-handed in the
sample, then the interval is still centered at
.25, but it narrows to
$.19 \le p \le .31$
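Finding the required sample size for proportion
problems
Solve the margin of error $e = z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}}$ for n:
$n = \frac{z_{\alpha/2}^2 \, p(1-p)}{e^2}$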
Quick Review Question
What sample size?
How large a sample would be necessary to
estimate the true proportion defective in a large
population within 3%, with 95% confidence?
(Assume a pilot sample yields $\hat{p} = .12$)
For 95% confidence, use z = 1.96
e = .03
$\hat{p} = .12$, so use this to estimate p
$n = \frac{z_{\alpha/2}^2 \, p(1-p)}{e^2} = \frac{(1.96)^2 (.12)(1 - .12)}{(.03)^2} = 450.74$
So use n = 451
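The same sample size calculation can be sketched in Python. The function name `sample_size_for_proportion` is an assumption made for this illustration; the pilot estimate .12 and the margin .03 are the values from the question, and the critical value is computed with `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p, e, confidence):
    """Smallest n whose margin of error is at most e for a proportion.

    p is a planning value for the proportion, for example from a pilot sample.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = z^2 * p * (1 - p) / e^2, rounded up to a whole observation.
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)


n_required = sample_size_for_proportion(p=0.12, e=0.03, confidence=0.95)
print(n_required)
```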
Question and Answer Session
Q&A
Next Session
Hypothesis Testing