MMW Lecture 4.3 Data Management Part 3
Data Management
Learning Objectives
o Advocate the use of statistical data in making important
decisions.
o Use a variety of statistical tools to process and manage
numerical data.
o Use linear regression to predict the value of a variable given
certain conditions.
o Apply correlation to determine the relationship between two
variables.
o Perform operations on mathematical expressions correctly.
o Articulate the importance of mathematics in one’s life.
o Express appreciation for mathematics as a human endeavor.
o Support the use of mathematics in various aspects and
endeavors in life.
[Figure: histograms approaching the normal curve. (a) Random sample of 100 males. (b) Sample size increased and class width decreased.]
2. The mean, median, and mode are equal and are located at the
center of the distribution.
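[Figure: areas under the normal curve. About 68% of the area lies within 1 standard deviation of the mean, about 95% within 2 standard deviations, and about 99.7% within 3 standard deviations.]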
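The three percentages above can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `area_within` is an assumption made for this illustration.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
std_normal = NormalDist(mu=0.0, sigma=1.0)


def area_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    # Prints the area within k standard deviations of the mean.
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The printed areas are close to 0.68, 0.95, and 0.997, which is where the "about" in each label comes from.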
Copyright 2018: Mathematics in the Modern World by Winston S. Sirug, Ph.D.
Standard Normal Distribution
A normal distribution can be converted into a standard normal
distribution by obtaining the z value.
$z = \dfrac{X - \mu}{\sigma}$
where: z = z value
X = the value of any particular observation or measurement.
μ = the mean of the distribution.
σ = standard deviation of the distribution
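As a small illustration of the conversion, the sketch below computes a z value from X, μ, and σ. The function name `z_value` and the sample numbers (X = 85, μ = 70, σ = 10) are hypothetical and do not come from the lecture.

```python
def z_value(x: float, mu: float, sigma: float) -> float:
    """Convert an observation x to a z value: (x - mu) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical observation: 85, from a distribution with mean 70 and
# standard deviation 10, lies 1.5 standard deviations above the mean.
print(z_value(85.0, 70.0, 10.0))
```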
Example 1
Find the area under the standard normal distribution curve
between z = 0 and z = 1.85.
Solution:
P(0 < z < 1.85) = 0.4678
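The z table lists the area between 0 and a given z value. A minimal sketch of the same lookup in Python, assuming the standard library's `statistics.NormalDist` in place of the printed table (the helper name `area_from_zero` is hypothetical):

```python
from statistics import NormalDist


def area_from_zero(z: float) -> float:
    """Area under the standard normal curve between 0 and z."""
    # cdf(z) is the area to the left of z; the area to the left of 0 is 0.5.
    return abs(NormalDist().cdf(z) - 0.5)


# Area between z = 0 and z = 1.85, rounded to four places like the table.
print(round(area_from_zero(1.85), 4))
```

Because the curve is symmetric, the same helper also works for negative z values such as z = -1.15.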
Solution (area between z = –1.15 and z = 0):
P(–1.15 < z < 0) = 0.3749
Solution (area to the right of z = 1.15):
P(0 < z < 1.15) = 0.3749
P(z > 1.15) = 0.5000 – P(0 < z < 1.15)
= 0.5000 – 0.3749
= 0.1251
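Solution (area to the left of z = –1.85):
P(–1.85 < z < 0) = 0.4678
P(z < –1.85) = 0.5000 – P(–1.85 < z < 0)
= 0.5000 – 0.4678
= 0.0322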
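Solution (area between z = –1.85 and z = –0.75):
P(–1.85 < z < 0) = 0.4678
P(–0.75 < z < 0) = 0.2734
P(–1.85 < z < –0.75) = 0.4678 – 0.2734
= 0.1944
By symmetry, the area between z = 0.75 and z = 1.85 is also 0.1944.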
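Solution (area between z = –1.85 and z = 1.15):
P(–1.85 < z < 1.15) = P(–1.85 < z < 0) + P(0 < z < 1.15)
= 0.4678 + 0.3749
= 0.8427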
Example 8
Find the area under the standard normal distribution curve to the
left of z = 1.15.
Solution:
P(0 < z < 1.15) = 0.3749
P(z < 1.15) = 0.5000 + P(0 < z < 1.15)
= 0.5000 + 0.3749
= 0.8749
Example 9
Find the area under the standard normal distribution curve to the
right of z = –1.15.
Solution:
P(–1.15 < z < 0) = 0.3749
P(z > –1.15) = P(–1.15 < z < 0) + 0.5000
= 0.3749 + 0.5000
= 0.8749
Example 10
Find the area under the standard normal distribution curve to
the right of z = 1.15 and to the left of z = –1.85.
Solution:
P(0 < z < 1.15) = 0.3749
P(z > 1.15) = 0.5000 – 0.3749 = 0.1251
P(–1.85 < z < 0) = 0.4678
P(z < –1.85) = 0.5000 – 0.4678 = 0.0322
Total area in the two tails = 0.1251 + 0.0322 = 0.1573
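The left-of and right-of areas in Examples 8 to 10 can be sketched in code as follows. This assumes `statistics.NormalDist` as a stand-in for the z table; `left_tail` and `right_tail` are hypothetical helper names.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def left_tail(z: float) -> float:
    """Area under the curve to the left of z."""
    return Z.cdf(z)


def right_tail(z: float) -> float:
    """Area under the curve to the right of z."""
    return 1.0 - Z.cdf(z)


print(round(left_tail(1.15), 4))    # Example 8: left of z = 1.15
print(round(right_tail(-1.15), 4))  # Example 9: right of z = -1.15
# Example 10: right of z = 1.15 plus left of z = -1.85
print(round(right_tail(1.15) + left_tail(-1.85), 4))
```

The two-tail total computed this way can differ from the table-based sum in the fourth decimal place, because the table values are rounded before they are added.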
Finding the z value when the area between 0 and z is 0.3962:
z     0.00    0.01    …  0.06    …
0.0   0.0000  0.0040  …  0.0239  …
0.1   0.0398  0.0438  …  0.0636  …
0.2   0.0793  0.0832  …  0.1026  …
:     :       :          :
1.2   0.3849  0.3869  …  0.3962  …
:     :       :          :
The z value is 1.26.
Application of Normal Distribution
$z = \dfrac{X - \mu}{\sigma}$
where z = z value.
X = the value of any particular observation or measurement.
μ = population mean.
σ = population standard deviation.
With μ = 23,000 and X = 18,500, the z value is –1.80.
P(X < 18,500) = P(z < –1.80)
= 0.5000 – P(–1.80 < z < 0)
= 0.5000 – 0.4641
= 0.0359
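The same probability can be sketched in Python. Note the assumption: the slide text shows only the mean (23,000) and the z value (–1.80), so the standard deviation of 2,500 used below is inferred from (18,500 – 23,000) / σ = –1.80 rather than stated in the source.

```python
from statistics import NormalDist

MU = 23_000.0    # mean, as shown on the slide
SIGMA = 2_500.0  # ASSUMPTION: inferred from z = -1.80, not stated in the source

dist = NormalDist(mu=MU, sigma=SIGMA)

z = (18_500.0 - MU) / SIGMA       # z value for X = 18,500
p_below = dist.cdf(18_500.0)      # P(X < 18,500)

print(z)
print(round(p_below, 4))
```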
Solution:
$z = \dfrac{X - \mu}{\sigma} = \dfrac{35 - 40}{5} = \dfrac{-5}{5} = -1.00$
$z = \dfrac{X - \mu}{\sigma} = \dfrac{46 - 40}{5} = \dfrac{6}{5} = 1.20$
P(–1.00 < z < 0) = 0.3413
P(0 < z < 1.20) = 0.3849
P(35 < X < 46) = P(–1.00 < z < 1.20)
= P(–1.00 < z < 0) + P(0 < z < 1.20)
= 0.3413 + 0.3849
= 0.7262 or 72.62%
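A minimal sketch of this calculation in Python, using the values from the solution (μ = 40, σ = 5, bounds 35 and 46) and `statistics.NormalDist` in place of the z table:

```python
from statistics import NormalDist

dist = NormalDist(mu=40.0, sigma=5.0)

# z values for the two bounds.
z_low = (35.0 - 40.0) / 5.0
z_high = (46.0 - 40.0) / 5.0

# P(35 < X < 46) is the area to the left of 46 minus the area to the left of 35.
p_between = dist.cdf(46.0) - dist.cdf(35.0)

print(z_low, z_high)
print(p_between)
```

The computed probability agrees with the table-based 0.7262 to about three decimal places; the small difference comes from rounding each table entry before adding.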
[Figure: normal curve with mean 180; the area to the right of X is 20% or 0.2000.]
Solution
Determine the area under the normal distribution between 180 & X:
0.5000 – 0.2000 = 0.3000
Standardized Normal Distribution
z     0.00    0.01    …  0.04    0.05
0.0   0.0000  0.0040  …  0.0160  0.0199
0.1   0.0398  0.0438  …  0.0557  0.0596
0.2   0.0793  0.0832  …  0.0948  0.0987
:     :       :          :       :
0.8   0.2881  0.2910  …  0.2995  0.3023
The closest value to 0.3000 is 0.2995, so z = 0.84.
$z = \dfrac{X - \mu}{\sigma}$
$0.84 = \dfrac{X - 180}{25}$
0.84(25) + 180 = X
21 + 180 = X
201 = X
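Working backward from an area to a value can also be sketched with an inverse lookup. The example below assumes `statistics.NormalDist.inv_cdf` as a replacement for searching the table, with μ = 180, σ = 25, and an upper-tail area of 0.2000 taken from the solution above.

```python
from statistics import NormalDist

MU = 180.0
SIGMA = 25.0
UPPER_TAIL = 0.20  # area to the right of X

# inv_cdf takes the area to the LEFT of the point, so use 1 - 0.20 = 0.80.
z = NormalDist().inv_cdf(1.0 - UPPER_TAIL)

# Rearranging z = (X - mu) / sigma gives X = z * sigma + mu.
x = z * SIGMA + MU

print(round(z, 2))
print(round(x))
```

The exact z value is slightly larger than the table's closest entry of 0.84, so the unrounded X is a little above 201.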
Pearson product-moment correlation (PPMC) is the most widely used statistic for measuring the degree of the relationship between linearly related variables.
The correlation coefficient is defined as the covariance of the two variables divided by the product of their standard deviations.
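The definition can be sketched directly in code. The function name and the five data pairs below are hypothetical, chosen only to illustrate the ratio of covariance to the product of the standard deviations.

```python
from math import sqrt


def pearson_r_from_covariance(xs, ys):
    """Correlation coefficient as covariance / (sd of x * sd of y)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and sample standard deviations (divisor n - 1).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)


# Hypothetical data, not from the lecture.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(pearson_r_from_covariance(xs, ys))
```

The n – 1 divisors cancel in the ratio, so using population formulas (divisor n) throughout would give the same r.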
[Figure: six scatter plots of Y Variable against X Variable illustrating different types of correlation.]
$r = \dfrac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{[N(\sum X^2) - (\sum X)^2][N(\sum Y^2) - (\sum Y)^2]}}$
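The computational formula for r uses only the sums ΣX, ΣY, ΣXY, ΣX², and ΣY². A minimal sketch follows; the function name `pearson_r` and the four data pairs are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from raw sums: N, sum X, sum Y, sum XY, sum X^2, sum Y^2."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical data: N = 4, sum X = 10, sum Y = 10, sum XY = 29,
# sum X^2 = 30, sum Y^2 = 30, so r = (116 - 100) / sqrt(20 * 20) = 0.8.
r = pearson_r([1, 2, 3, 4], [1, 3, 2, 4])
print(r)
```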
Test of Significance
$t = r\sqrt{\dfrac{N - 2}{1 - r^2}}$
df = N – 2
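The test statistic can be sketched as a short function. The inputs r = 0.93 and N = 12 below are illustrative assumptions for this example; the critical value still has to come from a t table at the chosen significance level with N – 2 degrees of freedom.

```python
from math import sqrt


def t_statistic(r: float, n: int) -> float:
    """t = r * sqrt((n - 2) / (1 - r^2)) for testing the significance of r."""
    if n <= 2:
        raise ValueError("n must be greater than 2")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * sqrt((n - 2) / (1.0 - r * r))


# Illustrative values (assumed for this sketch).
r, n = 0.93, 12
t = t_statistic(r, n)
df = n - 2
print(round(t, 2), df)
```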
[Figure: scatter plot of Sales (Y) against Temperature (X).]
r = 0.93
Reject H0
Step 6: Conclusion.
We can conclude that there is evidence of a significant association between the atmospheric temperature and the total sales of fruit shake.
Simple Regression Equation
Regression analysis is a simple statistical tool used
to model the dependence of a variable on one (or
more) explanatory variables.
$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$
SST = SSR + SSE
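The identity SST = SSR + SSE can be checked numerically for a least-squares line. The sketch below uses hypothetical data and a hypothetical helper `fit_line`; it is an illustration of the decomposition, not the lecture's worked example.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y-hat = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data, not from the lecture.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
mean_y = sum(ys) / len(ys)
y_hat = [b0 + b1 * x for x in xs]

sst = sum((y - mean_y) ** 2 for y in ys)               # total variation
ssr = sum((yh - mean_y) ** 2 for yh in y_hat)          # explained by the line
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))   # left unexplained

print(sst, ssr, sse)
```

For this data the total variation of 6 splits into about 3.6 explained and 2.4 unexplained.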
$\bar{x} = \dfrac{\sum x}{n} = \dfrac{1{,}029}{12} = 85.75$
$\bar{y} = \dfrac{\sum y}{n} = \dfrac{2{,}115}{12} = 176.25$
$b_0 = \bar{y} - b_1 \bar{x}$
ŷ = 3.7496x – 145.2782
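As a check on the fitted equation, the sketch below recomputes the intercept from the means and the slope 3.7496 shown above, then uses the line to predict. The function name `predict_sales` and the temperature of 90 are hypothetical, chosen only for illustration.

```python
N = 12
SUM_X = 1_029.0   # sum of temperatures
SUM_Y = 2_115.0   # sum of sales
B1 = 3.7496       # slope from the fitted equation

x_bar = SUM_X / N
y_bar = SUM_Y / N

# Intercept from b0 = y-bar - b1 * x-bar.
b0 = y_bar - B1 * x_bar


def predict_sales(temperature: float) -> float:
    """Predicted sales y-hat = b0 + b1 * temperature."""
    return b0 + B1 * temperature


print(round(b0, 4))
print(round(predict_sales(x_bar), 2))  # the line passes through (x-bar, y-bar)
print(round(predict_sales(90.0), 4))   # hypothetical temperature of 90
```

Predictions are most trustworthy for temperatures inside the range of the observed data.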