
MMW Lecture 4.3 Data Management Part 3


Chapter 4: Data Management
Learning Objectives
o Advocate the use of statistical data in making important
decisions.
o Use a variety of statistical tools to process and manage
numerical data.
o Use linear regression to predict the value of a variable given
certain conditions.
o Apply correlation to determine the relationship between two
variables.
o Perform operations on mathematical expressions correctly.
o Articulate the importance of mathematics in one’s life.
o Express appreciation for mathematics as a human endeavor.
o Support the use of mathematics in various aspects and
endeavors in life.

Copyright 2018: Mathematics in the Modern World by Winston S. Sirug, Ph.D.


Topic Outline

I. Introduction to Data Management
II. Measures of Central Tendency
III. Measures of Dispersion
IV. Measures of Relative Position
V. Probabilities and Normal Distributions
VI. Linear Regression and Correlation


Introduction
The normal distribution, or Gaussian distribution, is a continuous probability distribution that describes data that cluster around a mean.

The graph of the associated probability density function is bell-shaped, with a peak at the mean, and is known as the Gaussian function or bell curve.

The normal curve was developed mathematically in 1733 by Abraham de Moivre (1667-1754) as an approximation to the binomial distribution.

De Moivre’s paper was not discovered until 1924, by Karl Pearson (1857-1936).


Introduction
Pierre-Simon Laplace (1749-1827) used the normal curve in 1783 to describe the distribution of errors.

Subsequently, Carl Friedrich Gauss (1777-1855) used the normal curve to analyze astronomical data in 1809.

The normal curve is often called the Gaussian distribution. The term bell-shaped curve is common in everyday usage.

The normal distribution can be used to describe, at least approximately, any variable that tends to cluster around the mean.


Example
The heights of adult males in the Philippines are roughly normally distributed.

Most men have a height close to the mean, though a small number of outliers have a height significantly above or below the mean.

A histogram of male heights will appear similar to a bell curve, with the correspondence becoming closer as more data are used.


[Figure: Histograms for the distribution of heights of adult males in the Philippines. (a) Random sample of 100 males; (b) sample size increased and class width decreased; (c) sample size increased and class width decreased further; (d) normal distribution for the population.]
Properties of Normal Distribution
1. The distribution is bell-shaped.

2. The mean, median, and mode are equal and are located at the center of the distribution.

3. The normal distribution is unimodal.

4. The normal distribution curve is symmetric about the mean (the shape is the same on both sides).

5. The normal distribution is continuous.

6. The normal curve is asymptotic (it never touches the x axis).

7. The total area under the normal distribution curve is 1.00 or 100%.
Properties of Normal Distribution
8. The area under the part of a normal curve that lies within 1 standard deviation of the mean is about 68%; within 2 standard deviations, about 95%; and within 3 standard deviations, about 99.7%.

[Figure: Normal curve with the X axis marked at μ–3σ, μ–2σ, μ–1σ, μ, μ+1σ, μ+2σ, μ+3σ and the corresponding z values –3, –2, –1, 0, +1, +2, +3. The shaded bands cover about 68%, about 95%, and about 99.7% of the area.]
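The percentages in property 8 can be checked numerically. The following is a minimal Python sketch, not part of the original lecture; the function name `area_within` is an illustrative assumption. It relies on the standard identity that the area between –k and +k under the standard normal curve equals erf(k/√2).

```python
import math


def area_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k."""
    # For a standard normal variable Z, P(-k <= Z <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


# Print the area within 1, 2, and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The printed areas round to the "about 68%, 95%, and 99.7%" figures quoted in the property.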
Standard Normal Distribution
A normal distribution can be converted into a standard normal distribution by obtaining the z value.

A z value is the signed distance between a selected value, designated X, and the mean, μ, divided by the standard deviation, σ.

It is also called the z score, the z statistic, the standard normal deviate, or the standard normal value. In terms of formula:

$$z = \frac{X - \mu}{\sigma}$$

where: z = z value
X = the value of any particular observation or measurement
μ = the mean of the distribution
σ = the standard deviation of the distribution


Standard Normal Distribution Table
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
3.1 0.4990 0.4991 0.4991 0.4991 0.4992 0.4992 0.4992 0.4992 0.4993 0.4993
3.2 0.4993 0.4993 0.4994 0.4994 0.4994 0.4994 0.4994 0.4995 0.4995 0.4995
3.3 0.4995 0.4995 0.4995 0.4996 0.4996 0.4996 0.4996 0.4996 0.4996 0.4997
3.4 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4998
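Each entry in the table is the area under the standard normal curve between 0 and z. As a hedged illustration (not part of the original lecture), the entries can be reproduced with Python's standard-library `statistics.NormalDist`; the helper name `table_area` is an assumption made for this sketch.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def table_area(z: float) -> float:
    """Area between 0 and |z|, rounded to four decimals like the table."""
    # cdf(|z|) is the area to the left of |z|; subtracting 0.5 removes
    # the half of the curve that lies to the left of the mean.
    return round(STANDARD_NORMAL.cdf(abs(z)) - 0.5, 4)


print(table_area(1.85))  # row 1.8, column 0.05
print(table_area(1.15))  # row 1.1, column 0.05
```

Because the curve is symmetric, the same helper handles negative z values by taking the absolute value.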
Example 1
Find the area under the standard normal distribution curve
between z = 0 and z = 1.85.
Solution:
In the table, the row for z = 1.8 and the column for 0.05 give the area between z = 0 and z = 1.85:

P(0 ≤ z ≤ 1.85) = 0.4678

The area is 0.4678 or 46.78%.


Example 2
Find the area under the standard normal distribution curve
between z = 0 and z = –1.15.

Solution:
By symmetry, the area between z = –1.15 and z = 0 equals the area between z = 0 and z = 1.15, so

P(–1.15 ≤ z ≤ 0) = 0.3749

Therefore, the area is 0.3749 or 37.49%.


Example 3
Find the area under the standard normal distribution curve to the
right of z = 1.15.

Solution:
P(0 ≤ z ≤ 1.15) = 0.3749, and the total area to the right of z = 0 is 0.5000.

P(z ≥ 1.15) = 0.5000 – P(0 ≤ z ≤ 1.15)
= 0.5000 – 0.3749
= 0.1251

The area to the right of z = 1.15 is 0.1251 or 12.51%.


Example 4
Find the area under the standard normal distribution curve to the left
of z = –1.85.
Solution:
By symmetry, P(–1.85 ≤ z ≤ 0) = P(0 ≤ z ≤ 1.85) = 0.4678, and the total area to the left of z = 0 is 0.5000.

P(z ≤ –1.85) = 0.5000 – P(0 ≤ z ≤ 1.85)
= 0.5000 – 0.4678
= 0.0322

The area is 0.0322 or 3.22%.


Example 5
Find the area under the standard normal distribution curve between z = 0.75 and z = 1.85.
Solution:
P(0 ≤ z ≤ 1.85) = 0.4678
P(0 ≤ z ≤ 0.75) = 0.2734

P(0.75 ≤ z ≤ 1.85) = P(0 ≤ z ≤ 1.85) – P(0 ≤ z ≤ 0.75)
= 0.4678 – 0.2734
= 0.1944

The area is 0.1944 or 19.44%.


Example 6
Find the area under the standard normal distribution curve between z = –0.75 and z = –1.85.

Solution:
P(–1.85 ≤ z ≤ 0) = 0.4678
P(–0.75 ≤ z ≤ 0) = 0.2734

P(–1.85 ≤ z ≤ –0.75) = P(–1.85 ≤ z ≤ 0) – P(–0.75 ≤ z ≤ 0)
= 0.4678 – 0.2734
= 0.1944

P(0.75 ≤ z ≤ 1.85) and P(–1.85 ≤ z ≤ –0.75) have the same area under the normal curve, and one is the mirror image of the other.
Example 7
Find the area under the standard normal distribution curve
between z = 1.15 and z = –1.85.
Solution:
P(–1.85 ≤ z ≤ 0) = 0.4678
P(0 ≤ z ≤ 1.15) = 0.3749

P(–1.85 ≤ z ≤ 1.15) = P(–1.85 ≤ z ≤ 0) + P(0 ≤ z ≤ 1.15)
= 0.4678 + 0.3749
= 0.8427

The total area is 0.8427 or 84.27%.
Example 8
Find the area under the standard normal distribution curve to the
left of z = 1.15.

Solution:
P(0 ≤ z ≤ 1.15) = 0.3749, and the total area to the left of z = 0 is 0.5000.

P(z ≤ 1.15) = 0.5000 + P(0 ≤ z ≤ 1.15)
= 0.5000 + 0.3749
= 0.8749

The area is 0.8749 or 87.49%.
Example 9
Find the area under the standard normal distribution curve to the
right of z = –1.15.

Solution:
P(–1.15 ≤ z ≤ 0) = 0.3749, and the total area to the right of z = 0 is 0.5000.

P(z ≥ –1.15) = P(–1.15 ≤ z ≤ 0) + 0.5000
= 0.3749 + 0.5000
= 0.8749

The area is 0.8749 or 87.49%.
Example 10
Find the area under the standard normal distribution curve to the right of z = 1.15 and to the left of z = –1.85.
Solution:
P(–1.85 ≤ z ≤ 0) = 0.4678, so P(z ≤ –1.85) = 0.5000 – 0.4678 = 0.0322

P(0 ≤ z ≤ 1.15) = 0.3749, so P(z ≥ 1.15) = 0.5000 – 0.3749 = 0.1251

P(z ≤ –1.85) + P(z ≥ 1.15) = 0.0322 + 0.1251 = 0.1573

The total area is 0.1573 or 15.73%.
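The three kinds of area used in Examples 1 to 10 (left of z, right of z, and between two z values) can be sketched in Python as follows. This is an illustrative sketch rather than part of the lecture; the helper names are assumptions. It uses the cumulative distribution function directly instead of the table, so a result can differ from the table-based answer by one unit in the fourth decimal place.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def area_left(z: float) -> float:
    """Area to the left of z."""
    return Z.cdf(z)


def area_right(z: float) -> float:
    """Area to the right of z."""
    return 1.0 - Z.cdf(z)


def area_between(a: float, b: float) -> float:
    """Area between two z values, given in either order."""
    low, high = sorted((a, b))
    return Z.cdf(high) - Z.cdf(low)


print(round(area_right(1.15), 4))           # Example 3
print(round(area_left(-1.85), 4))           # Example 4
print(round(area_between(0.75, 1.85), 4))   # Example 5
print(round(area_between(1.15, -1.85), 4))  # Example 7
```

Subtracting or adding 0.5000 by hand, as in the worked solutions, is the table-based equivalent of these three helpers.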
Example 11
Find the z value such that the area under the standard normal
distribution curve between 0 and z value is 0.3962.
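Solution:
Locate 0.3962 in the body of the table, then read the z value from the row and column headings.

z     0.00    0.01    …   0.06    …
0.0   0.0000  0.0040  …   0.0239  …
0.1   0.0398  0.0438  …   0.0636  …
0.2   0.0793  0.0832  …   0.1026  …
:     :       :           :
1.2   0.3849  0.3869  …   0.3962  …
:     :       :           :

The value 0.3962 lies in row 1.2 and column 0.06, so the z value is 1.26.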
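Reading the table backwards, from an area to a z value, corresponds to the inverse of the cumulative distribution function. A minimal Python sketch, assuming the standard-library `NormalDist.inv_cdf` method and a hypothetical helper name `z_for_area_from_zero`:

```python
from statistics import NormalDist


def z_for_area_from_zero(area: float) -> float:
    """Return z >= 0 such that the area between 0 and z equals `area`."""
    if not 0.0 <= area < 0.5:
        raise ValueError("area between 0 and z must be in [0, 0.5)")
    # The area to the left of z is 0.5 (left half of the curve) plus `area`.
    return NormalDist().inv_cdf(0.5 + area)


# Area of 0.3962 between 0 and z, as in Example 11.
print(round(z_for_area_from_zero(0.3962), 2))
```

Rounded to two decimals, the result agrees with the table row 1.2 and column 0.06, where the entry 0.3962 appears.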
Application of Normal Distribution

$$z = \frac{X - \mu}{\sigma}$$

where z = z value.
X = the value of any particular observation or measurement.
μ = population mean.
σ = population standard deviation.


Example 1
The average Pag-ibig salary loan for RFS Pharmacy Inc. employees is ₱23,000. If the debt is normally distributed with a standard deviation of ₱2,500, find the probability that an employee owes less than ₱18,500.
Solution:
We need P(X < 18,500), with μ = 23,000 and σ = 2,500.

$$z = \frac{X - \mu}{\sigma} = \frac{18{,}500 - 23{,}000}{2{,}500} = \frac{-4{,}500}{2{,}500} = -1.80$$

P(–1.80 ≤ z ≤ 0) = 0.4641

P(X < 18,500) = P(z < –1.80)
= 0.5000 – P(–1.80 ≤ z ≤ 0)
= 0.5000 – 0.4641
= 0.0359

The probability that the employee owes less than ₱18,500 in Pag-ibig salary loan is 0.0359 or 3.59%.


Example 2
The average age of bank managers is 40 years. Assume the variable is normally distributed. If the standard deviation is 5 years, find the probability that the age of a randomly selected bank manager will be between 35 and 46 years.

Solution:

$$z = \frac{X - \mu}{\sigma} = \frac{35 - 40}{5} = \frac{-5}{5} = -1.00$$

$$z = \frac{X - \mu}{\sigma} = \frac{46 - 40}{5} = \frac{6}{5} = 1.20$$

P(–1.00 ≤ z ≤ 0) = 0.3413
P(0 ≤ z ≤ 1.20) = 0.3849

P(35 ≤ X ≤ 46) = P(–1.00 ≤ z ≤ 1.20)
= P(–1.00 ≤ z ≤ 0) + P(0 ≤ z ≤ 1.20)
= 0.3413 + 0.3849
= 0.7262

The probability that a randomly selected bank manager is between 35 and 46 years old is 0.7262 or 72.62%.
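The two application examples above can also be checked without converting to z by hand. The sketch below is illustrative only and assumes Python's standard-library `statistics.NormalDist`, which accepts the mean and standard deviation directly; the variable names are made up for this example. Results may differ from the table-based answers by one unit in the fourth decimal place.

```python
from statistics import NormalDist

# Example 1: loans with mean 23,000 and standard deviation 2,500.
loans = NormalDist(mu=23_000, sigma=2_500)
p_loan_below = loans.cdf(18_500)  # P(X < 18,500)

# Example 2: ages with mean 40 and standard deviation 5.
ages = NormalDist(mu=40, sigma=5)
p_age_between = ages.cdf(46) - ages.cdf(35)  # P(35 <= X <= 46)

print(round(p_loan_below, 4))
print(round(p_age_between, 4))
```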


Example 3
To qualify for a Master’s degree program in Business Administration at ADMU, candidates must score in the top 20% on a mental ability test. The test has a mean of 180 and a standard deviation of 25. Find the lowest possible score to qualify. Assume the test scores are normally distributed.
Solution:
Find the test value X that cuts off the upper 20% (0.2000) of the area under the normal distribution curve.

Determine the area under the normal distribution between 180 and X:
0.5000 – 0.2000 = 0.3000

Standardized Normal Distribution
z     0.00    0.01    …   0.04    0.05
0.0   0.0000  0.0040  …   0.0160  0.0199
0.1   0.0398  0.0438  …   0.0557  0.0596
0.2   0.0793  0.0832  …   0.0948  0.0987
:     :       :           :       :
0.8   0.2881  0.2910  …   0.2995  0.3023

The closest table value to 0.3000 is 0.2995, which corresponds to z = 0.84.

$$z = \frac{X - \mu}{\sigma}$$

$$0.84 = \frac{X - 180}{25}$$

0.84(25) + 180 = X
21 + 180 = X
201 = X

A score of 201 should be used as the cutoff. Anybody scoring 201 or higher qualifies for the Master’s degree program in Business Administration.
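Finding a cutoff score is the reverse of finding a probability: the area is known and the value of X is wanted. A hedged Python sketch, assuming the standard-library `NormalDist.inv_cdf` method (the variable names are illustrative):

```python
from statistics import NormalDist

# Test scores with mean 180 and standard deviation 25.
scores = NormalDist(mu=180, sigma=25)

# The top 20% starts where 80% of the area lies to the left.
cutoff = scores.inv_cdf(0.80)

print(round(cutoff))
```

The unrounded result is slightly above 201 because the exact z value is a little larger than the table's closest entry of 0.84.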


Correlation
Correlation refers to the departure of two random variables from independence.

Pearson product-moment correlation (PPMC) is the measure most widely used in statistics for the degree of relationship between linearly related variables.

The correlation coefficient is defined as the covariance of the two variables divided by the product of their standard deviations.


Pearson Product Moment Correlation
Pearson’s product-moment correlation coefficient, or simply the correlation coefficient (or Pearson’s r), is a measure of the strength of the linear association between two variables.

o Developed by Karl Pearson.

o The value of the correlation coefficient varies between +1 and –1.


Correlation Coefficient
[Figure: Six scatter plots of the Y variable against the X variable, titled: Perfect Positive Correlation (r = 1.00); Perfect Negative Correlation (r = -1.00); Positive Correlation (r = 0.80); Negative Correlation (r = -0.80); Zero Correlation (r = 0.00); Non-Linear Correlation (r = -1.00).]


Pearson Product-Moment Correlation (PPMC)

$$r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}}$$

Test of Significance

$$t = \frac{r\sqrt{N - 2}}{\sqrt{1 - r^2}}$$

df = N – 2


Correlation Coefficient and Strength of Relationship

0.00 – no correlation, no relationship
±0.01 to ±0.20 – slight correlation, almost negligible relationship
±0.21 to ±0.40 – slight correlation, definite but small relationship
±0.41 to ±0.70 – moderate correlation, substantial relationship
±0.71 to ±0.90 – high correlation, marked relationship
±0.91 to ±0.99 – very high correlation, very dependable relationship
±1.00 – perfect correlation, perfect relationship


Assumptions

o Subjects are randomly selected and independently assigned to groups.

o Both populations are normally distributed.


Procedure for PPMC Test

o Set up the hypotheses.
H0: ρ = 0 (The correlation in the population is zero.)
H1: ρ ≠ 0, ρ > 0, or ρ < 0 (The correlation in the population is different from zero.)

o Calculate the value of Pearson’s r.

o Calculate the t value.

o Make the statistical decision for hypothesis testing.
If |t computed| < t critical, do not reject H0.
If |t computed| ≥ t critical, reject H0.


Example 1
The owner of a chain of halo-halo stores would like to study the effect of atmospheric temperature on sales during the summer season. A random sample of 12 days is selected, with the results given as follows:

Day                 1    2    3    4    5    6    7    8    9   10   11   12
Temperature (°F)   79   76   78   84   90   83   93   94   97   85   88   82
Total Sales       147  143  147  168  206  155  192  211  209  187  200  150

Plot the data on a scatter diagram. Does there appear to be a relationship between atmospheric temperature and sales? Compute the coefficient of correlation. Determine at the 0.05 significance level whether the correlation in the population is different from zero.


Scatterplot
[Figure: Scatterplot of Sales (Y) against Temperature (X), with the temperature axis running from 70 to 100 and the sales axis from 130 to 220.]


Solution
Step 1: State the hypotheses.
H0: ρ = 0
There is no correlation between atmospheric temperature and total sales of fruit shake.
H1: ρ ≠ 0
There is a correlation between atmospheric temperature and total sales of fruit shake.

Step 2: Level of significance is α = 0.05.

Step 3: df = N – 2 = 12 – 2 = 10, and the critical t value is ±2.228.

Step 4: Compute Pearson’s r.


Solution
Day     x       y       x²       y²        xy
1       79      147     6,241    21,609    11,613
2       76      143     5,776    20,449    10,868
3       78      147     6,084    21,609    11,466
4       84      168     7,056    28,224    14,112
5       90      206     8,100    42,436    18,540
6       83      155     6,889    24,025    12,865
7       93      192     8,649    36,864    17,856
8       94      211     8,836    44,521    19,834
9       97      209     9,409    43,681    20,273
10      85      187     7,225    34,969    15,895
11      88      200     7,744    40,000    17,600
12      82      150     6,724    22,500    12,300
Total   1,029   2,115   88,733   380,887   183,222

ΣX = 1,029
ΣY = 2,115
ΣX² = 88,733
ΣY² = 380,887
ΣXY = 183,222


Solution

$$r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}}$$

$$r = \frac{12(183{,}222) - (1{,}029)(2{,}115)}{\sqrt{[12(88{,}733) - (1{,}029)^2][12(380{,}887) - (2{,}115)^2]}} = \frac{22{,}329}{\sqrt{(5{,}955)(97{,}419)}} \approx 0.93$$

The atmospheric temperature and total sales show a very high positive correlation (very dependable relationship); that is, an increase in atmospheric temperature is highly associated with an increase in total sales of fruit shake.


Solution
Step 5: Decision rule.

$$t = \frac{r\sqrt{N - 2}}{\sqrt{1 - r^2}} = \frac{0.93\sqrt{12 - 2}}{\sqrt{1 - (0.93)^2}} = \frac{0.93(3.16227766)}{\sqrt{1 - 0.8649}} = \frac{2.940918224}{0.367559519} \approx 8.00$$

Since the computed t value of 8.00 is greater than the critical value of 2.228, reject H0.

Step 6: Conclusion.
We can conclude that there is evidence of a significant association between the atmospheric temperature and the total sales of fruit shake.
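The computation of r and t for the temperature and sales data can be sketched in Python as follows. This is an illustrative sketch and not the author's code; the function names `pearson_r` and `t_statistic` are assumptions. Note that the worked solution rounds r to 0.93 before computing t, which gives 8.00; carrying the unrounded r through gives a t value of about 7.8. Either way, t is well above the critical value of 2.228.

```python
import math

temperature = [79, 76, 78, 84, 90, 83, 93, 94, 97, 85, 88, 82]
sales = [147, 143, 147, 168, 206, 155, 192, 211, 209, 187, 200, 150]


def pearson_r(xs, ys):
    """Pearson's r from the raw-score (sum) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


def t_statistic(r, n):
    """t value for testing whether the population correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r = pearson_r(temperature, sales)
t = t_statistic(r, len(temperature))

print(round(r, 2))                     # r rounded as in the worked solution
print(round(t, 2))                     # t from the unrounded r
print(round(t_statistic(0.93, 12), 2)) # t from r rounded to 0.93
```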
Simple Regression Equation
Regression analysis is a simple statistical tool used to model the dependence of a variable on one (or more) explanatory variables.

A simple linear regression is the least squares estimator of a linear regression model with a single predictor (or one independent variable).

The least squares method determines a regression equation by minimizing the sum of squares of the vertical distances between the actual Y values and the predicted values of Y.


Assumptions of Linear Regression Equation
o Linearity – The mean of each error component is zero.

o Independence of Error Terms – The errors are independent of each other.

o Normally Distributed Error Terms – Each error component (random variable) follows an approximate normal distribution.

o Homoscedasticity – The variance of the error components is the same for each value of the independent variable.


Estimating the Coefficient
Predicted or fitted value of Y:
ŷ = b1x + b0

Slope of the regression line:

$$b_1 = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$$

Intercept of the regression line:

$$b_0 = \bar{y} - b_1\bar{x}$$


Sum of Squares for Error
The least squares method determines the coefficients that minimize the sum of the squared deviations between the points and the line defined by the coefficients. This quantity is called the sum of squares for error (or unexplained variation).


Measures of Variations
[Figure: A data point (Xi, Yi) plotted with the regression line Ŷ = b1X + b0 and the mean of Y, showing the total sum of squares split into the explained sum of squares and the unexplained sum of squares.]

$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$

SST = SSR + SSE
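The identity SST = SSR + SSE can be verified numerically on the temperature and sales data from the correlation example. The sketch below is illustrative only; the helper `fit_line` is a hypothetical name, and it uses the deviation-from-the-mean form of the least squares slope, which is algebraically equivalent to the sum formula for b1.

```python
temperature = [79, 76, 78, 84, 90, 83, 93, 94, 97, 85, 88, 82]
sales = [147, 143, 147, 168, 206, 155, 192, 211, 209, 187, 200, 150]


def fit_line(xs, ys):
    """Least squares slope b1 and intercept b0 (deviation form)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    b0 = mean_y - b1 * mean_x
    return b1, b0


b1, b0 = fit_line(temperature, sales)
mean_y = sum(sales) / len(sales)
predicted = [b1 * x + b0 for x in temperature]

sst = sum((y - mean_y) ** 2 for y in sales)                    # total
ssr = sum((y_hat - mean_y) ** 2 for y_hat in predicted)        # explained
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(sales, predicted))  # unexplained

print(round(sst, 2), round(ssr + sse, 2))
```

The two printed numbers agree up to floating-point rounding, which is what the decomposition states.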


Solution

Slope of the regression line

$$b_1 = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$$

$$b_1 = \frac{12(183{,}222) - (1{,}029)(2{,}115)}{12(88{,}733) - (1{,}029)^2} = \frac{22{,}329}{5{,}955} = 3.7496$$

Compute the intercept of the simple linear regression.

$$\bar{x} = \frac{\sum x}{n} = \frac{1{,}029}{12} = 85.75 \qquad \bar{y} = \frac{\sum y}{n} = \frac{2{,}115}{12} = 176.25$$

$$b_0 = \bar{y} - b_1\bar{x}$$

b0 = 176.25 – 3.7496(85.75) = –145.2782
Regression Equation
ŷ = b1x + b0

ŷ = 3.7496x – 145.2782

b1 = 3.7496 (for each additional degree Fahrenheit of temperature, sales are expected to increase by 3.7496 units).

b0 = –145.2782 (if the temperature were 0°F, the equation would predict sales of –145.2782 units; this has no practical meaning because 0°F lies far outside the observed temperatures of 76°F to 97°F).
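The slope and intercept can be reproduced from the sum formulas, and the fitted equation can then be used for prediction. The following Python sketch is illustrative; the function name `regression_coefficients` and the prediction temperature of 90°F are assumptions chosen for the example, not values from the lecture. Because the code does not round b1 before computing b0, the intercept differs from the worked value in the third decimal place.

```python
temperature = [79, 76, 78, 84, 90, 83, 93, 94, 97, 85, 88, 82]
sales = [147, 143, 147, 168, 206, 155, 192, 211, 209, 187, 200, 150]


def regression_coefficients(xs, ys):
    """Slope b1 and intercept b0 from the sum formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = sum_y / n - b1 * (sum_x / n)
    return b1, b0


b1, b0 = regression_coefficients(temperature, sales)

# Hypothetical use: predicted sales on a 90°F day, a temperature
# inside the observed range of 76°F to 97°F.
predicted_sales = b1 * 90 + b0

print(round(b1, 4), round(b0, 2))
print(round(predicted_sales, 1))
```

Predictions are most trustworthy for temperatures within the range of the sample data.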


Regression Line
[Figure: Regression line ŷ = 3.7496x – 145.2782.]


Do not put your faith in what statistics say until you have carefully considered what they do not say.
– William W. Watt