Business Statistics
SEGMENTS 1-3
Segment: Descriptive Statistics
Topic: Introduction to Data Analysis
Introduction to Data Analysis
©COPYRIGHT 2019, ALL RIGHTS RESERVED. MANIPAL ACADEMY OF HIGHER EDUCATION
Introduction
Data analysis and decision making are unavoidable for all business entities. Data analysis provides a
range of statistical tools for organising, presenting, analysing, and interpreting data. There is an
upsurge in demand for data analysts throughout the world. Data analysts are now able to collect
tremendous amounts of data, and data analysis has become relatively easier with the use of
modern technology.
Learning Objectives
A data analyst may need to perform data analysis for many different purposes. The most
common ones are to:
describe a data set using the results of an analysis
make inferences from the results of an analysis
estimate unknown quantities from appropriate sample data
test hypotheses about unknown quantities
quantify relationships among variables
make decisions based on statistical results.
For example, assume that a manager of a retail chain wants to have insights about the sales data
of various products under his organisation. He can make use of various charts and measures
such as mean and standard deviation to get better inputs about the products sold by the
organisation.
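As a minimal sketch of this idea, the Python snippet below computes the mean and the standard deviation of weekly sales for one product. The figures and the name `weekly_sales` are hypothetical and are used only for illustration.

```python
import statistics

# Hypothetical weekly unit sales for one product (illustrative values only)
weekly_sales = [120, 135, 128, 150, 142, 131]

mean_sales = statistics.mean(weekly_sales)   # central value of the sales figures
sd_sales = statistics.stdev(weekly_sales)    # sample standard deviation (spread)

print(f"Mean weekly sales: {mean_sales:.1f}")
print(f"Standard deviation: {sd_sales:.1f}")
```

A manager could repeat the same two calculations for each product to compare typical sales levels and how much they fluctuate.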
Suppose we wish to learn about Fortune 500 companies. These 500 companies become the
population that we need to collect data about. To collect the data, we could look at published
data in libraries or on the web, mail questionnaires to company managers, or conduct
face-to-face or telephone interviews.
Suppose we wish to learn about all the companies in Europe. This population is too large to
collect data about. We may then resort to sampling. The population is a set of all the elements
under a study of interest or research, whereas the sample is a subset derived from a population.
Using the appropriate computed results from the sample data it is possible to infer the
parameters of the population. By designing the sampling experiment suitably, we can make the
inference as reliable as needed.
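The relationship between a population and a sample can be sketched in Python. The population below is simulated, so its size, its mean of 50, and its spread are assumptions made purely for illustration; in practice the population mean would be unknown and only the sample would be available.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: annual revenue (in millions) of 10,000 companies
population = [random.gauss(50, 12) for _ in range(10_000)]

# Draw a simple random sample of 200 companies without replacement
sample = random.sample(population, 200)

population_mean = statistics.mean(population)  # the parameter (normally unknown)
sample_mean = statistics.mean(sample)          # the estimate computed from the sample

print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
```

Increasing the sample size generally brings the sample mean closer to the population mean, which is one way a sampling experiment can be designed to make the inference more reliable.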
Sometimes it may be difficult to define the population. For example, a store manager may be
interested in knowing about the characteristics of all potential customers. The population should
then include anyone who is a potential customer. However, it is hard to know who exactly those
people are. In such cases, one has to make some assumptions and proceed. The store manager
may randomly pick customers entering the store and treat them as a representative sample of
the population.
In another example, an inspector may be interested in the average weight of all the parts to be
produced by a machine. Since all the parts have not been produced yet, the population is not
there yet. The inspector may randomly pick parts being produced by the machine and treat them
as a representative sample of the population.
Estimation
When the inferences we make about the population parameters are about their exact values,
then the process is called estimation.
For example, we might estimate the sales revenue of a company to be US$3,284,500 for the
coming year.
Hypothesis
Some inferences made regarding population parameters are not about their exact values.
For example, we may be interested in simply testing whether a parameter is not less than 100,
because someone has claimed that it is not less than 100 and a few others have challenged that
claim. This process is called hypothesis testing.
Sometimes a hypothesis might be made about two or more population parameters. Such a
hypothesis can also be tested using sample data.
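One possible way to test the claim that a parameter is not less than 100 is sketched below. The text does not name a specific test, so the one-sample t test, the 5% significance level, and the sample values are all assumptions chosen for illustration.

```python
import math
import statistics

# Hypothetical sample of 10 measurements (illustrative values only)
sample = [98, 102, 97, 101, 99, 96, 103, 100, 95, 99]
claimed_minimum = 100   # the claim: the population mean is not less than 100

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)

# t statistic: how many standard errors the sample mean lies from 100
t_stat = (sample_mean - claimed_minimum) / (sample_sd / math.sqrt(n))

# Critical value for a left-tailed test at the 5% level with
# n - 1 = 9 degrees of freedom (taken from standard t tables)
t_critical = 1.833

# Reject the claim only if the sample mean is far enough below 100
reject_claim = t_stat < -t_critical

print(f"t statistic: {t_stat:.3f}, reject claim: {reject_claim}")
```

Here the sample mean falls slightly below 100, but not by enough to reject the claim at the chosen level.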
2. Summary
Here is a quick recap of what we have learnt so far:
Data analysis provides a range of statistical tools for organising, presenting, analysing, and
interpreting data.
Descriptive statistics deals with describing the data using graphs, tables, and various
summary measures such as measures of central tendency and measures of variation.
Inferential statistics speaks about drawing some meaningful inferences from the data after
performing suitable statistical tests.
The population is a set of all the elements under a study of interest or research whereas the
sample is a subset derived from a population.
It is possible to make two types of inferences from data, namely, estimation and hypothesis testing.
When inferences are made about the exact values of the population parameters, then this
process is called estimation.
In hypothesis testing, the inferences made about the population parameters are not about
their exact values.
3. Glossary
Segment: Descriptive Statistics
Topic: Types of Data
Types of Data
Introduction
The first step in data analysis is learning to recognise the different types of data.
Different types of data may require different methods of organising and presenting the data.
This topic introduces some common types of data and discusses the key differences between
them.
Let us first look at an overview of this topic.
Types of data
Consider this.
The identification number of the part that your company produces is 167402.
Your annual income is S$167,402.
The number is the same in both cases.
But can you use the number the same way in both cases? No.
The income is a measured quantity to which you can add a number if that is an additional
income, or from which you can subtract a number if that is an expense.
On the other hand, the identification number cannot be meaningfully used in any arithmetic
calculation.
In this topic, you will study different types of data and the scale of measurement, and learn
what can, and cannot, be done with each type of data.
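The difference between the two uses of the same number can be sketched in Python. Storing the identifier as text and the income as a number is an illustrative design choice, and the bonus and expense amounts are hypothetical.

```python
# The same digits, treated as two different kinds of data
part_id = "167402"        # identifier: stored as text, used only as a label
annual_income = 167_402   # measured quantity: stored as a number

# Arithmetic is meaningful for the income...
income_after_bonus = annual_income + 5_000     # hypothetical additional income
income_after_expense = annual_income - 2_000   # hypothetical expense

# ...whereas the identifier only supports checks such as equality
is_same_part = (part_id == "167402")

print(income_after_bonus, income_after_expense, is_same_part)
```

Keeping identifiers as text makes it harder to average or add them by accident.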
Learning Objectives
At the end of this topic, you will be able to:
identify the different types of data
identify the different scales of measurement for measuring various types of data
distinguish between types of quantitative data
differentiate between data collected using various time scales.
Quantitative
Qualitative
Qualitative
A qualitative (or categorical) variable is a variable that records a quality. Variables such as
gender are considered qualitative, as the possible values (male or female) are categories
rather than numbers. If a number is used for distinguishing members of different categories
of a qualitative variable, the number assignment is arbitrary.
Exercise:
Below is an exercise to identify the difference between quantitative and qualitative variables.
A condominium is up for sale in the Boston area. Realtors who help sell the unit provide
prospective buyers with the following information – asking price US$168,000, two bedrooms
facing east, a washer, dryer and heater are included.
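Question 1: Is the asking price of the unit qualitative or quantitative information?
1. Quantitative
2. Qualitative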
Question 2: Is the fact that the unit has two bedrooms qualitative or quantitative information?
1. Quantitative
2. Qualitative
Question 3: Is the fact that the unit faces east qualitative or quantitative information?
1. Quantitative
2. Qualitative
Question 4: Is the fact that the unit has a washer, dryer and heater qualitative or quantitative
information?
1. Quantitative
2. Qualitative
2. Measurement Scales
Given some data, quantitative or qualitative, one should be clear about the scale in which it has
been measured. There are four generally used scales of measurement. We shall note some
important aspects of each scale listed here.
1. Nominal scale
2. Ordinal scale
3. Interval scale
4. Ratio scale
Scales of Measurements
1. Nominal Scale
In the nominal scale of measurement, numbers are used simply as labels for groups or classes.
If our data set consists of blue, green and red items, we may designate blue = 1, green = 2,
and red = 3. In this case, the numbers 1, 2, and 3 stand only for the category to which a data
point belongs. 'Nominal' stands for 'name' of category. Such data can only be used to identify
or classify a person or thing.
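A short Python sketch of nominal coding follows. The list of items is hypothetical; the colour codes are the ones used in the example above.

```python
from collections import Counter

# Arbitrary numeric labels for the three colour categories
colour_codes = {"blue": 1, "green": 2, "red": 3}

# Hypothetical data set of coloured items
items = ["blue", "red", "green", "blue", "blue", "red"]
coded_items = [colour_codes[c] for c in items]

# Counting the members of each category is meaningful on a nominal scale
counts = Counter(items)

# Relabelling (for example blue = 3 and red = 1) would change any "average code",
# which is why arithmetic on nominal codes carries no meaning.
print(coded_items)
print(counts)
```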
2. Ordinal Scale
In the ordinal scale of measurement, data elements may be ordered according to their
relative size or quality.
A good example of ordinal scale measurement is the rank of an athlete in a tournament. Such
data can be used only for ordering from best to worst, or biggest to smallest, etc.
Consider a running race in which runners are ranked 1, 2, ... such that rank 1 is faster than
rank 2 and so on. Looking at the ranks we can only say who is faster. Take the two runners
whose ranks are 10 and 20... We can say that rank 10 is faster than rank 20.
But we cannot say that rank 10 was twice as fast as rank 20. Also, we cannot say that the
difference between ranks 10 and 20 is twice as large as the difference between ranks 10 and
15.
3. Interval Scale
In the interval scale of measurement, the value of zero is assigned arbitrarily. Along major
highways, miles or kilometres are marked prominently at various points.
If point A is marked 100 kilometres, and point B is marked 200 kilometres, does this mean
point B is twice as far as point A is from where you are?
No. The reason is that you may not be at the 0 km point. When the zero marker is placed at
some arbitrary point, we have an interval scale.
In this case, only the intervals can be compared to produce meaningful ratios. For example, if
point C is marked 300 kilometres, then we can say that point C is twice as far from point A as
point B is, because the interval AC = (300 - 100) kilometres is twice as long as the interval AB =
(200 - 100) kilometres.
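The highway example can be checked with a few lines of Python. The 50 km shift of the zero marker is a hypothetical value, added only to show that ratios of raw readings depend on where zero is placed while ratios of intervals do not.

```python
# Highway markers in kilometres (the zero point is arbitrary)
point_a, point_b, point_c = 100, 200, 300

# Ratios of intervals are meaningful on an interval scale
interval_ab = point_b - point_a
interval_ac = point_c - point_a
ratio_of_intervals = interval_ac / interval_ab

# Hypothetical relocation of the zero marker by 50 km
shift = 50
raw_ratio_before = point_b / point_a
raw_ratio_after = (point_b + shift) / (point_a + shift)
shifted_ratio_of_intervals = (
    ((point_c + shift) - (point_a + shift)) / ((point_b + shift) - (point_a + shift))
)

print(ratio_of_intervals, raw_ratio_before, raw_ratio_after, shifted_ratio_of_intervals)
```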
4. Ratio Scale
If two measurements are in ratio scale, then we can take ratios of those measurements. The
zero in the scale is absolute zero.
Good examples of ratio scale measurements are weight, volume, income, cost, etc.
In these cases, zero really means zero and ratios can be taken straight away with the measured
amounts. A weight of 20 kg is indeed twice as heavy as a weight of 10 kg.
2. Continuous Variables
On the other hand, variables such as the return on a portfolio of stocks can vary continuously.
For example, the annual return on a portfolio of stocks could be, say 8.765%.
In this case, our variable of interest could assume an infinite number of possible values, and
thus it is classified as a continuous variable.
It is useful to differentiate between discrete and continuous variables, as they may need to be
handled differently in many situations.
For example, to display the distribution of a discrete variable graphically, we might use a bar
chart while for a continuous variable we might use a histogram.
In the segment, “Probability and uncertainty”, we will explore the basic concepts involved in
probability and in making decisions where uncertainty is involved. These concepts lay the
groundwork for later work involving probability distributions, where we will need to treat
discrete and continuous variables very differently.
Cross-sectional Data
An example might be data collected on price-earnings ratios for a group of companies quoted
on the NASDAQ exchange, where the data was collected at the same point in time.
Time-series Data
An example of time-series data might be the closing value of a particular stock market index
over the period of a year. The values collected represent time-series data, where the
observations are collected over a period of time.
5. Summary
Here is a quick recap of what we have learnt so far:
Segment: Descriptive Statistics
Topic: Visual Representation of Data
Visual Representation of Data
Table of Contents
1. Types of Tables
2. Types of Charts
3. Types of Graphs
4. Summary
5. Glossary
Introduction
A picture is worth a thousand words, indeed. A large data set might be meaningless unless it is
organised into tables, charts, or graphs. This topic introduces some of the most common types
of tables, charts, and graphs that are useful in organising and presenting data.
In this topic, we will investigate a few methods of displaying data, some of which are descriptive
only.
Let us first look at the overview of this topic.
The United States government publishes pie charts for taxpayers filing tax returns. It is an effective
way of communicating to the taxpayers what proportion of taxes is collected from different
sources and how the tax revenue is spent.
This communication is more effective than presenting the numbers involved, because there
would be a great many numbers, each with more than ten digits, and they would be difficult
to comprehend.
A comprehension of the facts and the realisation of the effort the government puts into
communicating those facts offer some comfort, however little, to millions of taxpayers.
In many presentations that you are going to make, you might find yourself in a similar
situation: you have to communicate large sets of numerical data to an important audience.
You will then have to use charts, graphs, and tables in your presentation.
In this topic, we will study different types of charts and graphs and their suitability for the type
of data communicated. You will also learn how to use spreadsheet templates to create many
types of charts.
Learning Objectives
At the end of this topic, you will be able to:
distinguish between frequency and pivot tables
evaluate the importance of using different types of charts
assess the significance of using different types of graphs.
1. Types of Tables
The two types of tables normally used to present quantitative data are frequency tables and pivot tables.
Frequency Tables
A frequency table lists the number of observations of a variable that fall into various
categories. For example, suppose that data was collected on the ages of 25 employees of a
particular company. The data is quantitative and could be presented as a frequency table, where
the data is grouped into categories based on age groups as shown in the table below.
There is no set rule for determining the correct number of categories or classes, although
common sense should prevail. As a guide, somewhere between 5 and 15 categories might be
suggested.
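A frequency table of this kind can be built in a few lines of Python. The 25 ages and the class width of 10 years are hypothetical values chosen for illustration; they are not the figures from the original table.

```python
from collections import Counter

# Hypothetical ages of 25 employees (illustrative values only)
ages = [22, 25, 27, 28, 31, 33, 34, 35, 36, 38, 39, 41, 42,
        43, 44, 45, 47, 48, 51, 52, 54, 55, 57, 58, 62]

class_width = 10   # assumed width of each age group


def age_group(age):
    """Return the label of the class that contains the given age."""
    lower = (age // class_width) * class_width
    return f"{lower}-{lower + class_width - 1}"


# Count how many ages fall into each class
frequency_table = Counter(age_group(a) for a in ages)

for group in sorted(frequency_table):
    print(f"{group}: {frequency_table[group]}")
```

With these values the data falls into five classes, which sits at the lower end of the suggested 5 to 15 categories.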
Pivot Tables
One of the most powerful tools in Excel for analysing data is the pivot table. Pivot tables enable us
to summarise data in a variety of ways in order to investigate relationships in the data. Tables
generated using this technique are often called contingency tables or cross tabs. Excel, however,
provides more flexibility with pivot tables than is usually reflected in simple contingency tables.
Contingency tables list only counts, whereas pivot tables can list counts, averages, sums, and other
measures.
As an example, let us say we have been performing some market research to investigate the
characteristics of Internet users. Suppose we were interested in investigating the relationship
between Age and Gender (coded as 1 for female and 0 for male). One of the questions of
interest may have been the proportion of Internet users that were men under thirty. Pivot
tables allow us to group these variables as shown in the table below, where the answer to our
question can be identified with ease.
Table 2: Pivot Table
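The grouping that a pivot table performs can be sketched in Python as a simple cross-tabulation of counts. The survey records and the two age bands below are hypothetical; the gender coding (1 for female, 0 for male) follows the example above.

```python
from collections import Counter

# Hypothetical survey records: (age, gender), gender coded 1 = female, 0 = male
respondents = [(24, 0), (35, 1), (28, 1), (41, 0), (19, 0),
               (52, 1), (27, 0), (33, 0), (22, 1), (45, 1)]


def age_band(age):
    """Assign an age to one of two illustrative bands."""
    return "Under 30" if age < 30 else "30 and over"


# Cross-tabulate age band against gender (a contingency table of counts)
cross_tab = Counter((age_band(age), gender) for age, gender in respondents)

# Proportion of all respondents who are men under thirty
men_under_30 = cross_tab[("Under 30", 0)]
proportion_men_under_30 = men_under_30 / len(respondents)

print(dict(cross_tab))
print(f"Proportion of men under 30: {proportion_men_under_30:.0%}")
```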
2. Types of Charts
Charts may also be used to present data. In this section, we will look at three different types of
charts:
1. Histogram
2. Pie chart
3. Bar chart
Pie and bar charts can be used for both qualitative and quantitative data, while histograms are
for quantitative data.
Read below to find out more about these charts.
Types of Charts
1. Histograms
When data is grouped into categories or classes, we could plot a frequency distribution of the
data. Such a frequency plot is called a histogram.
Visually, a histogram is a chart made up of bars of different heights. The height of each bar
represents the frequency of values in the class represented by the bar. Adjacent bars share
sides. For example, if you refer back to the frequency table given earlier in this topic, we could
represent the data on the ages for the employees as shown in the figure below.
Fig. 1: Histogram
2. Pie Charts
Pie charts are used to show the proportion of different parts that make up the whole. For
example, we can show the proportion of different ethnic groups in a community or the
proportion of different types of expenses within a budget.
The figure below shows a pie chart of the geographical locations of the world's largest
telecommunications companies.
3. Bar Chart
Bar charts use horizontal or vertical rectangles to display categorical data when there is no
emphasis on the percentage of a total represented by each category. The scale of
measurement is nominal or ordinal. A bar chart is a good way to show how different categories
stack up against one another. The figure below shows how a bar chart can be used to display
and interpret operating expenses and revenues of the top US airline carriers.
3. Types of Graphs
Let us now look at two different types of graphs, namely:
1. Scatter plot
2. Line graph
Scatter Plot
In some instances, we are interested in investigating the relationship between two variables.
One way is to prepare a plot where each pair of data points is represented as a point on the
graph. The resulting graph is called a scatter plot as shown in the following figure.
Scatter plots can be generated using either the Excel® Chart Wizard or using the StatTools add-
in. To use the Excel® Chart Wizard, first place your cursor anywhere in the data that you wish to
plot and then choose Insert/Chart. The chart wizard dialog box will then open up and you can
select the type of chart that you wish to insert. Choose the XY Scatter plot graph type and then
click on Next. The chart wizard will show a preview of the scatter plot. If this is correct you can
click on OK and the scatter plot will be inserted into the worksheet.
Alternatively, StatTools can be used to generate a scatter plot. To do this, first place your cursor
anywhere in the data that you wish to plot and choose StatTools/Chart/Scatter plot(s). Then
follow the instructions provided and the scatter plot will be inserted into the worksheet.
A scatter plot is used to identify underlying relationships among variables. For example, if we
have the data on annual sales of a product and the annual advertising budgets for that product
during the same period, we can plot them on a scatter plot to identify any underlying patterns
in the data. One would expect that whenever the advertising budget was high, the sales would
also be high. This relationship can be explored using the scatter plot.
The plot consists of a scatter of points, each point representing an observation. For instance, if
the advertising budget in one year was x and the sales in the same year was y, then a point is
marked on the plot at coordinates (x, y).
The figure above shows a scatter plot of sales versus advertising budget observed over a period
of 12 years measured in thousands of dollars. It shows that whenever the advertising budget
was high, sales were also high. This type of relationship is known as a positive correlation. We
will learn more details about correlation in the topic, “Numerical representation of data”. A
negative correlation between two variables means that when one increases, the other tends to
decrease.
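The direction of such a relationship can also be checked numerically. The sketch below pairs hypothetical advertising and sales figures into (x, y) points and computes the Pearson correlation coefficient, one common measure of linear association; both the data and the choice of this measure are assumptions made for illustration.

```python
import math

# Hypothetical yearly figures in thousands of dollars (illustrative only)
advertising = [10, 12, 15, 17, 20, 22, 25, 27, 30, 32, 35, 38]
sales = [110, 118, 130, 128, 150, 155, 170, 168, 190, 195, 210, 220]

# Each observation is one (x, y) point on the scatter plot
points = list(zip(advertising, sales))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson_r(advertising, sales)
print(f"Correlation between advertising and sales: {r:.3f}")
```

A value close to +1 indicates a strong positive correlation, while a value close to -1 would indicate a strong negative correlation.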
Line Graphs
Line graphs are an effective way to represent the relationship between two variables particularly
when time is involved.
A time plot is an example of a line graph. Time plots display the changes in a variable that
occurred over time. In these graphs, the time variable is plotted along the horizontal axis of the
graph.
4. Summary
Here is a quick recap of what we have learnt so far:
The two types of tables normally used to present quantitative data are frequency tables and pivot tables.
Graphical methods can be used to present statistical information. Some of these methods
are
o Histograms
o Bar charts
o Pie charts
Histograms are very commonly used and enable us to plot a frequency distribution of
grouped data.
A bar chart is a good way to show how different categories stack up against one another.
The difference between a histogram and a bar chart is that in a bar chart, the horizontal axis
is not on a continuous scale.
Pie charts are used to show the proportion of different parts that make up the whole.
A scatter plot is used to identify underlying relationships among variables.
Line graphs such as time plots represent the relationship between two variables.
Excel provides a powerful analysis tool called pivot tables, which enable us to summarise and
present data in a variety of forms.
5. Glossary
Time plot: A graph in which the value of a variable is placed on the vertical axis and
time on the horizontal axis.
Segment: Descriptive Statistics
Topic: Numerical Representation of Data
Numerical Representation of Data
Introduction
Measures of central location, variability, and shape help to describe a data set. Measures of
location reveal centrality of the data set, while variability measures reveal deviance of the data
set from the centre. Measures of shape enable us to compare the shape of a distribution relative
to common distributions found in data. We also introduce an alternative graphical method of
presenting a distribution, known as a boxplot. In addition, it is sometimes useful to measure the
linear association between two variables.
Correlation is a useful measure of this association between the two variables. In this topic, we
will be investigating measures of central location, spread, and shape. In addition, we will also be
exploring a technique to summarise the strength of the relationship between two variables.
Learning Objectives
Secondly, it can serve as a basis for making comparisons – if set A is centred at 10 and set B at
20, it makes sense to say that the numbers in set A are mostly smaller than those in set B.
1. Mean
2. Median
3. Mode
Read below to explore the differences between these three measures of central location using
the sales data set {6, 9, 10, 12, 13, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 20, 21, 22, 24}.
2. Median
Median = value below which half the data points lie
The median is defined such that half the numbers in the set are greater and the other half are
smaller than it.
If the numbers in a data set are sorted in increasing order, then "the number in the middle
position" is clearly the median.
If there are an odd number of numbers, say, nine, then the middle is the fifth position and the
number in the fifth position is the median.
For the odd-sized data set {122, 122, 134, 145, 152, 156, 158, 161, 170} with 9 numbers:
median = 152
But if there are an even number of numbers, say, ten, then there is no middle position.
Rather, the fifth and the sixth positions together form the middle. In this case, the average of
the two numbers in positions five and six is declared the median.
For the data set {122, 122, 134, 145, 152, 156, 158, 161, 170, 175} with ten numbers, the
median is the average of 152 and 156:
median = (152 + 156)/2
median = 154
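The two cases can be expressed as a short Python function. This is a sketch of the rule described above; the function name `median` is our own, and Python's standard `statistics.median` applies the same rule.

```python
import statistics

odd_set = [122, 122, 134, 145, 152, 156, 158, 161, 170]
even_set = [122, 122, 134, 145, 152, 156, 158, 161, 170, 175]


def median(values):
    """Middle value of the sorted data, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: single middle position
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average the middle pair


median_odd = median(odd_set)
median_even = median(even_set)

print(median_odd, median_even)
```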
3. Mode
Mode = most frequently occurring value
The mode is defined as the most frequently occurring value in the set.
In the given data set, the mode is also 16, because it has the largest frequency of 3.
The use of the mode as the centre or a representative value of a data set is better suited for
large data sets, say, with at least a hundred numbers.
To see why, consider the data set
{8, 8, 8, 122, 122, 134, 145, 152, 156, 158, 161, 170, 175}
Its mode is 8, but 8 is hardly a representative value or the centre of the set.
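The three measures can be computed for the sales data set, and for the small data set above, with Python's standard `statistics` module. This is only a quick check of the values discussed; the variable names are our own.

```python
import statistics

sales = [6, 9, 10, 12, 13, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 20, 21, 22, 24]

mean_sales = statistics.mean(sales)       # arithmetic average
median_sales = statistics.median(sales)   # middle of the sorted data
mode_sales = statistics.mode(sales)       # most frequently occurring value

# The small data set in which the mode is not a representative value
small_set = [8, 8, 8, 122, 122, 134, 145, 152, 156, 158, 161, 170, 175]
mode_small = statistics.mode(small_set)
median_small = statistics.median(small_set)

print(mean_sales, median_sales, mode_sales)
print(mode_small, median_small)
```

For the small data set the median lies well inside the bulk of the data, while the mode sits at the extreme low end.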
This can also pose problems. Once again, this is unlikely to occur in a large data set.
The deviations of all the data points from the mean will add to zero. For example, the mean of
the data set {5, 7, 8, 10, 10} is 8. The deviations from the mean are {-3, -1, 0, +2, +2} and these
deviations add to zero.
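This property is easy to verify in Python for the data set above:

```python
data = [5, 7, 8, 10, 10]

mean = sum(data) / len(data)            # mean of the data set
deviations = [x - mean for x in data]   # deviation of each point from the mean
total_deviation = sum(deviations)       # the deviations cancel out

print(mean, deviations, total_deviation)
```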
Using mean
In algebraic calculations that involve products of deviations, many terms become zero and vanish
by virtue of this property. As a result, we get simpler results than if we used median or mode.
For this reason, the mean is used in almost all the techniques we will see in later topics.
Exercise:
Below is an exercise to practise what you have learnt about mean, median, and mode.
Here is a data set of monthly income for a set of ten employees:
1,800; 1,980; 2,000; 2,400; 2,750; 3,100; 3,200; 4,320; 6,750; 12,000.
Question 1: Based on these figures, calculate the mean, median, and mode.
Question 2: Based on these figures, which is the better measure of the central tendency of
data?
1. Mean
2. Median
3. Mode
The nth percentile of a data set is that value below which n% of the data lie. The most common
percentiles used are quartiles.
The first quartile is the 25th percentile where 25% of the observations fall below this
point.
The second quartile is the same as the 50th percentile or the median.
The third quartile is the 75th percentile with 75% of the observations falling below this
point.
Percentiles and quartiles are popular ways to describe the position of a data point within a data
set.
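Quartiles can be computed with `statistics.quantiles` from Python's standard library (available from Python 3.8). The sketch below uses the sales data set from earlier in this topic. Note that several interpolation conventions exist for percentiles, so other tools, including spreadsheet functions, may report slightly different first and third quartiles for the same data.

```python
import statistics

sales = [6, 9, 10, 12, 13, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 20, 21, 22, 24]

# n=4 splits the data into four equal parts and returns the three cut points
q1, q2, q3 = statistics.quantiles(sales, n=4)

print(f"First quartile:  {q1}")
print(f"Second quartile: {q2} (the median)")
print(f"Third quartile:  {q3}")
```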
2. Measures of Variability
There is a popular joke that goes: a statistician is one who would say your body temperature is
normal if your head is in the oven and your feet are in the fridge. Of course, that is only a joke.
Statisticians do pay attention to deviations from the measure of central location, because they
are as important as, or at times even more important than, the average.
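Inter-quartile range
This is the difference between the third and the first quartiles. It is a measure of the spread of
the data:
Inter-quartile range = third quartile - first quartile
Mean absolute deviation
The mean absolute deviation of a data set is the average of the absolute values of the deviations
of all the data points from their mean. The formula is given below:
Mean absolute deviation = Σ|x - x̄| / n, where x̄ is the mean of the data and n is the number of data points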
Variance
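The variance of a set of observations is the average squared deviation of the data points from
their mean. Sample variance is denoted by s² and population variance is denoted by σ², and the
formulae for both are given below:
s² = Σ(x - x̄)² / (n - 1), where x̄ is the sample mean and n is the sample size
σ² = Σ(x - μ)² / N, where μ is the population mean and N is the population size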
Standard deviation
The standard deviation of a set of observations is the (positive) square root of the variance of
the set.
Be aware that the variance and the standard deviation are each calculated slightly differently
depending on whether the data comes from a sample or from a population. Unlike the other
measures of variability, the variance is a squared value; thus, if our data has the dimensions of
kilograms, then our variance will be in kilograms² (kilograms squared). Since the standard
deviation is the square root of the variance, it will have the dimensions of kilograms.
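The measures of variability can be computed with Python's standard `statistics` module, as sketched below for the data set {5, 7, 8, 10, 10} used earlier. The sketch assumes the usual convention that the population variance divides by the number of data points while the sample variance divides by one less than that number.

```python
import statistics

data = [5, 7, 8, 10, 10]   # example data set with mean 8
n = len(data)
mean = statistics.mean(data)

# Mean absolute deviation: average distance of the points from the mean
mad = sum(abs(x - mean) for x in data) / n

population_variance = statistics.pvariance(data)   # divides by n
sample_variance = statistics.variance(data)        # divides by n - 1

population_sd = statistics.pstdev(data)   # square root of the population variance
sample_sd = statistics.stdev(data)        # square root of the sample variance

print(mad, population_variance, sample_variance)
print(round(population_sd, 3), round(sample_sd, 3))
```

The sample variance is always a little larger than the population variance computed from the same numbers, because it divides the same sum of squared deviations by a smaller number.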
The advantage of working with the standard deviation is that it has the same units as the original
data.
We need to keep in mind that two standard deviations cannot be added to produce a meaningful
quantity. On the other hand, two variances can be added, in certain cases, to produce a
combined variance. (We can see the details of this only when we learn about random variables
later.) As a result, you will see that in some instances, people mention the standard deviation of
a data set and in other instances, they mention the variance. You should be wary of which one
is being mentioned and use it accordingly.
The obvious disadvantage with the variance is that it is in squared units (eg, kg²) and therefore,
cannot be compared with quantities in the original unit (eg, kg).
Examples of variability
Example 1
Suppose there is some uncertainty in the rates of return from two investments A and B. Looking
at the past five years' data we see that the returns from A and B are
On average, both A and B returned 10%. However, B's returns have deviated more than A's. We
would, therefore, declare that B is riskier. To quantify how risky an investment's returns are,
it is customary to use the variance or the standard deviation of its past returns. The smaller the
variance, the smaller the risk.
Example 2
Suppose two automatic machines A and B produce pins whose diameters need to be exactly one
inch. Suppose a random sample of five pins is taken from A and another sample from B. Let the
accurate measurements of the diameters of these samples be:
The average diameters of the pins from A and B are both 1.000. But A's pins deviate from 1.000
to a larger extent than B's. Hence B has better quality. Thus, a measure of variability can be
useful for measuring the quality of machined parts. The smaller the variability the better the
quality.
3. Measures of Shape
It is useful to have a measure to describe the shape of a distribution. One method to do this is
to compare the distribution to a symmetrical distribution, since for a symmetrical distribution
that is unimodal, the three measures of central location coincide. If the distribution is not
symmetrical, the distribution is said to be skewed.
Skewness measures the deviation from symmetry, and hence zero skewness means perfect
symmetry. Refer to the two figures below:
Negative skewness indicates there are some large deviations to the left.
Positive skewness indicates there are some large deviations to the right.
One method of investigating the skewness of a distribution is to compare the mean and median
of the data set. If the mean is greater than the median, the data is said to be right skewed (or
positively skewed). Likewise, if the mean is less than the median, the data is said to be left
skewed (or negatively skewed).
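The mean-versus-median comparison can be sketched in a few lines. The function name `skew_direction` and the floor areas below are hypothetical, chosen only to illustrate the rule of thumb described above.

```python
from statistics import mean, median


def skew_direction(data):
    """Compare mean and median as a rough indicator of skewness."""
    m, med = mean(data), median(data)
    if m > med:
        return "right (positive)"
    if m < med:
        return "left (negative)"
    return "no skew indicated"


# HYPOTHETICAL floor areas in square metres; one large home pulls the mean up.
floor_areas = [150, 170, 185, 190, 195, 200, 210, 320]
print("mean:", mean(floor_areas), "median:", median(floor_areas))
print("skew:", skew_direction(floor_areas))
```

This is only a quick check; it indicates a direction and says nothing about how strongly the data are skewed.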
As an example, consider data collected as part of an investigation on new home prices for a
particular building company in a major city in Australia. The investigators collected data on the
price of the new home, the floor area, and several other relevant variables.
Suppose that we were interested in summarising the information on the size of each home. To
do this we can make use of the Descriptive Statistics function in Excel (on the Excel menu bar,
click Data/ Data Analysis/ Descriptive Statistics). In the descriptive statistics window, you will
need to select the relevant 'Input Range'.
If we select our data range and check the 'Summary statistics' box, the output shown in the
table below appears.
As can be seen in the table, the output includes most of the key summary measures we would
be interested in determining.
The median floor area is 193.36 m², which, when compared with the mean, suggests that the
data may be very slightly right, or positively, skewed.
5. Measures of Association
We have been introduced to scatter plots to investigate the relationship between two variables.
It is also useful to summarise the linear relationship between two variables. Two measures useful
in summarising this relationship are:
Covariance
Correlation
Each of these measures describes the strength and direction of the linear relationship between
two quantitative variables.
Covariance
The covariance is essentially the average of the products of the deviations from the mean of
each of the variables. If the two variables tend to vary in the same direction, the covariance will
tend to be positive. Likewise, if the two variables tend to vary in opposite directions, the
covariance will tend to be negative.
As an example, we could consider the selling prices of homes in a particular city. We might expect
the covariance between the selling price and the distance from the city centre to be negative.
Likewise, we would expect the price of a house to increase with the size of the house. In
this case, we would expect the covariance between the price and the size of the house to be
positive.
One limitation of using the covariance as a measure of association is that it is affected by the
units in which the two variables are measured. For example, suppose you have been told that
the covariance of two variables is 250. The sign is positive which suggests that the two variables
tend to vary in the same direction, but the magnitude gives us little information as to the
strength of this relationship. To overcome this problem, we can derive another measure of
association to describe the strength of the relationship.
Correlation
This measure is the coefficient of correlation and is obtained by dividing the covariance by the
product of the standard deviations of the two variables. The resulting measure is a unitless quantity
that is unaffected by the units of measurement of each of the two variables.
The coefficient of correlation is always between -1 and +1. The closer it is to either of these
extremes, the closer the points in a scatter plot are to some straight line, either in a positive or
negative direction. Likewise, a coefficient of correlation close to 0 indicates that there may be
no linear relationship between the two variables.
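The contrast between covariance and correlation can be checked numerically. The sketch below uses hypothetical floor areas and prices, and assumes the sample convention (dividing by n - 1) for the covariance; a tool that divides by n would give a slightly different covariance but the same correlation.

```python
from math import sqrt


def covariance(x, y):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def correlation(x, y):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))


# HYPOTHETICAL data: floor area (square metres) and price (thousands of dollars).
area_m2 = [150, 180, 200, 240]
price_k = [300, 350, 400, 470]
price_dollars = [p * 1000 for p in price_k]  # same prices, different unit

cov_k = covariance(area_m2, price_k)
cov_dollars = covariance(area_m2, price_dollars)
r_k = correlation(area_m2, price_k)
r_dollars = correlation(area_m2, price_dollars)

print("covariance (thousands):", cov_k)
print("covariance (dollars):  ", cov_dollars)  # 1000 times larger
print("correlation (thousands):", round(r_k, 4))
print("correlation (dollars):  ", round(r_dollars, 4))  # unchanged
```

Changing the price unit rescales the covariance but leaves the coefficient of correlation untouched, which is why the latter is easier to interpret.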
As an example, consider the new home price data collected in the previous example. Suppose
that the investigators were interested in the relationship between the price of a new home and
its floor area. The figure below shows this data represented graphically as a scatter plot.
The covariance between these two variables is 1234011. This value can be derived by using the
covariance data analysis option in Excel (Data/Data Analysis/Covariance). This figure is positive,
which suggests that there is a positive relationship between these two variables; however, the
magnitude of this result is difficult to interpret unless we know the units. The coefficient of
correlation for this relationship is 0.9842. As we discussed previously, this indicates that there is
a positive relationship between these two variables and as the value is close to 1 it also suggests
that there is a very strong linear component in the relationship between these two variables (as
can be seen from the above figure).
Exercise
Below is an exercise to practise your knowledge of correlation.
Data on employee age and salaries for three different companies, Company 1, Company 2, and
Company 3, are shown using three scatterplots, which represent the relationship between
employee age and salary.
Question 1: Refer to the diagram "Company 1". Which of the following is the correlation
coefficient for company 1?
1. 0.343
2. 0.858
3. 0.994
4. 0.996
Question 2: Refer to the diagram "Company 2". Which of the following is the correlation
coefficient for company 2?
1. -0.023
2. -0.026
3. -0.799
4. 0.434
Question 3: Refer to the diagram "Company 3". Which of the following is the correlation
coefficient for company 3?
1. -0.062
2. -0.532
3. 0.858
4. 0.860
As an example, suppose that a pizza store has four delivery drivers, and would like to investigate
the times taken to deliver pizzas by each of these drivers. The manager of the store might take
a sample of the deliveries by each of these drivers and then produce histograms of the delivery
times for each driver.
The boxplots represent the delivery times for the four pizza drivers. The variable along the
bottom of the graph is time, and the four boxplots are stacked on top of each other,
allowing easy comparisons.
If we consider any one of the boxplots in the figure, we can see a box in the centre of each plot.
The left and right sides of the box are the first and third quartiles respectively. Therefore, the
length of each box represents the interquartile range (the height of each box has no
significance). The vertical line in each box represents the location for the median, while the point
inside the box represents the mean.
Horizontal lines are also drawn from each side of the box. They extend to the most extreme
observations on each side. Typically, these lines on each side of the box are drawn out no further
than 1.5 interquartile ranges from each side. Points further out than this are considered outliers
and are represented by dots (as on the right-hand side of the lower box in the above figure).
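The quantities that a boxplot draws can be computed directly. The sketch below is one possible reading of the description above: `boxplot_summary` and the delivery times are hypothetical, and the quartiles come from Python's `statistics.quantiles` with its default method, which may differ slightly from the quartile convention used by Excel or other packages.

```python
from statistics import median, quantiles


def boxplot_summary(data, whisker=1.5):
    """Quartiles, IQR, whisker ends and outliers for one boxplot."""
    values = sorted(data)
    q1, _, q3 = quantiles(values, n=4)  # default 'exclusive' method
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median(values),
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# HYPOTHETICAL delivery times in minutes for one driver.
delivery_minutes = [12, 14, 15, 15, 16, 17, 18, 19, 20, 35]
summary = boxplot_summary(delivery_minutes)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these figures the 35-minute delivery lies more than 1.5 interquartile ranges beyond the third quartile, so it would be drawn as a separate dot rather than stretching the right-hand whisker.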
In this case, it can be seen that the delivery times for driver four (the top boxplot) are, on
average, longer. There may be several reasons for this and we would need to collect more
information before making a decision. One obvious reason could be that driver four drives
more slowly. Another could be that driver four delivers pizzas to areas that are further from the store.
7. Summary
Here is a quick recap of what we have learnt so far:
To describe a large data set, it is convenient to describe a measure of its central tendency
and describe how the data points deviate from this central tendency.
Measures of central location (mean, median, and mode) help to describe the centre of the
data set.
Measures of variability – inter-quartile range, mean absolute deviation, variance, and
standard deviation – help to describe deviations from the centre.
Skewness and similar measures can be used to describe the shape of the data sets.
Tools such as Excel can be used to generate summaries of relationships.
Covariance and the correlation coefficient can be used to provide a measure of the strength
of the linear relationship between two variables.
Boxplots are a useful method of graphically investigating the distribution of one or more
variables.
8. Glossary
Boxplot: A plot that describes the distribution of a data set, with a box in the middle and a
whisker on each side. The box denotes where the middle half of the data lies, and the whiskers
show the extent of the first and last quarters. Outliers are separated out to limit the length of
the whiskers. The median is marked inside the box.
Percentile: The nth percentile of a data set is that value below which n% of the data points lie.
Quartile: The first quartile of a data set is that value below which a quarter of the data points
lie. It is the same as the 25th percentile. The third quartile is that value below which three
quarters of the data lie. It is the same as the 75th percentile.
Inter-quartile range: The inter-quartile range of a data set is the difference between its first
and third quartiles.
9. Answers
Exercise: Measures of Central Location
Question 1: Correct answer is shown in the table below:
Mean is the average of the ten numbers in the data set, i.e.
Median is defined such that half the numbers in the set are greater than it, and half are less than
it. In this case there are 10 numbers, so the fifth and sixth numbers are added and averaged to
get the median, i.e.,
Mode is the most frequently occurring value, which, in this case, is absent from the data set;
this is an instance of a case where there is no unique mode.
The data supplied contains one outlier (12000). In this case it gives a mean which is considerably
higher than the typical value in the dataset. In such cases the median might be a better measure.
The relationship seems to be quite a strong positive linear relationship and we would expect
the correlation coefficient to be positive and reasonably close to one.
Although not quite a linear relationship, the relationship seems to be a moderately strong
negative relationship and we would expect the correlation coefficient to be negative and
reasonably large.
The linear relationship seems to be very weak and we would expect the correlation coefficient
to be close to zero.
Segment: Probability and Uncertainty
Topic: Assigning Probabilities to Events
Assigning Probabilities to Events
Table of Contents
Introduction
Introduction to probability
Suppose your company has received an order for five custom-designed microchips at a price
of US$7,500 each.
There is no chip in stock, and the required five have to be produced one by one. The
production process is complex and every time a chip is produced, there is only a 2/3 chance
that it will be defect-free. Each trial will cost US$2,700 and there's a fixed setup cost of
US$14,800.
Is this order lucrative enough to accept, or is it too risky?
Is it possible to quantify the risk so that a decision can be made systematically?
Problems such as this one are common, especially in high-tech industries.
In order to solve this problem, one has to know the concepts of probability, random variables,
and probability distributions.
Learning Objectives
1. Basic Concepts
1. Random experiments
2. Sample space
3. Events
A random experiment is a process that can result in one of a number of possible outcomes. This
outcome cannot be predicted with certainty.
The list of possible outcomes of a random experiment is called the sample space and is denoted
as S. An example of a sample space is the list of possible outcomes for a tender that three
competing companies have submitted bids for. In this case, the sample space could be
represented as {Company 1 wins, Company 2 wins, Company 3 wins}.
Individual outcomes of a random experiment are called simple events. A simple event in the
tender example might be 'Company 2 wins'. A collection of one or more simple events is called
an event. For example, we could define the event 'Company 2 or Company 3 wins the tender'.
It is typical to represent events using capital letters. We can then define the probability that
event A occurs as P(A).
Assigning probabilities to events is not always an easy task, particularly in the business
environment. There are, however, three distinct approaches to determining the probability that
a particular event might occur. These three approaches are:
1. Classical approach
2. Relative frequency approach
3. Subjective approach
Subjective approach
There are, however, situations where we do not have access to previous records or repetitions
of a particular event on which to base our probability assessment. Consider, for example,
trying to estimate the probability that the launch of a new computer product will be
Whichever of the three methods of assigning probabilities is used, the probabilities assigned to
simple events must satisfy two basic rules:
1. The probability of each simple event must lie between 0 and 1
2. The probabilities of all the possible simple events that could occur must add up to 1
The probability of an event is the likelihood of the occurrence of that event. It is a
number between 0 and 1 inclusive:
0 ≤ P(A) ≤ 1
Exercise:
See below for an exercise to practise what you have just learnt about the three approaches to
assigning probabilities.
Q1. Suppose that your company was considering introducing a new product into the
marketplace. A survey of 400 potential customers found that 280 of them would purchase the
product. As the marketing manager, you have been asked to estimate the probability that a
randomly selected customer would purchase the product.
Which one of the methods below would you most likely use to estimate the probability that
a randomly selected customer would buy the product?
1. Classical approach
2. Relative frequency approach
3. Subjective approach
Q2. Suppose that you were submitting a tender for the supply of one of your products to a
government department. This will be the first time your firm has submitted a tender. Which
one of the following methods would you most likely use to estimate the probability your firm
will win the tender?
1. Classical approach
2. Relative frequency approach
3. Subjective approach
Q3. Suppose that the tax department made a statement in the press that they were going to
investigate 20% of all companies this year. Which one of the following methods would you
most likely use to estimate the probability that your firm will be investigated?
1. Classical approach
2. Relative frequency approach
3. Subjective approach
We also need to know how to combine events to form new events. Three probability operations
that allow us to do this are:
1. Union of events
2. Intersection of events
3. Complement of events
Union of events
The union of any two events, A and B, which can be denoted as A or B, is the event consisting
of all simple events in A or B or both. This probability can be written as:
P (A or B) = P (A occurs or B occurs or both occur)
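The three operations listed above map naturally onto set operations. The sketch below is only an illustration: it assumes a hypothetical random experiment (one roll of a fair die) in which every simple event is equally likely, so that a probability is a simple count ratio; the events `A` and `B` are likewise invented for the example.

```python
from fractions import Fraction

# HYPOTHETICAL experiment: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # event: the roll is even
B = {5, 6}     # event: the roll is greater than 4


def prob(event):
    """Probability of an event when all simple events are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


p_union = prob(A | B)              # A or B (or both)
p_intersection = prob(A & B)       # A and B
p_complement = prob(sample_space - A)  # not A

print("P(A or B)  =", p_union)
print("P(A and B) =", p_intersection)
print("P(not A)   =", p_complement)
```

The union `{2, 4, 5, 6}` counts the outcome 6 only once, even though it belongs to both events.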
We will make use of these methods of combining probabilities in the topic, “Conditional
probability”.
4. Summary
5. Glossary
6. Answers
Segment: Probability and Uncertainty
Topic: Conditional Probability
Conditional Probability
Table of Contents
Introduction
Probabilities are normally assessed relative to the information that we have available. As new
information becomes available, however, probabilities can change. A formal way to revise
probabilities based on new information is to use conditional probabilities.
In this topic, we will explore conditional probability and how we can use it to update the
probability that an event might occur.
Learning Objectives
At the end of this topic, you will be able to:
describe the concept of conditional probability
calculate conditional probabilities.
P (A | B)
Consider two events A and B such that P(B) > 0. Then we can define the conditional probability
that A occurs, given that B has already occurred, as:
P(A | B) = P(A and B) / P(B)
about their beer drinking preferences. The results from this survey could then be summarised in
a contingency table as shown in the table below.
Suppose that we were interested in the probability that a randomly selected person was a white-
collar worker. This probability can be obtained by looking at the totals in the margins of the table.
In this case, there were 150 white-collar workers in the total sample of 400 people surveyed,
which suggests that
P(White-collar) = 150/400 = 0.375
Likewise, we may be interested in the probability that a randomly selected worker preferred to
drink light beer. In this case
Now consider if we were interested in determining the probability that a randomly selected
white-collar worker preferred to drink light beer. This is the conditional probability that the
randomly selected worker prefers light beer given that they were a white-collar worker
You might notice in this case, that our knowledge that the randomly selected worker was a
white-collar worker changed the probability that this person was a light beer drinker.
Likewise, we found that the probability that a randomly selected worker preferred light beer
given that they were a white-collar worker is:
In other words, knowing that the event 'White-collar' occurred changes the probability that
'Prefers light beer' occurred. In such situations, events 'Prefers light beer' and 'White-collar' are
called dependent events.
Alternatively, if the occurrence of one event does not change the probability that the other
event occurs then the events are called independent events.
Formally, two events A and B are independent if
P(A | B) = P(A)
or, alternatively
P (B | A) = P(B)
If this is not the case, then events A and B are said to be dependent.
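The conditional probability and the independence check can be sketched from a contingency table. The original table is not reproduced here: only the 150 white-collar workers and the total of 400 come from the text, while every other count, the 'regular' beer category and the 'blue-collar' label are hypothetical values invented for illustration.

```python
from fractions import Fraction

# HYPOTHETICAL cell counts. Only the white-collar total (150) and the
# overall total (400) are taken from the text.
counts = {
    ("white-collar", "light"): 90,
    ("white-collar", "regular"): 60,
    ("blue-collar", "light"): 70,
    ("blue-collar", "regular"): 180,
}
total = sum(counts.values())


def p_worker(worker):
    """Marginal probability of a worker type (row total / grand total)."""
    return Fraction(sum(n for (w, _), n in counts.items() if w == worker), total)


def p_beer(beer):
    """Marginal probability of a beer preference (column total / grand total)."""
    return Fraction(sum(n for (_, b), n in counts.items() if b == beer), total)


def p_beer_given_worker(beer, worker):
    """P(beer | worker) = P(worker and beer) / P(worker)."""
    return Fraction(counts[(worker, beer)], total) / p_worker(worker)


p_white = p_worker("white-collar")
p_light = p_beer("light")
p_light_given_white = p_beer_given_worker("light", "white-collar")
independent = p_light_given_white == p_light

print("P(White-collar) =", float(p_white))
print("P(Light) =", float(p_light))
print("P(Light | White-collar) =", float(p_light_given_white))
print("Independent?", independent)
```

With these invented counts the conditional probability differs from the unconditional one, so the two events would be classed as dependent.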
Question 1: Given the information above, what is the probability that a randomly selected
person was female?
1. 0.25
2. 0.392
3. 0.4
4. 0.5
Question 2: Given the information above, what is the probability that a randomly selected
person with a university degree was female?
1. 0.3875
2. 0.3965
3. 0.4225
4. 0.4435
4. Summary
Here is a quick recap of what we have learnt so far:
Probabilities may change as new information becomes available. A formal way to revise
probabilities on the basis of new information is to use conditional probabilities.
Two events can be independent or dependent. If the probability of one event is changed by
the knowledge of the other event, then the events are said to be dependent events.
If the knowledge of one event does not change the probability of the other event occurring,
then the events are known as independent events.
5. Answers
Exercise: Independent and Dependent Events
Segment: Probability and Uncertainty
Topic: Rules of Probability
Rules of Probability
Table of Contents
Introduction
We have looked at conditional probability and how we can use the knowledge of whether an
event has occurred to revise our probability that some other event might occur.
Now that we have some basic knowledge of simple probability events, we can consider how we
can use some rules of probability to calculate the probability of more complex events occurring.
Learning Objectives
At the end of this topic, you will be able to:
apply the following rules to calculate the probabilities of more complex events occurring
o complement rule
o addition rule
o multiplication rule
1. Complement Rule
The simplest probability rule involves the complement of an event and is based on the
requirement that the sum of probabilities assigned to simple events in a sample space must sum
to 1.
Given any event A, the complement of event A is the event that A does not occur. If the
probability of event A occurring is P(A), then the probability of the complement, written here
as P(not A), is given by
P(not A) = 1 – P(A)
For example, suppose that the probability that a randomly selected customer entering your store
will purchase an item is 0.2 or 20%. Given this, we can determine the probability that the
randomly selected customer will not purchase an item from your store as
P(no purchase) = 1 – 0.2 = 0.8 or 80%
Although the complement rule is very simple, it is a very useful rule. If we are trying to calculate
the probability that an event will not occur, it is sometimes much easier to calculate the
probability that an event will occur and then subtract that probability from 1 in order to obtain
the probability we are after.
Exercise:
Q1. In a particular community, there are children in 40% of the households. If a household is
selected at random from this community, what is the probability that there are no children in
this household?
1. 0.4
2. 0.5
3. 0.6
4. 1.0
2. Addition Rule
The addition rule enables us to calculate the probability of the union of two events given the
probabilities of the individual events and of their intersection. For example, suppose that a
marketing company was researching the use of Sport Utility Vehicles (SUVs) and would like to
know whether there was any relationship between gender and the ownership of SUVs.
The marketing company might randomly survey 200 drivers and determine whether they owned
an SUV or some other vehicle.
In particular, suppose that one research question of interest was to determine the probability
that a randomly selected driver was female or owned an SUV. In other words, the marketing
company was interested in determining the following probability
P(Female or SUV)
To calculate this probability, we need to make use of the addition rule, which for two events
A and B is given as
P(A or B) = P(A) + P(B) – P(A and B)
To understand this probability rule, suppose that the results of the survey were presented as in
the table below.
Table 1: Contingency Table
We can see from this table that the probability that a randomly selected driver is either female
or drives an SUV is given by the circled areas. One of these circled areas represents the number
of females in the sample, while the other circled area represents the number of drivers who own
an SUV. The area covered by both circles represents the number of drivers who were female and
drove an SUV.
From this diagram, we can see that we have double counted the number of people who are
female and who also own an SUV. Therefore, we must subtract this number from our result.
The probability that a randomly selected driver is either female or drives an SUV is then given
by
P(Female or SUV) = 50/200 + 80/200 – 10/200 = 0.6 or 60%
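The calculation above can be reproduced with exact fractions. The contingency table itself is not available here, so the sketch assumes, from the order of the terms in the worked formula, that 50 of the 200 drivers were female, 80 owned an SUV, and 10 were both; the function name `p_union` is hypothetical.

```python
from fractions import Fraction


def p_union(p_a, p_b, p_a_and_b):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


total = 200
p_female = Fraction(50, total)          # assumed: 50 female drivers
p_suv = Fraction(80, total)             # assumed: 80 SUV owners
p_female_and_suv = Fraction(10, total)  # assumed: 10 female SUV owners

p_female_or_suv = p_union(p_female, p_suv, p_female_and_suv)
print("P(Female or SUV) =", p_female_or_suv, "=", float(p_female_or_suv))

# For mutually exclusive events the overlap is zero, so the rule
# reduces to a plain sum.
p_exclusive = p_union(Fraction(1, 4), Fraction(1, 3), Fraction(0))
print("Mutually exclusive case:", p_exclusive)
```

Subtracting the overlap once is what corrects the double counting of drivers who are both female and SUV owners.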
There are situations, however, when events are mutually exclusive. We say that two events are
mutually exclusive if at most one of them can occur. As an example, consider the following three
events concerning a company's profit in the following year:
Clearly only one of these events can occur and they are mutually exclusive. They are also
exhaustive, which means that they exhaust all possibilities (at least one of these three events
must occur).
If two events A and B are mutually exclusive then P(A and B) = 0. Therefore, the addition rule
for mutually exclusive events becomes P(A or B) = P(A) + P(B).
3. Multiplication Rule
Another probability rule is the multiplication rule, which is used to find the probability of a joint
event. This rule is obtained by rearranging the conditional probability rule. By doing this, we
obtain the following for any two events A and B
P(A and B) = P(A) * P(B | A)
Or alternatively
P(A and B) = P(B) * P(A | B)
It should be noted that both these expressions of the multiplication rule are equivalent. The form
of the rule used will depend on the information we have been supplied with.
In the special case where the two events A and B are independent, such that P(B | A) = P(B), then
we can express the multiplication rule more conveniently as
P(A and B) = P(A) * P(B)
Exercise:
See below for an exercise to practise your knowledge of the multiplication rule.
Q1. In the same community referred to earlier, where there are children in 40% of the
households, two households are chosen at random. What is the
probability that there are no children in either household?
1. 0.16
2. 0.24
3. 0.36
4. 1.00
4. Summary
Here is a quick recap of what we have learnt so far:
The complement rule is a useful rule that can be used to find the probability that an
event will not occur.
The addition rule can be used to find the probability of the union of two events.
Mutually exclusive events are events in which if one occurs the others cannot occur.
The multiplication rule can be used to find the joint probability of two events.
5. Answers
Exercise: Complement Rule
Q1: The correct answer is option 3, 0.6
Exercise: Multiplication Rule
Q1: The correct answer is option 3, 0.36
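Both answers can be checked in a few lines. The sketch below assumes that the two randomly chosen households are independent of each other, which is what allows the simpler form of the multiplication rule to be used; the variable names are hypothetical.

```python
from fractions import Fraction

p_children = Fraction(40, 100)  # children in 40% of households

# Complement rule: P(no children) = 1 - P(children)
p_no_children = 1 - p_children

# Multiplication rule for independent events (assumed independence):
# P(no children in either) = P(no children) * P(no children)
p_neither = p_no_children * p_no_children

print("P(no children) =", float(p_no_children))
print("P(no children in either household) =", float(p_neither))
```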
Segment: Probability and Uncertainty
Topic: Probability Trees
Probability Trees
Table of Contents
Introduction
We have looked at the basic rules of probability and how these can be combined to calculate
more complex probabilities. This topic introduces a graphical method of representing
probabilities that can be extremely useful when dealing with more complex problems.
Learning Objective
At the end of this topic, you will be able to:
draw and interpret probability trees to assist in calculating probabilities.
1. Probability Trees
A very useful method of calculating probabilities is by using probability trees, where branches or
lines of the tree represent the various probabilities of certain possible events. Probability trees
are a very useful device to ensure that we have identified all simple events and assigned
probabilities to them.
The concept of probability trees is best illustrated by considering the following example.
Suppose that you are the owner of a video game store and have recorded the purchasing
behaviour of your customers over a period of time. From the records, you have established that
the probability that a customer will buy an X-Box console is about 15%. You are also aware that
a customer tends to buy an X-Box game 50% of the time when they purchase an X-Box console,
but only 5% of the time when they do not purchase an X-Box console.
Given this information we might be interested in determining the probabilities that a randomly
selected customer will buy one of the following combinations:
2. A game
We could use our knowledge of probability rules we developed in the previous topics to calculate
these probabilities. However, if the process of observing the results of an experiment can be
broken down into a series of stages, the various sequences of events can be analysed using
probability trees. Using probability trees can make the task of determining these
probabilities much simpler.
The probability tree for the video store scenario is shown in the figure below:
The probability associated with each branch is the conditional probability that a particular
branch outcome will occur, given that the outcomes represented by the previous branches have
occurred.
First probability
Referring to the X-Box probability tree, we might be interested in determining the first possible
outcome, which is the probability that a randomly selected customer will buy an X-Box and will
buy a game. This probability is found by following the top branches along the probability tree.
To calculate this probability, we can multiply the probability associated with the first branch by
the probability of the second branch as we move along the branches of the probability tree.
This probability is
P(X-Box and Game) = P(X-Box) * P(Game | X-Box) = 0.15 * 0.50 = 0.075 or 7.5%
Likewise, we can move through each of the possible branches of the tree and multiply the
probabilities to obtain the joint probabilities at the end of each of the possible branches. You
might also notice that we are using the multiplication rule we discussed in the previous topic,
"Rules of Probability".
Now that we have the X-Box probability tree, we can calculate the probabilities that we are
interested in. The first of these was the probability that a randomly selected customer will buy
an X-Box console and a game, or alternatively P(X-Box and game). This probability can be seen
directly from the end branch of the tree as 0.075 or 7.5%.
Second probability
The second probability that we were considering was the probability that a customer will buy a
game. You might notice that this is the case when the customer either buys an X-Box and a
game or does not buy an X-Box but does buy a game.
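The tree calculations described above can be sketched in a few lines of Python. The branch probabilities used here are illustrative assumptions only, since the figure is not reproduced in this text; they were chosen so that the top path gives the joint probability of 0.075 quoted above.

```python
# Hypothetical branch probabilities (assumptions, not read from the figure).
p_xbox = 0.15                 # first stage: customer buys an X-Box
p_game_given_xbox = 0.50      # second stage: buys a game, given an X-Box
p_game_given_no_xbox = 0.10   # second stage: buys a game, given no X-Box

# Multiply along each path of the tree to get the joint probabilities.
joint = {
    ("X-Box", "Game"): p_xbox * p_game_given_xbox,
    ("X-Box", "No game"): p_xbox * (1 - p_game_given_xbox),
    ("No X-Box", "Game"): (1 - p_xbox) * p_game_given_no_xbox,
    ("No X-Box", "No game"): (1 - p_xbox) * (1 - p_game_given_no_xbox),
}

# P(Game) adds the two end branches in which a game is bought.
p_game = joint[("X-Box", "Game")] + joint[("No X-Box", "Game")]

for path, prob in joint.items():
    print(path, round(prob, 4))
print("P(Game) =", round(p_game, 4))
```

Because the four end branches cover every possible sequence of events, their joint probabilities should add up to 1, which is a useful check on any tree.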
The benefits and advantages of probability trees become apparent when we are dealing with
more complex problems with many alternative outcomes. In these cases, probability trees
greatly simplify the assignment of probabilities of combinations of events.
As a challenge for yourself, you might wish to calculate the other two probabilities that we
were interested in determining, i.e., the probability that a customer will buy
2. Summary
Here is a quick recap of what we have learnt so far:
Probability trees can be used to represent the possible outcomes of an experiment when the
process can be broken down into a series of stages.
Probability trees consist of a series of nodes and branches emanating from each of these
nodes.
Probability trees can greatly simplify more complex probability problems.
Segment: Probability and Uncertainty
Topic: Bayes' Rule
Bayes' Rule
Introduction
We have examined the conditional probability that an event A will occur given that some event
B has occurred.
In this topic, we investigate an alternative method of calculating these conditional probabilities.
The method can be used to revise probabilities that an event will occur on the basis of new
information being acquired.
Learning Objective
At the end of this topic, you will be able to:
apply Bayes' rule to revise probabilities.
1. Bayes' Rule
Let us first consider the following business scenarios.
Business Scenarios
Scenario 1
Assume you are the human resource manager for a large company and are responsible for
recruiting junior managers. From previous records, you are aware that 40% of the junior
managers that you hire turn out to be good managers. Suppose then, that you introduce an
aptitude test in order to better screen applicants for the management jobs. According to the
developers of the aptitude test, there is a 90% chance that an applicant will do well on the
test, given that they would be good managers. Likewise, they estimate that there is an 80%
chance that they will do poorly on the test given that they would not be good managers.
Scenario 2
Imagine you are the loans manager for a particular bank. You know from previous records that
5% of customers will default on a loan. Suppose that you also know from your records that
the probability that a customer was late on payments at least twice in a situation where the
customer defaulted was 95%. Likewise, you determine that the probability that a customer
was late on payments at least twice in a situation where the customer did not default was
15%.
Assume that a customer has just been late on payments for a second time.
What is the probability that this customer will ultimately default on the loan?
Calculating probabilities
In order to determine the probabilities in the two scenarios above, we need to understand how
to revise our probability estimates based on new information being obtained. The conditional
probability that a particular event A will occur, given that event B has occurred, is given by
P(A | B) = P(A and B) / P(B)
Bayes' theorem provides us with an alternative formula for calculating this conditional
probability. The alternative formula simplifies the calculation of P(A|B) when P(A and B) and P(B)
are not given directly.
Essentially, Bayes' rule is a method for updating probabilities as new information becomes
available.
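As a rough illustration, the two business scenarios can be worked through in Python. The sketch assumes the usual two-event form of Bayes' rule, P(A | B) = P(A) P(B | A) / [P(A) P(B | A) + P(not A) P(B | not A)]. The function name posterior is hypothetical, and the question posed for Scenario 1 (the chance that an applicant who does well on the test turns out to be a good manager) is an assumption, since the scenario does not state one.

```python
def posterior(prior, p_evidence_given_event, p_evidence_given_not_event):
    """Return P(event | evidence) using the two-event form of Bayes' rule."""
    joint_event = prior * p_evidence_given_event
    joint_not_event = (1 - prior) * p_evidence_given_not_event
    return joint_event / (joint_event + joint_not_event)

# Scenario 1: P(good) = 0.40, P(does well | good) = 0.90,
# P(does well | not good) = 1 - 0.80 = 0.20 by the complement rule.
p_good_given_well = posterior(0.40, 0.90, 1 - 0.80)

# Scenario 2: P(default) = 0.05, P(late twice | default) = 0.95,
# P(late twice | no default) = 0.15.
p_default_given_late = posterior(0.05, 0.95, 0.15)

print(round(p_good_given_well, 4))     # 0.75
print(round(p_default_given_late, 4))  # 0.25
```

In Scenario 2, P(late twice | default) is 95% but P(default | late twice) is only 25%, because defaults are rare to begin with. The two conditional probabilities should not be confused.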
2. An Example
Let us review the calculation of a probability using Bayes’ rule in the example below.
To find this probability, we need to consider the information that we have already been given.
Let us consider the three types of probabilities in this example.
Prior Probabilities
1. The first of these is the probability that a customer bought an X-Box. These are the initial
probabilities regarding whether or not the customer purchased an X-Box. From the previous
example, this probability is
2. Given that, we then know that the complement of this probability, i.e., the probability that
the customer did not buy an X-Box is
These two probabilities are known as prior probabilities because they represent the chance that
the customer buys an X-Box prior to our knowledge of whether they also bought a game.
Conditional Probabilities
3. In addition, we have also been given the probabilities that a randomly selected customer will
buy a game given that they have bought an X-Box, and also the probability that they buy a
game given that they did not buy an X-Box. These probabilities are given as
And
Knowing these two probabilities, we can also use the complement rule to find the following
probabilities
And
The above four probabilities are the conditional probabilities of game buying behaviour given the
X-Box purchasing behaviour.
The probabilities that we would like to know are the revised or posterior probabilities. These are
the probabilities that we are interested in.
In our case, it is the probability that the customer has bought an X-Box given that they have
bought a game, which can be written as
This probability is called a posterior probability since it is assessed after we have gained
information on whether or not the customer bought a game. This probability can be
obtained from the following
Therefore, the probability that a randomly selected customer will buy an X-Box, given that they
have bought a game is
As managers, we need to be aware of this or we might fall into the trap identified by
psychologists as 'the confusion of the inverse'. The basic problem here is that people tend to
confuse the conditional probability P (A | B) with the conditional probability P(B | A).
Reading: What Educated Citizens Should Know About Statistics and Probability
You may wish to read the following article for further information on the confusion of the
inverse problem and other statistical issues:
Utts, J. "What Educated Citizens Should Know About Statistics and Probability". The American
Statistician. Vol. 57, No. 2, 2003: 74–79.
3. Summary
We sometimes need to revise the probability of an event occurring on the basis of new
information being obtained.
Given initial probabilities known as prior probabilities, we can use our knowledge of the
conditional probabilities to arrive at our revised or posterior probabilities.
Bayes' rule can be used to simplify these calculations in many cases.
Segment: Probability Distributions
Topic: Random Variables
Random Variables
Introduction
Random variables are uncertain quantities for which information on their probabilities is available.
By using the mean and variance of a random variable in our computations, we can handle
decision-making under uncertain conditions effectively.
A random variable can be a discrete variable, such as the number of customers in a queue or the
number of defectives in a shipment of parts, or a continuous variable, such as the weight of a
part, or the volume of cola in a bottle.
Random Variables
When a manager prepares to make a decision, he/she may find that some of the data needed
for the decision are not known for sure. But the decision has to be made anyway.
Suppose a production manager has to decide how much to produce of an important product
during the next month. How much to produce may well depend on how much can be sold.
And how much can be sold is usually not known with certainty. It is a random variable. Many
business decisions involve random variables that have to do with uncertainties such as
demand, competition, economy, government actions, or even the weather.
Applying the concepts of probability to every random variable involved in business decisions
can become tedious. Fortunately, many random variables that occur in practice follow some
well-known standard distributions with well-known properties. Knowledge of the standard
distributions, and the use of appropriate spreadsheet templates can make the computations
far less tedious.
Learning Objectives
At the end of this topic, you will be able to:
describe the concepts of random variables and probability distributions
distinguish between the two types of random variables, discrete and continuous.
1. Random Variables
A random variable is a variable that takes on numerical values depending on the outcome of an
experiment.
We also need to make the distinction between a random variable and the values it can assume.
By convention, we use uppercase letters such as X to denote a random variable. A possible value
of X is denoted by lowercase letters such as x. The symbols used to represent random variables
are:
As an example, consider the number of phone calls that might be received by your cell phone
on any particular day. The actual number of phone calls you might receive on any particular day
is uncertain and will vary from day to day. If we denote the number of phone calls you receive
on any particular day by X, then X is a random variable whose outcomes will vary.
1. A random variable is discrete if it can assume only a countable number of possible values.
2. A random variable that can assume an uncountable number of values is classified as
continuous.
A discrete random variable is a variable for which the number of possible values is finite or
countable. For each one of these possible values, the probability that the random variable will
assume that value must be known.
For example, the number of customers staying at a hotel on any particular day is a discrete
random variable. The number of e-mails that your company receives every day is another
example.
To be a valid probability distribution for a discrete random variable, the probabilities of individual
values must not be negative, and they must add up to 1.
In contrast, continuous random variables can take on a number of possible values that are
uncountable and infinite.
For example, temperature is a continuous random variable, since it can be measured to any
degree of precision, such as 72.00340981136 degrees. Weight, height, and time are other
examples of continuous random variables.
The difference between discrete and continuous random variables is illustrated in the following
figure
Once we know the possible values of a random variable and the probabilities associated with
each particular outcome, we have the probability distribution for our random variable.
A bar chart or a table can show the probabilities of the possible values of a
discrete random variable. This table or chart is called a discrete probability distribution.
As an example, consider the number of girls born at a hospital where four births have taken
place. The random variable in this example is discrete, with the possible outcomes being from 0
to 4 girls. Assuming that there is a 50% chance of a girl being born in each birth, the possible
outcomes and the probabilities attached to each possible outcome are shown in the chart below.
It is important to note that the sum of the probabilities of all the possible outcomes is one.
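Since the chart is not reproduced here, the distribution for the four-births example can be rebuilt with a short Python sketch. It assumes, in addition to the 50% chance stated above, that the four births are independent of one another; the variable names are illustrative.

```python
from math import comb

n_births = 4
p_girl = 0.5   # stated in the text; independence between births is assumed

# P(X = k): number of ways to choose which k births are girls,
# multiplied by the probability of any one such sequence.
distribution = {
    k: comb(n_births, k) * p_girl**k * (1 - p_girl) ** (n_births - k)
    for k in range(n_births + 1)
}

for k, prob in distribution.items():
    print(k, "girls:", prob)
print("Total:", sum(distribution.values()))
```

Under these assumptions the probabilities are 1/16, 4/16, 6/16, 4/16 and 1/16 for 0 to 4 girls, and they add up to one as required.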
In contrast, a continuous random variable can take on an infinite number of possible values. One
way to represent the probability distribution of a continuous random variable could be to use a
histogram. We could then imagine what this histogram would look like if the class widths used
were infinitely narrow. The result is a smooth curve called a probability density function defined
as f(x).
We then get the plot of the probability density function f(x), where X is the continuous
random variable. An example of a probability density function for a continuous random variable
is shown in the figure below.
With continuous probability distributions, the probability that our random variable takes on
any one particular value is zero, so it makes no sense to ask for it. Instead, we consider the
probability that it will take on a value within a specified range.
For example, we could imagine the probability that our random variable could take on a value
somewhere between a and b as shown in the above figure. The shaded area in the above figure
represents this probability.
The total probability, which is the total area under the graph of f(x), must be 1.
In this case, probabilities are assigned to intervals (just as we assign relative frequencies to each
class interval in a histogram).
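The idea that probabilities are areas under f(x) can be illustrated with a small numerical sketch. The density used below, f(x) = 2x on the interval from 0 to 1, is a hypothetical example chosen only because its areas are easy to check by hand; it does not come from the figure.

```python
def f(x):
    """Hypothetical density: f(x) = 2x on [0, 1], and 0 elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def area(a, b, steps=10_000):
    """Approximate the area under f between a and b with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

print(round(area(0.0, 1.0), 6))  # total area under the curve
print(round(area(0.2, 0.5), 6))  # P(0.2 <= X <= 0.5)
print(area(0.3, 0.3))            # a single point has zero width, so zero area
```

For this density, the total area is 1 and the area between 0.2 and 0.5 is 0.5² – 0.2² = 0.21, while the area over a single point is zero.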
Below is an exercise to practice what you have learnt on discrete and continuous values.
Exercise: Discrete and Continuous Random Variables
Q1: Decide whether each of the following is a discrete or a continuous random variable.
a. The distance a car travels on one tank of petrol
b. The number of accidents that occurs annually on a busy stretch of road
c. The number of coins in your pocket
d. The average length of a business meeting
3. Summary
Here is a quick recap of what we have learnt so far:
A random variable is one that can assume any one of many possible values with known
probabilities.
A discrete random variable is a variable for which the number of possible values is finite or
countable.
A continuous random variable is a variable for which the number of possible values is
uncountable and infinite.
For a discrete random variable, we show the probabilities of the possible values in a bar chart
or a table.
For a continuous random variable, we can plot the probability density function
f(x), where X is the continuous random variable.
4. Glossary
Discrete variable A variable that can assume a countable or finite number of
possible values.
Continuous variable A variable that can assume an uncountable and infinite number of
possible values.
5. Answers
Exercise: Discrete and Continuous Random Variables
Q1:
Segment: Probability Distributions
Topic: Expected Values and Variance
Expected Values and Variance
Introduction
We have explored the concept of random variables and probability distributions. Once we know
our probability distribution, we can describe it in terms of its mean and variance.
Mathematically, there are distinct differences in how we deal with discrete and continuous
probability distributions. While we will be exploring continuous distributions in later topics, in
this topic we will restrict ourselves to discrete probability distributions.
Learning Objectives
At the end of this topic, you will be able to:
describe the concepts of the expected value, variance and standard deviation of a discrete
random variable
calculate the expected value for a given discrete random variable.
The mean or expected value of a discrete random variable can be found using the following
formula
E(X) = µ = Σ x P(x)
The presentation below explains the expected value of a random variable:
Suppose you own a corner shop where you sell daily newspapers, etc. Suppose also that
the number of copies of the newspaper sold on a given day is a discrete random variable
with the probabilities as displayed in the accompanying table.
While you cannot be sure exactly how many copies will be sold, you can get an expected
value of the number that will be sold and use that for certain decisions.
The expected value of a discrete random variable is defined as the sum of the product of
each possible value, multiplied by its probability.
That is, it is a weighted average of the possible values, where the weights are the
probabilities.
Note: The expected value may be a fraction. You should not round it up or down on the grounds
that fractional values are not allowed. E(X) need not be one of the possible values of X, even if X is discrete.
Expected number of copies of newspapers sold
= 28 * 0.2 + 28 * 0.25 + …….. = 28.7
The expected value of X is denoted by E(X). Thus, E(X) = Σ x P(x), where the summation is carried out
over all possible values x.
The expected value of a continuous random variable is defined as
Knowing that the expected number of copies you will sell is 28.7, you may be able to make
plans.
Another piece of information that could help you make your plans is the expected revenue
from the sale of newspapers.
Suppose the price of each copy is US$0.80.
Then, by selling an expected number of 28.7, you can expect a revenue of
= 28.7 * US$0.80
= US$22.96
The basis for this calculation is the formula
E(aX) = a E(X) for any constant a
Note: US$22.96 is only the expected revenue. The actual revenue may deviate from this,
and therefore has a variance.
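The weighted-average calculation and the rule E(aX) = a E(X) can be sketched in Python. The sales table is not reproduced in this text, so the distribution below is a made-up stand-in and will not give 28.7; only the price of US$0.80 comes from the example. The names expected_value and sales_distribution are illustrative.

```python
# Hypothetical distribution of daily copies sold (NOT the table from the example).
sales_distribution = {26: 0.10, 27: 0.20, 28: 0.30, 29: 0.25, 30: 0.15}

def expected_value(distribution):
    """Weighted average of the possible values, weighted by their probabilities."""
    return sum(x * p for x, p in distribution.items())

price = 0.80  # US$ per copy, as in the example

expected_copies = expected_value(sales_distribution)

# Expected revenue computed two ways: directly from the revenue distribution,
# and through the law E(aX) = a E(X).
revenue_distribution = {price * x: p for x, p in sales_distribution.items()}
expected_revenue_direct = expected_value(revenue_distribution)
expected_revenue_via_law = price * expected_copies

print(round(expected_copies, 2))
print(round(expected_revenue_direct, 2), round(expected_revenue_via_law, 2))
```

Both routes give the same expected revenue, which is why it is enough to multiply the expected number of copies by the price.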
The standard deviation of a random variable X, denoted by σ(X) or SD(X), is the (positive) square
root of its variance, that is
σ(X) = SD(X) = √V(X)
The symbol σ used for the standard deviation is the Greek letter sigma.
It is a measure of the dispersion of the values of the random variable about the mean or
expected value. The variance gives us an idea of the variation or uncertainty associated with the
random variable. The larger the variance, the further from the mean the possible values
of the random variable tend to be.
For example, one measure of the risk associated with an investment is the standard deviation of
investment returns. When comparing two investments with the same average (expected) return,
the investment with the higher standard deviation is considered riskier (although a higher
standard deviation implies that returns are expected to be more variable – both below and
above the mean).
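A short Python sketch can make the comparison of two investments concrete. The two return distributions below are hypothetical and were chosen to have the same expected return; the helper names mean and variance are illustrative.

```python
from math import sqrt

def mean(distribution):
    """Expected value: sum of value times probability."""
    return sum(x * p for x, p in distribution.items())

def variance(distribution):
    """Expected squared deviation of the random variable from its mean."""
    mu = mean(distribution)
    return sum(p * (x - mu) ** 2 for x, p in distribution.items())

# Hypothetical annual returns in percent, with their probabilities.
investment_a = {4: 0.25, 6: 0.50, 8: 0.25}
investment_b = {-2: 0.25, 6: 0.50, 14: 0.25}

for name, dist in (("A", investment_a), ("B", investment_b)):
    print(name, mean(dist), variance(dist), round(sqrt(variance(dist)), 3))
```

Both investments have an expected return of 6%, but B has the larger standard deviation and would therefore be considered the riskier of the two.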
We quite often have to combine expected values in order to obtain an overall expected value.
For example, suppose that we are considering two investments costing US$25 each. The first of
these might have an expected pay-off of US$100, while the second may have an expected
pay-off of US$150. If we let the random variable X represent the pay-off from the first investment
and Y represent the pay-off from the second investment, then the overall expected pay-off can
be found as
E(X + Y) = E(X) + E(Y) = 100 + 150 = US$250
Some useful laws of expected values can be found in the box below.
At times, we need the variance of the sum of two variables, X and Y. The general formula is
V(X + Y) = V(X) + V(Y) + 2 COV(X, Y)
where COV(X, Y) denotes the covariance of X and Y. We will study covariance in detail in the
segment, “Regression Analysis”.
Here we shall only mention that if the two variables X and Y are independent, then their
covariance will be zero.
The formula for V (X + Y) simplifies for independent variables, i.e. when X and Y are independent,
as
V (X + Y) = V(X) + V(Y)
For example, if the variance of investment X is 84 and the variance of investment Y is 60, and the
two investments are independent of each other, then the variance of the total income from both
sources is
V(X) + V(Y) = 84 + 60 = 144
This means that the standard deviation of the total income is the square root of this variance,
which is US$12.
Laws of variance
If X and Y are random variables and a and b are any constants, the following identities hold:
• V(a) = 0
• V(aX) = a²V(X)
• V(X + b) = V(X)
• V(aX + b) = a²V(X)
• V(X + Y) = V(X) + V(Y), if X and Y are independent
• V(X - Y) = V(X) + V(Y), if X and Y are independent
• V(aX + bY) = a²V(X) + b²V(Y), if X and Y are independent
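The laws for independent variables can be checked by listing every combination of outcomes. The sketch below uses two small hypothetical distributions and treats X and Y as independent by multiplying their probabilities; the helper names are illustrative. Note in particular that the variance of a difference is the sum of the variances, not their difference.

```python
from itertools import product

def mean(distribution):
    return sum(x * p for x, p in distribution.items())

def variance(distribution):
    mu = mean(distribution)
    return sum(p * (x - mu) ** 2 for x, p in distribution.items())

def combine(dist_x, dist_y, func):
    """Distribution of func(X, Y) when X and Y are independent."""
    result = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        value = func(x, y)
        result[value] = result.get(value, 0.0) + px * py
    return result

# Hypothetical independent random variables.
X = {0: 0.5, 10: 0.5}    # V(X) = 25
Y = {0: 0.25, 4: 0.75}   # V(Y) = 3

print(variance(combine(X, Y, lambda x, y: x + y)))          # V(X) + V(Y)
print(variance(combine(X, Y, lambda x, y: x - y)))          # also V(X) + V(Y)
print(variance(combine(X, Y, lambda x, y: 2 * x + 3 * y)))  # 4 V(X) + 9 V(Y)

# The investment example from the text: variances 84 and 60, independent.
print((84 + 60) ** 0.5)  # standard deviation of the total income
```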
5. Summary
Here is a quick recap of what we have learnt so far:
The expected value of a discrete random variable is defined as the sum of the product of
each possible value, multiplied by its probability.
The variance of a random variable is the expected squared deviation of the random variable
from its mean.
The standard deviation of a random variable is the (positive) square root of its variance.
Random variables can be combined into a new random variable in which the expected value
and variance can be found.
Segment: Probability Distributions
Topic: Binomial Distribution
Binomial Distribution
Introduction
We have learned how to calculate the expected value and the variance for a discrete random
variable whose probability distribution was given.
In this topic, we will consider one particular discrete probability distribution that arises regularly
in business situations.
Learning Objectives
At the end of this topic, you will be able to:
describe the concept of a binomial distribution
calculate the mean, variance and standard deviation for a binomial distribution
explain how binomial probabilities can be calculated using the BINOMDIST function in
Excel®.
1. Binomial Distribution
One of the most important discrete probability distributions is the binomial distribution. To
understand the binomial distribution, we need to consider what is meant by a binomial
experiment.
The random variable of interest is the number of successes and is called the binomial random
variable. The binomial distribution occurs when we count the number of successes out of n
number of independent trials.
It can be shown that the mean and variance for a binomial random variable X are
Mean = E(X) = µ = np
Variance = V(X) = σ² = np (1 - p)
For example, suppose that you are the manager of a venture capital firm with 15 investments. If
you assume that 85% of venture capital investments are failures, how many of your investments
can you expect to be successful?
If we assume that the investments are independent of each other, then the number of successful
investments should follow a binomial probability distribution with n = 15 and p = 0.15, so that
E(X) = np = 15 * 0.15 = 2.25
In other words, you can expect on average 2.25 successful investments from the 15 that you
have invested in.
You might want to confirm for yourself that the standard deviation for the number of successful
investments is approximately 1.383.
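The mean and standard deviation for the venture capital example can be confirmed with a few lines of Python, using n = 15 and a success probability of p = 1 – 0.85 = 0.15.

```python
from math import sqrt

n = 15     # number of investments (trials)
p = 0.15   # probability that an investment succeeds (1 - 0.85)

mean = n * p                 # expected number of successes
variance = n * p * (1 - p)   # np(1 - p)
std_dev = sqrt(variance)

print(round(mean, 4))      # 2.25
print(round(variance, 4))  # 1.9125
print(round(std_dev, 3))   # 1.383
```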
In Excel®, binomial probabilities can be calculated using the function
= BINOMDIST(k, n, p, cum)
where
k is an integer number of successes
n is the number of trials
p is the probability of a success
cum is either 0 or 1. It is 1 if we want the probability of less than or equal to k
successes, and it is 0 if we want the probability of exactly k successes
In this section, we will consider how to calculate binomial probabilities for two situations:
Suppose as in the example above, that you are the manager of a venture capital firm with 15
investments. If you assume that 85% of venture capital investments are failures, what is the
probability that exactly 5 of these investments will be successful?
Solution
As before, if we assume that these investments are independent of each other, then the
number of successful investments should follow a binomial distribution with n = 15 and p =
0.15. The probability we are after is
P (X = 5)
This probability is the shaded area in the graph shown in this figure.
In this situation, we are after exactly 5 successful investments and to calculate this probability,
we would enter the following formula into Excel®.
= BINOMDIST (5, 15, 0.15, 0)
= 0.044895 = 4.4895%
Therefore, the probability that we will have 5 successful investments is approximately 4.49%.
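For readers without Excel®, the same probability can be reproduced from the binomial formula P(X = k) = C(n, k) p^k (1 – p)^(n – k). The function name binomial_pmf below is an illustrative choice, not an Excel® or library name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Same inputs as BINOMDIST(5, 15, 0.15, 0).
print(round(binomial_pmf(5, 15, 0.15), 6))  # 0.044895
```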
Solution
As before, if we assume that these investments are independent of each other, then the
number of successful investments should follow a binomial distribution with n = 15 and p =
0.15. The probability we are after is
P(X≥3)
This probability is the shaded area in the graph shown in this figure.
In this situation, we are after the probability that at least 3 of our investments will be
successful. To calculate this probability, we can again use the BINOMDIST function in Excel®.
Before we enter the formula though, we need to recognise that in this case, we are after a
cumulative probability instead of exactly some number of successes. As such, we will need to
construct our question of interest such that we are after less than or equal to k successes. We
can achieve this as shown below:
P(X≥3) = 1 – P(X≤2)
Now that we have the probability in the form of less than or equal to k successes, we can enter
the following formula into Excel®:
= 1 – BINOMDIST (2, 15, 0.15, 1)
= 0.3958 = 39.58%
Therefore, the probability that we will have at least 3 successful investments is approximately
39.6%.
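The complement calculation can also be sketched in Python. The function binomdist below is a hypothetical stand-in written to mirror the four arguments of the Excel® function described earlier; it is not part of any library.

```python
from math import comb

def binomdist(k, n, p, cum):
    """Mimic BINOMDIST: cum = 0 gives P(X = k), cum = 1 gives P(X <= k)."""
    def pmf(j):
        return comb(n, j) * p**j * (1 - p) ** (n - j)
    if cum:
        return sum(pmf(j) for j in range(k + 1))
    return pmf(k)

# P(X >= 3) = 1 - P(X <= 2) for n = 15 and p = 0.15.
p_at_least_3 = 1 - binomdist(2, 15, 0.15, 1)
print(round(p_at_least_3, 4))  # 0.3958
```

Note that the cumulative call uses k = 2, not 3, because "at least 3" is the complement of "2 or fewer".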
4. Summary
Here is a quick recap of what we have learnt so far:
A binomial random variable indicates the total number of successes in the n trials of a
binomial experiment.
The probability distribution of this random variable is called the binomial probability
distribution which gives us the probability that a success will occur x times in n trials.
Binomial probabilities can be calculated using the BINOMDIST function in Excel®.
Segment: Probability Distributions
Topic: Normal Distribution
Normal Distribution
Table of Contents
2
©COPYRIGHT 2019, ALL RIGHTS RESERVED. MANIPAL ACADEMY OF HIGHER EDUCATION
Normal Distribution
Introduction
The normal distribution is an important continuous distribution because many random variables
that occur in business and nature can be approximated by it.
This topic introduces the normal distribution and investigates how normal distribution
calculations can be performed using Excel®.
Learning Objectives
At the end of this topic, you will be able to:
describe the concept of the normal distribution
perform normal distribution calculations in Excel® using the NORMDIST and NORMINV
functions.
1. Normal Distribution
The normal distribution is arguably the most important continuous probability distribution,
as many distributions that we come across in the business environment tend to follow it.
It is also a useful approximation to other distributions such as the binomial distribution.
The normal distribution is symmetrical about its mean. Approximately 68% of the area under the
curve lies within one standard deviation on each side of the mean. Approximately 95% of the
area under the curve lies within two standard deviations on each side of the mean.
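These two percentages can be checked with Python's standard library, which provides a NormalDist class in the statistics module. The sketch uses the standard normal distribution, since the proportions are the same for any mean and standard deviation.

```python
from statistics import NormalDist

z = NormalDist()  # mean 0, standard deviation 1

within_one_sd = z.cdf(1) - z.cdf(-1)
within_two_sd = z.cdf(2) - z.cdf(-2)

print(round(within_one_sd, 4))  # 0.6827
print(round(within_two_sd, 4))  # 0.9545
```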
The standard normal distribution is a special case of the normal distribution where the mean is 0 and the standard
deviation is 1.
The corresponding normal random variable is called the standard normal random variable. A
normal distribution with a mean of 0 and a standard deviation of 1 is also called a Z-distribution.
The graphic shows an example of the standard normal distribution: a bell curve with a
vertical line down the centre marking the mean of zero, and a horizontal line to the right of
it, halfway down, marking a standard deviation of one.
The Z-distribution is very useful for comparing distributions that have different means or
standard deviations. It is also easy to interpret the Z-value as it is simply the number of standard
deviations to the right or left of the mean.
Before we can use the Z-distribution though, we must transform or convert our normal random
variable X into the standard normal variable Z. This operation is called standardisation. The
formula we use to standardise a normal random variable is
Z = (X – µ) / σ
where X is a random variable, µ (mu) is the mean of the random variable, and σ (sigma) is the
standard deviation of the random variable.
NORMDIST
NORMINV
The important thing to note is that the NORMDIST function returns left-hand tail probabilities
and the NORMINV function requires left-hand tail probabilities.
As an example, suppose we know that the monthly return of the DOW JONES INDEX was
normally distributed with a mean monthly return of 1.5% and a standard deviation of 3.6%.
Given that a normal distribution is completely defined by its mean and standard deviation, we
can then find probabilities associated with any particular return.
Finding the probability associated with a particular value of the random variable
Suppose for example, that you are interested in determining the probability that the monthly
return for any particular month is going to be more than 5%. This probability is shown in the
figure below.
Chart showing the probability, p, that the monthly return for any particular month is going to be
more than 5 percent.
The probability that we are after is P(X > 5). Remembering that the NORMDIST function returns
left-hand tail probabilities, we will need to rewrite our probability of interest as shown below:
P(X > 5) = 1 – P(X ≤ 5)
Now that we have it in this form, we can use the following formula in Excel® in order to calculate
the probability
= 1 – NORMDIST (5, 1.5, 3.6, 1)
= 0.1655 = 16.55%
Therefore, there is a 16.55% chance that the monthly return of the DOW JONES INDEX will
be greater than 5% in any particular month.
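The same right-hand tail probability can be obtained in Python with statistics.NormalDist, whose cdf method, like NORMDIST, returns left-hand tail probabilities. The variable names are illustrative.

```python
from statistics import NormalDist

# Monthly return in percent: mean 1.5, standard deviation 3.6.
monthly_return = NormalDist(mu=1.5, sigma=3.6)

# cdf gives the left-hand tail P(X <= 5), so take the complement.
p_more_than_5 = 1 - monthly_return.cdf(5)

print(round(p_more_than_5, 4))  # approximately 0.1655
```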
Finding the value of the random variable associated with a particular probability
Suppose though, that we were interested in finding the monthly return associated with the
worst 10% of months. This is a case where we are given a probability and are trying to determine
the x value (in this case the monthly return). This situation is shown in the figure below.
Since this is already a left-hand tail probability, we can enter our formula directly into Excel® as
= NORMINV(0.10, 1.5, 3.6)
= –3.1136
Therefore, the maximum monthly return for the worst 10% of months is negative 3.1136%.
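As a hedged cross-check of this inverse calculation, the sketch below uses the inv_cdf method of Python's statistics.NormalDist, which, like NORMINV, expects a left-hand tail probability. The variable names are assumptions made for illustration.

```python
from statistics import NormalDist

# Same distribution as in the example: mean 1.5%, standard deviation 3.6%.
returns = NormalDist(mu=1.5, sigma=3.6)

# inv_cdf takes a left-hand tail probability and returns the x value
# below which that share of the distribution lies.
worst_10pct_cutoff = returns.inv_cdf(0.10)

print(f"Return at the 10th percentile: {worst_10pct_cutoff:.4f}%")
```

Feeding the result back into returns.cdf should give 0.10 again, which is a quick way to confirm that the correct tail was used.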
As you can see, using Excel® functions makes these calculations straightforward. We will need
to make use of these calculations in later parts of this subject, so it is important that you do some
examples and become familiar with their use. It is also highly recommended that you prepare a
sketch so that you can see the probability that you are after.
In this case, we make use of the Excel® functions NORMSDIST and NORMSINV. You will notice
that the difference between these functions and the previous functions is the 'S' in these
formulas, which stands for standardised.
For situations where we need to find the probability associated with a particular value of z, we
can use the NORMSDIST function. The general form of this function is
= NORMSDIST(z)
You will notice that we do not need the other arguments for this function as, for a standard
normal distribution, the mean is 0 and the standard deviation is 1.
For situations where we need to find the value of z associated with a particular probability, we
can use the NORMSINV function. The general form of this function is
= NORMSINV(p)
As before, the important thing to note is that the NORMSDIST function returns left-hand tail
probabilities and the NORMSINV function requires left-hand tail probabilities.
Finding the probability associated with a particular value of the random variable
Suppose that we wish to find the probability that the monthly return on any particular month
was greater than 5% using a standard normal distribution. In this case, we need to standardise
this value using our standardisation formula as shown below:
Z = (X – µ) / σ
This formula shows that Z is equal to X minus the mean, divided by the standard deviation.
Our value of the random variable here is the monthly return of 5%, which when standardised
becomes
Z = (5 – 1.5) / 3.6 = 0.9722
In other words, 5% is 0.9722 standard deviations above the mean. To calculate this probability,
we can use the NORMSDIST function as shown below:
= 1 – NORMSDIST (0.9722)
= 0.1655 or 16.55%
You will notice that this is exactly the same result as we obtained using the NORMDIST function.
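The standardisation route can be sketched in Python as follows. This is an illustrative check only, assuming statistics.NormalDist() with its default arguments (mean 0, standard deviation 1) as the counterpart of NORMSDIST; the names are hypothetical.

```python
from statistics import NormalDist

mean, sd = 1.5, 3.6   # monthly return parameters from the example
x = 5                 # the return we are interested in

# Standardise: how many standard deviations is x above the mean?
z = (x - mean) / sd

# NormalDist() with no arguments is the standard normal distribution.
standard_normal = NormalDist()
p_right = 1 - standard_normal.cdf(z)   # right-hand tail, P(Z > z)

print(f"z = {z:.4f}")
print(f"P(Z > z) = {p_right:.4f}")
```

Because standardising only rescales the variable, the tail probability agrees with the one obtained directly from the distribution with mean 1.5 and standard deviation 3.6.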
Finding the value of the random variable associated with a particular probability
Likewise, we may have been interested in determining the maximum return of the worst 10% of
months. In this case, we could enter the NORMSINV function as shown below:
= NORMSINV(0.10)
= - 1.2816
In other words, the maximum return of the worst 10% of months is 1.2816 standard deviations
to the left of the mean.
This value of our standard normal variable can then be converted back into our random variable
of interest, which is the monthly return in percentage. To do this, we can rearrange our
standardisation formula into the form shown below:
X = µ + Z.σ
X = 1.5 + (–1.2816 × 3.6) ≈ –3.11
Again, you will notice that this is the same result (a monthly return of about –3.11%) that we
obtained using the NORMINV function as before.
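The conversion from a z value back to a monthly return can be checked with a short Python sketch. It assumes statistics.NormalDist().inv_cdf as the counterpart of NORMSINV, and the variable names are illustrative.

```python
from statistics import NormalDist

mean, sd = 1.5, 3.6   # monthly return parameters from the example

# z value with 10% of the standard normal distribution to its left.
z = NormalDist().inv_cdf(0.10)

# Rearranged standardisation formula: X = mean + Z * sd.
x = mean + z * sd

# Direct route for comparison, without standardising first.
x_direct = NormalDist(mu=mean, sigma=sd).inv_cdf(0.10)

print(f"z = {z:.4f}")
print(f"x via z = {x:.4f}%, x direct = {x_direct:.4f}%")
```

Carrying the unrounded z through the calculation avoids the small discrepancy that appears when z is first rounded to four decimal places.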
5. Normal Approximation
In general, if we are considering the binomial distribution and the number of trials, n, is 30 or
more, binomial probabilities can be approximated by the normal distribution.
After you begin the program try n = 5 (the program uses the notation N in place of n) and p =
0.5. Then change p to, say, 0.1. The distribution is too skewed to be normal. Try n = 50 and see
how well it fits the normal curve.
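If the interactive program is not available, the same experiment can be approximated numerically. The sketch below is an assumption-laden illustration rather than the program referred to above: it compares exact binomial cumulative probabilities with a normal distribution that has mean np and standard deviation sqrt(np(1 – p)), and it adds a continuity correction of 0.5, a standard refinement that this topic does not cover. The function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for a binomial distribution with n trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def max_approximation_error(n, p):
    """Largest gap between the exact binomial CDF and its normal approximation."""
    approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    # k + 0.5 is the continuity correction (an addition to the text).
    return max(abs(binomial_cdf(k, n, p) - approx.cdf(k + 0.5))
               for k in range(n + 1))

for n, p in [(5, 0.5), (5, 0.1), (50, 0.5), (50, 0.1)]:
    print(f"n = {n:2d}, p = {p}: max error = {max_approximation_error(n, p):.4f}")
```

The gap is largest for small n with p far from 0.5, which is the skewed case described above, and it shrinks as n grows.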
Below is an exercise to practice what you have just learnt on normal approximation.
6. Summary
Here is a quick recap of what we have learnt so far:
• The normal distribution is symmetric about its mean and is uniquely determined by its mean
and standard deviation.
• Normal distribution calculations can be performed in Excel® using the two functions below,
both of which use left-hand tail probabilities:
o NORMDIST
o NORMINV
• Standard normal distribution calculations can be performed in Excel® using the NORMSDIST
and NORMSINV functions. These functions use left-hand tail probabilities.
• When the number of trials n in a binomial distribution is 30 or more, binomial probabilities
can be approximated by the normal distribution.
7. Answers
Exercise: Normal Approximation
Q1: The correct answer is option 1, 50%
Segment: Probability Distributions
Topic: Poisson and Exponential Distributions
Poisson and Exponential Distributions
Table of Contents
Introduction
A particular discrete probability distribution known as the Poisson distribution can be used to
describe situations where we are interested in the number of events occurring in a specified
period of time.
While waiting time for an event that occurs periodically is uniformly distributed, waiting time
for a rare event can be shown to be exponentially distributed.
Learning Objectives
At the end of this topic, you will be able to:
• describe the concept of the Poisson distribution
• describe the concept of the Exponential distribution.
1. Poisson Distribution
A Poisson distribution describes the number of occurrences of a rare event during a finite period.
Some events, such as an accidental power outage, are rare events that occur at a rate of, say, 0.2
times per month. Given any tiny interval of time, say one second, the probability of an accidental
power outage occurring during that second is the same as for any other second.
When we are counting the number of occurrences of a rare event, which is equally likely to occur
in any equal interval of time, that number follows a Poisson distribution. This distribution is
named after the French mathematician Siméon Poisson (1781 – 1840).
You can find further information about the Poisson distribution at this website -
https://en.wikipedia.org/wiki/Poisson_distribution.
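To make the power outage example concrete, here is a minimal Python sketch of the Poisson probability function, P(X = k) = e^(–λ) λ^k / k!, where λ is the mean number of occurrences in the period. The formula is the standard one rather than something stated in this topic, and the function and variable names are assumptions for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, mean):
    """P(X = k) when events occur at the given mean rate per period."""
    return exp(-mean) * mean**k / factorial(k)

outages_per_month = 0.2   # rate used in the power outage example

p_none = poisson_pmf(0, outages_per_month)
p_one = poisson_pmf(1, outages_per_month)
p_at_least_one = 1 - p_none

print(f"P(no outage in a month)       = {p_none:.4f}")
print(f"P(exactly one outage)         = {p_one:.4f}")
print(f"P(at least one outage)        = {p_at_least_one:.4f}")
```

The "at least one" probability is obtained as 1 minus the probability of zero occurrences, which is usually easier than summing all the non-zero cases.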
Below is an exercise to practice what you have read about Poisson distribution.
Q1: An important meeting is about to start and will last 1 hour. Assuming that your mobile
phone rings on average 12 times during an 8-hour business day, what is the probability that
your phone will ring at least once during the meeting?
1. 22.3%
2. 77.7%
3. 85.3%
4. 88.3%
Recent versions of Excel provide both the POISSON and POISSON.DIST functions, which
return Poisson probabilities.
2. Exponential Distribution
Picture a gift shop on a busy street during the peak hour. A customer may enter the shop at any
time. More formally, given any tiny interval of time, say, a second, the probability that a
customer will enter the shop during that second is the same as for any other second. Under such
conditions, if we start a stopwatch at any time and wait till a customer arrives, our waiting time
will follow an exponential distribution.
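The gift shop description can be explored with a small simulation. The sketch below is illustrative only: it assumes a hypothetical chance of 1 in 60 that a customer enters in any given second, simulates the wait second by second, and compares the share of long waits with the exponential tail probability e^(–rate × t). The seed, sample size and names are arbitrary choices, not values from the text.

```python
import math
import random

def wait_for_customer(p_per_second, rng):
    """Seconds until the first arrival, with the same chance every second."""
    seconds = 1
    while rng.random() >= p_per_second:
        seconds += 1
    return seconds

rng = random.Random(42)   # fixed seed so that the run is repeatable
p = 1 / 60                # hypothetical: on average one customer per minute
waits = [wait_for_customer(p, rng) for _ in range(20_000)]

mean_wait = sum(waits) / len(waits)
share_over_60 = sum(w > 60 for w in waits) / len(waits)
exponential_tail = math.exp(-p * 60)   # exponential P(T > 60) at rate p per second

print(f"Average wait: {mean_wait:.1f} seconds (1/p = {1 / p:.0f})")
print(f"Share of waits over 60 s: {share_over_60:.3f}")
print(f"Exponential tail e^(-1):  {exponential_tail:.3f}")
```

The simulated waits are counted in whole seconds, so the match with the exponential distribution is approximate and improves as the time step becomes smaller.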
The exponential distribution is fairly common in practice. Here are two examples:
1. Inter-arrival time
2. Mean time between failures (MTBF)
Read below to find out more about these two examples of exponential distribution.
3. Relationship
In general, if an event is equally likely to occur during any tiny interval of time, and we start a
stopwatch at any time and wait till the event occurs, our waiting time will follow an exponential
distribution.
An exponential distribution is memoryless, and therefore the stopwatch to measure the waiting
time can be started at any time.
If an event occurs with a Poisson distribution at the mean rate of λ per hour, then the time gap
between two successive occurrences of that event, or our waiting time for the next occurrence
of that event with our stopwatch started at any time, will be exponentially distributed with a
mean of 1/λ hours and a standard deviation of 1/λ hours.
Note that the mean and the standard deviation of an exponential distribution are equal.
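The link between the two distributions can be checked numerically. The following sketch reuses the power outage rate of 0.2 per month from the Poisson section (so time is measured in months rather than hours) and relies on the standard exponential result P(T > t) = e^(–λt), which is not derived in this topic. The function and variable names are assumptions.

```python
from math import exp

rate = 0.2   # outages per month, from the power outage example

def prob_wait_longer_than(t, rate):
    """Exponential tail: P(waiting time > t) for events at the given rate."""
    return exp(-rate * t)

mean_wait = 1 / rate   # mean (and standard deviation) of the waiting time

# Waiting more than 1 month is the same event as zero outages in that month,
# so it should equal the Poisson probability P(X = 0) = e^(-rate * 1).
p_wait_over_1 = prob_wait_longer_than(1, rate)
p_poisson_zero = exp(-rate * 1)

# Memorylessness: having already waited 3 months does not change the chance
# of waiting at least 1 more month.
p_conditional = prob_wait_longer_than(3 + 1, rate) / prob_wait_longer_than(3, rate)

print(f"Mean wait: {mean_wait:.1f} months")
print(f"P(wait > 1 month) = {p_wait_over_1:.4f}, Poisson P(X = 0) = {p_poisson_zero:.4f}")
print(f"P(wait > 4 | wait > 3) = {p_conditional:.4f}")
```

The conditional probability matching the unconditional one is what allows the stopwatch to be started at any time.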
4. Summary
Here is a quick recap of what we have learnt so far:
• A Poisson random variable indicates the number of successes that occur during a given time
interval or in a specified region in a Poisson experiment.
• An exponential random variable can be used to measure the time that elapses between
occurrences of an event.
• The mean and standard deviation of an exponential distribution are the same.
5. Answers
Exercise: Poisson Distribution
Q1: The correct answer is option 2, 77.7%
This is a Poisson problem with a mean of 12/8 or 1.5 phone calls per hour. The probability we
are after is P(X>0), which is equal to 1 – P(X=0).
= 1 – POISSON(0, 1.5, 0) = 0.777
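The answer can be reproduced without Excel. This is a minimal Python sketch, assuming the standard Poisson result P(X = 0) = e^(–mean); the variable names are illustrative.

```python
from math import exp

calls_per_hour = 12 / 8    # 12 calls over an 8-hour business day
meeting_hours = 1

mean_calls = calls_per_hour * meeting_hours   # expected calls during the meeting

p_no_calls = exp(-mean_calls)        # Poisson P(X = 0)
p_at_least_one = 1 - p_no_calls      # P(X > 0)

print(f"P(phone rings at least once) = {p_at_least_one:.3f}")
```

Changing meeting_hours shows how quickly the probability of an interruption rises with the length of the meeting.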