A Quick Approach To Statistics by G.R.pashA
http://stat9943.blogspot.com
A Quick Approach to Statistics with Questions and Answers
Contents
Preface
By: Rafaqat
Exercises
Chapter 2: Basic Probability Theory
Basics in Probability
Some Probability Rules
Counting Rules
Questions-Answers
Exercises
Chapter 3: Random Variables
Questions-Answers
Exercises
Chapter 4: Discrete Probability Distributions
Questions-Answers
Exercises
Chapter 5: Continuous Probability Distributions
Questions-Answers
Exercises
Exercises
Bibliography
Chapter 1
Introductory Statistics
Basic Statistics
Statistics
Statistical Methods
Statistical methods are those ways that are used to collect, present, analyze,
and interpret quantitative data.
Types of Statistics
There are two major types of Statistics: Descriptive Statistics and Inferential
Statistics.
Descriptive Statistics
It consists of methods for organizing and summarizing information in a
presentable and effective way.
Inferential Statistics
It consists of methods of drawing conclusions about a population based on
information obtained from a sample of the population.
Data
A collection of facts from which conclusions may be drawn is referred to as
data.
Observation
Any sort of recording of information is called an observation.
Types of Data
Generally, data can be classified by their nature and way of collection.
Qualitative Data
Qualitative (or Categorical or Attribute) data can be separated into different
categories that are distinguished by some non-numerical characteristics. For
example, gender of a person, blood type, eye color, etc.
Quantitative Data
Quantitative data consist of numbers representing counts or measurements
such as the number of patients in a hospital, the ages of a group of persons, and
data about the height and weight of individuals.
Quantitative data can be further classified into discrete and continuous data.
All types of count data are referred to as discrete data, whereas measured data are
referred to as continuous data.
Discrete Data
Data obtained by counting or categorizing subjects so that there is a distinct interval
between any two possible values, e.g., the number of patients in a hospital and
the number of chairs in a room.
Continuous Data
Continuous data result from infinitely many possible values that can be
associated with points on a continuous scale in such a way that there are no
gaps or interruptions. For example, data about the height and weight of
individuals.
Types of Data (Collection)
Primary Data
The data collected directly from people and organizations via questionnaires
or surveys, before being analyzed to reach conclusions concerning the issues
covered in the questionnaire or survey.
Secondary Data
The data that have undergone any sort of treatment by statistical methods.
In other words, the data that have already been assembled, having been
collected for some other purpose, are referred to as secondary data. Sources
include census reports, trade publications, and subscription services.
Scales of Measurement
Nominal Scale
The nominal scale is characterized by data that consist of names, labels, or
categories only. Such data cannot be arranged in an ordering scheme. For
example, gender ("male" and "female"), response ("yes" or "no"), etc.
Ordinal Scale
The ordinal scale involves data that may be arranged in some order but
differences between data values either cannot be determined or are
meaningless. For example, in a sample of 36 stereo speakers, 12 were rated
good, 16 were rated better, and 8 were rated best.
Interval Scale
The interval scale is like the ordinal scale, with the additional property that
meaningful amounts of difference between data values can be determined.
However, there is no inherent (natural) zero starting point. Interval scales
take the notion of ranking items in order one step further, since the distances
between adjacent points on the scale are equal. For instance, the Fahrenheit
scale is an interval scale, since each degree is equal but there is no absolute
zero point. This means that although we can add and subtract degrees (100°
is 10° warmer than 90°), we cannot multiply values or create ratios (100° is
not twice as warm as 50°). What is important in determining whether a
scale is considered interval or not is the underlying intent regarding the
equal intervals: although in an IQ scale the intervals are not necessarily
equal (e.g., the difference between 105 and 110 is not really the same as that
between 80 and 85), behavioral scientists are willing to assume that most of
their measures are interval scales, as this allows the calculation of averages
(mode, median, and mean), the range, and the standard deviation.
Ratio Scale
The ratio scale is the interval scale modified to include an inherent zero
starting point. For values at this level, differences and ratios are meaningful.
Ratio scales are the most sophisticated of scales, since they incorporate all the
characteristics of nominal, ordinal, and interval scales. As a result, a large
number of descriptive calculations are applicable, such as when respondents
are asked for their age, height, income, etc.
Presentation of Data
Classification
A classification is the separation or ordering of objects into classes where
classes are categories for grouping data.
Tabulation
Tabulation is placement of data into rows and columns with suitable heads
and subheads.
Frequency
The frequency of a particular class is the number of original scores that fall
into that class. Simply, frequency is the number of times that a repeated
observation occurs.
Cumulative Frequency
The cumulative frequency for a class is the sum of the frequencies of that
class and all the previous classes.
Relative Frequency
The relative frequency of a particular class can be found by dividing the
class frequency by the total of all frequencies.
Grouped data
The data presented in the form of a frequency distribution are called
grouped data.
Frequency Distribution
The division of counts (frequencies) of the number of scores that fall into each
class (category) is called a frequency distribution. In other words, a listing of
classes and their frequencies is called a frequency distribution. A table that
represents classes along with their respective class frequencies is called a
frequency table.
Class Limits
The values or numbers specifying a class are called class limits. The
smallest value specifying a class is called the lower class limit, while the largest
value specifying a class is called the upper class limit.
Class Boundaries
Class boundaries are the numbers used to separate classes, but without the
gaps created by class limits. They are obtained by increasing the upper class
limits and decreasing the lower class limits by the same amount so that
there are no gaps between consecutive classes. These boundaries are also
called precise limits or true limits.
Cholesterol Level   Class           Class   Frequency   Cumulative    Relative
(Class Limits)      Boundaries      Mark                Frequency     Frequency
191-195             190.5-195.5     193     1           1             1/25 = 0.04
196-200             195.5-200.5     198     3           3+1 = 4       3/25 = 0.12
201-205             200.5-205.5     203     4           4+4 = 8       0.16
206-210             205.5-210.5     208     7           7+8 = 15      0.28
211-215             210.5-215.5     213     5           5+15 = 20     0.20
216-220             215.5-220.5     218     4           4+20 = 24     0.16
221-225             220.5-225.5     223     1           1+24 = 25     0.04
Total                                       25                        1.00
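The columns of the cholesterol table can be recomputed from the class limits and frequencies alone. The following is a minimal Python sketch of that calculation; the function name `frequency_table` and the dictionary keys are illustrative choices, not something the text prescribes.

```python
# Class limits and frequencies copied from the cholesterol table above.
class_limits = [(191, 195), (196, 200), (201, 205), (206, 210),
                (211, 215), (216, 220), (221, 225)]
frequencies = [1, 3, 4, 7, 5, 4, 1]


def frequency_table(limits, freqs):
    """Build one row per class: boundaries, class mark, and the three frequencies."""
    # Half of the gap between an upper limit and the next lower limit (0.5 here).
    half_gap = (limits[1][0] - limits[0][1]) / 2
    total = sum(freqs)
    rows, running = [], 0
    for (lower, upper), f in zip(limits, freqs):
        running += f  # cumulative frequency so far
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - half_gap, upper + half_gap),
            "mark": (lower + upper) / 2,   # class mark = midpoint of the class
            "frequency": f,
            "cumulative": running,
            "relative": f / total,
        })
    return rows


table = frequency_table(class_limits, frequencies)
for row in table:
    print(row)
```

The last cumulative frequency equals the total number of observations, and the relative frequencies add up to 1, which gives a quick check on any hand-built table.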
Graph
A drawing representing the relationship between data sets is called the
graph.
Histogram
A graph that displays the classes on the horizontal axis and the frequencies
of the classes on the vertical axis is called histogram. The frequency of each
class is represented by a vertical bar whose height is proportional to the
frequency of that class.
Frequency Polygon
Polygon means closed shape. A frequency polygon is obtained by joining
the midpoints of the tops of the adjacent bars of a histogram with straight lines and then
joining both ends with the X-axis by assuming class frequencies of zero at
those points.
Frequency Curve
When a frequency polygon is constructed for a large number of observations
and small class intervals, a smoothed curve can be approximated that is
referred to as a frequency curve.
Charts
A chart is a plot of data showing the results of a process over a period of time
(day, month, etc.).
Diagram
A diagram is a simplified and structured visual representation of concepts,
ideas, constructions, relations, statistical data, anatomy, etc. used in all
aspects of human activities to visualize and clarify the topic.
Measures of Central Tendency
Central Tendency
The general level, characteristic, or typical value that is representative of the
majority of cases is referred to as the central or average value. The
tendency of observations to gather around the central part of the data is called
central tendency.
Arithmetic Mean
It is a type of average (measure of central tendency), defined as the
sum of all the values in a set of numerical data divided by the total number of
observations in that data set. This is the most commonly used measure of
central tendency and is simply called the Mean. It is labeled as either μ
(lowercase Greek letter "mu") to denote a population mean or X̄ (X-bar) to
denote a sample mean.
Weighted Mean
An average of means calculated by weighting each individual mean
according to the number of data points that made up that individual mean.
Geometric Mean
A mean of n objects that is computed by taking the n-th root of the product
of the n terms. A measure of the central tendency of a data set that
minimizes the effects of extreme values.
Harmonic Mean
It is the reciprocal of the arithmetic mean of the reciprocals of the values.
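The definitions of the weighted, geometric, and harmonic means translate directly into code. The sketch below is one possible Python rendering; the function names and the sample values are hypothetical and chosen only for illustration.

```python
import math


def weighted_mean(means, counts):
    """Average of group means, each weighted by its number of data points."""
    return sum(m * n for m, n in zip(means, counts)) / sum(counts)


def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))


def harmonic_mean(values):
    """Reciprocal of the mean of the reciprocals of non-zero values."""
    return len(values) / sum(1 / v for v in values)


values = [2, 4, 8]                      # hypothetical data
am = sum(values) / len(values)          # arithmetic mean
gm = geometric_mean(values)
hm = harmonic_mean(values)
print(am, gm, hm)

# Two groups with means 10 and 20 built from 1 and 3 observations respectively.
print(weighted_mean([10, 20], [1, 3]))
```

For these positive values the harmonic mean is the smallest and the arithmetic mean the largest, which matches the ordering discussed later in the Questions-Answers section.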
Median
The median of a set of scores is the middle value when the scores are
arranged in order of increasing (or decreasing) magnitude. The median is
often denoted by X̃ (X-tilde).
Mode
The mode is the value that has the largest frequency in a data set. When two
scores occur with the same greatest frequency, each one is a mode and the
data set is called bimodal. When a data set has more than two modes, it is
called multimodal.
Quartile
A quartile is any of the 3 values which divide the sorted data set into 4
equal parts, so that each part represents 1/4th of the sample or population. It
is denoted by $Q_i$ (i = 1, 2, 3). The second quartile is, obviously, equal to the
median.
Deciles
A decile is any of the 9 values which divide the sorted data set into 10 equal
parts, so that each part represents 1/10th of the sample or population. It is
denoted by $D_i$ (i = 1, 2, ..., 9). The 5th decile is, obviously, equal to the median.
Percentile
A percentile is any of the 99 values which divide the sorted data set into 100
equal parts, so that each part represents 1/100th of the sample or population.
It is denoted by $P_i$ (i = 1, 2, ..., 99). The 50th percentile is, obviously, equal
to the median.
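The positional measures above can be tried out with Python's standard `statistics` module. This is only a sketch on hypothetical scores; note that textbooks and software use several slightly different interpolation conventions for quantiles, so hand-computed quartiles may differ a little from what `statistics.quantiles` returns.

```python
import statistics

scores = [3, 5, 5, 6, 7, 8, 9, 9, 10, 12, 15]   # hypothetical data, 11 values

median = statistics.median(scores)               # middle value of the sorted data
modes = statistics.multimode(scores)             # every value with the top frequency
quartiles = statistics.quantiles(scores, n=4)    # Q1, Q2, Q3
deciles = statistics.quantiles(scores, n=10)     # D1 ... D9
percentiles = statistics.quantiles(scores, n=100)  # P1 ... P99

print("median:", median)
print("modes:", modes)                           # two modes, so the data are bimodal
print("quartiles:", quartiles)
print("Q2, D5, P50:", quartiles[1], deciles[4], percentiles[49])
```

The second quartile, the 5th decile, and the 50th percentile all coincide with the median, as stated in the definitions.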
Measures of Dispersion
Dispersion
Dispersion means variation; a descriptive measure that indicates the
amount of variation in a data set is called a measure of dispersion.
Range
The range is the length of the smallest interval which contains all the data. It
is defined to be the difference between the largest and the smallest value of
a data set.
Mean Deviation
Quartile Deviation
Quartile deviation is half of the difference between the first and the third
quartiles.
Variance
A measure of the variation shown by a set of observations, defined as
the mean of the squared deviations of all the observations from their mean.
It is usually denoted by σ² (sigma-square) for a population and s² for a sample.
Standard Deviation
The positive square root of the variance (defined above).
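The measures of dispersion defined in this section can be computed side by side, as in the following Python sketch on hypothetical data. The mean deviation is taken here as the mean of the absolute deviations from the arithmetic mean, which is the usual convention and an assumption on our part, since its definition is not given above.

```python
import statistics

data = [4, 8, 6, 5, 3, 10]                       # hypothetical data

data_range = max(data) - min(data)               # largest minus smallest value
mean = statistics.fmean(data)

# Assumed definition: mean of absolute deviations from the arithmetic mean.
mean_deviation = sum(abs(x - mean) for x in data) / len(data)

pop_variance = statistics.pvariance(data)        # divides by n     (sigma-square)
sample_variance = statistics.variance(data)      # divides by n - 1 (s-square)
pop_sd = statistics.pstdev(data)                 # positive square root of pop_variance

q1, _, q3 = statistics.quantiles(data, n=4)
quartile_deviation = (q3 - q1) / 2               # half the inter-quartile difference

print(data_range, mean_deviation, pop_variance, sample_variance, pop_sd,
      quartile_deviation)
```

On the same data the standard deviation is never smaller than the mean deviation, because squaring gives more weight to the large deviations.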
Moments
A moment designates the power to which deviations are raised before
averaging them. Moments determine the shape and location of a
distribution.
Skewness
A distribution of data is symmetric if the left half of its histogram is roughly
a mirror image of its right half. The lack of symmetry or departure from
symmetry is called skewness. A skewed distribution extends more to one
side than the other. If it has a longer right tail, it is called a positively skewed
distribution. If it has a longer left tail, it is called a negatively skewed
distribution.
It is important to note that
Mean = Median = Mode; Symmetric Distribution.
Kurtosis
Kurtosis describes the extent to which a frequency distribution of scores is
bunched around the center or spread toward the endpoints. Simply
speaking, it measures the degree of peakedness or flatness of a unimodal
distribution.
A measure to describe the degree of kurtosis is called coefficient of
kurtosis.
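One common way to put numbers on skewness and kurtosis is through central moments. The sketch below assumes the moment-based coefficients $m_3 / m_2^{3/2}$ and $m_4 / m_2^{2}$, where $m_r$ is the mean of the r-th powers of the deviations from the mean; the text does not state which coefficients it has in mind, so treat these formulas and the sample data as illustrative assumptions.

```python
def central_moment(values, r):
    """Mean of the r-th powers of the deviations from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** r for x in values) / len(values)


def skewness(values):
    """Moment coefficient of skewness: m3 / m2^(3/2) (assumed formula)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Moment coefficient of kurtosis: m4 / m2^2 (assumed formula)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


symmetric = [1, 2, 3, 4, 5]            # hypothetical, mirror image about 3
right_tailed = [1, 1, 2, 2, 3, 10]     # hypothetical, long right tail
left_tailed = [-x for x in right_tailed]

print(skewness(symmetric))     # zero for a symmetric data set
print(skewness(right_tailed))  # positive: positively skewed
print(skewness(left_tailed))   # negative: negatively skewed
print(kurtosis(symmetric))
```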
Questions-Answers
Q.J What is the main objective of Statistics when applied to
numerical data?
Ans. The main objective of Statistics is the summarization of numerical
data.
Q.4 Men have more auto accidents than women; therefore women
are better drivers than men. True or False? Explain.
Ans. This is false, because more men than women drive cars and, for the
same span of time, say one year, the total hours driven by men are
more than those driven by women; therefore, men would have more
accidents even if they were equally good drivers. But if we take the
average number of road accidents per average number of driven
hours for both genders, then the situation may be comparable.
Q.6 Name three Pakistani Government agencies that are good sources
of data.
Ans. Some that might be named are the Federal Bureau of Statistics,
Punjab Bureau of Statistics, Sindh Bureau of Statistics, NWFP
Bureau of Statistics, Balochistan Bureau of Statistics, Pakistan
Census Organization, Agriculture Census Organization, etc.
Q.7 Why should an analyst know the accuracy of the data he plans to
analyze?
Ans. If the analyst does not know the accuracy of his data, he cannot
trust any analysis and interpretation that are made with them.
Q.11 For what kind of data is the use of the Arithmetic Mean most
suitable?
Ans. If there is little variation in the data (likely to be homogeneous) and the
observations are equally weighted, then the Arithmetic Mean is the
most suitable measure of central tendency. In other words, if all
the observations have the same importance and there are no extreme
values (outliers), the Arithmetic Mean is preferable.
Arithmetic Mean:
Each value in the group to be averaged directly
influences the magnitude of the arithmetic mean.
The arithmetic mean is subject to algebraic manipulation;
for example, the arithmetic mean multiplied by the number of
values from which it was computed will give their sum.
Median:
The median is a positional average, being the center
figure in an arrayed list of figures. After it has been
determined, its value cannot be changed by changing the values
of the other figures, unless the changes result in placing a
new figure at the center of the list.
The median cannot be treated algebraically like the
arithmetic mean, geometric mean and harmonic mean.
Mode:
The mode, being that figure which appears most often in a
Geometric Mean:
All of the values being averaged directly affect the
magnitude of the geometric mean.
The geometric mean is subject to algebraic manipulation.
The geometric mean gives the large values less weight
than the arithmetic mean does; therefore it will be smaller than the
arithmetic mean of the same figures.
The geometric mean will be zero if any of the values in the
series to be averaged is zero.
When negative values are used, the geometric mean does
not exist.
Harmonic Mean:
Each figure being averaged directly affects the magnitude
of the harmonic mean.
The harmonic mean gives less weight to the larger figures
than do the arithmetic and geometric means; therefore, it
will be smaller than these means when all three are
computed from the same figures.
The harmonic mean of a set of ratios may be
appropriately calculated if the numerators of the fractions
from which the ratios were computed are the same.
Q.13 Under what general conditions is each of the five averages the most
appropriate to use for a given group of figures?
Ans. General conditions under which the five averages would be used
are:
Arithmetic mean used as an average when the group of
figures is quite homogeneous.
Median used as an average of data that are highly
skewed.
Mode used when it is a highly predominant figure in a
group.
Geometric Mean used primarily as an average of ratios of
change.
Harmonic Mean used when the numerators of the
fractions used to compute the ratios were the same or
nearly so.
Q.16 When computed from the same data, which of the following
averages will be the largest, second largest, and the smallest:
arithmetic mean, geometric mean, and harmonic mean? Why?
Ans. The arithmetic mean would be the largest and the harmonic mean
would be the smallest. The geometric mean would be smaller than
the arithmetic mean because it gives less weight to the larger
values. The harmonic mean, however, gives even less weight to the
larger values than does the geometric mean.
Q.17 Why is the median usually a better average for a highly skewed
frequency distribution than the arithmetic mean?
Ans. The extremes in a highly skewed distribution will distort or bias
an arithmetic mean in their direction. The median, however, being
a positional average, will not receive the same bias.
Why must the interval of the modal class and the two adjacent
classes be the same?
Ans. In the mode formula, h represents the class interval, which is
common to the modal class and the classes that immediately
precede and follow the modal class.
Q.19 If the chairman of your department asked you to plan an evening
for the sons of the department employees and told you that the
average age of the sons was 14, what more would you want to
know about the ages before you started planning?
Ans. One would need information on the variation in the ages of the
sons. If their ages are all in the range 13-15, the entertainment
plan would be entirely different than it would be if the ages were
distributed from 6 to 18.
Q.20 A man who stated that he manufactured lids for glass jars was
asked what size lids he manufactured. He replied, "the typical
size". To which average was he referring?
Ans. This would be the mode. The arithmetic mean size and median size
may not fit any glass jar.
Q.21 Could a man six feet tall drown while crossing a river with an
average depth of two feet?
Ans. Of course, the question was asked to get a little humor into a
sometimes dry subject. The question should make a student realize
how misleading an average can be when nothing is known about
the values that make up the average.
Q.26 When computed from the same data, which will be larger, the
mean deviation or the standard deviation? Why?
Ans. The standard deviation, because it gives more weight to the large
deviations.
Q.29 What are the measures of relative dispersion? How are they
used?
Ans. Measures of relative dispersion are absolute measures of
dispersion expressed as a fraction or percent of some base, usually
the arithmetic mean. They are used to compare the variations of
two or more sets of data that are expressed in different
units, or two or more sets of data expressed in the same units but
which have different arithmetic means.
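A widely used measure of relative dispersion is the coefficient of variation, the standard deviation expressed as a percentage of the arithmetic mean. The answer above does not name it, so the choice of this particular measure, the function name, and the data below are assumptions made for the sake of a short Python sketch.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation as a percentage of the arithmetic mean."""
    return 100 * statistics.pstdev(values) / statistics.fmean(values)


# Hypothetical data in different units: heights in cm, weights in kg.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)
print(round(cv_heights, 2), round(cv_weights, 2))
```

Both sets have the same standard deviation in their own units, yet the weights vary more relative to their mean, which is exactly the kind of comparison an absolute measure cannot make.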
Exercises
Exercise-1.1 (MCQs)
Q.2 Indicate which of the data described below are nominal:
(a) Team scores in a cricket match.
(b) Daily temperatures in degrees Celsius.
(c) Room numbers in the Holiday Inn hotel.
(d) Identification of the children who have chicken pox.
Q.3 Indicate the discrete variable among the variables given below:
(a) Batting average of the Pakistan cricket team.
(b) Number of children of each of 1,000 married graduates of
a public university.
(c) Average heights of the students of the 1st year class.
(d) Daily hours of sunshine during the period from
September 21 to December 21.
Q.4 Indicate for which of the following data the shape of the curve
will be normal.
(a) A computer printout shows the current checking account
balances for all the checking customers of the national
bank.
(b) Diagnostic reading test scores are tabulated for all the low
graders in a school of a district.
(c) Gifted students in a creative writing class are given the
verbal subtest of an intelligence test.
(d) A nationwide mathematics exam is given to 1,000
students at a college with a selective admission policy.
16. The difference between the highest and the lowest observations
in a data set is called the quartile range.
17. The range is a more stable measure of variability than the
inter-quartile range.
18. The median is less influenced by chance than the mode because
the median takes into account the entire distribution.
19. In a symmetric unimodal distribution, the arithmetic mean is
equal to the median.
20. The standard deviation is the square of the variance.
21. If the data of an experiment is measured in meters, then the
Chapter 2
Basics in Probability
Experiment
The process of obtaining an observation is called an experiment. Hitting a
target, checking the boiling point of a liquid, a student taking an examination,
conducting interviews for some jobs, tossing a coin, rolling a
die, a batsman hitting a ball, the sale of some products, and a chemical reaction of
elements are a few examples of experiments.
Trial
A single performance of an experiment is called a trial. A batsman playing a
single ball, a bowler bowling once, a student solving a single question, and a single
roll of a die are all examples of a trial.
Outcome
An outcome is the result of an experiment. Each possible distinct result of
an experiment is referred to as an outcome. Hitting or not hitting a target, making
some scores, leaving a ball or being out for a batsman, and boiling of water at
100°C are examples of outcomes relevant to the experiments discussed
above.
Random Experiment
An experiment is called a random experiment if its outcomes cannot be
predicted in advance even if it is performed under similar conditions. Any
random experiment has the following properties:
(i) It has at least two outcomes.
(ii) The number of all possible outcomes is known in advance.
(iii) It can be repeated any number of times under similar conditions.
Sample Space
The set of all possible outcomes of a random experiment is called a sample
space. It is usually denoted by Ω (omega).
In the random experiment in which a student takes an examination, suppose
the result of the examination can be in the form of grades 'A', 'B', 'C', 'D', and 'F';
then Ω = {'A', 'B', 'C', 'D', 'F'}.
In rolling a die, Ω = {1, 2, 3, 4, 5, 6}.
Event
Any subset of a sample space is called an event. It is a collection of outcomes of
an experiment. Events may be either simple or composite. Formally, if an
event consists of a single sample point of a sample space, it is called an
elementary or simple event; if it consists of two or more sample points, it is
referred to as a composite event.
Simple Events: A student passes an examination, a batsman hits a shot for
six, a die shows the number 2, etc.
Composite Event: A motor accident due to rash driving and failure of brakes, a
ball results in 1 run and a run-out during an over in a cricket match, etc.
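Because an event is simply a subset of the sample space, Python sets model these definitions quite naturally. The sketch below uses the die-rolling sample space from the text; the helper names `is_event` and `classify` are hypothetical.

```python
omega = {1, 2, 3, 4, 5, 6}                 # sample space for rolling a die


def is_event(candidate, sample_space):
    """An event is any subset of the sample space."""
    return candidate <= sample_space


def classify(event):
    """Simple event: one sample point; composite event: two or more."""
    if len(event) == 1:
        return "simple"
    if len(event) >= 2:
        return "composite"
    return "empty"


shows_two = {2}                                        # the die shows the number 2
shows_even = {x for x in omega if x % 2 == 0}          # the die shows an even number

print(is_event(shows_two, omega), classify(shows_two))
print(is_event(shows_even, omega), classify(shows_even))
print(is_event({7}, omega))                            # 7 is not a possible outcome
```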
Event Space
A set of all events relevant to a sample space is called event space. It is
usually denoted by ;t
Complementary Events
Complementary event for an event A is the event that A does not occur. For
event A it is denoted by A′ or Aᶜ. For example, passing of a student is
Independent Events
Two events $A_1$ and $A_2$ relevant to a sample space are said to be statistically
independent if the occurrence of $A_1$ does not affect the probability of
occurrence or non-occurrence of $A_2$.
Symbolically,
$$P(A_1 \cap A_2) = P(A_1)\,P(A_2).$$
The passing (or failing) of one student being statistically independent of the
passing or failing of other student(s) in the same examination, and the score on a
current ball being statistically independent of the result of the previous ball in a
cricket match, are examples of statistically independent events.
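The product rule for independent events can be checked by brute-force enumeration of a small sample space. The following Python sketch uses two fair dice, which is a hypothetical example rather than one taken from the text, and exact fractions to avoid rounding issues.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
omega = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for o in omega if event(o)), len(omega))


def first_even(o):
    return o[0] % 2 == 0


def second_high(o):
    return o[1] >= 5


def sum_at_least_10(o):
    return o[0] + o[1] >= 10


# Events about different dice: the product rule holds.
p_joint = prob(lambda o: first_even(o) and second_high(o))
independent = p_joint == prob(first_even) * prob(second_high)

# An event involving both dice: the product rule fails.
p_joint_sum = prob(lambda o: first_even(o) and sum_at_least_10(o))
dependent = p_joint_sum != prob(first_even) * prob(sum_at_least_10)

print(independent, dependent)
```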
Probability
Probability may be defined as the likelihood of the occurrence of an event.
A probability provides a quantitative description of the likely occurrence of
a particular event. In other words, it is a numerical measure of uncertainty.
Probability is conventionally expressed on a scale from 0 to 1; a rare event
has a probability close to 0, and a very common event has a probability close to
1.
Subjective Probability
A subjective probability describes an individual's personal judgment about
how likely a particular event is to occur. It is not based on any precise
computations but is often a reasonable assessment by a knowledgeable
person. A person's subjective probability of an event describes his/her
degree of belief in the event. For example, a cricket expert says that there
is more than an 80% chance that team A will win the tournament. A
planning minister guesses that at least 3/4 of the villages of the country will be
supplied with electricity by the end of next year.
Objective Probability
A probability that can be established theoretically or from historical data.
Objective probability can be defined through the following main approaches:
(i) Classical (A Priori) Definition of Probability
(ii) Relative-Frequency Definition of Probability
(iii) Axiomatic Definition of Probability
Conditional Probability
In many situations, once more information becomes available, we are able
to revise our estimates for the probability of further outcomes or events
happening. For example, suppose you go out for lunch at the same place
and time every Friday and you are served lunch within 15 minutes with
probability 0.9. However, given that you notice that the restaurant is
exceptionally busy, the probability of being served lunch within 15 minutes
may reduce to 0.7. This is the conditional probability of being served lunch
within 15 minutes given that the restaurant is exceptionally busy. The effect
of such information is to reduce the sample space by excluding some
$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n+1} P\left(\bigcap_{i=1}^{n} A_i\right)$$
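The alternating sum for the probability of a union of n events can be verified numerically. The Python sketch below assumes equally likely outcomes and uses a small hypothetical sample space; the function name `union_probability` is an illustrative choice.

```python
from fractions import Fraction
from itertools import combinations


def union_probability(events, sample_space):
    """P(A_1 or ... or A_n) via the alternating sum over all intersections."""
    total = Fraction(0)
    for size in range(1, len(events) + 1):
        sign = (-1) ** (size + 1)              # +, -, +, ... for sizes 1, 2, 3, ...
        for group in combinations(events, size):
            common = set.intersection(*group)  # outcomes shared by every event in the group
            total += sign * Fraction(len(common), len(sample_space))
    return total


omega = set(range(1, 13))                      # hypothetical: numbers 1 to 12, equally likely
events = [
    {x for x in omega if x % 2 == 0},          # even numbers
    {x for x in omega if x % 3 == 0},          # multiples of 3
    {x for x in omega if x > 9},               # numbers greater than 9
]

by_formula = union_probability(events, omega)
by_counting = Fraction(len(set.union(*events)), len(omega))
print(by_formula, by_counting)
```

Counting the union directly and applying the formula give the same probability, which is a handy sanity check when working such problems by hand.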
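Bayes' Theorem
Suppose $A_1$ and $A_2$ are mutually exclusive and exhaustive events with non-zero
probabilities. Then, for any event $B$ with non-zero probability,
$$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)}, \quad i = 1, 2.$$
Generally, for n events:
$$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{\sum_{j=1}^{n} P(A_j)\,P(B \mid A_j)}, \quad i = 1, 2, \ldots, n.$$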
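Bayes' theorem for any number of mutually exclusive and exhaustive events can be sketched in a few lines of Python. The function name `bayes_posteriors` and the prior and conditional probabilities below are hypothetical and serve only to illustrate the calculation.

```python
from fractions import Fraction


def bayes_posteriors(priors, likelihoods):
    """Return P(A_i | B) for each i, given P(A_i) and P(B | A_i).

    The events A_i are assumed mutually exclusive and exhaustive.
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(A_i) * P(B | A_i)
    evidence = sum(joint)                                  # P(B), total probability
    if evidence == 0:
        raise ValueError("P(B) must be non-zero")
    return [j / evidence for j in joint]


# Hypothetical numbers for two events A_1 and A_2.
priors = [Fraction(3, 5), Fraction(2, 5)]          # P(A_1), P(A_2)
likelihoods = [Fraction(1, 10), Fraction(1, 5)]    # P(B | A_1), P(B | A_2)

posteriors = bayes_posteriors(priors, likelihoods)
print(posteriors)
```

The posterior probabilities always sum to 1, and an event with a smaller prior can end up with the larger posterior when B is more likely under it.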
Counting Rules
Rule of Multiplication
If there are k procedures and the i-th procedure may be performed in $n_i$ ways (i
= 1, 2, ..., k), then all k procedures may be performed together in
$n_1 \times n_2 \times \cdots \times n_k$
ways. For example, if a person has 3 different pairs of shoes and 4 different
pairs of socks, then he may combine these pairs in 3 × 4 = 12 different
ways.
Rule of Addition
If there are k procedures and the i-th procedure may be performed in $n_i$ ways (i
= 1, 2, ..., k), then the number of ways in which one can perform procedure
1 or procedure 2 ... or procedure k is given by $n_1 + n_2 + \cdots + n_k$ (assuming
that only one procedure is performed, i.e., no two procedures can be
performed together).
Suppose a group of students is planning a trip and is thinking about using either a bus
or a train. If there are 3 different routes available when using the
bus for the trip and 2 different routes for the train, then there are 3 + 2 = 5 routes
available for that trip.
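Both counting rules can be confirmed by listing the possibilities explicitly. This Python sketch reuses the shoes-and-socks and bus-or-train examples from the text; the item labels are hypothetical placeholders.

```python
from itertools import product

# Rule of multiplication: choose one pair of shoes AND one pair of socks.
shoes = ["shoes-1", "shoes-2", "shoes-3"]
socks = ["socks-1", "socks-2", "socks-3", "socks-4"]
outfits = list(product(shoes, socks))          # every (shoes, socks) combination
print(len(outfits))                            # 3 x 4

# Rule of addition: travel by bus OR by train, never both at once.
bus_routes = ["bus-A", "bus-B", "bus-C"]
train_routes = ["train-A", "train-B"]
trip_options = bus_routes + train_routes       # the alternatives simply pile up
print(len(trip_options))                       # 3 + 2
```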
Permutations
The number of permutations of n different objects taking r at a time (0 ≤ r ≤ n) is
$${}^{n}P_r = \frac{n!}{(n-r)!}$$
In permutations, the order of the objects is important or meaningful.
For example, if one wants to calculate the number of ways in which 6 persons may be
seated on a bench having a capacity of 4 seats, then n = 6, r = 4,
and the answer is
$${}^{6}P_4 = \frac{6!}{(6-4)!} = 6 \times 5 \times 4 \times 3 = 360.$$
Combinations
The number of combinations of n different objects taking r at a time (0 ≤ r ≤ n) is
$${}^{n}C_r = \binom{n}{r} = \frac{n!}{r!\,(n-r)!}$$
In combinations, the order of the objects is not meaningful.
To differentiate between the cases of permutation and combination, consider
the example in which a committee of 4 is to be formed out of 7 statisticians.
The committee can be formed in
$${}^{7}C_4 = \binom{7}{4} = \frac{7!}{4!\,(7-4)!} = 35$$
different ways. Now, if we specify the positions of the committee members, say
one president, one secretary, one treasurer, and one speaker, then this becomes
a case of permutation, because here the positions (order) are meaningful, and
the number of ways in which such a committee can be constituted is
$${}^{7}P_4 = \frac{7!}{(7-4)!} = 840.$$
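The bench and committee examples can be reproduced with Python's standard library, which offers both closed-form counts (`math.perm`, `math.comb`) and explicit enumeration (`itertools`). This is just a sketch; the variable names are illustrative.

```python
import math
from itertools import combinations, permutations

# 6 persons, 4 seats on a bench: order matters.
seatings = math.perm(6, 4)

# Committee of 4 out of 7 statisticians: order does not matter.
committees = math.comb(7, 4)

# Same 4 out of 7, but with distinct positions (president, secretary, ...).
officer_slates = math.perm(7, 4)

print(seatings, committees, officer_slates)

# Cross-check by listing every arrangement explicitly.
statisticians = range(7)
listed_committees = list(combinations(statisticians, 4))
listed_slates = list(permutations(statisticians, 4))
print(len(listed_committees), len(listed_slates))
```

Each committee of 4 can be assigned its four positions in 4! = 24 ways, which is why the permutation count is 24 times the combination count.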
Questions-Answers
Q.3 What is the probability that a professor will meet his next class?
Is this an a priori probability or a posterior probability?
Ans. The answer will depend on our previous experience, that is, on the
frequency with which the professor has met his previous classes. This is an
example of posterior probability.
The Union of the elements of the set A and set B consists of the
elements that belong to set A or set B or both A and B.
Q.6 What is the difference between Sample Space and Event Space?
Ans. Sample space is the set of all possible outcomes of a random
experiment while the event space is the set of all events associated
with a sample space.
Q.10 What is the major purpose (or advantage) of using Bayes'
rule in decision making problems?
Ans. The advantage of this approach is that it allows the initial
forecasts in the form of probabilities to be revised upward or
downward as related additional information becomes available.
Q.13 What is the probability that the price of oil will be higher next year
than this year? Why?
Ans. Again, any value between 0 and 1 is acceptable here. Judging from
recent past experience, the probability could be quite high such as
0.8, or even 1.
Exercises
Q.1 A probability of 1 represents
(a) Impossibility
(b) An improbable event
(c) A 50 - 50 chance
( d) Certainty
Q.2 Three coins are tossed. What is probability that there will all be
heads?
(a) 1/8
(b) 1/4
(c) 1/3
(d) 3/2
Q.3 A cricket team captain wins the toss for three consecutive matches.
What is the probability that he will call correctly for the fourth
match?
(a) 1/16
(b) 1/8
(c) 1/4
(d) 1/2
(c) 3/4
(d) 11/12
(c) 2/3
(d) 3
Q.8 If a letter is chosen at random from the 10 letters of the word
STATISTICS, what is the probability that it is a vowel?
(a) 0.20
(b) 0.23
(c) 0.30
(d) 0.40
Q.17 A marginal probability might be found by any but which one of the
following?
(a) Adding together appropriate joint probabilities.
(b) Subtracting the sum of several marginal probabilities
from 1.
( c) Dividing the size of the appropriate event set by the
number of possible equally likely elementary events.
( d) Multiplying together all probabilities in the same column
or row.
Q.18 Which one of the following statements is not true?
(a) Mutually exclusive events are statistically dependent.
(b) Complementary events have probabilities that sum to 1.
(c) Opposite events are statistically independent.
(d) An experiment's elementary events are collectively
exhaustive and mutually exclusive.
Q.20 A fair coin is tossed 50 times; the expected number of heads is:
(a) 100
(b) 50
(c) 15
(d) None of these.
1. A 50-50 chance of rain means that the probability of rain
is 1/2.
2. For a random experiment, all the outcomes are known in
advance.
3. A or B is an event occurring whenever A occurs alone, B occurs
alone, or both A and B do not occur.
events.
18. Probability is a number between 0 and 1, exclusive.
19. If the two events, A and B, are mutually exclusive then the
probability that either one or the other will occur is P(A).P(B).
20. A probability can be certainty.
21. In a classical probability, we can determine a prior probability
based on a logical reasoning before any experiments take place.
22. An unconditional probability is also known as a marginal
probability.
23. A subjective probability may be nothing more than an educated
guess.
24. When using the relative frequency approach, probability figures
become less accurate for large number of observations.
25. The relative frequency approach to probability will provide
correct statistical probabilities after 100 trials.
26. A and B are independent events if P(A/B) = P(B).
27. Symbolically, a marginal probability is P(AB).
28. If A and B are independent events then P(A∩B) ≠ P(A).P(B).
Chapter 3
Random Variables
Random Variable
Probability Function
A probability function is a real valued function defined on the class of all
subsets of the sample space Ω; the value that is associated with a subset A is
denoted by P(A). The assignments of probability must satisfy the following
three axioms:
(i) $P(\Omega) = 1$
(ii) $P(A) \ge 0$
(iii) If $A_i$ ($i = 1, 2, \ldots$) is a sequence of mutually exclusive events
then
$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$$
Probability Distribution
A table listing all possible values that a random variable can take on
together with the associated probabilities is called probability distribution.
The probability distribution of a discrete random variable is a list of
probabilities associated with each of its possible values. It is also sometimes
called the probability function or the probability mass function.
More formally, the probability distribution of a discrete random variable X
is a function which gives the probability $P(x_i)$ that the random variable
equals $x_i$, for each value $x_i$:
$P(x_i) = P(X = x_i)$
It satisfies the following conditions:
(i) $0 \le P(x_i) \le 1$
(ii) $\sum_i P(x_i) = 1$
Probability Density Function (pdf)
The probability density function (pdf) of a continuous random variable is a
function which can be integrated to obtain the probability that the random
variable takes a value in a given interval.
More formally, the probability density function, f(x), of a continuous
random variable X is the derivative of the cumulative distribution function
F(x) (defined next):
$$f(x) = \frac{d}{dx} F(x).$$
If f(x) is a probability density function then it must obey two conditions:
(i) That the total probability for all possible values of the
continuous random variable X is 1, i.e.,
$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$
Expected Value
The expected value (or population mean) of a random variable indicates its
average or central value. It is a useful summary value (a number) of the
variable's distribution. Stating the expected value gives a general impression
of the behavior of some random variable without giving full details of its
probability distribution (if it is discrete) or its probability density function
(if it is continuous). The expected value of a random variable X is
symbolized by E(X) or μ.
If X is a discrete random variable with possible values $x_1, x_2, x_3, \ldots, x_n$, and
$P(x_i)$ denotes $P(X = x_i)$, then the expected value of X is defined by:
$$\mu = E(X) = \sum_i x_i P(x_i),$$
provided that the series is convergent.
If X is a continuous random variable with probability density function f(x),
then the expected value of X is defined by:
$$\mu = E(X) = \int x f(x)\,dx.$$
It is to be noted that
$$\mathrm{Var}(X) = E(X^2) - \{E(X)\}^2$$
Also
$$E(X + C) = \mu + C,$$
where C is any constant.
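The definitions of expected value and variance for a discrete random variable can be illustrated with a small sketch. The fair six-sided die used here is an assumed example, not one from the text, and the function names are hypothetical; exact fractions are used so that the identities hold without rounding error.

```python
from fractions import Fraction


def expected_value(dist):
    """E(X) = sum of x * P(x) over all possible values x."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Var(X) = E(X^2) - {E(X)}^2."""
    mean = expected_value(dist)
    second_moment = sum(x * x * p for x, p in dist.items())
    return second_moment - mean ** 2


# Assumed example: a fair die, P(x) = 1/6 for x = 1, ..., 6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

mean_die = expected_value(die)
var_die = variance(die)

# Adding a constant C shifts the mean by C and leaves the variance unchanged.
C = 10
shifted = {x + C: p for x, p in die.items()}

print(mean_die, var_die, expected_value(shifted), variance(shifted))
```

The probabilities sum to 1 and each lies between 0 and 1, so `die` satisfies the two conditions required of a discrete probability distribution.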
$$M_X(t) = \int e^{tx} f(x)\,dx = \int \left(1 + tx + \frac{t^2 x^2}{2!} + \cdots\right) f(x)\,dx = 1 + t\mu'_1 + \frac{t^2}{2!}\mu'_2 + \cdots,$$
where $\mu'_i$ is the i-th moment about origin.
Characteristic Function
$$\Pr(\mu - k\sigma < X < \mu + k\sigma) \ge 1 - \frac{1}{k^2}$$
Questions-Answers
Q.J If a random variable (r.v) is a real valued function defined on
sample space, what is domain and range for r. v.?
Ans. The domain is the sample space while the range is the real line.
Determine,
a) P(T ≤ 6) b) P(3 ≤ T ≤ 4.5) c) P(−∞ ≤ T ≤ 3.75)
d) P(T > 9.45) e) P(T ≤ 15) f) P(T = π)
Q.6 In a given business venture, a man can make a profit of Rs. 1000
or a loss of Rs. 500. If the probability of a profit is 0.6,
what is the expected profit in the venture?
Q.8 If we want to get moments about mean rather than origin then
what change should we make in computing mgf?
Ans. We should compute mgf taking mean as origin in place of zero, i.e.,
$$M_{X-\mu}(t) = E\left(e^{t(X-\mu)}\right) = e^{-\mu t} M_X(t)$$
Exercise 3 (True/False)
Read the following statements carefully and indicate which statement is
"True" or "False":
Chapter 4
Discrete
Probability Distributions
Bernoulli Trial
There are many practical and experimental situations where the outcome of
each repeated random trial can result in just two categories, namely
"success" and "failure", or a dichotomy of results can be found. For example,
two possible outcomes of each exam of a student can be passing or failing
of that student, correct or wrong answer, catching or missing a bus on a
stop, hitting or missing of a target, infected or non-infected from some
disease after result of a test, etc. Such trials are called Bernoulli Trials.
Simply speaking, a trial of a random experiment whose outcomes can be
classified into two categories, "success" or "failure", is referred to as a Bernoulli
trial.
Binomial Distribution
An experiment having n (say) Bernoulli trials with the following properties
is called a Binomial Experiment:
Distribution Summary:
pmf: $P(X = x) = \binom{n}{x} p^x q^{n-x}$; x = 0, 1, 2, ..., n,
where
X = Number of successes in n trials,
q = 1 - p
Mean: np
Variance: npq
Skewness: $\frac{1-2p}{\sqrt{npq}}$
Kurtosis: $\frac{1-6pq}{npq}$
mgf: $(q + pe^t)^n$
Char. func.: $(q + pe^{it})^n$
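The binomial summary can be checked numerically. The sketch below assumes the standard binomial pmf, $P(X = x) = \binom{n}{x} p^x q^{n-x}$, and uses illustrative values n = 10 and p = 0.3 that are not taken from the text; it confirms that the probabilities sum to 1 and that the mean and variance agree with np and npq.

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a binomial variable with n trials and success probability p."""
    q = 1.0 - p
    return math.comb(n, x) * p ** x * q ** (n - x)


n, p = 10, 0.3  # illustrative values only
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)
mean = sum(x * px for x, px in enumerate(pmf))
var = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(total, mean, var)  # compare with 1, n*p and n*p*(1-p)
```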
Hypergeometric Distribution
There are many experiments in which the condition of independent trials is
not met and, as a result, the probability of success does not remain constant
from trial to trial. Such experiments are called hypergeometric experiments
with the following properties:
Distribution Summary:
Parameters: Three parameters, N, k and p,
N = Number of units in the set or population,
k = Number of successes (units of interest) in
the set or population,
p = $\frac{k}{N}$ = Probability of success.
pmf: $P(X = x) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}$,
where
X = Number of successes (units of interest)
in n (sample size of items selected,
without replacement),
x = 0, 1, 2, ..., n, and x = 0, 1, 2, ..., k.
Mean: $np = n\frac{k}{N}$
Variance: $npq\left(\frac{N-n}{N-1}\right) = n\,\frac{k}{N}\left(1 - \frac{k}{N}\right)\left(\frac{N-n}{N-1}\right)$
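As a rough check of the hypergeometric summary, the sketch below evaluates the pmf for an assumed population (N = 20 units, k = 7 successes, sample of n = 5 drawn without replacement; these numbers are illustrative only) and compares the mean and variance with the closed forms n(k/N) and npq(N - n)/(N - 1).

```python
import math


def hypergeometric_pmf(x: int, N: int, k: int, n: int) -> float:
    """P(X = x): x successes in a sample of n drawn without replacement
    from N units of which k are successes."""
    return math.comb(k, x) * math.comb(N - k, n - x) / math.comb(N, n)


N, k, n = 20, 7, 5  # illustrative values only
# x cannot exceed n or k, and n - x cannot exceed the N - k failures available.
support = range(max(0, n - (N - k)), min(n, k) + 1)
pmf = {x: hypergeometric_pmf(x, N, k, n) for x in support}

total = sum(pmf.values())
mean = sum(x * px for x, px in pmf.items())
var = sum((x - mean) ** 2 * px for x, px in pmf.items())

p = k / N
q = 1 - p
print(total, mean, var)
print(n * p, n * p * q * (N - n) / (N - 1))  # closed-form mean and variance
```

The factor (N - n)/(N - 1) is what distinguishes the variance from the binomial value npq; it approaches 1 when the sample is small relative to the population.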
Geometric Distribution
In many practical situations, an experimenter is interested in the first
success in the experiment. To obtain this, he repeats the experiment until he
gets the first success. For example, a researcher keeps on taking blood
samples until he gets O-ve blood group. To handle such a situation, a
distribution called the geometric distribution is presented here, and an experiment in
which trials are repeated until the first success is obtained is called a geometric
experiment. A geometric experiment has the following properties:
Distribution Summary:
Mean: $\frac{q}{p}$
Variance: $\frac{q}{p^2}$
mgf: $\frac{p}{1 - qe^t}$
x = 1, 2, ...,
where
X = Number of trials to have first success,
p = Probability of success,
q = 1 - p,
also p is the only parameter of this distribution.
Mean: $\frac{1}{p}$
Variance: $\frac{q}{p^2}$
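A quick numerical sketch of the "number of trials to first success" form: it assumes the standard pmf $P(X = x) = q^{x-1}p$ for x = 1, 2, ..., and an illustrative p = 0.25 not taken from the text. The infinite sums are truncated at 2000 terms, where the remaining tail is negligible.

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(X = x): first success occurs on trial x (x = 1, 2, ...)."""
    return (1.0 - p) ** (x - 1) * p


p = 0.25  # illustrative value only
q = 1.0 - p
xs = range(1, 2001)  # truncation point; q**2000 is effectively zero

total = sum(geometric_pmf(x, p) for x in xs)
mean = sum(x * geometric_pmf(x, p) for x in xs)
var = sum((x - mean) ** 2 * geometric_pmf(x, p) for x in xs)

print(total, mean, var)  # compare with 1, 1/p and q/p**2
```

Counting failures before the first success instead (Y = X - 1) lowers the mean by 1, to q/p, and leaves the variance q/p² unchanged.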
Distribution Summary:
Mean: $\frac{k}{p}$
Variance: $\frac{kq}{p^2}$
mgf: $\left(\frac{pe^t}{1 - qe^t}\right)^k$
Y = X - k,
$$P(Y = y) = \binom{y+k-1}{k-1} p^k q^y; \quad y = 0, 1, 2, \ldots,$$
where
Y = Number of failures preceding k successes.
The mean and variance for Y will be changed accordingly.
Poisson Distribution
A distribution often used to compute probabilities for random variables
distributed over time and space is the Poisson distribution. For example,
a) Events that occur in one time interval (or region or space) are
independent of those in any other non-overlapping time interval (or
region or space).
b) For a small time interval (or region or space), the probability that
an event occurs is proportional to the length of the time interval (or
region or space).
c) The probability that two or more events occur in a very small time
interval (or region or space) is so small that it can be neglected.
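Under the three conditions above, the count of events in an interval follows a Poisson distribution. The sketch below assumes the standard pmf $P(X = x) = \lambda^x e^{-\lambda}/x!$ with an illustrative rate λ = 3 (not a value from the text) and checks that both the mean and the variance equal λ.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson variable with mean lam."""
    return lam ** x * math.exp(-lam) / math.factorial(x)


lam = 3.0        # illustrative average number of events per interval
xs = range(60)   # truncation point; the tail beyond 60 is negligible for lam = 3

total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
var = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)

prob_exactly_two = poisson_pmf(2, lam)
print(total, mean, var, prob_exactly_two)
```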
Distribution Summary:
Variance:
Multinomial Distribution
If a trial's outcomes can be classified into more than two categories, a
binomial experiment becomes a multinomial experiment. For example, a
finished product may be classified as excellent, good, average; a student's
grade may be A, B, C, D, or F, etc.
A multinomial experiment has the following properties:
Distribution Summary:
Questions-Answers
Exercises
Q.5 If X has binomial distribution with parameters p and n then X/n has
the variance
(a) n²pq
(b) npq
(c) pq/n
(d) pq/n²
Q.10 Which of the following is the most reasonable condition for the
binomial approximation to the hypergeometric distribution?
(a) N = 200, n = 12
(b) N = 500, n = 20
(c) N = 640, n = 30
(d) N = 800, n = 50
Q.11 Suppose we have a Poisson distribution with λ equal to 2; then the
probability of having exactly 10 occurrences is:
(a) $\frac{2^{10} e^{-10}}{10!}$
(b) $\frac{2^{10} e^{-2}}{2!}$
(c) $\frac{10^{2} e^{-10}}{10!}$
(d) $\frac{2^{10} e^{-2}}{10!}$
Chapter 5
Continuous
Probability Distributions
Uniform Distribution
A continuous random variable X is said to follow a Uniform distribution
with parameters a and b, written X ~ U(a,b), if its probability density
function is constant within a finite interval [a,b], and zero outside this
interval (with a less than b). In other words, the values of a
uniform random variable are uniformly distributed over an interval. For
example, if buses arrive at a given bus stop every 15 minutes, and you
arrive at the bus stop at a random time, the time you wait for the next bus to
arrive could be described by a uniform distribution over the interval from 0
to 15 minutes.
One of the most important applications of the uniform distribution is in the
generation of random numbers. That is, almost all random number
generators generate random numbers on the (0,1) interval. For other
distributions, some transformation is applied to the uniform random
numbers.
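One common transformation of this kind is the inverse-transform method: if U is uniform on (0,1) and F is a continuous, strictly increasing cdf, then F⁻¹(U) has cdf F. The sketch below applies it to an exponential distribution, whose cdf 1 - e^{-λx} inverts to -ln(1 - u)/λ. The function name, the rate of 2, the sample size and the seed are all illustrative assumptions, and other transformations are used in practice for other distributions.

```python
import math
import random


def exponential_from_uniform(u: float, rate: float) -> float:
    """Map a uniform number u in [0, 1) to an exponential variate by
    inverting the cdf F(x) = 1 - exp(-rate * x)."""
    return -math.log(1.0 - u) / rate


rng = random.Random(12345)  # fixed seed so the run is reproducible
rate = 2.0                  # illustrative rate; the mean should be near 1/rate

samples = [exponential_from_uniform(rng.random(), rate) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)

# u = 0.5 maps to the median of the target distribution, ln(2) / rate.
median_value = exponential_from_uniform(0.5, rate)

print(sample_mean, median_value)
```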
The following are the plots of the uniform probability density function and
cumulative distribution function:
[Figure: uniform probability density function and cumulative distribution function]
Distribution Summary:
Parameters: a, b ∈ (−∞, ∞)
cdf: $F(x) = 0$ for $x < a$; $\frac{x-a}{b-a}$ for $a \le x < b$; $1$ for $x \ge b$
Mean: $\frac{a+b}{2}$
Median: $\frac{a+b}{2}$
Variance: $\frac{(b-a)^2}{12}$
Skewness: 0
Kurtosis: $-\frac{6}{5}$
mgf: $\frac{e^{tb} - e^{ta}}{t(b-a)}$
Char. func.: $\frac{e^{itb} - e^{ita}}{it(b-a)}$
Exponential Distribution
The exponential distribution is used to model Poisson processes, which are
situations in which an object initially in state A can change to state B with
constant probability per unit time λ. The time at which the state actually
changes is described by an exponential random variable with parameter λ.
The exponential distribution, also known as the waiting-time distribution,
describes the amount of time or distance between the occurrence of random
events such as the time between major earthquakes or the time between two
consecutive goals in a football match or the time until you get a seat in a bus,
etc. This distribution is also used in connection with estimating the length of
material life, or the length of time a process might take.
The following are the plots of the exponential probability density function and
cumulative distribution function:
[Figure: exponential probability density function and cumulative distribution function]
Distribution Summary:
Parameters: λ > 0; rate of a process (the mean time is 1/λ)
pdf: $\lambda e^{-\lambda x}$ for $x \ge 0$; $0$ for $x < 0$,
where X = time elapsed
Mean: $\frac{1}{\lambda}$
Median: $\frac{\ln 2}{\lambda}$
Mode: 0
Variance: $\lambda^{-2}$
Skewness: 2
Kurtosis: 6
mgf: $\left(1 - \frac{t}{\lambda}\right)^{-1}$
Normal Distribution
The Normal Distribution, also called Gaussian distribution, is an extremely
important probability distribution in many fields. The fundamental
importance of the normal distribution is its use as a model of quantitative
phenomena in the natural and behavioral sciences. A variety of
psychological test scores and physical phenomena can be well
approximated by a normal distribution. For example, height at a given age
for a given gender in a given racial group is adequately described by a
normal random variable even though heights must be positive.
For both theoretical and practical reasons, the normal distribution is
probably the most important distribution in Statistics. For example, many
classical Statistical tests are based on the assumption that the data follow a
normal distribution. This assumption should be tested before applying these
tests.
The following are the plots of the normal probability density function and
cumulative distribution function:
[Figure: normal probability density function and cumulative distribution function]
Distribution Summary:
Parameters: $-\infty < \mu < \infty$; $\sigma^2 > 0$
pdf: $\frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$, $-\infty < x < \infty$
cdf: $\frac{1}{2}\left(1 + \mathrm{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right)$
Mean: μ
Median: μ
Mode: μ
Variance: σ²
Skewness: 0
Kurtosis: 3
mgf: $\exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right)$
Char. func.: $\exp\left(i\mu t - \frac{\sigma^2 t^2}{2}\right)$
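The cdf in the summary is written in terms of the error function, which Python exposes as `math.erf`. The sketch below uses it to compute the probability that a normal variable falls within k standard deviations of its mean; the helper names and the example values μ = 10, σ = 3 are assumptions made for illustration.

```python
import math


def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Normal cdf: 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def prob_within(k: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """P(mu - k*sigma < X < mu + k*sigma) for a normal variable X."""
    return normal_cdf(mu + k * sigma, mu, sigma) - normal_cdf(mu - k * sigma, mu, sigma)


for k in (1, 2, 3):
    # The result does not depend on mu or sigma, only on k.
    print(k, round(prob_within(k, mu=10.0, sigma=3.0), 4))
```

Because the normal curve is symmetric about μ, `normal_cdf(mu, mu, sigma)` is exactly one half.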
Gamma Distribution
Gamma densities provide a fairly flexible class for modeling nonnegative
random variables. The exponential distribution becomes the gamma
distribution when we consider the sum of independently identically distributed
exponential variates. The gamma distribution can be used to describe the
probability that α events will occur within a given time period. Contrast it
with the exponential distribution, which describes the probability that one
event will occur.
The following are the plots of the gamma probability density function and
cumulative distribution function:
[Figure: gamma probability density function and cumulative distribution function plots for several shape values (panels include gamma = 0.5, 1 and 2), x from 0 to 10]
Distribution Summary:
pdf: $\frac{1}{\Gamma(\alpha)\beta^{\alpha}}\, x^{\alpha-1} e^{-x/\beta}$, $x \ge 0$
Mean: $\alpha\beta$
Mode: $(\alpha - 1)\beta$ for $\alpha \ge 1$
Variance: $\alpha\beta^2$
Skewness: $\frac{2}{\sqrt{\alpha}}$
Kurtosis: $\frac{6}{\alpha}$
mgf: $(1 - \beta t)^{-\alpha}$ for $t < 1/\beta$
Chi-Square Distribution
In probability theory and Statistics, the Chi-square distribution (also Chi-squared
or χ² distribution) is one of the theoretical probability distributions
most widely used in inferential statistics, i.e. in statistical significance tests.
It is useful because, under reasonable assumptions, easily calculated
quantities can be proven to have distributions that approximate to the
Chi-square distribution if the null hypothesis is true.
If $X_i$ ($i = 1, \ldots, k$) are independent normal random variables with means
$\mu_i$ and variances $\sigma_i^2$, then the random variable
$$Z = \sum_{i=1}^{k} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^2$$
is distributed according to the Chi-square distribution.
The Chi-square distribution has one parameter: k, a positive integer which
specifies the number of degrees of freedom (i.e. the number of $X_i$). The
Chi-square distribution is a special case of the gamma distribution.
The best-known situations in which the Chi-square distribution is used are
the common Chi-square tests for goodness of fit of an observed distribution
to a theoretical one, and of the independence of two criteria of classification
of qualitative data. Two common examples are the Chi-square test for
independence in an R x C contingency table and the Chi-square test to
determine if the standard deviation of a population is equal to a
pre-specified value. However, many other statistical tests lead to a use of this
distribution. One example is Friedman's analysis of variance by ranks.
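The statement that the Chi-square distribution is a special case of the gamma distribution can be checked directly from the two density formulas. The sketch below assumes the gamma density with shape α and scale β, $x^{\alpha-1}e^{-x/\beta}/(\Gamma(\alpha)\beta^{\alpha})$, and the standard Chi-square density with k degrees of freedom; setting α = k/2 and β = 2 should make them coincide. The function names and evaluation points are illustrative.

```python
import math


def gamma_pdf(x: float, alpha: float, beta: float) -> float:
    """Gamma density with shape alpha and scale beta, for x > 0."""
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta ** alpha)


def chi_square_pdf(x: float, k: int) -> float:
    """Chi-square density with k degrees of freedom, for x > 0."""
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))


# Compare the two densities at a few points for several degrees of freedom.
for k in range(1, 6):
    for x in (0.5, 1.0, 3.0, 7.5):
        assert math.isclose(chi_square_pdf(x, k), gamma_pdf(x, k / 2, 2.0), rel_tol=1e-12)

print("chi-square(k) matches gamma(alpha = k/2, beta = 2) at all test points")
```

With α = k/2 and β = 2 the gamma mean αβ and variance αβ² reduce to k and 2k, the Chi-square mean and variance.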
The following are the plots of the Chi-Square probability density function
and cumulative distribution function:
[Figure: Chi-Square probability density function and cumulative distribution function plots for several degrees of freedom, x from 0 to 10]
Distribution Summary:
pdf: $\frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{\frac{k}{2}-1} e^{-\frac{x}{2}}$ for $x \ge 0$; 0 otherwise
Mean: k
Median: approximately $k - \frac{2}{3}$
Variance: 2k
Skewness: $\sqrt{8/k}$
Kurtosis: $12/k$
Char. func.:
Student's t-Distribution
The t-distribution or Student's t-distribution is a probability distribution that
arises in the problem of estimating the mean of a normally distributed
population when the sample size is small. It is the basis of the popular
Student's t-test for the statistical significance of the difference between two
sample means, and for confidence intervals for the difference between two
population means. The Student's t-distribution is a special case of the
generalized hyperbolic distribution.
[Figure: Student's t probability density function and cumulative distribution function plots for several degrees of freedom (panels include 1 df and 30 df)]
Distribution Summary:
Median: 0
Mode: 0
Skewness: 0 for ν > 3
Kurtosis: $\frac{3\nu - 6}{\nu - 4}$ for ν > 4
F-Distribution
The F-distribution is a continuous probability distribution. It is also known
as Snedecor's F-distribution or the Fisher-Snedecor distribution (after R.A.
Fisher and George W. Snedecor). F stands for Sir Ronald Fisher, English
geneticist and statistician.
The distribution is used in the analysis of variance and is a function of the
ratio of two independent random variables each of which has a Chi-square
distribution and is divided by its number of degrees of freedom.
The F distribution is used in many cases for the critical regions for
hypothesis tests and in determining confidence intervals. Two common
examples are the analysis of variance and the F-test to determine if the
variances of two populations are equal.
[Figure: F probability density function and cumulative distribution function plots (panels include F PDF with (1, 10), (10, 1) and (10, 10) df), x from 0 to 5]
Distribution Summary:
pdf: $\frac{\sqrt{\dfrac{(\nu_1 x)^{\nu_1}\,\nu_2^{\nu_2}}{(\nu_1 x + \nu_2)^{\nu_1 + \nu_2}}}}{x\,B\left(\frac{\nu_1}{2}, \frac{\nu_2}{2}\right)}$
Mean:
Mode:
Variance:
Skewness:
Questions-Answers
devices, because the "failure rates" here are not constant: more
failures occur for very young and for very old systems.
In physics, if one observes a gas at a fixed temperature and
pressure in a uniform gravitational field, the heights of the various
molecules also follow an approximate exponential distribution.
Exercises
Q.2 For continuous random variable the area under the probability
distribution curve between any two points is always:
(a) Greater than one
(b) Less than zero
( c) Equal to one
(d) In the range zero and one
Q.5 If X is a uniform variate U(5, 10), then the mean of X is
(a) 5
(b) 7.5
(c) 10
(d) 15
Q.6 If X is a uniform variate U(5, 10), then the variance of X is
(a) 0.417
(b) 2.08
(c) 7.5
(d) 5
(b) 1
(c) 6
(d) 4
Q.10 If $X \sim N(\mu, \sigma^2)$ and a and b are real numbers, then mean of
(aX + b) is
(a) a+b
(b) a+b
(c) a
(d) a+b
Q.13 The total area under a normal distribution curve to the left of the
mean is always:
(a) 1
(b) 0
(c) 0.5
(d) 0.9
Q.18 If X follows t-distribution with ν d.f. then the distribution of X² is:
Q.19 The area under the normal curve within two standard deviations of
the mean is:
(a) 68.26%
(b) 95.44%
(c) 99.73%
(d) 99.99%
Chapter 6
Regression & Correlation
Regression
Regression Analysis
Regression analysis refers to the methods of describing the functional
relationship between the dependent and independent variables. It is used to
predict the value of one variable (dependent) on the basis of other (independent)
variables.
Scatter Diagram
When we plot data in such a way that we obtain dots showing the relation
between dependent and independent variables, it is called a scatter diagram.
Regression Line
A regression line is a line drawn through the points on a scatter plot to
summarize the relationship between the variables being studied. When it
slopes down (from top left to bottom right), this indicates a negative or
inverse relationship between the variables; when it slopes up (from bottom
left to top right), a positive or direct relationship is indicated. The regression
line often represents the regression equation on a scatter plot.
If this line is a straight line then the regression is called linear regression;
otherwise, it is referred to as non-linear regression.
Regression Equation/Model
A regression equation (model) allows us to express the relationship between
two (or more) variables algebraically. It indicates the nature of the
Simple Regression
Simple regression investigates the effect of one independent variable on the
dependent variable.
Multiple Regression
Multiple regression investigates the effect of two or more independent
variables on the dependent variable.
Also $y - \hat{y} = e$ = residual.
Standard Error of Estimate
A measure of the scatter of actual values of the dependent variable from
their estimated values. It can be used to set up confidence intervals for the
actual values of the dependent variables.
Total Variation
A measure of the variation of the actual values of the dependent variable
from their mean.
Symbolically,
Total Sum of Squares = TSS = $\sum (y - \bar{y})^2$
Total Variation = Explained Variation + Unexplained Variation.
Explained Variation
A measure of the variation of the estimated values of the dependent variable
from the mean of actual values.
Symbolically,
Explained Sum of Squares = ESS = $\sum (\hat{y} - \bar{y})^2$
Unexplained Variation
A measure of the variation of the actual values of the dependent variable
from the estimated values of that variable.
Symbolically,
Residual Sum of Squares = RSS = $\sum (y - \hat{y})^2$
Coefficient of Determination
It measures the relative amount of variation in the dependent variable that
has been explained by variation in the independent variable. It is the
measure of strength of association that exists between variables.
It is denoted by R².
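The decomposition of total variation can be illustrated with a least-squares line fitted to a small data set. Everything in the sketch below is assumed for illustration: the five (x, y) pairs are made up, `fit_line` is a hypothetical helper implementing ordinary least squares for one independent variable, and R² is computed as ESS/TSS.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (simple regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)
fitted = [intercept + slope * x for x in xs]
mean_y = sum(ys) / len(ys)

tss = sum((y - mean_y) ** 2 for y in ys)                # total variation
ess = sum((yh - mean_y) ** 2 for yh in fitted)          # explained variation
rss = sum((y - yh) ** 2 for y, yh in zip(ys, fitted))   # unexplained variation

r_squared = ess / tss
print(intercept, slope)
print(tss, ess + rss, r_squared)
```

The identity TSS = ESS + RSS holds for a least-squares line with an intercept, which is why ESS/TSS and 1 - RSS/TSS give the same R².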
Correlation
Correlation Analysis
Correlation analysis refers to the methodology of measuring the
interdependence between two or more variables. For this measure, we
compute the coefficient of correlation, whose value ranges from -1 to +1.
Negative correlation shows that the series under consideration are
moving in opposite directions while positive correlation shows that both
series jointly move in the same direction, i.e., either both increase or
both decrease. Zero correlation shows the absence of a linear relationship
between the variables; by itself it does not establish statistical independence.
Negative correlation shows an indirect relation among the
series while positive correlation shows a direct relation. Furthermore, -1 or
Partial Correlation
A partial correlation is used to measure the degree of linear relationship
between any two variables in a multivariable problem by removing any
common relationship or influence with all other variables.
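As an illustration, the sketch below computes a first-order partial correlation between x and y with the influence of a third variable z removed, using the standard formula r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The function names and the data are hypothetical and are not taken from the text.

```python
from math import sqrt

def pearson_r(u, v):
    """Simple (zero-order) correlation coefficient between two series."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)

def partial_r(x, y, z):
    """First-order partial correlation of x and y, holding z fixed."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical data for three variables.
x = [2, 4, 6, 8, 10]
y = [1, 3, 2, 5, 4]
z = [1, 2, 4, 3, 5]

r_xy = pearson_r(x, y)
r_xy_given_z = partial_r(x, y, z)
print(round(r_xy, 3), round(r_xy_given_z, 3))
```

Like the simple coefficient, the partial coefficient lies between -1 and +1.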
Transformation to Linearity
Transformations allow us to change all the values of a variable by using
some mathematical operation. For example, we can change a number, a group
of numbers, or an equation by multiplying or dividing by a constant or
taking the square root. A transformation to linearity is a transformation of a
response variable, or independent variable, or both, which produces an
approximate linear relationship between the variables.
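A minimal sketch of one such transformation, assuming an exponential relationship y = 3 e^{0.5x} (an invented example): taking the natural logarithm of the response gives ln y = ln 3 + 0.5x, which is exactly linear in x.

```python
from math import exp, log, sqrt

def pearson_r(u, v):
    """Correlation coefficient, used here as a measure of linearity."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)

x = [0, 1, 2, 3, 4, 5]
y = [3 * exp(0.5 * xi) for xi in x]   # curved (exponential) relationship
log_y = [log(yi) for yi in y]         # transformed response

r_raw = pearson_r(x, y)               # below 1: relationship is not linear
r_transformed = pearson_r(x, log_y)   # essentially 1: linear after transformation
print(round(r_raw, 4), round(r_transformed, 4))
```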
Questions-Answers
Q.1 How would you decide whether to choose regression or correlation
analysis for the study of a relationship?
Ans. If we want to measure the dependence of one variable on one or
more variables, we use regression analysis. On the other hand, if
we want to measure the interdependence among two or more
variables, we use correlation analysis.
Q.4 Before correlating two series, how can you determine which will
be the dependent variable?
Ans. The dependent variable is the one for which estimates and
forecasts will be made.
Q.5 Is the computation of the coefficient of correlation part of
regression analysis?
Ans. No, although a correlation analysis usually makes a regression
analysis more meaningful and useful.
Q.6 Can a coefficient of correlation be computed without computing
a regression equation?
Ans. Yes.
$\beta_{YX} = \dfrac{\operatorname{Cov}(X, Y)}{\sigma_X^2}$
Since V(X) and V(Y) both have positive values, only the
quantity Cov(X, Y) can have a positive or negative sign. According
to the above equations, if Cov(X, Y) is positive then r and β are
positive, and vice versa. So it is clear that, due to the common
numerator Cov(X, Y), both r and β have the same sign.
correlation?
Ans. The partial correlation coefficient receives the same sign as its
corresponding net regression coefficient.
Q.21 If two variables are closely correlated, then the movements in one
variable cause the movements in another. Why or why not?
Ans. No, there is no proof of causation in correlation theory. A
correlation analysis never proves or disproves that there is a causal
relationship between two variables. All it does is measure the
closeness of the relationship.
Q.23 What values does r assume if all the sample points fall on the
same straight line and if the line has:
(a) a positive slope (b) a negative slope.
Ans. (a) +1 (b) -1.
Q.24 A police research cell has shown that the crime rate is correlated
with the number of unemployed people in Pakistan. Would you
expect the correlation to be positive or negative?
Ans. Positive, because as number of unemployed people increases, the
crime will also increase.
Regression coefficient of X on Y is: $b_{xy} = r \dfrac{\sigma_x}{\sigma_y}$
G.M. of regression coefficients: $\sqrt{b_{yx} \cdot b_{xy}} = \sqrt{r \dfrac{\sigma_y}{\sigma_x} \cdot r \dfrac{\sigma_x}{\sigma_y}} = r$
Q.33 If two regression coefficients are 0.8 and 0.2, what would be the
value of the coefficient of correlation?
Ans. $r^2 = b_{yx} \cdot b_{xy} = (0.8)(0.2) = 0.16$,
r = 0.4, positive since byx and bxy are both positive.
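The same arithmetic can be written as a small helper. This is only a sketch: the function name is hypothetical, and it relies on the facts used in the answer above, namely that r is the geometric mean of the two regression coefficients and carries their common sign.

```python
from math import copysign, sqrt

def correlation_from_regression(b_yx, b_xy):
    """Recover r from the two regression coefficients (they must share a sign)."""
    product = b_yx * b_xy
    if product < 0:
        raise ValueError("regression coefficients must have the same sign")
    # r takes the sign shared by both regression coefficients.
    return copysign(sqrt(product), b_yx)

r = correlation_from_regression(0.8, 0.2)
print(round(r, 4))
```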
Exercises
Exercise 6 (MCQs)
Q.7 For the regression equation Y = 10 + 2X, the Y intercept is:
(a) 10 (b) 2 (c) 0 (d) -2
calculated.
(c) The units in which X and Y are measured will not affect
the value of r.
(d) The correlation coefficient can be calculated only after the
estimated regression line has been found.
Q.14 Whenever predictions are made from the estimated regression line,
the relation between X and Y is assumed to be:
(a) Direct (b) Inverse
(c) Linear (d) Perfect
Q.18 A larger sample size can be expected to achieve all but which one
Q.20 In multiple regression analysis, the purpose of solving the normal
equations is to find:
(a) The standard error of estimate.
(b) The constant and coefficients in the least squares
relationship.
(c) The number of independent variables in the least squares
relationship.
(d) The variance around the least squares relationship.
Chapter 7
Sampling
Basics of Sampling
Population
A group of individuals or entities about which you wish to know something
and from which a sample will be taken, is called population.
Sampling Unit
This is a single member of a population; e.g., if the population is defined to
be 100 trees on a lot, then the sampling unit is a single tree.
Sample
A sample is a sub-collection of elements drawn from a population; a subset
of a population (a collection of sampling units), with the assumption that it
represents the whole population.
Sampling
The procedure by which a few subjects are chosen from the universe
(population) to be studied in such a way that the sample can be used to
estimate the same characteristics in the total is referred to as sampling.
Sampling Frame
A list of all the sampling units. It includes lists that are available or that are
constructed from different sources specifically for the study. Directories,
membership or customer lists, even invoices or credit card receipts can
serve as a sampling frame.
Sample Design
A program layout including all the procedures to take a sample.
Parameter
A summary, typically numerical, of a variable (or variables) over the entire
population.
Statistic
A summary, typically numerical, of a variable (or variables) over a sample.
Statistical Inference
The process of making a statement about a population on the basis of
sample information.
Example:
Consider a study to find out the average number of cups of tea that office
workers take daily in a particular city:
Census
The method that collects data from all members of the population, rather
than from a selected subset of the population.
Sampling Techniques
There are five main types of non-probability sampling that we will review
more closely:
Purposive Sampling
Convenience Sampling
Quota Sampling
Judgment Sampling
Snowball Sampling
Purposive Sampling
In purposive sampling, the researcher selects the units with some purpose in
mind, for example, students who live in dorms on campus, or experts on
urban development. This is where the researcher targets a group of people
believed to be typical or average, or a group of people specially picked for
some unique purpose. The researcher never knows if the sample is
representative of the population, and this method is largely limited to
exploratory research. Sampling of cricket players, sampling of dresses in a
market, and sampling of jewelry are a few examples of purposive
sampling.
Convenience Sampling
A convenience sample is used when a researcher simply stops anybody in
the street who is prepared to stop, or when a researcher wanders round a
business, a shop, a restaurant, etc., asking the people he meets whether they
will answer his questions. In other words, the sample comprises subjects
who are simply available in a convenient way to the researcher. There is no
randomness and the likelihood of bias is high. One cannot draw any
meaningful conclusions from the results one obtains.
Quota Sampling
In quota sampling, the researcher constructs quotas for different types of
units. For example, to interview a fixed number of shoppers at a mall, half
of whom are male and half of whom are female. It is widely used in opinion
polling and market research. Interviewers are each given a quota of subjects
of a specified type to attempt to recruit, e.g., an interviewer might be told to
go out and select 20 working men and 20 working women, 10 school girls
and 10 school boys so that they could interview them about their television
viewing.
Judgement Sampling
In judgement sampling, the researcher or some other "expert" uses his/her
judgement in selecting the units from the population for study based on the
population parameters.
This type of sampling technique might be the most appropriate if the
population to be studied is difficult to locate or if some members are
thought to be better (more knowledgeable, more willing, etc.) than others to
interview.
Snowball Sampling
It is also called network, chain, or reputational sampling. This method
begins with a few people or cases and then gradually increases the sample
size as new contacts are mentioned by the people you started out with.
A sampling procedure that assures that each element in the population has
an equal chance of being selected is referred to as simple random sampling.
Let us assume you had a school with 1000 students, divided equally into
boys and girls, and you wanted to select 100 of them for further study. You
might put all their names in a drum and then pull 100 names out. Not only
does each person have an equal chance of being selected, we can also easily
calculate the probability of a given person being chosen, since we know the
sample size (n) and the population size (N) and it becomes a simple matter
of division:
n/N x 100 or 100/1000 x 100 = 10%.
This means that every student in the school has a 10% or 1 in 10 chance of
being selected using this method.
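The drum-drawing procedure can be sketched with Python's standard `random` module. The student labels and the fixed seed below are assumptions made only so that the example is reproducible.

```python
import random

rng = random.Random(42)  # fixed seed, for reproducibility only

# Hypothetical sampling frame of N = 1000 students.
students = [f"student_{i}" for i in range(1, 1001)]

# Draw n = 100 names without replacement; every student is equally likely.
sample = rng.sample(students, 100)

inclusion_probability = len(sample) / len(students)  # n / N
print(f"Chance of selection: {inclusion_probability:.0%}")
```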
and read across or down until you come to a digit between 1 and 4. This is
your random starting point. Say your random starting point is "3". This
means you select shop 3 as your first shop, and then every fourth shop down
the list (3, 7, 11, 15, 19, etc.) until you have 25 shops selected.
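The selection rule just described can be sketched as follows. The list of 100 numbered shops is an assumption inferred from the sampling interval of 4 and the target of 25 shops; the starting point 3 is the one used in the text.

```python
# Assumed sampling frame: shops numbered 1 to 100.
shops = list(range(1, 101))

sample_size = 25
interval = len(shops) // sample_size   # every 4th shop
start = 3                              # random starting point from the text

# Take the shop at position `start`, then every `interval`-th shop after it.
selected = shops[start - 1::interval]
print(selected[:5], len(selected))
```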
Cluster Sampling
Cluster sampling is a sampling technique where the entire population is
divided into a number of heterogeneous groups, or clusters, and a random
sample of these clusters is selected. All observations in the selected clusters
are included in the sample.
Contrary to simple random sampling and stratified sampling, where single
subjects are selected from the population, in cluster sampling the subjects
are selected in groups or clusters. Suppose you have a population that is
dispersed across a wide geographic region. This method allows you to
divide this population into clusters (usually counties, census tracts, or other
boundaries) and then randomly sample everyone in those clusters. This
approach allows you to overcome the constraints of cost and time associated
with a very dispersed population. Cluster sampling views the units in a
population as not only being members of the total population but also as
members of naturally occurring clusters within the population. For
example, city residents are also residents of neighborhoods, blocks, and
housing structures.
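A minimal sketch of one-stage cluster sampling, assuming a hypothetical city of 20 neighbourhoods with 50 residents each (all names and sizes are invented): clusters are chosen at random, and every resident of a chosen cluster enters the sample.

```python
import random

rng = random.Random(7)  # fixed seed, for reproducibility only

# Hypothetical population grouped into naturally occurring clusters.
clusters = {
    f"neighbourhood_{c}": [f"resident_{c}_{i}" for i in range(50)]
    for c in range(20)
}

# Step 1: randomly select whole clusters, not individuals.
chosen = rng.sample(sorted(clusters), 4)

# Step 2: include all observations from the selected clusters.
sample = [person for name in chosen for person in clusters[name]]
print(chosen, len(sample))
```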
Standard Error
The standard deviation of the sampling distribution tells us something about
how different samples would be distributed. It is referred to as the standard
error.
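The idea can be illustrated by simulation. The sketch below draws many samples (with replacement) from a hypothetical population, computes the standard deviation of the resulting sample means, and compares it with the familiar value σ/√n. The population, sample size, and seed are all assumptions.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed, for reproducibility only

# Hypothetical population of 10,000 values.
population = [rng.gauss(50, 10) for _ in range(10_000)]
sigma = statistics.pstdev(population)

n = 25
# Sampling distribution of the mean: one mean per repeated sample.
sample_means = [
    statistics.fmean(rng.choices(population, k=n)) for _ in range(2000)
]

empirical_se = statistics.stdev(sample_means)  # SD of the sampling distribution
theoretical_se = sigma / n ** 0.5              # sigma / sqrt(n)
print(round(empirical_se, 3), round(theoretical_se, 3))
```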
Sampling Error
For a given estimator, the difference between an estimate based on a sample
and the population parameter.
Non-Sampling Error
For a given estimator, the difference between the estimate that would result
if the sample were to include the entire population and the true population
value being estimated.
Coverage Error
Error due to omissions, erroneous inclusions, and duplications of units in
the frame used to conduct the survey; also, for household surveys, any
omissions or duplicates within the households.
Non-Response Error
Error caused by survey failure to get a response to one or possibly all of the
questions. Indirect measures include the detailed disposition rates
(unweighted and weighted) of all the selected sample cases during data
collection. Direct measures may require non-response follow-up.
Measurement Error
Error when the response received differs from the "true" value due to the
respondent, the interviewer, the questionnaire, the mode of collection, or the
respondent's record-keeping system(s).
Processing Error
Estimation Error
For a given estimator, the difference between the value of the estimate and
the true population value being estimated. Includes both sampling and
non-sampling error.
Questions-Answers
Q.1 What is a representative sample?
Ans. A sample is "representative" if the distribution of the sample's
characteristics is the same, on average, as the distribution of the
characteristics of the population. The size of the sample and the
type of sample (random or purposeful) are key decisions if you
want to say that your sample represents the population.
(b) Save items that must be injured or destroyed in the
process of studying their characteristics.
(c) Samples are also studied because sampling is often the only
practical way to obtain information about a universe
because of its large size.
Q.3 Explain the difference between a sample and a census.
Ans. A census is a survey that attempts to include every element in the
population while a sample is a partial enumeration of the population.
generation
Some research is not interested in working out what proportion
of the population gives a particular response but rather in
obtaining an idea of the range of responses or ideas that
people have.
cost.
Q.30 What are different methods used for the allocation of sample
size?
Ans. Proportional allocation, optimum allocation and Neyman
allocation are used for sample size selection.
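Two of these rules can be sketched for a stratified design. Proportional allocation assigns n_h = n N_h / N, while Neyman allocation assigns n_h = n N_h σ_h / Σ N_h σ_h (optimum allocation reduces to this when the cost per unit is the same in every stratum). The stratum sizes and standard deviations below are hypothetical, and the results are left unrounded.

```python
def proportional_allocation(n, sizes):
    """Allocate a total sample n in proportion to the stratum sizes N_h."""
    total = sum(sizes)
    return [n * size / total for size in sizes]

def neyman_allocation(n, sizes, sds):
    """Allocate n in proportion to N_h * sigma_h for each stratum."""
    weights = [size * sd for size, sd in zip(sizes, sds)]
    total = sum(weights)
    return [n * w / total for w in weights]

# Hypothetical strata: sizes N_h and standard deviations sigma_h.
sizes = [500, 300, 200]
sds = [10, 20, 40]

prop = proportional_allocation(100, sizes)
neyman = neyman_allocation(100, sizes, sds)
print(prop, [round(v, 1) for v in neyman])
```

Neyman allocation shifts sample units toward the strata that are more variable, which is why the third stratum receives the largest share here even though it is the smallest.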
Q.34 Why might you use a cluster sample of the households in an area
rather than a simple random sample drawn from a directory
giving addresses of households?
Ans. A cluster sample of households might be used because directories
with the addresses of households are never 100% accurate. By the
time the directory information is obtained and published, people
have moved in and out of the area, which results in a population
different from that listed in the directory.
Q.43 Explain and criticize each part of the following statement: The
frequency distributions of family income, size of business,
and salaries of skilled employees all tend to be skewed to the
right.
Ans. Agree, since in all three cases it is quite logical to assume that
there would be extreme values at the upper end of the scale that
would tend to make the arithmetic mean larger than the median.
Q.49 List ten rules that are useful for making out a questionnaire.
Ans. (i) Use items that can be easily understood
(ii) Avoid ambiguous questions
(iii) Make sure that the questions asked can be accurately
answered
(iv) Avoid double questions
(v) Avoid direct embarrassing questions
(vi) Word the questions so that the answers can be easily
tabulated and easily classified
(vii) Avoid leading questions
(viii) List the questions in a logical sequence
(ix) Make the questionnaire short and attractive
(x) Place the research organization's name and address on
each questionnaire.
Partial Non-response:
A partial interview is when some but not all items have
responses. A partial interview is treated as a "unit response" when
a sufficiently accurate response is obtained for only some of the
data items required from a respondent and meets some minimum
threshold level. A partial interview is treated as a "unit
non-response" when this threshold is not met.
Unit Non-response:
It occurs when the sampled unit response does not meet a
minimum threshold and is classified as not having responded at
all; failure to make measurements or obtain observations on a
listing unit selected for inclusion in a sample.
Over coverage:
The extent to which a frame includes more elements than the
sampled population, including duplicate elements.
Under coverage:
The extent to which a frame includes fewer elements than the
sampled population.
Exercises
Exercise 7 (True/False)
Read the following statements carefully and indicate which statement is
"True" or "False":
the population.
12. To perform a complete enumeration, one would need to
examine every item in a population.
13. In everyday life, we see many examples of infinite populations
of physical objects.
14. Large samples are always a good idea because they decrease
the standard error.
15. The main problem of data collection through mail
questionnaire is bias.
Chapter 8
Statistical Inference
Statistical Inference
Estimator
An estimator is any quantity calculated from the sample data, which is used
to give information about an unknown quantity in the population. In other
words, any statistic that is used to estimate a population parameter is called
an estimator. For example, the sample mean is an estimator of the population
mean.
Estimate
An estimate is a specific value or range of values used for indication of the
value of an unknown quantity based on observed data. More formally, an
estimate is the particular value of an estimator that is obtained from a
particular sample of data and used to indicate the value of a parameter.
Estimation
Estimation is the process by which sample data are used to indicate the
value of an unknown quantity in a population.
Results of estimation can be expressed as a single value, known as a Point
Estimate, or a range of values, known as an Interval Estimate.
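The two kinds of result can be sketched side by side. The data below are hypothetical (daily cups of tea reported by 16 office workers), and the interval uses a normal approximation with z ≈ 1.96, which is an assumption of this sketch; for a sample this small a t-based interval would usually be preferred.

```python
import statistics
from statistics import NormalDist

# Hypothetical sample: daily cups of tea for 16 office workers.
sample = [3, 2, 4, 5, 3, 4, 2, 3, 4, 3, 5, 4, 3, 2, 4, 3]

# Point estimate: a single value for the population mean.
point_estimate = statistics.fmean(sample)

# Interval estimate: an approximate 95% confidence interval for the mean.
z = NormalDist().inv_cdf(0.975)
margin = z * statistics.stdev(sample) / len(sample) ** 0.5
interval_estimate = (point_estimate - margin, point_estimate + margin)

print(point_estimate, tuple(round(v, 2) for v in interval_estimate))
```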
Efficiency:
Efficiency refers to the size of the standard error of the statistic. If we compare
two statistics from a sample of the same size then the estimator with the smaller
standard error is said to be more efficient.
In particular, an estimator $\hat{\theta}_1$ is said to be a more efficient estimator than $\hat{\theta}_2$ if
$\operatorname{Var}(\hat{\theta}_1) < \operatorname{Var}(\hat{\theta}_2)$. The variance is calculated if the estimators are
Consistency:
An estimator is said to be a consistent estimator of a population parameter if,
as the sample size increases, it becomes almost certain that the value of the
statistic comes very close to the value of the population parameter.
In other words, an estimator $\hat{\theta}_n$ (where n is the sample size) is a consistent
estimator for the parameter $\theta$ if and only if, for all $\varepsilon > 0$, no matter how small,
we have
$\lim_{n \to \infty} P(|\hat{\theta}_n - \theta| < \varepsilon) = 1$.
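A small simulation can make this definition concrete. The sketch below takes the sample mean as the estimator and, for several sample sizes, estimates the probability that it falls within ε of the true mean; the normal population, ε = 0.5, number of trials, and seed are all assumptions.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed, for reproducibility only

TRUE_MEAN = 5.0
EPSILON = 0.5

def fraction_within(n, trials=500):
    """Estimate P(|sample mean - true mean| < epsilon) for sample size n."""
    hits = 0
    for _ in range(trials):
        draws = [rng.gauss(TRUE_MEAN, 4.0) for _ in range(n)]
        if abs(statistics.fmean(draws) - TRUE_MEAN) < EPSILON:
            hits += 1
    return hits / trials

# The fraction should rise toward 1 as n grows.
results = {n: fraction_within(n) for n in (10, 100, 400)}
print(results)
```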
Sufficiency:
An estimator is called a sufficient estimator if it makes so much use of the
sample information that no other estimator could extract from the sample
additional information about the population parameter being estimated.
Hypothesis
It is a supposition or assumption, which acts as a foundation or as a starting
point in an investigation, irrespective of its probable truth or falsity. For
example, the average body temperature of adults is 98.6°F, procedure A of
cultivation is better than procedure B, etc.
Statistical Hypothesis
A statistical hypothesis is a statement about parameter(s) of population(s).
For example, the average body temperature of adults is 98.6°F, more than 10%
of voters are in favour of a particular party, etc.
Testing of Hypothesis
Hypothesis testing begins with an assumption, called a hypothesis, that we
make about a population parameter. Then we collect sample data, produce
a sample statistic, and use this information to decide how likely it is that
our hypothesized population parameter is correct. The purpose of this type
of inference is to determine whether enough statistical evidence exists to
enable us to conclude that a belief or hypothesis about a parameter is
supported by the data.
Null Hypothesis
A hypothesis to be tested for possible rejection under the assumption that it
is true, is called the null hypothesis and is denoted by H0. For example, in a
clinical trial of a new drug, the null hypothesis might be that the new drug is
no better, on average, than the current drug. We would write
H0: there is no difference between the two drugs on average.
We give special consideration to the null hypothesis. This is due to the fact
that the null hypothesis relates to the statement being tested.
Alternative Hypothesis
The alternative hypothesis, denoted by H1, is to be considered as an
alternate to the null hypothesis. It is also known as the Research Hypothesis.
For the above example, we would write
H1: the two drugs have different effects, on average.
The alternative hypothesis might also be that the new drug is better, on
average, than the current drug. In this case we would write
H1: the new drug is better than the current drug, on average.
Simple Hypothesis
A simple hypothesis is a hypothesis which specifies the population
distribution completely.
For example,
1. $H_0: p = 0.5$, i.e., p is specified
2. $H_0: X \sim N(5, 20)$, i.e., $\mu$ and $\sigma^2$ are specified
Composite Hypothesis
A composite hypothesis is a hypothesis, which does not specify the
population distribution completely.
For example,
1. $H_1: p > 0.5$, i.e., p is not completely specified
2. $H_1: X \sim N(5, \sigma^2)$, i.e., $\sigma^2$ is not completely specified
Type-I Error
In a hypothesis test, a type-I error occurs when the null hypothesis is
rejected when it is in fact true; that is, H0 is wrongly rejected. The
probability of committing a type-I error is denoted by $\alpha$.
A type-I error is often considered to be more serious, and therefore more
important to avoid, than a type II error. The hypothesis test procedure is
therefore adjusted so that there is a guaranteed 'low' probability of rejecting
Type-II Error
In a hypothesis test, a type-II error occurs when the null hypothesis H0 is
not rejected when it is in fact false; that is, H0 is wrongly accepted. The probability
of committing a type-II error is denoted by $\beta$.
A type-II error would occur if it was concluded that the two drugs produced
the same effect, i.e. there is no difference between the two drugs on
average, when in fact they produced different ones. A type-II error is
frequently due to sample sizes being too small.
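The link between sample size and the type-II error can be sketched for one simple case. The function below assumes an upper-tailed z-test of H0: μ = μ0 against H1: μ > μ0 with known σ, and returns β when the true mean is μ1; the function name and all numerical values are hypothetical.

```python
from statistics import NormalDist

def type_ii_error(mu0, mu1, sigma, n, alpha=0.05):
    """Beta for an upper-tailed z-test of H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)        # rejection threshold for z
    shift = (mu1 - mu0) * n ** 0.5 / sigma        # true mean in standard-error units
    # H0 is (wrongly) not rejected when the test statistic stays below z_crit.
    return std_normal.cdf(z_crit - shift)

beta_small_n = type_ii_error(mu0=50, mu1=51, sigma=4, n=10)
beta_large_n = type_ii_error(mu0=50, mu1=51, sigma=4, n=100)
print(round(beta_small_n, 3), round(beta_large_n, 3))
```

With everything else held fixed, the larger sample gives the smaller β, in line with the remark that type-II errors are frequently due to small samples.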
Significance Level
The significance level of a statistical hypothesis test is a fixed probability of
wrongly rejecting the null hypothesis Ho, if it is in fact true. It is the
probability of a type I error and is set by the investigator in relation to the
consequences of such an error. That is, we want to make the significance
level as small as possible in order to protect the null hypothesis and to
prevent, as far as possible, the investigator from inadvertently making false
claims. The significance level is usually denoted by $\alpha$.
Test Statistic
A test statistic is a quantity calculated from the sample of data. Its value is
used to decide whether or not the null hypothesis should be rejected in our
Critical Value
The critical value for a hypothesis test is a threshold to which the value of
the test statistic in a sample is compared to determine whether or not the
null hypothesis is rejected.
The critical value for any hypothesis test depends on the significance level
at which the test is carried out, and whether the test is one-sided or
two-sided (described below).
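For a test statistic that follows the standard normal distribution (an assumption of this sketch), the dependence of the critical value on the significance level and on the sidedness of the test can be shown directly. The function name and the `tail` labels are hypothetical.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Critical value(s) of a z-test at significance level alpha."""
    std_normal = NormalDist()
    if tail == "two-sided":
        c = std_normal.inv_cdf(1 - alpha / 2)   # alpha is split between both tails
        return (-c, c)
    if tail == "upper":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "lower":
        return (std_normal.inv_cdf(alpha),)
    raise ValueError("tail must be 'two-sided', 'upper' or 'lower'")

two_sided = critical_values(0.05, "two-sided")
upper = critical_values(0.05, "upper")
print([round(v, 3) for v in two_sided], round(upper[0], 3))
```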
Critical Region
The critical region (CR), or rejection region (RR), is a set of values of the
test statistic for which the null hypothesis is rejected in a hypothesis test.
That is, the sample space for the test statistic is partitioned into two regions;
one region (the critical region) will lead us to reject the null hypothesis H0,
the other will not. So, if the observed value of the test statistic is a member
of the critical region, we conclude "Reject H0"; if it is not a member of the
critical region then we conclude, "Do not reject H0".
Example:
Suppose we want to test a manufacturer's claim that there are, on average,
50 sticks in a match-box. We could set up the following hypotheses:
$H_0: \mu = 50$,
against
$H_1: \mu \neq 50$
The choice between a one-sided and a two-sided test is determined by the
purpose of the investigation or prior reasons for using a one-sided test.
P-Value
The probability value (p-value) of a statistical hypothesis test is the
probability of getting a value of the sample test statistic that is at least as
extreme as the one found from the sample data, assuming that the null
hypothesis is true.
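As a worked sketch, consider a two-sided z-test of the claim that a match-box holds 50 sticks on average. The sample figures below (36 boxes, sample mean 49.2, known standard deviation 2.4) are invented for illustration and do not come from the text.

```python
from statistics import NormalDist

# Hypothetical sample summary for H0: mu = 50 against H1: mu != 50.
mu0 = 50.0
sample_mean = 49.2
sigma = 2.4      # assumed known population standard deviation
n = 36

# Test statistic: distance of the sample mean from mu0 in standard errors.
z = (sample_mean - mu0) / (sigma / n ** 0.5)

# Two-sided p-value: probability of a statistic at least this extreme under H0.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 2), round(p_value, 4))
```

At a 5% significance level this hypothetical sample would lead to rejecting H0, since the p-value falls just below 0.05.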
Questions-Answers
hypotheses such as arise for testing from the statistical viewpoint.
Statistical hypotheses concern the behavior of observable random
variables. In other words, statistical hypotheses are testable claims
or assertions about one or more parameters of empirical
distributions.
even with very wide deviations from normality. Finally, if the mean
and standard deviation of a normal distribution are known, it is
easy to convert back and forth from raw scores to percentiles.
Q.21 Discuss the role of Chi-square distribution in testing of
hypothesis and confidence interval estimation.
Ans. Let $Z \sim N(0, 1)$ be a standard normal variable. If n random values
Z1, Z2, ..., Zn are drawn from this distribution, squared, and
summed, the resultant statistic is said to have a $\chi^2$ distribution with
n degrees of freedom. It is a right-skewed distribution ranging from 0
to $\infty$. The Chi-square distribution has a wide range of applications. In
testing of hypotheses it is used to test a variance and the equality
of variances. Moreover, it is used to test goodness of fit and
also for testing association of attributes. With reference to
confidence intervals, it is used to construct the confidence
interval for a population variance.
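The construction described in this answer can be imitated by simulation. The sketch below sums n = 5 squared standard normal values many times and checks the result against two well-known properties of the Chi-square distribution: mean n and variance 2n. The seed and the number of repetitions are arbitrary choices.

```python
import random
import statistics

rng = random.Random(11)  # fixed seed, for reproducibility only

n = 5  # degrees of freedom

# Each draw is Z1^2 + Z2^2 + ... + Zn^2 with Zi standard normal.
draws = [
    sum(rng.gauss(0, 1) ** 2 for _ in range(n)) for _ in range(20_000)
]

mean_draw = statistics.fmean(draws)    # should be close to n
var_draw = statistics.variance(draws)  # should be close to 2n
print(round(mean_draw, 2), round(var_draw, 2))
```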
It is symmetrical about its mean point, zero, and tends
asymptotically to the standard normal distribution. The standard
normal distribution is replaced by the t-distribution when the population
variance is estimated from sample data and the central limit theorem
is not applicable. The t-distribution may be used for testing means,
equality of two means, proportions and differences of two proportions.
All this is done when the population variance is
unknown.
The distribution is also applicable to construct the confidence
interval for:
Exercises
Exercise 8 (True/False)
Read the following statements carefully and indicate which statement is
"True" or "False":
22. In an upper tailed test, large values of the test statistic result in
rejecting the null hypothesis.
23. For a prescribed decision rule, a larger sample size will result in
a larger type-I error probability.
24. A contingency table can have only two rows and two columns.
25. The Chi-square test can be used to determine if there is a
significant difference between two sample percentages.
26. In a Chi-square analysis the number of rows in the contingency
table should equal the number of columns.
Chapter 9
Design and
Analysis of Experiments
Experiment
An experiment is any process or study, which results in the collection of
data, the outcome of which is unknown. In statistics, the term is usually
restricted to situations in which the researcher has control over some of the
Experimental Unit
An experimental unit is the basic object upon which the study or experiment is
carried out: the entity to which a specific treatment combination is applied.
An experimental unit can be an individual agricultural plant, a plot of land, a PC
board, a silicon wafer, a tray of components simultaneously treated, an automotive
transmission, etc.
Randomization
Randomization is a schedule for allocating treatment material and for
conducting treatment combinations in a design of experiments (DOE) such that
the conditions in one run neither depend on the conditions of the previous run
nor predict the conditions in the subsequent runs. The importance of
randomization cannot be overstressed. Randomization is necessary for
conclusions drawn from the experiment to be correct, unambiguous and
defensible. Randomization is preferred since alternatives may lead to biased
results. The main point is that randomization tends to produce groups for
study that are comparable in unknown as well as known factors likely to
influence the outcome, apart from the actual treatment under study. The
analysis of variance F tests assume that treatments have been applied randomly.
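As an illustration only, the random allocation of treatments to experimental units described above can be sketched in Python. The function name `randomize_treatments`, the plot labels and the treatment labels A, B, C are hypothetical and are not taken from the text; the sketch assumes equal replication of every treatment.

```python
import random


def randomize_treatments(units, treatments, seed=None):
    """Assign treatments to units at random with equal replication.

    Every treatment is used the same number of times, and the order of
    assignment is shuffled so that no unit's treatment depends on its
    position in the list.
    """
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    replications = len(units) // len(treatments)
    labels = list(treatments) * replications
    rng = random.Random(seed)  # a seed makes the layout reproducible
    rng.shuffle(labels)
    return dict(zip(units, labels))


# Hypothetical example: 12 plots of land and 3 treatments.
plots = [f"plot{i}" for i in range(1, 13)]
layout = randomize_treatments(plots, ["A", "B", "C"], seed=1)
for plot, treatment in layout.items():
    print(plot, treatment)
```

Fixing the seed is a convenience for record keeping; a real experiment would normally document the randomization actually used.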
Replication
The repetition of an experiment on a large group of subjects is known as
replication. If a treatment is truly effective, the long-term averaging effect
of replication will reflect its experimental worth. If it is not effective, then
the response of the few members of the experimental population who may have
reacted to the treatment will be negated by the large number of subjects who
were unaffected by it. Replication reduces variability in experimental results,
increasing their significance and the confidence level with which a
researcher can draw conclusions about an experimental factor.
Local Control
Local control refers to grouping of the experimental units in such a way that
the units within a group (i.e., block) are more homogeneous than are units
in different groups. The experimental materials or conditions are more alike
within a group. Thus, the variation among experimental units within a group
is less than the variation would have been without grouping. This leads to
the comparison of treatment effects under more uniform conditions or on
more uniform materials. For example, the total variation in a Randomized
Complete Block Design (RCBD) is partitioned into variation due to two
assignable causes, blocks and treatments, and variation due to a
non-assignable cause or experimental error. This latter source of variation is
reduced as the variation due to blocks is removed.
Experimental error = Total variation - Treatment variation - Block variation.
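The partition of variation stated above can be sketched numerically. The following Python fragment is a minimal illustration, assuming one observation per treatment in each block; the function name `rcbd_sums_of_squares` and the yield figures are hypothetical and do not come from the text.

```python
def rcbd_sums_of_squares(table):
    """Partition the total sum of squares for an RCBD.

    table[i][j] is the response for treatment i in block j
    (one observation per treatment-block combination).
    """
    t = len(table)       # number of treatments
    b = len(table[0])    # number of blocks
    n = t * b
    grand_mean = sum(sum(row) for row in table) / n

    total_ss = sum((y - grand_mean) ** 2 for row in table for y in row)
    treatment_ss = b * sum((sum(row) / b - grand_mean) ** 2 for row in table)
    block_means = [sum(table[i][j] for i in range(t)) / t for j in range(b)]
    block_ss = t * sum((m - grand_mean) ** 2 for m in block_means)

    # Experimental error = Total - Treatment - Block
    error_ss = total_ss - treatment_ss - block_ss
    return {"total": total_ss, "treatment": treatment_ss,
            "block": block_ss, "error": error_ss}


# Hypothetical yields: 3 treatments (rows) in 4 blocks (columns).
yields = [
    [20, 24, 27, 30],
    [22, 25, 30, 31],
    [26, 29, 33, 37],
]
print(rcbd_sums_of_squares(yields))
```

The larger the block sum of squares, the more variation is removed from the experimental error, which is the purpose of local control.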
treatments.
Factor of an Experiment
A factor of an experiment is a controlled independent variable; a variable
whose levels are set by the experimenter. A factor is a general type or
category of treatments. Different treatments constitute different levels of a
factor. For example, three different groups of runners are subjected to
different training methods. The runners are the experimental units and the
training methods are the treatments, where the three types of training methods
constitute three levels of the factor 'type of training'.
Questions-Answers
Ans. Latin square (and related) designs are efficient designs to block
from 2 to 4 nuisance factors.
Ans. The random effect model does not provide knowledge of the
treatment effect at a particular level. It enables us to study the
variability due to the effect of treatment. Therefore the random
effect model is sometimes called the components of variance model. A
random effect is an effect associated with input variables chosen at
random from a population having a large or infinite number of possible values.
(DMRT), Tukey's test, Scheffé's test, orthogonal contrasts, trend
comparisons, etc.
Exercises
Exercise 9 (True/False)
Read the following statements carefully and indicate which statement is
"True" or "False":
Chapter 10
Time Series
1. Secular Trend
2. Seasonal Variations
3. Cyclical Fluctuations
4. Irregular Variations
Secular Trend
The basic long-term movement in a time series that can be described by a
smooth line.
Seasonal Variations
The seasonal variations are short-term movements occurring in a periodic
manner. These variations are mainly caused by the changes in the season.
These variations involve patterns of change within a year that tend to be
Cyclical Fluctuations
Cyclical fluctuation is a wavelike pattern about the long-term trend that
is generally apparent over a number of years, resulting in a cyclical effect.
A cycle is said to be completed when, beginning with a peak, the declining
curve reaches a low point and then rises again to reach the next peak. By
definition, it has a duration of more than one year. The business cycle is the
most common example of such variation.
Irregular Variations
Non-periodic or random fluctuations that are due to non-recurring
events such as strikes, wars, elections, deaths, weather changes,
etc.
Questions-Answers
Q.3 Why is the term "Series" applied in time series analysis?
Ans. We use the term time series to refer to any group of statistical
information accumulated at regular intervals.
Q.5 How can technological change affect the trend of a time series?
Ans. Technological change can cause upward or downward movement
in the time series. For example, the development of the tractor
caused a downward trend in the number of mules, oxen and
camels on farms. However, the development of the tractor
helped produce an upward trend in the sales of petrol.
series: T × C × S × I.
Q.10 Give some rules for constructing time series line charts.
Ans. Some good rules are:
Make the chart wider than it is high.
Place time on the horizontal axis and a scale for the values on the vertical axis; show only a few scale values.
Set the scale so that the line (or lines) will appear near the center of the chart.
Start the scale with zero unless indexes are being charted.
Make equal distances on the vertical scale represent equal absolute amounts.
Plot points at the middle of the period for cumulative data and at the point of time for non-cumulative data.
Connect plotted points with straight lines.
Q.11 Why should the numerical scale of a time series line chart begin
with zero? Are there any exceptions?
Ans. The scale should begin with zero because the base of
reference is zero. An exception is the scale for an index, which has
a base of 100. Another exception is when a logarithmic scale
(ratio scale) is used.
Q.15 In what direction does a secular trend move?
Ans. Secular trends can move up, down, or in both directions, i.e. the
value of the variable tends to increase or decrease over a long
period of time.
Q.16 What is the most widely used trend line? What equation is
generally used to describe a trend with one bend?
Ans. The least squares line is the most commonly used trend line. The
trend equation
$Y_t = a + bx + cx^2$
is used more than any other equation to describe a trend with one
bend.
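As a rough illustration of the least squares line mentioned in this answer, the sketch below fits a straight-line trend Y = a + bx in plain Python. The function name `fit_trend_line`, the coding of time as x = 0, 1, 2, ... and the sales figures are assumptions made for the example, not material from the text.

```python
def fit_trend_line(values):
    """Fit Y = a + b*x by least squares, with time coded x = 0, 1, 2, ..."""
    n = len(values)
    xs = list(range(n))
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx            # slope: average change in Y per time period
    a = y_mean - b * x_mean  # trend value at the origin (x = 0)
    return a, b


# Hypothetical annual sales figures.
sales = [12, 15, 19, 22, 24, 29]
a, b = fit_trend_line(sales)
print(f"trend: Y = {a:.2f} + {b:.2f} x")
```

A trend with one bend would need the additional term in x squared; the same least squares principle applies, but three normal equations must then be solved.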
Q.17 Explain the meaning of the coefficients of the equation
$Y_t = a + bx + cx^2$.
Ans. The coefficient 'a' is the value of the trend at its origin, 'b' is the
general basic slope of the line and 'c' is the typical change in the
basic slope per unit time.
Q.18 What is the purpose of measuring the cycles of a time series?
Ans. Cycles of a time series are usually measured so that they can be
studied and analyzed in order to find a way to forecast the turning
points of future cycles.
Q.19 Why are the turning points of business cycles difficult to
predict?
Ans. Turning points of business cycles are difficult to predict
because the turning points of past cycles have not occurred at regular
intervals of time.
Q.13 How are the cycles isolated from the other components of a time
series?
Ans. Cycles are isolated so that they must be analyzed and studied
Q.25 Why would it be difficult to predict the point in time when a
business cycle will reach a peak or trough?
Ans. The timing of cyclical peaks and troughs is difficult to predict
because of the irregularity in the timing of past business cycles.
Q.26 Show symbolically how the influence of seasonal variations may
be removed from a monthly time series.
Ans. $\dfrac{T \times C \times S \times I}{S} = T \times C \times I$
Q.34 What are the two basic purposes for computing a seasonal
index?
Ans. Seasonal indices are computed in order to use them (i) to make
short-range forecasts and (ii) to eliminate the effect of seasonal
variations from the original data.
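The second purpose can be sketched in a few lines of Python. This is only an illustration under the multiplicative model, assuming seasonal indices expressed on a base of 100; the function name `deseasonalize`, the quarterly sales and the index values are hypothetical.

```python
def deseasonalize(values, seasonal_indices):
    """Divide each observation by its seasonal index (base 100).

    values           : observations in time order
    seasonal_indices : one index per season, repeated cyclically
    """
    period = len(seasonal_indices)
    return [value / (seasonal_indices[i % period] / 100.0)
            for i, value in enumerate(values)]


# Hypothetical quarterly sales over two years and quarterly seasonal indices.
sales = [90, 110, 130, 70, 99, 121, 143, 77]
indices = [90, 110, 130, 70]
print([round(v, 1) for v in deseasonalize(sales, indices)])
```

What remains after the division is the combined effect of trend, cyclical and irregular components.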
Q.37 If a storm prevents people from shopping for two days, what
type of fluctuation in the sales would this cause: trend, seasonal,
cyclical, or irregular? Why?
Ans. "Irregular", because storms are irregular in the timing of their
occurrence.
long-term fluctuations in the series stand out more clearly.
Exercises
Exercise 10 (MCQs')
Q.1 Data concerning events over a period of time is called a:
(a) Time Series
(b) Moving Average
(c) Frequency Distribution
(d) Random Sample
Week 1 2 3 4 5
Taking (Rs.) 98 112 161 109 101
Q.8 The following table shows (in thousands) the number of units of
electricity used by a firm over a period of two years.
Quarter                  1    2    3    4    1    2    3    4
No. of units (1000's)    85   49   25   87   89   53   29   86
Q.9 Suppose you were considering a time series of data for the
quarters of 1992 and 1993. The third quarter of 1993 would be
coded as:
(a) 2
(b) 3
(c) 5
(d) 6
Q.10 Assume that you have been given quarterly sales data for a
five-year period. To use the ratio-to-moving-average method of
computing a seasonal index, your first step will be:
(a) Compute the four-quarter moving average
Chapter 11
Index Numbers
Index
A numerical scale used to compare variables with one another or with some
reference number. In other words, a number or ratio (a value on a scale of
measurement) derived from a series of observed facts.
Index Number
An index number measures how much a variable changes over time or
space. It is a statistical measure that gives the average change in a variable
or group of variables with respect to time or space.
We calculate an index number by finding the ratio of the current value to a
base value; we then multiply the resulting number by 100 to express the
index as a percentage. This final value is the percentage relative. Note that
the index number for the base point in time is always one hundred.
Generally,
$\text{Index Number} = \dfrac{\text{Current Value}}{\text{Base Value}} \times 100.$
Base Period
The period of time whose data are used as the base of an index number or
other ratio. In other words, it is the period from which the changes are
measured.
Price Index
A price index is the one most frequently used. It compares levels of prices
from one period to another.
For example, if the retail price of sugar is Rs. 18 in 1998 and Rs. 30 in 2005
(from the following Table), then the index number for 2005 on the base price
in 1998 will be
$P_{0,n} = P_{1998,2005} = \dfrac{P_n}{P_0} \times 100 = \dfrac{P_{2005}}{P_{1998}} \times 100 = \dfrac{30}{18} \times 100 = 166.67.$
It means that if the price of sugar per Kg. was Rs. 100 in 1998 then it becomes
Rs. 166.67 in 2005; in other words, the price shows a 66.67% increase in 2005
when compared with that in 1998.
The quantity $P_n / P_0$ is also called the Price Relative.
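The arithmetic of the sugar example can be checked with a short Python sketch. The function name `index_number` is chosen here for illustration; only the prices Rs. 18 (1998) and Rs. 30 (2005) come from the text.

```python
def index_number(current_value, base_value):
    """Return the index number (base = 100) of current_value on base_value."""
    if base_value == 0:
        raise ValueError("base value must be non-zero")
    return current_value / base_value * 100


sugar_index = index_number(30, 18)   # price in 2005 on the 1998 base
print(round(sugar_index, 2))         # 166.67
print(round(sugar_index - 100, 2))   # percentage increase: 66.67
```

The index for the base period itself, `index_number(18, 18)`, is 100, as noted in the definition of an index number.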
Quantity Index
A quantity index measures how much the number or quantity of a variable
changes over time.
From the following Table, the quantity index for year 2005 using year 1998
as base:
$q_{0,n} = q_{1998,2005} = \dfrac{q_n}{q_0} \times 100 = \dfrac{q_{2005}}{q_{1998}} \times 100 = \dfrac{25}{15} \times 100 = 166.67.$
Value Index
The value index measures the changes in total monetary worth. In fact, the
value index combines price and quantity changes to present a more
informative index.
From the following Table, showing the current and base year prices and the
quantities consumed (in '000 Kg.), the value index for year 2005 using year
1998 as base:
$V_{0,n} = V_{1998,2005} = \dfrac{V_n}{V_0} \times 100 = \dfrac{V_{2005}}{V_{1998}} \times 100 = \dfrac{750}{270} \times 100 = 277.78.$
Table: Prices (Rs./Kg.), Quantities (consumed in '000 Kg.) and
Values for Sugar
(period '0' denotes 1998 while 'n' denotes 2005)
$P_{0,n} = \dfrac{\sum P_n}{\sum P_0} \times 100.$
$P_{0,n} = \dfrac{1}{M} \sum \left( \dfrac{P_n}{P_0} \right) \times 100,$
where M is the number of commodities under study.
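The two formulas above, commonly known as the simple aggregative price index and the simple average of price relatives, can be compared on a small example. The Python sketch below is illustrative only; the function names and the prices of the three commodities are hypothetical.

```python
def simple_aggregate_index(base_prices, current_prices):
    """Sum of current prices over sum of base prices, times 100."""
    return sum(current_prices) / sum(base_prices) * 100


def average_of_relatives_index(base_prices, current_prices):
    """Arithmetic mean of the price relatives Pn/P0, times 100."""
    m = len(base_prices)  # number of commodities under study
    relatives = [pn / p0 for p0, pn in zip(base_prices, current_prices)]
    return sum(relatives) / m * 100


# Hypothetical prices of three commodities in the base and current periods.
p0 = [10, 20, 50]
pn = [15, 30, 50]
print(round(simple_aggregate_index(p0, pn), 2))      # 118.75
print(round(average_of_relatives_index(p0, pn), 2))  # 133.33
```

The two results differ because the aggregative form is dominated by the highest-priced commodity, whereas the average of relatives gives each commodity equal importance.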
Circular Test
This test, often called transitivity, is a multi-period test (essentially a test of
chaining). It requires that the product of a price index obtained by going
Commensurability
This test requires that if the units of measurement of the items are changed
(for example, from Kgs. to Tonnes), then the price index will not change.
In other words, if the factors 'p' and 'q' in an index formula are
interchanged (or reversed) so that a quantity (or price) index formula is
obtained, then the product of the two index numbers should equal the value
index number.
That is,
(Price Index) x (Quantity Index) = Value Index.
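For a single commodity the identity can be verified directly. The Python sketch below uses the sugar prices (Rs. 18 and Rs. 30) and values (270 and 750) given earlier in the chapter; the quantities are not read from the Table but are taken as value divided by price, which is an assumption of this example, as are the variable names.

```python
# Prices (Rs./Kg.) and values for sugar in 1998 (period 0) and 2005 (period n).
p0, pn = 18.0, 30.0
v0, vn = 270.0, 750.0

# Quantities implied by value = price x quantity (assumption of this sketch).
q0, qn = v0 / p0, vn / pn

price_index = pn / p0 * 100
quantity_index = qn / q0 * 100
value_index = vn / v0 * 100

# With indices expressed on a base of 100, the product must be divided by 100.
product = price_index * quantity_index / 100

print(round(price_index, 2), round(quantity_index, 2))
print(round(product, 2), round(value_index, 2))  # both 277.78
```

The division by 100 is needed only because each index is expressed as a percentage; as plain ratios the product of the price and quantity relatives equals the value relative exactly.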
Questions-Answers
Q.6 What is the most appropriate time duration of the base period?
Ans. This period is frequently one year but it may be as short as one day
or as long as the average of a group of years.
Exercises
(d) CPI
Q.8 To measure how much the cost of some variable changes over
time, we would use:
(a) Inflation index
(b) Quantity index
(c) Value index
(d) None of these
Q.9 When the base year values are used as weights, the weighted
average of relative price index is the same as:
(a) The Paasche's index
(b) The Laspeyres' index
(c) The unweighted average of relative price index
(d) None of these
several locations.
13. The simplest form of a composite index is a weighted aggregate
index.
14. An index number is always found by taking the ratio of current
value to a base value and multiplying by 100.
15. The simple average of relatives method divides the weighted
sum by the sum of weights.
Chapter 12
Nonparametric Statistics
Nonparametric Tests
Parametric tests require assumptions about the nature or shape of the
populations involved; nonparametric tests do not require such assumptions.
Consequently, nonparametric tests of hypotheses are often called
Distribution Free Tests.
Some of these tests are the Sign test, Wilcoxon Signed-Rank test, Runs test,
Mann-Whitney U test, Kruskal-Wallis test, etc.
Nonparametric tests may be, and often are, more powerful in detecting
population differences when certain assumptions are not satisfied.
Sign Test
The sign test is designed to test a hypothesis about the location of a
population distribution. It is most often used to test the hypothesis about a
population median, and often involves the use of matched pairs, for
example, before and after data, in which case it tests for a median difference
of zero.
We can use a sign test:
(i) To test claims about the median of the paired differences for two
dependent samples
(ii) To test claims about certain types of nominal data
(iii) To test a claim made about the median of a single population.
The Sign test does not require the assumption that the population is
normally distributed. In many applications, this test is used in place of the
one sample t-test when the normality assumption is questionable. This test
can also be applied when the observations in a sample of data are ranks, that
is, ordinal data rather than direct measurements.
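A minimal sketch of the sign test for paired differences is given below in Python. It assumes the usual procedure in which zero differences (ties) are discarded and the number of plus signs is referred to a binomial distribution with p = 0.5; the function name `sign_test_p_value` and the before/after differences are hypothetical.

```python
from math import comb


def sign_test_p_value(differences):
    """Two-sided sign test p-value for a median difference of zero.

    Zero differences are discarded; under the null hypothesis the number
    of positive signs is Binomial(n, 0.5).
    """
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    if n == 0:
        raise ValueError("all differences are zero")
    plus = sum(1 for d in nonzero if d > 0)
    k = min(plus, n - plus)  # the less frequent sign
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical before-minus-after differences for seven subjects.
diffs = [1, 2, 3, 4, 5, -1, 0]
print(sign_test_p_value(diffs))  # 0.21875
```

Only the signs of the differences are used, not their sizes, which is why the test needs no assumption of normality.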
Mann-Whitney U Test
The Mann-Whitney U test is one of the most powerful nonparametric tests
for comparing two populations. It is used to compare the medians
or means of two populations using independent samples.
The Mann-Whitney test does not require the assumption that the two
populations are normally distributed. In many applications, the
Mann-Whitney test is used in place of the two sample t-test when the
normality assumption is questionable.
This test can also be applied when the observations in a sample of data are
ranks, that is, ordinal data rather than direct measurements. This test is also
known as the Wilcoxon Rank-Sum test.
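The U statistic itself can be computed by direct pairwise comparison, as in the Python sketch below. This is one standard way of obtaining U (equivalent to the rank-sum calculation); the function name `mann_whitney_u` and the two samples are hypothetical, and the sketch stops at the statistic without looking up a critical value.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return the Mann-Whitney U statistic (the smaller of U_a and U_b).

    U_a counts the pairs (a, b) in which a exceeds b; tied pairs count 1/2.
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return min(u_a, u_b)


# Hypothetical scores from two independent samples.
group_1 = [1, 4, 5]
group_2 = [2, 3, 6]
print(mann_whitney_u(group_1, group_2))  # 4.0
```

A small value of U indicates that one sample tends to lie below the other; the decision is made by comparing U with tabulated critical values or a normal approximation.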
Kruskal-Wallis Test
Runs Test
In studies where measurements are made according to some well defined
ordering, either in time or space, a frequent question is whether or not the
average value of the measurement is different at different points in the
sequence. The runs test provides a means of testing this. In other words, it is
a procedure for testing the randomness of data.
Run:
A run is a sequence of data that exhibits the same characteristic; the
sequence is preceded and followed by different data or no data at all. In
other words, it is a maximal sequence of similar elements.
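The definition of a run translates directly into a short counting routine. The Python sketch below is illustrative; the function name `count_runs` and the sequence of symbols are hypothetical.

```python
from itertools import groupby


def count_runs(sequence):
    """Count the maximal blocks of identical consecutive elements."""
    return sum(1 for _key, _group in groupby(sequence))


# Hypothetical sequence of outcomes: AA | BBB | A | BB gives four runs.
print(count_runs("AABBBABB"))  # 4
```

The runs test then asks whether the observed number of runs is unusually large or unusually small for a random sequence.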
Kolmogorov-Smirnov Test
For a single sample of data, the Kolmogorov-Smirnov test is used to test
whether or not the sample of data is consistent with a specified distribution
function. When there are two samples of data, it is used to test whether or
not these two samples may reasonably be assumed to come from the same
distribution. The Kolmogorov-Smirnov test is therefore another measure of
the goodness of fit of the theoretical frequency distribution, as was the
Chi-square test. The Kolmogorov-Smirnov test does not require the assumption
that the population is normally distributed.
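For the single-sample case, the test statistic is the largest vertical distance between the empirical distribution function of the sample and the specified distribution function. The Python sketch below computes that distance only; the function name `ks_statistic`, the uniform distribution on [0, 1] and the sample are assumptions made for illustration, and the specified distribution is assumed to be continuous.

```python
def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic D.

    sample : observed values
    cdf    : the specified (continuous) distribution function F(x)
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical distribution jumps from (i-1)/n to i/n at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def uniform_cdf(x):
    """Distribution function of the uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, x))


print(ks_statistic([0.25, 0.5, 0.75], uniform_cdf))  # 0.25
```

The computed D is compared with a tabulated critical value for the sample size; a large D suggests that the sample is not consistent with the specified distribution.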
Questions-Answers
Q.3 What is the nonparametric alternative of the two sample t-test for
independent samples?
Ans. Usually, when we have two samples that we want to compare
concerning their mean value for some variable of interest, we
would use the t-test for independent samples; the nonparametric
alternative to this test is the Mann-Whitney U test.
Q.6 What are the nonparametric alternatives of the two sample t-test
for dependent samples?
Ans. If we want to compare two variables measured in the same sample
we would customarily use the matched pair t-test (or t-test for
dependent samples). Nonparametric alternatives to this test are the
Sign test and Wilcoxon's matched pairs test.
Q.9 What can be determined with the Mann-Whitney test? What kind
of data are used for this test?
Ans. The t-test and z-test are useful for testing whether two samples
have been drawn from populations that are assumed to be
normally distributed and which have equal means and variances.
However, the Mann-Whitney test enables us to make a significance
test without these assumptions, although the data for the Mann-Whitney
test are assumed to be continuous and must be capable of being
ranked. To make this test the data for all samples must be
pooled and ranked in order of magnitude.
Q.11 What is the null hypothesis when we use the Spearman Rank
correlation coefficient test?
Ans. The null hypothesis is that there is no significant correlation
between the two rankings.
Q.14 What are the major nonparametric tests used to test the
randomness in the data?
Ans. The Runs test for randomness and the Kendall tau test are the major
tests used to test the randomness in the data.
Exercises
Q.4 When using the sign test, if two scores are tied, then we:
(a) Count them
(b) Discard them
(c) Depends upon the scores
(d) None of these
randomness when:
(a) There is an unusually large number of runs
(b) There is an unusually small number of runs
(c) Either of the above
(d) None of the above
Q.12 Which of the following tests must be two-sided?
(a) Kruskal-Wallis
(b) Wilcoxon Signed rank
(c) Runs test
(d) Sign test
Q.16 To perform a runs test for randomness the data must be:
(a) Qualitative
(b) Quantitative
(c) Divided into at least two classifications
(d) Divided into exactly two classifications
Q.18 Three brands of coffee are rated for taste on a scale of 1 to 10. Six
persons are asked to rate each brand so that there is a total of 18
observations. The appropriate test to determine if three brands taste
equally good is:
(a) One way analysis of variance
(b) Wilcoxon rank-sum test
(c) Spearman rank difference
(d) Kruskal-Wallis test
Q.21 Which of the following tests is most likely assessing this null
hypothesis: H0: The number of violations per apartment in the
population of all city apartments is binomially distributed with a
probability of success in any one trial of p = 0.3:
(a) The Kolmogorov-Smirnov test
(b) The Kruskal-Wallis test
(c) The Mann-Whitney test
(d) The Wilcoxon signed-rank test
8. The Wilcoxon signed rank test may be used whenever the sign
test is applicable.
9. The binomial distribution can be used to provide probabilities
for outcomes when the sign test is used.
10. One of the main advantages of the nonparametric tests is that the
underlying assumptions are often less restrictive than those of
parametric tests.
11. Parametric tests are easier to compute and therefore more
desirable than nonparametric tests.
Answers to Exercises
1. d 2. a 3. d 4. b 5. a
6. d 7. c 8. c 9. d 10. c
11. d 12. c 13. a 14. b 15. a
16. b 17. a 18. c 19. b 20. d
Exercise 3 (True/False)
1. False 2. True 3. False 4. True 5. True
6. False 7. False 8. True 9. True 10. False
Exercise 6 (MCQs')
1. a 2. c 3. c 4. b 5. b
6. a 7. a 8. c 9. d 10. c
11. b 12. c 13. b 14. c 15. d
16. b 17. b 18. d 19. b 20. b
Chapter 7: Sampling
Exercise 7 (True/False)
1. True 2. False 3. True 4. False 5. False
6. False 7. True 8. True 9. True 10. True
11. True 12. True 13. True 14. False 15. False
Exercise 8 (True/False)
1. False 2. False 3. True 4. False 5. True
6. False 7. True 8. True 9. True 10. False
11. True 12. False 13. True 14. True 15. True
16. False 17. False 18. False 19. True 20. False
21. True 22. True 23. False 24. False 25. False
26. False 27. True 28. False 29. True 30. True
Exercise 9 (True/False)
1. False 2. True 3. True 4. False 5. True
6. False 7. False 8. True 9. False 10. True
11. True 12. True 13. True 14. True 15. True
Exercise 10 (MCQs')
1. a 2. c 3. d 4. b 5. a
6. a 7. c 8. b 9. c 10. c
1. a 2. c 3. c 4. b 5. b
6. c 7. c 8. c 9. a 10. b
11. d 12. d 13. c 14. b 15. b
16. d 17. b 18. d 19. a 20. b
21. a 22. b 23. d
Bibliography
Subject Index
A
Addition Rule of Probability, 28
C
Class Boundaries, 5
Class Interval (Width), 5
Class Limits, 4
Class Mark (Midpoints), 5
D
Data, 1
Deciles, 8
Descriptive Statistics, 1
Design of Experiment (DOE), 143
Diagram, 6
Discrete Data, 2
Discrete Probability Distributions, 49
Discrete Random Variable, 41
Dispersion, 9
E
Efficiency, 128
Equally Likely Events, 24
Estimate, 127
Estimation, 127
Estimation Error, 112
Estimator, 127
Event, 24
Event Space, 24
Expected Value, 43
Experiment, 143
Experimental Unit, 143
Explained Variation, 93
Exponential Distribution, 65
F
F-Distribution, 77
G
Gamma Distribution, 70
Geometric Distribution, 52
Geometric Mean, 7
Graph, 5
Grouped Data, 4
H
Harmonic Mean, 7
Histogram, 5
Homogeneity, 145
Hypergeometric Distribution, 51
Hypothesis, 128
I
Independent Events, 25
Independent Random Variables, 44
Index, 167
Index Number, 167
Inferential Statistics, 1
Interval Scale, 3
Irregular Variations, 153
J
Judgement Sampling, 108
K
Kolmogorov-Smirnov Test, 184
Kruskal-Wallis Test, 183
L
Law of Total Probability, 28
Local Control, 144
M
Mann-Whitney U Test, 183
Mean Deviation, 9
N
Normality, 145
Null Hypothesis, 129
O
Observation, v
One-Sided (One-Tailed) Test, 131
Ordinal Scale, 3
Outcome, 23
P
Parameter, 106
R
Random Experiment, 23
Random Variables, 41
Randomization, 144
Range, 9
Ratio Scale, 3
Regression Analysis, 91
Regression Equation/Model, 91
Regression Line, 91
Relative Frequency, 4
Relative-Frequency Definition of Probability, 26
Replication, 144
S
Simple Regression, 92
Skewness, 10
Snowball Sampling, 108
Standard Deviation, 9
Standard Error of Estimate, 93
Standard Normal Distribution, 69
Statistic, 106
Statistical Hypothesis, 129
Statistical Inference, 106
Statistical Methods, 1
Statistics, 1
Stratified Random Sampling, 109
Subjective Probability, 25
Sufficiency, 128
W
Weighted Aggregate Price Index Numbers, 170
Weighted Average of Relative Price Index Number, 170
Z
Z-Score (Standard Score), 9