Chapter Five Research


Data analysis

• Quantitative research is, as the term suggests,
concerned with the collection and analysis of
data in numeric form. It tends to emphasize
relatively large-scale and representative sets
of data.
Data analysis in quantitative research

I. Descriptive statistics
• Simple distribution (one variable)
• Bivariate relationships (2 variables, e.g.
frequency distributions)
• More than 2 variables (tri/multivariate, e.g.
multiple regression analysis)
II. Inferential statistics
• Uses probability theory:
• to test hypotheses
• to draw inferences as to whether results from a
random sample hold true for a designated study
population.
• to test whether descriptive results are likely to be due
to random factors or due to a real relationship. It helps
researchers decide whether a relationship really exists
between different sets of statistical results
Steps in designing a quantitative study

• Formulate a researchable question
• Review related literature
• State hypotheses
• Determine the variables to be studied
◦ Identify dependent, independent, control and other variables
◦ Determine how these variables will be operationalized
◦ Determine level of measurement
• Determine research plan/method of data collection
• Define population
• Determine what instruments will be used to collect data
• Pre-test instruments or pilot the study
• Determine statistical tests to use
Data collection in quantitative studies

1. Experimental
• Simple post-test
• Classic pre-test, post-test
• Pre-test, post-test, control group
2. Analysis of quantitative data
3. Observation
• Use check or tally sheet
4. Surveys
• Use questionnaires
Validity in quantitative research
• Validity of an assessment is the degree to
which it measures what it is supposed to
measure. This is not the same as reliability,
which is the extent to which a measurement
gives consistent results.
• Validity in quantitative research “refers to the
extent to which an empirical measurement
adequately reflects the real meaning of the
concept under consideration”
Measures of validity
• Face validity: ascertains that the measure appears to be
assessing the intended construct under study.
• Criterion-related or predictive validity: is used to predict
future or current performance; it correlates test results
with another criterion of interest.
• Construct validity: is used to ensure that the measure
actually measures what it is intended to measure (i.e. the
construct), and not other variables.
• Content validity: do the items included in the measure
adequately represent the universe of questions that could
have been asked?
QUALITATIVE RESEARCH

• This is sometimes referred to as ‘descriptive study’, ‘field study’, ‘participant
observation’, ‘case study’ or ‘naturalistic research’.
• The aim of qualitative type research is to “get close to the data in their natural
setting” (versus counting and statistical techniques at a distance from the
data): it is designed to best reflect an individual’s experience in the context of
their everyday life.
• It uses smaller sample sizes than quantitative studies and digs deeply for data.
• Qualitative research, on the other hand, is concerned with collecting and
analysing information in as many forms, chiefly non-numeric, as possible. It tends
to focus on exploring, in as much detail as possible, smaller numbers of instances
or examples which are seen as being interesting or illuminating, and aims to
achieve `depth' rather than `breadth'. (Blaxter, Hughes and Tight, 1996: 61)
• Emphasises comprehensive, interdependent, and dynamic
structures.
• Is appropriate in the investigation of complex, interdependent
issues, and allows for the collection of rich data that can
answer the “what” and “why” qualitative questions, and not
just the “how many” quantitative research questions
• Often draws on multiple sources of data
• Given its strength in generating/developing theory (inductive),
qualitative research is particularly appropriate for the
investigation of research problems that are under-theorised
• Qualitative research is harder, more stressful and more time-consuming
than other types.
• Qualitative research is only suitable for people who care about it, take it
seriously, and are prepared for commitment (Delamont, 1992)
Qualitative research methods:
• are concerned with opinions, feelings and experiences
• describe social phenomena as they occur naturally: no attempt is made
to manipulate the situation, only to understand and describe it
• seek understanding by taking a holistic perspective / approach, rather
than looking at a set of variables
• use qualitative data to help us develop concepts and
theories that help us to understand the social world. This is an inductive
approach to the development of theory, rather than the deductive approach
that quantitative research takes, i.e. testing theories that have already
been proposed.
• collect data through direct encounters, i.e. through
interview or observation, which is rather time consuming
Data collection in qualitative research

• Participant observation
• Case studies
• Formal and informal interviewing
• Videotaping
• Archival data surveys OR document review
Data analysis in qualitative studies
• Qualitative Data Analysis (QDA) is the range of
processes and procedures whereby we move
from the qualitative data that have been
collected into some form of explanation,
understanding or interpretation of the people
and situations we are investigating. QDA is
usually based on an interpretative philosophy.
The idea is to examine the meaningful and
symbolic content of qualitative data
• Discourse analysis
• Narrative analysis
• Content analysis
• Thematic analysis
• Content analysis can be used when qualitative
data has been collected through: Interviews,
Focus groups, Observation, Documentary analysis
• Content analysis is the procedure for the categorization of verbal
or behavioural data for the purpose of
classification, summarization and tabulation
• The content can be analyzed on two levels:
◦ Descriptive: what is the data?
◦ Interpretative: what was meant by the data?
Narrative analysis
• Narratives are transcribed experiences
• Every interview/observation has a narrative aspect: the
researcher has to sort out and reflect upon the
narratives, enhance them, and present them in a
revised shape to the reader
• The core activity in narrative analysis is to reformulate stories
presented by people in different contexts and
based on their different experiences
Discourse analysis
• A method of analyzing naturally occurring talk (spoken
interaction) and all types of written texts
• Focuses on ordinary people's methods of producing and making
sense of everyday social life: how is language used in
everyday situations?
◦ Sometimes people express themselves in a simple and straightforward way
◦ Sometimes people express themselves vaguely and indirectly
◦ The analyst must refer to the context when interpreting the message, as
the same phenomenon can be described in a number of
different ways depending on context
Thematic analysis
• Familiarization: Transcribing & reading the data
• Identifying a thematic framework: Initial coding framework
which is developed both from a priori issues and from
emergent issues
• Coding: Using numerical or textual codes to identify specific
pieces of data which correspond to different themes
• Charting: Charts created using headings from thematic
framework (can be thematic or by case)
• Mapping and interpretation: Searching for patterns,
associations, concepts and explanations in the data
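The coding and charting steps can be illustrated with a small sketch. Everything in it is hypothetical and made up for illustration: the case ids, the theme codes ("workload", "support"), the excerpts, and the `segments` structure. In a real study the codes come from the thematic framework, and the interpretation step still has to be done by the researcher.

```python
from collections import Counter, defaultdict

# Hypothetical coded segments: (case id, theme code, excerpt).
segments = [
    ("P1", "workload", "I never have enough time to finish."),
    ("P1", "support", "My supervisor checks in every week."),
    ("P2", "workload", "The deadlines pile up."),
    ("P3", "support", "Colleagues help when I ask."),
    ("P3", "workload", "I take work home most days."),
]

# Coding: count how often each theme code was applied.
theme_counts = Counter(theme for _, theme, _ in segments)

# Charting by theme: group excerpts under each heading, keeping the case id.
chart = defaultdict(list)
for case, theme, excerpt in segments:
    chart[theme].append((case, excerpt))

for theme, rows in chart.items():
    print(theme, theme_counts[theme])
    for case, excerpt in rows:
        print("  ", case, "-", excerpt)
```

Charting by case instead of by theme would only change the grouping key from `theme` to `case`.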
Sampling in qualitative research

• Sampling is mostly purposive – with specific criteria
in mind!
• Seek conceptual applicability rather than
representativeness (quantitative representativity)
• You want to capture the range of
views/experiences
• Or seek after/pursue saturation of data
• Or to draw theory from data.
WHY DO WE ANALYZE DATA
• The purpose of analyzing data is to obtain usable and useful information. The
analysis, irrespective of whether the data is qualitative or quantitative, may:

• describe and summarize the data

• identify relationships between variables

• compare variables

• identify the difference between variables

• forecast outcomes
Data analysis process
• Data collection and preparation
◦ Collect data
◦ Prepare codebook
◦ Set up structure of data
◦ Enter data
◦ Screen data for errors
• Exploration of data
◦ Descriptive statistics
◦ Graphs
• Analysis
◦ Explore relationship between variables
◦ Compare groups
Analysis
• Explore relationships among variables
◦ Crosstabulation/Chi Square
◦ Correlation
◦ Regression/Multiple regression
◦ Logistic regression
◦ Factor analysis
• Compare groups
◦ Non-parametric statistics
◦ Parametric statistics: T-tests, One-way analysis of variance (ANOVA),
Two-way between groups ANOVA, Multivariate analysis of variance (MANOVA)
Crosstabulation
• Aim: for categorical data (nominal
and ordinal), to see the
relationship between two or more
variables
• Procedure:
◦ Analyze>Descriptive statistics>Crosstabs
◦ Statistics: correlation, Chi Square,
association
◦ Cells: Percentages – row or column
◦ Cluster bar charts
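The chi-square statistic that accompanies a crosstabulation can also be computed by hand. The following is a minimal sketch, assuming a hypothetical 2x2 table of counts (the numbers are invented for illustration). It computes only the statistic and its degrees of freedom; the p-value would still have to be looked up in a chi-square table or obtained from statistical software.

```python
def chi_square(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof

# Hypothetical counts: rows = two groups, columns = answered yes / no.
observed_counts = [[20, 30],
                   [30, 20]]
chi2_value, dof = chi_square(observed_counts)
print(chi2_value, dof)  # every expected count is 25, so chi-square is 4.0 with 1 df
```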
Correlation
• Aim: find out whether a relationship exists and
determine its magnitude and direction
• Correlation coefficients:
◦ Pearson product-moment correlation coefficient
◦ Spearman rank-order correlation coefficient
• Assumptions:
◦ Linearity: the relationship is linear
◦ Homoscedasticity: variability of DV should remain constant
at all values of IV
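To make the two coefficients concrete, here is a minimal sketch in plain Python. The data (`hours`, `score`) are hypothetical. Spearman's coefficient is computed as Pearson's coefficient applied to the ranks, with tied values given their average rank.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank-order correlation: Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: the relationship is monotonic but not linear.
hours = [1, 2, 3, 4, 5]
score = [1, 4, 9, 16, 25]
r_pearson = pearson(hours, score)
r_spearman = spearman(hours, score)
print(round(r_pearson, 3), r_spearman)
```

Because the relationship here is perfectly monotonic but curved, Spearman's coefficient is 1 while Pearson's is slightly below 1, which illustrates why the linearity assumption matters for Pearson.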
Partial correlation
• Aim: to explore the relationship between two
variables while statistically controlling for the effect
of another variable that may be influencing the
relationship
• Assumptions:
• same as correlation
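For a single control variable, the first-order partial correlation can be computed from the three pairwise Pearson correlations. A minimal sketch, assuming hypothetical correlation values (0.5, 0.6 and 0.7 are invented for illustration):

```python
from math import sqrt

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after controlling for z.

    Uses the standard first-order formula:
    (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))
    """
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical pairwise correlations among x, y and the control variable z.
r_partial = partial_correlation(r_xy=0.5, r_xz=0.6, r_yz=0.7)
print(round(r_partial, 3))  # much smaller than the zero-order r of 0.5
```

In this made-up case most of the apparent relationship between x and y is accounted for by z.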
Regression
• Aim: use after there is a significant correlation to find the
appropriate linear model to predict DV (scale or
ordinal) from one or more IV (scale or ordinal)
• Assumptions:
◦ sample size needs to be large enough
◦ multicollinearity and singularity
◦ outliers
◦ normality
◦ linearity
◦ homoscedasticity
• Types:
◦ standard
◦ hierarchical
◦ stepwise
(Diagram: IV1, IV2 and IV3 predicting DV)
Logistic regression
• Aim: create a model to predict DV (categorical – 2 or
more categories) given one or more IV (categorical or
numerical/scale)
• Assumptions:
◦ sample size large enough
◦ multicollinearity
◦ outliers
• Procedure note:
◦ use Binary Logistic for a DV with 2 categories (coding 0/1)
◦ use Multinomial Logistic for a DV with more than 2
categories
Factor analysis
• Aim: to find what items (variables) clump together.
Usually used to create subscales. Data reduction.
• Factor analysis types:
◦ exploratory
◦ confirmatory
• SPSS -> Principal component analysis
Parametric Vs Non parametric statistics
A parameter is an important component of any statistical
analysis. In simple words, a parameter is any numerical quantity
that characterizes a given population or some aspect of it. This
means the parameter tells us something about the whole
population.
The basic distinction between parametric and non-parametric statistics is:
• If your measurement scale is nominal or ordinal, then you use
non-parametric statistics.
• If you are using interval or ratio scales, you use
parametric statistics.
• Parametric statistical procedures rely on assumptions about the
shape of the distribution (e.g., assume a normal distribution)
• Nonparametric statistical procedures rely on
no or few assumptions about the shape or
parameters of the population distribution
from which the sample was drawn.
• For parametric procedures, information about the population
is treated as completely known (through its parameters)
• For non-parametric procedures, information about the population
is unavailable
Parametric Vs Non parametric statistics
Analysis Type | Parametric Procedure | Nonparametric Procedure
Compare means between two distinct/independent groups | Two-sample t-test | Wilcoxon rank-sum test
Compare two quantitative measurements taken from the same individual | Paired t-test | Wilcoxon signed-rank test
Compare means between three or more distinct/independent groups | Analysis of variance (ANOVA) | Kruskal-Wallis test
Estimate the degree of association between two quantitative variables | Pearson coefficient of correlation | Spearman’s rank correlation
Measure of central tendency | Mean | Median
T-test for independent groups
• Aim: testing the differences between the means of two independent samples or
groups
• Requirements:
◦ Only one independent (grouping) variable IV (ex. Gender)
◦ Only two levels for that IV (ex. Male or Female)
◦ Only one dependent variable (DV - numerical)
• Assumptions:
◦ Sampling distribution of the difference between the means is normally distributed
◦ Homogeneity of variances – Tested by Levene’s Test for Equality of Variances
• Procedure:
◦ ANALYZE>COMPARE MEANS>INDEPENDENT SAMPLES T-TEST
◦ Test variable – DV
◦ Grouping variable – IV
◦ DEFINE GROUPS (need to remember your coding of the IV)
◦ Can also divide a range by using a cut point
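Outside SPSS, the test statistic itself is simple to compute. The sketch below assumes equal variances in the two groups (the pooled-variance, or Student's, form of the test) and uses invented scores for two hypothetical groups. It returns the t statistic and the degrees of freedom only; the p-value requires the t distribution, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled estimate of the common variance (assumes homogeneity of variances).
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    standard_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, n_a + n_b - 2

# Hypothetical DV scores for the two levels of the grouping variable.
scores_group_1 = [5.0, 6.0, 7.0, 8.0, 9.0]
scores_group_2 = [1.0, 2.0, 3.0, 4.0, 5.0]
t_independent, df_independent = independent_t(scores_group_1, scores_group_2)
print(round(t_independent, 3), df_independent)  # t is about 4.0 with 8 degrees of freedom
```

If Levene's test indicates unequal variances, the unpooled (Welch) form of the statistic would be the more appropriate choice; that variant is not shown here.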
Paired Samples T-test
• Aim: used in repeated measures or correlated groups
designs, where each subject is tested twice on the same
variable; also used for matched pairs
• Requirements:
◦ Looking at two sets of data (ex. pre-test vs. post-test)
◦ Two sets of data must be obtained from the same subjects
or from two matched groups of subjects
• Assumptions:
◦ Sampling distribution of the means is normally distributed
◦ Sampling distribution of the difference scores should
be normally distributed
• Procedure:
◦ ANALYZE>COMPARE MEANS>PAIRED SAMPLES T-TEST
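The paired test works on the difference scores, which can be shown in a few lines. This is a sketch with invented pre-test and post-test scores for four hypothetical subjects; it returns the t statistic and degrees of freedom, not the p-value.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """t statistic for paired measurements taken on the same subjects."""
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    # Standard error of the mean difference.
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1

# Hypothetical scores: each position is the same subject measured twice.
pre_test = [10.0, 12.0, 9.0, 11.0]
post_test = [12.0, 15.0, 10.0, 14.0]
t_paired, df_paired = paired_t(pre_test, post_test)
print(round(t_paired, 2), df_paired)  # about 4.70 with 3 degrees of freedom
```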
One-way Analysis of Variance
• Aim: looks at the means from several
independent groups; an extension of the independent
samples t-test
• Requirements:
◦ Only one IV (categorical)
◦ More than two levels for that IV
◦ Only one DV (numerical)
• Assumptions:
◦ The populations from which the samples are drawn are
normally distributed
◦ Homogeneity of variances
◦ Observations are all independent of one another
• Procedure:
◦ ANALYZE>COMPARE MEANS>One-Way ANOVA
◦ Dependent List – DV
◦ Factor – IV
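The F ratio in a one-way ANOVA compares the variability between group means with the variability within groups. A minimal sketch, assuming three hypothetical groups with invented scores; it computes F and the two degrees of freedom but not the p-value.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)

    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2 for group in groups
    )
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum(
        (score - sum(group) / len(group)) ** 2 for group in groups for score in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical DV scores for three levels of one categorical IV.
scores_by_level = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_ratio, df_between, df_within = one_way_anova(scores_by_level)
print(round(f_ratio, 3), df_between, df_within)  # F is about 13 with (2, 6) df
```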
Two-way Analysis of Variance
• Aim: test for main effects and interaction effects
on the DV
• Requirements:
◦ Two IV (categorical variables)
◦ Only one DV (continuous variable)
• Procedure:
◦ ANALYZE>General Linear Model>Univariate
◦ Dependent List – DV
◦ Fixed Factor – IVs
MANOVA
• Aim: extension of ANOVA when there is
more than one DV (should be related)
• Assumptions:
◦ sample size
◦ normality
◦ outliers
◦ linearity
◦ homogeneity of regression
◦ multicollinearity and singularity
◦ homogeneity of variance-covariance matrices
Key Differences Between Parametric and Nonparametric Tests
• The fundamental differences between parametric and nonparametric tests are discussed in
the following points:
• A statistical test in which specific assumptions are made about the population parameter is
known as a parametric test. A statistical test used in the case of non-metric independent
variables is called a nonparametric test.
• In a parametric test, the test statistic is based on an assumed distribution. In a
nonparametric test, on the other hand, the distribution is arbitrary.
• In a parametric test, it is assumed that the variables of interest are measured on an
interval or ratio level. In a nonparametric test, the variables of interest
are measured on a nominal or ordinal scale.
• In general, the measure of central tendency in a parametric test is the mean, while in
a nonparametric test it is the median.
• In a parametric test, there is complete information about the population. Conversely, in a
nonparametric test, there is no information about the population.
• Parametric tests apply to variables only, whereas nonparametric tests apply
to both variables and attributes.
• For measuring the degree of association between two quantitative variables, Pearson’s
coefficient of correlation is used in the parametric case, while Spearman’s rank correlation is
used in the nonparametric case.
Hypothesis Tests Hierarchy
• Choosing between a parametric and a
nonparametric test is not easy for a researcher
conducting statistical analysis. In hypothesis
testing, if the information about the population
is completely known, by way of parameters, then
the test is said to be a parametric test, whereas if
there is no knowledge about the population and a
hypothesis about the population still needs to be tested, then
the test conducted is considered a
nonparametric test.
When trying to decide what test to use, ask
yourself the following...
• Am I interested in...?
◦ description (association) - correlations, factor analysis, path analysis
◦ explanation (prediction) - regression, logistic regression, discriminant analysis
◦ intervention (group differences) - t-test, ANOVA, MANOVA, chi square
• Do I need longitudinal data or is cross-sectional data sufficient for my purpose?
◦ Do my hypotheses involve the investigation of change, growth, or the timing of an event?
◦ If longitudinal data is necessary, how many data points are needed?
• Is my dependent variable nominal, ordinal, interval, or ratio?
◦ nominal - chi square, logistic regression
◦ dichotomous - logistic regression
◦ ordinal - chi square
◦ interval/ratio - correlation, multiple regression, path analysis, t-test, ANOVA, MANOVA,
discriminant analysis
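The last question in the checklist amounts to a lookup from the measurement level of the dependent variable to a list of candidate tests. The sketch below encodes only the mapping given on this slide; the function name `candidate_tests` and the dictionary name are hypothetical, and the result is a starting point for choosing a test, not a rule.

```python
# Candidate tests by measurement level of the dependent variable,
# copied from the checklist above (not an exhaustive decision rule).
TESTS_BY_DV_LEVEL = {
    "nominal": ["chi square", "logistic regression"],
    "dichotomous": ["logistic regression"],
    "ordinal": ["chi square"],
    "interval/ratio": [
        "correlation", "multiple regression", "path analysis",
        "t-test", "ANOVA", "MANOVA", "discriminant analysis",
    ],
}

def candidate_tests(dv_level):
    """Return the candidate tests listed for a given DV measurement level."""
    key = dv_level.strip().lower()
    if key in ("interval", "ratio"):
        key = "interval/ratio"
    if key not in TESTS_BY_DV_LEVEL:
        raise ValueError(f"unknown measurement level: {dv_level!r}")
    return TESTS_BY_DV_LEVEL[key]

print(candidate_tests("Ordinal"))
print(candidate_tests("ratio"))
```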
Regression Analysis
• Regression analysis is a statistical tool for the investigation of
relationships between variables.
• Usually, the investigator seeks to ascertain the causal effect of one variable upon
another, e.g. the effect of a price increase upon demand, or the
effect of changes in the money supply upon the inflation rate.
• To explore such issues, the investigator assembles data on the underlying variables
of interest and employs regression to estimate the quantitative
effect of the causal variables upon the variable that they
influence.
• The investigator typically assesses the “statistical significance” of
the estimated relationships, that is, the degree of confidence that
the true relationship is close to the estimated relationship.
Linear regression is the most basic type of regression and a commonly used predictive analysis.
There are several regression analyses available to the researcher.
• Simple linear regression
1 dependent variable (interval or ratio), 1 independent variable (interval or ratio or
dichotomous)
• Multiple linear regression
1 dependent variable (interval or ratio), 2+ independent variables (interval or ratio or
dichotomous)
• Logistic regression
1 dependent variable (binary), 1+ independent variable(s) (interval or ratio or dichotomous)
• Ordinal regression
1 dependent variable (ordinal), 1+ independent variable(s) (nominal or dichotomous)
• Multinomial regression
1 dependent variable (nominal), 1+ independent variable(s) (interval or ratio or dichotomous)
• Discriminant analysis
1 dependent variable (nominal), 1+ independent variable(s) (interval or ratio)
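The simplest case in this list, simple linear regression with one independent variable, can be fitted by ordinary least squares in a few lines. The sketch below uses invented data (`x_values`, `y_values`) and returns the slope, the intercept and R squared. It is an illustration of the least-squares formulas, not a replacement for the regression procedure of a statistics package, which also reports significance tests.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With one predictor, R squared equals the squared Pearson correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical data: one interval-level IV and one interval-level DV.
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [2.0, 4.0, 5.0, 4.0, 5.0]
slope, intercept, r_squared = simple_linear_regression(x_values, y_values)
print(round(slope, 2), round(intercept, 2), round(r_squared, 2))  # 0.6 2.2 0.6
```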
• Multiple linear regression is the most common
form of linear regression analysis.  As a
predictive analysis, multiple linear
regression is used to explain the relationship
between one continuous dependent variable
and two or more independent variables.  The
independent variables can be continuous or
categorical (dummy coded as appropriate).
• Ordinal regression is a statistical technique
that is used to predict the behavior of ordinal-level
dependent variables with a set of independent
variables.  The dependent variable is an ordered
response category variable and the
independent variables may be categorical or
continuous.
• Logistic regression is the regression
analysis to conduct when the dependent
variable is dichotomous (binary).  Like all
regression analyses, logistic regression is a
predictive analysis.  Logistic regression is used
to describe data and to explain the
relationship between one dependent binary
variable and one or more continuous-level
(interval or ratio scale) independent variables.
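A fitted logistic regression model is linear in the log odds of the outcome, and the logistic function converts those log odds into a probability. The sketch below shows only that prediction step. The intercept and coefficient values are hypothetical (not estimated from any data), and estimating them would be done with statistical software.

```python
from math import exp

def predicted_probability(intercept, coefficients, values):
    """Probability that the binary DV equals 1 under a logistic model."""
    # The model is linear on the log-odds scale.
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, values))
    # The logistic function maps log odds to a probability between 0 and 1.
    return 1 / (1 + exp(-log_odds))

# Hypothetical model with one IV: log odds = -2.0 + 0.5 * x.
hypothetical_intercept = -2.0
hypothetical_coefficients = [0.5]

p_at_4 = predicted_probability(hypothetical_intercept, hypothetical_coefficients, [4.0])
p_at_8 = predicted_probability(hypothetical_intercept, hypothetical_coefficients, [8.0])
print(p_at_4, round(p_at_8, 3))  # log odds of 0 give a probability of 0.5

# Each one-unit increase in x multiplies the odds by exp(coefficient).
odds_ratio = exp(hypothetical_coefficients[0])
print(round(odds_ratio, 3))
```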
• Nonlinear regression is a regression in which
the dependent or criterion variables are
modeled as a non-linear function of model
parameters and one or more independent
variables.  There are several common models,
such as Asymptotic Regression
Assumptions of Linear Regression
• Linear regression is an analysis that assesses whether
one or more predictor variables explain the dependent
(criterion) variable.  The regression has five key
assumptions:
◦ Linear relationship
◦ Multivariate normality
◦ No or little multicollinearity
◦ No auto-correlation
◦ Homoscedasticity
Assumptions of Logistic Regression

• model fit
• the error terms need to be independent.
• linearity of independent variables and log
odds.
• it requires quite large sample sizes
