Module 7 - Slides Flashcards
Data Analysis / Reduction
Reduces large data sets into more compact, manageable and interpretable information
What does data analysis involve?
Organizing and interpreting quantitative data according to systematic rules, followed by statistical analysis
Two main types of statistical analysis for quantitative data
- Descriptive statistics
- Inferential statistics
Descriptive Statistics
Used to characterize the SHAPE, CENTRAL TENDENCY, and VARIABILITY within a set of data, called a DISTRIBUTION
Parameters
Measures of population characteristics
Statistic
A descriptive index computed from sample data
Distribution
The total set of scores for a particular variable
Distribution of Scores - Coin Rotation Test (CRT)
Frequency distribution
Cumulative percent
Methods to Display Frequency Distributions
A table of rank-ordered scores showing the number of times each value occurred (its frequency, f), or a graph (e.g., a histogram)
Graphing Frequency Distributions -Histogram
A type of bar graph, composed of a series of columns, each representing one score or group interval
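The frequency table and histogram cards above can be sketched in a few lines of Python with `collections.Counter`; the scores here are invented for illustration, and the "histogram" is just a text bar per score value, with the cumulative percent alongside.

```python
from collections import Counter

scores = [3, 5, 5, 6, 6, 6, 7, 7, 8]   # hypothetical test scores
freq = Counter(scores)                   # frequency (f) of each score
total = len(scores)

cum = 0
for value in sorted(freq):               # rank-ordered scores
    cum += freq[value]                   # running (cumulative) count
    bar = "#" * freq[value]              # one column of the "histogram"
    print(f"{value}: f={freq[value]}  cum%={100 * cum / total:.0f}  {bar}")
```

The last row's cumulative percent is always 100, since the distribution covers every score.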
Measures of Central Tendency (3Ms)
Mean (average)
Median (middle score)
Mode (the score that occurs most frequently in a distribution)
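The three Ms map directly onto Python's standard `statistics` module; the sample scores below are made up to show a case where the three measures differ.

```python
import statistics

scores = [2, 4, 4, 5, 7, 9]

mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # middle score of the ordered list
mode = statistics.mode(scores)      # most frequent score

print(mean, median, mode)
```

With an even number of scores, the median is the average of the two middle values (here 4 and 5, so 4.5).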
Variability
A measure of the spread of scores within a distribution; expressed in several ways:
Range, Variance, Standard deviation (SD)
Range
From minimum to maximum
Variance
The average of the squared deviations from the mean, computed from the sum of squares (SS); for a sample, SS/(n − 1)
-Should be small if scores are close together, and large if they are spread out
Standard deviation (SD)
Square root of the variance; expressed in the original units of measurement
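The three variability measures on these cards can be computed by hand; this sketch uses the sample formula (SS divided by n − 1), and the scores are invented.

```python
import math

scores = [4, 6, 6, 8, 10, 14]
n = len(scores)
mean = sum(scores) / n                     # mean = 8

data_range = max(scores) - min(scores)     # range: max - min
ss = sum((x - mean) ** 2 for x in scores)  # sum of squares (SS)
variance = ss / (n - 1)                    # sample variance
sd = math.sqrt(variance)                   # standard deviation

print(data_range, ss, variance, round(sd, 2))
```

Note that `variance` matches `statistics.variance(scores)`, which also uses the n − 1 denominator.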
Normal Distribution
Known as a bell-shaped distribution or Gaussian distribution
Inferential Statistics
Used to make INFERENCES or draw conclusions about a POPULATION based on findings in a SAMPLE (involving a decision-making process)
Fundamental Concepts of Statistics - Statistical Basics
▪Alpha (α) level and Probability (p) value
▪Confidence interval (CI)
▪Hypothesis testing (null vs. alternative hypothesis)
▪Errors in hypothesis testing – type I and II
▪Statistical Power
▪Effect size
Alpha Level
Level of significance
The amount of chance researchers are willing to tolerate
Specified before the analysis is conducted
P value (Probability)
The likelihood that any one event will occur, given all the possible outcomes
Implies uncertainty - what is likely to happen
Is a product of data analysis
When is there a statistically significant difference?
If p < α; otherwise, no significant difference is found
Confidence Intervals (CI)
A range of scores with specific boundaries, or confidence limits, representing a specified probability (e.g., 95%, the traditional value) that the true population value lies within the range
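A minimal sketch of a 95% CI for a sample mean, assuming the normal approximation (z = 1.96) rather than the exact t-based interval; the data are hypothetical.

```python
import math
import statistics

scores = [10, 12, 13, 15, 15, 16, 18, 21]
n = len(scores)
mean = statistics.mean(scores)                  # sample mean = 15
sem = statistics.stdev(scores) / math.sqrt(n)   # standard error of the mean

lower = mean - 1.96 * sem                       # lower confidence limit
upper = mean + 1.96 * sem                       # upper confidence limit
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

For small samples like this one, a t critical value (from a t table, df = n − 1) would give a slightly wider interval than z = 1.96.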
Statistical Hypothesis Testing
- Null hypothesis – H0: μ1 = μ2
➢There is NO difference between the groups or interventions
- Alternative hypothesis – H1: μ1 ≠ μ2
➢There is a difference
- Statistical conclusion: “disproving” the null hypothesis – either reject H0 or do not reject H0
Errors in Hypothesis Testing
- Decision is either correct or not correct
- Potential errors in statistical decision making
✓Type I – mistakenly finding a difference (false-positive +); probability is α
✓Type II – mistakenly finding no difference (false-negative -); probability is β
Statistical Power
Power is the probability of attaining statistical significance; can be thought of as sensitivity, or the probability that a test will lead to rejection of the null hypothesis (H0)
Power analysis involves 4 interdependent concepts: PANE
✓ P = power (1 – β; β = probability to commit Type II error)
✓ A = alpha (level of significance; 0.05 (default) or 0.01)
✓ N = sample size, its influence on power is critical
✓ E = effect size; the size of the effect of the IV also influences power
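How the PANE quantities interact can be sketched with a one-sample, two-tailed z-test approximation (not an exact power analysis); the effect sizes and sample sizes are illustrative, and only the two alpha levels named on the card are handled.

```python
import math

def phi(x):
    # Standard normal cumulative distribution function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def power(effect_size, n, alpha=0.05):
    # Approximate power of a two-tailed one-sample z-test:
    # P = 1 - beta, ignoring the negligible far-tail rejection region
    z_crit = 1.96 if alpha == 0.05 else 2.576
    return phi(effect_size * math.sqrt(n) - z_crit)

# Larger N (or larger effect size E) -> greater power
print(round(power(0.5, 30), 2))  # medium effect, n = 30
print(round(power(0.5, 60), 2))  # same effect, doubled sample
```

The monotonic relationship is the point of the PANE mnemonic: fixing any three of the four quantities determines the fourth.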
Effect Size (ES)
A measure of the degree to which H0 is false, or the size of the effect of the independent variable (IV)
Assumptions that must be met for Parametric statistics
- Samples are randomly drawn from a parent population with a normal distribution
- Variances in the samples being compared are roughly equal
- Data should be measured on the interval or ratio scale
Nonparametric statistics
Can be used when one or more of the parametric assumptions are not met
Parametric statistic - t-Test
Compares two means (of data samples)
Parametric statistic - Analysis of Variance (ANOVA)
Compares more than two means
Independent t-Test (unpaired)
Test difference between two independent groups or samples
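The unpaired t statistic can be computed by hand with a pooled variance, as a sketch of what the test does; the two groups are invented, chosen so the arithmetic comes out cleanly.

```python
import math
import statistics

group_a = [10, 12, 14, 16, 18]   # hypothetical group 1 scores
group_b = [13, 15, 17, 19, 21]   # hypothetical group 2 scores
na, nb = len(group_a), len(group_b)

# Pooled variance: weighted average of the two sample variances
pooled_var = ((na - 1) * statistics.variance(group_a)
              + (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
se = math.sqrt(pooled_var * (1 / na + 1 / nb))   # standard error of the difference
t = (statistics.mean(group_a) - statistics.mean(group_b)) / se

print(round(t, 2))
```

The decision step compares |t| to a critical value from a t table with df = na + nb − 2 (here df = 8, about 2.306 at α = 0.05, two-tailed); a statistics package would instead report the exact p value.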
Levene’s test
Used to determine homogeneity of variance. If the test is not significant, variances are assumed to be equal
Paired t-Test
- Used in repeated-measures designs (see previous module)
- Used when subjects are exposed to both conditions
Inappropriate Use of Multiple t-Tests
Use of multiple t-tests will increase the chance of making a Type I error
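The inflation can be quantified: with k independent tests each run at a given alpha, the family-wise chance of at least one false positive is 1 − (1 − α)^k, sketched below for a few values of k.

```python
alpha = 0.05
for k in (1, 3, 5, 10):
    # Probability of at least one Type I error across k independent tests
    fwer = 1 - (1 - alpha) ** k
    print(f"{k} t-tests: family-wise error = {fwer:.3f}")
```

At k = 3 the family-wise error is already about 0.14, nearly triple the nominal 0.05, which is why ANOVA (or a correction such as Bonferroni) is preferred over repeated t-tests.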
Analysis of Variance (ANOVA)
- Differences between more than two means (3 or 4 …)
- Uses the F statistic (counterpart to t statistic for t test)
- Named for Sir Ronald Fisher
- Based on parametric assumptions of
✓ normal distribution
✓ interval or ratio level of measurement
✓ equal variance within groups (Levene’s test*)
One-Way ANOVA
Appropriate for one-way design with one IV with three or more levels (groups)
If only two groups are compared, one-way ANOVA is equivalent to the t-test (F = t²)
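The F statistic for a one-way ANOVA can be computed by hand from the between- and within-group sums of squares; the three groups below are invented so the result is exact.

```python
import statistics

groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]  # three levels of one IV
all_scores = [x for g in groups for x in g]
grand_mean = statistics.mean(all_scores)

k = len(groups)            # number of groups (levels)
n_total = len(all_scores)  # total N

# Between-groups SS: how far each group mean sits from the grand mean
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
# Within-groups SS: spread of scores around their own group mean
ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

ms_between = ss_between / (k - 1)        # df between = k - 1
ms_within = ss_within / (n_total - k)    # df within = N - k
f_stat = ms_between / ms_within

print(round(f_stat, 2))
```

A large F means the variability between group means dominates the variability within groups; significance is then judged against the F distribution with (k − 1, N − k) degrees of freedom.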
Two-Way ANOVA for factorial design
Two-way indicates two independent variables (IVs)
Accounts for the main effect of each independent variable and the interaction effect between the two IVs
Mixed ANOVA for Mixed Designs
A mixed design with both between-groups and within-group (repeated) factors
Two independent variables
✓One (timing) repeated across all subjects (pretest & posttest) – within-subjects
✓The other randomized to independent groups (Tx) – between-subjects
Format of the mixed ANOVA
A combination of between-subjects (independent factors) and within-subjects (repeated factors) analyses
✓The independent factor is analyzed as it would be in a regular one-way ANOVA
✓The repeated factor is analyzed using techniques for a repeated-measures analysis