Week 8-11 (Quantitative) Flashcards
Descriptive Statistics
- Numbers that describe the data
- Frequencies
- Central tendency
- Measures of dispersion
Inferential Statistics
- Numbers that make inferences/predictions
- Calculations depend on the study
Frequency
How many times each value appears for a given variable
Mode
Most frequent value in a data set
Median
Middle value of data set
Mean
Average of all values - sum divided by the count
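The three measures of central tendency above can be computed with Python's standard library; the scores here are made-up illustration values, not data from the deck:

```python
# Central tendency with the stdlib (hypothetical scores)
import statistics

scores = [2, 3, 3, 4, 5, 5, 5, 7]

print(statistics.mean(scores))    # sum / count -> 4.25
print(statistics.median(scores))  # middle value (average of the 2 middle values here) -> 4.5
print(statistics.mode(scores))    # most frequent value -> 5
```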
Measures of Dispersion
- Variables can vary from their centre or central tendency
- Variation can be explained by two terms
Range
Difference between the lowest & highest value
Standard Deviation (SD)
Average difference between each value & the mean
Large Standard Deviation (SD)
Data is spread out
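A minimal sketch of both measures of dispersion, using made-up data and the population SD (square root of the mean squared deviation):

```python
# Range & population standard deviation (hypothetical data)
import statistics

data = [4, 8, 6, 5, 3, 7, 9]

rng = max(data) - min(data)    # range: highest minus lowest value
sd = statistics.pstdev(data)   # population SD: sqrt of mean squared deviation

print(rng)  # 9 - 3 = 6
print(sd)   # mean is 6; mean squared deviation is 4; SD = 2.0
```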
Probability
- Chance of something happening
- Allows inferences/predictions about what is likely to happen
- Normal distribution for interval & ratio data
Normal Distribution
Probability distribution in which the mean, median, mode are equal
Normal Curve
- Most variables form a normal distribution
- Assumption for inferential statistics
Z Score
- Statistic used to measure distance of raw scores from the mean
- Unit of measure is in standard deviation
SD & Z Scores
- Express raw score as a percentile
- Determine likelihood of getting a particular score
- Compare 2 scores from different normal distributions - standardization
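A sketch of turning a raw score into a z score and a percentile; the mean and SD are assumed values for an imaginary exam, and the percentile uses the normal CDF written via the error function:

```python
# Z score & percentile under the normal curve (assumed mean/SD)
import math

mean, sd = 70, 10   # hypothetical exam distribution
x = 85

z = (x - mean) / sd   # distance from the mean in SD units -> 1.5

# Normal CDF via the error function: fraction of scores below x
percentile = 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(z, round(percentile * 100, 1))   # a z of 1.5 is about the 93rd percentile
```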
Skewness
- Asymmetrical distribution in which 1 tail is longer than the other
- Outliers to the right = positive skew
- Outliers to the left = negative skew
- Mean, median, mode not equal
Kurtosis
How peaked or flat the distribution is compared to a normal curve
Null Hypothesis (H0)
Observations are the result of chance
Alternative Hypothesis (H1)
Observations are the result of a real effect - something else happened
P-Value
Probability of obtaining a test statistic at least as extreme by chance alone
Threshold Value (a)
Acceptable probability of rejecting a true null hypothesis
Rejecting Null Hypothesis
P-value lower than the pre-determined value a
Typically set between 0.05 & 0.001
Hypothesis Testing
- State the null hypothesis & alternative hypothesis
- Identify a statistic to assess the truth of the null hypothesis
- Compute the p-value
- Compare the p-value to a predetermined threshold value (a)
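The 4 steps above can be sketched with a toy example: H0 = the coin is fair, and the observed data are 9 heads in 10 flips. The exact two-tailed p-value comes from the binomial distribution (all numbers here are illustrative):

```python
# Hypothesis test sketch: is a coin fair, given 9 heads in 10 flips?
import math

n = 10  # flips

def prob(h):
    # P(exactly h heads | fair coin) = C(n, h) / 2^n
    return math.comb(n, h) / 2 ** n

# Two-tailed p-value: outcomes at least as extreme as 9 heads (0, 1, 9, 10)
p_value = sum(prob(h) for h in (0, 1, 9, 10))

alpha = 0.05
print(round(p_value, 4), p_value < alpha)   # p = 22/1024 ~ 0.0215 -> reject H0
```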
P=0
Impossible for the result to be due to chance
P=0.001
- Very unlikely
- 1 in 1000
P=0.05
- Fairly unlikely
- 1 in 20
P=0.5
- Fairly likely
- 1 in 2
P=0.75
- Very likely
- 3 in 4
P=1
Absolutely certain it is due to chance
Statistical Significance
- A result NOT attributed to chance
- Reject null hypothesis
Type I Error
- Null hypothesis is incorrectly rejected
- Concludes there is a significant relationship when there is not
Type II Error
- Fail to reject a null hypothesis that is actually false
- Concludes there is no relationship when there is
Confidence Intervals
- Specific range within which the population parameter is expected to lie
- Narrower = more precise findings
- Common to use 95%
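A minimal sketch of a 95% confidence interval for a mean, using the normal approximation (z = 1.96); the sample values are made up, and for small samples a t critical value would be more appropriate:

```python
# 95% confidence interval for a mean (normal approximation, hypothetical data)
import math
import statistics

sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
n = len(sample)

m = statistics.mean(sample)                    # 13.5
se = statistics.stdev(sample) / math.sqrt(n)   # standard error of the mean

low, high = m - 1.96 * se, m + 1.96 * se
print(round(low, 2), round(high, 2))   # the population mean is expected to lie in this range
```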
Clinical Significance
- Practical importance of a treatment effect
- Not always the same as statistical significance
Nominal
Differentiates between items based only on qualitative classifications
Ordinal
Provides a rank order with no degree of difference between them
Interval
- Allows for the degree of difference between items
- Zero is not truly zero
Ratio
Has a meaningful zero value
Alpha Level
- Cutoff value for p
- P should be less than a to reject H0
Non-Parametric Tests
- Use categorical data
- Ordinal or nominal
- Normal distribution not applicable
- Chi-square test
Parametric Tests
- Use continuous data
- Interval or ratio
- Normal distribution is applicable
- Population parameters (means/SDs) can be estimated
- T-test, ANOVA, correlation
Chi-Square Test
- Compares expected frequency with observed frequency of the data
- Examines relationship among categorical variables
Chi-Square Data Type
- Categorical - nominal
- Frequencies - counts or percentages
- Data can be put in a contingency table
Chi-Square Relationship of Interest
- Goodness of fit (observed vs expected) - 1 variable
- Test of independence/association - 2 variables
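A sketch of the test-of-independence statistic for a 2x2 contingency table, computed by hand; the counts are hypothetical:

```python
# Chi-square test of independence for a 2x2 contingency table (hypothetical counts)
# chi2 = sum over cells of (observed - expected)^2 / expected,
# where expected = row total * column total / grand total

observed = [[30, 10],   # e.g. treatment group: improved / not improved
            [20, 20]]   # control group:   improved / not improved

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)   # (rows - 1) * (cols - 1)
print(round(chi2, 3), df)   # compare chi2 to the critical value for this df
```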
Degrees of Freedom (df)
Number of scores that are free to vary when calculating a statistic
Goodness of Fit - 1 Variable*
- 1 independent categorical variable
- Tests how well an observed distribution corresponds to an expected probability
- Represented by the H0
Test for Independence/Association - 2+ Variables*
- 2+ independent categorical variables
- Tests whether categorical variables are associated with one another
Chi-Square Assumptions
- Frequency data
- Adequate sample size - expected frequency of 5+ per cell
- Measures must be independent
- Categories are set before testing based on theory
- No assumptions about the underlying distribution of the data
Chi-Square Limitations
- Does not indicate strength of an association
- Yes or no statistically significant relationship
- Sensitive to the sample size
T-Test
Used to compare means of 2 groups
T-Test Assumptions
2 groups that are compared should have approx. normal distributions with similar SDs
Independent vs Paired Samples
- T-tests can be done with independent or paired/dependent samples
- Not the same as dep & indep variables
- Calculations are different
Independent Samples
Both samples are randomly selected within population of interest
Paired Samples
Individuals in 1 sample are matched with those in the other sample
Two-Tailed Test
- Tests for any difference between means
- Non-directional
- Means are significantly different if 1 mean is within the top/bottom 2.5% of the other sample's probability distribution (p<0.05)
One-Tailed Test
- Tests for a difference in a particular direction
- Less stringent in the direction of interest
- Rejection region for H0 is all in 1 tail of the curve
- Will not give a significant result in other direction
- Should be used only when change in opposite direction is nearly impossible
ANOVA
Compare means of 3+ groups
Interpreting T-Test
- Compare test statistic to critical value for a given alpha
- If test statistic > critical value, then p<a
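A sketch of an independent-samples t-test using the pooled-variance formula; the two samples are made up, and the critical value 2.101 is the t-table entry assumed for df = 18, a = 0.05, two-tailed:

```python
# Pooled-variance t statistic for 2 independent samples (hypothetical data)
import statistics

a = [5, 7, 6, 8, 6, 7, 5, 6, 7, 8]
b = [4, 5, 5, 6, 4, 5, 6, 5, 4, 5]
na, nb = len(a), len(b)

# Pooling the variance assumes both groups have similar SDs (a t-test assumption)
sp2 = ((na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
t = (statistics.mean(a) - statistics.mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5

critical = 2.101   # t-table value for df = na + nb - 2 = 18, a = 0.05, two-tailed
print(round(t, 2), abs(t) > critical)   # reject H0 if |t| > critical value
```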
ANOVA Function
Calculates ratio of variation between treatments to the variation within treatments
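That ratio can be computed by hand for 3 small hypothetical treatment groups:

```python
# One-way ANOVA F ratio: between-treatments vs within-treatments variation
import statistics

groups = [[3, 4, 5], [6, 7, 8], [9, 10, 11]]   # 3 hypothetical treatment groups
k = len(groups)                                # number of groups
n = sum(len(g) for g in groups)                # total observations
grand_mean = statistics.mean(x for g in groups for x in g)

# Between-treatments sum of squares: group means vs grand mean
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
# Within-treatments sum of squares: values vs their own group mean
ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

f = (ss_between / (k - 1)) / (ss_within / (n - k))
print(round(f, 2))   # large F: group means differ more than chance variation within groups
```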
ANOVA H0
All means of treatment group are equal
ANOVA H1
At least 1 mean of treatment group is different
ANOVA Hypothesis
Can only determine whether a difference exists
Why ANOVA
- Allows testing of several null hypotheses at 1 time without increasing error
- With 2+ groups, comparing means with multiple t-tests increases risk of type I error
ANOVA Musts
- Use interval/ratio data/quasi-interval
- In practice ordinal data are used if scales are symmetric
- Groups being compared have similar SDs
- Independent/dependent samples
Independent Sample
Randomly selected within population of interest
Dependent Sample
- For repeat measures
- Examining a change over time in samples - time related
Repeat Measures
Increases likelihood of finding significant differences where they exist
Position/Carry-Over Effects
- Order of treatment may affect outcome
- Previous treatment continues to have effect during the next treatment
- Minimized by randomly assigning treatment order
SD Calculation
Average difference between each value & the mean
Variance Calculation
Average of the squared differences from the mean
F-Distribution
- Skewed to right
- F-values can be 0 or +
- Different F-distribution for each pair of degrees of freedom
MANOVA
- M=multivariate
- Data comes from independent samples
- 2+ outcome variables
Post Test
- Determine which means are significantly different from the others
- Different tests: Tukey, Bonferroni, Fisher’s LSD
Tukey Test
Best if all pairwise comparisons are of interest