Introduction to statistics Flashcards
What are the two types of statistics?
- Descriptive Statistics
- Inferential Statistics
What is the purpose of descriptive statistics?
- Describe data
- Summarize data
For example:
- How many people got each score
- The standing of a score relative to other scores
- Graphically summarizing a set of scores
What are the types of descriptive statistics?
- Frequency distribution
- Central tendency
- Variability
Frequency Distribution
- Number of participants in total or in each category
- An overview of the data without being overwhelmed by raw scores
- Visual assessment
- The frequencies should sum to n
- Lists each possible score and its frequency of occurrence
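A frequency distribution can be built in a few lines. This is a minimal sketch; the scores are made up for illustration and are not from any real data set.

```python
from collections import Counter

# Hypothetical raw scores for n = 10 participants (made up for illustration).
scores = [3, 5, 4, 3, 5, 5, 2, 4, 5, 3]

# Count how often each possible score occurs.
frequencies = Counter(scores)

# Print the table from the lowest to the highest score.
for score in sorted(frequencies):
    print(score, frequencies[score])

# The frequencies must add up to n, the number of participants.
n = len(scores)
total = sum(frequencies.values())
print("n =", n, "sum of frequencies =", total)
```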
What are the characteristics of distribution shapes?
- Modality
Number of humps in a distribution
- Skewness
Symmetrical or not (leaning to one side more than the other?)
- Kurtosis
The relative peakedness or flatness of a distribution compared to a normal distribution
What is a normal distribution?
- Bell-shaped curve
- The majority of the scores in the centre
- Skewness and kurtosis both within ±1 (a strict criterion)
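The ±1 check can be sketched as follows. This assumes the simple moment-based formulas for skewness and excess kurtosis (a normal distribution scores 0 on both); statistics packages often apply small-sample corrections, so their values may differ slightly. The function names and scores are hypothetical.

```python
from statistics import fmean

def skewness(xs):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    m = fmean(xs)
    m2 = fmean([(x - m) ** 2 for x in xs])
    m3 = fmean([(x - m) ** 3 for x in xs])
    return m3 / m2 ** 1.5

def excess_kurtosis(xs):
    """Moment-based kurtosis minus 3, so a normal distribution gives 0."""
    m = fmean(xs)
    m2 = fmean([(x - m) ** 2 for x in xs])
    m4 = fmean([(x - m) ** 4 for x in xs])
    return m4 / m2 ** 2 - 3

def within_strict_limits(xs, limit=1.0):
    """True if both skewness and excess kurtosis lie inside +/- limit."""
    return abs(skewness(xs)) < limit and abs(excess_kurtosis(xs)) < limit

# Hypothetical scores: one roughly bell-shaped set, one with a long right tail.
bell_like = [2, 3, 3, 4, 4, 4, 4, 5, 5, 6]
right_tailed = [1, 1, 1, 2, 10]

print(within_strict_limits(bell_like))
print(within_strict_limits(right_tailed))
```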
What statistical tests can one use to check for a normal distribution?
- Kolmogorov-Smirnov test
If n is larger than 50
- Shapiro-Wilk test
If n is smaller than 50
- The tests should not be significant; if they are, the sample distribution differs too much from a normal distribution
What can one do if the data is not normally distributed?
- Mann-Whitney U test (independent groups)
- Wilcoxon signed-rank test (paired groups)
- Use non-parametric tests instead of t-tests
What are the different frequency shapes that are not normally distributed?
- Positive skew
Tail on the right side, pointing toward the higher scores
- Negative skew
Tail on the left side, pointing toward the lower scores
- Leptokurtic
Symmetrical in shape but the central peak is higher; more scores fall near the mean, thus less variability
- Platykurtic
Symmetrical; most values occur with about the same frequency, giving a flatter curve
What can happen when data is not normally distributed?
- Positively skewed; inflated mean
- Negatively skewed; deflated mean
- Leptokurtic; too little variation in the data, so too few differences between people
- Platykurtic; too much variation
- Less confidence in the outcome of parametric tests
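A quick way to see the effect of skew on the mean is to compare it with the median, which is less affected by extreme scores. A minimal sketch with made-up scores:

```python
from statistics import mean, median

# Hypothetical positively skewed scores: one very high value forms the right tail.
positive_skew = [20, 22, 25, 27, 30, 200]
# Hypothetical negatively skewed scores: one very low value forms the left tail.
negative_skew = [1, 80, 85, 88, 90, 92]

for label, data in [("positive skew", positive_skew), ("negative skew", negative_skew)]:
    # The mean is pulled toward the tail; the median stays near the bulk of scores.
    print(label, "mean:", round(mean(data), 2), "median:", median(data))
```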
Central Tendency
- Describe the average score on a variable
- Ideally a single value
What are the three common measures of central tendency?
- Mean
The average score in the distribution
- Median
The middle score
- Mode
The most frequently occurring score in the distribution
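The three measures can be computed with Python's standard `statistics` module. The scores below are hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical scores on one variable (made up for illustration).
scores = [2, 3, 3, 4, 5, 5, 5, 6]

average = mean(scores)        # sum of the scores divided by how many there are
middle = median(scores)       # middle score; with an even n, the mean of the two middle scores
most_frequent = mode(scores)  # the score that occurs most often

print("mean:", average, "median:", middle, "mode:", most_frequent)
```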
Variability
- Samples can differ with respect to variability
- How spread out are the scores in a distribution
What are different measures of variability?
- Range
- Interquartile Range
- Standard Deviation
What is standard deviation?
- Conceptually it is an average deviation score
- How big one step is from the mean
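The measures of variability can be sketched with the standard library. The scores are made up, and note one assumption: `statistics.quantiles` uses its default "exclusive" method here, and other software may place the quartiles slightly differently.

```python
from statistics import pstdev, quantiles, stdev

# Hypothetical scores (made up for illustration).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: distance between the highest and the lowest score.
score_range = max(scores) - min(scores)

# Interquartile range: distance between the 3rd and the 1st quartile.
q1, _, q3 = quantiles(scores, n=4)
iqr = q3 - q1

# Standard deviation: roughly the typical distance of a score from the mean.
population_sd = pstdev(scores)  # divides by n
sample_sd = stdev(scores)       # divides by n - 1

print("range:", score_range, "IQR:", iqr)
print("population SD:", population_sd, "sample SD:", round(sample_sd, 3))
```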
What rule applies if data is normally distributed?
- About 68%, 95% and 99.7% of scores fall within 1, 2 and 3 standard deviations (steps) of the mean, respectively
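The three percentages can be checked against the standard normal distribution. A minimal sketch; `share_within` is a hypothetical helper name.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist()

def share_within(k):
    """Proportion of scores within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, "SD:", round(share_within(k) * 100, 1), "%")
```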
Inferential Statistics
Analyses one conducts in order to draw conclusions from data and to test hypotheses
Indirect Approach - Hypothesis Testing
- Obtain sample from population
- Compute statistics
- Infer relations in population from the sample
Null Hypotheses
That there is no effect
Example: CBT has no effect on depression
- Often what gets tested
- A p-value below .05 means that, if the null hypothesis is true, the likelihood of getting our finding by chance is less than 5%
- Loosely, 95% confidence that it's not random
Alternative Hypotheses
That there is an effect
Type 1 Error
- False positive
- Detecting a significant result when there is no real effect
- Two-tailed tests help reduce these errors
- Set the significance level to .01
- Too many comparisons increase the risk
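Why many comparisons are a problem can be shown with a short calculation. This sketch assumes the comparisons are independent; the Bonferroni correction shown is one common remedy and is not named in these flashcards.

```python
def familywise_error(alpha, comparisons):
    """Chance of at least one false positive across independent comparisons."""
    return 1 - (1 - alpha) ** comparisons

def bonferroni_alpha(alpha, comparisons):
    """Stricter per-test significance level (Bonferroni correction)."""
    return alpha / comparisons

for k in (1, 5, 10):
    risk = familywise_error(0.05, k)
    print(k, "comparisons -> risk of a Type 1 error:", round(risk, 3),
          "| corrected alpha:", round(bonferroni_alpha(0.05, k), 4))
```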
Type 2 Error
- False negative
- Saying there is no effect despite there being one
- Reduced by increasing the sample size
- Aim for a power of .80
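The link between sample size and power can be sketched for the simplest case. This assumes a two-sided one-sample z-test with a known standard deviation and a standardized effect size; real power analyses for t-tests or ANOVA use other formulas. `z_test_power` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (assumes a known SD)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for alpha = .05
    shift = effect_size * sqrt(n)               # how far the true effect moves the test statistic
    upper = 1 - std_normal.cdf(z_crit - shift)  # chance of landing in the upper rejection region
    lower = std_normal.cdf(-z_crit - shift)     # chance of landing in the lower rejection region
    return upper + lower

# Power grows with sample size for a fixed (hypothetical) medium effect of 0.5.
for n in (8, 16, 32, 64):
    print("n =", n, "power =", round(z_test_power(0.5, n), 2))
```

Power is 1 minus the probability of a Type 2 error, so a larger sample makes a false negative less likely.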
Effect sizes
The actual magnitude of the difference between groups or the magnitude of the association between variables
- Might be statistically significant but is the effect size big enough?
- Strength of the relationship; size of the association or of the difference between group means
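One widely used effect size for a difference between two group means is Cohen's d. The flashcards do not name a specific measure, so treat this as an illustrative sketch with made-up scores.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Difference between two group means in units of the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Hypothetical scores for two groups (made up for illustration).
treatment = [5, 6, 7, 8, 9]
control = [3, 4, 5, 6, 7]

d = cohens_d(treatment, control)
print("Cohen's d:", round(d, 2))
```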
What are the 4 general assumptions of parametric tests?
- Dependent variable is normally distributed
- Homogeneity of variance
- Outcome variable is continuous
- Independence of observations
Assumption - Dependent variable is normally distributed
- Problematic with smaller sample sizes
- Check skewness, kurtosis and the histogram, and conduct normality tests
Assumption - Homogeneity of variance
Variances of the dependent variable are equal across different groups or conditions in statistical tests like ANOVA.
- Important in t-test and ANOVA
- Levene’s test should not be statistically significant
- Risk of falsely rejecting the null hypothesis if violated
Assumption - Independence of observations
Data obtained from participants is independent
- No observation bias
Assumption - Outcome variable is continuous
- Should be measured on an interval or a ratio scale
Likert scale
“1.Disagree - 5.Agree”
What is standard error and its relation with sample size?
- Estimated error between a sample statistic (e.g. the sample mean) and the true population value
- The larger the sample size, the smaller the error, i.e. the closer the estimate is to the real population value
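For the mean, the standard error is the standard deviation divided by the square root of the sample size. A minimal sketch with a hypothetical standard deviation of 10:

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of the mean: standard deviation divided by the square root of n."""
    return sd / sqrt(n)

# Same spread in the data (hypothetical SD of 10), growing sample size.
for n in (25, 100, 400):
    print("n =", n, "standard error =", standard_error(10, n))
```

Because of the square root, quadrupling the sample size only halves the standard error.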