(Quantitative): Comparing groups: continuous variables Flashcards
What is statistical data analysis?
• Organise and analyse the data
• Common procedures used in analysis:
- Descriptive Statistics
- Inferential Statistics (more of a focus this term)
• Need to get data into shape
What does data analysis need to do/have?
- Needs to have a purpose
- Describe
- Compare
- Examine similarities
- Examine differences
What is descriptive statistics (recap from last semester)?
- Check for errors and outliers
- Describe & summarise
- Spread of the data
- Ensure appropriate analysis
- Determine whether the data are parametric or non-parametric
What ways can data be summarised (ratio or interval)?
• Measure of Central Tendency
- Mean, Median, Mode
- If the data are not normally distributed, use the median
• Measure of Dispersion
- Variance, Range, Standard Deviation
• Normal Curve, Skewness, Kurtosis
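A minimal sketch of these summary measures using Python's standard `statistics` module. The heart-rate values are hypothetical, made up purely for illustration.

```python
import statistics

# Hypothetical sample: resting heart rates (beats per minute)
heart_rates = [62, 65, 65, 68, 70, 72, 75, 90]

# Measures of central tendency
mean = statistics.mean(heart_rates)
median = statistics.median(heart_rates)   # less affected by the high value of 90
mode = statistics.mode(heart_rates)       # most frequent value

# Measures of dispersion
data_range = max(heart_rates) - min(heart_rates)
variance = statistics.variance(heart_rates)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(heart_rates)       # square root of the sample variance

print(mean, median, mode, data_range, round(std_dev, 2))
```

Note how the single high value pulls the mean above the median, which is why the median is the safer summary when data are not normally distributed.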
What are inferential statistics?
All statistical tests share a common structure:
• Set up a null and alternative hypothesis
• Establish a level of statistical significance (also known as alpha (α), usually set at 5% or 1%, depending on the study)
• Determine the statistical significance of the findings using the p-value
• Reject or fail to reject the null hypothesis: is there a difference between the two groups?
- SPSS output provides a p‐value (probability value)
- If the p‐value is greater than the alpha you cannot reject the null hypothesis
Note: a result may be statistically significant but not actually meaningful, e.g. a 1 second difference in a marathon
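The decision rule can be sketched as a small function. The name `decide` is hypothetical, and treating a p-value exactly equal to alpha as "not rejected" is an assumption, since the card only covers the greater-than and less-than cases.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a p-value with the chosen significance level (alpha)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be between 0 and 1")
    # Assumption: p exactly equal to alpha is treated as not significant
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


print(decide(0.03))         # tested at alpha = 5%
print(decide(0.03, 0.01))   # same p-value, stricter alpha = 1%
```

The same p-value can lead to different decisions depending on the alpha chosen for the study.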
What are the steps when undertaking a hypothesis test?
- Define study question
- Set null and alternative hypothesis
- Calculate a test statistic
- Calculate a p value
- Make a decision and interpret
What is Student's t-test?
- The t‐test is used to compare means between groups
- t‐test is easy to use but can be easily misused
- Most common statistical procedure used by researchers
What are the two types of t-test?
The same principles lie behind each, but there is more random error in the independent samples design:
• With the INDEPENDENT SAMPLES DESIGN, the control group might, by chance, be very different from the treatment group
• With the PAIRED DESIGN, each person is their own control so variation is limited
What is the process in the decision chart for bivariate data?
- Paired data -> Paired samples t-test -> Wilcoxon signed-rank test (for non-parametric data)
- Independent data -> Independent samples t-test -> Mann-Whitney U test (for non-parametric data)
What is independent data?
• Data comes from different (independent) groups of people
• E.g. a classic experiment (Group 1 receives intervention A, Group 2 receives intervention B)
• Study participant is in one group only
• Compare differences between groups (mean or median)
What is paired data?
• Data comes from one group of individuals
• Data collected from an individual at different points in time or under different conditions
• Compare differences in outcome between time 1 and time 2 or condition 1 and 2 (mean or median)
• Other terms: repeated measures, before and after study
e.g. cycling speed with different helmets
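A minimal sketch of how paired data are analysed: the test works on the within-person differences, so each rider acts as their own control. The speeds below are hypothetical, and the formula is the standard paired t statistic (mean difference divided by its standard error).

```python
import math
import statistics

# Hypothetical cycling speeds (km/h) for the same five riders with two helmets
helmet_a = [30.0, 32.0, 28.0, 35.0, 31.0]
helmet_b = [31.0, 34.0, 29.0, 35.0, 33.0]

# One difference per rider: condition 2 minus condition 1
differences = [b - a for a, b in zip(helmet_a, helmet_b)]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)        # sample SD of the differences
t_stat = mean_diff / (sd_diff / math.sqrt(n))  # paired t statistic
df = n - 1                                     # degrees of freedom

print(differences, round(t_stat, 3), df)
```

The t statistic would then be compared against the t-distribution with `df` degrees of freedom to obtain a p-value, which statistical software such as SPSS reports directly.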
Explain the independent samples design
• Dependent Variable is ratio/interval and Independent Variable has two categories
• Measurements in condition 1 are independent of measurements in condition 2
• If H0 is true, we expect the difference between the mean of the condition 1 group and the mean of the condition 2 group to be zero
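A sketch of the independent samples t statistic, assuming the equal-variance (pooled) form of Student's t-test. The function name `independent_t` and the group values are hypothetical.

```python
import math
import statistics


def independent_t(group1, group2):
    """Return (t statistic, degrees of freedom) using the pooled variance."""
    n1, n2 = len(group1), len(group2)
    mean_diff = statistics.mean(group1) - statistics.mean(group2)
    # Pooled variance: weighted average of the two sample variances
    pooled_var = (
        (n1 - 1) * statistics.variance(group1)
        + (n2 - 1) * statistics.variance(group2)
    ) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return mean_diff / std_error, n1 + n2 - 2


# Hypothetical outcome scores for two separate groups of participants
group_a = [5.0, 6.0, 7.0, 8.0, 9.0]
group_b = [3.0, 4.0, 5.0, 6.0, 7.0]

t_stat, df = independent_t(group_a, group_b)
print(t_stat, df)
```

When the two group means are identical the numerator is zero, so t is zero, which is the value expected on average if H0 is true.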
Explain the issue of error in an independent t-test
• The act of using a sample will introduce error
- What is the probability that the difference we found occurred by chance?
- If less than 5%, reject H0 and accept HA (the alternative hypothesis, also written H1)
Explain the sampling distribution of the independent samples t-test
• The t-distribution has a shape similar to the normal curve
• The middle is the population parameter when H0 is true (i.e. the mean difference is 0)
• Around it are all the possible sample statistics
• Is 'our' difference so big that it would only rarely happen by chance?
- Rarest 2.5% in each tail (5% in total)
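The idea of a sampling distribution centred on zero can be illustrated with an exact permutation approach. Note this is a swapped-in technique for illustration, not the t-distribution the t-test itself uses: it lists every way the pooled values could be split into two groups if the group labels made no difference (H0 true). The data are hypothetical.

```python
from itertools import combinations
import statistics

# Hypothetical outcome scores for two independent groups
group_a = [5.0, 6.0, 7.0, 8.0, 9.0]
group_b = [3.0, 4.0, 5.0, 6.0, 7.0]
observed = statistics.mean(group_a) - statistics.mean(group_b)

pooled = group_a + group_b
n_a = len(group_a)
n_b = len(pooled) - n_a
total = sum(pooled)

# Every possible relabelling of the pooled values into groups of size n_a and n_b
null_diffs = []
for indices in combinations(range(len(pooled)), n_a):
    sum_a = sum(pooled[i] for i in indices)
    null_diffs.append(sum_a / n_a - (total - sum_a) / n_b)

# Two-tailed: how often is a relabelled difference at least as extreme as ours?
# The small tolerance guards against floating-point rounding.
extreme = sum(abs(d) >= abs(observed) - 1e-12 for d in null_diffs)
p_value = extreme / len(null_diffs)

print(len(null_diffs), round(statistics.mean(null_diffs), 6), p_value)
```

The middle of `null_diffs` sits at zero, and the p-value is the share of possible differences lying in the two tails beyond the observed one.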
What are the assumptions for the independent samples t-test?
- Dependent Variable is ratio/interval
- If either group is small (30 or less), distribution of Dependent Variable for each group should not be badly skewed
- The variance of the Dependent Variable for the two groups should not be very different
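A rough sketch of checking two of these assumptions. The function name `check_assumptions` is hypothetical, and the cut-off of 4 for the variance ratio is an assumed rule of thumb, since the card only says the variances "should not be very different". Skewness is not computed here and still needs to be inspected for small groups.

```python
import statistics


def check_assumptions(group1, group2, max_variance_ratio=4.0, small_n=30):
    """Flag small groups and compare the two sample variances."""
    var1 = statistics.variance(group1)
    var2 = statistics.variance(group2)
    larger, smaller = max(var1, var2), min(var1, var2)
    # A zero variance makes the ratio undefined; treat it as infinitely large
    ratio = float("inf") if smaller == 0 else larger / smaller
    return {
        "variance_ratio": ratio,
        # Assumed threshold, not taken from the card
        "variances_similar": ratio <= max_variance_ratio,
        # Small groups (30 or less) need a skewness check before a t-test
        "check_skewness": min(len(group1), len(group2)) <= small_n,
    }


print(check_assumptions([5, 6, 7, 8, 9], [3, 4, 5, 6, 7]))
print(check_assumptions([1, 2, 3], [10, 20, 30]))
```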