(Quantitative): Comparing groups: continuous variables Flashcards

1
Q

What is statistical data analysis?

A
• Organise and analyse the data
• Common procedures used in analysis:
  - Descriptive Statistics
  - Inferential Statistics (more of a focus this term)
• Need to get data into shape
2
Q

What does data analysis need to do/have?

A
  • Needs to have a purpose
  • Describe
  • Compare
  • Examine similarities
  • Examine differences
3
Q

What is descriptive statistics (recap from last semester)?

A
  • Check for errors and outliers
  • Describe & summarise
  • Spread of the data
  • Ensure appropriate analysis
  • Decide whether the data are parametric or non-parametric
4
Q

What ways can data be summarised (ratio or interval)?

A
• Measure of Central Tendency
  -Mean, Median, Mode
  - If the data are not normally distributed, use the median
• Measure of Dispersion
 -Variation, Range, Standard Deviation
• Normal Curve, Skewness, Kurtosis
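As a rough illustration of these summaries, the sketch below uses Python's standard `statistics` module on a made-up sample (the `times` values are hypothetical, not from the course). One extreme value pulls the mean above the median, which is why the median is preferred when data are not normal.

```python
import statistics

# Hypothetical sample: sprint times in seconds (made-up values for illustration)
times = [11.2, 11.5, 11.9, 12.0, 12.0, 12.4, 15.8]

# Measures of central tendency
mean = statistics.mean(times)
median = statistics.median(times)
mode = statistics.mode(times)

# Measures of dispersion
data_range = max(times) - min(times)
variance = statistics.variance(times)  # sample variance (n - 1 denominator)
sd = statistics.stdev(times)           # sample standard deviation

print(f"mean={mean:.2f} median={median:.2f} mode={mode}")
print(f"range={data_range:.2f} variance={variance:.2f} sd={sd:.2f}")
```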
5
Q

What are inferential statistics?

A

All statistical tests share a common structure:
• Set up a null and an alternative hypothesis
• Establish a level of statistical significance (also known as alpha (α), usually set at 5% or 1%, depending on the study)
• Determine the statistical significance of the findings (the p value)
• Reject or fail to reject the null hypothesis: is there a difference between the two groups?
- SPSS output provides a p-value (probability value)
- If the p-value is greater than the alpha you cannot reject the null hypothesis
Note: a result may be statistically significant but not practically meaningful, e.g. a 1 second difference in a marathon

6
Q

What are the steps when undertaking a hypothesis test?

A
  • Define study question
  • Set null and alternative hypothesis
  • Calculate a test statistic
  • Calculate a p value
  • Make a decision and interpret
7
Q

What is Student's t-test?

A
  • The t‐test is used to compare means between groups
  • t‐test is easy to use but can be easily misused
  • Most common statistical procedure used by researchers
8
Q

What are the two types of t-test?

A

The same principles lie behind each, but there is more random error in the …
• INDEPENDENT SAMPLES DESIGN, because the control group might, by chance, be very different from the treatment group
• With the PAIRED DESIGN, each person is their own control, so variation is limited

9
Q

What is the process in the decision chart for bivariate data?

A
  • Paired data -> Paired samples t-test -> Wilcoxon signed-rank test (for non-parametric)
  • Independent data -> Independent samples t-test -> Mann-Whitney U test (for non-parametric)
10
Q

What is independent data?

A

• Data comes from different (independent) groups of people
• E.g. a classic experiment (Group 1 receives intervention A, Group 2 receives intervention B)
• Study participant is in one group only
• Compare differences between groups (mean or median)

11
Q

What is paired data?

A

• Data comes from one group of individuals
• Data collected from an individual at different points in time or under different conditions
• Compare differences in outcome between time 1 and time 2 or condition 1 and 2 (mean or median)
• Other terms: repeated measures, before and after study
e.g. cycling speed with different helmets

12
Q

Explain the independent samples design

A

• Dependent Variable is ratio/interval and Independent Variable has two categories
• Measurements in condition 1 are independent of measurements in condition 2
• If the H0 is true we expect the difference between the mean of condition 1 group and condition 2 group to be zero

13
Q

Explain the issue of error in an independent t-test

A

• The act of using a sample will introduce error
  - What is the probability that the difference we found occurred by chance?
  - If less than 5%, reject H0 and accept HA (also written H1, the alternative)

14
Q

Explain the sampling distribution of the independent samples t-test?

A

• The t-distribution has a shape similar to the normal curve
• The middle is the population parameter when H0 is true (i.e. the mean difference is 0)
• Around it are all the possible sample statistics
• Is 'our' difference so big that it would only rarely happen by chance?
  - The rarest 2.5% in each tail (5% in total)

15
Q

What are the assumptions for the independent samples t-test?

A
  • Dependent Variable is ratio/interval
  • If either group is small (30 or less), distribution of Dependent Variable for each group should not be badly skewed
  • The variance of the Dependent Variable for the two groups should not be very different
16
Q

How is a problematic difference indicated in an independent t-test?

A

• A problematic difference in variances is indicated by a significant Levene’s Test:

  • If significant, interpret the p value associated with ‘equal variances not assumed’
  • If non‐significant, interpret p value associated with ‘equal variances assumed’
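To make the two rows concrete, here is a minimal sketch of the two test statistics, assuming the 'equal variances assumed' row is the pooled-variance Student's t and the 'equal variances not assumed' row is the Welch correction. The function names and the data are hypothetical; no p-value is computed because the t-distribution is not in the Python standard library.

```python
import math
import statistics

def pooled_t(a, b):
    """Student's t with a pooled variance ('equal variances assumed'). Returns (t, df)."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(a) - statistics.mean(b)) / se, n1 + n2 - 2

def welch_t(a, b):
    """Welch's t ('equal variances not assumed'). Returns (t, approximate df)."""
    n1, n2 = len(a), len(b)
    q1, q2 = statistics.variance(a) / n1, statistics.variance(b) / n2
    se = math.sqrt(q1 + q2)
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return (statistics.mean(a) - statistics.mean(b)) / se, df

# Made-up scores for two independent groups
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

t_pooled, df_pooled = pooled_t(group_a, group_b)
t_welch, df_welch = welch_t(group_a, group_b)
print(t_pooled, df_pooled)
print(t_welch, df_welch)
```

With equal group sizes and equal variances, as in this made-up data, the two versions agree; they diverge as the variances and group sizes become more unequal.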
17
Q

What about paired data that is normal? (check)

A
  • Observations not independent
  • Paired equivalent: paired samples t-test (same assumptions)
  • H0: No difference in the means before and after
  • H1: A difference in the means before and after
18
Q

Explain the paired samples t-test design

A
  • Dependent Variable is ratio/interval and Independent Variable has two categories
  • Each measurement in Cond 1 (performance with caffeine) has a match in Cond 2 (performance with water)
  • One measurement is subtracted from the other so that each case has a difference score
  • If the null hypothesis (H0 ) is true and there is no difference in performance with e.g. caffeine and with water we would expect the group mean difference score to be 0
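The difference-score idea can be sketched as follows. This is only an illustration: the function name `paired_t` and the caffeine/water scores are made up, and the p-value lookup is left out because the t-distribution is not in the Python standard library.

```python
import math
import statistics

def paired_t(cond1, cond2):
    """Paired samples t statistic from per-person difference scores. Returns (t, df)."""
    diffs = [x - y for x, y in zip(cond1, cond2)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)            # expected to be 0 if H0 is true
    se = statistics.stdev(diffs) / math.sqrt(n)   # standard error of the mean difference
    return mean_diff / se, n - 1

# Made-up performance scores for the same five people under each condition
caffeine = [12, 15, 11, 14, 13]
water = [10, 14, 10, 11, 12]

t_stat, df = paired_t(caffeine, water)
print(t_stat, df)
```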
19
Q

Explain the issue of error in a paired-samples t-test

A

• The act of using a sample will introduce error
• What is the probability that the difference we found occurred by chance?
• If less than 5%, reject H0 and accept HA (also written H1, the alternative)

20
Q

When would non-parametric equivalents to a t-test be used?

A
  • If we have an ordinal scale Dependent Variable, or a ratio/interval Dependent Variable that does not meet parametric assumptions we use non‐parametric equivalents
  • These compare medians (ranks) (not affected by extremes) rather than means
  • They are usually less powerful
21
Q

When should non-parametric tests be used?

A
  • used when assumptions of parametric tests are not met (i.e. breached) e.g. the level of measurement (e.g., interval or ratio data), normal distribution, and homogeneity of variances across groups
  • It is not always possible to correct the distribution of a data set
  • In these cases we have to use non‐parametric tests
  • They make fewer assumptions about the type of data on which they can be used
  • Many of these tests use “ranked” data
22
Q

But what if my data are independent but nonparametric?

A

• Mann‐Whitney U test e.g. income ranks between teams

23
Q

What is the Mann-Whitney U test?

A
  • It is used to test the null hypothesis that two samples come from the same population (i.e. have the same median)
  • or, alternatively, whether observations in one sample tend to be larger than observations in the other
24
Q

What are the assumptions of the Mann-Whitney U test?

A

• The Wilcoxon rank-sum test (also known as the Mann-Whitney U) is similar to the two independent samples t-test
• Data must meet the requirement that the two samples are independent
• The Mann-Whitney procedure uses ranks instead of the raw data values
• Data values are assigned ranks relative to both samples combined
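The ranking step can be sketched like this: both samples are pooled, ranked together (ties get the average rank), and U is derived from the rank sum of one sample. The helper names and data are hypothetical, and the sketch stops at the U statistics rather than a p-value.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """Return (U_a, U_b) using ranks assigned over both samples combined."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return u_a, u_b

# Made-up scores for two independent groups
u_a, u_b = mann_whitney_u([3, 5, 8, 9], [1, 2, 4, 6])
print(u_a, u_b)
```

Here `u_a` equals the number of (a, b) pairs in which the value from the first group is larger, which matches the idea that observations in one sample tend to be larger than in the other.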

25
Q

When should a Mann-Whitney U test be used?

A
  • The sample sizes are small and normality is questionable.
  • The data contain outliers or extreme values that, because of their magnitude, distort the mean values and affect the outcome of the comparison.
  • The data are ordinal
  • Assumes distributions of two groups being compared are the same shape
  • Assumes not too many ties in ranks of data
26
Q

What is used for paired non-parametric?

A

• The Wilcoxon Signed‐Rank test or sign test
• Can use interval, ratio or ordinal data
• Null hypothesis is the same as for the Mann-Whitney U test, but for paired data

27
Q

Explain the Wilcoxon signed-rank test

A
  • The sign test can be used on the differences between paired values as a nonparametric alternative to the one sample t-test
  • The Wilcoxon Signed‐Rank test can be used to compare paired data as nonparametric alternatives to the paired t‐test
  • These tests are used when you cannot justify a normality assumption for the differences
  • The sign test is very simple in that it counts the number of differences that are positive (+) and those that are negative (‐) and makes a decision based on these counts
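The counting logic of the sign test can be sketched as below, assuming a two-sided test with an exact binomial p-value and with zero differences dropped (a common convention, stated here as an assumption). The function name and the before/after values are made up.

```python
from math import comb

def sign_test(before, after):
    """Two-sided exact sign test. Returns (n_positive, n_negative, p_value)."""
    diffs = [a - b for b, a in zip(before, after)]
    pos = sum(d > 0 for d in diffs)
    neg = sum(d < 0 for d in diffs)
    n = pos + neg                 # zero differences are dropped
    k = min(pos, neg)
    # Under H0 each sign is equally likely, so the count follows Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return pos, neg, min(1.0, 2 * tail)

# Made-up paired measurements for eight people
before = [10, 12, 9, 11, 13, 10, 12, 11]
after = [12, 13, 11, 12, 15, 9, 14, 11]

pos, neg, p_value = sign_test(before, after)
print(pos, neg, p_value)
```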
28
Q

Give examples of paired and independent data?

A

Paired: cycling rate difference between people within a company
Independent: comparing cycling rates across companies

29
Q

Explain the process of hypothesis testing

A

All statistical tests share a common structure:
• Set up a null hypothesis
• Establish a level of statistical significance (also known as alpha (α), usually set at 5% or 1%)
• Determine the statistical significance of the findings
• Reject or fail to reject the null hypothesis
• SPSS output provides a p-value (probability value)
• If the p-value is greater than the alpha you cannot reject the null hypothesis

30
Q

Explain the p-values

A

• The p value quantifies the chance of observing such a value of the test statistic (or one more extreme) if the null hypothesis were actually true
• If you set an alpha level of .05 (5%), you reject H0 and accept HA when p is no more than .05
• If H0 is in fact true, this leaves up to a 5% chance that you wrongly conclude there is a difference (making a Type 1 error)
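The decision rule can be written as a tiny function. This is only a sketch: the name `decide` and the example p-values are hypothetical, and the comparison uses "no more than alpha" as on this card.

```python
def decide(p_value, alpha=0.05):
    """Return the decision for a test at the chosen alpha level."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    return "reject H0" if p_value <= alpha else "fail to reject H0"

print(decide(0.03))              # significant at the 5% level
print(decide(0.20))              # not significant at the 5% level
print(decide(0.03, alpha=0.01))  # the same p-value is not significant at 1%
```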

31
Q

What are type 1 and 2 errors?

A
  • Type I error: rejecting a null hypothesis when it is true (false positive), e.g. telling a man he is pregnant. Its probability is the significance level of the test, usually 5% or 1%, decided before the test is conducted
  • Type II error: failing to reject a null hypothesis that is false (false negative), e.g. telling a pregnant woman she is not pregnant
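A small simulation can show what the Type I error rate means in practice. The sketch below draws both groups from the same population (so H0 is true) and counts how often a test still comes out significant. To stay within the Python standard library it swaps in a z-test with a known standard deviation instead of a t-test; the function names, sample sizes and seed are arbitrary choices.

```python
import random
from statistics import NormalDist, mean

def z_test_p(a, b, sigma):
    """Two-sided p-value for a difference in means when the SD is known (z-test)."""
    se = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(trials=2000, n=20, alpha=0.05, seed=1):
    """Share of trials that reject H0 even though both groups share one population."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        if z_test_p(a, b, sigma=1.0) <= alpha:
            hits += 1  # a Type I error: H0 is true but was rejected
    return hits / trials

rate = false_positive_rate()
print(f"False positive rate under H0: {rate:.3f}")
```

The printed rate should land close to the chosen alpha of 0.05, although the exact figure depends on the seed and number of trials.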
32
Q

What are some limitations of independent t-tests?

A

• Doesn't take into account the impact of population size on the chances of a Type 1 error

33
Q

What does Levene's test check?

A

Homogeneity of variances

34
Q

What does 0.000 tend to mean in SPSS?

A

p < 0.001 (the value is too small to display at three decimal places, not exactly zero)