(Quantitative): Comparing groups: continuous variables Flashcards by Toby de Gruchy

What is statistical data analysis?

• Organise and analyse the data
• Common procedures used in analysis:
  - Descriptive statistics
  - Inferential statistics (more of a focus this term)
• Need to get data into shape

What does data analysis need to do/have?

Needs to have a purpose
Describe
Compare
Examine similarities
Examine differences

What is descriptive statistics (recap from last semester)?

Check for errors and outliers
Describe and summarise the data
Examine the spread of the data
Ensure the planned analysis is appropriate
Decide whether the data are parametric or non-parametric

What ways can data be summarised (ratio or interval)?

• Measures of central tendency
  - Mean, median, mode
  - If the data are not normally distributed, use the median
• Measures of dispersion
  - Variance, range, standard deviation
• Normal curve, skewness, kurtosis
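
As a rough illustration of these summary measures, the sketch below uses Python's standard `statistics` module on a made-up sample (the heart-rate values are hypothetical, not from the course). The single extreme value pulls the mean above the median, which is why the median is preferred for non-normal data.

```python
import statistics

# Hypothetical sample: resting heart rates (beats per minute) for 9 people.
# The value 110 is a deliberate outlier.
scores = [58, 62, 62, 65, 67, 70, 71, 74, 110]

# Measures of central tendency
mean = statistics.mean(scores)      # pulled upwards by the outlier
median = statistics.median(scores)  # middle value, resistant to the outlier
mode = statistics.mode(scores)      # most frequent value

# Measures of dispersion
value_range = max(scores) - min(scores)
variance = statistics.variance(scores)  # sample variance (n - 1 denominator)
sd = statistics.stdev(scores)           # sample standard deviation

print(mean, median, mode, value_range, round(variance, 2), round(sd, 2))
```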

What are inferential statistics?

All statistical tests share a common structure:
• Set up a null and an alternative hypothesis
• Establish a level of statistical significance (alpha (α), usually set at 5% or 1%, depending on the study)
• Determine the statistical significance of the findings (the p-value)
• Reject or fail to reject the null hypothesis (e.g. is there a difference between two groups?)
- SPSS output provides a p-value (probability value)
- If the p-value is greater than alpha you cannot reject the null hypothesis
Note: a result may be statistically significant but not practically meaningful, e.g. a 1-second difference in a marathon
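
The decision rule can be sketched as a small Python function. The name `decide` and the example p-values are hypothetical; the only rule encoded is the comparison of p with alpha.

```python
def decide(p_value, alpha=0.05):
    """Return the decision for a hypothesis test at significance level alpha."""
    if p_value <= alpha:
        return "reject H0"
    # A large p-value does not prove H0; we simply lack evidence against it.
    return "fail to reject H0"

print(decide(0.03))              # significant at the 5% level
print(decide(0.20))              # not significant at the 5% level
print(decide(0.03, alpha=0.01))  # same p-value, stricter alpha
```

Note that the same p-value can lead to different decisions depending on the alpha chosen before the analysis.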

What are the steps when undertaking a hypothesis test?

Define study question
Set null and alternative hypothesis
Calculate a test statistic
Calculate a p value
Make a decision and interpret

What is Student's t-test?

The t‐test is used to compare means between groups
t‐test is easy to use but can be easily misused
Most common statistical procedure used by researchers

What are the two types of t-test?

The same principles lie behind each, but there is more random error in the …
• INDEPENDENT SAMPLES DESIGN, because the control group might, by chance, be very different from the treatment group
• With the PAIRED DESIGN, each person is their own control, so variation is limited

What is the process in the decision chart for bivariate data?

Paired data -> paired samples t-test -> Wilcoxon signed-rank test (if non-parametric)
Independent data -> independent samples t-test -> Mann-Whitney U test (if non-parametric)

What is independent data?

• Data comes from different (independent) groups of people
• E.g. a classic experiment (Group 1 receives intervention A, Group 2 receives intervention B)
• Study participant is in one group only
• Compare differences between groups (mean or median)

What is paired data?

• Data comes from one group of individuals
• Data collected from an individual at different points in time or under different conditions
• Compare differences in outcome between time 1 and time 2 or condition 1 and 2 (mean or median)
• Other terms: repeated measures, before and after study
e.g. cycling speed with different helmets

Explain the independent samples design

• Dependent Variable is ratio/interval and Independent Variable has two categories
• Measurements in condition 1 are independent of measurements in condition 2
• If H0 is true we expect the difference between the mean of the condition 1 group and the mean of the condition 2 group to be zero
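
A minimal sketch of how the independent samples t statistic is built from the difference in group means, assuming equal variances (the pooled form). The function name `independent_t` and the scores are hypothetical; SPSS reports the same quantity along with an exact p-value.

```python
import math
import statistics

def independent_t(group1, group2):
    """Student's independent samples t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes equal population variances.
    """
    n1, n2 = len(group1), len(group2)
    mean_diff = statistics.mean(group1) - statistics.mean(group2)
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return mean_diff / standard_error, n1 + n2 - 2

# Hypothetical outcome scores for two separate groups of six people.
intervention_a = [12, 14, 15, 16, 18, 21]
intervention_b = [9, 10, 12, 13, 14, 14]

t, df = independent_t(intervention_a, intervention_b)

# 2.228 is the two-tailed 5% critical value of t for df = 10 (from t tables).
significant = abs(t) > 2.228
print(round(t, 3), df, significant)
```

If H0 were true, the mean difference (and so t) would be expected to sit near zero; here t is far enough from zero to fall in the rejection region.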

Explain the issue of error in an independent t-test

• The act of using a sample will introduce error
- What is the probability that a difference as large as the one we found would occur by chance if H0 were true?
- If less than 5%, reject H0 and accept HA (also written H1, the alternative hypothesis)

Explain the sampling distribution of the independent samples t-test?

• The t-distribution has a shape similar to the normal curve
• The middle is the population parameter when H0 is true (i.e. the mean difference is 0)
• Around it are all the possible sample statistics
• Is 'our' difference so big that it would only rarely happen by chance?
- Rarest 2.5% in each tail (5% in total)

What are the assumptions for the independent samples t-test?

Dependent Variable is ratio/interval
If either group is small (30 or less), distribution of Dependent Variable for each group should not be badly skewed
The variance of the Dependent Variable for the two groups should not be very different

How is a problematic difference indicated in an independent t-test?

• A problematic difference in variances is indicated by a significant Levene’s Test:

If significant, interpret the p value associated with ‘equal variances not assumed’
If non‐significant, interpret p value associated with ‘equal variances assumed’
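
The 'equal variances not assumed' row corresponds to Welch's version of the t-test, which does not pool the variances and adjusts the degrees of freedom downwards. The sketch below shows that calculation only; it does not perform Levene's test itself. The function name `welch_t` and the data are hypothetical.

```python
import math
import statistics

def welch_t(group1, group2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Does not assume equal variances in the two groups.
    """
    n1, n2 = len(group1), len(group2)
    v1 = statistics.variance(group1) / n1  # squared standard error, group 1
    v2 = statistics.variance(group2) / n2  # squared standard error, group 2
    t = (statistics.mean(group1) - statistics.mean(group2)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical scores; group A is more spread out than group B.
group_a = [12, 14, 15, 16, 18, 21]
group_b = [9, 10, 12, 13, 14, 14]

t, df = welch_t(group_a, group_b)
print(round(t, 3), round(df, 2))
```

With equal group sizes the t value matches the pooled version, but the degrees of freedom drop below n1 + n2 - 2, which makes the test slightly more conservative.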

What about paired data that is normal? (check)

Observations are not independent
Paired equivalent: paired samples t-test (same assumptions)
H0: No difference in the means before and after
H1: A difference in the means before and after

Explain the paired samples t-test design

Dependent Variable is ratio/interval and Independent Variable has two categories
Each measurement in Condition 1 (performance with caffeine) has a match in Condition 2 (performance with water)
One measurement is subtracted from the other so that each case has a difference score
If the null hypothesis (H0) is true and there is no difference in performance with e.g. caffeine and with water, we would expect the group mean difference score to be 0
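
The difference-score logic can be sketched as follows. The performance scores are made up for illustration, and the critical value is taken from standard t tables.

```python
import math
import statistics

# Hypothetical performance scores for the same 8 people under each condition.
caffeine = [31, 29, 34, 30, 33, 35, 28, 32]
water = [29, 28, 31, 30, 30, 33, 27, 30]

# One difference score per person (caffeine minus water).
differences = [c - w for c, w in zip(caffeine, water)]

n = len(differences)
mean_diff = statistics.mean(differences)
standard_error = statistics.stdev(differences) / math.sqrt(n)

# Under H0 the mean difference score is 0, so t measures distance from 0.
t = mean_diff / standard_error
df = n - 1

# 2.365 is the two-tailed 5% critical value of t for df = 7 (from t tables).
significant = abs(t) > 2.365
print(mean_diff, round(t, 3), df, significant)
```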

Explain the issue of error in a paired-samples t-test

• The act of using a sample will introduce error
• What is the probability that a difference as large as the one we found would occur by chance if H0 were true?
• If less than 5%, reject H0 and accept HA (also written H1, the alternative hypothesis)

When would non-parametric equivalents to a t-test be used?

If we have an ordinal-scale Dependent Variable, or a ratio/interval Dependent Variable that does not meet parametric assumptions, we use non-parametric equivalents
These compare medians (ranks), which are not affected by extreme values, rather than means
They are usually less powerful

When should non-parametric tests be used?

Used when the assumptions of parametric tests are not met (i.e. breached), e.g. the level of measurement (interval or ratio data), normal distribution, and homogeneity of variances across groups
It is not always possible to correct the distribution of a data set
In these cases we have to use non‐parametric tests
They make fewer assumptions about the type of data on which they can be used
Many of these tests use “ranked” data

But what if my data are independent but nonparametric?

• Mann‐Whitney U test e.g. income ranks between teams

What is the Mann-Whitney U test?

It is used to test the null hypothesis that two samples come from the same population (i.e. have the same median)
or, alternatively, whether observations in one sample tend to be larger than observations in the other

What are the assumptions of the Mann-Whitney U test?

• The Wilcoxon rank-sum test (also known as the Mann-Whitney U) is similar to the two independent samples t-test
• Data must meet the requirement that the two samples are independent
• The Mann-Whitney procedure uses ranks instead of the raw data values
• Data values are assigned ranks relative to both samples combined
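
To make the ranking step concrete, here is a minimal sketch that ranks the combined samples and derives the U statistics from the rank sum of the first sample. The helper names and the income figures are hypothetical, and no p-value is computed.

```python
def rank_values(values):
    """Rank values from 1 (smallest), giving tied values their average rank."""
    ordered = sorted(values)
    ranks = {}
    for v in set(ordered):
        first = ordered.index(v) + 1        # rank of the first occurrence
        count = ordered.count(v)            # number of tied values
        ranks[v] = first + (count - 1) / 2  # average rank across the ties
    return [ranks[v] for v in values]

def mann_whitney_u(sample1, sample2):
    """Return (U1, U2), based on ranks assigned over both samples combined."""
    n1, n2 = len(sample1), len(sample2)
    ranks = rank_values(list(sample1) + list(sample2))
    rank_sum_1 = sum(ranks[:n1])
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    return u1, u2

# Hypothetical incomes (in thousands) for members of two teams.
team_a = [21, 25, 28, 30, 45]
team_b = [18, 19, 22, 24, 26]

u1, u2 = mann_whitney_u(team_a, team_b)
u = min(u1, u2)  # the smaller value is compared against tables of U
print(u1, u2, u)
```

Here U1 counts the pairs in which a team A value exceeds a team B value, so a very small or very large U suggests one sample tends to be larger than the other. The extreme value 45 only contributes its rank, not its magnitude.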

When should a Mann-Whitney U test be used?

• The sample sizes are small and normality is questionable
• The data contain outliers or extreme values that, because of their magnitude, distort the mean values and affect the outcome of the comparison
• The data are ordinal
• Assumes the distributions of the two groups being compared are the same shape
• Assumes not too many ties in the ranks of the data

What is used for paired non-parametric?

• The Wilcoxon signed-rank test or the sign test
• Can use interval, ratio or ordinal data
• Null hypothesis is the same as for the Mann-Whitney U test, but for paired data

Explain the Wilcoxon signed-rank test

• The sign test can be used to test differences as a non-parametric alternative to the one-sample t-test
• The Wilcoxon signed-rank test can be used to compare paired data as a non-parametric alternative to the paired t-test
• These tests are used when you cannot justify a normality assumption for the differences
• The sign test is very simple in that it counts the number of differences that are positive (+) and those that are negative (-) and makes a decision based on these counts
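
The counting logic of the sign test can be sketched as below. This is an illustrative implementation using an exact two-sided binomial p-value; the function name `sign_test` and the scores are hypothetical, and zero differences are dropped, which is a common convention.

```python
import math

def sign_test(condition1, condition2):
    """Two-sided exact sign test for paired data.

    Returns (positives, negatives, p_value). Zero differences are ignored.
    """
    differences = [a - b for a, b in zip(condition1, condition2)]
    positives = sum(d > 0 for d in differences)
    negatives = sum(d < 0 for d in differences)
    n = positives + negatives
    k = min(positives, negatives)
    # Under H0 each non-zero difference is + or - with probability 0.5.
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return positives, negatives, min(1.0, 2 * tail)

# Hypothetical paired scores for the same 8 people under two conditions.
caffeine = [31, 29, 34, 30, 33, 35, 28, 32]
water = [29, 28, 31, 30, 30, 33, 27, 30]

positives, negatives, p_value = sign_test(caffeine, water)
print(positives, negatives, p_value)
```

Only the direction of each difference is used, not its size, which is why the sign test is usually less powerful than the Wilcoxon signed-rank test.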

Give examples of paired and independent data?

Paired: cycling rate differences between people within a company
Independent: comparing cycling rates across companies

Explain the process of hypothesis testing

All statistical tests share a common structure:
• Set up a null hypothesis
• Establish a level of statistical significance (alpha (α), usually set at 5% or 1%)
• Determine the statistical significance of the findings
• Reject or fail to reject the null hypothesis
• SPSS output provides a p-value (probability value)
• If the p-value is greater than alpha you cannot reject the null hypothesis

Explain the p-values

• The p-value quantifies the chance of observing such a value of the test statistic (or one more extreme) if the null hypothesis were actually true
• If you set an alpha level of .05 (5%), you reject H0 and accept HA when p is no more than .05
• This leaves up to a 5% chance that you are wrong in concluding that there is a difference (making a Type I error)

What are type 1 and 2 errors?

• A Type I error is the rejection of a null hypothesis that is actually true (a false positive), e.g. telling a man he is pregnant. Its probability is the significance level of the test, usually 5% or 1%, and is decided before the test is conducted.
• A Type II error is the failure to reject a null hypothesis that is actually false (a false negative), e.g. telling a pregnant woman she is not pregnant.
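
A small simulation can illustrate why alpha is the Type I error rate. The sketch below is a simplification: it uses a z-test with a known standard deviation of 1 rather than a t-test, and the sample size, number of simulated studies and random seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the run is repeatable

ALPHA = 0.05
N_PER_GROUP = 20
N_STUDIES = 2000
critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)  # about 1.96, two-tailed

false_positives = 0
for _ in range(N_STUDIES):
    # Both groups come from the same population, so H0 is true by design.
    group1 = [random.gauss(0, 1) for _ in range(N_PER_GROUP)]
    group2 = [random.gauss(0, 1) for _ in range(N_PER_GROUP)]
    # z-test for a difference in means with known SD = 1 in each group.
    z = (mean(group1) - mean(group2)) / math.sqrt(2 / N_PER_GROUP)
    if abs(z) > critical_z:
        false_positives += 1  # rejected a true H0: a Type I error

type_1_rate = false_positives / N_STUDIES
print(type_1_rate)
```

The proportion of simulated studies that wrongly reject H0 should land close to 5%.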

What are some limitations of independent t-tests?

• Doesn't take into account the impact of population size on the chances of a Type I error

What does Levene's test check?

Homogeneity of variances

What does 0.000 tend to mean in SPSS?

p < 0.001 (the value is rounded, not exactly zero)

(34 cards)