SPSS Flashcards

1
Q

What is a continuous variable?

A

Arising from measurements (e.g. height)

2
Q

What is a discrete variable?

A

Arising from counting (e.g. number of books on a bookshelf)

3
Q

What is a nominal/categorical variable?

A

Having no natural order

4
Q

What is an ordinal variable?

A

Having natural order

5
Q

What is simple random sampling?

A

Probability sampling: each member of the population has an equal chance of being selected.

6
Q

What is systematic sampling?

A

Probability sampling: every nth subject from a population list is chosen.

7
Q

What is stratified random sampling?

A

Probability sampling: the population is split into groups (strata) of similar individuals, and a random sample is drawn from each group.
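
The random, systematic and stratified schemes described so far can be sketched in a few lines. This is a minimal illustration, not an SPSS procedure: the population of 100 numbered subjects, the two strata and the function names are all assumptions made for the example.

```python
import random

def simple_random_sample(population, k, rng):
    """Every member has the same chance of being selected."""
    return rng.sample(population, k)

def systematic_sample(population, step, rng):
    """Take every step-th member of the list after a random start."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, k_per_stratum, rng):
    """Draw a simple random sample from within each stratum."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

rng = random.Random(1)  # fixed seed so the sketch is repeatable
population = list(range(100))  # hypothetical subject IDs 0..99
strata = {"low": population[:50], "high": population[50:]}  # hypothetical strata

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(strata, 5, rng)
```

Each call returns 10 subjects in total, but only the stratified version guarantees that both groups are represented.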

8
Q

What is disproportionate sampling?

A

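Probability sampling: strata are sampled at different rates. Used if the strata in the population are of substantially unequal size, so that small strata are still adequately represented.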

9
Q

What is cluster sampling?

A

Probability sampling: successive random sampling of a series of units (clusters) in a population.

10
Q

What is convenience sampling?

A

Non-probability sampling: subjects are chosen based on availability.

11
Q

What is quota sampling?

A

Researcher guides sampling process until participant quota is met (e.g. volunteers called for until equal quota of males/females is met)

12
Q

What is purposive sampling?

A

Subjects are hand-picked based on certain criteria

13
Q

What is snowball sampling?

A

Used when desired characteristics are rare. Initial subjects refer others with similar characteristics

14
Q

What happens to your accuracy if you quadruple your sample size?

A

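It doubles: the standard error is proportional to 1/√n, so quadrupling the sample size halves the standard error.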
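
The arithmetic behind this answer can be checked directly. A minimal sketch, assuming the usual formula SE = SD / √n; the SD of 10 and the sample sizes of 25 and 100 are made-up values for illustration.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean for a sample of size n."""
    return sd / math.sqrt(n)

sd = 10.0                        # hypothetical standard deviation
se_n = standard_error(sd, 25)    # original sample size
se_4n = standard_error(sd, 100)  # four times the sample size
ratio = se_n / se_4n             # how many times more precise the larger sample is
```

Here the standard error falls from 2.0 to 1.0, so the estimate is twice as precise.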

15
Q

Name 7 types of experimental designs

A

RCT, blind study, crossover design (each subject acts as their own control, but the order of treatments is randomised), factorial design (several factors compared at once), outcome variables, quasi-experimental design (often used when the independent variable in question is an innate characteristic of the participants involved), single-subject study

16
Q

Name 9 types of observational designs

A

Retrospective, prospective, surveys and polls, observation, longitudinal cohort studies, case-control study, cross-sectional study, case reports, questionnaires.

17
Q

What is dichotomous survey questioning?

A

Two possible answers (e.g. yes/no or agree/disagree)

18
Q

What is a Likert scale in surveys?

A

Respondents rate their agreement on an ordered scale; 3-5 response categories are usually provided

19
Q

What is a visual analogue scale in surveys?

A

Results measured along a continuum

20
Q

Why can histograms be subjective?

A

Their appearance depends on the number of bins chosen

21
Q

What is the interquartile range?

A
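The spread of the middle 50% of the data: IQR = Q3 - Q1 (3rd quartile minus 1st quartile). The quartiles come from the five-number summary:
Minimum
0.25 quantile = 1st quartile (Q1)
Median = 2nd quartile (Q2)
0.75 quantile = 3rd quartile (Q3)
Maximum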
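
A small worked example may help here. This is a sketch using Python's standard library rather than SPSS; the data are made up, and because quartile conventions differ between packages, SPSS may report slightly different quartiles for the same data.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical measurements

# n=4 splits the data into quarters; "inclusive" treats the data as
# containing the minimum and maximum of the population.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1  # spread of the middle 50% of the data
five_number_summary = (min(data), q1, median, q3, max(data))
```

For these data the quartiles are 3, 5 and 7, so the interquartile range is 4.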
22
Q

Which chart/graph best displays the interquartile range?

A

Boxplots

23
Q

What type of data best suits bar chart?

A

Categorical variables

24
Q

What is the standard deviation?

A

A measure of how far values typically deviate from the mean

25
Q

What is the formula for degrees of freedom?

A

n - 1 (for a single sample of size n)

26
Q

What is the standard error?

A

How far the sample mean is expected to deviate from the population mean. It is estimated as SD / √n.

27
Q

What is a parameter?

A

A numerical characteristic of a population

28
Q

What is a statistic?

A

A numerical characteristic of a sample (e.g. mean, SD)

29
Q

What are the confidence intervals associated with a normal distribution?

A

About 68% of values lie within 1 SD of the mean
About 95% of values lie within 2 SD of the mean
About 99.7% of values lie within 3 SD of the mean
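
These percentages can be reproduced from the Normal distribution itself. A minimal sketch using the standard identity P(|Z| < k) = erf(k / √2) for a standard Normal variable Z; the function name is chosen for this example.

```python
import math

def prob_within(k):
    """Probability that a Normal value lies within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

coverage = {k: prob_within(k) for k in (1, 2, 3)}
```

The results are approximately 0.683, 0.954 and 0.997, which is where the 68%, 95% and 99.7% figures come from.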

30
Q

What is the central limit theorem?

A

The Central Limit Theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size gets bigger, no matter what the shape of the population distribution
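
A quick simulation can make the theorem concrete. This is a sketch under stated assumptions: the population is an exponential distribution with mean 1 and SD 1 (strongly right-skewed), the sample size is 50, and the seed and the 2000 repetitions are arbitrary choices for the example.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def sample_mean(n):
    """Mean of n draws from a skewed Exponential(1) population."""
    return statistics.fmean([rng.expovariate(1.0) for _ in range(n)])

n = 50
means = [sample_mean(n) for _ in range(2000)]

mean_of_means = statistics.fmean(means)  # should be close to the population mean, 1
sd_of_means = statistics.stdev(means)    # should be close to 1 / sqrt(n)
```

Although the individual values are heavily skewed, the sample means cluster symmetrically around 1 with a spread close to 1/√50 ≈ 0.14.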

31
Q

What is a Type 1 Error?

A

Null hypothesis is rejected when it is actually true (False positive)

32
Q

What is the relationship between Type 1 Error and the P-Value?

A

The probability of making a Type 1 error is the significance level (alpha, e.g. 0.05) against which we compare the p-value

33
Q

What is a Type II Error?

A

Null hypothesis is not rejected when it is actually false (False negative)

34
Q

What is Power?

A

The probability of detecting an effect when there is indeed an effect

35
Q

How can power be improved?

A

Increasing effect size, decreasing variability, increasing sample size, or raising the significance level (but this increases the Type I error rate)
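
The direction of each of these effects can be checked numerically. A rough sketch, assuming a two-sided one-sample z-test with known SD (simpler than the t-tests used in practice); `approx_power` and the numbers passed to it are illustrative only. The effect size is expressed in SD units, so decreasing variability has the same effect as increasing the effect size.

```python
from statistics import NormalDist

def approx_power(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test.

    effect_size is the true shift in the mean, in SD units.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    shift = effect_size * n ** 0.5     # how far the true mean moves the test statistic
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

baseline = approx_power(0.5, 30)
bigger_effect = approx_power(0.8, 30)             # larger effect, or smaller SD
bigger_sample = approx_power(0.5, 60)             # larger sample
looser_alpha = approx_power(0.5, 30, alpha=0.10)  # higher significance level
```

All three changes give a power above the baseline of roughly 0.78.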

36
Q

What is a Parametric test?

A

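Tests a hypothesis about a parameter of your population (e.g. the mean), assuming the data follow a particular distribution (usually Normal)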

37
Q

What is a Non-Parametric Test?

A

Looks at some comparison between groups, such as comparing the “ranks” of values instead of the values themselves.

38
Q

What are the three Parametric test assumptions?

A
  1. Normality: Data have normal distribution
  2. Homogeneity of variances: Data from multiple groups have the same variance
  3. Independence: Data are independent
39
Q

What does a p-Value of <0.05 for Levene’s test tell you?

A

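That there is evidence the variances are not equal, so the homogeneity of variances assumption is violated. Use a test that does not assume equal variances (e.g. Welch's) or a non-parametric alternative.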

40
Q

What does a p-Value of >0.05 for Levene’s test tell you?

A

That there is no significant evidence that the variances differ, so equal variances can be assumed.

41
Q

What is the purpose of a t-test?

A

To compare the means between two independent groups on the same, continuous, dependent variable.

42
Q

What is the NULL hypothesis for a t test?

A

That the difference between the two means is zero
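
To see how the test statistic relates to this null hypothesis, here is a minimal sketch of the independent-samples t statistic, assuming equal variances (the pooled form). The two groups are made-up data and `pooled_t` is a name chosen for this example; SPSS additionally reports the p-value.

```python
import math
import statistics

def pooled_t(a, b):
    """Independent-samples t statistic assuming equal variances."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    # Pooled variance: the two sample variances weighted by their df
    pooled_var = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, na + nb - 2  # t statistic and its degrees of freedom

group_a = [1, 2, 3, 4, 5]  # hypothetical scores
group_b = [3, 4, 5, 6, 7]  # hypothetical scores
t, df = pooled_t(group_a, group_b)
```

The means differ by 2 and the standard error of that difference is 1, giving t = -2 on 8 degrees of freedom. A t value near zero would be consistent with the null hypothesis.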

43
Q

What type of data do you need to run a t-test?

A

One independent categorical variable and one continuous, dependent variable

44
Q

What is the non-parametric equivalent of the independent t-test?

A

Mann-Whitney U Test

45
Q

What does ANOVA do?

A

Tests whether the means of several groups differ

46
Q

What type of data does ANOVA require?

A

One categorical, independent variable and one dependent continuous variable

47
Q

What is the Null hypothesis for ANOVA?

A

That there is no difference in the means of the groups

48
Q

What is the F value in ANOVA?

A

The variability between the means / variability within the sample. i.e. Is the variability between group means larger than the variability of the observations within the groups
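
The ratio can be computed by hand for a small data set. A minimal sketch of the one-way ANOVA F statistic; the three groups are made-up data and `one_way_f` is a name chosen for this example.

```python
import statistics

def one_way_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.fmean(all_values)
    k, n = len(groups), len(all_values)

    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2
                     for g in groups)
    # Variability of the observations around their own group mean
    ss_within = sum((x - statistics.fmean(g)) ** 2
                    for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical data
f_value = one_way_f(groups)
```

Here the between-group mean square is 13 and the within-group mean square is 1, so F = 13.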

49
Q

What does a large F value signify?

A

That the variance between the groups is larger than the variance within the groups. A high F value means that your data do not support your null hypothesis well

50
Q

What test do you use to determine more specific difference between groups?

A

Tukey post-hoc analysis

51
Q

What is the non-parametric equivalent of ANOVA?

A

Kruskal-Wallis Test

52
Q

What is a residual?

A

The difference between an observed response and the value predicted for the response by our model

53
Q

How do you calculate a residual value?

A

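The residual for an observation is the difference between the observed value and the value predicted by the model (observed - predicted). In a model with no predictors, the predicted value is simply the mean.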

54
Q

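What are the residual degrees of freedom?

A

n - 2 for simple linear regression (two parameters are estimated: the intercept and the slope)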

55
Q

What is another name for a residual?

A

Prediction error

56
Q

What does a high correlation coefficient signify?

A

A likely (strong linear) association between the two variables

57
Q

What is the R value?

A

Pearson correlation coefficient

58
Q

What is the R-squared value?

A

Statistical measure of how close the data are to the fitted regression line. e.g. if R² = 0.97, this means that 97% of the variance in the dependent variable is explained by the independent variable.

59
Q

What is the Null hypothesis of regression?

A

That the underlying slope equals zero

60
Q

What does the P-Value signify in regression?

A

The probability of observing an association at least as strong as the one in the sample if there were truly no association

61
Q

What are the assumptions for Linear regression?

A

Independent observations
Linear association
Normal variability (the residuals are Normally distributed)
Equal variances (the spread of the residuals is constant)

62
Q

What does an ANOVA regression p-value <0.05 tell us?

A

There is strong evidence against the null hypothesis of 0 slope

63
Q

How do you calculate the sample size for an ANOVA from SPSS?

A

TOTAL df + 1

64
Q

Can you use ANOVA for linear association?

A

Yes. The ANOVA ideas extend from comparing means to testing for linear association

65
Q

Which test would you immediately think of if you saw the terms “relationship between” in the question?

A

Correlation

66
Q

Which test would you immediately think of if you were asked to compare means of two groups or one group and 2 variables?

A

t-test

67
Q

Which test would you immediately think of if you were asked to compare the means of more than two groups or multiple variables?

A

ANOVA

68
Q

When would you use a Welch’s ANOVA?

A

When you have normally distributed data that violates the assumption of homogeneity of variance

69
Q

What is the nonparametric equivalent of Pearson’s Correlation?

A

Spearman's rank correlation (for two categorical variables, use Chi Squared instead)
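
The difference between the two coefficients is easiest to see on data that rise steadily but not in a straight line. A minimal sketch, assuming no tied values (ties need averaged ranks, which this simplified `ranks` helper does not handle); the data and function names are made up for the example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks 1..n of the values; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # increases with x, but along a curve

r = pearson_r(x, y)
rho = spearman_rho(x, y)
```

Spearman's coefficient is exactly 1 because the ordering of the values agrees perfectly, while Pearson's is about 0.98 because the relationship is not a straight line.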

70
Q

What is the nonparametric equivalent of the dependent t-test?

A

Wilcoxon Signed Rank Test

71
Q

Which test would you use for categorical outcome?

A

Chi Squared

72
Q

What test would you use for multiple variable comparison in two or more groups?

A

MANOVA

73
Q

What are some limitations of Pearson Coefficient?

A

Sensitive to the presence of outliers
Only measures linear association (misleading if the plot is curved)
Limited range of scores will limit generalisation
Does not imply cause

74
Q

What are confounding variables?

A

“lurking” variables which may be influencing the two variables of interest

75
Q

What type of test would you perform when you have a scale (response) and Nominal (predictor) variable?

A

ANOVA, Independent Samples t-test, Mann-Whitney U test, Kruskal-Wallis test

76
Q

What kind of test would you perform when you have a Nominal (response) and Nominal (predictor) variable?

A

Chi Squared

77
Q

What kind of test would you perform when you have a scale (response) and scale (predictor) variable?

A

Regression (ANOVA F-test or Coefficient t-test), Pearson correlation t-test, Spearman correlation

78
Q

What kind of test would you perform if you had a scale (response) variable and no predictor variable?

A

One Sample t-test or Paired t-test

79
Q

What is the Null hypothesis for Chi Squared?

A

The distribution of the outcome is the same across the x groups (the two variables are independent)

80
Q

What is the purpose of Chi Square?

A

To compare the observed counts with the counts we would expect if the Null hypothesis were true.
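
The comparison of observed and expected counts can be worked through on a small table. A minimal sketch of the Pearson chi-squared statistic for a contingency table; the 2 x 2 counts are made up, and SPSS would additionally report the degrees of freedom and p-value.

```python
def chi_squared(observed):
    """Chi-squared statistic and expected counts for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    # Expected count under the null: row total * column total / grand total
    expected = [[r * c / total for c in col_totals] for r in row_totals]

    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))
    return stat, expected

observed = [[10, 20],
            [30, 40]]  # hypothetical counts
stat, expected = chi_squared(observed)
```

The expected counts are 12, 18, 28 and 42, which are close to the observed counts, so the statistic is small (about 0.79).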

81
Q

What is inter-rater reliability?

A

The degree to which ratings given by different observers agree

82
Q

What is intra-rater reliability?

A

The degree to which ratings given by the same observer on different occasions agree

83
Q

How do you measure intra/inter-rater reliability, taking chance into account?

A

Cohen's Kappa (κ)

84
Q

What does the kappa value tell you?

A

Measure of agreement corrected for chance: the proportion of agreement beyond what would be expected by chance alone, κ = (observed agreement - chance agreement) / (1 - chance agreement)
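
A worked example shows how the chance correction operates. A minimal sketch of Cohen's kappa for two raters; the ratings are made up and `cohens_kappa` is a name chosen for this example.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    # Proportion of cases where the two raters gave the same rating
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)

rater_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "no"]
rater_2 = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "no"]
kappa = cohens_kappa(rater_1, rater_2)
```

The raters agree on 8 of 10 cases (0.8), but 0.5 agreement would be expected by chance, so kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6.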

85
Q

What is an acceptable k score?

A

0.4 and above

86
Q

How do you measure if a scale is internally consistent?

A

Cronbach’s alpha (α)
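
For reference, the coefficient can be computed from the item variances. A minimal sketch using the standard formula α = k / (k - 1) × (1 - sum of item variances / variance of total scores); the three-item scale and its scores are made up for the example.

```python
import statistics

def cronbach_alpha(items):
    """items: one list of scores per scale item, same respondent order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's total
    item_var_sum = sum(statistics.variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / statistics.variance(totals))

# Hypothetical scale: 3 items answered by 5 respondents
items = [[1, 2, 3, 4, 5],
         [2, 1, 4, 3, 5],
         [1, 3, 2, 5, 4]]
alpha = cronbach_alpha(items)
```

These items mostly rise and fall together, giving an alpha of about 0.84. Three identical items would give exactly 1.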

87
Q

What does a value of zero represent for Cronbach’s alpha?

A

Internal consistency reliability is very low and consistency cannot be assumed

88
Q

What is the acceptable score for Cronbach’s alpha?

A

0.8

89
Q

What is sampling error?

A

Error in a statistical analysis arising from the unrepresentativeness of the sample taken.

90
Q

What are three limitations of convenience sampling?

A
  • Possible bias
  • Poor generalisability
  • Potential for sampling error
91
Q

Why is it crucial to discuss attrition?

A

Attrition of the original sample represents a potential threat of bias if those who drop out of the study are systematically different from those who remain in the study.

92
Q

What is the b value in regression?

A

The b value is the gradient of the regression line. The b value (on the second line of the SPSS output) tells you that if the other variable is increased by one point, the result will go up by b.

93
Q

What does the Central Limit Theorem tell us about statistical inference?

A

Since t tests and ANOVA are based on assuming the sample means have Normal distributions, this means that we can use these methods even if the data seem slightly skewed, particularly if the sample sizes are large.

94
Q

What do you need to remember when describing a relationship from a scatterplot?

A

Strength of association, direction/shape (pos/neg) and linearity

95
Q

Would you go ahead with further statistical testing if a scatterplot showed moderate relationship, but had a number of points that deviated from the line?

A

Yes, but it may be necessary to try a non-parametric test. The strength of association may be stronger than anticipated because of the values that deviate from the line.

96
Q

Can you imply cause from an observational study?

A

No, it is very difficult

97
Q

What is Sampling variability?

A

Sampling variability refers to the fact that statistics, such as the sample mean, would give different results if the random sampling process were repeated. We thus need to account for sampling variability when drawing any conclusions from our data.

98
Q

What does the Central Limit Theorem say about statistical inference?

A

The Central Limit Theorem says that the distribution of the sample mean is approximately Normal for sufficiently large samples. Since t tests and ANOVA are based on assuming the sample means have Normal distributions, this means that we can use these methods even if the data seem slightly skewed, particularly if the sample sizes are large.

99
Q

Why might we prefer a parametric over a non-parametric test?

A

If assumptions are satisfied then parametric tests are more powerful than their non-parametric counterparts (although the difference can be minor). Parametric tests also provide direct estimates for effects, including confidence intervals. Nonparametric tests come with their own assumptions too.

100
Q

What is the least squares line?

A

A way of fitting the data with the line that minimises the sum of the squared residuals.
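
The fitting can be sketched with the closed-form solution for a single predictor. This is a minimal illustration on made-up data, not SPSS output: the slope is b = Sxy / Sxx, the intercept is a = mean(y) - b × mean(x), and the residuals are observed minus predicted values.

```python
def least_squares_line(x, y):
    """Intercept a and slope b of the line minimising squared residuals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

x = [1, 2, 3, 4, 5]  # hypothetical predictor values
y = [2, 4, 5, 4, 5]  # hypothetical responses

a, b = least_squares_line(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

mean_y = sum(y) / len(y)
sse = sum(r ** 2 for r in residuals)       # residual sum of squares
sst = sum((yi - mean_y) ** 2 for yi in y)  # total sum of squares
r_squared = 1 - sse / sst
```

For these data the fitted line is y = 2.2 + 0.6x, the residuals sum to zero, and R-squared is 0.6.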