Statistics Flashcards by Annabel Smith

Define variable

Aspect that can take different values for different participants

What are the two main types of variable?

Categorical

Quantitative

What are two types of categorical variable?

Nominal - unordered labelled characteristics

Ordinal - small set of ordered/ranked categories

What descriptive statistics would you do for categorical data?

Frequency

Relative Frequency
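
As a minimal sketch of these two summaries, assuming a small made-up list of blood group labels (the data and variable names are illustrative and not part of the cards):

```python
from collections import Counter

# Hypothetical categorical observations (illustrative data only)
blood_groups = ["A", "O", "O", "B", "A", "O", "AB", "O"]

# Frequency: number of participants in each category
frequency = Counter(blood_groups)

# Relative frequency: each count as a proportion of all participants
total = len(blood_groups)
relative_frequency = {group: count / total for group, count in frequency.items()}

print(frequency)
print(relative_frequency)
```

The relative frequencies sum to 1, which is a quick check that every participant was counted once.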

What type of graph would you create for categorical data?

Bar chart

What descriptive statistics would you do for quantitative data?

Averages
Variation
Symmetry

What types of graph would you create for quantitative data?

Histogram
Box plot (also called a box and whisker plot)

What is normal distribution?

Mathematically defined theoretical distribution

Define descriptive statistics

Describe and summarise data in the sample

e.g. how common certain characteristics are, and how different characteristics are related to each other

Define inferential statistics

Using sample data to make inferences about characteristics and relationships in the populations
e.g. standard errors, confidence intervals, p-values

What is standard error?

Indicates how far, on average, the sample estimate is expected to be from the true population parameter value

What is the difference between standard error and standard deviation?

SE summarises precision of an estimate

SD summarises variability of the individual observations in the sample
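
The distinction can be sketched numerically. This assumes the usual formula for the standard error of a mean, SE = SD / sqrt(n), which the cards do not state, and uses made-up blood pressure readings:

```python
import math
import statistics

# Hypothetical sample of systolic blood pressures (illustrative data only)
sample = [118, 125, 130, 121, 135, 128, 122, 131]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)   # sample SD: spread of the individual observations
se = sd / math.sqrt(n)          # SE of the mean: precision of the estimated mean

print(f"mean={mean}, SD={sd:.2f}, SE={se:.2f}")
```

With a larger sample the SD stays roughly the same, while the SE shrinks because the mean is estimated more precisely.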

What is a confidence interval?

Range of values within which we can be reasonably confident (e.g. 95% confident) that the true population value lies
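
A rough sketch of a 95% confidence interval for a mean, assuming the large-sample normal approximation (estimate plus or minus about 1.96 standard errors). The function name and data are hypothetical; for small samples a t-distribution multiplier would normally be used instead:

```python
import math
import statistics

def mean_confidence_interval(values, level=0.95):
    """Approximate CI for a mean using the normal distribution (large-sample sketch)."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)
    # z multiplier for the requested level (about 1.96 for 95%)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * se, mean + z * se

# Hypothetical sample (illustrative data only)
lower, upper = mean_confidence_interval([118, 125, 130, 121, 135, 128, 122, 131])
print(f"95% CI: {lower:.1f} to {upper:.1f}")
```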

What is a p-value?

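Quantifies the extent to which the sample estimate contradicts the null hypothesis: the probability of obtaining a result at least as extreme as the one observed if the null hypothesis were true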

What is a null hypothesis?

The ‘most boring truth imaginable’, i.e. that there is no difference or no association in the population

What is an alternative hypothesis?

Opposite of the null hypothesis

Usually two-tailed - allows contradictions to the null hypothesis in either direction

What is a Type I and Type II error in hypothesis testing?

Type I error - null rejected when it is true, a ‘significant’ result due to chance
Type II error - null not rejected when it is false, often because the study is not powerful enough

What is a two sample (unpaired) t-test used for? What are the assumptions?

Tests for mean difference between two independent groups
Assumptions:
Variable is normally distributed
SD is similar
Observations are not paired
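
A minimal sketch of the test statistic for this card, assuming the pooled-variance (equal SD) form of the unpaired t-test. The function name and data are hypothetical; turning t into a p-value needs the t distribution, which a library routine such as SciPy's `scipy.stats.ttest_ind` handles:

```python
import math
import statistics

def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for two independent groups, assuming similar SDs."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    # Pool the two variances, weighting each by its degrees of freedom
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    # Standard error of the difference in means
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se_diff
    return t, n_a + n_b - 2

# Hypothetical scores for two independent groups (illustrative data only)
t, df = pooled_t_statistic([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(t, df)
```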

What is a paired t-test used for? What are the assumptions?

Used when observations are linked in some way (e.g. before and after)
Analysis based on within-pair differences between groups
Assumption:
Within-pair differences are normally distributed
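
The within-pair analysis can be sketched as a one-sample t statistic on the differences. The function name and the before/after values are hypothetical; SciPy's `scipy.stats.ttest_rel` would also give the p-value:

```python
import math
import statistics

def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) based on within-pair differences."""
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    se_diff = statistics.stdev(differences) / math.sqrt(n)
    return mean_diff / se_diff, n - 1

# Hypothetical measurements on the same five participants (illustrative data only)
before = [10, 12, 9, 11, 13]
after = [12, 15, 10, 14, 14]
t_paired, df_paired = paired_t_statistic(before, after)
print(t_paired, df_paired)
```

Only the differences enter the calculation, which is why the normality assumption applies to them rather than to the raw scores.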

What is an ANOVA used for? What are the assumptions?

For comparing 3 or more independent groups
Assumptions:
Each group is normally distributed
SD is similar
Observations are independent

What is a repeated measures ANOVA used for? What are the assumptions?

For comparing 3 or more paired groups
Assumptions:
Difference scores between any two groups are normally distributed
SD of difference scores should be the same for all combinations of groups

How do non-parametric tests work?

Analyse rank ordering rather than actual scores

Compare distributions rather than means
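
To illustrate what "analysing ranks" means, here is a small sketch that pools two groups, ranks the values (tied values share their average rank) and sums the ranks in one group, which is the quantity a rank sum test starts from. The helper name and data are hypothetical:

```python
def average_ranks(values):
    """Rank values from smallest (1) to largest, giving tied values their mean rank."""
    ordered = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(ordered):
        end = start
        # Extend the run while the next sorted value is tied with the current one
        while end + 1 < len(ordered) and values[ordered[end + 1]] == values[ordered[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for position in range(start, end + 1):
            ranks[ordered[position]] = mean_rank
        start = end + 1
    return ranks

# Hypothetical scores for two independent groups (illustrative data only)
group_a = [3.1, 4.5, 2.8]
group_b = [5.0, 4.5, 6.2, 7.9]
ranks = average_ranks(group_a + group_b)
rank_sum_a = sum(ranks[:len(group_a)])
print(ranks, rank_sum_a)
```

Replacing 7.9 with 790 would leave every rank unchanged, which is why rank-based tests are robust to skew and outliers.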

When do we use non-parametric tests?

When assumptions for parametric tests do not hold

e.g. variable is skewed, SD differs markedly, variable is more ordinal than quantitative

What is a Wilcoxon (rank sum) test used for?

Compares two independent groups

What is a Kruskal-Wallis test used for?

Comparing three or more independent groups

What is a Wilcoxon signed rank test used for?

Compares two paired groups

What is a Friedman test used for?

Compares three or more paired groups

What are the advantages and disadvantages of non-parametric tests?

Advantages:
Always valid for quantitative data
Often provide similar p-values to parametric tests
Disadvantages:
Do not make direct inferences
Do not provide CIs
Based on analysis of ranks, not scores
Not as powerful as parametric tests when the parametric assumptions hold

How do you calculate proportion?

No. in category of interest/total no. of participants

What does the term risk mean?

The proportion/percentage of people with a specified disease in a population

How do you calculate odds?

No. in category of interest / No. in other category
OR
% in category of interest / (100 - % in category of interest)
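
The proportion and odds formulas can be checked against each other with a short sketch. The function names and the 20-out-of-100 example are hypothetical:

```python
def proportion(n_interest, n_total):
    """Number in the category of interest divided by all participants."""
    return n_interest / n_total

def odds(n_interest, n_other):
    """Number in the category of interest divided by the number not in it."""
    return n_interest / n_other

def odds_from_percentage(pct):
    """Same odds, starting from the percentage in the category of interest."""
    return pct / (100 - pct)

# Hypothetical example: 20 of 100 participants have the characteristic
print(proportion(20, 100))        # proportion with the characteristic
print(odds(20, 80))               # odds from counts
print(odds_from_percentage(20))   # odds from the percentage
```

Both odds calculations agree, and a percentage of 50 corresponds to odds of exactly 1.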

How are odds and % related?

The higher the %, the higher the odds

In what type of study are odds particularly useful?

Case-control studies

How would you summarise binary variables in 2 independent groups?

Cross tabulate - exposure variable and outcome variable

When calculating absolute measures, what indicates no difference?

A value of 0 (e.g. a risk difference of 0)

How do you calculate risk difference?

% of people affected in one group - % in the other

When calculating relative measures, what indicates no difference?

A value of 1 (e.g. a risk ratio or odds ratio of 1)

How do you calculate risk ratio/relative risk?

% affected in intervention group / % in the control

How do you calculate odds ratio?

Odds in one group / odds in the other

What do 0, > 0 and < 0 mean in terms of risk difference?

0 = groups equally likely to have disease
> 0 = first group more likely to have disease
< 0 = second group more likely to have disease

How do you calculate absolute risk reduction?

Difference in % points between groups

What is number needed to treat?

Number of people that need to receive intervention before one person benefits from it

How do you calculate number needed to treat?

100 / risk difference (with the risk difference expressed in percentage points)
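
A worked sketch of the risk difference and NNT calculations, using made-up trial counts. The helper name is hypothetical, and rounding the NNT up to a whole person is a common convention rather than something stated on the cards:

```python
import math

def risk_percentage(events, total):
    """Risk as a percentage: people affected divided by people in the group."""
    return 100 * events / total

# Hypothetical trial (illustrative data only)
control_risk = risk_percentage(40, 200)    # control group
treated_risk = risk_percentage(30, 200)    # intervention group

# Absolute risk reduction: difference in percentage points between groups
risk_difference = control_risk - treated_risk

# Number needed to treat: 100 divided by the risk difference in percentage points
nnt = 100 / risk_difference
nnt_whole_people = math.ceil(nnt)

print(risk_difference, nnt, nnt_whole_people)
```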

What does a risk ratio of <1 mean?

Disease occurrence is lower in the intervention group

How do you calculate relative risk reduction?

(1-risk ratio)x100

How do you calculate odds ratio?

odds of disease in exposed group / odds of disease in non-exposed group
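
The relative measures on the preceding cards can be sketched from a single 2x2 table. The counts and names below are hypothetical:

```python
# Hypothetical 2x2 table (illustrative data only)
exposed_disease, exposed_healthy = 15, 85
unexposed_disease, unexposed_healthy = 30, 70

# Risk in each group: diseased divided by everyone in the group
risk_exposed = exposed_disease / (exposed_disease + exposed_healthy)
risk_unexposed = unexposed_disease / (unexposed_disease + unexposed_healthy)

# Risk ratio (relative risk) and relative risk reduction
risk_ratio = risk_exposed / risk_unexposed
relative_risk_reduction = (1 - risk_ratio) * 100

# Odds in each group: diseased divided by not diseased, then their ratio
odds_exposed = exposed_disease / exposed_healthy
odds_unexposed = unexposed_disease / unexposed_healthy
odds_ratio = odds_exposed / odds_unexposed

print(risk_ratio, relative_risk_reduction, odds_ratio)
```

Here both ratios are below 1, so disease is less common in the exposed group. The odds ratio sits further from 1 than the risk ratio because the disease is not rare in this made-up table.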

What are risk difference and NNT good at quantifying?

Impact of an intervention

What are risk ratio and odds ratio good at quantifying?

Strength of association between intervention and disease status

Which two tests give p-values when comparing binary variables between two groups?

Chi-squared test | Fisher's exact test

How do you calculate an 'expected value'?

(row total x column total) / total sample size
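
Applying this formula to every cell of a table can be sketched as follows. The function name and the counts are hypothetical:

```python
def expected_counts(table):
    """Expected count per cell: (row total x column total) / total sample size."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in column_totals] for r in row_totals]

# Hypothetical 2x2 table: rows are groups, columns are outcome yes/no
observed = [[15, 85],
            [30, 70]]
expected = expected_counts(observed)
print(expected)
```

The expected counts are what each cell would hold if the outcome were equally common in both groups.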

What are the assumptions of the Chi-squared test?

Total sample size is at least 40 OR If sample is between 20 and 39, the expected value in each cell is at least 5

What are the assumptions of the Fisher's Exact Test?

Fewer than 20 participants | Between 20 and 39 participants and the expected value in at least one cell is less than 5
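
The sample-size rules on these two cards can be written as a small decision helper. This is only a sketch of the thresholds as stated on the cards (other textbooks use slightly different rules), and the function name is hypothetical:

```python
def choose_test(table):
    """Pick a test for a 2x2 table using the thresholds given on the cards."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    total = sum(row_totals)
    # Smallest expected count: (row total x column total) / total sample size
    min_expected = min(r * c / total for r in row_totals for c in column_totals)

    if total >= 40:
        return "chi-squared"
    if total >= 20 and min_expected >= 5:
        return "chi-squared"
    return "Fisher's exact"

print(choose_test([[15, 85], [30, 70]]))  # large sample
print(choose_test([[3, 7], [6, 4]]))      # 20 participants, small expected count
```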

What is the definition of correlation?

The association between two variables

How can correlation be summarised graphically and numerically?

Graphically - scatterplots | Numerically - correlation coefficients

What does a correlation coefficient do?

Quantifies the strength of association between two variables

What is the difference between Pearson and Spearman correlation coefficient?

Pearson - linear relationship | Spearman - non-linear associations (monotonic - only positive or only negative)

What does R Squared tell us?

The proportion of the variation in one variable that is explained by another variable

How do we calculate R squared?

Multiply Pearson's coefficient by itself
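
A minimal sketch of Pearson's coefficient and R squared, computed from sums of squares and cross-products. The function name and data are hypothetical:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements (illustrative data only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r * r  # proportion of variation in one variable explained by the other
print(round(r, 3), round(r_squared, 3))
```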

What is linear regression used for?

Estimating the mathematical equation that describes the linear relationship between a quantitative outcome and a quantitative predictor

What is the least squares estimation?

Method of estimating regression coefficients
Derives line of best fit/regression line
Estimate of the true regression line in the population
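
For a single predictor, the least squares estimates have a closed form: slope = Sxy / Sxx and intercept = mean of y minus slope times mean of x. The sketch below assumes that standard formula (it is not given on the cards) and uses hypothetical names and data:

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the line minimising the squared residuals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical predictor and outcome values (illustrative data only)
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

intercept, slope = least_squares_line(x_values, y_values)
predicted_at_6 = intercept + slope * 6  # using the line to predict an outcome score
print(intercept, slope, predicted_at_6)
```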

What are the assumptions of regression?

Outcome is quantitative
Relationship is linear
Residuals are normally distributed
Constant variance

What are the similarities and differences between regression and Pearson's correlation coefficient?

Pearson's quantifies strength of association | Regression describes the relationship and can be used to predict outcome variable score

How do we calculate sensitivity?

TP / (TP + FN)

How do we calculate specificity?

TN / (TN + FP)

What is PPV?

Positive predictive value - proportion of those with a +ve result that have the condition

What is NPV?

Negative predictive value - proportion of those with a -ve result that do not have the condition

How do we calculate PPV?

TP / (TP + FP)

How do we calculate NPV?

TN / (TN + FN)
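
The four diagnostic accuracy measures can be computed together from the counts of true and false positives and negatives. The function name and counts are hypothetical:

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # of those with the condition, share testing +ve
        "specificity": tn / (tn + fp),  # of those without it, share testing -ve
        "ppv": tp / (tp + fp),          # of those testing +ve, share with the condition
        "npv": tn / (tn + fn),          # of those testing -ve, share without it
    }

# Hypothetical test results (illustrative data only)
measures = diagnostic_measures(tp=80, fp=10, fn=20, tn=90)
print(measures)
```

Note the parentheses: each denominator is a sum, so TP / (TP + FN) is not the same as TP / TP + FN.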