Statistics Flashcards

Question

What does a Box and Whiskers plot show?

Answer 1

Graph indicates:
• median
• lower quartile
• upper quartile
• range that contains most values
• outliers – extreme observations with very low or very high values

Answer 2

1. Variable is normally distributed in each group in the population (or the sample size is large and the variable is not too skewed)
2. Standard deviation is similar in the two groups
3. Participants (observations) are independent between groups – i.e., NOT paired

Answer 3

y axis/vertical

Answer 4

minimises the sum of the (vertical) squared distances between actual outcome scores and the line – line of best fit
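
A minimal sketch of the least-squares idea, assuming a single quantitative predictor. The data and the helper names `fit_line` and `sum_squared_residuals` are hypothetical, chosen only for illustration.

```python
# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

def fit_line(xs, ys):
    """Return (intercept, slope) of the line that minimises the sum of
    squared vertical distances between the observed ys and the line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of squared vertical distances between each y and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

intercept, slope = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, intercept, slope)
# Any other line, for example one with a slightly steeper slope, fits worse.
worse = sum_squared_residuals(xs, ys, intercept, slope + 0.1)
```

Comparing `best` with `worse` shows the defining property: moving the line away from the fitted one increases the sum of squared distances.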

Answer 5

The precision with which the true population parameter (the mean) is estimated. The smaller the standard error the more precise the sample estimate is of the true mean.
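
As a rough illustration, the standard error of the mean is commonly computed as the sample standard deviation divided by the square root of the sample size. The scores and the helper name `standard_error` below are assumptions made for this sketch.

```python
import math
import statistics

# Hypothetical sample scores, chosen only for illustration.
sample = [4, 8, 6, 5, 7]

def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

se_small = standard_error(sample)
# Repeating the same scores four times keeps the spread similar but
# quadruples n, so the standard error shrinks.
se_large = standard_error(sample * 4)
```

The larger sample gives the smaller standard error, which is what "more precise" means here.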

Answer 6

A non-parametric test for comparing a quantitative variable between two paired groups. Provides an IQR for each group, and a p-value.

Answer 7

the most boring truth imaginable, not necessarily what you think the truth is.

Answer 8

quantify the strength of association between two variables

Answer 9

number of participants in a category/total number of participants

Answer 10

graph where the heights of rectangular bars are used to indicate the number (or proportion or percentage) of participants that are in each category

Answer 11

Graph where the heights of rectangular bars (or bins) are used to indicate the (relative) frequency with which values in specific ranges occur. Unlike bar charts (which are used for categorical data) they have no gaps between the bins.

Answer 12

Hypothesis test for comparing the mean across three or more paired (matched) groups. It provides a global p-value comparing the mean across all groups

Answer 13

the association between two variables

Answer 14

1/risk difference
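
A small sketch, assuming this card gives the formula for the number needed to treat. The risks and the function name `number_needed_to_treat` are hypothetical.

```python
def number_needed_to_treat(risk_control, risk_treated):
    """1 / risk difference, using the absolute difference in risk."""
    risk_difference = abs(risk_control - risk_treated)
    if risk_difference == 0:
        raise ValueError("risk difference is zero, so the value is undefined")
    return 1 / risk_difference

# Hypothetical risks: 20% in the control group, 15% in the treated group.
nnt = number_needed_to_treat(0.20, 0.15)
```

With these made-up risks the risk difference is 0.05, so roughly 20 people would need the intervention for one to benefit.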

Answer 15

graph used to summarise the relationship between two quantitative variables on two axes. Each participant is represented on the scatterplot using a symbol such as a dot (●) or cross (×). The position of the dot on the vertical axis (y axis) indicates the score on one variable and the position on the horizontal axis (x axis) indicates the score on the other variable.

Answer 16

The odds are how common a binary characteristic is for a single group. The odds ratio is the ratio of the odds in one group to the odds in another group. It's calculated to compare the odds between groups.
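
A minimal sketch of both quantities, assuming counts from a 2×2 table. The counts and function names are hypothetical.

```python
def odds(with_characteristic, without_characteristic):
    """Odds within one group: number with / number without."""
    return with_characteristic / without_characteristic

def odds_ratio(a, b, c, d):
    """a, b = with/without in group 1; c, d = with/without in group 2."""
    return odds(a, b) / odds(c, d)

# Hypothetical counts: 20 of 100 in group 1 and 10 of 100 in group 2
# have the characteristic.
or_value = odds_ratio(20, 80, 10, 90)
```

Here the odds are 0.25 in group 1 and about 0.11 in group 2, giving an odds ratio of 2.25.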

Answer 17

Like a histogram that is turned back to front and flipped on its side. Each observation is represented as a dot (●). The length of the “bars” indicates how common the value is.

Answer 18

Quantitative

Answer 19

have the disease and correctly test positive

Answer 20

The risk is the proportion of people in a single group who have a disease. The relative risk is used to compare the risk between two groups

Answer 21

Chi-squared test or Fisher’s exact test

Answer 22

the proportion of the variation in one variable that is explained by another variable
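
One way to make this concrete, assuming a simple linear regression with one predictor: the proportion explained is 1 minus the unexplained (residual) variation divided by the total variation. The data and the name `r_squared` are hypothetical.

```python
# Hypothetical paired scores, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

def r_squared(xs, ys):
    """Proportion of the variation in ys explained by a least-squares line on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    # Variation left over after fitting the line.
    unexplained = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation of ys around their mean.
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - unexplained / total

r2 = r_squared(xs, ys)
```

For these made-up scores the line explains 60% of the variation in `ys`.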

Answer 23

when the assumptions that underlie parametric methods for independent groups do not hold, specifically where:
• variable is skewed (and sample size is small)
• standard deviation differs markedly across groups
• variable is more ordinal (categorical) than quantitative

Answer 24

This is the number of people that need to receive the intervention before 1 person benefits from it.

Answer 25

The mean quantifies the average for the quantitative variables. It is calculated for a given variable as the sum of the values divided by the total number of values

Answer 26

A mathematically defined theoretical distribution characterised by symmetrical bell-shaped curve.

Answer 27

Distribution refers to the different values that occur and the frequency with which they occur for a given variable.

Answer 28

Either numerical or graphical representation of:
• frequency
  o actual number in each category
• relative frequency
  o proportion of the total in each category
  o percentage of the total in each category

Answer 29

A non-parametric test for comparing a quantitative variable across three or more paired (matched) groups. Provides an IQR for each group, and a p-value.

Answer 30

Reference standards

Answer 31

upper bound of range = mean + 1.96 x standard deviation
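
A short sketch of this calculation, assuming the lower bound mirrors the upper one (mean − 1.96 × standard deviation) and that the sample standard deviation is used. The values and the name `reference_range` are hypothetical.

```python
import statistics

# Hypothetical measurements, chosen only for illustration.
values = [4, 8, 6, 5, 7]

def reference_range(values):
    """Return (lower, upper) as mean -/+ 1.96 x sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return mean - 1.96 * sd, mean + 1.96 * sd

lower, upper = reference_range(values)
```

The range is symmetric around the mean, so it is only a sensible summary when the variable is roughly Normally distributed.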

Answer 32

* sample is only a subset of the population
* there is variability across people
* sample is not necessarily representative

Answer 33

most observations bunched at the lower values with a longer tail at the higher values

Answer 34

no variation is explained

Answer 35

Wilcoxon signed-rank test can be used as an alternative to the paired t-test when the assumptions for the latter are not satisfied. This test compares the distribution between the first and second measurements.

Answer 36

* variability (differences) between people in what you are trying to measure
* the sample is only a subset of the population and is not perfectly representative of it

Answer 37

Estimating a mathematical equation that describes the linear relationship between a quantitative outcome and a quantitative predictor

Answer 38

Hypothesis test for comparing the mean across three or more independent groups. It provides a global p-value comparing the mean across all groups

Answer 39

Correlated means the scores on one variable are associated with (or predicted by) scores on the other

Answer 40

you may want to compare the groups to each other using pairwise comparisons

Answer 41

TP/(TP+FN)
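
A minimal sketch of this ratio, assuming TP and FN are counts of true positives and false negatives (the formula for sensitivity). The counts are hypothetical.

```python
def sensitivity(tp, fn):
    """Proportion of people with the disease who correctly test positive."""
    return tp / (tp + fn)

# Hypothetical counts: 90 true positives and 10 false negatives.
sens = sensitivity(tp=90, fn=10)
```

With these counts, 90 of the 100 people who have the disease test positive.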

Answer 42

a measure of the correlation between two quantitative variables that have a linear relationship.
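
A rough sketch, assuming this card describes Pearson's correlation coefficient. The data and the name `pearson_r` are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of xs and ys scaled by their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired scores, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
# A perfectly linear relationship gives a coefficient of 1.
r_perfect = pearson_r(xs, [2 * x + 1 for x in xs])
```

The coefficient always lies between -1 and 1, with values further from 0 indicating a stronger linear association.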

Answer 43

the exact alternative to the Chi-squared test for contingency tables, typically used when expected cell counts are small

Answer 44

Allows for the interpretation of the confidence interval & hypothesis test (incl. p value) for the mean difference between two independent groups

Answer 45

The confidence interval is the range of values within which we can be 95% certain the true value of the parameter of interest lies in the population
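
As an illustration only, a 95% confidence interval for a mean is often approximated as mean ± 1.96 × standard error. That multiplier is a large-sample assumption (small samples usually use a t-distribution multiplier instead), and the data and function name are hypothetical.

```python
import math
import statistics

# Hypothetical sample scores, chosen only for illustration.
sample = [4, 8, 6, 5, 7]

def confidence_interval_95(values):
    """Approximate 95% CI for the mean: mean -/+ 1.96 x standard error."""
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return mean - 1.96 * standard_error, mean + 1.96 * standard_error

lower, upper = confidence_interval_95(sample)
```

The interval is centred on the sample mean and narrows as the standard error gets smaller.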

Answer 46

proportion with disease in exposed group/proportion with disease in non-exposed group
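
A minimal sketch of this ratio (the relative risk), assuming simple counts in each group. The counts and the name `relative_risk` are hypothetical.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed group divided by risk in the non-exposed group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed

# Hypothetical counts: 30 of 100 exposed and 10 of 100 non-exposed have the disease.
rr = relative_risk(30, 100, 10, 100)
```

Here the risk is 0.3 in the exposed group and 0.1 in the non-exposed group, so the exposed group has about three times the risk.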

Answer 47

categorical with 2 categories e.g. mortality status: alive versus dead

Answer 48

A non-parametric test for comparing a quantitative variable across three or more independent groups. Provides an IQR for each group, and a p-value.

Answer 49

An unpaired (or two-sample) t-test is used to compare means between 2 independent groups. It tests the hypothesis that the mean is the same in the populations from which the participants in each group were drawn.
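
A sketch of the test statistic only, assuming the pooled-variance form of the test (which relies on similar standard deviations in the two groups). The data and the name `unpaired_t_statistic` are hypothetical, and turning the statistic into a p-value would need a t-distribution, which is not shown.

```python
import math
import statistics

# Hypothetical scores for two independent groups.
group_a = [4, 8, 6, 5, 7]
group_b = [1, 5, 3, 2, 4]

def unpaired_t_statistic(a, b):
    """Difference in means divided by its pooled standard error."""
    n_a, n_b = len(a), len(b)
    pooled_var = (((n_a - 1) * statistics.variance(a)
                   + (n_b - 1) * statistics.variance(b))
                  / (n_a + n_b - 2))
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(a) - statistics.mean(b)) / std_err

t_stat = unpaired_t_statistic(group_a, group_b)
degrees_of_freedom = len(group_a) + len(group_b) - 2
```

A statistic far from 0, judged against a t-distribution with `degrees_of_freedom`, is evidence against the hypothesis that the population means are equal.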

Answer 50

the prevalence of the disease

Answer 51

how far apart are the values from each other

Answer 52

The exposure defines the groups, e.g. intervention category in a trial. The outcome is the binary variable being compared, e.g. the disease/disorder category of interest.

Answer 53

TN/(FP+TN)
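
A minimal sketch of this ratio, assuming TN and FP are counts of true negatives and false positives (the formula for specificity). The counts are hypothetical.

```python
def specificity(tn, fp):
    """Proportion of people without the disease who correctly test negative."""
    return tn / (fp + tn)

# Hypothetical counts: 160 true negatives and 40 false positives.
spec = specificity(tn=160, fp=40)
```

With these counts, 160 of the 200 people without the disease test negative.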

Answer 54

SD stands for standard deviation. It quantifies the variation in the scores for the quantitative variables. It can be interpreted as the average difference between the scores and the mean
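
A small sketch of the calculation, assuming the usual sample formula with n − 1 in the denominator. The scores and the name `sample_sd` are hypothetical.

```python
import math

# Hypothetical scores, chosen only for illustration.
scores = [4, 8, 6, 5, 7]

def sample_sd(values):
    """Square root of the sum of squared deviations from the mean over n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (n - 1))

sd = sample_sd(scores)
```

For these scores the mean is 6 and the SD is about 1.58, so a typical score sits roughly 1.6 units from the mean.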

Answer 55

null hypothesis might not be rejected when it is false, e.g. when the study is not large (powerful) enough to reach significance

Answer 56

the value which characterises the middle of the distribution

Answer 57

do not have the disease but incorrectly test positive

Answer 58

for each person that has a score below the average, there is a corresponding person with a score the same distance above the average

Answer 59

1) outcome is quantitative
2) relationship between the outcome and quantitative predictor is linear
3) residuals are Normally distributed (or the sample size is large)
4) constant variance (homoscedasticity)

Answer 60

have the disease but incorrectly test negative

Answer 61

p-value of 0.05 (or less)

Answer 62

constant or intercept

Answer 63

The assumptions made by the t-test are that the distribution of the variables is Normal in each of the groups and the standard deviation is approximately the same in each group.

Answer 64

number of participants in category of interest/number of participants in other category

Answer 65

full set of units that we are interested in

Answer 66

* the difference scores between any two groups are Normally distributed in the population (or sample size is large and difference scores not too skewed)
* the standard deviation of the difference scores when comparing any two groups should be similar (“sphericity” assumption)

Answer 67

Nominal/Binary

Answer 68

measure of the correlation between 2 quantitative variables; the relationship doesn't have to be linear
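
A rough sketch, assuming this card describes Spearman's rank correlation, which can be computed as a Pearson correlation on the ranks of the values. The data and helper names are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Pearson correlation applied to the ranks rather than the raw values."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: ys rises with xs, but along a curve rather than a line.
xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]
rho = spearman_rho(xs, ys)
```

Because only the ordering matters, this curved but steadily increasing relationship still gives a coefficient of 1.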

(100 cards)