Biostats Flashcards

Question 1

Q

What is the name given to a descriptive measure computed from the data of a sample?

Answer

A

A statistic

Question 2

Q

What is the name given to a descriptive measure computed from the data of a population?

Answer

A

A parameter

Question 3

Q

If a distribution is skewed to the right, what is the relationship between mean, median and mode? Where is the tail of this graph?

Answer

A

Mean > Median > Mode

Tail is on the right

Question 4

Q

If a distribution is skewed to the left, what is the relationship between the mean, median and mode? Where is the tail of this graph?

Answer

A

Mode > Median > Mean

Tail is on the left
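
As a quick check of this ordering, here is a minimal Python sketch. The data set is made up for illustration (it is an assumption, not part of the cards): most values are high and a single low value forms the left tail.

```python
import statistics

# Hypothetical left-skewed data: most values are high, one low outlier
scores = [1, 6, 7, 8, 8, 9, 9, 9]

mean = statistics.mean(scores)      # pulled towards the tail on the left
median = statistics.median(scores)  # middle of the sorted values
mode = statistics.mode(scores)      # most frequent value

print("mean:", mean, "median:", median, "mode:", mode)
```

The mean is dragged furthest towards the tail, so it ends up smallest, with the median between it and the mode.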

Question 5

Q

What does a 95% confidence interval imply in theoretical probabilistic terms?

Answer

A

That in repeated sampling from a normally distributed population, 95% of all intervals will (in the long run) include the true population mean

Question 6

Q

What does a 95% confidence interval imply in practical terms?

Answer

A

We are 95% confident that the interval will include the true population mean

Question 7

Q

Null hypothesis

Answer

A

There is no difference between the groups being compared, e.g. the means are equal

Question 8

Q

Alternative hypothesis

Answer

A

There is a difference between the groups being compared, e.g. the means are different

Usually synonymous with research hypothesis

Question 9

Q

What is a Type I error?

Answer

A

When we reject the null hypothesis, although the null is true (“Null = no wolf. Therefore Type I = False alarm”)

Question 10

Q

What symbol is the probability of a Type I error denoted by?

Answer

A

α (alpha)

Question 11

Q

What do we call the quantity 1 − α?

Answer

A

The level of confidence

Question 12

Q

What is a Type II error?

Answer

A

When we fail to reject the null hypothesis when the null is false (“Null = no wolf. Therefore Type II = Failing to raise the alarm”)

Question 13

Q

What symbol is the probability of a Type II error denoted by?

Answer

A

β (beta)

Question 14

Q

What do we call the quantity 1 − β?

Answer

A

Statistical power of the test (probability of rejecting the null hypothesis when it is false)

Question 15

Q

What is a P value?

Answer

A

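The probability of obtaining a result at least as extreme as the one observed if the null hypothesis were true. It is a measure of how much evidence we have against the null hypothesis: the smaller it is, the more evidence we have against the null.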

Question 16

Q

What does P>0.05 mean?

Answer

A

That we cannot reject the null hypothesis as we have insufficient evidence to do so. This does NOT necessarily mean that the null hypothesis is true, only that there is insufficient evidence to reject it

Question 17

Q

What is a paired t-test used to do?

Answer

A

It is used to determine whether there is a significant difference between paired observations, for example, before and after an intervention.
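
To make the idea concrete, the sketch below computes a paired t statistic by hand. The `before` and `after` readings are hypothetical values invented for illustration. The statistic is the mean of the within-pair differences divided by the standard error of those differences.

```python
import math
import statistics

# Hypothetical paired readings for the same five patients
before = [140, 152, 138, 147, 160]
after = [135, 150, 130, 145, 151]

# A paired test works on the within-pair differences
differences = [b - a for b, a in zip(before, after)]
n = len(differences)

mean_diff = statistics.mean(differences)
sem_diff = statistics.stdev(differences) / math.sqrt(n)  # SE of the mean difference
t_statistic = mean_diff / sem_diff                       # n - 1 degrees of freedom

print(round(t_statistic, 3))
```

The statistic is then compared against the t distribution with n − 1 degrees of freedom. For 4 degrees of freedom the two-sided 5% critical value is about 2.776.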

Question 18

Q

What is the significance of a confidence interval that crosses 0?

Answer

A

If the 95% confidence interval for the difference between two means includes zero, then the hypothesis test WILL give a statistically non-significant result (p>0.05)

Question 19

Q

What is a variable?

Answer

A

Any aspect of an individual that is measured or recorded

Question 20

Q

What is a categorical variable?

Answer

A

Qualitative

Question 21

Q

What is a numerical variable?

Answer

A

Quantitative

Question 22

Q

What is the relationship between mean, median and mode in a symmetrical distribution?

Answer

A

They are approximately equal/very similar

Question 23

Q

What percentage of observations are included in the following ranges of a normal distribution:
mean ±1, 2 and 3 standard deviations of the mean?
What is another name for these values?

Answer

A

±1 SD: about 68%
±2 SD: about 95%
±3 SD: about 99.7%
These can also be referred to as the probability limits
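
These percentages can be checked numerically. The sketch below uses the identity P(|Z| < k) = erf(k / √2) for a standard normal variable Z. The helper name `coverage_within` is made up for this example.

```python
import math

def coverage_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"±{k} SD: {coverage_within(k) * 100:.1f}%")
```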

Question 24

Q

What is a synonym for a sample statistic?

Answer

A

Point estimate

Question 25

Q

What is the name given to the measurement of the degree of variation between means from repeated samples? How is this calculated?

Answer

A

Standard error of the mean

SEM = SD/sqrt(n)
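
A minimal sketch of this formula in Python, using a small made-up sample (the values are assumptions for illustration only):

```python
import math
import statistics

# Hypothetical sample of five measurements
sample = [4.0, 5.0, 6.0, 7.0, 8.0]

sd = statistics.stdev(sample)       # sample standard deviation (n - 1 denominator)
sem = sd / math.sqrt(len(sample))   # SEM = SD / sqrt(n)

print(round(sd, 4), round(sem, 4))
```

Because n appears under the square root, quadrupling the sample size only halves the SEM.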

Question 26

Q

How is a 95% confidence interval calculated?

Answer

A

mean ± 1.96 × SEM
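
The calculation can be sketched as follows. The function name `ci95` and the sample values are hypothetical. The multiplier 1.96 comes from the normal distribution, so for small samples a multiplier from the t distribution would usually be used instead.

```python
import math
import statistics

def ci95(sample):
    """Approximate 95% confidence interval for the mean: mean ± 1.96 × SEM."""
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    margin = 1.96 * sem  # normal-approximation multiplier
    return mean - margin, mean + margin

# Hypothetical sample, for illustration only
lower, upper = ci95([4.0, 5.0, 6.0, 7.0, 8.0])
print(round(lower, 3), round(upper, 3))
```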

Question 27

Q

What is standard error?

Answer

A

A measure of the precision of an estimate. The standard error of the mean (SEM) shows how precisely the sample mean estimates the population mean

Question 28

Q

What is standard deviation?

Answer

A

A measurement of the variability of distributions. It shows how widely scattered the measurements are (SD)

Question 29

Q

When is a result inconclusive?

Answer

A

When a sample size is too small to declare an observed effect to be statistically significant

Question 30

Q

When is a result imprecise?

Answer

A

If a small sample size results in wide confidence limits

Question 31

Q

What does a t-test measure?

Answer

A

Whether there is a significant difference between two means from independent samples of a numerical variable

Question 32

Q

What are the underlying assumptions of a valid t-test?

Answer

A

Sampled populations are normally distributed
Standard deviations are similar

Question 33

Q

What is the non-parametric equivalent of a t-test and when is this used?

Answer

A

Wilcoxon rank-sum test (also known as the Mann-Whitney U test)
It is used when the data are clearly not normally distributed

Question 34

Q

What does a Chi-squared test measure?

Answer

A

The relationship between two categorical variables, and the differences in proportions
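
As an illustration, the sketch below computes the chi-squared statistic for a 2×2 table of counts by comparing each observed count with the count expected if the two variables were unrelated. The table values and the function name `chi_squared_statistic` are assumptions made for this example, and no continuity correction is applied.

```python
def chi_squared_statistic(table):
    """Sum of (observed - expected)^2 / expected over every cell of the table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Hypothetical counts: rows = exposed / non-exposed, columns = disease / no disease
table = [[20, 30], [30, 20]]
chi2 = chi_squared_statistic(table)
print(chi2)
```

For a 2×2 table there is 1 degree of freedom, and the 5% critical value is about 3.84.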

Question 35

Q

What is the purpose of Pearson’s correlation?

Answer

A

To measure the degree of association between two numerical variables
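
A minimal sketch of the calculation, written out by hand so each step is visible. The function name `pearson_r` and the data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length numeric lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```

The coefficient ranges from −1 (perfect negative linear association) through 0 (no linear association) to +1 (perfect positive linear association).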

Question 36

Q

Binary variable

Answer

A

Categorical
Observations fall under one of two categories
e.g. exposed vs non-exposed

Question 37

Q

Nominal

Answer

A

Categorical
Observations fall under more than two categories that have no natural order
e.g. classification of disease; marital status

Question 38

Q

Ordinal

Answer

A

Categorical
Observations fall under more than two categories which can be ordered
e.g. classification according to mild, moderate, severe

Question 39

Q

Continuous data

Answer

A

Numerical
Data are measurements that can assume any value within a specified range
e.g. height, weight, blood pressure

Question 40

Q

Discrete data

Answer

A

Numerical
Data are integers/counted numbers of events
e.g. number of births in a week, number of patients in a clinic

Question 41

Q

Which summary statistics are reported if a data set is symmetrically distributed?

Answer

A

Number of observations
Mean
Standard deviation

Question 42

Q

Which summary statistics are reported if a data set is skewed?

Answer

A

Number of observations
Median
Range
Interquartile range
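
These summaries can be computed with Python's standard library, as in the sketch below. The data are made up, with one large value creating a right skew. Quartile conventions differ between packages, so the exact interquartile range depends on the method used. The default method of `statistics.quantiles` is assumed here.

```python
import statistics

# Hypothetical right-skewed data (e.g. length of stay in days)
stays = [1, 2, 2, 3, 3, 4, 5, 8, 13, 40]

n = len(stays)
median = statistics.median(stays)
data_range = (min(stays), max(stays))
q1, _, q3 = statistics.quantiles(stays, n=4)  # quartile cut points
iqr = q3 - q1

print(n, median, data_range, (q1, q3), iqr)
print("mean for comparison:", statistics.mean(stays))
```

The mean is pulled well above the median by the single large value, which is why the median is the preferred summary for skewed data.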

Question 43

Q

What is a population?

Answer

A

A group of individuals having certain common characteristics about which statistical inferences can be made

Question 44

Q

What is a sample?

Answer

A

A subset of individuals selected from a defined population

Question 45

Q

Benefits of sampling

Answer

A

Less time to collect and analyse data
Greater flexibility in data management and type of information that can be obtained
Reduce cost

Question 46

Q

Random sampling

Answer

A

Each individual in the population has an equal chance of being included in the sample
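
A minimal sketch of drawing such a sample in Python. The population of 500 numbered individuals, the sample size and the seed are assumptions for illustration.

```python
import random

rng = random.Random(42)           # fixed seed so the example is reproducible

population = list(range(1, 501))  # hypothetical sampling frame of 500 individuals
sample = rng.sample(population, 25)  # drawn without replacement

print(sorted(sample))
```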

Question 47

Q

Stratified sampling

Answer

A

Stratified sampling is used when the population consists of distinct sub-groups or strata, which differ considerably with respect to the main feature under study
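
One way this might look in code is sketched below. It assumes that a random sample is drawn separately within each stratum and that the same sampling fraction is used in every stratum (proportional allocation), which is only one possible design. The strata names and the function `stratified_sample` are hypothetical.

```python
import random

def stratified_sample(strata, fraction, rng=None):
    """Draw a random sample of the given fraction from every stratum."""
    rng = rng or random.Random()
    sample = {}
    for name, members in strata.items():
        size = max(1, round(len(members) * fraction))  # at least one per stratum
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical strata with different sizes
strata = {
    "urban": [f"u{i}" for i in range(60)],
    "rural": [f"r{i}" for i in range(40)],
}
chosen = stratified_sample(strata, 0.1, random.Random(1))
print({name: len(members) for name, members in chosen.items()})
```

Sampling within each stratum guarantees that every sub-group is represented, which simple random sampling cannot promise.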

Question 48

Q

Systematic sampling

Answer

A

A process where individuals are selected systematically throughout the series on the basis of a predetermined sampling fraction
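
A minimal sketch of this process, assuming a sampling fraction of 1 in k and a random starting point among the first k individuals. The random start is a common convention, not something the card specifies. The function name `systematic_sample` and the patient list are hypothetical.

```python
import random

def systematic_sample(population, k, rng=None):
    """Select every k-th individual, starting from a random position in the first k."""
    rng = rng or random.Random()
    start = rng.randrange(k)      # random start in positions 0 .. k-1
    return population[start::k]   # then every k-th individual in the series

# Hypothetical ordered list of 100 patients, sampled 1 in 10
patients = list(range(1, 101))
sample = systematic_sample(patients, 10, random.Random(0))
print(sample)
```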

Question 49

Q

Cluster sampling

Answer

A

The study population is divided into clearly defined groups or clusters (for example, street blocks or areas around informal housing units)
A random sample of clusters is drawn
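
The two steps above can be sketched as follows. This assumes a one-stage design in which everyone in a selected cluster is included, which the card does not state. The cluster names and the function `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, rng=None):
    """Randomly choose whole clusters and keep every individual in them."""
    rng = rng or random.Random()
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen_names}

# Hypothetical street blocks, each listing its households
clusters = {
    "block_A": ["A1", "A2", "A3"],
    "block_B": ["B1", "B2"],
    "block_C": ["C1", "C2", "C3", "C4"],
    "block_D": ["D1", "D2", "D3"],
}
chosen = cluster_sample(clusters, 2, random.Random(3))
print(chosen)
```

Here the unit of random selection is the cluster, not the individual.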