Biostats Flashcards

1
Q

What is the name given to a descriptive measure computed from the data of a sample?

A

A statistic

2
Q

What is the name given to a descriptive measure computed from the data of a population?

A

A parameter

3
Q

If a distribution is skewed to the right, what is the relationship between mean, median and mode? Where is the tail of this graph?

A

Mode < Median < Mean

Tail is on the right

4
Q

If a distribution is skewed to the left, what is the relationship between the mean, median and mode? Where is the tail of this graph?

A

Mode > Median > Mean

Tail is on the left

5
Q

What does a 95% confidence interval imply in theoretical probabilistic terms?

A

That in repeated sampling from a normally distributed population, 95% of all intervals will (in the long run) include the true population mean
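
The long-run reading of a confidence interval can be checked with a small simulation. This is only a sketch: the population mean and SD, the sample size, the number of repeats and the seed are all made-up values, and the interval uses the mean ± 1.96 × SEM rule with the sample SD.

```python
import math
import random
import statistics

def coverage_of_95ci(true_mean=100.0, true_sd=15.0, n=100, repeats=2000, seed=1):
    """Return the fraction of simulated 95% intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(repeats):
        # Draw one sample from a normally distributed (hypothetical) population
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.mean(sample)
        sem = statistics.stdev(sample) / math.sqrt(n)
        low, high = mean - 1.96 * sem, mean + 1.96 * sem
        if low <= true_mean <= high:
            hits += 1
    return hits / repeats

coverage = coverage_of_95ci()
print(f"Intervals containing the true mean: {coverage:.1%}")
```

The printed share should land close to 95%, not exactly on it, because only a finite number of samples is simulated.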

6
Q

What does a 95% confidence interval imply in practical terms?

A

We are 95% confident that the interval will include the true population mean

7
Q

Null hypothesis

A

There is no difference: the means are equal

8
Q

Alternative hypothesis

A

Means are different

Usually synonymous with research hypothesis

9
Q

What is a Type I error?

A

When we reject the null hypothesis, although the null is true (“Null = no wolf. Therefore Type I = False alarm”)

10
Q

What symbol is the probability of a Type I error denoted by?

A

α (alpha)

11
Q

What do we call the quantity 1 − α?

A

The level of confidence

12
Q

What is a Type II error?

A

When we fail to reject the null hypothesis when the null is false (“Null = no wolf. Therefore Type II = Failing to raise the alarm”)

13
Q

What symbol is the probability of a Type II error denoted by?

A

β (beta)

14
Q

What do we call the quantity 1 − β?

A

Statistical power of the test (probability of rejecting the null hypothesis when it is false)

15
Q

What is a P value?

A

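The probability of obtaining a result at least as extreme as the one observed if the null hypothesis is true. It measures how much evidence we have against the null hypothesis: the smaller it is, the more evidence we have against the null.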

16
Q

What does P>0.05 mean?

A

That we cannot reject the null hypothesis as we have insufficient evidence to do so. This does NOT necessarily mean that the null hypothesis is true, only that there is insufficient evidence to reject it

17
Q

What is a paired t-test used to do?

A

It is used to determine whether there is a significant difference between paired observations, for example, before and after an intervention.
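
As a rough illustration, the paired t statistic is the mean of the within-pair differences divided by the standard error of those differences. The before/after readings below are invented for the example, and the critical value 2.776 is the standard two-sided 5% t value for 4 degrees of freedom.

```python
import math
import statistics

def paired_t_statistic(before, after):
    """t = mean(differences) / (SD(differences) / sqrt(n)) for paired observations."""
    differences = [b - a for b, a in zip(before, after)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sem_diff = statistics.stdev(differences) / math.sqrt(n)
    return mean_diff / sem_diff

# Hypothetical readings for 5 patients before and after an intervention
before = [140, 152, 138, 147, 160]
after = [135, 148, 136, 140, 151]

t_stat = paired_t_statistic(before, after)
critical_t = 2.776  # two-sided, alpha = 0.05, df = n - 1 = 4
print(f"t = {t_stat:.2f}, significant at 5%: {abs(t_stat) > critical_t}")
```

Working on the differences is what separates this from an independent-samples t-test: each subject acts as their own control.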

18
Q

What is the significance of a confidence interval that crosses 0?

A

If the 95% confidence interval for the difference between two means includes zero, then the hypothesis test WILL give a statistically non-significant result (p>0.05)

19
Q

What is a variable?

A

Any aspect of an individual that is measured or recorded

20
Q

What is a categorical variable?

A

Qualitative

21
Q

What is a numerical variable?

A

Quantitative

22
Q

What is the relationship between mean, median and mode in a symmetrical distribution?

A

They are approximately equal/very similar

23
Q

What percentage of observations are included in the following ranges of a normal distribution:
mean ±1, 2 and 3 standard deviations of the mean?
What is another name for these values?

A

Mean ± 1 SD: about 68%
Mean ± 2 SD: about 95%
Mean ± 3 SD: about 99.7%
These can also be referred to as the probability limits
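
These proportions can be reproduced from the standard normal distribution. This short sketch uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is just an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"mean ± {k} SD: {share:.1%}")
```

The results are roughly 68.3%, 95.4% and 99.7%. Exactly 95% corresponds to ±1.96 SD, which is where the 1.96 multiplier in the 95% confidence interval comes from.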

24
Q

What is a synonym for a sample statistic?

A

Point estimate

25
Q

What is the name given to the measurement of the degree of variation between means from repeated samples? How is this calculated?

A

Standard error of the mean

SEM = SD/sqrt(n)

26
Q

How is a 95% confidence interval calculated?

A

mean±1.96*SEM
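
A minimal worked example of both formulas (SEM = SD/sqrt(n) and mean ± 1.96 × SEM), using made-up measurements. Note that 1.96 is a large-sample multiplier; with a sample this small a t-based multiplier would normally be used, so the numbers here only illustrate the arithmetic.

```python
import math
import statistics

def mean_ci95(values):
    """Return (mean, SEM, (lower, upper)) using mean ± 1.96 * SEM."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # sample SD / sqrt(n)
    margin = 1.96 * sem
    return mean, sem, (mean - margin, mean + margin)

measurements = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
mean, sem, (low, high) = mean_ci95(measurements)
print(f"mean = {mean}, SEM = {sem:.3f}, 95% CI = ({low:.2f}, {high:.2f})")
```

For these values the mean is 5, the SEM is about 0.756, and the interval runs from about 3.52 to 6.48.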

27
Q

What is standard error?

A

A measurement of the precision of estimates. It shows how good the estimation of the mean is (SEM)

28
Q

What is standard deviation?

A

A measurement of the variability of distributions. It shows how widely scattered the measurements are (SD)

29
Q

When is a result inconclusive?

A

When a sample size is too small to declare an observed effect to be statistically significant

30
Q

When is a result imprecise?

A

If a small sample size results in wide confidence limits

31
Q

What does a t-test measure?

A

The difference between two means from independent samples of numerical variables

32
Q

What are the underlying assumptions of a valid t-test?

A
  • Sampled populations are normally distributed
  • Standard deviations are similar

33
Q

What is the non-parametric equivalent of a t-test and when is this used?

A
  • Wilcoxon rank-sum test (also known as the Mann-Whitney U test)
  • It is used when the data are clearly not normally distributed

34
Q

What does a Chi-squared test measure?

A

The relationship between two categorical variables, and the differences in proportions
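
To make the idea concrete, here is a sketch of the chi-squared statistic for a contingency table: the sum of (observed − expected)² / expected over all cells. The counts are invented, no continuity correction is applied, and 3.841 is the standard 5% critical value for 1 degree of freedom.

```python
def chi_squared_statistic(table):
    """Chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Hypothetical 2x2 table: rows = exposed / non-exposed, columns = disease / no disease
table = [[20, 30],
         [30, 20]]
chi2 = chi_squared_statistic(table)
critical_value = 3.841  # alpha = 0.05, df = (2 - 1) * (2 - 1) = 1
print(f"chi-squared = {chi2:.2f}, significant at 5%: {chi2 > critical_value}")
```

Here every expected count is 25, so the statistic is 4 × (5² / 25) = 4.0, just above the critical value.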

35
Q

What is the purpose of Pearson’s correlation?

A

To measure the degree of linear association between two numerical variables
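
A small sketch of how Pearson's correlation coefficient r is computed from paired numerical data. The x and y values are invented; r always falls between −1 and +1.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariation of x and y scaled by the variation in each."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]  # hypothetical paired measurements
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(f"r = {r:.3f}")
```

For these data r is about 0.775, a fairly strong positive association.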

36
Q

Binary variable

A

Categorical
Observations fall under one of two categories
e.g. exposed vs non-exposed

37
Q

Nominal

A

Categorical
Observations fall under more than two categories with no natural order
e.g. classification of disease; marital status

38
Q

Ordinal

A

Categorical
Observations fall under more than two categories which can be ordered
e.g. classification according to mild, moderate, severe

39
Q

Continuous data

A

Numerical
Data are measurements that can assume any value within a specified range
e.g. height, weight, blood pressure

40
Q

Discrete data

A

Numerical
Data are integers/counted numbers of events
e.g. number of births in a week, number of patients in a clinic

41
Q

Which summary statistics are reported if a data set is symmetrically distributed?

A

Number of observations
Mean
Standard deviation

42
Q

Which summary statistics are reported if a data set is skewed?

A

Number of observations
Median
Range
Interquartile range

43
Q

What is a population?

A

A group of individuals having certain common characteristics about which statistical inferences can be made

44
Q

What is a sample?

A

A subset of individuals selected from a defined population

45
Q

Benefits of sampling

A

Less time to collect and analyse data
Greater flexibility in data management and type of information that can be obtained
Reduce cost

46
Q

Random sampling

A

Each individual in the population has an equal chance of being included in the sample

47
Q

Stratified sampling

A

Stratified sampling is used when the population consists of distinct sub-groups or strata, which differ considerably with respect to the main feature under study

48
Q

Systematic sampling

A

A process where individuals are selected systematically throughout the series on the basis of a predetermined sampling fraction

49
Q

Cluster sampling

A

Study population is divided into clearly defined groups or clusters (example: street-blocks or areas around informal housing units)
A random sample of clusters is drawn
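
The four sampling methods can be compared side by side in a short sketch. Everything here is hypothetical: a population of 100 ID numbers, a sampling fraction of 1 in 10, two strata labelled "urban" and "rural", and clusters of 10 consecutive IDs. Including every individual from each chosen cluster is an assumption of this example.

```python
import random

rng = random.Random(42)           # fixed seed so the run is reproducible
population = list(range(1, 101))  # hypothetical individuals with IDs 1..100

# Random sampling: every individual has an equal chance of selection
simple = rng.sample(population, 10)

# Systematic sampling: random start, then every 10th individual (fraction 1/10)
step = 10
start = rng.randrange(step)
systematic = population[start::step]

# Stratified sampling: draw the same fraction separately from each stratum
strata = {"urban": population[:70], "rural": population[70:]}
fraction = 0.1
stratified = []
for members in strata.values():
    stratified.extend(rng.sample(members, round(len(members) * fraction)))

# Cluster sampling: draw a random sample of clusters, keep everyone in them
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
chosen_clusters = rng.sample(clusters, 3)
cluster_sample = [person for cluster in chosen_clusters for person in cluster]

print(len(simple), len(systematic), len(stratified), len(cluster_sample))
```

The stratified draw guarantees 7 urban and 3 rural individuals, whereas a simple random sample of 10 could by chance over- or under-represent either group.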