Population Skills - Stats Flashcards

Question 1

Q

What is the difference between prevalence & incidence? How are each of them expressed?

Answer

A

Prevalence describes the frequency of a disease, i.e., how common it is in a population at a given time. It is expressed as a proportion, usually in percentage terms.

Incidence describes the number of new cases arising over a certain time period, so it always has a time element and is not expressed as a simple percentage. In short, it is a rate.
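The two measures can be sketched as simple arithmetic. This is only an illustration: the function names and the herd figures below are made up, not taken from any real data set.

```python
def prevalence(existing_cases, population):
    """Proportion of the population that has the disease at one point in time."""
    return existing_cases / population


def incidence_rate(new_cases, animal_time_at_risk):
    """New cases per unit of animal-time at risk (e.g. per cow-year)."""
    return new_cases / animal_time_at_risk


# Made-up herd: 30 of 200 cows are lame at today's visit.
print(f"Prevalence: {prevalence(30, 200):.1%}")
# Made-up follow-up: 12 new cases observed over 150 cow-years at risk.
print(f"Incidence: {incidence_rate(12, 150):.2f} new cases per cow-year")
```

Note that the incidence figure only makes sense together with its time unit, whereas prevalence is a plain proportion.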

Question 2

Q

How can bias or selection bias be minimised?

Answer

A

Random sampling.

Question 3

Q

What is the difference between continuous variables and discrete variables?

Answer

A

Continuous values: can take any value within a range, including non-integer (i.e., fractional) values; only numerical variables can be continuous.

Discrete values: values change in steps (when only integers make sense), e.g., there cannot be 1.5 puppies.

* NB In an average, a value such as 1.5 puppies does make sense, and can be treated as continuous data.

Question 4

Q

What is the difference between nominal and ordinal qualitative variables?

Answer

A

Nominal: the order of categories isn’t important, e.g., breed

Ordinal: there is some intrinsic order, e.g., small, medium, large

Question 5

Q

How do you test the relationship between TWO CONTINUOUS VARIABLES?

E.g., Age of Cow & Milk Yield

Answer

A

If parametric:

Pearson’s correlation test

If not parametric (i.e., if at least one set of continuous data is skewed):

Spearman’s rank correlation test
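As a rough sketch of how the two coefficients relate: Spearman's coefficient is Pearson's coefficient applied to the ranks of the data. The helper names and the cow data below are hypothetical, and only the coefficients are computed, not their p-values.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Made-up data: age of cow (years) and milk yield (litres per day).
age = [2, 3, 4, 5, 6]
milk_yield = [20, 24, 27, 29, 30]
print("Pearson r:", round(pearson_r(age, milk_yield), 3))
print("Spearman rho:", round(spearman_rho(age, milk_yield), 3))
```

Because the made-up yield rises with every extra year but not in a straight line, the rank-based coefficient comes out higher than Pearson's.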

Question 6

Q

How do you test for association between TWO CATEGORICAL variables? E.g., sheep breed and colour?

Answer

A

Chi-squared test for significance, i.e., to determine whether the NUMBERS OF OBSERVATIONS in each category differ from what you would EXPECT if the two variables were unrelated.

But:

If ANY of the EXPECTED VALUES is less than 5, use instead:

Fisher’s exact test

Otherwise, report the p-value from the chi-squared calculation, which compares observed with expected counts.
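The decision rule can be sketched for a 2x2 table as follows. This is a minimal illustration under stated assumptions: the function names and counts are made up, the chi-squared p-value shortcut is valid only for 1 degree of freedom (a 2x2 table), and no continuity correction is applied.

```python
from math import comb, erfc, sqrt


def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_squared_2x2(table):
    """Chi-squared statistic and p-value for a 2x2 table (1 degree of freedom)."""
    expected = expected_counts(table)
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    # With 1 degree of freedom the upper tail equals erfc(sqrt(stat / 2)).
    return stat, erfc(sqrt(stat / 2))


def fisher_exact_2x2(table):
    """Two-sided Fisher's exact p-value for a 2x2 table."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(row1 + row2, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= observed * (1 + 1e-9))


def association_p_value(table):
    """Apply the rule of thumb: Fisher if any expected count is below 5."""
    if any(e < 5 for row in expected_counts(table) for e in row):
        return "Fisher's exact test", fisher_exact_2x2(table)
    return "Chi-squared test", chi_squared_2x2(table)[1]


# Made-up counts: rows are two sheep breeds, columns are white / black.
print(association_p_value([[30, 10], [20, 40]]))
print(association_p_value([[3, 1], [1, 3]]))
```

The first table has expected counts of 20 and 30, so the chi-squared route is taken; the second has expected counts of 2 in every cell, so the sketch falls back to Fisher's exact test.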

Question 7

Q

How do you test for association / significance in PAIRED data between two categorical variables, i.e., if you have before-and-after data from the same set of animals?

Answer

A

Don’t use Fisher’s Exact Test. Rather, use:

McNemar’s Test

This is applied to 2x2 tables with a dichotomous trait (i.e., two possible classes), with matched pairs of subjects.

Report the p-value using McNemar’s Test (not chi-squared).
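A minimal sketch of the exact (binomial) form of McNemar's test is given here. It assumes the 2x2 paired table has already been reduced to its two discordant counts; the function name and the example counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts.

    b = pairs that changed from positive to negative,
    c = pairs that changed from negative to positive.
    Pairs that did not change carry no information for this test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Under the null hypothesis each discordant pair is a fair coin flip.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up before-and-after data: 1 animal went positive -> negative,
# 9 animals went negative -> positive.
print("p =", mcnemar_exact(1, 9))
```

Only the animals whose status changed enter the calculation, which is why an unpaired test on the same table would give a misleading answer.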

Question 8

Q

How do you test for associations between one continuous variable and one categorical variable with two (binary) categories?

This is YOUR RP1!

Answer

A

If the data are parametric, i.e., approximately normally distributed (check with a histogram), then:

T-test for p-value

If non-parametric:

Use Mann-Whitney U Test
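The two test statistics can be sketched as below. This is an illustration only: the function names and data are made up, the t-test shown is the pooled-variance (Student's) form, and the p-values, which would come from the t distribution or from U tables in a statistics package, are not computed.

```python
from math import sqrt
from statistics import mean, variance


def students_t(a, b):
    """Pooled-variance two-sample t statistic (assumes equal variances)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def mann_whitney_u(a, b):
    """Mann-Whitney U: the smaller of the two pairwise 'win' counts."""
    # Count pairs where a value from group a beats one from group b;
    # ties count as half a win.
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)


# Made-up measurements for two groups.
group_1 = [1, 2, 3]
group_2 = [4, 5, 6]
print("t =", round(students_t(group_1, group_2), 3))
print("U =", mann_whitney_u(group_1, group_2))
```

The t statistic uses the actual values, so it is sensitive to skew and outliers, while U depends only on the ordering of the two groups.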

Question 9

Q

How do you test the significance between one continuous variable and a categorical variable with MORE THAN TWO categories?

E.g., does anaesthetic recovery time differ between small, medium and large dogs?

Recovery time is continuous

Small, Medium & Large are categorical (& they are also ordinal categorical since it makes sense to put these into ascending or descending order)

Answer

A

Parametric:

ANOVA

Non-parametric:

Kruskal-Wallis
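A rough sketch of both test statistics for three groups follows. The function names and recovery times are made up; the Kruskal-Wallis version uses average ranks but applies no tie correction, and p-values (from the F and chi-squared distributions respectively) are left to a statistics package.

```python
def anova_f(groups):
    """One-way ANOVA F statistic: between-group / within-group mean square."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (n_total - len(groups))
    return ms_between / ms_within


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic on pooled ranks (no tie correction)."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)

    def rank(v):
        # Average rank: values below v, plus the midpoint of any ties.
        less = sum(1 for w in pooled if w < v)
        equal = sum(1 for w in pooled if w == v)
        return less + (equal + 1) / 2

    total = sum(sum(rank(v) for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Made-up recovery times (minutes) for small, medium and large dogs.
small, medium, large = [10, 12, 14], [15, 17, 19], [20, 22, 24]
print("F =", anova_f([small, medium, large]))
print("H =", round(kruskal_wallis_h([small, medium, large]), 3))
```

A significant result from either test says only that at least one group differs; it does not say which one.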

Question 10

Q

What statistical method do you use in survival analysis in a clinical trial to figure out how many subjects you expect to be left at the end of treatment or intervention?

Answer

A

Kaplan-Meier Method aka Kaplan Meier Estimator

The Kaplan–Meier estimator, also known as the product limit estimator, estimates the survival function from lifetime data. In medical research, it is often used to measure the fraction of patients living for a certain amount of time after treatment.

A plot of the Kaplan–Meier estimate of the survival function is a series of horizontal steps of declining magnitude which, when a large enough sample is taken, approaches the true survival function for that population. The value of the survival function between successive distinct sampled observations (“clicks”) is assumed to be constant.
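The product limit calculation can be sketched as follows. This is a minimal illustration, not a full implementation: the function name and follow-up data are hypothetical, and subjects censored at the same time as an event are counted as still at risk for that event.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times[i]  = follow-up time of subject i,
    events[i] = True if the event (e.g. death) was observed,
                False if the subject was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e)
        leaving = sum(1 for tt, _ in data if tt == t)
        if deaths:
            # Multiply by the fraction of at-risk subjects surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
        i += leaving
    return curve


# Made-up follow-up (months); False marks a censored subject.
times = [2, 3, 3, 5, 8]
events = [True, True, False, True, False]
for t, s in kaplan_meier(times, events):
    print(f"S({t}) = {s:.2f}")
```

Each event time produces one downward step, and censored subjects shrink the at-risk group without causing a step of their own.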

Question 11

Q

What is the logrank test used for in statistical analysis of outcomes in a clinical trial?

Answer

A

In statistics, the logrank test is a hypothesis test to compare the survival distributions of two samples.

It is a nonparametric test and appropriate to use when the data are right skewed and censored (technically, the censoring must be non-informative). It is widely used in clinical trials to establish the efficacy of a new treatment compared to a control treatment when the measurement is the time to event (such as the time from initial treatment to a heart attack).

The test is sometimes called the Mantel–Cox test, named after Nathan Mantel and David Cox. The logrank test can also be viewed as a time stratified Cochran–Mantel–Haenszel test.
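A minimal sketch of the two-group logrank statistic is shown below. It assumes the usual hypergeometric form, in which observed and expected events in one group are compared at every event time and the result is referred to a chi-squared distribution with 1 degree of freedom. The function name and data are made up, and the sketch does not guard against the case where the variance is zero.

```python
from math import erfc, sqrt


def logrank(times_a, events_a, times_b, events_b):
    """Logrank chi-squared statistic and p-value for two groups."""
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_a = expected_a = var = 0.0
    for t in event_times:
        # Subjects still under observation just before time t.
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            var += n_a * n_b * d * (n - d) / (n * n * (n - 1))
    stat = (observed_a - expected_a) ** 2 / var
    # Upper tail of chi-squared with 1 degree of freedom.
    return stat, erfc(sqrt(stat / 2))


# Made-up survival times (months); every subject had the event.
control = [1, 2, 3]
treated = [4, 5, 6]
stat, p = logrank(control, [True] * 3, treated, [True] * 3)
print(f"chi-squared = {stat:.3f}, p = {p:.4f}")
```

Censored subjects would be passed with `False` in the events list; they count towards the at-risk totals until their censoring time but never towards the observed events.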