Statistical analysis Flashcards

1
Q

Differentiate nominal, ordinal, interval, and continuous data

A

Categorical
• Nominal: Values that fall into unordered categories. For example, gender, disease status (yes/no) and blood groups

• Ordinal: Categories that have an order. For example, pain scale (1-10), cancer staging (1-4)

Quantitative
• Interval (here meaning discrete counts): Restricted to specified whole-number values. For example, number of live births, number of people who attended dental clinics

• Continuous: Number along a continuous scale such as height, weight and periodontal pocket depth

2
Q

Be familiar with relative and cumulative frequencies

A
  • Frequency is the number of times an event occurs
  • Relative frequency: the number of times that the event occurs during experimental trials, divided by the total number of trials conducted. Reported as a proportion or a percentage

• Cumulative frequency: the running total of frequencies up to a particular value; used to determine the number of observations that lie at or below (or above) a particular value in a data set
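
The three definitions above can be sketched in a few lines of Python. This is a minimal illustration only: the pain scores and variable names are made up, not taken from the card.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data: pain scores (1-5) reported by 10 patients
pain_scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

counts = Counter(pain_scores)
total = len(pain_scores)

values = sorted(counts)                        # ordered categories: 1..5
frequencies = [counts[v] for v in values]      # how often each score occurs
relative = [f / total for f in frequencies]    # proportions that sum to 1
cumulative = list(accumulate(frequencies))     # running total up to each score

for v, f, r, c in zip(values, frequencies, relative, cumulative):
    print(f"score={v}  freq={f}  rel={r:.0%}  cum={c}")
```

The last cumulative frequency always equals the total number of observations, which is a quick check on the table.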

3
Q

Interpret measures of central tendency and variability; mean, median, mode, standard deviation, percentiles, interquartile range

A

Mean
• The arithmetic average: the sum of all values divided by the number of values
• It is greatly affected by outliers

Median
• The middle value of a distribution once the values are put in order
• Preferred when there are extreme values
• Largely unaffected by outliers

Mode
• The number that occurs most often in the data set

Standard deviation
• A measure of how spread out numbers are from the mean

Percentiles
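• The value below which a given percentage of observations fall (e.g. the 90th percentile is the value below which 90% of observations lie)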

Interquartile range
• Covers the middle 50% of the data
• IQR is Q3 minus Q1
• Tells how spread out the “middle” values are
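
As a rough sketch of how these measures behave, the snippet below uses Python's standard `statistics` module on made-up pocket depths with one deliberate outlier. The data and names are illustrative assumptions, and `statistics.quantiles` is only one of several quartile conventions, so other software may give slightly different Q1 and Q3.

```python
import statistics

# Hypothetical pocket depths in mm; the last value is a deliberate outlier
depths = [2, 3, 3, 4, 4, 4, 5, 5, 6, 14]

mean = statistics.mean(depths)        # pulled upwards by the outlier
median = statistics.median(depths)    # middle of the ordered values
mode = statistics.mode(depths)        # most frequent value
sd = statistics.stdev(depths)         # sample standard deviation (n - 1)

# Quartiles using the module's default "exclusive" method
q1, q2, q3 = statistics.quantiles(depths, n=4)
iqr = q3 - q1                         # spread of the middle 50%

print(f"mean={mean} median={median} mode={mode} sd={sd:.2f} IQR={iqr}")
```

Dropping the outlier changes the mean and standard deviation noticeably, while the median and mode stay put.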

4
Q

Understand the term confidence interval

A
  • A range of values within which we are fairly sure the true value lies
  • 95% confidence interval = range of values that you can be 95% confident contains the true population value
  • Large sample size gives a narrow 95% CI = precise estimate of effect
  • Small sample size gives a wide 95% CI = imprecise estimate of effect
  • If the confidence intervals of two samples overlap, the difference between them is usually NOT statistically significant (a rule of thumb, not a formal test)
  • If a confidence interval contains the no-effect value (e.g. 0 for a difference, 1 for a ratio), then the result is not statistically significant
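
The link between sample size and interval width can be sketched as follows. This is an approximation under stated assumptions: it uses the normal critical value 1.96 (a t critical value would be more appropriate for very small samples), and the measurements and the helper name `mean_ci95` are hypothetical.

```python
import math
import statistics

def mean_ci95(values):
    """Approximate 95% CI for a mean: mean +/- 1.96 x standard error."""
    n = len(values)
    m = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)   # standard error of the mean
    return m - 1.96 * se, m + 1.96 * se

small = [4, 6, 5, 7, 3]     # hypothetical measurements, n = 5
large = small * 20          # same values repeated, n = 100

low_s, high_s = mean_ci95(small)
low_l, high_l = mean_ci95(large)

print(f"n=5:   {low_s:.2f} to {high_s:.2f}")
print(f"n=100: {low_l:.2f} to {high_l:.2f}")
```

Both intervals are centred on the same mean, but the larger sample gives the narrower, more precise interval.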
5
Q

Interpret the meaning of prevalence

A
  • The number (or proportion) of affected people in a population at a given time
  • Affected/total population at that time
6
Q

Interpret incidence in terms of its meaning, how it is determined, cumulative incidence and incidence rate

A

Incidence
• Number of new cases of affected people

How it is determined
• This is determined by following at risk individuals for a period of time to see their transition into sickness
• Not at risk individuals are excluded from the study
• Since you have to follow someone into disease, you have to account for that time

Cumulative incidence
• Total number of new cases during a period/population at risk at the start of that period

Incidence rate
• Rate of occurrence of new cases over a given time
• Number of new cases/ person-time
• Each at-risk member contributes the time they were followed, up to diagnosis or the end of follow-up
• Person-time is the sum of total time contributed by all subjects.
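
The two incidence measures can be compared on a small made-up cohort. The follow-up data and variable names below are illustrative assumptions only.

```python
# Hypothetical cohort: (years followed, became a case?) for each at-risk person
follow_up = [(5, False), (2, True), (5, False), (3, True), (5, False)]

new_cases = sum(1 for _, is_case in follow_up if is_case)
at_risk_at_start = len(follow_up)
person_years = sum(years for years, _ in follow_up)   # total time contributed

cumulative_incidence = new_cases / at_risk_at_start   # 2 cases / 5 people
incidence_rate = new_cases / person_years             # 2 cases / 20 person-years

print(f"cumulative incidence = {cumulative_incidence:.2f}")
print(f"incidence rate = {incidence_rate:.2f} per person-year")
```

People who became cases stop contributing time at diagnosis, which is why the denominator is 20 person-years and not 25.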

7
Q

Understand the term “person-years”

A

1/35 person-years
• Means 1 new case per 35 person-years of observation
• 1 new case is expected if 35 people are followed for 1 year
• 1 new case is expected if 5 people are followed for 7 years
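
The arithmetic behind this card is simply rate multiplied by person-time. A minimal sketch, where the function name `expected_cases` is a hypothetical helper:

```python
RATE = 1 / 35   # new cases per person-year, as on the card

def expected_cases(people, years, rate_per_person_year=RATE):
    """Expected new cases = rate x person-time, where person-time = people x years."""
    return rate_per_person_year * people * years

print(expected_cases(people=35, years=1))   # 35 person-years of follow-up
print(expected_cases(people=5, years=7))    # also 35 person-years
```

Both scenarios add up to the same 35 person-years, so both are expected to produce about one new case.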

8
Q

State the importance of understanding prevalence and incidence

A
  • Prevalence helps determine needs of a community in treating that disease
  • Incidence helps understand the causes of disease and the effectiveness of prevention programs
9
Q

Interpret number needed to treat (NNT)

A

• Quantifies how many patients have to be given a new therapy for a particular duration so that one patient can benefit compared to giving another therapy
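
NNT is conventionally calculated as 1 divided by the absolute risk reduction (ARR), rounded up to a whole patient. The sketch below assumes made-up trial counts, and `number_needed_to_treat` is a hypothetical helper name; `Fraction` is used so the rounding is not disturbed by floating-point error.

```python
import math
from fractions import Fraction

def number_needed_to_treat(control_events, control_n, treated_events, treated_n):
    """NNT = 1 / ARR, rounded up to the next whole patient."""
    control_risk = Fraction(control_events, control_n)
    treated_risk = Fraction(treated_events, treated_n)
    arr = control_risk - treated_risk          # absolute risk reduction
    if arr <= 0:
        raise ValueError("no risk reduction, so NNT is not defined")
    return math.ceil(1 / arr)

# Hypothetical trial: 20/100 events with the old therapy, 15/100 with the new one
nnt = number_needed_to_treat(20, 100, 15, 100)
print(nnt)
```

With a 5 percentage point reduction in risk, 20 patients would need the new therapy for one additional patient to benefit.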

10
Q

Interpret Risk ratio and the values, including confidence intervals

A

Risk ratio
• Relative Risk: is a ratio of the probability of an event occurring in the exposed group versus the probability of the event occurring in the non-exposed group

RR >1: exposed group has higher risk of getting outcome
RR = 1: no difference
RR < 1: exposed group has lower risk of getting outcome

If the confidence interval contains 1 = no statistical significance (1 is the no-effect value for a ratio)
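
A risk ratio and an approximate 95% confidence interval can be worked out from a 2x2 table. The sketch below uses the common log-scale (Wald-type) approximation for the interval; the counts and the helper name `risk_ratio_ci` are illustrative assumptions.

```python
import math

def risk_ratio_ci(a, b, c, d, z=1.96):
    """Risk ratio with an approximate 95% CI computed on the log scale.

    a = exposed with outcome,   b = exposed without outcome
    c = unexposed with outcome, d = unexposed without outcome
    """
    rr = (a / (a + b)) / (c / (c + d))
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, low, high

# Hypothetical cohort: 30/100 exposed and 15/100 unexposed develop the outcome
rr, low, high = risk_ratio_ci(a=30, b=70, c=15, d=85)
significant = not (low <= 1 <= high)   # does the interval exclude 1?

print(f"RR={rr:.2f}, 95% CI {low:.2f} to {high:.2f}, significant={significant}")
```

Here the whole interval sits above 1, so the higher risk in the exposed group is statistically significant at the 5% level.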

11
Q

Interpret Risk difference (attributable risk and absolute risk reduction) and the values, including confidence intervals

A
  • Difference (subtraction) between the risk of an outcome in the exposed group and the unexposed group.
  • Attributable risk: higher risk in the exposed than non exposed
  • Absolute risk reduction: Lower risk in the exposed group than the non exposed (e.g. an intervention to prevent death from a disease)

0 = there is no difference in risk between the two groups

• If a confidence interval contains 0 = no statistical significance
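
A risk difference and its approximate 95% confidence interval can be sketched in the same way. This uses the simple normal (Wald) approximation; the counts and the helper name `risk_difference_ci` are illustrative assumptions.

```python
import math

def risk_difference_ci(exposed_events, exposed_n, unexposed_events, unexposed_n, z=1.96):
    """Risk difference (exposed minus unexposed) with an approximate 95% CI."""
    p1 = exposed_events / exposed_n
    p0 = unexposed_events / unexposed_n
    rd = p1 - p0
    se = math.sqrt(p1 * (1 - p1) / exposed_n + p0 * (1 - p0) / unexposed_n)
    return rd, rd - z * se, rd + z * se

# Hypothetical cohort: 30/100 exposed and 15/100 unexposed develop the outcome
rd, low, high = risk_difference_ci(30, 100, 15, 100)
significant = not (low <= 0 <= high)   # does the interval exclude 0?

print(f"RD={rd:.2f}, 95% CI {low:.3f} to {high:.3f}, significant={significant}")
```

A positive difference like this one is an attributable risk; a negative difference (lower risk in the exposed or treated group) would be read as an absolute risk reduction.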

12
Q

Interpret Odds Ratio (OR)

A
  • Odds of a disease occurring in one group compared to the odds of it occurring in another group
  • Odds compare events with non events. If a horse wins 2 out of every 5 races, its odds of winning are 2 to 3 (expressed as 2:3)

OR = 1: Exposure is not associated with the outcome
OR > 1: Exposure associated with higher odds of outcome
OR < 1: Exposure associated with lower odds of outcome
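
The horse-racing odds and an odds ratio from a 2x2 table can be sketched as follows. The table counts and function names are illustrative assumptions.

```python
def odds(events, non_events):
    """Odds compare events with non-events (not with the total)."""
    return events / non_events

def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table.

    a = exposed with outcome,   b = exposed without outcome
    c = unexposed with outcome, d = unexposed without outcome
    """
    return odds(a, b) / odds(c, d)

# Horse wins 2 of every 5 races: 2 wins to 3 losses
horse_odds = odds(2, 3)

# Hypothetical study: 30/100 exposed and 15/100 unexposed have the outcome
or_value = odds_ratio(30, 70, 15, 85)

print(f"horse odds = {horse_odds:.2f}, OR = {or_value:.2f}")
```

Note that the horse's probability of winning is 2/5 = 0.4, while its odds are 2/3, which is about 0.67; the two are not interchangeable.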

13
Q

Understand and interpret probability value (P value)

A
  • Null hypothesis: states that there is no real difference or association between the variables being investigated, and that any observed difference is due to chance
  • If P value is low ==> the null must go!
  • If p value is high ==> the null’s your guy!
  • Thus a small p-value indicates strong evidence against the null hypothesis
  • P value ≤ 0.05 = difference is considered to be statistically significant
  • P-value > 0.05 = Result is not statistically significant
14
Q

Understand the Chi-square (χ²) test

A
  • Only works for categorical data
  • Tests relationship between categorical data
  • Gives a ‘p’ value to help decide statistical significance
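
For a 2x2 table the test can be worked through by hand. The sketch below computes the Pearson chi-square statistic without a continuity correction, and gets the p-value from the fact that a chi-square variable with 1 degree of freedom is a squared standard normal. The counts and the helper name `chi_square_2x2` are illustrative assumptions.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value for a 2x2 table (1 df)."""
    observed = [[a, b], [c, d]]
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical table: exposed 30 with / 70 without, unexposed 15 with / 85 without
chi2, p_value = chi_square_2x2(30, 70, 15, 85)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

The p-value here falls below 0.05, so the association between exposure and outcome in this made-up table would be called statistically significant.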
15
Q

Understand T-test and ANOVA

A

T-tests use p-values to determine whether there is a statistically significant difference between the means of two groups. There are two types of t-tests:

Independent samples t-test
• compares the means of two unrelated groups

Dependent (paired) samples t-test
• compares means from the same group at different times (say, one year apart).

ANOVA:
• T tests can only test 2 means
• ANOVA can test more than 2 means
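
The two kinds of t-test differ only in how the test statistic is built. The sketch below computes the t statistics by hand (pooled-variance Student's t for unrelated groups, and a one-sample t on the differences for paired data); turning them into p-values needs the t distribution, for example via `scipy.stats.ttest_ind` and `scipy.stats.ttest_rel` if SciPy is available. The scores and function names are made up for illustration.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Student's t statistic for two unrelated groups (pooled variance)."""
    na, nb = len(group_a), len(group_b)
    va, vb = statistics.variance(group_a), statistics.variance(group_b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se

def paired_t(before, after):
    """t statistic for the same subjects measured twice (after minus before)."""
    diffs = [a - b for a, b in zip(after, before)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se

# Hypothetical scores for 5 patients at baseline and one year later
before = [5, 6, 7, 6, 8]
after = [4, 5, 5, 5, 6]

t_ind = independent_t(before, after)    # treats the two lists as unrelated groups
t_pair = paired_t(before, after)        # uses the within-patient differences

print(f"independent t = {t_ind:.2f}, paired t = {t_pair:.2f}")
```

On the same numbers the paired statistic is much larger in magnitude, because pairing removes the between-patient variation.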

16
Q

Understand the concept of linear regression

A

• When you fit a straight line through the data and use the independent variable (x) to predict the dependent variable (y)
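
A least-squares line can be fitted with a few lines of arithmetic. This is a minimal sketch: the ages, pocket depths and the helper name `fit_line` are made-up assumptions.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: age in years (independent) vs pocket depth in mm (dependent)
age = [20, 30, 40, 50, 60]
depth = [2.0, 2.5, 3.0, 3.5, 4.0]

slope, intercept = fit_line(age, depth)
predicted = intercept + slope * 45      # predicted depth for a 45-year-old

print(f"depth = {intercept:.2f} + {slope:.3f} * age; prediction at 45 = {predicted:.2f}")
```

The slope is the expected change in the dependent variable for a one-unit increase in the independent variable.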

17
Q

Define power

A
  • Power is the probability of rejecting the null hypothesis when it is false
  • Basically, correctly detecting an effect that really exists (power = 1 minus the probability of a Type 2 error)
  • Should be at least 80%
18
Q

Identify the factors and how they affect sample size and power

A

Factors that can affect power
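• Sample size: a larger sample size gives higher power
• Variability of observations (standard deviation): the lower the standard deviation, the higher the power
• Effect size: measures the strength of the relationship between two variables on a numeric scale; the larger the effect size, the higher the power
• Significance level: a higher (less strict) significance level gives higher power, at the cost of more Type 1 errors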
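
How each factor moves power can be explored with a normal-approximation sketch. It assumes a two-sided comparison of two means with equal group sizes and a known common standard deviation, and it ignores the negligible chance of rejecting in the wrong direction; the function name `approx_power` and the numbers are illustrative assumptions.

```python
import math
from statistics import NormalDist

def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test with equal group sizes."""
    z = NormalDist()                        # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)       # critical value, about 1.96 for alpha = 0.05
    se = sd * math.sqrt(2 / n_per_group)    # standard error of the difference in means
    return z.cdf(abs(effect) / se - z_crit)

base = approx_power(effect=1.0, sd=2.0, n_per_group=64)
print(f"baseline power         = {base:.2f}")
print(f"larger sample (n=100)  = {approx_power(1.0, 2.0, 100):.2f}")
print(f"more variability (sd=3)= {approx_power(1.0, 3.0, 64):.2f}")
print(f"larger effect (1.5)    = {approx_power(1.5, 2.0, 64):.2f}")
print(f"stricter alpha (0.01)  = {approx_power(1.0, 2.0, 64, alpha=0.01):.2f}")
```

The baseline scenario lands at roughly the conventional 80% target, and each line changes one factor at a time.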

19
Q

Understand Type 1 and Type 2 error

A
  • Type 1 error: rejecting a true null hypothesis (false positive)

  • Type 2 error: failing to reject a false null hypothesis (false negative)

20
Q

In terms of diagnostics, interpret “sensitivity”

A

• The probability that a test will indicate ‘disease’ among those with the disease

21
Q

In terms of diagnostics, interpret “specificity”

A

• The fraction of those without disease who will have a negative test result

22
Q

In terms of diagnostics, interpret “likelihood ratios”

A

Positive likelihood ratio:
• How likely a positive test result is in those who have the disease compared with those who are disease free
• Should be greater than 1 (the higher, the better the test rules disease in)

Negative likelihood ratio:
• How likely a negative test result is in those who have the disease compared with those who are disease free
• Should be lower than 1 (the lower, the better the test rules disease out)
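
Sensitivity, specificity and both likelihood ratios all come from the same 2x2 table of test results against true disease status. A minimal sketch, with made-up counts and a hypothetical helper name `diagnostic_summary`:

```python
def diagnostic_summary(tp, fn, fp, tn):
    """Sensitivity, specificity and likelihood ratios from a diagnostic 2x2 table.

    tp = diseased, test positive       fn = diseased, test negative
    fp = disease free, test positive   tn = disease free, test negative
    """
    sensitivity = tp / (tp + fn)                 # positives among the diseased
    specificity = tn / (tn + fp)                 # negatives among the disease free
    lr_pos = sensitivity / (1 - specificity)     # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity     # negative likelihood ratio
    return sensitivity, specificity, lr_pos, lr_neg

# Hypothetical test evaluated on 100 diseased and 100 disease-free people
sens, spec, lr_pos, lr_neg = diagnostic_summary(tp=90, fn=10, fp=20, tn=80)

print(f"sensitivity={sens:.2f} specificity={spec:.2f} "
      f"LR+={lr_pos:.2f} LR-={lr_neg:.3f}")
```

In this example a positive result is 4.5 times as likely in someone with the disease as in someone without it, and a negative result is about one eighth as likely.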