WEEK 1: INTERPRETING STATISTICAL RESULTS Flashcards

1
Q

Define: p-value

A

measures the probability of obtaining the observed results, assuming that the null hypothesis is true

Indicates whether there is a difference between groups and how likely that difference is due to chance

Example: a p-value less than 0.05 means there is a less than 5% chance the result occurred due to chance

A low or high p-value does not prove anything with regard to the effectiveness of an intervention

2
Q

Define effect size

Explain its use

Describe how it can be displayed

A

Effect size: statistic which estimates the magnitude of an effect (e.g. mean difference, regression coefficient, Cohen’s d, correlation coefficient)

Allows a relevant interpretation of the estimated magnitude (weak, moderate, or strong effect)

An effect size can be displayed both as an unstandardized and a standardized value

3
Q

Describe unstandardized and standardized effect sizes

A

If the original units of measurements are meaningful, the presentation of unstandardized effect statistics is preferable over that of standardized effect statistics

Examples: metres, degrees, velocity

Standardized effect statistics are always calculable if sample size and standard deviation are given along with unstandardized effect statistics

Examples: percentage, Cohen’s d, Hedges’ g

4
Q

Differentiate between minimal detectable change (MDC) and minimal clinically important difference (MCID)

A

Minimal Detectable Change (MDC): The minimum amount of change in a patient’s score that ensures the change isn’t the result of measurement error.
- Statistical significance

Minimal Clinically Important Difference (MCID): The smallest difference in score in the domain of interest which patients perceive as beneficial and which would mandate, in the absence of troublesome side effects and excessive cost, a change in the patient’s management.
- Clinical significance

5
Q

Define: Cohen’s d

A

Indicates the standardized difference between two means.

Can be used to quantify effect size

Small effect: d = 0.2 – 0.49
Medium effect: d = 0.5 – 0.79
Large effect: d ≥ 0.8
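
The definition above can be sketched in code. A minimal illustration of Cohen's d using a pooled standard deviation; the two samples are hypothetical, invented here for the example:

```python
import statistics

def cohens_d(group1, group2):
    """Standardized difference between two group means, using the pooled SD."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    # statistics.variance uses the sample (n - 1) denominator
    s1, s2 = statistics.variance(group1), statistics.variance(group2)
    pooled_sd = (((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)) ** 0.5
    return (m1 - m2) / pooled_sd

# Hypothetical outcome scores for two groups
control = [50, 52, 48, 51, 49]
treated = [55, 57, 54, 56, 58]
d = cohens_d(treated, control)  # well above 0.8, i.e. a large effect
```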

6
Q

Define: Pearson Correlation Coefficient

A

A correlation analysis provides a quantitative way of measuring the strength of a relationship between two variables

Perfect negative relationship: r = -1
No relationship: r = 0
Perfect positive relationship: r = +1

Interpreting effect size
* Small = ± 0.1
* Medium = ± 0.3
* Large = ± 0.5
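
A minimal sketch of computing Pearson's r directly from its definition (covariance over the product of the spreads); the data are invented for illustration:

```python
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]   # perfect positive linear relationship
r = pearson_r(x, y)    # r = 1.0
```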

7
Q

Define: intraclass correlation coefficient

A
  • Used to measure rater reliability for continuous data

Interpreting ICC scores
Poor < 0.5
Moderate = 0.5 to 0.75
Good = 0.75 to 0.9
Excellent > 0.9

8
Q

Define: variance

A

Provides an estimate of the degree of scatter of individual sample data points about the sample mean - how close each individual value is to the mean value

9
Q

Why are deviations from the mean squared?

A

Because individual data points fall above and below the mean, some deviations from the mean will be positive and some negative

To overcome this, the deviations are squared so that every term contributes a positive number

10
Q

State the population variance and sample variance equations and describe each variable

A
Population variance: σ² = Σ(Xi − μ)² / N
Sample variance: s² = Σ(Xi − X̄)² / (n − 1)

  • μ = population mean
  • X̄ = sample mean
  • Xi = each individual data point
  • N = total number of data points in a population
  • n = total number of data points in a sample
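
The two equations can be sketched side by side; the only difference is the denominator (N for a full population, n − 1 for a sample). The data set is invented for the example:

```python
def population_variance(data):
    """sigma^2 = sum((Xi - mu)^2) / N  (divide by N for a full population)."""
    mu = sum(data) / len(data)
    return sum((x - mu) ** 2 for x in data) / len(data)

def sample_variance(data):
    """s^2 = sum((Xi - Xbar)^2) / (n - 1)  (n - 1 corrects bias in a sample)."""
    xbar = sum(data) / len(data)
    return sum((x - xbar) ** 2 for x in data) / (len(data) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]     # mean = 5; squared deviations sum to 32
pop_var = population_variance(data)  # 32 / 8 = 4.0
samp_var = sample_variance(data)     # 32 / 7 ≈ 4.57
```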
11
Q

Define: range

A

Difference between the maximum and minimum values

12
Q

Define: standard deviation

A

The square root of the variance

13
Q

Define: interquartile range

A

Range of the middle 50% of the data.

14
Q

Define: coefficient of variation

State equation

Explain usage

A

Coefficient of variation: the ratio of the standard deviation to the mean; shows the extent of variability in relation to the mean

CV = (SD / mean) × 100
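
As a quick sketch of the equation, with a small invented data set:

```python
import statistics

def coefficient_of_variation(data):
    """CV = (SD / mean) * 100, expressed as a percentage of the mean."""
    return statistics.stdev(data) / statistics.mean(data) * 100

# Hypothetical heights in cm: mean 170, sample SD ≈ 1.58
heights = [170, 172, 168, 171, 169]
cv = coefficient_of_variation(heights)  # ≈ 0.93%
```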

15
Q

Define: standard error of the mean (SEM)

State the equation

A

Standard Error of the Mean: estimates the precision or reliability of a sample, as it relates to the population from which the sample is drawn

SEM does not provide an estimate of the scatter of sample data about the sample mean, and should not be used as such

The accuracy of each sample is determined by the number of observations, therefore error decreases as sample size increases

SEM = s / √N

where s = standard deviation

N = sample size
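
A minimal sketch of the equation; the sample values are invented. Repeating the same spread of data with a larger N shows the shrinking error described above:

```python
import statistics

def standard_error_of_mean(sample):
    """SEM = s / sqrt(N), where s is the sample standard deviation."""
    return statistics.stdev(sample) / len(sample) ** 0.5

sample = [12, 14, 11, 13, 15, 13, 12, 14]
sem = standard_error_of_mean(sample)
# Quadrupling the number of observations (same spread) shrinks the SEM,
# illustrating that error decreases as sample size increases
sem_big = standard_error_of_mean(sample * 4)
```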

16
Q

Define: confidence intervals

State equation

A

Range of values that encompasses the actual population or ‘true’ value, with a given probability. The width of the interval indicates the precision of the estimate (effect size): the wider the interval, the less precise the estimated effect size. A study with a small sample size will have greater random error, leading to a wider interval

CI = mean ± (critical value × SEM), where the critical value is taken from the z- or t-distribution for the chosen alpha level and number of tails
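
A minimal sketch of the equation using a two-tailed z critical value (so it assumes the z-distribution is appropriate; a t critical value would be used when the population variance is unknown, as the later card on z vs t distributions notes). The sample is invented:

```python
import statistics
from statistics import NormalDist

def confidence_interval(sample, confidence=0.95):
    """mean ± (z critical value × SEM), two-tailed."""
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / len(sample) ** 0.5
    # Two-tailed: split alpha across both tails, so alpha/2 in each
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # ≈ 1.96 for 95%
    return mean - z * sem, mean + z * sem

sample = [10, 12, 11, 13, 9, 11, 12, 10]
lo, hi = confidence_interval(sample)  # interval around the mean of 11
```

Raising the confidence level to 99% widens the interval, matching the alpha-level card below.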

17
Q

Define: alpha levels

A
  • The probability the researcher is willing to accept that the findings are a result of sampling error
  • It is determined by the level of confidence the researcher decides to use, typically 90%, 95%, or 99%
  • The greater the confidence level, the wider the interval
  • Sample size can also affect the width of the interval, with smaller
    sample sizes leading to wider intervals
  • A confidence level of 95% corresponds to an alpha level of 0.05
18
Q

Define: tails

A
  • The number of tails depends on the question the researcher is asking
  • Two tails are used if the researcher would like to know if the results of
    an intervention differ from a control or alternate group
  • It is common to use 2 tails in intervention-based research
  • If we are certain the results will only go in one direction, you can use one tail (example: you’re certain the results from the test will only be positive)
19
Q

When do you use a z distribution and t distribution

A
  • The z-distribution is used when the variance of the population is known
    In some cases, an author may choose to use the z-distribution if the sample size is greater than 30
  • The t-distribution should be used when the true variance is not known and has been estimated from the sample. With larger sample sizes, a t-distribution value will become similar to that of a z-distribution
20
Q

Define: content validity

A

assesses whether a test is representative of all aspects of the construct (is the test giving you the information important to you?)

21
Q

Give an example of poor content validity

A

A measure designed to assess lower extremity functional status that inquires about walking 2 blocks and nothing more does not possess content validity

Many other important ambulatory activities (negotiating stairs, running, rising from sitting, etc.) were not included.

22
Q

Define: construct validity

A

Refers to whether you can draw inferences about test scores related to the concept being studied.

For example, if a person has a high score on a survey that measures anxiety, does this person truly have a high degree of anxiety?

Graph example from class. A new physiotherapy assessment tool has very close score values to a gold standard assessment

23
Q

How can we demonstrate construct validity?

A

Homogeneity: The instrument measures one construct. The construct must be well defined

Convergence: this occurs when the instrument measures concepts similar to that of other instruments.

Theory evidence: This is evident when behaviour is similar to theoretical propositions of the construct measured in the instrument
- For example, when an instrument measures anxiety, one would expect to see that participants who score high on the instrument for anxiety also demonstrate symptoms of anxiety in their day-to-day lives

24
Q

Define: criterion validity

A

evaluates how accurately a test measures the outcome it was designed to measure.

Correlations can be conducted to determine the extent to which the different instruments measure the same variable.

25
Q

Define some measurements of criterion validity

A

convergent validity: shows that an instrument is highly correlated with instruments measuring similar variables

divergent validity: shows that an instrument is poorly correlated to instruments that measure different variables.

predictive validity: means that the instrument should have high correlations with future criteria. For example, a score of high self-efficacy related to performing a task should predict the likelihood of a participant completing the task.

26
Q

Define: inter-rater reliability

A

reflects the variation between 2 or more raters who measure the same group of subjects.

27
Q

Define: intra-rater reliability

A

It reflects the variation of data measured by 1 rater across 2 or more trials.

28
Q

Define: test-retest reliability

A

It reflects the variation in measurements taken by an instrument on the same subject under the same conditions. It is generally indicative of reliability in situations when raters are not involved or the rater effect is negligible, such as a self-report survey instrument

29
Q

Define: absolute reliability

A

quantifies random measurement error in the same units as the original measurement

  • Thus, measurement error for range of motion is reported in degrees; measurement error for the 6-minute walk test is stated in meters; and measurement error for the LEFS is expressed in LEFS points

Absolute reliability provides an estimate of the variability among multiple measurements for a truly stable or unchanged patient
* As such it comments on the consistency aspect of reliability.

30
Q

Define: relative reliability

A

Relative reliability examines the extent to which a measure is capable of differentiating among the objects of measurement

Example: differentiating between a grade 1 and grade 2 ligament sprain

31
Q

Define: sensitivity

A
  • The ability of a test to reliably detect the presence of disease (positivity in disease).
  • Less likely to miss the condition if one is present
  • Sensitivity = True Positive / (True Positives + False Negatives)
  • High sensitivity = low false negatives

Example: the ability of an X-ray to detect stress fractures - low sensitivity

Sensitivity is higher on a bone scan than on an X-ray

32
Q

Define: specificity

A
  • The ability of a test to reliably detect the absence of a specific disease (negativity in health)
  • Less likely to say the patient has the condition when they really do not
  • High specificity means you won’t waste resources
  • Low specificity, likely to send for imaging when not needed
  • Specificity = True Negatives / (True Negatives + False Positives)
  • High specificity = low false positives

Specificity is higher on an X-ray than on a bone scan

33
Q

Can you use sensitivity and specificity to predict whether an individual is diseased or disease free?

A

No - Sensitivity and specificity are merely properties of a test. Sensitivity and specificity should not be used to make predictive statements about an individual patient.

34
Q

Define: positive predictive value (PPV)

A
  • A positive predictive value (PPV) is useful to indicate the proportion of individuals who actually have the disease when the diagnostic test indicates the presence of that disease
  • PPV = True Positive / (True Positive + False Positive)
  • PPV = True Positive / All Positive Results
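
The sensitivity, specificity, and PPV formulas from the cards above can be sketched together from the four cells of a 2×2 diagnostic table; the counts here are invented for illustration:

```python
def diagnostic_stats(tp, fp, tn, fn):
    """Sensitivity, specificity, and PPV from 2x2 diagnostic-test counts."""
    sensitivity = tp / (tp + fn)   # positivity in disease
    specificity = tn / (tn + fp)   # negativity in health
    ppv = tp / (tp + fp)           # proportion of positive results truly diseased
    return sensitivity, specificity, ppv

# Hypothetical counts for a diagnostic test
sens, spec, ppv = diagnostic_stats(tp=90, fp=20, tn=180, fn=10)
# sens = 90/100 = 0.9, spec = 180/200 = 0.9, ppv = 90/110 ≈ 0.82
```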
35
Q

Define: negative predictive value (NPV)

A