oral exam Flashcards

1
Q

Explain the difference between a risk ratio and an odds ratio.

A

A risk ratio (RR) compares the risk (probability) of developing a condition (disease) in the exposed group with the risk in the control group. Risk is calculated from the row totals of the 2 x 2 table: ARe = A/(A+B) and ARc = C/(C+D)
RR = ARe/ARc

An odds ratio (OR) compares the odds of having the condition (the number with the condition divided by the number without it) between the two groups: odds in exposed = A/B and odds in control = C/D
OR = (A/B)/(C/D) = AD/BC
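The two formulas can be checked with a short sketch. The cell layout (A, B = exposed with/without the condition; C, D = control with/without) follows the card; the counts and the function name are hypothetical and only for illustration.

```python
def risk_and_odds_ratios(a, b, c, d):
    """Return (risk ratio, odds ratio) for a 2 x 2 table.

    a = exposed with condition,  b = exposed without condition
    c = control with condition,  d = control without condition
    """
    risk_exposed = a / (a + b)      # ARe = A/(A+B)
    risk_control = c / (c + d)      # ARc = C/(C+D)
    risk_ratio = risk_exposed / risk_control
    odds_ratio = (a * d) / (b * c)  # (A/B)/(C/D) = AD/BC
    return risk_ratio, odds_ratio


# Hypothetical counts, not taken from any study.
rr, odds = risk_and_odds_ratios(20, 80, 10, 90)
print(f"RR = {rr:.2f}, OR = {odds:.2f}")
```

With these made-up counts the risk ratio is 2.00 and the odds ratio is 2.25, so the two measures are close but not identical.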

2
Q

Name one type of model you could run if you were comparing two groups and your dependent variable was dichotomous.

A

A chi square test will compare two dichotomous variables

A statistical model that can compare dichotomous outcome variables is a logistic regression

3
Q

What is the difference between ratios of relative risk and absolute risk? Why is the latter important to always report when possible?

A

Absolute risk is the likelihood of developing the condition under a specific set of circumstances.
Relative risk is the risk that the experimental group incurred in comparison to the control group.
Comparing the relative (increased) risk back to the baseline absolute risk puts the change in risk in proper proportion. If the baseline incidence of a condition is very low, even a high relative risk will only shift the absolute risk by a small amount. For example, if the baseline AR is 6% and the relative risk increase is 50%, the new risk is 9%; whereas if the baseline AR is 20% with the same 50% relative increase, the new risk is 30%.
RR= ARe/ARc
where ARe= A/(A+B) and ARc = C/(C+D)
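A minimal sketch of why the baseline matters: the same relative increase is applied to two baseline risks. The function name is hypothetical; the 6% and 20% baselines and the 50% relative increase are the illustrative figures from this card.

```python
def new_absolute_risk(baseline_risk, relative_increase):
    """Absolute risk after applying a relative increase to a baseline risk."""
    return baseline_risk * (1 + relative_increase)


for baseline in (0.06, 0.20):
    new_risk = new_absolute_risk(baseline, 0.50)
    change = new_risk - baseline
    print(f"baseline {baseline:.0%} -> new risk {new_risk:.0%} "
          f"(absolute change {change:.0%})")
```

The relative change is identical in both cases, but the absolute change is about 3 percentage points for the low baseline and about 10 for the high one.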

4
Q

In a clinical trial with more than one time-point (e.g. baseline, 6 weeks, 6 months), what does it mean to adjust for multiple comparisons? What is the name of a test you could use to adjust for multiple comparisons?

A

Each time the analysis is run, in this case for different time points, the likelihood of a type I error (false positive finding) increases, so a correction is applied to account for this. A conservative test such as the Bonferroni correction will adjust for it.
Conservative tests for confirmatory trials; liberal tests for exploratory trials
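As a rough sketch of the Bonferroni idea, the significance threshold is divided by the number of comparisons. The p-values below are hypothetical, and the function name is made up for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the Bonferroni-adjusted alpha and a significance flag per test."""
    m = len(p_values)            # number of comparisons
    adjusted_alpha = alpha / m   # stricter per-comparison threshold
    flags = [p <= adjusted_alpha for p in p_values]
    return adjusted_alpha, flags


# Hypothetical p-values for three pairwise time-point comparisons.
p_values = [0.030, 0.012, 0.200]
adjusted_alpha, significant = bonferroni(p_values)
print(f"adjusted alpha = {adjusted_alpha:.4f}", significant)
```

Note that 0.030 would pass an unadjusted 0.05 threshold but does not pass the adjusted one, which is what makes the correction conservative.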

5
Q

If two therapists are independently measuring 100 knees to determine if each knee maximum flexion ROM is either above or below 90 degrees, and the agreement between them is equal to a Kappa of 0.92, what does this mean?

A

Kappa is a measure of agreement between raters that takes into account only the agreement beyond chance. In this case, because there are two raters, the statistic used is Cohen's kappa; for 3 or more raters, Fleiss' kappa is used (a weighted kappa is for ordinal categories). 0.92 means almost perfect agreement.

0.81-1.00 almost perfect agreement
0.61-0.80 substantial
0.41-0.60 moderate
0.21-0.40 fair
0.01-0.20 slight
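A sketch of how Cohen's kappa is computed for two raters and a yes/no rating. The counts are hypothetical, chosen only so that the result matches the 0.92 in the question; the function and argument names are illustrative.

```python
def cohens_kappa(both_above, a_above_b_below, a_below_b_above, both_below):
    """Cohen's kappa for two raters classifying items as above/below a cutoff."""
    n = both_above + a_above_b_below + a_below_b_above + both_below
    observed = (both_above + both_below) / n          # raw agreement
    a_above = (both_above + a_above_b_below) / n      # rater A's "above" rate
    b_above = (both_above + a_below_b_above) / n      # rater B's "above" rate
    # Agreement expected by chance alone.
    expected = a_above * b_above + (1 - a_above) * (1 - b_above)
    return (observed - expected) / (1 - expected)


# Hypothetical counts for 100 knees: the raters disagree on 4 of them.
kappa = cohens_kappa(48, 2, 2, 48)
print(f"kappa = {kappa:.2f}")
```

Here the raw agreement is 96%, but half of that would be expected by chance, so kappa is 0.92.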
6
Q

When you have missing data in a trial, you want to know if the data is missing at random (or if there is a specific pattern). You run a Little's Missing Completely at Random (MCAR) test which gives you a value of P=0.382. What do you infer from these results?

A

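Missing completely at random means that there is no relationship between the missingness and any of the values, observed or missing. The null hypothesis of Little's test is that the data are MCAR. With p = 0.382 (> .05) we fail to reject the null hypothesis, so the data are consistent with being missing completely at random.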

7
Q

If you are comparing costs of care between two treatment groups in a trial, which is likely the better model to use and why? A T-Test or a generalized linear model?

A

A generalized linear model is more flexible in that the data do not have to be normally distributed, and cost data are unlikely to be so (costs are typically right-skewed). A generalized linear model with a gamma distribution (scale data) is more appropriate for cost.

8
Q

When considering the types of patients you want to include in your study, you have to think about heterogeneity in your design and in your statistical plan. What does this mean?

A

Heterogeneity refers to differences between the groups. From a study design perspective, sound research design in setting up study conditions and randomization can minimize this.
Statistically, this is measured with Levene's test for homogeneity of variance. The null hypothesis is that there is no difference between groups (variance is equal), so p > .05 means equal variances can be assumed.

9
Q

What is a confidence interval?

A

The range of values within which we have a stated level of statistical confidence that the true value lies. For example, a 95% CI means that if the study were repeated 100 times, about 95 of the resulting intervals would contain the true value.
A CI estimates the precision of the estimate; a narrower CI means a more precise estimate.
It can also show the magnitude of the difference between groups.
A CI that crosses the null value (0 for differences, 1 for ratios) is not statistically significant.
If the groups' CIs overlap by < 25%, the difference is probably significant.
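A minimal sketch of a 95% CI for a mean, using the normal approximation (mean plus or minus about 1.96 standard errors). This is an assumption for simplicity; with a sample this small a t-based interval would be somewhat wider. The measurements and the function name are hypothetical.

```python
import math
import statistics


def mean_ci_95(values):
    """Approximate 95% CI for the mean using the normal distribution."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))  # standard error
    z = statistics.NormalDist().inv_cdf(0.975)               # about 1.96
    return mean - z * sem, mean + z * sem


# Hypothetical knee flexion measurements in degrees.
flexion = [88, 92, 95, 90, 85, 91, 94, 89]
lower, upper = mean_ci_95(flexion)
print(f"mean = {statistics.mean(flexion):.1f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

A larger sample shrinks the standard error, which is why bigger studies tend to report narrower intervals.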

10
Q

What does it mean when the confidence interval crosses 0 or 1?

A

For ratios such as the odds ratio or risk ratio, a CI that crosses 1 means no significant difference.
For difference measures (e.g., a difference in means), a CI that crosses 0 means no significant difference.

11
Q

When you are assessing predictor variables for your prediction model, you have to consider multicollinearity between the variables. What does this mean?

A

Multicollinearity means that the independent (predictor) variables are highly correlated with each other; it is assessed in models with multiple predictors, such as multiple linear regression.

r value is the correlation statistic–> .2
no more than 10% increase in st. error as new variable added to the equation

12
Q

What does it mean when you have wide confidence intervals for the treatment effect?

A

A wide confidence interval indicates less precision and greater uncertainty in the results. This may be due to a small sample size.
In a meta-analysis, it can indicate greater heterogeneity among studies.
If the CI includes the other group's mean (or the null value), the result is not significant.

13
Q

Why is it better to see the confidence intervals of a point estimate than it is to see the p- value?

A

p values can only indicate whether there is a statistically significant difference. A CI can identify

  1. significance (whether it crosses 0 or 1)
  2. direction
  3. size of the effect and its precision (width).
14
Q

For randomized clinical trials, why is it controversial to put the p-value for the differences between baseline variables in your Table 1?

A

Some journals require it and others don't. The argument against is that proper randomization will decrease variability between groups, so any differences that remain are due to chance; and even if there is a difference, it does not change how the study is conducted.

15
Q

What is the difference between prediction and causation?

A

Prediction studies

In a prediction analysis, the goal is to develop a formula for making predictions about the dependent variable, based on values of the independent variables (e.g., regression studies).
In a causation analysis, the independent variables are regarded as causes of the dependent variable (e.g., RCTs).

Causation indicates that one event is the result of the occurrence of the other event; i.e. there is a causal relationship between the two events. Also referred to as cause and effect. In a causal analysis, the independent variables are regarded as causes of the dependent variable. (ADDED 8/11/2020: RCT’s aim to prove causality)
(Causation is not to be confused with correlation! Correlation indicates the amount which two variables move together.)
Prediction is the act of forecasting what will happen in the future. A prediction is specific to the study and experiment that you design to test your hypothesis. It’s the outcome you would observe if your hypothesis were supported.
In a prediction study, the goal is to develop a formula for making predictions about the dependent variable, based on the observed values of the independent variables

Prediction is simply the estimation of an outcome based on the observed association between a set of independent variables and a set of dependent variables. Its main application is forecasting. Causality is the identification of the mechanisms and processes through which a certain outcome is produced.

16
Q

When planning a clinical trial with 2 comparison groups, if you increase the magnitude of the effect size you expect to see between two groups, what does that do to your sample size estimate (how does it affect your power)?

A

The sample size estimate will decrease in order to meet the expected power level. If the sample size remains the same there will be increased power in the study.

power = 1 - beta (beta is the type II error rate, cell "c" in the 2 x 2 table)
sample size n=30 p=.05
80% power indicates willingness to accept a 20% type II (false negative) rate and a 5% type I (false positive) rate
90% power indicates a 10% type II rate and a 5% type I rate
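The link between effect size and sample size can be sketched with a common normal-approximation formula for comparing two means, n per group = 2(z_alpha + z_beta)^2 / d^2, where d is Cohen's d. This is an approximation under stated assumptions (two-sided test, equal group sizes); dedicated power software uses the t distribution and gives slightly larger numbers. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-group comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # type I error, two-sided
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size_d ** 2)


n_moderate = n_per_group(0.5)  # moderate expected effect
n_large = n_per_group(0.8)     # large expected effect
print(n_moderate, n_large)
```

Expecting a larger effect lowers the required sample size, and raising the power from 80% to 90% raises it.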

17
Q

If you set your expected effect size for treatment between two groups to be large, but the effect size between the groups ends up being only moderate, what will your findings likely be?

A

The study may be underpowered, as the sample size will not have been large enough, resulting in a potential type II error: failing to reject the null hypothesis when in fact there was a difference.

The results may not reach statistical significance.

18
Q

A cohen’s d effect size most often refers to ratio-level data (e.g. difference in means). How would you measure the effect size for dichotomous outcomes?

A

Odds ratio or risk ratios

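OR = AD/BC; the ratio of the odds of the condition in the exposed group (A/B) to the odds in the control group (C/D)
RR = [A/(A+B)] / [C/(C+D)]; uses row totals; the ratio of the risk of getting the disease in the exposed group to the risk in the control group
Partial eta squared is the corresponding effect size for ANOVA.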

19
Q

What does it mean to have a negative correlation? Provide an example.

A

A negative correlation means that as one variable increases in magnitude, the second decreases.
-1 is a perfect negative correlation; 0 = no correlation; .5 is low and .3 is negligible
Example: heating costs in summer in Alaska. As temperature increases, less heat is used.
Another example: as exercise time increases, body fat percentage decreases.

20
Q

Explain why plotting your data before making any decisions about which analysis to use is an important first step.

A

Plotting the data allows you to look for any patterns in missing data points, check for normality, and see outliers.

21
Q

List at least 3 different measures of validity and provide an example of each one.

A

face validity: measures what it appears to measure
ex: NPRS–asking for verbal number to quantify pain

criterion validity: extent to which the scores are correlated with the expected variables
ex: test scores on exams and test anxiety

content validity: the extent to which a test covers the content of interest;
example: military fitness testing measuring strength, endurance, flexibility, power

discriminant validity: ability to distinguish between two different constructs
example:

22
Q

If the outcome is whether or not a patient had a stroke after cervical manipulation, what type of regression model would you want to use and why?

A

logistic regression; the outcome is dichotomous (yes or no), and therefore linear regression is not appropriate as it requires a continuous outcome

23
Q

Provide the working data assumptions for running a t-test.

A
measurements are independent
groups are independent
continuous outcome variable
data are normally distributed
homogeneity of variance; variance between groups is equal (Levene's test p > .05)
24
Q

Provide the working data assumptions for running a one way ANOVA.

A
measurements are independent
groups are independent
subjects in each group are independent
three or more groups
population has a relatively normal distribution
groups are normally distributed (if not, use Kruskal-Wallis)
continuous outcome variable
populations have equal variance (Levene's test p > .05)

all cells have a minimum of 10 (30 preferred)
cell size ratio no greater than 1:4

25
Q

Provide the working data assumptions for running a linear mixed model.

A
  1. the participants must be independent (each participant is included only once and does not influence another participant's score).
  2. the factors are independent and not highly related to one another.
  3. the outcome variable is normally distributed within each cell at each time point.
  4. the residuals across the model are normally distributed.
  5. there are no influential univariate or multivariate outliers.
26
Q

What type of data do you typically display with the use of a forest plot?

A

systematic review; meta-analysis of similar RCTs


27
Q

How do you interpret the values on a forest plot?

A

studies are listed in rows, often chronologically
boxes are the effect estimates (box size reflects study weight)
whiskers are the confidence intervals
vertical line is line of no effect

The randomized controlled trials are listed on the left, often in chronological order of publication. Next, the numbers of participants affected in the experimental and control groups are listed. The vertical line (y-axis) is the line of no effect. The x-axis is the outcome measure, which is intersected by the line of no effect. Depending on the measurement, the left side represents less effect and the right side more effect (or decreased and increased risk). This line sits at zero if mean differences are being compared, or at 1.0 if a ratio such as a risk ratio is being graphed. The position of each box marks the study's effect estimate, and the size of the box indicates the weight of the study. The horizontal lines extending from the boxes are the confidence intervals (shorter lines indicate a narrow CI, while longer "whiskers" indicate a wide CI). If the confidence interval crosses the vertical line of no effect, the study result is not statistically significant. Studies whose confidence intervals overlap on the same side of the line are considered homogeneous, and therefore more conclusive. Results that do not overlap are heterogeneous, meaning the results are not similar.

28
Q

Let’s say you are trying to calculate the odds ratio and as you put your 2x2 table together, you see that one of the cell counts has only a value of 3 (only 3 people). What should you do and why?

A

continue the calculation; odds ratios do not require a minimum cell count;
chi-square tests do, in which case report Fisher's exact test

29
Q

“Identify the most appropriate statistical analyses for the following from the provided list and explain why.

a. A researcher wants to determine if there is a difference between sex (M or F) and total number of injections received after a diagnosis of patellofemoral pain syndrome.
b. Researchers want to know if there was a difference in Patient Acceptable Symptom State (PASS) scores at baseline and then after treatment at 4 weeks, and you want to look at this difference between sex (M or F).
c. A researcher wants to determine if a population of patients with hip osteoarthritis have similar active range of motion (ROM) values compared to the national normative values.
d. A researcher wants to determine if there is a difference between age groups (18- 40 and 41-65) and Neck Disability Index scores in patients with neck pain after receiving manual therapy.”

A

A. two sample or Independent t test (Mann Whitney U test)–comparing two groups at one time point

B. Paired sample t test (Wilcoxon Signed Rank test)–two groups with repeated measure

C. One sample t test-(one sample wilcoxon signed rank test)-comparing to a known or established value

D. Two-sample or independent t test (Mann Whitney U test): comparing two groups at one time point. *Given the wide age ranges in each group, the data are likely not normally distributed, so use the Mann Whitney U test

30
Q

What is the danger of using an odds ratio to interpret risk of an event?

A

An odds ratio is a ratio of odds, not of probabilities (risks), so it cannot be read directly as a risk. Odds compare those with the condition to those without it within each group (A vs. B and C vs. D).
The odds ratio will overestimate the risk ratio when the condition is relatively common; the two are similar only when the condition is rare.
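A short sketch of the overestimation: both hypothetical tables below have the same risk ratio, but the odds ratio drifts away from it as the condition becomes common. Cell labels follow the 2 x 2 layout used in this deck (A, B = exposed with/without; C, D = control with/without); the counts are made up.

```python
def risk_ratio(a, b, c, d):
    """Risk in exposed divided by risk in control."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds in exposed divided by odds in control (AD/BC)."""
    return (a * d) / (b * c)


rare = (2, 98, 1, 99)       # condition affects 1-2% of each group
common = (60, 40, 30, 70)   # condition affects 30-60% of each group

for label, table in (("rare", rare), ("common", common)):
    print(f"{label}: RR = {risk_ratio(*table):.2f}, OR = {odds_ratio(*table):.2f}")
```

In the rare case the odds ratio (about 2.02) is nearly the same as the risk ratio of 2; in the common case it is 3.5, which would badly overstate the risk if read as a risk ratio.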

31
Q

Let’s say you are planning a clinical trial for knee osteoarthritis with two different treatments that patients can be randomized to receive. Give me an example of a primary outcome measure that you would establish a priori (looking for all of the components of a properly defined primary outcome measure).

A

LEFS scores at 4 weeks, 12 weeks, and 6 months post treatment to show long-term change; valid and reliable for MSK conditions. I select this because it is commonly used in my clinical setting, which reduces the likelihood of measurement error and of compliance problems with introducing a new tool.

LEFS: a change of 6 points indicates the MDC and 9 points a clinically meaningful change
SEM:

32
Q

If your data is highly skewed (non-normal distribution), what are some adjustments you can make to make sure it is properly analyzed?

A

consider combining groups
determine if outliers should be removed
a log transformation may be applied

33
Q

Explain the difference between R and R2?

A

R value is the correlation between the independent and dependent variables

R2 is the proportion of the variance explained by the regression model

Adjusted R2 accounts for the number of predictors in the model. When adjusted R2 goes down after adding a variable, do not include that variable in the analysis.

R is the multiple correlation coefficient, a measure of the linear relationship between variables (how much do the variables relate to each other)

R2 is the coefficient of determination; the amount of variance in the dependent variable that is explained by the independent variable (the model's ability to explain the dependent variable)
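For a single predictor, R2 is simply the square of the correlation coefficient. The sketch below computes both by hand from hypothetical data; the function name and values are illustrative only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2
print(f"r = {r:.3f}, r^2 = {r_squared:.2f}")
```

Here r is about 0.775 and R2 is 0.60, meaning roughly 60% of the variance in y is explained by x in a simple linear model.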

34
Q

Explain what a p-value is, and what it represents.

A

The probability of obtaining a test result at least as extreme as the result actually observed, assuming the null hypothesis is correct.

The p-value is compared with a preset threshold (alpha) to decide whether to reject the null hypothesis that there is no difference. p = .05 means that, if the null hypothesis were true, a result at least this extreme would occur about 1 time in 20.

35
Q

Explain what the “power” or “statistical power” of a study means. Why is it important?

A

Power is the probability of rejecting the null hypothesis when, in fact, it is false. Power is the probability of making a correct decision (to reject the null hypothesis) when the null hypothesis is false. Power is the probability that a test of significance will pick up on an effect that is present.

Power is tied to the sample size required to show that a nominated effect size is statistically significant: the sample size needed to avoid a type II error, which is failing to reject the null hypothesis when it is false and thus missing that a clinically important difference does exist (false negative).

36
Q

Explain the acronyms “SpIn” and “SnOut.” What do they mean and how do you use them?

A

SnOut: a test that is very good at detecting the condition is highly sensitive; therefore, if it is negative we can confidently rule out the condition. Sensitivity = true positives / all with the condition (on the 2 x 2 table: A/(A+C))

SpIn: a test that is very good at correctly identifying people without the condition is highly specific; therefore, a positive test means we can confidently rule in the condition. Specificity = true negatives / all without the condition (on the 2 x 2 table: D/(B+D))
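A small sketch of the two calculations. It assumes the usual diagnostic 2 x 2 layout (A = true positives, B = false positives, C = false negatives, D = true negatives); the counts and the function name are hypothetical.

```python
def sensitivity_specificity(a, b, c, d):
    """Return (sensitivity, specificity) from a diagnostic 2 x 2 table.

    a = true positives,  b = false positives
    c = false negatives, d = true negatives
    """
    sensitivity = a / (a + c)  # proportion of people with the condition who test positive
    specificity = d / (b + d)  # proportion of people without the condition who test negative
    return sensitivity, specificity


# Hypothetical counts: 100 people with the condition, 100 without.
sens, spec = sensitivity_specificity(90, 5, 10, 95)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

With high sensitivity, few people with the condition test negative, so a negative result helps rule it out (SnOut); with high specificity, few people without it test positive, so a positive result helps rule it in (SpIn).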

37
Q

What is the difference between a univariate, bivariate, and multivariate analysis? Give the utility of each one, and considerations of each.

A

variate refers to the number of variables; univariate means only 1 variable; simple measures such as mean, median, mode, st. deviation; looking for patterns

bivariate: 2 variables analyzed; 1 dependent and 1 independent; chi square, simple linear regression; establish relationship
multivariate: 3 or more variables analyzed; complex analysis; MANCOVA or MANOVA; decrease likelihood of type I error

38
Q

Describe differences between true versus quasi experimental study designs?

A

True experimental design: has treatment and control group, randomization, can determine cause and effect

quasi experimental: treatment and control group but no randomization; increased risk of bias

39
Q

Two groups of patients with neck pain were placed (non-randomly) into a dry needling only group or a manual therapy only group. Treatment was provided by the physical therapist according to what they felt was appropriate. The Neck Disability Index was given to the patients only one time. What type of study design is this?

A

quasi experimental design; cross sectional design because only administering this at one time point

40
Q

Discuss how selection bias might occur in a study.

A

recruitment of participants that are not representative of the population
choosing an outcome measure that is invalid, unreliable
lack of randomization

in systematic review:
failure to include non english trials
using too few databases
too narrow selection criteria

41
Q

A study is looking at shoulder interventions or neck and shoulder interventions in patients with [neck pain]. Patients will be randomly allocated to the intervention group. There will be clearly defined inclusion and exclusion criteria. The primary outcome measures will be the Neck Disability Index and the Numeric Pain Rating Scale, and will be measured in each group at baseline, two weeks, four weeks and eight weeks. Which design would be most appropriate for the researcher to use based on this scenario?

A

randomized clinical trial

42
Q

Researchers are interested in studying range of motion gains in patients with rotator cuff tears. They are studying the use of joint manipulation versus mobilization delivered for either four, eight, or 12 minutes at a time before re-testing the patient’s range of motion. This is an example of what factorial design?

A
2 x 3 factorial design
2 factors:
1. technique, with 2 levels: mobilization or manipulation
2. time, with 3 levels: 4, 8, or 12 minutes
43
Q

Discuss how measurement bias might occur in a study.

A

non validated outcome measure
lack of blinding of participants and data collectors
failure to conceal results

44
Q

Discuss how collection bias in a study may occur?

A

“Vetting: Good answer. From DSC901 Ethics and Bias in Research 2019, slide 9:

  1. Use strict eligibility and inclusion/exclusion criteria
  2. Establish criteria a priori for performing experimental portion of study and blinding
  3. Establish detailed methods for data collection
  4. Choose statistical methods a priori
  5. Use one or more control groups”
45
Q

What are three ways researchers can minimize bias in a study

A

“answer: UPDATE!! (from the vetting in the previous question)
1. Use strict eligibility and inclusion/exclusion criteria
2. Establish criteria a priori for performing experimental portion of study and blinding
3. Establish detailed methods for data collection
4. Choose statistical methods a priori
5. Use one or more control groups
6. Register trials/reviews
7. Reporting guidelines
8. Quality assessment tools
From DSC901 Ethics and Bias in Research 2019 presentation, slide #9 & 10”

46
Q

We use the term “inferential statistics” to describe many of the statistical approaches we take. Why do we use that term and what does it mean?

A

inferential statistics is, “A branch of statistics concerned with testing hypotheses and using sample data to make generalizations concerning populations.”

Inferential refers to testing hypotheses and using sample data to make predictions. Inferential statistics uses samples to make predictions about a population.

47
Q

Define and explain a “standard deviation”. What does it measure? What would it mean if the value of the standard deviation were larger than the value of the mean?

A

A standard deviation is a measure of the amount of variation in a group of values. A small standard deviation means the values cluster around the mean; a large standard deviation means a lot of spread in the values. A standard deviation larger than the mean suggests a non-normal distribution.

48
Q

Assuming normally distributed data, how much of the sample population would you expect to have means that are within 1 standard deviation? How many within 2 standard deviations?

A

1 st dev = 68%
2 st dev = 95%
3 st dev = 99.7%

49
Q

What is the standard error of the mean, and how is it different from the standard deviation?

A

Standard error of the mean is the standard deviation / square root of n.
It estimates how far the sample mean is likely to be from the population mean; a measure of how precise the estimate is.

St deviation: measure of the variation in the individual measures; the degree to which individuals in the sample differ from the sample mean
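The difference can be seen in a few lines: the standard deviation describes the spread of individual values, while the standard error divides that spread by the square root of n. The data are hypothetical.

```python
import math
import statistics

# Hypothetical sample of 8 measurements.
values = [2, 4, 4, 4, 5, 5, 7, 9]

sd = statistics.stdev(values)       # sample standard deviation (n - 1 denominator)
sem = sd / math.sqrt(len(values))   # standard error of the mean

print(f"SD = {sd:.3f}, SEM = {sem:.3f}")
```

Collecting more data with the same spread would leave the SD roughly unchanged but would shrink the SEM, because the mean is estimated more precisely.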

50
Q

Explain the relationship between sum of squares, variance and standard deviation?

A

Answer: Sum of squares: a measure of deviation from the mean; used in calculating ANOVA statistics. Both within-group and between-group sums of squares can be calculated and compared; these two values add together to give the total variation (p. 116 Barton). The F-statistic is the variance between groups divided by the variance within the groups. Variance is calculated by dividing the sum of squares by the degrees of freedom (DOF); DOF for between groups is the number of groups minus 1 (ANOVA ppt, slide 8). Standard deviation is a measure of spread such that, for normally distributed data, about 95% of the measurements are expected to lie within 1.96 standard deviations above and below the mean; the SD is the square root of the variance (p. 379, Barton).
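The chain from sums of squares to variance to the F-statistic can be sketched for a one-way ANOVA. The three groups below are hypothetical; the within-group degrees of freedom (total n minus number of groups) is the standard formula and is stated here as an addition to the card.

```python
def one_way_anova(groups):
    """Return (SS between, SS within, F) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Sum of squares: squared deviations from the relevant mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    # Variance (mean square) = sum of squares / degrees of freedom.
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat


# Hypothetical scores for three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ss_between, ss_within, f_stat = one_way_anova(groups)
print(f"SS between = {ss_between:.1f}, SS within = {ss_within:.1f}, F = {f_stat:.1f}")
```

For these made-up data the between-group and within-group sums of squares are 26 and 6 (total 32), giving an F of 13; taking the square root of any of the variances gives the corresponding standard deviation.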