Knowledge Questions Part 1 Flashcards by Alexander D.

What type of graph or table is used to represent the correlation between two continuous
variables, such as intelligence and income? Which statistical measure and test correspond with
this?

A scatterplot. The corresponding statistical measure is Pearson's correlation coefficient r, which is computed from the means and standard deviations of the two variables. You can test the correlation using (linear) regression analysis.

What type of graph or table is used to represent the correlation between two binary
variables? Which statistical measure and test correspond with
this?

A contingency table (cross-tabulation). The data are summarized as frequencies, proportions and probabilities, and the association can be tested with binary logistic regression or with other tests based on the chi-square statistic.

Let’s assume we have a contingency table for treatment (yes/no) and recovery (yes/no).
Formulate a null hypothesis to establish a difference and a null hypothesis to establish a
correlation. How do these hypotheses relate to each other?

H0 (difference): P(Recovery | Treatment) = P(Recovery | No treatment)
H0 (correlation): P(Recovery) = P(Recovery | Treatment)
-> In both cases, rejecting the null hypothesis supports the phenomenon mentioned in the question.
-> The two hypotheses are equivalent: if the probability of recovery differs significantly depending on whether there was treatment or not, you can say that there is a correlation between treatment and recovery.

How does the contingency table calculate the frequencies expected under the null hypothesis?

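It calculates them under the assumption that the variables are not correlated (independent), that is, P(X) = P(X|Y). The expected frequency of each cell is then its row total times its column total, divided by the total N.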
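
As a minimal sketch of this calculation, the Python snippet below computes the expected frequencies for a 2*2 table. The function name `expected_frequencies` and the counts are hypothetical and serve only as an illustration.

```python
def expected_frequencies(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical counts: rows = treatment (yes, no), columns = recovery (yes, no).
observed = [[30, 10],
            [20, 40]]
expected = expected_frequencies(observed)
print(expected)  # [[20.0, 20.0], [30.0, 30.0]]
```

The chi-square statistic then compares these expected counts with the observed ones, cell by cell.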

If both variables are dichotomous, Pearson’s correlation coefficient can be rewritten to which
measure of association?

Pearson’s Phi.

What is the formula for calculating the odds ratio as a measure of association for a 2*2 table?

(CellA * CellD) / (CellB * CellC)
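
A small Python sketch of this formula follows. It assumes the usual layout in which cells A and B form the first row and cells C and D the second row; the counts are made up.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table laid out as [[a, b], [c, d]] (assumed layout)."""
    return (a * d) / (b * c)


# Hypothetical counts:
# a, b = recovered / not recovered with treatment
# c, d = recovered / not recovered without treatment
example_or = odds_ratio(30, 10, 20, 40)
print(example_or)  # 6.0
```

The same value follows from dividing the odds in the first row (30/10 = 3) by the odds in the second row (20/40 = 0.5).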

What is the relation between the odds ratio, the log of the odds ratio and the correlation? What
value must each of these three measures have in order for the association to be negative, positive
or absent?

The odds ratio compares the odds of Y between the values of X. If it differs significantly from 1, there is likely some correlation: values greater than 1 indicate a positive correlation, values between 0 and 1 indicate a negative correlation, and a value of 1 indicates no association. The log of the odds ratio is the slope of the line obtained when plotting the log odds of Y against X. Like the correlation itself, it is positive for a positive association, negative for a negative association, and 0 when there is no association.
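
The following Python sketch illustrates how the three measures move together. It assumes the cell layout [[a, b], [c, d]] and uses the phi coefficient as the correlation for two dichotomous variables; the three tables are hypothetical.

```python
import math


def association_measures(a, b, c, d):
    """Return (odds ratio, log odds ratio, phi) for the 2x2 table [[a, b], [c, d]]."""
    odds_ratio = (a * d) / (b * c)
    log_odds_ratio = math.log(odds_ratio)
    phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, log_odds_ratio, phi


# Hypothetical tables chosen to show each direction of association.
examples = {
    "positive": (30, 10, 20, 40),
    "absent": (20, 20, 30, 30),
    "negative": (10, 30, 40, 20),
}
results = {label: association_measures(*cells) for label, cells in examples.items()}
for label, (odds_ratio, log_or, phi) in results.items():
    print(f"{label}: OR={odds_ratio:.2f}, ln(OR)={log_or:.2f}, phi={phi:.2f}")
```

In each row of the output, the odds ratio sits on the same side of 1 as the log odds ratio and phi sit relative to 0.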

Let’s assume we split a 2*2 table for variables X and Y into the levels for C, a third variable.
We then perform the Mantel-Haenszel test to establish the so-called ‘common odds ratio’. Which
null hypothesis is being tested in this case? Which assumption must be applied in order for this
test to be meaningful?

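It tests the null hypothesis that X and Y are independent after correcting for C, that is, that the common odds ratio equals 1. This test investigates the main effect of X, so it is only meaningful under the assumption that the odds ratio between X and Y is the same at every level of C (no interaction between X and C).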
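
As an illustration, the sketch below computes the Mantel-Haenszel estimate of the common odds ratio from one 2*2 table per level of C. It covers only the point estimate, not the significance test, and the function name and counts are hypothetical.

```python
def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio estimate from a list of (a, b, c, d) tables, one per level of C."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts for two levels of C, each laid out as [[a, b], [c, d]].
strata = [(20, 10, 10, 20), (40, 10, 20, 20)]
stratum_ors = [(a * d) / (b * c) for a, b, c, d in strata]
common_or = mantel_haenszel_odds_ratio(strata)
print(stratum_ors)          # [4.0, 4.0]
print(round(common_or, 3))  # 4.0
```

Here the odds ratio is the same in both strata, so a single common odds ratio summarizes them well. If the stratum odds ratios differed strongly, the common estimate would average over different effects.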

How does one determine that an interaction between X and C exists in the sample? And what part of the SPSS output for contingency table analysis tests whether there is interaction in the population?

One has to compare the odds ratios between X and Y given the different levels of C. The test of homogeneity of odds ratios is used to investigate this statistically.

How does one determine that an interaction between X and C exists in the sample? And what
part of the SPSS output for contingency table analysis tests whether there is interaction in the
population?

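One has to compare the odds ratios between X and Y given the different levels of C. In SPSS, interaction in the population is tested under the tests of homogeneity of the odds ratio (Breslow-Day and Tarone's statistics). The Mantel-Haenszel test of the common odds ratio does not test for interaction; it assumes that there is none.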

What is the correct follow-up analysis for the effect of X on Y if interaction is present? And
if there is no interaction?

If interaction is present, one should look at simple effects, that is, the effects of X on Y within the respective levels of C. If there is no interaction, one can proceed by looking at the main effect of X on Y.

When can confounding be said to exist between X and C?

When the design is unbalanced, that is, when the distribution of X is not the same across the levels of C. X and C are then associated, so their effects on Y cannot be separated cleanly.

Can confounding and interaction occur at the same time?

Yes.

What is moderation?

Moderation is when C acts as a kind of ‘gatekeeper’ for the effect of X: the effect of X on Y depends on the value of C. For example, X may have an effect on Y only when C equals a certain value.

What is the principal difference between linear and logistic regression?

In linear regression, the outcome (dependent) variable is quantitative, whereas in logistic regression it is categorical (dichotomous in the binary case).

In the case of ANOVA and regression (Statistics 2), the expected value of continuous
variable Y is modelled as the sum total of a constant + main effects + interactions. In logistic
regression, Y is dichotomous. What is being modelled now as the sum total of the effects?

The natural logarithm of the odds of being in a certain category of Y.

How large is ln(X) if X is 1, has a value between 0 and 1, or is greater than 1?

X = 1 » ln(X) = 0
0 < X < 1 » ln(X) < 0
X > 1 » ln(X) > 0

How large are the log odds if the probability is 50%, less than 50% or more than 50%?

P = 50% » log odds = 0
P < 50% » log odds < 0
P > 50% » log odds > 0
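
These three cases can be checked with a short Python sketch. The helper name `log_odds` is hypothetical; it takes a probability strictly between 0 and 1.

```python
import math


def log_odds(p):
    """Natural log of the odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1 - p))


values = {p: log_odds(p) for p in (0.25, 0.5, 0.75)}
for p, value in values.items():
    print(f"P = {p:.2f}: log odds = {value:.3f}")
```

The log odds are symmetric around P = 50%: the values for 25% and 75% have the same size but opposite signs.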

How large is exp(X) if X is 0, smaller than 0 or greater than 0?

X = 0 » Exp(X) = 1
X < 0 » 0 < Exp(X) < 1
X > 0 » Exp(X) > 1

Assume we perform logistic regression using a single predictor, X, and X is dichotomous.
How do we interpret regression weight B for X in this case? And how do we interpret exp(B)?

B represents the slope of the line if we plot the log odds of Y (for binary logistic regression) against X, that is, the difference in the log odds of Y between X = 1 and X = 0. Exp(B) is then the odds ratio between X and Y.
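
With a single dichotomous predictor, the fitted weights can be read directly off the 2*2 table, because the model reproduces the observed odds at X = 0 and X = 1 exactly. The Python sketch below shows this; the function name, the cell layout and the counts are assumptions made for the illustration.

```python
import math


def logistic_weights_from_table(a, b, c, d):
    """Weights of ln(odds of Y=1) = B0 + B*X for one dichotomous X.

    Assumed layout: first row X=1, second row X=0; first column Y=1, second column Y=0.
    """
    intercept = math.log(c / d)          # log odds of Y=1 when X=0
    slope = math.log(a / b) - intercept  # change in log odds from X=0 to X=1
    return intercept, slope


# Hypothetical counts.
intercept, slope = logistic_weights_from_table(30, 10, 20, 40)
print(round(slope, 3))            # 1.792
print(round(math.exp(slope), 3))  # 6.0, the odds ratio (30*40)/(10*20)
```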

How large is the odds ratio for the effect of a dichotomous variable X on the dichotomous variable
Y if the regression weight B for X is 0, negative or positive?

If B is 0, the odds ratio will be 1. If B is negative, the odds ratio will be between 0 and 1, and if B is positive, the odds ratio will be larger than 1.

How large is the odds ratio for the effect of a continuous variable X on a dichotomous variable
Y if the regression weight B for X is 0, negative or positive?

If B is 0, the odds ratio will be 1. If B is negative, the odds ratio will be between 0 and 1, and if B is positive, the odds ratio will be larger than 1. For a continuous X, the odds ratio describes the change in odds for a one-unit increase in X.

How large is the odds ratio for the effect of a continuous variable X on a dichotomous variable
Y if the regression weight B for X is 0, negative or positive? Answer the question under the assumption that there are additional predictors but no interaction between them.

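If B is 0, the odds ratio will be 1. If B is negative, the odds ratio will be between 0 and 1, and if B is positive, the odds ratio will be larger than 1. The odds ratio now describes the effect of a one-unit increase in X with the other predictors held constant.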
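
A brief Python sketch of why the odds ratio for X does not depend on the other predictors when there is no interaction. The regression weights below are invented for the illustration and do not come from any real analysis.

```python
import math


def predicted_odds(x, c, b0=-6.0, b_x=-0.5, b_c=0.08):
    """Odds of Y=1 under ln(odds) = b0 + b_x*x + b_c*c (hypothetical weights)."""
    return math.exp(b0 + b_x * x + b_c * c)


# Odds ratio for a one-unit increase in X (from 2 to 3) at two different values of C.
ratio_at_c70 = predicted_odds(3, 70) / predicted_odds(2, 70)
ratio_at_c85 = predicted_odds(3, 85) / predicted_odds(2, 85)
print(round(ratio_at_c70, 4), round(ratio_at_c85, 4))  # 0.6065 0.6065
print(round(math.exp(-0.5), 4))                        # 0.6065
```

Both ratios equal exp(B) for X, and because B is negative here, the odds ratio lies between 0 and 1.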

Let’s assume that the logistic model for dichotomous value Y (dementia: 0 =no, 1=yes)
includes predictors X (sex: 0=male, 1=female) and C (age in years), and there is no interaction.
The regression weight (B) for X is significantly negative. Will the odds ratio be greater or
smaller than 1? And which of the sexes will have a higher probability of dementia?

It will be smaller than 1. Females will be less likely to have dementia than males of the same age.

Let’s assume that the logistic model for dichotomous value Y (dementia: 0 =no, 1=yes) includes predictors X (sex: 1=male, 0=female) and C (age in years), and there is no interaction. Males are more likely to be demented. Which B value will you find, and what odds ratio?

The B value will be positive. The odds ratio will be larger than 1.

Let’s assume that the logistic model for dichotomous value Y (dementia: 1 =no, 0=yes) includes predictors X (sex: 1=male, 0=female) and C (age in years), and there is no interaction. Males are more likely to be demented. Which B value will you find, and what odds ratio?

The B value will be negative. The OR will be smaller than 1.

Let’s assume the logistic regression of Y against X, C and X*C reveals a significant interaction effect. How should we estimate and test the effect of X on Y in this case?

We should look at the simple effects of X, that is, the effect of X on Y while keeping C at a constant level.
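
In a model with an X*C term, the log odds ratio for X at a given level of C is the weight for X plus the interaction weight times C. The Python sketch below illustrates this with invented weights and a dichotomous C; the function name and values are hypothetical.

```python
import math


def simple_effect_odds_ratio(c, b_x, b_xc):
    """Odds ratio for X at a fixed level of C in ln(odds) = b0 + b_x*X + b_c*C + b_xc*X*C."""
    return math.exp(b_x + b_xc * c)


# Hypothetical weights: b_x = 0.2 for X, b_xc = 0.5 for the X*C interaction.
or_at_c0 = simple_effect_odds_ratio(0, b_x=0.2, b_xc=0.5)
or_at_c1 = simple_effect_odds_ratio(1, b_x=0.2, b_xc=0.5)
print(round(or_at_c0, 3), round(or_at_c1, 3))  # 1.221 2.014
```

The effect of X differs between the two levels of C, which is why a single overall odds ratio for X would be misleading here.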

Let’s assume the logistic regression of Y against X, C and X*C reveals no interaction effect. How does the method that needs to be applied in this case relate to the Mantel-Haenszel test of the common odds ratio?

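Without interaction, we drop the X*C term and look at the main effect of X while controlling for C. Exp(B) for X then estimates the odds ratio between X and Y that is common to all levels of C, which is what the Mantel-Haenszel test of the common odds ratio evaluates in contingency table analysis. The interaction itself is evaluated with the test of homogeneity of the odds ratios, not with the Mantel-Haenszel test.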

In the previous tutorials and practicals, logistic regression and contingency table analysis yielded similar results. Despite the fact that logistic regression appears much more complicated than contingency table analysis, this method is preferable in most cases. Why is this?

You do not have to run as many follow-up analyses based on the output of earlier analyses, which reduces the chance of human error. Additionally, accounting for more predictors is only possible using regression.