Final exam Flashcards

1
Q

What is leverage?

A

The potential of an individual data point to influence the slope of the regression line. Points with extreme values of the predictor have high leverage.

2
Q

What is influence?

A

How an individual data point actually affects the slope of the line
If you remove the point, how much the slope will change
High-influence data points usually have high leverage, but a high-leverage point is only influential if it also falls away from the trend of the other points

3
Q

What is Cook’s distance? What is it a good measure of?

A

Measures the effect of deleting the given observation.

It is a really good measure of influence
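The relationship between leverage, residuals, and Cook's distance can be sketched in a few lines. This is a minimal illustration for simple (one-predictor) linear regression using the standard textbook formulas; the function name `cooks_distance` and the toy data are assumptions made for the example, not part of the course material.

```python
def cooks_distance(x, y):
    """Leverage and Cook's distance for the simple regression y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    # Residual = actual value - predicted value
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    p = 2  # number of fitted parameters (intercept and slope)
    mse = sum(e ** 2 for e in residuals) / (n - p)

    # Leverage depends only on how extreme the predictor value is
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

    # Cook's distance combines residual size with leverage
    cooks = [(e ** 2 / (p * mse)) * h / (1 - h) ** 2
             for e, h in zip(residuals, leverage)]
    return leverage, cooks


# Hypothetical data: the last point sits far from the others on x
x = [1, 2, 3, 4, 5, 20]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 8.0]
leverage, cooks = cooks_distance(x, y)
```

In this toy data the last point has both the highest leverage and the largest Cook's distance, which matches the idea that leverage is potential and influence is actual effect.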

4
Q

How can you calculate variance?

A

Square the standard deviation (variance = SD squared)

5
Q

What assumptions must be met for linear regression?

A

CLINE - Constant variance, linearity, independence, normality (of errors), Error-free predictability

6
Q

How should your residuals look?

A

Normally distributed

7
Q

What is a residual value?

A

Actual value - predicted value

8
Q

How do we know that variance-covariance is met in RMA?

A

epsilon adjustments are >0.7

9
Q

What type of output is indicative of RMA?

A

H-H, H-F (Huynh-Feldt), G-G (Greenhouse-Geisser), epsilon adjustments

10
Q

If the epsilon adjustment is

A

G-G, limits type I errors though can increase type II if you have sphericity

11
Q

How do you calculate the magnitude of difference after RMA?

A

Pairwise comparison

12
Q

How can you test if independence is violated?

A

Intraclass correlation (ICC)

13
Q

What does an ICC test look like? What value is most important?

A

Run “xtreg” and read the value for rho.

14
Q

What is the cutoff for ICC? What is the variable called?

A

Rho. This value is the ICC and explains how much of the variance is subsumed within subject. Note that 30% is huge. Anything over 0 is technically violating independence but analysts usually don’t get highly concerned until a value around 0.05 (i.e., 5%) is reached
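As a rough sketch of where rho comes from: it is the share of total variance that sits at the subject level. The example below assumes you have the between-subject and within-subject standard deviations (reported as sigma_u and sigma_e alongside rho in xtreg output); the function name `icc_rho` and the numbers are illustrative only.

```python
def icc_rho(sigma_u, sigma_e):
    """Fraction of total variance due to the subject-level component.

    sigma_u: between-subject standard deviation
    sigma_e: within-subject (residual) standard deviation
    """
    var_u = sigma_u ** 2
    var_e = sigma_e ** 2
    return var_u / (var_u + var_e)


# Hypothetical values: 1.0 between subjects, 3.0 within subjects
rho = icc_rho(1.0, 3.0)
concerning = rho >= 0.05  # rule-of-thumb threshold from the card above
```

Here rho is 0.1, so 10% of the variance is within subject, which is above the 0.05 level where analysts start to worry about independence.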

15
Q

What is the purpose of logistic regression?

A

Predict when you have a dichotomous/binary outcome rather than continuous.

16
Q

In what test do you find the log likelihood?

A

Logistic regression

17
Q

What is LRChi2?

A

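The likelihood-ratio chi-square statistic. It tests whether our current model fits significantly better than a comparison model (the previous, simpler model, or the model with no predictors)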

18
Q

How do you calculate LRChi2?

A

-2 x (log likelihood of the simpler model - log likelihood of the current model)
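The calculation can be sketched as below. The log likelihood values are made up for illustration, and the p-value helper only covers the case of 1 degree of freedom (one added predictor), where the chi-square tail probability has a simple closed form; both function names are assumptions for this example.

```python
import math


def lr_chi2(ll_simpler, ll_current):
    """Likelihood-ratio chi-square: -2 x (difference in log likelihoods)."""
    return -2.0 * (ll_simpler - ll_current)


def p_value_df1(chi2):
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical log likelihoods: the current model is closer to zero
stat = lr_chi2(-100.0, -90.0)
p = p_value_df1(stat)
```

Because the current model's log likelihood is closer to zero, the statistic is positive; a larger value means a bigger improvement in fit.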

19
Q

In what test do you find an odds ratio?

A

Logistic regression

20
Q

What does an odds ratio mean?

A

The PERCENTAGE change in the odds of the outcome for a one-unit increase in your independent variable (predictor). E.g. OR = 1.087 is only an 8.7% increase in the odds. Don’t make the mistake of treating the percentage as a multiplier: a 10% increase is 1.1x, not 10x. Over several units the odds ratios multiply, so a 10-unit increase at OR = 1.10 gives 1.10^10, about 2.59x the odds (roughly 2x if you simply add up ten 10% increases), not 100x.

21
Q

In what direction should your LL move?

A

Towards zero

22
Q

Do equal odds ratios mean equal effect on dependent variable?

A

NO! It depends on their scales

23
Q

How do you define a confounding variable?

A

An independent risk factor for your outcome that is also associated with the exposure (and is not on the causal pathway between them).

24
Q

How can you transition between coefficient and OR in logistic regression?

A

The natural log of the odds ratio gives you the coefficient: coefficient = ln(OR)
Likewise, e raised to the coefficient gives you the odds ratio: OR = e^(coefficient)
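A quick sketch of the conversion in both directions, plus how an odds ratio compounds over several units of the predictor. The helper names are made up for illustration; the example value 1.087 is the odds ratio used in an earlier card.

```python
import math


def or_to_coef(odds_ratio):
    """Coefficient (log odds) from an odds ratio."""
    return math.log(odds_ratio)


def coef_to_or(coef):
    """Odds ratio from a logistic regression coefficient."""
    return math.exp(coef)


def or_for_k_units(odds_ratio, k):
    """Odds ratio for a k-unit increase: the per-unit ratios multiply."""
    return odds_ratio ** k


coef = or_to_coef(1.087)             # roughly 0.083
back = coef_to_or(coef)              # recovers 1.087
ten_units = or_for_k_units(1.10, 10)  # roughly 2.59, not 100
```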

25
Q

What can be said about odds ratios in logistic regression?

A

Each odds ratio shows the effect of one variable CONTROLLING for all the other variables in the model. E.g. an odds ratio of about 2 for type A personality means type A people have about 2x the odds of CHD, adjusted for the other variables.

26
Q

In a Pearson correlation table, which variable is the “explainer”?

A

the one on the top line! Explainer Excels and Elevates

27
Q

What is S(t)?

A

survival function = probability that a subject survives longer than a specified time (t). Here, “survives” is used as a general biomedical term that simply means a subject did not incur the event of interest. Few survival analyses actually involve the true issues of survival or death.
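One common way to estimate S(t) from data is the Kaplan-Meier method. The sketch below is a minimal pure-Python version under the usual convention that an event indicator of 1 means the event occurred and 0 means the observation was censored; the function name and the toy data are assumptions for illustration.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times:  follow-up time for each subject
    events: 1 if the event occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Multiply by the share of at-risk subjects who got past t
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without lowering S(t)
        n_at_risk -= n_leaving
    return curve


# Hypothetical data: five subjects, two of them censored
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored subjects still count in the at-risk denominator up to the time they leave, which is why they contribute information even though they never had the event.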

28
Q

What is h(t)?

A

hazard function = the instantaneous potential (per some unit of time) for an event to occur (given, obviously, that the subject has survived until t).

29
Q

What is censored data?

A

An observation is said to be censored in TTE modeling if the event of interest does not occur during the study period.

30
Q

What is left-censored?

A

In this situation (usually very uncommon in biomedical studies), a subject incurs the event before the actual study period begins. In biomedical studies, such subjects are usually excluded from TTE studies.

31
Q

What is right-censored?

A

Right-censoring is the most common type of censoring in TTE studies. An observation is said to be right-censored if the study period ends and the event of interest has not occurred. This could be due to loss to follow-up or a subject that dies or is otherwise forced to leave the study from causes that are not the outcome of interest (competing risks).

The most usual reason is simply that the study ends but the subject has remained free of the outcome of interest (simple censoring).

32
Q

What is interval-censored?

A

An observation is interval-censored if it incurs the event of interest during a time it was not under observation. Without additional information (for example, full medical records), we only know that a subject failed between time (gone) and time (back).

33
Q

What is a log-rank test?

A

Tests whether two Kaplan-Meier survival curves are statistically different.
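The observed-versus-expected logic behind the test can be sketched as follows. This is a minimal two-group version using the standard formulas, with a p value that is only valid for two groups (1 degree of freedom); the function name and the example data are assumptions for illustration.

```python
import math


def log_rank(times1, events1, times2, events2):
    """Two-group log-rank test. events: 1 = event occurred, 0 = censored.

    Returns (observed events in group 1, expected events in group 1,
             chi-square statistic, p value).
    """
    all_times = list(times1) + list(times2)
    all_events = list(events1) + list(events2)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})

    observed1 = expected1 = variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t
        n1 = sum(1 for x in times1 if x >= t)
        n2 = sum(1 for x in times2 if x >= t)
        # Events happening exactly at time t
        d1 = sum(1 for x, e in zip(times1, events1) if x == t and e)
        d2 = sum(1 for x, e in zip(times2, events2) if x == t and e)
        n, d = n1 + n2, d1 + d2
        observed1 += d1
        expected1 += d * n1 / n
        if n > 1:
            variance += n1 * n2 * d * (n - d) / (n * n * (n - 1))

    chi2 = (observed1 - expected1) ** 2 / variance
    p = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square tail, 1 df
    return observed1, expected1, chi2, p


# Hypothetical data: group 1 fails early, group 2 fails late
obs, exp, chi2, p = log_rank([1, 2, 3], [1, 1, 1], [10, 11, 12], [1, 1, 1])
```

Group 1 has more observed events than expected under the assumption of identical curves, which is what drives the chi-square statistic up.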

34
Q

What does a log-rank test output look like?

A

Table of events observed and events expected. Relevant output = the bottom line, the p value for the chi-square statistic (“LRChi2”).

35
Q

What does “adjusting for something” mean?

A

Controlling for it/holding it steady

36
Q

What is an important requirement for Cox regression?

A

Proportional hazards over time

37
Q

What values can hazard ratios take and what do they mean?

A

Hazard ratios < 1.00 indicate a protective effect; hazard ratios > 1.00 indicate a deleterious effect. To get to the hazard ratio, simply exponentiate B. Thus e^(-0.37) = 0.69.
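The exponentiation step can be checked directly. The coefficient -0.37 comes from the card above; the variable names and the interpretation logic are just for this sketch.

```python
import math

b = -0.37                     # Cox regression coefficient (B)
hazard_ratio = math.exp(b)    # about 0.69

# Interpret relative to the null value of 1.0
if hazard_ratio < 1.0:
    effect = "protective"
elif hazard_ratio > 1.0:
    effect = "deleterious"
else:
    effect = "null"
```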

38
Q

What is an important and beneficial element of regression?

A

Remember that one of the properties of regression is that the effect estimate (here, the hazard ratio) for any variable is adjusted for all other variables in the model.

39
Q

What would a null hazard ratio be?

A

1.0 neither protective nor hurtful

40
Q

What is an important test to run alongside a cox proportional hazards test?

A

A test for the proportional hazards assumption (commonly based on Schoenfeld residuals). This test is similar to Bartlett’s or Levene’s test: you want a non-significant p value. Note that Harrell’s C is a concordance (discrimination) statistic, not a p value.

41
Q

What is a log-log curve?

A

A visual proportional hazards assumption test. You want the two lines to be parallel.

42
Q

What is the harvesting effect?

A

The harvesting effect: the guys who can’t hack it drop out, so the older guys remaining are the tough ones. The effects of exposure HARVEST the stronger, insensitive men. This makes the older guys look healthier/in better condition, because the weak ones dropped out.

43
Q

What does classic effect modification look like?

A

Diverging lines. (When a predictor has a different effect on different groups).