2.6.2. Basic Biostatistics II Flashcards

1
Q

What is a correlation?

A

A measure of the strength and direction of the linear association between two variables; used to determine whether an association exists and to quantify its strength

2
Q

How do you prepare a scatterplot to assess correlation? Relate this plot to the correlation coefficient.

A
  1. Observe two variables (X, Y) for each member of a random sample of n subjects.
  2. Plot the pairs of points (X1, Y1), (X2, Y2), …, (Xn, Yn) on a scatterplot.
  3. Inspect the scatterplot for patterns of association.
  4. Estimate the population correlation coefficient ρ by the sample correlation coefficient r.
3
Q

What are the three extreme r values we use for correlation coefficient graphs and what do they tell us about our data?

A

r = 0 => no linear association
r = 1 => perfect positive linear association
r = -1 => perfect negative linear association

4
Q

What is a regression?

A

Regression: A family of methods for relating a predictor (or multiple predictors) to an outcome

5
Q

“Determine whether an association exists and quantify its strength.” Is this an example of correlation or regression?

A

Correlation

6
Q

“Use the relationship to predict one variable from the other.” Is this an example of correlation or regression?

A

Regression

7
Q

“Determine whether the observed relationship agrees with some theory or model and estimate the parameters of that model.” Is this an example of correlation or regression?

A

Regression

8
Q

The most common way to measure linear association is by the use of what?

A

The correlation coefficient

9
Q

What is the difference between ρ and r?

A

ρ (rho) is the correlation coefficient for the whole population. r is the correlation coefficient for the sample, and it is used to estimate ρ.

10
Q

What is the formula for the sample correlation r?

A
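
The standard formula for the sample (Pearson) correlation coefficient is r = Σ(xi − x̄)(yi − ȳ) / √( Σ(xi − x̄)² · Σ(yi − ȳ)² ), where x̄ and ȳ are the sample means. A minimal Python sketch of that formula follows; the function name `pearson_r` and the sample values are illustrative assumptions, not part of the original card.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient r, computed from its definition."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up values: points exactly on a rising line, then on a falling line.
xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2 * x + 1 for x in xs])
r_down = pearson_r(xs, [10 - 3 * x for x in xs])
```

Points that lie exactly on a rising line should give r = 1, and points exactly on a falling line should give r = -1, which matches the extreme values described in card 3.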
11
Q

How would you characterize the correlation of the variables below? Negative or positive? Weak or strong? Close to zero?

A

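Good example of a zero correlation. The points follow a parabola, so X and Y are clearly related, but the relationship is not linear. For a parabola that is symmetric over the observed X values, r is essentially zero.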
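
As a rough check of the parabola case, the sketch below computes r for y = x² over two made-up ranges of x. The helper `pearson_r` and the data are assumptions for illustration. It suggests that r is zero only when the parabola is symmetric over the sampled x values; sampling one arm alone gives a strongly positive r.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient (hypothetical helper for this sketch)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# x values symmetric about the vertex of y = x^2.
xs_symmetric = [-3, -2, -1, 0, 1, 2, 3]
r_symmetric = pearson_r(xs_symmetric, [x ** 2 for x in xs_symmetric])

# Only the right-hand arm of the same parabola.
xs_one_arm = [0, 1, 2, 3, 4, 5, 6]
r_one_arm = pearson_r(xs_one_arm, [x ** 2 for x in xs_one_arm])
```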

12
Q

How would you characterize the correlation of the variables below? Negative or positive? Weak or strong? Close to zero?

A

Strong positive

13
Q

How would you characterize the correlation of the variables below? Negative or positive? Weak or strong? Close to zero?

A

Weak positive correlation

14
Q

How would you characterize the correlation of the variables below? Negative or positive? Weak or strong? Close to zero?

A

Rather strong negative. Not a perfect line, but pretty darn good.

15
Q

How would you characterize the correlation of the variables below? Negative or positive? Weak or strong? Close to zero?

A

Weak negative or close to no correlation. There is a general negative slope, but this would be an r of about -0.25 at best.

16
Q

How would you characterize the correlation of the variables below? Negative or positive? Weak or strong? Close to zero?

A

No correlation

17
Q

How can we test if there is a significant correlation between variables X and Y? Go through the steps starting with already having the X and Y values.

A
  1. Compute r
  2. Compute t = r√(n − 2) / √(1 − r²)

Compare t with a t-distribution table using n − 2 degrees of freedom to find your p-value. If p < 0.05, the correlation is statistically significant, which means you can reject the null hypothesis that there is no linear association (ρ = 0). A significant correlation does not by itself show cause and effect.
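
The test can be sketched in Python using the usual statistic t = r√(n − 2) / √(1 − r²) with n − 2 degrees of freedom. The values r = 0.70 and n = 10 are made up for illustration, and the function name is an assumption. The critical value 2.306 is the two-sided 5% cutoff for 8 degrees of freedom from a standard t table; comparing |t| against it is equivalent to checking p < 0.05.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if not -1 < r < 1:
        raise ValueError("t is undefined when |r| = 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical sample: r = 0.70 observed in n = 10 subjects (df = 8).
t_stat = correlation_t_statistic(0.70, 10)

# Two-sided critical value for alpha = 0.05 and df = 8, read from a t table.
T_CRIT_DF8 = 2.306
reject_null = abs(t_stat) > T_CRIT_DF8
```

Rejecting the null here only indicates a linear association; it says nothing about which variable, if either, causes the other.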

18
Q

Two conditions should hold before we use a Pearson correlation. What are they?

A
  1. Observations are from a random sample
  2. At least one variable follows a normal distribution
19
Q

What other tool can we use with regression to answer questions about the population?

A

Line of best fit
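
One common way to obtain a line of best fit is ordinary least squares, which picks the slope and intercept that minimize the sum of squared vertical distances from the points to the line. The sketch below assumes that method; the function name and the data points are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when all x values are equal")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up sample of five (x, y) pairs.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]
intercept, slope = least_squares_line(xs, ys)

# Use the fitted line to predict y for a new x value.
predicted_at_6 = intercept + slope * 6
```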

20
Q

When do we use a simple logistic regression?

A

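When the dependent variable is categorical (typically binary) and there is a single independent variable, which may be continuous or categorical

Its coefficients are on the log-odds scale; exponentiating a coefficient gives an odds ratio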
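
To make the link between log odds and odds ratios concrete, the sketch below evaluates a simple logistic model, log odds = b0 + b1·x, at two x values one unit apart. The coefficients b0 = -4.0 and b1 = 0.5 are hypothetical, not fitted to any data. The ratio of the two odds should equal exp(b1).

```python
import math

def predicted_probability(intercept, slope, x):
    """P(outcome = 1 | x) under a simple logistic model."""
    log_odds = intercept + slope * x
    return 1 / (1 + math.exp(-log_odds))

def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

# Hypothetical coefficients on the log-odds scale.
b0, b1 = -4.0, 0.5

p_at_6 = predicted_probability(b0, b1, 6)
p_at_7 = predicted_probability(b0, b1, 7)

# Odds ratio for a one-unit increase in x; should match exp(b1).
odds_ratio = odds(p_at_7) / odds(p_at_6)
```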

21
Q

When do we use a multiple logistic regression?

A

When the one dependent variable is categorical (typically binary) and we have multiple independent variables.

22
Q

Using a correlation or linear regression

vs.

Using a logistic regression

A

To assess association between two CONTINUOUS variables, use correlation or linear regression

To assess association between CONTINUOUS predictor and CATEGORICAL (binary) outcome, use logistic regression

23
Q

We are trying to see what affects bone mineral density (BMD) in women the most. We are testing weight and race (Black vs. White/other) against BMD. What type of statistical method should we use?

A

Multiple linear regression (BMD is a continuous outcome and there is more than one predictor)
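
A minimal sketch of how such a model could be fit, assuming ordinary least squares solved through the normal equations. Everything here is hypothetical: the function names, the 0/1 coding of race, and the data, which are generated from a known equation rather than measured, so the fit should simply recover the coefficients used to build them.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for c in range(col, n + 1):
                m[i][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_multiple_regression(rows, ys):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 = intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Made-up predictors: (weight in kg, indicator: 1 = Black, 0 = White/other).
rows = [(50, 0), (60, 1), (70, 0), (80, 1), (90, 0), (65, 1)]
# Made-up outcome built from: bmd = 0.5 + 0.005 * weight + 0.1 * indicator.
ys = [0.5 + 0.005 * w + 0.1 * g for w, g in rows]

intercept, b_weight, b_race = fit_multiple_regression(rows, ys)
```

Each coefficient is read as the expected change in the outcome for a one-unit change in that predictor with the other predictor held fixed, which is what "adjusting" for a variable means in this setting.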

24
Q

If we wanted to predict a binary outcome, such as finding a disease to be present or absent, or finding the prognosis to be died or survived, what statistical method would help us best?

A

Logistic Regression

25
Q

Why do we need to adjust for certain variables?

A

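Because they are confounders: they are associated with both the predictor and the outcome, so leaving them out can distort the estimated association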
