Module 4 Flashcards

1
Q

Between what values must Pearson’s correlation coefficient lie?

A

-1 and +1

values of r close to either limit indicate a strong linear association
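
A minimal sketch of how r can be computed by hand, using made-up data that is not from the module. The function name `pearson_r` is an illustrative choice.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Made-up illustrative data
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # points on an increasing straight line
y_down = [10, 8, 6, 4, 2]  # points on a decreasing straight line

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(r_up, r_down)
```

Points lying exactly on a straight line give r at one of the two limits; scattered data gives a value in between.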

2
Q

What does an r value close to 0 indicate?

A
  • little linear association between variables
3
Q

What are the hypotheses for Pearson’s correlation?

A

H0: ρ = 0 (no linear association)
H1: ρ ≠ 0 (linear association)

4
Q

What p-value shows a significant linear correlation?

A

p < 0.05 (at the conventional 5% significance level)

5
Q

What are the assumptions of Pearson’s correlation coefficient?

A
  • linear association
6
Q

What test is used if there is no linear association between two variables?

A
  • Spearman’s (rank) correlation coefficient

- requires the association to be monotonic
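
A small sketch of why ranks help: Spearman's coefficient is Pearson's coefficient applied to the ranks of the data. The data are made up, and the `ranks` helper assumes there are no tied values, which keeps the example short.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 (smallest). Assumes no ties, for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Made-up data: always increasing (monotonic) but not a straight line
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

Because y always increases with x, the ranks agree perfectly and rho reaches its maximum, while Pearson's r falls short of 1 because the relationship is curved.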

7
Q

Define monotonic.

A
  • always increasing or always decreasing (but doesn’t have to be at the same rate)
8
Q

Does a correlation between 2 variables mean there is a cause and effect relationship?

A
  • no

- there may be an unobserved variable that can cause this

9
Q

What does correlation measure?

A
  • magnitude of the association between 2 variables
10
Q

What does regression measure?

A
  • magnitude of dependence of one variable upon another
11
Q

What is the idea of linear regression?

A
  • find the relationship between the independent (X) and dependent (Y) variables
  • want to determine the straight line that best ‘fits’ the data
12
Q

Can you have more than one independent variable for regression?

A
  • yes
13
Q

What is the linear regression model formula?

A

Yi = B0 + B1 Xi + Ei   (B0 = intercept, B1 = slope, Ei = random error)
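
A minimal sketch of estimating the intercept and slope by ordinary least squares, which is the usual way the 'best fit' line is found. The data and the name `fit_line` are made up for illustration.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept: the line passes through the means
    return b0, b1

# Made-up data scattered around a straight line
x = [0, 1, 2, 3, 4]
y = [1.1, 2.9, 5.2, 6.8, 9.0]

b0, b1 = fit_line(x, y)
print(b0, b1)
```

The error term Ei is not estimated directly; it shows up as the scatter of the points around the fitted line.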

14
Q

What are the three main steps in regression analysis?

A
  • estimate equation (find coefficients)
  • assess model (significance and assumptions)
  • use good model to make predictions
15
Q

In R Commander, where are B0 and B1 found in the output?

A
  • B0 is the value in the (Intercept) row of the Estimate column

- B1 is the Estimate value in the row below it

16
Q

What is B1?

A
  • regression coefficient (slope of line)
17
Q

What is B0?

A
  • y-intercept

- the value of Y when X=0

18
Q

What are the assumptions for regression?

A
  • Y and X are linearly related
  • the values of Y are independent from each other
  • the random part of Y (error) is normally distributed around 0 with constant variance
19
Q

What is the residual?

A
  • the difference between what we observe and what our model predicts at a given value of X
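
A short sketch of computing residuals as observed minus predicted, assuming a line fitted by least squares to made-up data. The helper `fit_line` is illustrative only.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

# Made-up data
x = [0, 1, 2, 3, 4]
y = [1.1, 2.9, 5.2, 6.8, 9.0]

b0, b1 = fit_line(x, y)
predicted = [b0 + b1 * xi for xi in x]
# Residual = observed value minus the value the model predicts
residuals = [yi - pi for yi, pi in zip(y, predicted)]
print(residuals)
```

For a least-squares line with an intercept, the residuals always sum to zero, so checking their spread (not their sum) is what reveals problems with the model.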
20
Q

What are the assumptions for residual analysis?

A
  • normally distributed
  • mean of zero
  • constant variance (homoscedasticity)
21
Q

What do you do if the variance of the residuals is not constant?

A
  • transform data (Ynew = log(Yi))

- use a different method (e.g. weighted least squares regression)

22
Q

What are the two types of prediction?

A
  • interpolation (predict Y value using X values within data range)
  • extrapolation (predict Y values using X value beyond sample data)
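
A small sketch that labels a prediction as interpolation or extrapolation by checking whether the new X lies inside the observed range. The coefficients and data are hypothetical.

```python
def predict(b0, b1, x_new, x_observed):
    """Return (predicted Y, kind of prediction) for a fitted straight line."""
    inside = min(x_observed) <= x_new <= max(x_observed)
    kind = "interpolation" if inside else "extrapolation"
    return b0 + b1 * x_new, kind

# Hypothetical fitted coefficients and observed X values
b0, b1 = 1.0, 2.0
x_observed = [0, 1, 2, 3, 4]

inside_result = predict(b0, b1, 2.5, x_observed)
outside_result = predict(b0, b1, 10, x_observed)
print(inside_result, outside_result)
```

Extrapolated values deserve more caution, since nothing in the sample shows that the straight-line relationship continues beyond the observed range.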
23
Q

What is a simple linear regression?

A
  • one dependent variable and one independent variable
24
Q

What is multiple linear regression (MLR)?

A
  • one dependent variable and 2 or more independent variables
25
Q

What is the formula for MLR?

A

Yi = B0 + B1 Xi,1 + B2 Xi,2 + … + Ei

26
Q

How do you assess an MLR?

A
  • look at each (y,x) bivariate pair
27
Q

What are the main steps in MLR analysis?

A
  1. estimate regression equation
  2. assess the model and test hypotheses (ANOVA F test, model validity, explanatory power/adjusted R^2, multicollinearity, parsimony)
  3. test hypotheses regarding individual Xs
  4. if model is ‘good’, use to predict value of dependent variable
28
Q

Why do we use an adjusted R^2?

A
  • the unadjusted R^2 becomes increasingly biased (inflated) as the number of X’s increases
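
A minimal sketch of the standard adjustment, where n is the number of observations and k the number of X variables. The example numbers are made up.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k predictors fitted to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Same unadjusted R^2 and sample size, different numbers of predictors
few = adjusted_r2(0.80, n=30, k=2)
many = adjusted_r2(0.80, n=30, k=10)
print(few, many)
```

For the same unadjusted R^2, the model with more predictors is penalised more heavily, which is the correction this card describes.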
29
Q

What is an example of hypotheses about partial regression coefficients?

A
  • H0: all partial regression coefficients are zero

- H1: at least one partial regression coefficient is not equal to zero

30
Q

What is parsimony?

A
  • principle of explaining the most variation with the least number of variables
31
Q

What are information criteria (IC)?

A
  • statistics that consider both parsimony and explanatory power together
  • AIC (Akaike IC)
  • BIC (Bayesian IC)
32
Q

What is the basic formula for IC?

A

IC = lack of fit (based on observed Y - predicted Y) + penalty (number of parameters)
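
A sketch of one common form of AIC and BIC for least-squares models with normal errors, written up to an additive constant. This specific form is an assumption brought in for illustration and is not stated in the cards. Here rss is the residual sum of squares, n the number of observations, and k the number of estimated parameters.

```python
from math import log

def aic_least_squares(rss, n, k):
    """AIC (up to a constant): lack of fit plus a penalty of 2 per parameter."""
    return n * log(rss / n) + 2 * k

def bic_least_squares(rss, n, k):
    """BIC (up to a constant): same lack of fit, penalty of log(n) per parameter."""
    return n * log(rss / n) + k * log(n)

# Made-up example: identical fit (same rss), one extra parameter
aic_small = aic_least_squares(rss=10.0, n=50, k=2)
aic_large = aic_least_squares(rss=10.0, n=50, k=3)
bic_large = bic_least_squares(rss=10.0, n=50, k=3)
print(aic_small, aic_large, bic_large)
```

Lower values are better, so an extra parameter is only worthwhile if it reduces the lack of fit by more than the penalty it adds.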

33
Q

What is multicollinearity?

A
  • occurs when the independent (X) variables are correlated with each other, i.e. not independent of one another
34
Q

How can you identify multicollinearity on a correlation matrix?

A
  • a correlation between 2 independent variables that is high compared with the others
35
Q

How can you calculate the magnitude of multicollinearity?

A
  • Variance inflation factor (VIF)
36
Q

What does VIF indicate?

A
  • the increase in the variance of a B (coefficient) estimate due to the presence of other collinear variables in the model
  • VIF < 5 is OK
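
A minimal sketch of the usual VIF definition, 1 / (1 - R^2), where R^2 comes from regressing one X variable on all the other X variables. With only two X variables, that R^2 is simply their squared correlation. The example values are made up.

```python
def vif(r_squared):
    """Variance inflation factor from the R^2 of one X regressed on the others."""
    return 1 / (1 - r_squared)

# Two-predictor case: R^2 equals the squared correlation between the two Xs
vif_uncorrelated = vif(0.0 ** 2)   # no collinearity
vif_moderate = vif(0.5)            # R^2 of 0.5
vif_high = vif(0.9 ** 2)           # correlation of 0.9 between the Xs
print(vif_uncorrelated, vif_moderate, vif_high)
```

A VIF of 1 means no inflation at all; a correlation of 0.9 between two X variables already pushes the VIF above the threshold of 5 given on this card.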
37
Q

What is confounding?

A
  • a variable that changes the effect (slope) of an explanatory variable when it is added to the model
38
Q

What is the minimum sample size?

A
  • minimum number of people needed to declare clinically important effects statistically significant
39
Q

What is power?

A
  • the probability of declaring an effect statistically significant when the effect truly exists
  • a larger sample size increases power
40
Q

What is the ethical principle?

A
  • Inadequate sample sizes (too large or too small) to answer the posed question lead to wasted resources and, in clinical trials, ethical problems
41
Q

What is the alpha level?

A
alpha = probability of a type 1 error = significance level
alpha is usually set to 0.05
42
Q

What is the beta level?

A

beta = probability of a type 2 error (usually 0.2 or 0.1)
power = 1 - beta (usually 0.8 or 0.9)

43
Q

What are the 4 sources of an effect size estimate for a sample size calculation?

A
  • pilot study
  • scientific literature
  • expert suggestion
  • wild guess
44
Q

What are the three variables for sample size?

A
  • expected difference
  • power
  • sample size
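
A sketch of how these quantities trade off, assuming the simplest textbook case: comparing the means of two equal-sized groups with a normal approximation and a two-sided test. That setting, the formula, and the numbers are assumptions for illustration and are not taken from the cards.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(diff, sd, alpha=0.05, power=0.80):
    """Approximate sample size per group to detect a mean difference `diff`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / diff ** 2)

# Made-up inputs: expected difference of 5 units, standard deviation of 10
n_80 = n_per_group(diff=5, sd=10, power=0.80)
n_90 = n_per_group(diff=5, sd=10, power=0.90)
print(n_80, n_90)
```

Fixing any two of expected difference, power, and sample size determines the third; here asking for more power raises the required sample size.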
45
Q

What is the difference between correlation and agreement?

A
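  • correlation means the two sets of values change together in a consistent way (e.g. a consistent ratio), even if the values themselves differ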

- agreement means the numbers are the same

46
Q

What is used to measure agreement for continuous data?

A

Bland-Altman plots
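
A minimal sketch of the numbers behind a Bland-Altman plot: the mean difference between two methods and the conventional 95% limits of agreement (mean difference ± 1.96 standard deviations). The paired measurements are made up.

```python
from statistics import mean, stdev

# Made-up paired measurements of the same subjects by two methods
method_a = [10, 12, 14, 16, 18]
method_b = [11, 12, 15, 17, 20]

differences = [a - b for a, b in zip(method_a, method_b)]
averages = [(a + b) / 2 for a, b in zip(method_a, method_b)]  # x-axis of the plot

mean_diff = mean(differences)       # systematic difference between methods
sd_diff = stdev(differences)        # sample standard deviation of the differences
lower_limit = mean_diff - 1.96 * sd_diff
upper_limit = mean_diff + 1.96 * sd_diff
print(mean_diff, lower_limit, upper_limit)
```

The plot itself shows each difference against the corresponding average, with horizontal lines at the mean difference and the two limits.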

47
Q

What is bias?

A
  • a systematic difference
48
Q

What is used to measure agreement for categorical data?

A
  • cross tabulation

- if perfect agreement the off diagonal would be 0

49
Q

What is sensitivity?

A
  • proportion of true positives correctly classified (not the same as positive predictive value, which is the proportion of positive results that are true positives)
50
Q

What is specificity?

A
  • proportion of true negatives correctly classified (not the same as negative predictive value, which is the proportion of negative results that are true negatives)
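
A short sketch computing these proportions from the four cells of a 2x2 cross tabulation. The counts are made up; the predictive values are included only for contrast, since they use different denominators.

```python
def diagnostic_measures(tp, fn, tn, fp):
    """Sensitivity, specificity and predictive values from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all with the condition
        "specificity": tn / (tn + fp),  # true negatives among all without it
        "ppv": tp / (tp + fp),          # true positives among all positive results
        "npv": tn / (tn + fn),          # true negatives among all negative results
    }

# Made-up counts: 100 people with the condition, 100 without
measures = diagnostic_measures(tp=90, fn=10, tn=80, fp=20)
print(measures)
```

Sensitivity and specificity are calculated within the true-status groups, whereas the predictive values are calculated within the test-result groups.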
51
Q

What can be used to test a questionnaire’s reliability?

A
  • Cronbach’s alpha
52
Q

What are the rule-of-thumb ranges for Cronbach’s alpha?

A
  • 0-0.7 = unreliable
  • 0.7-0.8 = adequate
  • 0.8-0.95 = good
  • 0.95-1.0 = too similar
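
A minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The questionnaire scores are made up.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per questionnaire item."""
    k = len(items)
    # Total score for each respondent, summed across items
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Made-up scores: 3 items answered by 5 respondents
items = [
    [3, 4, 5, 2, 4],
    [3, 5, 5, 2, 3],
    [4, 4, 5, 1, 4],
]

alpha = cronbach_alpha(items)
print(alpha)
```

The result can then be read against the rule-of-thumb ranges on this card.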