Lecture 2 - relationships in research Flashcards

1
Q

Relationships in the research inform what we do.

EX: If I know that increased hip ROM = reduced falls w/ stair navigation for some diagnosis, I’m going to work directly on hip ROM to decrease fall risk

A
2
Q

Measures the strength of association between 2 or more variables
* How related two things are

A

Correlation (does not = causation)

If one goes up, the other goes up (positive) or down (negative)

For example, grip strength and fall risk are correlated: more grip strength = less fall risk. However, grip strength itself in no way helps you not fall, so the two being correlated doesn’t mean grip strength is the cause of decreased falls
* However, overall deconditioning affects both of these factors - if I’m a deconditioned individual, I’m most likely not going to have good grip strength, and my fall risk is going to increase because I’m not improving strength/challenging balance
* So grip strength and fall risk are related; however, they don’t directly affect one another (correlation does not = causation)

3
Q

r = 1 means as one variable increases the other increases (perfect positive relationship)

r = -1 means as one increases the other decreases (still a relationship)
* negative relationship

r = .14 is barely any relationship (norms shown later)

A
4
Q

How strong or weak the relationship between two independent variables is

A

Correlation

Are they in a close relationship, where as one increases the other increases/decreases, or do they have no effect on each other?
* NOTE: It’s still a relationship if one increases at the same time the other decreases and vice versa. It’s not a relationship when there’s no pattern found

5
Q

Pearson’s product-moment correlation (r): Defines the magnitude and direction of a LINEAR relationship
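
A minimal sketch (not from the lecture) of how r could be computed by hand, to show where the magnitude and direction come from. The function name `pearson_r` and the intensity/heart rate numbers are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means (sets the sign of r)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (scale r into the range -1 to 1)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: exercise intensity (arbitrary units) vs heart rate (bpm)
intensity = [1, 2, 3, 4, 5]
heart_rate = [80, 95, 110, 125, 140]  # rises by exactly 15 at each step

r_positive = pearson_r(intensity, heart_rate)        # perfectly linear, upward
r_negative = pearson_r(intensity, heart_rate[::-1])  # same points, downward
```

The sign of the result gives the direction and the absolute value gives the magnitude.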

A
6
Q

r = 0 means no relationship

A
7
Q

Correlation means the two are related, not that one causes the other
* So, when you’re looking at a study you want to know if they controlled for confounding factors (something that would influence this relationship)

Grip strength does not directly reduce the risk of falling, but the level of deconditioning does affect it.
* So it may look like decreased grip strength is causing increased fall risk; however, it’s the deconditioning, and they would need to control for this.
* Since grip strength and falling have an indirect relationship, we can still use that relationship to quantify the risk of falling by getting a numerical value for grip strength (essentially measuring deconditioning via grip strength, and deconditioning has a direct relationship with fall risk)
* We don’t really have a good measurement of deconditioning, so we can use the quantitative value of grip strength for this

A
8
Q

Confounding factors: variables that affect both the independent variable (what is being studied) and the dependent variable (the outcome), making it difficult to determine the true relationship between them
* These factors can give misleading results because they introduce bias, suggesting a false association or masking a real one

EX: Imagine a study is trying to determine if drinking coffee leads to better job performance. The researchers find that people who drink coffee tend to perform better at work. However, a confounding factor could be sleep habits. People who drink coffee might also sleep less or have different energy levels, which could influence their job performance.
* In this case, it’s unclear if the improved job performance is due to the coffee itself or the fact that these people have different sleep patterns. To draw accurate conclusions, the researchers would need to control for sleep habits to separate the effect of coffee on job performance.

A
9
Q

KNOW: correlation r can be used to measure effect size and estimate power or sample size

Effect size = amount of effect the independent variable has on the dependent variable

r = 0.1 indicates a weak effect size (the independent variable barely affects the outcome or dependent variable)

r = 0.8 represents a strong effect size (the independent variable strongly impacts the outcome or dependent variable)

The closer r is to 1 or -1, the stronger the effect size, meaning the independent variable has a larger influence on the dependent variable

Power is the probability of detecting an effect if there is one, while sample size refers to the number of participants needed in a study.

Higher correlation (r) values typically require fewer participants to detect an effect because the relationship between the variables is stronger

Lower correlation (r) values require larger sample sizes to detect an effect because the relationship is weaker and harder to observe

To estimate power or sample size, researchers use correlation (r) in power analysis. The stronger the correlation (effect size), the fewer participants are needed to achieve a high level of power
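
A rough sketch of that idea, assuming the common Fisher z approximation for a two-tailed test of a correlation (this formula is not from the lecture, and dedicated power software may give slightly different numbers). The function name `sample_size_for_r` and its default alpha and power are illustrative choices.

```python
import math
from statistics import NormalDist

def sample_size_for_r(r, alpha=0.05, power=0.80):
    """Approximate n needed to detect a correlation of size r (two-tailed)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # value for the desired power
    fisher_z = math.atanh(abs(r))                  # Fisher z-transform of r
    return math.ceil(((z_alpha + z_beta) / fisher_z) ** 2 + 3)

n_weak = sample_size_for_r(0.1)      # weak effect size
n_moderate = sample_size_for_r(0.3)  # in-between effect size
n_strong = sample_size_for_r(0.8)    # strong effect size
```

Comparing the three results shows the pattern on this card: the weaker the expected correlation, the more participants are needed for the same power.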

A
10
Q

r values:
* Strong =
* Moderate to good =
* Low to Fair =
* Little to no relationship =

A

Remember these values can be positive or negative depending on the direction of the relationship (r ranges from -1 to 1)

11
Q

Is the relationship between body weight and exercise time per week positive or negative?

A

negative

Increased exercise = Decreased body weight

As one variable increases the other decreases

12
Q

Exercise intensity and heart rate is a positive or negative relationship?

A

Positive

Increased intensity = Increased HR

As one variable increases the other increases

13
Q

Assumptions of correlation:

With correlation, do we assume a normal distribution or an abnormal distribution? (w/ graph)

A

Normal

Think of a bell curve
* Many natural phenomena (like height, weight, and test scores) tend to follow this normal distribution

14
Q

Assumptions of correlation:
* Each subject contributes a score for the X and Y axis

What does this mean in the study below?

A

It means we know both their age and their strength

It means that if there was any drop-off in the study, that data should not be included
* Say you got the age but never got a strength measurement; that subject’s data shouldn’t be included

15
Q

Assumptions for correlation
* X and Y are independent measures

Meaning they can be related (which is why we’re doing the study, to see how strongly related they are), but one can’t be a part of the other

EX: If I’m doing a study on BMI, I shouldn’t do a study of BMI vs height, because height is literally a part of BMI (weight/height²)
* Of course those things are related; one influences the other directly
* This serves no value or purpose

EX: We couldn’t do gait speed and distance traveled
* Because gait speed = distance/time, and distance is literally a part of gait speed (very interrelated)
* Distance is going to directly affect it

EX: A good one would be gait speed vs fall risk
* They’re related, but one does not directly influence the other
* One is measured completely independently of the other
A
16
Q
A
17
Q

Dichotomous

A

A type of question that offers only two possible answers (think yes or no questions)
* Either/or; there are two options

18
Q

Assumptions of correlation
* X values are observed

X values are observed: This means you collect data on X, a variable of interest, without manipulating it. X could be something like age, weight or income.

Y can be the intervention: In some studies Y refers to an intervention or treatment that you apply to see if it affects X. For example, Y could be a medication, and you’re interested in seeing how it affects BP (X)

X is the outcome: In other contexts, X might be the outcome you’re measuring after applying Y. For instance, you apply an intervention (Y) and then observe the outcome (X), such as changes in behavior or health status

Both X and Y can also be observed. This means that in many studies, both variables are simply measured without any intervention. For example, you might observe the relationship between height (X) and weight (Y) in a population. Here both X and Y are observed, and you’re looking at how they correlate naturally without any experimental manipulation

Sometimes Y is an intervention or treatment and X is the outcome you’re interested in

In other cases, both X and Y are just observed variables, and you study how they relate to each other without manipulating them.

X = the dependent variable (for when Y is the intervention)

Both would be observed in gait speed and fall risk
* There’s no intervention being implemented here, just observation.

X is always some observed measure

A
19
Q

Assumptions in correlation:
* The relationship must be linear - specifically for Pearson’s product-moment correlation (r)

A
20
Q

5 assumptions in correlation

A

1) Normal distribution
2) Each subject contributes a score for the X and Y axis
3) X and Y are independent measures
4) X values are always observed
5) The relationship must be linear (r)

21
Q

NOTE: This is an example of a non-linear relationship. You can see the r value is low because it’s non-linear

When they use a non-linear line of best fit (like below) they will state what it was.

However, for just straight correlation, we must utilize a linear relationship
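
A small sketch of why a non-linear pattern can produce a low r even when the two variables are tightly related. The data is made up: `y_curve` is a perfect U-shape of `x`, and `pearson_r` is an illustrative helper, not something from the lecture.

```python
import math

def pearson_r(x, y):
    """Pearson r for two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [-3, -2, -1, 0, 1, 2, 3]
y_curve = [v ** 2 for v in x]    # perfect U-shaped (non-linear) relationship
y_line = [2 * v + 1 for v in x]  # perfect straight-line relationship

r_curve = pearson_r(x, y_curve)  # near 0: the ups and downs cancel out
r_line = pearson_r(x, y_line)    # near 1: a straight line fits perfectly
```

Here y is completely determined by x in both cases, but only the straight-line case gets a high r.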

A
22
Q

KNOW: in the study below they wanted to see if there was a relationship between cognitive function and ambulation ability

Null hypothesis = correlation is 0 (no relationship between variables; each one is independent and they do not affect each other at all)
* H0: ρ = 0

Alternative hypothesis = correlation is not 0; there is some sort of relationship there
* H1: ρ ≠ 0

Note: we use ρ instead of r because the hypotheses are about the population, which the sample data is assumed to represent
* Because the data is normally distributed, it can represent the general population (think bell curve)

r = 0.348 (low to fair)
* Slight positive correlation (not very strong, kind of all over the place; look at graph below)
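
A rough sketch of how H0: ρ = 0 could be tested for r = 0.348, assuming a Fisher z normal approximation (statistical software typically uses a t-test on r instead, so exact P values will differ a little). The study's sample size is not given in these notes, so both n values below are hypothetical.

```python
import math
from statistics import NormalDist

def approx_p_value(r, n):
    """Approximate two-tailed P value for H0: rho = 0 via the Fisher z-transform."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    # Test statistic: roughly standard normal when the null hypothesis is true
    z = math.atanh(r) * math.sqrt(n - 3)
    # Area in both tails beyond |z|
    return 2 * (1 - NormalDist().cdf(abs(z)))

r = 0.348
alpha = 0.05
p_small_sample = approx_p_value(r, n=20)   # hypothetical small study
p_large_sample = approx_p_value(r, n=100)  # hypothetical larger study

reject_small = p_small_sample < alpha
reject_large = p_large_sample < alpha
```

The same low-to-fair r fails to reach significance in the small hypothetical sample but does in the larger one, which is why significance and strength have to be read separately.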

A
23
Q

What is a null hypothesis?

A

Assumes there is no effect or relationship between variables

It serves as a default starting position

EX: If you’re studying whether a new drug lowers BP, the null hypothesis would state: “The new drug has no effect on blood pressure.” This means you’re starting with the assumption that the drug does not work (meaning the variables have no effect on each other)

The goal of research is to collect data and analyze it to either
1) Reject the null hypothesis (meaning there is evidence that supports an effect or relationship exists)
2) Fail to reject the null hypothesis (meaning the data does not provide strong enough evidence to conclude an effect or relationship exists)

24
Q

**TEST: She’s going to ask us what a graph generally looks like, whether it’s a strong correlation, and whether it’s a positive or negative correlation**
* And whether it matches the r value given

A
25
Q

It should make sense in the table below that cognition level vs cognition level has r = 1
* It’s the same variable

Cognition level vs ambulation: r = 0.348 (can see below that it’s not a strong relationship)

Significance (2-tailed): Tells us how sure we can be that this 0.348 correlation (r value) is real and not due to chance
* 0.001 < 0.05 = significant relationship
* Meaning that if there were truly no relationship, there would be only about a 0.001 (0.1%) chance of seeing a correlation this large by chance - so we can be pretty sure the relationship is real
* This is important, because now we can be confident that a relationship exists, even though it is not a strong one

TEST
* Should be able to look at the graph below and determine whether it’s a positive or negative relationship
* Be able to look at the graph below and determine if it’s a strong or weak relationship (w/o an r value)
* On the table, know where the r value is (just know it’s in the row labeled Pearson Correlation)
* Understand how to interpret the significance level

NOTE: In real research they would probably only provide the top half of the box below, because the bottom half essentially just repeats it

A
26
Q

α = 0.05

P = 0.14

Is this significant or not significant?

What kind of error does this represent?

A

Not significant, because the probability (P) is greater than alpha (α)

Too great a probability of Type I error

Essentially saying that if we called this result significant, the probability of Type I error would be 14%, when the max we set that it could possibly be was 5% (α)

27
Q

Explain what type 1 error is

A

So P = probability of Type I error

α = 0.05

If P is greater than that (e.g., 0.6+), then we’re saying that our probability of Type I error is too high

Type I error means that the correlation (r) isn’t as special as we think it is
* There’s a higher chance that the r value isn’t actually what we think it is

In short, it’s incorrectly identifying a significant difference or relationship when there isn’t one

28
Q

What is Alpha (α)?

A

The acceptable amount of Type I error
* Typically set at 0.05 or 0.01

If alpha is set at 0.05, we’re saying the acceptable amount of Type I error is 5%

If P = .14, that means we have a 14% chance of Type I error (higher than 0.05 and 0.01), meaning our faith in this correlation (r) being real goes down
* Meaning there’s a chance this isn’t the true correlation for the population

29
Q

Alpha (α) is typically set at 0.05; however, sometimes we set it at 0.01. What can happen if it’s set super low?

A

The chance for type 2 error increases

30
Q

What is Type 2 error

A

Failing to detect a correlation that actually exists: there’s an actual correlation, but the P value comes out higher than alpha, so we say there isn’t a correlation when there is
* More likely when alpha (α) is set very low (e.g., 0.01 instead of 0.05)

We could incorrectly miss a significant difference

We miss an actual relationship

31
Q

What is beta (β)?
* In studies you don’t typically see beta by itself. What do you see instead?

A

The probability of Type II error

Instead of beta you see power (1 - β)

The P value always represents the probability of Type I error - we just compare it to alpha to know if the result is significant

32
Q

What is power
* What do we want power to =

A

1 - Beta

Our probability of correctly identifying a trend that really exists

Want it to be ≥ 80%
* We want to be correct at least 80% of the time
* We want our probability of correctly identifying something to be at least 80%

probs not on exam

33
Q

Compare and contrast ranked/ordinal data w/ continuous data

A

Continuous data:
* Can take on any value within a range. These values can be infinitely precise, meaning they can include decimals or fractions
* EX: Height, weight, temperature, time, distance

Ordinal data: Represents categories with a meaningful order or ranking, but the intervals between the values are not equal or defined
* EX: Survey responses like satisfied, neutral, dissatisfied
* The values indicate a position or order (1st, 2nd, 3rd), but the difference between them isn’t necessarily uniform or measurable
* Ordinal data tells you the relative ranking (better, worse) but not the magnitude of difference between rankings
* Cannot measure precise differences: For example, the difference between a satisfied and a neutral survey response is not as clear or measurable as with continuous data like temperature or weight

In summary, continuous data measures values that can be infinitely precise, while ordinal data ranks or orders categories without specifying exact differences between them

34
Q

Is ranked/ordinal data parametric or non-parametric?
* What does this mean?

A

Non-parametric

Meaning that the data is not assumed to follow a specific probability distribution (like the normal distribution), so parametric statistical methods, which rely on such assumptions, are not appropriate for analyzing it
* Basically, it doesn’t follow a normal distribution/bell curve

(NOTE: continuous data is often parametric, but not always)

35
Q

KNOW:

r = 0 means no relationship (null hypothesis)

r ≠ 0 means there’s a relationship (alternative hypothesis)
* Meaning on the positive or the negative side

A
36
Q

Instead of using Pearson’s product-moment correlation (r), ranked/ordinal data uses what for r?

A

Uses the Spearman rank correlation coefficient

Pearson’s Product-Moment Correlation
* Type of data: Used for continuous data where the relationship between the variables is linear and the data is measured on an interval or ratio scale (height, weight, temperature)
* The data should have a normal distribution
* The relationship between the two variables should be linear
* Use Pearson’s correlation when you want to assess the strength and direction of a linear relationship between two continuous variables
* EX: If you want to know the correlation between students’ test scores and their number of study hours, and you believe the relationship is linear and normally distributed, Pearson correlation is appropriate

Spearman Rank Correlation Coefficient (r)
* Type of data: It is used for ordinal (ranked) data or continuous data that does not meet the assumptions of normality or linearity (ignore this last part). It measures the strength and direction of a monotonic relationship (where one variable consistently increases or decreases as the other does, but not necessarily at a constant rate [non-linear])
* Assumptions:
* The data does not need to be normally distributed
* The relationship can be monotonic rather than strictly linear
* Use case: Use Spearman’s correlation when your data is ordinal or when you suspect a non-linear but still monotonic relationship between variables
* EX: If you want to explore the relationship between rankings of student satisfaction (i.e., satisfied, neutral, dissatisfied) and their class attendance, Spearman’s correlation would be more appropriate

Better said: Spearman’s correlation is used to show the amount of correlation in an ordinal/ranked data set (that’s non-linear), while Pearson’s is used to show the correlation in a continuous linear relationship
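
A sketch of the difference, assuming the usual definition of Spearman's coefficient as Pearson's r computed on the ranks of the data (tied values share the average of their ranks). The helper names and the data are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson r for two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical monotonic but non-linear data: y always rises with x, but not at a constant rate
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)  # ranks line up perfectly
r = pearson_r(x, y)       # penalized for the curve
```

Because only the ordering matters for Spearman, the curved but always-increasing data gets a perfect rank correlation while Pearson's r comes out lower.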

37
Q

Reading and Verbal Comprehension scores in children w/ a learning disability

What we want to know as researchers is if there is a relationship between their reading comprehension and their verbal comprehension (if they’re good at one, are they good at the other?)
* If we speak something to them, how much do they absorb vs if they read something, how much do they absorb?

You can see below that they ran both Spearman’s rho and Kendall’s tau (both showing the correlation between ordinal/ranked data sets [variables that relate to each other in a non-linear way])

You can see that Spearman’s rho yielded a higher correlation (r)
* So they would probably only show Spearman’s
* This is a positive relationship

NOTE: When you look at the level of significance and see .00000, there’s a 1 at the end somewhere; it just means we have a very low probability of error

n = # of participants
* n = 16 and they found a significant value (confident in the r value); however, not very many participants = decreased strength of the study.
* A small # of participants lowers power and makes the r estimate less precise (it does not change the effect size itself); however, the level of significance here is so good that this is less of a concern.

A
38
Q

Know: She’s going to be giving us Spearman’s rho (r) on our exams; however, in research we might see Kendall’s tau (τ). They’re essentially the same thing and are used to show the correlation in a ranked/ordinal, non-linear data set
* Kendall’s tau is still looking at that ranked/ordinal correlation (can have a, b, c)

Most of the time researchers run both of these and only include the one that has the higher correlation value - showing a better response (increases bias)

A
39
Q

When is the intraclass correlation coefficient utilized?

A

When measuring reliability

Correlation between raters’ measures (inter-rater reliability)
* Does a group of various raters tend to get the same measurements when measuring the same thing, or would they be different? Would there be a correlation between all of the measurements?

Correlation within the same rater (intra-rater reliability)

EX: If three doctors are rating the severity of a patient’s symptoms, you can use the ICC to see how consistently they rate the symptoms. If the ICC is high (close to 1), it indicates strong agreement among doctors (inter-rater reliability). If it’s low (close to 0), it suggests that the doctors’ ratings vary significantly from one another

ICC measures the reliability or agreement between measurements or raters. It is useful in evaluating how consistently different raters or repeated measurements produce the same results, with values closer to 1 indicating higher agreement

0 = no relationship

1.0 = strongest relationship

If I wanted to know what the best outcome measures for spinal cord injury are, then I would want to make sure I’m looking at the reliability of a test or measure.
* Need to make sure the test has high reliability

NOTE: High reliability = a measurement or assessment consistently produces the same results under the same conditions. In other words, when something is highly reliable, you can trust that it will give consistent outcomes over time, across different situations, or between different raters.
* Consistency = the same results are produced when the test is repeated
* Dependability = You can depend on the measurement to give the same result across different occasions or raters
* Low variability = there is little random error or fluctuation in the results

EX: If a scale shows the same weight every time you step on it, it is a highly reliable scale

EX: If different teachers grade the same essay and give nearly identical scores, the grading process is highly reliable (high inter-rater reliability)
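
A sketch of one way an ICC could be computed, assuming the one-way random-effects, single-measurement form often written ICC(1,1). There are several ICC forms and the lecture does not say which one is used, so treat this as an illustration only. The ratings are invented, and note that a calculated ICC can come out below 0 when agreement is very poor.

```python
def icc_one_way(ratings):
    """ICC(1,1): one-way random-effects, single-measurement form.

    ratings: list of rows, one row per subject, one column per rater or trial.
    """
    n = len(ratings)     # number of subjects
    k = len(ratings[0])  # number of ratings per subject
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Between-subject mean square: how much subjects differ from each other
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-subject mean square: disagreement among ratings of the same subject
    ms_within = sum(
        (value - m) ** 2 for row, m in zip(ratings, row_means) for value in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical severity ratings: 4 patients (rows) rated by 3 doctors (columns)
perfect_agreement = [[2, 2, 2], [5, 5, 5], [7, 7, 7], [9, 9, 9]]
poor_agreement = [[2, 7, 9], [5, 2, 7], [7, 9, 2], [9, 5, 5]]

icc_high = icc_one_way(perfect_agreement)  # doctors always agree
icc_low = icc_one_way(poor_agreement)      # doctors disagree on every patient
```

The same function works for intra-rater (test-retest) data if each column is a repeated trial by the same rater instead of a different rater.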

40
Q

If I’m testing the same person over and over again and getting the same results, would that be a high level of intra- or inter-rater reliability?
* What correlation coefficient is used?

A

Intra

The intraclass correlation coefficient can be used for both inter-rater and intra-rater reliability.

correlation of test-retest = intra
Between raters = inter

0 = no relationship
1.0 = strongest

41
Q

What is Phi Coefficient used for? (correlation)

A

It’s a correlation coefficient (like the others), so it’s used to determine the correlation between two variables

Used when both X and Y are dichotomous variables

Used to measure the strength of association between two binary variables (variables that can only take on two values, such as yes/no, true/false). It is specifically designed for 2x2 contingency tables, where both variables have only two possible outcomes

EX: If you’re analyzing the relationship between smoking (yes/no) and lung disease (yes/no) in a group of people, the Phi coefficient can tell you how strong the association is between being a smoker and having lung disease

EX: Male/Female & Parkinson’s by age 50 (yes/no)

Typically male/female, yes/no scenarios

Calculated using chi square

Not ranked, one is not better than the other
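
A minimal sketch of the Phi calculation for a 2x2 table, using the standard formula and its link to chi-square (chi-square = n × phi² for a 2x2 table). The counts are made up, and the cell labels a, b, c, d are just a naming convention for this example.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table laid out as:

                     outcome yes   outcome no
        exposed           a            b
        not exposed       c            d
    """
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Hypothetical counts: smokers (30 with lung disease, 10 without),
# non-smokers (10 with lung disease, 30 without)
phi = phi_coefficient(30, 10, 10, 30)

# Link to chi-square for a 2x2 table
n = 30 + 10 + 10 + 30
chi_square = n * phi ** 2
```

With these invented counts phi works out to 0.5 and chi-square to 20, so either statistic can be recovered from the other once n is known.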

42
Q

What is Point Biserial Correlation Coefficient used for (correlation)
* Whats it similar to

A

When X is dichotomous and Y is continuous
* X = Dichotomous, meaning it has two categories (male/female, yes/no, pass/fail) (also observed)
* Y = Continuous, meaning it can take on any value within a range (height, weight, test scores)

Similar to a t-test
* Is there a difference between the two groups?

EX: Male/Female and grip strength
* Does being male or female affect grip strength?
* X = Dichotomous = male/female (also observed)
* Y = Continuous = Grip strength

EX: Gender on test scores
* Male/female vs test scores
* X = male/female
* Y = test scores

Use when you have one dichotomous variable and one continuous variable and you want to measure the strength and direction of their relationship
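
A sketch showing that the point biserial coefficient is just Pearson's r with the dichotomous variable coded 0/1, and how it ties back to the difference in group means (the t-test connection). The grip strength values and the 0/1 group coding are invented for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson r for two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def point_biserial(groups, scores):
    """groups: 0/1 codes for the dichotomous X; scores: continuous Y."""
    n = len(scores)
    y1 = [s for g, s in zip(groups, scores) if g == 1]
    y0 = [s for g, s in zip(groups, scores) if g == 0]
    mean_all = sum(scores) / n
    # Population standard deviation of all scores
    sd_all = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    # Difference between the two group means drives the coefficient
    mean_diff = sum(y1) / len(y1) - sum(y0) / len(y0)
    return mean_diff / sd_all * math.sqrt(len(y1) * len(y0) / n ** 2)

# Hypothetical data: group code (0 or 1) and grip strength in kg
group = [0, 0, 0, 0, 1, 1, 1, 1]
grip = [22, 25, 27, 30, 35, 38, 41, 44]

r_pb = point_biserial(group, grip)
r_check = pearson_r(group, grip)  # same value, computed the Pearson way
```

The bigger the gap between the two group means relative to the overall spread, the larger the coefficient.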

43
Q

What is Rank Biserial Correlation Coefficient (correlation).

A

X = Continuous on the ordinal level

Y = Continuous on a ratio or interval level

MMT and amount of exercise/week

I think this one is probably incorrect because she doesn’t understand it (rank biserial correlation is usually described as being for one dichotomous variable and one ordinal/ranked variable)

X (continuous ordinal variable): This refers to a variable that can be ranked, such as manual muscle testing (MMT), where muscle strength is rated on an ordinal scale (i.e., weak, moderate, strong)

Y (continuous ratio or interval variable): This refers to a continuous variable measured on a scale where distances between values are meaningful, like the amount of exercise per week (i.e., hours per week)

EX: In this case, MMT scores (ordinal) could be ranked, and you would analyze how these ranks are associated with the amount of exercise a person does (continuous, ratio level). A positive correlation would suggest that as MMT scores increase (stronger muscles), the amount of exercise per week also tends to increase

44
Q

Correlation: Tells us the relationship between the 2 scores within the same person
* The higher a student scores in anatomy, the higher they will score in physiology
* Tells us if there is a relationship between anatomy and physiology grades in the same person
* Meaning, is the anatomy score related to the physiology score?
* Generally, the higher they did in anatomy, the better they will do in physiology (positive correlation: as one grade goes up, the other will also go up - note this is in the same person though)
* Tells us if one goes up what happens to the other one, and what the strength of that relationship is = correlation

t-test tells us the difference between the 2 sets of scores (not in the same person)
* e.g., physiology grades are significantly higher than anatomy grades

We can look at the data below and, with a t-test, see if there is a significant difference between anatomy grades and physiology grades

Nice, because with these tests we can decide both whether they’re different and whether they’re related

A
45
Q

When looking at correlation data, consider outliers
* Point(s) of data outside the general cluster
* Could be caused by confounding or extraneous factors

We typically take these out because they destroy our correlation (below, the red r = .095 = basically no correlation, but when we take the outlier out we get r = 0.63, indicating a moderate to strong relationship)
* So really there is a correlation going on, but that 1 point is destroying the entire correlation

Could be a source of bias if tons of outliers were excluded
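
A quick sketch of how much a single outlier can pull r down. The data is invented (it is not the data from the slide), and `pearson_r` is an illustrative helper.

```python
import math

def pearson_r(x, y):
    """Pearson r for two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: eight points that follow a clear upward trend
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 3, 3, 5, 5, 6, 8, 8]

# The same data plus one point far outside the general cluster
x_with_outlier = x + [9]
y_with_outlier = y + [0]

r_without = pearson_r(x, y)
r_with = pearson_r(x_with_outlier, y_with_outlier)
```

Comparing `r_without` and `r_with` shows one point dragging a strong correlation down to a weak one, which is also why excluding points needs to be justified and reported.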

A
46
Q

When looking at the correlation between hospital stay, age, and function, our correlation = 0.34 (tells us there’s a poor correlation between all 3)

However, we know they all have effects on each other

So we split them apart using partial correlation (only looking at how 2 of the variables interact w/ one another), showing a much higher correlation

Correlation of X and Y w/o the effect of Z

Gives us a more complete picture

We know when broken into two variables they have a high correlation
* Age vs hospital stay
* Function vs hospital stay
* Function vs age

All 3 of these are related; they just don’t seem that related when we correlate all 3 at the same time - we use partial correlation to sort this out.
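
A sketch of the "X and Y w/o the effect of Z" idea, assuming the standard first-order partial correlation formula built from the three pairwise correlations. The pairwise r values below are invented, not taken from the study, and depending on the data the partial correlation can come out higher or lower than the plain one.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of X and Y with the linear effect of Z removed."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical pairwise correlations
r_function_stay = -0.50  # X = function, Y = hospital stay
r_function_age = -0.60   # X = function, Z = age
r_stay_age = 0.40        # Y = hospital stay, Z = age

# Function vs hospital stay, controlling for age
partial = partial_correlation(r_function_stay, r_function_age, r_stay_age)

# If Z is unrelated to both X and Y, controlling for it changes nothing
unchanged = partial_correlation(0.50, 0.0, 0.0)
```

Swapping which variable plays the role of Z gives the other two partial correlations (age vs stay controlling for function, and function vs age controlling for stay).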

A
47
Q

Correlation tells us how strong a relationship is (positive and negative)

However, is there a way to make predictions if we know the correlation?

For instance, there is a tie between function and hospital stay time
* If they have a lower TUG score, then they will also have a shorter hospital stay (using function to predict length of stay)

A
48
Q

With correlation we can draw a line of best fit and a line of regression

What does correlation tell us?

What does regression tell us?

A

Correlation tells us the strength of the relationship between the two variables (how closely they change together)

Regression tells us predictive power
* Can we predict one of the variables if we know the other one?

49
Q

What kind of line does regression only work with?

A

Regression (the linear regression covered here) can only predict an outcome in a linear relationship
* Correlation must be linear (line of best fit is linear)

50
Q

Ŷ = a + bX

So this is regression

We’re essentially trying to predict one variable by knowing the other one (we’re solving for Y hat)

So on the next page I’m trying to use BMI to predict systolic BP

X = BMI

a = regression constant (value of Y when X = 0)

b = regression coefficient (slope of the line)

I think we’re using the best-fit line’s slope to plug in here
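
A sketch of how a and b could be found and then used to solve for Y hat, assuming ordinary least squares for a single predictor. The BMI and systolic BP numbers are invented so the answer is easy to check by hand, and `fit_line` is an illustrative name.

```python
def fit_line(x, y):
    """Least-squares estimates of a (constant) and b (slope) in Y-hat = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: how much Y changes, on average, per one-unit change in X
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    # Constant: value of Y-hat when X = 0
    a = mean_y - b * mean_x
    return a, b

# Hypothetical data: BMI (X) and systolic BP in mmHg (Y),
# built so BP rises exactly 2 mmHg per BMI unit
bmi = [20, 24, 28, 32, 36]
systolic = [110, 118, 126, 134, 142]

a, b = fit_line(bmi, systolic)
predicted = a + b * 30  # Y-hat for a new person with a BMI of 30
```

With this made-up data the slope is 2 and the constant is 70, so a BMI of 30 predicts a systolic BP of 130.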

A
51
Q

Number of cigarettes/week vs relative risk of COPD

So we’re looking to see if we can predict their risk of COPD from the # of cigarettes/week

  • independent variable = # of cigarettes
  • dependent variable = relative risk of COPD

So we start by drawing a line of best fit (remember the correlation has to be linear - the line should be straight)

Residuals = The distance of each point from the line of best fit
* If our total residuals = a very large number, that means most of these points are far away from the line of best fit = poor correlation (they aren’t all bunched together around the line)

The strength of the correlation is a good predictor of whether the residuals will be large or small
* Increased correlation = decreased residuals (points aren’t that far away from the line = 1 variable closely tracks the other = large correlation between variables)

A
52
Q

What are residuals

A

The distance of each data point from the line of best fit; averaged together, they show how well the line fits

Think of the average of the green arrows below

NOTE: Increased residuals = decreased correlation (points are further from the line = independent and dependent variables are not that correlated)
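
A sketch of residuals in code, assuming a least-squares line and treating each residual as the vertical distance (actual Y minus predicted Y). The cigarettes/week and relative risk numbers are invented for illustration.

```python
def fit_line(x, y):
    """Least-squares constant a and slope b for Y-hat = a + bX."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    return mean_y - b * mean_x, b

# Hypothetical data: cigarettes per week (X) and relative risk of COPD (Y)
cigarettes = [0, 10, 20, 30, 40]
relative_risk = [1.0, 1.4, 2.1, 2.4, 3.1]

a, b = fit_line(cigarettes, relative_risk)

# Residual = actual value minus the value the line predicts for that X
residuals = [yi - (a + b * xi) for xi, yi in zip(cigarettes, relative_risk)]

# Positive and negative residuals cancel, so average their absolute sizes
mean_abs_residual = sum(abs(e) for e in residuals) / len(residuals)
residual_total = sum(residuals)
```

For a least-squares line the signed residuals add up to about zero (points sit on both sides of the line), which is why the size of the residuals is judged with absolute or squared values.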

53
Q

This is predicting Y hat

A
54
Q

What does regression tell us?
* Can regression be done on a non-linear line?

A

The predictive power of that relationship

Cannot be done on a non-linear line (for the linear regression covered here)

55
Q

This is showing the line of best fit

Not a bad relationship here because the dots are on either side of the line

I’m worried when they’re all above or below, or really far away from, the line of best fit

A
56
Q

r = .868
* What are our residuals and why?

A

They’re going to be pretty low

Because there’s a high correlation, meaning all the dots are close to the line. So the average distance of all the dots from the line isn’t going to be large, because they’re already scattered near that line of best fit since they’re highly correlated (as one variable changes, the other does in a linear fashion)

57
Q

KNOW:

Regression = the line or curve that best fits the data

Residuals = the distance between the actual data points and the regression line. These distances indicate how well the model fits each individual point

A
58
Q

Regression line: fits the best through all the plotted points

The goal is to minimize the distance between the estimated value and the actual value (minimizes error)

You can see in the image on the next slide that the line of regression will have lots of errors (distance of the estimated point from the actual point). However, the goal is to minimize these errors, or make them as small as possible

A
59
Q

This is showing a positive relationship between grades vs study time

b0 = y intercept (where the line of regression meets the vertical axis)
* value of Y when x = 0

b1 = the slope

X = value used to predict y

This is the same as the equation she gave us, with a = b0

You would use a negative sign on the slope for a negative relationship; however, the relationship (correlation) below is positive

A
60
Q

If the correlation coefficient r is really high, then we can assume our residuals are going to be really low (meaning the regression line [line of best fit] was fairly accurate, so the actual points sit pretty close to it and the residuals [distance of each actual point from the regression line] are small)

A
61
Q

R = correlation coefficient (how related x and y are)
* has values from -1 to 1

R squared = how closely the data points fit the regression line
* Tells us how well the regression line predicts actual values
* Only has values between 0 and 1
* r^2 close to 1 tells us that the actual values and the predicted values were very close together (this is a very good line of regression for predicting values)
* r^2 close to 0 tells us that the regression line doesn’t fit the data that well
* In the next slide you can clearly see a large amount of distance between the actual values and the predicted values
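
One way to see what r^2 measures is to compute it from actual and predicted values. This is only a sketch using the standard definition (1 minus residual variation over total variation); the data and the `r_squared` name are made up for illustration.

```python
def r_squared(actual, predicted):
    """R^2 = 1 - (sum of squared residuals) / (total sum of squares)."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1 - ss_res / ss_tot

# Hypothetical actual values and two sets of predictions
actual = [2.0, 4.1, 5.9, 8.2, 9.8]
good_predictions = [2.0, 4.0, 6.0, 8.0, 10.0]  # line passes close to every point
flat_predictions = [6.0, 6.0, 6.0, 6.0, 6.0]   # just guesses the average each time

print(round(r_squared(actual, good_predictions), 3))  # close to 1
print(round(r_squared(actual, flat_predictions), 3))  # close to 0
```

Predictions that sit near the actual points push r^2 toward 1, while a line that does no better than guessing the average lands near 0.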

Yellow = line of regression. blue = actual data points

A
62
Q

r = correlation coefficient

r^2 = how closely the data points fit the regression line (1 being a 100% match between the regression line’s predicted data points and the actual data points)

adjusted r^2 = basically the same thing as r^2, but adjusted for the number of predictors in the model (and the sample size), so it doesn’t go up just because more variables were added

NOTE: for this r^2 value we can basically utilize the same strength-of-relationship cutoffs as we did w/ the correlation coefficient (r)
* i.e., ~80% = strong
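
For reference, a sketch of the usual adjusted r^2 formula, which penalizes r^2 for the number of predictors. The sample size and predictor counts below are hypothetical, as is the function name.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Hypothetical: same r^2 of 0.80 from 30 subjects, with 1 vs 5 predictors
print(round(adjusted_r_squared(0.80, n=30, p=1), 3))
print(round(adjusted_r_squared(0.80, n=30, p=5), 3))
```

With the same raw r^2, the model with more predictors gets the lower adjusted value.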

A
63
Q
A
64
Q

when reading the graph on the card over

The numbers under B are used in that equation she gave us for linear regression

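constant = -29.800 = y value when x = 0
* It’s negative because the line is extrapolated back to BMI = 0, which is far outside the real data; the number has no clinical meaning on its own

6.81 = slope

You can plug in X values and derive what the y value will be
* Plug in BMIs and derive systolic BP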
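
The plug-in step can be sketched as below, using the constant (-29.8) and slope (6.81) from the card. The function name and the example BMI values are assumptions for illustration.

```python
INTERCEPT = -29.8  # "constant" from the table: predicted y when x = 0
SLOPE = 6.81       # change in systolic BP per 1-unit change in BMI

def predict_systolic_bp(bmi):
    """Linear regression prediction: y-hat = b0 + b1 * x."""
    return INTERCEPT + SLOPE * bmi

# Hypothetical BMI values to plug in
for bmi in (20, 25, 30):
    print(bmi, round(predict_systolic_bp(bmi), 2))
```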

A
65
Q

On a test she will give us:
* r
* adjusted r^2
* Significance level

A
66
Q

Non-linear regression
* that line of best fit is no longer linear

Curve/quadratic equation used instead of the slope equation

R^2 depicts strength of relationship (how closely the data points fit that line)

In the example below we know psychomotor skill increases until roughly age 35, then declines w/ age, so a parabola as the line of best fit gives us a higher r^2 (predictive ability) than a linear one would
* meaning the non-linear regression line yields increased predictive ability over the linear one

A
67
Q

Logistic regression

one variable predicts a dichotomous outcome

weight –> Myocardial infarction age 45 (y/n)

Don’t need to know this too deeply

A
68
Q

Multiple/Multivariate regression

Multiple factors used to predict an outcome

A weight is calculated for each factor
* Similar to grades on individual assignments predicting your final grade in a class

Don’t need to know this too deeply

A

Say we wanted to predict systolic BP

We can take each factor out and look at how much it overlaps w/ systolic BP to see which one has the strongest relationship

69
Q

Relative risk:

Typically performed prospectively

Primarily cohort study
* An exposed group and an unexposed group and see what their long term outcomes are

RR > or < 1
* there is a positive or negative association with the exposure

RR = 1 = no increased or decreased risk

RR = 0.5 = decreased risk (half the risk)
* If I exercise I have a decreased risk of heart disease

RR = 2 = doubled risk
* If I eat 2 Big Macs a day I am twice as likely to develop heart disease
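
A sketch of how relative risk is calculated in a cohort study: risk in the exposed group divided by risk in the unexposed group. The counts are invented to reproduce the 2 and 0.5 examples above, and the function name is hypothetical.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Relative risk = risk in exposed group / risk in unexposed group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed

# Hypothetical cohort: 20 of 100 exposed vs 10 of 100 unexposed develop heart disease
print(round(relative_risk(20, 100, 10, 100), 2))  # doubled risk

# Hypothetical cohort: 5 of 100 exercisers vs 10 of 100 non-exercisers
print(round(relative_risk(5, 100, 10, 100), 2))   # half the risk
```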

A
70
Q

Odds ratio
* performed retrospectively
* Primarily case control study

Ratio of the odds of disease in the exposed group to the odds of disease in the unexposed group (odds = chance of disease / chance of no disease)
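
A sketch of the odds ratio from a 2x2 table. The counts and the function name are hypothetical; the calculation is the standard cross-product (a*d)/(b*c).

```python
def odds_ratio(exposed_disease, exposed_no_disease,
               unexposed_disease, unexposed_no_disease):
    """Odds of disease in the exposed group / odds of disease in the unexposed group."""
    odds_exposed = exposed_disease / exposed_no_disease
    odds_unexposed = unexposed_disease / unexposed_no_disease
    return odds_exposed / odds_unexposed

# Hypothetical 2x2 table:
#               disease   no disease
# exposed          30         15
# unexposed        70         85
print(round(odds_ratio(30, 15, 70, 85), 2))
```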

A
71
Q

Multivariate analysis

1 Variable:
* Utilize linear regression to predict based on one variable
* Independent variable could be anything
* Dependent variable is time until death (time after an event, age)

2 or more variables
* Cox proportional hazard model:
* Multivariate regression without assumptions about distribution –> meaning we don’t have to assume a normal distribution
* Often considered non-parametric –> not a smooth line but stepwise
* Used when looking at the risk of dying/terminal event –> frailty (if you’re considered frail, your risk for death goes up; some people will become frail before death)
* Uses hazard ratios - basically an odds ratio –> the odds of that hazardous event (i.e., death, frailty)

A
72
Q

Kaplan-Meier Estimate:

Consider this to be a slope of odds ratios or hazard ratios

Considered non-parametric as it has steps (not a smooth line)

The line is the percentage of individuals still surviving w/ minor LE amputation vs major LE amputation

At the beginning everyone’s alive

Notice that the major LEA group has people die much faster

KNOW: How it’s graphed and why it’s stepwise
* Probs look up video on this

Nice to know, because if you have a PT w/ diabetes and they have a sore on their foot, we can use these numbers about major vs minor amputation to scare them into being cleaner

A

basically taking all this information and graphing it

73
Q

Looks at risk over time, rate of hazard event

A

Hazard ratios

74
Q

Looks at survival probability vs time in hours

Note: the vertical dashes are censoring marks, meaning someone just stopped showing up but didn’t die
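
A minimal sketch of how a Kaplan-Meier curve is built, to show why it is stepwise and how censoring is handled. The follow-up times are made up and the function name is hypothetical. The survival estimate only changes at a time when someone dies, so the curve stays flat between deaths and then drops in a step. Censored people count as "at risk" up to the time they drop out, then simply leave the denominator.

```python
def kaplan_meier(times, died):
    """Return [(time, survival probability)] steps.

    times: follow-up time for each person
    died:  True if the person died at that time, False if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = [(0, 1.0)]  # at the beginning everyone is alive
    for t in sorted(set(times)):
        deaths = sum(1 for ti, d in zip(times, died) if ti == t and d)
        censored = sum(1 for ti, d in zip(times, died) if ti == t and not d)
        if deaths:
            # Step down: multiply by the fraction who survived this time point
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both the dead and the censored leave the at-risk group
        at_risk -= deaths + censored
    return curve

# Hypothetical follow-up (hours); False = stopped showing up but didn't die
times = [2, 3, 3, 5, 8, 8]
died = [True, True, False, True, False, True]

for t, s in kaplan_meier(times, died):
    print(t, round(s, 3))
```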

A
75
Q

What is a hazard ratio?

A

Odds ratio of something hazardous
* Basically the risk of death (not just injury)

76
Q

What does a hazard ratio of 1 mean?

A

No increased or decreased risk of death w/ some activity

77
Q

What does a hazard ratio < 1 mean?

A

decreased risk of death

protective ratio

i.e., 0.35 = 2.86x more likely to survive
* found by taking 1/0.35
* you only take this 1/x for numbers less than 1 since they’re going to be in decimals

For example, if their hazard ratio was 6.58, that means they are 6.58 times more likely to die
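
The 1/x rule can be sketched as a small helper. The function name is hypothetical, and the wording of the output follows the interpretation used in these notes.

```python
def describe_hazard_ratio(hr):
    """Turn a hazard ratio into the plain-language reading used in these notes."""
    if hr > 1:
        return f"{hr:.2f}x more likely to die"
    if hr < 1:
        # Flip ratios below 1 so they read as a protective effect
        return f"{1 / hr:.2f}x more likely to survive"
    return "no increased or decreased risk"

print(describe_hazard_ratio(0.35))
print(describe_hazard_ratio(6.58))
print(describe_hazard_ratio(1))
```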

78
Q

What does a hazard ratio > 1 mean?

A

Increased risk of death

79
Q

When looking at these curves, know that the p value is basically just stating whether there’s a statistically significant difference between the two groups
* lower = stronger evidence of a difference

A
80
Q

At first the high-prominence characters had the highest survivability, then that switched w/ the low-prominence characters. Where the circle is on the table below, some big event happened and we want to figure out what (just tells more of the story)
* Is there some event that occurs around this point that caused this switch?

EX: Part-time vs full-time employment and health
* Maybe we have a better health outcome when we’re working part time initially because we have more time for ourselves
* However, once you’ve been working part time for long enough, other stressors start to set in (i.e., can’t afford to pay bills). That increased stress will decrease survivability and could be the reason for a switch between 2 subgroups on one of these graphs
* She’ll be asking what could’ve caused that switch at a certain point (what occurs around that time to make that switch?)

A
81
Q

KNOW: When we have a p value between two different subgroups on one of these graphs, it’s telling us whether the difference between those two groups is statistically significant

However, when we have 3 different groups on these graphs, that p value is just telling us that there’s a significant difference somewhere
* It doesn’t tell us where or between which ones, just that there’s a significant difference somewhere
* Somewhere along the way, one of those curves is significantly different from another one

A
82
Q

What kind of curve is this?

A

Kaplan-Meier
(note the steps)

shows survivability over time

NOTE: some don’t show that shaded area (which is the 95% confidence interval)

83
Q

What is this?

A

Forest plot

NOTE: You can take those Kaplan Meier curves and turn them into a forest plot

84
Q

What does the p value at the top of this graph tell us?
* Why is p high?

A

How statistically different the females were from the males

p is high because the 95% confidence interval crosses the reference line (odds ratio = 1)
* meaning that it could be either an increased chance of the event or a decreased chance of the event (we really don’t know)

Essentially telling us the difference between the odds ratios of females and males
* how different is the risk of being male vs female

Females odds ratio = 0.80
* because this is less than 1, they have a decreased risk of dying compared to males
* 1/0.80 = 1.25 times less likely to die than males (which is not much of a difference [only 25% more likely to live])
* Which is part of why that p value is so high, meaning there’s not really a statistically significant difference in risk between males and females

We’re 95% sure that if you’re female your odds ratio is somewhere between 0.5 and 1.3, and that doesn’t really help because the reference value of 1 also falls between those #’s

NOTE: Kaplan-Meier gives you the story as time goes along and this just gives you the overall picture
* meaning at one point maybe females were much less likely to die, then it evened out
* We couldn’t see that with this, but we might’ve w/ Kaplan-Meier

85
Q

2 in this forest plot is looking at social status

  • What does the p value tell you?
  • Is it statistically different? Why?
A

P value tells us whether the difference in the chance of death between highborn and lowborn is statistically significant

Not statistically different because that 95% confidence interval crosses our 1 –> meaning they may be at an increased risk or a decreased risk (we really aren’t sure)

So the highborn is the reference in the odds ratio

Lowborn = 1.28, meaning they’re 1.28 times more likely to die than the highborn. HOWEVER, since that 95% confidence interval crosses 1 (the odds ratio of our reference group, the highborn), we aren’t actually sure if there’s a difference between groups, which is why that p value is so high

86
Q

What does an odds ratio of 1 mean?

A

No increased or decreased risk (the odds are the same in both groups)

87
Q

Basically, when the 95% confidence interval crosses the odds ratio line, the true value could be 1 (no difference), which is why the p value is so high (the p value tells us whether there’s a difference between the two groups, and we’re not actually sure there is because 1 falls in that 95% confidence interval)
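
The "does the interval cross 1" check can be sketched as below. The function name is hypothetical; the two intervals are the ones quoted in these cards (females 0.5 to 1.3, allegiance switched 0.17 to 0.70).

```python
def ci_excludes_reference(ci_low, ci_high, reference=1.0):
    """True when the 95% CI does not contain the reference value (ratio of 1)."""
    return not (ci_low <= reference <= ci_high)

# Females vs males: interval crosses 1, so we can't say there is a difference
print(ci_excludes_reference(0.5, 1.3))

# Allegiance switched vs not switched: interval sits entirely below 1
print(ci_excludes_reference(0.17, 0.70))
```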

A
88
Q

For allegiance switched, why is the p value so low?

A

No allegiance switched is our reference, so we’re comparing allegiance switched to that

So the odds ratio of dying with allegiance switched is 0.35 (so decreased risk of dying compared to the no allegiance switched group).
* also that 95% confidence interval does not cross 1, meaning we’re 95% sure that the actual value falls at some point to the left of 1, so there’s a real difference between the switched and unswitched groups
* this is why the p value is so low (we’re very confident there’s a real difference between the two groups)

1/.35 = 2.86

The allegiance switched group is 2.86 times less likely to die than the unswitched group

NOTE: we can do the same thing w/ the 95% confidence interval

1/0.17 = 5.88
1/0.70 = 1.43

So we’re 95% sure that if you switched your allegiance you’re somewhere between 1.43 and 5.88 times less likely to die
* 95% confident that the true risk falls somewhere in this range
* tells us where the true value lies for the population
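
A sketch of the same inversion applied to the ratio and both ends of its confidence interval. The function name is hypothetical. Note that the bounds swap places when you take 1/x: the old upper bound becomes the new lower bound.

```python
def invert_ratio_with_ci(ratio, ci_low, ci_high):
    """Invert a ratio below 1 and its CI; returns (ratio, new_low, new_high)."""
    return 1 / ratio, 1 / ci_high, 1 / ci_low

flipped, low, high = invert_ratio_with_ci(0.35, 0.17, 0.70)
print(round(flipped, 2), round(low, 2), round(high, 2))
```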

89
Q

What is multivariate regression?

A

Using multiple variables to make predictions

Remember, regression is using some relationship between variables to make a prediction
* can we predict one variable if we know the other

90
Q

They used multivariate regression to predict the % of people living at 5 years
* meaning they used multiple variables to make that regression line to make predictions

Is the graph below parametric?

They’re using low loneliness score as their reference

This graph below shows the risk of frailty with loneliness level over time.

The next one shows the risk of frailty with social isolation

Increased loneliness = increased death rate

Increased social isolation = increased death rate but to a lesser degree
* makes sense because some people might be socially isolated but they might not mind or be lonely
* While if they actually rate themselves lonely they probably really are lonely
* the biggest thing is perspective: someone who is isolated might not actually feel lonely

A

Non parametric (not a smooth line)

91
Q

She essentially graphed this using low loneliness as the reference for the odds ratio

A

This is looking at the effect of loneliness on frailty

Notice that the 95% confidence intervals for the medium and high loneliness scores never cross that 1 mark (the odds ratio reference), meaning they’re statistically significant

This is a version of multivariate association
* using multiple variables

Next we look at the effect of social isolation on frailty

92
Q

Appraising Prognostic Studies: Design Quality
* Questions for appraising these studies (prognostic for a certain diagnosis)
  * Think risk factors for RTC injury

Is there a defined, representative sample of patients assembled at a common point that is relevant to the study question?
* Does it represent the population we’re talking about
* Make sure data is fully representative of that population (not just some subgroup [i.e., wealthy whites])

Were end points clearly defined?
* When the study was cut off, because that’s going to give us the picture for that time frame

Are the factors associated with the outcome well justified in terms of the potential for their contribution to prediction of the outcome?
* With loneliness and social isolation vs frailty, there are lots of studies saying they are well related
* Or are they totally unrelated to the point where it doesn’t make a whole lot of sense (playing video games vs being good at growing vegetables) - unrelated

Were evaluators blinded to reduce bias?

Was the study time frame long enough for participants to experience the outcome of interest?
* We need to follow people long enough

Was the monitoring process appropriate?
* Did they have to come in? Were there any barriers to that process?

Were all participants followed to the end of the study?
* They should clearly state this

A
93
Q

Appraising Prognostic Studies: results and Conclusions

What statistics were used to determine the prognostic statements? Were they appropriate?

Were clinically useful statistics included in the analysis?
* Were they useful?

Will this prognostic research make a difference for my recommendations to the patient?

A