2nd Stats Exam MCQ Flashcards
What is the violation of the assumption of independence? (Multi Level Modelling)
When one data point is dependent on another data point (one data point gives info to another data point)
Linear regressions assume independence, so when this assumption is violated you use a multi-level model or a logistic regression
What does the General Linear Model change to once you start using multi-level modelling?
It becomes part of the GENERALIZED linear Model
What is the difference in calculation between independent and paired samples t test?
The way SE/variance is calculated
Independent samples - when the SE is calculated > the variance for both groups is incorporated into the calculation = pooled SE
Paired samples > whatever variance occurs at time 1 also occurs at time 2 e.g. individual differences (hunger, mood, time…)
Therefore you do not need to count the variance for both time points. If you did, it would be double counting and you would get drastically incorrect p values (see the sketch below)
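A minimal Python sketch (not from the flashcards) contrasting the two SE calculations; the scores below are made up purely for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical scores for the same five people at two time points
time1 = np.array([4.0, 5.0, 6.0, 7.0, 5.5])
time2 = np.array([5.0, 6.5, 6.0, 8.0, 6.5])

# Independent-samples logic: variance from BOTH groups is pooled into one SE
t_ind, p_ind = stats.ttest_ind(time1, time2)

# Paired-samples logic: shared individual-difference variance cancels out,
# because only the difference scores are analysed
t_rel, p_rel = stats.ttest_rel(time1, time2)

print(t_ind, p_ind)   # pooled-SE version
print(t_rel, p_rel)   # difference-score version
```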
Just like with Binary logistic regression, we assess mixed effects models using what?
Hint: similar to SSE reduced
-2 Log Likelihood
What do we find out from the Estimates of Covariance Parameter box? Specifically, the intercept variance box?
A p value either under .05 (signif) or over .05 (nonsignif)
If the p value is signif it tells us there is SIGNIF VARIANCE in intercepts and hence we were right to conduct a MLM
What is the new value introduced in Binary Logistic R, used in MLM? and what is it equivalent to?
Wald stat - equivalent to the t score
Calculation:
1) estimate (b value) / SE (found in the Estimates of Covariance Parameters box)
2) then SQUARE (worked example below)
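A tiny worked example of the two steps, using an assumed (hypothetical) estimate and SE:

```python
# Hypothetical numbers just to illustrate the calculation described above
estimate = 0.80   # b value from the output (assumed)
se = 0.25         # its standard error (assumed)

wald = (estimate / se) ** 2   # step 1: estimate / SE; step 2: square
print(wald)                    # 10.24
```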
What does the ‘Empirical Best Linear Unbiased Predictions’ box show us?
Each ppt’s u0 variable estimate - the difference between their most ideal ‘intercept’ and b0 (i.e. the deviation of their ideal intercept from b0)
Later on, a model with random intercepts AND random slopes will include each ppt’s u1 variable estimate too
Dependency between data points can also come in the form of what?
Clustering (a general problem) - when a subset of ppts are more connected with each other
Whether clustering exaggerates in favour of our hypothesis, or against it, dependency will always distort our analysis from reality
If 2 related ppts are both in the post-experiment group, what happens? (CLUSTERING)
The increased score for the 1st ppt after the experiment causes the 2nd ppt’s score to be increased too, through talking to the 1st ppt
So, the experimental group scores are exaggerated
If there are 2 related ppts in study and one is in the control group and one is in the experimental group, what happens? (CLUSTERING)
1st ppt’s scores will increase after the experiment > this leads to an increase in the 2nd ppt’s scores in the control group, through talking to the 1st ppt
makes the exp group look less effective than it is, because scores in both groups look the same (not much to compare)
If 2 related ppts are both in the control group, what happens? (CLUSTERING)
An event outside the study could bring the 1st ppt’s mood up or down, thus affecting the 2nd ppt’s mood too
This brings both control scores up (exp looks worse) or down (exp looks better)
scores are not based on the exp itself!
What is hierarchical clustering? An example?
When data is naturally grouped, at multiple levels
- In schools for example, children within classes talk to each other and children within schools too (but less so)
How do we deal with (hierarchical) clustering?
Modelling the dependencies - include dependencies (both class and school) in our model as variables
When looking at if there is an effect of cosmetic surgery on quality of life, where patients are within clinics, how does clustering occur? (L7)
- having different clinics - one clinic in one area better than others = ppts from this clinic starting with higher baseline quality of life
- Different surgeons within clinics - good surgeons boost quality of life a lot more post surgery in exp group, compared to bad surgeons. Bad surgeon could do the surgery wrong and worsen quality of life!
Based on the cosmetic surgery example, what are the variables we need to consider? (L7)
- Quality of life
- Surgery
- Clinic
For the cosmetic surgery example, what is the formula for the first model where we do not specify random effects yet? (MLM L7)
NOTE: you can run an MLM the same way as a linear regression
Quality of life = b0 +b1*Surgery
one way of getting -2LL for Linear R
Also called a No random effects / fixed effects model
When does the MLM depart from Linear R?
When we start adding random intercepts
What variable is added to the original GLM formula when we add random intercepts?
The u0 variable
the formula becomes (b0 + u0[variable the varying intercept is based on]) + b1*Predictor (e.g. Surgery)
Based on the cosmetic surgery example, what variable is the varying intercept based on?
Clinic > each clinic now has its own intercept (u0) for quality of life
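A minimal random-intercepts sketch in Python/statsmodels rather than SPSS; the data are synthetic and the column names (QoL, Surgery, Clinic) simply mirror the example:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Tiny synthetic data set: 10 clinics, 20 patients each (purely illustrative)
rng = np.random.default_rng(0)
n_clinics, n_per = 10, 20
clinic = np.repeat(np.arange(n_clinics), n_per)
surgery = rng.integers(0, 2, n_clinics * n_per)
u0 = rng.normal(0, 2, n_clinics)[clinic]           # clinic-specific intercept shifts
qol = 50 + u0 + 3 * surgery + rng.normal(0, 5, n_clinics * n_per)
df = pd.DataFrame({"QoL": qol, "Surgery": surgery, "Clinic": clinic})

# Fixed effect of Surgery, random intercept (u0) for each Clinic
result = smf.mixedlm("QoL ~ Surgery", data=df, groups=df["Clinic"]).fit(reml=False)
print(result.summary())
print("-2LL:", -2 * result.llf)   # the model's -2 log likelihood
```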
Why do we vary the intercepts based on a particular variable in MLM?
By changing the intercept, we are effectively allowing DV (quality of life) to ‘start’ at a different point in each variable (clinic)
Aside from the p value given in the Estimates of Covariance Parameters box, how can we find whether DV (quality of life) varies between u0 variables (clinics)? (this can be a legitimate research q)
Once the variance is found what is the difference called?
Compare how much the DV varies with the random intercept to how much it varies without the random intercept (u0 variable):
-2LL of the previous model with no random intercepts (the linear R model) minus -2LL of the current model with random intercepts
Difference = Likelihood ratio
How do you get a chi square p value in the chi square calculator?
Enter the likelihood ratio and a DF of 1 (see the sketch below)
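A sketch of that chi-square lookup in Python; the two -2LL values are assumed, not taken from the lectures:

```python
from scipy import stats

neg2ll_no_random = 1850.2   # linear regression, no random intercepts (assumed)
neg2ll_random = 1837.5      # random-intercepts model (assumed)

lr = neg2ll_no_random - neg2ll_random   # likelihood ratio = 12.7
p = stats.chi2.sf(lr, df=1)              # chi-square p value with 1 df
print(lr, p)
```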
Andy Field says that for testing random effects, the __1__ is much more accurate, so you should rely on this rather than __2__
- Likelihood ratio (diff in -2LL between random intercepts model and no random intercepts model(Lin R))
- Wald Z (in the covariance parameters box)
What in SPSS tells us whether there is in fact significant variance in intercepts between participants,
i.e. whether the ‘best’ intercepts for each participant are significantly different from each other (i.e. whether they needed to be ‘random’)?
The ‘Estimates of Covariance Parameters’ box
look at the p value
How do we get ppts individual intercept?
Add individual u0 variable (quality of life estimate for each clinic) to the fixed b0 value
If we have random intercepts and slopes for a variable (clinic) what are we effectively running?
Separate regressions for each variable (clinic, so 10)
- which can be manually combined - you just have to split the clinic variable
Across the 10 regressions are the best slopes for each clinic. So, rather than ___1___ being our data points, we are now making ___2___ our data points
1. Patients
2. Clinics
Which t test’s t and p values are equivalent to the MLM’s t and p values?
One Sample T Test - compares the manual MLM’s 10 data points (the b values from the manual regressions) to zero (the null) (L7)
- mean of the b values / SE = calculated t
- the t given is close to the t in the MLM intercepts row (see the sketch below)
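A sketch of that manual check with ten hypothetical per-clinic b values:

```python
import numpy as np
from scipy import stats

# Ten hypothetical b values (slopes from 10 separate per-clinic regressions)
b_values = np.array([1.2, 0.8, 1.5, 0.9, 1.1, 1.3, 0.7, 1.0, 1.4, 0.6])

# One-sample t test against zero (the null): mean(b) / SE(b)
t, p = stats.ttest_1samp(b_values, popmean=0)
print(t, p)
```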
Why do many researchers think psychology is in trouble? e.g. Brian Nosek, Stephen and others (L8)
Replication Crisis - when studies are replicated different p results are found (actually non signif…)
How did Nosek demonstrate the replication crisis? (L8)
He replicated 100 randomly selected studies, all with signif p values, and found only about a third maintained the signif result
What does this replication crisis mean for us? (L8)
Belief in Null Hypothesis Significance Testing (p values) drastically declines
What is the good about human reasoning?
We can draw on our knowledge of the world (our current model - our view of someone and the factors affecting their decision), which affects how we interpret ambiguous data
We use our current model of knowledge because we cannot measure the direct info (e.g. someone’s thoughts) that would give us the correct answer
What is the bad about human reasoning?
Confirmation Bias - seeing what we expect to see, followed by biased updating of our beliefs
(a) interpreting ambiguous data in light of your prior beliefs and (b) updating those very same beliefs based on your interpretation of the ambiguous data
What scientific method is used in psychology to try and prevent confirmation bias? and who are the famous figures who helped to establish this?
Null Hypothesis Significance Testing (NHST)
Francis Bacon, Pearson, Neyman, Popper, Fisher,
What is the basic idea behind an experiment? (Bacon)
create a hypothesis and if what we expect occurs under certain circumstances then we expect our theory to be true
/ scientists are trying to contrive a situation which will produce unambiguous evidence either for or against the theory
We can conduct many studies to support our theories, but according to Karl Popper, what do we also need to do?
To conduct studies FALSIFYING our theories too - identifying that there can be evidence against our theory too (idea behind crucial experiment)
The idea of a crucial experiment (Bacon) in science was created for fields like physics, which can say ‘If my theory is true, this will definitely happen - if it is false, this other thing will happen, or nothing at all’
but why is there an issue using it in psychology (and medicine)?
Psychology and medicine deal with PROBABILISTIC THEORIES - if my theory is true, then on average this will happen / on average these people will do better
Therefore, we cannot really follow Bacon’s idea of the crucial experiment - our experiments do not definitely prove a theory wrong > they only provide PROBABILISTIC EVIDENCE
The issues with using crucial experiments in psychology allow what to creep back in, that we tried to prevent before with the scientific method?
Probabilistic theories and evidence allow confirmation bias to creep back in
the data is open to interpretation by the researcher, allowing biased prior beliefs in favour of the theory > the data gets interpreted in a biased way
What is the data we get based on Fisher?
P values and effect sizes
p value is the probability of getting the current data if the null hypothesis is true
A CONTINUOUS SPECTRUM - the smaller the p, the more significant our data is, and we can reject the null
How do the ideas of analysing our data differ between Fisher and Pearson&Neyman?
Fisher - continuous spectrum for p value = smaller p = evidence for our theory
P&N came up with a DECISION RULE in NHST - ‘reject’ our null hypothesis if the p value is less than .05 (signif result) - conveying evidence for our theory
Intended to stop interpretation of p values > but has the opposite effect
How has the replication crisis arisen?
hint: confusion between ideas
A mixing of Fishers idea and Pearson/Neymans idea of analysing data - should both be exclusive (use one way or the other)
What are the problems in science, then, that allow for the replication crisis? These show NHST cannot prevent these problems from corrupting science.
HINT: 4 Horsemen
Publication Bias > P-Hacking and bottom drawer effect
HARKing
What is publication bias? (and so the bottom drawer effect)
Majority of papers published show significant results and nonsignif ones are discarded or not written up
signif papers seem more interesting and generate more money
Many studies suspiciously reporting p values just below .05 (reported by Masicampo)
What is the bottom drawer effect?
The incentive structure is to write and publish papers with signif results and discard any with non-signif results, as they are usually rejected by journals (chucking non-signif findings in the bottom drawer)
= SKEWED REPRESENTATION
What is one technique to do if publication bias is occurring in research?
Conducting a meta-analysis, i.e. a statistical review - it tries to come to a single statistical overview of the field
Instead of throwing non-signif results in “bottom drawer” what is a more nuanced form of publication bias?
P-HACKING - nudging our p-value towards a signif result - e.g. researchers will try all the different ‘acceptable’ ways (methods and analyses) out and pick the one that is significant (CONFIRMATION BIAS)
because researchers have a certain degree of freedom when running an experiment
How can we prevent Bottom drawer effect and P-hacking?
- Following Neyman-Pearson approach more - pre-registration (registering experimental and analysis plans before collecting data)
- Give up on NHST / p value being .05 entirely
make it continuous again, like Fisher proposed?
arbitrary significant vs non-significant distinction removed
What is a statistically-legitimate opposition to NHST?
Bayesian approach
Why is it hard in psychology to produce powerful tests of theories?
because we have weak theories to begin with (and thus vague predictions) - e.g. we say meditating makes us happier - well, by how much? no idea
and weak experiments (unlike in physics)
- studying complex humans - messy situation
- confounded - whether a confound variable explains the signif result rather than the theory (whether it does or not is very subjective)
Why do we come to rely on a-priori hypothesising?
a priori hypothesising suggests that the researcher is on to something impressive and that the result is not due to confound variables (… many researchers want to look like this, so they may HARK)
What is HARKing?
Hypothesising after the results are known and pretending to have made a priori hypotheses
occurs because of the incentive framework - work with impressive a priori hypotheses is more likely to be published - but it is difficult to predict before seeing the results, so HARKing can help get published
pre-registering is proposed to prevent HARKing… OR give up the sanctification of a priori hypothesising so there is no incentive to HARK (get rid of NHST)
Why do some want to get rid of NHST?
we don’t know what our theory predicts (so it is difficult to make a priori hyps) and we don’t know if our experiment is a good test of our theory before we run it
we don’t have any mature enough theories to use NHST and make hypotheses; we should just do exploratory studies first with no expectations in mind
What is the problem with a-priori hypothesizing?
based on vague, weak theories to begin with
confounding variables tend to occur during the experiment, causing a priori hyps to weaken - we are not able to make predictions when everything is this complex
Still in exploration stage, not NHST stage
What do we mean by removing ‘significance’ thinking?
Not saying results are significant at .05 in an experiment; instead saying = the experiment provides a bit of evidence for or against a theory
No a priori hyps needed - it doesn’t matter if you predicted it beforehand or not – it’s just one piece in the puzzle - in a decade or so we’ll have some idea if the theory is true or not
Why will it prove hard to remove ‘significance’ thinking and apriori hypothesizing?
privatised capitalist journal system which prioritises ‘interestingness’
(L7) instead of getting a whole-model r and f value, what do we get instead in MLM?
a whole-model -2LL
we do get individual predictor estimates (b, t and p values) in the fixed effects box
(L6) Give an example of things you measure indirectly?
Cognitive ability (what we actually want to measure) through scores on an IQ test (tools we use to measure this indirectly)
What Q should we keep in mind always regarding measurement theory?
What is the relationship between my measurement tool and the thing I actually want to measure? (relationship being distance)
What is the relationship between the external tool (indirect measure) and the internal thing we want to measure (that is not directly observable)?
The validity of that measurement tool, i.e. to what extent are we measuring what we intend to measure? The extent is far from perfect.. (far from an r2 of 1)
Why is there never a perfect relationship between what a tool is measuring and what we really want to measure?
NUISANCE VARIABLES - there are many other factors that we do not want to measure that affect responses on the measurement tool.
This prevents us from actually measuring what we want to measure = adds noise to your measurement tool
e.g. daily mood, hunger, time
When we collect data with our measurement tools, what do we get back?
different DATA TYPES - NOMINAL, ORDINAL, INTERVAL or RATIO
data type might not always match the data type of the thing we actually want to measure
What is nominal data?
Data with no particular order e.g. gender, eye colour
What is ordinal data?
Has a particular order e.g. ranks in the army, race positions or a Likert scale with non-symmetrical points (lopsided)
What is interval data?
Has a particular order and equal distances between points e.g. a Likert scale with symmetrical points (very bad, bad, neutral, good, very good)
What is ratio data?
Has a particular order, equal distances between points and true zero
if someone scores a ‘zero’ on your measurement tool, they also have ‘zero’ on the thing you’re trying to measure e.g. reaction time (height and weight)
When does data become less crude?
When it moves from nominal >>> ratio
- conduct more sophisticated calculations on data further to the right
How many points are recommended on a Likert scale?
5 or 7 points to balance sufficient precision while not overloading your responder
What is another thing to think about when deciding the data type of your measurement tool?
How your responders INTERPRET your scale - they may interact with the scale in an ordinal manner, rather than an interval manner
e.g. a five-star rating system - the data points are not equally distributed > people are overly generous with 5-star ratings and 4 stars doesn’t mean good anymore…
Why is it important to think about whether your data is ordinal or interval?
If you get them confused and the data is actually ordinal, the mean will NOT be the central point as intended with interval data
Give an example where the same measurement tool used to measure one thing might be interval and used to measure another thing that might be ratio
Measurement tool = number of incidents
data gathered is ratio when = want to measure the reduction in challenging behavior (external - directly observable) - 0 incidents = 0 challenging behaviours (true zero)
data gathered is interval when = want to measure happiness (internal - not directly measured) = no. of incidents then becomes an indirect measure, because a reduction in incidents might not indicate increase in happiness…
What do we call data that have only two ‘points’ on the scale, for example questions with only ‘Yes / No’ responses, or ‘True / false’, responses.
Why is this type of data dealt with in a special way?
Binary data
Special - we are unsure whether two points have a particular order, and to have equal distances between points you need at least 3 points
When is logistic regression used?
When a criterion variable is binary
Lin R can only handle criterion variables that are at least interval data
Give an example of a phenomenon described as binary?
Down’s syndrome - a condition where you either have it or you don’t (genetic mechanism)
For many disorders the way they are measured and the way they are diagnosed varies, how so?
A disorder (e.g. autism) may be measured on a spectrum (not directly measurable), but diagnosed in a binary way (either have condition or you don’t based on cut off point)
(L6) what are standard ways of coding criterion variables for logistic regression e.g. having a disease and not having a disease?
Only dummy coding is used
Having the disease = 1
Not having the disease = 0
What happens if you put a binary criterion variable into a Lin R model?
Why is this a problem?
It produces a line with NO BOUNDS - goes to infinity in both ways
Needs to make predictions that stay in its max 1 and min 0 bounds
What does a graph of predictions show when the binary criterion model is run in Log R?
A SIGMOIDAL CURVE, not a linear line
What causes the difference between graphs in Lin and Log R?
from linear >>> sigmoidal curve
The right side of the normal linear regression formula is transformed:
the Lin R formula is multiplied by -1, e is raised to the power of the result, 1 is added, and then 1 is divided by the total, i.e. P(Dis) = 1 / (1 + e^-(b0 + b1*X))
How is the left side of the normal linear regression formula transformed in Log R?
Now predicting the PROBABILITY of having the disease = P(Dis) (rather than just predicting the disease = Dis)
In logistic regression - looking at predictions of probability that an individual with the given cause levels has the disease (or whatever you coded as ‘1’)
What values change when we transform the normal linear regression formula?
b0 and b1 values
What happens to the graph when we multiply the normal linear regression formula by -1?
It FLIPS the direction of the line
What is the second step for the Log R transformation?
Raising e to the power of the model so far
(e is 2.71828)
we raise a number (any number) to the power of the linear regression formula which has dramatic effect on the line
What happens to the line on the graph when we complete step 2 of the Log R transformation?
(step 2 - raising e to the power of the model)
The steepness of the slope of the line is reduced increasingly as it approaches zero, so it never quite reaches zero (i.e. it is bounded at zero).
What is the third step for the Log R transformation?
Adding 1
What step in the Log R transformation do you divide 1 by the model so far? and what happens to the line on the graph?
Step 4 - as the line approaches 1, the slope of the line becomes less steep, so it never quite reaches 1
we end up with a SIGMOIDAL CURVE bounded at 0 and 1, like our criterion variable (producing probabilities of having the disease, i.e. of being coded 1) - see the step-by-step sketch below
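A step-by-step sketch of the four transformations, using made-up b values:

```python
import numpy as np

b0, b1 = -3.0, 0.5           # hypothetical coefficients
x = np.linspace(-10, 20, 7)  # hypothetical predictor values

linear = b0 + b1 * x         # ordinary linear-regression model (unbounded)
step1 = -1 * linear          # step 1: multiply by -1 (flips the line)
step2 = np.exp(step1)        # step 2: raise e to the power of the model
step3 = 1 + step2            # step 3: add 1
p_dis = 1 / step3            # step 4: divide 1 by the result -> sigmoid, bounded at 0 and 1

print(np.round(p_dis, 3))    # predicted probabilities P(Dis)
```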
Why can we not use SSE to assess our model in Log R?
because this model produces predictions of probability rather than just predictions of a continuous variable e.g happiness
When do we use the left vs right side of the LL stat formula for assessing the model?
LEFT - if a ppt is coded 1, so has the disease
RIGHT - if a ppt is coded 0, so doesn’t have the disease (see the sketch below)
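A sketch of the LL formula as described (left side used when coded 1, right side when coded 0); the y and p values here are hypothetical:

```python
import numpy as np

y = np.array([1, 1, 0, 0, 1])            # DISi: 1 = has disease, 0 = doesn't
p = np.array([0.9, 0.19, 0.2, 0.6, 0.7])  # predicted P(Dis) for each ppt

# Per-participant LL: y*ln(p) picks the left side, (1-y)*ln(1-p) the right side
ll_individual = y * np.log(p) + (1 - y) * np.log(1 - p)
ll_total = ll_individual.sum()   # always negative; closer to 0 = better
neg2ll = -2 * ll_total           # positive; smaller = better model

print(np.round(ll_individual, 3), round(ll_total, 3), round(neg2ll, 3))
```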
why do we have ln in our LL formula? (L6 assessing model)
To undo the fact that we raised e to the power of our model - ln(x) is the ‘inverse’ of e^x
for individual scores because we are dealing with values between 0 and 1, they will always come out ____ after we apply ln (L6)
negative
when working out individual scores of LL, ____ values are better i.e. they mean the probability prediction was closer to the correct answer.
smaller (i.e. closer to zero)
If we get a larger number for the LL of an individual score, what does this mean?
the prediction was less good - because, say, the model predicted a small chance of having the dis [P(Dis) of .19], but they actually had the disease (coded 1) = a larger (more negative) number
What do we do once we work out the LL of the individual scores?
We sum them up - equivalent to how we sum up all the individual squared errors for linear regression
This gives us a FINAL LL value (larger neg value = worse model)
What do we use when calculating the LL stat formula?
DISi (1 = have dis, 0 = don’t)
P(Dis) - the predicted probability that they have the disease
What do we do once we have the final LL value? (the sum of the individual LL scores)
Why do we do this?
Multiply the final LL value by -2 = -2LL (now positive, since we multiplied a negative value by -2)
-2LL is known as the deviance
because differences in -2LL values have a CHI2 DISTRIBUTION - a p value can be calculated easily on a computer
When we have the -2LL, how do we know if it is a worse or better model?
higher positive values now mean =worse model
smaller values = better model.
What does the chi squared output in the SPSS omnibus test box mean?
LIKELIHOOD RATIO - the difference in -2LL between the model with no predictors and our current model (the model with a predictor e.g. TauC)
What does the p value for the chi squared output in the SPSS omnibus test box mean?
The probability of getting the LR, if the NULL HYP is true
i.e. if the new model is not any better than the model with no predictors.
What is -2LL equivalent to?
SSE-Left (the error left over after including the current model with its predictor)
How do we calculate the -2LL Total / SSE Total?
-2LL for the current model + the CHI2 output (the LR, equivalent to SSE Reduced)
the total is the -2LL for the model with no predictors (worked example below)
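A tiny worked example with assumed numbers:

```python
# Hypothetical figures showing how the pieces fit together
neg2ll_current = 84.0   # -2LL of the model with the predictor (assumed)
chi_sq_lr = 16.0        # chi-square / likelihood ratio from the omnibus box (assumed)

neg2ll_total = neg2ll_current + chi_sq_lr   # = -2LL of the no-predictor model
print(neg2ll_total)                          # 100.0
```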
What is the wald stat equivalent to in Lin R?
Equivalent to the individual predictors’ Coefficients box in Lin R
How do we manually calculate the Wald stat for a predictor?
1. B value for the predictor / SE for the predictor
2. square the answer
What does Exp(B) tell us?
The change in the odds of having the dis for every one-unit increase in the pred
Whats the formula for converting probability into odds?
P/(1-P)
Whats the formula for converting odds into probability?
Odds/(1+Odds)
How do we get the odds of getting the disease for every 1 increase in predictor?
Odds of getting dis (or not getting dis) for that pred score, MULTIPLIED by ExpB value for that pred
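A worked example of the probability/odds conversions and the Exp(B) update; all figures are hypothetical:

```python
# Probability -> odds, apply Exp(B), then odds -> probability
p = 0.20
odds = p / (1 - p)                 # 0.25

exp_b = 1.5                        # Exp(B) for the predictor (assumed)
odds_plus_one = odds * exp_b       # odds after a 1-unit increase in the predictor

p_plus_one = odds_plus_one / (1 + odds_plus_one)
print(odds, odds_plus_one, round(p_plus_one, 3))   # 0.25, 0.375, ~0.273
```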
(L9) What is a bayes question?
a backwards inference question (posterior)
Bayes wanted to update peoples prior beliefs based on new evidence
What is forward inferencing?
What is backwards inferencing? (L9)
FORWARDS - KNOWN CAUSES > KNOWN EFFECTS
BACKWARDS - KNOWN EFFECTS > BACK TO KNOWN CAUSES (more tricky)
what is the order for answering the backwards problem? (L9)
- start with initial prior belief for the probability
- update this prior belief based on probability of getting new data
- arrive at a new posterior belief level
So we want to update the probability of a hypothesis as more evidence becomes available
What is the difference between the false positive rate and the total false positives? (L9)
The total number of false positives depends not only upon the rate, but also upon how many people there are WITHOUT disease
Why do we end up with more FPs in real life? (L9)
Most diseases are rare so we have a large number of individuals without the disease and a small FP rate.
So when we multiply the large no. of people without disease by the FP rate = large FP total
How do we find out the chances of positive disease result being true positive? (L9) TREE METHOD
I.e If an individual gets a positive result, what is the chance that they actually have the disease?
true positive number / total positive results
total positive results = TP number +FP number
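A tree-method sketch with hypothetical figures (1% base rate, 90% TP rate, 5% FP rate):

```python
n = 10_000
base_rate, tp_rate, fp_rate = 0.01, 0.90, 0.05

with_disease = n * base_rate                  # 100 people
without_disease = n * (1 - base_rate)         # 9,900 people

true_positives = with_disease * tp_rate       # 90
false_positives = without_disease * fp_rate   # 495

p_disease_given_positive = true_positives / (true_positives + false_positives)
print(round(p_disease_given_positive, 3))     # ~0.154, far lower than the 0.90 TP rate
```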
When people are figuring out - if you get a pos result, the probability of the result actually being a true positive (actually having the disease) or a FP?
(backwards inference problem), what is the common mistake they make? (L9)
they give the TP RATE % or the FP RATE % as the answer
thus, confusing forward and backwards inference
Should have done = true positive number / total positive results
Can use tree method or what?
Bayes-LaPlace formula
Both the Bayes-LaPlace formula, and the tree method can be applied to scientific inference to help us understand how it works, and what is wrong with _____ focused statistics
p value
How can we lay out the scientific process with respect to hypotheses? (bayesian approach)
Prior beliefs about hypothesis + new data = posterior belief about hypothesis
What figure in Bayes-LaPlace formula sounds quite a lot like the usual definition of the p value?
But how do the two differ?
P(Da|¬Hy) - probability of getting the current data if our hypothesis is not true i.e. if the null hypothesis is true
the only difference - the p value is the probability of getting the current data OR MORE EXTREME if the null hypothesis is true
So, if the p value is part of the bayes la place formula as P(Da|¬Hy), what does this show? (highlights issue with NHST)
the p value is only a small part of the whole picture, despite most papers only reporting the p value
to find out the probability that our hypothesis is true, given our data (i.e. the purple posterior), the p value has to be combined with all these other figures – it absolutely cannot tell us that all by itself.
What is a common version of the transposed conditional fallacy?
Confusing P(Da|¬Hy) (pvalue) with probability of the null hypothesis being true (P(¬Hy|Da) (posterior)
This mistake confuses ‘forward’ and ‘backwards’ inference - in natural language hard to detect the difference
Whats problem 1 with the bayesian approach?
why its not a big problem perhaps?
The prior - who decides what the prior belief in the hyp is? SUBJECTIVE
but becomes less of a problem as you gather more evidence - data speaks for itself and you get a true posterior belief
thus, your belief will change according to data in the bayes formula!
What is problem 2 with the bayesian approach?
Don’t know our ALT HYP / P(Da|Hy) - we don’t have a specific value in mind for ‘Hy’, unlike when it comes to ¬Hy (0).
Thus, if we don’t have a value for ‘Hy’ we can’t calculate a probability for P(Da|Hy).
due to SOFT SCIENCE = weak theories and predictions (just predicting a therapy outperforms control by more than 0)
So, why is there a legitimate reason for focusing our statistics on P(Da|¬Hy)? And why did Pearson and Fisher base their statistical systems around it?
It is the only figure we can objectively calculate - the only figure all scientists can agree on, unlike P(Da|Hy), which is based on weak theories and vague predictions
When is bayesian approach usually used?
META-ANALYSES of several experimental papers
Meta-analyses calculate P(Da|Hy) using the figures in papers they’re examining (sample mean and SE).
Why does NHST go too far with looking at the p value for the null hypothesis?
it assigns a cut-off point to the p value (.05)
What 2 values in the Bayes-LaPlace formula combined constitute the ‘evidential value’ of the data?
P(Da|Hy) and P(Da|¬Hy)
How do we get the likelihood ratio from P(Da|Hy) and P(Da|¬Hy) values?
LR = P(Da|Hy) / P(Da|¬Hy)
(equivalent to SSE-Reduced) = support for the alt hyp if it is more than 1
What happens if our calculated posterior belief is higher than our prior belief?
Belief that our hypothesis is true has risen from prior belief value to posterior belief level (e.g. from 0.01 to .23)
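A sketch of the Bayes-LaPlace update, with assumed P(Da|Hy) and P(Da|¬Hy) values chosen so that the prior of 0.01 rises to roughly .23:

```python
prior = 0.01                 # P(Hy): prior belief the hypothesis is true (assumed)
p_data_given_hy = 0.60       # P(Da|Hy) (assumed)
p_data_given_not_hy = 0.02   # P(Da|notHy), the p-value-like figure (assumed)

# Bayes-LaPlace: P(Hy|Da) = P(Da|Hy)P(Hy) / [P(Da|Hy)P(Hy) + P(Da|notHy)P(notHy)]
posterior = (p_data_given_hy * prior) / (
    p_data_given_hy * prior + p_data_given_not_hy * (1 - prior)
)
print(round(posterior, 3))   # ~0.23: belief rises from the prior to the posterior
```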
The whole original point of NHST is to limit our _____ rate to a fixed value, and ___ has become the standard
FP; 5%
What does the true positive number calculated show? (L9)
if the NULL is FALSE, the probability of CORRECTLY getting a SIGNIF result
What is the true positive RATE also known as?
The amount of POWER
What is the typical required power / TP rate for most experiments in psych?
0.8
the power should be greater than the alpha (FP rate) .05
(but most psych experiments have way less power)
What type of error is FP rate and FN rate?
FP = a TYPE 1 error rate (alpha) FN = a TYPE 2 error rate (beta)
Which rate is the opposite of power?
and so opposite of true positive rate
False negative rate - when the hypothesis really was true, but you got a non-significant result i.e. incorrectly accepted the null hypothesis.
We can set type 1 error rate (FP rate) as .05, but setting type 2 error (FN) rate is dependent on what 4 things?
Alpha (FP rate) - the smaller this is, the smaller the type 1 error, but the larger the type 2 error
Sample size - a bigger sample = greater power and a smaller type 2 error
True effect size - the bigger the effect, the easier it is to detect and the lower the type 2 error
lower effect = lower power/TP rate
Sample variance - we are looking for the true effect size within the noise/variance - an effect size vs noise ratio (see the power sketch below)
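A minimal power sketch using statsmodels (not part of the flashcards), showing how alpha, sample size and effect size trade off:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Medium effect (d = 0.5), 64 per group, alpha = .05 -> power ~0.80
print(round(analysis.power(effect_size=0.5, nobs1=64, alpha=0.05, ratio=1.0), 2))

# A stricter alpha (or a smaller effect, or a smaller sample) lowers power
print(round(analysis.power(effect_size=0.5, nobs1=64, alpha=0.01, ratio=1.0), 2))
```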
If you set a high alpha level, power will ___
increase
e.g. if alpha is instead set at .01, the effect is harder to ‘detect’, thus lower power
In MLM, if you’re asked what is the slope for variable 5 (out of 10 since variable has been split to allow for random slopes), how do you calculate the slope?
b1 (in fixed effects box) + u1
( do not just look at slope u1 variable in unbiased predictions box! or u0 intercept variable if asked about intercepts!)
How do we work out the H&L r2 in Log R?
LR / -2LL No Pred Model
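A tiny worked example with assumed figures:

```python
# Hosmer & Lemeshow R-squared with hypothetical numbers
lr = 16.0                 # chi-square / likelihood ratio (assumed)
neg2ll_no_pred = 100.0    # -2LL of the no-predictor model (assumed)

hl_r2 = lr / neg2ll_no_pred
print(hl_r2)              # 0.16
```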