Discovering statistics Flashcards

Question

F statistic

Answer 1

- tests the fit of the model
- a significant fit represents a significant effect of the experimental manipulation
- if the model results in better prediction than the mean, then MSm > MSr
- car::Anova(model_lm)
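
A minimal sketch of how F is built from the mean squares. It assumes R's built-in PlantGrowth data, which is not part of the original cards, and reuses the card's object name model_lm:

```r
# Built-in data: plant weight under a control and two treatments
model_lm <- lm(weight ~ group, data = PlantGrowth)

grand_mean <- mean(PlantGrowth$weight)
ss_m <- sum((fitted(model_lm) - grand_mean)^2)  # model sum of squares
ss_r <- sum(residuals(model_lm)^2)              # residual sum of squares

df_m <- length(coef(model_lm)) - 1              # k - 1 for k groups
df_r <- df.residual(model_lm)

ms_m <- ss_m / df_m                             # mean squares for the model
ms_r <- ss_r / df_r                             # mean squares for the residuals
f_by_hand <- ms_m / ms_r

# F as reported by the fitted model, for comparison
f_from_model <- unname(summary(model_lm)$fstatistic["value"])
```

If the model predicts better than the grand mean, ms_m exceeds ms_r and F is greater than 1.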

Answer 2

- R^2: proportion of variance accounted for by the model
- the squared Pearson correlation coefficient between observed and predicted scores
- R^2 = SSm/SSt
- adjusted R^2: estimate of R^2 in the population
- broom::glance(data_lm)
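
A short check that the definitions of R^2 agree: the model sum of squares over the total sum of squares (SSt), and the squared correlation between observed and predicted scores. The built-in mtcars data and the choice of predictors are assumptions made only for illustration:

```r
data_lm <- lm(mpg ~ wt + hp, data = mtcars)

ss_t <- sum((mtcars$mpg - mean(mtcars$mpg))^2)  # total sum of squares
ss_r <- sum(residuals(data_lm)^2)               # residual sum of squares
ss_m <- ss_t - ss_r                             # model sum of squares

r2_from_ss    <- ss_m / ss_t
r2_from_cor   <- cor(mtcars$mpg, fitted(data_lm))^2
r2_from_model <- summary(data_lm)$r.squared
```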

Answer 3

- hierarchical (experimenter decides the order of entry)
- forced entry (all predictors entered simultaneously)
- stepwise (only used for exploratory analysis; predictors selected using their semi-partial correlation with the outcome)

Answer 4

- outliers distort the linear model and the estimates of the beta values
- detect them in: graphs, standardised residuals, Cook's distance, DFBeta statistics
- ggplot2::autoplot(data_lm, which = 4, ...) + theme_minimal() plots Cook's distance for each case (the autoplot method for lm objects comes from the ggfortify package); it flags influential cases but does not remove them

Answer 5

- use a robust model when outliers can't simply be removed
- lm_rob <- robust::lmRob(outcome ~ predictor, data = data)
- summary(lm_rob)

Answer 6

- linearity (relationship between predictor and outcome is linear) and additivity (combined effects of predictors)
- spherical errors (the population model has homoscedastic and independent errors)
- normality of errors

Answer 7

- model errors: differences between predicted and observed values of the outcome variable in the POPULATION model
- residuals: differences between predicted and observed values of the outcome in the SAMPLE model

Answer 8

- should be independent
- pop error in prediction for one case should not be related to error in prediction for another case
- errors should be homoscedastic
- violation of assumption

Answer 9

variance of pop errors should be consistent at different values of predicted variable

Answer 10

b's unbiased but not optimal | standard error incorrect

Answer 11

bootstrap -> standard errors derived empirically using a resampling technique; designed for small samples; gives robust b, p and CI | heteroskedasticity-consistent SE -> uses HC3 or HC4 methods
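
A rough sketch of the resampling idea behind a bootstrap standard error. This is a plain case-resampling, percentile bootstrap written in base R, not necessarily what any particular package does; the mtcars data, seed and number of resamples are illustrative assumptions:

```r
set.seed(42)     # arbitrary seed, for reproducibility only
n_boot <- 1000   # number of resamples (illustrative choice)

boot_slopes <- replicate(n_boot, {
  rows <- sample(nrow(mtcars), replace = TRUE)   # resample cases with replacement
  coef(lm(mpg ~ wt, data = mtcars[rows, ]))["wt"]
})

boot_se <- sd(boot_slopes)                          # empirical standard error of b
boot_ci <- quantile(boot_slopes, c(0.025, 0.975))   # percentile 95% CI for b
```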

Answer 12

- code the control group with 0 and the other group with 1
- b for the dummy variable is the difference between the means of the two conditions
- mean condition 1 = b0 + b1(0) = b0
- mean condition 2 - mean condition 1 = b1
- dummy-coded comparisons aren't independent, as each one reuses the same baseline group
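
A small check of the dummy-coding result, assuming two groups taken from R's built-in PlantGrowth data (an illustrative choice, not from the cards):

```r
# Keep only the control group and one treatment group
two_groups <- droplevels(subset(PlantGrowth, group %in% c("ctrl", "trt1")))

# Dummy variable: control = 0, treatment = 1
two_groups$dummy <- ifelse(two_groups$group == "ctrl", 0, 1)

dummy_lm    <- lm(weight ~ dummy, data = two_groups)
b           <- unname(coef(dummy_lm))   # b[1] is b0, b[2] is b1
group_means <- tapply(two_groups$weight, two_groups$group, mean)

mean_ctrl <- unname(group_means["ctrl"])
mean_trt1 <- unname(group_means["trt1"])
```

Here b0 equals the control mean and b1 equals the difference between the two group means.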

Answer 13

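- outcome = b0 + b1(contrast 1) + b2(contrast 2)
- b0 is the mean of the control group
- b1 is the difference between the mean of the first experimental group and the control mean
- b2 is the difference between the mean of the second experimental group and the control mean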

Answer 14

variability explained by the model (SSm) is due to participants being assigned to different groups | this variability represents the experimental manipulation

Answer 15

- contrasts must be independent
- to control the type 1 error rate, if a group is singled out in one contrast it shouldn't be used again
- each contrast compares only 2 chunks of variation
- k-1: you end up with one fewer contrast than the number of groups
- the first contrast compares the control group to all experimental groups

Answer 16

1 - groups coded with positive weights are compared to groups coded with negative weights
2 - the sum of the weights equals 0
3 - if a group is not used, code it as 0
4 - the initial weight assigned is equal to the number of groups in the opposite chunk
5 - final weight = initial weight / number of groups with a non-zero weight
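
The weighting rules can be worked through for a hypothetical three-group design (control, exp1, exp2). The group names and the choice of contrasts are assumptions made for illustration:

```r
groups <- c("control", "exp1", "exp2")   # hypothetical group labels

# Contrast 1: control vs both experimental groups.
# Initial weight = number of groups in the opposite chunk.
initial_1 <- c(2, -1, -1)
final_1   <- initial_1 / sum(initial_1 != 0)   # divide by groups with non-zero weight

# Contrast 2: exp1 vs exp2; the control group is not used, so it is coded 0.
initial_2 <- c(0, 1, -1)
final_2   <- initial_2 / sum(initial_2 != 0)

names(final_1) <- groups
names(final_2) <- groups

sum_1         <- sum(final_1)             # weights within a contrast sum to 0
sum_2         <- sum(final_2)
cross_product <- sum(final_1 * final_2)   # 0 when the two contrasts are independent
```

With k = 3 groups this gives k - 1 = 2 contrasts, and their cross-products sum to zero.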

Answer 17

in the absence of a hypothesis, compare all means | this inflates the type 1 error rate, so use Bonferroni to correct | modelbased::estimate_contrasts(data_lm, adjust = "bonferroni")

Answer 18

polynomial contrasts: only for ordered groups | contrasts(data$predictor) <- contr.poly(k), where k is the number of groups

Answer 19

- when you know an extraneous/confounding variable influences the outcome, so you adjust for it
- reduce error variance by explaining some of the unexplained variance
- gain greater insight into the effects of the predictor

Answer 20

total variance = variance explained by predictor + unexplained variance | the covariate's variance overlaps both the unexplained variance and the variance explained by the predictor

Answer 21

broom::tidy(data_lm, conf.int = TRUE)

Answer 22

- use dummy coding and the mean of the covariate
- outcome = b0 + b1(contrast 1) + b2(contrast 2) + b3(covariate)
- outcome = 1.7 + 2.2(contrast 1) + 1.7(contrast 2) + 0.4(covariate)
- code contrast 1 as 0, and covariate as its mean, outcome = 2.9
- repeat with contrast 2 coded as 0

Answer 23

with no covariate, predicted values are the raw group means | the beta attached to contrast 1 is the difference between the means of the individual conditions within that contrast

Answer 24

calculated for sums of squares | type 1: default in R; each predictor is evaluated taking into account the previous predictors, so the order of predictors matters | type 3: each predictor is evaluated taking into account all other predictors, so order doesn't matter

Answer 25

data_lm %>% car::Anova(., type = 3)
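
A base-R sketch of why order matters for type 1 (sequential) sums of squares. It uses anova(), which is sequential, on the built-in mtcars data; both the data and the predictors are illustrative assumptions, and car::Anova() is not run here:

```r
# Same two predictors, entered in different orders
fit_wt_first <- lm(mpg ~ wt + hp, data = mtcars)
fit_hp_first <- lm(mpg ~ hp + wt, data = mtcars)

# Sequential sums of squares for wt depend on whether hp was entered before it
ss_wt_first  <- anova(fit_wt_first)["wt", "Sum Sq"]
ss_wt_second <- anova(fit_hp_first)["wt", "Sum Sq"]

# The residual sum of squares is the same either way
ss_resid_a <- anova(fit_wt_first)["Residuals", "Sum Sq"]
ss_resid_b <- anova(fit_hp_first)["Residuals", "Sum Sq"]
```

Type 3 sums of squares avoid this dependence by evaluating each predictor after all the others.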

Answer 26

- for the significance of the F statistic to be accurate, we assume the relationship between outcome and covariate is similar across groups
- known as homogeneity of regression slopes
- when the assumption is met, the F statistic is assumed to follow an F distribution, so the corresponding p-value is accurate

Answer 27

2 or more predictors have been manipulated

Answer 28

acts on relationship between predictor and outcome | outcome = b0 + b1(predictor) + b2(moderator) + b3(predictor x moderator)
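
A minimal sketch of fitting the moderation equation with lm(). The mtcars data and the roles given to wt (predictor) and hp (moderator) are assumptions made purely for illustration:

```r
# predictor * moderator expands to predictor + moderator + predictor:moderator
mod_lm <- lm(mpg ~ wt * hp, data = mtcars)

coef_names <- names(coef(mod_lm))        # b0, b1, b2 and the interaction b3
b3         <- unname(coef(mod_lm)["wt:hp"])  # estimate for predictor x moderator
```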

Answer 29

predictor x moderator | if the interaction term's p-value is significant, you ignore all other rows | if it is significant, there is a significant moderation effect | the parameter estimate quantifies the raw effect size of the interaction term | in factorial designs, the effect of the moderator is stronger in certain categories of the predictor

Answer 30

afex::aov_4(outcome ~ predictor*moderator + (1|id), data = data) doesn't show parameter estimates, diagnostic plots or robust methods, but afex_plot() plots the interaction

Answer 31

it depends upon sample size

Answer 32

0-1, anything greater than 1 means the model explains more than it doesn't

Answer 33

sampling distribution of parameters

Answer 34

data point that is unrepresentative of relationship being investigated

Answer 35

- the ratio between variance explained by model and residual variance - whether the model explains variance in outcome better than the grand mean - likelihood of obtaining the value you have if no true difference in means of groups

Answer 36

hypothesis driven | control type 1 error rate | planned a priori

Answer 37

it doesn't, as it looks at the model as a whole

Answer 38

expecting the effect of one predictor to vary as function of another predictor

Answer 39

independent contrasts: the cross-products of their weights sum to 0, and the weights within each contrast sum to 0

Answer 40

effect of just one of the independent variables on the dependent variable | effect of a predictor alone, ignoring all other predictors in the model

Answer 41

automatically met when variable has only two levels | if not met it is remedied by adjusting degrees of freedom by the degree to which data are not spherical

Answer 42

a. The extent to which the type of variable A affected outcome depended on type of variable B and vice versa

Answer 43

systematic-created by our manipulation | unsystematic-created by unknown factors

Answer 44

more sensitive: unsystematic variance is reduced, so more sensitive to experimental effects | more economical: fewer participants needed | possible fatigue effects

Answer 45

all participants take part in all conditions, so scores correlate | violates the assumption of independent residuals | need to adjust the model to estimate this dependency: outcome = b0j + b1j(predictor) + ej | b0j = b0 + u0j | b1j = b1 + u1j | u is the variability across different participants

Answer 46

assume sphericity: estimate and correct for it | fit a multilevel growth model

Answer 47

differences between pairs of groups should have equal variances | the assumption is that the variances of these differences are the same across conditions | if the Greenhouse-Geisser estimate e = 1, there is perfect sphericity

Answer 48

R multiplies the df by e (epsilon) to correct for the effect of sphericity | given that e quantifies the deviation from perfect sphericity, the df get smaller, which makes it harder for the test statistic to be significant | routinely apply the Greenhouse-Geisser correction
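
The arithmetic of the correction, sketched with hypothetical numbers. The number of conditions, sample size, epsilon and F value below are all invented for illustration:

```r
k       <- 3      # hypothetical number of repeated-measures conditions
n       <- 10     # hypothetical number of participants
epsilon <- 0.75   # hypothetical Greenhouse-Geisser estimate
f_value <- 4      # hypothetical F statistic

df_model <- k - 1
df_resid <- (k - 1) * (n - 1)

# Both degrees of freedom are multiplied by epsilon
df_model_adj <- epsilon * df_model
df_resid_adj <- epsilon * df_resid

p_unadjusted <- pf(f_value, df_model, df_resid, lower.tail = FALSE)
p_adjusted   <- pf(f_value, df_model_adj, df_resid_adj, lower.tail = FALSE)
```

With smaller df the same F yields a larger p-value, which is why the corrected test is more conservative.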

Answer 49

afex::aov_4(outcome ~ predictor + (predictor|id), data = data) | ges is the effect size (generalized eta-squared)

Answer 50

emmeans::emmeans(model_afx, ~predictor, model = "multivariate") data_cons

Answer 51

WRS2::rmanova(y = data$outcome, groups = data$predictor, blocks = data$id) gives F statistic and df | use WRS2::rmmcp for robust post hoc tests

Answer 52

emmeans::joint_tests(data_afx, "predictor b") | effect of predictor a within predictor B

Answer 53

pairs(int_emm, adjust = "holm") | don't do if get non-significant

Answer 54

categorical predictors must be coded as contrast variables | extract them using emmeans::contrast()

Answer 55

eff -> each category compared to the average of all categories | pairwise -> each category compared to every other category | poly -> polynomial contrasts | trt.vs.ctrl -> compares each category to a declared reference category, ref = x | consec -> compares each level/category to the previous one

Answer 56

only important if the interaction term is non-significant | emmeans::emmeans(data_afx, ~predictor, model = "multivariate") | look at each predictor separately: (data_afx, c("predictor","predictor"), model = "multivariate")

Answer 57

R multiplies the df by the value of epsilon, which makes the result more conservative

Answer 58

look at the ges column | a ges of 0.15 means the effect explains 15% of the variance

Answer 59

largest value of F

Answer 60

violates assumption of linearity

Answer 61

ln(P(Y)/(1-P(Y))) = b0 + b1(X) + e | the outcome is the log odds of the outcome occurring | b1 is the change in log odds of the outcome associated with a unit change in the predictor

Answer 62

log of 1 = 0 | exponent of 0 = 1

Answer 63

log odds of the outcome when the predictor is 0 | easier to interpret as exp(b0), the odds of the outcome when the predictor is 0

Answer 64

- change in log odds of the outcome associated with a unit change in the predictor
- easier to interpret exp(b1), the odds ratio associated with a unit change in the predictor
- OR > 1: as the predictor increases, the probability of the outcome increases
- OR < 1: as the predictor increases, the probability of the outcome decreases

Answer 65

states the number of each type of 'presents' and how many were 'delivered' and 'undelivered'

Answer 66

number of delivered/number undelivered

Answer 67

number of delivered after treat1 / number of undelivered after treat1

Answer 68

odds(delivered after treat2) / odds(delivered after treat1)
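
A sketch tying odds, the odds ratio and the logistic model together. The counts below are invented purely for illustration and are not the figures behind these cards; data_glm follows the cards' naming:

```r
# Hypothetical counts of delivered and undelivered presents per treat
counts <- data.frame(
  treat       = factor(c("treat1", "treat2")),
  delivered   = c(30, 10),
  undelivered = c(10, 30)
)

# Odds within each treat: delivered / undelivered
odds <- counts$delivered / counts$undelivered

# Odds ratio: odds after treat2 relative to odds after treat1
odds_ratio <- odds[2] / odds[1]

# Logistic regression on the same counts
data_glm <- glm(cbind(delivered, undelivered) ~ treat,
                data = counts, family = binomial())
b0 <- unname(coef(data_glm)["(Intercept)"])   # log odds for treat1
b1 <- unname(coef(data_glm)["treattreat2"])   # log odds ratio, treat2 vs treat1
```

Exponentiating b1 recovers the odds ratio computed by hand, and exponentiating b0 recovers the odds for the reference category.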

Answer 69

odds of delivery are much smaller for treat 2 than treat 1 | the odds for treat 2 are 0.15 times the odds for treat 1

Answer 70

glm(outcome ~ predictor, data = data, family = binomial()) | data_glm %>% parameters::parameters() %>% parameters::parameters_table(p_digits = 3)

Answer 71

insert "exponentiate = TRUE" into parameters::parameters()

Answer 72

ln(P(Y)/(1-P(Y))) = b0 + b1(pred1) + b2(pred2) + b3(pred1 x pred2) + e

Answer 73

```
linearity
spherical residuals
multicollinearity
incomplete information
complete separation
```

Answer 74

empty cells | inflates standard errors | the problem escalates quickly with continuous predictors

Answer 75

outcome variable can be perfectly predicted

Answer 76

- if the model produces a log odds of -1.03
- create a glm for each type of treat by subsetting
- -1.03 = log odds of treat 2 - log odds of treat 1
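
A one-step sketch of moving from the log odds scale to an odds ratio, using the -1.03 quoted on the card; treating that value as b1 is an assumption:

```r
b1 <- -1.03               # difference in log odds quoted on the card
odds_ratio <- exp(b1)     # roughly 0.36, so the odds are lower for treat 2

# The reference points on each scale: no effect is 0 in log odds and 1 as a ratio
no_effect_log_odds <- log(1)
no_effect_ratio    <- exp(0)
```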

Answer 77

-control_vs_exp


(102 cards)