statistical analysis and design Flashcards
what is correlational research?
= non-experimental study which determines the relationship between two variables without manipulating them or controlling for extraneous variables
why would we use correlational studies?
- when it would be unethical and harmful to manipulate variables
- allows the researcher to observe natural variations
what is a key limitation of correlational studies?
- correlation does not equal causation
- the third variable problem -> two variables can be statistically related but not because they cause each other but because a third variable causes both of them.
what are the two characteristics of a correlational relationship
- the direction:
positive correlation = both variables increase together
negative correlation = one variable increases and the other decreases
- the strength:
measured using covariance and the correlation coefficient (r)
positive covariance = both variables tend to increase or decrease together
negative covariance = when one variable is high, the other tends to be low
how to calculate the sample covariance of two variables
covariance = a measure of how the two variables change together.
limitations:
- affected by units of measurement
- indicates direction but not strength
step 1 = for each participant, subtract the mean from their value, for both variables, then multiply the two deviations together, giving one value per participant
step 2 = sum all the values created in step 1
step 3 = divide by the number of pairs of observations minus 1
if covariance is + -> positive relationship
if covariance is - -> negative relationship
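The three steps above can be sketched in Python (the data values are made up for illustration):

```python
def sample_covariance(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Step 1: for each participant, multiply the two deviations from the mean
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    # Step 2: sum the products; Step 3: divide by n - 1
    return sum(products) / (n - 1)

# Hypothetical example data
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 7]
cov = sample_covariance(hours, scores)  # positive -> positive relationship
```

A positive result here indicates the two variables tend to rise and fall together.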
how do you standardise covariance across different units of measurement and what is the formula for this?
(makes the covariance easier to compare between different pairs of variables)
= transform the covariance into a correlation coefficient (r)
- indicates strength and direction
Range: -1 to +1
Interpretation:
|r| ≈ 0.1-0.3 → Weak
|r| ≈ 0.4-0.6 → Moderate
|r| ≈ 0.7-0.9 → Strong
|r| = 1 → Perfect
calculated by:
r = covariance of the two variables / (SD of variable 1 × SD of variable 2)
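The standardisation above can be sketched in Python using the standard-library statistics module (example data is made up):

```python
import statistics

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    # Sample covariance, as calculated in the earlier steps
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    # Standardise: divide by the product of the two sample SDs
    return cov / (statistics.stdev(xs) * statistics.stdev(ys))

r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 7])  # strong positive correlation
```

Dividing by the two standard deviations removes the units of measurement, so r always falls between -1 and +1.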
null hypothesis significance testing in correlational analysis
Null Hypothesis (H₀): No correlation (r = 0).
Alternative Hypothesis (H₁): Significant correlation (r ≠ 0).
- when stating a directional hypothesis, only state direction, not the strength (strong, moderate, weak)
what are the assumptions of a Pearson correlation as a parametric test?
- levels of measurement: interval or ratio (continuous) variables
- related pairs: each observation has two paired values
- linearity: the relationship between the variables should be linear (check with a residuals vs. fitted plot; flat red line = linear)
- normality: residuals are normally distributed (check with a Q-Q plot; values close to the diagonal)
- homoscedasticity: the variability or spread of one variable remains constant across the range of the other variable (check with a scale-location plot; flat red line)
- absence of outliers: outliers can distort results
what can we use to visualise relationship between two continuous variables?
a scatterplot
how to calculate degrees of freedom for Pearson correlation
df = N-2
- one degree of freedom is lost for each variable
Spearman correlation as a non parametric test: what is it and when do we use it?
= calculates the relationship based on the rank order of the data, rather than the actual values.
we use when:
- the data is ordinal (ranked 1,2,3,4,5)
- the data violates assumptions of Pearson correlation
- relationship between variables is non-linear
- the residuals are not normally distributed
what are the steps to calculate the spearman correlation coefficient (rs)
- rank the scores for each variable separately (smallest value gets rank 1, next smallest gets rank 2, etc.)
- calculate the difference between each ranked pair (not the original scores) = d
(variable 1 ranked score - variable 2 ranked score)
- square the differences = d²
- sum the squared differences
- plug the values into Spearman's formula, where n = number of observations:
rs = 1 - (6 × Σd²) / (n(n² - 1))
- this formula is used when there are no tied ranks
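The ranking-and-formula steps above can be sketched in Python (the data is made up, and the simple ranking helper assumes no tied values):

```python
def spearman_rs(xs, ys):
    # Assumes no ties: each value maps to a unique rank
    def ranks(vals):
        order = sorted(vals)
        return [order.index(v) + 1 for v in vals]  # smallest value gets rank 1

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    # d = difference between ranked pairs; square and sum
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    # Standard Spearman formula (no tied ranks)
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

rs = spearman_rs([1, 2, 3, 4, 5], [10, 20, 40, 30, 50])  # strong positive
```

Because only the rank order matters, the result is unchanged by any monotonic transformation of the raw scores.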
how to handle tied observations Spearman correlation datasets
- when two or more observations have the same value, we have ties in the data (difficult to rank) -> need to calculate tied ranks
- identify which ranks a set of tied observations would get naturally
- assign each of them the average of those natural ranks
for example: if two values would be ranked 3 and 4, assign both 3.5
- if more than two values are tied, add all the natural ranks together and divide by how many tied values there are
- do this for all sets of tied observations
- tied ranks slightly affect the standard Spearman formula
- alternative approach: apply the Pearson correlation to the rank scores
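The averaging rule above can be sketched as a small Python helper (the values are made up):

```python
def tied_ranks(vals):
    # Each value gets the average of the natural ranks (sorted positions)
    # that all copies of that value would occupy
    order = sorted(vals)
    return [
        sum(i + 1 for i, v in enumerate(order) if v == x) / order.count(x)
        for x in vals
    ]

# The two 3s would naturally get ranks 1 and 2, so both receive 1.5
ranks = tied_ranks([7, 3, 3, 9])  # [3.0, 1.5, 1.5, 4.0]
```

Applying the Pearson formula to ranks produced this way gives the tie-corrected Spearman coefficient.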
compare independent and dependent correlation coefficients
- independent correlations = two correlations come from different, unrelated groups
eg -> comparing the correlation between TikTok usage & GPA for high school vs. college students
- dependent correlations = two correlations share a common variable
eg -> comparing the correlation between study hours & statistical anxiety vs. study hours & attitude towards statistics (common variable: study hours)
describe hypothesis testing for correlation coefficients
Null Hypothesis (H₀): No difference between the two correlations.
Alternative Hypothesis (H₁): A statistically significant difference exists.
Use ‘cocor’ package in R to compare correlation coefficients.
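cocor is an R package; as a rough Python sketch of the underlying idea for the independent-groups case only, Fisher's z transformation can be used to compare two correlations (the r values and sample sizes below are made up):

```python
import math

def fisher_z_test(r1, n1, r2, n2):
    # Fisher z transformation makes the sampling distribution of r
    # approximately normal
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    # Standard error of the difference for two independent samples
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    # Compare the result against the standard normal distribution
    return (z1 - z2) / se

z = fisher_z_test(0.5, 103, 0.3, 103)  # hypothetical group correlations
```

Dependent correlations need a different test that accounts for the shared variable, which is one reason to rely on cocor in practice.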
the r critical value
= the smallest r-value needed to find a significant effect
- if the r value you have calculated is equal to or greater than the r critical value at a particular significance level, then your test is significant
- don't take polarity (-/+) into account; compare the absolute value of r
the difference between strength and direction in a relationship
Direction= Indicates whether the relationship is positive or negative.
Strength = Indicates how closely the two variables follow a linear pattern.
Measured by correlation coefficient (r)
A strong correlation means data points are close to a straight line, while a weak correlation means they are more scattered.
example
r = 0.9 -> direction = positive, strength = strong
cohens rule of thumb for effect size
small = 0.1
medium = 0.3
large = 0.5
- not the same as the interpretation of correlation coefficient values, where 0.5 would be considered moderate and ±1 perfect (range -1 to +1)
what is the difference between correlation and regression studies
Correlation: Measures the strength & direction of the relationship between two variables.
Example: Is ease of purchase correlated with purchase intention?
Regression: Determines whether one variable predicts another.
Example: Does ease of purchase predict purchase intention?
Key distinction: Correlation does not imply causation, while regression models predictive relationships.
describe the variables in regression analysis
predictor variable:
- the independent variable
- the explanatory variable
- the x variable in the regression model
- eg ease of purchase
outcome variable:
- the dependent variable
- the criterion variable
- the y variable in the regression model
- eg purchase intention
what is regression analysis
= A statistical technique that models relationships between variables.
Answers: “By how much will Y change if X changes?”
Types:
Simple Linear Regression: One predictor (X) predicting → One outcome (Y).
Example: Does ease of purchase predict purchase intention?
Multiple Linear Regression: Two or more predictors (X1, X2, etc.) predicting → One outcome (Y).
Example: Do ease of purchase and influencer endorsements predict purchase intention?
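Simple linear regression can be sketched with the usual least-squares formulas (the ease-of-purchase and purchase-intention scores below are made up):

```python
def simple_linear_regression(xs, ys):
    # Fit y = b0 + b1 * x by least squares
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of deviation products over sum of squared x deviations
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    # Intercept: the line passes through (mean_x, mean_y)
    b0 = mean_y - b1 * mean_x
    return b0, b1

ease = [1, 2, 3, 4, 5]        # hypothetical predictor (X) scores
intention = [2, 4, 5, 4, 7]   # hypothetical outcome (Y) scores
b0, b1 = simple_linear_regression(ease, intention)
# b1 answers "by how much will Y change if X increases by 1?"
```

The slope b1 is the model's answer to the regression question, whereas a correlation would only report strength and direction.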
the mean model
used to make predictions in regression
mean model:
Predicts the outcome (Y) using the mean of all responses.
Ignores the predictor variable.
Example:
- the model predicts that the outcome variable (purchase intention) will always be the mean value, regardless of the predictor variable value (ease of purchase)
- however, the actual purchase intention is often higher or lower than the mean value
- this model is not good, as it does not capture the data well
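A minimal sketch of the mean model, using hypothetical purchase-intention scores, shows why it fits poorly (the residuals it leaves are large):

```python
intention = [2, 4, 5, 4, 7]  # made-up outcome scores

# The mean model predicts the same value for every participant,
# ignoring the predictor entirely
mean_prediction = sum(intention) / len(intention)

# Residuals: how far each actual score falls from the prediction
residuals = [y - mean_prediction for y in intention]
```

Regression improves on this by letting the prediction vary with the predictor, shrinking those residuals.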