Stats Test Flashcards
Sharon's Flashcards
What is an ANOVA
Analysis of Variance
When to use an ANOVA
When we are testing experiments that have 3 or more levels of an independent variable (e.g., comparing a control group vs caffeine in the morning vs caffeine at night)
Why don’t we use multiple t-tests
Type 1 error will increase
What does ANOVA produce
F-ratio
What is an F-ratio
Compares systematic variance to unsystematic variance
What can / can’t an ANOVA tell us?
It can tell us there was an effect but it cannot tell us what the effect was
How do we find out what the effect was when doing ANOVA
Planned comparisons or post-hoc tests
What is the bonferroni correction
A way to control type 1 error by dividing the alpha (0.05) by the number of tests
This then sets the new p-value for a test to be significant
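The division the card describes can be sketched in Python; the alpha and number of tests below are hypothetical values for illustration:

```python
# Bonferroni correction: divide the family-wise alpha by the number of tests
alpha = 0.05
n_tests = 3  # e.g. three pairwise comparisons after an ANOVA (hypothetical)

# Each individual test must now reach this stricter threshold to be significant
corrected_alpha = alpha / n_tests
```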
What are planned comparisons
A set of comparisons between group means that are constructed before any data is collected
this is theory led
and there is more power to these than post hoc tests
What assumptions need to be met when doing ANOVA
- Normal distribution
- Homogeneity of variances
- Sphericity
Tests of homogeneity of variances for independent ANOVAs
Levene’s test
significant Levene’s = assumption of homogeneity of variance has been violated
Test of Sphericity for dependent ANOVAs
Mauchly’s test
Significant Mauchly’s = assumption of sphericity has been violated
Define homogeneity of variance
Assumption that the variance of one variable is similar at all levels of another variable
Define Sphericity
The differences taken from the same participant / entity are similar
What is a one-way ANOVA
One independent variable will be manipulated
What is one-way independent ANOVA
Experiments with 3+ levels of the independent variable and different participants in each group
How to run a one-way ANOVA on SPSS
- Check Levene’s test - if significant then assumption of homogeneity of variances has been violated
- Between-group effects = SSm (variation due to the model aka experimental effect). To find the total experiment effect look at between-group sum of squares
- Within-group effects = SSr (unsystematic variation)
- To be able to compare between groups and within groups we look at the mean squares.
- Look at the F-ratio, if significant do post-hoc tests
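The Levene's-then-F-ratio sequence above has an analogue outside SPSS; here is a hedged SciPy sketch using made-up scores for the caffeine example:

```python
from scipy import stats

# Hypothetical data: three independent groups (control, caffeine AM, caffeine PM)
control = [4, 5, 6, 5, 4]
caffeine_am = [7, 8, 6, 7, 8]
caffeine_pm = [6, 7, 7, 8, 6]

# Levene's test: a significant result (p < .05) means the homogeneity of
# variances assumption has been violated
lev_stat, lev_p = stats.levene(control, caffeine_am, caffeine_pm)

# One-way independent ANOVA: the F-ratio compares systematic (between-group)
# variance to unsystematic (within-group) variance
f_stat, p_value = stats.f_oneway(control, caffeine_am, caffeine_pm)
```

If `p_value` is significant, the next step (as on the card) is post-hoc tests.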
What post-hoc tests do you run after a significant ANOVA (you want to control for type 1 error)
Bonferroni correction
What post-hoc tests to run after a significant ANOVA (you have very different sample sizes)?
Hochberg’s GT2
What post-hoc tests to run after a significant ANOVA (you have slightly different sample sizes)
Gabriel’s procedure
What post-hoc tests to run after a significant ANOVA (you have doubts about variance)
Games-Howell procedure (this one is a safe bet)
What is effect size
the magnitude of an effect (commonly expressed as r)
How to calculate effect size
R squared = SSm / SSt
Square root this to get effect size (r)
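A minimal sketch of the calculation, assuming hypothetical sums of squares read off an ANOVA output:

```python
import math

# Hypothetical sums of squares from an ANOVA output table
ss_model = 16.5   # SSm: variation explained by the model
ss_total = 24.9   # SSt: total variation in the data

r_squared = ss_model / ss_total   # proportion of variance explained
r = math.sqrt(r_squared)          # effect size r
```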
What is SSt
Total sum of squares
total amount of variation within our data
What is SSm
Model sum of squares
variation explained by our model
What is SSr
Residual sum of squares
variation not accounted for in our model
What is a two-way ANOVA
Two independent variables will be manipulated
How to run a two-way ANOVA on SPSS
- Check Levene’s test - if significant then the assumption of homogeneity of variances has been violated. If violated, transform your data, use a non-parametric test, or report that the F value is inaccurate
- Summary table will include an effect for each independent variable (aka main effects) and the combined effect of the independent variables (aka interaction effects)
- Bold items are the SSm, Error = SSr
- Look at the F-ratio, if significant then complete post hoc tests
What is a repeated measures ANOVA
Three or more experimental groups with the same participants
How to run a repeated measures ANOVA on SPSS
- Check sphericity (equal variances between treatment levels). If Mauchly’s test is significant then the assumption of sphericity has been violated.
- If sphericity has been violated, we can look at either the Greenhouse-Geisser estimate, the Huynh-Feldt estimate, or the lowest possible estimate of sphericity (aka lower bound).
- Use Greenhouse-Geisser when the sphericity estimate (epsilon) is LESS than 0.75, use Huynh-Feldt when it is MORE than 0.75
- If the effect is significant, we need to look at ‘pairwise comparisons’ to see where the effect lies.
- Look for significant values i.e. less than 0.05
- Calculate effect size - use benchmarks of .10 / .30 / .50
When to use Greenhouse-Geisser and when to use Huynh-Feldt
- Use Greenhouse-Geisser when the sphericity estimate (epsilon) is LESS than 0.75, use Huynh-Feldt when it is MORE than 0.75
What happens when we violate sphericity
violating sphericity = less power = increases type 2 error
What is a mixed ANOVA
Independent variables are measured using both independent and repeated measures groups
How to run mixed ANOVA on SPSS
- As mixed ANOVA uses both independent and repeated designs, we need to check if the assumptions of homogeneity of variances AND sphericity have been violated.
- Look at both output tables and find the main effects (one for each IV) and one interaction term (words in CAPITALS are your IVs - you need to look at these).
- Look at the F-ratios in both tables.
- If the effect is significant then we can run t-tests to see where the effect lies; make sure to use the Bonferroni method (divide alpha 0.05 by the number of tests you will run)
- Look at both ‘paired samples test’ tables → this is known as a SIMPLE EFFECTS ANALYSIS.
- Calculate effect size - use benchmarks of .10 / .30 / .50
What is ANCOVA
Sometimes when we conduct research we know (from previous research) that some factors have an influence on our DVs (e.g., age and memory)
These factors are called covariates and we can include them in our ANOVA
Why do we use ANCOVA
to reduce the error variance (increase how much variance we can explain)
eliminate confounds (by including the covariates we remove the bias of these variables)
How to run ANCOVA on SPSS
- Check Levene’s test of homogeneity of variances.
- If significant, transform the data or complete a non-parametric test.
- The output will look the same, it will just include the covariates.
- Look at the F-ratio for all the main effects and for the covariates.
- If the covariate is significant, this means that it has a relationship with our main independent variable.
- Calculate effect size - use benchmarks of .10 / .30 / .50
What is MANOVA
Multivariate analysis of variance
ANOVA but when there are several dependent variables
How to run MANOVA on SPSS
- Check for independence, random sampling, multivariate normality and homogeneity of covariance matrices.
- If Box’s test is significant then the assumption of homogeneity of covariance matrices has been violated.
- Look at the multivariate tests ‘group’ table. This shows the effect of the independent variable on the DV.
- When looking at the output, the Pillai-Bartlett test (Pillai’s trace) statistic is the most robust.
- If there is a significant F ratio then we need to look at the univariate tests or run a discriminant analysis.
How to interpret univariate test statistics?
- Levene’s should be non-significant
- then look at ‘tests of between-subjects effects’ → corrected model
and group row stats should be significant if there is an effect between
IVs and DVs.
How to interpret discriminant analysis
- Look at the ‘covariance matrices’ to see the direction and strength of the relationships
- Eigenvalues percentage of variance = variance accounted for; square the canonical correlation to use as an effect size.
- Wilks’ Lambda table shows significance for all variables, look for the significant ones.
- Use the Standardised Canonical Discriminant Function Coefficients table to see how the DVs have contributed. Scores can range between -1 and 1; high scores = the variable is important for the variate. Look down the ‘function 1’ column: if one value is positive and the other is negative then the variate (aka function) has discriminated the two groups
What is power analysis
The ability of a test to find an effect is known as statistical power; power analysis uses this to work out the sample size needed to detect an effect
What is power of a test
Power of a test = the probability that a test will find an effect if there is one
We aim to achieve a power of 0.8
Power of a statistical test depends on
- how big the effect is
- how strict we are with our alpha level (i.e., 0.05 or 0.01)
- How big the sample size is - the bigger the sample size, the stronger the power
What are confidence intervals
A range of values that are believed to contain the true population value
eg. a 95% confidence interval means that if
we were to take 100 different samples and
compute a 95% confidence interval for each
sample, then approximately 95 of the 100
confidence intervals will contain the true
mean value
How to interpret confidence intervals
- If 95% CIs do not overlap = means come from different populations.
- CIs with a gap between the upper end of one and the lower end of another - p < 0.01
- CIs that touch end to end - p ≈ 0.01
- CIs that overlap moderately - p ≈ 0.05
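Before interpreting overlap, the CI itself has to be computed; here is a hedged SciPy sketch for a 95% CI around a mean, using invented sample values:

```python
import numpy as np
from scipy import stats

# Hypothetical sample of scores
sample = np.array([5.1, 4.8, 6.2, 5.5, 5.9, 4.7, 5.3, 6.0])

mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean

# 95% CI from the t distribution with n - 1 degrees of freedom
ci_low, ci_high = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)
```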
What are common effect sizes
Cohen’s D
Pearson’s correlation coefficient r
odds ratio
What is Cohen’s d
The difference between two means divided by the SD of the control group, or a pooled estimate based on the SDs of both groups
What are the benchmarks for Cohen’s D
small d = 0.2
medium d = 0.5
large d = 0.8
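A sketch of the pooled-SD version of Cohen's d, using two hypothetical groups:

```python
import numpy as np

# Hypothetical control and treatment groups
control = np.array([10.0, 12.0, 11.0, 13.0, 9.0])
treatment = np.array([14.0, 15.0, 13.0, 16.0, 14.0])

# Pooled standard deviation from both groups' sample SDs (ddof=1)
n1, n2 = len(control), len(treatment)
s1, s2 = control.std(ddof=1), treatment.std(ddof=1)
pooled_sd = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Difference between the two means divided by the pooled SD
d = (treatment.mean() - control.mean()) / pooled_sd
```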
What are the benchmarks for Pearson’s correlation coefficient
small r = 0.1
medium r = 0.3
large r = 0.5
0 = no effect, 1 = perfect effect
What does an odds ratio of 1 mean
the odds of an outcome are equal in both groups
How to calculate the odds ratio
calculate the odds of the event in each group (probability of the event happening divided by the probability of it not happening), then divide the odds in one group by the odds in the other group
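The two-step calculation above, sketched with hypothetical counts from a 2x2 table:

```python
# Hypothetical 2x2 contingency table: event (yes/no) by group (A/B)
a_yes, a_no = 30, 10   # group A: event occurred / did not occur
b_yes, b_no = 15, 25   # group B

# Odds within each group: event / no event
odds_a = a_yes / a_no
odds_b = b_yes / b_no

# Odds ratio: OR = 1 means the odds are equal in both groups
odds_ratio = odds_a / odds_b
```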
What is categorical data
Data which can be divided into groups (e.g., gender, age group)
How to analyse categorical data
Pearson’s chi squared test
The likelihood ratio
Yates continuity correction
Log linear analysis
when to use Pearson’s chi squared test
when we want to see if there is a relationship between two categorical variables
if any expected frequency is less than 5 then we need to use Fisher’s exact test
When to use the likelihood ratio
to be used instead of chi squared test when samples are small
When to use Yates’ continuity correction
When we have a 2x2 contingency table, type 1 error increases
Yates’ continuity correction fixes this by lowering the chi squared statistic
What is a 2x2 contingency table
2 variables each with two levels e.g., male vs female / phone vs no phone
When to use log linear analysis
When there are 3+ categorical variables
What are the assumptions when analysing categorical data
independence of residuals (as such you cannot use chi squared on repeated measures)
expected values: should not be less than 5
When to use chi-squared test
use a chi-squared test if you have nominal (categorical) data
the chi squared test can be used to see if these observed frequencies differ from those that would be expected by chance
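The observed-vs-expected comparison above can be sketched with SciPy on a hypothetical 2x2 table (the counts are invented):

```python
import numpy as np
from scipy import stats

# Hypothetical 2x2 contingency table: gender (rows) by fruit choice (columns)
observed = np.array([[20, 30],
                     [35, 15]])

# chi2_contingency returns the statistic, p-value, degrees of freedom and the
# expected frequencies; for a 2x2 table it applies Yates' continuity
# correction by default
chi2, p, dof, expected = stats.chi2_contingency(observed)
```

Checking `expected` also covers the assumption that no expected frequency falls below 5.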
Types of chi squared test?
Chi squared goodness of fit test (one IV)
Chi squared as a test of association (Two IVs)
When to use chi-squared goodness of fit
Used to compare an observed frequency
distribution to an expected frequency
distribution.
- Eg. when picking fruit are people more
likely to pick an apple vs a banana.
- If significant then, some fruit get picked
more than we would expect by chance.
When to use Chi squared as
a ‘test of association’ (two
independent variables)?
Used to see if there is an association
between two independent variables.
- Eg. is there an association between
gender and choice of fruit.
- If significant then, there is an association
between the two variables.
What is additivity and linearity
the outcome variable is linearly related to predictors
What are the parametric test assumptions
At least interval data
Additivity and linearity
Normally distributed
Homoscedasticity/homogeneity of variance
Independence
What is homoscedasticity / homogeneity of variance
Variance of the outcome variable
should be stable at all levels of the
predictor variable.
What is independence
errors in the model should be independent
How to spot issues with assumption of normality
- look at the histogram (it should look like a bell curve)
- look at the p-p plot (dots should fall on/near the line)
- Look at descriptive statistics (skewness and kurtosis should be near to 0)
How to spot issues with
assumption of
linearity/homoscedasticity/
homogeneity of variances?
Look at scatter plots
Look at Levene’s test - significant =
variances unequal = assumption of
homogeneity of variances has been
broken.
What does a scatterplot look like when data is normal
dots scattered evenly everywhere
What does a scatter
plot look like when data
= heteroscedasticity?
funnel shape
what does a scatter plot look like when data is non-linear
curve
What does a scatter plot look like when data is non-linear and heteroscedasticity
curve and funnel (e.g., a boomerang)
Non-parametric alternatives to ANOVAs
kruskal-wallis
Friedman’s ANOVA
Non parametric alternative to one-way independent ANOVA
Kruskal-Wallis
Non parametric alternative to repeated measures ANOVA
Friedman’s ANOVA
How to interpret the
Kruskal-Wallis test?
- Look at the ‘ranks’ table, the mean ranks tell us which condition had the highest ranks
- If the chi squared test is significant then there is a difference between groups (but we do not know what kind of difference)
- To see where the difference lies, look at the box-plot and compare the experimental group to the control group.
- OR we can do a Mann-Whitney test and use Bonferroni correction (divide alpha by the number of tests), look to see which conditions are significant.
- Calculate the effect size by dividing the z score by the square root of the number of observations.
- use benchmarks of .10 / .30 / .50
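The test-then-follow-up sequence above, sketched with SciPy on three hypothetical independent groups:

```python
from scipy import stats

# Hypothetical scores from three independent groups
g1 = [2, 3, 3, 4, 2]
g2 = [6, 7, 5, 6, 7]
g3 = [4, 5, 4, 5, 6]

# Kruskal-Wallis: a significant result means the groups differ somewhere
h_stat, p_value = stats.kruskal(g1, g2, g3)

# Follow-up: Mann-Whitney tests per pair, judged against a
# Bonferroni-corrected alpha (three pairwise comparisons here)
u_stat, u_p = stats.mannwhitneyu(g1, g2)
corrected_alpha = 0.05 / 3
```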
What is an alternative to
the one way repeated
measures ANOVA?
Friedman’s ANOVA
How to interpret the
Friedman’s ANOVA?
- Look at the ‘ranks’ table, the mean ranks tell us which condition had the highest ranks.
- If the chi squared test is significant then there is a difference between groups (but we do not know what kind of difference)
- To see where the difference lies, look at the box-plot and compare the experimental group to the control group.
- OR we can do a Wilcoxon test and use Bonferroni correction (divide alpha by the number of tests), look to see which conditions are significant in the ‘test statistics’ box.
- Calculate the effect size by dividing the z score by the square root of the number of observations.
- use benchmarks of .10 / .30 / .50
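The omnibus step of the card above, sketched with SciPy on hypothetical repeated-measures data (the same six participants under three conditions):

```python
from scipy import stats

# Hypothetical repeated measures: each index is the same participant
cond1 = [5, 6, 4, 5, 6, 5]
cond2 = [7, 8, 6, 7, 8, 7]
cond3 = [6, 7, 5, 6, 7, 6]

# Friedman's ANOVA: ranks scores within each participant, then tests whether
# the conditions' mean ranks differ
chi2_stat, p_value = stats.friedmanchisquare(cond1, cond2, cond3)
```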
What are
correlations?
relationships between variables
define covariance
If variables are related, then a change in one variable will lead to a similar change in the other variable
What is cross-product deviation
how similar / different the deviations of two variables are from their respective means
How to calculate cross product deviation
multiply the deviations of one variable by the deviations of the other variable
How to calculate
covariance?
Calculate the cross product
deviation and divide by the number
of observations - 1.
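The two steps above (cross-product deviations, then division by N - 1), sketched with NumPy on hypothetical data:

```python
import numpy as np

# Hypothetical paired observations
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 5.0, 9.0])

# Cross-product deviations: deviation of x times deviation of y, per observation
cross_products = (x - x.mean()) * (y - y.mean())

# Covariance = sum of cross-product deviations / (N - 1)
covariance = cross_products.sum() / (len(x) - 1)

# Matches NumPy's built-in (np.cov uses N - 1 by default)
assert np.isclose(covariance, np.cov(x, y)[0, 1])
```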
If covariance is positive, then the correlation will be…..
positive
what is the standardised version of the covariance
correlation coefficient
what is the correlation coefficient
the standardised version of the covariance
Pearsons correlation coefficient measures the strength of relationship between variables
how to calculate a correlation coefficient
Divide the covariance by the product of the two variables’ standard deviations.
Scores lie between -1 and +1
+1 = perfect positive relationship
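The standardisation above, sketched with NumPy on hypothetical data:

```python
import numpy as np

# Hypothetical paired observations
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

# Covariance, then standardise by the product of the two standard deviations
covariance = ((x - x.mean()) * (y - y.mean())).sum() / (len(x) - 1)
r = covariance / (x.std(ddof=1) * y.std(ddof=1))

# Matches NumPy's built-in Pearson correlation
assert np.isclose(r, np.corrcoef(x, y)[0, 1])
```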
What is bivariate correlation
correlation between 2 variables
what is a partial correlation
quantifies relationship between two variables while controlling the effect of other variables
How to run bivariate correlations on SPSS
- Have assumptions been violated? If they have, use Kendall’s tau / Spearman’s rho
- Look at the ‘correlations’ table and see if Pearson’s correlations are significant.
- Look at the confidence intervals (if the data is not normal, look at the bootstrap CI)
- If the confidence interval crosses zero it suggests there could be NO effect.
What is the coefficient of determination
The coefficient of determination is represented by the term r² (or R²); it is the percentage of the total amount of change in the dependent variable (y) that can be explained by changes in the IV (x).
How to calculate R^2
Square the correlation coefficient (R)
What correlation coefficient test should you use if the data is non-parametric
spearman’s rho
How to interpret spearman’s rho on SPSS
- Look at the ‘correlations’ table and see if the correlation coefficient is significant.
- Look at the confidence intervals (if the data is not normal, look at the bootstrap CI)
- If the confidence interval crosses zero it suggests there could be NO effect.
What correlation coefficient test should you use if the data is non-parametric and sample is small
Kendall’s tau
how to interpret Kendall’s tau on SPSS
- Look at the ‘correlations’ table and see if the correlation coefficient is significant.
- Look at the confidence intervals (if the data is not normal, look at the bootstrap CI)
- If the confidence interval crosses zero it suggests there could be NO effect.
What type of correlation is used when one of the two variables is dichotomous
point-biserial correlation
when is point biserial correlation used
when one variable is discrete dichotomy (e.g., pregnancy)
when is biserial correlation used
used when one variable is a continuous dichotomy (e.g., passing or failing an exam)
what is a semi-partial correlation
we control for the effect that the third variable has on only one of the variables in the correlation
what is the equation of the simple linear model
Outcome = (b0 + b1X) + error.
what is the simple linear model used for
we can predict an outcome for a person using the model (the bit in brackets) and some error associated with this model
in the linear model what is b0
intercept
in the linear model what is b1
slope / gradient
what are b0 and b1 in the simple linear model
parameters
regression coefficients
in the linear model, what does a positive b1 mean
positive relationship
what are the assumptions of the linear model
Normally distributed errors
Independent errors
Additivity and linearity
Homoscedasticity
What is additivity and linearity
outcome variable and predictors combined effect is best described by addition effects together
how can we check independent error
durbin-watson value should be between 1 and 3
how big should our sample be when using the linear model
10 or 15 cases of data per predictor
What is cross validation
assessing the accuracy of a model over different samples
look at the adjusted R squared
how to interpret simple linear regression in SPSS
- Look at the ‘model summary’: R represents correlation
- R squared represents the amount of variance accounted for by the model.
- Look at the ‘ANOVA’ table: if the F ratio is significant then our model is a better predictor in comparison to using the mean.
- ‘B’ in the ‘coefficients’ table tells us the gradient and the strength of the relationship between a predictor and the outcome variable. Significant means the predictor significantly predicts the outcome variable.
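The same quantities (intercept b0, slope b1, R²) can be sketched outside SPSS with SciPy on hypothetical data:

```python
from scipy import stats

# Hypothetical predictor and outcome values
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

result = stats.linregress(x, y)
b0 = result.intercept          # where the line crosses the y-axis
b1 = result.slope              # gradient: change in outcome per unit of predictor
r_squared = result.rvalue ** 2 # variance accounted for by the model
```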
what does R squared represent
the amount of variance accounted for by the model
what is multiple regression
a model with several predictors
what are the different methods of regression
hierarchical regression
forced entry
stepwise methods
what is hierarchical regression
predictors are based on past work and the researcher decides which order to enter the variables
known predictors should go first, followed by any new ones that we suspect will be important
what is forced entry regression
all predictors are forced into the model at the same time
we have to have good theoretical support to include the predictors we have included
what is stepwise methods regression
Generally frowned upon because the researcher is not in control.
The decision is based on mathematical criteria that SPSS decides.
It will see how much variance is accounted for by one predictor; if it is sufficient then it will keep it and move on to find another predictor which may explain more variance.
what are the concerns when including more than one predictor in a model
multicollinearity
what is multicollinearity
exists when there is a strong correlation between 2+ of our predictor variables
why is having 2+ variables with perfect collinearity problematic
the values of b for each variable are interchangeable
what happens when collinearity increases
standard errors of the b coefficient increase
the size of t is limited
it is difficult to assess the individual importance of predictors when they are highly correlated
what is r
the correlation between predicted values of the outcome and the observed values
how can we check to see if multicollinearity is a problem
check the variance inflation factor (VIF) and tolerance statistics
how to interpret VIF
VIF greater than 10 = cause for concern
average VIF substantially greater than 1 = regression may be biased
how to interpret tolerance statistic
tolerance below 0.1 = serious problem
tolerance below 0.2 = potential problem
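Both diagnostics follow from regressing one predictor on the others; a NumPy-only sketch, with made-up predictors where the third is deliberately near-collinear with the first two:

```python
import numpy as np

# Hypothetical predictors; x3 is nearly a linear combination of x1 and x2,
# so its VIF should be large
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
x3 = x1 + x2 + rng.normal(scale=0.1, size=100)

def vif(target, others):
    """VIF = 1 / (1 - R^2) from regressing one predictor on the others."""
    X = np.column_stack([np.ones(len(target))] + others)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    predicted = X @ beta
    ss_res = ((target - predicted) ** 2).sum()
    ss_tot = ((target - target.mean()) ** 2).sum()
    r_squared = 1 - ss_res / ss_tot
    return 1 / (1 - r_squared)

vif_x3 = vif(x3, [x1, x2])
tolerance = 1 / vif_x3  # tolerance below 0.1 = serious problem
```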
what is Cook’s distance
quantifies the impact of an outlier on a model
if Cook’s distance is above 1, then that case may be influencing the model
what is factor analysis
a statistical procedure that identifies clusters of related items on a test
do several facets reflect one variable? (e.g., burnout (variable) - stress levels, motivation (facets))
what do factors represent
clusters of variables that correlate highly with each other
what can we use to decide which factors to extract
scree plot
the point of inflexion is where you should cut off
what is rotation used for in factor analysis
to discriminate factors
what is ethics comprised of
informed consent
deception
debriefing
confidentiality
protection from physical and psychological harm
what is informed consent
participants should understand what the experiment involves and understand their rights
the ability to withdraw at any point
when is it ok to not gain informed consent
in observational studies only if the person being observed is in a situation where they would be in public view anyway (e.g., shopping centre)
what are the levels of measurement
Nominal
ordinal
interval
ratio
which levels of measurement use non-parametric tests
nominal
ordinal
which levels of measurements use parametric tests
interval
ratio
what is nominal data
the numbers act as a name
data from a nominal scale should not be used for arithmetic
nominal data can be used for frequencies
what is ordinal data
tells us the frequencies and in what order they occurred
does not tell us the differences between values
most self report questionnaires are ordinal data
what is interval data
differences between values on a scale are equal
tested with parametric statistics
what is ratio data
differences between values on a scale are equal
distances along the scale are divisible
there is a true zero point (i.e., no minus numbers, e.g., reaction time)
types of variables
discrete
continuous
what are discrete variables
non-overlapping categories
eg being pregnant - you either are or are not
what are continuous variables
runs along a continuum
e.g., aggression
what is validity
whether an instrument measures what it sets out to measure
what is criterion validity
whether you can establish if a measurement is measuring what it is meant to through comparison to an objective criterion
we assess this by relating scores on your measure to real-world observation
what is concurrent validity
evidence that scores from an instrument correspond to external measures
eg. nurses are assessed for knowledge
via a written & practical test. If they
score well on the test and then well on the
practical = concurrent validity.
what is predictive validity
when data from the new instrument are used to predict observations later in time
what is content validity
with questionnaires, we can assess how well individual items represent the construct being measured
what is factorial validity
when making questionnaire and using factor analysis
if your factors are made up of items that seem to go together meaningfully = factorial validity
what is reliability
whether an instrument can be interpreted consistently across different situations
what is test-retest reliability
the ability of a measure to produce consistent results when the same entities are tested at two different points in time
how can we test reliability
split-half method
cronbach’s alpha
what is split-half method
Splitting a test into two and having the
same participant do both.
The results are then correlated, and if
they are similar then there is high
internal reliability.
how can we infer reliability through Cronbach’s alpha
if the correlation is above 0.8 = reliable
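Cronbach's alpha can be computed by hand from item and total-score variances; a NumPy sketch using a hypothetical 5-participant, 4-item questionnaire:

```python
import numpy as np

# Hypothetical questionnaire: rows = participants, columns = items
scores = np.array([
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
])

k = scores.shape[1]                          # number of items
item_vars = scores.var(axis=0, ddof=1)       # variance of each item
total_var = scores.sum(axis=1).var(ddof=1)   # variance of participants' totals

# Cronbach's alpha: above 0.8 is taken as reliable on the card above
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
```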
what is measurement error
The difference between the score we
get using our measurement and the level
of the construct we are measuring.
Eg. I actually weigh 47kg but the scales
show 57kg
what is a histogram
Used for frequency distribution.
Plots a single variable (x-axis)
against the frequency scores (y-axis)
what is a box plot
Used to show important characteristics of a set of
observations.
Center of the plot = median
Box = middle 50% of observations (aka interquartile range)
Upper and lower quartile are the ends of the box.
Whiskers = top and bottom 25% of scores.
what is a bar chart used for
graphing means
what are scatterplots
used for graphing relationships
a graph that plots each persons score on one variable against another
what is null hypothesis significant testing
A method of assessing scientific theories
We have 2 competing hypothesis - null
hypothesis (no effect) and the alternative
hypothesis (there is an effect).
We compute a test statistic and find out how likely it is that we would get a value as big as the one we have if the null hypothesis is true (i.e. by chance).
in null hypothesis significance testing, what is a significant effect
less than 0.05 = significant effect
what is type 1 error
saying there is an effect when there isn’t
rejecting the null hypothesis when it is true
what is a type 2 error
saying there isn’t an effect when there is
accepting the null hypothesis when it is false
what is the power of statistical test
probability we will find an effect if it exists
what is a meta-analysis
effect sizes from different studies testing the same hypothesis are combined to get a better estimate of the size of effect in the population
what is an alternative to null hypothesis significance testing
bayesian analysis
what is an IV
The variable that is being manipulated
by the researcher.
It is independent from the other
variables.
IV’s can have different levels.
IV goes on the x axis.
what is the DV
The variable that is hypothesised to
be affected by the IV.
It depends on the IV.
DV goes on the y axis.
what is between group design
participants placed into different groups
they can be part of one group for the entire experiment
what is a within groups design
same participants placed into all levels of the independent variable
what is quantitative research
research that deals with numerical data
data is analysed to compare groups or make inferences
confirm / test hypotheses using numbers
what is qualitative research
mainly uses words
data analysed to summarise, categories and interpret themes
explorative, an attempt to understand through words
what is descriptive research
aims to describe a phenomenon
what, when, where and how
does not lead us to think about causation
what is correlational research
aims to define a statistical relationship between variables
e.g., is there a relationship between cognition and caffeine
what is quasi-experimental design
experimenter has no control over the allocation of participants to conditions or the timing of experimental conditions
what is experimental design
aims to establish causality
randomisation in important to reduce the effect of confounding variables
what is ABA design
baseline behaviour measured (A)
treatment applied and behaviour measured while treatment present (B),
treatment is removed and the baseline behaviour is recorded again (A)
types of sampling in qualitative research
purposive sampling
theoretical sampling
what is purposive sampling
selecting participants according to criteria that are important for the research question
what is theoretical sampling
the people you attempt to recruit will change as a result of the things you are learning
what is meant by 2x2 design
a research design with 2 independent variables, each with 2 levels
how can we examine mediated or moderated relationships
path analysis
what is factor loading
a correlation coefficient between a variables and a factor (cluster of variables)
what is a mediating variable
explains the relationship between the independent variable and the dependent variable
what is a moderating variable
alters the relationship between the IV and DV
what is the standard error
the standard deviation (spread) of the sampling distribution
what is standard deviation
a measure of how much scores vary around the mean score
what is variance
how the values are dispersed around the mean
what are z-scores
Number of units of standard deviation
any one value is above or below the
mean
The larger the z-score the further its
value is away from the group’s mean
how to calculate z-scores
(raw score - mean) /
standard deviation
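The formula on the card, applied with NumPy to a hypothetical set of scores:

```python
import numpy as np

# Hypothetical raw scores
scores = np.array([10.0, 12.0, 14.0, 16.0, 18.0])

# z = (raw score - mean) / standard deviation, for every score at once
z_scores = (scores - scores.mean()) / scores.std(ddof=1)
```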
what does a significant F-ratio tell you
the model is a better predictor in comparison to the mean
what is the p-value
the probability of obtaining results at least as extreme as the observed results of a statistical hypothesis test, assuming that the null hypothesis is correct
what are the degrees of freedom for one sample t-test
sample size - 1
what are the degrees of freedom for one way ANOVA
sample size - k
where k is the number of cell means
what is the ceiling effect
when scores tend to cluster at the upper end of a distribution
what is the floor effect
when a task is so difficult that all scores are very low
what is a pairwise comparison
post hoc compares two individual means at a time
what is a main effect
effect of one IV while ignoring the other IV
what is an interaction effect
the combined effect of two or more IV’s