Exam Revision Flashcards
There are constraints on getting valuable info for a hypothesis from a study's design, such as - (2)
duration of the study
how many people you can recruit
What is a sample?
A sample is the specific group that you will collect data from.
What is a population?
A population is the entire group that you want to draw conclusions about.
Example of population vs sample (2)
Population : Advertisements for IT jobs in the UK
Sample: The top 50 search results for advertisements for IT jobs in the UK on 1 May 2020
What is inferential statistics?
Inferential statistics allow you to test a hypothesis or assess whether your data is generalisable to the broader population.
Why is there a focus on parametric tests over other tests in research? - (3)
- they are more rigorous, powerful and sensitive than non-parametric tests to answer your question
- This means that they have a higher chance of detecting a true effect or difference if it exists.
- They also allow you to make generalizations and predictions about the population based on the sample data.
We can obtain multiple outcomes from the
same people
We can obtain outcomes under
different conditions, groups or both
What are the 4 types of outcomes we measure? (4)
- Ratio
- Interval
- Ordinal
- Nominal
What is a continuous variable? - (2)
there is an infinite number of possible values these variables can take on
entities get a distinct score
2 examples of continuous variables (2)
- Interval
- Ratio
What is an interval variable?
: Equal intervals on the variable represent equal differences in the property being measured
Examples of interval variables - (2)
e.g. the difference between 600ms and 800ms is equivalent to the difference between 1300ms and 1500ms. (reaction time)
temperature (Fahrenheit), temperature (Celsius), pH, SAT score (200-800), credit score (300-850)
What is a ratio variable?
The same as an interval variable and also has a clear definition of 0.0.
Examples of ratio variable - (3)
E.g. Participant height or weight
(can have 0 height or weight)
temp in Kelvin (0.0 Kelvin really does mean “no heat”)
dose amount, reaction rate, flow rate, concentration,
What is a categorical variable? (2)
A variable that cannot take on all values within the limits of the variable
- entities are divided into distinct categories
What are 2 examples of categorical variables? (2)
- Nominal
- Ordinal
What is a nominal variable? - (2)
a variable with categories that do not have a natural order or ranking
Has two or more categories
Examples of nominal variable - (2)
genotype, blood type, zip code, gender, race, eye color, political party
e.g. whether someone is an omnivore, vegetarian, vegan, or fruitarian.
What are ordinal variables?
categories have a logical, incremental order
Examples of ordinal variables - (3)
e.g. whether people got a fail, a pass, a merit or a distinction in their exam
socio-economic status ("low income", "middle income", "high income"),
satisfaction rating [Likert Scale] (“extremely dislike”, “dislike”, “neutral”, “like”, “extremely like”).
We use the term 'variables' for both continuous and categorical measures as - (2)
both outcome and predictor are variables
We will see later on that not only the type of outcome but also type of predictor influences our choice of stats test
A Likert scale is an ordinal variable, but sometimes outcomes measured on a Likert scale are treated as - (3)
continuous after inspection of the distribution of the data, and one may argue the divisions on the scale are equal
(i.e., treated as interval if distribution is normal)
gives greater sensitivity in parametric tests
What is measurement error?
The discrepancy between the actual value we’re trying to measure, and the number we use to represent that value.
In reducing measurement error in outcomes, the
values have to have the same meaning over time and across situations
Validity means that the (2)
instrument measures what it set out to measure
refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure)
Reliability means the
ability of the measure to produce the same results under the same conditions
Test-retest reliability is the ability of a
measure to produce consistent results when the same entities are tested at two different points in time
3 types of variation (3)
- Systematic variation
- Unsystematic variation
- Randomisation
What is systematic variation - (2)
Differences in performance created by a specific experimental manipulation.
This is what we want
What is unsystematic variation? (3)
Differences in performance created by unknown factors
Age, gender, IQ, time of day, measurement error, etc.
These differences can be controlled, of course (e.g., inclusion/exclusion of pps by setting an age range of 18-25)
Randomisation (and other approaches) minimises - (2)
effects of unsystematic variation
does not remove unsystematic variation
What is the independent variable (factors)? - (3)
- The hypothesised cause
- A predictor variable
- A manipulated variable (in experiments)
What is the dependent variable (measures)? - (3)
- The proposed effect, change in DV
- An outcome variable
- Measured not manipulated (in experiments)
In all experiments we have two hypotheses, which are (2)
- Null hypothesis
- Alternative hypothesis
What is null hypothesis?
that there is no effect of the predictor variable on the outcome variable
What is alternative hypothesis?
that there is an effect of the predictor variable on the outcome variable
Null Hypothesis Significance Testing computes the (2)
probability of obtaining the observed data (a statistic at least as extreme as the one found) if the null hypothesis were true, by computing a statistic and how likely it is that the statistic has that value by chance alone
This probability is referred to as the p-value
The NHST does not compute the probability of the
null hypothesis
There can be directional and non-directional versions of
an alternative hypothesis
A non-directional alternative hypothesis is…
that there is an effect of the group on the outcome variable
A directional alternative hypothesis is…
that the mean of the outcome variable for group 1 is larger than the mean for group 2
Example of directional alternate hypothesis
There would be far greater engagement in stats lectures if they were held at 4 PM and not 9 AM
For a non-directional hypothesis you will need to divide your alpha value between the
two tails of the normal distribution
The 3 misconceptions of Null Hypothesis Significance Testing (NHST) - (3)
- A significant result means the effect is important
- A non-significant result means the null hypothesis is true
- A significant result means the null hypothesis is false (it just gives the probability of the data occurring given the null hypothesis; it doesn't show that the null hypothesis is categorically false)
P-hacking and HARKing are another issue with
NHST
P-hacking and HARKing are - (2)
researcher degrees of freedom
changes made after the results are in and some analysis has been done
P-hacking refers to a
selective reporting of significant results
Harking is
Hypothesising After the Results are Known
P-hacking and HARKING are often used in
combination
What does EMBERS stand for? (5)
- Effect Sizes
- Meta-analysis
- Bayesian Estimation
- Registration
- Sense
EMBERS can reduce issues of
NHST
Uses of Effect sizes and Types of Effect Size (3)
- There a quite a few measures of effect size
- Get used to using them and understanding how studies can be compared on the basis of effect size
- A brief example: Cohen’s d
Meaning of Effect Size (2)
Effect size is a quantitative measure of the magnitude of the experimental effect.
The larger the effect size the stronger the relationship between two variables.
Formula of Cohen’s d
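d = (M1 − M2) / s_pooled: the difference between the two group means divided by the pooled standard deviation (a standard form of the formula).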
What is meta-analysis?
Meta-analysis is a study design used to systematically assess previous research studies to derive conclusions about that body of research
Meta-analysis brings together.. and assesses (2)
- Bringing together multiple studies to get a more realistic idea of the effect
- Can assess effect sizes that are averaged across studies
Funnel plots in meta-analysis can be used to… (2)
investigate publication bias and other biases in meta-analysis
plot studies by their sample size so bias can be observed
Bayesian approaches capture
probabilities of the data given the hypothesis and null hypothesis
Bayes factor is now often computed and stated alongside
conventional NHST analysis (and effect sizes)
Registration is where (5)
- Telling people what you are doing before you do it
- Tell people how you intend to analyze the data
- Largely limits researcher degrees of freedom (HARKing, p-hacking)
- A peer reviewed registered study can be published whatever the outcome
- The scientific record is therefore less biased to positive findings
Sense is where (4)
- Knowing what you have done in the context of NHST
- Knowing misconceptions of NHST
- Understanding the outcomes
- Adopting measures to reduce researcher degrees of freedom (like preregistration etc..)
most of the statistical tests in this book rely on
having data measured at interval level
To say that data are interval, we must be certain that
equal intervals on the scale represent equal differences in the property being measured.
The distinction between continuous and discrete variables can often be blurred - 2 examples - (2)
continuous variables can be measured in discrete terms; when we measure age we rarely use nanoseconds but use years (or possibly years and months). In doing so we turn a continuous variable into a discrete one.
treat discrete variables as if they were continuous, e.g., the number of boyfriends/girlfriends that you have had is a discrete variable. However, you might read a magazine that says 'the average number of boyfriends that women in their 20s have has increased from 4.6 to 8.9'
a device for measuring sperm motility that actually measures sperm count is not
valid
Criterion validity is whether the
instrument is measuring what it claims to measure (does
your lecturers’ helpfulness rating scale actually measure lecturers’ helpfulness?).
The two sources of variation that are always present in independent and repeated measures designs are
unsystematic variation and systematic variation
The effect of our experimental manipulation is likely to be more apparent in a repeated-measures design than in a between-group design
The effect of the experimental manipulation is more apparent in a repeated design than an independent design since, in an independent design,
differences between the characteristics of the people allocated to each of the groups is likely to create considerable random variation both within
each condition and between them
This means that, other things being equal, repeated-measures designs have
more power to detect effects than independent designs
We can use randomization in two different ways depending on
whether we have an
independent or repeated measures design
Two sources of systematic variation in a repeated measures design - (2)
- Practice effects
- Boredom effects
What are practice effects?
Participants may perform differently in the second condition because
of familiarity with the experimental situation and/or the measures being used.
What are boredom effects?
: Participants may perform differently in the second condition because
they are tired or bored from having completed the first condition.
We can ensure no systematic variation between conditions in a repeated measures design is produced by practice and boredom effects by
counterbalancing the order in which a person participates in a condition
Example of counterbalancing
we randomly determine whether a participant
completes condition 1 before condition 2, or condition 2 before condition 1
What distribution is needed for parametric tests?
A normal distribution
The normal distribution curve is also referred as the
bell curve
Normal distribution is symmetrical meaning
the distribution curve can be divided in the middle to produce two equal halves
The bell curve can be described using two parameters called (2)
- Mean (central tendency)
- Standard deviation (dispersion)
μ is
mean
σ is
standard deviation
Diagram shows:
e.g., if we move 1σ to the right of the mean, that region contains 34.1% of the values
Many statistical tests (parametric) cannot be used if the data are not
normally distributed
The mean is the sum of
scores divided by the number of scores
Mean is a good measure of
central tendency for roughly symmetric distributions
The mean can be a misleading measure of central tendency in skewed distributions as
it can be greatly influenced by scores in tail e.g., extreme values
Aside from the mean, what are the 2 other measures of central tendency? - (2)
- Median
- Mode
The median is where (2)
the middle score when scores are ordered.
the middle of a distribution: half the scores are above the median and half are below the median.
The median is relatively unaffected by … and can be used with… (2)
- extreme scores or skewed distribution
- can be used with ordinal, interval and ratio data.
The mode is the most
frequently occurring score in a distribution, a score that actually occurred
The mode is the only measure of central tendency that can be used with
nominal data
The mode is greatly subject to
sample fluctuations and is therefore not recommended to be used as the only measure of central tendency
Many distributions have more than one
mode
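A quick way to compare the three measures - a minimal Python sketch with a made-up list of scores containing one extreme value:

```python
import statistics

scores = [2, 3, 3, 4, 5, 9]  # hypothetical scores; 9 is an extreme value

print(statistics.mean(scores))       # ~4.33 - pulled upwards by the extreme score
print(statistics.median(scores))     # 3.5 - middle of the ordered scores
print(statistics.multimode(scores))  # [3] - most frequent score(s)
```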
The mean, median and mode are identical in
symmetric distributions
For positive skewed distribution, the
mean is greater than the median, which is greater than the mode
For negative skewed distribution
usually the mode is greater than the median, which is greater than the mean
Kurtosis in Greek means
bulge or bend
What is central tendency?
the tendency for the values of a random variable to cluster round its mean, mode, or median.
Diagram of normal kurtosis, positive excess kurtosis (leptokurtic) and negative excess kurtosis (platykurtic)
What does lepto mean?
prefix meaning thin
What is platy
a prefix meaning flat or wide (think Plateau)
Tests of normality (2)
Kolmogorov-Smirnov test
Shapiro-Wilk test
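Both tests are available in scipy; a minimal sketch, assuming a made-up sample (note the KS p-value is only approximate when the normal's parameters are estimated from the same sample):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(loc=50, scale=10, size=40)  # hypothetical sample

# Shapiro-Wilk: p > .05 suggests no significant deviation from normality
w_stat, p_shapiro = stats.shapiro(data)

# Kolmogorov-Smirnov against a normal with the sample's own mean/SD
ks_stat, p_ks = stats.kstest(data, 'norm', args=(data.mean(), data.std(ddof=1)))

print(p_shapiro, p_ks)
```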
Tests of normality are dependent on
sample size
If you have a massive sample size then you will find these normality tests often come out as … even when your data visually can look - (2)
significant
normally distributed
If you have a small sample size, then the normality tests may come out non-significant, even when the data is not normally distributed, due to
lack of power in the test to detect a significant deviation from normality
There is no hard and fast rule for
determining whether data is normally distributed or not
Plot your data because this helps inform on what decisions you want to make with respect to
normality
Even if the normality test is significant, if the data looks visually normally distributed then you can still do
parametric tests
A frequency distribution or a histogram is a plot of how many times
each score occurs
2 main ways a distribution can deviate from the normal - (2)
- Lack of symmetry (called skew)
- Pointiness (called kurtosis)
In a normal distribution the values of skew and kurtosis are 0 meaning…
tails of the distribution are as they should be
Is age nominal or continuous?
Continuous
Is gender continuous or nominal?
Nominal
Is height continuous or nominal?
Continuous
Which of the following best describes a confounding variable?
A. A variable that affects the outcome being measured as well as, or instead of, the independent variable
B. A variable that is manipulated by the experimenter
C. A variable that has been measured using an unreliable scale
D. A variable that is made up only of categories
A
If a test is valid, what does it mean?
A. The test measures what it claims to measure.
B. The test will give consistent results. (Reliability)
C. The test has internal consistency (a measure of the correlations between different items on the same test = see if it measures the same construct)
D. The test measures a useful construct or variable = a test can measure something useful but still not be valid
A
A variable that measures the effect that manipulating another variable has is known as:
A. DV
B. A confounding variable
C. Predictor variable
D. IV
A
The discrepancy between the numbers used to represent something that we are trying to measure and the actual value of what we are measuring is called:
A. Measurement error
B. Reliability
C. The ‘fit’ of the model
D. Variance
A
A frequency distribution in which low scores are most frequent (i.e. bars on the graph are highest on the left hand side) is said to be:
A. Positively skewed
B. Leptokurtic = distribution with positive kurtosis
C. Platykurtic = negative kurtosis
D. Negatively skewed = frequent scores at the high end
A
Which of the following is designed to compensate for practice effects?
A. Counterbalancing
B. Repeated measures design = practice effects issue in repeated measures
C. Giving participants a break between tasks = this compensates for boredom effects
D. A control condition = provides reference point
A
Variation due to variables that have not been measured is
A. Unsystematic variation
B. Homogeneous variance = the assumption that the variance in each population is equal
C. Systematic variation = due to experimental manipulation
D. Residual variance = reflects how well the constructed regression line fits the actual data
A
Purpose of control condition is to
A. Allow inferences about cause
B. Control for participants’ characteristics = randomisation
C. Show up relationship between predictor variables
D. Rule out tertium quid
A. Allows inferences about cause
If the scores on a test have a mean of 26 and a standard deviation of 4, what is the z-score for a score of 18?
A. -2
B. 11
C. 2
D. -1.41
A: (18-26)/4 = -8/4 = -2
The standard deviation is the square root of the
A. Variance
B. Coefficient of determination = r squared
C. Sum of squares = sum of squared deviances
D. Range = largest - smallest
A
Complete the following sentence: A large standard deviation (relative to the value of the mean itself):
A. Indicates data points are distant from the mean (i.e., poor fit of data)
B. Indicates the data points are close to the mean
C. Indicates that the mean is a good fit of the data
D. Indicates that you should analyse the data with parametric tests
A
The probability is p = 0.80 that a patient with a certain disease will be successfully treated with a new medical treatment. Suppose that the treatment is used on 40 patients. What is the "expected value" of the number of patients who are successfully treated?
A. 32
B. 20
C. 8
D. 40
A = 80% of 40 is 32 (0.80 * 40)
Imagine a test for a certain disease.
Suppose the probability of a positive test result is .95 if someone has the disease, but the probability is only .08 that someone has the disease if his or her test result was positive.
A patient receives a positive test, and the doctor tells him that he is very likely to have the disease. The doctor's response is:
A. Confusion of the inverse
B. Law of small numbers
C. Gambler’s fallacy
D. Correct, because the test is 95% accurate when someone has the disease = incorrect, as the doctor based the assumption on the inverse probability (P(positive|disease) rather than P(disease|positive))
A
Which of these variables would be considered not to have met the assumptions of parametric tests based on the normal distribution?
(Hint: many statistical tests rely on data measured at interval level)
A. gender
B. Reaction time
C. Temp
D. Heart rate
A
The test statistics we use to assess a linear model are usually _______ based on the normal distribution
(Hint: These tests are used when all of the assumptions of a normal distribution have been met)
A. Parametric
B. Non-parametric
C. Robust
D. Not
A
Which of the following is not an assumption of the general linear model?
A. Dependence
B. Additivity
C. Linearity
D. Normally distributed residuals
A = independence is the actual assumption of parametric tests, not dependence
Looking at the table below, which of the following statements is the most accurate?
Hint: The further the values of skewness and kurtosis are from zero, the more likely it is that the data are not normally distributed
A. For the number of hours spent practising, there is not an issue of kurtosis
B. For level of musical skill, the data are heavily negatively skewed
C. For the number of hours spent practising there is an issue of kurtosis
D. For the number of hours spent practising, the data are fairly positively skewed
A - correct
B. Incorrect as the value of skewness is –0.079, which suggests that the data are only very slightly negatively skewed because the value is close to zero
C. Incorrect as the value of kurtosis is 0.098, which is fairly close to zero, suggesting that kurtosis was not a problem for these data
D. Incorrect as the value of skewness for the number of hours spent practising is –0.322, suggesting that the data are only slightly negatively skewed
Diagram of skewness
In SPSS output, if the value of skewness is between -1 and 1 then
all good
In SPSS output, if the value is below -1 or above 1 then
data is skewed
In SPSS output, if the value of skewness is below -1 then
negatively skewed
In SPSS output, if the value of skewness is above 1 then
positively skewed
Diagram of leptokurtic, platykurtic and mesokurtic (normal)
What does kurtosis tell you?
how much our data lies around the ends/tails of our histogram which helps us to identify when outliers may be present in the data.
A distribution with positive kurtosis, so much of the data is in the tails, will be
pointy or leptokurtic
A distribution with negative kurtosis, so the data lies more in the middle, will be
more sloped or platykurtic
Kurtosis is the sharpness of the
peak of a frequency-distribution curve
If our Kurtosis value is 0, then the result is a
normal distribution
If the kurtosis value in SPSS is between -2 and 2 then
all good! = normal distribution
If the kurtosis value in SPSS is less than -2 then
platykurtic
If the kurtosis value is greater than 2 in SPSS then
leptokurtic
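To apply these rules of thumb outside SPSS, a minimal sketch using scipy (the data are made up; scipy's kurtosis returns excess kurtosis by default, so 0 corresponds to a normal distribution, matching the convention above):

```python
from scipy import stats

hours_practising = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6]  # hypothetical data

skew = stats.skew(hours_practising)
kurt = stats.kurtosis(hours_practising)  # Fisher's definition: normal = 0

print('skewness OK' if -1 <= skew <= 1 else 'data is skewed')
print('kurtosis OK' if -2 <= kurt <= 2 else 'kurtosis problem')
```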
Are we good for skewness and kurtosis in this SPSS output?
Good because both the skewness is between -1 and 1 and kurtosis values are between -2 and 2.
Are we good for skewness and kurtosis in this SPSS output?
Bad, because although the skewness is between -1 and 1, we have a problem with kurtosis: a value of 2.68, which is outside the range -2 to 2
Correlational research doesn’t allow to rule out the presence of a
third variable = confounding variable
e.g., we find that drownings and ice cream sales are correlated and conclude that ice cream sales cause drowning. Are we correct? Maybe both are driven by the weather
The tertium quid is a variable that you may not have considered that could be
influencing your results, e.g., the weather in the ice cream and drowning example
How to rule out tertium quid? - (2)
Use of RCTs.
Randomized Controlled Trials allow to even out the confounding variables between the groups
Correlation does not mean
causation
To infer causation,
we need to actively manipulate the variable we are interested in, and control against a group (condition) where this variable was not manipulated.
Correlation does not mean causation as according to Andy
causality between two variables cannot be assumed because there may be other measured or unmeasured variables affecting the results”
Aside from checking kurtosis and skewness assumptions in data, also check if it has
linearity or less commonly additivity
Additivity refers to the combined
effect of many predictors
What does this diagram show in terms of additivty /linearity? - (5)
There is a linear effect when the data increases at a steady rate, like the graph on the left.
Your cost increases steadily as the number of chocolate bars increases.
The graph on the right shows a non-linear effect when there is not this steady increase rather there is a sharp change in your data.
So you might feel ok if you eat a few chocolate bars but after that the risk of you having a stomach ache increases quite rapidly the more chocolates you eat.
This effect is super important to check or your statistical analysis will be wrong even if your other assumptions are correct because a lot of statistical tests are based on linear models.
Discrepancy between measurement and actual value in the population is … and not…
measurement error and NOT variance
Measurement error can happen across all psychological experiments from.. to ..
recording instrument failure to human error
What are the 2 types of measurement errors? - (2)
- Systematic
- Random
What is systematic measurement error?
: predictable, typically constant or proportional to the true value and always affect the results of an experiment in a predictable direction
Example of systematic measurement error
for example, if I know I am 5ft 2 and when I go to get measured I'm told I'm 6ft, this is a systematic error and pretty identifiable - these usually happen when there is a problem with your experiment
What is random measurement error?
measurable values being inconsistent when repeated measures of a constant attribute or quantity are taken.
Example of random measurement error
for example, my height is 5ft 2 when I measure it in the morning but it's 5ft when I measure myself in the evening. This is because my measurements were taken at different times, so there would be some variability - for those of you who believe you shrink throughout the day.
What is variance?
Average squared deviation of each number from its mean.
Variability is an inherent part of
things being measured and of the measurement process
Diagram of variance formula
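In symbols: s² = Σ(xᵢ − x̄)² / (N − 1), i.e., each score's squared deviation from the mean, summed and divided by N − 1.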
In central limit theorem - (2)
states that the sampling distribution of the mean approaches a normal distribution, as the sample size increases. This fact holds especially true for sample sizes over 30.
Therefore, as a sample size increases, the sample mean and standard deviation will be closer in value to the population mean μ and standard deviation σ .
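A minimal simulation sketch of the theorem (all values invented for illustration): draw repeated samples of size 30 from a skewed population and look at the distribution of their means.

```python
import numpy as np

rng = np.random.default_rng(42)
population = rng.exponential(scale=2.0, size=100_000)  # deliberately skewed population

# Means of 5,000 samples, each of size 30
sample_means = [rng.choice(population, size=30).mean() for _ in range(5_000)]

# The means are approximately normally distributed, centred on the
# population mean, with SD close to sigma / sqrt(30) (the SEM)
print(np.mean(sample_means), population.mean())
print(np.std(sample_means), population.std() / np.sqrt(30))
```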
What does a histogram look at? - (2)
Frequency of scores
Look at distribution of data, skewness, kurtosis
What does boxplot look at? - (2)
To identify outliers
Shows median rather than mean (good for non-normally distributed data)
What are line graphs?
simply bar charts with lines instead of bars
Bar charts are a good way to
display means (and
standard errors)
What does a scatterplot illustrate? - (2)
a relationship between two variables, e.g. correlation or regression
Only use regression lines for regressions!
What are matrix scatterplots? - (2)
Particular kind of scatterplot that can be used instead of the 3-D scatterplot
clearer to read
Using the data provided, how would you summarise the skew?
A. The data has an issue with positive skew
B.The data has an issue with negative skew
C.The data is normally distributed
B
What is the median number of bullets shot at a partner by females?
67.00
What descriptive statistic does the red arrow represent?
A. Inter quartile range
B. Median
C. Mean
D. Range
A
What is the mean for males and the SD for females? - (2)
Males M = 27.29
Females SD = 12.20
What are the respective standard errors of the mean for females and males?
3.26 & 3.42
Answering the question 'Meets assumptions of parametric tests?' will determine whether our continuous data can be tested
with parametric or non-parametric tests
A normal distribution is a distribution with the same general shape which is a
bell shape
A normal distribution curve is symmetric around
the mean μ
A normal distribution is defined by two parameters - (2)
the mean (μ) and the standard deviation (σ).
Many statistical tests (parametric) cannot be used if the data is not
normally distributed
What does this diagram show? - (2)
μ = 0 is the peak of the distribution
Blocked areas under the curve give us insight into the way data is distributed and the chance of certain scores occurring if they belong to a normal distribution, e.g., 34.1% of values lie within one SD below the mean
A z score in a standard normal distribution reflects the number of
SDs a particular score is above or below the mean
How to calculate a z score?
Take the value for a participant (e.g., 56 years old), subtract the mean of the distribution (e.g., mean age of class is 23) and divide by the SD (class SD of, say, 2)
If a person scored a 70 on a test with a mean of 50 and a standard deviation of 10
Converting the test scores to z scores, an X of 70 would be z = (70 - 50)/10 = 2
What the result means…. - (2)
a z score of 2 means the original score was 2 standard deviations above the mean
We can convert our z scores to
percentiles
Example: What is the percentile rank of a person receving a score of 90 on the test? - (3)
Mean = 80
SD = 5
First calculate the z score: the graph shows that most people scored below 90. Since 90 is 2 standard deviations above the mean, z = (90 - 80)/5 = 2
The z score to percentile conversion can be looked up in a table: a z score of 2 is equivalent to the 97.7th percentile
The proportion of people scoring below 90 is thus .977, and the proportion scoring above 90 is 2.3% (1 - 0.977)
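The table lookup can be reproduced with the normal CDF in scipy; a minimal sketch of the worked example above:

```python
from scipy.stats import norm

score, mean, sd = 90, 80, 5
z = (score - mean) / sd   # (90 - 80) / 5 = 2.0

below = norm.cdf(z)       # proportion scoring below 90
print(z, below)           # 2.0, ~0.977 -> 97.7th percentile
print(1 - below)          # ~0.023 -> 2.3% scored above 90
```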
What is the sample mean?
an unbiased estimate of the population mean.
How can we know that our sample mean estimate is representative of the population mean?
Via computing standard error of mean - smaller SEM the better
Standard deviation is used as a measure of how
representative the mean was of the observed data.
Small standard deviations represented a scenario in which most data points were
close to the mean
Large standard deviation represented a situation in which data points were
widely spread from the mean.
How to calculate the standard error of mean?
computed by dividing the standard deviation of the sample by the square root of the number in the sample
The larger the sample the smaller the - (2)
standard error of the mean
more confident we can be that the sample mean is representative of the population.
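A minimal sketch of the SEM calculation (sample values made up), also showing that a larger N gives a smaller SEM:

```python
import numpy as np

sample = np.array([4.0, 5.5, 6.1, 5.0, 4.8, 5.9, 6.3, 5.2])  # hypothetical scores

sem = sample.std(ddof=1) / np.sqrt(len(sample))  # SD / sqrt(N)
print(sem)

# With four times as many observations (same SD), the SEM halves,
# because SEM scales with 1 / sqrt(N)
print(sample.std(ddof=1) / np.sqrt(4 * len(sample)))
```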
The central limit theorem proposes that
as samples get large (usually defined as greater than 30), the sampling distribution has a normal distribution with a mean equal to the population mean, SD = SEM
The standard deviation of sample means is known as the
SEM (standard error of the mean)
A different approach to assess the accuracy of the sample mean as an estimate of the population mean, aside from SE, is to - (2)
calculate boundaries and range of values within which we believe the true value of the population mean value will fall.
Such boundaries are called confidence intervals.
Confidence intervals are created by
samples
A 95% confidence interval is constructed such that
95% of these intervals (created from repeated samples) will contain the population mean
95% Confidence interval for 100 samples (CI constructed for each) would mean
for 95 of these samples, the confidence intervals we constructed would contain the true value of the mean in the population.
Diagram shows- (4)
- Dots show the means for each sample
- Lines sticking out represent CIs for the sample means
- A vertical line down the plot would represent the population mean
- If confidence intervals don’t overlap then it shows significant difference between the sample means
In fact, for a specific confidence interval, the probability that it contains the population value is either - (2)
0 (it does not contain it) or 1 (it does contain it).
You have no way of knowing which it is.
if our sample means were normally distributed with a mean of 0 and a
standard error of 1, then the limits of our confidence interval
would be –1.96 and +1.96
95% of z scores fall between
-1.96 and 1.96
Confidence intervals can be constructed for any estimated parameter, not just
μ - mean
If the mean represents the true mean well, then the confidence interval of that mean should be
small
if the confidence interval is very wide then the sample mean could be very different from the true mean, indicating that it is a bad representation of the population
Remember that the standard error of the mean gets smaller with the number of observations and thus our confidence interval also gets
smaller - this makes sense, as the more we measure, the more certain we are that the sample mean is close to the population mean
Calculating confidence intervals for sample means - rearranging the z formula
LB = Mean - (1.96 * SEM)
UB = Mean + (1.96 * SEM)
The standard deviation of SAT verbal scores in a school system is known to be 100. A researcher wishes to estimate the mean SAT score and compute a 95% confidence interval from a random sample of 10 scores.
The 10 scores are: 320, 380, 400, 420, 500, 520, 600, 660, 720, and 780.
Calculate CI
* M = 530
* N = 10
* SEM = 100/ square root of 10 = 31.62
* Value of z for a 95% CI is the number of SDs one must go from the mean (in both directions) to contain 0.95 of the scores
* Value of 1.96 was found in z-table
* Since each tail is to contain 0.025 of the scores, you find the value of z below which 1 - 0.025 = 0.975 of the scores fall
* 95% of z scores lie between -1.96 and +1.96
* Lower limit = 530 - (1.96) (31.62) = 468.02
* Upper limit = 530 + (1.96)(31.62) = 591.98
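The same SAT example in a few lines of Python, using the known population SD of 100 as in the card:

```python
import math

scores = [320, 380, 400, 420, 500, 520, 600, 660, 720, 780]
mean = sum(scores) / len(scores)    # 530.0
sem = 100 / math.sqrt(len(scores))  # known sigma / sqrt(N), ~31.62

lower = mean - 1.96 * sem           # ~468.02
upper = mean + 1.96 * sem           # ~591.98
print(mean, (lower, upper))
```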
Think of test statistic capturing
signal/noise
What is a test statistic?
A statistic for which the frequency of particular values is known (t, F, chi-square), and thus we can calculate the
probability of obtaining a certain value, or p-value.
To test whether the model fits the data or whether our hypothesis is a good explanation of the data, we compare
systematic variation against unsystematic
If the probability (p-value) is less than or equal to the significance level, then
the null hypothesis is rejected; When the null hypothesis is rejected, the outcome is said to be “statistically significant”
If the probability (p-value) is greater than the significance level, the
null hypothesis is not rejected.
What is a type 1 error in terms of variance? - (2)
think the variance accounted for by the model is larger than the one unaccounted for by the model (i.e. there is a statistically significant effect but in reality there isn’t)
Type 1 is a false
positive
What is a type II error in terms of variance?
think there was too much variance unaccounted for by the model (i.e. there is no statistically significant effect but in reality there is)
Type II error is false
negative
Example of Type I and Type II error
Type I and Type II errors are mistakes we can make when testing the
fit of the model
Type 1 errors occur when we believe there is a genuine effect in the
population, when in fact there isn’t.
Acceptable level of type I error is usually
an a-level of 0.05
Type II error occurs when we believe there is no effect in the
population when, in reality, there is.
Acceptable probability of a Type II error is the
β-level (often 0.2)
An effect size is a standardised measure of
the size of an effect
Properities of effect size (3)
Standardized = comparable across studies
Not (as) reliant on the sample size
Allows people to objectively evaluate the size of observed effect.
Effect Size Measures
r = 0.1, d = 0.2 (small effect):
the effect explains 1% of the total variance.
Effect size measures
r = 0.3, d = 0.5 (medium effect) means
the effect accounts for 9% of the total variance.
Effect size measures
r = 0.5, d = 0.8 (large effect)
effect accounts for 25% of the variance
Beware of the ‘canned’ effect sizes (e.g., r = 0.5, d = 0.8 and rest) since the size of
effect should be placed within the research context.
We should aim to achieve a power of
.8, or an 80% chance of detecting
an effect if one genuinely exists.
When we fail to reject the null hypothesis, it is either that there truly are no difference to be found,
OR
it may be because we do not have enough statistical power
Power is the probability of
correctly rejecting a false H0 OR the ability of the test to find an effect assuming there is one in the population,
Power is calculated by
1 - β, where β is the probability of making a Type II error
To increase the statistical power of a study you can increase
your sample size
Factors affecting the power of the test: (4):
- Probability of a Type I error or a-level [the level at which we decide an effect is significant - the p-value] –> a bigger [more lenient] alpha means more power
- True alternative hypothesis H1 [effect size] (degree of overlap; less overlap means more power) - if you find a large effect in the literature then you have a better chance of detecting something
- The sample size [N] –> the bigger the sample, the less the noise and the more power
- The particular tests to be employed - parametric tests have greater power to detect a significant effect since they are more sensitive
How do researchers calculate the number of pps they need for a reasonable chance of correctly rejecting the null hypothesis?
Sample size calculation at a desired level of power (usually power set to 0.8 in formula)
With power, we can do 2 things - (2)
- Calculate power of test
- Calculate sample size necessary to detect an decent effect size and achieve a certain level of power based on past research
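One way to do both calculations in Python is statsmodels' power module; a minimal sketch for an independent-samples t-test (the effect size and settings are illustrative):

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Sample size per group needed to detect a medium effect (d = 0.5)
# with alpha = .05 and power = .8
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(n_per_group)  # ~64 per group

# Achieved power for a fixed sample of 40 per group
print(analysis.solve_power(effect_size=0.5, alpha=0.05, nobs1=40))  # ~0.6
```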
Diagram of Type I error, Type II error, power and making correct decisions - (4)
Type 1 error p = alpha
Type II error p = beta
Accepting the null hypothesis when it is correct - p = 1 - alpha
Accepting the alternative hypothesis when it is correct - p = 1 - beta
If there is less overlap between H0 and H1 then
the bigger difference means higher power, and we can correctly reject the null hypothesis more often than with distributions that overlap more
If the distributions of H0 and H1 are narrower then
This means that the overlap in distributions is smaller and the power is therefore greater, but this time because of a smaller standard error of our estimate of the means.
Most people want to assess how many participants they need to test to have a reasonable chance of correctly rejecting the null hypothesis (the Power). This formula shows - (2)
us how.
We usually set the power to 0.8.
What are z scores? - (2)
A measure of variability:
The number of standard deviations a particular data point is from the population mean
Z-scores are a standardised measure, hence they ignore measurement units
Why should we care about z scores? - (2)
Z-scores allow researchers to calculate the probability of a score occurring within a standard normal distribution
Enables us to compare two scores that are from different samples (which may have different means and standard deviations)
Diagram of finding percentile of Trish
Trish takes a test and gets 25
Mean of the class is 20
SD = 4
(25 - 20)/4 = 1.25
Z-score = 1.25
Let's say Trish takes a test and scores 25 and the mean is 20. You may calculate the z-score to be 1.25. You would use a z-score table to see what percentile she would be in (marked in red): to read the table you would go down to the value 1.2 and across to 0.05, which totals 1.25, and you can see about 89.4% of other students performed worse.
Diagram of z score and percentile
Josh takes a different test and gets 1150
Mean of the class is 1000
SD = 150
(1150 – 1000)/150 = 1.0
Z score = 1.0
Who performed better Trish or Josh?
Trish had z score of 1.25
We would use our table and look down the column to a z-score of 1 and across to the 0.00 column (in purple) and we can see 84.1% of students performed worse than Josh so Trish performed better than Josh.
Diagram of z scores and normal distribution - (3)
68% of scores are within 1 SD of the mean,
95% are within 2 SDs and
99.7% are within 3 SDs.
What's standard error?
: by taking into account the variability and size of our sample we can estimate how far away from the real population mean our mean is!
If we took infinite samples from the population, 95% of the time the population mean will lie within the
95% confidence interval range
What does narrow CI represent?
high statistical power
Wide CIs represent?
low statistical power
Power being the probability of catching a real effect (as opposed to
missing a real effect – Type II error)
We can never say the null hypothesis is
FALSE (or TRUE).
The p-value or calculated probability is the estimated probability of
us finding an effect when the null hypothesis (H0) is true.
p = probability of observing a test statistic at least as big as the one we have if the
H0 is true
Hence, a significant p value (p <.05) tells us that there is a less than 5% chance of getting a test statistic that is
larger than the one we have found if there were no effect in the population (e.g. the null hypothesis were true)
Statistical signifiance does not equal importance - (2)
p = .049, p = .050 are essentially the same thing- the former is ‘statistically significant’.
Importance is dependent upon the experimental design/aims: e.g., A statistically significant weight increase of 0.1Kg between two adults experimental groups may be less important than the same increase between two groups of babies.
'Children can learn a second language faster before the age of 7'. Is this statement:
A. One-tailed
B. A non scientific
C. Two-tailed
D. Null hypothesis
A, as one-tailed is directional and two-tailed is non-directional
Which of the following is true about a 95% confidence interval of the mean:
A. 95 out of 100 CIs will contain the population mean
B. 95 out of 100 sample means will fall within the limits of the confidence interval.
C. 95% of population means will fall within the limits of the confidence interval.
D. There is a 0.05 probability that the population mean falls within the limits of the confidence interval.
A as If we’d collected 100 samples, calculated the mean and then calculated a confidence interval for that mean, then for 95 of these samples the confidence intervals we constructed would contain the true value of the mean in the population
What does a significant test statistic tell us?
A. That the test statistic is larger than we would expect if there were no effect in the population.
B. There is an important effect.
C. The null hypothesis is false.
D. All of the above.
A, and just because a test statistic is significant does not mean it is an important effect
Of what is p the probability?
(Hint: NHST relies on fitting a ‘model’ to the data and then evaluating the probability of this ‘model’ given the assumption that no effect exists.)
A.p is the probability of observing a test statistic at least as big as the one we have if there were no effect in the population (i.e., the null hypothesis were true).
B. p is the probability that the results are due to chance, the probability that the null hypothesis (H0) is true.
C. p is the probability that the results are not due to chance, the probability that the null hypothesis (H0) is false
D. p is the probability that the results would be replicated if the experiment was conducted a second time.
A
A Type I error occurs when:
(Hint: When we use test statistics to tell us about the true state of the world, we’re trying to see whether there is an effect in our population.)
A. We conclude that there is an effect in the population when in fact there is not.
B. We conclude that there is not an effect in the population when in fact there is.
C. We conclude that the test statistic is significant when in fact it is not.
D. The data we have typed into SPSS is different from the data collected.
A as If we use the conventional criterion then the probability of this error is .05 (or 5%) when there is no effect in the population
True or false?
a. Power is the ability of a test to detect an effect given that an effect of a certain size exists in a population.
TRUE
True or False?
We can use power to determine how large a sample is required to detect an effect of a certain size.
TRUE
True or False?
c. Power is linked to the probability of making a Type II error.
TRUE
True or False?
d. The power of a test is the probability that a given test is reliable and valid.
FALSE
What is the relationship between sample size and the standard error of the mean?
(Hint: The law of large numbers applies here: the larger the sample is, the better it will reflect that particular population.)
A. The standard error decreases as the sample size increases.
B. The standard error decreases as the sample size decreases.
C. The standard error is unaffected by the sample size.
D. The standard error increases as the sample size increases.
A. The standard error (which is the standard deviation of the distribution of sample means), defined as σ_x̄ = σ/√N, decreases as the sample size (N) increases and vice versa
What is the null hypothesis for the following question: Is there a relationship between heart rate and the number of cups of coffee drunk within the last 4 hours?
A. There will be no relationship between heart rate and the number of cups of coffee drunk within the last 4 hours.
B. People who drink more coffee will have significantly higher heart rates.
C. People who drink more cups of coffee will have significantly lower heart rates.
D. There will be a significant relationship between the number of cups of coffee drunk within the last 4 hours and heart rate
A The null hypothesis is the opposite of the alternative hypothesis and so usually states that an effect is absent
A Type II error occurs when :
(Hint: This would occur when we obtain a small test statistic (perhaps because there is a lot of natural variation between our samples.)
A. We conclude that there is not an effect in the population when in fact there is.
B. We conclude that there is an effect in the population when in fact there is not.
C. We conclude that the test statistic is significant when in fact it is not.
D. The data we have typed into SPSS is different from the data collected.
A A Type II error would occur when we obtain a small test statistic (perhaps because there is a lot of natural variation between our samples)
In general, as the sample size (N) increases:
A. The confidence interval gets narrower.
B. The confidence interval gets wider.
C. The confidence interval is unaffected.
D. The confidence interval becomes less accurate
A
Which of the following best describes the relationship between sample size and significance testing?
(Hint: Remember that test statistics are basically a signal-to-noise ratio, so given that large samples have less ‘noise’ they make it easier to find the ‘signal’.)
A. In large samples even small effects can be deemed ‘significant’.
B. In small samples only small effects will be deemed ‘significant’.
C. Large effects tend to be significant only in small samples.
D. Large effects tend to be significant only in large samples.
A
The assumption of homogeneity of variance is met when:
A. The variances in different groups are approximately equal.
B. The variances in different groups are significantly different.
C. The variance across groups is proportional to the means of those groups.
D. The variance is the same as the interquartile range.
A - To make sure our estimates of the parameters that define our model and significance tests are accurate we have to assume homoscedasticity (also known as homogeneity of variance)
Next, the lecturer was interested in seeing whether males and females reacted differently to the different teaching methods.
Produce a clustered bar graph showing the mean scores of teaching method for males and females.
(HINT: place TeachingMethod on the X axis, Exam Score on the Y axis, and Gender in the ‘Cluster on X’ box. Include 95% confidence intervals in the graph).
Which of the following is the most accurate interpretation of the data?
A. Females performed better than males in both the reward and indifferent conditions. Regarding the confidence intervals, there was a large degree of overlap between males and females in all conditions of the teaching method.
B. Males performed better than females in the reward condition, and females performed better than males in the indifferent condition. Regarding the confidence intervals, there was no overlap between males and females across any of the conditions of teaching method.
C. Males performed better than females in all conditions. Regarding the confidence intervals, there was a small degree of overlap between males and females for the reward and indifferent conditions, and a large degree of overlap between males and females for the punish condition.
D. Males performed better than females in the reward condition, and females performed better than males in the indifferent condition. Regarding the confidence intervals, there was a small degree of overlap between males and females for the reward and indifferent conditions, and a large degree of overlap between males and females for the punish condition.
D
Produce a line graph showing the change in mean anxiety scores over the three time points.
NOTE: this is a repeated measures (or within subjects) design, ALL participants took part in the same condition.
Which of the following is the correct interpretation of the data?
A. Mean anxiety increased across the three time points.
B. Mean anxiety scores were reduced across the three time points, and there was a slight acceleration in this reduction between the middle and end of the course.
C. Mean anxiety scores were reduced across the three time points, though this reduction slowed down between the middle and end of the course.
D. Mean anxiety scores did not change across the three time points.
B
A general approach in regression is that our outcomes can be predicted by a model and what remains
is the error
The i in the general model in regression shows
e.g., outcome 1 is equal to model plus error 1 and outcome 2 is equal to model plus error 2 and so on…
For correlation, the outcome is modelled by
scaling (multiplying by a constant) another variable
Equation of correlation model
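In the notation used above: outcomeᵢ = (b × xᵢ) + errorᵢ, where b is the scaling constant applied to the other variable.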
If you have continuous variables which meet the assumptions of parametric tests then you can conduct a
Pearson correlation or regression
Variance is a feature of outcome measurements we have obtained and we want to predict with a model in correlation/regression that…
captures the effect of the predictor variables we have manipulated or measured
Variance of a single variable represents the
average amount that the data vary from the mean
Variance is the standard deviation
squared (s squared)
Variance formula - (2)
for each participant, take xᵢ minus the mean of all participants' scores and square it
sum these values across participants (sigma) and divide by the total number of participants minus 1
Variance is SD squared meaning that it captures the
average of the squared difference the outcome values from the mean of all outcomes (explaining what the formula of variance does)
Covariance gathers information on whether
one variable covaries with another
In covariance, if we are interested in whether 2 variables are related, then we are interested in whether changes in one variable are met with changes in the other,
therefore… - (2)
when one variable deviates from its mean we
would expect the other variable to deviate from its mean in a similar way.
So, if one variable increases then the other, related variable, should also increase or even decrease at the same level.
If one variable covaries with another variable then it means these 2 variables are
related
To get SD from variance then you would
square root variance
What would you do in the covariance formula, in words? - (5)
- Calculate the error between the mean and each subject’s score for the first variable (x).
- Calculate the error between the mean and their score for the second variable (y).
- Multiply these error values.
- Add these values and you get the product deviations.
- The covariance is the average of the product deviations (sketched in code below)
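Those five steps as a minimal Python sketch (the x and y values are made up):

```python
x = [1, 2, 3, 4, 5]  # hypothetical first variable
y = [2, 4, 5, 4, 7]  # hypothetical second variable

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Multiply each pair of deviations from the means, then sum them
cross_products = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Sample covariance: divide by N - 1, as in the variance formula earlier
covariance = cross_products / (len(x) - 1)
print(covariance)  # 2.5 here - positive, so x and y tend to rise together
```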
Example of calculating covariance - what does the answer tell you?
The answer is positive: that tells us the x and y values tend to rise together.
What does each element of covariance formula stand for? - (5)
X = the value of ‘x’ variable
Y = the value of ‘y’ variable
X̄ = mean of 'x' - e.g., green
Ȳ = mean of 'y' - e.g., blue
n = the number of items in the data set
covariance will be large when values below
the mean for one variable are paired with values below the mean for the other (and values above with values above)
What does a positive covariance indicate?
as one variable deviates from the mean, the other
variable deviates in the same direction.
What does negative covariance indicate?
a negative covariance indicates that as one variable deviates from the mean (e.g. increases), the other deviates from the mean in the opposite direction (e.g. decreases).
What is the problem of covariance as a measure of the relationship between 2 variables? - (5)
dependent upon the units/scales of measurement used
So covariance is not a standardised measure
e.g., if 2 variables are measured in miles and the covariance is 4.25, then if we convert the data to kilometres we have to calculate the covariance again, and we would see it increase to 11.
Dependence on the scale of measurement is a problem as we cannot compare covariances in an objective way –> we cannot say whether a covariance is large or small relative to another dataset unless both datasets were measured in the same units
So we need to STANDARDISE it.
What is the process of standardisation?
To overcome the problem of dependence on the measurement scale, we need to convert
the covariance into a standard set of units
How to standardise the covariance?
dividing by product of the standard deviations of both variables.
Formula of standardising covariance
Same formula as covariance but divided by the product of the SD of x and the SD of y
Formula of Pearson’s correlation coefficient, r
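r = cov(x, y) / (sₓ × sᵧ): the covariance divided by the product of the standard deviations of x and y.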
Example of calculating Pearson's correlation coefficient, r - (5)
The standard deviation for the number of adverts watched (sx) was 1.67,
and the SD of the number of packets of crisps bought (sy) was 2.92.
If we multiply these together we get 1.67 × 2.92 = 4.88.
Now, all we need to do is take the covariance, which we calculated a few pages ago as being 4.25, and divide by these multiplied standard deviations.
This gives us r = 4.25/4.88 = .87.
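The same calculation can be checked with numpy; the adverts/crisps values below are hypothetical, chosen only to show the calls:

```python
import numpy as np

adverts = np.array([5, 4, 4, 6, 8])    # hypothetical number of adverts watched
crisps = np.array([8, 9, 10, 13, 15])  # hypothetical packets of crisps bought

cov = np.cov(adverts, crisps, ddof=1)[0, 1]  # sample covariance
r = cov / (adverts.std(ddof=1) * crisps.std(ddof=1))

print(r)
print(np.corrcoef(adverts, crisps)[0, 1])  # same value, computed directly
```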
The standardised version of covariance is the
correlational coefficient or Pearson’s r
Pearson's R is the … version of covariance, meaning it is independent of units of measurement
standardised
What does correlation describe? - (2)
Describes a relationship between variables
If one variable increases, what happens to the other variable?
Pearson’s correlation coefficient r was also called the
product-moment correlation
A linear relationship, normally distributed data and interval/ratio, continuous data are assumed in
Pearson’s r correlation coefficient
Pearson Correlation Coefficient varies between
-1 and +1 (direction of relationship)
The larger the Pearson's correlation coefficient R value, the closer the values will
be with each other and the mean
The smaller Pearson's correlation coefficient R values indicate
there is unexplained variance in the data and results in the data points being more spread out.
What does these two graphs show? - (2)
- example of high negative correlation. The data points are close together and are close to the mean.
- On the other hand, the graph on the right shows a low positive correlation. The data points are more spread out and deviate more from the mean.
The Pearson Correlation Coefficient measures the strength of a relationship
between one variable and another hence its use in calculating effect size
A Pearson’s correlation coefficient of +1 indicates
two variables are perfectly positively correlated, so as one variable increases, the other increases by a
proportionate amount.
A Pearson’s correlation coefficient of -1 indicates
a perfect negative relationship: if one variable increases, the other decreases by a proportionate amount.
Pearson’s r
+/- 0.1 means
small effect
Pearson’s r
+/- 0.3 means
medium effect
Pearson’s r
+/- 0.5 means
large effect
In Pearson’s correlation, we can test the hypothesis that - (2)
correlation coefficient is different from zero
(i.e., different from ‘no relationship’)
In Pearson’s correlation coefficient, we can test the hypothesis that the correlation is different from 0
If we find our observed coefficient was very unlikely to happen if there was no effect in the population, then we gain confidence that the relationship we have observed is statistically meaningful.
In the case of a correlation coefficient we can test the hypothesis that the correlation is different from zero (i.e. different from 'no relationship').
There are 2 ways to test this hypothesis
- Z scores
- T-statistic
Confidence intervals tell us something about the
likely correlation in the population
We can calculate confidence intervals for Pearson's correlation coefficient by transforming the formula for the CI
As sample size increases, so the value of r at which a significant result occurs
decreases, e.g., with n = 20 p is not < 0.05, but at 200 pps it is p < 0.05
Pearson’s r = 0 means - (2)
indicates no linear relationship at all
so if one variable changes, the other stays the same.
Correlation coefficients give no indication of direction of… + example - (2)
causality
e.g., although we conclude the number of adverts relates to the number of toffees bought, we can't say watching adverts caused us to buy toffees
We have to be cautious about causality in terms of Pearson's correlation r as - (2)
- Third variable problem - causality between variables cannot be assumed in any correlation
- Direction of causality: correlation coefficients give no indication of which variable causes the other to change
If you have a weak correlation between 2 variables (a weak effect), then you need to take a lot of measurements for that relationship to be
significant
R correlation coefficient gives the ratio of
covariance to a measure of variance
Example of correlations getting stronger
R squared is known as the
coefficient of determination
R^2 can be used to explain the
proportion of the variance in a dependent variable (outcome) that's explained by an independent variable (predictor)
Example of R^2 coefficient of determination - (2)
X = exam anxiety
Y = exam performance
If R^2 = 0.194
19.4% of the variability in exam performance can be explained by exam anxiety
('the variance in y accounted for by x')
R^2 calculates the amount of shared
variance
Example of r and R^2
e.g., if r = 0.1, then R^2 = 0.1 × 0.1 = 0.01
R^2 gives you the true strength of...
the correlation, but without an indication of its direction.
What are the three types of correlations? - (3)
- Bivariate correlations
- Partial correlations
- Semi-partial or part correlations
What's a bivariate correlation?
a relation between 2 variables
What is a partial correlation?
looks at the relationship between two variables while ‘controlling’ the effect of one or more additional variables.
The partial correlation partials out the
effect of one or more variables on both X and Y
A partial correlation controls for third variable which is made from - (3)
- A correlation calculates each data points distance from line (residuals)
- This is the error relative to the model (unexplained variance)
- A third variable might predict some of that variation in residuals
The partial correlation compares the unique variation of one variable with the
filtered (unique) variation of the other
The partial correlation holds the
third variable constant (but we don’t manipulate these)
Example of partial correlation- (2)
For example, when studying the effect of a diet, the level of exercise might also influence weight loss
We want to know the unique effect of diet, so we need to partial out the effect of exercise
Example of Venn Diagram of Partial Correlation - (2)
Partial Correlation between IV1 and DV = D / (D + C)
Unique variance accounted for by the predictor (IV1) in the DV, after accounting for variance shared with other variables.
Example of Partial Correlation - (2)
Partial correlation: Purple / (Red + Purple)
If we were doing just a partial correlation, we would see how much exam anxiety is influencing both exam performance and revision time.
Example of partial correlation and semi-partial correlation - (2)
The partial correlation that we calculated took
account not only of the effect of revision on exam performance, but also of the effect of revision on anxiety.
If we were to calculate the semi-partial correlation for the same data, then this would control for only the effect of revision on exam performance (the effect of revision
on exam anxiety is ignored).
In partial correlation, the third variable is typically not considered as the primary independent or dependent variable. Instead, it functions as a
control variable—a variable whose influence is statistically removed or controlled for when examining the relationship between the two primary variables (IV and DV).
The partial correlation is
The amount of variance the variable explains
relative to the amount of variance in the outcome that is left to explain after the contribution of other predictors has been removed from both the predictor and the outcome.
These partial correlations can be done when variables are dichotomous (including third variable) e.g., - (2)
we could look at the relationship between bladder relaxation (did the person wet themselves or not?) and the number of large tarantulas crawling up your leg controlling for fear of spiders
(the first variable is dichotomous, but the second variable and ‘controlled for’ variables are continuous).
What does this partial correlation output show?
Revision time = control variable (its effect is partialled out)
Exam performance = DV
Exam anxiety = X - (5)
- First, notice that the partial correlation between exam performance and exam anxiety is −.247, which is considerably less than the correlation when the effect of
revision time is not controlled for (r = −.441).
- Although this correlation is still statistically significant (its p-value is still below .05), the relationship is diminished.
- value of R2 for the partial correlation is .06, which means that exam anxiety can now account for only 6% of the variance in exam performance.
- When the effects of revision time were not controlled for, exam anxiety shared 19.4% of the variation in exam scores and so the inclusion of revision time has severely diminished the amount of variation in exam scores shared by anxiety.
- As such, a truer measure of the role of exam anxiety has been obtained.
Partial correlations are most useful for looking at the unique
relationship between two variables when
other variables are ruled out
In a semi-partial correlation we control for the
effect that
the third variable has on only one of the variables in the correlation
The semi-partial (part) correlation partials out the - (2)
effect of one or more variables on only X or only Y.
e.g. the amount revision explains exam performance after the contribution of anxiety has been removed from just one variable (usually the predictor, e.g. revision).
The semi-partial correlation compares the
unique variation of one variable with the unfiltered variation of the other.
Venn diagram of the semi-partial correlation - (2)
- Semi-partial correlation between IV1 and DV = D / (D + C + F + G)
Unique variance accounted for by the predictor (IV1) in the DV, after accounting for variance shared with other variables.
Diagram of exam performance, exam anxiety and revision time for the semi-partial correlation - (2)
- Purple / (Red + Purple + White + Orange)
- When we use semi-partial correlation to look at this relationship, we partial out the variance accounted for by exam anxiety (the orange bit) and look for the variance explained by revision time (the purple bit).
Summary of partial correlation and semi-partial correlation - (2)
A partial correlation quantifies the relationship between two variables while accounting for the effects of a third variable on both variables in the original correlation.
A semi-partial correlation quantifies the relationship between two variables while accounting for the effects of a third variable on only one of the variables in the original correlation.
Pearson's product-moment correlation coefficient (described earlier) and Spearman's rho (see section 6.5.3) are examples
of bivariate correlation coefficients.
Non-parametric tests of correlation are... (2)
- Spearman's rho
- Kendall's tau test
In Spearman's rho the variables are not normally distributed and measures are on an
ordinal scale (e.g., grades)
Spearman's rho works by
first ranking the data (numbers converted into ranks), and then running Pearson's r on the ranked data
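A minimal sketch of this equivalence on simulated data (SciPy assumed): Spearman's rho is identical to Pearson's r computed on the ranks.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(size=30)
y = x ** 3 + rng.normal(size=30, scale=0.5)  # monotonic but non-linear

rho, p = stats.spearmanr(x, y)
# equivalent: Pearson's r on the rank-transformed data
r_on_ranks, _ = stats.pearsonr(stats.rankdata(x), stats.rankdata(y))
print(rho, r_on_ranks)  # the two values agree
```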
Spearman's correlation coefficient, rs, is a non-parametric statistic and so can be used when the
data have violated parametric assumptions, such as non-normally distributed data
Spearman's correlation coefficient is sometimes called
Spearman's rho
For Spearman's rs we can get R squared, but it is interpreted slightly differently, as the
proportion of
variance in the ranks that two variables share.
Kendall's tau is used rather than Spearman's coefficient - (2)
when you have a small data set with a large number of
tied ranks.
This means that if you rank all of the scores and many scores have the same rank, then Kendall's tau should be used
Kendall’s tau test - (2)
For small datasets, many tied ranks
Better estimate of correlation in population than Spearman’s ρ
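A minimal sketch on a small hypothetical data set with many tied ranks (SciPy assumed):

```python
from scipy import stats

# small hypothetical data set with many tied ranks
x = [1, 2, 2, 2, 3, 3, 4]
y = [2, 2, 3, 3, 3, 4, 4]

tau, p_tau = stats.kendalltau(x, y)
rho, p_rho = stats.spearmanr(x, y)
print(tau, rho)  # tau is noticeably smaller than rho
```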
Kendall’s tau is not numerically similar to r or rs (spearman) and so tau squared does not tell us about
proportion of
variance shared by two variables (or the ranks of those two variables).
Kendall's tau is 66-75% smaller than both Spearman's rs and Pearson's r, so
tau is not comparable to r and rs
There is a benefit to using Kendall's statistic rather than Spearman's, as it has been shown that - (2)
Kendall's statistic is actually a better estimate of the correlation in the population
we can draw more accurate generalizations from Kendall's statistic than from Spearman's.
What's the decision tree for Spearman's correlation? - (4)
- What type of measurement = continuous
- How many predictor variables = one
- What type of predictor variable = continuous
- Meets assumptions of parametric tests = No
The output of Kendall and Spearman can be interpreted the same way as
Pearson’s correlation coefficient r output box
The biserial and point-biserial correlation coefficients are used when
one of the two variables is dichotomous (e.g., pregnant or not pregnant)
What is the difference between biserial and point-biserial correlations?
depends on whether the dichotomous variable is discrete or continuous
The point–biserial correlation coefficient (rpb) is used when
one variable is a
discrete dichotomy (e.g. pregnancy),
biserial correlation coefficient (rb) is used
when - (2)
one variable is a continuous dichotomy (e.g. passing or failing an exam).
e.g. An example is passing or failing a statistics test: some people will only just fail while others will fail by
a large margin; likewise some people will scrape a pass while others will clearly excel.
Example of when the point-biserial correlation is used - (3)
- Imagine we are interested in the relationship between the gender of a cat and how much time it spends away from home
- Time spent away is measured at the interval level -> meets the assumptions of parametric data
- Gender is a discrete dichotomous variable coded with 0 for male and 1 for female
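A minimal sketch of this example with hypothetical data (SciPy assumed; scipy.stats.pointbiserialr is equivalent to Pearson's r with a 0/1 predictor):

```python
from scipy import stats

# hypothetical data: gender coded 0 = male, 1 = female (discrete dichotomy)
gender    = [0, 0, 0, 0, 1, 1, 1, 1]
time_away = [5.2, 6.1, 5.8, 6.5, 2.1, 3.3, 2.8, 3.0]  # hours away from home

r_pb, p = stats.pointbiserialr(gender, time_away)
print(r_pb, p)  # same result as Pearson's r on the 0/1 coding
```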
Can convert the point-biserial correlation coefficient into the
biserial correlation coefficient
Point-biserial and biserial correlations differ in size as
the biserial correlation is bigger than the point-biserial
Example of a question conducting Pearson's r - (4)
The researcher was interested in whether the amount someone gets paid and the amount of holiday they take from work are related to their productivity at work
- Pay: Annual salary
- Holiday: Number of holiday days taken
- Productivity: Productivity rating out of 10
Example of Pearson’s r scatterplot :
relationship between pay and productivity
If we have r = 0.313 what effect size is it?
medium effect size
±.1 = small effect
±.3 = medium effect
±.5 = large effect
What does this scatterplot show?
- This indicates very little correlation between the 2 variables
What will a matrix scatterplot show?
the relationship between all possible combinations of your variables
What does this scatterplot matrix show? - (2)
- For pay and holiday, we can see the line is very flat, indicating the correlation between the two variables is quite low
- For pay and productivity, the line is steeper, suggesting the correlation is fairly substantial between these 2 variables; the same goes for holidays and productivity
What is degrees of freedom for correlational analysis?
N-2
What does this Pearson's correlation r output show? - (4)
- The relationship between pay and holidays is a very low correlation of r = -0.04
- Between pay and productivity, there is a medium-sized correlation of r = 0.313
- Between holidays and productivity there is a medium-to-large effect size of 0.435
- The relationships between pay and productivity and between holidays and productivity are significant, but the correlation between pay and holidays was not significant
Another example of a Pearson's correlation r question - (3)
A student was interested in the relationship between the time spent preparing an essay, the interestingness of the essay topic and the essay mark received.
He got 45 of his friends and asked them to rate, using a scale from 1 to 7, how interesting they thought the essay topic was (1 - I’ll kill myself of boredom, 4 - it’s not too bad!, 7 - it’s the most interesting thing in the world!) (interesting).
He then timed how long they spent writing the essay (hours), and got their percentage score on the essay (essay).
Example of the interval/ratio continuous data needed for Pearson's r for the IV and DV - (2)
- Interval scale: the difference between 10 and 20 degrees C is the same as between 80 and 90 degrees F, but 0 degrees does not mean an absence of temperature
- Ratio: height and weight (0 cm really does mean no height), time
Pearson's correlation r, Spearman and Kendall require
one IV and one DV
What does this SPSS output show?
A. There was a non-significant positive correlation between interestingness of topic and the amount of time spent writing. There was a non-significant positive correlation between time spent writing an essay and essay mark
There was a significant positive correlation between interestingness of topic and essay mark, with a medium effect size
B. There was a significant positive correlation between interestingness of topic and the amount of time spent writing, with a small effect size.There was a significant positive correlation between time spent writing an essay and essay mark, with a large effect size. .There was a non-significant positive correlation between interestingness of topic and essay mark
C. There was a significant negative correlation between interestingness of topic and the amount of time spent writing, with a medium effect size.. There was a non-significant positive correlation between time spent writing an essay and essay mark. There was a non-significant positive correlation between interestingness of topic and essay mark
D. There was a significant positive correlation between interestingness of topic and the amount of time spent writing, with a large effect size. There was a non-significant positive correlation between time spent writing an essay and essay mark There was a non-significant positive correlation between interestingness of topic and essay mark
Answer: D
r = 0.21 effect size is..
in between small and medium effect
Effect size is only meaningful if you evaluate it with regard to
your own research area
Biserial correlation is when
one variable is dichotomous, but there is an underlying continuum (e.g. pass/fail on an exam)
Point-biserial correlation is when
one variable is dichotomous, and it is a true dichotomy (e.g. pregnancy)
Example of a dichotomous relationship - (3)
- Gender is an example of a true dichotomy.
- We can compare the differences in height between males and females.
- Use the dichotomous predictor of gender
What is the decision tree for multiple regression? - (4)
- Continuous
- Two or more predictors that are continuous
- Multiple regression
- Meets assumptions of parametric tests
Multiple regression is the same as simple linear regression except that - (2)
for every extra predictor you include, you have to add a coefficient;
so, each predictor variable has its own coefficient, and the outcome variable is predicted from a combination of all the variables multiplied by their respective coefficients, plus a residual term
Multiple regression equation: Yi = b0 + b1X1i + b2X2i + ... + bnXni + εi
In the multiple regression equation, list all the terms - (6)
- Y is the outcome variable,
- b0 is the intercept (the constant),
- b1 is the coefficient of the first predictor (X1),
- b2 is the coefficient of the second predictor (X2),
- bn is the coefficient of the nth predictor (Xn),
- εi is the difference between the predicted and the observed value of Y for the ith participant.
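A minimal sketch of fitting this equation by least squares on simulated data (NumPy assumed; SPSS does this internally):

```python
import numpy as np

# hypothetical data: two predictors and one outcome
rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=(2, 100))
y = 2.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(size=100)

# design matrix: a column of 1s for b0 plus one column per predictor
X = np.column_stack([np.ones(100), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares [b0, b1, b2]
residuals = y - X @ b                      # the epsilon_i terms
print(b)
```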
Multiple regression uses the same principle as linear regression in a way that
we seek to find the linear combination of predictors that correlate maximally with the outcome variable.
Regression is a way of predicting things that you have not measured: predicting
an outcome variable from one or more predictor variables
We can't plot multiple regression (as shown here in 3D)
for more than 2 predictor (X) variables
If you have two predictors that overlap and correlate a lot, then it is a ... model
bad model - the predictors can't uniquely explain the outcome
In Hierarchical regression, we are seeing whether
one model explains significantly more variance than the other
In hierarchical regression predictors are selected based on
past work and the experimenter
decides in which order to enter the predictors into the model
As a general rule for hierarchical regression, - (3)
known predictors (from other research) should be entered into the model first in order of their importance in predicting the outcome.
After known predictors have been entered, the
experimenter can add any new predictors into the model.
New predictors can be entered either all in one go, in a stepwise manner, or hierarchically (such that the new predictor
suspected to be the most important is entered first).
Example of hierarchical regression in terms of album sales - (2)
The first model allows all the shared variance between Ad budget and Album sales to be accounted for.
The second model then only has the option to explain more variance by the unique contribution from the added predictor Plays on the radio.
What is forced entry MR?
method in which all predictors are forced
into the model simultaneously.
Like HR, forced entry MR relies on
good theoretical reasons for including the chosen predictors,
Different from HR, forced entry MR
makes no decision about the order in which variables are entered.
Some researchers believe that forced entry MR
is the only appropriate method for theory testing, because stepwise techniques are influenced by random variation in the data and so rarely give replicable results if the model is retested.
Why select collinearity diagnostics in the Statistics box for multiple regression? - (2)
This option is for obtaining collinearity statistics such as the
VIF and tolerance
and for checking the assumption of no multicollinearity
Multicollinearity poses a problem only for multiple regression because
simple regression requires only one predictor.
Perfect collinearity exists in multiple regression when at least
two predictors are perfectly correlated, e.g., they have a correlation coefficient of 1
If there is perfect collinearity in multiple regression between predictors it
becomes impossible
to obtain unique estimates of the regression coefficients because there are an infinite number of combinations of coefficients that would work equally well.
The good news is that perfect collinearity in multiple regression is rare in
real-life data
If two predictors are perfectly correlated in multiple regression, then the values of b for each variable are
interchangeable
As collinearity increases in multiple regression, three problems arise - (3)
- Untrustworthy bs
- It limits the size of R
- The importance of predictors becomes hard to assess
One way of identifying multicollinearity in multiple regression is to scan a
correlation matrix of all of the predictor
variables and see if any correlate very highly (by very highly I mean correlations above .80
or .90)
The VIF indicates in multiple regression whether a
predictor has a strong linear relationship with the other predictor(s).
If the VIF statistic is above or approaching 10 in multiple regression, then what you would want to do is - (2)
look at your variables to see whether all of them need to go in the model
if there is a high correlation between 2 predictors (measuring the same thing), decide whether it's important to include both variables or to take one out and simplify the regression model
Related to the VIF in multiple regression is the tolerance
statistic, which is its
reciprocal (1/VIF) = the inverse of the VIF
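A minimal sketch of how VIF and tolerance are computed (NumPy assumed; hypothetical data layout): each predictor is regressed on the others, and VIF_j = 1 / (1 − R²_j).

```python
import numpy as np

def vif_and_tolerance(X):
    """X: (n, p) matrix of predictors (no intercept column)."""
    n, p = X.shape
    vifs = []
    for j in range(p):
        # regress predictor j on all the other predictors (plus intercept)
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        b, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ b
        r2 = 1 - resid.var() / X[:, j].var()
        vifs.append(1 / (1 - r2))
    vifs = np.array(vifs)
    return vifs, 1 / vifs  # tolerance is the reciprocal of VIF
```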
In Plots in SPSS for multiple regression, you put - (2)
ZRESID on Y and ZPRED on X
a plot of residuals against predicted values, to assess homoscedasticity
What is ZPRED in MR? - (2)
(the standardized predicted values of the dependent variable based on the model).
These values are standardized forms of the values predicted by the model.
What is ZRESID in MR? - (2)
(the standardized residuals, or errors).
These values are the standardized differences between the observed data and the values that the model predicts.
SPSS in multiple linear regression gives descriptive outputs, which are - (2)
- basic means and also a table of correlations between variables.
- This is a first opportunity to determine whether there is high correlation between predictors, otherwise known as multicollinearity
The model summary in SPSS captures how the model or models explain, in MR,
variance in terms of R squared, and more importantly how R squared changes between models and whether those changes are significant.
Diagram of model summary
What is the measure of R^2 in multiple regression
measure of how much of the variability in the outcome is accounted for
by the predictors
The adjusted R^2 gives us an estimate of in multiple regression
fit in the general population
The Durbin-Watson statistic, if specified in multiple regression, tells us whether the - (2)
assumption of independent errors is tenable (values less than 1 or greater than 3 raise alarm bells)
the closer the value to 2 the better = assumption met
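A minimal sketch of the Durbin-Watson calculation (NumPy assumed; SPSS reports this for you):

```python
import numpy as np

def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals."""
    e = np.asarray(residuals)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# independent errors give a value near 2
print(durbin_watson(np.random.default_rng(0).normal(size=200)))
```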
SPSS output for MR = ANOVA table which performs
F-tests for each model
SPSS output for MR contains an ANOVA that tests whether the model is
significantly better at predicting the outcome than using the mean as a 'best guess'
The F-ratio represents the ratio of
improvement in prediction that results from fitting the model, relative to the inaccuracy that still exists in the model
We are told the sum of squares for model (SSM) - MR regression line in output which represents
improvement in prediction resulting from fitting a regression line to the data rather than using the mean as an estimate of the outcome
We are told residual sum of squares (Residual line) in this MR output which represents
total difference between
the model and the observed data
DF for the model sum of squares (regression line) in MR is equal to the
number of predictors (e.g., 1 for the first model, 3 for the second)
DF for the residual sum of squares in MR is - (2)
the number of observations (N) minus the number of coefficients in the regression model
(e.g., M1 has 2 coefficients - one for the predictor and one for the constant; M2 has 4 - one for each of the 3 predictors and one for the constant)
The average sum of squares (mean square) in the ANOVA table is
calculated for each term (SSM, SSR) by dividing the SS by its df.
How is the F ratio calculated in this ANOVA table?
F-ratio is calculated by dividing the average improvement in prediction by the model (MSM) by the average
difference between the model and the observed data (MSR)
If the improvement due to fitting the regression model is much greater than the inaccuracy within the model, then the value of F will be
greater than 1, and SPSS calculates the exact probability (p-value) of obtaining that value of F by chance
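A minimal sketch of the F-ratio calculation from the sums of squares (NumPy/SciPy assumed; hypothetical inputs):

```python
import numpy as np
from scipy import stats

def regression_f(y, y_hat, k):
    """F = MSM / MSR for a regression with k predictors."""
    n = len(y)
    ss_m = np.sum((y_hat - y.mean()) ** 2)  # improvement over using the mean
    ss_r = np.sum((y - y_hat) ** 2)         # inaccuracy left in the model
    ms_m, ms_r = ss_m / k, ss_r / (n - k - 1)
    f = ms_m / ms_r
    p = stats.f.sf(f, k, n - k - 1)         # probability of F by chance
    return f, p
```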
What happens if b values are positive in multiple regression?
there is a positive relationship between the predictor and the outcome,
What happens if the b value is negative in multiple regression?
it represents a negative relationship between the predictor and the outcome variable
What do the b values in this table tell us about the relationships between predictors and the outcome variable in multiple regression? - (3)
They indicate positive relationships, so as advertising budget increases, record sales (the outcome) increase
as plays on the radio increase, so do record sales
as attractiveness of the band increases, record sales increase
The b-values also tell us, in addition to direction of relationship (pos/neg) , to what degree each in multiple regression
predictor affects the outcome if the effects of all other predictors are held constant:
B-values tell us to what degree each predictor affects the outcome if the effects of all other predictors held constant in multiple regression
e.g., advertising budget - (3)
(b = 0.085):
This value indicates that as advertising budget (x)
increases by one unit, record sales (outcome, y) increase by 0.085 units.
This interpretation is true only if the
effects of attractiveness of the band and airplay are held constant.
Standardised versions of the b-values are much easier to interpret in multiple regression as they are
not dependent on the units of measurement of the variables
The standardised beta values tell us that in multiple regression
the number of standard deviations that the outcome will change as a result of one standard deviation change
in the predictor.
The standardized beta values are all measured in standard deviation
units and so are directly comparable: therefore, in MR they provide
a better insight into the
'importance' of a predictor in the model
If two predictor variables (e.g., advertising budget and airplay) have virtually identical standardised beta values (0.512, and 0.511) it shows that in MR
both variables have a comparable degree of importance in the model
If we collected 100 samples and in MR calculated a CI for b in each, we are saying that 95% of these CIs would contain the
true (population) value of b
A good regression model will have a narrow, small CI, indicating that in MR
the value of b in this sample is close to the true value of b in the population
A bad regression model will have CIs that cross zero, indicating that in MR
in some samples the predictor has a negative
relationship to the outcome, whereas in others it has a positive relationship
In image below, which are the two best predictors based on CIs and one that isn’t as (2) in MR
two best predictors (advertising and airplay) have very tight confidence intervals indicating that the estimates for the current model are likely to be representative of the true population values
interval for attractiveness is wider (but still does not cross zero) indicating that the parameter for this variable is less representative, but nevertheless significant.
If you select part and partial correlations in the Statistics box, there will be another coefficients table, which in MR looks like this:
The zero-order correlations are the simple in MR
Pearson’s correlation coefficients
The partial correlations in MR
represent the relationships between each predictor and the outcome variable, controlling for the effects of the other two predictors.
The part correlations in MR - (2)
represent the relationship between each predictor and the outcome, controlling for the effect that the other two variables have on the outcome.
i.e., representing the unique relationship each predictor has with the outcome
Partial correlations in this example are calculated in MR by - (2)
the unique variance in the outcome explained by the predictor (ignoring all other predictors), divided by the variance in the outcome not explained by all other predictors
A / (A + E)
Part correlations are calculated in MR by - (2)
the unique variance in the outcome explained by the predictor, divided by the total variance in the outcome
A / (A + B + C + E)
If the average VIF is substantially greater than 1, then the MR regression
may be biased
MR Tolerance below 0.1 indicates a
serious problem.
Tolerance below 0.2 in MR indicates
a potential problem
How to interpret this image in terms of colinearity - VIF and tolerance in MR
For our current model the VIF values are all well below 10 and the tolerance statistics all well above 0.2;
therefore, we can safely conclude that there is no collinearity within our data.
We can produce casewise diagnostics in MR to see - (2)
a summary of residual statistics, to examine extreme cases
whether individual scores (cases) influence the modelling of the data too much
SPSS casewise diagnostics show cases that have standardised residuals in MR - (2)
less than -2 or greater than 2
(We expect about 5% of our cases to do that, and 95% to have standardised residuals within about +/- 2.)
If we have a sample of 200 then expect about .. to have standardised residuals outside limits in MR
10 cases (5% of 200)
What does this casewise diagnostic show in MR? - (2)
- 99% of cases should lie within ±2.5, so we expect 1% of cases to lie outside those limits
- From the cases listed, it is clear that two cases (1%) lie outside the limits (cases 164 and 179; case 164 has a residual of 3 and should be investigated further) - at 1%, this conforms to an accurate model
If there are many more such cases in the casewise diagnostics (more than about 5% of the sample size), then in MR we have likely
broken the assumptions of the regression
If cases are a large number of standard deviations from the mean, we may want to in casewise diagnostics in MR
investigate and potentially remove them because they are ‘outliers’
Assumptions we need to check for MR - (8)
- Continuous outcome variable and continuous or dichotomous predictor variables
- Independence = all values of the outcome variable should come from different participants
- Non-zero variance: predictors should have some variation in value, e.g., variance ≠ 0
- No outliers
- No perfect or high collinearity
- Histogram to check for normality of errors
- Scatterplot of ZRESID against ZPRED to check for linearity and homoscedasticity = looking for random scatter
- Independent errors (Durbin-Watson)
Diagram of the assumption of homoscedasticity and linearity: ZRESID against ZPRED in MR
Obvious outliers on a partial plot represent cases that might have in MR
undue influence on a predictor’s b coefficient
What does this partial plot show? - (2) in MR
the partial plot shows the strong positive relationship to album sales.
There are no obvious outliers and the cloud of dots is evenly spaced out around the line, indicating homoscedasticity.
What does this plot show in MR(2)
the plot again shows a positive relationship to album sales, but the dots show funnelling,
There are no obvious outliers on this plot, but the funnel-shaped cloud indicates a violation of the assumption of homoscedasticity.
P-plot and histogram for a normal distribution in MR
P-plot and histogram for a skewed distribution in MR
What if the assumptions for regression are violated in MR?
you cannot generalize your findings beyond your sample
If residuals show problems with heteroscedasticity or non-normality, then in MR try
transforming the raw data - but this won't necessarily affect the residuals!
If you have a violation of the linearity assumption in MR, then you could see whether you can do
logistic regression instead
If R^2 is 0.374 (outcome var in productivity and 3 predictors) then it shows that in MR
37.4% of the variance in productivity scores was accounted for by 3 predictor variables
- The ANOVA table tells us whether the model is significantly improved compared to the baseline model in MR, which is
the model assuming no relation between the predictor variables and the outcome variable (a flat regression line, i.e. no association between these variables)
This table tells us in terms of standardised beta values that (outcome is productivity in MR)
holidays had a standardized beta coefficient of 0.031, whereas cake had a much higher standardized beta coefficient of 0.499, which tells us that the amount of cake given out is a much better predictor of productivity than the amount of holidays taken
For pay we have a beta coefficient of 0.323, which tells us that pay was also a pretty good predictor of productivity in the model, but slightly less so than cake
What does this table tells us in terms of signifiance? - (3) in MR
- P value for holidays is 0.891 which is not significant
- P value for cake is 0.032 is significant
- P value for pay is 0.012 is significant
In the ANOVA table, M2 with all its predictor variables is compared with the
baseline model in MR, not with M1
To see if M2 is an improvement over M1 in HR, we need to look at the ... in the model summary in MR
change statistics
What does this change statistic show in terms of M2 and M1 in MR?
M2 explains an extra 7.5% of the variance, which is significant
In MR, the smaller the value of sig and the larger the value of t, the greater the
contribution of that predictor.
For this output interpret whether predicotrs are sig predictors of record scales and magnitude t statistic on impact of record sales in MR - (2)
For this model, the advertising budget (t(196) = 12.26, p < .001), the amount of radio play prior to release (t(196) = 12.12, p < .001) and attractiveness of the band (t(196) =4.55, p < .001) are all significant predictors of record sales.
From the magnitude of the t-statistics we can see that the advertising budget and radio play had a similar impact,
whereas the attractiveness of the band had less impact.
What is an example of a continuous variable?
a variable with an infinite number of possible real values within a given interval, e.g. height or age
What is an example of dichotomous variable?
variable that can only hold two distinct values like male and female
If outliers are present in data then impact the
line of best fit in MR
You would expect 1% of cases to lie outside ±2.5 standardised residuals, so in a large sample in MR, if you have
one or two outliers then it could be okay
Rule of thumb to check for outliers is to check if there are any data points that in MR
are over 3 SD from the mean
All residuals should lie within ….. SDs for no outliers /normal amount of outliers in MR
-3 and 3 SD
Which variables (if any) are highly correlated in MR?
Weight, Activity, and the interaction between them are statistically significant
What do homoscedasticity and heteroscedasticity mean in MR? - (2)
Homoscedasticity: similar variance of residuals (errors) across the variable continuum, e.g. equally accurate.
Heteroscedasticity: variance of residuals (errors) differs across the variable continuum, e.g. not equally accurate
P plot plots a normal distribution against
your distribution
Diagram of normal, skewed-to-left (positive) and skewed-to-right (negative) p-plots in MR
Durbin-Watson test values of 0, 2 and 4 show that... in MR - (3)
- 0 = errors between pairs of observations are positively correlated
- 2 = independent errors
- 4 = errors between pairs of observations are negatively correlated
A Durbin-Watson statistic between … and … is considered to indicate that the data is not cause for concern = independent errors in MR
1.5 and 2.5
If R2 and adjusted R2 are similar, it means that your regression model
‘generalizes’ to the entire population.
Particularly in MR,
for small N and where results are to be generalized, use the adjusted R2
3 types of multiple regression - (3)
- Standard: To assess impact of all predictor variables simultaneously
- Hierarchical: To test predictor variables in a specific order based on hypotheses derived from theory
- Stepwise: If the goal is accurate statistical prediction from a large number of predictor variables – computer driven
Diagram of the excluded variables table in SPSS in MR - (3)
- Tells us that OCD interpretation of intrusions would not have a significant impact on the model's ability to predict social anxiety
- The beta value of Interpretation of Intrusions is very small, indicating a small influence on the outcome variable
- Beta is the degree of change in the outcome variable for every 1 unit of change in the predictor variable.
What is multicollinearity in MR
When predictor variables correlate very highly with each other
When checking assumption fo regression, what does this graph tell you in MR
Normality of residuals
Which of the following statements about the t-statistic in regression is not true?
The t-statistic is equal to the regression coefficient divided by its standard deviation
The t-statistic tests whether the regression coefficient, b, is significantly different from 0
The t-statistic provides some idea of how well a predictor predicts the outcome variable
The t-statistic can be used to see whether a predictor variables makes a statistically significant contribution to the regression model
Answer: The t-statistic is equal to the regression coefficient divided by its standard deviation (this is the untrue statement - the t-statistic is the coefficient divided by its standard error)
A consumer researcher was interested in what factors influence people’s fear responses to horror films. She measured gender and how much a person is prone to believe in things that are not real (fantasy proneness). Fear responses were measured too. In this table, what does the value 847.685 represent in MR
The residual error in the prediction of fear scores when both gender and fantasy proneness are included as predictors in the model.
A psychologist was interested in whether the amount of news people watch predicts how depressed they are. In this table, what does the value 3.030 represent in MR
The improvement in the prediction of depression by fitting the model
A consumer researcher was interested in what factors influence people’s fear responses to horror films. She measured gender (0 = female, 1 = male) and how much a person is prone to believe in things that are not real (fantasy proneness) on a scale from 0 to 4 (0 = not at all fantasy prone, 4 = very fantasy prone). Fear responses were measured on a scale from 0 (not at all scared) to 15 (the most scared I have ever felt).
Based on the information from model 2 in the table, what is the likely population value of the parameter describing the relationship between gender and fear in MR
Somewhere between −3.369 and −0.517
A consumer researcher was interested in what factors influence people’s fear responses to horror films. She measured gender (0 = female, 1 = male) and how much a person is prone to believe in things that are not real (fantasy proneness) on a scale from 0 to 4 (0 = not at all fantasy prone, 4 = very fantasy prone). Fear responses were measured on a scale from 0 (not at all scared) to 15 (the most scared I have ever felt).
How much variance (as a percentage) in fear is shared by gender and fantasy proneness in the population in MR
13.5%
Recent research has shown that lecturers are among the most stressed workers. A researcher wanted to know exactly what it was about being a lecturer that created this stress and subsequent burnout. She recruited 75 lecturers and administered several questionnaires that measured: Burnout (high score = burnt out), Perceived Control (high score = low perceived control), Coping Ability (high score = low ability to cope with stress), Stress from Teaching (high score = teaching creates a lot of stress for the person), Stress from Research (high score = research creates a lot of stress for the person), and Stress from Providing Pastoral Care (high score = providing pastoral care creates a lot of stress for the person). The outcome of interest was burnout, and Cooper’s (1988) model of stress indicates that perceived control and coping style are important predictors of this variable. The remaining predictors were measured to see the unique contribution of different aspects of a lecturer’s work to their burnout.
Which of the predictor variables does not predict burnout in MR
Stress from research
Using the information from model 3, how would you interpret the beta value for ‘stress from teaching’ in MR
As stress from teaching increases by one unit, burnout decreases by 0.36 of a unit.
How much variance in burnout does the final model explain for the sample in MR
80.3%
A psychologist was interested in predicting how depressed people are from the amount of news they watch. Based on the output, do you think the psychologist will end up with a model that can be generalized beyond the sample?
No, because the errors show heteroscedasticity.
Diagram of no outliers for one assumption of MR
Note that you expect 1% of cases to lie outside this area so in a large sample, if you have one or two, that could be ok
Example of multiple regression - (3)
A record company boss was interested in predicting album sales from advertising.
Data
200 different album releases
Outcome variable:
Sales (CDs and Downloads) in the week after release
Predictor variables
The amount (in £s) spent promoting the album before release
Number of plays on the radio
R is the correlation between
observed values of the outcome, and the values predicted by the model.
Output diagram - what does the output show in MR? - (2)
Difference between no predictors and model 1 (a).
Difference between model 1 (a) and model 2 (b).
Our model 2 is significantly better at predicting the value of the outcome variable than the null model and model 1 (F(2, 197) = 167.2, p < .001) and explains 66% of the variance in our data (R2 = .66)
What does this output show in terms of regression model in MR? - (3)
y = 0.09x1 + 3.59x2 + 41.12
For every £1,000 increase in advertising budget there is an increase of 87 record sales (B = 0.087, t = 11.99, p < .001).
For every additional play on Radio 1 per week there is an increase of 3,589 record sales (B = 3.589, t = 12.51, p < .001).
Report R^2, F statistic and p-value to 2DP for overall model - (3)
o R squared = 0.09
o F statistic = 22.54
o P value = p < 0.001
Report the beta and b values for video games, restrictions and parental aggression to 2DP, and the p-value, in MR
Which of the following statements about the assumptions of homoscedasticity and linearity is correct?
A There is non-linearity in the data
B There is heteroscedasticity in the data
C There is both heteroscedasticity and non-linearity in the data
D There are no problems with either heteroscedasticity or non-linearity
Answer: D - the data points show a random pattern
Determine the proportion of variance in salary that the number of years spent modelling uniquely explains once the models' age was taken into account:
Hierarchical regression
A 2.0%
B 17.8%
C 39.7%
D 42.2%
Answer: A --> the R square change in step 2 was .020
Test for multicollinearity (select tolerance and VIF statistics).
Based on this information, what can you conclude about the suitability of your regression model?
A The VIF statistic is above 10 and the tolerance statistic is below 0.2, indicating that there is no multicollinearity.
B The VIF statistic is above 10 and the tolerance statistic is below 0.2, indicating that there is a potential problem with multicollinearity.
C The VIF statistic is below 10 and the tolerance statistic is above 0.2, indicating that there is no multicollinearity.
D The VIF statistic is below 10 and the tolerance statistic is above 0.2, indicating that there is a potential problem with multicollinearity.
Answer: B
Example of question using hierarchical regression - (2)
A fashion student was interested in factors that predicted the salaries of catwalk models. He collected data from 231 models. For each model he asked how much they earned per day (salary), their age (age), and how many years they had worked as a model (years_modelling).
The student wanted to know if the number of years spent modelling predicted the models’ salary after the models’ age was taken into account.
The following graph shows:
A. Regression assumptions met
B. Non-linearity = could indicate a curve
C. Heteroscedasticity + non-linearity
D. Heteroscedasticity
Answer: A
A consumer researcher was interested in what factors influence people’s fear responses to horror films. She measured gender (0 = female, 1 = male) and how much a person is prone to believe in things that are not real (fantasy proneness) on a scale from 0 to 4 (0 = not at all fantasy prone, 4 = very fantasy prone). Fear responses were measured on a scale from 0 (not at all scared) to 15 (the most scared I have ever felt). What is the likely population value of the parameter describing the relationship between gender and fear?
Somewhere between −3.369 and −0.517
What are the 3 types of t-tests? - (3)
- One-samples t-test
- Paired t-test
- Independent t-test
Whats a one-sample t-test?
Compares the mean of the sample data to a known value
What are the assumptions of the one-sample t-test? - (3)
- DV = continuous (interval or ratio)
- Independent scores (no relation between scores on the test variable)
- Normal distribution, via a frequency histogram (normal shape), a Q-Q plot (straight line) and a non-significant Shapiro-Wilk test
Example of one-sample t-test RQ - (2)
Is the average IQ of Psychology students higher than that of the general population (100)?
A particular factory’s machines are supposed to fill bottles with 150 millilitres of product. A plant manager wants to test a random sample of bottles to ensure that the machines are not under- or over-filling the bottles.
What are the assumptions of the independent samples t-test (listing all of them)? - (6)
- Independence - no relationship between the groups
- Normal distribution, via a frequency histogram (normal shape), a Q-Q plot (straight line) and a non-significant Shapiro-Wilk test
- Homogeneity of variances (i.e., variances approximately equal across groups), via a non-significant Levene's test
- DV = interval or continuous
- IV = categorical
- No significant outliers
What is an RQ example of independent samples t-tesT?
Do dog owners in the country spend more time walking their
dogs than dog owners in the city?
What are the assumptions of the paired t-test (listing all)? - (3)
- DV is continuous
- Related samples: the subjects in each sample, or group, are the same. This means that the subjects in the first group are also in the second group
- Normal distribution, via a frequency histogram (normal shape), a Q-Q plot (straight line) and a non-significant Shapiro-Wilk test
What is an example of RQ of paired t-test?
Do cats learn more tricks when given food or praise as positive feedback?
What is the decision framework for choosing a paired-sample (dependent) t-test? - (5)
- What sort of measurement = continuous
- How many predictor variables = one
- What type of predictor variable = categorical
- How many levels of the categorical predictor = two
- Same or different participants for each predictor level = same
What is the decision framework for choosing an independent t-test? - (5)
- What sort of measurement = continuous
- How many predictor variables = one
- What type of predictor variable = categorical
- How many levels of the categorical predictor = two
- Same or different participants for each predictor level = different
If we are comparing differences between means of two groups in independent/paired t-test then all we are doing is
predicting an outcome based on membership of two groups
Independent and paired t-tests can fit into the ideal of a
linear model
The t-distributed is defined by its
degrees of freedom - related to the sample size.
The t distribution has heavier tails for - (2)
lower degrees of freedom (small-N studies)
increased uncertainty and a higher likelihood of observing extreme values than in large-N studies, where the tails are less heavy as the t distribution approaches the normal distribution
Independent and Paired T-tests have one predictor (X) variable with 2 levels and only …. outcome variable (Y)
one
When is an independent-means t-test used?
When there are 2 experimental conditions and different participants are assigned to each condition
What is independent-means t-test sometimes called as well?
independent-samples t-test
When is a dependent-means t-test used?
Used when there are 2 experimental conditions and same participants took part in both conditions of the experiment
What is dependent-means t-test sometimes referred to?
Matched pairs or paired samples t-test
For independent and paired t-tests, we compare the difference between the sample means that we collected to the difference between sample means that we would expect if
there was no effect (i.e., the null hypothesis was true)
Formula for calculating the t-test statistic (its form depends on whether the same or different participants were used in each experimental condition) in independent/paired t-tests
The formula for the t-statistic shows that we obtain it by dividing the model/effect by the
error in the model
The expected difference when calculating the t-test statistic in most cases is
0 - under the null hypothesis we expect no difference between the group means, so we test whether the difference between the sample means we collected differs from 0
If observed difference between sample means get larger in t-tests then more confident we become that
null hypothesis is rejected and two sample means differ because of experimental manipulation
Both independent t-test and paired t-test are … tests based on normal distribution
parametric tests
Since independent and paired t-tests are parametric tests, they assume that the - (2)
- Sampling distribution is normally distributed - in the paired t-test this means the sampling distribution of the difference scores is normal, not the scores themselves!
- Data are measured at least at the interval level
Since the independent t-test is used to test different groups of people, it also assumes - (2)
- Variances in the populations are roughly equal (homogeneity of variance) = Levene's test
- Scores are independent, since they come from different people
Diagram of the equation for calculating the t-statistic in the paired t-test, explained - (2)
- t = (D̄ − μD) / (sD / √N): it compares the mean difference between our samples (D̄) to the difference we would expect to find between population means (μD), divided by the standard error of differences (sD / √N)
- If H0 is true, then we expect no difference between the population means, hence μD = 0
A small standard error of differences tells us that in paired-t-test
pairs of samples from a population have similar means to population
A large standard error of differences tells us in the paired t-test - (2)
that sample means can deviate quite a lot from the population mean and
the sampling distribution of differences is more spread out
The average difference between a person's score in condition 1 and condition 2 (D̄) in the paired t-test is an indicator of
systematic variation in the data (it represents the experimental effect)
If the average difference (D̄) between our samples is large and the standard error of differences is small in the paired t-test, then we can be confident that
the difference we observed in our sample is not a chance result and is caused by the experimental manipulation
How do we normally calculate the standard error?
SD divided by square root of sample size
How do we calculate the standard error of differences in the paired t-test (σD̄)?
the standard deviation of the differences divided by the square root of the sample size
The t-statistic in the paired t-test is the
ratio of systematic variation in the experiment (the average difference D̄) to unsystematic variation (the standard error of differences)
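A minimal sketch of the paired t-test calculation on hypothetical scores (NumPy/SciPy assumed): the manual D-bar over standard-error-of-differences formula matches scipy.stats.ttest_rel.

```python
import numpy as np
from scipy import stats

# hypothetical scores for the same participants in two conditions
cond1 = np.array([12, 15, 11, 14, 13, 16, 12, 15])
cond2 = np.array([14, 18, 12, 15, 15, 19, 13, 16])

d = cond1 - cond2
t_manual = d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))  # D-bar / SE of differences
t_scipy, p = stats.ttest_rel(cond1, cond2)
print(t_manual, t_scipy)  # identical
```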
When would we expect t statistic greater than 1 in paired-t-test equation?
If the experimental manipulation creates any kind of effect,
When would we expect t statistic less than 1 in paired t-test equation?
If the experimental manipulation is unsuccessful then we might expect the variation caused by individual differences to be much greater than that caused by the
experiment
In paired and independent t-tests generally, we compare the obtained value of t against the maximum value we would expect to get by chance alone in a t distribution with the same df; if the value we obtain exceeds that
critical value, we are confident that it reflects an effect of our IV
What does this paired samples correlation show?
people doing well in the first exam were likely to do well in the second exam regardless of the condition they were in, and the scores are significantly correlated (r = 0.664)
What does this SPSS output show? = paired t- test
t(19) = 2.72, p = 0.012
What does a negative t-value mean in a paired t-test?
The first condition had a smaller mean than the second condition
What does the 95% confidence interval of the difference mean in the SPSS output of a paired t-test? - (3)
- In 95% of samples (e.g., if we had 100 samples, then in 95 of them) the constructed CIs contain the true (population) value of the mean difference
- CIs tell us the boundaries within which the true mean difference is likely to lie
- The true value of the mean difference is unlikely to be 0 if the CI does not contain 0
How to calculate effect size for independent and paired t-tests?
Using cohen’s D
Diagram of calculating Cohen's d statistic for sleep vs no sleep (paired)
Subtract the smaller (control group) mean from the larger mean and divide by the control group's SD
What does Cohen’s d of 0.20 represent
difference between groups is a 1/5 of SD
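A minimal sketch of this calculation (NumPy assumed; hypothetical scores; note that other conventions pool or average the SDs instead of using the control group's):

```python
import numpy as np

def cohens_d(experimental, control):
    """d = (M_exp - M_ctrl) / SD of the control group."""
    experimental = np.asarray(experimental, dtype=float)
    control = np.asarray(control, dtype=float)
    return (experimental.mean() - control.mean()) / control.std(ddof=1)

# hypothetical sleep vs no-sleep exam scores
print(cohens_d([66, 70, 62, 68], [58, 61, 55, 60]))
```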
Diagram of writing up paired t-test result
To calculate effect size for independent and paired t-tests, besides Cohen's d, we can also
calculate the effect size r (above 0.50 is a large effect) by converting the t-value to an r-value: r = √(t² / (t² + df))
With independent t-test there are two different equations that can be used depending on whether the samples
contain an equal number of people
With independent t-test since different participants participate in different condition, the pairs of scores will differ not just of experimental manipulation but also because of
other sources of variance (such as individual differences between participants’ motivation, IQ etc..)
With dependent t-test we look at differences between pairs of scores because
scores came from same participant and so individual differences were eliminated
Equation of the independent t-test with equal N sizes for each condition: t = (X̄1 − X̄2) / √(s1²/N1 + s2²/N2)
The equation of the independent t-test with equal N sizes takes its final form because - (3)
- We are looking at the difference between the overall means of the 2 samples and comparing it with the difference we would expect to get between the means of the 2 populations from which the samples come
- If H0 is true, the samples are drawn from the same population
- Therefore under H0, μ1 = μ2, and so μ1 − μ2 = 0
Equation of independent t-test in numbers for equal N sizes
We use variance of sum law to obtain the estimate of standard error for each … in independent t-test equation for equal N sizes
sample group
What does variance sum of law state?
variance of the sampling distribution is equal to the sum of the variances of the two populations from which the samples
were taken
This independent t-test standard error formula combines the
standard error for two samples
In independent t-test when we want to compare two groups that contain different number of participants then equation … is not appropriate
For comparing two groups with unequal numbers of participants in the independent t-test, we use the
pooled variance estimate t-test
The pooled variance estimate t-test takes into account the
difference in sample size by weighting the variance of each sample
Formula for the pooled variance estimate t-test - (2)
Each sample's variance is multiplied by its df and the two are added together, then divided by the sum of the weights (the sum of the two dfs): sp² = ((N1 − 1)s1² + (N2 − 1)s2²) / (N1 + N2 − 2)
Larger samples are better than small ones, as they are closer to the population
In formula of pooled variance estimate t-test it weights the variance of each sample by the
number of degrees of freedom (N-1)
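A minimal sketch of the pooled-variance calculation on hypothetical unequal-sized groups (NumPy/SciPy assumed): the manual result matches scipy.stats.ttest_ind with equal_var=True.

```python
import numpy as np
from scipy import stats

# hypothetical groups of unequal size
a = np.array([60.2, 58.1, 63.4, 55.9, 61.0, 59.5])
b = np.array([89.3, 85.0, 92.1, 88.4, 90.2, 87.7, 91.5])

n1, n2 = len(a), len(b)
# pooled variance: each sample's variance weighted by its df
sp2 = ((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1)) / (n1 + n2 - 2)
t_manual = (a.mean() - b.mean()) / np.sqrt(sp2 / n1 + sp2 / n2)
t_scipy, p = stats.ttest_ind(a, b, equal_var=True)
print(t_manual, t_scipy)  # identical
```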
As with dependent t-test we compare obtained value of t in independent sample against the
maximum value we would expect to get by chance alone in t distribution with same DFs
What does this output show in the independent t-test? - (2)
The sleep condition scored an average exam score of 66.20 and the no-sleep condition an average of 58.73
Effect size (Cohen's d) = (mean of sleep − mean of no sleep) / SD of sleep (control group) = (66.20 − 58.73) / 7.12
In the independent samples t-test we check Levene's test for equality of variances, which determines whether
we have equal variances across the groups or whether the variances are unequal
In independent t-test, Levene’s test we are looking for a non-significant p-value which shows that
no statistically significant difference in variances between the two groups - report results from equal variances assumed
In independent t-test if Levene’s test was significant then it means that
variances between the 2 groups are different and they are statistically significantly different - report data from equal variances not assumed
What does this output show in independent t-test? - (2)
- Levene’s test is not significant (p = 0.970) so no stats sig differences in variance between two groups
- t(28) = 2.87, p = 0.008
Diagram of reporting independent t-test
Paired vs independent t-tests - which has better power?
Paired t-tests
Since paired-t-tests use same participants across conditions the … is reduced dramatically compared to independent t-test
unsystematic variance
The non-parametric counterpart of the dependent t-test is called the
Wilcoxon signed-rank test
The non-parametric counterparts of the independent t-test are the
Wilcoxon rank-sum test and the Mann-Whitney test
What does this SPSS output of the independent t-test's Levene's test show?
homogeneity of variance, as assessed by Levene's Test for Equality of Variances (F = 1.58, p = .219)
Cohen’s d for diet was 4.25
Is this a:
Small effect
Medium effect
Large effect
Answer: Large effect
The probability of a value of t occurring yields the p value for the difference between the means occurring by
chance
Although there are 2 ways to compute effect size, use
Cohen's d
Another example of a two-sample independent t-test scenario (RQ, sample, DV, hypotheses, test, significance level) - (6)
Research question: Which of the two diet formulas is better for puppies?
Sample: 15 were randomly assigned to each of the two diets (A and B).
Dependent variable: Average daily weight gain (ADG, g/day) between 12 to 28 weeks of age.
Hypotheses:
Ho: µA = µB
Ha: µA ≠ µB.
Statistical Test: Two samples
independent t-test
Significance level: .05
We can check that there are no outliers in the independent t-test by looking at
boxplots - there are no outliers here
To check normality of distribution for both independent groups for two-samples independent t-test, we can use..
histogram, q-qplot and tests of normality
Checking normality for the independent t-test - (3)
Research question: Which of the two diet formulas is better for puppies?
Dependent variable: Average daily weight gain (ADG, g/day)
Neither group has significant values in the tests of normality, and the histograms and plots look normal
So we have normality of distribution for both independent groups
Inspection of the Q-Q plots and the non-significant Shapiro-Wilk tests (p > .05) indicate that ADG is normally distributed for both groups
For checking homogeneity of variances in independent/paired designs we use
Levene's test
Checking homogeneity of variance in this two-sample independent t-test, what does it show?
Research question: Which of the two diet formulas is better for puppies?
Dependent variable: Average daily weight gain (ADG, g/day)
There was homogeneity of variance, as assessed by Levene's Test for Equality of Variances (F = 1.58, p = .219)
What do the results of the two-sample independent t-test show?
Research question: Which of the two diet formulas is better for puppies?
Dependent variable: Average daily weight gain (ADG, g/day)
This study found that puppies in diet B had statistically significantly higher average daily weight gain (89.29 ± 9.93 g/day) between 12 and 28 weeks of age compared to puppies in diet A (60.20 ± 6.85 g/day), t(27)= -9.24, p < .001.
In Cohen's d, theoretically 3 SDs can be used, which make very little difference - (3)
- Pooled SD (over conditions)
- Averaged SD
- Control group SD
To calculate Cohen's d for an independent/paired t-test we need to use
control group SD
How do we calculate Cohen's d for the independent two-samples t-test in this example?
Research question: Which of the two diet formulas is better for puppies?
Dependent variable: Average daily weight gain (ADG, g/day) - (2)
d = (89.29 - 60.20) / 6.85
d = 4.25
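The same arithmetic in a couple of lines of Python, using the values quoted on this card:

```python
# Cohen's d using the control group SD (values from the card above)
mean_b, mean_a, sd_control = 89.29, 60.20, 6.85
d = (mean_b - mean_a) / sd_control
print(round(d, 2))  # 4.25 -> well above the d = 0.8 'large' threshold
```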
Cohen's d guidelines for small, medium and large effects - (3)
d = 0.2 be considered a ‘small’ effect size,
d = 0.5 represents a ‘medium’ effect size
d = 0.8 a ‘large’ effect size
What does ANOVA stand for?
Analysis of Variance
What is the decision tree for choosing a one-way ANOVA? - (5)
Q: What sort of measurement? A: Continuous
Q:How many predictor variables? A: One
Q: What type of predictor variable? A: Categorical
Q: How many levels of the categorical predictor? A: More than two
Q: Same or Different participants for each predictor level? A: Different
When is ANOVA used?
when you are comparing more than 2 groups of an IV
Example of ANOVA RQ
Which is the fastest animal in a maze experiment - cats, dogs or rats?
We can't do three separate t-tests (for example, for which is the fastest animal in a maze experiment - cats, dogs or rats) because - (2)
Doing separate t-tests inflates the type I error (false positive - e.g., pregnant man)
The repetition of the multiple tests adds multiple chances of error, which may result in a larger α error level than the pre-set α level - familywise error
What is familywise or experimentwise error rate?
This error rate across statistical tests conducted on the same experimental data
Family wise error is related to
type 1 error
What is the alpha level probability?
the probability of wrongly rejecting the null hypothesis (accepting the alternative when the null is true) = Type I error
If we conduct 3 separate t-tests to compare which is the fastest animal in the experiment - cats, dogs or rats - with an alpha level of 0.05 - (4)
- Each test carries a 5% risk of a Type I error (falsely rejecting H0)
- The probability of making no Type I error is 95% for a single test
- However, for multiple tests the probability of making no Type I error decreases: across 3 tests it is 0.95 × 0.95 × 0.95 = 0.857
- This means the probability of at least one Type I error increases: 1 - 0.857 = 0.143 (a 14.3% chance of making at least one Type I error)
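The same calculation generalises to any number of tests: the familywise error rate is 1 - (1 - α)^k. A one-line sketch:

```python
# Familywise error rate for k tests each run at alpha = .05
alpha, k = 0.05, 3
fwer = 1 - (1 - alpha) ** k  # P(at least one Type I error)
print(round(fwer, 3))        # 0.143 -> a 14.3% chance of a false positive
```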
Much like model for t-tests we can write a general linear model for
ANOVA - 3 levels of categorical variable with dummy variables
When we perform a t-test, we test the hypothesis that the two samples have the same
mean
ANOVA tells us whether three or more means are the same so tests H0 that
all group means are equal
An ANOVA produces an
F statistic or F ratio
The F ratio produced in ANOVA is similar to t-statistic in a way that it compares the
amount of systematic variance in data to the amount of unsystematic variance i.e., ratio of model to its error
ANOVA is an omnibus test which means it tests for and tells us - (2)
overall experimental effect
tells whether experimental manipulation was successful
An ANOVA is an omnibus test, and its F-ratio does not provide specific information about which
groups were affected by the experimental manipulation
Just like the t-test can be represented by a linear regression equation, ANOVA can be represented by a
multiple regression equation; for three means the model accounts for 3 levels of the categorical variable with dummy variables
As compared to independent samples t-test that compares means of two groups, one-way ANOVA compares means of
3 or more independent groups
In one-way ANOVA we use … … to test assumption of equal variances across groups
Levene’s test
What does this one-way ANOVA output show?
Levene's test is non-significant, so equal variances are assumed
What does this SPSS output show in one-way ANOVA?
F(2,42) = 5.94, p = 0.005, eta-squared = 0.22
How is effect size (eta-squared) calculated in one-way ANOVA?
Between groups sum of squares divided by total sum of squares
What is the eta-squared/effect size for this SPSS output and what does this value mean? - (2)
830.207/3763.632 = 0.22
22% of the variance in exam scores is accounted for by the model
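Reproducing the eta-squared arithmetic from this output in Python:

```python
# Eta-squared = SS between groups / SS total (values from the SPSS table)
ss_between, ss_total = 830.207, 3763.632
eta_sq = ss_between / ss_total
print(round(eta_sq, 2))  # 0.22 -> 22% of the variance in exam scores explained
```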
Interpreting eta-squared: what do eta-squared values of 0.01, 0.06 and 0.14 mean in one-way ANOVA? - (3)
- 0.01 = small effect
- 0.06 = medium effect
- 0.14 = large effect
What happens if the Levene’s test is significant in the one-way ANOVA?
then use statistics in Welch or Brown-Forsythe test
In one-way ANOVA, if Levene's test is significant, the Welch or Brown-Forsythe tests make adjustments to the DF, which affects
the statistics you get and whether the p-value is significant
What does this post-hoc table of Bonferroni tests show in one-way ANOVA ? - (3)
- Full sleep vs partial sleep, p = 1.00, not sig
- Full sleep vs no sleep , p = 0.007 so sig
- Partial sleep vs no sleep = p = 0.032 so sig
Diagram of example of grand mean
Mean of all scores regardless of pp's condition
What are the total sum of squares (SST) in one-way ANOVA?
difference of the participant’s score from the grand mean squared and summed over all participants
What is model sum of squares (SSM) in one-way ANOVA?
difference of the model score from the grand mean squared and summed over all participants
What is residual sum of squares (SSR) in one-way ANOVA?
difference of the participant’s score from the model score squared and summed over all participants
The residuals sum of squares (SSR) tells us how much of the variation cannot be
explained by the model and amount of variation caused by extraneous factors
We divide each sum of squares by its
DF to calculate them
For SST, the DF we divide by in one-way ANOVA is
N-1
For SSM, the DF we divide by in one-way ANOVA is
the number of groups (parameters), k, minus 1
For SSM, if we have this design then its DF in one-way ANOVA will be
3-1 = 2
For SSR, the DF we divide by in one-way ANOVA is
total sample size, N, minus the number of groups, k
Formulas for dividing each sum of squares by its DF to calculate the mean squares in one-way ANOVA - (3)
- MST = SST/(N-1)
- MSM = SSM/(k-1)
- MSR = SSR/(N-k)
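Putting the whole partition together: a minimal Python sketch that computes SST, SSM, SSR, the mean squares and F by hand for three invented groups, then checks the F against scipy:

```python
import numpy as np
from scipy import stats

# Hypothetical scores for three groups (e.g., full, partial and no sleep)
groups = [np.array([72, 68, 75, 70, 71]),
          np.array([66, 70, 64, 69, 67]),
          np.array([58, 61, 55, 60, 57])]

scores = np.concatenate(groups)
grand_mean = scores.mean()
N, k = len(scores), len(groups)

sst = ((scores - grand_mean) ** 2).sum()                          # total
ssm = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # model
ssr = sst - ssm                                                   # residual

msm = ssm / (k - 1)  # average systematic variation
msr = ssr / (N - k)  # average unsystematic variation
F = msm / msr

print(round(F, 3), stats.f_oneway(*groups))  # the two F values should match
```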
SSM tells us the total variation that the
exp manipulation explains
What does MSM represent?
average amount of variation explained by the model (e.g. the systematic variation),
What does MSR represent?
average amount of variation explained by extraneous variables (the unsystematic variation).
The F-ratio in one-way ANOVA can be calculated by
dividing the model mean squares by the residual mean squares: F = MSM/MSR
If F ratio in one-way ANOVA is less than 1 then it represents a
non-significant effect
Why F less than 1 in one-way ANOVA represents a non-significant effect?
F ratio is less than 1 means that MSR is greater than MSM = more unsystematic than systematic
If F is greater than 1 in one-way ANOVA then it shows likelihood of … but doesn't tell us - (2)
indicates that experimental manipulation had some effect above and beyond effect of individual differences in performance
Does not tell us whether F-ratio is large enough to not be a chance result
When F statistic is large in one-way ANOVA then it tells us that the
MSM is greater than MSR
To discover if F statistic is large enough not to be a chance result in one-way ANOVA then
compare the obtained value of F against the maximum value we would expect to get by chance if the group means were equal in an F-distribution with the same degrees
of freedom
High values of F in one-way ANOVA are rare - (3)
by chance
Low degrees of freedom result in long tails of the distribution, so much like other statistics
large values of F are more common to crop up by chance in studies with low numbers of participants.
The F-ratio in one-way ANOVA tells us whether the model fitted to the data accounts for more variation than extraneous factors do, but does not tell us where
differences between groups lie
If F-ratio in one-way ANOVA is large enough to be statistically significant then we know
that one or more of the differences between means is statistically significant (e.g. either b2 or b1 is statistically significant)
It is necessary after conducting an one-way ANOVA to carry out further analysis to find out
which groups differ
The power of F statistic is relatively unaffected by
non-normality
when group sizes are not equal the accuracy of F is
affected by skew, and non-normality also affects the power of F in quite unpredictable ways
When group sizes are equal, the F statistic can be quite robust to
violations of normality
What tests do you do after performing a one-way ANOVA and finding significant F test? - (2)
- Planned contrasts
- Post-hoc tests
What do post-hoc tests do? - (2)
- compare all pairwise differences in mean
- Used if no specific hypotheses concerning differences has been made
What is the issue with post-hoc tests?
- because every pairwise combination is considered the type 1 error rate increases, so normally the type 1 error rate is reduced by modifying the critical value of p
Are post-hoc tests like a two-tailed or one-tailed hypothesis?
two-tailed
Are planned contrasts like a one-tailed or two-tailed hypothesis?
One-tailed hypothesis
What is the most common modification of the critical value for p in post-hoc in one-way ANOVA?
Bonferroni correction, which divides the standard critical value of p=0.05 by the number of pairwise comparisons performed
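A quick sketch of that adjustment for all pairwise comparisons among k groups:

```python
# Bonferroni-corrected critical p for all pairwise comparisons of k groups
k = 3
m = k * (k - 1) // 2     # number of pairwise comparisons (3 for k = 3)
p_crit = 0.05 / m
print(round(p_crit, 4))  # 0.0167 -> only accept tests with p below this
```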
Planned contrasts are used to investigate a specific
hypothesis
Planned contrasts do not test for every
pairwise difference so are not penalized as heavily as post hoc tests that do test for every difference
With planned contrasts you derive the hypotheses before the
data is collected
In planned contrasts when one condition is used it is
never used again
In planned contrasts (one-way ANOVA), the number of independent contrasts you can make is
k (number of groups) minus 1
How do planned contrasts work in SPSS?
Coefficients add to 0 for each contrast (-2 + 1 + 1), and once a group has been used alone in a contrast, the next contrasts set its coefficient to 0 (e.g., -2 to 0)
Polynomial contrasts in one-way ANOVA can also look at trends more complex than linear, such as
quadratic, cubic and quartic
The Bonferroni post-hoc test in one-way ANOVA ensures that the Type I error rate stays below
0.05
While the Bonferroni correction reduces Type I error (being conservative about the Type I error for each comparison), in one-way ANOVA it also
lacks statistical power (the probability of a Type II error [false negative] will be high), increasing the chance of missing a genuine difference in the data
What post-hoc tests to use if you have equal sample sizes and are confident that your group variances are similar, in one-way ANOVA?
Use REGWQ or Tukey, as they have good power and tight control over the Type I error rate
What post hoc tests to use if your sample sizes are slightly different in one way ANOVA?
Gabriel's procedure, because it has greater power
What post-hoc test to use if your sample sizes are very different, in one-way ANOVA?
Hochberg's GT2
What post-hoc test to run if Levene's test of homogeneity of variance is significant, in one-way ANOVA?
Games-Howell
What post-hoc test to use if you want guaranteed control over the Type I error rate, in one-way ANOVA?
Bonferroni
What does this ANOVA error line graph show? - (2)
- Linear trend: as the dose of Viagra increases, so does the mean level of libido
- Error bars overlap, indicating no between-group differences
What does the within-groups row give details of in the ANOVA table?
SSR (unsystematic variation)
The between groups label in ANOVA table tells us
SSM (systematic variation)
What does this ANOVA table demonstrate? - (2)
- Linear trend is significant (p = 0.008)
- Quadratic trend is not significant (p = 0.612)
When we do planned contrasts we arrange the weights such that we compare any group with a positive weight
against any group with a negative weight
What does this output show if we conduct two planned comparisons of:
one to test whether the control group was different to the two groups which received Viagra, and one to see
whether the two doses of Viagra made a difference to libido
- (2)
the table of weights shows that contrast 1 compares the placebo group against the two experimental groups,
contrast 2 compares the low-dose group to the high-dose group
What does this table show if Levene's test is non-significant (= equal variances assumed)?
To test hypothesis that experimental groups would increase libido above the levels seen in the placebo group (one-tailed)
To test another hypothesis that a high dose of Viagra would increase libido significantly more than a low dose
one-way ANOVA
- (3)
The significance value given in the table is two-tailed, and since our hypotheses were one-tailed we divide it by 2
For contrast 1, we can say that taking Viagra significantly increased libido compared to the control group (p = .029/2 = .0145)
The significance of contrast 2 tells us that a high dose of Viagra increased libido significantly more than a low dose (p(one-tailed) = .065/2 = .0325)
If making a few pairwise comparisons with an equal number of pps in each condition then use …; if making a lot then use … (one-way ANOVA) - (2)
Bonferroni
Tukey
Assumptions of ANOVA - (5)
- Independence of data
- DV is continuous; IV categorical (3 groups)
- No significant outliers;
- DV approximately normally distributed for each category of the IV
- Homogeneity of variance = Levene's test not significant
ANOVA compares many means without increasing the chance of
type 1 error
In one-way ANOVA, we partition the total variance into
variance explained by the IV (the model, SSM) and unexplained error variance (SSR)
An independent t-test is used to test for:
A Differences between means of groups containing different participants when the sampling distribution is normal, the groups have equal variances and data are at least interval.
B Differences between means of groups containing different participants when the data are not normally distributed or have unequal variances.
C Differences between means of groups containing the same participants when the data are normally distributed, have equal variances and data are at least interval.
D Differences between means of groups containing the same participants when the sampling distribution is not normally distributed and the data do not have unequal variances.
A Differences between means of groups containing different participants when the sampling distribution is normal, the groups have equal variances and data are at least interval
If you use a paired samples t-test
A The same participants take part in both experimental conditions.
B There ought to be less unsystematic variance compared to the independent t-test.
C Other things being equal, you do not need as many participants as you would for an independent samples design.
D All of these are correct.
D All of these are correct
Which of the following statements about the t distribution is correct?
A It is skewed
B In small samples it is narrower than the normal distribution
C As the degrees of freedom increase, the distribution becomes closer to normal
D It follows an exponential curve
C As the DF increase, the distribution becomes closer to normal
Which of the following sentences is an accurate description of the standard error?
A It is the same as the standard deviation
B It is the observed difference between sample means minus the expected difference between population means (if the null hypothesis is true)
C It is the standard deviation of the sampling distribution of a statistic
D It is the standard deviation squared
C It is the standard deviation of the sampling distribution of a statistic
A psychologist was interested in whether there was a gender difference in the use of email. She hypothesized that because women are generally better communicators than men, they would spend longer using email than their male counterparts. To test this hypothesis, the researcher sat by the computers in her research methods laboratory and when someone started using email, she noted whether they were male or female and then timed how long they spent using email (in minutes). Based on the output, what should she report?
(NOTE: Check for the assumption of equality of variances).
A Females spent significantly longer using email than males, t(14) = –1.90, p = .079
B Females and males did not significantly differ in the time spent using email, t(7.18) = –1.90, p = .099
C Females and males did not significantly differ in the time spent using email, t(7.18) = –1.90, p < .003
D Females and males did not significantly differ in the time spent using email, t(14) = –1.90, p = .079
B Females and males did not significantly differ in the time spent using email, t(7.18) = –1.90, p = .099
Other things being equal, compared to the paired-samples (or dependent) t-test, the independent t-test:
A Has more power to find an effect.
B Has the same amount of power; the data are just collected differently.
C Has less power to find an effect.
D Is less robust.
C Has less power to find an effect.
Differences between group means can be characterized as a regression (linear) model if:
A The outcome variable is categorical.
B The groups have equal sample size.
C The experimental groups are represented by a binary variable (i.e. code 1 and 0).
D The difference between group means cannot be characterized as a linear model; they must be analysed as an independent t-test.
The experimental groups are represented by a binary variable (i.e. code 1 and 0)
An experiment was done to look at whether different relaxation techniques could predict sleep quality better than nothing. A sample of 400 participants were randomly allocated to one of four groups: massage, hot bath, reading or nothing. For one month each participant received one of these relaxation techniques for 30 minutes before going to bed each night. A special device was attached to the participant’s wrist that recorded their quality of sleep, providing them with a score out of 100. The outcome was the average quality of sleep score over the course of the month.
Which test could we use to analyse these data?
A Regression only
B ANOVA only
C Regression or ANOVA
D Chi-square
C (multiple) regression or (independent) ANOVA, as regression and ANOVA are the same model
The question did not mention a hypothesis of prediction, or it would be regression
Chi-square is only used when you have one categorical predictor and the outcome is categorical
A researcher testing the effects of two treatments for anxiety computed a 95% confidence interval for the difference between the mean of treatment 1 and the mean of treatment 2. If this confidence interval includes the value of zero, then she cannot conclude that there is a significant difference in the treatment means: true or false.
TRUE OR FALSE
TRUE
The student welfare office was interested in trying to enhance students’ exam performance by investigating the effects of various interventions. They took five groups of students before their statistics exams and gave them one of five interventions: (1) a control group just sat in a room contemplating the task ahead; (2) the second group had a yoga class to relax them; (3) the third group were told they would get monetary rewards contingent upon the grade they received in the exam; (4) the fourth group were given beta-blockers to calm their nerves; and (5) the fifth group were encouraged to sit around winding each other up about how much revision they had/hadn’t done (a bit like what usually happens). The final percentage obtained in the exam was the dependent variable. Using the critical values for F, how would you report the result in the table below?
A Type of intervention did not have a significant effect on levels of exam performance, F(4, 29) = 12.43, p > .05.
B Type of intervention had a significant effect on levels of exam performance, F(4, 29) = 12.43, p < .01.
C Type of intervention did not have a significant effect on levels of exam performance, F(4, 33) = 12.43, p > .01.
D Type of intervention had a significant effect on levels of exam performance, F(4, 33) = 12.43, p < .01.
B Type of intervention had a significant effect on levels of exam performance, F(4, 29) = 12.43, p < .01.
Imagine you compare the effectiveness of four different types of stimulant to keep you awake while revising statistics using a one-way ANOVA. The null hypothesis would be that all four treatments have the same effect on the mean time kept awake. How would you interpret the alternative hypothesis?
A. All four stimulants have different effects on the mean time spent awake
B. All stimulants will increase mean time spent awake compared to taking nothing
C. At least two of the stimulants will have different effects on the mean time spent awake
D. None of the above
C. At least two of the stimulants will have different effects on the mean time spent awake
When the between-groups variance is a lot larger than the within-groups variance, the F-value is ____ and the likelihood of such a result occurring because of sampling error is _____
A small; high
B small; low
C. large; high
D. large; low
D. large; low
Subsequent to obtaining a significant result from an exploratory one-way independent ANOVA, a researcher decided to conduct three post hoc t-tests to investigate where the differences between groups lie.
Which of the following statements is correct?
A. The researcher should accept as statistically significant tests with a probability value of less than 0.016 to avoid making a Type I error
B. The researcher should have conducted orthogonal contrasts instead of t-tests to avoid making a Type I error
C. This is the wrong method to use. The researcher did not make any predictions about which groups will differ before running the experiment, therefore contrasts and post hoc tests cannot be used
D. None of these options are correct
The researcher should accept as statistically significant tests with a probability value of less than 0.016 to avoid making a Type I error
A psychologist was looking at the effects of an intervention on depression levels. Three groups were used: waiting list control, treatment and post-treatment (a group who had had the treatment 6 months before). The SPSS output is below. Based on this output, what should the researcher report?
A. The treatment groups had a significant effect on depression levels, F(2, 45) = 5.11.
B. The treatment groups did not have a significant effect on the change in depression levels, F(2, 35.10) = 5.11.
C. The treatment groups did not have a significant effect on depression levels, F(2, 26.44) = 4.35.
D. The treatment groups had a significant effect on the depression levels, F(2, 26.44) = 4.35.
D. The treatment groups had a significant effect on the depression levels, F(2, 26.44) = 4.35.
Imagine we conduct a one-way independent ANOVA with four levels on our independent variable and obtain a significant result. Given that we had equal sample sizes, we did not make any predictions about which groups would differ before the experiment and we want guaranteed control over the Type I error rate, which would be the best test to investigate which groups differ?
A. Orthogonal contrasts
B. Helmert
C. Bonferroni
D. Hochberg’s GT2
C. Bonferroni
The student welfare office was interested in trying to enhance students’ exam performance by investigating the effects of various interventions.
They took five groups of students before their statistics exams and gave them one of five interventions: (1) a control group just sat in a room contemplating the task ahead (Control); (2) the second group had a yoga class to relax them (Yoga); (3) the third group were told they would get monetary rewards contingent upon the grade they received in the exam (Bribes); (4) the fourth group were given beta-blockers to calm their nerves (Beta-Blockers); and (5) the fifth group were encouraged to sit around winding each other up about how much revision they had/hadn’t done (You’re all going to fail).
The student welfare office made four predictions: (1) all interventions should be different from the control; (2) yoga, bribery and beta-blockers should lead to higher exam scores than panic; (3) yoga and bribery should have different effects than the beta-blocker drugs; and (4) yoga and bribery should also differ.
Which of the following planned contrasts (with the appropriate group codings) are correct to test these hypotheses?
ANSWER 1
ANSWER 2
ANSWER 3
ANSWER 4
ANSWER 1 - sum of all weights should be 0
Deciding what post hoc tests to run
Example of RQ for one way ANOVA - (3)
Is there a statistically significant difference in Frisbee throwing distance with respect to education status
IV = Education with 3 levels = high school, graduate, postgrad
DV = Frisbee throwing distance
What does this one-way ANOVA output show?
Research question: Is there a statistically significant difference in Frisbee throwing distance with respect to education status?
Variables:
IV - Education, which has three levels:
High School, Graduate and PostGrad;
DV - Frisbee Throwing Distance
There was homogeneity of variance as assessed by Levene’s Test for Equality of Variances (F (2,47) = 1.94, p = .155)
What does the results of one-way ANOVA show?
Research question: Is there a statistically significant difference in Frisbee throwing distance with respect to education status?
Variables:
IV - Education, which has three levels:
High School, Graduate and PostGrad;
DV - Frisbee Throwing Distance
There was a statistically significant difference between groups as demonstrated by one-way ANOVA (F(2, 47) = 3.50, p = .038).
What does the results of one-way ANOVA show? –> post hoc
Research question: Is there a statistically significant difference in Frisbee throwing distance with respect to education status?
Variables:
IV - Education, which has three levels:
High School, Graduate and PostGrad;
DV - Frisbee Throwing Distance
A Tukey post hoc test shows that the PostGrad group was able to throw the frisbee statistically significantly further than the High School group (p = .034). There was no statistically significant difference between the Graduate and High School groups (p = .691) nor between the Graduate and PostGrad groups (p = .099).
What are the IV and DV of one-way ANOVA?
IV = 1 categorical predictor with more than 2 levels
DV = 1 continuous variable
one-way ANOVA is also called
a between-subjects ANOVA
regression equation for ANOVA
can be extended to include one or more continuous variables that predict the outcome (or dependent variable).
these continuous variables are not
part of the main experimental manipulation but have an influence on the dependent variable, are known as covariates and they can be included
in an ANOVA analysis.
What does ANCOVA involve?
When we measure covariates and include them in an
analysis of variance
Continuous variables, that are not part of the main experimental manipulation (don’t want to study them) but have an influence on the dependent variable, are known as
covariates
From what we know of the hierarchical regression model, if we enter the covariate into the regression model first, then the dummy variables representing the experimental manipulation after… - (2)
then we can see what effect an IV has after the effect of covariate
We partial out the effect of covariate
What are the two reasons for including covariates in ANOVA? - (2)
- To reduce within-group error variance: if we can explain some of the unexplained variance (SSR) in terms of other variables (covariates), we reduce SSR and can more accurately assess the effect of SSM
- Elimination of confounds: remove the bias of unmeasured variables that confound the results and influence the DV
ANCOVA has the same assumptions as ANOVA (e.g., normality and homogeneity of variance via Levene's test) except it has two more important assumptions, which are… - (2)
- Independence of the covariate and treatment effect
- Homogeneity of regression slopes
For ANCOVA to reduce within-group variance by allowing the covariate to explain some of the error variance the covariate must be
independent from the experimental/treatment effect - (IVs - categorical predictors) ( ANCOVA assumption)
People should not use ANCOVA when the effect of the covariate overlaps with the experimental effect, as it means the
experimental effect is confounded with the effect of covariate = interpretation of ANCOVA is compromised
In ANCOVA, the effect of the covariate should be independent of the
experimental effect
When an ANCOVA is conducted we look at the overall relationship between DV and covariate meaning we fit a regression line to
entire dataset and ignore which groups pps fit in
When is homogeneity of regression slopes not satisfied in ANCOVA?
when the relationship between the
outcome (dependent variable) and covariate differs across the groups, the overall regression model is inaccurate (it does not represent all of the groups)
What is the best way to test the homogeneity of regression slopes assumption in ANCOVA?
imagine plotting a scatterplot for each experimental condition with the covariate on one axis and the outcome on the other and calculate its regression line
Diagram of regression slopes satisfying homogeneity of regression slopes in ANCOVA
- exhibits the same slopes for the control and 15-minute groups
Diagram of regression slopes not satisfying homogeneity of regression slopes in ANCOVA
- the 30-minutes-of-therapy group exhibits a different slope compared to the others
What design, variables and test would you use for this research scenario? - (5)
- ANCOVA
- Independent samples-design
- One IV , two conditions, interval regime and steady state
- One covariate (age in years)
- One DV (Race time)
What does this ANCOVA output show?
- IV = Regime –> steady or interval
- Covariate = Age
- DV = Racetime- (2)
- Age F(1,27) = 5.36, p = 0.028, partial eta-squared = 0.17 (large and sig main effect)
- Regime F(1,27) = 4.28, p = 0.048, partial eta-squared = 0.14 (large and sig main effect)
What DF do you report from this ANCOVA table for age for example…
DF for age and DF for error
Guidelines for interpreting partial eta-squared - (3)
η2 = 0.01 indicates a small effect.
η2 = 0.06 indicates a medium effect.
η2 = 0.14 indicates a large effect
What does this SPSS output for ANCOVA show? - (3)
- Interval has a marginal mean of race times of 56.57
- Steady state has a marginal mean of race times 62.97
- Estimated marginal means partial out the effect of age: they show the mean race times for interval and steady state if the mean age (30.07) across the two groups were held constant
What does this output show in terms of homogeneity of regression slopes?
age is covariate and regime is IV and DV is race times - (2)
- Interaction effect of regime * age has a p-value of 0.980
- Since p-value is not significant the assumption of homogeneity of regression slopes has been met
What happens if the interaction effect of the IV and covariate is significant when testing homogeneity of regression slopes?
the relationship between the covariate and DV differs significantly between the two (or more) groups you have and the assumption is not satisfied
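Outside SPSS, the same check amounts to fitting a linear model that includes the IV × covariate interaction and inspecting the interaction term; a sketch with invented regime/age/race-time data (statsmodels assumed to be available):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Invented data: regime (IV), age (covariate), racetime (DV)
df = pd.DataFrame({
    "regime":   ["interval"] * 5 + ["steady"] * 5,
    "age":      [25, 30, 28, 35, 27, 26, 32, 29, 34, 31],
    "racetime": [55, 58, 56, 61, 54, 60, 64, 62, 66, 63],
})

# The * expands to main effects plus the regime x age interaction;
# a non-significant interaction supports homogeneity of regression slopes
model = ols("racetime ~ C(regime) * age", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```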
For testing the assumption of independence of the covariate and experimental effect (IV) in SPSS, we need to add
the covariate (e.g., age) in the DV box instead of the covariate box, with the IV (e.g., regime) as the factor
What does this SPSS output show in terms of independence of covariate and exp effect (IV)?
age is covariate (treated as DV) , regime is IV - (2)
- The p-value is not significant (p = 0.528), so there is no significant difference in age across the training regimes
- and so the covariate and the independent variable are assumed to be independent.
What do positive and negative b-values for covariates in the ANCOVA parameter estimates box indicate? - (2)
If the b-value for the covariate is positive then it means that the covariate and the outcome variable have a positive relationship
If the b-value is negative it means the opposite: that the covariate and the outcome variable have a negative relationship
What does this table of parameter estimates show for ANCOVA where..
DV = Pp's Libido, IV = Dose of Viagra, Covariate is Partner's Libido - (3)
- b for covariate is 0.416
- Other things being equal, if a partner's libido increases by one unit, then the person's libido should
increase by just 0.416 units - since b is positive, the partner's libido has a positive relationship with the pp's libido
How is DF calculated for these t-tests in ANCOVA table? - (2)
N - p -1
N is total sample size, p is number of predictors (2 dummy variables and covariate )
What post-hoc tests can you do with ANCOVA? - (3)
- Tukey LSD with no adjustments (not recommended)
- Bonferroni correction (recommended)
- Sidak correction
The sidak correction is similar to what correction?
Bonferroni correction
Sidak correction is less conservative than
Bonferroni correction
The Sidak correction should be selected if you are concerned about
loss of power associated with Bonferroni corrected values.
What does these planned contrast results show in ANCOVA?
DV = Pp's Libido, IV = Dose of Viagra, Covariate is Partner's Libido -
IV Dose: Level 3 = high dose, level 2 = low dose, level 1 = placebo
(3)
- Contrast 1, comparing level 2 (low dose) against level 1 (placebo), is significant (p = 0.045)
- Contrast 2, comparing level 3 (high dose) with level 1 (placebo), is significant (p = 0.010)
What does this Sidak correction post-hoc comparison in ANCOVA output show?
DV = Libido, IV = Dose of Viagra, Covariate is Partner's Libido -
IV Dose: Level 3 = high dose, level 2 = low dose, level 1 = placebo
- (3)
- The significant difference between the high-dose and placebo groups remains (p = .030)
- high-dose and low-dose groups do not significantly differ (p = .93)
- Low dose and placebo groups do not significantly differ (p value = 0.130)
What do these scatterplots of regression lines show in terms of homogeneity of regression slopes?
DV = Libido, IV = Dose of Viagra, Covariate is Partner's Libido -
IV Dose: Level 3 = high dose, level 2 = low dose, level 1 = placebo
(3)
For placebo and low dose there appears to be a positive relationship between the pp's libido and that of their partner
However, in the high-dose condition there appears to be no relationship at all between the participant's libido and that of their partner - if anything, a slightly negative one
This casts doubt on whether homogeneity of regression slopes is satisfied, as not all the slopes are the same (they do not all go in the same direction)
What effect sizes can we use for ANCOVA/ANOVA? - (4)
- eta-squared
- partial-eta squared (ANCOVA)
- omega squared = used when there is an equal number of pps in each grp
- r
How is eta-squared calculated?
Dividing the effect of interest SSM by total variance in the data SST
How is partial eta-squared calculated for ANCOVA?
SS Effect/ SS Effect + SS Residual
What is the difference between partial and eta-squared?
This differs from eta squared in that it looks
not at the proportion of total variance that a variable explains, but at the proportion of variance that a variable explains that is not explained by other variables in the analysis
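A small numeric sketch of that difference, using invented sums of squares for a model with two effects:

```python
# Invented sums of squares: the effect of interest, another effect, and error
ss_effect, ss_other, ss_error = 830.2, 500.0, 2433.4
ss_total = ss_effect + ss_other + ss_error

eta_sq = ss_effect / ss_total                        # share of ALL variance
partial_eta_sq = ss_effect / (ss_effect + ss_error)  # other effects excluded
print(round(eta_sq, 2), round(partial_eta_sq, 2))    # partial is the larger
```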
What test is used to investigate this question and how is it conducted? - (2)
We want to know whether or not studying technique (3 levels) has an impact on exam scores,but we want to account for the grade that the student already has in the class.
- ANCOVA
- ANCOVA is conducted to determine if there is a statistically significant difference between different studying techniques (IV) on exam score (DV) after controlling for current grade (covariate)
In ANCOVA, we partition the total variance into
variance explained by the IV, variance explained by the covariate, and unexplained error variance
In ANCOVA, examine influence of categorical IVs on DV while removing the effect of
covariate factor(s)
In ANCOVA, the covariate correlates with the … but not the ..
correlates with outcome DV but not with IV
What is an example of covariate?
baseline pre-test scores can be used as a covariate to control for initial group differences in test performance
In ANCOVA, the IVs, covariates and DVs are.. - (3)
- IVs are categorical
- Covariates are metric (quantitative) and independent of the IVs
- DV is metric
In ANCOVA, you have - (2)
1 DV: Continuous
2 predictor variables with 2 or more levels that are categorical and continuous
What is an example of continuous? - (3)
infinite number of possible values the variable can take on
e.g., interval = equal intervals on the variable represent equal differences in what is measured, like the difference between 600ms and 800ms equals the difference between 1300ms and 1500ms
e.g., ratio = same as interval but with a clear definition of 0, like height or weight
What is example of categorical variable? - (3)
A variable that cannot take on all values within the limits of the variable - entities are divided into distinct categories
e.g., nominal = 2 or more categories, e.g., whether someone is vegan or vegetarian
e.g., ordinal = categories have a logical order, like whether people got a fail, pass, merit or distinction
What does independence of covariate mean in ANCOVA?
Independence of the covariate and treatment effect means that the categorical predictors and the covariate should not be dependent on each other
What does homogenity of regression slopes mean in ANCOVA?
Homogeneity of regression slopes means that the covariate has a similar relationship with the outcome measure, irrespective of the level of the categorical variable - in this case the group
For violations of homogeneity of regression slopes in ANCOVA, there are
alternative, somewhat more advanced, methods to account for such differences; the differences are not, in general, uninteresting, but for the ANCOVA analysis they do present an issue
In ANCOVA between-subjects effects, we quote DF, e.g. for dose, as…
Quote df for the effect and error, e.g. 2,26
In ANCOVA, adjusted means table in SPSS shows.. - (2)
outcome/DV = happiness measure ranging from 0 to 10 (as happy as I can imagine) = continuous = interval
The fixed factor (IV) is dose of therapy: people have either 15 minutes or 30 minutes of puppy therapy
Covariate is how much they love puppies = continuous = interval
The group means can be recalculated once the effect of the covariate is 'discounted' = the impact of the covariate is taken into account and the mean for each level of the predictor variable is adjusted in the mean column
These values can differ markedly from the original group means and help with interpretation.
ANCOVA is an extension of ANOVA as - (2)
- Control for covariates (continuous variables you may not necessarily want to measure)
- Study combinations of categorical and continuous variables – covariate becomes the variable of interest rather than the one you control
What ANCOVA was conducted?
We want to know whether or not studying technique has an impact on exam scores,but we want to account for the grade that the student already has in the class.
A one-way ANCOVA was conducted to determine whether there is a statistically significant difference between different study techniques on students' exam scores after controlling for their current grades.
Assumptions of ANCOVA - (8)
Independent variables should be categorical variables.
The dependent variable and covariate should be continuous variables (measured on an interval scale or ratio scale).
Make sure observations are independent - don't put people into more than one group.
Normality: the dependent variable should be roughly normal for each category of the independent variables.
Data (and regression slopes) should show homogeneity of variance.
The covariate and dependent variable (at each level of the independent variable) should be linearly related.
Your data should be homoscedastic.
The covariate and the independent variable shouldn't interact. In other words, there should be homogeneity of regression slopes.
In one-way ANOVA we partition the total variance into
variance explained by the IV (the model, SSM) and unexplained error variance (SSR)
A psychologist was interested in the effects of different fear information on children’s beliefs about an animal. Three groups of children were shown a picture of an animal that they had never seen before (a quoll). Then one group was told a negative story (in which the quoll is described as a vicious, disease-ridden bundle of nastiness that eats children’s brains), one group a positive story (in which the quoll is described as a harmless, docile creature who likes nothing more than to be stroked), and a final group weren’t told a story at all. After the story children rated how scared they would be if they met a quoll, on a scale ranging from 1 (not at all scared) to 5 (very scared indeed). To account for the natural anxiousness of each child, a questionnaire measure of trait anxiety was given to the children and used in the analysis
what analysis has been used -
Independent analysis of variance
Repeated-measures analysis of variance
Mixed analysis of variance
Analysis of covariance
Analysis of covariance (ANCOVA)
A psychologist was interested in the effects of different fear information on children’s beliefs about an animal. Three groups of children were shown a picture of an animal that they had never seen before (a quoll). Then one group was told a negative story (in which the quoll is described as a vicious, disease-ridden bundle of nastiness that eats children’s brains), one group a positive story (in which the quoll is described as a harmless, docile creature who likes nothing more than to be stroked), and a final group weren’t told a story at all. After the story children rated how scared they would be if they met a quoll, on a scale ranging from 1 (not at all scared) to 5 (very scared indeed). To account for the natural anxiousness of each child, a questionnaire measure of trait anxiety was given to the children and used in the analysis
what is the covariate?
Trait anxiety (the child's natural fear level)
Which of the designs below would be best suited for ANCOVA?
A. Participants were randomly allocated to one of two stress management therapy groups, or a waiting list control group. Their levels of stress were measured and compared after 3 months of weekly therapy sessions.
B. Participants were allocated to one of two stress management therapy groups, or a waiting list control group based on their baseline levels of stress. The researcher was interested in investigating whether stress after the therapy was successful partialling out their baseline anxiety.
C. Participants were randomly allocated to one of two stress management therapy groups, or a waiting list control group. The researcher was interested in the relationship between the therapist’s ratings of improvement and stress levels over a 3-month treatment period.
D.Participants were randomly allocated to one of two stress management therapy groups, or a waiting list control group. Their baseline levels of stress were measured before treatment, and again after 3 months of weekly therapy sessions.
(2)
D, since baseline levels of stress are used as a covariate and act as a control when looking at the impact treatment has had over the 3-month assessment
Not B, since groups were allocated based on baseline levels of stress (covariate and IV correlated - problematic); A and C are one-way independent ANOVA designs
A psychologist was interested in finding a cure for hangovers. She took 50 people out on the town one night and got them drunk. The next morning, she allocated them to either a control condition (drink water only) or an experimental hangover cure condition (a beetroot, raw egg and chilli smoothie). This is the variable ‘Group’. Two hours later she then measured how well they felt on a scale from 0 (‘I feel fine’) to 10 (‘I am about to die’)(Variable = Hangover).
She also realized she ought to ask them how drunk they were the night before and control for this in the analysis, so she measured this on another scale of 0 (‘sober’) to 10 (‘very drunk’) (Variable = Drunk). The psychologist hypothesised that the smoothie drink would lead to participants feeling better, after having accounted for the previous night’s drunkenness.
What test?
ANCOVA
A psychologist was interested in finding a cure for hangovers. She took 50 people out on the town one night and got them drunk. The next morning, she allocated them to either a control condition (drink water only) or an experimental hangover cure condition (a beetroot, raw egg and chilli smoothie). This is the variable ‘Group’. Two hours later she then measured how well they felt on a scale from 0 (‘I feel fine’) to 10 (‘I am about to die’)(Variable = Hangover).
She also realized she ought to ask them how drunk they were the night before and control for this in the analysis, so she measured this on another scale of 0 (‘sober’) to 10 (‘very drunk’) (Variable = Drunk). The psychologist hypothesised that the smoothie drink would lead to participants feeling better, after having accounted for the previous night’s drunkenness.
Identify IV (fixed), DV and covariate - (3)
- IV: Group
- DV: Hangover
- Covariate: Drunk
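For illustration, this ANCOVA could be fitted outside SPSS as a linear model with the covariate entered alongside the IV; a sketch with invented data (statsmodels assumed):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Invented data: Group (IV), Drunk (covariate), Hangover (DV)
df = pd.DataFrame({
    "Group":    ["water"] * 5 + ["smoothie"] * 5,
    "Drunk":    [6, 8, 5, 9, 7, 7, 6, 9, 8, 5],
    "Hangover": [7, 9, 6, 10, 8, 4, 3, 6, 5, 2],
})

# Covariate entered with the IV, so the Group effect is assessed
# after partialling out the previous night's drunkenness
model = ols("Hangover ~ Drunk + C(Group)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```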
What is the decision tree of choosing a two-way independent ANOVA? - (5)
Q: What sort of measurement? A: Continuous
Q:How many predictor variables? A: Two or more
Q: What type of predictor variable? A: Categorical
Q: How many levels of the categorical predictor? A: Not relevant
Q: Same or Different participants for each predictor level? A: Different
Partial eta-squared should be reported for
ANOVA and ANCOVA
What is the drawback of eta-squared?
as you add more variables to the model, the proportion explained by any one variable will automatically decrease.
How is eta-squared calculated?
Sum of squares between (squares of effect M) divided by sum of squared total (squares of everything - effects, errors and interactions)
In one-way ANOVA eta-squared and partial eta-squared will be equal, but this is not true in models with
more than one IV
Two-way Independent ANOVA is also called an
Independent Factorial ANOVA
What is a factorial design?
When experiment has two or more IVs
What are the 3 types of factorial design? - (3)
- Independent factorial design
- Repeated-measures (related) factorial design
- Mixed design
What is independent factorial design?
- There are many IVs or predictors, each measured using different pps (between groups)
What is repeated-measures (related) factorial design?
- Many IVs or predictors have been measured but same pps used in all conditions
What is mixed design?
- Many IVs or predictors have been measured; some measured with diff pps whereas others used same pps
Which design does independent factorial ANOVA use?
Independent factorial design
What is factorial ANOVA?
When we use ANOVA to analyse a situation in which there is two or more IVs
What is difference between one way and two way ANOVA?
A one-way ANOVA has one independent variable, while a two-way ANOVA has two.
Example of two-way independent factorial ANOVA
The study tested the prediction that subjective perceptions of physical attractiveness become inaccurate after drinking alcohol
- What are the IVs and DV? - (3)
IV = Alcohol - 3 levels = Placebo, Low dose, High dose
IV = Face type - 2 levels = unattractive, attractive
DV = Physical attractiveness score
Two way independent ANOVA can be fit into the idea of
linear model
The study tested the prediction that subjective perceptions of physical attractiveness become inaccurate after drinking alcohol
IV = Alcohol - 3 levels = Placebo, Low dose, High dose
IV = Face type - 2 levels = unattractive, attractive
DV = Physical attractiveness score
Create a linear model for this two-way ANOVA scenario which adds an interaction term, and explain why it is important - (3)
- The first equation models the two predictors in a way that allows them to account for variance in the outcome separately, much like a multiple regression model
- The second equation adds a term that models how the two predictor variables interact with each other to account for variance in the outcome that neither predictor can account for alone.
- The interaction is important to us because it tests our hypothesis that alcohol will have a stronger effect on the ratings of unattractive than attractive faces
How do we know the coefficients in the model are significant in two-way ANOVA?
We follow the same routine , similar to one-way ANOVA, to compute sums of squares for each factor of the model (and their interaction) and compare them to the residual sum of squares, which measures what the model cannot explain
How is two-way independent ANOVA similar to one-way ANOVA?
, we still find the total sum of squared errors (SST) and break this variance down into variance that can be explained by the experiment (SSM) and variance that cannot be explained (SSR).
How is two-way INDEPENDENT ANOVA different to one-way INDEPENDENT ANOVA? - (3)
in two-way ANOVA, the variance explained by the experiment is made up of not one experimental manipulation but two.
Therefore, we break the model sum of squares down
into variance explained by the first independent variable (SSA), variance explained by the second independent variable (SSB) and variance explained by the interaction of these two
variables (SSA × B)
How to calculate the total sum of squares SST in two-way independent ANOVA?
the same way as in one-way ANOVA: each score minus the grand mean, squared and summed over all participants
What is SST DF in two-way independent ANOVA?
N- 1
How to compute model sum of squares SSM in two-way independent ANOVA? - (2)
sum over all grps (each pairing of one level of one IV with a level of the other)
n = the number of scores in each grp, multiplied by (the mean of that grp minus the grand mean of all pps regardless of grp) squared
How to compute degrees of freedom of SSM in two-way independent ANOVA?
(g - 1), where g is the number of groups
How many groups are there in this two-way independent ANOVA research design?
IV = Alcohol - 3 levels = Placebo, Low dose, High dose
Iv = face type 2 levels = unattractive, attractive
DV = Physical attractiveness score
placebo + attractive
placebo + unattractive
low dose + attractive
low dose + unattractive
high dose + attractive
high dose + unattractive - 6 groups
How is SSA (face type) computed in two-way independent ANOVA?
IV = Alcohol - 3 levels = Placebo, Low dose, High dose
IV = Face type - 2 levels = unattractive, attractive
DV = Physical attractiveness score - (2)
considering the groups defined by the first IV (SSA) one at a time and adding them together (e.g., the grp of pps who rated attractive faces and the grp who rated unattractive faces)
number of pps in that grp multiplied by (mean of that grp minus the grand mean of all pps) squared
What are the degrees of freedom for SSA in two-way independent ANOVA?
DF = (g-1) so if male and female then 2 -1 = 1
How to compute SSB in two-way independent ANOVA for alcohol type
IV = Alcohol - 3 levels = Placebo, Low dose, High dose
IV = Face type - 2 levels = unattractive, attractive
DV = Physical attractiveness score - (3)
same formula as SSA but for the second IV
added over all grps of pps in the second IV
number of pps in one grp of the second IV multiplied by (mean score of that grp minus the grand mean of all pps regardless of grp) squared
What is DF for SSB in two-way independent ANOVA?
number of grps in second IV minus 1
SS A × B in two-way independent ANOVA calculates how much variance is explained
by the interaction of the 2 variables
How is SS A X B (interaction term) calculated in two-way ANOVA?
SS A X B = SSM - SSA - SSB
How is SS A X B’S DF calculated in two-way independent ANOVA?
df A X B = df M - df A - df B
The SSR in two-way independent ANOVA, is similar to one-way ANOVA as it represents the
individual differences in performance or the variance that can’t be explained by factors that were systematically manipulated.
How to calculate SSR in two-way independent ANOVA?
- use the individual variance of each grp (e.g., attractive face type + placebo), multiply it by one less than the number of people in that group (n - 1), do this for each group and add the results together
How to calculate SSR's DF in two-way independent ANOVA?
number of grps in the study × (number of scores per group minus 1)
Diagram of calculating mean sums of squares in two-way ANOVA independent
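The whole two-way partition (SSA, SSB, SSA × B and SSR) can be reproduced outside SPSS by fitting a linear model with both main effects and their interaction; a sketch with invented beer-goggles-style data (statsmodels assumed):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Invented data: alcohol (3 levels) x face type (2 levels), 2 scores per cell
df = pd.DataFrame({
    "alcohol": ["placebo", "low", "high"] * 4,
    "face":    ["attractive"] * 6 + ["unattractive"] * 6,
    "rating":  [6, 6, 5, 7, 6, 6, 5, 5, 7, 4, 6, 7],
})

# The * expands to both main effects plus the A x B interaction, so the
# table partitions SSM into SSA, SSB and SSA x B, with SSR as the residual
model = ols("rating ~ C(alcohol) * C(face)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```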
What effect sizes can we calculate with two-way independent ANOVA? - (2)
- Partial eta-squared
- Omega-squared if advised
What to do when assumptions are violated in factorial independent ANOVA? - (3)
- There is no simple non-parametric counterpart of factorial ANOVA
- If the assumption of normality is violated, use the robust methods described by Wilcox (implemented in R)
- If the assumption of homogeneity of variance is violated, implement corrections based on the Welch procedure
Example of a research scenario for a two-way independent ANOVA
Pick out the IVs and DV - (4)
- Independent samples design
- Two IVs, both with 2 conditions: drug type (A, B) and onset (early, late)
- One DV is cognitive performance
- Two way ANOVA
What does this two-way ANOVA independent design SPSS output show?
- Levene's test is not significant, so assume equal variances
What happens if Levene’s test is significant in two-way independent ANOVA?
steps are taken to equalise the variances through data transformation
What does this two-way independent ANOVA table show - (4)
- Drug : F(1,24) = 5.58, p = 0.027, partial eta-squared = 0.19 (large effect + sig effect)
- Onset: F(1,24) = 14.43, p = 0.001, partial eta-squared = 0.38 (large effect + sig effect)
- Interaction Drug * Onset: F(1,24) = 9.40, p = 0.005, partial eta-squared = 0.28 (large effect + sig effect)
- We got two sig main effects and sig interaction effect which are all quite large effect sizes
What does this SPSS output show for two-way independent ANOVA? - (3)
drug B has a higher score on the cognitive test than A and is a sig main effect (CI does not contain 0, and also main effect analysis)
early onset scores higher on average than late onset, a sig main effect (CI does not contain 0, and also main effect analysis)
importantly, main effects ignore the effect of the other IV, so the results for drug at the top apply regardless of early/late onset, for example; they do not tell us anything about the interaction
What does this interaction plot show TWO WAY ANOVA? - (6)
- Blue line is early onset
- Green line is late onset
- For late onset, drug B leads to higher mean scores on the test than drug A
- For early onset, drug A led to slightly higher mean scores than drug B
- Drug A was more effective than drug B for early onset, but the difference is marginal
- Drug B was substantially more effective than drug A for late onset
Non-parallel lines in an interaction plot indicate a
sig interaction effect
We can follow interactions in two-way ANOVA with simple effects analysis which - (2)
- looks at the effect of one IV at individual levels of the other IV
- sees whether the marginal/substantial differences are significant
The SSM in two-way independent ANOVA is broken down into three components:
variance explained by the first independent variable (SSA), variance explained by the second independent variable (SSB) and variance explained by the interaction of these two variables (SSA × B).
Example of difference of one-way ANOVA vs two-way ANOVA (independent) - (2)
- One-way ANOVA has one categorical IV (level of education - college degree, grad degree, high school)
- In two-way ANOVA, you have 2 categorical IVs - level of education (college degree, grad degree, high school) and zodiac sign (Libra, Pisces)
In two-way independent ANOVA, you need how many DV and IV?
1 DV and 2 or more categorical predictors
What test is used for this scenario?
A psychologist wanted to test a new type of drug treatment for ADHD called RitaloutTM. The makers of this drug claimed that it improved concentration without the side effects of the current leading brand of ADHD medication.
To test this, the psychologist allocated children with ADHD to two experimental groups, one group took RitaloutTM(New drug), the other took the current leading brand of medication (Old drug) (Variable = Drug).
To test the drugs’ effectiveness, concentration was measured using the Parker-Stone Concentration Scale, which ranges from 0 (low concentration) to 12 (high concentration) (Variable = Concentration).
In addition, the psychologist was interested in whether the effectiveness of the drug would be affected by whether children had ‘inattentive type’ ADHD or ‘hyperactive type’ ADHD (Variable = ADHD subtype).
Two-way independent ANOVA
A researcher was interested in measuring the effect of 3 different anxiety medications on patients diagnosed with anxiety disorder. They measured anxiety levels before and after treatment of 3 different treatment groups plus a control group. The researchers also collected data on depression levels.
Identify the IV, DV, and covariates! - and design (3)
IV = 3 different types of anxiety medication plus a control grp
DV: Anxiety levels after treatment of grps
Covariates = anxiety before treatment, depression levels
ANCOVA
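A minimal ANCOVA sketch in the same spirit (all data and column names below are invented for illustration): the categorical IV and the covariate enter one linear model, so the group effect is tested after partialling out baseline anxiety.

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.DataFrame({
    'group':    ['control', 'drug1', 'drug2', 'drug3'] * 5,
    'baseline': [30, 28, 32, 31, 27, 29, 33, 30, 31, 28,
                 30, 29, 32, 27, 28, 31, 29, 30, 28, 32],
    'post':     [28, 22, 20, 18, 26, 23, 21, 17, 29, 21,
                 19, 18, 27, 20, 18, 19, 28, 22, 20, 17],
})

# The covariate sits alongside the categorical IV in the model formula.
model = smf.ols('post ~ baseline + C(group)', data=df).fit()
print(anova_lm(model, typ=2))
```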
Researchers wanted to see how much people of different education levels are interested in politics. They also believed that there might be an effect of gender. They measured political interest with a questionnaire in males and females that had either school, college or university education.
Identify the IVs and DV and design - (3)
- IV: Level of education - school, college or uni edu and gender (m, f)
- DV: Political interest in questionnaire
- Two-way independent ANOVA
An experiment was done to look at whether there is an effect of both gender and the number of hours spent practising a musical instrument on the level of musical ability.
A sample of 30 participants (15 men and 15 women) who had never learnt to play a musical instrument before were recruited. Participants were randomly allocated to one of three groups that varied in the number of hours they would spend practising every day for 1 year (0 hours, 1 hours, 2 hours). Men and women were divided equally across groups.
All participants had a one-hour lesson each week over the course of the year, after which their level of musical skill was measured on a 10-point scale ranging from 0 (you can’t play for toffee) to 10 (‘Are you Mozart reincarnated?’).
Identify IVs and DV and design - (3)
- IV: Gender (m, f), number of hrs spent practising
- DV: Level of musical skill after a year
- Two-way independent ANOVA, not t-tests, since there is more than one IV
In these outputs is there an effect of gender, education, or an interaction? TWO-WAY INDEPENDENT ANOVA
- Is there an effect of gender overall?
No, F(1,54) = 1.63, p = .207
Is there an effect of education level?
Yes, F(2,54) = 147.52, p < .001
Is there an interaction effect?
Yes, F(2,54) = 4.64, p = .014
How to interpret these findings?
- Main effect of aspirin: aspirin reduces heart attacks compared to placebo (1)
- Main effect of carotene: beta carotene reduces heart attacks (2)
- Interaction effect: yes, a bigger effect when aspirin and beta carotene are taken together (3) - the more the plotted lines diverge, the stronger the interaction
WHICH STATEMENT BEST DESCRIBES A COVARIATE?
A variable that is not able to be measured directly.
A variable that shares some of the variance of another variable in which the researcher is interested.
A pair of variables that share exactly the same amount of variance of another variable in which the researcher is interested.
A variable that correlates highly with the dependent variable.
A variable that shares some of the variance of another variable in which the researcher is interested.
TWO-WAY ANOVA IS BASICALLY THE SAME AS ONE-WAY ANOVA, EXCEPT THAT:
The model sum of squares is partitioned into two parts
The residual sum of squares represents individual differences in performance
The model sum of squares is partitioned into three parts
We calculate the model sum of squares by looking at the difference between each group mean and the overall mean
C. The model sum of squares is partitioned into three parts
The model sum of squares is partitioned into the effect of each of the independent variables and the effect of how these variables interact (see Section 13.2.7)
D is also true, but it applies to both one-way and two-way ANOVA (see Section 13.2.7).
IF WE WERE TO RUN A FOUR-WAY BETWEEN-GROUPS ANOVA, HOW MANY SOURCES OF VARIANCE WOULD THERE BE?
4
16
12
15
16: with 4 IVs there are 2^4 − 1 = 15 model effects (4 main effects, 6 two-way, 4 three-way and 1 four-way interaction), plus the residual, giving 16 sources of variance (by the same logic a two-way ANOVA would have 2^2 = 4: two main effects, one interaction and the residual)
Which of the following sentences best describes a covariate?
A. A variable that shares some of the variance of another variable in which the researcher is interested.
B. A variable that correlates highly with the dependent variable
C. A variable that is not able to be measured directly
D. A pair of variables that share exactly the same amount of variance of another variable in which the researcher is interested
A
An experiment was done to look at whether there is an effect of both gender and the number of hours spent practising a musical instrument on the level of musical ability.
A sample of 30 participants (15 men and 15 women) who had never learnt to play a musical instrument before were recruited. Participants were randomly allocated to one of three groups that varied in the number of hours they would spend practising every day for 1 year (0 hours, 1 hours, 2 hours). Men and women were divided equally across groups.
All participants had a one-hour lesson each week over the course of the year, after which their level of musical skill was measured on a 10-point scale ranging from 0 (you can’t play for toffee) to 10 (‘Are you Mozart reincarnated?’).
A. Two-way independent ANOVA
B. Two-way repeated ANOVA
C. Three-way ANOVA = no, there are only 2 IVs
D. T-test
A
Which of the designs below would be best suited to ANCOVA?
A. Participants were randomly allocated to one of two stress management therapy groups, or a waiting list control group. Their baseline levels of stress were measured before treatment, and again after 3 months of weekly therapy sessions.
B. Participants were randomly allocated to one of two stress management therapy groups, or a waiting list control group. Their levels of stress were measured and compared after 3 months of weekly therapy sessions.
C. Participants were randomly allocated to one of two stress management therapy groups, or a waiting list control group. The researcher was interested in the relationship between the therapist's ratings of improvement and stress levels over a 3-month treatment period.
D. Participants were allocated to one of two stress management therapy groups, or a waiting list control group based on their baseline levels of stress. The researcher was interested in investigating whether stress after the therapy was successful, partialling out their baseline anxiety.
A - baseline levels of stress used as covariate
We can use the baseline, pre-treatment measures as a control when looking at the impact the treatment has on the 3-month assessment.
A music teacher had noticed that some students went to pieces during exams. He wanted to test whether this performance anxiety was different for people playing different instruments. He took groups of guitarists, drummers and pianists (variable = 'Instru') and measured their anxiety (variable = 'Anxiety') during the exam. He also noted the type of exam they were performing (in the UK, musical instrument exams are known as 'grades' and range from 1 to 8). He wanted to see whether the type of instrument played affected performance anxiety when accounting for the grade of the exam. Which of the following statements best reflects what the effect of 'Instru' in the output table below tells us?
(Hint: ANCOVA looks at the relationship between an independent and dependent variable, taking into account the effect of a covariate.)
A. The type of instrument played in the exam had a significant effect on the level of anxiety experienced, even after the effect of the grade of the exam had been accounted for
B. The type of instrument played in the exam had a significant effect on the level of anxiety experienced
C. The type of instrument played in the exam did not have a significant effect on the level of anxiety experienced
A
A psychologist was interested in the effects of different fear information on children's beliefs about an animal. Three groups of children were shown a picture of an animal that they had never seen before (a quoll). Then one group was told a negative story (in which the quoll is described as a vicious, disease-ridden bundle of nastiness that eats children's brains), one group a positive story (in which the quoll is described as a harmless, docile creature who likes nothing more than to be stroked), and a final group weren't told a story at all. After the story children rated how scared they would be if they met a quoll, on a scale ranging from 1 (not at all scared) to 5 (very scared indeed). To account for the natural anxiousness of each child, a questionnaire measure of trait anxiety was given to the children and used in the analysis. The SPSS output is below.
What analysis has been used?
(Hint: The analysis is looking at the effects of fear information on children's beliefs about an animal, taking into account children's natural fear levels.)
A. ANCOVA
B. Independent analysis of variance
C. Repeated measures analysis of variance
A
Imagine we wanted to investigate the effects of three different conflict styles (avoiding, compromising and competing) on relationship satisfaction, but we discover that relationship satisfaction is known to covary with self-esteem. Which of the following questions would be appropriate for this analysis?
A. What would the mean relationship satisfaction be for the three conflict style groups, if their levels of self-esteem were held constant?
B. What would the mean relationship satisfaction be if levels of self-esteem were held constant?
C. What would the mean self-esteem score be for the three groups if their levels of relationship satisfaction were held constant?
D. Does relationship satisfaction have a significant effect on the relationship between conflict style and self-esteem?
A
A study was conducted to look at whether caffeine improves productivity at work in different conditions. There were two independent variables. The first independent variable was email, which had two levels: ‘email access’ and ‘no email access’. The second independent variable was caffeine, which also had two levels: ‘caffeinated drink’ and ‘decaffeinated drink’. Different participants took part in each condition. Productivity was recorded at the end of the day on a scale of 0 (I may as well have stayed in bed) to 20 (wow! I got enough work done today to last all year). Looking at the group means in the table below, which of the following statements best describes the data?
A. A significant interaction effect is likely to be present between caffeine consumption and email access.
B. There is likely to be a significant main effect of caffeine.
C. The effect of email is relatively unaffected by whether the drink was caffeinated.
D. The effect of caffeine is about the same regardless of whether the person had email access.
A = for decaffeinated drinks there is little difference between email and no email, but for caffeinated drinks there is a large difference
What are the two main reasons for including covariates in ANOVA?
A. 1. To reduce within-group error variance
2. Elimination of confounds
B. 1. To increase within-group error variance
2. To reduce between-group error variance
C. 1. To increase within-group error variance
2. To correct the means for the covariate
D. 1. To increase between-group variance
2. To reduce within-group error variance
A
A psychologist was interested in the effects of different fear information on children’s beliefs about an animal. Three groups of children were shown a picture of an animal that they had never seen before (a quoll). Then one group was told a negative story (in which the quoll is described as a vicious, disease-ridden bundle of nastiness that eats children’s brains), one group a positive story (in which the quoll is described as a harmless, docile creature who likes nothing more than to be stroked), and a final group weren’t told a story at all. After the story children rated how scared they would be if they met a quoll, on a scale ranging from 1 (not at all scared) to 5 (very scared indeed). To account for the natural anxiousness of each child, a questionnaire measure of trait anxiety was given to the children and used in the analysis. Which of the following statements best reflects what the ‘pairwise comparisons’ tell us?
A. Fear beliefs were significantly higher after negative information compared to positive information and no information, and fear beliefs were not significantly different after positive information compared to no information.
B. Fear beliefs were significantly lower after positive information compared to negative information and no information; fear beliefs were not significantly different after negative information compared to no information.
C. Fear beliefs were significantly higher after negative information compared to positive information; fear beliefs were significantly lower after positive information compared to no information.
D. Fear beliefs were all about the same after different types of information.
A
An experiment was done to look at whether there is an effect of both the number of hours spent practising a musical instrument and gender on the level of musical ability. A sample of 30 (15 men and 15 women) participants who had never learnt to play a musical instrument before were recruited. Participants were randomly allocated to one of three groups that varied in the number of hours they would spend practising every day for 1 year (0 hours, 1 hours, 2 hours). Men and women were divided equally across groups. All participants had a one-hour lesson each week over the course of the year, after which their level of musical skill was measured on a 10-point scale ranging from 0 (you can't play for toffee) to 10 ('Are you Mozart reincarnated?'). An ANOVA was conducted on the data from the experiment. Which of the following sentences best describes the pattern of results shown in the graph?
A. The graph shows that the relationship between musical skill and time spent practising was different for men and women.
B. The graph shows that the relationship between musical skill and time spent practising was the same for men and women.
C. The graph indicates that men and women were most musically skilled when they practised for 2 hours per day.
D. Women were more musically skilled than men.
A
What is the decision tree for choosing one-way repeated measures ANOVA? - (5)
Q: What sort of measurement? A: Continuous
Q: How many predictor variables? A: One IV
Q: What type of predictor variable? A: Categorical
Q: How many levels of the categorical predictor? More than two
Q: Same or Different participants for each predictor level? A: Same
The assumption of sphericity in within-subject design ANOVA can be likened to
the assumption of homogeneity of variance in
between-group ANOVA
Sphericity is sometimes denoted as IN REPEATED ANOVA
ε or circularity
What does sphericity refer to in repeated ANOVA?
equality of variances of the differences between treatment levels.
you need at least … conditions for sphericity to be an issue in repeated ANOVA
three
How is sphericity assessed in this dataset? (USED IN REPEATED ANOVA)
How is sphericity calculated? - (2) REPEATED ANOVA
- Calculating the differences between pairs of scores for all treatment levels, e.g., A-B, A-C, B-C
- Calculating the variances of these differences, e.g., variances of A-B, A-C, B-C; see the sketch below
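A small by-hand sketch of this calculation with numpy, using made-up scores (five participants, three conditions):

```python
# By-hand sphericity check: variance of the differences for each pair of
# conditions. The scores below are hypothetical.
import numpy as np

# rows = participants, columns = conditions A, B, C
scores = np.array([
    [10, 12, 8],
    [15, 15, 12],
    [25, 30, 20],
    [35, 30, 28],
    [30, 27, 20],
])

for label, (i, j) in {'A-B': (0, 1), 'A-C': (0, 2), 'B-C': (1, 2)}.items():
    diffs = scores[:, i] - scores[:, j]
    print(f'variance of {label} differences: {diffs.var(ddof=1):.1f}')

# Sphericity holds when these variances are roughly equal; Mauchly's test
# formalises this check.
```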
What does the data from the table show in terms of the assumption of sphericity (calculated by hand) REPEATED ANOVA? - (3)
there is some deviation from sphericity because the variance of the differences between conditions A and B (15.7) is greater than the variance of the differences
between A and C (10.3) and between B and C (10.3).
However, these data have local circularity (or local sphericity) because two of the variances of differences are identical.
The deviation from sphericity does not seem too severe (all variances are roughly equal), but we need to assess whether the deviation is severe enough to warrant action
How to assess the assumption of sphericity in SPSS? REPEATED ANOVA
via Mauchly’s test
If Mauchly's test statistic is significant (p < 0.05) then REPEATED ANOVA
the variances of the differences between conditions are significantly different - we must be wary of the F-ratios produced by the computer
If Mauchly's test statistic is non-significant (p > 0.05) then it is reasonable to conclude that the REPEATED ANOVA
variances of the differences between conditions are equal and do not significantly differ
Significance of Mauchly's test REPEATED ANOVA is dependent on
sample size
Example of the significance of Mauchly's test depending on sample size REPEATED ANOVA - (2)
in big samples small deviations from sphericity can be
significant,
in small samples large violations can be non-significant
What happens if the data violates the sphericity assumption? REPEATED ANOVA - (2)
several corrections that can be applied to
produce a valid F-ratio
or
use multivariate test statistics (MANOVA)
What corrections can be applied to produce a valid F-ratio when the data violates sphericity REPEATED ANOVA? - (2)
- Greenhouse-Geisser correction ε
- Huynh-Feldt correction
The Greenhouse-Geisser correction ε varies between REPEATED ANOVA
1/(k − 1) (where k is the number of repeated-measures conditions) and 1
The closer the Greenhouse-Geisser correction is to 1, the REPEATED ANOVA
more homogeneous the variances of differences, and hence the closer the data are to being spherical.
How to calculate the lower-bound estimate of sphericity for the Greenhouse-Geisser correction when there are 5 conditions REPEATED ANOVA? - (2)
The lower limit of ε̂ is 1/(k − 1), where k is the number of repeated-measures conditions
so… 1/(5 − 1) = 1/4 = 0.25
Huynh and Feldt (1976) reported that when the sphericity estimate is greater than 0.75, the
Greenhouse-Geisser correction is too conservative
Huynh-Feldt correction is less conservative than
Greenhouse-Geisser correction
Why is MANOVA used when data violates sphericity IN REPEATED ANOVA?
MANOVA is not dependent upon the assumption of sphericity
In repeated-measures ANOVA, the effect of our experiment shows up in the within-participant variance rather than
the between-group variance
In independent ANOVA, the within-group variance is our … and it is not contaminated by … - (2)
residual variance (SSR) = variance produced by individual differences in performance
SSR is not contaminated by the experimental effect, because each condition is carried out by different people
In repeated-measures ANOVA, the within-participant variability is made up of
the effect of experimental manipulation SSM and individual differences in performance (random factors outside of our control) - this is error SSR
Similar to independent ANOVA, repeated-measures ANOVA uses the F-ratio to - (2)
compare the size of the variation due to our experimental manipulations to the size of the variation due to random factors
it has the same types of variance as independent ANOVA - a total sum of squares (SST), a model sum of squares (SSM) and a residual sum of squares (SSR)
What is the difference between independent ANOVA and repeated-measures ANOVA?
In repeated-measures ANOVA, the model and residual sums of squares are both part of the within-participant variance.
In repeated-measures ANOVA,
if the variance due to our manipulations is big relative to
the variation due to random factors, we get a … and conclude - (2)
a big value of the F-ratio
we can conclude that the observed results are unlikely to have occurred if there was no effect in the population.
To compute F-ratios we first compute the sum of squares which is the following REPEATED ANOVA… - (5)
- SST
- SSB
- SSW
- SSM
- SSR
How is SST calculated in one-way repeated-measures ANOVA? REPEATED ANOVA
SST = grand variance × (N − 1)
What is the DF of SST? REPEATED ANOVA
N-1
The SSW (within-participant) sum of squares is calculated in one-way repeated ANOVA by…
square of the standard deviation of each participant’s scores multiplied by the number of conditions minus 1, summed over all participants.
What is the DF of SSW of one-way repeated ANOVA? - (2)
DF = N(n-1)
number of participants multiplied by the number of conditions minus 1;
How is SSM calculated in one-way repeated ANOVA? - (2)
square of the differences between the mean of the participant scores for each condition and the grand mean multiplied by the number of participants tested, summed over all conditions.
do this for each condition grp
What is the DF of SSM in one-way repeated ANOVA? - (2)
DF = n-1
n is number of conditions
How is SSR calculated in one-way repeated ANOVA?
the difference between the within-participant sum of squares and the sum of squares for the model.
What is the DF for SSR in one-way repeated ANOVA?
DF of SSW minus DF of SSM (see the worked sketch below)
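Putting the four sums of squares and their degrees of freedom together, here is a worked numpy sketch on hypothetical scores (five participants, three conditions); the variable names are mine, not SPSS's:

```python
import numpy as np

# rows = participants, columns = conditions
scores = np.array([
    [10, 12, 8],
    [15, 15, 12],
    [25, 30, 20],
    [35, 30, 28],
    [30, 27, 20],
], dtype=float)

N, n = scores.shape          # N participants, n conditions
grand_mean = scores.mean()

sst = scores.var(ddof=1) * (scores.size - 1)               # grand variance x (total N - 1)
ssw = (scores.var(axis=1, ddof=1) * (n - 1)).sum()         # within-participant variation
ssm = (N * (scores.mean(axis=0) - grand_mean) ** 2).sum()  # model: condition means vs grand mean
ssr = ssw - ssm                                            # residual = SSW - SSM

df_m = n - 1                 # model DF
df_r = N * (n - 1) - df_m    # residual DF = DF of SSW minus DF of SSM
f_ratio = (ssm / df_m) / (ssr / df_r)
print(f'SST={sst:.1f} SSW={ssw:.1f} SSM={ssm:.1f} SSR={ssr:.1f} F={f_ratio:.2f}')
```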
How do we calculate the mean squares (MSM) and residual mean squares (MSR) to calculate the F-ratio in one-way repeated ANOVA?
MSM = SSM/dfM and MSR = SSR/dfR; the F-ratio is F = MSM/MSR (this is what the sketch above computes)
We don't need to use SSB (between-subject variation) to calculate the F-ratio in
one-way repeated ANOVA
What does SSB represent in one-way repeated ANOVA?
individual differences between cases
Not only does a lack of sphericity produce problems for F in repeated-measures ANOVA, it also causes complications for
post-hoc tests
When sphericity is violated in one-way repeated ANOVA, what post-hoc test should be used, and why? - (2)
Bonferroni method seems to be generally the
most robust of the univariate techniques,
especially in terms of power and control of the Type I error rate.
When sphericity is not violated in one-way repeated ANOVA, what post-hoc test can be used?
Tukey can be used
In either case, whether sphericity is violated or not in one-way repeated ANOVA, there is a post-hoc test called the - (2)
Games–Howell procedure, which uses a pooled error term, and
it may be preferable to Tukey's test.
Due to the complications of sphericity in one-way repeated ANOVA,
the standard post hoc tests used for independent designs are not available for repeated-measures designs
Why is a repeated contrast useful in repeated-measures designs, especially one-way repeated measures?
when the levels of the independent variable have a meaningful order, e.g., the DV was measured at successive time points or increasing doses of a drug were administered
When should the Sidak correction be selected as the post hoc for one-way repeated ANOVA? (see the sketch below)
when concerned about the loss
of power associated with Bonferroni-corrected values.
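A quick sketch of where that extra power comes from (the number of comparisons is an assumed example): the Sidak-corrected per-comparison alpha is slightly larger than Bonferroni's.

```python
# Compare Bonferroni and Sidak per-comparison alpha levels.
m, alpha = 6, 0.05                      # e.g., 6 pairwise comparisons among 4 conditions
bonferroni = alpha / m
sidak = 1 - (1 - alpha) ** (1 / m)
print(f'Bonferroni: {bonferroni:.5f}  Sidak: {sidak:.5f}')
# Bonferroni: 0.00833  Sidak: 0.00851  (Sidak is slightly less strict)
```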
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA
what do these SPSS outputs show? - (2)
- Left shows the variables representing each level of the IV, which is animal
- Right shows descriptive statistics - the highest mean time to retch was when celebrities ate the stick insect (8.12)
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - ONE-WAY repeated ANOVA
What does this Mauchly's Test of Sphericity show? - (2)
- The p-value is 0.047, which is less than 0.05
- Thus, reject the assumption of sphericity that the variances of the differences between levels are equal
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub)
What to do if this Mauchly's Test of Sphericity shows the assumption of sphericity is violated? - (3)
one-way repeated ANOVA
- Since there are 4 conditions, the lower limit of ε̂ is 1/(4 − 1) = 0.333 (the lower-bound estimate in the table)
- The SPSS output shows that the calculated value of ε̂ is 0.533
- 0.533 is closer to the lower limit of 0.33 than to the upper limit of 1, and therefore represents a substantial deviation from sphericity
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA
What does this main ANOVA table show with sphericity assumed? - (2)
- The value of F = 3.97, which is compared against a critical value for 3 and 21 DF, and the p-value is 0.026
- We conclude there is a significant difference between the 4 animals in their capacity to induce retching when eaten
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA
What has changed and what has stayed the same in the table? - (2)
- The F-ratios are the same across the rows
- The DF has changed, as has the critical value the F-statistic is compared with
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA
How are the adjustments made to the DF?
- The adjustment is made by multiplying the DF by the estimate of sphericity.
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA
What do the results show in terms of the Greenhouse-Geisser and Huynh-Feldt corrections? - (3)
- The observed F-statistic is not significant using Greenhouse-Geisser (p > 0.05)
- Greenhouse-Geisser is quite conservative and can miss true effects that exist
- Huynh-Feldt, however, showed the F-statistic is still significant, with a p-value of 0.048
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA
What happens if Greenhouse-Geisser is non-significant (p > 0.05) and Huynh-Feldt is significant in this example? - (2)
- Take the average of the two corrected p-values, e.g., (0.063 + 0.048)/2 = 0.056
- Since the average is non-significant, go with the Greenhouse-Geisser correction and conclude the F-ratio is non-significant
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA
If the two corrections (Greenhouse-Geisser and Huynh-Feldt) give the same conclusion, then you can choose which one to
report
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA
It is important to use a valid critical value of F - choosing which p-value to report potentially makes the difference between making a
Type 1 error (false positive) or not
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA
what does this summary table of repeated contrasts show? - (3)
Level 1 vs 2 is stick insect vs kangaroo testicle
Level 2 vs 3 is kangaroo testicle vs fish eyeball
Level 3 vs 4 is fish eyeball vs witchetty grub
- Celebrities took significantly longer to retch after eating the stick insect compared to the kangaroo testicle (Level 1 vs. Level 2), p = 0.002
- Time taken to retch was not significantly different for Level 2 vs 3 or Level 3 vs 4
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA
If the main effect is not significant in the main ANOVA table for these data, then significant contrasts in the table below should be… but if the MANOVA was significant then… - (2)
ignored
we would be inclined to conclude the main effect of animal was significant and proceed with further tests like contrasts
What IV, DV, design and test would you use for this research scenario? - (4)
- Repeated measures design
- One IV (Incentive) , four conditions (week 1, week 2, week 3, week 4)
- One DV (Sales Generated)
- One-way repeated ANOVA
What does the LSD correction (post-hoc option in SPSS) do?
It does not actually make any adjustment to the p-value (critical value), unlike what a post-hoc correction should do
What does output show? - (3)
- Repeated measures design
- One IV (Incentive) , four conditions (week 1, week 2, week 3, week 4)
- One DV (Sales Generated)
- One-way repeated ANOVA
- Sales are increasing across the weeks
- Week 1 starts at 427.93 pounds and sales gradually rise to 642.28 pounds by week 4
- It looks like the incentives are having an effect and seem to generate higher sales
What does this output show in terms of Mauchly's Test of Sphericity? - (2)
- Repeated measures design
- One IV (Incentive) , four conditions (week 1, week 2, week 3, week 4)
- One DV (Sales Generated)
- One-way repeated ANOVA
- The p-value is not significant (p = 0.080)
- The assumption of sphericity is satisfied, so we have equal variances of the differences between conditions
If Mauchly's test of sphericity is not significant in one-way repeated ANOVA, which line do we use in the main ANOVA table? The 'Sphericity Assumed' row
If Mauchly's test of sphericity is significant in one-way repeated ANOVA, which line do we use in the main ANOVA table? A corrected row - Greenhouse-Geisser (or Huynh-Feldt)
What does this main ANOVA table show? - (3)
- Repeated measures design
- One IV (Incentive) , four conditions (week 1, week 2, week 3, week 4)
- One DV (Sales Generated)
- One-way repeated ANOVA
- DF for week is 3 and 57 (sphericity assumed rows for week and error)
- Week: F(3,57) = 26.30, p < 0.001 (p = 0.000), eta-squared = 0.58 - a large effect
- There is an overall effect: sales change across the weeks
What do this Sidak correction table and table of means show you in this output? - (6)
- Repeated measures design
- One IV (Incentive) , four conditions (week 1, week 2, week 3, week 4)
- One DV (Sales Generated)
- One-way repeated ANOVA
- No sig difference between W1 and W2
- Sig difference between W1 and W3 = higher sales in W3 (538.570) compared to W1 (427.933)
- Sig difference between W1 and W4 = higher sales in W4 (642.284) compared to W1 (427.933)
- No sig difference between W2 and W3
- Sig difference between W2 and W4 = higher sales in W4 (642.284) than W2 (481.388)
- Sig difference between W3 and W4 = higher sales in W4 (642.284) than W3 (538.570)
What does this output show in terms of repeated contrasts? - (3)
- Repeated measures design
- One IV (Incentive) , four conditions (week 1, week 2, week 3, week 4)
- One DV (Sales Generated)
- One-way repeated ANOVA
- Did sales increase from W1 to W2? p = 0.010, significant
- Did sales increase from W2 to W3? p = 0.030, significant
- Did sales increase from W3 to W4? p = 0.008, significant
What happens if the post hocs and contrasts are telling different stories? - the contrasts say there was a weekly increase (W1 to W2, W2 to W3 and W3 to W4 all significant), but the post hocs say W1 to W3 and W1 to W4 were significant increases while W2 to W3 was not - (2)
- Repeated measures design
- One IV (Incentive) , four conditions (week 1, week 2, week 3, week 4)
- One DV (Sales Generated)
- One-way repeated ANOVA
- The post hocs lack power because of the many multiple comparisons
- By limiting the number of comparisons, the contrasts get around this problem
Diagram of writing up one-way repeated ANOVA
Two-way repeated ANOVA involves
two IVs (i.e., more than one), both measured on the same participants
What does four-way ANOVA mean?
4 different IV
What does a 2x3 ANOVA mean? - (2)
- One IV with 2 levels
- One IV with 3 levels
What design, IV, DV and test would you use to investigate the following scenario? - (4)
- Repeated measures design
- Two IVs: alcohol (3 conditions) and sleep (2 conditions)
- DV: Reaction Times
- Two-way repeated measures ANOVA
What does this two-way repeated ANOVA SPSS output show? - (2)
- Repeated measures design
- Two IVs: alcohol (3 conditions) and sleep (2 conditions)
- DV: Reaction Times
- Two-way repeated measures ANOVA
- Larger numbers for RT mean slower RTs
- Alcohol seems to have an effect on RT, particularly for 2 pints + no sleep
What does this two-way repeated ANOVA SPSS output show for Mauchly’s Test of Sphericity? - (2)
- Repeated measures design
- Two IVs: alcohol (3 conditions) and sleep (2 conditions)
- DV: Reaction Times
- Two-way repeated measures ANOVA
- Two p-values: alcohol (p = 0.00) and alcohol * sleep [the interaction effect] (p = 0.00) -> significant, so the assumption of sphericity is violated and we report the Greenhouse-Geisser values from the main ANOVA table
- There is no p-value for sleep, as it has only 2 conditions and the test of sphericity needs more than 2
What does this two-way repeated ANOVA main table show? - (3)
- Repeated measures design
- Two IVs: alcohol (3 conditions) and sleep (2 conditions)
- DV: Reaction Times
- Two-way repeated measures ANOVA
- Error DF was 38.
- Test of sphericity was sig -> assumption violated
- Main sig effect of alcohol: F(1.16,22.06) = 51.38, p < 0.001, partial eta-squared = 0.73
- Main sig effect of sleep: F(1,19) = 88.61, p < 0.001, partial-eta-squared = 0.82
- Interaction effect: F(1.15,21.91) = 23.36, p < 0.001, partial-eta squared = 0.55
What does this two-way repeated ANOVA output show in the post hocs? - Sidak correction - (4)
- Repeated measures design
- Two IVs: alcohol (3 conditions) and sleep (2 conditions)
- DV: Reaction Times
- Two-way repeated measures ANOVA
- Condition 1 vs condition 2 was significant
- Condition 1 vs condition 3 was significant
- Condition 2 vs condition 3 was significant
- So all groups differ significantly from each other, and we interpret that higher doses of alcohol have more impact on RT
What does this two-way repeated ANOVA interaction plot show? - (3)
- Repeated measures design
- Two IVs: alcohol (3 conditions) and sleep (2 conditions)
- DV: Reaction Times
- Two-way repeated measures ANOVA
- An interaction effect is present = as the lines continue they cross
- The most pronounced effect was in alcohol group 3 (2 pints)
- When alcohol group 3 had a full night's sleep (2), RT was impaired only very slightly
- When alcohol group 3 had sleep deprivation (1) in combination with 2 pints, RT was impaired by a lot -> use simple effects analysis, as well as a two-way independent ANOVA, to see if the difference in group 3 between the blue and green lines is sig
What happens when assumptions are violated in repeated-measures ANOVA? - (2)
We can do a non-parametric test called Friedman's ANOVA if there is only one IV (see the sketch below)
There is no non-parametric counterpart for repeated designs with more than one IV
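A minimal sketch of Friedman's ANOVA using scipy; the three condition lists are hypothetical scores from the same five participants:

```python
# Friedman's ANOVA: non-parametric fallback for one-way repeated measures.
from scipy.stats import friedmanchisquare

cond_a = [10, 15, 25, 35, 30]
cond_b = [12, 15, 30, 30, 27]
cond_c = [8, 12, 20, 28, 20]

stat, p = friedmanchisquare(cond_a, cond_b, cond_c)
print(f'Friedman chi-squared = {stat:.2f}, p = {p:.3f}')
```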
Assumption of repeated measures ANOVA - (3)
- Normal distribution
- Repeated measures design (same participants)
- Sphericity - Mauchly's test
What does significant Mauchly’s test signify in repeated measures? - (2)
A significant effect means that corrections need to be made later on
Those corrections are listed in the main ANOVA output table
What is the decision tree for two-way repeated ANOVA?
1 continuous DV and 2 or more categorical predictors, each with 2 or more levels, with the same participants at each predictor level
What is decision tree for one-way repeated ANOVA? - (3)
1 continuous DV
1 Predictor categorical with more than 2 levels
Same participants in each predictor level
Just like independent-measures designs, there can be more than one categorical predictor.
When all participants take part in all combinations of those predictors, we have a repeated measures factorial design and can use an ANOVA to test for
significant main effects and interactions
Example of two-way repeated ANOVA - (3)
The variables are the type of drink (Beer - Wine - Water) and the type of imagery used in the advertisement (positive - negative - neutral)
The outcome is how much the participant likes the beverage on a scale from -100 (dislike very much) to 100 (like very much)
Participants underwent all of the conditions (see the sketch below)
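A minimal sketch of this drink x imagery analysis with statsmodels' AnovaRM; the long-format data frame, its column names and the random liking scores are all invented for illustration:

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
rows = [
    {'subject': s, 'drink': d, 'imagery': i,
     'liking': rng.integers(-100, 101)}   # placeholder ratings on the -100..100 scale
    for s in range(1, 11)
    for d in ['beer', 'wine', 'water']
    for i in ['positive', 'negative', 'neutral']
]
df = pd.DataFrame(rows)

# Both IVs go in 'within' because the same participants did every combination.
res = AnovaRM(df, depvar='liking', subject='subject',
              within=['drink', 'imagery']).fit()
print(res)
```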
Equation of variance
What is mixed design? - (2)
A mixture of between-subject and within-subject
Several independent variables or predictors have been measured; some have been measured with different entities, (pps) whereas others used the same entities (pps)
You will need at least two IVs for
a mixed design
What is decision tree for mixed design ANOVA? - (7)
Q: What sort of measurement? A: Continuous
Q: How many predictor variables? A: Two or more
Q: What type of predictor variable? A: Categorical
Q: How many levels of the categorical predictor? A: Not relevant
Q: Same or Different participants for each predictor level? A: Both
This leads us to a factorial mixed ANOVA
Example of mixed design scenario for ANOVA - (2)
a mixed ANOVA is often used in studies where you have measured a dependent variable (e.g., “back pain” or “salary”) over two or more time points or when all subjects have undergone two or more conditions (i.e., where “time” or “conditions” are your “within-subjects” factor),
but also measure DV when your subjects have been assigned into two or more separate groups (e.g., based on some characteristic, such as subjects’ “gender” or “educational level”, or when they have undergone different interventions). These groups form your “between-subjects” factor.
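A sketch of how such a mixed design might be run in Python, assuming the pingouin library's mixed_anova interface; the data, column names and groups below are hypothetical:

```python
import pandas as pd
import pingouin as pg

df = pd.DataFrame({
    'subject': [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    'group':   ['ctrl'] * 6 + ['therapy'] * 6,   # between-subjects factor
    'time':    ['pre', 'post'] * 6,              # within-subjects factor
    'pain':    [7, 6, 8, 7, 6, 6, 7, 4, 8, 3, 7, 4],
})

aov = pg.mixed_anova(data=df, dv='pain', within='time',
                     subject='subject', between='group')
print(aov)  # rows for the between effect, within effect and interaction
```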
An organizational psychologist is hired as a consultant by a person planning to open a coffee house for college students. The coffee house owner wants to know if her customers will drink more coffee depending on the ambience of the coffee house. To test this, the psychologist sets up three similar rooms, each with its own theme (Tropical; Old Library; or New York Café ) then arranges to have thirty students spend an afternoon in each room while being allowed to drink all the coffee they like. (The order in which they sit in the rooms is counterbalanced.) The amount each participant drinks is recorded for each of the three themes.
- Independent variable(s)
- Is there more than 1 IV?
- The levels the independent variable(s)
- Dependent variable
- Between (BS) or within-subjects (WS)?
- What type of design is being used?
Theme
No
Tropical, Old Library,
New York Café
Amount of coffee consumed
Within-subjects
1-way Repeated measures
A manager at a retail store in the mall wants to increase profit. The manager wants to see if the store’s layout (one main circular path vs. a grid system of paths) influences how much money is spent depending on whether there is a sale. The belief is that when there is a sale customers like a grid layout, while customers prefer a circular layout when there is no sale. Over two days the manager alternates the store layout, and has the same group of customers come each day. Based on random assignment, half of the customers told there is a sale (20 % will be taken off the final purchases), while the other half is told there is no sale. At the end of each day, the manager calculates the profit.
- Independent variable(s)
- Is there more than 1 IV?
- The levels the independent variable(s)
- Dependent variable
- Between (BS) or within-subjects (WS)?
- What type of design is being used?
Sale/ No Sale, Store’s layout
Yes
Sale-No Sale, Grid-Circular
Profit
BS (Sale) and WS (Layout)
2-way mixed Measures
A researcher at a drug treatment center wanted to determine the best combination of treatments that would lead to more substance free days. This researcher believed there were two key factors in helping drug addiction: type of treatment and type of counseling. The researcher was interested in either residential or outpatient treatment programs; and either cognitive-behavioral, psychodynamic, or client-centered counseling approaches. As new clients enrolled at the center they were randomly assigned to one of six experimental groups. After 3 months of treatment, each client’s symptoms were measured.
- Independent variable(s)
- Is there more than 1 IV?
- The levels the independent variable(s)
- Dependent variable
- Between (BS) or within-subjects (WS)?
- What type of design is being used?
Type of treatment, Type of counseling.
Yes
Residential or outpatient/ cognitive-behavioural, psychodynamic or client-centered.
Substance-free days
Between subjects
2-way independent measures ANOVA.
Assumptions of mixed ANOVA - (3)
Normal distribution
Independent and repeated factors
Homogeneity of variance for the independent factor + sphericity for the repeated factor
Assumptions of repeated-measures ANOVA -(3)
Normal Distribution
Repeated Measure Design (same participants)
Sphericity (Mauchly’s Test)
Assumptions of independent ANOVA - (3)
Normal Distribution
Independence of Scores
Homogeneity of Variance (Levene’s Test)
Levene's test tests whether the variances in independent groups are similar; would Levene's test be significant in this case?
Levene’s test would likely be significant as the variance between the two groups are quite different.
Sphericity is an assumption of both
repeated and mixed models
If the p-value is significant when checking for sphericity, then - (3)
if GG < 0.75, use the Greenhouse-Geisser correction
if GG > 0.75, use the Huynh-Feldt correction
Since GG is less than 0.75 here, report the adjusted F, DF and sig, which is F(1.24, 21.00) = 212.32, p < 0.001
Homogeneity of variance asks:
are the distributions (variances) of the groups similar?
Sphericity asks: are the distributions of the differences between conditions
similar?
The researcher hypothesized that there would be an interaction between dog breed (Collie or German Shepherd) and week of obedience school training (all dogs measured at 1 week and 5 weeks) as they relate to the number of times the dog growls per week. Specifically, it was hypothesized that Collies would show no difference in growls between 1 week and 5 weeks, but German Shepherds would growl less at 5 weeks than at 1 week.
- Independent variable(s)
- Is there more than 1 IV?
- The levels the independent variable(s)
- Dependent variable
- Between(BS) or within-subjects (WS)?
- What type of design is being used?
- Dog breed and measurement time
- Yes
- Collie-German Shepherd / Week 1-Week 5
- Number of growls
- Dog breed is between and measurement time is within
- 2-WAY mixed ANOVA
What does this 2-way mixed ANOVA show? - (3)
- Independent variable(s)
- Is there more than 1 IV?
- The levels the independent variable(s)
- Dependent variable
- Between(BS) or within-subjects (WS)?
- What type of design is being used?
1) Is there an effect overall? = Yes (green)
2) Is there an effect of breed? = Yes (red)
3) Is there an interaction? = Yes (blue)
Partitioning of variance in one-way vs two-way independent ANOVA
Rules of contrast coding (see the sketch after this list) - (5)
Rule 1: Groups coded with positive weights compared to groups coded with negative weights.
Rule 2: The sum of weights for a comparison should be zero.
Rule 3: For a given contrast, the weights assigned to the group(s) in one chunk of variation should be equal to the number of groups in the opposite chunk of variation.
Rule 4: If a group is not involved in a comparison, assign it a weight of zero
Rule 5: If a group is singled out in a comparison, then that group should not be used in any subsequent contrasts.
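A tiny numeric sketch of these rules for three groups (two experimental groups plus a control); the group means and weights below are illustrative:

```python
import numpy as np

group_means = np.array([8.0, 10.0, 3.0])  # exp1, exp2, control (assumed)

# Contrast 1: experimental groups (+1, +1) vs control (-2).
# Weights sum to zero (rule 2); each chunk's weight equals the number of
# groups in the opposite chunk (rule 3).
c1 = np.array([1, 1, -2])

# Contrast 2: exp1 vs exp2; control was singled out in contrast 1, so it
# gets a weight of zero here (rules 4 and 5).
c2 = np.array([1, -1, 0])

for c in (c1, c2):
    print(f'weights {c}, sum = {c.sum()}, contrast value = {c @ group_means}')
```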
Contrast coding example SPSS how to read
When conducting a Repeated-Measures ANOVA, which of the following assumptions is NOT relevant?
A.Independent residuals
B.Homogeneity of variance
C.Sphericity
D.They are all relevant
B
One advantage of repeated measures designs over independent designs is that we are able to calculate a degree of error for each effect, whereas in an independent design we are able to calculate only one degree of error: true or false?
True or False
True
An experiment was conducted to see how people with eating disorders differ in their need to exert control in different domains. Participants were classified as not having an eating disorder (control), as having anorexia nervosa (anorexic), or as having bulimia nervosa (bulimic). Each participant underwent an experiment that indicated how much they felt the need to exert control in three domains: eating, friendships and the physical world (this final category was a control domain in which the need to have control over things like gravity or the weather was assessed). So all participants gave three responses in the form of a mean reaction time; a low reaction time meant that the person did feel the need to exert control in that domain. The variables have been labelled as group (control, anorexic, or bulimic) and domain (food, friends, or physical laws). Of the following options, which analysis should be conducted?
A. Analysis of covariance
B. Two-way repeated measures ANOVA
C. Two-way mixed ANOVA
D. Three-way independent ANOVA
C
Two IVs = group (control, anorexic, bulimic) and domain (food, friends, physical laws)
Group is between-subjects
Each participant underwent all domains, so domain is within-subjects
DV = mean reaction time for each participant
An experiment was done to compare the effect of having a conversation via a hands-free mobile phone, having a conversation with an in-car passenger, and no distraction (baseline) on driving accuracy. Twenty participants from two different age groups (18–25 years and 26–40 years) took part. All participants in both age groups took part in all three conditions of the experiment (in counterbalanced order), and their driving accuracy was measured by a layperson who remained unaware of the experimental hypothesis.
How do we interpret the main effect of distraction from the SPSS table (next slide)? - (2)
The assumption of sphericity has been met, indicated by Mauchly’s test (p > .05).
There was a significant main effect of distraction (F(2, 36) = 45.95, p < .001). This effect tells us that if we ignore the effect of age, driving accuracy was significantly different in at least two of the distraction groups.
Two-way repeated-measures ANOVA compares:
A. Several means when there are two independent variables, and the same entities have been used in all conditions
B. Two means when there are more than two independent variables, and the same entities have been used in all conditions.
C. Several means when there are two independent variables, and the same entities have been used in some of the conditions.
D. Several means when there are more than two independent variables, and some have been manipulated using the same entities and others have used different entities.
A
When conducting a repeated-measures ANOVA which of the following assumptions is not relevant?
A. Homogeneity of variance
B. Sphericity
C. Independent residuals
D. They are all relevant
A
The table shows hypothetical data from 3 conditions
For these data, sphericity will hold when
(Hint: Sphericity refers to the equality of variances of the differences between treatment levels.)
A.The variances of the differences between treatment levels are roughly equal
B. The variance of each condition is roughly equal
C. The variance of each condition is not equal
D. The variances of the differences between treatment levels are not equal
A
Imagine we were interested in the effect of supporters singing on the number of goals scored by soccer teams. We took 10 groups of supporters of 10 different soccer teams and asked them to attend three home games, one at which they were instructed to sing in support of their team (e.g., 'Come on, you Reds!'), one at which they were instructed to sing negative songs towards the opposition (e.g., 'You're getting sacked in the morning!') and one at which they were instructed to sit quietly. The order of chanting was counterbalanced across groups. Looking at the output below, which of the following sentences is correct?
A.The results showed that the number of goals scored was significantly affected by the type of singing from the supporters, F(2, 18) = 11.24, p = .001.
B. The results showed that the number of goals scored was significantly affected by the type of singing from the supporters, F(1.58, 14.19) = 11.24, p = .002.
C. The results showed that the number of goals scored was significantly affected by the type of singing from the supporters, F(2, 12.4) = 11.24, p = .001.
D. The results showed that the number of goals scored was significantly higher when supporters sang positive songs towards their team than when they sat quietly, F(2, 18) = 11.24, p = .001.
A = Mauchly’s test was non-significant, so we can report the result in the row labelled ‘sphericity assumed’
Imagine we were interested in the effect of supporters singing on the number of goals scored by soccer teams. We took 10 groups of supporters of 10 different soccer teams and asked them to attend three home games, one at which they were instructed to sing in support of their team (e.g., ‘Come on, you Reds!’), one at which they were instructed to sing negative songs towards the opposition (e.g., ‘You’re getting sacked in the morning!’) and one at which they were instructed to sit quietly. The order of chanting was counterbalanced across groups. An ANOVA with a simple contrasts using the last category as a reference was conducted. Looking at the output tables below, which of the following sentences regarding the contrasts is correct?
a.The first contrast revealed that soccer teams scored significantly more goals when their supporters sang positive songs compared to when they did not sing. The second contrast revealed that soccer teams scored significantly fewer goals when their supporters sang negative songs compared to when they did not sing.
b. The first contrast revealed that soccer teams scored significantly fewer goals when their supporters did not sing compared to when they sang negative songs. The second contrast revealed that soccer teams scored a similar amount of goals when their supporters sang positive songs compared to when they did not sing.
c. The first contrast revealed that soccer teams scored significantly more goals when their supporters sang positive songs compared to when they did not sing. The second contrast revealed that soccer teams scored significantly fewer goals when their supporters sang negative songs compared to when they sang positive songs.
d. The first contrast revealed that soccer teams scored significantly more goals when their supporters sang positive songs compared to when they did not sing. The second contrast revealed that soccer teams did not significantly differ in the number of goals scored when their supporters sang negative songs compared to when they did not sing.
a = we can see from the means in the Descriptive Statistics table that positive singing resulted in the highest number of goals scored and negative singing resulted in the lowest number of goals scored
An experiment was done to compare the effect of having a conversation via a hands-free mobile phone, having a conversation with an in-car passenger, and no distraction (baseline) on driving accuracy. Twenty participants from two different age groups (18–25 years and 26–40 years) took part. All participants in both age groups took part in all three conditions of the experiment (in counterbalanced order), and their driving accuracy was measured by a layperson who remained unaware of the experimental hypothesis.
Which of the following sentences is the correct interpretation of the main effect of distraction?
AThere was a significant main effect of distraction, F(2, 36) = 45.95, p < .001. This effect tells us that if we ignore the effect of age, driving accuracy was significantly different in at least two of the distraction groups.
B. There was no significant main effect of distraction, F(2, 36) = 45.95, p = .719. This effect tells us that if we ignore the effect of age, driving accuracy was the same for no distraction, hands-free conversation and in-car passenger conversation.
C. There was a significant main effect of distraction, F(2, 36) = 45.95, p < .001. This effect tells us that driving accuracy was different for no distraction, hands-free conversation and in-car passenger conversation in the two age groups.
D. There was no significant main effect of distraction, F(2, 36) = 45.95, p > .05. This effect tells us that none of the distraction groups significantly distracted participants across both age groups.
A = We can read the results in the row labelled ‘sphericity assumed’, as we can see from the output of Mauchly’s test that the assumption of sphericity has been met, p > .05. However, we would need to do some follow-up tests to investigate exactly where the differences between groups lie
Field and Lawson (2003) reported the effects of giving children aged 7–9 years positive, negative or no information about novel animals (Australian marsupials). This variable was called ‘Infotype’. The gender of the child was also examined. The outcome was the time taken for the children to put their hand in a box in which they believed either the positive, negative, or no information animal was housed (positive values = longer than average approach times, negative values = shorter than average approach times). Based on the output below, what could you conclude?
A. Approach times were significantly different for the boxes containing the different animals, but the pattern of results was unaffected by gender.
B. Approach times were significantly different for the boxes containing the different animals, and the pattern of results was affected by gender.
C. Approach times were not significantly different for the boxes containing the different animals, but the pattern of results was affected by gender.
D.Approach times were not significantly different for the boxes containing the different animals, but the pattern of results was unaffected by gender.
A
What leads to chi-squared test?
Q: What sort of measurement? A: Categorical (in this case counts or frequencies)
Q: How many predictor variables? A: One
Q: What type of predictor variable? A: Categorical
Q: How many levels of the categorical predictor? A: Not relevant
Q: Same or Different participants for each predictor level? A: Different
This leads us to the chi-square test for independence of groups
In the chi-square test, each participant is allocated to one and only one category, such as - (3)
pass or fail,
pregnant or not pregnant,
win, draw or lose
Since each participant is allocated to one category in the chi-squared test, each individual therefore
contributes to the frequency or count with which a category occurs
Table scenario in which cats can be trained to dance more effectively with food or affection as a reward - chi-squared test
Table scenario in which cats can be trained to dance more effectively with food or affection as a reward - chi-squared test
what are the four categories (cells)? - (4)
- danced, food as reward
- danced, affection as reward
- did not dance, food as reward
- did not dance, affection as reward
Table scenario in which cats can be trained to dance more effectively with food or affection as a reward - chi-squared test
highlight the frequencies for the four categories
Table scenario in which cats can be trained to dance more effectively with food or affection as a reward - chi-squared test
what do the rows give?
The row totals give the frequencies of dancing and non-dancing cats
Table scenario in which cats can be trained to dance more effectively with food or affection as a reward - chi-squared test
what do the columns give? - (2)
The column totals give the frequencies of food and affection as reward
These are the numbers in each group
IV and DV in chi-squared tests - (2)
One categorical DV (because of frequencies)
with one categorical IV with different participants at each predictor level
With chi-squared categorical outcomes, the null hypothesis is set
up on the basis of expected frequencies, for all four variable combinations, based on the idea that the variable of interest has no effect on the frequencies
What does the chi-square tests?
whether there is a relationship between two categorical variables.
In chi-square, since we are using categorical variables, we cannot use the
mean or any similar statistic, and hence cannot use any parametric tests
What does chi-square compare?
observed frequencies from the data with frequencies which would be expected if there was no relationship between the two variables.
In chi-square test when measuring categorical variables we are interested in
frequencies (number of items that fall into combination of categories)
Example of scenario using chi-square
We have a list of movie genres; this is our first variable. Our second variable is whether or not the patrons of those genres bought snacks at the theatre. Our idea (or, in statistical terms, our null hypothesis) is that the type of movie and whether or not people bought snacks are unrelated. The owner of the movie theatre wants to estimate how many snacks to buy. If movie type and snack purchases are unrelated, estimating will be simpler than if the movie types impact snack sales.
What is assumptions of chi-square test? - (3)
Data values that are a simple random sample from the population of interest.
Two categorical or nominal variables. Don’t use the independence test with continuous variables that define the category combinations. However, the counts for the combinations of the two categorical variables will be continuous.
For each combination of the levels of the two variables, we need at least five expected values. When we have fewer than five for any one combination, the test results are not reliable
We have a list of movie genres; this is our first variable. Our second variable is whether or not the patrons of those genres bought snacks at the theatre. Our idea (or, in statistical terms, our null hypothesis) is that the type of movie and whether or not people bought snacks are unrelated. The owner of the movie theatre wants to estimate how many snacks to buy. If movie type and snack purchases are unrelated, estimating will be simpler than if the movie types impact snack sales.
Is the Chi-square test of independence an appropriate method to evaluate the relationship between movie type and snack purchases? - (3)
We have a simple random sample of 600 people who saw a movie at our theatre. We meet this requirement.
Our variables are the movie type and whether or not snacks were purchased. Both variables are categorical.
But the last requirement is at least five expected values for each combination of the two variables. To confirm this, we need to know the total counts for each type of movie and the total counts for whether snacks were bought or not = checked below
We have a list of movie genres; this is our first variable. Our second variable is whether or not the patrons of those genres bought snacks at the theatre. Our idea (or, in statistical terms, our null hypothesis) is that the type of movie and whether or not people bought snacks are unrelated. The owner of the movie theatre wants to estimate how many snacks to buy. If movie type and snack purchases are unrelated, estimating will be simpler than if the movie types impact snack sales.
Diagram of contingency table in chi-square: calculating row totals, column totals and the grand total - (7)
Snacks row total: 50 + 125 + 90 + 45 = 310
No-snacks row total: 75 + 175 + 30 + 10 = 290
Genre column total: 50 + 75 = 125
Genre column total: 125 + 175 = 300
Genre column total: 90 + 30 = 120
Genre column total: 45 + 10 = 55
Grand total: 310 + 290 = 600
How to calculate chi-square test statistic? - (4)
- Calculate the difference between the actual and expected count for each Movie-Snacks combination.
- square that difference.
- Divide by the expected value for the combination.
- We add up these values for each Movie-Snacks combination. This gives us our test statistic.
We have a list of movie genres; this is our first variable. Our second variable is whether or not the patrons of those genres bought snacks at the theatre. Our idea (or, in statistical terms, our null hypothesis) is that the type of movie and whether or not people bought snacks are unrelated. The owner of the movie theatre wants to estimate how many snacks to buy. If movie type and snack purchases are unrelated, estimating will be simpler than if the movie types impact snack sales.
Diagram of contingency table in chi-square: calculating expected counts
e.g., for action and snacks it would be the snacks total (310) x the action total (125) divided by the grand total of 600 ≈ 64.6, i.e. about 65
Example of calculating chi-square from table
For this it would be 65.03
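To make the arithmetic concrete, here is a minimal Python sketch (assuming numpy and scipy are available) that reproduces the expected counts and the chi-square statistic for the movie-snacks table above; the genre labels are not named in the original, so the columns are left unlabelled.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts from the movie-snacks example
# (rows: snacks bought / not bought; columns: the four genres)
observed = np.array([[50, 125, 90, 45],
                     [75, 175, 30, 10]])

# Expected count per cell = row total x column total / grand total
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / observed.sum()

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected
chi2_stat = ((observed - expected) ** 2 / expected).sum()
print(round(chi2_stat, 2))  # ~65.01; the 65.03 above reflects rounded expected counts

# scipy reproduces the statistic plus df and p-value in one call
stat, p, df, exp = chi2_contingency(observed, correction=False)
```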
Does the area of psychology that a person prefers depend on whether they would select a cat or a dog as a pet? - chi-square test of independence
For the chi-square example we need to check the assumptions below - (2)
Independence
Each item or entity contributes to only one cell of the contingency table.
The expected frequencies should be greater than 5.
In larger contingency tables up to 20% of expected frequencies can be below 5, but there is a loss of statistical power.
Even in larger contingency tables no expected frequencies should be below 1.
How to understand your test statistic from chi-squared? - (5) if you have test statistic of 65.03
- Set your significance level = .05
- Calculate the test statistic -> 65.03
- Find your critical value from chi-squared distribution table based on df & significance level
- Degrees of freedom: df = (r - 1) x (c - 1)
For the movie example this is df = (4 - 1) x (2 - 1) = 3, giving a critical value of 7.815 - compare the test statistic with the critical value
65.03 > 7.82 so reject the idea that movie type and snack purchases are independent
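As a quick programmatic check on the critical value, a sketch assuming scipy:

```python
from scipy.stats import chi2

# Critical value for alpha = .05 with df = (4 - 1) x (2 - 1) = 3
critical = chi2.ppf(0.95, df=3)  # ~7.815
# 65.03 > 7.815, so we reject independence of movie type and snack purchases
```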
Example of research question and hypothesis and sig level of chi-square test of independence- (4)
Research question:
Does the area of psychology that a person prefers depend on whether they would select a cat or a dog as a pet?
Hypotheses:
H0: The area of interest in psychology and type of pet preferred are independent of each other.
H1: The area of interest in psychology and type of pet preferred are not independent of each other. That is, the primary area of interest in psychology depends on whether you prefer a cat or a dog.
Significance level: α = .05
Does the area of psychology that a person prefers depend on whether they would select a cat or a dog as a pet? - chi-square test of independence
For this chi-square example we need to check the assumption that the expected frequencies are greater than 5.
What does it show? - (4)
Here we see that all the expected counts in the cat group and one expected count in the dog group are below 5.
We also have one in the cat group that is below 1.
So, SPSS has flagged that we have 60% of the expected counts falling below 5.
So the assumption of expected frequencies greater than 5 is not met
If chi-square assumption that The expected frequencies should be greater than 5 is not satisfied then do - chi-square test of independence
We should use Fisher’s Exact Test which can correct for this.
Does the area of psychology that a person prefers depend on whether they would select a cat or a dog as a pet? - - chi-square test of independence
If assumptions were met (expected frequencies greater than 5) then.. report - (2)
A chi-square independence test was performed to examine whether there was a relationship between their area of studies in psychology and their preference for cats or dogs.
The relationship between these variables was not significant, χ²(4, N = 46) = 1.46, p = .834, so we fail to reject H0.
Are directional hypotheses possible with chi-square?
A.Yes, but only when you have a 2 × 2 design.
B.Yes, but only when there are 12 or more degrees of freedom.
C.Directional hypotheses are never possible with the chi-squared test.
D.Yes, but only when your sample is greater than 200.
A = directional hypotheses are only possible when you have a 2 × 2 design (just 2 variables with 2 categories each); with larger designs you cannot make a directional hypothesis and have to use loglinear or goodness-of-fit tests
Example situations you can do chi-square directional and not possible - (5)
If we are just comparing pet preferences between males and females, we can make a directional hypothesis (2 x 2 – male/female, cats/dogs).
Males prefer cats or females prefer dogs.
However, when we start adding variables to the design it gets complicated.
If we wanted to compare drink preferences at different times of the day for students/lecturers, we couldn’t form a directional hypothesis.
This is because we have 3 main effects and several interactions to consider. We need to use loglinear analyses to do this.
Loglinear analysis is a …. of chi-square
extension
Chi-square only analyses two variables at a time, whilst log-linear models
can determine complex interactions in multidimensional contingency tables with more than two categorical variables.
Loglinear is appropriate when
there’s no clear distinction between response and explanatory variables
How to think of chi-square vs log-linear?
Think of chi-square like t-tests (2 groups) and log-linear like ANOVA (more than 2 groups).
Example of RQ, hypothesis and sig level of loglinear - (3)
Research question: Is the new treatment associated with improvements in health in cats and dogs?
Hypotheses:
H0: Treatment, type of animal and improvements are independent of each other.
H1: Treatment, type of animal and improvements are associated with each other.
Significance level: α = .05
Assumptions of log linear - (2)
Independence
Expected counts > 5
Research question: Is the new treatment associated with improvements in health in cats and dogs?
Checking assumption of expected counts - (3):
Here we have 3 things we are comparing: animal (cat/dog), treatment (yes/no) and improvement (yes/no) all of which are categorical.
We look and see that all of the expected counts are above 5.
So the assumptions of independence and expected counts are met
Research question: Is the new treatment associated with improvements in health in cats and dogs?
Here we have 3 things we are comparing: animal (cat/dog), treatment (yes/no) and improvement (yes/no) all of which are categorical.
In loglinear model selection it begins with - (2)
all terms present (all main effects and all possible interactions)
main effects: Animal, Treatment and Improvement
interactions: Animal * Treatment, Animal * Improvement, Treatment * Improvement and Treatment* Animal* Improvement
Research question: Is the new treatment associated with improvements in health in cats and dogs?
Here we have 3 things we are comparing: animal (cat/dog), treatment (yes/no) and improvement (yes/no) all of which are categorical.
In loglinear model selection, after including all main effects and interactions, it then - (4)
Removes a term and compares the new model with the one in which the term was present.
Starts with the highest-order interaction (including max number of variables/categories)
Uses the likelihood ratio to ‘compare’ models below:
If the new model is no worse than the old, then the term is removed and the next highest-order interactions are examined, and so on.
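A minimal sketch of one such likelihood-ratio comparison in Python, assuming statsmodels is available; the cell counts below are invented for illustration, and the backward elimination is reduced to its first step (saturated model vs the model without the 3-way interaction):

```python
from itertools import product

import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2

# Invented cell counts for animal x treatment x improvement (long format)
rows = [dict(animal=a, treatment=t, improvement=i)
        for a, t, i in product(["cat", "dog"], ["yes", "no"], ["yes", "no"])]
df = pd.DataFrame(rows)
df["count"] = [28, 4, 10, 21, 20, 14, 11, 17]  # hypothetical frequencies

# Loglinear models are Poisson models on the cell counts
saturated = smf.poisson("count ~ animal * treatment * improvement", data=df).fit(disp=0)
reduced = smf.poisson("count ~ (animal + treatment + improvement) ** 2", data=df).fit(disp=0)

# Likelihood ratio: does removing the 3-way interaction worsen the fit?
lr = 2 * (saturated.llf - reduced.llf)
df_diff = saturated.df_model - reduced.df_model  # parameters dropped (1 here)
p = chi2.sf(lr, df_diff)  # small p => the term matters, keep it in the model
```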
Model selection of loglinear - what does it show?
Research question: Is the new treatment associated with improvements in health in cats and dogs?
Here we have 3 things we are comparing: animal (cat/dog), treatment (yes/no) and improvement (yes/no) all of which are categorical. - (3)
We can see that the model selection worked in a way that it first tried to remove the 3-way interaction.
However, we can see here that it significantly affected the fit of the model, so it was left in.
Since removing the highest-order interaction made a significant difference to the fit of the model, we get a final model that is the saturated model (it contains all main effects and interactions).
Loglinear SPSS K way and Higher order effects what does it show?
Research question: Is the new treatment associated with improvements in health in cats and dogs?
Here we have 3 things we are comparing: animal (cat/dog), treatment (yes/no) and improvement (yes/no) all of which are categorical. - (2)
we are using the likelihood ratio here because that’s how we compare the models to find the best fit
We see that all main effects and interactions are significantly contributing to explaining the variance in the data
loglinear what does K represent and what does K = 1,2 and 3 represent? - (4)
K represents the level of the terms.
For example, K=1 would be the main effects,
K=2 would be our 2-way interactions and
K=3 is our 3-way interaction.
Loglinear SPSS - what does parameter estimates show?
Research question: Is the new treatment associated with improvements in health in cats and dogs?
Here we have 3 things we are comparing: animal (cat/dog), treatment (yes/no) and improvement (yes/no) all of which are categorical. - (3)
There is a significant three-way interaction between animal, treatment and improvement, as well as two significant two-way interactions between animal and improvement and treatment and improvement (p < .001)
a significant 3-way interaction between animal, treatment and improvement, as well as two significant 2-way interactions between animal/improvement and treatment/improvement.
Like our post-hoc tests, this is telling us where the significant differences are.
Loglinear after seeing statistical tests we go to raw data showing that…
Based on the raw data, there seems to be an indication that the cats responded better to treatment than the dogs; this should be followed up with chi-square tests run separately for cats and dogs to determine whether the association between treatment and improvement is present in both cats and dogs
When conducting a loglinear analysis, if our model is a good fit of the data then the goodness-of-fit statistic for the final model should be:
A. Significant (p should be smaller than .05)
B. Non-significant (p should be bigger than .05)
C. Less than 5 but greater than 1
D. Greater than 5
B
The goodness-of-fit test in loglinear tests the
hypothesis that the frequencies predicted by the model (expected frequencies) are significantly different from the actual frequencies in the data (observed)
A significant goodness of fit result mean
our model was significantly different from our data (i.e., the model is a bad fit to the data).
A recent story in the media has claimed that women who eat breakfast every day are more likely to have boy babies than girl babies. Imagine you conducted a study to investigate this in women from two different age groups (18–30 and 31–43 years).
Looking at the output tables below, which of the following sentences best describes the results? = chi-square
A. Women who ate breakfast were significantly more likely to give birth to baby boys than girls.
B. There was a significant two-way interaction between eating breakfast and age group of the mother.
C. Whether or not a woman eats breakfast significantly affects the gender of her baby at any age.
D. The model is a poor fit of the data.
C
Chi square and log linear are both
non-parametric methods
Non-parametric tests used when
When data violate the assumptions of parametric tests we can sometimes find a nonparametric equivalent
eg. normality of distribution
Non-parametric tests work on the principle of
randomization or ranking the data for each group
Ranking data in non-parametric tests gets rid of
outliers and skew
How does ranking work in non-parametric? - (2)
Add up the ranks for the two groups and take the lowest of these sums to be our test statistic
The analysis is carried out on the ranks rather than the
actual data.
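A quick illustration of ranking with tied scores, assuming scipy (the scores are invented):

```python
from scipy.stats import rankdata

scores = [12, 15, 15, 20, 8]
ranks = rankdata(scores)
print(ranks)  # [2.  3.5 3.5 5.  1. ] - tied scores share the average rank
```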
Non-parametric equivalent of independent/unrelated t-tests
Mann-Whitney or Wilcoxon
rank-sum test
Non-parametric equivalent of repeated t-test
Wilcoxon signed-rank test
Non-parametric equivalent of : One-way independent (between-subjects) ANOVA
Kruskal-Wallis or (for trends)
Jonckheere-Terpstra
Non-parametric equivalent of one-way repeated ANOVA
Friedmanʼs ANOVA
Non-parametric equivalent of Multi-way between or
within-subjects ANOVA
Loglinear analysis (categorical
outcome, with participants as a factor)
Non-parametric equivalent of correlation
Spearman’s Rho or Kendall’s Tau
Mann-Whitney/Wilcoxon rank-sum Test - Compares
two independent groups of scores
Wilcoxon signed rank Test - Compare
two dependent groups of scores
Kruskal-Wallis Test - Compares
> 2 independent groups of scores
Friedman’s Test - Compares
> 2 dependent groups of scores
Spearman’s Rho & Kendall’s Tau - Measures the extent to which
two continuous variables are related (pattern of responses across variables)
Logic behind Wilcoxon’s rank sum test, what does SPSS do? - (3)
Step 1: Get some not normally distributed data
Step 2: Rank it (regardless of group)
Step 3: Significance testing
Does one of the groups have more of the higher ranking scores than the other?
What is DF of chi-square?
(r-1)(c-1)
The likelihood ratio in loglinear models is preferred for
small sample sizes
DF of likelihood ratio in loglinear
df = (r-1)(c-1)
Decision tree of Mann-Whitney - (4)
1 DV = ordinal (e.g., high school, bachelors; order is meaningful) or continuous
1 IV = categorical with 2 levels
Different participants
Does not meet assumptions of parametric tests
Wilcoxon rank-sum and Mann-Whitney U are
the same procedure, used to compare two independent groups and assess whether the samples come from the same distribution
For Mann-Whitney U/Wilcoxon Rank Sum they comparing 2 independent conditions the two steps - (2)
Rank all the data on the basis of the scores irrespective of the group
compute the sum of ranks of each group
For the Wilcoxon rank-sum test, the statistic Ws is
the lower of the two sums of ranks
For Mann-Whitney, the statistic U uses the
sum of ranks for group 1, R1, as follows: U = n1n2 + n1(n1 + 1)/2 - R1
Example of a table comparing 2 independent conditions for the Wilcoxon rank-sum or Mann-Whitney U test
Here we have data for two groups; one taking alcohol, the other ecstasy. The scores are from a measure of depression. Scores were obtained on two days: Sunday and Wednesday. The drugs were administered on Saturday.
Example of a table comparing 2 independent conditions for the Wilcoxon rank-sum or Mann-Whitney U test
Here we have data for two groups; one taking alcohol, the other ecstasy. The scores are from a measure of depression. Scores were obtained on two days: Sunday and Wednesday. The drugs were administered on Saturday.
Two steps for both statistics:
Rank all the data on the basis of the scores irrespective of the group
compute the sum of ranks of each group - (5)
The graphic here shows how we can list the scores in order and as a result assign each score a rank.
When scores tie, we give them the average of the ranks.
If we ensure we keep track of the group the scores came from we can relatively easily add the ranks up for each group.
Note that if there was little difference between the groups the sums of their ranks would be similar, as they are for the data shown here for Sunday.
However, the sum of ranks differ considerably for the data obtained on Wednesday.
For the Wilcoxon rank-sum test comparing 2 independent groups, with group sizes n1 and n2, the mean of Ws is given by:
mean(Ws) = n1(n1 + n2 + 1) / 2
For the Wilcoxon rank-sum test comparing 2 independent groups,
the standard error of Ws is given by: SE(Ws) = sqrt(n1n2(n1 + n2 + 1) / 12)
For the Wilcoxon rank-sum test, the z-score of Ws can be calculated as: z = (Ws - mean(Ws)) / SE(Ws)
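A minimal Python sketch of these calculations (numpy/scipy assumed); the two groups of depression scores are invented stand-ins for the alcohol/ecstasy data, since the original values are not reproduced here:

```python
import numpy as np
from scipy.stats import rankdata, norm

# Invented depression scores for two independent groups
ecstasy = np.array([15, 35, 16, 18, 19, 17, 27, 16, 13, 20])
alcohol = np.array([16, 15, 20, 15, 16, 13, 14, 19, 18, 18])

# Rank all scores irrespective of group (ties get the average rank)
ranks = rankdata(np.concatenate([ecstasy, alcohol]))
n1, n2 = len(ecstasy), len(alcohol)
ws = ranks[:n1].sum()  # sum of ranks for group 1 (with equal ns, the choice
                       # of group only flips the sign of z)

mean_ws = n1 * (n1 + n2 + 1) / 2               # mean of Ws
se_ws = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # standard error of Ws
z = (ws - mean_ws) / se_ws
p = 2 * norm.sf(abs(z))                        # two-tailed p
r = z / np.sqrt(n1 + n2)                       # effect size r = z / sqrt(N)
```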
For Mann-Whitney, the statistic U uses the sum of ranks for group 1, R1, as follows:
U = n1n2 + n1(n1 + 1)/2 - R1
What does this equation specify? - (2)
The first terms involving n1 and n2 actually compute the maximum possible sum of ranks for group 1.
U is zero when all those in group one have scores that exceed the scores of those in group 2.
In Mann-Whitney U there is a standardised test statistic, a z-score, that allows you to compute an
effect size: r = z / square root of N (number of participants)
What is the decision tree of the Wilcoxon signed-rank test? - (4)
1 IV categorical with 2 levels
Same participants in each predictor level
1 DV - ordinal or continuous
Does not meet assumptions of parametric tests
Steps of Wilcoxon signed rank test - (4)
- Compute the difference between scores for the two conditions
- Note the sign of the difference (positive or negative)
- Rank the differences ignoring the sign and also exclude any zero differences from the ranking
- Sum the ranks for positive and negative ranks
Example of Wilcoxon signed rank test carrying out steps - (9)
The table shown here has the Depression Scores taken on Sunday and Wednesday for those taking ecstasy on Saturday.
Data for Sunday are in the first column and Wednesday in the second column.
The third column shows the difference between scores obtained on Sunday and Wednesday.
Note some could be negative, some positive. In this example, however, the difference is always positive apart from two values where the difference is zero.
The fourth column notes the sign of the difference or notes it is going to be excluded because the difference was zero.
The fifth column ranks the differences in terms of their size, but not sign.
The sixth and seventh column list the ranks that were for positive and negative differences, respectively.
It is these two columns that are summed to get the relevant statistics, called T+ and T-.
Because T+ and T- are not independent, we take only the T+ value.
For the Wilcoxon signed-rank test with n non-zero differences, the mean of T is given by: mean(T) = n(n + 1) / 4
For the Wilcoxon signed-rank test, the standard error of T is given by: SE(T) = sqrt(n(n + 1)(2n + 1) / 24)
For the Wilcoxon signed-rank test, compute the z-score of T by: z = (T - mean(T)) / SE(T)
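A minimal sketch of these steps in Python (numpy/scipy assumed); the before/after scores are invented, but mimic the pattern described above (all non-zero differences positive, with two zero differences):

```python
import numpy as np
from scipy.stats import rankdata, norm

# Invented depression scores for the same participants on two days
sunday = np.array([15, 35, 16, 18, 19, 17, 27, 16, 13, 20])
wednesday = np.array([28, 35, 35, 24, 39, 32, 27, 29, 36, 35])

diffs = wednesday - sunday
diffs = diffs[diffs != 0]        # exclude zero differences from ranking
ranks = rankdata(np.abs(diffs))  # rank by size, ignoring sign
t_plus = ranks[diffs > 0].sum()  # sum of ranks for positive differences

n = len(diffs)                   # number of non-zero differences
mean_t = n * (n + 1) / 4
se_t = np.sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (t_plus - mean_t) / se_t
p = 2 * norm.sf(abs(z))          # two-tailed p from the normal approximation
```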
Kruskal-Wallis decision tree, like one-way independent ANOVA - (4)
1 DV, continuous or ordinal
1 IV, categorical predictor with more than 2 levels
Different participants in each predictor level
Does not meet assumptions of parametric tests
Kruskal Wallis steps - (2)
Rank all the data on the basis of the scores irrespective of the group
Compute the sum of ranks of each group, Ri , where i is the group number
For Kruskal-Wallis, the statistic H is as follows:
H = (12 / (N(N + 1))) x Σ(Ri² / ni) - 3(N + 1), where N is the total sample size and ni is the size of group i
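A minimal sketch computing H by hand and cross-checking with scipy (the three groups of scores are invented and tie-free, so the two results match exactly):

```python
import numpy as np
from scipy.stats import rankdata, kruskal

# Invented scores for three independent groups
groups = [np.array([4, 6, 7, 9]),
          np.array([5, 8, 10, 12]),
          np.array([11, 13, 14, 15])]

ranks = rankdata(np.concatenate(groups))  # rank all scores together
N = sum(len(g) for g in groups)

# H = (12 / (N(N+1))) * sum(Ri^2 / ni) - 3(N+1)
h, start = 0.0, 0
for g in groups:
    r_i = ranks[start:start + len(g)].sum()  # sum of ranks for group i
    h += r_i ** 2 / len(g)
    start += len(g)
h = 12 / (N * (N + 1)) * h - 3 * (N + 1)

stat, p = kruskal(*groups)  # scipy agrees when there are no ties
```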
What is the decision tree of the Friedman test? - (4)
1 DV, continuous or ordinal
1 IV, categorical predictor with more than 2 levels
Same participants in each predictor level
Does not meet assumptions of parametric tests
What are the steps of the Friedman test? - (2)
Rank the scores for each individual - that means you will have ranks varying from 1 to the number of conditions the participant took part in
Compute the sum of ranks, Ri , for each condition
For Friedman, the statistic F is as follows:
F = (12 / (Nk(k + 1))) x ΣRi² - 3N(k + 1)
k = number of conditions
N = number of participants
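A minimal sketch using scipy's built-in Friedman test (the scores are invented; each row is one participant measured under all three conditions):

```python
import numpy as np
from scipy.stats import friedmanchisquare

# Invented scores: rows = participants, columns = the three conditions
data = np.array([[3, 5, 6],
                 [2, 4, 7],
                 [5, 6, 8],
                 [1, 3, 4],
                 [4, 6, 5]])

stat, p = friedmanchisquare(data[:, 0], data[:, 1], data[:, 2])
# stat is the Friedman chi-square statistic, df = k - 1 = 2
```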
Example of when to use the chi-square test - (3)
- In this example, they wanted to look at whether attendance at lectures had an impact on exam performance, i.e. whether students passed or failed
- Attendance was coded as 1 if participants generally attended lectures, barring illness, and 2 if they did not attend
- Exam was scored as 1 = Pass and 2 = Fail
- In this example, they wanted to look at whether attendance at lectures had an impact on exam performance (pass/fail) - chi-square
What does it show? - (4)
- Attendance, Attended Lectures, Count = of the people who attended lectures, 84 passed and 29 failed
- % within attendance gives the same info: 74.3% passed and 25.7% failed when they attended lectures
- For those who did not attend lectures, 22 passed and 35 failed, with the percentages below - percentages are easier to use when writing up
- In this example, they wanted to look at whether attendance at lectures had an impact on their exam performance on whether they passed or failed - chi square
What does it show? - (2)
- In the top row is the Pearson chi-square statistic, 20.617, with df = 1 and p < .001 (displayed as 0.000)
- 0 cells have an expected count less than 5, so the chi-square assumption that expected counts are greater than 5 is met
- DF is always … in two-by-two chi-square
1
If the SPSS chi-square output shows that
0 cells have an expected count less than 5, the assumption of the chi-square test is met
- In this example, they wanted to look at whether attendance at lectures had an impact on their exam performance on whether they passed or failed - chi square
What does this effect size show? - (2)
- χ²(1) = 20.62, p < .001
- Cramer's V = 0.35, indicating a medium effect size
Effect size guideline of r correlation coefficient - (3)
Small effect = 0.1
Medium effect = 0.3
Large = 0.5 and above
Cramer's V can be interpreted similarly to a
correlation coefficient:
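A minimal sketch computing Cramer's V for the lecture-attendance table (numpy/scipy assumed; the counts are the ones used in the odds-ratio card below):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts: rows = attended / did not attend, columns = pass / fail
observed = np.array([[84, 29],
                     [22, 35]])

chi2_stat, p, df, expected = chi2_contingency(observed, correction=False)

# Cramer's V = sqrt(chi2 / (N * (min(rows, cols) - 1)))
n = observed.sum()
v = np.sqrt(chi2_stat / (n * (min(observed.shape) - 1)))  # ~0.35, medium effect
```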
In chi-square we can calculate odds
ratio
Example of calculating odds ratio for chi-square - (3)
odds of passing vs failing for students who attended lectures = no. of students who attended and passed (84) / no. of students who attended and failed (29) = 2.897
odds of passing vs failing for students who did not attend = no. of students who did not attend and passed (22) / no. of students who did not attend and failed (35) = 0.629
Odds ratio = odds (attended) / odds (did not attend) = 2.897 / 0.629 = 4.61, meaning the odds of passing the exam were about 4.6 times higher for students who attended lectures
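The same arithmetic as a short Python sketch:

```python
# Odds ratio for the lecture-attendance example
attended_pass, attended_fail = 84, 29
absent_pass, absent_fail = 22, 35

odds_attended = attended_pass / attended_fail  # ~2.897
odds_absent = absent_pass / absent_fail        # ~0.629
odds_ratio = odds_attended / odds_absent       # ~4.61
```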
Example research scenario of Mann Whitney - (4)
- Independent sample design
- One IV, two conditions = existing vs new medication
- One DV (symptoms), but this time on an ordinal scale (1 to 5), and a combination of non-normally distributed data and a small sample size (very problematic for t-tests)
- Mann Whitney U Test
Example of using Mann-Whitney U = skewed data
What does this Mann-Whitney U output show?
- Independent sample design
- One IV, two conditions = existing vs new medication
- One DV (symptoms), but this time on an ordinal scale (1 to 5), and a combination of non-normally distributed data and a small sample size (very problematic for t-tests)
- Mann Whitney U Test
- This box summarises the p-value ( p = 0.026) and tells you whether to accept or reject the null hypothesis.
What does this output show of Mann Whitney U? - (3)
- The Mann-Whitney U test statistic is 166.000; people also report the standardised test statistic of 2.292, which is a z-score, handy to report since if it is beyond ±1.96 the p-value from the test is significant
- The exact significance is p = 0.026
- There is a significant difference between the 2 groups
- Next we would want to look at the median scores to see which group is scoring highest and lowest after sig Mann Whitney U test
What does this output show? - (3)
- Independent sample design
- One IV, two conditions = existing vs new medication
- One DV (symptoms), but this time on an ordinal scale (1 to 5), and a combination of non-normally distributed data and a small sample size (very problematic for t-tests)
- Mann Whitney U Test
- For existing treatment, median score was 3.
And new treatment the median score was 4.
It suggests the new treatment was more effective in reducing symptoms than the existing treatment
Example research scenario of Friedman ANOVA - (5)
- Again we have ordinal data for the DV; we are not sure the distances between levels are the same
- Related design
- One IV, 3 conditions
- One DV (level reached to video game)
- Friedman’s ANOVA = more than 2 groups in related design
What does this Friedman ANOVA output show?
- We have a total sample size of 30, a test statistic of 21.788, df = 2 and p < .001 (displayed as 0.000), so there is a significant difference between the 3 groups
For Friedman’s ANOVA we do
post hoc tests for pairwise comparisons to look where the differences are
What do the Friedman ANOVA post hoc tests show? - (7)
- First one is Joystick vs Vyper Max
- Second one is Joystick vs Evo Pro, etc.
- Notice it gives two p-values: sig and adjusted sig
- Adjusted sig controls for multiple comparisons and corrects the p-value (use this)
- The difference between Joystick vs Vyper Max was significant at p = 0.005
- The difference between Joystick vs Evo Pro was significant at p < .001
- The difference between Vyper Max vs Evo Pro was non-significant, p = 0.660
- The problem with non-parametric tests is that they have less power
to detect significant effects compared to parametric tests, so median scores may be higher in one group than another and yet the difference may not reach significance
Non-parametric tests are used when
A. The assumptions of parametric tests have not been met.
B. You want to increase the power of your experiment.
C. You have more than the maximum number of tied scores in your data set.
D. All of these.
A = non-parametric tests have fewer assumptions than parametric tests
With 2 × 2 contingency tables (i.e., two categorical variables both with two categories) no expected values should be below ____.
A. 5
B. 1
C. 0.8
D. 10
A
Which of the following statements about the chi-square test is false?
A. The chi-square test can be used on continuous variables.
B. The chi-square test can be used to check how well a model fits the data.
C. The chi-square test is used to quantify the relationship between two categorical variables.
D. The chi-square test is based on the idea of comparing the frequencies you observe in certain categories to the frequencies you might expect to get in those categories by chance.
A = correct, because it is false. Chi-square can be used on categorical variables only
When conducting a loglinear analysis, if our model is a good fit of the data then the goodness-of-fit statistic for the final model should be:
Hint: The goodness-of-fit test tests the hypothesis that the frequencies predicted by the model (the expected frequencies) are significantly different from the actual frequencies in our data (the observed frequencies).)
A. Non-significant (p should be bigger than .05)
B. Significant (p should be smaller than .05)
C.Greater than 5
D. Less than 5 but greater than 1
A = If our model is a good fit of the data then the observed and expected frequencies should be very similar (i.e., not significantly different
What is the parametric equivalent of the Wilcoxon signed-rank test?
A. The paired samples t-test
B. The independent t-test
C. Independent ANOVA
D. Pearson’s r correlation
A
Are directional hypotheses possible with chi-square?
A. Yes, but only when you have a 2 × 2 design.
B. Yes, but only when there are 12 or more degrees of freedom.
C. Directional hypotheses are never possible with the chi-squared test.
D. Yes, but only when your sample is greater than 200.
A = directional hypotheses are only possible with a 2 × 2 design
A psychologist was interested in whether there was a gender difference in the use of email. She hypothesized that because women are generally better communicators than men, they would spend longer using email than their male counterparts. To test this hypothesis, the researcher sat by the computers in her research methods laboratory and when someone started using email, she noted whether they were male or female and then timed how long they spent using email (in minutes). How should she analyse the differences in males and females (use the output below to help you decide)?
A. Mann–Whitney test
B. Paired t-test
C.Wilcoxon signed-rank test
D. Independent t-test
What is the Jonckheere–Terpstra test used for?
A. To test for an ordered pattern to the medians of the groups you’re comparing.
B. To test whether the variances in your data set are approximately equal.
C. To test for an ordered pattern to the means of the groups you’re comparing.
D.To control for the familywise error rate.
A
If the standard deviation of a distribution is 5, what is its variance?
25 = 5^2
A distribution with positive kurtosis (leptokurtic) indicates that:
A Scores are tightly clustered around the centre of the distribution
B Scores are spread widely across the distribution
C Scores are clustered towards the left side of the distribution
D Scores are clustered towards the right side of the distribution
A
If the scores on a test have a mean of 28 and a standard deviation of 3, what is the z-score for a score of 34?
A 3
B 2
C -2
D -3.42
B = (34 - 28) / 3 = 2
Question 4
Which of the following is an assumption of a one-way repeated measures ANOVA but not a one-way independent ANOVA?
A Homogeneity of variance
B Homogeneity of regression slopes
C Sphericity
D Multicollinearity
C
A test statistic with an associated p value of p = .002 tells you that:
A The statistical power of your test is large
B The probability of getting this result by chance is 0.2%, assuming the null hypothesis is correct
C The effect size of this finding is large
D All of the above
B
Question 6
Of the following, which is the most appropriate reason to use a non-parametric test?
A When the DV is measured on an ordinal scale
B When you have unequal sample sizes between conditions of the IV
C When the sample size is small
D When you have a violation of the assumption of homogeneity of variance
A
Question 7
The following are all commonly stated assumptions/requirements for using ANOVA. Which of the 4 is the only one that the procedure always requires?
A Subjects are assigned to treatment conditions / groups using random allocation
B Data is from a normally distributed population
C DV is continuous (interval or ratio)
D Variance in each experimental condition is similar (assumption of homogeneity of variance)
C
Question 8
A researcher runs a single t test and obtains a p value of p = .04. The researcher rejects the null hypothesis and concludes that there is a significant effect of the experimental manipulation in the population. Which of the following are possible?
A The researcher may have made a type 1 error
B The researcher may have made a type 2 error
C The researcher may have made a familywise error
D All of the above are possible
A
Question 9
99% of z-scores lie between:
A ±1.96
B ±2.58
C ±3.29
D ±1
B
Question 10
If predictor X shows a correlation coefficient of -.45 with outcome Y, we can confidently say that:
A X is a significant predictor of Y
B Variance in X accounts for 20.25% (that’s -.45²) of the variance in Y
C X has a causal relationship with Y
D All of the above
B
Question 11
How much variance has been explained by a correlation of r = .50?
A 10%
B 25%
C 50%
D 70%
B = 0.50 squared
Question 12
The relationship between two variables partialling out the effect that a third variable has on both of those variables can be expressed using a:
A Bivariate correlation
B Semi-partial correlation
C Point-biserial correlation
D Partial correlation
D
Question 13
A regression model in which variables are entered into the model on the basis of a mathematical criterion is known as a:
A Forced entry regression
B Hierarchical regression
C Stepwise regression
D Logistic regression
C
Question 14
In the regression equation Y = b_0 + b_1X + ε, what does the parameter b_0 indicate?
A The predicted value of the outcome variable
B The regression slope
C The intercept
D Error variance
C
Question 15
In multiple regression, a high VIF statistic, a low tolerance statistic, and substantial correlations between predictor variables, ALL indicate:
A Multicollinearity
B Heteroscedasticity
C The presence of outliers
D Non-normality of the residuals
A
Question 16
In a multiple regression model, the t test statistic can be used to test:
A Differences between group means
B The significance of the overall model
C The significance of the regression coefficients for each predictor
D The t test statistic is not used in multiple regression
C
Question 17
A Mixed ANOVA design would be appropriate for which of the following situations?
A Different participants are tested in each condition
B All participants are tested in all conditions
C Participants are tested in all conditions for at least one IV, and different participants are tested in each condition for at least one IV
D None of the above
C
Question 18
In a one-way independent ANOVA with 40 participants and 5 conditions of the IV, what are the degrees of freedom for the between-groups Mean Squares (MSbetween)?
A 4
B 5
C 35
D 40
A = k (number of groups) - 1 = 5 - 1 = 4
Question 19
In a two-way ANOVA there are:
A Two IVs and two DVs
B Two IVs and one DV
C One IV and two DVs
D None of the above
B
Question 20
In a two-way factorial design, the SSR (residual sum of squares) consists of:
A Variance due to the independent variables and their interaction
B Variance due to the independent variables, dependent variable(s) and error variance
C Variance accounted for by the interaction only
D Variance which cannot be explained by the independent variables
D
Question 30
Statistics enthusiast and Dub Reggae legend ‘Mad Professor’ conducted a study into the effects of listening to music on a memory task. He ended up with three independent variables and one dependent variable, and he wished to analyse all possible main effects and interaction effects. How many model effects in total will he have?
A 1
B 3
C 6
D 7
D = 2^k - 1 = 2^3 - 1 = 7 (3 main effects + 3 two-way interactions + 1 three-way interaction)
Question 24
A nutritionist was interested in the effectiveness of two of the latest fad diets. The nutritionist took 30 people who wanted to lose weight and allocated them to either the SuperScienceMaxPro weight loss regime, or the SensiNutriPlus diet. He recorded their weight at 4 time points. (The start of the diet, and then every month after that for 3 months). In addition, the nutritionist was interested in whether males and females would differ in recorded weight loss over the 4 time points. What is the design of this study?
A Two factorial with one independent factor and one repeated measures factor
B Three-factorial with two independent factors and one repeated measures factor
C Three-factorial with one independent and two repeated measures factors
D Four-factorial with two independent factors and 4 repeated measures factors
B
Question 26
What is the non-parametric equivalent of a one-way repeated measures ANOVA?
A Wilcoxon sign test
B Mann-Whitney U test
C Kruskal-Wallis test
D Friedman test
D
Question 27
What is a limitation of the Chi-square test?
A It cannot be used when you have more than 2 categorical variables
B Directional hypotheses are not possible when you have more than two conditions of a variable
C A small sample size can result in an unreliable test statistic
D All of the above
D
Distribution of z