final exam Flashcards
correlation (r)
reflects the strength and direction of a relation between two continuous variables
When are correlations stronger?
They range from -1 to 1; correlations closer to 1 or -1 are stronger than correlations closer to 0
What do negative correlations tell us?
the variables move in opposite directions: as one increases, the other tends to decrease
Examples of correlations
strong - r= .80
weak- r= .10
positive- r= .80, .10
negative- r= -.80, -.10
Correlation does not mean what?
Causation
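A minimal sketch of how r could be computed by hand, to make "strength and direction" concrete. The data and the names pearson_r, hours_slept, mood_up and mood_down are made up for illustration; they are not from the course material.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Cross-products of deviations (direction) and squared deviations (scale)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores: one mood variable rises with sleep, the other falls
hours_slept = [5, 6, 7, 8, 9]
mood_up = [2, 3, 4, 5, 6]
mood_down = [6, 5, 4, 3, 2]

print(pearson_r(hours_slept, mood_up))    # positive: variables move together
print(pearson_r(hours_slept, mood_down))  # negative: variables move in opposite directions
```

The sign gives the direction and the distance from 0 gives the strength, so these two made-up examples are equally strong but opposite in direction.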
regression coefficient (b)
reflects how well one of those continuous variables predicts the other
predict
means we can see what happens with one variable and predict what will happen with the other
Regression equation
y= a + b (x)
What do the variables mean in regression equation?
x= predictor variable
y= criterion variable (score we are predicting)
b= regression coefficient
a= regression constant (where we start)
b in the formula means
for every 1 raw unit increase in x there is a b unit increase in y
a in the formula means
the predicted value of y when x equals zero
conceptual interpretation
For every one raw unit increase in [x -> hours slept last night] there is a [b -> 1] unit increase in [y -> happy mood]
Substantive interpretation
For every additional hour of sleep people are predicted to be one point happier
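The regression equation can be sketched in a few lines, assuming the usual least-squares formulas for a and b. The function names and the sleep/mood scores are hypothetical, chosen so that b comes out to 1 like the example above.

```python
def fit_simple_regression(x, y):
    """Return (a, b) for y = a + b(x) using least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # b: change in y for every 1 raw unit increase in x
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    # a: predicted value of y when x equals zero
    a = mean_y - b * mean_x
    return a, b

def predict(a, b, x):
    """Plug a new x into y = a + b(x)."""
    return a + b * x

# Hypothetical data: hours slept last night and happy mood
hours_slept = [4, 5, 6, 7, 8]
happy_mood = [6, 7, 8, 9, 10]

a, b = fit_simple_regression(hours_slept, happy_mood)
print(a, b)
print(predict(a, b, 9))  # predicted mood for someone who slept 9 hours
```

With these made-up scores, each extra hour of sleep adds b points to the predicted mood, and a is where the prediction starts when hours slept is zero.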
multiple regression
more than one predictor
Do income and sleep predict happiness?
simple regression
one predictor
Does income predict happiness?
multiple regression equation
y= a + b1(x1) + b2(x2) + ... and so on, depending on how many predictors
b 1.2
partial regression coefficient for X1
b 2.1
partial regression coefficient for X2
partial out
to remove shared credit from other predictors
conceptual interpretation for partial regression
for every one raw unit increase in X1 there is a b 1.2 unit increase in Y, partialing out the other predictors.
substantive interpretation for partial regression
for every one raw unit increase in income there is a 0.5 unit increase in happiness, partialing out sleep.
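To see what "partialing out" does, here is a rough sketch for the two-predictor case only, using the standard least-squares solution written out by hand. The data are invented so that the partial coefficient for income is 0.5; the function names are assumptions for illustration.

```python
from statistics import mean

def dev_sum(u, v):
    """Sum of cross-products of deviations from the means of u and v."""
    mu, mv = mean(u), mean(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))

def fit_two_predictors(x1, x2, y):
    """Return (a, b1, b2) for y = a + b1(x1) + b2(x2)."""
    s11, s22, s12 = dev_sum(x1, x1), dev_sum(x2, x2), dev_sum(x1, x2)
    s1y, s2y = dev_sum(x1, y), dev_sum(x2, y)
    det = s11 * s22 - s12 ** 2  # zero only if the predictors are perfectly correlated
    b1 = (s22 * s1y - s12 * s2y) / det  # partial coefficient for x1
    b2 = (s11 * s2y - s12 * s1y) / det  # partial coefficient for x2
    a = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return a, b1, b2

# Hypothetical data: income and sleep overlap, so they share credit
income = [2, 4, 6, 8, 10]
sleep = [5, 7, 6, 8, 9]
happiness = [12, 17, 16, 21, 24]

# Income alone (simple regression) takes credit that belongs to sleep
b_income_alone = dev_sum(income, happiness) / dev_sum(income, income)
a, b1, b2 = fit_two_predictors(income, sleep, happiness)
print(b_income_alone)  # larger, includes shared credit
print(b1, b2)          # partial coefficients
```

In this made-up data the coefficient for income shrinks once sleep is in the model, which is the shared credit being removed.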
R is tested with what?
F test
b is tested with?
t test
R squared reflects what?
the proportion of variation in Y explained by our set of predictors
what does b show us?
strength and direction of prediction
hierarchical regression
sets of predictors
Set A -> R squared
age
education
gender
Set B -> delta R squared (the change in R squared when Set B is added)
money
sleep
stress
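A simplified sketch of the hierarchical idea: fit Set A, then add Set B and see how much R squared changes. To keep it short, each set here has only one predictor (age for Set A, sleep for Set B), which is an assumption for illustration; the scores and function names are made up.

```python
from statistics import mean

def dev_sum(u, v):
    """Sum of cross-products of deviations from the means of u and v."""
    mu, mv = mean(u), mean(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))

def predict_one(x, y):
    """Predicted y values from a one-predictor regression."""
    b = dev_sum(x, y) / dev_sum(x, x)
    a = mean(y) - b * mean(x)
    return [a + b * xi for xi in x]

def predict_two(x1, x2, y):
    """Predicted y values from a two-predictor regression."""
    s11, s22, s12 = dev_sum(x1, x1), dev_sum(x2, x2), dev_sum(x1, x2)
    s1y, s2y = dev_sum(x1, y), dev_sum(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return [a + b1 * u + b2 * v for u, v in zip(x1, x2)]

def r_squared(y, predicted):
    """Proportion of variation in y explained by the predictions."""
    ss_total = sum((yi - mean(y)) ** 2 for yi in y)
    ss_resid = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    return 1 - ss_resid / ss_total

# Hypothetical data
age = [20, 30, 40, 50, 60]
sleep = [6, 8, 5, 7, 9]
happiness = [4, 7, 4, 6, 9]

r2_set_a = r_squared(happiness, predict_one(age, happiness))          # step 1: Set A only
r2_both = r_squared(happiness, predict_two(age, sleep, happiness))    # step 2: Set A + Set B
delta_r2 = r2_both - r2_set_a  # what Set B adds beyond Set A
print(r2_set_a, r2_both, delta_r2)
```

Delta R squared can never be negative here, because adding a predictor cannot make the least-squares fit worse on the same data.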
logistic regression
allows you to model the probability of each value in your categorical outcome
probability of guilty verdict = .8
probability of not guilty verdict = .2
odds of guilty verdict= .8/.2 = 4
we add predictors to the model after
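The verdict numbers above can be checked with a tiny sketch. It also shows the log of the odds, since logistic regression models the log-odds as a linear function of the predictors. The function names are made up for illustration.

```python
from math import log

def odds_from_probability(p):
    """Odds = probability of the event divided by probability of no event."""
    return p / (1 - p)

def probability_from_odds(odds):
    """Convert odds back into a probability."""
    return odds / (1 + odds)

p_guilty = 0.8
odds_guilty = odds_from_probability(p_guilty)  # .8 / .2
log_odds_guilty = log(odds_guilty)             # the scale the predictors act on

print(odds_guilty)
print(log_odds_guilty)
print(probability_from_odds(odds_guilty))      # back to the original probability
```

Odds above 1 mean the outcome is more likely than not; odds below 1 mean it is less likely than not.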
Effect sizes
express the magnitude and sometimes the direction of an effect or estimate; they are standardized
d index
standardized mean difference, focuses on the standardized difference between two group averages
Why are standardized units beneficial
because they remove raw units, we use standard deviation to remove raw units
SD pooled
standard deviation combined (pooled) from both groups
conceptual interpretation of d
there is a (d) standard deviation unit difference between the average (y) for group 1 and group 2
substantive interpretation of d
there is a 2.5 standard deviation unit difference between the average salaries for men and women in business
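A short sketch of the d index, assuming the common pooled standard deviation formula that weights each group's variance by its degrees of freedom. The salary numbers (in thousands) and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_sd(group1, group2):
    """Pooled standard deviation from both groups."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (
        n1 + n2 - 2
    )
    return sqrt(pooled_var)

def d_index(group1, group2):
    """Standardized mean difference: raw difference divided by pooled SD."""
    return (mean(group1) - mean(group2)) / pooled_sd(group1, group2)

# Hypothetical salaries in thousands of dollars
group_1 = [52, 55, 58, 61, 64]
group_2 = [47, 50, 53, 56, 59]

d = d_index(group_1, group_2)
print(d)  # difference between group averages in standard deviation units
```

Dividing by the pooled SD is what removes the raw units (dollars), so the result can be compared against benchmarks or against d values from other studies.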
benchmarking
d = .20 -small
d = .50 -medium
d= .80 -large
conventional benchmarking is too?
general
empirical benchmarking
compares d-index to other similar studies
confidence interval
interval of numbers whose length reflects the precision of an estimate
Given a sample from a population, the CI indicates a range in which the population mean is believed to be found. Usually expressed as a 95% CI, indicating the lower and upper boundaries.
M= 200k, 95% CI (150k, 250k)
M= 200k, 95% CI (150k, 250k) this starts with a?
lower limit of 150k and upper limit of 250k
The width of a confidence interval reflects?
how precise our estimate is
TOO WIDE IS LESS PRECISE
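A rough sketch of a 95% CI for a mean, assuming the large-sample normal approximation (critical value of about 1.96) rather than a t distribution. The samples are made up; the second one just repeats the first to show what a bigger sample does to the width.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_ci(sample, level=0.95):
    """Return (lower limit, upper limit) of a CI for the mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)    # about 1.96 for a 95% CI
    standard_error = stdev(sample) / sqrt(len(sample))
    m = mean(sample)
    return m - z * standard_error, m + z * standard_error

# Hypothetical salaries in thousands of dollars
small_sample = [150, 180, 200, 220, 250]
large_sample = small_sample * 4                  # same scores, four times as many people

low_small, high_small = mean_ci(small_sample)
low_large, high_large = mean_ci(large_sample)

print(low_small, high_small)
print(low_large, high_large)
print(high_small - low_small > high_large - low_large)  # narrower interval = more precise
```

Both intervals are centered on the same mean, but the larger sample gives a narrower interval, which is the sense in which width reflects precision.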
misinterpretation of confidence intervals
- a 95% confidence interval has a 95% chance of containing the population parameter of interest (no)
- a 95% confidence interval predicts that 95% of sample estimates from future studies will fall within it (no)
data cleaning
preparing data involves cleaning it which ensures the “dataset” on which you conduct analyses is complete, correct and consistent
complete
has all the data been recorded or transferred into the data set?
data entry
recording all data in the form of scores from participants in your study
transcription error
researchers entered data incorrectly
reporting error
participants enter data incorrectly
downloading error
sometimes data doesn’t download correctly
cleaning data
ensures its integrity and trustworthiness
assumptions
the things we hold true for an inferential test to operate the way we think it should
inferential tests
they only work right if our assumptions are true, though they can handle minor or moderate departures
when statistical assumptions are violated the probability of a test statistic may be
inaccurate
normality
outcome variable scores are normally distributed in the population
homogeneity of variance
all groups have the same variance in the population (focuses on width not shape)