RS4 Flashcards
What is the basic comparative design?
Comparing the scores of one group with another.
Usually the mean
What is the basic correlational design?
Researcher is measuring two or more different variables at the same time in a single group of cases
See if there is correlation between those variables
What is a correlation coefficient?
Standard statistical index measuring (i) the direction and (ii) the strength of a relationship between two variables
Ranges from -1 to +1
It is an effect size itself:
.1 is small
.3 is medium
.5 is large
What is the coefficient of determination?
r2 - r squared.
The proportion of variance in one variable shared by the other
An r of .6 indicates how much variance is shared?
36%
0.6 x 0.6 x 100 = 36
What is covariance? what is the standard measure of covariance?
The relationship between how much scores on two variables deviate from their respective means. If X and Y tend to deviate from their means in the same direction, the covariance is positive; if they tend to deviate in opposite directions, it is negative.
The correlation coefficient is the standardised measure of covariance: dividing the covariance by both standard deviations makes it unitless
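To make the covariance and correlation cards concrete, here is a minimal sketch that computes both by hand. The scores and the function names are made up for illustration.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    """Sample covariance: summed cross-products of deviations over N - 1."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def sample_sd(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def pearson_r(x, y):
    """Covariance divided by both SDs, which removes the units."""
    return covariance(x, y) / (sample_sd(x) * sample_sd(y))

# Hypothetical scores for five cases
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]

r = pearson_r(hours, marks)
print(round(covariance(hours, marks), 2))  # covariance, in raw units
print(round(r, 3))                         # correlation, between -1 and +1
print(round(r ** 2 * 100, 1))              # percentage of variance shared
```

Squaring r and multiplying by 100 gives the coefficient of determination as a percentage.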
What are the four types of correlation coefficients?
Spearman’s rho
Kendall’s Tau
Phi-coefficient
Point-biserial correlation
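When would you use Spearman’s rho and Kendall’s Tau-b?
When the data are ranked (ordinal) or do not meet parametric assumptions. Kendall’s Tau is preferred for a smaller data set with a large number of tied ranks (the same value)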
When would you use point biserial correlation?
When one variable is a true dichotomy (categorical with 2 categories, e.g. dead or alive), NOT when there is an underlying continuum (e.g. pass/fail)
and the other variable is continuous
When would you use the phi correlation (2x2)
Correlating between two categorical (nominal) but dichotomous variables
What does reliability refer to in terms of a test? Some ways it is measured…
The consistency over time:
- Test-retest
Consistency of the items within the measure:
- Split half
- Cronbach’s alpha
What does validity refer to in terms of a test?
The extent a measure measures the underlying construct
Criterion related validity:
- Predictive and concurrent validity
Minimum correlation of test-retest reliability expected?
Minimum correlation of 0.6 expected
Problems with split half reliability?
Only estimates reliability of half
Not obvious which way to split the test
What reliability coefficients mean it is how reliable?
<0.6 is suspect
.6-.7 satisfactory
> .8 is excellent
Methods of estimating criterion related validity?
Predictive validity
- See how well your test predicts some later obtained criterion scores
Concurrent validity
- See how well your test scores concurrently predict obtained criterion scores
What is linear regression?
Technique used to predict an outcome score. Examines the amount of variance that can be explained by one predictor (simple) or more than one (multiple).
In regression what are the dependent and independent variables called?
Dependent is an outcome
Independent is a predictor
What equation is used in regression?
The equation of a straight line:
Y = b0 + b1X
Y = outcome variable, or the expected value of y given a value of x
X = predictor variable
b0 = intercept (value of Y when X is 0)
b1 = regression coefficient (gradient - strength/direction of the relationship)
How do we test the significance of the predictive model in simple regression?
F ratio test and R2.
The F ratio is a ratio of mean squares: the mean square for the model divided by the mean square for the residual (error)
General aim of the least squares method for simple regression?
Goal is to minimise the sum of the squared differences (error) between the observed value of the dependent (outcome) variable and the predicted value (provided by regression line)
Sum of the squared residuals.
In this way it is trying to get the best fit possible.
Does the intercept have to make ‘sense’
No - the intercept may or may not actually make sense in real life
How does the simple linear regression equation change when you have to estimate population parameters?
The y becomes ŷ (y hat) and the betas become b’s
ŷ = mean value of y for a given value for x
What are residuals in regression?
The distance of each observed point from the line of best fit, also called the error.
For a least squares line they always add up to 0
Why do you square residuals?
1. Makes them all positive
2. Emphasises the larger deviations - exaggerates them
What is the sum of squared residuals?
The squared residuals (errors) all added together.
How do we determine how good the linear regression model is?
We are comparing it to the line without the predictor variable (the flat line at the mean of y) - if the model is good it should reduce the distance, i.e. the SSE (sum of squared errors)
Is the expected value of y an exact value?
No it’s actually just the mean of a distribution
What is the least squares criterion?
Min Σ (yi - ŷi)²
yi = actual value
ŷi = predicted value
Minimise the sum of the squared residuals (the differences between the actual and predicted y’s)
This formula is used to calculate the SST, SSR and SSE
What is the centroid?
The point of the mean x value and mean y value, the line of best fit must go through this
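A small sketch of the least squares idea, using the standard closed-form solution for one predictor. The data are hypothetical and `fit_line` is just an illustrative name. It also shows the line passing through the centroid and the residuals summing to zero.

```python
def fit_line(x, y):
    """Return (b0, b1) that minimise the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x  # this forces the line through the centroid
    return b0, b1

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)
predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

print(b0, b1)                               # intercept and gradient
print(sum(residuals))                       # effectively zero
print(sum(res ** 2 for res in residuals))   # the quantity being minimised
```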
Interpreting the regression equation ŷi = 0.1462x - 0.8188?
ŷi = 0.1462x - 0.8188
b1 = 0.1462(1) (gradient)
b0 = -0.8188
So for every £1 the bill amount (x) increases, the tip (ŷi) increases by £0.1462 - this part makes sense
If the bill amount (x) is zero then the expected tip is -0.8188: this obviously doesn’t make sense, but it doesn’t have to.
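The equation from this card can be plugged into a short function to check the interpretation. The function name is made up for the example; the coefficients are the ones on the card.

```python
def predict_tip(bill_amount):
    """Predicted tip from the card's line: y-hat = 0.1462x - 0.8188."""
    b0 = -0.8188  # intercept
    b1 = 0.1462   # gradient
    return b0 + b1 * bill_amount

print(predict_tip(0))                     # the intercept: a negative "tip"
print(predict_tip(50))                    # predicted tip on a bill of 50
print(predict_tip(11) - predict_tip(10))  # change for one extra pound
```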
What is the SSR in the simple linear regression model?
It’s the sum of squares for the regression (the model), not the residual.
The SST (total) is the error around the baseline model: the flat line at the mean of y, which ignores the predictor and so has the maximum error
In our model we want less error. The SSR (regression, sometimes called model) is SST - SSE. The better the line fits the data, the smaller the SSE will be and the bigger the SSR (model) will be. We are comparing against the baseline model, where SSE = SST
In the simple linear regression equation what is the SSE?
The sum of the squared differences between the predicted value of y (from the model) and the actual value at each point. The predicted values are calculated by plugging the x values into the simple linear regression equation.
What is the coefficient of determination in regression?
Quantifies the ratio between SSR (model) and SST (total).
Shows us how well the model fits
It is an r2 value
Calculate SSR/SST; multiply the r2 by 100 to get a percentage
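A sketch of how SST, SSE, SSR and the coefficient of determination fit together. The data are hypothetical, and the predicted values are assumed to come from a least squares line (here ŷ = 2.2 + 0.6x for x = 1 to 5), which is what makes SSR = SST - SSE hold.

```python
def sums_of_squares(y, predicted):
    """Return (SST, SSE, SSR) for observed and predicted outcome values."""
    mean_y = sum(y) / len(y)
    sst = sum((yi - mean_y) ** 2 for yi in y)                  # error around the mean
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))  # error around the line
    ssr = sst - sse                                            # improvement due to the model
    return sst, sse, ssr

# Hypothetical outcome scores and their least squares predictions
y = [2, 4, 5, 4, 5]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]

sst, sse, ssr = sums_of_squares(y, predicted)
r_squared = ssr / sst
print(sst, sse, ssr)
print(round(r_squared * 100, 1))  # percentage of variance explained
```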
What is multicollinearity in multiple regression?
The fact that independent (predictor) variables can also be related to EACH other as well as to the outcome (dependent variable)
This is bad because it becomes harder to discern which factor is really affecting the dependent variable. If this happens the related variables are redundant and you’d take some of them out
Disadvantages to having lots of predictor variables in multiple regression?
It means that there are many more relationships to consider: you have to account for each predictor’s relationship to the outcome as well as to each other.
What is the (estimated) multiple regression equation?
ŷ = b0 + b1x1 + b2x2 + … (however many predictors)
Each coefficient in multiple regression should be interpreted as what, for example the b1 in b1x1?
Each coefficient is the estimated change in y (outcome) corresponding to a 1 unit increase in that predictor, when all other variables are held constant.
What is the adjusted R squared?
The R squared adjusted for the number of independent variables (lower than R squared)
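One common formula for the adjustment, sketched below, is 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of cases and k the number of predictors. The card does not say which formula is intended, so treat this as an assumption; the numbers are made up.

```python
def adjusted_r_squared(r_squared, n_cases, n_predictors):
    """Shrink R squared to penalise the number of predictors."""
    return 1 - (1 - r_squared) * (n_cases - 1) / (n_cases - n_predictors - 1)

# Hypothetical model: R squared of .60 with 50 cases
print(round(adjusted_r_squared(0.60, 50, 3), 3))   # 3 predictors
print(round(adjusted_r_squared(0.60, 50, 10), 3))  # 10 predictors: bigger penalty
```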
How can you use Standard Error (SE) in multiple regression?
Tells us how wide the ‘band’ is around the regression line - in the units of the outcome (dependent) variable
What VIF and tolerance values indicate that multicollinearity is not a problem?
VIF should be below 10
Tolerance should be higher than 0.2
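Both statistics come from the same quantity: the R² obtained when one predictor is regressed on all the other predictors. Tolerance is 1 - R² and VIF is 1 / tolerance. A minimal sketch, with hypothetical function names and the cut-offs taken from the card:

```python
def tolerance(r_squared_j):
    """r_squared_j: R squared from regressing predictor j on the other predictors."""
    return 1 - r_squared_j

def vif(r_squared_j):
    return 1 / (1 - r_squared_j)

def within_card_cutoffs(r_squared_j):
    """True if VIF is below 10 and tolerance is above 0.2."""
    return vif(r_squared_j) < 10 and tolerance(r_squared_j) > 0.2

for r2 in (0.50, 0.85, 0.95):
    print(r2, round(tolerance(r2), 2), round(vif(r2), 2), within_card_cutoffs(r2))
```

Because VIF is the reciprocal of tolerance, the tolerance cut-off of 0.2 is the stricter of the two (it corresponds to a VIF of 5).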
How many participants should there be in regression analysis?
Liberal: 15 ptps per predictor
Conservative: Tabachnick and Fidell (1996) suggest 50 + (8 x number of IVs)
Stepwise regression: 40 per IV
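The three rules of thumb are simple arithmetic, so a quick sketch makes them easy to compare for a given number of predictors. The function name is invented for the example.

```python
def required_participants(n_predictors):
    """Sample size suggested by each rule of thumb on the card."""
    return {
        "liberal": 15 * n_predictors,
        "conservative": 50 + 8 * n_predictors,
        "stepwise": 40 * n_predictors,
    }

print(required_participants(4))
```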
What is random sampling variability, and sampling error?
Random sampling variability is the fact that different samples from the same population give different results, i.e. you might get a different sample mean each time you sample
Sampling error is the difference between a sample statistic (e.g. the sample mean) and the true population value
What is the central limit theorem?
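Formulated by Pierre-Simon Laplace (1810):
Different samples taken from the same population will often have different sample stats such as mean and SD (random sampling variability). The theorem states that the means of those samples form a distribution that approaches normal as sample size increases, centred on the population mean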
What is the Standard error of the mean?
What is the standard deviation?
The standard error measures the expected difference (due to chance) between the sample mean and that of the population
The standard deviation measures how far individual scores spread around their mean. Because of random sampling variability, sample means also vary, and the standard error of the mean is essentially the standard deviation of the distribution of sample means
What’s the difference in a z score and a t score formula?
The t score formula uses the estimated standard error (calculated from the sample SD) instead of the population standard deviation
It is used when we need to estimate the standard deviation (error) of a population because we don’t know it, which is most situations. In this situation we use N-1 in the SD formula
How do we estimate confidence limits in t scores?
How does this then translate into Confidence intervals
We know that 95% of sample means will fall within +/-1.96 SEs of the mean (if the sampling distribution is normal). This holds when samples are above 100; if they are smaller you have to use a larger multiplier than 1.96.
The confidence limit will be that multiplier x the SE(mean), so:
Upper 95% CI: Sample mean + (1.96 x SE(mean))
Lower 95% CI: Sample mean - (1.96 x SE(mean))
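A rough sketch of the confidence interval calculation using Python’s standard library. The scores are made up, and the default multiplier of 1.96 is only appropriate for large samples; for a small sample like this one you would pass in a larger critical value from the t distribution.

```python
import math
import statistics

def confidence_interval_95(scores, critical_value=1.96):
    """Return (lower, upper) limits: sample mean -/+ critical value x SE(mean)."""
    sample_mean = statistics.mean(scores)
    se_mean = statistics.stdev(scores) / math.sqrt(len(scores))  # stdev uses N - 1
    margin = critical_value * se_mean
    return sample_mean - margin, sample_mean + margin

# Hypothetical scores
scores = [10, 12, 14, 16, 18]
lower, upper = confidence_interval_95(scores)
print(round(lower, 2), round(upper, 2))
```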
If there is no direction specified in the prediction of a test, is this a one tailed or two tailed test?
Two tailed
Is the RQ: ‘does alcohol improve problem solving ability?’ a one tailed or two tailed hypothesis?
1 tailed
SE equation?
Standard deviation / square root of N
Do the SD and sample N need to decrease or increase for the SE to decrease?
SD decreases
N increases
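A quick numerical check of how the SE responds to SD and N, with the SE taken as SD divided by the square root of N. The values are hypothetical.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of N."""
    return sd / math.sqrt(n)

print(standard_error(15, 25))   # baseline
print(standard_error(15, 100))  # four times the N halves the SE
print(standard_error(10, 25))   # a smaller SD also lowers the SE
```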
What is the coefficient of multiple determination? How could you get it to increase?
R2. You could add additional predictor variables to the multiple regression equation
What is the formula Z = (sample mean(Xbar) - population mean(μ)) / SD(σ)
The way of making a linear transformation of a normal variable into a standard normal variable
What is psychometric testing?
Measuring a psychological variable with objective numerical or categorical measures
Advantages of questionnaires?
- Don’t have to manipulate variables
- Quick and easy to administer
- Potentially large number of responses
- Anonymous responding may produce more truthful responses
Disadvantages of questionnaires?
- Response rate can be low, especially for internet questionnaires
- Difficult to correct misunderstanding (if you aren’t there then people may interpret questions in different ways)
- Potential influence of question order
- potential influence of question wording
What different formats of questionnaires are there?
Open questions
Closed questions:
- Yes/No
- PANAS - positive and negative affect schedule - continuum of agreement
- Likert-like
Advantages and disadvantages of open questions?
Advantages:
- Gets all the info
- Does not lead respondent
- More naturalistic
Disadvantages:
- Can be difficult to complete
- Difficult to code and analyse
- Poor when numeric results required
Advantages and disadvantages of closed questionnaires?
Advantages:
- Easy to code and analyse
- Good when a numerical result is required
- Quick for respondents to complete
Disadvantages:
- Can encourage bias (if worded badly)
- Can miss possible answers
- Create opinions where none exist
Why have lots of different items on a psychometric questionnaire?
So that you can try and eliminate error.
The observed response is a combination of true response and error.
What are the possible sources of error in psychometric questionnaires?
Ptp must read and understand question
Ptp must decide on their attitude
Ptp must match their attitude to the scale in the questionnaire
Things can go wrong at any of these steps
How can you minimise issues with misinterpretation of questions in questionnaires?
Short, clear and unambiguous
be very clear with how you define terms that can be interpreted differently, use simple language
Offer a ‘don’t know’ option where appropriate
Avoid double barrelled questions i.e. only ask about one thing, and clearly do it.
Avoid quantitative statements:
‘Private education is better than non-private’
- could say no and mean that private was the same or worse
Avoid leading questions
Avoid loaded terms
Avoid hypothetical situations
Avoid double-negatives
Why do we use 5 or 7 response options in a Likert response format?
Attitude is a continuum
5/7 have a reasonable number of choices but are not unmanageable
More than 7 - labelling becomes more difficult
If you leave them unlabelled this can lead to ambiguity
What is the funnel approach to question order in psychometric questionnaires?
Very broad questions going down to narrow specific questions
This encourages completion and allows for logical expression of ideas
Why use a scale rather than a single item?
May be several dimensions to the attitude we want to measure
Minimise the effects of random error
How can you balance a scale, what is a balanced scale?
A balanced scale is one where the questions are framed from both sides of a viewpoint. To balance a scale, word some items in favour of the viewpoint and some against it.
This can reduce acquiescence bias.
What is acquiescence bias?
Tendency to agree with a statement regardless of its content
How many items do you want in a pilot questionnaire?
More than you want overall - some will not be suitable
What is internal reliability?
How should items relate to each other?
How do you measure it?
The internal consistency of a test or scale
Items should correlate with each other. One set of items should correlate with another set.
Use Cronbach’s alpha or split-half
How do we deal with variance in item scores?
Items with large variance are good. They will discriminate between high and low scorers
What is item-item correlation?
How items correlate with others: look at overall correlation matrix
If an item is removed then you have to relook at all the data again
What is the item-total approach?
Reject items that do not correlate with the total score.
Preferred way to do this is correlate it with the total excluding the item you are measuring
How might you perform split-half reliability?
Does one half correlate with the other:
- Odd and even numbers
- First half with the second half
- Random selection
What is the spearman-brown reliability coefficient?
What is considered sufficient?
The coefficient for split half reliability
0.7 is normally adequate but 0.6 may be allowed.
Formula: (n x known reliability) / (1 + [(n - 1) x known reliability])
n is normally 2, but can be more if you need to determine how long your questionnaire needs to be.
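The usual form of the Spearman-Brown formula, with n as the factor by which the test is lengthened, can be sketched as below. The reliabilities used are hypothetical.

```python
def spearman_brown(known_reliability, n=2):
    """Predicted reliability when the test is made n times as long.

    With n = 2 this steps a split-half correlation up to the full test length.
    """
    return (n * known_reliability) / (1 + (n - 1) * known_reliability)

print(spearman_brown(0.6))       # split-half correlation of .6, full-length test
print(spearman_brown(0.6, n=3))  # same items, test made three times as long
```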
What is cronbach’s alpha?
The average of every correlation between every possible half of the items with every other possible half
Best measure of reliability - does not depend on one particular split
How do you calculate cronbach’s alpha coefficient?
Average all the Spearman-Brown coefficients: add them up and divide by the number of them
Or using mean squares:
(between people variance - residual variance) / between people variance
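In practice alpha is often computed from item and total-score variances: alpha = k/(k - 1) x (1 - sum of item variances / variance of total scores), where k is the number of items. This variance-based formula is an assumption here, not taken from the card, and the responses are made up.

```python
import statistics

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding every respondent's score."""
    k = len(item_scores)
    item_variances = [statistics.variance(item) for item in item_scores]
    totals = [sum(person) for person in zip(*item_scores)]  # total score per respondent
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: 3 items answered by 5 respondents
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```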
Main difference in item analysis and factor analysis?
Item analysis works on the assumption that we are only measuring one construct - and all questions relate to that.
Factor analysis is measuring the different factors that make up your questionnaires
What is factor analysis?
Data reduction statistical technique
Takes a large set of variables and reduces it using smaller set of factors/components that are independent of each other.
What is deductive, inductive and abductive reasoning?
Deductive:
- All As are Bs
- All Bs are Cs
- Therefore all As are Cs
Inductive:
A1 is B
A2 is B
A3 is B
Therefore all As are B
- you are generalising from a finite number of observations
Abductive reasoning (factor analysis): - devising a theory from observations (but not with direct testing)
The surprising fact C is observed,
If A were true C would be a matter of course, hence there is reason to suspect A is true
Not conclusively true however
What is a factor? What is its loading?
Hypothetical variable assumed to underlie a group of highly correlated items
The greater the loading the more that factor explains the variance of those items
An item’s ‘loading’ is a correlation coefficient, ranges from -1 to +1
Goal of factor analysis?
Trying to understand the underlying dimensions and psychological processes behind the responses
Have as clear a solution as possible
Types of factor analysis?
Exploratory:
- Highlight factors within a set of responses
Confirmatory:
- Used to test whether a set of data fits a pre-existing pattern of factors
Stages of exploratory factor analysis?
Extraction:
- Determines how many factors underlie the data
- Normally principal components analysis
Rotation:
- determines the loading of each item
- Either Orthogonal or Oblique
Orthogonal: assumes each factor is unique, theoretical model may suggest independent variables (e.g. varimax)
Oblique: More often used, determines the relationship of factors to one another
Tests to determine the suitability of data for factor analysis?
Look at correlation matrix, want a few correlations above 0.3.
Bartlett’s test of sphericity: should be significant (p < .05)
KMO - Kaiser-Meyer-Olkin: Should be above 0.6
Two techniques to determine number of factors to retain?
Look at the eigenvalue: Factors with a value above 1 will be retained
Scree test: at which point does it become horizontal
What is rotation in factor analysis?
Tries to present the solution so that the factors are easiest to interpret, with each item loading on the factor it is most correlated with.
What is test-retest reliability?
Correlation between two scores at two different times of testing; this can be affected by:
- Internal reliability
- External factors, such as mood or fatigue
Test-retest reliability is useful when measuring what sort of characteristics?
Characteristics that are stable and do not vary much in the short term
Others do vary e.g. mood
What is alternate forms reliability?
Attempt to reduce any carry-over effect from doing the test before
Can give another version of the test
Limited by internal reliability of the two forms of the test
What is validity generally?
The extent to which the test measures what it claims to measure.
What is face validity?
Subjective test of whether the items appear to be measuring what you think they are
What is content validity?
Trying to get a full range of the content the items should be measuring
Subjective
Seek out experts or people who are similar to those you are trying to measure
What is concurrent validity?
Comparing your test to a previously established test.
May be a more established and accepted test of the same variable, i.e. physiological measures.
What is predictive validity?
The ability of the test to predict future events, e.g. if intelligence predicts employment
Not necessary for a test - but may add to validity
What is convergent and discriminant validity?
Convergent: measures of the same variable must all correlate with each other, regardless of the type of measure e.g. interview should correlate with questionnaire and physiological measures.
Discriminant validity is the opposite, there should be little correlation between a test and measures of DIFFERENT variables
What is construct validity?
A form of validity used when the construct cannot be assessed against external criteria. It asks whether the test really measures the construct you believe it measures.
What is the nomological net?
The idea that a new construct has few specifiable associations in which to pin down the construct. The construct will send out roots in many directions attaching it to associations as research proceeds.
The construct will change and become more defined and rigid as research goes on.
How do you test construct validity?
TRICK QUESTION. HA. You Can’t. The whole investigation is a test in a way or some other suitably fluffy bollocks explanation. Martin goes on about this for about 15mins.
What is effect size?
The size of the relationship between two variables.
How can you define effect size?
What is a small, medium or large effect size?
R squared or eta squared (correlation) is an effect size
Cohen’s d: look at the difference between two means and standardise it (divide by the SD), so you can compare across studies:
d = (mean 1 - mean 2) / SD
d = 0.2 is small
d = 0.5 is medium
d = 0.8 is large
Can report as SD’s or the original units
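A sketch of Cohen’s d for two groups. The card only says to standardise by the SD; using the pooled SD of the two groups is an assumption made here, and the scores and function names are hypothetical.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Difference between two means divided by the pooled SD (an assumed choice)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

def size_label(d):
    """Label an effect using the small/medium/large cut-offs of 0.2, 0.5 and 0.8."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"

# Hypothetical scores for two groups
treatment = [5, 6, 7, 8, 9]
control = [4, 5, 6, 7, 8]
d = cohens_d(treatment, control)
print(round(d, 2), size_label(d))
```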
What is type 1 and 2 error?
Type 1 error:
- Finding a significant result when it is not really there.
Type 2 error:
- Failure to find a significant result when there is one.
What is statistical power?
Likelihood of finding an effect when one does really exist:
A study with a power of 0.8 has an 80% chance of finding a significant effect when it does actually exist.
What affects power?
Effect size - the larger the better
Sample size - the larger the better
Alpha level - the less stringent the more power, however increases chance of type 1 error. So can’t change.
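The three influences on power can be seen in a rough normal-approximation sketch for a two-tailed comparison of two equal-sized groups. This is an approximation chosen for illustration (a proper calculation would use the t distribution), and the function name and inputs are hypothetical.

```python
from statistics import NormalDist

def approx_power(effect_size_d, n_per_group, alpha=0.05):
    """Approximate power of a two-tailed, two-group test (normal approximation)."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)               # 1.96 when alpha is .05
    shift = effect_size_d * (n_per_group / 2) ** 0.5  # expected size of the test statistic
    return z.cdf(shift - critical) + z.cdf(-shift - critical)

print(round(approx_power(0.5, 64), 2))              # medium effect, 64 per group
print(round(approx_power(0.5, 20), 2))              # smaller sample: less power
print(round(approx_power(0.2, 64), 2))              # smaller effect: less power
print(round(approx_power(0.5, 64, alpha=0.01), 2))  # stricter alpha: less power
```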
How can you do effect size before research has started?
Could do pilot.
Getting the average effect size from other research (meta-analysis). If research is new then the effect size might be new (you don’t know)
On the side can do cost-benefit analysis. Minimum effect size that you need to see for the study to be worthwhile.
How do you do a meta-analysis?
Define variables of interest.
Plan database search - inclusion and exclusion criteria
Calculate effect sizes from other research
Combine them: Convert to z score, calculate average z-score then convert back.
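The combining step can be sketched for correlations using Fisher’s z transformation. This version takes a simple unweighted average, which is an assumption; real meta-analyses usually weight each study, for example by sample size. The correlations are made up.

```python
import math

def combine_correlations(correlations):
    """Average correlations via Fisher's z, then convert back to r."""
    z_scores = [math.atanh(r) for r in correlations]  # r -> z
    mean_z = sum(z_scores) / len(z_scores)            # unweighted average
    return math.tanh(mean_z)                          # z -> r

# Hypothetical effect sizes (r) from three studies
study_rs = [0.20, 0.35, 0.50]
combined = combine_correlations(study_rs)
print(round(combined, 3))
```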
How can you increase power?
Less stringent alpha (not gonna happen)
Increase sample size
Reduce noise (reduce SD):
- Standardise procedures
- More reliable measures
- Repeated measures design
Focused (planned) contrasts rather than omnibus test.
- Use a test that only looks for a linear relationship: not ANOVA.
Combine results of individual studies
What are attitudes?
A psychological tendency expressed as evaluating a particular entity with some degree of favour or disfavour
Latent
Complex
Not the responses, but the attitude underlying that response