SPSS skills Flashcards
Recoding variables into new variables
(recode the values of existing variables into new variables without altering the original data)
useful to categorize continuous data e.g. age
1) select the variable you want to recode
2) transform
3) recode into different variables
4) insert variable into input box
5) change name and label
6) old and new values
7) split variable into categories
e.g. age -> split 17-19yo (cat 1), 20-22yo (cat 2), 23-25yo (cat 3)
8) range - 17
9) through - 19
10) new value - value - 1 (number of the category)
11) add
12) repeat for all categories
13) continue
14) paste, then run the syntax
15) new variable will appear, change measurement to ordinal
16) edit value labels
e.g. age -> value 1 = 17-19yo -> add
value 2 = 20-22yo -> add, and so on
17) check if it has been recoded correctly, analyze
18) descriptive statistics
19) frequencies
20) insert original variable and recoded variable into input box
21) run
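22) compare the frequency table of the original variable with the frequency table of the new variable: the cumulative percentage at the top value of each range (e.g. age 19) should equal the cumulative percentage of the matching category (e.g. cat 1); if they match, the recode is ok
23) if you want to change labels after recoding and pasting: data -> define variable properties, then change them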
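The recode-and-check logic above can be sketched outside SPSS in plain Python. This is only an illustration: the band list follows the age example, while the names `AGE_BANDS` and `recode_age` and the sample ages are made up.

```python
from collections import Counter

# Age bands from the example: (lowest value, highest value, category code).
AGE_BANDS = [(17, 19, 1), (20, 22, 2), (23, 25, 3)]

def recode_age(age, bands=AGE_BANDS):
    """Return the category code for an age, or None if no band matches."""
    for low, high, code in bands:
        if low <= age <= high:
            return code
    return None

ages = [17, 18, 19, 20, 22, 23, 25, 25]   # made-up sample data
age_cat = [recode_age(a) for a in ages]    # new variable; ages is untouched

# Frequency check: the count per category should equal the summed counts
# of the original values that fall in that band.
original_freq = Counter(ages)
recoded_freq = Counter(age_cat)
for low, high, code in AGE_BANDS:
    in_band = sum(original_freq[a] for a in range(low, high + 1))
    print(code, in_band, recoded_freq[code])
```

Keeping the result in a new list mirrors "recode into different variables": the original values stay available for the comparison.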
Reverse code variable
(to make sure all variables are aligned in the same direction)
1) transform
2) recode into different variables
3) display variable names
4) choose the items you want to recode
5) new name and label
6) old value and new value
1 -> 5
2 -> 4
3 -> 3
4 -> 2
5 -> 1
7) continue, paste, run
8) value labels the other way around
(if originally in the scale e.g. 1 was totally disagree and 5 totally agree)
value labels for the new recoded variable are going to be 1 totally agree and 5 totally disagree
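For a scale that runs from a lowest to a highest value, the old -> new pairs above follow one rule: new = lowest + highest - old. A minimal Python sketch, assuming a 1-5 scale (`reverse_code` is a hypothetical helper name):

```python
def reverse_code(value, low=1, high=5):
    """Reverse a score on a low..high scale (1 -> 5, 2 -> 4, ... on 1-5)."""
    if not low <= value <= high:
        raise ValueError(f"{value} is outside the {low}-{high} scale")
    return low + high - value

original = [1, 2, 3, 4, 5]
reversed_scores = [reverse_code(v) for v in original]
print(reversed_scores)
```

Applying the function twice gives the original score back, which is a quick way to check the mapping.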
Explore and boxplot
1) analyze
2) descriptive statistics
3) explore
4) choose variable
5) statistics - outliers
6) plots - histogram
7) paste
to make a boxplot
1) graph
2) legacy dialogs
3) box plots
4) simple
5) summary of selected variables
6) define
7) move your variable into the "boxes represent" box
8) ok
Exploratory factor analysis
1) analyze
2) dimension reduction
3) factor - replace variable names
4) include items
5) descriptives
6) univariate descriptives
- coefficients
- significance levels
- KMO and Bartlett's test of sphericity
7) extraction
- method -> principal axis factoring
8) scree plot
9) eigenvalue greater than 1
10) rotation oblimin
11) options -> sorted by size and suppress small coefficients
12) absolute value below 0.30
Reliability analysis
1) analyze
2) scale
3) reliability analysis
4) display variable names
5) include your items
6) model -> alpha
7) statistics
8) scale if item deleted
9) paste
- if an item shows a negative item-total correlation, reverse code it
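What the alpha model computes can be sketched in plain Python. This assumes the standard Cronbach's alpha formula, alpha = k/(k-1) x (1 - sum of item variances / variance of the total score); the function name and the scores are made up for illustration.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, same respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # total score per respondent
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))

# Made-up data: three items answered by five respondents.
item_a = [1, 2, 3, 4, 5]
item_b = [1, 2, 3, 4, 5]
item_c_reversed_wording = [5, 4, 3, 2, 1]   # item worded in the opposite direction

alpha_consistent = cronbach_alpha([item_a, item_b, item_a])
alpha_unrecoded = cronbach_alpha([item_a, item_b, item_c_reversed_wording])
print(alpha_consistent, alpha_unrecoded)
```

An item that still points in the opposite direction pulls alpha down sharply, which is why such items are reverse coded before the analysis.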
Correlation analysis
1) analyze
2) correlate
3) bivariate
4) select variables
5) set if one tailed or two tailed
6) paste
correlation r lies between -1 and 1
closer to 1, strong positive correlation
closer to -1, strong negative correlation
closer to 0, weak or no linear correlation
p value (significance) - less than 0.05, considered significant
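As a rough illustration of what r measures, here is a plain Python sketch of the Pearson correlation. The function name and data are made up, and the sketch only computes r, not the p value.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

hours_studied = [1, 2, 3, 4]   # made-up data
score_up = [2, 4, 6, 8]        # rises with hours: r close to 1
score_down = [8, 6, 4, 2]      # falls with hours: r close to -1

print(pearson_r(hours_studied, score_up))
print(pearson_r(hours_studied, score_down))
```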
Explained variance
tells you how well the factors represent the original set of observed variables
total variance = sum of all eigenvalues; % of variance for a factor = its eigenvalue divided by that sum, x 100
% of variance = e.g. if the % of variance for one factor is 40%, it means that 40% of the variability in your data is accounted for by that factor
cumulative % = running total of the % of variance across factors; whatever remains up to 100% is considered unexplained
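The arithmetic behind the % of variance and cumulative % columns can be sketched in plain Python. It assumes the full list of eigenvalues is available; the eigenvalues and the function name are made up.

```python
def variance_table(eigenvalues):
    """Return (eigenvalue, % of variance, cumulative %) for each factor."""
    total = sum(eigenvalues)
    rows = []
    cumulative = 0.0
    for ev in eigenvalues:
        pct = 100 * ev / total
        cumulative += pct
        rows.append((ev, pct, cumulative))
    return rows

eigenvalues = [2.0, 1.0, 0.6, 0.4]   # made-up values for four factors
table = variance_table(eigenvalues)
for ev, pct, cum in table:
    print(f"{ev:.2f}  {pct:.1f}%  {cum:.1f}%")
```

If only the first two factors are kept, the cumulative % at factor 2 is the explained variance and the rest is unexplained.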
Scree plot
each point on the scree plot represents an eigenvalue
factors before the inflection point (elbow) have larger eigenvalues, i.e. they explain more variance (keep these)
factors after the inflection point have small eigenvalues, i.e. they explain little variance (insignificant factors)
Factor loadings
show how strongly each variable is associated with each factor (factors = latent concepts)
range between -1 and 1
positive loading - as factor increases, variable increases
neg loading - as factor increases variable decreases
absolute value of 0.3 or higher is usually considered meaningful
Variables are typically assigned to the factor on which they have the highest loading.
e.g. Question 1 is associated with Factor 1 because it has the highest loading on that factor (0.72).
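The assignment rule can be sketched in plain Python: pick the factor with the largest absolute loading and ignore loadings under the 0.3 cutoff. Apart from the 0.72 for Question 1, the loadings are made up, and `assign_factor` is a hypothetical helper name.

```python
# Loadings per question: [loading on factor 1, loading on factor 2].
loadings = {
    "Q1": [0.72, 0.10],
    "Q2": [0.15, -0.64],   # negative loadings count by their absolute value
    "Q3": [0.22, 0.18],    # nothing reaches the cutoff
}

def assign_factor(item_loadings, cutoff=0.30):
    """Return the 1-based factor number with the highest absolute loading,
    or None when no loading reaches the cutoff."""
    best = max(range(len(item_loadings)), key=lambda i: abs(item_loadings[i]))
    if abs(item_loadings[best]) < cutoff:
        return None
    return best + 1

assignment = {item: assign_factor(vals) for item, vals in loadings.items()}
print(assignment)
```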
Compute a new variable
1) transform
2) compute
3) name the new variable
e.g. if you want to convert from hours to minutes, choose your variable and multiply it by 60 (minutes in an hour) in the numeric expression box
4) paste, run
5) change label
6) change measurement level
if you want to combine two variables
1) compute
2) numeric expression - v1 + v2
3) paste, run
to make a mean score, divide the sum by the number of variables, e.g. (v1 + v2) / 2
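The three computations (convert units, combine two variables, mean score) can be sketched in plain Python, one value per case. All variable names and numbers are made up.

```python
# Convert hours to minutes: multiply each case's value by 60.
hours = [1.5, 2.0, 0.5]
minutes = [h * 60 for h in hours]

# Combine two variables case by case, then turn the sum into a mean score.
v1 = [4, 2, 5]
v2 = [2, 4, 3]
total = [a + b for a, b in zip(v1, v2)]
mean_score = [(a + b) / 2 for a, b in zip(v1, v2)]

print(minutes, total, mean_score)
```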
Eigenvalues
eigenvalue for each factor represents the amount of variance that the factor explains in the data
larger eigenvalue = more variance explained by that factor
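The "eigenvalue greater than 1" setting used in the extraction step of the factor analysis amounts to a simple filter. A small Python sketch with made-up eigenvalues:

```python
eigenvalues = [2.8, 1.3, 0.9, 0.6, 0.4]   # made-up, one per factor, largest first

# Keep the factors whose eigenvalue is greater than 1 (factor numbers start at 1).
retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev > 1]
print(retained)
```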
Check for missing values
1) data
2) define variable properties
3) move all variables to the right box
4) continue
5) go through all your variables and tick the missing box for any value that does not belong to the scale (usually a likert scale; the strange value is usually "don't know")
6) run syntax
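Why the strange values need to be flagged as missing can be shown with a small Python sketch. It assumes a 1-5 likert item where 9 stands for "don't know"; that code and the responses are made up.

```python
MISSING_CODES = {9}                 # hypothetical "don't know" code
responses = [1, 5, 9, 3, 9, 4]      # made-up answers on a 1-5 scale

# Keep only the values that belong to the scale.
valid = [r for r in responses if r not in MISSING_CODES]

mean_all = sum(responses) / len(responses)   # distorted by the 9s
mean_valid = sum(valid) / len(valid)         # based on real answers only
print(mean_all, mean_valid)
```

With the 9s left in, the mean ends up above the highest possible scale value, which is the kind of distortion that flagging missing values prevents.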
Select cases
1) data
2) select cases
3) if condition is satisfied
4) if -> choose variable and write down condition
e.g. sex = 1 (males)
5) paste and run
6) only selected cases will be used, the others will be crossed out in data view
if you want to compare frequencies of selected cases with all cases:
run freq table for the variable you want (filter is on)
then deselect filter
1) data
2) select cases
3) ALL cases (filter off)
4) paste and run - filter off
run freq table for the same variable you want (filter is now off), compare results
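The filter-on versus filter-off comparison can be sketched in plain Python. The condition sex = 1 follows the example above; the cases and the `smoker` variable are made up.

```python
from collections import Counter

# Made-up cases; sex = 1 means male, as in the example.
cases = [
    {"sex": 1, "smoker": "yes"},
    {"sex": 2, "smoker": "no"},
    {"sex": 1, "smoker": "no"},
    {"sex": 2, "smoker": "no"},
    {"sex": 1, "smoker": "yes"},
]

# Filter on: keep only the cases that satisfy the condition.
selected = [c for c in cases if c["sex"] == 1]
freq_selected = Counter(c["smoker"] for c in selected)

# Filter off: frequencies over all cases.
freq_all = Counter(c["smoker"] for c in cases)

print(freq_selected, freq_all)
```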