R Unsupervised Flashcards
1
Q
What metric should you use to select the number of clusters?
A
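tot.withinss (total within-cluster sum of squares)
Access it with km.out$tot.withinss, where km.out is the object returned by kmeans(). It generally keeps shrinking as k grows, so compare it across several values of k and look for the elbow rather than the minimum.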
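Example (a minimal sketch, not a definitive recipe): comparing tot.withinss across several values of k. The USArrests dataset, the range 1:8, the seed, and the names x, ks, and wss are assumptions chosen for illustration.

```r
# Sketch: compute tot.withinss for a range of candidate k and plot the curve.
set.seed(1)                  # k-means uses random starts, so fix the seed
x <- scale(USArrests)        # standardize so no single variable dominates
ks <- 1:8                    # candidate cluster counts (arbitrary range)

# One k-means fit per k; keep only the total within-cluster sum of squares
wss <- sapply(ks, function(k) kmeans(x, centers = k, nstart = 50)$tot.withinss)

# Look for the "elbow" where the curve stops dropping sharply
plot(ks, wss, type = "b", xlab = "k", ylab = "tot.withinss")
```

The elbow is read off the plot by eye, so treat the chosen k as a judgment call.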
2
Q
Which function runs k-means clustering?
A
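kmeans(x, centers = k, nstart = 50)  # x: numeric data frame or matrix; k: number of clusters
Setting nstart = 50 is a good idea: it tries 50 random starts and keeps the best one. Part of the stats package, which is loaded by default.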
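Example (a minimal sketch): fitting k-means and reading the main pieces of the result. The dataset USArrests, k = 3, and the seed are assumptions for illustration only.

```r
# Sketch: fit k-means and inspect the returned object.
set.seed(42)                                   # reproducible random starts
x <- scale(USArrests)                          # standardized copy of the data
km.out <- kmeans(x, centers = 3, nstart = 50)  # k = 3 is an arbitrary choice

km.out$cluster        # cluster label assigned to each row (state)
km.out$centers        # one row of coordinates per cluster center
km.out$withinss       # within-cluster sum of squares, one value per cluster
km.out$tot.withinss   # sum of the withinss values
```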
3
Q
Which function runs PCA?
A
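prcomp()
pca = prcomp(USArrests, scale. = TRUE)  # the argument is named scale. (scale = TRUE also works through partial matching)
Always set scale. = TRUE when the variables are measured on different scales. Part of the stats package, which is loaded by default.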
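Example (a minimal sketch): running PCA on USArrests and looking at what prcomp() returns. The variable name pve is an assumption introduced here for the proportion of variance explained.

```r
# Sketch: PCA on standardized variables and the main result components.
pca <- prcomp(USArrests, scale. = TRUE)

pca$rotation    # loading vectors, one column per principal component
head(pca$x)     # scores: the data projected onto the components
pca$sdev        # standard deviation of each component

# Proportion of variance explained by each component
pve <- pca$sdev^2 / sum(pca$sdev^2)
summary(pca)    # prints the same proportions plus cumulative totals
```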
4
Q
How do you plot the first two principal components with loading vectors?
A
biplot(pca, scale = 0)
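Example (a minimal sketch): the biplot call in context, followed by a rough manual version that shows what is being drawn. With scale = 0 the arrows represent the loadings without rescaling. The names scores and loadings, the cex value, and the colors are assumptions for illustration.

```r
# Sketch: biplot of the first two components, then a rough manual equivalent.
pca <- prcomp(USArrests, scale. = TRUE)
biplot(pca, choices = 1:2, scale = 0, cex = 0.6)

# Manual version: scores as points, loading vectors as arrows from the origin.
# biplot() draws the arrows on their own axes, so this plot looks different.
scores <- pca$x[, 1:2]
loadings <- pca$rotation[, 1:2]
plot(scores, xlab = "PC1", ylab = "PC2")
arrows(0, 0, loadings[, 1], loadings[, 2], col = "red", length = 0.1)
text(loadings, labels = rownames(loadings), col = "red", pos = 3)
```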