Wronged Questions: Unsupervised Learning Flashcards
If two observation profiles are close together along the vertical axis, they have a ___________.
Small Euclidean distance
If two observation profiles have a similar shape, they have a _____________.
Small correlation-based distance
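To make the distinction concrete, here is a minimal NumPy sketch (the profile values are made up for illustration) contrasting the two distance measures on profiles that share a shape but differ in level:

```python
import numpy as np

# Two observation profiles with the same shape but different levels.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = a + 10.0

euclid = np.linalg.norm(a - b)             # 20.0: far apart vertically
corr_dist = 1 - np.corrcoef(a, b)[0, 1]    # ~0: shapes are identical
print(euclid, corr_dist)
```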
________ can happen when centroid linkage is used.
Inversion
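An inversion can be reproduced with SciPy's centroid linkage; the three triangle coordinates below are made-up values chosen so that the second fusion height falls below the first:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Three points forming a near-equilateral triangle (illustrative coordinates).
X = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.8]])

Z = linkage(X, method="centroid")
print(Z[:, 2])  # fusion heights [2.0, 1.8]: the second fusion occurs
                # BELOW the first, i.e. an inversion in the dendrogram
```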
T/F: Hierarchical clustering may not assign extreme outliers to any cluster.
False. Hierarchical clustering assigns every observation to some cluster, so even extreme outliers are forced into a cluster (which can distort the resulting clusters).
T/F: The resulting dendrogram can be used to obtain different numbers of clusters.
True. Depending on the height at which we cut the dendrogram, we obtain the cluster assignments for different numbers of clusters.
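A minimal SciPy sketch of this idea, assuming randomly generated data: the linkage (dendrogram) is computed once and then cut for several cluster counts:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cut_tree

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))

Z = linkage(X, method="complete")   # build the dendrogram once
for k in (2, 3, 4):
    labels = cut_tree(Z, n_clusters=k).ravel()
    print(k, np.bincount(labels))   # cluster sizes for each cut
```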
T/F: Hierarchical clustering is not robust to small changes in the data.
True. Small changes in the data can result in different cluster assignments.
The loadings for the first PC are read from the ___ axis of a biplot.
Top
The loadings for the second PC are read from the _____ axis of a biplot.
Right
T/F: Using more principal components in a PCR model generally leads to a decrease in the model’s variance.
False. It leads to an increase in model variance.
T/F: The incorporation of additional principal components tends to increase the model’s squared bias.
False. It leads to a decrease in squared bias
T/F: PCR becomes identical to ordinary least squares regression when all principal components are employed.
True
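A small scikit-learn sketch of this equivalence, using made-up data with p = 3 predictors: regressing on all three principal components reproduces the OLS fitted values.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=50)

ols_pred = LinearRegression().fit(X, y).predict(X)

Z = PCA(n_components=3).fit_transform(X)         # all p = 3 components
pcr_pred = LinearRegression().fit(Z, y).predict(Z)

print(np.allclose(ols_pred, pcr_pred))           # True
```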
Two unsupervised learning methods
Cluster analysis, PCA
T/F: If a dataset has 3 independent continuous predictors, the maximum number of principal components that can be extracted is three.
True. With p = 3 linearly independent predictors (and enough observations), at most min(n - 1, p) = 3 principal components can be extracted.
T/F: The loadings are constrained so that their sum of squares is equal to zero, since otherwise setting these elements to be arbitrarily large in absolute value could result in an arbitrarily large variance.
False. The loadings are constrained so that their sum of squares is equal to one, since otherwise setting these elements to be arbitrarily large in absolute value could result in an arbitrarily large variance.
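This constraint is easy to verify with scikit-learn's PCA, whose loading vectors (the rows of components_) are returned with unit norm; the data below are randomly generated for illustration:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 4))

pca = PCA().fit(X)
# Each loading vector (row of components_) has sum of squares equal to one.
print(np.sum(pca.components_ ** 2, axis=1))   # [1. 1. 1. 1.]
```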
T/F: The principal component loading vectors represent the directions in feature space along which the data vary the most, while the principal component scores are the projections of the data along these directions.
True.
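A minimal sketch of the scores-as-projections fact, assuming randomly generated data and scikit-learn's convention of centering before projecting:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
X = rng.normal(size=(25, 3))

pca = PCA().fit(X)
scores = pca.transform(X)

# Scores are the projections of the centered data onto the loading vectors.
manual = (X - X.mean(axis=0)) @ pca.components_.T
print(np.allclose(scores, manual))   # True
```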
T/F: Principal components provide low-dimensional linear surfaces that are closest to the observations.
True
T/F: The first principal component loading vector is the line in p-dimensional space that is closest to the n observations.
True
T/F: Together the first M principal component score vectors and the first M principal component loading vectors provide the best M-dimensional approximation to the i-th observation.
True
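A sketch of the best M-dimensional approximation, assuming randomly generated data: reconstructing from the first M scores and loadings gives the closest rank-M linear fit to the observations.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
X = rng.normal(size=(20, 5))

M = 2
pca = PCA(n_components=M).fit(X)
scores = pca.transform(X)                            # first M score vectors

# Rank-M reconstruction from the first M scores and loadings.
approx = X.mean(axis=0) + scores @ pca.components_
print(np.mean((X - approx) ** 2))   # reconstruction error of the best
                                    # M-dimensional linear approximation
```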
T/F: A maximum of max(n-1, p) distinct principal components can be created from a dataset with n observations and p features.
False. A maximum of min(n-1, p) distinct principal components can be created from a dataset with n observations and p features.
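A quick check of the min(n-1, p) limit with scikit-learn, using made-up data with n = 4 and p = 10: only three components carry any variance.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(5)
X = rng.normal(size=(4, 10))        # n = 4 observations, p = 10 features

pca = PCA().fit(X)
# Only min(n - 1, p) = 3 components carry variance; the rest are ~0.
print(pca.explained_variance_)
```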
T/F: For K-means, running the algorithm once is guaranteed to find clusters with the global minimum of the total within-cluster variation.
False. K-means finds a local optimum that depends on the initial random assignment, so the algorithm should be run multiple times from different initial configurations and the best solution kept.
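A minimal sketch of why multiple runs matter, using unstructured random data where many local optima exist; the inertia (total within-cluster variation) can differ across random starts:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(6)
X = rng.normal(size=(100, 2))       # unstructured data: many local optima

# With a single random start per run, different seeds can land in
# different local optima of the within-cluster variation.
inertias = [KMeans(n_clusters=5, n_init=1, random_state=s).fit(X).inertia_
            for s in range(5)]
print(inertias)
```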
T/F: For K-means, the clusters do not have to be nested upon changing the desired number of clusters.
True. K-means cluster assignments for different values of K need not be nested, whereas hierarchical clustering produces nested clusters by construction.
T/F: For K-means, there are fewer areas of consideration in clustering a dataset in comparison to hierarchical clustering.
True. K-means clustering requires pre-specifying only the number of clusters, whereas hierarchical clustering requires choosing a measure of dissimilarity, choosing a linkage, and deciding on the number of clusters (i.e., at what height to cut the dendrogram).
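A side-by-side sketch of the choices each method requires, assuming random illustrative data:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
X = rng.normal(size=(30, 2))

# K-means: the main choice is the number of clusters K.
km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Hierarchical: choose a dissimilarity, a linkage, AND where to cut.
Z = linkage(X, method="complete", metric="euclidean")
hc_labels = fcluster(Z, t=3, criterion="maxclust")
```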
T/F: Average linkage clustering can lead to the formation of extended, trailing clusters.
False. Single linkage clustering can lead to the formation of extended, trailing clusters in which single observations are fused one-at-a-time.
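The chaining tendency of single linkage can be seen in a small sketch on unstructured random data, where single linkage often leaves one large cluster plus near-singletons:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(8)
X = rng.normal(size=(60, 2))

for link in ("single", "complete"):
    labels = AgglomerativeClustering(n_clusters=3, linkage=link).fit_predict(X)
    # Single linkage often yields one large cluster plus near-singletons
    # (the "trailing" pattern); complete linkage is typically more balanced.
    print(link, np.bincount(labels))
```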
T/F: The number of possible ways to reorder a dendrogram is 2^n, with n representing the total number of leaves.
False. The number of possible ways to reorder a dendrogram is 2^(n-1), with n representing the total number of leaves. This is because at each of the n-1 points where fusions occur, the positions of the two fused branches could be swapped without affecting the meaning of the dendrogram.
T/F: It’s generally necessary to execute the algorithm multiple times based on the final number of clusters chosen for hierarchical clustering.
False. It is only necessary to execute the algorithm a single time, regardless of how many clusters are ultimately decided to use. One single dendrogram can be used to obtain any number of clusters.
T/F: Agglomerative clustering is the least common type of hierarchical clustering.
False. Bottom-up or agglomerative clustering is the most common type of hierarchical clustering.
T/F: We cannot draw conclusions about the similarity of two observations based on their proximity along the horizontal axis.
True. We cannot draw conclusions about the similarity of two observations based on their proximity along the horizontal axis.
Rather, we draw conclusions about the similarity of two observations based on the location on the vertical axis where the branches containing those two observations are first fused.
T/F: K-means clustering aims to minimize the average distance within clusters.
False. K-means clustering aims to minimize the average squared distance within clusters.
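A sketch verifying the objective with scikit-learn, assuming random data: KMeans.inertia_ equals the total within-cluster sum of squared Euclidean distances to the centroids.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(9)
X = rng.normal(size=(100, 2))

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# inertia_ is the objective K-means minimizes: the total within-cluster
# sum of SQUARED Euclidean distances to the cluster centroids.
manual = sum(((X[km.labels_ == k] - km.cluster_centers_[k]) ** 2).sum()
             for k in range(3))
print(np.isclose(km.inertia_, manual))   # True
```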
T/F: K-means clustering does not require pre-specification of the number of clusters.
False. K-means clustering requires pre-specification of the number of clusters
T/F: K-means clustering looks to find heterogeneous subgroups among the observations.
False. Clustering looks to find homogeneous subgroups among the observations.