7.3 Unsupervised Learning Algorithms and Other Models Flashcards
List three unsupervised machine learning algorithms.
- Principal component analysis (PCA)
- Hierarchical clustering
- K-means clustering
PCA stands for what?
Principal component analysis.
PCA summarizes the information in a large number of ___________ factors into a much smaller set of _________ factors.
correlated;
uncorrelated;
Give the term that describes the following: The vectors in a PCA that define the uncorrelated factors (composite variables), which are linear combinations of the original features.
Eigenvectors;
Within a PCA, each ________ has an ________. An _________ is the proportion of total variance in the data set explained by the _________.
eigenvector;
eigenvalue;
eigenvalue;
eigenvector;
Within a PCA, explain how eigenvectors are applied.
Eigenvectors are essentially (my interpretation) weightings that are applied to each independent variable to change what would otherwise be a set of correlated independent variables (e.g., interest rates, stock prices, etc.) into a set of uncorrelated composite variables. The weights can be negative and need not sum to 100%, so they are not literal percentages.
Within a PCA, the __________ with the highest __________ defines the composite variable (principal component) that explains the most variance and is therefore the most important to the model.
eigenvector;
eigenvalue;
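The PCA cards above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the two features and their values are made up, and the closed-form eigen-decomposition works only for two features (real work would typically use a linear algebra library). It shows eigenvalues as shares of total variance and eigenvectors used as weights to produce uncorrelated composite variables.

```python
import math

# Hypothetical toy data: two positively correlated features.
xs = [2.0, 4.0, 6.0, 8.0, 10.0]
ys = [1.0, 3.0, 2.0, 5.0, 4.0]

def mean(values):
    return sum(values) / len(values)

def cov(u, v):
    """Sample covariance of two equal-length lists."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

# Covariance matrix [[a, b], [b, d]] of the two features.
a, b, d = cov(xs, xs), cov(xs, ys), cov(ys, ys)

# Closed-form eigenvalues of a symmetric 2x2 matrix, largest first.
centre = (a + d) / 2
half_gap = math.hypot((a - d) / 2, b)
eigenvalues = [centre + half_gap, centre - half_gap]

# Unit eigenvector for the largest eigenvalue; the second is perpendicular to it.
# (This formula assumes b != 0, i.e., the features really are correlated.)
norm = math.hypot(b, eigenvalues[0] - a)
v1 = (b / norm, (eigenvalues[0] - a) / norm)
v2 = (-v1[1], v1[0])

# Share of total variance explained by each eigenvector.
explained = [lam / sum(eigenvalues) for lam in eigenvalues]

# Apply the eigenvectors as weights on the centred features to get component scores.
cx = [x - mean(xs) for x in xs]
cy = [y - mean(ys) for y in ys]
pc1 = [v1[0] * x + v1[1] * y for x, y in zip(cx, cy)]
pc2 = [v2[0] * x + v2[1] * y for x, y in zip(cx, cy)]

print("explained variance:", [round(e, 3) for e in explained])
print("covariance of pc1 and pc2:", round(cov(pc1, pc2), 9))
```

On this toy data the first component explains roughly 94% of the total variance, and the two component score series have (numerically) zero covariance even though the original features were correlated.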
PCA is totally a ____ ___ where you are completely dependent upon the computer to make predictions. This may work fine on __-_____ data but maybe not so well on ___-__-_____ data.
black box;
in-sample;
out-of-sample;
__-____ clustering requires that you be familiar enough with the data to know how many _______ you should have.
K-means;
clusters;
K-means clustering partitions observations into “__” ________ clusters.
k;
nonoverlapping;
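A minimal sketch of the K-means idea from the two cards above: you choose k up front, and every observation ends up in exactly one cluster. The data points, the `k_means` helper name, and the naive "first k points" starting centroids are all assumptions made for illustration; library implementations use smarter initialization.

```python
import math

def k_means(points, k, max_iter=100):
    """Assign each point to its nearest centroid, recompute centroids, repeat."""
    centroids = [points[i] for i in range(k)]  # naive init (assumption): first k points
    labels = None
    for _ in range(max_iter):
        # Assignment step: each observation gets exactly one label (nonoverlapping).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # no assignment changed, so the algorithm has converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Hypothetical observations forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

Choosing `k=2` here relies on already knowing the data has two groups, which is exactly the familiarity with the data that the card says K-means requires.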
______ _______ is an ________ unsupervised algorithm used to form a _________ of clusters. In the bottom-up approach, each ____________ starts as its own cluster.
Hierarchical clustering;
iterative;
hierarchy;
observation;
In an ____________ (or ______-__) clustering algorithm, you start with each observation as its own cluster and repeatedly merge the two closest (most similar) clusters into larger ones.
agglomerative;
bottom-up;
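The bottom-up procedure can be sketched in a few lines. This is an illustration only: the data are made up, the `agglomerative` helper is hypothetical, and "closest" is measured here with single linkage (the minimum distance between any two members), which is one of several possible choices and not something the cards specify.

```python
import math
from itertools import combinations

def agglomerative(points, n_clusters):
    """Start with one cluster per observation; merge the two closest until n_clusters remain."""
    clusters = [[p] for p in points]
    merges = []  # record of each merge, i.e., the hierarchy as it is built

    def single_linkage(c1, c2):
        # Assumption: cluster distance = smallest distance between any two members.
        return min(math.dist(p, q) for p in c1 for q in c2)

    while len(clusters) > n_clusters:
        # Find the pair of clusters (i < j) that are closest to each other.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merges.append((list(clusters[i]), list(clusters[j])))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, merges

# Hypothetical observations forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
clusters, merges = agglomerative(points, n_clusters=2)
for cluster in clusters:
    print(sorted(cluster))
```

Letting the loop run down to `n_clusters=1` would record the full hierarchy in `merges`, from individual observations up to one all-inclusive cluster.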
In a _________ (or ___-____) clustering algorithm, you start with one giant cluster and then partition that cluster into smaller and smaller clusters.
divisive;
top-down;
A ________ _________ consists of nodes connected by links.
neural network;
List and describe the three layers of a neural network.
- Input layer (contains nodes with values for the features, i.e., the independent variables);
- Hidden layers (contain nodes to which the nodes of the input layer connect; there can be multiple hidden layers);
- Output layer (contains the node or nodes that give the network's prediction, i.e., the dependent variable);
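The three layers can be illustrated with a tiny forward pass. Everything specific here is an assumption for the sketch: the feature values, the weights and biases on the links, the single hidden layer of two nodes, and the choice of a sigmoid activation in the hidden layer with a plain weighted sum at the output. A real network would learn the weights from data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases, activation):
    """Each node takes a weighted sum of all incoming values, adds a bias, applies an activation."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]

# Input layer: one node per feature (hypothetical values).
features = [0.5, -1.0, 2.0]

# Hidden layer: two nodes, each linked to all three input nodes (hypothetical weights).
hidden_weights = [[0.1, 0.2, -0.1], [0.4, -0.3, 0.2]]
hidden_biases = [0.0, 0.1]

# Output layer: one node linked to both hidden nodes (hypothetical weights).
output_weights = [[0.7, -0.5]]
output_biases = [0.05]

hidden = layer(features, hidden_weights, hidden_biases, sigmoid)
output = layer(hidden, output_weights, output_biases, lambda z: z)  # linear output node

print("hidden layer values:", hidden)
print("output (prediction):", output)
```

Adding more hidden layers would just mean calling `layer` again, feeding each layer's values into the next.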