Week 11 Flashcards
What was the goal of supervised learning? What are its advantages?
The goal was to build a model that approximates the conditional mean of Y well out-of-sample while avoiding overfitting.
It is relatively easy to tell whether the model is doing a good or bad job: just look at the out-of-sample forecast risk estimates or some other performance measure.
What does data in unsupervised learning look like? What is the goal of unsupervised learning?
In the unsupervised learning scenario, we only have data on a P-dimensional set of features X, with no response Y.
The goal is then to extract interesting and hopefully low-dimensional information from these data.
What is the main challenge of unsupervised learning?
It is very subjective and hard to assess in many situations.
There is no clear goal of the analysis, because it is not a prediction problem.
What are some methods for unsupervised learning?
- Density estimation
- Principal component analysis
- Clustering
- Topic model
- Graphical model
What is clustering? Which are the clustering algorithms?
A collection of methods for finding subgroups, or clusters, in the data at hand.
Informally, it tries to break the density function into more bite-sized pieces.
Algorithms: K-means and hierarchical clustering.
How do we define similarity in clustering?
By proximity (closeness) of the observations.
What does K-means clustering do? What kind of data is it designed for?
It seeks to partition a data set into K non-overlapping clusters, using the Euclidean distance as a measure of closeness.
Continuous data (the considerations are similar to those for KNN).
What are the steps of the K-means clustering algorithm?
1) Randomly assign a number, from 1 to K, to each of the observations. These serve as initial cluster assignments for the observations.
2) Iterate until the cluster assignments stop changing:
- For each of the K clusters, compute the cluster centroid. The k-th cluster centroid is the vector of the p feature means for the observations in the k-th cluster.
- Assign each observation to the cluster whose centroid is closest (in Euclidean distance).
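The two steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the `seed` and `max_iter` parameters, and the rule for handling an empty cluster are assumptions added here, and labels run from 0 to K-1 rather than 1 to K.

```python
import random

def kmeans(points, k, seed=0, max_iter=100):
    """Sketch of K-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    p = len(points[0])
    # Step 1: random initial cluster label (0..k-1) for every observation.
    labels = [rng.randrange(k) for _ in points]
    centroids = []
    for _ in range(max_iter):
        # Step 2a: each centroid is the vector of p feature means of its members.
        centroids = []
        for c in range(k):
            members = [x for x, lab in zip(points, labels) if lab == c]
            if not members:
                # Assumption (not in the notes): an empty cluster borrows
                # one randomly chosen observation as its centroid.
                members = [rng.choice(points)]
            centroids.append(
                [sum(x[j] for x in members) / len(members) for j in range(p)]
            )
        # Step 2b: move each observation to the closest centroid
        # (squared Euclidean distance gives the same ordering).
        new_labels = [
            min(range(k),
                key=lambda c: sum((x[j] - centroids[c][j]) ** 2 for j in range(p)))
            for x in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids

# Hypothetical toy data: two well-separated groups of three points.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(data, k=2)
```

On this toy data the first three points should end up sharing one label and the last three the other; which group gets label 0 depends on the random start.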
The algorithm is guaranteed to _____________________________ at each step, since the ______ minimize the _____________, and reallocating observations to their closest centroid can only improve the objective.
decrease the value of the objective function
means
sum of squared errors
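For reference, the objective in this card can be written out as follows. The notation is an assumption made for illustration: C_k is the set of observations in cluster k, x_i is the feature vector of observation i, and mu_k is the centroid of cluster k.

```latex
W(C_1, \dots, C_K) = \sum_{k=1}^{K} \sum_{i \in C_k} \lVert x_i - \mu_k \rVert^2,
\qquad
\mu_k = \frac{1}{|C_k|} \sum_{i \in C_k} x_i
```

With the assignments held fixed, the mean is the point that minimizes the sum of squared deviations within each cluster; with the centroids held fixed, moving an observation to its closest centroid cannot increase its term in the sum.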
What type of minimum (local or global) does the algorithm find, and why? How can we look for a better minimum?
Local, not global.
The result depends on the initial assignment.
Rerun the algorithm multiple times with different initial randomizations and select the solution with the smallest objective.
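A rough sketch of the restart strategy, kept to one-dimensional data for brevity. The helper names (`centroids_of`, `kmeans_once`, `best_of_restarts`), the parameters `n_starts` and `max_iter`, and the empty-cluster rule are all assumptions for illustration.

```python
import random

def centroids_of(points, labels, k, rng):
    """Cluster means; an empty cluster borrows a random observation (assumption)."""
    cents = []
    for c in range(k):
        members = [x for x, lab in zip(points, labels) if lab == c]
        cents.append(sum(members) / len(members) if members else rng.choice(points))
    return cents

def kmeans_once(points, k, rng, max_iter=100):
    """One K-means run from a random initial assignment: (objective, labels)."""
    labels = [rng.randrange(k) for _ in points]
    for _ in range(max_iter):
        cents = centroids_of(points, labels, k, rng)
        new_labels = [min(range(k), key=lambda c: (x - cents[c]) ** 2) for x in points]
        if new_labels == labels:
            break
        labels = new_labels
    # Recompute centroids for the final labels, then the sum of squared errors.
    cents = centroids_of(points, labels, k, rng)
    objective = sum((x - cents[lab]) ** 2 for x, lab in zip(points, labels))
    return objective, labels

def best_of_restarts(points, k, n_starts=20, seed=0):
    """Run K-means n_starts times and keep the run with the smallest objective."""
    rng = random.Random(seed)
    runs = [kmeans_once(points, k, rng) for _ in range(n_starts)]
    return min(runs, key=lambda run: run[0]), runs

# Hypothetical toy data: three tight pairs on the real line.
points = [0.0, 1.0, 10.0, 11.0, 20.0, 21.0]
best, runs = best_of_restarts(points, k=3)
```

Comparing the objectives stored in `runs` shows whether different starting assignments ended in different local minima; `best` is the run that would be kept.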
Will we always benefit from standardising the data for K-means?
It may or may not help.
We need to try both and see which makes more sense.
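One reason the choice matters: K-means relies on Euclidean distance, so a feature measured on a large scale can dominate the distances. The sketch below uses made-up data (income in dollars and age in years) and hypothetical helper names `standardise` and `nearest` to show that the closest neighbour of an observation can change once the features are standardised.

```python
import math
import statistics

def standardise(rows):
    """Rescale each column to mean 0 and (population) standard deviation 1."""
    cols = list(zip(*rows))
    means = [statistics.fmean(col) for col in cols]
    sds = [statistics.pstdev(col) for col in cols]
    return [tuple((v - m) / s for v, m, s in zip(row, means, sds)) for row in rows]

def nearest(rows, i):
    """Index of the row closest to rows[i] in Euclidean distance."""
    others = [j for j in range(len(rows)) if j != i]
    return min(others, key=lambda j: math.dist(rows[i], rows[j]))

# Hypothetical data: (income in dollars, age in years).
people = [(30000, 25), (30500, 60), (32000, 26), (60000, 40)]

raw_nn = nearest(people, 0)               # income differences dominate
std_nn = nearest(standardise(people), 0)  # both features count equally
```

On the raw scale the closest person to the first one is index 1 (similar income, very different age); after standardising it is index 2 (similar age, slightly different income). Neither answer is automatically the right one, which is why both versions are worth trying.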
How is hierarchical clustering better than K-means?
There is no need to commit to a specific choice of K, and it results in a nice-looking tree.
What is the nice-looking tree-based representation called?
Dendrogram
What is the general process behind hierarchical clustering?
Build a series of clusters in which each higher-level cluster is formed by merging lower-level clusters.
Lowest level – single observation; highest level – whole sample.
What are the key approaches to hierarchical clustering? How are clusters formed?
- Agglomerative (bottom-up, fuses observations and clusters, MOST COMMON)
- Divisive (top-down, splits existing clusters)
Clusters are formed based on a dissimilarity measure, with the method being quite sensitive to the choice of this measure.
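The agglomerative (bottom-up) approach can be sketched as follows. The notes do not say how the dissimilarity between two clusters is computed, so single linkage (the smallest Euclidean distance between any two members) is assumed here purely for illustration; the function names are hypothetical as well.

```python
import math
from itertools import combinations

def single_linkage(a, b, points):
    """Dissimilarity between clusters a and b (tuples of row indices):
    the smallest pairwise Euclidean distance. This is an assumed choice."""
    return min(math.dist(points[i], points[j]) for i in a for j in b)

def agglomerate(points, linkage=single_linkage):
    """Start with one cluster per observation and repeatedly fuse the two
    least dissimilar clusters. Returns the merges as (a, b, height)."""
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: linkage(pair[0], pair[1], points))
        height = linkage(a, b, points)
        # Replace the two fused clusters by their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, height))
    return merges

# Hypothetical one-dimensional toy data.
points = [(0,), (1,), (5,), (6,), (20,)]
merges = agglomerate(points)
heights = [h for _, _, h in merges]
```

The list of merges, read from first to last, is the information a dendrogram draws: the lowest level has single observations and the final merge contains the whole sample. Passing a different `linkage` function changes the merge order, which illustrates the sensitivity to the dissimilarity measure.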