K-Means Flashcards
Radius: ____________ from any point of the cluster to its centroid
square root of the average squared distance
Diameter: _______________ between all pairs of points in the cluster
square root of the average squared distance
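Worked example (a minimal sketch, assuming the usual definitions in which both radius and diameter are root-mean-square distances; the helper names and the toy cluster are illustrative, not from the deck):

```python
import math
from itertools import combinations

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of the points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def radius(points):
    """Square root of the average squared distance from each point to the centroid."""
    c = centroid(points)
    return math.sqrt(sum(sq_dist(p, c) for p in points) / len(points))

def diameter(points):
    """Square root of the average squared distance over all pairs of distinct points."""
    pairs = list(combinations(points, 2))
    return math.sqrt(sum(sq_dist(a, b) for a, b in pairs) / len(pairs))

# Toy cluster: the four corners of a 2 x 2 square (made-up data).
cluster = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
print(radius(cluster), diameter(cluster))
```

Some texts define diameter as the maximum pairwise distance instead; the averaged form here matches the wording of the card.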
What is the elbow method?
Plots the value of the cost function produced by different values of k.
The value of k at which improvement in distortion ___________ the most is called the elbow
declines
Cost Function: For each k, calculate the ______________
total within-cluster sum of squares (WSS).
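Worked example (a rough sketch, not a reference implementation: `kmeans_wss` is a bare-bones single-run Lloyd's k-means, `elbow` is one possible reading of "improvement declines the most", and the points are made-up):

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_wss(points, k, iters=50, seed=0):
    """Run a basic k-means and return the total within-cluster sum of squares."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def elbow(costs):
    """costs[i] is the WSS for k = i + 1 (at least three entries).

    Returns the k after which the improvement in WSS drops the most.
    """
    gains = [costs[i] - costs[i + 1] for i in range(len(costs) - 1)]
    declines = [gains[j] - gains[j + 1] for j in range(len(gains) - 1)]
    best = max(range(len(declines)), key=declines.__getitem__)
    return best + 2

# Made-up data: three loose groups of three points each.
points = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.6),
          (5.0, 5.0), (5.4, 5.1), (5.2, 5.6),
          (9.0, 0.0), (9.3, 0.4), (9.1, 0.7)]
costs = [kmeans_wss(points, k) for k in range(1, 6)]
print(costs, elbow(costs))
```

A single run with random starting centroids can settle in a local optimum, so in practice the WSS for each k is usually taken as the best of several restarts.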
Support:
Freq(X, Y) / N, where N is the total number of transactions
Confidence:
Freq (X,Y) / Freq (X)
Lift:
Support(X, Y) / (Support(X) * Support(Y))
Conviction:
(1 - supp(Y)) / (1 - conf(X -> Y))
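Worked example (a minimal sketch of the four measures on a made-up basket list; function names are illustrative, and conviction is returned as infinity when confidence is 1, which is an assumption about how to handle the division by zero):

```python
import math

# Toy transactions (made-up data).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(x, y, transactions):
    """Freq(X, Y) / Freq(X), computed from supports."""
    return support(set(x) | set(y), transactions) / support(x, transactions)

def lift(x, y, transactions):
    """Support(X, Y) / (Support(X) * Support(Y))."""
    joint = support(set(x) | set(y), transactions)
    return joint / (support(x, transactions) * support(y, transactions))

def conviction(x, y, transactions):
    """(1 - supp(Y)) / (1 - conf(X -> Y)); infinite when the rule never fails."""
    conf = confidence(x, y, transactions)
    if conf == 1:
        return math.inf
    return (1 - support(y, transactions)) / (1 - conf)

x, y = {"diapers"}, {"beer"}
print(support(x | y, transactions))   # 3 of 5 baskets
print(confidence(x, y, transactions))
print(lift(x, y, transactions))
print(conviction(x, y, transactions))
```

For the rule {diapers} -> {beer} on this data, support is 3/5, confidence is 3/4, lift is 1.25, and conviction is 1.6.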
If an itemset is frequent, then all of its ______ must also be frequent
subsets
If an itemset is not frequent, then all of its _______ cannot be frequent
supersets
The ______ of an itemset never exceeds the _________ of its subsets
support
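Worked example (a sketch of how the three cards above are typically used: a level-wise, Apriori-style search that discards any candidate with an infrequent subset; the function names, the threshold, and the data are illustrative assumptions):

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_sup):
    """Return {itemset: support} for every itemset with support >= min_sup."""
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items
             if support(frozenset([i]), transactions) >= min_sup]
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s, transactions)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: a candidate with any infrequent (k-1)-subset cannot be frequent.
        prev = set(level)
        candidates = {c for c in candidates
                      if all(frozenset(sub) in prev
                             for sub in combinations(c, k - 1))}
        level = [c for c in candidates if support(c, transactions) >= min_sup]
    return frequent

# Toy transactions (made-up data).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
result = frequent_itemsets(transactions, min_sup=0.5)
for itemset, sup in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

On this data {bread, beer} is infrequent, so {bread, diapers, beer} is pruned without ever counting its support.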
Mining Association Rules
- Generate all itemsets whose support ≥ min_sup
- Generate high confidence rules from each frequent itemset
An association rule r is strong if
Support(r) ≥ min_sup
Confidence(r) ≥ min_conf
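Worked example (a brute-force sketch of the two mining steps, fine for a handful of items but not how a real miner would enumerate itemsets; `strong_rules`, the thresholds, and the data are illustrative assumptions):

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def strong_rules(transactions, min_sup, min_conf):
    """Return (X, Y, support, confidence) for every strong rule X -> Y."""
    items = sorted(set().union(*transactions))
    rules = []
    # Step 1: find itemsets (of size 2 or more) with support >= min_sup.
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            sup = support(itemset, transactions)
            if sup < min_sup:
                continue
            # Step 2: split each frequent itemset into X -> Y and keep
            # the splits whose confidence reaches min_conf.
            for r in range(1, size):
                for lhs in combinations(sorted(itemset), r):
                    x = frozenset(lhs)
                    conf = sup / support(x, transactions)
                    if conf >= min_conf:
                        rules.append((x, itemset - x, sup, conf))
    return rules

# Toy transactions (made-up data).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
rules = strong_rules(transactions, min_sup=0.5, min_conf=0.8)
for x, y, sup, conf in rules:
    print(sorted(x), "->", sorted(y), sup, conf)
```

With these thresholds the only strong rule on this data is {beer} -> {diapers}: every basket containing beer also contains diapers.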
Classification Accuracy
The number of correct predictions divided by the total number of predictions made.
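Worked example (a minimal sketch; the function name and the labels are made-up for illustration):

```python
def accuracy(y_true, y_pred):
    """Correct predictions divided by all predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one prediction is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Three of the four predictions match the true labels.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```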