Week 12 Flashcards
Clustering method: partitioning
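Arbitrarily choose k objects as initial cluster centres, assign each object to the nearest centre, recompute each centre as the mean of its cluster, and repeat until the assignments no longer change.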
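A minimal sketch of this procedure (commonly called k-means), assuming objects are numeric tuples and squared Euclidean distance; the function name `kmeans`, its parameters, and the toy data are illustrative, not from the course material.

```python
import random

def kmeans(points, k, seed=0, max_iter=100):
    """Cluster a list of equal-length numeric tuples into k groups."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)          # arbitrarily choose k objects
    labels = None
    for _ in range(max_iter):
        # Assign every object to its nearest centre (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])))
            for p in points
        ]
        if new_labels == labels:             # no change: converged
            break
        labels = new_labels
        # Update each centre to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                      # an empty cluster keeps its old centre
                centres[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centres

# Made-up toy data: two well-separated groups on a line.
points = [(1.0,), (2.0,), (3.0,), (101.0,), (102.0,), (103.0,)]
labels, centres = kmeans(points, k=2)
```

Because the starting centres are arbitrary, different seeds can give different results on harder data; running several seeds and comparing is a common precaution.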
Problems of partitioning clustering
Sensitive to outliers
Slow on large datasets (repeats until convergence)
Problems of hierarchical clustering
Can join unrelated objects
Rigid: a merge or split cannot be undone once made
Hard to decide where to cut the tree into clusters
Data structure based internal measures for number of clusters
Maximise inter-cluster distance (distance between clusters)
Minimise intra-cluster distance (distance within a cluster)
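A rough sketch of how these two quantities might be computed for a given labelling, assuming Euclidean distance and averaging over all pairs of objects; the helper name `intra_inter` and the toy data are hypothetical. A good number of clusters should give a small first value and a large second value.

```python
from itertools import combinations
from math import dist

def intra_inter(points, labels):
    """Return (mean within-cluster distance, mean between-cluster distance).

    Assumes at least one same-cluster pair and one different-cluster pair.
    """
    intra, inter = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        # Pairs sharing a label count as intra, all other pairs as inter.
        (intra if lp == lq else inter).append(dist(p, q))
    return sum(intra) / len(intra), sum(inter) / len(inter)

# Made-up toy data with two obvious groups.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = intra_inter(points, [0, 0, 1, 1])   # tight clusters, far apart
bad = intra_inter(points, [0, 1, 0, 1])    # clusters mixed together
```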
Stability-based metrics for number of clusters
Remove part of the data and re-cluster, repeating this; prefer the number of clusters for which the clustering results do not change.
Integrating known biological information for number of clusters
Use a knowledge base
Meta-analysis
Combine the results of published studies and check for an overall effect.
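One common way to combine study results is fixed-effect inverse-variance weighting; this is an assumption here, since the card does not name a method. In the sketch below, `pooled_effect` and the example numbers are made up for illustration.

```python
from math import sqrt

def pooled_effect(effects, std_errors):
    """Fixed-effect inverse-variance pooling of per-study effect estimates.

    Returns (pooled estimate, pooled standard error, z statistic).
    """
    # More precise studies (smaller standard error) get larger weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    estimate = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = sqrt(1.0 / sum(weights))
    return estimate, se, estimate / se

# Two hypothetical studies: effect sizes and their standard errors.
estimate, se, z = pooled_effect([0.0, 1.0], [1.0, 0.5])
```

The pooled standard error is smaller than that of either study alone, which is the sense in which combining studies improves precision.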
Problems with meta-analysis
Publication bias
“Comparing apples to oranges”
Benefits of meta-analysis
Results can be generalised to a broader population
Precision and accuracy improve
Differentiate between real and sampling variation