Data Science - MODULE 6 CODE Flashcards
There are quite a few new imports when you use k-means clustering
import math
import seaborn as sns
from sklearn.metrics import silhouette_score
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans
How do they generate an artificial scatter of data points?
from sklearn.datasets import make_blobs
X, y = make_blobs(n_samples=200, n_features=2, centers=4, cluster_std=1.7, random_state=10, shuffle=True)
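A quick check of what make_blobs returns can help when memorising this card. The sketch below reuses the parameters from the card; the shape and label checks are only illustrative additions.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Same call as on the card: 200 points, 2 features, 4 blob centres.
X, y = make_blobs(n_samples=200, n_features=2, centers=4,
                  cluster_std=1.7, random_state=10, shuffle=True)

print(X.shape)       # (200, 2): one row per sample, one column per feature
print(y.shape)       # (200,): the blob each sample was drawn from
print(np.unique(y))  # [0 1 2 3]: one label per centre
```

X holds the coordinates you cluster on; y is the true blob membership, which k-means never sees.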
In basic terms, how do you generate the cluster fits?
Iterate over a chosen range of cluster counts i:
km = KMeans(n_clusters=i, init='random', n_init=10, max_iter=200, tol=1e-4, random_state=1)
y_km = km.fit_predict(X)
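The iteration above could be sketched as a full loop. The range of cluster counts (2 to 6) and the dictionaries labels_by_k and silhouette_by_k are assumptions made for illustration and are not part of the card. The silhouette score is included only because silhouette_score appears among the imports, and it needs at least two clusters.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Same synthetic data as on the make_blobs card.
X, y = make_blobs(n_samples=200, n_features=2, centers=4,
                  cluster_std=1.7, random_state=10, shuffle=True)

cluster_counts = range(2, 7)  # assumed range; silhouette_score needs >= 2 clusters
labels_by_k = {}              # hypothetical helper: cluster labels per i
silhouette_by_k = {}          # hypothetical helper: silhouette score per i

for i in cluster_counts:
    km = KMeans(n_clusters=i, init='random', n_init=10,
                max_iter=200, tol=1e-4, random_state=1)
    y_km = km.fit_predict(X)  # fits the model and returns a cluster index per sample
    labels_by_k[i] = y_km
    silhouette_by_k[i] = silhouette_score(X, y_km)

# The cluster count with the highest silhouette score is one candidate choice.
best_k = max(silhouette_by_k, key=silhouette_by_k.get)
print(best_k, silhouette_by_k[best_k])
```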
How do you calculate the distortion?
Iterate over the cluster counts i (this also needs import numpy as np and an empty list distortions = []):
km = KMeans(n_clusters=i, init='random', n_init=10, max_iter=300, tol=1e-04, random_state=0)
km.fit(X)
distortions.append(sum(np.min(cdist(X, km.cluster_centers_, 'euclidean'), axis=1)) / X.shape[0])
And then you plot the distortions against the number of clusters.
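Putting the distortion card together end to end might look like the sketch below. The range of 1 to 10 clusters, the helper names mean_distortion and plot_elbow, and the use of matplotlib for the plot are assumptions; the card only says to iterate and then plot.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Same synthetic data as on the make_blobs card.
X, y = make_blobs(n_samples=200, n_features=2, centers=4,
                  cluster_std=1.7, random_state=10, shuffle=True)


def mean_distortion(X, centers):
    """Average Euclidean distance from each sample to its nearest centroid."""
    nearest = np.min(cdist(X, centers, 'euclidean'), axis=1)
    return nearest.sum() / X.shape[0]


ks = list(range(1, 11))  # assumed range of cluster counts
distortions = []
for i in ks:
    km = KMeans(n_clusters=i, init='random', n_init=10,
                max_iter=300, tol=1e-04, random_state=0)
    km.fit(X)
    distortions.append(mean_distortion(X, km.cluster_centers_))


def plot_elbow(ks, distortions):
    """Draw distortion against number of clusters (assumes matplotlib is installed)."""
    import matplotlib.pyplot as plt  # imported here so the loop above runs without it
    plt.plot(ks, distortions, marker='o')
    plt.xlabel('Number of clusters')
    plt.ylabel('Distortion')
    plt.title('Elbow plot')
    plt.show()
```

Calling plot_elbow(ks, distortions) draws the curve; the "elbow" where the drop flattens out suggests a reasonable cluster count. KMeans also exposes inertia_, the sum of squared distances to the nearest centroid, which is a different quantity from the mean distance used on the card.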