Data Preparation Flashcards
Normalization formula (scaling)
Xnorm = (X - Xmin) / (Xmax - Xmin)
Normalization code (scaling)
# Import MinMaxScaler from sklearn (Normalizer rescales each row to unit norm, which is not the min-max formula)
from sklearn.preprocessing import MinMaxScaler
# Create a sample data array
X = [[4, 1, 2, 2], [1, 3, 9, 3], [5, 7, 5, 1]]
scaler = MinMaxScaler().fit(X)  # fit learns each column's min and max
print(scaler.transform(X))  # each column is rescaled to the range [0, 1]
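Normalization by hand (sketch)
A minimal pure-Python sketch of the formula Xnorm = (X - Xmin) / (Xmax - Xmin), applied column by column to the same sample array. The helper name min_max_scale is hypothetical, and the sketch assumes no column is constant (otherwise Xmax - Xmin would be zero).

```python
def min_max_scale(rows):
    """Rescale each column of a list-of-rows table to [0, 1] (hypothetical helper)."""
    cols = list(zip(*rows))            # transpose rows into columns
    mins = [min(c) for c in cols]      # Xmin for each column
    maxs = [max(c) for c in cols]      # Xmax for each column
    # Assumes hi != lo for every column; a constant column would divide by zero.
    return [
        [(v - lo) / (hi - lo) for v, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]


X = [[4, 1, 2, 2], [1, 3, 9, 3], [5, 7, 5, 1]]
X_norm = min_max_scale(X)
for row in X_norm:
    print(row)
```

In each column the smallest value maps to 0 and the largest to 1, so the first column (4, 1, 5) becomes 0.75, 0, 1.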
Standardization formula (Z-score normalizing)
Xstand = (X - mean(X)) / std(X)
Standardization code (Z-score normalizing)
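# Import StandardScaler from sklearn
from sklearn.preprocessing import StandardScaler
# Create a sample data array
data = [[0, 0], [0, 0], [1, 1], [1, 1]]
scaler = StandardScaler()
print(scaler.fit(data))  # fit learns each column's mean and standard deviation
print(scaler.mean_)  # [0.5 0.5]
print(scaler.transform(data))  # rows become [-1, -1], [-1, -1], [1, 1], [1, 1]
print(scaler.transform([[2, 2]]))  # [[3. 3.]], scaled with the mean and std learned from data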
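Standardization by hand (sketch)
A minimal pure-Python sketch of the formula Xstand = (X - mean(X)) / std(X), applied column by column to the same sample array. The helper names fit_columns and standardize are hypothetical. The sketch uses the population standard deviation (statistics.pstdev) and assumes no column is constant.

```python
from statistics import mean, pstdev


def fit_columns(rows):
    """Return per-column means and population standard deviations (hypothetical helper)."""
    cols = list(zip(*rows))                  # transpose rows into columns
    return [mean(c) for c in cols], [pstdev(c) for c in cols]


def standardize(rows, means, stds):
    """Apply (x - mean) / std per column; assumes every std is non-zero."""
    return [
        [(v - m) / s for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


data = [[0, 0], [0, 0], [1, 1], [1, 1]]
means, stds = fit_columns(data)
scaled = standardize(data, means, stds)
new_point = standardize([[2, 2]], means, stds)
print(means, stds)
print(scaled)
print(new_point)
```

The new point [2, 2] is scaled with the mean and standard deviation learned from data, not with statistics of its own.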