Supervised Learning Flashcards
What is supervised learning
Predicts a target variable from predictor variables (features) using a model trained on labeled data
What is unsupervised learning
Uncovers hidden patterns using unlabeled data
What is reinforcement learning
Software interacts with an environment and uses a system of rewards and punishments to optimize its behavior
type()
Tells the type of data
Ex: (numpy.ndarray)
.shape
Tells the shape of the array or dataset
Ex: (150, 4)
.target_names
Shows the names of the target classes (the possible values of the dependent variable)
Ex: array(['setosa', 'versicolor', 'virginica'], …)
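The three inspection cards above can be tried together in a short sketch. It assumes the examples refer to scikit-learn's built-in iris dataset, which the (150, 4) shape and the species names suggest but the cards do not state; the variable name iris is illustrative.

```python
from sklearn.datasets import load_iris

# Assumption: the cards describe the iris dataset bundled with scikit-learn.
iris = load_iris()

data_type = type(iris.data)        # the feature matrix is a NumPy array
data_shape = iris.data.shape       # (rows, columns) = (samples, features)
class_names = iris.target_names    # names of the target classes

print(data_type)
print(data_shape)
print(class_names)
```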
What is k-Nearest Neighbors and how does it work
It predicts the label of a data point by looking at the ‘k’ closest labeled data points
It takes a majority vote among those neighbors
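The majority-vote idea can be sketched from scratch in a few lines. This is an illustration only, not how any particular library implements it: the function name knn_predict, the toy data, and the use of Euclidean distance are all assumptions.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_new, k=3):
    """Predict the label of x_new by majority vote among its k nearest points."""
    # Euclidean distance from x_new to every training point (assumed metric).
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k smallest distances.
    nearest = np.argsort(distances)[:k]
    # Count the labels of those neighbors and return the most common one.
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data: three points labeled "a" near (1, 1), two labeled "b" near (5, 5).
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [5.0, 5.0], [5.2, 4.9]])
y_train = np.array(["a", "a", "a", "b", "b"])

near_a = knn_predict(X_train, y_train, np.array([1.1, 1.0]), k=3)
near_b = knn_predict(X_train, y_train, np.array([5.1, 5.0]), k=3)
print(near_a, near_b)
```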
.fit()
Trains (“fits”) a model on the training data
.predict()
Predicts the labels of new data based on what it learned from the .fit() method
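A minimal sketch of .fit() followed by .predict(), assuming scikit-learn's KNeighborsClassifier and the iris dataset. The choice of n_neighbors=6 and the two new measurements in X_new are illustrative values, not taken from the cards.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Assumption: k = 6 is an arbitrary choice for illustration.
knn = KNeighborsClassifier(n_neighbors=6)
knn.fit(X, y)  # learn from the labeled data

# Two hypothetical new flowers: sepal length, sepal width, petal length, petal width.
X_new = [[5.1, 3.5, 1.4, 0.2], [6.7, 3.0, 5.2, 2.3]]
predictions = knn.predict(X_new)  # one integer class label per row

print(predictions)
print(iris.target_names[predictions])
```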
What is train_test_split() and what is the function structure
Divides the data into training and test sets so the model can be evaluated on data it has not seen during training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3 …)
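A runnable version of the call, as a sketch on the iris data. The random_state and stratify arguments are assumptions added here for reproducibility and balanced classes; the card elides whatever arguments followed test_size.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# test_size=0.3 holds out 30% of the rows for testing.
# Assumptions: random_state fixes the shuffle; stratify=y keeps class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```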
How does larger ‘k’ affect model complexity and smoothness
Less complex model with a smoother decision boundary; a ‘k’ that is too large can lead to underfitting
How does smaller ‘k’ affect model complexity and smoothness
More complex model with a less smooth decision boundary; can lead to overfitting
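One way to see the effect of ‘k’ is to compare training and test accuracy for several values. The sketch below assumes the iris data and an arbitrary set of k values; exact accuracies depend on the split, so none are quoted here.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

train_accuracy = {}
test_accuracy = {}
for k in (1, 5, 15, 50):  # assumed values, small to large
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    # .score() returns the fraction of correctly classified samples.
    train_accuracy[k] = knn.score(X_train, y_train)
    test_accuracy[k] = knn.score(X_test, y_test)
    print(k, train_accuracy[k], test_accuracy[k])
```

A large gap between training and test accuracy at small ‘k’ is a sign of overfitting; both scores dropping at large ‘k’ is a sign of underfitting.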
How to read .csv files
pd.read_csv()
How to drop an entire column in a dataframe named df
df.drop('column_name', axis=1) (returns a new dataframe; df itself is unchanged unless reassigned)
How to get values of a column ‘col’ of a dataframe called df
df['col'].values
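The three pandas cards above can be exercised together. To keep the sketch self-contained it reads CSV text from an in-memory buffer rather than a file; the column names and values are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content; pd.read_csv() accepts a file path or a file-like object.
csv_text = "col,other,label\n1,10,a\n2,20,b\n3,30,a\n"
df = pd.read_csv(io.StringIO(csv_text))

# drop() returns a new dataframe without the column; df keeps all three columns.
df_dropped = df.drop('other', axis=1)

# .values gives the column's contents as a NumPy array.
values = df['col'].values

print(df_dropped.columns.tolist())
print(values)
```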