Machine Learning Flashcards

Question 1

Q

Machine learning

Answer

A

Algorithms that can learn from observational data and make predictions from it

Question 2

Q

Unsupervised learning

Answer

A

An algorithm finds structure in a data set without labeled examples or correct answers to learn from

Question 3

Q

Latent variable

Answer

A

A previously unknown property of the data that is not directly observed; uncovering such variables is something unsupervised learning can do

Question 4

Q

Supervised learning

Answer

A

An algorithm learns from a data set plus the correct “answers”

Question 5

Q

Training/testing sets

Answer

A

A model is trained on a training set of data, then evaluated on a similar but disjoint test set to measure its accuracy.
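
A minimal sketch of such a split, using only the Python standard library. The function name `train_test_split`, the 20% test fraction, and the seed are assumptions chosen for illustration, not part of the card:

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into disjoint train and test lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                      # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)      # seeded shuffle for reproducibility
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    # The first n_test shuffled rows form the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(10))                         # stand-in for real observations
train, test = train_test_split(data, test_fraction=0.2, seed=42)
```

Shuffling before splitting matters when the data is ordered (for example by date or by class), since otherwise the test set may not resemble the training set.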

Question 6

Q

What are practical considerations for training/testing sets?

Answer

A

Question 7

Q

Why is train/test useful?

Answer

A

It can guard against overfitting: a model that has merely memorized the training set will score poorly on the held-out test set.

Question 8

Q

K-fold cross-validation

Answer

A

Question 9

Q

K-means clustering

Answer

A

Randomly pick K centroids.
Assign each data point to the closest centroid.
Recompute the centroids based on the average position of each centroid’s data points.
Repeat the assignment and recompute steps until no point changes cluster (the centroids stop moving).
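
The steps above can be sketched in plain Python as follows. This is an illustrative sketch, not a production implementation: the names `kmeans` and `squared_distance`, the iteration cap, the seed, and the choice to pick the initial centroids from among the data points are all assumptions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iters=100, seed=0):
    rng = random.Random(seed)
    # Step 1: randomly pick K centroids (here: K distinct data points).
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iters):
        # Step 2: assign each data point to the closest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        # Step 4: stop once no point changes cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Step 3: move each centroid to the average position of its points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignments = kmeans(points, k=2)
```

Because the starting centroids are random, different seeds can converge to different clusterings, so it is common to run the algorithm several times and keep the best result.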

Question 10

Q

What is a large caveat with K-means clustering?

Answer

A

The algorithm does not label the clusters; working out what each cluster represents is left to you.

Question 11

Q

Entropy (data science)

Answer

A

A measure of the disorder in a data set: how mixed or varied the data points are.

Zero if all data points are the same.
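
A short sketch of how this might be computed, assuming the card refers to Shannon entropy over a list of class labels, measured in bits. The function name `entropy` and the sample labels are hypothetical:

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of labels."""
    total = len(labels)
    if total == 0:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    # Sum p * log2(1/p) over each distinct label's observed frequency p.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


same = entropy(["cat", "cat", "cat", "cat"])    # every data point identical
mixed = entropy(["cat", "cat", "dog", "dog"])   # an even two-way mix
```

Here `same` is zero because there is only one label, while `mixed` is one bit, the maximum for two equally likely labels.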

(11 cards)