Introduction to machine learning Flashcards
What is an AI?
Algorithm capable of learning and making decisions
What is the machine learning pipeline?
- Training: Load data, extract features, train model, evaluate
- Testing: Load data, extract features, predict using model, evaluate
(The only difference is that you initially train a model, and then use that trained model to predict when testing.)
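A minimal sketch of the two phases, assuming a toy nearest-centroid model. The data, the feature choice, and all function names (load_data, extract_features, train_model, predict, evaluate) are made up for illustration and are not from any particular library.

```python
# Toy pipeline: load -> extract features -> (train | predict) -> evaluate.
# All data and helper names are hypothetical, chosen only for illustration.

def load_data(split):
    # Raw samples: (height_cm, weight_kg, label)
    data = {
        "train": [(150, 50, "small"), (155, 55, "small"),
                  (180, 85, "large"), (185, 90, "large")],
        "test": [(152, 52, "small"), (182, 88, "large")],
    }
    return data[split]

def extract_features(samples):
    # Keep the two measurements as features and split off the label.
    X = [(h, w) for h, w, _ in samples]
    y = [label for _, _, label in samples]
    return X, y

def train_model(X, y):
    # The "model" is one centroid (mean feature vector) per class.
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = tuple(sum(col) / len(rows) for col in zip(*rows))
    return centroids

def predict(model, X):
    # Assign each sample to the class with the closest centroid.
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(model, key=lambda label: dist2(model[label], x)) for x in X]

def evaluate(y_true, y_pred):
    # Accuracy: fraction of correctly predicted labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Training: load data, extract features, train model, evaluate.
X_train, y_train = extract_features(load_data("train"))
model = train_model(X_train, y_train)
train_accuracy = evaluate(y_train, predict(model, X_train))

# Testing: load data, extract features, predict using the trained model, evaluate.
X_test, y_test = extract_features(load_data("test"))
test_accuracy = evaluate(y_test, predict(model, X_test))

print(train_accuracy, test_accuracy)
```

Note that train_model is only called in the training phase; the testing phase reuses the same model object.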
Briefly describe the 4 types of machine learning
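- Supervised: Humans “teach” the algorithm by giving it labeled examples
- Unsupervised: Algorithm does everything by itself, finding structure in unlabeled data
- Semi-supervised: In between the two previous (a mix of labeled and unlabeled data)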
- Reinforcement: Algorithm is left to itself, but gets rewarded if it does something correctly
What is classification? How do you calculate loss in classification?
Classification is a supervised learning task in which the output is discrete (a class label). A simple loss is calculated by counting the number of misclassified samples.
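Counting misclassified samples could be sketched as below. The labels are made-up example data, and zero_one_loss is just an illustrative helper name.

```python
def zero_one_loss(y_true, y_pred):
    # Count the samples whose predicted label differs from the true label.
    return sum(t != p for t, p in zip(y_true, y_pred))

# Hypothetical true labels and model predictions.
y_true = ["cat", "dog", "dog", "cat", "bird"]
y_pred = ["cat", "dog", "cat", "cat", "dog"]

errors = zero_one_loss(y_true, y_pred)   # samples 3 and 5 are wrong
error_rate = errors / len(y_true)        # fraction of misclassified samples

print(errors, error_rate)
```

Dividing the count by the number of samples gives the error rate, which makes the loss comparable across datasets of different sizes.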
What is regression? How do you calculate loss in regression?
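Regression is another supervised learning task, in which the output is continuous. Loss is usually calculated by squaring each error (so positive and negative errors do not cancel out, and large errors are penalized more), adding the squares up, and taking the average (mean squared error). Note that squaring makes the loss more sensitive to outliers, not less.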
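A small sketch of the mean squared error calculation, using made-up numbers; mean_squared_error is an illustrative helper name here.

```python
def mean_squared_error(y_true, y_pred):
    # Square each error, add the squares up, and take the average.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical true values and model predictions.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]

# Errors are 1, 0, -2, -1, so the squares are 1, 0, 4, 1.
mse = mean_squared_error(y_true, y_pred)
print(mse)
```

The single error of size 2 contributes 4 of the 6 total squared error, which shows how squaring lets large errors dominate the loss.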
What are some other models (besides basic classification and regression models)?
Mainly tree-based models, like Random Forest, which works by combining the predictions of N decision trees.
What is bagging?
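Bagging (bootstrap aggregating) means combining several classifiers, each trained on a random sample of the training data drawn with replacement, and aggregating their predictions (for example by majority vote). Can reduce variance.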
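A rough sketch of the idea, assuming a very simple base classifier (a one-feature decision stump) and majority voting. The data, the helper names, and the choice of 25 models are assumptions for illustration only.

```python
import random
from collections import Counter

def fit_stump(X, y):
    # Base classifier: a threshold t plus one label for each side of it.
    best = None
    for t in sorted(set(X)):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != label
                         for x, label in zip(X, y))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    return best[1:]  # (threshold, left_label, right_label)

def stump_predict(stump, x):
    t, left, right = stump
    return left if x <= t else right

def fit_bagging(X, y, n_models=25, seed=0):
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw n indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def bagging_predict(models, x):
    # Aggregate the individual predictions by majority vote.
    votes = Counter(stump_predict(m, x) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical one-feature dataset with two classes.
X = [1, 2, 3, 4, 6, 7, 8, 9]
y = [0, 0, 0, 0, 1, 1, 1, 1]

models = fit_bagging(X, y)
print(bagging_predict(models, 0), bagging_predict(models, 10))
```

Each stump sees a slightly different sample, so individual thresholds vary; voting over many of them smooths out that variation, which is where the variance reduction comes from.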
What is the difference between classification and regression? Explain with an example if possible.
Classification has a discrete output (categorical). If you are trying to predict house prices, it can tell you which houses might sell for below or above a given price (categories).
Regression has a continuous output (numerical). So if predicting housing prices, it actually gives you the predicted price.
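The house price example could be sketched as follows. The sizes, prices, and the 250 threshold are invented numbers, and the two models (a least-squares line and a 1-nearest-neighbour classifier) are just simple stand-ins chosen for illustration.

```python
# Hypothetical training data: house size in square metres, sale price in thousands.
sizes = [50, 80, 100, 120, 150]
prices = [150, 240, 300, 360, 450]

# Regression: fit price = slope * size + intercept by ordinary least squares.
mean_x = sum(sizes) / len(sizes)
mean_y = sum(prices) / len(prices)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices))
         / sum((x - mean_x) ** 2 for x in sizes))
intercept = mean_y - slope * mean_x

def predict_price(size):
    # Continuous output: an actual predicted price.
    return slope * size + intercept

# Classification: label each house by whether it sold above a threshold of 250,
# then predict the label of the nearest training house (1-nearest-neighbour).
labels = ["above" if p > 250 else "below" for p in prices]

def predict_category(size):
    # Discrete output: one of two categories.
    nearest = min(range(len(sizes)), key=lambda i: abs(sizes[i] - size))
    return labels[nearest]

print(predict_price(95))      # a number
print(predict_category(95))   # a category
```

The same input (a house of size 95) gives a number from the regression model and a category from the classifier.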