CH1: The Machine Learning Landscape Flashcards
What is machine learning?
- Machine Learning is the science (and art) of programming computers so they can learn from data
- The field of study that gives computers the ability to learn without being explicitly programmed.
- A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
What is a training set?
The examples that the system uses to learn are called the training set.
What is a training instance (or sample)?
Each training example is called a training instance (or sample).
Why use ML (machine learning)?
- Problems for which existing solutions require a lot of hand-tuning or long lists of rules: one Machine Learning algorithm can often simplify code and perform better.
- Complex problems for which there is no good solution at all using a traditional approach: the best Machine Learning techniques can find a solution.
- Fluctuating environments: a Machine Learning system can adapt to new data.
- Getting insights about complex problems and large amounts of data.
What are the 3 criteria used to classify the different types of ML systems?
- Whether or not they are trained with human supervision (supervised, unsupervised, semisupervised, and Reinforcement Learning)
- Whether or not they can learn incrementally on the fly (online versus batch learning)
- Whether they work by simply comparing new data points to known data points, or instead detect patterns in the training data and build a predictive model, much like scientists do (instance-based versus model-based learning)
What is supervised learning?
In supervised learning, the training data you feed to the algorithm includes the desired solutions, called labels
What is classification?
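Classification is a supervised learning task in which the system learns to assign a class to new instances. A spam filter is a good example: it is trained with many example emails along with their class (spam or ham), and it must learn how to classify new emails.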
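To make the spam/ham idea concrete, here is a minimal sketch of a classifier that labels a new email by copying the class of the most similar training email (a 1-nearest-neighbor rule). The two features (number of links, number of exclamation marks) and all data values are made up for illustration; a real spam filter would use far richer features and a better-tuned algorithm.

```python
# Toy spam classifier using a 1-nearest-neighbor rule.
# Features and data are hypothetical, chosen only for illustration.

def distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(training_set, new_instance):
    """Return the label of the training instance closest to new_instance."""
    _, nearest_label = min(
        training_set, key=lambda pair: distance(pair[0], new_instance)
    )
    return nearest_label

# Each training instance: ((num_links, num_exclamation_marks), label)
training_set = [
    ((8, 5), "spam"),
    ((6, 7), "spam"),
    ((0, 1), "ham"),
    ((1, 0), "ham"),
]

print(classify(training_set, (7, 6)))  # close to the spam examples
print(classify(training_set, (0, 0)))  # close to the ham examples
```

With this toy data the first call prints spam and the second prints ham.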
What are the typical tasks for supervised learning?
- Classification
- Regression
What is regression?
Regression is the task of predicting a target numeric value, such as the price of a car, given a set of features (mileage, age, brand, etc.) called predictors (Figure 1-6). To train the system, you need to give it many examples of cars, including both their predictors and their labels (i.e., their prices).
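As a rough illustration of regression, the sketch below fits a straight line to a handful of (mileage, price) pairs with ordinary least squares and uses it to predict the price of an unseen car. The single-predictor setup, the function names, and the numbers are assumptions made for this example only.

```python
# Simple linear regression with one predictor, fitted by least squares.
# The car data below is invented for illustration.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x):
    """Predict the target value for predictor value x."""
    return slope * x + intercept

# Predictor: mileage in thousands of km. Label: price in thousands.
mileage = [10, 50, 90, 130]
price = [20, 16, 12, 8]

slope, intercept = fit_line(mileage, price)
print(predict(slope, intercept, 70))  # price estimate for a car with 70k km
```

Because the toy data lies exactly on a line, the prediction for 70 is (up to floating-point rounding) 14.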
What is the difference between an attribute and a feature?
In Machine Learning an attribute is a data type (e.g., “Mileage”), while a feature has several meanings depending on the context, but generally means an attribute plus its value (e.g., “Mileage = 15,000”).
Can some regression algorithms be used for classification? Give an example.
Some regression algorithms can be used for classification as well, and vice versa. For example, Logistic Regression is commonly used for classification, as it can output a value that corresponds to the probability of belonging to a given class.
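The sketch below shows how a Logistic Regression model turns a numeric score into a class probability and then into a class. The weights and bias are hypothetical placeholders rather than values learned from data, and the 0.5 threshold is a common default assumed here.

```python
import math

def predict_proba(features, weights, bias):
    """Probability of the positive class: sigmoid of a weighted sum."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))

def predict_class(features, weights, bias, threshold=0.5):
    """Classify as 1 when the estimated probability reaches the threshold."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

# Hypothetical parameters for a model with two features.
weights = [1.5, -0.5]
bias = -1.0

print(predict_proba([2.0, 1.0], weights, bias))   # above 0.5
print(predict_class([2.0, 1.0], weights, bias))   # positive class
print(predict_class([0.0, 2.0], weights, bias))   # negative class
```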
What are the 6 most important supervised learning algorithms?
- k-Nearest Neighbors
- Linear Regression
- Logistic Regression
- Support Vector Machines (SVMs)
- Decision Trees and Random Forests
- Neural networks
What is unsupervised learning?
In unsupervised learning, as you might guess, the training data is unlabeled.
What are the most important unsupervised learning algorithms?
- Clustering
  - K-Means
  - DBSCAN
  - Hierarchical Cluster Analysis (HCA)
- Anomaly detection and novelty detection
  - One-class SVM
  - Isolation Forest
- Visualization and dimensionality reduction
  - Principal Component Analysis (PCA)
  - Kernel PCA
  - Locally-Linear Embedding (LLE)
  - t-distributed Stochastic Neighbor Embedding (t-SNE)
- Association rule learning
  - Apriori
  - Eclat
What are hierarchical clustering algorithms?
Hierarchical clustering algorithms group instances into clusters and may also subdivide each group into smaller groups.
What are the uses of visualization algorithms?
Visualization algorithms are also good examples of unsupervised learning algorithms: you feed them a lot of complex and unlabeled data, and they output a 2D or 3D representation of your data that can easily be plotted.
What are the tasks of unsupervised learning?
- dimensionality reduction
- anomaly detection
- novelty detection
- association rule learning
What is dimensionality reduction?
Dimensionality reduction is a task in which the goal is to simplify the data without losing too much information.
What is feature extraction?
Feature extraction is one way to reduce dimensionality: it merges several correlated features into one.
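As a simplified illustration of merging correlated features, the sketch below standardizes two features and averages them into a single one. This is a stand-in chosen for brevity, not a full dimensionality reduction algorithm such as PCA; the feature names (mileage, age, "wear") and the data are assumptions made for the example.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

def merge_features(a, b):
    """Merge two correlated features into one by averaging their z-scores."""
    return [(x + y) / 2 for x, y in zip(standardize(a), standardize(b))]

# Two strongly correlated features for four hypothetical cars.
mileage = [15000, 60000, 120000, 200000]
age = [1, 4, 8, 13]

wear = merge_features(mileage, age)  # one feature instead of two
print(wear)
```

Each car is now described by one number instead of two, and cars with higher mileage and age get a higher merged value.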
Why should you reduce the dimension of your training data?
It is often a good idea to try to reduce the dimension of your training data using a dimensionality reduction algorithm before you feed it to another Machine Learning algorithm (such as a supervised learning algorithm). It will run much faster, the data will take up less disk and memory space, and in some cases it may also perform better.
What is anomaly detection and an example?
Anomaly detection is another important unsupervised task: for example, detecting unusual credit card transactions to prevent fraud, catching manufacturing defects, or automatically removing outliers from a dataset before feeding it to another learning algorithm. The system is shown mostly normal instances during training, so it learns to recognize them, and when it sees a new instance it can tell whether it looks like a normal one or whether it is likely an anomaly.
What is the difference between novelty detection and anomaly detection?
Novelty detection algorithms expect to see only normal data during training, while anomaly detection algorithms are usually more tolerant: they can often perform well even with a small percentage of outliers in the training set.
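A very simple way to picture this "learn what normal looks like, then flag what deviates" idea is a z-score rule: estimate the mean and spread of the training data, then flag any new instance that lies too many standard deviations away. This is only a sketch; the transaction amounts and the threshold of 3 are assumptions, and it is not how One-class SVM or Isolation Forest work internally.

```python
from statistics import mean, pstdev

def fit(normal_values):
    """Learn what 'normal' looks like: the mean and standard deviation."""
    return mean(normal_values), pstdev(normal_values)

def is_anomaly(value, mu, sigma, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the mean."""
    return abs(value - mu) / sigma > threshold

# Hypothetical transaction amounts seen during training (all normal).
training_amounts = [20, 25, 22, 30, 27, 24, 26, 23]
mu, sigma = fit(training_amounts)

print(is_anomaly(28, mu, sigma))   # within the usual range
print(is_anomaly(500, mu, sigma))  # far outside the usual range
```

With this data the first call prints False and the second prints True.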
What is association rule learning?
Association rule learning is another common unsupervised task, in which the goal is to dig into large amounts of data and discover interesting relations between attributes.
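Relations of this kind are commonly scored with two measures, support and confidence. The sketch below computes both for a single candidate rule over a few shopping baskets; the baskets and item names are invented, and real algorithms such as Apriori or Eclat add an efficient search over many candidate rules, which is not shown here.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Among transactions containing the antecedent, the fraction that
    also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical shopping baskets.
transactions = [
    {"chips", "sauce", "steak"},
    {"chips", "sauce", "steak"},
    {"chips", "sauce"},
    {"bread", "milk"},
]

# Candidate rule: {chips, sauce} -> {steak}
print(support(transactions, {"chips", "sauce"}))
print(confidence(transactions, {"chips", "sauce"}, {"steak"}))
```

Here three of the four baskets contain chips and sauce, and two of those three also contain steak.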
What is semisupervised learning?
Semisupervised learning algorithms can deal with partially labeled training data, usually a lot of unlabeled data and a little bit of labeled data.
What is an example of a semisupervised learning algorithm?
Deep belief networks (DBNs)
What are DBNs based on?
Most semisupervised learning algorithms are combinations of unsupervised and supervised algorithms. For example, deep belief networks (DBNs) are based on unsupervised components called restricted Boltzmann machines (RBMs) stacked on top of one another.
How are the RBMs trained?
RBMs are trained sequentially in an unsupervised manner, and then the whole system is fine-tuned using supervised learning techniques.
What is reinforcement learning?
The learning system, called an agent in this context, can observe the environment, select and perform actions, and get rewards in return (or penalties in the form of negative rewards, as in Figure 1-12). It must then learn by itself what is the best strategy, called a policy, to get the most reward over time.
What does the policy define?
A policy defines what action the agent should choose when it is in a given situation.
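One simple way to picture a policy is as a lookup table from situations (states) to actions. The sketch below derives such a table by picking, for each state, the action with the highest estimated reward. The states, actions, and reward estimates are entirely hypothetical; in actual Reinforcement Learning these estimates would be learned from interaction with the environment rather than written by hand.

```python
# Hypothetical estimates of long-term reward for each (state, action) pair.
action_values = {
    "low_battery": {"recharge": 1.0, "explore": -2.0},
    "full_battery": {"recharge": -0.5, "explore": 3.0},
}

def greedy_policy(action_values):
    """For each state, choose the action with the highest estimated reward."""
    return {
        state: max(actions, key=actions.get)
        for state, actions in action_values.items()
    }

policy = greedy_policy(action_values)

# The policy tells the agent what to do in a given situation.
print(policy["low_battery"])
print(policy["full_battery"])
```

With these made-up estimates, the agent recharges when the battery is low and explores when it is full.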
What are examples of Reinforcement Learning?
For example, many robots implement Reinforcement Learning algorithms to learn how to walk. DeepMind’s AlphaGo program is also a good example of Reinforcement Learning: it learned its winning policy by analyzing millions of games, and then playing many games against itself. Note that learning was turned off during the games against the champion; AlphaGo was just applying the policy it had learned.
What is another criterion to classify the ML systems?
Another criterion used to classify Machine Learning systems is whether or not the system can learn incrementally from a stream of incoming data.
What is batch learning?
In batch learning, the system is incapable of learning incrementally: it must be trained using all the available data.
What is offline learning?
Training on all the available data generally takes a lot of time and computing resources, so it is typically done offline. First the system is trained, and then it is launched into production and runs without learning anymore; it just applies what it has learned. This is called offline learning.
What are the disadvantages of offline learning?
If you want a batch learning system to know about new data, you need to train a new version of the system from scratch on the full dataset, then stop the old system and replace it with the new one.
This solution is simple and often works fine, but training using the full set of data can take many hours.
Also, training on the full set of data requires a lot of computing resources (CPU, memory space, disk space, disk I/O, network I/O, etc.). If you have a lot of data and you automate your system to train from scratch every day, it will end up costing you a lot of money. If the amount of data is huge, it may even be impossible to use a batch learning algorithm.
Finally, if your system needs to be able to learn autonomously and it has limited resources, then carrying around large amounts of training data and taking up a lot of resources to train for hours every day is a showstopper.
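The retrain-and-replace cycle described above can be sketched as follows. The "model" here is deliberately trivial (it just predicts the mean of the labels it was trained on), and the variable names and data are assumptions for illustration; the point is only that every update repeats training over the full dataset, old and new data alike.

```python
def train(dataset):
    """Train a trivial model from scratch: it always predicts the mean label."""
    return sum(dataset) / len(dataset)

# Initial training on all available data, then launch into production.
full_dataset = [10.0, 12.0, 11.0]
model_in_production = train(full_dataset)

# New data arrives. A batch system cannot learn from it incrementally,
# so a new version is trained on the old and new data together.
new_data = [20.0, 22.0]
full_dataset = full_dataset + new_data
new_model = train(full_dataset)

# Stop the old system and replace it with the new one.
model_in_production = new_model
print(model_in_production)
```

The cost of each call to train grows with the size of full_dataset, which is why frequent retraining on large datasets becomes expensive.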