Hands on machine learning(Book) Flashcards

1
Q

machine learning vs artificial intelligence

A

Put in context, artificial intelligence refers to the general ability of computers to emulate human thought and perform tasks in real-world environments, while machine learning refers to the technologies and algorithms that enable systems to identify patterns, make decisions, and improve themselves through experience …

2
Q

Apache

A

As a Web server, Apache is responsible for accepting HTTP requests from Internet users and sending them the desired information in the form of files and Web pages. Much of the Web’s software and code is designed to work along with Apache’s features.

3
Q

Google colab

A

Colaboratory, or “Colab” for short, is a product from Google Research. Colab allows anybody to write and execute arbitrary Python code through the browser, and is especially well suited to machine learning, data analysis, and education.

4
Q

What Is Machine Learning?

A

Machine learning is the science (and art) of programming computers so they can
learn from data

5
Q

Training set, training instance, model?

A

The examples that the system uses to learn are called the training
set. Each training example is called a training instance (or sample). The part of a
machine learning system that learns and makes predictions is called a model. Neural
networks and random forests are examples of models.
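The vocabulary above can be sketched with scikit-learn (the library used throughout the book); the toy data here is invented for illustration:

```python
# A minimal sketch of the terms above: training set, training
# instance, and model (toy data invented for illustration).
from sklearn.linear_model import LinearRegression

# Training set: the examples the system learns from.
X_train = [[1.0], [2.0], [3.0], [4.0]]  # each row is one training instance
y_train = [2.0, 4.0, 6.0, 8.0]          # the label for each instance

# The model is the part that learns and makes predictions.
model = LinearRegression()
model.fit(X_train, y_train)             # learn from the training set

prediction = model.predict([[5.0]])     # predict for a new, unseen instance
```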

6
Q

Data mining?

A

Data mining is the process of finding anomalies, patterns and correlations within large data sets to predict outcomes. Using a broad range of techniques, you can use this information to increase revenues, cut costs, improve customer relationships, reduce risks and more.

7
Q

What is machine learning great for?

A

To summarize, machine learning is great for:
* Problems for which existing solutions require a lot of fine-tuning or long lists of
rules (a machine learning model can often simplify code and perform better than
the traditional approach)
* Complex problems for which using a traditional approach yields no good solution
(the best machine learning techniques can perhaps find a solution)
* Fluctuating environments (a machine learning system can easily be retrained on
new data, always keeping it up to date)
* Getting insights about complex problems and large amounts of data

8
Q

Examples of Machine learning

A

Analyzing images of products on a production line to automatically classify them

Detecting tumors in brain scans

Automatically classifying news articles

Automatically flagging offensive comments on discussion forums

Summarizing long documents automatically

Creating a chatbot or a personal assistant

Forecasting your company’s revenue next year, based on many performance metrics

Making your app react to voice commands

Detecting credit card fraud

Segmenting clients based on their purchases so that you can design a different marketing strategy for each segment

Representing a complex, high-dimensional dataset in a clear and insightful diagram

Recommending a product that a client may be interested in, based on past purchases

Building an intelligent bot for a game

9
Q

Types of Machine Learning Systems

A

There are so many different types of machine learning systems that it is useful to
classify them in broad categories, based on the following criteria:
* How they are supervised during training (supervised, unsupervised, semi-supervised,
self-supervised, and others)
* Whether or not they can learn incrementally on the fly (online versus batch
learning)
* Whether they work by simply comparing new data points to known data points,
or instead by detecting patterns in the training data and building a predictive
model, much like scientists do (instance-based versus model-based learning)

10
Q

There are many categories, but we’ll discuss the main ones:
supervised learning, unsupervised learning, self-supervised learning, semi-supervised
learning, and reinforcement learning.

A

Google it.

11
Q

Unsupervised learning ex.

A

Clustering, dimensionality reduction.
Another important unsupervised task is anomaly detection—for example, detecting
unusual credit card transactions to prevent fraud, catching manufacturing defects,
or automatically removing outliers from a dataset before feeding it to another learning
algorithm. The system is shown mostly normal instances during training, so it
learns to recognize them; then, when it sees a new instance, it can tell whether it looks
like a normal one or whether it is likely an anomaly (see Figure 1-10). A very similar
task is novelty detection: it aims to detect new instances that look different from all
instances in the training set.

association rule learning, in which the
goal is to dig into large amounts of data and discover interesting relations between
attributes. For example, suppose you own a supermarket. Running an association rule
on your sales logs may reveal that people who purchase barbecue sauce and potato
chips also tend to buy steak. Thus, you may want to place these items close to one
another.
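The clustering task mentioned above can be sketched with scikit-learn's KMeans; the 2-D points below are invented for illustration. No labels are given, yet the algorithm groups the points on its own:

```python
# A minimal unsupervised-learning sketch: cluster unlabeled points
# with KMeans (toy data invented for illustration).
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],   # one tight group
              [5.0, 5.1], [5.2, 5.0], [5.1, 5.2]])  # another tight group

kmeans = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X)
labels = kmeans.labels_
# the first three points end up sharing one label, the last three the other
```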

12
Q

Semi-supervised learning

A

Since labeling data is usually time-consuming and costly, you will often have plenty
of unlabeled instances, and few labeled instances.

Google Photos, are good examples of this. Once
you upload all your family photos to the service, it automatically recognizes that the
same person A shows up in photos 1, 5, and 11, while another person B shows up in
photos 2, 5, and 7. This is the unsupervised part of the algorithm (clustering). Now
all the system needs is for you to tell it who these people are. Just add one label per
person and it is able to name everyone in every photo, which is useful for searching
photos.

13
Q

Self-supervised learning

A

Another approach to machine learning involves actually generating a fully labeled
dataset from a fully unlabeled one. Again, once the whole dataset is labeled, any
supervised learning algorithm can be used. This approach is called self-supervised
learning.

14
Q

Transfer learning

A

Transferring knowledge from one task to another is called transfer
learning, and it’s one of the most important techniques in machine
learning today, especially when using deep neural networks (i.e.,
neural networks composed of many layers of neurons). We will
discuss this in detail in Part II.

15
Q

Reinforcement learning + examples

A

Reinforcement learning is a very different beast. The learning system, called an agent
in this context, can observe the environment, select and perform actions, and get
rewards in return (or penalties in the form of negative rewards, as shown in Figure
1-13).
It must then learn by itself what is the best strategy, called a policy, to get
the most reward over time. A policy defines what action the agent should choose
when it is in a given situation.

For example, many robots implement reinforcement learning algorithms to learn
how to walk. DeepMind’s AlphaGo program is also a good example of reinforcement
learning: it made the headlines in May 2017 when it beat Ke Jie, the number one
ranked player in the world at the time, at the game of Go.

16
Q

Batch Versus Online Learning

A

In batch learning, the system is incapable of learning incrementally: it must be trained
using all the available data. This will generally take a lot of time and computing
resources, so it is typically done offline. First the system is trained, and then it is
launched into production and runs without learning anymore; it just applies what it
has learned. This is called offline learning.

17
Q

Model rot/Data drift?

A

Unfortunately, a model’s performance tends to decay slowly over time, simply because
the world continues to evolve while the model remains unchanged.

18
Q

Online learning / incremental learning

A

In online learning, you train the system incrementally by feeding it data instances
sequentially, either individually or in small groups called mini-batches. Each learning
step is fast and cheap, so the system can learn about new data on the fly, as it arrives
(see Figure 1-14).
Online learning is useful for systems that need to adapt to change extremely rapidly
(e.g., to detect new patterns in the stock market).
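This can be sketched with scikit-learn's SGDRegressor, whose partial_fit method performs exactly this kind of incremental step; the mini-batch data below is invented for illustration:

```python
# A sketch of online learning: feed mini-batches one at a time via
# partial_fit (toy data invented for illustration).
import numpy as np
from sklearn.linear_model import SGDRegressor

model = SGDRegressor(learning_rate="constant", eta0=0.1, random_state=42)
rng = np.random.default_rng(42)

for _ in range(100):                        # data arriving over time
    X_batch = rng.uniform(0, 1, size=(10, 1))
    y_batch = 3.0 * X_batch.ravel() + 1.0   # underlying relation: y = 3x + 1
    model.partial_fit(X_batch, y_batch)     # one cheap learning step per mini-batch

# model.coef_ and model.intercept_ gradually approach 3 and 1
```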

19
Q

Instance vs model based learning

A

This is called instance-based learning: the system learns the examples by heart, then
generalizes to new cases by using a similarity measure to compare them to the
learned examples (or a subset of them). For example, in Figure 1-16 the new instance
would be classified as a triangle because the majority of the most similar instances
belong to that class.

Another way to generalize from a set of examples is to build a model of these
examples and then use that model to make predictions. This is called model-based
learning
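The contrast can be sketched with scikit-learn: k-nearest neighbors memorizes the training instances and predicts via a similarity measure, while linear regression builds a model of the examples. The toy data is invented for illustration:

```python
# Instance-based vs model-based learning on the same toy data.
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression

X = [[1.0], [2.0], [3.0], [4.0]]
y = [1.0, 2.0, 3.0, 4.0]  # underlying relation: y = x

instance_based = KNeighborsRegressor(n_neighbors=2).fit(X, y)  # memorizes examples
model_based = LinearRegression().fit(X, y)                     # fits a line

# Far from the training data the two approaches diverge:
print(instance_based.predict([[10.0]]))  # averages nearest known labels: 3.5
print(model_based.predict([[10.0]]))     # extrapolates along the line: 10.0
```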

20
Q

Instance vs model based learning EXAMPLES

A

Instance-based: e.g., a spam filter learns from example emails and generalizes to similar ones.
Model-based: e.g., predicting life satisfaction from GDP; we fit a linear model and can then estimate the satisfaction of a new country from its GDP.

21
Q

The Unreasonable Effectiveness of Data? Development of the idea that data may matter more than algorithms

A

In a famous paper published in 2001, Microsoft researchers Michele Banko and Eric
Brill showed that very different machine learning algorithms, including fairly simple
ones, performed almost identically well on a complex problem of natural language
disambiguation once they were given enough data (as you can see in Figure 1-21).
As the authors put it, “these results suggest that we may want to reconsider the tradeoff
between spending time and money on algorithm development versus spending it
on corpus development”.
The idea that data matters more than algorithms for complex problems was further
popularized by Peter Norvig et al. in a paper titled “The Unreasonable Effectiveness of
Data”, published in 2009. It should be noted, however, that small and medium-sized
datasets are still very common, and it is not always easy or cheap to get extra training
data—so don’t abandon algorithms just yet.

22
Q

BAD DATA problems

A

Too little data;
Unrepresentative data - sampling bias;
Poor quality (missing values, lots of noise) (If some instances are missing a few features (e.g., 5% of your customers did not
specify their age), you must decide whether you want to ignore this attribute
altogether, ignore these instances, fill in the missing values (e.g., with the median
age), or train one model with the feature and one model without it.)
Irrelevant features;
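One of the options quoted above, filling in missing values with the median, can be sketched with scikit-learn's SimpleImputer; the customer ages are invented for illustration:

```python
# A sketch of median imputation for missing feature values
# (toy data invented for illustration).
import numpy as np
from sklearn.impute import SimpleImputer

ages = np.array([[25.0], [np.nan], [40.0], [31.0]])  # one customer's age missing

imputer = SimpleImputer(strategy="median")
filled = imputer.fit_transform(ages)
# the NaN is replaced by the median of the observed ages (31.0)
```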

23
Q

BAD ALGORITHM problems

A

Overfitting: happens when the model is too complex relative to
the amount and noisiness of the training data. Remedy: regularization.

Underfitting: remedy is to select a more powerful model, add better features, or reduce the regularization constraints.
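Regularization can be sketched by comparing a free high-degree polynomial fit with a ridge-regularized one; the noisy, nearly linear data is invented for illustration. Ridge is one possible regularized model, not the only one:

```python
# A sketch of regularization taming an over-complex model: the alpha
# hyperparameter of Ridge penalizes large weights (toy data invented
# for illustration).
import numpy as np
from sklearn.linear_model import Ridge, LinearRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = x + rng.normal(0, 0.1, size=20)          # nearly linear data with noise
X_poly = PolynomialFeatures(degree=15).fit_transform(x.reshape(-1, 1))

plain = LinearRegression().fit(X_poly, y)    # free to overfit the noise
ridge = Ridge(alpha=1.0).fit(X_poly, y)      # weights are constrained

# regularization keeps the learned weights much smaller
print(np.abs(plain.coef_).max(), np.abs(ridge.coef_).max())
```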

24
Q

Hyperparameter?

A

A hyperparameter is a parameter of a learning algorithm (not of the
model).

25
Q

Generalization error/out of sample error/test error

A

As these names imply, you train your model using the training set, and you test it
using the test set. The error rate on new cases is called the generalization error (or
out-of-sample error), and by evaluating your model on the test set, you get an estimate
of this error.
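This split-train-evaluate loop can be sketched with scikit-learn; the noisy linear data is invented for illustration:

```python
# A sketch of estimating the generalization error on a held-out test set
# (toy data invented for illustration).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 1))
y = 2.0 * X.ravel() + rng.normal(0, 1.0, size=200)  # linear relation plus noise

# hold out 20% of the data as a test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)    # train on the training set

# the error on the unseen test set estimates the generalization error
test_mse = mean_squared_error(y_test, model.predict(X_test))
```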

26
Q

No free lunch theorem

A

In a famous 1996 paper, David Wolpert demonstrated that if you make absolutely
no assumption about the data, then there is no reason to prefer one model over any
other. This is called the No Free Lunch (NFL) theorem. For some datasets the best
model is a linear model, while for other datasets it is a neural network. There is no
model that is a priori guaranteed to work better (hence the name of the theorem).
The only way to know for sure which model is best is to evaluate them all. Since
this is not possible, in practice you make some reasonable assumptions about the
data and evaluate only a few reasonable models. For example, for simple tasks you
may evaluate linear models with various levels of regularization, and for a complex
problem you may evaluate various neural networks.

27
Q

main steps of a machine learning project

A
  1. Look at the big picture.
  2. Get the data.
  3. Explore and visualize the data to gain insights.
  4. Prepare the data for machine learning algorithms.
  5. Select a model and train it.
  6. Fine-tune your model.
  7. Present your solution.
  8. Launch, monitor, and maintain your system.
28
Q

Sites with dataset repositories / ML problems

A

Popular open data repositories:
—OpenML.org
—Kaggle.com
—PapersWithCode.com
—UC Irvine Machine Learning Repository
—Amazon’s AWS datasets
—TensorFlow datasets
* Meta portals (they list open data repositories):
—DataPortals.org
—OpenDataMonitor.eu
* Other pages listing many popular open data repositories:
—Wikipedia’s list of machine learning datasets
—Quora.com
—The datasets subreddit

29
Q

Pipelines

A

A sequence of data processing components is called a data pipeline. Pipelines are very
common in machine learning systems, since there is a lot of data to manipulate and
many data transformations to apply.
Components typically run asynchronously. Each component pulls in a large amount
of data, processes it, and spits out the result in another data store. Then, some time
later, the next component in the pipeline pulls in this data and spits out its own
output. Each component is fairly self-contained: the interface between components
is simply the data store. This makes the system simple to grasp (with the help of a
data flow graph), and different teams can focus on different components. Moreover,
if a component breaks down, the downstream components can often continue to
run normally (at least for a while) by just using the last output from the broken
component. This makes the architecture quite robust.
On the other hand, a broken component can go unnoticed for some time if proper
monitoring is not implemented. The data gets stale and the overall system’s performance
drops.
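scikit-learn offers an in-memory, synchronous analogue of this idea in its Pipeline class, which chains processing components so each step feeds the next; note this is a simplification of the asynchronous, data-store-mediated architecture described above. The toy data is invented for illustration:

```python
# A minimal sketch of chained data-processing components with
# scikit-learn's Pipeline (toy data invented for illustration).
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])

pipe = Pipeline([
    ("scale", StandardScaler()),    # component 1: standardize the feature
    ("model", LinearRegression()),  # component 2: fit a regression model
])
pipe.fit(X, y)                      # each step's output feeds the next step
pred = pipe.predict([[5.0]])        # the whole chain runs as one object
```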

30
Q

root mean square error (RMSE).

A

Your next step is to select a performance measure. A typical performance measure
for regression problems is the root mean square error (RMSE). It gives an idea of how
much error the system typically makes in its predictions, with a higher weight given
to large errors. Equation 2-1 shows the mathematical formula to compute the RMSE.
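In symbols, RMSE is the square root of the mean of the squared prediction errors: square each error, average, then take the root, which is why large errors weigh more. A sketch with NumPy (the predictions are invented for illustration):

```python
# Computing the RMSE by hand with NumPy (toy values invented
# for illustration).
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])   # actual target values
y_pred = np.array([2.5, 5.0, 4.0, 8.0])   # a model's predictions

# square the errors, average them, take the square root
rmse = np.sqrt(np.mean((y_pred - y_true) ** 2))
```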