Math & Statistics - Machine Learning Flashcards

Question 1

Q

What are the components of machine learning?

Answer

A

Task T: example - playing checkers
Performance measure P: example - percentage of games won against opponents
Training experience E: playing practice games against itself.

Question 2

Q

What is the inductive learning hypothesis?

Answer

A

Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.

Question 3

Q

How is a concept acquired in concept learning?

Answer

A

By acquiring the definition of a general category from positive and negative training examples of that category.

Question 4

Q

What is Concept Learning?

Answer

A

Concept learning can be viewed as the task of searching through a large space of hypotheses implicitly defined by the hypothesis representation. The goal of this search is to find the hypothesis that best fits the training examples.

Question 5

Q

Which algorithms find a hypothesis that fits the concept (Concept = 1)?

Answer

A

1. FIND-S

2. Candidate Elimination
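
As a minimal sketch of the first algorithm: FIND-S starts from the most specific hypothesis and generalizes it just enough to cover each positive example, ignoring the negative ones. The tuple encoding of attributes, the `'?'` wildcard, and the toy data below are illustrative assumptions, not part of the original cards.

```python
def find_s(examples):
    """Return the most specific conjunctive hypothesis covering the positives.

    examples: list of (attributes, label) pairs, where attributes is a tuple
    of attribute values and label is True for a positive example.
    '?' in the hypothesis means "any value is acceptable".
    """
    hypothesis = None  # None stands for the most specific hypothesis (matches nothing)
    for attributes, label in examples:
        if not label:
            continue  # FIND-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(attributes)  # copy the first positive example as-is
            continue
        for k, value in enumerate(attributes):
            if hypothesis[k] != value:
                hypothesis[k] = "?"  # generalize the attribute that disagrees
    return hypothesis


# Hypothetical toy data: (sky, temperature, wind) -> positive example?
training = [
    (("sunny", "warm", "strong"), True),
    (("sunny", "warm", "weak"), True),
    (("rainy", "cold", "strong"), False),
    (("sunny", "cold", "strong"), True),
]
print(find_s(training))
```

Candidate Elimination differs in that it also uses the negative examples and keeps track of every hypothesis consistent with the data, not only the most specific one.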

Question 6

Q

What is the difference between parametric and non-parametric approaches to find f( )?

Answer

A

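Parametric approaches assume a functional form for f and reduce the problem to estimating a set of parameters. The danger is that the assumed form may be far from the true f.
Non-parametric approaches completely avoid this danger, since essentially no assumption about the form of f is made. In exchange, they need many more observations to estimate f accurately.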
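
The contrast can be sketched with two toy estimators. The example assumes a single predictor, a straight-line model as the parametric choice, and a k-nearest-neighbours average as the non-parametric choice; the function names and the data are hypothetical.

```python
def fit_linear(xs, ys):
    """Parametric: assume f(x) = b0 + b1 * x and estimate the two parameters."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return lambda x: b0 + b1 * x


def fit_knn(xs, ys, k=3):
    """Non-parametric: no assumed form; average the k nearest training responses."""
    def predict(x):
        nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x))[:k]
        return sum(y for _, y in nearest) / k
    return predict


# Hypothetical noise-free data from a curved f(x) = x**2 on 0, 0.5, ..., 4.
xs = [i * 0.5 for i in range(9)]
ys = [x ** 2 for x in xs]

linear = fit_linear(xs, ys)
knn = fit_knn(xs, ys, k=3)
print("true f(2):      ", 2.0 ** 2)
print("linear estimate:", linear(2.0))
print("k-NN estimate:  ", knn(2.0))
```

Here the straight line is the wrong form for f, so its estimate at x = 2 is further from the truth than the k-NN average. With few or noisy observations the comparison can easily go the other way.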

Question 7

Q

What is the difference between supervised and unsupervised statistical learning problems?

Question 8

Q

What is another name used for qualitative variables?

Answer

A

Categorical

Question 9

Q

How are problems with a categorical response classified?

Answer

A

Problems with a qualitative (categorical) response are often referred to as classification problems.

Question 10

Q

How are problems with a quantitative response classified?

Answer

A

We tend to refer to problems with a quantitative response as regression problems.

Question 11

Q

How to measure the performance of a statistical learning method?

Answer

A

In order to evaluate the performance of a statistical learning method on a given data set, we need some way to measure how well its predictions actually match the observed data. That is, we need to quantify the extent to which the predicted response value for a given observation is close to the true response value for that observation. In the regression setting, the most commonly-used measure is the mean squared error (MSE).
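
A minimal sketch of the MSE computation, assuming the observed and predicted responses are given as two equal-length lists (the function name and the numbers are illustrative):

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observed and predicted responses."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / len(y_true)


observed = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]
mse = mean_squared_error(observed, predicted)
print(mse)  # squared errors are 1, 0 and 4, so the mean is 5/3
```

The MSE is small when the predictions are close to the observed responses and grows quickly when some predictions are far off, because the differences are squared.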

Question 12

Q

What is supervised statistical learning?

Answer

A

Building a statistical model for predicting, or estimating, an output based on one or more inputs.

Question 13

Q

What is unsupervised statistical learning?

Answer

A

Learning relationships and structure from the data when there are inputs but no supervising output to predict.

Question 14

Q

What is the difference between classification problems and regression (quantitative) problems?

Answer

A

Regression (quantitative) problems output a number, while classification problems output a class label.

Question 15

Q

When were the main statistical learning methods discovered or invented?

Answer

A

Early 19th century - Linear Regression - Legendre and Gauss - Method of Least Squares
1940s - Logistic Regression - various authors
Early 1970s - Generalized Linear Models - Nelder and Wedderburn
Mid-1980s - Classification and Regression Trees - Breiman, Friedman, Olshen and Stone
1986 - Generalized Additive Models (non-linear models) - Hastie and Tibshirani

Question 16

Q

Notation used by Introduction to Statistical Learning

Answer

A

n - the number of distinct data points, or observations, in the sample.
p - the number of variables available for making predictions.
Red font - used for variable names.
i - indexes the observations (from 1 to n).
j - indexes the variables (from 1 to p).
X - an n x p matrix whose (i, j)th element is x_ij. Like a spreadsheet, with one row per observation and one column per variable.
X^T - the transpose of the matrix X.
y_i - the ith observation of the variable on which we wish to make predictions.
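
The notation maps directly onto a small data structure. The sketch below uses plain Python lists and made-up numbers; the helper `element` is a hypothetical name introduced only to show the 1-based indexing.

```python
# Hypothetical data set: n = 3 observations, p = 2 variables.
X = [
    [1.0, 10.0],  # x_11, x_12
    [2.0, 20.0],  # x_21, x_22
    [3.0, 30.0],  # x_31, x_32
]
y = [5.0, 6.0, 7.0]  # y_i is the response for observation i

n = len(X)     # number of observations (rows)
p = len(X[0])  # number of variables (columns)


def element(matrix, i, j):
    """Return x_ij using the 1-based indices of the notation."""
    return matrix[i - 1][j - 1]


# Transpose: a p x n matrix whose rows are the columns of X.
X_T = [list(column) for column in zip(*X)]

print(n, p, element(X, 2, 1), X_T)
```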

Question 17

Q

What does ε represent in Y = f(X) + ε?

Answer

A

Random error term

Question 18

Q

The accuracy of Ŷ (Y hat) as a prediction for Y depends on which two quantities?

Answer

A

1. Reducible error

2. Irreducible error
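
The two quantities can be written out as a worked equation. This assumes the usual setup Y = f(X) + ε with E(ε) = 0, a prediction Ŷ = f̂(X), and both f̂ and X treated as fixed, so the only randomness comes from ε:

```latex
\begin{aligned}
E(Y - \hat{Y})^2 &= E\bigl[f(X) + \epsilon - \hat{f}(X)\bigr]^2 \\
                 &= \underbrace{\bigl[f(X) - \hat{f}(X)\bigr]^2}_{\text{reducible}}
                    + \underbrace{\operatorname{Var}(\epsilon)}_{\text{irreducible}}
\end{aligned}
```

The cross term vanishes because E(ε) = 0. A better estimate f̂ shrinks the first term, while the second term stays the same no matter how well f is estimated.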

Question 19

Q

Why is the irreducible error larger than zero?

Answer

A

The quantity ε may contain unmeasured variables that are useful in predicting Y: since we don't measure them, f cannot use them for its prediction.
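
A small simulation can make this concrete. It assumes a made-up true relationship `true_f` and Gaussian noise with standard deviation 0.5; both are illustrative choices. Even when predictions come from the true f itself, the mean squared error stays close to Var(ε) and does not drop to zero.

```python
import random


def true_f(x):
    """Hypothetical true relationship between X and Y."""
    return 2.0 * x + 1.0


random.seed(0)  # fixed seed so the run is repeatable
noise_sd = 0.5
xs = [random.uniform(0.0, 10.0) for _ in range(20000)]
ys = [true_f(x) + random.gauss(0.0, noise_sd) for x in xs]

# Predict with the true f, so the reducible error is exactly zero.
mse = sum((y - true_f(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)
print("MSE using the true f:", mse)
print("Var(epsilon):        ", noise_sd ** 2)
```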

Question 20

Q

Reasons to estimate f

Answer

A

1. Prediction

2. Inference

Question 21

Q

Modeling for inference

Answer

A

Discovering the relationship between the different variables and the outcome (Y).

Question 22

Q

Modeling for prediction

Answer

A

Generating a prediction (Y) based on the input X.

Question 23

Q

What are common approaches to determine f?

Answer

A

Linear Regression

Logistic Regression
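
As an illustration of the second approach, here is a minimal sketch of logistic regression with one predictor, fitted by plain gradient ascent on the log-likelihood. The learning rate, step count, function names, and toy data are assumptions chosen for the example; real libraries use more robust optimizers.

```python
import math


def sigmoid(z):
    """Map a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, labels, learning_rate=0.1, steps=5000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by gradient ascent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals: observed label minus current predicted probability.
        residuals = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, labels)]
        grad0 = sum(residuals) / n
        grad1 = sum(r * x for r, x in zip(residuals, xs)) / n
        b0 += learning_rate * grad0
        b1 += learning_rate * grad1
    return b0, b1


# Hypothetical data: larger x makes class 1 more likely, with some overlap.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = [0, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(xs, labels)
print("P(y=1 | x=1):", sigmoid(b0 + b1 * 1.0))
print("P(y=1 | x=6):", sigmoid(b0 + b1 * 6.0))
```

The fitted model is parametric: the whole estimate of f is summarized by the two numbers `b0` and `b1`.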

(23 cards)