Math & Statistics - Machine Learning Flashcards

1
Q

What are the components of machine learning?

A

Task T: example - playing checkers
Performance measure P: example - percentage of games won against opponents
Training experience E: example - playing practice games against itself

2
Q

What is the inductive learning hypothesis?

A

Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.

3
Q

How is a concept acquired in concept learning?

A

By acquiring the definition of a general category from a sample of positive and negative training examples of that category.

4
Q

What is Concept Learning?

A

Concept learning can be viewed as the task of searching through a large space of hypotheses implicitly defined by the hypothesis representation. The goal of this search is to find the hypothesis that best fits the training examples.

5
Q

Which algorithms find hypotheses that fit the concept (Concept = 1)?

A
1. FIND-S
2. Candidate Elimination
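
As an illustration of the first algorithm, here is a minimal Python sketch of FIND-S for conjunctive hypotheses over discrete attributes. The encoding ("?" for "any value", None for "no value accepted yet"), the function name find_s, and the toy weather data are assumptions made for this example, not part of the card.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed encoding: "?" accepts any value, None accepts no value at all.
Hypothesis = List[Optional[str]]
Example = Tuple[Sequence[str], bool]


def find_s(examples: Sequence[Example]) -> Hypothesis:
    """Return the most specific conjunctive hypothesis that covers every positive example."""
    n_attrs = len(examples[0][0])
    h: Hypothesis = [None] * n_attrs      # start from the most specific hypothesis
    for x, is_positive in examples:
        if not is_positive:
            continue                      # FIND-S ignores negative examples
        for i, value in enumerate(x):
            if h[i] is None:
                h[i] = value              # first positive example: copy its values
            elif h[i] != value:
                h[i] = "?"                # conflicting values: generalize this attribute
    return h


# Toy data (illustrative only): attribute tuples labelled positive or negative.
training = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]

print(find_s(training))  # ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```

Candidate Elimination differs in that it also uses the negative examples and tracks the whole set of consistent hypotheses rather than a single one.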

6
Q

What is the difference between parametric and non-parametric approaches to estimating f?

A

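Parametric approaches assume a functional form for f (for example, linear) and reduce the problem to estimating a set of parameters. The danger is that the assumed form may be far from the true f. Non-parametric approaches completely avoid this danger, since essentially no assumption about the form of f is made.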
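
To make the contrast concrete, here is a minimal Python sketch on synthetic data. The parametric fit assumes f is a straight line and estimates two coefficients; the non-parametric fit averages the k nearest training responses and assumes no form for f. The data, the helper names (fit_line, knn_predict, mse) and the choice k = 5 are all assumptions for illustration.

```python
import math
import random

random.seed(0)

# Synthetic data: the true f is nonlinear, observed with a little noise.
xs = [3.0 * i / 49 for i in range(50)]
ys = [math.sin(2.0 * x) + random.gauss(0.0, 0.1) for x in xs]


def fit_line(xs, ys):
    """Parametric: assume f(x) = b0 + b1 * x and estimate b0, b1 by least squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def knn_predict(x_new, xs, ys, k=5):
    """Non-parametric: average the responses of the k nearest training points."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x_new))[:k]
    return sum(ys[i] for i in nearest) / k


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted responses."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


b0, b1 = fit_line(xs, ys)
line_mse = mse(ys, [b0 + b1 * x for x in xs])
knn_mse = mse(ys, [knn_predict(x, xs, ys) for x in xs])
print(f"training MSE, linear fit: {line_mse:.3f}")
print(f"training MSE, 5-NN fit:   {knn_mse:.3f}")
```

Because the true f here is far from linear, the straight-line assumption fits poorly, which is exactly the danger the card describes. Both numbers are training errors, so this sketch says nothing about which method would do better on new data.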

7
Q

What is the difference between supervised and unsupervised statistical learning problems?

A

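In supervised learning, each observation has an associated output (response) that the model tries to predict from the inputs. In unsupervised learning there is no output; the goal is to learn relationships and structure from the data.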

8
Q

What is another name used for qualitative variables?

A

Categorical

9
Q

How are problems with a categorical (qualitative) response classified?

A

Problems with a qualitative response are often referred to as classification problems.

10
Q

How are problems with a quantitative response classified?

A

We tend to refer to problems with a quantitative response as regression problems.

11
Q

How to measure the performance of a statistical learning method?

A

In order to evaluate the performance of a statistical learning method on a given data set, we need some way to measure how well its predictions actually match the observed data. That is, we need to quantify the extent to which the predicted response value for a given observation is close to the true response value for that observation. In the regression setting, the most commonly used measure is the mean squared error (MSE).
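
For reference, the MSE is the average squared difference between observed and predicted responses: MSE = (1/n) * sum over i of (y_i - f_hat(x_i))^2. A minimal Python sketch follows; the function name and the numbers are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observed and predicted responses."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n = len(y_true)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n


y_true = [3.0, 5.0, 7.0]   # observed responses (illustrative values)
y_pred = [2.0, 5.0, 9.0]   # predicted responses (illustrative values)

# Squared errors are 1, 0 and 4, so the MSE is 5/3.
print(mean_squared_error(y_true, y_pred))
```

A small MSE means the predictions are close to the observed responses; a large MSE means that for some observations they differ substantially.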

12
Q

What is supervised statistical learning?

A

Building a statistical model to predict, or estimate, an output based on one or more inputs.

13
Q

What is unsupervised statistical learning?

A

Learning relationships and structure from the data, without trying to predict an output.

14
Q

What is the difference between classification problems and regression (quantitative) problems?

A

Regression problems output a number, while classification problems output a class label (a category).

15
Q

When were the main statistical learning methods discovered or invented?

A

19th century - Linear regression - Legendre and Gauss - method of least squares
1940 - Logistic regression - various authors
1970 - Generalized linear models - Nelder and Wedderburn
1985 - Classification and regression trees - Breiman, Friedman, Olshen and Stone
1986 - Generalized additive models (non-linear models) - Hastie and Tibshirani

16
Q

Notation used by Introduction to Statistical Learning

A

n - the number of distinct data points, or observations, in the sample.
p - the number of variables available for use in making predictions.
Colored font (red) - variable names.
i - indexes the observations (from 1 to n).
j - indexes the variables (from 1 to p).
X - an n x p matrix whose (i, j)th element is x_ij. Like a spreadsheet: one row per observation, one column per variable.
X^T - the transpose of X.
y_i - the ith observation of the variable on which we wish to make predictions.
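
As a small illustration of this notation, the sketch below stores X as a list of rows in plain Python. The numbers and the helper name element are made up for the example; note that the notation is 1-based while Python indexing is 0-based.

```python
# X has n = 2 observations (rows) and p = 3 variables (columns); values are illustrative.
X = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]
y = [10.0, 20.0]           # one response per observation

n = len(X)                 # number of observations
p = len(X[0])              # number of variables


def element(X, i, j):
    """Return x_ij using the 1-based indices of the notation."""
    return X[i - 1][j - 1]


# Transpose: rows become columns, so X_T is a p x n matrix.
X_T = [list(column) for column in zip(*X)]

print(n, p)                # 2 3
print(element(X, 2, 3))    # 6.0
print(X_T)                 # [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
```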

17
Q

What is ε (epsilon) in Y = f(X) + ε?

A

Random error term

18
Q

The accuracy of Ŷ (Y-hat) as a prediction for Y depends on which two quantities?

A
1. Reducible error
2. Irreducible error
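
The two quantities can be read off a standard decomposition of the expected squared prediction error. The derivation below is a sketch under stated assumptions: Y = f(X) + ε, the prediction is Ŷ = f̂(X), both f̂ and X are treated as fixed, and ε has mean zero.

```latex
\[
\begin{aligned}
E\bigl(Y - \hat{Y}\bigr)^2
  &= E\bigl[f(X) + \varepsilon - \hat{f}(X)\bigr]^2 \\
  &= \underbrace{\bigl[f(X) - \hat{f}(X)\bigr]^2}_{\text{reducible}}
   + \underbrace{\operatorname{Var}(\varepsilon)}_{\text{irreducible}}
\end{aligned}
\]
```

The cross term vanishes because ε has mean zero. A better estimate of f shrinks the first term, but no choice of f̂ can reduce Var(ε).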

19
Q

Why is the irreducible error larger than zero?

A

The quantity ε may contain unmeasured variables that are useful in predicting Y: since we don’t measure them, f cannot use them for its prediction.

20
Q

Reasons to estimate f

A
1. Prediction
2. Inference

21
Q

Modeling for inference

A

Discovering the relationship between the different variables and the outcome (Y).

22
Q

Modeling for prediction

A

Generating a prediction of Y based on the input X.

23
Q

What are common approaches to determine f?

A

Linear Regression

Logistic Regression
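
As a rough illustration of the second approach, here is a minimal Python sketch of logistic regression with one input, fitted by plain gradient descent on the log-loss. The data, learning rate, number of epochs and function names are assumptions for this example; real analyses would normally rely on an established statistics library.

```python
import math


def sigmoid(z):
    """Map a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=1000):
    """Estimate b0, b1 in P(Y = 1 | x) = sigmoid(b0 + b1 * x) by gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0, grad1 = 0.0, 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y   # gradient of the log-loss w.r.t. the linear term
            grad0 += error
            grad1 += error * x
        b0 -= lr * grad0 / n
        b1 -= lr * grad1 / n
    return b0, b1


# Toy data (illustrative only): class 1 for positive x, class 0 for negative x.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
print(f"P(Y = 1 | x = 1.5)  = {sigmoid(b0 + b1 * 1.5):.3f}")
print(f"P(Y = 1 | x = -1.5) = {sigmoid(b0 + b1 * -1.5):.3f}")
```

Linear regression models a quantitative response directly, while logistic regression models the probability of a class, which is why the two are typically paired with regression and classification problems respectively.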