Chapter 1,2 - Overview Flashcards

1
Q

Statistical Learning refers to…

A

A vast set of tools for understanding data. These tools can be unsupervised or supervised.

2
Q

Supervised vs Unsupervised Learning:

A

Supervised: Predicting/estimating an output based on one or more inputs. We have labels for the data already.

Unsupervised: inputs with no supervising output, but we can still learn relationships and structure from the data.

3
Q

When we want to predict or estimate a continuous or quantitative variable, what kind of problem is this?

A

A regression problem. Linear regression is one method for solving it.

4
Q

What is a Classification Problem?

A

When we are trying to predict a qualitative (categorical) value rather than a numerical one. E.g. will stock prices go UP or DOWN (ignoring by how much)?

5
Q

What is a clustering problem?

A

Where we want to group observations based on their observed characteristics.

There is no output variable corresponding to the input variables; however, we still want to find groups, structure and relationships in the data.

Unsupervised.
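
As a rough illustration of grouping observations without any output variable, here is a minimal sketch using k-means, which is one common clustering method and is not named on this card. The data, the starting centers and the helper name `k_means` are all assumptions made for the example, and the sketch does not handle empty clusters.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical unlabeled data: two well-separated groups in 2-D.
group_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
X = np.vstack([group_a, group_b])

def k_means(X, centers, n_iter=10):
    """Alternate between assigning points to the nearest center and recomputing centers."""
    for _ in range(n_iter):
        # Distance from every observation to every center: shape (n, k).
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # New center = mean of the observations assigned to it.
        centers = np.array([X[labels == k].mean(axis=0) for k in range(len(centers))])
    return labels, centers

# Start from the first and the last observation (an arbitrary choice for this sketch).
labels, centers = k_means(X, centers=X[[0, -1]])
```

No labels were supplied; the group memberships in `labels` are recovered from the inputs alone.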

6
Q

What model/method could we use to predict a qualitative variable?
E.g. patient lives or dies, stock market goes up or down?

A

Logistic Regression or Linear Discriminant Analysis
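
To make the first option concrete, the following is a minimal sketch of logistic regression with a single predictor, fitted by plain gradient ascent on the log-likelihood. The simulated data, the step size of 0.1 and the iteration count are assumptions chosen for illustration; a real analysis would normally use a library routine.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical data: one predictor, binary response (0 = "down", 1 = "up").
x = np.concatenate([rng.normal(-2.0, 1.0, 100), rng.normal(2.0, 1.0, 100)])
y = np.concatenate([np.zeros(100), np.ones(100)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Model: P(Y = 1 | x) = sigmoid(b0 + b1 * x).
# Fit b0 and b1 by gradient ascent on the average log-likelihood.
b0, b1 = 0.0, 0.0
for _ in range(1000):
    p = sigmoid(b0 + b1 * x)
    b0 += 0.1 * np.mean(y - p)
    b1 += 0.1 * np.mean((y - p) * x)

# Classify as 1 when the estimated probability is at least 0.5.
predicted = (sigmoid(b0 + b1 * x) >= 0.5).astype(float)
accuracy = np.mean(predicted == y)
```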

7
Q
In statistics, what does this notation stand for:
n
p
x_ij
X
T
y_i
A

n = number of distinct data points (observations) in the sample
p = number of variables available for making predictions
x_ij = value of the jth variable for the ith observation
X = n x p matrix whose (i,j)th element is x_ij
T = transpose (written as a superscript, e.g. X^T)
y_i = ith observation of the variable we wish to predict
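
The notation maps directly onto array shapes. The sketch below uses a made-up 4 x 3 data matrix; the numbers carry no meaning and are only there to show which index is which.

```python
import numpy as np

# Hypothetical toy data: n = 4 observations (rows), p = 3 variables (columns).
X = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [10.0, 11.0, 12.0],
])
y = np.array([0.5, 1.5, 2.5, 3.5])  # y[i - 1] holds the ith response value

n, p = X.shape   # n = number of observations, p = number of variables
x_23 = X[1, 2]   # element with i = 2, j = 3 (NumPy indexes from 0)
X_T = X.T        # transpose: a p x n matrix
```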

8
Q

What are some of the names given to input vs output variables in ISLR?

A

Input: predictors, independent variables, features, also just variables

Output: response or dependent variable - often denoted by symbol Y

9
Q

Describe Y = f(X) + ε

A

Assumptions:

f is some fixed but unknown function

ε is a random error term, which is independent of X and has a mean of zero

In this formulation, f represents the systematic information that X provides about Y.
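
One way to get a feel for this model is to simulate from it. In the sketch below the "true" f is a made-up linear function and the error term is drawn from a normal distribution; both are assumptions for illustration, since in practice f is unknown and the model only requires the error to have mean zero and be independent of X.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Hypothetical "true" function; in a real problem this is unknown.
    return 2.0 + 3.0 * x

n = 10_000
X = rng.uniform(0.0, 1.0, size=n)
eps = rng.normal(loc=0.0, scale=1.0, size=n)  # mean zero, drawn independently of X
Y = f(X) + eps                                # systematic part plus random error

mean_eps = eps.mean()                    # sample mean of the error, close to 0
corr_eps_X = np.corrcoef(eps, X)[0, 1]   # sample correlation with X, close to 0
```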

10
Q

In essence, statistical learning refers to a set of approaches for estimating:

A

f

Some unknown function f that we can use to predict/estimate Y

11
Q

What are the two main reasons for estimating f (the function)?

A

Two main reasons:

Prediction and Inference

12
Q

The accuracy of Ŷ as a prediction for Y depends on two quantities…

A

Reducible error: in general, f̂ is not a perfect estimate of f, and this inaccuracy will introduce some error. This error is reducible because we can potentially improve the accuracy of f̂ by using the most appropriate statistical learning technique for estimating f.

Irreducible error:
Even if we could estimate f perfectly, so that Ŷ = f(X), we would still see error in our predictions. No matter how well we estimate f, we cannot reduce the error introduced by ε.
This is because Y is also a function of ε, which by definition cannot be predicted using X.
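
For a fixed estimate and fixed X, the expected squared prediction error splits as E(Y − Ŷ)² = [f(X) − f̂(X)]² + Var(ε): a reducible part plus an irreducible part. The simulation below is a small sketch of that split under assumed choices (a sine-shaped true f, normal errors with standard deviation 0.5, and a deliberately crude estimate that always predicts 0).

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.5  # assumed standard deviation of the error term, so Var(eps) = 0.25

def f(x):
    return np.sin(2.0 * np.pi * x)  # hypothetical true f

n = 50_000
X = rng.uniform(0.0, 1.0, size=n)
Y = f(X) + rng.normal(0.0, sigma, size=n)

# Predicting with the true f leaves only the irreducible error.
mse_perfect = np.mean((Y - f(X)) ** 2)

# A crude estimate (always predict 0) adds reducible error on top of it.
f_hat_crude = np.zeros(n)
mse_crude = np.mean((Y - f_hat_crude) ** 2)
```

Here `mse_perfect` stays near Var(ε) = 0.25 even though the true f is used, while `mse_crude` is larger by roughly the average of [f(X) − 0]².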

13
Q

Inference vs Prediction to estimate f

A

Prediction: estimate a value for Y based on inputs X.

Inference: what effect does a change in input X have on Y?
We want to understand the relationship between X and Y.

14
Q

Inference vs Prediction example

A

Inference:
How much extra will a house be worth if it has a view of the river?

Prediction:
Is the house undervalued or overvalued?

15
Q

If we want to find a function f̂ such that Y ≈ f̂(X) for any observation (X, Y), what type of statistical learning approaches can we take?

A

Parametric:
Involves a two-step model-based approach.
1. Make an assumption about the functional form of f. E.g. assume that f is linear in X (a linear model).

2. After selecting a model, we need a procedure that uses the training data to fit, or train, the model.

Non-parametric: no explicit assumption about the functional form of f. Seeks an estimate of f that gets as close to the data points as possible.
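
The two approaches can be compared on the same data. The sketch below is only an illustration under assumed choices: the data come from a made-up sine-shaped f, the parametric model is a straight line fitted by least squares, and the non-parametric method is a k-nearest-neighbours average with k = 10 (one of many possible non-parametric methods).

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical training data generated from a nonlinear f.
x_train = np.sort(rng.uniform(0.0, 1.0, size=200))
y_train = np.sin(2.0 * np.pi * x_train) + rng.normal(0.0, 0.1, size=200)

# Parametric. Step 1: assume f(x) = b0 + b1 * x.
# Step 2: use the training data to fit b0 and b1 by least squares.
b1, b0 = np.polyfit(x_train, y_train, deg=1)  # coefficients, highest degree first

def f_hat_linear(x):
    return b0 + b1 * x

# Non-parametric: average the responses of the k nearest training points.
def f_hat_knn(x, k=10):
    x = np.atleast_1d(x)
    dist = np.abs(x[:, None] - x_train[None, :])   # shape (len(x), 200)
    nearest = np.argsort(dist, axis=1)[:, :k]      # indices of the k closest points
    return y_train[nearest].mean(axis=1)

# Compare both estimates against the (normally unknown) true f.
x_test = np.linspace(0.05, 0.95, 50)
truth = np.sin(2.0 * np.pi * x_test)
mse_linear = np.mean((f_hat_linear(x_test) - truth) ** 2)
mse_knn = np.mean((f_hat_knn(x_test) - truth) ** 2)
```

Because the assumed linear form is far from the true f here, the parametric estimate does worse; with a truly linear f, or far fewer observations, the comparison could easily go the other way.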

16
Q

Estimating f.

Parametric approach: Advantages vs disadvantages

A

Advantages: assuming a form for f simplifies the problem, because it is generally much easier to estimate a set of parameters than to fit an entirely arbitrary function f.

Disadvantages: the model we choose will generally not match the true unknown form of f. If the chosen form is too far from the true f, the estimate will be poor.

17
Q

Estimating f.

Non-parametric methods: Advantages vs disadvantages

A

Advantages: potential to fit a wider range of possible shapes for f. Because no assumption is made about the form of f, there is no danger of the functional form used to estimate f being very different from the true f.

Disadvantages:
The major one is that, since they do not reduce the problem of estimating f to a small number of parameters, a very large number of observations is required in order to obtain an accurate estimate for f.

18
Q

What is the trade-off between prediction accuracy and model interpretability?

A

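As interpretability increases, flexibility decreases, and vice versa.

More restrictive methods such as linear regression are easier to interpret; more flexible methods can fit a wider range of shapes for f but are harder to interpret.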