Introduction Flashcards

1
Q

Non-Technical Definition: Statistical Learning

A

A broad set of tools for understanding and extracting information from data.

2
Q

Non-Technical Definition: Supervised Learning

A

Methods for predicting an output (response) from one or more inputs (predictors), built from data in which the correct output is known for each observation.

3
Q

Non-Technical Definition: Unsupervised Learning

A

Methods for finding structure in data with inputs but no known or labeled output.

4
Q

Definition: Regression Problem

A

Predicting a continuous (quantitative) response.

5
Q

Definition: Classification Problem

A

Predicting a categorical (qualitative) response.

6
Q

What is Dimension Reduction?

A

Summarizing or transforming high-dimensional data into fewer dimensions while retaining key information.

7
Q

Difference Between Classification and Regression

A

Classification predicts a discrete category (e.g. ‘Up’ or ‘Down’), while regression predicts a numeric value.

8
Q

Key Premise of ISLR #1

A

Statistical learning methods are broadly useful across many fields, not just statistics.

9
Q

Key Premise of ISLR #2

A

Statistical learning should not be seen as a ‘black box’; understanding the assumptions and trade-offs is crucial.

10
Q

Key Premise of ISLR #3

A

We need not master the deep mathematical details to effectively use these methods.

11
Q

Key Premise of ISLR #4

A

Practical real-world applications are the main focus, with hands-on labs demonstrating methods in R.

12
Q

Notation: n and p

A

n is the number of observations in a data set; p is the number of variables (features).

13
Q

Matrix Representation of Data X

A

X is an n×p matrix, where each row is an observation and each column is a variable.
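
As a small illustration of this layout, the sketch below stores a made-up data set as a list of rows in plain Python. The values and the variable names (`X`, `n`, `p`, `x_21`) are hypothetical and chosen only for the example.

```python
# Hypothetical data: 3 observations (rows), 2 variables (columns).
X = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
]

n = len(X)     # number of observations (rows)
p = len(X[0])  # number of variables (columns)

# x_ij is the value of variable j for observation i.
# The notation is 1-based; Python indexing is 0-based.
x_21 = X[1][0]

print(n, p, x_21)  # 3 2 2.0
```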

14
Q

Definition: Transpose of a Matrix (X^T)

A

A matrix whose rows are the columns of the original matrix (and vice versa).
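
A minimal sketch of the transpose, assuming a matrix is stored as a list of equal-length rows. The helper name `transpose` is an assumption made for this example.

```python
def transpose(matrix):
    """Return the transpose of a matrix given as a list of equal-length rows."""
    # zip(*matrix) groups the first entries of every row, then the second, etc.,
    # so each column of the input becomes a row of the output.
    return [list(column) for column in zip(*matrix)]


X = [
    [1, 2, 3],
    [4, 5, 6],
]  # 2×3

X_T = transpose(X)  # 3×2
print(X_T)  # [[1, 4], [2, 5], [3, 6]]
```

Transposing twice returns the original matrix, which is a quick sanity check for any implementation.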

15
Q

Notation: y_i

A

The i-th observation of the response variable we wish to predict.

16
Q

Matrix Multiplication Requirement

A

The product AB is defined only when the number of columns of A equals the number of rows of B. If A is r×d and B is d×s, then AB is r×s.

17
Q

Formula for (AB)_{ij}

A

The (i, j) element of AB is the sum of the products of corresponding elements from row i of A and column j of B: (AB)_{ij} = \sum_{k=1}^{d} a_{ik} b_{kj}, where d is the number of columns of A (and rows of B).
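
The shape requirement and the element-wise rule can be sketched together in plain Python. The function name `matmul` and the example matrices are assumptions for illustration; in practice a numerical library would normally be used instead.

```python
def matmul(A, B):
    """Multiply A (r×d) by B (d×s), both given as lists of rows."""
    r, d = len(A), len(A[0])
    if len(B) != d:
        # Columns of A must match rows of B, otherwise AB is undefined.
        raise ValueError("columns of A must equal rows of B")
    s = len(B[0])
    # Entry (i, j) sums a_ik * b_kj over k: row i of A times column j of B.
    return [
        [sum(A[i][k] * B[k][j] for k in range(d)) for j in range(s)]
        for i in range(r)
    ]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
```

For example, the top-left entry is 1×5 + 2×7 = 19.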

18
Q

Distinction Between Bold/Capital vs Lower-Case Font

A

Bold capitals (e.g., A) denote matrices; lower-case bold (e.g., a) denotes vectors of length n; lower-case normal font (e.g., a) denotes scalars and vectors not of length n, such as feature vectors of length p; capital normal font (e.g., A) denotes random variables.

19
Q

Linear vs. Non-Linear Methods

A

Linear methods assume a linear relationship between predictors and response; non-linear methods can capture more complex, flexible relationships.
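
To make the contrast concrete, the sketch below fits a least-squares straight line to made-up data generated from y = x^2 and compares its squared error with that of the curve itself. The data, the helper `fit_line`, and the choice of x^2 as the non-linear relationship are all assumptions for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for a single predictor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data with a purely non-linear (quadratic) relationship.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]

intercept, slope = fit_line(xs, ys)

# Sum of squared errors for the fitted line and for the curve y = x^2.
sse_line = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
sse_curve = sum((y - x ** 2) ** 2 for x, y in zip(xs, ys))

print(intercept, slope, sse_line, sse_curve)  # 2.0 0.0 14.0 0.0
```

Here the best straight line is flat (slope 0), so it misses the pattern entirely, while a model that allows curvature can follow it.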

20
Q

Examples of Non-Linear Approaches

A

Tree-based methods (bagging, boosting, random forests), support vector machines, and generalized additive models.