Chapter 4: Components of Learning Flashcards

1
Q

How is data usually represented?

A

As a matrix
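
A minimal sketch of the matrix view, using plain Python lists. The numbers are made up for illustration; the only point is that each row is one record and each column is one attribute.

```python
# Hypothetical data set: 3 records (rows) with 3 attributes (columns) each.
data = [
    [5.1, 3.5, 1.4],
    [4.9, 3.0, 1.4],
    [6.2, 3.4, 5.4],
]

n_records = len(data)        # number of rows
n_attributes = len(data[0])  # number of columns

# Reading one column gives the values of a single attribute across all records.
second_attribute = [row[1] for row in data]
```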

2
Q

Data scientists spend the most time

A

cleaning data

3
Q

Target function

A

A function that maps X to Y. We do not know this function; the goal of learning is to approximate it

4
Q

Learning job

A

Create a hypothesis function that also maps X to Y and closely approximates the target function

5
Q

The 5 learning steps

A
  1. We have an unknown target function F which maps X to Y
  2. We have certain training examples and we use those training examples as part of a learning algorithm
  3. The learning algorithm has a number of hypotheses.
  4. These training examples and hypotheses together will produce a final hypothesis.
  5. We hope this final hypothesis is very close to the target function
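
The five steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a method from the chapter: the target function `f`, the threshold-rule hypothesis set, and the "fewest training errors" selection rule are all invented for the example.

```python
# Step 1: the target function f maps X to Y. In practice it is unknown;
# it is defined here only so that training examples can be generated.
def f(x):
    return 1 if x >= 5 else 0

# Step 2: training examples (x, y) produced by f.
training_examples = [(x, f(x)) for x in [1, 2, 4, 6, 8, 9]]

# Step 3: the hypothesis set, here one threshold rule per candidate threshold.
def make_hypothesis(threshold):
    return lambda x: 1 if x >= threshold else 0

hypothesis_set = {t: make_hypothesis(t) for t in range(0, 11)}

# Step 4: the learning algorithm combines examples and hypotheses by
# picking the hypothesis with the fewest errors on the training examples.
def training_errors(h):
    return sum(1 for x, y in training_examples if h(x) != y)

best_threshold = min(hypothesis_set, key=lambda t: training_errors(hypothesis_set[t]))
g = hypothesis_set[best_threshold]  # the final hypothesis

# Step 5: check how often g agrees with f, including on inputs not in training.
agreement = sum(1 for x in range(0, 11) if g(x) == f(x))
```

Note that more than one threshold can fit the training examples perfectly; which one is returned depends on the selection rule, which is why the final hypothesis is only hoped to be close to the target.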
6
Q

Learner input output

A

seen data as input, classifier as output

7
Q

Classifier input output

A

unseen data as input, response to that data as output

8
Q

Model is an

A

artifact. The learner builds the model, and the classifier uses that model to predict
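
The relationship between learner, model, and classifier can be sketched as follows. The majority-label "model" is a deliberately trivial assumption, and the names `learner`, `classifier`, and the example emails are hypothetical.

```python
from collections import Counter

def learner(seen_data):
    """Takes seen (labeled) data as input and returns a classifier."""
    # The model is the artifact the learner builds. Here it is just the
    # most frequent label, which is an assumption made to keep the sketch short.
    model = Counter(label for _, label in seen_data).most_common(1)[0][0]

    def classifier(unseen_item):
        """Takes unseen data as input and returns a response using the model."""
        return model

    return classifier

seen = [("email A", "spam"), ("email B", "ham"), ("email C", "spam")]
classify = learner(seen)          # learner: seen data in, classifier out
prediction = classify("email D")  # classifier: unseen data in, response out
```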

9
Q

Curse of dimensionality

A

The challenges and complications that arise when data has very many dimensions: the number of possible cases grows so quickly with the number of dimensions that the available data cannot cover every single case
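
A small arithmetic sketch of why coverage breaks down. It assumes each attribute is split into 10 intervals and that one million records are available; both numbers are arbitrary choices for illustration.

```python
def max_coverage(n_dimensions, n_records, bins_per_dimension=10):
    """Upper bound on the fraction of cells that can hold at least one record."""
    cells = bins_per_dimension ** n_dimensions
    return min(1.0, n_records / cells)

n_records = 1_000_000  # assumed data set size
for d in (1, 2, 3, 6, 10):
    cells = 10 ** d
    print(f"d={d:2d}  cells={cells:>14,}  max coverage={max_coverage(d, n_records):.4%}")
```

With 6 dimensions the records could still touch every cell, but with 10 dimensions they can reach at most one cell in ten thousand.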

10
Q

Generalizing

A

Being able to adapt to data that the model has not seen before

11
Q

Selection

A

Selecting the data you need

12
Q

Preprocessing

A

Clean the data and decide what needs to be removed

13
Q

Transformation

A

Transform the data into the shape you want, adding or removing attributes as needed

14
Q

Data Mining

A

Get patterns from the data

15
Q

Supervised learning

A

The model is trained on labeled data (input-output pairs). The algorithm learns to map inputs to outputs

16
Q

Unsupervised Learning

A

The model is trained on unlabeled data. The algorithm looks for patterns or structure in the data

17
Q

Reinforcement learning

A

The algorithm receives feedback in the form of rewards or penalties. The goal is to maximize reward over time

18
Q

Binarization

A

Converting continuous or categorical data into binary form
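
Two common ways to binarize can be sketched as below: a threshold for continuous values and one binary column per category for categorical values. The function names and the choice of threshold are assumptions made for the example.

```python
def binarize_threshold(values, threshold):
    """Continuous -> binary: 1 if the value is above the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]

def binarize_categories(values):
    """Categorical -> binary: one 0/1 column per category, in sorted order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

above_three = binarize_threshold([1.5, 3.0, 4.2], threshold=3.0)
colors = binarize_categories(["red", "blue", "red"])  # columns: blue, red
```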

19
Q

Discretization

A

Converting continuous data into discrete categories or intervals
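
One simple way to discretize is equal-width binning, sketched below. Equal-width intervals are only one option (an assumption here, the card does not name a method), and the ages are made-up values.

```python
def discretize_equal_width(values, n_bins):
    """Map each continuous value to the index of its equal-width interval."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything falls in the first interval.
        return [0 for _ in values]
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

ages = [18, 25, 40, 62, 90]
age_bins = discretize_equal_width(ages, n_bins=3)  # intervals of width 24
```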

20
Q

Classification

A

The task of learning a target function that maps each attribute set x to one of the predefined class labels

21
Q

How do decision trees work?

A

A tree induction algorithm learns a model (the tree) from training data. The model is then applied to new data (deduction) to predict the correct responses

22
Q

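Are decision trees always binary?

A

No. Some algorithms, such as CART, produce only binary splits, but others, such as ID3 and C4.5, allow multiway splits on an attribute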

23
Q

Hunt's Algorithm

A

Let Dt be the set of training records that reach a node t.
If all records in Dt belong to the same class Yt, then t is a leaf node labeled Yt.
If Dt is an empty set, then t is a leaf node labeled with the default class Yd.
If Dt contains records that belong to more than one class, use an attribute test to split the data into smaller subsets.
Recursively apply the procedure to each subset.
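
The procedure can be sketched recursively as below. Several parts are assumptions, not taken from the card: records are `(attribute_dict, class_label)` pairs with categorical attributes, the attribute test simply takes the next attribute in the list (real algorithms choose it with an impurity measure), mixed records with no attributes left get the majority class, and each child's default class is the majority class of its parent.

```python
from collections import Counter

def majority_class(records):
    return Counter(label for _, label in records).most_common(1)[0][0]

def hunt(records, attributes, domains, default_class):
    """Return a leaf (a class label) or an internal node (attribute, children)."""
    # Empty Dt: leaf labeled with the default class.
    if not records:
        return default_class
    classes = {label for _, label in records}
    # All records share one class: leaf labeled with that class.
    if len(classes) == 1:
        return classes.pop()
    # Assumption: no attribute left to test, so fall back to the majority class.
    if not attributes:
        return majority_class(records)
    # Mixed classes: split on an attribute test, then recurse on each subset.
    attr, rest = attributes[0], attributes[1:]
    node_default = majority_class(records)
    children = {}
    for value in domains[attr]:
        subset = [(x, y) for x, y in records if x[attr] == value]
        children[value] = hunt(subset, rest, domains, node_default)
    return (attr, children)

def predict(tree, x):
    # Follow attribute tests until a leaf (a plain class label) is reached.
    while isinstance(tree, tuple):
        attr, children = tree
        tree = children[x[attr]]
    return tree

# Hypothetical training records.
records = [
    ({"owner": "yes", "married": "no"}, "no"),
    ({"owner": "no", "married": "yes"}, "no"),
    ({"owner": "no", "married": "no"}, "yes"),
    ({"owner": "yes", "married": "yes"}, "no"),
]
domains = {"owner": ["yes", "no"], "married": ["yes", "no"]}
tree = hunt(records, ["owner", "married"], domains, majority_class(records))
```

In this sketch a leaf is a bare class label and an internal node is a tuple, so class labels themselves must not be tuples.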

24
Q

What is the default class?

A

The class that is most frequent in the data set