fa3 + logistic reg to gradient boosting Flashcards
We can visualize the tree using the export_graph function from the tree module.
Group of answer choices
True
False
False
In the decision tree, the region can be found by traversing the tree from the root and going left or right.
Group of answer choices
True
False
True
Decision tree is a model that learns a hierarchy of if/else questions, leading to a decision.
Group of answer choices
True
False
True
The .dot file format is a _____ file format for storing graphs.
TEXT
In the decision tree, the ______ represents the whole dataset.
Group of answer choices
Terminal Nodes
Edges
Root
Conditions
Root
The .dot file format is an image file format for storing graphs.
Group of answer choices
True
False
False
Decision trees in scikit-learn are implemented in ________ and DecisionTreeClassifier classes.
Group of answer choices
DecisionRegressorTree
TreeDecisionRegressor
RegressorDecisionTree
DecisionTreeRegressor
DecisionTreeRegressor
Which is not true about Random Forest?
Group of answer choices
Not in the options
Less memory usage.
Less burden of parameter tuning.
As many trees are created, detailed analysis is difficult.
Poor performance for large and sparse data.
Less memory usage.
To build a random forest model, you need to decide on the __________ to build.
Group of answer choices
Depth of the tree
Height of the tree
Number of trees
Root
Node of the tree
Number of trees
The _______ are methods that combine multiple machine learning models to create more powerful models.
ENSEMBLES
In the decision tree, the terminal nodes represent the whole dataset.
Group of answer choices
True
False
False
In the decision tree, the if/else questions are called qualifiers.
Group of answer choices
True
False
False
Which is not true about Random Forest?
Group of answer choices
Reduces underfitting by averaging trees that predict well.
Reduces overfitting by averaging trees that predict well.
Selects candidate features at random when splitting nodes.
Randomly selects some of the data when creating a tree.
Reduces underfitting by averaging trees that predict well.
What are the parameters for Gradient Boosting?
a. n_estimators, learning_rate
b. n_estimators, max_features
c. n_estimators, learning_rate, max_depth
d. n_estimators, max_features, max_depth
c
Gradient boosting is used when you need more performance than random forests.
Group of answer choices
True
False
True
In the decision tree, the if/else questions are called ______.
Group of answer choices
Qualifiers
Condition
Tests
Nodes
Tests
Decision trees in scikit-learn are implemented in DecisionTreeRegressor and _______ classes.
Group of answer choices
DecisionClassifier
TreeDecisionClassifier
DecisionTreeClassifier
DecisionClassifierTree
DecisionTreeClassifier
We can visualize the tree using the ______ function from the tree module.
export_graphviz
Two most common linear classification algorithms:
Logistic Regression
Linear Support Vector Machines
Logistic Regression is implemented in ______
linear_model.LogisticRegression
Linear Support Vector Machines (Linear SVMs) are implemented in ______
svm.LinearSVC
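As a minimal sketch (not part of the original cards), both classifiers are fit the same way; the make_blobs toy data and default parameters here are illustrative assumptions:

from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

# Toy two-class dataset (an assumption, just for illustration)
X, y = make_blobs(centers=2, random_state=42)

logreg = LogisticRegression().fit(X, y)  # linear_model.LogisticRegression
svm = LinearSVC().fit(X, y)              # svm.LinearSVC
print(logreg.score(X, y), svm.score(X, y))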
SVC stands for?
support vector classifier
______ is a classification algorithm and not a regression algorithm, and it should not be confused with LinearRegression
LogisticRegression
the trade-off parameter that determines the strength of the regularization is called _____
C
Higher values of C correspond to _____
LESS REGULARIZATION
When you use a high value of the parameter C, LogisticRegression and LinearSVC will _______
try to fit the training set as best as possible
With low values of the parameter C, the models put more emphasis on _______
finding a coefficient vector (w) that is close to zero
Using low values of C will cause the algorithms to try to adjust to the _____ of data points
“majority”
using a higher value of C stresses the importance that each ______ be classified correctly
individual data point
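A small illustrative sketch of the C trade-off (the dataset and C values are assumptions, not from the cards): larger C means less regularization and a closer fit to the training set; smaller C pushes the coefficient vector w toward zero.

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
for C in (0.01, 1, 100):
    model = LogisticRegression(C=C, max_iter=5000).fit(X, y)
    # Higher C -> less regularization -> usually higher training accuracy
    print(C, model.score(X, y))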
_______ are a family of classifiers that are quite similar to the linear models
Naive Bayes classifiers
In Naive Bayes, _____ is faster than for a linear classifier
Training
In Naive Bayes, _____ performance is slightly lower
Generalization
The reason that Naive Bayes models are so efficient is that they______ and collect simple per-class statistics from each feature
learn parameters by looking at each feature individually
The reason that Naive Bayes models are so efficient is that they learn parameters by looking at each feature individually and _______
collect simple per-class statistics from each feature
3 Kinds of Naive Bayes Classifier in Scikit-learn:
GaussianNB
BernoulliNB
MultinomialNB
GaussianNB -> ____ data
Continuous
BernoulliNB -> ____ data, ___ data
Binary data, Text data
MultinomialNB -> ____ data, ___ data
Integer count data, text data
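A hedged sketch matching each variant to its data type (the toy arrays are assumptions):

import numpy as np
from sklearn.naive_bayes import GaussianNB, BernoulliNB, MultinomialNB

y = np.array([0, 1, 0, 1])
X_cont = np.random.RandomState(0).randn(4, 3)                   # continuous -> GaussianNB
X_bin = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]])  # binary -> BernoulliNB
X_cnt = np.array([[2, 0, 5], [0, 3, 1], [4, 1, 0], [1, 2, 2]])  # counts -> MultinomialNB

GaussianNB().fit(X_cont, y)
BernoulliNB().fit(X_bin, y)
MultinomialNB().fit(X_cnt, y)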
In Naive Bayes, model complexity is controlled with the _____ parameter
alpha
In Naive Bayes, the alpha parameter _____ by adding alpha many virtual positive data points
smooths the statistics
In Naive Bayes, ____ decreases the complexity of the model but does not change the performance
Large alpha
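A minimal sketch of the alpha parameter (the values and toy count data are assumptions); larger alpha smooths the per-class statistics more, giving a less complex model:

import numpy as np
from sklearn.naive_bayes import MultinomialNB

X = np.array([[2, 0, 5], [0, 3, 1], [4, 1, 0], [1, 2, 2]])  # toy count data
y = np.array([0, 1, 0, 1])
for alpha in (0.1, 1.0, 10.0):
    nb = MultinomialNB(alpha=alpha).fit(X, y)
    # The learned per-class statistics get smoother as alpha grows
    print(alpha, nb.feature_log_prob_.round(2))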
_____ is mostly used on very high-dimensional data
GaussianNB
_____ and ______ are widely used for sparse count data such as text
BernoulliNB and MultinomialNB
In Naive Bayes, _____ are fast, and the models are easy to understand
Training and prediction
Naive Bayes works well with _____ and is not _____
sparse high-dimensional datasets, parameter sensitive
______ are widely used models for classification and regression tasks
Decision trees
In Decision Trees, they learn a hierarchy of ____, leading to a decision
if/else questions
Learning a _____ means learning the sequence of if/else questions that gets us to the true answer most quickly
decision tree
In the machine learning setting, if/else questions are called ___
tests
To build a tree, the algorithm searches over all possible tests and finds the one that is ____ about the target variable
most informative
The top node is called the ___, representing the whole dataset.
root
Parts of a decision tree:
Root Node
Node
Edge (Connects tests to other nodes)
Terminal Node (Nodes with no further edges)
Characteristics (inside nodes)
A prediction on a new data is made by checking which region of the ____ the point lies in, and then predicting the majority target (or the single target in the case of pure leaves) in that region
partition of the feature space
The ____ can be found by traversing the tree from the root and going left or right, depending on whether the test is fulfilled or not
region
Decision trees in scikit-learn are implemented in ____ and ____ classes
DecisionTreeRegressor, DecisionTreeClassifier
We can visualize the tree using the ___ function from the tree module
export_graphviz
export_graphviz writes a file in the ____, which is a text file format for storing graphs
.dot file format
export_graphviz writes a file in the .dot file format, which is a ____ for storing graphs
text file format
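A sketch of the visualization workflow (the iris dataset, max_depth, and file name are assumptions):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# export_graphviz writes a .dot file: a text format for storing graphs
export_graphviz(tree, out_file="tree.dot", filled=True)
print(open("tree.dot").read()[:200])  # plain text, not an image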
We can visualize the _____ in a way that is similar to the way we visualize the coefficients in the linear model
feature importances
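A sketch of inspecting feature importances (the dataset is an assumption); like linear-model coefficients, feature_importances_ has one entry per feature:

from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

data = load_breast_cancer()
tree = DecisionTreeClassifier(random_state=0).fit(data.data, data.target)
for name, importance in zip(data.feature_names, tree.feature_importances_):
    if importance > 0:  # features the tree never used score exactly 0
        print(f"{name}: {importance:.3f}")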
_____, i.e. predicting outside the range of the training data, is impossible for a decision tree
Extrapolation
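A toy demonstration of this limit (the 1-D data is an assumption): beyond the training range the tree just repeats the value of its outermost leaf.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.arange(10).reshape(-1, 1)   # training inputs 0..9
y = 2.0 * X.ravel()                # a simple linear target
tree = DecisionTreeRegressor().fit(X, y)
# Inside the range the fit is exact; outside it the prediction stays flat
print(tree.predict([[9], [50], [100]]))  # -> [18. 18. 18.]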
____ is not affected by scale
Decision Tree Regression
_____ are methods that combine multiple machine learning models to create more powerful models
Ensembles
Two ensemble models that have proven to be effective on a wide range of datasets, for classification and regression, both of which use decision trees as their building blocks:
Random Forests
Gradient Boosted Decision Trees
It is one of the ensemble methods that can avoid overfitting by combining multiple decision trees
Random Forests
Random Forests Reduces overfitting by ______
averaging trees that predict well
In Random Forests, Regression is:
average of predicted values
In Random Forests, Classification is:
average of predicted probabilities
It injects randomness when creating trees
Random Forests
In Random Forests, it randomly selects _____ when creating a tree
some of the data
In Random Forests, it selects ______ when splitting nodes
candidate features at random
To build a random forest model, you need to decide on the _____ to build
number of trees
To build a random forest model, you need to decide on the number of trees to build (the ____ parameter of RandomForestRegressor or RandomForestClassifier)
n_estimators
To build a tree, we first take what is called a _____ of our data. That is, from our n_samples data points, we repeatedly draw an example randomly with replacement
bootstrap sample
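A NumPy-only sketch of a bootstrap sample (not scikit-learn internals): draw n_samples points with replacement, so some points repeat and some are left out.

import numpy as np

rng = np.random.RandomState(0)
data = np.array(["a", "b", "c", "d"])
bootstrap = rng.choice(data, size=len(data), replace=True)
print(bootstrap)  # some entries duplicated, others missing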
A critical parameter in this process is _____. If we set _____ to n_features, that means that each split can look at all features in the dataset, and no randomness will be injected in the feature selection
max_features
The Advantages of Random Forests are:
Most widely used algorithm in regression and classification
Excellent performance, less burden of parameter tuning, no data scaling required
Can be applied to large datasets
In Random Forests, it is the _______ algorithm in regression and classification
Most widely used
In Random Forests, it has _____ performance, less burden of _____, and no _____ required
excellent, parameter tuning, data scaling
Random Forests can be applied to _____ datasets
Large
The Disadvantages of Random Forests are:
As many trees are created, detailed analysis is difficult, and the trees tend to get deeper
Poor performance for large and sparse data
More memory usage and slower training and prediction than linear models
In Random Forests, as many trees are created, _____ is difficult and the trees tend to get deeper
detailed analysis
In Random Forests, it has poor performance for ___ and ____ data
large and sparse data
In Random Forests, it has more ____ and slower ___ and ____ than linear models
memory usage
training and prediction
The parameters used in Random Forests are:
n_estimators, max_features
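A minimal sketch combining the two parameters named above (the dataset and values are assumptions):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100,     # number of trees
                                max_features="sqrt",  # candidate features per split
                                random_state=0).fit(X, y)
print(forest.score(X, y))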
Another ensemble algorithm based on DecisionTreeRegressor
Gradient Boosted Regression Trees (gradient boosting machines)
Gradient Boosted Regression Trees can be used for both ____ and ___
classification and regression
In Gradient Boosted Regression Trees, unlike random forest, _____ is strongly applied instead of randomness
pre-pruning
Used a lot in machine learning contests (Kaggle)
Gradient Boosted Regression Trees (gradient boosting machines)
In Gradient Boosted Regression Trees, it is slightly more _____ than random forest, but with slightly _____
parameter sensitive
higher performance
Create the next tree to compensate for the error of the previous tree using a ______
shallow tree of depth 5 or less
In Gradient Boosted Regression Trees, the regression is:
least squares error loss function
In Gradient Boosted Regression Trees, the classification is:
logistic loss function
Gradient Boosted Regression Trees use the _____ method
gradient descent
Gradient Boosted Regression Trees use the gradient descent method (the learning_rate parameter is important; default = ___)
0.1
Gradient Boosting Advantages:
Use when you need more performance than random forests (xgboost for larger scales)
No need for feature scale adjustment and can be used for binary and continuous features
In Gradient Boosting, use it when you need more _____ than random forests (____ for larger scales)
performance, xgboost
In Gradient Boosting, no need for _________ and can be used for binary and continuous features
feature scale adjustment
Gradient Boosting Disadvantages:
Doesn’t work well for sparse high-dimensional data
Sensitive to parameters, takes longer training time
In Gradient Boosting, it doesn’t work well for sparse ___
high-dimensional data
In Gradient Boosting, it is sensitive to _____, takes longer _____
Parameters
Training time
Gradient Boosting Parameters are:
n_estimators
learning_rate
max_depth (<=5)
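A closing sketch combining the three parameters above (the dataset and values are assumptions; learning_rate defaults to 0.1 as noted earlier):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_breast_cancer(return_X_y=True)
gbrt = GradientBoostingClassifier(n_estimators=100,
                                  learning_rate=0.1,  # the default
                                  max_depth=3,        # shallow trees, <= 5
                                  random_state=0).fit(X, y)
print(gbrt.score(X, y))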