Module 5 Flashcards

Question 1

Q

Decision trees can be used for classification. True or False?

Question 2

Q

Why do we prefer information gain over accuracy when splitting?

Answers:
a.
Decision trees are prone to overfitting, and accuracy doesn’t help them generalize

b.
Information gain is more stable than accuracy

c.
Information gain chooses more impactful features closer to the root

d.
All of these

Answer

A

d.
All of these
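
One commonly cited reason for this preference can be shown with a small worked example. The sketch below compares two candidate splits that have the same misclassification rate but different information gain. The class counts are made up for illustration, and the helper names (entropy, information_gain, error_rate) are hypothetical.

```python
from math import log2

def entropy(counts):
    """Shannon entropy (in bits) of a node given its per-class counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = sum(parent)
    weighted = sum(sum(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

def error_rate(children):
    """Fraction misclassified if each child predicts its majority class."""
    n = sum(sum(child) for child in children)
    return sum(sum(child) - max(child) for child in children) / n

# Illustrative counts: 400 samples of each class at the parent node.
parent = [400, 400]
split_a = [[300, 100], [100, 300]]  # two mixed children
split_b = [[200, 400], [200, 0]]    # one mixed child, one pure child

print(error_rate(split_a), error_rate(split_b))
print(information_gain(parent, split_a), information_gain(parent, split_b))
```

Both splits misclassify a quarter of the samples, so accuracy cannot tell them apart, while information gain is higher for the split that produces a pure child.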

Question 3

Q

Which of the following are true about an individual tree in a random forest classifier?
1. It is built on a subset of the features
2. It is built on all the features
3. It is built on a subset of the observations
4. It is built on the full set of observations

Answers:
a.
2 and 4

b.
1 and 4

c.
1 and 3

d.
2 and 3

Answer

A

c.
1 and 3
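
The answer above (1 and 3) can be sketched as a function that picks the rows and features for one tree. This is a simplified illustration, not any library's implementation: the function name draw_tree_inputs is hypothetical, the square-root feature count is just a common heuristic, and many implementations (scikit-learn, for example) draw the feature subset at each split rather than once per tree.

```python
import random

def draw_tree_inputs(n_rows, n_features, rng):
    """Pick the observations and features one tree would be trained on."""
    # Bootstrap sample: draw row indices with replacement, so some rows
    # may repeat and others may be left out entirely.
    row_idx = [rng.randrange(n_rows) for _ in range(n_rows)]
    # Random feature subset: sqrt(n_features) is an assumed heuristic.
    k = max(1, int(n_features ** 0.5))
    feat_idx = sorted(rng.sample(range(n_features), k))
    return row_idx, feat_idx

rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows, feats = draw_tree_inputs(n_rows=10, n_features=9, rng=rng)
print(rows)
print(feats)
```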

Question 4

Q

Which of the following are disadvantages of decision tree classifiers?

Answers:
a.
They are not easy to interpret

b.
They are not a very stable algorithm

c.
They easily overfit the data by memorizing it perfectly

d.
Both b and c

Answer

A

d.
Both b and c

Question 5

Q

Imagine a two-variable predictor space containing 10 data points. A decision tree with 5 leaf nodes is built over it. How many distinct regions will be formed in the predictor space?

Answers:
a.
5

b.
2

c.
25

d.
10
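
Each leaf of a decision tree corresponds to exactly one region of the predictor space, so counting leaves counts regions. A minimal sketch, assuming a hypothetical tree over two predictors with made-up thresholds and region labels R1 to R5:

```python
# A hypothetical tree over two predictors (index 0 and index 1).
# Internal nodes are dicts; leaves are region labels (strings).
tree = {
    "feature": 0, "threshold": 5.0,
    "left": {
        "feature": 1, "threshold": 2.0,
        "left": "R1",
        "right": {"feature": 0, "threshold": 2.5, "left": "R2", "right": "R3"},
    },
    "right": {"feature": 1, "threshold": 7.0, "left": "R4", "right": "R5"},
}

def leaves(node):
    """Collect the labels of all leaf nodes below (and including) node."""
    if isinstance(node, str):
        return [node]
    return leaves(node["left"]) + leaves(node["right"])

def region(node, point):
    """Follow the splits from the root until a leaf (region) is reached."""
    while not isinstance(node, str):
        branch = "left" if point[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node

print(len(leaves(tree)))
print(region(tree, (1.0, 1.0)), region(tree, (6.0, 9.0)))
```

Every point lands in exactly one leaf, regardless of how many data points were used to build the tree.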

Question 6

Q

Random forests can be used to classify infinite-dimensional data. True or False?

Question 7

Q

Which of the following is correct with respect to random forest?

Answers:
a.
Random forests are difficult to interpret but often very accurate

b.
Random forests are easy to interpret but often very accurate

c.
Random forests are difficult to interpret and much less accurate

d.
None of the above

Answer

A

a.
Random forests are difficult to interpret but often very accurate

Question 8

Q

Which of the following is selected by the classifier itself when using a random forest classifier? (Assume the user keeps the default parameters.)

Answers:
a.
Number of decision trees

b.
Features to be taken into account when building a tree

c.
Samples used to train each individual tree in the forest

d.
b and c

e.
a, b, and c

Answer

A

d.
b and c

Question 9

Q

You are given two independent variables X and Y with P(x) = 0.3 and P(y and x) = 0.2. What is the probability P(y|x)?

Answers:
a.
0.5

b.
2/3

c.
1/6

d.
1/3
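
The definition of conditional probability, P(y|x) = P(y and x) / P(x), can be applied directly to the numbers in the question. A minimal sketch using exact fractions (the variable names are illustrative):

```python
from fractions import Fraction

# Values taken from the question.
p_x = Fraction(3, 10)        # P(x) = 0.3
p_x_and_y = Fraction(2, 10)  # P(y and x) = 0.2

# Conditional probability: P(y|x) = P(y and x) / P(x)
p_y_given_x = p_x_and_y / p_x
print(p_y_given_x)  # 2/3
```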

Question 10

Q

Which of the following statements about Naive Bayes is incorrect?

Answers:
a.
Features are statistically dependent on one another given the class value.

b.
Features can be nominal or numeric.

c.
Features are statistically independent of one another given the class value.

d.
Features are equally important.

Answer

A

a.
Features are statistically dependent on one another given the class value.
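
The conditional independence assumption is what lets Naive Bayes multiply per-feature likelihoods together. A minimal sketch, assuming made-up priors and likelihoods for two hypothetical classes ("spam", "ham") and two hypothetical binary features:

```python
from math import prod

# All numbers below are invented for illustration.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {  # P(feature present | class)
    "spam": {"has_link": 0.8, "all_caps": 0.5},
    "ham": {"has_link": 0.2, "all_caps": 0.1},
}

def posterior(observed):
    """Posterior over classes given the list of features observed as present."""
    # Independence given the class: the joint likelihood is a plain product.
    scores = {
        c: priors[c] * prod(likelihoods[c][f] for f in observed)
        for c in priors
    }
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

post = posterior(["has_link", "all_caps"])
print(post)
```

If the features were dependent given the class, the product inside posterior would no longer equal the joint likelihood, which is why statement a. contradicts the model's assumption.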