Final Exam Flashcards

1
Q

True or False, Topic Modeling is an unsupervised learning technique

A

True

2
Q

True or False, Singular Value Decomposition (SVD) aims to address skewed frequency of terms

A

False

3
Q

True or False, If a model performs indistinguishably from a random classifier, its AUC will be closer to zero

A

False (a random classifier's AUC is approximately 0.5, not 0)
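One way to see this: AUC equals the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one, so random scores converge to roughly 0.5, not 0. A minimal pure-Python sketch (scores here are invented for illustration):

```python
import random

def auc(pos_scores, neg_scores):
    # AUC = P(random positive is scored above a random negative),
    # counting ties as one half.
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

random.seed(42)
# Scores drawn at random, ignoring the labels entirely.
pos = [random.random() for _ in range(1000)]
neg = [random.random() for _ in range(1000)]

print(auc(pos, neg))  # close to 0.5, not 0
```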

4
Q

True or False, In SAS Enterprise Miner, using mutual information as the term weight (when it is available) guarantees better predictive performance than the no-weight option

A

False

5
Q

What is the incorrect answer about weightings in text filtering?

a) Term weights are consistent across documents
b) Inverse document frequency depends on the distribution of terms across documents
c) Log transformation for local weights reduces the impact of term frequency more than binary and linear options
d) Mutual information requires a categorical target variable

A

Inverse document frequency depends on the distribution of terms across documents

6
Q

Zipf’s law can be interpreted as follows: “The product of the frequency of words (f) and their rank is approximately constant.” Let a be the product of the frequency and rank. What is the incorrect answer?

a) ln(f) = ln(a) - ln(r)
b) The frequency of the terms exponentially decreases with rank
c) Hypothetically, the second most frequent word appears twice as frequently as the fourth most frequent word.
d) Topmost frequent words are likely to be good discriminators

A

Topmost frequent words are likely to be good discriminators
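The identity in option (a) follows directly from the law: if f * r = a, then ln(f) = ln(a) - ln(r). A small numeric check with a made-up constant:

```python
import math

a = 1000.0  # hypothetical constant: frequency times rank
for r in range(1, 6):
    f = a / r  # Zipf: frequency is inversely proportional to rank
    assert abs(math.log(f) - (math.log(a) - math.log(r))) < 1e-9

# The rank-2 word (f = 500) appears twice as often as the rank-4 word (f = 250).
assert a / 2 == 2 * (a / 4)
print("Zipf identities hold")
```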

7
Q

Quiz 1, Question 9, see slide

A

Correct!

8
Q

Write a text filter in SAS Enterprise Miner to return all documents having the term “White House” and not including “Canada.”

A

“White House” -Canada
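The same Boolean logic, sketched in plain Python rather than SAS filter syntax (the documents below are invented for illustration):

```python
docs = [
    "The White House issued a statement today.",
    "The White House discussed trade with Canada.",
    "Canada announced a new policy.",
]

def matches(doc):
    text = doc.lower()
    # Must contain the exact phrase "white house" and must not mention "canada".
    return "white house" in text and "canada" not in text

hits = [d for d in docs if matches(d)]
print(hits)  # only the first document survives the filter
```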

9
Q

Provide a possible situation where you might prefer interpretability over predictive power.

A

In situations where you are presenting to executives or an operational or business audience.

10
Q

Quiz 1, Question 12, see slide on Lecture 5

A

Correct!

11
Q

True or False - When you’re interested in a small set of terms in text mining, specifying a stop list will be more effective than specifying a start list

A

False

12
Q

True or False, The skip-gram model aims to predict context words using a target word

A

True
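Skip-gram training data pairs each target word with the context words around it, and the network learns to predict the context from the target. A sketch of pair generation (sentence and window size are arbitrary choices):

```python
def skipgram_pairs(tokens, window=1):
    # For each target word, emit (target, context) pairs within the window;
    # skip-gram then trains a model to predict context given target.
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

pairs = skipgram_pairs("the cat sat on the mat".split())
print(pairs)
```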

13
Q

True or False, In a long short-term memory (LSTM) model, you determine how much information from previous hidden states and the current state information should be retained through a forget gate

A

True
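A toy scalar illustration of the retention idea (the weights and inputs are invented; a real LSTM also has input and output gates and vector-valued states):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def retained_cell_state(c_prev, h_prev, x, w_h=0.5, w_x=0.5, b=0.0):
    # The forget gate looks at the previous hidden state and the current
    # input, then outputs a value in (0, 1) that scales how much of the
    # previous cell state is kept.
    f = sigmoid(w_h * h_prev + w_x * x + b)
    return f * c_prev

print(retained_cell_state(c_prev=1.0, h_prev=0.0, x=0.0))  # 0.5: half retained
```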

14
Q

True or False, The Bidirectional Encoder Representations from Transformers (BERT) model has both an encoder and a decoder

A

False

15
Q

True or False, In training machine learning algorithms, you can overcome high bias by collecting a large number of data points

A

False

16
Q

True or False, Explainable machine learning indicates that your model can be understood by a human without further technical support

A

False

17
Q

This is a set of co-occurrence probabilities for target words “ice” and “steam” in the GloVe model. According to the results, the term “gas” is more effective in distinguishing between “ice” and “steam” than the term “water”

A

True
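In the GloVe setup, a probe word k discriminates between two target words when the ratio P(k|ice) / P(k|steam) is far from 1. A sketch with illustrative probabilities (the numbers below are hypothetical, chosen in the spirit of the original ice/steam example):

```python
# Hypothetical co-occurrence probabilities (not measured values).
p_given_ice = {"water": 3.0e-3, "gas": 6.6e-5}
p_given_steam = {"water": 2.2e-3, "gas": 7.8e-4}

def ratio(k):
    return p_given_ice[k] / p_given_steam[k]

print(ratio("water"))  # near 1: relates to both targets, discriminates poorly
print(ratio("gas"))    # far from 1: relates mainly to "steam", discriminates well
```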
18
Q

Choose the incorrect answer about embedding models:

A. Bag of words can be considered a special case of the n-gram model
B. A bigram model considers one previous word to predict a word’s probability
C. GloVe learns embedding from global context via a word-word co-occurrence matrix
D. TF-IDF can handle unseen words by leveraging the context in which they appear in the document corpus

A

D.
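Option D fails because TF-IDF is purely vocabulary-based: a term that never occurs in the training corpus has no document frequency, hence no weight and no representation. A minimal sketch (the toy corpus is invented):

```python
import math

corpus = [["deep", "learning"], ["machine", "learning"], ["machine", "translation"]]

def idf(term):
    # Inverse document frequency: undefined for terms never seen in the corpus.
    df = sum(term in doc for doc in corpus)
    if df == 0:
        return None  # no representation for out-of-vocabulary terms
    return math.log(len(corpus) / df)

print(idf("machine"))      # seen in 2 of 3 documents
print(idf("transformer"))  # None: TF-IDF cannot handle this unseen word
```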

19
Q

Choose the incorrect answer about the structure of deep learning models.

A. Bias plays a similar role to that of the y-intercept in the linear equation
B. Weights are used to introduce non-linearity in the model
C. Inputs can be viewed as features or attributes in a data set
D. The work of the summation function is to bind the weights and inputs together and calculate their sum

A

B.


21
Q

Choose the incorrect answer about Recurrent Neural Networks (RNNs) and their extensions.

A. An RNN retains a memory, which distinguishes it from a basic neural network
B. RNNs are more suitable for handling spatial data rather than sequential data
C. The long short-term memory model is intended to overcome the vanishing gradient problem of RNNs
D. Unlike RNNs, the transformer can effectively deal with dependencies between terms with long distances in the input sequence due to the attention mechanism.

A

B.

22
Q

Choose the incorrect answer about the Transformer model and its extensions.

A. An encoder processes the input while a decoder generates the output
B. Parallelization enables the transformer to train faster than recurrent neural networks.
C. Unlike the original transformer, BERT introduced positional encodings to maintain word order information.
D. Pre-training and fine-tuning enable data scientists with limited computing resources to build a high-performing model

A

C.

23
Q

Choose the incorrect answer about fairness issues in machine learning

A. Machine learning algorithms can generate discriminatory outcomes even without the developer’s intention.
B. The high accuracy of a model guarantees fair algorithms to users.
C. Algorithms can learn bias embedded in their training dataset
D. Even if a predictive model for advertising didn’t learn human bias, its advertising outcomes can be biased due to market mechanisms

A

B.

24
Q

Choose the incorrect answer about interpretable and explainable machine learning

A. Higher performing models tend to be less interpretable
B. The goal of the permutation importance technique is to understand how the model works.
C. Linear regression is considered one of the interpretable machine learning models
D. The objective of shuffling values in permutation importance is to remove the contribution of the independent variable

A

B.
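Statements B and D fit together in a short sketch: shuffling one column removes that variable's contribution, and the resulting drop in accuracy measures how much the model relied on it (the model and data here are toy inventions):

```python
import random

def model(x):
    # Toy "trained" model: predicts 1 when the first feature exceeds 0.5.
    return 1 if x[0] > 0.5 else 0

random.seed(0)
X = [[random.random(), random.random()] for _ in range(500)]
y = [model(x) for x in X]  # labels depend only on the first feature

def accuracy(rows):
    return sum(model(x) == t for x, t in zip(rows, y)) / len(y)

def permutation_importance(col):
    # Shuffle one column to break its link to the target,
    # then measure how much accuracy drops.
    shuffled = [row[col] for row in X]
    random.shuffle(shuffled)
    X_perm = [row[:col] + [s] + row[col + 1:] for row, s in zip(X, shuffled)]
    return accuracy(X) - accuracy(X_perm)

print(permutation_importance(0))  # large drop: the model depends on this feature
print(permutation_importance(1))  # 0.0: the noise feature never mattered
```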
