Midterm AI Fundamentals Reviewer 2 Flashcards

1
Q

The KL distance is often used in machine learning to evaluate the performance of a classification model. In this context, a low KL distance indicates that the model’s predicted class probabilities are:

A

Close to the true class probabilities

2
Q

The KL distance is always non-negative and is equal to zero only when the two probability distributions are:

A

Identical

3
Q

How is the y-intercept of the line of best fit calculated using the least squares method?

A

By subtracting the product of the slope and the mean of the x values from the mean of the y values

4
Q

In information theory, the KL distance can be used to measure the information lost when approximating one distribution with another. Which of the following is NOT a property of the KL distance in this context?

A

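It is symmetric (the KL distance is non-symmetric: in general, the distance from P to Q differs from the distance from Q to P)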

5
Q

The ______________ linkage criterion is a popular choice for hierarchical clustering, which merges the two clusters based on the distance between their centroids.

A

Centroid

6
Q

The ______________ linkage criterion is a popular choice for hierarchical clustering, which measures the distance between two clusters as the maximum distance between any pair of their points.

A

Complete

7
Q

The ______________ linkage criterion is a popular choice for hierarchical clustering, which merges the two clusters based on the mean distance between their points.

A

Average

8
Q

The KL distance is often used in natural language processing to compare the distribution of words in a document with the distribution of words in a reference corpus. In this context, a low KL distance indicates that the document is:

A

Close to the reference corpus in its word distribution

9
Q

How is the slope of the line of best fit calculated using the least squares method?

A

By dividing the sum of the products of the x and y deviations from their means by the sum of the squared x deviations from the mean
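
The slope and y-intercept calculations can be sketched as follows. This is a minimal illustration, assuming ordinary least squares with an intercept term; the function name `fit_line` and the sample data are hypothetical. The simpler ratio of the sum of x·y to the sum of x² gives the same slope only when the line is forced through the origin.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of products of the x and y deviations from their means
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    # Sum of squared x deviations from the mean
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (x_mean, y_mean)
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

The same inputs always yield the same slope and intercept, which is why the method counts as deterministic.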

10
Q

What are some advantages of batch learning algorithms?

A

They can learn from a limited amount of resources

11
Q

The KL distance can be used to measure the information lost when approximating one distribution with another. In this context, the distribution being approximated is known as the:

A

Base distribution

12
Q

The KL distance can be used to measure the difference between two probability distributions in terms of the information content of the distributions. In this context, the KL distance is also known as:

A

The information divergence

13
Q

The KL distance is also known as what other measure?

A

Relative entropy
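
The KL distance is closely related to cross-entropy but is not the same quantity: the cross-entropy of P and Q equals the entropy of P plus the KL distance from P to Q. The sketch below checks that identity numerically. The helper names and the two example distributions are hypothetical, and the code assumes Q assigns non-zero probability wherever P does.

```python
import math


def entropy(p):
    """Shannon entropy H(P) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Cross-entropy H(P, Q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """KL distance D(P || Q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical true distribution P and model distribution Q
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

# H(P, Q) - H(P) should match D(P || Q) up to floating-point error
gap = cross_entropy(p, q) - entropy(p)
print(gap, kl_divergence(p, q))
```

Because the entropy of the true distribution does not depend on the model, minimizing cross-entropy and minimizing the KL distance lead to the same model.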

14
Q

The KL distance is often used in machine learning and artificial intelligence to compare two probability distributions, such as a model’s predicted distribution and the true distribution. In this context, the KL distance can be used as a:

A

Loss function

15
Q

In hierarchical clustering, the final clusters are represented using a ______________ diagram.

A

Dendrogram

16
Q

The KL distance between two discrete probability distributions P and Q is defined as:

A

The sum, over all events x, of P(x) multiplied by the logarithm of the ratio P(x) / Q(x)
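
The definition can be sketched in a few lines. This is an illustrative implementation using the natural logarithm (so results are in nats). The function name and example distributions are hypothetical. It also demonstrates two properties from the earlier cards: the KL distance is non-symmetric, and it is zero when the two distributions are identical.

```python
import math


def kl_divergence(p, q):
    """D(P || Q) = sum over x of P(x) * log(P(x) / Q(x)), in nats."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # terms with P(x) = 0 contribute nothing, by convention
        if qi == 0:
            return math.inf  # P puts mass where Q puts none
        total += pi * math.log(pi / qi)
    return total


# Hypothetical distributions over two events
p = [0.5, 0.5]
q = [0.9, 0.1]

forward = kl_divergence(p, q)  # D(P || Q)
reverse = kl_divergence(q, p)  # D(Q || P), generally a different value
same = kl_divergence(p, p)     # identical distributions

print(forward, reverse, same)
```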

17
Q

What are some disadvantages of batch learning algorithms?

A

They are slow to adapt to changes in the data

18
Q

The ______________ linkage criterion is a popular choice for hierarchical clustering, which measures the distance between two clusters as the minimum distance between any pair of their points.

A

Single
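
The four linkage criteria covered in this deck can be compared side by side. The sketch below is illustrative only: the function names and the two small clusters are hypothetical, and Euclidean distance is assumed. Single linkage uses the minimum pairwise distance, complete linkage the maximum, average linkage the mean, and centroid linkage the distance between cluster centroids. In each case, agglomerative clustering merges the pair of clusters with the smallest linkage distance.

```python
import math
from itertools import product


def pairwise_distances(a, b):
    """Euclidean distances between every point in cluster a and every point in b."""
    return [math.dist(p, q) for p, q in product(a, b)]


def single_linkage(a, b):
    return min(pairwise_distances(a, b))  # closest pair of points


def complete_linkage(a, b):
    return max(pairwise_distances(a, b))  # farthest pair of points


def average_linkage(a, b):
    distances = pairwise_distances(a, b)
    return sum(distances) / len(distances)  # mean over all pairs


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def centroid_linkage(a, b):
    return math.dist(centroid(a), centroid(b))


# Two hypothetical clusters on the x-axis
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]

print(single_linkage(cluster_a, cluster_b))
print(complete_linkage(cluster_a, cluster_b))
print(average_linkage(cluster_a, cluster_b))
print(centroid_linkage(cluster_a, cluster_b))
```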

19
Q

Is the least squares method a deterministic or a probabilistic method?

A

Deterministic

20
Q

In hierarchical clustering, the distance between clusters is typically measured using the ______________ criterion.

A

Linkage