Midterm AI Fundamentals Reviewer 2 Flashcards

Question 1

Q

The KL distance is often used in machine learning to evaluate the performance of a classification model. In this context, a low KL distance indicates that the model’s predicted class probabilities are:

Answer

A

Very similar to the true class probabilities

Question 2

Q

The KL distance is always non-negative and is equal to zero only when the two probability distributions are:

Answer

A

Identical

Question 3

Q

How is the y-intercept of the line of best fit calculated using the least squares method?

Answer

A

By subtracting the slope multiplied by the mean of the x values from the mean of the y values

Question 4

Q

In information theory, the KL distance can be used to measure the information lost when approximating one distribution with another. Which of the following is NOT a property of the KL distance in this context?

Answer

A

It is symmetric (the KL distance is non-symmetric: in general, D(P || Q) is not equal to D(Q || P))

Question 5

Q

The ______________ linkage criterion is a popular choice for hierarchical clustering; it merges the two clusters whose centroids are closest to each other.

Answer

A

Centroid

Question 6

Q

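The ______________ linkage criterion is a popular choice for hierarchical clustering; it measures the distance between two clusters as the maximum distance between their points and merges the pair of clusters for which this distance is smallest.

Answer

A

Complete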

Question 7

Q

The ______________ linkage criterion is a popular choice for hierarchical clustering; it merges the two clusters with the smallest mean distance between their points.

Answer

A

Average

Question 8

Q

The KL distance is often used in natural language processing to compare the distribution of words in a document with the distribution of words in a reference corpus. In this context, a low KL distance indicates that the document is:

Answer

A

Very similar to the reference corpus

Question 9

Q

How is the slope of the line of best fit calculated using the least squares method?

Answer

A

By dividing the sum of the products of the x and y deviations from their means by the sum of the squared x deviations from the mean
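
The slope calculation, together with the y-intercept that follows from it, can be illustrated with a short sketch. This is a minimal example for simple linear regression with one predictor; the function name `least_squares_fit` and the sample data are assumptions made for illustration only.

```python
def least_squares_fit(xs, ys):
    """Return (slope, intercept) of the least squares line y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: sum of products of deviations divided by sum of squared x deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    # Intercept: the fitted line passes through the point (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Illustrative data lying exactly on y = 2x + 1.
slope, intercept = least_squares_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)  # 2.0 1.0
```

The same inputs always give the same line, which is why the method is described as deterministic.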

Question 10

Q

What are some advantages of batch learning algorithms?

Answer

A

They can learn from a limited amount of resources

Question 11

Q

The KL distance can be used to measure the information lost when approximating one distribution with another. In this context, the distribution being approximated is known as the:

Answer

A

Base distribution

Question 12

Q

The KL distance can be used to measure the difference between two probability distributions in terms of the information content of the distributions. In this context, the KL distance is also known as:

Answer

A

The information divergence

Question 13

Q

The KL distance is also known as what other measure?

Answer

A

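Relative entropy (it is closely related to cross-entropy, which equals the entropy of the true distribution plus the KL distance)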
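
Cross-entropy and the KL distance are related but not the same quantity: for discrete distributions, the cross-entropy H(P, Q) equals the entropy H(P) plus the KL distance from P to Q. The sketch below checks this numerically; the function names and the two example distributions are assumptions chosen for illustration.

```python
import math


def entropy(p):
    """Shannon entropy H(P) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Cross-entropy H(P, Q) in nats; assumes Q(x) > 0 wherever P(x) > 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """KL distance D(P || Q) in nats; assumes Q(x) > 0 wherever P(x) > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.7, 0.2, 0.1]  # illustrative "true" distribution
q = [0.5, 0.3, 0.2]  # illustrative approximation

gap = cross_entropy(p, q) - entropy(p)
print(math.isclose(gap, kl_divergence(p, q)))  # True
```

Because H(P) does not depend on the model, minimizing cross-entropy with respect to Q also minimizes the KL distance.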

Question 14

Q

The KL distance is often used in machine learning and artificial intelligence to compare two probability distributions, such as a model’s predicted distribution and the true distribution. In this context, the KL distance can be used as a:

Answer

A

Loss function

Question 15

Q

In hierarchical clustering, the final clusters are represented using a ______________ diagram.

Answer

A

Dendrogram

Question 16

Q

The KL distance between two discrete probability distributions P and Q is defined as:

Answer

A

The sum, over all events, of the probability of the event in P multiplied by the logarithm of the ratio of its probabilities in P and Q
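
The discrete definition, D(P || Q) = sum over x of P(x) * log(P(x) / Q(x)), can be sketched in a few lines. This is a minimal illustration using natural logarithms; the function name `kl_divergence` and the example distributions are assumptions, and the code assumes Q(x) > 0 wherever P(x) > 0.

```python
import math


def kl_divergence(p, q):
    """D(P || Q) in nats for two discrete distributions given as lists."""
    # Events with P(x) == 0 contribute nothing to the sum.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.5]
q = [0.9, 0.1]

print(kl_divergence(p, p))  # 0.0, identical distributions
print(kl_divergence(p, q))  # greater than zero
print(kl_divergence(q, p))  # a different value: the measure is not symmetric
```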

Question 17

Q

What are some disadvantages of batch learning algorithms?

Answer

A

They are slow to adapt to changes in the data

Question 18

Q

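The ______________ linkage criterion is a popular choice for hierarchical clustering; it measures the distance between two clusters as the minimum distance between their points and merges the pair of clusters for which this distance is smallest.

Answer

A

Single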

Question 19

Q

Is the least squares method a deterministic or a probabilistic method?

Answer

A

Deterministic

Question 20

Q

In hierarchical clustering, the distance between clusters is typically measured using the ______________ criterion.

Answer

A

Linkage
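
The linkage criteria asked about in these cards differ only in how they turn point-to-point distances into a cluster-to-cluster distance. The sketch below computes four common ones (single, complete, average, centroid) with Euclidean distance; the function names and the two example clusters are assumptions for illustration, not a full clustering implementation.

```python
import math
from itertools import product


def pairwise(a, b):
    """All Euclidean distances between a point in cluster a and a point in cluster b."""
    return [math.dist(p, q) for p, q in product(a, b)]


def single_linkage(a, b):
    return min(pairwise(a, b))  # closest pair of points


def complete_linkage(a, b):
    return max(pairwise(a, b))  # farthest pair of points


def average_linkage(a, b):
    d = pairwise(a, b)
    return sum(d) / len(d)  # mean over all pairs


def centroid(cluster):
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))


def centroid_linkage(a, b):
    return math.dist(centroid(a), centroid(b))  # distance between cluster means


a = [(0.0, 0.0), (0.0, 2.0)]
b = [(3.0, 0.0), (5.0, 0.0)]

print(single_linkage(a, b))  # 3.0
print(complete_linkage(a, b))
print(average_linkage(a, b))
print(centroid_linkage(a, b))
```

At each step, agglomerative clustering merges the pair of clusters with the smallest distance under the chosen criterion.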