Midterm AI Fundamentals Reviewer 2 Flashcards
The KL distance is often used in machine learning to evaluate the performance of a classification model. In this context, a low KL distance indicates that the model’s predicted class probabilities are:
Very close to the true class probabilities
The KL distance is always non-negative and is equal to zero only when the two probability distributions are:
Identical
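A quick numerical check can make these properties concrete. The sketch below is illustrative only: the helper name `kl_divergence` and the two example distributions are assumptions, not part of the reviewer. It computes the KL distance for discrete distributions and shows that it is zero for identical distributions, positive otherwise, and changes when the arguments are swapped.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions given as lists.

    Assumes both lists sum to 1 and q[i] > 0 wherever p[i] > 0.
    Terms with p[i] == 0 contribute nothing, by convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical example distributions over three outcomes.
p = [0.5, 0.4, 0.1]
q = [0.3, 0.3, 0.4]

kl_pq = kl_divergence(p, q)  # information lost when Q approximates P
kl_qp = kl_divergence(q, p)  # arguments swapped: generally a different value
kl_pp = kl_divergence(p, p)  # identical distributions

print("KL(P||Q) =", kl_pq)
print("KL(Q||P) =", kl_qp)
print("KL(P||P) =", kl_pp)
```

Because swapping the arguments changes the result, the KL distance is not a true distance metric, which is why it is more commonly called the KL divergence.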
How is the y-intercept of the line of best fit calculated using the least squares method?
By subtracting the product of the slope and the mean of the x values from the mean of the y values
In information theory, the KL distance can be used to measure the information lost when approximating one distribution with another. Which of the following is NOT a property of the KL distance in this context?
It is symmetric
The ______________ linkage criterion is a popular choice for hierarchical clustering, which merges the two clusters based on the distance between their centroids.
Centroid
The ______________ linkage criterion is a popular choice for hierarchical clustering, which merges the two clusters based on the maximum distance between their points.
Complete
The ______________ linkage criterion is a popular choice for hierarchical clustering, which merges the two clusters based on the mean distance between their points.
Average
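The linkage criteria differ only in how they turn point-to-point distances into a single cluster-to-cluster distance. The following is a minimal sketch, assuming Euclidean distance and clusters stored as lists of coordinate tuples; the function names and the two example clusters are hypothetical.

```python
import math

def single_linkage(a, b):
    """Smallest distance between any point in a and any point in b."""
    return min(math.dist(x, y) for x in a for y in b)

def complete_linkage(a, b):
    """Largest distance between any point in a and any point in b."""
    return max(math.dist(x, y) for x in a for y in b)

def average_linkage(a, b):
    """Mean of all pairwise distances between points in a and points in b."""
    return sum(math.dist(x, y) for x in a for y in b) / (len(a) * len(b))

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def centroid_linkage(a, b):
    """Distance between the centroids of the two clusters."""
    return math.dist(centroid(a), centroid(b))

# Hypothetical clusters on a line, chosen so the distances are easy to check by hand.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(3.0, 0.0), (5.0, 0.0)]

print("single:  ", single_linkage(cluster_a, cluster_b))
print("complete:", complete_linkage(cluster_a, cluster_b))
print("average: ", average_linkage(cluster_a, cluster_b))
print("centroid:", centroid_linkage(cluster_a, cluster_b))
```

Whichever criterion is used, agglomerative clustering merges the pair of clusters for which that linkage value is smallest.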
The KL distance is often used in natural language processing to compare the distribution of words in a document with the distribution of words in a reference corpus. In this context, a low KL distance indicates that the document is:
Similar to the reference corpus
How is the slope of the line of best fit calculated using the least squares method?
By dividing the sum of the products of the x and y deviations from their means by the sum of the squared x deviations from the mean
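As a worked sketch of the least squares method for a line with an intercept, the code below computes the slope from deviations about the means and then the y-intercept from the means and the slope. The function name `fit_line` and the sample data are assumptions for illustration. The simpler ratio of the sum of x·y to the sum of x² gives the same slope only when the x and y values are already centered at zero.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print("slope =", slope, "intercept =", intercept)
```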
What are some advantages of batch learning algorithms?
They can learn from a limited amount of resources
The KL distance can be used to measure the information lost when approximating one distribution with another. In this context, the distribution being approximated is known as the:
Base distribution
The KL distance can be used to measure the difference between two probability distributions in terms of the information content of the distributions. In this context, the KL distance is also known as:
The information divergence
The KL distance is also known as what other measure?
Relative entropy
The KL distance is often used in machine learning and artificial intelligence to compare two probability distributions, such as a model’s predicted distribution and the true distribution. In this context, the KL distance can be used as a:
Loss function
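One way to see why the KL distance works as a loss is its relationship to cross-entropy: H(P, Q) = H(P) + KL(P || Q). When the true distribution P is fixed, H(P) is a constant, so minimizing cross-entropy over the model's distribution Q minimizes the KL distance as well. The sketch below checks this numerically; the helper names and the example distributions are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy H(P) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Cross-entropy H(P, Q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """KL(P || Q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical true class distribution and two candidate model predictions.
true_dist = [0.7, 0.2, 0.1]
model_a = [0.6, 0.3, 0.1]   # close to the true distribution
model_b = [0.2, 0.3, 0.5]   # far from the true distribution

for name, model in [("model_a", model_a), ("model_b", model_b)]:
    kl = kl_divergence(true_dist, model)
    ce = cross_entropy(true_dist, model)
    # The gap between cross-entropy and KL is H(true_dist), the same for every model.
    print(name, "KL =", kl, "cross-entropy =", ce, "gap =", ce - kl)
```

The model with the lower KL distance also has the lower cross-entropy, so ranking models by either loss gives the same order.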
In hierarchical clustering, the hierarchy of merged clusters is represented using a tree diagram called a ______________.
Dendrogram