Quiz #1 Flashcards
Exam Prep
Errors due to _____ are errors made as a result of choosing a learning algorithm that is not well suited for the data or problem.
A. Bias
B. Variance
C. Sampling
D. Noise
A. Bias
Which of these is a regression problem?
A. Can I determine a person’s income based on their age and type of job?
B. Which states have the highest infant mortality rate?
C. How can I group supermarket products using purchase frequency?
D. Identify similarities in shopping patterns between customers of a department store.
A. Can I determine a person’s income based on their age and type of job?
Clustering is a type of unsupervised learning.
True
False
True
The _______ of a dataset represents the number of features in the dataset.
A. Resolution
B. Density
C. Dimensionality
D. Coarseness
C. Dimensionality
As the complexity of a model increases, bias decreases but variance increases.
True
False
True
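Note: the trade-off in this card can be checked with a small simulation. The sketch below is illustrative only; the target function, noise level, test point, and the two models (a constant mean predictor as the "simple" model and 1-nearest-neighbor as the "complex" one) are all assumptions chosen for the demonstration, not material from the quiz.

```python
import random
import statistics


def true_f(x):
    # Assumed ground-truth function for the demonstration.
    return x * x


def sample_training_set(rng, n=20, noise_sd=0.3):
    # Draw n noisy observations of true_f on [0, 1).
    xs = [rng.random() for _ in range(n)]
    return [(x, true_f(x) + rng.gauss(0, noise_sd)) for x in xs]


def constant_model(train, x0):
    # Low-complexity model: ignores x0 and predicts the mean target.
    return statistics.fmean([y for _, y in train])


def one_nn_model(train, x0):
    # High-complexity model: copies the target of the nearest training point.
    return min(train, key=lambda p: abs(p[0] - x0))[1]


def bias_sq_and_variance(model, x0=0.9, trials=500, seed=0):
    # Refit the model on many fresh training sets and look at how its
    # prediction at x0 is centered (bias) and how much it spreads (variance).
    rng = random.Random(seed)
    preds = [model(sample_training_set(rng), x0) for _ in range(trials)]
    mean_pred = statistics.fmean(preds)
    return (mean_pred - true_f(x0)) ** 2, statistics.pvariance(preds)


simple = bias_sq_and_variance(constant_model)
flexible = bias_sq_and_variance(one_nn_model)
print("constant model (bias^2, variance):", simple)
print("1-NN model     (bias^2, variance):", flexible)
```

Under these assumptions the constant model should show the larger squared bias and the 1-NN model the larger variance.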
Which of these terms describes the degree to which data exists for each feature of all observations?
A. Density
B. Resolution
C. Dimensionality
D. Coarseness
A. Density
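Note: dimensionality and density can both be read directly off a small table. The sketch below uses a made-up dataset and hypothetical helper names; it treats `None` as a missing value and defines density as the share of non-missing cells, which is one reasonable reading of the card.

```python
# Hypothetical toy dataset: each dict is one observation.
rows = [
    {"age": 34, "job": "nurse", "income": 52000},
    {"age": None, "job": "clerk", "income": 31000},
    {"age": 45, "job": None, "income": None},
    {"age": 29, "job": "chef", "income": 40000},
]


def dimensionality(rows):
    # Number of features (columns) per observation.
    return len(rows[0])


def density(rows):
    # Fraction of cells that actually hold a value.
    features = list(rows[0])
    total_cells = len(rows) * len(features)
    present = sum(row[f] is not None for row in rows for f in features)
    return present / total_cells


print(dimensionality(rows))  # 3
print(density(rows))         # 0.75 (9 of 12 cells are filled)
```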
As part of the data transformation process, we sometimes have to discretize our data or create dummy variables. Which of these is a reason why we would need to do this?
A. It helps when trying to fix duplicate data.
B. This is an important step in balancing imbalanced datasets.
C. Some algorithms only work with either continuous or discrete variables.
D. This is an approach to normalize our data set.
C. Some algorithms only work with either continuous or discrete variables.
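Note: both transformations in this card are short to write by hand. The sketch below is a minimal illustration with hypothetical function names, bin edges, and labels; real projects would more likely use a library routine.

```python
import bisect


def one_hot(values):
    # Dummy variables: one 0/1 column per category, for algorithms
    # that need numeric inputs.
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


def discretize(values, edges, labels):
    # Discretization: map each continuous value to a bin label.
    # `edges` are ascending cut points; a value equal to an edge falls
    # in the bin above it. Requires len(labels) == len(edges) + 1.
    return [labels[bisect.bisect_right(edges, v)] for v in values]


colors = one_hot(["red", "blue", "red"])
ages = discretize([15, 34, 70], edges=[18, 65], labels=["minor", "adult", "senior"])
print(colors[0])  # {'is_blue': 0, 'is_red': 1}
print(ages)       # ['minor', 'adult', 'senior']
```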
Which of these types of visualizations is best to use to explore the correlation between two continuous features?
A. Scatter plot
B. Sankey diagram
C. Histogram
D. Pie chart
A. Scatter plot
In class we discussed 6 stages in the “Analytic Process”. Which of these is not one of those stages?
A. Data Exploration
B. Validation and Interpretation
C. Data Summarization
D. Modeling
C. Data Summarization
A dataset with two class values that is significantly skewed (more than 90%) towards one of those class values is known as _______ dataset.
A. an inverted
B. a bimodal
C. an imbalanced
D. a skewed
C. an imbalanced
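Note: a quick way to check for imbalance is to measure the majority class share. The sketch below uses hypothetical helper names and made-up labels; the 0.9 threshold mirrors the "more than 90%" wording of the card and is not a universal cutoff.

```python
from collections import Counter


def majority_share(labels):
    # Fraction of observations that belong to the most common class.
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def is_imbalanced(labels, threshold=0.9):
    # Flags the dataset when one class exceeds the threshold share.
    return majority_share(labels) > threshold


skewed = ["ok"] * 95 + ["fraud"] * 5
mixed = ["a"] * 60 + ["b"] * 40
print(majority_share(skewed), is_imbalanced(skewed))  # 0.95 True
print(majority_share(mixed), is_imbalanced(mixed))    # 0.6 False
```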
The method of imputation that fills in missing values using similar instances from the same dataset is known as _________ imputation.
A. Same-deck
B. Cold-deck
C. Hot-deck
D. Warm-deck
C. Hot-deck
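Note: the sketch below shows one simple variant of hot-deck imputation, in which the donor is the complete record closest on a single matching feature. The function name, the nearest-donor rule, and the toy records are assumptions for illustration; other variants pick a random donor from a group of similar records.

```python
def hot_deck_impute(rows, target, match_on):
    # Donors are records from the same dataset with the target present.
    # Assumes at least one donor exists and match_on is never missing.
    donors = [r for r in rows if r[target] is not None]
    filled = []
    for row in rows:
        row = dict(row)  # copy so the input list is left unchanged
        if row[target] is None:
            # Borrow the value from the most similar donor.
            donor = min(donors, key=lambda d: abs(d[match_on] - row[match_on]))
            row[target] = donor[target]
        filled.append(row)
    return filled


people = [
    {"age": 25, "income": 30000},
    {"age": 47, "income": 61000},
    {"age": 45, "income": None},
]
imputed = hot_deck_impute(people, target="income", match_on="age")
print(imputed[2])  # {'age': 45, 'income': 61000}
```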
The goal of unsupervised learning is to predict future outcomes based on prior experience.
True
False
False
Errors due to _____ are errors made as a result of not providing the learning algorithm with the right amount or type of training data.
A. Bias
B. Sampling
C. Randomness
D. Variance
D. Variance
The attribute or feature that you are trying to predict, which is described by the other features within an instance, is known as the _______.
A. Instance
B. Feature
C. Class
D. Dependent variable
C. Class
Color, shape, angle and number of edges are examples of nominal (or discrete) features.
True
False
False