Week 2 Flashcards
According to the reading, the output of a data mining exercise largely depends on:
The programming language used.
The data scientist.
The quality of the data.*
The scope of the project.
When data are missing in a systematic way, you can simply extrapolate the data or impute the missing data by filling in the average of the values around the missing data.
FALSE
What is an example of a data reduction algorithm?
Prior Variable Analysis.
Conjoint Analysis.
A/B Testing.
Principal Component Analysis.*
After the data are appropriately processed, transformed, and stored, what is a good starting point for data mining?
Data Visualization.*
Machine learning.
Creating a relational database.
Non-parametric methods.
In-sample forecasting is the process of formally evaluating the predictive capabilities of models developed using observed data, to see how effective the algorithms are at reproducing those data.
TRUE
Prior Variable Analysis and Principal Component Analysis are both examples of data reduction algorithms.
FALSE
After the data are appropriately processed, transformed, and stored, machine learning and non-parametric methods are a good starting point for data mining.
FALSE
According to the reading, the output of a data mining exercise largely depends on the skills of the data scientist carrying out the exercise.
FALSE
What should you do when data are missing in a systematic way?
Determine who was managing the database.
Determine the impact of missing data on the results and whether missing data can be excluded from the analysis.*
Determine the average of the values around the missing data.
Extrapolate the data.