UNIT 2-3 Flashcards
The process of preparing data for analysis by removing or modifying incorrect, incomplete, irrelevant, duplicated, or improperly formatted data
Data Cleaning
T/F: When importing data, you may not change the type, role, or name of each attribute (variable)
FALSE
Many different string values
Polynominal
Exactly two values
Binominal
A fractional number
Real
A whole number
Integer
Indicator for date and time
date_time
Indicator for date without time
date
Indicator for time without date
time
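The value types on the cards above can be illustrated with a small Python sketch. This is not how RapidMiner detects types; `guess_value_type` is a hypothetical helper, the date and time formats are assumed, and the returned names follow RapidMiner's spelling (polynominal, binominal).

```python
from datetime import datetime


def guess_value_type(values):
    """Hypothetical helper: label a column of raw strings with one of the
    value types from the cards above. Formats below are assumptions."""
    if not values:
        raise ValueError("need at least one value")

    def all_match(parse):
        # True only if every value can be parsed without a ValueError.
        for v in values:
            try:
                parse(v)
            except ValueError:
                return False
        return True

    if all_match(int):
        return "integer"      # a whole number
    if all_match(float):
        return "real"         # a fractional number
    if all_match(lambda v: datetime.strptime(v, "%Y-%m-%d %H:%M:%S")):
        return "date_time"    # date and time
    if all_match(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "date"         # date without time
    if all_match(lambda v: datetime.strptime(v, "%H:%M:%S")):
        return "time"         # time without date
    # Remaining columns are treated as strings: exactly two distinct
    # values is binominal, anything else is polynominal.
    return "binominal" if len(set(values)) == 2 else "polynominal"
```

For example, `guess_value_type(["yes", "no", "yes"])` falls through every numeric and date check and is labeled by its two distinct string values.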
An operation in RapidMiner that applies given criteria and retains only the data that meet them
Filtering
Instead of adding filters, you may remove all cases with missing values using the ______________
Condition Class
To remove leading and trailing “white spaces” from encoded values, use the ______ operator
TRIM
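The three cleaning steps on the cards above (trimming, removing cases with missing values, filtering by a criterion) can be sketched in plain Python. This is an analogy, not RapidMiner code; the sample rows and the helper names `trim_strings`, `has_no_missing`, and `filter_rows` are made up for illustration.

```python
# Made-up sample data: each dict is one case (row).
rows = [
    {"name": "  Ana ", "age": 21},
    {"name": "Ben", "age": None},     # missing value
    {"name": " Cara", "age": 17},
]


def trim_strings(row):
    """Strip leading and trailing white space from every string value."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}


def has_no_missing(row):
    """True if no value in the row is None or an empty string."""
    return all(v is not None and v != "" for v in row.values())


def filter_rows(rows, condition):
    """Retain only the rows for which the condition holds."""
    return [r for r in rows if condition(r)]


trimmed = [trim_strings(r) for r in rows]
complete = filter_rows(trimmed, has_no_missing)            # drops Ben
adults = filter_rows(complete, lambda r: r["age"] >= 18)   # keeps Ana only
```

Dropping incomplete rows before applying the age criterion avoids comparing `None` with a number, which would raise an error in Python.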
The graphical representation of data; techniques used to communicate insights from data through visual representation
Data Visualization
T/F: Data visualization distills large datasets into visual graphics, making complex relationships within the data easy to understand and helping people analyze massive amounts of information and make data-driven decisions
TRUE
What are the common visualization techniques?
Bar graph, Line graph, Pie graph, Histogram, Scatterplot, Boxplot, Heatmap