data preparation Flashcards
process of preparing data for analysis by removing or modifying incorrect, incomplete, irrelevant, duplicated, or improperly formatted data
data cleaning
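A minimal sketch of the definition above in plain Python. The records, the field names, and the `clean` helper are all hypothetical and only illustrate dropping duplicated and incomplete rows and fixing improperly formatted text; this is not how RapidMiner does it internally.

```python
# Hypothetical records; None marks a missing value.
rows = [
    {"id": 1, "name": " Ana ", "age": 34},
    {"id": 1, "name": " Ana ", "age": 34},   # exact duplicate
    {"id": 2, "name": "BEN", "age": None},   # incomplete
    {"id": 3, "name": "cara", "age": 29},    # inconsistent formatting
]

def clean(records):
    """Drop incomplete and duplicated rows and normalize name formatting."""
    seen = set()
    cleaned = []
    for rec in records:
        if any(value is None for value in rec.values()):
            continue  # incomplete row: skip it
        fixed = dict(rec, name=rec["name"].strip().title())
        key = tuple(sorted(fixed.items()))
        if key in seen:
            continue  # duplicated row: skip it
        seen.add(key)
        cleaned.append(fixed)
    return cleaned

cleaned_rows = clean(rows)
```

Here incomplete rows are removed rather than modified; replacing the missing value instead would be an equally valid cleaning choice.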
types of data: many different string values
polynominal
types of data: exactly 2 values
binominal
types of data: a fractional number
real
types of data: a whole number
integer
types of data: both date and time
date_time
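The data type cards above can be sketched as a small guessing function. `guess_type` is a hypothetical helper, the date format is an assumption, and the returned names follow RapidMiner's spellings (binominal, polynominal); the tool's own type detection may behave differently.

```python
from datetime import datetime

def guess_type(values):
    """Guess a RapidMiner-style value type for a list of raw strings."""
    def all_parse(parser):
        # True only if every value can be parsed by the given function.
        try:
            for v in values:
                parser(v)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "integer"      # whole numbers
    if all_parse(float):
        return "real"         # fractional numbers
    # Assumed timestamp layout: "YYYY-MM-DD HH:MM:SS".
    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d %H:%M:%S")):
        return "date_time"    # both date and time
    # Remaining columns are treated as nominal strings.
    return "binominal" if len(set(values)) == 2 else "polynominal"
```

For example, `guess_type(["yes", "no", "yes"])` gives `"binominal"` and `guess_type(["red", "green", "blue"])` gives `"polynominal"`. Note that a column coded as `"0"`/`"1"` is reported as integer by this sketch, because numeric parsing is tried first.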
RapidMiner interface
repository/source tabs
operators/analysis tabs
description tabs
parameter tabs
canvas
data will appear in the __ tab
results
“read excel” is found in what tab
operator tab
to find the basic statistics of each attribute, click __
statistics
connect the __ node of the read excel operator to the __ node of the result knob
out, res
filter examples may be found in what tab
operator tab
add filter may be found in what tab
parameter tab
instead of adding filters, you may remove all cases with missing values using the __ class
condition
replace missing values may be found in what tab
operator tab
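The two ways of handling missing values in the cards above (removing the affected cases versus replacing the missing values) can be compared in a short sketch. The rows and attribute names are hypothetical, and replacing with the average is only one common choice, assumed here for illustration.

```python
# Hypothetical example set; None marks a missing value.
rows = [
    {"customer": "A", "income": 50000.0},
    {"customer": "B", "income": None},
    {"customer": "C", "income": 70000.0},
]

# Option 1: keep only the cases with no missing attributes.
complete_rows = [r for r in rows if all(v is not None for v in r.values())]

# Option 2: replace each missing income with the average of the known incomes.
known = [r["income"] for r in rows if r["income"] is not None]
average = sum(known) / len(known)
filled_rows = [
    dict(r, income=average if r["income"] is None else r["income"])
    for r in rows
]
```

Option 1 shrinks the data set to two cases, while option 2 keeps all three and gives customer B the average income of 60000.0.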
t/f: in dealing with miscoded data, the out node of the retrieve customer operator must also be connected to the first res of the result knob
f (second knob)
the __ __ operator may be used to tag the attribute that will be used as the label (target variable)
set role
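As a rough analogy for tagging a label, the sketch below separates one attribute from the rest so it can serve as the target variable. The rows and the choice of `churn` as the label are hypothetical.

```python
# Hypothetical example set with three attributes.
rows = [
    {"age": 34, "income": 50000, "churn": "yes"},
    {"age": 29, "income": 70000, "churn": "no"},
]
label_name = "churn"  # assumed target attribute

# Regular attributes are everything except the label.
features = [{k: v for k, v in r.items() if k != label_name} for r in rows]
# The label column holds the values a model would learn to predict.
labels = [r[label_name] for r in rows]
```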
if two data sets need to be merged for an analysis, use the __ operator
join
connect the first data set or its result to the __ node of the join operator and the other data set to the right node
left
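A minimal sketch of what merging a left and a right data set on a shared key looks like. The data, the `id` key, and the `inner_join` helper are hypothetical, and an inner join (keeping only keys present on both sides) is assumed; other join types keep unmatched rows as well.

```python
# Hypothetical left and right example sets sharing an "id" attribute.
left = [
    {"id": 1, "name": "Ana"},
    {"id": 2, "name": "Ben"},
    {"id": 3, "name": "Cara"},
]
right = [
    {"id": 1, "city": "Manila"},
    {"id": 3, "city": "Cebu"},
]

def inner_join(left_rows, right_rows, key):
    """Combine rows whose key value appears in both inputs."""
    index = {}
    for r in right_rows:
        index.setdefault(r[key], []).append(r)
    joined = []
    for l in left_rows:
        for r in index.get(l[key], []):
            joined.append({**l, **r})  # merge attributes from both sides
    return joined

joined = inner_join(left, right, "id")
```

Ben (id 2) has no match in the right set, so he is absent from `joined` under the inner-join assumption.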