Exploratory Data Analysis Flashcards
Data expressed on a numerical scale are what data type?
numeric
Data that can take only a specific set of values representing a set of possible categories (enums, enumerated, factors, nominal) are what data type?
categorical
Cite the two numerical data types
continuous and discrete
Cite the two special cases of categorical data
binary and ordinal
Data that can take on any value in an interval (float, numeric)
continuous
Data that can take only integer values, such as counts
discrete
True or False: Data typing in software acts as a signal on how to process the data
True
Rectangular data (like a spreadsheet) is the basic structure for statistical and machine learning models. What is this structure called?
dataframe
A column (series) within a table is commonly referred to as a _______?
feature
Many data science projects involve predicting an ______?
outcome (dependent variable, response, target, output)
A row in a table is referred to as a ______?
record
What is the sum of all values divided by the number of values?
mean
The sum of each value multiplied by its weight, divided by the sum of the weights
weighted mean
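Worked example for the mean and weighted mean cards. This is a minimal sketch in plain Python; the function names and the sample numbers are made up for illustration and do not come from the deck.

```python
def mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Sum of each value times its weight, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical sample data: the last value counts twice as much as the others.
values = [2.0, 4.0, 6.0]
weights = [1.0, 1.0, 2.0]

print(mean(values))                    # (2 + 4 + 6) / 3 = 4.0
print(weighted_mean(values, weights))  # (2 + 4 + 12) / 4 = 4.5
```

With equal weights the weighted mean reduces to the ordinary mean.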
The value such that one-half of the data lies above it and one-half lies below it
median
The value such that P percent of the data lies below
percentile (quantile)
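Worked example for the median and percentile cards. Software packages define percentiles in slightly different ways; the sketch below assumes the simple nearest-rank rule (the smallest value with at least P percent of the data at or below it). The helper name and sample data are hypothetical.

```python
import math
import statistics


def percentile_nearest_rank(values, p):
    """Smallest value with at least p percent of the data at or below it."""
    if not 0 < p <= 100:
        raise ValueError("p must be in the interval (0, 100]")
    ordered = sorted(values)
    # Nearest-rank rule: 1-based rank is ceil(p * n / 100).
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


data = [15, 20, 35, 40, 50]  # hypothetical sample

print(statistics.median(data))            # middle value of the sorted data: 35
print(percentile_nearest_rank(data, 50))  # the 50th percentile is the median: 35
print(percentile_nearest_rank(data, 30))  # rank ceil(1.5) = 2, so 20
```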
The value in the sorted data such that one-half of the sum of the weights lies above it and one-half lies below it
weighted median
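Worked example for the weighted median card. This sketch assumes one common convention: sort by value, accumulate the weights, and return the first value at which the running total reaches half of the total weight (sometimes called the lower weighted median). The function name and data are illustrative only.

```python
def weighted_median(values, weights):
    """First sorted value whose cumulative weight reaches half the total weight."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    half = sum(weights) / 2
    cumulative = 0.0
    # Walk through the data in ascending order of value.
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= half:
            return value
    raise ValueError("values must not be empty")


# Hypothetical data: the heavy weight on 4 pulls the weighted median up to 4.
print(weighted_median([1, 2, 3, 4], [1, 1, 1, 5]))  # 4
# With equal weights the result is the ordinary median.
print(weighted_median([1, 2, 3], [1, 1, 1]))        # 2
```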
The average of all values after dropping a fixed number of extreme values
trimmed mean (truncated mean)
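Worked example for the trimmed mean card. This sketch assumes the fixed number of values is dropped from each end of the sorted data; the parameter name k and the sample data are hypothetical.

```python
def trimmed_mean(values, k):
    """Mean after dropping the k smallest and the k largest values."""
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must be non-negative and leave at least one value")
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


data = [1, 2, 3, 4, 100]  # hypothetical sample with one extreme value

print(sum(data) / len(data))  # ordinary mean, pulled up by 100: 22.0
print(trimmed_mean(data, 1))  # drops 1 and 100, mean of [2, 3, 4]: 3.0
```

The trimmed mean is far less sensitive to the extreme value than the ordinary mean.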