Chapter 2 Flashcards
Binning
Categorizing a metric variable into a smaller number of categories/bins and thus converting the variable into a nonmetric form
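A minimal sketch of binning, assuming a hypothetical age variable and cut points chosen purely for illustration (the helper name `bin_values` is not from the chapter):

```python
from bisect import bisect_right

def bin_values(values, edges, labels):
    """Map each metric value to a labeled bin.

    edges are the interior cut points in ascending order, so
    len(labels) must equal len(edges) + 1. A value equal to a
    cut point falls into the higher bin.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than cut points")
    return [labels[bisect_right(edges, v)] for v in values]

# Hypothetical metric variable (age in years) turned into 3 nonmetric bins.
ages = [19, 34, 52, 67, 45]
age_groups = bin_values(ages, edges=[30, 50], labels=["under 30", "30-49", "50+"])
```

With a single cut point and two labels, the same helper performs dichotomization.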
Boxplot
box represents the major portion of the distribution and the extensions (whiskers) reach to the extreme points of the distribution
- useful in making comparisons of one or more metric variables across groups formed by a nonmetric variable
Cardinality
the number of distinct values for a variable
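As a quick illustration, cardinality can be computed by counting the distinct values in a column. The records and column names below are made up for the example:

```python
def cardinality(rows, column):
    """Return the number of distinct values observed for one variable."""
    return len({row[column] for row in rows})

# Hypothetical dataset: each dict is one observation.
records = [
    {"region": "north", "plan": "basic"},
    {"region": "south", "plan": "basic"},
    {"region": "north", "plan": "premium"},
    {"region": "east", "plan": "basic"},
]

region_cardinality = cardinality(records, "region")  # 3 distinct regions
plan_cardinality = cardinality(records, "plan")      # 2 distinct plans
```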
Censored data
observations that are incomplete in a systematic and known way
- censored data are an example of “ignorable missing data”
curse of dimensionality
problems associated with including a very large number of variables in the analysis
- distance measures becoming less useful
- higher potential for irrelevant variables
- differing scales of measurement for the variables
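The first point above (distance measures becoming less useful) can be sketched with simulated data. This is only an illustration under assumed settings: points drawn uniformly at random, Euclidean distance, and a hypothetical helper `relative_contrast` that compares the farthest and nearest neighbor of one query point.

```python
import math
import random

def relative_contrast(n_dims, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from a query point
    to n_points random points in the unit hypercube."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)

# With few variables, the nearest point is much closer than the farthest.
low_dim_contrast = relative_contrast(2)
# With many variables, all points end up at nearly the same distance.
high_dim_contrast = relative_contrast(500)
```

A small contrast means the nearest and farthest observations are almost equally far away, so distance carries little information for separating cases.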
Data management
all activities associated with assembling a dataset for analysis
Data quality
accuracy of the information in a dataset
● 8 dimensions → completeness / availability and accessibility / currency / accuracy / validity / usability and interpretability / reliability and credibility / consistency
dCor
newer measure of association that is distance-based and more sensitive to nonlinear patterns in the data
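To see why a distance-based measure picks up nonlinear patterns, here is a rough pure-Python sketch of sample distance correlation for two single variables, compared against Pearson's r on made-up data where Y = X squared. The function names are illustrative and the code is not optimized for large samples.

```python
import math

def _centered_distances(values):
    """Pairwise absolute distances, double-centered (row, column, grand mean)."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[d[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]

def _mean_product(a, b):
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)

def distance_correlation(x, y):
    a, b = _centered_distances(x), _centered_distances(y)
    denom = math.sqrt(_mean_product(a, a) * _mean_product(b, b))
    if denom == 0:
        return 0.0
    return math.sqrt(max(_mean_product(a, b), 0.0) / denom)

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up data with a purely nonlinear (U-shaped) relationship.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [xi ** 2 for xi in x]
r = pearson(x, y)                   # linear correlation misses the pattern
dcor = distance_correlation(x, y)   # clearly above zero for the same data
```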
dichotomization
dividing cases into classes based on being above or below a specific value
elasticity
measure of the ratio of the % change in Y to a % change in X.
Obtained by applying a log-log transformation to both the DVs and IVs; the resulting coefficient is the elasticity
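A small sketch of the log-log idea, assuming one IV and one DV and using made-up price and demand figures constructed so that the true elasticity is -0.5. The slope of log(Y) regressed on log(X) recovers that value:

```python
import math

def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical data: demand = 100 * price^(-0.5), so elasticity is -0.5.
price = [1.0, 2.0, 4.0, 8.0, 16.0]
demand = [100.0 * p ** -0.5 for p in price]

# Log-transform both variables, then fit; the slope is the elasticity.
log_price = [math.log(p) for p in price]
log_demand = [math.log(q) for q in demand]
elasticity = ols_slope(log_price, log_demand)
```

Read the result as: a 1% increase in price goes with roughly a 0.5% decrease in demand.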
Heat map
form of a scatterplot of nonmetric variables where frequency within each cell is color-coded to depict relationships
Assumptions of equal variance of the population error E
E is estimated from e
Homoscedasticity
variance of the error terms (e) appears constant over a range of predictor variables
Heteroscedasticity
error terms have increasing or modulating variance
Histogram
graphical display of the distribution of a single variable
Hoeffding’s D
new measure of association/correlation, based on distance measures between the variables and thus more likely to incorporate nonlinear components