final exam Flashcards
Append
Add records from one dataset to another
Merge
Add fields from one dataset to another
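To make the Append/Merge distinction concrete, here is a minimal sketch in plain Python. The datasets, the field names (`id`, `name`, `age`), and the helper functions `append` and `merge` are hypothetical, chosen only for illustration; they are not the API of any particular data-mining tool.

```python
# Each record is a dict; each key is a field. All names here are illustrative.
customers_a = [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Ben"}]
customers_b = [{"id": 3, "name": "Cy"}]
ages = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]

def append(first, second):
    """Append: stack the records of two datasets that share the same fields."""
    return first + second

def merge(left, right, key):
    """Merge: add the fields of `right` to the matching records of `left`."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]

appended = append(customers_a, customers_b)
merged = merge(customers_a, ages, "id")

print(len(appended))  # 3 records, still 2 fields each
print(merged[0])      # {'id': 1, 'name': 'Ana', 'age': 34}
```

Append makes the dataset longer (more records); Merge makes it wider (more fields).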
Rectangular Data
Data laid out as a grid of records (rows) by fields (columns)
Stages of CRISP-DM
Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment
SuperNode
Condensing several nodes into a single node
Histogram is used for which fields
Continuous Fields
Distribution is used for which fields
Categorical Fields
Direct link between 2 variables
Causation
2 variables change at a certain rate in relationship to each other
Correlation
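One common way to measure how strongly 2 variables move together is the Pearson correlation coefficient. The sketch below assumes that measure (the cards do not name one), and the sample data (`hours`, `scores`) is made up for illustration. A value near 1 or -1 shows a strong relationship, but it does not by itself show causation.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical data: hours studied and exam score
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 79]
r = pearson(hours, scores)
print(round(r, 2))
```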
Data point that deviates far from the other observations
Outlier/Extreme
Values 3-5 SD from the mean
Outlier
Values more than 5 SD away from the mean
Extreme (Outlier)
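The two SD thresholds above can be sketched as a small classifier. The function name `classify` and the label "normal" for values under 3 SD are assumptions for illustration; the cards only define the Outlier and Extreme bands.

```python
def classify(value, mean, sd):
    """Label a value by its distance from the mean, measured in SDs."""
    distance = abs(value - mean) / sd
    if distance > 5:
        return "extreme"   # more than 5 SD from the mean
    if distance >= 3:
        return "outlier"   # 3-5 SD from the mean
    return "normal"        # assumed label for everything closer than 3 SD

# Hypothetical field with mean 50 and SD 10
for value in (75, 85, 120):
    print(value, classify(value, mean=50, sd=10))
```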
2 or more categories that can be ranked (School ranking)
Ordinal Ranking
2 or more categories that cannot be ranked and have no order (people's favorite color)
Nominal
All the numbers added together, then divided by how many numbers there are (average)
Mean
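As a quick worked example of the definition, a minimal sketch with made-up numbers (the function name `mean` and the list `values` are illustrative):

```python
def mean(values):
    """Add all the numbers together, then divide by how many there are."""
    return sum(values) / len(values)

values = [4, 8, 15, 16, 23, 42]
average = mean(values)
print(average)  # 108 / 6 = 18.0
```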