INFORMATICS Quiz 2 Flashcards
The process of grouping distinct data points into subsets so that informed decisions can be made based on the findings
Clustering
Involves dividing a data set into clusters so that each cluster can be evaluated individually
Partitioning Method
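A minimal sketch of the partitioning idea, using a k-means style loop on 1-D numbers. The function name `kmeans_1d`, the starting centers, and the sample points are assumptions made for illustration; the card itself does not name a specific algorithm.

```python
def kmeans_1d(points, centers, iterations=10):
    """Partition 1-D points around the given starting centers (k-means style)."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its old center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return clusters, centers


# Illustrative data only: two obvious groups.
clusters, centers = kmeans_1d([1, 2, 3, 10, 11, 12], centers=[0.0, 5.0])
```

Each resulting cluster can then be evaluated on its own, which is the point of the method.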
A single data set cluster is grouped according to similarities
Hierarchical method
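One way to picture the hierarchical method is a bottom-up (agglomerative) merge, sketched here for 1-D numbers with single linkage. The name `agglomerate`, the `target_clusters` parameter, and the sample values are assumptions for illustration, not part of the card.

```python
def agglomerate(points, target_clusters):
    """Bottom-up hierarchical clustering of 1-D points with single linkage."""
    # Start with every point in its own cluster.
    clusters = [[p] for p in sorted(points)]
    # Assumes target_clusters >= 1.
    while len(clusters) > target_clusters:
        # For sorted 1-D data, the closest pair of clusters is always adjacent.
        gaps = [clusters[i + 1][0] - clusters[i][-1] for i in range(len(clusters) - 1)]
        i = gaps.index(min(gaps))
        # Merge the two most similar (closest) clusters into one.
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters


# Illustrative data only.
merged = agglomerate([1, 2, 9, 10, 25], target_clusters=2)
```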
Machine learning (ML) based method where plotted groups of clusters are analyzed
Density-Based Method
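A rough sketch of the density idea: a point counts as "dense" when enough other points lie close to it, and isolated points are treated as noise. This is only the neighbor-counting step used by algorithms such as DBSCAN, not a full implementation; `dense_points`, `eps`, and `min_neighbors` are names assumed for illustration.

```python
def dense_points(points, eps, min_neighbors):
    """Split 1-D points into dense points and noise by counting nearby neighbors."""
    dense, noise = [], []
    for i, p in enumerate(points):
        # Count the other points that lie within eps of p.
        neighbors = sum(
            1 for j, q in enumerate(points) if j != i and abs(p - q) <= eps
        )
        (dense if neighbors >= min_neighbors else noise).append(p)
    return dense, noise


# Illustrative data only: two dense regions and one isolated point.
dense, noise = dense_points(
    [1.0, 1.2, 1.4, 5.0, 9.0, 9.1, 9.3], eps=0.5, min_neighbors=2
)
```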
Efficient method that works with cells on a grid.
Grid-Based Method
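A small sketch of why working with grid cells is efficient: each point is mapped to a cell once, and later analysis works with cell counts instead of individual points. The function `grid_counts`, the `cell_size` parameter, and the sample points are assumptions for illustration.

```python
from collections import Counter


def grid_counts(points, cell_size):
    """Count how many 2-D points fall into each square grid cell."""
    counts = Counter()
    for x, y in points:
        # Floor division maps a coordinate to the index of its cell.
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts


# Illustrative data only.
cells = grid_counts([(0.5, 0.5), (0.9, 0.1), (2.5, 2.5), (2.1, 2.9)], cell_size=1.0)
```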
Involves searching for repeated instances of a single attribute or data point, e.g. searching a customer relationship management (CRM) database for instances of a specific product purchase
Single-dimensional method
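A minimal sketch of a single-dimensional search: count how often one attribute takes one value. The `purchases` records stand in for a hypothetical CRM export, and `count_matches` is a helper name assumed for illustration.

```python
# Hypothetical CRM export: one record per purchase (made-up data).
purchases = [
    {"customer": "A", "product": "stroller"},
    {"customer": "B", "product": "juice"},
    {"customer": "C", "product": "stroller"},
]


def count_matches(records, attribute, value):
    """Count the records whose single attribute equals the given value."""
    return sum(1 for r in records if r.get(attribute) == value)


stroller_purchases = count_matches(purchases, "product", "stroller")
```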
Involves sourcing more than one attribute (data point) in a data set.
Multi-dimensional method
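By contrast with a single-attribute search, a multi-dimensional search matches on several attributes at once. A minimal sketch follows; the `records` list and the helper `match_all` are assumptions for illustration.

```python
# Hypothetical records with more than one attribute each (made-up data).
records = [
    {"customer": "A", "product": "stroller", "region": "north"},
    {"customer": "B", "product": "stroller", "region": "south"},
    {"customer": "C", "product": "juice", "region": "north"},
]


def match_all(rows, **criteria):
    """Return the rows that satisfy every attribute=value pair in criteria."""
    return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]


northern_strollers = match_all(records, product="stroller", region="north")
```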
Statistical association can help retailers notice that parent shoppers (looking for childcare supplies) are more likely to buy specialty food or beverages
Analysis of impromptu shopping
behavior
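A statistical association of this kind is commonly measured with support and confidence. The sketch below assumes made-up shopping baskets; the function names and item names are illustrative only.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for b in baskets if items <= b) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Estimated chance of seeing `consequent` in a basket that has `antecedent`."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)


# Made-up baskets: each set is one shopping trip.
baskets = [
    {"diapers", "juice"},
    {"diapers", "juice", "wipes"},
    {"diapers"},
    {"bread"},
]

# How often do baskets with diapers also contain juice?
rule_confidence = confidence(baskets, {"diapers"}, {"juice"})
```

Here 3 of 4 baskets contain diapers and 2 of 4 contain both items, so the confidence of "diapers, then juice" is 2/3.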
Checks the formats of the data points in each data set
Verifying Data
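A minimal sketch of a format check, assuming the values are strings that should look like `YYYY-MM-DD` dates. The pattern, the helper `invalid_values`, and the sample values are assumptions for illustration.

```python
import re

# Assumed target format for this example: four digits, two digits, two digits.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def invalid_values(values, pattern):
    """Return the string values that do not fully match the expected format."""
    return [v for v in values if not pattern.fullmatch(v)]


bad_dates = invalid_values(["2024-01-05", "05/01/2024", "2024-1-5"], DATE_PATTERN)
```

Note that this checks only the shape of the value, not whether the date is a real calendar date.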
Ensures data uniformity across the data set: numerical values contain only numbers, and string values contain letters and other characters.
Converting Data Types
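A small sketch of type conversion for a numeric column, assuming raw values arrive as text with optional thousands separators and padding. The helper name `to_number` is an assumption for illustration.

```python
def to_number(value):
    """Convert a raw value to float, or return None if it is not numeric."""
    try:
        # Strip padding and thousands separators before parsing.
        return float(str(value).replace(",", "").strip())
    except ValueError:
        return None


converted = [to_number(v) for v in ["1,200", " 42 ", "n/a", 7]]
```

Values that cannot be converted come back as `None`, so they can be reviewed rather than silently kept as text.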
Clears
useless or irrelevant data.
Removing Irrelevant Data
Helps make the mining process more efficient by reducing errors
Eliminating Duplicate Points
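A minimal sketch of duplicate removal that keeps the first copy of each record and preserves order. It assumes records are dictionaries with string keys and hashable values; `drop_duplicates` and the sample rows are illustrative.

```python
def drop_duplicates(records):
    """Return the records with exact duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for r in records:
        # A sorted tuple of items is a hashable fingerprint of the record.
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


# Made-up rows: the third is an exact duplicate of the first.
rows = [
    {"id": 1, "product": "juice"},
    {"id": 2, "product": "wipes"},
    {"id": 1, "product": "juice"},
]
deduplicated = drop_duplicates(rows)
```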
Removes mistakes in grammar, spelling, typing, etc., to increase the accuracy and quality of the analysis
Removing Errors
Provides estimated values for missing data
Completing Missing Values
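One common way to estimate a missing value is to substitute the mean of the known values. The sketch below assumes missing entries are marked with `None` and that at least one value is known; other estimates (median, model-based) are equally possible.

```python
def fill_missing_with_mean(values):
    """Replace None entries with the mean of the known values."""
    known = [v for v in values if v is not None]
    # Assumes at least one known value; otherwise there is nothing to average.
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


# Made-up column with two missing entries.
filled = fill_missing_with_mean([10, None, 20, None, 30])
```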
Shows the probability of a particular result when there are two possible outcomes
Logistic regression
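Logistic regression turns a weighted input into a probability between 0 and 1 with the logistic (sigmoid) function. In the sketch below the `weight` and `bias` values are made up rather than fitted to data, and the function name is an assumption for illustration.

```python
import math


def predict_probability(x, weight, bias):
    """Probability of the positive outcome for input x under a logistic model."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))


# Made-up parameters, not learned from data.
p = predict_probability(x=2.0, weight=1.5, bias=-3.0)

# With two possible outcomes, a threshold (0.5 here) picks the predicted class.
predicted_outcome = 1 if p >= 0.5 else 0
```

In this example the weighted input is exactly 0, so the two outcomes are equally likely.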