Chapter 4 Flashcards
Entity
Something the business has the will and means to keep information about
Attribute
A single piece of information about an entity
Relationship
An association of two entities for a business purpose
Unique identifier
Attribute(s) that uniquely identifies an instance of an entity
relationship cardinality
Describes how many instances of one entity can be related to instances of the other entity in the relationship
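A minimal sketch of the cards above, assuming a one-to-many relationship between two hypothetical entities, Customer and Order (these names, their attributes, and the helper `orders_for` are illustrative only and do not come from the chapter):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Customer:          # entity
    customer_id: int     # unique identifier
    name: str            # attribute

@dataclass(frozen=True)
class Order:             # entity
    order_id: int        # unique identifier
    customer_id: int     # relationship: each order belongs to exactly one customer
    total: float         # attribute

# Hypothetical sample data, keyed by unique identifier.
customers = {c.customer_id: c for c in [Customer(1, "Alice"), Customer(2, "Bob")]}
orders = [Order(100, 1, 25.0), Order(101, 1, 40.0), Order(102, 2, 10.0)]

def orders_for(customer_id, all_orders):
    """Return every order related to one customer (the 'many' side)."""
    return [o for o in all_orders if o.customer_id == customer_id]

alice_orders = orders_for(1, orders)
```

Here the cardinality is one-to-many: a customer can have many orders, while each order refers to a single customer through `customer_id`.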
ERD
Entity-relationship diagram
Master data
the official list of data
MDM
master data management
SQL
Structured query language
data mining*
The nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data stored in databases
data mining key words
process, nontrivial, valid, novel, potentially useful, understandable
data mining is a blend of multiple disciplines
statistics
AI
machine learning
information visualization
types of patterns
association
prediction
cluster
sequence
Prediction
classification
regression
time series
association
market basket analysis
link analysis
sequence analysis
segmentation
clustering
outlier analysis
CRISP-DM
Cross-Industry Standard Process for Data Mining
CRISP-DM steps
business understanding, data understanding, data preparation, model building, testing and evaluation, deployment
Association rule input
simple point-of-sale transaction data
Association rule output
most frequent affinities among items
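The input and output of association rules can be sketched with a simple co-occurrence count. This is only an illustration, assuming made-up baskets and a hypothetical helper `frequent_pairs`; it computes pair support by brute force and is not a full association rule algorithm such as Apriori:

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support):
    """Return item pairs whose support (share of baskets containing both) meets min_support."""
    n = len(transactions)
    pair_counts = Counter()
    for basket in transactions:
        # Deduplicate and sort so each pair is counted once per basket, in a stable order.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return {pair: count / n for pair, count in pair_counts.items() if count / n >= min_support}

# Input: point-of-sale transactions (hypothetical data).
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
    ["bread", "milk", "beer"],
]

# Output: the most frequent affinities among items.
result = frequent_pairs(transactions, min_support=0.6)
print(result)
```

With this sample data, only bread and milk appear together in at least 60% of the baskets; every other pair falls below the threshold.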
Data mining mistakes
- Selecting the wrong problem for data mining
- Ignoring what your sponsor thinks data mining is and what it really can/cannot do
- Beginning without an end in mind
- Not leaving sufficient time for data acquisition, selection, and preparation
- Looking only at aggregated results and not at individual records/predictions