Chapter 4 Flashcards

1
Q

Entity

A

Something the business has the will and means to keep information about

2
Q

Attribute

A

A single piece of information about an entity

3
Q

Relationship

A

An association of two entities for a business purpose

4
Q

Unique identifier

A

One or more attributes that uniquely identify an instance of an entity

5
Q

relationship cardinality

A

Describes how many instances of one entity can be related to instances of the other entity in the relationship
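
As a rough illustration of the cards so far, the sketch below models two entities, their attributes and unique identifiers, and checks the cardinality of the relationship between them. The Customer and Order entities, their field names, and the sample rows are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    customer_id: int  # unique identifier (hypothetical)
    name: str         # attribute


@dataclass(frozen=True)
class Order:
    order_id: int     # unique identifier (hypothetical)
    customer_id: int  # relationship: each order is placed by one customer


def max_orders_per_customer(orders):
    """Return the largest number of orders related to any single customer."""
    counts = Counter(order.customer_id for order in orders)
    return max(counts.values(), default=0)


customers = [Customer(1, "Ada"), Customer(2, "Grace")]
orders = [Order(10, 1), Order(11, 1), Order(12, 2)]

# Each order points at exactly one customer, while one customer may have
# several orders, so in this sample the relationship is one-to-many.
print(max_orders_per_customer(orders))  # 2
```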

6
Q

ERD

A

Entity-relationship diagram

7
Q

Master data

A

The official, agreed-upon list of data shared across the business

8
Q

MDM

A

Master data management

9
Q

SQL

A

Structured query language
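
A minimal sketch of what an SQL query looks like in practice, run here through Python's built-in sqlite3 module against an in-memory database. The customer table, its columns, and the sample rows are assumptions made up for this example.

```python
import sqlite3

# In-memory database, discarded when the connection closes.
conn = sqlite3.connect(":memory:")

# Hypothetical entity "customer" with a unique identifier and one attribute.
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)"
)
conn.executemany(
    "INSERT INTO customer (customer_id, name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)

# A basic SELECT query: retrieve one attribute, ordered by the identifier.
rows = conn.execute(
    "SELECT name FROM customer ORDER BY customer_id"
).fetchall()
print(rows)  # [('Ada',), ('Grace',)]

conn.close()
```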

10
Q

Data mining

A

The nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data stored in databases

11
Q

data mining key words

A
process
nontrivial
valid
novel
potentially useful
understandable
12
Q

data mining is a blend of multiple disciplines

A

statistics
AI
machine learning
information visualization

13
Q

types of patterns

A

association
prediction
cluster
sequence

14
Q

Prediction

A

classification
regression
time series

15
Q

association

A

market basket analysis
link analysis
sequence analysis

16
Q

segmentation

A

clustering

outlier analysis
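
One simple way to picture outlier analysis is a z-score check: flag values that sit far from the mean, measured in standard deviations. This is only one of many possible techniques and is not taken from the card itself. The function name, the threshold of 2.0, and the sample values are assumptions for illustration.

```python
from statistics import mean, pstdev


def z_score_outliers(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds the threshold.

    The threshold of 2.0 is an arbitrary illustrative choice.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical daily purchase amounts with one unusually large value.
amounts = [10, 12, 11, 13, 12, 11, 10, 95]
print(z_score_outliers(amounts))  # [95]
```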

17
Q

CRISP-DM

A

Cross-Industry Standard Process for Data Mining

18
Q

CRISP-DM steps

A
business understanding
data understanding
data preparation
model building
testing and evaluation
deployment
19
Q

Association rule input

A

Simple point-of-sale transaction data

20
Q

Association rule output

A

most frequent affinities among items
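
The input and output of association rule mining can be sketched with a small pair-counting example: transactions go in, and the item pairs that most often appear together come out. This is a simplified illustration that only computes the support of item pairs, not a full rule-mining algorithm. The function name and the sample baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_support(transactions):
    """Return {item pair: share of transactions containing both items}."""
    if not transactions:
        return {}
    counts = Counter()
    for basket in transactions:
        # Sort and de-duplicate so each pair is counted once per basket
        # and always appears in the same order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: count / total for pair, count in counts.items()}


# Hypothetical point-of-sale transactions (the input).
transactions = [
    ["bread", "milk"],
    ["bread", "milk", "butter"],
    ["bread", "milk"],
    ["butter", "eggs"],
]

support = pair_support(transactions)
best = max(support, key=support.get)

# The most frequent affinity among items (the output).
print(best, support[best])  # ('bread', 'milk') 0.75
```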

21
Q

Data mining mistakes

A
  • Selecting the wrong problem for data mining
  • Ignoring what your sponsor thinks data mining is and what it really can/cannot do
  • Beginning without an end in mind
  • Not leaving sufficient time for data acquisition, selection, and preparation
  • Looking only at aggregated results and not at individual records/predictions