Chapter 6 test deck Flashcards

1
Q

frequent pattern

A

a set of items, subsequences, or substructures that occurs frequently in a data set
an intrinsic and important property of data sets

2
Q

frequent itemset

A

a set of items that appear frequently together in a transaction data set, e.g. milk and bread

3
Q

frequent sequential pattern

A

a subsequence that occurs frequently in a sequence data set, e.g. buying a PC first, then a digital camera, and then a memory card
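
As a rough illustration, the sketch below checks whether an ordered pattern occurs as a subsequence of a customer's purchase history and counts how many histories contain it. The function name `contains_subsequence` and the sample histories are made up for this example.

```python
def contains_subsequence(history, pattern):
    """Return True if the items of `pattern` appear in `history`
    in the same order (other purchases may occur in between)."""
    it = iter(history)
    # `item in it` consumes the iterator up to the match, so order is enforced.
    return all(item in it for item in pattern)

# Hypothetical purchase histories, one ordered list per customer.
histories = [
    ["pc", "mouse", "digital camera", "memory card"],
    ["digital camera", "pc", "memory card"],
    ["pc", "digital camera", "tripod", "memory card"],
]
pattern = ["pc", "digital camera", "memory card"]

# Support count of the sequential pattern: histories that contain it.
seq_support = sum(contains_subsequence(h, pattern) for h in histories)
```

The second history contains all three items but in the wrong order, so it does not count toward the pattern's support.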

4
Q

substructures

A

refer to different structural forms such as subgraphs, subtrees, or sublattices, which may be combined with itemsets or subsequences. If a substructure occurs frequently, it is called a frequent structured pattern

5
Q

frequent pattern mining

A

searches for recurring relationships in a given data set and is the foundation for many essential data mining tasks

6
Q

market basket analysis

A

the earliest form of frequent pattern mining for association rules

7
Q

association rule

A

each item has a Boolean variable representing the presence or absence of that item
each basket can then be represented by a Boolean vector of the values assigned to these variables
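
A minimal sketch of this encoding, assuming a small made-up item catalogue; the names `items`, `baskets`, and `to_boolean_vector` are hypothetical.

```python
# Hypothetical item catalogue and baskets, invented for illustration.
items = ["bread", "butter", "milk"]
baskets = [
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk"},
]

def to_boolean_vector(basket, items):
    """One Boolean per catalogue item: True if the item is in the basket."""
    return [item in basket for item in items]

# One Boolean vector per basket, in catalogue order.
vectors = [to_boolean_vector(b, items) for b in baskets]
```

Here the first basket becomes `[True, False, True]`: bread present, butter absent, milk present.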

8
Q

support

A

usefulness of a rule
the fraction of all transactions that contain both A and B

9
Q

confidence

A

certainty of a rule
the reliability of the inference made by a rule
the higher the confidence, the more likely it is for B to be present in transactions that contain A
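
For a rule A -> B, confidence is the support count of A and B together divided by the support count of A. The sketch below computes it on made-up transactions; the helper names are hypothetical, and it assumes A occurs at least once (otherwise the division fails).

```python
# Hypothetical transactions, invented for illustration.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"milk"},
    {"bread", "butter"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def confidence(a, b, transactions):
    """confidence(A -> B) = support_count(A and B) / support_count(A).
    Assumes A appears in at least one transaction."""
    return support_count(a | b, transactions) / support_count(a, transactions)

conf_milk_bread = confidence({"milk"}, {"bread"}, transactions)
```

Milk appears in three transactions and bread accompanies it in two of them, so the confidence of milk -> bread is 2/3.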

10
Q

itemset

A

a set of one or more items

11
Q

occurrence frequency

A

the number of transactions that contain an itemset X; also called the frequency, support count, or count of the itemset

12
Q

relative support

A

the fraction of transactions that contain X
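
The difference between the support count and the relative support can be sketched as follows. The data and function names are assumptions made for this example.

```python
# Hypothetical transactions, invented for illustration.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"milk"},
    {"bread", "butter"},
]

def support_count(itemset, transactions):
    """Absolute support: number of transactions containing the itemset."""
    return sum(1 for t in transactions if itemset <= t)

def relative_support(itemset, transactions):
    """Relative support: fraction of transactions containing the itemset."""
    return support_count(itemset, transactions) / len(transactions)

x = {"milk", "bread"}
x_count = support_count(x, transactions)
x_fraction = relative_support(x, transactions)
```

The itemset {milk, bread} occurs in 2 of the 4 transactions, so its support count is 2 and its relative support is 0.5.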

13
Q

Association rule does not necessarily imply causality

A
14
Q

mining association rules

A
  1. find all frequent itemsets: each of these itemsets must occur at least as frequently as a predetermined minimum support count
  2. generate strong association rules from the frequent itemsets: these rules must satisfy minimum support and minimum confidence
15
Q

apriori property

A

all nonempty subsets of a frequent itemset must also be frequent

16
Q

antimonotonicity

A

if a set cannot pass a test, all of its supersets will fail the same test as well. The property is monotonic in the context of failing a test

17
Q

apriori

A

a candidate generation-and-test approach
uses horizontal data format
TID-itemset format

18
Q

eclat

A

frequent pattern mining with vertical data format
Equivalence Class Transformation algorithm
item-TID_set format
the support count of an itemset is simply the length of the TID_set of the itemset
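
A minimal sketch of the vertical format, assuming made-up transactions: it converts TID-itemset data to item-TID_set data and obtains the support count of an itemset by intersecting TID_sets. This shows only the data-format idea, not a full Eclat implementation; `to_vertical` and `tid_set` are hypothetical helper names.

```python
from collections import defaultdict

# Horizontal format: TID -> itemset (made-up data).
horizontal = {
    "T1": {"beer", "diaper", "nuts"},
    "T2": {"beer", "diaper"},
    "T3": {"beer", "nuts"},
    "T4": {"diaper", "milk"},
}

def to_vertical(horizontal):
    """Vertical format: item -> set of TIDs whose transaction contains it."""
    vertical = defaultdict(set)
    for tid, itemset in horizontal.items():
        for item in itemset:
            vertical[item].add(tid)
    return dict(vertical)

def tid_set(itemset, vertical):
    """TID_set of a nonempty itemset: intersection of its items' TID_sets."""
    return set.intersection(*(vertical[item] for item in itemset))

vertical = to_vertical(horizontal)
beer_diaper_tids = tid_set({"beer", "diaper"}, vertical)
beer_diaper_support = len(beer_diaper_tids)
```

No rescan of the transactions is needed to count {beer, diaper}: the two TID_sets intersect in {T1, T2}, so the support count is 2.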

19
Q

Downward closure property of frequent patterns

A

any subset of a frequent itemset must be frequent
if {beer, diaper, nuts} is frequent, {beer, diaper} must be too
every transaction containing all three items also contains any two of them
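
The property can be checked on a toy example: the support count of every subset is at least the support count of the full itemset. The transactions below are invented for illustration.

```python
from itertools import combinations

# Hypothetical transactions, invented for illustration.
transactions = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper", "nuts", "milk"},
    {"beer", "diaper"},
    {"milk"},
]

def support_count(itemset, transactions):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)

full = ("beer", "diaper", "nuts")
full_count = support_count(full, transactions)

# Support counts of all nonempty proper subsets of the full itemset.
subset_counts = {
    subset: support_count(subset, transactions)
    for size in range(1, len(full))
    for subset in combinations(full, size)
}

# Downward closure: no subset is rarer than the full itemset.
closure_holds = all(c >= full_count for c in subset_counts.values())
```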

20
Q

scalable mining methods

A
  1. apriori
  2. frequent pattern growth (not covered in class)
  3. vertical data format approach
21
Q

apriori pruning principle

A

if any itemset is infrequent, its supersets should not be generated or tested
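
One common way to apply this principle is to discard a k-item candidate as soon as one of its (k-1)-item subsets is missing from the frequent itemsets found in the previous pass. The sketch below assumes a hypothetical helper `has_infrequent_subset` and made-up frequent 2-itemsets.

```python
from itertools import combinations

def has_infrequent_subset(candidate, frequent_prev):
    """True if some (k-1)-item subset of the k-item candidate
    is not among the frequent (k-1)-itemsets."""
    k = len(candidate)
    return any(
        frozenset(sub) not in frequent_prev
        for sub in combinations(candidate, k - 1)
    )

# Hypothetical frequent 2-itemsets from a previous pass.
frequent_2 = {
    frozenset({"beer", "diaper"}),
    frozenset({"beer", "nuts"}),
    frozenset({"diaper", "nuts"}),
    frozenset({"beer", "milk"}),
}

# All three 2-subsets are frequent, so this candidate must still be counted.
prune_bdn = has_infrequent_subset(frozenset({"beer", "diaper", "nuts"}), frequent_2)
# {diaper, milk} is not frequent, so this candidate is pruned without counting.
prune_bdm = has_infrequent_subset(frozenset({"beer", "diaper", "milk"}), frequent_2)
```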

22
Q

apriori method

A

initially scan the DB once to get the frequent 1-itemsets
generate length (k+1) candidate itemsets from length k frequent itemsets
test the candidates against the DB
terminate when no frequent or candidate set can be generated
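
The steps above can be sketched as a compact, unoptimized Python function. It is a teaching sketch under simplifying assumptions (transactions held in memory as sets, a naive join step), not a reference implementation; the function name `apriori` and the sample data are made up.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {frozenset: support_count} for all frequent itemsets."""
    def count(candidates):
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    # Initial scan: frequent 1-itemsets.
    singles = {frozenset([i]) for t in transactions for i in t}
    current = {c: n for c, n in count(singles).items() if n >= min_count}
    frequent = dict(current)
    k = 1
    while current:
        # Generate (k+1)-item candidates from pairs of frequent k-itemsets.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k + 1}
        # Prune candidates that have an infrequent k-item subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k))}
        # Test the surviving candidates against the transactions.
        current = {c: n for c, n in count(candidates).items()
                   if n >= min_count}
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical transactions, invented for illustration.
transactions = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "nuts"},
    {"diaper", "milk"},
]
frequent = apriori(transactions, min_count=2)
```

With a minimum support count of 2, this finds three frequent 1-itemsets and two frequent 2-itemsets. The candidate {beer, diaper, nuts} is pruned because {diaper, nuts} occurs only once, after which no candidates remain and the loop stops.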

23
Q

Association rule generation

A
  1. for each frequent itemset I, generate all nonempty subsets of I
  2. for every nonempty subset s of I, output the rule s -> (I - s) if support_count(I) / support_count(s) >= min_conf
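
A minimal sketch of these two steps for a single frequent itemset, assuming the support counts are already known. It iterates over proper nonempty subsets only, so that the right-hand side of each rule is nonempty. The function name and the counts are hypothetical.

```python
from itertools import combinations

def generate_rules(itemset, support_counts, min_conf):
    """Return a list of (antecedent, consequent, confidence) tuples
    for rules s -> (I - s) whose confidence is at least min_conf."""
    itemset = frozenset(itemset)
    rules = []
    for size in range(1, len(itemset)):  # proper nonempty subsets
        for s in map(frozenset, combinations(sorted(itemset), size)):
            conf = support_counts[itemset] / support_counts[s]
            if conf >= min_conf:
                rules.append((s, itemset - s, conf))
    return rules

# Hypothetical support counts, invented for illustration.
support_counts = {
    frozenset({"beer"}): 3,
    frozenset({"diaper"}): 4,
    frozenset({"beer", "diaper"}): 2,
}
rules = generate_rules({"beer", "diaper"}, support_counts, min_conf=0.6)
```

Only beer -> diaper survives here (confidence 2/3); diaper -> beer has confidence 2/4 = 0.5, which is below the 0.6 threshold.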
24
Q

pattern evaluation method

A

strong association rules can be uninteresting and misleading

25
Q

correlation measure lift

A

the occurrence of itemset A is independent of the occurrence of itemset B if P(A ∪ B) = P(A)P(B); otherwise itemsets A and B are dependent and correlated as events
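
Lift is the ratio lift(A, B) = P(A ∪ B) / (P(A)P(B)), where P(A ∪ B) is the probability that a transaction contains both A and B. A rough sketch of the calculation, with probabilities estimated as relative supports over made-up transactions:

```python
# Hypothetical transactions, invented for illustration.
transactions = [
    {"game", "video"},
    {"game"},
    {"video"},
    {"game", "video"},
    {"video"},
]

def relative_support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def lift(a, b, transactions):
    """lift(A, B) = P(A and B together) / (P(A) * P(B)).
    Assumes both A and B occur in at least one transaction."""
    p_ab = relative_support(a | b, transactions)
    return p_ab / (relative_support(a, transactions)
                   * relative_support(b, transactions))

game_video_lift = lift({"game"}, {"video"}, transactions)
```

Here P(game) = 3/5, P(video) = 4/5, and both occur together in 2/5 of the transactions, so the lift is 0.4 / 0.48, roughly 0.83.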

26
Q

negatively correlated

A

lift(A, B) < 1, meaning the occurrence of one likely leads to the absence of the other

27
Q

positively correlated

A

lift(A, B) > 1, meaning that the occurrence of one implies the occurrence of the other

28
Q

independent

A

lift(A, B) = 1, meaning there is no correlation between them
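
The three cases can be summarized in a small helper. This is only a sketch: the function name `interpret_lift` is hypothetical, and the tolerance used to treat a floating-point lift as equal to 1 is an assumption, not part of the definition.

```python
import math

def interpret_lift(value, tol=1e-9):
    """Map a lift value to the three cases: < 1, = 1, > 1."""
    if math.isclose(value, 1.0, abs_tol=tol):
        return "independent"
    if value > 1:
        return "positively correlated"
    return "negatively correlated"

labels = [interpret_lift(v) for v in (0.83, 1.0, 1.5)]
```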