UNIT 2 - Introduction To Continuous Association Rule Mining Algorithm (CARMA) Flashcards
Why use CARMA?
- Efficient: uses less space/time than Apriori, and needs at most two scans of the data to produce the rules
- Uses rule support instead of antecedent support (which Apriori uses)
- Allows rules with multiple consequents
- Allows changes of support thresholds during execution
- Only supports binary/flag variables
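The difference between rule support and antecedent support can be seen on a small example. This is a minimal sketch: the transactions and the helper name `support` are made up for illustration and are not part of the course material.

```python
# Hypothetical toy transactions (each is a set of items bought together).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

antecedent = {"bread"}
consequent = {"milk"}

# Antecedent support: how often the antecedent alone occurs (4 of 5 here).
antecedent_support = support(antecedent, transactions)

# Rule support: how often antecedent AND consequent occur together (3 of 5 here).
rule_support = support(antecedent | consequent, transactions)

# Confidence is the ratio of the two.
confidence = rule_support / antecedent_support

print(antecedent_support, rule_support, round(confidence, 2))
```

Rule support can never exceed antecedent support, so a threshold set on rule support is the stricter of the two.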
What are the two phases of CARMA?
Phase I
Identifies frequent itemsets in the data by constructing a lattice of all potentially frequent itemsets (built in a single scan of the data).
Phase II
Removes itemsets that are not frequent and then generates rules from the lattice of frequent itemsets.
What should be considered when choosing a method for association analysis?
a) characteristics of data to be analysed
b) strengths and weaknesses of the techniques under consideration.
What should be done to numeric inputs before using APRIORI and CARMA?
Numeric fields have to be categorized or binned
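As an illustration of binning, the sketch below turns a numeric field into one flag per bin. The field name `age`, the cut points, and the helper names are assumptions chosen for the example; suitable bins depend on the data.

```python
# Hypothetical records with one numeric field.
records = [
    {"id": 1, "age": 23},
    {"id": 2, "age": 41},
    {"id": 3, "age": 67},
]

# (label, lower bound inclusive, upper bound exclusive); cut points are made up.
AGE_BINS = [
    ("age_under_30", 0, 30),
    ("age_30_to_59", 30, 60),
    ("age_60_plus", 60, float("inf")),
]

def bin_age(age):
    """Return the label of the bin that contains `age`."""
    for label, low, high in AGE_BINS:
        if low <= age < high:
            return label
    raise ValueError(f"age {age} falls outside all bins")

def to_flags(age):
    """One True/False flag per bin, the form a flag-only method expects."""
    chosen = bin_age(age)
    return {label: label == chosen for label, _, _ in AGE_BINS}

# Replace the numeric field with its flag columns.
flagged = [{"id": r["id"], **to_flags(r["age"])} for r in records]

for row in flagged:
    print(row)
```

The categorical label from `bin_age` is enough for a method that accepts multi-category inputs; the flag form from `to_flags` is what a binary-only method needs.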
CARMA can handle inputs with more than two categories whereas Apriori can only handle binary inputs. True or False?
False.
CARMA can only handle binary inputs whereas Apriori can handle inputs with more than two categories
If rules with many consequents are desired, which is the preferred method to use?
CARMA
Which methods are flexible in terms of choice in evaluation methods?
CARMA and APRIORI
The Apriori method allows the choice of four different rule evaluation measures as discussed previously, whereas the CARMA method uses the rule support and rule confidence for rule evaluation. In addition, CARMA allows users to specify the rule size and to vary the support threshold.
What is ARSD?
Association Rule Summary Diagram
- Assembles related rules into a single diagram that is succinct and easier to interpret.
What is ARSD used for?
- Assemble related rules into a single diagram that is succinct and easier to interpret.
- Important when analysts need to present mining results to a non-technical audience such as business managers.
What is Phase 1 of CARMA?
For each transaction T read from the data, with L denoting the lattice:
- Increment the count of every itemset in L that also occurs in T
- For each subset v of T, if v is not in L and all proper subsets of v are in L, insert v into L and update the statistics kept for v (this supports continuous feedback)
- Prune the lattice by removing itemsets with low support (< Min Rule Support)
What is Phase 2 of CARMA?
- Determines the precise support of all itemsets
- Removes infrequent itemsets and their supersets to reduce lattice size (downward closure principle)
- Generates rules from the lattice of frequent itemsets
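The two phases can be sketched in code. This is a simplified illustration, not the full CARMA algorithm: the function names, the `max_size` cap, and the loose "optimistic support" bound used for pruning are assumptions made for this example, and the real algorithm keeps more careful statistics per itemset, so this version may miss itemsets that CARMA would find.

```python
from itertools import combinations

def subsets_of_size(items, k):
    """All k-item subsets of `items`, as frozensets."""
    return [frozenset(c) for c in combinations(sorted(items), k)]

def phase_one(transactions, min_support, max_size=3):
    """One scan: build a lattice of potentially frequent itemsets."""
    # itemset -> {"count": hits since insertion,
    #             "first": number of transactions seen before insertion}
    lattice = {}
    for n_seen, t in enumerate(transactions, start=1):
        known = set(lattice)  # itemsets present before this transaction
        # Increment the count of every lattice itemset that occurs in T.
        for v in known:
            if v <= t:
                lattice[v]["count"] += 1
        # Insert each subset v of T whose smaller subsets were already known.
        for k in range(1, min(max_size, len(t)) + 1):
            for v in subsets_of_size(t, k):
                smaller = [frozenset(s) for s in combinations(v, k - 1) if s]
                if v not in lattice and all(p in known for p in smaller):
                    lattice[v] = {"count": 1, "first": n_seen - 1}
        # Prune multi-item itemsets whose optimistic support (assuming they
        # occurred in every transaction before insertion) is still too low,
        # together with their supersets. Single items are kept (simplification).
        doomed = [v for v, s in lattice.items()
                  if len(v) > 1 and (s["count"] + s["first"]) / n_seen < min_support]
        for v in list(lattice):
            if any(d <= v for d in doomed):
                del lattice[v]
    return lattice

def phase_two(transactions, lattice, min_support, min_confidence):
    """Second scan: exact supports, drop infrequent itemsets, build rules."""
    n = len(transactions)
    support = {v: sum(1 for t in transactions if v <= t) / n for v in lattice}
    # With exact counts, every superset of an infrequent itemset is also
    # infrequent, so this filter removes them as well (downward closure).
    frequent = {v: s for v, s in support.items() if s >= min_support}
    rules = []
    for itemset, s in frequent.items():
        for k in range(1, len(itemset)):
            for antecedent in subsets_of_size(itemset, k):
                confidence = s / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, s, confidence))
    return frequent, rules

# Hypothetical toy data.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"bread", "milk"}),
    frozenset({"milk", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"bread", "butter"}),
]

lattice = phase_one(transactions, min_support=0.6)
frequent, rules = phase_two(transactions, lattice, min_support=0.6, min_confidence=0.7)
for antecedent, consequent, rule_support, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent),
          round(rule_support, 2), round(confidence, 2))
```

Because the consequent is whatever remains of the itemset after removing the antecedent, a frequent three-item itemset would yield rules with two-item consequents, which is how multiple consequents arise in this sketch.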
What are the advantages of ARSD?
In general, ARSD can be used to organise different rules that have part-whole relationships into a diagrammatic representation. There are two interesting properties that can be used to validate the correctness of an ARSD. Firstly, the support percentage of the second rule must be less than or equal to the support percentage of the first rule (i.e., y ≤ w, where w and y are the support percentages of the first and second rules respectively). Secondly, the number of arrows corresponds to the number of rules represented by the ARSD.
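The first validation property can be checked numerically. The sketch below assumes a chain of two rules in which the second rule contains all the items of the first (for example A -> B followed by A, B -> C); the items, data, and helper name `support_pct` are made up for illustration.

```python
# Hypothetical toy transactions.
transactions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "B", "C"},
    {"A"},
    {"B", "C"},
]

def support_pct(items, transactions):
    """Percentage of transactions that contain all of `items`."""
    items = set(items)
    return 100 * sum(1 for t in transactions if items <= t) / len(transactions)

# Assumed first rule: A -> B, with support percentage w.
w = support_pct({"A", "B"}, transactions)

# Assumed second rule: A, B -> C, with support percentage y.
y = support_pct({"A", "B", "C"}, transactions)

# Every transaction containing A, B and C also contains A and B,
# so y can never exceed w.
print(w, y, y <= w)
```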
Why use the Both role instead of Target or Input?
We want the software to consider each attribute as a possible antecedent or consequent.
Why use the None role?
The fields assigned this role will not be used in this illustration.
Advantages of CARMA?
- Efficient: uses less space/time than Apriori, and needs at most two scans of the data to produce the rules
- Online, interactive, feedback-oriented technique: the user can continuously change the support threshold during the process
- Suitable for learning from large datasets, and where transactions are read from a network