Easy Flashcards
Distance Functions, Euclidean, _, _
Manhattan, Minkowski
optimal range of values for K in K nearest neighbours
3-10
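The three distance functions above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation; note that Minkowski with p = 1 reduces to Manhattan and p = 2 to Euclidean.

```python
import math

def euclidean(a, b):
    # straight-line distance: sqrt of the sum of squared differences
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # city-block distance: sum of absolute differences
    return sum(abs(x - y) for x, y in zip(a, b))

def minkowski(a, b, p):
    # generalises both: p = 1 gives Manhattan, p = 2 gives Euclidean
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

print(euclidean((0, 0), (3, 4)))   # 5.0
print(manhattan((0, 0), (3, 4)))   # 7
```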
association rules: find all itemsets that have _ greater than the minimum
support
association rules: find desired rules that have _ greater than the min
confidence
association rules usually need to satisfy a user-specified _ & _
minimum support, confidence
formula: support for association rules
frq(x,y)/n
formula: confidence for association rules
frq(x,y)/frq(x)
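The two formulas above can be checked on a toy transaction list (the items here are made up for illustration): support(x → y) = frq(x, y) / n and confidence(x → y) = frq(x, y) / frq(x).

```python
# Toy transaction database (illustrative data, not from the deck)
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(x, y):
    # frq(x, y) / n : fraction of all transactions containing both items
    n = len(transactions)
    return sum(1 for t in transactions if x in t and y in t) / n

def confidence(x, y):
    # frq(x, y) / frq(x) : of the transactions containing x, how many also contain y
    frq_xy = sum(1 for t in transactions if x in t and y in t)
    frq_x = sum(1 for t in transactions if x in t)
    return frq_xy / frq_x

print(support("bread", "milk"))     # 0.5  (2 of 4 transactions)
print(confidence("bread", "milk"))  # 2/3  (2 of the 3 bread transactions)
```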
K-means clustering… place _ at _ locations; repeat until convergence
centroids, random
K-means clustering… 1. for each point xi:
find nearest centroid
K-means clustering… 2. assign each point _, & for each cluster determine a new centroid
to a cluster
K-means clustering… 3. stop when none of the _ change
cluster assignments
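The three K-means steps on these cards can be sketched as plain Python (a minimal version: random initial centroids, nearest-centroid assignment, mean recomputation, stop when no assignment changes):

```python
import random

def kmeans(points, k, iters=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # place centroids at random locations
    assignment = None
    for _ in range(iters):
        # 1. for each point, find the nearest centroid (squared Euclidean distance)
        new_assignment = [
            min(range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centroids[c])))
            for pt in points
        ]
        # 3. stop when none of the cluster assignments change
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # 2. for each cluster, determine the new centroid as the mean of its points
        for c in range(k):
            members = [pt for pt, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return centroids, assignment

pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids, labels = kmeans(pts, 2)
```

With these four points the two tight pairs end up in separate clusters regardless of the random start.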
a perceptron is used to classify _ classes
linearly separable
a perceptron consists of _, _, _
weights, summation processor, activation function
a perceptron takes a weighted sum of its inputs and outputs 1 if the sum is greater than _ _ _ _, _
some adjusted threshold value, theta
the perceptron can have another input known as
the bias
perceptron: it is normal practice to treat the bias as
just another input
the perceptron bias allows us to
shift the transfer curve horizontally along the input axis
the perceptron weights determine
the slope of the curve
draw the perceptron

perceptron concept: the output is set at one of two levels, depending on whether the _ is greater or less than some _ value. This is called:
total input, threshold, unit step (threshold)
draw the unit step threshold

Perceptron function: the _ consists of two functions, _ and _, ranging from 0 to 1 and -1 to +1
sigmoid, logistic, tangential

perceptron function: output is proportional to the total weighted input
piecewise linear

perceptron function: bell-shaped curves that are continuous. the node output (high / low) is interpreted in terms of class membership (1/0) depending on how close the net input is to a _
Gaussian, chosen value of average

what helps us control how much we change the weights and bias in a perceptron, which we do in order to get the smallest error
the learning rate

Perceptron: if we have n variables then we need to find _
n + 1 weight values (n variables + the bias)
perceptron: if we have 2 inputs the equation becomes:
w1x1 + w2x2 + b = 0
where wi is the weight of input i and b is the bias (w0 with input value x0 of 1)
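The perceptron cards above (weighted sum, unit step, bias as weight w0 with input x0 = 1, learning-rate updates) can be put together as a minimal training sketch. The AND-gate data and learning rate here are illustrative assumptions, not from the deck:

```python
# Minimal perceptron for a linearly separable problem (AND gate, assumed example).
# The bias is treated as just another weight, w0, with fixed input x0 = 1.
def train_perceptron(data, learning_rate=1.0, epochs=20):
    w = [0.0, 0.0, 0.0]  # [w0 (bias), w1, w2]
    for _ in range(epochs):
        for (x1, x2), target in data:
            total = w[0] * 1 + w[1] * x1 + w[2] * x2  # weighted sum incl. bias
            output = 1 if total > 0 else 0            # unit step (threshold) activation
            error = target - output
            # the learning rate controls how much we change the weights and bias
            w[0] += learning_rate * error * 1
            w[1] += learning_rate * error * x1
            w[2] += learning_rate * error * x2
    return w

def predict(w, x1, x2):
    return 1 if w[0] + w[1] * x1 + w[2] * x2 > 0 else 0

and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train_perceptron(and_data)
```

Note that with 2 inputs the learned boundary is exactly the card's w1x1 + w2x2 + b = 0, with n + 1 = 3 values to find.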
what is the data mining task of predicting the value of a categorical variable (target or class)
classification
transforming attributes from numerical to categorical
binning or discretization
transforming attributes from categorical to numerical
encoding or continuization
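Both transformations can be shown in a few lines. The age bins and one-hot encoding below are made-up illustrations, not from the deck:

```python
# Binning / discretization: numerical -> categorical (bin edges are assumed)
ages = [5, 17, 34, 62]

def bin_age(age):
    if age < 13:
        return "child"
    elif age < 20:
        return "teen"
    return "adult"

binned = [bin_age(a) for a in ages]    # ['child', 'teen', 'adult', 'adult']

# One-hot encoding: categorical -> numerical, one 0/1 column per category
categories = ["child", "teen", "adult"]

def one_hot(label):
    return [1 if label == c else 0 for c in categories]

encoded = [one_hot(b) for b in binned]
```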
Represents

Linear / Non-linear separability & inseparable
list the frequency table methods
- ZeroR
- One R
- Naive Bayesian
- Decision Tree
list the covariance matrix methods of classification
- linear discriminant analysis
- logistic regression
list the similarity functions method of classification
K Nearest Neighbours
List the other methods of classification
artificial neural network, support vector machines
simplest classification method
ZeroR
the ZeroR classifier relies on _
the target and ignores all predictors
although there is no predictability power in ZeroR it is useful for _
determining a baseline performance as a benchmark for other classification methods
how to implement ZeroR
construct a frequency table for the target and select its most frequent value

ZeroR only predicts _
the majority class correctly (as shown by the confusion table)
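The ZeroR cards above amount to a few lines: build a frequency table of the target and always return its most frequent value, ignoring all predictors. The training labels here are made-up illustration:

```python
from collections import Counter

def zero_r(targets):
    freq = Counter(targets)              # frequency table for the target
    majority, _ = freq.most_common(1)[0]  # its most frequent value
    # the returned classifier ignores all predictors
    return lambda features=None: majority

train_targets = ["yes", "yes", "no", "yes", "no"]
classify = zero_r(train_targets)
print(classify({"outlook": "sunny"}))  # 'yes', regardless of the predictors
```

On the training data this baseline gets exactly the majority class right (3 of 5 here), which is the benchmark other classifiers must beat.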
