Chapter 12 Flashcards

1
Q

outlier detection, anomaly detection

A

The process of finding data objects with behaviors that are very different from expectation
such objects are called outliers or anomalies

2
Q

normal or expected data

A

data objects that are not outliers; outliers are the abnormal data

3
Q

outlier

A

a data object that deviates significantly from the normal objects, as if it were generated by a different mechanism

4
Q

noise data

A

random error or variance in a measured variable

noise should be removed before outlier detection

5
Q

three types of outliers

A

global, contextual, and collective outliers

6
Q

global outlier or point anomaly

A

an object is a global outlier (Og) if it deviates significantly from the rest of the data set
issue: find an appropriate measurement of deviation
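
One possible measurement of deviation is the z-score (distance from the mean in units of standard deviation). The sketch below is only an illustration under that assumption; the function name `global_outliers`, the threshold of 2.0, and the sample readings are hypothetical and not part of the card.

```python
import statistics

def global_outliers(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # All values are identical, so nothing deviates.
        return []
    return [v for v in values if abs(v - mean) / std > threshold]

# Hypothetical sensor readings with one value far from the rest.
readings = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7, 25.0]
print(global_outliers(readings))
```

Other deviation measures (for example, ones based on the median) may suit skewed data better; the z-score is just one choice.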

7
Q

contextual outlier or conditional outlier

A

an object is a contextual outlier (Oc) if it deviates significantly based on a selected context
ex. 80 degrees in an urban area: whether it is an outlier depends on whether it is winter or summer
attributes are divided into contextual attributes and behavioral attributes
can be viewed as a generalization of local outliers
issue: how to define or formulate a meaningful context
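
The winter/summer example can be sketched by scoring each value only against other values that share its context. This is a minimal illustration, assuming the season is the contextual attribute and the temperature is the behavioral attribute; the function name, the threshold, and the data are made up for the example.

```python
import statistics
from collections import defaultdict

def contextual_outliers(records, threshold=2.0):
    """records is a list of (context, value) pairs.

    A record is flagged when its value has a large z-score
    relative to the other values in the same context.
    """
    by_context = defaultdict(list)
    for context, value in records:
        by_context[context].append(value)

    # Mean and standard deviation of the behavioral attribute per context.
    stats = {
        c: (statistics.fmean(v), statistics.pstdev(v))
        for c, v in by_context.items()
    }

    flagged = []
    for context, value in records:
        mean, std = stats[context]
        if std > 0 and abs(value - mean) / std > threshold:
            flagged.append((context, value))
    return flagged

# Hypothetical temperatures: 80 appears in both seasons.
summer = [("summer", t) for t in [78, 82, 80, 85, 79, 81, 83, 80]]
winter = [("winter", t) for t in [30, 28, 35, 32, 29, 31, 33, 80]]
records = summer + winter
print(contextual_outliers(records))
```

The same value of 80 is flagged only in the winter context, which is the point of the card: the outlier decision depends on the context chosen.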

8
Q

contextual attributes

A

define the context, e.g., time or location

9
Q

behavioral attributes

A

characteristics of the object, used in outlier evaluation, e.g., temperature

10
Q

collective outliers

A

a subset of data objects collectively deviates significantly from the whole data set, even if the individual data objects may not be outliers
a data set may have multiple types of outliers
one object may belong to more than one type of outlier
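
A rough way to see a collective outlier is to score windows of consecutive values instead of single values. The sketch below is only one heuristic, assuming ordered data and assuming that the mean of a window of independent values varies by about std / sqrt(width); the function name, window width, threshold, and series are hypothetical.

```python
import math
import statistics

def collective_outlier_windows(series, width, threshold=1.5):
    """Return start indices of windows whose mean is unusually far
    from the overall mean, given the window width."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    if std == 0:
        return []
    # Heuristic limit: spread expected for the mean of `width` independent values.
    limit = threshold * std / math.sqrt(width)
    starts = []
    for start in range(len(series) - width + 1):
        window_mean = statistics.fmean(series[start:start + width])
        if abs(window_mean - mean) > limit:
            starts.append(start)
    return starts

# Hypothetical signal that alternates between 1 and -1,
# except for a run of four consecutive 1s in the middle.
series = [1, -1] * 6 + [1] * 4 + [-1, 1] * 6
print(collective_outlier_windows(series, width=4))
```

Every individual value in the series is an ordinary 1 or -1, so no single point stands out; only the run taken together is flagged.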

11
Q

two ways to categorize outlier detection methods

A

based on whether user-labeled examples of outliers can be obtained: supervised, semi-supervised, and unsupervised methods
based on assumptions about normal data and outliers: statistical, proximity-based, and clustering-based methods

12
Q

supervised methods

A

a classification problem
model normal objects and report those not matching the model as outliers, or model outliers and treat those not matching the model as normal
challenges: imbalanced classes (outliers are rare); the goal is to catch as many outliers as possible, so recall is more important than accuracy
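
Why recall matters more than accuracy with imbalanced classes can be shown with a small calculation. This is an illustrative sketch; the helper name `accuracy_and_recall`, the label encoding (1 = outlier, 0 = normal), and the 98/2 class split are assumptions made for the example.

```python
def accuracy_and_recall(y_true, y_pred, positive=1):
    """Compute accuracy and recall for the positive (outlier) class."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = correct / len(pairs)
    actual_pos = true_pos + false_neg
    recall = true_pos / actual_pos if actual_pos else 0.0
    return accuracy, recall

# Hypothetical data set: 98 normal objects and 2 outliers.
y_true = [0] * 98 + [1] * 2
# A useless classifier that labels everything as normal.
always_normal = [0] * 100

acc, rec = accuracy_and_recall(y_true, always_normal)
print(acc, rec)
```

The classifier that never reports an outlier still scores 98% accuracy while its recall on outliers is zero, which is why accuracy alone is a misleading measure here.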

13
Q

unsupervised methods

A

assume the normal objects are somewhat clustered into multiple groups
an outlier is expected to be far away from any group of normal objects
weakness: cannot detect collective outliers effectively

14
Q

semi supervised methods

A

regarded as applications of semi-supervised learning
use labeled examples and the proximate unlabeled objects to train a model for normal objects
those not fitting the model are outliers

15
Q

statistical methods, model-based methods

A

assume that the normal data follow some statistical model; data that do not follow the model are outliers
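
As a sketch of the model-based idea, the example below fits a normal distribution to data assumed to be normal and flags new values that are improbable under it. The choice of a Gaussian model, the cutoff `alpha=0.01`, the function names, and the height data are all assumptions for illustration.

```python
from statistics import NormalDist

def fit_normal_model(normal_samples):
    """Estimate mean and standard deviation from data assumed to be normal."""
    return NormalDist.from_samples(normal_samples)

def tail_probability(model, x):
    """Two-sided probability of a value at least as far from the mean as x."""
    p = model.cdf(x)
    return 2 * min(p, 1 - p)

def is_outlier(model, x, alpha=0.01):
    """Flag x when it is too improbable under the fitted model."""
    return tail_probability(model, x) < alpha

# Hypothetical heights in centimeters used as the normal data.
heights_cm = [170, 172, 168, 171, 169, 173, 167, 170, 171, 169]
model = fit_normal_model(heights_cm)
print(is_outlier(model, 170.5), is_outlier(model, 190))
```

The result is only as good as the modeling assumption: if the normal data are not close to Gaussian, a different statistical model would be needed.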

16
Q

proximity-based methods

A

an object is an outlier if its nearest neighbors are far away, i.e., the proximity of the object deviates significantly from the proximity of most of the other objects
two types: distance-based and density-based
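
A distance-based variant can be sketched by scoring each object with the distance to its k-th nearest neighbor; larger scores mean the neighbors are farther away. This is a minimal brute-force illustration, not a density-based method; the function name, the value of k, and the points are hypothetical.

```python
import math

def knn_distance_scores(points, k=2):
    """Score each point by the Euclidean distance to its k-th nearest neighbor."""
    if k < 1 or k >= len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical 2-D points: four close together and one far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8)]
scores = knn_distance_scores(points, k=2)
most_isolated = points[scores.index(max(scores))]
print(most_isolated)
```

Turning scores into outlier labels still requires a cutoff, and the brute-force comparison of all pairs becomes slow on large data sets.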