ISYE 6501 Glossary Flashcards

1
Q

Algorithm

A

Step-by-step procedure designed to carry out a task.

2
Q

Change detection

A

Identifying when a significant change has taken place in a process.

3
Q

Classification

A

The separation of data into two or more categories, or (a point’s
classification) the category a data point is put into.

4
Q

Classifier

A

A boundary that separates the data into two or more categories. Also
(more generally) an algorithm that performs classification.

5
Q

Cluster

A

A group of points identified as near/similar to each other.

6
Q

Cluster center

A

In some clustering algorithms (like 𝑘-means clustering), the central
point (often the centroid) of a cluster of data points.

7
Q

Clustering

A

Separation of data points into groups (“clusters”) based on
nearness/similarity to each other. A common form of unsupervised
learning.

8
Q

CUSUM

A

Change detection method that compares the observed distribution mean
with a threshold level of change. Short for “cumulative sum”.
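
One common way to write CUSUM for detecting an increase is as a running sum that resets at zero and raises a flag once it reaches a threshold. The sketch below assumes that form; the names `mu` (expected mean with no change), `c` (slack subtracted at each step), and `threshold` are illustrative and are not defined in this glossary.

```python
def cusum_increase(xs, mu, c, threshold):
    """Return the index of the first observation at which the running
    sum reaches `threshold`, or None if it never does."""
    s = 0.0
    for i, x in enumerate(xs):
        # Accumulate how far x sits above mu (less the slack c); never go below 0.
        s = max(0.0, s + (x - mu - c))
        if s >= threshold:
            return i
    return None
```

A larger `c` or `threshold` makes detection slower but less prone to false alarms.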

9
Q

Deep learning

A

Neural network-type model with many hidden layers.

10
Q

Dimension

A

A feature of the data points (for example, height or credit score). (Note
that there is also a mathematical definition for this word.)

11
Q

EM algorithm

A

Expectation-maximization algorithm.

12
Q

Expectation-maximization
algorithm (EM algorithm)

A

General description of an algorithm with two steps (often iterated),
one that finds the function for the expected likelihood of getting the
response given current parameters, and one that finds new parameter
values to maximize that probability.

13
Q

Heuristic Algorithm

A

Algorithm that is not guaranteed to find the absolute best (optimal)
solution.

14
Q

𝑘-means algorithm

A

Clustering algorithm that defines 𝑘 clusters of data points, each
corresponding to one of 𝑘 cluster centers selected by the algorithm.
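
A minimal sketch of the usual 𝑘-means iteration, assuming squared Euclidean distance and caller-supplied starting centers (both are assumptions; the glossary does not specify them). Each pass assigns every point to its nearest center, then moves each center to the mean of its assigned points.

```python
def kmeans(points, centers, iterations=10):
    """Run a fixed number of assign/update passes.
    `points` and `centers` are sequences of equal-length tuples."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        # Assignment step: each point joins the cluster of its nearest center.
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
            clusters[nearest].append(p)
        # Update step: move each center to the centroid of its cluster
        # (an empty cluster keeps its old center).
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters
```

Because the result depends on the starting centers, 𝑘-means is typically run from several different starts.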

15
Q

𝑘-Nearest-Neighbor (KNN)

A

Classification algorithm that defines a data point’s category as a
function of the nearest 𝑘 data points to it.
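
As an illustration, the sketch below takes the "function" of the nearest points to be a majority vote and measures nearness with Euclidean distance. Both choices are assumptions made for this example; other distance measures and weighting schemes are possible.

```python
import math
from collections import Counter

def knn_classify(train, query, k):
    """`train` is a list of (point, label) pairs; returns the most common
    label among the k training points closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```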

16
Q

Kernel

A

A type of function that computes the similarity between two inputs;
thanks to what’s (really!) sometimes known as the “kernel trick”,
nonlinear classifiers can be found almost as easily as linear ones.
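
To make the kernel trick concrete, here is a small sketch using a degree-2 polynomial kernel on 2-D inputs (chosen purely for illustration). The kernel value equals a dot product in a higher-dimensional feature space, but it is computed without ever building those features.

```python
import math

def poly2_kernel(x, y):
    """Degree-2 polynomial kernel: (x . y) squared."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def feature_map(x):
    """Explicit 3-D features whose dot product reproduces poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))
```

For `x = (1, 2)` and `y = (3, 4)`, `poly2_kernel(x, y)` and `dot(feature_map(x), feature_map(y))` agree up to floating-point rounding.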

17
Q

Learning

A

Finding/discovering patterns (or rules) in data, often that can be
applied to new data.

18
Q

Machine

A

Apparatus that can do something; in “machine learning”, it often refers to both an algorithm and the computer it’s run on. (Fun fact: before
computers were developed, the term “computers” referred to people
who did calculations quickly in their heads or on paper!)

19
Q

Margin

A

For a single point, the distance between the point and the classification
boundary; for a set of points, the minimum distance between a point
in the set and the classification boundary. Also called the separation.
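
For a linear boundary, both versions of the margin can be computed directly. The sketch below assumes the boundary has the form w·x + b = 0, where `w` and `b` are illustrative names not taken from the glossary.

```python
import math

def point_margin(w, b, x):
    """Distance from point x to the boundary w . x + b = 0."""
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / math.sqrt(
        sum(wi * wi for wi in w)
    )

def set_margin(w, b, points):
    """Margin of a set: the smallest point-to-boundary distance."""
    return min(point_margin(w, b, x) for x in points)
```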

20
Q

Machine learning

A

Use of computer algorithms to learn and discover patterns or structure
in data, without being programmed specifically for them.

21
Q

Misclassified

A

Put into the wrong category by a classifier.

22
Q

Neural network

A

A machine learning model that itself is modeled after the workings of
neurons in the brain.

23
Q

Supervised learning

A

Machine learning where the “correct” answer is known for each data
point in the training set.

24
Q

Support vector

A

In SVM models, the closest point to the classifier, among those in a
category. (Note that there is a more-technical mathematical definition
too.)

25
Q

Support vector machine (SVM)

A

Classification algorithm that uses a boundary to separate the data into
two or more categories (“classes”).

26
Q

Unsupervised learning

A

Machine learning where the “correct” answer is not known for the data
points in the training set.

27
Q

Accuracy

A

Fraction of data points correctly classified by a model; equal to
(TP + TN) / (TP + FP + TN + FN)

28
Q

Confusion matrix

A

Visualization of classification model performance.
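
The four cells of a two-category confusion matrix can be tallied from actual and predicted labels, and accuracy follows from them. This is a minimal sketch; the function names and the `positive` argument (the label treated as "in the category") are illustrative.

```python
def confusion_counts(actual, predicted, positive):
    """Return (TP, FP, TN, FN) for the category labelled `positive`."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # predicted in the category, and really in it
            else:
                fp += 1  # predicted in the category, but not really in it
        else:
            if a == positive:
                fn += 1  # predicted not in the category, but really in it
            else:
                tn += 1  # predicted not in the category, and really not in it
    return tp, fp, tn, fn

def accuracy(tp, fp, tn, fn):
    """Fraction of all data points classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)
```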

29
Q

Diagnostic odds ratio

A

Ratio of the odds that a data point in a certain category is correctly
classified by a model, to the odds that a data point not in that category
is incorrectly classified by the model; equal to (TP / FN) / (FP / TN) = (TN × TP) / (FN × FP)
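
A short sketch of the calculation, using made-up counts purely for illustration. Note that the ratio is undefined when FP or FN is zero.

```python
def diagnostic_odds_ratio(tp, fp, tn, fn):
    """(TP / FN) / (FP / TN), written as a single fraction."""
    return (tp * tn) / (fp * fn)

# Hypothetical counts: 40 TP, 5 FP, 45 TN, 10 FN.
dor = diagnostic_odds_ratio(tp=40, fp=5, tn=45, fn=10)
```

With these counts, the odds of a correct classification inside the category are 40/10 = 4, the odds of an incorrect classification outside it are 5/45, and their ratio is 36.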

30
Q

Fall-out

A

Fraction of data points not in a certain category that are incorrectly classified by a model; equal to FP / (TN + FP). Also called false positive rate.

31
Q

False negative (FN)

A

Data point that a model incorrectly classifies as not being in a certain
category. (“Negative” means the model classified it as not being in the
category, and “False” means the model’s classification is incorrect.)
Sometimes abbreviated as “FN”.

32
Q

False negative rate

A

Fraction of data points in a certain category that are incorrectly
classified by a model; equal to FN / (TP + FN). Also called miss rate.

33
Q

False positive (FP)

A

Data point that a model incorrectly classifies as being in a certain category. (“Positive” means the model classified it as being in the category, and “False” means the model’s classification is incorrect.) Sometimes abbreviated as “FP”.

34
Q

False positive rate

A

Fraction of data points not in a certain category that are incorrectly classified by a model; equal to FP / (TN + FP). Also called fall-out.

35
Q

False omission rate

A

Fraction of data points the model classifies as not in a certain category that are really in the category; equal to FN / (TN + FN).

36
Q

Hit rate

A

Fraction of data points in a certain category that are correctly classified by a model; equal to TP / (TP + FN). Also called true positive rate, sensitivity, and recall.

37
Q

Miss rate

A

Fraction of data points in a certain category that are incorrectly classified by a model; equal to FN / (TP + FN). Also called false negative rate.

38
Q

Negative likelihood ratio

A

Ratio of the fraction of data points in a certain category that are misclassified as not in the category, to the fraction of data points not in the category that are correctly classified as not being in the category; equal to (1 - sensitivity) / specificity = (FN / (FN + TP)) / (TN / (TN + FP))

39
Q

Negative predictive value

A

Fraction of data points classified as not in a certain category that are really not in that category; equal to TN / (TN + FN).

40
Q

Positive likelihood ratio

A

Ratio of the fraction of data points in a certain category that are correctly classified as being in that category, to the fraction of data points not in the category that are incorrectly classified as being in the category; equal to sensitivity / (1 - specificity) = (TP / (TP + FN)) / (FP / (FP + TN))
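
Both likelihood ratios can be computed from the same four counts. The sketch below uses illustrative function and variable names; it is undefined when specificity is 0 or 1.

```python
def likelihood_ratios(tp, fp, tn, fn):
    """Return (positive likelihood ratio, negative likelihood ratio)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative

# Hypothetical counts: sensitivity = 0.8, specificity = 0.9.
lr_pos, lr_neg = likelihood_ratios(tp=40, fp=5, tn=45, fn=10)
```

Dividing the positive ratio by the negative ratio recovers the diagnostic odds ratio, (TP × TN) / (FP × FN).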

41
Q

Positive predictive value

A

Fraction of data points classified as being in a certain category that are really in that category; equal to TP / (TP + FP). Also called precision.

42
Q

Precision

A

In analytics, the fraction of data points classified as being in a certain category that are really in that category; equal to TP / (TP + FP). Also called positive predictive value.

43
Q

Recall

A

Fraction of data points in a certain category that are correctly classified by a model; equal to TP / (TP + FN). Also called true positive rate, hit rate, and sensitivity.

44
Q

Sensitivity

A

Fraction of data points in a certain category that are correctly classified by a model; equal to TP / (TP + FN). Also called true positive rate, hit rate, and recall.

45
Q

Specificity

A

Fraction of data points not in a certain category that are correctly classified by a model; equal to TN / (TN + FP). Also called the true negative rate.

46
Q

True negative (TN)

A

Data point that a model correctly classifies as not being in a certain category. (“Negative” means the model classified it as not being in the category, and “True” means the model’s classification is correct.) Sometimes abbreviated as “TN”.

47
Q

True negative rate

A

Fraction of data points not in a certain category that are correctly classified by a model; equal to TN / (TN + FP). Also called specificity.

48
Q

True positive (TP)

A

Data point that a model correctly classifies as being in a certain category. (“Positive” means the model classified it as being in the category, and “True” means the model’s classification is correct.) Sometimes abbreviated as “TP”.

49
Q

True positive rate

A

Fraction of data points in a certain category that are correctly classified by a model; equal to TP / (TP + FN). Also called sensitivity, hit rate, and recall.
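
Most of the rate-style terms in these cards reduce to one of seven fractions of the four counts. The sketch below gathers them in one place; the dictionary keys are illustrative names, and each fraction is undefined when its denominator is zero.

```python
def rates(tp, fp, tn, fn):
    """Common classification rates computed from TP, FP, TN, FN."""
    return {
        # Also: recall, hit rate, true positive rate.
        "sensitivity": tp / (tp + fn),
        # Also: true negative rate.
        "specificity": tn / (tn + fp),
        # Also: positive predictive value.
        "precision": tp / (tp + fp),
        "negative_predictive_value": tn / (tn + fn),
        # Also: fall-out.
        "false_positive_rate": fp / (fp + tn),
        # Also: miss rate.
        "false_negative_rate": fn / (fn + tp),
        "false_omission_rate": fn / (fn + tn),
    }
```

Three pairs always sum to 1: sensitivity and false negative rate, specificity and false positive rate, and negative predictive value and false omission rate.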

50
Q
A