Mathematics Flashcards

1
Q

Canberra distance

A

A numerical measure of the distance between pairs of points in a vector space, in which each coordinate's absolute difference is divided by the sum of the two absolute values
d(P, Q) = Sum(|Pi - Qi| / (|Pi| + |Qi|))
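A minimal Python sketch of the formula above. The function name `canberra_distance` is illustrative, and treating a 0/0 term (both coordinates zero) as 0 is an assumption the card does not state.

```python
def canberra_distance(p, q):
    """Sum of |p_i - q_i| / (|p_i| + |q_i|) over all coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    total = 0.0
    for pi, qi in zip(p, q):
        denom = abs(pi) + abs(qi)
        # Assumption: when both coordinates are zero, the term counts as 0.
        if denom != 0:
            total += abs(pi - qi) / denom
    return total


print(canberra_distance([1, 2], [3, 2]))
```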

2
Q

Euclidean distance

A

d(P, Q) = sqrt(Sum((Pi - Qi) ** 2))
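The same formula as a short Python sketch, using only the standard library. The name `euclidean_distance` is illustrative.

```python
import math


def euclidean_distance(p, q):
    """Square root of the sum of squared coordinate differences."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


print(euclidean_distance([0, 0], [3, 4]))  # 3-4-5 triangle, so 5.0
```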

3
Q

Manhattan distance

A

d(P, Q) = Sum(|Pi - Qi|)
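A short Python sketch of the formula. The name `manhattan_distance` is illustrative.

```python
def manhattan_distance(p, q):
    """Sum of absolute coordinate differences."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


print(manhattan_distance([1, 2], [4, 6]))  # |1 - 4| + |2 - 6| = 7
```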

4
Q

Empirical distribution function

A

The empirical distribution function is the distribution function associated with the empirical measure of a sample. This cumulative distribution function is a step function that jumps up by 1/n at each of the n data points. Its value at any specified value of the measured variable is the fraction of observations of the measured variable that are less than or equal to the specified value.
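One way to sketch this in Python is to sort the sample once and count the observations less than or equal to a given value. The name `ecdf` and the closure-based design are illustrative choices, not part of the card.

```python
import bisect


def ecdf(sample):
    """Return a function F where F(x) is the fraction of observations <= x."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")
    sorted_sample = sorted(sample)

    def F(x):
        # bisect_right gives the number of sorted observations that are <= x.
        return bisect.bisect_right(sorted_sample, x) / n

    return F


F = ecdf([3, 1, 2, 2])
print(F(0), F(1), F(2), F(3))
```

With four observations, each data point adds 1/4 to the step function; the repeated value 2 adds 2/4 in a single jump.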

5
Q

Feature scaling

A

A method used to standardize the range of independent variables (features)

6
Q

Rescaling

A

(x - min(x)) / range(x)
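A minimal Python sketch, taking range(x) to mean max(x) - min(x). The name `rescale` is illustrative, and raising an error for a zero range is an assumption about how to handle constant data.

```python
def rescale(xs):
    """Map values to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        # Assumption: constant data has no meaningful rescaling.
        raise ValueError("range is zero; cannot rescale")
    return [(x - lo) / (hi - lo) for x in xs]


print(rescale([2, 4, 6]))  # [0.0, 0.5, 1.0]
```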

7
Q

Mean normalization

A

(x - mean(x)) / range(x)
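A minimal Python sketch, again taking range(x) to mean max(x) - min(x). The name `mean_normalize` is illustrative.

```python
def mean_normalize(xs):
    """Center values on the mean and divide by the range (max - min)."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        # Assumption: constant data is rejected rather than mapped to zeros.
        raise ValueError("range is zero; cannot normalize")
    mean = sum(xs) / len(xs)
    return [(x - mean) / (hi - lo) for x in xs]


print(mean_normalize([2, 4, 6]))  # [-0.5, 0.0, 0.5]
```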

8
Q

Standardization

A

(x - mean(x)) / standard_deviation(x)
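A minimal Python sketch using the standard `statistics` module. The card does not say whether the population or sample standard deviation is meant; this sketch assumes the population version (`statistics.pstdev`). The name `standardize` is illustrative.

```python
import statistics


def standardize(xs):
    """Return z-scores: (x - mean) / standard deviation."""
    mean = statistics.fmean(xs)
    # Assumption: population standard deviation (divide by n, not n - 1).
    sd = statistics.pstdev(xs)
    if sd == 0:
        raise ValueError("standard deviation is zero; cannot standardize")
    return [(x - mean) / sd for x in xs]


z = standardize([2, 4, 6])
print(z)
```

Under this assumption the result has mean 0 and population standard deviation 1.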

9
Q

Scaling to unit length

A

x / euclidean_length(x)
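A minimal Python sketch in which `euclidean_length` is the square root of the sum of squared components. Both function names are illustrative, and rejecting the zero vector is an assumption.

```python
import math


def euclidean_length(x):
    """Square root of the sum of squared components."""
    return math.sqrt(sum(xi ** 2 for xi in x))


def scale_to_unit_length(x):
    """Divide every component by the vector's Euclidean length."""
    norm = euclidean_length(x)
    if norm == 0:
        # Assumption: the zero vector has no direction, so it is rejected.
        raise ValueError("zero vector cannot be scaled to unit length")
    return [xi / norm for xi in x]


u = scale_to_unit_length([3, 4])
print(u, euclidean_length(u))
```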

10
Q

Euclidean length

A

Also called the magnitude (or norm) of a vector; it measures the length of the vector.

||x|| = sqrt(x1 ** 2 + x2 ** 2 + ... + xn ** 2)

11
Q

Binomial distribution

A

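The distribution of the number of successes in n independent true-or-false (success/failure) trials, each with the same probability of success p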

12
Q

Poisson process

A

Usually used in scenarios where we are counting the occurrences of certain events that appear to happen at a certain rate, but completely at random

The resulting Poisson distribution of counts can be derived from the binomial distribution by letting the number of trials go to infinity while the expected number of successes (n * p) stays fixed.
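The limit can be checked numerically with a short Python sketch: fix a rate `lam`, set p = lam / n, and compare the binomial probability with the Poisson probability as n grows. The function names and the choice lam = 3 are illustrative assumptions; `math.comb` requires Python 3.8 or later.

```python
import math


def binomial_pmf(k, n, p):
    """P(exactly k successes in n trials with success probability p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(exactly k events) for a Poisson distribution with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


lam = 3  # illustrative rate: expected number of events
for n in (10, 100, 10_000):
    # Keep n * p fixed at lam while the number of trials grows.
    print(n, binomial_pmf(2, n, lam / n), poisson_pmf(2, lam))
```

The two probabilities should move closer together as n increases.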

13
Q

Entropy

A

A measure of disorder in the data

E = - sum(Pi * log2(Pi))

Pi: the probability (proportion) of each category in the data
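A minimal Python sketch of the formula, estimating each Pi as the share of labels that fall in that category. The name `entropy` and the use of observed frequencies as probabilities are assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits: -sum(Pi * log2(Pi)) over the categories."""
    n = len(labels)
    if n == 0:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    # Pi is estimated as the observed proportion of each category.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


print(entropy(["a", "a", "b", "b"]))  # two equally likely categories: 1.0 bit
```

A sample with a single category has entropy 0, and an even split between two categories gives the two-category maximum of 1 bit.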

14
Q

Information gain

A

A measure of the reduction in the data disorder as a result of partitioning
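The card gives no formula. A common formulation, assumed here, is the entropy of the data before the split minus the size-weighted average entropy of the partitions. The function names are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits of a list of category labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(parent, partitions):
    """Parent entropy minus the size-weighted entropy of the partitions.

    Assumption: the partitions together contain exactly the parent's labels.
    """
    n = len(parent)
    weighted = sum(len(part) / n * entropy(part) for part in partitions)
    return entropy(parent) - weighted


labels = ["yes", "yes", "no", "no"]
print(information_gain(labels, [["yes", "yes"], ["no", "no"]]))  # 1.0
print(information_gain(labels, [["yes", "no"], ["yes", "no"]]))  # 0.0
```

The first split separates the classes completely and removes all disorder; the second leaves each partition as mixed as the original data, so nothing is gained.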
