Mathematics Flashcards

Question 1

Q

Canberra distance

Answer

A

a numerical measure of the distance between pairs of points in a vector space
d(P, Q) = Sum(|Pi - Qi| / (|Pi| + |Qi|))
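
The formula above can be sketched in plain Python. The function name `canberra_distance` is illustrative, and the handling of a coordinate where both values are zero (it contributes 0 rather than 0/0) is an assumption, since the card does not say what to do in that case.

```python
def canberra_distance(p, q):
    """Sum of |p_i - q_i| / (|p_i| + |q_i|) over all coordinates."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        denom = abs(pi) + abs(qi)
        # Assumption: when both coordinates are 0 the term would be 0/0,
        # so it is skipped (treated as contributing 0).
        if denom != 0:
            total += abs(pi - qi) / denom
    return total


print(canberra_distance([1, 2, 0], [3, 2, 0]))  # 0.5
```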

Question 2

Q

Euclidean distance

Answer

A

d(P, Q) = sqrt(Sum((Pi - Qi) ** 2))
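
A minimal Python sketch of this formula, using only the standard library. The name `euclidean_distance` is chosen here for illustration.

```python
import math


def euclidean_distance(p, q):
    """Square root of the sum of squared coordinate differences."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```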

Question 3

Q

Manhattan distance

Answer

A

d(P, Q) = Sum(|Pi - Qi|)
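
The same formula as a short Python sketch. The name `manhattan_distance` is illustrative.

```python
def manhattan_distance(p, q):
    """Sum of absolute coordinate differences."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


print(manhattan_distance([0, 0], [3, 4]))  # 7
```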

Question 4

Q

Empirical distribution function

Answer

A

The empirical distribution function is the distribution function associated with the empirical measure of a sample. This cumulative distribution function is a step function that jumps up by 1/n at each of the n data points. Its value at any specified value of the measured variable is the fraction of observations that are less than or equal to that value.
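
As a rough illustration of the definition, the sketch below builds the function for a small made-up sample. The helper name `ecdf` and the sample values are assumptions for the example only.

```python
def ecdf(sample):
    """Return F where F(x) is the fraction of observations <= x."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")

    def F(x):
        return sum(1 for v in sample if v <= x) / n

    return F


F = ecdf([2, 1, 3, 3])
print(F(0), F(1), F(2.5), F(3))  # 0.0 0.25 0.5 1.0
```

Because the value 3 appears twice in this sample, the function jumps by 2/n at that point (one step of 1/n per observation).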

Question 5

Q

Feature scaling

Answer

A

a method used to standardize the range of independent variables

Question 6

Q

Rescaling

Answer

A

(x - min(x)) / range(x)
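
A small Python sketch of this formula, taking range(x) to mean max(x) - min(x). The name `rescale` and the guard for a zero range are additions for the example.

```python
def rescale(xs):
    """Map values to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        # Assumption: a constant feature cannot be rescaled this way.
        raise ValueError("range is zero")
    return [(x - lo) / (hi - lo) for x in xs]


print(rescale([10, 20, 30]))  # [0.0, 0.5, 1.0]
```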

Question 7

Q

Mean normalization

Answer

A

(x - mean(x)) / range(x)
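
A small Python sketch of this formula, again taking range(x) to mean max(x) - min(x). The name `mean_normalize` is illustrative.

```python
def mean_normalize(xs):
    """Center on the mean, then divide by (max - min)."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        # Assumption: a constant feature is rejected rather than mapped to 0.
        raise ValueError("range is zero")
    mean = sum(xs) / len(xs)
    return [(x - mean) / (hi - lo) for x in xs]


print(mean_normalize([10, 20, 30]))  # [-0.5, 0.0, 0.5]
```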

Question 8

Q

Standardization

Answer

A

(x - mean(x)) / standard_deviation(x)
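
A short sketch using Python's `statistics` module. The card does not say whether the population or the sample standard deviation is meant; this example assumes the population version (`statistics.pstdev`). Swap in `statistics.stdev` for the sample version.

```python
import statistics


def standardize(xs):
    """Subtract the mean and divide by the standard deviation."""
    mean = statistics.mean(xs)
    # Assumption: population standard deviation (divides by n, not n - 1).
    sd = statistics.pstdev(xs)
    if sd == 0:
        raise ValueError("standard deviation is zero")
    return [(x - mean) / sd for x in xs]


# This sample has mean 5 and population standard deviation 2.
print(standardize([2, 4, 4, 4, 5, 5, 7, 9]))
# [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```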

Question 9

Q

scaling to unit length

Answer

A

x / euclidean_length(x)
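
As a sketch, dividing every component of a vector by its Euclidean length gives a vector of length 1. The name `to_unit_length` is illustrative.

```python
import math


def to_unit_length(x):
    """Divide each component by the vector's Euclidean length."""
    length = math.sqrt(sum(c * c for c in x))
    if length == 0:
        raise ValueError("the zero vector has no direction to preserve")
    return [c / length for c in x]


print(to_unit_length([3, 4]))  # [0.6, 0.8]
```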

Question 10

Q

Euclidean length

Answer

A

Also called the magnitude of a vector; it measures the length of the vector.

||x|| = sqrt(x1 ** 2 + x2 ** 2 + ... + xn ** 2)
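
The formula can be checked with a few lines of Python. The name `euclidean_length` mirrors the wording of the card and is not taken from any library.

```python
import math


def euclidean_length(x):
    """Square root of the sum of squared components."""
    return math.sqrt(sum(c ** 2 for c in x))


print(euclidean_length([1, 2, 2]))  # 3.0
```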

Question 11

Q

Binomial distribution

Answer

A

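The distribution of the number of successes in n independent true-or-false (success or failure) trials, each with the same probability of success p.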
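
For illustration, the probability of exactly k "true" outcomes in n independent true-or-false trials, each true with probability p, is C(n, k) * p ** k * (1 - p) ** (n - k). The card itself does not state this formula; the sketch below uses the standard one, and the name `binomial_pmf` is illustrative.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


# Three fair coin flips: probability of 0, 1, 2, 3 heads.
print([binomial_pmf(k, 3, 0.5) for k in range(4)])
# [0.125, 0.375, 0.375, 0.125]
```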

Question 12

Q

Poisson process

Answer

A

Usually used in scenarios where we count the occurrences of events that happen at a certain average rate but completely at random.

The Poisson distribution of the counts can be derived from the binomial distribution by letting the number of trials go to infinity while the expected number of events stays fixed.
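
The link to the binomial distribution can be checked numerically. The sketch below holds the expected count `lam` = n * p fixed and lets the number of trials n grow; the function names and the values of `lam` and `k` are assumptions made for the example.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Probability of exactly k events when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


lam, k = 2.0, 3
for n in (10, 100, 10_000):
    # p shrinks as n grows so that n * p stays equal to lam.
    print(n, binomial_pmf(k, n, lam / n), poisson_pmf(k, lam))
```

The binomial value moves toward the Poisson value as n increases.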

Question 13

Q

Entropy

Answer

A

A measure of disorder in the data

E = - sum(Pi * log2(Pi))

Pi: the probability (relative frequency) of each category in the data
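
A minimal Python sketch of the entropy formula, estimating each probability as the relative frequency of a label in a list. The name `entropy` and the example labels are illustrative.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy in bits: the sum of -p * log2(p) over categories."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())


print(entropy(["a", "a", "b", "b"]))  # 1.0
```

A list with a single category has entropy 0, and an even split between two categories gives the maximum of 1 bit.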

Question 14

Q

Information gain

Answer

A

A measure of the reduction in data disorder (entropy) as a result of partitioning the data
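
The card gives no formula, so the sketch below assumes the definition commonly used for decision trees: the entropy of the parent set minus the size-weighted average entropy of the partitions. The function names and the example labels are illustrative.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy in bits of a list of category labels."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(parent, partitions):
    """Parent entropy minus the size-weighted entropy of the partitions.

    Assumption: the partitions together contain exactly the parent's items.
    """
    n = len(parent)
    weighted = sum(len(part) / n * entropy(part) for part in partitions)
    return entropy(parent) - weighted


parent = ["yes", "yes", "no", "no"]
print(information_gain(parent, [["yes", "yes"], ["no", "no"]]))  # 1.0
print(information_gain(parent, [["yes", "no"], ["yes", "no"]]))  # 0.0
```

The first split separates the two classes perfectly and removes all disorder; the second leaves each part as mixed as the parent, so nothing is gained.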

(14 cards)