Distributions Flashcards

1
Q

How are Y and y defined?

A

Y –> Actual outcome of an event
y –> One of the possible outcomes

Ways of writing the likelihood of a particular outcome y:

P(Y = y)
p(y)

2
Q

What is p(y) called?

A

Since p(y) expresses the probability of each distinct outcome, we call this:

The probability function

3
Q

What do we use to define distributions?

A

2 characteristics:

Mean –> Average value –> μ

Variance –> How spread out the data is –> σ^2

4
Q

How are population and sample data defined?

A

Population data –> All the data

Sample data –> Just a part of it

5
Q

How are sample mean and variance denoted?

A

Sample mean symbol: x̄

Sample variance symbol: s^2

6
Q

How is the standard deviation defined and denoted?

A

Standard deviation –> Square root of the variance:

σ = √(σ^2)

7
Q

Formula for the standard deviation

A

Standard deviation:

Sx = σ = the standard deviation of the number series x.
xi = the value of number i in the series.
μ = the mean of the series (sum of the numbers / how many there are)
nx = the number of values in the experiment.

Standard deviation formula
σ = Sx = √( ∑ (xi - μ)^2 / nx )
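
A minimal Python sketch of the formula on this card, which divides by the number of values and so gives the population standard deviation. The function name population_std and the data values are made up for illustration. The sample version, which divides by n - 1 instead, is printed for comparison.

```python
import math
import statistics

def population_std(values):
    """Standard deviation as on the card: sqrt(sum((x_i - mu)^2) / n)."""
    n = len(values)
    mu = sum(values) / n  # mean of the series
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example values
sigma = population_std(data)

print(sigma)                     # formula from the card (divides by n)
print(statistics.pstdev(data))   # standard-library equivalent
print(statistics.stdev(data))    # sample version (divides by n - 1)
```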

8
Q

Notation for distributions:

A

Variable name
Tilde sign
Type of distribution –> Capital letter (N stands for Normal)
Characteristics (μ, σ^2)

X ~ N (μ, σ^2)

9
Q

Discrete distribution when all outcomes are equally likely?

A

Equiprobable –> Uniform distribution

10
Q

Discrete distribution with only two possible outcomes?

A

Such events follow a Bernoulli distribution
Single trial

11
Q

Discrete distribution when carrying out a similar experiment several times in a row?

A

Binomial Distribution
Two outcomes per iteration
Many iterations –> Multiple trials

12
Q

Discrete distribution for calculating the chance of an outcome when an average rate is given?

A

Poisson distribution

Tests how unusual an event frequency is for a given interval

Example: A team averages 35 points per game. What is the chance of it scoring 12 points in the first quarter of the next game?
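
The example can be worked through with the Poisson probability function, P(Y = k) = e^(-λ) λ^k / k!. The sketch below assumes the 35 points are spread evenly over four quarters, so λ = 35 / 4 for one quarter. That assumption and the helper name poisson_pmf are not part of the card.

```python
import math

def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson distribution with average rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

points_per_game = 35
lam_quarter = points_per_game / 4  # assumption: scoring is even across 4 quarters

p_12 = poisson_pmf(12, lam_quarter)
print(f"P(exactly 12 points in a quarter) = {p_12:.4f}")
```

Under these assumptions the chance comes out at roughly 7%.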

13
Q

Characteristics of normal distribution?

A

Often observed in nature
Values far out in the tails are called outliers

14
Q

When is Student’s t distribution used?

A

A small sample approximation of a Normal distribution

It accommodates extreme values much better

Curve has fatter tails than normal distribution

15
Q

When is the Chi-Squared distribution used?

A

Asymmetric continuous distribution

Only consists of non-negative values

Starts from 0

Often used in hypothesis testing

16
Q

When is the exponential distribution used?

A

Appears with events that change rapidly early on

Example: how online news articles get hits –> Mostly while the topic is still fresh, after which interest dies off.

17
Q

When is the logistic distribution used?

A

Logistic distribution

Useful in forecast analysis

Useful for determining a cut-off point for a successful outcome

18
Q

Discrete distributions characteristics

A

Finitely many distinct outcomes

Every unique outcome has a probability assigned to it

Example: Dartboard –> The possible scores for one dart run from 0 to 60, thus finite

19
Q

How do continuous distributions differ from discrete?

A

Infinitely many consecutive outcomes

Therefore we cannot record the frequency of each distinct value

We cannot represent them with a table, only with a graph

f(y) >= 0

20
Q

Characteristics of the PDF Graph?

A

The function is called PDF (Probability Density Function)

The curve is called the “Probability distribution curve” (PDC)

It is like a discrete distribution, but with infinitely many possible values

It gives the probability density on the y-axis for every possible value ‘y’ on the x-axis

21
Q

Likelihood of each individual outcome in continuous distribution?

A

Likelihood of each individual one is infinitely small

favourable / sample space = 1 / infinite amount

Thus the probability of any individual value is equal to 0 –> P(X = x) = 0

Also: P(X > x) = P(X >= x)

Example of this: Finishing a run in exactly 6 minutes has probability 0, so the probability of finishing in under 6 minutes is the same as the probability of finishing in 6 minutes or less.
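
A small sketch of why a single value has probability 0: the probability of landing within a window around 6 minutes shrinks towards 0 as the window narrows. The normal model for run times (mean 6.5 minutes, standard deviation 0.5 minutes) is assumed purely for illustration.

```python
from statistics import NormalDist

# Assumption for illustration only: run times ~ N(6.5, 0.5^2) minutes.
run_time = NormalDist(mu=6.5, sigma=0.5)

widths = [0.5, 0.05, 0.005, 0.0005]
probs = []
for w in widths:
    # P(6 - w <= X <= 6 + w) is the difference of two CDF values
    p = run_time.cdf(6 + w) - run_time.cdf(6 - w)
    probs.append(p)
    print(f"window of +/- {w} min: P = {p:.6f}")
```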

22
Q

What is the CDF? How is it built up?

A

When you integrate the PDF, you get the CDF

When you differentiate the CDF, you get the PDF

Because the total area under the PDF is 1, the CDF runs from 0 to 1
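
The relationship between PDF and CDF can be checked numerically. This sketch uses the standard normal distribution as an example. The helper names, the lower integration limit of -10 and the step sizes are illustrative choices, not part of the card.

```python
from statistics import NormalDist

dist = NormalDist(mu=0, sigma=1)  # standard normal, chosen only as an example

def cdf_by_integration(x, lower=-10.0, steps=20_000):
    """Approximate the CDF at x by summing PDF areas (midpoint rule)."""
    width = (x - lower) / steps
    return sum(dist.pdf(lower + (i + 0.5) * width) for i in range(steps)) * width

def pdf_by_differentiation(x, h=1e-5):
    """Approximate the PDF at x as the slope of the CDF (central difference)."""
    return (dist.cdf(x + h) - dist.cdf(x - h)) / (2 * h)

x = 1.0
print(cdf_by_integration(x), dist.cdf(x))      # integrating the PDF gives the CDF
print(pdf_by_differentiation(x), dist.pdf(x))  # differentiating the CDF gives the PDF
```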

23
Q

What are the characteristics of the normal distribution?

A

Bell shaped

μ is the mean

Symmetric

Expected value E(X) = μ

24
Q

How would you transform a distribution?

A

Plus/Minus –> Moves the graphs to the right/left

Multiply/Divide (by a constant greater than 1) –> Widens/narrows the graph
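
A quick sketch of both transformations on a small made-up data set: adding a constant shifts the mean and leaves the variance alone, while multiplying by a constant c scales the mean by c and the variance by c^2.

```python
import statistics

data = [3, 5, 8, 10, 14]          # made-up example values
shifted = [x + 4 for x in data]   # add a constant: graph moves to the right
scaled = [x * 2 for x in data]    # multiply by a constant: graph widens

for name, values in [("original", data), ("shifted", shifted), ("scaled", scaled)]:
    print(name, statistics.mean(values), statistics.pvariance(values))
```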

25
Q

What is standardizing?

A

Special kind of transformation where E(X) = 0

Var(X) = 1

The distribution we get after standardizing any normal distribution is called a ‘Standard normal distribution’

26
Q

What is the standard normal distribution table?

A

A table of CDF values for the standard normal distribution

Also called the Z-score table

27
Q

What is the process of standardizing using transformation?

A

Move the center of the graph to the y-axis by subtracting the mean from every element –> - μ

We need to make sure the standard deviation is 1 –> Divide every element by the value of the standard deviation –> / σ

28
Q

How do you calculate the Z-score (based on transformation)?

A

Z = (Y - μ) / σ
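
A minimal sketch of the Z-score formula applied to a made-up data set. The function name standardize is hypothetical, and the population standard deviation is used for σ. After standardizing, the mean is 0 and the standard deviation is 1.

```python
import statistics

def standardize(values):
    """Return z-scores: subtract the mean, then divide by the standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    return [(y - mu) / sigma for y in values]

data = [2, 4, 4, 4, 5, 5, 7, 9]        # made-up example values
z = standardize(data)

print(z)
print(statistics.mean(z), statistics.pstdev(z))  # mean 0, standard deviation 1
```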

29
Q

Define Student’s t distribution

A

t (k) –> Where k is degrees of freedom

Y ~ t(3) –> Variable Y follows t distribution with 3 degrees of freedom

30
Q

When do you use Student’s t distribution and what are the key differences from the normal distribution?

A

You use this when there’s not sufficient data for the normal distribution

Small sample size approximation of a Normal Distribution

Another key difference is that apart from mean and variance you must also define degrees of freedom for the distribution

Graph is also bell shaped but with fatter tails to accommodate the occurrence of values far away from the mean

31
Q

Difference statistics vs characteristics

A

Statistics:
Sample
60% of 1000 people have brown eyes –> Statistic

Characteristics:
Population
If 65% of people worldwide have brown eyes then that is a characteristic of the human population

32
Q

How can you determine the distribution type and what can you do with it?

A

Shape of the curve
Characteristics mu and sigma

You can create models (like regression models)

33
Q

How do statistics relate to data science?

A

Data science is an expansion of probability, statistics and programming that uses computational technology to solve more advanced questions

34
Q

How does ML fit into the statistics world?

A

ML relies on expected values a lot.

ML is trial and error where a computer adjusts its expected value along the way

There is always a probability of failure due to unforeseen events (earthquakes, etc.)
