1st lecture Flashcards
Probability Distribution
tells us the probabilities of observing different values within a population
for a continuous RV, the probability density function (pdf) is the function that describes this probability distribution
Discrete RV
can take on only certain specific values
Continuous RV
can take any value within an interval
the probability that a continuous RV will take on a specific value is always zero
probabilities for continuous RVs are described by pdfs; the probability of an interval is the area under the pdf over that interval
Measures of Central Tendency
median
mean
mode
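A minimal sketch of the three measures, using Python's standard `statistics` module. The data values are made up for illustration.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # hypothetical sample, not from the lecture

mean = statistics.mean(data)      # arithmetic average of the values
median = statistics.median(data)  # middle of the sorted values (average of the two middle ones here)
mode = statistics.mode(data)      # most frequently occurring value

print(mean, median, mode)
```

The three values differ here because the sample is skewed to the right, which pulls the mean above the median.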
Variance
measures the dispersion of a probability distribution around its mean: Var(X) = E[(X - E[X])²]
Range
difference between the largest and smallest of the observed data points
Interquartile range
difference between the 75th and 25th percentiles
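The three dispersion measures above can be sketched as follows. The data are made up, the variance shown is the population variance, and the quartiles use the default method of `statistics.quantiles`; other percentile conventions give slightly different IQR values.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # hypothetical sample, not from the lecture

variance = statistics.pvariance(data)  # population variance: mean squared deviation from the mean
data_range = max(data) - min(data)     # largest minus smallest observation

# quantiles(n=4) returns the three cut points Q1, Q2 (median), Q3
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                          # 75th percentile minus 25th percentile

print(variance, data_range, iqr)
```

Unlike the range, the IQR ignores the extreme values, so a single outlier changes it very little.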
Covariance & Correlation
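Covariance shows the direction in which two variables move together; its magnitude depends on the units of the variables
Correlation shows the direction AND the strength of the linear relationship between two variables; it is the covariance divided by both standard deviations, so it always lies between -1 and 1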
Cov(X,Y) = 0
the variables are uncorrelated (no linear relationship); independence implies Cov(X,Y) = 0, but zero covariance alone does not imply independence
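A small counterexample helps when reading this card: zero covariance rules out only a linear relationship. In the sketch below (values chosen for illustration), Y is completely determined by X, yet the covariance is zero.

```python
xs = [-1, 0, 1]           # X takes each value with probability 1/3
ys = [x ** 2 for x in xs]  # Y = X^2, so Y depends fully on X

def mean(values):
    return sum(values) / len(values)

# Cov(X, Y) = E[XY] - E[X] E[Y]
cov_xy = mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

print(cov_xy)
```

The covariance vanishes because the relationship is symmetric around zero, not because the variables are unrelated.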
E[X · Y] = …
E[X] · E[Y] + Cov(X,Y)
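The identity can be checked numerically. This is a sketch on made-up paired data, treating the four pairs as an entire population so that all expectations are plain averages.

```python
xs = [1, 2, 3, 4]  # hypothetical values of X
ys = [2, 1, 4, 3]  # hypothetical values of Y, paired with xs

def mean(values):
    return sum(values) / len(values)

mx, my = mean(xs), mean(ys)

# Population covariance: average product of deviations from the means
cov_xy = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

lhs = mean([x * y for x, y in zip(xs, ys)])  # E[XY]
rhs = mx * my + cov_xy                       # E[X] E[Y] + Cov(X, Y)

print(lhs, rhs)
```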
Var(constant) =
0
Var(cY+d) = …
c²Var(Y)
Var(cX+dY) = …
c²Var(X) + d²Var(Y) +2cdCov(X,Y)
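A numerical check of the rule for the variance of a linear combination. The data and the constants c and d are arbitrary illustrative choices, and all moments are population moments.

```python
xs = [1, 2, 3, 4]  # hypothetical values of X
ys = [2, 1, 4, 3]  # hypothetical values of Y, paired with xs
c, d = 2, -3       # arbitrary constants

def mean(values):
    return sum(values) / len(values)

def var(values):
    m = mean(values)
    return mean([(v - m) ** 2 for v in values])

def cov(us, vs):
    mu, mv = mean(us), mean(vs)
    return mean([(u - mu) * (v - mv) for u, v in zip(us, vs)])

lhs = var([c * x + d * y for x, y in zip(xs, ys)])                   # Var(cX + dY)
rhs = c ** 2 * var(xs) + d ** 2 * var(ys) + 2 * c * d * cov(xs, ys)  # right-hand side of the rule

print(lhs, rhs)
```

With d = 0 the covariance term drops out and the rule reduces to Var(cX) = c²Var(X); adding a constant shifts every value equally and leaves the variance unchanged.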
Cov(X+Y, Z) = …
Cov(X,Z) + Cov(Y,Z)
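The additivity of covariance in its first argument can be verified the same way. The three data series below are made up, and the covariance is the population covariance.

```python
xs = [1, 2, 3, 4]  # hypothetical values of X
ys = [2, 1, 4, 3]  # hypothetical values of Y
zs = [1, 0, 2, 5]  # hypothetical values of Z

def mean(values):
    return sum(values) / len(values)

def cov(us, vs):
    mu, mv = mean(us), mean(vs)
    return mean([(u - mu) * (v - mv) for u, v in zip(us, vs)])

sums = [x + y for x, y in zip(xs, ys)]  # element-wise X + Y

lhs = cov(sums, zs)              # Cov(X + Y, Z)
rhs = cov(xs, zs) + cov(ys, zs)  # Cov(X, Z) + Cov(Y, Z)

print(lhs, rhs)
```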
What is the meaning of “unimodal”?
Only one peak in the distribution
Standard normally distributed variable:
Z = (Y - mean) / stdev, so that Z has mean 0 and standard deviation 1
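Standardization can be sketched on a made-up sample. Subtracting the mean and dividing by the standard deviation gives values with mean 0 and standard deviation 1; the result is standard normal only if the original variable was normal. The population standard deviation is used here.

```python
import statistics

ys = [2, 3, 3, 5, 7, 10]  # hypothetical observations

mu = statistics.mean(ys)       # mean of the observations
sigma = statistics.pstdev(ys)  # population standard deviation

zs = [(y - mu) / sigma for y in ys]  # standardized values

z_mean = statistics.mean(zs)    # approximately 0
z_stdev = statistics.pstdev(zs)  # approximately 1

print(z_mean, z_stdev)
```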
Cumulative distribution function
F(x) = P(X ≤ x): the probability that an RV takes a value less than or equal to x; the probability of lying above x is 1 - F(x)
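As an illustration, the cdf of a standard normal variable can be evaluated with `statistics.NormalDist` from the standard library (Python 3.8 or later). The cut-off 1.96 is an arbitrary choice for the example.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal distribution

p_below = z.cdf(1.96)       # F(1.96) = P(Z <= 1.96)
p_above = 1 - p_below       # P(Z > 1.96), the complement of the cdf
p_at_most_zero = z.cdf(0)   # half of the probability mass lies at or below the mean

print(p_below, p_above, p_at_most_zero)
```

The probability of an interval follows from two cdf values: P(a < Z ≤ b) = F(b) - F(a).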
χ² distribution
distribution of the sum of squares of n independent standard normal random variables; it has n degrees of freedom
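A simulation sketch of this construction: each draw sums the squares of n independent standard normal values. The choice n = 3, the number of draws and the seed are arbitrary. The theoretical mean of the distribution equals its degrees of freedom, which the sample mean should roughly match.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

n = 3          # degrees of freedom (arbitrary choice)
draws = 20000  # number of simulated values

def chi_square_draw(df):
    # Sum of squares of df independent standard normal draws
    return sum(random.gauss(0, 1) ** 2 for _ in range(df))

samples = [chi_square_draw(n) for _ in range(draws)]
sample_mean = sum(samples) / draws  # should be close to n

print(sample_mean)
```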
Degrees of freedom - …
Number of independent values that can vary in an analysis without breaking any constraints (observations - parameters)
F-distribution: F(n1,n2)
distribution of the ratio of two independent χ²-distributed variables, each divided by its own degrees of freedom n1 and n2
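The ratio construction can be simulated directly. This is a sketch with arbitrary degrees of freedom n1 = 3 and n2 = 10; for n2 > 2 the theoretical mean is n2 / (n2 - 2), which serves as a rough check.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

n1, n2 = 3, 10  # numerator and denominator degrees of freedom (arbitrary)
draws = 20000   # number of simulated values

def chi_square_draw(df):
    # Sum of squares of df independent standard normal draws
    return sum(random.gauss(0, 1) ** 2 for _ in range(df))

def f_draw(df1, df2):
    # Each chi-square variable is divided by its own degrees of freedom
    return (chi_square_draw(df1) / df1) / (chi_square_draw(df2) / df2)

samples = [f_draw(n1, n2) for _ in range(draws)]
sample_mean = sum(samples) / draws
theoretical_mean = n2 / (n2 - 2)  # valid only for n2 > 2

print(sample_mean, theoretical_mean)
```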
CLT - central limit theorem:
allows us to make inferences about a whole population using sample data, largely regardless of the population’s original distribution (as long as it has a finite variance).
If we take many independent samples of size n from a population and calculate their means, the distribution of these sample means is approximately normal for sufficiently large n, even if the original distribution is not normal.
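A simulation sketch of the theorem, drawing from an exponential distribution because it is clearly skewed. The sample size, the number of samples and the seed are arbitrary choices. An exponential distribution with rate 1 has mean 1 and standard deviation 1, so the sample means should centre on 1 with a spread of about 1 / sqrt(n).

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

sample_size = 50    # n, observations per sample (arbitrary)
num_samples = 5000  # number of samples drawn (arbitrary)

def sample_mean(n):
    # Mean of n independent draws from a skewed (exponential) population
    values = [random.expovariate(1.0) for _ in range(n)]
    return sum(values) / n

sample_means = [sample_mean(sample_size) for _ in range(num_samples)]

mean_of_means = statistics.mean(sample_means)  # close to the population mean 1
sd_of_means = statistics.pstdev(sample_means)  # close to 1 / sqrt(n)
expected_sd = 1 / math.sqrt(sample_size)

print(mean_of_means, sd_of_means, expected_sd)
```

A histogram of `sample_means` would look roughly bell-shaped, while a histogram of the raw exponential draws would not.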
What does a regression do?
it describes and evaluates the empirical relationship between dependent and independent variables using a sample of data
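As one concrete instance, a simple linear regression with a single independent variable can be fitted by ordinary least squares. This is a sketch on made-up data; the card itself does not specify an estimation method, so OLS is an assumption here.

```python
xs = [1, 2, 3, 4, 5]              # hypothetical independent variable
ys = [2.1, 3.9, 6.2, 7.8, 10.1]   # hypothetical dependent variable

def mean(values):
    return sum(values) / len(values)

x_bar, y_bar = mean(xs), mean(ys)

# OLS slope: sum of cross-deviations divided by sum of squared x-deviations
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))

# The fitted line passes through the point of means
intercept = y_bar - slope * x_bar

print(slope, intercept)
```

The slope estimates how much the dependent variable changes, on average, when the independent variable increases by one unit.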