Statistics and distributions Flashcards
Define a statistic
A value calculated from data. It can be used to describe/summarise the data or infer properties of the underlying distribution.
What is the significance of the law of large numbers to inferring properties of a distribution using statistics?
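As the sample size grows, the sample mean, standard deviation, correlations, covariances and other statistics converge in probability to the corresponding population values, so statistics from large samples give trustworthy estimates of the underlying distribution.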
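A minimal simulation sketch of this convergence for the sample mean. The population (normal with mean 5 and std 2), the seed and the helper name `sample_mean_error` are illustrative assumptions, not part of the original card.

```python
import random
import statistics


def sample_mean_error(n, mu=5.0, sigma=2.0, seed=0):
    """Absolute gap between the mean of n normal draws and the true mean mu.

    mu, sigma and seed are illustrative choices (assumptions).
    """
    rng = random.Random(seed)
    draws = [rng.gauss(mu, sigma) for _ in range(n)]
    return abs(statistics.fmean(draws) - mu)


# The gap tends to shrink as n grows, although not necessarily
# monotonically for any single seed.
for n in (10, 1_000, 100_000):
    print(n, sample_mean_error(n))
```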
Define reliability.
The degree to which a measure is correlated with itself over multiple occasions.
Define validity.
The degree to which the variable actually measures the thing it is supposed to measure, i.e. how strongly the measure is correlated with the truth.
Can there be validity without reliability? Why?
No. For a test to be valid, it must correlate with the truth, but if two instances of a test aren’t correlated with each other, then at least one of them cannot be correlated with the true value.
Does reliability imply anything about validity?
No. A test can be reliable (i.e. repeatable) but still not correlate with the value it is trying to measure.
What is the standard normal distribution?
A normal distribution with mean 0 and variance 1.
X and Y are normally distributed. Can we say anything about a(X + Y) for a != 0?
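Yes, provided X and Y are independent (or, more generally, jointly normal): then a(X + Y) is also normally distributed. Normality of X and Y individually is not enough to guarantee this.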
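A short sketch of the independent case using `statistics.NormalDist` from the Python standard library, whose `+` between two instances assumes independence. The parameter values are made up for illustration.

```python
from statistics import NormalDist

# Illustrative parameters (assumptions): X ~ N(1, 3^2), Y ~ N(2, 4^2),
# with X and Y independent.
X = NormalDist(mu=1.0, sigma=3.0)
Y = NormalDist(mu=2.0, sigma=4.0)
a = 2.0

# NormalDist addition treats the operands as independent:
# means add, variances add. Scaling by a multiplies the mean by a
# and the standard deviation by |a|.
Z = (X + Y) * a

print(Z.mean, Z.stdev)  # mean a*(1+2) = 6, std |a|*sqrt(9+16) = 10
```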
Define the z-score.
For a normal distribution, the z-score is the number of standard deviations by which a value x of the random variable lies above or below the mean.
z = (x - \mu) / \sigma
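A worked instance of the formula as a small sketch. The function name `z_score` and the example numbers (mean 100, std 15) are assumptions chosen for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical example: a value of 130 with mean 100 and std 15.
print(z_score(130, 100, 15))  # 2.0
```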
What does the critical value z_{\alpha / 2} represent?
The probability that a standard normal random variable lies between \pm this value is 1 - \alpha.
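A sketch of how the critical value could be computed and checked with `statistics.NormalDist`. The helper name `critical_value` and the choice \alpha = 0.05 are illustrative assumptions.

```python
from statistics import NormalDist


def critical_value(alpha):
    """Return z such that a standard normal lies in [-z, z] with probability 1 - alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be in (0, 1)")
    # The upper tail beyond z has area alpha / 2, so z is the
    # (1 - alpha / 2) quantile of the standard normal.
    return NormalDist().inv_cdf(1 - alpha / 2)


alpha = 0.05  # illustrative choice
z = critical_value(alpha)
coverage = NormalDist().cdf(z) - NormalDist().cdf(-z)
print(round(z, 2), round(coverage, 2))  # 1.96 0.95
```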
What is a sampling distribution?
The probability distribution of a statistic computed from random samples of a given size.
Define the standard error.
The standard deviation of a statistic’s sampling distribution.
What is the SE for the sample mean?
\sigma / \sqrt{N}, where \sigma is the population standard deviation and N is the sample size.
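A simulation sketch comparing this formula with the empirical spread of many sample means. The population (normal, \sigma = 2), the sample size N = 25, the number of trials and the seed are all illustrative assumptions.

```python
import math
import random
import statistics


def simulate_se(n, sigma=2.0, mu=0.0, trials=5000, seed=1):
    """Empirical std of the sample mean over repeated samples of size n.

    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)


n = 25
empirical = simulate_se(n)
theoretical = 2.0 / math.sqrt(n)  # sigma / sqrt(N) = 0.4
print(empirical, theoretical)
```

The two numbers should agree closely, and the match tightens as the number of trials grows.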