Statistics Flashcards
3 basic classifications of statistics:
Classical statistics - parameters (mu, sigma^2) are unknown to us but fixed, and we want to make inferences about them from sample quantities such as X bar.
Bayesian statistics - parameters are not fixed, more parametric, you have to impose a distribution on them
Nonparametric statistics - does not assume normality, makes the fewest assumptions
What are Descriptive Statistics?
- First approach to turn data into information
- Summarize large amounts of data - ease of interpretation
- It consists of tables, graphs, summary measures, images or
anything that illustrates the information contained in the data.
-A picture is worth a thousand words
Types of Statistical Variables:
1) Qualitative: sex, socioeconomic status, marital status
2) Quantitative:
a) Discrete - e.g. the number of times a particular phenomenon has happened.
b) Continuous - indicates the result of a random
experiment whose sample space (set of possibilities) is uncountable.
Type of Statistical Data:
1) Ordinal: 1, 2, 3, …; A, B, C, …
2) Non-ordinal: Married, Divorced, Single, Widowed …
3) Time Series: Poverty over time
4) Cross Section: Population in 200 countries at a given time (say for January 2010)
5) Panel Data: Population in 200 countries over the last 30 years.
numerical measures:
• Location: Average, median, mode, quartiles, quintiles,
deciles, percentiles (quantiles in general), trimmed mean, weighted mean, geometric mean, harmonic mean, etc.
• Scale: Range, interquartile range, variance, pseudovariance, standard deviation, etc.
• Other: Coefficient of Variation, Sharpe Ratio, skewness, kurtosis…
mean:
the arithmetic average
mean = ∑X / N
-it is important to remember that although the mean provides a useful piece of information, it does not tell you how spread out the scores are (variance), whether outliers are skewing it, etc.
median
the number in the distribution that marks the 50th percentile/the number in the middle of the entire distribution
mode
the number that has the highest frequency(occurs most often)
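A minimal sketch of the three measures above, using Python's standard statistics module. The scores are made-up data chosen only for illustration.

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]  # hypothetical data

mean = statistics.mean(scores)      # sum of the values / number of values
median = statistics.median(scores)  # middle of the sorted values
mode = statistics.mode(scores)      # value that occurs most often

print(mean, median, mode)
```

With an even number of values, the median is the average of the two middle values.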
Quantiles
quartiles: split the ranked data into 4 segments with an equal number of values per segment
quintiles: split the ranked data into 5 segments…
deciles: split the ranked data into 10 segments…
percentiles: split the ranked data into 100 segments…
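As a rough illustration, the cut points that split ranked data into equal segments can be computed with statistics.quantiles. The data set (the integers 0 to 100) and the choice of the "inclusive" method are assumptions made for this sketch; other interpolation conventions give slightly different cut points.

```python
import statistics

data = list(range(101))  # hypothetical ranked data: 0, 1, ..., 100

# n is the number of equal-sized segments; the result is the n - 1 cut points.
quartiles = statistics.quantiles(data, n=4, method="inclusive")
quintiles = statistics.quantiles(data, n=5, method="inclusive")
deciles = statistics.quantiles(data, n=10, method="inclusive")

print(quartiles)
print(quintiles)
print(deciles)
```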
Trimmed mean/Truncated mean
A method of averaging that removes a small percentage of the largest and smallest values before calculating the mean.
- after removing the specified observations, the trimmed mean is found using the ordinary arithmetic averaging formula (see the website below).
https://www.easycalculation.com/statistics/learn-trimmed-mean.php
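One possible way to compute a trimmed mean is sketched below. The helper name trimmed_mean, the data, and the convention of cutting the same proportion from each end are assumptions for illustration; some references define the trimming percentage differently.

```python
def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the values from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of values cut from each tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

data = [1, 4, 5, 6, 7, 8, 9, 10, 12, 100]  # hypothetical data; 100 is an outlier

ordinary = sum(data) / len(data)
trimmed = trimmed_mean(data, 0.10)  # drops the smallest and the largest value

print(ordinary, trimmed)
```

The outlier pulls the ordinary mean up, while the trimmed mean stays close to the bulk of the data.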
Weighted mean
Instead of each data point contributing equally to the final mean, some data points contribute more “weight” than others.
Formula: ∑(weight x X) / ∑(weights). Example with weights that sum to 1: (X1 x .40) + (X2 x .30) + (X3 x .20) + (X4 x .10)
-If all the weights are equal, then the weighted mean equals the arithmetic mean (the regular “average” you’re used to)
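A short sketch of the weighted mean, reusing the weights .40, .30, .20, .10 from the formula above. The grades and the function name weighted_mean are hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

grades = [90, 80, 70, 60]            # hypothetical X1..X4
weights = [0.40, 0.30, 0.20, 0.10]

weighted = weighted_mean(grades, weights)
equal = weighted_mean(grades, [1, 1, 1, 1])  # equal weights: plain average

print(weighted, equal)
```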
range
The difference between the lowest and highest value.
Example: In {4, 6, 9, 3, 7} the lowest value is 3, and the highest is 9, so the range is 9 − 3 = 6.
Interquartile Range(IQR)/H-spread
Also called the midspread or middle fifty, it is a measure of statistical dispersion: IQR = Q3 - Q1. It “chops off” the top 25% and the bottom 25% of the ranked data (ignores 50% of the data) and measures the spread of the middle 50%.
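A small sketch comparing the IQR with the range on made-up data that contains one outlier. The "inclusive" quartile method is an assumption; other conventions can give slightly different quartiles.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 100]  # hypothetical data; 100 is an outlier

q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                      # spread of the middle 50% of the data
data_range = max(data) - min(data)

print(iqr, data_range)
```

Because the IQR ignores the top and bottom quarters, the outlier affects the range but not the IQR.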
variance
the expectation of the squared deviation of a random variable from its mean, and it informally measures how far a set of (random) numbers are spread out from their mean(dispersion).
σ^2 = [ ∑(x-mean)^2] / N
standard deviation
a measure that is used to quantify the average amount of variation or dispersion of a set of data values from the mean.
(represented by the Greek letter sigma σ or the Latin letter s)
Square root of variance: σ = √( [ ∑(x-mean)^2 ] / N )
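As an illustration, the population variance and standard deviation can be computed step by step for the set {4, 6, 9, 3, 7} from the range example, and compared against the standard library (statistics.pvariance and statistics.pstdev divide by N, as the formulas above do).

```python
import math
import statistics

data = [4, 6, 9, 3, 7]
n = len(data)

mean = sum(data) / n
variance = sum((x - mean) ** 2 for x in data) / n  # average squared deviation
std_dev = math.sqrt(variance)                      # square root of the variance

print(mean, variance, std_dev)
print(statistics.pvariance(data), statistics.pstdev(data))
```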
Parameters vs Estimators
Parameters correspond to the population. They are theoretical quantities, many times unknown.
Estimators correspond to the sample. They are practical quantities that can be computed from the data.
stochastic model
tool for estimating probability distributions for a collection of random variables over time
3 methods assigning probability
Classical Method - based on the assumption of equally likely outcomes -> counting techniques
Relative Frequency Method - based on experimentation or historical data
Subjective Method - based on judgement, still can be scientific
Complement vs Union vs Intersection
The Complement of an event A is defined to be the event consisting of all sample points that are not in A
-it is denoted as A^c
The union of events A and B is the event containing all sample points that are in A or B(or both)
-denoted as A U B
The Intersection of events A and B is the set of all sample points that are in both A AND B
-denoted as A ∩ B
Addition Law
provides a way to compute the probability of event A, or B, or both A and B occurring
- law is written as P(AUB) = P(A) + P(B) - P(A∩B)
- P(A∩B) is subtracted so the sample points that are in both A and B are not counted twice
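A small check of the complement, union, intersection and addition law using Python sets. The fair-die sample space and the events A and B are hypothetical, and the helper prob assumes equally likely outcomes.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die
A = {2, 4, 6}                      # the roll is even
B = {4, 5, 6}                      # the roll is greater than 3

def prob(event):
    """Classical method: favourable outcomes / total outcomes."""
    return Fraction(len(event), len(sample_space))

complement_A = sample_space - A    # all sample points not in A
union = A | B                      # in A or B (or both)
intersection = A & B               # in both A and B

addition_law = prob(A) + prob(B) - prob(intersection)
print(prob(union), addition_law)
```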
Mutually exclusive events
have no sample points in common, cannot happen at the same time.
For example: when tossing a coin, the result can either be heads or tails but cannot be both.
Conditional probability
The probability of an event given that another event has occurred
-Denoted as P(A|B), computed mathematically as follows: P(A|B) = P(A ∩ B) / P(B)
Independent Events
Events A and B are independent if the probability of event A is not changed by the occurrence of event B
-That is, P(A|B) = P(A). To find out mathematically whether they are independent, check that P(A ∩ B) = P(A) x P(B)
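A sketch of conditional probability and an independence check on a fair die. The events and helper names are hypothetical, and equally likely outcomes are assumed.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    return Fraction(len(event & sample_space), len(sample_space))

def conditional(a, b):
    """P(A|B) = P(A and B) / P(B)."""
    return prob(a & b) / prob(b)

def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

even = {2, 4, 6}
above_three = {4, 5, 6}
at_most_four = {1, 2, 3, 4}

print(conditional(even, above_three), independent(even, above_three))
print(conditional(even, at_most_four), independent(even, at_most_four))
```

Knowing the roll is above three changes the probability of an even roll, so those two events are dependent; knowing it is at most four leaves it unchanged, so those two are independent.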
Bayes’ Theorem
describes the probability of an event, based on conditions that might be related to the event.
-For example, suppose one is interested in whether a person has cancer, and knows the person’s age. If cancer is related to age, then, using Bayes’ theorem, information about the person’s age can be used to more accurately assess the probability that they have cancer.
Bayes’ theorem provides the means for revising the prior probabilities.
Steps of Bayes’ Theorem
Begin the analysis with initial prior probabilities, then get any new information, then apply Bayes’ theorem to get the posterior probabilities.
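The steps can be sketched for a two-hypothesis case as follows. All numbers are invented for illustration: a 1% prior, a test that is positive 90% of the time when the condition is present and 5% of the time when it is absent. The function name posterior is also hypothetical.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' theorem for a hypothesis H and its complement."""
    # Total probability of the evidence over both possibilities.
    evidence = (p_evidence_given_h * prior
                + p_evidence_given_not_h * (1 - prior))
    return p_evidence_given_h * prior / evidence

prior = 0.01                              # step 1: prior probability
updated = posterior(prior, 0.90, 0.05)    # steps 2-3: new information, apply the theorem

print(updated)
```

Even with a positive result the posterior stays well below 50% here, because the prior is so small.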
Equally Likely Probability Spaces (or simple spaces)
When the probability of each possible outcome is the same
Binomial Theorem
a formula for finding any power of a binomial without multiplying at length
random variable
a numerical description of the outcome of an experiment. A function.
discrete random variable
may assume either a finite number of values or an infinite sequence of values. Countable.
Example: number of TVs sold on one day
Types of Fallacies:
- Measurement Rules - changing the way you measure
- Correlation vs Causation
- Ceteris Paribus - if I drive fast, I’ll spend less time on the road, less chance of an accident. All things are not equal here.
- Extrapolation Bias - taking a survey of just women or college students and applying it to the entire population
- Selection Bias - some are not interviewed
Discrete Probability Distributions
The required conditions are that no probability is negative (f(x) ≥ 0) and that the probabilities all add up to 1 (∑f(x) = 1)
expected value of a random variable
E(x) = μ = ∑ x f(x)
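A minimal sketch that checks the two required conditions and computes the expected value for a made-up distribution: the number of heads in two tosses of a fair coin.

```python
from fractions import Fraction

# f(x) for x = number of heads in two fair coin tosses (hypothetical example)
f = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

non_negative = all(p >= 0 for p in f.values())  # no negative probabilities
total = sum(f.values())                         # probabilities add up to 1

expected_value = sum(x * p for x, p in f.items())  # sum of x * f(x)

print(non_negative, total, expected_value)
```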
Properties of a Binomial Probability Distribution experiment
- The experiment consists of a sequence of n identical trials.
- Two outcomes, success and failure, are possible on each trial.
- The probability of a success, denoted by p, does not change from trial to trial
- The trials are independent.
-Our interest is in the number of successes occurring in the n trials. We let x denote the number of successes occurring in the n trials.
Binomial Probability Function
f(x) = (n choose x) p^x (1-p)^(n-x)
What are the expected value, variance, and SD of a binomial?
E(x) = np
Var(x) = np(1 - p)
SD = √[np(1 - p)]
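As a rough check, the binomial probability function can be evaluated for every x and its mean and variance compared with np and np(1 - p). The values n = 10 and p = 0.3 are arbitrary choices for illustration.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3  # hypothetical experiment
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * f for x, f in enumerate(probs))
variance = sum((x - mean) ** 2 * f for x, f in enumerate(probs))
sd = math.sqrt(variance)

print(mean, variance, sd)
```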
Poisson Probability Distribution
A Poisson distributed random variable is often useful in estimating the number of occurrences over a specified interval of time or space.
It is a discrete random variable (R.V.) that may assume an infinite sequence of values (x = 0, 1, 2, . . . ).
Two Properties of a Poisson Experiment
- The probability of an occurrence is the same for any two intervals of equal length.
- The occurrence or nonoccurrence in any interval is independent of the occurrence or nonoccurrence in any other interval.
Poisson Probability Function
f(x) = μ^x e^(-μ) / x!, where μ is the expected number of occurrences in the interval and e ≈ 2.71828
What are the variance and SD of a Poisson distribution?
A property of the Poisson distribution is that the mean and variance are equal (both are μ). Of course, the SD is just the square root of the variance.
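A sketch that evaluates the Poisson probability function and checks numerically that the mean and variance agree. The rate mu = 3 is an arbitrary choice, and the infinite sequence x = 0, 1, 2, ... is cut off at 50, where the remaining probability is negligible.

```python
import math

def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when the mean is mu."""
    return mu ** x * math.exp(-mu) / math.factorial(x)

mu = 3  # hypothetical mean number of occurrences per interval
probs = [poisson_pmf(x, mu) for x in range(50)]  # truncated at x = 49

mean = sum(x * f for x, f in enumerate(probs))
variance = sum((x - mean) ** 2 * f for x, f in enumerate(probs))

print(mean, variance, math.sqrt(variance))
```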
Hypergeometric Probability Distribution
- it is closely related to the binomial distribution, however there are 2 main differences:
1. The trials are not independent
2. The probability of success changes from trial to trial
Sampling is also done without replacement, which is closer to the real world
Hypergeometric Probability Function
f(x) = (r choose x) (N - r choose n - x) / (N choose n)
where: x = number of successes, n = number of trials, N = number of elements in the population, r = number of elements in the population labeled success
(see notes for example)
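A minimal sketch of the hypergeometric probability function in its standard form, using the symbols x, n, N and r defined above. The numbers (a population of 10 with 4 successes, 3 draws without replacement) are invented for illustration.

```python
import math

def hypergeom_pmf(x, N, r, n):
    """Probability of x successes in n draws without replacement."""
    return math.comb(r, x) * math.comb(N - r, n - x) / math.comb(N, n)

N, r, n = 10, 4, 3  # hypothetical population, successes in it, and draws

p_two = hypergeom_pmf(2, N, r, n)
total = sum(hypergeom_pmf(x, N, r, n) for x in range(n + 1))

print(p_two, total)
```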
continuous random variable
can assume any value in a real interval or a collection of intervals. non-countable.
How do we handle continuous random variables?
We don’t ask for the probability that a continuous random variable takes one exact value, because that probability is always 0. Instead we ask for the probability that it falls in some interval/range.
Continuous Probability Distribution
The equation used to describe it is called a probability density function: it describes the relative likelihood of this random variable falling within a particular range of values. The probability of a range is the area under the density over that range.
Characteristics of Uniform Probability Distribution
when the probability of any event is proportional to the length of the interval. The distribution looks very flat and even.
Uniform probability density function:
f(x) = 1/(b-a) if a ≤ x ≤ b, and f(x) = 0 elsewhere
Expected Value of X [Uniform Probability Distribution]
E(X) = (a + b)/2
We divide a+b by 2 to get the middle point of the distribution for expected value.
Variance of X [Uniform Probability Distribution]
Var(X) = (b - a)^2 / 12
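A sketch of the uniform distribution on an arbitrary interval from a = 2 to b = 10. The helper names are hypothetical. It also shows that a single point has probability 0, while an interval has probability proportional to its length.

```python
def uniform_pdf(x, a, b):
    """Flat density 1 / (b - a) inside [a, b], zero outside."""
    return 1 / (b - a) if a <= x <= b else 0.0

def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi): length of the overlap with [a, b] over (b - a)."""
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (b - a)

a, b = 2.0, 10.0  # hypothetical interval

expected = (a + b) / 2         # midpoint of the interval
variance = (b - a) ** 2 / 12

print(uniform_pdf(5, a, b), uniform_prob(4, 6, a, b), uniform_prob(5, 5, a, b))
print(expected, variance)
```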
Normal Probability Density Function
f(x) = [1 / (σ√(2π))] e^( -(x - μ)^2 / (2σ^2) )
-m3 pg 16
Characteristics of Normal Probability Distribution
The distribution is symmetric and bell-shaped. It is centered around the mean, which is also its highest point (and equals the median and mode); the width is determined by the standard deviation.
Probabilities are given by areas under the curve. The total area is 1, so .5 lies on each side of the mean.
Normal distribution standard deviation breakdown
- 68.26% of values of a normal random variable are within +/- 1 standard deviation of its mean.
- 95.44% of values of a normal random variable are within +/- 2 standard deviations of its mean.
- 99.74% of values of a normal random variable are within +/- 3 standard deviations of its mean.
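The share of a normal distribution within k standard deviations of the mean can be checked numerically with the error function, since P(|Z| ≤ k) = erf(k / √2). This is a sketch; values read from printed tables may differ in the last digit because of rounding.

```python
import math

def within_k_sd(k):
    """P(mean - k*sd <= X <= mean + k*sd) for any normal distribution."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * within_k_sd(k), 2), "%")
```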
A random variable having a normal distribution with a mean of 0 and a standard deviation of 1 is said to have:
a standard normal probability distribution. Z ~ N(0, 1)
Converting to the Standard Normal Distribution
z = (x̄ - μ) / (σ/√n) for a sample mean, or z = (x - μ) / σ for only 1 observation
We can think of z as a measure of the number of standard deviations x is from μ.
-This is done to standardize values so they can be compared easily; doing this makes the mean 0 and the SD 1
What does the standardized z-score tell us?
tells us how far the sample mean is above or below the population mean, in units of standard error.
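A short sketch of both standardizations: one for a single observation and one for a sample mean. The population values (μ = 100, σ = 15) and the observations are made up, and the function names are hypothetical.

```python
import math

def z_score(x, mu, sigma):
    """Standard deviations between one observation and the mean."""
    return (x - mu) / sigma

def z_score_of_mean(x_bar, mu, sigma, n):
    """Standard errors between a sample mean and the population mean."""
    return (x_bar - mu) / (sigma / math.sqrt(n))

mu, sigma = 100, 15  # hypothetical population mean and SD

z_single = z_score(130, mu, sigma)              # one observation of 130
z_mean = z_score_of_mean(103, mu, sigma, n=25)  # mean of 25 observations is 103

print(z_single, z_mean)
```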
3 THINGS Central Limit Theorem tells us about the sampling distribution of the sample mean x̄:
- It is approximately a normal distribution when the sample size is large
- As the sample size increases, its variance decreases (x̄ is more accurate)
- It is unbiased (it is centered on the true population mean)
Therefore it permits us to draw conclusions about the population based strictly on sample data, without having knowledge about the distribution of the underlying population.
CLT standard deviation / standard error of the mean
σ_x̄ = σ/√n
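A small simulation sketch of the standard error formula. The population (exponential with mean 1 and SD 1, which is clearly not normal), the sample size of 30, the number of samples and the seed are all arbitrary choices for illustration.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 30              # size of each sample
num_samples = 5000  # number of samples drawn

# Skewed, non-normal population: exponential with mean 1 and SD 1.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)  # should sit near the population mean, 1
sd_of_means = statistics.stdev(sample_means)    # should sit near sigma / sqrt(n)
standard_error = 1.0 / math.sqrt(n)

print(mean_of_means, sd_of_means, standard_error)
```

Increasing n in this sketch shrinks the spread of the sample means, in line with σ/√n.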