Statistics I Flashcards
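Difference between an observational study and a survey
Observational study: the researcher observes and records data on the subjects without intervening
Survey: requests information directly from the subjects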
Difference between binomial and normal distribution
Binomial: variable counts the number of successes in a certain number of trials
Normal: variable takes on values that occur according to the “bell-shaped curve”
What is the t-distribution
Used when the variable is based on sample averages and you have limited data (a small sample)
What is correlation
The strength and direction of the linear relationship between x and y
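A minimal sketch of how a correlation coefficient can be computed by hand, assuming the card means Pearson's r. The data values are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # rises with x
y_down = [10, 8, 6, 4, 2]  # falls as x rises

r_up = pearson_r(x, y_up)      # perfect positive linear relationship
r_down = pearson_r(x, y_down)  # perfect negative linear relationship
```

The sign of r gives the direction and its distance from 0 gives the strength.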
Census vs. sample
Census is the entire population, sample is only part of it
Mean, median, mode
Mean: average
Median: the middle value; an equal number of data points lie above and below it
Mode: data point that occurred the most
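A quick sketch using Python's standard `statistics` module; the data set is hypothetical.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # hypothetical data

mean_value = statistics.mean(data)      # 30 / 6 = 5
median_value = statistics.median(data)  # even count: average of 3 and 5
mode_value = statistics.mode(data)      # 3 appears most often
```

With an even number of data points, the median is the average of the two middle values.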
Standard deviation equation
n = sample size
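The formula for this card is missing from these notes. As a sketch, assuming it refers to the sample standard deviation (n - 1 in the denominator), s = sqrt( sum((x_i - x_bar)^2) / (n - 1) ). The data below are made up.

```python
import statistics
from math import sqrt

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
n = len(data)                    # sample size
x_bar = sum(data) / n            # sample mean

# Sample standard deviation: divide by n - 1 (an assumption about the card)
s = sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))

# The population version divides by n instead
sigma = statistics.pstdev(data)
```

`statistics.stdev(data)` gives the same value as `s`.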
What is the empirical rule
68 / 95 / 99.7 rule (for approximately normal data)
68% of the data lies within 1 standard deviation of the mean
95% of the data lies within 2 standard deviations of the mean
99.7% of the data lies within 3 standard deviations of the mean
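The three percentages can be checked against the standard normal curve. A small sketch using `statistics.NormalDist` from the standard library:

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal distribution

# Area under the curve between -k and +k standard deviations
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} std dev: {area:.4f}")
```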
Distribution of z score
Central Limit Theorem
Gives you the ability to measure how much your sample mean will vary, without having to take any other sample means to compare it with
Gives you the ability to use confidence intervals and hypothesis tests
Basically, if you keep taking samples of a set size, the resulting distribution of the sample means will be approximately normal. The larger the sample size, the more “normal” the distribution is.
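A rough simulation sketch of this idea. The uniform population, the sample sizes, and the number of samples are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(sample_size, num_samples=2000):
    """Means of num_samples samples drawn from a uniform population on [0, 1)."""
    means = []
    for _ in range(num_samples):
        sample = [random.random() for _ in range(sample_size)]
        means.append(sum(sample) / sample_size)
    return means

small = sample_means(5)   # samples of size 5
large = sample_means(50)  # samples of size 50

# Sample means vary less as the sample size grows
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)

# For a normal-looking distribution, about 68% of the means
# should fall within one standard deviation of their center
center = statistics.mean(large)
frac_within_one_sd = sum(abs(m - center) <= spread_large for m in large) / len(large)
```

The population here is flat, not bell shaped, yet the sample means still cluster around 0.5 in a bell shape.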
What is the basic definition of a z distribution
Mean = 0, Std dev = 1
Blind vs. double-blind
Blind: participant doesn’t know which treatment they are receiving
Double-blind: neither the participant nor the administrator knows
Margin of error
Measures the maximum amount by which the sample results are expected to differ from those of the actual population.
Often, this is the half-width of the confidence interval (estimate plus or minus the margin of error)
Confidence Interval
A range of values, computed from the sample, that is expected to contain the population parameter (such as the mean) with a stated level of confidence, e.g. 95%
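A sketch of a 95% confidence interval for a mean and its margin of error. It assumes a sample large enough to use the z distribution (a small sample would call for the t-distribution), and the summary numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical summary statistics
n = 100              # sample size
sample_mean = 50.0
sample_std = 10.0
confidence_level = 0.95

# Critical z value that leaves (1 - confidence_level) / 2 in each tail
z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)

# Margin of error is the half-width of the interval
margin_of_error = z_star * sample_std / sqrt(n)
interval = (sample_mean - margin_of_error, sample_mean + margin_of_error)
```

Here the interval is roughly 50 plus or minus 1.96.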
Hypothesis test
Data collected from a sample and measured against a claim about a population parameter
p-value
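The probability of getting results at least as extreme as those in the sample, assuming the null hypothesis is true. The smaller the p-value, the stronger the evidence against the null hypothesis.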
The null hypothesis is the claim that’s on trial.
The alternative hypothesis is the one you would believe if the null hypothesis were untrue.
p-value < 0.05 (the conventional cutoff) indicates strong evidence against the null hypothesis, so reject it
p-value > 0.05 indicates weak evidence against the null hypothesis, so you fail to reject it
p-value very close to 0.05 is marginal and could go either way
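A sketch of a p-value calculation, assuming a two-sided one-sample z-test with a known population standard deviation. The claim and the sample numbers are made up.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical setup: the null hypothesis claims the population mean is 100
null_mean = 100.0
population_std = 15.0  # assumed known
n = 100                # sample size
sample_mean = 103.0
alpha = 0.05           # conventional cutoff

# Test statistic: how many standard errors the sample mean is from the claim
z = (sample_mean - null_mean) / (population_std / sqrt(n))

# Two-sided p-value: chance of a result at least this extreme if the null is true
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_null = p_value < alpha
```

Here z is 2.0 and the p-value is about 0.0455, so the null hypothesis is rejected at the 0.05 level.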
What is the relationship between mean, median, and skew
If the mean is larger than the median, skewed right
If the mean is smaller than the median, skewed left
Skewed right has a tail off to the right
Skewed left has a tail off to the left
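A small illustration with made-up data: one extreme value in the tail pulls the mean toward it, while the median barely moves.

```python
import statistics

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]       # long tail to the right
left_skewed = [1, 17, 18, 18, 19, 19, 19, 20]  # long tail to the left

right_mean = statistics.mean(right_skewed)      # 4.75
right_median = statistics.median(right_skewed)  # 3.0
left_mean = statistics.mean(left_skewed)        # 16.375
left_median = statistics.median(left_skewed)    # 18.5
```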
What is the definition of a percentile
The percentage of data that falls below a particular data point. The data doesn’t have to come from a continuous distribution; it can be discrete counts.
What is the “five number summary” of a dataset
[minimum, 25th percentile (first quartile, Q1), median (50th percentile), 75th percentile (third quartile, Q3), maximum]
Interquartile range (IQR) is Q3 - Q1
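A sketch of the five number summary using `statistics.quantiles`. Quartile conventions differ between textbooks and tools; the "inclusive" method is just one choice, and the data are hypothetical.

```python
import statistics

data = [7, 1, 9, 3, 5, 2, 8, 4, 6]  # hypothetical data, unsorted on purpose

# n=4 splits the data into quarters and returns the three cut points
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

five_number_summary = [min(data), q1, median, q3, max(data)]
iqr = q3 - q1  # Q3 - Q1

print(five_number_summary, iqr)
```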
Box plot
Great way to represent the five number summary
What are the characteristics of a binomial
- fixed number of trials
- each trial is either a success or failure
- there is a probability of success that is constant for each trial
- trials are independent (the outcome of one doesn’t influence others)
Equation for determining the probability of a certain number of desirable outcomes in a binomial distribution
b(x; n, P) = nCx * P^x * (1 - P)^(n - x)
Where:
b = binomial probability
x = total number of “successes” (pass or fail, heads or tails etc.)
P = probability of a success on an individual trial
n = number of trials
“A coin is tossed 10 times. What is the probability of getting exactly 6 heads?”
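A sketch that works the coin question, assuming the standard binomial formula b(x; n, P) = nCx * P^x * (1 - P)^(n - x) and a fair coin (P = 0.5). The function name is only illustrative.

```python
from math import comb

def binomial_probability(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# 10 tosses, exactly 6 heads, fair coin assumed
prob_six_heads = binomial_probability(x=6, n=10, p=0.5)
```

There are 210 ways to place 6 heads among 10 tosses and 1024 equally likely sequences, so the answer is 210 / 1024, about 0.205.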
Combination and choose notation
“n choose r”
A coin is tossed 10 times, what is the probability of getting exactly 6 heads
10C6 is the notation for the formula
Also written as the number 10 over the number 6 inside one pair of parentheses
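The formula image is missing from these notes. As a sketch, the usual definition is nCr = n! / (r! * (n - r)!), shown here next to Python's built-in `math.comb`. The function name is only illustrative.

```python
from math import comb, factorial

def n_choose_r(n, r):
    """Number of ways to choose r items from n when order doesn't matter."""
    return factorial(n) // (factorial(r) * factorial(n - r))

ways = n_choose_r(10, 6)  # 10C6
same = comb(10, 6)        # standard-library equivalent
```

For the coin question, 10C6 is 210: the number of ways to pick which 6 of the 10 tosses are heads.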
Relationship between variance and standard deviation
Standard deviation is the square root of variance