Lecture 5 Flashcards
distribution = ?
a distribution is a collection of data/scores of a variable
how are the values of a distribution ordered?
values of a distribution are commonly ordered
(e.g., from smallest to largest)
probability distribution = ?
a mathematical function that gives the probability of each possible outcome of a random variable
how is a discrete probability distribution portrayed?
breaks/holes between values
probability is represented by the height of each bar (on bar chart)
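A minimal sketch of a discrete probability distribution, assuming a fair six-sided die as the example (the die is an illustration, not from the lecture). Each outcome is a separate value with its own probability, which is the bar height on the chart.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die.
# Outcomes are separate values (nothing between 1 and 2), each with its own probability.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# The probabilities of all possible outcomes add up to 1.
total = sum(pmf.values())

# Probability of rolling a 5 or a 6: add the heights of the two bars.
p_five_or_six = pmf[5] + pmf[6]

print(total, p_five_or_six)  # 1 1/3
```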
how is a continuous probability distribution portrayed?
a smooth curve with no breaks between values
probability is represented by the area under the curve over an interval, not by the height of the curve
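A rough sketch of "probability = area under the curve", assuming a normal distribution with mean 0 and standard deviation 1 as the example. The helper `area_under_curve` is a made-up name; it adds up thin rectangles under the density and compares the result with the value from `NormalDist.cdf`.

```python
from statistics import NormalDist

dist = NormalDist(mu=0, sigma=1)  # assumed example distribution

def area_under_curve(a, b, steps=10_000):
    """Approximate the area under the density between a and b (midpoint rule)."""
    width = (b - a) / steps
    return sum(dist.pdf(a + (i + 0.5) * width) for i in range(steps)) * width

approx = area_under_curve(-1, 1)     # area from adding thin rectangles
exact = dist.cdf(1) - dist.cdf(-1)   # same area from the cumulative distribution

print(round(approx, 4), round(exact, 4))  # 0.6827 0.6827
```

Note that the area over a single point is zero, which is why continuous probabilities are always stated for an interval.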
normal distribution = ?
a bell-shaped curve, symmetric about the mean
the normal probability distribution is very common in statistical analyses
what does skewness measure?
the degree to which a distribution is asymmetric (leans to one side)
positive skew = longer tail on the right; negative skew = longer tail on the left
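One common way to put a number on skewness is the moment-based coefficient (third central moment divided by the standard deviation cubed). This is a sketch with made-up data; the lecture may use a different formula.

```python
def skewness(data):
    """Moment-based skewness: third central moment / (second central moment ** 1.5)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

# Hypothetical datasets for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]   # one large value stretches the right tail
left_skewed = [-x for x in right_skewed]

print(skewness(symmetric))      # 0 for a symmetric dataset
print(skewness(right_skewed))   # positive
print(skewness(left_skewed))    # negative
```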
gaussian distribution = ?
same as normal distribution
where is the highest point on the normal curve in normal distribution?
the mean (which is also the median and the mode)
what does the standard deviation determine regarding a curve?
standard deviation determines the width of the curve
larger standard deviation = wider/flatter curve
smaller standard deviation = narrower curve/pointier
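A small check of the width/height trade-off, assuming two normal curves with the same mean and standard deviations of 1 and 2 (values chosen only for illustration). The total area is always 1, so a wider curve has to be lower at its peak.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=1)  # smaller standard deviation
wide = NormalDist(mu=0, sigma=2)    # larger standard deviation

# Height of each curve at its highest point (the mean).
peak_narrow = narrow.pdf(0)
peak_wide = wide.pdf(0)

print(round(peak_narrow, 4), round(peak_wide, 4))  # 0.3989 0.1995
```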
how does a normal curve look?
symmetric, bell-shaped
standard normal distribution = ?
a normal distribution with mean 0 and standard deviation 1
values are expressed as z-scores instead of the original units
z-score = ?
the number of standard deviations from the mean for a particular data point
how is z-score for a normal distribution calculated?
z = (x - mean) / standard deviation, where x is the data value
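A quick worked example of the z-score formula, assuming an exam with mean 70 and standard deviation 10 (made-up numbers).

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical exam: mean 70, standard deviation 10.
z_above = z_score(85, 70, 10)
z_below = z_score(60, 70, 10)

print(z_above, z_below)  # 1.5 -1.0
```

A positive z means the value is above the mean, a negative z means it is below.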
what are the two ways the standard normal distribution is used?
forward and in reverse
forward = for a given data value (x), calculate z and find the probability/area associated with z
reverse = for a given probability/area, find z then calculate the data value x
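Both directions can be sketched with `statistics.NormalDist` from the Python standard library, again assuming a made-up exam with mean 70 and standard deviation 10. Forward uses `cdf` (z to area), reverse uses `inv_cdf` (area to z).

```python
from statistics import NormalDist

mean, sd = 70, 10              # hypothetical exam scores
standard_normal = NormalDist() # mean 0, standard deviation 1

# Forward: data value x -> z -> probability (area to the left of z).
x = 85
z_forward = (x - mean) / sd
p_below = standard_normal.cdf(z_forward)

# Reverse: probability (area to the left) -> z -> data value x.
target_p = 0.95
z_reverse = standard_normal.inv_cdf(target_p)
x_reverse = mean + z_reverse * sd

print(round(p_below, 4))                         # 0.9332
print(round(z_reverse, 3), round(x_reverse, 2))  # 1.645 86.45
```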
68-95-99.7 / empirical rule = ?
a rule for remembering the percentage of values that lie within 1, 2 and 3 standard deviations of the mean in a normal distribution
rule only works with normal distribution
approximately 68.3% of data will be within 1 standard deviation of the mean
approximately 95.4% will be within 2 standard deviations, and 99.7% within 3
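The percentages in the rule can be checked from the standard normal distribution. A short sketch using `statistics.NormalDist`:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Share of values within k standard deviations of the mean, for k = 1, 2, 3.
shares = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} sd: {share:.1%}")
# within 1 sd: 68.3%
# within 2 sd: 95.4%
# within 3 sd: 99.7%
```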
central limit theorem (CLT) = ?
as sample size increases, the sampling distribution of the sample mean approaches the bell shape of a normal distribution, whatever the shape of the population distribution
what sample size is considered large?
a sample of more than 30 observations (a common rule of thumb, not a strict cutoff)
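A simulation sketch of the central limit theorem, assuming a strongly skewed population (exponential with mean 1 and standard deviation 1) and samples of size 30. The population, seed and number of samples are illustrative choices, not from the lecture.

```python
import random
from statistics import mean, stdev

random.seed(42)  # fixed seed so reruns give the same numbers

n = 30              # observations per sample
num_samples = 5000  # how many samples to draw

# Skewed population: exponential with mean 1 and standard deviation 1.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

centre = mean(sample_means)   # should be close to the population mean, 1
spread = stdev(sample_means)  # should be close to 1 / sqrt(n)

# In a bell-shaped distribution roughly 68% of values lie within 1 sd of the centre.
within_one_sd = sum(abs(m - centre) <= spread for m in sample_means) / num_samples

print(centre, spread, 1 / n ** 0.5, within_one_sd)
```

Even though individual values are far from normal, the sample means cluster symmetrically around the population mean with a spread of about sigma / sqrt(n).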
if the population has a normal distribution…
the sampling distribution of the sample mean has a normal distribution for any sample size
this holds exactly, so the large-sample approximation of the central limit theorem is not needed
law of large numbers (LLN) = ?
a theorem that describes the result of performing the same experiment a large number of times
as the number of trials increases, the average of the results approaches the expected value (the population mean)
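A simulation sketch of the law of large numbers, assuming rolls of a fair six-sided die (expected value 3.5). The seed and the trial counts are arbitrary choices for illustration.

```python
import random

random.seed(0)  # fixed seed so reruns give the same numbers

# Simulate many rolls of a fair die; the expected value of one roll is 3.5.
rolls = [random.randint(1, 6) for _ in range(100_000)]

# The average of the first `trials` rolls tends to get closer to 3.5 as trials grows.
for trials in (10, 1_000, 100_000):
    running_mean = sum(rolls[:trials]) / trials
    print(trials, running_mean)

final_mean = sum(rolls) / len(rolls)
```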
hypothesis = ?
a claim about a characteristic (parameter) of a population that can be tested using sample data
how does statistical hypothesis testing work?
- a hypothesis is made about a characteristic of the population
- a sample is then taken to assess whether the statement is consistent with the data
- if the sample results would be very unlikely were the hypothesis true, the hypothesis is rejected; otherwise we fail to reject it
what are the two types of hypothesis?
null hypothesis & alternative hypothesis
null hypothesis = ?
H0
status quo
the statement to be tested in the hypothesis testing
alternative hypothesis = ?
the claim that contradicts the null hypothesis
Ha
(e.g., the sky is blue on a Tuesday = null hypothesis (H0); the sky isn’t blue on a Tuesday = alternative hypothesis (Ha))
significance level (alpha) = ?
the probability cutoff (e.g., 0.05) that defines what we mean by unlikely sample results, under the assumption that the null hypothesis is true
accepting vs failing to reject the null hypothesis = ?
there is no such thing as accepting a null hypothesis
we can only fail to reject it: a lack of evidence against the claim is not proof that the claim is true
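A sketch of how a test might run end to end, assuming a two-sided one-sample z-test with a known population standard deviation. All numbers are made up, and the p-value (the probability of a result at least this extreme if H0 is true) is a term these cards do not define, so treat it as an added assumption.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical setup: H0 says the population mean is 100; Ha says it is not.
mu_0 = 100         # mean claimed by the null hypothesis
sigma = 15         # population standard deviation, assumed known
n = 36             # sample size
sample_mean = 106  # observed sample mean
alpha = 0.05       # significance level

# How many standard errors the sample mean lies from the claimed mean.
standard_error = sigma / sqrt(n)
z = (sample_mean - mu_0) / standard_error

# Two-sided p-value: area in both tails beyond |z| under the standard normal curve.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# The decision is worded as "reject" or "fail to reject", never "accept".
decision = "reject H0" if p_value < alpha else "fail to reject H0"

print(round(z, 2), round(p_value, 4), decision)  # 2.4 0.0164 reject H0
```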