Samples, Populations, and the Normal Distribution Flashcards
we need to make sure, as far as we can, that our sample is an unbiased and representative sample of our population.
Inferential statistics
we also need to make sure that our sample is
large enough
Two Conditions for Random Sampling to be satisfied:
- every member of the population must
have an equal chance of being selected.
equi-probability
the “gold standard” to which other sampling techniques aspire
random sampling
Two Conditions for Random Sampling to be satisfied:
- the selection of any one member of the
population should not affect the chances of any other
being selected.
independence
Many variables that can be measured on a continuous
scale are (approximately)
Many statistical tests make the assumption that our data
are
normally distributed
represented as a line chart,
with continuous variable on the x-axis (and where the
y-axis represents the frequency density), we can
calculate the number of people who have any score, or
range of scores, by calculating the area of the chart.
histogram
Random Sampling is virtually impossible
Volunteer sample
Snowball sampling
Purposive sampling
Convenience sampling
We calculate the area under the curve, which will give
the number of people, which will give the probability of
the _____of responses.
range
Most of us will never need to know the ________of
the normal curve.
exact equation
Formula for Area of Triangle
W x H x 0.5
under the curve is equal to the number of people
area
There is a formula that we use to calculate the area
under the normal curve, for any value taken from the
normal distribution, and we can use this to calculate the
probability of ______
any range of responses.
to make the point that the normal curve is a
theoretical curve that is mathematically generated.
formula
frequency of a given value of X
big Y
mean of the distribution
µ
any score in the distribution
big X
total frequency of the distribution
N
a constant of 3.1416
π
a constant of 2.7183
e
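Note: putting the symbols on the cards above together, the equation of the normal curve is usually written as shown below. This is the standard textbook form rather than something these cards spell out, and σ (the standard deviation of the distribution) is an assumption here, since the cards only introduce it later with the z-score formula.

```latex
Y = \frac{N}{\sigma\sqrt{2\pi}} \, e^{-\frac{(X-\mu)^{2}}{2\sigma^{2}}}
```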
The _______ goes on to infinity in each
direction.
Normal Distribution
The great advantage of a normal distribution is that if
you know (or can estimate) two values _______, you know everything there is to
know about it.
(Mean and
Standard Deviation)
There is no beginning and end to the ______ on a normal
distribution plot, at least in theory.
x-axis
In a normal distribution half of the ______will lie above
the mean and half below the mean.
scores
This means we can use the area under the curve to find
the _______ of any value or range of values.
probability
95.45% of cases lie within _____ of the Mean.
2 SDs
47.72% of cases lie between the Mean and ______
-2 SDs
2.28% of cases lie
more than 2 SDs below the Mean
49.9% of cases lie between the Mean and ____
+3 SDs
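Note: the percentages on these cards can be checked numerically. The sketch below assumes a perfectly normal distribution and uses only Python's standard library; the function names are illustrative and do not come from the cards.

```python
import math

def proportion_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

def proportion_mean_to(k):
    """Proportion lying between the mean and k SDs on one side of it."""
    return proportion_within(abs(k)) / 2

def proportion_beyond(k):
    """Proportion lying more than k SDs from the mean in a single tail."""
    return 0.5 - proportion_mean_to(k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {100 * proportion_within(k):.2f}%")
print(f"mean to -2 SD: {100 * proportion_mean_to(-2):.2f}%")
print(f"below -2 SD:   {100 * proportion_beyond(2):.2f}%")
```

Because the curve is symmetric, the area between the mean and -2 SDs is the same as the area between the mean and +2 SDs, which is why one helper covers both sides.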
A score that is presented in terms of the number of
standard deviations above the mean is called a ____
z-score or standard score
A z score is a transformed score that designates how
many Standard Deviation units the corresponding raw
score is ______
above or below the Mean.
The process by which the raw score is
altered is called a ____
Score transformation
To calculate a z-score, we use the formula:
z = (score – mean) / σ, where σ is the standard deviation
The z transformation results in a distribution having a ____
Mean of 0 and an SD of 1.
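Note: a minimal sketch of the z transformation, assuming the mean and SD are computed from the same set of scores (population SD). The scores are made up for illustration.

```python
from statistics import mean, pstdev

def z_scores(raw_scores):
    """Convert raw scores to z-scores using the mean and SD of the same set."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # population SD; assumes the scores are not all identical
    return [(x - m) / sd for x in raw_scores]

raw = [52, 60, 65, 71, 82]  # made-up example scores
z = z_scores(raw)

print([round(value, 2) for value in z])
# The transformed scores should have a mean of (about) 0 and an SD of (about) 1.
print(round(mean(z), 10), round(pstdev(z), 10))
```

The z-scores keep the same order as the raw scores; only the values change.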
comparing scores that are not otherwise
directly comparable.
important use
Z scores allow us to determine the ______ that fall above or below any score
in the distribution.
number or percentage of scores
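Note: one way to turn a z-score into a percentage of scores above or below it. This sketch assumes the scores are normally distributed; the test with mean 100 and SD 15 is hypothetical, and the function names are illustrative.

```python
from statistics import NormalDist

def percent_below(raw_score, mean, sd):
    """Percentage of a normal distribution falling below raw_score."""
    z = (raw_score - mean) / sd
    return 100 * NormalDist().cdf(z)  # standard normal: mean 0, SD 1

def percent_above(raw_score, mean, sd):
    """Percentage of a normal distribution falling above raw_score."""
    return 100 - percent_below(raw_score, mean, sd)

# Hypothetical test with mean 100 and SD 15
print(round(percent_below(115, 100, 15), 2))  # score 1 SD above the mean
print(round(percent_above(130, 100, 15), 2))  # score 2 SDs above the mean
```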
the ability to compare scores that are measured on
different scales is of fundamental importance to the topic of ______
correlation
Z scores have the _____ as the set of raw scores.
same shape
Transforming the ______ into their corresponding z
scores does not change the shape of the distribution.
raw scores
The scores do not change their _________. All that
changes are the score values.
relative positions.
The mean of the z scores always equals ____
zero
The scores
located at the mean of the raw scores will also be at the
________
mean of the z scores.
The SD of z scores always equals
1
A raw score that is 1 SD above the mean has a z
score of
+1
distribution of
means from a set of samples. It is a listing of all the
values the mean can take, along with the probability of
getting each value if sampling is random from the
null-hypothesis population.
Sampling distribution of the mean
With a smaller sample, the distribution will be
t-shaped
Tells us that, given some assumptions, the sampling
distribution of the mean will form a normal distribution,
with a large sample.
Central Limit Theorem
Statistical tests do not assume that the distribution of
the data in the sample is
normal
tells us that if the distribution in the sample is
approximately normal, then the sampling distribution
will be the correct shape
central limit theorem
If the sample distribution is not normal, but the sample
is large enough, then the sampling distribution will still be
normal (or t-shaped).
The larger the sample, the less we need to worry about
whether our sample data are
normally distributed or
not.
states that the sampling
distribution of the mean will be normal or nearly
normal, if the sample size is large enough.
central limit theorem
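Note: the central limit theorem can be illustrated with a small simulation. This is a sketch under stated assumptions, not a proof: the population is an exponential distribution (strongly skewed, mean 1, SD 1), and the sample size of 50 and the 2000 repetitions are arbitrary choices.

```python
import random
from statistics import mean, pstdev

def sample_means(n, num_samples, rng):
    """Means of num_samples random samples of size n from a skewed population."""
    # Exponential population with mean 1 and SD 1: clearly not normal.
    return [mean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(num_samples)]

def share_within(values, k):
    """Share of values lying within k SDs of their own mean."""
    m, sd = mean(values), pstdev(values)
    return sum(abs(v - m) <= k * sd for v in values) / len(values)

rng = random.Random(42)  # fixed seed so the run is repeatable
means = sample_means(n=50, num_samples=2000, rng=rng)

# Centre and spread of the simulated sampling distribution of the mean
print(round(mean(means), 3), round(pstdev(means), 3))
# For a normal distribution, roughly 95% of values lie within 2 SDs
print(round(share_within(means, 2), 3))
```

Even though the individual scores are badly skewed, the sample means cluster around the population mean, and close to 95% of them fall within 2 SDs of their centre, as a normal distribution would predict.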
The more closely the
sampling distribution needs to resemble a normal
distribution, the more sample points will be required.
requirements for accuracy
How large is “large enough”?
depends on two factors
The more
closely the original population resembles a normal
distribution, the fewer sample points will be required.
The shape of the underlying population.
some statisticians say that a sample size of 30
is large enough when the population distribution is
roughly _____
bell-shaped
Others recommend a sample size of at least ____
40
But if the original population is distinctly not normal
(e.g., is badly skewed, has multiple peaks, and/or has
outliers), researchers like the sample size to be even _____
larger
standard deviation of the
sampling distribution of the mean.
Standard Error (se)
The Standard Error should be affected by the _______. The bigger the sample, the closer our sample
mean is likely to be to the population mean.
Sample Size
if there is a lot of
variation in the sample, there will be more uncertainty
in the sample, so there will be more uncertainty about
the population mean.
Amount of Variation in Sample
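Note: both factors can be seen in the usual estimate of the standard error, se = s / √n, where s is the sample standard deviation and n is the sample size. The formula is the standard textbook one and is not stated on these cards; the scores below are made up.

```python
import math
from statistics import stdev

def standard_error(sample):
    """Estimated standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(sample) / math.sqrt(len(sample))

small = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up scores
large = small * 4                   # same scores repeated: four times the sample size
spread = [x * 3 for x in small]     # same sample size, three times the variation

print(round(standard_error(small), 3))
print(round(standard_error(large), 3))   # smaller: bigger sample
print(round(standard_error(spread), 3))  # larger: more variation in the sample
```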