Hypothesis Testing (3) Flashcards
discrete random variables
Discrete random variables: can take only whole-number (countable) values
Ex: households can’t be ½ or ⅓ of a household – they must be whole numbers
- represented by bars in a graph
- described by a probability mass function
continuous random variables
Continuous random variables: don’t have to be whole numbers; they can take any value in a range
- represented by a continuous curve (eg, the bell curve)
- described by a probability density function
Normal (Gaussian) Distribution
1. consider two normal random variables – through standardization we can make them share the same basic μ and σ statistics in order to represent them together on a single standard normal distribution
2. z = (x - μ)/σ; eg, z = 1.6
3. use the normal distribution table with the appropriate tail to find the probability for a z score of 1.6
STANDARDIZATION IS THE MOST IMPORTANT STEP – IT MAKES THINGS EASY
-the standard normal distribution has a mean = 0 and standard deviation = 1, and is
symmetrical around the mean – the probability of something less than z = -2 is the
same as the probability of something greater than z = 2
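A minimal sketch of this standardization step in Python, assuming scipy is available; the observation, mean, and standard deviation below are invented to reproduce the card’s z = 1.6:

from scipy import stats

x, mean, sd = 68.0, 60.0, 5.0   # hypothetical observation, population mean, std dev
z = (x - mean) / sd             # z = (x - mean) / s  ->  1.6
p_upper = stats.norm.sf(z)      # upper-tail probability P(Z > 1.6), ~0.0548
print(z, p_upper)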
central limit theorem
- states that the distribution of sample means approaches a normal distribution as the sample size grows, regardless of the shape of the population distribution
(2) assumptions
• it assumes that the individuals in the population are independent, meaning that
one individual does not influence any other
• it assumes that any samples are random and identically distributed – ie, have no
internal structure
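A quick simulation sketch of the theorem itself (not from the card; the exponential population and all sizes are arbitrary choices):

import numpy as np

rng = np.random.default_rng(42)
population = rng.exponential(scale=2.0, size=100_000)   # a deliberately skewed population
# means of many independent random samples are approximately normal
sample_means = [rng.choice(population, size=50).mean() for _ in range(2_000)]
print(np.mean(sample_means), np.std(sample_means))      # ~2.0 and ~2/sqrt(50)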
confidence intervals
• for a normal distribution, 95% of all data points fall within 1.96 standard deviations
of the mean
• this tells us that any sample of the population we take
should fall within ±1.96 standard deviations of the mean 95%
of the time
we are 95% confident that the true mean height of students in this class is between 163.2
and 175.6 cm
• we can do this with other confidence levels, eg, 90% (z = 1.645), 99% (z = 2.58),
although 95% is most conventional
• if we increase n, then the range of values will decrease
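A sketch of the 95% interval calculation (the heights below are invented, not the card’s class data; strictly, with n this small the t value from the next card would be more appropriate than 1.96):

import numpy as np

heights = np.array([165.1, 172.3, 158.9, 170.4, 168.2, 175.0, 163.7, 169.8])
mean = heights.mean()
se = heights.std(ddof=1) / np.sqrt(len(heights))    # standard error of the mean
lower, upper = mean - 1.96 * se, mean + 1.96 * se   # mean ± 1.96 standard errors
print(f"95% CI: {lower:.1f} to {upper:.1f} cm")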
Large and Small Samples and normal distribution
Large (>30 individuals): use the normal distribution curve
Small (<30 individuals):
• when the sample is “small”, the distribution of points
follows the t distribution, not the normal distribution
(although these look similar, there is a subtle difference in
the shapes)
• the t distribution uses n - 1 degrees of freedom instead
of the sample size n
• instead of z0.95 = 1.96, we use t0.95 = 2.045 (for n = 30, ie 29 degrees of freedom)
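The card’s two critical values can be confirmed with scipy (a sketch, assuming n = 30 for the t value):

from scipy import stats

print(stats.norm.ppf(0.975))       # ~1.96, the large-sample z value
print(stats.t.ppf(0.975, df=29))   # ~2.045, the small-sample t value with n - 1 = 29 df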
How to determine a specific sample size based on wanting to achieve a specific confidence level (3 steps)
1. which population parameter is of interest
• most of the time we are interested in μ
2. how close do we wish the estimate made from our sample to be to the true value of the population parameter
• this is a question of precision; obviously, if we don’t need to be too precise, we can get away with fewer individuals
3. how confident do we wish to be that our estimate is within the tolerance specified in step 2
• this is the α value; a good default is α = 0.05 (95% confidence), but we could tighten this to α = 0.01 (99% confidence) if this were a really important study
n = ((z score (95%) × estimated standard deviation) / allowable error)²
n = ((1.96 × 0.7) / 0.2)²
n = 47.06 → round up to 48 (see the sketch after this card)
Simplified =
- what population parameter we are interested in
- how close do we wish the estimate to be to the population parameter (a question of precision)
- how confident do we want to be?
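A sketch of the worked example above in Python (the values 1.96, 0.7, and 0.2 come from the card):

import math

z, s, error = 1.96, 0.7, 0.2    # z score, estimated std dev, allowable error
n = (z * s / error) ** 2        # n = (z * s / E)^2
print(n, math.ceil(n))          # 47.06 -> round up to 48 individuals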
95% confidence interval represents…between ____ and ____
95% is the most conventional level and corresponds to z = -1.96 to z = +1.96
Hypothesis Testing (6 steps)
• each hypothesis test is essentially asking the same question: is our sample the same
as the population, or is it different?
1. formulate a hypothesis
• two-sided:
- H0: the sample statistic is the same as the population parameter
- HA: the sample statistic is not the same as the population parameter
• one-sided: instead of “not the same”, HA is either “greater than” or “less than”
2. specification of the sample statistic and its sampling distribution
- this is generally prescribed by the research question: are we basing our test on the mean, variance, proportion, etc.?
3. selection of a level of significance
- unfortunately, it is almost always impossible to simultaneously minimize the probability of both types of error – we can limit α or β, but not both
• by convention, we control only for α, setting a typical value of 0.10, 0.05, or 0.01
- if we set α to a small value, and we end up rejecting H0, we do so with only a small probability of error
• at the same time, we cannot be confident that a decision to accept H0 is correct, since β is uncontrolled
• this means that H0 should always be something we want to reject (innocent until proven guilty)
4. construction of a decision rule
- the rejection region is always in the extreme limbs of the distribution
• the boundary, or critical limit, between the reject and do-not-reject regions is derived from probability tables, and can often take on standard values (providing the sampling distribution remains the same)
5. compute the value of the test statistic
6. decision
• the decision simply compares the test statistic from step 5 against the decision rule from step 4
• when the test statistic falls beyond the critical limits defined by the decision rule, we reject H0, and vice versa
• ex: if we reject, the interpretation is that the sample and the population are not the same – eg, Peterborough County does not receive the same amount of precipitation as Ontario
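A sketch of the 6 steps as a one-sample z test in Python; the precipitation values and the Ontario mean below are invented placeholders, not the course’s data:

import numpy as np
from scipy import stats

mu0 = 880.0                                    # step 1: H0 says county mean = Ontario mean (hypothetical)
sample = np.array([910., 935., 902., 948., 921., 899., 940., 915.])
# step 2: base the test on the mean, with a normal sampling distribution
alpha = 0.05                                   # step 3: level of significance
z_crit = stats.norm.ppf(1 - alpha / 2)         # step 4: two-sided critical limit, ~1.96
z = (sample.mean() - mu0) / (sample.std(ddof=1) / np.sqrt(len(sample)))  # step 5
print(z, "reject H0" if abs(z) > z_crit else "do not reject H0")         # step 6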
Why use a one-sided test rather than a two-sided test, or the other way around?
- decide based on the research question: one-sided if a direction is specified (bigger or smaller), two-sided if testing “equal or not”
- if there is no direction (“is it different?”), it is a two-sided test
Type 1 and 2 errors
Type 1: reject H0 when H0 is actually true
Type 2: accept H0 when H0 is actually false
The HIGHER the confidence (and the LOWER the α value), the lower the chance of a Type 1 error
The LOWER the confidence (and the HIGHER the α value), the lower the chance of a Type 2 error
Assumptions in hypothesis testing (4)
• the data are normally distributed, or at least near-normally distributed
• the observations are independent
• in spatial data sets, this assumption is usually violated – nearby rainfall
stations will behave similarly simply because they are influenced by the
same weather patterns
• the correction usually appears as a reduction in sample size – since some of the
observations behave the same, we can group them together as a single
observation
- left uncorrected, the inflated n has the potential to increase the probability of a Type I error
rule-of-thumb: if you reject H0 at two-tails, then you
will always _____ H0 at one-tail, all other things being
equal
reject
classical method of hypothesis testing is less used in practice because..
While the classical method of hypothesis testing is useful to understand, it is less
used in practice than the alternative “prob-value” method
• one of the drawbacks of the classical method is that we need to specify α – often
there is no rational way of deciding what significance level to use – we know it
should be small, but how small?
• if 2 researchers choose different α levels, they might reach different
conclusions about a dataset – convention rather than theory dictates the choice
of α
• also, the classical method only determines whether we reject or accept H0 – this
simple decision leaves out important information, such as how confident we are that
we can reject H0
• if z = 1.96 and we get test statistics of 2.01 and 5.36 for two datasets, we reject
H0 for both, but clearly one dataset is more different than the other
Prob-value hypothesis testing method is better because
• the prob-value method avoids both of these problems by directly computing the
probability of making a Type I error
• if we reject the null hypothesis, the prob-value tells us how likely it is that we
are wrong
• assume that we calculate a test statistic of z = 1.5
• if we take that value to a z table, we would find a prob-value of 0.0668
• but note that this is for a one-tail test – if we are doing 2 tails we must double
the prob-value to 0.1336
• the prob-value is the lowest value at which we could set the significance of the
test and still reject H0
• likewise, we find that the likelihood of making a Type I error is 13.36%
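The card’s arithmetic, reproduced with scipy as a sketch:

from scipy import stats

z = 1.5
one_tail = stats.norm.sf(z)   # ~0.0668, the one-tail prob-value
two_tail = 2 * one_tail       # ~0.1336, the Type I error risk for a two-tail test
print(one_tail, two_tail)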
What is the likelihood of making a Type 1 error (for z = 1.5, two-tailed)?
13.36% – the one-tail prob-value of 0.0668, doubled
P Value:
- found from the z statistic (which is computed using the standard deviation) and compared against the chosen significance level
in a one-sample test output table what do the significance(2-tailed) and interval of differences tell you
sig=0.000
interval:
-341.0441 to -176.5871
The question (here, about rivers) asks: is the difference 0? The confidence interval shows us that 0 is not within the range of -341.0441 to -176.5871, and sig = 0.000 is below 0.05, so we reject H0 – the sample differs from the test value.
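A sketch of a comparable one-sample t test in scipy (the river flows and the test value below are invented, so the numbers will not match the output table described above):

import numpy as np
from scipy import stats

flows = np.array([412., 388., 455., 430., 401., 469., 444., 390.])
t_stat, sig_2tailed = stats.ttest_1samp(flows, 700.0)   # 700.0 is a made-up test value
diff = flows.mean() - 700.0
t_crit = stats.t.ppf(0.975, df=len(flows) - 1)
se = stats.sem(flows)
print(sig_2tailed)                                 # significance (2-tailed)
print(diff - t_crit * se, diff + t_crit * se)      # 95% interval of the difference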
What kind of test would you use for these?
• a cartographer tests the time taken by students to perform a set of tasks involving
the extraction of map information. At the end of the course, students are given
the same test again. Have the students learned how to use maps more effectively?
• the mean household size in a certain city is 3.2 persons, with a standard deviation
of 1.6. A business interested in estimating weekly household expenditures on food
takes a random sample of 100 households and finds a mean household size of 3.6
persons. Is the sample representative of the whole city?
• a geographer interested in comparing the shopping habits of men and women
interviewed 95 men and 115 women and determined the distances travelled by
each respondent to the store at which their last major clothing purchase was
made. The average distance travelled by men was 6.2 km and by women 14.8 km.
The standard deviations were 17.5 and 23.2 km, respectively. Is it true that, on
average, women drive 2 km further than men for shopping?
Scenario 1 = paired-sample test (testing once and then testing the same people again later on)
Scenario 2 = one-sample test
Scenario 3 = two-sample test
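A sketch of the matching scipy calls, one per scenario; all data below are invented placeholders with roughly the stated means and standard deviations:

import numpy as np
from scipy import stats

before = np.array([62., 55., 71., 48., 66.])   # map-task times, first attempt
after  = np.array([50., 47., 60., 44., 58.])   # same students retested
print(stats.ttest_rel(before, after))          # scenario 1: paired-sample test

households = np.random.default_rng(1).normal(3.6, 1.6, 100)
print(stats.ttest_1samp(households, 3.2))      # scenario 2: one-sample test vs city mean

men   = np.random.default_rng(2).normal(6.2, 17.5, 95)
women = np.random.default_rng(3).normal(14.8, 23.2, 115)
print(stats.ttest_ind(men, women))             # scenario 3: two-sample test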
Levene’s test for equality of variances tells you..
whether the sample variances are equal
Levene’s Test for Equality of Variances compares the variance of the two samples; if the p-value is above 0.05, the two variances are not significantly different and equal variances can be assumed.
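A sketch of Levene’s test in scipy (invented samples; the point is only the p > 0.05 reading):

import numpy as np
from scipy import stats

a = np.random.default_rng(0).normal(10, 2, 40)
b = np.random.default_rng(1).normal(10, 2, 40)
stat, p = stats.levene(a, b)
print(p, "equal variances plausible" if p > 0.05 else "variances differ")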
Tobler’s First Law of Geography
“everything is related to everything else, but near things are more related than distant things” – eg, nearby rainfall stations behave similarly because they are relatively close together and affected by similar weather patterns
Type 1 error is easier in larger data sets because
rejection of the null hypothesis is easier in larger datasets – we
can find significant differences in datasets when n is larger
• since spatial dependence raises n without providing useful data,
we are at risk of making Type-I errors – rejecting the null
hypothesis when it is actually true
sampling method 3 important parts
the sampling method defines how we select individuals from a
population to be part of a sample
• the population includes all members of the group of interest
• the sampling frame includes a list of members of the population
which can be selected for a sample – the sampling frame defines
what is being studied and what is not
• the sample is a subset of the members of the sampling frame
sampling bias
sampling bias, whether purposeful or not, means that some members of the
sampling frame are less likely to be included in the sample than others –
every entity should have an equal opportunity to be included, or not
included
define
- non probability sampling
- convenience sampling
- probability sampling
- simple random sampling
- systematic random sampling
- cluster sampling
1• non-probability sampling: the researcher cannot say what the chances are, or
likelihood, of an entity being selected for the sample
2• convenience sampling: the researcher includes all individuals that are
readily accessible until he/she is happy with the sample size
3• probability sampling: the chance or likelihood of each individual in the sampling
frame being selected for the sample is known
4• simple random sampling: randomly choose n
entities from the sampling frame
- always use a random number generator to
pick out individuals from the sampling
frame – never think of a random number
- note: random numbers are not necessarily
evenly spaced; if clusters arise from a
random process, it is still random
5• systematic random sampling: the first selected entity is chosen randomly, but
then every nth entity is chosen from the sampling frame
6.cluster sampling: a small number of subgroups are chosen, then
clusters of entities are chosen from those subgroups
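A sketch contrasting simple random and systematic random sampling on a toy sampling frame (everything below is illustrative):

import numpy as np

frame = np.arange(1000)                  # sampling frame: 1000 entity IDs
rng = np.random.default_rng(7)           # always use a random number generator

simple = rng.choice(frame, size=20, replace=False)   # simple random sample

step = len(frame) // 20                  # systematic: random start, then every nth entity
start = rng.integers(step)
systematic = frame[start::step][:20]

print(sorted(simple))
print(systematic)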
______ sampling is the worst for accuracy and bias
Cluster
• this is a mix of convenience and random sampling, and can
introduce severe bias in highly heterogeneous regions
• often used in political surveys, where certain neighbourhoods
are targeted to assess how they may vote in an upcoming
election
objects vs fields
• objects: discrete entities (eg, people, families, cities, schools, countries,
etc.)
• fields: continuous entities (eg, elevation, temperature, energy, salinity,
etc.)
• measuring continuous fields is very challenging, and the researcher usually
reverts to using discrete sampling methods to sample the continuous field
• eg, precipitation (a continuous field) is measured using a precipitation
gauge (a discrete entity)
representativeness
representativeness: the degree to which the smaller set resembles the larger set
• if our sampling method leads to non-representative samples, then we have
introduced bias
• the sampling frame generalizes the population, while the sampling method
generalizes the sampling frame
• non-probability sampling means the researcher cannot say with much
certainty how well the sample represents the sampling frame, which
means the link between the sample and population is less certain
- redundancy means that you have wasted resources like time and money by continuing data collection for too long
- not collecting enough data can result in missing important population characteristics
the entities that provide our data make up the _____, while the entities to
which we ultimately want to generalize make up the _____
sample
population
2 situations where Non parametric methods can be applied
there are two situations where nonparametric methods can be applied:
1. when the data of the random variable is nominal or ordinal, parametric
methods cannot be used – it is impossible for a nominal or ordinal variable
to be normally distributed
2. when the nature of the underlying population distribution is unknown or
unspecified – we don’t know that the random variable is normally
distributed
T OR F
it is possible for a nominal or ordinal variable
to be normally distributed
F
IT IS IMPOSSIBLE
Parametric vs Non-Parametric
non-parametric is not as quantitative; it can use ordinal or nominal data
- seen as worse than parametric, but in some situations there is no choice and it works well
- less powerful than parametric, creating more chance of a Type II error (failing to reject a false H0)
- parametric tests are very sensitive to normality; non-parametric tests are more robust and less affected by non-normality
- non-parametric can handle all types of data (nominal, ordinal, ratio, interval) with fewer restrictions
- non-parametric is easier to understand as it uses simple mathematics
- MORE reliable for SMALL samples, as normality is often violated
homoscedasticity
The assumption of homoscedasticity (meaning “same variance”) is central to linear regression models. Homoscedasticity describes a situation in which the error term (that is, the “noise” or random disturbance in the relationship between the independent variables and the dependent variable) is the same across all values of the independent variables.
goodness of fit tests
goodness-of-fit tests are a special class of nonparametric test which can be used to determine if the distribution of a random variable fits a prescribed probability distribution, eg, the normal distribution
• the simplest test for normality is the _____-______,
• the simplest test for normality is the quantile-quantile, or Q-Q, plot
• when the points on a Q-Q plot are a straight line, then we can assume
normality, otherwise it may be some other non-normal distribution
- ex: the Q-Q plot gives a plotted reference line along which your points should run
- unfortunately, the Q-Q plot approach is rather subjective – how close to a
straight line is needed to confirm normality?
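A sketch of drawing a Q-Q plot with scipy and matplotlib (invented normal data; points hugging the reference line suggest normality):

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

data = np.random.default_rng(0).normal(170, 6, 100)   # made-up heights
stats.probplot(data, dist="norm", plot=plt)           # points plus the reference line
plt.show()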
Chi-square test is used to test..
the non-parametric chi-square test is used to confirm the independence of two samples
• when comparing two datasets (eg, two-sample difference of means), the
assumption is that the 2 series are independent – obviously, if 2 or more individuals
are acting together they will inflate n and make it easier to make a Type I error
• H0: the samples are independent
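A sketch of a chi-square independence test in scipy, on an invented 2×2 contingency table:

import numpy as np
from scipy import stats

table = np.array([[30, 10],    # made-up counts: rows = group A/B,
                  [20, 25]])   # columns = yes/no
chi2, p, dof, expected = stats.chi2_contingency(table)
print(chi2, p)                 # a small p-value would reject independence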
the Shapiro-Wilk test is another, more probabilistic approach
• H0 is always that the random variable is normally distributed
• this prob-value approach tells us that there is a 25.4% chance of a Type I error if
we reject H0, therefore we must conclude that the data are normally distributed
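A sketch of the Shapiro-Wilk test in scipy (invented data; a p-value above the significance level, like the card’s 0.254, means we do not reject normality):

import numpy as np
from scipy import stats

data = np.random.default_rng(5).normal(0, 1, 50)
stat, p = stats.shapiro(data)
print(p, "treat as normal" if p > 0.05 else "not normal")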