Chapter 5: Statistical Data Treatment and Evaluation Flashcards
a numerical interval around the mean of a set of replicate results within which the population mean can be expected to lie with a certain probability
confidence interval
the limits of the interval are called
confidence limits
the probability that the true mean lies
within a certain interval and is often
expressed as a percentage.
confidence level
The probability that a
result is outside the confidence interval is often called the
significance level
90% confidence interval , z=
1.64
95% confidence interval , z=
1.96
99% confidence interval , z=
2.58
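The three z values above can be checked against the standard normal distribution. This is a minimal sketch using only the Python standard library; the helper name two_sided_z is an assumption made for illustration.

```python
from statistics import NormalDist

def two_sided_z(confidence):
    """z such that the central `confidence` fraction of a standard normal lies within +/- z."""
    # Half of the excluded probability sits in each tail.
    return NormalDist().inv_cdf(0.5 + confidence / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence interval, z = {two_sided_z(level):.2f}")
```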
formula for finding the confidence interval when the population standard deviation σ is known or s is a good estimate of σ
CI for μ = x ± zσ (single measurement)
CI for μ = x̄ ± zσ/√N (mean of N measurements)
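As a worked illustration of the formula for the mean of N measurements, the sketch below computes the confidence limits in Python. The replicate values and the value of sigma are made up, and the function name z_confidence_interval is an assumption.

```python
from math import sqrt
from statistics import mean

def z_confidence_interval(data, sigma, z=1.96):
    """Return (lower, upper) confidence limits for the population mean.

    Assumes sigma is known (or s is a good estimate of it) and no bias.
    """
    half_width = z * sigma / sqrt(len(data))
    x_bar = mean(data)
    return x_bar - half_width, x_bar + half_width

# Hypothetical replicate results and an assumed known sigma
replicates = [10.1, 9.9, 10.0, 10.2, 9.8]
low, high = z_confidence_interval(replicates, sigma=0.2, z=1.96)
print(f"95% CI for the mean: {low:.3f} to {high:.3f}")
```

Passing z = 1.64 or z = 2.58 instead gives the 90% or 99% limits.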
TRUE or FALSE
It is essential to keep in mind at all times that confidence intervals based on CI for μ = x̄ ± zσ/√N apply only in the absence of bias and only if we can assume that s is a good approximation of σ.
TRUE
studied the limits of the Poisson and binomial distributions,
the sampling distribution of the mean and standard deviation, and several
other topics.
W. S. Gosset
His most important work on the t test was developed to determine how closely the yeast and alcohol content of various batches of Guinness matched the standard amounts established by the brewery
W. S. Gosset
is the basis for many decisions made in science and engineering.
Hypothesis testing
postulates that two or more observed quantities are the same.
null hypothesis
Specific examples of hypothesis tests that scientists often use include the comparison
of
1) the mean of an experimental data set with what is believed to be the true
value,
2) the mean to a predicted or cutoff (threshold) value, and
3) the means or
the standard deviations from two or more sets of data.
TRUE or FALSE
The first, the null hypothesis H0, states that the population mean μ is equal to the true (accepted) value μ0. The second, the alternative hypothesis Ha, can be stated in several ways.
TRUE
it does not matter whether the mean is larger or smaller than the known value
two-tailed test
other alternative hypotheses are μ > μ0 or μ < μ0, in which the direction of the difference matters; a test of this kind is called a
one-tailed test
TRUE or FALSE
The crucial elements of a test procedure are the formation of an appropriate test statistic
and the calculation of a probability value—p-value
TRUE
TRUE or FALSE
The smaller the value of p, the stronger the evidence against H0. Conversely, if the p value is large, there is insufficient evidence to reject H0, and therefore it is retained (accepted).
TRUE
For tests concerning one or two means, the test statistic might be the ________ if we have a large number of measurements or if we know σ
z statistic
used for a small number of measurements with unknown σ; when in doubt, it should also be used.
t statistic
TRUE or FALSE
If the probability p of obtaining
the z (or t) value is very low when assuming H0 is true, reject H0.
TRUE
TRUE or FALSE
using the rejection region approach, if z (or t) lies within the rejection region, reject H0.
TRUE
TRUE or FALSE
The rejection region approach has fallen
out of favor because the region holds only for the chosen significance level
TRUE
what test is applicable if a large number of results are available so that s is a good estimate of σ
large sample z test
rejection can occur for results in either tail of the distribution.
two-tailed test
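A two-tailed large sample z test can be sketched as follows. The numbers are hypothetical, and the function name z_test_two_tailed is an assumption; the p value is the probability of a z at least this extreme in either tail when H0 is true.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_tailed(x_bar, mu_0, sigma, n):
    """Return (z, p) for H0: mu = mu_0 against Ha: mu != mu_0."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    # Both tails count, so the one-tail area is doubled.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical mean of 30 results compared with an accepted value of 5.0
z, p = z_test_two_tailed(x_bar=5.2, mu_0=5.0, sigma=0.5, n=30)
alpha = 0.05
print(f"z = {z:.2f}, p = {p:.3f}, reject H0: {p < alpha}")
```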
For a small number of results, use a similar procedure to the z test except that the test
statistic is the t statistic and the test is called the
small sample t test
TRUE or FALSE
In testing for bias, we do not know initially whether the difference between the experimental mean and the accepted value is due to random error or to an actual
systematic error. The t test is used to determine the significance of the difference.
TRUE
TRUE or FALSE
If the analytical method
had no systematic error, or bias, random errors would give a frequency distribution centered on the accepted value
TRUE
TRUE or FALSE
t test is used to determine the significance of the difference of experimental mean and the accepted value
TRUE
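A minimal sketch of the small sample t test for bias, assuming made-up replicate results and a made-up accepted value. The critical value is typed in from a t table rather than computed, and the function name one_sample_t is an assumption.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(data, accepted_value):
    """Return (t, degrees of freedom) for H0: mu = accepted_value."""
    n = len(data)
    t = (mean(data) - accepted_value) / (stdev(data) / sqrt(n))
    return t, n - 1

results = [0.112, 0.118, 0.115, 0.119]  # hypothetical replicate results
accepted = 0.123                        # hypothetical accepted value
t, dof = one_sample_t(results, accepted)
t_crit = 3.18  # two-tailed table value for 3 degrees of freedom, 95% confidence
print(f"t = {t:.2f}, dof = {dof}, significant difference: {abs(t) > t_crit}")
```

Here t is negative and its absolute value exceeds the critical value, which is the pattern expected from a method with a negative bias.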
If it were confirmed by further
experiments that the method always
gave low results, we would say that
the method had a
negative bias
TRUE or FALSE
If the absolute value of the test statistic is less than the critical value, the null hypothesis is accepted, and no significant difference between the means has been demonstrated
TRUE
TRUE or FALSE
A test value of t greater than the critical
value indicates a significant difference between the means
TRUE
TRUE or FALSE
Since t is greater than tcrit, you can conclude that there is a significant difference at the chosen confidence level.
TRUE
TRUE or FALSE
When t is less than the tcrit value, accept the null hypothesis at the chosen confidence level and conclude that there is no significant difference between the experimental and the accepted value.
TRUE
TRUE or FALSE
The number of degrees of freedom for finding the critical value of t when comparing two means is N1 + N2 - 2
TRUE
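The comparison of two means with N1 + N2 - 2 degrees of freedom can be sketched as below. This assumes the two data sets have equal standard deviations so that a pooled s can be used; the data and the function name pooled_t are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(set_1, set_2):
    """Return (t, degrees of freedom) for H0: the two population means are equal."""
    n1, n2 = len(set_1), len(set_2)
    dof = n1 + n2 - 2
    # Pooled standard deviation: variances weighted by their degrees of freedom
    s_pooled = sqrt(((n1 - 1) * variance(set_1) + (n2 - 1) * variance(set_2)) / dof)
    t = (mean(set_1) - mean(set_2)) / (s_pooled * sqrt(1 / n1 + 1 / n2))
    return t, dof

set_1 = [10.0, 10.2, 10.4]  # hypothetical results
set_2 = [10.5, 10.7, 10.9]  # hypothetical results
t, dof = pooled_t(set_1, set_2)
t_crit = 2.78  # two-tailed table value for 4 degrees of freedom, 95% confidence
print(f"t = {t:.2f}, dof = {dof}, significant difference: {abs(t) > t_crit}")
```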
If there is good reason to believe that the standard deviations of the two data sets differ, the ___________ must be used. However, the significance level for this t test is only approximate, and the number of degrees of freedom is more difficult to calculate.
two-sample t test
What statistical test should be used by scientists and engineers who often make use of paired measurements on the same sample in order to minimize sources of variability that are not of interest?
t test analyzing paired data
What statistical test should be used when two methods for determining glucose in blood serum are to be compared
t test analyzing paired data
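A paired t test works on the differences within each pair, so sample-to-sample variability cancels out. The sketch below assumes two hypothetical methods applied to the same four samples; the values and the function name paired_t are made up.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(method_a, method_b):
    """Return (t, degrees of freedom) for H0: the mean paired difference is zero."""
    differences = [a - b for a, b in zip(method_a, method_b)]
    n = len(differences)
    t = mean(differences) / (stdev(differences) / sqrt(n))
    return t, n - 1

method_a = [5.0, 6.1, 7.2, 8.0]  # hypothetical results, one per sample
method_b = [4.8, 6.0, 6.9, 7.8]  # hypothetical results on the same samples
t, dof = paired_t(method_a, method_b)
t_crit = 3.18  # two-tailed table value for 3 degrees of freedom, 95% confidence
print(f"t = {t:.2f}, dof = {dof}, methods differ: {abs(t) > t_crit}")
```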
occurs when H0 is
rejected although it is actually true.
type I error
In some sciences, a type I error is called a
false positive
occurs when H0 is accepted and
it is actually false.
type II error
type II error is sometimes termed as a
false negative
The probability of a type II error is given by the symbol
beta (β)
Making alpha (α) smaller, for example 0.01 instead of 0.05, appears to make sense in order to minimize what type of error?
type I error
TRUE or FALSE
decreasing the type I error rate increases
the type II error rate because they are inversely related to each other.
TRUE
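The trade-off can be illustrated numerically for one simple case. The sketch assumes a one-tailed z test in which the true mean lies a fixed number of standard errors above the value stated in H0; that setup and the function name type_ii_error_rate are assumptions chosen for illustration.

```python
from statistics import NormalDist

def type_ii_error_rate(alpha, shift_in_std_errors):
    """Probability of accepting H0 when the true mean is shifted upward."""
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1 - alpha)  # one-tailed critical value
    # H0 is accepted whenever the observed z falls below z_crit.
    return standard_normal.cdf(z_crit - shift_in_std_errors)

for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error_rate(alpha, shift_in_std_errors=2.0)
    print(f"alpha = {alpha:.2f}, beta = {beta:.3f}")
```

Each reduction in alpha gives a larger beta for the same shift.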
TRUE or FALSE
For many cases, an α value of 0.05
(95% confidence level) provides an acceptable compromise.
TRUE
TRUE or FALSE
If a type I error is much more
likely to have serious consequences than a type II error, it is reasonable to choose
a small value of α.
TRUE
TRUE or FALSE
In other situations, a type II error would be
quite serious, and so a larger value of α is used to keep the type II error rate under
control.
TRUE
TRUE or FALSE
As a general rule of thumb, the largest α that is tolerable for the situation
should be used. This ensures the smallest type II error while keeping the type I error
within acceptable limits.
TRUE
What statistical analysis requires that the standard deviations of the data sets being compared are equal?
t test
can be used to test this assumption under the provision that the populations follow the
normal (Gaussian) distribution. It is also used in comparing more than two means and in linear regression analysis
F test
is based on the null hypothesis that the two population variances under
consideration are equal
F test
TRUE or FALSE
The t and z tests are based on the null hypothesis that the two population means under consideration are equal
TRUE
is defined as the
ratio of the two sample variances; it is calculated and compared with the
critical value of F at the desired significance level
test statistic F
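A minimal sketch of the F statistic as a ratio of two sample variances. It places the larger variance in the numerator so that F is at least 1, which is a common convention when neither procedure is presumed more precise; the data and the function name f_statistic are made up.

```python
from statistics import variance

def f_statistic(data_1, data_2):
    """Return (F, numerator dof, denominator dof) with the larger variance on top."""
    v1, v2 = variance(data_1), variance(data_2)
    if v1 >= v2:
        return v1 / v2, len(data_1) - 1, len(data_2) - 1
    return v2 / v1, len(data_2) - 1, len(data_1) - 1

set_a = [10.0, 10.2, 10.4]  # hypothetical results
set_b = [10.0, 10.4, 10.8]  # hypothetical results
f_value, dof_num, dof_den = f_statistic(set_a, set_b)
f_crit = 19.00  # tabulated upper 5% point of F for 2 and 2 degrees of freedom
print(f"F = {f_value:.2f} ({dof_num}, {dof_den} dof), variances differ: {f_value > f_crit}")
```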
TRUE or FALSE
For a one-tailed test, we test the alternative hypothesis that one variance is greater than
the other. Hence, the variance of the supposedly more precise procedure is placed in the denominator and that of the less precise procedure is placed in the numerator
TRUE
TRUE or FALSE
If F is less than Fcrit and p is greater than the significance level α, we accept the null hypothesis
TRUE
TRUE or FALSE
If F is greater than Fcrit and p is less than the significance level α, we reject the null hypothesis
TRUE
Analysis of variance is abbreviated as
ANOVA
These methods use a single test to determine whether there is
or is not a difference among the population means rather than pairwise comparisons
as is done with the t test
ANOVA
take advantage of ANOVA in
planning and performing experiments.
Experimental design methods
Common characteristics in a
comparison are ________. The values of
the factors are ________. Experimental
results are ________
factors; levels; responses
the populations have differing values of a common characteristic called
factor or treatment
The different values of the factor of interest are called
levels
The comparisons among the various populations are made by measuring a
response
Often, several factors may be involved, such as in an experiment to determine
whether pH and temperature influence the rate of a chemical reaction. In such
a case, the type of ANOVA is known as a
two-way ANOVA
The basic principle of ________
is to compare the variations
between the different factor
levels (groups) to that within
factor levels.
ANOVA
In ANOVA, the factor levels are often called
groups
TRUE or FALSE
The basic principle of ANOVA is to
compare the between-groups variation to the within-groups variation.
TRUE
The basic statistical test used for ANOVA is the
F test
detect difference in several population means by comparing the variances
ANOVA
it is the average of all the data
grand average
TRUE or FALSE
The error sum of the squares is related to the individual group variances
TRUE
TRUE or FALSE
As a rough rule of thumb, the largest s should not be much more than
twice the smallest s for equal variances to be assumed
TRUE
By dividing the sums of squares by their corresponding degrees of freedom, you
can obtain quantities that are estimates of the between-groups and within-groups
variations. These quantities are called
mean square values
are sums of squares divided by degrees of
freedom.
mean square values
is an estimate of the variance due to error
mean square due to error (MSE)
is an estimate of the error variance plus the between-groups variance
mean square due to factor levels (MSF)
TRUE or FALSE
If the factor effect is significant, MSF is
greater than MSE.
TRUE
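The single-factor ANOVA quantities named in the cards above (grand average, sums of squares, mean squares, F) can be sketched as follows. The three groups of data are made up, and the function name one_way_anova is an assumption.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (MSF, MSE, F) for a list of groups (factor levels)."""
    all_values = [x for group in groups for x in group]
    grand_average = mean(all_values)
    n_total, n_groups = len(all_values), len(groups)
    # Between-groups sum of squares (due to the factor)
    ssf = sum(len(g) * (mean(g) - grand_average) ** 2 for g in groups)
    # Within-groups sum of squares (due to error)
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    msf = ssf / (n_groups - 1)
    mse = sse / (n_total - n_groups)
    return msf, mse, msf / mse

groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]  # hypothetical data
msf, mse, f_value = one_way_anova(groups)
print(f"MSF = {msf:.2f}, MSE = {mse:.2f}, F = {f_value:.2f}")
```

The resulting F is then compared with the critical value of F for (number of groups - 1) and (total results - number of groups) degrees of freedom.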
There are several methods to determine
which means are significantly different. One of the simplest is the
least significant difference method
In this method, a difference is calculated that is judged to be the smallest difference that is significant. The difference between each pair of means is then compared to the least significant difference to determine which means are different
least significant difference method
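A sketch of the least significant difference comparison, assuming equal group sizes and the common form LSD = t × sqrt(2 × MSE / Ng), where t is the table value for the within-groups degrees of freedom. The group means, MSE, and names used here are made up.

```python
from itertools import combinations
from math import sqrt

def least_significant_difference(t_crit, mse, n_per_group):
    """Smallest difference between two group means judged significant."""
    return t_crit * sqrt(2 * mse / n_per_group)

group_means = {"A": 2.0, "B": 3.0, "C": 6.0}  # hypothetical group means
t_crit = 2.45  # two-tailed table value for 6 degrees of freedom, 95% confidence
lsd = least_significant_difference(t_crit, mse=1.0, n_per_group=3)

# Compare every pair of means with the LSD
different = [
    (a, b)
    for a, b in combinations(group_means, 2)
    if abs(group_means[a] - group_means[b]) > lsd
]
print(f"LSD = {lsd:.2f}, significantly different pairs: {different}")
```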
is a result that is quite
different from the others in the data
set and might be due to a gross error.
outlier
is a simple, widely used statistical test for deciding whether a suspected
result should be retained or rejected
Q test
the absolute value of the difference
between the questionable result xq and its nearest neighbor xn is divided by the
spread w of the entire set to give the quantity Q: Q = |xq - xn| / w
Q test
TRUE or FALSE
If Q is greater than Qcrit, the questionable result can be rejected with the indicated degree of confidence
TRUE
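The Q calculation can be sketched as below, assuming the questionable result is the smallest or largest value in the set. The data are made up, the critical value is typed in from a Q table, and the function name q_statistic is an assumption.

```python
def q_statistic(data, suspect):
    """Return Q = |xq - xn| / w for a suspect value at either end of the data."""
    ordered = sorted(data)
    spread = ordered[-1] - ordered[0]
    if suspect == ordered[0]:
        nearest = ordered[1]
    elif suspect == ordered[-1]:
        nearest = ordered[-2]
    else:
        raise ValueError("suspect must be the smallest or largest value")
    return abs(suspect - nearest) / spread

data = [55.95, 56.00, 56.04, 56.08, 56.23]  # hypothetical replicate results
q = q_statistic(data, suspect=56.23)
q_crit = 0.64  # table value for 5 observations at 90% confidence
print(f"Q = {q:.2f}, reject the suspect result: {q > q_crit}")
```

Here Q is below the critical value, so the suspect result is retained.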
TRUE or FALSE
the only valid reason for rejecting
a result from a small set of data is the sure knowledge that a mistake was made in the
measurement process. Without this knowledge, a cautious approach to rejection of an outlier is wise.
TRUE
is used to compare more than two means and to determine whether any differences are real or the result of random errors.
ANOVA
can be used to minimize sources of variability that influence both
members of a pair of measurements.
paired t test
computes the t statistic and the probability of t assuming that H0 is true. If p
is small compared to the significance level α, H0 is rejected. Alternatively, a rejection
region is calculated from critical values of t.
t test
asserts that the mean is equal to the accepted value
null hypothesis
We can compare an experimental mean to a known value or an accepted level by a
hypothesis test, the ____________, if the population standard deviation is known.
z test
is the interval within which the population mean is expected to lie with a certain probability.
confidence interval