chapter 13 textbook Flashcards
validation
The process of ascertaining that interviews actually were conducted as specified.
The goal of validation is solely to detect interviewer fraud or failure to follow key instructions. Because cheating can occur, validation is an important step.
The purpose of the validation process, as noted earlier, is to ensure that interviews were administered properly and completely. Researchers must be sure that the research results on which they are basing their recommendations reflect the legitimate responses of target individuals.
editing
The process of ascertaining that questionnaires were filled out properly and completely.
- Whether the interviewer failed to ask certain questions or record answers for certain questions.
- Whether skip patterns were followed.
- Whether the interviewer paraphrased respondents’ answers to open-ended questions.
step 2: coding
The process of grouping and assigning numeric codes to the various responses to a question.
coding process
1. List responses.
2. Consolidate responses.
3. Set codes.
4. Enter codes:
a. Read responses to individual open-ended questions on questionnaires.
b. Match individual responses with the consolidated list of response categories, and determine the appropriate numeric code for each response.
c. Write the numeric code in the appropriate place on the questionnaire for the response to the particular question (see Exhibit 13.5) or enter the appropriate code in the database electronically.
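The code-entry step can be sketched in Python; the response categories, numeric codes, and matching rule here are hypothetical illustrations, not part of the textbook's procedure:

```python
# Hypothetical code frame built from a consolidated list of
# open-ended response categories (categories and codes are made up).
code_frame = {
    "price": 1,
    "quality": 2,
    "convenience": 3,
    "other": 9,
}

def code_response(raw: str) -> int:
    """Match a raw response to the consolidated list and return its numeric code."""
    raw = raw.lower()
    for category, code in code_frame.items():
        if category in raw:
            return code
    # Unmatched responses fall into the catch-all "other" category.
    return code_frame["other"]

print(code_response("I shop there because of the low price"))  # 1
print(code_response("no particular reason"))                   # 9
```

In practice the code would be written on the questionnaire or entered in the database, as described above.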
automated coding
A number of developments are making it likely that the tedious coding process for open-ended questions will also soon be replaced with computer-based systems requiring only limited high-level human intervention and decision-making.
step 3: data entry
The process of converting information to an electronic format.
step 5: tabulate the survey results
one way frequency table
A table showing the number of respondents choosing each answer to a survey question.
The frequency table is most often generated for multiple-choice questions.
analyzing a frequency table
In most instances, a one-way frequency table is the first summary of survey results seen by the research analyst. In addition to frequencies, these tables typically indicate the percentage of those responding who gave each possible response to a question.
Graphical representations of frequencies, like those we created using Excel or SPSS above, provide the manager with a quick picture of the results.
An issue that must be dealt with when one-way frequency tables are generated is which base to use for the percentages in each table: total respondents, the number of people asked the particular question, or the number of people answering the question.
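A quick sketch of a one-way frequency table showing two of these bases, using made-up responses (`None` marks someone who was asked the question but did not answer):

```python
from collections import Counter

# Illustrative answers to one survey question; None = asked but no answer.
answers = ["Yes", "No", "Yes", "Yes", None, "No", "Yes", None]

counts = Counter(a for a in answers if a is not None)
total_respondents = len(answers)      # base 1: everyone surveyed
answering = sum(counts.values())      # base 2: people who answered this question

for response, n in counts.items():
    pct_total = 100 * n / total_respondents
    pct_answering = 100 * n / answering
    print(f"{response}: {n} ({pct_total:.1f}% of respondents, "
          f"{pct_answering:.1f}% of those answering)")
```

The same count ("Yes" = 4) reads as 50.0% on the first base but 66.7% on the second, which is why the choice of base must be stated.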
cross tabulations
The examination of the response to one question relative to the responses to one or more other questions.
A common way of setting up cross tabulation tables is to construct a table using rows to represent factors such as demographics or other categorizing variables that are expected to be predictors of the state-of-mind, behaviour, or intentions data shown as columns of the table, or vice versa.
The researcher has a choice of three types of percentage calculations: row percentages, column percentages, and total percentages.
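The three percentage calculations can be sketched on a small hypothetical cross tabulation (gender by purchase intent; all data are made up):

```python
from collections import Counter

# Hypothetical data: gender (row variable) vs. purchase intent (column variable).
rows = ["F", "F", "M", "M", "F", "M"]
cols = ["Yes", "No", "Yes", "Yes", "Yes", "No"]

cell = Counter(zip(rows, cols))   # counts for each row/column cell
row_totals = Counter(rows)
col_totals = Counter(cols)
grand_total = len(rows)

for (r, c), n in sorted(cell.items()):
    print(f"{r}/{c}: n={n}, "
          f"row%={100 * n / row_totals[r]:.0f}, "     # base = row total
          f"col%={100 * n / col_totals[c]:.0f}, "     # base = column total
          f"total%={100 * n / grand_total:.0f}")      # base = all respondents
```

The same cell count takes on a different percentage under each base, so tables should label which of the three is shown.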
descriptive statistics
Descriptive statistics are the most efficient means of summarizing the characteristics of large sets of data. In a statistical analysis, the analyst calculates one number or a few numbers that reveal something about the characteristics of large sets of data.
measures of central tendency
arithmetic mean, median, and mode
The mean is properly computed only from interval or ratio (metric) data.
The median can be computed for all types of data except nominal data.
The mode can be computed for any type of data (nominal, ordinal, interval, or ratio). It is determined by finding the value that occurs most frequently.
One problem with using the mode is that a particular data set may have more than one mode.
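A quick illustration of the three measures with Python's `statistics` module, on made-up interval-scaled ratings; `multimode` shows a data set that has more than one mode:

```python
import statistics

ratings = [1, 2, 2, 3, 3, 3, 4, 5]  # illustrative interval-scaled ratings

print(statistics.mean(ratings))     # arithmetic mean: 2.875
print(statistics.median(ratings))   # middle value of the sorted data: 3.0
print(statistics.mode(ratings))     # most frequently occurring value: 3

# A data set can have more than one mode:
print(statistics.multimode([1, 1, 2, 2, 3]))  # [1, 2]
```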
measures of dispersion
Frequently used measures of dispersion include standard deviation, variance, and range.
Measures of dispersion indicate how widespread the data are.
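The three measures, sketched on illustrative data (population versions of variance and standard deviation are used here; sample versions divide by n − 1 instead):

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data, mean = 5

rng = max(scores) - min(scores)      # range: 7
var = statistics.pvariance(scores)   # population variance: 4.0
sd = statistics.pstdev(scores)       # population standard deviation: 2.0
print(rng, var, sd)
```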
percentages and statistical tests
In performing basic data analysis, the research analyst must decide whether to use measures of central tendency (mean, median, mode) or percentages (one-way frequency tables, cross tabulations). Responses to questions are either categorical or take the form of continuous variables.
statistical significance
The basic motive for making statistical inferences is to generalize from sample results to population characteristics. A fundamental tenet of statistical inference is that it is possible for numbers to be different (or related) in a mathematical sense but not significantly different (or related) in a statistical sense.
Three different concepts can be applied to the notion of differences when we are talking about results from samples:
- Mathematical differences or relation: By definition, if numbers are not exactly the same, they are different. Similarly, it may seem that when one variable goes up, the other goes up as well. This does not, however, mean that the difference or relation is either important or statistically significant.
- Statistical significance: If a particular difference is large enough (or relation is strong enough) to be unlikely to have occurred because of chance or sampling error, then it is statistically significant.
- Managerially important differences: One can argue that a difference (or relation) is important from a managerial perspective only if results or numbers are sufficiently different (or related). For example, the difference in consumer responses to two different packages in a test market might be statistically significant and yet so small that it has little practical or managerial significance.
hypothesis testing
hypothesis
An assumption or theory that a researcher or manager makes about some characteristics of the population under study.
The marketing researcher is often faced with the question of whether research results are different enough from the norm that some element of the firm’s marketing strategy should be changed.
Either the hypothesis is true and the observed difference is likely due to sampling error, or the hypothesis is false and there is indeed a difference in the population as well.
steps in hypothesis testing
Five steps are involved in testing a hypothesis. First, the hypothesis is specified. Second, an appropriate statistical technique is selected to test the hypothesis. Third, a decision rule is specified as the basis for determining whether to reject or fail to reject (FTR) the null hypothesis H0. Please note that we did not say “reject H0 or accept H0.” Although a seemingly small distinction, it is an important one. The distinction will be discussed in greater detail later on. Fourth, the value of the test statistic is calculated and the test is performed. Fifth, the conclusion is stated from the perspective of the original research problem or question.
step 1: stating the hypothesis
Hypotheses are stated using two basic forms: the null hypothesis H0 and the alternative hypothesis Ha. The null hypothesis H0 (sometimes called the hypothesis of the status quo) is the hypothesis that is tested against its complement, the alternative hypothesis Ha (sometimes called the research hypothesis of interest).
It should be noted that the null hypothesis and the alternative hypothesis must be stated in such a way that both cannot be true. The idea is to use the available evidence to ascertain which hypothesis is more likely to be true.
step 2: choosing the appropriate test statistic
- chi-square test
- ANOVA
- paired samples t-test
- etc.
Step 3: Developing a Decision Rule
The significance level (α) is critical in the process of choosing between the null and alternative hypotheses. The level of significance (0.10, 0.05, or 0.01, for example) is the probability that is considered too low to justify retaining the null hypothesis.
Step 4: Calculating the Value of the Test Statistic
- uses the appropriate formula to calculate the value of the statistic for the test chosen
- compares the value just calculated to the critical value of the statistic (from the appropriate table), based on the decision rule chosen
- based on the comparison, determines whether to reject or fail to reject the null hypothesis H0
Step 5: Stating the Conclusion
The conclusion summarizes the results of the test. It should be stated from the perspec- tive of the original research question.
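The five steps can be sketched as a one-sample z test with hypothetical numbers; the population standard deviation is assumed known here, which is why z rather than t is the test statistic:

```python
import math

# Step 1: state the hypothesis.
# H0: mu = 4.0 vs. Ha: mu != 4.0 (two-tailed). Numbers are hypothetical.
mu0, sigma = 4.0, 1.2
sample_mean, n = 4.3, 100

# Step 2: choose the test statistic: z, since sigma is assumed known.

# Step 3: decision rule at alpha = 0.05 (two-tailed): reject H0 if |z| > 1.96.
critical = 1.96

# Step 4: calculate the value of the test statistic and perform the test.
z = (sample_mean - mu0) / (sigma / math.sqrt(n))
print(f"z = {z:.2f}")  # 2.50

# Step 5: state the conclusion in terms of the original research question.
if abs(z) > critical:
    print("Reject H0: the mean rating differs from 4.0.")
else:
    print("Fail to reject H0: insufficient evidence of a difference.")
```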
TYPES OF ERRORS IN HYPOTHESIS TESTING
type I error (α error) Rejection of the null hypothesis when, in fact, it is true.
The probability of committing a type I error is referred to as the alpha (α) level. Conversely, 1 − α is the probability of making a correct decision by not rejecting the null hypothesis when, in fact, it is true.
type II error (β error) Failure to reject the null hypothesis when, in fact, it is false.
Only one of α and β can be controlled directly; for a given sample size, lowering one raises the other.
Depending on what is being tested, a type I error (measured by α) may not be nearly as serious as a type II error (measured by β).
ACCEPTING H0 VERSUS FAILING TO REJECT (FTR) H0
Researchers often fail to make a distinction between accepting H0 and failing to reject H0. However, as noted earlier, there is an important distinction between these two deci- sions. When a hypothesis is tested, H0 is presumed to be true until it is demonstrated to be likely to be false. In any hypothesis-testing situation, the only other hypothesis that can be accepted is the alternative hypothesis, Ha. Either there is sufficient evidence to support Ha (reject H0) or there is not (fail to reject H0). The real question is whether there is enough evidence in the data to conclude that Ha is correct. If we fail to reject H0, we are saying that the data do not provide sufficient support of the claim made in Ha—not that we accept the statement made in H0.
ONE-TAILED VERSUS TWO-TAILED TEST
Tests are either one-tailed or two-tailed. The decision as to which to use depends on the nature of the situation and what the researcher is trying to demonstrate.
HYPOTHESIS TESTING PARAMETERS
All these tests involve comparing a computed value to a tabular value of the statistic pertaining to the distribution of data to determine whether or not to reject the null hypothesis. The distributions used for comparing the computed and tabular values of the statistics are the Z distribution, the t distribution, the chi-square (χ2) distribution, and the F distribution. The tables of values for these distributions can be easily found online.
Independent versus Related Samples
independent samples
Samples in which measurement of a variable in one population has no effect on measurement of the variable in the other.
related samples
Samples in which measurement of a variable in one population may influence measurement of the variable in the other.
Degrees of Freedom
The number of degrees of freedom is the number of observations in a statistical problem that are not restricted or are free to vary.
The number of degrees of freedom (d.f.) is equal to the number of observations minus the number of assumptions or constraints necessary to calculate a statistic.
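For example, a sample variance uses the sample mean as one constraint on the data, so it divides by n − 1 degrees of freedom. A sketch with made-up numbers:

```python
import statistics

data = [3, 5, 7, 9]
n = len(data)
mean = statistics.mean(data)  # using the mean imposes one constraint

# Sample variance divides by the degrees of freedom, n - 1:
sample_var = sum((x - mean) ** 2 for x in data) / (n - 1)
print(sample_var, statistics.variance(data))  # both give the same value
```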
p Value and Significance Testing
A standard (a level of significance and an associated critical value of the statistic) is established, and then the value of the statistic is calculated to see whether it beats that standard. If the calculated value of the statistic exceeds the critical value, then the result being tested is said to be statistically significant at that level.
However, this approach does not give the exact probability of getting a computed test statistic that is largely due to chance.
p value The exact probability of getting a computed test statistic that is due to chance. The smaller the p value, the smaller is the probability that the observed result occurred by chance (sampling error).
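A sketch of computing an exact p value for a two-tailed z test, where z = 2.50 is a hypothetical computed test statistic; `math.erfc` gives the standard normal tail probability without needing a table:

```python
import math

# Hypothetical computed test statistic for a two-tailed z test:
z = 2.50

# Two-tailed p value: P(|Z| > z) = 2 * (1 - Phi(z)) = erfc(z / sqrt(2))
p = math.erfc(abs(z) / math.sqrt(2))
print(f"p = {p:.4f}")

# A p below the chosen significance level (e.g. 0.05) means the
# result is statistically significant at that level.
print(p < 0.05)
```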