CH12 Data Processing and Basic Data Analysis Flashcards

1
Q

what are the steps in the data analysis procedure for survey research?

A

data analysis procedure for survey research:
1. validation and editing (quality control)
2. coding
3. data capture
4. logical cleaning of data
5. tabulation and statistical analysis

2
Q

what is the purpose of step 1 in the data analysis procedure for survey research?

A

the researcher wants to make sure that all the interviews actually were conducted as specified (validation) and that the questionnaires have been filled out properly and completely (editing)

3
Q

define validation

A

validation: the process of ascertaining that interviews actually were conducted as specified

4
Q

what is the goal of validation?

A

to detect the interviewer’s fraud or failure to follow key instructions

5
Q

list the primary characteristics to consider when evaluating a panel company

A

primary characteristics to consider:
1. method of recruitment
2. frequency of replenishment
3. maximum number of surveys completed per month
4. verification of identity
5. profile information collected
6. incentives used
7. panel hygiene procedures
8. typical response rates
9. privacy policies in place
10. customer service
*continually monitor internet panel performance

6
Q

list the steps taken to maximize respondent cooperation and to eliminate respondents who did not give sufficient attention to the questions being asked or intentionally tried to “cheat” the system to be included in a survey in which they did not qualify

A

steps taken:
- check for multiple responses and out-of-area respondents
- exclude speeders (anyone who speeds through a survey so fast that they could not possibly have reasonably considered the survey)
- set a minimum survey length
- examine individual questions
- exclude straight liners/flat liners (those who are just clicking answers to get it done) and those who provide contradictory responses
- monitor key demographic and behavioral characteristics to be sure the sample is representative of the target market
- remove respondents before fielding has ended
- extend survey fielding
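
a minimal Python sketch of two of the checks above (speeders and straight liners). the record layout, the `flag_low_quality` name, and the 30%-of-median speed cutoff are assumptions for illustration, not a standard from the chapter:

```python
from statistics import median

def flag_low_quality(respondents, grid_keys, min_fraction=0.3):
    """Return {respondent_id: [reasons]} for respondents who look inattentive.

    min_fraction is an assumed cutoff: anyone finishing in less than this
    fraction of the median completion time is treated as a speeder.
    """
    cutoff = min_fraction * median(r["seconds"] for r in respondents)
    flags = {}
    for r in respondents:
        reasons = []
        if r["seconds"] < cutoff:
            reasons.append("speeder")
        # A straight liner gives the identical answer to every grid question.
        grid = [r["answers"][k] for k in grid_keys]
        if len(grid) > 1 and len(set(grid)) == 1:
            reasons.append("straight-liner")
        if reasons:
            flags[r["id"]] = reasons
    return flags

# Hypothetical respondents: completion time in seconds plus three rating answers.
respondents = [
    {"id": 1, "seconds": 600, "answers": {"q1": 4, "q2": 2, "q3": 5}},
    {"id": 2, "seconds": 90,  "answers": {"q1": 3, "q2": 3, "q3": 3}},
    {"id": 3, "seconds": 540, "answers": {"q1": 1, "q2": 1, "q3": 1}},
    {"id": 4, "seconds": 660, "answers": {"q1": 2, "q2": 4, "q3": 3}},
]
flags = flag_low_quality(respondents, ["q1", "q2", "q3"])
print(flags)  # {2: ['speeder', 'straight-liner'], 3: ['straight-liner']}
```

flagged respondents would still need a human review before exclusion, since identical grid answers can occasionally be genuine.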

7
Q

define big data

A

big data: the accumulation and analysis of massive quantities of information, often related to human behavior and actions

8
Q

how is big data used in making business decisions?

A

big data is used with analytical tools to uncover hidden patterns, correlations, trends, and customer preferences for better business decision-making

9
Q

what is the main characteristic of big data in terms of its organization?

A

big data is typically unstructured, meaning it lacks a predefined data model or organization, and can include text, video, audio, as well as dates, numbers, and facts

10
Q

what is unstructured data?

A

unstructured data: information that does not have a predefined structure or organization

11
Q

what is Hadoop and how is it used in processing big data?

A

Hadoop: an open-source software framework used for storing and processing big data in a distributed manner on large clusters of computers

it is designed to handle big data and unstructured data

12
Q

define editing

A

editing: the process of ascertaining that questionnaires were filled out properly and completely

13
Q

what are some problems in the editing process that require manual checking for paper surveys?

A

problems involved:
1. whether answers were not recorded for certain questions
2. whether skip patterns were followed
3. whether the interviewer paraphrased respondents’ answers to open-ended questions

14
Q

define a skip pattern

A

skip pattern: a sequence in which later questions are asked, based on a respondent’s answer to an earlier question or questions
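
a small Python sketch of a skip pattern and of the editing check that it was followed. the question ids, `SKIP_RULES`, and both function names are hypothetical:

```python
# Hypothetical questionnaire: each rule maps (question, answer) to the next question.
SKIP_RULES = {
    ("q1_owns_car", "no"): "q4_transit_use",  # non-owners skip the car questions
}
DEFAULT_ORDER = ["q1_owns_car", "q2_car_brand", "q3_car_age", "q4_transit_use"]

def next_question(current, answer):
    """Return the id of the next question to ask, or None at the end."""
    if (current, answer) in SKIP_RULES:
        return SKIP_RULES[(current, answer)]
    i = DEFAULT_ORDER.index(current)
    return DEFAULT_ORDER[i + 1] if i + 1 < len(DEFAULT_ORDER) else None

def skip_pattern_followed(record):
    """Check that exactly the questions on the respondent's path were answered."""
    expected, q = [], DEFAULT_ORDER[0]
    while q is not None:
        expected.append(q)
        q = next_question(q, record.get(q))
    return set(record) == set(expected)

followed = {"q1_owns_car": "no", "q4_transit_use": "daily"}
violated = {"q1_owns_car": "no", "q2_car_brand": "Ford", "q4_transit_use": "daily"}
print(skip_pattern_followed(followed), skip_pattern_followed(violated))  # True False
```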

15
Q

what is the disadvantage of the editing process?

A

it’s extremely tedious and time-consuming

16
Q

what is coding?

A

coding: the process of grouping responses to open-ended questions and assigning numeric codes to them

17
Q

why are open-ended questions phrased in an open-ended manner?

A

open-ended questions are phrased in an open-ended manner because the researcher either
- had no idea what answers to expect
- or wanted a richer response than is possible with a closed-ended question

18
Q

what is the process of coding responses to open-ended questions?

A

coding process:
1. list responses given
2. consolidate responses
3. set codes
4. enter codes
a. review responses to individual open-ended questions on questionnaires
b. match individual responses with the consolidated list of response categories and determine the appropriate numeric code for each response
c. record the numeric code in the appropriate place on the questionnaire for the response to that particular question (exhibit 12.5) or enter the appropriate code in the database electronically
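
a rough Python sketch of step 4 (entering codes). in practice a person matches each response to a category by judgment; the keyword matching here is only a stand-in, and `CODE_FRAME`, `OTHER_CODE`, and `code_response` are hypothetical names:

```python
# Hypothetical code frame produced by steps 1-3: category -> (numeric code, keywords).
CODE_FRAME = {
    "price": (1, ["cheap", "price", "afford"]),
    "taste": (2, ["taste", "flavor"]),
    "habit": (3, ["always", "habit"]),
}
OTHER_CODE = 98  # assumed catch-all code for responses matching no category

def code_response(text):
    """Return the numeric code of the first category whose keyword appears in text."""
    lowered = text.lower()
    for code, keywords in CODE_FRAME.values():
        if any(k in lowered for k in keywords):
            return code
    return OTHER_CODE

verbatims = ["It's the cheapest brand", "I like the flavor", "My doctor recommended it"]
codes = [code_response(v) for v in verbatims]
print(codes)  # [1, 2, 98]
```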

19
Q

define intelligent data capture

A

intelligent data capture: a form of data capture in which the information being entered into the data capture device is checked for internal logic

20
Q

what is the purpose of the data capture process?

A

the data capture process is used to enter validated, edited, and coded questionnaire data into a computer system

21
Q

why is it preferred to capture data directly from questionnaires rather than transposing them manually?

A

capturing data directly from questionnaires reduces the chances of introducing errors compared to manual transposition

22
Q

what does the handwritten number in the upper right-hand corner of the questionnaire represent?

A

the handwritten number in the upper right-hand corner uniquely identifies the questionnaire and serves as a reference for data input

23
Q

what does the number in parentheses next to a question indicate in terms of data entry?

A

the number in parentheses next to a question indicates the field on the data record where the code for the answer should be entered

24
Q

how are open-ended responses captured in the data record?

A

open-ended responses are captured by entering the corresponding code in the designated field of the data record

25
Q

what is scanning technology?

A

scanning technology: a form of data capture in which responses on questionnaires are read in automatically by the data capture device

26
Q

define logical cleaning of data

A

logical or machine cleaning of data: final computerized error check of data

27
Q

what are error-checking routines?

A

error-checking routines: computer programs that accept instructions from the user to check for logical errors in the data

28
Q

what is the purpose of an error-checking routine?

A

to ensure that data are logically consistent
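
a minimal Python sketch of an error-checking routine, assuming each data record is a dict of numeric codes. the field names, valid code sets, and the single consistency rule are made up for illustration:

```python
def check_record(record, valid_codes, rules):
    """Return a list of error messages for one data record."""
    errors = []
    # Range check: every field must hold one of its allowed codes.
    for field, allowed in valid_codes.items():
        if record.get(field) not in allowed:
            errors.append(f"{field}: invalid code {record.get(field)!r}")
    # Logic check: each rule is (message, function returning True when consistent).
    for message, is_consistent in rules:
        if not is_consistent(record):
            errors.append(message)
    return errors

# Hypothetical codebook: owns_car 1 = yes, 2 = no; cars_owned is a count from 0 to 3.
valid_codes = {"owns_car": {1, 2}, "cars_owned": {0, 1, 2, 3}}
rules = [
    ("non-owner reports cars",
     lambda r: not (r.get("owns_car") == 2 and r.get("cars_owned", 0) > 0)),
]

print(check_record({"owns_car": 2, "cars_owned": 1}, valid_codes, rules))
# ['non-owner reports cars']
```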

29
Q

define one-way frequency tables

A

one-way frequency table: table showing the number of respondents choosing each answer to a survey question

30
Q

what is an issue that must be dealt with when one-way frequency tables are generated?

A

what base should be used for the percentages for each table

31
Q

list the 3 options for a base used in one-way frequency tables

A

base options:
1. total respondents
2. number of people asked the particular question
3. number of people answering the question
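
a short Python illustration of how the choice of base changes the percentages in a one-way frequency table. the data and the convention (`None` = not asked, `""` = asked but no answer) are assumptions:

```python
from collections import Counter

# Hypothetical responses: None = not asked (skipped), "" = asked but left blank.
responses = ["yes", "no", "yes", "", None, "yes", None, "no", "", "yes"]

counts = Counter(r for r in responses if r)  # substantive answers only
bases = {
    "total respondents": len(responses),
    "asked the question": sum(r is not None for r in responses),
    "answered the question": sum(bool(r) for r in responses),
}

def percentages(base):
    """Percentage of each answer relative to the chosen base."""
    return {answer: round(100 * n / base, 1) for answer, n in counts.items()}

for name, base in bases.items():
    print(f"{name} (n={base}): {percentages(base)}")
```

with these ten records, "yes" is 40.0% of total respondents, 50.0% of those asked, and 66.7% of those answering, which is why the base must be stated with each table.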

32
Q

define cross-tabulation

A

cross-tabulation: an examination of the responses to one question relative to the responses to one or more other questions

33
Q

what is a common way of setting up cross-tabulation tables?

A
  • columns represent factors such as demographics and lifestyle characteristics, which may be predictors of the state-of-mind, behavior, or intentions data shown in the rows of the table
  • percentages usually are calculated on the basis of column totals. this approach permits easy comparisons of the relationship between, say, lifestyle characteristics and expected predictors such as sex or age
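
a small Python sketch of a cross-tabulation with percentages based on column totals, using only the standard library. the age groups and purchase-intent answers are invented data:

```python
from collections import Counter

# Hypothetical records: (age group = column factor, purchase intent = row variable).
records = [
    ("18-34", "will buy"), ("18-34", "will buy"), ("18-34", "won't buy"),
    ("35+", "will buy"), ("35+", "won't buy"), ("35+", "won't buy"), ("35+", "won't buy"),
]
cell_counts = Counter(records)
column_totals = Counter(age for age, _ in records)

def column_percent(age, intent):
    """Share of one column's respondents who gave this row answer."""
    return round(100 * cell_counts[(age, intent)] / column_totals[age], 1)

for intent in ("will buy", "won't buy"):
    row = {age: column_percent(age, intent) for age in ("18-34", "35+")}
    print(intent, row)
```

each column sums to 100%, so the two age groups can be compared directly even though they contain different numbers of respondents.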
34
Q

spreadsheet programs such as excel have extensive graphics capabilities. with these programs, it is possible to:

A
  • quickly produce graphs to tell your story
  • display those graphs on a monitor or with a projector
  • make desired changes and redisplay
  • print final copies on a laser or inkjet printer
35
Q

what are line charts typically used for?

A

line charts are particularly useful for presenting measurements over time

36
Q

what is appropriate to display through a pie chart?

A

pie charts are appropriate for displaying marketing research results in a wide range of situations

37
Q

what are the 4 types of bar charts?

A
  1. plain bar chart
  2. clustered bar chart
  3. stacked bar chart
  4. multiple-row, three-dimensional bar chart
38
Q

which chart is the most flexible out of the 3: line, pie, or bar chart?

A

bar charts are the most flexible

anything that can be shown in a line graph or pie chart can also be shown in a bar chart

39
Q

what are the 3 measures of central tendency?

A

3 measures of central tendency:
- arithmetic mean or average
- median
- mode

40
Q

define the mean

A

mean: the sum of the values for all observations of a variable divided by the number of observations

the mean is properly computed only from interval or ratio (metric) data

41
Q

what is the median?

A

median: the middle-most number when the observations are arranged in numerical order

the median can be computed for all types of data except nominal data

42
Q

define the mode

A

mode: the value that occurs most frequently

the mode can be computed for any type of data (nominal, ordinal, interval, or ratio)