Data Driven Decisions Flashcards

1
Q

Descriptive analytics

A

Depict and then describe the characteristics of what is being studied

2
Q

Predictive analytics

A

Use data from the past to predict the future

3
Q

Prescriptive analytics

A

Include experimental design and optimization to suggest a course of action

4
Q

True or False?

Data mining allows someone to draw conclusions about the underlying causes of the patterns found in certain variables.

A

False

Correct. This is a false statement. Data mining is often able to find trends, but it will usually overlook the underlying causes.

5
Q

True or False?

As technology improves, there will be a greater amount of raw data.

A

True

Correct. This statement is true. Data collection will become easier as technology improves, which will lead to an increase in raw data.

6
Q

Davenport-Kim three-stage model

A

A decision-making model developed by Thomas Davenport and Jinho Kim that consists of three stages: framing the problem, solving the problem, and communicating results

7
Q

Stage 1: Problem recognition consists of the following steps:

A

Identifying stakeholders
Focusing on decisions
Identifying the kind of story you’re going to tell
Determining the scope of the problem
Getting specific about what you’re trying to find out

8
Q

Stage 2: Solving the problem

A

The modeling step
The data collection step
The data analysis step

9
Q

True or False?

The first step in the Davenport-Kim three-stage model is to frame the problem by recognizing what the problem is and then reviewing previous findings to begin to structure the analysis.

A

True

Correct. This statement is true. Stage #1 is to frame the problem by recognizing what the problem is and then reviewing previous findings to begin to structure the analysis. Stage #2 is to solve the problem. Stage #3 is to communicate your findings.

10
Q

True or False?

The stage that involves the most intense statistics and data work is stage 3, communicating results.

A

False

Correct. This statement is false. The stage that involves the most intense statistics and data work is stage 2, solving the problem. This step includes data modeling, data collection, and data analysis.

11
Q

Continuous data

A

Data that can lie at any point within a range of values

12
Q

Discrete data

A

Data that can only take on whole values and has clear boundaries

13
Q

Nominal data

A

Nominal data, sometimes called categorical data, is used to label subjects in a study. It is a type of discrete data.
Ex: The choice of crayon color: burnt sienna, prussian blue, periwinkle, apricot
Ex: Type of tape: masking, packing, Scotch, electrical

14
Q

Ordinal data

A

Ordinal data is a type of discrete data. It places data objects into an order according to some quality: the higher a data object is on the scale, the more it has of that quality.
Ex: small, medium, and large paperclips
Ex: Level of education: some HS, HS degree/GED, some college, Bachelor’s, Master’s

15
Q

Interval data

A

Data that is ordered within a range and with each data point being an equal interval apart
Ex: Daily temperature (in Fahrenheit or Celsius)
Ex: The number that signifies the year: 2000, 1987, etc.

16
Q

Ratio data

A

Similar to interval data in that the data is ordered within a range and each data point is an equal interval apart, but ratio data also has a natural zero point, which indicates none of the given quality.
Ex: Heights of people in your family
Ex: The time it takes the Space Shuttle to orbit once around the earth

17
Q

True or False?

The following are examples of nominal data:

male/female
red/blue
living/deceased

A

True

Correct. This statement is true. Nominal data, sometimes called categorical data, places objects into a category.

18
Q

True or False?

Interval data has an order and all the objects are an equal interval apart.

A

True

Correct. This statement is true. Interval data has an order and all the objects are an equal interval apart. You cannot have a natural zero point in interval data.
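
Worked example: one practical consequence of the missing natural zero is that ratios of interval values are not meaningful, while ratios of ratio-level values are. The Python sketch below illustrates this with made-up temperatures and heights; the helper names and sample values are assumptions chosen for illustration only.

```python
import math


def celsius_to_fahrenheit(c: float) -> float:
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32


def cm_to_inches(cm: float) -> float:
    """Convert a length in centimeters to inches."""
    return cm / 2.54


# Interval data (temperature): the ratio changes when the unit changes,
# so "20 degrees is twice as hot as 10 degrees" is not a meaningful claim.
ratio_c = 20.0 / 10.0
ratio_f = celsius_to_fahrenheit(20.0) / celsius_to_fahrenheit(10.0)

# Ratio data (height): zero means "no height" in every unit,
# so the ratio is the same whichever unit is used.
ratio_cm = 180.0 / 90.0
ratio_in = cm_to_inches(180.0) / cm_to_inches(90.0)

print("temperature ratios:", ratio_c, ratio_f)
print("height ratios:", ratio_cm, ratio_in)
print("height ratio preserved:", math.isclose(ratio_cm, ratio_in))
```

The temperature ratio is 2.0 in Celsius but about 1.36 in Fahrenheit, whereas the height ratio stays at 2 in both units.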

19
Q

Data Management

A

The management, including cleaning and storage, of collected data.

20
Q

Analytics

A

The discovery, analysis, and communication of meaningful patterns in data.

21
Q

Big Data

A

A catch-phrase that describes a massive volume of data that is so large that it’s difficult to process using traditional database and software techniques.

22
Q

Blind Study

A

A study performed where the participants are not told if they are in the treatment group or control group

23
Q

Omission Error

A

An error because something (for example, data or survey response) is missing.

24
Q

Reliable Data

A

Data that is consistent and repeatable

25
Q

Benchmarks

A

Standards or points of reference for an industry or sector that can be used for comparison and evaluation.

26
Q

Valid Data

A

Data resulting from a test that accurately measures what it is intended to measure

27
Q

Data Set

A

A collection of related data records on a storage device.

28
Q

Systematic Errors

A

Errors in measurement that are constant within a data set, sometimes caused by faulty equipment or bias

29
Q

Relational Database

A

A database structured to recognize relations among stored items of information.

30
Q

Statistics

A

The science that deals with the interpretation of numerical facts or data through theories of probability. Also, the numerical facts or data themselves.

31
Q

Information Bias

A

A prejudice in the data that results when either the interviewer or the respondent has an agenda and is not presenting impartial questions or responding with truly honest responses, respectively

32
Q

Random Errors

A

Errors in measurement caused by unpredictable statistical fluctuations
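
Worked example: the difference between random and systematic errors can be seen in a small simulation. The Python sketch below is illustrative only; the true value, the size of the constant offset, and the noise level are all assumed numbers, not figures from the course material.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_VALUE = 100.0   # assumed true quantity being measured
OFFSET = 2.0         # assumed constant (systematic) error, e.g. a miscalibrated scale
NOISE_SD = 1.0       # assumed spread of the random error
N = 10_000           # number of simulated measurements

# Random error only: readings scatter around the true value.
random_only = [TRUE_VALUE + random.gauss(0, NOISE_SD) for _ in range(N)]

# Systematic plus random error: every reading is shifted by the same offset.
with_offset = [TRUE_VALUE + OFFSET + random.gauss(0, NOISE_SD) for _ in range(N)]

mean_random = statistics.mean(random_only)
mean_offset = statistics.mean(with_offset)

print(f"mean with random error only: {mean_random:.2f}")
print(f"mean with systematic offset: {mean_offset:.2f}")
```

Averaging many readings pulls the first mean close to the true value, but the second mean stays close to the true value plus the offset: repetition reduces random error and leaves systematic error in place.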

33
Q

Measurement Bias

A

A prejudice in the data that results when the sample is not representative of the population being tested

34
Q

Double-Blind Study

A

A study performed where neither the treatment allocator nor the participant knows which group the participant is in

35
Q

Triple-Blind Study

A

A study performed where neither the treatment allocator nor the participant nor the response gatherer knows which group the participant is in

36
Q

If you were to take your temperature 10 times in a row using the same thermometer and got the same result every time, you could say that the thermometer is __________.

a) valid
b) reliable
c) accurate
d) measurable

A

b) reliable

Feedback: The correct answer is B. A test is reliable if it is consistent and repeatable.

37
Q

According to the 2000 census, the average number of people in a family in the U.S. was 3.17. Since it isn’t possible to have 0.17 of a person, you would use a __________ data point to describe the number of people in your family.

a) continuous
b) discrete
c) valid
d) ordinal

A

b) discrete

Feedback: The correct answer is B. You would use a discrete number such as one, three, or five to describe the number of people in your family.
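
Worked example: the Python sketch below shows how whole-number family sizes can still produce a fractional average like the census figure. The list of family sizes is hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical family sizes: each value is a whole number (discrete data).
family_sizes = [2, 3, 4, 3, 5, 2]

# The average of discrete values does not have to be a whole number.
average_size = mean(family_sizes)

print("every size is a whole number:", all(isinstance(n, int) for n in family_sizes))
print(f"average family size: {average_size:.2f}")
```

Each individual observation is discrete, while the summary statistic computed from them can fall anywhere between the whole values.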

38
Q

You survey 100 New Yorkers about their preference for New York-style or Chicago-style pizza. What would be wrong with this?

a) You would encounter information bias.
b) You would encounter gender bias.
c) You would encounter random error.
d) You would encounter measurement bias.

A

d) You would encounter measurement bias.

Feedback: The correct answer is D. Asking 100 New Yorkers about their preferences would most likely result in measurement bias. The same would occur if you were to ask the question of 100 Chicagoans.

39
Q

Rankings are an example of which kind of data?

a) nominal
b) continuous
c) ordinal
d) discrete

A

c) ordinal

Feedback: The correct answer is C. Ordinal data places subjects in order according to some quality. So, if you came in first, second, or third in a race, this would be an example of ordinal data.

40
Q

The science of using mathematical procedures to describe data is __________.

a) statistics
b) mathematics
c) descriptive data
d) analytics

A

a) statistics

Feedback: The correct answer is A. Statistics uses mathematical procedures to describe data. Analytics makes use of statistical analysis.

41
Q

The third stage of Davenport and Kim’s Three-Stage Model of quantitative decision making is which of the following?

a) solving the problem
b) framing the problem
c) communicating results
d) None of the above

A

c) communicating results

Feedback: The correct answer is C. The third stage in Davenport and Kim’s Three-Stage model is communicating results.

42
Q

Cleaning and organizing collected raw data refers to which of the following?

a) data collection
b) data management
c) data discovery
d) rectangular data

A

b) data management

Feedback: The correct answer is B. Cleaning and organizing raw data is known as data management. The result is sometimes a rectangular data file.

43
Q

Suppose you wanted to determine the ratio of cyclists to drivers in cities with higher versus lower air quality. What kind of study might you use?

a) observational study
b) experimental study
c) double-blind study
d) triple-blind study

A

a) observational study

Feedback: The correct answer is A. Because you cannot control for all variables, you would not be able to use an experimental study or blind studies.

44
Q

Suppose you were to use analytics in an experiment to determine how many salespeople to assign to particular sales territories based on the makeup and performance of the territories in the results of the experiment. You would be using which kind of analytics?

a) predictive
b) prescriptive
c) descriptive
d) proactive

A

b) prescriptive

Feedback: The correct answer is B. Prescriptive analytics determines a course of action.

45
Q

Suppose you employed analytics to determine which sales territories had shown the most profitable growth in the last four quarters and would most likely do so again in the future. You would be using which kind of analytics?

a) predictive
b) prescriptive
c) descriptive
d) proactive

A

a) predictive

Feedback: The correct answer is A. Using past information to make decisions about the future is called predictive analytics.
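
Worked example: the Python sketch below contrasts a descriptive summary of past quarters with a simple predictive step. It assumes a straight-line trend fitted by ordinary least squares, which is only one of many possible predictive techniques, and the quarterly profit figures are hypothetical.

```python
# Hypothetical profit for one territory over the last four quarters.
quarters = [1, 2, 3, 4]
profits = [10.0, 12.0, 14.0, 16.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Descriptive: summarize what already happened.
descriptive_mean = sum(profits) / len(profits)

# Predictive: use the past trend to estimate the next quarter.
slope, intercept = fit_line(quarters, profits)
forecast_q5 = slope * 5 + intercept

print(f"average profit so far: {descriptive_mean}")
print(f"forecast for quarter 5: {forecast_q5}")
```

The average describes the past four quarters; the extrapolated value is a prediction and is only as good as the assumption that the trend continues.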

46
Q

Of the following, which is considered the most serious kind of data error?

a) poorly formatted data
b) number transposition
c) out-of-range data
d) missing data

A

d) missing data

Feedback: The correct answer is D. Missing data can severely compromise the results of your study.

47
Q

If you designed a drug trial in which the subject, the data gatherer, and the treatment allocator did not know who was in the control group, then you created a __________ study.

a) blind
b) biased
c) double-blind
d) triple-blind

A

d) triple-blind

Feedback: The correct answer is D. A study in which none of the parties knows who is in the control group and who is in the treatment group is a triple-blind study. If the treatment allocator and data gatherer are the same person, this would be a double-blind study.

48
Q

Suppose you were making a simplified representation of a complex problem in order to solve it. Which stage of the Three-Stage Model would you be in?

a) framing the problem
b) data collection
c) solving the problem
d) communicating results

A

c) solving the problem

Feedback: The correct answer is C. The modeling step is part of the solving the problem stage.

49
Q

Assume you are measuring the returns on investment, over the past year, for five different stocks in your portfolio. You find the following values (each as a percent of your investment): 4.68, 5.65, 3.78, -0.46, 6.91. What kind of data are these data points?

a) continuous data
b) nominal data
c) discrete data
d) ordinal data

A

a) continuous data

Feedback: The correct answer is A. In a set of continuous data, a value can lie at any point within a range.

50
Q

If you were to take your temperature 10 times in a row using the same thermometer and get the following results (in degrees Fahrenheit), what could you assume about the thermometer? 34, 99, 108, 45, 66, 21, 78, 53, 94, 102

a) It is reliable but not valid.
b) It is valid but not reliable.
c) It is neither reliable nor valid.
d) It is both reliable and valid.

A

c) It is neither reliable nor valid.

Feedback: The correct answer is C. Because the average temperature for human beings is 98.6 degrees Fahrenheit, you can assume the results are not valid. You can also assume they are unreliable, because of the wildly varying results.
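
Worked example: the reasoning in this feedback can be checked numerically. The Python sketch below treats "reliable" as a small spread between repeated readings and "valid" as an average close to 98.6 degrees Fahrenheit, which assumes (as the feedback does) that the person's true temperature is near normal. The two tolerance constants and the `assess` helper are illustrative assumptions, not standard thresholds.

```python
from statistics import mean, pstdev

NORMAL_BODY_TEMP_F = 98.6
MAX_SPREAD_F = 0.5   # assumed tolerance for calling readings consistent
MAX_OFFSET_F = 1.0   # assumed tolerance for calling the average accurate


def assess(readings):
    """Return (reliable, valid) for repeated readings of one person's temperature."""
    reliable = pstdev(readings) <= MAX_SPREAD_F
    valid = abs(mean(readings) - NORMAL_BODY_TEMP_F) <= MAX_OFFSET_F
    return reliable, valid


# Readings from the question: widely scattered and far from 98.6 on average.
card_readings = [34, 99, 108, 45, 66, 21, 78, 53, 94, 102]

# Hypothetical thermometer that always reads low: consistent but off target.
steady_but_off = [96.0] * 10

print("card readings (reliable, valid):", assess(card_readings))
print("steady but off (reliable, valid):", assess(steady_but_off))
```

The second case shows why the two ideas are separate: a thermometer can repeat the same wrong number every time, which makes it reliable but not valid.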

51
Q
To attract and retain their best customers, companies need a complete portrait of who those customers are. To develop this portrait, companies turn to…
A. Statistics
B. Analytics
C. Management Science
D. Histograms
A

B. Analytics

52
Q
A manufacturer wants to maximize their factory output while specifically minimizing labor costs. What type of analytics might they employ to achieve this goal?
A. Descriptive Analytics
B. Predictive Analytics
C. Prescriptive Analytics
D. Diagnostic Analytics
A

C. Prescriptive Analytics

53
Q
What type of data error that occurs in measurement is constant within a data set and is sometimes caused by faulty equipment or bias?
A. Random
B. Omission
C. Outlier
D. Systematic
A

D. Systematic

54
Q
An educator develops a new standardized test to measure the math skills of ninth graders. She has students in her home state of Ohio take the test. If the test is to be used on a national level, what type of error might be found in her data?
A. Omission Error
B. Systematic Error
C. Measurement Bias
D. Information Bias
A

C. Measurement Bias

55
Q
A city government is trying to determine the national origins of its recent immigrant population. If a survey of the immigrant population is conducted in English, what type of error might be present in the data?
A. Random
B. Omission
C. Outlier
D. Accuracy
A

B. Omission

56
Q

The use of Big Data is increasingly important to businesses in competitive markets. Which of the following characteristics is not true of big data?
A. Requires the use of analytics
B. Contains structured data
C. Contains unstructured data
D. Can be analyzed with traditional spreadsheets

A

D. Can be analyzed with traditional spreadsheets

57
Q
The Davenport-Kim three-stage model consists of framing the problem, solving the problem, and communicating results. Which two of the following are part of the framing the problem stage?
A. Determine the scope of the problem
B. Data collection
C. Review of previous findings
D. Presenting a recommendation
A

A. Determine the scope of the problem

C. Review of previous findings

58
Q
A healthcare provider is researching blood glucose levels before and after exercising. What two elements should be part of any experimental study such as this?
A. Treatment procedures
B. Patient observation
C. Statistical validity
D. Experimental response
A

A. Treatment procedures

D. Experimental response

59
Q
Runners cover 26.2 miles in the Olympic marathon. What level of measurement is this?
A. Nominal
B. Ordinal
C. Interval
D. Ratio
A

D. Ratio

60
Q
What level of measurement is the type of car produced in a Ford factory?
A. Nominal
B. Ordinal
C. Interval
D. Ratio
A

A. Nominal

61
Q
What level of measurement is a ranking of the 10 best cities in the U.S. to retire in?
A. Nominal
B. Ordinal
C. Interval
D. Ratio
A

B. Ordinal

62
Q
What level of measurement are women’s dress sizes (2, 4, 6, etc.)?
A. Nominal
B. Ordinal
C. Interval
D. Ratio
A

C. Interval

63
Q
A local school board is studying the impact of a proposed change in testing on math scores. Bias can be introduced into the study by both students and teachers. Which research technique would eliminate this type of bias?
A. Observation study
B. Blind study
C. Cohort study
D. Double blind study
A

D. Double blind study

64
Q
A company’s product development team tests 3 new car waxes by waxing 5 cars with each wax and then running them through a car wash. They then record the number of washes it takes before the wax begins to deteriorate. What is the term for the five cars?
A. The response
B. The construct validity
C. The experimental unit
D. The treatment
A

C. The experimental unit