Week 3 - Research and Measurement Flashcards

1
Q

why do research and analysis?

A

in order to make the right decision

2
Q

does all data and analysis have value?

A

NO - only if they help us make a decision

raw data has very little value

3
Q

in hypothesis testing, when do you make a prediction?

A

prior to testing (a priori)

4
Q

what is the purpose of marketing research?

A

to inform business decisions (vs scientific research, for instance)

5
Q

what do you call raw data once it has been analysed?

A

interpreted data, ie, information

6
Q

how should a decision maker be involved?

A

understand enough to know what’s reliable
tell the research team which questions to answer
potentially make predictions
project manage perhaps
be able to think like a researcher

7
Q

how should a researcher be involved?

A

convert questions/predictions into testable hypotheses
conduct the applicable research
present results in a way to answer the original question
communicate information clearly - simplify the complex

8
Q

how should administrators be involved?

A

understand sufficiently to

1) find common ground
2) engage throughout the process

9
Q

what is inferential statistics?

A

statistical analysis to infer or estimate something about a population from a sample

based on probability

10
Q

what are the properties of data?

A

assignment
assignment and order
assignment, order, and distance
assignment, order, distance, and origin

11
Q

what is the minimal requirement for raw data to be analysed?

A

must be able to place into categories (at least assignment)
can have:
assignment, order, distance, and origin

12
Q

what is assignment for data?

A

groupings

eg, color, gender, state

13
Q

what is order for data?

A

data points that can be ordered
eg, birth order, class rank, placement in race

14
Q

what is distance for data?

A

ability to understand how far apart data points are from each other
eg, one person has 100%, another has 80%, distance is 20 percentage points

15
Q

what is origin for data?

A

an unambiguous starting point or point of comparison
eg, zero is the lowest grade, 2018 is the current year

allows measurement of distance between data points AND vs origin

16
Q

what are the four classifications of data?

A

non-metric

  • nominal
  • ordinal

metric

  • interval
  • ratio
17
Q

what is nominal data classification?

A

nonmetric = nonparametric tests
assignment only
central tendency is only mode (most frequently occurring)

eg, most of these m&ms are blue

18
Q

what is ordinal data classification?

A

nonmetric
assignment and order
central tendency is only mode or median

eg, shortest to tallest height

19
Q

what is interval data classification?

A

metric = parametric data analysis available
assignment, order, and distance
(considered continuous because distance between points is measurable)
central tendency: mean, median, and mode (all three)

eg, what is the average length of a canoe

20
Q

what is ratio data classification?

A

metric
assignment, order, distance, and origin
continuous
all central tendencies (mean, median, and mode)

eg, star ratings between books, consumption over years
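
The four classification cards above can be summarised as a small lookup table. This is only a study sketch: the names `LEVELS` and `allowed_measures` are made up for illustration and do not come from the course.

```python
# Hypothetical lookup table: which properties each level of data has,
# and which measures of central tendency are therefore available.
LEVELS = {
    "nominal": {
        "properties": ["assignment"],
        "central_tendency": ["mode"],
    },
    "ordinal": {
        "properties": ["assignment", "order"],
        "central_tendency": ["mode", "median"],
    },
    "interval": {
        "properties": ["assignment", "order", "distance"],
        "central_tendency": ["mode", "median", "mean"],
    },
    "ratio": {
        "properties": ["assignment", "order", "distance", "origin"],
        "central_tendency": ["mode", "median", "mean"],
    },
}


def allowed_measures(level):
    """Return the central tendency measures usable at a given level."""
    return LEVELS[level]["central_tendency"]


for level in LEVELS:
    print(level, "->", ", ".join(allowed_measures(level)))
```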

21
Q

what are descriptive statistics?

A

a quantitative approach to identifying characteristics about a respondent pool
not a testing method

who answered our questions? what is the make up of our data overall?

22
Q

what tools does descriptive statistics use?

A

central tendencies (mean, median, mode)
percentages
measures of dispersion
frequency distributions

23
Q

when and how can you use mode?

A

any data with assignment (nominal)

what’s the most common?
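
A minimal sketch of finding the mode of nominal data, reusing the m&m idea from the nominal card. The colour list is invented for illustration.

```python
from collections import Counter

# Made-up nominal data: categories only, no order or distance.
colours = ["blue", "red", "blue", "green", "blue", "red"]

counts = Counter(colours)
# most_common(1) returns a list with the single most frequent (value, count) pair.
mode_value, mode_count = counts.most_common(1)[0]

print(mode_value, mode_count)  # blue 3
```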

24
Q

how do you use median?

A

any data with an order property
what’s in the middle if you count from each side?
if there are two middle values (an even count), average them to get the answer
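
A short sketch of both median cases, using Python's standard `statistics` module. The numbers are invented.

```python
import statistics

odd_count = [3, 1, 7, 5, 9]   # sorted: 1 3 5 7 9 -> one middle value
even_count = [3, 1, 7, 5]     # sorted: 1 3 5 7   -> two middle values

median_odd = statistics.median(odd_count)
median_even = statistics.median(even_count)

print(median_odd)   # 5
print(median_even)  # 4.0, the average of 3 and 5
```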

25
Q

how do you use the mean?

A

only if you have distance property

average the group
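
As a quick illustration with invented scores, the mean is the sum divided by the count. It only makes sense because the gaps between scores are measurable.

```python
# Made-up test scores (metric data, so distances between values are meaningful).
scores = [100, 80, 90, 70]

mean_score = sum(scores) / len(scores)

print(mean_score)  # 85.0
```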

26
Q

what is a percentage?

A

a frequency, expressed as a fraction of 100
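
A tiny worked example with invented survey answers: count the frequency, then scale it to a base of 100.

```python
# Made-up survey responses.
responses = ["yes", "no", "yes", "yes"]

yes_count = responses.count("yes")            # frequency: 3 of 4
pct_yes = 100 * yes_count / len(responses)    # expressed out of 100

print(pct_yes)  # 75.0
```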

27
Q

what is a range?

A

the distance between the smallest and largest numbers in the data
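
With invented numbers, the range is simply the largest value minus the smallest.

```python
# Made-up metric data.
values = [12, 7, 3, 15, 9]

data_range = max(values) - min(values)  # 15 - 3

print(data_range)  # 12
```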

28
Q

how do you measure standard deviation?

A

roughly, the typical distance between data points and the mean (strictly, the square root of the average squared difference from the mean)
how similar are numbers on average?
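
A step-by-step sketch of the calculation on invented data, assuming we treat the values as a whole population (dividing by n; a sample estimate would divide by n - 1 instead). The result is checked against `statistics.pstdev` from the standard library.

```python
import math
import statistics

# Made-up data, treated here as a complete population.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(values) / len(values)                     # 5.0
squared_diffs = [(v - mean) ** 2 for v in values]    # 9, 1, 1, 1, 0, 0, 4, 16
variance = sum(squared_diffs) / len(values)          # 32 / 8 = 4.0
std_dev = math.sqrt(variance)                        # 2.0

print(std_dev)                    # 2.0
print(statistics.pstdev(values))  # 2.0, same result from the library
```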

29
Q

how do you measure frequency distribution?

A

visualise the distribution of the data - say with a bar chart
mode is just picking the tallest bar
can be applied to nominal data
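
A rough text-only version of such a bar chart, using invented colour data. The mode is whichever category ends up with the longest bar.

```python
from collections import Counter

# Made-up nominal data.
colours = ["blue", "red", "blue", "green", "blue", "red"]

counts = Counter(colours)

# One row per category, longest bar first; each '#' is one observation.
for category, n in counts.most_common():
    print(f"{category:<6} {'#' * n} ({n})")

tallest_bar = counts.most_common(1)[0][0]  # the mode
```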

30
Q

what is the difference between census and sample studies?

A

census = entire population
sample = part of the population

inferential statistics help when you can’t perform a whole census

31
Q

when is a census study better than a sample study?

A

any time you can do a census study

but often it isn’t reasonably possible

32
Q

what is a population parameter?

A

population parameter = true fact based on 100% observation (census)
statistic = estimate
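
A small simulation sketch of the parameter/statistic distinction. Everything here is invented: a simulated population of 10,000 values, a sample of 200, and a fixed seed so the run is repeatable.

```python
import random

rng = random.Random(42)  # fixed seed, purely so the sketch is repeatable

# Simulated "population": in real research we rarely get to see all of it.
population = [rng.gauss(50, 10) for _ in range(10_000)]
parameter = sum(population) / len(population)   # true value (census)

# A random sample of 200 gives a statistic, i.e. an estimate of the parameter.
sample = rng.sample(population, 200)
statistic = sum(sample) / len(sample)

sampling_error = statistic - parameter
print(round(parameter, 2), round(statistic, 2), round(sampling_error, 2))
```

The statistic should land close to the parameter but will almost never match it exactly, which is why inference is stated in terms of probability.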

33
Q

what are the pros and cons of sampling?

A

pro

  • lower cost
  • easier and faster data handling

cons

  • higher error rate
  • errors can drive bad decision making
34
Q

why are sample-based estimates useful?

A

probability distributions allow for predictable estimates

35
Q

how much does sample drawing matter?

A

it’s THE most critical part - an error here can lead to skewing or bias

36
Q

how can you draw a sample?

A

probability - researcher has no role in drawing (eg, random sample)
nonprobability - researcher does have a role (eg, convenience sampling of people nearby)

37
Q

what is probability sampling?

A

researcher plays no role in building the sample
generally near random
every data point has a known, though not necessarily exactly equal, chance of being selected

38
Q

what is nonprobability sampling?

A

researcher does play a role in selection

convenience sample is very common - stopping people at the subway for instance
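
The contrast between the two sampling cards can be sketched as follows. The respondent IDs are invented, and "the first ten people encountered" is just one assumed way a convenience sample might arise.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical respondent IDs 1..100.
population = list(range(1, 101))

# Probability sampling: chance picks the members, not the researcher.
probability_sample = rng.sample(population, 10)

# Nonprobability (convenience) sampling: the researcher takes whoever is nearest,
# modelled here as simply the first ten IDs.
convenience_sample = population[:10]

print(sorted(probability_sample))
print(convenience_sample)
```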

39
Q

why does error occur in statistical inference?

A

because a sample is not a census
thus while it is in theory representative,
in reality it can often differ from the population

40
Q

what are the two types of errors found in statistical inferences?

A

sampling error - nonrepresentative sample

nonsampling error - systematic and/or random error not associated with the manner of drawing the sample

41
Q

when should sampling error be suspected?

A

probability sampling (random) - lowest risk of sampling error (chance can still produce an unrepresentative sample), but VERY rarely 100% followed (think - completion bias)
non-probability sampling (selected) - high risk of error, must assume at least a certain level of error (hence statistical significance)

42
Q

when should nonsampling error be suspected?

A

any time you don’t have a full census

even if the sample is random, if it isn’t complete (eg census) we can never be 100% sure of conclusions

43
Q

what is the null hypothesis?

A

the assumption that there is no difference between compared populations
eg, people who take this medicine are no better off than people who don’t
the null hypothesis is assumed true until the evidence is strong enough to reject it

44
Q

what is a Type 1 error?

A

telling a man he’s pregnant when he isn’t

rejecting the null hypothesis, when it’s actually True

45
Q

what is a Type 2 error?

A

telling a man he’s not a man when he really is

failing to reject (accepting) the null hypothesis when it is actually false

normally type 2 is safer

46
Q

can you decrease the likelihood of type 1 or 2 errors?

A

yes, by selecting significance levels
but for a given sample size, decreasing type 1 increases risk of type 2
choose your adventure
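
The trade-off can be seen in a small simulation. This is a sketch under invented assumptions: a two-sided z-test with known standard deviation 1, samples of 30, a true effect of 0.4 when the null is false, and helper names (`z_test_rejects`, `rejection_rate`) made up for illustration.

```python
import random
from statistics import NormalDist


def z_test_rejects(sample, mu0, sigma, alpha):
    """Two-sided z-test: True if the null (mean == mu0) is rejected."""
    sample_mean = sum(sample) / len(sample)
    z = (sample_mean - mu0) / (sigma / len(sample) ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def rejection_rate(true_mean, alpha, trials=2000, n=30, seed=1):
    """Share of simulated studies in which the null (mean == 0) is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if z_test_rejects(sample, mu0=0.0, sigma=1.0, alpha=alpha):
            rejections += 1
    return rejections / trials


results = {}
for alpha in (0.10, 0.01):
    type1 = rejection_rate(true_mean=0.0, alpha=alpha)      # null true, rejected anyway
    type2 = 1 - rejection_rate(true_mean=0.4, alpha=alpha)  # null false, not rejected
    results[alpha] = (type1, type2)
    print(f"alpha={alpha}: type 1 rate={type1:.3f}, type 2 rate={type2:.3f}")
```

Tightening the significance level from 0.10 to 0.01 should lower the type 1 rate and raise the type 2 rate, since the sample size stays the same.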

47
Q

what are the two categories of data collection?

A

primary data

secondary data

48
Q

what is secondary data?

A

collected for a purpose other than this research project

eg, UN data

49
Q

what is primary data?

A

collected specifically for our hypotheses

50
Q

what are the pros/cons of secondary data?

A

pros

  • available, already there
  • price, might be cheap or even free

cons

  • relevancy, might not fit needs
  • accuracy, why was it collected, what standards were in place?
51
Q

what is big data?

A

normally secondary data
passively collected
both structured and unstructured
can test hypotheses, but can’t verify cause/effect

52
Q

how is primary data collected?

A

questioning - survey, interview (might not be answered honestly)
observing - watching, documenting (more honest answers, but harder to understand the why) - on a person or on a company (eg, keyword analysis of company legal policies)

53
Q

how can you establish causality?

A

only through experimentation

must be very careful to not communicate correlation as causality

54
Q

what three factors are required to prove causality?

A

evidence of statistical association
temporal ordering
control for competing hypotheses

55
Q

how do you prove causality - evidence of statistical association?

A

necessary, but insufficient for causality

56
Q

how do you prove causality - temporal ordering?

A

must prove that A came before B

eg, fire trucks arrived after fire started, not before

57
Q

how do you prove causality - control for competing hypotheses?

A

look for unmeasured or unobserved hypotheses
alternative hypotheses
randomise away errors through probability sampling and experiment design

churches and liquor stores increase in parallel, but even with temporal ordering, neither causes the other
reality: population growth caused both
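
The churches and liquor stores example can be simulated. All numbers here are invented: 500 towns, a church per 1,000 residents, a liquor store per 2,000, plus random noise. Subtracting the known population effect is a simplification; with real data you would estimate that effect (for example by regression) rather than know it.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Population drives both counts; neither count influences the other.
population = [rng.uniform(1_000, 100_000) for _ in range(500)]
churches = [p / 1000 + rng.gauss(0, 5) for p in population]
liquor_stores = [p / 2000 + rng.gauss(0, 5) for p in population]

raw_r = pearson(churches, liquor_stores)

# Control for the competing explanation: remove population's effect from each.
church_resid = [c - p / 1000 for c, p in zip(churches, population)]
liquor_resid = [s - p / 2000 for s, p in zip(liquor_stores, population)]
controlled_r = pearson(church_resid, liquor_resid)

print(f"raw correlation: {raw_r:.2f}")
print(f"after controlling for population: {controlled_r:.2f}")
```

The raw correlation comes out strong, while the controlled one should sit near zero: statistical association alone, without ruling out the shared cause.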