Week 3 - Research and Measurement Flashcards

Question

how do you use the mean?

Answer 1

only if the data has the distance property (interval or ratio scale) | average the group

Answer 2

a frequency, expressed as a fraction of 100

Answer 3

the distance between the smallest and largest numbers in the data

Answer 4

what is the average difference between data points and the mean? | how similar are the numbers, on average?
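
A minimal sketch of the three measures above, assuming Answers 1, 3 and 4 describe the mean, the range and the standard deviation. The `scores` list is made-up interval-scale data, and `pstdev` (the population form) is one reasonable choice here; note that the standard deviation is a root-mean-square distance from the mean, so it only approximates the "average difference" wording on the card.

```python
import statistics

# Hypothetical interval-scale data: test scores for one group.
scores = [62, 70, 70, 75, 83]

# Mean: average the group (meaningful because scores have the distance property).
mean = statistics.mean(scores)

# Range: distance between the smallest and largest numbers.
data_range = max(scores) - min(scores)

# Standard deviation (population form): typical distance of a point from the mean.
std_dev = statistics.pstdev(scores)

print(mean, data_range, round(std_dev, 2))
```

A small standard deviation relative to the range suggests the numbers cluster near the mean; a large one suggests they are spread out.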

Answer 5

visualise the distribution of the data, say with a bar chart | the mode is just the tallest bar | can be applied to nominal data
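
The "tallest bar" idea can be sketched in a few lines. The `colours` list is invented nominal data (labels with no order or distance), and the text bar chart is only a stand-in for a real plot.

```python
from collections import Counter

# Hypothetical nominal data: favourite colour of eight respondents.
colours = ["red", "blue", "blue", "green", "blue", "red", "green", "blue"]

# Count how often each label occurs: the height of each bar.
counts = Counter(colours)

# The mode is the label with the tallest bar.
mode, mode_count = counts.most_common(1)[0]

# Crude text bar chart, tallest bar first.
for label, count in counts.most_common():
    print(f"{label:<6} {'#' * count}")
```

No averaging is involved, which is why the mode works on nominal data where the mean does not.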

Answer 6

```
census = entire population
sample = part of the population
```
inferential statistics help when you can't perform a whole census

Answer 7

any time you can do a census study | but often it isn't reasonably possible

Answer 8

population parameter = true fact based on 100% observation (census) | statistic = estimate based on a sample
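
A rough illustration of parameter versus statistic, using a simulated population. The ages, the population size of 10,000 and the sample size of 200 are all assumptions made for the sketch.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 10,000 people.
population = [rng.randint(18, 90) for _ in range(10_000)]

# Census: observe everyone, so the result is the population parameter.
parameter = statistics.mean(population)

# Sample: observe only 200 people, so the result is a statistic (an estimate).
sample = rng.sample(population, 200)
statistic = statistics.mean(sample)

print(f"parameter={parameter:.2f} statistic={statistic:.2f}")
```

The two numbers should be close but not identical; the gap between them is the price of not running the census.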

Answer 9

pros
- lower cost
- easier and faster data handling
cons
- higher error rate
- errors can drive bad decision making

Answer 10

probability distributions allow for predictable estimates

Answer 11

it's THE most critical part - an error here can lead to skewing or bias

Answer 12

probability - researcher has no role in drawing the sample (eg, random sample) | nonprobability - researcher does have a role (eg, convenience sampling of people nearby)

Answer 13

researcher plays no role in building the sample | generally near random: close to, but not exactly, every data point having an equal chance of being selected

Answer 14

researcher does play a role in selection | convenience sample is very common - stopping people at the subway for instance
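
The difference between the two sampling approaches can be sketched as follows. The `frame` of 1,000 commuter IDs and the sample size of 50 are hypothetical; the point is only who does the choosing.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: 1,000 commuter IDs, in the order
# they happen to walk past the researcher.
frame = list(range(1000))

# Probability sample: a random draw picks the members, not the researcher.
probability_sample = rng.sample(frame, 50)

# Nonprobability (convenience) sample: the researcher stops the first 50 people.
convenience_sample = frame[:50]

print(sorted(probability_sample)[:5], convenience_sample[:5])
```

The convenience sample only ever contains the earliest arrivals, so anything that differs between early and late commuters is invisible to it.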

Answer 15

because a sample is not a census | thus, while it is representative in theory, reality can often differ

Answer 16

sampling error - nonrepresentative sample | nonsampling error - systematic and/or random error not associated with the manner of drawing the sample

Answer 17

```
probability sampling (random) - no risk of sampling error, but VERY rarely 100% followed (think - completion bias)
non-probability sampling (selected) - high risk of error, must assume at least a certain level of error (hence statistical significance)
```

Answer 18

any time you don't have a full census | even if the sample is random, if it isn't complete (ie, a census) we can never be 100% sure of conclusions
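
One common way to express "never 100% sure" is a confidence interval around the sample mean. This is a sketch under stated assumptions: the 100 measurements are simulated, and the interval uses the normal approximation, which is only reasonable for a fairly large random sample.

```python
import random
import statistics
from statistics import NormalDist

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical random sample of 100 measurements (not a census).
sample = [rng.gauss(50, 10) for _ in range(100)]

mean = statistics.mean(sample)

# Standard error: how much the sample mean is expected to vary between samples.
std_error = statistics.stdev(sample) / len(sample) ** 0.5

# z value for a 95% interval under the normal approximation (about 1.96).
z = NormalDist().inv_cdf(0.975)
low, high = mean - z * std_error, mean + z * std_error

print(f"mean={mean:.2f}, 95% interval=({low:.2f}, {high:.2f})")
```

The interval gives a plausible range for the population parameter rather than a single certain value.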

Answer 19

the assumption that there is no difference between the compared populations | eg, people who take this medicine are no better off than people who don't | the null hypothesis is generally assumed true until the evidence is strong enough to reject it

Answer 20

telling a man he's pregnant when he isn't | rejecting the null hypothesis when it's actually true

Answer 21

telling a man he's not a man when he really is | accepting (failing to reject) the null hypothesis when it is actually false | normally type 2 is safer

Answer 22

yes, by selecting significance levels | but decreasing the risk of type 1 increases the risk of type 2 | choose your adventure
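
The trade-off can be seen in a small simulation. Everything here is an assumption for illustration: a one-sided z-test with known standard deviation, samples of 25, a true effect of 0.4 when the null hypothesis is false, and 2,000 simulated studies per scenario.

```python
import random
from statistics import NormalDist, mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable
N, SIGMA, TRIALS = 25, 1.0, 2000

def p_value(true_mean):
    """One-sided z-test p-value for H0: mean = 0, from one simulated sample."""
    sample = [rng.gauss(true_mean, SIGMA) for _ in range(N)]
    z = mean(sample) / (SIGMA / N ** 0.5)
    return 1 - NormalDist().cdf(z)

p_null = [p_value(0.0) for _ in range(TRIALS)]  # worlds where H0 is true
p_alt = [p_value(0.4) for _ in range(TRIALS)]   # worlds where H0 is false

def error_rates(alpha):
    """Return (type 1 rate, type 2 rate) at the given significance level."""
    type1 = sum(p < alpha for p in p_null) / TRIALS   # rejected a true H0
    type2 = sum(p >= alpha for p in p_alt) / TRIALS   # kept a false H0
    return type1, type2

for alpha in (0.05, 0.01):
    print(alpha, error_rates(alpha))
```

Tightening the significance level from 0.05 to 0.01 lowers the type 1 rate and raises the type 2 rate on the same simulated studies.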

Answer 23

primary data | secondary data

Answer 24

collected for a purpose other than this research project | eg, UN data

Answer 25

collected specifically for our hypotheses

Answer 26

pros
- available, already there
- price, might be cheap or even free
cons
- relevancy, might not fit needs
- accuracy, why was it collected, what standards were in place?

Answer 27

normally secondary data | passively collected | both structured and unstructured | can test hypotheses, but can't verify cause/effect

Answer 28

questioning - survey, interview (might not be answered honestly)
observing - watching, documenting (more honest answers, but harder to understand the why)
- on a person or on a company (eg, keyword analysis of company legal policies)

Answer 29

only through experimentation | must be very careful to not communicate correlation as causality

Answer 30

evidence of statistical association | temporal ordering | control for competing hypotheses

Answer 31

necessary, but insufficient for causality

Answer 32

must prove that A came before B | eg, fire trucks arrived after the fire started, not before

Answer 33

look for unmeasured or unobserved alternative hypotheses | randomise away errors through probability sampling and experiment design | eg, churches and liquor stores increase in parallel, but even with temporal ordering, neither causes the other | reality: population growth caused both
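
The churches and liquor stores example can be simulated to show how a third variable creates the association. The town sizes, the "one church per 1,000 people" and "one liquor store per 1,500 people" rates, and the noise level are all invented for this sketch; `pearson` is a hand-written helper, not a library call.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical towns: population drives both counts, with independent noise.
population = [rng.uniform(1_000, 100_000) for _ in range(500)]
churches = [p / 1000 * (1 + rng.gauss(0, 0.1)) for p in population]
liquor_stores = [p / 1500 * (1 + rng.gauss(0, 0.1)) for p in population]

# Raw counts look strongly associated.
raw = pearson(churches, liquor_stores)

# Controlling for the competing explanation (population) removes the link.
per_capita = pearson(
    [c / p for c, p in zip(churches, population)],
    [s / p for s, p in zip(liquor_stores, population)],
)

print(f"raw={raw:.2f} per_capita={per_capita:.2f}")
```

In this simulated data the raw correlation is strong, yet it largely disappears once both counts are expressed per person, which is what controlling for a competing hypothesis looks like in practice.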

(57 cards)