section 1.2: data basics Flashcards
where can data come from?
data can come from notes, experiments, surveys, and other sources; it is not restricted to one particular kind of source
what is statistics?
study of how best to collect, analyze, and draw conclusions from our data
what is the formal name for a row?
a case or observational unit
what is a case or observational unit?
the formal name for a row
what is a variable?
what a column represents (ex: the grade for homework 0)
each column in a data matrix is ___________
its own variable
what is a data matrix?
a convenient and common way to organize data, where each row is a case and each column is a variable
what type of observation is numerical?
quantitative
what type of observation is categorical?
qualitative
what are the types of variables we can have?
discrete, continuous, nominal, ordinal
what is a continuous variable?
type of numerical variable, like a double in R, can be any number along a scale (ex: 1.0, 1.0001, 1.000000001)
what is a discrete variable?
type of numerical variable, like an integer in R, has jumps (can’t have 0.1 of a person)
what is a numerical variable?
a variable that can take a wide range of numerical values, and it is sensible to add, subtract, or take averages with those values.
what is a categorical variable?
a variable where it doesn’t make sense to take the average or do other computations with it
what is a nominal variable?
type of categorical variable, like characters in R, unordered categorical
what is an ordinal variable?
type of categorical variable, like characters in R, ordered categorical
what are associated variables?
when two variables show some connection with one another
what are dependent variables?
another name for associated variables
when are two variables considered independent?
two variables are independent if there is no evident relationship between the two
what is a positive association?
both variables increase or decrease together
what is a negative association?
when one variable increases, the other decreases. when one variable decreases, the other variable increases.
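One rough way to check the direction of an association is to look at the sign of the sample covariance. The sketch below is only an illustration in Python: the function name `association_direction` and the study-hours numbers are made up for this example, and the sign of the covariance only captures a linear trend, so it is not a full test of association.

```python
def association_direction(xs, ys):
    """Return the direction of the linear trend between two numerical variables."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations: positive when the variables move together.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if co_movement > 0:
        return "positive"
    if co_movement < 0:
        return "negative"
    return "no linear association"


# Hypothetical data, invented for illustration only.
hours_studied = [1, 2, 3, 4, 5]
quiz_score = [52, 60, 61, 70, 78]   # tends to rise with hours studied
quiz_errors = [9, 7, 6, 4, 2]       # tends to fall with hours studied

score_trend = association_direction(hours_studied, quiz_score)
error_trend = association_direction(hours_studied, quiz_errors)
```

With these made-up numbers, `score_trend` is a positive association and `error_trend` is a negative one.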
what are the three big distinctions you need to make between the types of variables?
discrete, continuous, categorical
would a telephone or ID number be considered a numerical variable?
No, because there is no meaning to the average of the number. For example, the average of the area code of a telephone number doesn’t mean anything.
Data was collected about students in a statistics course. Three variables were recorded for each student: number of siblings, student height, and whether the student had previously taken a statistics course. Classify each of the variables as continuous numerical, discrete numerical, or categorical.
Number of siblings: discrete numerical
Student height: continuous numerical
Previously taken statistics: categorical
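The classification above can be sketched in code. This is a minimal Python illustration, not a general rule: the rows, the column names, and the helper `classify` are all hypothetical, and the helper only looks at how values are stored. A variable stored as whole numbers, such as a telephone or ID number, would be mislabeled by it, so the meaning of the variable still has to be checked by hand.

```python
# Hypothetical data matrix: each dict is one case (row), each key is a variable (column).
students = [
    {"siblings": 2, "height_cm": 170.2, "prior_stats": "yes"},
    {"siblings": 0, "height_cm": 158.75, "prior_stats": "no"},
    {"siblings": 1, "height_cm": 181.0, "prior_stats": "no"},
]


def classify(values):
    """Guess a variable type from how its values are stored (a rough heuristic)."""
    # Text labels and True/False flags are treated as categories.
    if all(isinstance(v, (str, bool)) for v in values):
        return "categorical"
    # Whole-number counts have jumps between possible values.
    if all(isinstance(v, int) for v in values):
        return "discrete numerical"
    # Any remaining all-numeric column is treated as a measurement on a scale.
    if all(isinstance(v, (int, float)) for v in values):
        return "continuous numerical"
    return "categorical"


kinds = {name: classify([row[name] for row in students]) for name in students[0]}
```

Here `kinds` maps each column name to its guessed type, matching the three answers above.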
what are the two primary types of data collection?
observational studies and experiments
what is an observational study?
researchers collect data by observing records or conducting surveys, not experimenting
what can observational studies provide?
observational studies can provide evidence of a naturally occurring association between variables
what can observational studies not provide?
they cannot by themselves show a causal connection.
When do researchers conduct an experiment?
when they want to investigate the possibility of a causal relationship
what is a randomized experiment?
when participants are assigned to the control and treatment groups randomly
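Random assignment can be sketched as shuffling the participants and splitting the shuffled list. This is a minimal Python illustration under assumptions: the function name `randomize`, the participant labels, and the even two-way split are choices made for this example, not a prescribed procedure.

```python
import random


def randomize(participants, seed=None):
    """Randomly split participants into treatment and control groups."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(participants)  # copy so the original list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical participant labels, invented for illustration only.
participants = [f"P{i}" for i in range(1, 9)]
groups = randomize(participants, seed=1)
```

Every participant lands in exactly one group, and chance alone decides which one.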
what is a placebo?
a fake treatment, such as a sugar pill, that looks exactly like the real drug