What is Statistics? (Exam 1) Flashcards
group
usually referred to as a population/population of interest
question
usually asks to describe the distribution of a variable across the population
variable
a characteristic recorded about individuals in the population
distribution
a model that explains all possible values of the variable and the likelihood of observing each value
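As a rough illustration of a distribution, the sketch below tabulates each observed value of a variable with its relative frequency. The function name `empirical_distribution` and the hometown data are made up for this example; this describes observed data only and is not a full probability model.

```python
from collections import Counter


def empirical_distribution(values):
    """Map each observed value to the share of observations that take it."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


# Hypothetical data: the variable "hometown" recorded for six individuals.
hometowns = ["Austin", "Dallas", "Austin", "Houston", "Austin", "Dallas"]
dist = empirical_distribution(hometowns)

for value, share in dist.items():
    print(f"{value}: {share:.2f}")
```

The shares always add up to 1, since every observation falls under exactly one value.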
why might it not be possible to measure our variable across all individuals in the population?
resource constraints:
- cost of contacting an entire population
- time required to contact an entire population
- difficulty contacting/getting responses from all individuals in a population
sample
a subset of the population that we study instead of the whole population; it should be representative of the population
representative
sample should be a mirror image of the population (every individual in the population is equally likely to be selected, and no groups are systematically excluded)
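One common way to aim for a representative sample is a simple random sample, where every individual has the same chance of being selected. The sketch below assumes the population can be listed in full; the ID list and the helper name `simple_random_sample` are hypothetical.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n individuals without replacement; each is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed only makes the draw reproducible
    return rng.sample(population, n)


# Hypothetical population: ID numbers for 1,000 individuals.
population = list(range(1, 1001))
sample = simple_random_sample(population, 50, seed=0)

print(len(sample), "individuals selected")
```

Because `random.sample` draws without replacement, no individual can appear in the sample twice.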
unweighted sample
the distribution of each of the variables within our SAMPLE
weighted sample
the distribution of each of these variables in our POPULATION
statistics
- science of collecting, organizing, and interpreting numerical facts (“data”)
- using data and knowledge about randomness to condense, communicate, contextualize info
- provide insight into the setting from which the data came
- allows us to generalize what we see in our sample to what we would expect to see had we studied the complete population
outline of a statistical analysis
1) identify the population of interest & the question about the population
2) collect data (sampling)
3) perform preliminary data analysis to explore trends that exist within the sample
4) draw conclusions about the population based on what we observed in the sample
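The four steps above can be sketched in a few lines of code. Everything here is an assumption made for illustration: the population is simulated with made-up employee counts, the question is about the mean, and the sample size of 100 is arbitrary.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the example is reproducible

# 1) Population and question: what is the mean number of employees per business?
#    Hypothetical population of 10,000 businesses with made-up employee counts.
population = [rng.randint(1, 200) for _ in range(10_000)]

# 2) Collect data by drawing a simple random sample.
sample = rng.sample(population, 100)

# 3) Preliminary analysis: summarize what we see in the sample.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)

# 4) Conclusion: use the sample mean as an estimate of the population mean.
#    In practice the population mean is unknown; it is computed here only
#    because the population is simulated.
population_mean = statistics.mean(population)

print(f"sample mean = {sample_mean:.1f}, sample sd = {sample_sd:.1f}")
print(f"population mean = {population_mean:.1f}")
```

With a real population, step 4 would stop at the estimate, since the true value is not available for comparison.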
categorical variables
- assign individuals in the population to one of several groups or categories with a common characteristic
- any variables where the data represent GROUPS
- ex: gender, hometown, zipcode
quantitative variables
- numerical values for which arithmetic operations make sense
- any variables where the data represent AMOUNTS
discrete quantitative variables
- have a finite or countably infinite number of outcomes
- ex: number of employees at a business
- possible outcomes: whole numbers (0, 1, 2, ...)
continuous quantitative variables
- have an uncountably infinite number of outcomes (a large range/interval of possible outcomes)
- the possible outcomes cannot be listed off one by one
- ex: the daily % change in a business’ stock value
- possible outcomes: [-100, ∞)
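To see where the interval [-100, ∞) comes from, here is a minimal sketch of the usual percent-change formula, (current - previous) / previous × 100. The function name and the prices are hypothetical, and the sketch assumes a stock value cannot go below zero.

```python
def daily_percent_change(previous_close, current_close):
    """Percent change from the previous closing value to the current one."""
    if previous_close <= 0:
        raise ValueError("previous_close must be positive")
    return (current_close - previous_close) / previous_close * 100


# The value can fall to zero at worst, so the change cannot go below -100%.
print(daily_percent_change(50.0, 0.0))

# There is no upper limit on how much the value can rise.
print(daily_percent_change(50.0, 75.0))
```

Any value between these outcomes is possible, which is why the variable is treated as continuous.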