Exploratory data analysis Flashcards
anecdotal evidence
Evidence, often personal, that is collected casually rather than by a well-designed study.
population
A group we are interested in studying. “Population” often refers to a group of people, but the term is used for other subjects, too.
cross-sectional study
A study that collects data about a population at a particular point in time.
cycle
In a repeated cross-sectional study, each repetition of the study is called a cycle.
longitudinal study
A study that follows a population over time, collecting data from the same group repeatedly.
record
In a dataset, a collection of information about a single person or other subject.
respondent
A person who responds to a survey.
sample
The subset of a population used to collect data.
representative
A sample is representative if every member of the population has the same chance of being in the sample.
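As a minimal sketch of the equal-chance idea, a simple random sample can be drawn with Python's standard `random` module. The population of 1000 numbered members, the sample size of 50, and the seed are illustrative assumptions, not values from any particular study.

```python
import random

# Hypothetical population: 1000 members identified by number.
population = list(range(1000))

# A seeded generator makes the draw repeatable; the seed value is arbitrary.
rng = random.Random(42)

# sample() draws without replacement, so each member has the same
# chance of being chosen and no member appears twice.
sample = rng.sample(population, 50)
```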
oversampling
The technique of increasing the representation of a sub-population in order to avoid errors due to small sample sizes.
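One way to picture oversampling is to draw the same number of members from each sub-population, regardless of its share of the whole. The sketch below assumes a made-up population of 950 "majority" and 50 "minority" members; the helper `oversample` and its parameters are hypothetical names introduced for illustration.

```python
import random

def oversample(population, group_of, per_group, seed=0):
    """Draw up to per_group members at random from each sub-population."""
    rng = random.Random(seed)
    # Bucket members by the group that group_of assigns them to.
    groups = {}
    for member in population:
        groups.setdefault(group_of(member), []).append(member)
    sample = []
    for members in groups.values():
        # Never ask for more members than the group contains.
        k = min(per_group, len(members))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: the minority group is 5% of the total.
population = [("majority", i) for i in range(950)]
population += [("minority", i) for i in range(50)]

sample = oversample(population, group_of=lambda m: m[0], per_group=40)
minority_count = sum(1 for m in sample if m[0] == "minority")
```

Here the minority group makes up half of the sample even though it is 5% of the population, so estimates for that group rest on 40 members rather than the handful a proportional sample of the same size would contain.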
raw data
Values collected and recorded with little or no checking, calculation or interpretation.
recode
A value that is generated by calculation and other logic applied to raw data.
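A recode might look like the following sketch, which combines two raw fields into one computed value. The field names `weight_lb` and `weight_oz` and the example record are assumptions made for illustration.

```python
def recode_total_weight(record):
    """Return total weight in pounds, computed from raw pound and ounce fields."""
    pounds = record["weight_lb"]
    ounces = record["weight_oz"]
    # If either raw field is missing, the recode is missing too.
    if pounds is None or ounces is None:
        return None
    return pounds + ounces / 16.0

# Hypothetical raw record: 7 pounds, 8 ounces.
raw_record = {"weight_lb": 7, "weight_oz": 8}
total_lb = recode_total_weight(raw_record)
```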
data cleaning
Processes that include validating data, identifying errors, translating between data types and representations, etc.
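The sketch below illustrates a few of these steps on a single column: converting text to integers, treating unparseable entries as missing, and replacing special codes with `None`. The codes 97, 98, and 99 and the sample values are assumptions for illustration; a real dataset's codebook would define its own.

```python
# Hypothetical codes a survey might use for "refused", "don't know", etc.
MISSING_CODES = {97, 98, 99}

def clean_value(raw):
    """Convert a raw text field to an int, or None if missing or invalid."""
    text = str(raw).strip()
    if not text:
        return None
    try:
        value = int(text)
    except ValueError:
        # Entries that cannot be parsed are treated as missing here.
        return None
    if value in MISSING_CODES:
        return None
    return value

raw_values = ["7", " 12", "99", "", "n/a"]
cleaned = [clean_value(v) for v in raw_values]
```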