Chapter 1, Intro to Data Flashcards
What is a summary statistic?
a single number summarizing a large amount of data
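As a quick illustration, the mean and median are two common summary statistics. The sketch below uses a made-up list of hourly pay rates (the values and the variable names are assumptions for illustration, not data from the chapter) and reduces it to single numbers.

```python
from statistics import mean, median

# Hypothetical hourly pay rates for four cases (illustrative values only).
hourly_pay = [21.50, 18.25, 30.00, 24.75]

# Each summary statistic condenses the whole list into one number.
average_pay = mean(hourly_pay)
middle_pay = median(hourly_pay)

print(average_pay, middle_pay)
```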
What is a proper data set called, and what makes it “proper”?
data matrix, each row corresponds to a unique case and each column corresponds to a variable.
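One way to picture a data matrix is as a list of records, one per case, all sharing the same variables. This is a minimal sketch with invented cases and variable names (all of them assumptions for illustration).

```python
# Hypothetical data matrix: each dict is one row (case),
# and each key is one column (variable).
data_matrix = [
    {"name": "Ann", "age": 34, "hourly_pay": 21.50, "education": "bachelor"},
    {"name": "Bo", "age": 45, "hourly_pay": 18.25, "education": "high school"},
    {"name": "Cy", "age": 29, "hourly_pay": 30.00, "education": "master"},
]

# The variables are the column names shared by every case.
variables = list(data_matrix[0].keys())
n_cases = len(data_matrix)
n_variables = len(variables)

# A "proper" layout means every row records the same set of variables.
is_rectangular = all(list(row.keys()) == variables for row in data_matrix)

print(n_cases, n_variables, is_rectangular)
```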
What is the formal name for a row?
case or observational unit
What do columns represent, and what is important to know about them?
characteristics, called variables (it is important to understand what each variable means, as well as its units of measurement)
What are 2 types of variables?
Numerical and Categorical
What are the 2 kinds of numerical variable?
Discrete and continuous
What are the 2 kinds of categorical variable?
Ordinal and nominal
What is a discrete numerical variable?
a numerical variable that can only take whole-number values (counts), e.g. population, since you can’t have half a person
What is a continuous numerical variable?
a numerical variable that can take any value within a range, including values between whole numbers, e.g. an hourly pay rate
What is an ordinal categorical variable?
a categorical variable that involves an ordering, e.g. educational level attained
What is a nominal categorical variable?
a categorical variable that doesn’t involve an ordering, e.g. color
What are the possible values of a categorical variable called?
levels
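To make "levels" concrete, the sketch below lists the distinct values of a made-up education variable. The data and the ordering of the levels are assumptions for illustration. For an ordinal variable the order of the levels carries meaning, so it is written out explicitly here rather than inferred.

```python
# Hypothetical observations of a categorical variable.
education = ["bachelor", "high school", "master", "bachelor", "high school"]

# The levels are the distinct values the variable takes.
# sorted() is used only to get a stable, alphabetical listing.
levels = sorted(set(education))

# For an ordinal variable, the meaningful order is stated explicitly
# (assumed order: lowest to highest attainment).
education_order = ["high school", "bachelor", "master"]
ranks = [education_order.index(value) for value in education]

print(levels)
print(ranks)
```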
What makes 2 variables “associated” or “dependent”?
When they show some connection with one another.
What is a scatterplot graph useful for?
Showing whether or not 2 variables are associated, as well as trends in the relationship
What is a positive correlation between 2 variables?
a relationship where, as one variable increases, the other tends to increase as well (and as one decreases, the other tends to decrease)
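A positive relationship can also be checked numerically. The sketch below computes Pearson's correlation coefficient, which is one common measure and is not defined in these flashcards. The helper name pearson_r and the study-hours data are assumptions for illustration. A value above 0 suggests a positive relationship, and a value below 0 suggests a negative one.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: scores tend to rise with hours studied.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 60, 61, 70, 78]

r = pearson_r(hours_studied, exam_score)
print(r > 0)
```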