MODULE 3 - DESCRIPTIVE STATISTICS Flashcards
What is a variable?
any measurable characteristic of an observation unit
3 pieces of information a variable contains
- what the variable represents
- the measurement unit
- a description of the observation unit
what are numerical variables?
those where the data is numeric
what are categorical variables?
those where the data is a qualitative description
what are continuous numerical variables?
a variable that can take on any value within a range, including fractional numbers
e.g., your weight is a continuous numerical variable because it can include fractions of a kilogram (e.g., 104.23 kg)
what are discrete numerical variables?
a variable that can only take on whole numbers (integers)
e.g., if you are counting the number of patients that arrive at the emergency room each day, you can only have integer values (e.g., 28 people)
what are ordinal categorical variables?
a variable that can take on qualitative values but where values are from a ranked scale
e.g., using emojis to describe how you are feeling today
what are nominal categorical variables?
a variable that can take on qualitative values but where values do not have any particular order
e.g., types of food
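The difference between ordinal and nominal variables can be sketched in code. This is a minimal illustration only: the grade ranking and the food list are hypothetical examples, not data from the module. Ordinal values can be sorted by rank, while nominal values can only be compared for equality and counted.

```python
from collections import Counter

# Ordinal: the categories carry a rank, so observations can be sorted by it.
# GRADE_RANK is a hypothetical ranked scale chosen for illustration.
GRADE_RANK = {"F": 0, "D": 1, "C": 2, "B": 3, "A": 4}
grades = ["B", "A", "D", "C", "A"]
sorted_grades = sorted(grades, key=GRADE_RANK.get)
middle_grade = sorted_grades[len(sorted_grades) // 2]

# Nominal: there is no rank, so only equality checks and counting make sense.
foods = ["pizza", "salad", "pizza", "soup"]
most_common_food = Counter(foods).most_common(1)[0][0]
```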
what is the data type for describing age?
continuous numerical
what is the data type for the description: child, teenager, adult?
ordinal categorical
what is the data type for the number of students in a class?
discrete numerical
what is the data type for the letter grade on your exam?
ordinal categorical
what is the data type for the percentage grade on your exam?
continuous numerical
what is a count?
the number of sampling units in each category
what are proportions?
the share of observations in your sample that fall into each category
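Counts and proportions can be computed together. The sketch below assumes a small, made-up list of blood types as the categorical variable; `blood_types` is a hypothetical name.

```python
from collections import Counter

# Hypothetical sample: one categorical observation per sampling unit
blood_types = ["A", "O", "B", "O", "A", "O", "AB", "O"]

# Count: the number of sampling units in each category
counts = Counter(blood_types)

# Proportion: each category's share of the total sample
n = len(blood_types)
proportions = {category: count / n for category, count in counts.items()}
```

The proportions always sum to 1 because every observation falls into exactly one category.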
what is a range?
the difference between the maximum and minimum values for numerical variables, or the difference between the largest and smallest category counts for categorical variables
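Both versions of the range described above can be sketched as follows. The weights and mood labels are invented for illustration.

```python
from collections import Counter

# Numerical variable: range = maximum value - minimum value
weights_kg = [61.5, 104.23, 72.0, 88.4]  # hypothetical sample
numeric_range = max(weights_kg) - min(weights_kg)

# Categorical variable: range = largest category count - smallest category count
moods = ["happy", "sad", "happy", "neutral", "happy", "sad"]  # hypothetical sample
mood_counts = Counter(moods)
count_range = max(mood_counts.values()) - min(mood_counts.values())
```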
what is the mean?
the average value: the sum of all values divided by the number of observations
what is a variance?
a measure of the amount of variation in your sample
how do you calculate variance?
- Calculate the mean for a sample
- Calculate the difference between each data point and the mean, then square that value
- Sum the squares of the differences and divide by the number of observations/data points
what is standard deviation?
the square root of variance
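The variance steps and the square-root relationship can be sketched as below. The function names and the sample data are illustrative assumptions. The sketch divides by the number of observations, exactly as the steps are written; note that Python's `statistics.variance` divides by n - 1 instead, so `statistics.pvariance` is the standard-library function that matches this calculation.

```python
import math
import statistics

def variance(values):
    """Variance following the flashcard steps (divides by n)."""
    n = len(values)
    mean = sum(values) / n                              # step 1: the mean
    squared_diffs = [(x - mean) ** 2 for x in values]   # step 2: squared differences
    return sum(squared_diffs) / n                       # step 3: sum and divide by n

def standard_deviation(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(values))

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data points
var = variance(sample)
sd = standard_deviation(sample)

# Cross-check against the standard library's divide-by-n variance
matches_stdlib = math.isclose(var, statistics.pvariance(sample))
```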
what is a quartile?
one quarter of your sample when the values are ranked from lowest to highest
how to calculate quartiles?
- sort the data from lowest to highest value
- find the 2nd quartile by splitting the data in half:
  - if the sample has an odd number of observations, the middle value of the dataset is the 2nd quartile
  - if the sample has an even number of observations, the average of the two values closest to the middle is the 2nd quartile
- find the 1st quartile by taking the lower-valued half of the observations and applying the 2nd-quartile rules to find its middle value. The lower-valued subset depends on the sample size:
  - if the sample has an odd number of observations, the lower-valued subset is all values less than or equal to the 2nd quartile (the subset includes the 2nd quartile)
  - if the sample has an even number of observations, the lower-valued subset is all values less than the 2nd quartile (the subset does not include the 2nd quartile)
- find the 3rd quartile by repeating the 1st-quartile procedure for the upper-valued half of the observations
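The quartile procedure above can be sketched as a short function. This is one reading of the steps, with two assumptions: the halves are split by position in the sorted data (which matters when values are tied with the 2nd quartile), and the sample is non-empty. The names `middle_value` and `quartiles` are hypothetical. Other tools may use different quartile conventions and return different results.

```python
def middle_value(sorted_values):
    """Middle value of already-sorted data (the 2nd-quartile rule)."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]                            # odd: the middle value
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2  # even: average of the two middle values

def quartiles(values):
    """Return (Q1, Q2, Q3) following the flashcard procedure."""
    data = sorted(values)        # sort from lowest to highest
    n = len(data)
    mid = n // 2
    q2 = middle_value(data)
    if n % 2 == 1:
        lower = data[:mid + 1]   # odd: both halves include the 2nd quartile
        upper = data[mid:]
    else:
        lower = data[:mid]       # even: neither half includes the 2nd quartile
        upper = data[mid:]
    return middle_value(lower), q2, middle_value(upper)

odd_result = quartiles([1, 2, 3, 4, 5])
even_result = quartiles([1, 2, 3, 4, 5, 6])
```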
what is the central quartile?
the median
what is dispersion?
describes how much variation there is in a sample