MODULE 3 - DESCRIPTIVE STATISTICS Flashcards

1
Q

What is a variable?

A

any measurable characteristic of an observation unit

2
Q

what 3 pieces of information does a variable contain?

A
  1. what the variable represents
  2. the measurement unit
  3. a description of the observation unit
3
Q

what are numerical variables?

A

those where the data is numeric

4
Q

what are categorical variables?

A

those where the data is a qualitative description

5
Q

what are continuous numerical variables?

A

a variable that can take on continuous numbers

continuous numbers are those that can take on any value, including fractional numbers

e.g., your weight is a continuous numerical value because it can include fractions of a kilogram (e.g., 104.23 kg)

6
Q

what are discrete numerical variables?

A

a variable that can only take on whole numbers (integers)

e.g., if you are counting the number of patients that arrive at the emergency room each day, you can only have integer values (e.g., 28 people)

7
Q

what are ordinal categorical variables?

A

a variable that can take on qualitative values but where values are from a ranked scale

e.g., using a ranked set of emojis (from sad to happy) to describe how you are feeling today

8
Q

what are nominal categorical variables?

A

a variable that can take on qualitative values but where values do not have any particular order

e.g., type of food

9
Q

what is the data type for describing age?

A

continuous numerical

10
Q

what is the data type for the description: child, teenager, adult?

A

ordinal categorical

11
Q

what is the data type for the number of students in a class?

A

discrete numerical

12
Q

what is the data type for the letter grade on your exam?

A

ordinal categorical

13
Q

what is the data type for the percentage grade on your exam?

A

continuous numerical

14
Q

what is a count?

A

the number of sampling units in each category

15
Q

what are proportions?

A

the share of observations in your sample that fall into each category
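
A minimal sketch of counts and proportions for a categorical variable. The `blood_types` sample is hypothetical and only serves to illustrate the two definitions.

```python
from collections import Counter

# Hypothetical sample: one categorical observation per sampling unit.
blood_types = ["A", "O", "O", "B", "A", "O", "AB", "O"]

# Count: the number of sampling units in each category.
counts = Counter(blood_types)

# Proportion: each count divided by the total number of sampling units.
total = sum(counts.values())
proportions = {category: n / total for category, n in counts.items()}

print(counts["O"], proportions["O"])  # 4 0.5
```

The proportions across all categories should always sum to 1.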

16
Q

what is a range?

A

the difference between the maximum and minimum values for numerical variables, or the difference between the maximum and minimum number of counts for categorical variables
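
Both forms of the range can be sketched as follows. The `weights_kg` and `grades` samples are hypothetical.

```python
from collections import Counter

# Numerical variable (hypothetical weights in kg): maximum minus minimum.
weights_kg = [61.5, 104.23, 72.0, 88.4]
numeric_range = max(weights_kg) - min(weights_kg)

# Categorical variable (hypothetical letter grades):
# largest category count minus smallest category count.
grades = ["A", "B", "B", "C", "B", "A"]
counts = Counter(grades)
count_range = max(counts.values()) - min(counts.values())

print(round(numeric_range, 2), count_range)  # 42.73 2
```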

17
Q

what is the mean?

A

the average value: the sum of all values divided by the number of observations

18
Q

what is a variance?

A

a measure of the amount of variation in your sample

19
Q

how do you calculate variance?

A
  1. Calculate the mean for a sample
  2. Calculate the difference between each data point and the mean, then square that value
  3. Sum the squares of the differences and divide by the number of observations/data points
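
The three steps above can be sketched in Python. Dividing by the number of observations gives what is usually called the population variance; the sample variance divides by one less than the number of observations. The function name `population_variance` and the data are illustrative assumptions, not part of the deck.

```python
def population_variance(values):
    """Variance following the three steps above (divides by n)."""
    n = len(values)
    mean = sum(values) / n                              # step 1: mean
    squared_diffs = [(x - mean) ** 2 for x in values]   # step 2: squared differences
    return sum(squared_diffs) / n                       # step 3: sum, divide by n

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample
print(population_variance(data))  # 4.0
```

Python's standard library offers the same calculation as `statistics.pvariance`, and the n - 1 version as `statistics.variance`.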
20
Q

what is standard deviation?

A

the square root of variance
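
A short sketch of the relationship, assuming the divide-by-n (population) variance used elsewhere in this deck. The data are hypothetical.

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample

variance = statistics.pvariance(data)  # divides by n
std_dev = math.sqrt(variance)          # standard deviation = sqrt(variance)

print(variance, std_dev)  # 4.0 2.0
```

Because of the square root, the standard deviation is in the same units as the data, whereas the variance is in squared units.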

21
Q

what is a quartile?

A

one quarter of your sample when the values are ranked from lowest to highest

22
Q

how to calculate quartiles?

A
  1. sort data from lowest to highest value
  2. find the 2nd quartile by splitting the data in half according to whether:
  • the sample has an odd number of observations, in which case the middle value of the dataset is the second quartile
  • the sample has an even number of observations, in which case the average of the two values closest to the middle is the second quartile
  3. find the 1st quartile by creating a subset of the data that is the lower-valued half of the observations, then use the rules in step 2 to find the middle value. The lower-valued subset is created according to whether:
  • the sample has an odd number of observations, in which case the lower-valued subset is all values less than or equal to the second quartile. The subset includes the second quartile
  • the sample has an even number of observations, in which case the lower-valued subset is all values less than the second quartile. The subset does not include the second quartile
  4. find the 3rd quartile by repeating step 3 but for the upper-valued half of the observations
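
The procedure above can be sketched as follows. This is one of several quartile conventions, and other software may interpolate differently and return slightly different values. The helper names `median` and `quartiles` are illustrative, and the halves are split by position in the sorted list, which is assumed to match the "less than or equal to the second quartile" rule.

```python
def median(sorted_values):
    """Middle value (odd n) or average of the two middle values (even n)."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rules above."""
    ordered = sorted(values)      # sort from lowest to highest
    n = len(ordered)
    half = n // 2
    q2 = median(ordered)          # second quartile = median
    if n % 2 == 1:
        lower = ordered[:half + 1]  # odd n: both halves include the median
        upper = ordered[half:]
    else:
        lower = ordered[:half]      # even n: halves exclude the median
        upper = ordered[half:]
    return median(lower), q2, median(upper)

print(quartiles([1, 2, 3, 4, 5, 6, 7]))     # (2.5, 4, 5.5)
print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))  # (2.5, 4.5, 6.5)
```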
23
Q

what is the central quartile?

A

the median

24
Q

what is dispersion?

A

describes how much variation there is in a sample

25
what is the interquartile range (IQR)?
the range between the 1st and 3rd quartiles
26
how to calculate the IQR?
subtract the 1st quartile from the 3rd quartile
27
pros and cons to quartiles
pros:
  • The median and interquartile range are relatively robust to extreme values
cons:
  • The median and interquartile range become quite variable for samples with a small number of observations
28
pros and cons to using mean
pros:
  • The mean and standard deviation are more robust when there is a small number of observations in the sample
cons:
  • The mean and standard deviation are sensitive to extreme values
29
Calculate the mean & median of the following data: 7.5 9.9 8.6 10.3 8.5 9.4 15.1
mean: 9.9 median: 9.4
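A quick check of this card using Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

data = [7.5, 9.9, 8.6, 10.3, 8.5, 9.4, 15.1]

mean = statistics.mean(data)      # sum of values / number of values
median = statistics.median(data)  # middle value of the sorted data

print(round(mean, 1), median)  # 9.9 9.4
```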
30
Would the mean or median be a better descriptor of the ‘middle’ value for this set of data? 7.5 9.9 8.6 10.3 8.5 9.4 15.1
median
31
Calculate the population variance & interquartile range (IQR) of the following data: 7.5 8.6 8.9 8.5 9.4 10.7 15.1
variance: 5.5 IQR: 1.5
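A worked check of this card, assuming the quartile rules from the earlier card (with 7 observations, both halves include the median). The slicing indices are specific to this 7-value dataset.

```python
import statistics

data = [7.5, 8.6, 8.9, 8.5, 9.4, 10.7, 15.1]

# Population variance: squared deviations from the mean, divided by n.
variance = statistics.pvariance(data)

# Quartiles by the median-of-halves rules. n = 7 is odd, so the
# median (index 3 of the sorted data) belongs to both halves.
ordered = sorted(data)
lower, upper = ordered[:4], ordered[3:]
q1 = statistics.median(lower)
q3 = statistics.median(upper)
iqr = q3 - q1

print(round(variance, 1), round(iqr, 1))  # 5.5 1.5
```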
32
Calculate the interquartile range (IQR) for the following set of numbers and indicate what range the answer lies within. 10.1, 18.6, 19.8, 15.7, 21.9, 12.9, 11.8, 26.0, 13.0, 12.9
5 < ANSWER < 7
33
Calculate the interquartile range (IQR) for the following set of data and indicate what range the answer lies within. 46.7, 18.7, 39.4, 7.2, 19.8, 42.1, 2.6, 17.1, 30.7, 21.9
19 < ANSWER < 23
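The two IQR practice cards above can be checked with a small helper. The function name `iqr_median_of_halves` is hypothetical, and it assumes the median-of-halves quartile rules from the earlier card rather than any library default.

```python
import statistics

def iqr_median_of_halves(values):
    """IQR = Q3 - Q1, with quartiles taken as medians of the two halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half + n % 2]  # includes the median only when n is odd
    upper = ordered[half:]          # includes the median only when n is odd
    return statistics.median(upper) - statistics.median(lower)

card_32 = [10.1, 18.6, 19.8, 15.7, 21.9, 12.9, 11.8, 26.0, 13.0, 12.9]
card_33 = [46.7, 18.7, 39.4, 7.2, 19.8, 42.1, 2.6, 17.1, 30.7, 21.9]

print(round(iqr_median_of_halves(card_32), 1))  # 6.9
print(round(iqr_median_of_halves(card_33), 1))  # 22.3
```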
34
what is effect size?
the change in mean value of the response variable among groups
35
2 ways to calculate effect size
  1. difference
  2. ratio
36
difference calculations
the differences in mean values among groups
37
ratio calculations
the ratio of mean values among groups
38
The rate of home ownership in Canada decreased from 46% in 2004 to 44% in 2011. What is the effect size as a difference between the years?
-2 percentage points (44% - 46%)
39
true or false: relative effect sizes have no units
true (a ratio of two values in the same units has no units)
40
In the United Kingdom, 56% of older adults (55+ years) get their news from the television whereas only 12% of youth (18-24 years) do. What is the relative effect size of youth compared to older adults?
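4.7 (56% / 12%, i.e. older adults relative to youth; the reverse ratio, youth relative to older adults, is 12% / 56% ≈ 0.21)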
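
A minimal sketch of the two effect-size calculations, applied to the home ownership and news cards above. The function names are hypothetical. Which group is treated as the reference is an assumption that sets the sign of a difference and the direction of a ratio.

```python
def difference_effect(group_value, reference_value):
    """Effect size as a difference; keeps the units of the data."""
    return group_value - reference_value

def ratio_effect(group_value, reference_value):
    """Effect size as a ratio (relative effect size); has no units."""
    return group_value / reference_value

# Home ownership: 2011 (44%) compared with 2004 (46%).
ownership_change = difference_effect(44, 46)

# Television news: 56% of older adults versus 12% of youth.
older_vs_youth = ratio_effect(56, 12)
youth_vs_older = ratio_effect(12, 56)

print(ownership_change)          # -2
print(round(older_vs_youth, 1))  # 4.7
print(round(youth_vs_older, 2))  # 0.21
```

The 4.7 on the card corresponds to 56 / 12, older adults relative to youth. Swapping the reference group gives the reciprocal.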