MODULE 3 - DESCRIPTIVE STATISTICS Flashcards

1
Q

What is a variable?

A

any measurable characteristic of an observation unit

2
Q

3 pieces of information a variable contains

A
  1. what the variable represents
  2. the measurement unit
  3. a description of the observation unit
3
Q

what are numerical variables?

A

those where the data is numeric

4
Q

what are categorical variables?

A

those where the data is a qualitative description

5
Q

what are continuous numerical variables?

A

a variable that can take on continuous numbers

continuous numbers are those that can take on any value including fractional numbers

e.g., your weight is a continuous numerical variable because it can include fractions of a kilogram (104.23 kg)

6
Q

what are discrete numerical variables?

A

a variable that can only take on whole numbers (integers)

e.g., if you are counting the number of patients that arrive at the emergency room each day, you can only have integer values (28 people)

7
Q

what are ordinal categorical variables?

A

a variable that takes on qualitative values that come from a ranked scale

e.g., picking an emoji from a sad-to-happy scale to describe how you are feeling today

8
Q

what are nominal categorical variables?

A

a variable that takes on qualitative values that do not have any particular order

e.g., type of food

9
Q

what is the data type for describing age?

A

continuous numerical

10
Q

what is the data type for the description: child, teenager, adult?

A

ordinal categorical

11
Q

what is the data type for the number of students in a class?

A

discrete numerical

12
Q

what is the data type for the letter grade on your exam?

A

ordinal categorical

13
Q

what is the data type for the percentage grade on your exam?

A

continuous numerical

14
Q

what is a count?

A

the number of sampling units in each category

15
Q

what are proportions?

A

the share of observations in your sample that fall into each category
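
A minimal sketch of counts and proportions for a categorical variable. The list of letter grades is hypothetical and only there to illustrate the arithmetic.

```python
from collections import Counter

# Hypothetical sample: letter grades for ten students (illustrative only).
grades = ["A", "B", "B", "C", "A", "B", "D", "C", "B", "A"]

counts = Counter(grades)                 # count: observations in each category
total = sum(counts.values())             # total number of observations
proportions = {g: c / total for g, c in counts.items()}  # share of the total

for grade in sorted(counts):
    print(grade, counts[grade], f"{proportions[grade]:.2f}")
```

The proportions always sum to 1, which is a quick check that no observation was dropped.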

16
Q

what is a range?

A

the difference between the maximum and minimum values for numerical variables, or the difference between the maximum and minimum number of counts for categorical variables

17
Q

what is the mean?

A

the average value: the sum of all values divided by the number of observations

18
Q

what is a variance?

A

a measure of the amount of variation in your sample

19
Q

how do you calculate variance?

A
  1. Calculate the mean for a sample
  2. Calculate the difference between each data point and the mean, then square that value
  3. Sum the squares of the differences and divide by the number of observations/data points
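
The three steps above can be sketched as a short function. The name `population_variance` and the sample data are assumptions made for illustration.

```python
def population_variance(values):
    """Variance following the three steps above (divides by n)."""
    n = len(values)
    mean = sum(values) / n                              # step 1: mean
    squared_diffs = [(x - mean) ** 2 for x in values]   # step 2: squared differences
    return sum(squared_diffs) / n                       # step 3: sum, divide by n


# Hypothetical data, chosen so the arithmetic is easy to check by hand.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(data))  # 4.0
```

Dividing by n gives what is usually called the population variance, and Python's `statistics.pvariance` uses the same divisor. Many textbooks define the sample variance with n - 1 instead, which is what `statistics.variance` computes.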
20
Q

what is standard deviation?

A

the square root of variance
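
Written as a formula, assuming the variance is computed by dividing by the number of observations as in the previous card. The symbols x_i (an observation), x̄ (the mean), and n (the number of observations) are notation chosen here, not taken from the deck.

```latex
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2,
\qquad
\sigma = \sqrt{\sigma^2}
```

Because of the square root, the standard deviation has the same units as the data, whereas the variance is in squared units.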

21
Q

what is a quartile?

A

one quarter of your sample when the values are ranked from lowest to highest

22
Q

how to calculate quartiles?

A
  1. sort the data from lowest to highest value
  2. find the 2nd quartile by splitting the data in half according to whether:
  • the sample has an odd number of observations, in which case the middle value of the dataset is the second quartile
  • the sample has an even number of observations, in which case the average of the two values closest to the middle is the second quartile
  3. find the 1st quartile by creating a subset of the data that is the lower-valued half of the observations, then use the rules in step 2 to find the middle value. The lower-valued subset is created according to whether:
  • the sample has an odd number of observations, in which case the lower-valued subset is all values less than or equal to the second quartile. The subset includes the second quartile
  • the sample has an even number of observations, in which case the lower-valued subset is all values less than the second quartile. The subset does not include the second quartile
  4. find the 3rd quartile by repeating step 3 for the upper-valued half of the observations
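
A minimal sketch of the quartile procedure above. The function names are hypothetical, and the sketch splits the sorted data by position rather than by value, which gives the same subsets as the wording above as long as no other values are tied with the median.

```python
def middle_value(sorted_values):
    """Step 2 rule: middle value (odd n) or average of the two middle values (even n)."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) following steps 1 to 4."""
    data = sorted(values)              # step 1: sort lowest to highest
    n = len(data)
    half = n // 2
    q2 = middle_value(data)            # step 2: second quartile (median)
    if n % 2 == 1:
        lower = data[: half + 1]       # odd n: both halves include the median
        upper = data[half:]
    else:
        lower = data[:half]            # even n: halves exclude the median
        upper = data[half:]
    return middle_value(lower), q2, middle_value(upper)   # steps 3 and 4


# Hypothetical data for illustration.
print(quartiles([1, 2, 3, 4, 5, 6, 7]))      # (2.5, 4, 5.5)
print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))   # (2.5, 4.5, 6.5)
```

Other conventions for quartiles exist (statistical software often interpolates between values), so results from a library function may differ slightly from this method.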
23
Q

what is the central quartile?

A

the median

24
Q

what is dispersion?

A

describes how much variation there is in a sample

25
Q

what is the interquartile range (IQR)?

A

the range between the 1st and 3rd quartiles

26
Q

how to calculate the IQR?

A

subtract the 1st quartile from the 3rd quartile

27
Q

pros and cons to quartiles

A

pros:
- The median and interquartile range are relatively robust to extreme values

cons:
- The median and interquartile range become quite variable for samples with a small number of observations

28
Q

pros and cons to using mean

A

pros:
- The mean and standard deviation are more robust when there is a small number of observations in the sample

cons:
- The mean and standard deviation are sensitive to extreme values

29
Q

Calculate the mean & median of the following data:

7.5 9.9 8.6 10.3 8.5 9.4 15.1

A

mean: 9.9
median: 9.4
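
Worked arithmetic for this card. The median is the 4th of the 7 sorted values.

```latex
\bar{x} = \frac{7.5 + 9.9 + 8.6 + 10.3 + 8.5 + 9.4 + 15.1}{7} = \frac{69.3}{7} = 9.9

\text{sorted: } 7.5,\ 8.5,\ 8.6,\ \mathbf{9.4},\ 9.9,\ 10.3,\ 15.1
\quad\Rightarrow\quad \text{median} = 9.4
```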

30
Q

Would the mean or median be a better descriptor of the ‘middle’ value for this set of data?
7.5 9.9 8.6 10.3 8.5 9.4 15.1

A

median
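
One way to see why, sketched with Python's standard `statistics` module: drop the extreme value 15.1 and compare how far each measure moves. Removing the value is only an illustrative check, not part of the card.

```python
from statistics import mean, median

data = [7.5, 9.9, 8.6, 10.3, 8.5, 9.4, 15.1]      # values from the card
without_extreme = [x for x in data if x != 15.1]   # same data minus the largest value

# How much each measure changes when the extreme value is removed.
mean_shift = mean(data) - mean(without_extreme)
median_shift = median(data) - median(without_extreme)

print(f"mean shift:   {mean_shift:.2f}")
print(f"median shift: {median_shift:.2f}")
```

The mean moves by about 0.87 while the median moves by 0.4, so the single large value pulls the mean more than the median.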

31
Q

Calculate the population variance & interquartile range (IQR) of the following data:
7.5 8.6 8.9 8.5 9.4 10.7 15.1

A

variance: 5.5
IQR: 1.5
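
Worked arithmetic for this card, dividing by n = 7 for the population variance. Because n is odd, the median (8.9) is included in both halves when finding the 1st and 3rd quartiles.

```latex
\bar{x} = \frac{68.7}{7} \approx 9.81,
\qquad
\sigma^2 = \frac{1}{7}\sum_{i=1}^{7}\left(x_i - \bar{x}\right)^2 \approx \frac{38.29}{7} \approx 5.5

\text{sorted: } 7.5,\ 8.5,\ 8.6,\ 8.9,\ 9.4,\ 10.7,\ 15.1,
\qquad Q_2 = 8.9

Q_1 = \frac{8.5 + 8.6}{2} = 8.55,
\qquad
Q_3 = \frac{9.4 + 10.7}{2} = 10.05,
\qquad
\text{IQR} = 10.05 - 8.55 = 1.5
```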

32
Q

Calculate the interquartile range (IQR) for the following set of numbers and indicate what range the answer lies within.
10.1, 18.6, 19.8, 15.7, 21.9, 12.9, 11.8, 26.0, 13.0, 12.9

A

5 < ANSWER < 7
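
Worked arithmetic for this card. With n = 10 (even), the median is not included in either half, so each half has 5 values and its middle value is the 3rd one.

```latex
\text{sorted: } 10.1,\ 11.8,\ 12.9,\ 12.9,\ 13.0,\ 15.7,\ 18.6,\ 19.8,\ 21.9,\ 26.0

Q_2 = \frac{13.0 + 15.7}{2} = 14.35,
\qquad Q_1 = 12.9,
\qquad Q_3 = 19.8

\text{IQR} = 19.8 - 12.9 = 6.9
```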

33
Q

Calculate the interquartile range (IQR) for the following set of data and indicate what range the answer lies within.
46.7, 18.7, 39.4, 7.2, 19.8, 42.1, 2.6, 17.1, 30.7, 21.9

A

19 < ANSWER < 23
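
Worked arithmetic for this card. With n = 10 (even), each half has 5 values and its middle value is the 3rd one.

```latex
\text{sorted: } 2.6,\ 7.2,\ 17.1,\ 18.7,\ 19.8,\ 21.9,\ 30.7,\ 39.4,\ 42.1,\ 46.7

Q_2 = \frac{19.8 + 21.9}{2} = 20.85,
\qquad Q_1 = 17.1,
\qquad Q_3 = 39.4

\text{IQR} = 39.4 - 17.1 = 22.3
```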

34
Q

what is effect size?

A

the change in mean value of the response variable among groups

35
Q

2 ways to calculate effect size

A
  1. difference
  2. ratio
36
Q

difference calculations

A

the differences in mean values among groups

37
Q

ratio calculations

A

the ratio of mean values among groups
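
A minimal sketch of both calculations. The function names and the two group means are hypothetical, used only to show the arithmetic.

```python
def effect_size_difference(reference_mean, comparison_mean):
    """Absolute effect size: comparison minus reference (keeps the data's units)."""
    return comparison_mean - reference_mean


def effect_size_ratio(reference_mean, comparison_mean):
    """Relative effect size: comparison divided by reference (no units)."""
    return comparison_mean / reference_mean


# Hypothetical group means, for illustration only.
control_mean = 20.0
treatment_mean = 25.0

print(effect_size_difference(control_mean, treatment_mean))  # 5.0
print(effect_size_ratio(control_mean, treatment_mean))       # 1.25
```

The choice of reference group matters: swapping the two groups flips the sign of the difference (-5.0) and inverts the ratio (0.8).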

38
Q

The rate of home ownership in Canada decreased from 46% in 2004 to 44% in 2011. What is the effect size as a difference between the years?

A

-2 percentage points (44% - 46%)

39
Q

true or false: relative effect sizes have no units

A

true

40
Q

In the United Kingdom, 56% of older adults (55+ years) get their news from the television whereas only 12% of youth (18-24 years) do. What is the relative effect size of youth compared to older adults?

A

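4.7 (56% / 12%): older adults are about 4.7 times as likely as youth to get their news from television. Taken the other way, youth relative to older adults is 12% / 56% ≈ 0.21.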