Lecture 2 - Intro to Stats Flashcards

1
Q

What are the 3 common scales of measurement for variables in medicine?

A
  • Nominal
  • Ordinal
  • Numerical
2
Q

Describe Nominal data

A
  • Simplest - data fits in categories (no actual order)
  • Often dichotomous or binary (yes/no or male/female)
  • Could be multiple categories like blood groups
  • We can just describe it - no way to rank it
  • Just use proportion or percentages
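
A minimal sketch of summarizing nominal data with percentages. The blood group sample is made up for illustration, and `blood_groups` is a hypothetical name, not something from the lecture.

```python
from collections import Counter

# Hypothetical sample of blood groups (nominal: categories with no order)
blood_groups = ["A", "O", "B", "O", "AB", "O", "A", "O"]

counts = Counter(blood_groups)
n = len(blood_groups)

# Percentage of observations falling in each category
percentages = {group: 100 * count / n for group, count in counts.items()}

for group, pct in sorted(percentages.items()):
    print(f"{group}: {pct:.1f}%")
```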
3
Q

What are nominal data also called?

A
  • Qualitative observations
  • Categorical observations

4
Q

Describe Ordinal data

A
  • Inherent order to the categories (ex. Cancer staging 0-4)
  • Summary statistic = median
  • Difference between 2 adjacent categories is not the same throughout the scale
5
Q

Describe Numerical data

A
  • Differences have meaning on a numerical scale
  • Also called quantitative observations

6
Q

What are the two types of numerical scales?

A
  • Continuous scale - has a value on a continuum (ex. age)
  • Discrete scale - values are integers (ex. # of fractures, # of medications)

7
Q

What summary statistics do you use for numerical data?

A

mean and SD

8
Q

What type of data:
Nominal, ordinal, or continuous?

Name

A

nominal

9
Q

What type of data:
Nominal, ordinal, or continuous?

Hair color

A

nominal

10
Q

What type of data:
Nominal, ordinal, or continuous?

Eye color

A

nominal

11
Q

What type of data:
Nominal, ordinal, or continuous?

Height

A

continuous

12
Q

What type of data:
Nominal, ordinal, or continuous?

Age

A

continuous

13
Q

What type of data:
Nominal, ordinal, or continuous?

Gender

A

nominal

14
Q

What are the 3 “Measures of Middle”?

A
  • Mean
  • Median
  • Mode
15
Q

What is the mean?

A
  • The arithmetic average (sum of the observations divided by the number of observations)
  • Used with numerical variables

16
Q

What is the median?

A

The median is the middle observation

17
Q

What is the mode?

A

The mode is the value that occurs most frequently

18
Q

Can data have more than 1 mode?

A

Yes. A distribution with 2 modes is called a bimodal distribution

ex. some diseases have 2 peaks
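
The three measures of middle can be checked with Python's standard `statistics` module. This is only a sketch on made-up numbers, with the sample chosen so that two values tie for most frequent.

```python
import statistics

# Hypothetical numerical sample with two equally frequent values (3 and 7)
values = [2, 3, 3, 5, 7, 7, 8]

mean_value = statistics.mean(values)      # sum of values / number of values
median_value = statistics.median(values)  # middle observation once sorted
modes = statistics.multimode(values)      # every value tied for most frequent

print(mean_value, median_value, modes)
```

`multimode` returns a list, so a bimodal sample gives two entries instead of one.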

19
Q

If the data is not skewed, you can use ____ and ___.

A

mean and SD

20
Q

If the data is skewed, you should use ____ and ___.

A

median and IQR

21
Q

Negatively skewed is ____ skewed (outlying values are small)

A

left

22
Q

Positively skewed is ____ skewed (outlying values are large)

A

right

23
Q

How do you know if something is right/positively skewed?

A

Mean > Median

24
Q

How do you know if something is left/negatively skewed?

A

Mean < Median
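
The mean-versus-median rule from the last two cards can be sketched as a small function. The function name `skew_direction` and both samples are hypothetical. The rule is a quick check on clearly skewed data, not a formal test of skewness.

```python
import statistics

def skew_direction(values):
    """Classify skew by comparing the mean with the median."""
    mean_value = statistics.mean(values)
    median_value = statistics.median(values)
    if mean_value > median_value:
        return "right/positive"
    if mean_value < median_value:
        return "left/negative"
    return "symmetric"

right_skewed = [1, 2, 2, 3, 3, 4, 30]      # one large outlying value pulls the mean up
left_skewed = [1, 27, 28, 28, 29, 29, 30]  # one small outlying value pulls the mean down

print(skew_direction(right_skewed))
print(skew_direction(left_skewed))
```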

25
Use mean if data is _____
symmetric
26
Use _____ for ordinal data or numerical data that is skewed
median
27
What are some measures of spread?
  • Range
  • Standard deviation/variance
  • Coefficient of variation
  • Percentiles
  • Interquartile range
28
What is the range?
difference between smallest and largest values
29
How is variance related to standard deviation?
Variance is the square of the standard deviation (SD is the square root of the variance)
30
What is the coefficient of variation?
Measure of relative spread: CoV = (SD / mean) x 100%
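
A short sketch of the range, variance, SD, and coefficient of variation on made-up numbers. It assumes the sample versions of variance and SD (n - 1 in the denominator), which the cards do not specify.

```python
import statistics

values = [4, 8, 6, 5, 3, 10]  # hypothetical sample

value_range = max(values) - min(values)   # largest minus smallest value
variance = statistics.variance(values)    # sample variance (n - 1 denominator)
sd = statistics.stdev(values)             # square root of the variance
cov = sd / statistics.mean(values) * 100  # relative spread, as a percentage

print(value_range, variance, round(sd, 3), round(cov, 1))
```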
31
What is a percentile?
It is the percentage of a distribution that is equal to or below a particular number (median = 50th percentile)
32
What is IQR?
Interquartile range: IQR = Q3 - Q1
33
What do you use SD with?
mean (with symmetrical data)
34
What do you use percentiles and IQR with?
median for ordinal data or skewed numerical data
35
List 4 ways we can express numerical data
  • Stem and leaf plots
  • Five number summary
  • Boxplots
  • Grouped frequency tables
36
Why are stem and leaf plots useful?
  • Gives some idea of the centrality
  • Helps to see whether the data are skewed
37
What is a 5 number summary and why is a 5 number summary useful?
  • Min
  • Q1
  • Median
  • Q3
  • Max
Helps to show the location and spread of the data
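
A sketch of a five number summary (and the IQR) using the standard library, on made-up data. It assumes `statistics.quantiles` with `method="exclusive"`, which places quartiles at p(n+1) positions and interpolates between neighbouring values, so results can differ slightly from a round-down rule.

```python
import statistics

values = sorted([7, 1, 15, 4, 10, 3, 12, 6, 8])  # hypothetical sample, n = 9

# n=4 asks for the three cut points that split the data into quarters
q1, median, q3 = statistics.quantiles(values, n=4, method="exclusive")

five_number = (min(values), q1, median, q3, max(values))
iqr = q3 - q1  # interquartile range

print(five_number, iqr)
```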
38
What is the formula for finding percentile that he gave us?
Position = p(n+1)
For example, to find the 25th percentile of 16 ordered numbers: (0.25)(17) = 4.25. Round down and choose the 4th number.
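
The p(n+1) rule with rounding down can be sketched as below. The function name and the 16 sample values are hypothetical. Clamping the position to the range 1..n is an added assumption for very small or very large p.

```python
import math

def percentile_by_position(sorted_values, p):
    """Return the value at position floor(p * (n + 1)), counting from 1."""
    n = len(sorted_values)
    position = p * (n + 1)
    index = math.floor(position)   # round down, as on the card
    index = min(max(index, 1), n)  # assumption: keep the position inside the data
    return sorted_values[index - 1]

data = list(range(10, 170, 10))  # 16 hypothetical sorted values: 10, 20, ..., 160

print(percentile_by_position(data, 0.25))  # position 4.25, so the 4th value
```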
39
Describe a box and whisker plot
  • Lower and upper hinges of the box are Q1 and Q3
  • Median is inside the box
40
Describe how symmetry can be interpreted from a box and whisker plot?
  • Hinges equidistant from the median means that the data are symmetrical
  • If the upper hinge is further away from the median, the data are positively skewed
  • If the lower hinge is further away, the data are negatively skewed
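
The hinge comparison above can be written as a small check. The function name `boxplot_skew` and the quartile values are made up for illustration.

```python
def boxplot_skew(q1, median, q3):
    """Compare the distance from the median to each hinge of the box."""
    upper_gap = q3 - median  # median to upper hinge
    lower_gap = median - q1  # lower hinge to median
    if upper_gap > lower_gap:
        return "positively skewed"
    if lower_gap > upper_gap:
        return "negatively skewed"
    return "symmetrical"

print(boxplot_skew(10, 15, 20))
print(boxplot_skew(10, 12, 20))
print(boxplot_skew(10, 18, 20))
```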
41
What do the whiskers represent?
the largest/smallest non-outlying values
42
What are outliers identified with in a box and whisker plot?
asterisk
43
What is the boundary for outliers?
Upper boundary: Q3 + (1.5)(IQR)
Lower boundary: Q1 - (1.5)(IQR)
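
A sketch of flagging outliers with the 1.5 x IQR rule. The lower boundary used here, Q1 - 1.5 x IQR, is the usual mirror image of the upper one. The function names and numbers are hypothetical.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) boundaries beyond which values are outliers."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values, q1, q3):
    """List the values that fall outside the two boundaries."""
    lower, upper = outlier_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]

fences = outlier_fences(10, 20)  # IQR = 10
outliers = find_outliers([2, 12, 15, 18, 40, -9], 10, 20)

print(fences, outliers)
```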
44
Describe grouped frequency tables
  • Group observations on a variable into contiguous, non-overlapping (preferably equal) class intervals (bins)
  • Place each observation into only one bin
  • Tabulate the frequency of observations in each bin
  • Can calculate relative frequency (proportion or percentage)
  • Can also tabulate cumulative frequencies and cumulative relative frequencies
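
A minimal sketch of building a grouped frequency table by hand. The ages, the starting value, and the function name are all made up. Half-open bins [low, high) are an assumed convention so that each observation lands in exactly one bin.

```python
def grouped_frequency_table(values, start, width, k):
    """Tabulate k contiguous, non-overlapping bins of equal width."""
    n = len(values)
    rows = []
    cumulative = 0
    for i in range(k):
        low = start + i * width
        high = low + width
        # Half-open interval: low is included, high is excluded
        frequency = sum(1 for v in values if low <= v < high)
        cumulative += frequency
        rows.append({
            "bin": (low, high),
            "frequency": frequency,
            "relative": frequency / n,
            "cumulative": cumulative,
            "cumulative_relative": cumulative / n,
        })
    return rows

ages = [21, 25, 32, 34, 38, 41, 45, 47, 52, 58]  # hypothetical sample
table = grouped_frequency_table(ages, start=20, width=10, k=4)

for row in table:
    print(row)
```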
45
Grouped frequency tables: What does k represent?
how many bins
46
Grouped frequency tables: What does w represent?
how wide each bin is
47
Grouped frequency tables: What is the formula for determining the # of bins (k)?
k = 1 + 3.322 x log10(n)
where k = the # of bins and n = the sample size
48
Grouped frequency tables: What is the formula for determining the width (w)?
w = R/k
where w = width of bin, k = # of bins, and R = range
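
The two formulas for k and w can be tried out as follows. The sample size and range are made up, and rounding k up to a whole number of bins is an assumption, since the cards do not say how to round.

```python
import math

def number_of_bins(n):
    """k = 1 + 3.322 x log10(n), where n is the sample size."""
    return 1 + 3.322 * math.log10(n)

def bin_width(value_range, k):
    """w = R / k, where R is the range and k is the number of bins."""
    return value_range / k

n = 100                       # hypothetical sample size
k = number_of_bins(n)         # 1 + 3.322 x 2 = 7.644
k_rounded = math.ceil(k)      # assumption: round up to a whole number of bins
w = bin_width(60, k_rounded)  # hypothetical range of 60

print(k, k_rounded, w)
```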
49
How is a frequency polygon created?
by linking the mid-points of successive bins
50
How do you work backwards on a frequency polygon to find the mean?
Mean = Sum(f x mid) / Sum(f), where f = bin frequency and mid = bin midpoint (not on formula sheet)
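
A sketch of the grouped mean formula, using made-up bin midpoints and frequencies. The result is an estimate, because every observation in a bin is treated as sitting at the midpoint.

```python
def grouped_mean(midpoints, frequencies):
    """Mean = Sum(f x mid) / Sum(f)."""
    total = sum(f * mid for f, mid in zip(frequencies, midpoints))
    return total / sum(frequencies)

midpoints = [25, 35, 45, 55]  # hypothetical bin midpoints
frequencies = [2, 3, 3, 2]    # hypothetical counts per bin

estimated_mean = grouped_mean(midpoints, frequencies)
print(estimated_mean)
```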
51
How do you find the median from a frequency polygon?
Go to 50% on the cumulative relative frequency axis, look across to where it hits the line, and read off the value below
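
Reading the 50% point off the line can be imitated with linear interpolation. This sketch assumes the cumulative relative frequencies are plotted against the bin boundaries, which the card does not state. The function name and the numbers are hypothetical.

```python
def value_at_cumulative(edges, cumulative_relative, target=0.5):
    """Find where a cumulative frequency line crosses the target proportion."""
    points = list(zip(edges, cumulative_relative))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 > y0:
            # Straight-line interpolation inside this segment
            return x0 + (target - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("target is outside the cumulative range")

edges = [20, 30, 40, 50, 60]                  # hypothetical bin boundaries
cumulative_relative = [0, 0.1, 0.4, 0.8, 1.0]  # proportion at or below each boundary

estimated_median = value_at_cumulative(edges, cumulative_relative)
print(round(estimated_median, 2))
```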
52
How do sample size and population affect the probability distribution?
As the sample size gets bigger and the bin width decreases, the underlying distribution becomes clearer and you get a smoother curve