1 - Intro to Stats Flashcards

1
Q

What are the 3 common scales of measurement for variables in medicine?

A
  • Nominal
  • Ordinal
  • Numeric (interval or ratio)
2
Q

Describe nominal measurement

A
  • Simplest b/c data fits in categories in no particular order
  • No actual measurement
  • Often dichotomous or binary (yes/no, male/female)
  • Can be multiple categories
  • Generally described in percentages or proportions
3
Q

Describe ordinal measurement

A
  • Inherent order to the categories
  • Summary statistic (median)
  • Often used in assessment of pt risk
  • Difference between 2 adjacent categories isn’t the same throughout the scale
4
Q

Describe numerical measurement

A
  • Differences have meaning on a numerical scale
  • 2 types of numerical scales: interval and ratio

5
Q

Describe the difference between interval and ratio

A
  • Interval - difference between any pair of adjacent levels is the same (ex: temperature in °C or °F, where 15 - 10 = 25 - 20), but no meaningful zero value
  • Ratio - interval scale w/ a meaningful zero value (ex: time)
6
Q

What are the 3 types of variables? Give examples of each

A
  • Continuous (age, time)
  • Discrete (number of houses on a street)
  • Summary statistics (mean and SD)
7
Q

What are the 3 measures of middle?

A

Mean, median, and mode

8
Q

Describe mean

A
  • Arithmetic average
  • Mean = (sum of x) / n (x = individual observation; n = number of observations)
  • Used w/ numerical variables; shouldn’t be used w/ ordinal variables (but often is)
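
A minimal Python sketch of the formula above (sum of x divided by n). The function name `mean` and the blood pressure values are made up for illustration.

```python
def mean(observations):
    """Arithmetic average: sum of the observations divided by their count."""
    n = len(observations)
    if n == 0:
        raise ValueError("mean of an empty dataset is undefined")
    return sum(observations) / n


# Hypothetical systolic blood pressures (mmHg), invented for this example.
systolic = [118, 122, 130, 126, 124]
print(mean(systolic))  # 124.0
```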
9
Q

Describe median

A
  • Middle observation
  • Arrange the observations from smallest to largest; count and find the middle
  • For odd number of observations - median is the middle observation
  • For even number - median is average of the values on either side of the middle
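
The odd/even rule above can be sketched in Python as follows. The function name `median` and the sample values are illustrative only.

```python
def median(observations):
    """Middle observation of the sorted data (average of the two middle ones if n is even)."""
    ordered = sorted(observations)  # arrange from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: a single middle observation exists.
        return ordered[mid]
    # Even n: average the two values on either side of the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 3, 9, 1, 5]))      # 5
print(median([7, 3, 9, 1, 5, 11]))  # 6.0
```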
10
Q

Describe mode

A
  • Value that occurs most frequently
  • Data can have more than 1 mode (ex: bimodal distribution)
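
A short Python sketch of finding the mode(s), returning every value tied for the highest frequency so that bimodal data is handled. The helper name `modes` and the data are made up for illustration.

```python
from collections import Counter


def modes(observations):
    """Return all values that occur most frequently, in ascending order."""
    counts = Counter(observations)
    highest = max(counts.values())  # raises ValueError on empty input
    return sorted(value for value, count in counts.items() if count == highest)


# 3 and 5 both occur twice, so this dataset is bimodal.
print(modes([2, 3, 3, 4, 5, 5, 6]))  # [3, 5]
```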

11
Q

Which measure of middle should be used in skewed distributions?

A
  • Use the mean for symmetric (ex: normal) distributions
  • Use the median for ordinal data or skewed numerical data (the mean is very sensitive to extreme values, especially in small datasets)

12
Q

What are the 4 measures of spread (dispersion)?

A
  • Range
  • Standard deviation / variance
  • Percentiles
  • Interquartile range
13
Q

Describe range

A
  • Difference between the smallest and largest observation
  • Minimum and maximum may also be given

14
Q

Describe the standard deviation formula

A
  • s = sqrt[ Σ(x - x̄)² / (n - 1) ]
  • x = individual value
  • x̄ = sample mean
  • n = sample size
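
A minimal Python sketch of the sample standard deviation with the n - 1 denominator. The function names and the data are illustrative; the variance is computed first because it is the quantity under the square root.

```python
import math


def sample_variance(observations):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(observations)
    if n < 2:
        raise ValueError("at least two observations are required")
    x_bar = sum(observations) / n
    return sum((x - x_bar) ** 2 for x in observations) / (n - 1)


def sample_sd(observations):
    """Standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(observations))


# Made-up data: mean is 5, squared deviations sum to 32, so variance = 32 / 7.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(sample_sd(data), 3))  # 2.138
```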
15
Q

What is variance?

A

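The standard deviation squared: s² = Σ(x - x̄)² / (n - 1), i.e. the value in the standard deviation formula before the square root is taken (x̄ = mean, n = sample size)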

16
Q

Describe percentiles

A
  • Percentage of a distribution that is equal to or below a particular number
  • Median = 50th percentile
  • Common example = physical growth charts for children
17
Q

What is interquartile range?

A
  • Difference between the 25th and 75th percentiles (1st and 3rd quartiles)
  • Describes the middle 50% of the distribution regardless of the shape
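
A small Python sketch of the interquartile range using the standard library's `statistics.quantiles`. Several conventions exist for computing quartiles and software packages may give slightly different answers; the `method="inclusive"` choice here is an assumption made for this example, not something the cards specify.

```python
import statistics


def interquartile_range(observations):
    """Difference between the 3rd and 1st quartiles (75th and 25th percentiles)."""
    # n=4 splits the data into quartiles and returns the three cut points.
    q1, _median, q3 = statistics.quantiles(observations, n=4, method="inclusive")
    return q3 - q1


# Made-up data: quartiles are 3, 5 and 7 under the inclusive method.
print(interquartile_range([1, 2, 3, 4, 5, 6, 7, 8, 9]))  # 4.0
```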
18
Q

When should standard deviation be used?

A

W/ the mean, when the data are symmetric

19
Q

When should percentiles and interquartile range be used?

A

W/ median for ordinal data or skewed numerical data

20
Q

Describe tabular presentations

A
  • Nominal and ordinal data presented as proportions or percentages
  • Summarized in frequency tables
21
Q

What is the purpose of contingency tables?

A

To facilitate simultaneous examination of multiple distributions (variables)

22
Q

What are 4 ways of presenting numerical data?

A
  • Stem-and-leaf plots
  • Five number summary
  • Boxplots
  • Grouped frequency tables
23
Q

Describe a boxplot

A
  • Lower and upper hinges of the box are the 1st and 3rd quartiles
  • Median is the line in the box
24
Q

How is symmetry of a boxplot evaluated?

A
  • Evaluated by symmetry of the hinges w/ respect to median
  • If hinges equidistant from median = data is symmetrical
  • If upper hinge further away from median = data positively skewed
  • If lower hinge further away from median = data negatively skewed
25
Q

Describe a box and whisker plot

A

Same as boxplot but whiskers drawn from upper and lower hinges to largest/smallest non-outlying values

26
Q

Describe a modified boxplot

A
  • Outliers are identified by an asterisk
  • Boundary for outliers = 1.5 x interquartile range beyond the hinges of the box
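
The 1.5 x IQR boundary can be sketched in Python as below. The function names and data are made up, and the quartiles use `statistics.quantiles` with `method="inclusive"`, which is one of several quartile conventions and an assumption of this example.

```python
import statistics


def outlier_fences(observations, k=1.5):
    """Return (lower, upper) boundaries: k * IQR below Q1 and above Q3."""
    q1, _median, q3 = statistics.quantiles(observations, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(observations):
    """Observations falling outside the fences (shown as asterisks on the plot)."""
    low, high = outlier_fences(observations)
    return [x for x in observations if x < low or x > high]


# Made-up data: Q1 = 3, Q3 = 7, IQR = 4, so the fences are -3 and 13.
data = [1, 2, 3, 4, 5, 6, 7, 8, 30]
print(outlier_fences(data))  # (-3.0, 13.0)
print(find_outliers(data))   # [30]
```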

27
Q

Describe how to construct a grouped frequency table

A
  • Group observations on variable into contiguous, non-overlapping (preferably equal) class intervals (bins)
  • Place each observation into only one bin
  • Tabulate frequency of observations in each bin
  • Can calculate relative frequency as a proportion or percentage
  • Can also tabulate cumulative frequency and cumulative relative frequency
  • Must decide how many bins (k) and how wide (w)
28
Q

What are some “rules” for grouped frequency tables?

A
  • Poor grouping = loss of information (may emphasize or hide elements of the variable)
  • Too few bins = loss of info
  • Too many = cumbersome, data gaps
  • General rule = 5-20 class intervals
29
Q

Describe guidance for grouped frequency tables

A
  • w = R / k
  • w = width of the bin, k = # of bins, R = range
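
A rough Python sketch of building a grouped frequency table with equal-width bins, using w = R / k. The function name, the dictionary keys, and the ages are invented for illustration. Placing the maximum value in the last bin is one possible convention, assumed here so that every observation lands in exactly one bin.

```python
def grouped_frequency_table(observations, k):
    """Tabulate frequencies in k contiguous, non-overlapping, equal-width bins."""
    lowest, highest = min(observations), max(observations)
    width = (highest - lowest) / k  # w = R / k
    if width == 0:
        raise ValueError("all observations are identical; range is zero")

    counts = [0] * k
    for x in observations:
        # Each observation goes into exactly one bin; the maximum goes in the last bin.
        index = min(int((x - lowest) / width), k - 1)
        counts[index] += 1

    n = len(observations)
    table, cumulative = [], 0
    for i, frequency in enumerate(counts):
        cumulative += frequency
        table.append({
            "lower": lowest + i * width,
            "upper": lowest + (i + 1) * width,
            "frequency": frequency,
            "relative": frequency / n,
            "cumulative": cumulative,
            "cumulative_relative": cumulative / n,
        })
    return table


# Made-up ages: R = 61 - 21 = 40, so k = 4 bins gives w = 10.
ages = [21, 25, 29, 33, 34, 38, 41, 45, 47, 52, 58, 61]
for row in grouped_frequency_table(ages, k=4):
    print(row)
```

In practice k would be chosen within the 5-20 range suggested above; k = 4 is used here only to keep the printed table short.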

30
Q

What is the difference between histograms and bar charts?

A

Histogram bars are generally joined b/c they represent a continuous distribution; bar chart bars are separated b/c they represent distinct categories

31
Q

What is a frequency polygon?

A
  • Created by linking the mid-points of successive bins
  • Polygon is closed by joining it to the x-axis at the mid-points of the zero-frequency bins just beyond each end

32
Q

How can you use a histogram to find the mean?

A

Mean ≈ Σ(f x x_mid) / Σf (approximate, b/c each observation is treated as sitting at its bin's mid-point)

  • f = frequency of the bin
  • x_mid = mid-point of the bin's range of x values
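
The grouped-mean formula above can be sketched in Python as follows. The function name `grouped_mean`, the (lower, upper, frequency) tuple layout, and the bin values are assumptions made for this example.

```python
def grouped_mean(bins):
    """Estimate the mean from histogram bins given as (lower, upper, frequency)."""
    total_frequency = sum(f for _lower, _upper, f in bins)
    if total_frequency == 0:
        raise ValueError("no observations in any bin")
    # Weight each bin's mid-point by the bin's frequency.
    weighted_sum = sum(f * (lower + upper) / 2 for lower, upper, f in bins)
    return weighted_sum / total_frequency


# Made-up histogram: mid-points 25, 35, 45 with frequencies 4, 10, 6.
bins = [(20, 30, 4), (30, 40, 10), (40, 50, 6)]
print(grouped_mean(bins))  # 36.0
```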