descriptive statistics Flashcards

1
Q

categoric variable

A

Each individual falls into one of several categories
binary: 2 categories, e.g. yes/no
ordinal: >2 categories with a natural ordering, e.g. low/medium/high
nominal: >2 categories with no ordering, e.g. hair colour

2
Q

numerical variables

A

variables measured on a numeric scale

discrete: takes only distinct, separate values, e.g. age in whole years
continuous: can take any value within a particular range, e.g. blood pressure

3
Q

descriptive statistics (categorical)

A
  • Probability/proportion = number with the outcome / total number (scale 0 to 1)
  • Percentage = proportion × 100 (scale 0 to 100)
  • Rate = number of times something happens per fixed unit, e.g. x per 100 people (scale 0 to infinity)
  • Odds = number with the outcome / number without the outcome (scale 0 to infinity)
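
As a rough illustration of the four measures above, here is a minimal Python sketch. The counts (20 people with the outcome out of 100) and the choice of "per 1,000 people" for the rate are assumptions made up for the example, not figures from the cards.

```python
# Hypothetical counts, chosen only for illustration.
with_outcome = 20   # people who have the outcome
total = 100         # everyone in the group

proportion = with_outcome / total              # scale 0 to 1
percentage = proportion * 100                  # scale 0 to 100
rate_per_1000 = with_outcome / total * 1000    # events per 1,000 people
odds = with_outcome / (total - with_outcome)   # with outcome / without outcome

print(proportion, percentage, rate_per_1000, odds)
```

Note that the proportion divides by everyone in the group, whereas the odds divide only by those without the outcome, so the two differ unless the outcome is rare.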
4
Q

quantifying differences

A

• It is not sufficient to say ‘one looks more effective’; we want a measure that quantifies the difference

5
Q

risk ratio (RR)

A
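  • Divide the probability (or percentage) of the outcome in one group by that in the other
  • Whichever group goes on top (the numerator) is the focus of the comparison
  • The result has 3 possible outcomes: >1 (risk is higher in the focus group), =1 (no difference), <1 (risk is lower in the focus group)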
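
A small Python sketch of the risk ratio may help here. The function names `risk_ratio` and `interpret` and the counts (10 of 100 in the focus group, 20 of 100 in the reference group) are hypothetical, introduced only for illustration.

```python
def risk_ratio(events_focus, n_focus, events_ref, n_ref):
    """Risk in the focus group divided by risk in the reference group."""
    risk_focus = events_focus / n_focus
    risk_ref = events_ref / n_ref
    return risk_focus / risk_ref


def interpret(ratio):
    """Map a ratio to one of the three possible outcomes (>1, =1, <1)."""
    if ratio > 1:
        return "higher in the focus group"
    if ratio < 1:
        return "lower in the focus group"
    return "no difference"


# Hypothetical data: the focus group is the one placed on top.
rr = risk_ratio(10, 100, 20, 100)
print(rr, interpret(rr))
```

Swapping which group goes on top inverts the ratio (0.5 becomes 2), which is why the focus group should always be stated.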
6
Q

odds ratio (OR)

A
  • Divide the odds in one group by the odds in the other
  • The same rules apply as for RR: the group on top is the focus, and the result is >1, =1 or <1
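
The calculation can be sketched in Python as follows. The helper names `odds` and `odds_ratio` and the counts are assumptions for illustration only.

```python
def odds(events, n):
    """Odds = number with the outcome / number without the outcome."""
    return events / (n - events)


def odds_ratio(events_focus, n_focus, events_ref, n_ref):
    """Odds in the focus group divided by odds in the reference group."""
    return odds(events_focus, n_focus) / odds(events_ref, n_ref)


# Hypothetical data: 10 of 100 in the focus group, 20 of 100 in the reference group.
or_value = odds_ratio(10, 100, 20, 100)
print(round(or_value, 3))
```

With these made-up counts the odds ratio is (10/90) / (20/80), which is not the same number as the ratio of the two risks (0.10 / 0.20), so the two measures should not be used interchangeably.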

7
Q

odds

A

Odds = probability / (1 - probability)
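
A minimal Python sketch of this conversion, with the reverse direction added for completeness. The function names are hypothetical, and the inverse formula (probability = odds / (1 + odds)) is not on the card; it follows from rearranging the definition.

```python
def odds_from_probability(p):
    """Convert a probability (0 <= p < 1) to odds."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


def probability_from_odds(o):
    """Convert odds (o >= 0) back to a probability."""
    if o < 0:
        raise ValueError("odds must be non-negative")
    return o / (1 + o)


print(odds_from_probability(0.5))   # a 50% chance corresponds to even odds
print(probability_from_odds(odds_from_probability(0.2)))
```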

8
Q

standard deviation (SD)

A
  • Calculate the difference between each value and the mean
  • Square those differences to make them all positive
  • Add all the squared differences together
  • Divide by the number of values (n - 1 is commonly used instead when the data are a sample)
  • Then take the square root of the result
  • It uses every value, so it is more powerful, but it is affected by extreme values, so it is less suitable for skewed data
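
The steps above can be sketched in Python. The data values are made up for illustration; the sketch divides by the number of values as the card describes, and then compares against the standard library's `statistics.pstdev` (divide by n) and `statistics.stdev` (divide by n - 1).

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

mean = sum(values) / len(values)
differences = [v - mean for v in values]    # difference from the mean
squared = [d ** 2 for d in differences]     # squaring makes them all positive
total_squared = sum(squared)                # add the squared differences
variance = total_squared / len(values)      # divide by the number of values
sd = math.sqrt(variance)                    # square root

print(sd)
print(statistics.pstdev(values))  # same calculation, dividing by n
print(statistics.stdev(values))   # sample version, dividing by n - 1
```

The n - 1 version always gives a slightly larger value; the gap shrinks as the number of values grows.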
9
Q

measures of spread

A
  • Range
  • Inter-quartile range (more representative than the range: covers the middle 50% of the data, from the 25th centile to the 75th centile; reported with the median)
  • Standard deviation (roughly the average distance from the mean for an individual picked at random; measures how spread out the values are; used for comparison)
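
A short Python sketch of the range and inter-quartile range, using made-up data. It relies on `statistics.quantiles` from the standard library (Python 3.8+); quartile conventions differ between textbooks and software, so the exact centile values may not match a hand calculation.

```python
import statistics

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical data

value_range = max(values) - min(values)

# quantiles(..., n=4) returns the three cut points: Q1, median, Q3.
q1, median, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1   # width of the middle 50% of the data

print(value_range, q1, median, q3, iqr)
```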
10
Q

symmetric distribution

A

Summarise with the mean and SD

11
Q

non-symmetric distribution

A

Summarise with the median and IQR

12
Q

normal distribution

A
  • Most people have values around the middle (the mean); some have extreme values, with roughly the same number on each side
  • When we know the mean and SD, we can work out the range within which a given % of the sample lies
  • We can use the properties of the normal distribution to create ‘normal ranges’, where 95% of the data lie (mean ± 1.96 SD)
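
As a sketch of how a 95% normal range could be computed, assuming the data really are normally distributed. The mean of 120 and SD of 10 are made-up numbers, and 1.96 is the standard multiplier that captures the central 95% of a normal distribution.

```python
from statistics import NormalDist

mean = 120.0   # hypothetical mean
sd = 10.0      # hypothetical standard deviation

lower = mean - 1.96 * sd
upper = mean + 1.96 * sd

# Check the coverage against the normal distribution itself.
dist = NormalDist(mu=mean, sigma=sd)
coverage = dist.cdf(upper) - dist.cdf(lower)

print(lower, upper, round(coverage, 3))
```

If the data are skewed, a range built this way will not contain 95% of the values, so it is worth checking the shape of the distribution first.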
13
Q

quantifying differences

A
  • One numeric and one categoric variable
  • Two numeric variables
  • If we can use the difference in means, we do
  • If any of the groups is not normally distributed, use the difference in medians
  • Then conclude whether the difference is big enough to be important
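
A brief Python sketch comparing the two approaches for one numeric variable split by a two-level categoric variable. Both groups are made-up data; group B contains one deliberately extreme value to show why the choice matters.

```python
import statistics

group_a = [5, 6, 7, 8, 9]     # hypothetical, roughly symmetric
group_b = [3, 4, 5, 6, 32]    # hypothetical, skewed by one extreme value

diff_in_means = statistics.mean(group_a) - statistics.mean(group_b)
diff_in_medians = statistics.median(group_a) - statistics.median(group_b)

print(diff_in_means, diff_in_medians)
```

Here the single extreme value drags the mean of group B above that of group A, so the difference in means and the difference in medians point in opposite directions.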
14
Q

comparing 2 numeric variables

A
  • Pearson’s correlation coefficient (denoted r)
  • r must be between -1 and +1
  • +1 = perfect positive linear association
  • -1 = perfect negative linear association
  • 0 = no linear association at all
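
For reference, Pearson's correlation coefficient can be computed from first principles as in the Python sketch below. The function name `pearson_r` and the data are hypothetical; the sketch assumes both lists have the same length and that neither variable is constant (otherwise the denominator would be zero).

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]      # rises exactly in line with x
y_down = [10, 8, 6, 4, 2]    # falls exactly in line with x

print(pearson_r(x, y_up), pearson_r(x, y_down))
```

Because r only measures linear association, a value near 0 does not rule out a strong curved relationship between the two variables.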