Data analysis Flashcards
What are the types of data (3)
- Interval - measurement
- Ordinal - measurement
- Nominal - classification
What is interval data (3)
- Measurements on a constant interval scale of measurement (e.g. length)
- Intervals between values are exactly defined
- Intervals are constant
What is ordinal data (4)
- Measurements not on a constant interval scale of measurement.
- Intervals between values are undefined
- Intervals are not necessarily constant
- Commonly arises in pharmacy research from ‘Likert items’ in questionnaires
What is nominal data (2)
- Factors that are classified, not measured (pregnant/non-pregnant, dead/alive)
- Where there are 2 possibilities = Dichotomous
What is the mean (2)
- Aka average/arithmetic mean
- The sum of all values divided by the number of values
What is the median
The middle value when the numbers are arranged in ascending order (with an even number of values, the mean of the two middle values)
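A minimal sketch of the mean and median, assuming a small made-up data set (the numbers are illustrative only, not taken from any study):

```python
import statistics

# Hypothetical data set, already in ascending order
values = [2, 4, 4, 5, 10]

# Mean: sum of all values divided by the number of values
mean = sum(values) / len(values)

# Median: the middle value of the sorted data
median = statistics.median(values)

print(mean)    # 5.0
print(median)  # 4
```

The single large value (10) pulls the mean above the median here, which is one reason the two indicators can disagree.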
What is the mode (5)
- A value that occurs with a peak frequency
- There is no formula for the calculation of the mode - an ‘eye-ball’ value
- Unimodal - a single cluster
- Bimodal - two clusters
- Polymodal - more than one cluster
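Although the mode is usually judged by eye, a frequency tally can support that judgement when the data are discrete with exact repeats. A rough sketch, assuming made-up values with two clusters:

```python
from collections import Counter
import statistics

# Hypothetical discrete data with two clusters of repeated values
values = [1, 2, 2, 2, 3, 7, 8, 8, 8, 9]

# Tally how often each value occurs, most frequent first
frequencies = Counter(values).most_common()

# multimode returns every value that shares the highest frequency
modes = statistics.multimode(values)

print(frequencies)
print(modes)  # [2, 8], i.e. bimodal
```

For continuous measurements exact repeats are rare, so a histogram inspected by eye is the more usual approach.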
What are the indicators of central tendency (3)
- Mean - The standard indicator. OK for many data sets
- Median - Used fairly frequently
- Mode - Rarest - only usually used with polymodal data
What are the indicators of dispersion (2)
- Standard deviation
- Coefficient of variation
What is the standard deviation
The square root of (the sum of the squared deviations from the mean / (number of values - 1))
What is the coefficient of variation (2)
- Standard deviation/mean
- Can be expressed as a decimal or a percentage
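A worked sketch of both dispersion indicators, assuming a small made-up data set. The sample standard deviation is taken here as the square root of the sum of squared deviations from the mean divided by (n - 1):

```python
import math
import statistics

# Hypothetical data set
values = [2, 4, 4, 5, 10]
n = len(values)
mean = sum(values) / n  # 5.0

# Squared deviation of each value from the mean: 9, 1, 1, 0, 25
squared_deviations = [(x - mean) ** 2 for x in values]

# Sample standard deviation: divide by n - 1, then take the square root
sd = math.sqrt(sum(squared_deviations) / (n - 1))

# Coefficient of variation: SD relative to the mean
cv = sd / mean

print(sd)                 # 3.0
print(f"{cv:.2f}")        # 0.60 as a decimal
print(f"{cv * 100:.0f}%") # 60% as a percentage
```

The standard library function statistics.stdev(values) gives the same standard deviation, since it also divides by n - 1.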
What are quartiles (5)
- The median is a single value that divides a data set into 2 equal-sized groups
- The quartiles are three values that divide a data-set into 4 equal-sized groups
- 3 Quartiles divide the data into 4 equal groups
- 9 deciles divide the data into 10 equal groups
- 99 centiles divide the data into 100 equal groups
What is the interquartile range
The difference between quartile 3 and quartile 1 (Q3 - Q1)
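A short sketch of quartiles and the interquartile range, assuming made-up data. Note that several conventions exist for interpolating quartiles, so other software may give slightly different values; this uses the default method of Python's statistics.quantiles:

```python
import statistics

# Hypothetical data set
values = [1, 2, 3, 4, 5, 6, 7]

# n=4 asks for the three cut points that split the data into 4 groups
q1, q2, q3 = statistics.quantiles(values, n=4)

# Interquartile range: quartile 3 minus quartile 1
iqr = q3 - q1

print(q1, q2, q3)  # 2.0 4.0 6.0
print(iqr)         # 4.0
```

The second quartile is the median. Passing n=10 or n=100 to the same function returns the 9 deciles or 99 centiles.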
What are the characteristics of a normal distribution (3)
- Unimodal
- Symmetrical
- No sharp cut-offs
What is population (3)
- The group about which we wish to draw some conclusion
- Size not under our control
- Rarely possible to study
What is a sample (3)
- Not of direct interest
- Size is under our control
- Should be possible to study it
What are sampling errors (2)
- Bias or systematic error - can be removed
- Random error (unpredictable direction) - cannot be designed out
What controls how precise sample means will be (3)
- Sample size
- Variability within the population
- We can assess likely sampling error from these two factors
How does sample size affect results (4)
- Small samples = the means will vary wildly
- Large samples = much more consistent estimates
- Small samples = bad
- Big samples = good
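The effect of sample size can be illustrated with a small simulation. This is only a sketch under assumed conditions: a hypothetical normally distributed population with mean 100 and SD 15, sampled repeatedly at two sample sizes.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def spread_of_sample_means(sample_size, repeats=1000):
    """Draw many samples and return the SD of their means."""
    means = []
    for _ in range(repeats):
        # Hypothetical population: mean 100, SD 15
        sample = [random.gauss(100, 15) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return statistics.stdev(means)

small = spread_of_sample_means(5)
large = spread_of_sample_means(100)

print(f"Spread of means, n=5:   {small:.2f}")
print(f"Spread of means, n=100: {large:.2f}")
```

The means from the samples of 5 scatter far more widely than the means from the samples of 100, which is the sense in which larger samples give more consistent estimates.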
How does standard deviation affect dispersion (variability) (2)
- Low SD = sample means will be fairly consistent (good)
- High SD = sample means will be much more variable (bad)
What is sampling error and the Standard Error of the Mean (SEM) (4)
- The sample mean may not accurately reflect the true population mean - random sampling error
- Need to consider the likely extent of sampling error
- A low SEM indicates a precise sampling scheme
- A high SEM indicates an imprecise sampling scheme
How do you calculate the Standard Error of the Mean (SEM)
SEM = standard deviation/ square root of the number of values
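A minimal sketch of the SEM calculation, assuming made-up data and using the sample standard deviation (n - 1 in the denominator). The helper name sem is hypothetical, chosen for illustration:

```python
import math
import statistics

def sem(values):
    """Standard error of the mean: sample SD / sqrt(number of values)."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Hypothetical data set with SD = 3.0 and n = 5
values = [2, 4, 4, 5, 10]
print(f"{sem(values):.3f}")  # 1.342

# For a fixed SD, quadrupling the sample size halves the SEM
sd = 3.0
sem_n9 = sd / math.sqrt(9)    # 1.0
sem_n36 = sd / math.sqrt(36)  # 0.5
print(sem_n9, sem_n36)
```

Because the sample size sits under a square root, halving the SEM needs four times as many values, not twice as many.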
How does the Standard Error of the Mean (SEM) relate to sampling error (2)
- Large sampling error = Large SEM
- Small sampling error = Small SEM