Statistics Flashcards
Describe the different types of data
- Quantitative - A measurement that is either discrete or continuous (discrete data take whole-number values, e.g. counts, whereas continuous data can take any value within a range, e.g. height).
- Qualitative - Objects are classified into groups, which are either ordinal or nominal (ordinal groups have a natural order, whereas nominal groups have no order). Categorical data with only two values is binary.
What is a stratified sample?
A sample used in a study where certain categories of the population need to be represented. The population is divided into strata (subgroups) and a random sample is drawn from each stratum.
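The two steps of stratified sampling (divide into strata, then sample at random within each) can be sketched in Python. This is only an illustration: the function name `stratified_sample`, the made-up `patients` list and the choice of equal sample sizes per stratum are all assumptions, not part of the flashcard.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, n_per_stratum, seed=0):
    """Draw a simple random sample of n_per_stratum units from each stratum."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)  # step 1: divide into strata
    sample = []
    for label in sorted(strata):
        members = strata[label]
        k = min(n_per_stratum, len(members))
        sample.extend(rng.sample(members, k))  # step 2: random sample per stratum
    return sample

# Hypothetical population of (patient id, age group) pairs
patients = [(i, "40 and over" if i % 3 == 0 else "under 40") for i in range(30)]
chosen = stratified_sample(patients, stratum_of=lambda p: p[1], n_per_stratum=4)
```

Every age group is guaranteed to appear in `chosen`, which a single simple random sample of the whole population would not guarantee.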
What type of data do pie charts and bar graphs usually represent?
Categorical variables
Which graphs are used to visualise the distribution of continuous data?
Histograms
Stem and leaf plots
Box and whisker plots
What are scatter plots used to visualise?
The relationship between two variables
Why are there no gaps between the bars on a histogram?
The data they represent are continuous, whereas in bar charts the data are categorical or discrete
What is the total area of the columns equal to in a relative frequency histogram?
1
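A quick check of why the area comes to 1, assuming each bar's height is the relative frequency per unit of bin width, so that bar area equals the relative frequency of the bin. The heights and bin edges below are made up for illustration.

```python
heights_cm = [150, 152, 155, 158, 160, 161, 163, 165, 170, 178]  # made-up data
bin_edges = [150, 160, 170, 180]  # three bins, each of width 10

n = len(heights_cm)
areas = []
for left, right in zip(bin_edges[:-1], bin_edges[1:]):
    is_last_bin = right == bin_edges[-1]
    # Count values in [left, right); the last bin also includes its right edge
    count = sum(1 for x in heights_cm
                if left <= x < right or (is_last_bin and x == right))
    width = right - left
    bar_height = count / (n * width)   # relative frequency per unit width
    areas.append(bar_height * width)   # bar area = relative frequency of the bin

total_area = sum(areas)
```

The bar areas are the relative frequencies of the bins, and relative frequencies always add up to 1.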
What do scatter plots represent?
The relationship between two quantitative variables
How do you calculate the strength of the relationship between the two variables in a scatter plot?
By calculating the correlation coefficient
What is the line fitted to a scatter plot called?
Regression line
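A minimal sketch of both calculations, assuming the correlation coefficient meant is Pearson's r and the regression line is the ordinary least-squares line y = intercept + slope * x. The function name and the data are hypothetical.

```python
from math import sqrt

def pearson_r_and_line(xs, ys):
    """Return (r, slope, intercept) for paired quantitative data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / sqrt(sxx * syy)             # strength of the linear relationship
    slope = sxy / sxx                     # least-squares slope
    intercept = mean_y - slope * mean_x   # line passes through (mean_x, mean_y)
    return r, slope, intercept

xs = [1, 2, 3, 4, 5]   # made-up x values
ys = [2, 4, 5, 4, 5]   # made-up y values
r, slope, intercept = pearson_r_and_line(xs, ys)
```

Values of r near +1 or -1 indicate a strong linear relationship; values near 0 indicate a weak one.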
Why is the mean not a useful summary of skewed data? What would be a better measure?
The mean is very sensitive to outliers, so it is pulled towards the tail. The median is a better measure because it is not sensitive to outliers.
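A small illustration with made-up data (weekly units of alcohol for seven people, one of them an outlier), using Python's standard `statistics` module:

```python
from statistics import mean, median

units_per_week = [1, 2, 2, 3, 4, 5, 40]  # made-up data; 40 is an outlier

mean_units = mean(units_per_week)      # pulled upwards by the single value 40
median_units = median(units_per_week)  # middle value, unaffected by the outlier
```

Here the median is 3 while the mean is a little over 8, which is larger than six of the seven observations.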
What adjustment must be made when calculating the sample variance to give an unbiased estimate of the population variance?
The denominator must be n-1, not n
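The effect of the denominator can be seen with a small made-up data set. In Python's standard `statistics` module, `variance` divides by n-1 and `pvariance` divides by n.

```python
from statistics import pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample with mean 5
n = len(data)
mean_value = sum(data) / n
sum_sq_dev = sum((x - mean_value) ** 2 for x in data)  # 32 for this sample

var_n = sum_sq_dev / n                # denominator n: underestimates on average
var_n_minus_1 = sum_sq_dev / (n - 1)  # denominator n-1: unbiased estimate

# The library functions agree with the hand calculation
lib_var_n = pvariance(data)
lib_var_n_minus_1 = variance(data)
```

The n-1 version is always slightly larger, and the difference shrinks as the sample size grows.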
What does a larger standard deviation tell you about the spread of the data?
Large SD = Wide spread of data
What does standard deviation measure?
The spread of the data around the mean
What does positively skewed mean?
Most values lie towards the bottom end of the range with a tail to the right (larger end of the range)
Give two measurements in healthcare that are most often positively skewed.
Units of alcohol drunk or number of cigarettes smoked.
What does negatively skewed mean?
Most values lie towards the upper end of the range with a tail to the left.
If you get a coefficient of skewness of 0 what does this mean?
Data distribution is symmetrical
If you get a coefficient of skewness of 1 what does this mean?
Positive skew
If you get a coefficient of skewness of -1 what does this mean?
Negative skew
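The cards do not say which coefficient of skewness is meant. As one possible illustration, the sketch below assumes the moment coefficient (third central moment divided by the second central moment raised to the power 1.5); the function name and data are hypothetical.

```python
def moment_skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (an assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 10]           # long tail to the right
left_tailed = [-x for x in right_tailed]     # mirror image: tail to the left

skew_symmetric = moment_skewness(symmetric)  # zero for symmetrical data
skew_right = moment_skewness(right_tailed)   # positive
skew_left = moment_skewness(left_tailed)     # negative
```

The sign of the coefficient gives the direction of the tail, and its size gives a rough idea of how marked the skew is.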
When do you use the normal distribution?
Continuous variables such as lengths, heights and weights.
When do you use the binomial distribution?
Binary data such as alive or dead, male or female
When do you use the Poisson distribution?
Counts of rare events, and of events occurring at random in time or space.
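For reference, the two discrete distributions can be written out directly. This sketch uses the standard probability formulas, with the binomial counting successes in n independent binary trials and the Poisson counting events that occur at an average rate per interval. The function names and the numbers used (10 patients, survival probability 0.3, rate of 2 events) are made up.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent binary trials, success prob p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, rate):
    """P(exactly k events in an interval with the given average rate)."""
    return exp(-rate) * rate ** k / factorial(k)

# Hypothetical examples
p_three_survive = binomial_pmf(3, n=10, p=0.3)  # 3 of 10 patients alive
binomial_total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))  # sums to 1
p_no_events = poisson_pmf(0, rate=2.0)          # no events when 2 are expected
```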
What are the characteristics of the normal distribution?
- Bell shaped
- Single central peak
- Symmetrical
- Equal mean, median and mode
- Continuous
- Takes values between negative infinity and positive infinity
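Some of these characteristics can be checked numerically with `NormalDist` from Python's standard `statistics` module. The mean of 170 and standard deviation of 10 are made-up values (for example, heights in cm).

```python
from statistics import NormalDist

heights = NormalDist(mu=170, sigma=10)  # hypothetical mean and SD

# Symmetrical: the curve is equally high 10 units either side of the mean
density_below = heights.pdf(160)
density_above = heights.pdf(180)

# Single central peak: the density at the mean exceeds the density elsewhere
density_at_mean = heights.pdf(170)

# Mean equals median: half of the probability lies below the mean
prob_below_mean = heights.cdf(170)
```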