Statistics 2 Flashcards
What are the two types of data?
Discrete & continuous
What is an example of discrete and continuous data?
Discrete: frequencies (counts); continuous: weight
What are the four scales of measurement?
Nominal, Ordinal, Interval and Ratio
What is the nominal scale of measurement?
Discrete, mutually exclusive categories; if coded numerically, the codes have no numerical significance, so no arithmetic is possible
What is the ordinal scale of measurement?
Discrete, mutually exclusive categories that form a natural ordering/hierarchy; intervals between categories are not uniform, so no arithmetic is possible
What is the interval scale of measurement?
Continuous data; intervals are equal, allowing simple arithmetic (addition and subtraction). However, there is no absolute zero: the zero point is arbitrary, so values can fall below it, and this prevents complex arithmetic (multiplication and division, i.e. ratios)
What is the ratio scale of measurement?
Continuous data; interval data with an absolute zero point. All arithmetic is possible
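A common illustration of the interval/ratio distinction (not taken from the cards themselves) is temperature: Celsius has an arbitrary zero, so it is an interval scale, while Kelvin has an absolute zero, so it is a ratio scale. The sketch below assumes that example; the variable names are made up for illustration.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to Kelvin (absolute zero is 0 K)."""
    return celsius + 273.15

cool_c, warm_c = 10.0, 20.0

# Differences are meaningful on an interval scale: both scales agree.
difference_c = warm_c - cool_c
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are only meaningful on a ratio scale: 20 C is not "twice as hot" as 10 C.
ratio_c = warm_c / cool_c                                        # 2.0, but not meaningful
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)  # only slightly above 1

print(difference_c, round(difference_k, 2))
print(ratio_c, round(ratio_k, 3))
```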
What scales of measurement can a bar chart represent?
Discrete (nominal and ordinal) data
How is data represented on the bar chart?
The length of the bar is proportional to frequency
What is represented on the x and y axis of a bar chart?
X-axis = no scale, just the names of the different categories; y-axis = frequency/measurement
How would you order nominal and ordinal data on a bar chart along the x-axis?
Nominal: order from the smallest category to the largest category
Ordinal: keep the categories in their pre-determined (natural) order
What data is represented on a histogram?
Continuous (ratio or interval) data
How is data represented on the histogram?
The area of each bar is proportional to the frequency of that category (class)
If there is a change to the category size what axis is changed?
The x-axis
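A minimal Python sketch of the histogram cards above, assuming the usual convention that bar height is frequency density (frequency divided by class width), so that bar area equals frequency. The class boundaries and counts are made-up illustration values.

```python
# Hypothetical classes: (lower bound, upper bound, frequency).
classes = [(0, 10, 5), (10, 20, 8), (20, 40, 8)]

def frequency_density(lower, upper, frequency):
    """Bar height for a histogram class: frequency per unit of class width."""
    return frequency / (upper - lower)

heights = [frequency_density(lo, hi, f) for lo, hi, f in classes]

# Area (width * height) recovers the frequency of each class.
areas = [(hi - lo) * h for (lo, hi, _), h in zip(classes, heights)]

for (lo, hi, f), h, a in zip(classes, heights, areas):
    print(f"{lo}-{hi}: frequency={f}, height={h:.2f}, area={a:.1f}")
```

The second and third classes hold the same frequency, but the wider class gets a shorter bar: the change in class width shows up along the x-axis and the height adjusts to keep the area honest.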
Why are there no gaps between the different categories on the x-axis?
Because it is continuous data
What is univariate analysis?
Analysis of the distribution of a single variable
How can distribution be described?
Visually (graphs and plots) or numerically (summary statistics)
How is central tendency described?
Averages
What is a bimodal distribution and how does it affect the central tendency?
It has two peaks (modes), so essentially there is no single centre; the mean and median can fall between the peaks, where few values lie
What is dispersion?
The spread of the data
What are the two methods for representing and measuring dispersion?
Median and IQR; standard deviation and variance
When would you use the median and IQR as a method to represent dispersion?
Most suitable for non-normal distributions, and also when the sample size is small
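A short sketch of the median and IQR using Python's standard `statistics` module. The data set is invented, with one deliberate outlier; note that `statistics.quantiles` uses its default "exclusive" method here, and other quartile conventions give slightly different values.

```python
from statistics import mean, median, quantiles

# Made-up, non-normal sample with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 100]

q1, q2, q3 = quantiles(data, n=4)  # three cut points: Q1, median, Q3
iqr = q3 - q1

print("mean:  ", mean(data))    # pulled upwards by the outlier
print("median:", median(data))  # stays with the bulk of the data
print("IQR:   ", iqr)           # spread of the middle 50% of values
```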
When would you use standard deviation and variance as a method to represent dispersion?
For normally distributed data (ratio and interval scales)
What is standard deviation?
Summary statistic that characterises the spread of the distribution around the mean
What happens as SD increases?
the spread of the data around the mean broadens/increases
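As a rough illustration, the two invented samples below share the same mean but differ in spread, and the standard deviation picks up the difference. `statistics.stdev` is the sample standard deviation (dividing by n - 1); `statistics.pstdev` would give the population version.

```python
from statistics import mean, stdev

# Two made-up samples with the same mean but different spread.
narrow = [9, 10, 11]
wide = [5, 10, 15]

print("means:", mean(narrow), mean(wide))
print("SDs:  ", stdev(narrow), stdev(wide))  # the wider sample has the larger SD
```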
How much of the data is in 1 standard deviation of the mean?
Approximately 68.27% (for a normal distribution)
How much of the data lies more than 1 SD from the mean on each side?
15.87% on each side
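These proportions can be checked against the standard normal distribution. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); it assumes the data really are normally distributed.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Proportion of a normal distribution within 1 SD of the mean.
within_one_sd = standard_normal.cdf(1.0) - standard_normal.cdf(-1.0)

# Proportion below -1 SD; by symmetry the same proportion lies above +1 SD.
lower_tail = standard_normal.cdf(-1.0)
upper_tail = 1.0 - standard_normal.cdf(1.0)

print(f"within 1 SD: {within_one_sd:.4f}")
print(f"each tail:   {lower_tail:.4f}")
```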
What happens to the shape of the normal curve at the point of 1SD?
It changes from a concave to a convex shape (the points at 1 SD either side of the mean are the curve's inflection points)
What are important characteristics of a normal distribution?
It is symmetric about the mean and bell-shaped, so the mean, median and mode coincide
What is variance?
Standard deviation squared
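A quick check of this relationship on an invented data set, using the population versions (`pvariance`, `pstdev`) from the standard `statistics` module. Because the variance is an average of squared deviations, it cannot come out below zero.

```python
from statistics import pstdev, pvariance

# Made-up data with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = pvariance(data)  # mean of the squared deviations from the mean
sd = pstdev(data)           # square root of the variance

print("variance:", variance)
print("SD:      ", sd)
print("SD ** 2: ", sd ** 2)
```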
What scales of measurement can variance not be used with?
nominal or ordinal
What can the variance never be?
Negative (it is a squared quantity, so it is always zero or positive)
What is the name for values and calculations for a sample and a population?
Sample: statistic
Population: parameter
What is the coefficient of variation?
The standard deviation expressed relative to the mean (SD / mean). Standardising the spread in this way allows us to compare the variability of variables with different units or magnitudes.
What scale of measurement is the coefficient of variation only valid for?
Ratio
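A minimal sketch of the coefficient of variation, taken here as sample SD divided by the mean. The function name and both data sets are hypothetical. The result is unit-free, which is why the same heights give the same value in centimetres and in metres.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (ratio-scale data only)."""
    return stdev(values) / mean(values)

# Made-up measurements in different units.
heights_cm = [160, 170, 180]
heights_m = [1.6, 1.7, 1.8]
weights_kg = [60, 70, 80]

print(f"heights (cm): {coefficient_of_variation(heights_cm):.4f}")
print(f"heights (m):  {coefficient_of_variation(heights_m):.4f}")
print(f"weights (kg): {coefficient_of_variation(weights_kg):.4f}")
```

Both samples have an SD of 10 in their own units, but the weights vary more relative to their mean.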
What is skewness?
A measure of the asymmetry of a distribution, i.e. how far its shape departs from the symmetry of a normal distribution
What does it mean if the skew is equal to, less than or more than 0?
= 0: No skew (symmetric)
<0: negative skew (longer tail to the left; more values concentrated to the right of the mean)
>0: positive skew (longer tail to the right; more values concentrated to the left of the mean)
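The sign convention can be sketched with one common definition of skewness, the population moment coefficient (third central moment divided by the second central moment raised to the power 1.5). This definition is an assumption; statistical packages often apply a small-sample correction. The data sets are invented.

```python
def skewness(values):
    """Population moment coefficient of skewness: m3 / m2 ** 1.5."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((x - centre) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - centre) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 10]        # long tail to the right
left_tailed = [-x for x in right_tailed]     # mirror image: long tail to the left

print("symmetric:   ", skewness(symmetric))
print("right-tailed:", skewness(right_tailed))  # positive
print("left-tailed: ", skewness(left_tailed))   # negative
```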
What is kurtosis?
A measure of peakedness
What happens if kurtosis increases?
The mass of the data moves from the shoulders of the distribution towards the centre and the tails
What does it mean if the kurtosis is <, > or = 3?
<3 means that it is platykurtic (not very peaked)
~3 means that it is mesokurtic (relatively peaked)
>3 means that it is leptokurtic (very peaked)
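The threshold of 3 corresponds to kurtosis defined as the fourth central moment divided by the squared second central moment, for which a normal distribution scores 3. The sketch below assumes that (non-excess, population) definition; some software reports excess kurtosis instead, which subtracts 3. The data sets are invented.

```python
def kurtosis(values):
    """Population kurtosis m4 / m2 ** 2 (a normal distribution gives 3)."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((x - centre) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - centre) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2

flat = [1, 2, 3, 4, 5]                       # evenly spread, no peak
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -5, 5]     # sharp centre with two extreme values

print("flat:  ", kurtosis(flat))    # below 3: platykurtic
print("peaked:", kurtosis(peaked))  # above 3: leptokurtic
```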
What are the four moments of distribution?
1) Central value, 2) dispersion, 3) skewness, 4) kurtosis
What is parametric data?
Ratio or interval data that is normally distributed; sample size n>30
What is non-parametric data?
Ordinal or nominal data, OR ratio or interval data that is not normally distributed; sample size n<30