Data visualisation Flashcards
Why do we use figures and tables?
To analyse the data to draw a conclusion
Why do we use graphs?
Data visualisation
What makes a good graph?
- Show the data without distortions
- Present many numbers with minimum ink
- Make large data sets coherent
- Induce the reader to think about and compare the data
- Always include a title, labelled axes, a sensible scale and colour-blind-friendly colours; the y-axis should start at 0
What is a histogram?
Presents frequency distribution
X axis: scores from low to high
Y axis: frequency density
Some have a density curve
Shape of distribution determines the type of analysis to perform
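A minimal sketch of how the frequencies behind a histogram can be counted. The scores, the bin width and the helper name `histogram_counts` are made up for illustration, and the bars show raw frequency rather than frequency density.

```python
from collections import Counter

def histogram_counts(scores, bin_width):
    """Return {bin_start: frequency} for bins of equal width."""
    # Floor division maps each score to the start of its bin, e.g. 67 -> 60.
    counts = Counter((score // bin_width) * bin_width for score in scores)
    return dict(sorted(counts.items()))

# Made-up scores, for illustration only.
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 78, 82, 85, 91]
bin_width = 10
counts = histogram_counts(scores, bin_width)

# Text histogram: X axis runs from low to high, one '#' per score in the bin.
for bin_start, frequency in counts.items():
    print(f"{bin_start}-{bin_start + bin_width - 1}: {'#' * frequency}")
```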
What is a normal distribution?
Gaussian distribution or bell curve
Mode = median = mean
What is the symmetry of the distribution? (Skewness)
Negative skew = tail on the left (scores cluster at the high end)
Positive skew = tail on the right (scores cluster at the low end)
What is the shape of the distribution? (Kurtosis)
Leptokurtic = spiked (most data in the middle)
Platykurtic = flat (data widely distributed around the mean)
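Skewness and kurtosis can also be put into numbers. The sketch below assumes the common moment-based definitions (population standard deviation, excess kurtosis = kurtosis minus 3, so a normal distribution scores about 0). The function names and data are hypothetical.

```python
import statistics

def skewness(data):
    """Mean cubed z-score: negative = left tail, positive = right tail."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum(((x - mean) / sd) ** 3 for x in data) / len(data)

def excess_kurtosis(data):
    """Mean fourth-power z-score minus 3: > 0 leptokurtic, < 0 platykurtic."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum(((x - mean) / sd) ** 4 for x in data) / len(data) - 3

# Made-up data sets, for illustration only.
right_tail = [1, 2, 2, 3, 3, 3, 4, 10]          # one high score stretches the right tail
left_tail = [-x for x in right_tail]            # mirror image: tail on the left
spiked = [0, 5, 5, 5, 5, 5, 5, 5, 5, 10]        # most data in the middle
flat = [1, 2, 3, 4, 5]                          # evenly spread

print("skew (right tail):", round(skewness(right_tail), 2))
print("skew (left tail): ", round(skewness(left_tail), 2))
print("excess kurtosis (spiked):", round(excess_kurtosis(spiked), 2))
print("excess kurtosis (flat):  ", round(excess_kurtosis(flat), 2))
```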
What is the bimodal distribution?
A distribution with two modes (peaks), which often suggests two separate distributions mixed in the same data set
Why might data not be normally distributed? (4 reasons)
1) Outliers (data points that differ markedly from the other observations)
2) Small sample size - insufficient data
3) Multiple distributions (bimodal or multimodal)
4) Measurement issues
What is the central limit theorem?
The sampling distribution of the mean approaches a normal distribution as the sample size increases
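The theorem can be checked with a small simulation. This is a sketch under stated assumptions: the population is a skewed exponential distribution with mean 1, and the seed, number of samples and function names are arbitrary choices for illustration.

```python
import random
import statistics

def sample_means(sample_size, n_samples=2000, seed=0):
    """Draw n_samples samples from a skewed population and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

def skewness(data):
    """Mean cubed z-score; close to 0 for a symmetric (e.g. normal) shape."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum(((x - mean) / sd) ** 3 for x in data) / len(data)

# As the sample size grows, the distribution of sample means should
# narrow around the population mean and become less skewed.
for n in (2, 10, 50):
    means = sample_means(n)
    print(
        f"n={n:>2}  mean of means={statistics.fmean(means):.2f}  "
        f"SD of means={statistics.stdev(means):.2f}  skew={skewness(means):.2f}"
    )
```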
What is a box plot/ box and whiskers diagram composed of?
- Minimum and maximum scores (the ends of the whiskers, excluding outliers)
- First and third quartiles
- Median
- Outliers (data point located outside the whiskers of the box plot)
How can you identify outliers in a box plot?
A point is an outlier if it lies more than 1.5 times the interquartile range (IQR = Q3 - Q1) below the lower quartile or above the upper quartile:
Lower limit = Q1 - 1.5 x IQR
Upper limit = Q3 + 1.5 x IQR
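The 1.5 x IQR rule can be sketched as follows. The scores and function names are made up, and the quartiles come from Python's `statistics.quantiles` with `method="inclusive"`. Other quartile conventions can give slightly different limits.

```python
import statistics

def outlier_limits(data):
    """Return (lower, upper) limits: Q1 - 1.5 x IQR and Q3 + 1.5 x IQR."""
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(data):
    """Return the points lying outside the limits (beyond the whiskers)."""
    lower, upper = outlier_limits(data)
    return [x for x in data if x < lower or x > upper]

# Made-up scores, for illustration only.
scores = [12, 14, 15, 15, 16, 17, 18, 19, 20, 45]
lower, upper = outlier_limits(scores)
print("limits:", lower, upper)
print("outliers:", find_outliers(scores))
```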
What is a violin plot a combination of?
Box plot and density plot (a smoothed histogram)
What figures/graphs are used for a categorical design?
Looking at frequency of occurrence across two categories
- Bar chart
What figures/graphs are used for an experimental design?
Looking at differences between conditions after manipulating variables
- Nominal - bar chart
- Ordinal - line graph
(Include error bars)
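The card does not say what the error bars should represent. One common choice is the standard error of the mean (SEM), which the sketch below assumes. The condition names, scores and helper name are hypothetical.

```python
import math
import statistics

def mean_and_sem(scores):
    """Return (mean, SEM), where SEM = sample SD / sqrt(n)."""
    mean = statistics.fmean(scores)
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, sem

# Made-up scores for two conditions, for illustration only.
conditions = {
    "control": [4, 5, 6, 5, 5],
    "treatment": [7, 8, 6, 9, 10],
}

# Each bar would be drawn at the mean, with an error bar of +/- 1 SEM.
for name, scores in conditions.items():
    mean, sem = mean_and_sem(scores)
    print(f"{name}: mean={mean:.2f}, error bar from {mean - sem:.2f} to {mean + sem:.2f}")
```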