Representations of Data Flashcards
Outlier
Value that is more than 1.5 × IQR above the UQ or more than 1.5 × IQR below the LQ
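The 1.5 × IQR rule above can be sketched in Python. This is a minimal illustration, not a definitive method: the function names and the sample data are made up for the example, and the quartile convention (`method="inclusive"`) is an assumption, since exam boards and calculators may compute quartiles slightly differently.

```python
import statistics

def outlier_fences(data):
    # quantiles(..., n=4) returns the three cut points [LQ, median, UQ].
    # The "inclusive" method is an assumed convention for this sketch.
    lq, _, uq = statistics.quantiles(data, n=4, method="inclusive")
    iqr = uq - lq
    # Values beyond these two fences count as outliers under the rule.
    return lq - 1.5 * iqr, uq + 1.5 * iqr

def find_outliers(data):
    low, high = outlier_fences(data)
    return [x for x in data if x < low or x > high]

# Illustrative data only
sample = [2, 4, 5, 6, 7, 8, 9, 30]
print(find_outliers(sample))  # prints [30]
```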
Histograms
To calculate the height of each bar (frequency density), use:
area of bar = k × frequency
Histograms pt.2
If k = 1 then:
frequency density = frequency / class width
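As a rough sketch of the calculation, the bar height follows from area = k × frequency and area = height × class width. The function name and the class data below are invented for illustration.

```python
def frequency_density(frequency, class_width, k=1):
    # area = k * frequency and area = height * class_width,
    # so height = k * frequency / class_width.
    return k * frequency / class_width

# Hypothetical grouped data: (lower bound, upper bound, frequency)
classes = [(0, 10, 5), (10, 15, 10), (15, 30, 6)]

for lower, upper, freq in classes:
    height = frequency_density(freq, upper - lower)
    print(f"{lower}-{upper}: frequency density = {height}")
```

With k = 1, the narrow 10-15 class gets the tallest bar even though its frequency is not much larger, because its width is only 5.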
Drawing frequency polygons
Remember to join the MIDDLE of the top of each bar in a histogram
Feature of a Histogram
Area of the bar is proportional to the frequency
When comparing data sets, if the data set contains extreme values then
it is better to use the median and IQR rather than mean and standard deviation
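A small sketch of why this is so, using made-up numbers: adding one extreme value moves the mean a long way but barely moves the median.

```python
import statistics

# Illustrative data only
without_extreme = [10, 11, 12, 13, 14]
with_extreme = without_extreme + [100]

mean_shift = statistics.mean(with_extreme) - statistics.mean(without_extreme)
median_shift = statistics.median(with_extreme) - statistics.median(without_extreme)

# The mean is pulled strongly towards the extreme value; the median is not.
print(f"mean shifts by {mean_shift:.2f}")
print(f"median shifts by {median_shift:.2f}")
```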
Bivariate Data
data which has pairs of values for two variables
Independent variable also known as
explanatory variable - is on the x-axis
Dependent variable also known as
response variable - is on the y-axis
Regression Line definition
Straight line that minimises the sum of the squares of the vertical distances of each data point from the line
Regression line of y on x written as
y = a + bx
coefficient b tells you the change in y for each unit change in x (contextualise in exam questions)
Use it to predict values of the dependent variable only for values of the independent variable within the range of the given data (interpolation)
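The regression cards above can be illustrated with a short sketch. It uses the standard least-squares formulas b = Sxy / Sxx and a = ȳ - b x̄, which are not stated on the cards themselves. The function names and the data are hypothetical, and the range check in `predict` is one possible way to guard against extrapolation.

```python
def regression_line(xs, ys):
    # Least-squares regression line of y on x: y = a + b*x
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # change in y per unit change in x
    a = y_bar - b * x_bar    # value of y when x = 0
    return a, b

def predict(x, a, b, xs):
    # Refuse to extrapolate beyond the observed x values.
    if not min(xs) <= x <= max(xs):
        raise ValueError("x is outside the range of the data (extrapolation)")
    return a + b * x

# Illustrative data only
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = regression_line(xs, ys)
print(a, b)
print(predict(2.5, a, b, xs))
```

Here each extra unit of x adds b units of y; in an exam answer this would be stated in the context of the actual variables.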