Data representation & Regression and Correlation Flashcards
What do we need on a box plot?
- Minimum and maximum value
- Median
- Upper and lower quartile (Q1 and Q3)
- Outliers represented as crosses
How do we find outlier boundaries?
Lower boundary: Q1 - 1.5 × IQR
Upper boundary: Q3 + 1.5 × IQR
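A minimal sketch of the boundary calculation, assuming Q1 and Q3 have already been found (the function names and the sample values are made up for illustration):

```python
def outlier_bounds(q1, q3):
    """Return the (lower, upper) outlier boundaries from the quartiles."""
    iqr = q3 - q1  # interquartile range
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Return the values that fall outside the outlier boundaries."""
    lower, upper = outlier_bounds(q1, q3)
    return [v for v in values if v < lower or v > upper]


# Illustrative data: Q1 = 20 and Q3 = 30, so IQR = 10
lower, upper = outlier_bounds(20, 30)
outliers = find_outliers([3, 18, 22, 25, 29, 31, 50], 20, 30)
print(lower, upper, outliers)
```

With these made-up quartiles the boundaries are 5 and 45, so 3 and 50 would be drawn as crosses.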
What kind of data do we use for histograms?
Continuous data
What equation do we need to remember with histograms?
k × f = cw × fd
(k × frequency = class width × frequency density, which is the area of the bar)
How do we find frequency density with a histogram?
Frequency ÷ class width
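A short sketch of the frequency density calculation for a grouped table. The class boundaries and frequencies are invented for illustration:

```python
def frequency_density(frequency, lower, upper):
    """Frequency density = frequency / class width."""
    class_width = upper - lower
    return frequency / class_width


# Each tuple is (lower boundary, upper boundary, frequency); made-up values
classes = [(0, 10, 5), (10, 15, 10), (15, 30, 6)]
densities = [frequency_density(f, lo, hi) for lo, hi, f in classes]
print(densities)
```

The narrow 10 to 15 class ends up with the tallest bar even though its frequency is not much larger, which is why histograms plot frequency density rather than frequency.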
What is the area of a histogram equal to?
k × frequency
(you must work out what k is from the information given in the question)
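A sketch of how k might be found from one bar with a known area and frequency, then reused for another bar. The numbers and function names are hypothetical:

```python
def scale_factor(area, frequency):
    """Find k from a bar whose area and frequency are both known (area = k * frequency)."""
    return area / frequency


def frequency_from_area(area, k):
    """Rearrange area = k * frequency to recover a frequency."""
    return area / k


# Made-up question: a bar 2 cm wide and 3 cm tall represents a frequency of 12
k = scale_factor(area=2 * 3, frequency=12)
# Another bar on the same histogram has an area of 10 square cm
other_frequency = frequency_from_area(10, k)
print(k, other_frequency)
```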
What do we do when we are comparing sets of data?
- Compare a measure of location (median, mean)
- Compare a measure of spread (variance, standard deviation)
- Put them into context
What is the correlation coefficient?
It is a value that measures the strength and direction (positive or negative) of the linear correlation shown on a scatter graph.
What values is the correlation coefficient between?
-1 ≤ r ≤ 1
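A minimal sketch of calculating r, assuming it means the product moment correlation coefficient (the cards do not name the type). The function name and data are illustrative only:

```python
import math


def correlation_coefficient(xs, ys):
    """Product moment correlation coefficient: r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: one perfectly rising set and one perfectly falling set
r_up = correlation_coefficient([1, 2, 3, 4], [2, 4, 6, 8])
r_down = correlation_coefficient([1, 2, 3, 4], [8, 6, 4, 2])
print(r_up, r_down)
```

Points lying exactly on a rising straight line give r close to 1, and points on a falling line give r close to -1, matching the two ends of the range.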
What value is r when there is no correlation and what would that look like on a scatter graph?
r = 0
The points would be scattered across the graph with no linear pattern.
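What would a scatter graph with a positive correlation look like?
The points rise from bottom left to top right: as x increases, y tends to increase.
What would a scatter graph with a negative correlation look like?
The points fall from top left to bottom right: as x increases, y tends to decrease.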
What is a regression line and what is its formula?
The line of best fit through the data.
It comes in the form y = mx + c, where m is the gradient and c is the y-intercept.
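A sketch of finding m and c, assuming the regression line is the least squares line of y on x (the cards do not say which method is used). The data are invented:

```python
def regression_line(xs, ys):
    """Least squares line y = m*x + c: m = Sxy / Sxx, c = mean_y - m * mean_x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


# Made-up data lying exactly on y = 2x + 1
m, c = regression_line([1, 2, 3, 4], [3, 5, 7, 9])
print(m, c)
```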
What is interpolation?
Estimating inside the data range
What is extrapolation?
Estimating outside the data range
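A small sketch of telling the two apart: check whether the x value being used lies within the range of the x data. The function name and values are hypothetical:

```python
def estimate_type(x, xs):
    """Classify an estimate at x as interpolation or extrapolation."""
    if min(xs) <= x <= max(xs):
        return "interpolation"  # inside the data range
    return "extrapolation"  # outside the data range


# Made-up x data ranging from 1 to 4
xs = [1, 2, 3, 4]
print(estimate_type(2.5, xs))
print(estimate_type(10, xs))
```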