Correlation and regression Flashcards
Following stuff im gonna add:
- Histograms & cumulative frequency :D
- Box plots and skewness :D
- Scatter diagrams and regression :D
- Numerical measures ,’:(
Ya
What histogram?
The bar graph thing with the frequency density on y axis, and the class intervals of the (continuous) data on x axis. Bars can be different widths
What cumulative frequency?
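The line graph thing where u plot the running total of the frequencies against the upper class boundaries. U can read the median, LQ and UQ off it, so it links up with a boxplot.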
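Quick sketch of the running total idea, just to make it concrete. The class boundaries and frequencies here are made up for illustration, not from any question.

```python
from itertools import accumulate

# Made-up grouped data (assumption): upper class boundaries and their frequencies
upper_bounds = [10, 20, 30, 40]
frequencies = [4, 9, 5, 2]

# Cumulative frequency = running total of the frequencies
cumulative = list(accumulate(frequencies))

# Points you would plot: (upper class boundary, cumulative frequency)
points = list(zip(upper_bounds, cumulative))
print(points)
```

The last cumulative value should always equal the total frequency, which is a handy check.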
What boxplot?
And how to write/find values?
Is a plot with lines marking these 5 values:
- Lowest value (is just the lowest…)
- LQ (1/4 of whole thing)
- Median ( the middle point)
- UQ (3/4 of whole thing)
- Highest value (is just the highest…)
The median, LQ and UQ can be read off a cumulative frequency graph
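A rough sketch of getting those 5 values from a raw data set. The function name `five_number_summary` and the marks are made up, and it assumes the "median of each half" way of finding quartiles, which may not be exactly what every exam board or calculator uses.

```python
from statistics import median

def five_number_summary(values):
    """Return (lowest, LQ, median, UQ, highest) for a list of numbers."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower_half = data[:half]
    # Assumption: for an odd count, leave the median itself out of both halves
    upper_half = data[half + 1:] if n % 2 else data[half:]
    return min(data), median(lower_half), median(data), median(upper_half), max(data)

marks = [2, 4, 5, 7, 8, 9, 10, 12, 13, 15, 40]  # made-up data
lowest, lq, med, uq, highest = five_number_summary(marks)
print(lowest, lq, med, uq, highest)
print("IQR =", uq - lq)
```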
How do u find the frequency density?
frequency/class width
How to find frequency only?
The area of the bar (frequency density x class width)
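Small sketch of both directions: frequency to frequency density, and bar area back to frequency. The classes and the helper names are made up for illustration.

```python
def frequency_density(frequency, class_width):
    # frequency density = frequency / class width
    return frequency / class_width

def frequency_from_bar(density, class_width):
    # frequency = area of the bar = frequency density x class width
    return density * class_width

# Made-up classes (assumption): (lower bound, upper bound, frequency)
classes = [(0, 10, 5), (10, 15, 10), (15, 35, 5)]

densities = []
for lower, upper, freq in classes:
    width = upper - lower
    densities.append(frequency_density(freq, width))

print(densities)
```

Notice the middle class has the tallest bar even though it is the narrowest, because its frequency is packed into a small width.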
How do u find IQR?
UQ - LQ = IQR
(3/4) - (1/4) = ^^^
How do u compare distributions of box plots?
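You just talk about literally the values man, compare them: one for the average (median) and one for the spread (IQR or range), and say which is higher/lower in context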
What are outliers?
What formula? Not in data booklet
and how to find them?
Data points more than 1.5 x IQR below the LQ OR more than 1.5 x IQR above the UQ
Formula:
Lower boundary = (Q1 - (1.5 x IQR))
Upper boundary = (Q3 + (1.5 x IQR))
How to find them:
1) You got ur data set
2) Find ur UQ & LQ:
- U either got it ez with boxplot
- Or, find it urself from the given data set by:
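Find median: the (n+1)/2 th value once the data is in order
Then LQ is the middle of the bottom half (1/4 of the way through) and UQ is the middle of the top half (3/4 of the way through)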
3) Find ur outliers using formula:
- For Q1, it’s numbers below the number u calculated
- For Q3, it’s numbers above “
That bout’ it
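The steps above as a rough sketch, using the usual boundaries Q1 - 1.5 x IQR and Q3 + 1.5 x IQR. The data set and the name `outlier_bounds` are made up, and the quartiles use the "median of each half" method as an assumption, so a textbook answer might differ slightly for some data sets.

```python
from statistics import median

def outlier_bounds(values):
    """Return (lower boundary, upper boundary) for spotting outliers."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])
    # Assumption: for an odd count, the median is left out of both halves
    q3 = median(data[half + 1:] if n % 2 else data[half:])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [2, 4, 5, 7, 8, 9, 10, 12, 13, 15, 40]  # made-up data
low, high = outlier_bounds(data)

# Outliers are anything below the lower boundary or above the upper one
outliers = [x for x in data if x < low or x > high]
print(low, high, outliers)
```

Here Q1 = 5 and Q3 = 13, so IQR = 8 and the boundaries are -7 and 25, which makes 40 the only outlier.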
What to say if no outliers found?
No values are outside the boundaries, so there are no outliers….
How to tell if box plot positively skewed?
whats it mean too?
Median closer to LQ (longer tail on the high side)
When mean > median. Most of the data sits at the low values, with a few high values dragging the mean up
How to tell if box plot negatively skewed?
whats it mean too?
Median closer to UQ (longer tail on the low side)
When mean < median. Most of the data sits at the high values, with a few low values dragging the mean down
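How to tell if it’s symmetric (no skew) from some rando ahh bar graph?
The peak is just in the middle, and both sides look about the same…
also similar to box plot for finding positive/negative skew: tail on the right = positive, tail on the left = negative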
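A tiny sketch of the mean vs median check for skew. The name `skew_direction` and the data sets are made up, and comparing mean and median is only a rule of thumb: real data is almost never exactly symmetric, so treat "roughly equal" as symmetric when doing it by hand.

```python
from statistics import mean, median

def skew_direction(values):
    """Rule of thumb: compare the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "positive"   # tail stretches towards the high values
    if m < med:
        return "negative"   # tail stretches towards the low values
    return "symmetric"

# Made-up data sets (assumption)
print(skew_direction([1, 2, 2, 3, 3, 3, 4, 20]))     # mean 4.75, median 3
print(skew_direction([1, 17, 18, 18, 19, 19, 20]))   # mean 16, median 18
print(skew_direction([1, 2, 3, 4, 5]))               # mean 3, median 3
```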
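In a graph, how to find out if it’s a perfect positive correlation or weak?
Depends on how close each point is to the “line of best fit”: all points exactly on an upward sloping line = perfect positive, points scattered loosely around it = weak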
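One way to put a number on "how close the points are to the line" is the product moment correlation coefficient (r). The flashcards only judge it by eye, so this is an extra illustration rather than something the notes require; the function name `pmcc` and the data are made up.

```python
from math import sqrt

def pmcc(xs, ys):
    """Correlation coefficient r: 1 = perfect positive, near 0 = weak/none."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
perfect = pmcc(xs, [3, 5, 7, 9, 11])   # every point on the line y = 1 + 2x
weaker = pmcc(xs, [5, 3, 9, 4, 8])     # points scattered around an upward trend
print(perfect, weaker)
```

The first set gives r = 1 because every point lies exactly on the line, while the second gives a positive value well below 1.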
How to answer them regression questions?
1) You are given the regression equation
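2) It’s in the form y = mx + c or y = a + bx
3) How to interpret m or c (b or a) in context?:
- m (or b) is the gradient: how much y changes for every 1 unit increase in x
- c (or a) is the intercept: the value of y when x = 0
- Look in book for the rest, this too big D:
- Check photos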
(this probably the hardest part for me for regression)
4) Most of the time tho, u just sub the given value into the equation to find ur answer.
For saying whether ur answer is reliable or not, check the data range. If the x value is within the data range (interpolation) it’s reliable, if it’s outside the range (extrapolation) it’s not.
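Sketch of the "sub it in, then check the range" routine. The equation y = 3 + 2x, the data range 10 to 50 and the name `predict` are all made up for illustration, not from any real question.

```python
def predict(a, b, x, x_min, x_max):
    """Sub x into y = a + bx and flag whether x is inside the data range."""
    y = a + b * x
    reliable = x_min <= x <= x_max   # inside the range = interpolation
    return y, reliable

# Made-up regression line y = 3 + 2x, built from x values between 10 and 50
inside = predict(3, 2, 20, 10, 50)
outside = predict(3, 2, 80, 10, 50)
print(inside)   # x = 20 is within the data range
print(outside)  # x = 80 is outside, so this is extrapolation
```

The sum still works for x = 80, but the answer should be called unreliable because the line was only fitted to data between 10 and 50.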