week 3 SCM (confounds and control) Flashcards
What is the concept of overfitting in data models?
- A more complex model can achieve very low error on the specific dataset it was fitted to, but then struggle to generalise to new data.
- This is because the model partly fits the noise or errors in that dataset rather than just the underlying trend (so it's, in a way, too specific to that one dataset)
how do you calculate the variance?
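the mean squared error: subtract the mean from each score, square each of those errors, add them up, and divide by the number of scores (or by n-1 for a sample)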
what is sample variance?
the sum of squared errors (each score's deviation from the sample mean, squared) divided by n-1
what is the difference between population variance and sample variance?
population variance is the mean of the squared errors, so SSE/n
sample variance is the sum of squared errors divided by n-1, so SSE/(n-1)
this means the sample variance comes out slightly higher, which compensates for the lack of information about the population (the errors are measured from the sample mean, not the true population mean); n-1 is the degrees of freedom left once the sample mean has been estimated
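A quick sketch of the two formulas side by side. The scores are made-up example data, and the variable names are just for illustration; the standard library's `statistics.pvariance` and `statistics.variance` are used only as a cross-check.

```python
import math
import statistics

# Made-up example scores (an assumption, not data from the course)
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(scores)
mean = sum(scores) / n

# Sum of squared errors: each score's deviation from the mean, squared
sse = sum((x - mean) ** 2 for x in scores)

population_variance = sse / n        # divide by n
sample_variance = sse / (n - 1)      # divide by n-1 (degrees of freedom)

# Cross-check against the standard library
assert math.isclose(population_variance, statistics.pvariance(scores))
assert math.isclose(sample_variance, statistics.variance(scores))

print(population_variance, sample_variance)
```

The sample variance is always the larger of the two for the same data, and the gap shrinks as n grows.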
what do z scores tell us
where a particular score lies in relation to the distribution of scores
so it tells us how far away from the mean the score is, in units of the standard deviation
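As a minimal sketch, a z score is the score minus the mean, divided by the standard deviation. The function name `z_score` and the example numbers (mean 100, standard deviation 15) are assumptions chosen for illustration.

```python
def z_score(score, mean, sd):
    """Return how many standard deviations `score` lies from `mean`."""
    return (score - mean) / sd

# Hypothetical distribution with mean 100 and standard deviation 15
above = z_score(130, 100, 15)   # two standard deviations above the mean
below = z_score(85, 100, 15)    # one standard deviation below the mean

print(above, below)
```

A positive z score means the score is above the mean, a negative one means it is below.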
in a normal distribution, 68% of the data is between which z scores?
-1 and 1
in a normal distribution, 95% of the data is between which z scores?
-2 and 2 (approximately; the more exact cutoffs are -1.96 and 1.96)
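These percentages can be checked with the standard library's `statistics.NormalDist`. The helper name `share_between` is an assumption for illustration; it returns the proportion of a standard normal distribution lying between two z scores.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def share_between(low_z, high_z):
    """Proportion of the distribution between two z scores."""
    return standard_normal.cdf(high_z) - standard_normal.cdf(low_z)

within_1 = share_between(-1, 1)          # roughly 68%
within_2 = share_between(-2, 2)          # slightly more than 95%
within_196 = share_between(-1.96, 1.96)  # almost exactly 95%

print(round(within_1, 4), round(within_2, 4), round(within_196, 4))
```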
how do you solve a linear mathematical equation with one unknown?
rearrange it to isolate the unknown on one side, e.g. 2x + 3 = 11 gives 2x = 8, so x = 4
how do you solve a linear mathematical equation with two unknowns and why?
you'll need two independent equations
this is because each equation is a line, and the solution is the point where the two lines intersect
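A minimal sketch of finding that intersection point, assuming both equations are written in the form a*x + b*y = c. The function name `solve_two_unknowns` is hypothetical, and the method used here is Cramer's rule, which is just one of several ways to do it (substitution or elimination by hand gives the same answer).

```python
def solve_two_unknowns(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2.

    Returns (x, y), or None when the equations are not independent
    (the lines are parallel or identical, so there is no single
    intersection point).
    """
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y

# x + y = 5 and x - y = 1 are independent: the lines cross at one point
print(solve_two_unknowns(1, 1, 5, 1, -1, 1))

# x + y = 5 and 2x + 2y = 10 are the same line: no unique solution
print(solve_two_unknowns(1, 1, 5, 2, 2, 10))
```

The second call shows why the equations must be independent: a second equation that is just a multiple of the first adds no new information.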
how do you recognise a quadratic equation?
the highest power of x in it is x^2