statistics Flashcards
what is modelling?
the process of setting up and using mathematical equations to describe and make predictions about the real world.
what must we recognise when dealing with models?
all mathematical models make simplifying assumptions and thus limitations must be considered
important exam technique for dealing with modelling questions?
read carefully, underline key terminology and decode into mathematical meaning
how do you formulate a linear model?
use given values and constants to fit an equation into the y = mx + c format. usually requires solving simultaneous equations
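a minimal sketch of the simultaneous-solving step, assuming the question gives two (x, y) pairs. the function name and the taxi-fare numbers are made up for illustration:

```python
def linear_model_from_points(x1, y1, x2, y2):
    """Solve y1 = m*x1 + c and y2 = m*x2 + c simultaneously for m and c."""
    if x1 == x2:
        raise ValueError("the two x values must differ")
    # Subtracting one equation from the other eliminates c and gives the gradient.
    m = (y2 - y1) / (x2 - x1)
    # Substitute m back into the first equation to find the intercept.
    c = y1 - m * x1
    return m, c

# Hypothetical data: a fare of 11 for 3 miles and 19 for 7 miles.
m, c = linear_model_from_points(3, 11, 7, 19)
print(m, c)  # 2.0 5.0
```

here m = 2 would be read as the cost per mile and c = 5 as the fixed charge, which is the kind of in-context interpretation the exam expects.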
what needs to be remembered when interpreting models, evaluating and explaining assumptions/limitations?
be specific. state what constants mean in relation to the actual situation, compare actual values with model values when evaluating, and contextualise limitations to the real-world scenario
when is a linear regression model used?
when the correlation between the variables is strong enough that the points cluster around a straight line, so a linear equation can be fitted to them
regression model in layman's terms and uses
line of best fit used to predict the value of the y variable from a known value of the x variable
what is the explanatory variable and where is it plotted?
independent variable plotted on the x axis - used to explain changes on the y axis
what is the response variable and where is it plotted?
dependent variable plotted on the y axis - responds to changes in the explanatory variable
full name and official form of the regression line?
least squares regression line
y = a + bx
dependent variable always the subject.
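a sketch of how a and b in y = a + bx can be found, assuming the standard least squares formulas b = Sxy / Sxx and a = (mean of y) - b × (mean of x). the data set is invented purely for illustration:

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares regression line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired values")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sxy and Sxx: sums of products / squares of deviations from the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are equal, so no line can be fitted")
    b = s_xy / s_xx          # gradient
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b

# Hypothetical bivariate data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")  # y = 2.20 + 0.60x
```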
what is interpolation?
we know the relationship between the variables on our regression line across the range of our data. hence it can confidently be used to predict values within that range
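a small sketch of the idea, assuming a regression line y = a + bx has already been found. the helper name predict and the numbers are hypothetical. it returns the prediction together with a flag saying whether the x value lies inside the range of the original data:

```python
def predict(a, b, x, x_min, x_max):
    """Predict y from y = a + b*x and report whether x is inside the data range."""
    is_interpolation = x_min <= x <= x_max
    return a + b * x, is_interpolation

# Hypothetical line y = 2.2 + 0.6x fitted to data with x between 1 and 5.
y_inside, inside_ok = predict(2.2, 0.6, 3, 1, 5)    # x = 3 is within the data
y_outside, outside_ok = predict(2.2, 0.6, 12, 1, 5)  # x = 12 is beyond the data
print(inside_ok, outside_ok)  # True False
```

a prediction flagged False is outside the interval the data covers, so the relationship has not been observed there and the prediction should be treated with caution.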
what is bivariate data?
every data item is a pair of values, the association between these is called correlation
what are the degrees of correlation and approximate pmccs?
what is the pmcc?
product-moment correlation coefficient is the measure of strength of linear correlation, called r for a sample.
it varies between -1 and 1, with |r| = 1 being a perfect correlation and 0 being no linear correlation; the sign shows whether the correlation is positive or negative
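a sketch of how r could be computed by hand, assuming the usual formula r = Sxy / sqrt(Sxx × Syy). the data is invented for illustration:

```python
from math import sqrt

def pmcc(xs, ys):
    """Product-moment correlation coefficient r for paired sample data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired values")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when one variable does not vary")
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical bivariate data.
r = pmcc([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 3))  # 0.775
```

points lying exactly on a line with positive gradient give r = 1, and on a line with negative gradient give r = -1.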
how do you interpret the relationship between values?
full interpretation of r in both statistical and non statistical language - mention what the variables actually are in both observations:
eg.
“there is a strong positive correlation between height and weight of british men”(stat language)
“taller men tend to be heavier”(non stat language)
give both stat and non stat interpretations
important note when observing correlation?
correlation does not always imply causation
what is hypothesis testing?
test to see whether a sample set of data supports a claim about a population - sees if some kind of change has affected the population.
hypothesis testing is a way of deciding whether something is unusual
what is a population parameter?
value that describes whole population
what is a null hypothesis (H0)?
statement about population parameter, normally fixed depending on type of parameter being tested
what is alternative hypothesis (H1)?
statement about pop. parameter: determined by what kind of claim is being tested
what is a significance level?
proportion of samples that would give a result supporting the alternative hypothesis purely by chance (natural variation) when the null hypothesis is actually true
what is the critical value? (stats)
pre-calculated “limiting value” for given significance level for particular hypothesis test; found in a table
what is the critical region?
range of values for the test stat which is “significant” (unlikely - according to the significance level - to happen by chance)
what is the p value?
probability of a sample result at least as extreme as the one observed occurring due to natural variation, assuming the pop. parameter takes the value in the null hypothesis
what does a hypothesis test do?
compares a test stat with a critical value (or p value with a significance level) to decide if the chance of the result happening due to natural variation is small enough to suggest there is evidence for the alternative hypothesis.
does not prove anything is true or false on its own, used to suggest whether a further investigation is useful
tests whether pmcc ( r ) of a sample indicates a linear relationship in the population
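a sketch of the comparison step for a pmcc test, assuming H0: rho = 0 and a critical value looked up in a table. the function name is hypothetical, and 0.5494 is the commonly tabulated 5% one-tailed critical value for a sample of size 10, so check it against your own table:

```python
def pmcc_in_critical_region(r, critical_value, tail="upper"):
    """Return True if sample r is in the critical region for H0: rho = 0.

    tail="upper" tests H1: rho > 0, tail="lower" tests H1: rho < 0,
    tail="two" tests H1: rho != 0 (look the critical value up at half
    the significance level for a two-tailed test).
    """
    if tail == "upper":
        return r > critical_value
    if tail == "lower":
        return r < -critical_value
    if tail == "two":
        return abs(r) > critical_value
    raise ValueError("tail must be 'upper', 'lower' or 'two'")

# Hypothetical sample results with n = 10, tested at the 5% level, one-tailed.
critical = 0.5494  # assumed table value, see note above
print(pmcc_in_critical_region(0.62, critical))  # True
print(pmcc_in_critical_region(0.40, critical))  # False
```

True means there is enough evidence to reject H0 and suggest a positive linear correlation in the population. False means there is insufficient evidence, not that H0 has been proven.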
what does greek rho denote? (stats)
pmcc of a population, not sample