statistics Flashcards

1
Q

what is modelling?

A

the process of setting up and using mathematical equations to describe and make predictions about the real world.

2
Q

what must we recognise when dealing with models?

A

all mathematical models make simplifying assumptions and thus limitations must be considered

3
Q

important exam technique for dealing with modelling questions?

A

read carefully, underline key terminology and decode into mathematical meaning

4
Q

how do you formulate a linear model?

A

use given values and constants to fit an equation into the y = mx + c format. usually requires solving simultaneous equations
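
A minimal sketch in Python of fitting y = mx + c through two known points by solving the two simultaneous equations. The function name `fit_line_through` and the figures used (2 miles costing 7, 5 miles costing 13) are hypothetical and only for illustration.

```python
def fit_line_through(p1, p2):
    """Solve y1 = m*x1 + c and y2 = m*x2 + c for m and c."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("x values must differ to define a gradient")
    m = (y2 - y1) / (x2 - x1)  # subtracting the two equations eliminates c
    c = y1 - m * x1            # substitute m back into the first equation
    return m, c


# Hypothetical data: 2 miles costs 7, 5 miles costs 13
m, c = fit_line_through((2, 7), (5, 13))
print(m, c)  # gradient (cost per mile) and intercept (fixed charge)
```

Here m would be read as the cost per extra mile and c as the fixed charge, which is the kind of in-context interpretation exam questions ask for.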

5
Q

what needs to be remembered when interpreting models, evaluating and explaining assumptions/limitations?

A

be specific. state what the constants mean in relation to the actual situation, compare actual values with model values when evaluating, and put limitations in the context of the real-world scenario

6
Q

when is linear regression model used?

A

when the correlation between the variables is strong enough that the points cluster around a straight line, so a linear equation can be fitted to them

7
Q

regression model in layman's terms and uses

A

line of best fit used to predict the value of one variable (y) from a known value of another variable (x)

8
Q

what is the explanatory variable and where is it plotted?

A

independent variable plotted on the x axis - used to explain changes on the y axis

9
Q

what is the response variable and where is it plotted?

A

dependent variable plotted on the y axis - responds to changes in the explanatory variable

10
Q

full name and official form of the regression line?

A

least squares regression line
y = a + bx
dependent variable always the subject.
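
A short Python sketch of how a and b in y = a + bx could be found, using the standard least squares results b = Sxy / Sxx and a = ȳ - b x̄. The function name `least_squares_line` and the data are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares regression line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # gradient
    a = mean_y - b * mean_x  # the line passes through (mean_x, mean_y)
    return a, b


# Hypothetical data: explanatory variable xs, response variable ys
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = least_squares_line(xs, ys)
print(f"y = {a} + {b}x")
```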

11
Q

what is interpolation?

A

using the regression line to predict values within the range of the data. we know the relationship between the variables holds across the spread of our data, so predictions within that interval can be made with reasonable confidence

12
Q

what is bivariate data?

A

data where every item is a pair of values; the association between the two variables is called correlation

13
Q

what are the degrees of correlation and approximate pmccs?

A
14
Q

what is the pmcc?

A

product-moment correlation coefficient is the measure of strength of linear correlation, called r for a sample.
it varies between -1 and 1 (so |r| is between 0 and 1), with |r| = 1 being a perfect correlation and r = 0 being no linear correlation
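
A small Python sketch of the calculation, assuming the usual definition r = Sxy / √(Sxx × Syy). The function name `pmcc` and the data are hypothetical.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product-moment correlation coefficient r of paired sample data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Hypothetical bivariate data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pmcc(xs, ys)
print(round(r, 3))  # a value between -1 and 1
```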

15
Q

how do you interpret the relationship between variables?

A

full interpretation of r in both statistical and non-statistical language - mention what the variables actually are in both statements:

eg.
“there is a strong positive correlation between height and weight of British men” (stat language)
“taller men tend to be heavier” (non stat language)

give both stat and non stat interpretations

16
Q

important note when observing correlation?

A

correlation does not always imply causation

17
Q

what is hypothesis testing?

A

test to see whether a sample set of data supports a claim about a population - sees if some kind of change has affected the population.

hypothesis testing is a way of deciding whether something is unusual

18
Q

what is a population parameter?

A

value that describes whole population

19
Q

what is a null hypothesis (H0)?

A

statement about population parameter, normally fixed depending on type of parameter being tested

20
Q

what is alternative hypothesis (H1)?

A

statement about pop. parameter: determined by what kind of claim is being tested

21
Q

what is a significance level?

A

the probability of rejecting the null hypothesis when it is actually true, i.e. the proportion of samples that would support the alternative hypothesis by chance due to natural variation

22
Q

what is the critical value?
(stats)

A

pre-calculated “limiting value” for given significance level for particular hypothesis test; found in a table

23
Q

what is the critical region?

A

range of values for the test stat which is “significant” (unlikely - according to the significance level - to happen by chance)

24
Q

what is the p value?

A

probability of getting a test stat at least as extreme as the one observed due to natural variation alone, assuming the pop. parameter is as stated in the null hypothesis

25
Q

what does a hypothesis test do?

A

compares a test stat with a critical value (or p value with a significance level) to decide if the chance of the result happening due to natural variation is small enough to suggest there is evidence for the alternative hypothesis.

does not prove anything is true or false on its own, used to suggest whether a further investigation is useful
for correlation: tests whether the pmcc (r) of a sample indicates a linear relationship in the population

26
Q

what does greek rho denote? (stats)

A

pmcc of a population, not sample

27
Q

what is the null hypothesis for a pmcc test?

A

no correlation between 2 variables; pmcc = 0

denoted by H0: ρ = 0

28
Q

what is a one-tailed test?

A

test to see if there is correlation in a particular (+ or -) direction

29
Q

what is a two-tailed test?

A

tests for relationship between 2 variables in either direction

30
Q

example of one-tailed alt. hypothesis?

A

H1: ρ > 0

H1: ρ < 0

31
Q

example of two-tailed alt. hypothesis?

A

H1 : ρ ≠ 0

32
Q

what is test stat?

A

for a correlation test, always the pmcc of the sample, r; this value is compared with the critical value for r

33
Q

what test stat concludes rejection of null hypothesis?

A

when the test stat is bigger in size (closer to |1| and perfect correlation) than the critical value

34
Q

when is null hypothesis accepted?

A

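if the test stat is weaker (closer to 0) than the critical value. strictly, we "do not reject" H0 rather than accept it, as there is just insufficient evidence against it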

35
Q

what must be done once you have decided to accept/reject H0?

A

state conclusion in context

36
Q

what is process for conducting a pmcc hypothesis test?

A
  1. state hypotheses (H0 and H1)
  2. calculate test stat (pmcc of sample)
  3. identify critical value from table (using sample size, sig. level)
  4. compare test stat with critical value
  5. reject/accept H0
  6. state conclusion in context
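
The comparison in steps 4 and 5 can be sketched in Python as below. The function name `pmcc_test`, its `alternative` labels and the sample value r = 0.62 are all hypothetical. The critical value 0.5494 is assumed to be the table entry for n = 10 at the 5% level, one-tailed, so check it against your own table.

```python
def pmcc_test(r, critical_value, alternative):
    """Return True if H0: rho = 0 is rejected.

    alternative is "greater" (H1: rho > 0), "less" (H1: rho < 0)
    or "two-sided" (H1: rho != 0). critical_value is the positive
    figure read from a table for the sample size and sig. level.
    """
    if alternative == "greater":
        return r > critical_value
    if alternative == "less":
        return r < -critical_value
    if alternative == "two-sided":
        return abs(r) > critical_value
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# Hypothetical example: n = 10, 5% significance level, H1: rho > 0
r = 0.62                 # step 2: pmcc of the sample (made-up value)
critical_value = 0.5494  # step 3: assumed table value, verify before use
if pmcc_test(r, critical_value, "greater"):  # steps 4 and 5
    print("reject H0: evidence of positive correlation")  # step 6: add context
else:
    print("do not reject H0: insufficient evidence of positive correlation")
```

For a two-tailed test the significance level is split between the two tails, so the critical value read from the table is larger for the same overall level.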
37
Q

what does qualitative mean?

A

non numerical, something not given a numerical value but still data

38
Q

what does quantitative discrete mean?

A

numerical data that can only take specific values (set increments), usually counted

39
Q

what does quantitative continuous mean?

A

numerical data that is measured rather than counted, so can take any value (including decimals) within a range

40
Q

how do you compare the overall typical value of a dataset?

A

use an average

41
Q

how do you compare variation within each dataset?

A

use a measure of spread

42
Q

what must be compared and commented on when evidencing claims about dataset?

A

mode, median, IQR and range.
explain what variations in these suggest

43
Q

what is standard deviation?

A

σ (sigma) - measure of spread: how far, on average, values are from the mean

can be used with mean to support comparisons

44
Q

why are mode and range poor measures?

A
  • range uses most extreme values so is affected by outliers
  • there may be too many modes or none at all and repetition doesn’t indicate a typical value
45
Q

symbol for mean and how to calculate from summarised data?

A

x̄ - x bar. for grouped data: multiply each group's mid value by its frequency, add these up, then divide by the total frequency (x̄ = Σfx / Σf)
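
A minimal Python sketch of the grouped mean, x̄ = Σfx / Σf with x as the class mid value. The function name `grouped_mean` and the frequency table are invented for illustration.

```python
def grouped_mean(classes):
    """Estimate the mean from (lower, upper, frequency) classes."""
    total_f = sum(freq for _, _, freq in classes)
    # each class contributes frequency * mid value
    total_fx = sum(freq * (lower + upper) / 2 for lower, upper, freq in classes)
    return total_fx / total_f


# Hypothetical grouped data: (lower bound, upper bound, frequency)
classes = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]
print(grouped_mean(classes))
```

The result is only an estimate, because every value in a class is treated as if it sat at the mid value.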

46
Q

where do you find how to calc sd from summary stats?

A

formula book

47
Q

what is variance?

A

standard deviation squared (σ²)
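
A short Python sketch linking the two, assuming the summary-statistics form variance = Σx²/n - x̄² (dividing by n, not n - 1). The function name `variance_and_sd` and the data are hypothetical.

```python
from math import sqrt


def variance_and_sd(values):
    """Return (variance, standard deviation) using sum(x^2)/n - mean^2."""
    n = len(values)
    mean = sum(values) / n
    variance = sum(x * x for x in values) / n - mean ** 2
    variance = max(variance, 0.0)  # guard against tiny negative rounding error
    return variance, sqrt(variance)


# Hypothetical data set
values = [2, 4, 4, 4, 5, 5, 7, 9]
variance, sd = variance_and_sd(values)
print(variance, sd)
```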

48
Q

why can range and sd not be calculated from tabulated data with undefined limits?

A

cannot work out range as no min/max value, cannot work out sd as not all values are known

49
Q

advantage of mean?

A

takes account of all values

50
Q

disadvantages of mean?

A

outliers can have a significant impact

51
Q

how do you calculate the median?

A

take the middle value of the ordered data (if there is an even number of values, go halfway between the two middle values); for grouped data, could use linear interpolation

52
Q

advantages of using median?

A

less sensitive to outliers

53
Q

disadvantage of median?

A

does not use all values

54
Q

advantage of mode?

A

can be used for qualitative data

55
Q

disadvantages to mode?

A
  • only relevant for high frequencies
  • does not consider numerical value of data
56
Q

advantages of standard deviation?

A
  • includes all data
  • takes account of all numerical values
57
Q

disadvantage of standard deviation?

A

skewed by outliers

58
Q

advantage of IQR?

A

less sensitive to extreme values

59
Q

disadvantage of IQR?

A

does not use all data

60
Q

advantage of range?

A

quick to calculate

61
Q

disadvantage of range?

A

only uses extreme values - highly susceptible to outliers

62
Q

disadvantage of linear interpolation?

A

assumes even spread of data

63
Q

how do you estimate values of grouped data using linear interpolation?

A

lower bound + class width × ((position of desired percentile - cumulative freq up to prev. group) ÷ group freq)

or
double number line
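
The formula can be sketched in Python as follows. The function name `interpolate_percentile` and the frequency table are hypothetical, and the position is taken as fraction × total frequency (e.g. n/2 for the median), which is an assumption about the convention used for grouped data.

```python
def interpolate_percentile(classes, fraction):
    """Estimate a percentile from (lower, upper, frequency) classes.

    classes must be in ascending order; fraction is 0.5 for the median,
    0.25 for the lower quartile and so on.
    """
    total = sum(freq for _, _, freq in classes)
    position = fraction * total
    cumulative = 0  # cumulative frequency up to the previous group
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= position:
            width = upper - lower
            return lower + width * (position - cumulative) / freq
        cumulative += freq
    raise ValueError("fraction must be between 0 and 1")


# Hypothetical grouped data: (lower bound, upper bound, frequency)
classes = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]
median = interpolate_percentile(classes, 0.5)
iqr = interpolate_percentile(classes, 0.75) - interpolate_percentile(classes, 0.25)
print(median, iqr)
```

This relies on the same assumption as the formula: values are spread evenly within each class.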

64
Q

what is effect of adding or subtracting a value from every value in a dataset on mean and sd?

A
  • mean: increases/decreases by same amount
  • standard deviation: no effect
65
Q

effect of multiplying or dividing all values in dataset by same amount on mean and sd?

A
  • mean: multiplied/divided by same amount
  • standard deviation: multiply or divide by same amount
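
A quick Python check of both effects, using `mean` and `pstdev` from the standard `statistics` module. The data and the amounts added (10) and multiplied by (3) are made up for illustration.

```python
from statistics import mean, pstdev

# Hypothetical data set
data = [2, 4, 4, 4, 5, 5, 7, 9]

shifted = [x + 10 for x in data]  # add 10 to every value
scaled = [3 * x for x in data]    # multiply every value by 3

print(mean(data), pstdev(data))        # original mean and sd
print(mean(shifted), pstdev(shifted))  # mean goes up by 10, sd unchanged
print(mean(scaled), pstdev(scaled))    # mean and sd both multiplied by 3
```

If the multiplier is negative, the standard deviation is multiplied by its size (absolute value), since a measure of spread cannot be negative.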