Discovering Statistics Flashcards

1
Q

What is validity?

A

The degree to which a theory/model reflects a true/accurate picture.

2
Q

What is reliability?

A

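The consistency or replicability of results: the same measurement repeated under the same conditions should give the same result.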

3
Q

What are the characteristics of a normal distribution?

A
Symmetrical
Bell shaped curve
The standard deviation determines the spread (a smaller SD gives a narrower, steeper curve)
Unimodal
Continuous
4
Q

What percentage of values fall within +/- 1.96 standard deviations of the mean in a normal distribution?

A

95%
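
As a quick check of this figure, the proportion of a normal distribution lying within +/- z standard deviations of the mean can be computed from the error function. This is a minimal sketch; the function name `coverage_within` is an illustrative choice and does not come from the cards.

```python
import math

def coverage_within(z: float) -> float:
    """Proportion of a normal distribution within +/- z SDs of the mean."""
    # For a standard normal variable Z, P(-z < Z < z) = erf(z / sqrt(2)).
    return math.erf(z / math.sqrt(2))

print(round(coverage_within(1.96), 4))  # approximately 0.95
```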

5
Q

What is the standard error?

A

The standard deviation (variability) of the sampling distribution of a statistic, such as the sample mean.
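
For the sample mean, the standard error is usually estimated as the sample standard deviation divided by the square root of the sample size. The sketch below assumes that estimator; the data and the name `standard_error` are made up for illustration.

```python
import math
import statistics

def standard_error(sample: list[float]) -> float:
    """Estimated standard error of the mean: s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample SD (n - 1 in the denominator)
    return s / math.sqrt(len(sample))

scores = [4.0, 6.0, 8.0, 10.0, 12.0]  # made-up data for illustration
print(standard_error(scores))
```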

6
Q

What are point estimates?

A

Single numbers used to guess corresponding population parameters.

7
Q

What are examples of point estimates?

A

Measures of central tendency, such as the mean, median and mode
Measures of dispersion, such as the range and standard deviation
Measures of relationship, such as correlations

8
Q

What are interval estimates?

A

Ranges that quantify the uncertainty around point estimates (narrower intervals mean a more precise estimate and less uncertainty)

9
Q

What are confidence intervals?

A

A range of values that is likely to include a population value with a certain degree of confidence. E.g. a 95% confidence interval means that, across repeated samples, 95% of the intervals constructed this way will contain the population mean.
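
One common way to build such an interval for a mean is the sample mean plus or minus 1.96 standard errors. The sketch below assumes that normal approximation, with made-up data and a hypothetical function name `ci95_normal`; for small samples a critical value from the t-distribution would normally replace 1.96.

```python
import math
import statistics

def ci95_normal(sample):
    """Approximate 95% CI for the mean: mean +/- 1.96 * standard error."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - 1.96 * se, mean + 1.96 * se

lower, upper = ci95_normal([4.0, 6.0, 8.0, 10.0, 12.0])  # made-up data
print(lower, upper)
```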

10
Q

What is the t-distribution?

A

A distribution used to construct confidence intervals when the population standard deviation is not known and has to be estimated from the sample. It is centred on 0 and symmetrical, and its shape depends on the degrees of freedom (as df approaches infinity, the distribution becomes normal).

11
Q

What are the three levels of hypothesis?

A

Conceptual
Operational
Statistical

12
Q

What is the scientific method?

A
Observation
Theory
Hypothesis/predictions 
Test hypothesis 
Interpret data
Reach conclusions + generate more hypotheses
13
Q

What is the linear model?

A

A model that predicts the value of an outcome from one or more predictors

14
Q

What is the equation for the general linear model?

A

Outcome = b0 (intercept) + b1 × predictor + e (error)

15
Q

What is b0 (intercept)?

A

The value of the outcome when the predictor is 0

16
Q

What is b1?

A

The change in the outcome for every unit change in the predictor (slope)
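
The intercept and slope described in the cards above can be estimated by ordinary least squares. The following is a minimal sketch for a single predictor, using made-up data that lie exactly on a line; `fit_line` is a hypothetical helper name.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1 * x + e."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: covariation of x and y divided by the variation in x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b0, b1

x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]  # made-up data lying exactly on y = 1 + 2x
b0, b1 = fit_line(x, y)
print(b0, b1)
```

Here b0 is the predicted outcome when the predictor is 0, and b1 is the change in the outcome per unit change in the predictor.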

17
Q

What value is used to establish the significance of b1?

A

The t-value, which measures how many standard errors the estimate of b1 is from 0. The further it is from 0, the stronger the evidence against the null hypothesis that b1 = 0.
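
For a model with one predictor, this value is the slope divided by its standard error. The sketch below assumes the usual least-squares formulas; the data and the name `slope_t_value` are made up for illustration.

```python
import math

def slope_t_value(x, y):
    """t = b1 / SE(b1) for the simple model y = b0 + b1 * x + e."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    b0 = mean_y - b1 * mean_x
    # Residual sum of squares around the fitted line.
    ss_resid = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # Standard error of the slope; n - 2 degrees of freedom (two parameters).
    se_b1 = math.sqrt(ss_resid / (n - 2) / sxx)
    return b1 / se_b1

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]  # made-up data for illustration
print(slope_t_value(x, y))
```

The resulting value would then be compared against a t-distribution with n - 2 degrees of freedom.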

18
Q

How is model fit evaluated?

A

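R² and adjusted R²

R² lies between 0 and 1: a value near 0 means the model explains little of the variance in the outcome, and a value near 1 means it explains most of it. Adjusted R² corrects R² for the number of predictors in the model.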
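
Both quantities can be computed from observed and predicted values. This is a sketch under the standard definitions, R² = 1 - SS_residual / SS_total with the usual degrees-of-freedom adjustment; the function name and data are illustrative assumptions.

```python
def r_squared(observed, predicted, n_predictors):
    """Return (R squared, adjusted R squared) for a fitted linear model."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    ss_resid = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    r2 = 1 - ss_resid / ss_total
    # The adjustment penalises R squared for each extra predictor.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
    return r2, adj_r2

observed = [2.0, 4.0, 5.0, 4.0, 5.0]    # made-up outcome values
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]   # least-squares predictions for x = 1..5
print(r_squared(observed, predicted, n_predictors=1))
```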

19
Q

What is the F stat?

A

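The statistic that indicates whether there is a relationship between the outcome and the predictors. It is the ratio of the variance explained by the model to the unexplained variance: the further F is above 1, the stronger the evidence of a relationship.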
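
As a rough illustration, F can be computed as the model mean square divided by the residual mean square. The sketch assumes the predictions come from a least-squares fit with an intercept, so that the total sum of squares splits into model and residual parts; names and data are made up.

```python
def f_statistic(observed, predicted, n_predictors):
    """F = (SS_model / k) / (SS_residual / (n - k - 1))."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    ss_resid = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Valid for least-squares fits with an intercept (assumption).
    ss_model = ss_total - ss_resid
    ms_model = ss_model / n_predictors
    ms_resid = ss_resid / (n - n_predictors - 1)
    return ms_model / ms_resid

observed = [2.0, 4.0, 5.0, 4.0, 5.0]    # made-up outcome values
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]   # least-squares predictions for x = 1..5
print(f_statistic(observed, predicted, n_predictors=1))
```

With a single predictor, F equals the square of the t-value for b1.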

20
Q

What are outliers?

A

A value in the data that differs markedly from the trend of the rest of the data

21
Q

How can outliers be detected in the GLM?

A
Graphs
Standardised residuals (an absolute value greater than 3 suggests an outlier)
Cook's distance (a value greater than 1 suggests an influential case)
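
The two numerical checks can be sketched for a model with one predictor as follows. The code assumes standardised residuals are residuals divided by the residual standard deviation, and uses the usual leverage-based formula for Cook's distance; `outlier_diagnostics` and the data are hypothetical.

```python
import math

def outlier_diagnostics(x, y):
    """Standardised residuals and Cook's distances for y = b0 + b1 * x + e."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    b0 = mean_y - b1 * mean_x
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    n_params = 2  # b0 and b1
    mse = sum(e ** 2 for e in residuals) / (n - n_params)
    # Standardised residual: residual divided by the residual SD.
    standardised = [e / math.sqrt(mse) for e in residuals]
    # Leverage of each case in a one-predictor regression.
    leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    cooks = [
        (e ** 2 / (n_params * mse)) * (h / (1 - h) ** 2)
        for e, h in zip(residuals, leverage)
    ]
    return standardised, cooks

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]  # made-up data for illustration
standardised, cooks = outlier_diagnostics(x, y)
# Flag cases using the thresholds from the card: |z| > 3 or Cook's D > 1.
flagged = [
    i for i, (z, d) in enumerate(zip(standardised, cooks))
    if abs(z) > 3 or d > 1
]
print(flagged)
```

With only five made-up points, a case at the edge of the predictor range can exceed the Cook's distance threshold easily, so the flags here illustrate the calculation rather than a real finding.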
22
Q

If outliers are present, what should be done?

A

A robust estimation method should be used instead of ordinary least squares (OLS), because robust estimators are more resistant to the influence of outliers

23
Q

What are the assumptions of the linear model?

A

Linearity and additivity
Normally distributed errors
Independent errors
Homoscedastic errors

24
Q

What are the differences between errors and residuals?

A

Errors are the differences between the observed values and the values predicted by the model in the population; they cannot be observed
Residuals are the differences between the observed values and the values predicted by the model fitted to the sample

25
Q

What are independent errors?

A

Errors in one prediction that are unrelated to errors in another

26
Q

What are homoscedastic errors?

A

Errors whose variance is constant across all levels of the predictor variable.

27
Q

What is heteroscedasticity?

A

The variance of the residuals changes across levels of the predictor, so the residuals are more spread out at one end of the plot, creating a funnel shape

28
Q

What should be done if the assumption of normal distribution is not met?

A

Normality can be ignored as long as the sample size is big enough, due to the central limit theorem (at least 30 observations as a rule of thumb). If the sample size is small, bootstrapping can be used.
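
Bootstrapping estimates the sampling distribution by resampling the observed data with replacement. Below is a minimal sketch of a percentile bootstrap confidence interval for a mean; the function name, the number of resamples, the fixed seed and the data are all assumptions chosen for illustration.

```python
import random
import statistics

def bootstrap_ci_mean(sample, n_resamples=2000, seed=0):
    """Percentile bootstrap 95% confidence interval for the mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Draw n values from the sample with replacement.
        resample = [rng.choice(sample) for _ in range(n)]
        means.append(statistics.mean(resample))
    means.sort()
    # Take the 2.5th and 97.5th percentiles of the bootstrap means.
    lower = means[n_resamples * 25 // 1000]
    upper = means[n_resamples * 975 // 1000 - 1]
    return lower, upper

sample = [2.0, 4.0, 5.0, 4.0, 5.0, 7.0, 3.0, 6.0]  # made-up data
print(bootstrap_ci_mean(sample))
```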