Johnny, CH.2 - SPINE Flashcards
Why do Scientists use statistical models?
- Models represent real-world processes so that we can predict how these processes operate under certain conditions
- We can have some confidence, but never complete confidence, in the predictions a model makes
- Outcome (data) = model + error. This equation means that the data we observe can be predicted from the model we choose to fit, plus some amount of error
What is the relationship between samples and populations when it comes to psychological research?
Scientists are usually interested in finding results that apply to an entire population. Because we can’t collect data from every member of a population, we collect data from a sample and use these data to infer things about the population as a whole
What is one of the most common statistical methods?
The Linear Model:
Yi = b0 + b1Xi + ei
(This equation expresses that we want to predict the value of an outcome variable Y from a predictor variable X; i refers to the individual case and e is the error in prediction for that case)
- b0: intercept of a line (determines the vertical height of a line, represents the overall level of the outcome variable when predictor variables are 0)
- b1: slope of the line (how much the outcome changes when the predictor increases by one unit)
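A minimal sketch of fitting this linear model in Python, to make b0, b1 and the error concrete. The data (hours studied and exam scores) are made up for illustration, and the estimates are calculated with the standard least-squares formulas (see the method of least squares card further on).

```python
# Hypothetical data, made up for illustration:
# X = hours studied, Y = exam score
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [52.0, 55.0, 61.0, 64.0, 68.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares estimate of the slope (b1) and the intercept (b0)
b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
b0 = mean_y - b1 * mean_x

# Outcome = model + error, so error = observed outcome - model prediction
predicted = [b0 + b1 * x for x in xs]
errors = [y - p for y, p in zip(ys, predicted)]

print("b0 (intercept):", round(b0, 2))
print("b1 (slope):", round(b1, 2))
print("errors:", [round(e, 2) for e in errors])
```

Here b0 is the predicted score when hours studied is 0, and b1 is the predicted change in score for each extra hour.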
SPINE of Statistics
What does SPINE stand for?
S: Standard Errors
P: Parameters
I: Interval Estimates (CI)
N: Null Hypothesis Significance Testing (NHST)
E: Estimation (of Parameters)
Parameters
What are Parameters?
A parameter is a numerical or other measurable factor that defines a system or the conditions under which the system works
(Very general, don’t memorize)
Parameters
What do parameters represent?
Some fundamental truth about the variables in the model.
- !!! Parameters are not measured !!!
- We can predict values of an outcome variable based on a model
Parameters
What are some important things to note on parameters?
- Always use the word “estimate”: When we calculate parameters based on sample data they are only estimates of what the true parameter value is in the population
- the model variables have no error
- See Picture 1
Parameters
What is error in statistics?
A discrepancy between observed values and true values
(See Picture 2 for formula and explanation)
- deviance: outcome - model
- error: observed - predicted
Parameters
What is the total error (in other words, the sum of errors)?
(See Picture 3 for explanation and examples)
Parameters
How do you estimate the mean error in the population (mean squared errors)?
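- mean squared error = total squared error / degrees of freedom
- total squared error (sum of squared errors): each difference between an observed and a predicted score is squared, and the squared differences are then added up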
- degrees of freedom (df): N - 1
(See Picture 4)
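A small worked sketch of these steps, assuming the mean is used as the model (so every score gets the same predicted value). The five scores are hypothetical.

```python
scores = [1, 2, 3, 3, 4]  # hypothetical observed scores
n = len(scores)

# The model: here simply the mean, which predicts the same value for everyone
model = sum(scores) / n

# error = observed - predicted
errors = [score - model for score in scores]

# Raw errors cancel out (positives and negatives), so their sum is about 0
sum_of_errors = sum(errors)

# Squaring removes the sign, giving the sum of squared errors
sum_squared_errors = sum(e ** 2 for e in errors)

# Mean squared error = sum of squared errors / degrees of freedom
df = n - 1
mean_squared_error = sum_squared_errors / df

print("sum of errors:", round(sum_of_errors, 10))
print("sum of squared errors:", round(sum_squared_errors, 2))
print("mean squared error:", round(mean_squared_error, 2))
```

When the model is the mean, the mean squared error calculated this way is the variance of the sample.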
Parameters
What is the fit of a model and how do you estimate it?
The fit of a model is how representative of the real world a model is.
- We estimate it using the sum of squared errors and the mean squared error
- As the sum of squared errors decreases, the fit of the model improves
Estimation of Parameters
What is the method of least squares?
A method used to minimize the sum of squared errors
(I think the method is rather unimportant to mention, there hasn’t been anything in slides or exercises as well, if you guys want me to put it then let me know)
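In case it helps, a rough sketch of the idea behind least squares: try many candidate values for a parameter and keep the one with the smallest sum of squared errors. The scores and the grid of candidate values are hypothetical, and real software solves this with a formula rather than by trying values one by one.

```python
scores = [1, 2, 3, 3, 4]  # hypothetical observed scores


def sum_squared_errors(b, data):
    """Sum of squared differences between each score and the candidate value b."""
    return sum((x - b) ** 2 for x in data)


# Candidate parameter values from 0.0 to 5.0 in steps of 0.1
candidates = [i / 10 for i in range(0, 51)]

# The least-squares estimate is the candidate with the smallest sum of squared errors
best = min(candidates, key=lambda b: sum_squared_errors(b, scores))

sample_mean = sum(scores) / len(scores)
print("least-squares estimate:", best)
print("sample mean:", sample_mean)
```

The candidate that wins is the sample mean, which is why the mean is the least-squares estimate when the model is a single value.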
Estimation of Parameters
What is the maximum likelihood estimation?
An estimation method whose goal is to find the values that maximize the likelihood.
- Likelihood refers to how well a sample provides support for particular values of a parameter in a model (In other words, when we are calculating likelihood we are trying to determine if we can trust the parameters in the model based on the sample data that we have observed)
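A rough sketch of maximum likelihood for one parameter (the mean), assuming the scores come from a normal distribution with a fixed SD of 1. The scores, the fixed SD and the grid of candidate values are all assumptions made for illustration.

```python
import math
from statistics import NormalDist

scores = [1, 2, 3, 3, 4]  # hypothetical observed scores
assumed_sd = 1.0  # assumption: SD treated as known to keep the sketch simple


def log_likelihood(mu, data, sd):
    """Log of the likelihood of the data if the true mean were mu."""
    return sum(math.log(NormalDist(mu, sd).pdf(x)) for x in data)


# Candidate values for the mean, from 0.0 to 5.0 in steps of 0.1
candidates = [i / 10 for i in range(0, 51)]

# The maximum likelihood estimate is the candidate with the highest (log-)likelihood
mle_mu = max(candidates, key=lambda mu: log_likelihood(mu, scores, assumed_sd))

print("maximum likelihood estimate of the mean:", mle_mu)
```

Under this normal model the value with the highest likelihood is the sample mean, the same value that least squares gives.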
Standard Error
Why is the standard error important?
It is important because it shows us how representative our samples are of the population of interest
Standard Error
What is Sample Variation?
Samples vary because they contain different members of the population
Standard Error
What is a sampling distribution?
A frequency distribution of sample means
- The average of all sample means equals the population mean
- We use it to tell us how representative a sample is of the population
- Its SD is known as the standard error of the mean (or simply the standard error, SE)
Standard Error
What is the Central Limit Theorem?
As samples get larger, the sampling distribution becomes normal, with a mean equal to the population mean and an SD equal to the population SD divided by √N, estimated in practice as s / √N (See Picture 5)
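A simulation sketch of the sampling distribution and the central limit theorem. The population here is simulated and deliberately skewed (not normal); the population size, sample size, number of samples and random seed are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Simulated, skewed population (exponential), purely for illustration
population = [random.expovariate(1.0) for _ in range(20_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

# Draw many samples and record each sample mean: this is the sampling distribution
sample_size = 50
sample_means = [
    statistics.fmean(random.sample(population, sample_size))
    for _ in range(2_000)
]

mean_of_means = statistics.fmean(sample_means)  # should be close to pop_mean
empirical_se = statistics.stdev(sample_means)   # SD of the sample means
theoretical_se = pop_sd / sample_size ** 0.5    # population SD / sqrt(N)

print("population mean:", round(pop_mean, 3))
print("mean of sample means:", round(mean_of_means, 3))
print("SD of sample means (SE):", round(empirical_se, 3))
print("population SD / sqrt(N):", round(theoretical_se, 3))
```

Even though the population is skewed, the sample means cluster around the population mean, and their SD is close to the population SD divided by √N.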
Interval Estimates (CI)
General Info
- Confidence level (e.g. 95%): if we drew many samples and built a 95% CI from each one, about 95% of those intervals would contain the true value of the parameter in question
- Boundaries of a CI (See Picture 6)
- If we have a small sample, we use a t distribution instead of a normal distribution (See Picture 7 for boundaries)
(Picture 8 also contains an example of how to represent a CI visually)
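A minimal sketch of calculating a CI for a mean with the normal distribution: boundaries = mean ± z × SE, where z is about 1.96 for a 95% CI. The scores are simulated and the function name normal_ci is made up for illustration. For a small sample the z value would be replaced by a t value with N - 1 degrees of freedom (not calculated here, because the Python standard library has no t distribution).

```python
import random
from statistics import NormalDist, fmean, stdev

random.seed(1)  # fixed seed so the sketch is reproducible

# Simulated sample of 100 scores, purely for illustration
scores = [random.gauss(100, 15) for _ in range(100)]


def normal_ci(data, level=0.95):
    """Confidence interval for the mean using the normal distribution (large samples)."""
    n = len(data)
    mean = fmean(data)
    se = stdev(data) / n ** 0.5                # standard error = s / sqrt(N)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level is 0.95
    return mean - z * se, mean + z * se


lower, upper = normal_ci(scores)
print("95% CI:", round(lower, 2), "to", round(upper, 2))
```

Calling normal_ci(scores, level=0.99) gives a wider interval, because a higher confidence level needs a larger z value.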
Null Hypothesis Significance Testing (NHST)
What are the 2 philosophical frameworks leading in NHST?
- Fisher paradigm (Fisher is the father of the p-value): he created the concept of Ho
- Neyman-Pearson: kept Ho and added the concept of Ha
!!! IN GENERAL: If the p-value is sufficiently low (conventionally below .05), reject Ho !!!
NHST
What are the differences between Ho and Ha?
(This table was in the lecture but Johnny didn’t explain it well, I think just remember the general outline of it, no need to understand it really)
Ho:
- Skeptical point of view
- No effect
- No preference
- No correlation
- No difference
Ha:
- Refute skepticism
- Effect
- Preference
- Correlation
- Difference
NHST
What are the 2 main aspects of Frequentist Probability?
- Objective Probability: The likelihood of an event occurring based on empirical data and concrete measures
- Relative frequency in the long run
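A quick simulation sketch of "relative frequency in the long run", using a fair coin as an assumed example: the proportion of heads jumps around for a few flips but settles near 0.5 as the number of flips grows. The seed and the number of flips are arbitrary.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Simulate 100,000 flips of a fair coin (True = heads)
flips = [random.random() < 0.5 for _ in range(100_000)]

# Relative frequency of heads after more and more flips
for n in (10, 100, 1_000, 10_000, 100_000):
    relative_frequency = sum(flips[:n]) / n
    print(n, "flips:", round(relative_frequency, 4))

long_run_frequency = sum(flips) / len(flips)
```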
NHST
What is the standard error?
The Variability of the Sampling Distribution (in other words, if I repeat my experiment over and over again how much variability will there be in the outcome of that experiment?)
- As variability increases, so does SE
(See Picture 9 for formula)
NHST
How do we use the SE?
To do parameter estimation. It is crucial in constructing confidence intervals, since it determines the boundaries and therefore how wide the CI will be.
- For a 95% CI, as the SE increases, so does the width of the CI, so that we can still be 95% confident that the CI contains the true parameter. So two 95% CIs can have different widths: the one with the bigger SE will be the wider one
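A short sketch of how the SE drives the width of a 95% CI, assuming the normal-distribution boundaries (mean ± 1.96 × SE) and SE = SD / √N. The SDs and sample size are made-up numbers, and ci_width is a helper name invented for this example.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% CI


def ci_width(sd, n):
    """Width of a 95% CI for a mean: upper boundary minus lower boundary."""
    se = sd / n ** 0.5
    return 2 * z * se


# Two hypothetical samples with the same N but different variability
narrow = ci_width(sd=5, n=25)   # SE = 1
wide = ci_width(sd=15, n=25)    # SE = 3

print("CI width with SE = 1:", round(narrow, 2))
print("CI width with SE = 3:", round(wide, 2))
```

Both intervals are 95% CIs, but the second is three times as wide because its SE is three times as big.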
(See picture 10)