STAT Definitions (EDUS 608) Flashcards
Variable
a characteristic that can vary in value among subjects in a sample or a population.
Categorical Variable (Qualitative)
scale for measurement is a set of categories.
Examples:
Racial-ethnic group (white, black, Hispanic)
Political party identification (Dem., Repub., Indep.)
Vegetarian? (yes, no)
Happiness (very happy, pretty happy, not too happy)
Gender
Religious affiliation
Major
Quantitative Variable
possible values differ in magnitude.
Examples:
Age, height, weight, BMI
Annual income
GPA
Time spent on the Internet yesterday
Reaction time to a stimulus (e.g., cell phone use while driving in an experiment)
Number of "life events" in the past year
Nominal Scale
Used to measure CATEGORICAL VARIABLES by using unordered categories.
Example:
Preference for President, Race, Gender,
Religious affiliation, Major
Opinion items (favor vs. oppose, yes vs. no)
Ordinal Scale
Used to measure CATEGORICAL VARIABLES by using ordered categories.
Political ideology (very liberal, liberal,
moderate, conservative, very conservative)
Anxiety, stress, self esteem (high, medium, low)
Mental impairment (none, mild, moderate, severe)
Government spending on environment (up, same,
down)
Interval Scale
Used to measure QUANTITATIVE VARIABLES by using numerical values.
The differences between values are consistent:
-Moving from $20,000 to $21,000 is the same
magnitude as moving from $50,000 to $51,000
-Moving from 90 degrees F to 95 degrees F is the
same as moving from 70 to 75
Note: In practice, ordinal categorical variables are often
treated as interval by assigning scores
(e.g., grades A, B, C, D, E form an ordinal scale, but
are treated as interval if the scores 4, 3, 2, 1, 0 are assigned to
construct a GPA)
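A minimal Python sketch of this scoring idea (the transcript and the 4-3-2-1-0 mapping are hypothetical illustrations):
# Assign numeric scores to ordinal letter grades so they can be treated as interval.
scores = {"A": 4, "B": 3, "C": 2, "D": 1, "E": 0}
grades = ["A", "B", "B", "C"]                      # hypothetical transcript
gpa = sum(scores[g] for g in grades) / len(grades)
print(gpa)                                         # 3.0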
What is Descriptive Statistics?
- Describing data with tables and graphs (quantitative or categorical variables)
- Numerical descriptions of center (mean/median) and variability (standard deviation/variance) (quantitative variables)
Histogram
A graph of frequencies or percentages for a quantitative variable, using bars over intervals of values.
Skewed right
Long tail on the right. Mean is to the RIGHT of the Median
Skewed left
Long tail on the left. Mean is LEFT of the Median.
Bimodal
Mean and median are the same, but there are two modes.
Bell-shaped
Mean, median, and mode are the same.
Median
Middle measurement of ordered sample.
Mean
The average, used to describe the central tendency of the data. It is computed by adding all the data points and dividing the total by the number of points.
Mean vs. Median (Distribution)
Mean is sensitive to "outliers" (median often preferred for highly skewed distributions)
When the distribution is symmetric, mildly skewed, or discrete with few values, the mean is preferred because it uses the numerical values of the observations
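A minimal Python sketch (hypothetical income values) illustrating the outlier sensitivity described above:
from statistics import mean, median

incomes = [30_000, 32_000, 35_000, 38_000, 40_000]
print(mean(incomes), median(incomes))    # 35000 35000

incomes.append(1_000_000)                # one extreme outlier
print(mean(incomes), median(incomes))    # mean jumps to ~195833; median only moves to 36500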
Range
Difference between largest and smallest observations (highly sensitive to outliers).
Standard Deviation
A “typical” distance from the mean. It is the square root of the variance.
Variance
Measures how far a data set is spread out. It comes from calculating the average of the squared differences from the mean.
Deviation
The difference of an observation’s value from the mean.
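A minimal Python sketch tying deviation, variance, and standard deviation together (hypothetical data; note these sample formulas divide by n - 1 rather than n):
from math import sqrt

data = [2, 4, 4, 4, 5, 5, 7, 9]
m = sum(data) / len(data)                        # mean = 5.0
deviations = [x - m for x in data]               # each observation minus the mean
variance = sum(d ** 2 for d in deviations) / (len(data) - 1)
sd = sqrt(variance)                              # a "typical" distance from the mean
print(m, variance, round(sd, 2))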
Properties of Standard Deviation
- s ≥ 0, and s = 0 only if all observations are equal
- s increases with the amount of variation around the mean
- like mean, affected by outliers
Empirical Rule
If distribution is approximately bell-shaped:
• about 68% of data within 1 standard dev. of mean
• about 95% of data within 2 standard dev. of mean
• all or nearly all data within 3 standard dev. of mean
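A rough check of the Empirical Rule on simulated bell-shaped data (a hypothetical normal sample; the exact percentages will vary slightly from run to run):
import random

random.seed(0)
data = [random.gauss(mu=100, sigma=15) for _ in range(10_000)]
m = sum(data) / len(data)
sd = (sum((x - m) ** 2 for x in data) / len(data)) ** 0.5

for k in (1, 2, 3):
    within = sum(abs(x - m) <= k * sd for x in data) / len(data)
    print(f"within {k} SD: {within:.1%}")        # roughly 68%, 95%, 99.7%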
Point Estimation
Estimating population parameters (e.g., mean, median, standard deviation) with single values computed from a sample.
Inference
Testing theories about parameters.
Hypothesis Testing
Creating models based on hypotheses and testing them with data to see if they are consistent with the data.
Null Hypothesis H0
– There is no effect.
– E.g. contestants on “Survivor” and members of the public will not differ in their scores on personality disorder questionnaires
It is called the "null" because it is frequently,
though not always, used to say that something
is 0.
• Examples of null hypotheses:
– Ho: μ=0
– Ho: There are no differences in math achievement
by SES level.
– Ho: “I don’t have the flu”
The alternative hypothesis, HA (or H1)
– There is an effect.
– E.g. contestants on “Survivor” will score higher on personality disorder questionnaires than members of the public
• Typically suggests that an effect exists, or
(in this class) is statistically significant
• Examples of alternative hypotheses
corresponding to the previous examples:
– Ha: μ≠0
– Ha: There are differences in math achievement by SES level.
– Ha: “I have the flu”
Null versus Alternative
- The null and the alternative can’t both be true and are mutually exclusive
- Using statistics, we have strong tools to assess the probability that one is correct and the other isn’t…
- Based on the results you obtain, you will either reject the null hypothesis (you have evidence an effect exists), or you will fail to reject the null hypothesis (you don’t have enough evidence that an effect exists)
- Instead of saying "fail to reject" the null hypothesis, some disciplines use "retain" the null hypothesis
Type I Error
Aka. False Positive.
Reject the null if it is true.
• My test says that μ≠0, but actually μ=0
• My test says that achievement differs by SES, but it actually
doesn't
• My swab results say “I have the flu”, but I actually don’t
– i.e., I got a false positive
– In medical tests, testing “positive” means rejecting the null
Type II Error
Aka. False Negative.
Fail to reject the null when it’s false.
• My test says that I can’t reject the assertion that μ=0, but in reality μ≠0
• My test says achievement doesn’t differ by SES, but it actually does
• My swab results say “I don’t have the flu”, but I actually do
– i.e., I got a false negative
– In medical tests, testing “negative” means you do not reject the null
Alpha (α)
- the proportion of the times I can expect to reject the null when it’s true in repeated randomly drawn samples of the same sample size from the population
- aka the probability that I will make a type I error (with repeated sampling)
- This is also called the “significance level”
How Do We Choose Alpha?
• There are conventional choices for α
– The most common choice for α is .05
– Also common are .1 and .01
• All these choices are arbitrary, but they are attempting to be conservative
The smaller the alpha level, the smaller the area where you would reject the null hypothesis. So if you have a tiny area, there’s more of a chance that you will NOT reject the null, when in fact you should. This is a Type II error.
In other words, the more you try to avoid a Type I error, the more likely a Type II error could creep in. Scientists have found that an alpha level of 5% is a good balance between these two issues.
Approaches to Conducting Hypothesis Tests
1. Test of significance (t-test, z-test)
2. Confidence interval (90%, 95%, 99%)
Test of Significance
- Use a formula to create a test statistic
- Compare the value of the test statistic with the value corresponding to our choice of α in the distribution appropriate to the test (could be z, t, chi-square, etc.)
  * This value is called the "critical value"
- If the test statistic is larger than the critical value, you reject the null hypothesis. If the test statistic is smaller than the critical value, you fail to reject the null hypothesis. (See the sketch below.)
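A minimal Python sketch of a one-sample z test for H0: μ = 0 (the numbers are hypothetical; using the z critical value assumes a large sample):
from math import sqrt

sample_mean, sample_sd, n = 2.1, 8.0, 100
mu_0 = 0                                         # value of the mean under the null
z = (sample_mean - mu_0) / (sample_sd / sqrt(n)) # test statistic
critical_value = 1.96                            # two-sided critical value for alpha = .05
print(z, abs(z) > critical_value)                # 2.625 True -> reject the null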
Critical Value
In hypothesis testing the value
against which a test statistic is compared to determine whether or not the null hypothesis is rejected.
For a two-sided z test with alpha = .05, this is +/- 1.96. If the absolute value of the test statistic is larger than the critical value, then we reject the null.
Rejection Region
The set of values of a test
statistic that leads to rejecting the null hypothesis. If the test statistic falls in the rejection region, then we reject the null hypothesis.
p-value
The smallest significance level at which the null hypothesis can be rejected.
It captures the amount of confidence we have in our inference,
based on the probability that we would get a result at least as
extreme as our particular parameter estimate if the null hypothesis were true.
It can be thought of as the amount of evidence against the null.
What would a low p-value indicate?
– The lower the p-value, the less likely it is that we would get the result we got if the null were true
– Thus, the lower the p-value, the more significantly different our estimate is from the hypothesized null value of the parameter.
Typically, we say that p-values need to be .05 or smaller to reject, but this is just one choice.
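A minimal sketch of computing a two-sided p-value for the z statistic from the earlier test-of-significance sketch (assumes scipy is available):
from scipy.stats import norm

z = 2.625
p_value = 2 * (1 - norm.cdf(abs(z)))             # area in both tails beyond |z|
print(round(p_value, 4))                         # about 0.0087, below .05 -> reject the null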
Confidence Interval
an interval of numbers within which a given parameter is believed to fall
It has the form: Estimate ± Margin of error OR
[Estimate - Margin of error, Estimate + Margin of error]
Elements of Confidence Interval
– Your estimate of the population mean
– Your estimate of the standard deviation of the population
– The sample size
– The level of confidence that you specify
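A minimal sketch putting those elements together for a 95% confidence interval (hypothetical numbers; z = 1.96 assumes a large sample):
from math import sqrt

estimate, sd, n = 50.0, 10.0, 64                 # sample mean, sample SD, sample size
margin_of_error = 1.96 * sd / sqrt(n)            # critical value * standard error
print(estimate - margin_of_error, estimate + margin_of_error)   # 47.55 52.45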
Alpha-Level
• We choose the level of α to give us a specific degree of confidence
– α =.10 => 90% confidence
– α =.05 => 95% confidence
– α =.01 => 99% confidence
• This is very similar to choosing a threshold for Type I error in hypothesis testing: you are
choosing which level of confidence is appropriate (usually at least 95%, i.e., α = .05)
• Just like in significance testing, each level of confidence corresponds to a particular critical value
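A minimal sketch of the two-sided critical values that correspond to these α levels (assumes scipy is available):
from scipy.stats import norm

for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(norm.ppf(1 - alpha / 2), 3))   # 1.645, 1.96, 2.576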
Correlation
It is a way of measuring the extent to which two variables are related.
It measures the strength of the association between two interval/ratio quantitative variables.
It measures the pattern of responses across variables.
Covariance
A measure of how much two variables change together.
Problems with Covariance
It depends upon the units of measurement.
– E.g. The Covariance of two variables measured in
Miles might be 4.25, but if the same scores are
converted to Km, the Covariance is 11.
(Pearson) Correlation coefficient
• One solution to problems with covariance is to standardize it.
– Divide by the standard deviations of both variables.
– It is relatively unaffected by units of measurement.
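A minimal sketch (hypothetical distance data) showing that covariance changes with the units but the Pearson correlation does not (assumes numpy is available):
import numpy as np

miles_x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
miles_y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
km_x, km_y = miles_x * 1.609, miles_y * 1.609      # the same data, converted to km

print(np.cov(miles_x, miles_y)[0, 1])              # covariance in miles
print(np.cov(km_x, km_y)[0, 1])                    # larger, only because the units changed
print(np.corrcoef(miles_x, miles_y)[0, 1])         # correlation is unchanged
print(np.corrcoef(km_x, km_y)[0, 1])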
About Correlation
• It varies between -1 and +1
  – 0 = no relationship
• Positive values suggest a positive relationship between the two variables
  – As X increases, Y increases
  – As X decreases, Y decreases
• Negative values suggest a negative relationship
  – As X increases, Y decreases
  – As X decreases, Y increases
Effect Size (Correlation)
It can be interpreted as an effect size:
– ±.1 = small effect
– ±.3 = medium effect
– ±.5 = large effect
Coefficient of Determination, r-squared
– By squaring the value of r you get the proportion of variance in one variable shared by the other.
Correlation + Causality
• The third-variable problem:
– in any correlation, causality between two variables cannot be assumed because there may be other measured or unmeasured variables affecting the results.
• Direction of causality:
– Correlation coefficients say nothing about which variable causes the other to change
• Correlation is not causation
Nonparametric Correlation
• For small samples (say less than 30), or for data that are severely non-normally
distributed
• Spearman’s Rho
– Pearson’s correlation on the ranked data
• Kendall’s Tau
– Better than Spearman’s for small samples
• For this class: we will focus on the Pearson correlation (r)
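A minimal sketch of the three coefficients side by side (hypothetical data; assumes scipy is available):
from scipy.stats import pearsonr, spearmanr, kendalltau

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]

print(pearsonr(x, y))                              # Pearson r and its p-value
print(spearmanr(x, y))                             # Spearman's rho (Pearson on the ranks)
print(kendalltau(x, y))                            # Kendall's tau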
Regression
A way of predicting the value of one variable from another.
– It is a hypothetical model of the relationship between two variables.
– The model used is a linear one;
– Therefore, we describe the relationship using the equation of a straight line.
Linear Model
y = b0 + b1*X (plus error)
Population Model
Used to describe an overall theoretical linear relationship between two variables.
Prediction Model
We use the same format as a population model but “plug in” estimated values and make a prediction.
Big difference: there is no error term; for prediction we assume the errors average out.
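A minimal sketch of fitting a prediction model and plugging in a new value (hypothetical study-hours and exam-score data; assumes scipy is available):
from scipy.stats import linregress

hours = [1, 2, 3, 4, 5, 6]
scores = [55, 60, 58, 70, 72, 78]

fit = linregress(hours, scores)                    # least-squares estimates of b0 and b1
b0, b1 = fit.intercept, fit.slope
print(b0, b1)
print(b0 + b1 * 7)                                 # predicted score for 7 hours of study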
Sum of Squares Regression
Model variability (the improvement in variability obtained by using the model rather than just the mean).
Sum of Squares Residual
Residual/Error variability (variability between the regression model and the actual data).
Sum of Squares Total
Total Variability (variability between scores and the mean)
How Sum of Squares Relate
Sum of Squares Total = Sum of Squares Regression + Sum of Squares Residual (the total variability splits into the part explained by the model and the part left over).
How Good is the Model?
We need to test the model. We can do this in two ways, using the sums of squares to create two diagnostic tests: the F-test and R Square.
Interpreting the F-test
Divide each Sum of Squares by its df; this gives you the Mean Square. Then divide Mean Square Regression by Mean Square Residual to get F. Compare this value with the critical F value with df (1, n - 2), and check the p-value (sketched below).
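A minimal sketch of the sums of squares and the F-test for the hypothetical regression data used above (assumes scipy is available):
from scipy.stats import linregress, f

hours = [1, 2, 3, 4, 5, 6]
scores = [55, 60, 58, 70, 72, 78]
n = len(scores)

fit = linregress(hours, scores)
predicted = [fit.intercept + fit.slope * x for x in hours]
mean_y = sum(scores) / n

ss_total = sum((y - mean_y) ** 2 for y in scores)                          # SS Total
ss_residual = sum((y - yhat) ** 2 for y, yhat in zip(scores, predicted))   # SS Residual
ss_regression = ss_total - ss_residual                                     # SS Regression

F = (ss_regression / 1) / (ss_residual / (n - 2))  # MS Regression / MS Residual
p = f.sf(F, 1, n - 2)                              # p-value from the F(1, n-2) distribution
print(F, p)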
R
the absolute value of the correlation coefficient.
R Square
the square of the correlation coefficient. “coefficient of determination” Equal to Sum of Squares Regression divided by Sum of Squares Total.
Super useful for determining model fit - it represents the proportion of the total variance (sum of Squares Total) that is explained by your regression equation.
R Square shows you how well your model fits your data; it is much more precise than the F-test, which just tells us whether things fit in very general terms.
You can think of R Square as an effect size for your regression, with the same range of values:
.1 - .3 (small); .3 - .5 (medium); .5 or higher (large)
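A minimal sketch (same hypothetical data as above) showing that, for a simple regression, R Square is just the squared Pearson correlation (assumes scipy is available):
from scipy.stats import linregress

hours = [1, 2, 3, 4, 5, 6]
scores = [55, 60, 58, 70, 72, 78]

fit = linregress(hours, scores)
print(fit.rvalue ** 2)                             # proportion of variance in scores explained by hours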
Synthesize Results (Steps)
How much evidence do you have to justify your findings?
- Is the relationship significant?
- What is the direction of the relationship?
- How well does the model fit the data?
Generalizability
The goal of statistics - we want the research to apply to a larger population.
Generalizability Assumptions for Regressions
- Variable Type
- Non-Zero Variance
- Linearity
- Independence
- Homoscedasticity
- Normally-distributed Errors
- No Multicollinearity
Assumption - Variable Type
- Outcome must be continuous (interval/ratio)
- Predictors can be continuous (interval/ratio) or dichotomous.
Assumption - Non-Zero Variance
Predictors must not have zero variance (have to have some variation).
Assumption - Linearity
The relationship we model is, in reality, linear.
Assumption - Independence
All values of the outcome should come from different people (the observations are independent).
Assumption - Homoscedasticity
For each value of the predictors the variance of the error term should be constant.
Assumption - Normally-distributed Errors
Residual error values should be normally distributed when viewed in a histogram.
No Multicollinearity
Predictors must not be highly correlated.
*This only applies to multiple regressions, not simple linear regressions.
Tools for Checking Assumptions Using Residuals (Errors) - aka going through the garbage
- Homoscedasticity and Linearity - plot standardized residual values (ZRESID) against standardized predicted values (ZPRED).
- Normality of Errors - (1) normal probability plot or (2) a histogram of the standardized residuals.
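A minimal sketch of these two checks outside SPSS (hypothetical data; assumes scipy and matplotlib are available):
import matplotlib.pyplot as plt
from scipy.stats import linregress, zscore

hours = [1, 2, 3, 4, 5, 6]
scores = [55, 60, 58, 70, 72, 78]

fit = linregress(hours, scores)
predicted = [fit.intercept + fit.slope * x for x in hours]
residuals = [y - yhat for y, yhat in zip(scores, predicted)]

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.scatter(zscore(predicted), zscore(residuals))  # ZRESID vs. ZPRED: look for funnels or curves
ax1.axhline(0)
ax2.hist(zscore(residuals))                        # should look roughly bell-shaped
plt.show()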
Big Picture for Residual Plots
From Field, p. 348: “If everything is OK [assumptions have been met] then this graph should look like a random array of dots, if the graph funnels out then that is a sign of heteroscedasticity and any curve suggests non-linearity.”
Look for:
1. Funneling - the spread of the points gets smaller/larger across the plot (heteroscedasticity)
2. Curving- the points show a curved trend up or down (or both!) (non-linearity)
Homoscedasticity vs. Heteroscedasticity
We assume that the residual errors demonstrate
homoscedasticity (a good thing):
– This means that for each value of your predictor(s) the variance of the error term should be constant
– We will practice how to look for this next
If the residual errors are not homoscedastic, then they demonstrate heteroscedasticity (a bad thing):
– This means that the variance of the error term is not constant
– The assumption of homoscedasticity has been violated
Normally-Distributed Errors vs. Non-Normality of Errors
We assume that our residual error values are normally distributed (a good thing):
– This means that the residual errors should be normally distributed when viewed in a histogram or a probability plot
If the normality of errors assumption is violated, we have evidence of non-normality of errors (a bad thing):
– This means that there may be outliers or influential observations skewing our results, or maybe we
specified the wrong kind of model
Linear vs. Non-Linear Relationship
We assume that our outcome and predictor(s) have a linear relationship (a good thing):
– This means that the relationship can be modeled with a line
– We will practice how to look for this next
• If the linearity assumption is violated, we have evidence of a non-linear relationship (a bad thing):
– This means that the relationship should not be modeled with a straight line (might be quadratic,
exponential, etc.)
– We would need another kind of model to accurately
represent it
Normal Probability Plot
Look for a nearly straight line; curves or patterns are a sign that something is up with the data.
Multiple Linear Regression (MLR)
Relationships between more than two variables.
Why we need MLR
Bivariate analyses (e.g. simple linear regression) are informative, but we usually need to take into account many variables.
- Many predictors (“x”es) have an influence on any particular outcome (y). Especially in education: many things influence achievement, as an example.
- The effect of a given predictor (x) on an outcome (y) may change when we take into account other variables.