Chapter 2: Random Variables Flashcards
Integer
a whole number (not a fractional/decimal number) that can be positive, negative, or zero
Variable
A number, amount, or characteristic that can vary from one observation to another and can have an effect on other things.
Examples of variables include height, age, income, province or country of birth, grades obtained at school, and type of housing.
Interval data
Temperature in Celsius – can be categorized, ranked, and has equal intervals, but no true zero (0°C doesn’t mean “no temperature”).
Nominal data
Gender (Male, Female, Other) – can only be categorized, no ranking.
Ordinal data
Education level (High school, Bachelor’s, Master’s, PhD) – can be categorized and ranked, but the intervals are not equal.
Ratio data
Weight – can be categorized, ranked, has equal intervals, and a natural zero (0 kg means “no weight”).
R-squared value shown in a regression table
How much of the variance is explained by the model.
It shows how well the data fit the regression model. Example: a value of 0.7162 means that the model explains 71.62% of the variance in the outcome. The R-squared value shows how well the model’s predictions match the actual data.
A value close to 0 indicates a very weak linear association; a value close to 1 indicates a strong association.
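As a rough illustration of "variance explained", here is a minimal Python sketch. It assumes the usual definition R² = 1 - (residual sum of squares / total sum of squares). The function name `r_squared` and the data values are hypothetical, made up only for this example.

```python
def r_squared(y_actual, y_predicted):
    """R-squared as 1 - SS_res / SS_tot (assumed standard definition)."""
    mean_y = sum(y_actual) / len(y_actual)
    # Variation left over after the model's predictions
    ss_res = sum((y - p) ** 2 for y, p in zip(y_actual, y_predicted))
    # Total variation of the outcome around its mean
    ss_tot = sum((y - mean_y) ** 2 for y in y_actual)
    return 1 - ss_res / ss_tot

# Hypothetical outcome values and model predictions
y_actual = [2.0, 4.0, 6.0, 8.0]
y_predicted = [2.5, 3.5, 6.5, 7.5]

r2 = r_squared(y_actual, y_predicted)
print(f"R-squared: {r2:.2f} ({r2:.0%} of the variance explained)")
```

With these made-up numbers the predictions sit close to the actual values, so the result is near 1.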
Predictor variables
A predictor variable, also known as an independent variable or explanatory variable, is a variable used in statistical modeling to predict or explain the outcome of another variable, known as the dependent variable or response variable.
How to calculate the t-value, SE, and slope in a regression table
t = estimate (coefficient) / SE.
Rearrange the formula to calculate the SE or the estimate/slope (remember to DO NOT TAKE of the difference)
So t = estimate (coefficient; slope) / SE
SE = slope / t-value
Slope = t-value * SE
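The three rearrangements can be checked with a small Python sketch. The function names and the numbers (estimate 1.5, SE 0.5) are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def t_value(estimate, se):
    """t = estimate (slope) / SE"""
    return estimate / se

def standard_error(estimate, t):
    """SE = estimate (slope) / t-value"""
    return estimate / t

def slope(t, se):
    """Slope = t-value * SE"""
    return t * se

# Hypothetical row from a regression table
estimate, se = 1.5, 0.5

t = t_value(estimate, se)
print("t-value:", t)
print("SE recovered from slope and t:", standard_error(estimate, t))
print("Slope recovered from t and SE:", slope(t, se))
```

Whichever two of the three values the table gives, the third follows from the same formula.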
Deviance calculations
Don’t need to know the calculations.
Used for GLMs, where no multiple R-squared is reported.
1. 1 - (Residual Deviance / Null Deviance): Use when you want to know how much deviance is explained by the model (similar to R-squared).
2. (Residual Deviance / Null Deviance) * 100%: Use when you want to know how much deviance is left unexplained by the model.
If formula 1 gives a result of 0.75, this means the model explains 75% of the deviance in the data.
If formula 2 gives 25%, this means 25% of the deviance remains unexplained by the model.
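A minimal Python sketch of both deviance formulas follows. The function names and the deviance values (null deviance 200, residual deviance 50) are hypothetical, picked so that the results match the 0.75 and 25% examples above.

```python
def deviance_explained(residual_deviance, null_deviance):
    """Proportion of deviance explained: 1 - (residual / null)."""
    return 1 - residual_deviance / null_deviance

def deviance_unexplained_pct(residual_deviance, null_deviance):
    """Percentage of deviance left unexplained: (residual / null) * 100."""
    return residual_deviance / null_deviance * 100

# Hypothetical values as they might appear in GLM output
null_deviance = 200.0
residual_deviance = 50.0

print("Explained (proportion):", deviance_explained(residual_deviance, null_deviance))
print("Unexplained (%):", deviance_unexplained_pct(residual_deviance, null_deviance))
```

The two results are complementary: the explained proportion plus the unexplained proportion always add up to 1 (100%).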
Difference between R² and the F-statistic
• R² tells you how much of the variance is explained by the model.
• F-statistic tests if the model as a whole fits significantly better than a model with no predictors.