DSE R CODE Flashcards
Reading data from a CSV file
Advertising = read.csv("Data/Advertising.csv", header = TRUE)
head(Advertising)
Generate lm model
lm1 = lm(sales ~ TV, data = Advertising)
Generate R output table (model summary)
summary(lm1)
Generate confidence intervals for the coefficient of the variable and the constant (95% or 90%)
confint(lm1)
confint(lm1,level=0.9)
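A minimal sketch of the same steps on R's built-in mtcars data, used here only as a stand-in in case Advertising.csv is not at hand. The model mpg ~ wt and the names fit, ci95 and ci90 are illustrative assumptions, not part of the original cards.

```r
# Stand-in data: built-in mtcars (assumption; the cards use Advertising)
fit <- lm(mpg ~ wt, data = mtcars)

# 95% intervals (the default level) for the intercept and the slope
ci95 <- confint(fit)

# 90% intervals: same centre, narrower width
ci90 <- confint(fit, level = 0.90)

print(ci95)
print(ci90)
```

Each row is one coefficient; the two columns are the lower and upper bounds.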
How to plot a scatter plot? Sales against TV
label axes
colour red
solid dots (pch = 19)
with regression line, line width 3 and colour blue
plot(x = Advertising$TV, y = Advertising$sales,
     xlab = "TV", ylab = "Sales", col = "red", pch = 19)
abline(lm1, col = "blue", lwd = 3)
How to get coefficients from the R output table or summary table?
summary(lm4)$coefficients
How to determine whether there is a relationship between variables? (correlation matrix, rounded to 4 d.p.)
round(cor(Advertising), digits = 4)
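A small sketch of the correlation card on built-in data. cor() needs numeric input, so the example selects numeric columns first; the mtcars columns and the name cor_mat are assumptions chosen for illustration.

```r
# Stand-in data: three numeric columns of built-in mtcars (assumption)
num_cols <- mtcars[, c("mpg", "wt", "hp")]

# Pairwise Pearson correlations, rounded to 4 decimal places
cor_mat <- round(cor(num_cols), digits = 4)
print(cor_mat)
```

Values near 1 or -1 suggest a strong linear relationship; values near 0 suggest little linear relationship.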
How to get adjusted R-squared?
summary(lm_mpg1)$adj.r.squared
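The two summary cards above can be combined in one sketch. The coefficient table is a matrix, so single values can be picked by row and column name. The model on mtcars and the names s, coef_table, slope_wt, p_wt and adj_r2 are assumptions for illustration.

```r
# Stand-in model on built-in mtcars (assumption)
fit <- lm(mpg ~ wt + hp, data = mtcars)
s <- summary(fit)

# Matrix with columns: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- s$coefficients

# Pick single entries by row (term) and column name
slope_wt <- coef_table["wt", "Estimate"]
p_wt <- coef_table["wt", "Pr(>|t|)"]

# Adjusted R-squared is stored directly on the summary object
adj_r2 <- s$adj.r.squared

print(coef_table)
print(adj_r2)
```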
How to load data from ISLR package?
Get Auto data
library(ISLR)
data(Auto)
How to generate residual (diagnostic) plots, such as the Q-Q plot?
par(mfrow = c(2, 2))
plot(lm_mpg1)
How to generate scale-location plot?
plot(lm_mpg2, which = 3)
Exclude name variable from linear model (mpg = everything except name)
lm_mpg4 = lm(mpg ~ . - name, data = Auto)
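A quick way to check that the ". - variable" formula really drops a predictor, sketched on built-in mtcars. Dropping carb is an arbitrary choice for illustration, and fit_all is a made-up name.

```r
# "." means every other column in the data; "- carb" removes one of them
fit_all <- lm(mpg ~ . - carb, data = mtcars)

# The dropped variable should not appear among the coefficient names
coef_names <- names(coef(fit_all))
print(coef_names)
```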
How to add labels to nominal categorical data?
Auto$origin = factor(Auto$origin, labels = c("American", "European", "Japanese"))
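A sketch of how the labels are matched, on a small made-up vector of codes (origin_codes is hypothetical data, not taken from Auto). Labels are assigned to the sorted unique values, so 1, 2, 3 map to the labels in the order given.

```r
# Hypothetical numeric codes, in the style of Auto$origin
origin_codes <- c(1, 3, 2, 1, 3)

# Labels follow the sorted levels: 1 -> American, 2 -> European, 3 -> Japanese
origin <- factor(origin_codes, labels = c("American", "European", "Japanese"))

print(origin)
print(table(origin))
```

If a code might be missing from the data, adding levels = 1:3 keeps the mapping fixed; without it, the number of labels no longer matches the number of levels and factor() stops with an error.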
How to generate logistic model?
glm_fit = glm(default ~ balance, data = Default, family = binomial)
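Need to put family = binomial (without it, glm fits an ordinary gaussian model)
binomial is a family function, so write it WITHOUT quotes (glm also accepts the string "binomial", but the unquoted form is the usual one)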
How to get predicted probabilities for your own data?
df_new = data.frame(student = c("Yes", "No"),
                    balance = c(1500, 1500), income = c(40000, 40000))
predict(glm_fit, newdata = df_new, type = "response")
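A self-contained sketch of the same idea on built-in mtcars, showing what type = "response" changes. The model am ~ wt and the names fit, new_cars, probs and log_odds are assumptions for illustration; the cards themselves use the Default data.

```r
# Stand-in logistic model: manual transmission (am = 1) against weight
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# New data must contain every predictor used in the model
new_cars <- data.frame(wt = c(2.5, 3.5))

# type = "response" returns probabilities
probs <- predict(fit, newdata = new_cars, type = "response")

# The default (type = "link") returns log-odds instead
log_odds <- predict(fit, newdata = new_cars)

print(probs)
print(log_odds)
```

Forgetting type = "response" is a common slip: the log-odds can be negative or greater than 1, so they are not probabilities.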