Intro Flashcards
Load libraries into R
library(caret) #package names are case-sensitive
Edit data in R
fix()
View names in data frame
names()
Load variable names into the environment so you don't have to type the data frame name for each column
attach(DataFrame)
basic linear model function R
lm(y ~ x, data = DataFrame)
display statistics on model
First save the model output to a variable: lm.output = lm(y ~ x, data = DataFrame)
Then: summary(lm.output)
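Example sketch of the two steps above. The built-in mtcars data set, with mpg as y and wt as x, is an assumed stand-in for DataFrame and is not part of the original cards:

```r
# Fit a simple linear model on the built-in mtcars data (assumed stand-in data)
lm.output <- lm(mpg ~ wt, data = mtcars)

# Full summary: coefficients, standard errors, R-squared, F-statistic
model.summary <- summary(lm.output)
print(model.summary)

# Individual statistics can be pulled out of the summary object
model.summary$r.squared
```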
Show information after fitting a model
names(model.output)
summary(model.output)
Show model coefficients and confidence intervals
coef(model.output) #shows the coeff
confint(model.output) #shows the 95% conf interval for the coefficients
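Example sketch, again assuming mtcars as stand-in data. The level argument of confint() changes the interval width if something other than 95% is wanted:

```r
# Assumed stand-in model: mpg explained by weight
model.output <- lm(mpg ~ wt, data = mtcars)

model.coef <- coef(model.output)   # named vector: (Intercept) and wt
model.ci <- confint(model.output)  # matrix with 2.5 % and 97.5 % columns
print(model.coef)
print(model.ci)

# A wider 99% interval for the same coefficients
confint(model.output, level = 0.99)
```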
Use model to predict new values
predict()
predict(model.output, newdata = NewDataFrame, interval = "confidence") #NewDataFrame holds the new x values
Prediction vs. Confidence Interval
When predicting a single new data point, you want a prediction interval. A confidence interval is about where the average response lies at the given x values, so it is narrower.
To get a PI: predict(model.output, newdata = NewDataFrame, interval = "prediction")
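Example sketch comparing the two intervals. The mtcars data and the new wt values are assumptions for illustration only:

```r
# Assumed stand-in model: mpg explained by weight
model.output <- lm(mpg ~ wt, data = mtcars)

# New x values; the column name must match the predictor used in the model
new.x <- data.frame(wt = c(2.5, 3.5))

conf.int <- predict(model.output, newdata = new.x, interval = "confidence")
pred.int <- predict(model.output, newdata = new.x, interval = "prediction")

# Both have columns fit, lwr, upr; the fit is the same, the PI is wider
print(conf.int)
print(pred.int)
```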
Scatter Plot with Regression Line
plot(x, y)
abline(model.output, lwd = 3, col = "red") #adds the fitted line to the scatterplot
lwd sets the line width
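Example sketch, assuming mtcars as stand-in data (wt as x, mpg as y):

```r
# Assumed stand-in model: mpg explained by weight
model.output <- lm(mpg ~ wt, data = mtcars)

# Scatter plot first, then draw the fitted regression line on top of it
plot(mtcars$wt, mtcars$mpg, xlab = "wt", ylab = "mpg")
abline(model.output, lwd = 3, col = "red")
```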
See diagnostic plots of linear regression
plot(model.output) #works directly on the model object and draws the diagnostic plots
There are 4 plots, so to see them together first split the plotting window into a 2 x 2 grid:
par(mfrow=c(2,2))
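Example sketch of the full sequence, assuming mtcars as stand-in data. Saving and restoring the old par() settings is an addition here, not something the card requires:

```r
# Assumed stand-in model: mpg explained by weight
model.output <- lm(mpg ~ wt, data = mtcars)

old.par <- par(mfrow = c(2, 2))  # 2 x 2 grid; par() returns the previous settings
plot(model.output)               # draws the four diagnostic plots into the grid
par(old.par)                     # restore the single-plot layout
```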
how does predict() work
predict(model.output) with no new data returns a vector of fitted Y values for the data the model was trained on
predict(model.output,
inspect functions
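Type the function name without parentheses, e.g. predict, to print its source
If the source is just a call to UseMethod(), the function is a generic; use methods(functionname) to list its methods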
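Example sketch using predict. The getS3method() call is one assumed way to then view a specific method's source, and is not from the original card:

```r
# Same as typing predict at the console: the body is only a UseMethod() call
print(predict)

# List the class-specific methods behind the generic
pred.methods <- methods(predict)
print(pred.methods)

# Fetch the method that runs for lm objects
predict.for.lm <- getS3method("predict", "lm")
```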
Get max of vector
which.max(vector) returns the index of the (first) maximum; max(vector) returns the value itself
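Example sketch with a made-up vector, showing the difference between the position and the value:

```r
scores <- c(3, 9, 4, 9, 1)       # made-up values with a tie for the max

max.index <- which.max(scores)   # index of the first maximum
max.value <- max(scores)         # the maximum value itself
scores[max.index]                # value looked up through the index

# All positions that tie for the maximum
all.max <- which(scores == max(scores))
```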
Shorthand formula for regression in R
lm.fit = lm(Yvariable ~ ., data = DataFrame)
Instead of writing x1 + x2 + etc. you can just put a dot, which means every other column in the data frame.
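Example sketch, assuming mtcars as stand-in data with mpg as the response. The "- wt" form for dropping one predictor is an extra, not from the original card:

```r
# Dot shorthand: mpg against every other column in mtcars
full.fit <- lm(mpg ~ ., data = mtcars)

# Equivalent long form with every predictor written out
long.fit <- lm(mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb,
               data = mtcars)
all.equal(coef(full.fit), coef(long.fit))

# All columns except one: subtract it from the dot
no.wt.fit <- lm(mpg ~ . - wt, data = mtcars)
```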
Function to use when there is Collinearity
Check the Variance Inflation Factor (VIF) with vif(), part of the car package.
library(car)
vif(lm.fit) #use on model output
Remember: as a rule of thumb, VIF > 5 indicates problematic collinearity
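Sketch of what the VIF measures, computed by hand in base R in case car is not installed. This is not how car implements it; the mtcars data and the choice of predictors are assumptions. For a predictor j, VIF = 1 / (1 - R^2), where R^2 comes from regressing predictor j on the other predictors:

```r
# Predictors of an assumed model such as mpg ~ wt + hp + disp
predictors <- mtcars[, c("wt", "hp", "disp")]

manual.vif <- sapply(names(predictors), function(p) {
  others <- setdiff(names(predictors), p)
  # Regress predictor p on the remaining predictors
  r2 <- summary(lm(reformulate(others, response = p), data = predictors))$r.squared
  1 / (1 - r2)
})
print(manual.vif)

# With car installed, this should agree for a model with only numeric predictors:
# car::vif(lm(mpg ~ wt + hp + disp, data = mtcars))
```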
How to see a correlation matrix
cor() requires all columns to be numeric. If a column isn't numeric, drop it with negative column indexing, such as cor(DataFrame[, -9]) to leave out column 9
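Example sketch, assuming the built-in iris data set, whose fifth column (Species) is a factor. The is.numeric filter is an alternative that avoids hard-coding the column position:

```r
# Drop the non-numeric column by position before calling cor()
cor.matrix <- cor(iris[, -5])
print(round(cor.matrix, 2))

# Alternative: keep only the numeric columns, wherever they are
numeric.cols <- sapply(iris, is.numeric)
cor.matrix.2 <- cor(iris[, numeric.cols])
```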