statsmodels Flashcards
add a column of 1s (equal in length to x1) to x1 so the model can estimate the constant beta_0; returns a new pd.DataFrame when x1 is a pandas object
x = sm.add_constant(x1)
fit an OLS model with dependent variable y and independent variable(s) x
results = sm.OLS(y,x).fit()
display model summary
results.summary()
make predictions for new data (new_data needs the same columns as x, including the constant)
predictions = results.predict(new_data)
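check for multicollinearity with variance inflation factors (VIF)
from statsmodels.stats.outliers_influence import variance_inflation_factor
variables = df[['col1','col2','col3']]
vif = pd.DataFrame()
vif["Features"] = variables.columns
vif["VIF"] = [variance_inflation_factor(variables.values, i) for i in range(variables.shape[1])]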
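To see what the VIF table looks like, here is a sketch on synthetic data in which col2 is deliberately built as a noisy copy of col1. The data is an assumption for illustration; the VIF computation itself follows the card as written, without a constant column.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Synthetic features (assumed for illustration): col2 is nearly collinear with col1.
rng = np.random.default_rng(0)
n = 200
col1 = rng.normal(size=n)
col2 = col1 + rng.normal(scale=0.1, size=n)
col3 = rng.normal(size=n)
df = pd.DataFrame({"col1": col1, "col2": col2, "col3": col3})

variables = df[["col1", "col2", "col3"]]
vif = pd.DataFrame()
vif["Features"] = variables.columns
# One VIF per column: each column is regressed on all the others.
vif["VIF"] = [
    variance_inflation_factor(variables.values, i)
    for i in range(variables.shape[1])
]
print(vif)
```

col1 and col2 should show large values while col3 stays close to 1. A frequently quoted rule of thumb treats values above roughly 5 to 10 as a sign of problematic collinearity, though the cutoff is a convention rather than a hard rule.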
y_hat equation from simple linear regression results (params: constant first, then slope)
y_hat = results.params.iloc[0] + results.params.iloc[1]*x1
logistic regression
x = sm.add_constant(x1)
reg_log = sm.Logit(y,x)
results_log = reg_log.fit()
results_log.summary()
logistic regression predicted probabilities for the training data. Assume you have results in results_log
results_log.predict()
logistic regression confusion matrix (rows: actual, columns: predicted, default threshold 0.5). Assume you have results in results_log
results_log.pred_table()