statsmodels Flashcards

1
Q

Add a column of 1s (same length as x1) to x1 so that the model can estimate the intercept beta_0; the result is a new array or DataFrame

A

x = sm.add_constant(x1)

2
Q

Fit an ordinary least squares (OLS) model with dependent variable y and independent variable(s) x

A

results = sm.OLS(y,x).fit()

3
Q

Display the summary of a fitted model

A

results.summary()

4
Q

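Make predictions for new data with a fitted model (new_data must have the same columns as x, including the constant)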

A

predictions = results.predict(new_data)

5
Q

Check for multicollinearity using the variance inflation factor (VIF)

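A

from statsmodels.stats.outliers_influence import variance_inflation_factor

# Compute one VIF per candidate feature column
variables = df[['col1','col2','col3']]
vif = pd.DataFrame()
vif["Features"] = variables.columns
vif["VIF"] = [variance_inflation_factor(variables.values, i) for i in range(variables.shape[1])]
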
6
Q

Compute the fitted values y_hat by hand from the linear regression results (single predictor x1)

A

b0, b1 = results.params  # intercept first, then slope (add_constant prepends the 1s)
y_hat = b0 + b1*x1

7
Q

Fit a logistic regression of y on x1 and display its summary

A

x = sm.add_constant(x1)
reg_log = sm.Logit(y,x)
results_log = reg_log.fit()
results_log.summary()

8
Q

Get the predicted probabilities from a fitted logistic regression, assuming the results are stored in results_log

A

results_log.predict()

9
Q

Get the confusion matrix of a fitted logistic regression (rows: actual class, columns: predicted class), assuming the results are stored in results_log

A

results_log.pred_table()
