Generalized Linear Models Flashcards by Mbongeni Sibanda

What is the objective of generalized linear models?

To allow us to do regression in problems where the response Yi is not normally distributed

What is the stochastic/random part of a model?

The part of the model which characterises the distribution of Yi (e.g. Yi ~ N(mu(i), sigma²))

What is the structural part of the model?

A function of mu(i) which describes its relationship with the covariates (eg. mu(i) = B0 + B1X1 + B2X2 + … + BPXP)
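
As a minimal sketch of how the two parts fit together, the R snippet below simulates data from a normal linear model. The sample size, the single covariate x1 and the coefficient values are all made up for illustration and are not taken from the course.

```r
set.seed(1)                         # make the simulated draws reproducible
n  <- 100                           # hypothetical sample size
x1 <- runif(n)                      # a single illustrative covariate
b0 <- 2; b1 <- 3                    # assumed (made-up) coefficients

mu <- b0 + b1 * x1                  # structural part: mu(i) = B0 + B1*X1
y  <- rnorm(n, mean = mu, sd = 1)   # random part: Yi ~ N(mu(i), sigma^2) with sigma = 1

coef(lm(y ~ x1))                    # estimates should land near b0 and b1
```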

What are the two types of model which we go over in this course?

- Poisson Model (for count outcomes)

- Binomial Model (for binary or binomial outcomes)

What is the difference between a binomial outcome and a binary outcome?

A binary (or Bernoulli) outcome is the result of a single trial, whereas a binomial outcome is the number of successes in a fixed number of trials
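
A short R illustration of the distinction, assuming a success probability of 0.3 and 10 trials per binomial observation (both values are arbitrary):

```r
set.seed(1)
p <- 0.3                                     # assumed probability of success

binary   <- rbinom(5, size = 1,  prob = p)   # five Bernoulli outcomes: each is 0 or 1
binomial <- rbinom(5, size = 10, prob = p)   # five binomial outcomes: successes out of 10 trials

binary
binomial
```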

What is a link function?

A function which relates the mean parameter of the distribution to the linear predictor (the linear combination of the covariates)

What is the link function for the Poisson Model?

log(lambda) = linear combination of the covariates

*natural logarithm
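
To see what the log link implies, the sketch below goes from a linear predictor back to lambda by exponentiating. The coefficients and covariate value are hypothetical.

```r
b0 <- 0.5; b1 <- 0.2     # made-up coefficients
x  <- 3                  # made-up covariate value

eta    <- b0 + b1 * x    # linear predictor, equal to log(lambda)
lambda <- exp(eta)       # inverting the log link gives the Poisson mean

lambda                   # always positive, whatever the value of eta
```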

What is the link function for the Binomial Model?

log(odds of success) = linear combination of the covariates

*natural logarithm

Define the term “odds”.

The ratio of the probability of an event occurring to the probability of the event not occurring.

= [p(A)] / [p(not A)]

*in Bernoulli events, “A” is success and “not A” is failure
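
A small worked example in R. The helper name odds is hypothetical and is defined here only for illustration.

```r
# Hypothetical helper: odds of an event that has probability p
odds <- function(p) p / (1 - p)

odds(0.5)    # success and failure equally likely, so the odds are 1
odds(0.75)   # 0.75 / 0.25: success is three times as likely as failure
```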

How do you read data from a CSV file into R?

data = read.csv("filename.csv")

What is the R function for viewing the first few rows of a data object?

head(data)

What is the R function for viewing the names of the variables in a data object?

names(data)

What is the R code for viewing the values under a specific variable name in a data object?

data$variableName

What is the R code for viewing the number of each type of value under a specific variable name in a data object?

table(data$variableName)

What is the R code for viewing the proportion of each type of value under a specific variable name in a data object?

prop.table(table(data$variableName))
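
The inspection functions on these cards can be tried together on a small data frame. The sketch below builds one by hand instead of reading a CSV file; its contents are made up.

```r
# A tiny made-up data object standing in for one read with read.csv()
data <- data.frame(variableName = c("a", "b", "a", "a"),
                   score        = c(1, 2, 3, 4))

head(data)            # first few rows
names(data)           # variable names
data$variableName     # values of one variable

counts <- table(data$variableName)   # number of each value
props  <- prop.table(counts)         # proportion of each value

counts
props
```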

What is the R code for adding a variable to a data object based on some condition of each row?

data$newVariable = ifelse(data$conditionVariable == "something", 1, 0)

Sets newVariable to 1 in rows where the condition is true and to 0 otherwise
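
A runnable version on a made-up three-row data frame:

```r
# Made-up data object with one condition variable
data <- data.frame(conditionVariable = c("something", "other", "something"))

# ifelse() is vectorised: it tests every row and returns one value per row
data$newVariable <- ifelse(data$conditionVariable == "something", 1, 0)

data$newVariable   # 1 0 1
```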

What is the R code for fitting a GLM to a binomial dependent variable and viewing a summary of the model?

model1 = glm(dependent ~ explanatory, family = "binomial", data = dataObject)

summary(model1)

The summary output gives the coefficients on the scale of the link function (logit in this case).
The standard error gives an idea of the variability in the estimate of each coefficient.
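
As a concrete sketch, the same call can be run on R's built-in mtcars data set, using am (a 0/1 transmission indicator) as the dependent variable and wt (weight) as the explanatory variable. This data set is a stand-in chosen for illustration and is not the course's data.

```r
# Binary dependent variable (am is 0 or 1), one explanatory variable (wt)
model1 <- glm(am ~ wt, family = "binomial", data = mtcars)

summary(model1)                  # full summary table

# Just the coefficient table: estimates are on the logit (log-odds) scale
coef(summary(model1))[, c("Estimate", "Std. Error")]

# Exponentiating a coefficient turns a change in log-odds into an odds ratio
exp(coef(model1))
```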

What does logit(p) equate to?

log(p / (1 - p)), i.e. the log of the odds of success

*natural logarithm
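
In base R the logit and its inverse are available as qlogis() and plogis(). The check below uses an arbitrary probability of 0.75.

```r
p <- 0.75                # arbitrary probability for illustration

log(p / (1 - p))         # logit computed by hand
qlogis(p)                # base R's logit: same value
plogis(qlogis(p))        # the inverse logit recovers p
```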

How do we know how well the model fits the data?

D = -2(l(c) - l(f))
D ~ approximately chi-squared with n-k-1 degrees of freedom
* where n is the number of observations
* where k+1 is the number of parameters estimated
* where l is the log-likelihood
* where c is the current model
* where f is the full (saturated) model, which fits the data perfectly

% Deviance Explained = 100 * [Dnull - Dcurrent]/[Dnull]
* where Dnull is the deviance of the model with just the intercept
What is the R code for fitting a GLM to a “count” dependent variable and viewing a summary of the model?

model2 = glm(dependent ~ explanatory, family = "poisson", data = dataObject)

summary(model2)

The summary output gives the coefficients on the scale of the link function (log in this case).
The standard error gives an idea of the variability in the estimate of each coefficient.
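
For a runnable illustration, the sketch below fits a Poisson GLM to R's built-in warpbreaks data set, where breaks is a count. The data set is an assumption made for the example and is not the course's data.

```r
# Count dependent variable (breaks), two categorical explanatory variables
model2 <- glm(breaks ~ wool + tension, family = "poisson", data = warpbreaks)

summary(model2)          # coefficients are on the log scale

family(model2)$link      # the Poisson family uses the log link by default

# Exponentiating gives multiplicative effects on the expected count
exp(coef(model2))
```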