Advanced Design and Data Analysis Flashcards
Define and describe variance
For estimating population variance:
* Σ(person score − population mean)² / N is unbiased
○ Makes fewer extreme estimates
○ Unrealistic though, because it’s rare that we know the population mean
* Σ(person score − sample mean)² / N is biased
* Σ(person score − sample mean)² / (N − 1) is unbiased
○ This is unbiased because the sample mean is not the same as the population mean (used in the first option). The sample scores will be closer to the sample mean than to the population mean, so the squared deviation terms tend to be underestimates. Dividing by (N − 1) corrects for that by making the result slightly bigger, which has a larger proportional effect when N is small (with a larger sample, the sample mean is a closer estimate of the population mean, so less correction is needed)
A sampling distribution here is the distribution of variance estimates produced by a formula across many repeated samples. If you know the actual population variance, you can see from it how biased or unbiased an estimator is, because averaging many estimates gives an accurate picture of the estimator's long-run behaviour
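A minimal simulation sketch of this idea (the population values and sample size below are arbitrary choices, not from the notes): draw many samples from a population with known variance and compare the long-run average of the /N and /(N − 1) estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
pop_var = 25.0          # population variance (sigma = 5), chosen arbitrarily
n, reps = 10, 100_000   # small samples exaggerate the bias

samples = rng.normal(loc=100, scale=np.sqrt(pop_var), size=(reps, n))
biased = samples.var(axis=1, ddof=0)     # divide by N
unbiased = samples.var(axis=1, ddof=1)   # divide by N - 1

# The sampling distribution of each estimator: its mean reveals the bias.
print(biased.mean())    # tends to underestimate 25 (around 22.5 here)
print(unbiased.mean())  # close to 25 on average
```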
What is power?
To say that a test has 80% power means it has an 80% chance of rejecting the null hypothesis, given that the null hypothesis is false, and given:
* A particular sample size
* Particular effect size
* Particular alpha level (often .05 probability of rejecting the null hypothesis)
* Other considerations, including those related to whether the assumptions of the test are satisfied
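A rough simulation sketch of what 80% power means for a two-group t-test; the effect size (d = 0.5), per-group n (64), and alpha are illustrative choices, not values from the course.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
d, n, alpha, reps = 0.5, 64, 0.05, 5000   # medium effect, n per group

rejections = 0
for _ in range(reps):
    control = rng.normal(0.0, 1.0, n)
    treatment = rng.normal(d, 1.0, n)      # true difference of d SDs
    p = stats.ttest_ind(treatment, control).pvalue
    rejections += p < alpha

print(rejections / reps)  # proportion of significant results ~ power (about 0.80 here)
```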
Reflections on power:
* We don’t have great intuitions about sample size as it relates to power.
○ Our intuitions may have been warped by seeing psychology journals reporting findings with very small sample sizes but statistically significant results
§ An example of publication bias (non-significant studies tend to get chucked out of the publication process)
Explain Statistical inference issues
-Significance tests cannot prove or disprove theories
-They provide probabilistic information and at most can corroborate theories
-significance tests do not allow probabilities to be assigned to any particular hypothesis
-With an alpha level of .05, we will reject a true null hypothesis 5% of the time (kind of like a margin of error). However, that is a global error rate, and doesn’t tell us the probability that any particular (local) decision is a mistake
Explain P-Values
p is the probability of getting our observed result, or a more extreme result, if the null hypothesis is true
Explain confidence intervals
General formula for confidence intervals for the population mean is M +/- Margin of error
If you’re randomly sampling from a normally distributed population and know the population standard deviation σ, the 95% CI is M ± 1.96 × σ/√N
The issue is that it is VERY rare to know the population standard deviation
If you don’t know the population standard deviation, you base the critical value on a t-distribution rather than a normal distribution
Cut-offs will be different, and with larger sample sizes it will be more similar to a normal distribution
But don’t just use the cut-offs for a normal distribution because they won’t be the same
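A small sketch of a 95% CI using the t-distribution when the population SD is unknown; the scores below are made up.

```python
import numpy as np
from scipy import stats

scores = np.array([12, 15, 11, 14, 13, 16, 12, 15])  # made-up sample
n = len(scores)
m, sd = scores.mean(), scores.std(ddof=1)
se = sd / np.sqrt(n)

t_crit = stats.t.ppf(0.975, df=n - 1)   # t cut-off, not the normal 1.96
ci = (m - t_crit * se, m + t_crit * se)
print(ci)
```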
INTERPRETING CONFIDENCE INTERVALS
* If we ran many studies, 95% of the intervals would contain the population mean
* We don’t know whether this particular interval does or doesn’t
What is the difference between p-values and confidence intervals?
Advantages of confidence intervals
○ They give the set of values that, had any of them been your null hypothesis, would not have been rejected
○ They tell you about the precision of your estimate
An advantage of p-values
○ They give a clearer indication of the evidence against the null hypothesis
Describe the replication crisis and 5 suggested reasons for it
The ‘file drawer’ problem
* The bias introduced in the scientific literature by selective publication - chiefly by a tendency to publish positive results, but not to publish negative or nonconfirmatory results
Gelman (2016) mentions five reasons:
* Sophistication
○ Because psychology was focussing on more sophisticated concepts than other disciplines, it was more open to criticism
* Openness
○ Has culture in which data sharing is very common - easier to find mistakes
* Overconfidence deriving from research design
○ Researchers may feel that they can’t go wrong using simple textbook methods, and their p-values will be ok
* Involvement of some prominent academics
○ More of its leading figures have been dragged into the replication crisis than other disciplines
* The general interest of psychology
○ Methods are very accessible so more people are willing to find mistakes
What are some routes to the replication crisis
Outright fraud (rare)
P-hacking, data dredging, data snooping, fishing expeditions (rarer than is commonly believed)
○ Looking for what people want to find
○ Sifting through the data
The garden of forking paths (more common than generally realised)
○ Running only the analyses that look like they might be significant, based on the data (looking at the data before choosing the tests; if the data had been different, other tests might have been run)
○ Solution could be to make requirements to preregister the hypotheses and methods
What are some typical experimental designs?
Between-subjects design
* Different participants contribute data to each level of the IV
* But differences might be due to differences between participants
○ Random assignment can help reduce this
○ Or matching - balance out the conditions
Within-subjects design
* Each participant gets exposed to each level of the IV
* Major concern with this is sequencing effects (each previous level having an influence on the next level)
Single factor design
* Only one IV
* Can have two levels (eg placebo and treatment), or more than two levels (eg placebo, treatment level 1, treatment level 2)
Factorial designs
* More than one IV
* Analysed with two-way ANOVA
* Interaction effects
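A sketch of how a 2×2 factorial design might be analysed with a two-way ANOVA in Python; the drug/therapy factors, the data frame, and all values are invented for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(2)
# Invented 2x2 design: drug (placebo/treatment) x therapy (no/yes)
df = pd.DataFrame({
    "drug": np.repeat(["placebo", "treatment"], 40),
    "therapy": np.tile(np.repeat(["no", "yes"], 20), 2),
})
df["outcome"] = rng.normal(10, 2, len(df)) + (df["drug"] == "treatment") * 1.5

# Main effects plus the interaction term
model = smf.ols("outcome ~ drug * therapy", data=df).fit()
print(anova_lm(model, typ=2))   # F tests for each main effect and the interaction
```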
Explain correlational research and its uses
Investigates the relationship between two (usually continuous) variables; no variable is manipulated to observe an effect on the other
This type of research is often linked with the concepts of correlation and regression
○ Regression
§ Predicting a variable from other variables in a regression model
Designs are useful when
○ Experiments cannot be carried out for ethical reasons
○ Ecological validity is a priority
While correlation and regression are associated with correlational research, ANOVA and t-tests are associated with experimental research
This distinction is bogus; any of these analyses can be used with any research design
What is a quasi-experimental design?
○ Groups occur naturally in the world; cannot be random assignment to groups
§ Eg comparing men and women
○ Often used in program evaluation
§ Provides empirical data about effectiveness of government and other programs
What are some issues with null hypothesis significance testing
If power is low, there may be a difference, but you don’t see it
If power is very high (eg with a very large sample size), even trivially small differences can come out statistically significant
Explain empiricism
Finding out about the world through evidence
○ Could be considered as:
§ Observation = truth + error
§ Observation = theory + error
§ Observation = model + error
○ We need a theory of error to fit our models
○ Classical methods in statistics tend to assume errors are normally distributed
§ Gauss was the first to fully conceptualise the normal distribution
Why do we use linear models
○ Easy to fit
○ Commonly used
○ Lots of practical application (prediction, description)
○ Provide a descriptive model that is very flexible
○ Have assumptions that are broadly reasonable
What is the theory of error?
§ Often assume that the ‘error’ is normally distributed with zero mean
□ A theory of error
® It is the error term that requires statistical techniques
® The substantive question (eg the relationship between age and IQ) isn’t statistical at all; statistics comes in because of the error term
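A tiny sketch of the 'observation = model + error' idea using the age-IQ example; the data and the fitted relationship are fabricated, and the linear fit is just the non-error part of the model.

```python
import numpy as np

rng = np.random.default_rng(3)
age = rng.uniform(18, 80, 200)
iq = 100 + 0.05 * age + rng.normal(0, 15, 200)   # fabricated 'truth' plus noise

slope, intercept = np.polyfit(age, iq, deg=1)    # the model (non-error) part
residuals = iq - (intercept + slope * age)       # the error part

# The statistical machinery is all about these residuals
print(residuals.mean(), residuals.std(ddof=1))
```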
Why do we assume normal errors?
○ Two broad categories of justification for building models around the assumption of normal errors
§ Ontological
□ Study of nature of being
□ Normal distributions occur a lot in the world so let’s build model around them
□ Any process that sums together the result of random fluctuations has a good chance of somewhat resembling normal distributions
□ Sampling distributions and many statistics tend towards normality as sample size increases
§ Epistemological
□ Normal distributions represent a state of our knowledge (more like ignorance)
□ They don't contain any info about the underlying process, except its mean and variance
□ We should still be interested in the underlying process, but when we don't know anything about it, it's best to make as few assumptions as possible
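A quick sketch of the ontological point above: summing many small, decidedly non-normal random fluctuations already produces something close to a normal distribution (all quantities here are invented).

```python
import numpy as np

rng = np.random.default_rng(4)
# Each observation is the sum of 50 small uniform (non-normal) fluctuations
fluctuations = rng.uniform(-1, 1, size=(10_000, 50))
sums = fluctuations.sum(axis=1)

# The distribution of the sums is close to normal (check a histogram or skew/kurtosis)
print(sums.mean(), sums.std())
```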
Explain common model assumptions
○ Validity
§ Ensure data is relevant to the research question
○ Representativeness
§ Want sample to represent population as well as possible
○ Additivity and linearity
§ Most important mathematical assumption
§ Want the non-error part of function to be a linear model
○ Independence of errors
○ Equal variance of errors
§ Also referred to as homogeneity/homoscedasticity
○ Normality of errors
Explain moderation vs mediation
Moderation
○ Situations where the relationship between two variables depends on another variable
Mediation
○ One variable affects another indirectly, through a third variable
○ Inherently causal
○ X causes M causes Y
○ Maybe X also causes Y, but it doesn’t have to
Explain mediation effects
Mediating variables transmit the effect of an independent variable on a dependent variable
Mediation is important in many psychological studies
It is the process whereby one variable acts on another through an intervening (mediating) variable
○ Eg Theory of Reasoned Action
§ Attitudes cause intentions, which cause behaviour
Simplest mediation model contains three variables
○ Predictor variable X
○ Mediating variable M
○ Outcome variable Y
○ Causal model
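The 'regression equations' referred to in the causal steps card below are the standard simple-mediation equations; a sketch in LaTeX notation (the intercepts i and error terms e are generic labels, not symbols from the course notes):

```latex
Y = i_1 + cX + e_1            % total effect of X on Y
M = i_2 + aX + e_2            % X predicting the mediator M
Y = i_3 + c'X + bM + e_3      % direct effect c' and mediator effect b
```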
What is the causal steps approach to mediation
○ Based on the regression equations above, and involves 4 requirements
§ X directly predicts Y (ie coefficient c is significant)
§ If c is significant, X directly predicts M (ie coefficient a is significant)
§ M directly predicts Y (coefficient b is significant)
§ When both X and M predict Y, the effect of X is either:
□ Reduced (coefficient c’ is smaller than c, though both remain significant), and there is partial mediation, or
□ Eliminated (ie coefficient c’ is not significant) - then there is full mediation
○ If any of the four requirements is not met, stop the analysis - mediation is not established
What is the Baron and Kenny approach to mediation?
Independent regressions of the IV to DV, IV to MV, and IV + MV to DV. Mediation occurs if the effect of the IV is reduced when MV is introduced, but only if the IV was significant in the first place
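A sketch of these regressions in Python with statsmodels; the variables X, M, Y and the data-generating values are placeholders, not from the notes.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
X = rng.normal(size=300)
M = 0.5 * X + rng.normal(size=300)            # fabricated mediation structure
Y = 0.4 * M + 0.1 * X + rng.normal(size=300)
df = pd.DataFrame({"X": X, "M": M, "Y": Y})

total = smf.ols("Y ~ X", df).fit()            # step 1: total effect c
a_path = smf.ols("M ~ X", df).fit()           # step 2: path a
full = smf.ols("Y ~ X + M", df).fit()         # steps 3-4: b and reduced c'

print(total.params["X"], full.params["X"])    # compare c with c'
```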
What is the basic idea behind Principal Component Analysis
○ We have multivariate data (we’ve measured participants on multiple variables)
○ We fit straight lines to our data and call the lines ‘Principal Components’ (PCs)
○ 1st PC is the best line we can fit
○ 2nd PC is second best line we can fit etc
○ Maximum number of PCs = number of variables in our dataset
○ We want to represent our data with fewer PCs
○ Takes correlated continuous variables and reduces them to the smallest number of components while retaining as much of the information in the data as possible
○ Aims to fit straight lines to data points
§ The second-best line fits the errors left over from the first component
§ Eg reducing alcohol dryness and alcohol content into one component, while still describing the drinks reasonably fully
□ The first line minimises the perpendicular (diagonal) distances between the data points and the principal component line
□ The second (worst) line will always be perpendicular/orthogonal to the first principal component
* Can also be thought of in terms of dimensions
○ If you have n variables, you are in n dimensional space
○ Maybe there are new axes that make life simpler
○ Maybe you don’t need the full n components to describe your data well
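A minimal PCA sketch in Python; the two correlated variables (standing in for, say, dryness and alcohol content) are simulated here.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(6)
dryness = rng.normal(0, 1, 200)
content = 0.8 * dryness + rng.normal(0, 0.5, 200)   # correlated with dryness
X = np.column_stack([dryness, content])

pca = PCA(n_components=2).fit(X)
print(pca.explained_variance_ratio_)  # most variance sits on the first component
scores = pca.transform(X)             # data re-expressed on the new axes
```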
Explain the MAP test for determining how many components to extract in PCA
○ Velicer devised a test based on partial correlations known as the Minimum Average Partial Correlation test
§ After each component is extracted, it (and those extracted before it) gets partialled out of the correlation matrix of the original variables, and the average of the resulting partial correlations is calculated
§ As more components are partialled out, the average partial correlation approaches 0
§ But at some point, components that reflect ‘noise’ start being partialled out, so the average partial correlation begins to rise
§ Choose number of components corresponding to minimum average partial correlation
Describe component rotation
○ With n variables, you can have n components
§ These are completely determined and follow directly from the matrix operations
○ But if we only use a smaller number of components, there is some freedom in the final solution
§ In particular, we can rotate components to get a simple structure
§ Axes get rotated until the variables are tending to load on one component only, to as great an extent that is possible, and they are as close to 0 on the other components as possible
§ With large component loadings for some variables, and small component loadings for others
§ Used more in FA than PCA
What are different types of component rotations?
○ Orthogonal
§ Components/factors remain uncorrelated
§ Quartimax simplifies the variable pattern of the loadings
§ Varimax (most common method) simplifies factor patterns of loadings
§ Equamax is a compromise between variable and factor pattern simplification
○ Oblique
§ Components/factors correlated
§ Direct Oblimin
§ Promax
§ Both offer control of the degree of correlation of factors
□ Direct oblimin
® Delta (ranging from −0.8 to 0.8)
□ Promax
® Kappa (ranging from 1 upwards)
*Oblique rotation is recommended because correlated factors are more realistic
What matrices should be interpreted for orthogonal rotation?
Rotated component/factor matrix
What matrices should be interpreted for oblique rotation?
Pattern, structure, component/factor correlation matrix
What are the assumptions of the common factor model in EFA?
-Common factors are standardised
-Common factors are uncorrelated
-Specific factors are uncorrelated
-Common factors are uncorrelated with specific factors
Explain the rationale behind partial correlation
○ Suppose a correlation of 0.615 between items
1. ‘Don’t mind being the centre of attention’
2. ‘Feel comfortable around other people’
○ Correlation between item 1 and extraversion is 0.82
○ Correlation between item 2 and extraversion = 0.75
○ The aim is to find a latent or unobserved variable which, when correlated with the observed variables, leads to partial correlations between the observed variables that are as close to 0 as we can get
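The numbers in this card work out exactly; a quick check using the standard partial-correlation formula.

```python
import numpy as np

r12, r1f, r2f = 0.615, 0.82, 0.75   # item1-item2, item1-factor, item2-factor

# Partial correlation of item 1 and item 2, controlling for the latent factor
partial = (r12 - r1f * r2f) / np.sqrt((1 - r1f**2) * (1 - r2f**2))
print(partial)   # 0.82 * 0.75 = 0.615, so the partial correlation is exactly 0
```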
What are some practical issues for EFA?
○ Interval or ratio data
§ If proceeding with ordinal data, say that you are aware of it being problematic, but that you are continuing anyway for the sake of the assignment
○ Adequate sample size
○ Any missing data dealt with
§ Either impute the missing data or delete the cases
○ Decently high correlations
○ Linearity
§ Misleading results for non-linear relationship
§ Look at scatterplots
§ Don’t bother converting data to linear relationships in assignment
○ Weak partial correlations
○ Absence of outliers
○ Absence of multicollinearity/singularity
○ Distribution appropriate to the method used to extract the factors
What is the Guttman-Kaiser image approach?
-Used in EFA
○ Image analysis involves partitioning of the variance of an observed variable into common and unique parts, producing
§ Correlations due to common parts,
□ Image correlations
§ Correlations due to unique parts
□ Anti-image correlations (should be near 0)
What is the difference between PCA and EFA?
○ Principal components are ‘just’ linear combinations of observed variables. Factors are theoretical entities (latent variables)
○ In FA, error is explicitly modelled; in PCA it isn’t
○ If another factor is added (or removed), the factor loadings of the others will change
§ Not the case in PCA: if another component is added (or removed), the other component loadings stay the same
○ Unlike PCA, FA is a theoretical modelling method, and we can test the fit of our model
○ FA fragments variability into common and unique parts, PCA doesn’t
○ PCA runs using a single canonical algorithm and it always works. FA has many algorithms (some may not work with your data)
What are the similarities between EFA and PCA?
-Both have same general forms
-They deliver similar results especially if number of variables is large
-If you loosely define ‘factor analysis’ as a method for suggesting underlying traits, PCA can do that too
How to know whether to use PCA or EFA?
-Run EFA if you wish to test a theoretical model of latent factors causing observed variables
-Run PCA if you want to simply reduce your correlated observed variables to a smaller set of important uncorrelated composite variables
What is Widaman’s conclusion regarding using PCA vs EFA
‘the researcher should rarely, if ever, opt for a component analysis of empirical data if their goal was to interpret the patterns of covariation among variables as arising from latent variables or factors’
Explain Structural Equation Modelling in a nutshell
An umbrella term for a set of statistical techniques that permit analysis of relationships between one or more IVs and DVs, in possibly complex ways
○ Also known as causal modelling, causal analysis, simultaneous equation modelling, analysis of covariance structures
○ Special types of SEM include confirmatory factor analysis and path analysis
SEM enables a combined analysis that otherwise requires multiple techniques
○ For example, factor analysis and regression analysis
The modelling of data by joining equations [1] and [2] is known as structural equation modelling
That aspect of the model concerned with equation [2] is often called the measurement model
That part focusing on equation [1] is known as the structural model
If the structural model contains observed variables but no latent factors, we are doing a path analysis
Explain the difference between Confirmatory factor analysis and Exploratory Factor Analysis
Exploratory factor analysis can impose two kinds of restrictions
○ Could restrict the number of factors
○ Constrain the factors to be uncorrelated with an orthogonal rotation
Confirmatory factor analysis can restrict factor loadings (or factor correlations of variance) to take certain values
○ A common value: zero
○ If factor loading was set to zero, the hypothesis is that the observed variable score was not due to the factor
Moreover,
○ Using maximum likelihood and generalised least squares estimation, CFA has a test of fit
○ So, it’s possible to test the hypothesis that the factor loading is zero
○ If the data fit the model, hypothesis is supported
○ Hence name confirmatory factor analysis
-CFA provides us with a confirmatory analysis of our theory
What are some issues with CFA?
Sample size
○ Wolf et al. (2013) show ‘one size fits all’ rules work poorly in this context
○ Jackson, (2003), provides support for the N:q rule
§ Ratio of cases (N) to parameters being estimated (q)
§ >20:1 recommended
§ <10:1 likely to cause problems
○ Absolute sample size harder to assess
§ N=200 is common but may be too small
§ Barrett (2007) suggests journal editors routinely reject any CFA with N<200
Significance testing
○ Kline (2016) reports diminished emphasis on significance testing because
§ Growing emphasis on testing the whole model rather than individual effects
§ Large-sample requirement means even trivial effects may be statistically significant
§ P-value estimates could change if we used a different method to estimate model parameters
§ Greater general awareness of issues with significance testing
Distributional assumptions
○ The default estimation technique (maximum likelihood) assumes multivariate normality
§ Possible to transform variables to obtain normality
§ Widaman (2012): maximum likelihood estimation appears relatively robust to moderate violations of distributional assumptions
§ Some robust methods of estimations are available
○ CFA generally assumes continuous variables
§ Some programs have special methods for ordered categorical data
Identification
○ Necessary but insufficient requirements for identification
§ Model degrees of freedom must be greater than or equal to 0
§ All latent variables must be assigned a scale
§ Estimation is based on solving of a number of complex equations
○ Constraints need to be placed on the model (not the data) in order for these equations to be solved unambiguously
Model is identified if it’s theoretically possible for a unique estimate of every model parameter to be derived
Explain the differences between underidentified, just-identified, and overidentified - which is ideal?
Underidentified:
-Not possible to uniquely estimate all the model’s free parameters (usually because there are more free parameters than observations). Need to respecify your model
Just-identified:
-Identified and has the same number of observations as free parameters (model df = 0). There is a single solution, and the model will reproduce your data exactly, so it won’t test your theory
Over-identified:
Identified and has more observations than free parameters (df ≥ 1). Permits discrepancies between model and data, so it permits a test of model fit, and of theory. This is the ideal
What are some methods for estimation of a CFA model?
○ As for EFA, the most commonly used are
§ Unweighted least squares
§ Generalised least squares, and
§ Maximum likelihood
○ ML is often preferred, but assumes normality
○ Some more exotic methods for handling special types of data are available (but not taught in this course)
○ If you’re picking between two methods and they yield substantially different results, report both
What are the global fit statistics for a CFA model and their cut-offs?
Chi-square:
-If chi-square is significant, that could just be because you have a large sample size, so it doesn’t necessarily mean the model is bad; if it isn’t significant, that could be because of a small sample size (low power), so take it with a grain of salt
-Chi-square is 0 with perfect model fit and increases as model misspecification increases
-p = 1 with perfect model fit and decreases as model misspecification increases
Standardised Root Mean Square Residual (SRMSR):
-should be less than .08
-Transforms the sample and model-estimated covariance matrices into correlation matrices
Comparative Fit Index (CFI):
-should be above .95
-Compares your model with a baseline model - typically the independence (null) model
Tucker-Lewis Index (TLI)
-Also known as the non-normed fit index (NNFI)
-Relatively harsher on complex models than the CFI
-Unlike the CFI, it isn’t normed to 0-1
-Highly correlated with CFI so don’t report both
Root Mean Square Error of Approximation (RMSEA):
-Less than .05 indicates good fit; over .10 is unacceptable
-Acts to ‘reward’ models analysed with larger samples, and models with more degrees of freedom
Discuss the components of CFA, Path Analysis, and ‘full’ SEM
CFA:
-not a structural model
-is a measurement model
-has latent variables
-has observed variables
Path analysis
-Is a structural model
-Is not a measurement model
-Does not have latent variables
-Has observed variables
‘full’ SEM:
-Is a structural model
-is also a measurement model
-has latent variables
-has observed variables
When will correlation and regression be the same?
The regression slope and the correlation will only be the same if the standard deviation of x equals the standard deviation of y (since the slope is b = r × sy/sx, the ratio sy/sx is then 1), eg when both variables are standardised
What are Path models?
Path models are expressed as diagrams
The drawing convention is the same as in confirmatory factor analysis
○ Observed variables are drawn as rectangles
○ Unobserved variables as circles/ellipses
○ Relations are expressed as arrows
§ Straight, single headed arrows are used to indicate causal or predictive relationships
§ Curved, double-headed arrows are used to represent a non-directional relationship such as correlation or covariance
What are two types of path models?
○ Recursive
§ Simpler
§ Unidirectional
§ The residual error terms are independent
§ Such models can be tested with a standard multiple regression
○ Non-recursive
§ Can have
□ Bidirectional paths
□ Correlated errors
□ Feedback loops
§ Such models need structural equation software to fit them
How can you model data for a Path analysis?
○ Can be done via
§ Multiple regression
§ Structural Equation Modelling
Compare regressions and SEM
Regression weights agree perfectly, but
Standard errors differ
Standardised regression weights differ
The squared multiple correlation is rather less in SEM
And we did get a warning regarding the uncorrelated predictors
Multiple regression must model the correlations among the independent variables, although this is not shown
○ A path analytic representation is thus a much more accurate representation
§ And gives more information
Describe McCallum and Austin’s (2000) idea behind assessing model fit
They essentially say that no model is ever going to be perfect, so the best you can ask for is a parsimonious, substantively meaningful model that fits the observed data adequately well. But at the same time you also need to realise that there will be other models that do just as good of a job. So finding a good fit does not imply that a model is correct or true, but plausible.
What do model test statistics seek to find?
§ ‘Is the variance-covariance matrix implied by your model sufficiently close to your observed variance-covariance matrix that the difference could plausibly be due to sampling error?’
What are approximate fit indices?
○ Approximate fit indices ignore the issue of sampling error and take different perspectives on providing a continuous measure of model-data correspondence
○ Three main flavours available under the ML estimation
§ Absolute fit indices
□ Proportion of the observed variance-covariance matrix explained by the model
□ Eg SRMR
§ Comparative fit indices
□ Relative improvement in fit compared to a baseline
□ Eg CFI
§ Parsimony-adjusted indices
□ Compare model to observed data but penalise models with greater complexity
□ Eg RMSEA
What are some limitations of global fit statistics?
○ Kline (2016) six main limitations of global fit statistics
§ They only test the average/overall fit of a model
§ Each statistic reflects only a specific aspect of fit
§ They don’t relate clearly to the degree/type of model misspecification
§ Well-fitting models do not necessarily have high explanatory power
§ They cannot indicate whether results are theoretically meaningful
§ Fit statistics say little about person-level fit
Explain tests for local fit and why they are used
Growing recent acknowledgement that good global fit statistics can hide problems with local fit, ie poor fit in specific parts of your model
Various methods of testing local fit, some quite complex, described in Thoemmes et al. (2018)
Some simpler methods are also possible
○ Examining Modification Indices
○ Examining Residual Covariances
When examining the residual covariances as a test for local fit, what matrices can we look at, and which one do we want to use?
-Sample covariances: our input variance-covariance matrix
-Implied covariances: the model-implied variance-covariance matrix
-Residual covariances: differences between sample and implied covariances
-Standardised residual covariances: ratios of covariance residuals over their standard errors. This is the one we want to use
What happens when assumptions are not met?
Model can be incorrectly rejected as not fitting
Standard errors will be smaller than they really are (ie parameters may seem significant when they are not)
Solve these problems through bootstrapping
○ To assess overall fit: Bollen-Stine test
○ To obtain accurate standard errors: naive bootstrap
Explain bollen-stine bootstrapping
The parent sample gets transformed so that its covariance matrix fits the model perfectly (the chi-square for the transformed data would be 0). Bootstrapped samples drawn from it will fit pretty well because their parent sample has perfect fit, but they won’t fit exactly, so a model is said to have good fit if it performs better than at least 5% of the bootstrapped samples
Explain naive bootstrapping
Take new samples from the observed dataset
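A bare-bones sketch of naive bootstrapping, here just for the standard error of a mean rather than a full SEM; the data are simulated for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
data = rng.normal(50, 10, 120)   # stand-in for the observed dataset

# Resample cases with replacement and recompute the statistic each time
boot_means = np.array([
    rng.choice(data, size=len(data), replace=True).mean()
    for _ in range(2000)
])
print(boot_means.std())   # bootstrap estimate of the standard error of the mean
```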
Explain linear vs non-linear models
Linear models
○ Changes in x produce the same changes in y regardless of the value of x
○ Eg
§ If someone’s height increases from 100 to 110, we predict an increase in weight from 31 to 37.6 (+6.6) kg
§ If their height increases from 200 to 210, we would predict a weight increase from 97 to 103.6kg (+6.6kg)
Non-linear models
○ Changes in x produce change in y that depends on the value of x
There are many cases where linear models are inappropriate
○ Not everything increases or decreases without bounds
§ Sometimes we have a lower bound of zero
§ Sometimes we might have an upper bound of some kind
○ Not everything changes by the same amount every time
§ Negatively accelerated functions: learning over time, forgetting over time, increase in muscle mass with training etc
§ Positively accelerated functions (eg exponential growth): spread of infections, population growth etc
What is logistic regression?
Regression on binary outcomes
○ What sorts of things have two outcomes?
§ Predicting whether someone is alive or dead
§ Predicting whether or not a student is a member of a group
§ Predicting a participant’s two choice data
□ Accuracy! There are many cases where responses are scored either correct or incorrect
□ Yes vs no responses
□ Category A vs Category B (categorisation)
□ Recognise vs not-recognise (recognition memory)
-Instead of predicting Y=0 or 1 directly, we model the probability of Y=1 occurring; this is a continuous quantity ranging between 0 and 1
-Specifically, we model the log odds of obtaining Y=1
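A small logistic regression sketch with statsmodels; the predictor ('hours') and the data-generating values are invented, and the fitted coefficients are on the log-odds scale.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)
hours = rng.uniform(0, 10, 300)                       # invented predictor
p_correct = 1 / (1 + np.exp(-(-2 + 0.6 * hours)))     # true logistic relationship
correct = rng.binomial(1, p_correct)                  # binary outcome (0/1)

df = pd.DataFrame({"hours": hours, "correct": correct})
fit = smf.logit("correct ~ hours", df).fit()
print(fit.params)          # coefficients on the log-odds scale
print(np.exp(fit.params))  # exponentiating gives odds ratios
```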
What are log odds?
-We predict the logarithm of the odds as a regression
Difference between log odds and regular odds
Odds:
-P(Y=1)/P(Y=0)
-Suppose P(Y=1) = 0.8, then P(Y=0) = 0.2
Odds = 0.8/0.2 = 4
If odds >1 then Y=1 is a more probable outcome than Y=0. If the odds = 1 then it’s 50/50. If odds < 1, then Y=0 is more probable than y=1.
Bounded below at 0 - odds can only be positive
Log Odds:
Log P(Y=1)/P(Y=0)
Log odds are unbounded
If log odds>0 then odds > 1 so Y=1 is more probable
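A quick numeric check of the odds and log odds from the example above.

```python
import numpy as np

p1 = 0.8                      # P(Y = 1)
odds = p1 / (1 - p1)          # 0.8 / 0.2 = 4
log_odds = np.log(odds)       # about 1.386; unbounded, and > 0 so Y = 1 is more probable
print(odds, log_odds)
```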
What is the generalised linear model?
Has the same form as the linear model, but the left-hand side is now written as a function of Y: f(Y) = a + b1x1 + b2x2 + … + bnxn + e (linear model form).
This function is called the link function, and is sometimes written as μ (mu).
Choosing an appropriate link function allows linear techniques to be employed even when the data are not linear.
What are some links for the generalised linear model?
Identity link:
-μ = Y
-This gives the linear model
Logistic link:
-μ = log[P(Y=1)/P(Y=0)]
-Used for binary variables
-Gives logistic regression model
Logarithm link:
-μ = log(Y)
-used for counts or frequencies
-gives loglinear model
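As a sketch of the logarithm link in practice, here is a Poisson GLM for count data with statsmodels; the predictor, the counts, and the data-generating values are invented.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(9)
exposure_hours = rng.uniform(1, 20, 200)                   # invented predictor
counts = rng.poisson(np.exp(0.1 + 0.12 * exposure_hours))  # counts generated on the log scale

df = pd.DataFrame({"hours": exposure_hours, "count": counts})
fit = smf.glm("count ~ hours", df, family=sm.families.Poisson()).fit()
print(fit.params)   # coefficients are on the log (link) scale
```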