ERP Flashcards
Validity
The extent to which a measurement correctly represents the concept under study (does it measure what we want it to measure?)
Internal Validity
The extent to which the study establishes a trustworthy cause-and-effect relationship between a treatment and an outcome.
External Validity
The extent to which the results of a study can be generalised to other situations
Accuracy
How close the measurement in the study is to the actual value.
Reliability
How consistent the results are when the measurement is repeated.
Cross-sectional data
Observations at a given point in time.
Panel data
Collection of observations of multiple subjects at multiple points in time.
Questions the researcher has to deal with
- Type of data source (primary vs secondary)
- Type of measure (nominal etc)
- Level of analysis (firm, strategic business unit, inter-organisational)
Primary data
- By researcher
- Subjective
- More customised to study
- More expensive and time-consuming
Secondary data
- By other agents
- More objective
- Less customised
- Cheaper
Types of performance measures
- Financial performance (profitability)
- Operational performance (market share, efficiency)
- Overall effectiveness
Selection Bias
When a sample is fundamentally different from the population.
Resource Bias
The researcher chooses a secondary source because it is cheaper and less time-consuming
Popularity Bias
The researcher chooses popular variables instead of the right ones to measure what is necessary
Convenience Bias
Researchers use easily available measures
Elements of Descriptive/Summary statistics
- Number of observations (N)
- Measure of central tendency
- Skewness
- Kurtosis
- Max/Min/Range
- SD/Variance
Best measure of central tendency for nominal data
Mode
Values that show significant skewness
Outside -1 to 1
Which measure of central tendency to use if distribution is skewed/not skewed
Skewed - Median
Not skewed - Mean
What is skewness
The degree of asymmetry of a distribution. A symmetric distribution such as the normal distribution has a skewness of 0
What is Kurtosis
Degree to which observations cluster in the tails of the distribution
Kurtosis values
Less than 3: platykurtic
3: normal/mesokurtic
More than 3: leptokurtic
Graphical technique for showing kurtosis and skewness
Histogram
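The summary statistics above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `describe` helper and the `incomes` data are made up for the example, and the moments use the population (divide by N) formulas, under which a normal distribution has a kurtosis of 3.

```python
import math
import statistics


def describe(values):
    """Summarise a sample using population (divide-by-N) moment formulas."""
    n = len(values)
    mean = statistics.fmean(values)
    # Second, third and fourth central moments.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    # Rule of thumb from the cards: skewness outside -1 to 1 is significant.
    skewed = abs(skewness) > 1
    return {
        "n": n,
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "sd": math.sqrt(m2),
        "skewness": skewness,
        "kurtosis": kurtosis,
        "skewed": skewed,
        # Median for skewed data, mean otherwise.
        "centre": statistics.median(values) if skewed else mean,
    }


# Hypothetical incomes with one very large value in the right tail.
incomes = [20, 22, 23, 25, 26, 28, 30, 95]
summary = describe(incomes)
print(summary)
```

Here the single large value pulls the skewness above 1, so the median rather than the mean is reported as the measure of central tendency.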
Dummy variable
Binary variable expressing whether a condition is fulfilled
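A minimal sketch of building a dummy variable in Python. The firms and the 250-employee cut-off are hypothetical and only serve to show the 0/1 coding.

```python
def to_dummy(condition_met):
    """Return 1 if the condition is fulfilled, otherwise 0."""
    return 1 if condition_met else 0


# Hypothetical firms; the 250-employee threshold is an assumption.
firms = [
    {"name": "A", "employees": 1200},
    {"name": "B", "employees": 40},
    {"name": "C", "employees": 250},
]
large_firm = [to_dummy(f["employees"] >= 250) for f in firms]
print(large_firm)  # [1, 0, 1]
```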
When comparing models, search for the highest…
R squared
Assessing your model
Goodness of fit
Coefficient of determination
To interpret correlation coefficient
Sign, size, significance
Adjusted R squared
Takes into account the number of variables and degrees of freedom
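A small sketch of how R squared and adjusted R squared relate, assuming the usual definitions R² = 1 - SS_res/SS_tot and adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of independent variables. The function names and data are illustrative only.

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by `predicted`."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_y) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalise R squared for the number of independent variables used."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical data: fitted values from a one-variable regression.
actual = [2, 4, 5, 4, 5]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]
r2 = r_squared(actual, predicted)
adj_r2 = adjusted_r_squared(r2, n_obs=len(actual), n_predictors=1)
print(round(r2, 3), round(adj_r2, 3))
```

Adjusted R squared is never larger than R squared, which is why it is the safer figure when comparing models with different numbers of variables.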
How to interpret the regression coefficient
The amount of change in y due to a change in x, ceteris paribus
When can we reject H0
When the p-value is smaller than the significance level. It means that x has a statistically significant effect on y
Multicollinearity
High correlation between at least two independent variables
Values of problematic multicollinearity
Usually a correlation above 0.7, although some literature says 0.5
Consequences of multicollinearity
- Hard to distinguish individual impact
- Larger standard errors, so more coefficients appear insignificant
- Nonsensical coefficients
Variance Inflation Factor
Quantifies severity of multicollinearity.
A value above 10 indicates problematic multicollinearity
Potential causes of multicollinearity
- Lagged variables (income last year and this year)
- Similar phenomena (unemployment, poverty)
Solutions to multicollinearity
- Increase sample size
- Drop one of the variables
- Transform correlated variable (log transformation or composite variable)
Heteroscedasticity
Changing error variance over the range of observations
Formula for VIF
VIF = 1 / (1 - R²), where R² comes from regressing that independent variable on the other independent variables
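A minimal sketch of the VIF formula for the simplest case of two independent variables. With only one other predictor, the R² of the auxiliary regression equals the squared Pearson correlation, so no regression library is needed. The function names and income figures are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF of x1 when x2 is the only other independent variable."""
    r2 = pearson_r(x1, x2) ** 2  # R squared of regressing x1 on x2
    return 1 / (1 - r2)


# Hypothetical lagged variables: income last year and this year.
income_last_year = [30, 35, 40, 45, 50, 55]
income_this_year = [31, 37, 41, 44, 52, 56]
vif = vif_two_predictors(income_last_year, income_this_year)
print(round(vif, 1))
```

For these strongly related series the VIF is far above 10. With more than two independent variables, R² would have to come from a multiple regression instead.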
How to check for heteroscedasticity
- Breusch-Pagan test
- White test
- Scatterplot of residuals
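As a rough illustration of the first check, the sketch below computes the studentised (Koenker) form of the Breusch-Pagan statistic, LM = n · R², for a regression with a single independent variable. R² is taken from regressing the squared residuals on x, and the statistic is compared with a chi-squared distribution with 1 degree of freedom. The helper names and data are assumptions for the example, not a full implementation of the test.

```python
import math


def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y):
    """R squared of the simple regression of y on x."""
    intercept, slope = ols_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def breusch_pagan(x, y):
    """Return (LM statistic, p-value) for one independent variable."""
    intercept, slope = ols_fit(x, y)
    squared_residuals = [(yi - (intercept + slope * xi)) ** 2
                         for xi, yi in zip(x, y)]
    # Guard against a tiny negative R squared caused by rounding.
    lm = max(0.0, len(x) * r_squared(x, squared_residuals))
    # Survival function of a chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(lm / 2))
    return lm, p_value


# Hypothetical data whose error size grows with x.
x = list(range(1, 21))
y = [2 + 3 * xi + (0.5 * xi if xi % 2 == 0 else -0.5 * xi) for xi in x]
lm, p_value = breusch_pagan(x, y)
print(lm, p_value)
```

A small p-value rejects the null hypothesis of constant error variance, pointing to heteroscedasticity.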
Solutions for heteroscedasticity
- Weighted least squares (observations with a higher variance get a lower weight in determining the regression coefficient)
- Calculate robust standard errors
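A minimal sketch of weighted least squares for one independent variable, assuming the error variance of each observation is known so that the weight can be set to 1 / variance. The function name, data and variances are hypothetical.

```python
def weighted_least_squares(x, y, weights):
    """Return (intercept, slope) minimising the weighted squared errors."""
    total_w = sum(weights)
    # Weighted means of x and y.
    mean_x = sum(w * xi for w, xi in zip(weights, x)) / total_w
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / total_w
    sxx = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - mean_x) * (yi - mean_y)
              for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: the last two observations are noisier.
x = [1, 2, 3, 4, 5]
y = [3.1, 4.9, 7.2, 8.5, 12.0]
variances = [1, 1, 1, 4, 9]          # assumed known error variances
weights = [1 / v for v in variances]  # higher variance, lower weight
intercept, slope = weighted_least_squares(x, y, weights)
print(intercept, slope)
```

With equal weights the function reduces to ordinary least squares.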
Outlier
A data point that does not follow the general trend of the data.
Solutions to outliers
- Remove outlier
- Trim data set
There is omitted variable bias if…
- An excluded variable has an effect on the dependent variable
- There is endogeneity (the excluded variable is correlated with an included variable, so the error term is correlated with that variable)
Robustness/Sensitivity analysis
How sensitive the results are to changes in the model.
Test with different combinations of variables, data sets and time frames. Do the results remain the same?
Tips when using moderation
- Include all interaction terms if hypothesis is conditional
- Include all constitutive variables
- Do not interpret constitutive variables as if they are unconditional
- Calculate meaningful marginal effect and standard error
When to use a moderator
When the hypothesis is conditional (the relationship between two variables is dependent on the value of another variable)
Marginal effect of x
∂y/∂x = β2 + β4z (for the model y = β1 + β2x + β3z + β4xz + ε)
Marginal effect of z
∂y/∂z = β3 + β4x
Issues with traditional results tables when there is a moderator
- β2 only captures effect of x when z = 0
- Significance of β2 is only valid if z = 0
- Even if β4 is insignificant, the marginal effect of x may still be significant for some values of z
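A short sketch of calculating a meaningful marginal effect and its standard error, assuming the interaction model y = β1 + β2x + β3z + β4xz + ε. Under that model the marginal effect of x is β2 + β4z and its variance is var(β2) + z²·var(β4) + 2z·cov(β2, β4). All coefficient and covariance values below are hypothetical.

```python
import math


def marginal_effect_of_x(beta_x, beta_xz, z):
    """Effect of a one-unit change in x at a given value of z."""
    return beta_x + beta_xz * z


def marginal_effect_se(var_beta_x, var_beta_xz, cov_x_xz, z):
    """Standard error of the marginal effect of x at a given z."""
    return math.sqrt(var_beta_x + z ** 2 * var_beta_xz + 2 * z * cov_x_xz)


# Hypothetical estimates: beta_x is β2, beta_xz is β4.
beta_x, beta_xz = 0.5, 0.3
var_beta_x, var_beta_xz, cov_x_xz = 0.04, 0.01, -0.005

for z in (0, 1, 2):
    effect = marginal_effect_of_x(beta_x, beta_xz, z)
    se = marginal_effect_se(var_beta_x, var_beta_xz, cov_x_xz, z)
    print(z, round(effect, 3), round(se, 3), round(effect / se, 2))
```

At z = 0 the result collapses to β2 and its own standard error, which is all a traditional results table reports. The other rows show how both the effect and its significance change with z.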
What is factor analysis
A technique to reduce large amounts of information into a simple message with minimal information loss.
Factor rotation
Treat the factors as axes and rotate them so that each variable loads maximally onto one factor and minimally onto the others.
Eigenvalues
- Importance of a factor
- Keep only factors with an eigenvalue above 1 (Kaiser’s criterion)
Scree plot
Keep only the factors that are to the left of the inflection point
What rotation procedure should be used?
- Orthogonal - factors independent
- Oblique - factors correlated
Reliability analysis
- Degree of consistency
- Cronbach’s alpha, above 0.7 or 0.8 to be reliable
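A minimal sketch of Cronbach's alpha, assuming the standard formula alpha = k/(k - 1) · (1 - sum of item variances / variance of total scores), with k the number of items. The function name and survey scores are hypothetical.

```python
import statistics


def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order."""
    k = len(items)
    sum_item_variances = sum(statistics.variance(scores) for scores in items)
    # Total score of each respondent across all items.
    totals = [sum(responses) for responses in zip(*items)]
    total_variance = statistics.variance(totals)
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical answers of five respondents to three survey items.
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 5],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

For this made-up scale alpha is about 0.89, above the 0.7 to 0.8 threshold.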
Exploratory factor analysis
Searching for a structure of variables with the goal of identifying the right factors
Confirmatory factor analysis
You have a preconceived idea of the structure of the data and want to check whether it is right or not