Coali Flashcards

Question

What is Cronbach's Alpha?

Answer 1

Cronbach's Alpha is a measure of internal consistency, often used as an estimate of the reliability of a psychometric test for a sample of examinees. It assesses how closely related a set of items are as a group.

**Internal Consistency**: The degree to which all items in a test measure the same construct or concept. High internal consistency means that items intended to measure the same general construct yield similar scores.

**Scale Reliability**: Cronbach's Alpha provides a measure of the reliability of a scale. Reliability here refers to the consistency of the measurement, i.e., the degree to which the scale produces stable and consistent results.

**One-Dimensional Concept**: Cronbach's Alpha assumes that the items on the scale measure a one-dimensional construct: all items are supposed to reflect a single attribute, quality, or construct. If a scale measures multiple dimensions, Cronbach's Alpha may not be a suitable measure of reliability, and a factor analysis might be needed to understand the underlying structure.

**Interpretation of the Value**: The value of Cronbach's Alpha typically falls between 0 and 1, and higher values indicate greater internal consistency. Common benchmarks for interpreting the alpha value are:

* α < 0.6: Poor
* 0.6 ≤ α < 0.7: Questionable
* 0.7 ≤ α < 0.8: Acceptable
* 0.8 ≤ α < 0.9: Good
* α ≥ 0.9: Excellent
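As a rough illustration of how the coefficient is computed, the sketch below uses the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the small response matrix are hypothetical and only serve the example.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of k lists, each holding one item's scores for the same respondents."""
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sum of the individual item variances (sample variance, n - 1 denominator).
    item_var_sum = sum(variance(scores) for scores in items)
    # Variance of each respondent's total score across all items.
    totals = [sum(responses) for responses in zip(*items)]
    total_var = variance(totals)
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical data: 3 items answered by 5 respondents on a 1-5 scale.
items = [
    [2, 4, 3, 5, 1],
    [3, 4, 3, 5, 2],
    [2, 5, 4, 4, 1],
]
alpha = cronbach_alpha(items)

# Items that move together perfectly (with equal variance) give alpha = 1.
perfect = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [1, 2, 3, 4, 5]]
alpha_perfect = cronbach_alpha(perfect)
```

With the benchmarks above, the first (made-up) scale would be read as having excellent internal consistency.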

Answer 2

A Semantic Differential Scale is a type of rating scale used to measure the connotative meaning of objects, events, and concepts. The connotative meaning refers to the subjective, emotional, and cultural associations that people might connect with a word or phrase, as opposed to its denotative meaning, which is its literal or dictionary definition. Structure: It typically consists of a series of bipolar adjectives (e.g., happy-sad, efficient-inefficient), with respondents asked to rate the item being evaluated on a scale between two extremes. Odd Number of Alternatives: It often uses an odd number of alternatives so that a neutral or middle point can be identified. This allows respondents who feel ambivalent or neutral about the subject to select a midpoint rather than being forced to lean towards a positive or negative response.

Answer 3

**Likert scales** tend to provide a straightforward method for **gauging the level of agreement with specific statements**, leading to easily quantifiable data, while **semantic differential scales** are better suited for capturing the complexity and richness of attitudes and perceptions (with semantic differential scales, **responses are more subjective, as they are based on the respondent's personal associations with the adjectives**). The choice between the two scales should be based on what you want to measure and which aspects matter most to you. Likert scales might be preferred for direct questions about beliefs or behaviors, while semantic differential scales might be chosen to explore deeper, more nuanced attitudes or the emotional aspects of a concept.

Answer 4

**Types of Interviews** **Structured Interviews**: These involve standardized questions with a logical sequence and formulation, allowing for limited variability in responses. **Semi-Structured Interviews**: These are guided by predefined themes and topics, but the interviewer has the flexibility to adapt their strategy based on the interviewee’s real-time feedback. **Unstructured Interviews**: These are more like free-flowing conversations with open-ended questions that may not be predetermined. **Deeper Understanding**: Interviews are a qualitative research method used to gain a deeper understanding of a problem. **Refine Theory:** They are mainly used to refine theories or identify new attributes relevant to the researcher's area of study. **Crafting Hypotheses**: Information gathered from interviews can help in crafting better hypotheses and formulating questions for subsequent quantitative surveys. **Testing Hypotheses**: Structured interviews can also be used to test hypotheses. **Exploratory Insight**: Exploratory interviews, in particular, are highlighted as an activity that can uncover overlooked insights.

Answer 5

**Question Design**: Questions should be centered on specific topics and phrased to be as neutral as possible to avoid leading questions. **Attitude of Interviewer:** The interviewer's attitude should be neutral, especially when asking for opinions, to encourage honest responses. **Factual Reference**: Questions should refer to facts rather than abstract concepts.

Answer 6

**Defining the Population**: Before choosing a sampling technique, you must define the population under study. This includes identifying relevant traits and characteristics of individuals who will be included.

**Sampling Strategy**: After defining the population, the next step is to choose the most appropriate sampling strategy. There are two broad categories of sampling strategies:

* **Probability Sampling**: In this approach, every unit in the population has a non-zero probability of being selected in the sample, and you can determine this probability.
* **Non-probability Sampling**: Here, some units in the population may have a zero probability of selection, or their probability of selection cannot be determined.

The choice between these sampling strategies will influence how representative the sample is of the population and the kinds of conclusions you can draw from your research.

Answer 7

**Random Sampling:** Each unit in the population has the same probability of being selected. The selection process is a simple random draw. **Systematic Sampling:** You start with a list of all units in the population and select every nth unit to be included in the sample. It's important that there is no pattern in the list that could be related to the outcome under study, as this could bias the results. **Stratified Random Sampling:** The population is divided into subgroups, or "strata," based on a known characteristic or trait. Simple random sampling is then conducted within each stratum. **Cluster Sampling:** The population is divided into clusters (e.g., classrooms in a school). In one-stage cluster sampling, you randomly select entire clusters. In two-stage cluster sampling, you select individual units from within the randomly chosen clusters.
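The four probability designs above can be sketched with Python's standard `random` module. The helper names, the population of 100 numbered units, and the odd/even strata are all hypothetical choices made for illustration.

```python
import random

def simple_random_sample(units, n, rng):
    # Every unit has the same chance of being drawn (without replacement).
    return rng.sample(units, n)

def systematic_sample(units, step, rng):
    # Random start, then every `step`-th unit of the list.
    start = rng.randrange(step)
    return units[start::step]

def stratified_sample(units, stratum_of, n_per_stratum, rng):
    # Group units by stratum, then draw a simple random sample within each.
    strata = {}
    for u in units:
        strata.setdefault(stratum_of(u), []).append(u)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

def one_stage_cluster_sample(clusters, n_clusters, rng):
    # Randomly pick whole clusters and keep every unit inside them.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [u for c in chosen for u in clusters[c]]

rng = random.Random(0)
population = list(range(100))  # hypothetical population of 100 units

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(population, lambda u: u % 2, 5, rng)
clusters = {c: population[c * 10:(c + 1) * 10] for c in range(10)}
clustered = one_stage_cluster_sample(clusters, 2, rng)
```

A two-stage cluster design would add one more step: a simple random draw of units inside each chosen cluster instead of keeping them all.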

Answer 8

**Quota Sampling:** The population is divided into subgroups based on known traits. A predetermined quota of units is selected from each group to match the proportions in the population. Selection within each group is not random. **Purposive Sampling:** The researcher uses their judgment to choose members of the population for the sample. It requires good knowledge of the targeted population. This method is typically used with small sample sizes. **Snowball Sampling:** The sampling process begins with a small group of known individuals who meet the study criteria. These initial subjects recruit future subjects from among their acquaintances. This continues until enough data has been collected. **Convenience Sampling:** The researcher selects units that are easiest to access. It is not considered a robust method due to the high potential for bias. These non-probability sampling methods are used when it is not feasible to select a sample that accurately represents the entire population, often due to constraints such as time, budget, or the nature of the research question.

Answer 9

**Population Specification Error**: Occurs when the population is poorly defined. The population needs to be properly specified to avoid this error. **Sample Frame Error**: Happens when the sampling frame is badly specified. The intention might be to sample from population P, but instead the sample comes from population Q. **Selection Error**: Arises when individuals select themselves into a study. For example, an online survey might only attract those who feel strongly about the topic or who have the time and interest to complete it. This can lead to a sample that is not representative of the entire population. **Non-Response Error**: Relates to differences between those who respond to the survey and those who do not. It is important to control for these differences to avoid bias in the results.

Answer 10

The Frisch-Waugh-Lovell (FWL) theorem states that the following three estimators of β₁ are equivalent:

* the OLS estimator obtained by regressing y on x₁ and x₂
* the OLS estimator obtained by regressing y on x̃₁, where x̃₁ is the residual from the regression of x₁ on x₂
* the OLS estimator obtained by regressing ỹ on x̃₁, where ỹ is the residual from the regression of y on x₂
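The equivalence can be checked numerically. The sketch below is a minimal illustration on simulated data, assuming regressions without an intercept (as if all variables had been demeaned beforehand); the helper names and the data-generating process are hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def slope(y, x):
    """OLS slope of y on a single regressor x (no intercept)."""
    return dot(x, y) / dot(x, x)

def residual(y, x):
    """Residual from regressing y on x (no intercept)."""
    b = slope(y, x)
    return [yi - b * xi for yi, xi in zip(y, x)]

def ols_two(y, x1, x2):
    """OLS coefficients of y on x1 and x2 via the 2x2 normal equations."""
    a11, a12, a22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    b1, b2 = dot(x1, y), dot(x2, y)
    det = a11 * a22 - a12 ** 2
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

# Simulated data with correlated regressors (made-up coefficients).
rng = random.Random(7)
n = 200
x2 = [rng.gauss(0, 1) for _ in range(n)]
x1 = [0.5 * v + rng.gauss(0, 1) for v in x2]
y = [2.0 * a - 1.0 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

beta_full, _ = ols_two(y, x1, x2)      # 1) y on x1 and x2
x1_tilde = residual(x1, x2)            # x1 with x2 partialled out
y_tilde = residual(y, x2)              # y with x2 partialled out
beta_partial_x = slope(y, x1_tilde)    # 2) y on x1_tilde
beta_partial_both = slope(y_tilde, x1_tilde)  # 3) y_tilde on x1_tilde
```

All three estimates agree up to floating-point error, whatever the simulated values are.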

Answer 11

1. Linearity: The relationship between the dependent variable y and the independent variables X is linear.
2. No Perfect Multicollinearity: None of the independent variables is a perfect linear function of any other variables, which ensures that the matrix X has full rank (the rank is equal to the number of predictors K, which is less than or equal to the number of observations N).
3. Random Sampling: The observations are assumed to be a random sample from the population, which supports the idea that the sample represents the population well.
4. Exogeneity of Errors: The conditional expected value of the error term ε, given the independent variables X, is zero. This implies that the error term is uncorrelated with the independent variables.
5. Homoskedasticity: The variance of the error term ε is constant across all levels of the independent variables (no heteroskedasticity).
6. Normality: The distribution of ε is normal.

Answer 12

* Level model (level-level): the coefficient is a marginal effect. If x rises by one unit, y changes by β units.
* Level-log model: the coefficient is a semi-elasticity. If x rises by 1%, y changes by approximately β/100 units.
* Log-level model: the coefficient is a semi-elasticity. If x rises by one unit, y changes by approximately 100·β%.
* Log-log model: the coefficient is an elasticity. If x rises by 1%, y changes by approximately β%.
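The four interpretations follow from differentiating each specification and using the small-change approximation Δln x ≈ %Δx / 100. The summary below is a sketch in which α is an assumed intercept and β the slope on the single regressor x; the approximations are only accurate for small changes.

```latex
\begin{aligned}
\text{level-level:}\quad & y = \alpha + \beta x && \Rightarrow\ \Delta y \approx \beta\,\Delta x \\
\text{level-log:}\quad & y = \alpha + \beta \ln x && \Rightarrow\ \Delta y \approx \frac{\beta}{100}\,(\%\Delta x) \\
\text{log-level:}\quad & \ln y = \alpha + \beta x && \Rightarrow\ \%\Delta y \approx 100\,\beta\,\Delta x \\
\text{log-log:}\quad & \ln y = \alpha + \beta \ln x && \Rightarrow\ \%\Delta y \approx \beta\,(\%\Delta x)
\end{aligned}
```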

Answer 13

In principle, there is nothing wrong with including correlated variables in your model. However, if the correlation is too high, this may lead to estimation problems. Technically, the matrix X'X that we compute for the OLS estimator is close to being non-invertible, leading to unreliable estimates with high standard errors and unexpected signs or magnitudes.

Answer 14

**SUPERFLUOUS VARIABLES**: Your model includes some variables that are "not needed". This is not a huge problem, but if you include too many variables in your model, you reduce the degrees of freedom and the accuracy of your estimates. The risk is to over-fit or over-control your regression. The keyword here is parsimony.

**OMITTED VARIABLES**: Your model does not include variables that should be included. Is this a serious issue? Yes! If you omit variables that have a significant effect on your outcome variable, you are violating assumption 4: the error term contains a component that is correlated with the other regressors.

Solution: it is very difficult to argue that you are controlling for all possible confounders in your model, which is why OLS results can hardly be interpreted causally. There are several ways to address endogeneity problems: changing the estimator or carefully designing your empirical strategy.

Answer 15

Assumption 5, about the conditional variance of the errors, is not met (heteroskedasticity). The OLS estimator is still unbiased and consistent if assumptions 1 to 4 are met, but the usual estimate of its variance is not: the estimated standard errors are biased, and so are the tests based on them. Solution: use heteroskedasticity-robust standard errors.
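To make the consequence concrete, the sketch below compares the classical standard error with a heteroskedasticity-robust one (the HC0, or White, form) in a one-regressor model without intercept. The data-generating process, in which the error spread grows with |x|, is a made-up example, and HC0 is only one of several robust variants.

```python
import math
import random

def ols_slope_with_ses(x, y):
    """Slope of y on x (no intercept) with classical and HC0 robust standard errors."""
    n = len(x)
    sxx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    resid = [yi - beta * xi for xi, yi in zip(x, y)]
    # Classical SE: assumes a single common error variance (assumption 5).
    s2 = sum(e * e for e in resid) / (n - 1)
    classical_se = math.sqrt(s2 / sxx)
    # HC0 SE: each observation contributes its own squared residual.
    meat = sum((xi * e) ** 2 for xi, e in zip(x, resid))
    robust_se = math.sqrt(meat) / sxx
    return beta, classical_se, robust_se

# Simulated heteroskedastic data: the error standard deviation equals |x|.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(5000)]
y = [1.5 * xi + abs(xi) * rng.gauss(0, 1) for xi in x]

beta, classical_se, robust_se = ols_slope_with_ses(x, y)
```

In this setup the slope estimate stays close to the true value of 1.5, while the classical standard error understates the uncertainty relative to the robust one.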

Answer 16

**Advantages**

* Very simple to estimate and interpret (the coefficients are marginal changes in the probability of success).
* Inference is identical to the OLS case.

**Disadvantages**

* The errors εᵢ are no longer normally distributed, and they are heteroskedastic by construction.
* The predicted probabilities can fall outside the [0,1] interval, since the linear probability model (LPM) is not bounded.
* We assume that the effect of each regressor is linear.

Answer 17

**Advantages**

* The predicted probabilities are guaranteed to fall within the [0,1] range, because we model them with a cumulative distribution function (F).
* We do not impose a linear structure on the marginal effects.

**Disadvantages**

* We cannot directly interpret the estimated coefficients as marginal effects.
* We have to use Maximum Likelihood (ML) estimators, which require more computational effort.
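The contrast with the linear probability model can be sketched with a logit, where F is the logistic CDF and the marginal effect of x is β · F(z) · (1 − F(z)). The coefficients `b0 = 0.2` and `b1 = 0.3` are hypothetical values chosen for illustration, not estimates from any data.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def lpm_prob(b0, b1, x):
    # Linear probability model: the prediction is the linear index itself.
    return b0 + b1 * x

def logit_prob(b0, b1, x):
    # Logit: the linear index is passed through the logistic CDF.
    return logistic_cdf(b0 + b1 * x)

def logit_marginal_effect(b0, b1, x):
    # d P(y=1|x) / dx = b1 * F(z) * (1 - F(z)), so it depends on x.
    p = logit_prob(b0, b1, x)
    return b1 * p * (1.0 - p)

b0, b1 = 0.2, 0.3  # hypothetical coefficients

lpm_at_5 = lpm_prob(b0, b1, 5)        # linear prediction, unbounded
logit_at_5 = logit_prob(b0, b1, 5)    # always strictly between 0 and 1
me_at_0 = logit_marginal_effect(b0, b1, 0)
me_at_5 = logit_marginal_effect(b0, b1, 5)
```

The linear prediction exceeds 1 at x = 5, whereas the logit probability stays inside the unit interval, and the logit marginal effect shrinks as the probability approaches 1. That is why the logit coefficient alone is not a marginal effect.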

Answer 18

**Multinomial Models:** These models are used when there are multiple discrete outcomes with no ordinal properties, meaning the outcomes cannot be logically ordered. An example would be different modes of transportation, like bus, tram, bike, walking, or car. The choices in the model should be exhaustive and mutually exclusive, ensuring that every possible outcome is covered and that they do not overlap. Probabilities of all possible outcomes should sum to 1. Factors affecting the probability of each outcome can be analyzed using multinomial probit or logit models. **Ordinal Models:** These models are appropriate when the dependent variable is ordinal, meaning the categories have a logical order (e.g., Likert scale ratings), but the distance between categories is not assumed to be equal. Modeling is similar to the binary case but with multiple thresholds based on the number of categories. A **key technical assumption** is proportional odds, where the relationship between each pair of outcome groups is the same across the levels of the dependent variable: *It means that the coefficients that describe "likely" vs "unlikely" + "somewhat likely" are the same as those that describe "unlikely" vs "likely" + "somewhat likely".*

Answer 19

**Causal Inference**:

1) **Randomized Control Trials** (the best!)
2) **Observational studies with appropriate techniques**:

* **"Conditioned" regressions** (i.e., retrieving causal parameters by conditioning on observables)
* **Instrumental variables**
* **Difference-in-Differences**
* **Regression Discontinuity**

**Prediction**:

1) Supervised Machine Learning (S-ML)

Answer 20

**Causal Effect Interest**: The framework is designed to evaluate the causal effect of a treatment or intervention (denoted as D) on an outcome variable Y. **Treatment Variable**: D is a dummy variable that indicates whether a unit has been treated (1) or not (0). **Potential Outcomes**: Each unit has two potential outcomes, Yi(0) and Yi(1). **Fundamental Problem**: A key problem in causal inference is that we can only observe one of the two potential outcomes for a unit; the other outcome remains counterfactual. We never observe what would have happened to a treated unit had it not been treated, and vice versa. **Causal Effect**: The causal effect of the treatment for an individual unit is the difference between the two potential outcomes, Yi(1) - Yi(0). However, since we cannot observe both outcomes for the same unit, we cannot directly calculate this for an individual. **Generalization to Multiple Units**: When we have multiple units, we can consider the Average Treatment Effect (ATE), the expected difference between the two potential outcomes: ATE = E(Yi(1) - Yi(0)). **Application of the ATE**: Under certain conditions such as random assignment of treatment, which ensures that the treated and control groups are similar in all aspects except for the treatment, the ATE can be computed as the difference in expected observed outcomes between the treated and control groups: E(Yi | Di = 1) - E(Yi | Di = 0).

Answer 21

**Selection Problem**: The issue arises because we observe outcomes only for treated units under treatment conditions and control units under control conditions. We cannot observe what would have happened to each unit had the opposite treatment been applied. **Randomization**: Random assignment ensures that the treatment group and control group are statistically equivalent prior to the application of the treatment. **Expectation of Identical Outcomes**: By randomly assigning units to treatment or control groups, the expected potential outcomes of both groups are identical to the average potential outcomes in the population. **Rationale**: If the control group had been treated, they would, on average, show the same results as the currently treated group due to the random assignment. **In a Perfect Setting**: With perfect randomization and full compliance (every participant adheres to their assigned group), estimating an unbiased and consistent Average Treatment Effect (ATE) can be straightforward. **Estimation Methods**: In such ideal conditions, comparing means between the two groups (using t-tests, ANOVA, etc.) or employing a simple OLS regression can yield an accurate estimate of the ATE.
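A small simulation can make the role of randomization tangible. In the sketch below both potential outcomes are generated for every unit (something that is never possible with real data), treatment is assigned by a fair coin flip, and the difference in observed group means is compared with the true average effect. The constant effect of 2 and the sample size are hypothetical choices.

```python
import random
from statistics import mean

rng = random.Random(1)
n = 10_000

# Potential outcomes for every unit; in practice only one of the two is observed.
y0 = [rng.gauss(0, 1) for _ in range(n)]   # outcome without treatment
y1 = [v + 2.0 for v in y0]                 # outcome with treatment (effect = 2)

# Random assignment: treatment status is independent of the potential outcomes.
d = [rng.random() < 0.5 for _ in range(n)]

# Observed outcome: Y(1) for treated units, Y(0) for control units.
y_obs = [a if treated else b for a, b, treated in zip(y1, y0, d)]

# True ATE uses both potential outcomes (only available in a simulation).
true_ate = mean([a - b for a, b in zip(y1, y0)])

# Difference in means uses observed data only.
treated_mean = mean([y for y, t in zip(y_obs, d) if t])
control_mean = mean([y for y, t in zip(y_obs, d) if not t])
diff_in_means = treated_mean - control_mean
```

Because assignment is random, `diff_in_means` lands close to `true_ate`; if units with high `y0` were more likely to be treated, the same comparison would be contaminated by selection.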

Answer 22

**CIA Defined**: Under the Conditional Independence Assumption (CIA), treatment assignment is independent of the potential outcomes conditional on some covariates (X). This implies that once we control for X, the treatment is as good as randomly assigned, even in a non-experimental setting.

The CIA is crucial when:

* Randomization in an RCT fails, such as when there is self-selection or other threats to validity.
* Conducting observational studies where random assignment is not possible.

Answer 23

**Variance Decreases with Sample Size**: As the sample size n increases, the variance of the parameter estimates decreases, improving the model's estimation.

**Effect of Number of Parameters**: The variance of the model is also influenced by the number of parameters p included. Increasing p can cause the model to become more variable, potentially leading to overfitting.

**Bias-Variance in Practice**:

* In low-dimensional settings (with fewer parameters), aiming for unbiased parameter estimates can be advantageous for prediction.
* In high-dimensional settings (with many parameters relative to the sample size), good parameter estimation may not translate to good predictive performance due to the risk of overfitting.

Answer 24

**Model Selection Dilemma**: Including too many regressors can lead to overfitting, where the model captures the noise rather than the signal in the data. Including too few regressors can result in omitted variable bias, where important variables are left out, leading to biased estimates.

**High Dimensionality**: The problem of model selection becomes more pronounced in high-dimensional settings, where the number of potential regressors (p) is close to or larger than the number of observations (n).

* If p>n, the model cannot be identified because there are more parameters than data points.
* If p=n, the fit of the model will be perfect but meaningless, as it will simply memorize the data.
* If p

Answer 25

**Goal of Regularized Regression**: The objective is to find a model that balances the tradeoff between fitting the training data well (low bias) and maintaining good performance on new, unseen data (low variance). This is achieved by adding a penalty for complexity to the regression model.

**Lasso Method**: The Lasso (Least Absolute Shrinkage and Selection Operator) is a type of regularized regression that constrains the sum of the **absolute values** of the regression coefficients. This can result in some coefficients being exactly zero, which means Lasso performs feature selection by excluding some variables from the model.

**Regularization Parameter**:

* The higher the regularization parameter λ, the greater the penalty on the size of the coefficients.
* The parameter λ can be chosen using cross-validation, analytical solutions (in the context of certain assumptions such as heteroskedasticity), or information criteria (AIC or BIC).
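The way the absolute-value penalty sets coefficients exactly to zero can be seen in a small coordinate-descent sketch. It assumes the objective (1/2n)·‖y − Xβ‖² + λ·Σ|βⱼ| with no intercept; the function names, the simulated data, and the λ values are hypothetical, and production work would normally rely on an established library rather than this loop.

```python
import random

def soft_threshold(rho, lam):
    # Shrinks rho towards zero by lam and returns exactly 0 when |rho| <= lam.
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam * sum(|b_j|)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # equals y - X beta while beta is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            # Correlation of column j with the residual that excludes its own fit.
            rho = sum(v * (r + v * beta[j]) for v, r in zip(col, resid)) / n
            new_b = soft_threshold(rho, lam) / z
            delta = new_b - beta[j]
            if delta != 0.0:
                resid = [r - v * delta for r, v in zip(resid, col)]
                beta[j] = new_b
    return beta

# Simulated sparse problem: only the first of five regressors matters.
rng = random.Random(3)
n, p = 200, 5
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [3.0 * row[0] + 0.5 * rng.gauss(0, 1) for row in X]

beta_mid = lasso_cd(X, y, lam=0.5)    # moderate penalty
beta_big = lasso_cd(X, y, lam=10.0)   # very large penalty
```

With the moderate penalty the irrelevant coefficients are set exactly to zero and the relevant one is shrunk below its true value of 3; with the very large penalty every coefficient is zero.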

Answer 26

**It is the same idea as FWL, but using Lasso.** This method combines machine learning (ML) with causal inference to identify the impact of a causal regressor on an outcome. It involves using Lasso regression to select relevant control variables from a high-dimensional set.

**Step-by-Step Approach**:

* Usual Procedure: First, regress the outcome on the controls using OLS to obtain residuals; then regress the causal variable on the controls using OLS to obtain residuals; finally, regress the first set of residuals on the second set.
* Machine Learning Approach: Similar to the usual procedure, but using machine learning algorithms to estimate the residuals.

Answer 27

1. **The causal variable of interest is known**, and there is no need for variable selection for this particular variable.
2. **Lasso is our best choice if we believe in approximate sparsity** (approximate sparsity means that we assume that, of the p regressors that we can include, only a few matter for the prediction).

(51 cards)