Block Course Test Prep Flashcards

Question

What does the R2 value in a regression formula mean?

Answer 1

It is the coefficient of determination. It is the proportion of the variance in the dependent variable that is predictable from the independent variable. It provides a measure of how well observed outcomes are replicated by the model, based on the proportion of total variation of outcomes explained by the model.
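As a rough illustration of "proportion of variance explained", the sketch below fits a one-predictor least-squares line and computes R2 as 1 minus the ratio of residual to total sum of squares. The function name and the data are made up for this example.

```python
def r_squared(x, y):
    """Fit y = b0 + b1*x by ordinary least squares and return R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope and intercept of the least-squares line
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Variation left unexplained by the model vs. total variation in y
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical data, for illustration only
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
print(round(r_squared(hours, score), 3))
```

With these numbers the model explains about 60% of the variance in the outcome.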

Answer 2

Multivariate analysis of variance is a procedure for comparing multivariate sample means. It is used when there are two or more dependent variables. It helps to determine whether changes in independent variables have significant effects on the dependent variables.

Answer 3

Statistical models of parameters that vary at more than one level. They are used when participants are organised at more than one level. An example could be a model of student performance that contains measures for individual students as well as measures for the classrooms within which the students are grouped.

Answer 4

A variable which takes the value of 0 or 1 to indicate the absence or presence of some categorical effect that may be expected to shift the outcome. eg. 0 = male, 1 = female

Answer 5

A sequence of random variables is homoscedastic if all its random variables have the same finite variance. This is also known as homogeneity of variance.

Answer 6

The residual variable produced by a statistical or mathematical model, which is created when the model does not fully represent the actual relationship between the independent variables and the dependent variables.

Answer 7

The error term should predict random error. If it is correlated with the independent variable, it is not independent from that variable. That systematic variation that is creating the correlation should be included in the regression model itself.

Answer 8

The effect of one independent variable on the dependent variable does not depend on the value of another independent variable.

Answer 9

Analysis of Variance - generally used to determine if there is statistically significant difference between means among 2 or more groups.

Answer 10

A moderator variable (eg. sex, race, class) influences the strength/direction of a relationship between an independent and dependent variable. A mediator explains the relationship between the two variables.

Answer 11

Logistic regression is used to predict two different types of dependent variables. The first is a dichotomous dependent variable. The second is an ordinal dependent variable.

Answer 12

The Y intercept. It is the predicted value of Y when all predictors are held at zero. Also known as the constant.

Answer 13

An indirect effect is transmitted through one or more mediator variables. Contrast this with a direct effect, which runs from the independent variable to the dependent variable without passing through a mediator.

Answer 14

The slope. It is the expected increase in Y for a one-unit increase in X1.

Answer 15

It is the starting point along the Y axis for a regression model (the predicted value of Y when X is zero).

Answer 16

The rate at which an outcome changes in a regression model given a one unit change in a predictor.

Answer 17

A technique for estimating correlation between two theorised normally distributed latent variables, from two observed ordinal variables.

Answer 18

Bivariate analysis.

Answer 19

Is similar to factor analysis. It attempts to extract components comprised of both the correlations between items as well as the unique variances of individual items.

Answer 20

A formative model means that the observed variables cause the latent variable.

Answer 21

True factor analysis attempts to extract factors that explain the correlations between items.

Answer 22

It compares the mean differences between groups that have been split on two factors (IVs), where one factor is a within-subjects factor and the other is a between-subjects factor.

Answer 23

A type of inferential statistic used to determine if there is a significant difference between the means of two groups, which may be related in certain features.

Answer 24

True factor analysis attempts to extract factors that explain the correlations between items.

Answer 25

A scree plot is a line plot of the eigenvalues of factors or principal components in an analysis. It is used to determine the number of factors to retain in an exploratory factor analysis or principal components to keep in a principal component analysis.

Answer 26

A method for determining the number of components or factors to retain from PCA or factor analysis. It works by creating a random dataset with the same numbers of observations and variables as the original data.

Answer 27

The program doing the FA rotates the axes of the model to find the best fit between the variables and the latent factors.

Answer 28

Rotation minimises the complexity of the factor loadings to make the structure simpler to interpret. Factor loading matrices are not unique: for any solution involving two or more factors there are an infinite number of orientations of the factors that explain the original data equally well.

Answer 29

Multicollinearity exists when there is a strong correlation between two or more predictor variables.
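One common way to check for this is the variance inflation factor (VIF). The sketch below covers only the two-predictor case, where the VIF reduces to 1 / (1 - r^2) with r the Pearson correlation between the predictors. Function names and data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF for either predictor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)

# Hypothetical predictor scores, for illustration only
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
print(round(pearson_r(x1, x2), 2), round(vif_two_predictors(x1, x2), 2))
```

Larger VIF values indicate that a predictor shares more of its variance with the other predictors.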

Answer 30

To describe, explain, and predict data.

Answer 31

1. Must be correlated with the IV 2. Is not affected by the DV 3. Has a causal effect on the DV

Answer 32

It ensures that different experimental conditions will have similar average levels of all pre-existing attributes.

Answer 33

It is a discrete distribution with two possible outcomes

Answer 34

The probability of an event occurring. It may be thought of as an unconditional probability. It is not conditioned on another event.

Answer 35

The error term (the difference between the observed and predicted values of Y).

Answer 36

The outcome variable.

Answer 37

The predictor variable.

Answer 38

The residual is the difference between someone's Y and a score predicted using the model. Error is the difference between Y and the "true" regression.

Answer 39

Mean = 0, SD = 1
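A mean of zero and an SD of one are what you get after converting scores to z-scores. A minimal sketch, assuming the sample SD is used and with made-up scores:

```python
import statistics

def standardize(values):
    """Convert raw scores to z-scores: subtract the mean, divide by the SD."""
    m = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    return [(v - m) / sd for v in values]

# Hypothetical raw scores, for illustration only
z = standardize([2, 4, 4, 6, 9])
print(round(statistics.mean(z), 6), round(statistics.stdev(z), 6))
```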

Answer 40

OLS is unbiased, consistent, and efficient.

Answer 41

To see how much extra variance we can explain by adding a particular predictor or set of predictors. This is sometimes called hierarchical regression.

Answer 42

It will result in biased estimates (similar to publication bias).

Answer 43

If you are willing to collect extra data to conduct cross-validation (to see how well your model explains new data, and refit the model with new data to get unbiased estimates).

Answer 44

1. Additivity and linearity 2. Measurement error only in the y 3. Error terms are independent 4. Homoscedasticity 5. Normal distribution of errors

Answer 45

The predicted score when predictor variables are zero.

Answer 46

For every one-unit increase in X, the outcome increases by the B score when holding all other predictors constant.

Answer 47

For every one-SD increase in predictor X, the outcome falls by the given score, when holding all other predictors constant.

Answer 48

They are presented in normalized units, eg. standard deviations.
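The link between the two kinds of coefficient can be sketched as beta = b * SD(x) / SD(y). The function name, the slope of 0.6, and the data below are assumptions for illustration.

```python
import statistics

def standardized_beta(b, x, y):
    """Convert an unstandardized slope b into SD units."""
    return b * statistics.stdev(x) / statistics.stdev(y)

# Hypothetical data; 0.6 is the least-squares slope for these values
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
beta = standardized_beta(0.6, hours, score)
print(round(beta, 3))
```

Here a one-SD increase in hours goes with roughly a 0.77 SD increase in score.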

Answer 49

A test statistic which incorporates information about effect size, sample size, and variability.

Answer 50

The p value. If the true slope for predictor X in the population were zero, the p value is the probability of observing a t statistic at least as extreme as the one obtained for predictor X.

Answer 51

What % of variance is explained by your model.

Answer 52

It adjusts R2 with a penalty based on the number of predictors.
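The usual form of the penalty is adjusted R2 = 1 - (1 - R2)(n - 1) / (n - k - 1), where n is the sample size and k the number of predictors. A small sketch with made-up inputs:

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for a model with k predictors fitted to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical values: same R^2, but more predictors means a bigger penalty
print(round(adjusted_r_squared(0.6, n=50, k=1), 3))
print(round(adjusted_r_squared(0.6, n=50, k=10), 3))
```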

Answer 53

It uses a similar model but considers using new methods, new operational definitions etc.

Answer 54

When you set up dummy variables with more than two categories you will end up with one reference category.
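A minimal sketch of this coding scheme: one 0/1 column per category except the reference, so a case in the reference category scores 0 on every column. The helper name and the example variable are hypothetical.

```python
def dummy_columns(values, reference):
    """Build one 0/1 column for each non-reference category."""
    categories = sorted(set(values) - {reference})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}

# Hypothetical three-category predictor with "north" as the reference
region = ["north", "south", "east", "north"]
cols = dummy_columns(region, reference="north")
print(cols)
```

Each dummy coefficient is then read as the difference between that category and the reference category.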

Answer 55

Squared or cubed predictor values.

Answer 56

1. A theory implies that one is present and you want to test it 2. For practical purposes, when you need to check if an effect holds in particular groups

Answer 57

Multiply the two predictors together, and include both the interaction term and the main effects in the model.
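As a sketch, the product column can be built like this before fitting. Variable names and values are made up, and sex is assumed to be coded 0/1.

```python
def interaction_term(x1, x2):
    """Element-wise product of two predictor columns."""
    return [a * b for a, b in zip(x1, x2)]

# Hypothetical predictors, for illustration only
principles = [1.0, 2.0, 3.0, 4.0]
sex = [0, 1, 0, 1]
product = interaction_term(principles, sex)

# Each row keeps both main effects alongside the product term
design_rows = list(zip(principles, sex, product))
print(design_rows)
```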

Answer 58

The unstandardized coefficient for one of the predictor variables is the predicted change in the outcome variable given a one-unit increase in the predictor variable while holding the other predictor variables at zero.

Answer 59

The unstandardized coefficient for the interaction term is the change in the effect of one of the predictors when increasing the other predictor by one unit. eg. A coefficient of +.166 for Principles*Sex means that the effect of principles is .166 units more positive for women than for men. For women, a one-unit increase in principles results in -.378 + .166 = -.212, ie. .212 fewer hours of personal web use at work per week.
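The arithmetic in this example can be written as a simple-slope calculation. The sketch assumes sex is coded 0 = men and 1 = women, which the card implies but does not state. The function name is hypothetical.

```python
def simple_slope(b_predictor, b_interaction, moderator_value):
    """Effect of the predictor at a given value of the moderator."""
    return b_predictor + b_interaction * moderator_value

b_principles = -0.378   # effect of principles when sex = 0
b_interaction = 0.166   # Principles*Sex coefficient

slope_men = simple_slope(b_principles, b_interaction, 0)
slope_women = simple_slope(b_principles, b_interaction, 1)
print(round(slope_men, 3), round(slope_women, 3))
```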

Answer 60

When you have a categorical outcome variable and categorical and/or continuous predictor variables.

Answer 61

When the outcome is dichotomous.

Answer 62

They are both forms of the generalized linear model.

Answer 63

Log odds of being in outcome variable group 1 (where the possible values of the outcome variable are 0 and 1)

Answer 64

Maximum likelihood criterion. ML finds the set of estimates for which the likelihood of the data is the highest (of all possible estimates).

Answer 65

Probability of event happening/probability of event not happening.

Answer 66

A way of expressing how changing something affects the odds of something else happening. Calculated by odds of the first group divided by the odds of the second group.
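A short sketch of both calculations, using made-up probabilities for two groups:

```python
def odds(p):
    """Odds of an event with probability p."""
    return p / (1 - p)

def odds_ratio(p_group1, p_group2):
    """Odds of the first group divided by the odds of the second group."""
    return odds(p_group1) / odds(p_group2)

# Hypothetical probabilities, for illustration only
print(odds(0.75))              # 3.0, ie. 3 to 1
print(odds_ratio(0.75, 0.5))   # 3.0
```

An odds ratio of 1 means the odds are the same in both groups.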

Answer 67

The logarithm of a number X is the exponent to which a particular base has to be raised to produce X. In statistics we usually use the natural logarithm, in which the base is e ≈ 2.718.
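In Python's standard library, `math.log` gives the natural logarithm and `math.exp` reverses it. A quick check with an arbitrary value:

```python
import math

x = 20.0
log_x = math.log(x)      # exponent to which e must be raised to give x
back = math.exp(log_x)   # raising e to that exponent recovers x

print(round(math.e, 3), round(log_x, 3), round(back, 3))
```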

Answer 68

The value produced by the logistic model. It is the log odds of the outcome variable taking the value of 1 and is sometimes called the "logit". Using this model we can estimate the odds of the outcome variable value being 1 for any given combination of values of the predictors.
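To show how the log odds from a logistic model map onto odds and probability, here is a minimal sketch with one predictor. The coefficients b0 and b1 are invented for the example and not taken from any fitted model.

```python
import math

def log_odds(b0, b1, x):
    """Linear predictor of a one-predictor logistic model (the logit)."""
    return b0 + b1 * x

def inverse_logit(z):
    """Convert log odds back to a probability."""
    return 1 / (1 + math.exp(-z))

# Hypothetical coefficients, for illustration only
b0, b1 = -1.0, 0.5
z = log_odds(b0, b1, x=2)      # log odds of the outcome being 1
odds_of_1 = math.exp(z)        # odds of the outcome being 1
prob_of_1 = inverse_logit(z)   # probability of the outcome being 1
print(z, odds_of_1, prob_of_1)
```

Log odds of 0 correspond to odds of 1 and a probability of 0.5.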

(95 cards)