Block Course Test Prep Flashcards

1
Q

What is exploratory research?

A

The goal of exploratory research is to connect ideas in order to understand cause and effect. It identifies potential relationships and relevant questions, and focuses on Type 2 error.

2
Q

What is observational research?

A

The systematic study of behaviour as it occurs in the natural environment.

3
Q

What is confirmatory research?

A

The goal of confirmatory research is to confirm a pre-specified relationship. It uses hypothesis testing to find statistically significant results and focuses on Type 1 error.

4
Q

What does randomising participants to conditions help to ensure?

A

It increases internal validity by reducing systematic bias between groups.

5
Q

What is internal validity?

A

The extent to which a study establishes a trustworthy cause-effect relationship between a treatment and an outcome. It depends largely on the procedures of a study and how rigorously it is performed.

6
Q

What is external validity?

A

External validity refers to how well the outcome of a study can be expected to apply to other settings. In other words, how generalizable the findings are.

7
Q

What is a continuous distribution?

A

A continuous distribution describes the probabilities of the possible values of a continuous random variable.

8
Q

What is a lognormal distribution?

A

A lognormal distribution is a continuous probability distribution of a random variable whose logarithm is normally distributed.

9
Q

What is a discrete distribution?

A

A discrete distribution describes the probability of occurrence of each value of a discrete random variable. A discrete random variable has countable values, such as a list of non-negative integers.

10
Q

What does Massey’s Code of Responsible Research Conduct say about sharing data?

A

Research data should be made available to peers who wish to repeat or elaborate on the study, subject to requirements for privacy, confidentiality and intellectual property.

11
Q

How would you reverse score an item?

A

You would recode each item score to its reverse. Alternatively, you can take the highest response score, add one to it, and then subtract the original response to give you the reversed score.
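
The subtraction rule above can be sketched in a few lines of Python. This is only an illustration: the function name and the 1-5 scale are assumptions, and the "highest score plus one" shortcut holds when the scale starts at 1 (the sketch generalises it to any minimum).

```python
def reverse_score(response, max_score, min_score=1):
    """Reverse one item response on a scale running from min_score to max_score."""
    if not min_score <= response <= max_score:
        raise ValueError("response is outside the scale range")
    # With min_score = 1 this is the card's rule: (highest score + 1) - response.
    return max_score + min_score - response


# Hypothetical responses on a 1-5 scale
responses = [1, 2, 3, 4, 5]
reversed_responses = [reverse_score(r, max_score=5) for r in responses]
print(reversed_responses)  # [5, 4, 3, 2, 1]
```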

12
Q

What is expectation-maximisation imputation?

A

An iterative procedure that uses other variables to impute a value (expectation), then checks whether that is the most likely value (maximisation). If not, it re-imputes a more likely value. This continues until it reaches the most likely value.

13
Q

What is multiple imputation?

A

A general approach to the problem of missing data. It aims to allow for the uncertainty about the missing data by creating several different plausible imputed data sets and appropriately combining results obtained from each of them.

14
Q

What is Type 1 error?

A

A false positive: the null hypothesis is rejected when it is actually true.

15
Q

What does a P value tell you?

A

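It describes the level of evidence against the null hypothesis: the probability of obtaining results at least as extreme as those observed if the null hypothesis were true. A small value, typically < 0.05, is taken as strong evidence that the null should be rejected.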

16
Q

What is an alpha level?

A

The probability of a type 1 error (false positive). You set this before analysing your data and a P-value below this will reflect a statistically significant result.

17
Q

What is family-wise error rate?

A

The probability of making one or more false discoveries (type 1 errors) when performing multiple hypothesis tests.
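
A small sketch of how this probability grows with the number of tests. It assumes the tests are independent and that every null hypothesis is true, in which case the rate is 1 - (1 - alpha)^m; the function name is illustrative.

```python
def family_wise_error_rate(alpha, n_tests):
    """Probability of at least one type 1 error across n_tests independent tests,
    assuming every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


for m in (1, 5, 10, 20):
    print(m, round(family_wise_error_rate(0.05, m), 3))
```

With alpha = .05, ten independent tests already give roughly a 40% chance of at least one false positive.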

18
Q

What does a 95% confidence interval tell you?

A

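It defines a range of values calculated by a procedure that, over repeated sampling, would contain the population parameter 95% of the time. Informally, it is often described as a range you can be 95% confident contains the parameter.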
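
As a rough illustration, a 95% interval for a mean can be computed as the sample mean plus or minus a critical value times the standard error. The sketch below uses the normal critical value (about 1.96) for simplicity; with small samples a t critical value would normally be used instead. The data and function name are made up.

```python
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for a sample mean."""
    n = len(values)
    m = mean(values)
    standard_error = stdev(values) / n ** 0.5
    # Two-sided critical value, about 1.96 when level = 0.95
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return m - z * standard_error, m + z * standard_error


# Hypothetical sample
lower, upper = mean_confidence_interval([4, 5, 6, 7, 8])
print(round(lower, 2), round(upper, 2))
```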

19
Q

What is a beta level?

A

The probability of a type 2 error.

20
Q

What is P-hacking?

A

Also known as data dredging, it refers to the misuse of data analysis to find patterns in data that can be presented as statistically significant when in fact there is no real underlying effect.

21
Q

What does pre-registration help to address?

A

It should limit the degree to which p-hacking occurs because you outline your data analysis method before collecting data. Fewer false positives should be published.

22
Q

What is ordinary least squares in multiple regression?

A

It chooses the parameters of a linear function of a set of explanatory variables by the principle of least squares. Least squares minimizes the sum of the squared differences between the observed values of the dependent variable in the given data set and those predicted by the linear function.
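
To make the least-squares idea concrete, here is a minimal sketch for the simplest case of a single predictor, where the minimising slope and intercept have a closed form. Multiple regression follows the same principle with more predictors; the data and function name are illustrative only.

```python
from statistics import mean


def ols_simple(x, y):
    """Least-squares intercept and slope for one predictor."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squares around the means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x
intercept, slope = ols_simple([1, 2, 3, 4], [3, 5, 7, 9])
print(intercept, slope)
```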

23
Q

What is Cronbach’s Alpha useful for?

A

A measure of internal consistency presented as a coefficient. It is calculated using the average covariance between item-pairs.
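
A hedged sketch of the calculation. It uses the variance form of the coefficient, alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), which is algebraically equivalent to the average-covariance form. The item data and function name are invented for illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of respondent scores per item (all the same length)."""
    k = len(items)
    item_variances = [variance(scores) for scores in items]
    # Total score for each respondent across all items
    totals = [sum(person) for person in zip(*items)]
    return (k / (k - 1)) * (1 - sum(item_variances) / variance(totals))


# Hypothetical two-item scale answered by four respondents
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
print(round(alpha, 2))  # 0.75
```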

24
Q

What is internal consistency?

A

Correlations between different items on the same test. It measures whether several items that purport to measure the same general construct produce similar scores.

25
Q

What does the R2 value in a regression formula mean?

A

It is the coefficient of determination. It is the proportion of the variance in the dependent variable that is predictable from the independent variable. It provides a measure of how well observed outcomes are replicated by the model, based on the proportion of total variation of outcomes explained by the model.
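
One way to see the "proportion of variation explained" reading is to compute it directly as 1 minus the residual sum of squares over the total sum of squares. The sketch below assumes you already have predicted values from some model; the names and numbers are illustrative.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_bar = mean(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - y_bar) ** 2 for o in observed)
    return 1 - ss_residual / ss_total


observed = [1, 2, 3, 4]
print(r_squared(observed, [1, 2, 3, 4]))          # perfect predictions
print(r_squared(observed, [2.5, 2.5, 2.5, 2.5]))  # predicting the mean every time
```

Perfect predictions give 1, while always predicting the mean of Y gives 0.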

26
Q

What is MANOVA?

A

Multivariate analysis of variance is a procedure for comparing multivariate sample means. It is used when there are two or more dependent variables. It helps to determine whether changes in independent variables have significant effects on the dependent variables.

27
Q

What is multilevel modelling?

A

Statistical models of parameters that vary at more than one level. It is used when participants are organised at more than one level.

An example could be a model of student performance that contains measures for individual students as well as measures for classrooms within which the students are grouped.

28
Q

What is a dummy variable?

A

A variable which takes the value of 0 or 1 to indicate the absence or presence of some categorical effect that may be expected to shift the outcome.

e.g. 0 = male, 1 = female

29
Q

What is homoscedasticity?

A

A sequence of random variables is homoscedastic if all its random variables have the same finite variance. This is also known as homogeneity of variance.

30
Q

What is an error term?

A

The residual variable produced by a statistical or mathematical model, which is created when the model does not fully represent the actual relationship between the independent variables and the dependent variables.

31
Q

What does it mean when error terms are independent?

A

The error term should contain only random error. If it is correlated with an independent variable, it is not independent of that variable; the systematic variation that is creating the correlation should be included in the regression model itself.

32
Q

What is additivity?

A

The effect of one independent variable on the dependent variable does not depend on the value of another independent variable.

33
Q

What is ANOVA?

A

Analysis of Variance - generally used to determine if there is a statistically significant difference between the means of 2 or more groups.

34
Q

What is the difference between a moderating and mediating effect?

A

A moderator variable (eg. sex, race, class) influences the strength/direction of a relationship between an independent and dependent variable. A mediator explains the relationship between the two variables.

35
Q

What is logistic regression?

A

Logistic regression is used to predict two different types of dependent variables. The first is a dichotomous dependent variable. The second is an ordinal dependent variable.

36
Q

What does B0 stand for in a regression equation?

A

The Y intercept. It is the predicted value of Y when all predictors are held at zero. Also known as the constant.

37
Q

What is an indirect effect?

A

An indirect effect is transmitted through one or more mediator variables. Contrast this with a direct effect, which runs from the independent variable to the dependent variable without passing through a mediator.

38
Q

What does β1 tell you in a regression equation?

A

The slope. It is the expected increase in Y for a one-unit increase in X1.

39
Q

What is an intercept in multilevel modelling?

A

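It is the starting point on the Y axis for a regression model: the predicted outcome when the predictors are zero. In a multilevel model the intercept can vary between groups.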

40
Q

What is a slope in multilevel modelling?

A

The rate at which an outcome changes in a regression model given a one unit change in a predictor.

41
Q

What is a polychoric correlation?

A

A technique for estimating correlation between two theorised normally distributed latent variables, from two observed ordinal variables.

42
Q

Which statistic is used to estimate a relationship between two ordinal variables?

A

A bivariate measure of association suited to ordinal data, such as the polychoric correlation.

43
Q

What is principal components analysis?

A

It is similar to factor analysis. It attempts to extract components comprised of both the correlations between items and the unique variances of individual items.

44
Q

What are formative models?

A

A formative model means that the observed variables cause the latent variable.

45
Q

What is true factor analysis?

A

True factor analysis attempts to extract factors that explain the correlations between items.

46
Q

What is a factorial ANOVA?

A

.

47
Q

What is a repeated measures ANOVA?

A

.

48
Q

What is a mixed ANOVA?

A

It compares the mean differences between groups that have been split on two factors (IVs), where one factor is a within-subjects factor and the other is a between-subjects factor.

49
Q

What is a t-test?

A

A type of inferential statistic used to determine if there is a significant difference between the means of two groups, which may be related in certain features.

50
Q

What is a true factor analysis aimed at doing?

A

True factor analysis attempts to extract factors that explain the correlations between items.

51
Q

What is a scree plot used for?

A

A scree plot is a line plot of the eigenvalues of factors or principal components in an analysis. It is used to determine the number of factors to retain in an exploratory factor analysis or principal components to keep in a principal component analysis.

52
Q

What is a parallel analysis?

A

A method for determining the number of components or factors to retain from PCA or factor analysis. It works by creating a random dataset with the same numbers of observations and variables as the original data, then retaining only the factors whose eigenvalues exceed those obtained from the random data.

53
Q

What are eigenvalues?

A

.

54
Q

What is factor rotation?

A

The program doing the FA rotates the axes of the model to find the best fit between the variables and the latent factors.

55
Q

What is the purpose of factor rotation?

A

Rotation minimises the complexity of the factor loadings to make the structure simpler to interpret. Factor loading matrices are not unique: for any solution involving two or more factors there are an infinite number of orientations of the factors that explain the original data equally well.

56
Q

What is multicollinearity?

A

Multicollinearity exists when there is a strong correlation between two or more predictor variables.

57
Q

What are the three purposes of data analysis?

A

To describe, explain, and predict data.

58
Q

What are the three criteria that make up a confounding variable?

A
  1. Must be correlated with the IV
  2. Is not affected by the DV
  3. Has a causal effect on the DV
59
Q

What does randomisation ensure?

A

It ensures that different experimental conditions will have similar average levels of all pre-existing attributes.

60
Q

What is a Bernoulli distribution?

A

It is a discrete distribution with two possible outcomes

61
Q

What is marginal probability?

A

The probability of an event occurring. It may be thought of as an unconditional probability. It is not conditioned on another event.

62
Q

What does e represent in a regression equation?

A

The error term (the difference between the observed and predicted values of Y).

63
Q

What does Y represent in a regression equation?

A

The outcome variable.

64
Q

What does X represent in a regression equation?

A

The predictor variable.

65
Q

What is the difference between a residual and error in a regression analysis?

A

The residual is the difference between someone’s Y and the score predicted using the fitted model. Error is the difference between Y and the “true” regression.

66
Q

What is the mean and SD for a Z score?

A
Mean = zero
SD = 1
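
A quick check of this property, as a sketch: standardising any set of scores by subtracting the mean and dividing by the SD gives values with mean 0 and SD 1. The scores are made up, and the population SD (`pstdev`) is used here; the same holds with the sample SD as long as it is used consistently.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise values: subtract the mean, divide by the standard deviation."""
    m = mean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]


# Hypothetical raw scores
z = z_scores([10, 12, 15, 19, 24])
print(round(mean(z), 6), round(pstdev(z), 6))
```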
67
Q

What are the three properties of OLS regression?

A

OLS is unbiased, consistent, and efficient.

68
Q

Why might one add additional predictors in blocks?

A

To see how much extra variance we can explain by adding a particular predictor or set of predictors. This is sometimes called hierarchical regression.

69
Q

Why should you not use statistical procedures to select which predictors to use in a regression?

A

It will result in biased estimates (similar to publication bias).

70
Q

When is it appropriate to use statistical methods to select predictors in a regression?

A

If you are willing to collect extra data to conduct cross-validation (to see how well your model explains new data, and to refit the model with new data to get unbiased estimates).

71
Q

What are the 5 assumptions OLS relies on?

A
  1. Additivity and linearity
  2. Measurement error only in the y
  3. Error terms are independent
  4. Homoscedasticity
  5. Normal distribution of errors
72
Q

What does the (Constant) B score represent in an SPSS multiple regression analysis?

A

The predicted score when predictor variables are zero.

73
Q

What does the “predictor X” B score represent in an SPSS multiple regression analysis?

A

For every one-unit increase in X, the outcome increases by the B score, holding all other predictors constant.

74
Q

What does the standardized coefficients Beta score for “predictor X” represent in an SPSS multiple regression analysis?

A

For every one-SD increase in predictor X, the outcome changes by the given number of standard deviations (rising if Beta is positive, falling if it is negative), holding all other predictors constant.

75
Q

What are standardized coefficients?

A

They are presented in normalized units, eg. standard deviations.

76
Q

What does the t score represent in an SPSS multiple regression analysis?

A

A test statistic which incorporates information about effect size, sample size, and variability.

77
Q

What does sig represent in an SPSS multiple regression analysis?

A

The p value. If the true slope for predictor X in the population were zero, this is the probability of observing a t statistic as large as or larger (in absolute value) than the one obtained for predictor X.

78
Q

What does R2 tell you in an SPSS model summary?

A

What % of variance is explained by your model.

79
Q

What does adjusted R2 tell you in an SPSS model summary?

A

It adjusts R2 with a penalty based on the number of predictors.
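
The usual form of this penalty is adjusted R2 = 1 - (1 - R2)(n - 1)/(n - p - 1), where n is the sample size and p the number of predictors. A minimal sketch with made-up numbers (the function name is illustrative):

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    n, p = n_observations, n_predictors
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)


# Hypothetical model: R2 = .50, 21 cases, 2 predictors
print(round(adjusted_r_squared(0.50, 21, 2), 3))  # 0.444
```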

80
Q

What is a conceptual replication?

A

It uses a similar model but considers using new methods, new operational definitions etc.

81
Q

What is reference category?

A

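When you set up dummy variables for a variable with more than two categories, one category is left without a dummy variable of its own. This is the reference category, against which the other categories are compared.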
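
A small sketch of dummy coding with a reference category. The function name, category labels, and choice of reference are all hypothetical; the point is that k categories produce k - 1 dummy variables, and the reference category is the one scoring 0 on all of them.

```python
def dummy_code(values, reference):
    """Return one 0/1 dummy variable per category, except the reference category."""
    categories = sorted(set(values) - {reference})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


# Hypothetical three-category predictor, with "control" as the reference
groups = ["control", "drug_a", "drug_b", "control"]
dummies = dummy_code(groups, reference="control")
print(dummies)
# {'drug_a': [0, 1, 0, 0], 'drug_b': [0, 0, 1, 0]}
```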

82
Q

What are polynomials?

A

Squared or cubed predictor values.

83
Q

Why would you study moderation or interaction effects?

A
  1. A theory implies that one is present and you want to test it
  2. For practical purposes, when you need to check if an effect holds in particular groups
84
Q

How do you test a moderating effect in a regression equation?

A

Multiply the two predictors together, and include both the interaction term and the main effects in the model.

85
Q

How do you interpret the unstandardized coefficient in a model which includes interaction effects?

A

The unstandardized coefficient for one of the predictor variables is the predicted change in the outcome variable given a one-unit increase in the predictor variable while holding the other predictor variables at zero.

86
Q

How do you interpret the interaction coefficient of an interaction effect?

A

The unstandardized coefficient for the interaction term is the change in the effect of one of the predictors when increasing the other predictor by one unit.
eg. A coefficient of +.166 for Principles*Sex means that the effect of principles is .166 units more positive for women than for men.
For women, a one-unit increase in principles results in a change of -.378 + .166 = -.212, i.e. .212 fewer hours of personal web use at work per week.
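
The arithmetic in this example can be written as a simple-slope calculation: the effect of one predictor at a given value of the other is its own coefficient plus the interaction coefficient times that value. The sketch assumes sex is coded 0 = men, 1 = women, which is what the card's calculation implies; the function name is illustrative.

```python
def simple_slope(b_predictor, b_interaction, moderator_value):
    """Effect of the predictor at a given value of the moderator."""
    return b_predictor + b_interaction * moderator_value


b_principles = -0.378     # effect of principles when sex = 0 (men)
b_principles_sex = 0.166  # interaction coefficient from the card

print(round(simple_slope(b_principles, b_principles_sex, 0), 3))  # men: -0.378
print(round(simple_slope(b_principles, b_principles_sex, 1), 3))  # women: -0.212
```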

87
Q

When is logistic regression used?

A

When you have a categorical outcome variable and categorical and/or continuous predictor variables.

88
Q

What is binary logistic regression?

A

Logistic regression in which the outcome variable is dichotomous.

89
Q

How is logistic regression similar to linear regression?

A

They are both forms of the generalized linear model.

90
Q

What do you predict in logistic regression?

A

Log odds of being in outcome variable group 1 (where the possible values of the outcome variable are 0 and 1)

91
Q

What method is used to calculate the logistic regression equation?

A

Maximum likelihood criterion. ML finds the set of estimates for which the likelihood of the data is the highest (of all possible estimates).

92
Q

How are the odds of an event calculated?

A

Probability of event happening/probability of event not happening.

93
Q

What is an odds ratio and how is it calculated?

A

A way of expressing how changing something affects the odds of something else happening. Calculated by odds of the first group divided by the odds of the second group.
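
Both calculations are short enough to sketch directly. The probabilities below are made up for illustration, and the function names are not from any particular package.

```python
def odds(probability):
    """Odds = P(event) / P(no event)."""
    return probability / (1 - probability)


def odds_ratio(p_group_1, p_group_2):
    """Odds of the first group divided by the odds of the second group."""
    return odds(p_group_1) / odds(p_group_2)


# Hypothetical: the event has probability .80 in group 1 and .50 in group 2
print(round(odds(0.80), 2))              # 4.0
print(round(odds(0.50), 2))              # 1.0
print(round(odds_ratio(0.80, 0.50), 2))  # 4.0
```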

94
Q

What is a logarithm?

A

The logarithm of a number X is the exponent to which a particular base has to be raised to produce X. In statistics we usually use the natural logarithm, in which the base is e ≈ 2.718.

95
Q

What is the log odds?

A

The value produced by the logistic model: the natural logarithm of the odds of the outcome variable taking the value of 1. It is sometimes called the “logit”. Using this model we can estimate the odds of the outcome variable value being 1 for any given combination of values of the predictors.
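
As a sketch of how log odds, odds, and probability relate: the log odds is the natural log of the odds, and it can be converted back to a probability with the inverse (logistic) function. Function names and values here are illustrative.

```python
import math


def log_odds(probability):
    """Natural log of the odds (the logit)."""
    return math.log(probability / (1 - probability))


def probability_from_log_odds(logit):
    """Inverse of the logit: convert log odds back to a probability."""
    return 1 / (1 + math.exp(-logit))


print(log_odds(0.5))  # 0.0, because the odds are 1
print(round(probability_from_log_odds(log_odds(0.8)), 3))  # 0.8
```

A log odds of 0 corresponds to a probability of .5; positive values mean the outcome is more likely than not.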