Research Methods II Flashcards
Frequency distribution
Organized tabulation of individuals in each category on the scale of measurement.
f (frequency)
Frequency of a particular score.
Cumulative frequency
Accumulation of frequencies as one moves up the scale. The cumulative frequency for a score is the sum of its frequency and the frequencies of all scores below it. The highest score should have a cumulative frequency equal to the total sample size.
Cumulative percentile rank
Accumulation of the percentage of all scores as one moves up the scale. Starting with the lowest score, divide the cumulative frequency of a particular score by the total sample size (and multiply by 100).
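A minimal Python sketch of these three ideas, using pandas and made-up scores (the data are illustrative, not from the course):

```python
import pandas as pd

# Hypothetical quiz scores (illustrative only)
scores = pd.Series([2, 3, 3, 4, 4, 4, 5, 5, 6])

freq = scores.value_counts().sort_index()        # f for each score
cum_freq = freq.cumsum()                         # cumulative frequency
cum_pct_rank = 100 * cum_freq / len(scores)      # cumulative percentile rank

table = pd.DataFrame({"f": freq, "cum f": cum_freq, "cum %": cum_pct_rank})
print(table)  # the highest score's cum f equals the total sample size (9)
```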
Mean
Sum of scores divided by number of scores. Average.
Median
Score that divides the distribution in half; the 50th percentile.
Mode
Score in distribution with greatest frequency.
Degrees of Freedom (v or df)
Number of values used to estimate a parameter minus the number of parameters to be estimated.
Why use df in sample SD?
All scores in a set are free to vary EXCEPT the last one; the last score is restricted once the mean (or the sum) and the number of scores are known. The correct way to get an UNBIASED estimate of the population variance is to divide the sum of squared deviations by N - 1.
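A quick numpy sketch of the biased (divide by N) versus unbiased (divide by N - 1) estimate, using made-up numbers:

```python
import numpy as np

x = np.array([4.0, 6.0, 7.0, 9.0])   # hypothetical sample

biased_sd = np.std(x)            # divides squared deviations by N
unbiased_sd = np.std(x, ddof=1)  # divides by N - 1 (df), the unbiased estimator

print(biased_sd, unbiased_sd)    # the ddof=1 value is slightly larger
```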
Transformation of the Scale for the SD
- Adding a constant to, or subtracting a constant from, each score in a distribution will not change the standard deviation but will change the mean by the same constant.
- Multiplying or dividing each score by a constant causes the standard deviation and the mean to be multiplied or divided by the same constant.
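These two rules can be checked directly; a small numpy sketch with hypothetical scores:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])           # hypothetical scores

shifted = x + 10      # adding a constant
scaled = x * 3        # multiplying by a constant

print(x.mean(), x.std(ddof=1))              # original mean and SD
print(shifted.mean(), shifted.std(ddof=1))  # mean shifts by 10, SD unchanged
print(scaled.mean(), scaled.std(ddof=1))    # mean and SD both multiplied by 3
```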
Symmetrical distribution
Distribution in which the left side of the distribution “mirrors” the right side of the distribution.
Skewed distribution
A distribution is skewed if one tail is longer than the other. Long tail in the positive direction = positive skew. Long tail in the negative direction = negative skew.
Order of Mode, Median, & Mean for Positive skew
Left to right:
- Mode
- Median
- Mean
Order of Mode, Median & Mean for negative skew
Left to right:
- Mean
- Median
- Mode
Kurtosis
“Peakedness” or “flatness” of a distribution; how fat or thin a distribution is. The degree to which a frequency distribution is flat (low kurtosis) or peaked (high kurtosis).
Mesokurtic
Distribution with zero kurtosis. Normal distribution.
Leptokurtic
Distribution with positive kurtosis. Acute peak around the mean, fat tails.
Platykurtic
Distribution with negative kurtosis. Small peak around mean, thin tails.
Bimodal distribution
2 modes.
Rectangular distribution
Has a mean and a median but no mode; there is no mode because all scores have the same frequency.
Sampling error
The amount of error between a statistic calculated from a sample and the corresponding population parameter.
A sampling distribution
The distribution of a statistic obtained from all possible samples of a specific size drawn from the population.
Distribution of sample means
Collection of sample means for all possible random samples of a particular size (n) that can be obtained from the population.
Central Limit Theorem
For any population with mean µ and standard deviation σ, the distribution of sample means for samples of size n will approach a normal distribution with a mean of µ and a standard deviation of σ/√n as n approaches infinity.
The Standard Error of Xbar
The standard deviation of the distribution of sample means.
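A small simulation sketch of the standard error, using numpy and assumed population values (µ = 50, σ = 10, n = 25; all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 50.0, 10.0, 25          # hypothetical population parameters

# Draw many samples of size n and keep each sample mean
sample_means = rng.normal(mu, sigma, size=(10_000, n)).mean(axis=1)

print(sample_means.mean())             # close to mu
print(sample_means.std(ddof=1))        # close to sigma / sqrt(n) = 2.0
print(sigma / np.sqrt(n))              # theoretical standard error of X-bar
```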
Law of Large Numbers
The larger the sample size, the more probable it is that the sample mean will be close to the population mean.
Confidence Intervals
Used to estimate, with a chosen level of confidence, that the actual µ falls within a certain range around the sample mean.
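A sketch of a 95% confidence interval for µ using scipy's t distribution and made-up sample values:

```python
import numpy as np
from scipy import stats

x = np.array([48, 52, 55, 47, 51, 53, 50, 49], dtype=float)  # hypothetical sample

mean = x.mean()
sem = stats.sem(x)                       # estimated standard error of the mean
ci_low, ci_high = stats.t.interval(0.95, len(x) - 1, loc=mean, scale=sem)

print(f"95% CI for mu: [{ci_low:.2f}, {ci_high:.2f}]")
```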
Statistical Model
Statistical representation of the real world.
A Simple Statistical Model
The mean is a hypothetical value (i.e., doesn’t have to be a value that actually exists in the data set). As such, the mean is a simple statistical model.
Measuring the “Fit” of the Model
The mean is a model of what happens in the real world: the typical score. It is not a perfect representation of the data.
Null Hypothesis
The predicted relationship does NOT exist. The symbol is Ho.
Alternative Hypothesis
The predicted relationship does exist. The symbol is H1.
Type I error
The rejection of the null hypothesis when the null hypothesis is true. Saying there is a relationship when it does not exist.
Type II error
The acceptance of the null hypothesis when the null hypothesis is false. Saying there is no relationship when there is a relationship.
Alpha Level
The probability of a Type I error that the researcher is willing to accept; choosing a small alpha minimizes the risk of a Type I error.
Power
The probability of correctly rejecting the null hypothesis when the null is false. It is the probability that Type II error is not committed.
Factors Affecting Power
- The alpha level. Increasing the alpha level increases the power of the statistical test.
- One-tailed vs. 2-tailed test (a one-tailed test has more power than a two-tailed test).
- Sample size. As sample size increases, so does power.
- Reducing error variance increases power. (Test everyone in the same quiet room rather than in different rooms with different noises.)
Increasing effect size of the independent variable…
Would increase power.
Greater subject variability will…
Decrease power.
Parametric tests based on normal distribution requires 4 basic assumptions
- Normally distributed sampling distribution.
- Homogeneity of variance/ homoscedasticity.
- Interval or ratio data.
- Independence of scores.
Normal distribution
A probability distribution of a random variable with perfect symmetry, a skew of 0, and a kurtosis of 0.
Non-parametric tests
A family of statistical tests that do not rely on the restrictive assumptions of parametric test. Does not assume sampling distribution is normally distributed.
Homogeneity of variance (HOV)
Assumption that the variance of 1 continuous variable is stable/consistent between treatment groups of a discrete variable. For t-tests & ANOVAs.
Homoscedasticity
Assumption that the variance of 1 continuous variable is stable/consistent across scores of another continuous variable. For regressions.
Independence of scores
One data point does not influence another data point.
Big advantage of Parametric Tests
More powerful (statistically speaking, in rejecting the null hypothesis when it is false) compared to non-parametric tests.
Big advantage of Nonparametric tests
More freedom! Not restricted by assumptions to do data analysis.
Kolmogorov-Smirnov Test
- Tests if data differ from a normal distribution.
- Significant = non-normal data.
- Non-significant = normal data.
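A rough illustrative sketch with scipy (data are simulated and standardized first; this is a simplified check, not the course's exact procedure):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=100, scale=15, size=200)     # hypothetical scores

# Compare standardized scores against a standard normal distribution
z = (x - x.mean()) / x.std(ddof=1)
stat, p = stats.kstest(z, "norm")

print(stat, p)   # non-significant p (> .05) -> data look normal
```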
Histograms
Frequency distribution with bars drawn adjacent to one another. Gives a continuous figure that emphasizes the continuity of the variable. Good for continuous variables.
Q-Q Plots
Quantile-quantile plot. Plots the quantiles of the data against the quantiles of a particular distribution (e.g., the normal). If the values fall on the diagonal line, the data share the same distribution as the comparison (normal) distribution.
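A minimal sketch of the numbers behind a Q-Q plot using scipy's probplot (simulated data; plotting itself is omitted):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(size=100)                       # hypothetical data

# Theoretical normal quantiles vs. ordered sample values
(theoretical_q, ordered_x), (slope, intercept, r) = stats.probplot(x, dist="norm")

print(r)   # r near 1 -> points lie close to the diagonal, i.e., roughly normal
```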
Quantile
The proportion of cases we find below a certain value.
A perfect normal distribution would…
Have a skewness of 0 and a kurtosis of 0.
What does a significant Levene’s test mean?
- Tests if variances in different groups are the same.
- Significant = variances not equal (bad)
- Non-significant = variances are equal (good)
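A small scipy sketch with two made-up groups:

```python
from scipy import stats

group_a = [12, 14, 15, 13, 16, 14]     # hypothetical scores per group
group_b = [11, 19, 8, 22, 10, 20]

stat, p = stats.levene(group_a, group_b)
print(stat, p)   # significant p (< .05) -> variances not equal (assumption violated)
```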
Log transformation (log (xi)) to…
Reduce positive skew.
Square root transformation (square root Xi) to reduce…
Positive skew and to stabilize variance.
Reciprocal transformation (1/xi) can also reduce…
Skewness. Dividing 1 by each score also reduces the impact of large scores. This transformation reverses the order of the scores; you can avoid this by reversing the scores before the transformation: 1/(xhighest - xi).
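A numpy sketch of the three transformations on made-up positively skewed scores (the "+ 1" in the last line is an added tweak to avoid dividing by zero for the highest score):

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0, 3.0, 4.0, 20.0])   # hypothetical positively skewed scores

log_x = np.log(x)                 # log transformation (scores must be > 0)
sqrt_x = np.sqrt(x)               # square root transformation
recip_x = 1 / x                   # reciprocal (reverses the order of scores)
recip_keep_order = 1 / (x.max() - x + 1)  # reverse first so the order is preserved

print(log_x, sqrt_x, recip_x, recip_keep_order, sep="\n")
```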
Potential problems with transforming data
Transforming the data helps as often as it hinders the accuracy of F.
Use Robust methods (e.g.. Bootstrap) to…
account for violations of assumptions (e.g., normality).
Correlations
Way of measuring the extent to which two continuous variables are related. Measures pattern of responses across variables.
Scatterplot
A graph of the paired scores on two variables. A perfect linear relationship (r = +1.00 or -1.00) occurs when all the data points lie on a straight line in the scatterplot.
Covariance
The average of the cross-product deviations.
- Calculate the error between the mean and each subject’s score for the first variable (x).
- Calculate the error between the mean and their score for the second variable (y).
- Multiply these error values.
- Add these values to get the sum of the cross-product deviations, then divide by N - 1 to get the covariance.
Problems with Covariance
- Depends upon the units of measurement, e.g. the covariance of two variables measured in miles might be 4.25, but if the same scores are converted to kilometers, the covariance is 11.
- Solution: standardize it. Divide by the standard deviations of both variables.
- The standardized version of covariance is known as the correlation coefficient. It is relatively unaffected by units of measurement.
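A numpy sketch of the covariance, its standardization into r, and the units problem (all numbers made up):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])          # hypothetical paired scores
y = np.array([2.0, 3.0, 5.0, 4.0, 6.0])

cov_xy = np.cov(x, y, ddof=1)[0, 1]              # average cross-product deviation
r_xy = cov_xy / (x.std(ddof=1) * y.std(ddof=1))  # standardize -> correlation

print(cov_xy, r_xy)
print(np.corrcoef(x, y)[0, 1])                   # same r, computed directly

# Changing units changes the covariance but not the correlation
print(np.cov(x * 1.609, y, ddof=1)[0, 1], np.corrcoef(x * 1.609, y)[0, 1])
```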
The Correlation Coefficient
Measures the degree and direction of the linear relationship between two variables in terms of standardized scores (z-scores with a mean of 0 and an SD of 1).
Sum of Products
To calculate the Pearson Correlation, you need to first calculate the Sum of Products.
Correlation simply describes…
a relationship between 2 variables, not causality.
Third-variable Problem
Causality between two variables cannot be assumed because there may be other measured or unmeasured variables affecting the results.
Direction of causality
Correlation coefficients say nothing about which variable causes the other to change.
Measurement error affecting r
The more error there is in a measure, the smaller the correlation can be. Reliability is the correlation between two parallel measures of the variable of interest.
Coefficient of Determination
- r squared = coefficient of determination
- It measures the proportion of variability in one variable that can be determined from the relationship with the other variable.
Linear Transformations
- If variables are transformed into standard scores, or a constant is added to or multiplied by each score, the correlation between the two variables will remain the same.
Spearman’s rho
- Pearson’s correlation on the ranked data
- Good for non-normally distributed data
Kendall’s tau
- Also measures relationship between ordinal variables.
- It gives more accurate p-values for small data sets, but less popular than Spearman’s rho.
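A scipy sketch of both rank-based coefficients on made-up data:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7, 8]                 # hypothetical ordinal/skewed data
y = [3, 1, 4, 5, 6, 8, 7, 9]

rho, p_rho = stats.spearmanr(x, y)           # Pearson's r on the ranked data
tau, p_tau = stats.kendalltau(x, y)          # better p-values for small samples

print(rho, p_rho)
print(tau, p_tau)
```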
Regression Line
The straight line that best describes the linear relationship between two variables.
- Describes the relationship between two variables.
- Identifies the center or “central tendency” of the relation.
- Line used for prediction.
Equation for Linear Relationship
Yi = b0 + b1Xi + Ei
Yi
criterion variable, dependent variable
Xi
predictor variable, independent variable
b1
Regression coefficient for the predictor (IV)
- Gradient (slope) of the regression line
- Direction/strength of the relationship
b0
Y intercept (value of Y when X = 0). Also called the constant.
- Point at which the regression line crosses the Y-axis (ordinate)
Ei
The error (residual) of the regression line for each case.
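A sketch of the full equation's pieces (b0, b1, and the residuals Ei) with scipy's linregress and made-up data:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # hypothetical predictor (IV)
y = np.array([2.1, 3.9, 6.2, 7.8, 9.9, 12.2])  # hypothetical criterion (DV)

fit = stats.linregress(x, y)
print(fit.intercept)    # b0: value of Y when X = 0
print(fit.slope)        # b1: change in Y per one-unit change in X
print(fit.rvalue**2)    # proportion of variance explained

residuals = y - (fit.intercept + fit.slope * x)   # Ei for each case
print(residuals)
```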
Assumption of Regression:
Linearity
Since it is based on linear correlations, multiple regression assumes linear bivariate relationships between each x and y, and also between y and predicted y.
Assumption of Regression:
Normality
Both univariate & multivariate distributions of residuals (actual scores minus predicted scores) are normally distributed. Thus, the Y scores are independent and normally distributed (Shapiro-Wilk).
Assumptions of Regression:
Independence of scores
Independence of Y (outcome: DV) scores.
Assumptions of Regression:
Independence of errors
The errors (residuals) from observations using the regression formula should not be correlated with each other (Durbin-Watson test).
Assumption of Regression:
Minimal Multicollinearity
The predictors (IVs) should not be highly correlated with each other. Rule of thumb is no higher than r = .80 between predictors.
Assumption of Regression:
Homoscedasticity
Variance of the residuals is uniform for all values of X.
Violation of Regression Assumption:
Heteroscedasticity
Spread of the data points is smaller for low X values and wider for higher X values.
Standard Error of Estimate
- The measure of accuracy of regression.
- The regression equation allows predictions, but does not provide information about their accuracy.
- Standard distance between the regression line and the actual data points.
- The greater the correlation, the smaller the standard error of the estimate.
The least squares criterion
- Want the sums of the squared errors to be the smallest.
- Yields values for the b-weights and the y-intercept that result in the sum of the squared residuals being at a minimum.
- The best fitting-line has the smallest total squared error.
Regression to the Mean
- Occurs when you have a nonrandom sample from a population and two measures that are imperfectly correlated.
- The sample posttest mean is closer to the posttest population mean than their pretest mean to the pretest population mean.
Characteristics of the Regression to the Mean
- Can occur because the sample is not random.
- Group phenomenon.
- Happens between any two variables.
- The more extreme the sample group, the greater the regression to the mean.
- r = 1: no regression to the mean.
- r = .5: 50% regression to the mean.
- r = .2: 80% regression to the mean.
- r = 0: 100% regression to the mean.
Multiple regression
2 or more independent variables in the regression model
- include more than one predictor to enhance the prediction of Y
Zero-Order Correlations
Relationship between two variables, ignoring influence of other variables in prediction.
Higher-order correlations:
First order
Relation between 2 variables after controlling for influence of 1 other variable.
Higher-order correlations:
Second order
Relation between 2 variables after controlling for influence of 2 other variables.
Partial Correlation
Relationship between two variables after removing the influence of a third variable from both of them.
Part (semi-partial) correlations
Relationship between two variables after removing the influence of a third variable from just one of them (the independent variable/predictor).
Methods of Variable Entry into the Multiple Regression Model:
Simultaneous Entry
All variables are entered at the same time and the Beta weights are determined simultaneously.
Methods of Variable Entry into the Multiple Regression:
Sequential (hierarchical) Entry
Used to build a subset of predictors.
Methods of Variable Entry into the Multiple Regression Model:
Apriori
Variables entered are determined by some theory.
Methods of Variable Entry into the Multiple Regression Model:
Statistical Criteria
Computer decides which variables are entered based on their unique predictive abilities.
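A sketch of sequential (hierarchical) entry using statsmodels with a made-up data frame (column names y, x1, x2 are illustrative): fit one block, add the next, and compare R-squared.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data; variable names are made up for illustration
df = pd.DataFrame({
    "y":  [10, 12, 14, 15, 18, 20, 21, 24, 25, 28],
    "x1": [1, 2, 2, 3, 4, 4, 5, 6, 6, 7],
    "x2": [3, 2, 4, 4, 5, 7, 6, 8, 9, 9],
})

step1 = smf.ols("y ~ x1", data=df).fit()        # block 1: first predictor
step2 = smf.ols("y ~ x1 + x2", data=df).fit()   # block 2: add second predictor

print(step1.rsquared, step2.rsquared)           # R-squared change across blocks
print(step2.params)                             # B weights (unique effects)
```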
Problem of Shrinkage:
Low N:k Ratio
Need enough participants for each predictor. If the number of participants is low relative to the number of predictors, the sample estimates may not predict well in the population.
Problem of Shrinkage:
Multicollinearity
High correlations between predictors can cause instability of prediction.
Problem of Shrinkage:
Measurement error
If the measurement does not reflect the true score, the application of beta weights to a new sample may not be accurate.
Reading/Interpreting Regression Tables:
R squared
Variance explained by the regression model, aka coefficient of determination
Reading/Interpreting Regression Tables:
B
B weight, raw score; the unique effect of the predictor
Reading/Interpreting Regression Tables:
SE B
Standard error of B weight of the predictor
Reading/Interpreting regression Tables:
beta
beta weight; the standardized unique effect of the predictor based on z-scores
Bootstrapping
A method of resampling in which the model is re-run multiple (e.g., thousands of) times on different resamples (drawn with replacement) of the same data.
- It is used to provide a more reliable estimate(s) of the statistic.
- It is especially useful for small samples and non-normal data.
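A bare-bones percentile bootstrap of a sample mean in numpy (data and number of resamples are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.array([3, 5, 7, 8, 9, 12, 13, 20], dtype=float)   # hypothetical small sample

boot_means = np.array([
    rng.choice(x, size=len(x), replace=True).mean()   # resample WITH replacement
    for _ in range(5000)
])

# Percentile bootstrap 95% confidence interval for the mean
print(np.percentile(boot_means, [2.5, 97.5]))
```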
Moderator
A variable that determines when or how much the IV will affect the DV (e.g., at low, average, or high levels of the moderator).
Mediator
The mechanism: how or why the effect of the IV occurs (a third variable through which the IV affects the DV).
Why center the continuous IVs to create the interaction terms?
To create the interaction term for 2 continuous independent variables in multiple regression, the scores of the predictors must first be centered around their means; centering reduces multicollinearity between the predictors and the interaction term.
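A sketch of mean-centering and building the interaction term, assuming statsmodels and invented variable names:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical continuous predictors; names are illustrative only
df = pd.DataFrame({
    "y":  [5, 7, 6, 9, 11, 10, 13, 15, 14, 17],
    "x1": [1, 2, 2, 3, 3, 4, 5, 5, 6, 7],
    "x2": [2, 1, 3, 3, 4, 4, 5, 6, 6, 7],
})

df["x1_c"] = df["x1"] - df["x1"].mean()     # mean-center each predictor
df["x2_c"] = df["x2"] - df["x2"].mean()
df["x1x2"] = df["x1_c"] * df["x2_c"]        # interaction term from centered scores

model = smf.ols("y ~ x1_c + x2_c + x1x2", data=df).fit()
print(model.params)
```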
Disadvantage of Baron & Kenny
Low statistical power.
Complete (full) mediation
Relationship between IV & DV completely disappears. Beta weight approaches 0.
Partial mediation
Relationship between IV & DV remains but is reduced. The beta weight may still be significant, but the difference between the new & old beta weights is significant.
Mediation Package Method
More flexible and statistically powerful.
Average Causal Mediation Effect (ACME)
Indirect effect of M (total effect - direct effect). Must be significant to show that mediation is significant.
Average Direct Effects (ADE)
Must NOT be significant to prove complete mediation.
Total effect
Combined indirect & direct effect.
Bootstrapping for…
more accuracy & power.
Rationale for the t-test or when do you use a t-test?
Only one independent variable is manipulated in only two ways and only one outcome is measured.
Assumptions of t-test
- Independent t-test & dependent t-test are PARAMETRIC TESTS based on the NORMAL DISTRIBUTION.
- Sampling distribution is NORMALLY DISTRIBUTED.
- Dependent variable data are CONTINUOUS (interval or ratio).
- Independent t-test assumes:
x Homogeneity of variances.
x Scores in different conditions are independent.
Independent t-test
- Two means based on independent data.
- Data from different groups of people.
The t-test as a GLM (Regression equation)
Analyses with nominal IVs are usually done with t-tests or ANOVAs, but they can be done in regression as long as the predictors are coded correctly (dummy coding).
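A sketch of that equivalence with made-up two-group data: a regression on a 0/1 dummy-coded group variable gives the same t and p as the independent t-test (statsmodels and scipy assumed).

```python
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

# Hypothetical two-group data; 'group' is dummy coded 0/1
df = pd.DataFrame({
    "score": [10, 12, 11, 13, 12, 15, 17, 16, 18, 19],
    "group": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
})

t, p = stats.ttest_ind(df["score"][df["group"] == 0], df["score"][df["group"] == 1])
model = smf.ols("score ~ group", data=df).fit()   # regression with dummy-coded IV

print(t, p)
print(model.tvalues["group"], model.pvalues["group"])   # same t (sign aside) and p
```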
Dependent t-test
Compares two means based on related data.
Cohen’s d for effect size
0.2 = small
0.5 = medium
0.8 = large
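A numpy sketch of Cohen's d from two hypothetical groups (pooled-SD version):

```python
import numpy as np

g1 = np.array([10, 12, 11, 13, 12], dtype=float)   # hypothetical group scores
g2 = np.array([15, 17, 16, 18, 19], dtype=float)

# Pooled standard deviation, then Cohen's d = mean difference / pooled SD
n1, n2 = len(g1), len(g2)
sp = np.sqrt(((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2))
d = (g2.mean() - g1.mean()) / sp

print(d)   # interpret against the 0.2 / 0.5 / 0.8 benchmarks
```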
What to do when assumptions are broken:
- Nonparametric version of independent t-test
x Mann-Whitney test; aka Wilcoxon rank-sum test
- Nonparametric version of dependent t-test
x Wilcoxon signed-rank test
- Robust test
x Bootstrapping
Mann-Whitney test; aka Wilcoxon rank-sum test
- Non-parametric equivalent of the independent sample t-test.
- Used to test differences between two conditions in which different participants have been used.
Wilcoxon signed-rank test
To compare two sets of scores when these scores come from the same participants.
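A scipy sketch of both nonparametric tests with invented data:

```python
from scipy import stats

# Hypothetical data; different participants per condition for Mann-Whitney
cond_a = [12, 15, 11, 18, 14, 13]
cond_b = [20, 22, 17, 25, 19, 24]
print(stats.mannwhitneyu(cond_a, cond_b))       # Wilcoxon rank-sum equivalent

# Same participants in both conditions for the Wilcoxon signed-rank test
pre  = [12, 15, 11, 18, 14, 13]
post = [14, 18, 12, 21, 15, 17]
print(stats.wilcoxon(pre, post))
```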
Robust tests
These functions require the data to be in two different columns.
Bootstrapping
Robust method to compare independent means.
Trimmed means
A mean calculated after a percentage of extreme scores (e.g., 20%) is excluded.
- The percentage of extreme scores excluded can vary.
M estimator: median
- Median rather than trimmed mean.
Analysis of Variance (ANOVA)
- Determines if mean differences exist for 2 or more treatments or populations. Involves comparison of variances that reflect different sources of variability.
- Tests whether the differences between samples are due to chance (sampling error) or whether systematic treatment effects have caused the scores in one group to be different from the scores in another.
F-ratio = t-squared under what conditions?
t² = F when there is 1 independent variable with only 2 treatment conditions.
Components of the ANOVA
- Between-Treatment Group Variability: differences among sample means between treatment conditions.
x Treatment effect, individual differences & experimental error could explain differences.
- Within-Treatment Group Variability: variability within each sample.
x Individual differences and experimental error.
What is the F ratio when there is no treatment effect?
F ratio = (individual differences + experimental error) / (individual differences + experimental error) ≈ 1, because the treatment effect is 0.
What is the F ratio when there is a treatment effect?
F ratio = (treatment effect + individual differences + experimental error) / (individual differences + experimental error) > 1
Assumptions of ANOVA
- Independence
- Normality
- Homogeneity of variance
Test homogeneity of variance assumption
Levene’s Test
The ANOVA as a GLM (regression equation)
- Requires data to be in wide format rather than long format.
- Robust ANOVA based on trimmed means.
- Compare medians rather than means.
- Add bootstrap to the trimmed mean method.
Planned Contrasts (Group Comparisons)
- The variability explained by the model (experimental manipulation, SSm) is due to participants being assigned to different groups.
- This variability can be broken down further to test specific hypotheses about which groups might differ.
- We break down the variance according to hypotheses made a priori (before the experiment).
Robust ANOVA
- Compare medians rather than means
- Add bootstrap to trimmed mean method
Post Hoc tests
Determine specifically which treatment groups or samples are different from each other.
Testwise alpha level
Alpha level you select for each individual hypothesis test.
Experimentwise alpha level
Total probability of a Type I error that is accumulated from all the separate tests in the experiment.
Bonferroni
- Calculates a new pairwise alpha to keep the familywise alpha at .05.
- Strictest correction procedure, lowest power.
- Bonferroni α = α / number of tests.
- Most commonly used.
- May overcorrect for Type I error.
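A sketch of the correction using statsmodels' multipletests on made-up p-values:

```python
from statsmodels.stats.multitest import multipletests

p_values = [0.01, 0.04, 0.03, 0.20]   # hypothetical p-values from 4 pairwise tests

reject, p_adjusted, _, alpha_per_test = multipletests(
    p_values, alpha=0.05, method="bonferroni"
)

print(alpha_per_test)   # .05 / 4 = .0125, the new pairwise alpha
print(reject)           # which comparisons stay significant after correction
```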
Scheffe Test
- Allows the researcher to conduct any & all comparisons while preventing the experimentwise error from exceeding the alpha level.
- Strict, also low power.
Tukey HSD Test
- Helps control the experimentwise Type I error when the set of comparisons consists of pairs of treatment means (pairwise comparisons).
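A statsmodels sketch with invented scores and group labels:

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical scores and their treatment-group labels
scores = np.array([4, 5, 6, 5, 7, 8, 6, 9, 10, 9, 11, 12], dtype=float)
groups = np.array(["A"] * 4 + ["B"] * 4 + ["C"] * 4)

result = pairwise_tukeyhsd(endog=scores, groups=groups, alpha=0.05)
print(result)   # every pairwise comparison with familywise error held at .05
```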
Kruskal-Wallis test
- Non-parametric counterpart of the one-way independent ANOVA.
- Used when data violate assumptions. Based on ranked data.
Friedman’s within-subjects ANOVA
- Used for testing differences between conditions when there are 2 or more conditions.
- Same participants for all conditions. Repeated measures design.
- Based on ranked data.
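A scipy sketch of both rank-based ANOVAs with made-up data:

```python
from scipy import stats

# Hypothetical data from three independent groups
g1, g2, g3 = [3, 4, 2, 5], [6, 7, 5, 8], [9, 8, 10, 11]
print(stats.kruskal(g1, g2, g3))            # non-parametric one-way ANOVA

# Same participants measured under three conditions (repeated measures)
c1, c2, c3 = [3, 4, 2, 5, 4], [5, 6, 4, 7, 6], [8, 7, 6, 9, 8]
print(stats.friedmanchisquare(c1, c2, c3))  # non-parametric within-subjects ANOVA
```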
Analysis of Covariance
Any ANOVA design can become an ANCOVA by the addition of a concomitant variable called a covariate (CV). ANCOVA is an extension of ANOVA in which the effects of the IVs on the DV are assessed after the effects of one or more covariates are partialed out (partitioning the variance).
Advantages of ANCOVA
- Reduces error variance – by explaining some of the unexplained variance (SSr) the error variance in the model can be reduced.
- Increases statistical power – by equating the treatment groups, we reduce error variance due to subject variability, and thus, increase power. Matching is a “procedural” way to equate the groups, whereas ANCOVA is a statistical way to equate them.
- Greater experimental control – by controlling known extraneous variables, we gain greater insight into the effect of the predictor variable(s).
Choosing a good covariate
- A covariate is a source of variation that is not controlled for in the design of the experiment, but which does affect the dependent variable.
- The covariate is correlated with the dependent variable. We want a correlation of at least r = .20.
- The covariate should be independent of the independent variable(s), and it shouldn’t correlate highly with any other covariates.
- Adding a covariate complicates the design. It also means you’ll probably need more subjects so you won’t get empty cells in the new design.
Adjusted means
- Use a regression equation to find the adjusted means for each treatment group.
- Will be used when you want to do main group comparisons (contrasts).
Homogeneity of regression slopes assumption
- The regression slope has to be the same for each treatment group.
- Want parallel lines, not interactions.
Assumption covariate (CV) is independent of the independent variable (IV)
There should be no relationship between the covariate and the IV. Check by running a regular ANOVA with the covariate as the outcome; the IV effect should be non-significant.
Testing HOV assumption
Levene Test - same as for ANOVA; you want the result to be non-significant.
Contrast (coding) to do Type III ANCOVA
By default, R uses Type I sums of squares, which evaluate predictors in the order they were entered into the model. Use Type III to get the unique effect of each predictor.
- Use contrast coding to put the data in the format required for the Type III model.
- The second line of the output describes the relationship between the covariate & the DV.
- The third line indicates that the IV has a significant effect on activity level AFTER controlling for partner activity level.
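The course works in R; as a rough Python analogue, a statsmodels sketch of a Type III ANCOVA with sum-to-zero contrasts (data frame, column names, and values are all invented for illustration):

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical data: DV, a 3-level IV, and one covariate
df = pd.DataFrame({
    "dv":        [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
    "group":     ["a", "a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "c"],
    "covariate": [2, 3, 3, 4, 3, 4, 5, 5, 4, 5, 6, 6],
})

# Sum-to-zero contrast coding so Type III sums of squares are meaningful
model = smf.ols("dv ~ C(group, Sum) + covariate", data=df).fit()
print(sm.stats.anova_lm(model, typ=3))   # unique effect of the IV after the covariate
```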
Robust ANCOVA being bootstrap ANCOVA
Do robust ANCOVA to free the analysis from the restriction of homogeneity of regression slopes.
Benefit of Factorial Designs
- We can look at how variables interact. Moderation model.
- Interactions: Show how the effects of one IV on the DV might depend on the effects (level) of another IV.
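A sketch of a 2 x 2 factorial ANOVA with an interaction term, using statsmodels and made-up factor names and scores:

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical 2 x 2 factorial data; factor names are illustrative
df = pd.DataFrame({
    "y": [5, 6, 5, 7, 9, 10, 9, 11, 6, 7, 6, 8, 15, 16, 14, 17],
    "A": ["low"] * 8 + ["high"] * 8,
    "B": (["ctrl"] * 4 + ["treat"] * 4) * 2,
})

model = smf.ols("y ~ C(A) * C(B)", data=df).fit()   # main effects + A x B interaction
print(sm.stats.anova_lm(model, typ=2))              # a significant A:B term means the effect of B depends on A
```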
Main effects
- The separate effect of each independent variable AVERAGED over the levels of the other independent variable (AVERAGED EFFECTS)
Simple effects
The variability among the treatment means associated with one independent variable at a particular level of the other independent variable. AKA simple main effects.
Interaction
An interaction is present when the effect of one of the independent variables on the dependent variable is not the same at all levels of the second independent variable.
The simple effects of one of the independent variables are not the same at all levels of the second independent variable.
When an interaction is present the main effect is NOT representative of the corresponding simple effects.
The main effects do not fully or accurately describe the data.
Interpreting graphs:
Main effect A
If the averaged line representing the conditions of A is flat (no slope), then there is no main effect of A. If that averaged line has a slope, then there is a significant main effect of A.
Interpreting graphs:
Main effect B
If the dots representing the average of all the conditions of B are right on top of each other, then there is no main effect of B. If the dots are not directly on top of each other and are spaced apart, then there is a significant main effect of B.
Interpreting graphs:
Interaction
If the lines are parallel, then there is no interaction. If the lines are NOT parallel, then there is a significant interaction.
Interaction comparison
Within a 3-way ANOVA (A x B x C) there are smaller interactions (e.g., A x B at level C1 and at level C2) called simple interactions. The comparison of these simple interactions is called an interaction comparison.
Interaction contrast
- Any interaction analysis that analyses each factor only on 2 of its levels.
- So for a 4 x 4 design, an interaction contrast would be 2 x 2 (where 2 levels of each factor are ignored).
- In a 3-factor design, 3 x 4 x 3, an interaction contrast would be 2 x 2 x 2 (where 1 level of A, 2 levels of B, and 1 level of C are ignored).
- These types of interaction analyses are useful because like main & simple comparisons, they break the interaction terms into a single degree of freedom.
- It would enable us to understand the interaction or the simple effects in more detail.
Contrast coding to do Type III ANOVA
- Can utilize planned (apriori) group comparisons in the model
- Do contrast coding of the IVs first so that we can make group comparisons of no versus any alcohol & 2 pints versus 4 pints of alcohol for example.
Goodness of fit test
Used to see if two or more categories of a nominal variable differ significantly from expectation.
- df = (C - 1): the number of categories minus 1.
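A scipy sketch of the goodness-of-fit test with invented counts and equal expected frequencies:

```python
from scipy import stats

observed = [18, 30, 12]            # hypothetical counts in 3 categories
expected = [20, 20, 20]            # expectation (e.g., equal frequencies)

chi2, p = stats.chisquare(observed, f_exp=expected)
print(chi2, p)                     # df = C - 1 = 2
```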
Pearson’s Chi-Square Test/Chi-Square test of Association
- Used to see whether there’s a relationship between two categorical variables.
- χ² observed needs to be greater than χ² critical.
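A scipy sketch of the test of association on a made-up 2 x 3 contingency table:

```python
from scipy import stats

# Hypothetical 2 x 3 contingency table of observed counts
table = [[20, 15, 25],
         [30, 20, 10]]

chi2, p, df, expected = stats.chi2_contingency(table)
print(chi2, p, df)     # df = (rows - 1) x (columns - 1) = 2
print(expected)        # model-predicted frequencies for each cell
```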
Likelihood Ratio Statistics
- An alternative to Pearson’s chi-square, based on maximum likelihood theory.
- Creates a model for which the probability of obtaining the observed set of data is maximized.
- This model is then compared with the model expected under the null hypothesis.
- The resulting statistic compares the observed frequencies with those predicted by the model.
- i and j are the rows and columns of the contingency table and ln is the natural logarithm.
- Preferred to Pearson’s chi-square when the total sample (N) is small.
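The card defines i, j, and ln but the equation itself is missing; the standard form of the likelihood ratio statistic, sketched here as a reconstruction, is:

```latex
% Likelihood ratio statistic (standard form), where observed_{ij} are the cell
% counts and model_{ij} are the frequencies predicted by the model:
L\chi^2 = 2 \sum_{i}\sum_{j} \mathrm{observed}_{ij}
          \ln\!\left(\frac{\mathrm{observed}_{ij}}{\mathrm{model}_{ij}}\right)
```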
Assumptions of Chi-square
- Independence
x Each person, item, or entity contributes to only one cell of the contingency table.
- The expected frequencies should be greater than 5.
x In larger contingency tables, up to 20% of expected frequencies can be below 5 for a category or group, but there is a loss of statistical power.
x Even in larger contingency tables, no expected frequencies should be below 1. If you find yourself in this situation, consider using Fisher’s exact test.
Chi-square test
df = (rows - 1) x (columns - 1)