8- Research Design and Statistics Flashcards

1
Q

Which of the following is NOT generally considered a direct threat to external validity?
Select one:
A. order effects
B. Hawthorne effect
C. interaction between selection and treatment
D. history

A

Correct Answer is: D
Distinguishing between internal and external threats to validity can be difficult. Indeed, some experts disagree on how to categorize some of them. However, all of the choices except “history” are generally considered to be threats to external validity.
Order effects* (also known as carryover effects) occur in repeated measures designs, or in studies in which the same subjects are exposed to more than one treatment. For example, in a study on the effects of marital therapy interventions, couples are given relaxation training followed by communication training. If significant improvement occurs, it may be due to relaxation training preceding communication training; therefore, the results could not be generalized to situations in which subjects receive only communication training.

The Hawthorne effect* occurs when subjects behave differently due to the fact that they are participating in research. Obviously this threatens external validity since the results cannot be generalized to real-life situations in which people are not participating in research.

Interaction between selection and treatment* refers to when a treatment has different effects depending on the selection of subjects. For example, studies that only use undergraduate students (as many studies do) might not generalize to non-undergraduate students (* incorrect options).

Finally, history refers to an external event, other than the experimental treatment, that affects scores on the DV. This is primarily considered a threat to internal validity. For example, if a study on the effects of a new treatment for depression began several weeks before the events on “9-11” and concluded several weeks after “9-11,” the results might indicate that the new treatment is not effective. However, this might not be a valid conclusion due to the effects of history.
Additional Information: Threats to External Validity

2
Q
When a study has two or more independent variables the research design used is:
Select one:
A. MANOVA
B. ANOVA-one way
C. Factorial ANOVA
D. ANCOVA
A

Correct Answer is: C
The factorial ANOVA is used when a study involves more than one independent variable. A one-way ANOVA is used when a study has one independent variable and more than two independent groups. MANOVA is used when the study has two or more dependent variables, and ANCOVA is used to adjust dependent variable scores to control for the effects of an extraneous variable.
Additional Information: Factorial ANOVA

3
Q

Mothers who get high scores on the WAIS-III tend to have children who get high scores on the WISC-III. On the basis of this information, which of the following conclusions is most justified?
Select one:
A. intelligence is hereditary
B. parental intelligence is correlated with offspring intelligence
C. the WAIS-III and the WISC-III are correlated
D. the WAIS-III and the WISC-III are uncorrelated

A

Correct Answer is: B
None of these answers is great, but the only one that is possible is that parental intelligence and offspring intelligence are correlated. To know this for sure, we would need to know more about moderate and low scorers on the WAIS-III – do their children have moderate and low scores, respectively on the WISC-III? However, none of the other choices makes any sense. For instance, we can’t say that intelligence is hereditary from this information, since environmental rather than genetic factors may have resulted in the similarity of scores between mother and child. Also, we can’t say the WISC-III is correlated (or uncorrelated) with the WAIS-III. To measure the correlation between two tests, one must administer both of them to the same set of examinees.

4
Q
Which of the following is a measure of "amount of variability accounted for"
Select one:
A. alpha
B. Cohen's d
C. eta squared
D. F-ratio
A

Correct Answer is: C
The “amount of variability accounted for” is assessed by a squared correlation coefficient. Eta squared is the square of the correlation coefficient (i.e., the correlation between the treatment and the outcome) and is used as an index of effect size.
Alpha* is the level of significance set by a researcher prior to analyzing the data. Cohen’s d* is used as an index of effect size, but it is a measure of the mean difference between two groups. The F-ratio* is the statistic calculated when using the analysis of variance (* incorrect options).
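As a rough illustration of "amount of variability accounted for," eta squared can be computed as the between-groups sum of squares divided by the total sum of squares. The sketch below assumes a simple one-way design; the function name and the scores are made up for illustration.

```python
def eta_squared(groups):
    """Proportion of total variability accounted for by group membership."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    # Total variability of every score around the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Variability of the group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


# Hypothetical outcome scores for three treatment groups.
scores = [[2, 3, 4], [4, 5, 6], [6, 7, 8]]
effect = eta_squared(scores)
print(f"eta squared = {effect:.2f}")
```

A value near 1 would mean that most of the variability in the outcome is associated with the treatment; a value near 0 would mean that very little is.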
Additional Information: Correlation and the Correlation Coefficient

5
Q
In designing a research study, you take a number of steps that have the effect of reducing beta. This means that you have reduced the probability of:
Select one:
A. retaining a true null hypothesis.
B. retaining a false null hypothesis.
C. rejecting a true null hypothesis.
D. rejecting a false null hypothesis.
A

Correct Answer is: B
Beta is the probability of making a Type II error, or of retaining a false null hypothesis. In plain language, it is the probability of failing to detect a true effect.
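To make the idea of "reducing beta" concrete, here is a minimal sketch that computes beta for a one-tailed, one-sample z-test. The test type, the effect size of 0.5 standard deviations, and the sample sizes are assumptions chosen for illustration; they are not part of the question.

```python
from math import sqrt
from statistics import NormalDist


def beta_one_tailed_z(effect_size, n, alpha=0.05):
    """Type II error rate for a one-tailed, one-sample z-test.

    effect_size is the true mean difference in standard deviation units.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # When the effect is real, the test statistic is centered at
    # effect_size * sqrt(n); beta is the chance it still falls below z_crit.
    return NormalDist().cdf(z_crit - effect_size * sqrt(n))


beta_small_n = beta_one_tailed_z(effect_size=0.5, n=20)
beta_large_n = beta_one_tailed_z(effect_size=0.5, n=50)
print(f"beta with n=20: {beta_small_n:.3f}")
print(f"beta with n=50: {beta_large_n:.3f}")
```

Under these assumptions, increasing the sample size is one of the steps that lowers beta, which is the same as raising power (1 minus beta).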
Additional Information: Type II (Beta) Error

6
Q

A MANOVA is used to statistically analyze data when:
Select one:
A. a study includes two or more independent variables
B. a study includes two or more dependent variables
C. there are more than two levels of a single independent variable
D. a study includes at least one independent variable that is a between-groups variable and another independent variable that is a within-subjects variable

A
Correct Answer is: B
A MANOVA (multivariate analysis of variance) is used to analyze the effects of one or more independent variables on two or more dependent variables that are each measured on an interval or ratio scale.
a study includes two or more independent variables

A factorial ANOVA is used to analyze data when a factorial design, which includes two or more independent variables, is used and the dependent variable is measured on an interval or ratio scale.

there are more than two levels of a single independent variable

A one-way ANOVA is used when a study has one independent variable and more than two independent groups.

a study includes at least one independent variable that is a between-groups variable and another independent variable that is a within-subjects variable

The split-plot (mixed) ANOVA is the appropriate technique when at least one independent variable is a between-groups variable and another independent variable is a within-subjects variable.

7
Q
If data points are widely scattered around a regression line, it would indicate
Select one:
A. high heteroscedasticity.
B. low heteroscedasticity.
C. low homoscedasticity.
D. a low correlation coefficient.
A

Correct Answer is: D
Simply put, a lot of variance around the regression line indicates that the correlation isn’t too high. Be careful not to confuse this with the idea of heteroscedasticity. This term means that the scatter is uneven at different points of the continuum. For instance, there might be high variability around the regression line at low x (predictor) values, and low variability around the line at high x values. In other words, heteroscedasticity refers to a differential level of scatter, not high scatter.

8
Q

A psychologist is conducting research to evaluate the effectiveness of three predictor tests of overall mental health he has developed. He administers the predictors to 35 individuals randomly chosen from the population of interest and obtains a squared multiple correlation coefficient (R²) of .47. If the psychologist administers the predictors to another 70 individuals drawn from the same population, the best prediction is that he would obtain an R² that is:
Select one:
A. lower than .47.
B. about equal to .47.
C. slightly to moderately higher than .47.
D. much higher than .47.

A

Correct Answer is: C
The principle behind this question is that the greater the range of scores in both the predictor(s) and the criterion, the higher the validity coefficient will be. If you administer the predictors to 70 people as opposed to 35, you are likely to get a somewhat greater range of scores in the former case. Therefore, you will get a somewhat higher correlation coefficient. This choice (“much higher than .47”) is not a good answer. Increasing the range of scores can only do so much for your correlation coefficient, especially if you already have a reasonably representative sample to begin with. Increasing the sample size from 35 to 70, for example, will not turn a poor set of predictors into a good one.
Some of you might have gone for this choice (“lower than .47”), thinking that, due to shrinkage, the correlation coefficient would be smaller. Shrinkage, however, is associated with the development of a predictor or set of predictors. It occurs when, based on research with one sample, items for a predictor are chosen from a larger pool, and the newly developed predictor is then tested on a second sample. The correlation coefficient for the second sample is likely to be smaller, because the predictor was “tailor made” for the first sample. In this question, however, the predictors are not in the process of development, and the first group of 35 people is not a validation sample (i.e., a sample of people used to determine which items to retain for the final version of the test). To see if you understand this distinction, try to rewrite the question, changing as few words as possible, so that this (“lower than .47”) becomes the best answer.
Additional Information: Multiple Correlation and Multiple Regression

9
Q
The best control for practice effects in an experiment is:
Select one:
A. counterbalancing.
B. random selection.
C. double-blind.
D. equivalent groups.
A

Correct Answer is: A
If subjects are getting a series of treatments, and the order of presentation might affect the outcome, you would present the treatments in different orders in a counterbalanced design to control for the possible practice effect of receiving the treatments in a set order.
Additional Information: Ways to Increase External Validity

10
Q
A study is conducted to determine if males and females differ in terms of the reasons they buy automobiles; 100 males and 100 females are asked whether the primary reason they bought their car was its appearance, its price, or its perceived reliability. The best statistical test to evaluate the hypothesis that males and females differ in these preferences would be a
Select one:
A. factorial ANOVA.
B. t-test for independent samples.
C. one-way ANOVA.
D. chi-square test.
A

Correct Answer is: D
The chi-square test is used to analyze the results of studies in which the data consist of classifications of objects into categories (nominal data). For example, in this study, the data will be a 2 X 3 matrix of frequencies, or counts, of responses within each category (male-appearance, female-appearance, male-price, female-price, male-reliability, female-reliability). The chi-square test could then be used to determine whether the observed frequencies differ significantly from what would be expected if males and females did not differ in these preferences. All the other choices are tests used in studies that use interval or ratio data as opposed to categorical data. With interval or ratio data, the difference between two scale values can be quantified (e.g., on an IQ test, the difference between an IQ of 100 and 85 is equivalent to the difference between 130 and 115). With such data, one can obtain the mean scores of different research groups or trials, and choices A, B, and C are all tests that involve comparing group means to each other. By contrast, with nominal data, the concept of mean scores would not make sense; all you can do is count frequency of occurrence within categories.
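The comparison of observed and expected frequencies described above can be sketched as follows. The counts in the 2 X 3 table are invented for illustration, and the function name is hypothetical; this is only a sketch of the chi-square statistic for a contingency table, not a full significance test.

```python
def chi_square_independence(table):
    """Return the chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency expected if row and column variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical counts: rows = males, females;
# columns = appearance, price, reliability.
counts = [
    [50, 30, 20],
    [30, 30, 40],
]
chi2, df = chi_square_independence(counts)
print(f"chi-square = {chi2:.2f}, df = {df}")
```

The statistic would then be compared with the critical chi-square value for the given degrees of freedom (5.99 for df = 2 at the .05 level).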
Additional Information: Chi-Square Test

11
Q
The use of "pooled variance" in statistics assumes that:
Select one:
A. the sample sizes are equal
B. the sample variances are equal
C. the population sizes are equal
D. the population variances are equal
A

Correct Answer is: D
Pooled variance is the weighted average of the group variances. The variances are “weighted” based on the number of subjects in each group (more precisely, by each group’s degrees of freedom, n − 1). Use of a pooled variance assumes that the population variances are approximately the same, even though the sample variances differ.
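A minimal sketch of the pooled variance for two groups, assuming the usual formula in which each sample variance is weighted by its degrees of freedom. The two groups of scores are made up for illustration.

```python
from statistics import variance


def pooled_variance(group_1, group_2):
    """Weighted average of two sample variances, weighted by n - 1."""
    n1, n2 = len(group_1), len(group_2)
    # statistics.variance returns the sample variance (divides by n - 1).
    s1_sq, s2_sq = variance(group_1), variance(group_2)
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)


# Hypothetical scores for two groups of unequal size.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6]
pooled = pooled_variance(group_a, group_b)
print(f"pooled variance = {pooled:.2f}")
```

Because the larger group contributes more degrees of freedom, the pooled value lies closer to that group's variance than a simple average would.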

12
Q
An admissions committee is planning to modify its application and admissions policy. They are evaluating the current student enrollment and are interested in the relationship between gender and high school GPA. Which statistical method would be used?
Select one:
A. Point biserial correlation
B. Multiple correlation
C. Canonical correlation
D. Tetrachoric correlation
A

Correct Answer is: A
The point biserial correlational technique is used when one variable is dichotomous (gender) and one is continuous (high school GPA).
Multiple correlation

Multiple correlation is used when there are two or more predictor variables and a single criterion variable.

Canonical correlation

Canonical correlation is used when there are two or more predictor variables and two or more criterion variables.

Tetrachoric correlation

Tetrachoric correlation is a technique used to estimate the magnitude of the relationship between two continuous variables that have been dichotomized, such as dividing age into two groups: under 40 and over 40.
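For the point biserial case, a short sketch may help: the coefficient is simply a Pearson correlation in which the dichotomous variable is coded 0/1. The data, the 0/1 coding, and the function names below are assumptions for illustration only.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def point_biserial(groups, scores):
    """Point biserial r from group means, using the textbook formula."""
    ones = [s for g, s in zip(groups, scores) if g == 1]
    zeros = [s for g, s in zip(groups, scores) if g == 0]
    p = len(ones) / len(scores)
    # pstdev is the standard deviation computed with n in the denominator.
    return (mean(ones) - mean(zeros)) / pstdev(scores) * sqrt(p * (1 - p))


# Hypothetical data: gender coded 0/1 (the coding is arbitrary) and GPA.
gender = [0, 0, 0, 1, 1, 1]
gpa = [2.8, 3.0, 3.2, 3.1, 3.5, 3.6]

r_pearson = pearson_r(gender, gpa)
r_pb = point_biserial(gender, gpa)
print(f"Pearson r on 0/1 coding: {r_pearson:.3f}")
print(f"Point biserial formula:  {r_pb:.3f}")
```

The two routes give the same value, which is why the point biserial coefficient is often described as a special case of the Pearson r.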

Additional Information: Other Correlation Coefficients

13
Q

The correlation obtained between two tests supposedly measuring the same ability will be affected most by the:
Select one:
A. time of day during which the tests are taken.
B. reliability of the tests used.
C. whether raw scores or standard scores are used as data.
D. ranges of abilities tested

A

Correct Answer is: B
If a test’s reliability is low, the scores obtained will contain a large amount of measurement error. If you correlate two tests that each contain a lot of error, the obtained correlation will underestimate their true relationship. Remember that low reliability of either the predictor or the criterion measure lowers (attenuates) the obtained correlation.
Additional Information: Reliability

14
Q

You conduct a study designed to assess the effectiveness of psychotherapy in the treatment of depression. You work with two groups, one of which receives the therapy and one of which is an attention-only control group. All of your subjects are hospitalized inpatients; thus, all of them are extremely depressed and therefore score extremely low on your pretest measure of depression. The biggest threat to external validity in this study is:
Select one:
A. regression to the mean
B. reactivity
C. interaction between selection and treatment
D. pretest sensitization

A

Correct Answer is: C
Note that you are being asked for the biggest threat to external validity, not internal validity in this question. Therefore, you can rule out regression to the mean, which is generally viewed as a threat to internal validity (regression probably wouldn’t threaten internal validity anyway in this case, since both groups appear to be equivalent in terms of their baseline depression levels).
External validity refers to the generalizability of research results. An “interaction between selection and treatment” means that the effect of a treatment may not generalize to other members of the target population who differ in some way from the research subjects. For example, in this case, it’s possible that your therapy is effective for individuals who are highly depressed, but would not have any effect on individuals who are moderately depressed.
Additional Information: Threats to External Validity

15
Q
A set of past graduate students are divided into two groups by a doctorate admissions committee. One group consists of students who finished the program in five years or less, the other consists of those who did not. Based on undergraduate grade point average and GRE score, which of the following could be used to predict successful completion of the graduate program?
Select one:
A. MANOVA
B. Structural equation modeling
C. Discriminant function analysis
D. Cluster analysis
A

Correct Answer is: C
Discriminant function analysis (DA) is used to determine which continuous variables discriminate between two or more naturally occurring groups, or provide insights into how each predictor (e.g., grades, GRE score) individually and/or in combination predicted completion or non-completion of a graduate program. In DA, the independent variables are the predictors and the dependent variables are the groups. In contrast, in MANOVA, the independent variables are the groups and the dependent variables are the predictors.
A multivariate analysis of variance (MANOVA) is used to analyze the effects of one or more independent variables on two or more dependent variables that are each measured on an interval or ratio scale.

Structural equation modeling is a technique used to evaluate or confirm the cause-and-effect or hypothesized relationship between both measured and latent variables.

Cluster analysis is a method for grouping objects of similar kind into respective categories. It can be used to discover structures in data without providing an explanation/interpretation.
Additional Information: Discriminant Function Analysis

16
Q
With regard to research design, the term external validity refers to:
Select one:
A. significance
B. control
C. generalizability
D. accuracy
A

Correct Answer is: C
External validity refers to the ability to generalize findings beyond the specifics (e.g., time, setting, and subjects) of a research study. Internal validity is the extent to which changes in the dependent variable can be attributed to the independent variable. Both types of validity are research design considerations.
Additional Information: External Validity

17
Q

A moderator is
Select one:
A. a variable that affects the direction or strength of the association between two other variables.
B. an explanation of how external physical events take on internal psychological significance.
C. a variable that identifies the relationship between two variables and serves to magnify the strength of the variables.
D. a variable that accounts for the relationship between two variables.

A

Correct Answer is: A
In general, a moderator is a qualitative (e.g., race, sex, class) or quantitative (e.g., level of reward) variable that affects the direction and/or strength of the relation between an independent or predictor variable and a dependent or criterion variable. A moderator influences the direction or strength of the relationship between two other variables; it does not account for that relationship. In contrast, a variable functions as a mediator to the extent that it accounts for the relation between the predictor and the criterion.

18
Q
A psychologist wants to study the effectiveness of a new treatment she developed to reduce self-mutilative behaviors in patients with Borderline Personality Disorder. She plans to use a single-subject design but, if effective, she does not want to withdraw the treatment due to the potential harm that could result. She should, therefore, use which of the following research designs:
Select one:
A. ABAB
B. multiple baseline
C. reversal
D. latin square
A

Correct Answer is: B
A multiple baseline design is a single-subject design in which an independent variable is sequentially administered across two or more subjects, behaviors, or settings (i.e., across “baselines”). The multiple baseline design has the advantage of not having to withdraw the treatment once it has been applied to a baseline.
Reversal designs, on the other hand, such as the ABA or ABAB designs have a second baseline (the second “A”), during which the treatment is withdrawn. The latin square design is not a single-subject design. Rather, it uses many subjects who are all administered all levels of an independent variable, but the order of administration varies between subjects or subgroups of subjects.
Additional Information: Single-Subject Designs

19
Q
The underlying structure in a set of variables is identified by which of the following?
Select one:
A. canonical correlation
B. multiple regression
C. factor analysis
D. discriminant analysis
A

Correct Answer is: C
Factor analysis is a complex statistical technique designed to determine the degree to which a large set of variables can be accounted for by fewer, underlying constructs (referred to as “factors” or “principal components” ). For example, factor analyses of the WAIS-IV have suggested that four factors - verbal comprehension, perceptual reasoning, processing speed, and working memory - explain, to a large degree, scores on the subtests.
Additional Information: Factor Analysis

20
Q

An organization is interested in improving employee morale in an off-site office with 150 employees. An organizational psychologist is contracted to identify and train the employees with the lowest morale. The employees scoring in the bottom 10% of a pretest are selected for extensive training. At the conclusion of the training, a posttest is administered and improvement in scores is noted. Test performance improvement would be expected even without training because:
Select one:
A. There has been a lapse of time between the first and second administrations.
B. Such tests are notably unreliable, particularly when based on small samples.
C. Regression of scores toward the mean is to be expected as a purely chance phenomenon.
D. The range for which the test was designed has been restricted by the method of sampling.

A

Correct Answer is: C
The net effect of regression toward the mean is that the lower scores (or measurements) on the pretest tend to be higher on the posttest, and the higher scores (or measures) on the pretest tend to be lower on the posttest. It is important to note that regression is always toward the population mean of the group. The improvement therefore reflects essentially no real change from pretest to posttest due to the independent variable (the treatment). Regression toward the mean matters when conducting experiments because it threatens the internal validity of the design, and it occurs whenever subjects are chosen on the basis of extreme pretest scores.
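Regression toward the mean can be illustrated with a small simulation in which no training is given at all. The sketch below assumes each observed score is a stable "true" morale level plus random measurement error; the distribution parameters, the seed, and the sample of 5,000 (larger than the 150 employees in the question, to keep the result stable) are all assumptions.

```python
import random

random.seed(0)
N_EMPLOYEES = 5000  # hypothetical; large so that the pattern is stable

# Each person has a stable true score; each test adds independent error.
true_morale = [random.gauss(50, 10) for _ in range(N_EMPLOYEES)]
pretest = [t + random.gauss(0, 10) for t in true_morale]
posttest = [t + random.gauss(0, 10) for t in true_morale]  # no treatment

# Select the bottom 10% on the pretest, as in the question.
n_selected = N_EMPLOYEES // 10
lowest = sorted(range(N_EMPLOYEES), key=lambda i: pretest[i])[:n_selected]

pre_mean = sum(pretest[i] for i in lowest) / n_selected
post_mean = sum(posttest[i] for i in lowest) / n_selected
print(f"selected group, pretest mean:  {pre_mean:.1f}")
print(f"selected group, posttest mean: {post_mean:.1f}")
```

Even though nothing was done between the two tests, the selected group's posttest mean moves back toward the overall mean, because part of what made their pretest scores extreme was chance error.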
Additional Information: Regression to the Mean

21
Q
In a positively skewed distribution, one would most likely find, ranked from lowest to highest in value, the:
Select one:
A. median, mean, mode.
B. median, mode, mean.
C. mean, mode, median.
D. mode, median, mean.
A

Correct Answer is: D
You have to picture the positively skewed curve in order to get this correct. Positive skewness means there are some outliers (extreme scores) way over on the positive side. That’s where the tail is, way off to the right, or positive, end. Since the mean takes into account the magnitude of the scores, these outliers can be pictured as “pulling” the mean to the positive side, or the right. So, in any ordering of measures of central tendency, the mean would be the highest value. Thus, you can eliminate the two distractors that don’t list the mean as the highest value. To distinguish between the remaining answers, let’s go back to consider what the median is. The median is the middlemost point irrespective of value. If you’ve pictured the curve correctly you can see that more than half the cases fall on the right side because some are way over on the positive side. If you put a line where the highest point is on the curve, which is the mode, you’d see that more than half the cases fall to the right of that line. Hence the median, the 50% point, is to the right of the high point, the mode. This should have gotten you to the correct answer.
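The ordering can be checked on a small, made-up set of positively skewed scores (a few extreme high values and a cluster of low ones). This is only an illustration; the data are invented.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed scores: most are low, a few are very high.
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

mode_value = mode(scores)
median_value = median(scores)
mean_value = mean(scores)

print(f"mode = {mode_value}, median = {median_value}, mean = {mean_value}")
```

The high outliers pull the mean furthest to the right, the median sits in between, and the mode stays at the peak of the distribution.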
Additional Information: Skewed Distributions, Measures of Central Tendency

22
Q

To use the statistical technique known as trend analysis, you need:
Select one:
A. a quantitative independent variable.
B. a linear relationship between independent and dependent variables.
C. a true experimental research design.
D. two or more independent variables.

A

Correct Answer is: A
Trend analysis is what it sounds like; i.e., it is used to identify trends and, therefore, requires a quantitative independent variable. You might use trend analysis, for example, to determine if the amount of time you spend studying is related to your score on the licensing exam in a linear or nonlinear fashion.
Additional Information: Trend Analysis

23
Q
Which of the following statistical techniques would involve specifying a model of a problem domain that may involve observed and latent variables related to each other causally and non-causally in a unidirectional and bi-directional fashion?
Select one:
A. multivariate multiple regression
B. structural equation modeling
C. multiple ANOVA
D. discriminant function analysis
A

Correct Answer is: B
Structural equation modeling is a complex statistical technique that is used to explore and test relationships among many variables. The variables may be latent (i.e., unobserved variables, such as hypothetical traits or constructs) or observed, may have causal or correlational relationships, and the causal relationships may be specified as unidirectional or bi-directional. The first step in structural equation modeling is model specification. Here, you specify the variables involved, whether they are latent or observed, and the expected relationships among them. The results of statistical analysis indicate whether or not the model is a good fit for the data. The statistics involved are often specialized versions of other multivariate techniques, such as factor analysis and multivariate multiple regression, and the analysis requires specialized statistical software packages such as LISREL or EQS.
Regarding the other choices, multivariate multiple regression allows you to test hypothesized predictive relationships between multiple input (or predictor) and output (or criterion) variables. A multiple ANOVA (MANOVA) is a statistical significance test used in experiments with multiple dependent variables. And discriminant function analysis is used to identify variables that are most useful for distinguishing among two or more groups.
Additional Information: Structural Equation Modeling

24
Q

In forward stepwise multiple regression analysis, the goal is to obtain the smallest subset of predictors to account for the largest amount of variability in the criterion variable. Statistically, this involves:
Select one:
A. adding predictors to the multiple regression equation and determining, through statistical analysis, if the coefficient of multiple determination is significantly increased
B. using the correction for attenuation formula to estimate what the predictive power of the multiple regression equation would be if all the predictors had perfect reliability
C. using the Spearman-Brown Prophecy formula to estimate the magnitude of the multiple correlation coefficient if all the predictors were used, and comparing the result to the magnitude of the coefficient when different subsets of the predictors are used
D. administering different subsets of the predictors to two validation samples, and conducting statistical analyses to estimate the degree of shrinkage in the multiple correlation coefficient from the first to the second validation sample

A

Correct Answer is: A
The goal of stepwise regression analysis is to derive the smallest subset of predictors, out of a larger set, that maximizes the ability to predict outcome on a criterion variable. There are two types of stepwise multiple regression: forward and backward. In forward stepwise regression, predictors are successively added to the multiple regression equation. With each addition, an analysis is conducted to determine if the predictive power of the equation is increased. Predictive power is measured by the squared multiple correlation coefficient (also known as the coefficient of multiple determination).

Thus, this choice is the best answer.

Additional Information: Stepwise Multiple Regression

25
Q
Which of the following is a statistical measure of the degree of difference among scores of subjects within the same experimental or treatment group?
Select one:
A. F ratio
B. mean square between
C. mean square within
D. standard error of the mean
A

Correct Answer is: C
Mean square within (or MSW) is a measure of within-group variance – the degree to which subjects within the same experimental group differ from each other. MSW is the denominator of the F ratio, and is referred to as the error term. The larger the magnitude of MSW, the less likely the F ratio will be significant.
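As a sketch of how MSW enters the F ratio, the following computes mean square between, mean square within, and F for a one-way design. The scores and the function name are hypothetical.

```python
def anova_mean_squares(groups):
    """Return (MS between, MS within, F) for a one-way ANOVA."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-groups: group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-groups: each score around its own group mean (the error term).
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within


# Hypothetical scores for three treatment groups.
scores = [[2, 3, 4], [4, 5, 6], [6, 7, 8]]
msb, msw, f_ratio = anova_mean_squares(scores)
print(f"MSB = {msb:.2f}, MSW = {msw:.2f}, F = {f_ratio:.2f}")
```

Making the scores within each group more spread out would raise MSW and shrink F, which is why a large MSW makes a significant result less likely.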
Additional Information: Logic of the ANOVA and the Derivation of the F Ratio

26
Q
Which of the following techniques would not be useful for controlling or assessing the effects of an extraneous variable?
Select one:
A. stratified random sampling
B. blocking
C. matching
D. ANCOVA
A

Correct Answer is: A
Stratified random sampling involves dividing a population of interest into sub-populations (strata) and obtaining random samples from each stratum. For instance, a researcher interested in studying the American population as a whole may break it down by ethnic groups and take proportionate random samples from each. The technique is designed to ensure that subjects are representative of the population of interest. Unlike the other choices, it is not used to control for the effects of an extraneous variable.
Additional Information: Ways to Increase External Validity

27
Q

You have conducted a study assessing the relationship between salary and job performance, and you find a significant correlation between these two variables. Your assistant tells you that the data fail to take into account a $25.00 cost of living raise which every employee received. You should:
Select one:
A. decide that the raise invalidated the research.
B. reanalyze the data after the raises have been added to the current salary.
C. not worry about small details; the actual amount is too small to make a significant difference.
D. assume the correlation will not be affected.

A

Correct Answer is: D
The basic point being tested here is that if you add a constant to each score – in either or both data sets – the relationship between the two variables won’t be affected. In other words, adding a constant to every score does not affect the correlation coefficient. The same is true of subtracting a constant from every score, or of multiplying or dividing all scores by a positive constant (a negative multiplier reverses only the sign of the coefficient).
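This property is easy to verify numerically. The sketch below uses invented salary and performance figures and a hand-written Pearson function; only the $25 raise comes from the question.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical weekly salaries and performance ratings.
salary = [400.0, 450.0, 500.0, 550.0, 700.0]
performance = [3.1, 3.0, 3.8, 4.2, 4.5]

r_original = pearson_r(salary, performance)
r_after_raise = pearson_r([s + 25.0 for s in salary], performance)
r_rescaled = pearson_r([s * 1.1 for s in salary], performance)

print(f"original:          {r_original:.4f}")
print(f"after $25 raise:   {r_after_raise:.4f}")
print(f"after 10% scaling: {r_rescaled:.4f}")
```

All three values agree (up to floating-point rounding), because the correlation depends only on each score's standing relative to the others, not on the units or the zero point.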

28
Q
The technique which allows a researcher to identify the underlying (latent) factors that relate to a set of measured variables and the nature of the causal relationships between those factors is:
Select one:
A. structural equation modeling (SEM)
B. cluster analysis
C. Q-technique factor analysis
D. survival analysis
A

Correct Answer is: A
Structural equation modeling is a multivariate technique used to evaluate the causal (predictive) influences or test causal hypotheses about the relationships among a set of factors.
Cluster analysis* is used to identify homogeneous subgroups in a heterogeneous collection of observations. Q-technique factor analysis* determines how many types of people a sample of people represents. Survival analysis* is used to assess the length of time to the occurrence of a critical event (* incorrect options).
Additional Information: Structural Equation Modeling

29
Q
If a person has a T-score of 70 in a normal distribution with 200 people, what does the 70 mean?
Select one:
A. 70th percentile
B. 3 standard deviations above the mean
C. z-score of plus one
D. better than 97%
A

Correct Answer is: D
This is a difficult question because none of the choices offers what you would expect, which is “the 98th percentile.” Instead, the best choice is answer D, “better than 97%.” A T-score of 70 is two standard deviations above the mean (the mean of a T-score distribution is 50; the standard deviation is 10). When any score in a normal distribution is two standard deviations above the mean, about 98 percent of the distribution falls below that score. In this case, about 98 percent of the scores are below a T-score of 70; in other words, the person scored better than approximately 97% of the people in the distribution.
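The conversion from a T-score to a percentile can be checked with a short sketch using Python's standard-library `statistics.NormalDist`. The variable names are illustrative only.

```python
from statistics import NormalDist

T_MEAN, T_SD = 50, 10          # T-score distribution: mean 50, SD 10

t_score = 70
z = (t_score - T_MEAN) / T_SD  # standard deviations above the mean
percent_below = NormalDist().cdf(z) * 100

print(z, round(percent_below, 1))
```

The z-score is 2.0 and the area below it is roughly 97.7%, which is why the score is usually described as the 98th percentile and why “better than 97%” is the closest choice.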

30
Q

A percentile rank is
Select one:
A. a norm-referenced score, but not a standard score.
B. a standard score, but not a norm-referenced score.
C. a standard score and a norm-referenced score.
D. neither a standard score nor a norm-referenced score.

A

Correct Answer is: A
To answer this question, you have to be able to define and understand three terms: norm-referenced, standard score, and percentile rank. A norm-referenced score is one that is interpreted in terms of a comparison to others who have taken the same test. A standard score is a type of norm-referenced score that is interpreted in terms of how many standard deviation units a score falls above or below the mean. Examples include z-scores and T-scores. A percentile rank indicates the percentage of scores that fall below a given score. For example, a person who achieves a percentile rank of 90 on the SAT scored better than 90% of others who took the test. Since interpretation of percentile ranks involves a comparison between scorers, a percentile rank is a norm-referenced score. However, since it is not interpreted in terms of standard deviation units, it is not a standard score.
Additional Information: Percentile Ranks

31
Q
Which of the following would increase the power of a statistical test?
Select one:
A. an increase in alpha
B. a decrease in alpha
C. a decrease in sample size
D. use of a two-tailed test
A

Correct Answer is: A
The “power” or sensitivity of a statistical test is the probability of rejecting the null hypothesis when it is false, that is, the probability of correctly identifying that a difference exists. When alpha is increased (e.g., from .01 to .05), it becomes easier to reject the null hypothesis and, consequently, power is also increased. All of the other choices (a decrease in alpha, a decrease in sample size, or the use of a two-tailed test) would decrease the test’s power.
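To make the effect of each choice concrete, the sketch below approximates power for a simple z-test on a mean. It assumes a known population standard deviation, a standardized effect size of 0.5, and a sample of 25; the function name `z_test_power` and these values are illustrative assumptions, not part of the question.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def z_test_power(effect_size, n, alpha, tails=1):
    """Approximate power of a z-test for a standardized effect size."""
    shift = effect_size * n ** 0.5                 # how far the true mean sits from the null, in SE units
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / tails)
    power = 1 - STD_NORMAL.cdf(z_crit - shift)     # rejection in the predicted direction
    if tails == 2:
        power += STD_NORMAL.cdf(-z_crit - shift)   # rejection in the opposite tail (tiny)
    return power

print(round(z_test_power(0.5, 25, alpha=0.01), 2))            # smaller alpha
print(round(z_test_power(0.5, 25, alpha=0.05), 2))            # larger alpha
print(round(z_test_power(0.5, 10, alpha=0.05), 2))            # smaller sample
print(round(z_test_power(0.5, 25, alpha=0.05, tails=2), 2))   # two-tailed test
```

Under these assumptions, raising alpha from .01 to .05 raises power, while shrinking the sample or switching to a two-tailed test lowers it.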
Additional Information: Power of a Statistical Test

32
Q

When a multiple regression analysis is employed to predict outcome, there should be
Select one:
A. low intercorrelations among the predictors and high correlation of each predictor with the criterion.
B. high intercorrelations among the predictors and high correlation of each predictor with the criterion.
C. low intercorrelations among the predictors and low correlation of each predictor with the criterion.
D. high intercorrelations among the predictors and low correlation of each predictor with the criterion.

A

Correct Answer is: A
This question has come up in other examples throughout the tests. Simply stated, we need to have a high correlation between the predictor and the criterion we’re making predictions about (this eliminates two of the four alternatives). Also, we need to have the predictors themselves be more or less independent of each other. That is, they shouldn’t intercorrelate. If they do, then there’s no point in using all of them – if they all measure the same thing, why not use just one? So, you don’t want the predictors to intercorrelate.
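The cost of intercorrelated predictors can be seen in the standard formula for the squared multiple correlation with two predictors. The sketch below plugs in invented correlations (each predictor correlates .50 with the criterion); the function name is hypothetical.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation for two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

independent = r_squared_two_predictors(0.5, 0.5, r_12=0.0)
overlapping = r_squared_two_predictors(0.5, 0.5, r_12=0.8)

print(round(independent, 3), round(overlapping, 3))
```

With uncorrelated predictors the two validities simply add (R² = .50), whereas with predictors that correlate .80 the second one adds little beyond the first (R² is about .28).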
Additional Information: Multiple Correlation and Multiple Regression

33
Q

Which of the following best describes the use of confidence intervals?
Select one:
A. estimate true scores from obtained scores
B. calculate the standard error of measurement
C. calculate the test’s mean
D. calculate the standard deviation

A

Correct Answer is: A
Confidence intervals allow us to determine the range within which an examinee’s true score on a test is likely to fall, given his or her obtained score.
The standard error of measurement is used to construct confidence intervals, not the other way around.
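As a minimal sketch of how the standard error of measurement feeds into a confidence interval, the example below uses the usual formula SEM = SD × √(1 − reliability). The test's standard deviation (15), its reliability (.91), and the obtained score (110) are made-up values, and the function names are hypothetical.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * sqrt(1 - reliability)

def confidence_interval(obtained, sd, reliability, z=1.96):
    """Range likely to contain the true score (z = 1.96 gives roughly 95%)."""
    margin = z * standard_error_of_measurement(sd, reliability)
    return obtained - margin, obtained + margin

low, high = confidence_interval(obtained=110, sd=15, reliability=0.91)
print(round(low, 2), round(high, 2))
```

Here the SEM is 4.5, so the 95% interval runs from about 101 to about 119 around the obtained score of 110.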
Additional Information: Standard Error of Measurement

34
Q

In a normal distribution of scores, the range of raw scores represented by the percentile rank range of 50 to 55 is _______ the range of raw scores represented by the percentile rank range of 90 to 95.
Select one:
A. less than
B. greater than
C. the same as
D. depending on the standard deviation, either less than, greater than, or the same as

A

Correct Answer is: A
This question is a bit tricky and requires careful reading and a good grasp of the concepts of percentile rank and normal distribution. The easiest way to understand it is in terms of an example. Say that you have a test, with a mean of 70 and a range of possible raw scores from 0-100. The raw score mean is 70; in a normal distribution, the mean is equivalent to a percentile rank of 50 and is in the exact middle of the distribution (if you don’t know why this is, go back and review the Statistics section before attempting to understand this question). In a normal distribution, most of the raw scores are near or at the middle of the distribution; thus, most of the raw scores will be near or at 70. Similarly, the PR score range of 50 to 55 is in the middle part of the distribution, which is to say that most of the raw scores in this part of the distribution will be at or near 70. So the raw score range set by PR 50 to PR 55 will not be wide.
Now if you look at a normal curve, you will see that in the high end of the raw score distribution, there is a long tail spread across the bottom. This reflects the fact that there are relatively few high scorers, and the scores of these individuals are spread out (over the length of this tail). Since the 90 to 95 PR range is in the high end of the distribution, the range of raw scores here will be wider than the range of raw scores in the middle of the distribution.

If you chose “the same as”, you probably did so based on the fact that the percentile rank distribution is flat. This means that the same number of people will score between PR 50 and 55 as between PR 90 and 95. However, the question is not about how many people will score within each PR range. Instead, it’s asking about the raw score ranges these PR ranges correspond to.

If you didn’t understand the above explanation, it might be useful to read it again with the normal curve in front of you. As you’re looking at the curve, remember that it is a raw score distribution, and try to approximate where the percentile ranks in the question would be placed on this distribution. If you still don’t understand, don’t worry too much. As you review this concept and practice with more questions, these things will become clearer and clearer. Remember that difficult technical content requires repeated review, so you should keep track of those particular concepts you need to review on a regular basis.
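For a concrete check of the explanation above, the sketch below converts both percentile-rank ranges into raw scores for a normal distribution with a mean of 70. The standard deviation of 10 is an assumption made only for this illustration (the example in the explanation does not specify one), and `raw_score_width` is a hypothetical helper.

```python
from statistics import NormalDist

scores = NormalDist(mu=70, sigma=10)  # sigma is an assumed value

def raw_score_width(pr_low, pr_high):
    """Width of the raw-score interval between two percentile ranks."""
    return scores.inv_cdf(pr_high / 100) - scores.inv_cdf(pr_low / 100)

middle = raw_score_width(50, 55)
upper = raw_score_width(90, 95)

print(round(middle, 2), round(upper, 2))
```

Each range contains 5% of the examinees, yet the PR 50 to 55 band spans only a little over 1 raw-score point while the PR 90 to 95 band spans more than 3. The ratio between the two widths does not depend on the standard deviation chosen.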
Additional Information: Percentile Ranks

35
Q
Which of the following correlation coefficients is used to assess convergent validity:
Select one:
A. heterotrait-monomethod
B. monotrait-heteromethod
C. heterotrait-heteromethod
D. monotrait-monomethod
A

Correct Answer is: B
The response choices make up a multitrait-multimethod matrix, a complicated method for assessing convergent and discriminant validity. Convergent validity requires that different ways of measuring the same trait yield the same result. Monotrait-heteromethod coefficients are correlations between two measures that assess the same trait using different methods; therefore, if a test has convergent validity, this correlation should be high. Heterotrait-monomethod and heterotrait-heteromethod coefficients are both used to assess discriminant validity (they should be low), and monotrait-monomethod coefficients are reliability coefficients.
Additional Information: Convergent and Discriminant (Divergent) Validation

36
Q
A significant finding for a one-way ANOVA indicates that the
Select one:
A. group means were different.
B. sample means were different.
C. population means were different.
D. within-group variance was different.
A

Correct Answer is: C
We use statistical tests to make inferences about a population. So if we have significant results, we assume that this represents what happens in the real world – that is, in the population.
Additional Information: Parametric Tests

37
Q
Choosing a correlation coefficient is based on factors such as the variables’ scale of measurement and the shape of the relationship between them. Which of the following is used to measure the relationship between two continuous variables when that relationship is nonlinear?
Select one:
A. rho
B. eta
C. phi
D. tau
A

Correct Answer is: B
Eta is the appropriate correlation coefficient to use when both variables are measured on an interval or ratio scale and the relationship between the predictor (the X variable) and the criterion (the Y variable) is curvilinear.
Rho (sometimes referred to as the Spearman rank-order correlation coefficient) is appropriate when both variables are measured as ranks.

The phi coefficient is used when both variables are true (natural) dichotomies.

When both variables are measured on an ordinal scale, Kendall’s tau is appropriate.
Additional Information: Other Correlation Coefficients

38
Q
While studying the use of journaling in the treatment of depression, a researcher finds only individuals with good writing ability benefit from journaling. Writing ability is a(n):
Select one:
A. outcome variable
B. mediating variable
C. moderator variable
D. feedback variable
A

Correct Answer is: C
The strength of the relationship between the independent and dependent variables is affected by a moderator variable. Writing ability is moderating the effects of journaling on the treatment of depression.
Outcome variable* is another term for dependent variable. A mediating variable* is affected by the independent variable and affects the dependent. It is responsible for an observed relationship between an independent variable and a dependent (outcome) variable. A feedback variable* is an unrelated term (* incorrect options).
Additional Information: Factors Affecting the Validity Coefficient

39
Q
In a study comparing the characteristics of adult ADHD patients with those of their first-degree and second-degree biological relatives and of non-patients (controls), the probands are:
Select one:
A. non-patients
B. first degree relatives
C. first and second degree relatives
D. ADHD patients
A

Correct Answer is: D
The ADHD patients are the probands in this study. Probands, or index cases, are the individuals who are first brought to the attention of the researcher - i.e., individuals manifesting the characteristic of interest or disease.

40
Q
You are conducting a study to examine the differences in reaction time between elderly people and young people. Subjects are asked to view stimuli on a computer screen and to press a lever every time they see certain target stimuli. Your results indicate that younger people respond faster than older people, and you conclude that reaction time is faster for younger people. Your conclusion is faulty because of
Select one:
A. carry-over effects.
B. differential attrition effects.
C. a selection bias.
D. cohort effects.
A

Correct Answer is: D
The study described here is an example of a cross-sectional design, in which two or more different age groups are compared to determine whether aging has an effect on a particular dependent variable. A problem with cross-sectional designs is cohort effects. This refers to differences between the groups in experience rather than age that could be accounting for differences between them on the dependent variable. Cohort effects seem like a particularly plausible explanation for the results here, since it’s likely that young people have more experience with computers than older people.
Additional Information: Developmental Research

41
Q

Consider the factor loadings for each of the four tests as explained in the table below:
Correlations of Various Tests with Job Skills

JOB SKILLS

Test	Concentration	Verbal Reasoning	Motoric	Computation
1	.60	.53	.04	.13
2	.21	.11	.87	.09
3	.07	.15	.03	.62
4	.33	.22	.20	.45

If these tests are combined into a test battery, the predictive validity of the battery would be maximized if the correlations among the four tests were
Select one:
A. at least as high as the correlation of each one with its highest correlated skill.
B. as low as possible, preferably close to .00.
C. no higher than the average correlations of each of the four tests with each skill.
D. equal to the sum of the squared correlations of each test with each skill.

A

Correct Answer is: B
The idea behind this question is that when you use various tests for prediction, each of the tests you use should have some relationship to the thing you’re predicting. And, if you use several of these tests together, say, in a battery, it is best if each of the tests tells you something different and apart from the others. That is, each of the tests should contribute some unique information to the equation. If you give two tests and they both give the same information (that is, they correlate), then you needn’t give the two tests. Why not stick with only one, say the cheaper one? Putting this all together, we have the situation where you have several tests, each contributing something to predicting a job skill, but each contributing a unique bit of added information. In other words, the tests themselves should not correlate.
Additional Information: Multicollinearity

42
Q

LISREL would be most useful in
Select one:
A. testing the relationship between multiple predictor variables and one criterion variable.
B. confirming hypotheses regarding relationships between several latent and observed variables.
C. testing a hypothesis regarding the effects of multiple independent variables on one dependent variable.
D. validating questions on a personality inventory.

A

Correct Answer is: B
LISREL is an acronym that stands for linear structural relations analysis. It is a software program used in Structural Equation Modeling, which is a technique used to test theories regarding unidirectional and bi-directional relationships among latent (unobserved) and manifest (observed) variables. Most experts recommend that Structural Equation Modeling be used as a confirmatory method, to provide evidence for a theory, as opposed to an exploratory method used to originate new theories.
Additional Information: Structural Equation Modeling

43
Q
Which one of the following is least likely to attenuate a measure of correlation?
Select one:
A. restricted range
B. homoscedasticity
C. curvilinear relationship
D. the use of unreliable measures
A

Correct Answer is: B
Homoscedasticity refers to even scatter around the regression line. Homoscedasticity is actually a good thing. It wouldn’t attenuate the correlation at all. The other three choices list factors that would attenuate the correlation coefficient.
Additional Information: Factors Affecting the Pearson r

44
Q
_________________ is used to study the cognitive processes that occur during the performance of tasks or solution of problems.
Select one:
A. protocol analysis
B. functional analysis
C. event sampling
D. situation sampling
A

Correct Answer is: A
Protocol analysis, or the “think aloud” technique, is the only technique listed that is useful for obtaining information on cognitive processes. Protocol analysis assumes that subjects who are instructed to verbalize their thoughts in a manner that doesn’t alter the sequence of thoughts mediating the completion of a task can think aloud without any systematic change to their thought processes, so their verbalizations can be accepted as valid data on thinking.
A functional analysis* is used to identify the functions of a behavior - i.e., the antecedents and consequences that maintain the behavior. Both sampling responses are behavioral observation techniques with event sampling* useful when a behavior of interest occurs infrequently and situation sampling* when observing a behavior in a variety of situations (* incorrect options).
Additional Information: Protocol Analysis

45
Q

During a research study the participants are able to guess the research hypothesis, causing them to behave differently than they would under normal conditions. This phenomenon is due to:
Select one:
A. demand characteristics
B. the Hawthorne effect
C. the use of a quasi-experimental design
D. the use of psychic research participants

A

Correct Answer is: A
Demand characteristics are cues in a research study that allow participants to guess the hypothesis. As a result, participants may behave differently than they would under normal conditions.
the Hawthorne effect

The Hawthorne effect is a similar phenomenon, but refers to the tendency of research participants to behave differently due to the mere fact they are participating in research - rather than due to cues about how they are expected to behave.

the use of a quasi-experimental design

Quasi-experimental designs are simply designs which do not randomly assign participants to groups.

the use of psychic research participants

Finally, this choice is a possible, but less probable, cause of this phenomenon.

Additional Information: Threats to External Validity

46
Q

In the analysis of the effects of two independent variables, multiple regression analysis is sometimes used as a substitute for the factorial ANOVA. One advantage of using multiple regression as opposed to a factorial ANOVA is that:
Select one:
A. multiple regression analysis can be used for multiple dependent variables as well as multiple independent variables.
B. continuous or categorical data (as opposed to solely categorical data) can be used to measure the independent variables in multiple regression analysis.
C. the use of multiple regression allows one to estimate the probability that obtained differences on the dependent variable between groups represent true population differences.
D. when multiple regression is used and a significant result is obtained, the conclusion that there is a causal relationship between the independent variables and the dependent variables is more plausible.

A

Correct Answer is: B
One limitation of the ANOVA technique is that independent variables must be divided into categories for the analysis to be conducted. In multiple regression, the researcher has the choice of using categories or continuous data (e.g., scores on a test) to measure the independent variables. This is considered an advantage of regression, because it allows for the data to provide more precise and specific information about the variables being measured.
multiple regression analysis can be used for multiple dependent variables as well as multiple independent variables.

This choice is not true of multiple regression; it is designed for use with one dependent variable only.

when multiple regression is used and a significant result is obtained, the conclusion that there is a causal relationship between the independent variables and the dependent variables is more plausible.

This is also not true; the strength of the conclusion that variables are causally related depends on the research design, not the statistical analysis.

the use of multiple regression allows one to estimate the probability that obtained differences on the dependent variable between groups represent true population differences.

This is true of both multiple regression and ANOVA, since they are both inferential statistical methods.

Additional Information: Multiple Correlation and Multiple Regression

47
Q
A psychologist believes that physical exercise can reduce a person's anxiety level, which reduces the strength of substance cravings in people recovering from substance dependence. According to this hypothesis anxiety is a:
Select one:
A. suppressor variable
B. mediator variable
C. moderator variable
D. criterion contaminator
A

Correct Answer is: B
A mediator variable is a variable that accounts for or explains the effects of an IV on a DV. That is, the IV affects the mediator variable, which affects the DV. In this example, the IV is exercise, the mediator variable is anxiety, which explains how the DV, substance craving, is reduced.
A moderator variable is similar to a mediator variable, but a moderator variable only influences the strength of the relationship between two other variables; it doesn’t fully account for it. For example, if a job selection test has different validity coefficients for different ethnic groups, ethnicity would be a moderator variable because it influences the relationship between the test (predictor) and actual job performance (the criterion) but it does not fully account for the relationship.

A suppressor variable reduces or conceals the relationship between variables. For example, the K scale in the MMPI-2 is a suppressor variable because it measures defensiveness, which can suppress the scores on the clinical scales. The K scale is, therefore, used as a correction factor for some of the clinical scales.

Criterion contamination is the artificial inflation of validity which can occur when raters subjectively score ratees on a criterion measure after they have been informed how the ratees scored on the predictor.

48
Q

In a study of 400 personality variables, it was found that 19 correlated at the .05 level of significance with a measure of actual behavior. The 19 significant correlations could be considered valid for:
Select one:
A. future research.
B. future therapy.
C. both future research and future therapy.
D. neither future research nor future therapy.

A

Correct Answer is: D
At the .05 level of significance, there is a 5% probability of making a Type I error. So, out of 400 relationships, you’d expect 20 or so to be found significant when they really aren’t. Hence the 19 significant correlations probably aren’t very meaningful. They likely have no application to either research or therapy.
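The arithmetic behind this answer can be written out as a short sketch. It assumes, for the sake of illustration, that none of the 400 variables is truly related to behavior, so every significant result would be a Type I error; the variable names are hypothetical.

```python
n_correlations = 400
alpha = 0.05
observed_significant = 19

# Expected number of significant results produced by chance alone,
# assuming every null hypothesis is true.
expected_by_chance = n_correlations * alpha

print(expected_by_chance, observed_significant)
```

About 20 chance findings are expected, so observing 19 is no more than chance alone would produce.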
Additional Information: Type I Error and Alpha Level

49
Q
The Drugs-R-Us company wants to compare the effectiveness of 3 new antidepressant medications. Patients with depression are randomly assigned to one of the three medications and depressive symptoms are measured at weeks 1, 6, and 12. Which type of research design would be most appropriate for this study?
Select one:
A. ABAB
B. between subjects
C. within subjects
D. mixed
A

Correct Answer is: D
A mixed research design has at least one between-subjects independent variable and at least one repeated measures variable (or within-subjects variable). Since this study is comparing the effects on three different groups of subjects (i.e., a between-subjects variable) combined with the use of a repeated measures (within-subjects) variable, it would be considered a mixed design. An ABAB design is a type of reversal design, in which a baseline measure of a behavior is obtained (the “A” phase), the behavior is again measured after a treatment is administered (the “B” phase), the treatment is removed or reversed and the behavior is measured again (the second “A”), and the treatment is then re-applied (the second “B”) and a final behavior measure is taken.
Additional Information: Variations of the Factorial ANOVA

50
Q

If scores obtained by parents on an adult test of intelligence are highly correlated with scores obtained by their children on a childhood intelligence test, one can conclude that
Select one:
A. high parental intelligence is the cause of high childhood intelligence.
B. intelligence has a high heritability factor.
C. scores on the two tests are associated.
D. both nature and nurture account for scores on childhood intelligence tests.

A

Correct Answer is: C
The point of this question is that one cannot make any theoretical conclusions about variables on the basis of a high correlation alone. A correlation means that two variables co-vary, or that values change in a predictable direction–e.g., as the value of one goes up, the other tends also to increase (positive correlation), or as the value of one increases, the other tends to decrease (negative correlation). Another way of saying this is that the two variables are associated. When two variables are correlated, it could be that either one is a cause of the other, or that there are one or more other variables causing the two in question to co-vary. For example, in the case of parent-child intelligence, there could be third variables, such as SES or test bias, that account for any observed association. There is evidence that intelligence has a strong genetic component, but the question is about what one can conclude on the basis of a correlation alone, not about conclusions that have been drawn on the basis of all the available evidence.
Additional Information: Correlation and Causality

51
Q

A psychologist uses a two-group pretest/posttest design to evaluate the effects of a new treatment. She obtains the following data:
	Pretest	Posttest
Group 1	Mean 13.4, SD 1.2	Mean 19.8, SD 1.5
Group 2	Mean 19.5, SD 1.5	Mean 21.7, SD 1.9

The biggest threat to this study's internal validity is
Select one:
A. reactivity.
B. test x treatment.
C. selection.
D. history.
A

Correct Answer is: C
In this study the means of the two groups are very different initially (Pretest), which will make it hard to interpret the results. When internal validity is threatened by initial group differences, this threat is called selection. Note that the term selection is misleading because it actually refers to assignment. If assignment was random, we would expect the pretest scores for Groups 1 and 2 to be approximately equal, which they are not, 13.4 and 19.5, respectively.

52
Q
A researcher inquires about the subjects' performance expectations and beliefs about the purpose of the study at the conclusion of the experiment. The researcher finds the subjects' actual performance is consistent with their beliefs and expectations when analyzing the data. The results of the study may be confounded by:
Select one:
A. the Hawthorne effect
B. demand characteristics
C. carryover effects
D. changing criteria
A

Correct Answer is: B
Demand characteristics are unintentional cues in the experimental environment or manipulation that affect or account for the results of the study. In this situation, the subjects may have acted in ways consistent with their expectations rather than simply in response to the experimental manipulation.
The Hawthorne effect occurs when research subjects act differently because of the novelty of the situation and the special attention they receive as research participants.

Carryover effects occur in repeated measures designs when the effects of one treatment have an impact on the effects of subsequent treatments.
Additional Information: Threats to External Validity

53
Q
If a behavior reaches the criterion level for each phase, treatment is considered to be effective in which of the following designs?
Select one:
A. multiple baseline
B. changing criterion
C. Solomon four-group
D. Latin square
A

Correct Answer is: B
The changing criterion design is a type of single case design that consists of a series of phases in which a different behavioral criterion is set for each phase. If the behavior reaches the criterion level for each phase, the treatment is considered to be effective.
The multiple baseline* sequentially applies a treatment across subjects, settings, or behaviors in a single case design. The Solomon four-group design* is utilized to assess the pretesting effects on the internal and external validity of a study. Partial counterbalancing is used when the number of participants does not permit a completely counterbalanced design. Latin square* design, a type of partial counterbalancing, administers all levels of an independent variable to all subjects but the order of administration varies between subjects or subgroups of subjects (* incorrect options).

54
Q

An advantage of using a MANOVA over multiple one-way ANOVAs is that
Select one:
A. the use of a MANOVA reduces the experiment-wise error rate.
B. a MANOVA can be used when the study involves more than one dependent variable.
C. a MANOVA is the more appropriate test when the researcher has an a priori hypothesis about the nature of the relationship between the independent and dependent variables.
D. a MANOVA involves simpler mathematical calculations.

A

Correct Answer is: A
When a study involves two or more dependent variables, data can be analyzed with either multiple (one for each dependent variable) statistical tests (e.g., multiple one-way ANOVAs) or one MANOVA. An advantage of the latter technique is that it reduces the probability that at least one Type I error (incorrect rejection of the null hypothesis) will be made. This is because the fewer statistical tests one conducts, the less likely it is that a Type I error will occur. In an experiment that involves more than one comparison, the probability of at least one Type I error is referred to as the experiment-wise error rate.
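How quickly the experiment-wise error rate grows with the number of tests can be sketched with the standard formula 1 − (1 − α)^k. This assumes the k tests are independent, which separate ANOVAs on correlated dependent variables only approximate; the function name is hypothetical.

```python
def experimentwise_error_rate(alpha, n_tests):
    """Probability of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** n_tests

for k in (1, 3, 5, 10):
    print(k, round(experimentwise_error_rate(0.05, k), 3))
```

With a per-test alpha of .05, a single test carries a 5% risk, while five separate tests carry a risk of roughly 23% that at least one Type I error occurs.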
Additional Information: Multivariate Analysis of Variance (MANOVA)

55
Q

All of the following are assumptions of the regression equation, except:
Select one:
A. a linear relationship exists between X and Y.
B. the variability of Y scores is equal throughout the range of X scores.
C. one can predict scores on Y on the basis of scores on X.
D. changes in the level of X cause changes in the level of Y

A

Correct Answer is: D
A regression equation is used to predict the value of a Y variable on the basis of a person’s score on an X variable. For example, if an industrial psychologist wanted to use a job applicant’s score on a job selection test to predict his future score on a supervisor’s rating scale, he could develop and use a regression equation to do so. When regression is used, there is not necessarily the assumption that changes in the value of X cause changes in Y. The X and Y variables are correlated, but a correlation between two variables does not always mean that they are causally related.
The incorrect choices are all assumptions of the use of regression.

a linear relationship exists between X and Y.

The technique is based on the assumption that the relationship between X and Y can be depicted as a straight line.

the variability of Y scores is equal throughout the range of X scores.

This choice is referred to as the assumption of homoscedasticity.

one can predict scores on Y on the basis of scores on X.

And this describes the whole purpose of using regression.
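To make the idea of a regression equation concrete, here is a minimal least-squares sketch. The selection test scores and supervisor ratings are made-up numbers, and the function name is hypothetical. Note that the code only predicts Y from X; nothing in it establishes that X causes Y.

```python
from statistics import mean


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line predicting y from x."""
    mean_x, mean_y = mean(x), mean(y)
    sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sum_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: selection test scores (X) and later supervisor ratings (Y).
test_scores = [50, 60, 70, 80, 90]
ratings = [3.0, 3.5, 4.5, 5.0, 6.0]

slope, intercept = fit_line(test_scores, ratings)
predicted_rating = intercept + slope * 75  # predicted rating for an applicant scoring 75
print(slope, intercept, predicted_rating)
```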

Additional Information: Regression Equation

56
Q

According to the central limit theorem,
Select one:
A. as sample size increases, the shape of a sample distribution becomes more normal.
B. as the size of a sampling distribution of means increases, its distribution becomes more normal.
C. as sample size increases, the shape of a sampling distribution of means becomes more normal.
D. as sample size increases, the shape of a sampling distribution of means approximates the shape of the population distribution.

A

Correct Answer is: C
According to the central limit theorem, the shape of a sampling distribution of means approaches normality as sample size increases. The central limit theorem is covered in the Advanced Statistics section of your materials, and you should study it after you have a reasonably solid grasp of the material presented in the rest of the section.
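The theorem can be explored with a small simulation. The sketch below is an illustration only: it assumes a strongly skewed (exponential) population, draws many samples of a given size, and uses skewness as a rough index of how far the sampling distribution of means is from normal. The function names and the choice of population are assumptions, not part of the original card.

```python
import random
from statistics import mean, pstdev


def skewness(values):
    """Rough index of asymmetry; values near 0 suggest a symmetric shape."""
    m, s = mean(values), pstdev(values)
    return mean(((v - m) / s) ** 3 for v in values)


def sampling_distribution_of_means(sample_size, num_samples, rng):
    """Means of many samples drawn from a skewed (exponential) population."""
    return [
        mean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(num_samples)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
skew_small = skewness(sampling_distribution_of_means(2, 2000, rng))
skew_large = skewness(sampling_distribution_of_means(50, 2000, rng))
print(skew_small, skew_large)
```

The sampling distribution based on the larger sample size should show noticeably less skew, even though the population itself is far from normal.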
Additional Information: Central Limit Theorem

57
Q
If you wanted to compare the average depression level (as measured by the number of scored responses given on a depression inventory) of anorexic females to that of non-anorexic females, you would use which of the following statistical tests?
Select one:
A. two-way ANOVA
B. Student's t-test
C. chi-square
D. Kolmogorov
A

Correct Answer is: B
The t-test (which is also known as Student’s t-test) is the appropriate statistical test to use when comparing two means.
A two-way ANOVA would be used to compare means from a study with two independent variables; in this case, there is only one independent variable (diagnosis) with two levels (anorexic vs. non-anorexic).

A chi-square test is used when the data from a study is frequency of observations within categories, as opposed to (as in this case) mean scores of groups.

Finally, the Kolmogorov test (more fully, the Kolmogorov-Smirnov test), an infrequently used test, is used with ordinal data (e.g., ranks).
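For two independent groups, the t statistic can be sketched as follows. The depression scores are invented for illustration, the function name is hypothetical, and the pooled-variance form shown here assumes roughly equal variances in the two groups.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical depression inventory scores.
anorexic = [14, 16, 18, 20, 22]
non_anorexic = [8, 10, 12, 14, 16]

t_value, df = independent_t(anorexic, non_anorexic)
print(t_value, df)
```

The obtained t would then be compared with the critical value for the given degrees of freedom.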
Additional Information: Parametric Tests

58
Q
All of the following are measures of variability except:
Select one:
A. variance
B. standard error
C. range
D. standard deviation
A
Correct Answer is: B
Variability represents the amount of difference found in responses from a population or sample on a topic being investigated. Variance*, range*, and standard deviation* all reflect the variability in the data (* incorrect options).
Standard error (of measurement) is not a measure of variability. It is a statistic indicating the amount of difference in results that is accounted for by flaws or "noise" in the instrument used to measure a variable.
59
Q
A score indicating the percentage of items correct on a test would be using which of the following scales of measurement?
Select one:
A. nominal
B. ordinal
C. interval
D. ratio
A

Correct Answer is: D
There are two defining characteristics of a ratio scale of measurement: 1) Successive data points on the scale reflect equally distant scores from each other. This is true of percentage scores; for instance, the difference between a score of 49% and 50% is the same as the difference between 50% and 51%. 2) The scale has an absolute zero point, which means that a score of “0” reflects a complete absence of whatever is being measured. Moreover, a ratio scale is the only scale of measurement that possesses the quality of an absolute zero point. This is true of percentage scores; a score of 0% indicates that the person answered no questions correctly. If you thought the answer was ordinal scale, you may have been confusing percentage scores with percentile ranks.
Additional Information: Scales of Measurement

60
Q
The risk of sampling error is greatest when a:
Select one:
A. sample size is small
B. test has low reliability
C. test has low validity
D. confounding variable exists
A

Correct Answer is: A
Sampling error is the extent to which a sample value deviates from the corresponding population value which it is supposed to represent. Thus, the smaller the sample size, the greater the risk of sampling error.
You should have been able to eliminate reliability* and validity*, since those are characteristics of a test - which is not applied until after the sampling procedure. Sampling error, as its name implies, takes place during the sampling or selection of subjects. A confounding variable* is a variable that is not of interest in a study but which exerts a systematic effect on the DV (* incorrect options). Thus it would threaten the internal validity of a study but it is not related to sampling error.
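One common way to quantify the link between sample size and sampling error is the standard error of the mean, which is the population standard deviation divided by the square root of the sample size. The sketch below uses a hypothetical population standard deviation of 15; the function name is an assumption for illustration.

```python
from math import sqrt


def standard_error_of_mean(population_sd: float, sample_size: int) -> float:
    """Expected spread of sample means around the population mean."""
    return population_sd / sqrt(sample_size)


# Hypothetical population SD of 15; the standard error shrinks as n grows.
for n in (4, 25, 100):
    print(n, standard_error_of_mean(15, n))
```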
Additional Information: Samples, Populations, and Sampling Error

61
Q

In a research study, there are four subject groups, two groups of all men and two groups of all women. All four groups read the same essay. One group of men and one group of women are told that the essay was written by a woman. The other two groups are told that the essay was written by a man. All groups are then asked to rate the quality of the essay. Based on past results of research, what are the most likely results of this study?
Select one:
A. A main effect of writer gender, such that the prose of women is rated higher than the prose of men.
B. A main effect of writer gender, such that the prose of men is rated higher than the prose of women.
C. A main effect of rater gender, such that women rate the prose of the essay higher than men regardless of the author’s gender.
D. An interaction effect between rater gender and writer gender, such that men rate the prose of the male author higher, and women rate the prose of the female author higher

A

Correct Answer is: B
This question is about how people perceive others based on gender, but it uses terminology that might confuse you if you are not familiar with it. This study has two independent variables, gender of raters (the subjects) and (perceived) gender of the writer. In a study with multiple independent variables, a main effect means that an independent variable had an overall effect, whereas an interaction effect means that an independent variable’s effect differed depending on the level, or value, of another independent variable. For example, a main effect of writer gender could mean that male writers (or female writers) were rated higher; the main effect refers to the effect of writer gender overall. An interaction effect could mean that women rated female writers higher and men rated male writers higher. In other words, the effect of one of the independent variables, writer gender, depended on the value of another, rater gender. In many studies such as these over the years, no interaction effects are usually found. Gender appears to function as an indicator of status for both men and women, and men are rated as more competent by both males and females. For instance, both men and women rate the prose of the same essay as better when they believe the essay is written by a man; both men and women tend to interpret a gender-ambiguous author name as male when the topic of the written work is a male-dominated subject such as politics or economics; and males are rated higher in performance than females for performing the same work.
Additional Information: Factorial ANOVA

62
Q
Which of the following methods of sampling would be best to ensure that specific ethnic or racial groups are not underrepresented?
Select one:
A. Cluster
B. Stratified
C. Systematic
D. Matched
A

Correct Answer is: B
Stratified random sampling involves dividing a population into subsets, or strata, and then randomly selecting research participants from each stratum. For example, to ensure that African Americans, Hispanics, and Asians are properly represented in a study, a researcher would divide the target population into these groups, then randomly select participants from each group. Selection would ensure that the proportion of the study’s research participants in each group reflected that group’s representation in the underlying population.
Regarding the other choices, cluster sampling involves randomly selecting naturally occurring groups (clusters) in the population and then selecting participants from within the selected clusters. For example, in a study of inpatients, a researcher might first randomly select hospitals and then select only patients from the selected hospitals. This differs from stratified random sampling in that in cluster sampling, the groups are randomly selected, whereas in stratified sampling, individuals within targeted groups are randomly selected.

In systematic sampling, every nth (e.g., every 10th) member of a target population is selected.

In matched sampling, research subjects are matched on a characteristic that researchers believe may exert an extraneous effect on results, and then members of each matched group are randomly divided into groups. For example, if researchers believe intelligence may affect the results of a study, subjects of equal intelligence would be matched and then randomly assigned to research groups.
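The difference between stratified and systematic sampling can be sketched in a few lines. The population, group labels, and function names below are hypothetical. The stratified version allocates the sample proportionally, and because each quota is rounded, the total may differ slightly from the requested size in other data sets.

```python
import random


def stratified_sample(population, stratum_of, sample_size, rng):
    """Randomly select from each stratum in proportion to its share of the population."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample


def systematic_sample(population, step, start=0):
    """Select every step-th member of the population, beginning at index start."""
    return population[start::step]


# Hypothetical population: 60 members of group A, 30 of group B, 10 of group C.
population = (
    [("A", i) for i in range(60)]
    + [("B", i) for i in range(30)]
    + [("C", i) for i in range(10)]
)
rng = random.Random(1)
stratified = stratified_sample(population, lambda member: member[0], 20, rng)
systematic = systematic_sample(population, 10)
print(len(stratified), len(systematic))
```

In the stratified sample, each group's share matches its share of the population (12, 6, and 2 of the 20 participants), which is what protects small groups from being underrepresented by chance.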
Additional Information: Ways to Increase External Validity

63
Q

A researcher wants to obtain the correlation between several academic predictor tests and three measures of academic success in college. The appropriate method of correlational analysis to use would be the
Select one:
A. Pearson Product Moment Correlation Coefficient.
B. canonical correlation.
C. multiple correlation.
D. factor analysis.

A

Correct Answer is: B
Canonical correlation is a method used to assess the relationship between two sets of variables–i.e., two or more predictor variables and two or more criterion variables. Scores on both sets of variables are weighted and summed to come up with two canonical variates, or weighted sum scores, one for the predictor variables and one for the criterion variables, and the results reflect how strongly the two canonical variates are related.
The Pearson Product Moment Correlation Coefficient is used to test the strength of the relationship between two variables only, one predictor and one criterion (e.g., one academic test and college GPA).

When there are two or more predictor variables and one criterion variable, multiple correlation can be used.

And factor analysis is not a correlational method, but instead a way of finding a few, unobserved variables that could account for scores on many variables.
Additional Information: Canonical Correlation

64
Q

A factorial design, unlike a two group design:
Select one:
A. allows more independent variables to be studied
B. requires a larger sample
C. shows the effect of an independent variable on the dependent variable
D. cannot detect a curvilinear relationship between variables

A

Correct Answer is: A
In a two group design, one group is exposed to a treatment and another, control group, is not exposed or gets a different treatment. The results of both groups are tested in order to compare the effects of treatment. A factorial design is a design with more than one independent variable. In this design, the independent variables are simultaneously investigated to determine the independent and interactive influence they have on the dependent variable. The effect of each independent variable on the dependent variable is called a main effect and in a factorial design there are as many main effects as there are independent variables. An interaction effect between two or more independent variables occurs when the effect that one independent variable has on the dependent variable depends on the level of the other independent variable.
At least three levels of an independent variable must be used to detect a curvilinear relationship.
Additional Information: Multiple IVs (Factorial Design)

65
Q

Cohen’s d is a method for calculating an effect size. Which of the following is required to use this method?
Select one:
A. the means of the experimental and control groups
B. the median scores for the pre- and post-tests
C. the actual and predicted scores for the outcome measure
D. the standard error of estimate of the criterion

A

Correct Answer is: A
Cohen’s d is calculated by subtracting the mean of the control group from the mean of the experimental group and dividing the result by the control group standard deviation or by a pooled standard deviation. It indicates the magnitude of the effect of a treatment in terms of the difference between the means of the experimental (treatment) and control (no treatment) groups.
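The calculation described above can be sketched as follows, using the pooled standard deviation as the denominator. The scores are invented for illustration and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(experimental, control):
    """Difference between group means divided by the pooled standard deviation."""
    n_e, n_c = len(experimental), len(control)
    pooled_var = (
        (n_e - 1) * variance(experimental) + (n_c - 1) * variance(control)
    ) / (n_e + n_c - 2)
    return (mean(experimental) - mean(control)) / sqrt(pooled_var)


# Hypothetical outcome scores.
experimental_scores = [12, 14, 16, 18, 20]
control_scores = [10, 12, 14, 16, 18]

d = cohens_d(experimental_scores, control_scores)
print(round(d, 3))
```

Note that only the group means and standard deviations enter the calculation, which is why the means of the experimental and control groups are required.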

66
Q

The Solomon four-group design is:
Select one:
A. a quasi-experimental design
B. used to analyze the difference scores among four different treatment groups
C. used to reduce practice effects
D. used to evaluate the effects of pretesting

A

Correct Answer is: D
The Solomon four-group design is a true experimental design used to evaluate the effects of pretesting, since some groups are pretested and others are not.

67
Q
Excessive variability in a behavior over time can make it difficult to obtain accurate information about the effects of an intervention on that behavior. Such variability poses the biggest threat for which of the following research designs?
Select one:
A. single-subject
B. factorial
C. split-plot
D. Solomon four-group
A

Correct Answer is: A
In a single-subject research design, the target behavior is measured at regular intervals throughout the baseline and treatment phases. If the behavior changes often in strength, intensity, or frequency, it would be difficult to obtain a clear baseline reading or to determine if the intervention is having the desired effect.
Additional Information: Single-Subject Designs

68
Q
A study designed to identify problem-solving skills among violent offenders asks subjects to "think aloud" while they are performing a difficult task. The researcher records what the subjects say and do and later interprets the cognitive processes they used to solve the problem. This is an example of:
Select one:
A. free association
B. retrospective debriefing
C. protocol analysis
D. naturalistic research
A

Correct Answer is: C
This describes a protocol analysis. Note that this is not a quantitative or statistical study; rather, it is qualitative, or based on the researcher’s own interpretations.
“Retrospective debriefing” is a related approach, in which subjects, after working on a problem, are asked how they determined the solution.

“Naturalistic research” simply refers to studies which observe behaviors in their natural setting, as compared to “analogue research” which draws conclusions about real-world phenomena based on laboratory findings.
Additional Information: Protocol Analysis

69
Q
Counterbalancing is a within-subjects design that entails changing the order in which each treatment is administered to different groups of subjects, with the goal of using each sequence of treatments with an equal number of participants. Researchers would use which of the following if the number of subjects in a study is too small to use a completely counterbalanced research design?
Select one:
A. changing criterion
B. Solomon four-group
C. Latin square
D. multiple baseline
A

Correct Answer is: C
Researchers may use partial counterbalancing, like the Latin square design, when the number of subjects doesn’t allow a completely counterbalanced design. The Latin square design helps establish the specific sequences of treatment to be administered to different groups of subjects.
The changing criterion is a type of single case design involving a series of phases in which a differing behavioral criterion is set for each. The treatment is deemed to be effective if the behavior reaches the criterion level for each phase.

The Solomon four-group design is used to determine the effects of pretesting on internal and external validity.

The multiple baseline is a type of single case design involving sequentially administering a treatment across subjects, behaviors or settings.
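One simple way to build a Latin square is to rotate the list of treatments by one position for each group. The sketch below uses that cyclic construction as an assumption for illustration (the function name and treatment labels are hypothetical). Each treatment appears once in every row and once in every ordinal position, although this basic version does not balance which treatment immediately precedes which.

```python
def latin_square(treatments):
    """Cyclic Latin square: row i is the treatment list rotated by i positions."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]


# Each row is the treatment sequence for one group of subjects.
square = latin_square(["A", "B", "C", "D"])
for sequence in square:
    print(sequence)
```

With four treatments, this requires only four sequences rather than the 24 needed for complete counterbalancing.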
Additional Information: Ways to Increase External Validity

70
Q
A colleague of yours is interested in studying the effects of aging on IQ scores. He consults with you for some ideas regarding how to proceed with this research. Which of the following types of research designs would you recommend?
Select one:
A. longitudinal
B. cross-sectional
C. cross-sequential
D. multiple baseline
A

Correct Answer is: C
The colleague is interested in conducting developmental research, in which the effects of development (e.g., aging) on a dependent variable (in this case, IQ scores) are investigated. Longitudinal, cross-sectional, and cross-sequential are all types of developmental research designs. Of these, cross-sequential research is the strongest from a scientific point of view. Cross-sequential research is a combination of cross-sectional and longitudinal research. In cross-sequential research, as in cross-sectional research, subjects are divided into age groups (e.g., young, middle-aged, and old). And, as in longitudinal research, subjects are assessed repeatedly on the dependent variable over time. Because cross-sequential research combines the methodology of the two strategies, it is not associated with the limitations of one or the other.
Additional Information: Developmental Research

71
Q

A researcher would use which of the following techniques to classify people into criterion groups based on their scores or status on two or more predictors?
Select one:
A. structural equation modeling
B. discriminant function analysis
C. cluster analysis
D. multitrait-multimethod matrix

A

Correct Answer is: B
Discriminant function analysis, or discriminant analysis (DA), is used to determine which variables discriminate between two or more naturally occurring groups. DA is multivariate analysis of variance (MANOVA) reversed. In MANOVA, the independent variables are the groups and the dependent variables are the predictors. In DA, the independent variables are the predictors and the dependent variables are the groups. DA is usually used to predict membership in naturally occurring groups. It answers the question: can a combination of variables be used to predict group membership? Usually, several variables are included in a study to see which ones contribute to the discrimination between groups.
Structural equation modeling is used to evaluate the cause-and-effect or predictive relationships between measured variables and latent factors. The multitrait-multimethod matrix is used to evaluate convergent and divergent validity. Cluster analysis is a method for grouping objects of similar kind into respective categories. It can be used to discover structures in data without providing an explanation/interpretation.
Additional Information: Discriminant Function Analysis

72
Q
A linear relationship is an assumption of all of the following, except:
Select one:
A. structural equation modeling
B. regression analysis
C. Pearson r
D. eta
A

Correct Answer is: D
Eta is a correlational coefficient used for non-linear, or curvilinear, relationships.
Structural equation modeling* (a variety of techniques based on correlations between multiple variables), regression analysis* (a method used to estimate the value of one variable based on the value of another variable), and Pearson r* all assume a linear relationship between variables (* incorrect options).
Additional Information: Other Correlation Coefficients

73
Q
R2 is the:
Select one:
A. coefficient of stability
B. coefficient of multiple determination
C. coefficient of internal consistency
D. silver character on "Star Wars"
A

Correct Answer is: B
You probably had to take a guess on this one. Nevertheless, you now know that R2 is known as the “coefficient of multiple determination.” It is a correlation coefficient like the Pearson r. However, uppercase “R” is a multiple correlation coefficient, which is used when there are multiple predictors. Like the Pearson r, the multiple correlation coefficient can be squared (R2), which indicates the percent of variance in the criterion explained collectively by all of the predictors.
FYI, the correct name for the silver “Star Wars” character was “R2-D2.”
Additional Information: Multiple Correlation and Multiple Regression

74
Q

Path analysis is useful for:
Select one:
A. examining the unidirectional relationships among a set of measured and latent traits.
B. examining the bidirectional relationships among a set of measured and latent traits.
C. examining the unidirectional causal relationships among a set of measured traits.
D. examining the bidirectional causal relationships among a set of measured traits.

A

Correct Answer is: C
Path analysis is a causal modeling technique. It is somewhat limited compared to other techniques because it permits only one-way (unidirectional) paths between variables and involves looking only at the relationships among measured variables. (LISREL, a more complicated technique, looks at both measured variables and the latent traits measured by those variables and permits one- and two-way paths.)
Additional Information: Path Analysis

75
Q
In which of the following research designs is autocorrelation most likely to be a problem:
Select one:
A. between groups
B. Solomon four-group
C. double-blind
D. repeated measures
A

Correct Answer is: D
When the dependent variable is repeatedly measured in the same subjects, the correlation between measurements of the dependent variable is referred to as autocorrelation. Repeated measures is the only design listed in which repeated measurement occurs.
Additional Information: Time-Series Design

76
Q
If you are interested in determining whether the relationship between arousal and performance assumes a linear or a non-linear shape, the best statistical analysis to use would be
Select one:
A. multiple regression analysis.
B. trend analysis.
C. logistic regression.
D. principal components analysis.
A

Correct Answer is: B
Trend analysis is a statistical technique used to determine the trend or shape that best describes the relationship between two variables. The technique basically involves collecting data on two variables and running statistical analyses to determine what trend or trends (e.g., linear, U-shaped) are significant. For example, in studying the relationship between arousal and performance, one could study 100 students and collect data on how aroused they are and how well they perform. Then, one could run a separate analysis for different types of trends and see which receives the strongest support.
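One way to picture "a separate analysis for different types of trends" is with polynomial contrast weights applied to group means. The sketch below assumes three equally spaced arousal levels, for which the usual weights are (-1, 0, 1) for a linear trend and (1, -2, 1) for a quadratic trend. The performance means are invented, the function name is hypothetical, and the code only computes the contrast values; it does not test them for significance.

```python
def contrast(means, weights):
    """Weighted sum of group means for one trend component."""
    return sum(m * w for m, w in zip(means, weights))


# Hypothetical mean performance at low, moderate, and high arousal.
performance_means = [4.0, 9.0, 5.0]

linear_trend = contrast(performance_means, [-1, 0, 1])
quadratic_trend = contrast(performance_means, [1, -2, 1])
print(linear_trend, quadratic_trend)
```

Here the quadratic contrast is much larger in absolute size than the linear one, which is the pattern expected for an inverted-U relationship.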
Additional Information: Trend Analysis

77
Q

One advantage of standard scores as compared to percentile ranks is that standard scores
Select one:
A. allow you to determine the relative standing of examinees who take the same test.
B. set cutoff scores above which a given percentage of examinees will score.
C. provide more meaningful information about differences between examinees’ test scores.
D. when used properly, can decrease the cultural bias of test scores in many cases.

A

Correct Answer is: C
One disadvantage of percentile ranks is that a given distance between two percentile ranks does not necessarily reflect the same distance between the examinees’ raw scores. Specifically, percentile ranks tend to overestimate raw score differences in the middle of the score distribution and underestimate raw score differences at the end of the distribution. Let’s take an example: Say that Examinee A has a percentile rank score of 93, Examinee B has a percentile rank score of 96, Examinee C has a percentile rank score of 50, and Examinee D has a percentile rank score of 53. If you’re just looking at percentile ranks, you might assume that the score difference between Examinee A and B is equivalent to the score difference between Examinee C and D. However, because examinees A and B scored at the extreme high end of the distribution, their raw score difference will be greater than that between examinees C and D, who scored in the middle of the distribution.
allow you to determine the relative standing of examinees who take the same test.

set cutoff scores above which a given percentage of examinees will score.

These two options are true of both standard scores and percentile ranks.

when used properly, can decrease the cultural bias of test scores in many cases.

This is true of neither.
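The example with Examinees A through D can be checked numerically by converting percentile ranks to z-scores. The sketch below assumes the scores are normally distributed, which is an assumption added for illustration; the function name is hypothetical.

```python
from statistics import NormalDist


def z_for_percentile(percentile_rank: float) -> float:
    """z-score below which the given percentage of a normal distribution falls."""
    return NormalDist().inv_cdf(percentile_rank / 100)


# The same 3-point gap in percentile rank, at the high end and in the middle.
gap_high_end = z_for_percentile(96) - z_for_percentile(93)
gap_middle = z_for_percentile(53) - z_for_percentile(50)
print(round(gap_high_end, 3), round(gap_middle, 3))
```

Under this assumption, the gap between the 93rd and 96th percentile ranks covers several times as many standard deviation units as the gap between the 50th and 53rd.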

Additional Information: Standard Scores

78
Q
__________ is used to statistically remove the effects of an extraneous variable on the dependent variable, making it easier to detect the effects of the independent variable on the dependent variable.
Select one:
A. Factorial ANOVA
B. Split-plot (mixed) ANOVA
C. ANCOVA
D. MANOVA
A

Correct Answer is: C
The analysis of covariance (ANCOVA) is used to adjust dependent variable scores to control for the effects of the covariate, or an extraneous variable, making it easier to determine the effects of the independent variable on the dependent variable.
Factorial ANOVA

A factorial ANOVA is used to analyze data when a factorial design, which includes two or more independent variables, is used and the dependent variable is measured on an interval or ratio scale.

Split-plot (mixed) ANOVA

The split-plot (mixed) ANOVA is the appropriate technique when at least one independent variable is a between-groups variable and another independent variable is a within-subjects variable.

MANOVA

The multivariate analysis of variance (MANOVA) is a type of ANOVA used when the study has two or more dependent variables and at least one independent variable. When all the dependent variables are measured on a ratio or interval scale, a researcher could use the MANOVA rather than separate ANOVAs to evaluate the effects of the independent variable(s) on each of the dependent variables, which also helps control the experiment-wise error rate.

Additional Information: Analysis of Covariance (ANCOVA)

79
Q

The coefficient of determination indicates
Select one:
A. the proportion of variability in one variable that is accounted for by variability in another variable.
B. the correlation between two variables with the effects of a third variable removed.
C. the correlation between two variables without the effects of a third variable removed.
D. the proportion of variability accounted for in a variable by all the factors in a factor analysis.

A

Correct Answer is: A
The coefficient of determination is calculated by squaring a correlation coefficient. As compared to the correlation coefficient, it provides a more direct way of interpreting the calculated relationship between two variables. Specifically, it indicates the proportion of variability shared by the two variables, or the proportion of variability in one variable that can be accounted for by variability in the other.
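As a quick worked example, the sketch below computes a Pearson r for a small made-up data set and squares it. The data and function name are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two lists of scores."""
    mean_x, mean_y = mean(x), mean(y)
    sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sum_xx = sum((xi - mean_x) ** 2 for xi in x)
    sum_yy = sum((yi - mean_y) ** 2 for yi in y)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical paired scores.
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]

r = pearson_r(x_scores, y_scores)
coefficient_of_determination = r ** 2
print(round(r, 3), round(coefficient_of_determination, 3))
```

Here r is about .77, so r squared is .60: about 60% of the variability in one variable is accounted for by variability in the other.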
Additional Information: Factors Affecting the Pearson r

80
Q
A researcher studying the effects of two different psychotherapies in the treatment of depression conducts a statistical test, finds the mean score on a test of depression for one of the therapy groups is significantly lower than that of the mean score for the other group, and rejects the null hypothesis that the two therapies do not differ in effectiveness. In reality, in the underlying population, the mean scores of patients after undergoing each type of therapy are equal. The researcher has made which of the following errors?
Select one:
A. Type I
B. Type II
C. false negative
D. experiment-wise
A

Correct Answer is: A
To answer this question, it helps to understand some basic features of statistical hypothesis tests. First, they use samples from populations to test hypotheses about entire populations. For this reason, the conclusions they arrive at are probabilistic: since the entire population is not included in the study, there is always a chance that conclusions reached about the entire population will be erroneous. There are ways, such as increasing sample size, to reduce the probability of error, but there is no way to be 100% certain that obtained results from a sample hold true for an entire population. Second, they evaluate the null hypothesis, or the hypothesis of no effect. For example, in the study described by the question, a statistical hypothesis test would provide the probability of obtaining a difference in sample means at least as large as the one observed if the means were actually equal in the population. Put another way, the test would indicate how likely the obtained difference would be if the two samples were drawn from the same population. There are two types of erroneous conclusions about populations that statistical tests can yield. One would be to reject the null hypothesis when the null hypothesis is in fact true. In other words this type of error would be to conclude that population means are different when in fact they are the same, or to conclude that a treatment has an effect on a dependent variable when in fact it does not. This type of error, which is exemplified in the question, is called a Type I error. The other type of error, retaining a false null hypothesis, or concluding that a treatment does not have an effect in the underlying population when in fact it does, is called a Type II error.
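A small simulation can show how often a Type I error occurs when the null hypothesis really is true. The sketch below makes several simplifying assumptions for illustration: both groups are drawn from the same normal population with a known standard deviation of 1, a two-tailed z-test is used in place of a t-test, and the function name is hypothetical.

```python
import random
from math import sqrt
from statistics import mean


def type_i_error_rate(num_studies, n, critical_z, rng):
    """Proportion of simulated studies that reject a null hypothesis that is true."""
    rejections = 0
    for _ in range(num_studies):
        # Both therapy groups come from the same population, so the null is true.
        group_a = [rng.gauss(0, 1) for _ in range(n)]
        group_b = [rng.gauss(0, 1) for _ in range(n)]
        z = (mean(group_a) - mean(group_b)) / sqrt(2 / n)
        if abs(z) > critical_z:
            rejections += 1
    return rejections / num_studies


rate = type_i_error_rate(2000, 30, 1.96, random.Random(0))
print(rate)
```

With a critical value of 1.96 (alpha = .05, two-tailed), the proportion of false rejections should land near .05, which is what alpha means.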
Additional Information: Statistical Decision Making

81
Q
A study comparing the effectiveness of stress inoculation training, hypnosis, and EMDR on PTSD measures clients' anxiety levels prior to receiving treatment and at 6 and 12 weeks after beginning treatment. This design would be considered:
Select one:
A. between-groups
B. within-subjects
C. mixed
D. counterbalanced
A

Correct Answer is: C
A mixed research design has at least one between-subjects independent variable and at least one repeated measures variable (or within-subjects variable). Since this study is comparing the effects on three different groups of subjects (i.e. a between-subjects design) combined with the use of a repeated measures (within-subjects) design, it would be considered a mixed design.
Counterbalancing is a technique used to control order effects in a repeated measures design, and involves administering the treatments to the different groups in a different order.
Additional Information: Variations of the Factorial ANOVA

82
Q

From both a practical and ethical standpoint, one potential problem with an ABAB design in single subject research is
Select one:
A. autocorrelation.
B. the need to use a placebo in one of the AB phases.
C. determining whether the treatment or external factors are responsible for observed behavioral changes.
D. getting the behavior to revert to baseline levels.

A

Correct Answer is: D
In single-subject research, the A phase is the baseline phase, or when the behavior under study is measured before a treatment is applied, and the B, or treatment, phase is when the behavior is measured after a treatment is applied. So in an ABAB design, a baseline measurement is obtained, a treatment is administered, the treatment is withdrawn until the behavior reverts to its original baseline level, and the treatment is re-applied. In one respect, this design is superior to an AB design, because it provides stronger evidence that any observed changes in the behavior are due to the treatment and not some factor extraneous to treatment. If the behavior changes both times following the same treatment, and reverts back to previous levels following its withdrawal, the conclusion that the treatment and not something else causes the behavioral change is reasonable. However, as noted, the ABAB design requires withdrawal of a treatment that, assuming it was successful, probably produced a positive change in the subject. This could be logistically problematic because some behaviors are resistant to reversion and ethically problematic because in many cases, withdrawing the treatment can be harmful to the subject.
Additional Information: Single-Subject Designs

83
Q
Randomly selecting certain schools from a large school district and then including all teachers in those schools or a random sample of teachers from those schools in your study is referred to as
Select one:
A. cluster sampling.
B. stratified sampling.
C. systematic sampling.
D. nested sampling.
A

Correct Answer is: A
In this situation, you are starting the sampling process by selecting naturally occurring groups (clusters) of subjects. This is referred to as cluster sampling, and is useful when it’s not feasible to directly sample individuals from the population.
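A minimal Python sketch of this two-step selection may help. The district data and the helper name `cluster_sample` are hypothetical, made up only for illustration:

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Randomly pick whole clusters, then keep all members or a random subset.

    clusters: dict mapping cluster name -> list of members (hypothetical data).
    per_cluster: None keeps every member; an int draws that many per cluster.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # step 1: sample clusters
    sample = {}
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            sample[name] = list(members)               # include everyone
        else:
            k = min(per_cluster, len(members))
            sample[name] = rng.sample(members, k)      # step 2: sample within cluster
    return sample

# Hypothetical district: 6 schools with 10 teachers each.
district = {f"school_{i}": [f"teacher_{i}_{j}" for j in range(10)] for i in range(6)}
all_teachers = cluster_sample(district, n_clusters=2, seed=1)
some_teachers = cluster_sample(district, n_clusters=2, per_cluster=4, seed=1)
```

Both calls select schools first; they differ only in whether every teacher in a chosen school is kept or a random subset is drawn.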
Additional Information: Ways to Increase External Validity

84
Q
A within-subjects design that involves changing the order in which each treatment is administered to different groups of participants is referred to as:
Select one:
A. counterbalancing
B. changing criterion
C. Latin square
D. Solomon four-group
A

Correct Answer is: A
Counterbalancing is a within-subjects design that involves changing the order in which each treatment is administered to different groups of individuals. The goal is to use every possible treatment sequence with equal numbers of participants for each sequence.
If the number of participants is too small to permit the use of a completely counterbalanced research design, then researchers may use a type of partial counterbalancing like the Latin square design. This design is useful for determining what exact sequences of treatment will be administered to the different participant groups.

The changing criterion design is a single case design consisting of a series of phases with differing behavioral criterion set for each. The treatment is considered effective if the behavior reaches the criterion level for each phase.

The Solomon four-group design is used to evaluate the effects of pretesting on internal and external validity.
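
The difference between complete counterbalancing and a Latin square can be sketched in Python. The function names are hypothetical, and the cyclic construction shown is only one simple way to build a Latin square (it balances position, not which treatment precedes which):

```python
from itertools import permutations

def complete_counterbalance(treatments):
    """Every possible ordering of the treatments (n! sequences)."""
    return [list(p) for p in permutations(treatments)]

def latin_square(treatments):
    """Cyclic Latin square: each treatment appears once per row and once per position."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]

orders = complete_counterbalance(["A", "B", "C"])  # 6 sequences, one per participant group
square = latin_square(["A", "B", "C"])             # only 3 sequences are needed
```

With three treatments, complete counterbalancing needs six groups while the Latin square needs three; the gap widens quickly as treatments are added.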

85
Q
From a large population, you take two random samples and measure them on one dependent variable. In this case, you would expect the F-ratio to be close to:
Select one:
A. 0.0.
B. 0.50.
C. 2.50.
D. 1.0
A

Correct Answer is: D
In statistical hypothesis testing, 1 is the expected value of the F ratio under the null hypothesis, or the hypothesis that two or more samples do not differ on some dependent variable measure. The null is rejected when F is significantly greater than 1. In other words, an F ratio of 1 suggests that the means being compared were drawn from the same population, which is the case here.
Another way of looking at this question is to remember that MSB, the numerator of the F ratio, is a measure of between-group variability (differences between sample means), which is due to both treatment effects and error. But, if two random samples are taken from the same population, MSB reflects only error, since there is no treatment. And MSW, the denominator of the F ratio, is a measure of within-group variability, which reflects only error. So, in this case, the F ratio (MSB/MSW), instead of equaling “(treatment + error)/error,” will equal “error/error,” or 1.
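As a rough check on this logic, the sketch below computes F = MSB/MSW for many pairs of random samples drawn from one simulated population. The population values (mean 100, SD 15), the sample size, and the helper `f_ratio` are assumptions chosen for illustration:

```python
import random

def f_ratio(groups):
    """One-way ANOVA F = MSB / MSW for a list of groups (lists of numbers)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    msb = ss_between / (k - 1)        # between-group variability
    msw = ss_within / (n_total - k)   # within-group (error) variability
    return msb / msw

rng = random.Random(0)
f_values = []
for _ in range(2000):
    # Two random samples from the same (assumed) population: no treatment effect.
    a = [rng.gauss(100, 15) for _ in range(50)]
    b = [rng.gauss(100, 15) for _ in range(50)]
    f_values.append(f_ratio([a, b]))
average_f = sum(f_values) / len(f_values)
```

Individual F values vary from sample to sample, but their average should land near 1 when no treatment effect exists.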
Additional Information: Logic of the ANOVA and the Derivation of the F Ratio

86
Q

A multivariate analysis of variance would be used to analyze collected data when:
Select one:
A. the researcher wants to analyze the effects of an extraneous variable
B. the researcher wants to remove the effects of an extraneous variable
C. the study includes two or more independent variables
D. the study includes two or more dependent variables

A

Correct Answer is: D
The multivariate analysis of variance (MANOVA) is a type of ANOVA used when two or more dependent variables are included in a study. Rather than using separate ANOVAs to evaluate the effects of the independent variable(s) on each dependent variable, a researcher could use the MANOVA when all the dependent variables are measured on a ratio or interval scale. This also helps to control the experiment-wise error rate.
Additional Information: Multivariate Analysis of Variance (MANOVA)

87
Q
To measure the strength of the relationship between two continuous variables when their relationship is nonlinear, which of the following would be most useful?
Select one:
A. phi coefficient
B. eta coefficient
C. Pearson r
D. kappa coefficient
A

Correct Answer is: B
Most correlation coefficients have an underlying assumption that the relationship between variables is linear. The eta coefficient, or the correlation ratio, is useful when the relationship between two continuous variables is nonlinear.
The phi coefficient* is used when both variables are true dichotomies. The Pearson r* is used when both variables are continuous and the relationship is linear. The kappa coefficient* is used to evaluate inter-rater reliability (* incorrect options).
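To illustrate why eta is preferred for nonlinear relationships, the sketch below compares Pearson r with eta on a small made-up U-shaped data set. The data and function names are hypothetical; eta is computed here as the square root of SS-between over SS-total, grouping Y by each distinct X value:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def eta(xs, ys):
    """Correlation ratio: sqrt(SS_between / SS_total), grouping y by distinct x."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(x, []).append(y)
    grand = sum(ys) / len(ys)
    ss_total = sum((y - grand) ** 2 for y in ys)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups.values())
    return sqrt(ss_between / ss_total)

# Hypothetical U-shaped relationship: y is roughly x squared.
xs = [-2, -2, -1, -1, 0, 0, 1, 1, 2, 2]
ys = [x ** 2 + d for x, d in zip(xs, [0.5, -0.5] * 5)]
r = pearson_r(xs, ys)   # near zero: no linear trend
e = eta(xs, ys)         # high: strong nonlinear association
```

Pearson r misses the relationship entirely because the upward and downward halves of the curve cancel, while eta picks it up.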
Additional Information: Other Correlation Coefficients

88
Q
A 10-year-old child is administered the WISC-III and obtains a score of 140 Full Scale IQ. If she is retested at the age of 16, her IQ score will most likely be:
Select one:
A. higher
B. lower
C. the same
D. impossible to predict
A

Correct Answer is: B
This question was a little tricky in that it appears to be about the reliability of IQ scores over time, when it is really a statistics question. The WISC-III does have very good reliability over time and, if the IQ score was in the normal range, we could predict that it would stay the same over time. However, a score of 140 on the WISC-III is extremely high (classified as “very superior”) and would, therefore, likely be lower upon retesting due to regression to the mean – which is the tendency of extreme scores to be less extreme upon retesting.
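The expected size of the drop can be sketched with the standard prediction formula, expected retest score = mean + reliability × (score − mean). The reliability value of .90 below is a hypothetical figure used only for illustration, not a published WISC-III value:

```python
def expected_retest_score(score, mean, reliability):
    """Predicted retest score: extreme scores are pulled toward the mean."""
    return mean + reliability * (score - mean)

# Obtained IQ of 140, population mean of 100, assumed test-retest reliability of .90.
predicted = expected_retest_score(140, mean=100, reliability=0.9)
```

Under these assumptions the predicted retest score falls between the mean and the original score, which is why "lower" is the best answer.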
Additional Information: Regression to the Mean

89
Q
You are investigating whether there is a relationship between the number of years one has been smoking cigarettes and the number of psychotherapy sessions required to quit smoking. The best statistical method to analyze the results is:
Select one:
A. chi-square
B. Pearson r
C. t-test for independent samples
D. multiple regression analysis
A

Correct Answer is: B
In this case, you are attempting to assess the relationship between two variables that are measured on a continuous (interval or ratio) scale. The Pearson r allows you to do this. The Pearson r is the bivariate (i.e., for two variables) correlation coefficient used when variables are measured on an interval or ratio scale.
Additional Information: Pearson r

90
Q
The exam score and the \_\_\_\_\_\_\_\_\_\_\_\_\_\_ are necessary to calculate the 68% confidence interval for an examinee's obtained test score.
Select one:
A. standard deviation
B. standard error of measurement
C. standard error of estimate
D. test's mean
A

Correct Answer is: B
Adding and subtracting one standard error of measurement to and from the examinee’s obtained test score yields a 68% confidence interval. The standard error of measurement (calculated from the test’s standard deviation and reliability coefficient) is needed to determine a confidence interval around an obtained test score.
While the standard deviation is needed to calculate the standard error of measurement, it cannot be used to determine a confidence interval by itself and the standard error of estimate is used to construct a confidence interval around a predicted criterion score.
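A short sketch of the calculation, using the usual formula SEM = SD × √(1 − reliability). The score, standard deviation, and reliability below are hypothetical example values:

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * sqrt(1 - reliability)

def confidence_interval_68(score, sd, reliability):
    """Obtained score plus and minus one SEM."""
    sem = standard_error_of_measurement(sd, reliability)
    return (score - sem, score + sem)

# Hypothetical test: SD = 15, reliability = .91, obtained score = 110.
low, high = confidence_interval_68(110, sd=15, reliability=0.91)
```

Here the SEM works out to about 4.5, so the 68% interval runs from roughly 105.5 to 114.5.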
Additional Information: Standard Error of Measurement

91
Q

In principal components analysis, an eigenvalue would indicate
Select one:
A. the amount of variability in a group of variables accounted for by an independent statistical component.
B. the amount of variability in a group of variables accounted for by a statistical component that shares variability with other statistical components in the analysis.
C. the amount of variability in one measured variable accounted for by all the independent statistical components in the analysis.
D. the amount of variability in all measured variables accounted for by all the statistical components in the analysis.

A

Correct Answer is: A
Principal components analysis and factor analysis are two complex statistical techniques designed to determine the degree to which a large set of variables can be accounted for by fewer, underlying constructs (referred to as “factors” or “principal components”). In principal components analysis and factor analysis, an eigenvalue is a statistic that indicates the degree to which a particular factor is accounting for variability in the variables studied. In other words, a factor’s eigenvalue indicates its strength or explanatory power.
the amount of variability in a group of variables accounted for by a statistical component that shares variability with other statistical components in the analysis.

The reason the correct choice is better than this option is that in principal components analysis, the factors or components are always independent, or uncorrelated.
Additional Information: Explained Variance (or Eigenvalues)

92
Q
When processing data of "low quality," from small samples, or on variables about which nothing is known concerning their distribution, which statistical procedure would be most appropriate?
Select one:
A. parametric
B. non-parametric
C. path analysis
D. discriminant function analysis
A

Correct Answer is: B
Nonparametric methods were developed to be used in cases when the researcher knows nothing about the parameters of the variable of interest in the population (hence the name nonparametric). In more technical terms, nonparametric methods do not rely on the estimation of parameters (such as the mean or the standard deviation) describing the distribution of the variable of interest in the population. Therefore, these methods are also sometimes (and more appropriately) called parameter-free methods or distribution-free methods.
Additional Information: Nonparametric Tests

93
Q
Which of the following techniques is most similar to latent trait analysis (LTA)?
Select one:
A. cluster sampling
B. analysis of covariance
C. multitrait-multimethod matrix
D. latent class analysis
A
Correct Answer is: D
Latent class analysis, like latent trait analysis, is used to identify the underlying latent structure of a set of observed data. The techniques differ in that in LTA, the latent variable that determines the structure is continuous whereas in LCA, the latent variable is nominal.
Cluster sampling is a sampling technique in which groups of participants are selected instead of individuals.

Used to statistically remove the effects of the covariate, or an extraneous variable, on the dependent variable, the analysis of covariance (ANCOVA) makes it easier to determine the effects of the independent variable on the dependent variable.

The multitrait-multimethod matrix is used to assess convergent and divergent validity.
Additional Information: Factor Analysis

94
Q
Which of the following correlation coefficients indicates the strongest predictive relationship between two variables?
Select one:
A. 0.76
B. 0.09
C. -0.01
D. -0.84
A

Correct Answer is: D
A correlation coefficient is a numerical value between -1 and +1 that expresses the strength and direction of the relationship between two variables. The negative or positive sign indicates the direction of the relationship, not its strength. So to determine which of these correlation coefficients indicates the strongest relationship, you have to consider the number alone and disregard the sign. Therefore, of the four choices, -0.84 represents the strongest relationship. To illustrate: let’s say there is a strong negative relationship between income and unhappiness; hypothetically, we’ll say that the correlation between yearly income and scores on the Beck Depression Inventory is -0.84. The “negative” part indicates the direction of the relationship and means that as one variable increases (income), the other (BDI scores) decreases. But the negative direction has nothing to do with the strength or weakness of the relationship; in this case, the relationship is negative and strong. Now consider a weak positive relationship, say between income and finger length. Hypothetically, let’s say that the correlation is +0.10. A positive relationship means that the values of both variables tend to increase together. In this case, the relationship is very weak and any correlation between the two would be due to random chance factors alone. So in other words, positive or negative have nothing to do with the strength of a correlation; there can be strong negative correlations and weak positive ones (and vice versa of course).
Additional Information: Correlation and the Correlation Coefficient

95
Q
If you want to measure whether a weight training program resulted in significant changes in weight and strength for a sample of body builders, the best test to use is:
Select one:
A. MANOVA.
B. paired t-tests.
C. repeated measures ANOVA.
D. chi-square.
A

Correct Answer is: A
When you have two dependent variables (weight and strength), you would need a test that can handle two DVs. Among the four choices here, only the MANOVA can do that. Some people get stuck because they think the MANOVA requires more than one IV, but that’s not the case. The requirement is that there is at least one IV and more than one DV.
Additional Information: Multivariate Analysis of Variance (MANOVA)

96
Q

Structural equation modeling is used to:
Select one:
A. classify participants into criterion groups based on their status or score on two or more predictors.
B. to evaluate convergent and divergent validity.
C. to identify homogeneous groups from a collection of observations.
D. to evaluate predictive relationships between measured variables and latent factors.

A

Correct Answer is: D
Structural equation modeling is a technique used to evaluate or confirm the cause-and-effect or hypothesized relationship between both measured and latent variables.
classify participants into criterion groups based on their status or score on two or more predictors.

Classifying participants into criterion groups based on their status or score on two or more predictors is referred to as discriminant function analysis.

to evaluate convergent and divergent validity.

Convergent and divergent validity is evaluated using the multitrait-multimethod matrix.

to identify homogeneous groups from a collection of observations.

Cluster analysis is a method for grouping objects of similar kind into respective categories. It can be used to discover structures in data without providing an explanation/interpretation.

Additional Information: Structural Equation Modeling

97
Q
Jose scored 75 on his final exam. The test scores were normally distributed, with a mean of 60 and a standard deviation of 15. Jose's score would be in which of the following percentile ranges?
Select one:
A. 35–49
B. 50–64
C. 65–79
D. 80–95
A

Correct Answer is: D
In a normal distribution, the mean falls at the 50th percentile, and a standard (z) score of 1.0 is about 34 percentile points above it. Jose’s standard score is (75-60)/15, or 1.0, putting his score at the 84th percentile.
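The same conversion can be sketched with Python's standard library. The helper name `percentile_rank` is hypothetical:

```python
from statistics import NormalDist

def percentile_rank(score, mean, sd):
    """Percentile rank of a score in a normal distribution."""
    z = (score - mean) / sd              # standard score
    return 100 * NormalDist().cdf(z)     # area below z, as a percentage

jose = percentile_rank(75, mean=60, sd=15)  # z = 1.0
```

The result is approximately 84, which falls in the 80–95 range.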
Additional Information: Normal Distribution, Standard Scores, and Percentile Ranks

98
Q
A within-subjects design that involves changing the order in which each treatment is administered to different groups of participants is referred to as:
Select one:
A. counterbalancing
B. changing criterion
C. Latin square
D. Solomon four-group
A

Correct Answer is: A
Counterbalancing is a within-subjects design that involves changing the order in which each treatment is administered to different groups of participants. The goal is to use every possible treatment sequence with equal numbers of participants for each sequence.
changing criterion

The changing criterion design is a single case design consisting of a series of phases with differing behavioral criterion set for each. The treatment is considered effective if the behavior reaches the criterion level for each phase.

Latin square

If the number of participants is too small to permit the use of a completely counterbalanced research design, then researchers may use a type of partial counterbalancing like the Latin square design. This design is useful for determining what exact sequences of treatment will be administered to the different participant groups.

Solomon four-group

The Solomon four-group design is used to evaluate the effects of pretesting on internal and external validity.

Additional Information: Ways to Increase External Validity

99
Q
Of the following, which is designed most explicitly to assist an investigator in deciding how much confidence to put in a particular finding based on data?
Select one:
A. descriptive statistics
B. inferential statistics
C. measures of central tendency
D. correlation coefficient
A

Correct Answer is: B
Inferential statistics are used to make inferences from data in a study to more general conditions, for example, determining the probability that an observed difference between groups is dependable or happened by chance. Most of the major inferential statistics come from a general family of statistical models known as the General Linear Model which includes the t-test, Analysis of Variance (ANOVA), Analysis of Covariance (ANCOVA), regression analysis, and many of the multivariate methods like factor analysis, multidimensional scaling, cluster analysis, and discriminant function analysis.
descriptive statistics

Descriptive statistics simply describe what the data are and/or what they show in a study.

measures of central tendency

Central tendency is the center or middle of a distribution, and the three most common measures of central tendency are the mean, the median, and the mode.

correlation coefficient

A correlation coefficient is a statistical measure of the interdependence of two or more random variables, wherein the value indicates how much of a change in one variable is explained by a change in another.
Additional Information: Inferential Statistics

100
Q

A study is conducted to determine the effectiveness of 3 different reading programs on reading comprehension. The participants are 5th grade students who are divided into 3 levels based on their past reading comprehension (below average, average, and above average). Results from a factorial ANOVA indicate that there are significant main effects of each variable and a significant interaction effect. Based on these results, one could conclude that:
Select one:
A. each of the reading programs is equally effective for students at every reading level
B. only one of the reading programs is effective for students at every reading level
C. the reading programs are only effective for students at a particular reading level
D. the most effective reading program is dependent on the student’s reading level

A

Correct Answer is: D
A factorial ANOVA is used when a study has more than one independent variable. Factorial designs also allow for the assessment of both main effects (the effects of each independent variable considered individually) and interaction effects (the effects of each variable at the different levels of the other variable). The study described in this question has two “significant main effects” for the independent variables: type of reading program and past level of reading comprehension. And a “significant interaction effect” means that the effects of the different reading programs varied significantly for students at different reading levels. For example, “Reading Program A” may have been highly effective for above average students, moderately effective for average students, yet ineffective for below average students. On the other hand, “Reading Program B” may have been only effective for below average students, while “Reading Program C” may not have been effective for any students.
Additional Information: Factorial ANOVA

101
Q

In an ABAB design:
Select one:
A. the same subject is administered all treatments.
B. different subjects are administered treatments.
C. a treatment is administered to one subject across a number of different settings.
D. a treatment is administered to the same subject for a number of different behaviors.

A

Correct Answer is: A
An ABAB design is a type of single-subject design. It is an example of a reversal design – a baseline measure of a behavior is obtained (the “A” phase), the behavior is again measured after a treatment is administered (the “B” phase), the treatment is removed or reversed and the behavior is again measured (the second “A”), and the behavior is again measured after the treatment is re-applied (the second “B”). In other words, the same subject receives all the treatments that are applied (actually, the same treatment at different times; thus, the word “all” might be somewhat misleading, but this is still the best answer).
a treatment is administered to one subject across a number of different settings.

a treatment is administered to the same subject for a number of different behaviors.

These two choices are examples of multiple baseline designs.

Additional Information: Single-Subject Designs

102
Q
The main difference between these types of resampling tests is the way the resamples are computed. Which of the following is computed with replacement?
Select one:
A. jackknife
B. bootstrapping
C. permutation test
D. cross-validation
A

Correct Answer is: B
Resampling procedures compute a test statistic for each sample or rearrangement with the resulting set constituting the sampling distribution (often called a reference distribution) of that statistic. The sampling (reference) distribution can be used to draw inferences about the model underlying the data. The issue of replacement is one distinction between the provided approaches. Bootstrapping takes the combined samples as representative of the population from which the data came, drawing many samples with replacement, from some pseudo-population. Bootstrapping is primarily focused on estimating population parameters, and it attempts to draw inferences about the population(s) from which the data came.
Jackknife uses less information and fewer samples than bootstrapping. Jackknife subsampling generates different subsets of the original sample without replacement.

Permutation tests, or randomization procedures, begin with the original data, systematically or randomly reorder (shuffle) the data, and then calculate the appropriate test statistic on each reordering. Shuffling data amounts to sampling without replacement. Randomization procedures focus on the underlying mechanism that led to the data being distributed between groups in the way that they are.

Cross-validation uses one part of the available observations to fit the model and another part to test it when computing prediction error. The objective of cross-validation is to verify replicability of results.
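
The four ways of forming resamples can be sketched side by side in Python. The function names and the tiny data set are hypothetical; only the bootstrap draws with replacement:

```python
import random

def bootstrap_sample(data, rng):
    """Draw n values WITH replacement, so values can repeat."""
    return rng.choices(data, k=len(data))

def jackknife_samples(data):
    """Leave-one-out subsets: n subsets of size n - 1, no replacement."""
    return [data[:i] + data[i + 1:] for i in range(len(data))]

def permutation_sample(data, rng):
    """Shuffle the data: the same values in a new order, no replacement."""
    return rng.sample(data, k=len(data))

def cross_validation_split(data, n_test, rng):
    """Hold out n_test observations for testing; fit on the rest."""
    shuffled = rng.sample(data, k=len(data))
    return shuffled[n_test:], shuffled[:n_test]   # (training, test)

rng = random.Random(42)
data = [3, 7, 8, 12, 15, 21]
boot = bootstrap_sample(data, rng)
jack = jackknife_samples(data)
perm = permutation_sample(data, rng)
train, test = cross_validation_split(data, n_test=2, rng=rng)
```

In practice, a test statistic would be computed on each resample to build the reference distribution described above.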

103
Q

A research study using an ABAB design involves the following elements in sequence:
Select one:
A. treatment, baseline, treatment, baseline
B. baseline, treatment, baseline, treatment
C. baseline, intervention in setting A, baseline, intervention in setting B
D. baseline for group A, treatment for group A, baseline for group B, treatment for group B

A

Correct Answer is: B
An ABAB design is a type of single-subject design. It is also an example of a reversal design - a baseline measure of a behavior is obtained (the “A” phase), the behavior is again measured after a treatment is administered (the “B” phase), the treatment is removed or reversed and the behavior is again measured (the second “A”), and the behavior is again measured after the treatment is re-applied (the second “B”).
baseline, intervention in setting A, baseline, intervention in setting B

baseline for group A, treatment for group A, baseline for group B, treatment for group B

These two options are examples of multiple baseline designs. Multiple baseline designs do not involve withdrawal of treatment, but rather, apply the treatment to multiple settings (multiple baseline across settings) or to the same behavior of different subjects (multiple baseline across subjects).

Additional Information: Single-Subject Designs

104
Q
Which of the following correlation coefficients would be used to determine the degree of association between two variables that are reported in terms of ranks?
Select one:
A. Spearman
B. contingency
C. phi
D. biserial
A

Correct Answer is: A
When both variables are ranks, the Spearman rho (also known as Spearman rank-order correlation coefficient) is used.
When both variables are measured on a nominal scale the contingency coefficient* is used. When both variables are true dichotomies the phi coefficient* is used. When one variable is continuous and the other is an artificial dichotomy, the biserial coefficient* is the appropriate correlation coefficient (* incorrect options).
Additional Information: Other Correlation Coefficients

105
Q
A test measuring verbal fluency is administered to 250 college students, and a split-half reliability coefficient is obtained. If the same test instead had been administered to 250 students aged 12-21, the obtained reliability coefficient probably would have
Select one:
A. been higher.
B. been lower.
C. remained about the same.
D. moved from negative to positive.
A

Correct Answer is: A
One factor that affects any correlation coefficient, including a reliability coefficient, is the range of scores. If the range of scores is restricted on either or both sets of scores, the correlation coefficient will be lowered. The two sets of scores involved in a split-half reliability coefficient are scores obtained by the same group of individuals on two different halves of the test. Originally, the test was administered to only college students. In the second scenario, the test was administered to a broader range of students. Because the broader age range should produce a wider range of scores, the reliability coefficient probably would have been higher.
Additional Information: Reliability

106
Q

In order to compare the results from two or more different studies that measure the same concept but used different outcome measures, you would be most interested in the:
Select one:
A. p-value
B. effect size
C. order effects
D. credentials of the researcher, since it’s impossible to compare data from different studies.

A

Correct Answer is: B
When comparing data from different studies which measure the same concepts, you would be conducting a meta-analysis which results in an effect size. An effect size indicates the average effect of a treatment across many different studies.
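The p-value indicates the probability of obtaining results at least as extreme as those observed if the null hypothesis were true. It is compared to the alpha level (the significance criterion set in advance) and is not the probability that the null hypothesis is false. It is useful to know, but is not as useful as the effect size in making comparisons between studies.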

Order (or carryover) effects are the effects of the order of treatment administration in a repeated measures design.
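One common standardized effect size is Cohen's d, the mean difference divided by the pooled standard deviation. The sketch below uses made-up scores to show why it allows comparison across outcome measures; the second "study" simply re-expresses the same data on a different scale:

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(treatment, control):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled_var = ((n1 - 1) * variance(treatment) +
                  (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)

# Hypothetical Study 1: outcome measured on a 0-100 scale.
study1_treatment = [62, 70, 68, 75, 80]
study1_control = [55, 60, 58, 66, 61]
# Hypothetical Study 2: the same outcomes expressed on a 0-10 scale.
study2_treatment = [x / 10 for x in study1_treatment]
study2_control = [x / 10 for x in study1_control]

d1 = cohens_d(study1_treatment, study1_control)
d2 = cohens_d(study2_treatment, study2_control)
```

The raw mean differences differ by a factor of ten, yet d is the same in both cases because it is expressed in standard deviation units.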
Additional Information: Meta-Analysis

107
Q
If you had a categorical variable and a continuous variable which of the following tests would you use?
Select one:
A. Point-biserial
B. Eta
C. Spearman Rho
D. Tetrachoric
A

Correct Answer is: A
The point-biserial correlation is used when one variable is continuous and the other is a true dichotomy (categorical).
Eta is used when there is a nonlinear relationship between variables. The Spearman rho is a correlation coefficient used to correlate two variables that have been ordinally ranked. A tetrachoric coefficient is used when both variables are artificially dichotomized.
Additional Information: Other Correlation Coefficients

108
Q

Mary obtains a percentile rank of 93 on a chemistry test and John obtains a 58 percentile rank on the same test. Due to errors in scoring their exams, 5 points have to be added to each of their raw scores, but not to the scores of the other examinees. This should cause:
Select one:
A. Mary’s percentile rank to increase more than John’s
B. John’s percentile rank to increase more than Mary’s
C. the percentile ranks to increase by the same amount
D. the percentile ranks to remain the same

A

Correct Answer is: B
Percentile ranks are evenly distributed, creating a flat or rectangular distribution of percentile scores. However, raw scores in a normal distribution form a bell shape, with most of the scores clustered in the center of the distribution and fewer scores at the high and low ends. Therefore, when points are added to a raw score at the 93rd percentile rank (located at the high end of the distribution), it will “jump over” fewer scores than when the same number of points are added to a raw score at the 58th percentile rank (located closer to the center of the distribution).
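A quick sketch of this effect, assuming the raw scores are normally distributed. The test mean of 70 and standard deviation of 10 are hypothetical, since the question does not give them:

```python
from statistics import NormalDist

def percentile_gain(start_percentile, points_added, mean, sd):
    """Change in percentile rank after adding points to one examinee's raw score."""
    dist = NormalDist(mean, sd)
    raw = dist.inv_cdf(start_percentile / 100)       # raw score at that percentile
    return 100 * dist.cdf(raw + points_added) - start_percentile

# Assumed distribution of raw scores: mean 70, SD 10.
mary_gain = percentile_gain(93, 5, mean=70, sd=10)
john_gain = percentile_gain(58, 5, mean=70, sd=10)
```

Under these assumptions, John's percentile rank rises several times as much as Mary's, because his score sits where the scores are most densely packed.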
Additional Information: Normal Distribution, Standard Scores, and Percentile Ranks

109
Q
In a negatively skewed distribution, the order of measures of central tendency from lowest to highest will be
Select one:
A. mode, mean, median.
B. mode, median, mean.
C. mean, mode, median.
D. mean, median, mode.
A

Correct Answer is: D
A frequency distribution is a list of the values that a variable takes in a sample, showing the number of times each value appears. In a negatively skewed distribution, most of the values fall on the high end. For example, the distribution of scores on an easy test might be negatively skewed, with most people scoring high and only a few low scores. The distribution is called negatively skewed because when graphed on an x-y axis, the tapering side, representing fewer values, is on the left, or lower score, side. In a negatively skewed distribution, the mode is higher than the median, which is higher than the mean. The mode is the most frequently occurring score in the distribution, and on an easy test, the most frequent score will be toward the high end. And of the three measures of central tendency, the mean is most sensitive to outlying scores and will be pulled down by the few low scores. The median, the score that cuts the distribution in half, with 50% scoring above the median and 50% below, will fall in between the mean and the mode.
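The ordering can be checked on a small made-up set of scores from an easy 10-point quiz (the data are hypothetical):

```python
from statistics import mean, median, mode

# Most scores are high, with a few low scores forming the tail on the left.
scores = [2, 5, 7, 8, 8, 9, 9, 9, 10, 10, 10, 10]

score_mean = mean(scores)      # pulled down by the low outliers
score_median = median(scores)  # middle of the distribution
score_mode = mode(scores)      # most frequent score, at the high end
```

For these scores, the mean is the lowest of the three, the median falls in the middle, and the mode is the highest.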
Additional Information: Skewed Distributions, Measures of Central Tendency

110
Q
When trying to prove causation, a researcher mismatches levels of data and tries to apply statistics at one level to infer to data of another level. This is referred to as:
Select one:
A. tautology
B. teleology
C. ecological fallacy
D. latent coding
A

Correct Answer is: C
Ecological fallacy is a logic error that occurs when, in trying to prove causation, levels of data are mismatched and statistics at one level are used to draw inferences about data at another level.
Tautology* is a logic error based on circular reasoning, meaning that something is true by definition or the dependent variable is simply a restatement of the independent variable. Teleology* is a logic error which explains a phenomenon by saying that it was some spirit or higher power that causes the relationship. Latent coding* occurs when a researcher reads into the meaning of the content he/she is analyzing to get data rather than simply taking it at face value. This is in contrast to manifest coding which occurs in content analysis when coding content is based on the face-value rather than looking into the meaning (* incorrect options).

111
Q

Cluster sampling involves
Select one:
A. randomly selecting individual subjects from a larger target population.
B. randomly selecting a naturally-occurring group of subjects from a larger target population.
C. randomly selecting several naturally occurring groups from a larger population, and then randomly selecting individuals from each group.
D. randomly selecting individuals from a larger target population and dividing subjects into groups on the basis of their status on a demographic variable.

A

Correct Answer is: B
In cluster sampling, naturally occurring groups of subjects, rather than individual subjects, are randomly selected for participation in research. For instance, if a researcher wants to use elementary school students in an educational study, he or she could randomly choose a school from the schools in the state, and use all the students in that school as participants in the study. That’s cluster sampling. A variation of cluster sampling, known as multistage cluster sampling, involves selecting a large cluster (group) and then selecting successively smaller clusters. For example, the researcher studying elementary school students could randomly select a school district, then randomly select a school from the chosen school district, and then randomly select a classroom from the chosen school. In this case, both forms of cluster sampling would be more practical than the alternative of simple random sampling, which would involve randomly selecting individual elementary school students from across the state.
Additional Information: Ways to Increase External Validity

112
Q
The error inherent in the best fit regression line is called the standard error of the:
Select one:
A. estimate.
B. mean.
C. measurement.
D. coefficient.
A

Correct Answer is: A
The standard error of estimate tells us how far we can expect to be off when making predictions based on a regression (prediction) equation. It’s a way to assess how well the equation “fits” the data. Try to keep the term “standard error of the estimate” in a box in your mind labeled “correlation coefficient,” since the higher the correlation coefficient, the lower the standard error of estimate.
The other errors relate to different situations. The standard error of the mean tells us how closely our sample mean approximates the population mean. Keep this one in a box in your mind labeled “experiments and samples.” The standard error of measurement tells us how accurately an obtained score on a test estimates someone’s true score on that test, if a true score were ever possible to obtain. So keep this one in a box labeled “reliability of a test.” And the “standard error of the coefficient” is a foil: it does not describe the error around the best-fit regression line.
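The link between the correlation coefficient and the standard error of estimate can be made concrete with the common textbook approximation, SD of the criterion times the square root of (1 − r²). The sketch below assumes that formula (it ignores small-sample corrections), and the function name and the SD of 10 are illustrative only.

```python
import math

def standard_error_of_estimate(sd_y, r):
    """Textbook approximation: SD of the criterion times sqrt(1 - r^2)."""
    return sd_y * math.sqrt(1.0 - r ** 2)

# With a criterion SD of 10, the error shrinks as the correlation grows.
for r in (0.0, 0.5, 0.9, 1.0):
    print(r, round(standard_error_of_estimate(10.0, r), 2))
```

When r is 0 the error equals the criterion's standard deviation, and when r is 1.0 predictions are perfect and the error is 0.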
Additional Information: Standard Error of Estimate

113
Q
Increasing internal validity is best achieved by:
Select one:
A. random selection
B. matching
C. random assignment
D. blocking
A

Correct Answer is: C
Known as the “great equalizer,” random assignment of subjects to groups is the most powerful method of controlling extraneous variables. Unlike random assignment, which occurs after subjects are selected, random selection refers to a method of selecting subjects from the population being studied. Random selection influences external validity. Matching (a procedure to ensure equivalency on a specific extraneous variable) and blocking (studying the effects of the extraneous variable) are also methods of increasing internal validity.
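The difference between random selection and random assignment can be sketched as two separate steps. The population, sample size, and group names below are hypothetical and chosen only for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100 people.
population = [f"person_{i}" for i in range(100)]

# Random selection: who gets INTO the study (bears on external validity).
sample = rng.sample(population, k=20)

# Random assignment: which GROUP each selected subject lands in
# (bears on internal validity).
shuffled = sample[:]
rng.shuffle(shuffled)
treatment_group = shuffled[:10]
control_group = shuffled[10:]

print(treatment_group)
print(control_group)
```

Selection happens first and draws from the population; assignment happens afterward and only divides the subjects already selected.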
Additional Information: Random Assignment

114
Q
The major threat to internal validity of a time-series quasi-experiment would be
Select one:
A. maturation.
B. selection.
C. regression.
D. history.
A

Correct Answer is: D
To get this correct (except if you got it correct by chance), you’d need to know what a time-series design is. Basically, you take a number of measurements over time to get a longitudinal baseline trend, then somewhere along the line you introduce your experimental manipulation. If, following the manipulation, you see the trend change, you can infer that your intervention caused the change. But a major threat to the internal validity of this design is a historical event which could co-occur with your experimental manipulation. You’d have no control over these events and they could be a rival explanation for changes in your measurements.
Additional Information: Time-Series Design

115
Q
All of the following are norm-referenced scores except:
Select one:
A. pass/fail
B. grade-equivalent scores
C. T-score
D. percentile rank
A

Correct Answer is: A
Norm-referenced scores indicate how well an individual performed on a test compared to others in the norm group. A pass or fail score achieved by one individual does not indicate how many others passed or failed. Pass/fail is a criterion-referenced score, which indicates if an individual knows the exam content or not, but does not measure performance relative to other examinees. The other three responses are norm-referenced scores.
A grade-equivalent score* permits a test user to compare an individual’s exam performance to others in different grade levels. A T-score* is a type of standard score, a norm-referenced score indicating how a test-taker performed in terms of standard deviation units from the mean score of the norm group. A percentile rank* shows the percent of individuals in the norm group who scored lower (* incorrect options).
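Both a T-score and a percentile rank are computed relative to the norm group, which is what makes them norm-referenced. The sketch below uses the standard T-score convention (mean of 50, standard deviation of 10) and the "percent scoring lower" definition given above; the norm-group scores are invented.

```python
from statistics import mean, pstdev

# Hypothetical norm group of raw scores (mean is 60).
norm_group = [40, 45, 50, 50, 55, 60, 65, 70, 75, 90]

def t_score(raw, scores):
    """Standard score rescaled to mean 50, SD 10."""
    z = (raw - mean(scores)) / pstdev(scores)
    return 50 + 10 * z

def percentile_rank(raw, scores):
    """Percent of the norm group scoring lower than `raw`."""
    below = sum(1 for s in scores if s < raw)
    return 100 * below / len(scores)

print(t_score(60, norm_group))          # a score at the mean
print(percentile_rank(60, norm_group))
```

A pass/fail decision, by contrast, would compare the raw score to a fixed cutoff and would not need the norm group at all.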
Additional Information: Standard Scores

116
Q
The upper and lower limits of the standard error of measurement for a test with a mean of 80 and standard deviation of 10 are:
Select one:
A. 0 to 80
B. 0 to 10
C. -1.0 to +1.00
D. 0 to +1.0
A

Correct Answer is: B
There is no error in measurement and the standard error of measurement equals zero when the reliability coefficient of a test is equal to +1.0 (the highest reliability coefficient possible). The standard error of measurement equals the standard deviation of the test scores when the test’s reliability coefficient is equal to 0 (the lowest possible). It is helpful to know the formula when answering this type of question: the standard error of measurement equals the standard deviation times the square root of one minus the reliability coefficient.
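The formula can be checked directly for the test in this question (standard deviation of 10). The function name is illustrative; the formula is the one stated above.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SD times the square root of (1 - reliability coefficient)."""
    return sd * math.sqrt(1.0 - reliability)

sd = 10.0
print(standard_error_of_measurement(sd, 1.0))   # perfect reliability
print(standard_error_of_measurement(sd, 0.0))   # no reliability
print(standard_error_of_measurement(sd, 0.84))  # a value in between
```

The two extremes give the lower limit (0) and the upper limit (the standard deviation, 10), and any reliability in between gives a value inside that range. Note that the mean of 80 never enters the calculation.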
Additional Information: Standard Error of Measurement

117
Q
Using an 8 hour comprehensive battery of tests to assess intelligence, subjects' performance systematically declined in the later hours of assessment. This is due to which of the following threats to internal validity?
Select one:
A. maturation
B. testing
C. selection
D. instrumentation
A

Correct Answer is: A
Maturation refers to any internal change (biological or psychological) that occurs in subjects while an experiment is in progress and which results in a systematic effect on the DV. Such a long duration of testing would probably cause subjects to be fatigued, which would affect their performance on a test of intelligence.
Testing refers to subjects’ improved performance on a post-test due to their experience with the pre-test.

Selection refers to pre-existing subject factors that account for the results in the DV (for example, in measuring the effects of an exercise program, the participants chosen for the program were more athletic than the control group).

Instrumentation threatens internal validity when the measuring process changes between the pre- and post-tests (e.g., raters become better at rating with practice).
Additional Information: Threats to Internal Validity

118
Q

An educational psychologist has data on 12 different variables collected from students in the graduating high school class of the preceding year, including high school GPA, SAT scores, teacher ratings, and various tests of motivation and personality. She is interested in using these measures to predict success in college. In this instance, the psychologist would use stepwise multiple regression in order to
Select one:
A. develop a predictive equation using all 12 measures.
B. determine the optimal set of measures to use.
C. determine if mean differences on the 12 measures significantly differ from each other.
D. identify any cultural bias in the predictor or criterion measure.

A

Correct Answer is: B
Stepwise multiple regression is a variation of multiple regression. In multiple regression, one develops an equation that uses two or more predictor variables to predict scores on a criterion (outcome) variable. Stepwise multiple regression involves starting with a large set of predictors and reducing them to a smaller set that provides significant predictive value without providing overlapping information. Specifically, the goal is to get predictors that have high enough correlations with the criterion and low enough correlations with each other to be included. If predictors have high correlations with each other, they are basically providing overlapping information and there is no point in including them. The two variations of this technique are forward stepwise multiple regression and backwards stepwise multiple regression. In forward stepwise multiple regression, you start with the predictor that has the highest correlation with the criterion, add one predictor at a time, and run a significance test after each addition to see if the added predictor significantly increases the combined predictive value of the overall equation. The process stops when an added predictor fails to significantly increase predictive value. In backwards stepwise multiple regression, you start with all predictors and remove them one at a time, starting with the one that is least correlated with the criterion. This process ends when removal of a predictor causes a significant decrease in the ability of the equation to predict values on the criterion.
Additional Information: Stepwise Multiple Regression

119
Q

The eta correlation ratio would be used to
Select one:
A. estimate the strength of a nonlinear relationship.
B. measure the relationship between two dichotomous variables.
C. estimate the strength of a relationship between a dichotomous variable and a quantitative variable.
D. measure the relationship of variables measured by ranks.

A

Correct Answer is: A
There are a number of different types of correlation coefficients. The most common is the Pearson r, which is used to measure the relationship between two quantitative variables assumed to be related in a linear way. When a nonlinear relationship is assumed, the eta coefficient can be used instead. A linear relationship is one where as the value of one variable increases the other increases (positive correlation), or where as the value of one variable increases the other decreases (negative correlation). For example, the correlation between height and weight will be positive and linear; the correlation between income and health problems will be negative and linear. In a nonlinear relationship, variables are related but not in this linear fashion. An example would be a depression drug that has no effect at low dose, decreases symptoms at moderate doses, and increases symptoms at very high doses. Here there would be a non-linear relationship between dosage and depression level.
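A small numerical sketch shows why the Pearson r can miss a nonlinear relationship that eta picks up. It assumes the usual definition of the correlation ratio (the square root of between-group sum of squares over total sum of squares); the dose levels and symptom scores are invented to mimic a curved dose-response pattern.

```python
import math
from collections import defaultdict

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def eta(xs, ys):
    """Correlation ratio: sqrt(SS_between / SS_total), grouping ys by x."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    grand = sum(ys) / len(ys)
    ss_total = sum((y - grand) ** 2 for y in ys)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups.values()
    )
    return math.sqrt(ss_between / ss_total)

# Hypothetical data: dose level 1 = low, 2 = moderate, 3 = very high.
dose = [1, 1, 1, 2, 2, 2, 3, 3, 3]
symptoms = [20, 22, 21, 10, 12, 11, 20, 22, 21]

print(round(pearson_r(dose, symptoms), 3))  # near zero: no linear trend
print(round(eta(dose, symptoms), 3))        # close to 1: strong relationship
```

Symptoms dip at the moderate dose and return to baseline at the very high dose, so the linear coefficient is essentially zero even though dose and symptoms are strongly related.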
Regarding the other choices, the phi coefficient can be used to measure the correlation between two dichotomous variables (i.e., variables that can take one of two values).

The point-biserial coefficient is used to measure the correlation between a dichotomous variable and a quantitative variable.

And Spearman’s rho is used to measure the correlations between two sets of ranked data.
Additional Information: Other Correlation Coefficients

120
Q

One potential advantage of nonparametric statistical tests over parametric tests is that the former
Select one:
A. require fewer assumptions about the population data.
B. are more powerful.
C. result in a lower probability of false positives.
D. provide more precise information.

A

Correct Answer is: A
Both parametric and nonparametric tests are used in statistical hypothesis testing. In both, sample data is collected and analysis is run to see if the data supports a research hypothesis. Parametric tests make assumptions about the underlying population data (e.g., that data is normally distributed) and also are typically used to estimate population parameters. For instance, given a statistic from a sample of subjects, parametric tests typically indicate the probability that the statistic falls within a certain range in the underlying population. By contrast, nonparametric tests do not make assumptions about and do not attempt to estimate population parameters. Another advantage of nonparametric tests is they can be used to test hypotheses about ranked data or non-numerical categorical data. However, when the assumptions required to use them are met, parametric tests provide more accurate and precise results than nonparametric tests.
Additional Information: Nonparametric Tests

121
Q
A psychologist in a hospital is conducting research designed to assess the effects of a new drug on the social behavior of psychotic patients. Which of the following would be the best way to decrease experimenter bias in this type of study?
Select one:
A. a double-blind study
B. counterbalancing
C. a randomized block design
D. a Solomon four-group design
A

Correct Answer is: A
In a double-blind study, neither the experimenter nor the subjects know the research hypothesis. This technique thus controls for all types of experimenter and subject expectancies, including experimenter bias. The experimenter’s bias in favor of the research hypothesis cannot influence the results of the study if he or she does not know the hypothesis.
Additional Information: Ways to Increase External Validity

122
Q
When random assignment of subjects to groups is not possible, researchers use:
Select one:
A. Experimental design
B. Quasi-experimental design
C. Developmental research
D. Longitudinal research
A

Correct Answer is: B
Quasi-experimental designs are used when random assignment of subjects to groups is not possible. In a true experimental design, the investigator randomly assigns subjects to different groups, which receive different levels of the manipulated variable. Developmental research involves assessing variables as a function of time. One type of developmental research is longitudinal research, in which the same people are studied over a long period of time.
Additional Information: Specific Research Designs and Strategies

123
Q
To determine the relationship between a dichotomous variable and a continuous variable, you would use which of the following correlation coefficients?
Select one:
A. point biserial
B. biserial
C. Spearman's Rho
D. eta
A

Correct Answer is: A
You should memorize the different correlation coefficients and when they are used. The point-biserial coefficient is used when a dichotomous variable (e.g., gender) is correlated with a continuous variable (e.g., IQ score). You might have thought the biserial coefficient is also correct, since it is used to correlate an artificial dichotomy with a continuous variable. An artificial dichotomy is one that is created arbitrarily by setting a cutoff score on a test; for instance, if you give the WAIS-IV and classify everybody who scores over 110 as having “high intelligence” and everybody who scores below 110 as having “low intelligence,” you have created an artificial dichotomy. Because an artificial dichotomy is not, in a pure sense, a dichotomous variable, “biserial” is not as good an answer as “point biserial.”
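The point-biserial coefficient is numerically the same as a Pearson r computed with the dichotomous variable coded 0/1. The sketch below compares the two using the standard point-biserial formula (difference in group means over the population SD, times the square root of pq); the group codes and scores are hypothetical.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def point_biserial(codes, scores):
    """(M1 - M0) / SD * sqrt(p * q), with SD computed over all scores (divide by n)."""
    n = len(scores)
    ones = [s for c, s in zip(codes, scores) if c == 1]
    zeros = [s for c, s in zip(codes, scores) if c == 0]
    p = len(ones) / n
    q = 1 - p
    m = sum(scores) / n
    sd = math.sqrt(sum((s - m) ** 2 for s in scores) / n)
    return (sum(ones) / len(ones) - sum(zeros) / len(zeros)) / sd * math.sqrt(p * q)

# Hypothetical data: a true dichotomy coded 0/1 and a continuous score.
codes = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [95, 100, 105, 100, 110, 115, 120, 105]

print(round(point_biserial(codes, scores), 4))
print(round(pearson_r(codes, scores), 4))
```

The two printed values agree, which is why the point-biserial coefficient is often described as a special case of the Pearson r.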
Additional Information: Other Correlation Coefficients

124
Q
A psychological researcher would like to determine what variables best distinguish between patients who benefit from psychotherapy and patients who do not. To identify these variables, the researcher would most likely use which of the following?
Select one:
A. discriminant function analysis
B. factor analysis
C. canonical correlation
D. MANOVA
A

Correct Answer is: A
Discriminant function analysis is used to identify variables that distinguish between two or more existing or naturally occurring groups. Its use would involve collecting data on a variety of measures and determining which combination of them best predict differences between the groups. Since the researcher’s purpose is to find variables that distinguish between existing groups, discriminant function analysis is the best answer.
Regarding the other choices, factor analysis is used to reduce variability in a set of variables to a smaller set of unobserved variables, or factors. For example, factor analysis might be used to confirm a theory that score differences on a variety of intelligence measures can be explained in terms of two factors, verbal intelligence and performance intelligence.

Canonical correlation is a technique for assessing the relationship between two sets of variables: i.e., it is used to assess the relationship between multiple predictor and multiple criterion variables.

And MANOVA, or multivariate analysis of variance, is used in research studies to evaluate the effects of one or more independent variables on multiple (two or more) dependent variables.
Additional Information: Discriminant Function Analysis

125
Q

Which of the following techniques would be appropriate when multiple predictors will be used to predict a score on a single criterion?
Select one:
A. multiple regression analysis
B. multiple discriminant function analysis
C. principal components analysis
D. linear regression analysis

A

Correct Answer is: A
When multiple predictors will be used to predict a score on a single criterion, multiple regression is appropriate. Discriminant function analysis is used to determine which continuous variables discriminate between two or more naturally occurring groups.
Multiple discriminant function analysis, an extension of discriminant function analysis, involves using multiple predictors to sort individuals into one of three or more criterion groups.

Principal components analysis, similar to factor analysis, is used to determine the variables or components that account for the total variance in test scores.

When a single predictor is used to predict or estimate a score on a single criterion, linear regression is appropriate.
Additional Information: Multiple Correlation and Multiple Regression

126
Q

Protocol analysis typically involves
Select one:
A. specifying the unstated rules of communication between two individuals.
B. analyzing a behavior in terms of its antecedents and consequences.
C. recording specific behaviors that allow one to understand the subject’s problem-solving methodology.
D. standardizing psychotherapy procedures.

A

Correct Answer is: C
Protocol analysis is an umbrella term used to refer to qualitative research studies that involve collecting verbatim reports. These reports could consist of an examinee’s verbal statements, a descriptive account of a subject’s behavior, or both. The term is commonly applied to research where the subject is asked to “think aloud” as he or she is performing a task. The researcher then records what the subject does and says, and analyzes the data to determine what cognitive processes are used to solve the problem. The analysis, by the way, is not quantitative or statistical; rather, it is qualitative, or based on the researcher’s own interpretations.
Additional Information: Protocol Analysis

127
Q

You have been hired by a firm to develop a battery of tests designed to predict the precise amount of goods, in dollar amounts, that each member of the firm’s sales staff will sell. The firm’s president tells you that he will spare no expense, and he wants you to develop as many tests as possible, since the more tests that are included in the battery, the stronger the battery’s predictive power will be. As an expert in quantitative psychology, you tell him
Select one:
A. that he is absolutely correct; the more predictors one has, the higher the predictive power one obtains.
B. that what he is saying is true only if the predictors are highly correlated with each other.
C. that due to multicollinearity, it will not be cost efficient to add more predictors to the battery after a certain point.
D. how to spell your name on the paycheck, since, as a quantitative psychologist, you know a lucrative gig when you see one.

A

Correct Answer is: C
Multicollinearity occurs in multiple regression equations when predictors are highly correlated. Though ideally, predictors should be correlated with the criterion, it is best if they are not correlated with each other; otherwise, there is no point in combining the predictors since they provide redundant information. Eventually, multicollinearity becomes an issue every time multiple regression is used (usually after 2-3 predictors) since it is impossible to keep finding variables that have high correlations with a criterion but low correlations with each other.
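The cost of correlated predictors can be illustrated with the standard formula for the squared multiple correlation with two predictors, R² = (r_y1² + r_y2² − 2·r_y1·r_y2·r_12) / (1 − r_12²). The correlation values below are hypothetical and chosen only to show the pattern.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation for two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# One predictor alone (r = .50) explains .25 of the criterion variance.
single = 0.5 ** 2

# Adding a second, equally valid predictor that is uncorrelated with the first.
uncorrelated = r_squared_two_predictors(0.5, 0.5, 0.0)

# Adding a second predictor that correlates .80 with the first.
collinear = r_squared_two_predictors(0.5, 0.5, 0.8)

print(single, round(uncorrelated, 3), round(collinear, 3))
```

With uncorrelated predictors the second test doubles the variance explained; when the predictors correlate .80 with each other, the second test adds under three percentage points, which is the redundancy the explanation describes.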
Additional Information: Multiple Correlation and Multiple Regression

128
Q
Which of the following is NOT a disadvantage of a repeated measures design?
Select one:
A. multicollinearity
B. autocorrelation
C. practice effects
D. carryover effects
A

Correct Answer is: A
A “repeated measures” design, sometimes referred to as a “within-subjects design,” uses more than one measurement of a given variable for each subject. For example, longitudinal studies and pre-test/post-test designs measure the same subjects multiple times. These designs have several disadvantages. “Autocorrelation”* means that observations obtained close together in time from the same subjects tend to be highly correlated, which violates the independence-of-observations assumption made by statistical tests. “Practice effects”* and “carryover effects”* (as well as “order effects”) all refer to systematic changes in subjects’ performance due to prior exposure to a treatment condition or measurement (* incorrect options).
However, multicollinearity refers to a problem associated with multiple regression which occurs when two or more predictors are highly correlated with each other.
Additional Information: Threats to External Validity