8- Research Design and Statistics Flashcards
Which of the following is NOT generally considered a direct threat to external validity?
Select one:
A. order effects
B. Hawthorne effect
C. interaction between selection and treatment
D. history
Correct Answer is: D
Distinguishing between internal and external threats to validity can be difficult. Indeed, some experts disagree on how to categorize some of them. However, all of the choices except “history” are generally considered to be threats to external validity.
Order effects* (also known as carryover effects) occur in repeated measures designs, or in studies in which the same subjects are exposed to more than one treatment. For example, in a study on the effects of marital therapy interventions, couples are given relaxation training followed by communication training. If significant improvement occurs, it may be due to relaxation training preceding communication training; therefore, the results could not be generalized to situations in which subjects only receive communication training.
The Hawthorne effect* occurs when subjects behave differently due to the fact that they are participating in research. Obviously this threatens external validity since the results cannot be generalized to real-life situations in which people are not participating in research.
Interaction between selection and treatment* occurs when a treatment has different effects depending on which subjects are selected. For example, studies that only use undergraduate students (as many studies do) might not generalize to non-undergraduate students (* incorrect options).
Finally, history refers to an external event, other than the experimental treatment, that affects scores on the DV. This is primarily considered a threat to internal validity. For example, if a study on the effects of a new treatment for depression began several weeks before the events on “9-11” and concluded several weeks after “9-11,” the results might indicate that the new treatment is not effective. However, this might not be a valid conclusion due to the effects of history.
Additional Information: Threats to External Validity
When a study has two or more independent variables the research design used is:
Select one:
A. MANOVA
B. ANOVA-one way
C. Factorial ANOVA
D. ANCOVA
Correct Answer is: C
The factorial ANOVA is used when a study involves more than one independent variable. A one-way ANOVA is used when a study has one independent variable and more than two independent groups. MANOVA is used when the study has two or more dependent variables, and ANCOVA is used to adjust dependent variable scores to control for the effects of an extraneous variable.
Additional Information: Factorial ANOVA
Mothers who get high scores on the WAIS-III tend to have children who get high scores on the WISC-III. On the basis of this information, which of the following conclusions is most justified?
Select one:
A. intelligence is hereditary
B. parental intelligence is correlated with offspring intelligence
C. the WAIS-III and the WISC-III are correlated
D. the WAIS-III and the WISC-III are uncorrelated
Correct Answer is: B
None of these answers is great, but the only one that is possible is that parental intelligence and offspring intelligence are correlated. To know this for sure, we would need to know more about moderate and low scorers on the WAIS-III – do their children have moderate and low scores, respectively on the WISC-III? However, none of the other choices makes any sense. For instance, we can’t say that intelligence is hereditary from this information, since environmental rather than genetic factors may have resulted in the similarity of scores between mother and child. Also, we can’t say the WISC-III is correlated (or uncorrelated) with the WAIS-III. To measure the correlation between two tests, one must administer both of them to the same set of examinees.
Which of the following is a measure of "amount of variability accounted for"?
Select one:
A. alpha
B. Cohen's d
C. eta squared
D. F-ratio
Correct Answer is: C
The “amount of variability accounted for” is assessed by a squared correlation coefficient. Eta squared is the square of the correlation coefficient (i.e., the correlation between the treatment and the outcome) and is used as an index of effect size.
Alpha* is the level of significance set by a researcher prior to analyzing the data. Cohen’s d* is used as an index of effect size, but it is a measure of the mean difference between two groups. The F-ratio* is the statistic calculated when using the analysis of variance (* incorrect options).
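To make "amount of variability accounted for" concrete, here is a minimal sketch that computes eta squared as the between-groups sum of squares divided by the total sum of squares. The three groups of scores are made up for illustration, and the function name is hypothetical.

```python
def eta_squared(groups):
    """Return SS_between / SS_total for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    # Total variability: every score's squared distance from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Between-groups variability: each group mean's squared distance from
    # the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


# Hypothetical scores for three treatment groups (illustration only).
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
eta_sq = eta_squared(groups)
print(round(eta_sq, 3))
```

With these made-up scores, 24 of the 30 units of total squared deviation lie between groups, so the treatment accounts for 80% of the variability.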
Additional Information: Correlation and the Correlation Coefficient
In designing a research study, you take a number of steps that have the effect of reducing beta. This means that you have reduced the probability of:
Select one:
A. retaining a true null hypothesis.
B. retaining a false null hypothesis.
C. rejecting a true null hypothesis.
D. rejecting a false null hypothesis.
Correct Answer is: B
Beta is the probability of making a Type II error, or of retaining a false null hypothesis. In plain language, it is the probability of failing to detect a true effect.
Additional Information: Type II (Beta) Error
A MANOVA is used to statistically analyze data when:
Select one:
A. a study includes two or more independent variables
B. a study includes two or more dependent variables
C. there are more than two levels of a single independent variable
D. a study includes at least one independent variable that is a between-groups variable and another independent variable that is a within-subjects variable
Correct Answer is: B
A MANOVA (multivariate analysis of variance) is used to analyze the effects of one or more independent variables on two or more dependent variables that are each measured on an interval or ratio scale.
a study includes two or more independent variables
A factorial ANOVA is used to analyze data when a factorial design, which includes two or more independent variables, is used and the dependent variable is measured on an interval or ratio scale.
there are more than two levels of a single independent variable
A one-way ANOVA is used when a study has one independent variable and more than two independent groups.
a study includes at least one independent variable that is a between-groups variable and another independent variable that is a within-subjects variable
The split-plot (mixed) ANOVA is the appropriate technique when at least one independent variable is a between-groups variable and another independent variable is a within-subjects variable.
If data points are widely scattered around a regression line, it would indicate:
Select one:
A. high heteroscedasticity.
B. low heteroscedasticity.
C. low homoscedasticity.
D. a low correlation coefficient.
Correct Answer is: D
Simply put, a lot of variance around the regression line indicates that the correlation isn’t too high. Be careful not to confuse this with the idea of heteroscedasticity. This term means that the scatter is uneven at different points of the continuum. For instance, there might be high variability around the regression line at low x (predictor) values, and low variability around the line at high x values. In other words, heteroscedasticity refers to a differential level of scatter, not high scatter.
A psychologist is conducting research to evaluate the effectiveness of three predictor tests of overall mental health he has developed. He administers the predictors to 35 individuals randomly chosen from the population of interest and obtains a squared multiple correlation coefficient (R²) of .47. If the psychologist administers the predictors to another 70 individuals drawn from the same population, the best prediction is that he would obtain an R² that is:
Select one:
A. lower than .47.
B. about equal to .47.
C. slightly to moderately higher than .47.
D. much higher than .47.
Correct Answer is: C
The principle behind this question is that the greater the range of scores in both the predictor(s) and the criterion, the higher the validity coefficient will be. If you administer the predictors to 70 people as opposed to 35, you are likely to get a somewhat greater range of scores in the former case. Therefore, you will get a somewhat higher correlation coefficient. This choice (“much higher than .47”) is not a good answer. Increasing the range of scores can only do so much for your correlation coefficient, especially if you already have a reasonably representative sample to begin with. Increasing the sample size from 35 to 70, for example, will not turn a poor set of predictors into a good one.
Some of you might have gone for this choice (“lower than .47”), thinking that, due to shrinkage, the correlation coefficient would be smaller. Shrinkage, however, is associated with the development of a predictor or set of predictors. It occurs when, based on research with one sample, items for a predictor are chosen from a larger pool, and the newly developed predictor is then tested on a second sample. The correlation coefficient for the second sample is likely to be smaller, because the predictor was “tailor made” for the first sample. In this question, however, the predictors are not in the process of development, and the first group of 35 people is not a validation sample (i.e., a sample of people used to determine which items to retain for the final version of the test). To see if you understand this distinction, try to rewrite the question, changing as few words as possible, so that this (“lower than .47”) becomes the best answer.
Additional Information: Multiple Correlation and Multiple Regression
The best control for practice effects in an experiment is:
Select one:
A. counterbalancing.
B. random selection.
C. double-blind.
D. equivalent groups.
Correct Answer is: A
If subjects are getting a series of treatments, and the order of the presentation might affect the outcome, you would present their treatments in different orders in a counter-balanced design in order to control for the possible practice effect of receiving a set order of treatments.
Additional Information: Ways to Increase External Validity
A study is conducted to determine if males and females differ in terms of the reasons they buy automobiles; 100 males and 100 females are asked whether the primary reason they bought their car was its appearance, its price, or its perceived reliability. The best statistical test to evaluate the hypothesis that males and females differ in these preferences would be a:
Select one:
A. factorial ANOVA.
B. t-test for independent samples.
C. one-way ANOVA.
D. chi-square test.
Correct Answer is: D
The chi-square test is used to analyze the results of studies in which the data consist of the classification of objects into categories. Such data are also referred to as nominal data. For example, in this study, the data will be a 2 x 3 matrix of frequencies, or counts, of responses within each category (male-appearance, female-appearance, male-price, female-price, male-reliability, female-reliability). The chi-square test could then be used to determine whether the observed frequencies differ significantly from what would be expected if males and females did not differ in these preferences. All the other choices are tests used in studies that use interval or ratio data as opposed to categorical data. With interval or ratio data, the difference between two scale values can be quantified (e.g., on an IQ test, a difference between an IQ of 100 and 85 is equivalent to the difference between 130 and 115). With interval or ratio data, one can obtain the mean scores of different research groups or trials, and choices A, B, and C are all tests that involve comparing group means to each other. By contrast, with nominal data, the concept of mean scores would not make sense; all you can do is count frequency of occurrence within categories.
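The 2 x 3 frequency logic above can be sketched in a few lines. The counts below are invented for illustration (they are not data from the question), and the function name is hypothetical; in practice a statistics package would also report the p value.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency expected if sex and reason were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical counts: columns are appearance, price, reliability.
observed = [
    [50, 30, 20],  # 100 males
    [30, 30, 40],  # 100 females
]
chi2, df = chi_square_independence(observed)
print(round(chi2, 2), df)
```

With df = 2, the critical value at alpha = .05 is 5.99, so a statistic above that value would lead you to reject the hypothesis that males and females do not differ in their reasons.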
Additional Information: Chi-Square Test
The use of "pooled variance" in statistics assumes that:
Select one:
A. the sample sizes are equal
B. the sample variances are equal
C. the population sizes are equal
D. the population variances are equal
Correct Answer is: D
Pooled variance is a weighted average of the variances of the groups. Each group's variance is "weighted" based on the number of subjects in that group (strictly, by its degrees of freedom, n - 1). Use of a pooled variance assumes that the population variances are approximately the same, even though the sample variances differ.
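As a worked illustration, the usual two-group formula weights each sample variance by its degrees of freedom (n - 1). The sketch below uses made-up scores and a hypothetical function name.

```python
from statistics import variance


def pooled_variance(a, b):
    """Average two sample variances, weighting each by its degrees of freedom."""
    n1, n2 = len(a), len(b)
    return ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)


# Hypothetical scores for two groups of unequal size.
group_a = [2, 4, 6]        # sample variance = 4
group_b = [1, 2, 3, 4, 5]  # sample variance = 2.5
sp2 = pooled_variance(group_a, group_b)
print(sp2)
```

The pooled value falls between the two sample variances and sits closer to the variance of the larger group.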
An admissions committee is planning to modify its application and admissions policy. They are evaluating the current student enrollment and are interested in the relationship between gender and high school GPA. Which statistical method would be used?
Select one:
A. Point biserial correlation
B. Multiple correlation
C. Canonical correlation
D. Tetrachoric correlation
Correct Answer is: A
The point biserial correlational technique is used when one variable is dichotomous (gender) and one is continuous (high school GPA).
Multiple correlation
Multiple correlation is used when there are two or more predictor variables and a single criterion variable.
Canonical correlation
Canonical correlation is used when there are two or more predictor variables and two or more criterion variables.
Tetrachoric correlation
Tetrachoric correlation is a technique used to estimate the magnitude of the relationship between two continuous variables that have been dichotomized, such as dividing age into two groups: under 40 and over 40.
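For the point biserial case (one true dichotomy, one continuous variable), a short sketch can show that the textbook formula gives the same value as an ordinary Pearson r computed with the dichotomy coded 0/1. The data and function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Ordinary Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def point_biserial(groups, scores):
    """Textbook formula: (M1 - M0) / s * sqrt(p * q), with groups coded 0/1."""
    ones = [s for g, s in zip(groups, scores) if g == 1]
    zeros = [s for g, s in zip(groups, scores) if g == 0]
    n = len(scores)
    mean_all = sum(scores) / n
    # Standard deviation of all scores, using the divide-by-n form.
    sd = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    p, q = len(ones) / n, len(zeros) / n
    return (sum(ones) / len(ones) - sum(zeros) / len(zeros)) / sd * sqrt(p * q)


# Hypothetical data: a 0/1 group code and high school GPA.
gender = [0, 0, 0, 1, 1, 1]
gpa = [2.5, 3.0, 3.5, 3.0, 3.5, 4.0]
r_pb = point_biserial(gender, gpa)
r_pearson = pearson_r(gender, gpa)
print(round(r_pb, 3), round(r_pearson, 3))
```

The two printed values agree, which is why the point biserial coefficient is often described as a special case of Pearson r.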
Additional Information: Other Correlation Coefficients
The correlation obtained between two tests supposedly measuring the same ability will be affected most by the:
Select one:
A. time of day during which the tests are taken.
B. reliability of the tests used.
C. whether raw scores or standard scores are used as data.
D. ranges of abilities tested
Correct Answer is: B
If a test’s reliability is low, the scores obtained will not be accurate. You will get too much error. If you compare two tests with a lot of error in them, you will not get an accurate prediction of their relationship. Remember that low reliabilities of measures used on the predictor or on the criterion measures will restrict your obtained correlation.
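One way to see how unreliability restricts an obtained correlation is the classical test theory result that the observed correlation between two measures cannot exceed the square root of the product of their reliabilities. The sketch below assumes that model; the reliability values and function names are hypothetical.

```python
from math import sqrt


def max_observed_correlation(rel_x, rel_y):
    """Ceiling on the observed correlation, given the two reliabilities."""
    return sqrt(rel_x * rel_y)


def correct_for_attenuation(r_observed, rel_x, rel_y):
    """Estimate the correlation that perfectly reliable measures would show."""
    return r_observed / sqrt(rel_x * rel_y)


# Hypothetical reliabilities of .64 and .81, and an observed r of .36.
ceiling = max_observed_correlation(0.64, 0.81)
corrected = correct_for_attenuation(0.36, 0.64, 0.81)
print(round(ceiling, 2), round(corrected, 2))
```

Under these assumed values, the two tests could correlate at most .72 even if they measured exactly the same ability.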
Additional Information: Reliability
You conduct a study designed to assess the effectiveness of psychotherapy in the treatment of depression. You work with two groups, one of which receives the therapy and one of which is an attention-only control group. All of your subjects are hospitalized inpatients; thus, all of them are extremely depressed and therefore score extremely low on your pretest measure of depression. The biggest threat to external validity in this study is:
Select one:
A. regression to the mean
B. reactivity
C. interaction between selection and treatment
D. pretest sensitization
Correct Answer is: C
Note that you are being asked for the biggest threat to external validity, not internal validity in this question. Therefore, you can rule out regression to the mean, which is generally viewed as a threat to internal validity (regression probably wouldn’t threaten internal validity anyway in this case, since both groups appear to be equivalent in terms of their baseline depression levels).
External validity refers to the generalizability of research results. An “interaction between selection and treatment” means that the effect of a treatment may not generalize to other members of the target population who differ in some way from the research subjects. For example, in this case, it’s possible that your therapy is effective for individuals who are highly depressed, but would not have any effect on individuals who are moderately depressed.
Additional Information: Threats to External Validity
A set of past graduate students are divided into two groups by a doctorate admissions committee. One group consists of students who finished the program in five years or less, the other consists of those who did not. Based on undergraduate grade point average and GRE score, which of the following could be used to predict successful completion of the graduate program?
Select one:
A. MANOVA
B. Structural equation modeling
C. Discriminant function analysis
D. Cluster analysis
Correct Answer is: C
Discriminant function analysis (DA) is used to determine which continuous variables discriminate between two or more naturally occurring groups. Here, it would provide insight into how each predictor (e.g., grades, GRE score), individually and/or in combination, predicts completion or non-completion of the graduate program. In DA, the independent variables are the predictors and the dependent variables are the groups. In contrast, in MANOVA, the independent variables are the groups and the dependent variables are the predictors.
A multivariate analysis of variance (MANOVA) is used to analyze the effects of one or more independent variables on two or more dependent variables that are each measured on an interval or ratio scale.
Structural equation modeling is a technique used to evaluate or confirm the cause-and-effect or hypothesized relationship between both measured and latent variables.
Cluster analysis is a method for grouping objects of similar kind into respective categories. It can be used to discover structures in data without providing an explanation/interpretation.
Additional Information: Discriminant Function Analysis
With regard to research design, the term external validity refers to:
Select one:
A. significance
B. control
C. generalizability
D. accuracy
Correct Answer is: C
External validity refers to the ability to generalize findings beyond the specifics (e.g., time, setting, and subjects) of a research study. Internal validity is the extent to which changes in the dependent variable can be attributed to the independent variable. Both types of validity are research design considerations.
Additional Information: External Validity
A moderator is
Select one:
A. a variable that affects the direction or strength of the association between two other variables.
B. an explanation of how external physical events take on internal psychological significance.
C. a variable that identifies the relationship between two variables and serves to magnify the strength of the variables.
D. a variable that accounts for the relationship between two variables.
Correct Answer is: A
In general, a moderator is a qualitative (e.g., race, sex, class) or quantitative (e.g., level of reward) variable that affects the direction and/or strength of the relation between an independent or predictor variable and a dependent or criterion variable. A moderator only influences the direction or strength of the relationship between two other variables; it does not account for that relationship. In contrast, a variable functions as a mediator to the extent that it accounts for the relation between the predictor and the criterion.
A psychologist wants to study the effectiveness of a new treatment she developed to reduce self-mutilative behaviors in patients with Borderline Personality Disorder. She plans to use a single-subject design but, if the treatment is effective, she does not want to withdraw it due to the potential harm that could result. She should, therefore, use which of the following research designs?
Select one:
A. ABAB
B. multiple baseline
C. reversal
D. Latin square
Correct Answer is: B
A multiple baseline design is a single-subject design in which an independent variable is sequentially administered across two or more subjects, behaviors, or settings (i.e., across “baselines”). The multiple baseline design has the advantage of not having to withdraw the treatment once it has been applied to a baseline.
Reversal designs, on the other hand, such as the ABA or ABAB designs, have a second baseline (the second “A”), during which the treatment is withdrawn. The Latin square design is not a single-subject design. Rather, it uses many subjects who are all administered all levels of an independent variable, but the order of administration varies between subjects or subgroups of subjects.
Additional Information: Single-Subject Designs
The underlying structure in a set of variables is identified by which of the following?
Select one:
A. canonical correlation
B. multiple regression
C. factor analysis
D. discriminant analysis
Correct Answer is: C
Factor analysis is a complex statistical technique designed to determine the degree to which a large set of variables can be accounted for by fewer, underlying constructs (referred to as “factors” or “principal components”). For example, factor analyses of the WAIS-IV have suggested that four factors (verbal comprehension, perceptual reasoning, processing speed, and working memory) explain, to a large degree, scores on the subtests.
Additional Information: Factor Analysis
An organization is interested in improving employee morale in an off-site office with 150 employees. An organizational psychologist is contracted to identify and train the employees with the lowest morale. The employees scoring in the bottom 10% of a pretest are selected for extensive training. At the conclusion of the training, a posttest is administered and improvement in scores is noted. Test performance improvement would be expected even without training because:
Select one:
A. There has been a lapse of time between the first and second administrations.
B. Such tests are notably unreliable, particularly when based on small samples.
C. Regression of scores toward the mean is to be expected as a purely chance phenomenon.
D. The range for which the test was designed has been restricted by the method of sampling.
Correct Answer is: C
The net effect of regression toward the mean is that the lower scores (or measurements) on the pretest tend to be higher on the posttest, and the higher scores (or measures) on the pretest tend to be lower on the posttest. It is important to note that regression is always toward the population mean of a group. However, essentially none of this change from pretest to posttest is due to the independent variable (the treatment). Regression to the mean matters when conducting experiments because it threatens the internal validity of the design, and it occurs whenever subjects are chosen on the basis of extreme pretest scores.
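A small simulation can make this visible. The sketch below assumes a simple model in which each observed score is a stable true score plus independent random error, and no treatment is applied at all; the means, standard deviations, sample size, and seed are arbitrary choices for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable
n = 10_000

# Assumed model: observed score = stable true score + independent random error.
true_morale = [random.gauss(50, 10) for _ in range(n)]
pretest = [t + random.gauss(0, 10) for t in true_morale]
posttest = [t + random.gauss(0, 10) for t in true_morale]  # no training given

# Select the people scoring in the bottom 10% of the pretest.
cutoff = sorted(pretest)[n // 10]
selected = [i for i in range(n) if pretest[i] < cutoff]

pre_mean = sum(pretest[i] for i in selected) / len(selected)
post_mean = sum(posttest[i] for i in selected) / len(selected)
print(round(pre_mean, 1), round(post_mean, 1))
```

Even though nothing was done between the two testings, the selected group's posttest mean moves back toward the overall mean of 50, which is the pattern the question describes.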
Additional Information: Regression to the Mean
In a positively skewed distribution, one would most likely find, ranked from lowest to highest in value, the:
Select one:
A. median, mean, mode.
B. median, mode, mean.
C. mean, mode, median.
D. mode, median, mean.
Correct Answer is: D
You have to picture the positively skewed curve in order to get this correct. Positive skewness means there are some outliers (extreme scores) way over on the positive side. That’s where the tail is, way off to the right, or positive, end. Since the mean takes into account the magnitude of the scores, these outliers can be pictured as “pulling” the mean to the positive side, or the right. So, in any ordering of measures of central tendency, the mean would be the highest value. Thus, you can eliminate the two distractors that don’t list the mean as the highest value. To distinguish between the remaining answers, let’s go back to consider what the median is. The median is the middlemost point irrespective of value. If you’ve pictured the curve correctly you can see that more than half the cases fall on the right side because some are way over on the positive side. If you put a line where the highest point is on the curve, which is the mode, you’d see that more than half the cases fall to the right of that line. Hence the median, the 50% point, is to the right of the high point, the mode. This should have gotten you to the correct answer.
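If picturing the curve is hard, a tiny made-up data set with a long right tail shows the same ordering. The scores below are invented for illustration.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed scores: most are low, a few are very high.
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

score_mode = mode(scores)      # most frequent value
score_median = median(scores)  # middle of the ordered scores
score_mean = mean(scores)      # pulled toward the extreme high scores
print(score_mode, score_median, score_mean)
```

The two extreme scores (9 and 15) pull the mean above the median, while the mode stays at the peak of the distribution.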
Additional Information: Skewed Distributions, Measures of Central Tendency
To use the statistical technique known as trend analysis, you need:
Select one:
A. a quantitative independent variable.
B. a linear relationship between independent and dependent variables.
C. a true experimental research design.
D. two or more independent variables.
Correct Answer is: A
Trend analysis is what it sounds like; i.e., it is used to identify trends and, therefore, requires a quantitative independent variable. You might use trend analysis, for example, to determine if the amount of time you spend studying is related to your score on the licensing exam in a linear or nonlinear fashion.
Additional Information: Trend Analysis
Which of the following statistical techniques would involve specifying a model of a problem domain that may involve observed and latent variables related to each other causally and non-causally in a unidirectional and bi-directional fashion?
Select one:
A. multivariate multiple regression
B. structural equation modeling
C. multiple ANOVA
D. discriminant function analysis
Correct Answer is: B
Structural equation modeling is a complex statistical technique that is used to explore and test relationships among many variables. The variables may be latent (i.e., unobserved variables, such as hypothetical traits or constructs) or observed, may have causal or correlational relationships, and the causal relationships may be specified as unidirectional or bi-directional. The first step in structural equation modeling is model specification. Here, you specify the variables involved, whether they are latent or observed, and the expected relationships among them. The results of statistical analysis indicate whether or not the model is a good fit for the data. The statistics involved are often specialized versions of other multivariate techniques, such as factor analysis and multivariate multiple regression, and the analysis requires specialized statistical software packages such as LISREL or EQS.
Regarding the other choices, multivariate multiple regression allows you to test hypothesized predictive relationships between multiple input (or predictor) and output (or criterion) variables. A multiple ANOVA (MANOVA) is a statistical significance test used in experiments with multiple dependent variables. And discriminant function analysis is used to identify variables that are most useful for distinguishing among two or more groups.
Additional Information: Structural Equation Modeling
In forward stepwise multiple regression analysis, the goal is to obtain the smallest subset of predictors to account for the largest amount of variability in the criterion variable. Statistically, this involves:
Select one:
A. adding predictors to the multiple regression equation and determining, through statistical analysis, if the coefficient of multiple determination is significantly increased
B. using the correction for attenuation formula to estimate what the predictive power of the multiple regression equation would be if all the predictors had perfect reliability
C. using the Spearman-Brown Prophecy formula to estimate the magnitude of the multiple correlation coefficient if all the predictors were used, and comparing the result to the magnitude of the coefficient when different subsets of the predictors are used
D. administering different subsets of the predictors to two validation samples, and conducting statistical analyses to estimate the degree of shrinkage in the multiple correlation coefficient from the first to the second validation sample
Correct Answer is: A
The goal of stepwise regression analysis is to derive the smallest subset of predictors, out of a larger set, that maximizes the ability to predict outcome on a criterion variable. There are two types of stepwise multiple regression: forward and backward. In forward stepwise regression, predictors are successively added to the multiple regression equation. With each addition, an analysis is conducted to determine if the predictive power of the equation is increased. Predictive power is measured by the squared multiple correlation coefficient (also known as the coefficient of multiple determination).
Thus, choice A is the best answer.
Additional Information: Stepwise Multiple Regression
Which of the following is a statistical measure of the degree of difference among scores of subjects within the same experimental or treatment group?
Select one:
A. F ratio
B. mean square between
C. mean square within
D. standard error of the mean
Correct Answer is: C
Mean square within (or MSW) is a measure of within-group variance – the degree to which subjects within the same experimental group differ from each other. MSW is the denominator of the F ratio, and is referred to as the error term. The larger the magnitude of MSW, the less likely the F ratio will be significant.
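The role of MSW as the denominator of the F ratio can be sketched with a bare-bones one-way ANOVA. The two data sets below are invented: they have identical group means, but the second has more scatter within each group. The function name is hypothetical.

```python
def one_way_anova(groups):
    """Return (MS_between, MS_within, F) for a list of independent groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group variability: each score's distance from its own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within


# Same group means (5, 7, 9) in both data sets; only the within-group spread differs.
tight = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
noisy = [[3, 5, 7], [5, 7, 9], [7, 9, 11]]

msb_tight, msw_tight, f_tight = one_way_anova(tight)
msb_noisy, msw_noisy, f_noisy = one_way_anova(noisy)
print(msw_tight, f_tight)
print(msw_noisy, f_noisy)
```

Mean square between is the same in both cases, so the drop in F for the second data set comes entirely from the larger error term.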
Additional Information: Logic of the ANOVA and the Derivation of the F Ratio
Which of the following techniques would not be useful for controlling or assessing the effects of an extraneous variable?
Select one:
A. stratified random sampling
B. blocking
C. matching
D. ANCOVA
Correct Answer is: A
Stratified random sampling involves dividing a population of interest into sub-populations (strata) and obtaining random samples from each stratum. For instance, a researcher interested in studying the American population as a whole may break it down by ethnic groups and take proportionate random samples from each. The technique is designed to ensure that subjects are representative of the population of interest. Unlike the other choices, it is not used to control for the effects of an extraneous variable.
Additional Information: Ways to Increase External Validity
You have conducted a study assessing the relationship between salary and job performance, and you find a significant correlation between these two variables. Your assistant tells you that the data fail to take into account a $25.00 cost of living raise which every employee received. You should:
Select one:
A. decide that the raise invalidated the research.
B. reanalyze the data after the raises have been added to the current salary.
C. not worry about small details; the actual amount is too small to make a significant difference.
D. assume the correlation will not be affected.
Correct Answer is: D
The basic point being tested here is that if you add a constant to each score – in either or both data sets – the relationship between the two variables won’t be affected. In other words, adding a constant to every score does not affect the correlation coefficient. The same is true of subtracting a constant from every score, or of multiplying or dividing all scores by a positive constant (a negative constant reverses the sign of the correlation but not its magnitude).
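You can check this for yourself with a few made-up numbers. The sketch below computes Pearson r before and after adding the $25.00 raise to every salary; the salary and performance values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Ordinary Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical weekly salaries and job performance ratings.
salary = [300, 350, 400, 450, 500]
performance = [2, 4, 3, 5, 6]

r_before = pearson_r(salary, performance)
r_after_raise = pearson_r([s + 25 for s in salary], performance)
r_rescaled = pearson_r([s * 52 for s in salary], performance)
print(round(r_before, 4), round(r_after_raise, 4), round(r_rescaled, 4))
```

Adding the raise shifts every salary and the salary mean by the same amount, so the deviations from the mean, and therefore r, are unchanged.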
The technique which allows a researcher to identify the underlying (latent) factors that relate to a set of measured variables and the nature of the causal relationships between those factors is:
Select one:
A. structural equation modeling (SEM)
B. cluster analysis
C. Q-technique factor analysis
D. survival analysis
Correct Answer is: A
Structural equation modeling is a multivariate technique used to evaluate the causal (predictive) influences or test causal hypotheses about the relationships among a set of factors.
Cluster analysis* is used to identify homogeneous subgroups in a heterogeneous collection of observations. Q-technique factor analysis* determines how many types of people a sample of people represents. Survival analysis* is used to assess the length of time to the occurrence of a critical event (* incorrect options).
Additional Information: Structural Equation Modeling
If a person has a T-score of 70 in a normal distribution with 200 people, what does the 70 mean?
Select one:
A. 70th percentile
B. 3 standard deviations above the mean
C. z-score of plus one
D. better than 97%
Correct Answer is: D
This is a difficult question because none of the choices offers what you are expecting, which would be “the 98th percentile.” Instead, the best choice is answer D, “better than 97%.” A T-score of 70 is two standard deviations above the mean (the mean of a T-score distribution is 50; the standard deviation is 10). When a score is two standard deviations above the mean in a normal distribution, about 98 percent (more precisely, about 97.7 percent) of the distribution falls below that score. In this case, then, a T-score of 70 is better than approximately 97% of the people in the distribution.
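The conversion can be checked with the standard normal distribution. This sketch assumes scores are normally distributed and uses the T-score mean of 50 and standard deviation of 10 given above; the function name is hypothetical.

```python
from statistics import NormalDist


def t_score_to_percentile(t, mean=50, sd=10):
    """Convert a T-score to a z-score and to the percentage of scores below it."""
    z = (t - mean) / sd
    percent_below = NormalDist().cdf(z) * 100
    return z, percent_below


z, percent_below = t_score_to_percentile(70)
print(z, round(percent_below, 1))
```

The percentage below a z-score of 2 rounds to 98, which is why "better than 97%" is the best available choice.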
A percentile rank is
Select one:
A. a norm-referenced score, but not a standard score.
B. a standard score, but not a norm-referenced score.
C. a standard score and a norm-referenced score.
D. neither a standard score nor a norm-referenced score.
Correct Answer is: A
To answer this question, you have to be able to define and understand three terms: norm-referenced, standard score, and percentile rank. A norm-referenced score is one that is interpreted in terms of a comparison to others who have taken the same test. A standard score is a type of norm-referenced score that is interpreted in terms of how many standard deviation units a score falls above or below the mean. Examples include z-scores and T-scores. A percentile rank indicates the percentage of scores that fall below a given score. For example, a person who achieves a percentile rank of 90 on the SAT scored better than 90% of others who took the test. Since interpretation of percentile ranks involves a comparison between scorers, a percentile rank is a norm-referenced score. However, since it is not interpreted in terms of standard deviation units, it is not a standard score.
Additional Information: Percentile Ranks
Which of the following would increase the power of a statistical test?
Select one:
A. an increase in alpha
B. a decrease in alpha
C. a decrease in sample size
D. use of a two-tailed test
Correct Answer is: A
The “power” or sensitivity of a statistical test is the probability of rejecting the null hypothesis when it is false, that is, the probability of correctly identifying that a difference exists. When alpha is increased (e.g., from .01 to .05), it becomes easier to reject the null hypothesis and, consequently, power is also increased. All of the other choices (a decrease in alpha, a decrease in sample size, or the use of a two-tailed test) would decrease the test’s power.
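To see how each choice moves power, here is a rough sketch for one simple case: a one-sample z-test with a known standard deviation and a true effect in the predicted (positive) direction. The effect size of 0.5 and the sample sizes are assumptions chosen for illustration, and `z_test_power` is a hypothetical helper, not a library function.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def z_test_power(effect_size, n, alpha, two_tailed=False):
    """Power of a one-sample z-test for a positive standardized effect size."""
    shift = effect_size * sqrt(n)  # how far the true mean sits from the null, in SE units
    if two_tailed:
        z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
        # Reject in either tail; the lower tail contributes almost nothing here.
        return (1 - STD_NORMAL.cdf(z_crit - shift)) + STD_NORMAL.cdf(-z_crit - shift)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - STD_NORMAL.cdf(z_crit - shift)

power_alpha_01 = z_test_power(0.5, n=20, alpha=0.01)
power_alpha_05 = z_test_power(0.5, n=20, alpha=0.05)    # larger alpha
power_small_n = z_test_power(0.5, n=10, alpha=0.05)     # smaller sample
power_two_tailed = z_test_power(0.5, n=20, alpha=0.05, two_tailed=True)

print(power_alpha_01, power_alpha_05, power_small_n, power_two_tailed)
```

Under these assumptions, raising alpha from .01 to .05 increases power, while shrinking the sample or switching to a two-tailed test at the same alpha lowers it.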
Additional Information: Power of a Statistical Test
When a multiple regression analysis is employed to predict outcome, there should be
Select one:
A. low intercorrelations among the predictors and high correlation of each predictor with the criterion.
B. high intercorrelations among the predictors and high correlation of each predictor with the criterion.
C. low intercorrelations among the predictors and low correlation of each predictor with the criterion.
D. high intercorrelations among the predictors and low correlation of each predictor with the criterion.
Correct Answer is: A
This question has come up in other examples throughout the tests. Simply stated, we need to have a high correlation between the predictor and the criterion we’re making predictions about (this eliminates two of the four alternatives). Also, we need to have the predictors themselves be more or less independent of each other. That is, they shouldn’t intercorrelate. If they do, then there’s no point in using all of them – if they all measure the same thing, why not use just one? So, you don’t want the predictors to intercorrelate.
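The effect of predictor intercorrelation can be illustrated with the standard formula for the squared multiple correlation with two predictors. The correlation values below are hypothetical, chosen only to show the contrast.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation for two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion
    r_12: correlation between the two predictors
    """
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)

# Both predictors correlate .50 with the criterion in each scenario.
independent = r_squared_two_predictors(0.5, 0.5, r_12=0.0)
overlapping = r_squared_two_predictors(0.5, 0.5, r_12=0.8)

print(round(independent, 3), round(overlapping, 3))  # 0.5 0.278
```

With uncorrelated predictors, each one adds its full share of explained variance; when the predictors overlap heavily, the second adds little beyond the first.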
Additional Information: Multiple Correlation and Multiple Regression
Which of the following best describes how confidence intervals are used?
Select one:
A. estimate true scores from obtained scores
B. calculate the standard error of measurement
C. calculate the test’s mean
D. calculate the standard deviation
Correct Answer is: A
Confidence intervals allow us to determine the range within which an examinee’s true score on a test is likely to fall, given his or her obtained score.
The standard error of measurement is used to construct confidence intervals, not the other way around.
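As a sketch of how the standard error of measurement feeds into a confidence interval, the example below uses the classical test theory formula SEM = SD × √(1 − reliability). The test statistics (SD of 15, reliability of .91, obtained score of 110) are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def standard_error_of_measurement(sd, reliability):
    """SEM from the test's standard deviation and reliability coefficient."""
    return sd * sqrt(1 - reliability)

def true_score_interval(obtained, sem, confidence=0.95):
    """Confidence interval around an obtained score, assuming normal errors."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return obtained - z * sem, obtained + z * sem

# Hypothetical test: SD = 15, reliability = .91, obtained score = 110.
sem = standard_error_of_measurement(sd=15, reliability=0.91)
low, high = true_score_interval(obtained=110, sem=sem)

print(round(sem, 2), round(low, 2), round(high, 2))
```

Here the SEM works out to about 4.5 points, giving a 95% interval of roughly 101 to 119 around the obtained score.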
Additional Information: Standard Error of Measurement
In a normal distribution of scores, the range of raw scores represented by the percentile rank range of 50 to 55 is _______ the range of raw scores represented by the percentile rank range of 90 to 95.
Select one:
A. less than
B. greater than
C. the same as
D. depending on the standard deviation, either less than, greater than, or the same as
Correct Answer is: A
This question is a bit tricky and requires careful reading and a good grasp of the concepts of percentile rank and normal distribution. The easiest way to understand it is in terms of an example. Say that you have a test, with a mean of 70 and a range of possible raw scores from 0-100. The raw score mean is 70; in a normal distribution, the mean is equivalent to a percentile rank of 50 and is in the exact middle of the distribution (if you don’t know why this is, go back and review the Statistics section before attempting to understand this question). In a normal distribution, most of the raw scores are near or at the middle of the distribution; thus, most of the raw scores will be near or at 70. Similarly, the PR score range of 50 to 55 is in the middle part of the distribution, which is to say that most of the raw scores in this part of the distribution will be at or near 70. So the raw score range set by PR 50 to PR 55 will not be wide.
Now if you look at a normal curve, you will see that in the high end of the raw score distribution, there is a long tail spread across the bottom. This reflects the fact that there are relatively few high scorers, and the scores of these individuals are spread out (over the length of this tail). Since the 90 to 95 PR range is in the high end of the distribution, the range of raw scores here will be wider than the range of raw scores in the middle of the distribution.
If you chose “the same as”, you probably did so based on the fact that the percentile rank distribution is flat. This means that the same number of people will score between PR 50 and 55 as between PR 90 and 95. However, the question is not about how many people will score within each PR range. Instead, it’s asking about the raw score ranges these PR ranges correspond to.
If you didn’t understand the above explanation, it might be useful to read it again with the normal curve in front of you. As you’re looking at the curve, remember that it is a raw score distribution, and try to approximate where the percentile ranks in the question would be placed on this distribution. If you still don’t understand, don’t worry too much. As you review this concept and practice with more questions, these things will become clearer and clearer. Remember that difficult technical content requires repeated review, so you should keep track of those particular concepts you need to review on a regular basis.
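The explanation above can also be checked by converting the percentile ranks back to raw scores. This sketch uses the mean of 70 from the example; the standard deviation of 10 is an assumption, since the example does not give one, and `raw_score_width` is a helper name invented here.

```python
from statistics import NormalDist

# Mean of 70 comes from the example; the SD of 10 is assumed for illustration.
scores = NormalDist(mu=70, sigma=10)

def raw_score_width(pr_low, pr_high):
    """Width of the raw-score range spanned by two percentile ranks."""
    return scores.inv_cdf(pr_high / 100) - scores.inv_cdf(pr_low / 100)

middle = raw_score_width(50, 55)
upper = raw_score_width(90, 95)

print(round(middle, 2), round(upper, 2))  # 1.26 3.63
```

Both ranges contain 5% of examinees, but the PR 90 to 95 range covers nearly three times as many raw score points. A different standard deviation changes both widths proportionally, so the answer does not depend on it.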
Additional Information: Percentile Ranks
Which of the following correlation coefficients is used to assess convergent validity:
Select one:
A. heterotrait-monomethod
B. monotrait-heteromethod
C. heterotrait-heteromethod
D. monotrait-monomethod
Correct Answer is: B
The response choices make up a multitrait-multimethod matrix, a complicated method for assessing convergent and discriminant validity. Convergent validity requires that different ways of measuring the same trait yield the same result. Monotrait-heteromethod coefficients are correlations between two measures that assess the same trait using different methods; therefore, if a test has convergent validity, this correlation should be high. Heterotrait-monomethod and heterotrait-heteromethod coefficients both bear on discriminant (divergent) validity, which is supported when these correlations are low, and monotrait-monomethod coefficients are reliability coefficients.
Additional Information: Convergent and Discriminant (Divergent) Validation
A significant finding for a one-way ANOVA indicates that the
Select one:
A. group means were different.
B. sample means were different.
C. population means were different.
D. within-group variance was different.
Correct Answer is: C
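We use statistical tests to make inferences about a population. So if we have significant results, we conclude that they represent what happens in the real world, that is, in the population: at least two of the population means differ, not merely the sample (group) means obtained in this particular study.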
Additional Information: Parametric Tests
Choosing a correlation coefficient is based on factors such as the variables’ scale of measurement and the shape of the relationship between them. When measuring the relationship between two continuous variables when the relationship between them is nonlinear, which of the following is used?
Select one:
A. rho
B. eta
C. phi
D. tau
Correct Answer is: B
Eta is the appropriate correlation coefficient to use when both variables are measured on an interval or ratio scale and the relationship between the predictor (the X variable) and the criterion (the Y variable) is curvilinear.
Rho (sometimes referred to as the Spearman rank-order correlation coefficient) is appropriate when both variables are measured as ranks.
The phi coefficient is used when both variables are true (natural) dichotomies.
When both variables are measured on an ordinal scale, Kendall’s tau is appropriate.
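To illustrate why eta suits curvilinear data, the sketch below compares the Pearson r with eta (computed as the square root of between-group over total sum of squares, grouping Y by each X value) on a made-up inverted-U data set. The data and helper names are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient (captures linear association only)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

def eta(xs, ys):
    """Correlation ratio: sqrt(SS_between / SS_total), grouping y by x value."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    grand_mean = sum(ys) / len(ys)
    ss_total = sum((y - grand_mean) ** 2 for y in ys)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return sqrt(ss_between / ss_total)

# Hypothetical inverted-U relationship: Y rises and then falls as X increases.
x = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
y = [2, 4, 7, 9, 9, 11, 7, 9, 2, 4]

print(round(pearson_r(x, y), 3), round(eta(x, y), 3))
```

For these data the Pearson r is essentially zero even though Y is strongly related to X, while eta is about .94.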
Additional Information: Other Correlation Coefficients
While studying the use of journaling in the treatment of depression, a researcher finds only individuals with good writing ability benefit from journaling. Writing ability is a(n):
Select one:
A. outcome variable
B. mediating variable
C. moderator variable
D. feedback variable
Correct Answer is: C
The strength of the relationship between the independent and dependent variables is affected by a moderator variable. Writing ability is moderating the effects of journaling on the treatment of depression.
Outcome variable* is another term for dependent variable. A mediating variable* is affected by the independent variable and in turn affects the dependent variable. It is responsible for an observed relationship between an independent variable and a dependent (outcome) variable. A feedback variable* is an unrelated term (* incorrect options).
Additional Information: Factors Affecting the Validity Coefficient
In a study comparing characteristics of adult ADHD patients with characteristics of their first-degree and second-degree biological relatives and of non-patients (controls), the probands are:
Select one:
A. non-patients
B. first degree relatives
C. first and second degree relatives
D. ADHD patients
Correct Answer is: D
The ADHD patients are the probands in this study. Probands, or index cases, are the individuals who are first brought to the attention of the researcher - i.e., individuals manifesting the characteristic of interest or disease.
You are conducting a study to examine the differences in reaction time between elderly people and young people. Subjects are asked to view stimuli on a computer screen and to press a lever every time they see certain target stimuli. Your results indicate that younger people respond faster than older people, and you conclude that reaction time is faster for younger people. Your conclusion is faulty because of
Select one:
A. carry-over effects.
B. differential attrition effects.
C. a selection bias.
D. cohort effects.
Correct Answer is: D
The study described here is an example of a cross-sectional design, in which two or more different age groups are compared to determine whether aging has an effect on a particular dependent variable. A problem with cross-sectional designs is cohort effects. This refers to differences between the groups in experience rather than age that could be accounting for differences between them on the dependent variable. Cohort effects seem like a particularly plausible explanation for the results here, since it’s likely that young people have more experience with computers than older people.
Additional Information: Developmental Research
Consider the factor loadings for each of the four tests as explained in the table below:
Correlations of Various Tests with Job Skills
JOB SKILLS
Test   Concentration   Verbal Reasoning   Motoric   Computation
1      .60             .53                .04       .13
2      .21             .11                .87       .09
3      .07             .15                .03       .62
4      .33             .22                .20       .45
If these tests are combined into a test battery, the predictive validity of the battery would be maximized if the correlations among the four tests were
Select one:
A. at least as high as the correlation of each one with its highest correlated skill.
B. as low as possible, preferably close to .00.
C. no higher than the average correlations of each of the four tests with each skill.
D. equal to the sum of the squared correlations of each test with each skill.
Correct Answer is: B
The idea behind this question is that when you use various tests for prediction, each of the tests you use should have some relationship to the thing you’re predicting. And, if you use several of these tests together, say, in a battery, it is best if each of the tests tells you something different and apart from the others. That is, each of the tests should contribute some unique information to the equation. If you give two tests and they both give the same information (that is, they correlate), then you needn’t give the two tests. Why not stick with only one, say the cheaper one? Putting this all together, we have the situation where you have several tests, each contributing something to predicting a job skill, but each contributing a unique bit of added information. In other words, the tests themselves should not correlate.
Additional Information: Multicollinearity
LISREL would be most useful in
Select one:
A. testing the relationship between multiple predictor variables and one criterion variable.
B. confirming hypotheses regarding relationships between several latent and observed variables.
C. testing a hypothesis regarding the effects of multiple independent variables on one dependent variable.
D. validating questions on a personality inventory.
Correct Answer is: B
LISREL is an acronym that stands for linear structural relations analysis. It is a software program used in structural equation modeling, a technique for testing theories regarding unidirectional and bidirectional relationships among latent (unobserved) and manifest (observed) variables. Most experts recommend that structural equation modeling be used as a confirmatory method that provides evidence for an existing theory, as opposed to an exploratory method for generating new theories.
Additional Information: Structural Equation Modeling
Which one of the following is least likely to attenuate a measure of correlation?
Select one:
A. restricted range
B. homoscedasticity
C. curvilinear relationship
D. the use of unreliable measures
Correct Answer is: B
Homoscedasticity refers to even scatter around the regression line. Homoscedasticity is actually a good thing. It wouldn’t attenuate the correlation at all. The other three choices list factors that would attenuate the correlation coefficient.
Additional Information: Factors Affecting the Pearson r
_______ is used to study the cognitive processes that occur during the performance of tasks or solution of problems.
Select one:
A. protocol analysis
B. functional analysis
C. event sampling
D. situation sampling
Correct Answer is: A
Protocol analysis, or the “think aloud” technique, is the only technique listed that is useful for obtaining information on cognitive processes. Protocol analysis assumes that subjects who are instructed to verbalize their thoughts in a manner that doesn’t alter the sequence of thoughts mediating the completion of a task can think aloud without any systematic changes to their thought processes, so their verbalizations can be accepted as valid data on thinking.
A functional analysis* is used to identify the functions of a behavior - i.e., the antecedents and consequences that maintain the behavior. Event sampling* and situation sampling* are both behavioral observation techniques: event sampling is useful when a behavior of interest occurs infrequently, and situation sampling is useful when a behavior is observed in a variety of situations (* incorrect options).
Additional Information: Protocol Analysis
During a research study the participants are able to guess the research hypothesis, causing them to behave differently than they would under normal conditions. This phenomenon is due to:
Select one:
A. demand characteristics
B. the Hawthorne effect
C. the use of a quasi-experimental design
D. the use of psychic research participants
Correct Answer is: A
Demand characteristics are cues in a research study that allow participants to guess the hypothesis. As a result, participants may behave differently than they would under normal conditions.
the Hawthorne effect
The Hawthorne effect is a similar phenomenon, but refers to the tendency of research participants to behave differently due to the mere fact they are participating in research - rather than due to cues about how they are expected to behave.
the use of a quasi-experimental design
Quasi-experimental designs are simply designs which do not randomly assign participants to groups.
the use of psychic research participants
Finally, this choice is a possible, but less probable, cause of this phenomenon.
Additional Information: Threats to External Validity
In the analysis of the effects of two independent variables, multiple regression analysis is sometimes used as a substitute for the factorial ANOVA. One advantage of using multiple regression as opposed to a factorial ANOVA is that:
Select one:
A. multiple regression analysis can be used for multiple dependent variables as well as multiple independent variables.
B. continuous or categorical data (as opposed to solely categorical data) can be used to measure the independent variables in multiple regression analysis.
C. the use of multiple regression allows one to estimate the probability that obtained differences on the dependent variable between groups represent true population differences.
D. when multiple regression is used and a significant result is obtained, the conclusion that there is a causal relationship between the independent variables and the dependent variables is more plausible.
Correct Answer is: B
One limitation of the ANOVA technique is that independent variables must be divided into categories for the analysis to be conducted. In multiple regression, the researcher has the choice of using categories or continuous data (e.g., scores on a test) to measure the independent variables. This is considered an advantage of regression, because it allows for the data to provide more precise and specific information about the variables being measured.
multiple regression analysis can be used for multiple dependent variables as well as multiple independent variables.
This choice is not true of multiple regression; it is designed for use with one dependent variable only.
when multiple regression is used and a significant result is obtained, the conclusion that there is a causal relationship between the independent variables and the dependent variables is more plausible.
This is also not true; the strength of the conclusion that variables are causally related depends on the research design, not the statistical analysis.
the use of multiple regression allows one to estimate the probability that obtained differences on the dependent variable between groups represent true population differences.
This is true of both multiple regression and ANOVA, since they are both inferential statistical methods.
Additional Information: Multiple Correlation and Multiple Regression
A psychologist believes that physical exercise can reduce a person's anxiety level, which reduces the strength of substance cravings in people recovering from substance dependence. According to this hypothesis anxiety is a:
Select one:
A. suppressor variable
B. mediator variable
C. moderator variable
D. criterion contaminator
Correct Answer is: B
A mediator variable is a variable that accounts for or explains the effects of an IV on a DV. That is, the IV affects the mediator variable, which affects the DV. In this example, the IV is exercise, the mediator variable is anxiety, which explains how the DV, substance craving, is reduced.
A moderator variable is similar to a mediator variable, but a moderator variable only influences the strength of the relationship between two other variables; it doesn’t fully account for it. For example, if a job selection test has different validity coefficients for different ethnic groups, ethnicity would be a moderator variable because it influences the relationship between the test (predictor) and actual job performance (the criterion) but it does not fully account for the relationship.
A suppressor variable reduces or conceals the relationship between variables. For example, the K scale in the MMPI-2 is a suppressor variable because it measures defensiveness, which can suppress the scores on the clinical scales. The K scale is, therefore, used as a correction factor for some of the clinical scales.
Criterion contamination is the artificial inflation of validity which can occur when raters subjectively score ratees on a criterion measure after they have been informed how the ratees scored on the predictor.
In a study of 400 personality variables, it was found that 19 correlated at the .05 level of significance with a measure of actual behavior. The 19 significant correlations could be considered valid for:
Select one:
A. future research.
B. future therapy.
C. both future research and future therapy.
D. neither future research nor future therapy.
Correct Answer is: D
At the .05 level of significance, there is a 5% probability of making a Type I error on each test for which the null hypothesis is true. So, if none of the 400 relationships were real, you’d still expect about 20 to reach significance by chance alone. Hence the 19 significant correlations probably aren’t very meaningful. They likely have no application to either research or therapy.
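The arithmetic behind this answer can be sketched as follows. The second calculation rests on two simplifying assumptions that the question does not state: that every null hypothesis is true and that the 400 tests are independent.

```python
alpha = 0.05
n_tests = 400

# Expected number of significant results produced by chance alone.
expected_false_positives = alpha * n_tests

# Chance of at least one false positive, ASSUMING all nulls are true
# and the tests are independent of one another.
p_at_least_one = 1 - (1 - alpha) ** n_tests

print(round(expected_false_positives, 1), p_at_least_one)
```

The expected count of 20 chance findings is right in line with the 19 that were observed, and under these assumptions at least one false positive is a near certainty.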
Additional Information: Type I Error and Alpha Level
The Drugs-R-Us company wants to compare the effectiveness of 3 new antidepressant medications. Patients with depression are randomly assigned to one of the three medications and depressive symptoms are measured at weeks 1, 6, and 12. Which type of research design would be most appropriate for this study?
Select one:
A. ABAB
B. between subjects
C. within subjects
D. mixed
Correct Answer is: D
A mixed research design has at least one between-subjects independent variable and at least one repeated measures variable (or within-subjects variable). Since this study is comparing the effects on three different groups of subjects (i.e., a between-subjects variable) combined with the use of a repeated measures (within-subjects) variable, it would be considered a mixed design. An ABAB design is a type of reversal design, in which a baseline measure of a behavior is obtained (the “A” phase), the behavior is again measured after a treatment is administered (the “B” phase), the treatment is removed or reversed and the behavior is measured again (the second “A”), and the treatment is then re-applied (the second “B”) and a final behavior measure is taken.
Additional Information: Variations of the Factorial ANOVA
If scores obtained by parents on an adult test of intelligence are highly correlated with scores obtained by their children on a childhood intelligence test, one can conclude that
Select one:
A. high parental intelligence is the cause of high childhood intelligence.
B. intelligence has a high heritability factor.
C. scores on the two tests are associated.
D. both nature and nurture account for scores on childhood intelligence tests.
Correct Answer is: C
The point of this question is that one cannot make any theoretical conclusions about variables on the basis of a high correlation alone. A correlation means that two variables co-vary, or that values change in a predictable direction–e.g., as the value of one goes up, the other tends also to increase (positive correlation), or as the value of one increases, the other tends to decrease (negative correlation). Another way of saying this is that the two variables are associated. When two variables are correlated, it could be that either one is a cause of the other, or that there are one or more other variables causing the two in question to co-vary. For example, in the case of parent-child intelligence, there could be third variables, such as SES or test bias, that account for any observed association. There is evidence that intelligence has a strong genetic component, but the question is about what one can conclude on the basis of a correlation alone, not about conclusions that have been drawn on the basis of all the available evidence.
Additional Information: Correlation and Causality
A psychologist uses a two-group pretest/posttest design to evaluate the effects of a new treatment. She obtains the following data:
          Pretest              Posttest
Group 1   Mean 13.4, SD 1.2    Mean 19.8, SD 1.5
Group 2   Mean 19.5, SD 1.5    Mean 21.7, SD 1.9
The biggest threat to this study's internal validity is
Select one:
A. reactivity.
B. test x treatment.
C. selection.
D. history.
Correct Answer is: C
In this study the means of the two groups are very different initially (Pretest), which will make it hard to interpret the results. When internal validity is threatened by initial group differences, this threat is called selection. Note that the term selection is misleading because it actually refers to assignment. If assignment were random, we would expect the pretest scores for Groups 1 and 2 to be approximately equal, which they are not (13.4 and 19.5, respectively).
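One way to gauge how large the initial difference is would be to express it in standard deviation units. The sketch below uses the pretest and posttest values from the table; the pooled standard deviation assumes equal group sizes, which the question does not state.

```python
from math import sqrt

# (mean, SD) at pretest and mean at posttest, taken from the table above.
pretest = {"Group 1": (13.4, 1.2), "Group 2": (19.5, 1.5)}
posttest_means = {"Group 1": 19.8, "Group 2": 21.7}

(m1, sd1), (m2, sd2) = pretest["Group 1"], pretest["Group 2"]

# Pooled SD ASSUMING equal group sizes (not given in the question).
pooled_sd = sqrt((sd1**2 + sd2**2) / 2)
pretest_gap_in_sd = (m2 - m1) / pooled_sd

# Raw pretest-to-posttest gain for each group.
gains = {group: posttest_means[group] - pretest[group][0] for group in pretest}

print(round(pretest_gap_in_sd, 2), {g: round(v, 1) for g, v in gains.items()})
```

Under that assumption the groups differ by more than four standard deviations before any treatment is given, so the different gains (6.4 versus 2.2 points) cannot be attributed cleanly to the treatment.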