Research Design, Statistics, Tests, and Measurements Flashcards

1
Q

Nonequivalent Group Design

A

In a nonequivalent group design, the control group is not necessarily similar to the experimental group since the researcher doesn’t use random assignment. This is common in educational research because you can’t randomly assign subjects to different classes. For example, if a researcher wants to see if a new method for teaching reading is better than the usual method, the researcher might assign the new method to one class and the usual method to another class, and measure each subject’s increase in reading skill from the beginning to the end of the study.

2
Q

External Validity

A

The degree to which the results of an experiment may be generalized.

3
Q

Internal Validity

A

The certainty with which results of an experiment can be attributed to the manipulation of the independent variable rather than to some other, confounding variable.

4
Q

Hawthorne Effect

A

The alteration of behavior by the subjects of a study due to their awareness of being observed. One way to control for the Hawthorne effect is to use a control group design and observe both the control group and the experimental group.

5
Q

Single-Blind vs. Double-Blind Experiments

A

In a double-blind experiment, neither the researcher who interacts with the subjects nor the subjects themselves know which groups received the independent variable or which level of the independent variable. If the subjects do not know whether they are in the treatment or control group, but the researchers know, it is called a single-blind experiment.

6
Q

Placebo Effect

A

A beneficial effect produced by a drug or treatment, which cannot be attributed to the properties of the treatment itself, and must therefore be due to the patient’s belief in the treatment. The placebo effect is a special kind of demand characteristic. One possible remedy for the placebo effect is to have control groups, so that the effectiveness of the drug over-and-above the placebo effect can be determined.

7
Q

The 2 basic types of statistics

A

Descriptive statistics and inferential statistics. Descriptive statistics is concerned with organizing, describing, quantifying, and summarizing a collection of actual observations. With inferential statistics, researchers generalize beyond actual observations. That is, inferential statistics is concerned with making an inference from the sample involved in the research to the population of interest, and providing an estimate of population characteristics.

8
Q

Frequency Distribution

A

A frequency distribution is a tabular or graphic representation of how often each value occurs in a set of data.

9
Q

3 measures of central tendency

A

The mode, the median, and the mean all provide estimates of the “average” score.

10
Q

The Mode

A

The mode is the value of the most frequent observation in a set of scores. If two values are tied for being the most frequently occurring observation, the data has 2 modes, or is bimodal. A distribution can also have 3 modes, 4 modes, etc., or no mode, if every value in the distribution occurs with equal frequency. This makes the mode different from the other two measures of central tendency, as there can only be one median and one mean.

11
Q

The Median

A

The median is the middle value when observations are ordered from least to greatest, or from greatest to least. If there is an even number of data points, then the median is the arithmetic mean of the two middle-most numbers.

12
Q

The Mean

A

The mean of a data set is the sum of scores divided by the number of scores. The mean is the measure of central tendency most sensitive to outliers, i.e. extreme scores. If you have outliers in your data set and you are interested in a representative score, it usually makes sense to use the median, and not the mean, as your measure of central tendency.
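
A minimal sketch comparing the three measures of central tendency on one small data set, using Python's standard statistics module. The scores are made up for illustration, and 95 is included as a deliberate outlier.

```python
from statistics import mean, median, multimode

# Hypothetical scores; 95 is an outlier.
scores = [4, 5, 5, 6, 7, 8, 95]

score_mean = mean(scores)        # sum of scores / number of scores
score_median = median(scores)    # middle value of the sorted scores
score_modes = multimode(scores)  # list of the most frequent value(s)

print(f"mean={score_mean:.2f} median={score_median} modes={score_modes}")
```

The single outlier pulls the mean (about 18.57) well above the median (6), which is why the median is usually the more representative score for data like these.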

13
Q

Percentile

A

The percentile tells us the percentage of scores that fall at or below a particular score.
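
A small sketch of this definition, assuming a hypothetical helper named percentile_rank and made-up scores. It counts the scores at or below the target score and expresses that count as a percentage.

```python
def percentile_rank(scores, score):
    """Percentage of values in `scores` that fall at or below `score`."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

# Hypothetical test scores.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
rank_of_80 = percentile_rank(scores, 80)  # 6 of the 10 scores are <= 80
print(rank_of_80)
```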

14
Q

z-score

A

A z-score expresses how many standard deviations above or below the mean a particular score is. To determine z-scores, you subtract the mean of the distribution from your score, and divide the difference by the standard deviation. Negative z-scores fall below the mean, and positive z-scores fall above the mean.
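
The calculation can be sketched in a few lines. The function name and the example distribution (mean 100, standard deviation 15) are assumptions chosen for illustration.

```python
def z_score(score, mean, sd):
    """Standard deviations that `score` lies above (+) or below (-) the mean."""
    return (score - mean) / sd

# Hypothetical distribution with mean 100 and standard deviation 15.
z_above = z_score(130, 100, 15)  # 2.0: two standard deviations above the mean
z_below = z_score(85, 100, 15)   # -1.0: one standard deviation below the mean
print(z_above, z_below)
```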

15
Q

Mean, median, and mode in normal distributions versus skewed distributions

A

Because the normal distribution is symmetrical and has its greatest frequency in the middle, the mean, median, and mode of a normal distribution are identical. In skewed distributions, where the distribution of scores is not symmetrical, the mean, median, and mode are not identical.

16
Q

T-scores

A

Don’t confuse T-scores with t-statistics. Z-scores can be converted to T-scores. The T-score distribution has a mean of 50 and a standard deviation of 10. So, for example, a T-score of 60 is one standard deviation above the mean. Because of their nice round numbers, T-scores are often used in test score interpretation.
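
Since the T-score distribution has a mean of 50 and a standard deviation of 10, the conversion from a z-score works out to T = 50 + 10z. A minimal sketch, with a hypothetical function name:

```python
def t_score_from_z(z):
    """Convert a z-score to a T-score (mean 50, standard deviation 10)."""
    return 50 + 10 * z

print(t_score_from_z(1.0))   # 60.0: one standard deviation above the mean
print(t_score_from_z(-1.5))  # 35.0: one and a half standard deviations below
```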

17
Q

Correlation Coefficients

A

Correlation coefficients are a type of descriptive statistic that measure the extent to which two variables are linearly related. Correlation coefficients range from -1.00 to +1.00. A positive correlation means that as the value of one variable increases, the value of the second variable tends to increase as well. A negative correlation means… The absolute value of a correlation coefficient tells us how strong the relationship is. If two variables have a correlation of zero, knowing the value of the first variable does not help you predict the value of the second variable. The graphical representation of correlational data is called a scatterplot.
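
As an illustration, the sketch below computes one common correlation coefficient, Pearson's r. The card does not say which coefficient it has in mind, so the choice of Pearson's r is an assumption, and the data are made up.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical data for five students.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 55, 61, 64, 68]  # tends to rise with hours studied
errors_made = [9, 8, 6, 5, 2]      # tends to fall with hours studied

print(round(pearson_r(hours_studied, exam_score), 2))   # strongly positive
print(round(pearson_r(hours_studied, errors_made), 2))  # strongly negative
```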

18
Q

Best-Fitting Straight Line

A

Used to highlight correlation on a scatterplot.

19
Q

Factor Analysis

A

See the physical flashcard.

20
Q

Inferential Statistics

A

Inferential statistics is concerned with making inferences, or generalizations, from samples to populations, while taking into account the possibility for error.

21
Q

When should the criterion of significance be chosen?

A

Prior to collecting data.

22
Q

Significance Testing Process

A

See physical flashcard.

23
Q

Errors in Significance Testing

A

There are two possible errors. A Type 1 error is when you reject the null hypothesis even though the null hypothesis is true. The likelihood of making a Type 1 error is called alpha, and is the same as the criterion of significance. A Type 2 error is when you fail to reject the null hypothesis even though the null hypothesis is false. In other words, you obtain a statistically nonsignificant result and conclude, wrongly, that the null hypothesis is true. The probability of making a Type 2 error is called beta, and depends largely on sample size and variance.

24
Q

Note regarding significance testing

A

The purpose of significance testing is to make an inference about a population on the basis of sample data. Statistical significance does not tell us anything about whether the research is poorly designed, or whether the results are trivial or meaningless. The larger the size of the sample, the smaller the difference between the groups has to be in order to be significant. Therefore, if you use really large sample sizes and you get a statistically significant result, the difference between the groups on the DV measure might be so small as to make the results trivial.

25
Q

F ratio

A

See physical flashcard.

26
Q

Factorial Design

A

Each level of a given independent variable occurs with each level of the other independent variables.

27
Q

Meta-Analysis

A

Meta-analysis is a statistical procedure that can be used to draw conclusions on the basis of data from different studies. If researcher A publishes a study on therapeutic outcomes, and researcher B publishes a similar study using different methods, we can use meta-analysis to combine the results of these studies and come up with a more general conclusion.

28
Q

Face Validity

A

Face validity refers to the degree to which a procedure appears to measure what it is supposed to measure. If you are interested in measuring knowledge of 20th-century American history, but you give subjects a test on 20th-century European history, the test will lack face validity.

29
Q

Scales of Measurement

A

See physical flashcard.

30
Q

The 2 types of ability tests

A

There are two types of ability tests: aptitude tests and achievement tests. Aptitude tests are used to predict what one can accomplish through training. In other words, they are used to predict future performance. Intelligence tests are aptitude tests. Achievement tests, on the other hand, attempt to assess what one knows or can do now.

31
Q

Personality Tests and Inventories

A

Formally, in taking a test, the subjects are instructed to do their best; in completing an inventory, they are instructed to represent their typical reactions. A personality inventory is a self-rating device usually consisting of somewhere between 100 and 500 statements. The subject is asked to determine if the given statements apply to him or her. Although these structured tools are quite reliable, the veracity of responses is not guaranteed. For example, if an item says, “I occasionally steal,” most people will tend to answer “no” regardless of whether or not they occasionally steal. The perceived social acceptability of a response is just one factor that can affect the accuracy of inventories that involve self-reporting.

32
Q

How the Minnesota Multiphasic Personality Inventory (MMPI) was first revised

A

In 1989, a revision of the MMPI, the MMPI-2, added content scales. These scales were formed using items derived from theoretical concerns rather than from an empirical criterion-keying approach. For example, to form the low self-esteem content scale, the authors selected items that ought to be related to low self-esteem. Hence, the original clinical scales have been supplemented with content scales that were developed using a more theoretical approach.

33
Q

Minnesota Multiphasic Personality Inventory (MMPI)

A

The MMPI is one of the major personality inventories. It consists of 550 statements to which subjects respond “true,” “false,” or “cannot say.” The MMPI yields scores on ten clinical scales, measuring things such as depression, schizophrenia, and masculinity/femininity. It has scales that can indicate whether the person is careless, faking answers, misrepresenting him- or herself, or distorting responses, and whether the distortion is being done intentionally or unintentionally. The purpose of the MMPI is to aid in the assessment of various clinical disorders. All scores on the MMPI are expressed as standard scores with a mean and standard deviation derived from standardization samples.

34
Q

Standardization Sample

A

A standardization sample is a population of individuals who have previously well-documented intelligence and/or achievement levels, which is used to “standardize” new or revised test instruments to assure that they are reliably measuring what they are intended to measure.

35
Q

Projective Tests

A

Projective tests are different from personality inventories in two basic ways: first, the stimuli in a projective test are relatively ambiguous; and second, the test taker is not limited to a small number of possible responses. A test taker is presented with stimuli and asked to interpret what he or she sees. This means that the scoring of a projective test is subjective, whereas the scoring of personality inventories is objective.

36
Q

Rorschach Inkblot Test

A

The Rorschach inkblot test is a famous projective test created by Hermann Rorschach. The test is made up of 10 cards that are reproductions of inkblots. The cards are presented to the subject in a specific order with very specific instructions to describe what it is that the blots remind the subject of. The clinician then interprets the results based upon what the person saw and the spontaneous remarks that the person may have made.

37
Q

The Blacky Pictures

A

The Blacky pictures is a projective test devised especially for children. The test consists of 12 cartoon-like pictures that feature a little dog named Blacky. Developed according to psychoanalytic theory, each picture depicts Blacky in a situation designed to correspond to a particular stage of psychosexual development. The test taker is asked to tell stories about the pictures.

38
Q

The Rotter Incomplete Sentences Blank

A

The Rotter Incomplete Sentences Blank is a projective test. The test taker is provided with 40 sentence stems and asked to complete them. The theory is that the test taker will fill in the blanks with whatever is on his or her mind.

39
Q

Personality Assessment

A

See physical flashcard.

40
Q

Barnum Effect

A

The Barnum effect is the tendency to accept certain information as true, such as character assessments or horoscopes, even when the information is so vague as to be worthless. The Barnum effect is a form of pseudo validation.

41
Q

Interest Testing

A

Interest testing is usually used to assess an individual’s interest in different lines of work. The best-known test of this kind is the Strong-Campbell Interest Inventory. This inventory is organized like a personality inventory, and in fact, like the MMPI, was developed using an empirical criterion-keying approach. Test takers are given lists of interests and asked to indicate whether they like or dislike the interest listed. In other sections of the test, the test taker is asked to indicate his or her preference for one of two paired items. The interpretation of the results is based, at least partly, on John L. Holland’s model of occupational themes. Holland divided interests into six types: realistic, investigative, artistic, social, enterprising, and conventional. That’s why it is sometimes called the RIASEC system. John L. Holland was an American psychologist who lived from 1919 to 2008.

42
Q

The 3 Wechsler Intelligence Scales

A

David Wechsler developed 3 major IQ tests:

  • Wechsler Preschool and Primary Scale of Intelligence (WPPSI)
  • Wechsler Intelligence Scale for Children (WISC)
  • Wechsler Adult Intelligence Scale (WAIS)

All have been revised and are now called the WPPSI-R, WISC-R, and WAIS-R, and are used with preschoolers, school-aged children (5 – 16 years old), and adults (16 years and older), respectively. The WAIS-IV is the current version utilized for adult intelligence testing.

43
Q

Intelligence Quotient (IQ)

A

See physical flashcard.

44
Q

Cross-Validation

A

Cross-validation, sometimes called rotation estimation or out-of-sample testing, is any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent data set. When assessing the criterion validity of a test, cross-validation involves repeating the assessment of criterion validity on a second sample, after you have demonstrated validity using an initial sample.

45
Q

Significance Testing

A

A significance test is one tool researchers use to draw conclusions about populations based on research conducted on samples. The idea is to show that the observed results are unlikely to have been observed due to chance, and therefore we should reject the null hypothesis and accept the research, or alternative, hypothesis. The cutoff we use to decide whether to reject the null hypothesis is called the criterion of significance. By convention, psychologists usually use 5% as their criterion of significance. The P-value is the probability of obtaining an effect at least as extreme as the one in your sample data, assuming the null hypothesis is true. If our P-value is less than or equal to the criterion of significance (also called the alpha level, or simply alpha), our results are statistically significant, and we reject the null hypothesis.
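
The decision rule can be sketched as follows. The card does not name a particular test, so the one-sample z-test with a known population standard deviation is an assumption made only to produce a P-value. The function names and data are hypothetical.

```python
from math import erfc, sqrt

ALPHA = 0.05  # criterion of significance, fixed before collecting data

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """Distance of the sample mean from the null-hypothesis mean, in standard errors."""
    return (sample_mean - pop_mean) / (pop_sd / sqrt(n))

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z: results at least this extreme under the null."""
    return erfc(abs(z) / sqrt(2))

# Hypothetical data: 100 subjects with a mean of 104, null-hypothesis mean 100, SD 15.
z = z_statistic(104, 100, 15, 100)
p = two_sided_p_value(z)
reject_null = p <= ALPHA  # statistically significant if p is at or below alpha

print(f"z={z:.2f} p={p:.4f} reject_null={reject_null}")
```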

46
Q

68 – 95 – 99.7 rule

A

For an approximately normal data set, the values within one standard deviation of the mean account for about 68% of the set; within two standard deviations account for about 95%; and within three standard deviations account for about 99.7%.

Note: for easy divisibility by 2, as a heuristic it may be helpful to use 96% instead of 95%.
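
The three percentages can be checked numerically. For a normal distribution, the proportion of values within k standard deviations of the mean is erf(k / sqrt(2)). The function name below is a hypothetical choice.

```python
from math import erf, sqrt

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {100 * share_within(k):.2f}%")
```

The printed values are roughly 68.27%, 95.45%, and 99.73%, which the rule rounds to 68, 95, and 99.7.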

47
Q

3 measures of variability

A

The range, standard deviation, and variance are measures of variability, or dispersion. If the scores in the distribution are all the same, then there is no variability. If the scores are very spread out, then the variability is high. The range is the smallest number in the distribution subtracted from the largest number. The standard deviation provides a measure of the typical distance of scores from the mean. The variance is the square of the standard deviation (the standard deviation is the positive square root of the variance). Both the standard deviation and the variance must be either zero or a positive number.
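
A short sketch of the three measures using Python's standard statistics module. It assumes the population formulas (dividing by the number of scores), which the card does not specify, and the scores are made up.

```python
from statistics import pstdev, pvariance

# Hypothetical scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

score_range = max(scores) - min(scores)  # largest minus smallest: 7
variance = pvariance(scores)             # mean squared distance from the mean: 4
sd = pstdev(scores)                      # positive square root of the variance: 2.0

print(score_range, variance, sd)
```

Using the sample formulas instead (statistics.variance and statistics.stdev, which divide by one less than the number of scores) would give slightly larger values.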

48
Q

Demand Characteristics

A

Demand characteristics are cues that inform the subject how he or she is expected to behave. One possible remedy for demand characteristics is the use of deception.

49
Q

4 Types of Research

A

  • Naturalistic observation: Researcher does not intervene; measures behavior as it naturally occurs; also called field study
  • Correlational: IV not manipulated
  • Quasi-experiment: IV manipulated; subjects not randomly assigned to groups
  • True experiment: IV manipulated; subjects randomly assigned to groups

50
Q

Alfred Binet (1857 – 1911)

and

Theodore Simon (1872 – 1961)

A

Together they published the first intelligence test, known as the Binet-Simon Intelligence Scale. The purpose of the test was to assess the intelligence of French schoolchildren to ascertain which children were too intellectually disabled to benefit from ordinary schooling. Binet also introduced the concept of mental age, or the level at which a person functions intellectually, regardless of their actual chronological age.

51
Q

Significance Tests (chart)

A

See physical flashcard.

52
Q

Lewis Madison Terman

A

American psychologist who in 1916 revised the Binet-Simon Intelligence Scale for use in the United States. This became known as the Stanford-Binet Intelligence Scale. (Terman was a professor at Stanford University. One of his doctoral students was Harry Harlow.)

53
Q

Wilhelm Wundt

A

(1832 – 1920) Founded the first psychology laboratory in 1879. Wundt brought together earlier work in philosophy, physiology, and psychophysics to create psychology as a science. Wundt is often remembered as a structuralist, and for the rather narrow utility of his experimental strivings to reduce consciousness to its elements. Actually, Wundt himself believed that experimental psychology had a very limited use, and could not be used to study the higher mental processes such as memory, thinking, and language. To study the higher mental processes, Wundt proposed a sort of cultural psychology.

54
Q

Christiana Morgan

and

Henry Murray

A

Devised the Thematic Apperception Test (TAT), a projective test consisting of 20 simple pictures that depict scenes with ambiguous meanings. For example, one picture might be a boy staring sadly at a violin. The test taker is asked to tell a story about what is happening in the picture, and to provide an ending. As with the Rorschach test, there is no standardized scoring method for the TAT. Scoring is qualitative and the clinician has to rely on his or her clinical skills.

55
Q

Construct Validity

A

Construct validity is the extent to which the measurement or manipulation of a variable accurately represents the theoretical variable being studied. Convergent and discriminant validity are the two subtypes that make up construct validity. Convergent validity refers to the degree to which two measures of constructs that theoretically should be related are, in fact, related. Discriminant validity refers to whether constructs that are supposed to be unrelated are, in fact, unrelated.

56
Q

Validity

A

Validity is the extent to which a test actually measures what it purports to measure. All types of validity assessment examine the relationship between performance on the test in question and other independent and objective sources of information about the knowledge or behaviors of interest. Types of validity include criterion validity (concurrent and predictive), construct validity (convergent and discriminant), content validity, and face validity.

57
Q

Examples of Test Validity

(chart)

A

See physical flashcard.

58
Q

How the Minnesota Multiphasic Personality Inventory (MMPI) was originally developed

A

The original MMPI was developed by Starke R. Hathaway and J. C. McKinley, faculty of the University of Minnesota, and first published in 1943. Hathaway and McKinley used the empirical criterion-keying approach. They tested thousands of questions and retained those that differentiated between patient and nonpatient populations, even if the item didn’t seem to have anything to do with abnormality. The authors examined the responses of patient groups with different diagnoses. Each criterion group’s responses formed the basis of a particular clinical scale, so that if a new patient answered questions in the same way that, say, the depressive criterion group did, that patient would receive a high depression score.

59
Q

Wechsler Scales

A

A major group of intelligence tests is the Wechsler scales. Unlike the Stanford-Binet, whose items were not organized by content, the Wechsler scales have all items of a given type grouped into subtests. These items are arranged in order of increasing difficulty within each subtest. The Wechsler scales have two broad subscales: a verbal scale, which is based on information, vocabulary, and related skills; and a performance scale, which is derived from tests of manipulative skill, eye-hand coordination, and speed.

60
Q

Reliability

A

Reliability is the consistency with which a test measures whatever it is that the test measures. In practice, no test is perfectly reliable. The standard error of measurement (SEM) estimates how repeated measures of a person on the same instrument tend to be distributed around his or her “true” score. The true score is always an unknown. The smaller a test’s SEM, the more reliable the test is. There are 3 basic types of reliability / methods of assessing reliability: test-retest reliability, alternate-form reliability, and split-half reliability. Each type involves a correlation. In test-retest reliability, one test is administered twice to the same group of individuals. In alternate-form reliability, two forms of a test are administered to the same group of people. In split-half reliability, a single test is divided into equal halves, and scores on one half are correlated with scores on the other half. In all of these methods, a correlation coefficient greater than +0.80 indicates a high level of reliability.

61
Q

James McKeen Cattell

A

(1860 – 1944) American psychologist who studied under Wilhelm Wundt in Germany and later became the first professor of psychology in the United States. Cattell was a long-time editor and publisher of scientific journals and publications, most notably the journal Science.

62
Q

Hermann Ebbinghaus

A

(1850 – 1909) A contemporary of Wundt, who studied memory using nonsense syllables, thereby showing that at least one of the higher mental processes (memory) could be studied empirically using good experimental methodology, contrary to Wundt’s assertion.

63
Q

California Psychological Inventory

(CPI)

A

The California Psychological Inventory (CPI) is another personality inventory that is based on the MMPI. It was developed to be used with normal populations from age 13 and up. It is especially oriented to high school and college students. The CPI consists of 20 scales, including three validity scales, used to assess test-taker attitudes. Through a series of 462 true-false items, the CPI measures such personality traits as dominance, sociability, self-control, and femininity. Like the MMPI, all scores are expressed as standard scores with a mean and standard deviation derived from standardization samples.

64
Q

Content Validity

A

Content validity, also called logical validity, refers to the extent to which a measure represents all facets of a given construct. For example, a depression scale may lack content validity if it only assesses the affective dimension of depression while failing to take into account the behavioral dimension. An element of subjectivity exists in relation to determining content validity, which requires a degree of agreement about what a particular personality trait such as extroversion represents.

65
Q

Criterion Validity

A

In psychometrics, criterion validity, also known as concrete validity, is the extent to which a measure is related to an outcome. Criterion validity is often divided into concurrent validity and predictive validity. Concurrent validity refers to a comparison between the measure in question and an outcome assessed at the same time. Predictive validity, on the other hand, compares the measure in question with an outcome assessed at a later time.

66
Q

Wilhelm Wundt, Oswald Külpe, and mental images

A

Wilhelm Wundt believed that whenever you think of something, an image forms in your mind, i.e. there can be no thought without a mental image. Oswald Külpe disagreed. Külpe strongly believed that there can be imageless thought, and he performed experiments to prove his hypothesis. Külpe lived from 1862 to 1915 and was a protégé of Wundt. Both Wundt and Külpe were part of the structuralist school.

67
Q

David Wechsler

A

(1896 – 1981) David Wechsler was a Romanian-American psychologist who developed the Wechsler Intelligence Scales.

68
Q

Intelligence Quotient (IQ)

A

The intelligence quotient (IQ) is a measure of intelligence aptitude using an equation comparing mental age to chronological age. IQ is mental age divided by chronological age, multiplied by 100. An IQ of 100 indicates that a person’s mental age is equal to his or her chronological age. This concept is known as the ratio IQ and was developed by William Stern. One of the problems with the ratio IQ is that after a certain age, chronological age increases while mental age does not. Therefore, even if your mental age remains constant, your IQ will decrease with age. In order to get around this problem, the 1960 revision of the Stanford-Binet used deviation quotients. Essentially, a deviation IQ score tells us how far away a person’s score is from the average score for the particular age group the subject is a member of.
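
The ratio IQ formula, and the problem with it in adulthood, can be sketched as follows. The function name and the ages are hypothetical.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age divided by chronological age, multiplied by 100."""
    return mental_age / chronological_age * 100

print(ratio_iq(10, 8))   # 125.0: mental age ahead of chronological age
print(ratio_iq(10, 10))  # 100.0: mental age equal to chronological age

# Mental age held constant at 20 while chronological age keeps increasing.
print(ratio_iq(20, 20))  # 100.0
print(ratio_iq(20, 40))  # 50.0: the ratio IQ falls even though ability is unchanged
```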

69
Q

Adaptive Test

A

An adaptive test is a computerized achievement test that adapts to the test taker’s ability by assessing the accuracy of previously answered questions. A test taker with a high ability will be faced with more difficult questions than a test taker with a low ability.

70
Q

Norm-Referenced vs. Domain-Referenced Testing

(Domain-Referenced is sometimes called Criterion-Referenced)

A

Norm-referenced testing involves assessing an individual’s performance in comparison to others. For example, “Erika did better than 99% of second graders tested.” Test norms are derived from standardization samples; the samples should be large and representative of the population to whom the particular test will be administered. One problem with norm-referenced testing is that the population to whom the tests will be administered can, and often does, change. If the population of interest changes, then the original standardization sample would no longer be representative of the population. Domain-referenced testing, also called criterion-referenced testing, is concerned with the question of what the test taker knows about a specified content domain. Performance is described in terms of what the test taker knows or can do, not how he or she scores in relation to peers. An example of domain-referenced testing is the written test you must take for your driver’s license.

71
Q

3 kinds of significance tests to know

A

  • t-test: used to compare the means of 2 groups
  • ANOVA (analysis of variance): used to compare the means of more than 2 groups; also used to determine whether there is any interaction between 2 or more IVs (i.e. the effects of one independent variable are not consistent for all levels of the other independent variables); ANOVAs estimate how much group means differ from each other by comparing the between-group variance to the within-group variance using a ratio, called the F ratio.
  • Chi-square test: tests the equality of two frequencies; chi-square tests work with categorical data, also called nominal data
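
The F ratio mentioned in the ANOVA item can be sketched for the simplest case, a one-way ANOVA with a single IV. The function name and the scores are hypothetical. The sketch computes only the ratio itself and does not look up the corresponding P-value.

```python
from statistics import mean

def f_ratio(groups):
    """One-way ANOVA F ratio: between-group variance divided by within-group variance."""
    k = len(groups)                         # number of groups
    n_total = sum(len(g) for g in groups)   # total number of scores
    grand_mean = mean(x for g in groups for x in g)
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical scores for three groups with means of 2, 3, and 6.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(f_ratio(groups))
```

A larger F ratio means the group means differ more than would be expected from the spread of scores within the groups.
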
72
Q

Special Note

A

From time to time, a question pops up on the GRE Psychology Test about what would happen if you converted every score in a distribution to a z-score. Remember that if you have a distribution of z-scores and calculate the mean and standard deviation, the mean of the distribution of z-scores will always be zero and the standard deviation will always be 1. This is true regardless of whether the distribution is normal, and regardless of the mean and the standard deviation of the original distribution.
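
This property can be checked on a deliberately non-normal data set. The scores are made up, and the sketch assumes the population standard deviation is used for the conversion.

```python
from statistics import mean, pstdev

# Hypothetical, strongly skewed scores.
raw = [1, 1, 2, 2, 3, 10, 30]

m = mean(raw)
sd = pstdev(raw)
z_scores = [(x - m) / sd for x in raw]

# Apart from floating-point rounding, the mean is 0 and the standard deviation is 1.
print(round(mean(z_scores), 10), round(pstdev(z_scores), 10))
```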

73
Q

Experimenter Bias

(Expectancy Effects)

A

Any intentional or unintentional influence that the experimenter exerts on subjects to confirm the hypothesis under investigation. Alternatively, the experimenter might also let his or her expectations affect how the results of the experiment are interpreted. One remedy for experimenter bias is double-blinding.

74
Q

3 Types of Probability Sampling

A

  • Simple random sampling: Every member of the population has an equal probability of being selected for the sample.
  • Stratified random sampling: The population is divided into subgroups (also called strata), then random sampling techniques are used to select sample members from each stratum.
  • Cluster sampling: The researcher identifies “clusters” of individuals, then all individuals in each cluster are included in the sample; what makes cluster sampling a form of probability sampling is the way in which the clusters are selected.
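
The three sampling methods can be sketched with Python's standard random module. All function names, group names, and members below are hypothetical. The cluster example assumes the clusters themselves are chosen by simple random sampling.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(population, n)

def stratified_random_sample(strata, n_per_stratum, rng):
    """Randomly sample the same number of members from each stratum."""
    return {name: rng.sample(members, n_per_stratum) for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters, then include everyone in the chosen clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

rng = random.Random(0)  # seeded only so the sketch is repeatable

population = list(range(100))
strata = {"first_year": list(range(0, 50)), "final_year": list(range(50, 100))}
clusters = {f"class_{i}": list(range(i * 10, i * 10 + 10)) for i in range(10)}

simple = simple_random_sample(population, 10, rng)
stratified = stratified_random_sample(strata, 5, rng)
clustered = cluster_sample(clusters, 2, rng)
print(len(simple), {k: len(v) for k, v in stratified.items()}, len(clustered))
```
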
75
Q

Probability vs. Nonprobability Sampling

A

In probability sampling, each member of the population has a specifiable probability (chance) of being chosen. In nonprobability sampling, the probability (chance) of any particular member of the population being chosen is unknown.

76
Q

William Stern

A

(1871 – 1938) German psychologist and philosopher who developed a function to compare mental age with chronological age. Stern coined the term intelligence quotient, or IQ, to designate the output of this function.

77
Q

Hypothesis

A

A tentative and testable explanation of the relationship between one or more independent variables and one or more dependent variables.

78
Q

Independent and Dependent

Variables

(IV and DV)

A

The independent variable is the variable whose effect is being studied. The dependent variable is the variable expected to change due to variations in the independent variable.

79
Q

Variable

A

A factor that varies in amount or kind and can be measured.

80
Q

Operational Definitions

A

Operational definitions state how the researcher will measure the variables.

81
Q

Population and Sample

A

The population is the group to which the researcher wishes to generalize her results. A sample is a subset of the population.

82
Q

Representative Sample

A

A representative sample is a sample which matches as many characteristics as possible of the population as a whole.

83
Q

Subject Variables, or

Participant Variables

A

Characteristics of individuals, such as age, gender, ethnic group, nationality, birth order, personality, or marital status. These variables are by definition nonexperimental; they cannot be manipulated, they can only be measured.

84
Q

Between-Subjects Design,

Matched-Subjects Design, and

Within-Subjects Design

(also called Repeated-Measures Design)

A

In a between-subjects design, each subject is exposed to only one level of each independent variable. A matched-subjects design is like a between-subjects design, except that every subject in one group is “matched” with an “equivalent” subject in another group. The idea is to negate the effect of confounding variables. In a within-subjects design (also called repeated-measures design), the subject’s own performance is the basis of comparison. Each subject experiences multiple levels of the IV.

85
Q

Counterbalancing

A

Counterbalancing is an attempt to counteract order effects in a within-subjects design (also called a repeated-measures design). Half the subjects might be given IVA on day one and IVB on day two, while the other half would be given IVB on day one and IVA on day two.
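
A minimal sketch of this two-condition scheme. The function name and subject labels are hypothetical, and shuffling the subjects before splitting them is an added assumption so that order is not tied to sign-up position.

```python
import random

def counterbalance(subjects, seed=0):
    """Assign half the subjects the order IVA then IVB, and the other half IVB then IVA."""
    shuffled = list(subjects)
    random.Random(seed).shuffle(shuffled)  # seeded only so the sketch is repeatable
    half = len(shuffled) // 2
    orders = {}
    for subject in shuffled[:half]:
        orders[subject] = ("IVA", "IVB")   # day one, day two
    for subject in shuffled[half:]:
        orders[subject] = ("IVB", "IVA")
    return orders

subjects = [f"S{i}" for i in range(1, 9)]  # eight hypothetical subjects
orders = counterbalance(subjects)
for subject in subjects:
    print(subject, orders[subject])
```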

86
Q

Confounding Variables

A

Unintended independent variables.

87
Q

Control Group Design

A

Control group design means treating every group identically in all respects except for carefully varying the levels of one or more independent variables.