Exam #2 Flashcards

1
Q

Ch. 5: Self-report measure

A

A method of measuring a variable in which people answer questions about themselves in a questionnaire or interview. For example, asking people how much they appreciate their partner and asking about gender identity are both self-report measures.

2
Q

Observational measures

A

A method of measuring a variable by recording observable behaviors or physical traces of behaviors. Also called behavioral measure. For example, a researcher could operationalize happiness by observing how many times a person smiles. Intelligence tests can be considered observational measures, because the people who administer such tests in person are observing people’s intelligent behaviors. People may change behavior because they know they are being watched.

3
Q

Physiological measures

A

A method of measuring a variable by recording biological data, such as heart rate, galvanic skin response, or blood pressure. Physiological measures usually require equipment to record and analyze biological data.
Ex: measuring hormone levels or brain activity; measuring cortisol in children’s saliva and checking how it relates to their behavior.

4
Q

What’s reliability? What are the kinds of reliability?

A

The consistency or stability of the results of a behavioral measure. The kinds are test-retest reliability, alternate-forms reliability, interrater reliability, internal reliability, and split-half reliability.

5
Q

Test-retest reliability?

A

The consistency in results every time a measure is used.
- Test-retest reliability is assessed by measuring the same individuals at two points in time and comparing the results. A high correlation between test and retest indicates reliability.
For example, a trait like intelligence is not usually expected to change over a few months, so if we assess the test-retest reliability of an IQ test and obtain a low score, we would be doubtful about the reliability of this test. In contrast, if we were measuring flu symptoms or seasonal stress, we would expect test-retest reliabilities to be low, simply because these constructs do not stay the same over time.
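A minimal sketch of how the test-retest correlation could be computed; the IQ scores and variable names here are invented for illustration:

```python
# Hypothetical test-retest check: correlate scores from time 1 and time 2.
from statistics import correlation  # Python 3.10+

time1 = [98, 105, 110, 92, 121, 100, 87, 115, 108, 95]   # first administration
time2 = [101, 103, 112, 90, 119, 102, 89, 117, 105, 97]  # same people, months later

r = correlation(time1, time2)  # Pearson r between test and retest
print(f"test-retest r = {r:.2f}")  # a high positive r suggests reliability
```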

6
Q

Interrater reliability?

A

The degree to which two or more observers give consistent ratings of a set of targets.
- Interrater reliability is the correlation between the observations of different RATERS.
-A high correlation indicates raters agree in their ratings.
- To test the interrater reliability of some measure, we might ask two observers to rate the same participants at the same time, and then we would compute r. If r is positive and strong (according to many researchers, r = .70 or higher), we would have very good interrater reliability.
For example, suppose you are assigned to observe the number of times each child smiles in 1 hour at a childcare playground. If, for one child, you record 12 smiles during the first hour and your lab partner also records 12 smiles in that hour, there is interrater reliability.
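A minimal sketch of the interrater computation described above, assuming hypothetical smile counts from two observers watching the same eight children:

```python
# Hypothetical interrater check: correlate two observers' counts of the
# same behavior for the same participants.
from statistics import correlation  # Python 3.10+

rater_a = [12, 5, 9, 14, 3, 7, 11, 6]  # smiles counted by observer A
rater_b = [12, 6, 9, 13, 4, 7, 10, 6]  # smiles counted by observer B

r = correlation(rater_a, rater_b)
print(f"interrater r = {r:.2f}")  # r = .70 or higher is often read as good agreement
```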

7
Q

What’s validity?

A

How accurate an assessment, test, or measure is.
- The appropriateness of a conclusion or decision.

8
Q

Construct validity?

A

An indication of how well a variable was measured or manipulated in a study.
- It can be used in observational research; for example, a measure of how much people eat in fast-food restaurants.
Construct validity is especially important when a construct is not directly observable. Take happiness: we have no means of directly measuring how happy a person is. We could estimate it in a number of ways, such as scores on a well-being inventory, daily smile rate, blood pressure, stress hormone levels, etc.

9
Q

Face validity

A

The extent to which a measure is subjectively considered a plausible operationalization of the conceptual variable in question.
- The content of the measure appears to reflect the construct being measured.
Ex: Head circumference has high face validity as a measurement of hat size, but it has low face validity as an operationalization of intelligence. In contrast, speed of problem solving, vocabulary size, and curiosity have higher face validity as operationalizations of intelligence.
- Does the measure look good? This is considered the weakest kind of validity.

10
Q

Content validity

A

The extent to which a measure captures all parts of a defined construct.
Ex: a measure of anxiety should cover all anxiety domains. Consider this conceptual definition of intelligence, which contains distinct elements, including the ability to “reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly, and learn from experience”. To have adequate content validity, any operationalization of intelligence should include questions or items to assess each of these seven components.

11
Q

Criterion validity

A

Evaluates whether the measure under consideration is associated with a concrete behavioral outcome that it should be associated with.
- Criterion validity is especially important for self-report measures because the correlation can indicate how well people’s self-reports predict their actual behavior.
- Criterion validity evidence could show that IQ scores are correlated with other behavioral outcomes that are theoretically related to intelligence, such as the ability to solve problems and indicators of life success.
- It measures a current outcome (ask).
- We make sure that no other events can influence the outcome.

12
Q

Predictive validity (ask; not in book)

A

Refers to the ability of a test or other measurement to predict a future outcome. Here, an outcome can be a behavior, performance, or even a disease that occurs at some point in the future.
e.g. A pre-employment test has predictive validity when it can accurately identify the applicants who will perform well after a given amount of time, such as one year on the job.

13
Q

Surveys definition? (ch. 6)

A

A method of posing questions to people on the telephone, in personal interviews, on written questionnaires, or via the Internet. Also called polls.
The term survey is often used when people are asked about consumer products.

14
Q

Polls definition

A

A method of posing questions to people on the telephone, in personal interviews, on written questionnaires, or via the Internet. Also called survey.
The term poll is often used when people are asked about their social or political opinions.

15
Q

Observational research

A

The process of watching people or animals and systematically recording how they behave or what they are doing.
Some claims are based on observational data.
– Observing how much people talk, how they behave, etc.
Strength: it works even for people who struggle with introspection.

16
Q

Observer bias?

A

A bias that occurs when observers’ expectations influence their interpretation of participant behavior or the outcomes of the study. Instead of rating behaviors objectively, observers rate behaviors according to their own expectations or hypotheses.

17
Q

Observer/expectancy effects

A

Observers inadvertently change the behavior of the participants they are observing.
- Observers not only see what they expect to see; sometimes they even cause the behavior of those they are observing (such as rats) to conform to their expectations.

18
Q

Ways to reduce observer bias & effects

A
  • Researchers can assess the construct validity of a coded measure by using multiple observers.
  • Masked design/blind design: observers are unaware of the purpose of the study and of the conditions/groups participants are assigned to.
  • Training for observers: if there is disagreement, the researchers may need to train their observers better and develop a clearer coding system for rating the behaviors.
  • “Blend in”: one way to avoid observer effects is to make unobtrusive observations—that is, make yourself less noticeable.
  • “Wait it out”: a researcher who plans to observe at a school might let the children get used to his or her presence until they forget they’re being watched.
  • “Indirect measure”: instead of observing behavior directly, researchers measure the traces a particular behavior leaves behind. e.g. The number of empty liquor bottles in residential garbage indicates how much alcohol is being consumed in a community.
  (- Researchers develop clear rating instructions, often called codebooks, so the observers can make reliable judgments with minimal bias.)

19
Q

Constructing Leading Questions to Ask (simplicity)

A

The way a question is worded and presented in a survey can make a tremendous difference in how people answer. A leading question nudges respondents toward a particular answer, so to preserve construct validity, wording should be kept as simple and neutral as possible.

20
Q

Constructing Double-barreled Questions to Ask

A

A type of question in a survey or poll that is problematic because it asks two questions in one, thereby weakening its construct validity. People might be responding to the first half of the question, the second half, or both.
e.g. Do you enjoy swimming and running?

21
Q

Constructing Negatively-worded Questions to Ask

A

A question in a survey or poll that contains negatively phrased statements, making its wording complicated or confusing and potentially weakening its construct validity.
Ex: “People who do not drive with an expired license should never be punished.” Or: “It’s impossible that it never happened.” In order to give your opinion, you must be able to unpack the double negative (“not . . . never”). So instead of measuring people’s beliefs, the question may be measuring people’s working memory.

22
Q

Acquiescence (“yea-saying”) response set

A

One potential response set is acquiescence, or “yea saying”. This occurs when people say “yes” or “strongly agree” to every item instead of thinking carefully about each one. For example, a respondent might answer “5” to every item on Diener’s scale of subjective well-being—not because the respondent is happy, but because that person is using a yea-saying shortcut. It can threaten construct validity because instead of measuring the construct of true feelings of well-being, the survey could be measuring the lack of motivation to think carefully.

23
Q

Open-ended vs forced-choice (closed-ended) questions

A

Open-ended - A survey question format that allows respondents to answer any way they like. They might ask people to name the public figure they admire the most or comment on their experience at a hotel.
Ex: What do you think of this food? (Lots of possible answers.)
Closed-ended - A survey question format in which respondents give their opinion by picking the best of two or more options.
Ex: Do you like this food? (Yes/no answer.)
Would you vote for the Republican or the Democrat?
Forced-choice questions are also used to measure personality.

24
Q

Rating scales: semantic differential format?

A

A survey question format using a response scale whose numbers are anchored with adjectives.
e.g. on the Internet site RateMyProfessors.com, students assign ratings to a professor using the following adjective phrases.
“Profs get F’s too 1 2 3 4 5 A real gem”
Internet rating sites (like Yelp) are another example: one star means “poor” or (on Yelp) “Eek! Methinks not,” and five stars means “outstanding” or even “Woohoo! As good as it gets!”

25
Q

What’s a Likert scale?

A

A survey question format using a rating scale containing multiple response options labeled with the specific terms: strongly agree, agree, neither agree nor disagree, disagree, and strongly disagree.
- A scale that does not follow this format exactly is called a Likert-type scale.
Ordinal scale (ask)

26
Q

Graphic rating scale?

A

It’s a nonverbal scale for children
e.g. Pain scale with pictures

27
Q

What are the kinds of survey questionnaires? (ask)

A

- Mail surveys
- Online surveys
- Interviews: personal administration to groups or individuals.

28
Q

Response sets definition. A threat to which validity?

A

Also known as “non-differentiation,” response sets are shortcuts people use when answering questions, usually in a long survey. A response set is a tendency to respond to survey questions from a particular perspective rather than answering the questions directly.
- It weakens construct validity

29
Q

Fence sitting definition. A threat to which validity?

A

Playing it safe by answering in the middle of the scale for every question in a survey or interview, especially when survey items are controversial or when a question is confusing and unclear. Of course, some people honestly may have no opinion on the questions; in that case, they choose the middle option for a valid reason (no opinion or I don’t know). It can be difficult to distinguish those who are unwilling to take a side from those who are truly ambivalent.
- it weakens construct validity
- To protect against it, use a scale with an even number of response options so the participant is forced to pick a side, or use forced-choice questions.

30
Q

The social desirability response set. A threat to which validity?

A

The tendency to answer questions in a way that makes one look better than one really is (threatens construct and statistical validity). Also called faking good.
The idea is that because respondents are embarrassed, shy, or worried about giving an unpopular opinion, they will not tell the truth on a survey or other measure.
- To avoid socially desirable responding, a researcher might ensure that the participants know their responses are anonymous.

31
Q

Faking good vs bad

A

Faking good - Giving answers on a survey (or other self-report measure) that make one look better than one really is.
Faking bad - Giving answers on a survey (or other self-report measure) that make one look worse than one really is. e.g. A person stuck in a pattern of hopelessness who avoids having to change by believing, “I am bad; nobody accepts me; nothing will change”. These individuals attempt to convince others that they are ‘hopeless cases’.

32
Q

Reactivity meaning?

A

A change in behavior of study participants (such as acting less spontaneously) because they are aware they are being watched. They might react by being on their best behavior—or in some cases, their worst—rather than displaying their typical behavior.
e.g. Suppose you’re visiting a first-grade classroom to observe the children. You walk quietly to the back of the room and sit down to watch what the children do. What will you see? A roomful of little heads looking at you!

33
Q

What’s Generalizability? (ch. 7)

A

The extent to which the subjects in a study represent the populations they are intended to represent or how well the settings in a study represent other settings.
Questions to ask: How did the researchers choose the study’s participants, and how well do those participants represent the intended population?
With a frequency claim, the whole point of external validity is generalizability.
e.g. “74% of the world smiled yesterday.” Did Gallup researchers survey every one of the world’s 8 billion people to come up with this number?

34
Q

Population vs sample?

A

Population - the entire set of people or things in which you are interested; for example, all freshmen currently enrolled at your college or university.
Sample - a smaller set of people or things taken from the population; for example, 100 freshmen currently enrolled at your college or university.

35
Q

Probability sampling?

A

A category name for random sampling techniques, such as simple random sampling, stratified random sampling, oversampling, multistage sampling, and cluster sampling, in which every member of the population of interest has an equal chance of being selected for the sample (also known as random sampling).

36
Q

Non-probability sampling?

A

A category name for nonrandom sampling techniques, such as convenience, purposive, and quota sampling, that result in a biased sample.

37
Q

Simple random sampling?

A

The most basic form of probability sampling, in which the sample is chosen completely at random from the population of interest (e.g., drawing names out of a hat).
Ex: putting every member of the population of interest in a pool and then randomly selecting a number of names from the pool to include in your sample.
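A minimal sketch of simple random sampling, assuming a hypothetical population list (names and sizes are invented):

```python
# Every member of the population has an equal chance of being selected.
import random

population = [f"student_{i}" for i in range(1, 501)]  # hypothetical population
sample = random.sample(population, k=100)  # like drawing 100 names out of a hat
print(sample[:5])
```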

38
Q

Random sampling vs random assignment? Which validities increase?

A

Random sampling: creating a sample using some random method so that each member of the population of interest has an equal chance of being in the sample; this method increases external validity.
Random assignment: only used in experiments; participants are randomly put into different groups (usually a treatment group and a comparison group); this method increases internal validity (e.g. flipping a coin).
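A minimal sketch contrasting the two procedures, with hypothetical participants:

```python
import random

population = [f"person_{i}" for i in range(1, 201)]

# Random sampling decides who gets INTO the study (supports external validity).
sample = random.sample(population, k=20)

# Random assignment decides which GROUP each sampled person lands in
# (supports internal validity), like flipping a coin for each participant.
random.shuffle(sample)
treatment_group, comparison_group = sample[:10], sample[10:]
```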

39
Q

Stratified random sampling

A

A form of probability sampling; A multistage technique in which the researcher selects specific demographic categories or strata (such as race or gender) and then randomly selects individuals from each of the categories. The population is divided into subgroups (strata), and random samples are taken from each strata.
For example, a group of researchers might want to be sure their sample of 1,000 Canadians includes people of South Asian descent in the same proportion as in the Canadian population (which is 4%). Thus, they might have two categories (strata) in their population: South Asian Canadians and other Canadians.
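A minimal sketch of the Canadian example above, with invented member lists; each stratum is sampled randomly, in proportion to its share of the population:

```python
import random

strata = {
    "south_asian_canadians": [f"sa_{i}" for i in range(400)],  # ~4% of population
    "other_canadians": [f"other_{i}" for i in range(9600)],    # ~96% of population
}
target_n = 1000
pop_size = sum(len(members) for members in strata.values())

sample = []
for name, members in strata.items():
    k = round(target_n * len(members) / pop_size)  # proportional quota (e.g., 40)
    sample += random.sample(members, k)            # random WITHIN each stratum
```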

40
Q

Cluster sampling

A

Clusters of participants within a population of interest are randomly selected, and then all individuals in each selected cluster are used.
Clusters are identified, and samples are taken from those clusters.
For example, we choose a cluster (sample) of 50 hospitals in Philadelphia and include all the doctors from each of the 50 hospitals.

41
Q

Convenience sampling

A

Choosing a sample based on those who are easiest to access and readily available; a biased sampling technique. “Haphazard” or “take-them-where-you-find-them” sampling.
Ex: Psychology studies are often conducted by psychology professors, and they find it handy to use easy-to-reach college students as participants.
Another example occurs in online studies. Psychologists may conduct research through websites such as Prolific Academic or Amazon’s Mechanical Turk.

42
Q

Purposive sampling

A

A biased sampling technique in which only certain kinds of people are included in a sample, so you only recruit those types of participants.
- Sample meets predetermined criterion
For example, if you wanted to recruit smokers, you might recruit participants from a tobacco store.

43
Q

Quota sampling

A

Similar to stratified random sampling, but a biased (nonrandom) sampling technique: the researcher identifies subsets of the population and then sets a target number (i.e., a quota) for each category in the sample. Then she uses nonrandom sampling until the quotas are filled.
e.g. You would like to have 20 college freshmen, 20 sophomores, 20 juniors, and 20 seniors in your sample. You know some people in each of these categories but not 20 of each, so you might use snowball sampling until you meet your quota of 20 in each subset.

44
Q

Multistage sampling (ask)

A

A probability sampling technique involving at least 2 stages: a random sample of clusters followed by a random sample of people within the selected clusters.
e.g. we choose a cluster (sample) of 50 hospitals in Philadelphia, but we choose only a certain number of doctors instead of all, for example only 3 doctors from each of 50 hospitals.
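A minimal sketch of the two-stage hospital example, with invented hospitals and doctors:

```python
import random

# Hypothetical sampling frame: 200 hospitals with 30 doctors each.
hospitals = {f"hospital_{h}": [f"doctor_{h}_{d}" for d in range(1, 31)]
             for h in range(1, 201)}

stage1 = random.sample(list(hospitals), k=50)  # stage 1: 50 random hospitals
sample = [doc
          for h in stage1
          for doc in random.sample(hospitals[h], k=3)]  # stage 2: 3 random doctors each
```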

45
Q

Oversampling

A

A form of probability sampling; a variation of stratified random sampling in which a researcher intentionally overrepresents one or more groups.
e.g., Perhaps a researcher wants to sample 1,000 people, making sure to include South Asians in the sample. Maybe the researcher’s population of interest has a low percentage of South Asians (say, 4%). Because 40 individuals may not be enough to make accurate statistical estimates, the researcher decides that of the 1,000 people he samples, a full 100 will be sampled at random from the Canadian South Asian community.

46
Q

Systematic sampling

A

A probability sampling technique in which the researcher starts with a randomly chosen member of the population (e.g., the 9th) and then counts off, selecting every k-th member (e.g., every 4th) to achieve a sample.
- Using a computer or a random number table, the researcher starts by selecting two random numbers— say, 4 and 7. If the population of interest is a roomful of students, the researcher would start with the fourth person in the room and then count off, choosing every seventh person until the sample is the desired size.
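A minimal sketch of systematic sampling as described above (random start, random interval; the data are invented):

```python
import random

population = [f"student_{i}" for i in range(1, 301)]  # a roomful of students

start = random.randint(1, 10)  # e.g., 4: begin with the fourth person
step = random.randint(2, 10)   # e.g., 7: then take every seventh person
sample = population[start - 1 :: step][:30]  # stop once the sample is large enough
```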

47
Q

Snowball sampling

A

A variation on purposive sampling in which participants are asked to recommend other participants for the study.
- Snowball sampling is an unrepresentative (biased) sampling technique, because people are recruited via social networks, which are not random.
For example, for a study on coping behaviors in people who have Crohn’s disease, a researcher might start with one or two people who have the condition and then ask them to recruit people from their support groups. Each of them might, in turn, recruit one or two more acquaintances, until the sample is large enough.

48
Q

Self-selection? Where is it prevalent, and which problems may it cause?

A

A form of sampling bias that occurs when a sample contains only people who volunteer to participate.
- Self-selection is especially prevalent in online polls.
- Self- selection can cause serious problems for external validity.
e.g. When Internet users choose to rate something—a product on Amazon.com, an online quiz on Twitter or BuzzFeed.com, a professor on RateMyProfessors.com—they are self-selecting when doing so.

49
Q

When external validity is a high priority

A

In a frequency claim, external validity is a priority.
- External validity is extremely important when making frequency claims because you are reporting on how often something happens in a population.
- external validity concerns both samples and settings.
e.g. “Does the sample of drivers who were asked about road rage adequately represent American drivers?”
“Can feelings of the Afghan people in the sample generalize to all the people in Afghanistan?”
“Can we predict the results of the election if the polling sample consisted of 1,500 people?”

50
Q

When external validity is a lower priority

A
  • Nonprobability samples in the real world
    Consider whether self-selection affects the results of an online shopping rating, as in the Zappos.com headline “61% said this shoe felt true to size.” You can be pretty sure the people who rated the fit of these shoes are self-selected and therefore don’t represent all the people who own that model. Are the feet of the raters likely to be very different from those of the general population? Probably not, so their opinions about the fit of the shoes might generalize. If you believe this self-selected sample is similar to everyone else who bought the shoes, their ratings might be accurate for you after all.
  • Nonprobability samples in research studies
    e.g. recall the 30 dual-earner families who allowed the researchers to videotape their evening activities. Only certain kinds of families will let researchers walk around the house and record everyone’s behavior. Would this self-selection affect the conclusions of the study? The researchers may have to live with some uncertainty about the generalizability of their data. However, our uncertainty does not mean that the results are wrong or even uninteresting.
51
Q

Importance of sample size

A

Large samples are not necessarily more representative than smaller samples.
- When a phenomenon is rare, we do need a large random sample in order to locate enough instances of that phenomenon for valid statistical analysis.
But for most variables, when researchers are striving to generalize from a sample to a population, the size of the sample is much less important than how that sample was selected. When it comes to the external validity of the sample, it’s how, not how many.
- The margin of error is a statistic that sets up the confidence interval for a study’s estimate; the larger the sample size, the smaller the margin of error. That’s why many researchers consider a sample of about 1,000 to be an optimal balance: a randomly selected sample of that size can generalize to the whole population with a reasonably small margin of error.
- In effect, sample size is not an external validity issue; it is a statistical validity issue.
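A minimal sketch of why about 1,000 is a common polling target, using the conservative 95% approximation (margin of error for a proportion is roughly 1/sqrt(n) in the worst case, p = 0.5):

```python
from math import sqrt

for n in (100, 500, 1000, 5000):
    moe = 1 / sqrt(n)  # conservative 95% margin of error for a proportion
    print(f"n = {n:>5}: margin of error ~ +/-{moe:.1%}")
# n = 1000 gives roughly +/-3 percentage points; quadrupling n only halves
# the margin, which is why selection method matters more than sheer size.
```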

52
Q

Loaded question

A

A trick question that presupposes at least one unverified assumption the person being questioned is likely to disagree with. For example, the question “Have you stopped mistreating your pet?” is a loaded question, because it presupposes that you have been mistreating your pet.

53
Q

Visual cliff - Depth Perception? The goal and results?

A

Skills needed to assess depth and spatial relationships.
- The goal of this research was to examine whether depth perception is learned or innate.
- The researchers were unable to answer that question: is depth perception learned or innate?
- Depth perception may represent a combination of inborn and learned abilities.
- Depth perception may be present at birth, but fear of falling and avoidance of danger are learned through experience.
- Babies used social referencing, taking the mother’s expressions and verbal directions as cues to proceed.