Reading Key Points Flashcards

1
Q

Two key features of an Experiment?

A
• Manipulation: Researchers manipulate, or systematically vary, the level of the independent variable. The different levels of the independent variable are called conditions.
• Control: The researcher controls, or minimises the variability in, variables other than the independent and dependent variable. These other variables are called extraneous variables.
*They manipulate the independent variable by systematically changing its levels and control other variables by holding them constant.
2
Q

Manipulation of the Independent Variable:

A

• To manipulate an independent variable means to change its level systematically, so that different groups of participants are exposed to different levels of that variable (between-subjects) or the same group of participants is exposed to different levels at different times (within-subjects).
• The different “levels” of an independent variable are called conditions.
• Manipulation involves the active intervention of the researcher (i.e., the researcher produces the difference between the groups; it is not an existing subject factor on which they already differ).
• Manipulating the IV helps eliminate third-variable problems because researchers work to ensure that the only difference between the groups is the level of the IV to which they are exposed.
• It is sometimes unethical to manipulate the IV, and then an experiment cannot be conducted (e.g., medical studies, or inducing emotions, traits, or behaviours that would harm or distress participants).
• The IV is a construct, which is measured indirectly through our operational variables.
• A manipulation check is a separate measure of the construct used to verify that the researchers have successfully manipulated the variable (e.g., self-reported stress and blood pressure as checks on a stress manipulation).
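A manipulation check can be sketched in a few lines of code. The numbers below are invented purely for illustration; a real check would use an inferential test (e.g., a t-test) rather than a raw mean difference.

```python
# Hypothetical manipulation check for a stress manipulation:
# compare mean self-reported stress (1-10 scale) across conditions.
# All data below are invented for illustration.

stress_condition = [7, 8, 6, 9, 7, 8]   # given a stressful task
control_condition = [3, 2, 4, 3, 2, 4]  # given a neutral task

def mean(scores):
    return sum(scores) / len(scores)

# A clearly higher mean in the stress condition suggests the
# manipulation worked (7.5 vs 3.0 here).
difference = mean(stress_condition) - mean(control_condition)
print(difference)  # → 4.5
```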

3
Q

Control of Extraneous Variables:

A

• An extraneous variable is anything that varies in the context of a study other than the independent and dependent variables.
• They include individual differences as well as situational and task variables.
• They pose a problem because they are likely to exert an effect on the dependent variable.
• This makes it harder to separate the effects of the IV from those of extraneous variables (i.e., confounds). Therefore, researchers control these extraneous variables by keeping them constant.

Extraneous Variables as “Noise”
• Extraneous variables make it hard to detect the effects of the IV in two ways. The first is by adding variability or “noise” to the data, which makes it harder to detect the effect of the IV on the DV.
• One way to control for extraneous variables is to keep them constant: keeping situation and task variables equal across conditions, using a standardised format, providing the same tools, standardising interactions with participants, or applying inclusion/exclusion criteria to participants.
• Putting restrictions on participant inclusion limits the external validity of the study (i.e., how generalisable it is to the population).
• In many studies the importance of having a representative sample outweighs the benefits of minimising noise.

Extraneous Variables as Confounds
• The second way extraneous variables make it hard to detect the effects of the IV is that they can act as confounds.
• Confounding variables are extraneous variables that differ on average across the levels of the independent variable(s) (e.g., intelligence, if there is not an equal mix of high- and low-IQ participants in each condition).
• To confound means to confuse: confounds provide alternative explanations for the effect found on the DV that cannot always be ruled out or disproven.
• One way to avoid confounds is to hold extraneous variables constant.
• Another way is to randomly assign participants to conditions.
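Random assignment can be sketched as follows (participant IDs and group sizes are made up for illustration):

```python
import random

# Sketch of simple random assignment to two conditions.
# Because group membership is determined by chance, extraneous
# participant variables are unlikely to differ systematically
# across conditions, so they are unlikely to become confounds.

participants = [f"P{i:02d}" for i in range(1, 21)]  # 20 hypothetical IDs

random.seed(42)  # fixed seed only so the example is reproducible
shuffled = participants[:]
random.shuffle(shuffled)

# Split the shuffled list in half: equal group sizes, random membership.
assignment = {"treatment": shuffled[:10], "control": shuffled[10:]}
```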

4
Q

Summary:
Chapter 6
Experimental design

A

 An experiment is a type of empirical study that features the manipulation of an independent variable, the measurement of a dependent variable, and control of extraneous variables.
 An extraneous variable is anything that varies in the context of a study other than the independent and dependent variables. Extraneous variables make it difficult to detect the effect of the independent variable because they add variability or “noise” to the data.
 A confounding variable is an extraneous variable that differs on average across levels of the independent variable. Because they differ across levels of the independent variable, confounding variables provide an alternative explanation for an effect on the dependent variable.
 Experiments can be conducted using either between-subjects or within-subjects designs. Deciding which to use in a particular situation requires careful consideration of the pros and cons of each approach.
 Random assignment to conditions in between-subjects experiments or to orders of conditions in within-subjects experiments is a fundamental element of experimental research. Its purpose is to control extraneous variables so that they do not become confounding variables.
 Experimental research on the effectiveness of a treatment requires both a treatment condition and a control condition, which can be a no-treatment control condition, a placebo control condition, or a wait-list control condition. Experimental treatments can also be compared with the best available alternative.
 Studies are high in internal validity to the extent that the way they are conducted supports the conclusion that the independent variable caused any observed differences in the dependent variable. Experiments are generally high in internal validity because of the manipulation of the independent variable and control of extraneous variables.
 Studies are high in external validity to the extent that the result can be generalised to people and situations beyond those actually studied. Although experiments can seem “artificial”—and low in external validity—it is important to consider whether the psychological processes under study are likely to operate in other people and situations.
 There are several effective methods you can use to recruit research participants for your experiment, including through formal participant pools, advertisements, and personal appeals. Field experiments require well-defined participant selection procedures.
 It is important to standardise experimental procedures to minimise extraneous variables, including experimenter expectancy effects.
 It is important to conduct one or more small-scale pilot tests of an experiment to be sure that the procedure works as planned.

5
Q

Summary:
Chapter 1
What is science

A

 There is a history of biased research that was labelled science being used as a tool to justify European colonisation and other injustices. It is important that psychological science is conducted ethically and free from bias.
 Science is a general way of understanding the natural world. Its three fundamental features are systematic empiricism, empirical questions, and public knowledge.
 Scientific psychology takes the scientific approach to understanding human behaviour.
 Pseudoscience refers to beliefs and activities that are claimed to be scientific but lack one or more of the three features of science. It is important to distinguish the scientific approach to understanding human behaviour from the many pseudoscientific approaches.
 Research in psychology can be described by a simple cyclical model. A research question based on the research literature leads to an empirical study, the results of which are published and become part of the research literature.
 Scientific research in psychology is conducted mainly by people with doctoral degrees in psychology and related fields, most of whom are university academic staff. They do so for professional and for personal reasons, as well as to contribute to scientific knowledge about human behaviour.
 Basic research is conducted to learn about human behaviour for its own sake, and applied research is conducted to solve some practical problem. Both are valuable, and the distinction between the two is not always clear-cut.
 People’s intuitions about human behaviour, also known as folk psychology, often turn out to be wrong. This is one primary reason that psychology relies on science rather than common sense.
 Researchers in psychology cultivate certain critical-thinking attitudes. One is scepticism. They search for evidence and consider alternatives before accepting a claim about human behaviour as true. Another is tolerance for uncertainty. They withhold judgement about whether a claim is true or not when there is insufficient evidence to decide.
 The scientific approach to psychology has tended to view the individual as an isolated unit and critics have argued that has resulted in psychology research overlooking social issues. Social constructionism and mātauranga Māori embed social connections as a fundamental part of our psychology.
 The clinical practice of psychology—the assessment and treatment of psychological problems—is one important application of the scientific discipline of psychology.
 Scientific research is relevant to clinical practice because it provides detailed and accurate knowledge about psychological problems.

6
Q

Summary:
Chapter 5
What is measurement

A

 Measurement is the assignment of scores to individuals so that the scores represent some characteristic of the individuals. Psychological measurement can be achieved in a wide variety of ways, including self-report, behavioural, and physiological measures.
 Psychological constructs such as intelligence, self-esteem, and depression are variables that are not directly observable because they represent behavioural tendencies or complex patterns of behaviour and internal processes. An important goal of scientific research is to conceptually define psychological constructs in ways that accurately describe them.
 For any conceptual definition of a construct, there will be many different operational definitions or ways of measuring it. The use of multiple operational definitions, or converging operations, is a common strategy in psychological research.
 Variables can be measured at four different levels—nominal, ordinal, interval, and ratio—that communicate increasing amounts of quantitative information. The level of measurement affects the kinds of statistics you can use and conclusions you can draw from your data.
 Psychological researchers do not simply assume that their measures work. Instead, they conduct research to show that they work. If they cannot show that they work, they stop using them.
 There are two distinct criteria by which researchers evaluate their measures: reliability and validity. Reliability is consistency across time (test-retest reliability), across items (internal consistency), and across researchers (interrater reliability). Validity is the extent to which the scores actually represent the variable they are intended to.
 Validity is a judgement based on various types of evidence. The relevant evidence includes the measure’s reliability, whether it covers the construct of interest, and whether the scores it produces are correlated with other variables they are expected to be correlated with and not correlated with variables that are conceptually distinct.
 The reliability and validity of a measure is not established by any single study but by the pattern of results across multiple studies. The assessment of reliability and validity is an ongoing process.
 Good measurement begins with a clear conceptual definition of the construct to be measured. This is accomplished both by clear and detailed thinking and by a review of the research literature.
 You often have the option of using an existing measure or creating a new measure. You should make this decision based on the availability of existing measures and their adequacy for your purposes.
 Several simple steps can be taken in creating new measures and in implementing both existing and new measures that can help maximise reliability and validity.
 Once you have used a measure, you should re-evaluate its reliability and validity based on your new data. Remember that the assessment of reliability and validity is an ongoing process.

7
Q

 Pseudoscience:

A

o Refers to work that claims to be, and at first glance appears to be, science but does not adopt rigorous scientific methodology and bases its claims on anecdotal evidence.
o E.g., trepanation or biorhythm theory:
o The idea is that people’s physical, intellectual, and emotional abilities run in cycles that begin when they are born and continue until they die. Allegedly, the physical cycle has a period of 23 days, the intellectual cycle a period of 33 days, and the emotional cycle a period of 28 days.
 Criteria for Pseudoscience:
o A set of beliefs or activities can be said to be pseudoscientific if (a) its adherents claim or imply that it is scientific but (b) it lacks one or more of the three features of science.
o i.e., empirically tested, falsifiable hypothesis (observations can provide evidence to support or disprove claim), public knowledge.
 Why care about pseudoscience?
o Pseudoscience helps highlight the core components of science and why they are important.
o The acceptance of false beliefs can have detrimental effects on society, and learning about pseudoscience can help us evaluate theories and spot false claims.
o Pseudopsychology exists, and it is important for students of psychology to know this.
 Examples:
o Cryptozoology:
 The study of “hidden” creatures like Bigfoot, the Loch Ness monster, and the chupacabra.
o Pseudoscientific psychotherapies:
 Past-life regression, rebirthing therapy, and bioscream therapy, among others.
o Homeopathy:
 The treatment of medical conditions using natural substances that have been diluted sometimes to the point of no longer being present.
o Pyramidology:
 Odd theories about the origin and function of the Egyptian pyramids (e.g., that they were built by extraterrestrials) and the idea that pyramids in general have healing and other special powers.

8
Q

Can We Only Rely on Common Sense?

A

 Folk psychology: intuitive beliefs about people’s thoughts, feelings, and behaviour.
 Intuition can be inaccurate, and scientific research can often disprove such claims:
 E.g., false confessions are common, and yelling and screaming to vent anger only makes you angrier.
 E.g., myths.
How Could We Be So Wrong?
 How can our intuitive beliefs be so wrong?
 Forming detailed and accurate beliefs requires powers of observation, memory, and analysis to an extent that we do not naturally possess.
 Thus, we tend to rely on mental shortcuts or heuristics (e.g., confirmation bias, believing something is likely to be true because it is endorsed by others or experts, or believing a myth because it would benefit us if it were true).
 Psychologists are just as prone to false beliefs, so we work on cultivating a level of scepticism when we digest new information (i.e., we consider its reliability, validity, and alternative explanations).
 Scientists also cultivate a tolerance for uncertainty. They accept that there are many things that they simply do not know.

9
Q

 An operational definition is a definition of a variable in terms of precisely how it is to be measured.
 These measures generally fall into one of three broad categories:

A

o Self-report measures are those in which participants report on their own thoughts, feelings, and actions, as with the Rosenberg Self-Esteem Scale.
o Behavioural measures are those in which some other aspect of participants’ behaviour is observed and recorded. This is an extremely broad category that includes the observation of people’s behaviour both in highly structured laboratory tasks and in more natural settings.
o Physiological measures are those that involve recording any of a wide variety of physiological processes, including heart rate and blood pressure, galvanic skin response, hormone levels, and electrical activity and blood flow in the brain.
o For any given variable or construct, there will be multiple operational definitions:
 When psychologists use multiple operational definitions of the same construct—either within a study or across studies—they are using converging operations.
 The idea is that the various operational definitions are “converging” or coming together on the same construct.
 When scores based on several different operational definitions are closely related to each other and produce similar patterns of results, this constitutes good evidence that the construct is being measured effectively and that it is useful.
 This is what allows researchers eventually to draw useful general conclusions, such as “stress is negatively correlated with immune system functioning”, as opposed to more specific and less useful ones, such as “people’s scores on the Perceived Stress Scale are negatively correlated with their white blood counts”.

10
Q

Psychological Constructs:
 Some variables are straightforward to measure like demographic information such as sex, age, height, weight or birth order.
 Most variables are not straightforward or simple to measure. These variables are called “constructs” and include personality traits, emotional states, attitudes and abilities.

A

 Psychological constructs cannot be observed directly. One reason is that they often represent tendencies to think, feel, or act in certain ways (i.e., that varies across settings) or that they are internal states that are not directly observable.
 The conceptual definition of a psychological construct describes the behaviours and internal processes that make up the construct and its related variables.
o E.g., neuroticism can be conceptually defined as “people’s tendency to experience negative emotions such as anxiety, anger, and sadness across a variety of situations. It has a strong genetic component, is a stable trait, and is positively correlated with the tendency to experience pain or other physical symptoms”.
o Psychologists write definitions which are more detailed and precise than the dictionary and allow us to test them empirically and refine them if needed.

11
Q
Levels of Measurement:
o Stevens (1946) suggested four different levels of measurement (which he called “scales of measurement”) that correspond to four different levels of quantitative information that can be communicated by a set of scores: nominal, ordinal, interval, and ratio.
A

 Nominal:
• Categorical data that indicate whether participants are members of a certain category (e.g., male/female, old/young, high/low self-esteem, ethnicity, favourite colour).
• The lowest level of measurement: it merely categorises responses and does not imply any rank or order among them.
 Ordinal:
• Assigning scores so that they represent the rank order of the individuals.
• Ranks provide information about whether individuals are in the same category or not, and about who is higher or lower on the variable (e.g., consumer satisfaction rankings).
• Missing information:
o The intervals between ranks or points on the scale cannot be assumed to be equal.
 Interval:
• Assigning scores using a numerical scale where there is equal distance between each point on the scale (i.e., Degrees Celsius or IQ).
• The scale does not have a true zero point (zero is arbitrary; it does not communicate the absence of the trait).
• Communicating interval data as if it were ratio data does not make sense (e.g., “20 degrees is twice as hot as 10 degrees” is an unsupported statement).
 Ratio:
• True point of zero exists and indicates the absence of the trait (i.e., height, weight, number of correct answers on an exam, Kelvin scale, money).
• Accumulates aspects of the other three scales: 1) nominal, by providing the category of each object; 2) ordinal, in that objects can be ranked; 3) interval, in that there is equal distance between each point on the scale; and 4) ratio, in that ratios between places on the scale also have equivalent meanings.
o Why are levels of measurement important?
 They emphasise the generality of the concept of measurement: there are four different levels, each with its own features and uses.
 They serve as a rough guide on the statistical procedures which can be conducted based on the level of measurement you have and the conclusions you can make.
• Nominal = mode only.
• Ratio = ratio statements such as “twice as big”.
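As a rough sketch of how the level of measurement constrains the statistics you can use (data invented purely for illustration):

```python
from statistics import mean, median, mode

# Illustrative data at each level of measurement.
favourite_colour = ["blue", "red", "blue", "green", "blue"]  # nominal
satisfaction_rank = [1, 2, 2, 3, 4]                          # ordinal
iq_scores = [95, 100, 105, 110, 120]                         # interval
reaction_times_ms = [250, 300, 500, 600]                     # ratio

print(mode(favourite_colour))     # nominal: mode is the only option
print(median(satisfaction_rank))  # ordinal: median/mode, not mean
print(mean(iq_scores))            # interval: mean is meaningful
print(reaction_times_ms[2] / reaction_times_ms[0])  # ratio: "twice as slow"
```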

12
Q

Reliability:
 Reliability refers to the consistency of a measure. Psychologists consider three types of consistency: over time (test-retest reliability), across items (internal consistency), and across different researchers (inter-rater reliability).

A
  1. Test-Retest Reliability:
     The extent to which measurements are consistent over time. It’s important to consider whether the trait being studied is stable or unstable (mood is unstable over days but should be stable over a month; intelligence and self-esteem are expected to be more stable).
     Assessing test-retest reliability requires using the measure on a group of people at one time, using it again on the same group of people at a later time, and then looking at the test-retest correlation between the two sets of scores. This is typically done by graphing the data in a scatterplot and computing Pearson’s r.
     In general, a test-retest correlation of +.80 or greater is considered to indicate good reliability.
  2. Internal Consistency:
     The consistency of people’s responses across the items on a multiple-item measure.
     In general, all the items on such measures are supposed to reflect the same underlying construct, so people’s scores on those items should be correlated with each other.
     If people’s responses to the different items are not correlated with each other, then it would no longer make sense to claim that they are all measuring the same underlying construct.
     This is as true for behavioural and physiological measures as for self-report measures.
     Like test-retest reliability, internal consistency can only be assessed by collecting and analysing data. One approach is to look at a split-half correlation. This involves splitting the items into two sets, such as the first and second halves of the items or the even- and odd-numbered items. Then a score is computed for each set of items, and the relationship between the two sets of scores is examined. A split-half correlation of +.80 or greater is generally considered good internal consistency.
     The most common measure is Cronbach’s α (the Greek letter alpha); a value of +.80 or greater is generally taken to indicate good internal consistency.
  3. Inter-Rater Reliability:
     Many behavioural measures involve significant judgement on the part of an observer or a rater. Inter-rater reliability is the extent to which different observers are consistent in their judgements. Different observers’ ratings should be highly correlated with each other.
     Inter-rater reliability is often assessed using Cronbach’s α when the judgements are quantitative or an analogous statistic called Cohen’s κ (the Greek letter kappa) when they are categorical.
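The reliability coefficients above can be sketched with stdlib Python. The data are invented; a real analysis would use a stats package.

```python
from statistics import pvariance

# Made-up scores, purely for illustration.

def pearson_r(x, y):
    """Pearson correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Test-retest reliability: same measure, same people, two occasions.
time1 = [10, 12, 15, 18, 20]
time2 = [11, 13, 14, 19, 21]
retest_r = pearson_r(time1, time2)  # near +1 -> good reliability

def cronbach_alpha(items):
    """items: one list of scores per questionnaire item (same people)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    return (k / (k - 1)) * (1 - sum(pvariance(i) for i in items)
                            / pvariance(totals))

# Internal consistency of a hypothetical 3-item scale (4 respondents).
alpha = cronbach_alpha([[3, 4, 5, 2], [3, 5, 5, 1], [4, 4, 5, 2]])
```

Both coefficients come out above the +.80 rule of thumb for this toy data, which is what "good reliability" looks like numerically.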
13
Q

Validity:

 Validity is the extent to which the scores from a measure represent the variable they are intended to.

A

 Face Validity:
• Face validity is the extent to which a measurement method appears “on its face” to measure the construct of interest.
• Face validity is at best a very weak kind of evidence that a measurement method is measuring what it is supposed to. One reason is that it is based on people’s intuitions about human behaviour, which are frequently wrong.
• It is not the participants’ literal answers to these questions that are of interest, but rather whether the pattern of the participants’ responses to a series of questions matches those of individuals who tend to suppress their aggression.
 Content Validity:
• Content validity is the extent to which a measure “covers” the entire construct of interest.
• For example, if a researcher conceptually defines test anxiety as involving both sympathetic nervous system activation (leading to nervous feelings) and negative thoughts, then his measure of test anxiety should include items about both nervous feelings and negative thoughts. Or consider that attitudes are usually defined as involving thoughts, feelings, and actions toward something.
• By this conceptual definition, a person has a positive attitude toward exercise to the extent that they think positive thoughts about exercising, feel good about exercising, and actually exercise. So, to have good content validity, a measure of people’s attitudes toward exercise would have to reflect all three of these aspects.
• Like face validity, content validity is not usually assessed quantitatively. Instead, it is assessed by carefully checking the measurement method against the conceptual definition of the construct.
 Criterion Validity:
• Criterion validity is the extent to which people’s scores on a measure are correlated with other variables (known as criteria) that one would expect them to be correlated with. For example, people’s scores on a new measure of test anxiety should be negatively correlated with their performance on an important school exam.
• A criterion can be any variable that one has reason to think should be correlated with the construct being measured, and there will usually be many of them. For example, one would expect test anxiety scores to be negatively correlated with exam performance and course grades and positively correlated with general anxiety and with blood pressure during an exam.
• When the criterion is measured at the same time as the construct, criterion validity is referred to as concurrent validity; however, when the criterion is measured at some point in the future (after the construct has been measured), it is referred to as predictive validity (because scores on the measure have “predicted” a future outcome).
• Criteria can also include other measures of the same construct. For example, one would expect new measures of test anxiety or physical risk taking to be positively correlated with existing measures of the same constructs. This is known as convergent validity.
 Discriminant Validity:
• Discriminant validity, on the other hand, is the extent to which scores on a measure are not correlated with measures of variables that are conceptually distinct.
• For example, self-esteem is a general attitude toward the self that is fairly stable over time. It is not the same as mood, which is how good or bad one happens to be feeling right now. So people’s scores on a new measure of self-esteem should not be very highly correlated with their moods.

14
Q

Deciding on an Operational Definition:
 Using an Existing Measure:
 Creating your Own Measure:

A

 Using an Existing Measure:
• The advantages of using an existing measure that has been validated in previous literature are:
 you save the time and trouble of creating your own
 there is already some evidence that the measure is valid (if it has been used successfully),
 your results can more easily be compared with and combined with previous results.
• If you choose to use an existing measure, you may still have to choose among several alternatives. You might choose the most common one, the one with the best evidence of reliability and validity, the one that best measures a particular aspect of a construct that you are interested in, or even the one that would be easiest to use.
• You want to use the full scale from the original citation.
 Creating your Own Measure:
• Instead of using an existing measure, you might want to create your own. Perhaps there is no existing measure of the construct you are interested in or existing ones are too difficult or time-consuming to use. Or perhaps you want to use a new measure specifically to see whether it works in the same way as existing measures—that is, to evaluate convergent validity.
• Issues when creating your own behavioural, self-report or physiological scale:
 be aware that most new measures in psychology are really variations of existing measures, so you should still look to the research literature for ideas. Perhaps you can modify an existing questionnaire, create a paper-and-pencil version of a measure that is normally computerised (or vice versa), or adapt a measure that has traditionally been used for another purpose.
 When you create a new measure, you should strive for simplicity. Remember that your participants are not as interested in your research as you are and that they will vary widely in their ability to understand and carry out whatever task you give them. You should create a set of clear instructions using simple language that you can present in writing or read aloud (or both). It is also a good idea to include one or more practice items so that participants can become familiar with the task, and to build in an opportunity for them to ask questions before continuing. It is also best to keep the measure brief to avoid boring or frustrating your participants to the point that their responses start to become less reliable and valid.
 The need for brevity, however, needs to be weighed against the fact that it is nearly always better for a measure to include multiple items rather than a single item. There are two reasons for this. One is a matter of content validity. Multiple items are often required to cover a construct adequately. The other is a matter of reliability. People’s responses to single items can be influenced by all sorts of irrelevant factors—misunderstanding the particular item, a momentary distraction, or a simple error such as checking the wrong response option. But when several responses are summed or averaged, the effects of these irrelevant factors tend to cancel each other out to produce more reliable scores. Remember, however, that multiple items must be structured in a way that allows them to be combined into a single overall score by summing or averaging.
 Finally, the very best way to assure yourself that your measure has clear instructions, includes sufficient practice, and is an appropriate length is to test several people. (Family and friends often serve this purpose nicely). Observe them as they complete the task, time them, and ask them afterwards to comment on how easy or difficult it was, whether the instructions were clear, and anything else you might be wondering about.
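The reliability argument for multiple items can be sketched as a toy simulation (all numbers invented; "true score plus random error" is the standard classical test theory assumption):

```python
import random

# Toy simulation of why multiple items beat a single item:
# each observed response = true score + random error, so averaging
# several items cancels much of the error.

random.seed(1)  # fixed seed only for reproducibility
true_scores = [random.gauss(0, 1) for _ in range(200)]

def observed(true, n_items):
    # Average of n noisy item responses for one participant.
    return sum(true + random.gauss(0, 1) for _ in range(n_items)) / n_items

one_item = [observed(t, 1) for t in true_scores]
ten_items = [observed(t, 10) for t in true_scores]

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sum((a - mx) ** 2 for a in x) ** 0.5
                  * sum((b - my) ** 2 for b in y) ** 0.5)

# The 10-item average tracks the true scores more closely.
print(pearson_r(true_scores, one_item) < pearson_r(true_scores, ten_items))
```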

15
Q

Evaluating the Measure:

A

 In most research designs, it is not possible to assess test-retest reliability because participants are tested at only one time.
 It is also customary to assess internal consistency for any multiple-item measure—usually by reporting Cronbach’s α.
 Convergent and discriminant validity can be assessed in various ways. For example, if your study included more than one measure of the same construct or measures of conceptually distinct constructs, then you should look at the correlations among these measures to be sure that they fit your expectations. Note also that a successful experimental manipulation also provides evidence of criterion validity.
 Recall that MacDonald and Martineau manipulated participants’ moods by having them think either positive or negative thoughts, and after the manipulation their mood measure showed a distinct difference between the two groups. This simultaneously provided evidence that their mood manipulation worked and that their mood measure was valid.
 But what if your newly collected data cast doubt on the reliability or validity of your measure? The short answer is that you have to ask why. It could be that there is something wrong with your measure or how you administered it. It could be that there is something wrong with your conceptual definition. It could be that your experimental manipulation failed. For example, if a mood measure showed no difference between people whom you instructed to think positive versus negative thoughts, maybe it is because the participants did not actually think the thoughts they were supposed to or that the thoughts did not actually affect their moods. In short, it is “back to the drawing board” to revise the measure, revise the conceptual definition, or try a new manipulation.
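The Cronbach’s α mentioned above can be computed by hand from item scores. Below is a minimal sketch in Python; the item scores and the function name are invented for illustration, not taken from the text. It implements the standard formula α = k/(k−1) × (1 − Σvar_item / var_total).

```python
# Hedged sketch: Cronbach's alpha for a multiple-item measure.
# The item scores below are made-up illustration data.
from statistics import pvariance

def cronbach_alpha(items):
    """items: one inner list of scores per item (same participants in each)."""
    k = len(items)
    item_vars = sum(pvariance(item) for item in items)          # sum of item variances
    totals = [sum(scores) for scores in zip(*items)]            # per-participant totals
    total_var = pvariance(totals)                               # variance of totals
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Three items answered by five participants (rows = items).
items = [
    [3, 4, 2, 5, 4],
    [3, 5, 2, 4, 4],
    [2, 4, 3, 5, 5],
]
print(round(cronbach_alpha(items), 2))  # → 0.89
```

Values near .80 or above are conventionally read as acceptable internal consistency, though the cutoff is a rule of thumb, not a law.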

16
Q

What is an Experiment?
 An experiment is designed specifically to answer the question of whether there is a causal relationship between two variables (i.e., whether changes in an independent variable cause changes in a dependent variable).
 Two key features:

A

 Two key features:
• Manipulation:
 Researchers manipulate, or systematically vary, the level of the independent variable. The different levels of the independent variable are called conditions.
• Control:
 The researcher controls, or minimises the variability in, variables other than the independent and dependent variable. These other variables are called extraneous variables.
*They manipulate the independent variable by systematically changing its levels and control other variables by holding them constant.

Manipulation of the Independent Variable:
 To manipulate an independent variable means to change its level systematically so that different groups of participants are exposed to different levels of that variable (between-subjects), or the same group of participants is exposed to different levels at different times (within-subjects).
 The different “levels” of an independent variable are called conditions.
 Manipulation involves the active intervention of the researcher (i.e., the researcher produces the difference between the groups; it is not an existing subject variable on which they already differ).
 Manipulation of the IV eliminates third-variable problems because researchers ensure that the only difference between the groups is the level of the IV to which the experimenter exposes them.
 It is sometimes unethical to manipulate the IV, and an experiment cannot be conducted (e.g., medical studies, or inducing emotions, traits, or behaviours that would cause harm or distress to participants).
 The IV is a construct that is manipulated indirectly through our operational variables.
 A manipulation check is a separate measure of the construct used to verify that the researchers have successfully manipulated the variable (e.g., self-reported stress and blood pressure as checks on a stress manipulation).

Control of Extraneous Variables:
 An extraneous variable is anything that varies in the context of a study other than the independent and dependent variables.
 They include individual-difference variables as well as situational and task variables.
 They pose a problem because they are likely to exert an effect on the dependent variable.
 This makes it harder to separate the effects of the IV from those of extraneous variables (i.e., confounds). Therefore, researchers control extraneous variables by keeping them constant.

Extraneous Variables as “Noise”
 Extraneous variables make it harder to detect the effects of the IV in two ways. The first is by adding variability, or “noise”, to the data, which obscures the effect of the IV on the DV.
 One way to control extraneous variables is to keep them constant.
 This can be done by keeping situation and task variables equal across conditions: using a standardised format, providing the same tools, interacting with participants in the same way, or applying inclusion/exclusion criteria to participants.
 Restricting participant inclusion limits the external validity of the study (i.e., how generalisable it is to the population).
 In many studies the importance of having a representative sample outweighs the benefits of minimising noise.

Extraneous Variables as Confounds
 The second way extraneous variables make it hard to detect the effects of the IV is by acting as confounds.
 A confounding variable is an extraneous variable that differs on average across the levels of the independent variable[s] (e.g., intelligence, if there is not an equal mix of high- and low-IQ participants in each condition).
 To confound means to confuse: confounds provide alternative explanations for the effect found on the DV that cannot always be ruled out or disproven.
 One way to avoid confounds is to hold extraneous variables constant.
 Another way is to randomly assign participants to conditions.

17
Q

Random Assignment:
 The process of assigning participants to conditions by chance rather than by the researcher’s or the participants’ choice.
 Random assignment should meet two criteria:

A

 Random assignment should meet two criteria:
1. Participants should have equal chance of being assigned to each condition (i.e., 50-50 between two conditions).
2. Each participant is assigned to a condition independently of other participants.
 Flip a coin.
 Assign numbers to participants and use a random number generator to assign participants to conditions.
 Use computer software to do random assignment for you.
 Random assignment does not guarantee that extraneous variables are equivalent across conditions; equivalence depends on chance.
 It works well, but it works best with larger samples.
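The two criteria above can be sketched in a few lines of Python. This is a minimal illustration; the participant labels and the seed are invented for the example.

```python
# Hedged sketch: simple random assignment to two conditions.
# Each participant gets an equal (50-50) chance of either condition,
# independently of every other participant -- the two criteria above.
import random

random.seed(42)  # fixed seed only so the example is reproducible

participants = [f"P{i:02d}" for i in range(1, 11)]
assignment = {p: random.choice(["treatment", "control"]) for p in participants}

# Note: with small samples, chance can still leave the groups unequal in
# size or in extraneous variables; random assignment works best with
# larger samples, as the card says.
```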

18
Q

(4) control conditions:

A
  1. No-treatment control:
    - Where participants receive no treatment at all.
    - Limitation is the possible placebo effect, where a positive effect (physical or psychological) similar to the treatment can occur when participants are given a placebo (a non-active substance).
    - Their expectation that it will work drives this effect.
    - This makes the effects of the IV harder to detect.
  2. Placebo-control condition:
    - One solution to the confound of the placebo effect is to use a placebo-control condition.
    - Where participants receive a placebo that looks identical to the real active treatment.
    - Both placebo-control and treatment groups are expected to improve, but effects of the treatment above those of the placebo are attributed to the treatment and not to participants’ expectations.
    - Requires informed consent: participants consent to being placed in either the control or experimental group and to not knowing which group they are in until the end.
    - Ethical concerns about withholding treatment.
  3. Waitlist Control:
    - In which participants are told that they will receive the treatment but must wait until the participants in the treatment condition have already received it.
    - This disclosure allows researchers to compare participants who have received the treatment with participants who are not currently receiving it but who still expect to improve (eventually).
  4. New Treatment:
    - The last solution to the placebo effect is to leave out the control condition and compare the new treatment with the best currently available treatment.
    - This allows us to ask not only “does it work?” but “does it work better than the treatment currently available?”
19
Q

Carryover Effects and Counterbalancing

A

 The primary disadvantage of a within-subjects design is that it can produce carryover effects (i.e., where participants’ performance in one condition affects their performance in the next condition[s]).
 One type of carryover effect is the practice effect, where people perform better on the task the more practice they have.
 Another is the fatigue effect, where people’s performance gets worse over time because they become tired or bored.
 Another is the context effect, where being tested in one condition changes how participants perceive stimuli or interpret the task in a later condition.
 A within-subjects design also makes it easier for participants to guess the research hypothesis, because they experience all conditions and may act in the way they think the researchers want them to (i.e., demand characteristics and social desirability effects).
 The order of conditions is a confounding variable (i.e., if not adequately counterbalanced, it makes it harder to detect the effects of the IV).
 To remove carryover effects, researchers counterbalance the order in which groups of participants experience the conditions.
 For example if there are three conditions there are six possible orders participants can experience (ABC, BCA, CAB, BAC, CBA, ACB).
 Participants are randomly assigned to one of the six orders.
 It rules out the alternative explanation that the effects on the DV are caused by the order of conditions and makes it possible to detect if there are any carryover effects present.
 Participants experience one condition at a time.
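The complete counterbalancing scheme above (all 3! = 6 orders, with participants randomly assigned to orders) can be sketched as follows; the participant labels and seed are invented for the example.

```python
# Hedged sketch: complete counterbalancing of three conditions.
import itertools
import random

conditions = ["A", "B", "C"]
orders = list(itertools.permutations(conditions))  # all 3! = 6 possible orders

random.seed(7)  # fixed seed only so the example is reproducible
participants = [f"P{i}" for i in range(1, 13)]
# Each participant is randomly assigned to one of the six orders.
assigned_order = {p: random.choice(orders) for p in participants}
```

With four conditions there would be 4! = 24 orders, which is why full counterbalancing becomes impractical and partial schemes (such as Latin squares) are sometimes used instead.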

20
Q

Simultaneous Within-Subjects Designs

A

 Where participants make multiple responses in each condition. For example, judging attractive and unattractive defendants within the same condition in a random and intermixed order.
 For example, testing socially anxious participants’ ability to recall positive and negative adjectives by presenting them with a single list containing both and counting how many of each they were able to remember.

21
Q

 Between-subjects designs have the advantage of:

 Within-subjects designs have the advantage of:

A

 Between-subjects designs have the advantage of:
1. Being conceptually simpler and requiring less testing time with each participant.
2. Avoiding carryover effects and the need for counterbalancing.
 Within-subjects designs have the advantage of:
1. Controlling for extraneous participant variables, which reduces noise in the data and makes it easier to detect the effects of the IV.
 A within-subjects design is generally better if counterbalancing is used to reduce carryover effects, it is feasible within time and monetary constraints, and a comparison between treatment and control groups is not the main focus of the study.
 Within-subjects and between-subjects manipulations can be used in the same experimental design (i.e., a mixed design is a common approach).

22
Q

The Four Big Validities:

A
  1. Internal Validity:
     Correlation doesn’t equal causation, because there is the problem of not knowing the direction of the relationship or whether a third variable causes the effect.
     The purpose of an experiment, however, is to show that two variables are statistically related and to do so in a way that supports the conclusion that the independent variable caused any observed differences in the dependent variable.
     The logic is based on this assumption: If the researcher creates two or more highly similar conditions and then manipulates the independent variable to produce just one difference between them, then any later difference between the conditions must have been caused by the independent variable.
     An empirical study is said to be high in internal validity if the way it was conducted supports the conclusion that the independent variable caused any observed differences in the dependent variable (i.e., the way they are conducted, with the manipulation of the independent variable and the control of extraneous variables, provides strong support for causal conclusions).
  2. External Validity:
     An empirical study is high in external validity if the way it was conducted supports generalising the results to people and situations beyond those actually studied. As a general rule, studies are higher in external validity when the participants and the situation studied are similar to those that the researchers want to generalise to and participants encounter everyday, often described as mundane realism.
     There is a trade-off between mundane realism and psychological realism, and between laboratory and naturalistic studies.
     Some laboratory research on psychological processes (e.g., the objectification of women) is expected to generalise to other people and settings. Studies can also be conducted in the field to increase external validity.
  3. Construct Validity:
     The quality of the manipulations.
     The conversion from research question to experimental design is called operationalisation.
     The number of conditions used influences the construct validity, because it affects how well the operationalised variable[s] are able to answer the research question.
  4. Statistical Validity:
     Number of participants or sample size needs to be large enough to have enough statistical power and the findings be generalisable to the population.
     Sample size speaks to statistical validity because it impacts on whether the conclusions formed support the research question posed.
     Statistics allow us to see whether the predicted relationship between the IV and DV was found in the data.
     The number of conditions and the total number of participants influence statistical power. A power analysis is conducted with this information, together with the expected effect size, to ascertain whether you are likely to detect a real difference.

Prioritising Validities:
 It is often not possible to have high levels of validity in all four areas.
 Most studies have high internal and construct validity and sacrifice external validity.

23
Q

STEPS:

A

Recruiting Participants:
Standardising the Procedure:
 The way to minimise unintended variation in the procedure is to standardise it as much as possible so that it is carried out in the same way for all participants regardless of the condition they are in. Here are several ways to do this:
o Create a written protocol that specifies everything that the experimenters are to do and say from the time they greet participants to the time they dismiss them.
o Create standard instructions that participants read themselves or that are read to them word for word by the experimenter.
o Automate the rest of the procedure as much as possible by using software packages for this purpose or even simple computer slide shows.
o Anticipate participants’ questions and either raise and answer them in the instructions or develop standard answers for them.
o Train multiple experimenters on the protocol together and have them practice on each other.
o Be sure that each experimenter tests participants in all conditions.
o Where possible, use a single-blind study (participants do not know which condition they are in) or a double-blind study (neither the participants nor the experimenters who test them know).

Record Keeping:
 It is typical for experimenters to generate a written sequence of conditions before the study begins and then to test each new participant in the next condition in the sequence. As you test them, it is a good idea to add to this list basic demographic information; the date, time, and place of testing; and the name of the experimenter who did the testing.

Pilot Testing:
 It is always a good idea to conduct a pilot test of your experiment. A pilot test is a small-scale study conducted to make sure that a new procedure works as planned. In a pilot test, you can recruit participants formally (e.g., from an established participant pool) or you can recruit them informally from among family, friends, classmates, and so on. The number of participants can be small, but it should be enough to give you confidence that your procedure works as planned. There are several important questions that you can answer by conducting a pilot test:
o Do participants understand the instructions?
o What kind of misunderstandings do participants have, what kind of mistakes do they make, and what kind of questions do they ask?
o Do participants become bored or frustrated?
o Is an indirect manipulation effective? (You will need to include a manipulation check.)
o Can participants guess the research question or hypothesis?
o How long does the procedure take?
o Are computer programs or other automated procedures working properly?
o Are data being recorded correctly?

24
Q

 The experimenter expectancy effect;

A

Occurs when experimenters’ expectations about the outcome unintentionally influence how they treat participants. For example, if an experimenter expects participants in a treatment group to perform better on a task than participants in a control group, then they might unintentionally give the treatment-group participants clearer instructions, more encouragement, or more time to complete the task.

25
Q

Sample statistics are not perfect estimates of the population. They have a normal level of random variation or noise even if you re-sample from the same population. This is called

A

sampling error
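A quick simulation makes sampling error concrete. The sketch below (illustrative numbers only, not from the text) draws repeated samples from one fixed population and shows that the sample means still differ from one another:

```python
# Hedged sketch: sampling error. Re-sampling from the SAME population
# still produces different sample statistics, purely by chance.
import random
from statistics import mean

random.seed(0)  # fixed seed only so the example is reproducible
# One fixed population of 10,000 scores (IQ-like: mean 100, SD 15).
population = [random.gauss(100, 15) for _ in range(10_000)]

# Five independent samples of 25 people each.
sample_means = [mean(random.sample(population, 25)) for _ in range(5)]
# The five means all differ from one another and from the population
# mean, even though the population never changed.
```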

26
Q

The Null Hypothesis allows us to test whether differences in …

A

the sample reflect a difference in the population or are just due to sampling error.

27
Q

Three steps of Null-Hypothesis Testing:

A
  1. Assume for the moment that the null hypothesis is true (i.e., there is no relationship between the variables in the population).
  2. Determine how likely the sample relationship would be if the null hypothesis were true (test-statistics).
  3. If the sample relationship would be extremely unlikely, then reject the null hypothesis in favour of the alternative hypothesis. If it would not be extremely unlikely, then retain the null hypothesis.
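The three steps above can be illustrated with a simple permutation test in Python. The scores are invented for illustration, and a permutation test is just one of several ways to obtain a p value (t-tests are the more common textbook choice):

```python
# Hedged sketch: the three steps of null-hypothesis testing,
# implemented as a two-group permutation test on invented scores.
import random
from statistics import mean

random.seed(3)  # fixed seed only so the example is reproducible
treatment = [14, 18, 17, 21, 16, 19]
control = [12, 13, 15, 14, 16, 12]
observed = mean(treatment) - mean(control)  # the sample relationship

# Step 1: assume the null hypothesis -- the group labels are arbitrary.
# Step 2: find how often a difference at least as extreme as the
#         observed one arises when the labels really are arbitrary.
pooled = treatment + control
n_resamples = 10_000
extreme = 0
for _ in range(n_resamples):
    random.shuffle(pooled)
    diff = mean(pooled[:6]) - mean(pooled[6:])
    if abs(diff) >= abs(observed):
        extreme += 1
p_value = extreme / n_resamples

# Step 3: reject the null hypothesis if p < alpha (.05), else retain it.
decision = "reject" if p_value < 0.05 else "retain"
```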
28
Q

P-Values:

A

 A crucial step in null hypothesis testing is finding the likelihood of the sample result if the null hypothesis were true. This probability is called the p value.
 A low p value means that the sample result would be unlikely if the null hypothesis were true and leads to the rejection of the null hypothesis.
 A high p value means that the sample result would be likely if the null hypothesis were true and leads to the retention of the null hypothesis. But how low must the p value be before the sample result is considered unlikely enough to reject the null hypothesis?
 In null hypothesis testing, this criterion is called α (alpha) and is usually set to .05. This means that if there is less than a 5% chance of a result as extreme as the sample result if the null hypothesis were true, then the null hypothesis is rejected. When this happens, the result is said to be statistically significant. If there is greater than a 5% chance of a result as extreme as the sample result when the null hypothesis is true, then the null hypothesis is retained. This does not necessarily mean that the researcher accepts the null hypothesis as true—only that there is not currently enough evidence to conclude that it is true. Researchers often use the expression “fail to reject the null hypothesis” rather than “retain the null hypothesis”, but they should never use the expression “accept the null hypothesis”.

My Misunderstanding:
 The most common misinterpretation is that the p value is the probability that the null hypothesis is true—that the sample result occurred by chance. For example, a misguided researcher might say that because the p value is .02, there is only a 2% chance that the result is due to chance and a 98% chance that it reflects a real relationship in the population. But this is incorrect. The p value is really the probability of a result at least as extreme as the sample result if the null hypothesis were true. So a p value of .02 means that if the null hypothesis were true, a sample result this extreme would occur only 2% of the time.
 You can avoid this misunderstanding by remembering that the p value is not the probability that any particular hypothesis is true or false. Instead, it is the probability of obtaining the sample result if the null hypothesis were true.