Experimental Design Flashcards
What does the scientific method allow us to do?
- examine statements with specific methods that use systematic, objective observation
- psychological investigation can be described as systematic observation of aspect(s) of the world, which are then subjected to a series of tests
What is the problem of not using scientific research when investigating matters in psychology?
- we tend to make judgements using our own intuition
- use anecdotal evidence like “everyone knows that”, “my feelings are”, “the authorities say so” - not very powerful form of evidence
- potential of bias?
What are the problems of observation and how can we overcome them using the scientific method?
- many people do not attempt to control or eliminate factors that may influence events that are being observed
- as a result, conclusions that are drawn about behaviour are often incorrect
- there is also a worry that personal issues or beliefs may influence someone’s work and that certain details could be left out
- the scientific method allows scientists to use systematic observation to report their observations objectively - what happened, the mere facts without any personal interpretation
Describe the case of Clever Hans in 1904
- Hans was a German horse who was thought to have amazing talents
- for example, they thought he could complete maths sums
- a commission was set up to examine Hans’ abilities, to see whether animals were clever enough to be taught how to carry out these advanced tasks
- it turned out that this wasn’t the case - Hans was watching for subtle cues from the questioner
- if Hans could not see the questioner or they did not know the correct answer then he did not get it right
What is the problem with research when it comes to concepts?
- we rarely define the concepts behind the words that we use when referring to specific behaviours
- for example anxiety
- the main problem is that many concepts have different meanings for different people
- to tackle this problem, scientists need to define their concepts precisely, using operational definitions
How do we measure observed behaviours?
- by assigning numerical values to them
- this allows them to be summarised and have statistical analyses run on them
- can either be physical (length, weight) or psychological (anxiety, IQ)
What is the ultimate goal for any measurements of observed behaviours?
for them to be VALID and RELIABLE
V = the measure is truly measuring what it claims to be measuring
R = the measure is consistent - can you measure it over time?
What are circular explanations?
- when no form of explanation is given
- basically when the reason for behaviour is the behaviour itself
- not a full explanation of what is going on
Why do we need to adopt this scientific approach?
- because of the human factor in science
- normally we make decisions based on our beliefs and own knowledge
- this can make our results biased
- so scientists adopt a highly critical and sceptical view point
- nothing can be taken for granted unless there is a scientific or some other sort of explanation; simply relying on a gut instinct with regards to a particular behaviour can result in an incorrect, invalid and unreliable conclusion
What is basic research?
- mainly theory driven, less problem driven
- uses basic research questions
- gets a better understanding of how people think, behave and how the mind works
- areas = perception, memory, thinking, language, social behaviour, development, individual differences, biological correlates of behaviour
- all the areas are concerned with the fundamental question of how things work
What is applied research?
- focus on practical problems
- problem solving and less explanation
- eg does a type of therapy work?
- does it eliminate symptoms, rather than why or how does it work to eliminate the symptoms
- areas:
psychodiagnostics - assessing personality
work psychology - work environment, software design
organisational psychology - staff management
abnormal psychology - diagnosis, therapy
Why do we normally just use basic research?
- if we just focus on the practical problems, it’s likely that we may neglect important research areas
- lets us accumulate knowledge and understand how things work
- attempts to provide theories which can then be practically applied - answer questions first and then conduct the research afterwards
- further down the line, there may be more applications compared to the initial discovery
How can we get ideas for research?
- personal experience - Piaget developed his conservation tasks after watching his own child, working in a particular area or place?
- psychological literature - vast source of ideas where you get ideas of what could have been done better, alternatives or different tangents from a particular piece of research
- inconsistencies in or problems with previous research
What is a hypothesis and how do you come about forming one?
- they are tentative explanations of behaviour
- usually contain a statement about the relationship between two (sometimes more) variables
- need to specify the variables / concepts that you are interested in studying
- have to give them operational definitions - how you manipulate or measure them
- CANNOT be circular - have to be previously independent of each other
How are hypotheses generated - Langer and Rodin (1976)
- personal experience, improving performance
- used to work in a nursing home and found that patients wanted to be able to make their own decisions rather than getting told what to do all the time
- hypothesis - well-being will be greater for nursing home residents who are responsible for personal decisions than for nursing home residents who are not responsible
- well-being = ratings of happiness, alertness, activity and sociability
- personal responsibility = residents either (1) made responsible for decisions OR (2) told that staff members were responsible
- patients who were made responsible for their own personal decisions were found to be happier since they were more responsible
What are variables?
- Behaviours that you are measuring
- Can be manipulated as well as being measured
- Need to have operational definitions that tell you how you are measuring the particular variable
- 2 different types - independent and dependent
Define Independent Variable
- the variable that is manipulated by the experimenter
- the explaining / predictor variable
- not always manipulated eg observation
Define Dependent Variable
- the variable that is measured
- the explained / criterion variable
What is verification?
Empirical confirmation of the hypothesis
weaker than proving a statement
What is falsification?
Rejection of a hypothesis on the basis of empirical evidence
weaker than disproving a statement
What is a sample?
- a group of people that we are going to study
- taken as a small subset of a population and assumed to be representative of that population
What is the population?
- all existing members of a group (eg undergraduate students)
- the aim is to generalise from a sample to a population
- problem - is it justified to generalise from a sample to a vast number of people?
- > coin flipping example
Describe the Milgram experiment
- fake electric shock study
- verbal prods and grey lab coat
- around 65% of participants delivered shocks all the way up to 450 V (the highest setting)
- implications - highly criticised, gave a bad name to the discipline
- ethical issues
What is the purpose of ethical guidelines?
- been developed to help prevent any harm (physical or psychological) to participants
- mandatory to submit a proposal of your experiment to a board to explain what you want to do, the potential risks and benefits and how you will be examining it
- nowadays, Milgram’s experiment would have never been allowed to happen
Define competence
- the need to recognise boundaries of competence, limits of expertise etc
- can only do research on humans / animals as long as you are qualified to do so
- for example, if a surgery needs to take place on an animal then you need a vet
Define integrity
- the need to be honest in science, teaching and the practice of psychology
- need to be honest, fair and respectful of others
Define professional and scientific responsibility
- the need to uphold professional standards of conduct and accept appropriate responsibility for your actions
Define respect for people’s rights and dignity
- avoid treating people in a biased manner
- respect their basic human rights to privacy, self-determination and autonomy
Define concern for others’ welfare
- expected to contribute to the welfare of those with whom they interact professionally
- do not exploit or mislead other people during professional relationships
- no deception or make it as ethical as possible
Define social responsibility
- apply and make public your knowledge of psychology in order to contribute to human welfare
- publish your research not only for your career but for the public too so they can know what is going on
Define planning of research
- ethical considerations are central to the planning of the study
- weigh the scientific value of the study against the degree of intrusion into participants (the risks)
- it is mandatory to submit experiment proposals to an ethical committee before any research is conducted
- the committee is made up of more than one person, and together they decide whether the benefits outweigh the risks
Ethics - Kassin and Kiechel (1996)
- the likelihood of a conviction increases when the defendant confesses
- is it possible for people to falsely confess to a crime - when they say that they have committed a crime when they actually have not?
- lots of different factors can contribute to this - false evidence and witnesses, coercive techniques, vulnerable states such as stress, drugs, sleep deprivation and torture
- 75 student participants for a “Reaction Time Task”
- had to type letters as they were read aloud
- told not to hit the ‘alt’ key otherwise the computer would crash and all the data would be lost
- computer was actually rigged so that it would crash after one minute
- would P’s confess to hitting the ‘alt’ key?
- 2 conditions - letters read by the confederate at either 43 letters per minute or 67 letters per minute
- faster rate = more stress for P’s
- other manipulations - the distressed experimenter accused the P of hitting the ‘alt’ key and there was a false witness who agreed with the experimenter
Ethics - Kassin and Kiechel (1996)
RESULTS
- 69% signed a written confession - confessed to a crime they had not committed
- did P’s really believe in their own guilt:
28% told another person that they had ‘ruined’ the experiment
9% made up specific details to explain how they could have hit the key
- fast pace & false witness = 100% signed a confession, 65% believed in their own guilt and 35% made up details to explain their behaviour
Ethics - Kassin and Kiechel (1996)
EVALUATION
Risks to participants?
- could become stressed by the accusations
- upset about being deceived
Addressing the risks:
- situation did not make them seem like bad people
- received a complete debriefing - told about the nature of the experiment, told they did not do anything wrong and they were told that the computer was not damaged at all
- explained why the deception was a necessity in order to study this important social problem
What is deception?
- leading participants to believe that something other than the true IV is involved, or withholding information such that the reality of the investigated situation is masked or distorted
- contradicts the principle of informed consent
What is debriefing?
- informing participants about the full nature and rationale of the study they have experienced
- attempt to reverse any negative influence
- depends on the degree of deception
- may involve some more deception itself (you performed very well)
What do you have to do within an experiment regarding stress and discomfort towards the participants?
- need to guarantee the safety of participants
- have to protect them from harm or discomfort
- not all stress arises from deception - exposure to violent or pornographic film sequences can cause distress as well as sensory deprivation studies
Physical discomfort - pain, hunger, thirst etc
- any experiment / procedure needs to be terminated when discomfort levels are higher than expected or the P is disturbed to an unacceptable level
What is informed consent?
- highly crucial to obtain this - have to gain the P’s consent to actually take part in the experiment
- need to use language that the P will understand - lots of scientific jargon may hide what is actually happening in the experiment itself
- details what you will be doing
- full information about the expected level of discomfort (if any) and a heavy emphasis on the voluntary nature of the study
- reminder that the P has the right to withdraw at any point in the procedure
BUT sometimes there is a need to deceive P’s about the nature of a study (eg Milgram’s study)
What is involuntary participation?
- eg what happens in a field study where someone takes part without agreeing to, or even knowing about, the study itself
Describe scientific fraud
Falsifying data - data needs to be made public & accessible
Plagiarism - taking someone else’s ideas & claiming them as your own
- need to acknowledge the ideas of others
Scepticism - need to be careful if it sounds too good to be true
Describe animal research issues
- need to have the knowledge to justify the procedures
- use the smallest number of animals
- species-specific care for animals (caging, procedures causing discomfort)
- use no members of endangered species if possible - instead use naturalistic studies
- researchers need to be familiar with technical aspects of anaesthesia, pharmacological compounds etc, post-operative checks
What is direct observation?
Behaviour as it happens here and now
eg intervention vs no intervention
What is indirect observation?
Behaviour that has happened in the past
eg archival records or physical traces
What is a target population?
- a particular population that we are interested in
- where we draw samples from
- eg undergraduate psychology students
What is representative sampling?
An abstract ideal because it is practically impossible to get a completely representative sample
- can’t force people to take part in studies
- most people who turn up to take part in experiments are usually motivated to take part - not really representative perhaps?
- is it justified to generalise from a small number of participants to a vast number of participants? (where inferential statistics can help us)
Describe some sampling / selection biases?
- there is often a systematic tendency towards over- or under- representing some groups within a sample
- this means the sample differs systematically from the target population
Describe non-probability / convenience sampling
- selects P’s who are available and willing to participate
- convenience of having your friends or the first 20 students who walk into the lecture
- no guarantee however that each member has a chance of being included in the sample
- P’s who are available and willing to participate - motivated?
- again may be over- or under- represented
- cannot force people to take part so most samples are convenience
Probability sampling - Simple Random Sample
- a sample in which every member of the target population has an equal probability of being selected
- if the sample is random and large enough, it will tend to be representative of the population
- can be done using a computer selection, random number tables, manual selection
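The computer-selection route can be sketched as below. This is only an illustration: the population of 500 ID numbers, the sample size of 20 and the function name are made up for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; every member is equally likely."""
    rng = random.Random(seed)  # seed only makes the example repeatable
    return rng.sample(list(population), n)

# Hypothetical target population: student ID numbers 1 to 500.
population = range(1, 501)
sample = simple_random_sample(population, n=20, seed=1)
print(sorted(sample))
```

Fixing the seed is useful for checking the code, but a real draw would normally leave it unset.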
Probability sampling - Stratified Random Sampling
- also has a random element to it
- where pre-defined groups of people who you want to study are selected at random
- need to have pre-existing knowledge of the population (which groups exist and how big they are)
- specified groups appear in numbers proportional to their size in the target population
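A minimal sketch of proportional stratified sampling, assuming the groups and their sizes are already known. The year groups, their sizes and the function name are hypothetical.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Sample each group at random, in proportion to its size.

    strata: dict mapping a group label to the list of its members.
    Rounding means the total can differ slightly from n.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for label, members in strata.items():
        k = round(n * len(members) / total)  # this group's share of n
        sample[label] = rng.sample(members, k)
    return sample

# Hypothetical population: 300 first-years, 200 second-years, 100 third-years.
strata = {
    "year1": [f"y1_{i}" for i in range(300)],
    "year2": [f"y2_{i}" for i in range(200)],
    "year3": [f"y3_{i}" for i in range(100)],
}
chosen = stratified_sample(strata, n=60, seed=1)
print({label: len(members) for label, members in chosen.items()})
```

With these made-up sizes the 60 places split 30 / 20 / 10, mirroring the 3 : 2 : 1 ratio in the population.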
Describe nominal data
- used to label / categorise events (individuals or observable behaviours) into discrete categories (gender, opinion polls)
- yes-no categories, sometimes a bit more
- least informative, just using numbers as labels
- can use frequencies and suitable inferential tests
- only using numbers for labelling categories so any one-to-one transformation of the scale is allowed
Describe ordinal data
- when you order or rank events
- more than or better than relationships
- eg order students according to their marking or preferences for a specific car brand
- need to be careful about how you interpret the scales as the numbers only give the order - the gaps between adjacent values may differ
- eg not happy to moderately happy may be different to the gap between very happy and moderately happy
Describe interval data
- possess the characteristics of both nominal and ordinal
- numerically equal distances represent equal distances in the property being measured
- eg temperature in degrees C - it has an arbitrary zero point and an arbitrary 100 point
- continuous scale
- have to be careful - arbitrary to the concept being measured only!
Describe ratio data
- highest level of data
- same properties as interval
- has an absolute / natural zero point that has a theoretical meaning - 0 = an absence of a property being measured
- all mathematical transformations are possible
- the numbers on the scale indicate the actual amounts of the property being measured
- physical scales measuring time, weight, distance, temperature in K
- difficult to use for psychological measurements, eg IQ
Describe reliability
- the consistency of a measure
- the extent to which measures of behaviour can be repeated with similar results
- variance (Var) in a measured variable may be caused by a number of factors; may be because of variances between people or it may be down to situational factors, time, feelings on the day and so on
Var(DV) = Systematic Var + Error Var
How can we assess reliability?
- it is rarely assessed in experiments
- split half method - the individual questions are randomly split into two halves and the scores on the two halves are then correlated to see how similar they are
- Cronbach’s alpha - come up with a summative score
(both of these are most often used in research using psychometric tests)
High R - the measure is consistently close to the true measure BUT that tells us nothing about the validity of that measure
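Both methods can be sketched in a few lines. The questionnaire scores are made up (6 people, 4 questions), the function names are illustrative, and the alpha formula used is the standard one: k / (k - 1) * (1 - sum of item variances / variance of total scores).

```python
import random
from statistics import mean, variance

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_r(scores, seed=None):
    """scores: one row per person, one column per question."""
    n_items = len(scores[0])
    items = list(range(n_items))
    random.Random(seed).shuffle(items)  # random split of the questions
    half_a, half_b = items[: n_items // 2], items[n_items // 2 :]
    totals_a = [sum(row[i] for i in half_a) for row in scores]
    totals_b = [sum(row[i] for i in half_b) for row in scores]
    return pearson_r(totals_a, totals_b)

def cronbach_alpha(scores):
    """Single summary value based on item variances and total-score variance."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Made-up ratings (1 to 5) from 6 people on a 4-question scale.
scores = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
    [4, 4, 3, 4],
]
print(round(split_half_r(scores, seed=0), 2), round(cronbach_alpha(scores), 2))
```

Values near 1 suggest the questions hang together consistently; as the card says, that still tells us nothing about validity.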
Describe validity
- whether a test is measuring what it is supposed to be measuring
- the extent to which an instrument measures what it was designed to be measuring
- eg an IQ test should truly measure intelligence, not something else
V assumes R - an unreliable measure cannot be valid but an invalid measure can in principle be reliable
What is face validity?
- whether a test looks like / appears to be valid
- it makes sense, it looks like it is measuring what you are intending to measure
- for example, an IQ test asking people to solve problems - seems to be valid
- not really formally assessed
What is criterion validity?
- whether you can rely on a measurement to predict future behaviour and relate it to other accepted measures of behaviour
- for example, an IQ test should predict success in school as well as producing similar results to other known tests of intelligence
What is construct validity?
Whether the construct being measured is valid (whether or not it ‘really’ exists) and the tool measuring the construct is suited to measure it
Construct validity - Mischel
- social-cognitive learning theory of personality
- construct = delay of gratification
- sometimes children want things to grow
- operational definition - willingness to wait for a large reward instead of preferring an immediate small reward
- hypothesis - younger children should be less willing to wait for a reward compared to older children
- other studies have shown that this construct can be related to others like social responsibility, emotional maturity and so on
What are the ethical implications of reliability and validity?
- lots of measures (reaction times, IQ scores, anxiety scale, personality tests etc) are used to make decisions about people
- for example schools, the army, job assessments, psychiatric diagnoses, exams and so on all use them
- some very important decisions are based upon these kinds of tests - what happens if they turn out to be invalid or unreliable?
What is internal validity?
Whether the independent variable(s) is / are the true causes of change in the dependent variable(s)
Is the independent variable causing the dependent variable to change?
What is external validity?
Whether we can truly generalise to the population we are talking about
Do we have representative findings?
What is ecological validity?
Whether experimental findings can be generalised to more natural / less controlled settings
What happens in an experiment?
- investigate the effects of the independent variable(s) on the dependent variable(s)
- independent variables are manipulated by the experimenter
- dependent variable is assumed to be directly affected by the changes in the independent variable
- all other conditions / extraneous variables are controlled for
- can be conducted in the laboratory or in the field
What are the strengths of experimental methods?
Isolate cause and effect
Control of extraneous variables = high internal validity
Eliminate alternative explanations
Easy to replicate
What are the weaknesses of experimental methods?
Participant bias = low external validity
Artificial conditions and measures = low ecological validity
Participants’ contributions are completely prescribed - the kinds of phenomena that can be studied are limited
What is a between-subjects design?
- when experiments compare at least two conditions (A and B)
- there are at least two levels of the independent variable
- P’s are placed into conditions A or B (ideally at random with control of EV’s)
- P’s receive one condition, not both
- each condition has different groups of P’s - they only experience one level of the IV
When must a between-subjects design be used?
If the IV is:
1) a quasi-experimental subject variable, for example gender and anxiety
- when you cannot manipulate the independent variable - you can’t give a person a particular gender
2) manipulated in a certain way that precludes a within-subjects design
- when it is impossible / unreasonable for the same person to complete the second condition
- for example, problem solving with and without a crucial piece of information
Between-subjects design example - Sigall and Ostrove (1975)
- wanted to investigate the influence of physical attractiveness of a defendant on recommended sentence
- P’s were given written descriptions of a crime and were then asked to recommend a jail sentence
- IV1 = the type of crime
2 levels - a burglary where a woman stole £2,200 against a swindle in which a woman induced a man to invest £2,200
- IV2 = the attractiveness of the woman
3 levels - very attractive, unattractive, no photo
Results:
- control - B = 5.1 years and S = 4.4 years
- unattractive - B = 5.2 years and S = 4.4 years
- attractive - B = 2.8 years and S = 5.5 years
What are the advantages of a between-subjects design?
Each subject enters the study fresh and naive to the procedure / purpose of the study
What are the disadvantages of a between-subjects design?
- large number of participants required
- differences between conditions might be due to accidental differences between subject groups
- > countermeasure = randomisation
What is randomisation?
A method of randomly assigning subjects to different groups
- there is an equal probability for each subject to be assigned to a specific condition
- this allows for us to spread possible individual difference factors evenly across conditions
Issue - with a small number of participants, it could happen that random assignment places all A-subjects into one group - this results in non-equivalent groups
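One common way to do the assignment is to shuffle the participant list and deal people out to the conditions in turn. The sketch below assumes that approach; the participant labels and function name are made up.

```python
import random

def randomly_assign(participants, conditions=("A", "B"), seed=None):
    """Shuffle participants, then deal them out to conditions in turn,
    so group sizes differ by at most one."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {c: [] for c in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups

# Hypothetical participant labels P1 to P10.
participants = [f"P{i}" for i in range(1, 11)]
groups = randomly_assign(participants, seed=3)
print(groups)
```

Equal group sizes do not make the groups equivalent: with only ten people, chance can still put most of the people who share some characteristic into one group, which is the issue noted above.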
What is ‘matching’?
When you control for the problem of non-equivalent groups
- basically match participants
- have explicit control for individual differences
- does require pre-testing of participants
- have to be careful about what variables you are interested in as you can’t pre-test every single variable
What are the differences between manipulated and subject variables?
Subject variables = already existing characteristics of the participants in the study, for example gender, intelligence, age and anxiety
Within-subjects design advantages
- fewer subjects are required since P’s are exposed to all levels of the IV
- more powerful / sensitive statistical tests can be applied as each subject serves as their own control
For some experiments, within-subjects designs are the only realistic choice:
- studies where a trial lasts only a few seconds (reaction times etc)
- when subjects are rare, the target population is small
- if there is a limited number of participants, you may as well get as much information out of them as possible
- this type of design allows you to rule out individual differences - can be eliminated from the estimate of variability between conditions
- just care about the difference in scores rather than the differences between participants
- equivalent groups problem is eliminated
Within-subjects design disadvantages
Progressive effects
- practice, fatigue or boredom
- performance steadily changes (also spontaneous recovery)
Carryover effects
- systematic changes in performance that occur as a result of completing a sequence of conditions
- difficult trials tend to linger on into the next trial
- for example, easy to hard against hard to easy
- if we complete a hard task before an easier task, the mentality that we had for completing the more difficult task stays with us
What can we do to counter progressive and carryover effects?
Counterbalancing!
2 general categories - P’s are tested in each condition just once or more than once
Once per condition
eg Reynolds (1992) Recognition of expertise in chess players
- 6 games with about 20 moves - order effects?
Complete Counterbalancing
- every possible sequence will be used at least once
- when many conditions are examined, complete counterbalancing is often practically impossible
- eg with 6 conditions, there are 720 possible sequences
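The figure of 720 is 6 factorial (6 x 5 x 4 x 3 x 2 x 1). A short sketch that lists every sequence, assuming the six conditions are simply labelled A to F:

```python
from itertools import permutations
from math import factorial

conditions = ["A", "B", "C", "D", "E", "F"]

# Every possible order in which the six conditions could be presented.
orders = list(permutations(conditions))
print(len(orders), factorial(len(conditions)))  # 720 720

def order_for(participant_index):
    """Give each participant the next sequence in turn (wraps after 720)."""
    return orders[participant_index % len(orders)]

print(order_for(0), order_for(1))
```

Complete counterbalancing would therefore need at least 720 participants here, which is why it is often impractical.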
What is breaking the confound?
- a way of controlling for order effects
- eg say you have 2 orders:
One half of P’s = ABCDEF and the other half = DBCFAE
- practice effects are balanced across the conditions
- another technique is the random order of trials, for example reaction time experiments
What is a mixed design?
- when you have more than one IV
- can include both between and within subjects elements
Example: a hypothetical study on the effectiveness of a new pain killer drug
- between factor = the treatment group (placebo or painkiller)
- within factor = time of measurement (before and after)
What is a one-way design?
Includes one IV with two or more levels
Example:
Treatment = Painkiller or Placebo (2 levels)
Treatment = Painkiller, Placebo or Nothing (3 levels)
What is a complex (factorial) design?
Includes 2 or more IVs, each with two or more levels
Example:
Treatment: Painkiller or Placebo (2 levels)
Time: before treatment, one week after, two weeks after (3 levels)
= a 2 x 3 design
- in complex factorial designs, you are interested in main effects of and interactions between the IVs
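The cells of a factorial design are every combination of the levels of its IVs. A small sketch using the painkiller example above (the variable names are made up):

```python
from itertools import product

factors = {
    "treatment": ["painkiller", "placebo"],
    "time": ["before treatment", "one week after", "two weeks after"],
}

# Each cell is one combination of a treatment level and a time level.
cells = list(product(*factors.values()))
for cell in cells:
    print(cell)
print(len(cells))  # 2 levels x 3 levels = 6 cells
```

Adding a third IV with two levels would double the number of cells to 12, so designs grow quickly as IVs are added.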
What 3 dimensions can we classify designs into?
How do we compare? ANALYSIS RELEVANT
How many independent variables? ANALYSIS AND INTERPRETATION-RELEVANT
Manipulation of Independent Variables? INTERPRETATION RELEVANT
What is a main effect?
- refers to the overall contribution of an IV to changes in the DV
- this is averaged across levels of other IVs included in the design (typically a complex factorial design)
What is an interaction?
Refers to the specific contribution of an IV to changes in the DV, dependent on levels of other IVs included in the design (implies that the design must be complex)
- difference between differences!
Describe the different types of interaction
Disordinal - contrasts point in opposite directions
- cross over of contrasts
Ordinal - contrasts point in the same direction
- typically along with at least one reliable main effect
No Interaction - two main additive effects; differences between differences are equal
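The "difference between differences" idea can be worked through with the mean sentences from the Sigall and Ostrove card (attractive and unattractive conditions only). This is a descriptive sketch on the means alone, looking at the contrast for one factor only; it says nothing about whether the effects are statistically reliable, and the function names are illustrative.

```python
# Mean recommended sentences (years) copied from the Sigall and Ostrove card.
means = {
    ("unattractive", "burglary"): 5.2,
    ("unattractive", "swindle"): 4.4,
    ("attractive", "burglary"): 2.8,
    ("attractive", "swindle"): 5.5,
}

def simple_effect(attractiveness):
    """Burglary minus swindle at one level of attractiveness."""
    return means[(attractiveness, "burglary")] - means[(attractiveness, "swindle")]

def main_effect_of_crime():
    """Burglary minus swindle, averaged across attractiveness levels."""
    return (simple_effect("unattractive") + simple_effect("attractive")) / 2

def interaction_contrast():
    """Difference between the two differences."""
    return simple_effect("attractive") - simple_effect("unattractive")

def interaction_type(tolerance=1e-9):
    a, b = simple_effect("unattractive"), simple_effect("attractive")
    if abs(a - b) <= tolerance:
        return "no interaction"  # the differences are equal
    return "disordinal" if a * b < 0 else "ordinal"

print(round(main_effect_of_crime(), 2))
print(round(interaction_contrast(), 2))
print(interaction_type())
```

Here the burglary minus swindle difference is positive for the unattractive defendant and negative for the attractive one, so the contrasts point in opposite directions and the pattern is a cross-over.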
Describe the main data analysis steps
1) Check the data
- identify / eliminate errors, extreme values, outliers
2) Summarise the data
- descriptive statistics; values for central tendency (eg mean) and variability (standard deviation) in each condition / design cell
3) Statistically confirm what the data reveals
- inferential statistics (NHST); are differences detected in the sample just due to non-systematic sample fluctuations?
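Step 2 can be sketched with Python's standard library. The reaction times and condition labels below are made up purely for illustration.

```python
from statistics import mean, stdev

# Made-up reaction times (ms) for two hypothetical conditions.
data = {
    "A": [512, 498, 530, 505, 521],
    "B": [560, 548, 575, 552, 566],
}

# One row of descriptive statistics per condition / design cell.
summary = {
    condition: {"n": len(values), "mean": mean(values), "sd": stdev(values)}
    for condition, values in data.items()
}

for condition, stats in summary.items():
    print(condition, stats["n"], round(stats["mean"], 1), round(stats["sd"], 1))
```

`stdev` is the sample standard deviation (dividing by n - 1), which is the usual choice when the scores are a sample rather than the whole population.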
What does it mean by ‘sensitive’?
The effect of a change to the IV is detected even when it is small