Lecture 2 Flashcards
Experiments
IV has two or more levels, you manipulate it to see its effect on the DV
Most common
Can infer causality
Low ecological validity
Natural observation
Watch people in the field
Score their behaviour (or assess in some way)
No causality
Basic assumption for all research
Events are governed by some lawful order
The three goals of research
Measurement and description (what happens)
Understanding and prediction (why it happens / what will happen)
Application (to help people usually)
If you divide people into two groups and have some play video games and some not, then measure spatial memory, you can describe the changes and caveats
Maybe another study could determine what part of the game accounted for the observed change
Could then predict changes in spatial memory caused by playing the game
Could maybe then prescribe game playing to help people improve spatial memory
Structure of a scientific paper
Introduction:
Theories, hypothesis
Methods:
Operational definitions
Acquisition of empirical evidence
Results:
Adherence to the scientific method
Precision
Analysis of empirical evidence
Discussion:
Openness - strengths/weaknesses, unexpected findings (good papers are honest about this)
Willingness to reject hypothesis and draw correct conclusions (must be willing to say it did not work)
Theories
Organized systems of assumptions that aim to explain phenomena and their interrelationships
Should be referenced in the introduction of papers
Organize findings from prior research into a coherent set of ideas.
Hypotheses
Attempt to predict or account for something; specify relationships among variables and are explicitly tested
Should be explicitly stated in the intro of papers
Two sources of hypotheses
(a) theories
When a hypothesis is derived from a theory, testing the hypothesis tests the theory. If the hypothesis is supported, this adds supporting data to the theory (but does not prove it). If it is not supported, the theory must be revised.
(b) Personal experience
For an experiment, this is a formal statement that changing X (in a specific way) will cause Y to change (in a specific way); if no causation is implied, it states that X and Y are related in a predictable way.
Operational definitions
define terms in hypotheses by specifying the operations for observing and measuring the process or phenomenon
Relatively subjective (i.e., you define it)
Can be very hard
Goes in method of papers
Random sampling/assignment
Randomly select people from the population
Randomly assign them to a condition
This minimizes the chance of systematic differences between the groups, as random differences should occur in both groups and cancel out
A true random sample is representative of the population
By making the sample representative of the population, you allow the study to infer something about the population. If it is not, you cannot do this (or should not).
Unrepresentative samples may not apply to everyone.
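The two steps above (random sampling from a population, then random assignment to conditions) can be sketched in Python. This is a toy sketch: the population list and group sizes are made up.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population of 100 people
population = [f"person_{i}" for i in range(100)]

# Random sampling: draw 20 participants from the population
sample = random.sample(population, 20)

# Random assignment: shuffle the sample, then split it into two groups
random.shuffle(sample)
experimental, control = sample[:10], sample[10:]
```

Because the split is random, any pre-existing differences among the participants should be spread across both groups rather than piling up in one of them.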
Convenience sampling
Samples that are convenient to get.
Sometimes needed like in rare medical conditions
Often WEIRD (Western, Educated, Industrialized, Rich, and Democratic), which makes findings less generalizable
Skepticism
Do not accept ideas based on faith or authority
Do not assume everything was done correctly, treat every aspect of a study with caution.
Assess
Willingness to reject H1
Confirmation bias
Hard to do
Confirmation bias: the tendency to look for, or pay attention only to, things that are consistent with your own beliefs. Big issue with social media, big issue with data interpretation.
Karl Popper’s Critical Rationalism
Falsifiability
A scientific theory must make predictions specific enough to disconfirm the theory. It must predict what will and will not happen.
Often our impressions are wrong.
Example of confirmation bias & the importance of falsifiability: Prefrontal lobotomy
Egas Moniz damaged the PFC as a treatment for schizophrenia
Notable psychotic behaviors dropped
Thought of as acceptable; Moniz received a Nobel Prize
Controlled studies later showed it was useless
Does not reduce non-psychotic schizophrenia symptoms
Reduces DIRECTED behavior across the board, so psychotic directed behaviors (those disturbing to others) drop, but so does non-psychotic directed behavior (like dressing yourself)
Replaced with antipsychotics
Moniz hypothesized it would reduce psychotic symptoms, then focused on the behaviors he wanted to see drop (the psychotic ones) and ignored those that did not confirm this belief. He did not define what changes in behavior might be negative.
If he had defined this better (e.g., predicted there would be no cognitive impairment), the claim could have been falsified. Around 20,000 lobotomies were performed.
What makes evidence empirical?
Reliability/Validity
Evaluate measures based on reliability and validity
A test must be reliable to be valid but a reliable test can be invalid (if the test reliably measures an invalid construct).
Reliability
2 types
How consistent is the measurement?
A reliability coefficient of about .80 (80%) is considered good
(1) test-retest reliability
Are scores similar from one session to another?
Scores may improve if it is the same test given again and again (a practice effect).
(2) Alternate-forms reliability
Are scores similar on different forms of tests?
Correlational studies
Descriptive studies looking for relationships between phenomena
Correlation is a statistical measure of how strongly two variables are related to one another (between -1 and +1)
Strengths: Can test predictions, evaluate theories and suggest new hypotheses
CANNOT INFER CAUSALITY
Although often reported that way both in the media and in sloppy discussions
Often shown as scatter diagrams
Stronger correlations yield better predictive power
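The correlation coefficient above can be computed by hand: it is the covariance of the two variables divided by the product of their spreads. A sketch with made-up data (the gaming/spatial-memory numbers are invented to echo the example earlier in these notes):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the SDs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs)
                      * sum((y - my) ** 2 for y in ys))

# Hypothetical data: hours spent gaming vs. spatial-memory score
hours = [1, 2, 3, 4, 5, 6, 7, 8]
memory = [52, 55, 61, 58, 70, 73, 75, 80]

r = pearson_r(hours, memory)  # close to +1: strong positive relationship
```

A strong r lets you predict one variable from the other, but it still says nothing about whether gaming *causes* the memory difference.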
Random selection of ptps from the population
Key for generalizability
Not always possible, aim for it
If not, say so in discussion
Random assignment of ptps
To each group
Experiment/control
Give the control group placebos to account for the placebo effect (everyone believes they may have received the drug)
Confounds
Any difference between the experimental and control group, other than the IV
Cause and effect
Possible to infer with random assignment and manipulation of IV
Placebo effect
improvement resulting from the mere expectation of improvement
Subjects must be BLIND; unsure if they are in the experimental or control group
Placebos show many of the same characteristics as real drugs
Nocebo effect
Harm resulting from the mere expectation of harm.
Example: women perform worse at math after being told that women do badly on the test
Experiments
strength/weakness
Can establish causality
Can be confounded
Involves variables of interest, IV and DV, control conditions and random assignment
Hindsight bias
“I knew it all along”
tendency to overestimate how well we could have forecasted a known outcome
Overconfidence
Our tendency to overestimate our ability to make predictions
Experimenter expectancy effect
Phenomenon where researchers’ hypotheses lead them to unintentionally bias a study’s outcome
Example: Facilitated communication
Experimenter effect
“revolutionary” treatment for autism
Biklen thought autism was a movement disorder
Sat with autistic child and helped them type
Students made huge progress in communication
Produced loads of sexual abuse claims, which the families denied and which the siblings had not experienced
Turns out, words came from the experimenter
The experimenter wanted it to work and (unconsciously) made it work, like a Ouija board
Still done in some places: it is VERY hard to get published results out of the public’s mind (same with the autism-vaccine claim)
Hawthorne effect
participants knowledge they are being observed changes their behavior
Demand characteristics
Cues ptps pick up that allow them to guess the hypothesis (H1)
How to avoid Hawthorne and demand effects
Covert observation (watch without them knowing - ethically challenging)
Experimenter blindness (they do not know which group is which)
Double blind design
Researchers and ptps do not know what group is what
Normal distribution
68% are within 1SD
95% within 2
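The 68/95 figures above can be checked with the stdlib `statistics.NormalDist`:

```python
from statistics import NormalDist

nd = NormalDist(mu=0, sigma=1)  # a standard normal distribution

within_1sd = nd.cdf(1) - nd.cdf(-1)  # about 0.68
within_2sd = nd.cdf(2) - nd.cdf(-2)  # about 0.95
```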
Measures of central tendency and standard deviation
Mean, median and mode are the same in a normal distribution.
If the mean is greater than the mode, the distribution is positively skewed.
If the mean is less than the mode, the distribution is negatively skewed.
If the mean is greater than the median, the distribution is positively skewed.
If the mean is less than the median, the distribution is negatively skewed.
SD, how scores are distributed around the mean
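The skew rules above can be checked on a small made-up data set where a few large values pull the mean upward:

```python
from statistics import mean, median, mode, stdev

# Hypothetical positively skewed scores: a few large values pull the mean up
scores = [2, 3, 3, 3, 4, 4, 5, 6, 9, 15]

m, md, mo = mean(scores), median(scores), mode(scores)  # 5.4, 4.0, 3
spread = stdev(scores)  # how far scores typically fall from the mean
# mean > median > mode, so by the rules above the distribution
# is positively skewed
```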
Inferential statistics: Significance tests
p-value
The probability of obtaining results at least this extreme by chance alone
If chance is an unlikely explanation, the result is attributed to your manipulation
p < .05 is the convention in psychology
Means less than a 1-in-20 chance the result arose randomly
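One way to make the p-value concrete is a permutation test: shuffle the group labels many times and count how often a difference at least as large as the observed one arises by chance alone. A sketch with made-up scores:

```python
import random

random.seed(0)  # reproducible sketch

group_a = [5, 7, 8, 9, 6, 7]  # hypothetical scores, experimental group
group_b = [3, 4, 5, 4, 6, 5]  # hypothetical scores, control group

def mean_diff(a, b):
    return sum(a) / len(a) - sum(b) / len(b)

observed = mean_diff(group_a, group_b)  # 2.5

# Shuffle the pooled scores and re-split them 10,000 times;
# p is the share of shuffles with a difference at least as extreme
pooled = group_a + group_b
n, extreme, trials = len(group_a), 0, 10_000
for _ in range(trials):
    random.shuffle(pooled)
    if abs(mean_diff(pooled[:n], pooled[n:])) >= abs(observed):
        extreme += 1

p = extreme / trials  # roughly .015 here, i.e. below the .05 convention
```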
Effect size
Effect size is the amount of variance among scores in the study accounted for by the IV
If it is low, small effect
If it accounts for only 1% of the variance, do we care? Maybe, but it must be reported so people can interpret the results fully.
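"Variance accounted for by the IV" can be computed directly as eta-squared: the between-group variation as a proportion of the total variation. A sketch with the same kind of made-up two-group data:

```python
from statistics import mean

group_a = [5, 7, 8, 9, 6, 7]  # hypothetical scores, experimental group
group_b = [3, 4, 5, 4, 6, 5]  # hypothetical scores, control group

grand = mean(group_a + group_b)

# Variation "explained" by group membership (the IV) ...
ss_between = (len(group_a) * (mean(group_a) - grand) ** 2
              + len(group_b) * (mean(group_b) - grand) ** 2)
# ... out of all variation around the grand mean
ss_total = sum((x - grand) ** 2 for x in group_a + group_b)

eta_squared = ss_between / ss_total  # about 0.55: the IV accounts
                                     # for ~55% of the variance
```

A result can be statistically significant yet have a tiny eta-squared, which is exactly the misuse the next card warns about.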
Misuse of statistics (2 things)
Reporting significant results with small effect sizes
(many journals require effect sizes now)
If the distribution is not normal (i.e., skewed), you cannot use statistics that assume normality to analyze it; you must use statistics that account for the skew
Truncated graphs (that do not start at zero) used to make tiny effects look large
Interpretation of statistics
Examine whether the sample is representative before drawing conclusions
Mention ALL results (even those that do not support)
Compare honestly to prior research, if the data disagrees, say so
Discuss limitations and what should be done in future to build on this based on results
Cover story
mild deception, designed to stop ptps from guessing H1
Continuous variables
Fall on a spectrum; if numerical, they are quantitative variables
Discrete variables
Take one value or another, with nothing in between
If the values are names, the variable is categorical/qualitative
Independent groups/between-subjects
Different groups
Repeated measures/within-subjects
Same people at different times
Stimulus variable
If the IV exposed someone to a stimulus, it is a stimulus variable
Response variable
If the DV measures the response to that stimulus, it is a response variable
Manipulation check
Checks whether the IV took the intended levels in the experimental condition (i.e., does your manipulation change the IV as you want it to?)
Quasi experiments
IV not manipulated
e.g. one group is male, the other female
Similar to experiments, but because the IV is not manipulated, it is harder to infer causality
Validity
(1) Content validity
Do the items broadly represent the trait in question?
If the items miss aspects of the trait, the test is invalid; it must not focus on only one aspect.
(2) Criterion validity
Do the test’s results predict other measures of the trait?
Compare to other validated ways of assessing the trait. What are the results?