Psych 1111 Flashcards

1
Q

what is critical thinking


A

evaluating sources of information and making judgements based on evidence

2
Q

what are the 4 common sources that people refer to for information?

A

common sense, superstition and intuition, authority, tenacity

3
Q

Common sense →

A

believing that information is correct because it is collectively agreed upon

4
Q

Superstition and intuition

A

Gaining knowledge based on subjective feelings → issues with this include: not knowing the source of the information (so it can't be evaluated), interpreting random events as causally related, a preference for seeing patterns, priming of attention, and observations that merely tend to occur together

5
Q

Authority

A

Gain knowledge through authority figures → do they really know what they are talking about?

6
Q

Tenacity

A

Gaining knowledge by hearing information so often you accept that it is true

7
Q

Critical thinker =

A

scientific thinker

8
Q

To be a critical thinker you must analyse evidence:
LOSAAR

A
  • Logical
  • Open minded
  • Skeptical
  • Analytical
  • Able to update your opinion based on the evidence
  • Rational
9
Q

Assumptions of science
(TVD PEF)

A

parsimony, empirical, verification, testability, falsification, determinism

10
Q

Parsimony

A

the simplest explanation (the one with the fewest assumptions) is preferred

11
Q

Empirical

A

claims must be supported by systematic evidence → the more unusual the claim, the stronger the evidence needs to be.

12
Q

Verificationism

A

You must be able to provide evidence that supports your claim.

13
Q

Testability

A

it must be possible to test the claim (to collect evidence for or against it)

14
Q

Falsification

A

It must be possible for you to find evidence that refutes your scientific claim. Your scientific claim should allow for the possibility that you are incorrect. Good scientific theories must be able to be falsified

15
Q

Determinism

A

An important assumption of the scientific method. In science, determinism refers to the idea that every event in nature has a cause, or causes, that account for its occurrence.

16
Q

Understand the difference between independent and dependent variables

A

Independent variable → the manipulated variable → randomly assigned to control for systematic differences → normally has two levels (such as drug and placebo)
Quasi-independent variable → a variable that the experimenter cannot randomly allocate → commonly used as a grouping variable
* Natural variables (CAB): Country of birth, Age, Biological sex
* Attribute/person variables (AIL): individual difference variables that fall on a spectrum, e.g. level of risk taking, anxiety

Dependent variables
* The dependent variable is the variable used to assess or measure the effects of the independent variable
* Dependent on the independent variable
* Measures a behaviour or response for each treatment condition of the experiment
* The dependent variable is NOT manipulated it is only ever measured

17
Q

Define and understand the importance of operationalisation

A

Operationalise = to quantify (measure)
Operationalising variables allows you to specify exactly what you mean in your hypothesis/theory

Operational definition
* Detailed description of the procedures or operations used to measure or manipulate the variables.
* Providing clear instructions about
o Definition of variable
o How it is measured/quantified
* This is important as it ensures that the hypothesis is clear.

18
Q

Identify and apply the steps of the scientific method

A

Initial/past observations → hypothesis → test → analyse/conclude → update or discard (can go back to the hypothesis step and start again) → theory

Observation
Scientific studies begin with an initial observation.
* A point of interest for further investigation.
* You must be able to find a way to collect observable evidence.
‘Gap in research’
Past observations are important for the scientific method.
* Try to answer questions raised by existing theories.
* Replication is critical → indicates confidence in the results

Hypotheses
* A hypothesis is a very specific statement about the predicted/expected relationship between variables (both variables)
* It is usually phrased in the form: “If ___[I do this]___, then ___[this]___ will happen.”
* A hypothesis usually predicts the effect of a manipulated variable on a measured variable.
* States that a relationship should exist between variables, the expected direction of the relationship between the variables and how this might be measured

Test
The scientific method requires that you can test the hypothesis.
Design an experiment
Use good experimental design
Collect appropriate data
Control as many aspects as possible
Research Methods
Is the experiment reliable?
Are your measures valid?

Analyse and conclude
Consider whether the data supports your hypothesis
Is there sufficient evidence?
Are the results statistically significant?
Are further studies required?
Conclude
Conclusions are the researcher’s interpretation of the evidence
Based on the results of the experiment
Explain the results of the experiment

Update or Discard
The scientific method is dynamic
* Must be able to update your hypothesis when there is a lack of data to support it
* Must be able to discard your hypothesis when the evidence refutes it.
This requires many aspects of critical thinking
* Open to the possibility you are incorrect
* Evaluation of the evidence
* Ability to change your opinion with new evidence

Theories are NOT hypotheses → a theory is based on years of work

Theory
* A theory is an organised system of assumptions and principles that attempts to explain certain phenomena and how they are related.
* Many hypotheses are tested and data collected before a theory is formed
* Provide a framework regarding the facts
* Theories can also lead to further questions and hypotheses

19
Q

Identify and explain the goals of science

A

The goals of Science
Description → observing phenomena in a systematic manner, needed for prediction
Prediction → make predictions from one variable to another
Explanation → provide a causal explanation regarding a range of variables

Description
* You might want to observe a simple behaviour
* Are people taking “pills” at this festival?
* Might want to investigate something more complex
* Is there a relationship between the type of pill and rates of overdose?
* Need to describe types of pills being consumed
* Need to describe and measure contents
* Need to observe and describe the number of overdoses

Prediction
* Identify the factors that indicate when an event will occur
* Scientific prediction: We are able to use the measurement of one variable to predict the measurement of another variable
* The relationship between variables
* Does X occur with Y?
* Does X change in relationship to Y?
* We are looking at the correlation between two variables

Explanation
* The final goal of science is explanation
* This is the ultimate goal of science
* Is there a causal relationship between X and Y?
* Does X cause Y
* We need to test the causal relationship
* This requires research methods and experimental design
* This requires statistics to evaluate the data

20
Q

Understand the difference between Pseudoscience and actual science

A

Science vs. Pseudoscience
* The main difference is that science usually modifies or abandons failed hypotheses/theories when flaws or new evidence have been identified
Warning signs of pseudoscience (and the scientific safeguards against them):
* Unfalsifiable hypotheses/theories
* Vague/unclear/poorly defined concepts
* Un-parsimonious hypotheses/theories
* Using testimonials → need systematic observations instead
* Biased sampling/group allocation
* Placebo effects/experimenter bias → use double-blind control studies

21
Q

Measurement and Error

A
  • All measurements can be broken into at least two components
  • The true value of what is being measured and measurement error
    Measured Score = True Score + Error
    X = T + e
  • However, we want: Measured Score = True Score
22
Q

how do we reduce error?

A
  • Error is reduced with:
    Many participants – Individual differences error
    Many measurements – Measurement error
    Many occasions → able to replicate findings in different contexts
  • Averages of scores are more reliable than individual scores
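
A minimal simulation of the X = T + e model (a sketch in Python; the true score, error spread, and sample size are invented for illustration). It shows why an average of many measurements sits closer to the true score than a single measurement does:

```python
import random

random.seed(0)
true_score = 100.0  # hypothetical true value (T)

def measure():
    """One measurement: X = T + e, where e is random error."""
    return true_score + random.gauss(0, 5)

single = measure()
average = sum(measure() for _ in range(1000)) / 1000

print(abs(single - true_score))   # error of one measurement (often a few units)
print(abs(average - true_score))  # error of the average (much closer to zero)
```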
23
Q

Identify and define the types of reliability

A

Inter-observer reliability
* Degree to which observers agree upon an observation or judgement.
Can be frequency or categorical judgement.
Rating attractiveness.
Scoring rat behaviours.
Coding explanations/descriptions.
* Measure inter-observer reliability with correlations
* Positive relationship between the scores of each observer
* To have high inter-observer reliability we want both observers to agree. Very important for scientific research
* The higher the correlation between observer judgements, the more reliable the results are.
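
A minimal sketch of this idea (assuming numpy is available; the two observers' ratings are invented). A high Pearson correlation between the two sets of judgements indicates high inter-observer reliability:

```python
import numpy as np

# Hypothetical attractiveness ratings of the same 8 photos by two observers
observer_1 = np.array([7, 4, 6, 9, 3, 5, 8, 2])
observer_2 = np.array([8, 4, 5, 9, 2, 6, 8, 3])

r = np.corrcoef(observer_1, observer_2)[0, 1]  # Pearson correlation
print(f"Inter-observer reliability: r = {r:.2f}")  # close to 1 -> strong agreement
```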

Internal/Split-half
Internal Reliability
The degree to which all of the specific items or observations in a multiple item measure behave the same way
* Measuring Intelligence: All the items should equally measure intelligence
High internal reliability shows the entire measure is consistently measuring what it should be
We want more items in the measure to reduce error
* Very important that these items all consistently measure the construct we are interested in

Internal Reliability
How can we examine whether multiple items on a test equally measure the same thing?
Divide the test into two halves
Look at the correlation between individuals’ scores on the two halves
Split-Half reliability

  • All items of an IQ test should measure intelligence

Need to compare like with like
* Don’t just split the test down the middle → group similar items first, then split, so the two halves are comparable (we want high correlations)
Look at the correlation between individuals’ scores on the two halves
* High correlation between scores indicates good internal reliability
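
A minimal sketch of split-half reliability (assuming numpy; the half-scores are invented). Each person's scores on the two halves of the test are correlated:

```python
import numpy as np

# Hypothetical totals for 6 people on each half of a 10-item test,
# after grouping comparable items and splitting them between the halves
half_a = np.array([3, 5, 2, 4, 5, 1])
half_b = np.array([4, 5, 2, 3, 5, 2])

r = np.corrcoef(half_a, half_b)[0, 1]  # correlate individuals' half-scores
print(f"Split-half reliability: r = {r:.2f}")  # high r -> good internal reliability
```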


Test-retest
Test-Retest Reliability
If we were looking at scores on a visual search task, we need the measurement to remain constant over time
Practice effects undermine test-retest reliability
Should counterbalance the order of presentation
Randomly assign people to different orders

Test-Retest Reliability in practice
Brain Training example

Practice Effects
There is an improvement in scores on the game, which indicates poor test-retest reliability
Practice effects – you get better because you do the same task several times
* If I asked you to play Pac-Man every day for 15 minutes and you improved your score, no one would be surprised!
Not a reliable measurement for cognitive improvement

Practice Effects
We have very reliable test-retest measures in these experiments testing the efficacy of brain training
* Shows the tests are reliable measures over time
* Scores on the external measure don’t change after training
* This has been found multiple times
So, these games don’t improve your brain function at all

Replication
The reliability of results across experiments
* Can we replicate the results when all variables and conditions remain the same
* Need clear and detailed method sections
Critical to the scientific method
* Must have evidence from multiple experiments
* More times a result is replicated the more likely it is the findings are accurate and not due to error

Replication Crisis
A lot of published psychological papers couldn’t be replicated

24
Q

Understand the difference between reliability and replication

A

Reliability means that if the same study is repeated under the same conditions, it should produce similar outcomes. Replicability, on the other hand, refers to the ability of other researchers to reproduce the results of a study using the same or equivalent methods and data.

Replication
The reliability of results across experiments
* Can we replicate the results when all variables and conditions remain the same
* Need clear and detailed method sections
Critical to the scientific method
* Must have evidence from multiple experiments
* More times a result is replicated the more likely it is the findings are accurate and not due to error

25
Q

what is validity

A

* Validity refers to how well a measure or construct actually measures or represents what it claims to.
* Validity relates to accuracy.
* Very important in psychology, where we often measure abstract constructs.
26
Q

Identify and define the types of validity

A

Types of validity:
Measurement validity (about the measurements, i.e. the dependent variable)
* Construct validity
* Content validity
* Criterion validity → correlations → how valid are the predictions? (Concurrent and Predictive)
Internal validity → can you make a claim based on these results?
* Strength of the causal claim
External validity → can I generalise these claims to the population or environment?
* Population validity
* Ecological (environment) validity

Measurement validity
How well a measure or an operationalised variable corresponds to what it is supposed to measure/represent.
* Show that the measurement procedure actually measures what it claims to measure
* We use a number of methods to assess the validity of a measurement → critical for scientific research

Construct validity
How well do your operationalised variables (independent and/or dependent) represent the abstract variables of interest?
* Experimentally: are you measuring what you think/say you are measuring?
* Construct validity = the strength of the operational definitions → the strength of your operationalising of variables
Example: measuring hunger in rats
* Weigh the amount of food consumed?
* Speed of running towards food?
* Duration spent at the normal site of food delivery?
* How much they are willing to press a lever for food?
How would you assess this? Define hunger → it should relate to manipulations known to produce different levels of hunger (e.g. food deprivation), and ideally will be consistent with other measures of hunger

Content validity
The degree to which the items or tasks on a multi-faceted measure accurately sample the target domain
* How well does a measure/task represent all the facets of a construct?
* E.g. IQ tests: can 7 questions on a Facebook quiz about mathematics, general knowledge and logical reasoning really adequately represent something like IQ?
* Many constructs are multi-faceted and sometimes multiple measures must be used to achieve adequate content validity
* Domains = groupings of content → need all domains to accurately measure the construct of interest.

Content validity vs internal reliability
* Content validity demonstrates that all of the items on a multiple-domain measure accurately measure the construct → an extroversion scale needs all questions to accurately measure extroversion and not another construct
* Internal reliability relates to whether the items on a multiple-domain measure consistently measure the construct → intelligence test: all questions about verbal intelligence should produce a consistent score in the same individual

Criterion validity
Measures how well scores on one measure predict the outcome on another measure.
* The extent to which a procedure or measure can be used to infer or predict some criterion (i.e. another outcome)
Two types of criterion validity:
* Concurrent validity (now)
* Predictive validity (future)

Concurrent validity
Compares the scores on two current measures to determine whether they are consistent.
* How well do scores on one measure predict the outcome on another existing measure?
* If the two tests produce similar and consistent results, you can say that they have concurrent validity
* Predict the outcome of a current behaviour from a separate measure → how many chickens do you follow on Instagram? Are you going to buy more chickens now?

Predictive validity
Scientific theories make predictions about how scores on a measure for a certain construct affect future behaviour
* If the measurement of a construct accurately predicts future behaviour, the measurement has high predictive validity.
* Example: "listening to hip-hop leads to violent crime" → compare hours spent listening to hip-hop with the number of violent criminal offences → a low correlation indicates that listening to hip-hop has poor predictive validity for future violent crimes
* Or: scores on the ATAR and how successful you will be at university
* You want HIGH predictive validity

Internal validity
Focused on whether the research design and evidence allow us to demonstrate a clear cause-and-effect relationship.
* High internal validity occurs when the research design can establish a clear and unambiguous explanation for the relationship between two variables
* A relative statement rather than an absolute measure (no direct way to measure it) → can we rule out other explanations? Are the variables accurately manipulating or measuring the construct? Does the research design support the causal claim?
* Can't directly measure internal validity with a correlation.
* Crucial for making claims about the causal relationship between variables

External validity
How well a causal relationship holds across different people, settings, treatment variables, measurements and time.
* How well we can generalise the causal relationship outside the specifics of the experiment → is your sample representative? Is the context representative? Can results from animal labs generalise to humans?
* High external validity occurs when we are able to generalise our experimental findings

Population validity
How well your experimental findings can be replicated in a wider population
* Aim to have the findings generalise from our experimental sample to the wider population → difficult to obtain in controlled experimental settings
* WEIRD population: Western, Educated, Industrialised, Rich, Democratic → differences in tasks ranging from motivation and reasoning to even visual perception
* Can you generalise your results to the wider population? Example: you run a study on drinking habits and the participants who sign up all happen to be around 22 and female.

Ecological validity
How well you can generalise the results outside of a laboratory environment to the real world (i.e. the setting of the experiment → if it were done in a different environment, would people act the same?)
* Laboratory experiments vs. real-life settings, e.g. aggression studies in the lab vs. in real life
* Laboratory settings are very controlled and different from real-life settings → people are aware they are under experimental conditions and behave differently
27
Q

Understand the difference between reliability and validity

A

Validity = accuracy
Reliability = consistency

Reliability: the consistency and repeatability of the results of a measurement.
* My scales at home always consistently tell me that I weigh 55 kg – they are reliable because they produce the same results consistently
Validity: the degree to which a measure or experiment actually measures what it claims to measure
* If my scales are always 5 kg less than my actual weight, then they are not a valid measure of my weight (though they would be very reliable if they were always exactly 5 kg off).

We want scientific measures to be both reliable and valid
* Reliability demonstrates the measure consistently performs the same way
* Validity demonstrates that the measure actually measures what it claims to measure
* A valid measure that is also reliable accurately measures what it claims to, and does so consistently.
28
Q

what are the requirements for causality

A

J. S. Mill proposed three requirements for causality:
1. Covariation → is there evidence for a relationship between the variables?
2. Temporal sequence → one variable occurs before the other
3. Eliminate confounds → explain or rule out other possible explanations.
29
Q

Define and identify confounds

A

Types of confounds (threats to internal validity, i.e. the strength of a causal claim):
* Third variable
* Experimenter bias
* Participant effects
* Time effects

J. S. Mill's 3 criteria to infer causation:
1. Covariation → show a relationship between two things
2. Temporal sequence → one thing occurs before the other
3. Eliminate alternative explanations

Third variable problem
A confound (same as a third variable) is an extraneous variable that systematically varies with or influences both the independent and the dependent variable
* A confound is a third variable that differs between the groups
* Confounds influence the DV and are not the variable you are manipulating
* You may have a different confound in your experimental and control groups
Example: there is a positive correlation between coffee drinking and the likelihood of having a heart attack. Can we conclude that drinking coffee causes heart attacks?
* People who smoke tend to drink more coffee
* May be increased job stress
* May have poor sleep → could be why you drink more coffee, or could be due to drinking more coffee

Experimenter bias
A confound which undermines the strength of a causal claim
* The bias of the experimenter may influence the way the dependent variable is scored
* The experimenter may behave in a way that influences the participants and confounds the results of the experiment
* Not always intentional → previous knowledge and ideas can create tunnel vision in the experimenter
* Example → the "smart" vs "dumb" rats case study
* Double-blind studies help with this

Participant effects
The way a participant behaves can influence the validity of the results
* Systematic individual differences can interfere with the causal relationship you are investigating
Demand characteristics:
* Participants identify the purpose of the study and behave in a certain way as a result
* Unobtrusive observation helps resolve demand characteristics
* Indirect measures also help (this also ties into measurement effects)
* Deception and confederates also help resolve this

Time-related confounds
Maturation – the effect of time on participants
* Short term: mood, tiredness, hunger, boredom → deal with this by counterbalancing the order of tests, controlling for time of day, designing experiments of reasonable length, and including breaks in the experimental design
* Long term: age, education, wealth → difficult to control for; only important for longitudinal studies which take place over many years → random assignment and sampling help to reduce this confound
30
Q

Define and identify artifacts

A

Types of artifacts:
* Mere measurement effect
* History effects
* Selection bias

Artifacts reduce external validity
* Prevent you generalising your results
* Unlike a confound, an artifact is something that is ever present in all groups being tested and stays constant

Mere measurement effect
Being aware that someone is observing or measuring your behaviour may change the way you behave.
* Important for external validity, as it undermines the ability to generalise lab results to the wider population and context
* Similar to demand characteristics, except that it affects all subjects in the experiment → not an individual difference variable

History effects
The effect of a period of time may make an entire sample biased
* Example: level of education in Syria → war zone = limited access to school, limited shelter and food
* The data is influenced by the moment in time → can't generalise these findings to a wider population or different contexts

Selection bias
Participants who volunteer for a study have a biased interest in the topic of research or the outcome of the study.
* Example: you would expect people who really love beards to be the ones who complete a survey about beards

Non-response bias
A problem for experiments that involve voluntary sign-ups or surveys
* People do not respond when they are not interested in something → you lose a large sample of the population to non-response bias
* This undermines the external validity of the experiment → a limited population means the results cannot be generalised to a wider population
* Many polls and online surveys are subject to this threat

How to manage selection bias
* Use a random sample of the population → does not eliminate all problems, but reduces the likelihood of systematic biases in your data
* Compulsory poll – census → reduces sampling bias, produces results and data that are more widely applicable, less susceptible to biased groups, but still susceptible to demand characteristics
31
Q

Understand the difference between confounds and artifacts

A

Artifacts reduce external validity
* Prevent you generalising your results
* Unlike a confound, an artifact is something that is ever present in all groups being tested and stays constant

A confound is a third variable that differs between the groups
* Confounds influence the DV and are not the variable you are manipulating
* You may have a different confound in your experimental and control groups
32
Q

identify and define non-experimental methods

A

Descriptive & observational research (the lowest level of non-experimental research, such as the census)
* Like creating a database → very large studies to get a lot of data → use this to create further studies
* Observational research is observing the data → not taking part or manipulating anything (like National Geographic → watching animals)
* Use observational research to get an idea about the topic area to then expand on
* Highest external validity and the lowest internal validity

Correlational research
* Don't manipulate anything → looking at the relationship between variables/measures → no independent variables
* Second highest external validity and the second lowest internal validity
33
Q

identify and define experimental methods

A

Quasi-experiment
* Use information that we already have to split people into groups → cannot randomly assign participants (e.g. you can't give people depression to measure depression; used if you want to look at a clinical population)
* Interested in looking at the different TYPES of participants (e.g. the sane vs the insane)
* Second highest internal validity and the second lowest external validity (can be generalised more easily than a true experiment)

True experiment
* Always has an IV and DV → the IV has random assignment → minimises individual differences and third variables → also has a control
* Highest internal validity but the lowest external validity
34
Q

Define and identify the characteristics of a true experiment

A

1) Systematic manipulation of one or more variable(s) between or within groups – the IV
* Guarantees the temporal order of cause and effect
* Allows us to observe covariation between variables
* Minimises alternative explanations/confounds
2) Random assignment to each condition/group
* Minimises alternative explanations/confounds

Example: violent video games and aggression
* Must have systematic manipulation of the independent variable → IV = violent video games → needs an operational definition → the IV is playing violent video games as operationalised by playing Call of Duty
* Must have a measurement – need to measure the effect of the IV → DV = aggression or violence → also needs an operational definition

With a true experimental design:
* Manipulation of the IV gives us confidence in the cause-effect relationship between violent video games and aggression
* Control group: gives us confidence we can minimise systematic differences
* Random assignment to groups minimises systematic differences/confounds between our groups (internal validity)
* Random sampling reduces bias in our sample (population validity)
35
Q

Define and understand random allocation

A

Random assignment
* You randomly assign participants to each of the groups
* Reduces the likelihood of systematic differences between the participants in the groups, which would undermine internal validity
* May have some differences, but in the long run we can be sure we don't have biased assignment
* In the long run, over multiple experiments, we can be sure we have eliminated this confound
36
Q

Identify and understand the difference between random assignment and random sampling.

A

Random sampling is an approach to recruiting subjects for your study
* Try to sample different elements of the population proportionally → more representative sample
* Applies to all forms of research design
* Supports external validity

Random assignment is an approach to controlling bias in group allocation
* Minimises confounds
* Supports internal validity
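
A minimal sketch of the distinction using Python's standard library (the population and group sizes are invented). Sampling decides who gets into the study; assignment decides which condition each recruited person ends up in:

```python
import random

random.seed(1)

# Random SAMPLING: recruit participants from the population (external validity)
population = [f"person_{i}" for i in range(1000)]  # hypothetical sampling frame
sample = random.sample(population, 20)             # each member equally likely

# Random ASSIGNMENT: allocate the recruited sample to conditions (internal validity)
random.shuffle(sample)
treatment_group = sample[:10]
control_group = sample[10:]

print(len(treatment_group), len(control_group))  # 10 10
```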
37
Q

Understand the difference between between and within subject experiments

A

Within-subjects design (repeated measures design)
* Use the same participants in the different conditions → controls third variables in another way
* Still a true experiment – systematic manipulation of the IV
* No random assignment – but using the same participants for each condition removes individual difference confounds

Advantages:
* Can be very powerful – removes the noise of individual differences
* Powerful in terms of statistics
* Accounts for individual differences

Limitations: order effects
* Fatigue
* Practice
* Carry-over
→ COUNTERBALANCE the order of conditions
38
Q

Define and identify the characteristics of a Quasi experiment

A

Research designs where the researcher has only partial control over the independent variables
* Participants are assigned to groups or conditions without random assignment
* Very useful when random assignment is not possible or ethical

Quasi-experiments have dependent variables, and sometimes true independent variables, BUT ALSO HAVE quasi-independent variables
* Like independent variables, except: not manipulated by the experimenter, and random assignment is not possible

Two major types of quasi-independent variable:
1. Person/attribute variables
2. Natural variables

Two types of quasi-experiments:
* Person x treatment
* Natural experiments
39
Q

Define and understand the difference between an attribute and natural variable

A

Person/attribute variables
Individual difference variables
* Can vary along a spectrum
* Can be based on diagnostic criteria
* Most commonly used for comparing groups – grouping variables
* Let us compare differences on a dependent variable when random assignment is not possible
* Essentially a measured, not manipulated, independent variable
* Note: must be measured prior to the experiment, otherwise there are issues with internal validity (temporal sequence!) → need to be sure that we didn't cause the difference

Example attribute variable: extroversion vs introversion
* Have randomly selected participants complete a personality test which measures a range of personality traits
* Examine the scores and then select a group that scores high in extroversion and a group that scores high in introversion
* The attribute is measured and then participants are split into groups on the basis of their score

How do we split?
Splitting attribute variables into high and low groups is common practice, but not the best method statistically (quite crude)
Median split → find the mid-point (see the sketch after this card)
* Advantages: easy; you keep all participants
* Disadvantages: the participants either side of the median (e.g. participants 10 and 11) are very similar; loss of information about unique individual differences

Natural variables
Another form of quasi-independent variable
* Variables that are manipulated by nature → sometimes called natural experiments → called "acts of god" by insurance companies
* For example: being in a hurricane, living in a warzone, country of birth, biological sex, age
* These variables can't be manipulated by the experimenter or randomly assigned → they have been independently manipulated by nature
* Natural variable quasi-experiments allow us to look at the effect of war zones, different environments or biological differences

Natural vs. attribute variables
Can be quite hard to distinguish
* Age is a natural variable; living in a hurricane zone is a natural variable
* What about introversion? It seems obviously an attribute variable, but genes and environment are usually both out of the control of the person (epigenetics → how genes interact with the environment)
It is not always clear. However, this is not a problem because:
1. They are treated the same way statistically
2. They pose the same kinds of threats to internal validity
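
A minimal sketch of the median split mentioned above (Python standard library; the extroversion scores are invented). It also shows why the method is crude: scores just either side of the median land in different groups:

```python
import statistics

# Hypothetical extroversion scores for 10 participants (sorted for clarity)
scores = [12, 18, 25, 31, 33, 34, 40, 45, 52, 60]

median = statistics.median(scores)  # 33.5

low_group = [s for s in scores if s <= median]   # [12, 18, 25, 31, 33]
high_group = [s for s in scores if s > median]   # [34, 40, 45, 52, 60]

# Note: 33 and 34 are nearly identical participants, yet end up in opposite groups
print(low_group, high_group)
```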
40
Q

Understand and identify a Person x treatment Quasi Experiment

A

Person/Attribute x Treatment design
* Quasi-independent variable – measured, not manipulated, and no random assignment
* True independent variable – manipulated, with random assignment
* Dependent variable – measured by the experimenter
Allows us to examine group differences and how they interact with a manipulated or treatment variable

Example:
* Quasi-IV = anxiety levels → split participants into severe or moderate anxiety groups based on their scores on a valid anxiety scale
* True IV: new treatment vs existing treatment
* DV: reduction of anxiety symptoms
41
Q

Identify the benefits and limitations of Quasi-Experiments

A

Threats to internal validity
In quasi-experiments, the lack of random assignment or controlled/systematic manipulation of the quasi-independent variable means:
* We can never be certain of the temporal order of the quasi-IV and DV
* Third variables → alternative explanations! → pre-existing group differences on other variables → we can try to match the groups on other characteristics

Why do them?
* Higher in external validity
* Can match or patch groups for relevant threats → not perfect, but can be quite effective for ruling out specific third variables/alternative explanations
* Sometimes you can't manipulate things in the lab → not possible to manipulate depression, or whether someone is a psychopath

Patching
A number of different control groups used to try to account for the major threats to internal validity
* Forensic control group vs. psychopath group – similar life situation
* Amygdala-damage patient compared to subjects with brain damage in a different region
* Community control groups for brain-damaged subjects, to control for age and IQ differences
42
Q

Understand correlational designs

A

How do correlational designs work?
* Measuring but not manipulating variables → multiple dependent variables
* The experimenter is not manipulating anything, just measuring participants
The same issues as for descriptive studies apply:
* Measurement/testing effects
* Question wording
* Random sampling needed to ensure external validity

WARNING!
* The difference between correlational designs and quasi-experiments is not always 100% clear
* Correlational designs have ONLY DVs → only measuring, NOT manipulating → quasi-experiments have more
* Correlational studies need continuous variables (like happiness and wealth rated on a scale of 1-10) → you can't use a discrete category in a correlational design
* Quasi-experiments need categorical variables → split participants into groups and examine the differences between groups
43
Q

Define and identify the basic correlations

A

Positive correlation
The two variables co-vary in the same direction.
* As scores on one variable increase, scores on the other variable increase (and as one decreases, the other decreases)
* Meaning: as annual salaries increase, the amount of happiness increases (or: as annual salaries decrease, the amount of happiness decreases)

Negative correlation
The two variables co-vary in different directions.
* As scores on one variable increase, scores on the other variable decrease (and vice versa)
* Meaning: as annual salaries increase, the amount of happiness decreases (or: as annual salaries decrease, the amount of happiness increases)

No correlation (uncorrelated)
The two variables do not co-vary.
* As scores on one variable increase, scores on the other variable are unrelated
* Meaning: annual salaries and the amount of happiness are not related
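
A minimal sketch of all three patterns (assuming numpy; the salary and happiness numbers are simulated, not real data):

```python
import numpy as np

rng = np.random.default_rng(0)
salary = rng.uniform(30_000, 120_000, size=200)
noise = rng.normal(0, 1, size=200)

positive = salary / 20_000 + noise       # happiness rises with salary
negative = -salary / 20_000 + noise      # happiness falls with salary
unrelated = rng.normal(5, 1, size=200)   # happiness independent of salary

for name, y in [("positive", positive), ("negative", negative), ("none", unrelated)]:
    r = np.corrcoef(salary, y)[0, 1]     # Pearson correlation coefficient
    print(f"{name}: r = {r:+.2f}")       # roughly +0.9, -0.9 and 0.0
```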
44
Q

Understand why correlation does not equal causation

A

Is this a causal statement? It certainly sounds like one!
Judging a causal statement means judging internal validity, via J. S. Mill's 3 criteria to infer causation:
1. Covariation (yes)
2. Temporal sequence (yes and no → hard to establish most of the time)
3. Eliminate alternative explanations (no → the third variable problem)

Correlation is NOT causation
In informal logic, an argument that two things are related just because they co-occur or co-vary is a fallacy:
* X occurs after Y, so they must be related
* X occurs at the same time as Y, so they must be related
So, arguing that correlation implies causation commits these informal logical fallacies.
45
Q

Identify and understand the types of correlational relationships

A

Direction of the correlation
You need to be able to identify the temporal sequence of the relationship → very hard to do
* Does having more money (X) make you happier (Y)?
* Does being happier (Y) increase your chances of making more money (X)?
* Both of these?
Issue of reverse causality → cannot determine the direction of the relationship

Indirect correlational relationships
Occur when there is a variable in between the two variables of interest that is critical to the correlation
* Does beauty (X) cause happiness (Y) by increasing wealth (Z)?
* Do cat bites (X) cause depression (Y) by decreasing the amount of time you spend out of the house (Z)?

Third variable correlational relationships
A third, un-measured variable actually causes X and Y and creates the illusion of a correlation between X and Y – a confounding variable
* Does a failed relationship (Z) increase the chances of buying a cat and the likelihood of developing depression?
* Are wealth and happiness both increased by higher education (Z)?

Spurious correlational relationships
Spurious correlations occur when two things appear to co-vary but are not actually related in any way.
* Different from a third variable correlation, as there is no relationship or connection at all between X and Y – it just appears that way ("correlation does not mean causation")
* Can also be called an illusory correlation
46
Q

Identify and understand the confounds and limitations of correlational research

A

Sources of confounds in correlational designs
Person confounds – individual differences that tend to co-vary
* For example: depression and feelings of loneliness (and thus the desire for a cat)
* Depression and anxiety
Environmental confounds – situations that cause multiple differences
* For example: coming to UNSW can increase knowledge and anxiety
* Listening to your lecturer can simultaneously increase boredom and frustration
Operational confounds – a measure that measures multiple things
* For example: the correlation between impulsivity and poor decision making
* Definition of impulsivity: a tendency to act on a whim, displaying behaviour characterised by little or no forethought, reflection, or consideration of the consequences
* So poor decisions are part of the definition → they are correlated by definition!

Limits of correlational research
Correlational studies look at the relationship between measured variables
* Can establish co-variation
* Cannot establish temporal sequence effectively
* Cannot eliminate alternative explanations effectively
* Low in internal validity
Confounds can arise due to: individual differences, environments, operational definitions
47
Q

Define the difference between nonexperimental and experimental research

A

Non-experimental: descriptive and correlational
* Low internal validity but high external validity
* No manipulation → measurement only

Experimental:
* Quasi-experiment → non-random assignment
* True experiment → random assignment
* High internal validity, low external validity

Descriptive research
* No independent variables, only dependent variables → can be thought of as looking at a single dependent variable
* The aim is to measure and describe, not to explain → description is one of the 3 aims of science (description, prediction, explanation)
* Aims to simply describe what is occurring in a certain context
* Example: Alfred Kinsey was interested in sexuality → what percentage of the population is homosexual? Based on a large survey, Kinsey questioned the label of homosexuality and found it inadequate

Survey methods
* The majority of descriptive studies are conducted by surveys
* Limited data from large samples (the opposite of case studies)
* Address questions of "how many", "how much", "who" and "why"
* Advantages: quick and efficient; very large samples; obtain public opinion almost immediately; simple to use

Observational research overview
* Usually good for external validity, terrible for internal validity (by themselves)
* Observational studies allow for observation in the real world
* Participant observation can lead to issues of experimenter bias
* Longitudinal and cross-sectional designs → cannot manipulate variables, but can get a sense of behaviour over time or across groups
48
Q

Define and identify the types of descriptive and observational research

A

Descriptive and observational studies (the terms can be used interchangeably; none of these have an independent variable):
* Case study → single subject
* Descriptive research → describe and measure; no independent variables
* Observational research → observe subjects; no independent variables
49
Q

Understand the benefits and limitations of descriptive and observational research

A

Dangers of non-random sampling
* You gain a representative sample by taking a random sample of the population
* Surveys often have response bias → critical, as it reduces the generalisability of the survey's results

Naturalistic observation
* Advantages: necessary for studying issues that are not amenable to experimentation; extremely useful in the initial phases of investigation
* Disadvantages: cannot determine cause-effect relations (no internal validity); very time consuming; are the observed aware of the observer? → Hawthorne effect

Participant observation
* Advantages: can be used in situations that otherwise might be closed to scientific investigation
* Disadvantages: the dual role of the researcher maximises the chances of the observer losing objectivity and letting personal biases enter the description; time consuming and expensive

Longitudinal research – follow the same participants across a long time period
* Advantages: genuine changes and stability of some characteristics can be observed; major points of change can be observed
* Disadvantages: time consuming and expensive; participant attrition is a threat to validity

Cross-sectional research – take groups from different points in time to get a cross-section of the community
* Advantages: relatively inexpensive and less time consuming; low attrition rate
* Disadvantages: cannot observe changes in individuals; insensitive to abrupt changes; age-cohort effects
50
Q

Relevance of Statistics in Psychological Research

A

Data-driven insights
* Statistics are essential for extracting meaningful insights from the vast amounts of data collected in psychological studies.
Informed decision making
* Statistical analysis helps psychologists make evidence-based decisions and draw reliable conclusions about human behavior and cognition.
Collaboration and replication
* Rigorous statistical methods enable psychological research to be shared, replicated, and built upon by the scientific community.
51
Q

The Ethical Imperative: Why Understanding Statistics Matters

A

Ethical data practices
* Understanding statistics is crucial for ethically collecting, analyzing, and presenting data. It helps avoid misrepresentation and unintended biases.
Informed decision-making
* Proficiency in statistics empowers psychologists to make well-founded, evidence-based decisions that positively impact research and clinical practice.
Transparency and accountability
* Robust statistical knowledge fosters transparency, allowing psychologists to communicate findings clearly and be accountable to research participants and the public.
Advancing the field
* Mastering statistics is essential for pushing the boundaries of psychological research and contributing to the ethical progress of the discipline.
52
Q

Implications of Misreporting

A

Overgeneralization → a misleading one-size-fits-all impression of therapy effectiveness.
Patient harm → wasted time on ineffective treatments.
Research mistrust → damages the credibility of psychological studies.
Ethical responsibility → researchers must present the complete picture, including limitations.
53
Q

Measures of Central Tendency

A

Mean
* The arithmetic average of a set of values. Calculated by summing all the values and dividing by the total number of values.
Median
* The middle value when the data is arranged in numerical order. Useful for skewed distributions where the mean may not be representative.
Mode
* The value that occurs most frequently in the dataset. Can identify the most common or typical value.
When to use each
* The mean is most commonly used, but the median or mode may be more appropriate depending on the distribution and research goals.
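
A minimal sketch with Python's standard statistics module (the dataset is invented, and deliberately right-skewed so the mean and median disagree):

```python
import statistics

scores = [2, 3, 3, 4, 5, 5, 5, 6, 7, 30]  # hypothetical skewed data

print(statistics.mean(scores))    # 7.0 -> pulled upwards by the outlier (30)
print(statistics.median(scores))  # 5.0 -> more representative of the bulk
print(statistics.mode(scores))    # 5   -> the most frequent value
```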
54
Q

Measures of Dispersion: Understanding Variability

A

Range
* The difference between the highest and lowest values in a dataset, indicating the overall spread.
Variance
* The average squared deviation from the mean, capturing the dataset's overall dispersion.
Standard deviation
* The square root of the variance, providing a more intuitive measure of the average deviation.
Interquartile range
* The difference between the 75th and 25th percentiles, describing the middle 50% of the data.
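
Continuing the same invented dataset, a sketch of the four dispersion measures (pvariance/pstdev treat the list as a complete population rather than a sample):

```python
import statistics

scores = [2, 3, 3, 4, 5, 5, 5, 6, 7, 30]  # same hypothetical data as above

data_range = max(scores) - min(scores)          # overall spread
variance = statistics.pvariance(scores)         # mean squared deviation from the mean
sd = statistics.pstdev(scores)                  # square root of the variance
q1, _, q3 = statistics.quantiles(scores, n=4)   # 25th and 75th percentiles
iqr = q3 - q1                                   # middle 50% of the data

print(data_range, round(variance, 2), round(sd, 2), iqr)
```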
55
Q

what are histograms and bar charts?

A

Histograms and bar charts are powerful tools for visualizing the distribution of data. Histograms display the frequency of values, while bar charts compare the magnitudes of different categories. These visualizations help identify patterns, outliers, and the overall shape of the data - crucial for gaining insights and communicating findings effectively.
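
A minimal plotting sketch (assuming matplotlib is installed; the reaction times and condition means are invented):

```python
import random
import matplotlib.pyplot as plt

random.seed(0)
reaction_times = [random.gauss(450, 60) for _ in range(200)]  # hypothetical ms values

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

# Histogram: frequency distribution of a continuous variable
ax1.hist(reaction_times, bins=20)
ax1.set(title="Histogram", xlabel="Reaction time (ms)", ylabel="Frequency")

# Bar chart: magnitudes of discrete categories
ax2.bar(["Control", "Drug A", "Drug B"], [452, 430, 401])  # hypothetical means
ax2.set(title="Bar chart", ylabel="Mean RT (ms)")

plt.tight_layout()
plt.show()
```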
56
Q

whats a box plot

A

Boxplots offer a concise yet powerful way to visualize the distribution of data. They display the median, interquartile range, and any outliers, providing valuable insights into the spread and symmetry of a dataset. Analyzing the boxplot can reveal key characteristics such as the presence of skewness, the extent of variability, and the identification of unusual data points. This visual tool is especially helpful for quickly comparing data distributions across different groups or conditions.
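
A minimal boxplot sketch in the same vein (assuming matplotlib; the two groups are simulated so that one is clearly more variable):

```python
import random
import matplotlib.pyplot as plt

random.seed(1)
group_a = [random.gauss(50, 5) for _ in range(100)]   # tight distribution
group_b = [random.gauss(60, 15) for _ in range(100)]  # wider spread, more outliers

# Each box shows the median, interquartile range, whiskers and outlier points
plt.boxplot([group_a, group_b], labels=["Group A", "Group B"])
plt.ylabel("Score")
plt.show()
```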
57
Q

whats a scatterplot?

A

Scatterplots allow us to visualize the relationship between two variables. The pattern of data points reveals the strength and direction of the correlation - whether the variables are positively, negatively, or not correlated. Analyzing scatterplots provides insights into the nature of the relationship, highlighting potential trends, clusters, and outliers. This lays the groundwork for deeper statistical analysis to quantify the correlation coefficient and determine its significance.
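
A minimal scatterplot sketch (assuming numpy and matplotlib; the study-time and exam-score data are simulated with a built-in positive relationship):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
hours = rng.uniform(0, 10, size=80)                 # hypothetical predictor
score = 50 + 4 * hours + rng.normal(0, 8, size=80)  # positively related, plus noise

r = np.corrcoef(hours, score)[0, 1]  # quantify the relationship with Pearson r

plt.scatter(hours, score)
plt.xlabel("Hours studied")
plt.ylabel("Exam score")
plt.title(f"Positive correlation (r = {r:.2f})")
plt.show()
```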
58
Q

Importance of Data Cleaning and Preparation

A

Identifying errors
* Thoroughly inspect your data for missing values, outliers, and inconsistencies that could skew your analysis.
Handling missing data
* Decide on appropriate methods to address missing information, such as imputation or exclusion, to maintain data integrity.
Standardizing formats
* Ensure all data is in the correct format and units to enable accurate comparisons and calculations.
Transforming variables
* Apply necessary data transformations, such as logarithmic or square root, to meet statistical assumptions.
59
Q

Handling Missing Data: Strategies and Considerations

A

Imputation
* Replace missing values with estimates based on patterns in the existing data, such as mean or median substitution.
Listwise deletion
* Remove any cases with missing data, but this can reduce statistical power and introduce bias if the missingness is not random.
Multiple imputation
* Generate multiple plausible values for each missing data point to account for uncertainty, then pool the results.
Analysis of missingness
* Investigate the patterns and mechanisms behind missing data to select the most appropriate handling method.
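
A minimal sketch of two of these strategies, mean imputation and listwise deletion (assuming pandas is installed; the tiny dataset is invented):

```python
import numpy as np
import pandas as pd

# Hypothetical survey data with scattered missing values
df = pd.DataFrame({
    "participant": [1, 2, 3, 4],
    "anxiety": [12.0, np.nan, 9.0, 15.0],
    "sleep_hours": [7.0, 6.0, np.nan, 8.0],
})

# Listwise deletion: drop every row containing a missing value
complete_cases = df.dropna()  # loses participants 2 and 3 entirely

# Mean imputation: replace each missing value with its column's mean
imputed = df.fillna(df.mean(numeric_only=True))

print(complete_cases)
print(imputed)
```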
60
Q

Interpreting Descriptive Statistics: Beyond the Numbers

A

Visualizing the data
* Graphs and charts can bring descriptive statistics to life, revealing patterns, outliers, and relationships that may not be evident in raw numbers alone.
Contextual interpretation
* Understanding the real-world implications of descriptive statistics requires considering the study design, sample characteristics, and potential biases.
Practical significance
* Statistical significance alone does not necessarily equate to practical or clinical significance. Evaluating the magnitude of effects is key.
61
Q

Ethical Considerations in Data Presentation

A

Transparency
* Ethical data presentation means being transparent about the source, methods, and limitations of the data. Hiding key details can mislead or manipulate the audience.
Avoiding bias
* Carefully consider how data is visualized and framed to ensure it does not introduce unconscious biases. Selective highlighting or omission can skew interpretation.
Context matters
* Ethical practice requires providing appropriate context to help the audience understand the full picture. Isolating data points without broader context can be misleading.
Responsible reporting
* Researchers have a duty to report findings accurately and avoid sensationalizing or exaggerating results. Honest, objective presentation builds trust in the scientific process.

Avoiding common pitfalls in descriptive statistics:
* Misinterpreting visualizations → ensure proper understanding of graph types and their limitations to avoid drawing incorrect conclusions from descriptive data.
* Choosing inappropriate analyses → matching the right descriptive statistic to the research question is crucial to obtain meaningful and ethical insights.
* Data entry errors → meticulous data cleaning and verification are essential to ensure the accuracy of descriptive statistics and visualizations.
62
Q

Practical Applications of Descriptive Statistics in Psychology

A

Research design
* Descriptive statistics are essential for planning studies, determining sample sizes, and interpreting research findings.
Psychological assessment
* Measures of central tendency and variability help clinicians understand client test scores and make informed decisions.
Intervention evaluation
* Descriptive stats allow psychologists to track progress, identify areas for improvement, and demonstrate program effectiveness.
Data visualization
* Charts and graphs based on descriptive statistics enhance communication and improve understanding of psychological phenomena.
63
1. How does a scientist approach thinking differently from everyday thinking?
Answer: Scientists rely on objective analysis, systematic observation, and evidence-based conclusions, avoiding assumptions or subjective beliefs. They apply skepticism, empiricism, and critical thinking to form judgments.
64
2. Why is critical thinking important in evaluating scientific evidence?
Answer: Critical thinking allows individuals to objectively analyze and evaluate evidence, assess credibility, recognize biases, and form well-supported conclusions. Without it, people might accept information based on authority, intuition, or tenacity without validating it.
65
3. What are some sources people commonly rely on for information? Why might these be unreliable?
Answer: Common sources include common sense, superstition, intuition, authority, and tenacity. These sources are unreliable because they often lack systematic evidence, are based on personal bias, or rely on repetition rather than factual support.
66
4. Explain the issues with relying on 'folk wisdom' or common sense.
Answer: Folk wisdom often includes contradictory statements (e.g., "Absence makes the heart grow fonder" vs. "Out of sight, out of mind") and lacks systematic evidence, leading to unreliable or biased conclusions.
67
5. What is parsimony, and why is it important in scientific research?
Answer: Parsimony, or Occam's Razor, suggests choosing the simplest explanation with the fewest assumptions when multiple hypotheses predict the same outcome. It prevents unnecessary complexity and focuses on the most likely solution.
68
6. What does 'extraordinary claims require extraordinary evidence' mean?
Answer: This principle, associated with Carl Sagan, means that highly unusual or improbable claims need very strong and compelling evidence. For instance, seeing a celebrity in public might only need a photo, but alien encounters require extensive, credible proof.
69
7. Describe the concept of verification in scientific research.
Answer: Verification involves providing observable, confirmable evidence to support a claim. For a hypothesis to be scientifically valid, there must be evidence that can be consistently observed by others.
70
8. What is falsification, and why is it important in science?
Answer: Falsification, proposed by Karl Popper, is the idea that scientific claims must be able to be proven wrong. A hypothesis should allow for the possibility that it might be incorrect, encouraging rigorous testing and honest evaluation.
71
9. Give an example of how people naturally seek confirmatory evidence rather than falsification.
Answer: People tend to search for information that supports their views, such as googling "Does homeopathy work?" instead of "Evidence that homeopathy doesn’t work." This confirmation bias prevents objective analysis.
72
10. Why might relying on authority figures for information be problematic?
Answer: Authority figures can have biases, and they may not be experts in the specific area of inquiry. Evaluating evidence even from authorities is necessary to avoid misinformation or unsupported claims.
73
Q1: What are the four key ethical principles in psychological research?
A1: The four key principles are:
* Do no harm
* Informed consent
* Protection of privacy
* Valid research design
74
Q2: Explain the "Do No Harm" principle in research ethics.
A2: This principle ensures that researchers avoid causing physical, mental, or emotional harm to participants. It aligns with the principle of non-maleficence and emphasizes the importance of minimizing harm and discomfort in research.
75
Q3: Why is informed consent essential in psychological research?
A3: Informed consent ensures participants are aware of the study's nature, potential risks, and their right to withdraw without consequence. It is a legal and ethical requirement to respect participants' autonomy.
76
Q4: How does a "valid research design" contribute to ethical research?
A4: A valid research design ensures that the study has the potential to provide meaningful results, justifying any risks involved. Ethical panels evaluate this design to weigh the cost-benefit ratio and ensure ethical standards are met.
77
Q5: Describe the unethical practices observed in WWII German medical trials on concentration camp prisoners.
A5: The Nazi medical trials included experiments intended to create immunity to tuberculosis, in which Dr. Heissmeyer injected live tuberculosis bacteria into subjects' lungs and removed lymph glands. Dr. Josef Mengele also conducted inhumane twin studies, including injecting chemicals into eyes and attempting to create conjoined twins.
78
Q6: What ethical dilemma arises from using data obtained from unethical experiments like those of WWII?
A6: Although the methods were unethical, some argue that the data might hold value for modern medicine. This raises a dilemma about whether using this data is justified if it has potential life-saving applications.
79
Q7: How did the Nuremberg Trials contribute to modern ethical guidelines in research?
A7: The Nuremberg Trials exposed the Nazi war crimes, including unethical human experimentation. This led to the establishment of the Nuremberg Code, a set of ethical principles for human research that strongly influenced later guidelines.
80
Q8: Name three major ethical bodies for psychological research and where they are located.
A8:
American Psychological Association (APA) – USA
British Psychological Society (BPS) – UK
Australian Psychological Society (APS) – Australia
81
Q9: Summarize the ethical dilemma in Henle & Hubbell's (1938) study on egocentricity in conversation.
A9: The study involved unobtrusive observations, raising issues around informed consent as participants were unaware they were being observed. Although no physical harm was done, the lack of consent and potential discomfort make it ethically questionable.
82
Q10: What were the ethical issues in Zimbardo's (1973) Stanford Prison Experiment?
A10: Ethical concerns included psychological harm, as participants experienced significant stress and distress. There was limited informed consent since participants didn’t expect to be arrested at home, and privacy was compromised as arrests happened publicly.
83
Q11: Why was deception used in Milgram's (1963) obedience study, and what ethical concerns did it raise?
A11: Deception was necessary to test genuine obedience, but it compromised informed consent as participants didn’t know the study's true nature. The study caused psychological distress, raising concerns about harm and whether the deception was justified.
84
Q12: What are the three "Rs" in animal research ethics?
A12: The three "Rs" are: Replacement: Use alternative methods if possible. Reduction: Minimize the number of animals used. Refinement: Improve procedures to reduce suffering.
85
Q13: Discuss the main ethical dilemma associated with animal research.
A13: The ethical dilemma centers on whether the potential benefits to human health justify the harm to animals. Although animal physiology often mirrors human systems, critics argue that ethical standards should protect animal welfare, while supporters focus on the value of research outcomes.
86
Q14: Define scientific misconduct and name four forms it can take.
A14: Scientific misconduct refers to unethical practices in research. The four main forms are: Plagiarism: Using others' work without credit. Conflict of Interest: When personal gain influences research outcomes. Fabricating Data: Making up data that didn’t exist. Falsification of Data: Manipulating or selectively reporting data.
87
Q15: Give an example of a famous case of fabrication in psychological research.
A15: Diederik Stapel, a Dutch psychologist, fabricated data in at least 30 published studies. His actions significantly impacted the credibility of social psychology research.
88
Q16: Explain how conflicts of interest can bias research findings with an example.
A16: Conflicts of interest occur when a researcher’s personal or financial gain could skew results. For instance, Coca-Cola funded studies suggesting sugar doesn’t contribute to obesity, raising questions about the impartiality of these findings.
89
Q1: Why is data analysis considered an ethical issue in psychology?
A1: Data analysis and reporting require transparency to avoid misleading interpretations. Ethical data analysis ensures accurate representation of results, respects participant confidentiality, and avoids manipulation or selective reporting that could misrepresent findings.
90
Q2: Explain the difference between statistical significance and clinical significance.
A2: Statistical significance indicates whether an observed effect is unlikely to have occurred by chance. Clinical significance, however, considers whether a treatment has a meaningful impact on participants, addressing practical implications beyond statistical patterns.
91
Q3: Why is transparency important in data analysis, particularly regarding the replicability crisis?
A3: Transparency helps other researchers replicate studies and achieve similar results, which is crucial for scientific credibility. The replicability crisis—where studies often fail to replicate—highlights the need for clear data reporting and honest disclosure of limitations.
92
Q4: Outline the main steps in the data analysis process.
A4: The key steps are: Collect: Gather data from surveys, experiments, or observations. Organize: Use tools like Excel to structure data. Analyze: Apply statistical methods to identify patterns. Interpret: Draw conclusions to answer the research question.
93
Q5: Describe two types of data collection methods used in psychology.
A5: Survey Responses: Collects participants' opinions and experiences. Behavioral Observations: Records and analyzes participants' actions and reactions in natural or controlled settings.
94
Q6: Differentiate between quantitative and qualitative variables in research.
A6: Quantitative Variables: Represent measurable quantities, like age or test scores. Qualitative Variables: Represent categorical attributes, like gender or favorite color.
95
Q7: What is the difference between continuous and discrete data?
A7: Continuous Data: Includes values that can be divided indefinitely (e.g., reaction time, distance). Discrete Data: Consists of indivisible units represented by whole numbers (e.g., number of children).
96
Q8: List and describe the four main measurement scales in data analysis.
A8: Nominal Scale: Categorizes data without a quantitative order (e.g., gender). Ordinal Scale: Ranks data, indicating order but not precise intervals (e.g., race placement). Interval Scale: Orders data with equal intervals, but lacks a true zero (e.g., temperature in Celsius). Ratio Scale: Includes order, equal intervals, and a true zero, allowing meaningful ratio comparisons (e.g., height, weight).
97
Q9: Why is it important to organize data before analyzing it?
A9: Proper organization clarifies relationships and patterns in the data, making it easier to identify meaningful trends and ensuring the analysis is accurate and efficient.
98
Q10: Explain the purpose of a box plot in data visualization.
A10: A box plot displays the distribution and variability of data. The bold horizontal line represents the median, and the box size reflects data variability, showing the spread of data around the median.
99
Q11: How do continuous and discrete data appear differently in bar charts?
A11: In bar charts, continuous data bars have no gaps, indicating a continuous range of values. Discrete data bars are separated by gaps, representing distinct, categorical data points.
100
Q12: What is the final step in data analysis, and why is it critical?
A12: The final step is interpretation, which involves making sense of the findings in relation to the research question. This step is crucial for drawing conclusions that are relevant, meaningful, and applicable.
101
Q13: List three ways Excel can assist in data analysis.
A13: Organizing Large Datasets: Efficiently stores and structures data. Creating Visualizations: Generates graphs and charts to visually represent findings. Performing Calculations and Basic Analyses: Uses formulas and statistical functions to analyze data.
102
Q14: How does Excel help in conducting basic statistical analyses?
A14: Excel provides formulas for calculations, as well as statistical tools that allow researchers to perform tests such as averages, correlations, and variances directly in the spreadsheet.
103
Q15: Why is it essential for psychology students to learn statistics?
A15: Statistics are essential for understanding research studies, identifying patterns in complex data, and making informed, evidence-based conclusions. Statistics also prepare students for data-driven careers, especially in research and analysis.
104
Q16: How can understanding statistics help a psychologist become more effective?
A16: Statistics allow psychologists to critically analyze data, evaluate treatment efficacy, and interpret trends in human behavior. This skill set enables them to make scientifically grounded decisions in both clinical and research settings.
105
Q1: What are some key principles to consider in data collection?
A1: Principles include:
Identifying the type of data needed.
Deciding on the data collection location.
Ensuring the data collection form is clear.
Creating a duplicate backup of data files.
Training anyone who assists in data collection.
Creating a detailed schedule for data collection.
Cultivating sources for participant recruitment.
Following up with subjects who missed sessions.
Retaining all original data documents.
106
Q2: What are the four types of measures in psychological research?
A2:
Self-Reported Measures: Collect data on what people report about their actions, thoughts, or feelings through questionnaires or interviews. They are often unreliable due to biases.
Tests: Assess individual differences, including personality (self-report affective tests) and ability (e.g., aptitude and achievement tests).
Behavioural Measures: Observe participants’ actions, often using a coding system to convert observations into numerical data.
Physical Measures: Measure biological or physiological responses, such as heart rate or cortisol levels.
107
Q3: What is central tendency, and why is it important?
A3: Central tendency is a statistical measure that identifies the center of a data distribution, providing a summary value that represents the entire data set. Common measures are the mean, median, and mode.
108
Q4: How do you calculate the mean, and when is it most appropriate to use it?
A4: The mean is the arithmetic average calculated by summing all values and dividing by the number of values. It’s best used with data on interval or ratio scales that are not skewed.
109
Q5: What is the median, and why might you choose it over the mean?
A5: The median is the midpoint of a ranked data set, dividing it into two equal halves. It is less sensitive to outliers, making it ideal when there are extreme values that could distort the mean.
110
Q6: Describe how to find the mode and when it’s useful.
A6: The mode is the most frequent score in a data set and can be used with any measurement scale. It is particularly useful for categorical data, like eye color or political affiliation, where other central tendency measures may not apply.
111
Q7: What are the characteristics of a normal distribution?
A7: In a normal distribution, data are symmetrically distributed around the mean, with the mean, median, and mode being equal. This bell-shaped curve represents many naturally occurring phenomena.
112
Q8: How does a positively skewed distribution differ from a normal distribution?
A8: In a positively skewed distribution, the mean is higher than the median or mode, as the distribution tails off to the right. This can occur when there are higher outlier values pulling the mean upwards.
113
Q9: What is a negatively skewed distribution, and what does it indicate about the data?
A9: A negatively skewed distribution tails off to the left, with the mean lower than the median or mode, often due to lower outlier values pulling the mean downward.
114
Q10: When should each measure of central tendency (mean, median, mode) be used?
A10: Mode: For categorical data where items fall into distinct classes. Median: When data include extreme scores or are skewed, as the median is less affected by outliers. Mean: For numerical data without extreme scores, providing an overall average that reflects the entire data set.
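For illustration, a minimal Python sketch (standard-library statistics module; the income figures are made up) showing how an extreme score pulls the mean but not the median or mode:

import statistics

incomes = [30, 32, 34, 35, 35, 38, 40, 250]  # hypothetical data with one extreme score

print(statistics.mean(incomes))    # 61.75 -> pulled upward by the outlier
print(statistics.median(incomes))  # 35.0  -> resistant to the extreme score
print(statistics.mode(incomes))    # 35    -> the most frequent score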
115
Q11: What are some potential issues with self-report measures, and how might they affect data reliability?
A11: Self-report measures can be unreliable due to participant biases, memory recall errors, or the influence of social desirability, where participants respond in a way they think is expected. These issues can distort the accuracy of the collected data.
116
Q1: Why is variability important as a descriptive tool?
A1: Variability describes the extent to which scores in a distribution are clustered around the mean or spread out. High variability means scores are more spread out, while low variability indicates scores are closer to the mean. It provides insights into data distribution patterns, allowing researchers to understand consistency and predictability within the data.
117
Q2: What is the range, and how is it calculated?
A2: The range is the difference between the highest and lowest scores in a distribution. It’s calculated by subtracting the lowest score from the highest. The range gives a simple, rough measure of spread and can be used with ordinal, interval, or ratio data.
118
Q3: What is the standard deviation, and why is it useful?
A3: The standard deviation is the average distance of scores from the mean. A higher standard deviation indicates more variability, while a lower one suggests scores are closer to the mean. It’s sensitive to extreme values and provides insight into the distribution's consistency. Standard deviation can only be used with interval or ratio data.
119
Steps to Compute Standard Deviation
1. Calculate the mean of the data set.
2. Subtract the mean from each score to find each score's deviation.
3. Square each deviation.
4. Find the average of these squared deviations (this is the variance).
5. Take the square root of the variance to get the standard deviation.
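As a rough illustration, the same five steps in Python (made-up scores; this uses the population form, dividing by N, to match the steps above):

scores = [4, 8, 6, 5, 7]  # hypothetical data

mean = sum(scores) / len(scores)         # step 1: mean
deviations = [x - mean for x in scores]  # step 2: deviations from the mean
squared = [d ** 2 for d in deviations]   # step 3: squared deviations
variance = sum(squared) / len(scores)    # step 4: variance
sd = variance ** 0.5                     # step 5: standard deviation

print(mean, variance, sd)  # 6.0 2.0 1.414...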
120
Q4: What is variance, and how does it relate to standard deviation?
A4: Variance is the average of the squared deviations from the mean and provides a measure of how spread out scores are around the mean. It is computed as the square of the standard deviation, making variance a "squared" measure of spread. Like standard deviation, it’s also sensitive to extreme scores and is used with interval or ratio data.
121
Q5: How are standard deviation and variance similar, and how are they different?
A5: Similarities: Both measure variability and depend on the mean. Both are sensitive to extreme values and can only be used with interval and ratio data. Differences: Variance is the squared value of standard deviation, making it harder to interpret in the original units of measurement. Standard deviation, as the square root of variance, is expressed in the same units as the original data, making it more intuitive and easier to understand.
122
what are the measures of variability
Variability is valuable for describing the spread of data. Range is calculated as the difference between the highest and lowest scores. Standard deviation is the average deviation from the mean, indicating data spread. Variance is the average of squared deviations and is equal to the standard deviation squared. Standard deviation is more interpretable than variance, as it uses the same units as the data.
123
1. Why is it important to understand variables and scales of measurement in psychological research?
Answer: Understanding variables and scales of measurement is essential because they: Guide the design of studies and data collection. Ensure that the appropriate statistical tests are applied. Help in accurately interpreting results and drawing valid conclusions. This ultimately leads to more reliable and valid psychological research.
124
2. What are the different types of variables, and how do they affect research design?
Answer: There are two primary types of variables: Independent Variables (IVs): These are usually categorical (e.g., treatment group vs. control group) and are manipulated to observe their effect on the dependent variable. Dependent Variables (DVs): These are usually numerical (e.g., test scores or response times) and represent the outcomes being measured. The type of variable influences the study design, the measurement approach, and the statistical analysis used.
125
3. What is the difference between nominal and ordinal variables, and how are they used in research?
Answer: Nominal Variables: These represent categories with no natural order (e.g., treatment preferences: Cognitive Behavioral Therapy, Medication, Combined Treatment, No Treatment). They are used to classify data into distinct categories. Ordinal Variables: These represent ordered categories with a meaningful sequence (e.g., severity of side effects: None, Mild, Moderate, Severe). They help in understanding the relative position or rank of items but do not provide the exact difference between them.
126
4. How do interval and ratio scales differ, and how are they applied in psychological research?
Answer: Interval Scales: These have equal distances between points but no true zero (e.g., IQ scores). They allow for comparisons of differences but not ratios (you can’t say someone has "zero intelligence"). Ratio Scales: These have a true zero point, allowing for both differences and ratios (e.g., reaction time in milliseconds). A ratio scale makes it meaningful to say one value is "twice as much" as another. Both scales are used in research that measures continuous data.
127
5. What are some common mistakes when using different scales of measurement in psychological research?
Answer: Treating ordinal data as interval: This can lead to inaccurate conclusions. For example, you cannot say "Depression increased by 2 points on a severity scale." Using inappropriate averages: Averages cannot be computed for nominal data, and even ordinal scales need careful interpretation when averaged. Misleading comparisons: "Twice as anxious" only applies to ratio scales; using this language for ordinal or interval data is inappropriate.
128
6. What is a Likert scale, and how is it typically used in psychological research?
Answer: A Likert scale is a fixed-choice rating scale commonly used to measure attitudes, opinions, or perceptions. It typically includes 5 or more points ranging from one extreme (e.g., "Strongly Disagree") to the other (e.g., "Strongly Agree"). Researchers use Likert scales to measure subjective responses like satisfaction, agreement, or frequency.
129
7. In a study of student satisfaction, if you use a 5-point Likert scale, how would you analyze the data both categorically and numerically?
Answer: Categorical Analysis: You would create a frequency table to show how many students selected each response option. You could calculate percentages of students who were satisfied/very satisfied vs. dissatisfied/very dissatisfied. This analysis gives insight into the most common responses and trends in the data. Numerical Analysis: You would calculate the mean satisfaction score to find the average level of satisfaction. Additionally, you could compute the standard deviation to see how spread out the responses are, offering a deeper understanding of the variability in satisfaction levels.
130
8. In a research scenario, when would you treat data as categorical versus numerical?
Answer: Categorical: When data represents categories or groups with no meaningful order (e.g., "How satisfied are you with your instructor?" with response options: Dissatisfied, Neutral, Satisfied). Numerical: When the data is measured on a scale with meaningful differences between points (e.g., a 7-point scale measuring anxiety with responses normally distributed, or when calculating the sum of scores from multiple items in a questionnaire).
131
9. What are the advantages and limitations of using categorical versus numerical approaches to analyze Likert scale data?
Answer: Categorical Approach (Mode or Frequency Analysis): This method is useful for understanding which response option is most common and provides clear insights when interpreting responses to individual items. However, it loses precision and doesn’t capture subtle differences. Numerical Approach (Mean and Standard Deviation): This method provides more detailed and quantitative information about the central tendency and spread of the data, especially useful when comparing multiple items. However, it may not always be interpretable, especially when dealing with skewed or non-normally distributed data.
132
10. In your student discovery exercise, when did the mean seem most useful, and when did frequencies tell a better story?
Answer: Mean: The mean is most useful when comparing multiple items, tracking changes over time, or when a more sophisticated statistical analysis is needed. Frequencies: Frequencies (or mode) are more useful when describing a single item or when the data is skewed. Frequencies give clear insights into the most common response, making them easier to communicate.
133
11. What information might be lost when using categorical versus numerical approaches in the student satisfaction study?
Answer: Categorical (Mode) Approach: You lose precision, as it doesn't detect subtle differences between respondents' experiences or show how many people chose each option. Numerical (Mean/SD) Approach: You lose the ability to identify patterns in responses, such as the most frequent score, and the mean can be less interpretable (e.g., a mean of 3.67 is less intuitive than “between neutral and satisfied”).
134
1. Why are central tendency and variability important in psychological research?
Answer: Central tendency helps to identify the typical or average response in a dataset, making it easier to summarize data and draw conclusions. Variability shows the spread or diversity of responses, helping researchers understand how consistent or varied results are. It is crucial for assessing the reliability and generalizability of findings.
135
2. What does central tendency allow researchers to do in a clinical context?
Answer: Central tendency allows researchers to summarize participant responses with a single numerical value (mean, median, or mode). For example, "On average, patients' anxiety decreased by 30 points" is a clearer and more meaningful way to communicate changes in anxiety levels compared to listing all individual scores.
136
3. What is the difference between variability in two groups with the same mean?
Answer: Even if two groups have the same mean, their variability can be very different: Group A: Small variability, with responses clustered closely around the mean (e.g., 50, 51, 49, 50, 50). Group B: High variability, with responses spread out across a wide range (e.g., 20, 40, 50, 60, 80). This shows that the mean alone does not tell the full story, as variability indicates how consistent or diverse the responses are within each group.
137
4. Why is variability important in decision-making and research quality?
Answer: Treatment Decisions: High variability suggests that treatment outcomes differ widely among individuals, requiring further investigation into why some people respond better than others. Low variability allows for more predictable outcomes. Research Quality: Variability helps identify outliers and assess the reliability of the results. High variability in a small sample size might indicate that the results are unreliable. Sample Size: Variability helps determine whether a larger sample size is needed. A small sample size often leads to more extreme scores, which can increase variability.
138
5. What does the variance of a dataset tell us?
Answer: Variance quantifies the average of the squared differences between each data point and the mean. It indicates the degree of spread in the data: Low variance means the data points are close to the mean, indicating consistency. High variance means the data points are spread out, indicating diversity in responses.
139
6. How do you calculate variance, and what does it represent in a dataset?
Answer: Calculate each score’s deviation from the mean. Square these deviations. Find the average of the squared deviations (variance). Variance represents the degree of dispersion or spread in the data and tells us how much each score deviates from the mean.
140
7. How is the standard deviation related to variance, and why is it preferred for interpretation?
Answer: The standard deviation is the square root of the variance, and it provides a measure of spread in the same units as the original data. It is generally preferred for interpretation because it is more intuitive, as it describes the average deviation of scores from the mean. It is easier to understand than variance, which is in squared units and may be less relatable.
141
8. How do you calculate standard deviation and compare it across groups?
Answer: Calculate the variance for each group (following the steps of calculating deviations, squaring them, and averaging the squared deviations). Take the square root of the variance to find the standard deviation. Compare the standard deviations of different groups to understand the level of variability in each group. For example: Group A: Low standard deviation indicates consistent performance. Group B: High standard deviation indicates diverse performance levels.
142
9. In a scenario where two groups have the same mean but different variability, how would you interpret the data?
Answer: Even though the two groups have the same mean, their variability can offer important insights: Group A with low variability suggests that the participants have similar responses, and the treatment or condition is consistent across the group. Group B with high variability indicates that the responses vary widely, meaning that the treatment or condition affects people in diverse ways. This difference in variability is important for decision-making and understanding the reliability of the findings.
143
10. What are the key differences between variance and standard deviation in practical research?
Answer: Variance is a measure of spread but is in squared units, making it less intuitive to interpret. Standard deviation is the square root of variance and is in the same units as the original data, making it easier to interpret and more practical for decision-making and communicating research findings.
144
11. Why is it essential to understand both central tendency and variability in psychological research?
Answer: Central tendency gives you an average or typical response, but it doesn't reveal the range or diversity of individual responses. Variability tells you how much responses differ from the mean, providing context for how reliable or predictable the results are. Together, these two measures allow researchers to understand both the typical outcomes and the diversity of responses, helping them make more informed decisions and draw valid conclusions.
145
1. What is the purpose of using frequency distribution graphs in research?
Answer: Frequency distribution graphs are used to: Make sense of the data by visualizing relationships between scores and their frequencies. Identify trends in the data, such as common or extreme values (outliers). Display the distribution of data, allowing for easy comparison between different values or variables. Show how data changes or behaves across different categories or over time.
146
2. What are the three types of frequency distribution graphs?
Answer: The three types of frequency distribution graphs are: Bar Graphs: Used for categorical data (nominal and ordinal scales), where bars represent the frequency of each category. The bars do not touch, as categories are distinct. Histograms: Used for numerical data (interval and ratio scales), where bars represent frequency within specific ranges or intervals. The bars touch each other, indicating continuous data. Frequency Polygons: Also used for numerical data (interval and ratio scales), where points are plotted above each score and connected with lines to show the shape of the distribution.
147
3. What are the key features of a bar graph?
Answer: Bar graphs are used for categorical data (nominal and ordinal scales). Each bar represents the frequency of a category, and the bars do not touch each other. The x-axis represents the categories (independent variable), and the y-axis represents the frequencies (dependent variable). The height of each bar shows the frequency of that category.
148
4. How are histograms different from bar graphs?
Answer: Histograms are used for numerical data (interval and ratio scales), while bar graphs are used for categorical data. In histograms, the bars are adjacent to each other, indicating that the data is continuous (there are no gaps between the intervals). The width of the bars in histograms can represent class intervals, and the bars extend to the real limits of the category.
149
5. What is a frequency polygon, and how is it used?
Answer: A frequency polygon is a line graph used to represent numerical data (interval and ratio scales). It is created by plotting a dot above each score in the dataset and connecting the dots with a line. It is used to show how values change over time or to compare multiple sets of data. A frequency polygon provides a clear view of trends and distributions.
150
6. What is the difference between continuous and discrete variables in the context of frequency distribution graphs?
Answer: Continuous variables (e.g., time, distance) are measured on interval or ratio scales and can take any value within a range. In histograms, bars for continuous variables extend to the real limits of each category. Discrete variables (e.g., number of children, errors on a test) are measured on nominal or ordinal scales and can only take specific, indivisible values. In histograms, bars for discrete variables extend only halfway to the adjacent category, showing the indivisible nature of the data.
151
7. How do you read data from a frequency distribution graph?
Answer: The x-axis represents the variable being measured (independent variable), which could be either categorical (in bar graphs) or numerical (in histograms or frequency polygons). The y-axis represents the frequency or count of the data (dependent variable), showing how many times a particular value or category occurs. By examining the height of the bars (in histograms or bar graphs) or the dots/line (in frequency polygons), you can identify trends, outliers, and the overall distribution of the data.
152
8. Why is it important to present data visually in research and everyday life?
Answer: In research, presenting data visually helps: Make sense of the data, revealing trends, outliers, and relationships that might not be immediately clear from raw numbers. Compare different values of the same variable or different variables more effectively. Observe changes over time or across categories. In everyday life, visual data presentations: Clarify numerical information for easier understanding and communication. Allow for comparison between different sets of data or variables, helping to identify patterns and differences.
153
9. What are the components of the x-axis and y-axis in frequency distribution graphs?
Answer: x-axis (horizontal): Represents the measurement scale or variable being analyzed. It could be a categorical variable (in bar graphs) or numerical (in histograms and frequency polygons). y-axis (vertical): Represents the frequency or count of the data. It shows how many times a particular score or category occurs.
154
10. How would you present your own data using a frequency distribution graph?
Answer: First, decide the type of data you have (categorical or numerical). If categorical, use a bar graph. If numerical, use a histogram for continuous data or a frequency polygon for displaying trends and comparing multiple datasets. On the x-axis, label your variable categories or numerical intervals, and on the y-axis, plot the frequency or count of each category or interval. Ensure that the bars in a histogram touch if the data is continuous, or are spaced if it is discrete.
155
11. What does it mean when a bar graph has no gaps between bars?
Answer: If a bar graph has no gaps between the bars, it indicates that the data is continuous. This typically applies to numerical data measured on interval or ratio scales (e.g., time, temperature), where each category is connected to the next without distinct breaks.
156
1. What is bivariate data?
Answer: Bivariate data refers to measurements or observations on two variables, x and y. Each observation consists of a pair of numbers: the first number represents the value of x, and the second number represents the value of y.
157
2. What is a scatterplot?
Answer: A scatterplot is a graphical representation of bivariate numerical data, where each observation (pair of values) is represented by a point on a rectangular coordinate system. The horizontal axis represents the values of x, and the vertical axis represents the values of y. Each point corresponds to the intersection of its x value (horizontal) and its y value (vertical).
158
3. How is a point on a scatterplot determined?
Answer: A point on a scatterplot is determined by its x and y values. The x value is plotted along the horizontal axis, and the y value along the vertical axis. The point corresponding to the pair (x, y) is where the vertical line through the x value meets the horizontal line through the y value.
159
4. What is the difference between positive and negative correlation?
Answer: Positive correlation: As one variable increases, the other variable also increases. The points on the scatterplot tend to slope upwards from left to right. Negative correlation: As one variable increases, the other variable decreases. The points on the scatterplot tend to slope downwards from left to right.
160
5. How can you interpret the relationship between two variables using a scatterplot?
Answer: To interpret the relationship between two variables on a scatterplot, observe the direction of the points: If the points show an upward trend (from left to right), it indicates a positive correlation. If the points show a downward trend (from left to right), it indicates a negative correlation. If the points are scattered with no discernible trend, there may be no correlation between the variables.
161
6. What is the significance of the scatterplot in analyzing bivariate data?
Answer: The scatterplot allows us to visually assess the relationship between two variables. It helps identify patterns, correlations (positive, negative, or none), and outliers. It provides an intuitive way to understand the data and is a useful tool for preliminary analysis before further statistical testing.
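A minimal sketch of quantifying such a relationship with Pearson's r (made-up x and y values; statistics.correlation requires Python 3.10+):

import statistics

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

r = statistics.correlation(x, y)  # Pearson's r
print(round(r, 2))  # about 0.85 -> the points trend upward, a positive correlation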
162
1. What is the purpose of a stem-and-leaf plot?
The purpose of a stem-and-leaf plot is to visually display the frequency with which certain classes of values occur in a dataset. It is a method that helps organize data in a way that shows the distribution of values and allows for easy identification of patterns or outliers.
163
2. What are the components of a stem-and-leaf plot?
Answer: A stem-and-leaf plot divides each data point into two parts: The stem, which consists of the first digit(s) of the value. The leaf, which consists of the last digit of the value.
164
3. How do you read data presented in a stem-and-leaf plot?
To read a stem-and-leaf plot, look at the stem (the first digit(s)) and the leaf (the last digit) together as a whole number. For example, in the stem “5” with leaves “2, 4, 6,” the values represented are 52, 54, and 56. The plot shows the frequency of different values within each group (stem).
165
4. What are some disadvantages of using stem-and-leaf plots?
Answer: Some disadvantages of stem-and-leaf plots include: They are not ideal for presenting large datasets, as they can become cluttered. When there are too many leaves for each stem, the plot may become difficult to interpret. If there are too many observations in the data set, the plot may not fit neatly in the table, affecting clarity and presentation.
166
5. How can you present your own data using a stem-and-leaf plot?
Answer: To present your own data using a stem-and-leaf plot: Organize the data in ascending order. Separate each number into a stem (the first part of the number) and a leaf (the last part of the number). Write down the stems in a vertical column and list the corresponding leaves next to each stem. Ensure that the leaves are ordered numerically for clarity.
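A minimal Python sketch of those steps for two-digit scores (hypothetical data):

from collections import defaultdict

scores = [61, 52, 70, 54, 63, 78, 56, 63, 71]  # hypothetical data

plot = defaultdict(list)
for s in sorted(scores):        # step 1: ascending order
    stem, leaf = divmod(s, 10)  # step 2: split into stem and leaf
    plot[stem].append(leaf)

for stem in sorted(plot):       # steps 3-4: stems down the side, ordered leaves beside them
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
# 5 | 2 4 6
# 6 | 1 3 3
# 7 | 0 1 8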
167
6. When would you choose a stem-and-leaf plot over a histogram or other display?
Answer: You might choose a stem-and-leaf plot when you want to: Preserve the actual data values, as stem-and-leaf plots show each individual data point. Provide a quick visual representation of the data’s distribution. Compare the shape of the data distribution without losing too much detail. However, for larger datasets, other methods like histograms may be more appropriate.
168
1. What are the four types of data, and how are they different?
Answer: Nominal Data: Categories without a logical order (e.g., gender, colors). Ordinal Data: Categories with a logical order but no fixed intervals (e.g., survey ratings). Interval Data: Numerical data with equal intervals but no true zero (e.g., temperature). Ratio Data: Numerical data with a meaningful zero point (e.g., weight, height).
169
2. What is the difference between a bar graph and a histogram?
Answer: Bar Graph: Used for categorical data, where the bars do not touch, emphasizing the discrete nature of the categories. Histogram: Used for numerical (continuous) data, showing frequency distributions where the bars touch, indicating the continuous nature of the data.
170
3. How do you create a frequency table in Excel for categorical data?
Answer: Ensure your data is in one column and categorical. In a separate column, list the categories of interest. Use the COUNTIF function to count how many participants picked each category. For example, in cell D2, use: =COUNTIF(B:B, "Psychology") Repeat the COUNTIF function for other categories (e.g., "Math", "Biology", "History"). To create a bar graph, select the frequency table, go to Insert -> Chart, and insert a bar (column) chart of the frequencies.
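For comparison, a rough Python equivalent of the COUNTIF step using collections.Counter (the column values are made up):

from collections import Counter

majors = ["Psychology", "Math", "Psychology", "Biology", "History", "Psychology"]  # hypothetical column

freq = Counter(majors)
print(freq["Psychology"])  # 3, like =COUNTIF(B:B, "Psychology")
print(freq)                # the full frequency table in one pass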
171
4. What is the function used to find out how many participants participated in a study in Excel?
Answer: Use the COUNT function to count cells with numerical values (e.g., =COUNT(A:A)) or COUNTA to count all non-empty cells, including those with text, numbers, or other data. To exclude the column title, subtract 1 from the total count (e.g., 151 participants – 1 = 150 participants).
172
5. What are the differences between the COUNT and COUNTA functions in Excel?
Answer: COUNT: Counts only cells with numerical values, ignoring text and blanks. COUNTA: Counts all non-empty cells, including those with text, numbers, or other data.
173
6. How do you calculate the range, variance, and standard deviation in Excel?
Answer: Range: =MAX(data) - MIN(data) Variance: Use =VAR.S(data) for sample variance. Standard Deviation: Use =STDEV.S(data) for sample standard deviation.
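Rough Python equivalents of these formulas, as a sketch with made-up data (statistics.variance and statistics.stdev use the sample, n - 1, forms, like VAR.S and STDEV.S):

import statistics

data = [12, 15, 11, 19, 14]  # hypothetical data

data_range = max(data) - min(data)    # like =MAX(data) - MIN(data)
variance = statistics.variance(data)  # sample variance, like =VAR.S(data)
sd = statistics.stdev(data)           # sample standard deviation, like =STDEV.S(data)

print(data_range, variance, sd)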
174
7. How do you create a histogram for the "Age" variable in Excel?
Answer: Select the data for the variable (e.g., age) by clicking the first cell and using the keyboard shortcut Ctrl + Shift + Down Arrow (Windows) or Command + Shift + Down Arrow (Mac) to select the entire range. Go to Insert -> Recommended Charts, and select "Histogram." Format the histogram by clicking on it to bring up the "Chart Tools" options in the ribbon.
175
8. What is the purpose of a stem-and-leaf plot, and how does it differ from a bar graph or histogram?
Answer: A stem-and-leaf plot is used to display the distribution of numerical data while preserving individual data points. Unlike a bar graph or histogram, which group data into categories or intervals, a stem-and-leaf plot keeps the exact data values but organizes them in a way that makes patterns and frequencies easy to observe. It is ideal for smaller datasets.
176
1. What are the three measures of central tendency?
Answer: Mean: The average score in a dataset, calculated by summing all values and dividing by the number of observations. Median: The midpoint value in a dataset when arranged in order, less affected by outliers. Mode: The most frequent score in the dataset, representing the most common value.
177
2. How do you calculate the percentile rank of a value in Excel?
Answer: To calculate the percentile rank in Excel, use the PERCENTRANK function: =PERCENTRANK(array, x, [significance]) Where: array is the range of cells containing the data. x is the specific value you want to find the percentile rank for. [significance] is optional and defines the number of decimal places for the result. To express the result as a percentage, multiply by 100: =PERCENTRANK(A2:A20, A5)*100.
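A rough Python analogue using scipy (made-up scores; Excel's PERCENTRANK and scipy's percentileofscore follow slightly different conventions, so treat this as an approximation):

from scipy.stats import percentileofscore

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95]  # hypothetical data

print(percentileofscore(scores, 75))  # 50.0 -> 75 sits at about the 50th percentile here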
178
3. What is the Interquartile Range (IQR) and how is it interpreted?
Answer: The Interquartile Range (IQR) is the difference between the third quartile (Q3) and the first quartile (Q1) in a dataset. It represents the middle 50% of the data and is less sensitive to outliers. The IQR is useful for understanding the spread and variability of the central portion of the data.
179
4. What is the purpose of a box plot, and what information does it convey?
Answer: A box plot provides a visual summary of a dataset, showing the median, quartiles (Q1, Q3), and potential outliers. It helps to: Identify the symmetry or skewness of the data. Show the spread of the middle 50% of the data (IQR). Indicate the whisker length, which represents the range of the data, excluding outliers.
180
5. How can box plots help in comparing multiple data sets?
Answer: Box plots can be used in the following ways to compare multiple data sets: Side-by-Side: Place box plots next to each other for easy comparison of distributions, medians, and spreads. Stacked: Stack box plots vertically to visualize relative positions and differences. Overlaid: Overlay box plots on the same plot to highlight similarities and differences in data distributions.
181
6. What defines an outlier in a box plot, and how is it identified?
Answer: An outlier in a box plot is a data point that falls outside the whiskers, typically more than 1.5 times the IQR. Box plots make it easy to spot outliers, which are often plotted as individual points beyond the whiskers. Outliers may indicate extreme values, unusual observations, or errors in the data.
182
7. Why are outliers important, and how can they affect statistical analysis?
Answer: Outliers are important because they can: Influence the mean and standard deviation, making these measures unreliable. Cause models to overfit or perform poorly, leading to inaccurate predictions. Indicate errors or anomalies in data that require further investigation or correction. They can impact the overall analysis and should be carefully analyzed.
183
8. What are some common causes of outliers in datasets?
Answer: Measurement Errors: Mistakes during data collection, equipment malfunction, or faulty sensors. Data Entry Errors: Typographical mistakes, incorrect data formatting, or accidental duplication. Unusual Events: Extreme weather events, natural disasters, or other unexpected occurrences that generate outlier data.
184
9. How can outliers be dealt with in data analysis?
Answer: Removal: Deleting outliers if they are considered to be errors or irrelevant. Imputation: Replacing outliers with more representative values, such as the mean, median, or based on a regression model. Transformation: Applying mathematical transformations (e.g., logarithmic or square root) to reduce the impact of outliers.
185
10. What are the ethical considerations when handling outliers?
Answer: Clinical Responsibility: Outliers may represent important clinical signals or individuals who need immediate help; removing data can discard valuable information. Research Integrity: Always document decisions regarding outlier handling, report results both with and without outliers, and be transparent about the choices made. Balance: Balance statistical rigor with the clinical or real-world implications of the data.
186
11. What are the key steps in assessing and treating outliers in your data?
Answer: Assess the Impact: Ask if the outlier is physically or clinically possible and whether it might indicate an important signal. Visual Inspection: Use box plots, histograms, or scatter plots to identify potential outliers. Statistical Methods: Use methods like z-scores or IQR to identify outliers based on how far they deviate from the central tendency.
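A minimal sketch of the 1.5 x IQR rule mentioned above (made-up data; quartile conventions vary slightly between tools):

import statistics

data = [12, 13, 14, 15, 15, 16, 17, 18, 45]  # 45 looks suspicious

q1, q2, q3 = statistics.quantiles(data, n=4)  # quartiles
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = [x for x in data if x < low or x > high]
print(outliers)  # [45]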
187
12. What are some practical applications for percentile ranks in real-world data?
Answer: Percentile ranks are used in various fields, including: Standardized Test Scores: NAPLAN, ATAR, etc. Clinical Assessments: IQ, DAS, etc. Job Performance Rankings: Ranking employees based on performance. Medical Assessments: Tracking health metrics like growth or BMI. Growth Monitoring: Assessing children’s development over time.
188
Question: What are three ways to compare data sets using box plots, and when might each be useful?
Answer: Side-by-Side: Useful for comparing the distribution, median, and spread of multiple data sets simultaneously. Stacked: Useful for visualizing relative positions and differences in distributions, especially when dealing with a larger number of groups. Overlaid: Useful for highlighting similarities and subtle differences in data distributions.
189
Question: How do box plots differ from bar graphs, and when is each more appropriate to use?
Answer: Bar Graphs show means or total counts and are better for categorical or discrete data, offering simplicity for general audiences. However, they do not show data spread or outliers. Box Plots show median, quartiles, range, and outliers, making them ideal for examining data distribution, spread, and skewness, though they can be more complex for general audiences.
190
Question: How are outliers defined and identified in box plots?
Answer: Outliers are data points that fall outside the whiskers of the box plot, typically more than 1.5 times the interquartile range (IQR) away from the quartiles. They appear as individual points beyond the whiskers, making them easily identifiable.
191
Question: Why might a data point that deviates from normality not necessarily be an outlier?
Answer: If data come from multiple distributions, points that deviate from one distribution might simply belong to another. In cases of bimodal or multimodal distributions, these deviations could represent centers of valid data clusters, not outliers.
192
Question: What are some common causes of outliers in data sets?
Answer: Common causes include measurement errors, data entry errors, and unusual events like natural disasters. Each of these can introduce values that fall far outside the typical range of data.
193
Question: Describe three strategies for treating outliers and when each might be appropriate.
Answer: Removal: Appropriate when outliers are clear errors or irrelevant to the analysis. Imputation: Replacing outliers with representative values, such as the mean or median, useful when preserving overall data structure. Transformation: Applying transformations (e.g., logarithmic) to reduce outlier impact, useful in situations where outliers might skew analysis.
194
Question: What ethical considerations should guide outlier handling in clinical and research data?
Answer: Ethical considerations include clinical responsibility (outliers may represent important cases needing attention), transparency in decision-making (documenting and justifying all actions taken), and balancing statistical accuracy with clinical relevance. Outliers should not be removed without considering their potential impact on conclusions and human context.
195
Question: Summarize the key ethical responsibilities when cleaning data and handling outliers.
Answer: Researchers should balance statistical rigor with clinical or real-world implications, ensuring that data cleaning decisions preserve the integrity of findings. They must document their processes, consider how outliers affect results, and remember the human aspect of the data, especially in fields impacting health and well-being.
196
Question: Why is it important to view data points as representing real people in statistical analysis?
Answer: Viewing data as representations of real people ensures that data handling decisions respect the individual and clinical significance of each data point, preventing impersonal or overly mechanical decisions that might overlook important health or behavioral insights.
197
Q1: What is a sample in the context of sampling theory?
A1: A sample is a subset of the population that is actually measured. It is finite and concrete and is used to make inferences about the broader population.
198
Q2: What do we call summary properties of a sample, and what notation is commonly used?
A2: Summary properties of a sample are called statistics, and they are usually denoted with Latin letters. For example, the sample mean is represented by M (or x̄) and the sample standard deviation by s.
199
Q3: What is a population, and what type of values summarize it?
A3: A population includes all items of interest. Summary properties of a population are called parameters and are often represented with Greek letters, like μ for the population mean and σ for population standard deviation.
200
Q4: List three key differences between a sample and a population.
A4: A sample is finite, concrete, and incomplete. A population is abstract, complete, and includes all individuals or entities of interest. A sample is used to make inferences about a population.
201
Q5: What is the purpose of inferential statistics in relation to populations?
A5: Inferential statistics aim to make inferences about population parameters based on sample data.
202
Q6: Describe random sampling without replacement.
A6: In random sampling without replacement, once an item is selected, it does not go back into the sampling pool. Each selection is unique, and it ensures a more accurate representation of the population.
203
Q7: How does biased sampling affect the results?
A7: Biased sampling skews results because it only includes specific characteristics of the population. For example, if only one color is used in a sample, it does not represent the full diversity of the population.
204
Q8: What is the difference between sampling with and without replacement?
A8: Sampling with replacement allows an item to be selected multiple times, as it is returned to the sample pool after each selection. Sampling without replacement does not allow re-selection, ensuring each item is chosen only once.
205
Q9: What is the Central Limit Theorem (CLT)?
A9: The Central Limit Theorem states that the distribution of sample means approaches a normal distribution as the sample size increases, regardless of the shape of the population distribution. It also states that: The mean of the sample means is equal to the population mean. The standard deviation of the sample means (standard error) decreases as the sample size increases.
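A small simulation sketch of the CLT using a skewed (exponential) population whose mean is 1 (illustrative only; exact numbers vary by run):

import random
import statistics

random.seed(1)

def sample_mean(n):
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

for n in (1, 4, 25, 100):
    means = [sample_mean(n) for _ in range(2000)]
    print(n, round(statistics.mean(means), 3), round(statistics.stdev(means), 3))

# The mean of the sample means stays near the population mean (1.0),
# and their spread (the standard error) shrinks as n grows.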
206
Q10: What is sampling error?
A10: Sampling error is the discrepancy between a sample statistic and the corresponding population parameter. It reflects the variation that occurs because the sample is only a subset of the population.
207
Q11: How does sample size affect sampling error and standard error?
A11: As sample size increases: Sampling error tends to decrease, as larger samples provide more accurate estimates of the population. Standard error (the standard deviation of the sampling distribution) also decreases, meaning the sample means are closer to the true population mean.
208
Q12: Explain the Law of Large Numbers in terms of sampling.
A12: The Law of Large Numbers states that as the sample size increases, the sample mean M tends to get closer to the population mean μ. Larger samples generally provide more reliable and accurate information about the population.
209
Q13: In an IQ study, if the population mean is 100, how would the sample mean change with different sample sizes (e.g., N=1, N=4, N=10)?
A13: Smaller samples (e.g., N=1 or N=4) are likely to have sample means that differ significantly from the population mean of 100 due to higher sampling error. Larger samples (e.g., N=10 or more) will generally have sample means closer to the population mean of 100.
210
Q14: What is the standard error, and how is it calculated?
A14: The standard error (SE or SEM) is the standard deviation of the sampling distribution of the sample mean. It measures the accuracy of the sample mean as an estimate of the population mean and is calculated as SEM = s / √N, where s is the standard deviation of the original distribution and N is the sample size.
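A minimal sketch of that formula in Python (hypothetical scores):

import math
import statistics

data = [98, 102, 105, 95, 100, 110, 97, 101]  # hypothetical data

s = statistics.stdev(data)      # sample standard deviation
sem = s / math.sqrt(len(data))  # SEM = s / sqrt(N)
print(round(sem, 2))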
211
Q15: What happens to the standard error as the sample size increases?
A15: As the sample size N increases, the standard error decreases, meaning the sample mean is a more accurate estimate of the population mean.
212
Q16: What three methods can reduce the standard error in a sample?
A16: To reduce the standard error, one can: Increase the sample size. Use random sampling methods. Use reliable and precise measurements.
213
Q17: Why can’t sampling error be completely eliminated in research?
A17: Sampling error can’t be eliminated entirely because samples are typically incomplete representations of populations. There will always be some discrepancy between sample statistics and population parameters due to the inherent variability between samples.
214
Q18: Differentiate between standard error and sampling error.
A18: Standard error is a measure of how much variation is expected between different sample means and the population mean. It represents the accuracy of an individual sample as an estimate of the population mean. Sampling error is the overall discrepancy between a sample statistic and the corresponding population parameter, reflecting how well the sample represents the population.
215
What is hypothesis testing, and what is its purpose?
Hypothesis testing is a statistical method that uses sample data to draw conclusions about a larger population. Its purpose is to evaluate evidence and determine the validity of a claim.
216
What are the three main aspects of hypothesis testing?
Data-Driven Decision Making: Using sample data to infer conclusions about a population. Statistical Inference: Making informed judgments about population parameters with limited information. Evidence-Based Conclusions: Providing a framework for evaluating evidence and assessing claims.
217
What does psychological research aim to achieve?
Draw conclusions about the mind, brain, and behavior for a population of interest. Demonstrate that an independent variable influences a dependent variable.
218
What is the null hypothesis (H0)?
The null hypothesis states that there is no effect or no difference between groups being compared. It assumes the independent variable does not influence the dependent variable.
219
What is the alternative hypothesis (H1 or Ha)?
The alternative hypothesis states that there is an effect or difference between the groups being compared. It represents the researcher’s claim and can only be supported by rejecting H0.
220
What analogy is used to explain null hypothesis testing?
Court Trial Analogy: Court Presumption: "Innocent until proven guilty." Statistics Presumption: "Null hypothesis is true until proven otherwise." The jury decides based on evidence, just as statistical decision rules determine whether to reject H0.
221
What are the two types of errors in hypothesis testing?
Type I Error (False Positive): Rejecting H0 when it is true (e.g., convicting an innocent person). Type II Error (False Negative): Failing to reject H0 when it is false (e.g., failing to convict a guilty person).
222
What is a decision rule in hypothesis testing?
A criterion that determines when there is sufficient evidence to reject H0. Commonly, if p < .05, H0 is rejected.
223
What does a p-value represent?
The probability of obtaining the observed results if H0 is true. A lower p-value indicates stronger evidence against H0.
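As a hedged sketch, a one-sample t-test in scipy, testing H0 that the population mean is 100 (made-up scores):

from scipy import stats

scores = [104, 99, 110, 102, 98, 107, 103, 101]  # hypothetical data

t, p = stats.ttest_1samp(scores, popmean=100)
print(t, p)  # under the usual decision rule, reject H0 if p < .05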
224
What is the difference between one-tailed and two-tailed tests?
One-Tailed Test: Sensitive to a difference in one direction; more statistical power but limited in scope. Two-Tailed Test: Sensitive to differences in either direction; more generalizable.
225
What is a confidence interval (CI), and what does it measure?
A range of values likely containing the true population parameter. It measures the precision and uncertainty of sample estimates.
226
How are confidence intervals affected by sample size and confidence level?
Sample Size: Larger sample sizes result in narrower CIs. Confidence Level: Higher confidence levels (e.g., 99%) result in wider CIs compared to lower levels (e.g., 95%).
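A minimal sketch of a 95% CI for a mean using the t distribution (made-up data; raising the confidence level to 99% would widen the interval):

import math
import statistics
from scipy import stats

data = [5.1, 4.8, 5.5, 5.0, 4.9, 5.3, 5.2, 4.7]  # hypothetical data
n = len(data)

m = statistics.mean(data)
sem = statistics.stdev(data) / math.sqrt(n)
t_crit = stats.t.ppf(0.975, df=n - 1)  # two-tailed 95% critical value

print(m - t_crit * sem, m + t_crit * sem)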
227
What does effect size indicate, and how does it relate to p-values?
Effect size measures the magnitude of a difference or relationship. While p-values indicate significance, effect size reflects the practical importance of the result.
228
What are common effect size measures in psychology?
Cohen’s d (for comparing groups): small effect, d = 0.2; medium effect, d = 0.5; large effect, d = 0.8. Correlation coefficient r (for relationships): small effect, r = 0.1; medium effect, r = 0.3; large effect, r = 0.5.
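A minimal sketch of Cohen's d with a pooled standard deviation (hypothetical group data):

import math
import statistics

group_a = [10, 12, 11, 13, 12]  # hypothetical data
group_b = [14, 15, 13, 16, 15]

sd_a, sd_b = statistics.stdev(group_a), statistics.stdev(group_b)
n_a, n_b = len(group_a), len(group_b)

pooled_sd = math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2))
d = (statistics.mean(group_b) - statistics.mean(group_a)) / pooled_sd
print(round(d, 2))  # compare against the 0.2 / 0.5 / 0.8 benchmarks above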
229
What happens if confidence intervals for two groups overlap?
If the CIs for two groups overlap substantially, the difference between them is unlikely to be statistically significant. (Slight overlap can still accompany a significant difference, so overlap is a rough guide rather than an exact test.)
230
What are the primary purposes of confidence intervals?
Provide a range of plausible values for a population parameter. Measure the precision and reliability of findings while acknowledging uncertainty.
231
How do confidence intervals differ from p-values?
p-Values: Indicate the likelihood of results under H0. Confidence Intervals: Offer a range of plausible values for the population parameter, giving more information about effect size and precision.
232