Statistical Workflow Flashcards

Question 1

Q

Problems with p-values

Answer

A

A p-value is the probability of obtaining test results at least as extreme as the results actually observed, under the assumption that the null hypothesis is correct.
University Graduates IQ vs. Primary School Students IQ
University Graduates IQ vs. Secondary School Students IQ
Depression in left-handed people vs. depression in right-handed people

Question 2

Q

• But less probable results are still possible!

Answer

A

You could roll a 20 on a 20-sided die
You could toss a coin 10 times and get 10 tails
You could measure significantly more depression in left-handed than in right-handed people
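The first two examples can be put into numbers. A minimal sketch in base R, assuming a fair die and a fair coin (the variable names are illustrative):

```r
# Probability of rolling a 20 on a fair 20-sided die
p_roll_20 <- 1 / 20

# Probability of 10 tails in 10 tosses of a fair coin (binomial probability)
p_ten_tails <- dbinom(10, size = 10, prob = 0.5)

print(p_roll_20)    # 0.05
print(p_ten_tails)  # 1/1024, about 0.001
```

Both outcomes are improbable but will still turn up now and then. In the same way, a result with p < .05 can occur by chance even when the null hypothesis is true.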

Question 3

Q

• Publication bias

Answer

A

Less probable results are often more interesting than probable results
Unusual findings might open new areas of investigation
Challenges to existing theory
Significant results are more interesting/respected than non-significant results
File drawer problem
Positive-results bias, a type of publication bias, occurs when authors are more likely to submit, or editors are more likely to accept, positive results than negative or inconclusive results.
Many non-significant studies never get published, so the literature does not necessarily show a balanced picture

Question 4

Q

Bad science:

Answer

A

A researcher could keep running the same experiment until they get a significant result (see bonus video)
A researcher could measure so many things that some might be significant by chance
fMRI voxels
EEG electrodes
Lots of conditions in an experiment
A researcher could conduct different types of analyses on the same data
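The "measure so many things" problem can be made concrete. A minimal sketch in base R, assuming the tests are independent and each uses a significance level of .05 (the numbers of tests are illustrative; real voxels or electrodes are correlated, so this is only an approximation). The chance of at least one false positive across k tests is 1 - (1 - alpha)^k.

```r
alpha <- 0.05
k <- c(1, 5, 20, 100)  # illustrative numbers of independent tests

# Chance of at least one false positive when every null hypothesis is true
familywise_error <- 1 - (1 - alpha)^k

print(data.frame(tests = k, p_at_least_one = round(familywise_error, 3)))
```

With 20 independent tests, the chance of at least one spurious "significant" result is already about 64%.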

Question 5

Q

Pre-Registration

Answer

A

• You decide the key details of your study (hypotheses, design, analyses, sample size) before running it, and register them with a purpose-built tool

Benefits
- Scientific
• Cannot fiddle with data or hypotheses once data have been collected
• You will be spotted if you keep running the same study
• Journals can accept a publication based on the pre-registration before the data are collected, avoiding the file drawer problem

- Organisational
• You know exactly which analysis to run; data collection might take months or years, so it's easy to forget
• You really understand all elements of your study and can make sure it is the best it can be
• Read Andrews & Justice ‘Replication crisis’ chapter in Essential Psychology textbook for more info.

Question 6

Q

Other methods

Answer

A

- Grant-funded research
• Studies are evaluated and reviewed by experts at the grant proposal stage
• Many similar elements to pre-registration, such as choosing analyses and sample sizes
• Higher budgets: more reliable results, as there are more participants or more time to spend developing measures
- Multiple-experiment papers
• Get an ‘interesting’ result in Experiment 1, then repeat and develop it in Experiments 2, 3, etc.
• Bonus: often deeper theoretical insight through replication of a slightly adapted Experiment 1
- Publishing data sets

Question 7

Q

Pre-registration steps

Answer

A

Hypothesis/es
DV: what we are measuring and how we will measure it
Conditions: how many, and how participants will be assigned
Which statistical model will you use?
How will we handle outliers, and what exclusion criteria will we use?
Sample size
Any other secondary or exploratory analyses

Question 8

Q

Hypothesis/es

Answer

A

Which statistical model will you use?
Sample size
Other considerations
How will you deal with outliers?
Will you explore other parts of the data without a hypothesis?
Before you even start thinking about data analysis, you need two things:
Clear research questions
Clear statements about how the manipulations in your experiment will affect the measure (hypotheses)
Without these, you won’t know what you are studying or why

Question 9

Q

Determine the appropriate model

Answer

A

Before we even collect our data, we should have a clear idea of what statistical test we are going to run
We should not be deciding this after the data are collected
What if no model is suitable for the data?
What if the data are not in the correct format, or there are not enough conditions?
Can lead to ‘fishing’ around in data to find results

Question 10

Q

Run a sample size estimation

Answer

A

How many people should you test?
You learned about power and sample size a few weeks ago
Need to use sample size estimation to determine how many people we need to detect the effect size we are interested in
With too few people, we might not detect an effect that exists
E.g., We might only have power to detect the largest effects – what if our effect is small?
Waste of resources (time, money, effort)
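A sample size estimation can be run before any data exist. A minimal sketch using power.t.test() from base R's stats package, assuming a design the card does not specify: two independent groups, a two-sided t-test, alpha = .05 and a target power of .80. The effect sizes are illustrative values of Cohen's d.

```r
# Assumed design: two independent groups, two-sided t-test,
# alpha = .05, target power = .80. With sd = 1, delta equals Cohen's d.
n_medium <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)$n
n_small  <- power.t.test(delta = 0.2, sd = 1, sig.level = 0.05, power = 0.80)$n

# Round up: you cannot test a fraction of a participant
print(ceiling(n_medium))  # participants needed per group for d = 0.5
print(ceiling(n_small))   # participants needed per group for d = 0.2
```

Under these assumptions a medium effect (d = 0.5) needs 64 people per group, and a small effect (d = 0.2) needs roughly six times as many. A study sized for the larger effect would be badly underpowered for the smaller one.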

Question 11

Q

Other considerations

Answer

A

Define how incorrect responses and outliers will be determined
What would lead to exclusion of a participant?
E.g., X incorrect responses
What would lead to exclusion of a trial?
E.g., more than 2.5 SD above/below the mean
Extra fast/slow responses
What else might you do with the data? Any exploratory analyses should be flagged in the pre-registration to avoid cherry-picking
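A trial-level exclusion rule like the 2.5 SD example can be written down before data collection. A minimal sketch in base R, using invented reaction times for one hypothetical participant; the variable names and the choice to compute the mean and SD over all of that participant's trials are assumptions, and the pre-registration should state whichever rule is actually used.

```r
# Hypothetical reaction times in ms for one participant (invented for illustration)
rt <- c(512, 498, 530, 475, 505, 520, 490, 515, 500, 485, 510, 495, 2400)

cutoff_sd <- 2.5  # exclusion threshold fixed in advance

# Standardise each trial against the mean and SD of all trials
z <- (rt - mean(rt)) / sd(rt)
is_outlier <- abs(z) > cutoff_sd

rt_clean <- rt[!is_outlier]
print(rt[is_outlier])  # trials that would be excluded
print(length(rt_clean))  # trials that remain
```

Fixing the cutoff in advance means the threshold cannot be adjusted later to make a result significant.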

Question 12

Q

Pre-Registration and Organisation

Answer

A

Many factors make running projects complicated
Researchers usually have many projects running in parallel
Small Projects
Grants
PhD Students
Summer Projects/Volunteer Projects
3rd Year Projects/Masters Projects
Projects run for multiple years (e.g., 3 months to get reviewed by a journal, 3-year PhDs)
Projects are all similar because they share an area of expertise
Multiple people working on each project
Crucial to organise materials, data and analyses so that:
They can be revisited months/years down the line
Different team members can understand the data and analyses
Pre-registration can act as a large part of this organisation
Publishing data can also help keep it organised

Question 13

Q

Overview

Answer

A

Scientific Idea
Pre-register hypotheses, methods and planned analyses
Organise your materials, data and analyses
Run the study as planned, report any changes from pre-registered plans
Write up (another story, see other aspects of the course)
Publish anonymised data where possible

Question 14

Q

Get descriptive stats (RStudio)

Answer

A

describe(data, mean = mean(dataset), stdev = sd(dataset))

Question 15

Q

^^ arrange the descriptive stats

Answer

A

describe(data, by = dataset, mean = mean(dataset), stdev = sd(dataset))

Question 16

Q

^^ by condition

Answer

A

• describe(data = data, mean_dataset = mean(dataset), SD_dataset = sd(dataset), max_dataset = max(dataset), min_dataset = min(dataset), by = Condition)

Question 17

Q

Example (RStudio)

Answer

A

describe(data = tetris, mean_intrusion = mean(Intrusion), SD_Intrusion = sd(Intrusion), max_Intrusion = max(Intrusion), min_Intrusion = min(Intrusion), by = Condition)
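The describe() call above depends on whichever package the course provides. For comparison, a minimal sketch of the same grouped summary in base R, assuming a data frame with an Intrusion score and a Condition label. The data frame tetris_demo and its values are invented for illustration and are not the real tetris data.

```r
# Hypothetical stand-in for the tetris data (values invented for illustration)
tetris_demo <- data.frame(
  Condition = rep(c("Control", "Tetris"), each = 4),
  Intrusion = c(5, 7, 6, 8, 2, 3, 1, 4)
)

# One row per condition: mean, SD, max and min of Intrusion
summary_by_condition <- aggregate(
  Intrusion ~ Condition,
  data = tetris_demo,
  FUN = function(x) c(mean = mean(x), SD = sd(x), max = max(x), min = min(x))
)

# aggregate() stores the statistics as a matrix column; flatten it for printing
summary_by_condition <- do.call(data.frame, summary_by_condition)
print(summary_by_condition)
```

The result has one row per condition, with columns named Intrusion.mean, Intrusion.SD, Intrusion.max and Intrusion.min.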
