Fundamental assumptions of parametric models Flashcards
reminders
- Models are approximations of reality focusing on practically relevant aspects.
- “All models are wrong, but some are useful.” – George Box
– Help us to understand a complex matter.
– Ideally we want to be able to assess limitations to their usefulness.
Why are the model assumptions important?
– Machines need input.
– Perform operations on input.
– Always give some output.
– Parametric statistical models make assumptions about the input they receive.
– Reliability of output depends on the fit of input and assumptions.
What is a parametric model?
A family of probability distributions with a finite number of parameters.
E.g. the normal distribution has two parameters: mean and standard deviation.
The normal distribution underlies the t-test, ANOVA, and linear regression.
Those models make parametric assumptions about their input.
Non-parametric models do not make the same assumptions: e.g. chi-squared (χ²) test, Mann–Whitney U test, Spearman’s rank correlation
What do parametric models assume?
– All parametric models make the same assumptions about their input.
– Normal distribution is at the heart of parametric models
• Interval / continuous data
• Central limit theorem
• Observations must be independent and identically distributed (i.i.d.) for the central limit theorem to apply.
– See also lecture and workshop week 6
– Homogeneity of variance
– Linearity (for continuous predictors in regression models)
What do parametric models assume?
– Linearity
– Independence
– Normality
– Equal variance (aka homogeneity)
density plots
– Relative likelihood of x taking on a certain value.
– The normal distribution is defined by its density function.
– We don’t need to worry about the maths here.
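For reference only (no need to memorise it), the density function of the normal distribution with mean μ and standard deviation σ is:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```

The two parameters μ and σ are exactly the mean and standard deviation mentioned earlier; the exponential term is what produces the symmetric bell shape described below.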
symmetric
– Left and right half are mirror images of each other
– Mean = Mode = Median
Example for non-normal responses
– Psychometric scales are neither continuous nor linear (see the intro of Bürkner & Vuorre, 2019).
Psychometric scale; see Robinson (2018)
– Response categories: limited, discrete options (vs sliders)
– Ordinal: implicit order
– Not equidistant (vs, say, inches)
– See Liddell & Kruschke (2018)
– We will see why the use of linear models (LMs) is not unjustified.
Caveats of normal distributions
– Strictly speaking, nothing is really normally distributed.
– Most variables have an upper and lower bound; e.g., people can’t be faster than 0 secs or shorter than 0 inches.
– All observations are discrete in practice due to limitations of our measuring instruments.
– However, a normal distribution is often suitable for practical considerations.
– So we typically want data to be approximately normally distributed.
Interim summary
– Parametric models assume that the data are normally distributed.
– However, psychologists often obtain non-normally distributed data.
– Why do we bother with the normal distribution?
– We will see in the following that the data don’t need to come from a normal distribution at all.
– The reason is the central limit theorem.
Central limit theorem (CLT)
The sampling distribution of the mean will be approximately normal for large sample sizes, regardless of the (type / shape of the) distribution which we are sampling from.
We can use parametric statistical inference even if we are sampling from a population that is weird (i.e. not normally distributed).
From week 6: mean of sampling distribution is estimate of population mean (μ; Greek mu)
This also works for totals (e.g. IQ scores), SDs, etc.
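A minimal Python sketch of this idea (the demo below uses R): we draw many samples from a heavily skewed population, an exponential distribution with mean 1, and check that the sample means nonetheless pile up symmetrically around the population mean. The choice of distribution and sample sizes here is illustrative, not from the lecture.

```python
import random
import statistics

random.seed(1)

# Population: a heavily skewed (exponential) distribution with mean 1.
# Draw 2000 samples of n = 50 each and record each sample's mean.
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(50))
    for _ in range(2000)
]

# CLT in action: the sampling distribution of the mean centres on the
# population mean (1) and, unlike the population itself, is nearly
# symmetric, so its mean and median almost coincide.
print(round(statistics.mean(sample_means), 2))
print(round(statistics.median(sample_means), 2))
```

Increasing n (the per-sample size) makes the sampling distribution of the mean both narrower and more symmetric, even though individual observations remain strongly skewed.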
Demo of CLT
– CES-D scale: self-report depression (Radloff, 1977)
– 20 items to assess the degree of depression
– 5-point Likert scale: Strongly disagree - Strongly agree
– Item 1: I was bothered by things that usually don’t bother me.
– Item 2: I had a poor appetite.
– Item 3: I did not feel like eating, even though I should have been hungry.
Simulate one participant
ppt_1 <- sample(1:5, size = 20, replace = TRUE)  # one random response per item
Repeat for another participant
(ppt_2 <- sample(1:5, size = 20, replace = TRUE))  # outer parentheses print the result
Calculate means for each participant
mean(ppt_1); mean(ppt_2)
– The distribution of the participants’ means will approach normality as the number of participants increases.
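The same demo can be sketched in Python (the slides use R): simulate many participants, each answering every item at random on the 1–5 scale, and look at the distribution of the per-participant means. The participant count and the uniform-response assumption are illustrative choices, not from the lecture.

```python
import random
import statistics

random.seed(42)

N_ITEMS = 20           # CES-D scale length
N_PARTICIPANTS = 1000  # illustrative; more participants -> smoother shape

# Each participant answers every item at random on a 1-5 scale,
# mimicking ppt_1 / ppt_2 above, and we keep their mean score.
participant_means = [
    statistics.mean(random.randint(1, 5) for _ in range(N_ITEMS))
    for _ in range(N_PARTICIPANTS)
]

# Although single responses are discrete and uniform, the means cluster
# around the expected value (1 + 5) / 2 = 3 in a roughly bell-shaped way.
print(round(statistics.mean(participant_means), 2))
print(round(statistics.stdev(participant_means), 2))
```

Plotting `participant_means` as a histogram or density plot would show the approximately normal shape the slide describes, despite the ordinal, bounded raw responses.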