Introduction to statistical reasoning Flashcards
research process
- initial observation
- generate theory
- generate hypotheses
- collect data to test theory
- analyse data
still need to look at the figure and copy it into my notebook, pen had run out (studentwiki, lecture 2)
independent variable =
a variable that is thought to be the cause of some effect (experimenter manipulates this)
dependent variable =
a variable that is thought to be affected by changes in the independent variable (outcome)
predictor variable =
a variable that is thought to predict an outcome variable (an independent variable is one of the predictor variables)
level of measurement: 2 types
categorical and numerical
categorical =
qualitative: divided into distinct categories
examples of categorical
- binary
- nominal (more than two categories, not in order)
- ordinal (categories in order)
numerical =
quantitative: really only numbers
examples of continuous
- interval (equal intervals represent equal differences)
- ratio (ratios also make sense: a score of 16 on an anxiety scale means twice as much anxiety as a score of 8). The scale must have a meaningful zero point for this!
what level of measurement is a Likert variable
ordinal
so, measurement levels: the first split
categorical (qualitative) & numerical (quantitative)
types of categorical =
nominal (no order) and ordinal (ordered)
types of numerical =
- discrete (steps, only particular numbers)
- continuous (can take any value)
types of discrete
- nominal (numbers for classification purposes: e.g. 0 for male, 1 for female)
- ordinal (the number carries an order: 1 for low SES, 2 for medium SES, etc.)
types of continuous =
- interval
- ratio (has a zero point, a score twice as high means twice as much of the property)
Interval variable: Equal intervals on the variable represent equal differences in the property being measured (e.g., the difference between 6 and 8 is equivalent to the difference between 13 and 15).
Ratio variable: The same as an interval variable, but the ratios of scores on the scale must also make sense (e.g., a score of 16 on an anxiety scale means that the person is, in reality, twice as anxious as someone scoring 8). For this to be true, the scale must have a meaningful zero point.
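The difference between the interval and ratio cards can be checked with a few lines of arithmetic. The sketch below uses temperature as an illustration (an assumption, it is not in the notes): Celsius behaves like an interval scale, Kelvin like a ratio scale because it has a true zero. The values are made up.

```python
# Hypothetical temperatures, used only to illustrate interval vs ratio scales.
celsius_a, celsius_b = 10.0, 20.0

# Kelvin has a true zero point; Celsius does not.
kelvin_a = celsius_a + 273.15
kelvin_b = celsius_b + 273.15

# Differences are the same on both scales (interval property).
diff_celsius = celsius_b - celsius_a
diff_kelvin = kelvin_b - kelvin_a

# Ratios only mean something on the scale with a meaningful zero.
ratio_celsius = celsius_b / celsius_a   # 2.0, but 20 °C is not "twice as hot"
ratio_kelvin = kelvin_b / kelvin_a      # about 1.035

print(diff_celsius, round(diff_kelvin, 2))
print(ratio_celsius, round(ratio_kelvin, 3))
```

Shifting the zero point leaves the differences untouched but changes the ratio, which is why "twice as much" needs a meaningful zero.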
measurement error=
discrepancy between the numbers we use to represent the thing we are measuring and the actual value of the thing we are measuring
criterion validity =
evidence that scores from an instrument correspond with (concurrent validity) or predict (predictive validity) external measures conceptually related to the measured construct
so: does it agree with the criterion from other tests?
concurrent validity=
a form of criterion validity where there is evidence that scores from an instrument
correspond to concurrently recorded external measures conceptually related to the
measured construct
content validity=
does the content of the test correspond to the content it was designed to cover?
predictive validity =
a form of criterion validity where there is evidence that scores from an instrument
predict external measures (recorded at a different point in time) conceptually related
to the measured construct.
so the two forms of criterion validity =
predictive & concurrent validity
reliability =
are the results replicable? how much measurement error and noise occurred?
test-retest reliability =
will the same group of people tested twice get the same results?
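Test-retest reliability is commonly summarised as the correlation between the scores from the two sessions. A minimal sketch, assuming Pearson's r is the statistic used and with made-up scores for five people (`pearson_r`, `time_1` and `time_2` are illustrative names):

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Made-up scores for the same five people at two time points.
time_1 = [12, 15, 9, 20, 17]
time_2 = [13, 14, 10, 19, 18]

r = pearson_r(time_1, time_2)
print(round(r, 2))
```

A value close to 1 would suggest that people keep roughly the same relative position on both occasions.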
David Hume said that to infer cause and effect…
- cause and effect should occur close in time: contiguity
- cause must occur before the effect does: priority
- the effect should never occur without the presence of the cause: exclusivity
tertium quid problem=
an unidentified third element that operates in combination with two known ones. also known as confounding
what did John Stuart Mill add because of confounding variables
that causality can only be inferred through comparison of two controlled situations: one in which the cause is present and one in which the cause is absent (= experiment)
two methods of data collection
between-groups/between subjects/independent design
within-subject/repeated measures
two types of variation
unsystematic: random factors that exist between the experimental conditions
systematic: when the experimenter does something in one condition but not in the other
2 sources of systematic variation in experiments
- practice effects
- boredom effects (tired/bored)
how can you counteract this systematic variation
by counterbalancing the order in which a person participates in the conditions
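One simple way to counterbalance is to cycle participants through every possible order of the conditions. This is a sketch under that assumption; the participant IDs and condition labels are hypothetical, and `counterbalance` is an illustrative helper, not a standard function.

```python
from itertools import permutations

def counterbalance(participants, conditions):
    """Cycle participants through every possible order of the conditions."""
    orders = list(permutations(conditions))
    return {
        person: orders[i % len(orders)]
        for i, person in enumerate(participants)
    }

# Hypothetical participant IDs and condition labels.
schedule = counterbalance(["p1", "p2", "p3", "p4"], ["A", "B"])
for person, order in schedule.items():
    print(person, "->", " then ".join(order))
```

With two conditions, half of the participants do A first and half do B first, so practice and boredom effects are spread evenly over the conditions.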
5 statistics of a quantitative variable
- mean
- standard deviation
- skewness (direction and degree)
- kurtosis: thickness of the tails
- z scores
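The five statistics listed above can be computed by hand in a few lines. The sketch below assumes the sample standard deviation (dividing by n - 1) and simple moment-based formulas for skewness and excess kurtosis; statistical packages often apply small-sample corrections, so their values may differ slightly. The scores are made up.

```python
from math import sqrt

def describe(scores):
    """Return mean, SD, skewness, excess kurtosis and z-scores for a list."""
    n = len(scores)
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]
    # Sample standard deviation (divides by n - 1).
    sd = sqrt(sum(d ** 2 for d in deviations) / (n - 1))
    # Central moments (divide by n) for the shape statistics.
    m2 = sum(d ** 2 for d in deviations) / n
    m3 = sum(d ** 3 for d in deviations) / n
    m4 = sum(d ** 4 for d in deviations) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3   # 0 for a normal distribution
    # z-score: distance from the mean in standard deviation units.
    z_scores = [d / sd for d in deviations]
    return mean, sd, skewness, excess_kurtosis, z_scores

# Made-up scores with one large value stretching the right tail.
scores = [2, 3, 3, 4, 4, 4, 5, 5, 12]
mean, sd, skew, kurt, z = describe(scores)
print(round(mean, 2), round(sd, 2), round(skew, 2), round(kurt, 2))
```

The single high score gives a positive skew here; a perfectly symmetric list such as `[1, 2, 3, 4, 5]` gives a skewness of 0.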
another word for frequency distribution
histogram
2 ways in which a graph can deviate from a normal distribution
- skew (lack of symmetry)
- pointiness (kurtosis)
how do you interpret skew
the closer to zero, the smaller the skew. the further away from zero (negative or positive), the more skewed
what is a distribution with positive kurtosis called and what is it
leptokurtic distribution: many scores in the tails.
what is a distribution with negative kurtosis called and what is it
platykurtic: thin in the tails, flatter than normal
what is normal kurtosis called
mesokurtic
how to interpret kurtosis
positive values only.
> 2 is high positive
1-2 moderate positive
0.5-1 weak positive
0-0.5 zero kurtosis
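The rule of thumb above can be written as a small lookup. The card does not say which label a value exactly on a boundary gets, so the sketch assumes boundary values fall into the lower category; `kurtosis_label` is an illustrative name.

```python
def kurtosis_label(kurtosis):
    """Map a non-negative kurtosis value onto the verbal labels above."""
    if kurtosis < 0:
        raise ValueError("the rule of thumb above only covers positive values")
    if kurtosis > 2:
        return "high positive"
    if kurtosis > 1:      # assumption: exactly 2 counts as moderate
        return "moderate positive"
    if kurtosis > 0.5:    # assumption: exactly 1 counts as weak
        return "weak positive"
    return "zero"

print(kurtosis_label(0.3), "|", kurtosis_label(1.4), "|", kurtosis_label(2.6))
```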
mode=
the score that occurs most frequently
what is it called when several scores are tied for most frequent
bimodal = 2 modes
multimodal = more than two modes
median =
the middle score, when scores are ranked in order of magnitude
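Mode and median are both available in Python's standard `statistics` module. A minimal sketch with made-up scores; `multimode` returns every score tied for most frequent, which covers the bimodal case.

```python
from statistics import median, multimode

scores = [3, 5, 5, 6, 8, 8, 9]   # made-up scores

modes = multimode(scores)         # every score tied for most frequent
middle = median(scores)           # middle score after ranking

print(modes, middle)
```

Here 5 and 8 both occur twice, so the list is bimodal, and the middle of the seven ranked scores is 6.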
disadvantage of the mean
- can be influenced by extreme scores
- can be influenced by skewed distributions
- only with interval or ratio data
advantage of the mean compared with the median and mode
- the mean uses all the scores instead of focusing on one
- the mean tends to be stable across different samples
range disadvantage
extremely easily influenced by extremes
how do you fix the range being so strongly influenced by extreme scores
use the interquartile range: split the ranked data into four equal parts and take the range of the middle 50% of scores.
median = second quartile
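How much an extreme score affects the range but not the interquartile range can be seen in a short sketch. It assumes the default quartile method of Python's `statistics.quantiles`; other software may place the quartiles slightly differently. The scores and helper names are made up.

```python
from statistics import quantiles

def value_range(scores):
    """Largest score minus smallest score."""
    return max(scores) - min(scores)

def interquartile_range(scores):
    """Upper quartile minus lower quartile (range of the middle 50%)."""
    q1, q2, q3 = quantiles(scores, n=4)   # three cut points, four parts
    return q3 - q1

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]   # made-up scores
with_outlier = scores[:-1] + [100]              # replace 11 by an extreme score

print(value_range(scores), value_range(with_outlier))
print(interquartile_range(scores), interquartile_range(with_outlier))
```

The range jumps when the extreme score is added, while the interquartile range stays the same because the outlier lies outside the middle 50%.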