Lecture 1 Flashcards
behavioural data science aims to…
facilitate understanding, prediction, and change of human behaviour through the analysis of behaviourally defined variables as they arise in large datasets (big data), typically gathered using modern digital technology and analyzed with techniques for detecting patterns in high-dimensional data
what does bds fuse
ideas from mathematical modeling and statistical data analysis with substance from the social sciences.
what must bds always involve
something to do with behaviour
3 goals of science
understanding, prediction, and control (now called change)
understanding =
construction of psychological theories to explain behaviour
prediction =
application of statistical models to predict behaviour
change =
development of interventions to change behaviour
why do we need bds
- human behaviour is the root of many problems
- human behaviour is complicated to study, yet the standard methods for studying it are remarkably simple (tests, questionnaires, etc.)
- new sources of data -> need for new approaches
the golden age of social science:
social science is transforming from a largely qualitative or experimentally oriented field into a more data-driven field in which formal theories of human behaviour will become much more important
data =
representations of observations, specific to a particular person at a particular time
phenomena =
robust features of the world, patterns in data, e.g. associations and correlations.
theory =
set of principles that aims to explain phenomena
why dont we try to explain data, only phenomena
because that is only done in special cases such as fraud detection; observations are highly specific to an individual, so they do not need to be explained themselves
data are structured in…
rows = cases
columns = variable/features/properties etc.
variables =
abstract structures that represent the differences between cases
difference between variables and data
data are more specific: measured for one person at a particular time point; from data we construct variables.
can you generalise data
no, but we can form phenomena
what describes a world in which the phenomena would follow as a matter of course
= theory
coming up with a good theory is a creative act, but it can be systematized and practiced
the relationship between data, phenomena and theory
data establish -> phenomena <- theory explains
how do you get from data to theory, step by step
- data are used to represent observations
- statistical models are used to detect patterns in observations: phenomena
- theories aim to explain phenomena
- theories describe a world in which phenomena would follow as a matter of course
- if theories are to explain statistical patterns, they are ideally cast in mathematical form
the lexical decision task =
In the lexical decision task, a participant is presented with a single word, usually visually in the center of a computer screen. The participant’s task is to decide, as quickly and as accurately as possible, whether the word is a real word of his or her language.
responses are given by pressing keyboard keys with the index fingers
usually 50% real words and 50% non-words
performance on ldt measures
how well lexical representations are activated from memory
which words lead to better performance
high-frequency words (more common) yield better performance than low-frequency words
what are the 2 key variables of interest in the ldt
response time
accuracy (proportion correct responses)
what kind of people are better at the ldt
young people perform better than older people
how can this age difference be explained
global slowing hypothesis: older adults are slower in all cognitive processes
possibly due to age-related demyelination -> which harms neural transmission speed
speed-accuracy tradeoff
participants can invest more time and use it to think more carefully about their answer -> accuracy goes up. this is what older adults do.
problems with standard analysis
- no account for the tradeoff between accuracy and rt
- no process model (does not show how the data originates)
- unclear what the underlying mechanisms are
- unclear how ppl generate the data
so what is the solution
use a process model, which shows how the cognitive machinery produces the data
what is an example of a process model
ratcliff’s diffusion model
ratcliff’s diffusion model =
a model that describes how noisy evidence is accumulated over time. when you look at the stimulus, you mentally accumulate information towards the word or non-word response, but this is a noisy process! so it shows how simple decision-making processes occur
what else is special about ratcliff's diffusion model
it shows how behaviour can be decomposed into latent psychological processes
ratcliff model history
originally used in biology and physics, later adopted in psychology
how does the ratcliff diffusion model work
noisy information is accumulated over time (sequential sampling); the deterministic component of this noisy process = the drift rate.
repeated draws from the lexical dimension drive a noisy accumulation of evidence. after some time, the accumulated evidence reaches a predetermined threshold amount, and the corresponding response is initiated.
non decision time =
time needed for encoding and motor processes
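to make this concrete, here is a minimal sketch of one trial of such an accumulation process as a discretized random walk; the function name, parameter names, and all values are illustrative assumptions, not the lecture's notation:

```python
import numpy as np

def diffusion_trial(drift, boundary, start, ndt, noise=1.0, dt=0.001,
                    max_time=5.0, rng=None):
    """One lexical-decision trial as a noisy random walk between the
    'non-word' bound at 0 and the 'word' bound at `boundary` (a sketch,
    not Ratcliff's exact formulation)."""
    rng = np.random.default_rng() if rng is None else rng
    evidence, t = start, 0.0
    while 0.0 < evidence < boundary and t < max_time:
        # sequential sampling: deterministic drift plus Gaussian noise
        evidence += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    response = "word" if evidence >= boundary else "non-word"
    return response, ndt + t  # decision time plus non-decision time

rng = np.random.default_rng(1)
for _ in range(5):
    resp, rt = diffusion_trial(drift=1.5, boundary=2.0, start=1.0, ndt=0.3, rng=rng)
    print(f"{resp:8s} RT = {rt:.3f} s")
```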
less noise in the model =
more accuracy!
high drift rate =
the correct boundary is reached very quickly.
the drift rate is higher for 'cat' than for 'feline', so 'feline' moves to the correct boundary less quickly
what does the drift rate depend on
on task difficulty and subject ability
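a sketch of the drift-rate effect from the cat/feline card, using a vectorized version of the same random walk (sim_trials and all values are made up for illustration): higher drift should give faster and more accurate responses.

```python
import numpy as np

def sim_trials(drift, boundary, start, ndt, n=2000, dt=0.002,
               noise=1.0, tmax=4.0, seed=0):
    """Many diffusion trials at once; returns (word-response?, RT) arrays."""
    rng = np.random.default_rng(seed)
    steps = int(tmax / dt)
    inc = drift * dt + noise * np.sqrt(dt) * rng.standard_normal((n, steps))
    path = start + np.cumsum(inc, axis=1)                 # evidence trajectories
    up = np.where((path >= boundary).any(1), (path >= boundary).argmax(1), steps)
    lo = np.where((path <= 0).any(1), (path <= 0).argmax(1), steps)
    word = up < lo                                        # top bound hit first
    rt = ndt + np.minimum(up, lo) * dt
    return word, rt

# high-frequency 'cat' (high drift) vs low-frequency 'feline' (low drift)
for label, v in [("cat    (drift 2.5)", 2.5), ("feline (drift 0.8)", 0.8)]:
    word, rt = sim_trials(drift=v, boundary=2.0, start=1.0, ndt=0.3)
    print(f"{label}: accuracy = {word.mean():.2f}, mean RT = {rt.mean():.2f} s")
```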
boundary separation =
distance between word and non-word threshold
what if the boundary separation is small
then there is less distance between the word and non-word thresholds -> decisions are made more quickly, but with less accuracy
what does boundary separation do
it quantifies the response caution, and is responsible for the speed-accuracy tradeoff
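the tradeoff can be reproduced with the hypothetical sim_trials helper from the drift-rate sketch above (assumed to be in scope): widening the boundary separation should raise accuracy but slow responses.

```python
# reuses sim_trials from the drift-rate sketch above (illustrative values);
# the starting point is kept unbiased at half the boundary separation
for label, a in [("cautious (a = 3.0)", 3.0), ("speedy   (a = 1.0)", 1.0)]:
    word, rt = sim_trials(drift=1.0, boundary=a, start=a / 2, ndt=0.3)
    print(f"{label}: accuracy = {word.mean():.2f}, mean RT = {rt.mean():.2f} s")
```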
starting point =
reflects an a priori bias toward saying word or non-word.
if most stimuli are words, you come to expect a word on the next trial as well.
what if the researcher says it is very important to correctly choose words
you would move your threshold further away for the word response
so the bias in this experiment =
starting point
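a quick sketch of this bias, again assuming the hypothetical sim_trials helper from the drift-rate sketch is in scope: with a drift of 0 (an uninformative stimulus) only the starting point determines the response proportions.

```python
# reuses sim_trials from the drift-rate sketch above; drift = 0 means the
# stimulus carries no evidence, so only the starting-point bias matters
for label, z in [("unbiased    (z = a/2)", 1.0), ("word-biased (z = 3a/4)", 1.5)]:
    word, rt = sim_trials(drift=0.0, boundary=2.0, start=z, ndt=0.3)
    print(f"{label}: P(word) = {word.mean():.2f}, mean RT = {rt.mean():.2f} s")
```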
the speed-accuracy tradeoff in one sentence
the general ability of people to increase accuracy at the cost of taking more time
what showed that the global slowing hypothesis does not hold
the drift rate is the same across age groups
what does reflect the slowness
a larger boundary separation
nondecision time shows fluctuation…
so it has nothing to do with age
what is the conclusion of the study
the reaction time task seems to support the global slowing hypothesis, but the diffusion model shows that older adults are just as efficient at activating lexical content; they are slower because they are more cautious
process models help…
understand, decompose, measure and predict
people have poor intuitions about statistical processes without a concrete model, because verbal arguments can continue endlessly
so theory construction is a … not an …
a skill, not an art! (there can be a method to theory construction)
psychology's problem
uses mostly verbal theories that do not map to empirical predictions
theory construction steps
- identify the set of phenomena that you want to explain
- come up with a proto-theory
- formalize the proto-theory and the phenomena
- evaluate how well the resulting formal theory actually explains the phenomena -> evaluate explanatory adequacy
- overall evaluation of the theory
is the speed-accuracy tradeoff data, theory, or a phenomenon?
= a phenomenon
what is heritability
also a phenomenon
techniques used to detect data patterns
anova
regression
factor analysis
principal component analysis (see the toy sketch below)
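as a toy illustration of the last technique in this list, a minimal principal component analysis in plain numpy (all data and names are made up): two noisy measurements of one underlying trait, so the first component should capture most of the variance.

```python
import numpy as np

# hypothetical toy data: two noisy measurements of one underlying trait
rng = np.random.default_rng(0)
trait = rng.standard_normal(500)
X = np.column_stack([trait + 0.3 * rng.standard_normal(500),
                     trait + 0.3 * rng.standard_normal(500)])

Xc = X - X.mean(axis=0)                    # center each column
cov = Xc.T @ Xc / (len(Xc) - 1)            # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)     # eigendecomposition (ascending order)
explained = eigvals[::-1] / eigvals.sum()  # variance explained per component
print("variance explained by each PC:", np.round(explained, 3))
```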
proto-theory =
often verbally formulated, set of principles that putatively explain the phenomena.
(what if the world worked like this, then the phenomena would not be surprising)
what would the proto-theory for the positive manifold be
general intelligence
positive manifold =
Positive manifold refers to the fact that scores on cognitive assessments tend to correlate very highly with each other, indicating a common latent dimension that is very strong. This latent dimension became known as g, for general intelligence or general cognitive ability.
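a hedged sketch of this proto-theory at work (all loadings and sizes are made-up assumptions): if a single g drives every test score, all pairwise correlations come out positive.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_tests = 1000, 5
g = rng.standard_normal(n_people)               # general ability per person
loadings = rng.uniform(0.5, 0.9, size=n_tests)  # how strongly each test taps g
noise = rng.standard_normal((n_people, n_tests))
scores = g[:, None] * loadings + noise          # test score = g effect + noise

# all off-diagonal correlations are positive: a simulated positive manifold
print(np.round(np.corrcoef(scores, rowvar=False), 2))
```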
why do we need to formalize the proto-theory and the phenomena
because ppl are very bad at assessing explanatory power intuitively -> we need a formal model
theories almost never work as originally formulated
you always have to add principles, or adjust them
theoretical cycle
identify empirical phenomena -> abduction -> develop proto-theory -> abstraction -> formalize theory and phenomena -> mathematical analysis and simulation -> check explanatory adequacy -> theoretical analysis -> evaluate theory -> deduction -> …
what is bds a combination of
psychology, statistics, data science