lecture 4 - model fitting Flashcards
types of models
- model animal
- algorithmic model
- artificial neural network
- data-driven model
model animal
- model animals allow researchers to draw conclusions that may generalize across species
- e.g. mice are often used as models to study biological processes and behaviors relevant to humans
algorithmic model
- never touches data
- relies on algorithms or theoretical constructs
- they are abstract and typically focus on understanding or simulating processes in a hypothetical or idealized way
artificial neural network
- more of a tool than a scientific model
- typically applied in engineering contexts to process data, rather than to explain underlying biological mechanisms
- do mimic some properties of biological neural networks
data-driven model
- used by scientists to explain data
- explicitly created to analyze and interpret real data, helping scientists draw insights directly from empirical observations
George Box: ‘All models are wrong. But some models are useful’
- emphasizes that no model can perfectly represent reality, because models are simplifications of complex systems.
- they leave out details and assumptions, and therefore cannot fully capture the intricacies of real-world phenomena.
- however, despite their limitations, models can still be valuable tools.
descriptive models
- mathematical description of the data
- ‘fitting’ is important
- fitted parameters can be assessed, but are properties of the data more than of the model
process models
- mathematical description of the process that gave rise to the data
- ‘fitting’ is important
- fitted parameters have meaning because they tell us something about the generative process - i.e., how the data was produced
utility of descriptive models
- gaussian distribution (central limit theorem): models noise
- n-degree polynomial: describe the shape of data
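A descriptive fit of this kind can be sketched in a few lines (a minimal sketch using NumPy; the data here are synthetic, generated from a known quadratic plus Gaussian noise):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a quadratic trend plus Gaussian noise
x = np.linspace(-1, 1, 100)
y = 2 * x**2 - x + rng.normal(0, 0.1, size=x.size)

# Descriptive fit: a 2nd-degree polynomial describes the shape of the
# data, but the fitted coefficients carry no mechanistic meaning
coeffs = np.polyfit(x, y, deg=2)  # highest degree first
y_hat = np.polyval(coeffs, x)

print(np.round(coeffs, 2))  # recovers roughly [2, -1, 0]
```

The coefficients are properties of this dataset's shape: change the data and they change, with no commitment to how the data were generated.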
utility of process models
- parameters have cognitive/neural meaning
E.G.,
- parameters quantify the process by which the brain reaches a decision
- modeler commits to latent variables (e.g., action value in RL)
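To illustrate a parameter with cognitive meaning: a minimal Rescorla-Wagner-style value update, a common process model in RL. The learning rate alpha is the interpretable parameter; the reward sequence below is hypothetical.

```python
# Minimal Rescorla-Wagner update: the learning rate (alpha) has
# cognitive meaning -- how strongly prediction errors update the
# latent action value the modeler has committed to.

def update_value(value, reward, alpha):
    """One trial: move the latent value toward the observed reward."""
    prediction_error = reward - value
    return value + alpha * prediction_error

value = 0.0
rewards = [1, 1, 0, 1, 1, 1, 0, 1]  # hypothetical outcome sequence
for r in rewards:
    value = update_value(value, r, alpha=0.3)

print(round(value, 3))  # learned action value after 8 trials
```

Fitting alpha to a participant's choices would tell us something about their learning process, not just about the shape of their data.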
why are process models harder to formulate
- because you have to think about the underlying process, not just the data
- i.e., they require a deep understanding of the cognitive or neural mechanisms, not just the ability to fit a dataset
process models or descriptive models
- many models are a bit of both descriptive and process categories
- e.g., signal detection theory (SDT)
occam’s razor/rule
- helps decide what the better model is
- lex parsimoniae: suggests that “entities are not to be multiplied without necessity.”
- if two models give an accurate description of the data, the simpler model is to be preferred
why choose the simpler model
- generalization: a good model is not dependent on the experiment: it generalizes.
- there is always a model with more parameters, giving a better fit
key questions for model selection
- does the data require more complexity: if a simpler model fits the data adequately, adding complexity might not be necessary.
- are you fitting the process or the noise: the goal is to model the actual process that generated the data, not the random fluctuations or noise within it.
overfitting
- happens when a model has too many parameters for the data, so it fits the noise in a specific dataset rather than the true underlying pattern
cross-validation
- main method against overfitting
- splits the data into fit (train) and test datasets
- if you’re fitting noise that is unique to the training set, your model should fail to predict the test data (CV r² ~ N(0, σ), i.e., test-set prediction centered on zero)
comparing model fits
- cross validation: splits data into training and test sets to check if the model generalizes well to unseen data
- information criteria: downweight the quality of fit of a model by penalizing the number of parameters.
types of information criteria
- bayesian information criterion (BIC): -2ln(L) + k * ln(n)
- akaike information criterion (AIC): -2ln(L) + 2k
k = number of parameters, n = sample size, ln(L) = log-likelihood
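The two criteria are one line of code each. The log-likelihoods below are made-up values for illustration: the complex model fits slightly better but pays a larger penalty (lower scores are better).

```python
import math

def aic(log_likelihood, k):
    # AIC = -2 ln(L) + 2k
    return -2 * log_likelihood + 2 * k

def bic(log_likelihood, k, n):
    # BIC = -2 ln(L) + k ln(n); the penalty grows with sample size
    return -2 * log_likelihood + k * math.log(n)

# Hypothetical fits of a simple and a more complex model
simple = {"log_likelihood": -105.0, "k": 2}
complex_ = {"log_likelihood": -103.5, "k": 6}
n = 200

for m in (simple, complex_):
    print(round(aic(m["log_likelihood"], m["k"]), 1),
          round(bic(m["log_likelihood"], m["k"], n), 1))
```

Here both criteria prefer the simple model: the improvement of 1.5 in log-likelihood does not justify four extra parameters, and BIC's ln(200) ≈ 5.3 penalty per parameter punishes the complex model harder than AIC's flat 2.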
how do information criteria work
- If two models fit the data equally well, the one with fewer parameters will be preferred because it incurs a smaller penalty.
- the criteria favor simpler models unless the more complex model demonstrates a significantly better fit to the data
BIC & AIC: quantity of parameters
- there is flexibility in the number of parameters in a model
- it is possible to formulate a model that decreases the number of parameters without losing predictive power
BIC and AIC: what does it mean to say that they are conservative metrics
- BIC and AIC are conservative: they favor simpler models that avoid overfitting by penalizing added complexity
- BIC is a more conservative criterion as it includes a stronger penalty for the number of parameters than AIC, especially when the sample size is large
- in practice: first use cross-validation; if that is not possible, fall back on information criteria
what does it mean to ‘fit a model to data’
- it means we found the parameters for our model that, when used to create a prediction (simulate), best explain our data
model fitting methods
- quantify explanation differently
- search the parameters differently
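These two ingredients can be made concrete in a minimal sketch (synthetic data; squared error as the way of quantifying explanation, and a simple grid search as the way of searching the parameter):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data generated by y = a * x with true a = 0.8
x = np.linspace(0, 1, 50)
y = 0.8 * x + rng.normal(0, 0.05, size=x.size)

def loss(a):
    """Quantify explanation: sum of squared errors of the model y = a*x."""
    return np.sum((y - a * x) ** 2)

# Search the parameter: evaluate the loss over a grid of candidates
grid = np.linspace(0, 2, 201)
best_a = grid[np.argmin([loss(a) for a in grid])]
print(best_a)  # close to the true value 0.8
```

Other methods swap in a different loss (e.g., negative log-likelihood) or a smarter search (e.g., gradient-based optimization), but the structure is the same: a measure of fit plus a way of exploring parameter space.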