Stats 6 - Non-Linear Models Flashcards
What is characteristic of Linear models?
- All the coefficients/parameters (β0, β1, β2) in a linear model enter the model linearly –> simple
- The data can be fitted with the Ordinary Least Squares (OLS) solution –> minimizing the sum of the squared residuals
Examples shown below
- Note that even the last example, which contains e^β0, is still linear: e^β0 is just a constant term
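For instance (illustrative examples only; the original figure is not reproduced here):
y = β0 + β1·x + ε
y = β0 + β1·x + β2·x² + ε (a polynomial in x, but still linear in the β's)
y = e^{β0} + β1·x + ε (e^{β0} is just a constant intercept)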
How can we characterise Non-Linear Models?
A non-Linear model is not linear in the parameters
Examples of Non-Linear models
In all of these examples, at least one parameter enters the model non-linearly (e.g. x^{β2}, where the parameter β2 appears as an exponent of x)
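For instance (illustrative examples only):
y = β0 + β1·x^{β2} + ε (β2 appears as an exponent of x)
y = β0·e^{β1·x} + ε (β1 sits inside an exponential)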
When trying to fit a Linear model to our data, how do we decide what’s best?
Least Squares Solution!
Can we apply the Least Squares solution to a Non-Linear Model?
No! –> It does not work: for a non-linear model there is, in general, no exact closed-form least squares solution for the parameters.
Why do we care about Non-Linear models in the first place?
Many observations in biology are not well-fitted by linear models –> the underlying biological phenomenon is not well described by a linear equation
Examples:
- Michaelis-Menten Biochemical Kinetics
- Allometric growth (the scaling of one body part relative to another)
- Response of metabolic rates to changing temperature
- Predator-prey functional response
- Population growth
- Time-series data (sinusoidal patterns)
Non-Linear model Example – Temperature and Metabolism
The enzyme responsible for bioluminescence is strongly temperature dependent –> this is captured by a modified Arrhenius equation
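The modified version is not reproduced here, but as a reminder, the basic Arrhenius form that it builds on describes a rate that increases exponentially with temperature:
k = A·e^(−Ea / (R·T))
where k is the rate, A is a constant, Ea is the activation energy, R is the gas constant and T is the absolute temperature.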
So how do we fit a Non-Linear Model to our data?
We can use a computer to find the approximate but close-to-optimal least squares solution!
- Choose starting values –> guess some initial values for the parameters
- Then adjust the parameters iteratively using an algorithm –> searching for decreases in RSS
- Eventually, end up with a combination of β where the RSS is approximately minimized.
Note –> It is better if your initial guesses for the parameters are close to the global minimum of the RSS
Outline the general procedure of fitting Non-Linear Models to data
General Procedure
1. Start with an initial value for each parameter
2. Generate the curve defined by these initial values
3. Calculate the RSS
4. Adjust the parameters to make the curve fit the data more closely, i.e. reduce the RSS - this is the tricky part
5. Adjust the parameters again…
6. Iterative process –> repeat steps 4 and 5
7. Stop the iterations when further adjustments make virtually no difference to the RSS
A minimal illustration of this idea is sketched below.
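A minimal sketch in R of minimizing an RSS from starting values, using the general-purpose optimiser optim() on a hypothetical exponential growth model (the data and starting guesses below are made up purely for illustration):

# Hypothetical data: population size N measured over time t (illustrative numbers only)
t_data <- c(0, 1, 2, 3, 4, 5)
N_data <- c(2.1, 3.9, 8.2, 16.5, 31.8, 65.0)

# Residual sum of squares for the model N = N0 * exp(r * t)
rss <- function(params) {
  N0 <- params[1]
  r  <- params[2]
  predicted <- N0 * exp(r * t_data)
  sum((N_data - predicted)^2)
}

# Start from an initial guess; optim() then adjusts N0 and r iteratively,
# searching for decreases in the RSS, until the improvement is negligible
fit <- optim(par = c(N0 = 1, r = 1), fn = rss)
fit$par    # approximately RSS-minimising parameter values
fit$value  # the (approximately) minimised RSS

Note that optim() is a general-purpose optimiser, not the nls()/nlsLM() functions discussed later, but it illustrates the same idea of iterative RSS minimisation from starting values.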
What are the two main types of Optimizing Algorithms used when adjusting parameters to minimize RSS?
- The Gauss-Newton algorithm is often used, but it does not work well if the model to be fitted is mathematically complicated (the parameter search landscape is difficult), and it also struggles when the starting parameter values are far from the optimum
- Levenberg-Marquardt –> an algorithm that switches between Gauss-Newton and "gradient descent" (which helps decide which direction to take in a complicated landscape) –> it is more robust to starting values that are far from optimal and is more reliable in most scenarios.
What should you do when your Non-linear model has been fitted?
Once NLLS fitting is done, you need to get the goodness of fit measures –> Is the model representative?
- First, we assess the fit visually
- Report the goodness of fit results:
a) Residual Sum of Squares (RSS)
b) Estimated coefficients
c) For each coefficient, we can present the confidence interval (how confident we are that the coefficient lies within a specific range), the t-statistic and the corresponding (two-tailed) p-value (see the sketch below)
- You may also want to compare and select between multiple competing models
Note –> Unlike Linear models, R² should NOT be used to interpret the quality of an NLLS fit.
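A minimal sketch of extracting these quantities in R, assuming fit is an nls/nlsLM model object that has already been fitted:

summary(fit)           # estimated coefficients with standard errors, t-statistics and p-values
confint(fit)           # profile-based confidence intervals for each coefficient
sum(residuals(fit)^2)  # residual sum of squares (RSS)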
What are the NLLS assumptions?
NLLS fitting has all the same assumptions as Ordinary Least Squares regression (a quick way to eyeball them in R is sketched after this list):
- No/minimal measurement error in the explanatory variable
- Data have constant variance –> errors in the y direction are homogeneously distributed over the x-axis range
- The measurement/observation errors are normally distributed (Gaussian)
- Observations are independent of each other
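A minimal sketch of such visual checks in R, again assuming fit is an already-fitted nls/nlsLM object:

res <- residuals(fit)
hist(res)                 # roughly bell-shaped if the errors are approximately normal
qqnorm(res); qqline(res)  # points close to the line suggest normally distributed errors
plot(fitted(fit), res)    # no funnel or pattern suggests roughly constant variance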
What happens if the errors in our Non-Linear model are not normally distributed?
We have to interpret the results cautiously, and may need to use maximum likelihood or Bayesian fitting methods instead
What algorithm is normally used in R?
When using the nls() function –> the Gauss-Newton algorithm is used by default
For the Levenberg-Marquardt (LM) algorithm –> use nlsLM() –> this requires installing the minpack.lm package
nlsLM() also offers additional features, like the ability to "bound" parameters to realistic values (see the sketch below)
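A minimal sketch of bounding parameters with nlsLM (the starting values here are placeholders, and V_data/S_data are the variables used in the Michaelis-Menten example below):

library(minpack.lm)

MM_model_LM <- nlsLM(V_data ~ V_max * S_data / (K_M + S_data),
                     start = list(V_max = 1, K_M = 1),
                     lower = c(V_max = 0, K_M = 0))  # keep both parameters non-negative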
Outline the Coefficients in the Michaelis Menten equation.
Co-efficients
- Vmax –> Maximum rate of reaction –> occurs at saturating substrate concentration
- Km –> substrate concentration at which the rate is Vmax/2 –> an indication of affinity –> high Km = low affinity / low Km = high affinity
Km will dictate the overall shape of the curve –> does it approach Vmax quickly or slowly?
We have to remember that Vmax and Km have to be greater than zero –> important when picking starting values
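For reference, the Michaelis-Menten equation these coefficients come from (with S the substrate concentration and V the reaction rate):
V = Vmax · S / (Km + S)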
How to set up a Michaelis Menten model on R?
MM_model <- nls(V_data ~ V_max * S_data / (K_M + S_data))  # note: no starting values supplied here (see below)
V_data –> Rate of reaction
S_data –> Substrate concentration
When trying to fit a Non-Linear model on R, what will R do if you don’t input starting parameters/Coefficients?
For nls models you need to provide starting values for the parameters
If none are given then it will set all parameters to ‘1’ and work from there –> for simple models, despite the warning, this works well enough.
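A fuller sketch with explicit starting values (the data below are made-up illustrative numbers; in practice sensible guesses can be read off a plot of the data, e.g. the largest observed rate for V_max):

# Hypothetical substrate concentrations and reaction rates (illustrative numbers only)
S_data <- c(0.1, 0.5, 1, 2, 5, 10, 20)
V_data <- c(0.8, 3.0, 4.9, 6.6, 8.2, 9.0, 9.5)

MM_model <- nls(V_data ~ V_max * S_data / (K_M + S_data),
                start = list(V_max = max(V_data), K_M = 1))

summary(MM_model)  # estimated V_max and K_M, with standard errors and p-values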