Stats 6 - Non-Linear Models Flashcards

1
Q

What is characteristic of Linear models?

A
  • The model is linear in its co-efficients/parameters (β0, β1, β2): each one enters the equation as a simple multiplier or additive term –> simple
  • The data can be fitted with the Ordinary Least Squares (OLS) solution –> minimizing the sum of the squared residuals

Example shown below

  • Note that even the last example with e^β0 is linear, as e^β0 is just a constant term
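A minimal sketch of "linear in the parameters", assuming hypothetical x and y values: the fitted curve bends, but β0, β1 and β2 each enter the equation linearly, so lm() can solve it exactly with OLS.

```r
# Hypothetical data: a curved relationship between x and y
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 3.9, 7.2, 11.8, 18.1, 26.2, 35.9, 47.1)

# y = b0 + b1*x + b2*x^2 is curved in x but linear in b0, b1 and b2,
# so the OLS solution applies directly.
quad_fit <- lm(y ~ x + I(x^2))

coef(quad_fit)               # the three OLS estimates
sum(residuals(quad_fit)^2)   # the residual sum of squares that OLS minimizes
```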
2
Q

How can we characterise Non-Linear Models?

A

A non-Linear model is not linear in the parameters

Examples of Non-Linear models

In all of these examples, at least one parameter enters the model non-linearly (e.g. as an exponent, xi^β2)

3
Q

When trying to fit a Linear model to our data, how do we decide what’s best?

A

Least Squares Solution!

4
Q

Can we apply the Least Squares solution to a Non-Linear Model?

A

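No! –> The exact (closed-form) OLS solution does not work for a Non-Linear model; the least squares solution has to be approximated numerically instead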

5
Q

Why do we care about Non-Linear models in the first place?

A

Many observations in biology are not well-fitted by linear models –> the underlying biological phenomenon is not well described by a linear equation

Examples:

  1. Michaelis-Menten Biochemical Kinetics
  2. Allometric growth (growth of two body parts in proportion to each other)
  3. Response of metabolic rates to changing temperature
  4. Predator-prey functional response
  5. Population growth
  6. Time-series data (sinusoidal patterns)

Non-Linear model Example – Temperature and Metabolism

The enzyme responsible for bioluminescence is very temperature dependent –> captured by a modified Arrhenius equation

6
Q

So how do we fit data with a Non-Linear Model?

A

We can use a computer to find the approximate but close-to-optimal least squares solution!

  • Choose starting values –> guess some initial values for the parameters
  • Then adjust the parameters iteratively using an algorithm –> searching for decreases in RSS
  • Eventually, end up with a combination of β where the RSS is approximately minimized.

Note –> Better if your guess of the initial parameters is close to the global minimum

7
Q

Outline the general procedure of fitting Non-Linear Models to data

A

General Procedure

  1. Start with an initial value for each parameter
  2. Generate a curve defined by the initial values
  3. Calculate the RSS
  4. Adjust the parameters to make the curve fit closer to the data (minimize the residual sum of squares) –> the tricky part
  5. Adjust the parameters again…
  6. Iterative process –> repeat steps 4+5
  7. Stop iterating when the adjustments make virtually no difference to the RSS
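The procedure above can be sketched as a hand-written Gauss-Newton loop. This is an illustration only, assuming hypothetical Michaelis-Menten data and a reasonable starting guess; the names S, V, mm and rss are made up for the example, and nls() does this work for you in practice. Plain Gauss-Newton has no safeguard against a bad step, which is why the starting guess matters.

```r
# Hypothetical data: substrate concentration (S) and reaction rate (V)
S <- c(0.5, 1, 2, 4, 8, 16, 32)
V <- c(1.5, 2.4, 4.1, 5.6, 7.4, 8.3, 9.2)

mm  <- function(theta, S) theta[1] * S / (theta[2] + S)   # theta = (Vmax, Km)
rss <- function(theta) sum((V - mm(theta, S))^2)

theta     <- c(8, 2)       # step 1: starting values (a guess)
rss_start <- rss(theta)    # steps 2-3: curve at the guess and its RSS
rss_old   <- rss_start

for (iter in 1:50) {
  r <- V - mm(theta, S)                          # current residuals
  J <- cbind(S / (theta[2] + S),                 # d(model)/d(Vmax)
             -theta[1] * S / (theta[2] + S)^2)   # d(model)/d(Km)
  step  <- solve(t(J) %*% J, t(J) %*% r)         # Gauss-Newton adjustment
  theta <- theta + as.vector(step)               # steps 4-6: adjust and repeat
  rss_new <- rss(theta)
  if (abs(rss_old - rss_new) < 1e-10) break      # step 7: RSS stopped changing
  rss_old <- rss_new
}

theta     # estimated Vmax and Km
rss_new   # RSS at the final estimates
```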
8
Q

What are the two main types of Optimizing Algorithms used when adjusting parameters to minimize RSS?

A
  1. Gauss-Newton –> often used, but it doesn’t work well if the model to be fitted is mathematically complicated (the parameter search landscape is difficult) or if the starting values you have supplied are far from optimal
  2. Levenberg-Marquardt –> switches between Gauss-Newton and “gradient descent” (which helps decide which direction to take in a complicated landscape) –> more robust against starting values that are far from optimal and more reliable in most scenarios
9
Q

What should you do when your Non-linear model has been fitted?

A

Once NLLS fitting is done, you need to get the goodness of fit measures –> Is the model representative?

  1. First, we assess the fit visually
  2. Report the goodness of fit results:
    a) Residual Sum of Squares (RSS)
    b) Estimated co-efficients
    c) For each co-efficient, we can present the confidence interval (the range within which we are confident the co-efficient lies), the t-statistic and the corresponding (two-tailed) p-value
  3. You may also want to compare and select between multiple competing models

Note –> Unlike Linear models, R² should NOT be used to interpret the quality of an NLLS fit.

10
Q

What are the NLLS assumptions?

A

NLLS has all the same assumptions as Ordinary Least Squares regression.

  1. No/minimal measurement error in the explanatory variable
  2. Data have constant variance –> errors in the y-axis are homogeneously distributed over the x-axis range
  3. The measurement/observation errors are normally distributed (Gaussian)
  4. Observations are independent of each other
11
Q

What happens if the errors in our Non-Linear model are not normally distributed?

A

We have to interpret the NLLS results cautiously, and consider maximum likelihood or Bayesian fitting methods instead

12
Q

What algorithm is normally used in R?

A

When using the nls() function –> the Gauss-Newton algorithm is used by default

But for the Levenberg-Marquardt (LM) algorithm –> nlsLM() –> we need to install a package, minpack.lm

nlsLM() offers additional features like the ability to “bound” parameters to realistic values

13
Q

Outline the Coefficients in the Michaelis Menten equation.

A

Co-efficients

  1. Vmax –> Maximum rate of reaction –> occurs at saturating substrate concentration
  2. Km –> Substrate concentration at which the reaction rate is Vmax/2 –> indication of affinity –> High Km = Low affinity / Low Km = High affinity

Km will dictate the overall shape of the curve –> does it approach Vmax quickly or slowly?

We have to remember that Vmax and Km have to be greater than zero –> important when picking starting values

14
Q

How to set up a Michaelis Menten model in R?

A

MM_model <- nls(V_data ~ V_max * S_data / (K_M + S_data))

V_data –> Rate of reaction

S_data –> Substrate concentration
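A self-contained sketch of the same call, assuming hypothetical values for S_data and V_data. The data-driven starting values (V_max_start, K_M_start) are an assumption added for illustration and are not part of the card.

```r
# Hypothetical measurements: substrate concentration and reaction rate
S_data <- c(0.5, 1, 2, 4, 8, 16, 32)
V_data <- c(1.5, 2.4, 4.1, 5.6, 7.4, 8.3, 9.2)

# Rough starting values (an assumption, not from the card):
# V_max near the largest observed rate, K_M near the substrate
# concentration whose rate is closest to half of that maximum.
V_max_start <- max(V_data)
K_M_start   <- S_data[which.min(abs(V_data - V_max_start / 2))]

MM_model <- nls(V_data ~ V_max * S_data / (K_M + S_data),
                start = list(V_max = V_max_start, K_M = K_M_start))

coef(MM_model)   # fitted V_max and K_M
```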

15
Q

When trying to fit a Non-Linear model in R, what will R do if you don’t input starting parameters/Coefficients?

A

For nls models you should provide starting values for the parameters

If none are given, R will set all parameters to ‘1’, issue a warning, and work from there –> For simple models, despite the warning, this works well enough.
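A small sketch of what this looks like in practice, assuming hypothetical data. The warning is captured into a vector (seen, a name made up here) so it can be inspected, and any outright fitting error is kept as an object rather than stopping the script.

```r
# Hypothetical measurements: substrate concentration and reaction rate
S_data <- c(0.5, 1, 2, 4, 8, 16, 32)
V_data <- c(1.5, 2.4, 4.1, 5.6, 7.4, 8.3, 9.2)

seen <- character(0)   # warning messages collected during the fit

fit_default <- tryCatch(
  withCallingHandlers(
    nls(V_data ~ V_max * S_data / (K_M + S_data)),   # no 'start' argument
    warning = function(w) {
      seen <<- c(seen, conditionMessage(w))          # record the warning text
      invokeRestart("muffleWarning")                 # and carry on fitting
    }
  ),
  error = function(e) e   # keep the error object if the fit fails outright
)

seen                           # the warning(s) issued by nls()
inherits(fit_default, "nls")   # TRUE if the fit still converged
```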

16
Q

After fitting your Michaelis Menten Model what should you do?

Hint - Look at image

A
  1. First step is to visualize how well the model fits the data

Create Plot

plot(S_data, V_data, xlab = "Substrate Concentration", ylab = "Reaction Rate")

Add Trendline from Model (assumes S_data is sorted in increasing order, otherwise the line zigzags)

lines(S_data, predict(MM_model), lty = 1, col = "blue", lwd = 2)

  2. After plotting, gather some information using summary()

Estimates –> Estimated values for the Co-efficients (Vmax and Km)

Estimate / Std. Error = t value, which has a given Pr(>|t|) –> t-test for the statistical significance of the obtained estimate

Number of iterations –> Number of times the NLS algorithm had to adjust the parameter values to find the minimal RSS solution.

Achieved Convergence tolerance –> tells you on what basis the algorithm decided it was close enough to the solution –> basically if the RSS does not improve past a certain point despite adjusting parameters the algorithm stops searching.
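The plotting and summary steps can be sketched end to end as follows. The data are hypothetical, and predicting on a fine grid (S_grid, a name made up here) is an assumption added so the fitted curve is drawn smoothly.

```r
# Hypothetical measurements: substrate concentration and reaction rate
mm_data <- data.frame(S_data = c(0.5, 1, 2, 4, 8, 16, 32),
                      V_data = c(1.5, 2.4, 4.1, 5.6, 7.4, 8.3, 9.2))

MM_model <- nls(V_data ~ V_max * S_data / (K_M + S_data),
                data = mm_data, start = list(V_max = 9, K_M = 2))

# Predict on a fine, increasing grid so the curve is smooth
S_grid <- seq(min(mm_data$S_data), max(mm_data$S_data), length.out = 200)
V_grid <- predict(MM_model, newdata = data.frame(S_data = S_grid))

plot(mm_data$S_data, mm_data$V_data,
     xlab = "Substrate Concentration", ylab = "Reaction Rate")
lines(S_grid, V_grid, lty = 1, col = "blue", lwd = 2)

fit_summary <- summary(MM_model)
fit_summary$coefficients     # Estimate, Std. Error, t value, Pr(>|t|)
MM_model$convInfo$finIter    # number of iterations
MM_model$convInfo$finTol     # achieved convergence tolerance
```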

17
Q

What are the main differences between lm and nls summary() output?

A

Generally, the same format, except that the last two rows are specific to an nls output:

  1. Number of iterations
  2. Achieved convergence tolerance

Why are they included?

NLLS is not an exact process –> it relies on an iterative numerical search by the computer.

Normally, the last two rows are not reported BUT they can be useful in solving problems if the fitting does not work

18
Q

What is a quick way to obtain the co-efficient values from an nls model?

A

You can quickly obtain the co-efficient values from your nls model using the following code…

coef(MM_model)

19
Q

Can an ANOVA be performed on a Non-Linear model?

A

NO! –> A standard ANOVA table (as for a linear model) cannot be produced for a single non-linear model

20
Q

How can you obtain confidence intervals for the co-efficient estimates of an nls model? What can they be used for?

A

One very useful thing you can do after NLLS fitting is to construct confidence intervals (CI’s) around the estimated parameters/coefficients

Use the following Code - confint(MM_model)

It can be used for…

  1. The CI’s can be used to test whether a coefficient estimate is significantly different from a reference value –> is the reference value inside or outside the CI?
  2. They can also be a quick way to test whether the same model fitted to another population sample gives statistically different coefficients –> do the two CI’s overlap?

In either case…

If the ranges overlap -> They are not statistically different

If the ranges do NOT overlap -> They are statistically different

Image –> Shows the 95% confidence interval for each co-efficient, i.e. the range within which we are 95% confident the co-efficient lies
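A short sketch of the reference-value check, assuming hypothetical data. K_M_reference is a made-up value chosen only to show the comparison.

```r
# Hypothetical measurements: substrate concentration and reaction rate
mm_data <- data.frame(S_data = c(0.5, 1, 2, 4, 8, 16, 32),
                      V_data = c(1.5, 2.4, 4.1, 5.6, 7.4, 8.3, 9.2))

MM_model <- nls(V_data ~ V_max * S_data / (K_M + S_data),
                data = mm_data, start = list(V_max = 9, K_M = 2))

ci <- confint(MM_model)   # 95% intervals by default, one row per coefficient
ci

# Hypothetical reference value for K_M: does it fall inside the interval?
K_M_reference <- 5
inside <- ci["K_M", 1] <= K_M_reference && K_M_reference <= ci["K_M", 2]
inside   # TRUE means K_M is not significantly different from the reference
```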

21
Q

Are R² values obtained from a Non-Linear model reliable?

A

R² values obtained from a Non-Linear model ARE NOT reliable, and thus should not be used

They don’t always accurately reflect the quality of the fit and can definitely not be used to select between competing models

22
Q

How can we tell R to start with specific coefficients for a non-linear model?

A

MM_model2 <- nls(V_data ~ V_max * S_data / (K_M + S_data), start = list(V_max = 12, K_M = 7))

Example –> Include start = list (… , …)

Note –> When selecting starting values, make sure they are sensible and make biological sense

23
Q

Does using different starting values impact the final co-efficient?

A

YES!

Example below for Michaelis Menten Non-Linear model

  1. Co-efficients both set to one
  2. Co-efficients - V_max = 12 and K_M = 7
  3. Co-efficients - V_max = 0.01 and K_M = 10

A look at the different outputs!

24
Q

What happens when you use starting values that are too far from their actual values?

A

If you provide values that are VERY far from the optimal you will receive an error message –> e.g. a “singular gradient matrix at initial parameter estimates” error

Takeaway message –> NLLS model fitting is NOT an exact procedure.

But provided that your starting values are reasonable, NLLS is exact enough

25
Q

What is a more robust algorithm that can be used if the standard Gauss-Newton doesn’t work?

A

Levenberg-Marquardt algorithm –> available through a function called nlsLM()

Note - install and load the package using the following code:

install.packages("minpack.lm")

require("minpack.lm")

If you were to rerun any nls models that initially produced an error message, nlsLM() is more likely to produce an actual output
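A hedged sketch of rerunning a fit with nlsLM(), assuming hypothetical data and deliberately poor starting values. Errors are caught and kept as text, because whether nls() fails from a given start depends on the data. The nlsLM() call only runs if minpack.lm is installed.

```r
# Hypothetical measurements: substrate concentration and reaction rate
mm_data <- data.frame(S_data = c(0.5, 1, 2, 4, 8, 16, 32),
                      V_data = c(1.5, 2.4, 4.1, 5.6, 7.4, 8.3, 9.2))

mm_formula <- V_data ~ V_max * S_data / (K_M + S_data)
far_start  <- list(V_max = 0.01, K_M = 10)   # deliberately poor starting values

# Default nls() (Gauss-Newton): keep the error message if it fails
fit_gn <- tryCatch(nls(mm_formula, data = mm_data, start = far_start),
                   error = function(e) conditionMessage(e))

# Levenberg-Marquardt via minpack.lm, if the package is available
if (requireNamespace("minpack.lm", quietly = TRUE)) {
  fit_lm <- tryCatch(minpack.lm::nlsLM(mm_formula, data = mm_data,
                                       start = far_start),
                     error = function(e) conditionMessage(e))
} else {
  fit_lm <- "minpack.lm is not installed"
}

# Show coefficients where a fit was produced, otherwise the message
print(if (inherits(fit_gn, "nls")) coef(fit_gn) else fit_gn)
print(if (inherits(fit_lm, "nls")) coef(fit_lm) else fit_lm)
```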

26
Q

Can you bound the co-efficient/parameter values in nlsLM?

A

Yes!

You can bound the parameter values –> preventing them from exceeding a Max or falling below a Min.

Result of this?

Computer is more likely to produce the output in fewer iterations

Quick Aside –> The nls() function also has an option to provide lower and upper parameter bounds, but it only takes effect when using algorithm = "port".
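A minimal sketch of bounded fitting, assuming hypothetical data and arbitrary illustrative bounds of 0 to 50 for both parameters. The first fit uses base R's nls() with algorithm = "port"; the nlsLM() version runs only if minpack.lm is installed.

```r
# Hypothetical measurements: substrate concentration and reaction rate
mm_data <- data.frame(S_data = c(0.5, 1, 2, 4, 8, 16, 32),
                      V_data = c(1.5, 2.4, 4.1, 5.6, 7.4, 8.3, 9.2))

# Bounds are given in the same order as the parameters in 'start'
fit_port <- nls(V_data ~ V_max * S_data / (K_M + S_data),
                data = mm_data,
                start = list(V_max = 9, K_M = 2),
                algorithm = "port",
                lower = c(0, 0), upper = c(50, 50))
print(coef(fit_port))

# Same bounds with Levenberg-Marquardt, if minpack.lm is available
if (requireNamespace("minpack.lm", quietly = TRUE)) {
  fit_lm <- minpack.lm::nlsLM(V_data ~ V_max * S_data / (K_M + S_data),
                              data = mm_data,
                              start = list(V_max = 9, K_M = 2),
                              lower = c(0, 0), upper = c(50, 50))
  print(coef(fit_lm))
}
```

Both parameters must be positive in a Michaelis-Menten model, so a lower bound of 0 is the natural choice; the upper bound should stay generous enough not to cut off the true solution.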

27
Q

What happens if you set up the bounds of the co-efficients/parameters too tightly?

A

If you bound the parameters too much –> the algorithm will have insufficient parameter space –> the solution won’t be as reliable.

28
Q

What is the main diagnostic plot used to test the appropriateness of an NLLS fit?

A

Plotting the residuals of a fitted NLLS model –> to check for Normal distribution

At the very least you should plot the residuals of the NLLS model in a histogram

Example: hist(residuals(MM_model6))

You can run further diagnostics with the nlstools package

  1. install.packages("nlstools")
  2. require("nlstools")
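A base-R sketch of the residual check, assuming hypothetical data. The Q-Q plot and the Shapiro-Wilk test are additions for illustration, not something the card prescribes; with very few data points, such checks have little power.

```r
# Hypothetical measurements: substrate concentration and reaction rate
mm_data <- data.frame(S_data = c(0.5, 1, 2, 4, 8, 16, 32),
                      V_data = c(1.5, 2.4, 4.1, 5.6, 7.4, 8.3, 9.2))

MM_model <- nls(V_data ~ V_max * S_data / (K_M + S_data),
                data = mm_data, start = list(V_max = 9, K_M = 2))

res <- as.numeric(residuals(MM_model))   # observed minus fitted rates

hist(res, main = "Residuals of MM_model", xlab = "Residual")

# Optional extras: points close to the line suggest roughly normal residuals
qqnorm(res)
qqline(res)

sw <- shapiro.test(res)   # formal normality test
sw$p.value
```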
29
Q

What does Allometric Scaling of traits refer to?

A

Allometric Relationships take the form of…

y = a·x^b

  • Where ‘x’ and ‘y’ are morphological measures
  • The constant ‘a’ is the value of y when x = 1
  • ‘b’ is the scaling exponent

Note that this is not a Linear model –> Hence, would be a good candidate for nls.

Example of an allometric relationship:

Body Length vs. Body weight –> body weight does not increase in direct proportion to body length
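A sketch of fitting the allometric equation with nls(), assuming hypothetical length and weight values. Taking starting values from a log-log straight-line fit is one common shortcut and is an assumption added here; the name PowFit matches the model compared by AIC on a later card.

```r
# Hypothetical measurements of body length and body weight
allo <- data.frame(
  body_length = c(10, 15, 20, 25, 30, 35, 40),
  body_weight = c(0.9, 3.6, 8.1, 15.2, 27.5, 42.0, 65.1)
)

# log(y) = log(a) + b*log(x) is a straight line, so lm() on the logged
# data gives rough starting values for a and b.
loglog     <- lm(log(body_weight) ~ log(body_length), data = allo)
start_vals <- list(a = exp(coef(loglog)[[1]]), b = coef(loglog)[[2]])

# Fit y = a * x^b directly on the original scale
PowFit <- nls(body_weight ~ a * body_length^b, data = allo, start = start_vals)

coef(PowFit)   # a: value of y at x = 1, b: scaling exponent
```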

30
Q

How can we compare NLLS models?

A

It is important to compare an NLLS model with one or more alternatives for a more extensive and reliable investigation.

Remember R² can not be used for Non-Linear models

So how do we decide which model is better?

Akaike Information Criterion (AIC) using the AIC() function –> Estimates the information lost as a result of fitting the model –> lower AIC = better model

Example comparing a Nonlinear Model to a Linear model (Quadratic)

AIC(PowFit) - AIC(QuaFit) = -2.1474260812509

How can you tell which one is better?

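Rule of Thumb –> if the AIC values differ by more than 2, we can pick a winner –> the model with the lower AIC is the better one. Here the AIC of PowFit is about 2.15 lower than that of QuaFit, so PowFit is preferred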
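A sketch of the comparison, assuming hypothetical length and weight data. PowFit and QuaFit follow the card's names, but the data and both fits are made up for illustration, so the difference will not match the number shown above.

```r
# Hypothetical measurements of body length and body weight
allo <- data.frame(
  body_length = c(10, 15, 20, 25, 30, 35, 40),
  body_weight = c(0.9, 3.6, 8.1, 15.2, 27.5, 42.0, 65.1)
)

# Non-linear power model: y = a * x^b
PowFit <- nls(body_weight ~ a * body_length^b, data = allo,
              start = list(a = 0.001, b = 3))

# Linear (in the parameters) quadratic model: y = b0 + b1*x + b2*x^2
QuaFit <- lm(body_weight ~ body_length + I(body_length^2), data = allo)

delta_aic <- AIC(PowFit) - AIC(QuaFit)
delta_aic   # negative: PowFit has the lower AIC; positive: QuaFit does

# Rule of thumb from the card: only pick a winner if |difference| > 2
abs(delta_aic) > 2
```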

31
Q

How can we gauge the goodness of fit of an NLLS model?

A

You can NOT use ANOVA or R-Squared Values

The best way to assess the quality of an NLLS model fit is to compare it to another, alternative model’s fit.

Other than that…

  • Another way to assess the quality of the fit is to examine whether the fitted coefficients are reliable

For example:

  1. Low standard errors
  2. High t-values
  3. Low p-values