The Interpolating Spline
-curve passing through all data points
-so minimises: Σ [yi - μi^]² (sum over i = 1,…,n), attaining the minimum value 0
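-a minimal sketch in R of this zero-residual property, using stats::splinefun with a natural interpolating spline on hypothetical toy data (t, y are reused by the later sketches):

```r
## Hypothetical toy data, reused in the sketches below
set.seed(1)
t <- sort(runif(100))
y <- sin(2 * pi * t) + rnorm(100, sd = 0.2)

## A natural interpolating spline passes through every (ti, yi),
## so its residual sum of squares is exactly 0 (up to floating point)
f_interp <- splinefun(t, y, method = "natural")
sum((y - f_interp(t))^2)   # ~0
```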
Smoothing Spline Definition
-for given λ > 0 the smoothing spline fλ^ is the function f minimising the penalised sum of squares Rλ(f) = Σ [yi - f(ti)]² + λ J(f) (defined in full later)
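-a minimal sketch, assuming stats::smooth.spline (which fits cubic smoothing splines, so ν=2) and an illustrative λ:

```r
## Fit a cubic smoothing spline to the toy data for a fixed lambda
fit  <- smooth.spline(t, y, lambda = 1e-4)  # lambda chosen for illustration
fhat <- predict(fit, x = t)$y               # fitted values f_lambda^(t_i)
```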
Methods for Choosing the Optimal Lambda
1) training and test
2) cross-validation / ‘leave-one-out’
3) generalised cross-validation
Training and Test
1) partition the indices I = {1,…,n} into disjoint subsets I1 & I2 such that I1⋃I2 = I
- this gives a training set: D1 = { (ti,yi), i∈I1}
- and a test set: D2 = { (ti,yi), i∈I2}
2) fit a smoothing spline to D1 to find fλ,I1^ for some specified λ
3) calculate the goodness of fit to D2: QI1:I2(λ) = Σ [yi - fλ,I1^(ti)]²
- sum over i in I2 and choose λ to minimise QI1:I2(λ)
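-a minimal sketch of this procedure (reusing the toy t, y; the 70/30 split and λ grid are illustrative):

```r
## Train/test selection of lambda with stats::smooth.spline
I1 <- sample(seq_along(t), size = 70)   # training indices
I2 <- setdiff(seq_along(t), I1)         # test indices

Q_test <- function(lambda) {
  fit  <- smooth.spline(t[I1], y[I1], lambda = lambda)  # fit to D1 only
  fhat <- predict(fit, x = t[I2])$y                     # evaluate at test ti's
  sum((y[I2] - fhat)^2)                                 # Q_{I1;I2}(lambda)
}

lambdas <- 10^seq(-8, 0, length.out = 50)
lambdas[which.min(sapply(lambdas, Q_test))]             # chosen lambda
```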
Cross Validation / Leave One Out
-same as the training and test set but use only one observation in the test set:
-training set: D1 = D-j = {(ti,yi), i∈I-j}, where I-j = {1,…,n} \ {j}
-test set: D2 = {(tj,yj)} for given j
-then calculate:
Q-j:j(λ) = [yj - fλ,-j^(tj)]²
-repeat for each j∈{1,…,n} then average to form the ordinary cross-validation criterion:
Qocv(λ) = 1/n Σ [yj - fλ,-j^(tj)]²
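-a minimal sketch of the naive computation, refitting the spline n times (reusing the toy t, y):

```r
## Ordinary (leave-one-out) cross-validation by brute force
Q_ocv_naive <- function(lambda) {
  sq_err <- vapply(seq_along(t), function(j) {
    fit <- smooth.spline(t[-j], y[-j], lambda = lambda)  # fit without obs j
    (y[j] - predict(fit, x = t[j])$y)^2                  # held-out squared error
  }, numeric(1))
  mean(sq_err)                                           # Q_ocv(lambda)
}
```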
Disadvantage of Cross Validation / Leave One Out
-the spline must be refitted n times (once per left-out observation), which is computationally expensive for large n; the smoothing-matrix shortcut below avoids this
Matrix Form of the Smoothing Spline Description
-for given λ and order ν≥1 the fitted value fλ^(tk) at each knot tk may be written as a linear combination of the observations y1,…,yn
Matrix Form of the Smoothing Spline Coefficients
-a smoothing spline which minimises the penalised sum of squares for given λ has coefficients a^ and b^ given by:
[a^ b^]^t = {Mλ}^(-1) [y 0]^t
Matrix Form of the Smoothing Spline f
f = [f1 … fn]^t = K b^ + L a^ = [K L] [Mλ(11) Mλ(21)]^t y
-where Mλ(11) and Mλ(21) denote the corresponding blocks of {Mλ}^(-1)
Matrix Form of the Smoothing Spline Smoothing Matrix
-can show:
S = Sλ = [K L] [Mλ(11) Mλ(21)]^t
=>
f = S y
-where S, the smoothing matrix, is a symmetric, positive definite matrix
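-a minimal sketch verifying this numerically: since the smoother is linear in y, column j of Sλ can be recovered by smoothing the unit vector ej (reusing the toy t, y; λ illustrative):

```r
## Recover S_lambda column by column, then check the claimed properties
lambda <- 1e-4
n <- length(t)
S <- sapply(seq_len(n), function(j) {
  e <- numeric(n); e[j] <- 1                 # unit vector e_j
  predict(smooth.spline(t, e, lambda = lambda, all.knots = TRUE), x = t)$y
})
max(abs(S - t(S)))                                    # ~0: S is symmetric
min(eigen((S + t(S)) / 2, symmetric = TRUE)$values)   # > 0: positive definite
```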
Cross Validation / Leave One Out Smoothing Matrix
-to speed up cross-validation, Qocv can be computed directly from the single spline fλ^ fitted to the full dataset:
Qocv(λ) = 1/n Σ [(yj - fλ^(tj))/(1-sjj)]²
-where fλ^(tj) is the full-data fitted spline evaluated at tj and sjj is the jth diagonal element of Sλ
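-a minimal sketch of this shortcut, assuming smooth.spline's lev component (the diagonal of the smoother matrix) and the toy t, y:

```r
## Q_ocv from a single full-data fit, with no refitting
Q_ocv_fast <- function(lambda) {
  fit <- smooth.spline(t, y, lambda = lambda, all.knots = TRUE)
  res <- y - predict(fit, x = t)$y   # full-data residuals y_j - f_lambda^(t_j)
  mean((res / (1 - fit$lev))^2)      # fit$lev = diag(S_lambda), i.e. the s_jj
}
```

-note: smooth.spline(t, y, cv = TRUE) selects λ by this ordinary CV score itself, reporting it as cv.crit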
Generalised Cross-Validation
Qgcv(λ) = 1/n Σ [(yj - fλ^(tj)) / (1 - (1/n) trace(Sλ))]²
-this is the criterion used by default for smoothing-parameter selection by gam in the mgcv package in R
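-a minimal sketch: since smooth.spline returns df = trace(Sλ), Qgcv needs only one full-data fit (toy t, y as above):

```r
## GCV: replace each s_jj in Q_ocv by the average trace(S_lambda)/n
Q_gcv <- function(lambda) {
  fit <- smooth.spline(t, y, lambda = lambda, all.knots = TRUE)
  res <- y - predict(fit, x = t)$y
  mean((res / (1 - fit$df / length(t)))^2)   # fit$df = trace(S_lambda)
}
```

-note: smooth.spline's default cv = FALSE chooses λ by minimising exactly this GCV score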
How many degrees of freedom are in a smoothing spline? Outline
-there are (n+ν) parameters in (b^, a^) but not all are completely free
How many degrees of freedom are in a smoothing spline? λ -> ∞
-the smoothing spline f^(t) becomes the least squares regression solution for the model formula y~1 when ν=1, or y~1+t when ν=2
How many degrees of freedom are in a smoothing spline? λ -> 0
-the number of degrees of freedom becomes n, since the smoothing spline f^(t) becomes the interpolating spline when λ=0
Ordinary Least Squares Regression Fitted Values
y^ = X [X^t X]^(-1) X^t y
Ordinary Least Squares Regression Hat Matrix
y^ = H y
-where H, the hat matrix, linearly maps data y onto fitted values y^:
H = X [X^t X]^(-1) X^t
Ordinary Least Squares Regression Hat Matrix & DoF
-for ordinary least squares regression:
trace(H) = p
-the trace of the hat matrix is equal to the number of model parameters (the number of degrees of freedom)
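-a minimal sketch checking trace(H) = p on an assumed toy design built from t:

```r
## Hat matrix for OLS with p = 3 parameters (intercept, linear, quadratic)
X <- cbind(1, t, t^2)                     # n x p design matrix
H <- X %*% solve(crossprod(X)) %*% t(X)   # H = X (X'X)^{-1} X'
sum(diag(H))                              # trace(H) = p = 3
```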
Smoothing Matrix Hat Matrix
-the smoothing matrix Sλ plays the same role for a smoothing spline as the hat matrix H does for ordinary least squares: each maps y linearly onto the fitted values
Smoothing Matrix Effective Degrees of Freedom
edf_λ = trace(Sλ)
-can show that:
edf_∞ = ν
edf_0 = n
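-a minimal sketch of the two limits for a cubic smoothing spline (ν=2), using smooth.spline's df component (= trace(Sλ)) at extreme illustrative λ values:

```r
## Effective degrees of freedom at the two extremes of lambda
smooth.spline(t, y, lambda = 1e10,  all.knots = TRUE)$df   # -> nu = 2
smooth.spline(t, y, lambda = 1e-10, all.knots = TRUE)$df   # -> n = 100
```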
Penalised Sum of Squares
Rλ(f) = Σ[yi - f(ti)]² + λ J(f)
-sum from i=1 to i=n
When can the penalised sum of squares be used?
-the penalised sum of squares is fine for Gaussian data BUT for non-Gaussian responses or non-identity link functions it needs to be replaced with the penalised deviance
Penalised Deviance Definition
Rλ(f,β) = D(y,f,β) + λ J(f)
Penalised Deviance Roughness Penalty
-when there are several smooth terms of order ν in the model, f1,…,fm, each may be assigned its own roughness penalty (sum over k = 1,…,m):
Rλ1,…,λm(y,f1,…,fm,β) = D(y,f1,…,fm,β) + Σ λk J(fk)
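-a minimal sketch with mgcv, where gam gives each smooth term its own λ and selects them jointly by GCV (toy data; all names illustrative):

```r
library(mgcv)

## Two smooth terms, each carrying its own penalty lambda_1, lambda_2
set.seed(1)
d <- data.frame(x1 = runif(200), x2 = runif(200))
d$y <- sin(2 * pi * d$x1) + d$x2^2 + rnorm(200, sd = 0.2)

fit <- gam(y ~ s(x1) + s(x2), data = d, method = "GCV.Cp")
fit$sp   # the selected smoothing parameters, one per smooth term
```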