newtons method Flashcards
κ(A):
the condition number, ||A||.||A^†|| where A^† is the pseudoinverse of A
A^†:
(A^(T)A)^(-1)A^T, so A^(†)A=I when m>=n and lin indep cols
A^(T)(AA^T)^-1 so AA^†=I when n>=m and lin indep rows
error in the k+1th iterate is bounded by:
||x(k+1)-x||<=((κ^2(A)-1)/(κ^2(A)+1))||xk-x||
newton’s method:
pick a starting guess x0, then x(k+1)=xk-[∇^(2)f(xk)]^(-1)∇f(xk), it stops when ||x(k+1)-xk||<ε for some predefined tolerance ε
please note this finds roots Not minimum points so check if it’s a saddle or maximum
newton’s method in practice:
find y such that ∇^(2)f(xk)y=-∇f(xk), x(k+1)=xk+y
issues with newton’s method:
it’s v fast but ∇^(2)f(xk) the matrix has to remain invertible at every step, to avoid this start as close as possible basically
lipschitz continuous with respect to norms:
||f(x)-f(y)||<=L||x-y|| where L is the lipschitz constant
quadratic convergence w/ newton:
let f in C^(2)(Rn) with lipschitz continuous hessian, let x* in Rn be a local minimiser with ∇f(x)=0 and ∇^(2)f(x)≻0 (positive definite), then for x0 sufficiently close to x*, newton’s method has quadratic convergence
quasinewton method:
computing the hessian is expensive so sometimes we approximate, a popular method for this is bfgs (broyden-fletcher-goldfarb-shanno)
start with x0, B0, compute pk=(Bk)^(-1)∇f(xk), x(k+1)=xk+akpk for a suitable step length ak, sk=akpk, yk=∇f(x(k+1))-∇f(xk), B(k+1)=Bk+(ykyk^(T)/yk^(T)sk)-(Bksksk^(T)Bk/sk^(T)Bksk)
stop if ||∇f(xk)||<ε or ||x(k+1)-xk||<ε for some tolerance ε