MCS3: Continuous Mathematics Flashcards
Lemma
A symmetric 2x2 matrix A = [a b; b c] is…
5 points
- positive definite iff |A| > 0 and a > 0
- positive semidefinite, not definite iff |A| = 0 and a + c ≥ 0
- negative definite iff |A| > 0 and a < 0
- negative semidefinite, not definite iff |A| = 0 and a + c ≤ 0
- indefinite iff |A| < 0
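A minimal Python sketch of this classification (the function name is my own; A = [[a, b], [b, c]]):

```python
def classify_2x2_symmetric(a, b, c):
    """Classify A = [[a, b], [b, c]] via |A| = ac - b*b and the trace a + c."""
    det = a * c - b * b
    if det > 0:
        return "positive definite" if a > 0 else "negative definite"
    if det < 0:
        return "indefinite"
    # |A| = 0: the eigenvalues are 0 and a + c
    return "positive semidefinite" if a + c >= 0 else "negative semidefinite"
```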
Lemma
If g: ℝ → ℝ is a strictly increasing function, then
Optimization Tricks
arg min f(x) = arg min g(f(x)) for x ∈ F
Lemma
If g: ℝ → ℝ is a 1-1 function, then
Optimization Tricks
arg min f(x) = g(arg min f(g(x))) for x ∈ F
Knowledge
Midpoint Rule
2 points
- Approximate f: [a, b] → ℝ by its 0ᵗʰ-order Taylor polynomial at the midpoint m = (a + b)/2
- M₁[f, a, b] = (b - a)f(m)
Knowledge
Bounds for err(M₁)[f, a, b]
2 points
- Lower bound: -((b - a)³/24)D̅₂
- Upper bound: -((b - a)³/24)D̲₂ (here D̲ₖ, D̅ₖ denote the min and max of f⁽ᵏ⁾ on [a, b])
Knowledge
Trapezium Rule
2 points
- Approximate f: [a, b] → ℝ by its linear interpolation
- T₁[f, a, b] = ((b - a)/2)(f(a) + f(b))
Knowledge
Bounds for err(T₁)[f, a, b]
2 points
- Lower bound: ((b - a)³/12)D̲₂
- Upper bound: ((b - a)³/12)D̅₂
Knowledge
Simpson’s Rule
2 points
- Approximate f: [a, b] → ℝ by quadratic interpolation
- S₂[f, a, b] = ((b - a)/6)(f(a) + 4f(m) + f(b))
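A sketch of the three basic rules above in Python (names are my own). Simpson's rule is exact on cubics, e.g. simpson(lambda x: x**3, 0, 1) returns 0.25 exactly:

```python
def midpoint(f, a, b):
    """M1[f, a, b] = (b - a) f(m), with m the midpoint of [a, b]."""
    return (b - a) * f((a + b) / 2)

def trapezium(f, a, b):
    """T1[f, a, b] = ((b - a)/2)(f(a) + f(b))."""
    return (b - a) / 2 * (f(a) + f(b))

def simpson(f, a, b):
    """S2[f, a, b] = ((b - a)/6)(f(a) + 4 f(m) + f(b))."""
    m = (a + b) / 2
    return (b - a) / 6 * (f(a) + 4 * f(m) + f(b))
```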
Knowledge
Bounds for err(S₂)[f, a, b]
2 points
- Lower bound: ((b - a)⁵/2880)D̲₄
- Upper bound: ((b - a)⁵/2880)D̅₄
Formula
Composite Midpoint Rule
Mₙ[f, a, b] = ((b - a)/n)(f((x₀ + x₁)/2) + … + f((xₙ₋₁ + xₙ)/2))
Knowledge
Composite Midpoint Rule Error Bounds
2 points
- O(n⁻²)
- Depends on -f’’
Formula
Composite Simpson’s Rule
Sₙ[f, a, b] = ((b - a)/(3n))(f(x₀) + 4f(x₁) + 2f(x₂) + 4f(x₃) + … + 4f(xₙ₋₁) + f(xₙ))
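A Python sketch of the two composite rules (my own naming; n is the number of subintervals, and must be even for Simpson):

```python
def composite_midpoint(f, a, b, n):
    """M_n[f, a, b]: midpoint rule applied on n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def composite_simpson(f, a, b, n):
    """S_n[f, a, b] = ((b - a)/(3n))(f(x0) + 4 f(x1) + 2 f(x2) + ... + f(xn))."""
    assert n % 2 == 0, "composite Simpson needs an even number of subintervals"
    h = (b - a) / n
    x = [a + i * h for i in range(n + 1)]
    total = f(x[0]) + f(x[-1])
    total += 4 * sum(f(x[i]) for i in range(1, n, 2))  # odd nodes
    total += 2 * sum(f(x[i]) for i in range(2, n, 2))  # interior even nodes
    return h / 3 * total
```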
Knowledge
Composite Simpson’s Rule Error Bounds
2 points
- O(n⁻⁴)
- Depends on f’’’’
Knowledge
d-dimensional Midpoint Rule Error Bounds
2 points
- N = nᵈ
- O(N^(-2/d))
Knowledge
d-dimensional Simpson’s Rule Error Bounds
2 points
- N = nᵈ
- O(N^(-4/d))
Knowledge
Monte Carlo Integration
4 points
- MCₙ[f, R] = A(R)(1/n)∑ᵢ f(Xᵢ)
- A(R) is the area or volume of R
- Xᵢ are independently uniformly random on R
- The pdf of each Xᵢ is p(x) = 1/A(R) if x ∈ R, otherwise p(x) = 0
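A sketch of MCₙ on a box-shaped region R (function and parameter names are my own); it also accumulates ∑ᵢ f(Xᵢ) and ∑ᵢ f(Xᵢ)², which the variance cards below use to estimate V:

```python
import random

def monte_carlo(f, lo, hi, n, seed=0):
    """MC_n[f, R] for the box R = [lo[0], hi[0]] x ... x [lo[d-1], hi[d-1]]."""
    rng = random.Random(seed)
    vol = 1.0                                  # A(R), the volume of the box
    for l, h in zip(lo, hi):
        vol *= h - l
    total = total_sq = 0.0
    for _ in range(n):
        x = [rng.uniform(l, h) for l, h in zip(lo, hi)]  # X_i uniform on R
        fx = f(x)
        total += fx
        total_sq += fx * fx
    estimate = vol * total / n
    # sample estimate of V, so that Var[MC_n] is roughly v_hat / n
    v_hat = vol ** 2 * (total_sq / n - (total / n) ** 2)
    return estimate, v_hat / n
```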
Lemma
MCₙ[f, R] is unbiased
E[MCₙ[f, R]] = ∫_R f(x) dx
Lemma
Var[MCₙ[f, R]] = …
2 points
- V / n for a constant V
- V can be estimated from ∑ᵢ f(Xᵢ) and ∑ᵢ f(Xᵢ)²
Lemma
MCₙ[f, R] is consistent
P[|err(MCₙ)[f, R]| ≥ ε] → 0 as n → ∞ for any ε > 0
Lemma
For large n, err(MCₙ)[f, R] is approximately distributed as
N(0, V/n)
Knowledge
IEEE 754 Single Precision Standard
4 points
- 23 bits for m
- 8 bits for e
- 1 bit for sign
- ε = 2⁻²⁴ ≈ 6⋅10⁻⁸
Knowledge
IEEE 754 Double Precision Standard
4 points
- 52 bits for m
- 11 bits for e
- 1 bit for sign
- ε = 2⁻⁵³ ≈ 10⁻¹⁶
Knowledge
Truncation Error
Approximating an infinite process by a finite one
Knowledge
Roundoff Error
Error from approximating a real number, e.g. when representing it in floating-point format
Definition
Linear Convergence
2 points
- εₙ → 0
- |εₙ₊₁|/|εₙ| → a for 0 < a < 1
Definition
Sublinear Convergence
2 points
- εₙ → 0
- |εₙ₊₁|/|εₙ| → 1
Definition
Logarithmic Convergence
3 points
- εₙ → 0
- xₙ converges sublinearly
- |εₙ₊₂ - εₙ₊₁|/|εₙ₊₁ - εₙ| → 1
Definition
Superlinear Convergence
2 points
- εₙ → 0
- |εₙ₊₁|/|εₙ| → 0
Definition
Order-k Convergence
2 points
- εₙ → 0
- |εₙ₊₁|/|εₙ|ᵏ → a for a > 0, k > 1
Lemma
If xₙ converges with order q, then …
x₂ₙ converges with order q²
Knowledge
Interval Bisection
4 points
- Let xₙ = (aₙ + bₙ)/2
- If f(aₙ)f(xₙ) < 0 then (aₙ, xₙ) is a bracket
- If f(aₙ)f(xₙ) > 0 then (xₙ, bₙ) is a bracket
- If f(xₙ) = 0 then we have found a root
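A direct Python transcription of these four steps (my own naming), assuming an initial bracket with f(a)f(b) < 0:

```python
def bisect(f, a, b, tol=1e-12):
    """Interval bisection: halve the bracket until it is narrower than tol."""
    assert f(a) * f(b) < 0, "need a bracket: f must change sign on (a, b)"
    while b - a > tol:
        x = (a + b) / 2
        fx = f(x)
        if fx == 0:
            return x          # found an exact root
        if f(a) * fx < 0:
            b = x             # (a, x) is the new bracket
        else:
            a = x             # (x, b) is the new bracket
    return (a + b) / 2
```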
Knowledge
Interval Bisection Error Analysis
2 points
- |εₙ| < (b₀ - a₀)/2ⁿ⁺¹
- Linear convergence
Knowledge
Newton’s Method (1 dimension)
2 points
- Approximate f by its first-order Taylor polynomial
- xₙ₊₁ = xₙ - f(xₙ)/f’(xₙ)
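The iteration in code (a sketch; names mine). For example, newton(lambda x: x*x - 2, lambda x: 2*x, 1.0) converges to √2 in a handful of steps:

```python
def newton(f, df, x0, n_iter=20):
    """Newton's method: x_{n+1} = x_n - f(x_n) / f'(x_n)."""
    x = x0
    for _ in range(n_iter):
        x = x - f(x) / df(x)
    return x
```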
Lemma
Error Analysis for Newton’s Method (1 dimension)
1st Lemma; 2 points
Assume f has two continuous derivatives and f(x*) = 0
Let A(c) = max|f’’(β)|/min|f’(α)| where α, β ∈ I = (x* - c, x* + c)
1. If xₙ ∈ I then |εₙ₊₁| ≤ (A(c)/2)εₙ²
2. If x₀ ∈ I and c A(c)/2 < 1 then xₙ → x* at least quadratically
Proof
Assume f has two continuous derivatives and f(x*) = 0
Let A(c) = max|f’’(β)|/min|f’(α)| where α, β ∈ I = (x* - c, x* + c)
1. If xₙ ∈ I then |εₙ₊₁| ≤ (A(c)/2)εₙ²
2. If x₀ ∈ I and c A(c)/2 < 1 then xₙ → x* at least quadratically
4 points
- Use definition of Newton’s method
- Use 2nd order Taylor’s remainder term
- Show |εₙ| ≤ ρⁿ|ε₀| where ρ = c A(c)/2 by induction
- Apply 1 to inductive hypothesis
Lemma
Error Analysis for Newton’s Method (1 dimension)
2nd Lemma; 2 points
Assume f has two continuous derivatives and f(x*) = 0
Let B(c) = max|f’’(β)|/min|f’(α)| where α, β ∈ I = (x₀ - 2c, x₀ + 2c)
1. If x* ∈ (x₀ - c, x₀ + c) and c B(c)/2 < 1 then xₙ → x* at least quadratically
2. Such an I exists if f’(x*) ≠ 0
Proof
Assume f has two continuous derivatives and f(x*) = 0
Let B(c) = max|f’’(β)|/min|f’(α)| where α, β ∈ I = (x₀ - 2c, x₀ + 2c)
1. If x* ∈ (x₀ - c, x₀ + c) and c B(c)/2 < 1 then xₙ → x* at least quadratically
2. Such an I exists if f’(x*) ≠ 0
5 points
- Deduce |ε₁| < ρc
- xₙ ∈ (x₀ - 2c, x₀ + 2c)
- Use induction to show |εₙ| ≤ ρⁿ|ε₀|
- A(c) → |f’’(x*)/f’(x*)| as c → 0
- c A(c)/2 < 1 for sufficiently small c
Knowledge
Secant Method (1 dimension)
2 points
- Approximate f by its linear interpolation
- xₙ₊₁ = xₙ - f(xₙ)(xₙ - xₙ₋₁)/(f(xₙ) - f(xₙ₋₁))
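The same sketch for the secant iteration, which needs two starting points but no derivative:

```python
def secant(f, x0, x1, n_iter=30):
    """Secant method: replace f' in Newton's method by a finite difference."""
    for _ in range(n_iter):
        fx0, fx1 = f(x0), f(x1)
        if fx1 == fx0:                    # flat secant line: cannot proceed
            break
        x0, x1 = x1, x1 - fx1 * (x1 - x0) / (fx1 - fx0)
    return x1
```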
Knowledge
Secant Method Error Analysis (1 dimension)
2 points
- Converges under same conditions as Newton’s method
- Order ~1.62 convergence
Knowledge
Halley’s Method
4 points
- Second-order Taylor approximation
- Cubic convergence under certain conditions
- Second-order polynomial may have zero or two roots
- Solving second-order polynomials requires the slow square-root function
Knowledge
Muller’s Method
5 points
- Tracks three points on the function, interpolating them by a parabola
- Second-order polynomial may have zero or two roots
- Solving second-order polynomials requires the slow square-root function
- Allows complex roots so may converge to a complex number
- Order of convergence ~1.84
Knowledge
Inverse Quadratic Interpolation
3 points
- Fit x = ay² + by + c
- Always has exactly one real solution to y = 0
- Order of convergence ~1.84
Knowledge
Brent’s Method
3 points
- Global convergence
- Linear convergence for badly-behaved functions and order ~1.84 for well-behaved
- Keeps track of three values; uses IQI, falls back to the secant method if IQI would step outside the bracket, and to interval bisection if neither sufficiently reduces the bracket size
Knowledge
Newton’s Method (d dimensions)
4 points
- Approximate each component of f(x) by its first-order Taylor polynomial
- xₙ₊₁ = xₙ - (J(f)(xₙ))⁻¹f(xₙ)
- Solve J(f)(xₙ) ∆x = -f(xₙ) then set xₙ₊₁ = xₙ + ∆x
- Computational cost of evaluating the Jacobian is O(d²) and solving the system is O(d³)
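A NumPy sketch of the solve-then-step formulation (names mine; f and jac return a d-vector and a d x d matrix respectively):

```python
import numpy as np

def newton_nd(f, jac, x0, n_iter=20):
    """d-dimensional Newton: solve J(f)(x_n) dx = -f(x_n), set x_{n+1} = x_n + dx."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        dx = np.linalg.solve(jac(x), -f(x))   # O(d^3) linear solve
        x = x + dx
    return x
```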
Knowledge
Broyden’s Method
6 points
- Track the Jacobian by an approximation
- xₙ₊₁ = xₙ + ∆x, where Ĵₙ∆x = -f(xₙ)
- Computational cost of solving the system is O(d³)
- Can track Ĵₙ⁻¹ instead for O(d²) computation
- Converges if started close to an isolated root, though more slowly than Newton’s method
- Limited memory methods reduce space requirement to O(d)
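A sketch of Broyden's rank-1 update, tracking Ĵₙ directly (so each step still pays the O(d³) solve; tracking Ĵₙ⁻¹ via a rank-1 inverse update would cut this to O(d²)). Names are my own:

```python
import numpy as np

def broyden(f, x0, J0, n_iter=50):
    """Broyden's method: secant-style rank-1 updates to an approximate Jacobian."""
    x = np.asarray(x0, dtype=float)
    J = np.asarray(J0, dtype=float)       # initial Jacobian estimate
    fx = f(x)
    for _ in range(n_iter):
        dx = np.linalg.solve(J, -fx)      # J_n dx = -f(x_n)
        x = x + dx
        fx_new = f(x)
        df = fx_new - fx
        # rank-1 update enforcing the secant condition J_{n+1} dx = df
        J = J + np.outer(df - J @ dx, dx) / (dx @ dx)
        fx = fx_new
    return x
```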
Knowledge
f(x) and f(x*) may be indistinguishable when…
|x - x*|/|x*| < O(√ε)
Knowledge
Golden Section Search
4 points
- A bracket is an interval (a, b, c) with f(b) < f(a) ∧ f(b) < f(c)
- If cₙ - bₙ > bₙ - aₙ then z = cₙ - ((√5 - 1)/2)(cₙ - bₙ)
- If f(b) < f(z) then aₙ₊₁ = aₙ, bₙ₊₁ = bₙ, cₙ₊₁ = z
- Symmetric for other cases
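A sketch following the card's conventions: probe a point z inside the larger half of the bracket, then shrink to whichever triple is still a bracket:

```python
PHI = (5 ** 0.5 - 1) / 2   # (sqrt(5) - 1)/2, about 0.618

def golden_section(f, a, b, c, tol=1e-10):
    """Minimize f given a bracket (a, b, c) with f(b) < f(a) and f(b) < f(c)."""
    while c - a > tol:
        if c - b > b - a:
            z = c - PHI * (c - b)          # probe inside the larger half (b, c)
            if f(b) < f(z):
                c = z                      # new bracket (a, b, z)
            else:
                a, b = b, z                # new bracket (b, z, c)
        else:
            z = a + PHI * (b - a)          # mirror case: probe inside (a, b)
            if f(b) < f(z):
                a = z                      # new bracket (z, b, c)
            else:
                b, c = z, b                # new bracket (a, z, b)
    return b
```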
Knowledge
Golden Section Search Error Analysis
2 points
- |εₙ| < Φⁿ⁺¹|c₀ - a₀| where Φ = (√5 - 1)/2
- Linear convergence
Knowledge
Line Search Methods
4 points
- xₙ₊₁ = xₙ + αₙ dₙ
- dₙ should be a descent direction: (d/dα) f(xₙ + α dₙ) = dₙᵀ gₙ < 0 at α = 0
- dₙ, dₙ₋₁, dₙ₋₂ should not veer wildly in different directions
- αₙ should loosely minimize f(xₙ + α dₙ)
Knowledge
Backtracking
3 points
- Start with αₙ = α’
- Repeatedly multiply αₙ by a factor ρ, 0 < ρ < 1 (usually ρ = 1/2), until the step length obeys the Armijo rule
- Choose α’ = αₙ₋₁ (gₙ₋₁ᵀdₙ₋₁)/(gₙᵀdₙ)
Knowledge
Armijo Rule
2 points
- f(xₙ + αₙdₙ) < f(xₙ) + σαₙgₙᵀdₙ
- σ in the range 10⁻⁴ to 10⁻¹
Knowledge
Coordinate Gradient Descent
dₙ = ±eᵢ
Knowledge
Gradient Descent
3 points
- dₙ = -gₙ
- O(d) computation for 1 evaluation of df/dx
- Global, linear convergence
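A sketch combining backtracking with the Armijo rule inside a simple gradient-descent loop (parameter defaults follow the cards above; names are mine):

```python
import numpy as np

def backtracking(f, x, g, d, alpha0=1.0, rho=0.5, sigma=1e-4):
    """Shrink alpha by rho until f(x + a d) < f(x) + sigma a g.d (Armijo)."""
    alpha = alpha0
    while f(x + alpha * d) >= f(x) + sigma * alpha * (g @ d):
        alpha *= rho
    return alpha

def gradient_descent(f, grad, x0, n_iter=200):
    """Line-search descent with d_n = -g_n."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        g = grad(x)
        if np.linalg.norm(g) < 1e-12:     # already at a stationary point
            break
        d = -g                            # steepest-descent direction
        x = x + backtracking(f, x, g, d) * d
    return x
```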
Knowledge
Newton’s Method
3 points
Gradient Descent
- Approximate f by its second-order Taylor polynomial
- dₙ = -H(f)(xₙ)⁻¹ gₙ
- αₙ = 1
Knowledge
Newton’s Method Error Analysis
2 points
Gradient Descent
- O(d²) derivatives in H(f) and O(d³) computation
- Convergence not always guaranteed and only quadratic when H(f)(xₙ) is positive definite near the minimum
Knowledge
BFGS Method
3 points
- Update an approximate Hessian Ĥₙ
- dₙ = -Ĥₙ⁻¹ gₙ
- Sherman-Morrison formula updates an approximate inverse Hessian Ĥₙ⁻¹
Knowledge
BFGS Method Error Analysis
2 points
- O(d³) computation, or O(d²) using the Sherman-Morrison formula
- Global, superlinear convergence for well-behaved f
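A sketch of BFGS tracking the approximate inverse Hessian directly, via the standard rank-2 update (which is where the Sherman-Morrison identity enters). The simple Armijo backtracking here is a stand-in for a proper Wolfe line search, which would guarantee the curvature condition yᵀs > 0:

```python
import numpy as np

def bfgs(f, grad, x0, n_iter=100, sigma=1e-4):
    """Quasi-Newton minimization: d_n = -H_inv g_n, rank-2 H_inv updates."""
    x = np.asarray(x0, dtype=float)
    H_inv = np.eye(x.size)                 # initial inverse-Hessian estimate
    g = grad(x)
    for _ in range(n_iter):
        if np.linalg.norm(g) < 1e-10:
            break
        d = -H_inv @ g                     # quasi-Newton search direction
        alpha = 1.0                        # Armijo backtracking
        while f(x + alpha * d) >= f(x) + sigma * alpha * (g @ d):
            alpha *= 0.5
        s = alpha * d
        g_new = grad(x + s)
        y = g_new - g
        rho = 1.0 / (y @ s)                # assumes y.s > 0 (curvature condition)
        Hy = H_inv @ y                     # O(d^2)
        # expanded rank-2 inverse-Hessian update, O(d^2) work per step
        H_inv = (H_inv - rho * (np.outer(Hy, s) + np.outer(s, Hy))
                 + (rho * rho * (y @ Hy) + rho) * np.outer(s, s))
        x, g = x + s, g_new
    return x
```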