background info Flashcards
O(g(n)):
f(n) in O(g(n)) as n->∞ if there exists a constant C>0 and an integer n0 such that |f(n)|<=C|g(n)| for all n>n0
O(g(x)):
f(x) in O(g(x)) as x->0 if there exists a constant C>0 and a real number ε>0 such that |f(x)|<=C|g(x)| for |x|<ε
Ω(g(n))/Ω(g(x)):
just swap f and g: f is in Ω(g) iff g is in O(f) (for both the n->∞ and x->0 versions), everything else the same
o(g(n/x)):
like O, but the bound |f|<=C|g| has to hold for every C>0 (for n large enough / |x| small enough), not just for some C
coordinate of a vector:
x_i, the i-th entry of x, basically its components
e vector:
vector with all coordinates = 1
inner product:
⟨x,y⟩ = Σ_{i=1}^n x_i y_i
bilinear:
⟨x, ay_1+by_2⟩ = a⟨x,y_1⟩ + b⟨x,y_2⟩ (linear in each argument separately)
orthogonal:
⟨x,y⟩ = 0
linear subspace:
subset V in R^n such that for any x,y in V and a,b in R, ax+by in V (so all linear combos in V)
linear combo:
of vectors x_1,…,x_k in R^n, of the form y = Σ_{i=1}^k λ_i x_i, λ_i in R
span:
set of linear combos, forms a linear subspace in R^n
linearly independent:
x_1,…,x_k are linearly independent if Σ_{i=1}^k λ_i x_i = 0 only when λ_1 = … = λ_k = 0
basis:
minimal set of vectors that spans a linear subspace; its elements are always linearly independent, and every vector of the subspace is a unique linear combo of them
dimension:
no. of elements in a basis
orthogonal basis:
⟨b_i,b_j⟩ = 0 (i != j)
orthonormal basis:
orthogonal and ⟨b_i,b_i⟩ = 1
standard basis:
orthonormal basis {e_1,…,e_n} where e_i has a 1 in the i-th coordinate and 0s otherwise
direct sum:
of subspaces that intersect only in {0}, V⊕W={v+w: v in V, w in W}
orthogonal complement:
of a subspace V, all vectors that are orthogonal to everything in V, written V^⊥, V⊕V^⊥=R^n
direct product:
VxW={(v,w) in R^(n+m): v in V, w in W} (V in R^n, W in R^m), where (v,w) is just w tacked onto the end of v
linear map:
an mxn matrix A defines a linear map from R^n to R^m via y=Ax, y_i = Σ_{j=1}^n a_{ij} x_j
symmetric:
a matrix is symmetric when A=A^T
matrix multiplication:
C=AB (A is mxp, B is pxn) where c_{ij} = Σ_{k=1}^p a_{ik} b_{kj}
⟨x,y⟩:
x^(T)y
rank of a matrix:
rk(A), max number of linearly independent rows/columns
kernel:
kerA:={x in R^n: Ax=0}, dim(kerA)=n-rk(A)
image:
ImA={Ax: x in R^n}, dim(ImA)=rk(A), (kerA)^⊥=ImA^T
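a quick NumPy sketch of rank-nullity, dim(ker A) = n - rk(A), on a toy matrix of my own (not from the notes):
```python
import numpy as np

# row 2 = 2 * row 1, so rk(A) = 1
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])
n = A.shape[1]

rank = np.linalg.matrix_rank(A)
print(rank)       # 1
print(n - rank)   # 2 = dim(ker A): e.g. (2, -1, 0) and (3, 0, -1) are in ker A
```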
system of linear equations:
Σ_{j=1}^n a_{ij} x_j = b_i for each i, i.e. Ax=b where x and b are vectors; if the columns of A are linearly independent there's at most 1 solution, otherwise there are either none or infinitely many
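a quick NumPy sketch with a toy system of my own, showing the unique-solution case:
```python
import numpy as np

# columns of A are linearly independent -> exactly one solution
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = np.linalg.solve(A, b)
print(x)                      # [0.8 1.4]
print(np.allclose(A @ x, b))  # True
```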
conditions on a square nxn matrix A that are equivalent:
A is invertible
rk(A)=n
kerA={0}
ImA=R^n
the rows are linearly independent
the columns are linearly independent
det(A)!=0
det(A):
det(A) = Σ_{σ in S_n} sgn(σ) a_{1σ(1)} ⋯ a_{nσ(n)}, where S_n is the group of permutations of [n]={1,…,n} and sgn(σ) is the sign of σ, ±1 according to the parity of the no. of inversions
example - for A=
(a11 a12
a21 a22)
det(A)=a11a22-a12a21
also all the eigenvalues multiplied together
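a quick NumPy check of the 2x2 formula and the eigenvalue product, on a toy matrix of my own:
```python
import numpy as np

A = np.array([[3.0, 1.0],
              [2.0, 4.0]])

print(np.linalg.det(A))                    # 10.0 (up to rounding)
print(A[0, 0]*A[1, 1] - A[0, 1]*A[1, 0])   # 10.0, the a11*a22 - a12*a21 formula
print(np.prod(np.linalg.eigvals(A)))       # 10.0, eigenvalues are 5 and 2
```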
orthogonal matrix:
⟨q_i,q_j⟩=δ_ij, δ_ij=1 if i=j and 0 otherwise (the q_i are the columns of Q), alternatively Q^(T)Q=QQ^T=I
⟨Qx,Qy⟩=⟨x,y⟩, det(Q)=±1
eigenvector:
nonzero u with Au=λu; λ is the eigenvalue
characteristic polynomial:
det(λI-A); the eigenvalues are its roots, i.e. the solutions of det(λI-A)=0
trace:
sum of the diagonal elements, also sum of the eigenvalues
norm:
a function that satisfies:
||x||>=0 and ||x||=0 iff x=0
||λx||=|λ| ||x|| for all λ
||x+y||<=||x||+||y||
||x||1:
1 norm, Σ_{i=1}^n |x_i|
||x||2:
2 norm, (Σ_{i=1}^n x_i^2)^(1/2)
||x||∞:
max(|xi|), 1<=i<=n
(||x||2)^2:
the squared euclidean norm (||x||2 itself is the euclidean norm), equals x^(T)x=⟨x,x⟩ (the inner product of x with itself)
unit sphere (with respect to a norm):
{x in R^n: ||x||=1}
unit ball:
{x in R^n: ||x||<=1}, closed
norms relations:
||x||∞ <= ||x||2 <= √n ||x||∞
||x||∞ <= ||x||1 <= n ||x||∞
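a quick NumPy sketch checking the three norms and the relations above on a random vector (my own example):
```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5)
n = len(x)

one = np.linalg.norm(x, 1)        # sum of |x_i|
two = np.linalg.norm(x, 2)        # sqrt of the sum of x_i^2
inf = np.linalg.norm(x, np.inf)   # max |x_i|

print(inf <= two <= np.sqrt(n) * inf)   # True
print(inf <= one <= n * inf)            # True
```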
cauchy-schwarz inequality:
|⟨x,y⟩| <= ||x||2 ||y||2, with equality iff x and y are linearly dependent
also -1 <= ⟨x,y⟩/(||x||2 ||y||2) <= 1
angle:
angle between vectors x and y is θ such that cos(θ) = ⟨x,y⟩/(||x||2 ||y||2)
if x and y are orthogonal, cosθ=0
matrix norm:
satisfies all the vector norm conditions and also ||AB|| <= ||A|| ||B||
matrix and vector norm relation:
given a vector norm ||x||, the induced matrix norm is ||A|| = max_{x!=0} ||Ax||/||x|| = max_{||x||<=1} ||Ax||
||A||1:
max_j Σ_i |a_{ij}| (the maximum absolute column sum)
||A||∞:
max_i Σ_j |a_{ij}| (the maximum absolute row sum)
||A||2:
√(λ_max(A^(T)A)), where λ_max is the largest eigenvalue of A^(T)A
so when A is symmetric, ||A||2 = max_i |λ_i(A)|, the eigenvalue of largest absolute value
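a quick NumPy sketch (random toy matrix of my own) comparing the eigenvalue formula with NumPy's spectral norm:
```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))

lam_max = np.max(np.linalg.eigvalsh(A.T @ A))   # A^T A is symmetric, eigvalsh handles it
print(np.sqrt(lam_max))
print(np.linalg.norm(A, 2))                     # same number: the largest singular value
```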
dual norm:
||x||* = max_{||y||<=1} ⟨x,y⟩
frobenius norm:
||A||F = √(Σ_{i,j=1}^n a_{ij}^2)
orthogonal invariant:
2 norm and frobenius norm are, so for any Q in O(n), ||QA||2=||AQ||2=||A||2, same with F instead of 2
positive semidefinite:
A is symmetric and x^(T)Ax>=0 for all x, written A⪰0; positive definite if x^(T)Ax>0 for all x!=0; equivalently, all eigenvalues are nonnegative (positive for definite)
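a quick NumPy sketch checking definiteness through the eigenvalues, on a toy matrix of my own:
```python
import numpy as np

B = np.array([[ 2.0, -1.0],
              [-1.0,  2.0]])       # symmetric
eigs = np.linalg.eigvalsh(B)       # eigenvalues of a symmetric matrix, ascending
print(eigs)                        # [1. 3.]
print(np.all(eigs >= 0))           # True -> B is positive semidefinite
print(np.all(eigs > 0))            # True -> in fact positive definite
```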
⟨x,y⟩A:
⟨x,Ay⟩=x^(T)Ay
||x||A:
√⟨x,x⟩_A (a norm when A is positive definite)
QR decomposition:
A is mxn, A=QR where Q is orthogonal and mxm and R is upper triangular and mxn
LU decomposition:
A is nxn, A=LU where L is lower triangular nxn and U is upper triangular nxn
symmetric eigenvalue decomposition:
A is symmetric, A=QΛQ^T where Q is orthogonal with the eigenvectors as its columns, and Λ is diagonal with the eigenvalues on the diagonal
cholesky decomposition:
A is positive definite symmetric, A=LL^T, L is nxn lower triangular with strictly positive diagonal entries
singular value decomposition:
A is mxn, A=UΣV^T, where U is mxm orthogonal, V is nxn orthogonal, and Σ is an mxn diagonal matrix with the singular values σ_1 >= … >= σ_{min{m,n}} >= 0 on its diagonal; these are the square roots of the eigenvalues of A^(T)A
singular values and norms:
||A||2=σ1(A)
||A||F = √(Σ_i σ_i(A)^2)
if A is symmetric, the singular values are the absolute values of the eigenvalues
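a quick NumPy sketch of the decompositions above, on a random toy matrix of my own (note NumPy's qr returns the reduced form by default, not the full mxm Q):
```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))

Q, R = np.linalg.qr(A)         # A = QR, Q 4x3 with orthonormal columns, R 3x3 upper triangular
U, s, Vt = np.linalg.svd(A)    # A = U Sigma V^T, singular values s sorted descending

S = A.T @ A                    # symmetric and (generically) positive definite
lam, V = np.linalg.eigh(S)     # S = V diag(lam) V^T, eigenvectors as the columns of V
L = np.linalg.cholesky(S)      # S = L L^T, L lower triangular

# singular values vs norms
print(np.isclose(np.linalg.norm(A, 2), s[0]))                        # ||A||2 = sigma_1
print(np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2))))   # ||A||F = sqrt(sum sigma_i^2)
```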
C^(k)([a,b]):
the set of functions continuous on [a,b] and whose first k derivatives exist and are continuous on (a,b)
infimum:
inf(f(x))=max{y in R: for all x in [a,b], f(x)>=y} (largest y such that any f(x) is larger, a lower bound essentially)
supremum:
sup(f(x))=min{y in R: for all x in [a,b], f(x)<=y} (smallest y such that any f(x) is smaller, an upper bound essentially)
intermediate value theorem:
f in C([a,b]) and y satisfies inf(f(x))<=y<=sup(f(x)), then there exists ξ in [a,b] such that f(ξ)=y
basically because f is continuous and y is between the lowest and highest value f(x) can take, f must pass through y
mean value theorem:
f in C^1([a,b]), x,x0 in (a,b) with x!=x0, then there exists ξ in (x0,x) (or (x,x0) whatever works) such that f(x)=f(x0)+f’(ξ)(x-x0)
rearranges to f’(ξ)=(f(x)-f(x0))/(x-x0)
taylor expansion:
f in C^(n+1)([a,b]), x,x0 in (a,b) with x!=x0, then there exists ξ in (x,x0) or (x0,x) such that f(x) = Σ_{i=0}^n (f^(i)(x0)/i!)(x-x0)^i + (f^(n+1)(ξ)/(n+1)!)(x-x0)^(n+1)
f^(n) meaning nth derivative
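a quick Python sketch (my own example) of the Taylor polynomial of exp around x0=0, with the error shrinking as the degree n grows:
```python
import math

def taylor_exp(x, n):
    # sum_{i=0}^{n} f^(i)(0)/i! * x^i, and for f = exp every derivative at 0 is 1
    return sum(x**i / math.factorial(i) for i in range(n + 1))

x = 0.5
for n in (1, 2, 4, 8):
    err = abs(math.exp(x) - taylor_exp(x, n))   # the remainder term, shrinking fast with n
    print(n, taylor_exp(x, n), err)
```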
rolle’s theorem:
special case of the mvt, f in C^1([a,b]) with f(a)=f(b), then there exists ξ in (a,b) such that f’(ξ)=0
if you start and end at the same elevation, there must be a flat part at the top or bottom of a hill (or the whole walk ig)
integral mean value theorem:
f,g in C([a,b]), f does not change sign on [a,b], then there exists ξ in (a,b) such that:
∫_a^b f(x)g(x)dx = g(ξ) ∫_a^b f(x)dx
open ball:
radius ε around point p in R^n, B^n(p,ε)={x: ||x-p||2<ε}
open subset:
for every p in subset U, there exists ε>0 such that B(p,ε) entirely within U (can’t have overlapping boundaries)
a set C is closed if R^n\C is open
closure:
clS of a set S within R^n, the intersection of all closed sets containing S
interior:
intS of a set S within R^n, the union of all open sets contained in S
boundary:
bdS=clS\intS
unit sphere:
{x in R^n: ||x||2=1}, also the boundary of the unit ball
neighbourhood:
a neighbourhood of a point x is a set N such that there exists an open set U with x in U ⊆ N
span(S):
{Σ_{i=1}^k λ_i x_i : k in N, λ_1,…,λ_k real and x_1,…,x_k in S}
relatively open/closed:
S is relatively open/closed if it is open/closed in the induced topology on span
there also exists relative interior/closure
compact:
a set is compact if it’s closed and bounded
lipschitz continuous:
f:S->R, there exists L>0 such that for all x,y in S, |f(x)-f(y)| <= L||x-y||2
limit point:
a point x that is the limit of a subsequence
cauchy sequence:
for every ε>0 there exists k0>0 such that for all k,l>k0, ||xk-xl||2<ε
banach space:
a complete normed vector space: every cauchy sequence converges (to a point in the space)
sets and limit points:
a set C is closed iff every sequence in C has all its limit points in C
the closure of C is the set of all limit points of sequences in C
a set C is compact iff every sequence of points in C has a limit point in C (so at least 1, not necessarily all)
differentiation (first principles):
lim(h->0)(||f(x0+h)-f(x0)-Jf(x0)h||2/||h||2)=0
jacobian matrix:
Jf(x0) is the mxn matrix with (i,j) entry ∂f_i/∂x_j (rows run over the components of f, columns over the variables), the partial derivatives evaluated at x0
gradient:
∇f(x0) = (∂f/∂x_1,…,∂f/∂x_n)^T
level set:
{x in R^n: f(x)=c}, the set on which the function value does not change (the contour lines on a map when n=2)
hessian matrix:
∇^(2)f(x0) is the nxn matrix with (i,j) entry ∂^(2)f/∂x_i∂x_j, symmetric (when the second partials are continuous) and therefore square
directional derivative:
D_v f(x0) of f:R^n->R^m in direction v in R^n at x0 is D_v f(x0) = lim(h->0)((f(x0+hv)-f(x0))/h)
directional derivative special cases:
v=e_i gives D_{e_i}f(x0) = (∂f/∂x_i)(x0), the partial derivative with respect to x_i at x0
if f:R^n->R with continuous derivative near x0, Dvf(x0)=∇f(x0)^(T)v=⟨∇f(x0),v⟩
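a quick NumPy sketch (my own toy function f(x) = x1^2 + 3*x1*x2) checking D_v f(x0) = ⟨∇f(x0),v⟩ against a finite difference:
```python
import numpy as np

def f(x):
    return x[0]**2 + 3.0 * x[0] * x[1]

def grad_f(x):
    return np.array([2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]])

x0 = np.array([1.0, 2.0])
v = np.array([1.0, -1.0])
h = 1e-6

numeric = (f(x0 + h * v) - f(x0)) / h   # the limit definition with a small h
exact = grad_f(x0) @ v                  # <grad f(x0), v>
print(numeric, exact)                   # both roughly 5.0
```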
chain rule:
h=g(f(x)), Jh(x0)=Jg(f(x0))Jf(x0)
chain rule and directional derivative:
if f:R->R^n with f’(x0)=v, then the derivative of g(f(x)) at x0 is the directional derivative of g in the direction v, ⟨∇g(f(x0)),v⟩
multivariate mean value theorem:
f in C^(1)(U), x0,x in U, x!=x0, then there exists t in (0,1) such that f(x)-f(x0)=⟨∇f(tx+(1-t)x0),x-x0⟩
higher order partial derivative:
D^(a)f(x) = ∂^(|a|)f(x) / (∂x_1^(a_1) ⋯ ∂x_n^(a_n)), |a| = Σ_{i=1}^n a_i
x^a = x_1^(a_1) ⋯ x_n^(a_n)
taylor expansion (multivariate):
f(x) = Σ_{|a|<=k} (D^(a)f(x0)/a!)(x-x0)^a + Σ_{|a|=k} r_a(x)(x-x0)^a, where a! = a_1!⋯a_n! and r_a(x)->0 as x->x0
lagrange multipliers:
let x* be a maximum of f(x) under the constraint g(x)=c (so a max among all points x such that g(x)=c) with ∇g(x*)!=0, then there exists a λ in R such that ∇f(x*)=λ∇g(x*)
lagrangian:
the lagrangian of a function f(x) with the constraint g(x)=c is the function Λ:R^(n)xR->R defined by Λ(x,λ)=f(x)-λ(g(x)-c)
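a quick NumPy sketch (my own example: maximise f(x,y)=x+y subject to x^2+y^2=1) checking the Lagrange condition at the known maximiser:
```python
import numpy as np

x_star = np.array([1.0, 1.0]) / np.sqrt(2.0)   # the maximiser on the unit circle

grad_f = np.array([1.0, 1.0])                  # grad f is constant
grad_g = 2.0 * x_star                          # grad g = (2x, 2y)

lam = grad_f[0] / grad_g[0]                    # lambda = 1/sqrt(2)
print(lam)
print(np.allclose(grad_f, lam * grad_g))       # True: the gradients are parallel at x*
```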
jacobian with respect to coordinates:
J_x f(x0,y0): the jacobian as usual, but taken only with respect to the x coordinates and evaluated at (x0,y0)
basically means we are treating f as a function of x alone near (x0,y0), with the y coordinates held fixed as parameters
implicit function theorem:
f: R^n x R^m -> R^n, (x,y) -> f(x,y), k times continuously differentiable; assume f(x0,y0)=0 and that the jacobian with respect to x, J_x f(x0,y0), is nonsingular, then there exists an open neighbourhood U_y of y0 and a function h in C^(k)(U_y, R^n) such that h(y0)=x0 and f(h(y),y)=0 for all y in U_y
J_h(y) = -(J_x f(h(y),y))^(-1) J_y f(h(y),y) for all y in U_y
absolute error:
Eabs(x^)=|x-x^| (x^ is the approximation of x, an x with a hat)
relative error:
Erel(x^)=|x-x^|/|x|
floating point arithmetic:
x=±f*2^e, where f is a fraction in [0,1] (the mantissa) and e is the exponent (not Euler's 2.72); f gets 52 bits, e gets 11, and 1 bit for the sign, for 64 bits total
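a quick Python/NumPy sketch of the float64 layout above, plus the classic rounding surprise (my own example):
```python
import numpy as np

info = np.finfo(np.float64)
print(info.bits)     # 64 bits total
print(info.nmant)    # 52 bits for the fraction f
print(info.eps)      # ~2.22e-16, machine epsilon

print(0.1 + 0.2 == 0.3)       # False: 0.1, 0.2, 0.3 are not exactly representable in binary
print(abs(0.1 + 0.2 - 0.3))   # ~5.6e-17, a tiny absolute/relative error
```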
significant figures:
you know what these are, but we are working with 4; also Important Note: a trailing 0 counts as significant if it's to the right of the decimal point, so 1.2040 has five significant figures, while 1.204 is assumed to have digits after the 4 that were rounded away