Topic 7: Some Required Concepts In Mathematics Flashcards

1
Q

What is the notation if function F depends on certain parameters z

A

Fz (subscript z)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

⟨w,x⟩

A

The dot product between w and v
w1x1 + w2x2 + …+ wnxn

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is Lipschitzness

A

A function F ∶ Rp → R is said to be L-Lipschitz if for all x,y:
|F(x) - F(y) | ≤ L|| x - y ||

magnitude of the gradient is bounded by a constant factor throughout its domain

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is || x- y || notation

A

The distance (usually euclidean) between two vectors x and y

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Why is squared loss not lipschitz

A

There doesn’t exist a single Lipschitz constant that can bound the difference in function values for all pairs of input values
The key is that value L changes with every example

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is lipschitzness in english terms

A

A function is lipschitz if it does not change too rapidly

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Is huber loss lipschitz

A

yes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

How can we informally describe a convex function F

A

F’s graph between any two points x1 and x2
always lie below the straight line joining the points (x1, F(x1)) and (x2, F(x2))

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is Convexity

A

A function F:Rp -> R si convex is for all Real x,y and θ between [0,1]:

F(θx + (1 − θ)y) ≤ θF(x) + (1 − θ)F(y)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

When is a function “strictly convex”

A

If the convex equation never becomes an inequality (only <)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

When does the convex inequality always become an equality

A

At the end of the chord, at points θ = 0,1

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

When is a function definitely not strictly convex

A

If at any point in the line it becomes a flat line (and is not curving up) eg. ReLU
Can still be convex though!!!

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is the Convexity equation for differentiable functions

A

An at least once differentiable function F ∶ Rp → R is convex if ∀ x, y we have:
F(x) + ∇F(x)⊺(y − x) ≤ F(y)

ensures the function lies above the tangent line at (x, F(x))

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is ∇F(x)

A

The gradient (1st derivative) of F(x)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is p⊺q

A

The dot product of vectors p and q
(p1q1 + p2q2+…+pnqn)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

A function is a differentiable convex function if…

A

for any point x, the tangent to the function at that point is below the graph of the function

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What is α−Strong Convexity

A

For some α > 0, a F: Rp -> R is α−Strongly Convex if:

F(x) - α/2 ||x||^2 is a convex function

Essentially adding α/2 x^2 to any convex function, yields a α-strongly convex function

Strongly convex functions are strictly convex and convex

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

The larger the α…

A

The stronger the convexity

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

How does the exponential loss function work

A

label can be +1 or -1
e^(-yh(x))
Where y = true label
and h(x) is predicted label

If y=1 and h(x) is negative -> loss function close to 1

If y=1 and h(x) is large and positive -> loss function close to 0

The exponential loss function penalises heavily when h(x) predicts wrong answers with high confidence

20
Q

What is Lipschtiz smoothness

A

For some β > 0, an at least once differentiable function F: Rp -> R is said to be β-smooth if

∥∇F(x) − ∇F(y)∥ ≤ β∥x − y∥

Eg F(x) = x^2 is smooth with β=2

21
Q

Lipschitz-Smoothness vs general smoothness

A

In analysis textbooks smoothness refers to a function being infinitely differentiable (eg F(x) = x^3) - this is not the same thing!

22
Q

What is strict convexity

A

f”(x) > 0
It will have a slope

straight lines cannot be strictly convex

23
Q

Do strongly convex functions have global minimums

A

always have one global minimum

24
Q

How many global minima do convex and strictly convex functions have

A

none or 1

25
Q

What can we say about the local minimum of a convex function

A

it is the global minimum

26
Q

What is Convergence (def 1)

A

A sequence of points xi ∀i = 1, 2, . . . in Rn will converge to the point x* if
limn→∞||xn - x*||2 = 0

Essentially as n tends to infinity, xn and x* are the same point

27
Q

What is ∥a − b∥2 (subscript 2) notation

A

The euclidean distance between vectors a and b

28
Q

What is x*

A

the “limit point”

29
Q

What is Convergence (more generalised def 2) we only need def 1 on this course

A

A sequence of points xi ∀i = 1, 2, . . . in Rn will converge to the point x* if (for all ε > 0)
||xn - x*||2 ≤ ε

Essentially “no matter how small of a distance ϵ you choose, eventually all the points in the sequence will be within ϵ distance from x*

30
Q

What is a Supremum

A

The “Least upper bound”
Let I be a subset of R. Suppose there exists an upper bound M on I s.t M ≤ M′ ∀ upper bounds M′ on I
Then M is the “least upper bound” on I and it’s called the “supremum” of I and is denoted as
sup I

Eg1 I = [0,1)
sup I = 1

Eg2 I = [0,1) U [1.5,2]
sup I = 2 = max I

31
Q

What is a maximum

A

Let I be a set, when sup I ∈ I we call it the “maximum” of I, max I

32
Q

What is an Infimum

A

The “Greatest lower bound”
Let I be a subset of R. Suppose there exists a lower bound m on I s.t m ≥ m′, ∀ lower bounds m′ on I
Then m is the “greatest lower bound” on I and it’s called the “infimum” of I and is denoted as
inf I

Eg1 I = [0,1)
inf I = 0 = min I

Eg2 I = [0,1) U [1.5,2]
inf I = 0 = min I

33
Q

What is a minimum

A

Let I be a set, when inf I ∈ I we call it the “minimum” of I, min I

34
Q

What is the spectral norm of a matrix

A

Denoted as ||A||subscript2
Tells you how much a matrix can stretch any unit vector in its domain
Eg if you have a matrix that is just a column vector, its spectral norm is the length of the vector

35
Q

What is the spectral radius of a matrix

A

The largest absolute value of its eigenvalues
(maximum amount by which a matrix can stretch or compress vectors)
eg λ1 = 0.5, λ2 = 3
3 is the spectral radius

36
Q

What is a Span

A

Given a finite set of vectors v1,…,vk

Span ({v1,…,vk}) = Σ wivi

The span of a set of vectors is all the different combinations you can make by stretching, shrinking, or rotating those vectors in different ways

We call the set of vectors the “spanning set” for the vector space

37
Q

What does the spanning set e1 = [10] and e2 = [01] mean

A

It is a 2-dimensional space
Every vector in this space can be written as a linear combination of e1 and e2

38
Q

What is Dimension

A

Given a finite set of mutually orthogonal vectors z1,…,zp

p = dimension(Span({z1,…,zp})

The number of dimensions described by the vector set

{z1, . . . , zp} is an instance of a basis of Span({z1, . . . , zp})

eg dimension 3

39
Q

What is Basis

A

The smallest set of spanning vectors that is needed to be able to describe every direction in the space
Eg (e1,e2,z) -> the basis : (e1,e2)

40
Q

What does mutually orthogonal mean

A

A set of vectors where each vector is perpendicular (orthogonal) to every other vector in the set

41
Q

What is the Rank of a matrix

A

Given any m × n matrix T i.e T ∶ Rn → Rm:

rank(T) = dimension(Image(T))

How many different dimensions a vector v can move in after being transformed by matrix T

Eg M = 2x2 = [2 0] [0 1] = rank 2
D = 2x2 = [3 0] [0 0] = rank 1

count the number of non-zero rows in REF

42
Q

What is the image of a matrix

A

Given any m × n matrix T i.e T ∶ Rn → Rm:

Image(T) = {T[v] | v ∈ Rn}

The Image is all the points a vector v can be transformed to by matrix T

43
Q

What is “full rank”

A

if rank(T) = min{n,m}
Then T is of “full rank”

Eg matrix: 4 by 2 and it is rank 2 then its full rank

44
Q

What is the Null space of a Matrix

A

Given any m × n matrix T i.e T ∶ Rn → Rm:

Null Space(T) = {v ∈ Rn | T[v] =0}

All the vectors that get squished down to nothing when you transform them

45
Q

What is a kernel

A

Same as null space
Null Space(T) = ker(T)

46
Q

what does adding α/2 ||x||^2 do to the graph

A

makes it (α- strong convex) bounded by a paraboloid - a bowl/cup shape

47
Q

when is the rank of a matrix 0

A

when all elements of the matrix are 0