IP bank Flashcards

1
Q

elaborate on how we can model a non-convex feasible region

A

The thing about non-convex regions is that we cannot formulate a feasible region of the form "either in this region, or in that one" using regular LP. With IP, however, we can.

We think of it as "either we force set A of constraints to hold, or set B". This also generalizes further.

The typical way to do this is to add M(1-indicatorVariable) to the constraints of one set (and M*indicatorVariable to the other). In the case where we have only two regions, this is fine:
the indicator variable selects one of them.

If we have many regions, exactly one of which must be chosen, we can do this:
for each region, introduce an indicator variable.
Then add a constraint saying that the sum of the indicator variables MUST be equal to 1. This forces exactly one set of constraints to apply.
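The either-or idea can be sketched in a few lines of Python; the big M and the two example constraint sets (x <= 4 versus x >= 10) are made up for illustration:

```python
M = 1000  # big-M: assumed large enough to deactivate a constraint entirely

def feasible(x, y):
    """Either-or region via an indicator y in {0,1}:
    y = 1 activates set A (x <= 4), y = 0 activates set B (x >= 10).
    Set A is relaxed by M*(1-y), set B by M*y."""
    return x <= 4 + M * (1 - y) and x >= 10 - M * y

# x = 3 fits set A, x = 12 fits set B, x = 7 fits neither
print(any(feasible(3, y) for y in (0, 1)))   # True
print(any(feasible(12, y) for y in (0, 1)))  # True
print(any(feasible(7, y) for y in (0, 1)))   # False
```

A solver would choose y itself; here both values are brute-forced to show that exactly the union of the two regions, and nothing in between, is feasible.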

2
Q

elaborate on how we can model a non-convex function using IP

A

Assume it is piecewise linear. If not, we assume that we can approximate it using a piecewise linear approach.

Then we need a set of weights (one per breakpoint) and a set of indicator variables (one per segment). The breakpoints play the role of the A-matrix coefficients.

The standard (lambda) formulation: with breakpoints a_k, write x = ∑λk a_k and f(x) ≈ ∑λk f(a_k), with ∑λk = 1 and λk >= 0, plus the requirement that at most two λk are non-zero and adjacent (an SOS2 condition), enforced with the indicator variables.
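As a sketch of the lambda idea (breakpoints are made up): a point between two adjacent breakpoints is written as a convex combination of them, which is exactly a weight vector with at most two adjacent non-zeros:

```python
# Breakpoints (a_k, f(a_k)) of a piecewise-linear approximation (made-up data)
points = [(0, 0), (2, 3), (5, 4), (8, 9)]

def pw_eval(x):
    """Express x as lam*a_k + (1-lam)*a_(k+1) over one segment, i.e. an
    SOS2-feasible weight vector, and return sum(lam_k * f(a_k))."""
    for (a0, f0), (a1, f1) in zip(points, points[1:]):
        if a0 <= x <= a1:
            lam = (a1 - x) / (a1 - a0)  # weight on the left breakpoint
            return lam * f0 + (1 - lam) * f1
    raise ValueError("x outside breakpoint range")

print(pw_eval(2.0))  # 3.0 (hits a breakpoint exactly)
print(pw_eval(3.5))  # 3.5 (halfway between (2,3) and (5,4))
```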

3
Q

Elaborate on knapsack problems

A

Extremely important: the knapsack problem looks different for min and max problems.

For max problems, all constraints must be <=.
For min problems, all constraints must be >=.
This allows us to exploit the knapsack property: the ratio between objective function coefficients and the corresponding technology coefficients can be used to make greedy decisions.

The problem is generally defined as selecting a set of objects, with the aim of maximizing value, subject to capacity constraints.
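A sketch of the greedy ratio rule applied to the LP relaxation of a max 0/1 knapsack (values, weights and capacity are made up):

```python
c = [60, 100, 120]  # values (made-up data)
a = [10, 20, 30]    # weights
b = 50              # capacity

def fractional_knapsack(c, a, b):
    """Fill greedily in decreasing value/weight order; at most one item
    ends up fractional, which is exactly the LP-relaxation optimum."""
    order = sorted(range(len(c)), key=lambda j: c[j] / a[j], reverse=True)
    x, cap = [0.0] * len(c), b
    for j in order:
        take = min(1.0, cap / a[j])
        x[j] = take
        cap -= take * a[j]
        if cap <= 0:
            break
    return x, sum(c[j] * x[j] for j in range(len(c)))

x, value = fractional_knapsack(c, a, b)
print(x, value)  # items 0 and 1 taken fully, 2/3 of item 2; value near 240
```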

4
Q

Elaborate on assignment problem

A

There are two broad classes:
1) Specific
2) General

The specific assignment problem deals with two sets of equal size.
The general assignment problem has a different structure, as its sets typically differ in size.

The regular (specific) assignment problem minimizes cost, though it could just as well maximize something instead.
There are 2 constraint types, which look identical to each other except that one sums over j and the other over i: one for each set.

The generalized assignment problem is better described as: "We have a set of machines that can do tasks, and we have many tasks. These tasks need to be assigned to the machines".

Fun fact: if we take the generalized assignment problem, relax the constraint that forces each "task" to be assigned a machine, and swap the objective from minimization to maximization, we actually get the so-called "multi-knapsack problem". IMPORTANT: we must still force each item to be placed in at most ONE knapsack, so that the solver doesn't put the best object in every pack. This is done by making the ∑xij = 1 constraint an inequality instead of removing it: ∑xij <= 1. Now a specific item is not required to be placed in a pack, but if it is, it can only be placed in one.
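For intuition, the specific assignment problem on a tiny instance can be brute-forced over permutations (the cost matrix is made up):

```python
from itertools import permutations

# c[i][j]: cost of assigning worker i to task j (made-up data)
c = [[4, 2, 8],
     [4, 3, 7],
     [3, 1, 6]]

def solve_assignment(c):
    """Specific assignment problem: both sets have size n, each worker
    gets exactly one task and each task one worker; minimize total cost."""
    n = len(c)
    best = min(permutations(range(n)),
               key=lambda p: sum(c[i][p[i]] for i in range(n)))
    return best, sum(c[i][best[i]] for i in range(n))

assignment, total = solve_assignment(c)
print(total)  # 12
```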

5
Q

elaborate on the matching problem

A

The matching problem is about making pairwise connections between elements of a single set. Each element MUST be connected to one, and only one, other element.

The matching problem is really about understanding how the objective function plays along with the constraints. Because of this relation, there is no need to enforce symmetry or anything similar: if we pick (i,j) as an arc set to 1, there is no point in also adding (j,i); one variable per unordered pair is enough.

min z = ∑∑cij xij
s.t.
∑xij = 1, for each node (summing over the arcs incident to that node)
xij in {0,1}
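A brute-force sketch of minimum-cost perfect matching on four elements (the costs are made up), showing that picking the pair (i, j) automatically "uses up" both elements:

```python
# Pairing costs for elements 0..3, keyed by unordered pair (made-up data)
cost = {(0, 1): 3, (0, 2): 1, (0, 3): 4,
        (1, 2): 5, (1, 3): 2, (2, 3): 6}

def min_perfect_matching(n, cost):
    """Recursively pair the smallest unmatched element with every
    candidate partner; each element ends up in exactly one pair."""
    def rec(elems):
        if not elems:
            return 0, []
        i = elems[0]
        best = (float("inf"), None)
        for j in elems[1:]:
            rest = [e for e in elems if e not in (i, j)]
            sub, pairs = rec(rest)
            if cost[(i, j)] + sub < best[0]:
                best = (cost[(i, j)] + sub, [(i, j)] + pairs)
        return best
    return rec(list(range(n)))

print(min_perfect_matching(4, cost))  # (3, [(0, 2), (1, 3)])
```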

6
Q

elaborate on TSP

A

We are looking for a Hamiltonian cycle.
We have n nodes, and we must visit all of them, each exactly once.

There are two versions of this problem:
1) Symmetric
2) Asymmetric

The difference is whether arcs in opposite directions have the same cost or not.

There are 3 constraint types:
1) Enforce that exactly one arc leaves each node
2) Enforce that exactly one arc enters each node
3) Enforce that no sub-tours are allowed

The idea of the sub-tour elimination constraints is that within a set S of nodes, there can be at most |S|-1 chosen arcs. If there are more arcs than that inside S, then S contains a sub-tour.
IMPORTANT: these constraints apply only to proper subsets of N. The entire node set itself is not included here.

In real life, this gives an enormous number of constraints for this problem (one per subset).
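A small sketch of why sub-tour elimination is needed: a solution satisfying only the degree constraints can decompose into several cycles (the successor list is made up):

```python
def tour_from(succ):
    """Follow successor pointers from node 0 until we return to 0.
    If the resulting cycle is shorter than n, the solution has sub-tours."""
    tour = [0]
    while succ[tour[-1]] != 0:
        tour.append(succ[tour[-1]])
    return tour

# 5 nodes; every node has exactly one leaving and one entering arc,
# yet the solution splits into the sub-tours 0->1->0 and 2->3->4->2
succ = [1, 0, 3, 4, 2]
tour = tour_from(succ)
print(tour, len(tour) < len(succ))  # [0, 1] True -> a sub-tour cut is needed
```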

7
Q

elaborate on the symmetric TSP

A

since no direction is needed now, we drop an index:
xk represents edge k.
Each node must have exactly two incident edges chosen:
∑xk [k incident to node i] = 2, for all nodes i

8
Q

Elaborate on VRP

A

Vehicle Routing Problem

We care about the classical VRP.
We want to determine a set of routes for a set of vehicles so that given demands are satisfied.

There is a number of customers, and they have demands.
We have a depot, which is where we have to return each time we "use up" a vehicle's capacity.
There is a cost for using an arc.

If we have only one vehicle, and the vehicle has essentially unlimited capacity, the VRP reduces to the TSP.

There are many versions of VRP, but the classical one assumes that all vehicles have the same capacity.

We assume K vehicles.
We denote the demand of customer "i" as di.
cij represents the cost of travelling from customer i to j.
We use i=0 to indicate the depot.

We have variables:
yik = 1 if customer i is visited by vehicle k.
xijk = 1 if vehicle k goes directly from i to j.

The problem is a nightmare to set up due to its many constraints.
For simplicity, it is often assumed that we have already generated a set of candidate routes. Doing this removes capacity from consideration (each candidate route is already capacity-feasible).
So, we just want to select a number of the pre-specified routes so that we cover each customer exactly once.
This problem now becomes a set partitioning problem:
min z = ∑cjxj
s.t.
∑aij xj [over j] = 1, for each i in I
∑xj = K

So: set partitioning, plus the number of selected routes must equal the number of vehicles we have.
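A brute-force sketch of the resulting set partitioning problem (the candidate routes, their costs, and K are made up): pick exactly K routes that together cover every customer exactly once:

```python
from itertools import combinations

# Candidate routes (assumed capacity-feasible): covered customers + cost
routes = [({1, 2}, 10), ({3}, 4), ({1}, 6), ({2, 3}, 7)]
customers = {1, 2, 3}
K = 2  # number of vehicles: exactly K routes must be selected

def select_routes(routes, customers, K):
    """Cheapest choice of K routes that partitions the customer set."""
    best = None
    for combo in combinations(range(len(routes)), K):
        covered = [i for r in combo for i in routes[r][0]]
        if len(covered) == len(customers) and set(covered) == customers:
            total = sum(routes[r][1] for r in combo)
            if best is None or total < best[1]:
                best = (combo, total)
    return best

print(select_routes(routes, customers, K))  # ((2, 3), 13)
```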

9
Q

why are there so many different ways to solve IP problems?

A

They all have different characteristics. The point is that the structure of an IP problem matters a great deal for how easy or hard it is to solve. Some problems simply do not converge fast enough with the most general methods, like branch and bound. Therefore, there is a real need for methods tailored to various scenarios.

10
Q

what are the most important strategies for solving IP problems?

A

1) Enumeration
2) Relaxation and decomposition
3) Cutting plane methods
4) Heuristics

11
Q

When are we considering heuristics?

A

If the problem is very difficult to solve, and we still need a solution. With heuristics, there is no guarantee that the solution is optimal. This means such methods are mainly viable when we can argue that the error is small enough.

12
Q

Is the number of variables important in IP?

A

Yes, but the problem structure is as well. One typically says that the number of integer variables and the problem structure together determine how difficult a problem is to solve.

In comparison, the difficulty of LP problems is typically driven by the number of constraints.

13
Q

elaborate on constraints in IP

A

Constraints don't always make the solution time longer. This is because of so-called valid inequalities, which work very well with the relaxed problem and reduce running time.

14
Q

Elaborate on optimality conditions

A

In LP, we have KKT conditions that basically guarantee optimality.
No such conditions exist for IP.
The best way to discuss optimality for IP is to define a certain error we are happy with, and continuously compute pessimistic and optimistic bounds that grow towards each other (converge). The gap between them is the error.

15
Q

Why do we care about models in IP? Why is it not enough that they describe the correct constraints? given the same feasible region, each solver will find the same solution?

A

The reason is that different model formulations affect the LP relaxation. We use the LP relaxation as a tool for computing optimistic bounds. To get more realistic optimistic bounds, we benefit from a strong formulation of the model.

The question is then: what is a strong model?

16
Q

What is a strong or a weak model?

A

We say that a model's strength depends on the gap between the feasible region of the LP relaxation and the convex hull of the integer feasible points.

17
Q

Elaborate on the convex hull

A

The convex hull is the entire region/area/set of points defined as all convex combinations of a set of points.

In IP, we consider the set of feasible integer points and take all their convex combinations. The resulting region is the convex hull.

If we had a problem whose convex hull was available as linear constraints, we could run simplex (regular LP) and be guaranteed an integer solution. Since we would simultaneously get a feasible pessimistic solution and an optimistic bound that are equal, we would know that we are done.

18
Q

Define a valid inequality

A

A valid inequality is an inequality that is satisfied for all x in X. In other words, the inequality doesn't cut away any of the feasible points (the feasible points being, of course, the integer solutions).

19
Q

What is the point of valid inequalities?

A

To get a better approximation of the convex hull. This in turn makes our solvers run faster.

20
Q

when we consider a valid inequality, what is the goal of it?

A

It should be as strong as possible.

To relate strength to valid inequalities, we use the mathematical concept of a "face". The goal is to generate valid inequalities that define a face of the convex hull with the largest possible dimension. A face is a surface of the convex hull; the more of the surface we can cover with one valid inequality, the better our approximation will be.

21
Q

Define a face

A

F = {x | x in X_c, ∑ajxj = b}

In other words, a face is the set of points that belong to the convex hull AND lie on the hyperplane of the constraint.

22
Q

What are the strongest valid inequalities?

A

Facets. These have dimension n-1 when the convex hull has dimension n.

23
Q

aim of the cutting plane methods

A

To approximate the convex hull better and better. If we systematically add valid inequalities, we know that once we find an integer feasible solution, it is optimal (with respect to the LP relaxation).

24
Q

In an iterative cutting plane method approach, what is “important”?

A

We want the cutting plane to cut away the current LP optimal solution. If it does not, the solution will not change in the next iteration.

25
Q

There are typically two ways to add cuts to a solution. Name them

A

Use a general-purpose cutting plane method, like Gomory's.

Or we can use problem-specific cuts, e.g. generated by solving the separation problem.

26
Q

elaborate on Gomorys

A

Gomory’s method follows the general outline of a cutting plane method.

Start by solving the LP relaxation. If integer, we’re good.
If not, we must add a valid inequality that removes the current LP optimal solution.

Gomory uses only the information provided in the optimal simplex tableau.

Consider the current basic solution. Among the basic variables with fractional values, we typically pick the one with the largest fractional part.
We take this row as it is specified in the final tableau:

xi + ∑aijxj = bi

We separate the integer part and the fractional part. This is equivalent to taking a floor operation and keeping the remainder:

xi + ∑(int(aij) + frac(aij))xj = int(bi) + frac(bi)

Re-arrange:

xi - int(bi) + ∑int(aij)xj = frac(bi) - ∑frac(aij)xj

The LHS consists purely of integers, so the LHS is integer. Therefore, the RHS is integer.
frac(bi) is strictly smaller than 1, and frac(aij) and the xj are never negative, so the RHS is strictly smaller than 1.
An integer strictly smaller than 1 is at most 0. Therefore, the LHS is smaller than or equal to 0.

This gives us:

xi - int(bi) + ∑int(aij)xj <= 0

Re-arrange:

xi + ∑int(aij)xj <= int(bi)

This is a valid inequality. It removes the fractional part of a solution (a single basic variable) using only logic about what values it can take.

IMPORTANT: recall that the tableau row will generally involve slack variables. We don't want those in the cut. Therefore, we must perform a suitable substitution using the ORIGINAL problem formulation.
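The floor arithmetic of the cut itself is tiny; a sketch on a made-up tableau row:

```python
import math

def gomory_cut(row_coeffs, rhs):
    """From the tableau row x_i + sum(a_ij x_j) = b_i, return the cut
    x_i + sum(floor(a_ij) x_j) <= floor(b_i)."""
    return [math.floor(a) for a in row_coeffs], math.floor(rhs)

# Row (made up): x1 + 2.6*x3 - 1.4*x4 = 3.7
coeffs, cut_rhs = gomory_cut([2.6, -1.4], 3.7)
print(coeffs, cut_rhs)  # [2, -2] 3  -> cut: x1 + 2*x3 - 2*x4 <= 3
```

Note that `math.floor(-1.4)` is -2, not -1; the floor (not truncation) is what makes the validity argument go through.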

27
Q

Recall the aim with problem specific cuts

A

Use the separation problem to generate cover inequalities in a cutting plane method

28
Q

elaborate on covers in regard to the 0/1 knapsack problem

A

Recall that the feasible region of the 0/1 knapsack can be defined as:

X = {x in {0,1}^n | ∑ajxj <= b}

In other words, we have a set of variables which, when multiplied by their technology coefficients, cannot sum to more than b.

We say that a set of variables S is a cover if ∑aj [j in S] > b.
This just means that S is a cover if selecting all of its variables would exceed b.
We can extend this to the definition of a "minimal cover", which has the additional condition that removing any one of the j variables from the set makes it no longer a cover.

In other words, a minimal cover represents a set of variables that can never all be 1 simultaneously in a feasible solution.
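The two definitions translate directly to code (coefficients and capacity are made up):

```python
a = [4, 3, 5, 2]  # technology coefficients (made-up data)
b = 8             # capacity

def is_cover(S, a, b):
    """S is a cover if its coefficients sum past the capacity."""
    return sum(a[j] for j in S) > b

def is_minimal_cover(S, a, b):
    """Minimal: removing any single variable destroys the cover property."""
    return is_cover(S, a, b) and all(not is_cover(S - {j}, a, b) for j in S)

print(is_minimal_cover({0, 1, 3}, a, b))  # True: 9 > 8, every proper subset fits
print(is_minimal_cover({0, 1, 2}, a, b))  # False: dropping index 1 still covers
```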

29
Q

Why do we care about the idea of covers and minimal cover?

A

If S is a minimal cover, then the constraint

∑xj [j in S] <= |S|-1

is a valid inequality for X.

The idea is actually very simple: if we can find a set of variables that forms a minimal cover, meaning they can never all be selected at the same time, we can add a cover constraint specifying exactly that.

30
Q

Generalize the idea of a cover constraint

A

We need to find a set of variables that can never appear together because of some constraint. When we add the constraint specifying this relationship, we have added a cover constraint.

31
Q

can we make a cover constraint stronger?

A

Yes, via the idea of an "augmented cover constraint".

The augmented cover constraint is exactly the same as the regular cover constraint, but we now ask whether we can add variables to the LHS while keeping the constraint valid.

To do this, we use the relationship between technology coefficient sizes and the RHS value.

Define E(S) = S union {j : aj >= ai, for all i in S}.

Explanation: E(S) holds the variables from the original cover, plus all variables whose technology coefficient is larger than or equal to EVERY technology coefficient in the original cover set.

The augmented cover constraint is given as:

∑xj [j in E(S)] <= |S|-1

Note that this idea is not 0/1-knapsack specific. Adding variables increases the dimension of the face defined by the constraint, which makes it stronger.
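A sketch of computing E(S) for a made-up instance:

```python
a = [6, 5, 5, 4, 1]  # technology coefficients (made-up data)
b = 12

def extend_cover(S, a):
    """E(S): the cover plus every variable whose coefficient is at least
    as large as every coefficient inside the cover."""
    amax = max(a[i] for i in S)
    return set(S) | {j for j in range(len(a)) if a[j] >= amax}

S = {1, 2, 3}  # 5 + 5 + 4 = 14 > 12, so S is a cover
print(sorted(extend_cover(S, a)))  # [0, 1, 2, 3]
# Augmented cut: x0 + x1 + x2 + x3 <= |S| - 1 = 2
```

Variable 0 (coefficient 6) joins because it could replace any member of the cover; variable 4 (coefficient 1) does not.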

32
Q

elaborate on the mathematical way of asking whether there exists a subset S of the variables such that there exists a cover inequality for the problem

A

∑aj [j in S] > b

AND

∑(1-xj^(LP)) [j in S] < 1

The first condition basically just says that S must be a cover.
The second condition ensures that the current LP solution violates the cover inequality.

The absolutely CRUCIAL insight is that cover inequalities are always valid inequalities. By finding a cover, we automatically know the inequality is valid. By ensuring that it is violated by the current LP solution, we know it cuts in the right direction.

33
Q

elaborate on the separation problem

A

The separation problem generates a cover inequality.

alpha = min ∑(1 - xj^(LP))zj [j in N]
s.t.
∑ajzj [j in N] > b

zj in {0,1}

The z-variables are indicator variables for whether variable j is included in the cover or not.

The separation problem returns the cover that is most violated by the current LP solution, if any cover exists.

If alpha is smaller than 1, we have an inequality of the form:
∑xj [j : zj* = 1] <= ∑zj* [j] - 1

So basically: take all the z-variables with value 1, and the sum of the corresponding x-variables must be smaller than or equal to the number of those z-variables minus 1.
This is exactly the same as the earlier cover inequality result.

If alpha is larger than or equal to 1, the cover will not cut away the current LP optimal solution.
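On a tiny instance the separation problem can be brute-forced instead of solved as an IP (coefficients, capacity and the fractional LP point are all made up):

```python
from itertools import combinations

def separate(a, b, x_lp):
    """Over all covers S (sum of a_j > b), minimize
    alpha = sum over S of (1 - x_j_LP); alpha < 1 means the
    cover inequality cuts off the current LP point."""
    n, best = len(a), (float("inf"), None)
    for r in range(1, n + 1):
        for S in combinations(range(n), r):
            if sum(a[j] for j in S) > b:
                alpha = sum(1 - x_lp[j] for j in S)
                if alpha < best[0]:
                    best = (alpha, S)
    return best

a, b = [5, 4, 3], 7
x_lp = [1.0, 0.75, 0.0]  # fractional LP optimum (made up)
alpha, S = separate(a, b, x_lp)
print(alpha, S)  # 0.25 (0, 1) -> cut x0 + x1 <= 1, violated by the LP point
```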

34
Q

Some things worth noting regarding the separation problem

A

It is itself a hard (knapsack-type) problem.

In practice, we use heuristics to solve it fast.

If the result is a valid inequality with alpha less than 1, we know it cuts away the fractional solution by the amount 1 - alpha. Therefore, we have a measure of the strength of the inequality.

35
Q

general idea of branch and bound

A

Split the feasible region into smaller and smaller regions and solve a relaxed problem in each subproblem.

We collect information from each subproblem, and explore in a way that steers us towards regions where a good solution is more likely to be.

We progress the search via optimistic bounds that become more and more realistic. Along the way, we hope to find better and better feasible solutions, because we need them to prune.

Branching is done by adding extra constraints, typically by setting a binary variable to either 1 or 0.

36
Q

elaborate on the general steps of the general strategy for branch and bound

A

Relax, branch, prune and select.

We relax a problem, typically via the LP relaxation. This gives us an optimistic bound. The quality of the bound is determined by method specifics and problem structure.

Now we want to progress. We do this by branching on a variable. The general idea is that we have not yet found the optimal solution, so we must look into smaller regions and continue until we reach an IP solution. The crucial part is that we must not cut away any IP feasible solutions in the process.
Also crucial: we need to make sure that we remove the LP relaxation's optimal solution from the new subproblems, otherwise they will just find the same solution again.

Prune based on the current best pessimistic solution in combination with the optimistic bound of each subproblem.

We also need a selection policy for which subproblems to look into first. The usual suspects are:
1) Depth first
2) Breadth first
3) Best first

Depth first has the advantage of finding IP solutions relatively quickly, which can be used to prune.
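The relax/branch/prune loop in miniature, on a made-up 0/1 max-knapsack, using the greedy LP relaxation as the optimistic bound:

```python
# Minimal depth-first branch and bound for a 0/1 max-knapsack (made-up data)
c, a, b = [10, 6, 4], [5, 4, 3], 9

def bound(fixed):
    """Optimistic bound: value of the fixed variables plus the greedy
    LP relaxation over the still-free variables."""
    k = len(fixed)
    cap = b - sum(a[j] for j in range(k) if fixed[j])
    if cap < 0:
        return float("-inf")  # infeasible subproblem
    val = sum(c[j] for j in range(k) if fixed[j])
    for j in sorted(range(k, len(c)), key=lambda j: c[j] / a[j], reverse=True):
        take = min(1.0, cap / a[j])
        val += take * c[j]
        cap -= take * a[j]
    return val

best = [float("-inf")]  # incumbent (pessimistic bound)

def bnb(fixed):
    if bound(fixed) <= best[0]:   # prune: bound cannot beat incumbent
        return
    if len(fixed) == len(c):      # leaf: all variables fixed, bound is exact
        best[0] = bound(fixed)
        return
    bnb(fixed + [1])              # branch x_k = 1, then x_k = 0
    bnb(fixed + [0])

bnb([])
print(best[0])  # 16: items 0 and 1 (weight 9, value 16)
```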

37
Q

What is also typically included in a branch and bound strategy, especially in real life?

A

Some kind of termination criterion.

If we have many variables, it may take a lot of time to branch on each and grow the search tree all the way down. It might not even be practical, taking weeks or longer to finish.

Therefore, we typically define a tolerated error in the objective function value, and simply terminate the search once it is reached.

The termination criterion is typically based on the current optimistic and pessimistic bounds: the error is the difference between them, divided by the pessimistic bound to get a ratio.
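The criterion as a one-liner (the numbers are made up; this assumes a maximization problem with a positive incumbent):

```python
def relative_gap(pessimistic, optimistic):
    """Relative optimality gap: how far the best bound may still be
    above the best feasible (incumbent) solution."""
    return (optimistic - pessimistic) / pessimistic

# incumbent 95, best LP bound 100: terminate if the tolerance is >= ~5.3%
print(round(relative_gap(95, 100), 4))  # 0.0526
```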

38
Q

some comment worth mentioning regarding Land-Doig-Dakin?

A

We typically use dual simplex to re-optimize with the added constraints.

39
Q

What problems have specific methods in regards to branch and bound?

A

Knapsack (integer, but not 0/1)

TSP

Job scheduling

40
Q

elaborate on Branch and bound for knapsack problems

A

We can solve the LP relaxation very easily by making use of the relative contribution of each variable, comparing the ratios c/a. The variable with the best ratio has the most bang for the buck, and we want to exhaust it first.

Given a subproblem, we take the following steps:
1) Initialize all variables to their lower bounds (for most, this is probably 0). Also set k=1.

2) Compute the "remaining resource" RHS as ∆ = b - ∑ajxj, where xj are the lower bounds we set.
We assume a knapsack with only one constraint, so this ∆ represents how much capacity we have left.

3) Check if we need to stop. If ∆ < 0, there are no feasible solutions.
4) Check if we have an optimal solution. This is the case if ∆ = 0.

5) If ∆ > 0, we have more capacity, and we want to fill the knapsack more.
To do this, set xk = lowerBound_k + min{∆/a_k, u_k - lowerBound_k}.
Also update ∆ := ∆ - a_k(xk - lowerBound_k).
Then set k := k+1 and go back to step 3.

Explanation:
The index k represents the order of decreasing ratio c/a. We start with the best, and finish with the worst.
If we have space left in the capacity, we compare two quantities:
1) ∆/a_k
2) u_k - l_k

∆/a_k represents the amount of x_k we would pick if we could fill the sack with it alone.
u_k - l_k represents how much more of the item we have available. We cannot take more than that.

Upper bounds can be derived as u_j = floor(b / a_j).

The update of ∆ simply reduces it by the capacity we used, accounting for the coefficient.

With this approach we can always round fractional solutions down, giving a candidate pessimistic bound at each subproblem. This refers to rounding down the basic solution: for instance, (1.27, 0, 0, 0) can be rounded down to (1, 0, 0, 0).
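The steps above as a sketch (data, bounds and capacity are made up; sorting by ratio stands in for the pre-sorted k-order):

```python
c = [8, 6, 4]      # objective coefficients (made-up data)
a = [4, 4, 3]      # technology coefficients
lower = [0, 0, 0]  # variable bounds from the current subproblem
upper = [2, 1, 3]
b = 10             # capacity

def knapsack_lp(c, a, lower, upper, b):
    """Start all variables at their lower bounds, then raise them in
    decreasing c/a order until the remaining capacity Delta hits 0."""
    x = list(lower)
    delta = b - sum(a[j] * x[j] for j in range(len(a)))
    if delta < 0:
        return None  # infeasible subproblem
    for k in sorted(range(len(c)), key=lambda j: c[j] / a[j], reverse=True):
        if delta == 0:
            break
        step = min(delta / a[k], upper[k] - lower[k])
        x[k] += step
        delta -= a[k] * step
    return x, sum(c[j] * x[j] for j in range(len(c)))

print(knapsack_lp(c, a, lower, upper, b))  # ([2, 0.5, 0], 19.0)
```

Rounding the fractional entry down, (2, 0, 0), immediately gives a feasible pessimistic bound for this subproblem.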

41
Q

elaborate on using branch and bound on TSP

A

It is common to relax the sub-tour constraints. There is a good reason for this.

The relaxed problem becomes the assignment problem (with both sets being N). The good thing about the assignment problem is that its A-matrix is totally unimodular. Therefore, the LP relaxation always produces an integer solution.

If a sub-tour appears in the solution, we can branch to break it.
The breaking process amounts to forcing one of the sub-tour's arc variables to be 0.

42
Q

elaborate on the motivation behind branch and cut

A

Regular branch and bound, but we also add valid inequalities.

43
Q

elaborate on branch and cut

A

The exact same as branch and bound, but we add valid inequalities at nodes in the search tree.

We solve a subproblem, and then add a valid inequality that cuts away its solution.

Important about branch and cut: once we add a valid inequality at some subproblem, all remaining subproblems along that branch are strengthened as well.

44
Q

elaborate on constraint branching

A

Regular branching can actually be considered constraint branching; it just happens to involve only a single variable. The general idea of constraint branching is to include more than one variable in the branching constraint. The motivation is that some problems have structures where simply setting a binary variable to 0 or 1 does very little "damage" to the feasible region. In fact, there are problems where setting a specific variable to 1 automatically rules out a set of other variables (they must be 0). By exploiting such relationships (when they are apparent), we can define the new subproblem by ∑xj = 1, where the sum is over the variables that are mutually exclusive.

45
Q

recall set partitioning constraints.

Elaborate on how this structure can be utilized in constraint branching

A

For each item/object, we define a constraint. The constraint loops over all sets, and makes sure that the item is covered by exactly one chosen set.

So, if j indexes sets and i indexes items, we get:

∑aij xj [j in J] = 1, for i in I

This means that we KNOW that within each constraint, exactly one of the participating variables must be 1.

Because of this, we can select a pair of constraints p and q, and for each such pair define the set of variables that have non-zero technology coefficients in both constraints:

J_pq = {j | a_pj = 1 AND a_qj = 1}

So, for two constraints, we define the set of variables with non-zero coefficients in both. Since each constraint corresponds to a specific item, this is basically saying: "pick a pair of items, and find the set of variables (sets) that include both of them."

Now assume we have picked two items and found the set of variables that include both: we know that at most one of the variables in this set can legally be picked. Therefore, we can formulate two new constraints, one per subproblem, that define the branching:
1) ∑xj [j in J_pq] = 1 (the two items are covered together)
2) ∑xj [j in J_pq] = 0 (they are covered separately)
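Computing J_pq is a one-liner; the coverage matrix below is made up (this pair-branching scheme on set partitioning rows is often called Ryan-Foster branching):

```python
# a[i][j] = 1 if column (set) j covers item i (made-up data)
a = [[1, 0, 1, 0],
     [1, 1, 1, 0],
     [0, 1, 1, 1]]

def J_pq(a, p, q):
    """Variables with non-zero coefficients in both rows p and q."""
    return [j for j in range(len(a[0])) if a[p][j] == 1 and a[q][j] == 1]

print(J_pq(a, 0, 1))  # [0, 2] -> branch on x0 + x2 = 1 versus x0 + x2 = 0
```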

46
Q

Define SOS1

A

Set of variables (continuous or integer) where at most one variable is not zero

47
Q

Define SOS2

A

Set of variables where at most 2 variables are non-zero. These 2 variables must be consecutive in the given order.

48
Q

use case for SOS

A

piecewise linear function representation of a non-linear function

49
Q

Elaborate on motivation for special ordered sets

A

Some variables are better used (more efficiently handled) when we consider them as a single entity rather than as a collection of individual variables.

This is because it then becomes possible to branch on the entity rather than on individual variables.

50
Q

Consider SOS1. How do we treat it?

A

We can consider it as a set of many variables. We can split this set into two subsets. We know that either the first subset or the second must be all zero:

{x1, x2, …, xr} are all zero, OR
{x_(r+1), …, xn} are all zero

We can consider these as two branches.

51
Q

What happens if the set of special ordered variables have more than 1 non-zero element?

A

It violates SOS1. However, we still have SOS2 if at most 2 elements are non-zero (and they are adjacent).

52
Q

What is the purpose of a reference row?

A

The reference row holds the order of the variables in the special ordered set.
The reference row has values that are monotonically increasing.

We can then use the reference row to find a kind of weighted mid-point:

r-bar = ∑ajxj / ∑xj

summing over all the variables in the special ordered set.

We use r-bar as the branching marker where we split the set into subsets.
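A sketch of the marker computation (the reference row and LP values are made up):

```python
ref = [1, 2, 3, 4, 5]          # monotonically increasing reference row
x = [0.0, 0.5, 0.0, 0.5, 0.0]  # current fractional LP values (made up)

def branching_marker(ref, x):
    """Weighted average r_bar = sum(a_j x_j) / sum(x_j); the set is
    split into the variables on each side of r_bar."""
    return sum(a * v for a, v in zip(ref, x)) / sum(x)

print(branching_marker(ref, x))  # 3.0 -> split the set around this value
```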

53
Q

How do we find branching marker for S2?

A

We have to remove one element:
{x1, x2, …, x_(r-1)} are all zero
OR
{x_(r+1), …, xn} are all zero

We find the branching marker with the same method as before. Then, as above, the variable closest to the marker is excluded from both zero-sets, since it may be non-zero in either branch.

54
Q

how can we represent the logical structure of

x > 0 –> y=1

A

We add a simple indicator variable:

x <= My

If x is larger than 0, this forces y=1.

55
Q

When is it not enough to enforce x>0 —> y=1?

A

if we need to enforce the other direction as well:

y=0 –> x<=0

Given x >= 0, this is the same as saying:

y=0 –> x=0

In many cases, there is a cost associated with the indicator variable y, which means we never have to worry about this relation: the solver avoids y=1 unless forced. However, if there is no cost associated with y, the model needs to be told explicitly how to enforce the relation.

The problem is that we never want y=1 while x=0.

This is impossible to model perfectly with continuous x, so we use an approximation.

We say: "if x is below a certain threshold, we force y to be 0":

x >= my

Now, if x >= m, nothing extra is imposed by this constraint (and x <= My still forces y=1).
If x = 0, then due to the small m, y is forced to be 0 (y=1 would require x >= m).
There is a wiggle room equal to the size of m: combined with x <= My, values of x strictly between 0 and m become infeasible, so x must be either 0 or at least m. Since m is just a small threshold, this is usually fine.

The combination of

x <= My
x >= my

enforce the relations we want
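A quick check of the pair of constraints (M, m and the test values are made up):

```python
M, m = 1000, 0.01  # assumed upper bound on x and activity threshold

def link_ok(x, y):
    """x <= M*y forces y = 1 whenever x > 0;
    x >= m*y forces y = 0 whenever x = 0."""
    return x <= M * y and x >= m * y

print(link_ok(5.0, 1))  # True: active x with y = 1
print(link_ok(5.0, 0))  # False: positive x demands y = 1
print(link_ok(0.0, 1))  # False: y = 1 demands x >= m
print(link_ok(0.0, 0))  # True: both off
```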

56
Q

Say we have 2 variables, xa and xb that are proportions of ingredeitns in a blend. How can we model the relationship that if we have xa, we must alos have xb?

A

Since these are proportions, we cannot just use something like xa <= xb. That would work if xa and xb happened to be binary; however, they are continuous with an upper bound of 1.

We need to add an indicator variable y for whether xa is included or not:

xa >= my
xa <= My
Now, if xa is above 0, y is forced to 1 (by xa <= My).
If xa is 0, y is forced to 0 (by xa >= my, with the m-threshold wiggle room from before).

Now we can add the real relationship we're after:

xb >= my

57
Q

how do we represent the constraint:

xy = 0, where both x and y are binary

A

This is the same as saying:

x = 0
OR
y = 0

We can model this with:

x + y <= 1

58
Q

if the term xy (both binary) appears somewhere, what is the general procedure to make it linear?

A

We define a new binary variable z that replaces the product.

We need the logic saying that:

z = 1 <==> x=1 AND y=1

We can enforce this with:
z <= x
z <= y
z >= x+y-1 (alternatively written x+y-1 <= z, or x+y-z <= 1)
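Since everything is binary, the claim can be verified exhaustively:

```python
from itertools import product

def z_ok(x, y, z):
    """The three linear constraints replacing z = x*y."""
    return z <= x and z <= y and z >= x + y - 1

# For every (x, y), the only feasible z is exactly x*y
print(all(z_ok(x, y, x * y) and not z_ok(x, y, 1 - x * y)
          for x, y in product((0, 1), repeat=2)))  # True
```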

59
Q

say we have : yx, where y is binary and x is continuous. How do we linearize this?

A

Define a new continuous variable z (with z >= 0, so that y=0 really forces z to 0):
z <= My (if y is 0, then z is 0)
z <= x
x - z <= M(1-y) (this enforces that if y=1, then z = x)
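A quick feasibility check of the linearization (M and the sample values are made up):

```python
M = 100  # assumed upper bound on x

def lin_ok(x, y, z):
    """Linear constraints replacing z = y*x (y binary, 0 <= x <= M):
    z <= M*y, z <= x, x - z <= M*(1 - y), z >= 0."""
    return z <= M * y and z <= x and x - z <= M * (1 - y) and z >= 0

print(lin_ok(7.5, 1, 7.5), lin_ok(7.5, 1, 0.0))  # True False (y=1 forces z=x)
print(lin_ok(7.5, 0, 0.0), lin_ok(7.5, 0, 7.5))  # True False (y=0 forces z=0)
```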

60
Q

Define SOS1

A

A set of variables (integer or continuous) within which at most one variable can be non-zero.

61
Q

Define SOS2

A

A set of variables (integer or continuous) within which AT MOST 2 variables can be non-zero, AND these two variables must be adjacent.

62
Q

What is the reason for using special ordered sets?

A

Treating the set of variables as an entity, rather than treating each individual variable. The aim is to branch on entities.

63
Q

elaborate on the details surrounding SOS1

A

We have a set of variables:

{x1, x2, …, xn}
At most ONE can be non-zero.

We know that in a feasible solution, if one of these variables is non-zero, then it must lie in either:
{x1, x2, …, x_r}
OR
{x_(r+1), …, x_n}

The benefit is that we can branch on the condition/constraint that one of the two subsets is entirely zero (∑xi = 0 over that subset). This works the feasible region much harder than fixing one variable at a time.

As long as we are dealing with SOS1 (no adjacency requirement), we can choose any split of the variables into the two subsets.

64
Q

Elaborate on the details surrounding SOS2

A

We now additionally require adjacency, and that there is a monotonic reference row. The reference row is crucial because we need it to capture adjacency.

We refer to the index where we split as the "branching marker".

Use cases for SOS2 are those that require adjacency. Adjacency is tricky to model directly, because we would need constraints to enforce it along with binary variables for each "case". If we instead use SOS2, we can effectively branch on constraints.

We can declare SOS properties to a solver like Gurobi, which basically tells it that branching on individual variables in the set is useless, and that it is much better to branch on the entire entity.

65
Q

elaborate on SOS1 and SOS2

A

SOS1 is a property that we find in a great variety of cases. It basically just means that we can have at most one variable equal to 1. For instance, if a firm has a single position open for hire, they can only hire one candidate; if we view each candidate as a binary variable (hire/not hire), this is an SOS1 property.
SOS1 is very general, and can be applied every time we have a list of binary variables whose sum is constrained to be at most 1.
SOS2, on the other hand, is harder to apply. It is very useful for approximating a point on a non-linear function, regardless of dimension. So, if we have a function of time that represents cyclical patterns distorted by many different events, we can approximate its value at different points in time using SOS2 as a basis for branching.
