Unit 5: Reductions and branch and bound Flashcards

1
Q

Hashing

A

Alternative data structure for the dictionary problem.

Observation: Usually only a small subset of all possible keys is stored.

Idea: Instead of searching through a set of data elements using key comparisons, one can calculate the (exact) position of an element in storage (usually an array) by using arithmetic operations.

2
Q

Hashing: basics

Hash table

A
  • We assume that the hash table T has a fixed size m and uses an array with indices 0, ..., m-1.
  • For a hash table of size m that currently stores n keys (elements), we denote α = n/m its load factor.
3
Q

Load factor

A

For a hash table of size m that currently stores n keys (elements), we denote its load factor:

α = n/m

4
Q

Hashing: basics

Hash function

A

We choose a hash function h: K → {0, ..., m-1} that assigns to every key k ∈ K a unique hash value.

(however, more than one key can be assigned the same hash value)

5
Q

Advantages of hashing

A

The runtime of the operations search, insert, and remove is constant, Θ(1), in the ideal case, assuming that h(s) can be calculated in constant time.

6
Q

3 Limitations of hashing

A
  • Whenever h(k) = h(k') for k≠k', i.e. two distinct keys obtain the same hash value, we say that we have a collision.
  • Θ(1) only holds if we assume that the number of collisions is negligible.
  • Worst-case runtime is still O(n).
7
Q

2 Characteristics of a “good” hash function

A
  • The aim is a uniform distribution of the used keys over the buckets 0, ..., m-1 of the hash table.
  • Small changes to a key should lead to a distinct and possibly independent hash value.
8
Q

Hash functions

Modulo method

A

Assumption: k ∈ N (keys are natural numbers).

Calculation: h(k) = k mod m

9
Q

Hash functions

Properties of the Modulo method

A

Properties:
- the hash function can be calculated very efficiently.
- the right choice of m is very important. To ensure uniformity, a good choice for m is usually a prime number.

10
Q

Modulo method

Bad choices for m

A
  • m = 2ⁱ: Only the last i bits play a role.
  • m = 10ⁱ: Analogously for decimal numbers.
  • m = rⁱ: Analogously for r-adic numbers.
  • m = rⁱ ± j for small j can be problematic.

Example: m = 2⁷ − 1 = 127, with keys read as base-128 numbers of their ASCII codes ("p" = 112, "t" = 116):
h("pt") = (112 · 128 + 116) mod 127 = 14452 mod 127 = 101
h("tp") = (116 · 128 + 112) mod 127 = 14960 mod 127 = 101
(Keys with different orderings of the same letters obtain the same hash value.)

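A quick way to check this collision (a minimal sketch, assuming keys are read as base-128 numbers of their ASCII codes, as in the example above):

```python
# Sketch: read a string as a base-128 number of ASCII codes, then
# hash modulo m = 2**7 - 1 = 127 (a bad choice, since 128 mod 127 = 1:
# only the sum of the character codes matters, not their order).
def h(key: str, m: int = 127, r: int = 128) -> int:
    value = 0
    for ch in key:
        value = value * r + ord(ch)   # "pt" -> 112 * 128 + 116 = 14452
    return value % m

print(h("pt"), h("tp"))  # 101 101 -- permuted letters collide
```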
11
Q

Hash functions

Multiplication method

A

Assume k∈N.
Let A be an irrational number.

Calculation:
h(k) = ⌊m (k · A - ⌊ k · A ⌋) ⌋

  • The key is multiplied by A and only the fractional part is kept.
  • This provides a value in [0, 1), which is multiplied by the table size m and rounded down.

Observation that confirms uniformity: for the keys 1, 2, 3, …, i, the fractional part k · A − ⌊k · A⌋ of the next key k = i + 1 always falls into one of the largest intervals determined by the fractional parts of the previous keys together with 0 and 1.

12
Q

Multiplication method: choice for A

A

In general: the choice of m is not important as long as A is an irrational number.

Best choice for A: The golden ratio

A = ɸ⁻¹ = (√5 - 1) / 2

ɸ⁻¹ = limₙ→∞ Fₙ / Fₙ₊₁ ≈ 0.6180339887… (the limit of ratios of consecutive Fibonacci numbers)

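A minimal sketch of the multiplication method with this choice of A (the function name and the demo range are illustrative, not from the course):

```python
import math

# Illustrative sketch of the multiplication method with A = 1/phi.
A = (math.sqrt(5) - 1) / 2            # inverse golden ratio, irrational

def h_mult(k: int, m: int) -> int:
    frac = k * A - math.floor(k * A)  # fractional part of k * A, in [0, 1)
    return math.floor(m * frac)       # scale to table size and round down

# The keys 1..10 happen to hit all 10 buckets exactly once:
print([h_mult(k, 10) for k in range(1, 11)])  # [6, 2, 8, 4, 0, 7, 3, 9, 5, 1]
```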
13
Q

Open hashing: idea

A

All elements are placed directly into the array; every slot in the array is marked with a flag

fᵢ ∈ {free, used, free again}

Handling of collisions:
If a slot is used, consider other slots in a certain predefined order (probing).

14
Q

Open hashing: probing

A
  • Every slot i = 0, 1, ..., m-1 is marked by a flag fᵢ ∈ {free, used, free again}.
  • fᵢ = free for every i in the initial hash table.
  • At insertion, the new element will be added at the first slot marked with free or free again, the flag is then updated to used.
  • The search iterates the slots until the key is found or a slot is marked with free (in which case the key is not there).
  • After removal of an element, the flag is set to free again.
15
Q

Linear probing

A

Given: a normal hash function
h': K → {0, 1, ..., m-1}

Linear probing: For every i = 0, 1, ..., m-1, we set:
h(k, i) = (h'(k) + i) mod m

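A minimal sketch combining the flag scheme and linear probing from the last two cards (class and constant names are illustrative assumptions):

```python
# Minimal sketch of open hashing with linear probing and per-slot flags
# (class and constant names are illustrative, not from the course).
FREE, USED, FREE_AGAIN = 0, 1, 2

class LinearProbingTable:
    def __init__(self, m: int):
        self.m = m
        self.keys = [None] * m
        self.flags = [FREE] * m

    def _slot(self, k: int, i: int) -> int:
        return (k % self.m + i) % self.m      # h(k, i) = (h'(k) + i) mod m

    def insert(self, k: int) -> None:
        for i in range(self.m):
            j = self._slot(k, i)
            if self.flags[j] != USED:         # first slot free or free again
                self.keys[j], self.flags[j] = k, USED
                return
        raise OverflowError("hash table is full")

    def search(self, k: int) -> bool:
        for i in range(self.m):
            j = self._slot(k, i)
            if self.flags[j] == FREE:         # stop: k cannot be further along
                return False
            if self.flags[j] == USED and self.keys[j] == k:
                return True
        return False

    def remove(self, k: int) -> None:
        for i in range(self.m):
            j = self._slot(k, i)
            if self.flags[j] == FREE:
                return
            if self.flags[j] == USED and self.keys[j] == k:
                self.flags[j] = FREE_AGAIN    # keeps probe chains intact
                return
```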
16
Q

Linear probing: problems

A

Problems:
- After insertions, the probability for a new key to be inserted at a certain slot becomes non-uniform.
- After a slot is used, the probability for the insertion of a key at the next slot changes.

  • Long consecutive sequences of used slots have a tendency to grow faster than shorter ones.
  • This effect amplifies, since long sequences combine more easily to create even longer ones.
  • As a consequence of this phenomenon (primary clustering), the efficiency of the hash table deteriorates drastically once the load factor gets closer to 1.
17
Q

Quadratic probing

A

Idea: To avoid the primary clustering of linear probing, quadratic probing uses a quadratically growing distance to probe a new slot.

Given: A normal hash function:
h': K → {0, 1, ..., m-1}

Quadratic probing: probing function:
h(k, i) = (h'(k) + c₁i + c₂i²) mod m

Here c₁ and c₂ are carefully chosen constants.

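A small illustration (the concrete choice c₁ = c₂ = 1/2, i.e. offsets i(i+1)/2, is one well-known option, not necessarily the course's):

```python
def quadratic_probe(h_prime: int, i: int, m: int) -> int:
    """Probe sequence h(k, i) = (h'(k) + c1*i + c2*i*i) mod m with the
    common choice c1 = c2 = 1/2, i.e. offset i*(i+1)//2, which visits
    every slot when m is a power of two."""
    return (h_prime + i * (i + 1) // 2) % m

# For m = 8 the offsets 0, 1, 3, 6, 10, 15, 21, 28 cover all residues mod 8:
print(sorted(quadratic_probe(0, i, 8) for i in range(8)))  # [0, 1, ..., 7]
```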
18
Q

Double hashing

A

Idea: The efficiency of uniform probing can almost be achieved by using a second hash function instead of a randomized probing order.

Given: 2 hash functions:

h₁, h₂: K → {0, 1, ..., m-1}

Double hashing: We set (for i = 0, 1, ..., m-1):

h(k, i) = (h₁(k) + ih₂(k)) mod m

Choice of h₂(k): The probing order must reach all slots 0, ..., m-1 for every key k. This implies that h₂(k) ≠ 0 and that h₂(k) must have no common divisor with m (guaranteed, e.g., if m is prime).

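A minimal sketch (the concrete h₁ and h₂ below are illustrative choices; the course only requires the stated properties):

```python
def double_hash_probe(k: int, i: int, m: int) -> int:
    """h(k, i) = (h1(k) + i*h2(k)) mod m. For m prime, any h2(k) in
    {1, ..., m-1} is coprime to m, so the probe sequence reaches every
    slot. h1 and h2 here are simple illustrative choices."""
    h1 = k % m
    h2 = 1 + (k % (m - 1))   # never 0, and always < m
    return (h1 + i * h2) % m

# For m = 7, the probe order of any key is a permutation of 0..6:
print([double_hash_probe(12, i, 7) for i in range(7)])  # [5, 6, 0, 1, 2, 3, 4]
```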
19
Q

À la Brent

A

Idea: If, at insertion of a key k, a probed slot j is used with k' = T[j].key, then set:
j₁ = ( j + h₂(k) ) mod m
j₂ = ( j + h₂(k') ) mod m

If j₁ is used but j₂ is free, move k' to j₂ to make space for k at j.

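A sketch of this displacement rule as a single insertion step (the table representation and helper name are assumptions; the surrounding probing loop is omitted):

```python
# Sketch of Brent's rule for one occupied slot j during insertion of k.
# T is a list of keys (None = empty); h2 is the second hash function.
def brent_step(T, m, j, k, h2):
    k_prime = T[j]                  # resident key k' = T[j].key
    j1 = (j + h2(k)) % m            # where k would continue probing
    j2 = (j + h2(k_prime)) % m      # where k' could be moved to
    if T[j1] is not None and T[j2] is None:
        T[j2] = k_prime             # move k' out of the way ...
        T[j] = k                    # ... and place k at slot j
        return True                 # insertion complete
    return False                    # otherwise: continue probing at j1
```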
20
Q

Analysis of Brent’s improvement

A

Number of probes, on average (for large m and n):
- Unsuccessful search: ≈ 1 / (1 − α)
- Successful search: < 2.5 (independent of α, for α ≤ 1)

Advantage: The average time for a successful search is constant even in the extreme case α = 1.

21
Q

Open Hashing
Suitability and reorganization

A
  • Open hashing methods only allow a load factor of at most 1.
  • Load factors close to 1 are in general bad, since the number of probing steps can get very large.
  • If necessary, a reorganisation is required, i.e. a completely new hash table (e.g. of double the size) is populated if more elements need to be stored than originally anticipated.
  • Open hashing methods are usually well-suited whenever the number of elements to be stored is known in advance and elements are rarely or not at all removed.
22
Q

Tractable problems

A

We say that a problem is efficiently solvable if it can be solved in polynomial time.

I.e. for such a problem, there is an O(n^c) algorithm, where:

  • n = input size
  • c = constant

Efficiently solvable problems are also referred to as tractable problems.

23
Q

Cobham-Edmonds Assumption

A
  • The assumption that tractability can be equated with solvability in polynomial time.
  • Pioneered by Alan Cobham and Jack Edmonds, who proposed this assumption in the 1960s.
  • The assumption has become the predominant assumption in computer science over the last 50 years.
24
Q

Cobham-Edmonds Assumption

Justification

A
  • In practice, polynomial time algorithms usually have small constants and exponents (e.g. linear, quadratic, or cubic)
  • Overcoming the exponential bound of brute-force algorithms usually goes hand-in-hand with significant structural and combinatorial insights into the problem.
25
Q

Cobham-Edmonds Assumption

Exceptions and Criticism

A
  • Some polynomial time algorithms have huge constants and exponents and are considered useless in practice.
  • Some exponential time algorithms are considered highly useful in practice since their worst-case behaviour only occurs rarely or the problem instances are sufficiently small.
  • For huge datasets, even linear time algorithms may be too slow.
26
Q

Decision problem

A
  • A problem whose solution is either YES or NO
  • In contrast to functional or optimization problems that return a solution or set of solutions.
  • Typically: every functional or optimization problem has a natural corresponding decision problem and vice versa.
27
Q

Define

Polynomial time reduction

A

A polynomial time reduction from a problem X to a problem Y is an algorithm A that, for every instance x of X, computes an instance y of Y such that:

  • x is a YES-instance of X if and only if y is a YES-instance of Y (equivalence).
  • A runs in polynomial time, i.e. there is a constant c such that, given an instance x of X of size n, A computes the corresponding instance y of Y in time O(n^c).

We write X ≤ₚ Y if there is a polynomial time reduction from X to Y.

28
Q

X ≤ₚ Y: intuition

A

If X ≤ₚ Y (“X can be polynomially reduced to Y”), then intuitively:

“Y is at least as hard as X”

29
Q

Showing Intractability

A

We show that a problem is intractable by reducing to it from some other intractable problem.

Lemma:
If X ≤ₚ Y and X cannot be solved in polynomial time, then Y cannot be solved in polynomial time.

30
Q

Reductions

Transitivity

A

If X ≤ₚ Y and Y ≤ₚ Z,
then X ≤ₚ Z.

31
Q

Graphs

Independent Set

A

A set of vertices that are pairwise not adjacent.

32
Q

Independent Set problem

A

Input: A graph G and an integer k.

Question: Does G have an independent set of size at least k?

33
Q

Graphs

Vertex Cover

A

A set of vertices that contains at least one endpoint from every edge.

34
Q

Vertex cover problem

A

Input: A graph G and an integer k.

Question: Does G have a vertex cover of size at most k?

35
Q

Vertex cover vs Independent set

A

Let X be a vertex cover of a graph G = (V, E).

Then V \ X is an independent set.

Moreover, a set X is a vertex cover if and only if V \ X is an independent set.

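This equivalence can be checked exhaustively on a small example (a sketch; the graph and helper names are illustrative):

```python
from itertools import combinations

def is_vertex_cover(E, X):
    return all(u in X or v in X for (u, v) in E)

def is_independent_set(E, S):
    return all(not (u in S and v in S) for (u, v) in E)

# Check the equivalence on every vertex subset of a small cycle graph:
V = {1, 2, 3, 4}
E = [(1, 2), (2, 3), (3, 4), (4, 1)]
for r in range(len(V) + 1):
    for X in map(set, combinations(V, r)):
        assert is_vertex_cover(E, X) == is_independent_set(E, V - X)
print("X is a vertex cover  <=>  V \\ X is an independent set: verified")
```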
36
Q

Graphs

Non-blocker

A

A non-blocker of G is a subset N ⊆ E of edges such that, between any pair of vertices, G contains a path that uses no edge of N.

Equivalently: N is a non-blocker if the graph G minus all edges in N is connected.

37
Q

Weight of a non-blocker N

A

The sum of the weights of all edges in N, i.e. ∑ w(e) for e ∈ N.

38
Q

Maximum non-blocker

A

A non-blocker of maximum weight

39
Q

Non-blocker Problem

A

Input: A connected graph G=(V,E) with edge weights w: E → Q and a number W ∈ Q.

Question: Does G have a non-blocker of weight at least W?

40
Q

Spanning Tree problem

A

Input: A connected graph G=(V,E) with edge weights w: E → Q and a number W ∈ Q.

Question: Does G have a spanning tree T=(V,F) of weight at most W?

i.e. ∑w(e) ≤ W for e ∈ F

41
Q

Set Cover problem

A

Input: A set of elements U, a collection S₁, ..., Sₘ of subsets of U and an integer k.

Question: Does there exist a collection of ≤k of these sets whose union is equal to U?

42
Q

Logic & Satisfiability

a literal

A

Either a propositional variable (e.g. x) or its negation (e.g. ¬x).

43
Q

Logic & Satisfiability

a clause

A

A disjunction of literals, i.e.
Cⱼ = l₁ ∨ l₂ ∨ ... ∨ lᵣ

44
Q

Logic & Satisfiability

Conjunctive normal form

A

ɸ is in conjunctive normal form if it is a conjunction of clauses, i.e.:

C₁ ∧ C₂ ∧ ... ∧ Cₘ

45
Q

Satisfiability

A
  • Φ is a set of clauses, and each clause is a set of literals
  • V(Φ) is the set of variables of Φ

The aim is to decide whether Φ is satisfiable, i.e. whether there is an assignment τ : V(Φ) → {0, 1} satisfying Φ.

The assignment τ satisfies Φ if every clause of Φ contains either:
- a literal x with τ(x) = 1, or
- a literal ¬x with τ(x) = 0.

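A minimal sketch of this satisfaction check (the clause-as-set-of-strings encoding is an assumption chosen for illustration):

```python
# Formula = list of clauses; clause = set of literals; literal "x" or "¬x".
def satisfies(clauses, tau):
    def literal_true(lit):
        if lit.startswith("¬"):
            return tau[lit[1:]] == 0   # ¬x is satisfied when τ(x) = 0
        return tau[lit] == 1           # x is satisfied when τ(x) = 1
    return all(any(literal_true(l) for l in clause) for clause in clauses)

phi = [{"x", "¬y"}, {"y", "z"}, {"¬x", "¬z"}]
print(satisfies(phi, {"x": 1, "y": 1, "z": 0}))  # True
```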
46
Q

3-CNF formula

A

A CNF-formula that has exactly 3 literals per clause.

47
Q

NP: Polynomial time certifier

Certifier

A

Algorithm C(s, t) is a certifier for problem X if every string s satisfies s ∈ X iff there exists a string t such that C(s, t) = yes.

48
Q

The class NP

A

Decision problems for which there exists a polynomial time certifier.

Intuition:
- P ≈ problems that can be solved in polynomial time.
- NP ≈ problems for which a (polynomial-sized) solution can be verified in polynomial time.

49
Q

NP-complete

A

A problem Y in NP with the property that X ≤ₚ Y for every problem X in NP.

NP-complete problems are the hardest problems in NP.

50
Q

Combinatorial optimization

A

Combinatorial optimization is about finding a subset of elements that:
- satisfies certain constraints and
- is optimal with respect to a given cost function, e.g. smallest weight, shortest path, etc.

51
Q

6 Examples of combinatorial optimization problems

A
  • Minimum spanning tree in a graph
  • Shortest path in a graph
  • Minimum vertex or set cover
  • Maximum independent set
  • Minimum vertex coloring
  • Knapsack
52
Q

Coping with NP-completeness

Impossible trinity

A

Sacrifice one of 3 desired features:

  • Solve the problem to optimality (heuristic & approximation algorithms)
  • Solve problem in polynomial time → algorithms with exponential runtime (e.g. branch and bound and parameterized algorithms)
  • Solve arbitrary instances of the problem → identify special cases solvable in polynomial time.
53
Q

Branch and bound

A

Restrict the search space (of a systematic enumeration of all solutions) with the help of lower bounds and upper bounds and provide an optimal solution.

Still requires exponential time in the worst case, but can be much more efficient on practical instances.

54
Q

Knapsack problem

A

Input: n items with positive rational weights w₁, ..., wₙ and values v₁, ..., vₙ, as well as a positive rational capacity C.

Task: Find a subset S of items with weight ≤ C that maximizes the total value.

55
Q

Maximization: Approach

A
  • The problem is divided into subproblems, e.g. by fixing variables or adding additional constraints. This leads to a partitioning of the solution space.
  • If the calculated upper bound U' is not larger than the best current lower bound L (= the value of the current best solution) for one (or more) of these subproblems (subsets), then one no longer needs to consider solutions for this subproblem.
  • Otherwise one needs to further divide the current subproblem (subset of solutions).
  • One continues to divide the solution space until, for all current subproblems, the upper bound is no longer larger than the current best lower bound (see the Knapsack sketch below).
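A minimal branch and bound sketch for Knapsack along these lines (helper names are assumptions; U' is the fractional-relaxation upper bound, L the value of the best complete solution found so far):

```python
def knapsack_bb(weights, values, C):
    n = len(weights)
    # Sort items by value density; this makes the greedy bound tight.
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)
    w = [weights[i] for i in order]
    v = [values[i] for i in order]
    best = 0  # lower bound L = value of the best solution found so far

    def upper_bound(i, cap, val):
        # U': greedily take remaining items, plus a fraction of the
        # first item that no longer fits (LP relaxation of the rest).
        for j in range(i, n):
            if w[j] <= cap:
                cap -= w[j]
                val += v[j]
            else:
                return val + v[j] * cap / w[j]
        return val

    def branch(i, cap, val):
        nonlocal best
        if i == n:
            best = max(best, val)
            return
        if upper_bound(i, cap, val) <= best:
            return  # prune: U' is not larger than the current L
        if w[i] <= cap:                      # subproblem: take item i
            branch(i + 1, cap - w[i], val + v[i])
        branch(i + 1, cap, val)              # subproblem: skip item i

    branch(0, C, 0)
    return best

print(knapsack_bb([2, 3, 4, 5], [3, 4, 5, 6], 5))  # 7 (take the items w=2, w=3)
```

Branching here fixes, item by item, whether the item is in the solution, which is exactly the partitioning of the solution space described above.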
56
Q

Maximization Approach: Suitability & Efficiency

A
  • Branch and bound is a general principle (meta-approach).
  • It can be applied to a wide variety of discrete optimization problems.
  • Crucial for its efficiency are:
    • The choice of heuristics for the computation of U' and L'.
    • How the branching is accomplished.
    • The order in which the subinstances are considered.
57
Q

Branch and bound:
Choice of the next subproblem

A
  • The choice of the next subproblem does not change the general functionality and correctness of branch and bound.
  • Nevertheless, the choice can have a huge impact on practical running-time.
58
Q

Branch and bound: choice of problems

Best-first vs Depth-first

A

Best-first:
- Always choose a subproblem with the best dual bound (upper bound).
- This way we always process the smallest number of subproblems.

Depth-first:
- Always continue with the most recently created subproblem (similar to depth-first search for searching graphs).
- Leads to a first valid solution very quickly.
- One can also start with a depth-first strategy and switch to best-first search after an initial solution has been found.

59
Q

Independent Set Problem

A

Given a graph G and an integer K, does G contain an independent set of size K?

60
Q

Prove that the Independent Set Problem is in NP

A

Proof:

A solution to the independent set problem is a subset V' of the vertices of G.

Let n be the number of vertices of G and m be the number of edges.

Given V', we first verify that it contains K elements.

For each pair u, v ∈ V', check that (u, v) ∉ E.

There are O(n²) pairs of vertices in V', and each check against the edge list takes O(m) time, so this takes O(mn²) time in total, which is polynomial in n and m.

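The certifier from this proof as a sketch (graph encoding and function name are illustrative):

```python
from itertools import combinations

# Certifier C(s, t): s encodes (edges, K), t encodes the candidate set V'.
def certify_independent_set(edges, K, V_prime):
    if len(V_prime) != K:                       # first verify |V'| = K
        return False
    E = set(map(frozenset, edges))
    return all(frozenset((u, v)) not in E       # O(n^2) pair checks
               for u, v in combinations(V_prime, 2))

print(certify_independent_set([(1, 2), (2, 3), (3, 4)], 2, {1, 3}))  # True
```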
61
Q

3-SAT Problem

A

Given a Boolean expression in 3-CNF, can it be satisfied?

62
Q

Reduce 3-SAT to the Independent Set Problem in polynomial time.

A

Proof:
Let φ be a Boolean expression in 3-CNF with n variables and m clauses.

Construct G and k as follows:

  • G contains 3 vertices for each clause of φ (one for each literal).
  • Connect the three literals in each clause to a triangle.
  • Connect each literal to every occurrence of its negation.
  • Set k=m.

φ is satisfiable if and only if G has an independent set of size at least k.

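A sketch of this construction (encoding literals as signed integers, x as +x and ¬x as −x, is an assumption chosen for illustration):

```python
# One vertex per literal occurrence, identified by (clause index, literal).
def sat_to_independent_set(clauses):
    vertices = [(c, lit) for c, clause in enumerate(clauses)
                         for lit in clause]
    edges = []
    for a, (c1, l1) in enumerate(vertices):
        for c2, l2 in vertices[a + 1:]:
            if c1 == c2 or l1 == -l2:    # triangle edge or conflict edge
                edges.append(((c1, l1), (c2, l2)))
    return vertices, edges, len(clauses)  # k = m, the number of clauses

# (x ∨ y ∨ ¬z) ∧ (¬x ∨ ¬y ∨ z): 6 vertices, 6 triangle + 3 conflict edges:
V, E, k = sat_to_independent_set([[1, 2, -3], [-1, -2, 3]])
print(len(V), len(E), k)  # 6 9 2
```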
63
Q

Prove Correctness:

φ is satisfiable if and only if G has an independent set of size at least k

Recall: k is set = m, the number of clauses in φ

A

Proof:
(⇐) Let S be an independent set of size k = m.
- S contains exactly one vertex per triangle (at most one because S is independent, and at least one because |S| = k = m).
- Moreover, the edges between complementary literals enforce that S contains at most one of the literals x and ¬x for any variable x.
- Hence, setting every literal in S to true (and the remaining variables arbitrarily) yields a well-defined assignment that satisfies every clause.
(⇒) Conversely, if an assignment satisfies φ, pick one satisfied literal per clause; the corresponding m vertices form an independent set.

64
Q

2 Ways to picture a 3-SAT instance

A
  1. You have to make an independent 0/1 decision for each of the n variables, and you succeed if you manage to achieve one of three ways of satisfying each clause.
  2. You have to choose one term from each clause, and then find a truth assignment that causes all these terms to evaluate to 1, thereby satisfying all clauses. You succeed if you can select a term from each clause in such a way that no two selected terms “conflict”.

We say that two terms conflict if one is equal to a variable xᵢ and the other is equal to its negation ¬xᵢ.

65
Q

Hamiltonian Cycle

A

Given a directed graph G = (V, E), we say that a cycle C in G is a Hamiltonian cycle if it visits each vertex exactly once.

I.e. it constitutes a “tour” of all the vertices, with no repetitions.

66
Q

Hamiltonian Path

A

Given a directed graph G = (V, E), we say that a path P in G is a Hamiltonian path if it contains each vertex exactly once.

(The path is allowed to start at any node and end at any node).

67
Q

6 Genres of NP-complete problems

A
  • Packing
  • Covering
  • Partitioning
  • Sequencing
  • Numerical
  • Constraint Satisfaction
68
Q

6 Genres of NP-complete problems

Packing problems

A

You’re given a collection of objects. You want to choose at least k of them.

Making your life difficult is a set of conflicts among the objects, preventing you from choosing certain groups simultaneously.

E.g.
- Independent Set. Given a graph G and a number k, does G contain an independent set of size at least k?
- Set Packing. Given a set U of n elements, a collection S₁, ..., Sₘ of subsets of U and a number k, does there exist a collection of at least k of these sets with the property that no two of them intersect?

69
Q

6 Genres of NP-complete problems

Covering problems

A

Covering problems form a natural contrast to packing problems.

You’re given a collection of objects, and you want to choose a subset that collectively achieves a certain goal. The challenge is to achieve this goal while only choosing k of the objects.

E.g.
- Vertex Cover. Given a graph G and a number k, does G contain a vertex cover of size at most k?
- Set Cover. Given a set U of n elements, a collection S₁, ..., Sₘ of subsets of U, and a number k, does there exist a collection of at most k sets whose union is equal to all of U?

70
Q

6 Genres of NP-complete problems

Partitioning problems

A

Involve a search over all ways to divide up a collection of objects into subsets so that each object appears in exactly one of the subsets.

  • 3-Dimensional Matching. Given disjoint sets X, Y, and Z, each of size n, and given a set T ⊆ X × Y × Z of ordered triples, does there exist a set of n triples in T so that each element of X ∪ Y ∪ Z is contained in exactly one of these triples?
  • Graph Coloring. Given a graph G and a bound k. Does G have a k-colouring?
71
Q

6 Genres of NP-complete problems

Sequencing Problems

A

Involve searching over the set of all permutations of a collection of objects.

E.g.
- Hamiltonian Cycle.
- Hamiltonian Path.
- Travelling Salesman. Given a set of distances on n cities and a bound D, is there a tour of length at most D?