Greedy Flashcards
Interval Scheduling
“We have a set of requests {1, 2, . . . , n}; the i’th request corresponds to an interval of time starting at s(i) and finishing at f(i)
The interval scheduling maximization problem (ISMP) is to find a largest compatible set, i.e., a set of non-overlapping intervals of maximum size.
Solution: Sorting by earliest finish time!
Implementation and Running Time We can make our algorithm run in time O(n log n) as follows. We begin by sorting the n requests in order of finishing time. In this way, we implement the greedy algorithm analyzed above in one pass through the intervals, spending constant time per interval
“
Interval Partitioning Problem
“we have many identical resources available and we wish to schedule all the requests using as few resources as possible
suppose that each request corresponds to a lecture that needs to be scheduled in a classroom for a particular interval of time. We wish to satisfy all these requests, using as few classrooms as possible.
Suppose we define the depth of a set of intervals to be the maximum number that pass over any single point on the time-line.
(4.4) In any instance of Interval Partitioning, the number of resources needed is at least the depth of the set of intervals.
we show how to assign a label to each interval, where the labels come from the set of numbers {1, 2, . . . , d}, and the assignment has the property that overlapping intervals are labeled with different numbers.
The algorithm we use for this is a simple one-pass greedy strategy that orders intervals by their starting times.
We go through the intervals in this order, and try to assign to each interval we encounter a label that hasn’t already been assigned to any previous interval that overlaps it.
(4.5) If we use the greedy algorithm above, every interval will be assigned a label, and no two overlapping intervals will receive the same label
(4.6) The greedy algorithm above schedules every interval on a resource, using a number of resources equal to the depth of the set of intervals. This is the optimal number of resources needed.
“
Scheduling to Minimize Lateness
“Instead of a start time and finish time, the request i has a deadline d_i, and it requires a contiguous time interval of length t_i, but it is willing to be scheduled at any time before the deadline.
Solution: earliest deadline first
”
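A minimal sketch of earliest-deadline-first, assuming jobs are given as (t_i, d_i) pairs and the schedule starts at time 0:

```python
# Earliest-deadline-first scheduling; returns the job order and the
# maximum lateness max(0, finish_i - d_i) over all jobs.
def schedule_edf(jobs):
    time, max_lateness, order = 0, 0, []
    for length, deadline in sorted(jobs, key=lambda j: j[1]):
        time += length                      # run the job to completion
        max_lateness = max(max_lateness, time - deadline)
        order.append((length, deadline))
    return order, max_lateness

# Example: the deadline-2 job runs first; max lateness is 1.
print(schedule_edf([(2, 4), (3, 2)]))  # ([(3, 2), (2, 4)], 1)
```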
Dijkstra's Algorithm: Greedy
“The algorithm maintains a set S of vertices u for which we have determined a shortest-path distance d(u) from s; this is the “explored” part of the graph.
Initially S = {s}, and d(s) = 0.
For each node v ∈ V − S, we determine the shortest path that can be constructed by traveling along a path through the explored part S to some u ∈ S, followed by the single edge (u, v).
Caveat: the algorithm does not always find shortest paths if some of the edges can have negative lengths.
Many shortest-path applications involve negative edge lengths, and a more complex algorithm, due to Bellman and Ford, is required for this case.
Running time:
(4.15) Using a priority queue, Dijkstra’s Algorithm can be implemented on a graph with n nodes and m edges to run in O(m) time, plus the time for n ExtractMin and m ChangeKey operations.
Using the heap-based priority queue implementation discussed in Chapter 2, each priority queue operation can be made to run in O(log n) time. Thus the overall time for the implementation is O(m log n).”
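A sketch of a heap-based implementation; instead of ChangeKey, this common Python variant pushes duplicate heap entries and skips stale ones, which preserves the O(m log n) bound. The adjacency-list format (node → list of (neighbor, length) pairs, lengths ≥ 0) is an assumption:

```python
import heapq

# Dijkstra's Algorithm with a lazy binary heap.
def dijkstra(graph, s):
    d = {s: 0}                 # settled shortest-path distances
    heap = [(0, s)]
    while heap:
        dist, u = heapq.heappop(heap)
        if dist > d.get(u, float("inf")):
            continue           # stale entry: u was settled earlier
        for v, length in graph[u]:
            if dist + length < d.get(v, float("inf")):
                d[v] = dist + length
                heapq.heappush(heap, (d[v], v))
    return d

g = {"s": [("a", 2), ("b", 5)], "a": [("b", 1)], "b": []}
print(dijkstra(g, "s"))  # {'s': 0, 'a': 2, 'b': 3}
```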
The Minimum Spanning Tree Problem
“there should be a path between every pair of nodes—but subject to this requirement, we wish to build it as cheaply as possible.
For certain pairs (v_i, v_j), we may build a direct link between v_i and v_j for a certain cost c(v_i, v_j) > 0. Thus we can represent the set of possible links that may be built using a graph G = (V, E), with a positive cost c_e associated with each edge e = (v_i, v_j). The problem is to find a subset of the edges T ⊆ E so that the graph (V, T) is connected, and the total cost Σ_{e∈T} c_e is as small as possible.
(4.16) Let T be a minimum-cost solution to the network design problem defined above. Then (V , T ) is a tree.
this is a case where many of the first greedy algorithms one tries turn out to be correct
Kruskal's Algorithm: One simple algorithm starts without any edges at all and builds a spanning tree by successively inserting edges from E in order of increasing cost, skipping any edge whose addition would create a cycle.
Prim's Algorithm (by analogy with Dijkstra's Algorithm): We start with a root node s and try to greedily grow a tree from s outward. At each step, we simply add the node that can be attached as cheaply as possible to the partial tree we already have.
Reverse-Delete Algorithm: Finally, we can design a greedy algorithm by running a sort of “backward” version of Kruskal's Algorithm. Specifically, we start with the full graph (V, E) and begin deleting edges in order of decreasing cost, skipping any edge whose deletion would disconnect the graph.
Eliminating the Assumption that All Edge Costs Are Distinct
perturb all edge costs by different, extremely small numbers, so that they all become distinct.
“
Prim's Algorithm
“We will see that both Prim's and Kruskal's Algorithms can be implemented, with the right choice of data structures, to run in O(m log n) time.
By analogy with Dijkstra's Algorithm, we need to be able to decide which node v to add next to the growing set S, by maintaining the attachment costs a(v) = min_{e=(u,v): u∈S} c_e for each node v ∈ V − S.
As before, we keep the nodes in a priority queue with these attachment costs a(v) as the keys;
(4.22) Using a priority queue, Prim's Algorithm can be implemented on a graph with n nodes and m edges to run in O(m) time, plus the time for n ExtractMin and m ChangeKey operations.
As with Dijkstra’s Algorithm, if we use a heap-based priority queue we can implement both ExtractMin and ChangeKey in O(log n) time, and so get an overall running time of O(m log n).”
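A sketch of Prim's Algorithm using the same lazy-heap trick as the Dijkstra sketch above, with the attachment costs a(v) as heap keys. The adjacency-list format and a connected, undirected input graph are assumptions:

```python
import heapq

# Prim's Algorithm: grow the tree from `root`, always attaching the
# cheapest node outside the tree.
def prim(graph, root):
    in_tree, mst_edges = set(), []
    heap = [(0, root, None)]  # (attachment cost a(v), node, tree endpoint)
    while heap:
        cost, v, u = heapq.heappop(heap)
        if v in in_tree:
            continue          # stale entry: v was attached more cheaply
        in_tree.add(v)
        if u is not None:
            mst_edges.append((u, v, cost))
        for w, c in graph[v]:
            if w not in in_tree:
                heapq.heappush(heap, (c, w, v))
    return mst_edges

g = {"a": [("b", 1), ("c", 4)], "b": [("a", 1), ("c", 2)],
     "c": [("a", 4), ("b", 2)]}
print(prim(g, "a"))  # [('a', 'b', 1), ('b', 'c', 2)]
```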
Kruskal’s Algorithm
“the graph has a fixed population of nodes, but it grows over time by having edges appear between certain pairs of nodes. Our goal is to maintain the set of connected components of such a graph throughout this evolution process.
Rather, we will develop a data structure that we call the Union-Find structure, which will store a representation of the components in a way that supports rapid searching and updating.
Given a node u, the operation Find(u) will return the name of the set containing u.
If they are not, then Union(Find(u),Find(v)) can be used to merge the two components into one.
Maybe the simplest possible way to implement a Union-Find data structure is to maintain an array Component of size n, where Component[s] is the name of the set containing s.
The data structure for this alternate implementation uses pointers: each node points toward the name of its set, and Find(u) follows these pointers.
(4.24) Consider the above pointer-based implementation of the Union-Find data structure for some set S of size n, where unions keep the name of the larger set. A Union operation takes O(1) time, MakeUnionFind(S) takes O(n) time, and a Find operation takes O(log n) time.
(4.25) Kruskal’s Algorithm can be implemented on a graph with n nodes and m edges to run in O(m log n) time.”
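A sketch pairing a simple Union-Find with Kruskal's Algorithm. Union by size keeps the name of the larger set, as in (4.24); the path compression inside Find is a common extra beyond the book's pointer structure, and the (cost, u, v) edge format is an assumption:

```python
# Union-Find with union by size; Find also compresses paths
# (an optimization beyond the plain pointer structure of (4.24)).
class UnionFind:
    def __init__(self, elements):
        self.parent = {x: x for x in elements}  # MakeUnionFind: O(n)
        self.size = {x: 1 for x in elements}

    def find(self, u):
        while self.parent[u] != u:
            self.parent[u] = self.parent[self.parent[u]]  # compress
            u = self.parent[u]
        return u

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False                   # already in the same component
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra                # keep the larger set's name
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True

# Kruskal's Algorithm: insert edges by increasing cost, skipping
# any edge that would create a cycle.
def kruskal(nodes, edges):
    uf = UnionFind(nodes)
    tree = []
    for cost, u, v in sorted(edges):       # sorting: O(m log n)
        if uf.union(u, v):
            tree.append((u, v, cost))
    return tree

print(kruskal(["a", "b", "c"],
              [(1, "a", "b"), (2, "b", "c"), (4, "a", "c")]))
```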
Clustering
“Suppose we are seeking to divide the objects in U into k groups, for a given parameter k. We say that a k-clustering of U is a partition of U into k nonempty sets C1, C2, . . . , Ck. We define the spacing of a k-clustering to be the minimum distance between any pair of points lying in different clusters. Given that we want points in different clusters to be far apart from one another, a natural goal is to seek the k-clustering with the maximum possible spacing.
Thus we start by drawing an edge between the closest pair of points. We then draw an edge between the next closest pair of points. We continue adding edges between pairs of points, in order of increasing distance d(pi, pj).
so if we are about to add the edge (pi, pj) and find that pi and pj already belong to the same cluster, we will refrain from adding the edge
Each time we add an edge that spans two distinct components, it is as though we have merged the two corresponding clusters.
(4.26) The components C1, C2, . . . , Ck formed by deleting the k − 1 most expensive edges of the minimum spanning tree T constitute a k-clustering of maximum spacing.”
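A sketch of this single-linkage procedure as truncated Kruskal, reusing the UnionFind class from the sketch above; representing distances as a dict from point pairs to values is an assumption:

```python
# k-clustering of maximum spacing: merge closest clusters until
# exactly k components remain (equivalently, drop the k-1 most
# expensive MST edges).
def k_clustering(points, dist, k):
    uf = UnionFind(points)
    components = len(points)
    for (p, q) in sorted(dist, key=dist.get):  # increasing distance
        if components == k:
            break            # next inter-cluster distance = spacing
        if uf.union(p, q):   # pairs already co-clustered are skipped
            components -= 1
    clusters = {}
    for p in points:
        clusters.setdefault(uf.find(p), []).append(p)
    return list(clusters.values())

d = {("a", "b"): 1, ("a", "c"): 5, ("b", "c"): 4}
print(k_clustering(["a", "b", "c"], d, 2))  # [['a', 'b'], ['c']]
```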