Share on Facebook Share on Twitter Email
Answers.com

Set cover problem

 
Wikipedia: Set cover problem

The set covering problem is a classical question in computer science and complexity theory. It is a problem "whose study has led to the development of fundamental techniques for the entire field" of approximation algorithms.[1]

As input you are given several sets. They may have some elements in common. You must select a minimum number of these sets so that the sets you have picked contain all the elements that are contained in any of the sets in the input. It was one of Karp's 21 NP-complete problems shown to be NP-complete in 1972.

More formally, given a universe \mathcal{U} and a family \mathcal{S} of subsets of \mathcal{U}, a cover is a subfamily \mathcal{C}\subseteq\mathcal{S} of sets whose union is \mathcal{U}. In the set covering decision problem, the input is a pair (\mathcal{U},\mathcal{S}) and an integer k; the question is whether there is a set covering of size k or less. In the set covering optimization problem, the input is a pair (\mathcal{U},\mathcal{S}), and the task is to find a set covering which uses the fewest sets.

The decision version of set covering is NP-complete, and the optimization version of set cover is NP-hard.

Covering-Packing Dualities
Covering problems Packing problems
Minimum Set Cover Maximum Set Packing
Minimum Vertex Cover Maximum Matching
Minimum Edge Cover Maximum Independent Set

Contents

Integer linear program formulation

The minimum set cover problem can be formulated as the following integer linear program.[2]

minimize \sum_{S \in \mathcal S} c(S) x_S (minimize the total cost)
subject to \sum_{S\colon e \in S} x_S \geq 1 for all e\in \mathcal U (cover every element of the universe)
x_S \in \{0,1\} for all S\in \mathcal S. (every set is either in the set cover or not)

This ILP belongs to the more general class of ILPs for covering problems. The integrality gap of this ILP is at most logn, so its relaxation gives a factor-logn approximation algorithm for the minimum set cover problem (where n is the size of the universe).[3]

Hitting set formulation

Set covering is equivalent to the Hitting set problem. It is easy to see this by observing that an instance of set covering can be viewed as an arbitrary bipartite graph, with sets represented by vertices on the left, the universe represented by vertices on the right, and edges representing the inclusion of elements in sets. The task is then to find a minimum cardinality subset of left-vertices which covers all of the right-vertices. In the Hitting set problem, the objective is to cover the left-vertices using a minimum subset of the right vertices. Converting from one problem to the other is therefore achieved by interchanging the two sets of vertices.

Greedy algorithm

The greedy algorithm for set covering chooses sets according to one rule: at each stage, choose the set which contains the largest number of uncovered elements. It can be shown that this algorithm achieves an approximation ratio of H(s), where s is the size of the largest set and H(n) is the n-th harmonic number:

 H(n) = \sum_{k=1}^{n} \frac{1}{k} \le \ln{n} +1
Tight example for the greedy algorithm with k=3

There is a standard example on which the greedy algorithm achieves an approximation ratio of log2(n) / 2. The universe consists of n = 2(k + 1) − 2 elements. The set system consists of k pairwise disjoint sets S_1,\ldots,S_k with sizes 2,4,8,\ldots,2^k respectively, as well as two additional disjoint sets T0,T1, each of which contains half of the elements from each Si. On this input, the greedy algorithm takes the sets S_k,\ldots,S_1, in that order, while the optimal solution consists only of T0 and T1. An example of such an input for k = 3 is pictured on the right.

Inapproximability results show that the greedy algorithm is essentially the best-possible polynomial time approximation algorithm for set cover (see Inapproximability results below), under plausible complexity assumptions.

Low-frequency systems

If each element occurs in at most f sets, then a solution can be found in polynomial time which approximates the optimum to within a factor of f as follows: while an uncovered element exists, add all (at most f) sets which contain this element to the cover. In the end we end up with some cover, which we output. To see that this is an f-approximation it suffices to note that every time we add f sets, at least one of those is in an optimal cover as well. For f = 2, this is the normal approximation algorithm for vertex cover.

See k-approximation of k-hitting set for an f-approximation of weighted set cover.

Inapproximability results

Lund & Yannakakis (1994) showed that set covering cannot be approximated in polynomial time to within a factor of (\log_2{n})/2\approx{}0.72\ln{n}, unless NP has quasi-polynomial time algorithms. Feige (1998) improved this lower bound to (1-o(1))\cdot\ln{n} under the same assumptions, which essentially matches the approximation ratio achieved by the greedy algorithm. Raz & Safra (1997) established a lower bound of c\cdot\ln{n}, where c is a constant, under the weaker assumption that P\not=NP. A similar result with a higher value of c was recently proved by Alon, Moshkovitz & Safra (2006).

Related problems

Notes

  1. ^ Vazirani (2001, p. 15)
  2. ^ Vazirani (2001, p. 108)
  3. ^ Vazirani (2001, pp. 110-112)

References

  • Noga Alon, Dana Moshkovitz, and Muli Safra. Algorithmic construction of sets for k-restrictions. ACM Transactions on Algorithms (TALG), v.2 n.2, p.153-177, April 2006.
  • Uriel Feige. A Threshold of ln n for Approximating Set Cover. Journal of the ACM (JACM), v.45 n.4, p.634 - 652, July 1998.
  • Carsten Lund and Mihalis Yannakakis. On the hardness of approximating minimization problems. Journal of the ACM (JACM), v.41 n.5, p.960-981, Sept. 1994
  • Ran Raz and Muli Safra. A sub-constant error-probability low-degree test, and a sub-constant error-probability PCP characterization of NP. Proceedings of STOC 1997, pp. 475-484, 1997.

External links


Search unanswered questions...
Enter a question here...
Search: All sources Community Q&A Reference topics
 
 

 

Copyrights:

Wikipedia. This article is licensed under the Creative Commons Attribution/Share-Alike License. It uses material from the Wikipedia article "Set cover problem" Read more