Details, Explanation and Meaning About Subset sum problem

Subset sum problem Guide, Meaning , Facts, Information and Description

The subset sum problem is an important problem in complexity theory and cryptography. The problem is this: given a set of integers, does any subset sum to exactly zero? For example, given the set { −7, −3, −2, 5, 8}, the answer is YES because the subset { −3, −2, 5} sums to zero. The problem is NP-Complete, and is perhaps the simplest such problem to describe.

An equivalent problem is this: given a set of integers and an integer s, does any subset sum to s? Subset sum can also be thought of as a special case of the knapsack problem.

Table of contents
1 General discussion
2 The complexity of subset sum
3 Exponential time algorithm
4 Pseudo-polynomial time dynamic programming solution
5 Polynomial time approximation algorithm
6 Generalisations
7 References
8 External link

General discussion

The subset sum problem is a good introduction to the NP-complete class of problems. There are two reasons for this

Subset Sum Problem (as well as all other NP-complete problems) have the following interesting feature. Imagine you had a magical way of solving the problem in polynomial time. If the answer is positive, then you can say "it does not matter how I did it, here is the solution and here is the proof that it is right (it adds up)". However, what if someone, without your knowledge, gives you an infeasible instance of subset sum? Your magical solution algorithm can not deliver a feasible solution. It will either fail, go on for ever, or deliver a solution for which there is no proof of correctness and hence no one will believe you.

In either case, after a polynomial number of steps, you can conclude that there is no solution. This means that having a polynomial time algorithm for Subset Sum also allows, given an infeasible instance, to prove that it is not feasible in polynomial time.

Most physical problems in life can be solved quite nicely with a +/- 1% error. Being asked to solve a subset sum problem for 100-digit numbers with a precision of +/-10−100 might seem silly and irrelevant. There are two reasons why this is not quite the case.

First, Subset Sum gives a precise statement of the logical complexity of a class of problems (NP-complete problems). Solving it exactly would mean solving all problems in this well-defined class. Solving it with +/- 1% error would mean that the algorithm would become useless for some of the other problems. Second, in at least one context, it is actually important to solve Subset Sum exactly. In cryptography, Subset Sum problem comes up when a codebreaker attempts, given a message and ciphertext, to deduce the secret key. A key that is within +/- 1% of the real key is essentially useless for the codebreaker.

The cases when an approximate solution is sufficient have also been studied, in the field of approximation algorithms. We give one such algorithm for Subset Sum later in this article.

The complexity of subset sum

The complexity of subset sum can be viewed as depending on two parameters, N, the number of decision variables, and P, the precision of the problem (stated as the number of binary place values that it takes to state the problem).

The complexity of the best known algorithms is exponential in the smaller of the two parameters N and P. Thus, the problem is most difficult if N and P are of the same order but it only becomes easy if either N or P becomes very small. We give efficient algorithms for both of those cases below.

Exponential time algorithm

There are several ways to solve subset sum in time exponential in N. The most naive algorithm would be to cycle through all subsets of N numbers and, for every one of them, check if the subset sums to the right number. The running time is of order O(2NN), since there are 2N subsets and, to check each subset, we need to sum at most N elements.

A better exponential time algorithm is known, which runs in time O(2N/2N). The algorithm splits the N elements into two sets of N/2 each. For each of these two sets, it calculates sums of all 2N/2 possible subsets of its elements and stores them in an array of length 2N/2. It then sorts each of these two arrays, which can be done in time O(2N/2N). When arrays are sorted, the algorithm can check if an element of the first array and an element of the second array sum up to s in time O(2N/2). To do that, the algorithm passes through the first array in decreasing order (starting at the largest element) and the second array in increasing order (starting at the smallest element). Whenever the sum of the current element in the first array and the current element in the second array is more than s, the algorithm moves to the next element in the first array. If it is less than s, the algorithm moves to the next element in the second array. If two elements with sum s are found, it stops.

Pseudo-polynomial time dynamic programming solution

Suppose the sequence is x1, ..., xn and we wish to find a subset which sums to 0. Let N be the sum of the negative values and P the sum of the positive values. Define the boolean valued function Q(i,s) to be the value (true or false) of "there a subset of x1, ..., xi which sums to s". (Thus, the value we really want is Q(n,0).) Clearly Q(i,s)=false if s<N or s>P. Create an array to hold the values Q(i,s) for 1≤i≤n and N≤s≤P. The array can now be filled in using a simple recursion. Q(1,s) = "x1=0 or x1=s". For i>1, Q(i,s) = Q(i-1,s) or Q(i-1,s-xi). The total number of arithmetic operations is O(n(P-N)). For example, if all the values are O(nk) for some k, then the time required is O(nk+1).

This solution does not count as polynomial time in complexity theory because P-N is not polynomial in the size of the problem, which is the number of bits used to represent it.

Polynomial time approximation algorithm

An approximate version of the subset sum would be: given a set of N numbers x1, x2, ..., xN and a number s, output

If all numbers are non-negative, the approximate subset sum is solvable in time polynomial in N and 1/c.

The solution for subset sum also provides the solution for the original subset sum problem in the case if the numbers are small (again, for nonnegative numbers). If any sum of the numbers can be specified with at most P bits, then solving the problem approximately with c=2-P is equivalent to solving it exactly. Then, the polynomial time algorithm for approximate subset sum becomes an exact algorithm with running time polynomial in N and 2P (i.e., exponential in P).

The algorithm for the approximate subset sum problem is as follows:

 initialize a list S to contain one element 0.
 for each i from 1 to N do
   let T be a list consisting of xi+y, for all y in S
   let U be the union of T and S
   sort U
   make S empty 
   let y be the smallest element of U 
   add y to S 
   for each element z of U in increasing order do //trim the list by eliminating numbers close one to another
     if y<(1-c/N)z, set y=z and add z to S 
 if S contains a number between (1-c)s and s, output yes, otherwise no
The algorithm is polynomial time because the lists S, T and U always remain of size polynomial in N and 1/c and, as long as they are of polynomial size, all operations on them can be done in polynomial time. The size of lists is kept polynomial by the trimming step, in which we only include a number z into S if the previous y is at most (1-c/N)z. This step ensures that each element in S is smaller than the next one by at least a factor of (1-c/N) and any list with that property is of at most polynomial size.

The algorithm is correct because each step introduces a multiplicative error of at most (1-c/N) and N steps together introduce the error of at most (1-c/N)N<1-c.

Generalisations

Although the problem was described above in terms of integers and addition, it can actually be defined using any group. For example, the problem could be: given an integer n and a set of integers in the range [0, n − 1], does any subset sum to zero modulo n? This form of the problem has been used as a basis for several public key cryptography systems. However, most of them have been broken, reducing confidence in those still unbroken. For some reason, it is traditional in cryptography to say "knapsack problem" when it is actually the subset sum problem that is meant.

No algorithm is known for which we can prove that it solves subset sum in polynomial time. If any such algorithms exist, then some of them are already known. See the bottom of the complexity classes P and NP for one such algorithm.

References

  1. T. Cormen, C. Leiserson, R. Rivest. Introduction to Algorithms. MIT Press, 1990. Chapter 37.4, The subset-sum problem.

External link

An algorithm (exponential time) for solving the subset sum problem

This is an Article on Subset sum problem. Page Contains Information, Facts Details or Explanation Guide About Subset sum problem


Google
 
Web www.E-paranoids.com

Search Anything