In the literature, people use the 2-partition and 3-partition problems for proving the complexity of scheduling problems. Can anyone please help me distinguish these two problems for the purpose of complexity proofs?
The only difference between 2-partition (i.e., the Partition problem) and the 3-partition problem is the structure of the required partition. In the Partition problem, you need to find two sets S_1 and S_2 such that the sum of the elements in S_1 equals the sum of the elements in S_2. In 3-partition, you are given 3m integers and must partition them into m triples, each with the same sum. It is worth mentioning that these two problems have quite different complexity properties.
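To make the distinction concrete, here is a small brute-force sketch of both decision problems (the function names and the exponential enumeration are mine, just for illustration; this is not how one would actually solve them):

```python
from itertools import combinations, permutations

def has_partition(a):
    """2-partition (PARTITION): can the numbers in a be split into two
    subsets with equal sums?  Brute force over all subsets, exponential."""
    total = sum(a)
    if total % 2:
        return False
    half = total // 2
    return any(sum(c) == half
               for r in range(len(a) + 1)
               for c in combinations(a, r))

def has_3_partition(a):
    """3-partition: can the 3*m numbers in a be split into m triples that
    all have the same sum?  Brute force over orderings, exponential."""
    if not a or len(a) % 3:
        return False
    m = len(a) // 3
    total = sum(a)
    if total % m:
        return False
    target = total // m
    return any(all(sum(p[i:i + 3]) == target for i in range(0, len(p), 3))
               for p in permutations(a))

print(has_partition([3, 1, 1, 2, 2, 1]))             # True: {3, 2} vs {1, 1, 2, 1}
print(has_3_partition([1, 2, 6, 2, 3, 4, 5, 3, 1]))  # True: (1,2,6), (2,3,4), (5,3,1)
```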
The one I see more frequently is plain Partition, as it can naturally be employed to show that makespan minimization on identical machines (when m = 2) is NP-hard (i.e., its decision variant is NP-complete). Keep in mind this works because the two sets can be thought of as the two machines, and since the goal is to minimize the makespan, one needs to assign the loads so that the machines are as "balanced" as possible. It is easy to characterize identical machines in this fashion.
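For illustration, here is a minimal sketch of that reduction, assuming the PARTITION instance is given as a list of positive integers (the function name is mine): the constructed scheduling instance has one job per number and asks whether a makespan of at most half the total sum is achievable.

```python
def partition_to_p2_cmax(a):
    """Map a PARTITION instance (a list of positive integers) to an instance of
    the decision version of makespan minimization on two identical machines
    (P2 || Cmax): one job per number, processing time equal to that number,
    and the question 'is there a schedule with makespan at most sum(a)/2?'.
    Such a schedule exists iff the numbers can be split into two subsets of
    equal sum, so a polynomial algorithm for the scheduling problem would
    solve PARTITION."""
    jobs = list(a)           # processing times of the jobs
    threshold = sum(a) / 2   # makespan bound; reachable only with a balanced split
    return jobs, threshold
```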
Quite often I also encounter NP-complete matching problems used in reductions for scheduling problems. It really depends on what you are after for the objective. The reason is that an assignment of jobs is very similar to the idea of matching vertices based on certain properties in a graph. Also, it is worth noting that 3-Partition itself can be shown to be NP-hard by a reduction from one such matching problem (3-dimensional matching).
I am looking at a lateness objective. Can you please tell me whether I should use the 2-partition or the 3-partition problem? What if the objective is tardiness minimization?
Hmm, that is a good question. I haven't worked as much with that particular objective (though I am very aware of it). Maybe somebody else has a good answer for that one. What I could suggest is seeing whether you can tinker with each in a clever way to obtain the reduction. I would also suggest considering a special case of your problem: if the special case is hard, then the full version of your problem is at least as difficult as that special case.
2-partition is weakly NP-complete (it is solvable in pseudo-polynomial time), whereas 3-partition is strongly NP-complete. So the choice of problem depends on what you want to prove.
It is a common misconception that 3-partition involves partitioning a set of numbers into three subsets of equal weight. Actually, it is partitioning it into triples of equal weight.
2-partition is NP-hard in the weak sense, whereas 3-partition is NP-hard in the strong sense. If you can solve your scheduling problem in pseudo-polynomial time (e.g., via dynamic programming), it cannot be strongly NP-hard (unless P = NP), and therefore there is no point in trying to reduce 3-partition to it in order to prove strong NP-hardness.
An algorithm is said to run in "pseudo-polynomial time" if its running time is bounded by a polynomial in the size of the input when the input data is expressed in unary rather than binary. Some combinatorial optimization problems can be solved in pseudo-polynomial time, usually via dynamic programming. For example, the knapsack problem with n items and knapsack capacity Q can be solved in O(nQ) time using the standard Bellman dynamic programming recursion. This running time is only pseudo-polynomial: the number of binary bits needed to encode the number Q is only log_2 Q, and therefore O(nQ), though polynomial in n and Q, is exponential in the number of bits needed to encode Q.
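For concreteness, here is a minimal sketch of that Bellman recursion for the 0/1 knapsack problem (the variable names are my own):

```python
def knapsack_max_value(values, weights, Q):
    """0/1 knapsack via the standard Bellman recursion.
    dp[c] = best value achievable with capacity c.  Runs in O(n * Q) time,
    which is pseudo-polynomial: Q needs only about log2(Q) bits to encode."""
    dp = [0] * (Q + 1)
    for v, w in zip(values, weights):
        for c in range(Q, w - 1, -1):   # go downwards so each item is used at most once
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[Q]

print(knapsack_max_value([60, 100, 120], [10, 20, 30], 50))  # 220
```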
A combinatorial optimisation problem is strongly NP-hard if it remains NP-hard even when all of the numbers appearing in the input are bounded by a polynomial in the length of the input. For example, the bin packing problem with n items and bin capacity Q is strongly NP-hard. That is, even when the bin packing problem is restricted to instances for which Q is bounded by a polynomial in n, it remains NP-hard.
A problem that is NP-hard but not strongly NP-hard is called weakly NP-hard. For example, the knapsack problem is weakly NP-hard. Indeed, if a problem can be solved by a pseudo-polynomial algorithm, then it cannot be strongly NP-hard (unless P = NP).
2-partition is weakly NP-hard, whereas 3-partition is strongly NP-hard. Thus, to prove that a problem is weakly NP-hard, it suffices to (a) reduce 2-partition to it, which establishes NP-hardness, and (b) find a pseudo-polynomial algorithm for it, which rules out strong NP-hardness (unless P = NP). To prove that a problem is strongly NP-hard, it suffices to reduce 3-partition to it, provided the reduction keeps the numbers in the constructed instance polynomially bounded (a pseudo-polynomial reduction).
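As an illustration of step (b), here is the textbook pseudo-polynomial dynamic program that decides 2-partition itself (a minimal sketch of my own, assuming the input is a list of positive integers):

```python
def partition_decision(a):
    """Decide 2-PARTITION by dynamic programming over reachable subset sums.
    Runs in O(n * S) time with S = sum(a), i.e. pseudo-polynomial, since S can
    be exponential in the number of bits needed to write the input down."""
    total = sum(a)
    if total % 2:
        return False
    half = total // 2
    reachable = [False] * (half + 1)
    reachable[0] = True
    for x in a:
        for s in range(half, x - 1, -1):   # downwards, each number used at most once
            reachable[s] = reachable[s] or reachable[s - x]
    return reachable[half]

print(partition_decision([3, 1, 1, 2, 2, 1]))  # True
```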
Finally, note that the term "NP-complete" applies only to decision problems. For optimisation problems, one uses the term "NP-hard". The NP-hard optimisation problems are those that are at least as hard to solve as the NP-complete decision problems.
If you have an algorithm that solves your problem in pseudo-polynomial time, and you believe the problem is intractable, then you may try to reduce the weakly NP-hard PARTITION problem to it. By the same token, it is of no use to try to reduce the strongly NP-hard 3-PARTITION problem to your problem in the hope of proving strong NP-hardness, because a well-known theorem of computational complexity says that, under very mild conditions, a problem that can be solved in pseudo-polynomial time is not NP-hard in the strong sense, unless P = NP.
A common mistake is to believe that the polynomial reduction used in complexity theory to define NP-completeness preserves the strong sense. This is absolutely false:
- By definition of NP-complete, ANY NP-complete problem can be reduced to ANY other NP-complete problem.
- A reduction from an NP-complete problem in the strong sense, say 3-Partition, does not prove that your problem is NP-complete in the strong sense. It simply proves that your problem is NP-complete. In particular, there exists a reduction from 3-PARTITION to PARTITION.
- Conversely, a reduction from an NP-complete problem in the ordinary sense does not prevent your problem from being NP-complete in the strong sense. In particular, there exists a reduction from PARTITION to 3-PARTITION.
To preserve the strong sense, you need more than the usual reduction. In their book, Garey & Johnson define, around page 100, the notion of a pseudo-polynomial reduction for this purpose. Basically, it ensures that in the transformation the largest number does not grow more than polynomially and the size of the encoding does not shrink more than polynomially.
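From memory, the conditions in Garey & Johnson are roughly the following (the notation here is mine: Length[I] is the encoding length of instance I, Max[I] is the largest number appearing in it, and Y denotes the set of yes-instances). A pseudo-polynomial transformation f from problem Pi to problem Pi' must satisfy:

```latex
% Pseudo-polynomial transformation f from \Pi to \Pi' (following Garey & Johnson):
\begin{itemize}
  \item $I \in Y_{\Pi} \iff f(I) \in Y_{\Pi'}$ (yes-instances map to yes-instances);
  \item $f(I)$ is computable in time polynomial in $\mathrm{Max}[I]$ and $\mathrm{Length}[I]$;
  \item there is a polynomial $q_1$ with $\mathrm{Length}[I] \le q_1(\mathrm{Length}'[f(I)])$
        (the encoding does not shrink more than polynomially);
  \item there is a polynomial $q_2$ with $\mathrm{Max}'[f(I)] \le q_2(\mathrm{Max}[I], \mathrm{Length}[I])$
        (the numbers do not blow up more than polynomially).
\end{itemize}
```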