The Hadoop MapReduce-based approach is linearly scalable, i.e., time efficiency improves as the number of processors increases. What is the theoretical limit for increasing the number of processors in a cluster?
See Amdahl's law. Since a parallel program will still contain serial parts, e.g. blocking message exchange, a given problem will stop scaling out once the time needed to exchange results exceeds the time needed to actually compute them. One will usually see some kind of plateau, after which no further speedup occurs, and a performance drop with even more cores is likely.
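For reference, Amdahl's law gives the speedup S achievable with N processors when a fraction p of the work can be parallelized, and the hard limit as N grows:

$$S(N) = \frac{1}{(1 - p) + p/N}, \qquad \lim_{N \to \infty} S(N) = \frac{1}{1 - p}$$

For example, if 95% of a job is parallelizable (p = 0.95), the maximum speedup is 1/0.05 = 20x, no matter how many nodes you add.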
Unfortunately this question can't be answered directly, since the answer depends heavily on the software implementation and the hardware properties, so it must be benchmarked.
The most important factor is network latency between the nodes of the cluster, so a high-speed interconnect such as QDR/FDR InfiniBand is highly desirable.
When it comes to the mathematics behind MapReduce, there are two basic properties required of any operation performed with it: 1) it must be a catamorphism, and 2) it must form a monoid, i.e. combine the existence of a neutral element with associativity. Neither of these, however, gives a mathematical model for predicting the optimal number of nodes for processing data with MapReduce. (See the sketch below for what the monoid property looks like in practice.)
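As an illustration of the monoid requirement, here is a minimal sketch of a summing reducer in the style of the classic Hadoop word-count example (the class and field names here are my own, not part of any Hadoop API). Because integer addition is associative and has 0 as its neutral element, the same class can also safely be registered as a combiner:

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Integer addition forms a monoid: it is associative and has a neutral
// element (0). That is why partial sums computed by a combiner on each node
// can be merged in any grouping/order and still yield the same final result.
public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
  private final IntWritable result = new IntWritable();

  @Override
  public void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0;                        // 0 is the neutral element
    for (IntWritable val : values) {
      sum += val.get();                 // associative operation
    }
    result.set(sum);
    context.write(key, result);
  }
}
```

In a job driver you would typically set this class as both the combiner and the reducer; doing so is only correct because of those two properties.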
To answer your question: linear scalability of MapReduce over what? Over growing data set size, or over a growing number of nodes (or processors)? If the data set size is fixed, the optimal number of nodes can be found experimentally; and if the data set is fairly small, a few nodes should be enough. It also depends on which nodes hold your data. Say you have 10 nodes and, after inserting some 1 TB of data, that data ends up stored on only 5 of them: the other nodes will not take part in the MapReduce job over that data, contributing (conceptually) only the neutral element.
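As a rough back-of-the-envelope illustration (assuming the default 128 MB HDFS block size, which your cluster configuration may override), a fixed 1 TB data set yields at most about 8192 map tasks, so adding capacity beyond that point cannot speed up the map phase:

$$\frac{1\,\text{TB}}{128\,\text{MB per block}} = \frac{1{,}048{,}576\,\text{MB}}{128\,\text{MB}} = 8192\ \text{blocks} \;\Rightarrow\; \text{at most } 8192 \text{ map tasks}$$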