Can anyone recommend implementations that parallelize the "information gain" step of a decision tree building algorithm (such as C4.5)? Preferably using Hadoop, but I would also be interested in generic tips.
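
To make the question concrete, here is a minimal single-machine sketch of the step I have in mind. It assumes categorical attributes and a class label in the last column of each row. The function names (`map_partition`, `reduce_counts`, `information_gain`) are hypothetical and not taken from Hadoop or any library. The idea is that each data partition is counted independently (the "map" part), the counts are summed (the "reduce" part), and the gain is then computed from the merged counts alone.

```python
import math
from collections import Counter
from functools import reduce


def map_partition(rows):
    """Map step: count (attribute index, value, class) triples in one partition."""
    counts = Counter()
    for *features, label in rows:
        for i, value in enumerate(features):
            counts[(i, value, label)] += 1
    return counts


def reduce_counts(a, b):
    """Reduce step: merge two partial count tables by summing them."""
    return a + b


def entropy(class_counts):
    """Shannon entropy (base 2) of a collection of class counts."""
    total = sum(class_counts)
    return -sum(c / total * math.log2(c / total) for c in class_counts if c)


def information_gain(counts, attr):
    """Information gain of attribute `attr`, computed from merged counts only."""
    by_value = {}
    class_totals = Counter()
    for (i, value, label), n in counts.items():
        if i != attr:
            continue
        by_value.setdefault(value, Counter())[label] += n
        class_totals[label] += n
    total = sum(class_totals.values())
    # Weighted entropy of the subsets produced by splitting on `attr`.
    remainder = sum(
        sum(c.values()) / total * entropy(c.values()) for c in by_value.values()
    )
    return entropy(class_totals.values()) - remainder


# Toy data split into two partitions; each row is (attr0, attr1, class).
partitions = [
    [("sunny", "yes", "no"), ("rain", "no", "yes")],
    [("sunny", "no", "no"), ("rain", "yes", "yes")],
]
# The per-partition calls are independent, so they could run in parallel.
merged = reduce(reduce_counts, (map_partition(p) for p in partitions))
gains = {attr: information_gain(merged, attr) for attr in (0, 1)}
```

This only covers plain information gain on categorical data. C4.5 additionally uses the gain ratio and handles continuous attributes, which this sketch does not attempt.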
