It can ensemble the output of any classifier, whereas Random Forest and Extra Trees only ensemble the output of decision trees. However, Random Forest is likely to provide the best and most stable results among the bagging methods, although this ultimately depends on the problem and the training data.
Therefore, tree ensemble methods are better than simple decision trees, but is one ensemble better than the other? Which one should you use? This post tries to answer these questions by studying the differences between Extra Trees and Random Forest and comparing the results of both.
The two ensembles have a lot in common. Both are composed of a large number of decision trees, and the final prediction is obtained by aggregating the predictions of every tree: by majority vote in classification problems and by the arithmetic mean in regression problems. Furthermore, both algorithms follow the same tree-growing procedure (with one exception, explained below), and when splitting a node, both of them randomly choose a subset of candidate features.
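To make this shared structure concrete, here is a minimal sketch that trains both ensembles with scikit-learn on a toy dataset; the dataset, hyperparameters, and random seeds are arbitrary choices for illustration only.

```python
# Minimal comparison of the two ensembles on a synthetic classification task.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Both ensembles build many decision trees and aggregate their predictions
# (majority vote here, since this is a classification problem).
rf = RandomForestClassifier(n_estimators=100, random_state=42)
et = ExtraTreesClassifier(n_estimators=100, random_state=42)

print("Random Forest:", cross_val_score(rf, X, y, cv=5).mean())
print("Extra Trees:  ", cross_val_score(et, X, y, cv=5).mean())
```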
So, the two main differences are the following:
The first is that Random Forest uses bootstrap replicas, that is to say, it subsamples the input data with replacement, whereas Extra Trees uses the whole original sample. In the scikit-learn implementation of Extra Trees there is an optional parameter that lets you enable bootstrap replicas, but by default it uses the entire input sample. Using the whole sample rather than bootstrap replicas may increase variance, since bootstrapping makes the trees more diversified.
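The snippet below spells out these defaults as a sketch; the parameter values are written explicitly only for illustration and match the current scikit-learn defaults.

```python
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier

# Random Forest resamples the training data with replacement by default.
rf = RandomForestClassifier(n_estimators=100, bootstrap=True)

# Extra Trees uses the whole original sample by default...
et_default = ExtraTreesClassifier(n_estimators=100, bootstrap=False)

# ...but bootstrapping can be switched on through the optional parameter.
et_bootstrapped = ExtraTreesClassifier(n_estimators=100, bootstrap=True)
```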
The second difference is the selection of cut points used to split nodes. Random Forest chooses the optimum split for each candidate feature, while Extra Trees draws it at random. However, once the candidate split points are selected, both algorithms choose the best one among the subset of features. Therefore, Extra Trees adds randomization but still performs optimization.
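The following toy sketch illustrates that difference at a single node; it is not scikit-learn's actual implementation, and the data, the feature subset size, and the impurity helper are all hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))            # samples reaching the current node
y = (X[:, 0] + X[:, 2] > 0).astype(int)  # toy binary labels

def impurity_after_split(feature_values, threshold, labels):
    """Weighted Gini impurity of the two children produced by a split."""
    left = labels[feature_values <= threshold]
    right = labels[feature_values > threshold]
    def gini(part):
        if len(part) == 0:
            return 0.0
        p = np.bincount(part, minlength=2) / len(part)
        return 1.0 - np.sum(p ** 2)
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Both methods restrict the search to a random subset of features.
features = rng.choice(X.shape[1], size=3, replace=False)

# Random Forest style: search every candidate threshold for each feature.
rf_candidates = []
for f in features:
    best = min(impurity_after_split(X[:, f], t, y) for t in np.unique(X[:, f]))
    rf_candidates.append((best, f))

# Extra Trees style: draw one threshold uniformly at random per feature.
et_candidates = []
for f in features:
    t = rng.uniform(X[:, f].min(), X[:, f].max())
    et_candidates.append((impurity_after_split(X[:, f], t, y), f))

# Both then keep the best candidate split among the feature subset.
print("Random Forest splits on feature", min(rf_candidates)[1])
print("Extra Trees splits on feature  ", min(et_candidates)[1])
```

The final `min` in both branches is the optimization step that the two algorithms share: the randomness in Extra Trees only affects how each feature's threshold is proposed, not how the winning split is chosen.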