Professor Kelsey has given a very nice answer. In addition to that information, you may consult the following reference papers:
[1] Sathyadevan, S., & Nair, R. R. (2015). Comparative analysis of decision tree algorithms: ID3, C4.5 and random forest. In Computational Intelligence in Data Mining - Volume 1 (pp. 549-562). Springer, New Delhi.
[2] Ali, J., Khan, R., Ahmad, N., & Maqsood, I. (2012). Random forests and decision trees. IJCSI International Journal of Computer Science Issues, 9(5), 272-278.
[3] Robnik-Šikonja, M. (2004, September). Improving random forests. In European Conference on Machine Learning (pp. 359-370). Springer, Berlin, Heidelberg.
[4] Banfield, R. E., Hall, L. O., Bowyer, K. W., & Kegelmeyer, W. P. (2007). A comparison of decision tree ensemble creation techniques. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(1), 173-180.
[5] Pumpuang, P., Srivihok, A., & Praneetpolgrang, P. (2008, October). Comparisons of classifier algorithms: Bayesian network, C4.5, decision forest and NBTree for course registration planning model of undergraduate students. In 2008 IEEE International Conference on Systems, Man and Cybernetics (pp. 3647-3651). IEEE.
C4.5 takes the training data and generates a single tree. It can handle continuous and categorical attributes as well as missing values, and it revisits the tree after construction to delete nodes or modify the internal structure. It is a very good classification method, but you still have to guard against overfitting. So you typically use the tree to classify held-out test data and use the results to prune the original tree (if needed), striking a balance between training error and the error you get when generalizing to unseen data.
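As a minimal sketch of that prune-for-generalization idea: scikit-learn does not ship C4.5 itself (its DecisionTreeClassifier is an optimized CART variant), but the same effect can be illustrated with cost-complexity pruning, comparing an unpruned tree against a pruned one on held-out data. The dataset and the ccp_alpha value here are just illustrative choices.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# An unpruned tree fits the training data very closely and may overfit.
unpruned = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Cost-complexity pruning (ccp_alpha > 0) removes subtrees that add little
# predictive value, trading a bit of training error for better generalization.
pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0).fit(
    X_train, y_train)

print("unpruned train/test:",
      unpruned.score(X_train, y_train), unpruned.score(X_test, y_test))
print("pruned   train/test:",
      pruned.score(X_train, y_train), pruned.score(X_test, y_test))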
A random forest is a method in which you build many (often a few thousand) classification trees. For each tree you sample your training instances with replacement, and at each node in each tree you consider only a random subset of the attributes. The classifier you get is an aggregate of all these trees; there isn't a single tree that you can draw and analyse. For technical reasons, random forests generally don't overfit, so there is much less need to investigate generalization error.
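A sketch of that recipe, again assuming scikit-learn: bootstrap resampling, a random attribute subset per split, and majority-vote aggregation are all exposed as parameters of RandomForestClassifier. The out-of-bag score gives exactly the cheap generalization estimate mentioned above.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

forest = RandomForestClassifier(
    n_estimators=1000,    # build many trees, as described above
    bootstrap=True,       # each tree sees a resample (with replacement)
    max_features="sqrt",  # each split considers a random attribute subset
    oob_score=True,       # out-of-bag estimate of generalization error
    random_state=0,
).fit(X_train, y_train)

# The prediction is an aggregate (majority vote) over all the trees.
print("test accuracy:", forest.score(X_test, y_test))
print("out-of-bag estimate:", forest.oob_score_)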
So the main differences are: C4.5 produces a single tree, which you have to think about pruning to guard against overfitting, whereas a random forest classifier is an aggregate of several thousand trees generated from random subsets of your data and attributes.