A Neural Network can be trained on a large dataset in batches, while a Random Forest typically has to load the entire dataset into memory.
Neural Networks capture smooth non-linear interactions in the data, while Random Forests, being built from decision trees, produce piecewise-constant predictions and may not capture such non-linear relationships as well as Neural Networks.
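To make the "piecewise-constant" point concrete, here is a minimal sketch, not taken from the article: a regression tree of depth 1 (a "stump") fitted to four points on the line y = x. The function names `fit_stump` and `predict_stump` and the toy data are assumptions chosen for illustration; a real Random Forest averages many deeper trees, which makes the steps finer but does not remove them.

```python
def fit_stump(xs, ys):
    """Fit a depth-1 regression tree: one threshold, one mean per side.

    Assumes all x values are distinct so neither side is ever empty.
    """
    pairs = sorted(zip(xs, ys))
    best = None  # (squared error, threshold, left mean, right mean)
    for i in range(1, len(pairs)):
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x < threshold]
        right = [y for x, y in pairs if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((y - left_mean) ** 2 for y in left)
        error += sum((y - right_mean) ** 2 for y in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    """Return the constant stored on the side of the threshold where x falls."""
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean


# Toy data on the straight line y = x (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 3.0, 4.0]
stump = fit_stump(xs, ys)

# Every input on the same side of the threshold gets the same prediction,
# including inputs far outside the training range.
for x in (0.0, 2.4, 2.6, 100.0):
    print(x, predict_stump(stump, x))
```

The prediction jumps at the threshold and stays flat on either side, so the tree approximates the line by steps and does not extrapolate it.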
Neural Networks vs. Random Forests – Does it always have to be Deep Learning?
by Prof. Dr. Peter Roßbach
"How do Neural Networks and Random Forests work? Let’s begin with a short description of both approaches. Both can be used for classification and regression purposes. While classification is used when the target to classify is of categorical type, like creditworthy (yes/no) or customer type (e.g. impulsive, discount, loyal), the target for regression problems is of numerical type, like an S&P500 forecast or a prediction of the quantity of sales.
Neural Networks
Neural Networks represent a universal calculation mechanism based on pattern recognition. The idea is to combine simple units to solve complex problems. These units, also called neurons, are usually organized into several layers that have specific roles.
A Neural Network consists of an input layer, an output layer and, in most cases, one or more hidden layers whose task is to transform the inputs into something that the output layer can use. Neural Networks can process all kinds of data, provided it is encoded in numeric form. The data enters the network via the input layer, is transformed by the hidden layer(s) and is finally scaled to the desired outcome in the output layer. In the case of a creditworthiness assessment, for example, the input neurons would take values for income, education, age and house ownership. The output neuron would then provide the probability of creditworthiness.
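The flow from input layer to output neuron can be sketched as a single forward pass. This is a minimal illustration under stated assumptions, not the article's model: the sigmoid activation, the three hidden neurons, the input scaling to the range 0 to 1 and all weight values are hypothetical. In practice the weights would be learned from training data.

```python
import math


def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: input layer -> one hidden layer -> one output neuron."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hypothetical applicant, each feature already scaled to [0, 1]:
# income, education, age, house ownership.
applicant = [0.6, 0.8, 0.35, 1.0]

# Made-up weights for three hidden neurons (assumption, not learned values).
hidden_weights = [
    [0.5, 0.2, -0.1, 0.4],
    [-0.3, 0.6, 0.2, 0.1],
    [0.1, -0.2, 0.3, 0.5],
]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [0.7, -0.4, 0.9]
output_bias = -0.2

probability = forward(
    applicant, hidden_weights, hidden_biases, output_weights, output_bias
)
print(f"Estimated probability of creditworthiness: {probability:.3f}")
```

Because the output neuron uses a sigmoid, the result always lies strictly between 0 and 1 and can be read as a probability.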
Random Forests
Random Forests belong to the family of decision tree algorithms. A Decision Tree represents a classification or regression model in a tree structure. Each node in the tree represents a feature from the input space, each branch a decision and each leaf at the end of a branch the corresponding output value.
To obtain a result for a specific input object, e.g. a person who applies for credit, the decision process starts at the root node and walks through the tree until it reaches a leaf, which contains the result. At each node, the path to follow depends on the value of that node's feature for the specific input object. In figure 3, for example, the process walks to the left if the person has an income lower than 3000.
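The walk from root to leaf can be sketched in a few lines. Only the root split (income lower than 3000 goes left) comes from the text; the nested-dictionary layout, the second split on `house_owner` and the leaf labels are hypothetical, since figure 3 is not reproduced here.

```python
# A hand-built tree. Inner nodes hold a feature and a threshold,
# leaves hold the result. Everything below the root is an assumption.
tree = {
    "feature": "income",
    "threshold": 3000,
    "left": {
        "feature": "house_owner",
        "threshold": 0.5,
        "left": {"leaf": "not creditworthy"},
        "right": {"leaf": "creditworthy"},
    },
    "right": {"leaf": "creditworthy"},
}


def predict(node, applicant):
    """Start at the root and follow one branch per node until a leaf is reached."""
    while "leaf" not in node:
        value = applicant[node["feature"]]
        # Values below the threshold go left, all others go right.
        node = node["left"] if value < node["threshold"] else node["right"]
    return node["leaf"]


print(predict(tree, {"income": 2500, "house_owner": 0}))
print(predict(tree, {"income": 2500, "house_owner": 1}))
print(predict(tree, {"income": 4200, "house_owner": 0}))
```

Each prediction touches only the nodes along one path, which is also why the decision for a single applicant is easy to trace and explain.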
As with Neural Networks, the tree is built via a learning process using training data. The learning process creates the tree step by step according to the importance of the input features in the context of the specific application. First, using all training data objects, the most important feature is identified by comparing all of the features with a statistical measure. The training data is then subdivided according to the resulting splitting value (3000 for income in figure 3). For every resulting subset, the next most important feature is identified and a new split is created. The chosen features can differ from subset to subset. The process is repeated on each resulting subset until the leaf nodes in all branches of the tree are found.
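One step of this learning process, choosing the feature and splitting value for a node, might look as follows. The text only speaks of "a statistical measure"; Gini impurity is used here as one common choice and is an assumption, as are the function names and the four-row toy dataset.

```python
def gini(labels):
    """Gini impurity: 0.0 for a pure set, larger for more mixed sets."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(rows, labels):
    """Try every feature and every midpoint between neighbouring values.

    Returns (weighted impurity, feature index, threshold) of the best split.
    """
    n = len(rows)
    best = None
    for feature in range(len(rows[0])):
        values = sorted({row[feature] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[feature] < threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] >= threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    return best


# Toy training data (illustrative only): [income, age] and creditworthiness.
rows = [[2000, 25], [2500, 50], [3500, 30], [4000, 45]]
labels = ["no", "no", "yes", "yes"]

score, feature, threshold = best_split(rows, labels)
print(f"Split on feature {feature} at {threshold} (impurity {score})")
```

On this toy data the best split is on income at 3000, which separates the two classes completely. Building the full tree would mean calling `best_split` again on the left and right subsets until every subset is pure or another stopping rule applies.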
Summary
The intention of this blog was to show that Neural Networks, despite their current high visibility in the media, do not always need to be the first choice when selecting a machine learning methodology. Random Forests not only achieve (at least) similarly good performance in practical applications in many areas, they also have some advantages over Neural Networks in specific cases. These include their robustness as well as benefits in cost and time. They are particularly advantageous in terms of interpretability. If we were faced with the choice between a model with 91% accuracy that we understand and a model with 93% accuracy that we do not currently understand, we would probably choose the first one for many applications, for example if the model is supposed to be responsible for examining patients and suggesting medical treatment. These may be the reasons for the increasing popularity of Random Forests in practice."