While training a neural network model on multiclass datasets using Keras, earth-analytics, and TensorFlow, the variables "accuracy" and "loss" were reported for each epoch.
Now the questions are:
How should these variables be interpreted?
How do they affect the behavior of the trained model?
How should the appropriate number of epochs be chosen for better model performance?
The loss function computes the distance between the model's current output and the expected output. Loss functions fall into two major categories depending on the type of learning task: regression losses (e.g. mean squared error) and classification losses (e.g. cross-entropy loss). Accuracy is the fraction of data points that are predicted correctly out of all the data points.
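To make the difference between the two metrics concrete, here is a minimal sketch in plain Python. The predicted probabilities and labels are made-up illustrative values, and `cross_entropy` and `accuracy` are hypothetical helper names, not Keras functions.

```python
import math

def cross_entropy(probs, labels):
    """Mean categorical cross-entropy over all samples.

    probs[i] is the list of predicted class probabilities for sample i,
    labels[i] is the integer index of the true class.
    """
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        total += -math.log(max(p[y], eps))
    return total / len(labels)

def accuracy(probs, labels):
    """Fraction of samples whose most probable class equals the true class."""
    correct = 0
    for p, y in zip(probs, labels):
        predicted = max(range(len(p)), key=lambda k: p[k])  # argmax
        if predicted == y:
            correct += 1
    return correct / len(labels)

# Made-up predictions for three samples and three classes (illustration only).
probs = [
    [0.7, 0.2, 0.1],  # true class 0, predicted correctly
    [0.1, 0.8, 0.1],  # true class 1, predicted correctly
    [0.4, 0.5, 0.1],  # true class 0, predicted as class 1
]
labels = [0, 1, 0]

loss = cross_entropy(probs, labels)
acc = accuracy(probs, labels)
print(f"loss = {loss:.4f}, accuracy = {acc:.4f}")
```

Accuracy only looks at which class has the highest probability, while the loss also depends on how confident the model is. That is why the loss can keep changing between epochs even when the accuracy stays flat.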
The number of epochs is determined experimentally: train until the model reaches the best solution it can. If increasing the number of epochs does not lead to a better solution, there is no need to increase it. In other words, select the smallest number of epochs that leads to the best possible results.
Maybe this will help with the number of epochs. Yes, it is ultimately trial and error, but there is a thing called callbacks
https://keras.io/api/callbacks/
that can be of help. A callback is essentially a way to evaluate the performance after each epoch and perform operations mid-training. Two things that could help your model perform better with callbacks are:
1) A stopping criterion. If the loss/accuracy hasn't improved for x epochs, stop the training so as not to waste time and resources.
2) An evaluation criterion. Save the model that behaved best up to that point. That way, even if the loss fluctuates, you still retrieve the best-behaving model in the end. (I would opt for the validation loss rather than the training loss, by the way.)
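The logic behind these two criteria can be sketched in plain Python as follows. This is only an illustration: the validation losses are made-up numbers, and `run_with_early_stopping` and its `patience` argument are hypothetical names. It is not the Keras implementation.

```python
def run_with_early_stopping(val_losses, patience=3):
    """Walk through per-epoch validation losses and apply both criteria.

    Returns (best_epoch, best_loss, last_epoch), where last_epoch is the
    epoch at which training stopped.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    last_epoch = 0

    for epoch, val_loss in enumerate(val_losses, start=1):
        last_epoch = epoch
        if val_loss < best_loss:
            # Evaluation criterion: remember the best model seen so far
            # (a real training loop would save the weights here).
            best_loss = val_loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            # Stopping criterion: give up after `patience` epochs
            # without any improvement.
            if epochs_without_improvement >= patience:
                break

    return best_epoch, best_loss, last_epoch

# Made-up validation losses, one per epoch (illustration only).
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.51, 0.53, 0.49]

best_epoch, best_loss, last_epoch = run_with_early_stopping(val_losses, patience=3)
print(f"best epoch: {best_epoch}, best val loss: {best_loss}, stopped at epoch: {last_epoch}")
```

In Keras itself you would not write this loop by hand. The built-in `EarlyStopping` callback (with arguments such as `monitor="val_loss"` and `patience`) and the `ModelCheckpoint` callback (with `save_best_only=True`) are meant for this and are passed to `model.fit` through its `callbacks` argument. Note that in the sketch the loss at epoch 8 would have been slightly better, which shows that the choice of patience is itself a trade-off.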
I would like to add some questions that I think are still on the same topic. I have an issue with imbalanced segmentation data for landslide modelling using U-Net. My questions are:
1. Should I try to find a proper loss function for the imbalance problem, or should I focus on the imbalanced data itself to improve my model?
2. Some suggest using SMOTE (oversampling), but since my data are images (3D), I have found that SMOTE is not suitable for my data. Are there any other suggestions?