I always recommend watching the crystal-clear lectures of Prof. Ng (Stanford University) about bias/variance: https://www.coursera.org/learn/machine-learning/lecture/yCAup/diagnosing-bias-vs-variance
This MathWorks/MATLAB page is extremely well explained and illustrated and should be helpful: http://uk.mathworks.com/help/nnet/ug/improve-neural-network-generalization-and-avoid-overfitting.html?refresh=true
Finally, here is a related RG discussion: https://www.researchgate.net/post/How_to_Avoid_Overfitting
Haykin has a great section on this particular aspect of training neural networks. There is a relationship between how representative your training data is of the underlying data space and the capacity of the network itself.
The relevant references are:
- Haykin, S., Neural Networks and Learning Machines, 3rd ed., McMaster University.
- Haykin, S., Neural Networks: A Comprehensive Foundation, 1999.
Overfitting occurs because the network has memorized the training data but has not learned to generalize to new situations. This happens especially when you have noisy data or a small data set.
It also happens when the number of parameters in the network is close to the number of training examples. If you have sufficient training data, the number of parameters will be much smaller than the number of training examples, and you can avoid overfitting.
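To make the point above concrete, here is a small sketch (my own illustration, not from the thread, using polynomial regression as a stand-in for a network) showing how a model with roughly as many parameters as training points memorizes noisy data and generalizes poorly, while a lower-capacity model does not:

```python
# Sketch: overfitting appears when model capacity approaches the
# number of training samples. We fit polynomials to 10 noisy points
# drawn from a quadratic and compare train vs. test error.
import numpy as np

rng = np.random.default_rng(0)

# Small, noisy training set: 10 points from y = x^2 plus noise.
x_train = np.linspace(-1, 1, 10)
y_train = x_train**2 + rng.normal(0, 0.1, size=x_train.shape)

# Clean test set from the true underlying function.
x_test = np.linspace(-1, 1, 100)
y_test = x_test**2

def fit_and_eval(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse

# Degree 9 means 10 parameters = number of training points:
# the fit interpolates the noise exactly (near-zero train error)
# but its test error is worse than the modest degree-2 model.
for degree in (2, 9):
    tr, te = fit_and_eval(degree)
    print(f"degree {degree}: train MSE {tr:.4f}, test MSE {te:.4f}")
```

The near-zero training error of the degree-9 fit together with its larger test error is exactly the memorization-without-generalization behavior described above; the same diagnostic (training error far below validation error) applies to a neural network whose weight count rivals the size of the training set.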