I would like to ask whether there is any rule for determining the optimal number of PLS components, i.e. the number above which the model overfits and below which it underfits. Is it something like one third of the total number of samples?
There is a great deal of literature on this subject. References can be found at: http://www.chemometry.com/Site%20map.html
Mr. Faber, the owner of the website, is one of the specialists in this topic.
The simple answer to your question is: if your system has N components, you will need N-1 latent variables (factors). That holds only when there is no interaction between the components, Beer's law applies, your sample set is well designed, your measurements are nicely linear, and so on.
In practice, the optimal number is found by applying some form of cross-validation (jack-knife or another scheme). The measurement set is divided into multiple subsets. PLS models with increasing numbers of latent variables are developed on one part and tested on the other(s). This way a seemingly optimal number of latent variables is found in terms of prediction error.
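The cross-validation loop described above can be sketched roughly as follows. This is only a minimal illustration, assuming NumPy is available, a single response variable (PLS1 fitted with a plain NIPALS loop) and synthetic data. The helper names `pls1_fit` and `rmsecv_curve`, the 5-fold split and the simulated three-factor data set are all assumptions made for the example, not something taken from the posts in this thread.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a PLS1 model with NIPALS; returns (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # weight vector: direction of max covariance
        w /= np.linalg.norm(w)
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = yk @ t / tt               # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector B = W (P'W)^-1 q
    return x_mean, y_mean, coef

def rmsecv_curve(X, y, max_components, n_folds=5, seed=0):
    """RMSECV for 1..max_components latent variables using k-fold CV."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    press = np.zeros(max_components)  # accumulated squared prediction errors
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        for a in range(1, max_components + 1):
            x_mean, y_mean, coef = pls1_fit(X[train_idx], y[train_idx], a)
            pred = (X[test_idx] - x_mean) @ coef + y_mean
            press[a - 1] += np.sum((y[test_idx] - pred) ** 2)
    return np.sqrt(press / len(y))

# Synthetic example (assumed): 60 samples, 20 variables, 3 underlying factors.
rng = np.random.default_rng(42)
scores = rng.normal(size=(60, 3))
loadings = rng.normal(size=(3, 20))
X = scores @ loadings + 0.05 * rng.normal(size=(60, 20))
y = scores @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=60)

rmsecv = rmsecv_curve(X, y, max_components=8)
best = int(np.argmin(rmsecv)) + 1     # number of latent variables at the minimum
for a, err in enumerate(rmsecv, start=1):
    print(f"{a} latent variables: RMSECV = {err:.4f}")
print("minimum at", best, "latent variables")
```

With real data you would replace the synthetic `X` and `y` by your own measurements, and you would still keep a separate, untouched validation set for the final error estimate.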
What is crucial is validating the developed model on an independent dataset that was set aside before model building started. The prediction error on that set is the actual prediction error of your developed model. If it turns out that refinement is needed, a new validation set must be measured or collected.
After validating your model, you can check the optimum number of factors (rank) by looking at the root mean square error of prediction (RMSEP) or of cross-validation (RMSECV). When the RMSEP or RMSECV is plotted against the rank (the number of factors used in the model), a minimum can be observed. That minimum indicates the optimum rank, i.e. the number of factors that minimizes the prediction error.
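For reference, the quantity being plotted can be written out as below. The notation is an assumption introduced here for illustration: n is the number of samples, y_i is the reference value of sample i, and \hat{y}_i^{(A)} is its prediction from a model with A factors that was built without the fold containing sample i. The RMSEP has the same form, with the sum running over the independent validation samples instead.

```latex
\mathrm{RMSECV}(A) = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \hat{y}_i^{(A)}\bigr)^{2}},
\qquad
A_{\mathrm{opt}} = \arg\min_{A}\,\mathrm{RMSECV}(A)
```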
Hair, Joseph F., Jr.; Hult, G. Tomas M.; Ringle, Christian M.; & Sarstedt, Marko (2014). A primer on partial least squares structural equation modeling (PLS-SEM). Thousand Oaks, CA: Sage Publications.
In the cross-validation procedure, we compute the root mean square error of prediction (RMSEP) for different numbers of factors and take the number of factors with the lowest RMSEP, while also trying to keep the number of factors small so that the solution is stable. In effect, we balance the prediction error against the number of factors.
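One possible way to make that balance explicit is to take the smallest number of factors whose error is within some tolerance of the minimum. The sketch below is an illustration only: the function name `parsimonious_rank`, the 5 % relative tolerance and the error values are all made up for the example, and other selection rules exist.

```python
def parsimonious_rank(rmse_by_rank, tolerance=0.05):
    """Return the smallest rank whose error is within `tolerance`
    (relative) of the lowest error in `rmse_by_rank`.

    rmse_by_rank[0] is the error for 1 factor, rmse_by_rank[1] for 2, etc.
    """
    threshold = min(rmse_by_rank) * (1.0 + tolerance)
    for rank, err in enumerate(rmse_by_rank, start=1):
        if err <= threshold:
            return rank

# Made-up RMSECV values for 1..6 factors (not real data).
rmsecv = [0.92, 0.41, 0.200, 0.196, 0.195, 0.21]

chosen = parsimonious_rank(rmsecv)            # accepts a slightly higher error
strict_minimum = parsimonious_rank(rmsecv, tolerance=0.0)
print("parsimonious choice:", chosen)
print("strict minimum:", strict_minimum)
```

In this made-up curve the strict minimum sits at 5 factors, but 3 factors already give an error within 5 % of it, so the simpler model is chosen.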