It is difficult to answer such a broad question: of course it depends... on your application, on the experimental design behind your data, on your modeling strategy, and so on.
That said, a few things hold in general:
- Make sure the test set is able to answer your question (e.g. do you need predictive performance for unknown samples, or for unknown future samples? etc.)
See e.g. Esbensen, K. H. & Geladi, P. Principles of Proper Validation: use and abuse of re-sampling for validation, J. Chemometrics, 2010, 24, 168-187 (http://dx.doi.org/10.1002/cem.1310)
- As Lionel said: repeat it over and over, with new random splits (if doing cross-validation or out-of-bootstrap), and with different algorithms for subset selection if you go for one of those; a minimal sketch of such repeated cross-validation follows this list.
The idea is that this lets you check whether the results of the different splits agree.
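For illustration only, here is a minimal base-R sketch of repeated (iterated) k-fold cross-validation; the data frame dat, the response y and the lm() placeholder model are my own assumptions, not anything from the packages discussed here:

## Repeated (iterated) k-fold cross-validation, minimal sketch.
## Assumes a data frame `dat` with response `y`; swap lm() for your own model.
repeated_cv <- function(dat, k = 10, repeats = 20) {
  rmse <- matrix(NA_real_, nrow = repeats, ncol = k)
  for (r in seq_len(repeats)) {
    folds <- sample(rep(seq_len(k), length.out = nrow(dat)))  # new random split each repeat
    for (i in seq_len(k)) {
      train <- dat[folds != i, ]
      test  <- dat[folds == i, ]
      fit   <- lm(y ~ ., data = train)            # placeholder model
      pred  <- predict(fit, newdata = test)
      rmse[r, i] <- sqrt(mean((test$y - pred)^2))
    }
  }
  rmse  # the spread across repeats shows how much the result depends on the split
}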
Also, a whole lot of information about model validation can be found at CrossValidated (http://stats.stackexchange.com/), including specific discussions for different modeling strategies.
Regarding the R function soil.spec::ken.sto vs. the Matlab kenstone.m: after a very quick look at both, I'd personally go for the Matlab version. ken.sto enforces a PCA first, and there are several known problems (but maybe also a solution): http://r.789695.n4.nabble.com/problems-with-method-ken-sto-in-package-soil-spec-subscript-out-of-bounds-td4288193.html
(I've never used either of them myself and instead go for iterated (repeated) k-fold cross-validation or out-of-bootstrap.)
My collaborators tend to use cross-validation ==> split the dataset into 10 parts. Take 9 parts for training and the last part for testing. Repeat 10 times, each time holding out a different part for testing. The advantage of this method is that you end up with 10 precision/recall values, which can be used to calculate a standard error.
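As a rough illustration of that procedure, here is a sketch of 10-fold cross-validation collecting precision and recall per fold; the data frame dat, the binary factor y, the glm() classifier and the 0.5 cutoff are all placeholder assumptions:

## 10-fold cross-validation with per-fold precision/recall (sketch).
k <- 10
folds <- sample(rep(seq_len(k), length.out = nrow(dat)))
precision <- recall <- numeric(k)
for (i in seq_len(k)) {
  train <- dat[folds != i, ]
  test  <- dat[folds == i, ]
  fit   <- glm(y ~ ., data = train, family = binomial)   # placeholder classifier
  prob  <- predict(fit, newdata = test, type = "response")
  pred  <- prob > 0.5
  truth <- test$y == levels(test$y)[2]                    # treat 2nd factor level as "positive"
  precision[i] <- sum(pred & truth) / sum(pred)
  recall[i]    <- sum(pred & truth) / sum(truth)
}
## 10 values each -> mean and standard error of the mean
c(precision = mean(precision), se = sd(precision) / sqrt(k))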
OK... I think I will not use separate training and testing sets, so no Kennard-Stone. I will probably go for ten-fold cross-validation or bootstrap cross-validation. Thanks to all!
Do it by random sampling with different subset sizes; whichever split gives the minimum prediction error for the objects under study, flag that size as the optimum.
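A very rough sketch of that idea, again assuming a data frame dat with response y and a placeholder linear model (and keeping in mind that each size should really be repeated many times before trusting the comparison):

## Compare hold-out error for several random training-set sizes (sketch).
sizes <- c(0.5, 0.6, 0.7, 0.8)                  # candidate training fractions
err <- sapply(sizes, function(p) {
  idx  <- sample(nrow(dat), size = round(p * nrow(dat)))
  fit  <- lm(y ~ ., data = dat[idx, ])          # placeholder model
  pred <- predict(fit, newdata = dat[-idx, ])
  sqrt(mean((dat$y[-idx] - pred)^2))            # hold-out RMSE
})
sizes[which.min(err)]                            # training fraction with lowest error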