Hello...
I have 40 samples. I want to randomly select a percentage of them and fit a regression equation to that subset, then test the equation on the remaining samples. In other words, some of the data are used for fitting (creating the regression equation) and the rest for validation. My questions are:
1- On what basis is the split percentage between fitting and validation chosen? For example, one could use 50% of the samples for each, or 70% for fitting and 30% for validation. Which is correct? Is there an article or book that explains the basics?
2- What criteria should be used to decide which samples go into building the regression equation and which go into validation?
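
To make the procedure concrete, here is a minimal sketch of what I have in mind, in Python using only the standard library. Everything in it is an assumption for illustration: the data are synthetic placeholders (not my measurements), the model is a straight line with a single predictor, and the names `fit_line` and `split_fit_validate` are made up. It tries a 50/50 and a 70/30 split and prints the validation RMSE for each.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def split_fit_validate(xs, ys, train_fraction, seed=0):
    """Randomly split the samples, fit on one part, score on the rest."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)  # random assignment of samples
    n_train = round(train_fraction * len(xs))
    train, valid = indices[:n_train], indices[n_train:]

    # The regression equation is built from the fitting subset only
    slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])

    # Root-mean-square prediction error on the held-out validation subset
    squared_errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in valid]
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
    return slope, intercept, rmse, len(train), len(valid)


# Synthetic placeholder data: 40 samples around y = 2x + 1 with noise
data_rng = random.Random(42)
xs = [10.0 * i / 39 for i in range(40)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0.0, 0.5) for x in xs]

for fraction in (0.5, 0.7):
    slope, intercept, rmse, n_train, n_valid = split_fit_validate(xs, ys, fraction)
    print(
        f"{n_train}/{n_valid} split: y = {slope:.2f}x + {intercept:.2f}, "
        f"validation RMSE = {rmse:.2f}"
    )
```

Changing `seed` changes which samples land in each subset, so with only 40 samples the validation RMSE can differ from one random split to another.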