A machine learning model is normally evaluated on new, held-out data. If its accuracy on that data falls below a certain threshold, the model is rejected. But why? The test set is just a random sample, so the measured accuracy may not represent the model's performance on the whole population. What is the statistical meaning of this procedure?
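To make the question concrete, here is a minimal sketch of one common way to frame it, not the only one. It assumes the test examples are drawn i.i.d. from the population, so the number of correct predictions is binomial with the true accuracy as its parameter. The function names, the counts (85 correct out of 100), and the 0.90 threshold are all hypothetical and chosen only for illustration.

```python
import math


def wilson_interval(correct, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0 or not 0 <= correct <= n:
        raise ValueError("need n > 0 and 0 <= correct <= n")
    p_hat = correct / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def binom_tail_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Hypothetical numbers: 85 correct predictions on a test sample of 100,
# with an assumed acceptance threshold of 0.90 on the true accuracy.
correct, n, threshold = 85, 100, 0.90

low, high = wilson_interval(correct, n)
print(f"observed accuracy: {correct / n:.3f}")
print(f"approx. 95% interval for the true accuracy: ({low:.3f}, {high:.3f})")

# One-sided view: if the true accuracy were exactly at the threshold,
# how likely is a sample result this bad or worse?
p_value = binom_tail_at_most(correct, n, threshold)
print(f"P(at most {correct} correct | true accuracy = {threshold}): {p_value:.4f}")
```

Under these assumptions, comparing the raw sample accuracy to the threshold acts like a test with no margin for sampling error. Comparing the interval, or the tail probability, to the threshold makes that error explicit, and the interval narrows as the test sample grows.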