Suppose I have 1000 class 'A' samples and 300 class 'B' samples, train a network on this data for a fixed number of epochs, and end up with an MSE of 30. Is there any link between the decision threshold and that error? In other words, can the threshold used to binarize the network's output on the test set be calculated from the error and the class ratio?
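
To make the question concrete, here is a minimal sketch of what I mean by binarizing the output. It compares a threshold taken from the class ratio with one picked empirically on held-out data. Everything in it is an assumption for illustration: the names `binarize` and `best_threshold` are hypothetical, the outputs are assumed to lie roughly in [0, 1] with 'B' encoded as 1, and the data is synthetic rather than from my actual network.

```python
import numpy as np

def binarize(outputs, threshold):
    """Map continuous outputs to labels: 1 ('B') if output >= threshold, else 0 ('A')."""
    return (np.asarray(outputs) >= threshold).astype(int)

def best_threshold(outputs, labels, candidates):
    """Return the candidate threshold with the highest accuracy on held-out data."""
    labels = np.asarray(labels)
    accuracies = [np.mean(binarize(outputs, t) == labels) for t in candidates]
    return candidates[int(np.argmax(accuracies))]

# Synthetic stand-in for network outputs (assumed, not real results):
# 1000 'A' samples (label 0) and 300 'B' samples (label 1).
rng = np.random.default_rng(0)
labels = np.concatenate([np.zeros(1000, dtype=int), np.ones(300, dtype=int)])
outputs = np.concatenate([
    rng.normal(0.2, 0.15, 1000),  # assumed output distribution for 'A'
    rng.normal(0.7, 0.15, 300),   # assumed output distribution for 'B'
])

# Candidate 1: a threshold derived only from the class ratio (the prior of 'B').
prior_b = 300 / (1000 + 300)

# Candidate 2: a threshold chosen by sweeping values on held-out data.
candidates = np.linspace(0.0, 1.0, 101)
empirical = best_threshold(outputs, labels, candidates)

print("prior-based threshold:", prior_b)
print("empirically chosen threshold:", empirical)
```

The sweep treats the threshold as a free parameter tuned after training. What I am asking is whether that step can be replaced by a formula that uses only the training error and the class ratio.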