Any method of analysis (classification, say) is a tool, and it should be treated the same way as any hardware tool. Estimate the parameters from part of a data set, then predict the segmentation in the rest of the data. In that validation set, examine the error statistics, using say a squared-distance criterion, against the true segmentation (the gold standard). In other words, any method should first be applied to an artificial data set, with and without noise, built with the characteristics of the experimental data. It should perform well on this data set before you try it on experimental data. In this validation step, the errors in the segmentation border curves should have zero mean at the very least, and Gaussian statistics if possible, to allow the use of confidence intervals. Otherwise, confidence intervals for border locations would require bootstrap or Monte Carlo methods.
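The validation step above can be sketched as follows. This is a minimal, hypothetical example: the synthetic "true" border, the noise level, and the variable names are all assumptions for illustration, and the Gaussian 95% confidence interval is only valid if the errors really are approximately normal.

```python
import math
import random

# Hypothetical synthetic data: a known (gold standard) border position per
# column, and a simulated method output with additive Gaussian noise.
random.seed(0)
true_border = [50.0 + 10.0 * math.sin(i / 20.0) for i in range(200)]
predicted_border = [b + random.gauss(0.0, 1.5) for b in true_border]

# Signed errors between predicted and true border locations.
errors = [p - t for p, t in zip(predicted_border, true_border)]
n = len(errors)
mean_err = sum(errors) / n
std_err = math.sqrt(sum((e - mean_err) ** 2 for e in errors) / (n - 1))

# Squared-distance criterion, plus a Gaussian 95% confidence interval for
# the mean error (check first that the errors look roughly normal).
mse = sum(e * e for e in errors) / n
ci_half_width = 1.96 * std_err / math.sqrt(n)
print(f"mean error: {mean_err:.3f} +/- {ci_half_width:.3f} (95% CI)")
print(f"MSE: {mse:.3f}")
```

A good method should give a mean error compatible with zero on such synthetic data; a clear bias here would show up before you ever touch experimental data.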
A separate step in validating an algorithm is choosing the 'model' complexity used to define the segmentation shapes. The method validation above needs to be repeated for each candidate model complexity...
Using an ROC curve, the 'optimum' parameter is the one that produces the point on the curve closest to the top-left corner, i.e. the optimal combination of sensitivity and specificity.
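As a sketch of that rule, here is how you might pick the operating point nearest the top-left corner (FPR = 0, TPR = 1); the fpr/tpr/threshold values below are made-up illustrative numbers, not output from any real classifier.

```python
import math

# Made-up ROC points: false positive rate, true positive rate, and the
# threshold that produced each point.
fpr = [0.0, 0.05, 0.10, 0.25, 0.40, 1.0]
tpr = [0.0, 0.55, 0.80, 0.90, 0.95, 1.0]
thresholds = [1.0, 0.9, 0.7, 0.5, 0.3, 0.0]

# Euclidean distance from each ROC point to the ideal corner (0, 1).
distances = [math.hypot(f, 1.0 - t) for f, t in zip(fpr, tpr)]
best = min(range(len(distances)), key=distances.__getitem__)
print(f"best threshold: {thresholds[best]} "
      f"(sensitivity={tpr[best]}, specificity={1 - fpr[best]})")
```

Minimising the distance to the corner is one common heuristic; other criteria (e.g. Youden's index, sensitivity + specificity - 1) may suit your application better.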
Here I assume that you produce a score for each pixel that you can compare to your ground-truth labels. perfcurve and roc can help you with that.
The thresh output of the roc function gives you the threshold for each ROC point. Why? Because nobody but you can tell at which point you want to operate: the more false positives you accept, the more true positives you gain. That tradeoff is up to you and your specific application.
To generate an ROC curve you need a technique that classifies events/measurements as positive or negative, where the classification depends on the value of a parameter. You also need a gold standard, i.e. cases to which you can apply the technique but for which you also know whether they are truly positive or negative. To generate one point on the ROC curve, choose a value for the parameter and classify the whole set of cases whose true classification you know. You can then count correct decisions as well as false positives and false negatives, and from those derive sensitivity and specificity. Do that for a wide range of the parameter and you will be able to plot the ROC curve, and you can then choose an operating point from it. Normally one would like to maximise both sensitivity and specificity, as I indicated above, but you might accept a little less specificity for more sensitivity, or vice versa, depending on the problem.
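The procedure above can be sketched directly: sweep a threshold over classifier scores and, at each threshold, count true and false positives against the gold standard. The scores and labels here are made-up illustrative data; in practice they come from your classifier and your validation set.

```python
# Made-up per-case scores and gold-standard labels (1 = positive).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1,    1,    1,    0,    1,    0,    1,    0,    0,    0]

P = sum(labels)            # positives in the gold standard
N = len(labels) - P        # negatives in the gold standard

roc = []                   # list of (fpr, tpr, threshold) points
for thr in sorted(set(scores), reverse=True):
    tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
    roc.append((fp / N, tp / P, thr))

for fpr, tpr, thr in roc:
    print(f"thr={thr:.2f}  sensitivity={tpr:.2f}  1-specificity={fpr:.2f}")
```

Each distinct score yields one ROC point; the threshold attached to each point is exactly the information you need to pick an operating point afterwards.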
The confusion matrix parameters are true positives, true negatives, false positives and false negatives. An ROC curve can also be obtained from an artificial neural network trained with a cross-entropy loss, by sweeping a threshold over the network's output scores.
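A minimal sketch of computing those four confusion-matrix counts, and the sensitivity/specificity derived from them; the label vectors are made-up illustrative data.

```python
# Made-up gold-standard labels and hard predictions (1 = positive).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

sensitivity = tp / (tp + fn)   # true positive rate
specificity = tn / (tn + fp)   # true negative rate
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```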