I wonder whether this should be treated as a classification problem at all. Why not use linear regression to predict the concentration? If the concentration predicted by the model is above the threshold, then we know the site is highly contaminated.
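To make the regress-then-threshold idea concrete, here is a minimal sketch using plain NumPy. Everything in it is an assumption for illustration: the data are synthetic, the two predictors are generic stand-ins, and `THRESHOLD` is an arbitrary value rather than a real regulatory limit.

```python
import numpy as np


def fit_linear(X, y):
    """Fit ordinary least squares with an intercept; return the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_concentration(coef, X):
    """Predict concentration for each row of X using the fitted coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def flag_contaminated(coef, X, threshold):
    """Return a boolean array: True where predicted concentration exceeds threshold."""
    return predict_concentration(coef, X) > threshold


# Synthetic data (hypothetical): two predictors and a noisy linear response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 5.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

THRESHOLD = 6.0  # assumed cutoff, not a real regulatory limit

coef = fit_linear(X, y)
flags = flag_contaminated(coef, X, THRESHOLD)

print("coefficients (intercept first):", coef)
print("fraction flagged as highly contaminated:", flags.mean())
```

One thing to keep in mind with this approach is that the regression is fit to minimize squared error over the whole concentration range, not to be accurate near the threshold, so it is worth checking how it performs for samples close to the cutoff.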
What does your data look like? What does the response represent, and what are your predictors? I am a statistician who specializes in classification, but I also work in environmental remediation, so with a little more information I can probably help you out.
Thanks for these answers, these are great. Maybe a classification-based approach would suit the problem better.
The data are mainly groundwater variables, focusing on metals, with a range of geological, hydrogeological, and biogeochemical predictors chosen based on my previous research. The work wouldn't really be focused on remediation, but more on finding a way to predict whether an area is susceptible to contamination by a given parameter, based on the predictor variables. Thanks