I have been experimenting with stepwise refinement of Linear Discriminant Analysis (LDA) models for geochemical data, and there are many possible variations.
Can anyone advise on the choice of model improvement criterion (e.g., Correctness Rate, Accuracy, Ability to Separate, Confidence, ...) and/or the minimum improvement tolerance? I have tried tolerance values from 0.1% to 5%.
If it helps, I am using the stepclass function from the R package 'klaR', with additive-logratio transformed compositional variables generated using the R package 'rgr'. I have three categories to classify, with unequal group sizes. So far I am only using LDA in 'training' mode, but I will use the 'best' model (provided I can answer the questions above to my satisfaction) for prediction on related datasets.
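For concreteness, the kind of comparison in question can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual analysis script: the built-in iris data (which also has three classes) stands in for the additive-logratio transformed geochemical variables, and the grid of criteria and tolerances, the seed, and the 10-fold setting are all chosen just for the example.

```r
# Illustrative sketch only: iris is a stand-in, NOT geochemical data.
library(klaR)   # stepclass(); klaR loads MASS, which supplies lda()

set.seed(42)    # cross-validation folds are random, so fix the seed

# Assumed grid: the four scalar criteria and three tolerances (0.1%, 1%, 5%)
results <- expand.grid(criterion   = c("CR", "AC", "AS", "CF"),
                       improvement = c(0.001, 0.01, 0.05),
                       stringsAsFactors = FALSE)
results$n_vars <- NA_integer_
results$score  <- NA_real_

for (i in seq_len(nrow(results))) {
  fit <- stepclass(Species ~ ., data = iris, method = "lda",
                   criterion   = results$criterion[i],
                   improvement = results$improvement[i],
                   direction   = "both", fold = 10, output = FALSE)
  results$n_vars[i] <- nrow(fit$model)            # variables kept in the final model
  results$score[i]  <- unname(fit$result.pm[1])   # final value of the chosen criterion
}

print(results)
```

Because the folds are drawn at random, the selected variables can change from one seed to the next, so it may be worth repeating the loop over several seeds before reading much into any single criterion or tolerance.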
Thanks! --Andrew