Hello! In my previous post, I asked how to analyze a dataset with an ordinal dependent variable and multiple categorical independent variables. Here is that question, if you'd like to check it out: https://www.researchgate.net/post/How_can_I_analyze_multiple_binary_independent_variables_against_an_ordinal_dependent_variable
My dataset is questionnaire data that includes a field for skill level in a certain sport. This skill level is the target variable in the study. The questionnaire also asked which of four sports the respondent's answers refer to. The aim is to compare whether the associations between skill level and the answers are similar across the sports. So I would like to find which variables best predict skill level in each sport, and how important each variable is in the prediction.
It was suggested that I use ordinal logistic regression and test the proportional odds assumption. If the proportional odds assumption is not met, I should run consecutive binary logistic regressions to construct an ordinal model myself. It was also suggested that I could use a boosted regression tree. I would like to use both methods to cross-check each other, as there seem to be uncertainties in ordinal logistic regression.
I understand that the workflow should be as follows:
The binary logistic regressions should be run as follows: Class 1 vs. Classes 2-4, Classes 1-2 vs. Classes 3-4, and Classes 1-3 vs. Class 4. I understand this and know how to do it, but I do not know whether the boosted regression tree should be built in the same manner. Should I build three separate boosted regression tree models and calculate the variable importances separately for each, or should I train a single model on all four target classes at once? It seems boosted regression trees don't perform well with target variables that have more than two values.
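To make the three splits concrete, here is a minimal sketch of how I understand the binary targets would be built. It assumes skill level is coded as the integers 1 to 4; the function name `cumulative_binary_targets` and the `above_k` labels are my own placeholders, not taken from any library.

```python
from typing import Dict, List, Sequence


def cumulative_binary_targets(
    levels: Sequence[int], n_classes: int = 4
) -> Dict[str, List[int]]:
    """Build one 0/1 target per cumulative split of an ordinal variable.

    For each threshold k in 1..n_classes-1, the target is 1 when the
    level is above k and 0 otherwise. With four classes this gives:
      above_1: Class 1     vs. Classes 2-4
      above_2: Classes 1-2 vs. Classes 3-4
      above_3: Classes 1-3 vs. Class 4
    """
    targets: Dict[str, List[int]] = {}
    for k in range(1, n_classes):
        targets[f"above_{k}"] = [1 if level > k else 0 for level in levels]
    return targets


# Hypothetical skill levels for six respondents (made-up values).
skill = [1, 2, 3, 4, 2, 4]
targets = cumulative_binary_targets(skill)
for name, values in targets.items():
    print(name, values)
```

Each of the three lists could then serve as the dependent variable of one binary model, whether that is a logistic regression or a boosted regression tree.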
I would truly appreciate your help. Also, if you know of studies that have used a similar method, I would really appreciate it if you could link them.
Best regards,
Timo Ijäs