I want to detect multivariate outliers in my dataset, which contains participant responses to several questionnaires (DASS-21, PSWQ, etc.). Should I compute the Mahalanobis distance on the total scores for constructs such as depression, anxiety, and worry, or on the item-level data before the items are aggregated into total scores? With item-level data, 25 of the 370 participants are flagged as multivariate outliers, but with total scores, only 1 participant is flagged.
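
For reference, here is a minimal sketch of the two computations being compared. It assumes the common convention of flagging cases whose squared Mahalanobis distance exceeds a chi-square critical value at p < .001, with degrees of freedom equal to the number of variables; the cutoff actually used is not stated above. The function name `mahalanobis_outliers`, the `alpha` parameter, and the synthetic data are hypothetical placeholders, not my real data or results.

```python
import numpy as np
from scipy.stats import chi2


def mahalanobis_outliers(X, alpha=0.001):
    """Return squared Mahalanobis distances, the chi-square cutoff, and flags.

    X is an (n_participants, n_variables) array. The cutoff convention
    (chi-square with df = n_variables at 1 - alpha) is an assumption.
    """
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)                # center each variable
    cov = np.cov(X, rowvar=False)            # sample covariance (ddof=1)
    inv_cov = np.linalg.pinv(cov)            # pseudo-inverse tolerates near-singular cov
    # Squared distance per row: diff_i^T * inv_cov * diff_i
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    cutoff = chi2.ppf(1 - alpha, df=X.shape[1])
    return d2, cutoff, d2 > cutoff


# Synthetic stand-in data: 370 participants, 21 items forming 3 totals of 7 items.
# Real questionnaire items are ordinal, so this only illustrates the mechanics.
rng = np.random.default_rng(0)
items = rng.normal(size=(370, 21))
totals = items.reshape(370, 3, 7).sum(axis=2)

for label, data in [("item-level", items), ("total scores", totals)]:
    d2, cutoff, flags = mahalanobis_outliers(data)
    print(f"{label}: df={data.shape[1]}, cutoff={cutoff:.2f}, flagged={int(flags.sum())}")
```

The two analyses differ in the number of variables, and therefore in both the covariance matrix being inverted and the degrees of freedom of the cutoff, so the flagged counts are not directly comparable.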