Suppose there is k-step Likert data (k = 3, 5, 7, etc.) with "absolutely no" on one side, "absolutely yes" on the other, and "neutral" in the middle.

A correct way to analyze such (ordinal) data is an ordered logistic model, which is possibly difficult to interpret. I don't really understand why this is the case, but most people seem to analyse such data by assigning numerical values to the categories (often integers; if there is a neutral element they usually use -(k-1)/2 ... (k-1)/2) and assuming these numerical values are normally distributed (so that a Gaussian linear model, like a t-test, can be used). This can cause a lot of problems, especially if k is small and the values cluster close to the possible extremes. It is also difficult to understand what the estimates from such analyses mean (even if the t-test is interpreted as testing a difference in the mean rank, what is the underlying definition of "rank"? It is not the rank of the data but the rank of the category, which is weird, at least in comparison to other rank-based analyses...).
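
(To make the contrast concrete, here is a minimal Python sketch on simulated data. The 5-point scale, the grouping variable, the cutpoints, and the use of statsmodels' OrderedModel are all assumptions made purely for illustration, not part of the question itself.)

# Minimal sketch: ordered logistic vs. t-test on integer codes (simulated data)
import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(0)
n = 200
group = rng.integers(0, 2, size=n)               # hypothetical two-group predictor
latent = rng.normal(loc=0.5 * group, size=n)     # assumed latent "strength"
y = np.digitize(latent, [-1.0, -0.3, 0.3, 1.0])  # 5 ordered categories 0..4

# Ordered logistic model: respects the ordinal scale, no equal-spacing assumption
df = pd.DataFrame({"y": pd.Categorical(y, ordered=True), "group": group})
ologit = OrderedModel(df["y"], df[["group"]], distr="logit").fit(method="bfgs", disp=False)
print(ologit.params)

# The common practice criticized above: code the categories -2..2 and run a t-test
codes = y - 2
print(stats.ttest_ind(codes[group == 1], codes[group == 0]))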

Now to my question:

What if the data is dichotomized, with each neutral choice randomly assigned (50/50) to either of the two possible values?

This would allow using a simple logistic model, where the coefficient estimates the (practically meaningful) log odds ratio for favoring one direction (one side of the neutral answer) over the other. Additionally, it solves the problem that the original k steps might not be linearly related to the strength of the underlying factor that should be analyzed (which causes problems in both ordinal logistic and Gaussian models).
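
(Again only a sketch of what is meant, assuming simulated 5-point data with category 2 as "neutral" and a binary group variable; statsmodels' Logit is used for the binary model. This illustrates the dichotomization described above, it is not a recommendation.)

# Minimal sketch: dichotomize (neutral -> random side) and fit a binary logit
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
group = rng.integers(0, 2, size=n)               # hypothetical two-group predictor
latent = rng.normal(loc=0.5 * group, size=n)
y = np.digitize(latent, [-1.0, -0.3, 0.3, 1.0])  # categories 0..4, 2 = neutral

# Below neutral -> 0, above neutral -> 1, neutral -> fair coin flip
binary = np.where(y < 2, 0, np.where(y > 2, 1, rng.integers(0, 2, size=n)))

X = sm.add_constant(group)
fit = sm.Logit(binary, X).fit(disp=False)
print(fit.params[1])            # log odds ratio: "yes" side vs. "no" side between groups
print(np.exp(fit.params[1]))    # the corresponding odds ratio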

However, this ignores some information in the original data (and the amount of information loss depends on k, and possibly also on the sample size), and my question is:

Is there some research / paper / proof showing when (under what circumstances) dichotomizing (as described above) is advantageous or disadvantageous, in what way, and to what extent?

I was not able to find anything useful. A lot is written about dichotomizing interval- or ratio-scaled variables (which is always bad!), but I could not find anything about dichotomizing simple Likert-like ordinal data (contrasting opposite values like "agree" vs. "disagree", or "love" vs. "dislike"), particularly when such data is just recorded in a survey and not carefully constructed, externally calibrated, and validated.
