My Likert scale has missing data (skipped questions). If only 1-2 items are missing, can I still use the remaining data for that case to reach a final score (by weighting, taking the mean, etc.)?
I would not delete observations but impute. For a simple and quick solution, use single imputation, e.g. a random hot deck. If you are interested in proper inferential statistics, you should consider a multiple imputation approach. The R packages VIM, mice, or mi might be your friend.
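To make the random hot deck idea concrete, here is a minimal sketch in plain Python (not the VIM or mice implementation). The function name, the data, and the choice to draw donors per item from all observed answers are my own assumptions for illustration.

```python
import random

def random_hot_deck(rows, seed=0):
    """Fill each skipped item (None) with a value drawn at random
    from the observed answers to the same item."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n_items = len(rows[0])
    # Donor pool per item: every observed answer in that column.
    # Assumes each item was answered by at least one respondent.
    donors = [[r[j] for r in rows if r[j] is not None] for j in range(n_items)]
    return [
        [v if v is not None else rng.choice(donors[j]) for j, v in enumerate(r)]
        for r in rows
    ]

# Hypothetical 5-point Likert answers; None marks a skipped item.
responses = [
    [4, 5, None, 4],
    [2, 2, 3, None],
    [5, 4, 4, 5],
    [3, None, 3, 2],
]
completed = random_hot_deck(responses)
```

Because this is single imputation, the filled-in values are treated as if they were real answers, so standard errors computed afterwards will tend to be too small. That is the reason for moving to multiple imputation when inference matters.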
If fewer than 20% of the items are missing, I would still compute the mean of the items that were answered. In the methods section of the publication, be sure to mention that you computed the scale score only when 80% or more of the items were present.
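A minimal sketch of that rule in Python, assuming skipped items are coded as None and reading the cutoff as "at least 80% of items answered". The function name and the example data are made up for illustration.

```python
def scale_mean(items, min_present=0.8):
    """Mean of the answered items, or None when fewer than
    min_present (a proportion) of the items were answered."""
    answered = [v for v in items if v is not None]
    if len(answered) < min_present * len(items):
        return None  # too much missing: do not score this case
    return sum(answered) / len(answered)

# Hypothetical 10-item scale; None marks a skipped item.
two_skipped = [4, 5, None, 4, 3, 4, None, 5, 4, 3]       # 80% present
three_skipped = [4, 5, None, 4, None, 4, None, 5, 4, 3]  # 70% present

print(scale_mean(two_skipped))    # 4.0
print(scale_mean(three_skipped))  # None
```

If you need a total rather than a mean, multiplying the returned mean by the number of items gives a prorated sum on the same range as the complete scale.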
Good question. I would suggest imputation and then contrasting the results with a complete case analysis (listwise deletion). There is plenty of theory on why available case analysis (pairwise deletion) is a poor choice. An undifferentiated random hot deck is okay if missingness is rather low. I would, however, recommend a nearest neighbor hot deck, in which you substitute the missing values in one case with the observed values from another, similar case. The donor can be selected by a simple distance function or within clusters. Regression imputation is certainly also possible. MCMC can be hard to understand and thus to explain in a paper.
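For what it is worth, a nearest neighbor hot deck can be sketched in a few lines of Python. The distance function here (mean absolute difference on the items both cases answered) and the restriction to complete cases as donors are my assumptions, one simple choice among many.

```python
def nearest_neighbor_hot_deck(rows):
    """Fill each incomplete case from the complete case that is closest
    on the items the incomplete case did answer."""
    # Assumes at least one complete case exists and every case
    # answered at least one item.
    donors = [r for r in rows if None not in r]
    completed = []
    for row in rows:
        if None not in row:
            completed.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]

        def distance(donor):
            # Mean absolute difference over the items this case answered.
            return sum(abs(row[j] - donor[j]) for j in observed) / len(observed)

        donor = min(donors, key=distance)  # ties go to the first donor
        completed.append(
            [v if v is not None else donor[j] for j, v in enumerate(row)]
        )
    return completed

# Hypothetical 5-point Likert answers; None marks a skipped item.
responses = [
    [5, 4, 5, 4],
    [1, 2, 1, 2],
    [5, 5, None, 4],  # closest to the first case
    [2, None, 1, 1],  # closest to the second case
]
completed = nearest_neighbor_hot_deck(responses)
print(completed[2])  # [5, 5, 5, 4]
print(completed[3])  # [2, 2, 1, 1]
```

Unlike the undifferentiated random hot deck, the donor here resembles the recipient, so a respondent who answers high on the other items receives a high value.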
There is a large literature on missing data. Some of the more common references include:
Enders, C. (2010) Applied Missing Data Analysis. Guilford Press: New York.
Little, R.J.A. and Rubin, D.B. (2002) Statistical Analysis with Missing Data. Wiley: Hoboken.
It is best to make every effort to avoid missing data in the collected survey, even if it means calling the respondents to obtain the missing information. Some statistical programs include only cases with complete information in the analysis. If the program allows it, I would use imputation, and it is a good idea to report both sets of results, before and after imputation.
To decide on the best method of recovering lost data, it is necessary to answer three questions:
What is the percentage of missing data?
How were the data lost: Missing Completely At Random (MCAR) or Not Missing At Random (NMAR)? In the first case the data are lost at random, and in the second case the loss is biased toward one side of the distribution. The best situation is that the data are MCAR. However, in social and psychological studies it is common for the loss to be NMAR. For example, with an item such as "I am a good student," the students who tend not to answer are precisely those who have more difficulties. In this case the data tend to be lost at the bottom of the distribution of the variable "academic performance". In such situations (NMAR loss) it is not a good decision to replace the missing data with the mean of the scale: you would probably overestimate the population average.
Do you have incomplete data or missing data? Let me explain. Suppose you have 5 Likert items to measure a psychological construct, and a person responds to 3 items and leaves 2 unanswered. In this case you have incomplete data. If instead the person does not answer any of the 5 items, you have missing (lost) data.
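The first two questions can be illustrated with a small Python example. The scores are invented, and the rule "students who score 1 skip the item" is a deliberately extreme assumption standing in for the NMAR pattern described above.

```python
# Hypothetical true scores (1-5) of ten students on "I am a good student".
true_scores = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]

# Assumed NMAR pattern: the students at the bottom (score 1) skip the item.
observed = [s if s > 1 else None for s in true_scores]

# Question 1: percentage of missing data.
pct_missing = 100 * observed.count(None) / len(observed)

# Question 2: what mean replacement does when the loss is NMAR.
answered = [s for s in observed if s is not None]
observed_mean = sum(answered) / len(answered)
filled = [s if s is not None else observed_mean for s in observed]

true_mean = sum(true_scores) / len(true_scores)
filled_mean = sum(filled) / len(filled)

print(pct_missing)    # 20.0
print(true_mean)      # 3.0
print(observed_mean)  # 3.5
print(filled_mean)    # 3.5
```

Replacing the skipped answers with the observed mean leaves the estimate at 3.5, above the true 3.0, because the values used for filling come only from the students who answered.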
Depending on your answers to the 3 questions you've got several options to choose from.
Our group has compared different methods for recovering missing data. Unfortunately our work is in Spanish ... We compared seven methods: listwise deletion, replacement by the mean of the scale, by the item mean, by the subject mean, by the corrected subject mean, multiple regression, and Expectation-Maximization (EM). They all have advantages and disadvantages, but some are better than others.
The handiest option (the default in programs such as SPSS) is to eliminate the cases with missing data (listwise deletion). Today we know that listwise deletion is the least desirable method for handling missing data. I do not recommend it, since it can bias your data, especially if the loss is not random.
The simplest methods replace the missing value with a mean (the mean of the scale, of the item, or of the subject). For you this would be a simple and basic possibility. If you have incomplete data and the loss is not very large (less than 10%), it seems that the method that best recovers the population values is replacement by the subject mean. It is based on a basic principle in psychology: usually the best predictor of a person's future behavior is that person's past behavior.
Finally, there are what might be called "more refined methods", such as regression and multiple imputation based on iterative procedures. These methods, which use auxiliary variables, are better than mean replacement, especially if the data are completely lost (rather than incomplete) and the percentage of data loss exceeds 20%. Of course, they are more complex. However, if you can, I encourage you to look into them.
I do not know what program you use. In any case, commercial software (SPSS, Mplus ...) has specific modules for recovering lost data. If you tell me what software you use, and I know it, I could be more precise in my answer.
Finally, I would like to leave you one reading on the subject.
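As a rough illustration of how subject-mean and item-mean replacement differ, here is a short Python sketch. The function names and the data are invented, and this is not the procedure from our study, only the basic idea.

```python
def impute_subject_mean(rows):
    """Replace each skipped item with the mean of the items
    that the same respondent did answer."""
    out = []
    for row in rows:
        answered = [v for v in row if v is not None]
        m = sum(answered) / len(answered)  # assumes at least one answer
        out.append([v if v is not None else m for v in row])
    return out

def impute_item_mean(rows):
    """Replace each skipped item with the mean of that item
    across the respondents who answered it."""
    n_items = len(rows[0])
    means = []
    for j in range(n_items):
        col = [r[j] for r in rows if r[j] is not None]
        means.append(sum(col) / len(col))
    return [[v if v is not None else means[j] for j, v in enumerate(r)]
            for r in rows]

# Hypothetical 5-point Likert answers; None marks a skipped item.
responses = [
    [5, 5, None, 5],  # a consistently high scorer skips item 3
    [1, 2, 1, 2],
    [2, 1, 3, 2],
]
print(impute_subject_mean(responses)[0])  # [5, 5, 5.0, 5]
print(impute_item_mean(responses)[0])     # [5, 5, 2.0, 5]
```

The subject mean keeps the high scorer high (5.0), while the item mean pulls that respondent toward the rest of the sample (2.0). Note also that after subject-mean replacement the respondent's scale mean is identical to the mean of the items actually answered.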
Willms, J. D., and Smith, T. (2006). A Manual for Conducting Analyses with Data from TIMSS and PISA (Report prepared for the UNESCO Institute for Statistics). New Brunswick: Canadian Research Institute for Social Policy. Retrieved May 17, 2011 from: http://www.unb.ca/crisp/pdf/Manual_TIMSS_PISA2005_0503.pdf. The first pages of the document are a good initial introduction to the methods of recovering missing data.
Todd Little at TTU covers this; it is his area of expertise. He is on ResearchGate and you might be able to see his articles there. He has also recently published a book, Longitudinal SEM. If you are still stuck next summer, he teaches a Stats Camp and addresses this in his intro to SEM class.
How can I account for missing data on a 22-item scale (7-point Likert scale)? It is the MBI for burnout, which has 3 dimensions, so the scale is divided into 3 parts, and with the scores I can classify respondents as low, medium, or high risk and assess burnout.
Which is the best method of recovering lost data: mean of the item, mean of the subject, or deleting cases (not missing values)?
I use SPSS version 22.
Here are my answers to your 3 questions.
1. What is the percentage of missing data? 4.5% per case (1/22 items), or 9% if 2 items are missing (I have just one case with 2 missing values). And
Ideally, you do not have any missing data. But, of course, in real life, you will have it. If your response number is high enough, you can discard these respondents and move on. If, however, you definitely need the responses -- as mentioned by several folks here -- you can take measures to attempt to handle the problem. Here are two resources that may be of help: https://measuringu.com/handle-missing-data/ and https://liberalarts.utexas.edu/prc/_files/cs/Missing-Data.pdf