18 January 2020

Good morning

I am a Stata newbie and am trying to clean the data set from a survey that I recently conducted using LimeSurvey. One of the questions was a 19-part psychometric test (the Basic Psychological Needs Satisfaction Scale), with each respondent ticking one of the following responses for each item:

1 = Not at all true

2

3

4 = Somewhat true

5

6

7 = Very true

The response data arrived in Stata as string variables, so I converted them to numeric variables using encode:

encode BPN_1, generate (BPN_1_d)

encode BPN_2, generate (BPN_2_d)

and so on for every question in the scale.

The encoded versions of the response data (i.e. BPN_1_d etc.) look fine on the surface: they still show "1 = Not at all true", 3, 5 and so on, in line with the original data entry. However, when I click on a cell in the Data Editor, the bar at the top of the window shows that Stata has 'recoded' the original responses using its own (alphabetical??) system, so that an original response of 1 is stored as a 2, etc.
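A quick way to see the mapping that encode actually built is sketched below. encode numbers the distinct strings it finds in each variable in alphabetical order, starting from 1, and by default stores that mapping in a value label named after the new variable. The variable names are the ones used above.

```stata
* List the code-to-text mapping that encode created for this variable
label list BPN_1_d

* Cross-tabulate the original strings against the underlying numeric codes
tabulate BPN_1 BPN_1_d, nolabel
```

Running the same two commands on another item (e.g. BPN_2 and BPN_2_d) shows whether the mapping differs from one variable to the next.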

I was planning to fix this by going either of the following routes:

label define BPN_order 1 "1 = Not at all true" 4 "4 = Somewhat true" 7 "7 = Very true"

label values BPN_1_d BPN_order

or

replace BPN_1_d = 1 if BPN_1 == "1 = Not at all true"

The problem is that Stata appears to have recoded a 3 response as a 4 for some of the 19 sub-questions, as a 5 for others and as a 9 for others, i.e. there doesn't appear to be any consistency in how Stata has recoded the original responses.

Has this happened to anyone before? Does anyone have any suggestions as to how I might overcome this problem, or how I might convert the original string data to numeric in a way that avoids it?
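One possible route, offered only as a sketch, is to skip encode and read the number straight from the start of each string. It assumes the raw string variables are named BPN_1 to BPN_19 and that every response begins with its digit (e.g. "1 = Not at all true", "2", "4 = Somewhat true"). The item count, the new variable names ending in _num and the label name BPN_scale are illustrative and not taken from the data.

```stata
* Sketch only. Assumptions: variables BPN_1 ... BPN_19 exist as strings,
* and each response string starts with its digit from 1 to 7.
label define BPN_scale 1 "1 = Not at all true" 4 "4 = Somewhat true" 7 "7 = Very true"

forvalues i = 1/19 {
    * Take the first character of the trimmed string and convert it to a number
    generate byte BPN_`i'_num = real(substr(strtrim(BPN_`i'), 1, 1))

    * Attach the same value label to every item so the codes stay consistent
    label values BPN_`i'_num BPN_scale

    * Stop if a value falls outside 1-7, which would mean the assumption failed
    assert inrange(BPN_`i'_num, 1, 7) | missing(BPN_`i'_num)
}
```

Because the number comes from the text itself rather than from alphabetical order, a 3 stays a 3 in every item, whichever responses happen to be present in that variable.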

Many thanks
