I have a dataset with multiple visits recorded for each patient. I want to use only the data from enrollment for an analysis. How can I write code that limits the analysis to the enrollment visit?
Mehdi, your question is not entirely clear to me, so I need to ask you a few things first.
In your dataset you have several visits for each patient but, if I understood correctly, you need to count (or use) only the enrollment visit (maybe the first one). Is that right?
Is there some characteristic that distinguishes the enrollment visit from the others? Is it the first visit?
What tool do you use to manage the dataset? SQL?
Excuse me if I have misunderstood your request, but I think it is too vague.
One way is to sort the records within each patient, assuming the enrollment visit is the first one. First generate an elapsed date variable from the date of visit. An elapsed date is numeric (in Stata you can create it from a string date with the date() function). Once you have a numeric date variable, sort the records within each patient, which works like a nested sort, so that each patient's records run from the lowest elapsed date to the highest. Then, within each patient, select the record with the minimum date; that is the enrollment visit. Hope this helps...
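The steps above can be sketched in Stata roughly as follows. This is only a minimal illustration on made-up data: the variable names patient_id and visit_date_str are hypothetical, and it assumes the visit date is stored as a string in year-month-day order. If your date is already a numeric Stata date, skip the generate step.

```stata
* Toy data; patient_id and visit_date_str are hypothetical names
clear
input patient_id str10 visit_date_str
1 "2012-03-15"
1 "2011-01-10"
2 "2013-06-01"
2 "2013-02-20"
2 "2014-01-05"
end

* Convert the string date into a numeric elapsed date
generate visit_date = date(visit_date_str, "YMD")
format visit_date %td

* Sort by date within each patient and keep the earliest record
bysort patient_id (visit_date): keep if _n == 1

list
```

If a patient has two records on the same earliest date, this keeps only one of them, and which one is arbitrary, so it is worth checking for duplicate dates first.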
@Rafaelle. Thank you for your reply, and sorry if my query was not clear enough. I am using data from a patient register in which the first record of each patient contains the enrollment information. We have a date variable, but I don't know how to select only the enrollment data for this round of analysis. I am using Stata.
Dear Edwin, I don't have programming experience, as you can see, not even at this very basic level! I only want to limit the analysis to the first recorded observation for each patient. This must be very easy to do, but not for me at the moment. I will try to do it based on your explanation and will come back if it doesn't work.
Sorry Mehdi, I don't know Stata, so I cannot help you. In SQL you could group the records by patient and, for each group, select only the record with the lowest (min) date. I don't know if you can do something similar in Stata.
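For what it is worth, the group-and-take-the-minimum idea described for SQL can probably be expressed in Stata with egen. The sketch below uses made-up data and hypothetical variable names (patient_id, visit_date), and assumes visit_date is already a numeric Stata date.

```stata
* Toy data; patient_id and visit_date are hypothetical names
clear
input patient_id visit_date
1 19000
1 18637
2 19500
2 19409
end
format visit_date %td

* Earliest visit date within each patient, like MIN(date) ... GROUP BY patient
egen first_date = min(visit_date), by(patient_id)

* Keep only the records that fall on that earliest date
keep if visit_date == first_date
drop first_date
```

Unlike keeping the first sorted record, this keeps every record that shares the earliest date, so a patient with two records on the enrollment date would retain both.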
I assume you are working with Stata, since your question is classified under "stata programming"?
Second assumption: you have several records (observations) for each patient, one record per visit?
If the answer to both is yes, here is how you can do it:
1. step: you need an identifier for the enrollment visit. Is it the first (oldest = smallest) date? If yes, sorting by date within each patient makes the enrollment visit the first observation of that patient. If this does not hold, look for a condition that is fulfilled only for enrollment visits, then generate an indicator variable with the command "generate vi = cond(put here the condition, 1, 0)". This command creates vi with value 1 if the condition is true, and 0 otherwise.
2. step: select the observations that are enrollment visits:
a) In case you can use the date, the Stata command is (patient_id stands for whatever your patient identifier variable is called):
bysort patient_id (date): keep if _n == 1
b) In case you use the indicator variable:
keep if vi == 1
After that, your working file contains only the enrollment visits. Take care to save it under a different name than the original file.
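To illustrate variant b) from start to finish, here is a small sketch on made-up data. The variable visit_type, its value "enrollment", and the file name enrollment_only are all hypothetical; put whatever condition identifies enrollment visits in your own register in place of the one used here.

```stata
* Toy data; patient_id and visit_type are hypothetical names
clear
input patient_id str10 visit_type
1 "followup"
1 "enrollment"
2 "enrollment"
2 "followup"
end

* Indicator: 1 for enrollment visits, 0 otherwise
generate vi = cond(visit_type == "enrollment", 1, 0)

* Keep only the enrollment visits
keep if vi == 1

* Save under a new name so the original file is not overwritten
save enrollment_only, replace
```

If you would rather not drop any data, you can also leave the file as it is and add "if vi == 1" to each analysis command instead.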