You may refer to the paper "A Framework for Intelligent Medical Diagnosis using Rough Set with Formal Concept Analysis", International Journal of Artificial Intelligence & Applications, Vol. 2 (2), pp. 45–66 (2011). I hope you will find the answer there.
Yes, those papers are fine. But, in a nutshell, I understand things like this:
A central concept in rough set theory is the reduct. A reduct is a minimal attribute set required to classify a dataset. We generally add features one by one to the empty set (or, if you like, start from an arbitrary subset of features, as in your step 1, "Search or subset generation") and see how well the feature subset fares in classifying the dataset. As soon as you reach the highest possible classification accuracy (the same as that of the set of all features), you have found the reduct, provided, of course, that it does not contain any redundant features.
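To make the procedure concrete, here is a minimal Python sketch of the forward search described above. It rests on a few assumptions of mine: "classification accuracy" is measured by the rough-set dependency degree (the fraction of objects whose decision is uniquely determined by the chosen attributes), features are added greedily, and a final pruning pass drops redundant features. The function names `dependency` and `find_reduct` and the toy fever/cough/headache table are hypothetical and not taken from the paper.

```python
from collections import defaultdict

def dependency(rows, labels, attrs):
    """Fraction of rows whose label is uniquely determined by their values on attrs."""
    seen_labels = defaultdict(set)
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)   # equivalence class of this row
        seen_labels[key].add(label)
        counts[key] += 1
    # Only classes containing a single decision value are counted as consistent.
    consistent = sum(counts[k] for k, labs in seen_labels.items() if len(labs) == 1)
    return consistent / len(rows)

def find_reduct(rows, labels):
    """Greedy forward selection followed by pruning of redundant attributes."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)   # best achievable score
    chosen = []
    # Forward phase: add the attribute that raises the score the most.
    while dependency(rows, labels, chosen) < target:
        remaining = [a for a in all_attrs if a not in chosen]
        best = max(remaining, key=lambda a: dependency(rows, labels, chosen + [a]))
        chosen.append(best)
    # Backward phase: drop any attribute whose removal keeps the score at target.
    for a in list(chosen):
        rest = [b for b in chosen if b != a]
        if dependency(rows, labels, rest) >= target:
            chosen = rest
    return sorted(chosen)

# Hypothetical toy table; columns: 0 = fever, 1 = cough, 2 = headache.
rows = [(1, 1, 0), (1, 0, 0), (0, 1, 0), (0, 0, 0), (0, 1, 1), (0, 0, 1)]
labels = ["flu", "flu", "cold", "healthy", "cold", "healthy"]

print(find_reduct(rows, labels))  # prints [0, 1]
```

In this toy table fever and cough together determine the diagnosis, while headache adds nothing, so the search stops at the subset {fever, cough}. A greedy search like this returns one reduct; a dataset can have several, and finding a smallest one generally needs a more exhaustive search.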