The expression "fault prediction" is not sufficiently specific to describe what you want to do. What is the outcome (class) variable that you want to predict, and what are the predictor variables (attributes)?
The following white paper uses maximum-accuracy classification tree analysis (CTA) to compare the use of chemical biocide versus clean recycled water to produce oil in hydraulic fracturing operations. The water produces greater flowback, putatively attributable in part to lower viscosity--yielding deeper engorgement into and faster withdrawal from faults (this is discussed in a follow-up white paper). The same method can of course be used to predict with maximum accuracy (normed against chance, and weighted if weights are available) where to drill, factors predicting fault failure, and so forth--if data are available.
Thanks for the reply. We are trying to develop a model to predict whether a class is faulty or not. For this work, we are considering different software feature metrics as inputs and bugs in classes or modules as the output.
Well, that is a perfect application for CTA (predicting faulty or not). If you simply want to predict faulty versus not faulty, CTA is the only methodology that will return the answer that is explicitly proven to be the most accurate and parsimonious. If you also have a metric that can be used to weight the fault (instability, magnitude, depth--whatever numerical weight), then CTA will explicitly maximize weighted accuracy. The second link I posted compares the performance of CTA versus other popular methods, including random forest.
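To make the idea of maximizing (weighted) accuracy normed against chance concrete, here is a minimal sketch in Python. It is not the CTA software and not a full tree: it only searches exhaustively for the single cut point on one software metric that best separates faulty from non-faulty classes. The function name `best_cut`, the example metric values, and the chance-normed score (mean per-class sensitivity rescaled so that 0 is chance and 100 is perfect) are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def best_cut(
    values: List[float],
    labels: List[int],
    weights: Optional[List[float]] = None,
) -> Optional[Tuple[float, str, float]]:
    """Search every cut point on one metric (illustrative, hypothetical helper).

    labels: 1 = faulty, 0 = not faulty.
    weights: optional per-observation weights; defaults to 1.0 each.
    Returns (cut, direction, score), where direction ">" means
    "predict faulty when value > cut" and "<=" means the reverse rule.
    score is the assumed chance-normed accuracy: 0 = chance, 100 = perfect.
    Returns None when the metric has only one distinct value.
    """
    if weights is None:
        weights = [1.0] * len(values)

    # Total weight in each class, used as the sensitivity denominators.
    total = {0: 0.0, 1: 0.0}
    for y, w in zip(labels, weights):
        total[y] += w
    if total[0] == 0 or total[1] == 0:
        raise ValueError("both classes must be present")

    # Candidate cuts are midpoints between adjacent distinct values.
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

    best = None
    for cut in candidates:
        # Weight correctly classified under the rule "value > cut => faulty".
        hit = {0: 0.0, 1: 0.0}
        for x, y, w in zip(values, labels, weights):
            predicted = 1 if x > cut else 0
            if predicted == y:
                hit[y] += w
        mean_sens = (hit[0] / total[0] + hit[1] / total[1]) / 2

        # The reversed rule has mean sensitivity 1 - mean_sens.
        for direction, ms in ((">", mean_sens), ("<=", 1 - mean_sens)):
            score = (ms - 0.5) / 0.5 * 100
            if best is None or score > best[2]:
                best = (cut, direction, score)
    return best


# Made-up example: a complexity metric for six classes and their fault labels.
complexity = [2, 3, 5, 8, 13, 21]
faulty = [0, 0, 0, 1, 1, 1]
print(best_cut(complexity, faulty))
```

Passing a `weights` list (for example, a severity value per class) makes the search favor cut points that classify the heavily weighted observations correctly. A real analysis would also need the pruning and significance testing that this sketch leaves out.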
The predictive accuracy and parsimony of CTA models have never yet been matched in any published application. This is no accident, because predictive accuracy is the objective function that is maximized (CTA is the only method that does this), and dual pruning methods prevent over-fitting, control the experimentwise Type I error rate (there are no distributional assumptions, so the validity of P values is never in doubt), and maximize parsimony.