What are the different types of tree-based classification techniques?

Paul Yarnold

There are many.

The expression "fault prediction" is not sufficiently specific to describe what you want to do. What is the outcome (class) variable that you want to predict, and what are the predictor variables (attributes)?

The following white paper uses maximum-accuracy classification tree analysis (CTA) to compare the use of chemical biocide versus clean recycled water to produce oil in hydraulic fracturing operations. The water produces greater flowback, putatively attributable in part to lower viscosity--yielding deeper engorgement into and faster withdrawal from faults (this is discussed in a follow-up white paper). The same method can of course be used to predict with maximum accuracy (normed against chance, and weighted if weights are available) where to drill, factors predicting fault failure, and so forth--if data are available.

http://content.stockpr.com/esph2/files/techreports/11.19.2014_ESPH_Ozonix_Report_Package_FINAL.pdf

And, here is an article that describes CTA (rather than simply using it as an analytic engine):

https://www.researchgate.net/publication/291947229_Using_data_mining_techniques_to_characterize_participation_in_observational_studies

Lov Kumar

Dear Paul Yarnold,

Thanks for the reply. We are trying to develop a model to predict whether a class is faulty or not. For this work, we are using different software metrics as input features and the presence of bugs in classes or modules as the output.
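For readers new to this setup, here is a minimal sketch of the basic step that CART-style classification trees repeat at every node: choosing the threshold on one metric that best separates faulty from non-faulty classes, scored by Gini impurity. The function names, the metric, and the data are all made up for illustration; this is not CTA and not any particular tool's implementation.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split.

    Returns (None, gini(labels)) when no split improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # equal values cannot be separated by a threshold
        threshold = (lo + hi) / 2  # midpoint between neighbouring values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical data: one complexity-like metric per class, 1 = faulty.
complexity = [2, 3, 4, 5, 11, 12, 15, 20]
faulty = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, impurity = best_split(complexity, faulty)
print(threshold, impurity)
```

On this toy data the split lands at 8.0 with impurity 0.0, because the two groups do not overlap. Real fault data will overlap, and a full tree applies the same search recursively over all metrics.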

Roberto Diaz

Gradient Boosted Trees have shown very good results in many different tasks:

https://scholar.google.es/scholar?hl=es&q=gradient+boosted+trees&btnG=&lr=

You can also try other tree-based techniques such as Random Forests.
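To give a feel for how tree ensembles work, here is a toy sketch of bagging: many one-split trees (stumps), each trained on a bootstrap sample and one randomly chosen feature, combined by majority vote. This is a deliberately simplified stand-in, not a real Random Forest (which grows deeper trees and samples features at every split) and not gradient boosting (which fits trees sequentially to the remaining errors). All names and data are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """One-split tree on a single feature: keep the threshold with fewest errors."""
    majority = Counter(labels).most_common(1)[0][0]
    # (errors, threshold, class if value <= threshold, class otherwise)
    best = (len(labels) + 1, None, majority, majority)
    for t in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= t]
        right = [y for row, y in zip(rows, labels) if row[feature] > t]
        if not right:
            continue  # a split must leave samples on both sides
        l_cls = Counter(left).most_common(1)[0][0]
        r_cls = Counter(right).most_common(1)[0][0]
        errors = sum(y != l_cls for y in left) + sum(y != r_cls for y in right)
        if errors < best[0]:
            best = (errors, t, l_cls, r_cls)
    _, threshold, l_cls, r_cls = best
    return feature, threshold, l_cls, r_cls


def predict_stump(stump, row):
    feature, threshold, l_cls, r_cls = stump
    if threshold is None:  # no usable split was found; fall back to majority
        return l_cls
    return l_cls if row[feature] <= threshold else r_cls


def fit_bagged_stumps(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    ensemble = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feature = rng.randrange(n_features)
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        ensemble.append(fit_stump(sample_rows, sample_labels, feature))
    return ensemble


def predict(ensemble, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(stump, row) for stump in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical data: (complexity, coupling) per class, label 1 = faulty.
rows = [(2, 1), (3, 2), (4, 1), (5, 3), (3, 2),
        (11, 8), (12, 9), (15, 7), (20, 10), (14, 9)]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

ensemble = fit_bagged_stumps(rows, labels)
print(predict(ensemble, (1, 0)), predict(ensemble, (30, 15)))
```

For real work you would use an established library implementation rather than a sketch like this, and compare the ensemble against a single tree with cross-validation.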

Paul Yarnold

Dear Lov,

Well, that is a perfect application for CTA (predicting faulty or not). If you simply want to predict faulty versus not faulty, CTA is the only methodology that will return the answer that is explicitly proven to be the most accurate and parsimonious. If you also have a metric that can be used to weight the fault (instability, magnitude, depth, or any other numerical weight), then CTA will explicitly maximize weighted accuracy. The second link I posted compares the performance of CTA versus other popular methods, including random forest.

The predictive accuracy and parsimony of CTA models have never yet been matched in any published application. This is no accident: predictive accuracy is the objective function that is maximized (CTA is the only method to do this), and dual pruning methods prevent over-fitting, control the experimentwise Type I error rate (there are no distributional assumptions, so the validity of P values is never in doubt), and maximize parsimony.

Best of luck!

Paul