Which programming language/library/package are you using? J48 performs pruning after growing the full tree, so overfitting should not be that large a source of error.
In a J48 decision tree, overfitting happens when the algorithm is given training data with exceptional attribute values. This causes many fragmentations in the class distribution: statistically insignificant nodes that cover very few examples are known as fragmentations. By default, J48 grows its branches "just deep enough to perfectly classify the training examples". This approach works well with noise-free data, but with noisy data it usually overfits the training examples. At present there are two strategies widely used to avoid this kind of overfitting in decision tree learning (the second is illustrated in the sketch below):

1. Stop the tree from growing before it reaches the point of perfectly classifying the training data (pre-pruning).
2. Let the tree overfit the training data, then post-prune it.
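For concreteness, here is a minimal sketch of strategy 2 using Weka's J48 (the library J48 comes from). The dataset path `data.arff` is a hypothetical placeholder; the parameter values shown are Weka's defaults, and you would tune them for your own data:

```java
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class J48PruningDemo {
    public static void main(String[] args) throws Exception {
        // Load a dataset; "data.arff" is a placeholder path, not from the question.
        Instances data = DataSource.read("data.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // Strategy 2: grow the full tree, then post-prune (J48's default behavior).
        J48 pruned = new J48();
        pruned.setUnpruned(false);         // keep post-pruning enabled
        pruned.setConfidenceFactor(0.25f); // smaller values => more aggressive pruning
        pruned.setMinNumObj(2);            // minimum number of instances per leaf
        pruned.buildClassifier(data);

        // For comparison: the unpruned tree that fits the training data as closely as possible.
        J48 unpruned = new J48();
        unpruned.setUnpruned(true);
        unpruned.buildClassifier(data);

        System.out.println("Pruned tree size:   " + pruned.measureTreeSize());
        System.out.println("Unpruned tree size: " + unpruned.measureTreeSize());
    }
}
```

Comparing the two tree sizes usually shows the post-pruned tree is considerably smaller; lowering the confidence factor prunes more aggressively, which helps on noisy data at the cost of fitting the training set less exactly.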