Professor Kelsey has given a very nice answer. In addition to that information, you may consult the following reference papers:
[1] Sathyadevan, S., & Nair, R. R. (2015). Comparative analysis of decision tree algorithms: ID3, C4.5 and random forest. In Computational Intelligence in Data Mining - Volume 1 (pp. 549-562). Springer, New Delhi.
[2] Ali, J., Khan, R., Ahmad, N., & Maqsood, I. (2012). Random forests and decision trees. IJCSI International Journal of Computer Science Issues, 9(5), 272-278.
[3] Robnik-Šikonja, M. (2004, September). Improving random forests. In European Conference on Machine Learning (pp. 359-370). Springer, Berlin, Heidelberg.
[4] Banfield, R. E., Hall, L. O., Bowyer, K. W., & Kegelmeyer, W. P. (2007). A comparison of decision tree ensemble creation techniques. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(1), 173-180.
[5] Pumpuang, P., Srivihok, A., & Praneetpolgrang, P. (2008, October). Comparisons of classifier algorithms: Bayesian network, C4.5, decision forest and NBTree for course registration planning model of undergraduate students. In 2008 IEEE International Conference on Systems, Man and Cybernetics (pp. 3647-3651). IEEE.
C4.5 takes the training data and generates a single tree. It can handle continuous and categorical attributes as well as missing values, and it revisits the tree after construction to delete nodes or modify the internal structure. It is a very good classification method, but you still have to guard against overfitting. So you typically use the tree to classify held-out test data and use the results to prune the original tree (if needed), striking a balance between training error and the error you get when generalizing to unseen data.
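As a minimal sketch of that prune-for-generalization idea: scikit-learn does not ship C4.5 itself (its DecisionTreeClassifier is an optimized CART variant), but the same effect can be illustrated with cost-complexity pruning, comparing an unpruned tree against a pruned one on held-out data. The dataset and the ccp_alpha value here are just illustrative choices.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# An unpruned tree fits the training data very closely and may overfit.
unpruned = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Cost-complexity pruning (ccp_alpha > 0) removes subtrees that add little
# predictive value, trading a bit of training error for better generalization.
pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0).fit(
    X_train, y_train)

print("unpruned train/test:",
      unpruned.score(X_train, y_train), unpruned.score(X_test, y_test))
print("pruned   train/test:",
      pruned.score(X_train, y_train), pruned.score(X_test, y_test))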
A random forest is a method in which you build many (often a few thousand) classification trees. For each tree you sample your training instances with replacement, and at each node in each tree you consider only a random subset of the attributes. The classifier you get is an aggregate of all these trees; there isn't a single tree that you can draw and analyse. For technical reasons, random forests generally don't overfit, so there is much less need to investigate generalization error.
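A sketch of that recipe, again assuming scikit-learn: bootstrap resampling, a random attribute subset per split, and majority-vote aggregation are all exposed as parameters of RandomForestClassifier. The out-of-bag score gives exactly the cheap generalization estimate mentioned above.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

forest = RandomForestClassifier(
    n_estimators=1000,    # build many trees, as described above
    bootstrap=True,       # each tree sees a resample (with replacement)
    max_features="sqrt",  # each split considers a random attribute subset
    oob_score=True,       # out-of-bag estimate of generalization error
    random_state=0,
).fit(X_train, y_train)

# The prediction is an aggregate (majority vote) over all the trees.
print("test accuracy:", forest.score(X_test, y_test))
print("out-of-bag estimate:", forest.oob_score_)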
So the main differences are: C4.5 produces a single tree, which you have to think about pruning to guard against overfitting, whereas a random forest classifier is an aggregate of several thousand trees generated from random subsets of your data and attributes.