What is the best computer software package for Random Forest Classification?

More Pouya Sinaian's questions See All

What is the most effective approach to urge the government to enhance its focus on pollution control?

26 March 2024 2,648 0 View

How do researchers collaborate?

Hello everyone, I'm Pouya Ataei, a PhD student at Auckland University of Technology. I'm currently researching how collaboration tools affect research productivity and teamwork in academia. I...

22 January 2024 7,593 0 View

API of Natural bitumen?

Can the API of natural bitumen be calculated?

30 November 2023 4,619 4 View

According to scientific papers, the organic carbon content cannot be the values presented in relevant datasets in GEE !?

Maybe the unit should be %, not g/kg (which means 1/1000)! The organic carbon content of peat ranges 18–58% (Agus, Hairiah, et al. 2010) for peatlands, but by implementing these datasets in my...

30 October 2023 6,902 4 View

What does it mean "putting numbers before units, like x 5 g/kg, in some of datasets?

What is the correct interpretation of the notation 'Soil organic carbon content in x 5 g/kg' in the context of soil datasets? If a dataset provides a value, such as 39, using this notation, what...

30 October 2023 5,196 1 View

How to add thiol functional group to already synthesized oligonucleotide?

I have purchased an FMR1 oligonucleotide with 105 bases. I want to immobilize it on the gold electrode for DNA probe use. after studying some procedures, we have found that introducing the thiol...

04 July 2023 9,571 2 View

How to Calculate the average of the received power using the SNR in the fading channel?

I found a sample code for OFDM in fading channel. Can anyone explain how the standard deviation of the noise is calculated? i also uploaded the picture of the code to see

27 October 2022 1,419 0 View

Using PLUMED for calculating binding free energy of ligand?

Is there any way to calculate binding free energy for protein-ligand using PLUMED and SuMD and or conventional MD?

22 January 2022 407 2 View

How can we find charge sharing ratio in Monoatomic bridging with additional bridging fromate ion?

In the schematic drawing below (By L.Osterlund, Solid-state phenomena Vol.162, p 203) different possible formic acid and formate coordination to Ti metal atoms are illustrated. In my case, I'm...

29 December 2021 3,662 0 View

What is the method of electron counting in PbS + organic compounds(formate) ?

I want to passivate the PbS crystal by an organic ligand like Formate. My atomic simulation does not converge and it's because I have not used proper molecular structure. In the first 3 periods...

27 December 2021 5,200 2 View

Feedback defines the constitution of an organism?

“Here is a thought experiment. Let's place Rodolpho Llinas's jarred-brain on top of a body (Fig. 1). I bet Llinas would argue that his jarred-brain retains its own consciousness, and the android...

11 August 2024 2,483 1 View

Self-Organizing Superorganisms—as envisaged by Nenad Sestan (2018)?

The rate of glucose consumption by the neocortex is reduced by over 80% during anesthesia (Sibson et al. 1998), which disables the synapses (Richards 2002) that are inundated by glial tissue (Engl...

08 August 2024 3,118 0 View

Measuring the Intelligence of a Species?

Larger brains, which typically contain more neurons, store and transfer more information (Tehovnik and Chen 2015), but the precise relationship between number of neurons and information has yet to...

05 August 2024 1,238 2 View

How can i do multivariate Time Series forecast using MLP, ANFIS and LSTM?

I need the python code to forecast what crop production will be in the next decade considering climate and crop production variables as seen in the attached.csv file.

05 August 2024 2,977 3 View

The Curse of Evolution and Complexity?

Brain and body mass together are positively correlated with lifespan (Hofman 1993). The duration of neural development is one of the best predictors of brain size, and conception is the best...

05 August 2024 6,247 3 View

Could dyes amplify the spectrum of light to a specific wavelength?

I am interested to know the behavior of dyes toward light. Specifically, Blue dyes re-emit the spectrum, especially from the green zone (known as principal in LED lamps, and blue dyes are known...

05 August 2024 3,290 1 View

How to report results of Generalised Linear Mixed Models in a journal article?

Hi everyone, If you have written or come across any papers where Generalised Linear Mixed Models are used to examine intervention (e.g., in mental health) efficacy, could you please share the...

04 August 2024 4,130 4 View

Need help with my research project on open source SIEM and machine learning?

Hello everyone, I am currently working on a research project that aims to integrate machine learning techniques into an open source SIEM tool to automate the creation of security use cases from...

04 August 2024 3,196 2 View

Swimming/space travel depends on the proprioceptive muscle spindles?

When the entire neocortex is ablated in rodents, although they are still able to swim, all the limbs move continuously and asynchronously (Vanderwolf 2006; Vanderwolf et al. 1978). Normal animals...

03 August 2024 835 3 View

What are the limitations and challenges of using machine learning for predicting concrete compressive strength in practical applications?

Machine learning (ML) has shown great potential in predicting the compressive strength of concrete, an important property for structural engineering. However, its practical application comes with...

03 August 2024 2,546 2 View

Andreas M Brandmaier

I recommend to use R. Specifically, cforest() from library "party" or randomForest() from library "randomForest".

Pekka Jounela

Pouya, RapidMiner visualizes each random forest tree (number of nodes) after training (http://rapid-i.com/)

Sebastiano Panichella

Rweka package of R because you can use others models and compare them... Godd work

Nitish Mishra

Rapidminer have option for random forest, there are several tool for random forest in R but RandomForest is the best one for classification problem.

Pouya Sinaian

Thanks everyone. I've started using R but I will take a look at Rapidminer.

Terence Matthew Sloan

Please note if you are going to use the R randomforest package and you have computational performance issues then you should try the R sprint() package. This has parallel implementation of the R randomforest package. For typical use cases, we can obtain a speedup of around 40 over this same serial code. The sprint randomforest interface exactly mimics the existing serial implementation: modifying existing serial R scripts to take advantage of this functionality is trivial. You can find out more about its performance in the Concurrency and Computation journal article at

http://onlinelibrary.wiley.com/doi/10.1002/cpe.2928/full

Vladimer Kobayashi

R is definitely a good choice. you can use the randomForest package

Adam Hughes

I really think scikitlearn is better than R unless you are already fluent in R. Python is much easier to read/learn, and scikit learn is optimized at the C level, meaning it will already be fast and suited for bigger datasets. That is not to say the randomForest package may not be more extensive than in skilearn, but skilearn does have randForest functions.

You also may want to look into ''ilastik'' if visualization and interactivity are important for you application.

Fabrice Clerot

regarding the second part of the question, the size(s) of the trees are of course problem-dependent and the maximal size is usually a user parameter.

if you can pay, Salford Systems RF is a good choice ; randomForest in R is also quite efficient

Yuk W Cheng

Understand what tools and models are you using before applying it. You can dump any randomforest package if it can produce the final tree to you. You need to test the training time, testing time, accuracy and uncertainty with all available packages related to randomforest. I found that RF in R needs extensive time to train the model, especially with categorical variables over 20 levels. Salford RF provides me a fast training model but with poor in accuracy and large uncertainty. cforest needs huge computer memory with very very poor accuracy and large uncertainty. In addition, you needs huge running to complete the testing part. R rpart provides me a very fast training model but with very very poor in accuracy and large uncertainty. R RF provides me a training model with good accuracy and the lowest uncertainty among all other packages but it took several days to complete the training. Bagging does not provide me a satisfactory answer compared with RF. Hope this can help.

Jyrki Launes

What do you exactly mean by "You can dump any randomforest package if it can produce the final tree to you. You need to test the training time, testing time, accuracy and uncertainty with all available packages related to randomforest."

Are you referring to the actual classifying capabilities in real life situations, or writing about the mathematical properties of a particular algorithm?

It is not likely that you can produce the final tree. There are a lot of commercial packages (RandomForest) and they can produce the final tree. For rpart or CART, you can do it but not randomforest..

Henry

Yes, but what is the relevance of this? Will the classification then be false or inferior in some aspect? I have been in the belief that banks that avoid loan risks and prostate cancer patients who get the right prognosis from their marker studies are pretty happy with results from these applications.

In my statistical consulting and reviewing experience over US 40 billions in different fields, only 20% of the proposed models is useful and about 5% of them may be modelled properly with the right predictor variables. For other proposed models, even all the predictor variables are significant but the information is not the right information. In one situation, I only can have 2-5% accuracy in the prediction, even all the predictor variables are highly significant and not cor-related. I told my client that the right information is not in the model, or you need to provide me more predictor variables, or you may need to wait until the right information exists and collect it. For each prediction, it will cost $5000 to a quarter of million. I need to use it over 20000 times. So, I told my clients that the best available science is not using the best available proposed model. For this type of prediction, I need over 70% prediction accuracy in my model validation and forward validation (not 2-5%). Hope this can help.

Liangliang Nan

Try the ALGLIB

Vasudha Jha

I had a doubt regarding this, I recently loaded a dataset onto my Weka tool but I am not able to apply Random Forest or even Linear Regression on that data. The problem I am facing is that when I select these algorithms to run on the data, the start option is not enabled. Is it something to do with the data set I am using? Since they have nominal and numerical values.

If anyone knows the reason behind this, kindly let me know.

Abdulwaheed Tella

Please what application software is best suited for random forest Algorithm for prediction?

Mehdi Jamei

You can use Salford predictive modeler which contains MARS, CART, Random forest , etc.

Argha Ghosh

Pouya Sinaian But, WEKA 3.9.4 latest version able to do so.