Can I combine a LASSO and decision tree model?

David Eugene Booth

You might have a look at the attached paper. Best wishes David Booth

On rereading your question, I could have been a little clearer. Gradient boosting and lasso are equivalent in the sense that both should select the same predictors on the same data. That is why we used gradient boosting here to confirm our lasso results. I don't know which regression tree method you have in mind, but you might want to keep this result in mind. Best wishes, David Booth
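For readers who have not used lasso as a predictor-selection step, here is a minimal sketch of the idea, not the method from the attached paper. It assumes a continuous outcome, centred variables, and a hand-picked penalty `lam`; the toy data and the names `lasso_cd`, `soft_threshold`, and `centre` are hypothetical. Cyclic coordinate descent sets the coefficients of weak predictors to exactly zero, and the surviving columns are what one could then pass on to a tree model.

```python
import random

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    Assumes y and the columns of X are centred, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 at the start
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant column cannot be used
            # Covariance of column j with the residual that ignores column j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta

def centre(values):
    """Subtract the mean from a list of numbers."""
    m = sum(values) / len(values)
    return [v - m for v in values]

# Hypothetical toy data: 5 predictors, only the first two drive the outcome.
rng = random.Random(1)
n, p = 200, 5
cols = [centre([rng.gauss(0, 1) for _ in range(n)]) for _ in range(p)]
y = centre([2.0 * cols[0][i] - 1.5 * cols[1][i] + rng.gauss(0, 0.5) for i in range(n)])
X = [[cols[j][i] for j in range(p)] for i in range(n)]

beta = lasso_cd(X, y, lam=0.3)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(selected, [round(b, 2) for b in beta])
```

In practice the penalty would be chosen by cross-validation rather than fixed by hand. Note also that lasso is a linear method, so a predictor that matters only through an interaction may be dropped at this stage.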

Sapargali Zhanatauov

I am thinking about your question.

Tashreef Muhammad

I am not very familiar with LASSO, but I know some approaches you can use if you want to reduce the number of features or find the more useful features/predictors for your work.

1. Feature Selection Algorithms:

Measures such as information gain, the chi-square test, and a correlation matrix help you understand which features are associated with the class to be predicted, and how strongly. From the resulting ranking, you can take the top four or five features/predictors (or more, or fewer) for your model.

2. Dimension Reduction Algorithms:

There are algorithms like PCA that reduce dimensionality and thus the number of features. You can use them to obtain more informative features. Unlike feature selection, however, you will not be using the raw features from your data, but processed ones.

In my opinion, you could first use feature selection and dimension reduction algorithms to find out which features hold the most promise, then take only those features and train the model. At least, that is how I run my tests.
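The feature selection step described above can be sketched roughly as follows. This is only an illustration using absolute Pearson correlation with a binary class as the ranking score; the toy data and the names `pearson`, `rank_features`, `skill`, and `motivation` are made up for the example, and information gain or a chi-square test would slot into the same ranking loop.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no linear signal
    return sxy / math.sqrt(sxx * syy)

def rank_features(columns, target, top_k):
    """Return the top_k feature names by |correlation| with target, plus all scores."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores

# Hypothetical toy data: 200 people, class driven by two of four predictors.
rng = random.Random(0)
n = 200
columns = {
    "skill": [rng.gauss(0, 1) for _ in range(n)],
    "motivation": [rng.gauss(0, 1) for _ in range(n)],
    "noise_a": [rng.gauss(0, 1) for _ in range(n)],
    "noise_b": [rng.gauss(0, 1) for _ in range(n)],
}
target = [
    1 if 1.5 * s + 1.0 * m + rng.gauss(0, 0.5) > 0 else 0
    for s, m in zip(columns["skill"], columns["motivation"])
]

selected, scores = rank_features(columns, target, top_k=2)
print(selected)
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name}: {scores[name]:.2f}")
```

One caveat with this kind of filter: it scores each feature on its own, so a feature that matters only in combination with another one can receive a low score and be dropped.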

Holger Steinmetz

Hello Tashreef,

I thought about that, but the problem is that I intend to end up with interpretable profiles of person characteristics that are (in a nonlinearly interactive fashion) associated with success in participating in the hackathon. PCA creates composites that are difficult to interpret and, more importantly, difficult to use in practice (e.g., for identifying students who will likely be successful).

But I will probably give it a try :)

Best,

Holger

Holger Steinmetz, thank you for your response. I understand your requirements.

I think you can give feature selection techniques a run then, to see which features are most strongly related to the class. Given your requirements, I think they are the more appropriate option. In my own experience, using a selection technique reduced the number of features by 50% and gave better results from my models.