I'm trying to predict mercury concentration in soil from a data set of 67 observations and 28 predictor variables. Some of the predictor variables are correlated.
If some of the predictor variables are correlated, they should first be decorrelated using a technique such as principal component analysis (PCA).
PCA is formally a dimension reduction technique; here its primary purpose is to remove multicollinearity by producing completely uncorrelated (orthogonal) linear combinations of the original variables, i.e., principal components (when all other assumptions are met). The orthogonal variables are needed in place of the original ones because only orthogonal variables produce truly independent regression coefficients.
You can add or remove orthogonal regression variables without recalculating all the other coefficients, which allows you to estimate the true relative contribution of each independent variable to the dependent one and to find out which variables are truly significant. Instead of 28 predictor variables you will use only a few uncorrelated linear combinations. You can then proceed with regression on the principal components.
See, for example, chapter 5 of the book "Healthcare Management Engineering: What Does This Fancy Term Really Mean?" (Springer, New York, 2012).
In that example, ordinary regression on the 40 original variables was meaningless, but it turned out that only 9 principal components accounted for practically all of the variation in the data. Regression on the principal components then made it possible to determine the primary contributors among all 40 original variables.
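A minimal sketch of this workflow in Python with scikit-learn, assuming standardized predictors and a 95% explained-variance cutoff (both judgment calls, not rules); `X` and `y` below are random placeholders standing in for your 28 soil predictors and the mercury concentrations:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(67, 28))  # placeholder for the 28 soil predictors
y = rng.normal(size=67)        # placeholder for mercury concentration

# Standardize first; PCA is sensitive to the scales of the variables.
X_std = StandardScaler().fit_transform(X)

# Keep as many components as needed to explain 95% of the variance.
pca = PCA(n_components=0.95)
Z = pca.fit_transform(X_std)   # orthogonal, uncorrelated scores
print(f"{pca.n_components_} components retained")

# Ordinary least squares on the orthogonal scores: the coefficients
# are now independent of one another.
pcr = LinearRegression().fit(Z, y)

# Map the coefficients back to the original (standardized) predictors
# to see which variables contribute most: beta_X = V @ beta_Z, where
# the columns of V are the PCA loading vectors.
beta_X = pca.components_.T @ pcr.coef_
for i in np.argsort(np.abs(beta_X))[::-1][:5]:
    print(f"predictor {i}: standardized coefficient {beta_X[i]:+.3f}")
```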
After PCA, I would try random forest regression, because it deals quite well with continuous "wide" data like yours (many predictors relative to observations), and because it provides feature importance scores. These scores can be used to investigate how the predictors relate to the predicted value (which predictors contribute most), something that nonlinear SVMs do not provide. Be careful, though: if your limited dataset is noisy, there is a good chance of overfitting, regardless of the method.
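For illustration, a minimal random forest sketch with scikit-learn, again with placeholder data; the out-of-bag score is one cheap check against the overfitting risk mentioned above:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(67, 28))  # placeholder for the 28 soil predictors
y = rng.normal(size=67)        # placeholder for mercury concentration

# The out-of-bag score gives a built-in estimate of generalization,
# useful as a first check for overfitting on a small, noisy dataset.
rf = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=0)
rf.fit(X, y)
print(f"out-of-bag R^2: {rf.oob_score_:.3f}")

# Impurity-based importances (they sum to 1) rank the predictors by
# contribution; permutation importance is a more robust alternative.
for i in np.argsort(rf.feature_importances_)[::-1][:5]:
    print(f"predictor {i}: importance {rf.feature_importances_[i]:.3f}")
```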
Samiksha Pantha, you did not provide enough details about your problem and available data to recommend a specific regression technique. But, as I mentioned in my comment, an example of linear regression with principal components is given in chapter 5 of the book cited above.