Is normality checking of residual important for machine learning?

More Pankaj Das's questions See All

What's the most suitable dataset type for ANN and SVR?

I can't differentiate the application of SVR and ANN based on dataset nature. Please help me to understand the concept.

10 November 2018 2,635 0 View

Is the numbers of support vector effect on prediction???

I want to what is the effect on prediction/forecasting when the numbers of support vectors.

07 August 2018 3,224 3 View

What is 3SLLS technique? I need a detail theory of the topic?

I want to work on it..so I need material and concept behind the topic?

04 May 2017 8,209 0 View

DEEP learning techniques

need some materials to understand the concept of Deep learning.. help me.

01 January 1970 4,220 16 View

Feedback defines the constitution of an organism?

“Here is a thought experiment. Let's place Rodolpho Llinas's jarred-brain on top of a body (Fig. 1). I bet Llinas would argue that his jarred-brain retains its own consciousness, and the android...

11 August 2024 2,483 1 View

Self-Organizing Superorganisms—as envisaged by Nenad Sestan (2018)?

The rate of glucose consumption by the neocortex is reduced by over 80% during anesthesia (Sibson et al. 1998), which disables the synapses (Richards 2002) that are inundated by glial tissue (Engl...

08 August 2024 3,118 0 View

Measuring the Intelligence of a Species?

Larger brains, which typically contain more neurons, store and transfer more information (Tehovnik and Chen 2015), but the precise relationship between number of neurons and information has yet to...

05 August 2024 1,238 2 View

How can i do multivariate Time Series forecast using MLP, ANFIS and LSTM?

I need the python code to forecast what crop production will be in the next decade considering climate and crop production variables as seen in the attached.csv file.

05 August 2024 2,977 3 View

The Curse of Evolution and Complexity?

Brain and body mass together are positively correlated with lifespan (Hofman 1993). The duration of neural development is one of the best predictors of brain size, and conception is the best...

05 August 2024 6,247 3 View

Need help with my research project on open source SIEM and machine learning?

Hello everyone, I am currently working on a research project that aims to integrate machine learning techniques into an open source SIEM tool to automate the creation of security use cases from...

04 August 2024 3,196 2 View

How to analyze multiple phosphorilation sites?

Hello, I am currently analyzing some phosphoproteomics data, but I have peptides with multiple phosphorylation sites or phosphorylations together with carbamidomethylation or oxidation. How can I...

04 August 2024 8,432 3 View

Swimming/space travel depends on the proprioceptive muscle spindles?

When the entire neocortex is ablated in rodents, although they are still able to swim, all the limbs move continuously and asynchronously (Vanderwolf 2006; Vanderwolf et al. 1978). Normal animals...

03 August 2024 835 3 View

What are the limitations and challenges of using machine learning for predicting concrete compressive strength in practical applications?

Machine learning (ML) has shown great potential in predicting the compressive strength of concrete, an important property for structural engineering. However, its practical application comes with...

03 August 2024 2,546 2 View

Some new emerging problems on application of RL for scheduling in IoT networks?

I have seen plenty of existing works on applied Reinforcement Learning (RL) policies for optimized scheduling in IoT networks including Q-learning, DQNs, and the newer ones including PPO for...

01 August 2024 8,754 2 View

Shin Siang Choong

Residual analysis is a good way to validate the performance of a regression model. A residual plot can visualize the influence of other variables to the regression accuracy. For example, if the residuals are near to 0 on the left part, and far from 0 on the right part of the residual plot, you could discover that, the model works very well in predicting smaller values, but it always wrongly predicts bigger values. Then, actions (e.g. data transformation) could be taken to improve the regression model.

The following link is a good source for your understanding:

http://docs.statwing.com/interpreting-residual-plots-to-improve-your-regression/

Ranjan Kumar Dash

Residual analysis is one of the important concept that must be learnt before writing any ML code. It is particularly important in regression.

Navid Kardani

Hi,

The analysis of residuals plays an important role in validating the regression model. Using residual plots, you can assess whether the observed error (residuals) is consistent with stochastic error. You shouldn’t be able to predict the error for any given observation. And, for a series of observations, you can determine whether the residuals are consistent with random error.

Cheers

Pankaj Das

I know residual analysis must for all statistical models like regression. But Machine Learning approach not belong to them. It is totally different from normal regression assumptions. We also cross validate the results. So is it necessary to check residuals?

Thank all of you for your suggestions.

Pankaj Das I think residual analysis is still helpful when using Machine Learning approaches (e.g random forest, deep learning, etc) for a regression task. Regardless of the types of approaches that are used, residual analysis is still informative for recommending actions that could be taken to improve the regression performance.

Thanks

Ilya Jackson

Residuals indicate the quality of a model. If the expected value of residuals is not close to 0, it implies that the model is systematically biased toward either over-prediction or under-prediction. Besides, if residuals contain a pattern, it seems that the model is failing to explain some relations within the data and, thus, qualitatively inconsistent. It is also worth to check, whether residuals are distributed normally and homoscedastic (their variance does not change over time). If residuals are heteroscedastic, this means that the predictive power of the model is different for different sections of the data and, perhaps, it is worth thinking about dividing the dataset into two (or more) subsets in order to train two (or more) models, such that each model is specializing on the corresponding subset.