I am looking for a comparison of commonly used optimization techniques in the context of ML linear models, such as gradient descent (and its varieties), QR decomposition, etc. I am also interested in their R and Python implementations.
My advice is to save yourself some work. As soon as some new regression technique (e.g. any of the known variants of the lasso) is worked out, it is immediately coded into R and made available. Can you be more specific about what you are looking for? It may be easier for someone to help you. Best, D. Booth
Hello, a while ago I came across the stepwise technique; it selects variables in a multiple linear model using the AIC criterion. I invite you to review the technique.
Hi, whatever you do, DO NOT USE STEPWISE. Please see the attached paper for why, and for what you might try instead if you are in a prediction mode. Best wishes, David Booth
Agree with David Eugene Booth. Beginners usually love to use the stepwise method for variable selection, which is not recommended.
However, my question was neither about 'variable selection' nor about 'code for linear models in R/Python'.
What I am interested in is the 'optimization techniques' used to minimize the cost function of linear models. A few of these are the normal equation method, gradient descent (and its variants such as stochastic GD), QR decomposition (the default method used by lm() in R), genetic algorithms, etc.
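To make the comparison concrete, here is a minimal NumPy sketch of three of those methods (the toy data, learning rate, and iteration count are my own choices, not from the thread): the normal equations, the QR decomposition that lm() relies on, and batch gradient descent. On a well-conditioned problem all three recover essentially the same coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 2 + 3*x + noise, with an explicit intercept column.
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 3.0 * x + 0.1 * rng.normal(size=n)

# 1) Normal equation method: solve (X'X) beta = X'y directly.
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# 2) QR decomposition (the approach behind R's lm()): X = QR,
#    then solve the triangular system R beta = Q'y.
Q, R = np.linalg.qr(X)
beta_qr = np.linalg.solve(R, Q.T @ y)

# 3) Batch gradient descent on the least-squares cost (1/2n)||X beta - y||^2.
beta_gd = np.zeros(2)
lr = 0.1
for _ in range(5000):
    grad = X.T @ (X @ beta_gd - y) / n  # gradient of the cost
    beta_gd -= lr * grad

print(beta_normal, beta_qr, beta_gd)
```

The QR route is preferred in practice over forming X'X explicitly, because X'X squares the condition number of X; stochastic GD would replace the full-data gradient above with the gradient on a random mini-batch.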
I am looking for an in-depth answer from someone with expertise in both optimization and linear models.
Most of us use inv(X'X)X'y because most linear models are fit by OLS. If more efficient methods are required, it depends on the requirement. If you go to variable selection models (my apologies for mentioning them again), you find various methods used depending on the characteristics of the mathematical programming model involved. If you look at the various lasso variants, most use different ones, but the characteristics of the problem dictate the choice. Some of the choices used in the early work are discussed in papers available at: https://www.google.com/search?rlz=1C1CHBF_enUS847US847&ei=BilaXbDgHsu8tQWw4LHwDw&q=boos+adaptive+lasso+ncsu&oq=boos+adaptive+lasso+ncsu&gs_l=psy-ab.3...47432.61674..64807...0.2..0.532.1383.0j3j2j5-1......0....1..gws-wiz.......0i71j33i299j33i160j33i22i29i30.e2KgATrMmz0&ved=0ahUKEwiw-_3jj47kAhVLXq0KHTBwDP4Q4dUDCAo&uact=5
and similar searches. I apologize for only taking a year of numerical analysis and a couple of statistical computing courses, but I would have thought that if I could find such information, it would be general knowledge. You might look at the group lasso and the adaptive group lasso; I found those particularly interesting. The reason the lasso variants are still coming is that the optimization requirements of different techniques are different. May the force be with you. D. Booth N.B. For an introduction to many of these, see the link: https://www.google.com/search?q=optimization+methods+in+statistics+and+machine+learning&rlz=1C1CHBF_enUS847US847&oq=optimization+methods+in+statistics+and+machine+learning&aqs=chrome..69i57j33.48632j1j8&sourceid=chrome&ie=UTF-8
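As an illustration of why the lasso cannot use the closed-form OLS machinery above (the L1 penalty is non-differentiable at zero), here is a bare-bones coordinate descent sketch, the standard optimizer for the lasso. This is my own toy implementation, not code from any package; it assumes roughly standardized columns and does no convergence checking.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator: the proximal step for the L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Plain coordinate descent for min_b (1/2n)||y - X b||^2 + lam*||b||_1.
    A sketch only: fixed sweep count, no tolerance-based stopping."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n  # per-coordinate curvature x_j'x_j / n
    for _ in range(n_iter):
        for j in range(p):
            r_j = y - X @ b + X[:, j] * b[j]  # partial residual excluding j
            rho = X[:, j] @ r_j / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
    return b
```

With lam = 0 the soft-threshold is the identity and this converges to the OLS solution; as lam grows, coefficients are shrunk and set exactly to zero, which is the variable-selection behavior the lasso variants exploit. Production implementations (e.g. glmnet) add warm starts over a lambda path and active-set tricks on top of this same update.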
PS you may find the attached interesting as well. DB