Genetic programming uses an evolutionary algorithm to evolve mathematical expressions or algorithms. The search aims to find a solver that solves a problem well. Many genetic programming researchers evolve programs or mathematical expressions and then apply them to solve the problem; they are interested in good problem solutions. Other researchers (like me) prefer to study the evolved algorithms or mathematical expressions themselves and apply them to other instances of the same type of problem. Those results give us an idea of how general the generated solver is.
You may find these papers useful. I would also look into the work of these authors: Koza, Banzhaf, Langdon, J. F. Miller and L. Spector.
Patricia
Conference Paper GECCO 2013 tutorial: Cartesian genetic programming
Conference Paper Generating Human-readable Algorithms for the Travelling Sale...
If your aim is to find the best line that represents those data points, there are numerous ways in which one can be found. One of the simplest ways is using a linear regression model.
In the following link, check the first two answers. The first gives a general idea of the methods already implemented in software (an example written in R), and the second covers the algebra behind it.
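For anyone who wants to see the algebra in code form, here is a minimal sketch of an ordinary least-squares line fit in plain Python. The function names (fit_line, r_squared) and the sample data are made up for illustration; they are not taken from the linked answers, and a statistics package will give you the same numbers with more diagnostics.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Proportion of variance in ys accounted for by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Illustrative data only (hypothetical measurements).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
print("slope:", slope, "intercept:", intercept)
print("R squared:", r_squared(xs, ys, slope, intercept))
```

The slope is the ratio of the x,y co-variation to the x variation, and the intercept is chosen so that the line passes through the mean point of the data.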
Thank you Rob and Diego, but I found that SVR (support vector regression) is better than linear regression because of its lower RMSE. As mentioned in the following link, linear regression and SVR are compared on the same input dataset. The results show higher accuracy for SVR, which fits the sample points better.
I wonder if you are joking? SVR provides a curvilinear fit, not a linear fit, and must do so at the sacrifice of degrees of freedom. Any standard linear or multi-linear regression program will produce the fit with the MSE minimized; that is how it sets the line parameters (b and c) and calculates R, or R-squared, the proportion of variance accounted for.
A curvilinear fit ought not to be attempted unless you are testing a theory that calls for a curvilinear model, and then you ought to use a non-linear modeling program. When SVR obtains a better fit than a linear model, you run the risk of having optimized on error in the original data, which begs for empirical cross validation, and experience suggests the cross-validation R-squared will be disappointing. You might then run a standard linear model, to which you can apply any of the shrinkage estimates without having to run an empirical cross validation, although an empirical cross validation gives the most reliable results.
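To make the cross-validation point concrete, here is a rough sketch in plain Python that compares the in-sample R-squared of a straight-line fit with a k-fold cross-validated R-squared and with the usual adjusted R-squared, taken here as one example of a shrinkage estimate. The helper names, the fold assignment by index stride, and the data are all assumptions made for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def r2(ys, preds):
    """R squared of predictions against observed values."""
    y_bar = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def cross_validated_r2(xs, ys, k=5):
    """Each point is predicted by a line fitted without that point's fold."""
    n = len(xs)
    preds = [0.0] * n
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th point is held out
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        m, c = fit_line(train_x, train_y)
        for i in test_idx:
            preds[i] = m * xs[i] + c
    return r2(ys, preds)

def adjusted_r2(r2_value, n, p=1):
    """Adjusted R squared: 1 - (1 - R^2)(n - 1)/(n - p - 1), p predictors."""
    return 1.0 - (1.0 - r2_value) * (n - 1) / (n - p - 1)

# Hypothetical data: y = 2x + 1 plus small fixed deviations.
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.3, -0.3, 0.1]
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 + e for x, e in zip(xs, noise)]

m, c = fit_line(xs, ys)
in_sample = r2(ys, [m * x + c for x in xs])
cv = cross_validated_r2(xs, ys, k=5)
shrunk = adjusted_r2(in_sample, n=len(xs), p=1)
print("in-sample:", in_sample, "cross-validated:", cv, "adjusted:", shrunk)
```

For a least-squares fit, the held-out errors are never smaller than the in-sample errors, so the cross-validated figure is the more honest one. The same comparison can be run for any model, SVR included, by swapping the fitting step.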
One can use y = mx + c, where y is the ordinate, x is the abscissa, m is the slope of the straight line and c is its intercept. First find the statistical centroid of the data points and let the line pass through it. Rotating the line about that point, and translating it forward or backward, then gives the best linear relationship in the x,y scatter. Microsoft Excel and MATLAB both have a facility to find and plot such a linear relationship for an x,y scatter.
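As a rough illustration of the centroid idea, the sketch below pins a line to the centroid, rotates it in small angular steps, and keeps the slope with the smallest sum of squared vertical errors. It then compares that slope with the closed-form least-squares slope. The function names, the step count and the data are assumptions for illustration; this brute-force scan is only meant to show the geometry, not to replace the closed-form fit.

```python
import math
from statistics import mean

def closed_form_slope(xs, ys):
    """Least-squares slope from the standard formula Sxy / Sxx."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx

def best_slope_by_rotation(xs, ys, steps=20000):
    """Rotate a line about the centroid and keep the slope with least error."""
    x_bar, y_bar = mean(xs), mean(ys)
    best_slope, best_sse = None, math.inf
    for i in range(1, steps):  # skip the two vertical orientations
        angle = -math.pi / 2 + math.pi * i / steps
        slope = math.tan(angle)
        sse = sum((y - (y_bar + slope * (x - x_bar))) ** 2
                  for x, y in zip(xs, ys))
        if sse < best_sse:
            best_slope, best_sse = slope, sse
    return best_slope

# Illustrative data only (hypothetical measurements).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

m_rot = best_slope_by_rotation(xs, ys)
m_exact = closed_form_slope(xs, ys)
c = mean(ys) - m_exact * mean(xs)  # intercept of the line through the centroid
print("rotation search:", m_rot, "closed form:", m_exact, "intercept:", c)
```

When the error is measured vertically, the least-squares line always passes through the centroid, so in that case the rotation alone is enough and no extra translation is needed.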