I have a large dataset and want to find out whether its structure is linear or non-linear. Based on that, I need to decide whether to use a neural network (which is good for non-linear representations) or PCA.
Data visualization would help if the dataset has no more than three dimensions. You can also plot each attribute against the output and look at how it varies. That would help you judge the structure of the data. There may also be more advanced statistical methods for doing this.
I think you could fit a regression line to the data and calculate a measure of dispersion (e.g. the sum of squared errors), which would tell you *how* linear the relationship is. Computing the same measure for synthetic, perfectly linear data (which would have squared errors = 0) and for some known non-linear relationships should give you reference cases to compare against.
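A minimal sketch of that idea in plain Python, in case it helps. The function names (`fit_line`, `linearity_score`) and the four reference cases are made up for illustration, and R² is added only as a scale-free companion to the sum of squared errors (it is just 1 - SSE/SST), since the raw SSE depends on the units of the output.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def linearity_score(xs, ys):
    """Return (SSE, R^2) of the best straight line through the points."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return sse, 1.0 - sse / sst


random.seed(0)
xs = [i / 10 for i in range(-50, 51)]  # 101 evenly spaced points in [-5, 5]

# Hypothetical reference cases: one exact line, one noisy line, two curves.
reference_cases = {
    "perfect line": [2.0 * x + 1.0 for x in xs],
    "noisy line": [2.0 * x + 1.0 + random.gauss(0, 1) for x in xs],
    "parabola": [x ** 2 for x in xs],
    "sine": [math.sin(x) for x in xs],
}

scores = {name: linearity_score(xs, ys) for name, ys in reference_cases.items()}
for name, (sse, r2) in scores.items():
    print(f"{name:>12}: SSE = {sse:10.3f}, R^2 = {r2:.3f}")
```

Running `linearity_score` on your own attribute/output pairs and comparing against these cases gives a rough sense of where the data falls. Keep in mind that a poor straight-line fit can come from noise as well as from curvature, so it is worth looking at the residuals too.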
To begin with, a simple scatter plot will give you a clue. If the dimensionality is very high, PCA will help to some extent in understanding the variability (in terms of eigenvalues). However, as Aleksandra Perz suggested, run both a linear and a non-linear regression analysis and accept whichever fit gives the lower RMSE.
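Here is a rough sketch of the RMSE comparison and the eigenvalue check, assuming NumPy is available. The data is synthetic, and a degree-2 polynomial stands in for "non-linear regression" purely as an assumption; substitute whatever model suits your data. One addition of mine: the RMSE is computed on a held-out part of the data, because under least squares a more flexible nested model never fits the training data worse than the linear one, so the training RMSE alone would always favour it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for real data: a quadratic signal plus Gaussian noise.
x = rng.uniform(-3.0, 3.0, size=400)
y = 0.5 * x**2 - x + rng.normal(0.0, 0.3, size=x.size)

# x was drawn in random order, so slicing gives a random train/test split.
x_train, x_test = x[:300], x[300:]
y_train, y_test = y[:300], y[300:]


def heldout_rmse(degree):
    """Fit a polynomial of the given degree on the training part and
    return its RMSE on the held-out part."""
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    residuals = y_test - np.polyval(coeffs, x_test)
    return float(np.sqrt(np.mean(residuals**2)))


rmse_linear = heldout_rmse(1)
rmse_quadratic = heldout_rmse(2)
print(f"linear RMSE:    {rmse_linear:.3f}")
print(f"quadratic RMSE: {rmse_quadratic:.3f}")

# PCA eigenvalues are the eigenvalues of the covariance matrix of the data
# (np.cov centres the columns itself).
data = np.column_stack([x, y])
eigenvalues = np.linalg.eigvalsh(np.cov(data, rowvar=False))[::-1]  # largest first
explained = eigenvalues / eigenvalues.sum()
print("share of variance per principal component:", np.round(explained, 3))
```

If the non-linear fit has a clearly lower held-out RMSE, that points to a non-linear relationship. The eigenvalues only describe how variance is spread along linear directions, so they help with judging dimensionality rather than with detecting curvature.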
I don't know about MATLAB, but I wrote a script in Python (couldn't resist :P) that does this, so if you can use Python or the Linux command line, I could share it.
Hey, I'm sorry this took me a while, I was a little busy ;) Here's the GitHub repo with the script... As stated in the README, this is not formal hypothesis testing, but is rather intended to give you some general intuition about the relationships between the variables.