Suppose we have two variables, one dependent and one independent, and the correlation between them is almost 0. Can we apply any method or technique to develop a model?
Yes: plot the data and get an impression. If you see a pattern, you may think about some model that would produce such a pattern. You may then think harder to find out how you could experimentally show that this is not an appropriate model (or at least that the key feature of the model is not a good explanation for what you see). Then you can actually perform the experiment and "test that hypothesis".
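To make this concrete, here is a minimal sketch in Python of why a near-zero correlation does not rule out a model. The data are made up for illustration (y depends on x only through x squared), and the helper name pearson is just a local function, not a library call.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up data: x is symmetric around 0 and y is a pure quadratic in x.
xs = [i / 10 for i in range(-50, 51)]
ys = [x * x for x in xs]

# Linear correlation between x and y: essentially zero by symmetry.
r_raw = pearson(xs, ys)
# Correlation once the pattern seen in the plot (a parabola) is used as predictor.
r_transformed = pearson([x * x for x in xs], ys)

print(f"r(x, y)    = {r_raw:.3f}")
print(f"r(x**2, y) = {r_transformed:.3f}")
```

The correlation coefficient only measures the linear part of the relationship, so a scatter plot of xs against ys would show the parabola immediately even though r(x, y) is about zero.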
Thanks, Carlos Jimenez-Gallardo. I also use the same approach to develop models. There is no linear correlation between any of the independent variables (n) and the dependent variable (1), but we can use combinations of the independent variables to develop better models.
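A small sketch of how a combination of variables can carry the signal when no single variable does. Everything here is hypothetical: x1 and x2 are two invented predictors on a symmetric grid, and y is constructed from their product, so this illustrates the idea rather than any real metric data.

```python
import math
from itertools import product

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical predictors: every pairing of five symmetric levels.
levels = [-2, -1, 0, 1, 2]
grid = list(product(levels, levels))
x1 = [a for a, _ in grid]
x2 = [b for _, b in grid]

# Invented response that depends only on the interaction of x1 and x2.
y = [3 * a * b + 1 for a, b in grid]

r_x1 = pearson(x1, y)                              # x1 alone
r_x2 = pearson(x2, y)                              # x2 alone
r_combo = pearson([a * b for a, b in grid], y)     # the product x1 * x2

print(f"r(x1, y)      = {r_x1:.3f}")
print(f"r(x2, y)      = {r_x2:.3f}")
print(f"r(x1 * x2, y) = {r_combo:.3f}")
```

Each predictor on its own is uncorrelated with y, while their product is perfectly correlated with it. Real data will be far less clean, but screening variables one at a time can miss this kind of structure.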
To complement Jochen's point, the other direction you have to work is from model to data. What process generated the data? I've been working this past couple of weeks on data from an epilepsy study, where there should be two distinct processes at work. One process determines whether seizures will occur, while the other process determines how many seizures will occur each day. So we're expecting to see an excess of zero seizures. Once I understood the process that gave rise to the data, I knew what I should see when I explored it, and how I might model it.
You cannot tell simply from looking at the data what process gave rise to it. To fit the right model, you need to understand (or think you understand!) the process that gave rise to the data. And if there are several candidate processes, you need to understand how each might leave its 'fingerprint' on the data.
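As a rough illustration of the "fingerprint" idea, the sketch below simulates a two-process count and compares the share of zeros with what a single Poisson process of the same mean would give. This is an assumption-laden toy, not the epilepsy study's model: it treats the data as a zero-inflated Poisson, and P_ACTIVE_DAY, RATE_ON_ACTIVE_DAY and N_DAYS are invented values.

```python
import math
import random

def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

rng = random.Random(0)          # fixed seed so the run is repeatable

P_ACTIVE_DAY = 0.3              # hypothetical chance that seizures occur at all on a day
RATE_ON_ACTIVE_DAY = 3.0        # hypothetical mean count on such a day
N_DAYS = 2000                   # hypothetical number of simulated days

# Process 1 decides whether anything happens; process 2 decides how many.
counts = [
    poisson_draw(rng, RATE_ON_ACTIVE_DAY) if rng.random() < P_ACTIVE_DAY else 0
    for _ in range(N_DAYS)
]

mean_count = sum(counts) / N_DAYS
observed_zero_fraction = counts.count(0) / N_DAYS
# Share of zeros a single Poisson process with the same mean would produce.
poisson_zero_fraction = math.exp(-mean_count)

print(f"mean count per day      = {mean_count:.2f}")
print(f"observed share of zeros = {observed_zero_fraction:.2f}")
print(f"Poisson share of zeros  = {poisson_zero_fraction:.2f}")
```

The gap between the observed and the Poisson share of zeros is the excess of zeros you would go looking for once you know two processes are at work. Without that knowledge, the same histogram could simply be read as a badly fitting Poisson.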
I am working on software change proneness using static source metrics. When we calculate the correlation, it only shows the linear relation between them. It is possible that there is no direct linear correlation between them.
You have to check for the presence of anomalies in the data set.
See, for example,
Prykhodko S.B. Statistical anomaly detection techniques based on normalizing transformations for non-Gaussian data // Computational Intelligence (Results, Problems and Perspectives): Proceedings of the International Conference, May 12-15, 2015, Kyiv-Cherkasy, Ukraine / Ministry of Education and Science of Ukraine, Taras Shevchenko National University of Kyiv and [etc]; Vitaliy Ye. Snytyuk (Editor). – Cherkasy: editor July Chabanenko, 2015. – p.286-287. – ISBN 978-966-493-975-8
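As a generic illustration of an anomaly check (this is not the normalizing-transformation method from the paper cited above, just a common robust screen), the sketch below flags values by their modified z-score based on the median and the median absolute deviation. The metric_values list is invented, and the 0.6745 constant and 3.5 cutoff are the conventional choices for this score.

```python
import statistics

def modified_z_scores(values):
    """Modified z-scores based on the median and median absolute deviation (MAD)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; modified z-scores are undefined")
    return [0.6745 * (v - med) / mad for v in values]

# Invented values of one software metric across ten programs.
metric_values = [12, 15, 14, 13, 16, 15, 14, 95, 13, 14]

scores = modified_z_scores(metric_values)
# Conventional cutoff: flag points whose absolute modified z-score exceeds 3.5.
outliers = [v for v, z in zip(metric_values, scores) if abs(z) > 3.5]

print("flagged values:", outliers)
```

A single extreme value like this can pull a correlation coefficient toward or away from zero, so it is worth looking at flagged points before fitting anything.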
I believe that some of the answers above get you going in the right direction: what are the theoretical underpinnings of the model, what does the theory tell you about which variables to include, and do the assumptions of OLS hold?
Currently we are doing this analysis on 500 different PLC programs. For each program, we calculate 9 different types of software metrics and the change proneness.
After obtaining all the values, we try to find the relation between change proneness and the software metrics.