I think you should use a clustering method to obtain similarity measures for the datasets. Your analysis model should probably determine the most useful clustering method for each case.
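To make the suggestion a bit more concrete, here is a minimal sketch of one possible reading of it: describe each dataset by a small vector of summary statistics and then cluster those vectors, so that datasets landing in the same cluster are treated as similar. The meta-features, the toy datasets `A`, `B`, `C`, and the helper names are all assumptions made for illustration, and single-linkage is only one of many clustering methods that could be tried.

```python
import math
from statistics import mean, pstdev


def meta_features(rows):
    """Summarise a numeric dataset (list of equal-length rows) as a vector.

    The features used here (log of row count, column count, average column
    mean, average column standard deviation) are illustrative assumptions.
    """
    columns = list(zip(*rows))
    return [
        math.log(len(rows)),
        float(len(columns)),
        mean(mean(c) for c in columns),
        mean(pstdev(c) for c in columns),
    ]


def standardise(vectors):
    """Z-score every feature so that no single feature dominates distances."""
    cols = list(zip(*vectors))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant features
    return [[(v - m) / s for v, m, s in zip(vec, mus, sds)] for vec in vectors]


def single_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical toy datasets: A and B are near-copies, C is very different.
datasets = {
    "A": [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]],
    "B": [[1.1, 2.1], [2.1, 2.9], [2.9, 4.1]],
    "C": [[100.0, 5.0, 1.0], [300.0, 9.0, 7.0],
          [500.0, 1.0, 4.0], [700.0, 3.0, 2.0]],
}
names = list(datasets)
features = standardise([meta_features(datasets[n]) for n in names])
groups = single_linkage(features, n_clusters=2)
labelled = [[names[i] for i in g] for g in groups]
print(labelled)
```

Which clustering method works best will depend on the data, so in practice one might run several methods on the same feature vectors and compare the groupings.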
Hello Nitin Pise, I am just wondering whether you could elaborate more on the question? This may help in getting answers. As an example, perhaps only a very vague one: I wrote an article on the Global Children's Challenge program, which is on the in-touch page of my website. The class average over 50 days for the class at my daughter's school was 14003 steps/day. This program is run worldwide and, although the website has now shut down, the goal was to compare results across 50 countries. In this instance you would have geographical, climate, and socioeconomic factors that vary, and I guess you could adjust for these differences in the analysis. Is that the kind of comparison you have in mind? That is, do you have mean values of a variable for groups within countries or areas, where the datasets differ in some measure of education, socioeconomic status, employment, or something similar that you need to factor into the equation? Is this a consistent measure? For example, in Australia we have a socio-economic index for areas or the like. Just a few thoughts, for what they are worth. Best wishes, Deborah Hilton, Statistics Online
Well, I would like to offer another suggestion. Have you thought about using some kind of "degree of similarity" to establish proximity among attributes? Maybe you could use fuzzy clustering or other fuzzy logic tools. I look forward to helping you. Kind regards, Wagner Arbex
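As a rough illustration of the fuzzy clustering idea, the sketch below implements a plain fuzzy c-means loop in standard-library Python. Each item receives a membership degree in every cluster, which can be read as a "degree of similarity". The sample points, the fuzzifier `m = 2.0`, the fixed iteration count, and the deterministic initialisation are assumptions chosen to keep the example small and reproducible; in a real analysis each point could be a vector describing one attribute or one dataset.

```python
import math


def memberships_for(points, centres, m):
    """Membership of each point in each cluster; every row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        # Clamp distances to avoid dividing by zero when a point is a centre.
        dists = [max(math.dist(p, c), 1e-12) for c in centres]
        rows.append(
            [1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists]
        )
    return rows


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50):
    """Return (centres, memberships) after a fixed number of iterations."""
    # Assumption: start from points spread evenly over the input order,
    # rather than from a random start, so the result is reproducible.
    step = max(1, len(points) // n_clusters)
    centres = [list(points[k * step]) for k in range(n_clusters)]
    dim = len(points[0])
    for _ in range(n_iter):
        u = memberships_for(points, centres, m)
        for k in range(n_clusters):
            # Each centre is the mean of all points, weighted by u ** m.
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            centres[k] = [
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ]
    return centres, memberships_for(points, centres, m)


# Hypothetical sample: two well-separated groups of three points each.
points = [
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
]
centres, u = fuzzy_c_means(points, n_clusters=2)
for p, row in zip(points, u):
    print(p, [round(x, 3) for x in row])
```

Unlike hard clustering, an item that lies between two groups would get intermediate membership values here, which is what makes the output usable as a graded proximity measure.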