I have a set of measurements for Y (with error bars) for different values of X. How do I take into account the error bars in finding if Y is significantly correlated with X (using either Spearman or Pearson correlation)?
If the Y values at a particular X are independent and if you have the data "behind" the error bars, simply use them.
If the Y values at a particular X are independent and you only have the error bars, use the inverse of their length as weights and do a weighted correlation analysis.
If the Y values at a particular X are not independent, use only the means and forget about the error bars.
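For the weighted case, a minimal sketch in Python might look like the following. The data, the variable names and the helper weighted_pearson are all made up for illustration, and it simply takes 1/length of each error bar as the weight, as suggested above. It only gives the weighted coefficient, not a significance test.

```python
import numpy as np

def weighted_pearson(x, y, w):
    """Weighted Pearson correlation of x and y with non-negative weights w."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    w = w / w.sum()                          # normalise weights to sum to 1
    mx, my = np.sum(w * x), np.sum(w * y)    # weighted means
    cov = np.sum(w * (x - mx) * (y - my))    # weighted covariance
    var_x = np.sum(w * (x - mx) ** 2)        # weighted variances
    var_y = np.sum(w * (y - my) ** 2)
    return cov / np.sqrt(var_x * var_y)

# Hypothetical measurements: proportions y at five x values,
# with the total length of each error bar (upper minus lower limit).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.12, 0.18, 0.15, 0.25, 0.30])
ci_length = np.array([0.10, 0.10, 0.12, 0.10, 0.11])

weights = 1.0 / ci_length                    # inverse error-bar length
r_weighted = weighted_pearson(x, y, weights)
r_unweighted = weighted_pearson(x, y, np.ones_like(x))
print(r_weighted, r_unweighted)
```

With equal weights this reduces to the ordinary Pearson coefficient, so if the error bars all have similar lengths the two numbers will be close.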
OK thank you, this is an obvious answer. Let me explain the problem in more detail: I do not have the data that give rise to error bars (in fact, the error bars are the confidence intervals of proportions). They are of similar widths for most data points, so weighting would not make a difference. When I plot the values, I get a significant +ve correlation. However, when I plot the data with error bars, I can draw lines in +ve and in -ve direction that pass through all the error bars. Am I still justified in reporting a significant +ve correlation?
Yes, you are. You are analyzing the correlation on the aggregated data. Whatever your result is, it is valid, and it refers to the aggregated data (i.e. to the averages). This is different from the correlation that might exist between the individual data points. Since you don't have the individual data, you cannot infer anything about the individual-level correlation.
You may have a look at http://en.wikipedia.org/wiki/Ecological_fallacy
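To make the distinction concrete, here is a small constructed example in Python (the numbers are invented purely for illustration and have nothing to do with your data). The group averages are perfectly positively correlated, while within every group the individual values are perfectly negatively correlated.

```python
import numpy as np

# Four hypothetical groups with five individuals each.
# Within a group, x goes up exactly as y goes down.
offsets = np.array([-0.4, -0.2, 0.0, 0.2, 0.4])
groups = np.arange(4)
x = groups[:, None] + offsets    # shape (4, 5): one row per group
y = groups[:, None] - offsets

# Correlation of the aggregated data (group means)
r_means = np.corrcoef(x.mean(axis=1), y.mean(axis=1))[0, 1]

# Correlation of the individual data within each group
r_within = [np.corrcoef(x[g], y[g])[0, 1] for g in groups]

# Correlation of all individual data pooled together
r_pooled = np.corrcoef(x.ravel(), y.ravel())[0, 1]

print(r_means, r_within, r_pooled)
```

The correlation of the means says nothing about the sign or size of the within-group correlations, which is why a result obtained on averages should be reported as a result about averages.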
I attached a picture showing a simple (made-up) example:
Two groups with some normally distributed data (n=10 per group, gray points). Square black dots are the group mean values (error bars: 95% CIs). The thick black line shows the regression line; the p-value (which is the same for the difference between means, the regression slope and the correlation coefficient) is 0.028. The curved dotted lines indicate the 95% confidence band for the regression line.
The red dotted lines show the extreme lines connecting the ends of the error bars. One of them has a -ve slope, and still the +ve slope (correlation) is significant at the 5% level (which is known here because the underlying data are known).
This is just to give an example that lines drawn through the CIs are not indicative of the correlation.
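If you want to play with this yourself, a rough simulation in Python along the same lines could look like this. It does not reproduce the picture: the seed, the group means and the standard deviation are my own arbitrary choices, so the p-value will differ from 0.028. It assumes NumPy and SciPy are available.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # arbitrary seed
n = 10
a = rng.normal(0.0, 1.0, n)      # group at x = 0 (assumed mean and SD)
b = rng.normal(1.0, 1.0, n)      # group at x = 1 (assumed mean and SD)

x = np.repeat([0.0, 1.0], n)
y = np.concatenate([a, b])

# With two groups, these three tests give the same two-sided p-value.
_, p_ttest = stats.ttest_ind(a, b)   # pooled-variance t-test
reg = stats.linregress(x, y)         # test of the regression slope
_, p_corr = stats.pearsonr(x, y)     # test of the correlation

# 95% CIs of the two group means
tcrit = stats.t.ppf(0.975, df=n - 1)
lo_a, hi_a = a.mean() - tcrit * stats.sem(a), a.mean() + tcrit * stats.sem(a)
lo_b, hi_b = b.mean() - tcrit * stats.sem(b), b.mean() + tcrit * stats.sem(b)

# Slopes of the two extreme lines through the ends of the error bars
steepest = hi_b - lo_a
flattest = lo_b - hi_a

print("p-values:", p_ttest, reg.pvalue, p_corr)
print("fitted slope:", reg.slope)
print("extreme slopes:", flattest, steepest)
```

Whether the flattest extreme line comes out with a -ve slope depends on the random draw, but the significance test is based on the underlying data and not on what lines can be squeezed through the error bars.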