"However, these studies may be unreliable because they are not based on statistically significant samples."
Mason and Hoeksema (2010) used nearly the entire history of SOHO/MDI, for a total of 1075 active regions and 71,000 magnetograms. I see that you also included white light, which would have expanded the data available for our study, but it's hard to argue that >1000 active regions is not a statistically significant sample. I agree that the majority of similar papers don't use statistically significant samples, and I haven't run any formal significance tests (p-values and the like) on our sample, but am I missing something?
Also, "Mason and Hoeksema (2010) found that the gradient-weighted neutral line length was a good parameter for predicting major flares." What we found was that GWILL was the best-correlated parameter, but we dedicated a large portion of the paper to showing that even it was not a good real-time predictor of flares.
Article Dynamics of Microwave Sources Associated with the Neutral Li...