When I run the Sargan test to check whether my instruments are correlated with the error term, I get a statistically significant p-value with 3 instruments but a statistically insignificant p-value with more than 6 instruments. How do we decide how many instruments to use?
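For reference, here is a minimal sketch of the statistic I mean, assuming a single linear equation estimated by 2SLS with homoskedastic errors. The function name `sargan_statistic` and the simulated data are hypothetical and only for illustration; they are not my actual model. The statistic is n times the uncentered R² from regressing the 2SLS residuals on all instruments, and its degrees of freedom equal the number of instruments minus the number of regressors, so the reference distribution changes as instruments are added.

```python
import numpy as np


def sargan_statistic(y, X, Z):
    """Sargan overidentification statistic and its degrees of freedom.

    y : (n,) outcome
    X : (n, k) regressors, including the constant and the endogenous variable(s)
    Z : (n, m) instruments, including the constant and exogenous regressors, m > k
    """
    n = y.shape[0]
    # First stage: project the regressors onto the instrument space.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: 2SLS coefficients from regressing y on the fitted regressors.
    beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    # 2SLS residuals are computed with the original regressors, not the fitted ones.
    u = y - X @ beta
    # Regress the residuals on the instruments; statistic is n * uncentered R^2.
    u_fit = Z @ np.linalg.lstsq(Z, u, rcond=None)[0]
    stat = n * (u_fit @ u_fit) / (u @ u)
    # Degrees of freedom: number of overidentifying restrictions.
    df = Z.shape[1] - X.shape[1]
    return stat, df


# Simulated example (hypothetical data): one endogenous regressor, three valid instruments.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 3))
e = rng.normal(size=n)                                  # structural error
x = z.sum(axis=1) + 0.5 * e + rng.normal(size=n)        # endogenous: correlated with e
y = 1.0 + 2.0 * x + e

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])
stat, df = sargan_statistic(y, X, Z)
print(stat, df)
```

Under the null that all instruments are valid, the statistic is compared against a chi-squared distribution with `df` degrees of freedom, so the p-value depends on both the statistic and the number of overidentifying restrictions.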