I make this decision case by case, based on visual inspection. That said, I quite often exclude as many as 25-30 out of 64 ICs, especially with patient data, which tends to be very artifact-prone. I would also recommend visually inspecting the raw data and excluding the noisiest segments prior to ICA (I assume you do this anyway); this will reduce the number of ICs that you have to exclude.
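To make the "exclude the noisiest segments before ICA" step concrete, here is a minimal sketch of a first-pass screen that flags fixed-length segments by peak-to-peak amplitude. It is not a replacement for visual inspection, only a way to point you at candidate segments. The function name, the segment length and the 150 µV limit are assumptions for illustration, not recommended values.

```python
def flag_noisy_segments(data, seg_len, ptp_limit):
    """Return indices of segments in which any channel exceeds ptp_limit.

    data      : list of channels, each a list of samples of equal length
    seg_len   : segment length in samples (hypothetical choice)
    ptp_limit : peak-to-peak limit in the data's units (hypothetical choice)
    """
    n_samples = len(data[0])
    flagged = []
    # Walk over complete, non-overlapping segments only.
    for seg, start in enumerate(range(0, n_samples - seg_len + 1, seg_len)):
        stop = start + seg_len
        # A segment is flagged if at least one channel is too large.
        if any(max(ch[start:stop]) - min(ch[start:stop]) > ptp_limit for ch in data):
            flagged.append(seg)
    return flagged


# Toy data: 2 channels, 6 samples, one large deflection in channel 0.
toy = [
    [1.0, 2.0, 1.0, 200.0, 2.0, 1.0],
    [0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
]
print(flag_noisy_segments(toy, seg_len=2, ptp_limit=150.0))  # [1]
```

The flagged segments would then be checked by eye and, if they really are non-stereotypical noise, left out of the data passed to ICA.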
I don't think it is possible to give general recommendations for how many components to reject. This is mainly due to differing levels of noise: if noise is very low and stereotypical, a few components might capture it sufficiently. If noise levels are high and/or artifactual sources are diverse, most components will likely include some noise.
Reducing noise before ICA is most crucial. As Marius already mentioned, data should be carefully inspected in order to remove non-stereotypical and large-amplitude noise. If the gamma band is of interest, separating the data into a low- and a high-frequency band (split at, e.g., 30 Hz) can also be helpful.
In my opinion, however, one decision has to be made a priori: how conservatively or liberally do I want to handle rejection? The conservative perspective would be to reject components only if there is a very clear signature of noise (e.g. heartbeat, eye movement etc.). After all, every component represents a mixture of signal and noise. The liberal perspective would be to keep components only if noise is apparently absent. While both approaches are legitimate, it is very important to be consistent across participants.
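One way to stay consistent across participants is to write the rule down once and apply it unchanged to everybody. The sketch below assumes you have some per-component artifact score between 0 and 1 (from your own rating or from an automatic classifier). The scores, the function name and the two cut-offs are purely hypothetical.

```python
# Hypothetical cut-offs, fixed once before looking at any participant.
CONSERVATIVE_REJECT_AT = 0.9  # reject only with a very clear noise signature
LIBERAL_KEEP_BELOW = 0.1      # keep only if noise is apparently absent


def components_to_reject(artifact_scores, policy):
    """Return indices of components to reject under the chosen policy."""
    if policy == "conservative":
        return [i for i, s in enumerate(artifact_scores) if s >= CONSERVATIVE_REJECT_AT]
    if policy == "liberal":
        return [i for i, s in enumerate(artifact_scores) if s > LIBERAL_KEEP_BELOW]
    raise ValueError("policy must be 'conservative' or 'liberal'")


# Made-up scores for five components of one participant.
scores = [0.95, 0.50, 0.05, 0.92, 0.30]
print(components_to_reject(scores, "conservative"))  # [0, 3]
print(components_to_reject(scores, "liberal"))       # [0, 1, 3, 4]
```

The point is not the particular numbers but that the policy is chosen a priori and is the same for every data set.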
In general, eliminating 25 out of 64 components seems unreasonable. According to Cohen's opinion, if you are not sure whether a component is artifact or EEG, you should not remove it. For details, see http://mikexcohen.com/lectures.html (Independent components analysis for removing artifacts). Hope this helps!
Generally, removing 40% of your components seems like too much. Are you sure you are not removing non-artifactual activity? I would therefore stick with the conservative approach Jonas Misselhorn mentioned.
If your data are quite noisy (especially over frontal sites) and you're not interested in higher frequencies, you could try setting a low-pass filter below the utility (mains) frequency. I would also suggest going through this tutorial: https://labeling.ucsd.edu. It should be helpful.
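To illustrate the low-pass suggestion, here is a rough, library-free sketch that designs a windowed-sinc (Hamming) FIR low-pass and checks how much it attenuates the line frequency. In practice you would use your toolbox's own filter function. The 250 Hz sampling rate, 40 Hz cut-off, 50 Hz mains frequency (60 Hz in some countries) and filter length are all assumptions for the example.

```python
import cmath
import math


def lowpass_fir(cutoff_hz, sfreq, numtaps=101):
    """Windowed-sinc (Hamming) low-pass FIR taps, scaled to unit gain at 0 Hz."""
    fc = cutoff_hz / sfreq           # cut-off in cycles per sample
    mid = (numtaps - 1) / 2          # centre tap (numtaps should be odd)
    taps = []
    for n in range(numtaps):
        k = n - mid
        # Ideal low-pass impulse response, with the k == 0 limit handled explicitly.
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (numtaps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def gain_at(taps, freq_hz, sfreq):
    """Magnitude of the filter's frequency response at freq_hz."""
    w = 2 * math.pi * freq_hz / sfreq
    return abs(sum(t * cmath.exp(-1j * w * n) for n, t in enumerate(taps)))


SFREQ = 250.0   # assumed sampling rate in Hz
CUTOFF = 40.0   # assumed cut-off, below the 50 Hz line frequency
taps = lowpass_fir(CUTOFF, SFREQ)
print("gain at 10 Hz:", gain_at(taps, 10.0, SFREQ))
print("gain at 50 Hz:", gain_at(taps, 50.0, SFREQ))
```

With these settings the 10 Hz gain should stay close to 1 while the 50 Hz gain is strongly attenuated. Keep in mind that such a filter also removes any gamma-band activity, so it only makes sense if higher frequencies really are of no interest.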