Anna-Maria - I think there is no general answer to this, but if you have a large number of channels (say, >32), it should not hurt to exclude several components. I have run many analyses where I had to exclude around 25 out of 60 components. More important than the absolute number of components you exclude is that the way you handle your components does not introduce a systematic bias. For example, if you want to compare a patient sample to healthy controls and you exclude twice as many components for the patients because their data quality is lower, you may bias your results. You can check for this bias, e.g., by comparing the number of excluded components, or the total power of the backprojection of the excluded components, between conditions or groups.
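As a rough illustration of the check described above, here is a minimal sketch that compares the number of excluded components between two groups with a simple permutation test. The per-subject counts, the group names, and the function `permutation_test` are all made up for the example, and any other two-sample test would serve equally well.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign subjects to the two groups
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # +1 correction so the estimated p-value is never exactly zero
    return (hits + 1) / (n_permutations + 1)


# Hypothetical number of excluded components per subject (not real data)
patients = [25, 28, 22, 30, 27, 26]
controls = [14, 12, 15, 13, 16, 11]

p_value = permutation_test(patients, controls)
print(f"p-value for group difference in excluded components: {p_value:.4f}")
```

A small p-value here would suggest that the groups differ systematically in how many components were removed, which is the situation to be wary of. The same function could be applied to the total power of the backprojected excluded components instead of the counts.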
Like Marius, I think there's no real solid answer to this question, and it depends somewhat on your data. However, what I find when I'm doing ICA denoising is that you always get three classes of components when you classify them: 1. definitely noise/artefact, 2. definitely signal (resting-state networks, task signal, etc.), and 3. components that you're unsure about. What's important is to exclude only the components that are definitely noise/artefacts; if you're not sure whether a particular component is noise or good signal, the safest thing to do is leave it in the data. I'd say excluding 20-40% of your components would be fairly typical.
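To make the rule concrete, here is a small sketch of the conservative selection: only components labelled as definite noise are excluded, and anything marked as unsure stays in the data. The labels, the component indices, and the helper `components_to_exclude` are hypothetical and not tied to any particular toolbox.

```python
# Hypothetical manual classification of 10 components (not real data)
labels = {
    0: "noise", 1: "signal", 2: "unsure", 3: "noise", 4: "signal",
    5: "signal", 6: "unsure", 7: "noise", 8: "signal", 9: "signal",
}


def components_to_exclude(labels):
    """Return indices of components labelled as definite noise only."""
    return sorted(idx for idx, label in labels.items() if label == "noise")


excluded = components_to_exclude(labels)
fraction = len(excluded) / len(labels)

# "unsure" components are deliberately kept in the data
print(f"Excluding components {excluded} ({fraction:.0%} of all components)")
```

The resulting index list is what you would then pass to whatever exclusion or backprojection step your ICA software provides.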