If you are looking for an official document, NATO STANAG 3769 may help you. It says: "The aim of this agreement is to standardize the definition of terms, and to specify the minimum resolved object size requirements in imagery used for interpretation and technical analysis." However, it is not possible to say that the requirements it gives are sufficient for computer-aided (automated) solutions.
The problem also depends on the sensor type used (hyperspectral, thermal, SAR, electro-optical, etc.). If you are dealing with hyperspectral data, it is possible to detect sub-pixel objects (a short sketch of this follows below). If you are dealing with EO data (like Google Maps imagery), you have to consider:
What is the invariant for that object: its spectral value, shape, or texture?
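On the hyperspectral point: a common way to find sub-pixel targets is a spectral matched filter, which scores each pixel by how strongly a known target spectrum is present in it. The sketch below is only illustrative; the 10-band scene, the target signature, and the 30%/70% mixing fraction are all made-up values, not from any real dataset.

```python
# Hedged sketch of sub-pixel target detection on hyperspectral data using a
# spectral matched filter; scene and target signature are synthetic.
import numpy as np

def matched_filter(scene: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Score each pixel by how strongly the target spectrum is present,
    even when the target fills only a fraction of the pixel."""
    h, w, bands = scene.shape
    x = scene.reshape(-1, bands).astype(float)
    mu = x.mean(axis=0)                                  # background mean spectrum
    cov_inv = np.linalg.pinv(np.cov(x, rowvar=False))    # inverse background covariance
    d = target - mu
    scores = (x - mu) @ cov_inv @ d / (d @ cov_inv @ d)  # estimated target abundance
    return scores.reshape(h, w)

# Synthetic 10-band scene; one pixel is a 30%/70% mix of target and background
rng = np.random.default_rng(0)
scene = rng.normal(0.2, 0.02, (40, 40, 10))
target_sig = np.linspace(0.6, 0.9, 10)                   # assumed target spectrum
scene[10, 10] = 0.3 * target_sig + 0.7 * scene[10, 10]   # sub-pixel mixture
scores = matched_filter(scene, target_sig)
print(np.unravel_index(scores.argmax(), scores.shape))   # -> (10, 10)
```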
As Caglar Senaras rightly says, the answer will be modulated by the type of remote sensing tool you are using. And why "three factors"? Why not two or four? Limiting ourselves to the three most important factors, and keeping them valid for all types of remote sensing imagery, they would be:
(1) size: the object has to be large enough relative to the sensor's resolution;
(2) reflectivity: the object has to be different enough from its surroundings (more reflective or less reflective);
(3) contrast with the local background.
Looking at sonar (my own field) or radar (similar wave scattering), size-based detectability is governed by the sensor's resolution at the range/angle considered, compared with the object's dimensions (across-track and along-track if the sensor is moving, for example towed by a ship or mounted on a satellite or plane). The same is true of satellite imagery; a quick back-of-the-envelope check is sketched below.
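As a rough illustration of the size/resolution factor, the snippet below checks whether an object spans enough resolution cells (pixels) to be considered resolvable. The 3-pixel rule of thumb and the example numbers are assumptions made for this sketch, not values from STANAG 3769 or any sensor specification.

```python
# Rough check of whether an object is resolvable, assuming a simple
# "object must span at least N resolution cells" rule of thumb.
# All numbers are illustrative.

def pixels_spanned(object_size_m: float, gsd_m: float) -> float:
    """Number of resolution cells (pixels) the object spans in one dimension."""
    return object_size_m / gsd_m

def is_resolvable(object_size_m: float, gsd_m: float, min_pixels: float = 3.0) -> bool:
    """True if the object covers at least `min_pixels` cells (assumed threshold)."""
    return pixels_spanned(object_size_m, gsd_m) >= min_pixels

# Example: a 4 m boat imaged at 1.5 m ground sample distance
print(pixels_spanned(4.0, 1.5))   # ~2.7 pixels
print(is_resolvable(4.0, 1.5))    # False under the assumed 3-pixel threshold
```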
Targets also need to be detectable by the sensor, i.e. have a reflectivity that makes them stand out from the background. Taking sonar as an example, water-saturated sediments on top of a hard seabed will not be detectable at certain frequencies, as their acoustic reflectivity is close to that of water. With radar, very dry material like sand can be transparent to microwave radiation (this is, for example, how SIR-A detected fossil rivers below the Sahara). An optical analogue would be clouds in the atmosphere: if they are thin enough, they will not be visible at certain wavelengths, or may just add fuzziness or slightly decrease local intensities.
Objects even 1 pixel in size can be detected if they are "different" enough from their background, i.e. if their reflectivity is much lower (shadow or absorbent) or much higher (strong or rough reflector). This contrast can be enhanced by looking at several frequencies or bands (for example, in optical imagery one would immediately spot a red dot in a green field, even if it is really small); a minimal example of this idea is sketched below.
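To make the "red dot in a green field" example concrete, the sketch below scores each pixel of a multiband image by its spectral distance from the global background statistics (a simplified RX-style anomaly detector). The synthetic scene, noise level, and band values are invented purely for illustration.

```python
# Minimal sketch of detecting a single "different" pixel by its spectral
# contrast with the background; the data here are synthetic.
import numpy as np

def anomaly_score(image: np.ndarray) -> np.ndarray:
    """Mahalanobis-style distance of each pixel from the global background
    statistics (a simplified RX anomaly detector)."""
    h, w, bands = image.shape
    pixels = image.reshape(-1, bands).astype(float)
    mean = pixels.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(pixels, rowvar=False))
    diff = pixels - mean
    scores = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)  # squared distances
    return scores.reshape(h, w)

# Synthetic "green field" with one red pixel (bands: R, G, B)
rng = np.random.default_rng(0)
field = np.tile(np.array([30.0, 120.0, 40.0]), (50, 50, 1))
field += rng.normal(0, 2.0, field.shape)                  # background clutter
field[25, 25] = [200.0, 30.0, 30.0]                       # the red dot
scores = anomaly_score(field)
print(np.unravel_index(scores.argmax(), scores.shape))    # -> (25, 25)
```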
I hope this answer helps too. Good luck with your work!