I need to increase the input resolution of a "simple" LeNet CNN from 32x32 to 640x480. The network must perform object classification, so its output has to be probability values (scalar quantities). I could apply much more aggressive spatial subsampling by increasing the stride in the pooling layers, but I don't know how much that choice would affect classification accuracy. Would it be better to add layers instead, making the CNN deeper?
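To make the trade-off concrete, here is a rough sketch of the feature-map arithmetic. It assumes the classic LeNet-5 layout (two unpadded 5x5 convolutions, each followed by a pooling layer whose window equals its stride, and 16 channels after the second convolution). The helper names `out_size`, `feature_map_shape`, and `lenet_layers` are hypothetical and only serve the illustration; the actual network may differ.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial size after a conv or pooling layer (floor mode, no dilation)."""
    return (n + 2 * padding - kernel) // stride + 1


def feature_map_shape(height, width, layers):
    """Apply a list of (kernel, stride) layers to a height x width input."""
    for kernel, stride in layers:
        height = out_size(height, kernel, stride)
        width = out_size(width, kernel, stride)
    return height, width


def lenet_layers(pool):
    # Assumed LeNet-style stack: conv 5x5 -> pool -> conv 5x5 -> pool,
    # with the pooling window equal to the pooling stride.
    return [(5, 1), (pool, pool), (5, 1), (pool, pool)]


CHANNELS = 16  # assumption: channels after the second conv, as in LeNet-5

for label, (h, w), pool in [
    ("32x32 input, pool 2", (32, 32), 2),
    ("480x640 input, pool 2", (480, 640), 2),
    ("480x640 input, pool 8", (480, 640), 8),
]:
    fh, fw = feature_map_shape(h, w, lenet_layers(pool))
    print(f"{label}: {fh}x{fw} map, {CHANNELS * fh * fw} flattened features")
```

Under these assumptions, the original 32x32 input ends in a 5x5 map (400 features). Keeping stride-2 pooling at 640x480 leaves a 117x157 map (293,904 features) feeding the fully connected layers, while stride-8 pooling brings it down to 6x9 (864 features). The sketch only shows the size of the problem; it says nothing about how accuracy responds to either choice.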