Some time ago I published a paper called "Identification of tuberculosis bacteria based on shape and color". In that work, the number of descriptors was not very high. I reduced the set of descriptors by plotting each characteristic as a Gaussian and keeping only the descriptors whose Gaussians did not overlap. I have used the method successfully in other projects, and my students also use it regularly because it works quite well for them.
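To make the idea concrete, here is a minimal sketch of what such an overlap filter could look like. It is not the code from the paper: it assumes two classes, one Gaussian fitted per class and per descriptor, and it treats two Gaussians as "not overlapping" when their mean ± k·std intervals are disjoint. The function name `non_overlapping_features`, the threshold `k`, and the toy data are all hypothetical.

```python
from statistics import mean, stdev


def non_overlapping_features(class_a, class_b, k=2.0):
    """Return indices of descriptors whose per-class Gaussians are separated.

    class_a, class_b: lists of samples, each sample a list of descriptor values.
    A descriptor is kept when the intervals mean +/- k*std of the two classes
    are disjoint. This criterion is an assumption, not taken from the paper.
    """
    kept = []
    n_features = len(class_a[0])
    for j in range(n_features):
        col_a = [row[j] for row in class_a]
        col_b = [row[j] for row in class_b]
        mu_a, sd_a = mean(col_a), stdev(col_a)
        mu_b, sd_b = mean(col_b), stdev(col_b)
        # Two intervals [mu - k*sd, mu + k*sd] are disjoint exactly when
        # the distance between the means exceeds k * (sd_a + sd_b).
        if abs(mu_a - mu_b) > k * (sd_a + sd_b):
            kept.append(j)
    return kept


# Toy data: descriptor 0 separates the classes, descriptor 1 does not.
bacilli = [[1.0, 5.0], [1.2, 6.0], [0.9, 4.0], [1.1, 5.5]]
debris = [[3.0, 5.2], [3.1, 4.5], [2.9, 6.1], [3.2, 5.0]]
selected = non_overlapping_features(bacilli, debris)
print(selected)  # [0]
```

A larger `k` makes the filter stricter. With more than two classes, one option would be to apply the same test to every pair of classes and keep a descriptor only if all pairs pass.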
For dimensionality reduction you can also use PCA/SVD or LDA; see https://stats.stackexchange.com/questions/35185/dimensionality-reduction-svd-or-pca-on-a-large-sparse-matrix?rq=1
PCA is unsupervised: it ignores the class labels and computes the directions of maximum variance for the dataset as a whole (the principal components). LDA is supervised: it uses the class labels to find the directions that best separate the classes. This is done by maximizing the distance between the class means while minimizing the spread within each class.
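The difference is easiest to see in the two objectives. The notation here is my own and is only meant as a sketch: $w$ is a projection direction, $\Sigma$ the covariance of the whole dataset, $\mu$ the overall mean, $\mu_c$ and $n_c$ the mean and size of class $c$, and $S_B$, $S_W$ the between-class and within-class scatter matrices.

$$
\text{PCA:}\quad w^{*} = \arg\max_{\lVert w \rVert = 1} \; w^{\top} \Sigma \, w
$$

$$
\text{LDA:}\quad w^{*} = \arg\max_{w} \; \frac{w^{\top} S_B \, w}{w^{\top} S_W \, w},
\qquad
S_B = \sum_{c} n_c \, (\mu_c - \mu)(\mu_c - \mu)^{\top},
\qquad
S_W = \sum_{c} \sum_{i \in c} (x_i - \mu_c)(x_i - \mu_c)^{\top}
$$

The class labels appear only in the LDA objective, through $\mu_c$: the numerator grows when the projected class means move apart and the denominator shrinks when each projected class becomes tighter.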