Which clustering algorithm is appropriate depends on your data. That said, have a look at Deep Embedded Clustering (DEC), which is one of the most promising approaches for this problem.
In addition to Saeed's post, note that Deep Embedding comes in two versions:
an unsupervised one (DEC): Xie et al., " Unsupervised Deep Embedding for Clustering Analysis ", 2016 - https://arxiv.org/pdf/1511.06335.pdf
a supervised one (DEK): Le et al., " Deep Embedding Kernel ", 2018 - https://arxiv.org/pdf/1804.05806.pdf
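To make the unsupervised variant more concrete, here is a minimal NumPy sketch of the clustering step described in Xie et al.: soft cluster assignments from a Student's t kernel, a sharpened target distribution, and the KL divergence between the two. It assumes the embeddings already exist (in DEC they come from a pretrained autoencoder, which is omitted here); the function names, the toy data and the fixed centroids are made up for illustration, and the gradient updates of the encoder and centroids are left out.

```python
import numpy as np

def soft_assign(z, centroids, alpha=1.0):
    """Soft assignment q_ij of embedding i to centroid j (Student's t kernel)."""
    # Squared Euclidean distance between every embedding and every centroid.
    sq_dist = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0)
    # Normalise so that each row is a probability distribution over clusters.
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened target p_ij: square q and divide by the soft cluster size."""
    weight = q ** 2 / q.sum(axis=0)
    return weight / weight.sum(axis=1, keepdims=True)

def kl_divergence(p, q):
    """KL(P || Q), the quantity minimised during the clustering phase."""
    return float(np.sum(p * np.log(p / q)))

# Toy embeddings (assumption: two well-separated blobs stand in for encoder output).
rng = np.random.default_rng(0)
z = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(3.0, 0.3, (50, 2))])
centroids = np.array([[0.0, 0.0], [3.0, 3.0]])  # e.g. initialised with k-means

q = soft_assign(z, centroids)
p = target_distribution(q)
labels = q.argmax(axis=1)       # hard cluster label per sample
loss = kl_divergence(p, q)      # would be back-propagated in a full DEC setup
```

In a full implementation the loss would be minimised with respect to both the encoder weights and the centroids, with the target distribution refreshed periodically.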
More generally, Deep Kernel Processes match your needs. I suggest that you read:
Lu et al., " How to Scale Up Kernel Methods to Be As Good As Deep Neural Nets ", 2015 - https://arxiv.org/pdf/1411.4000.pdf
Anselmi et al., " Deep Convolutional Networks are Hierarchical Kernel Machines ", 2015 - https://pdfs.semanticscholar.org/7227/d4e427c89202e4368922bb7f0304ee82a582.pdf
Wilson et al., " Gaussian Process Kernels for Pattern Discovery and Extrapolation ", 2013 - https://arxiv.org/pdf/1302.4245.pdf
Wilson et al., " Thoughts on Massively Scalable Gaussian Processes ", 2015 - https://arxiv.org/pdf/1511.01870.pdf
Montavon et al., " Kernel Analysis of Deep Networks ", 2011 - http://jmlr.csail.mit.edu/papers/volume12/montavon11a/montavon11a.pdf
Wilson et al., " Deep Kernel Learning ", 2016 - http://proceedings.mlr.press/v51/wilson16.pdf
Song et al., " Optimizing Kernel Machines using Deep Learning ", 2017 - https://arxiv.org/pdf/1711.05374.pdf
Bengio et al., " Representation Learning: A Review and New Perspectives ", 2014 - https://arxiv.org/pdf/1206.5538.pdf
Your specificity criterion largely depends on your application context. Nevertheless, you may check out:
Kiran et al., " An overview of deep learning based methods for unsupervised and semi-supervised anomaly detection in videos ", 2018 - https://arxiv.org/pdf/1801.03149.pdf
Verdoja et al., " Graph Laplacian for Image Anomaly Detection ", 2018 - https://arxiv.org/pdf/1802.09843.pdf
Kwon et al., " Kernel RX-Algorithm: A Nonlinear Anomaly Detector for Hyperspectral Imagery ", 2005
Reece et al., " Anomaly Detection and Removal Using Non-Stationary Gaussian Processes ", 2015 - https://arxiv.org/pdf/1507.00566.pdf
Herlands et al., " Gaussian Process Subset Scanning for Anomalous Pattern Detection in Non-iid Data ", 2018 - https://arxiv.org/pdf/1804.01466.pdf
Please provide a general explanation of what you are trying to achieve.
For any abnormality detection (assuming the picture has not been altered), the model still needs to learn what is considered normal and what is not.
An unsupervised model would only sort the images into the groups it found. Belonging to a group does not mean an image is abnormal; it just means the images share trends that make them similar to each other.
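As a rough illustration of "learning what is normal", the sketch below fits a mean and covariance to feature vectors from images known to be normal, then scores new samples by their squared Mahalanobis distance. Everything here is an assumption for the sake of the example: the features are synthetic stand-ins for whatever embedding you extract from your images, the function names are made up, and the 99% threshold is arbitrary.

```python
import numpy as np

def fit_normal_model(features):
    """Estimate mean and inverse covariance from normal-only samples."""
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)
    cov += 1e-6 * np.eye(cov.shape[0])  # small ridge for numerical stability
    return mean, np.linalg.inv(cov)

def anomaly_score(x, mean, cov_inv):
    """Squared Mahalanobis distance of each row of x to the normal model."""
    diff = x - mean
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

rng = np.random.default_rng(0)

# Assumption: 8-dimensional features of images labelled as normal.
normal_train = rng.normal(0.0, 1.0, (500, 8))
mean, cov_inv = fit_normal_model(normal_train)

# Flag anything scoring above the 99th percentile of the training scores.
threshold = np.quantile(anomaly_score(normal_train, mean, cov_inv), 0.99)

normal_test = rng.normal(0.0, 1.0, (100, 8))    # unseen normal samples
shifted_test = rng.normal(4.0, 1.0, (100, 8))   # samples far from normal

flag_normal = anomaly_score(normal_test, mean, cov_inv) > threshold
flag_shifted = anomaly_score(shifted_test, mean, cov_inv) > threshold
```

The point is that the threshold is defined relative to examples of normal data; a clustering model alone gives group membership but no such reference.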
If you want to find altered pictures, then you need other tools.