First, remember that there are no absolute advantages or disadvantages when it comes to exploratory data analysis; define the goals and constraints of the analysis first, and only then does it make sense to discuss the (dis)advantages of various tools and techniques.
That said, in no particular order (and with no attempt at exhaustiveness):
- does not build a generative model for the data
- relies on a predefined distance in feature space (a problem shared by most clustering algorithms, to be fair)
- magnification factors are not well understood (at least to the best of my knowledge)
- the topological ordering property, proven in 1D, does not extend to 2D
- slow training, and hard to train against slowly evolving data
- not so intuitive: neurons that are close on the map (topological proximity) may be far apart in feature space (see the U-matrix sketch after this list)
- does not behave gracefully with categorical data, and even worse with mixed data
- no generally accepted rule of thumb for the various parameters (map size, neighbourhood function, time evolution of the learning rate, ...)
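To make the "close on the map, far apart in feature space" caveat concrete, here is a minimal NumPy sketch of a U-matrix computation on a rectangular map; the function name `u_matrix` and the random `codebook` stand-in are purely illustrative (in practice you would pass the weights of a trained map). Large values flag map-adjacent units whose codebook vectors are actually distant in feature space.

```python
import numpy as np

def u_matrix(weights):
    """Mean feature-space distance from each unit to its grid neighbours.

    `weights` is an (H, W, d) array of codebook vectors on a rectangular map.
    Large values mark map-adjacent units that are far apart in feature space.
    """
    H, W, _ = weights.shape
    umat = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            dists = []
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < H and 0 <= nj < W:
                    dists.append(np.linalg.norm(weights[i, j] - weights[ni, nj]))
            umat[i, j] = np.mean(dists)
    return umat

# Stand-in for a trained 12x12 codebook with 5 features (random here, purely illustrative).
rng = np.random.default_rng(0)
codebook = rng.normal(size=(12, 12, 5))
print(u_matrix(codebook).round(2))
```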
Needless to say, some of those disadvantages have been addressed by a gazillion variants of the SOM. Still, the vanilla 2D SOM (12x12 map, hexagonal neighbourhoods, box neighbourhood function, fixed small learning rate, no shrinking of the neighbourhood below the first neighbours but keeping training for a while at that point, and so on) remains my first choice for exploratory data analysis of numerical data; a minimal sketch of such a configuration follows.
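For what it's worth, here is a NumPy sketch of roughly that vanilla setup, under some simplifying assumptions of my own: a rectangular rather than hexagonal grid (hexagonal indexing would roughly double the code), 12x12 map, box ("bubble") neighbourhood, fixed small learning rate, and a radius that stops shrinking at the first neighbours while training continues. `train_som` and every numeric default not named above are illustrative, not taken from any library.

```python
import numpy as np

def train_som(data, epochs=20, map_h=12, map_w=12, lr=0.05, r0=6.0, rng=None):
    """Online SOM training with a box (bubble) neighbourhood and a fixed learning rate.

    The neighbourhood radius decays linearly from r0 down to 1 (the first
    neighbours) over the first half of training and is then held at 1 while
    training continues.  Rectangular grid for brevity.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = data.shape
    codebook = rng.normal(size=(map_h, map_w, d))
    # Grid coordinates of every unit, used for the box-neighbourhood test.
    grid = np.stack(np.meshgrid(np.arange(map_h), np.arange(map_w), indexing="ij"), axis=-1)

    total_steps = epochs * n
    for step in range(total_steps):
        x = data[rng.integers(n)]
        # Best-matching unit: smallest Euclidean distance in feature space.
        dists = np.linalg.norm(codebook - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Radius: linear decay down to 1, then frozen at 1 for the rest of training.
        frac = min(1.0, 2.0 * step / total_steps)
        radius = max(1.0, r0 * (1.0 - frac) + 1.0 * frac)
        # Box neighbourhood: every unit within `radius` grid steps of the BMU
        # moves toward x by the same fixed learning rate.
        in_box = np.max(np.abs(grid - np.array(bmu)), axis=-1) <= radius
        codebook[in_box] += lr * (x - codebook[in_box])
    return codebook

# Toy usage on random data; real use would feed standardized numerical features.
rng = np.random.default_rng(42)
codebook = train_som(rng.normal(size=(500, 5)), rng=rng)
print(codebook.shape)  # (12, 12, 5)
```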
"Reasons why the Som is very popular might be because it has an easy to under-stand algorithm, it is simple to use, and it produces good and intuitive results.Furthermore being a neural network and in turn in some way a model for the hu-man brain makes it attractive. And last but not least theSombrought lots of visualization into an otherwise quite plain and number oriented process of miningdata with a very efficient algorithm with can be run on all common computers. Butit is necessary to realize the limitations of the Som algorithm. Some of them have been addressed by the Gtm algorithm and some simply cannot be solved because they are inherent to the model of mapping datapoints from a high dimensionaldata-space onto a 2-dimensional map while preserving local distances. This in turncan be overcome when the use ofSomorGtmis combined with other data-miningmethods such as pure clustering or multi-dimensional scaling techniques. " from https://pdfs.semanticscholar.org/c93a/e9ffeda90c9ea4cd951989a00a0afde8845b.pdf
- Once the learning process is over, if the input distribution shifts, the map will start misclassifying new input data because of its static nature.
- Because the initial neuron weights are chosen randomly, two independent experimenters can produce completely different maps from the same data.
- The KSOM will not necessarily group results in quite the way the user expects; sometimes two obviously similar groups end up on opposite sides of the map.
- If the input distribution has hotspots, i.e. is not flat, the KSOM algorithm does not work well: instead of creating one large group of many data items, it tends to create several smaller groups, ignoring the high correlations between them.
- The number of neurons in the map is fixed and decided in advance, so if too few neurons are chosen, the grid points will be very widely spaced over the input data set, resulting in a poor model of the distribution.
- The number of parameters, the values of the training parameters, and the size and topology of the map all have to be determined in advance. There are settings that give good results, but finding them is time-consuming (see the parameter-sweep sketch below).
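Regarding that last point, the usual (brute-force) workaround is a small parameter sweep. Here is a sketch, assuming the third-party minisom package (pip install minisom) and its MiniSom / train_random / quantization_error API; the parameter grids, iteration count, and toy data are arbitrary illustrations. Note that quantization error alone tends to favour larger maps, so in practice it is combined with topographic error or plain visual inspection.

```python
import itertools
import numpy as np
from minisom import MiniSom  # assumed third-party package: pip install minisom

# Toy standardized data; in practice this would be your numerical feature matrix.
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 5))

# A deliberately small grid; real searches over map size, sigma, learning rate,
# neighbourhood function, etc. blow up quickly, which is exactly the point above.
map_sizes = [8, 12]
sigmas = [1.0, 2.0]
learning_rates = [0.1, 0.5]

best = None
for m, sigma, lr in itertools.product(map_sizes, sigmas, learning_rates):
    som = MiniSom(m, m, data.shape[1], sigma=sigma, learning_rate=lr, random_seed=0)
    som.train_random(data, 5000)
    qe = som.quantization_error(data)  # mean distance of samples to their BMU
    print(f"map {m}x{m}  sigma={sigma}  lr={lr}  quantization error={qe:.3f}")
    if best is None or qe < best[0]:
        best = (qe, m, sigma, lr)

print("best setting by quantization error:", best[1:])
```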