The Johnson transformation cannot "normalise" multimodal data, although it may "improve" certain properties, depending on why you are trying to normalise the data. Similarly, it is not immediately suitable for discrete-valued data unless you are willing and able to deal with moving from a regular to an irregular grid of possible values, which might mean simply treating the data as if they were continuous-valued (which may be reasonable in some cases).
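To make the point about discrete data concrete, here is a minimal sketch of what happens to an evenly spaced grid of values under a Johnson SU-type transform, z = gamma + delta * asinh((x - xi) / lambda). The parameter values and the example counts are made up for illustration and are not fitted to any data; in practice they would come from whatever fitting routine you use.

```python
import math

def johnson_su(x, gamma=0.0, delta=1.0, xi=0.0, lam=2.0):
    """Johnson SU-type transform: z = gamma + delta * asinh((x - xi) / lam).

    The default parameters are arbitrary illustrative values, not fitted ones.
    """
    return gamma + delta * math.asinh((x - xi) / lam)

# Discrete values on a regular grid (spacing of exactly 1).
counts = list(range(0, 6))

# Transformed values and the spacing between neighbouring grid points.
z = [johnson_su(x) for x in counts]
gaps = [b - a for a, b in zip(z, z[1:])]

for x, zx in zip(counts, z):
    print(f"{x} -> {zx:.3f}")
print("gaps between neighbours:", [round(g, 3) for g in gaps])
```

The transform is monotone, so the ordering of the values is preserved, but the gaps between neighbouring values are no longer equal. The transformed variable is still discrete; it just lives on an irregular grid.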
For a very interesting discussion, see the link below. Broadly, I agree with the thrust of their argument: not all data are suitable for normalisation, and even when they are, normalising may hide important trends. It is also often possible to use non-parametric methods, which do not rely on assuming a particular distribution for the data.
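As one illustration of the non-parametric route, here is a rough sketch of a two-sample permutation test on the difference in medians, using only the Python standard library. The function name `permutation_test_median`, the sample sizes, and the simulated skewed samples are all assumptions made up for this example; it is one possible approach, not the only or necessarily the best one for your data.

```python
import random
import statistics

def permutation_test_median(a, b, n_perm=2000, seed=0):
    """Approximate two-sided permutation p-value for a difference in medians.

    No distributional form is assumed; the only assumption is that the
    observations are exchangeable between groups under the null hypothesis.
    """
    rng = random.Random(seed)
    observed = abs(statistics.median(a) - statistics.median(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = abs(statistics.median(pooled[:n_a]) - statistics.median(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Made-up, strongly skewed samples; group_b is shifted upwards by 1.5.
data_rng = random.Random(1)
group_a = [data_rng.expovariate(1.0) for _ in range(40)]
group_b = [data_rng.expovariate(1.0) + 1.5 for _ in range(40)]

p_shifted = permutation_test_median(group_a, group_b)
print("p-value for shifted groups:", p_shifted)
```

The data are never transformed here, so the skewness stays visible and the comparison is made on the original scale.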