I have recently become interested in source separation for musical signals. The paper I am studying (see below) uses nonnegative matrix factorization (NMF) to separate musical audio recordings based on the magnitude spectrogram, which is a nonnegative matrix of size MxN.

"Score-informed source separation for musical audio recordings: An overview", Ewert, S., Pardo, B., Müller, M., Plumbley, M. D., IEEE Signal Processing Magazine, vol. 31, no. 3, pp. 116-124, May 2014.

NMF factorizes the magnitude spectrogram into a template matrix W of size MxK and an activation matrix H of size KxN, both of which are also nonnegative. The dimensions M and N are the numbers of frequency bins and time frames, respectively, of the input magnitude spectrogram. The additional dimension K, however, is shared by W and H and must be supplied to the NMF as an input. In the above paper, K is set manually, depending on the number of instruments and musical pitches present in the particular piece that is to be separated.
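To make the role of K concrete, here is a minimal sketch of plain NMF in Python with NumPy. It is my own illustration, not the method from the paper: it assumes a Euclidean cost with the standard multiplicative updates and random initialization, whereas the paper may use a different cost and score-based constraints. The function name `nmf` and the parameters `n_iter`, `eps`, and `seed` are hypothetical, and the "spectrogram" is synthetic random data.

```python
import numpy as np

def nmf(V, K, n_iter=200, eps=1e-10, seed=0):
    """Factorize a nonnegative MxN matrix V into W (MxK) and H (KxN).

    Assumption: Euclidean cost with multiplicative updates.
    K is NOT estimated from the data; the caller must supply it.
    """
    rng = np.random.default_rng(seed)
    M, N = V.shape
    # Random nonnegative initialization (no prior knowledge used here).
    W = rng.random((M, K)) + eps
    H = rng.random((K, N)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for a magnitude spectrogram:
# M frequency bins, N time frames, built from K nonnegative components.
rng = np.random.default_rng(1)
M, N, K = 64, 100, 3
V = rng.random((M, K)) @ rng.random((K, N))

W, H = nmf(V, K)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(W.shape, H.shape)
```

Note that the only inputs are the matrix V and the rank K; everything else is learned from V. In this sketch K is the single piece of side information handed to the algorithm, which is exactly what my question is about.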

In that case, can we still claim to be performing blind source separation? Or is it better to classify the method as semi-blind, or as something else? What is the accepted terminology? I would appreciate some expert opinions.
