Since your question is vague, I will assume that you are asking about noise removal from plain acoustic signals rather than from acoustic (sound) imagery.
Here is a recently granted patent that may help you start thinking about how to apply these ideas:
https://www.google.com/patents/US8606571
My understanding is that you want to apply two-dimensional filtering to stereo sound (or 3D filters to 3D sound).
2D filters can be separable or non-separable. A separable filter is applied as a 1D filter along each dimension in turn, while a non-separable 2D filter is applied as a whole, as in image processing. In some cases it is easier to work in the frequency domain: transform with the FFT, apply the filter (or its inverse form) there, and transform back.
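As a rough illustration of the separable versus non-separable distinction, here is a minimal Python sketch using NumPy and SciPy. The array `data` is a hypothetical stand-in for any two-dimensional representation of your sound (for example frequency bins by time frames), and the two smoothing kernels are arbitrary choices made only for this example. It applies the same filter three ways: one dimension at a time, as a single 2D kernel, and through the FFT.

```python
import numpy as np
from scipy.signal import convolve2d, fftconvolve

rng = np.random.default_rng(0)

# Hypothetical 2D data, e.g. 64 frequency bins x 128 time frames.
data = rng.standard_normal((64, 128))

# Arbitrary 1D smoothing kernels, one per dimension (assumed for illustration).
k_freq = np.array([1.0, 2.0, 1.0]) / 4.0   # along axis 0
k_time = np.ones(5) / 5.0                  # along axis 1

# Separable application: filter along axis 0, then along axis 1.
step = convolve2d(data, k_freq[:, None], mode="same")
separable = convolve2d(step, k_time[None, :], mode="same")

# The same filter written as one 2D kernel (outer product of the 1D kernels).
# A kernel that cannot be factored like this is non-separable and has to be
# applied in this direct form (or via the FFT).
kernel_2d = np.outer(k_freq, k_time)
direct = convolve2d(data, kernel_2d, mode="same")

# FFT-based application of the same 2D kernel.
via_fft = fftconvolve(data, kernel_2d, mode="same")

print(np.allclose(separable, direct), np.allclose(direct, via_fft))
```

The separable route costs roughly the sum of the kernel lengths per sample instead of their product, which is why it is preferred whenever the kernel factors. The FFT route tends to pay off for large kernels.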
Here is an example in an article from IEEE Xplore:
Another piece of software that may help you is iZotope RX6. It contains powerful tools for filtering and processing in the spectrogram (time-frequency) domain. Adobe Soundbooth also has a combined time and spectrogram view that can help you understand the spectrogram representation.
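Tonal noise removal is usually an easier task than broadband noise removal, because tonal noise is concentrated in a few narrow frequency bands, whereas broadband noise overlaps the wanted signal across the whole spectrum.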
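To show why the tonal case is tractable, here is a minimal sketch that removes a single hum tone with a notch filter from SciPy. Everything about the signal is assumed for the example: the 8 kHz sample rate, the 440 Hz wanted tone, the 50 Hz hum, and the quality factor `Q` are illustrative values, not recommendations for your data.

```python
import numpy as np
from scipy.signal import iirnotch, filtfilt

fs = 8000.0                        # assumed sample rate in Hz
t = np.arange(0, 2.0, 1.0 / fs)    # 2 seconds of samples

wanted = np.sin(2 * np.pi * 440.0 * t)     # hypothetical wanted signal
hum = 0.5 * np.sin(2 * np.pi * 50.0 * t)   # hypothetical tonal noise
noisy = wanted + hum

# Narrow notch centred on the hum frequency; Q sets how narrow it is.
b, a = iirnotch(w0=50.0, Q=5.0, fs=fs)

# filtfilt runs the filter forwards and backwards, giving zero phase shift.
cleaned = filtfilt(b, a, noisy)

def rms(x):
    return float(np.sqrt(np.mean(x ** 2)))

# Compare away from the edges, where filter transients have died out.
mid = slice(int(0.25 * fs), int(1.75 * fs))
error_before = rms(noisy[mid] - wanted[mid])
error_after = rms(cleaned[mid] - wanted[mid])
print(error_before, error_after)
```

Broadband noise has no such narrow band to cut out, so it usually needs an approach that works across the spectrum, such as the spectrogram-domain tools mentioned above.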