I don't think your question, as formulated, is precise enough to get a useful answer.
Suggestion: try giving one implementation that does what you describe, along with its computing cost (whatever you mean by that), and then ask whether people know of a more efficient way of doing this (in terms of either computing cost or lower PSNR).