I have been looking at the "convolutionSeparable" code sample provided with CUDA 7.5, for the implementation of a 2-D FIR filter for image processing. 

The rationale behind the design is discussed in the document linked below.

Has anybody else looked closely at this code and found possible improvements, or does anyone have other comments or thoughts?

http://docs.nvidia.com/cuda/samples/3_Imaging/convolutionSeparable/doc/convolutionSeparable.pdf
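
For anyone who has not opened the sample, here is a minimal CPU sketch of the idea as I understand it: a separable 2-D FIR filter is applied as a 1-D row pass followed by a 1-D column pass. This is not the sample's CUDA code. The function names, the zero-padding at the borders, and the small test image are my own assumptions for illustration only.

```python
def convolve_rows(img, kernel):
    """1-D FIR filter along each row; pixels outside the image count as zero."""
    r = len(kernel) // 2  # kernel radius, assumes an odd-length kernel
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for k in range(-r, r + 1):
                xx = x + k
                if 0 <= xx < w:
                    acc += img[y][xx] * kernel[r - k]
            out[y][x] = acc
    return out


def transpose(img):
    return [list(col) for col in zip(*img)]


def separable_convolve(img, row_kernel, col_kernel):
    """Row pass, then column pass (done as a row pass on the transpose)."""
    tmp = convolve_rows(img, row_kernel)
    return transpose(convolve_rows(transpose(tmp), col_kernel))


def direct_convolve(img, row_kernel, col_kernel):
    """Reference: full 2-D convolution with the outer-product kernel."""
    rr, rc = len(row_kernel) // 2, len(col_kernel) // 2
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(-rc, rc + 1):
                for kx in range(-rr, rr + 1):
                    yy, xx = y + ky, x + kx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += img[yy][xx] * col_kernel[rc - ky] * row_kernel[rr - kx]
            out[y][x] = acc
    return out


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    taps = [1, 2, 1]
    print(separable_convolve(image, taps, taps))
    print(direct_convolve(image, taps, taps))
```

For a kernel of radius r, the two-pass form needs 2(2r+1) multiplications per pixel instead of (2r+1)^2 for the direct form, which is the reason for splitting the filter in the first place.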
