Is there an A* implementation for NVIDIA GPUs? Or at least a library for HPC work on GPGPUs? Some real-life parallel algorithms implemented on them would be good too.
If you are interested in GPU computing applications, I think the best source of fully implemented CUDA code is the CUDA samples from NVIDIA. The samples are available online and cover a wide range of applications.