On Fermi, my CUDA C code runs correctly, but on Kepler it produces wrong results. I ran the code under cuda-memcheck and got the following error message:
========= CUDA-MEMCHECK
========= Invalid __shared__ read of size 8
========= at 0x00000868 in datTime(int*, int*, double*, double const *, double const *, double const *, double const *, double const *, double const *, double const *, double const *, int, int, int)
========= by thread (127,0,0) in block (62,0,0)
========= Address 0x00000878 is out of bounds
========= Saved host backtrace up to driver entry point at kernel launch time
========= Host Frame:/usr/lib64/libcuda.so (cuLaunchKernel + 0x2c5) [0x14ad95]
========= Host Frame:rr [0x1c968]
========= Host Frame:rr [0x3be33]
========= Host Frame:rr [0x3c0e]
========= Host Frame:rr [0x3a09]
========= Host Frame:rr [0x3ac0]
========= Host Frame:rr [0x3473]
========= Host Frame:/lib64/libc.so.6 (__libc_start_main + 0xfd) [0x1ed1d]
========= Host Frame:rr [0x27a9]
=========
========= Program hit error 30 on CUDA API call to cudaThreadSynchronize
========= Saved host backtrace up to driver entry point at error
========= Host Frame:/usr/lib64/libcuda.so [0x2ef673]
========= Host Frame:rr [0x3a8f6]
========= Host Frame:rr [0x3478]
========= Host Frame:/lib64/libc.so.6 (__libc_start_main + 0xfd) [0x1ed1d]
========= Host Frame:rr [0x27a9]
Any suggestions?
Thanks in advance.