I'm trying to use double precision in CUDA, but I can hardly find any good information on it.
I tried -arch sm_13, but then the compiler complains:
1>nvcc error : ‘ptxas’ died with status 0xC0000005 (ACCESS_VIOLATION)
1>Build log was saved at “file://e:\work\b4_cmi_cuda\src\GLModel\Release\BuildLog.htm”
1>GLModel - 0 error(s), 268 warning(s)
I then tried -arch compute_13. That compiles, but the program dies the first time device memory is allocated.
Could someone tell me how to use double precision with CUDA? What are the correct options, and why do I run into a memory problem before any real computation has even started?