Here’s my problem. I wrote a simple program in OpenCL and got its PTX code. Then I tried to run this code with the CUDA driver API. When I executed it, cuMemcpyDtoH() returned error 700. Looking at the PTX code, I found this line:
mov.u32 %r5, %envreg3;
The problem went away when I replaced %envreg3 with 0, and the code ran flawlessly.
I searched around, but the only thing I found about this special register is in ptx_isa.pdf, which says: “Precise semantics of these registers is defined in the driver documentation.” I couldn’t find anything beyond that.
If someone knows what this register (%envreg<32>) is and what it is used for, I would be grateful!
Some additional info:
OS: Ubuntu (kernel 2.6.38-13-generic)
nvcc: release 4.0, V0.2.1221
NVRM version: NVIDIA UNIX x86 Kernel Module 280.13
GCC version: gcc version 4.4.5 (Ubuntu/Linaro 4.4.5-15ubuntu1)