PTX code not running

Hello all.

I tried to run PTX code with the CUDA Driver API, but the execution never completed. The PTX code comes from a .cl kernel.

I think the problem is with some special registers (%envreg), and more specifically with %envreg6. As far as I can tell, %envreg6 has no value, and that is why the execution cannot finish. I manually edited the PTX assembly, replacing %envreg6 with a literal value, and the program ran. I also found that %envreg6 normally holds the block size.
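For anyone who wants to repeat the manual edit described above without touching the file by hand, here is a minimal sketch of the text substitution. The helper name patch_envreg is hypothetical, and the idea that %envreg6 should hold the block size is only my observation from this one kernel, not something the documentation confirms. The code only rewrites the PTX string; it does not call the driver.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of any CUDA API): returns a copy of `ptx`
// in which every use of the special register %envreg<index> is replaced by
// the decimal literal `value`. Longer names that merely start with the same
// text (e.g. %envreg12 when index == 1) are left untouched.
std::string patch_envreg(const std::string& ptx, unsigned index, std::uint32_t value) {
    const std::string needle = "%envreg" + std::to_string(index);
    const std::string literal = std::to_string(value);

    std::string out;
    out.reserve(ptx.size());

    std::size_t pos = 0;
    while (true) {
        const std::size_t hit = ptx.find(needle, pos);
        if (hit == std::string::npos) {
            out.append(ptx, pos, std::string::npos);  // copy the remainder
            break;
        }
        const std::size_t end = hit + needle.size();
        // A digit right after the match means this is a different register.
        const bool longer_index =
            end < ptx.size() && std::isdigit(static_cast<unsigned char>(ptx[end]));

        out.append(ptx, pos, hit - pos);        // text before the match
        out += longer_index ? needle : literal; // keep or substitute
        pos = end;
    }
    return out;
}
```

The patched string can then be loaded as usual, for example by passing patched.c_str() to cuModuleLoadData. The value has to match the block dimensions actually given to cuLaunchKernel, so the PTX must be patched again whenever the launch configuration changes.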

Is there any way to set values for these special registers (the PTX ISA document doesn't say much)? Am I missing something in the driver calls, maybe a flag on cuLaunchKernel?

I also started a similar thread last year: What is '%envreg<32>' special register? - CUDA Programming and Performance - NVIDIA Developer Forums.

My PC configuration is pretty much the same as it was then.

So, bottom line: can I set values for those registers, and if so, HOW?

Thanks.

However, I cannot find any reference to these envregs in the driver documentation:
http://docs.nvidia.com/cuda/cuda-driver-api/index.html

Is there a way to use these special registers?
Are they only used for some OpenCL kernels?
Would debugging and profiling of a kernel launch still work if I initialized them with user-defined values?