CUDA context monitoring and an NVVM opt=0 problem

Hi, I am facing a very tough problem, and I hope to get some ideas here.
Here is the background:
In my project, I use LLVM to generate IR dynamically and NVVM as the backend to compile it to PTX. I then load the PTX with cuModuleLoadDataEx and launch the kernel.
My problem is very weird:
When I compile the IR to PTX with NVVM using opt=3, everything seems to work correctly.
However, with opt=0, the kernel seems to fall into an infinite loop after it has been launched several times.
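To make the setup concrete, here is a minimal sketch of how the optimization level reaches NVVM in a pipeline like this one. It is not my actual code: the helper names nvvm_options and compile_to_ptx, the USE_NVVM macro, and the module name "module" are made up for illustration, and the arch string is whatever your libNVVM version accepts. The libNVVM calls themselves (nvvmCreateProgram, nvvmAddModuleToProgram, nvvmCompileProgram, nvvmGetCompiledResultSize, nvvmGetCompiledResult, nvvmDestroyProgram) are the documented ones.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not from my project): builds the option strings
// handed to nvvmCompileProgram. libNVVM documents -opt=0 and -opt=3.
std::vector<std::string> nvvm_options(int opt_level, const std::string& arch) {
    assert(opt_level == 0 || opt_level == 3);
    assert(!arch.empty());
    return {"-opt=" + std::to_string(opt_level), "-arch=" + arch};
}

#ifdef USE_NVVM  // assumption: define this only when nvvm.h and libnvvm are available
#include <nvvm.h>

// Compiles one NVVM IR module to PTX; returns an empty string on failure.
std::string compile_to_ptx(const std::string& ir, int opt_level,
                           const std::string& arch) {
    std::vector<std::string> opts = nvvm_options(opt_level, arch);
    std::vector<const char*> argv;
    for (const std::string& o : opts) argv.push_back(o.c_str());

    nvvmProgram prog;
    if (nvvmCreateProgram(&prog) != NVVM_SUCCESS) return "";

    std::string ptx;
    if (nvvmAddModuleToProgram(prog, ir.data(), ir.size(), "module") == NVVM_SUCCESS &&
        nvvmCompileProgram(prog, static_cast<int>(argv.size()), argv.data()) == NVVM_SUCCESS) {
        size_t size = 0;  // reported size includes the trailing NUL
        nvvmGetCompiledResultSize(prog, &size);
        ptx.resize(size);
        nvvmGetCompiledResult(prog, &ptx[0]);
    }
    nvvmDestroyProgram(&prog);
    return ptx;  // ptx.c_str() can then be passed to cuModuleLoadDataEx
}
#endif
```

With this sketch, the two cases I am comparing would differ only in the first argument, e.g. nvvm_options(0, arch) versus nvvm_options(3, arch).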
So, these are my questions:

  1. Is there any tool that lets me monitor the CUDA context? My kernel fails only after several launches, so I want to see how the context differs from one launch to the next.
  2. Does anyone know of any problems with NVVM at opt=0?

Looking forward to any reply.
Thanks