Hi, I'm facing a very tough problem and hope to get some ideas here.
Here is the background:
In my project, I use LLVM to generate IR dynamically, then use NVVM as the backend to compile it to PTX. After that, I load the PTX with cuModuleLoadDataEx and launch the kernel.
My problem is very weird:
If I compile the IR to PTX with NVVM using opt=3, everything seems to work and there is no problem.
However, if I set opt=0, the kernel seems to fall into an infinite loop after it has been launched several times.
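For illustration, the two cases differ only in the optimization option handed to the NVVM compile step. The sketch below assumes libNVVM's `-opt=` and `-arch=` option strings; the helper names `make_nvvm_options` and `to_c_options` and the `compute_70` target are hypothetical placeholders, not code from the project.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of libNVVM: builds the option strings
// for one compile. libNVVM documents only 0 and 3 as values for -opt.
std::vector<std::string> make_nvvm_options(int opt_level, const std::string& arch) {
    assert(opt_level == 0 || opt_level == 3);
    std::vector<std::string> opts;
    opts.push_back("-opt=" + std::to_string(opt_level));
    opts.push_back("-arch=" + arch);  // placeholder target, e.g. "compute_70"
    return opts;
}

// Converts the strings to the const char* array form expected by a C API
// such as nvvmCompileProgram(prog, numOptions, options).
// The pointers stay valid only while `opts` is alive and unmodified.
std::vector<const char*> to_c_options(const std::vector<std::string>& opts) {
    std::vector<const char*> out;
    out.reserve(opts.size());
    for (const std::string& s : opts) {
        out.push_back(s.c_str());
    }
    return out;
}
```

Under these assumptions, the failing configuration corresponds to `make_nvvm_options(0, "compute_70")` and the working one to `make_nvvm_options(3, "compute_70")`; everything else in the pipeline stays the same.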
So these are my questions:
- Is there any tool that lets me monitor the CUDA context? My kernel fails only after several launches, so I want to know how the context differs between launches.
- Does anyone know of any problems with NVVM at opt=0?
Looking forward to any reply.