Tool/simulator to monitor the instructions executed by each thread?

Hi All,

Say I have multiple control-flow divergences in a kernel, and each thread in a warp may take a different path.

Is there a way, tool, or simulator to monitor the instructions executed by each thread?

Thanks in advance,

I don’t think that you can get that information easily from a debugger or profiler.

You could try a simulator:

  1. Ocelot
  2. GPGPU-Sim

Or if you are willing to invest more time for faster trace generation, a binary instrumentation tool:

  1. Lynx

Thanks for the reply,

Do any of them have the ability to show information per basic block?

I’ve tried GPGPU-Sim, but it seems I would have to dig deep into the source to get information per basic block. I’m still trying gpuocelot and will take a look at Lynx later on.

Both Ocelot and Lynx should be able to give you a sequence of basic blocks. You will probably have to write some code for either one to print out the block labels. They should both have callback interfaces that allow you to plug in code that inspects a basic block object and prints out the label (you won’t be able to use PCs because both tools work on PTX instructions).
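To make it concrete what such a per-thread sequence of block labels looks like, here is a minimal host-side C++ sketch that emulates one 32-lane warp running a small divergent kernel body. It does not use the Ocelot or Lynx APIs. The block labels ("entry", "even", "even_low", "odd", "exit"), the branch conditions, and the names `runLane`, `traceWarp` and `formatTrace` are all made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One trace per lane: the labels of the basic blocks that lane executed, in order.
using Trace = std::vector<std::string>;

constexpr int kWarpSize = 32;

// Emulates a divergent kernel body for a single lane and appends the label
// of every (hypothetical) basic block the lane enters.
void runLane(int lane, Trace& trace) {
    assert(lane >= 0 && lane < kWarpSize);
    trace.push_back("entry");
    if (lane % 2 == 0) {            // first divergence: even vs. odd lanes
        trace.push_back("even");
        if (lane < 8) {             // nested divergence among the even lanes
            trace.push_back("even_low");
        }
    } else {
        trace.push_back("odd");
    }
    trace.push_back("exit");        // all lanes reconverge here
}

// Runs every lane of one emulated warp and returns one trace per lane.
std::vector<Trace> traceWarp() {
    std::vector<Trace> traces(kWarpSize);
    for (int lane = 0; lane < kWarpSize; ++lane) {
        runLane(lane, traces[lane]);
    }
    return traces;
}

// Joins a trace into a single line such as "entry -> odd -> exit".
std::string formatTrace(const Trace& trace) {
    std::string out;
    for (std::size_t i = 0; i < trace.size(); ++i) {
        if (i > 0) out += " -> ";
        out += trace[i];
    }
    return out;
}
```

Calling `traceWarp()` and printing `formatTrace(traces[lane])` for each lane shows which lanes diverged and where. A callback plugged into a simulator would collect the same kind of data, except that the labels would come from the tool’s basic block objects instead of hand-written `push_back` calls.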