The CUDA 6.5 documentation says __prof_trigger() returns void, i.e. it does not return a value.
Is it possible (perhaps via PTX) to read the hardware performance counters?
A kernel can increment them with __prof_trigger(), but can that same kernel read them back?
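
To make the question concrete, here is a minimal sketch of what such a read-back might look like. It assumes the PTX special registers %pm0..%pm7 can be read with inline asm. The PTX ISA lists them as read-only performance monitor counters but describes their behavior as undefined, so whether the value reflects __prof_trigger() events is exactly what is unclear. The names read_pm0 and probe are hypothetical, chosen only for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: read the PTX special register %pm0 via inline asm.
// Assumption: %pm0 relates to counter 0 used by __prof_trigger(0);
// the PTX ISA does not promise this.
__device__ unsigned int read_pm0()
{
    unsigned int value;
    asm volatile("mov.u32 %0, %%pm0;" : "=r"(value));
    return value;
}

// Hypothetical kernel: sample %pm0, trigger counter 0, sample %pm0 again.
__global__ void probe(unsigned int *out)
{
    unsigned int before = read_pm0();
    __prof_trigger(0);              // increments profiler counter 0
    unsigned int after = read_pm0();

    // Only one thread writes the two samples back to global memory.
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        out[0] = before;
        out[1] = after;
    }
}

int main()
{
    unsigned int h_out[2] = {0, 0};
    unsigned int *d_out = 0;

    if (cudaMalloc((void **)&d_out, sizeof(h_out)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_out, 0, sizeof(h_out));

    probe<<<1, 32>>>(d_out);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    printf("pm0 before: %u, after: %u\n", h_out[0], h_out[1]);

    cudaFree(d_out);
    return 0;
}
```

If the two printed values differ when the program runs under a profiler but not otherwise, that would suggest the registers are only live while the profiler has configured them, but that is a guess and not something the documentation states.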
Any comments or suggestions would be most welcome.
Bill