Using Visual Profiler when CUDA is called by a wrapper function

I converted a program from IDL to CUDA. It performs some calculations on a 256x256xn cube of densities and renders a 2-D image. The program works correctly, but all the pre-processing (reading in the density cube, etc.) is still done in IDL, which passes the data to a C wrapper function (via call_external), which in turn calls CUDA.
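
For context, the call chain looks roughly like the sketch below. The names render_wrapper and launch_render are placeholders, not my real function names, and the argument order is only an assumption for illustration. The body of launch_render is a plain CPU stand-in (it just sums along the depth axis) for the part that actually copies the data to the GPU and launches the kernel, so the sketch compiles without nvcc.

```c
#include <stddef.h>

/* Hypothetical stand-in for the host-side CUDA code (cudaMalloc, cudaMemcpy,
 * kernel launch). Here it only sums the cube along the depth axis on the CPU. */
static int launch_render(const float *cube, int nx, int ny, int nz, float *image)
{
    for (int y = 0; y < ny; ++y) {
        for (int x = 0; x < nx; ++x) {
            float sum = 0.0f;
            for (int z = 0; z < nz; ++z) {
                sum += cube[((size_t)z * ny + y) * nx + x];
            }
            image[(size_t)y * nx + x] = sum;
        }
    }
    return 0;
}

/* Entry point using the (argc, argv) convention of IDL's CALL_EXTERNAL.
 * Assumed layout: argv[0] = density cube (float array), argv[1..3] = nx, ny, nz
 * (IDL LONGs passed by reference), argv[4] = output image (float array). */
int render_wrapper(int argc, void *argv[])
{
    if (argc != 5) {
        return -1; /* wrong number of arguments from the IDL side */
    }
    const float *cube = (const float *)argv[0];
    int nx = *(int *)argv[1];
    int ny = *(int *)argv[2];
    int nz = *(int *)argv[3];
    float *image = (float *)argv[4];

    return launch_render(cube, nx, ny, nz, image);
}
```

The point is that the CUDA code is only ever reached through render_wrapper, and render_wrapper is only ever called from inside the IDL process.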

I am now trying to optimize the program and would like to use the NVIDIA Visual Profiler to check my memory coalescing. Is there a way to get the Visual Profiler to run when the CUDA part of the program is called this way?

I currently can’t test the CUDA code on its own: there are far too many variables to hard-code into the CUDA function, and it cannot run without the values that are passed from IDL to C to CUDA.

I do have it set up so that I can run the IDL code, have it stop, and then manually call the C wrapper function, instead of running the IDL and having it do everything automatically.

Thanks